Server crashed while uploading a large number of files #273

Closed
mosic opened this issue Mar 30, 2016 · 2 comments

@mosic

mosic commented Mar 30, 2016

Hi, I have been testing seaweedfs for storing large numbers of images as mentioned in #271. I have run into a situation where the weed server with filer crashes and I'm unable to start it again.

I made a simple application that reproduces this behaviour consistently: https://github.com/mosic/seaweedfs-repro

It starts a pool of 50 Erlang processes that upload an image in parallel, each to a path that includes a timestamp. The timestamp is incremented by one second per upload, for 10 million uploads in total. The seaweedfs server crashed on several attempts, sometimes after ~100k requests and once after ~900k.

This is an excerpt from the logged output:

$ weed -v=0 server -filer=true -volume.max 50 -dir /mnt/seaweedfs
I0330 19:14:09 32497 file_util.go:20] Folder /mnt/seaweedfs Permission: -rwxrwxr-x
I0330 19:14:09 32497 file_util.go:20] Folder /mnt/seaweedfs/filer Permission: -rwx------
I0330 19:14:09 32497 file_util.go:20] Folder /mnt/seaweedfs Permission: -rwxrwxr-x
I0330 19:14:09 32497 topology.go:86] Using default configurations.
I0330 19:14:09 32497 master_server.go:59] Volume Size Limit is 30000 MB
I0330 19:14:09 32497 server.go:205] Start Seaweed Master 0.70 beta at localhost:9333
I0330 19:14:09 32497 server.go:178] Start Seaweed Filer 0.70 beta at port 8888
I0330 19:14:09 32497 raft_server.go:103] Old conf,log,snapshot should have been removed.
I0330 19:14:09 32497 volume.go:110] loading index file /mnt/seaweedfs/1.idx readonly false
I0330 19:14:09 32497 store.go:218] data file /mnt/seaweedfs/1.dat, replicaPlacement=000 v=2 size=184997272 ttl=
I0330 19:14:09 32497 volume.go:110] loading index file /mnt/seaweedfs/2.idx readonly false
I0330 19:14:09 32497 store.go:218] data file /mnt/seaweedfs/2.dat, replicaPlacement=000 v=2 size=192314608 ttl=
I0330 19:14:09 32497 volume.go:110] loading index file /mnt/seaweedfs/3.idx readonly false
I0330 19:14:09 32497 store.go:218] data file /mnt/seaweedfs/3.dat, replicaPlacement=000 v=2 size=188749752 ttl=
I0330 19:14:09 32497 volume.go:110] loading index file /mnt/seaweedfs/4.idx readonly false
I0330 19:14:09 32497 store.go:218] data file /mnt/seaweedfs/4.dat, replicaPlacement=000 v=2 size=178993304 ttl=
I0330 19:14:09 32497 volume.go:110] loading index file /mnt/seaweedfs/5.idx readonly false
I0330 19:14:09 32497 store.go:218] data file /mnt/seaweedfs/5.dat, replicaPlacement=000 v=2 size=195128968 ttl=
I0330 19:14:09 32497 volume.go:110] loading index file /mnt/seaweedfs/6.idx readonly false
I0330 19:14:09 32497 store.go:218] data file /mnt/seaweedfs/6.dat, replicaPlacement=000 v=2 size=187811632 ttl=
I0330 19:14:09 32497 volume.go:110] loading index file /mnt/seaweedfs/7.idx readonly false
I0330 19:14:09 32497 store.go:218] data file /mnt/seaweedfs/7.dat, replicaPlacement=000 v=2 size=185372520 ttl=
I0330 19:14:09 32497 store.go:227] Store started on dir: /mnt/seaweedfs with 7 volumes max 50
I0330 19:14:09 32497 server.go:259] Start Seaweed volume server 0.70 beta at localhost:8080
I0330 19:14:09 32497 volume_server.go:73] Volume server bootstraps with master localhost:9333
I0330 19:14:27 32497 master_server.go:89] [ localhost:9333 ] localhost:9333 becomes leader.
I0330 19:14:28 32497 node.go:208] topo adds child DefaultDataCenter
I0330 19:14:28 32497 node.go:208] topo:DefaultDataCenter adds child DefaultRack
I0330 19:14:28 32497 node.go:208] topo:DefaultDataCenter:DefaultRack adds child localhost:8080
I0330 19:14:28 32497 volume_server.go:85] Volume Server Connected with master at localhost:9333

fatal error: concurrent map read and map write
fatal error: concurrent map read and map write
fatal error: concurrent map read and map write

goroutine 1587 [running]:
runtime.throw(0xe3d720, 0x21)
  /usr/lib/go/src/runtime/panic.go:530 +0x90 fp=0xc823865438 sp=0xc823865420
runtime.mapaccess2_faststr(0xae4fc0, 0xc8206b37d0, 0xc822fbf29d, 0x2, 0xc824696990, 0x1)
  /usr/lib/go/src/runtime/hashmap_fast.go:307 +0x5b fp=0xc823865498 sp=0xc823865438
github.com/chrislusf/seaweedfs/go/filer/embedded_filer.(*DirectoryManagerInMap).makeDirectory(0xc82000ea40, 0xc822fbf275, 0x2a, 0xc8206b3800, 0xc82285c800)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/filer/embedded_filer/directory_in_map.go:190 +0x177 fp=0xc823865518 sp=0xc823865498
github.com/chrislusf/seaweedfs/go/filer/embedded_filer.(*DirectoryManagerInMap).MakeDirectory(0xc82000ea40, 0xc822fbf275, 0x2b, 0x2b, 0x0, 0x0)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/filer/embedded_filer/directory_in_map.go:206 +0x41 fp=0xc823865548 sp=0xc823865518
github.com/chrislusf/seaweedfs/go/filer/embedded_filer.(*FilerEmbedded).CreateFile(0xc8207d80e0, 0xc822fbf275, 0x3b, 0xc821fa11f0, 0x10, 0x0, 0x0)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/filer/embedded_filer/filer_embedded.go:38 +0x8d fp=0xc8238655c0 sp=0xc823865548
github.com/chrislusf/seaweedfs/go/weed/weed_server.(*FilerServer).PostHandler(0xc82001a150, 0x7f5e8dce5130, 0xc822ec85b0, 0xc824144620)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/weed/weed_server/filer_server_handlers.go:235 +0x23db fp=0xc823865a90 sp=0xc8238655c0
github.com/chrislusf/seaweedfs/go/weed/weed_server.(*FilerServer).filerHandler(0xc82001a150, 0x7f5e8dce5130, 0xc822ec85b0, 0xc824144620)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/weed/weed_server/filer_server_handlers.go:32 +0x24c fp=0xc823865ad0 sp=0xc823865a90
github.com/chrislusf/seaweedfs/go/weed/weed_server.(*FilerServer).(github.com/chrislusf/seaweedfs/go/weed/weed_server.filerHandler)-fm(0x7f5e8dce5130, 0xc822ec85b0, 0xc824144620)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/weed/weed_server/filer_server.go:60 +0x3e fp=0xc823865af8 sp=0xc823865ad0
net/http.HandlerFunc.ServeHTTP(0xc8201b8de0, 0x7f5e8dce5130, 0xc822ec85b0, 0xc824144620)
  /usr/lib/go/src/net/http/server.go:1618 +0x3a fp=0xc823865b18 sp=0xc823865af8
net/http.(*ServeMux).ServeHTTP(0xc8200132f0, 0x7f5e8dce5130, 0xc822ec85b0, 0xc824144620)
  /usr/lib/go/src/net/http/server.go:1910 +0x17d fp=0xc823865b70 sp=0xc823865b18
net/http.serverHandler.ServeHTTP(0xc820098100, 0x7f5e8dce5130, 0xc822ec85b0, 0xc824144620)
  /usr/lib/go/src/net/http/server.go:2081 +0x19e fp=0xc823865bd0 sp=0xc823865b70
net/http.(*conn).serve(0xc821d23080)
  /usr/lib/go/src/net/http/server.go:1472 +0xf2e fp=0xc823865f98 sp=0xc823865bd0
runtime.goexit()
  /usr/lib/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc823865fa0 sp=0xc823865f98
created by net/http.(*Server).Serve
  /usr/lib/go/src/net/http/server.go:2137 +0x44e

You can see more at: https://gist.github.com/mosic/902eb9a34ae1ae87c0e967cee45fa9bc

After this crash, when I try to start seaweed again I get this:

$ weed -v=0 server -filer=true -volume.max 50 -dir /mnt/seaweedfs
I0330 20:02:50 24509 file_util.go:20] Folder /mnt/seaweedfs Permission: -rwxrwxr-x
I0330 20:02:50 24509 file_util.go:20] Folder /mnt/seaweedfs/filer Permission: -rwx------
I0330 20:02:50 24509 file_util.go:20] Folder /mnt/seaweedfs Permission: -rwxrwxr-x
I0330 20:02:50 24509 topology.go:86] Using default configurations.
I0330 20:02:50 24509 master_server.go:59] Volume Size Limit is 30000 MB
F0330 20:02:50 24509 filer_server.go:53] Can not start filer in dir /mnt/seaweedfs/filer : /gpocam/snapshots/recordings/2016/03/31/08 should be have id 41 instead of 42
goroutine 21 [running]:
github.com/chrislusf/seaweedfs/go/glog.stacks(0x11eb100, 0x0, 0x0, 0x0)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/glog/glog.go:767 +0xb8
github.com/chrislusf/seaweedfs/go/glog.(*loggingT).output(0x11cc5a0, 0xc800000003, 0xc82019cc00, 0x11a28c1, 0xf, 0x35, 0x0)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/glog/glog.go:718 +0x259
github.com/chrislusf/seaweedfs/go/glog.(*loggingT).printf(0x11cc5a0, 0xc800000003, 0xe3aea0, 0x22, 0xc820203d20, 0x2, 0x2)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/glog/glog.go:656 +0x1d4
github.com/chrislusf/seaweedfs/go/glog.Fatalf(0xe3aea0, 0x22, 0xc820203d20, 0x2, 0x2)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/glog/glog.go:1149 +0x5d
github.com/chrislusf/seaweedfs/go/weed/weed_server.NewFilerServer(0xc8201673e0, 0x22b8, 0xc82018b0d0, 0xe, 0xc8200823e0, 0x1b, 0x0, 0x0, 0xd70d88, 0x3, ...)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/weed/weed_server/filer_server.go:53 +0xb9e
main.runServer.func1()
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/weed/server.go:174 +0x206
created by main.runServer
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/weed/server.go:189 +0xdf9

Running weed fix for each volume doesn't fix this. Any suggestions?

@chrislusf
Collaborator

Thanks for the detailed information! This should be fixed now. Please re-run your test case.

@mosic
Author

mosic commented Mar 30, 2016

@chrislusf Thanks for the quick update, that indeed fixed the crashes! I wasn't able to make the server crash again. But I still get the same filer error if I stop the server and try to start it again. This doesn't seem to happen on every run, only after a large number of files have been uploaded.

Here is the output I got upon stopping the server after 1 million files finished uploading:

...
I0331 01:07:34 28195 files_in_leveldb.go:41] directory 333 fileName 16_45_260240.jpg fid 41,0f421cc377e0a5
I0331 01:07:34 28195 filer_server_handlers.go:209] post result {"name":"gpocam.jpg","size":187580}
I0331 01:07:34 28195 filer_server_handlers.go:234] saving /gpocam/snapshots/recordings/2016/04/11/12/16_56_260240.jpg => 41,0f422758b73c5a
I0331 01:07:34 28195 files_in_leveldb.go:41] directory 333 fileName 16_56_260240.jpg fid 41,0f422758b73c5a
^CI0331 01:07:37 28195 volume_server.go:117] Shutting down volume server...
I0331 01:07:37 28195 volume_server.go:119] Shut down successfully!


$ ./weed -v=4 server -filer=true -volume.max 50 -dir /mnt/seaweedfs
I0331 01:07:38 13324 file_util.go:20] Folder /mnt/seaweedfs Permission: -rwxrwxr-x
I0331 01:07:38 13324 file_util.go:20] Folder /mnt/seaweedfs/filer Permission: -rwx------
I0331 01:07:38 13324 file_util.go:20] Folder /mnt/seaweedfs Permission: -rwxrwxr-x
I0331 01:07:38 13324 topology.go:86] Using default configurations.
I0331 01:07:38 13324 master_server.go:59] Volume Size Limit is 30000 MB
I0331 01:07:38 13324 server.go:205] Start Seaweed Master 0.70 beta at localhost:9333
F0331 01:07:38 13324 filer_server.go:53] Can not start filer in dir /mnt/seaweedfs/filer : /gpocam/snapshots/recordings/2016/04/05/09 should be have id 145 instead of 156
goroutine 36 [running]:
github.com/chrislusf/seaweedfs/go/glog.stacks(0x11ed100, 0x0, 0x0, 0x0)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/glog/glog.go:767 +0xb8
github.com/chrislusf/seaweedfs/go/glog.(*loggingT).output(0x11ce5a0, 0xc800000003, 0xc8201a6c00, 0x11a4806, 0xf, 0x35, 0x0)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/glog/glog.go:718 +0x259
github.com/chrislusf/seaweedfs/go/glog.(*loggingT).printf(0x11ce5a0, 0xc800000003, 0xe3c600, 0x22, 0xc82004dd20, 0x2, 0x2)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/glog/glog.go:656 +0x1d4
github.com/chrislusf/seaweedfs/go/glog.Fatalf(0xe3c600, 0x22, 0xc82004dd20, 0x2, 0x2)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/glog/glog.go:1149 +0x5d
github.com/chrislusf/seaweedfs/go/weed/weed_server.NewFilerServer(0xc8200e3860, 0x22b8, 0xc820171370, 0xe, 0xc820172580, 0x1a, 0x0, 0x0, 0xd723e8, 0x3, ...)
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/weed/weed_server/filer_server.go:53 +0xb9e
main.runServer.func1()
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/weed/server.go:174 +0x206
created by main.runServer
  /home/mosic/go/src/github.com/chrislusf/seaweedfs/go/weed/server.go:189 +0xdf9

mosic changed the title from "Unable to start filer after a crash" to "Server crashed while uploading a large number of files" on Apr 4, 2016