panic: runtime error: invalid memory address or nil pointer dereference, github.com/chrislusf/seaweedfs/weed/storage.(*SortedFileNeedleMap).Close(0x0) #1870

Closed
charnger opened this issue Mar 7, 2021 · 3 comments

Comments


charnger commented Mar 7, 2021

Describe the bug
We had to power cycle server1, which runs the master, the filer and one set of volumes. The data is replicated to server2 in another datacenter with replication set to 100. Server2 is intact. After restarting server1, I get the following message on startup:

I0307 08:58:58 62024 volume_loading.go:122] loading sorted db /mnt/image/weed/29.sdx error: unexpected file /mnt/image/weed/29.idx size: 44163072
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0xf487a2]

goroutine 231 [running]:
github.com/chrislusf/seaweedfs/weed/storage.(*SortedFileNeedleMap).Close(0x0)
        /home/travis/gopath/src/github.com/chrislusf/seaweedfs/weed/storage/needle_map_sorted_file.go:97 +0x22
github.com/chrislusf/seaweedfs/weed/storage.(*Volume).load.func1(0xc000753a2d, 0xc0007b4120)
        /home/travis/gopath/src/github.com/chrislusf/seaweedfs/weed/storage/volume_loading.go:32 +0x95
github.com/chrislusf/seaweedfs/weed/storage.(*Volume).load(0xc0007b4120, 0xc0007b0101, 0x0, 0x0, 0x270f0a0, 0xc0bd95c9f0)
        /home/travis/gopath/src/github.com/chrislusf/seaweedfs/weed/storage/volume_loading.go:176 +0x3be
github.com/chrislusf/seaweedfs/weed/storage.NewVolume(0x7fff665d95d3, 0xf, 0x7fff665d95d3, 0xf, 0x0, 0x0, 0x19, 0x0, 0x0, 0x0, ...)
        /home/travis/gopath/src/github.com/chrislusf/seaweedfs/weed/storage/volume.go:59 +0x156
github.com/chrislusf/seaweedfs/weed/storage.(*DiskLocation).loadExistingVolume(0xc00065e090, 0x2760d00, 0xc001698000, 0x0, 0xc000734301)
        /home/travis/gopath/src/github.com/chrislusf/seaweedfs/weed/storage/disk_location.go:124 +0x38e
github.com/chrislusf/seaweedfs/weed/storage.(*DiskLocation).concurrentLoadingVolumes.func2(0xc0005e0044, 0xc0001b8420, 0xc00065e090, 0x0)
        /home/travis/gopath/src/github.com/chrislusf/seaweedfs/weed/storage/disk_location.go:164 +0x87
created by github.com/chrislusf/seaweedfs/weed/storage.(*DiskLocation).concurrentLoadingVolumes
        /home/travis/gopath/src/github.com/chrislusf/seaweedfs/weed/storage/disk_location.go:161 +0xeb
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0xf487a2]


(full log attached)

System Setup
server1 (the one with segfault):
~/weed server -dataCenter=FMT2 -master.port=10333 -volume.port=10334 -volume.max=0 -master.defaultReplication=100 -master.volumeSizeLimitMB=500000 -metricsPort=10335 -filer=true -filer.port=$FILER_PORT -filer.peers=localhost:$FILER_PORT,$REMOTE:$FILER_PORT -dir=/mnt/image/weed
server2:

~/weed volume -mserver=nemesis.diffbot.com:10333 -dir=$DIR -port=20334 -dataCenter=FMT1 -max 100 &
~/weed filer -metricsPort=10335 -port=$FILER_PORT -peers=localhost:$FILER_PORT,$REMOTE:$FILER_PORT -master=$REMOTE:$MASTER_PORT -defaultStoreDir=$DIR
  • List the command line to start "weed master", "weed volume", "weed filer", "weed s3", "weed mount".

  • OS version:
    Ubuntu 20.04.2 LTS
    5.4.0-66-generic

  • output of weed version
    version 8000GB 2.26 71f0c19 linux amd64

  • if using filer, show the content of filer.toml

Expected behavior
The volume server should not crash on startup. As a temporary solution, a way to restore from the non-rebooted replica would help.

Author

charnger commented Mar 7, 2021

Attached full log of the run command:
weed.log

@chrislusf chrislusf reopened this Mar 7, 2021
Collaborator

chrislusf commented Mar 7, 2021

Added code to fix the nil pointer dereference when loading the index files fails.
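
For readers following the stack trace: the panic comes from calling Close on a nil *SortedFileNeedleMap after the load failed. Below is a minimal sketch of one way such a guard could look. It is not the actual commit; the names sortedIndex and openSortedIndex are hypothetical stand-ins, and the path is only an example.

```go
package main

import (
	"fmt"
	"os"
)

// sortedIndex is a hypothetical stand-in for a needle map backed by an open file.
type sortedIndex struct {
	file *os.File
}

// Close tolerates a nil receiver, so cleanup code can call it
// unconditionally even when loading the index failed.
func (s *sortedIndex) Close() error {
	if s == nil || s.file == nil {
		return nil
	}
	return s.file.Close()
}

// openSortedIndex returns a nil pointer together with an error when the
// file cannot be opened, mirroring a failed index load.
func openSortedIndex(path string) (*sortedIndex, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	return &sortedIndex{file: f}, nil
}

func main() {
	idx, err := openSortedIndex("/nonexistent/dir/29.sdx")
	if err != nil {
		fmt.Println("load failed:", err)
	}
	// idx is nil here; without the nil check in Close, reading s.file
	// would dereference a nil pointer and panic.
	fmt.Println("close error:", idx.Close())
}
```

An alternative is to check the load error at the call site and skip Close entirely; guarding inside Close just keeps deferred cleanup simple.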

It seems the /mnt/image/weed/25.idx file is corrupted. Did anything special happen to it?

You can run weed fix -dir=/mnt/image/weed -volumeId=25 to fix the 25.idx file offline.
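
Before running the fix, a quick sanity check on an index file can be sketched as below. It assumes the .idx file is a sequence of fixed-width entries, so a healthy file has a size that is a whole multiple of the entry width. The widths used here (16 bytes for default builds, 17 bytes for large-disk builds) are assumptions about the on-disk layout, not something stated in this thread, and checkIndexSize is a hypothetical helper.

```go
package main

import "fmt"

// checkIndexSize reports an error when size is not a whole number of
// fixed-width entries. entrySize is supplied by the caller because the
// width depends on how the binary was built (an assumption, see above).
func checkIndexSize(size, entrySize int64) error {
	if entrySize <= 0 {
		return fmt.Errorf("invalid entry size %d", entrySize)
	}
	if rem := size % entrySize; rem != 0 {
		return fmt.Errorf("index size %d is not a multiple of entry size %d (%d trailing bytes)",
			size, entrySize, rem)
	}
	return nil
}

func main() {
	// Size taken from the log line in the report above.
	const loggedSize = 44163072
	for _, width := range []int64{16, 17} {
		if err := checkIndexSize(loggedSize, width); err != nil {
			fmt.Printf("width %d: %v\n", width, err)
		} else {
			fmt.Printf("width %d: size is consistent\n", width)
		}
	}
}
```

In practice the size would come from os.Stat on the .idx path; a mismatch only indicates that the file needs rebuilding, not what caused the damage.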

Author

charnger commented Mar 7, 2021

Thanks. I fixed a couple of volumes and brought it all back up.

Yes, the server process got stuck under heavy I/O load on the RAID array. The weed process could not be killed, nor could I sync, probably due to an ext4 / mdadm issue that sometimes occurs when the array is very heavily loaded. I had to force a reboot, so I imagine some of the blocks that were supposed to be written did not make it to the disks.
