
0.9.0-rc1 crashes with "panic: send on closed channel" #8146

Closed
Luflosi opened this issue May 18, 2021 · 1 comment
Labels: kind/bug (A bug in existing code, including security flaws), need/triage (Needs initial labeling and prioritization)

Comments


Luflosi commented May 18, 2021

Version information:

go-ipfs version: 0.9.0-rc1
Repo version: 11
System version: amd64/linux
Golang version: go1.15.10

Description:

After updating IPFS from version 0.8.0 to version 0.9.0-rc1, the daemon crashed after a little more than a day of running. It printed the following and exited with exit code 2/INVALIDARGUMENT:

panic: send on closed channel
goroutine 1720527432 [running]:
github.com/libp2p/go-libp2p-kad-dht/fullrt.(*FullRT).findProvidersAsyncRoutine.func1(0x23a9060, 0xc2315a15c0, 0xc01e55faa0, 0x22, 0xc2315a15c0, 0xc1e4fa17b0)
        /build/ipfs-src/vendor/github.com/libp2p/go-libp2p-kad-dht/fullrt/dht.go:1107 +0x649
github.com/libp2p/go-libp2p-kad-dht/fullrt.(*FullRT).execOnMany.func1(0x23a8fe0, 0xc35dfe2280, 0xc0003a0580, 0xc25ca568c0, 0xc0602e9da0, 0xc0602e9c80, 0xc01e55faa0, 0x22)
        /build/ipfs-src/vendor/github.com/libp2p/go-libp2p-kad-dht/fullrt/dht.go:844 +0xae
created by github.com/libp2p/go-libp2p-kad-dht/fullrt.(*FullRT).execOnMany
        /build/ipfs-src/vendor/github.com/libp2p/go-libp2p-kad-dht/fullrt/dht.go:841 +0x1c5

Before starting the daemon, I ran ipfs config --json Experimental.AcceleratedDHTClient true, in case that's relevant.
I don't know how to reproduce the error.
I'm happy to provide any additional information (command-line arguments, config, etc.) that may be useful.

Luflosi added the kind/bug and need/triage labels on May 18, 2021
aschmahmann self-assigned this on May 19, 2021

aschmahmann commented May 19, 2021

Thanks @Luflosi for the report. This seems to be related to a race condition in the experimental DHT client. It should get fixed up before the next RC.

What happened/what's a solution here:

The error is fairly self-explanatory: the execOnMany function is completing after findProvidersAsyncRoutine, and execOnMany is sending to a channel that findProvidersAsyncRoutine closed on its way out. Probably findProvidersAsyncRoutine should just wait for execOnMany to finish before returning, although if we really wanted to allow the cleanup to happen in the background we could rework the channel usage a bit. That doesn't seem worth the hassle, though. The sketch below illustrates the pattern.
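For illustration only, here is a minimal, self-contained Go sketch of this kind of bug; execOnManySketch and its parameters are made up for the example and are not the actual dht.go code. The helper launches worker goroutines that each send one result, and if the caller closes the results channel before those workers finish, a late send crashes the process with "panic: send on closed channel".

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// execOnManySketch is a hypothetical stand-in for a helper that fans work out
// to several goroutines, each of which reports one result on `results`.
// If it returns before the workers finish, a worker may later send on a
// channel the caller has already closed.
func execOnManySketch(results chan<- int, wait bool) {
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			time.Sleep(10 * time.Millisecond) // simulate a network query
			results <- n                      // panics if `results` is already closed
		}(i)
	}
	if wait {
		wg.Wait() // safe: every send happens before we return
	}
}

func main() {
	results := make(chan int)
	go func() {
		execOnManySketch(results, false) // buggy: returns while workers are still running
		close(results)                   // the caller closes too early...
	}()
	for r := range results {
		fmt.Println("got", r)
	}
	time.Sleep(50 * time.Millisecond) // ...and a late worker crashes: "panic: send on closed channel"
}
```

With true passed instead, the helper waits for all of its workers before returning, so every send happens before the caller closes the channel and the panic goes away; that mirrors the "just wait for execOnMany to finish" fix suggested above.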

aschmahmann pushed a commit to libp2p/go-libp2p-kad-dht that referenced this issue on May 27, 2021: "…ppy mode"

The execOnMany function was able to exit prematurely, leaving its child goroutines running. These would write to a channel in findProvidersAsyncRoutine that was closed after execOnMany returned. The function can now run both in a sloppy mode, where goroutines are allowed to be cleaned up after the function has completed, and in a safer non-sloppy mode, where goroutines complete before the function returns. The sloppy mode is used for DHT "get" operations like FindProviders and SearchValues, whereas the non-sloppy mode is used for "put" operations like PutValue and Provide (along with their bulk-operation equivalents).

This fixes ipfs/kubo#8146
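As a rough sketch of the sloppy/non-sloppy split described in the commit message above (hypothetical names and signatures, not the actual go-libp2p-kad-dht code): a sloppy flag decides whether the helper waits for its worker goroutines before returning, and a sloppy caller guards its channel sends with a context so that goroutines left running in the background drop their results instead of sending on a channel nobody is reading or that has been closed.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// execOnMany fans fn out over the given peers. In non-sloppy mode it waits
// for every goroutine before returning (suitable for "put" paths such as
// Provide/PutValue); in sloppy mode it returns once fn has been started for
// every peer and lets the goroutines drain in the background (suitable for
// "get" paths such as FindProviders/SearchValue).
func execOnMany(ctx context.Context, peers []string, sloppy bool, fn func(context.Context, string)) {
	var wg sync.WaitGroup
	for _, p := range peers {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			fn(ctx, p)
		}(p)
	}
	if !sloppy {
		wg.Wait()
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	results := make(chan string)

	go func() {
		execOnMany(ctx, []string{"peerA", "peerB", "peerC"}, true, func(ctx context.Context, p string) {
			time.Sleep(10 * time.Millisecond) // simulate a DHT query
			// A sloppy caller must guard its sends so that late goroutines
			// drop their result on cancellation instead of panicking.
			select {
			case results <- p:
			case <-ctx.Done():
			}
		})
	}()

	fmt.Println("first result:", <-results) // stop after the first result
	cancel()                                // tell background goroutines to drop their results
	time.Sleep(50 * time.Millisecond)       // no panic: remaining sends are skipped
}
```

Under these assumptions, the wg.Wait path is what a Provide/PutValue-style caller would rely on, while the select on ctx.Done() is what lets a FindProviders-style caller stop early without triggering the panic reported in this issue.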