
Fix: leaking goroutines on aborted /add requests #1732

Merged
merged 1 commit into master from fix/goroutine-leak-adder on Jul 8, 2022

Conversation

@hsanjuan (Collaborator) commented Jul 8, 2022

It has been observed that some peers have a growing number of goroutines,
usually stuck in the go-libp2p-gorpc MultiStream() function, which is waiting
to read items from the arguments channel.

We suspect this is due to aborted /add requests. When the add request is
aborted or fails, Finalize() is never called and the blocks channel stays
open, so MultiStream() can never exit and the BlockStreamer can never stop
streaming.

As a fix, we added the requirement to call Close() whenever we stop using a
ClusterDAGService (error or not). This should ensure that the blocks channel
is always closed, not just on Finalize().
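
A minimal Go sketch of the pattern, using hypothetical names (dagService, block) rather than the actual ipfs-cluster or go-libp2p-gorpc APIs: a streaming goroutine ranges over a blocks channel and can only exit once that channel is closed, so Close() is made safe to call unconditionally instead of relying on Finalize() alone.

```go
// Hypothetical sketch of the leak and the fix; not the real ipfs-cluster code.
package main

import (
	"fmt"
	"sync"
)

type block struct{ data []byte }

// dagService feeds blocks to a streaming goroutine, similar in spirit to a
// ClusterDAGService feeding go-libp2p-gorpc's MultiStream().
type dagService struct {
	blocks    chan block
	closeOnce sync.Once
	done      chan struct{}
}

func newDAGService() *dagService {
	d := &dagService{
		blocks: make(chan block),
		done:   make(chan struct{}),
	}
	// The streamer blocks on d.blocks; if the channel is never closed,
	// this goroutine leaks -- the situation described above.
	go func() {
		defer close(d.done)
		for b := range d.blocks {
			fmt.Printf("streaming %d bytes\n", len(b.data))
		}
	}()
	return d
}

func (d *dagService) Add(b block) { d.blocks <- b }

// Finalize closes the service on the happy path.
func (d *dagService) Finalize() error { return d.Close() }

// Close may be called unconditionally (error or not) so the blocks channel
// is always closed and the streamer can exit; this mirrors the fix.
func (d *dagService) Close() error {
	d.closeOnce.Do(func() { close(d.blocks) })
	<-d.done
	return nil
}

func main() {
	d := newDAGService()
	defer d.Close() // cleanup even if the request is aborted before Finalize()

	d.Add(block{data: []byte("hello")})
	_ = d.Finalize() // Close() is idempotent, so calling both is safe
}
```

Deferring Close() right after constructing the service mirrors the new requirement: the blocks channel is closed even when the /add request is aborted before Finalize() ever runs.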

@hsanjuan self-assigned this Jul 8, 2022
@hsanjuan added this to the Release v1.0.3 milestone Jul 8, 2022
@hsanjuan merged commit c4d78d5 into master Jul 8, 2022
@hsanjuan deleted the fix/goroutine-leak-adder branch July 8, 2022 15:40
@RubenKelevra (Collaborator) commented Jul 9, 2022

I think I ran into this before on one of my servers, which has just 4 GB of memory. It mainly happens when QUIC is enabled; disabling QUIC reduces it by maybe a factor of 100.

I thought it was an issue in Kubo itself (still weird to use this name :D).

@hsanjuan (Collaborator, Author)

This is not related to QUIC memory usage.
