Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(share/discovery): deadlock in limitedSet #2190

Merged
merged 1 commit into from
May 13, 2023
Merged

Conversation

Wondertan
Copy link
Member

@Wondertan Wondertan commented May 12, 2023

Self-explanatory.

The deadlock eventually blocks the disconnect event processing in Discovery, causing libp2p connection processing to stall. Was discovered via libp2p metrics

@Wondertan Wondertan self-assigned this May 12, 2023
@Wondertan Wondertan added the kind:fix Attached to bug-fixing PRs label May 12, 2023
@Wondertan Wondertan added the area:shares Shares and samples label May 12, 2023
Copy link
Member

@renaynay renaynay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the reason for the ps.limit > 0 check inside of Remove anyway?

share/availability/discovery/set.go Show resolved Hide resolved
@Wondertan
Copy link
Member Author

Because there is no reason to grow set indefinitely when limits is zero

Copy link
Member

@walldiss walldiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to release asap

share/availability/discovery/set.go Show resolved Hide resolved
@renaynay
Copy link
Member

Eh weird @Wondertan - if discovery short-circuits bc peerslimit is 0, then Remove wouldn't even get called.

Anyway, approved

@Wondertan
Copy link
Member Author

The surprising fact here is that I remember fixing this, and I even found a commit in revamp discovery PR 23b590e, and @vgonkivs pointed tp the deadlock. But somehow, it was removed afterward.

@Wondertan
Copy link
Member Author

Eh weird @Wondertan - if discovery short-circuits bc peerslimit is 0, then Remove wouldn't even get called.

Yeah and there is no need to have that if, if that case is impossible

@Wondertan
Copy link
Member Author

Wondertan commented May 12, 2023

Found the commit which introduced the bug 82eb331.

Note that before that, I removed the if with lock(23b590e) completely in c5e0d00

@renaynay renaynay enabled auto-merge (squash) May 12, 2023 15:05
@renaynay renaynay disabled auto-merge May 12, 2023 15:06
@Wondertan Wondertan enabled auto-merge (squash) May 13, 2023 08:11
@Wondertan Wondertan merged commit 9baade4 into main May 13, 2023
17 of 20 checks passed
@Wondertan Wondertan deleted the hlib/fix-deadlock branch May 13, 2023 08:11
vgonkivs pushed a commit to vgonkivs/celestia-node that referenced this pull request May 22, 2023
The deadlock eventually blocks the disconnect event processing in Discovery, causing libp2p connection processing to stall. It was discovered via libp2p metrics
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:shares Shares and samples kind:fix Attached to bug-fixing PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants