bitswap/network: refactor connectEventManager more simply in bsnet #436
Codecov Report
@@            Coverage Diff             @@
##             main     #436      +/-   ##
==========================================
+ Coverage   49.80%   49.94%   +0.13%
==========================================
  Files         249      248       -1
  Lines       29972    29888      -84
==========================================
- Hits        14928    14927       -1
+ Misses      13615    13532      -83
  Partials     1429     1429
30693e7 to aca0769
FWIW
I think this is fine because libp2p's .Connect call does not return before the .Connected callback has returned. This new code also has similar dedup logic and state transitions. Fixes #432
This needs tests
This code was written the way it was to decouple connect/disconnect notifications from actually processing those notifications. When overloaded (lots of connect/disconnect events), this code is "well behaved" and will simply end up skipping over short-lived connections.
Specifically:
- We put the connected/disconnected/unresponsive states into a map when we receive a relevant event.
- Asynchronously, we update the system state to reflect this.
- Importantly, we don't spend any time processing "stale" events.
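A minimal sketch of that pattern, to make the three steps concrete. This is not the boxo connectEventManager: the type eventManager, its fields, and the demo function are hypothetical names, peers are plain strings, the unresponsive state is left out for brevity, and process is called directly here where a real system would run it on a separate worker goroutine.

```go
package main

import (
	"fmt"
	"sync"
)

// state is the latest connection state reported for a peer.
type state int

const (
	disconnected state = iota // zero value: unknown peers count as disconnected
	connected
)

// eventManager is a hypothetical, simplified manager (not the boxo type).
type eventManager struct {
	mu           sync.Mutex
	pending      map[string]state // newest unprocessed state per peer
	queue        []string         // peers with a pending state, in arrival order
	current      map[string]state // state last reported to the listener
	onConnect    func(peer string)
	onDisconnect func(peer string)
}

func newEventManager(onConnect, onDisconnect func(string)) *eventManager {
	return &eventManager{
		pending:      make(map[string]state),
		current:      make(map[string]state),
		onConnect:    onConnect,
		onDisconnect: onDisconnect,
	}
}

// setState only records the newest state and returns. It never calls the
// listener, so a notification handler that calls it does not block.
func (m *eventManager) setState(peer string, s state) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, queued := m.pending[peer]; !queued {
		m.queue = append(m.queue, peer)
	}
	m.pending[peer] = s // overwrite: older, stale events are dropped
}

// process drains the queue. In a real system this would run asynchronously
// on a worker goroutine that is woken whenever setState adds work.
func (m *eventManager) process() {
	for {
		m.mu.Lock()
		if len(m.queue) == 0 {
			m.mu.Unlock()
			return
		}
		peer := m.queue[0]
		m.queue = m.queue[1:]
		next := m.pending[peer]
		delete(m.pending, peer)
		prev := m.current[peer]
		if next == disconnected {
			delete(m.current, peer)
		} else {
			m.current[peer] = next
		}
		m.mu.Unlock()

		// Call the listener outside the lock, and only on a real transition.
		if prev == next {
			continue
		}
		if next == connected {
			m.onConnect(peer)
		} else {
			m.onDisconnect(peer)
		}
	}
}

// demo records which events reach the listener when peer A connects and
// disconnects before processing runs, while peer B stays connected.
func demo() []string {
	var seen []string
	m := newEventManager(
		func(p string) { seen = append(seen, "connect "+p) },
		func(p string) { seen = append(seen, "disconnect "+p) },
	)
	m.setState("A", connected)
	m.setState("A", disconnected)
	m.setState("B", connected)
	m.process()
	return seen
}

func main() {
	fmt.Println(demo()) // prints "[connect B]"
}
```

Peer A's short-lived connection never reaches the listener, because only the newest state per peer is kept and it matches what the listener already believes.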
This change will:
- Block all new connections in libp2p until bitswap can react to them. Blocking in Connected and Disconnected event handlers is absolutely forbidden.
- Cause the system to grind to a halt under load as (dis)connect events get delayed behind other (dis)connect events.
So yeah, don't do it.
Also note, this kind of change should be tested on a gateway and bootstrapper before merging. Testing there and then looking at stack dumps, mutex profiles, etc. will usually reveal problems like this.
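For reference, a rough sketch of how a stack dump and a mutex profile can be captured with the Go standard library alone. The helpers captureProfile and contend are hypothetical names invented for this example, and the contention is artificial. On a long-running node these profiles are more commonly pulled over HTTP via net/http/pprof.

```go
package main

import (
	"bytes"
	"os"
	"runtime"
	"runtime/pprof"
	"sync"
)

// captureProfile returns the named runtime profile in text form,
// or "" if no profile with that name exists.
func captureProfile(name string) string {
	p := pprof.Lookup(name)
	if p == nil {
		return ""
	}
	var buf bytes.Buffer
	if err := p.WriteTo(&buf, 1); err != nil {
		return ""
	}
	return buf.String()
}

// contend makes several goroutines fight over one mutex so that the
// mutex profile has something to report. It returns the final counter.
func contend() int {
	var (
		mu    sync.Mutex
		wg    sync.WaitGroup
		total int
	)
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				mu.Lock()
				total++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return total
}

func main() {
	// Sample every mutex contention event; the default of 0 records none.
	runtime.SetMutexProfileFraction(1)
	contend()
	os.Stdout.WriteString(captureProfile("goroutine")) // stack dump of all goroutines
	os.Stdout.WriteString(captureProfile("mutex"))     // where goroutines waited on locks
}
```

In the goroutine dump, many goroutines parked inside a notification handler would point to the kind of blocking described above.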
We already did that in #435. This is a more efficient and simpler way to do this. We could also change the session's internals to sort it out with the old behaviour, but this isn't the surgical win @rvagg and @hannahhoward were searching for.
OK, well the key thing we need to solve is in the boxo/bitswap/client/internal/session/session.go Lines 393 to 397 in 7ec68c5
We need to either make sure that when calling Some options for exploration that come to mind for me:
The problem with all of these, of course, is dealing with cleanup of peers we shouldn't have to care about for various reasons.
@Stebalien for reference, this is what we're trying to solve: #432 (comment)
So, what we should be doing there is:
Independently: