-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
p2p: peer state init too late and pex message too soon #3634
Conversation
Peer does not have a state yet. We set it in AddPeer. We need an new interface before mconnection is started
fix reconnection pex send too fast, error is caused lastReceivedRequests is still not deleted when a peer reconnected
Co-Authored-By: guagualvcha <baifudong@lancai.cn>
Co-Authored-By: guagualvcha <baifudong@lancai.cn>
Codecov Report
@@ Coverage Diff @@
## develop #3634 +/- ##
===========================================
+ Coverage 63.42% 63.45% +0.02%
===========================================
Files 217 217
Lines 18169 18157 -12
===========================================
- Hits 11524 11521 -3
+ Misses 5684 5671 -13
- Partials 961 965 +4
|
by removing the peer from the switch last (after we've stopped it and removed from all reactors)
Ok, I think the correct fix for #3338 is to remove the peer from the Switch last (after we've stopped it and removed from all reactors). This will prevent it from reconnecting to us before we've removed it as it will be rejected by the |
} | ||
|
||
// see stopAndRemovePeer | ||
func TestSwitchInitPeerIsNotCalledBeforeRemovePeer(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test passes even on develop
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test passes on develop because there's no InitPeer method
if you replace InitPeer with AddPeer
it will fail on develop
reactor.AddPeer(peer) | ||
}) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test needs the InitPeer method to pass 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused about the last test TestSwitchInitPeerIsNotCalledBeforeRemovePeer
(not sure it is meant to test the changes but it passes without the changes in this PR). The rest of the changes LGTM. I would like a 2nd pair of eyes by someone more familiar with the touched packages though @ebuchman @xla @ancazamfir
Thanks for removing the timeouts 👍 The test in question paniced now though: click to see log
Looks like with the last change introduced (or rather revealed) a non-deterministic test 🤔 (as |
I see the test |
I brought back timeouts, sorry. Maybe there's a way to get rid of them, need to think about it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By exposing InitPeer
it appears that we are making the p2p.Switch
responsible for lifecycle management of reactor specific peer state. Although this fixes #3346 and #3338. does it not expose us to the same class of bug? Would it possible to limit reactor specific peer lifecycle management to the current public interface (AddPeer, RemovePeer) to limit the burden on the Switch knowing what internal state the reactors need from peers?
Looking back over the previous iteration of this PR it appears that I'm a bit of a broken record. It looks like a better solution might be outside to scope of this particular issue and could be better serviced in a follow up PR. I've tried to capture my reservations in a separate issue #3679 such that at we can get the current bug fixed. 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted (eg. #3679) there's still lots of work to do here, but this should be good for now.
Thanks!
Original issue:
fix two issue #3346 and
#3338
1、fix reconnection pex send too fast,
error is caused lastReceivedRequests is still
not deleted when a peer reconnected
2、Peer does not have a state yet. We set it in AddPeer.
We need an new interface before mconnection is started
Replaced #3361