-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: use floodsub events to simplify discovery #176
Conversation
@dirkmc if you rebase from master you should get that dependency included.. |
It looks like there are a couple of pinner related tests failing that I don't think are related to the work I've been doing (but I will look into it in the morning). I think the discovery related code is ready for testing, maybe with the minimalist peer-pad @jimpick? I wrote a simple chat app to try it out, and it looks like the peers are connecting and setting up membership gossip within a couple of seconds (this is using the default rendezvous server at ws-star1.par.dwebops.pub). Watch for the |
FYI to see the log messages for web-socket-star, floodsub, and peer-star-app discovery paste this into the browser console and refresh: |
@pgte there are quite a lot of changes here, let me know if you would like to code review in real time and I can talk you through it :) |
@dirkmc thanks, I may take you up on that some time later! I pulled your branch locally and rebased from latest master and did a fresh install.
I did a bit of digging around and I found out that, when the tests fail, sometimes the connection from the nodes to the pinner is not closed after |
@pgte it seems like at some stage in the past libp2p must have called AppTransport.close() but it no longer seems to do that (it calls close on the listeners, not the transport itself) I can listen for when libp2p calls discovery.close(), but it seems like at that point libp2p has already stopped, so it's not possible to disconnect. I'm digging around in the libp2p code trying to work out how we can make it wait till we've closed out connections before shutting down. |
@dirkmc @pgte how much have you spoken with the (js)libp2p folks about this? I know service discovery is something they're interested in and it feels like we're basically using pubsub to implement our own discovery service because it's not done there yet. Maybe there's some way for us to help them out and ensure the libp2p discovery covers the peer-star-app use cases. |
@aschmahmann that seems like it could be useful for other projects, although our use case is quite particular - it feels a little experimental @pgte it seems like maybe the disconnects are not happening cleanly. I notice that when the pinner closes its connections, the pinner's push-protocol always detects the error and bails out, while the pull-protocol doesn't always notice the error. So there seems to be something about the connection disconnect that is not being detected. I will do more digging |
@dirkmc thank you! @aschmahmann there's a spec for a rendezvous protocol being designed here: libp2p/specs#56 . Is this what you were referring to? |
I looked into this more over the weekend. If I add a random delay before dialing, the replication test passes. My theory is that when libp2p-switch gets an incoming connection for a peer at the same time as it is dialing that same peer, it gets confused and loses track of some connections. When it shuts down, it closes all connections for a peer, but I believe it misses some of the connections, which is why the protocol pull stream never detects the connection ending. Tomorrow I'll try to write a test case against libp2p-switch that consistently fails. |
@pgte Yes, thanks. |
I created a test case to demonstrate the issue at #199 |
@dirkmc I dug into this a bit and I think you're right, apparently there's some problem with libp2p connection management. This is evident through your new test9: While I did some digging I think I found a problem in this package and the way that it handles connections: when you do Because I'd like to review these fixes with you, I suggest we continue this in another PR that joins these contributions to the connection management issues: #200 |
Fixes #153
Note: This PR is ready for review, but will not pass tests until release v0.15.2 is included into libp2p2, and we include that version of libp2p2 in peer-star-app