Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

with gossipsub subscribers may not receive the first couple of messages #331

Open
hugomrdias opened this issue May 12, 2020 · 18 comments
Open

Comments

@hugomrdias
Copy link
Member

hugomrdias commented May 12, 2020

While trying to sync js repos with go-ipfs 0.5.x i notice pubsub tests were failing for js-ipfs-http-client the specific test works like this:

  1. peer1 swarm connects to peer2
  2. peer1 and peer2 subscribe to topicA
  3. peer2 runs ipfs.pubsub.peers() multiple times until it sees peer1
  4. peer2 publishes a msg to the topic
  5. test asserts that both peers receive the msg

So this test started to fail with go-ipfs 0.5 talking to @aschmahmann we discovered that this behaviour is due to 0.5 defaulting to gossipsub because with floodsub the test passes.

To make the test pass with gossipsub i needed to add a ~2s (may work with less just a random number) delay between step 3 and 4.

Can we remove the need for that delay and make gossipsub work like floodsub ?

issue ipfs/js-ipfs#3030
pr ipfs/js-ipfs#3013

@Stebalien Stebalien added kind/bug A bug in existing code (including security flaws) and removed kind/bug A bug in existing code (including security flaws) labels May 12, 2020
@Stebalien
Copy link
Member

This is probably because "peers" is all peers we know that speak the topic while we really care about "grafted peers".

There's a new publish option WithReadyness(MinTopicSize(1)) that would allow us to require at least one peer before we let the publish succeed. However, that will stall if we're offline.

@vyzo
Copy link
Collaborator

vyzo commented May 19, 2020

Yes, it will take a heartbeat for the new peer to get grafted, which is by default 1s.

@hugomrdias
Copy link
Member Author

Is changing pubsub.peers() to grafted an option?

@vyzo
Copy link
Collaborator

vyzo commented May 19, 2020

Well, ListPeers returns all the known peers that speak the same pubsub protocol(s) as us.
It would be a breaking change in behavior to only return peers in the mesh -- perhaps we can add a new method that has the desired behavior.

@hugomrdias
Copy link
Member Author

@vyzo that sounds great! @Stebalien is this something we can bubble up to go-ipfs ? maybe in a arg like ipfs pubsub peers --grafted

@Stebalien
Copy link
Member

Yes, it will take a heartbeat for the new peer to get grafted, which is by default 1s.

We shouldn't need to wait.

@Stebalien
Copy link
Member

This needs to be fixed the right way: ipfs/kubo#7350

@hugomrdias
Copy link
Member Author

where can i follow this issue? the one you mentioned is already closed does that mean its already in master ?

@vyzo
Copy link
Collaborator

vyzo commented May 22, 2020

Actually, the delay has been fixed in v0.3.0:
https://github.com/libp2p/go-libp2p-pubsub/blob/master/gossipsub.go#L952

So when we join, we immediately graft some peers.

@Stebalien
Copy link
Member

@hugomrdias ipfs/kubo#6621

@achingbrain
Copy link
Member

achingbrain commented May 29, 2020

Actually, the delay has been fixed in v0.3.0:

This is biting us in the interop tests too, where the delay needs to be longer than 2s, more like 10s 20s. Actually who knows, it seems to vary wildly.

As it's fixed in go-libp2p-pubsub master will the fix ship in go-IPFS 0.6.x?

achingbrain added a commit to ipfs/interop that referenced this issue May 29, 2020
Because of libp2p/go-libp2p-pubsub#331 we have
to wait for a the gossipsub peer to become grafted before sending messages
we expect to be recieved.  There's no way to tell from the API which peers
are grafted so we just have to add an abitrary wait and hope it happens
during that time window.

Also refactors the pubsub tests to remove duplicated code.
achingbrain added a commit to ipfs/interop that referenced this issue May 29, 2020
Because of libp2p/go-libp2p-pubsub#331 we have
to wait for a the gossipsub peer to become grafted before sending messages
we expect to be recieved.  There's no way to tell from the API which peers
are grafted so we just have to add an abitrary wait and hope it happens
during that time window.

Also refactors the pubsub tests to remove duplicated code and the
ipns-pubsub tests to use named variables instead of array indices for
readability.
@achingbrain
Copy link
Member

It looks like these interop tests are failing with go-IPFS master, even though the build step says it succeeded: https://app.circleci.com/pipelines/github/ipfs/go-ipfs/2908/workflows/0b7a5ebf-cb6c-4960-b19e-fd4c1b61983c/jobs/34278

@achingbrain
Copy link
Member

Would it please be possible to have some way of getting the list of peers who are likely to receive a pubsub message for a given topic? That would unblock us and fix the interop tests which can then be made to fail the go-ipfs build if compatibility is broken again.

Getting a list of peers for a topic when those peers may not receive a message sent to the topic because they aren't grafted is not very useful.

@vyzo
Copy link
Collaborator

vyzo commented May 29, 2020

Actually, if you enable flood publishing in gsv1.1 (WithFloodPublish option), all the direct peers and not just the ones in the mesh will receive any messages you publish (modulo dropped messages of course).

We can still add a ListMeshPeers (or some other better name) interface if you can't or don't want to enable flood publishing.

@achingbrain
Copy link
Member

ipfs pubsub peers ... given a topic, will list connected peers who are subscribed to the named topic.

I think if you provide a method that claims to return the peers subscribed to a topic, then send a message to that topic and those peers do not receive the message, the user is likely to encounter significant surprise.

@vyzo
Copy link
Collaborator

vyzo commented May 29, 2020

Yes, that's true -- we'll have to update if we add a different interface. But I really do recommend enabling flood publishing, it's a security feature of the protocol.

@Stebalien
Copy link
Member

Flood publishing for the first hop sounds reasonable.

@Stebalien
Copy link
Member

Flood publishing for the first hop sounds reasonable.

ipfs/kubo#7394

#331 (comment)

🙄 ipfs/kubo#7395

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants