This repository has been archived by the owner on Apr 16, 2020. It is now read-only.

Explain IPFS PubSub Behaviour over WebSockets-Star Topology #23

Closed
harrshasri opened this issue Apr 23, 2018 · 25 comments

@harrshasri

harrshasri commented Apr 23, 2018

Hi @diasdavid David,

I have successfully integrated IPFS into my app at https://dukaanbabu.com, but I haven't published the IPFS update to the store yet, for a couple of reasons.

I know I have to wait for the DHT implementation in js-ipfs, but for now I am wondering about scalability on the websocket-star topology.

I am currently using websocket-star for PubSub, but I don't understand how websocket-star works for pubsub.

  • Is only the peers' signalling data shared through the server?
    (Or)
    Is all of the data transmitted via the signalling server?

  • Do peers subscribe/ask using a flooding technique after peer discovery (swarming)?
    (Or)
    Do they subscribe via the signalling (rendezvous) server?

My current config is

{
    Addresses: {
        Swarm: [
            // Will update the hosted rendezvous in production
            '/dns4/ws-star.discovery.libp2p.io/tcp/443/wss/p2p-websocket-star/',
            // '/dns4/star-signal.cloud.ipfs.team/tcp/443/wss/p2p-webrtc-star',
        ]
    },
    Discovery: {
        MDNS: {
            Enabled: true,
            Interval: 10
        }
        // webRTCStar: {
        //     Enabled: true
        // }
    },
    Bootstrap: [
        "/dns4/ams-1.bootstrap.libp2p.io/tcp/443/wss/ipfs/QmSoLer265NRgSp2LA3dPaeykiS1J6DifTC88f5uVQKNAd",
        "/dns4/lon-1.bootstrap.libp2p.io/tcp/443/wss/ipfs/QmSoLMeWqB7YGVLJN3pNLQpmmEk35v6wYtsMGLzSr5QBU3",
        "/dns4/sfo-3.bootstrap.libp2p.io/tcp/443/wss/ipfs/QmSoLPppuBtQSGwKDZT2M73ULpjvfd3aZ6ha4oFGL1KrGM",
        "/dns4/sgp-1.bootstrap.libp2p.io/tcp/443/wss/ipfs/QmSoLSafTMBsPKadTEgaXctDQVcqN88CNLHXMkTNwMKPnu",
        "/dns4/wss0.bootstrap.libp2p.io/tcp/443/wss/ipfs/QmZMxNdpMkewiVZLMRxaNxUeZpDUb34pWjZ1kZvsd16Zic",
        "/dns4/wss1.bootstrap.libp2p.io/tcp/443/wss/ipfs/Qmbut9Ywz9YEDrz8ySBSgWyJk41Uvm2QJPhwDJzJyGFsD6"
    ]
}

Here is a sneak peek of GitHub for Shopping:
[Image: pricegraph]

@daviddias
Contributor

@pgte mind getting this one?

@pgte
Contributor

pgte commented Apr 23, 2018

Hi @harrshasri,

This is based on my recent sparse knowledge of the ipfs and libp2p stack, so please @diasdavid correct me if I'm wrong or imprecise.

There are two layers in your question. First is the floodsub algorithm (which is the current naive implementation of pubsub in ipfs) and then there is the websocket-star protocol.

Floodsub is a very naive and simple protocol: when a peer connects, we dial the floodsub protocol (multiplexed over the peer connection). Each node keeps a list of all the nodes it's connected to. Each node updates the remote nodes on the topics it's interested in. When receiving a message, a node checks to see if that message was already processed. If not, 1) the node caches the message id, 2) emits that (topic, message) to the user and 3) forwards that message to every node that's interested in that topic. As you can see, there is no overlay network found here. It simply constructs a star overlay on every known peer that the transport connects to.
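To make those three steps concrete, here is a minimal in-memory sketch of the rules as described above. It is not the js-libp2p-floodsub code: `FloodNode`, `connect` and the message shape are hypothetical names chosen for illustration, and peers call each other directly instead of going through a real transport.

```javascript
// Minimal in-memory sketch of the floodsub rules (hypothetical names, no real transport).
class FloodNode {
  constructor (id) {
    this.id = id
    this.peers = new Map()          // peer id -> { node, topics } for every connected peer
    this.subscriptions = new Set()  // topics this node is interested in
    this.seen = new Set()           // ids of messages already processed
    this.delivered = []             // (topic, data) pairs emitted to the user
    this.nextSeq = 0
  }

  // Connect two nodes and exchange their current subscriptions.
  connect (other) {
    this.peers.set(other.id, { node: other, topics: new Set(other.subscriptions) })
    other.peers.set(this.id, { node: this, topics: new Set(this.subscriptions) })
  }

  // Declare interest in a topic and update every connected peer.
  subscribe (topic) {
    this.subscriptions.add(topic)
    for (const peer of this.peers.values()) {
      peer.node.peers.get(this.id).topics.add(topic)
    }
  }

  publish (topic, data) {
    this.receive({ id: `${this.id}:${this.nextSeq++}`, topic, data })
  }

  receive (msg) {
    if (this.seen.has(msg.id)) return             // already processed: drop
    this.seen.add(msg.id)                         // 1) cache the message id
    if (this.subscriptions.has(msg.topic)) {      // 2) emit (topic, message) to the user
      this.delivered.push({ topic: msg.topic, data: msg.data })
    }
    for (const peer of this.peers.values()) {     // 3) forward to every interested peer
      if (peer.topics.has(msg.topic)) peer.node.receive(msg)
    }
  }
}

const a = new FloodNode('a')
const b = new FloodNode('b')
const c = new FloodNode('c')
a.connect(b); b.connect(c); a.connect(c)
b.subscribe('prices'); c.subscribe('prices')
a.publish('prices', '42')
console.log(b.delivered.length, c.delivered.length)
```

Even though `b` and `c` each see the message twice (once from `a`, once from each other), the cached message id ensures that each of them emits it to the user only once.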

Now, the websocket-star transport: The websocket-star server serves as both discovery and transport relay, connecting peers through it.

So, trying to answer your question: when a node connects to a websocket-star server, that peer's address is periodically broadcast to every other connected peer. When it finds out about a new peer, js-ipfs dials it immediately, which makes the floodsub protocol dial it as well. That dial happens through the websocket-star-rendezvous server (which is not only a rendezvous server, but also a 2-hop relay).
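As a rough illustration of that double role, the sketch below models a server that both announces peers to each other and relays every payload between them. It is a toy model, not the websocket-star-rendezvous implementation: `StarRelay`, `join`, `send` and the handler names are hypothetical, and it announces peers once on join rather than periodically.

```javascript
// Toy model of a server that does discovery AND relays traffic (hypothetical names).
class StarRelay {
  constructor () {
    this.peers = new Map()   // peer id -> { onPeer, onData } callbacks
    this.relayedBytes = 0    // payload bytes that passed through the server
  }

  // Discovery: tell existing peers about the newcomer, and the newcomer about them.
  join (id, handlers) {
    for (const [otherId, other] of this.peers) {
      other.onPeer(id)
      handlers.onPeer(otherId)
    }
    this.peers.set(id, handlers)
  }

  // Relay: peers never talk directly, so every payload is carried by the server.
  send (from, to, payload) {
    const target = this.peers.get(to)
    if (!target) return false
    this.relayedBytes += Buffer.byteLength(payload)
    target.onData(from, payload)
    return true
  }
}

const relay = new StarRelay()
const discoveredByA = []
const inboxB = []
relay.join('A', { onPeer: (id) => discoveredByA.push(id), onData: () => {} })
relay.join('B', { onPeer: () => {}, onData: (from, payload) => inboxB.push({ from, payload }) })
relay.send('A', 'B', 'price:42')
console.log(discoveredByA, inboxB, relay.relayedBytes)
```

The `relayedBytes` counter is the point of the sketch: in this topology the server's bandwidth grows with the message traffic, not just with the number of peer announcements.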

@harrshasri Does this answer your question?

@harrshasri
Author

harrshasri commented Apr 23, 2018

@pgte
So it is flooding via the signalling server, which means the signalling server bears the bandwidth cost not only of the peerInfo but also of the message/content transfer between nodes.

Did I understand correctly?

@pgte
Contributor

pgte commented Apr 23, 2018

Correct. It's the nature of websockets: they don't allow direct p2p connections, while the webrtc-based ones do.

@harrshasri
Author

But we don't have PubSub for webrtc functioning as of now, right?

@pgte
Contributor

pgte commented Apr 23, 2018

WebRTC works (through the libp2p-webrtc-star transport), and it allows pubsub (it treats it transparently, like any other multiplexed protocol).

@harrshasri
Author

harrshasri commented Apr 23, 2018

I will give it another try. But when I tested it last week, it didn't discover its peer. That's why I asked you this question.

So what happens with WebRTC for pubsub peer discovery?

Does everybody in the network swarm after rendezvous signalling, and then send floodsub queries to their peers based on the topic?
Or
Do they swarm based on the topic via the rendezvous server, and then floodsub the content?

@pgte
Contributor

pgte commented Apr 23, 2018

It's the same process as described, with the exception that the p2p connections should happen directly, without needing a relay server.

@harrshasri
Author

harrshasri commented Apr 23, 2018

Okay, so does it swarm with everybody irrespective of the topic?
If there are 1000 nodes in the room, does every client open 1000 connections?

@pgte
Contributor

pgte commented Apr 23, 2018

No, it still uses the floodsub protocol irrespective of transport:

Each node keeps a list of all the nodes it's connected to. Each node updates the remote nodes on the topics it's interested in. When receiving a message, a node checks to see if that message was already processed. If not, 1) the node caches the message id, 2) emits that (topic, message) to the user and 3) forwards that message to every node that's interested in that topic.

@harrshasri
Author

So a node sends its subscriptions to every peer it knows. What if the topic is not published by any of those known peers?
this.peers.forEach((peer) => sendSubscriptionsOnceReady(peer))

@harrshasri
Author

I'm sorry I didn't understand this part.

If not,

  1. the node caches the message id,
  2. emits that (topic, message) to the user and
  3. forwards that message to every node that's interested in that topic.

@pgte
Contributor

pgte commented Apr 23, 2018

In floodsub, a node must declare interest in a topic for it to get messages on that topic.

When receiving a message from a remote node, a node forwards that message to a known node if and only if that node is interested in that topic.

This simplistic approach has, among others, the downside that poorly connected nodes have a low probability of getting messages on unpopular topics.

@harrshasri
Author

So would you recommend a DHT for my use case, since PubSub peer and content discovery takes time?

@pgte
Contributor

pgte commented Apr 23, 2018

Could you describe your use case? What would you use pubsub for?

@harrshasri
Author

I understand how PubSub behaves when there are no publishers. But what about when there are enough publishers, just not within the list of peers known to the subscribing node? How does it discover the publishers?

@pgte
Contributor

pgte commented Apr 23, 2018

They don't explicitly. To receive the message on a given topic, each node must already be connected to a publisher or to a node that's interested in that topic.
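A small reachability sketch can show why. Assuming the forwarding rule described in this thread (a message is only passed to peers that declared interest in its topic), the hypothetical `recipients` helper below walks a connection graph and returns who would actually get a message. The node names and the graph are made up for illustration.

```javascript
// Who receives a message if it is only forwarded to interested peers?
// links: array of [a, b] connections; interested: Set of ids subscribed to the topic.
function recipients (links, interested, publisher) {
  const neighbours = new Map()
  for (const [a, b] of links) {
    if (!neighbours.has(a)) neighbours.set(a, new Set())
    if (!neighbours.has(b)) neighbours.set(b, new Set())
    neighbours.get(a).add(b)
    neighbours.get(b).add(a)
  }
  const seen = new Set([publisher])
  const queue = [publisher]
  while (queue.length > 0) {
    const current = queue.shift()
    for (const next of neighbours.get(current) || []) {
      // Forward only to peers that declared interest and have not seen the message yet.
      if (interested.has(next) && !seen.has(next)) {
        seen.add(next)
        queue.push(next)
      }
    }
  }
  seen.delete(publisher)
  return [...seen].sort()
}

// 'pub' and 'sub' are only connected through 'middle'.
const links = [['pub', 'middle'], ['middle', 'sub']]
console.log(recipients(links, new Set(['sub']), 'pub'))            // []
console.log(recipients(links, new Set(['middle', 'sub']), 'pub'))  // [ 'middle', 'sub' ]
```

In the first call the subscriber gets nothing because the only node between it and the publisher is not interested in the topic; once that node subscribes as well, the message gets through.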

@harrshasri
Author

harrshasri commented Apr 23, 2018

I am sharing price data, which is published by some users from their wish lists. People who are browsing the product page will subscribe and retrieve the price history.

This is a subscriber:

[Image: pricegraph]

It queries the other publishers to get the price data.
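For readers following along, this kind of flow could be sketched on top of the js-ipfs pubsub API roughly as below. This is not the app's actual code: the `prices/` topic prefix, the JSON payload shape and both helper names are assumptions made for illustration. Only `ipfs.pubsub.publish(topic, data)` and `ipfs.pubsub.subscribe(topic, handler)` come from js-ipfs, and the `ipfs` argument is expected to be a started node with pubsub enabled.

```javascript
// Hypothetical helpers around ipfs.pubsub; topic naming and payload shape are assumptions.
const PRICE_TOPIC_PREFIX = 'prices/'

// Publisher side: announce a price for a product on that product's topic.
function publishPrice (ipfs, productId, price) {
  const payload = Buffer.from(JSON.stringify({ productId, price, at: Date.now() }))
  return ipfs.pubsub.publish(PRICE_TOPIC_PREFIX + productId, payload)
}

// Subscriber side: listen on the product's topic and hand decoded updates to onPrice.
function watchPrices (ipfs, productId, onPrice) {
  return ipfs.pubsub.subscribe(PRICE_TOPIC_PREFIX + productId, (msg) => {
    let update
    try {
      update = JSON.parse(msg.data.toString())
    } catch (err) {
      return // ignore payloads that are not valid JSON
    }
    onPrice(update, msg.from)
  })
}
```

Note that this only delivers prices published while the subscriber is listening and connected, directly or through interested peers, to a publisher. Pubsub by itself does not replay older messages, so retrieving a history would need publishers to re-announce it or some other storage.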

@pgte
Contributor

pgte commented Apr 23, 2018

@harrshasri does my last answer answer your question?

@harrshasri
Author

They don't explicitly. To receive the message on a given topic, each node must already be connected to a publisher or to a node that's interested in that topic.

Aah! That's why it didn't work for me over WebRTC: the data wasn't transferring. Since websocket-star acts as a 2-hop relay, the app was working with it.

Now it makes sense.

I have only two options:
1) WebSockets
2) PubSub over DHT

@harrshasri
Author

I now need to think about the scalability requirements of the rendezvous server until the DHT is implemented in js-ipfs.

One more question: do we have Bitswap in js-ipfs yet?

@pgte
Contributor

pgte commented Apr 23, 2018

Yes, bitswap is the block exchange protocol, used in the object, files and DAG APIs.

@harrshasri
Author

harrshasri commented Apr 23, 2018

But we aren't leveraging that in PubSub Layer, are we?

@pgte
Contributor

pgte commented Apr 23, 2018

pubsub does not use bitswap.
Pub-sub is designed to be a real-time(ish) best-effort topic-based message delivery system and has nothing to do with the content-addressable part of IPFS.

@harrshasri
Author

harrshasri commented Apr 23, 2018

Oh! I was thinking PubSub was built on top of Bitswap.
There is some research involving PubSub over DHT, which I think is very necessary if I want to remove the 2-hop relay or make it completely decentralised.

https://discuss.ipfs.io/t/dht-on-pubsub-and-general-pubsub-improvement/1692/2

@pgte pgte closed this as completed May 15, 2018