GossipSub Improvements blog post #130
Conversation
Thank you for this post!
I will provide the rest of this feedback tomorrow.
(Many of the suggestions are sembr-related; you can just accept them to integrate these. Do not forget to "git pull" after accepting them, to update your local copy.)
Most of the research on P2P networks provides simulation results assuming nodes with similar capabilities.
The aspect of dissimilar capabilities and resource-constrained nodes is less explored.

It is discussed in GOAL1 that overlay mesh results in better performance if $D_{avg} = \ln(N) + C$.
What about gossipsub peer scoring in that context?
Peer scoring is a function of neighbors, not the node itself. It may help control D for a node (as a response) with some manipulation. It will at least require node-specific thresholds, etc.
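The $D_{avg} = \ln(N) + C$ relation quoted above can be made concrete with a short sketch. This is illustrative only; the choice of $C = 1$ is an assumption for the example, not a value from the post:

```python
import math

# Illustrative: mesh-degree average D_avg = ln(N) + C.
# C = 1 is an assumed small constant for this example.
def d_avg(n: int, c: float = 1.0) -> int:
    return round(math.log(n) + c)

for n in (100, 10_000, 1_000_000):
    print(n, d_avg(n))  # degree grows only logarithmically with network size
```

Under these assumptions, the target degree grows from 6 at 100 nodes to only 15 at a million nodes.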
At the same time, connecting high-bandwidth nodes through a low-bandwidth node undermines the network's performance.
Ideally, every node should contribute proportionally to the available resources. A better solution involves a two-phased operation:

1. Every node computes its available bandwidth and selects a node degree $D_i$ proportional to its available bandwidth. Different bandwidth estimation approaches are suggested in literature [4,5].
Every node computes its available bandwidth and selects a node degree $D_i$ proportional to its available bandwidth

Is there a source where this has been proposed?
in [3]
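The bandwidth-proportional degree selection discussed above could look roughly like the following sketch. The per-link bandwidth cost and the clamp to $[D_{Low}, D_{High}]$ are assumptions for illustration, not parameters from [3]:

```python
# Hypothetical sketch: pick a node degree D_i proportional to available
# bandwidth, clamped to assumed protocol bounds [d_low, d_high].
def select_degree(available_bw_mbps: float,
                  bw_per_link_mbps: float = 2.0,
                  d_low: int = 4,
                  d_high: int = 12) -> int:
    d = int(available_bw_mbps / bw_per_link_mbps)
    return max(d_low, min(d_high, d))
```

With these assumed numbers, a 16 Mbps node would pick $D_i = 8$, while very slow or very fast nodes are clamped to the bounds.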
Co-authored-by: kaiserd <1684595+kaiserd@users.noreply.github.com>
Here is the rest of my feedback :)
However, problems like peer churn and in-network adversaries can be best alleviated through balanced redundant coverage.

# References
[1] D. Vyzovitis, Y. Napora, D. McCormick, D. Dias, and Y. Psaras, “Gossipsub: Attack-resilient message propagation in the Filecoin and Eth2.0 networks,” arXiv preprint arXiv:2007.02754, 2020.
A hyperlink in the article would be good :), as well as a hyperlink in the reference list. (This post is not meant for print.)
To further conform to the suggested mesh-degree average $D_{avg}$, every node tries achieving this number within its neighborhood, resulting in an overall similar mesh-degree average.

2. From the available local view, every node tries connecting peers with the lowest latency until $D$ connections are made.
We suggest referring to the peering solution discussed in GOAL5 to avoid partitioning.
Is this also from [3]? We should add a ref here, too.
Readers might ask: will this lead to starvation of high-latency nodes? Will high-latency nodes still receive the data?
E.g., nodes spread across ocean cables, etc.
This generates a very interesting discussion. I believe this will create partitions (clusters), where the nearest peers communicate with each other (at very low latency). That is why I suggested in GOAL5 that high-capacity peers (from each cluster) must create an additional overlay.
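Step 2 above (connect to the $D$ lowest-latency peers from the local view) can be sketched as follows; the peer names and latency values are hypothetical:

```python
# Sketch: from the locally known peers, connect to the d lowest-latency ones.
def pick_lowest_latency(latencies_ms: dict[str, float], d: int) -> list[str]:
    return sorted(latencies_ms, key=latencies_ms.get)[:d]

peers = {"a": 120.0, "b": 15.0, "c": 40.0, "d": 250.0, "e": 22.0}
print(pick_lowest_latency(peers, 3))  # ['b', 'e', 'c']
```

As the discussion above points out, purely latency-greedy selection tends to form geographic clusters, hence the suggestion of an additional overlay connecting clusters.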
## GOAL3: Bandwidth Maximization
Redundant message transmissions are essential for handling adversaries/node failure. However, these transmissions result in traffic bursts, cramming many overlay links.
This not only adds to the network-wide message dissemination latency but a significant share of the network's bandwidth is wasted on (usually) unnecessary transmissions.
It is essential to explore solutions that can minimize the impact of redundant transmissions while assuring resilience against node failures.
minimize the impact of redundant transmissions

Is this about minimizing the impact of each individual redundant transmission, or is it about minimizing the number of redundant transmissions?
(I assume the latter; I'd make this explicit.)
## GOAL5: Scalability
P2P networks are inherently scalable because every incoming node brings in bandwidth and compute resources.
Under such arrangements, it is desirable to achieve IP-like scalability, as seen in Bittorrent.
What do you mean by IP-like scalability?
If the average message arrival rate is L bytes/sec, any node joining the network must have approximately L × D bandwidth available. As long as this holds, we can keep adding new nodes. However, this increases network-wide message dissemination latency. Keeping the latency constant requires linking D to the network size!
This should be in the blog post. It is more expressive than IP-like scalability
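The back-of-the-envelope relation in the comment above can be written out; the concrete numbers here are illustrative assumptions, not figures from the post:

```python
import math

# Assumed example values, not figures from the post.
L = 100_000          # average message arrival rate, bytes/sec
D = 8                # mesh degree
N = 10_000           # network size

required_bw = L * D                   # ~bandwidth a joining node must sustain
min_hops = math.ceil(math.log(N, D))  # lower bound on dissemination hops

print(required_bw)   # 800000 bytes/sec
print(min_hops)      # 5 hops
```

Growing N with D fixed leaves `required_bw` constant but increases `min_hops`, which is exactly the trade-off discussed above.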
Most efforts for bringing scalability to the P2P systems have focused on curtailing redundant transmissions and flat overlay adjustments. Hierarchical overlay designs, on the other hand, are less explored.
Placing a logical structure in unstructured P2P systems can help scale the P2P networks.

We suggest using a hierarchical overlay inspired by the approaches [14-16]. An abstract operation of the suggested overlay design is provided below:
- We suggest using a hierarchical overlay inspired by the approaches [14-16]. An abstract operation of the suggested overlay design is provided below:
+ [14-16] propose using a hierarchical overlay inspired by the approaches.
+ An abstract operation of the suggested overlay design is provided below:
We should not say that we suggest using a hierarchical approach. In fact, we would like to avoid that if possible.
Let's simply list that as one possibility.
We suggest using a hierarchical overlay inspired by the approaches [14-16]. An abstract operation of the suggested overlay design is provided below:

1. We cluster nodes based on locality, assuming that such peers will have lower intra-cluster latency and higher bandwidth.
We should not use "we" here, but a more neutral tone. This is one possibility, not (necessarily) what we propose.
We would need more insights to suggest this.
Also applies to the rest of the "we" in the following.
3. Virtual nodes form a fully connected mesh to construct a hierarchical overlay.
Each virtual node is essentially a collection of super nodes; a link to any of the constituent super nodes represents a link to the virtual node.

4. We suggest using GossipSub for intra-cluster message dissemination and FloodSub for inter-cluster message dissemination.
Here, too, something like:
"A possible idea is.."
We do not suggest this.
Co-authored-by: kaiserd <1684595+kaiserd@users.noreply.github.com>
Publishing through a D-regular overlay triggers approximately $N \times D$ transmissions.
Reducing $D$ reduces the redundant transmissions but compromises reachability and latency.

Sharing metadata through a K-regular overlay (where $K > D$) allows nodes to pull missing messages.
AFAIK, the K overlay is gossiping (since it happens at fixed intervals), and the D overlay is flooding (or maybe another name, since we don't flood to everyone; not sure).
(So the description on line 32 isn't accurate, since it calls D the gossip.)
GossipSub uses eager push (through overlay mesh) and lazy push (through IWANT messages).

The mesh degree $D_{Low} \leq D \leq D_{High}$ is crucial in deciding message dissemination latency.
A smaller value for $D$ results in higher latency due to increased rounds, whereas a higher $D$ reduces latency on the cost of increased redundancy.
whereas a higher $D$ reduces latency on the cost of increased redundancy.

Should we say "at the cost of increased bandwidth usage" to be more explicit?
Also, this is only true up to the point where congestion slows down propagation (but maybe we don't have to say that here).
The mesh degree $D_{Low} \leq D \leq D_{High}$ is crucial in deciding message dissemination latency.
A smaller value for $D$ results in higher latency due to increased rounds, whereas a higher $D$ reduces latency on the cost of increased redundancy.
It is suggested that the average mesh degree should be $D_{avg} = \ln(N) + C$ for an optimal operation, where $N$ is the network size and C is a small constant.
Where is that coming from?
from [3]
This is only true if you want to keep the latency stable as the network grows, but that will cost more bandwidth.
Most networks using GossipSub won't make that trade-off, and will use a fixed D (which means constant bandwidth, with latency increasing as the network grows).
Also, methods used to automatically estimate the network size will most likely be susceptible to attacks.
Speaking of attacks, Sybil resistance is a big part of choosing a D, since they are directly connected.
Let's say you have D = 8, and then 75% of the network mounts an attack.
Your effective D will become 8 × 0.25 = 2 until the network recovers (recovery speed depends on the scoring system).
All of that is to say: choosing a D is a complex subject, and I don't think we can sum it up in one sentence like that.
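The attack arithmetic from the comment above, spelled out as a tiny sketch:

```python
# 75% of the mesh turns adversarial: the effective degree drops until
# the scoring system replaces the attacking peers.
D = 8
honest_fraction = 0.25
effective_D = D * honest_fraction
print(effective_D)  # 2.0
```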
A better solution involves a two-phased operation:

1. Every node computes its available bandwidth and selects a node degree $D_i$ proportional to its available bandwidth [3].
That would only be applicable in networks where participants have bandwidth to spare.
Generally, home users don't want to spend all of their bandwidth to participate in some network (though maybe we could look at something like µTP).
And datacenter users tend to pay for their bandwidth, so even though they have a lot at hand, they might not want to waste it.
However, I can see a benefit to lowering D locally if the node doesn't have enough bandwidth, as this would slow down the network either way.
@ufarooqstatus did you add this? This should be reflected in the blog post.
Suppose that we have three equal-length messages $x1, x2, x3$. Assuming an XOR coding function,
we know two trivial properties: $x1 \oplus x2 \oplus x2 = x1$ and $\vert x1 \vert = \vert x1 \oplus x2 \oplus x2 \vert$.

This implies that instead of sending messages individually, we can encode and transmit composite message(s) to the network.
This also implies that the messages are coming from the same node, which is only one specific use case
I think a protocol can be devised (maybe by adding peer IDs and message hashes/IDs) to indicate combined messages.
Is this improvement worth the complexity it adds? Maybe just add this question to mark that this is just an option.
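The XOR properties quoted above are easy to verify on byte strings. This is a toy demonstration of the coding idea, not a protocol design:

```python
# Toy check of x1 ^ x2 ^ x2 == x1 and |x1| == |x1 ^ x2| on equal-length messages.
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

x1, x2 = b"message-1", b"message-2"   # equal-length example messages
composite = xor(x1, x2)               # one encoded transmission
assert xor(composite, x2) == x1       # a peer holding x2 recovers x1
assert len(composite) == len(x1)      # the composite is no larger than x1
```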
We can parallelize message transmission by partitioning large messages into smaller chunks, letting intermediate peers relay chunks as soon as they receive them.
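The chunked relaying described above can be sketched minimally; the chunk size is an assumed parameter:

```python
# Split a large message into fixed-size chunks so intermediate peers can
# forward each chunk as soon as it arrives, instead of store-and-forward
# of the whole message.
def chunk(message: bytes, size: int) -> list[bytes]:
    return [message[i:i + size] for i in range(0, len(message), size)]

parts = chunk(b"a large gossip payload", 8)
assert b"".join(parts) == b"a large gossip payload"  # lossless reassembly
```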
## GOAL5: Scalability |
What do you want to scale here?
Number of participants? I'd argue that GossipSub already scales that at O(log n), so it's already quite scalable.
Sure, we could aim for O(log log n), but with the current scalability properties, going from 10k nodes to 100 million nodes would roughly double the latency, which already seems quite reasonable.
(100 million being apparently the number of BitTorrent users, which seems like a high target already.)
Scaling message sizes? That's interesting, and that's already described in GOAL4.
Scaling message count? That's interesting, and probably the next step of GS research, but that's already described in GOAL3.
If the average message arrival rate is L bytes/sec, any node joining the network must have approximately L × D bandwidth available. As long as this holds, we can keep adding new nodes. However, this increases network-wide message dissemination latency (at least log_D(N) hops are needed for network-wide dissemination). Per-hop latency can vary from a few milliseconds to a few hundred milliseconds, so connecting nearby nodes lowers the average per-hop latency. And many applications place an upper bound on network-wide message dissemination time.
Please add this to the blog post.
Signed-off-by: ksr <kaiserd@users.noreply.github.com>
Research blog post on P2P network scaling with GossipSub as a reference protocol