Debug why gossip push messages are not propagated effectively enough #28642

Closed
behzadnouri opened this issue Oct 27, 2022 · 2 comments
behzadnouri commented Oct 27, 2022

Problem

Metrics indicate that we still rely more than desired on pull requests to propagate CRDS values. Pull requests are slow and add significant bandwidth overhead.

Proposed Solution

Investigate why messages are not propagated effectively through push.

  • Something wrong with prune messages preventing active push from working properly?
  • Push active set is shuffled too frequently, so it does not get enough time to span the cluster?
  • Need a larger push fanout since the cluster size has grown by multiple x since the code was written.
  • Stake weights for random peer selection are not ideal; e.g. too much concentration at high stake nodes.
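
To put the fanout question in rough numbers, here is a minimal sketch of the best-case hop count for a given fanout. It assumes an idealized tree in which every node forwards to `fanout` peers that have not yet been reached, with no duplicates and no pruning. The function name and the cluster sizes and fanout values used with it are illustrative, not taken from the gossip code.

```rust
/// Best-case number of push hops needed to reach `cluster_size` nodes when
/// every reached node forwards to `fanout` not-yet-reached peers.
/// Idealized model (an assumption): no duplicates, no pruning, no packet loss.
fn min_push_hops(cluster_size: u64, fanout: u64) -> u32 {
    assert!(fanout >= 1, "fanout must be at least 1");
    let mut reached: u64 = 1; // the origin already has the value
    let mut frontier: u64 = 1; // nodes that received the value in the last hop
    let mut hops = 0;
    while reached < cluster_size {
        // Each node on the frontier pushes to `fanout` new peers.
        frontier = frontier.saturating_mul(fanout);
        reached = reached.saturating_add(frontier);
        hops += 1;
    }
    hops
}
```

Under these assumptions `min_push_hops(4000, 6)` is 5 and `min_push_hops(4000, 9)` is 4, so a larger fanout only trims the best case by about one hop. The real cluster will do worse than this bound because of duplicates and pruning.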

See also #11698.

behzadnouri commented Dec 5, 2022

Each node maintains an active_set of peers to which it pushes gossip messages:
https://github.com/solana-labs/solana/blob/718f43320/gossip/src/crds_gossip_push.rs#L67-L68

On the receiving end, each node maintains a received_cache, which is a lagging view of which nodes currently have this node in their active_set:
https://github.com/solana-labs/solana/blob/718f43320/gossip/src/crds_gossip_push.rs#L67-L68
It then uses this cache to prune all but a few of those nodes:
https://github.com/solana-labs/solana/blob/718f43320/gossip/src/crds_gossip_push.rs#L147-L209

One issue arises when the active_set is rotated, which happens every 7.5 seconds:
https://github.com/solana-labs/solana/blob/718f43320/gossip/src/crds_gossip_push.rs#L344-L428

In the simplest case:

  • Say node a is pushing messages to node b, and so b prunes other nodes.
  • At some point a refreshes its active set and stops pushing to b.
  • Ideally, some other node c might randomly pick b, add it to its active set and start pushing to b.
  • However, b does not know that a is not going to push to it anymore so it will prune c, and c will stop pushing to b.

So b stops receiving pushed messages until two things happen. First, the entries for a in received_cache must be purged, which currently takes 2.5 minutes:
https://github.com/solana-labs/solana/blob/718f43320/gossip/src/crds_gossip_push.rs#L57
https://github.com/solana-labs/solana/blob/718f43320/gossip/src/crds_gossip.rs#L332-L333
Second, some other node must randomly pick b and add it to its active set.
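
The scenario above can be reproduced with a toy model. This is a simplified sketch, not the actual received_cache: it assumes a single origin, keeps exactly one pusher instead of "a few", and uses hypothetical names (`Receiver`, `on_push`, `pruned_after_rotation`). Only the 2.5 minute expiry is taken from the description above.

```rust
use std::collections::HashMap;

/// Expiry of received_cache entries: 2.5 minutes, as described above.
const CACHE_TTL_MS: u64 = 150_000;

/// Toy model of node b (hypothetical, single origin only).
#[derive(Default)]
struct Receiver {
    /// Peer -> last time (ms) that peer was accepted as a pusher.
    received_cache: HashMap<&'static str, u64>,
}

impl Receiver {
    /// Handles a push from `peer` at time `now`; returns true if `peer` gets pruned.
    fn on_push(&mut self, peer: &'static str, now: u64) -> bool {
        // Drop entries older than the TTL.
        self.received_cache
            .retain(|_, last| now.saturating_sub(*last) < CACHE_TTL_MS);
        // If b still believes some other peer is pushing to it, the new peer is redundant.
        let has_other_pusher = self.received_cache.keys().any(|&p| p != peer);
        if has_other_pusher {
            true
        } else {
            self.received_cache.insert(peer, now);
            false
        }
    }
}

/// Node a pushes to b at t = 0 and then silently rotates b out of its active set.
/// Node c picks b `gap_ms` later. Returns whether b prunes c.
fn pruned_after_rotation(gap_ms: u64) -> bool {
    let mut b = Receiver::default();
    let a_pruned = b.on_push("a", 0);
    assert!(!a_pruned); // a is b's first pusher, so it is kept
    b.on_push("c", gap_ms)
}
```

In this model `pruned_after_rotation(10_000)` returns true: c is pruned 10 seconds after a went quiet, because the stale entry for a is still in the cache. Only once the gap reaches `CACHE_TTL_MS` does c get accepted.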

cc @aeyakovenko @carllin @sakridge

behzadnouri added a commit to behzadnouri/solana that referenced this issue Dec 11, 2022
As described here:
solana-labs#28642 (comment)
the current gossip pruning code fails to maintain spanning trees across
the cluster.

This commit instead implements pruning based on the timeliness of
delivered messages. If a message is delivered timely enough (in terms
of number of duplicates already observed for that value), it counts
towards the respective node's score. Once there are enough CRDS
upserts from a specific origin, redundant nodes are pruned based on the
tracked score.

Since the pruning leaves some configurable redundancy and the scores are
reset frequently, it should better tolerate active-set rotations.
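
A minimal sketch of the scoring idea in the commit message, for a single origin. This is not the code from the commit: the type `OriginScores`, its methods, and the three constants are hypothetical names and values chosen for illustration.

```rust
use std::collections::HashMap;

// Hypothetical thresholds, for illustration only.
const TIMELY_DUPS: usize = 2; // a delivery is "timely" if fewer duplicates were seen before it
const MIN_UPSERTS: u64 = 20; // do not prune until this many upserts from the origin
const KEEP: usize = 2; // redundancy: number of top-scoring relayers left unpruned

/// Per-origin bookkeeping of which relayers deliver values early.
#[derive(Default)]
struct OriginScores {
    scores: HashMap<String, u64>, // relayer -> number of timely deliveries
    num_upserts: u64,             // values from this origin that were new to us
}

impl OriginScores {
    /// Records that `relayer` delivered a value of which `num_dups` copies were already seen.
    fn record(&mut self, relayer: &str, num_dups: usize) {
        if num_dups == 0 {
            self.num_upserts += 1; // first copy, so it was an upsert
        }
        // Every relayer gets an entry, so that slow ones can be pruned later.
        let score = self.scores.entry(relayer.to_string()).or_default();
        if num_dups < TIMELY_DUPS {
            *score += 1;
        }
    }

    /// Returns the relayers to prune and resets all scores, or returns nothing
    /// if there are not yet enough upserts to judge.
    fn prune(&mut self) -> Vec<String> {
        if self.num_upserts < MIN_UPSERTS {
            return Vec::new();
        }
        self.num_upserts = 0;
        let mut ranked: Vec<(String, u64)> =
            std::mem::take(&mut self.scores).into_iter().collect();
        // Highest score first; ties broken by name so the result is deterministic.
        ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
        ranked.into_iter().skip(KEEP).map(|(relayer, _)| relayer).collect()
    }
}
```

Because the scores are cleared on every prune, a relayer that rotates this node out of its active set stops accumulating score and loses its slot at the next round, rather than blocking new pushers until a long expiry.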
behzadnouri added a commit that referenced this issue Dec 15, 2022
gnapoli23 pushed a commit to gnapoli23/solana that referenced this issue Dec 16, 2022
nickfrosty pushed a commit to nickfrosty/solana that referenced this issue Jan 4, 2023
@behzadnouri

Testnet results since upgrading to v1.15.x:
The chart compares CRDS upserts coming from push messages against those coming from pull responses on testnet. It shows that push messages now propagate much better and that there is less reliance on pull requests.

[Image: gossip]

github-actions bot added the stale label on Feb 22, 2024.
github-actions bot closed this as not planned on Feb 29, 2024.