Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

samples repair peers using WeightedIndex #13919

Merged

Conversation

behzadnouri
Copy link
Contributor

Problem

To output one random sample, weighted_best generates n random numbers:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63
WeightedIndex does so with only one random number:
https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240
Additionally, if the index is already constructed, it only does a total
of O(log(n)) amount of work; which can be achieved if RepairCache,
caches the weighted index:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83

Also, the repair-peers code can be reorganized to have fewer redundant
unlock-then-lock code.

Summary of Changes

  • Replaced weighted_best with WeightedIndex.
  • Reorganized some of the locking code when selecting repair peer.
  • Tests show ~20% improvement in timings and less variations.

To output one random sample, weighted_best generates n random numbers:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63
WeightedIndex does so with only one random number:
https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240
Additionally, if the index is already constructed, it only does a total
of O(log(n)) amount of work; which can be achieved if RepairCache,
caches the weighted index:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83

Also, the repair-peers code can be reorganized to have fewer redundant
unlock-then-lock code.
@codecov
Copy link

codecov bot commented Dec 2, 2020

Codecov Report

Merging #13919 (04688e5) into master (85bec37) will increase coverage by 0.0%.
The diff coverage is 83.3%.

@@           Coverage Diff           @@
##           master   #13919   +/-   ##
=======================================
  Coverage    82.1%    82.1%           
=======================================
  Files         382      382           
  Lines       93635    93654   +19     
=======================================
+ Hits        76900    76921   +21     
+ Misses      16735    16733    -2     

@behzadnouri behzadnouri marked this pull request as ready for review December 2, 2020 19:12
})
.collect()
};
let slot_peers = match self.lookup(slot) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh unrelated to this PR, but I'm realizing that this lookup is probably not that useful, especially since we cache the value the first time time we compute weights for this slot. Because repair usually runs on missing shreds in the middle of a slot, not many people will have the slot completed, so the lookup() here will return a slot_peers that's mostly empty.

Copy link
Contributor

@carllin carllin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I think the WeightedIndex was introduced to rand recently, a welcome improvement on our existing logic!

@behzadnouri behzadnouri merged commit c3048b4 into solana-labs:master Dec 3, 2020
@behzadnouri behzadnouri deleted the repair-peers-weighted-index branch December 3, 2020 14:26
mergify bot pushed a commit that referenced this pull request Dec 3, 2020
To output one random sample, weighted_best generates n random numbers:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63
WeightedIndex does so with only one random number:
https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240
Additionally, if the index is already constructed, it only does a total
of O(log(n)) amount of work; which can be achieved if RepairCache,
caches the weighted index:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83

Also, the repair-peers code can be reorganized to have fewer redundant
unlock-then-lock code.

(cherry picked from commit c3048b4)
jstarry pushed a commit that referenced this pull request Dec 3, 2020
To output one random sample, weighted_best generates n random numbers:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63
WeightedIndex does so with only one random number:
https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240
Additionally, if the index is already constructed, it only does a total
of O(log(n)) amount of work; which can be achieved if RepairCache,
caches the weighted index:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83

Also, the repair-peers code can be reorganized to have fewer redundant
unlock-then-lock code.

(cherry picked from commit c3048b4)
mergify bot added a commit that referenced this pull request Dec 3, 2020
To output one random sample, weighted_best generates n random numbers:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63
WeightedIndex does so with only one random number:
https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240
Additionally, if the index is already constructed, it only does a total
of O(log(n)) amount of work; which can be achieved if RepairCache,
caches the weighted index:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83

Also, the repair-peers code can be reorganized to have fewer redundant
unlock-then-lock code.

(cherry picked from commit c3048b4)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
Comment on lines +1209 to +1210
let gossip = self.gossip.read().unwrap();
self.tvu_peers()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@behzadnouri this can cause a deadlock on machines with write biased rwlocks if a writer gets queued for the rwlock before the read lock is acquired inside self.tvu_peers().

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hehe good catch! glad that local cluster debugging also reaped other benefits 🎉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this. Sent out #13973

behzadnouri added a commit to behzadnouri/solana that referenced this pull request Apr 5, 2022
Older weighted_shuffle is based on a heuristic which results in biased
samples as shown in:
solana-labs#18343
and can be replaced with WeightedShuffle.

Also, as described in:
solana-labs#13919
weighted_best can be replaced with rand::distributions::WeightedIndex,
or WeightdShuffle::first.
behzadnouri added a commit to behzadnouri/solana that referenced this pull request Apr 5, 2022
Older weighted_shuffle is based on a heuristic which results in biased
samples as shown in:
solana-labs#18343
and can be replaced with WeightedShuffle.

Also, as described in:
solana-labs#13919
weighted_best can be replaced with rand::distributions::WeightedIndex,
or WeightdShuffle::first.
behzadnouri added a commit that referenced this pull request Apr 5, 2022
Older weighted_shuffle is based on a heuristic which results in biased
samples as shown in:
#18343
and can be replaced with WeightedShuffle.

Also, as described in:
#13919
weighted_best can be replaced with rand::distributions::WeightedIndex,
or WeightdShuffle::first.
mergify bot pushed a commit that referenced this pull request Apr 6, 2022
Older weighted_shuffle is based on a heuristic which results in biased
samples as shown in:
#18343
and can be replaced with WeightedShuffle.

Also, as described in:
#13919
weighted_best can be replaced with rand::distributions::WeightedIndex,
or WeightdShuffle::first.

(cherry picked from commit db23295)
behzadnouri added a commit that referenced this pull request Apr 7, 2022
…24139)

Older weighted_shuffle is based on a heuristic which results in biased
samples as shown in:
#18343
and can be replaced with WeightedShuffle.

Also, as described in:
#13919
weighted_best can be replaced with rand::distributions::WeightedIndex,
or WeightdShuffle::first.

(cherry picked from commit db23295)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants