-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
samples repair peers using WeightedIndex #13919
samples repair peers using WeightedIndex #13919
Conversation
To output one random sample, weighted_best generates n random numbers: https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63 WeightedIndex does so with only one random number: https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240 Additionally, if the index is already constructed, it only does a total of O(log(n)) amount of work; which can be achieved if RepairCache, caches the weighted index: https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83 Also, the repair-peers code can be reorganized to have fewer redundant unlock-then-lock code.
Codecov Report
@@ Coverage Diff @@
## master #13919 +/- ##
=======================================
Coverage 82.1% 82.1%
=======================================
Files 382 382
Lines 93635 93654 +19
=======================================
+ Hits 76900 76921 +21
+ Misses 16735 16733 -2 |
}) | ||
.collect() | ||
}; | ||
let slot_peers = match self.lookup(slot) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heh unrelated to this PR, but I'm realizing that this lookup
is probably not that useful, especially since we cache the value the first time time we compute weights for this slot. Because repair usually runs on missing shreds in the middle of a slot, not many people will have the slot completed, so the lookup()
here will return a slot_peers
that's mostly empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, I think the WeightedIndex
was introduced to rand
recently, a welcome improvement on our existing logic!
To output one random sample, weighted_best generates n random numbers: https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63 WeightedIndex does so with only one random number: https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240 Additionally, if the index is already constructed, it only does a total of O(log(n)) amount of work; which can be achieved if RepairCache, caches the weighted index: https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83 Also, the repair-peers code can be reorganized to have fewer redundant unlock-then-lock code. (cherry picked from commit c3048b4)
To output one random sample, weighted_best generates n random numbers: https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63 WeightedIndex does so with only one random number: https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240 Additionally, if the index is already constructed, it only does a total of O(log(n)) amount of work; which can be achieved if RepairCache, caches the weighted index: https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83 Also, the repair-peers code can be reorganized to have fewer redundant unlock-then-lock code. (cherry picked from commit c3048b4)
To output one random sample, weighted_best generates n random numbers: https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63 WeightedIndex does so with only one random number: https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240 Additionally, if the index is already constructed, it only does a total of O(log(n)) amount of work; which can be achieved if RepairCache, caches the weighted index: https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83 Also, the repair-peers code can be reorganized to have fewer redundant unlock-then-lock code. (cherry picked from commit c3048b4) Co-authored-by: behzad nouri <behzadnouri@gmail.com>
let gossip = self.gossip.read().unwrap(); | ||
self.tvu_peers() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@behzadnouri this can cause a deadlock on machines with write biased rwlocks if a writer gets queued for the rwlock before the read lock is acquired inside self.tvu_peers()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hehe good catch! glad that local cluster debugging also reaped other benefits 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this. Sent out #13973
Older weighted_shuffle is based on a heuristic which results in biased samples as shown in: solana-labs#18343 and can be replaced with WeightedShuffle. Also, as described in: solana-labs#13919 weighted_best can be replaced with rand::distributions::WeightedIndex, or WeightdShuffle::first.
Older weighted_shuffle is based on a heuristic which results in biased samples as shown in: solana-labs#18343 and can be replaced with WeightedShuffle. Also, as described in: solana-labs#13919 weighted_best can be replaced with rand::distributions::WeightedIndex, or WeightdShuffle::first.
Older weighted_shuffle is based on a heuristic which results in biased samples as shown in: #18343 and can be replaced with WeightedShuffle. Also, as described in: #13919 weighted_best can be replaced with rand::distributions::WeightedIndex, or WeightdShuffle::first. (cherry picked from commit db23295)
…24139) Older weighted_shuffle is based on a heuristic which results in biased samples as shown in: #18343 and can be replaced with WeightedShuffle. Also, as described in: #13919 weighted_best can be replaced with rand::distributions::WeightedIndex, or WeightdShuffle::first. (cherry picked from commit db23295) Co-authored-by: behzad nouri <behzadnouri@gmail.com>
Problem
To output one random sample, weighted_best generates n random numbers:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/weighted_shuffle.rs#L38-L63
WeightedIndex does so with only one random number:
https://github.com/rust-random/rand/blob/eb02f0e46/src/distributions/weighted_index.rs#L223-L240
Additionally, if the index is already constructed, it only does a total
of O(log(n)) amount of work; which can be achieved if RepairCache,
caches the weighted index:
https://github.com/solana-labs/solana/blob/f751a5d4e/core/src/serve_repair.rs#L83
Also, the repair-peers code can be reorganized to have fewer redundant
unlock-then-lock code.
Summary of Changes
weighted_best
withWeightedIndex
.