feat: weighted candidate scoring w/ biased random selection #253
Conversation
Codecov Report

```
@@            Coverage Diff             @@
##             main     #253      +/-   ##
==========================================
+ Coverage   71.48%   71.54%   +0.05%
==========================================
  Files          69       69
  Lines        6043     6126      +83
==========================================
+ Hits         4320     4383      +63
- Misses       1484     1504      +20
  Partials      239      239
```
aaaactually this isn't true. I just realised why this might be flaky: the a:b comparison, sort-esque action of the prioritywaitqueue requires determinism; each waiter asks "is there one better than me? I'll check them all". If they are all doing that, the randomness introduces the possibility that they all find they are the worst candidate and decline to run, leading to deadlock. I've seen deadlock-like behaviour in tests a couple of times and thought I'd ironed it out, but a CI failure here looks like the same thing. Converted to draft until I sort that out. Probably means finally changing prioritywaitqueue to do a line-them-all-up-and-pick-one.
agree on the randomness bugs. I've long thought the architecture of the priority wait queue would eventually run into problems. I think you just need one more coordinating go-routine: essentially, that's what would make the choice of which candidate to run each time the queue frees up, choosing from all the possibilities. I know you've been aiming to avoid it for as long as possible, but maybe now we're there.
Generally, this is awesome and the right direction. If we eventually get to cross protocol choices, we'll probably want to reduce the graphsync weights (+ maybe add some protocol prioritizations).
One amazing thing to get to would be: instead of protocol racing, everyone just runs through the priority wait queue with a parallelization of >1. Then again, I have no idea how that works with the non-"whole request" logic of protocols like Bitswap. TBD and far away.
However, before we do anything further (and I think we need to focus on TTFB next, followed by bandwidth), I think that we need to:
- Resolve the decision making process so it's "pick one from many" rather than "pick one of two".
- Focus on how we record what decisions were made and why, so we can see how they affect performance when deployed.
Anyway though, this seems good and we should probably merge it if we can get the tests reliably passing.
Force-pushed from c4d32b9 to 65fef84
Good to go
Force-pushed from 57c9b10 to 1765385
I've left a few remaining comments, but IMHO this is LGTM.
```go
for ii, p := range peers {
	ind[ii] = ii
	gsmd, _ := mda[ii].(*metadata.GraphsyncFilecoinV1)
	scores[ii] = spt.scoreProvider(p, gsmd)
}
```
Just something to consider: maybe at some point we start squaring these values to reduce randomness (or perhaps make the exponent its own adjustable parameter).
I'm hoping we can come up with ways to be able to know what to do here without guessing too much; I'm trying modelling for now, but insights and A/B testing live might be better.
* Rename Graphsync specific options
* Remove PaidRetrievals option
Force-pushed from 1765385 to 882af7d
* add Duration property to FirstByte event
* fix http retriever ttfb, duration and speed metrics
* add MockSession to track metric reporting
* ttfb & bandwidth contribute to scoring
- `Duration` for EMA (α=0.5)
- `exp(-λx)`, where `λ` is `1/overall_ema`, giving us a [0,1] range (curve shape below for an `overall_ema` of `50`, values from [0,250] for illustration)
- a score of `1`, with a weight multiplier of `3`
- a score of `1`, with a weight multiplier of `2` (important, but Verified beats it)
- weights normalised to sum to `1`; roll a `[0,1]` dice and if it falls within the first candidate then it wins
- `DefaultConfig()` which should be used, and some sugar utility methods to modify the config according to needs
- `WithoutRandomness()` config modifier that sets up a fixed dice roll of `0.5` so the best is always selected. This is useful for tests of course.

In #92 we still have ttfb and bandwidth. I think the approach with both of these would be the same as the one with connect time (overall EMA, make a [0,1] curve, plot per-SP EMA on that curve).
There's also concurrency, which we're still keeping a strict lid on. We should talk about if/how to incorporate that into scoring.