perf(discv5): populate kbuckets & improved RLPx peering #7683

emhane · 2024-04-16T16:23:15Z

Improves bootstrapping by filling kbuckets starting from the furthest log2distance from the local node id, as these buckets are the easiest to fill.
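The fill order described above could be sketched as a small cursor that hands out kbucket indices from the furthest bucket towards the closest and then wraps around. This is only an illustration: `BucketCursor`, `advance` and `NUM_BUCKETS` are hypothetical names, not identifiers from reth or sigp/discv5, and the count of 256 buckets (indices 0 to 255) is assumed from the bucket range discussed in this thread.

```rust
/// Assumed number of kbuckets (indices 0..=255), not read from any library.
const NUM_BUCKETS: usize = 256;

/// Hypothetical cursor that yields kbucket indices from furthest to closest.
struct BucketCursor {
    next: usize,
}

impl BucketCursor {
    /// Starts at the furthest bucket, which is the easiest one to fill.
    fn new() -> Self {
        Self { next: NUM_BUCKETS - 1 }
    }

    /// Returns the bucket to target in this lookup and moves one bucket
    /// closer to the local node id, wrapping back to the furthest bucket
    /// after bucket 0 has been handed out.
    fn advance(&mut self) -> usize {
        let current = self.next;
        self.next = if current == 0 { NUM_BUCKETS - 1 } else { current - 1 };
        current
    }
}
```

Each lookup would call `advance()` once, so every bucket index is targeted equally often over time.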

emhane · 2024-04-17T17:10:17Z

closed in favour of #7695

emhane · 2024-04-17T18:24:17Z

will still be useful to reverse this, to iteratively get closer to own node id

Rjected

Am I understanding correctly that this code is supposed to get discv5 to return a specific distance internally since it uses self.local_key.distance(target)? But instead of doing it for all log2 up to 255, we do it for half?

emhane · 2024-04-18T16:39:00Z

> Am I understanding correctly that this code is supposed to get discv5 to return a specific distance internally since it uses self.local_key.distance(target)? But instead of doing it for all log2 up to 255, we do it for half?

Exactly. It's unlikely that we find peers for the closest buckets, 0-127. I think we should eventually extend it to cover all buckets though, if not do that right away tbh.
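A rough way to see why the closest buckets are hard to fill, under two assumptions that are not stated in the thread: node ids are uniformly distributed over the 256-bit id space, and bucket index i holds the peers at log2distance i + 1. The share of all possible ids that falls into bucket i is then:

```latex
P(\text{bucket } i) = \frac{2^{i}}{2^{256}} = 2^{\,i-256}, \qquad i = 0, \dots, 255

\sum_{i=0}^{127} 2^{\,i-256} = \frac{2^{128} - 1}{2^{256}} \approx 2^{-128}
```

Under these assumptions, bucket 255 covers half of the id space and bucket 254 a quarter, while buckets 0-127 together cover only about 2^-128 of it.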

emhane · 2024-04-19T11:22:29Z

Metrics on outgoing RLPx connections back the hypothesis that pseudorandom FINDNODE target selection, explicitly targeting each kbucket, is more beneficial over time than a completely random target in lookup queries that aim to discover RLPx peers. Outgoing RLPx connections are important for nodes behind NAT to be successful, as @joshieDo has been strongly advocating for this week.

This PR was run between 16.04 21:30 and 17.04 12:30, targeting all kbuckets rather than only the top half (the half furthest from the local node id, which the latest commit restricted it to for fear of too much overhead from targeting all). Between 17.04 19:30 and 18.04 19:30, the node was run with targets generated by `let tgt = NodeId::random()`, which under the hood is `let tgt: [u8; 32] = rand::random()`. Both runs started with an empty known_peers.json file (RLPx peers saved from the previous run) and on new non-default ports, set with the --instance flag.
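For contrast with a fully random target, here is a minimal sketch of how a target for one specific kbucket could be derived from the local node id. It is not the code in this PR and does not use the sigp/discv5 API: `log2_distance` and `target_for_bucket` are hypothetical helpers, node ids are treated as big-endian 32-byte arrays, and the mapping kbucket index = log2distance - 1 is an assumption. The `noise` argument stands in for bytes from a random number generator so the example stays deterministic.

```rust
type NodeId = [u8; 32];

/// XOR log2 distance between two ids (1..=256), or None if they are equal.
fn log2_distance(a: &NodeId, b: &NodeId) -> Option<u32> {
    for i in 0..32 {
        let x = a[i] ^ b[i];
        if x != 0 {
            // Bits remaining from byte i downwards, minus the leading zeros
            // inside the first differing byte.
            return Some(((32 - i) as u32) * 8 - x.leading_zeros());
        }
    }
    None
}

/// Builds a target whose distance from `local` falls into `bucket_index`.
/// All bits above the bucket's bit stay equal to `local`, the bucket's bit
/// is flipped, and the lower bits are perturbed with `noise`.
fn target_for_bucket(local: &NodeId, bucket_index: usize, noise: &NodeId) -> NodeId {
    assert!(bucket_index < 256);
    let byte_idx = 31 - bucket_index / 8;
    let bit = (bucket_index % 8) as u32;
    let high = 1u8 << bit;

    let mut target = *local;
    // Flip the bucket's bit and randomize the bits below it in the same byte.
    target[byte_idx] ^= high | (noise[byte_idx] & (high - 1));
    // Randomize all less significant bytes.
    for i in (byte_idx + 1)..32 {
        target[i] ^= noise[i];
    }
    target
}
```

With these helpers, `log2_distance(&local, &target_for_bucket(&local, i, &noise))` is `Some(i + 1)` for every bucket index `i`, whereas a uniformly random target lands in the furthest bucket about half of the time.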

This panel shows the number of successfully established outgoing RLPx connections over time.

Pseudorandom target selection, targeting each kbucket index equally many times, outperforms random target selection. The former discovers as many useful RLPx peers as the latter in under two thirds of the time (15 hours as opposed to 24 hours).

Furthermore, pseudorandom target selection targeting each kbucket is also cheaper than random target selection, as it leads to less peer churn.

This metric doesn't account for unique peers. It's hence likely that random target selection has higher peer churn than the given pseudorandom target selection because it re-establishes sessions with recently evicted peers (the sigp/discv5 library stores sessions in an LRU cache). This assumption is strengthened by the first panel, which shows that the increased peer churn isn't leading to more outgoing RLPx connections.
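The effect described above, where a session count overstates the number of unique peers once an LRU cache starts evicting, can be illustrated with a toy model. This is not how sigp/discv5 implements its session cache: `SessionCache`, `contact` and `count_sessions` are hypothetical names, and peers are reduced to plain `u64` ids.

```rust
use std::collections::{HashSet, VecDeque};

/// Toy LRU cache of peer ids, least recently used at the front.
struct SessionCache {
    capacity: usize,
    order: VecDeque<u64>,
}

impl SessionCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        Self { capacity, order: VecDeque::new() }
    }

    /// Contacts a peer. Returns true if a new session had to be established,
    /// false if a cached session was reused.
    fn contact(&mut self, peer: u64) -> bool {
        if let Some(pos) = self.order.iter().position(|p| *p == peer) {
            // Cache hit: mark the peer as most recently used.
            self.order.remove(pos);
            self.order.push_back(peer);
            return false;
        }
        // Cache miss: evict the least recently used peer if the cache is full.
        if self.order.len() >= self.capacity {
            self.order.pop_front();
        }
        self.order.push_back(peer);
        true
    }
}

/// Returns (sessions established, unique peers contacted).
fn count_sessions(capacity: usize, contacts: &[u64]) -> (usize, usize) {
    let mut cache = SessionCache::new(capacity);
    let mut unique = HashSet::new();
    let mut established = 0;
    for &peer in contacts {
        unique.insert(peer);
        if cache.contact(peer) {
            established += 1;
        }
    }
    (established, unique.len())
}
```

With a capacity of 2 and the contact sequence 1, 2, 3, 1, peer 1 is evicted before it is contacted again, so four sessions are established for only three unique peers. With a capacity of 4, the same sequence establishes three sessions.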

Based on these metrics, I will extend the targeted kbucket range in the lookup query to cover the whole kbucket range, since this causes less peer churn than the completely random target selection commonly used.

Rjected

lgtm, thanks for the additional docs and explanation with metrics!

emhane · 2024-04-19T20:04:26Z

> lgtm, thanks for the additional docs and explanation with metrics!

kinda got the "metrics or go home" mentality from you guys

Reverse target lookup

518c490

emhane added the C-perf (A change motivated by improving speed, memory usage or disk footprint) and A-discv5 (Related to discv5 discovery) labels Apr 16, 2024

emhane requested a review from joshieDo April 16, 2024 16:23

emhane requested review from mattsse and Rjected as code owners April 16, 2024 16:23

emhane changed the title ~~feat(discv5):~~ perf(discv5): bootstrap Apr 16, 2024

emhane added 2 commits April 16, 2024 20:34

Fix flaky test

9f2ef37

Use kbucket index instead of log2distance

f714329

emhane requested a review from DaniPopes April 16, 2024 19:54

emhane mentioned this pull request Apr 17, 2024

perf(discv5): boost bootstrap lookups #7695

Merged

emhane closed this Apr 17, 2024

emhane deleted the emhane/rev-lookup-tgt branch April 17, 2024 17:53

emhane restored the emhane/rev-lookup-tgt branch April 17, 2024 18:23

emhane reopened this Apr 17, 2024

emhane added 6 commits April 17, 2024 20:30

Merge branch 'main' into emhane/rev-lookup-tgt

444a45f

Fix test for bucket index

45dc517

Fill top half kbuckets only

3e162fa

Merge branch 'emhane/rev-lookup-tgt' of github.com:paradigmxyz/reth i…

6aa9c1a

…nto emhane/rev-lookup-tgt

Driveby, fix docs

7b18437

Fix docs

b2d2b6a

Rjected reviewed Apr 17, 2024

emhane changed the title ~~perf(discv5): bootstrap~~ perf(discv5): populate kbuckets Apr 18, 2024

emhane requested a review from Rjected April 18, 2024 16:40

emhane changed the title ~~perf(discv5): populate kbuckets~~ perf(discv5): populate kbuckets & improved RLPx peering Apr 19, 2024

Stretch target selection over whole kbucket range

510a361

Rjected approved these changes Apr 19, 2024

emhane added this pull request to the merge queue Apr 19, 2024

Merged via the queue into main with commit 20568b8 Apr 19, 2024
27 checks passed

emhane deleted the emhane/rev-lookup-tgt branch April 19, 2024 20:14

onbjerg mentioned this pull request Apr 25, 2024

docs: update examples readme #7852

Merged
