
Syncing warm-up time #2048

Closed · acud opened this issue Jun 10, 2021 · 4 comments · Fixed by #2050
acud commented Jun 10, 2021

Task

We need to add a warm-up time to pushsync and pullsync protocols.

  • Add a CLI flag for the warm-up time, used by both protocols; default 10m. It must accept 0 for immediate startup (otherwise integration tests won't work) and enforce an upper bound of 1h, since anything higher is too long.
  • Update the helm charts and beekeeper to propagate the correct values for CI and integration tests.
  • The pusher and puller should start only after the warm-up time has expired (see the sketch after this list).
  • pullsync itself needs no changes.
  • pushsync must not give back receipts during the warm-up period; the stream should reset instead.
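
As a rough illustration of the first and third bullets, here is a minimal Go sketch, assuming a standalone `warmup-time` duration flag and placeholder `startPusher`/`startPuller` functions rather than bee's actual CLI and service wiring:

```go
// Minimal sketch (not the actual bee CLI): parse a warm-up duration,
// allow 0 for immediate startup, and reject anything above one hour.
package main

import (
	"errors"
	"flag"
	"log"
	"time"
)

const (
	defaultWarmupTime = 10 * time.Minute
	maxWarmupTime     = time.Hour
)

func validateWarmup(d time.Duration) error {
	if d < 0 {
		return errors.New("warmup time must not be negative")
	}
	if d > maxWarmupTime {
		return errors.New("warmup time must not exceed 1h")
	}
	return nil
}

func main() {
	warmup := flag.Duration("warmup-time", defaultWarmupTime, "warm-up before pusher/puller start (0 = immediate)")
	flag.Parse()

	if err := validateWarmup(*warmup); err != nil {
		log.Fatal(err)
	}

	// The pusher and puller only begin their work once the warm-up has elapsed.
	// A zero duration returns immediately, which keeps integration tests fast.
	time.Sleep(*warmup)
	startPusher()
	startPuller()
}

func startPusher() { log.Println("pusher started") }
func startPuller() { log.Println("puller started") }
```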

Eknir commented Jun 10, 2021

A slight clarification: pushsync may forward receipts, but should not generate receipts itself. @acud, can you confirm?


acud commented Jun 10, 2021

Correct. pushsync should not sign receipts during the warm-up period.
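
A rough sketch of what such a guard could look like in a pushsync-style handler; `Service`, `Chunk`, `Receipt` and the function fields here are illustrative stand-ins, not bee's actual types or APIs:

```go
package pushsyncsketch

import (
	"context"
	"errors"
	"time"
)

// ErrWarmingUp signals that the stream should be reset instead of signing a receipt.
var ErrWarmingUp = errors.New("pushsync: node warming up, no receipts issued")

// Chunk and Receipt stand in for the real bee types in this sketch.
type Chunk struct{ Address []byte }
type Receipt struct{ Signature []byte }

type Service struct {
	startedAt time.Time
	warmupDur time.Duration
	// closer returns a peer closer to the chunk, or an error when this node is the storer.
	closer  func(Chunk) (string, error)
	forward func(ctx context.Context, peer string, c Chunk) (*Receipt, error)
	sign    func(Chunk) (*Receipt, error)
}

func (s *Service) warmedUp() bool {
	return time.Since(s.startedAt) >= s.warmupDur
}

// Handle mirrors the behaviour discussed above: forwarding is always allowed,
// but signing a receipt locally is refused until the warm-up has elapsed.
func (s *Service) Handle(ctx context.Context, c Chunk) (*Receipt, error) {
	if peer, err := s.closer(c); err == nil {
		return s.forward(ctx, peer, c) // may carry back a receipt signed elsewhere
	}
	if !s.warmedUp() {
		return nil, ErrWarmingUp // caller resets the stream
	}
	return s.sign(c)
}
```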

ldeffenb (Collaborator) commented

In my experience, 10 minutes, or any hard-coded or user-specified time frame, is not going to solve the underlying problem here: services activating before the depth has been established.

I believe kademlia needs to find a way to use the count of known nodes per bin in the address book to estimate what depth is likely to be achievable, and provide a method that pusher, puller, and pushsync can use to determine whether they are "clear" to operate, i.e. the depth has reached the estimated achievable value. This "clear" flag only needs to be recalculated when the depth changes, to avoid unnecessary overhead.

This approach would also allow a subsequent loss of depth to "re-block" these features, provided the check is made in their normal processing loops and not only as a startup delay. I have seen my depth go from 15 to 3, stay there for a while, and then jump back to 15 as a single pivotal peer connection was lost from bin 4.
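
A rough sketch of that estimate-and-clear idea; `maxBins`, `saturationPeers` and the function names are illustrative assumptions, not bee's actual kademlia parameters:

```go
// Sketch: estimate the depth that should be achievable from the number of
// known (not necessarily connected) peers per bin in the address book, and
// expose a "clear to operate" flag that only needs recomputing on depth change.
package kademliasketch

const (
	maxBins         = 32
	saturationPeers = 4 // bins with at least this many known peers count toward achievable depth
)

// EstimateAchievableDepth returns the first bin whose known-peer count falls
// below the saturation threshold; all shallower bins are considered saturated.
func EstimateAchievableDepth(knownPerBin [maxBins]int) uint8 {
	for bin, count := range knownPerBin {
		if count < saturationPeers {
			return uint8(bin)
		}
	}
	return maxBins - 1
}

// Clear reports whether pusher, puller and pushsync may operate: the currently
// connected depth has reached the depth the address book suggests is achievable.
func Clear(connectedDepth uint8, knownPerBin [maxBins]int) bool {
	return connectedDepth >= EstimateAchievableDepth(knownPerBin)
}
```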

For some real-world traces of depth changes from node startup and over time, see:

https://ipfs.io/ipfs/QmSs8qGMEWyzezWtfs7fM2QuCcfjKn72J9D1zp1wBXibom

bee1 took 45 minutes to initially get to depth 14 jumping from 4
bee2 took 30 minutes to initially get to depth 13 jumping from 4
bee3 took 25 minutes to initially get to depth 15 jumping from 2

And for posterity, these are the current connected counts for topology bins 0->15 from left to right:

bee1: 8 8 15 10 12 20 20 20 20 19 20 17 14 8 3 4
bee2: 9 5 14 18 13 10 20 13 20 18 18 14 14 4 6 4
bee3: 12 10 9 11 10 20 19 15 11 14 17 15 16 9 7 3

All of them are running version 0.6.3-5b9541c4, which I believe was yesterday's master from GitHub.


acud commented Jun 10, 2021

@ldeffenb we are iterating on this on all fronts, in particular kademlia. This issue is about improving the protocols on top of that.
