Precompute epoch transition #3383

Merged
merged 11 commits into from
Nov 4, 2021

Conversation

twoeths
Contributor

@twoeths twoeths commented Oct 21, 2021

Motivation

  • We want to have early epoch transition when the node is synced
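A rough sketch of what "early epoch transition" could mean in practice, assuming it is triggered during the last slot of an epoch while the node is synced. `SLOTS_PER_EPOCH = 32` is the mainnet preset value; the function names and the `isSynced` flag are hypothetical and are not taken from this PR.

```typescript
// Illustrative sketch only, not the code in this PR.
const SLOTS_PER_EPOCH = 32; // mainnet preset

/** Epoch that contains the given slot. */
function computeEpochAtSlot(slot: number): number {
  return Math.floor(slot / SLOTS_PER_EPOCH);
}

/** True when `slot` is the last slot of its epoch, so the next slot starts a new epoch. */
function isLastSlotOfEpoch(slot: number): boolean {
  return (slot + 1) % SLOTS_PER_EPOCH === 0;
}

/**
 * Returns the epoch whose transition could be precomputed now, or null if
 * nothing should run. `isSynced` is a hypothetical flag supplied by the caller.
 */
function nextEpochToPrecompute(clockSlot: number, isSynced: boolean): number | null {
  if (!isSynced || !isLastSlotOfEpoch(clockSlot)) return null;
  return computeEpochAtSlot(clockSlot + 1);
}
```

With these assumptions, slot 31 on a synced node yields epoch 1, while any other slot, or an unsynced node, yields `null`.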

Description

Test

  • "Gossip Block" metrics in contabo-1 show good results

Screen Shot 2021-10-21 at 18 08 02

@codecov

codecov bot commented Oct 21, 2021

Codecov Report

Merging #3383 (6c8bea7) into master (13b61a3) will decrease coverage by 0.29%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #3383      +/-   ##
==========================================
- Coverage   38.36%   38.07%   -0.30%     
==========================================
  Files         303      304       +1     
  Lines        7738     7843     +105     
  Branches     1157     1189      +32     
==========================================
+ Hits         2969     2986      +17     
- Misses       4628     4716      +88     
  Partials      141      141              

@codeclimate

codeclimate bot commented Oct 21, 2021

Code Climate has analyzed commit 48b8e51 and detected 0 issues on this pull request.

View more on Code Climate.

@github-actions
Contributor

github-actions bot commented Oct 21, 2021

Performance Report

✔️ no performance regression detected

Full benchmark results
| Benchmark suite | Current: a0262e1 | Previous: 397327a | Ratio |
|---|---|---|---|
| BeaconState.hashTreeRoot - No change | 698.00 ns/op | 785.00 ns/op | 0.89 |
| BeaconState.hashTreeRoot - 1 full validator | 100.79 us/op | 95.972 us/op | 1.05 |
| BeaconState.hashTreeRoot - 32 full validator | 1.3130 ms/op | 1.3770 ms/op | 0.95 |
| BeaconState.hashTreeRoot - 512 full validator | 15.944 ms/op | 18.084 ms/op | 0.88 |
| BeaconState.hashTreeRoot - 1 validator.effectiveBalance | 87.925 us/op | 93.138 us/op | 0.94 |
| BeaconState.hashTreeRoot - 32 validator.effectiveBalance | 1.3502 ms/op | 1.6496 ms/op | 0.82 |
| BeaconState.hashTreeRoot - 512 validator.effectiveBalance | 20.565 ms/op | 20.500 ms/op | 1.00 |
| BeaconState.hashTreeRoot - 1 balances | 61.388 us/op | 70.480 us/op | 0.87 |
| BeaconState.hashTreeRoot - 32 balances | 562.26 us/op | 609.76 us/op | 0.92 |
| BeaconState.hashTreeRoot - 512 balances | 5.4559 ms/op | 5.5161 ms/op | 0.99 |
| BeaconState.hashTreeRoot - 250000 balances | 103.39 ms/op | 106.80 ms/op | 0.97 |
| processSlot - 1 slots | 45.598 us/op | 47.122 us/op | 0.97 |
| processSlot - 32 slots | 2.4069 ms/op | 2.5799 ms/op | 0.93 |
| getCommitteeAssignments - req 1 vs - 250000 vc | 5.2109 ms/op | 6.2224 ms/op | 0.84 |
| getCommitteeAssignments - req 100 vs - 250000 vc | 7.6593 ms/op | 8.5100 ms/op | 0.90 |
| getCommitteeAssignments - req 1000 vs - 250000 vc | 7.9527 ms/op | 9.1040 ms/op | 0.87 |
| computeProposers - vc 250000 | 23.768 ms/op | 24.660 ms/op | 0.96 |
| computeEpochShuffling - vc 250000 | 195.87 ms/op | 225.43 ms/op | 0.87 |
| getNextSyncCommittee - vc 250000 | 417.26 ms/op | 404.65 ms/op | 1.03 |
| altair processAttestation - 250000 vs - 7PWei normalcase | 39.176 ms/op | 40.262 ms/op | 0.97 |
| altair processAttestation - 250000 vs - 7PWei worstcase | 41.472 ms/op | 48.976 ms/op | 0.85 |
| altair processAttestation - setStatus - 1/6 committees join | 12.459 ms/op | 11.480 ms/op | 1.09 |
| altair processAttestation - setStatus - 1/3 committees join | 22.278 ms/op | 24.196 ms/op | 0.92 |
| altair processAttestation - setStatus - 1/2 committees join | 30.991 ms/op | 37.602 ms/op | 0.82 |
| altair processAttestation - setStatus - 2/3 committees join | 42.335 ms/op | 50.074 ms/op | 0.85 |
| altair processAttestation - setStatus - 4/5 committees join | 58.993 ms/op | 67.475 ms/op | 0.87 |
| altair processAttestation - setStatus - 100% committees join | 65.889 ms/op | 82.269 ms/op | 0.80 |
| altair processAttestation - updateEpochParticipants - 1/6 committees join | 11.587 ms/op | 12.422 ms/op | 0.93 |
| altair processAttestation - updateEpochParticipants - 1/3 committees join | 23.694 ms/op | 25.473 ms/op | 0.93 |
| altair processAttestation - updateEpochParticipants - 1/2 committees join | 25.315 ms/op | 28.203 ms/op | 0.90 |
| altair processAttestation - updateEpochParticipants - 2/3 committees join | 28.133 ms/op | 29.532 ms/op | 0.95 |
| altair processAttestation - updateEpochParticipants - 4/5 committees join | 26.623 ms/op | 31.457 ms/op | 0.85 |
| altair processAttestation - updateEpochParticipants - 100% committees join | 31.752 ms/op | 35.258 ms/op | 0.90 |
| altair processAttestation - updateAllStatus | 19.799 ms/op | 23.764 ms/op | 0.83 |
| altair processBlock - 250000 vs - 7PWei normalcase | 43.204 ms/op | 48.748 ms/op | 0.89 |
| altair processBlock - 250000 vs - 7PWei worstcase | 124.25 ms/op | 129.84 ms/op | 0.96 |
| altair processEpoch - pyrmont_e62330 | 473.84 ms/op | 506.88 ms/op | 0.93 |
| pyrmont_e62330 - altair beforeProcessEpoch | 162.43 ms/op | 170.06 ms/op | 0.96 |
| pyrmont_e62330 - altair processJustificationAndFinalization | 109.66 us/op | 90.093 us/op | 1.22 |
| pyrmont_e62330 - altair processInactivityUpdates | 7.9644 ms/op | 9.9084 ms/op | 0.80 |
| pyrmont_e62330 - altair processRewardsAndPenalties | 61.904 ms/op | 65.726 ms/op | 0.94 |
| pyrmont_e62330 - altair processRegistryUpdates | 18.019 us/op | 11.630 us/op | 1.55 |
| pyrmont_e62330 - altair processSlashings | 6.2060 us/op | 4.0140 us/op | 1.55 |
| pyrmont_e62330 - altair processEth1DataReset | 5.6360 us/op | 3.8790 us/op | 1.45 |
| pyrmont_e62330 - altair processEffectiveBalanceUpdates | 6.9934 ms/op | 7.0376 ms/op | 0.99 |
| pyrmont_e62330 - altair processSlashingsReset | 32.567 us/op | 21.637 us/op | 1.51 |
| pyrmont_e62330 - altair processRandaoMixesReset | 41.276 us/op | 26.953 us/op | 1.53 |
| pyrmont_e62330 - altair processHistoricalRootsUpdate | 7.5400 us/op | 4.8320 us/op | 1.56 |
| pyrmont_e62330 - altair processParticipationFlagUpdates | 47.554 ms/op | 46.523 ms/op | 1.02 |
| pyrmont_e62330 - altair processSyncCommitteeUpdates | 5.2020 us/op | 3.1110 us/op | 1.67 |
| pyrmont_e62330 - altair afterProcessEpoch | 110.30 ms/op | 137.33 ms/op | 0.80 |
| altair processInactivityUpdates - 250000 normalcase | 66.252 ms/op | 78.687 ms/op | 0.84 |
| altair processInactivityUpdates - 250000 worstcase | 72.532 ms/op | 75.728 ms/op | 0.96 |
| altair processParticipationFlagUpdates - 250000 anycase | 96.865 ms/op | 120.17 ms/op | 0.81 |
| altair processRewardsAndPenalties - 250000 normalcase | 133.72 ms/op | 134.49 ms/op | 0.99 |
| altair processRewardsAndPenalties - 250000 worstcase | 112.13 ms/op | 135.09 ms/op | 0.83 |
| altair processSyncCommitteeUpdates - 250000 | 412.79 ms/op | 427.25 ms/op | 0.97 |
| Tree 40 250000 create | 525.89 ms/op | 596.71 ms/op | 0.88 |
| Tree 40 250000 get(125000) | 251.19 ns/op | 330.93 ns/op | 0.76 |
| Tree 40 250000 set(125000) | 1.5383 us/op | 1.7300 us/op | 0.89 |
| Tree 40 250000 toArray() | 43.726 ms/op | 43.479 ms/op | 1.01 |
| Tree 40 250000 iterate all - toArray() + loop | 47.346 ms/op | 43.180 ms/op | 1.10 |
| Tree 40 250000 iterate all - get(i) | 101.57 ms/op | 121.02 ms/op | 0.84 |
| MutableVector 250000 create | 20.484 ms/op | 26.189 ms/op | 0.78 |
| MutableVector 250000 get(125000) | 14.444 ns/op | 15.770 ns/op | 0.92 |
| MutableVector 250000 set(125000) | 515.97 ns/op | 622.06 ns/op | 0.83 |
| MutableVector 250000 toArray() | 8.8291 ms/op | 11.456 ms/op | 0.77 |
| MutableVector 250000 iterate all - toArray() + loop | 8.4933 ms/op | 11.892 ms/op | 0.71 |
| MutableVector 250000 iterate all - get(i) | 6.2284 ms/op | 4.3001 ms/op | 1.45 |
| Array 250000 create | 5.1411 ms/op | 6.6366 ms/op | 0.77 |
| Array 250000 clone - spread | 2.0040 ms/op | 2.6947 ms/op | 0.74 |
| Array 250000 get(125000) | 1.1540 ns/op | 1.3550 ns/op | 0.85 |
| Array 250000 set(125000) | 1.0430 ns/op | 0.96400 ns/op | 1.08 |
| Array 250000 iterate all - loop | 144.62 us/op | 201.27 us/op | 0.72 |
| aggregationBits - 2048 els - readonlyValues | 234.42 us/op | 275.27 us/op | 0.85 |
| aggregationBits - 2048 els - zipIndexesInBitList | 44.115 us/op | 52.349 us/op | 0.84 |
| ssz.Root.equals | 1.4040 us/op | 1.6030 us/op | 0.88 |
| ssz.Root.equals with valueOf() | 1.6550 us/op | 1.9680 us/op | 0.84 |
| byteArrayEquals with valueOf() | 1.6920 us/op | 1.9130 us/op | 0.88 |
| phase0 processBlock - 250000 vs - 7PWei normalcase | 12.592 ms/op | 12.696 ms/op | 0.99 |
| phase0 processBlock - 250000 vs - 7PWei worstcase | 88.126 ms/op | 88.568 ms/op | 1.00 |
| phase0 afterProcessEpoch - 250000 vs - 7PWei | 215.71 ms/op | 254.67 ms/op | 0.85 |
| phase0 beforeProcessEpoch - 250000 vs - 7PWei | 683.21 ms/op | 652.30 ms/op | 1.05 |
| phase0 processEpoch - mainnet_e58758 | 856.89 ms/op | 891.07 ms/op | 0.96 |
| mainnet_e58758 - phase0 beforeProcessEpoch | 526.77 ms/op | 526.16 ms/op | 1.00 |
| mainnet_e58758 - phase0 processJustificationAndFinalization | 103.92 us/op | 84.698 us/op | 1.23 |
| mainnet_e58758 - phase0 processRewardsAndPenalties | 85.026 ms/op | 94.080 ms/op | 0.90 |
| mainnet_e58758 - phase0 processRegistryUpdates | 77.885 us/op | 59.253 us/op | 1.31 |
| mainnet_e58758 - phase0 processSlashings | 5.9100 us/op | 4.2480 us/op | 1.39 |
| mainnet_e58758 - phase0 processEth1DataReset | 5.7040 us/op | 3.6890 us/op | 1.55 |
| mainnet_e58758 - phase0 processEffectiveBalanceUpdates | 9.8667 ms/op | 11.219 ms/op | 0.88 |
| mainnet_e58758 - phase0 processSlashingsReset | 30.607 us/op | 18.771 us/op | 1.63 |
| mainnet_e58758 - phase0 processRandaoMixesReset | 42.191 us/op | 24.843 us/op | 1.70 |
| mainnet_e58758 - phase0 processHistoricalRootsUpdate | 7.0860 us/op | 4.7970 us/op | 1.48 |
| mainnet_e58758 - phase0 processParticipationRecordUpdates | 28.869 us/op | 16.052 us/op | 1.80 |
| mainnet_e58758 - phase0 afterProcessEpoch | 187.41 ms/op | 221.08 ms/op | 0.85 |
| phase0 processEffectiveBalanceUpdates - 250000 normalcase | 13.246 ms/op | 13.148 ms/op | 1.01 |
| phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 | 1.3666 s/op | 1.3245 s/op | 1.03 |
| phase0 processRegistryUpdates - 250000 normalcase | 83.534 us/op | 53.807 us/op | 1.55 |
| phase0 processRegistryUpdates - 250000 badcase_full_deposits | 3.7626 ms/op | 3.6163 ms/op | 1.04 |
| phase0 processRegistryUpdates - 250000 worstcase 0.5 | 1.9301 s/op | 1.7742 s/op | 1.09 |
| phase0 getAttestationDeltas - 250000 normalcase | 38.575 ms/op | 43.449 ms/op | 0.89 |
| phase0 getAttestationDeltas - 250000 worstcase | 37.821 ms/op | 43.109 ms/op | 0.88 |
| phase0 processSlashings - 250000 worstcase | 35.879 ms/op | 39.646 ms/op | 0.90 |
| shuffle list - 16384 els | 13.845 ms/op | 16.706 ms/op | 0.83 |
| shuffle list - 250000 els | 181.55 ms/op | 219.61 ms/op | 0.83 |
| getEffectiveBalances - 250000 vs - 7PWei | 10.892 ms/op | 13.528 ms/op | 0.81 |
| computeDeltas | 3.5734 ms/op | 3.8979 ms/op | 0.92 |
| getPubkeys - index2pubkey - req 1000 vs - 250000 vc | 2.8141 ms/op | 2.3571 ms/op | 1.19 |
| getPubkeys - validatorsArr - req 1000 vs - 250000 vc | 1.1501 ms/op | 1.0960 ms/op | 1.05 |
| BLS verify - blst-native | 1.9943 ms/op | 2.2317 ms/op | 0.89 |
| BLS verifyMultipleSignatures 3 - blst-native | 4.4292 ms/op | 4.5714 ms/op | 0.97 |
| BLS verifyMultipleSignatures 8 - blst-native | 9.5320 ms/op | 9.8646 ms/op | 0.97 |
| BLS verifyMultipleSignatures 32 - blst-native | 30.566 ms/op | 35.766 ms/op | 0.85 |
| BLS aggregatePubkeys 32 - blst-native | 44.082 us/op | 47.785 us/op | 0.92 |
| BLS aggregatePubkeys 128 - blst-native | 166.82 us/op | 186.58 us/op | 0.89 |
| getAttestationsForBlock | 77.544 ms/op | 100.35 ms/op | 0.77 |
| CheckpointStateCache - add get delete | 16.918 us/op | 18.230 us/op | 0.93 |
| validate gossip signedAggregateAndProof - struct | 5.2259 ms/op | 5.3393 ms/op | 0.98 |
| validate gossip signedAggregateAndProof - treeBacked | 4.7003 ms/op | 5.2659 ms/op | 0.89 |
| validate gossip attestation - struct | 2.1913 ms/op | 2.4914 ms/op | 0.88 |
| validate gossip attestation - treeBacked | 2.3755 ms/op | 2.4817 ms/op | 0.96 |
| Object access 1 prop | 0.45600 ns/op | 0.51100 ns/op | 0.89 |
| Map access 1 prop | 0.46500 ns/op | 0.48400 ns/op | 0.96 |
| Object get x1000 | 17.037 ns/op | 20.096 ns/op | 0.85 |
| Map get x1000 | 1.1080 ns/op | 1.3380 ns/op | 0.83 |
| Object set x1000 | 103.41 ns/op | 113.74 ns/op | 0.91 |
| Map set x1000 | 87.138 ns/op | 76.522 ns/op | 1.14 |
| Return object 10000 times | 0.39090 ns/op | 0.46200 ns/op | 0.85 |
| Throw Error 10000 times | 6.8137 us/op | 7.6312 us/op | 0.89 |

by benchmarkbot/action

g11tech previously approved these changes Oct 22, 2021
wemeetagain previously approved these changes Oct 26, 2021
Member

@wemeetagain wemeetagain left a comment

lgtm

@twoeths
Contributor Author

twoeths commented Oct 28, 2021

Tested this for a while:

  • The majority of blocks were processed <= 3s after the slot started. When a block was processed after 1/3 of the slot, the main reason was that the gossip block was received late
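For reference, the timing above can be checked with a little arithmetic. This is a minimal sketch assuming the mainnet preset `SECONDS_PER_SLOT = 12`, which puts the 1/3-slot mark at 4 s; the helper names and the genesis time used in the usage note are made up for illustration.

```typescript
// Illustrative sketch only; helper names are hypothetical.
const SECONDS_PER_SLOT = 12; // mainnet preset
const ONE_THIRD_SLOT_SEC = SECONDS_PER_SLOT / 3; // 4 s on mainnet

/** Seconds elapsed since the start of `slot`; all times are unix seconds. */
function secondsIntoSlot(genesisTimeSec: number, slot: number, nowSec: number): number {
  return nowSec - (genesisTimeSec + slot * SECONDS_PER_SLOT);
}

/** True if a block for `slot` finished processing no later than the 1/3-slot mark. */
function processedBeforeOneThird(genesisTimeSec: number, slot: number, processedAtSec: number): boolean {
  return secondsIntoSlot(genesisTimeSec, slot, processedAtSec) <= ONE_THIRD_SLOT_SEC;
}
```

With a made-up genesis time of 1000, slot 10 starts at 1120, so a block processed at 1123 is 3 s into the slot and still ahead of the 4 s mark, while one processed at 1125 is past it.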

Screen Shot 2021-10-28 at 10 39 08

@twoeths twoeths force-pushed the tuyen/precompute-epoch-transition branch from 371622d to 644b60b Compare October 29, 2021 07:49
@twoeths
Contributor Author

twoeths commented Oct 29, 2021

New precompute epoch metrics:

  • Success vs Error vs Skip (there is no data for Error/Skip yet, so those legends do not show)
  • Checkpoint state cache hits vs waste

Screen Shot 2021-10-29 at 17 33 44

@@ -538,5 +538,22 @@ export function createLodestarMetrics(
name: "unhandeled_promise_rejections",
help: "UnhandeledPromiseRejection count",
}),

// Precompute epoch
preComputeEpoch: {
Contributor

The name preComputeEpoch is a bit too generic; could we use precomputeNextEpochTransition? If that sounds good, please rename the files and classes throughout the PR.

// Precompute epoch
preComputeEpoch: {
count: register.counter<"result">({
name: "precompute_epoch_result",
Contributor

See #3399

count: register.counter<"result">({
name: "precompute_epoch_result",
labelNames: ["result"],
help: "Number of precompute epoch skip/success/error",
Contributor

Total number of precomputeNextEpochTransition runs by result
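To illustrate the "runs by result" idea, here is a minimal in-memory stand-in for a counter with a single `result` label. It is not the metrics register used in this PR; `ResultCounter` and `recordRun` are hypothetical names, and the run is synchronous for brevity, whereas the code under review is promise-based.

```typescript
// Illustrative sketch only: an in-memory stand-in, not a real metrics client.
type PrecomputeResult = "success" | "error" | "skip";

class ResultCounter {
  private readonly counts = new Map<PrecomputeResult, number>();

  /** Mirrors the `inc({result: ...}, 1)` call shape seen in the diff. */
  inc(labels: {result: PrecomputeResult}, value = 1): void {
    this.counts.set(labels.result, (this.counts.get(labels.result) ?? 0) + value);
  }

  get(result: PrecomputeResult): number {
    return this.counts.get(result) ?? 0;
  }
}

/** Runs one precompute attempt and records exactly one result label for it. */
function recordRun(counter: ResultCounter, run: () => void, shouldSkip: boolean): PrecomputeResult {
  let result: PrecomputeResult;
  if (shouldSkip) {
    result = "skip";
  } else {
    try {
      run();
      result = "success";
    } catch {
      result = "error";
    }
  }
  counter.inc({result}, 1);
  return result;
}
```

Because every attempt increments exactly one label, the three series always sum to the total number of runs, which is what makes a stacked Success/Error/Skip chart meaningful.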

.getBlockSlotState(blockRoot, nextSlot, RegenCaller.preComputeEpoch)
.then(() => {
this.metrics?.preComputeEpoch.count.inc({result: "success"}, 1);
const previousHits = this.chain.checkpointStateCache.updatePreComputedCheckpoint(blockRoot, nextEpoch);
Contributor

Can you find a way to track this without adding custom code in the checkpointStateCache? Ideally this class should have sufficient info to know if the previous run was re-orged or not.

Contributor Author

@dapplion even when our node is re-orged, it does not mean the precomputed epoch transition is wasteful, as some nodes can still send attestations with our old/obsolete target checkpoint and we still need that cached target checkpoint. So this waste metric does not depend on whether the node was re-orged; it depends on whether the produced checkpoint state is used.

Contributor

Yeah, that's true. Let's leave it like this then
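The "used or not" criterion discussed in this thread can be sketched as a small hit tracker. This is an assumption-laden illustration, not the `checkpointStateCache` code from the PR: the class, its method names, the key format, and the `number | null` return type are all hypothetical.

```typescript
// Illustrative sketch only; not the checkpointStateCache implementation.

/** Cache key for a checkpoint; the format is an arbitrary choice for this example. */
function toCheckpointKey(blockRootHex: string, epoch: number): string {
  return `${blockRootHex}:${epoch}`;
}

class PrecomputedCheckpointTracker {
  private key: string | null = null;
  private hits = 0;

  /** Call on every cache read; only reads of the precomputed checkpoint count as hits. */
  onGet(blockRootHex: string, epoch: number): void {
    if (this.key === toCheckpointKey(blockRootHex, epoch)) this.hits++;
  }

  /**
   * Registers a newly precomputed checkpoint and returns how often the previous
   * one was read, or null if there was no previous one. A return value of 0
   * means the previous precompute was never used, i.e. it was wasted.
   */
  update(blockRootHex: string, epoch: number): number | null {
    const previousHits = this.key === null ? null : this.hits;
    this.key = toCheckpointKey(blockRootHex, epoch);
    this.hits = 0;
    return previousHits;
  }
}
```

Note that nothing here looks at re-orgs: a checkpoint on an abandoned branch still counts as useful as long as something reads it, which matches the reasoning above.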

@twoeths twoeths force-pushed the tuyen/precompute-epoch-transition branch from 13f97c0 to 48b8e51 Compare November 1, 2021 03:05
Member

@wemeetagain wemeetagain left a comment

Can you make the naming consistent?
Either precompute or preCompute, not both

Other than that lgtm

@dapplion dapplion merged commit 4461207 into master Nov 4, 2021
@dapplion dapplion deleted the tuyen/precompute-epoch-transition branch November 4, 2021 14:15
Development

Successfully merging this pull request may close these issues.

Early epoch transition when node is synced
4 participants