Checkpoint balances cache #4290

Merged: 7 commits into unstable from tuyen/checkpoint-balances-cache, Jul 13, 2022

@twoeths twoeths commented Jul 13, 2022

Motivation

Right now, whenever we provide a justified checkpoint to forkchoice, we provide its balances too. However, forkchoice may switch to a different justified checkpoint, so we want a way to cache balances per checkpoint.

Description

  • Implement a CheckpointBalancesCache that is populated with checkpoint balances each time a block is processed
  • Implement a justifiedBalancesGetter function:
    • Get the balances from the cache first
    • If they are not found, find the closest state to the checkpoint and extract the balances from there
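
The two bullets above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the type names, the `MAX_CACHED_CHECKPOINTS` bound, the insertion-order eviction, and the `balancesFromClosestState` callback are all assumptions made for the example.

```typescript
// Hypothetical, simplified types for illustration only.
interface CheckpointHex {
  rootHex: string;
  epoch: number;
}
type Balances = Uint8Array;

// Assumed bound on cached checkpoints; the real limit is not stated in this PR.
const MAX_CACHED_CHECKPOINTS = 4;

function toKey(cp: CheckpointHex): string {
  return `${cp.rootHex}:${cp.epoch}`;
}

class CheckpointBalancesCacheSketch {
  private readonly items = new Map<string, Balances>();

  set(cp: CheckpointHex, balances: Balances): void {
    const key = toKey(cp);
    if (this.items.has(key)) return;
    this.items.set(key, balances);
    // A Map iterates in insertion order, so the first key is the oldest entry.
    while (this.items.size > MAX_CACHED_CHECKPOINTS) {
      const oldest = this.items.keys().next().value;
      if (oldest === undefined) break;
      this.items.delete(oldest);
    }
  }

  get(cp: CheckpointHex): Balances | undefined {
    return this.items.get(toKey(cp));
  }
}

// Returns a getter that tries the cache first and only then falls back to
// the (assumed, more expensive) lookup of the closest state.
function makeJustifiedBalancesGetter(
  cache: CheckpointBalancesCacheSketch,
  balancesFromClosestState: (cp: CheckpointHex) => Balances
): (cp: CheckpointHex) => Balances {
  return (cp) => cache.get(cp) ?? balancesFromClosestState(cp);
}
```

Passing the fallback in as a callback keeps the cache itself free of any dependency on state lookups, which makes the hit and miss paths easy to test separately.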

github-actions bot commented Jul 13, 2022

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 3ca87fd Previous: dc8bdfc Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 2.4421 ms/op 2.5754 ms/op 0.95
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 76.364 us/op 129.64 us/op 0.59
BLS verify - blst-native 1.8601 ms/op 2.6716 ms/op 0.70
BLS verifyMultipleSignatures 3 - blst-native 3.8146 ms/op 5.4198 ms/op 0.70
BLS verifyMultipleSignatures 8 - blst-native 8.2163 ms/op 11.672 ms/op 0.70
BLS verifyMultipleSignatures 32 - blst-native 29.778 ms/op 43.098 ms/op 0.69
BLS aggregatePubkeys 32 - blst-native 39.414 us/op 58.689 us/op 0.67
BLS aggregatePubkeys 128 - blst-native 153.22 us/op 227.90 us/op 0.67
getAttestationsForBlock 48.348 ms/op 54.098 ms/op 0.89
isKnown best case - 1 super set check 451.00 ns/op 603.00 ns/op 0.75
isKnown normal case - 2 super set checks 445.00 ns/op 594.00 ns/op 0.75
isKnown worse case - 16 super set checks 454.00 ns/op 589.00 ns/op 0.77
CheckpointStateCache - add get delete 11.459 us/op 15.018 us/op 0.76
validate gossip signedAggregateAndProof - struct 4.2877 ms/op 6.4041 ms/op 0.67
validate gossip attestation - struct 2.0456 ms/op 2.9852 ms/op 0.69
altair verifyImport mainnet_s3766816:31 6.4712 s/op 8.8446 s/op 0.73
pickEth1Vote - no votes 2.1194 ms/op 2.5768 ms/op 0.82
pickEth1Vote - max votes 28.469 ms/op 30.894 ms/op 0.92
pickEth1Vote - Eth1Data hashTreeRoot value x2048 12.678 ms/op 15.807 ms/op 0.80
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 23.665 ms/op 32.334 ms/op 0.73
pickEth1Vote - Eth1Data fastSerialize value x2048 1.5602 ms/op 1.7970 ms/op 0.87
pickEth1Vote - Eth1Data fastSerialize tree x2048 20.998 ms/op 15.890 ms/op 1.32
bytes32 toHexString 1.1400 us/op 1.2340 us/op 0.92
bytes32 Buffer.toString(hex) 710.00 ns/op 797.00 ns/op 0.89
bytes32 Buffer.toString(hex) from Uint8Array 950.00 ns/op 1.0110 us/op 0.94
bytes32 Buffer.toString(hex) + 0x 709.00 ns/op 771.00 ns/op 0.92
Object access 1 prop 0.38500 ns/op 0.42000 ns/op 0.92
Map access 1 prop 0.30000 ns/op 0.34600 ns/op 0.87
Object get x1000 18.103 ns/op 17.806 ns/op 1.02
Map get x1000 0.98500 ns/op 0.95500 ns/op 1.03
Object set x1000 124.22 ns/op 130.26 ns/op 0.95
Map set x1000 72.847 ns/op 91.991 ns/op 0.79
Return object 10000 times 0.37800 ns/op 0.45880 ns/op 0.82
Throw Error 10000 times 5.8931 us/op 7.3815 us/op 0.80
enrSubnets - fastDeserialize 64 bits 2.9390 us/op 3.3350 us/op 0.88
enrSubnets - ssz BitVector 64 bits 786.00 ns/op 862.00 ns/op 0.91
enrSubnets - fastDeserialize 4 bits 453.00 ns/op 456.00 ns/op 0.99
enrSubnets - ssz BitVector 4 bits 784.00 ns/op 844.00 ns/op 0.93
prioritizePeers score -10:0 att 32-0.1 sync 2-0 96.287 us/op 111.34 us/op 0.86
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 123.46 us/op 159.77 us/op 0.77
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 223.37 us/op 284.27 us/op 0.79
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 484.72 us/op 608.93 us/op 0.80
prioritizePeers score 0:0 att 64-1 sync 4-1 464.84 us/op 658.27 us/op 0.71
RateTracker 1000000 limit, 1 obj count per request 209.93 ns/op 234.81 ns/op 0.89
RateTracker 1000000 limit, 2 obj count per request 153.36 ns/op 175.79 ns/op 0.87
RateTracker 1000000 limit, 4 obj count per request 127.35 ns/op 138.53 ns/op 0.92
RateTracker 1000000 limit, 8 obj count per request 114.07 ns/op 119.55 ns/op 0.95
RateTracker with prune 4.7880 us/op 5.5090 us/op 0.87
array of 16000 items push then shift 3.2058 us/op 5.3464 us/op 0.60
LinkedList of 16000 items push then shift 27.148 ns/op 30.008 ns/op 0.90
array of 16000 items push then pop 255.17 ns/op 250.40 ns/op 1.02
LinkedList of 16000 items push then pop 24.216 ns/op 24.289 ns/op 1.00
array of 24000 items push then shift 4.5733 us/op 7.5898 us/op 0.60
LinkedList of 24000 items push then shift 28.486 ns/op 28.376 ns/op 1.00
array of 24000 items push then pop 198.12 ns/op 243.15 ns/op 0.81
LinkedList of 24000 items push then pop 24.539 ns/op 23.666 ns/op 1.04
intersect bitArray bitLen 8 11.797 ns/op 12.104 ns/op 0.97
intersect array and set length 8 170.74 ns/op 190.86 ns/op 0.89
intersect bitArray bitLen 128 72.173 ns/op 69.941 ns/op 1.03
intersect array and set length 128 2.3733 us/op 2.4194 us/op 0.98
pass gossip attestations to forkchoice per slot 3.1803 ms/op 7.0882 ms/op 0.45
computeDeltas 3.4227 ms/op 3.8862 ms/op 0.88
computeProposerBoostScoreFromBalances 803.73 us/op 894.53 us/op 0.90
altair processAttestation - 250000 vs - 7PWei normalcase 4.1919 ms/op 6.3241 ms/op 0.66
altair processAttestation - 250000 vs - 7PWei worstcase 6.3279 ms/op 8.9065 ms/op 0.71
altair processAttestation - setStatus - 1/6 committees join 188.15 us/op 262.37 us/op 0.72
altair processAttestation - setStatus - 1/3 committees join 399.03 us/op 493.29 us/op 0.81
altair processAttestation - setStatus - 1/2 committees join 554.82 us/op 666.38 us/op 0.83
altair processAttestation - setStatus - 2/3 committees join 718.51 us/op 903.61 us/op 0.80
altair processAttestation - setStatus - 4/5 committees join 998.25 us/op 1.2897 ms/op 0.77
altair processAttestation - setStatus - 100% committees join 1.1887 ms/op 1.5275 ms/op 0.78
altair processBlock - 250000 vs - 7PWei normalcase 27.750 ms/op 32.032 ms/op 0.87
altair processBlock - 250000 vs - 7PWei normalcase hashState 43.918 ms/op 44.631 ms/op 0.98
altair processBlock - 250000 vs - 7PWei worstcase 82.699 ms/op 105.46 ms/op 0.78
altair processBlock - 250000 vs - 7PWei worstcase hashState 108.17 ms/op 128.44 ms/op 0.84
phase0 processBlock - 250000 vs - 7PWei normalcase 3.6639 ms/op 4.5505 ms/op 0.81
phase0 processBlock - 250000 vs - 7PWei worstcase 46.991 ms/op 64.665 ms/op 0.73
altair processEth1Data - 250000 vs - 7PWei normalcase 857.07 us/op 1.1733 ms/op 0.73
Tree 40 250000 create 903.98 ms/op 1.1076 s/op 0.82
Tree 40 250000 get(125000) 299.99 ns/op 332.57 ns/op 0.90
Tree 40 250000 set(125000) 3.0890 us/op 3.6467 us/op 0.85
Tree 40 250000 toArray() 34.888 ms/op 37.608 ms/op 0.93
Tree 40 250000 iterate all - toArray() + loop 35.027 ms/op 39.446 ms/op 0.89
Tree 40 250000 iterate all - get(i) 116.70 ms/op 137.96 ms/op 0.85
MutableVector 250000 create 18.922 ms/op 17.836 ms/op 1.06
MutableVector 250000 get(125000) 13.144 ns/op 14.505 ns/op 0.91
MutableVector 250000 set(125000) 770.77 ns/op 1.0191 us/op 0.76
MutableVector 250000 toArray() 7.4361 ms/op 8.1819 ms/op 0.91
MutableVector 250000 iterate all - toArray() + loop 7.5265 ms/op 7.9185 ms/op 0.95
MutableVector 250000 iterate all - get(i) 3.3523 ms/op 3.5311 ms/op 0.95
Array 250000 create 6.7293 ms/op 6.7934 ms/op 0.99
Array 250000 clone - spread 3.8533 ms/op 4.7710 ms/op 0.81
Array 250000 get(125000) 1.6330 ns/op 2.0000 ns/op 0.82
Array 250000 set(125000) 1.6790 ns/op 2.1770 ns/op 0.77
Array 250000 iterate all - loop 168.17 us/op 149.06 us/op 1.13
effectiveBalanceIncrements clone Uint8Array 300000 579.80 us/op 287.61 us/op 2.02
effectiveBalanceIncrements clone MutableVector 300000 728.00 ns/op 849.00 ns/op 0.86
effectiveBalanceIncrements rw all Uint8Array 300000 254.72 us/op 289.80 us/op 0.88
effectiveBalanceIncrements rw all MutableVector 300000 175.34 ms/op 228.81 ms/op 0.77
phase0 afterProcessEpoch - 250000 vs - 7PWei 181.91 ms/op 224.27 ms/op 0.81
phase0 beforeProcessEpoch - 250000 vs - 7PWei 76.614 ms/op 79.992 ms/op 0.96
altair processEpoch - mainnet_e81889 596.62 ms/op 665.61 ms/op 0.90
mainnet_e81889 - altair beforeProcessEpoch 165.32 ms/op 163.66 ms/op 1.01
mainnet_e81889 - altair processJustificationAndFinalization 22.732 us/op 61.113 us/op 0.37
mainnet_e81889 - altair processInactivityUpdates 12.062 ms/op 11.990 ms/op 1.01
mainnet_e81889 - altair processRewardsAndPenalties 95.133 ms/op 107.70 ms/op 0.88
mainnet_e81889 - altair processRegistryUpdates 4.2730 us/op 13.368 us/op 0.32
mainnet_e81889 - altair processSlashings 1.0710 us/op 3.5120 us/op 0.30
mainnet_e81889 - altair processEth1DataReset 1.0400 us/op 3.7250 us/op 0.28
mainnet_e81889 - altair processEffectiveBalanceUpdates 2.4124 ms/op 2.4536 ms/op 0.98
mainnet_e81889 - altair processSlashingsReset 6.2880 us/op 24.082 us/op 0.26
mainnet_e81889 - altair processRandaoMixesReset 5.5380 us/op 23.697 us/op 0.23
mainnet_e81889 - altair processHistoricalRootsUpdate 1.1110 us/op 3.8900 us/op 0.29
mainnet_e81889 - altair processParticipationFlagUpdates 3.3100 us/op 11.259 us/op 0.29
mainnet_e81889 - altair processSyncCommitteeUpdates 982.00 ns/op 3.0780 us/op 0.32
mainnet_e81889 - altair afterProcessEpoch 192.33 ms/op 217.40 ms/op 0.88
phase0 processEpoch - mainnet_e58758 539.99 ms/op 635.94 ms/op 0.85
mainnet_e58758 - phase0 beforeProcessEpoch 239.78 ms/op 293.83 ms/op 0.82
mainnet_e58758 - phase0 processJustificationAndFinalization 22.017 us/op 61.070 us/op 0.36
mainnet_e58758 - phase0 processRewardsAndPenalties 139.69 ms/op 149.51 ms/op 0.93
mainnet_e58758 - phase0 processRegistryUpdates 11.431 us/op 34.335 us/op 0.33
mainnet_e58758 - phase0 processSlashings 897.00 ns/op 2.9220 us/op 0.31
mainnet_e58758 - phase0 processEth1DataReset 938.00 ns/op 3.2300 us/op 0.29
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 1.8550 ms/op 2.5443 ms/op 0.73
mainnet_e58758 - phase0 processSlashingsReset 6.4040 us/op 14.668 us/op 0.44
mainnet_e58758 - phase0 processRandaoMixesReset 6.9870 us/op 21.899 us/op 0.32
mainnet_e58758 - phase0 processHistoricalRootsUpdate 898.00 ns/op 3.7780 us/op 0.24
mainnet_e58758 - phase0 processParticipationRecordUpdates 5.5190 us/op 22.132 us/op 0.25
mainnet_e58758 - phase0 afterProcessEpoch 157.19 ms/op 177.15 ms/op 0.89
phase0 processEffectiveBalanceUpdates - 250000 normalcase 2.7298 ms/op 2.4091 ms/op 1.13
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 2.3259 ms/op 2.7006 ms/op 0.86
altair processInactivityUpdates - 250000 normalcase 42.599 ms/op 53.003 ms/op 0.80
altair processInactivityUpdates - 250000 worstcase 52.167 ms/op 61.443 ms/op 0.85
phase0 processRegistryUpdates - 250000 normalcase 10.710 us/op 27.912 us/op 0.38
phase0 processRegistryUpdates - 250000 badcase_full_deposits 402.10 us/op 583.29 us/op 0.69
phase0 processRegistryUpdates - 250000 worstcase 0.5 218.94 ms/op 255.19 ms/op 0.86
altair processRewardsAndPenalties - 250000 normalcase 145.00 ms/op 150.26 ms/op 0.97
altair processRewardsAndPenalties - 250000 worstcase 108.40 ms/op 98.557 ms/op 1.10
phase0 getAttestationDeltas - 250000 normalcase 12.508 ms/op 16.040 ms/op 0.78
phase0 getAttestationDeltas - 250000 worstcase 13.052 ms/op 15.904 ms/op 0.82
phase0 processSlashings - 250000 worstcase 5.4486 ms/op 7.3286 ms/op 0.74
altair processSyncCommitteeUpdates - 250000 284.77 ms/op 349.99 ms/op 0.81
BeaconState.hashTreeRoot - No change 497.00 ns/op 588.00 ns/op 0.85
BeaconState.hashTreeRoot - 1 full validator 56.991 us/op 75.620 us/op 0.75
BeaconState.hashTreeRoot - 32 full validator 544.64 us/op 736.95 us/op 0.74
BeaconState.hashTreeRoot - 512 full validator 6.0589 ms/op 7.8964 ms/op 0.77
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 70.480 us/op 92.449 us/op 0.76
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 1.0212 ms/op 1.3250 ms/op 0.77
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 13.032 ms/op 17.400 ms/op 0.75
BeaconState.hashTreeRoot - 1 balances 54.207 us/op 69.347 us/op 0.78
BeaconState.hashTreeRoot - 32 balances 484.37 us/op 622.84 us/op 0.78
BeaconState.hashTreeRoot - 512 balances 4.4980 ms/op 5.5858 ms/op 0.81
BeaconState.hashTreeRoot - 250000 balances 79.253 ms/op 111.57 ms/op 0.71
aggregationBits - 2048 els - zipIndexesInBitList 34.586 us/op 33.396 us/op 1.04
regular array get 100000 times 67.611 us/op 55.261 us/op 1.22
wrappedArray get 100000 times 67.669 us/op 53.683 us/op 1.26
arrayWithProxy get 100000 times 29.063 ms/op 34.435 ms/op 0.84
ssz.Root.equals 515.00 ns/op 576.00 ns/op 0.89
byteArrayEquals 504.00 ns/op 549.00 ns/op 0.92
shuffle list - 16384 els 11.051 ms/op 10.944 ms/op 1.01
shuffle list - 250000 els 162.72 ms/op 170.78 ms/op 0.95
processSlot - 1 slots 12.555 us/op 16.928 us/op 0.74
processSlot - 32 slots 1.7486 ms/op 2.3400 ms/op 0.75
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 372.34 us/op 494.33 us/op 0.75
getCommitteeAssignments - req 1 vs - 250000 vc 5.3233 ms/op 5.1429 ms/op 1.04
getCommitteeAssignments - req 100 vs - 250000 vc 7.3093 ms/op 8.0069 ms/op 0.91
getCommitteeAssignments - req 1000 vs - 250000 vc 7.7371 ms/op 8.0836 ms/op 0.96
computeProposers - vc 250000 19.046 ms/op 19.582 ms/op 0.97
computeEpochShuffling - vc 250000 164.72 ms/op 179.19 ms/op 0.92
getNextSyncCommittee - vc 250000 277.90 ms/op 346.01 ms/op 0.80

by benchmarkbot/action

@twoeths twoeths marked this pull request as ready for review July 13, 2022 08:48
@twoeths twoeths requested a review from a team as a code owner July 13, 2022 08:48
const checkpointSlot = computeStartSlotAtEpoch(checkpoint.epoch);
const head = this.forkChoice.getHead();
// Find a state in the same branch as the checkpoint and at the same epoch. Balances should be exactly the same
for (const descendantBlock of this.forkChoice.forwardIterateDescendants(checkpoint.rootHex)) {
Contributor Author

Should we consume forwardIterateDescendants only once to improve performance a bit? We could either extract its output to an array, or the function could return an array itself.

Contributor

Since this code path is very unlikely to be hit, I would not do that, for simplicity. Metrics will clearly show if we run the iterator twice too often; only then should we optimize.
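
The trade-off in this thread can be sketched as follows. This is only an illustration: `BlockSummary`, `iterateDescendants`, the step counter, and the two predicates are hypothetical stand-ins, and what each pass actually searches for in the PR is not shown here.

```typescript
// Hypothetical stand-in for a block summary yielded by fork choice.
interface BlockSummary {
  rootHex: string;
  slot: number;
}

// Generator that counts how many items it yields, to make the cost visible.
function* iterateDescendants(
  blocks: BlockSummary[],
  counter: {steps: number}
): Generator<BlockSummary> {
  for (const block of blocks) {
    counter.steps++;
    yield block;
  }
}

type Predicate = (block: BlockSummary) => boolean;

// Simple version: run the generator once per pass.
function findTwoPasses(
  blocks: BlockSummary[],
  counter: {steps: number},
  preferred: Predicate,
  fallback: Predicate
): BlockSummary | undefined {
  for (const block of iterateDescendants(blocks, counter)) {
    if (preferred(block)) return block;
  }
  for (const block of iterateDescendants(blocks, counter)) {
    if (fallback(block)) return block;
  }
  return undefined;
}

// Alternative: consume the generator once and search the array twice.
function findMaterialized(
  blocks: BlockSummary[],
  counter: {steps: number},
  preferred: Predicate,
  fallback: Predicate
): BlockSummary | undefined {
  const descendants = Array.from(iterateDescendants(blocks, counter));
  return descendants.find(preferred) ?? descendants.find(fallback);
}
```

The materialized version walks the descendants once but always walks all of them and allocates an array, while the two-pass version can return early from the first pass. On a rarely hit path the difference is unlikely to matter, which matches the reply above.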

const blockDelaySec = (Math.floor(Date.now() / 1000) - postState.genesisTime) % chain.config.SECONDS_PER_SLOT;
const blockRoot = toHexString(chain.config.getForkTypes(block.message.slot).BeaconBlock.hashTreeRoot(block.message));
// Should compute checkpoint balances before forkchoice.onBlock
chain.checkpointBalancesCache.processState(blockRoot, postState);
Contributor

Calling fc_store.onVerifiedBlock(), in a pattern similar to Lighthouse's, sounds better than doing this here.
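
One way to read this suggestion is that the cache update moves behind a store hook that fork choice itself calls, so the import code no longer has to remember the ordering. The sketch below is an assumption about that shape only: `StoreSketch`, `ForkChoiceSketch`, `PostState`, and the keying by block root and epoch are hypothetical and do not reproduce the Lodestar or Lighthouse implementations.

```typescript
// Hypothetical, simplified post-state shape for illustration only.
interface PostState {
  epoch: number;
  effectiveBalances: number[];
}

class StoreSketch {
  private readonly balances = new Map<string, number[]>();

  // Hook invoked for every verified block, before fork choice reads balances.
  onVerifiedBlock(blockRootHex: string, postState: PostState): void {
    this.balances.set(`${blockRootHex}:${postState.epoch}`, postState.effectiveBalances);
  }

  getBalances(rootHex: string, epoch: number): number[] | undefined {
    return this.balances.get(`${rootHex}:${epoch}`);
  }
}

class ForkChoiceSketch {
  private readonly store: StoreSketch;

  constructor(store: StoreSketch) {
    this.store = store;
  }

  onBlock(blockRootHex: string, postState: PostState): number[] {
    // The cache update happens inside onBlock, so callers cannot forget
    // to populate the cache before fork choice needs it.
    this.store.onVerifiedBlock(blockRootHex, postState);
    const balances = this.store.getBalances(blockRootHex, postState.epoch);
    if (balances === undefined) throw new Error("balances missing after onVerifiedBlock");
    return balances;
  }
}
```

With this shape, the "should compute checkpoint balances before forkchoice.onBlock" requirement is enforced by the call order inside `onBlock` rather than by a comment at the call site.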

@dapplion dapplion enabled auto-merge (squash) July 13, 2022 14:41
@dapplion dapplion merged commit cb97873 into unstable Jul 13, 2022
@dapplion dapplion deleted the tuyen/checkpoint-balances-cache branch July 13, 2022 15:02
twoeths commented Jul 14, 2022

Tested for a while in the nightly group; metrics show that we always hit the balances cache.

[Image: Screen Shot 2022-07-14 at 09 08 30]


2 participants