fix: improve compute new state root when producing block #6195

Merged on Dec 15, 2023 (3 commits)

Conversation

twoeths (Contributor) commented Dec 15, 2023

Motivation

Improve the performance of the computeNewStateRoot() function when producing a block.

Description

  • computeNewStateRoot() was slow because we did not compute the hash tree root in prepareNextSlot(), so nothing was cached. The main fix is to compute it there
  • Also track the duration of state.hashTreeRoot() calls in 3 places: verifyBlocksStateTransitionOnly(), prepareNextSlot() and computeNewStateRoot()
  • Also correct the buckets of that metric, as the current values are not suitable for different networks
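
To illustrate why hashing ahead of time helps, here is a minimal sketch of a Merkle tree that caches internal node hashes. It is a toy model, not the Lodestar or SSZ implementation: the class name `CachedMerkleTree`, its heap-style array layout, and the `hashCount` counter are all assumptions made for this example. Once the root has been computed, a later call only rehashes the path above whichever leaves changed.

```typescript
import {createHash} from "node:crypto";

// Hypothetical toy tree (not the real SSZ tree). nodes[1] is the root,
// children of node i are 2*i and 2*i+1, leaves live at nodes[leafCount + index].
// A null internal node means "hash not cached yet".
class CachedMerkleTree {
  private readonly nodes: (Buffer | null)[];
  // Number of SHA-256 invocations so far, exposed to make caching visible.
  hashCount = 0;

  constructor(private readonly leafCount: number) {
    if (leafCount < 2 || (leafCount & (leafCount - 1)) !== 0) {
      throw new Error("leafCount must be a power of two >= 2");
    }
    this.nodes = new Array<Buffer | null>(2 * leafCount).fill(null);
    for (let i = 0; i < leafCount; i++) this.nodes[leafCount + i] = Buffer.alloc(32);
  }

  setLeaf(index: number, value: Buffer): void {
    if (index < 0 || index >= this.leafCount) throw new Error("leaf index out of range");
    let i = this.leafCount + index;
    this.nodes[i] = value;
    // Invalidate only the cached hashes on the path from this leaf to the root.
    for (i >>= 1; i >= 1; i >>= 1) this.nodes[i] = null;
  }

  hashTreeRoot(): Buffer {
    return this.hashNode(1);
  }

  private hashNode(i: number): Buffer {
    const cached = this.nodes[i];
    if (cached != null) return cached;
    const digest = createHash("sha256")
      .update(this.hashNode(2 * i))
      .update(this.hashNode(2 * i + 1))
      .digest();
    this.hashCount++;
    this.nodes[i] = digest;
    return digest;
  }
}

const tree = new CachedMerkleTree(8);
for (let i = 0; i < 8; i++) tree.setLeaf(i, Buffer.alloc(32, i + 1));
tree.hashTreeRoot(); // cold call: hashes all 7 internal nodes
const coldHashes = tree.hashCount;
tree.setLeaf(3, Buffer.alloc(32, 0xff));
tree.hashTreeRoot(); // warm call: rehashes only the 3 nodes above leaf 3
console.log({coldHashes, warmHashes: tree.hashCount - coldHashes});
```

In this model the cold call plays the role of the work moved into prepareNextSlot(), and the warm call plays the role of the later hashTreeRoot() in computeNewStateRoot().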

Closes #6194

@@ -104,6 +104,12 @@ export class PrepareNextSlotScheduler {
RegenCaller.precomputeEpoch
);

// cache HashObjects for faster hashTreeRoot() later, especially for computeNewStateRoot() if we need to produce a block at slot 0 of epoch
// see https://github.com/ChainSafe/lodestar/issues/6194
const hashTreeRootTimer = this.metrics?.stateHashTreeRootTime.startTimer({source: "prepare_next_slot"});
Member

What should be the naming convention we use for source?

e.g. here we use camelCase (function names)

onStateCloneMetrics(postState, metrics, "stateTransition");

and recently added (#6143) step metrics also use camelCase

const timer = metrics?.epochTransitionStepTime.startTimer({step: "afterProcessEpoch"});

I kinda like the function names, and they might be better for consistency, but they may not always be applicable (i.e. if there is no function call specific to the metric)

Contributor Author

> What should be the naming convention we use for source?

Agreed that we should have some naming convention. We used snake case as suggested by @dapplion, and that's my preference. Agreed there are inconsistencies, so we need some agreement to work from.

I don't want to use function names, just whatever name makes sense in the context, because:

  • some function names are so long that they don't render nicely in Grafana, for example verifyBlocksStateTransitionOnly
  • one function could call a metric multiple times

So this one should be more flexible to me; as long as a PR gets through the review process we're fine, I think.

Member

> I don't want to use function names, just whatever name makes sense in the context because:

Makes sense not to be too strict here then. Also, I guess those labels are mostly used to visualize and separate metric values in a panel. As long as it is visually readable and makes sense in the context of the metrics, the naming convention (camelCase vs snake_case) should not matter that much.

Contributor

snake case just reads better in grafana :)
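
One possible way to keep label values consistent without enforcing a naming rule by hand is to let the type checker own the list of allowed values. The sketch below is an assumption-heavy illustration, not Lodestar's metrics code: `LabelledTimer` is a hypothetical stand-in for a histogram with a `startTimer()` method, and of the three label values only "prepare_next_slot" appears in this PR; the other two are made up for the example.

```typescript
// Only "prepare_next_slot" comes from the diff above; the other values are hypothetical.
type HashTreeRootSource = "prepare_next_slot" | "block_transition" | "compute_new_state_root";

// Hypothetical stand-in for a labelled histogram: it just records observations in memory.
class LabelledTimer<L extends string> {
  readonly observations: {source: L; seconds: number}[] = [];

  // The clock is injectable so the timer can be exercised deterministically.
  constructor(private readonly nowMs: () => number = () => Date.now()) {}

  // Returns a function that stops the timer and records the elapsed seconds.
  startTimer(labels: {source: L}): () => number {
    const startMs = this.nowMs();
    return () => {
      const seconds = (this.nowMs() - startMs) / 1000;
      this.observations.push({source: labels.source, seconds});
      return seconds;
    };
  }
}

const stateHashTreeRootTime = new LabelledTimer<HashTreeRootSource>();
const stopTimer = stateHashTreeRootTime.startTimer({source: "prepare_next_slot"});
// ... the timed work, e.g. a hashTreeRoot() call, would run here ...
stopTimer();
// stateHashTreeRootTime.startTimer({source: "prepareNextSlot"}) would not compile,
// because "prepareNextSlot" is not part of the HashTreeRootSource union.
```

With a union type like this, whichever style is chosen is written down once, and a mismatched label becomes a compile error rather than a second series in a Grafana panel.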

codecov bot commented Dec 15, 2023

Codecov Report

Merging #6195 (291d800) into unstable (4658348) will not change coverage.
The diff coverage is n/a.

Additional details and impacted files
@@            Coverage Diff            @@
##           unstable    #6195   +/-   ##
=========================================
  Coverage     90.35%   90.35%           
=========================================
  Files            78       78           
  Lines          8087     8087           
  Branches        490      490           
=========================================
  Hits           7307     7307           
  Misses          772      772           
  Partials          8        8           

Contributor

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite | Current: 8f31487 | Previous: 4658348 | Ratio (Current / Previous)
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 793.94 us/op 882.70 us/op 0.90
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 55.667 us/op 120.35 us/op 0.46
BLS verify - blst-native 1.1252 ms/op 1.4129 ms/op 0.80
BLS verifyMultipleSignatures 3 - blst-native 2.3291 ms/op 3.0208 ms/op 0.77
BLS verifyMultipleSignatures 8 - blst-native 5.0544 ms/op 6.7764 ms/op 0.75
BLS verifyMultipleSignatures 32 - blst-native 19.398 ms/op 25.414 ms/op 0.76
BLS verifyMultipleSignatures 64 - blst-native 37.358 ms/op 46.629 ms/op 0.80
BLS verifyMultipleSignatures 128 - blst-native 72.364 ms/op 95.842 ms/op 0.76
BLS deserializing 10000 signatures 795.75 ms/op 1.0122 s/op 0.79
BLS deserializing 100000 signatures 8.3795 s/op 9.7136 s/op 0.86
BLS verifyMultipleSignatures - same message - 3 - blst-native 1.1646 ms/op 1.4406 ms/op 0.81
BLS verifyMultipleSignatures - same message - 8 - blst-native 1.7247 ms/op 1.6743 ms/op 1.03
BLS verifyMultipleSignatures - same message - 32 - blst-native 2.0851 ms/op 2.5283 ms/op 0.82
BLS verifyMultipleSignatures - same message - 64 - blst-native 4.1298 ms/op 3.9022 ms/op 1.06
BLS verifyMultipleSignatures - same message - 128 - blst-native 5.5685 ms/op 6.2711 ms/op 0.89
BLS aggregatePubkeys 32 - blst-native 24.227 us/op 28.518 us/op 0.85
BLS aggregatePubkeys 128 - blst-native 91.752 us/op 111.93 us/op 0.82
getAttestationsForBlock 54.604 ms/op 71.687 ms/op 0.76
getSlashingsAndExits - default max 133.94 us/op 213.84 us/op 0.63
getSlashingsAndExits - 2k 485.98 us/op 629.77 us/op 0.77
proposeBlockBody type=full, size=empty 4.3132 ms/op 6.4094 ms/op 0.67
isKnown best case - 1 super set check 596.00 ns/op 392.00 ns/op 1.52
isKnown normal case - 2 super set checks 606.00 ns/op 442.00 ns/op 1.37
isKnown worse case - 16 super set checks 446.00 ns/op 431.00 ns/op 1.03
CheckpointStateCache - add get delete 4.5390 us/op 6.2830 us/op 0.72
validate api signedAggregateAndProof - struct 2.3669 ms/op 2.9800 ms/op 0.79
validate gossip signedAggregateAndProof - struct 2.4993 ms/op 2.9898 ms/op 0.84
validate gossip attestation - vc 640000 1.1582 ms/op 1.4403 ms/op 0.80
batch validate gossip attestation - vc 640000 - chunk 32 141.55 us/op 172.52 us/op 0.82
batch validate gossip attestation - vc 640000 - chunk 64 126.85 us/op 149.08 us/op 0.85
batch validate gossip attestation - vc 640000 - chunk 128 129.01 us/op 141.79 us/op 0.91
batch validate gossip attestation - vc 640000 - chunk 256 134.40 us/op 139.49 us/op 0.96
pickEth1Vote - no votes 1.0633 ms/op 1.4380 ms/op 0.74
pickEth1Vote - max votes 14.568 ms/op 13.693 ms/op 1.06
pickEth1Vote - Eth1Data hashTreeRoot value x2048 24.425 ms/op 19.386 ms/op 1.26
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 23.906 ms/op 34.656 ms/op 0.69
pickEth1Vote - Eth1Data fastSerialize value x2048 440.30 us/op 727.01 us/op 0.61
pickEth1Vote - Eth1Data fastSerialize tree x2048 5.8115 ms/op 8.4885 ms/op 0.68
bytes32 toHexString 524.00 ns/op 552.00 ns/op 0.95
bytes32 Buffer.toString(hex) 395.00 ns/op 303.00 ns/op 1.30
bytes32 Buffer.toString(hex) from Uint8Array 574.00 ns/op 455.00 ns/op 1.26
bytes32 Buffer.toString(hex) + 0x 388.00 ns/op 309.00 ns/op 1.26
Object access 1 prop 0.23600 ns/op 0.17300 ns/op 1.36
Map access 1 prop 0.21200 ns/op 0.15400 ns/op 1.38
Object get x1000 5.6270 ns/op 7.5740 ns/op 0.74
Map get x1000 0.88300 ns/op 0.85500 ns/op 1.03
Object set x1000 31.302 ns/op 60.155 ns/op 0.52
Map set x1000 19.146 ns/op 49.719 ns/op 0.39
Return object 10000 times 0.24780 ns/op 0.25420 ns/op 0.97
Throw Error 10000 times 2.8852 us/op 3.9618 us/op 0.73
fastMsgIdFn sha256 / 200 bytes 2.1850 us/op 3.3960 us/op 0.64
fastMsgIdFn h32 xxhash / 200 bytes 389.00 ns/op 334.00 ns/op 1.16
fastMsgIdFn h64 xxhash / 200 bytes 408.00 ns/op 411.00 ns/op 0.99
fastMsgIdFn sha256 / 1000 bytes 6.5920 us/op 12.253 us/op 0.54
fastMsgIdFn h32 xxhash / 1000 bytes 526.00 ns/op 501.00 ns/op 1.05
fastMsgIdFn h64 xxhash / 1000 bytes 481.00 ns/op 479.00 ns/op 1.00
fastMsgIdFn sha256 / 10000 bytes 54.742 us/op 112.67 us/op 0.49
fastMsgIdFn h32 xxhash / 10000 bytes 1.9310 us/op 2.0950 us/op 0.92
fastMsgIdFn h64 xxhash / 10000 bytes 1.3420 us/op 1.4890 us/op 0.90
send data - 1000 256B messages 15.475 ms/op 22.716 ms/op 0.68
send data - 1000 512B messages 19.207 ms/op 29.809 ms/op 0.64
send data - 1000 1024B messages 26.280 ms/op 46.404 ms/op 0.57
send data - 1000 1200B messages 30.959 ms/op 46.073 ms/op 0.67
send data - 1000 2048B messages 43.301 ms/op 50.023 ms/op 0.87
send data - 1000 4096B messages 36.218 ms/op 48.695 ms/op 0.74
send data - 1000 16384B messages 90.002 ms/op 121.81 ms/op 0.74
send data - 1000 65536B messages 378.96 ms/op 555.25 ms/op 0.68
enrSubnets - fastDeserialize 64 bits 1.6280 us/op 1.8390 us/op 0.89
enrSubnets - ssz BitVector 64 bits 596.00 ns/op 551.00 ns/op 1.08
enrSubnets - fastDeserialize 4 bits 283.00 ns/op 255.00 ns/op 1.11
enrSubnets - ssz BitVector 4 bits 606.00 ns/op 560.00 ns/op 1.08
prioritizePeers score -10:0 att 32-0.1 sync 2-0 82.488 us/op 120.19 us/op 0.69
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 114.17 us/op 152.32 us/op 0.75
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 170.23 us/op 227.55 us/op 0.75
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 254.69 us/op 358.07 us/op 0.71
prioritizePeers score 0:0 att 64-1 sync 4-1 255.39 us/op 389.00 us/op 0.66
array of 16000 items push then shift 1.3684 us/op 1.7831 us/op 0.77
LinkedList of 16000 items push then shift 7.8350 ns/op 9.9810 ns/op 0.78
array of 16000 items push then pop 113.53 ns/op 109.65 ns/op 1.04
LinkedList of 16000 items push then pop 7.1270 ns/op 9.7930 ns/op 0.73
array of 24000 items push then shift 2.0097 us/op 2.6875 us/op 0.75
LinkedList of 24000 items push then shift 6.5650 ns/op 10.010 ns/op 0.66
array of 24000 items push then pop 138.65 ns/op 150.97 ns/op 0.92
LinkedList of 24000 items push then pop 5.8790 ns/op 9.3180 ns/op 0.63
intersect bitArray bitLen 8 5.1660 ns/op 6.8220 ns/op 0.76
intersect array and set length 8 50.264 ns/op 77.427 ns/op 0.65
intersect bitArray bitLen 128 28.264 ns/op 36.184 ns/op 0.78
intersect array and set length 128 729.16 ns/op 1.1185 us/op 0.65
bitArray.getTrueBitIndexes() bitLen 128 1.5680 us/op 1.8200 us/op 0.86
bitArray.getTrueBitIndexes() bitLen 248 2.8340 us/op 3.0830 us/op 0.92
bitArray.getTrueBitIndexes() bitLen 512 5.1760 us/op 5.8150 us/op 0.89
Buffer.concat 32 items 895.00 ns/op 1.0460 us/op 0.86
Uint8Array.set 32 items 1.5090 us/op 2.0830 us/op 0.72
Set add up to 64 items then delete first 1.7445 us/op 4.8910 us/op 0.36
OrderedSet add up to 64 items then delete first 2.6774 us/op 6.5355 us/op 0.41
Set add up to 64 items then delete last 2.0078 us/op 5.1979 us/op 0.39
OrderedSet add up to 64 items then delete last 2.9653 us/op 6.6742 us/op 0.44
Set add up to 64 items then delete middle 1.9828 us/op 5.1421 us/op 0.39
OrderedSet add up to 64 items then delete middle 4.1711 us/op 8.4158 us/op 0.50
Set add up to 128 items then delete first 3.9067 us/op 10.801 us/op 0.36
OrderedSet add up to 128 items then delete first 6.1744 us/op 15.508 us/op 0.40
Set add up to 128 items then delete last 3.8131 us/op 11.019 us/op 0.35
OrderedSet add up to 128 items then delete last 5.7540 us/op 13.310 us/op 0.43
Set add up to 128 items then delete middle 3.8166 us/op 10.886 us/op 0.35
OrderedSet add up to 128 items then delete middle 11.716 us/op 19.622 us/op 0.60
Set add up to 256 items then delete first 7.7743 us/op 21.072 us/op 0.37
OrderedSet add up to 256 items then delete first 12.486 us/op 27.862 us/op 0.45
Set add up to 256 items then delete last 7.5324 us/op 21.466 us/op 0.35
OrderedSet add up to 256 items then delete last 11.683 us/op 25.540 us/op 0.46
Set add up to 256 items then delete middle 7.6196 us/op 19.239 us/op 0.40
OrderedSet add up to 256 items then delete middle 32.663 us/op 51.681 us/op 0.63
transfer serialized Status (84 B) 1.4920 us/op 1.8330 us/op 0.81
copy serialized Status (84 B) 1.4180 us/op 1.6010 us/op 0.89
transfer serialized SignedVoluntaryExit (112 B) 1.6370 us/op 1.9120 us/op 0.86
copy serialized SignedVoluntaryExit (112 B) 1.4610 us/op 1.6830 us/op 0.87
transfer serialized ProposerSlashing (416 B) 2.8700 us/op 2.1880 us/op 1.31
copy serialized ProposerSlashing (416 B) 2.5090 us/op 2.1440 us/op 1.17
transfer serialized Attestation (485 B) 2.1210 us/op 2.2670 us/op 0.94
copy serialized Attestation (485 B) 2.3780 us/op 3.1110 us/op 0.76
transfer serialized AttesterSlashing (33232 B) 2.2710 us/op 3.1600 us/op 0.72
copy serialized AttesterSlashing (33232 B) 5.8100 us/op 6.8690 us/op 0.85
transfer serialized Small SignedBeaconBlock (128000 B) 2.1090 us/op 3.2410 us/op 0.65
copy serialized Small SignedBeaconBlock (128000 B) 13.698 us/op 20.516 us/op 0.67
transfer serialized Avg SignedBeaconBlock (200000 B) 2.8180 us/op 3.6510 us/op 0.77
copy serialized Avg SignedBeaconBlock (200000 B) 16.464 us/op 30.211 us/op 0.54
transfer serialized BlobsSidecar (524380 B) 3.0650 us/op 3.7160 us/op 0.82
copy serialized BlobsSidecar (524380 B) 85.137 us/op 146.42 us/op 0.58
transfer serialized Big SignedBeaconBlock (1000000 B) 2.8150 us/op 3.9060 us/op 0.72
copy serialized Big SignedBeaconBlock (1000000 B) 154.93 us/op 440.02 us/op 0.35
pass gossip attestations to forkchoice per slot 2.9137 ms/op 4.2543 ms/op 0.68
forkChoice updateHead vc 100000 bc 64 eq 0 484.92 us/op 822.73 us/op 0.59
forkChoice updateHead vc 600000 bc 64 eq 0 2.6638 ms/op 4.4371 ms/op 0.60
forkChoice updateHead vc 1000000 bc 64 eq 0 4.9360 ms/op 7.6048 ms/op 0.65
forkChoice updateHead vc 600000 bc 320 eq 0 2.6967 ms/op 4.8353 ms/op 0.56
forkChoice updateHead vc 600000 bc 1200 eq 0 2.8314 ms/op 4.9266 ms/op 0.57
forkChoice updateHead vc 600000 bc 7200 eq 0 3.8122 ms/op 5.8589 ms/op 0.65
forkChoice updateHead vc 600000 bc 64 eq 1000 10.337 ms/op 12.515 ms/op 0.83
forkChoice updateHead vc 600000 bc 64 eq 10000 10.431 ms/op 12.687 ms/op 0.82
forkChoice updateHead vc 600000 bc 64 eq 300000 12.994 ms/op 20.370 ms/op 0.64
computeDeltas 500000 validators 300 proto nodes 3.4179 ms/op 7.3850 ms/op 0.46
computeDeltas 500000 validators 1200 proto nodes 3.2723 ms/op 7.5132 ms/op 0.44
computeDeltas 500000 validators 7200 proto nodes 2.9998 ms/op 7.1197 ms/op 0.42
computeDeltas 750000 validators 300 proto nodes 4.5070 ms/op 10.983 ms/op 0.41
computeDeltas 750000 validators 1200 proto nodes 4.5910 ms/op 10.807 ms/op 0.42
computeDeltas 750000 validators 7200 proto nodes 4.4944 ms/op 10.959 ms/op 0.41
computeDeltas 1400000 validators 300 proto nodes 9.0006 ms/op 21.770 ms/op 0.41
computeDeltas 1400000 validators 1200 proto nodes 9.0045 ms/op 20.724 ms/op 0.43
computeDeltas 1400000 validators 7200 proto nodes 9.0025 ms/op 20.340 ms/op 0.44
computeDeltas 2100000 validators 300 proto nodes 13.479 ms/op 30.408 ms/op 0.44
computeDeltas 2100000 validators 1200 proto nodes 13.896 ms/op 30.126 ms/op 0.46
computeDeltas 2100000 validators 7200 proto nodes 13.315 ms/op 30.063 ms/op 0.44
computeProposerBoostScoreFromBalances 500000 validators 3.3327 ms/op 3.9348 ms/op 0.85
computeProposerBoostScoreFromBalances 750000 validators 3.3437 ms/op 3.9708 ms/op 0.84
computeProposerBoostScoreFromBalances 1400000 validators 3.2918 ms/op 4.0003 ms/op 0.82
computeProposerBoostScoreFromBalances 2100000 validators 3.3111 ms/op 3.9232 ms/op 0.84
altair processAttestation - 250000 vs - 7PWei normalcase 1.4438 ms/op 2.3746 ms/op 0.61
altair processAttestation - 250000 vs - 7PWei worstcase 2.1626 ms/op 3.8248 ms/op 0.57
altair processAttestation - setStatus - 1/6 committees join 68.218 us/op 155.76 us/op 0.44
altair processAttestation - setStatus - 1/3 committees join 139.78 us/op 295.18 us/op 0.47
altair processAttestation - setStatus - 1/2 committees join 213.05 us/op 396.45 us/op 0.54
altair processAttestation - setStatus - 2/3 committees join 282.04 us/op 488.79 us/op 0.58
altair processAttestation - setStatus - 4/5 committees join 387.47 us/op 678.30 us/op 0.57
altair processAttestation - setStatus - 100% committees join 516.35 us/op 817.90 us/op 0.63
altair processBlock - 250000 vs - 7PWei normalcase 7.9534 ms/op 11.305 ms/op 0.70
altair processBlock - 250000 vs - 7PWei normalcase hashState 32.054 ms/op 40.306 ms/op 0.80
altair processBlock - 250000 vs - 7PWei worstcase 32.987 ms/op 40.115 ms/op 0.82
altair processBlock - 250000 vs - 7PWei worstcase hashState 86.261 ms/op 102.61 ms/op 0.84
phase0 processBlock - 250000 vs - 7PWei normalcase 2.5046 ms/op 2.6923 ms/op 0.93
phase0 processBlock - 250000 vs - 7PWei worstcase 27.297 ms/op 33.188 ms/op 0.82
altair processEth1Data - 250000 vs - 7PWei normalcase 285.51 us/op 692.00 us/op 0.41
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 5.9000 us/op 9.6230 us/op 0.61
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 38.819 us/op 73.695 us/op 0.53
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 20.357 us/op 26.418 us/op 0.77
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 12.598 us/op 15.115 us/op 0.83
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 132.99 us/op 195.04 us/op 0.68
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 1.1027 ms/op 1.2987 ms/op 0.85
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 880.24 us/op 1.8292 ms/op 0.48
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 1.0384 ms/op 1.5619 ms/op 0.66
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 2.6560 ms/op 3.5488 ms/op 0.75
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 1.4574 ms/op 2.7979 ms/op 0.52
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 3.5559 ms/op 6.3707 ms/op 0.56
Tree 40 250000 create 262.34 ms/op 364.58 ms/op 0.72
Tree 40 250000 get(125000) 111.42 ns/op 205.96 ns/op 0.54
Tree 40 250000 set(125000) 717.24 ns/op 1.0673 us/op 0.67
Tree 40 250000 toArray() 9.8097 ms/op 20.403 ms/op 0.48
Tree 40 250000 iterate all - toArray() + loop 9.6299 ms/op 20.892 ms/op 0.46
Tree 40 250000 iterate all - get(i) 38.637 ms/op 70.407 ms/op 0.55
MutableVector 250000 create 11.322 ms/op 15.209 ms/op 0.74
MutableVector 250000 get(125000) 5.7280 ns/op 6.6350 ns/op 0.86
MutableVector 250000 set(125000) 213.46 ns/op 293.36 ns/op 0.73
MutableVector 250000 toArray() 2.0860 ms/op 3.3986 ms/op 0.61
MutableVector 250000 iterate all - toArray() + loop 2.4790 ms/op 5.8806 ms/op 0.42
MutableVector 250000 iterate all - get(i) 1.3138 ms/op 1.5632 ms/op 0.84
Array 250000 create 2.6932 ms/op 3.0244 ms/op 0.89
Array 250000 clone - spread 1.1053 ms/op 1.2683 ms/op 0.87
Array 250000 get(125000) 1.0170 ns/op 1.0620 ns/op 0.96
Array 250000 set(125000) 1.2290 ns/op 4.2250 ns/op 0.29
Array 250000 iterate all - loop 155.01 us/op 169.60 us/op 0.91
effectiveBalanceIncrements clone Uint8Array 300000 13.588 us/op 25.863 us/op 0.53
effectiveBalanceIncrements clone MutableVector 300000 395.00 ns/op 365.00 ns/op 1.08
effectiveBalanceIncrements rw all Uint8Array 300000 186.53 us/op 211.60 us/op 0.88
effectiveBalanceIncrements rw all MutableVector 300000 63.278 ms/op 84.749 ms/op 0.75
phase0 afterProcessEpoch - 250000 vs - 7PWei 75.411 ms/op 117.50 ms/op 0.64
phase0 beforeProcessEpoch - 250000 vs - 7PWei 45.642 ms/op 54.913 ms/op 0.83
altair processEpoch - mainnet_e81889 450.79 ms/op 491.14 ms/op 0.92
mainnet_e81889 - altair beforeProcessEpoch 76.556 ms/op 88.570 ms/op 0.86
mainnet_e81889 - altair processJustificationAndFinalization 6.6090 us/op 16.587 us/op 0.40
mainnet_e81889 - altair processInactivityUpdates 3.7502 ms/op 6.8458 ms/op 0.55
mainnet_e81889 - altair processRewardsAndPenalties 58.772 ms/op 66.975 ms/op 0.88
mainnet_e81889 - altair processRegistryUpdates 2.0760 us/op 2.6990 us/op 0.77
mainnet_e81889 - altair processSlashings 602.00 ns/op 381.00 ns/op 1.58
mainnet_e81889 - altair processEth1DataReset 562.00 ns/op 440.00 ns/op 1.28
mainnet_e81889 - altair processEffectiveBalanceUpdates 1.7580 ms/op 1.4510 ms/op 1.21
mainnet_e81889 - altair processSlashingsReset 2.6490 us/op 3.3780 us/op 0.78
mainnet_e81889 - altair processRandaoMixesReset 3.0810 us/op 5.2960 us/op 0.58
mainnet_e81889 - altair processHistoricalRootsUpdate 630.00 ns/op 656.00 ns/op 0.96
mainnet_e81889 - altair processParticipationFlagUpdates 1.2970 us/op 2.3550 us/op 0.55
mainnet_e81889 - altair processSyncCommitteeUpdates 694.00 ns/op 676.00 ns/op 1.03
mainnet_e81889 - altair afterProcessEpoch 82.332 ms/op 125.40 ms/op 0.66
capella processEpoch - mainnet_e217614 2.1362 s/op 2.2830 s/op 0.94
mainnet_e217614 - capella beforeProcessEpoch 466.99 ms/op 507.27 ms/op 0.92
mainnet_e217614 - capella processJustificationAndFinalization 15.632 us/op 15.519 us/op 1.01
mainnet_e217614 - capella processInactivityUpdates 22.676 ms/op 21.027 ms/op 1.08
mainnet_e217614 - capella processRewardsAndPenalties 406.92 ms/op 424.59 ms/op 0.96
mainnet_e217614 - capella processRegistryUpdates 20.947 us/op 22.098 us/op 0.95
mainnet_e217614 - capella processSlashings 898.00 ns/op 888.00 ns/op 1.01
mainnet_e217614 - capella processEth1DataReset 812.00 ns/op 398.00 ns/op 2.04
mainnet_e217614 - capella processEffectiveBalanceUpdates 3.8554 ms/op 6.1410 ms/op 0.63
mainnet_e217614 - capella processSlashingsReset 4.0020 us/op 3.6900 us/op 1.08
mainnet_e217614 - capella processRandaoMixesReset 5.7080 us/op 4.6470 us/op 1.23
mainnet_e217614 - capella processHistoricalRootsUpdate 891.00 ns/op 472.00 ns/op 1.89
mainnet_e217614 - capella processParticipationFlagUpdates 2.0520 us/op 1.4450 us/op 1.42
mainnet_e217614 - capella afterProcessEpoch 240.22 ms/op 317.27 ms/op 0.76
phase0 processEpoch - mainnet_e58758 433.38 ms/op 463.71 ms/op 0.93
mainnet_e58758 - phase0 beforeProcessEpoch 128.54 ms/op 131.53 ms/op 0.98
mainnet_e58758 - phase0 processJustificationAndFinalization 8.6240 us/op 16.907 us/op 0.51
mainnet_e58758 - phase0 processRewardsAndPenalties 51.957 ms/op 53.273 ms/op 0.98
mainnet_e58758 - phase0 processRegistryUpdates 11.944 us/op 23.246 us/op 0.51
mainnet_e58758 - phase0 processSlashings 859.00 ns/op 718.00 ns/op 1.20
mainnet_e58758 - phase0 processEth1DataReset 722.00 ns/op 1.1730 us/op 0.62
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 838.78 us/op 1.1838 ms/op 0.71
mainnet_e58758 - phase0 processSlashingsReset 2.5810 us/op 3.2500 us/op 0.79
mainnet_e58758 - phase0 processRandaoMixesReset 2.9740 us/op 5.3970 us/op 0.55
mainnet_e58758 - phase0 processHistoricalRootsUpdate 603.00 ns/op 490.00 ns/op 1.23
mainnet_e58758 - phase0 processParticipationRecordUpdates 2.9960 us/op 5.5510 us/op 0.54
mainnet_e58758 - phase0 afterProcessEpoch 65.306 ms/op 101.45 ms/op 0.64
phase0 processEffectiveBalanceUpdates - 250000 normalcase 1.0400 ms/op 1.4237 ms/op 0.73
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.1506 ms/op 1.4887 ms/op 0.77
altair processInactivityUpdates - 250000 normalcase 24.098 ms/op 26.112 ms/op 0.92
altair processInactivityUpdates - 250000 worstcase 23.802 ms/op 27.046 ms/op 0.88
phase0 processRegistryUpdates - 250000 normalcase 8.9560 us/op 9.7150 us/op 0.92
phase0 processRegistryUpdates - 250000 badcase_full_deposits 310.05 us/op 390.29 us/op 0.79
phase0 processRegistryUpdates - 250000 worstcase 0.5 104.89 ms/op 120.32 ms/op 0.87
altair processRewardsAndPenalties - 250000 normalcase 56.340 ms/op 58.650 ms/op 0.96
altair processRewardsAndPenalties - 250000 worstcase 53.004 ms/op 57.520 ms/op 0.92
phase0 getAttestationDeltas - 250000 normalcase 5.8816 ms/op 10.101 ms/op 0.58
phase0 getAttestationDeltas - 250000 worstcase 6.0946 ms/op 10.058 ms/op 0.61
phase0 processSlashings - 250000 worstcase 92.093 us/op 95.215 us/op 0.97
altair processSyncCommitteeUpdates - 250000 118.76 ms/op 175.48 ms/op 0.68
BeaconState.hashTreeRoot - No change 408.00 ns/op 272.00 ns/op 1.50
BeaconState.hashTreeRoot - 1 full validator 95.041 us/op 129.98 us/op 0.73
BeaconState.hashTreeRoot - 32 full validator 1.0914 ms/op 1.4092 ms/op 0.77
BeaconState.hashTreeRoot - 512 full validator 10.295 ms/op 13.575 ms/op 0.76
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 130.49 us/op 152.07 us/op 0.86
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 1.7125 ms/op 2.1238 ms/op 0.81
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 22.494 ms/op 28.175 ms/op 0.80
BeaconState.hashTreeRoot - 1 balances 106.30 us/op 143.13 us/op 0.74
BeaconState.hashTreeRoot - 32 balances 1.5856 ms/op 1.2551 ms/op 1.26
BeaconState.hashTreeRoot - 512 balances 9.4615 ms/op 12.253 ms/op 0.77
BeaconState.hashTreeRoot - 250000 balances 137.35 ms/op 235.73 ms/op 0.58
aggregationBits - 2048 els - zipIndexesInBitList 11.431 us/op 16.370 us/op 0.70
byteArrayEquals 32 64.164 ns/op 74.860 ns/op 0.86
Buffer.compare 32 39.666 ns/op 56.012 ns/op 0.71
byteArrayEquals 1024 1.7527 us/op 2.0292 us/op 0.86
Buffer.compare 1024 46.893 ns/op 71.329 ns/op 0.66
byteArrayEquals 16384 26.238 us/op 32.293 us/op 0.81
Buffer.compare 16384 183.94 ns/op 261.50 ns/op 0.70
byteArrayEquals 123687377 200.56 ms/op 240.13 ms/op 0.84
Buffer.compare 123687377 5.7358 ms/op 6.0344 ms/op 0.95
byteArrayEquals 32 - diff last byte 63.371 ns/op 70.432 ns/op 0.90
Buffer.compare 32 - diff last byte 39.456 ns/op 55.944 ns/op 0.71
byteArrayEquals 1024 - diff last byte 1.7332 us/op 1.9874 us/op 0.87
Buffer.compare 1024 - diff last byte 49.634 ns/op 70.364 ns/op 0.71
byteArrayEquals 16384 - diff last byte 27.746 us/op 31.684 us/op 0.88
Buffer.compare 16384 - diff last byte 198.66 ns/op 270.69 ns/op 0.73
byteArrayEquals 123687377 - diff last byte 205.64 ms/op 249.41 ms/op 0.82
Buffer.compare 123687377 - diff last byte 5.2385 ms/op 6.1991 ms/op 0.85
byteArrayEquals 32 - random bytes 4.6420 ns/op 5.2710 ns/op 0.88
Buffer.compare 32 - random bytes 36.510 ns/op 59.580 ns/op 0.61
byteArrayEquals 1024 - random bytes 4.3130 ns/op 5.1280 ns/op 0.84
Buffer.compare 1024 - random bytes 34.903 ns/op 58.982 ns/op 0.59
byteArrayEquals 16384 - random bytes 4.3030 ns/op 5.0890 ns/op 0.85
Buffer.compare 16384 - random bytes 34.756 ns/op 58.769 ns/op 0.59
byteArrayEquals 123687377 - random bytes 7.6000 ns/op 8.2000 ns/op 0.93
Buffer.compare 123687377 - random bytes 38.160 ns/op 62.110 ns/op 0.61
regular array get 100000 times 40.012 us/op 43.527 us/op 0.92
wrappedArray get 100000 times 40.115 us/op 43.518 us/op 0.92
arrayWithProxy get 100000 times 9.4854 ms/op 13.787 ms/op 0.69
ssz.Root.equals 53.368 ns/op 53.000 ns/op 1.01
byteArrayEquals 53.565 ns/op 52.130 ns/op 1.03
Buffer.compare 9.1370 ns/op 10.666 ns/op 0.86
shuffle list - 16384 els 4.6297 ms/op 6.9635 ms/op 0.66
shuffle list - 250000 els 68.563 ms/op 102.05 ms/op 0.67
processSlot - 1 slots 13.297 us/op 17.018 us/op 0.78
processSlot - 32 slots 2.3284 ms/op 3.5627 ms/op 0.65
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 52.002 ms/op 58.614 ms/op 0.89
getCommitteeAssignments - req 1 vs - 250000 vc 2.3397 ms/op 2.4704 ms/op 0.95
getCommitteeAssignments - req 100 vs - 250000 vc 3.5138 ms/op 3.6362 ms/op 0.97
getCommitteeAssignments - req 1000 vs - 250000 vc 3.8155 ms/op 4.0091 ms/op 0.95
findModifiedValidators - 10000 modified validators 400.42 ms/op 533.52 ms/op 0.75
findModifiedValidators - 1000 modified validators 330.91 ms/op 440.96 ms/op 0.75
findModifiedValidators - 100 modified validators 346.64 ms/op 409.15 ms/op 0.85
findModifiedValidators - 10 modified validators 333.78 ms/op 413.39 ms/op 0.81
findModifiedValidators - 1 modified validators 335.90 ms/op 389.95 ms/op 0.86
findModifiedValidators - no difference 330.76 ms/op 414.72 ms/op 0.80
compare ViewDUs 3.8554 s/op 4.3646 s/op 0.88
compare each validator Uint8Array 1.2723 s/op 1.8015 s/op 0.71
compare ViewDU to Uint8Array 840.34 ms/op 1.1058 s/op 0.76
migrate state 1000000 validators, 24 modified, 0 new 665.48 ms/op 793.59 ms/op 0.84
migrate state 1000000 validators, 1700 modified, 1000 new 910.42 ms/op 1.0999 s/op 0.83
migrate state 1000000 validators, 3400 modified, 2000 new 1.1839 s/op 1.3564 s/op 0.87
migrate state 1500000 validators, 24 modified, 0 new 730.19 ms/op 766.56 ms/op 0.95
migrate state 1500000 validators, 1700 modified, 1000 new 929.88 ms/op 1.0926 s/op 0.85
migrate state 1500000 validators, 3400 modified, 2000 new 1.2201 s/op 1.3374 s/op 0.91
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 5.1500 ns/op 4.1600 ns/op 1.24
state getBlockRootAtSlot - 250000 vs - 7PWei 943.18 ns/op 790.40 ns/op 1.19
computeProposers - vc 250000 6.6966 ms/op 9.2034 ms/op 0.73
computeEpochShuffling - vc 250000 67.184 ms/op 104.71 ms/op 0.64
getNextSyncCommittee - vc 250000 105.06 ms/op 158.41 ms/op 0.66
computeSigningRoot for AttestationData 24.411 us/op 26.799 us/op 0.91
hash AttestationData serialized data then Buffer.toString(base64) 1.3345 us/op 2.3085 us/op 0.58
toHexString serialized data 842.79 ns/op 1.0741 us/op 0.78
Buffer.toString(base64) 145.64 ns/op 215.07 ns/op 0.68

by benchmarkbot/action

@twoeths
Contributor Author

twoeths commented Dec 15, 2023

  • On mainnet, state.hashTreeRoot() for computeNewStateRoot() runs mostly < 50ms
    [Screenshot 2023-12-15 at 19 25 12]
  • On holesky, state.hashTreeRoot() for computeNewStateRoot() runs mostly 200ms to 500ms
    [Screenshot 2023-12-15 at 19 26 10]
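
The gap between the two networks is why the histogram buckets matter. As a rough sketch, the function below counts samples per cumulative bucket in the Prometheus style (each bucket counts observations less than or equal to its upper bound). The sample durations and both bucket lists are invented for illustration; they are not the values used in this PR.

```typescript
// Cumulative histogram counts: each bucket counts samples <= its upper bound,
// and "+Inf" counts every sample.
function bucketCounts(samplesSec: number[], upperBounds: number[]): Map<string, number> {
  const bounds = [...upperBounds].sort((a, b) => a - b);
  const counts = new Map<string, number>();
  for (const le of bounds) {
    counts.set(String(le), samplesSec.filter((s) => s <= le).length);
  }
  counts.set("+Inf", samplesSec.length);
  return counts;
}

// Made-up durations in seconds, chosen inside the 200ms to 500ms range reported for holesky.
const holeskyLikeSamples = [0.21, 0.3, 0.45];

// Hypothetical buckets tuned for sub-100ms timings: every sample lands only in "+Inf".
const narrowBuckets = bucketCounts(holeskyLikeSamples, [0.01, 0.05, 0.1]);

// Hypothetical wider buckets: the same samples are now spread across 0.25 and 0.5.
const wideBuckets = bucketCounts(holeskyLikeSamples, [0.05, 0.1, 0.25, 0.5, 1]);

console.log(Object.fromEntries(narrowBuckets), Object.fromEntries(wideBuckets));
```

When every observation falls above the largest finite bound, quantile estimates from the histogram carry almost no information, so the bucket list needs to cover the slowest network that will be monitored.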

@twoeths twoeths marked this pull request as ready for review December 15, 2023 12:27
@twoeths twoeths requested a review from a team as a code owner December 15, 2023 12:27
Member

@wemeetagain wemeetagain left a comment


lgtm

@nflaig nflaig merged commit c044d4c into unstable Dec 15, 2023
15 checks passed
@nflaig nflaig deleted the tuyen/improve_compute_new_state_root branch December 15, 2023 15:14
@wemeetagain
Member

🎉 This PR is included in v1.13.0 🎉

Successfully merging this pull request may close these issues.

computeNewStateRoot performance issue at slot 0 of epoch
4 participants