feat: block network processor when processing current slot block #5458

Merged
merged 3 commits into from May 4, 2023

@twoeths twoeths commented May 3, 2023

Motivation

  • We want to process the current slot block as soon as possible
  • Right now we do that by limiting maxGossipTopicConcurrency to 512, which is not really necessary: the "hot" time needed to process a block is only ~600ms, yet the limit applies for the whole slot

Description

  • While processing the current slot block, block the network processor. Most gossip messages arriving at this time are attestations with an unknown block root, so blocking them should not be an issue
  • Remove the default value of maxGossipTopicConcurrency
  • Add a "reason" label to the "lodestar_network_processor_can_not_accept_work_total" metric
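
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class names, the `canAcceptWork()` check, and the two reason strings are all hypothetical. Only `isProcessingCurrentSlotBlock()` and the idea of counting refusals per "reason" come from the description.

```typescript
// Hypothetical reason values; the real label values are not shown in this PR page.
type CannotAcceptWorkReason = "processing_current_slot_block" | "downstream_busy";

// Assumed shape of what the processor needs to know about the chain.
interface ChainStatus {
  isProcessingCurrentSlotBlock(): boolean;
  canAcceptWork(): boolean;
}

// Stand-in for a counter metric with a single "reason" label.
class ReasonCounter {
  private readonly counts = new Map<CannotAcceptWorkReason, number>();

  inc(reason: CannotAcceptWorkReason): void {
    this.counts.set(reason, this.get(reason) + 1);
  }

  get(reason: CannotAcceptWorkReason): number {
    return this.counts.get(reason) ?? 0;
  }
}

class NetworkProcessorSketch<T> {
  readonly cannotAcceptWorkTotal = new ReasonCounter();
  private readonly pending: T[] = [];
  private readonly chain: ChainStatus;
  private readonly handle: (msg: T) => void;

  constructor(chain: ChainStatus, handle: (msg: T) => void) {
    this.chain = chain;
    this.handle = handle;
  }

  // Queue a gossip message and try to make progress.
  onPendingMessage(msg: T): void {
    this.pending.push(msg);
    this.executeWork();
  }

  // Drain the queue until it is empty or the processor is blocked.
  executeWork(): void {
    while (this.pending.length > 0) {
      const reason = this.cannotAcceptWorkReason();
      if (reason !== null) {
        // Record why work was refused, then leave messages queued.
        this.cannotAcceptWorkTotal.inc(reason);
        return;
      }
      const msg = this.pending.shift();
      if (msg !== undefined) this.handle(msg);
    }
  }

  private cannotAcceptWorkReason(): CannotAcceptWorkReason | null {
    if (this.chain.isProcessingCurrentSlotBlock()) return "processing_current_slot_block";
    if (!this.chain.canAcceptWork()) return "downstream_busy";
    return null;
  }
}
```

In this sketch blocked messages stay queued rather than being dropped, so they are picked up by the next `executeWork()` call once the block has been processed.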

part of #5413
Closes #5441

Testing

  • Initial test on test mainnet passed
  • Test for 1 more day with latest version

github-actions bot commented May 3, 2023

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: a1ada38 Previous: 597996a Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 616.34 us/op 750.78 us/op 0.82
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 53.755 us/op 46.975 us/op 1.14
BLS verify - blst-native 1.2416 ms/op 1.2167 ms/op 1.02
BLS verifyMultipleSignatures 3 - blst-native 2.5298 ms/op 2.4830 ms/op 1.02
BLS verifyMultipleSignatures 8 - blst-native 5.4166 ms/op 5.3393 ms/op 1.01
BLS verifyMultipleSignatures 32 - blst-native 20.047 ms/op 19.290 ms/op 1.04
BLS aggregatePubkeys 32 - blst-native 26.587 us/op 25.902 us/op 1.03
BLS aggregatePubkeys 128 - blst-native 102.85 us/op 100.98 us/op 1.02
getAttestationsForBlock 65.119 ms/op 55.814 ms/op 1.17
isKnown best case - 1 super set check 270.00 ns/op 252.00 ns/op 1.07
isKnown normal case - 2 super set checks 260.00 ns/op 247.00 ns/op 1.05
isKnown worse case - 16 super set checks 260.00 ns/op 242.00 ns/op 1.07
CheckpointStateCache - add get delete 5.9390 us/op 5.0350 us/op 1.18
validate gossip signedAggregateAndProof - struct 2.8404 ms/op 2.7678 ms/op 1.03
validate gossip attestation - struct 1.3488 ms/op 1.3223 ms/op 1.02
pickEth1Vote - no votes 1.4107 ms/op 1.3083 ms/op 1.08
pickEth1Vote - max votes 11.522 ms/op 9.3031 ms/op 1.24
pickEth1Vote - Eth1Data hashTreeRoot value x2048 9.6948 ms/op 8.8569 ms/op 1.09
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 16.401 ms/op 13.407 ms/op 1.22
pickEth1Vote - Eth1Data fastSerialize value x2048 779.88 us/op 660.06 us/op 1.18
pickEth1Vote - Eth1Data fastSerialize tree x2048 6.0072 ms/op 7.5434 ms/op 0.80
bytes32 toHexString 701.00 ns/op 517.00 ns/op 1.36
bytes32 Buffer.toString(hex) 431.00 ns/op 370.00 ns/op 1.16
bytes32 Buffer.toString(hex) from Uint8Array 640.00 ns/op 572.00 ns/op 1.12
bytes32 Buffer.toString(hex) + 0x 439.00 ns/op 379.00 ns/op 1.16
Object access 1 prop 0.21500 ns/op 0.17300 ns/op 1.24
Map access 1 prop 0.17700 ns/op 0.15100 ns/op 1.17
Object get x1000 7.0380 ns/op 6.8160 ns/op 1.03
Map get x1000 0.63900 ns/op 0.64900 ns/op 0.98
Object set x1000 70.516 ns/op 55.906 ns/op 1.26
Map set x1000 56.437 ns/op 47.270 ns/op 1.19
Return object 10000 times 0.25170 ns/op 0.24160 ns/op 1.04
Throw Error 10000 times 4.4173 us/op 4.2985 us/op 1.03
fastMsgIdFn sha256 / 200 bytes 3.6210 us/op 3.5720 us/op 1.01
fastMsgIdFn h32 xxhash / 200 bytes 316.00 ns/op 289.00 ns/op 1.09
fastMsgIdFn h64 xxhash / 200 bytes 480.00 ns/op 402.00 ns/op 1.19
fastMsgIdFn sha256 / 1000 bytes 11.932 us/op 11.736 us/op 1.02
fastMsgIdFn h32 xxhash / 1000 bytes 449.00 ns/op 430.00 ns/op 1.04
fastMsgIdFn h64 xxhash / 1000 bytes 557.00 ns/op 480.00 ns/op 1.16
fastMsgIdFn sha256 / 10000 bytes 106.92 us/op 103.76 us/op 1.03
fastMsgIdFn h32 xxhash / 10000 bytes 2.0540 us/op 1.9600 us/op 1.05
fastMsgIdFn h64 xxhash / 10000 bytes 1.4630 us/op 1.3930 us/op 1.05
enrSubnets - fastDeserialize 64 bits 1.8100 us/op 1.3120 us/op 1.38
enrSubnets - ssz BitVector 64 bits 644.00 ns/op 481.00 ns/op 1.34
enrSubnets - fastDeserialize 4 bits 221.00 ns/op 164.00 ns/op 1.35
enrSubnets - ssz BitVector 4 bits 648.00 ns/op 490.00 ns/op 1.32
prioritizePeers score -10:0 att 32-0.1 sync 2-0 126.20 us/op 102.29 us/op 1.23
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 160.96 us/op 133.39 us/op 1.21
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 205.93 us/op 168.49 us/op 1.22
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 388.35 us/op 300.74 us/op 1.29
prioritizePeers score 0:0 att 64-1 sync 4-1 464.56 us/op 365.24 us/op 1.27
array of 16000 items push then shift 1.6915 us/op 1.6644 us/op 1.02
LinkedList of 16000 items push then shift 9.1570 ns/op 8.8940 ns/op 1.03
array of 16000 items push then pop 121.65 ns/op 112.80 ns/op 1.08
LinkedList of 16000 items push then pop 9.0640 ns/op 8.8280 ns/op 1.03
array of 24000 items push then shift 2.4211 us/op 2.3616 us/op 1.03
LinkedList of 24000 items push then shift 9.1670 ns/op 8.8910 ns/op 1.03
array of 24000 items push then pop 81.596 ns/op 77.275 ns/op 1.06
LinkedList of 24000 items push then pop 9.0190 ns/op 8.5870 ns/op 1.05
intersect bitArray bitLen 8 14.435 ns/op 13.262 ns/op 1.09
intersect array and set length 8 117.42 ns/op 85.823 ns/op 1.37
intersect bitArray bitLen 128 46.992 ns/op 44.109 ns/op 1.07
intersect array and set length 128 1.3783 us/op 1.1134 us/op 1.24
Buffer.concat 32 items 3.3730 us/op 2.9200 us/op 1.16
Uint8Array.set 32 items 2.3780 us/op 2.3490 us/op 1.01
pass gossip attestations to forkchoice per slot 3.3769 ms/op 3.0272 ms/op 1.12
computeDeltas 3.3619 ms/op 2.9138 ms/op 1.15
computeProposerBoostScoreFromBalances 1.8786 ms/op 1.8059 ms/op 1.04
altair processAttestation - 250000 vs - 7PWei normalcase 3.2635 ms/op 2.5737 ms/op 1.27
altair processAttestation - 250000 vs - 7PWei worstcase 4.7173 ms/op 4.2304 ms/op 1.12
altair processAttestation - setStatus - 1/6 committees join 145.21 us/op 142.76 us/op 1.02
altair processAttestation - setStatus - 1/3 committees join 288.59 us/op 283.38 us/op 1.02
altair processAttestation - setStatus - 1/2 committees join 380.11 us/op 390.78 us/op 0.97
altair processAttestation - setStatus - 2/3 committees join 474.12 us/op 470.85 us/op 1.01
altair processAttestation - setStatus - 4/5 committees join 667.04 us/op 670.82 us/op 0.99
altair processAttestation - setStatus - 100% committees join 766.47 us/op 774.63 us/op 0.99
altair processBlock - 250000 vs - 7PWei normalcase 17.441 ms/op 16.699 ms/op 1.04
altair processBlock - 250000 vs - 7PWei normalcase hashState 28.977 ms/op 24.877 ms/op 1.16
altair processBlock - 250000 vs - 7PWei worstcase 51.915 ms/op 53.900 ms/op 0.96
altair processBlock - 250000 vs - 7PWei worstcase hashState 69.476 ms/op 71.139 ms/op 0.98
phase0 processBlock - 250000 vs - 7PWei normalcase 2.3486 ms/op 2.0907 ms/op 1.12
phase0 processBlock - 250000 vs - 7PWei worstcase 30.435 ms/op 29.796 ms/op 1.02
altair processEth1Data - 250000 vs - 7PWei normalcase 559.67 us/op 526.61 us/op 1.06
vc - 250000 eb 1 eth1 1 we 0 wn 0 - smpl 15 10.747 us/op 8.0230 us/op 1.34
vc - 250000 eb 0.95 eth1 0.1 we 0.05 wn 0 - smpl 219 38.021 us/op 23.206 us/op 1.64
vc - 250000 eb 0.95 eth1 0.3 we 0.05 wn 0 - smpl 42 14.305 us/op 9.2200 us/op 1.55
vc - 250000 eb 0.95 eth1 0.7 we 0.05 wn 0 - smpl 18 11.431 us/op 7.2290 us/op 1.58
vc - 250000 eb 0.1 eth1 0.1 we 0 wn 0 - smpl 1020 128.99 us/op 97.494 us/op 1.32
vc - 250000 eb 0.03 eth1 0.03 we 0 wn 0 - smpl 11777 660.07 us/op 635.15 us/op 1.04
vc - 250000 eb 0.01 eth1 0.01 we 0 wn 0 - smpl 16384 955.62 us/op 900.58 us/op 1.06
vc - 250000 eb 0 eth1 0 we 0 wn 0 - smpl 16384 910.27 us/op 857.99 us/op 1.06
vc - 250000 eb 0 eth1 0 we 0 wn 0 nocache - smpl 16384 3.1526 ms/op 2.3539 ms/op 1.34
vc - 250000 eb 0 eth1 1 we 0 wn 0 - smpl 16384 1.8213 ms/op 1.5344 ms/op 1.19
vc - 250000 eb 0 eth1 1 we 0 wn 0 nocache - smpl 16384 4.6925 ms/op 3.9581 ms/op 1.19
Tree 40 250000 create 467.36 ms/op 352.72 ms/op 1.33
Tree 40 250000 get(125000) 200.37 ns/op 197.00 ns/op 1.02
Tree 40 250000 set(125000) 1.2333 us/op 1.1437 us/op 1.08
Tree 40 250000 toArray() 22.691 ms/op 23.340 ms/op 0.97
Tree 40 250000 iterate all - toArray() + loop 22.897 ms/op 23.228 ms/op 0.99
Tree 40 250000 iterate all - get(i) 76.791 ms/op 75.735 ms/op 1.01
MutableVector 250000 create 10.477 ms/op 10.713 ms/op 0.98
MutableVector 250000 get(125000) 6.4870 ns/op 6.3230 ns/op 1.03
MutableVector 250000 set(125000) 286.95 ns/op 266.78 ns/op 1.08
MutableVector 250000 toArray() 3.2016 ms/op 3.0673 ms/op 1.04
MutableVector 250000 iterate all - toArray() + loop 3.8064 ms/op 3.3894 ms/op 1.12
MutableVector 250000 iterate all - get(i) 1.5480 ms/op 1.5248 ms/op 1.02
Array 250000 create 3.3729 ms/op 3.3839 ms/op 1.00
Array 250000 clone - spread 1.2522 ms/op 1.2426 ms/op 1.01
Array 250000 get(125000) 0.61700 ns/op 0.58800 ns/op 1.05
Array 250000 set(125000) 0.69500 ns/op 0.68400 ns/op 1.02
Array 250000 iterate all - loop 90.790 us/op 84.330 us/op 1.08
effectiveBalanceIncrements clone Uint8Array 300000 33.562 us/op 49.948 us/op 0.67
effectiveBalanceIncrements clone MutableVector 300000 375.00 ns/op 378.00 ns/op 0.99
effectiveBalanceIncrements rw all Uint8Array 300000 170.73 us/op 166.91 us/op 1.02
effectiveBalanceIncrements rw all MutableVector 300000 87.614 ms/op 89.916 ms/op 0.97
phase0 afterProcessEpoch - 250000 vs - 7PWei 116.05 ms/op 116.39 ms/op 1.00
phase0 beforeProcessEpoch - 250000 vs - 7PWei 42.019 ms/op 44.591 ms/op 0.94
altair processEpoch - mainnet_e81889 332.74 ms/op 312.95 ms/op 1.06
mainnet_e81889 - altair beforeProcessEpoch 70.113 ms/op 65.642 ms/op 1.07
mainnet_e81889 - altair processJustificationAndFinalization 19.047 us/op 25.941 us/op 0.73
mainnet_e81889 - altair processInactivityUpdates 6.7260 ms/op 7.9489 ms/op 0.85
mainnet_e81889 - altair processRewardsAndPenalties 51.782 ms/op 55.587 ms/op 0.93
mainnet_e81889 - altair processRegistryUpdates 3.0900 us/op 2.7440 us/op 1.13
mainnet_e81889 - altair processSlashings 610.00 ns/op 766.00 ns/op 0.80
mainnet_e81889 - altair processEth1DataReset 879.00 ns/op 572.00 ns/op 1.54
mainnet_e81889 - altair processEffectiveBalanceUpdates 1.3367 ms/op 1.2401 ms/op 1.08
mainnet_e81889 - altair processSlashingsReset 8.1200 us/op 4.6070 us/op 1.76
mainnet_e81889 - altair processRandaoMixesReset 6.6360 us/op 4.7390 us/op 1.40
mainnet_e81889 - altair processHistoricalRootsUpdate 802.00 ns/op 621.00 ns/op 1.29
mainnet_e81889 - altair processParticipationFlagUpdates 4.2260 us/op 2.2280 us/op 1.90
mainnet_e81889 - altair processSyncCommitteeUpdates 1.2870 us/op 530.00 ns/op 2.43
mainnet_e81889 - altair afterProcessEpoch 132.99 ms/op 120.94 ms/op 1.10
phase0 processEpoch - mainnet_e58758 375.76 ms/op 325.56 ms/op 1.15
mainnet_e58758 - phase0 beforeProcessEpoch 138.14 ms/op 127.68 ms/op 1.08
mainnet_e58758 - phase0 processJustificationAndFinalization 35.340 us/op 14.497 us/op 2.44
mainnet_e58758 - phase0 processRewardsAndPenalties 55.195 ms/op 41.665 ms/op 1.32
mainnet_e58758 - phase0 processRegistryUpdates 13.992 us/op 11.498 us/op 1.22
mainnet_e58758 - phase0 processSlashings 1.0710 us/op 1.2300 us/op 0.87
mainnet_e58758 - phase0 processEth1DataReset 1.1200 us/op 536.00 ns/op 2.09
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 1.1165 ms/op 996.76 us/op 1.12
mainnet_e58758 - phase0 processSlashingsReset 6.6960 us/op 2.9390 us/op 2.28
mainnet_e58758 - phase0 processRandaoMixesReset 8.3990 us/op 7.8760 us/op 1.07
mainnet_e58758 - phase0 processHistoricalRootsUpdate 2.1640 us/op 1.0110 us/op 2.14
mainnet_e58758 - phase0 processParticipationRecordUpdates 11.109 us/op 4.1310 us/op 2.69
mainnet_e58758 - phase0 afterProcessEpoch 106.78 ms/op 95.077 ms/op 1.12
phase0 processEffectiveBalanceUpdates - 250000 normalcase 2.1703 ms/op 1.1943 ms/op 1.82
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 2.6266 ms/op 1.4844 ms/op 1.77
altair processInactivityUpdates - 250000 normalcase 30.723 ms/op 25.760 ms/op 1.19
altair processInactivityUpdates - 250000 worstcase 30.085 ms/op 27.692 ms/op 1.09
phase0 processRegistryUpdates - 250000 normalcase 15.577 us/op 7.0510 us/op 2.21
phase0 processRegistryUpdates - 250000 badcase_full_deposits 381.52 us/op 266.68 us/op 1.43
phase0 processRegistryUpdates - 250000 worstcase 0.5 163.55 ms/op 117.13 ms/op 1.40
altair processRewardsAndPenalties - 250000 normalcase 79.953 ms/op 66.360 ms/op 1.20
altair processRewardsAndPenalties - 250000 worstcase 78.306 ms/op 69.078 ms/op 1.13
phase0 getAttestationDeltas - 250000 normalcase 10.511 ms/op 6.5313 ms/op 1.61
phase0 getAttestationDeltas - 250000 worstcase 9.9180 ms/op 6.6172 ms/op 1.50
phase0 processSlashings - 250000 worstcase 5.1051 ms/op 3.4381 ms/op 1.48
altair processSyncCommitteeUpdates - 250000 200.77 ms/op 172.78 ms/op 1.16
BeaconState.hashTreeRoot - No change 326.00 ns/op 263.00 ns/op 1.24
BeaconState.hashTreeRoot - 1 full validator 58.491 us/op 53.078 us/op 1.10
BeaconState.hashTreeRoot - 32 full validator 612.93 us/op 493.71 us/op 1.24
BeaconState.hashTreeRoot - 512 full validator 7.3025 ms/op 5.4450 ms/op 1.34
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 70.991 us/op 60.257 us/op 1.18
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 1.0760 ms/op 912.67 us/op 1.18
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 13.521 ms/op 11.279 ms/op 1.20
BeaconState.hashTreeRoot - 1 balances 55.288 us/op 47.088 us/op 1.17
BeaconState.hashTreeRoot - 32 balances 574.45 us/op 429.12 us/op 1.34
BeaconState.hashTreeRoot - 512 balances 5.2461 ms/op 4.3203 ms/op 1.21
BeaconState.hashTreeRoot - 250000 balances 91.819 ms/op 75.880 ms/op 1.21
aggregationBits - 2048 els - zipIndexesInBitList 28.473 us/op 15.738 us/op 1.81
regular array get 100000 times 37.143 us/op 32.820 us/op 1.13
wrappedArray get 100000 times 36.987 us/op 32.802 us/op 1.13
arrayWithProxy get 100000 times 16.939 ms/op 15.819 ms/op 1.07
ssz.Root.equals 809.00 ns/op 590.00 ns/op 1.37
byteArrayEquals 786.00 ns/op 548.00 ns/op 1.43
shuffle list - 16384 els 8.5610 ms/op 6.8752 ms/op 1.25
shuffle list - 250000 els 110.17 ms/op 101.57 ms/op 1.08
processSlot - 1 slots 12.272 us/op 8.8110 us/op 1.39
processSlot - 32 slots 1.5435 ms/op 1.3826 ms/op 1.12
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 38.310 ms/op 36.745 ms/op 1.04
getCommitteeAssignments - req 1 vs - 250000 vc 2.9871 ms/op 2.8726 ms/op 1.04
getCommitteeAssignments - req 100 vs - 250000 vc 4.4753 ms/op 4.0746 ms/op 1.10
getCommitteeAssignments - req 1000 vs - 250000 vc 4.7685 ms/op 4.4259 ms/op 1.08
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 5.3100 ns/op 4.6600 ns/op 1.14
state getBlockRootAtSlot - 250000 vs - 7PWei 818.54 ns/op 929.73 ns/op 0.88
computeProposers - vc 250000 11.628 ms/op 10.346 ms/op 1.12
computeEpochShuffling - vc 250000 111.63 ms/op 102.15 ms/op 1.09
getNextSyncCommittee - vc 250000 189.73 ms/op 169.54 ms/op 1.12
computeSigningRoot for AttestationData 15.206 us/op 12.987 us/op 1.17
hash AttestationData serialized data then Buffer.toString(base64) 2.5716 us/op 2.4661 us/op 1.04
toHexString serialized data 1.4908 us/op 1.0821 us/op 1.38
Buffer.toString(base64) 371.15 ns/op 313.47 ns/op 1.18

by benchmarkbot/action

@twoeths twoeths changed the title from "Block network processor when processing current slot block" to "feat: block network processor when processing current slot block" May 3, 2023
twoeths commented May 4, 2023

metrics look good compared to beta:

  • feat2: shorter block processing time

Screenshot 2023-05-04 at 08 47 06

  • vs beta

Screenshot 2023-05-04 at 08 46 05

  • Messages blocked per slot while the current slot block is being processed

Screenshot 2023-05-04 at 08 48 06

  • Almost no dropped attestations (vs 30% - 40% in beta)

Screenshot 2023-05-04 at 08 49 02

@twoeths twoeths marked this pull request as ready for review May 4, 2023 01:49
@twoeths twoeths requested a review from a team as a code owner May 4, 2023 01:49
@@ -284,6 +285,10 @@ export class BeaconChain implements IBeaconChain {
    await this.bls.close();
  }

  isProcessingCurrentSlotBlock(): boolean {
    return this._isProcessingCurrentSlotBlock;
  }
Contributor

Should this logic and _isProcessingCurrentSlotBlock live in the network processor class? It already knows the block's slot and the current clock.

Contributor Author

Yes, I can move the logic there. It means we block network processor messages as soon as we receive a gossip block message, rather than at the time we process the block.
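
A rough sketch of what keeping this state in the network processor could look like. Everything here is hypothetical (the `Clock` interface, the `CurrentSlotBlockGate` class and its method names); it only illustrates the idea from this thread that the processor can compare a gossip block's slot against the current clock slot and start blocking as soon as the block is received. The extra rule that the gate reopens when the clock moves to the next slot is an assumption added for safety, not something stated in the PR.

```typescript
// Assumed minimal view of the clock: only the current slot is needed.
interface Clock {
  readonly currentSlot: number;
}

class CurrentSlotBlockGate {
  // Slot of the current-slot block being processed, or null when idle.
  private blockedSlot: number | null = null;
  private readonly clock: Clock;

  constructor(clock: Clock) {
    this.clock = clock;
  }

  // Called when a gossip block message is received, before it is processed.
  onGossipBlock(blockSlot: number): void {
    if (blockSlot === this.clock.currentSlot) {
      this.blockedSlot = blockSlot;
    }
  }

  // Called once the block for the given slot has finished processing (or failed).
  onBlockProcessed(blockSlot: number): void {
    if (this.blockedSlot === blockSlot) {
      this.blockedSlot = null;
    }
  }

  // Other gossip work is blocked only while a block for the *current* slot is in flight.
  isBlocked(): boolean {
    return this.blockedSlot !== null && this.blockedSlot === this.clock.currentSlot;
  }
}
```

Compared with a flag set inside the chain at processing time, this variant starts blocking slightly earlier, at gossip receipt, which matches the trade-off described in the reply above.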

Member

@wemeetagain wemeetagain left a comment

lgtm

@wemeetagain wemeetagain merged commit a24ada9 into unstable May 4, 2023
11 checks passed
@wemeetagain wemeetagain deleted the tuyen/block_network_processor branch May 4, 2023 13:32
@wemeetagain
Member

🎉 This PR is included in v1.9.0 🎉

Successfully merging this pull request may close these issues.

Performance issue due to unlimited maxGossipTopicConcurrency
3 participants