
Implement rate limit #3454

Merged
merged 11 commits into master from tuyen/rate-limit
Dec 15, 2021

Conversation

twoeths
Contributor

@twoeths twoeths commented Nov 23, 2021

Motivation

  • At times, a lot of peers connect to us and it takes 20% of our CPU time to serve block requests. p2p: stream bytes from db #3435 should mitigate the issue and this is the 2nd part
  • We want to apply a block count rate limit when we receive beacon_blocks_by_range and beacon_blocks_by_root requests, and a request count rate limit for all request types

Description

  • Implement a RateTracker class that checks the rate limit of request count and block count
  • Add 4 params to the network options: per-peer requestCountPeerLimit and blockCountPeerLimit, and total requestCountTotalLimit and blockCountTotalLimit
  • Ban a peer if it violates the per-peer limit
  • By default, we limit each peer to 500 blocks and 50 requests per minute (otherwise the peer is banned), and set the total params to 4x the peer params (see the sketch after the defaults below):
  requestCountTotalLimit: 200,
  requestCountPeerLimit: 50,
  blockCountTotalLimit: 2000,
  blockCountPeerLimit: 500,
  rateTrackerTimeoutMs: 60 * 1000,
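
A minimal sketch of how these four limits could be combined when handling an incoming request. All names here (Tracker, HypotheticalLimiter, allowRequest, banPeer) are illustrative assumptions, not the actual Lodestar API:

```ts
// Illustrative sketch only: the real classes and method names in this PR may differ.
interface Tracker {
  /** Returns the granted object count, or 0 if the limit would be exceeded. */
  requestObjects(count: number): number;
}

interface HypotheticalLimiter {
  requestCountTotalTracker: Tracker;
  blockCountTotalTracker: Tracker;
  requestCountPeerTracker(peerId: string): Tracker;
  blockCountPeerTracker(peerId: string): Tracker;
  banPeer(peerId: string): void;
}

/** Returns true if the request is allowed; bans the peer on a per-peer violation. */
function allowRequest(limiter: HypotheticalLimiter, peerId: string, blockCount: number): boolean {
  // Per-peer request count (50/min by default): violation => ban the peer
  if (limiter.requestCountPeerTracker(peerId).requestObjects(1) === 0) {
    limiter.banPeer(peerId);
    return false;
  }
  // Total request count (200/min by default): violation => reject without banning
  if (limiter.requestCountTotalTracker.requestObjects(1) === 0) return false;

  // Block count limits only apply to beacon_blocks_by_range / beacon_blocks_by_root
  if (blockCount > 0) {
    // Per-peer block count (500/min by default): violation => ban the peer
    if (limiter.blockCountPeerTracker(peerId).requestObjects(blockCount) === 0) {
      limiter.banPeer(peerId);
      return false;
    }
    // Total block count (2000/min by default): violation => reject without banning
    if (limiter.blockCountTotalTracker.requestObjects(blockCount) === 0) return false;
  }
  return true;
}
```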

Note
We will need to apply RateTracker when we issue rpc block requests too, or we'll get banned by other peers; see #3344

Closes #3451

@codecov

codecov bot commented Nov 23, 2021

Codecov Report

Merging #3454 (2d60bf1) into master (a876466) will decrease coverage by 0.26%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #3454      +/-   ##
==========================================
- Coverage   37.90%   37.64%   -0.27%     
==========================================
  Files         308      310       +2     
  Lines        8120     8216      +96     
  Branches     1247     1262      +15     
==========================================
+ Hits         3078     3093      +15     
- Misses       4894     4975      +81     
  Partials      148      148              

@codeclimate

codeclimate bot commented Nov 23, 2021

Code Climate has analyzed commit 2d60bf1 and detected 2 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 2

View more on Code Climate.

@github-actions
Contributor

github-actions bot commented Nov 23, 2021

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 98b058a Previous: 5751ee5 Ratio
BeaconState.hashTreeRoot - No change 527.00 ns/op 533.00 ns/op 0.99
BeaconState.hashTreeRoot - 1 full validator 122.70 us/op 68.475 us/op 1.79
BeaconState.hashTreeRoot - 32 full validator 1.9502 ms/op 1.0671 ms/op 1.83
BeaconState.hashTreeRoot - 512 full validator 26.208 ms/op 15.136 ms/op 1.73
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 124.55 us/op 73.355 us/op 1.70
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 2.1459 ms/op 1.2613 ms/op 1.70
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 28.667 ms/op 16.313 ms/op 1.76
BeaconState.hashTreeRoot - 1 balances 90.935 us/op 51.493 us/op 1.77
BeaconState.hashTreeRoot - 32 balances 727.84 us/op 460.12 us/op 1.58
BeaconState.hashTreeRoot - 512 balances 7.4545 ms/op 4.4917 ms/op 1.66
BeaconState.hashTreeRoot - 250000 balances 136.27 ms/op 76.708 ms/op 1.78
processSlot - 1 slots 61.266 us/op 33.056 us/op 1.85
processSlot - 32 slots 2.8587 ms/op 1.8087 ms/op 1.58
getCommitteeAssignments - req 1 vs - 250000 vc 5.2299 ms/op 4.6761 ms/op 1.12
getCommitteeAssignments - req 100 vs - 250000 vc 7.3065 ms/op 6.5136 ms/op 1.12
getCommitteeAssignments - req 1000 vs - 250000 vc 7.8222 ms/op 6.9462 ms/op 1.13
computeProposers - vc 250000 20.380 ms/op 18.905 ms/op 1.08
computeEpochShuffling - vc 250000 184.69 ms/op 171.43 ms/op 1.08
getNextSyncCommittee - vc 250000 330.11 ms/op 309.48 ms/op 1.07
altair processAttestation - 250000 vs - 7PWei normalcase 45.678 ms/op 34.854 ms/op 1.31
altair processAttestation - 250000 vs - 7PWei worstcase 48.783 ms/op 42.984 ms/op 1.13
altair processAttestation - setStatus - 1/6 committees join 11.704 ms/op 9.9626 ms/op 1.17
altair processAttestation - setStatus - 1/3 committees join 26.997 ms/op 19.913 ms/op 1.36
altair processAttestation - setStatus - 1/2 committees join 38.462 ms/op 30.944 ms/op 1.24
altair processAttestation - setStatus - 2/3 committees join 55.313 ms/op 41.950 ms/op 1.32
altair processAttestation - setStatus - 4/5 committees join 57.893 ms/op 49.279 ms/op 1.17
altair processAttestation - setStatus - 100% committees join 74.125 ms/op 62.616 ms/op 1.18
altair processAttestation - updateEpochParticipants - 1/6 committees join 14.524 ms/op 10.545 ms/op 1.38
altair processAttestation - updateEpochParticipants - 1/3 committees join 29.701 ms/op 22.852 ms/op 1.30
altair processAttestation - updateEpochParticipants - 1/2 committees join 32.064 ms/op 24.645 ms/op 1.30
altair processAttestation - updateEpochParticipants - 2/3 committees join 29.537 ms/op 26.546 ms/op 1.11
altair processAttestation - updateEpochParticipants - 4/5 committees join 30.843 ms/op 31.159 ms/op 0.99
altair processAttestation - updateEpochParticipants - 100% committees join 32.661 ms/op 28.763 ms/op 1.14
altair processAttestation - updateAllStatus 23.072 ms/op 20.419 ms/op 1.13
altair processBlock - 250000 vs - 7PWei normalcase 53.589 ms/op 42.359 ms/op 1.27
altair processBlock - 250000 vs - 7PWei worstcase 118.16 ms/op 107.29 ms/op 1.10
altair processEpoch - mainnet_e81889 1.0689 s/op 879.44 ms/op 1.22
mainnet_e81889 - altair beforeProcessEpoch 362.83 ms/op 318.68 ms/op 1.14
mainnet_e81889 - altair processJustificationAndFinalization 91.003 us/op 56.487 us/op 1.61
mainnet_e81889 - altair processInactivityUpdates 18.390 ms/op 18.033 ms/op 1.02
mainnet_e81889 - altair processRewardsAndPenalties 147.26 ms/op 113.67 ms/op 1.30
mainnet_e81889 - altair processRegistryUpdates 6.9780 us/op 5.6680 us/op 1.23
mainnet_e81889 - altair processSlashings 1.8230 us/op 1.1690 us/op 1.56
mainnet_e81889 - altair processEth1DataReset 1.2680 us/op 1.0480 us/op 1.21
mainnet_e81889 - altair processEffectiveBalanceUpdates 10.985 ms/op 12.548 ms/op 0.88
mainnet_e81889 - altair processSlashingsReset 11.139 us/op 8.4260 us/op 1.32
mainnet_e81889 - altair processRandaoMixesReset 13.577 us/op 11.416 us/op 1.19
mainnet_e81889 - altair processHistoricalRootsUpdate 2.5690 us/op 1.3800 us/op 1.86
mainnet_e81889 - altair processParticipationFlagUpdates 110.19 ms/op 96.289 ms/op 1.14
mainnet_e81889 - altair processSyncCommitteeUpdates 1.3630 us/op 1.0180 us/op 1.34
mainnet_e81889 - altair afterProcessEpoch 226.00 ms/op 220.20 ms/op 1.03
altair processInactivityUpdates - 250000 normalcase 74.613 ms/op 70.361 ms/op 1.06
altair processInactivityUpdates - 250000 worstcase 97.301 ms/op 70.144 ms/op 1.39
altair processParticipationFlagUpdates - 250000 anycase 106.98 ms/op 107.27 ms/op 1.00
altair processRewardsAndPenalties - 250000 normalcase 150.77 ms/op 110.10 ms/op 1.37
altair processRewardsAndPenalties - 250000 worstcase 129.02 ms/op 110.27 ms/op 1.17
altair processSyncCommitteeUpdates - 250000 351.88 ms/op 361.24 ms/op 0.97
Tree 40 250000 create 718.37 ms/op 599.14 ms/op 1.20
Tree 40 250000 get(125000) 329.48 ns/op 332.72 ns/op 0.99
Tree 40 250000 set(125000) 2.2661 us/op 1.7462 us/op 1.30
Tree 40 250000 toArray() 39.379 ms/op 41.217 ms/op 0.96
Tree 40 250000 iterate all - toArray() + loop 38.906 ms/op 43.488 ms/op 0.89
Tree 40 250000 iterate all - get(i) 122.18 ms/op 118.81 ms/op 1.03
MutableVector 250000 create 25.003 ms/op 21.628 ms/op 1.16
MutableVector 250000 get(125000) 14.298 ns/op 14.214 ns/op 1.01
MutableVector 250000 set(125000) 517.79 ns/op 576.38 ns/op 0.90
MutableVector 250000 toArray() 8.4401 ms/op 8.8084 ms/op 0.96
MutableVector 250000 iterate all - toArray() + loop 8.6605 ms/op 8.9245 ms/op 0.97
MutableVector 250000 iterate all - get(i) 3.1867 ms/op 3.4432 ms/op 0.93
Array 250000 create 5.1292 ms/op 5.8212 ms/op 0.88
Array 250000 clone - spread 2.2294 ms/op 2.3570 ms/op 0.95
Array 250000 get(125000) 1.0360 ns/op 1.1210 ns/op 0.92
Array 250000 set(125000) 1.0340 ns/op 1.1010 ns/op 0.94
Array 250000 iterate all - loop 169.00 us/op 167.98 us/op 1.01
aggregationBits - 2048 els - readonlyValues 239.11 us/op 253.48 us/op 0.94
aggregationBits - 2048 els - zipIndexesInBitList 47.301 us/op 49.357 us/op 0.96
regular array get 100000 times 67.396 us/op 67.465 us/op 1.00
wrappedArray get 100000 times 67.882 us/op 67.458 us/op 1.01
arrayWithProxy get 100000 times 29.050 ms/op 30.602 ms/op 0.95
ssz.Root.equals 1.0820 us/op 1.1590 us/op 0.93
ssz.Root.equals with valueOf() 1.5030 us/op 1.4600 us/op 1.03
byteArrayEquals with valueOf() 1.4850 us/op 1.4230 us/op 1.04
phase0 processBlock - 250000 vs - 7PWei normalcase 10.628 ms/op 10.629 ms/op 1.00
phase0 processBlock - 250000 vs - 7PWei worstcase 77.285 ms/op 72.729 ms/op 1.06
phase0 afterProcessEpoch - 250000 vs - 7PWei 208.59 ms/op 202.83 ms/op 1.03
phase0 beforeProcessEpoch - 250000 vs - 7PWei 593.56 ms/op 544.33 ms/op 1.09
phase0 processEpoch - mainnet_e58758 790.23 ms/op 759.27 ms/op 1.04
mainnet_e58758 - phase0 beforeProcessEpoch 502.44 ms/op 465.92 ms/op 1.08
mainnet_e58758 - phase0 processJustificationAndFinalization 50.294 us/op 45.997 us/op 1.09
mainnet_e58758 - phase0 processRewardsAndPenalties 102.00 ms/op 78.581 ms/op 1.30
mainnet_e58758 - phase0 processRegistryUpdates 37.467 us/op 32.901 us/op 1.14
mainnet_e58758 - phase0 processSlashings 1.6420 us/op 1.2140 us/op 1.35
mainnet_e58758 - phase0 processEth1DataReset 1.4550 us/op 1.0370 us/op 1.40
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 9.2395 ms/op 10.601 ms/op 0.87
mainnet_e58758 - phase0 processSlashingsReset 8.7310 us/op 6.8910 us/op 1.27
mainnet_e58758 - phase0 processRandaoMixesReset 12.714 us/op 10.989 us/op 1.16
mainnet_e58758 - phase0 processHistoricalRootsUpdate 1.7360 us/op 1.2870 us/op 1.35
mainnet_e58758 - phase0 processParticipationRecordUpdates 9.1720 us/op 7.8920 us/op 1.16
mainnet_e58758 - phase0 afterProcessEpoch 205.56 ms/op 205.12 ms/op 1.00
phase0 processEffectiveBalanceUpdates - 250000 normalcase 11.145 ms/op 11.684 ms/op 0.95
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.1543 s/op 1.1731 s/op 0.98
phase0 processRegistryUpdates - 250000 normalcase 35.659 us/op 32.504 us/op 1.10
phase0 processRegistryUpdates - 250000 badcase_full_deposits 2.9408 ms/op 2.5899 ms/op 1.14
phase0 processRegistryUpdates - 250000 worstcase 0.5 1.8121 s/op 1.5204 s/op 1.19
phase0 getAttestationDeltas - 250000 normalcase 33.914 ms/op 36.379 ms/op 0.93
phase0 getAttestationDeltas - 250000 worstcase 33.912 ms/op 36.495 ms/op 0.93
phase0 processSlashings - 250000 worstcase 40.843 ms/op 32.214 ms/op 1.27
shuffle list - 16384 els 12.569 ms/op 12.498 ms/op 1.01
shuffle list - 250000 els 181.32 ms/op 180.68 ms/op 1.00
getEffectiveBalances - 250000 vs - 7PWei 9.5555 ms/op 13.643 ms/op 0.70
computeDeltas 3.8524 ms/op 3.5244 ms/op 1.09
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 2.0443 ms/op 2.4625 ms/op 0.83
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 661.41 us/op 683.06 us/op 0.97
BLS verify - blst-native 1.8564 ms/op 1.8599 ms/op 1.00
BLS verifyMultipleSignatures 3 - blst-native 3.8167 ms/op 3.8176 ms/op 1.00
BLS verifyMultipleSignatures 8 - blst-native 8.2268 ms/op 8.2366 ms/op 1.00
BLS verifyMultipleSignatures 32 - blst-native 29.854 ms/op 29.892 ms/op 1.00
BLS aggregatePubkeys 32 - blst-native 39.548 us/op 39.558 us/op 1.00
BLS aggregatePubkeys 128 - blst-native 153.79 us/op 153.89 us/op 1.00
getAttestationsForBlock 80.273 ms/op 84.263 ms/op 0.95
CheckpointStateCache - add get delete 14.919 us/op 16.369 us/op 0.91
validate gossip signedAggregateAndProof - struct 4.4443 ms/op 4.4655 ms/op 1.00
validate gossip signedAggregateAndProof - treeBacked 4.4172 ms/op 4.4470 ms/op 0.99
validate gossip attestation - struct 2.0952 ms/op 2.1063 ms/op 0.99
validate gossip attestation - treeBacked 2.1170 ms/op 2.1297 ms/op 0.99
Object access 1 prop 0.31800 ns/op 0.36800 ns/op 0.86
Map access 1 prop 0.28800 ns/op 0.28600 ns/op 1.01
Object get x1000 17.839 ns/op 17.982 ns/op 0.99
Map get x1000 0.97600 ns/op 1.0300 ns/op 0.95
Object set x1000 105.58 ns/op 120.81 ns/op 0.87
Map set x1000 65.632 ns/op 73.473 ns/op 0.89
Return object 10000 times 0.37250 ns/op 0.37300 ns/op 1.00
Throw Error 10000 times 5.9995 us/op 5.8286 us/op 1.03
RateTracker 1000000 limit, 1 obj count per request 175.99 ns/op
RateTracker 1000000 limit, 2 obj count per request 132.09 ns/op
RateTracker 1000000 limit, 4 obj count per request 118.77 ns/op
RateTracker 1000000 limit, 8 obj count per request 110.15 ns/op
RateTracker with prune 5.2810 us/op

by benchmarkbot/action

@twoeths twoeths marked this pull request as ready for review November 23, 2021 07:59
@ChainSafe ChainSafe deleted a comment from shanghan345 Nov 23, 2021
@twoeths twoeths changed the title Implement rate limit when handling block requests Implement rate limit Nov 24, 2021
@twoeths twoeths requested a review from dapplion December 6, 2021 11:44
@dapplion
Contributor

dapplion commented Dec 6, 2021

I'm concerned about the memory and performance cost of this implementation.

  • Could you do a realistic analysis of memory + performance cost? Ideally both theoretical (with ~25 peers) and measured in benchmarks
  • What is the expected memory cost of all data structures as a function of peer count?
  • Do all data structures tied to peers get cleaned up properly?

@twoeths
Contributor Author

twoeths commented Dec 7, 2021

Update with some analysis and benchmark tests:

Memory

  • The rate limiter can keep up to 15MB (this includes 25 + 25 per-peer rate trackers and 2 total rate trackers, each full of requests (60)) as shown in the test
  • We prune the rate limiter when a peer is disconnected so memory will not be higher than that
  • I left the machine for 3h, took a snapshot, and it takes only 125kb

[Screenshot: heap snapshot after 3 hours]

  • We don't allocate resources per request so there is no need to clean up

Performance

  • A rateTracker.requestObjects() call takes from 80ns/op to 160ns/op if it does not have to prune. If it has to prune all old requests, it takes around 3.7 us/op, but this rarely happens since the next call does not have to prune as much
  • A rateLimiter.allowRequest() call includes 2x or 4x rateTracker.requestObjects() calls (see the sketch below)
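
For reference, a minimal sliding-window tracker sketch that illustrates why the prune cost is amortized; this is an assumption-level illustration, not the exact RateTracker implementation in this PR:

```ts
// Sketch of a per-second-bucket sliding window; names and structure are illustrative.
class SlidingWindowTracker {
  private secondsMap = new Map<number, number>(); // second bucket -> object count

  constructor(private limit: number, private timeoutMs: number) {}

  /** Returns the granted object count, or 0 if the limit would be exceeded. */
  requestObjects(objectCount: number, nowMs = Date.now()): number {
    this.prune(nowMs);
    let used = 0;
    for (const count of this.secondsMap.values()) used += count;
    if (used + objectCount > this.limit) return 0;
    const bucket = Math.floor(nowMs / 1000);
    this.secondsMap.set(bucket, (this.secondsMap.get(bucket) ?? 0) + objectCount);
    return objectCount;
  }

  /** Drop buckets older than the window; after a big prune, later calls have little left to prune. */
  private prune(nowMs: number): void {
    const oldestAllowedSec = Math.floor((nowMs - this.timeoutMs) / 1000);
    for (const bucket of this.secondsMap.keys()) {
      if (bucket >= oldestAllowedSec) break; // Map keeps insertion order, so remaining buckets are newer
      this.secondsMap.delete(bucket);
    }
  }
}
```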

@dapplion please let me know if you have any concerns, or any other tests that I can add

@twoeths
Contributor Author

twoeths commented Dec 8, 2021

@dapplion it turns out that some requests come to the rate limiter after the libp2p disconnect event, so I have to delay the prune by 5s using a pruneQueue in order to cleanly remove unused rate trackers, see 5962dc6

@dapplion
Contributor

dapplion commented Dec 8, 2021

@dapplion it turns out that some requests come to the rate limiter after the libp2p disconnect event, so I have to delay the prune by 5s using a pruneQueue in order to cleanly remove unused rate trackers, see 5962dc6

This solution looks dangerous, assuming 5 seconds isn't guaranteed to be safe. What if the node is very overloaded and promises get stuck in the event loop for a long time?

What do you think about implementing the solution I proposed offline:

  • Track the latest seen message timestamp per tracker
  • Once every some interval, e.g. 1 min or 5 min, loop over all trackers and clear trackers where now - lastRcvMsg > timeWindow? (see the sketch below)
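
A rough sketch of that periodic cleanup, assuming each peer's trackers record a last-seen timestamp; the map name, interval, and idle timeout below are illustrative assumptions, not the PR's actual code:

```ts
// Sketch under assumed names: lastSeenMsByPeer and the interval values are illustrative.
const checkInactiveTrackersEveryMs = 10 * 60 * 1000; // e.g. check every 10 minutes
const inactiveTimeoutMs = 5 * 60 * 1000; // clear trackers idle for 5 minutes

function startTrackerCleanup(lastSeenMsByPeer: Map<string, number>, signal: AbortSignal): void {
  const interval = setInterval(() => {
    const now = Date.now();
    for (const [peerId, lastSeenMs] of lastSeenMsByPeer) {
      // No request from this peer within the idle window => drop its rate trackers
      if (now - lastSeenMs > inactiveTimeoutMs) {
        lastSeenMsByPeer.delete(peerId);
      }
    }
  }, checkInactiveTrackersEveryMs);
  signal.addEventListener("abort", () => clearInterval(interval));
}
```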

@twoeths
Contributor Author

twoeths commented Dec 9, 2021

@dapplion so every 10 minutes, I remove a peer's tracker if there has been no request from that peer in the last 5 minutes; do the numbers make sense to you? See 2d60bf1. That's for a minority of peers; most peer trackers will be removed by the libp2p disconnect event

Member

@wemeetagain wemeetagain left a comment


This LGTM but I've got a question before merging.

If we only have an inbound rate limiter, don't we run the risk of disconnecting from ourselves on the network?
Should we also have the outbound rate limiter implemented before merging this?

@twoeths
Contributor Author

twoeths commented Dec 14, 2021

If we only have an inbound rate limiter, don't we run the risk of disconnecting from ourselves on the network?
Should we also have the outbound rate limiter implemented before merging this?

@wemeetagain given the threshold of 500 blocks per minute, I think the issue may only happen on a very small devnet; it's safe for prater/pyrmont/mainnet. Even if we have the issue, we can always tweak it through cli params, and we also have #3344. What are your thoughts @dapplion?

@dapplion
Contributor

@wemeetagain given the threshold of 500 blocks per minute, I think the issue may only happen on a very small devnet; it's safe for prater/pyrmont/mainnet. Even if we have the issue, we can always tweak it through cli params, and we also have #3344. What are your thoughts @dapplion?

I think it's good, we can fix it later if we see issues

@dapplion dapplion merged commit 20409fe into master Dec 15, 2021
@dapplion dapplion deleted the tuyen/rate-limit branch December 15, 2021 18:37