Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes bug with excessive rate limiting #5853

Merged
merged 1 commit into from May 14, 2020

Conversation

farazdagi
Copy link
Contributor

@farazdagi farazdagi commented May 14, 2020

What type of PR is this?

Bug fix

What does this PR do? Why is it needed?

  • Fixes issue when rate limiting was too aggressive, depleting capacity excessively
  • Improves speed of test (15s -> 5s), but setting up database for blocks used in tests only (not all blocks in range)

Which issues(s) does this PR fix?

Issue located when working on init-sync optimizations, and fix was found immediately - so no corresponding issue opened.

Other notes for review
We had several issues:

  • Rate limiter's capacity was decreased even before any block streaming occurred - which is unfair, as before blocks are streamed, we can't count them. It is correct to check if limits are exceeded before sending anything, but updating capacity should occur after the streaming.
  • More importantly, we've always decreased capacity by allowedBlocksPerSecond:
    r.blocksRateLimiter.Add(stream.Conn().RemotePeer().String(), int64(allowedBlocksPerSecond))

    This means, that with our current init-sync routine, where requests are interleaved, and it is common to request less than a dozen of blocks for each peer, every request (even for, say, 4 blocks) - will be considered a served request for 64 blocks. So, effectively, sometimes we mark as bad peers, peers which are not bad at all. That's our node sends 9 requests for 4 blocks (and it keeps track to make sure it doesn't get over the limit) in 1 sec - it gets banned (we do not allow x10 increase - x9 is enough to get banned).

This issue shouldn't manifest itself with update (#5798) where I indeed force single peer to fetch close to allowed max blocks in each request. But with current routine - we are excessively banning peers.

@codecov
Copy link

codecov bot commented May 14, 2020

Codecov Report

Merging #5853 into master will decrease coverage by 0.34%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #5853      +/-   ##
==========================================
- Coverage   60.31%   59.96%   -0.35%     
==========================================
  Files         312      312              
  Lines       27158    26190     -968     
==========================================
- Hits        16379    15705     -674     
+ Misses       8605     8349     -256     
+ Partials     2174     2136      -38     

@farazdagi farazdagi added OK to merge Ready For Review A pull request ready for code review labels May 14, 2020
@farazdagi farazdagi marked this pull request as ready for review May 14, 2020 13:37
@farazdagi farazdagi requested a review from a team as a code owner May 14, 2020 13:37
@prylabs-bulldozer prylabs-bulldozer bot merged commit 471779e into master May 14, 2020
@delete-merged-branch delete-merged-branch bot deleted the sync-rate-limiting-fix branch May 14, 2020 13:43
michaelhly pushed a commit to michaelhly/prysm that referenced this pull request May 14, 2020
* fixes bug with excessive rate limiting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ready For Review A pull request ready for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants