Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix aggregated attestation pool grows large in size #4932

Merged
merged 19 commits into from Feb 24, 2020
Merged

Conversation

terencechain
Copy link
Member

@terencechain terencechain commented Feb 23, 2020

This fixes #4929

Please review and merge #4930 first

Attestation pool were growing large in size because we reset an item's expiration time upon new update. Say a new attestation with dataA and bits 0x01 comes in, the expires time of the attestation is set to time.Now + N.

One slot later another attestation with dataA but different bits came in. Here we should not touch the expire time, but instead the node reset the expire time to time.Now + N again. That's why some attestations would never expire because we keep on re aggregating itself here:
https://github.com/prysmaticlabs/prysm/compare/fix-exp-time?expand=1#diff-45a8f7c8f90ae232acabd3f59fdea76cL27

Change list:

  • Update attestation with correct expire time
  • Delete attestation with correct expire time
  • Don't aggregate itself since aggregated attestation is already in an aggregated form
  • Update aggregation from 3 times per slot to 2 times per slot. 3 was too much with a lot of wasted computations
  • Tests

Live testing update:
[2020-02-23 14:05:36 PST]: Ran it for 10+ epochs, the attestations pool count is steadily below 500 versus 2000.. before

@terencechain terencechain self-assigned this Feb 23, 2020
@terencechain terencechain added Bug Something isn't working Priority: High High priority item labels Feb 23, 2020
@codecov
Copy link

codecov bot commented Feb 23, 2020

Codecov Report

Merging #4932 into master will increase coverage by 0.02%.
The diff coverage is 75%.

@@            Coverage Diff             @@
##           master    #4932      +/-   ##
==========================================
+ Coverage   45.45%   45.48%   +0.02%     
==========================================
  Files         209      209              
  Lines       15960    15963       +3     
==========================================
+ Hits         7254     7260       +6     
+ Misses       7469     7466       -3     
  Partials     1237     1237

@terencechain terencechain added the Ready For Review A pull request ready for code review label Feb 23, 2020
@terencechain
Copy link
Member Author

Don't merge yet, I'm testing proposer is passing

@terencechain terencechain added In Progress and removed Ready For Review A pull request ready for code review labels Feb 23, 2020
@terencechain terencechain added Ready For Review A pull request ready for code review and removed In Progress labels Feb 23, 2020
@@ -24,8 +24,7 @@ func (s *Service) aggregateRoutine() {
case <-s.ctx.Done():
return
case <-ticker.C:
attsToBeAggregated := append(s.pool.UnaggregatedAttestations(), s.pool.AggregatedAttestations()...)
if err := s.aggregateAttestations(ctx, attsToBeAggregated); err != nil {
if err := s.aggregateAttestations(ctx, s.pool.UnaggregatedAttestations()); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't we use the aggregated ones too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed this a problem. Reverted

@rauljordan rauljordan merged commit 15b649d into master Feb 24, 2020
@delete-merged-branch delete-merged-branch bot deleted the fix-exp-time branch February 24, 2020 16:18
cryptomental pushed a commit to cryptomental/prysm that referenced this pull request Feb 24, 2020
* Add metrics

* Use it

* Use it

* Fixed exp time and tests

* Update on save too

* Expose getters

* One epoch purge time

* Fixed a timing issue

* Clean up

* Gazelle

* Interface

* Prune every epoch

* Aggregate twice per slot

* Revert attsToBeAggregated

* Delete expired atts

Co-authored-by: prylabs-bulldozer[bot] <58059840+prylabs-bulldozer[bot]@users.noreply.github.com>
cryptomental pushed a commit to cryptomental/prysm that referenced this pull request Feb 28, 2020
* Add metrics

* Use it

* Use it

* Fixed exp time and tests

* Update on save too

* Expose getters

* One epoch purge time

* Fixed a timing issue

* Clean up

* Gazelle

* Interface

* Prune every epoch

* Aggregate twice per slot

* Revert attsToBeAggregated

* Delete expired atts

Co-authored-by: prylabs-bulldozer[bot] <58059840+prylabs-bulldozer[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working Priority: High High priority item Ready For Review A pull request ready for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Attestation pool grows to large size
3 participants