@michaelsproul
Member

Issue Addressed

Closes #1866

Proposed Changes

  • Compact the database on finalization. This removes the deleted states from disk completely. Because it happens in the background migrator, it doesn't block other database operations while it runs. On my Medalla node it took about 1 minute and shrank the database from 90GB to 9GB.
  • Fix an inefficiency in the pruning algorithm where it would always use the genesis checkpoint as the old_finalized_checkpoint when running for the first time after start-up. This would result in loading lots of states one at a time all the way back to genesis, and storing a lot of block roots in memory. The new code stores the old finalized checkpoint on disk and only falls back to genesis if no checkpoint is already stored. This makes it both backwards compatible and forwards compatible -- no schema change required!
  • Introduce two new INFO logs to indicate when pruning has started and completed. Users seem to want to know this information without enabling debug logs!
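
The checkpoint fallback described in the second bullet can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `Checkpoint`, `Store`, `old_finalized_checkpoint` and `prune` are hypothetical stand-ins, and the "disk" is just an in-memory field. It only shows why persisting the last pruned checkpoint keeps the first prune after start-up short, and why an older database with no stored checkpoint still works.

```rust
/// Simplified stand-in for a finalized checkpoint (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Checkpoint {
    pub epoch: u64,
    pub root: [u8; 32],
}

/// Toy store standing in for the on-disk database (assumption: the real
/// store persists this value under some key; here it is a plain field).
#[derive(Default)]
pub struct Store {
    pruning_checkpoint: Option<Checkpoint>,
}

impl Store {
    /// Returns the checkpoint saved by the last prune, if any.
    pub fn load_pruning_checkpoint(&self) -> Option<Checkpoint> {
        self.pruning_checkpoint
    }

    /// Records the checkpoint that the latest prune ran up to.
    pub fn store_pruning_checkpoint(&mut self, checkpoint: Checkpoint) {
        self.pruning_checkpoint = Some(checkpoint);
    }
}

/// Picks the starting point for pruning: the stored checkpoint if present,
/// otherwise genesis. A database written before the checkpoint was stored
/// simply has `None` here, so no schema change is needed.
pub fn old_finalized_checkpoint(store: &Store, genesis: Checkpoint) -> Checkpoint {
    store.load_pruning_checkpoint().unwrap_or(genesis)
}

/// Simulates one prune and returns how many epochs had to be walked back.
/// The real pruning work (loading states, collecting block roots) is elided.
pub fn prune(store: &mut Store, genesis: Checkpoint, new_finalized: Checkpoint) -> u64 {
    let old = old_finalized_checkpoint(store, genesis);
    let epochs_walked = new_finalized.epoch.saturating_sub(old.epoch);
    // Persist the new checkpoint so the next prune (even after a restart)
    // starts from here instead of from genesis.
    store.store_pruning_checkpoint(new_finalized);
    epochs_walked
}
```

In this sketch the first prune on a fresh store walks all the way from genesis, while every later prune only covers the distance since the previously stored checkpoint.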

And fix an issue with the first prune after start-up being really slow.
Member

@paulhauner paulhauner left a comment

Nice! This is running well on our Medalla nodes, too :)

@paulhauner added the ready-for-merge label and removed the ready-for-review label on Nov 9, 2020
@michaelsproul
Member Author

Thanks!

bors r+

bors bot pushed a commit that referenced this pull request Nov 9, 2020
@bors

bors bot commented Nov 9, 2020

@bors bors bot changed the title from "Compact database on finalization" to "[Merged by Bors] - Compact database on finalization" on Nov 9, 2020
@bors bors bot closed this Nov 9, 2020
@michaelsproul michaelsproul deleted the compact-db branch November 9, 2020 08:36
Labels

A0 ready-for-merge This PR is ready to merge.

Development

Successfully merging this pull request may close these issues.

v0.3.3 beacon_node db uses large amount of disk space from fresh sync even after Medalla finalized

3 participants