Add garbage collection to Forest #2292

Closed
9 tasks
LesnyRumcajs opened this issue Nov 29, 2022 · 3 comments · Fixed by #2638
Labels: Node; Priority: 2 - High; Ready

Comments

@LesnyRumcajs (Member)

Issue summary

On a long-running Forest instance, the Forest DB grows by around 30 GB per day. This forces every node operator to implement their own shrinking mechanism, the simplest being:

  • export a snapshot and turn off the node (or turn off the node and download a snapshot from a trusted source, which may be faster),
  • import the new snapshot,
  • repeat when available disk space gets low.

We can do something better on our own (though following roughly the same logic). The rough idea is to mark the entries that a snapshot export would include and then delete everything else from the database. This should theoretically put us back to the just-after-import DB size.
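The mark-then-delete idea can be sketched as follows. This is a minimal illustration in Python, not Forest's implementation: the database is modeled as a dict from key to a `(payload, links)` tuple, and `roots` stands for whatever a snapshot export would start from. All names and data here are hypothetical.

```python
from collections import deque

def mark(db, roots):
    """Return the set of keys reachable from `roots` (the 'exportable' set)."""
    marked = set()
    queue = deque(roots)
    while queue:
        key = queue.popleft()
        if key in marked or key not in db:
            continue  # already visited, or a dangling link
        marked.add(key)
        _payload, links = db[key]
        queue.extend(links)
    return marked

def sweep(db, marked):
    """Delete every unmarked key in place; return how many were removed."""
    garbage = [key for key in db if key not in marked]
    for key in garbage:
        del db[key]
    return len(garbage)

# Toy database: key -> (payload, keys it links to). Hypothetical data.
db = {
    "head": (b"h", ["state", "msgs"]),
    "state": (b"s", []),
    "msgs": (b"m", []),
    "old_state": (b"o", ["old_leaf"]),
    "old_leaf": (b"l", []),
}
removed = sweep(db, mark(db, roots=["head"]))
```

After the sweep, only the entries reachable from `head` remain, which mirrors the state of a database that has just imported a snapshot.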

Task summary

  • Implement garbage collection for 1 DB backend of choice
  • Test exhaustively
  • Check for corner cases (e.g. SIGKILL during GC pause because why not) - if overly complicated, we might create a separate issue.
  • Implement for other DB backends

Acceptance Criteria

  • GC pause is minimal,
  • disk space overhead is minimal,
  • works on calibnet,
  • works on mainnet on a reasonable machine (e.g., the default Digital Ocean VPS we are using with 320GB SSD disk),
  • works on all supported DB backends.
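To put the disk-size criterion in perspective, here is a back-of-the-envelope estimate of how long a node can run before GC has to kick in, assuming linear growth. The 320 GB disk and roughly 30 GB per day come from this issue; the post-import database size is a made-up placeholder, and the function name is hypothetical.

```python
def days_until_full(disk_gb, db_after_import_gb, growth_gb_per_day, reserve_gb=0.0):
    """Days of runtime before the DB outgrows the disk, assuming linear growth."""
    free_gb = disk_gb - db_after_import_gb - reserve_gb
    if free_gb <= 0:
        return 0.0
    return free_gb / growth_gb_per_day

# 100 GB is a placeholder for the just-after-import size, not a measured value.
headroom = days_until_full(disk_gb=320, db_after_import_gb=100, growth_gb_per_day=30)

# Upper bound: even an empty starting database fills the disk in under 11 days.
upper_bound = days_until_full(disk_gb=320, db_after_import_gb=0, growth_gb_per_day=30)
```

Any temporary space that GC itself needs can be passed as `reserve_gb`, which shortens the interval further.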

Other information and links

Not exactly the way we decided to move forward at the moment, but worth mentioning: #1708

LesnyRumcajs added the "Status: Needs Triage" label Nov 29, 2022
LesnyRumcajs self-assigned this Nov 29, 2022
LesnyRumcajs added the "Node" label and removed the "Status: Needs Triage" label Nov 29, 2022
lemmih added the "Priority: 2 - High" and "Ready" labels Feb 28, 2023
@hanabi1224 (Contributor)

After thinking about this, here is my phase 1 test plan. @lemmih, let me know your feedback and suggestions.

  • Run Forest until it catches up to the network head.
  • Kill Forest and back up the DB to a DO volume as a snapshot.
  • Copy the DB snapshot, test each approach with a standalone executable, and collect metrics:
    a. export a snapshot and re-import it
    b. mark-sweep GC (delete unreachable entries in place)
    c. semi-space GC (copy reachable entries to a new DB and delete the old one)
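For comparison with the in-place approach in (b), option (c) can be sketched like this. It is a toy Python model under the same kind of assumptions, not a proposal for the actual code: each store is a dict from key to `(payload, links)`, `roots` is a hypothetical starting set, and `peak_entries` is an illustrative stand-in for the disk space overhead metric.

```python
def semi_space_gc(old_db, roots):
    """Copy entries reachable from `roots` into a fresh store.

    Returns (new_db, peak_entries), where peak_entries counts the entries
    held at the moment both stores coexist.
    """
    new_db = {}
    stack = list(roots)
    while stack:
        key = stack.pop()
        if key in new_db or key not in old_db:
            continue  # already copied, or a dangling link
        payload, links = old_db[key]
        new_db[key] = (payload, links)
        stack.extend(links)
    peak_entries = len(old_db) + len(new_db)
    return new_db, peak_entries

# Toy database: key -> (payload, keys it links to). Hypothetical data.
old_db = {
    "head": (b"h", ["state", "msgs"]),
    "state": (b"s", []),
    "msgs": (b"m", []),
    "old_state": (b"o", ["old_leaf"]),
    "old_leaf": (b"l", []),
}
new_db, peak = semi_space_gc(old_db, roots=["head"])
old_db.clear()  # stands in for deleting the old database
```

The trade-off this illustrates: semi-space temporarily needs room for the live set on top of the old database, while mark-sweep needs no second copy. On the other hand, the old store stays untouched until the copy has finished, which may make an interrupted run easier to recover from.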

@lemmih (Contributor) commented Feb 28, 2023

Sounds okay. Are you unable to connect to the mainnet from your local machine?

@hanabi1224 (Contributor)

> Are you unable to connect to the mainnet from your local machine?

I'm able to, but downloading the snapshot is much slower. I will try locally first; if there isn't enough disk space, I might need a DO droplet.
