Reduce Zebra disk usage for mining pools #5718

Open
Tracked by #7366
teor2345 opened this issue Nov 25, 2022 · 0 comments
Labels
A-rpc Area: Remote Procedure Call interfaces A-state Area: State / database changes C-bug Category: This is a bug I-heavy Problems with excessive memory, disk, or CPU usage S-needs-triage Status: A bug report needs triage

Comments

@teor2345
Contributor

Motivation

Some mining pools have asked us to reduce Zebra's disk usage.

Alternative Designs

Here are some different things we could try, in rough order of effort/disruptiveness:

  1. Stop storing duplicate state data

  2. Improve database compression using:

    • a different level 0 compression algorithm, like zstd
    • the maximum compression level
    • this probably doesn't need a state version change, but:
      • old states will have less compression, and
      • old versions of Zebra might not be able to open new states, if they don't have all the algorithms we're using
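The compatibility concern in the last bullet can be sketched as a simple check. This is only an illustration under assumptions: the `Compression` enum and `first_unsupported` function are hypothetical names, not Zebra or RocksDB types, and the algorithm list is just an example.

```rust
/// Compression algorithms a database file might use.
/// Illustrative only: these are not Zebra's or RocksDB's actual types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Compression {
    None,
    Lz4,
    Zstd,
}

/// Returns the first algorithm used by the on-disk state that this build
/// cannot decode, or `None` if the build can open the state.
pub fn first_unsupported(
    used_on_disk: &[Compression],
    supported_by_build: &[Compression],
) -> Option<Compression> {
    used_on_disk
        .iter()
        .copied()
        .find(|algo| !supported_by_build.contains(algo))
}
```

In this sketch, a build that only supports `None` and `Lz4` would report `Zstd` as unsupported for a state written with zstd. A newer build that supports all three could still open a state that only uses `Lz4`.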

We might want to delay the following options until after the audit, because they could change a lot of code:

  1. Add a config to Zebra that doesn't create unused indexes:

    • delete balance_by_transparent_addr
    • delete tx_loc_by_transparent_addr_loc
    • delete utxo_loc_by_transparent_addr_loc
    • delete sprout_note_commitment_tree entries below the finalized tip
    • delete sapling_note_commitment_tree entries below the finalized tip
    • delete orchard_note_commitment_tree entries below the finalized tip
    • delete history_tree entries below the finalized tip
    • This will cause errors in RPCs that use these indexes, but that's ok if they aren't called
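A rough sketch of how such a config could gate index creation and make affected RPCs fail clearly. Everything here is an assumption for illustration: `StateConfig`, `skip_address_indexes`, `is_index_enabled`, and `require_index` are hypothetical names, and only the three transparent address indexes from the list are covered (the tree pruning items would need separate handling).

```rust
/// Hypothetical config flag; this option is a proposal, not an existing setting.
#[derive(Clone, Copy, Debug, Default)]
pub struct StateConfig {
    pub skip_address_indexes: bool,
}

/// The transparent address indexes named in the list above.
pub const ADDRESS_INDEXES: [&str; 3] = [
    "balance_by_transparent_addr",
    "tx_loc_by_transparent_addr_loc",
    "utxo_loc_by_transparent_addr_loc",
];

/// Returns true if this column family should be created and kept up to date.
pub fn is_index_enabled(config: StateConfig, column_family: &str) -> bool {
    !(config.skip_address_indexes && ADDRESS_INDEXES.contains(&column_family))
}

/// Guard for an RPC handler: return an explicit error rather than
/// silently answering from a missing index.
pub fn require_index(config: StateConfig, column_family: &str) -> Result<(), String> {
    if is_index_enabled(config, column_family) {
        Ok(())
    } else {
        Err(format!("index {} is disabled by config", column_family))
    }
}
```

With the flag unset, nothing changes. With it set, an RPC that needs an address index would get an explicit error, which is acceptable for a pool that never calls it.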
  2. Add a config to Zebra that deletes blocks below a cutoff height, where the cutoff is the finalized tip minus how far we look back to check for legacy chains:

    • block_header_by_height
    • tx_by_loc
    • maybe hash_by_height
    • maybe height_by_hash
    • maybe hash_by_tx_loc
    • maybe tx_loc_by_hash
    • This could cause a lot of errors, so we should try a quick and dirty implementation first
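The cutoff for block deletion can be sketched as below, reading "finalized tip - how far we look back" as a subtraction. This is a minimal in-memory model, not Zebra's state code: `prune_below_height`, `prune_blocks`, and the `legacy_lookback` parameter are hypothetical, and a `BTreeMap` stands in for a height-keyed column family such as `block_header_by_height`.

```rust
use std::collections::BTreeMap;

/// Lowest height that must be kept so the legacy chain check still has
/// its full lookback window. Saturates at the genesis height (0).
pub fn prune_below_height(finalized_tip: u32, legacy_lookback: u32) -> u32 {
    finalized_tip.saturating_sub(legacy_lookback)
}

/// Removes every entry strictly below `keep_from` and returns how many
/// entries were removed.
pub fn prune_blocks<V>(blocks_by_height: &mut BTreeMap<u32, V>, keep_from: u32) -> usize {
    // `split_off` returns the entries with keys >= `keep_from`;
    // the map itself is left holding only the entries to delete.
    let kept = blocks_by_height.split_off(&keep_from);
    let removed = blocks_by_height.len();
    *blocks_by_height = kept;
    removed
}
```

A real implementation would also need to decide what happens to the hash and transaction location indexes marked "maybe" above, since lookups through them could point at deleted blocks.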
@teor2345 teor2345 added C-bug Category: This is a bug S-needs-triage Status: A bug report needs triage I-heavy Problems with excessive memory, disk, or CPU usage P-Optional ✨ A-rpc Area: Remote Procedure Call interfaces A-state Area: State / database changes labels Nov 25, 2022