Storage optimization #44

Open
12 of 21 tasks
lasarojc opened this issue Dec 23, 2022 · 0 comments
Labels: major-priority (a major, long-running priority for the team), storage, tracking (a complex issue broken down into sub-problems)

Comments

@lasarojc
Contributor

lasarojc commented Dec 23, 2022

High-level tracking issue for general storage optimization efforts. This issue can be expanded over time.

At present (mid 2023), depending on their configuration, Tendermint-based nodes use large quantities of storage space. This has significant cost implications for operators. We aim to implement strategies to reduce and/or offload certain data stored in order to reduce operators' costs.

Two main problems are present in the CometBFT storage layer:

  1. The storage footprint is very large.
  2. Querying stored data (whether to serve RPC queries or for Comet to retrieve consensus data structures) is not optimized and in some cases has proven to be very inefficient.

To address these problems, we first need to build an understanding of:

  • Workloads: what we store, how frequently we access it, and the characteristics of the stored data (this list will be expanded).
  • The database backend: database features, design goals and optimization possibilities.
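As a rough illustration of the workload analysis described above, the sketch below aggregates entry counts and byte sizes per key prefix. The key layout (a prefix terminated by `:`), the function name `footprint_by_prefix`, and the sample data are assumptions made for illustration; they do not describe CometBFT's actual key scheme.

```python
from collections import defaultdict


def footprint_by_prefix(items, sep=b":"):
    """Aggregate entry counts and key+value bytes per key prefix.

    `items` is any iterable of (key, value) byte pairs, for example the
    output of a database iterator. The prefix is everything before the
    first `sep`; this key layout is an assumption for illustration only.
    """
    totals = defaultdict(lambda: {"entries": 0, "bytes": 0})
    for key, value in items:
        prefix = key.split(sep, 1)[0]
        totals[prefix]["entries"] += 1
        totals[prefix]["bytes"] += len(key) + len(value)
    return dict(totals)


# Hypothetical sample data standing in for a real store dump.
sample = [
    (b"H:1", b"x" * 100),
    (b"H:2", b"x" * 100),
    (b"P:1:0", b"y" * 1000),
    (b"tx:abc", b"z" * 50),
]

report = footprint_by_prefix(sample)

# Print the largest prefixes first.
for prefix, stats in sorted(report.items(), key=lambda kv: -kv[1]["bytes"]):
    print(prefix.decode(), stats["entries"], stats["bytes"])
```

Run against a real store, a report of this kind would show which data categories dominate the footprint and are therefore the best candidates for pruning or offloading.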

The work to be done can be broken down into the following main subsections:


To reach this goal we envision the following steps:


Tune CometBFT to address storage related bottlenecks

Part of this section covers issues found during the benchmarking and investigation process outlined above. Another part addresses concrete issues reported by users. While some of these issues cannot be fully addressed before the analysis above is complete, some optimizations can be performed on CometBFT as it is today; these are marked with *.


CometBFT stores and allows querying of data not essential for consensus
We need to identify the functionality we want to support within Tendermint and offload non-critical data and functionality.

  • Implement ADR-101 PoC targeting main #816
    This implementation provides users with an API to implement their own event indexing and to prune the full nodes that currently store events.
  • Write a data companion based on ADR 101
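The pruning idea behind a data companion can be sketched as follows, assuming that the node may only discard data below the lowest height that every interested party (here the application and the companion) has declared it no longer needs. The function names and the in-memory store are hypothetical and do not reflect the ADR-101 API.

```python
def prunable_below(app_retain_height, companion_retain_height):
    """Return the height below which data is safe to prune.

    Assumption: data can be dropped only once both the application and
    the data companion have signalled that they no longer need it.
    """
    return min(app_retain_height, companion_retain_height)


def prune(store, retain_height):
    """Delete entries with height < retain_height; return how many were removed."""
    stale = [height for height in store if height < retain_height]
    for height in stale:
        del store[height]
    return len(stale)


# Hypothetical in-memory event store keyed by block height.
events = {height: f"events@{height}" for height in range(1, 11)}

# The companion has only indexed events up to height 4, so it still needs 5+.
cutoff = prunable_below(app_retain_height=8, companion_retain_height=5)
removed = prune(events, cutoff)
print(cutoff, removed, sorted(events))
```

In this sketch a slow companion holds back pruning, which is the intended trade-off: the node never drops events that have not yet been indexed externally.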


CometBFT currently maintains its own [WAL](https://github.com/cometbft/cometbft/blob/101bf50e715d6a10c8135392166c35bdae94972e/consensus/wal.go) - is this even necessary, given that the underlying database should actually be taking care of this? It is another source of complexity and potential point of failure in the system that the team has to maintain.


Original issue: tendermint/tendermint#9881

@lasarojc lasarojc added storage major-priority A major, long-running priority for the team tracking A complex issue broken down into sub-problems labels Dec 23, 2022
@thanethomson thanethomson changed the title Storage Optimization Storage optimization Mar 28, 2023
@adizere adizere added this to the 2024-Q1 milestone Jan 31, 2024
@adizere adizere removed this from the 2024-Q1 milestone Apr 1, 2024