Roadmap #259

Open · 32 of 92 tasks
sentientwaffle opened this issue Nov 18, 2022 · 0 comments

The tasks listed are not guaranteed, and they are not ordered.
The intent is to give a sense of the project's direction.

  • 🔴: Must be done for the production release.
  • 🟡: Nice-to-have for the production release.

Stability: Storage

  • VSR: Manifest free-set. 🔴
  • VSR: 256-byte headers. 🔴
    • VSR: State machine version in headers. 🔴
  • VSR: Async checkpoints 🔴
  • LSM: Remove filter blocks. 🔴
  • LSM: Re-implement secondary index tombstone optimization. 🟡 #1352
  • LSM: Compaction optimizations
    • Coalesce small adjacent tables (context: LSM: implement move table optimization for compaction #463) 🔴
    • "Move-data-block" (more granular than "move table")
    • Start the next round of compaction reads before starting the merge (CPU) work.
    • The last level of each tree should have double-size tables, but half as many of them. 🔴
  • VSR: Size manifest trailer, and pace manifest compaction, to guarantee capacity. 🔴
  • VSR: Encode configuration data into superblock. 🔴
  • VSR: Reserve more space in SuperBlock.VSRState for future use. 🔴
  • Guard against running a binary against a data file that was created with a different configuration. 🟡
  • LSM: Add value count to TableInfo. (And possibly a value-block count, since compression will decouple the ratio between the two; see the TableInfo sketch after this list.) 🔴
  • Redo snapshots. 🔴
    • Snapshots should be relative to the op that "creates" them, not the op that compacts them.
    • Maybe use timestamps instead of ops as snapshot ids.
    • Store snapshot in manifest block header (like we do for all other blocks).
  • Reserve some extra space in the superblock for future use, just in case? (Since "growing" the superblock is not possible once a replica is formatted.)
  • VSR: Align grid zone start to grid block size. 🔴
  • VSR: Remove superblock trailers. 🔴
    • Encode the client sessions trailer into the grid.
    • Encode the manifest trailer into grid blocks, as an on-disk doubly-linked list (see the manifest-block sketch after this list).
    • Encode the manifest free-set into one grid block.
    • Increase the number of superblock copies, since they will be so much smaller.
  • VSR: Panic on nondeterminism, don't try to state sync recover. 🔴
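
A minimal sketch for the TableInfo item above. The field names, widths, and ordering are assumptions for illustration, not the actual TigerBeetle TableInfo; the point is the pair of counts that compression decouples.

```zig
const std = @import("std");

// Hypothetical manifest metadata for one table. Only value_count and
// value_block_count relate to the roadmap item. Fields are ordered
// largest-first so the extern struct has no implicit padding.
const TableInfo = extern struct {
    checksum: u128,
    key_min: u128,
    key_max: u128,
    address: u64,
    snapshot_min: u64,
    snapshot_max: u64,
    /// Total number of values stored in the table.
    value_count: u32,
    /// Number of value blocks holding those values. With compressed blocks,
    /// this can no longer be derived as value_count / values_per_block.
    value_block_count: u32,
    reserved: [16]u8,
};

comptime {
    // On-disk structures should have an explicit, padding-free size.
    std.debug.assert(@sizeOf(TableInfo) == 96);
}
```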
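
And a sketch for the "manifest trailer into grid blocks" item: one possible shape (assumed for illustration, not the actual block format) of chaining manifest blocks into an on-disk doubly-linked list, where each block names both neighbours by address and checksum so the chain can be walked and verified in either direction.

```zig
const std = @import("std");

// Hypothetical header for one manifest grid block in an on-disk doubly-linked
// list. An address of zero marks the head/tail of the list. Storing the
// neighbours' checksums as well as their addresses lets open/repair verify
// the chain in either direction before trusting its contents.
const ManifestBlockHeader = extern struct {
    checksum: u128,
    previous_checksum: u128,
    next_checksum: u128,
    address: u64,
    previous_address: u64, // 0 = this block is the head.
    next_address: u64, // 0 = this block is the tail.
    entry_count: u32,
    reserved: [20]u8,
};

comptime {
    std.debug.assert(@sizeOf(ManifestBlockHeader) == 96);
}
```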

Stability: API

Safety

  • VSR: State sync (to catch up when a replica is more than one WAL behind). 🔴
    • VSR: Include checkpoint identifier in prepare messages instead of prepare_ok messages. (Requires 256-byte headers.) 🔴
    • VSR: Remove state sync kludge. (Requires async checkpoints.) 🔴
  • VSR: Grid scrubber, to guard against double-faults. 🟡 (This is mostly done.)
  • VSR: repair_pipeline_read_callback recurses when messages are cached in the pipeline. Restructure it to avoid the risk of stack overflow (see the sketch after this list). 🟡
  • Storage: Audit TODOs in linux.zig and src/storage.zig. 🔴
  • VSR: Write and erase a random number of sectors during replica formatting, to ensure that, if all replicas are deployed to the same model of SSD, they are not overexposed to faults that affect the same physical block address on each SSD.
    • Note that this does not need to impact the storage format at all.
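
A toy, self-contained illustration of the repair_pipeline_read_callback restructuring above (the names and shape are assumptions, not the actual replica code): when a read can complete synchronously from a cache, drain with a loop rather than by having the completion path call back into itself, so stack depth stays constant no matter how many messages are cached.

```zig
const std = @import("std");

// Stand-in for a pipeline whose reads sometimes complete synchronously from a
// message cache. Only the control flow (loop instead of recursion) is the point.
const Pipeline = struct {
    cached_remaining: u32,

    /// Returns true if the next message was served synchronously from the cache.
    fn read_cached(self: *Pipeline) bool {
        if (self.cached_remaining == 0) return false;
        self.cached_remaining -= 1;
        return true;
    }

    /// Drain all synchronously-available messages without recursing:
    /// a recursive callback would add one stack frame per cached message.
    fn drain(self: *Pipeline) void {
        while (self.read_cached()) {}
        // At this point a real implementation would issue one asynchronous
        // read; its completion callback would call drain() again, keeping the
        // maximum stack depth constant.
    }
};

test "drain handles a large cache with constant stack depth" {
    var pipeline = Pipeline{ .cached_remaining = 1_000_000 };
    pipeline.drain();
    try std.testing.expectEqual(@as(u32, 0), pipeline.cached_remaining);
}
```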

Performance

  • StateMachine: Optimistic state machine execution.
  • LSM: Compaction beat pacing. 🟡
    • Spread work more evenly between beats, to avoid latency spikes at the end of a half-bar (see the pacing sketch after this list).
    • Make LSM storage deterministic at the end of each beat, instead of only at the end of each half-bar.
  • LSM: Compaction optimizations
    • LSM: Fix sequential grid-read bottleneck.
  • LSM: Manifest log open prefetch.
  • LSM: Add "sequential" bit for constant-time lookup in consecutive-key value block.
  • LSM: Compress value blocks.
  • VSR: Fix checkpoint latency spike.
  • VSR: Grid block reference-counting or cache/stash, to avoid internal block copying during compaction.
  • VSR: Adaptive message timeouts.
  • VSR: To speed up grid block sync, allow a replica to intelligently send blocks before they are asked for. (This is important for e.g. manifest repair, which is otherwise sequential.) The receiving replica should stash these in its grid block pool so that it can (hopefully) avoid a round trip to repair them.
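
A toy arithmetic sketch of the beat-pacing item above (not TigerBeetle's pacing code): give each beat a quota of ceil(remaining / beats_remaining) units of merge work, so the work shrinks smoothly instead of piling up in the last beat of a half-bar.

```zig
const std = @import("std");

/// Units of work to perform this beat, given the work and beats left in the
/// half-bar. Ceiling division guarantees the work finishes by the last beat.
fn beat_quota(values_remaining: u64, beats_remaining: u64) u64 {
    std.debug.assert(beats_remaining > 0);
    return std.math.divCeil(u64, values_remaining, beats_remaining) catch unreachable;
}

test "work is spread across the half-bar and finishes on time" {
    var remaining: u64 = 10;
    var beats: u64 = 4;
    // Quotas are 3, 3, 2, 2 rather than 0, 0, 0, 10.
    while (beats > 0) : (beats -= 1) {
        remaining -= beat_quota(remaining, beats);
    }
    try std.testing.expectEqual(@as(u64, 0), remaining);
}
```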
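
And a toy illustration of the "sequential" bit item (the block representation and names are assumptions): when a value block's keys are known to be consecutive, a point lookup is a subtraction from the block's minimum key instead of a search.

```zig
const std = @import("std");

/// Index of `key` within a value block, or null if it is out of range.
/// If the block's keys are consecutive (key_min, key_min + 1, ...), the
/// "sequential" bit makes the lookup a constant-time subtraction.
fn lookup(keys: []const u32, key: u32, sequential: bool) ?usize {
    if (key < keys[0] or key > keys[keys.len - 1]) return null;
    if (sequential) return key - keys[0];
    return std.mem.indexOfScalar(u32, keys, key); // Fallback: search the keys.
}

test "sequential lookup agrees with the search fallback" {
    const keys = [_]u32{ 10, 11, 12, 13, 14 };
    try std.testing.expectEqual(lookup(&keys, 12, false), lookup(&keys, 12, true));
}
```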

Experience: Operations

  • LSM: Runtime-configurable NodePool size. 🔴 (Allow the node pool size to be set on the command line #1447)
  • LSM: Default NodePool size (lsm_forest_node_count). (Currently it is constant and too small.) 🔴
  • LSM: Replica must panic "nicely" (i.e. with a log message) if NodePools.acquire() has no nodes available. 🔴
  • LSM: Replica must panic nicely if Grid has insufficient free blocks. 🔴
  • LSM: Replica must panic nicely if forest has insufficient tables. (Don't exceed table_count_max.) 🔴
  • VSR: Reconfiguration protocol
    • Add/remove replicas from the cluster.
    • Coordinate rolling replica version upgrades.
  • VSR: Improve asymmetric partition tolerance.
    • VSR: Table sync congestion control.
  • DNS addressing/lookups (Support DNS based addresses for nodes #74)
  • Metrics (e.g. Prometheus)
  • Structured logging, to make parsing/indexing/searching easier
  • Support for TLS between clients and replicas
  • Disaster recovery tool/mechanism to repair storage determinism problems. (TBD)
  • Document all CLI arguments.
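
A minimal sketch of the "panic nicely" items above (assumed names and limits, not the actual replica code): when a static resource pool is exhausted, log an actionable message naming the limit and how to raise it before aborting, rather than dying on a bare assertion.

```zig
const std = @import("std");

const log = std.log.scoped(.forest);

/// Abort with an actionable message instead of a bare assertion failure.
/// `node_count` stands in for a limit such as lsm_forest_node_count.
fn acquire_or_die(nodes_available: u64, node_count: u64) void {
    if (nodes_available == 0) {
        log.err("NodePool exhausted: all {d} nodes in use " ++
            "(increase the node pool size and restart the replica)", .{node_count});
        @panic("NodePool exhausted");
    }
}

test "no panic while nodes remain" {
    acquire_or_die(1, 1024);
}
```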

Experience: Client

Testing

  • VOPR(hub) running without errors. 🔴
  • VOPR: Test different configurations.
  • VOPR: Test additional storage faults.
  • VOPR: Sometimes run with an unrestricted number of faults.
  • Create StateMachine-level fuzzer. (Probably using Workload).
  • Explore more workloads for the forest fuzzer.
  • Antithesis. 🟡
  • Test a "full" LSM, to make sure it properly rejects requests. 🟡
  • Fuzz different compile-time and run-time configurations.
  • Fuzz all components: lsm unit/fuzz testing #189. (Maybe this is unnecessary? Forest fuzzer is a higher priority.)
  • Explicit code coverage marks. (Maybe reuse structured logging or metrics?)

Documentation

  • Document security model.