Description
Bit rot, media errors, and other silent data corruption can damage SSTable files on disk. ApexStore needs a background scrubber that periodically reads all SSTable data and verifies it against the stored checksums.
Implementation
- Background thread with configurable schedule (default: daily)
- For each SSTable:
  - Read all blocks
  - Verify CRC32 checksums
  - Verify bloom filter integrity
  - Verify key ordering
- On corruption detected:
  - Log the affected keys/ranges
  - Attempt repair from WAL archive or replica
  - If repair impossible, isolate the corrupted file
- Scrub progress and findings exposed in `/metrics`
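
The per-SSTable checks above could look roughly like the sketch below. It is illustrative only: the `Block` layout (length-prefixed keys plus a stored CRC32), `encode_block`, `decode_keys`, and `scrub_blocks` are hypothetical names, not ApexStore's actual on-disk format or API, and bloom filter verification is left out.

```python
import struct
import zlib
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical block: raw payload plus the CRC32 stored at write time."""
    payload: bytes
    stored_crc: int


def encode_block(keys: list[bytes]) -> Block:
    """Assumed writer side: payload is a run of 4-byte length-prefixed keys."""
    payload = b"".join(struct.pack(">I", len(k)) + k for k in keys)
    return Block(payload=payload, stored_crc=zlib.crc32(payload))


def decode_keys(payload: bytes) -> list[bytes]:
    """Decode the length-prefixed keys from a payload that passed its CRC."""
    keys, offset = [], 0
    while offset < len(payload):
        (length,) = struct.unpack_from(">I", payload, offset)
        offset += 4
        keys.append(payload[offset:offset + length])
        offset += length
    return keys


def scrub_blocks(blocks: list[Block]) -> list[str]:
    """Return one finding per problem; an empty list means the table is clean."""
    findings = []
    last_key = None
    for index, block in enumerate(blocks):
        # Step 1: recompute the checksum and compare it with the stored one.
        if zlib.crc32(block.payload) != block.stored_crc:
            findings.append(f"block {index}: CRC32 mismatch")
            continue  # the payload cannot be trusted, so skip decoding it
        # Step 2: keys must be strictly increasing across the whole table.
        for key in decode_keys(block.payload):
            if last_key is not None and key <= last_key:
                findings.append(f"block {index}: key {key!r} out of order")
            last_key = key
    return findings
```

A real scrubber would stream blocks from disk rather than hold them in a list, and would feed the findings into the log and the `/metrics` counters instead of returning strings.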
Configuration
resilience:
  scrubber:
    interval_secs: 86400  # daily
    repair_from_wal: true
    report_only: false  # if true, don't auto-repair
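
As a rough sketch of how these settings might drive the scrubber, the snippet below maps the already-parsed config onto a dataclass and picks an action when corruption is found. `ScrubberConfig`, `plan_action`, and the `wal_covers_range` flag are hypothetical names, and treating the example values as defaults is an assumption; only the daily interval is stated as a default. The replica repair path is omitted.

```python
from dataclasses import dataclass


@dataclass
class ScrubberConfig:
    """Mirrors resilience.scrubber; default values are assumed from the example."""
    interval_secs: int = 86400      # daily
    repair_from_wal: bool = True
    report_only: bool = False       # if true, don't auto-repair

    @classmethod
    def from_dict(cls, raw: dict) -> "ScrubberConfig":
        """Build from an already-parsed config mapping (e.g. loaded YAML)."""
        section = raw.get("resilience", {}).get("scrubber", {})
        cfg = cls(**section)  # unknown keys raise TypeError
        if cfg.interval_secs <= 0:
            raise ValueError("interval_secs must be positive")
        return cfg


def plan_action(cfg: ScrubberConfig, wal_covers_range: bool) -> str:
    """Decide what to do with a corrupted SSTable (hypothetical policy)."""
    if cfg.report_only:
        return "report"              # log and export metrics, touch nothing
    if cfg.repair_from_wal and wal_covers_range:
        return "repair_from_wal"     # rebuild the range from the WAL archive
    return "isolate"                 # repair impossible: isolate the file
```

For example, `plan_action(ScrubberConfig(report_only=True), wal_covers_range=True)` returns `"report"`, so a cautious deployment can observe findings before enabling automatic repair.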