[FEATURE] Key prefix compression — block-level prefix encoding to reduce SSTable size #194

@ElioNeto

Description

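ApexStore stores the full key for every entry in an SSTable block. Other engines such as LevelDB and RocksDB apply prefix compression within data blocks: when consecutive keys share a common prefix, only the length of the shared prefix and the remaining suffix are stored, which reduces both I/O and storage. (RocksDB's prefix_extractor is a related but separate option: it drives prefix bloom filters and prefix seeks rather than the in-block key encoding.)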

Proposed Implementation

  1. During SSTable block building, compute the prefix each key shares with the key immediately before it
  2. Store each key as (shared_prefix_len, suffix) relative to the previous key instead of the full key bytes
  3. Add a prefix_extractor configuration option
  4. When prefix_extractor is set, bloom filters and index blocks can also use prefix-optimized lookups
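
Steps 1 and 2 can be sketched as follows. This is a minimal illustration in Python, not ApexStore code: the names `encode_block`, `decode_block`, and `restart_interval` are hypothetical, and the periodic "restart" entries (full keys stored every N entries so a reader can start decoding mid-block) are an assumption borrowed from LevelDB's block format rather than something this issue specifies.

```python
from typing import List, Sequence, Tuple

# One encoded entry: (shared_prefix_len, suffix). Hypothetical layout.
Entry = Tuple[int, bytes]


def shared_prefix_len(a: bytes, b: bytes) -> int:
    """Return the number of leading bytes that a and b have in common."""
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i


def encode_block(sorted_keys: Sequence[bytes], restart_interval: int = 16) -> List[Entry]:
    """Prefix-encode keys that are already in sorted order.

    Every `restart_interval`-th key is stored in full (shared length 0)
    so that decoding can begin there without reading earlier entries.
    """
    entries: List[Entry] = []
    prev = b""
    for i, key in enumerate(sorted_keys):
        shared = 0 if i % restart_interval == 0 else shared_prefix_len(prev, key)
        entries.append((shared, key[shared:]))
        prev = key
    return entries


def decode_block(entries: Sequence[Entry]) -> List[bytes]:
    """Rebuild full keys by reusing the first `shared` bytes of the previous key."""
    keys: List[bytes] = []
    prev = b""
    for shared, suffix in entries:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys
```

A real on-disk format would also need to record the suffix length (and the value length) for each entry, for example as varints, plus an array of restart offsets at the end of the block for binary search.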

Impact

  • Estimated 30-50% SSTable size reduction for keys with long common prefixes (e.g., time-series keys, user IDs); the actual saving depends on key length and how much consecutive keys overlap
  • Faster scans (fewer bytes to read from disk)
  • Better block cache utilization
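
The size reduction depends heavily on key shape, so it is worth measuring on representative keys before committing to a figure. The sketch below is one rough way to estimate it. It counts key bytes only, and it assumes one extra byte per entry for the shared-length field and a full key every `restart_interval` entries. Both are simplifying assumptions, and `key_bytes` is a hypothetical helper, not an ApexStore API.

```python
from typing import Sequence, Tuple


def key_bytes(sorted_keys: Sequence[bytes], restart_interval: int = 16) -> Tuple[int, int]:
    """Return (raw_key_bytes, prefix_encoded_key_bytes) for sorted keys.

    Assumption: each encoded entry pays 1 extra byte for the shared-length
    field; value bytes and other per-entry overhead are ignored.
    """
    raw = 0
    encoded = 0
    prev = b""
    for i, key in enumerate(sorted_keys):
        shared = 0
        if i % restart_interval != 0:
            # Count leading bytes shared with the previous key.
            limit = min(len(prev), len(key))
            while shared < limit and prev[shared] == key[shared]:
                shared += 1
        raw += len(key)
        encoded += (len(key) - shared) + 1
        prev = key
    return raw, encoded


if __name__ == "__main__":
    # Synthetic time-series style keys; replace with keys from a real workload.
    sample = [b"sensor-42:%010d" % ts for ts in range(1000)]
    raw, enc = key_bytes(sample)
    print(f"raw={raw} encoded={enc} saving={1 - enc / raw:.1%}")
```

For very short keys or keys with little overlap, the per-entry overhead can outweigh the saving, so the encoding may be best left configurable.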
