
[BUG] Compaction detects tombstones by empty value instead of is_deleted flag — data loss risk #188

@ElioNeto

Description

During compaction (execute_compaction() in compaction.rs), tombstones are detected by checking if the value is empty (value.is_empty()), rather than checking the LogRecord::is_deleted flag. This is fragile and can cause data loss.

Location

src/core/engine/compaction.rs, lines 114–115:

// Skip tombstones (empty values) during compaction
if !value.is_empty() {
    // keep this record
}

The problem

  1. The LogRecord metadata (including is_deleted) is lost during flush: flush_memtable_impl() converts LogRecord to (Vec<u8>, Vec<u8>), discarding the flag
  2. Compaction then has no way to distinguish a tombstone from a legitimate empty value
  3. A key with an intentionally empty value ("") would be permanently deleted during compaction
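The three steps above can be reproduced in isolation. The following is a minimal sketch, not the engine's actual code: the LogRecord struct is a simplified stand-in, and flush_lossy, compact_by_empty_value, and demo_lossy_compaction are hypothetical names that only mimic the behavior described in this issue.

```rust
use std::collections::BTreeMap;

// Simplified, hypothetical stand-in for the engine's LogRecord.
#[derive(Debug, Clone)]
struct LogRecord {
    key: Vec<u8>,
    value: Vec<u8>,
    is_deleted: bool,
}

// Mimics the lossy flush: only key and value bytes survive,
// so the is_deleted flag is discarded here.
fn flush_lossy(records: &[LogRecord]) -> BTreeMap<Vec<u8>, Vec<u8>> {
    records
        .iter()
        .map(|r| (r.key.clone(), r.value.clone()))
        .collect()
}

// Mimics the current compaction filter: every empty value is
// treated as a tombstone and dropped.
fn compact_by_empty_value(table: &BTreeMap<Vec<u8>, Vec<u8>>) -> BTreeMap<Vec<u8>, Vec<u8>> {
    table
        .iter()
        .filter(|(_, v)| !v.is_empty())
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

// Builds one live key, one key with an intentionally empty value,
// and one real tombstone, then runs flush followed by compaction.
fn demo_lossy_compaction() -> BTreeMap<Vec<u8>, Vec<u8>> {
    let records = vec![
        LogRecord { key: b"live".to_vec(), value: b"v1".to_vec(), is_deleted: false },
        LogRecord { key: b"empty".to_vec(), value: Vec::new(), is_deleted: false },
        LogRecord { key: b"gone".to_vec(), value: Vec::new(), is_deleted: true },
    ];
    compact_by_empty_value(&flush_lossy(&records))
}
```

After demo_lossy_compaction() runs, "empty" is missing from the result even though it was never deleted: once the flag is gone, it cannot be told apart from "gone".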

Root cause

Table::build() stores data as BTreeMap<Vec<u8>, Vec<u8>> (key → value bytes). The is_deleted, timestamp, and column_family fields are lost. This was a tradeoff when the duplicate MemTable types were unified (#140).

Proposed fix

  1. Change Table::build() to store BTreeMap<Vec<u8>, LogRecord> instead of BTreeMap<Vec<u8>, Vec<u8>>
  2. Update compaction to check LogRecord::is_deleted instead of value.is_empty()
  3. Update flush_memtable_impl() to preserve LogRecord metadata when building tables
  4. Update VersionSet::get() and scan() to extract value from LogRecord
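A rough sketch of what the proposed fix could look like, under the same assumptions: LogRecord is a simplified stand-in, and TableData, compact_by_flag, and get_value are hypothetical names rather than the real Table or VersionSet API. It shows only the idea of keeping the record in the table, filtering on is_deleted, and extracting the value on read.

```rust
use std::collections::BTreeMap;

// Simplified, hypothetical stand-in for the engine's LogRecord.
#[derive(Debug, Clone, PartialEq)]
struct LogRecord {
    value: Vec<u8>,
    is_deleted: bool,
    timestamp: u64,
}

// Hypothetical table layout after the fix: key -> full record,
// so the metadata survives the flush.
type TableData = BTreeMap<Vec<u8>, LogRecord>;

// Compaction drops only records explicitly flagged as deleted;
// an empty value alone is no longer treated as a tombstone.
fn compact_by_flag(table: &TableData) -> TableData {
    table
        .iter()
        .filter(|(_, rec)| !rec.is_deleted)
        .map(|(k, rec)| (k.clone(), rec.clone()))
        .collect()
}

// Read path: hide tombstones and hand back only the value bytes.
fn get_value(table: &TableData, key: &[u8]) -> Option<Vec<u8>> {
    table
        .get(key)
        .filter(|rec| !rec.is_deleted)
        .map(|rec| rec.value.clone())
}

// Builds a table with a live key, an intentionally empty value,
// and a tombstone, mirroring the failure case in this issue.
fn sample_table() -> TableData {
    let mut table = TableData::new();
    table.insert(b"live".to_vec(), LogRecord { value: b"v1".to_vec(), is_deleted: false, timestamp: 1 });
    table.insert(b"empty".to_vec(), LogRecord { value: Vec::new(), is_deleted: false, timestamp: 2 });
    table.insert(b"gone".to_vec(), LogRecord { value: Vec::new(), is_deleted: true, timestamp: 3 });
    table
}
```

With this layout, compact_by_flag(&sample_table()) keeps "empty" and removes only "gone", and get_value returns Some(vec![]) for "empty" but None for "gone". Whether compaction may drop a tombstone at all (for example, when older tables still hold the key) is a separate question that this sketch does not address.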

Impact

  • Low probability but high severity: keys whose legitimate value is empty would be silently dropped during compaction
  • Fix requires propagating LogRecord through the entire Table/VersionSet layer

Severity

High — potential silent data loss for empty-string values.
