Skip to content

fix(mem_wal): advance visibility cursor on WAL durability, not indexing#6754

Merged
jackye1995 merged 1 commit into
lance-format:mainfrom
hamersaw:bug/wal-visiblity-cursor
May 13, 2026
Merged

fix(mem_wal): advance visibility cursor on WAL durability, not indexing#6754
jackye1995 merged 1 commit into
lance-format:mainfrom
hamersaw:bug/wal-visiblity-cursor

Conversation

@hamersaw
Copy link
Copy Markdown
Contributor

Summary

  • §1 — Visibility cursor: For memtables with empty index_configs, the WAL flush handler only advanced the visibility cursor through IndexStore::insert_batches_parallel, which the None branch of flush_from_batch_store skipped. The cursor stayed at 0 for the memtable's lifetime, so scanners returned only batch 0 even though the BatchStore held more. Compounding this, MemTable.indexes initialized to None meant the scanner and flush handler each minted throwaway IndexStore instances via unwrap_or_else, so an unconditional advance would still not share state. Fix: initialize MemTable.indexes to Some(Arc::new(IndexStore::new())) so both ends share a stable Arc, and the existing in-function advance fires correctly for empty registries.
  • Rename: max_indexed_batch_positionmax_visible_batch_position. The cursor now means "durable in the WAL, safe to read," not "indexed up to here." The advance method is now pub for callers that drive the cursor from outside the index path.
  • §3 — pub mod util: changed mod utilpub mod util so external crates can import shard_base_path, shard_wal_path, parse_bit_reversed_filename, bit_reverse_u64 instead of duplicating them. Helpers were already pub at the function level.

Test plan

  • New regression test_wal_flush_advances_visibility_with_empty_indexes — opens a flusher, appends 3 batches, flushes with an empty IndexStore, asserts max_visible_batch_position == 2.
  • New regression-guard test_wal_flush_advances_visibility_with_btree_index — same shape with a BTree index registered.
  • cargo test -p lance --lib dataset::mem_wal — 234 / 234 pass.
  • cargo check -p lance --tests --benches clean.
  • cargo clippy -p lance --tests --benches -- -D warnings clean.
  • cargo fmt --check clean.

🤖 Generated with Claude Code

Symptom: with empty `index_configs`, every batch after the first was
silently invisible to scans even though `BatchStore` held the data.

Root cause: `WalFlushHandler::flush_from_batch_store` only advanced the
visibility cursor through `IndexStore::insert_batches_parallel`, which
was skipped when `indexes: Option<Arc<IndexStore>>` was `None`.
Additionally, `MemTable.indexes` initialized to `None` meant the scanner
and flush handler each materialized throwaway `IndexStore` instances via
`unwrap_or_else`, so even after wiring up an unconditional advance the
two ends would not share a cursor.

Fix: initialize `MemTable.indexes` to `Some(Arc::new(IndexStore::new()))`
so scanner and flush handler share a stable `Arc`. With that shared Arc,
the existing in-function advance at `insert_batches_parallel` correctly
fires for an empty registry. Rename `max_indexed_batch_position` →
`max_visible_batch_position` to reflect the new semantic ("durable, safe
to read", not "indexed up to here"), and expose the advance as `pub` so
it is callable from the WAL flush path.

Also expose `mem_wal::util` as `pub mod util` so external consumers can
import the four shard-path/bit-reversal helpers instead of duplicating
them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@github-actions github-actions Bot added the bug Something isn't working label May 13, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 13, 2026

Codecov Report

❌ Patch coverage is 95.52239% with 3 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance/src/dataset/mem_wal/index.rs 81.25% 3 Missing ⚠️

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Contributor

@jackye1995 jackye1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the fix!

@jackye1995 jackye1995 merged commit 1edecb8 into lance-format:main May 13, 2026
28 checks passed
@hamersaw hamersaw deleted the bug/wal-visiblity-cursor branch May 13, 2026 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants