Skip to content

refactor(mem_wal): redesign FTS mem index for single-writer multi-reader#6726

Merged
jackye1995 merged 37 commits into
lance-format:mainfrom
touch-of-grey:MemTableFTSBetter
May 19, 2026
Merged

refactor(mem_wal): redesign FTS mem index for single-writer multi-reader#6726
jackye1995 merged 37 commits into
lance-format:mainfrom
touch-of-grey:MemTableFTSBetter

Conversation

@touch-of-grey
Copy link
Copy Markdown
Contributor

@touch-of-grey touch-of-grey commented May 10, 2026

Summary

Replaces the SkipMap<(token, row_position)> postings layout in
rust/lance/src/dataset/mem_wal/index/fts.rs with per-term
ArcSwap<TermSlice> slices and an ArcSwap<Snapshot>-published
per-batch visibility watermark. Readers grab the snapshot, walk per-term
chunks filtered by chunk.batch_position < snapshot.visible_count, and
score with snapshot-coupled BM25 stats.

The writer publishes the snapshot only after every term chunk for a
batch is linked, so readers never observe a partial document or BM25
stats out of sync with the postings — per-batch monotonic visibility.

Also in this PR:

  • Tokenize-time tokenizer ownership moves out of a single shared Mutex
    into a small reader pool plus a writer-dedicated slot, so search
    calls no longer serialize against the writer.
  • Utf8View is added to the supported text types alongside Utf8 /
    LargeUtf8; non-string columns now produce a clear error instead of
    being silently skipped.
  • FtsMemIndex::memory_usage() for size-based flush triggers.
  • Compound queries (Boolean, Boost) snapshot once at search_query
    and thread the same Arc<Snapshot> through every leaf, so a
    compound query never mixes BM25 stats from different snapshots.
  • The flush path keeps using lance_index::scalar::inverted types
    (TokenSet / DocSet / PostingListBuilder); the on-disk format is
    unchanged. The stale "flush is not yet implemented" header doc is
    corrected.

Public API (FtsMemIndex constructors, FtsQueryExpr, SearchOptions,
BooleanQueryBuilder, FtsIndexConfig, to_index_builder_reversed)
is preserved. No callers of the removed internal FtsKey /
PostingValue types existed outside the file.

cc @jackye1995 for review.

Copy link
Copy Markdown

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@github-actions github-actions Bot added the A-python Python bindings label May 10, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@touch-of-grey
Copy link
Copy Markdown
Contributor Author

FTS mem index — performance & memory update

Beyond the original single-writer/multi-reader redesign, this branch now also
makes the in-memory FTS index compact and fast. Summary of what landed and how
it benchmarks.

What changed

  • Partition-structured index — immutable Partitions plus a bounded
    mutable tail, published together behind one ArcSwap; the tail freezes into
    a partition past a threshold. Replaces the per-batch-chunk layout that made
    queries O(corpus).
  • Compressed posting lists — 128-doc blocks; doc ids and frequencies
    bit-packed (SIMD BitPacker4x for full blocks); positions stored as a
    per-block bit-packed delta stream with random per-document access (a
    document's position count equals its term frequency, so it is located from
    the frequency prefix sum — no count is stored).
  • WAND top-k with a shared, cross-partition pruning threshold; shared
    term-dictionary interner.
  • Flush still converts to the on-disk Lance FTS format unchanged.

Benchmark (real FineWeb corpus, 150 queries, k=10)

Against the pre-redesign mem index, at 1M documents:

metric before after improvement
query p50 7 303 µs 1 121 µs 6.5x
query p95 1 175 654 µs 7 849 µs 150x
memory 10.21 GB 1.37 GB 7.5x
term recall@10 1.000 1.000 exact

Against Apache Lucene (same corpus and queries) as a reference point:

docs mem index p50 / p95 Lucene p50 / p95 mem index / Lucene memory
100k 110 / 1 465 µs 206 / 560 µs 0.14 / 0.09 GB
500k 560 / 5 595 µs 208 / 3 589 µs 0.68 / 0.42 GB
1M 1 121 / 7 849 µs 261 / 4 397 µs 1.37 / 0.84 GB

Split by query type at 1M docs:

term (100 q) phrase (50 q)
p50 / p95 1 039 / 1 186 µs 1 359 / 14 245 µs

Term queries — the dominant FTS workload — are at or ahead of Lucene on
latency (term p95 ~1.2 ms vs Lucene's blended p95 ~4.4 ms), ranking is exact
(recall@10 = 1.0), and memory is ~1.6x Lucene.

Known follow-up

The blended p95 is dominated by phrase queries (~14 ms at 1M, ~3x Lucene).
search_phrase scans the rare driving term's full posting list with no top-k
pruning. Closing that gap needs phrase-specific top-k pruning and is best
done as a separate change; term latency, recall, and memory are competitive.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should not create a new bench folder, these should be mved under rust/lance/benches. We should probably create a mem_wal folder with specific subfolders to organize different benchmarks at this point

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Moved all the FTS bench files out of the top-level bench/ folder into rust/lance/benches/mem_wal/, with a fts/ subfolder for FTS-specific work:

rust/lance/benches/mem_wal/
  run_shard_writer_backpressure.sh
  fts/
    mem_wal_fts_bench.rs
    mem_wal_fineweb_fts.rs
    LuceneFtsBench.java
    run_fts_compare.sh
    run_fineweb_fts.sh

Cargo.toml's [[bench]] entries now point at the new paths via path = .... I kept this scoped to the bench files this PR introduces — the pre-existing flat mem_wal_*.rs benches are left in place to avoid an unrelated churn diff; happy to do a follow-up that consolidates those into mem_wal/ too.

Also scrubbed the driver scripts and bench doc comments of infra-specific references (S3 bucket names, instance paths, AWS region exports) — they now default to tmpdir-based paths and resolve the repo root via git rev-parse.

Copy link
Copy Markdown
Contributor

@jackye1995 jackye1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly looks good to me, thanks for iterating this with me, just a final comment regarding the file layout

Replace the SkipMap<(token, row_position)> postings layout with per-term
ArcSwap<TermSlice> slices and an ArcSwap<Snapshot>-published per-batch
visibility watermark. Readers grab a snapshot, walk per-term chunks
filtered by chunk.batch_position < snapshot.visible_count, and score
with snapshot-coupled BM25 stats. Writer publishes the snapshot only
after every term chunk for a batch is linked, so readers never observe
a partial document or BM25 stats out of sync with the postings.

Also: tokenize-time tokenizer ownership moves out of a shared Mutex
into a tokenizer pool plus a writer-dedicated slot, so search calls do
not serialize against the writer; add Utf8View to the supported text
types; add memory_usage() for size-based flush triggers; reuse
lance-index InvertedIndex's TokenSet/DocSet/PostingListBuilder for the
flush path.
Boolean and Boost queries called search_query recursively for each
sub-clause, and every leaf (search / search_phrase / search_fuzzy)
took its own self.snapshot.load_full(). A writer publishing between
sub-queries left the compound result mixing BM25 stats from different
snapshots — n, avgdl, and df disagreed across leaves of the same
query, so the summed score wasn't valid for any point in time.

Snapshot once at search_query / search_with_options and pass the
Arc<Snapshot> through every leaf and every recursive call. Public
search* entry points keep their signatures and snapshot internally.

Also: filter entry_count() by visibility; correct stale doc-comments
on Snapshot::batches, TermChunk::positions, and Snapshot::batch_for;
hoist test-body 'use std::sync::Arc;' to the test module's top.
New rust/lance/benches/mem_wal_fineweb_fts.rs covers three metrics
across the 12 configs in the design doc:

- write throughput at memtable sizes 100k / 500k / 1M
- MemTable FTS query latency (avg/p50/p95) over 100 high-frequency
  tokens + 50 sampled phrases
- consistency: |memtable_top10 ∩ post_flush_disk_top10| / |union| as a
  user-approved replacement for recall@k

The bench downloads HuggingFaceFW/fineweb sample/10BT shards, caches
them, and is fully env-driven so a single binary handles every config.
Driver script bench/run_fineweb_fts.sh loops the 12 configs, uploads
each result.json to S3, and prints a summary.

Also: make `dataset::mem_wal::index` public so the bench can call
`FtsMemIndex::search_with_options` directly to time the MemTable read
path.
The previous spin on max_indexed_batch_position never terminated when
max_memtable_rows triggers auto-flushes during ingest: the counter is
reset on each new active memtable, so target_batch_pos = total_batches
- 1 is unreachable from the active generation.

Close the writer instead — it drains the final WAL flush and any
outstanding memtable flush; the inline sync_indexed_write covers the
per-put index updates. close() time is included in the measured
elapsed so configs with different flush cadences are compared
apples-to-apples.
…e bench

Rewrites mem_wal_fineweb_fts as a CLI-arg bench modeled on the upstream
mem_wal_shard_writer_backpressure bench: same Mode matrix (async/sync ×
index/no-index), same ShardWriter wiring, JSON output. Payload is real
HuggingFace FineWeb text and the maintained index is the in-memory FTS
index.

Splits the run into independent --phase write|read invocations so a
process never holds two ShardWriter lifecycles in sequence — that was
the deadlock in the first iteration. The driver runs every config under
a timeout watchdog so a hang costs one window, not days.
In async-index modes the FTS index is updated in the background
WAL-flush handler, so querying the MemTable right after the put loop
hit an empty index — consistency scored 0 because MemTable top-10 was
always empty. The read phase now waits for the IndexStore visibility
watermark to reach the last buffered batch before querying. Auto-flush
is disabled in the read phase so the watermark is a stable, monotonic
target. Reports the catchup time as index_catchup_seconds.
The watermark-based catchup wait never completed for async-index mode
with auto-flush disabled. The read panel measures FTS read latency and
MemTable/on-disk consistency, both of which are properties of the
populated index and independent of whether ingestion indexed sync or
async. So the read phase now forces sync_indexed_write: the FTS index
is fully populated when the put loop returns and the query phase can
start immediately, no watermark polling.
The read-phase consistency scored 0 because the 1M-row seed polluted
the on-disk comparison: the MemTable FTS index covers only the ingested
rows, but `full_text_search` ranked over all 1.1M rows and its top-10
was dominated by seed rows the MemTable never held — disjoint by
construction.

The read phase now seeds with only READ_SEED_ROWS (1000) rows and
prefilters the on-disk query to `id >= READ_SEED_ROWS`, so both the
MemTable and on-disk top-10 cover exactly the ingested population and
BM25 stats are effectively identical.
…nce dataset

The read-phase consistency was unmeasurable: (1) the post-MemWAL-flush
on-disk query saw nothing because flushed data lives in the MemWAL LSM
structure that a plain Dataset::scan does not traverse, and (2) the
manual FtsMemIndex row_position -> id mapping was mis-aligned.

The read phase now queries the MemTable through the production
MemTableScanner (returns the id column directly, no manual mapping) and
compares against a freshly built reference Lance dataset holding the
identical ingested rows + a normal FTS index. Both sides cover exactly
the same population with identical BM25 stats. Token queries are drawn
from mid-frequency terms so the top-10 is well-determined rather than a
high-frequency near-tie.
The MemTableScanner snapshots the visibility watermark at plan time,
and that watermark only advances as the WAL becomes durable. Without
durable_write the read phase queried a partially-visible MemTable: the
async-mode read configs scored consistency ~0.03-0.23 while the
durable sync-mode ones scored ~0.72-0.75 over identical data. The read
phase now forces durable_write on so the scanner always sees the full
ingested MemTable.
…e phase

Adds paced ingest to the write phase for a backpressure sweep, mirroring
the mem_wal_shard_writer_backpressure bench used for the HNSW vector
panel. The write phase now reports slow_puts_ge_1s/10s, the memtable +
pending-WAL backlog left when the put loop ends, and honors a paced
target so a sweep can find the max sustainable async FTS throughput
(the rate at which the flush/index pipeline keeps up without
accumulating backpressure).
…cation

With no --max-memtable-rows the write-phase config defaulted to
usize::MAX/2, so max_memtable_batches = max_rows/batch_rows overflowed
into a ~9e15 Vec capacity and the writer aborted on a 664 PB
allocation. The default is now derived from the byte budget and
clamped to 16M rows.
Centralizes the vector and FTS write-backpressure tests on one bench and
one driver: the paced-ingest / WAL-queue-sampling / skip-close
methodology is identical, and only --index-type selects which column is
indexed (vec via IVF/PQ, or text via the inverted index). This makes the
FTS backpressure numbers directly comparable to the HNSW vector sweep.

The fineweb-shaped `text` column is now generated as varied vocabulary
words instead of a single repeated character so the FTS index does
representative work; the byte size is unchanged so vector runs (where
text is inert payload) are unaffected.

run_shard_writer_backpressure.sh drives both index types via INDEX_TYPE.
Mirrors the HNSW-vs-hnswlib comparison for full-text search:

- mem_wal_fts_bench.rs — Lance side. `gen` slices a FineWeb corpus and
  writes the shared inputs (corpus.txt, corpus_tok.txt, queries.txt,
  exact-BM25 truth.txt); `bench` builds a raw FtsMemIndex and measures
  build throughput, term/phrase query latency + QPS, and recall@k.
- lucene_fts_bench/LuceneFtsBench.java — Lucene reference, in-memory
  ByteBuffersDirectory + BM25Similarity(1.2,0.75), same inputs, same
  metrics, same JSON output shape.
- run_fts_compare.sh — builds Lucene from the local checkout (JDK 25),
  builds both benches, runs Run A (pre-tokenized, isolates index +
  scorer) and Run B (native analyzers) across a 100k/500k/1M sweep,
  reports both impls side by side plus their mutual top-k overlap.
One Future per query made the multi-thread QPS dominated by thread-pool
submit overhead (smoke run: Lucene qps_nt 10k vs Lance 105k for cheap
queries). Submit one task per thread, each striding a slice of the
queries x reps work — the same shape as the Lance side's rayon loop.
Adds a pub `wand_search` to the inverted module: it builds PostingIterators
from caller-supplied PostingLists and runs the block-max WAND algorithm,
with a no-op row mask. The WAND machinery was pub(crate); this exposes it
for the in-memory FTS MemTable index, whose frozen segments will hold
postings as PostingList::Plain and need the same query primitive as the
on-disk path. Stage 1 of the segment-structured FtsMemIndex redesign.
Adds `Partition` + `PartitionBuilder` to the FTS mem index: an immutable
frozen slice of inserts holding per-term posting lists in the on-disk
`PostingList::Plain` form, queried with the shared block-max WAND
(`wand_search`). WAND reports partition-local doc ids; `Partition::search`
remaps them to MemTable row positions via the partition's `DocSet`.

Standalone and dead-code-allowed — wired into `FtsMemIndex` in the next
stage. Part of the partition-structured redesign that replaces the
O(corpus) per-batch-chunk query layout.
Replaces the per-batch-chunk `FtsMemIndex` layout — where every query walked
every chunk of every query term, making latency O(corpus) (p95 ~1.2s at 1M
docs) — with Lucene's segment model adapted to a Lance FTS partition.

The index is now a set of immutable `Partition`s plus a bounded mutable tail,
published together behind one `ArcSwap` (`IndexState`) so a freeze is observed
atomically. Each partition holds frozen, on-disk-shaped posting lists queried
with block-max WAND in ~O(matches); the writer freezes the tail into a new
partition past `freeze_threshold_rows` (50k) and merges partitions past a hard
cap. All scoring routes through one corpus-wide `MemBM25Scorer` so partition
and tail scores are comparable; the scorer and the scanned tail share a single
snapshot to keep BM25 stats consistent. Flush merges every partition and the
tail into one `InnerBuilder`. The public query API is unchanged.

Exposes `Scorer` from `lance-index`'s inverted module.
Block-max WAND top-k pruning needs per-term `max_score` upper bounds, which
in-memory partition posting lists do not carry (`max_score` is `None`). A
WAND given a `limit` therefore pruned by a bound it could not compute and
silently dropped valid top-k docs — term_recall@10 collapsed from 1.0 to
~0.1 once the corpus exceeded the freeze threshold and partitions appeared.

Per-partition WAND now always runs unbounded; the scan is still O(matches)
(the partition layout, not WAND pruning, is what makes latency flat), and
`search_with_options` truncates to the real limit after merging. Adds a
regression test asserting a limited search over many partitions returns the
exact global top-k.
Block-max WAND needs per-term `max_score` upper bounds to prune safely.
Frozen in-memory partition posting lists carry none (`max_score` is `None`),
so even an unbounded WAND mis-pruned and term_recall@10 collapsed from 1.0 to
~0.12 once the corpus crossed the freeze threshold.

`Partition::search_match` / `search_phrase` now scan the per-term merged
posting list directly: OR accumulates each token's BM25 contribution per doc,
phrase intersects from the rarest token and verifies positions. A partition
holds one merged list per term, so this is O(matches) — the per-batch-chunk
O(corpus) layout was the original problem, not the absence of WAND pruning.
All scoring still routes through the shared `MemBM25Scorer`, so partition and
tail scores stay identical. Block-max WAND can be reintroduced later once
partitions carry precomputed per-term score bounds.
…r slack

Partition posting lists were `PlainPostingList`s whose positions were each
built through a fresh Arrow `ListBuilder`/`Int32Builder`. Those builders
default-allocate 1024-element buffers, so the ~1M+ tiny per-term posting
lists across a large index carried ~10+ GB of pure capacity slack —
in-memory FTS footprint regressed ~2.8x (29 GB vs 10 GB at 1M docs).

Replaces the partition posting representation with `PartitionPosting`:
exact-sized plain `Vec`s, partition-local `u32` doc ids (not `u64`), `u32`
frequencies (not `f32`), and a CSR position layout. No Arrow builders, no
slack, smaller per-element. The OR/AND scan now accumulates into flat arrays
indexed by the dense local doc id instead of a per-posting `HashMap`.
A limited query previously scored every matching doc in every partition —
O(matches), so p50 grew with the corpus and trailed Lucene by ~24x. Each
`PartitionPosting` now carries `(max_freq, min_dl)`, which yields a sound
per-term BM25 score upper bound (`doc_weight` is monotone in both). With that
bound, `Partition::wand_match` runs block-max WAND: it maintains a top-k heap
and skips any doc whose cumulative term upper bounds cannot reach the k-th
score. This is exact — the earlier WAND attempt mis-pruned only because the
posting lists carried no bound at all.

WAND runs for OR queries that carry a result limit (`search_with_options`);
unbounded `search` and AND still take the exact direct scan. Adds a test
asserting WAND top-k equals the exact full-scan top-k across partitions.
…onary

Frozen partitions held uncompressed `Vec<u32>` posting lists plus a per-
partition `HashMap<Arc<str>,u32>` term dictionary duplicated across every
partition — ~4 GB at 1M docs (~5x Lucene).

Postings are now stored in three shared per-partition buffers as 128-doc
blocks: VByte + delta doc-id gaps, VByte frequencies, and VByte delta-encoded
positions. A `PostingCursor` decodes a block at a time and `skip_to` jumps
whole blocks via `BlockMeta` without decoding — so WAND still block-skips and
the exact O(matches) scan path is unchanged. Per-term overhead drops to one
`PostingRef`. The term dictionary becomes a sorted `Box<[Arc<str>]>`
(binary-searched, no hashmap node overhead) whose strings are shared across
partitions via an index-wide `TermInterner`.

Flush is unaffected — `to_index_builder_reversed` decodes postings through
the same cursor, reverses, and rebuilds the Lance FTS on-disk format as
before. Adds VByte and multi-block cursor round-trip tests.
VByte's byte-at-a-time decode loop made compressed-partition queries far
slower than the uncompressed layout — p95 at 1M docs rose 4.7 ms to 24.7 ms.

Doc ids and frequencies in each 128-doc block are now fixed-width bit-packed
(doc ids as `doc - first_doc`, both at the block's minimal bit width, recorded
in `BlockMeta`). Decoding a block is now a tight shift/mask loop instead of a
branchy VByte scan, restoring query latency while keeping the footprint
compressed. Positions stay VByte (decoded lazily, off the hot path).

Adds a bit-pack round-trip test covering zero/sub-byte/byte-crossing/full
widths.
Each partition's WAND previously ran with its own top-k heap, cold-starting
the pruning threshold at -inf. At 1M docs (~20 partitions) that re-paid the
"fill the heap to raise the threshold" warm-up ~20x, leaving FTS query p50
~2.5x Lucene even on the uncompressed layout.

Limited (top-k) OR searches now feed a single shared `TopK` across the whole
index: the tail is scanned first to warm the threshold, then each partition's
`wand_into` prunes against the shared, monotonically rising threshold — so a
partition processed late skips far more. Still exact: the per-term
`doc_weight(max_freq, min_dl)` upper bound is sound, and partitions hold
disjoint docs so the shared heap collects the true global top-k. Unbounded
`search` and AND keep the exact O(matches) scan.
The per-list WAND bound (`doc_weight(max_freq over the whole posting list,
...)`) is so loose for common terms that the pivot never drops below the
threshold — WAND degenerated into a full scan, so neither the shared
threshold nor a faster codec could help.

Each `BlockMeta` now carries the block's `max_freq`, and `wand_into` is a
block-max WAND: the per-list bound still selects the pivot doc, then a
per-block bound — summed over all lanes for the blocks covering the pivot —
decides whether the whole `[pivot_doc, min block end]` region can be skipped
without decoding or scoring it. The bound is sound (block max_freq and term
min_dl over-estimate every doc in the block), so the top-k stays exact.
`PostingCursor` gains shallow `block_max_freq_for` / `block_end_for` lookups
(binary search over `BlockMeta`, no payload decode).

Adds a 600-doc/2-partition test asserting the WAND top-k score multiset
equals the exact full-scan top-k across several k.
Compression cut the in-memory FTS footprint 2.7x but VByte/scalar block
decoding regressed query p95 at 1M docs from ~4.7 ms to ~25 ms — and the
FineWeb top-k queries turned out to be scan-bound, not prune-bound, so WAND
pruning could not recover it. Decode speed is the lever.

Full 128-doc posting blocks (doc ids and frequencies) are now packed and
unpacked with `bitpacking`'s SIMD `BitPacker4x`; the partial final block of
each term keeps the scalar codec. The byte layout and `BlockMeta` widths are
unchanged, so the two codecs interoperate per block. Positions stay VByte
(decoded lazily, off the hot path).

Adds the `bitpacking` crate dependency.
Reports term vs phrase p50/p95 separately so a latency regression can be
attributed to one query class instead of the blended percentile.
Diagnosis (term/phrase latency split): term queries were already fast
(p95 ~0.6 ms, ahead of Lucene) — the blended p95 of ~25 ms was entirely
phrase queries. `search_phrase` calls `positions()` for the candidate docs
of the rare driving term, but each call VByte-decoded the *whole* 128-doc
block's positions to use one — ~128x wasted work, scattered across blocks.

A document's position count equals its term frequency, so no per-doc count
needs storing: each block's positions are now one bit-packed delta stream,
and doc `i`'s slice is located from the freq prefix sum. `bitpack_read_at`
reads that slice directly — phrase decodes one document's positions, never
the block. This also drops the per-doc count bytes, so the footprint is
slightly smaller. Replaces the VByte position codec entirely.
run_fts_compare.sh selected the bench binary with `sort | tail -1` — a
lexical sort by Cargo's build hash. When a dependency change rotates that
hash, an older binary can sort last, so the harness silently ran stale code.
Now it clears old binaries before building and selects by modification time.
…er API

`initialize_mem_wal` is now a builder (`.maintained_indexes(...).execute()`)
rather than taking a `MemWalConfig`; update the benchmark accordingly after
rebasing onto main.
The FTS MemTable index runs its own block-max WAND over its compressed
in-memory posting layout, so the `wand_search` / `WandTerm` entry point
added to `lance-index` early in this work is now dead. Revert it and the
`wand` re-export — `lance-index` is left with a single one-identifier
change: `Scorer` joins `MemBM25Scorer` in the `inverted` re-export so the
in-memory index can call the scorer's trait methods.
The `lance` crate gained a `bitpacking` dependency; the separate pylance
lockfile must list it so CI's `--locked` lint passes.
…fra references

Relocate the FTS bench files out of the top-level bench/ folder into
rust/lance/benches/mem_wal/, with an fts/ subfolder for FTS-specific
benchmarks and drivers. Replace hard-coded S3 buckets, instance paths,
and AWS region exports in the driver scripts and bench doc comments
with generic placeholders and tmpdir-based defaults.
Copy link
Copy Markdown
Contributor

@jackye1995 jackye1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me!

@jackye1995 jackye1995 merged commit f869fd1 into lance-format:main May 19, 2026
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-python Python bindings

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants