perf: fix O(N·K) slow row-id lookup on stable-row-id datasets #6716
Conversation
`Dataset._take_rows` could effectively hang on datasets with stable row IDs enabled and any fragment containing deletions. `RowIdIndex::get` ends in `U64Segment::get(pos)`, and for `RangeWithHoles` / `RangeWithBitmap` this was implemented as `self.iter().nth(i)` — an O(i) walk through the filtered iterator. Called once per input id in `TakeBuilder::get_row_addrs`, the total cost was O(N · K). At 1000 ids against a 4M-row dataset with 8 interior deletions, the take ran in 67s; the same workload now completes in ~7ms.

`RangeWithHoles::get` now binary-searches the hole-count function (which is monotone in `holes[k] - k`) for O(log H) lookups. `RangeWithBitmap::get` walks bytes with `count_ones` until it finds the byte holding the i-th set bit, then `trailing_zeros` within that byte — O(K/64). `RangeWithHoles::position` swaps `take_while().count()` for `binary_search`, also O(log H).

Adds `RowIdIndex::get_many`, which sorts a working copy of the input and sweeps the chunk iterator once, amortizing the per-id `RangeInclusiveMap` walk from O(N · log F) to O(F + N). `TakeBuilder::get_row_addrs` now routes through it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
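To make the first of those concrete, here is a minimal standalone sketch of the binary-search idea (not the PR's actual code; `start`, `end`, and `holes` are hypothetical stand-ins for the segment's fields):

```rust
/// Return the i-th surviving value of `start..=end` after removing `holes`
/// (sorted, unique, all inside the range), or None if i is out of bounds.
fn get(start: u64, end: u64, holes: &[u64], i: u64) -> Option<u64> {
    // f(k) = holes[k] - k is monotone non-decreasing because the holes are
    // sorted and unique, so binary search can find how many holes land at
    // or before the i-th survivor.
    let (mut lo, mut hi) = (0usize, holes.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if holes[mid] - mid as u64 > start + i {
            hi = mid; // the i-th survivor lies below holes[mid]
        } else {
            lo = mid + 1; // holes[mid] sits at or before the answer: skip it
        }
    }
    // Exactly `lo` holes precede the answer, so shift the index past them.
    let value = start + i + lo as u64;
    (value <= end).then_some(value)
}
```

With `start = 0` and `holes = [2, 5]` the survivors run 0, 1, 3, 4, 6, …, and `get(0, 9, &[2, 5], 2)` returns `Some(3)` after skipping one hole.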
@claude review once
No bugs found — the algorithm is correct and the description's math holds up (the binary search uses the monotone `holes[k] - k`, and the bitmap select1 relies on `new_full` / `From<&[bool]>` both zero-padding bits past `len`, which I verified).
Extended reasoning
Overview
Three files modified, ~140 net lines:
- `rust/lance-table/src/rowids/segment.rs`: rewrites `RangeWithHoles::get` (O(N) → O(log H) via binary search on the monotone function `holes[k] - k`), `RangeWithBitmap::get` (O(N) → O(K/64) via byte popcount + `trailing_zeros`), and `RangeWithHoles::position` (O(N) → O(log H)).
- `rust/lance-table/src/rowids/index.rs`: adds `RowIdIndex::get_many` — sorts a working copy of input ids and sweeps the `RangeInclusiveMap` chunk iterator once.
- `rust/lance/src/dataset/take.rs`: routes `TakeBuilder::get_row_addrs` through `get_many`.
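The bitmap path can be sketched the same way. Below is a hypothetical standalone `select1` over raw bytes, assuming LSB-first bit order and zero bits past the logical length (the invariant verified below); the PR's version operates on the segment's own bitmap representation:

```rust
/// Return the position of the i-th set bit (0-indexed) in a byte-packed
/// bitmap, or None if fewer than i + 1 bits are set. Assumes LSB-first
/// bit order and zero bits past the logical length.
fn select1(bytes: &[u8], mut i: u32) -> Option<usize> {
    for (byte_idx, &byte) in bytes.iter().enumerate() {
        let ones = byte.count_ones();
        if i < ones {
            // The target bit is inside this byte: clear its i lowest set
            // bits, then trailing_zeros yields the remaining lowest one.
            let mut b = byte;
            for _ in 0..i {
                b &= b - 1; // drop the lowest set bit
            }
            return Some(byte_idx * 8 + b.trailing_zeros() as usize);
        }
        i -= ones; // skip the whole byte with one popcount
    }
    None
}
```

Skipping whole bytes with `count_ones` is the source of the speedup; this byte-at-a-time sketch is O(K/8), and a word-at-a-time variant over `u64` gives the O(K/64) the PR cites.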
The motivating bug: `Dataset._take_rows` on stable-row-id datasets with any deletions degraded from milliseconds to ~67s for 1000 ids on 4M rows.
Security risks
None. This is purely an algorithmic performance fix in internal row-id resolution. No new I/O, no new external input parsing, no auth/crypto/permissions. The new code does not introduce panics — the `holes.get(mid).unwrap()` is bounded by `hi <= holes.len()` and `mid < hi`.
Level of scrutiny
Higher-than-minimum. This is not a mechanical change — it is a non-trivial algorithmic rewrite of code on the hot path for `take_rows`, with subtle correctness invariants:

- `RangeWithHoles::get` correctness depends on `holes[k] - k` being monotone non-decreasing (true because holes are sorted and unique).
- `RangeWithBitmap::get` correctness depends on bytes past `bitmap.len()` being zero-padded — verified true for `new_full` (explicitly zeros tail bits) and `From<&[bool]>` (starts from `new_empty`, which is all-zero).
- `get_many` correctness depends on the input being sorted before the sweep, with the `else { break }` path safe because `out` is pre-initialized to `None`.
I worked through these by hand and they hold, but a human maintainer familiar with the rowid layer should still confirm — especially because this code path silently affects all stable-row-id reads.
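The sweep itself is easiest to see in a standalone model. Everything below is hypothetical: `chunks` stands in for the sorted, disjoint `RangeInclusiveMap` entries (each mapping a row-id range to its addresses), and this sketch sorts a permutation of positions rather than a working copy of the ids, which preserves output order the same way:

```rust
use std::ops::RangeInclusive;

/// One forward pass over both the sorted ids and the chunk list.
/// Ids that match no chunk keep their pre-initialized None.
fn get_many(chunks: &[(RangeInclusive<u64>, Vec<u64>)], ids: &[u64]) -> Vec<Option<u64>> {
    // Sort positions by id so both cursors only ever move forward.
    let mut order: Vec<usize> = (0..ids.len()).collect();
    order.sort_unstable_by_key(|&p| ids[p]);

    let mut out = vec![None; ids.len()];
    let mut chunk_iter = chunks.iter();
    let mut current = chunk_iter.next();
    for p in order {
        let id = ids[p];
        // Advance past chunks that end before this id; never rewind.
        while let Some((range, _)) = current {
            if *range.end() < id {
                current = chunk_iter.next();
            } else {
                break;
            }
        }
        match current {
            Some((range, addrs)) if range.contains(&id) => {
                out[p] = addrs.get((id - range.start()) as usize).copied();
            }
            _ => {} // no chunk covers this id: out[p] stays None
        }
    }
    out
}
```

Because both cursors only move forward, the chunk list is traversed at most once no matter how many ids are looked up, which is the O(F + N) amortization the description claims.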
Other factors
Test coverage is strong: existing proptest in `rowids/index.rs`, comprehensive `test_segments` covering all variants of get/position/iteration round-trip, and a new perf-regression test (`test_large_range_segments_no_deletions`) that would fail if the optimization is undone. Codecov flags 6 uncovered lines (mostly in error/edge paths). The reviewer (westonpace) explicitly asked for a review pass, which also suggests this should not skip a human.
westonpace
left a comment
Awesome, thanks for the fix. I think my only question is whether we should also close #6486 now?