compute: make correction buffer reads slice-proportional#36898
Draft
antiguru wants to merge 1 commit into
Draft
Conversation
56b7242 to
ac04452
Compare
CorrectionV2's updates_before restructured the entire buffer on every call: it converted all chains to Rc-wrapped cursors, merged up to the upper, rebuilt the remainders, and re-merged chains to restore the chain invariant. Each read cost O(total buffered chunks) rather than O(drained slice), so an MV sink catching up through T distinct timestamps paid O(T^2 / chunk_capacity). At 65536 timestamps (a day of 1s ticks) a stepwise drain took 13.5s against CorrectionV1's 0.47s, matching observed hydration stalls. Restructure the buffer so reads only touch the drained slice: * A BucketChain partitions times at or beyond the largest read upper into buckets of exponentially growing time ranges, each holding chains maintained with the chain invariant. Reads peel only the buckets below their upper; far-future updates are left alone. * Chains split at the upper at chunk granularity (Chain::split_at_time) reusing whole chunks; at most one straddling chunk per chain is copied. The Rc round trip through cursors is gone. * Updates emitted by a read stay in a designated emitted chain, outside any invariant, until persist feedback cancels them. The previous read's emitted chain is merged with the newly drained updates, so future chains are never re-merged by reads. * A since jump across many distinct buffered timestamps is collapsed with one sort-and-consolidate pass over the affected updates instead of merging one cursor run per distinct stale time. This removes Cursor's limit/overwrite_ts machinery (advance_by, skip_time, set_limit, split_at_time). The public interface, metrics, introspection logging, and dyncfgs are unchanged; CorrectionV1 remains the fallback via enable_correction_v2. BucketChain::restore gains a fast path that skips rebuilding the bucket map when the chain is already well-formed, and BucketChain exposes bucket iterators for metrics reporting. A new micro-benchmark (cargo bench -p mz-compute --features bench --bench correction) drives both implementations through insert, stepwise-drain-with-feedback, and since-jump scenarios. The bench feature only gates pub visibility of the sink correction modules. Stepwise drain at 16384 timestamps drops from 858ms to 240ms and now scales linearly in the timestamp count (previously ~T^1.8); the since jump drops from 250ms to 87ms, below CorrectionV1's 104ms. New tests assert emission equivalence with CorrectionV1 under a stepwise-drain-with-feedback workload, since-jump collapse, and the upper-not-beyond-since edge case. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ac04452 to
feb4b09
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
MVs using the v2 correction buffer can stall during hydration when the input has to catch up through many distinct timestamps.
CorrectionV2::updates_beforerestructures the entire buffer on every call — Rc-wrapping every chunk, merging to the upper, rebuilding remainders, and re-merging chains for the chain invariant — so each read costs O(total buffered chunks) instead of O(drained slice), and a catch-up through T timestamps pays O(T²/chunk_capacity).A micro-benchmark reproduces this: at 65536 timestamps (one day of 1s ticks at 16 updates/tick) a stepwise drain takes 13.5s per worker against CorrectionV1's 0.47s, with the ratio growing in T (~T^1.8).
Description
Restructures
CorrectionV2so reads only do work proportional to the drained slice, keeping the public interface, metrics, introspection logging, and dyncfgs unchanged.CorrectionV1remains the fallback viaenable_correction_v2. Rebased on top of #36577, so chunks use the columnar chunk region storage.BucketChainof exponentially growing time-range buckets, each holding (time, data)-sorted chains under the chain invariant. Reads peel only the buckets below their upper; far-future updates (e.g. temporal-filter retractions) are never touched.Chain::split_at_time), reusing whole chunks and copying at most one straddling chunk per chain.emittedchain outside any invariant, so reads never re-merge future chains. (We also evaluated a destructivetake_before/insert_batchinterface; it measured CPU-neutral because feedback cancellation forces the same merge either way, so the interface stays as-is.)Cursor::advance_byrun-splitting; many stale times (a since jump, e.g. a sink restarting with an old as-of) are collapsed in one sort-and-consolidate pass. The always-bulk variant regressed the ReplicaExpiration feature benchmark by 31%; with the hybrid it is 3.3% faster than the merge base.BucketChainexposes bucket iterators for metrics reporting, andrestoreis invoked with bounded fuel per buffer operation — incomplete restoration is picked up by the next operation, so reads never stall the operator. Builds on therestorefast path from timely-util: avoid rebuilding well-formed bucket chains in restore #36897.Benchmark results (stepwise drain with persist feedback, 16 updates per timestamp, medians, on top of #36577's columnar chunks):
Scaling is now linear in T (was ~T^1.8); the since-jump scenario improves ~3× and inserts are unchanged. ("v2 before" measured pre-#36577; the columnar chunk storage adds a constant on this microbenchmark but does not change the asymptotics.)
Verification
New micro-benchmark
cargo bench -p mz-compute --features bench --bench correctioncovering insert, stepwise drain with feedback, since jumps, and a temporal-filter pattern; thebenchfeature only gatespubvisibility.New tests assert emission equivalence with
CorrectionV1under a stepwise-drain-with-feedback workload, since-jump collapse onto the since, and that reads never observe times at or beyond their upper.The ReplicaExpiration feature benchmark scenario was verified locally against the merge base (3.3% faster, less memory).
🤖 Generated with Claude Code