feat: add CommitStrategy enum with Pessimistic and Optimistic modes#6836
feat: add CommitStrategy enum with Pessimistic and Optimistic modes#6836Jay-ju wants to merge 11 commits into
Conversation
Add configurable commit strategy for transaction conflict resolution, allowing users to choose the best approach for their workload: - Pessimistic (default): Always rebase before commit. Safest for high-contention workloads with Delete/Update/Rewrite operations. Maintains backward compatibility with existing behavior. - Optimistic: Attempt to commit first without rebase, only rebase on conflict (CommitConflict). Significantly faster for add-only workloads (Append, CreateIndex) by skipping load_and_sort_new_transactions and TransactionRebase::try_new IO on the fast path. Benchmark results (30 appenders + 15 deleters + 15 updaters, release): - Total throughput: +52% vs Pessimistic (3.8 vs 2.5 ops/s) - Total latency: -34.5% (117s vs 179s) - Append avg: -32.3%, Delete avg: -38.6%, Update avg: -48.4% - Overall p99: -59.9% - Hybrid: Optimistic for add-only operations (Append, Overwrite, CreateIndex, etc.), pessimistic for fragment-modifying operations (Delete, Update, Rewrite). Best of both worlds for mixed workloads. Changes: - Add CommitStrategy enum in lance-table/src/io/commit.rs - Add commit_strategy field to CommitConfig (default: Pessimistic) - Implement strategy dispatch in lance/src/io/commit.rs commit_transaction - Add CommitBuilder::with_commit_strategy() in dataset/write/commit.rs - Add concurrent_bench.py for mixed-operation benchmarking
Move concurrent_bench.py to benchmarks/concurrent/benchmark.py following the project's benchmark directory convention. Also add Apache License header and read credentials from environment variables instead of hardcoding.
…timistic Benchmark results show Optimistic outperforms both Pessimistic and Hybrid across all operation types (Append, Delete, Update) and all latency percentiles. Hybrid was strictly worse than Optimistic because: - For add-only ops, Hybrid behaves the same as Optimistic - For modifying ops, Hybrid uses pessimistic rebase which adds IO overhead that exceeds the cost of a failed commit attempt under Optimistic Removing Hybrid simplifies the API and avoids offering a strategy that is never the best choice.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Move the standalone benchmark script to python/ci_benchmarks/benchmarks/ using pytest-benchmark, consistent with the project's existing CI benchmark infrastructure (test_scan.py, test_search.py, etc.). Benefits: - Standard latency statistics via pytest-benchmark (min/max/avg/median/stddev) - JSON output support via --benchmark-json for CI reporting - Grouped benchmark results via @pytest.mark.benchmark(group=...) - Three test scenarios: mixed, high-concurrency, append-only
- Fix ruff format: line length, f-string formatting in benchmark test - Fix cargo fmt: import line wrapping in commit.rs
|
ACTION NEEDED The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error please inspect the "PR Title Check" action. |
This is a helpful finding. I originally wrote the initial behavior here: #3117 Back then, I would have thought the pessimistic strategy would be better for high contention, at the cost of being slower when there is low contention. But looks like empirically I was wrong. Thanks for running this benchmark. TBH, I'm not sure we need to add a setting for this. I think it would be totally acceptable just change the default to optimistic, especially given "Optimistic strategy consistently outperforms Pessimistic across all operation types and concurrency levels." If the results were mixed, I could see the case for having a setting. But if it's better in all cases, I think it's fine to change the behavior. Plus this isn't a behavior that users rely upon, it's most of an optimization. So it's not a breaking change in that sense. What do you think of just making Optimistic the default? |
@wjones127 Thanks for the context! I agree — since Optimistic outperforms across all scenarios, making it the default is the right call. I have updated the PR already. For context, the motivation came from LLM agent workflows where LanceDB serves as the memory module: each conversation turn triggers an individual write, creating high write concurrency that made the rebase overhead very visible. One question while on this topic — under extreme write concurrency (e.g. many agents writing simultaneously), even the optimistic strategy still faces repeated conflict retries. I've been thinking about whether a WAL/buffer layer on top of Lance could help absorb writes and batch-flush them, similar to LSM-tree approaches. Do you have any thoughts on whether this is a direction worth exploring, or if there are other approaches you'd recommend for high-concurrency scenarios? |
Update tests and benchmarks to reflect the new default: - test_commit_iops: read_iops 0 (no rebase), num_stages 2 - test_commit_conflict_iops: account for optimistic retry pattern - test_reuse_session: relax I/O assertions (conflict-dependent) - benchmark docstring: update usage examples
…O assertions for Optimistic strategy 1. migrate_indices: When handle_rewrite_indices changes an index uuid during compact_files, migrate_indices would try to open the index with the new uuid which doesn't exist on disk yet. Now we check if the index uuid exists in dataset.load_indices() before attempting to open it, and skip recalculation with a debug log if it doesn't exist. 2. test_ddb_open_iops: With Optimistic as the default commit strategy, the first attempt skips rebase, eliminating the read IOPS for listing transactions. Updated assertions from read_iops=1 to read_iops=0 for both initial commit and append operations. 3. WriteParams: Added commit_strategy field so callers can explicitly control the commit strategy used by InsertBuilder.
In Optimistic commit mode, build_manifest may inherit a stale config from dataset.manifest if the manifest is not the latest version. This causes auto_cleanup to continue triggering after disable_auto_cleanup() because the new manifest inherits the old auto_cleanup config. The fix: when auto_cleanup_hook detects auto_cleanup config in the committed manifest, it now re-validates against the latest manifest on disk. If the latest manifest does not have auto_cleanup config, the hook skips cleanup. This prevents stale config from causing unintended cleanup after auto_cleanup has been disabled. This only adds an extra I/O when the committed manifest has auto_cleanup config, so the common case (no auto_cleanup) is not affected.
Under Optimistic commit strategy, UpdateConfig operations (like disable_auto_cleanup) could commit based on a stale dataset snapshot, creating out-of-order versions. For example, disable_auto_cleanup on a stale dataset (version 2) would create version 3, while the latest on disk is version 6. Subsequent appends based on version 6 would still inherit the old auto_cleanup config, defeating the disable. The fix: UpdateConfig and Overwrite (with config_upsert_values) operations now always rebase under Optimistic strategy, ensuring config changes are applied to the latest version. This guarantees that subsequent operations inherit the correct config. Also simplified auto_cleanup_hook: when the committed manifest has auto_cleanup config, it validates against the latest manifest on disk. If the latest manifest does not have auto_cleanup config, cleanup is skipped. This serves as a safety net for edge cases.
Yeah, that's essentially the work that's going on in #3985 |
wjones127
left a comment
There was a problem hiding this comment.
since Optimistic outperforms across all scenarios, making it the default is the right call. I have updated the PR already.
Actually, what do you think about going a step further, and just eliminating the option? Just change the behavior directly to use optimistic always.
| log::debug!( | ||
| "Skipping fragment_bitmap recalculation for index {} (uuid: {}) because it does not exist on disk. \ | ||
| This likely means the index was remapped during this commit and the uuid was changed.", | ||
| index.name, | ||
| index.uuid | ||
| ); |
There was a problem hiding this comment.
issue(blocking): this seems like a drive-by bug fix. Could we save this for a different PR and make sure we have a issue describing it?
| Ok((latest, _)) => { | ||
| if latest.config.contains_key("lance.auto_cleanup.interval") { | ||
| latest.config.clone() | ||
| } else { | ||
| log::info!( | ||
| "auto_cleanup skipped: committed manifest (v{}) has auto_cleanup config but latest manifest (v{}) does not. \ | ||
| This likely means auto_cleanup was disabled by a concurrent commit or the committed manifest inherited \ | ||
| stale config from an outdated dataset snapshot under Optimistic commit strategy.", | ||
| manifest.version, | ||
| latest.version | ||
| ); | ||
| return Ok(None); | ||
| } |
There was a problem hiding this comment.
issue(blocking): this also seems like a drive-by change that could use it's own issue.
Add CommitStrategy: Optimistic vs Pessimistic Conflict Resolution
Summary
This PR introduces a
CommitStrategyenum that allows users to choose between Pessimistic (default, current behavior) and Optimistic commit strategies for concurrent write conflict resolution.Background
Currently, Lance always performs a "rebase" before every commit attempt — loading new transactions since the last read and running
TransactionRebase::try_new()to reconcile conflicts. This is safe but adds significant I/O overhead on every attempt, even when there is no conflict. Under high concurrency, this rebase cost becomes a bottleneck.Changes
Core Logic (
rust/lance/src/io/commit.rs)The commit loop now conditionally skips rebase based on the configured strategy:
backoff.attempt() > 0). This skips the rebase I/O on the fast path (no conflict), and even when a conflict occurs, the cost of a failed commit attempt is typically lower than the rebase I/O overhead.New Types (
rust/lance-table/src/io/commit.rs)CommitStrategyenum withPessimistic(default) andOptimisticvariantsCommitConfig.commit_strategyfieldPublic API (
rust/lance/src/dataset/write/commit.rs)CommitBuilder::with_commit_strategy(strategy)— allows users to configure the commit strategyBenchmark (
python/python/ci_benchmarks/benchmarks/test_concurrent_write.py)Added a
pytest-benchmarkbased concurrent write benchmark that measures throughput and latency under different concurrency levels and operation mixes (append, delete, update).Benchmark Test Report
How to Reproduce
Test Environment
maturin develop --release(Release mode)Results — Mixed Workload (20 appenders, 10 deleters, 10 updaters)
Results — High Concurrency (30 appenders, 15 deleters, 15 updaters)
Key Findings
TransactionRebase::try_new()which loadsinitial_fragments— an expensive I/O operation that Optimistic skips on the fast path.load_and_sort_new_transactionscall.Files Changed
rust/lance-table/src/io/commit.rsCommitStrategyenum andcommit_strategyfield inCommitConfigrust/lance/src/io/commit.rsCommitStrategyrust/lance/src/dataset/write/commit.rsCommitBuilder::with_commit_strategy()APIpython/python/ci_benchmarks/benchmarks/test_concurrent_write.pyBackward Compatibility
Pessimistic, which preserves the existing behavior.with_commit_strategy().