Optimize rolling checksum bulk rolls by oferchen · Pull Request #1933 · oferchen/rsync

oferchen · 2025-11-04T07:57:24Z

Summary

* Tighten `roll_many`'s bulk math by caching the iteration weight and reusing a scalar fallback helper.
* Guard the wide-roll fast path against indices that exceed `u32`, so extremely large windows fall back to the scalar loop.
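
The two changes above can be illustrated with a minimal sketch. It is not the code from this PR: the `RollingSketch` type, its fields, the `roll_one` and `roll_scalar` helpers, and the `roll_many` signature are all hypothetical, and the checksum is assumed to be the rsync-style weak checksum with two 16-bit halves. The bulk path keeps a running weight that starts at the number of rolled bytes and decreases by one per iteration, and it falls back to the scalar loop when that count does not fit in a `u32`.

```rust
/// Hypothetical rolling-checksum state; not the crate's actual type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct RollingSketch {
    s1: u32,  // sum of window bytes, kept modulo 2^16
    s2: u32,  // sum of prefix sums, kept modulo 2^16
    len: u32, // window length in bytes
}

impl RollingSketch {
    fn new(window: &[u8]) -> Self {
        let (mut s1, mut s2) = (0u32, 0u32);
        for &b in window {
            s1 = s1.wrapping_add(u32::from(b));
            s2 = s2.wrapping_add(s1);
        }
        let len = u32::try_from(window.len()).expect("sketch assumes a window below 4 GiB");
        Self { s1: s1 & 0xffff, s2: s2 & 0xffff, len }
    }

    /// Slide the window by one byte: drop `out`, append `inn`.
    fn roll_one(&mut self, out: u8, inn: u8) {
        let (out, inn) = (u32::from(out), u32::from(inn));
        self.s1 = self.s1.wrapping_sub(out).wrapping_add(inn) & 0xffff;
        self.s2 = self.s2
            .wrapping_sub(self.len.wrapping_mul(out))
            .wrapping_add(self.s1) & 0xffff;
    }

    /// Scalar fallback helper: one `roll_one` per byte pair.
    fn roll_scalar(&mut self, outgoing: &[u8], incoming: &[u8]) {
        for (&o, &i) in outgoing.iter().zip(incoming) {
            self.roll_one(o, i);
        }
    }

    /// Bulk roll: equivalent to `roll_scalar`, but s1 and s2 are updated once.
    fn roll_many(&mut self, outgoing: &[u8], incoming: &[u8]) {
        assert_eq!(outgoing.len(), incoming.len());
        // Guard: the per-iteration weight must fit in u32, else use the scalar loop.
        let Ok(count) = u32::try_from(outgoing.len()) else {
            self.roll_scalar(outgoing, incoming);
            return;
        };
        let mut sum_out = 0u32;  // sum of outgoing bytes
        let mut delta = 0u32;    // sum of (in - out)
        let mut weighted = 0u32; // sum of (count - i) * (in - out)
        let mut weight = count;  // cached iteration weight, decremented each step
        for (&o, &i) in outgoing.iter().zip(incoming) {
            let (o, i) = (u32::from(o), u32::from(i));
            let d = i.wrapping_sub(o);
            sum_out = sum_out.wrapping_add(o);
            delta = delta.wrapping_add(d);
            weighted = weighted.wrapping_add(weight.wrapping_mul(d));
            weight -= 1; // reaches 0 exactly when the loop ends
        }
        let old_s1 = self.s1;
        self.s1 = old_s1.wrapping_add(delta) & 0xffff;
        self.s2 = self.s2
            .wrapping_sub(self.len.wrapping_mul(sum_out))
            .wrapping_add(count.wrapping_mul(old_s1))
            .wrapping_add(weighted) & 0xffff;
    }
}
```

Because 2^16 divides 2^32, wrapping `u32` arithmetic followed by a 16-bit mask gives the same result as reducing after every step, which is why the bulk and scalar paths agree in this sketch.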

Testing

cargo test -p rsync-checksums

https://chatgpt.com/codex/tasks/task_e_6909b05a91288323aec39c2107cb63bc

Defines the empirical benchmark harness, workloads (100 / 1k / 10k concurrent clients), metrics, soft-limit triggers, comparison oracle against upstream rsync 3.4.1, and the decision criteria that gate the async listener migration tracked under #1935. Frames the active-counter fix from the parent audit (#1673, PR #3705) as a strict precondition. Tracking: oc-rsync task #1933.

…#3891) Slim runnable plan complementing the broader benchmark plan. Specifies the minimum harness needed to land first measured numbers on Linux loopback at 100/1K/10K concurrent rsync:// pulls so #1934 RFC and #1935 async-listener work can compare sync vs async paths against quantified sync-baseline ttfb, completion p99, peak RSS, and thread count.

#1933) (#4182)

* chore(bench): stress harness for thread-per-connection scaling

  Adds an integration-test-shaped stress benchmark that drives 100, 1000, and 10000 concurrent TCP clients against the daemon listener, capturing wall time, ECONNREFUSED / EMFILE counts, and peak RSS via getrusage. All three scenarios are marked #[ignore]; the 10k case is unix-only and self-skips when RLIMIT_NOFILE cannot accommodate the request. The harness exists to provide evidence for whether an async listener (tracked separately) would be a meaningful change to the current std::thread::spawn-per-connection model.

* style(daemon): apply rustfmt to connection_scaling_stress

* fix(daemon): convert ru_maxrss via i64::from for cross-arch portability

* fix(daemon): use ru_maxrss directly without conversion

  The conversion was redundant on all targets we build on: c_long is i64 on 64-bit Linux/macOS (no conversion needed) and i32 on 32-bit Linux (the subsequent 'as u64' sign-extends to i64 first, then saturating_mul operates on u64). Removing the conversion silences both clippy::unnecessary_cast and clippy::useless_conversion without losing overflow safety.
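
The shape of such a stress run can be sketched with the standard library alone. This is an illustrative sketch, not the harness from #4182: `StressReport`, `spawn_echo_listener`, `one_client`, and `run_stress` are hypothetical names, the listener is a one-byte echo stand-in for the daemon, and peak RSS via getrusage is left out because it needs a platform binding. It measures only wall time and the number of failed clients.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical result of one stress run.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct StressReport {
    connected: usize,
    failed: usize,
}

/// Starts a detached thread-per-connection echo listener on an ephemeral port.
fn spawn_echo_listener() -> std::io::Result<SocketAddr> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    thread::spawn(move || {
        for stream in listener.incoming() {
            let Ok(mut stream) = stream else { continue };
            // One OS thread per accepted connection, as in the model under test.
            thread::spawn(move || {
                let mut byte = [0u8; 1];
                if stream.read_exact(&mut byte).is_ok() {
                    let _ = stream.write_all(&byte);
                }
            });
        }
    });
    Ok(addr)
}

/// One client: connect, send a byte, wait for the echo.
fn one_client(addr: SocketAddr) -> std::io::Result<()> {
    let mut stream = TcpStream::connect_timeout(&addr, Duration::from_secs(5))?;
    stream.set_read_timeout(Some(Duration::from_secs(5)))?;
    stream.write_all(&[0x2a])?;
    let mut byte = [0u8; 1];
    stream.read_exact(&mut byte)?;
    Ok(())
}

/// Drives `clients` concurrent connections and reports outcomes plus wall time.
fn run_stress(addr: SocketAddr, clients: usize) -> (StressReport, Duration) {
    let start = Instant::now();
    let workers: Vec<_> = (0..clients)
        .map(|_| thread::spawn(move || one_client(addr)))
        .collect();
    let mut report = StressReport::default();
    for worker in workers {
        match worker.join() {
            Ok(Ok(())) => report.connected += 1,
            _ => report.failed += 1, // connect, I/O, or panic failure
        }
    }
    (report, start.elapsed())
}
```

Calling `run_stress(addr, 100)` against the address returned by `spawn_echo_listener()` corresponds to the smallest scenario. The larger scenarios would additionally need the RLIMIT_NOFILE check that the commit message describes.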

Add a focused evaluation of async runtime options for the daemon accept loop. Compares tokio against async-std and the existing thread-per-connection model, and records the decision to adopt tokio with the rt-multi-thread flavour under the existing async feature gate. The doc complements the implementation plan in #1935 and the benchmark plan in #1933 rather than restating either. It covers maintenance posture, feature parity, ecosystem alignment, the case for staying threaded, the case for tokio, migration cost, trigger conditions, and a five-step adoption sequence.

Optimize rolling checksum bulk rolls

ef24520

oferchen added the codex label Nov 4, 2025 — with ChatGPT Codex Connector

oferchen merged commit ca61922 into master Nov 4, 2025

oferchen deleted the implement-missing-features-for-oc-rsync branch November 4, 2025 07:57

oferchen mentioned this pull request May 7, 2026

docs(audits): daemon thread-per-connection benchmark plan #3815

Merged

3 tasks

This was referenced May 14, 2026

docs(audits): scope daemon thread-per-connection ceiling and admission gap (#1673) #4023

Merged

bench(daemon): thread-per-connection scaling at 100/1k/10k connections (#1933) #4182

Merged

oferchen mentioned this pull request May 17, 2026

docs(design): daemon async runtime choice - tokio vs async-std vs threaded (#1367, #1590) #4247

Merged

3 tasks

oferchen merged 1 commit into master from implement-missing-features-for-oc-rsync
