Accelerate rolling checksum with SIMD fast paths #1916
Merged
oferchen
added a commit
that referenced
this pull request
May 1, 2026
…1916) (#3498)

Adds a bidirectional daemon-mode interop scenario that exercises --iconv=UTF-8,ISO-8859-1 between oc-rsync and upstream rsync 3.4.1.

The test creates a deterministic source tree with UTF-8 filenames whose code points all fit in Latin-1 (café.txt, über.txt, ångström.txt) plus an ASCII baseline, then drives two transfers:

1. upstream client -> oc-rsync daemon (charset = ISO-8859-1)
2. oc-rsync client -> upstream daemon (charset = ISO-8859-1)

Each transfer is verified by re-reading the destination filenames and comparing content byte-for-byte against the source.

Daemon-mode iconv negotiation in oc-rsync is still incomplete: the `charset =` directive is parsed (crates/daemon/.../module_directives.rs) but never threaded into the iconv runtime setup. Findings 1-3 of the audit (symlink target transcoding, --files-from forwarding, --secluded-args/--protect-args transcoding) also remain open. The scenario is therefore added to KNOWN_FAILURES and DASHBOARD_ENTRIES so CI tracks the gap without blocking, and check_known_failures.sh gets a matching reproducer so the dashboard can rerun it.

References:
  upstream: options.c:recv_iconv_settings, flist.c:1579-1603, flist.c:738-754
  oc-rsync: docs/audits/iconv-pipeline.md

Touched files:
- tools/ci/run_interop.sh: new test_iconv_upstream_interop function, added "iconv-upstream" to standalone test_names/test_funcs arrays, port-injection case branch.
- tools/ci/known_failures.conf: KNOWN_FAILURES + DASHBOARD_ENTRIES entry.
- tools/ci/check_known_failures.sh: dashboard reproducer dispatch case.
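The fixture names in the commit message above are chosen so that they survive a UTF-8 to ISO-8859-1 round trip: every code point in U+0000..U+00FF maps to a single Latin-1 byte with the same numeric value. The following is a minimal sketch of that property only. The helper names `utf8_to_latin1` and `latin1_to_utf8` are hypothetical and are not part of oc-rsync, which performs the conversion through its iconv pipeline rather than through code like this.

```rust
/// Hypothetical helper (not oc-rsync API): encode a UTF-8 name as
/// ISO-8859-1 bytes. Returns `None` if any code point is above U+00FF,
/// i.e. the name cannot be represented in Latin-1.
pub fn utf8_to_latin1(name: &str) -> Option<Vec<u8>> {
    name.chars()
        .map(|c| u8::try_from(u32::from(c)).ok())
        .collect()
}

/// Hypothetical helper (not oc-rsync API): decode ISO-8859-1 bytes back
/// into a UTF-8 `String`. Every Latin-1 byte maps to the Unicode code
/// point with the same value, so this direction cannot fail.
pub fn latin1_to_utf8(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}
```

A name such as `café.txt` (with a precomposed U+00E9) becomes the bytes `caf\xe9.txt` on the wire and decodes back to the same string, while a name containing code points outside Latin-1 is rejected by the sketch instead of being silently mangled.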
This was referenced May 1, 2026
oferchen
added a commit
that referenced
this pull request
May 1, 2026
…1916)

Add test_iconv_local_ssh_interop to tools/ci/run_interop.sh covering the SSH/local-mode side of --iconv interop with upstream rsync 3.4.1, the path PR #3458 wired up via IconvSetting -> FilenameConverter.

Two directions are exercised through a fake remote-shell wrapper that discards the host argument and exec's the rest locally:

a) oc-rsync sender -> upstream receiver (--rsh=fake --rsync-path=upstream)
b) upstream sender -> oc-rsync receiver (--rsh=fake --rsync-path=oc-rsync)

The fixture is UTF-8 source filenames whose code points all fit in ISO-8859-1 (cafe, uber, naive, Zurich); --iconv=UTF-8,ISO-8859-1 forces Latin-1 wire encoding while the local charset stays UTF-8.

The companion daemon-mode scenario, "standalone:iconv-upstream", stays in known_failures.conf until daemon-side `charset =` plumbing lands (#1911-#1917 per docs/audits/iconv-pipeline.md). Comments updated to make the SSH/local vs daemon split explicit.

Pre-checks:
- upstream binary version availability (graceful skip).
- upstream iconv compile-time support (graceful skip on --disable-iconv).
- host filesystem accepts UTF-8 names (graceful skip).

References:
  upstream: options.c:recv_iconv_settings, flist.c:738-754, 1579-1603
  oc-rsync: docs/audits/iconv-pipeline.md (Findings 1-7)
oferchen
added a commit
that referenced
this pull request
May 1, 2026
…3535)

Maps each entry in the unconditional KNOWN_FAILURES array and each conditional rule in is_known_failure_from_conf() to a concrete elimination path: fix in oc-rsync, permanent upstream bug, or permanent protocol-version-locked.

Cites upstream rsync 3.4.1 source (compat.c, exclude.c, token.c) and existing tracking issues (#1916, #1685, companion iconv/zstd/protocol audits) so beta-readiness criterion #3 has an explicit work plan.

Two entries are fixable (standalone:iconv-upstream daemon plumbing and standalone:delta-stats daemon-mode delta engine); six are permanent (one upstream bug, five protocol-version-locked at proto < 29 or 30).
oferchen
added a commit
that referenced
this pull request
May 2, 2026
…1916) (#3534)

* test(interop): add --iconv interop test against upstream rsync 3.4.1 (#1916)

Add test_iconv_local_ssh_interop to tools/ci/run_interop.sh covering the SSH/local-mode side of --iconv interop with upstream rsync 3.4.1, the path PR #3458 wired up via IconvSetting -> FilenameConverter.

Two directions are exercised through a fake remote-shell wrapper that discards the host argument and exec's the rest locally:

a) oc-rsync sender -> upstream receiver (--rsh=fake --rsync-path=upstream)
b) upstream sender -> oc-rsync receiver (--rsh=fake --rsync-path=oc-rsync)

The fixture is UTF-8 source filenames whose code points all fit in ISO-8859-1 (cafe, uber, naive, Zurich); --iconv=UTF-8,ISO-8859-1 forces Latin-1 wire encoding while the local charset stays UTF-8.

The companion daemon-mode scenario, "standalone:iconv-upstream", stays in known_failures.conf until daemon-side `charset =` plumbing lands (#1911-#1917 per docs/audits/iconv-pipeline.md). Comments updated to make the SSH/local vs daemon split explicit.

Pre-checks:
- upstream binary version availability (graceful skip).
- upstream iconv compile-time support (graceful skip on --disable-iconv).
- host filesystem accepts UTF-8 names (graceful skip).

References:
  upstream: options.c:recv_iconv_settings, flist.c:738-754, 1579-1603
  oc-rsync: docs/audits/iconv-pipeline.md (Findings 1-7)

* test(interop): mark iconv-local-ssh known failure pending #1911-#1913

Per docs/audits/iconv-pipeline.md Finding 4, the IconvSetting -> protocol::FilenameConverter bridge does not exist in production code, so --iconv is a no-op end to end in SSH/local mode and the test hangs on direction (a) (oc-rsync sender -> upstream receiver). The wiring lands in #1911 (config build), #1912 (sender flist emit), and #1913 (receiver flist ingest). Once those merge, remove this entry so the test starts gating regressions.
Summary
internal docs
RollingChecksum::accumulate_chunk with scalar fallback
Testing
https://chatgpt.com/codex/tasks/task_e_690928f98a548323b97dfc6d03f6ee57
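The summary above names `RollingChecksum::accumulate_chunk` and a scalar fallback but the page does not show the implementation. As a hedged illustration of the general technique, the sketch below assumes the classic rsync-style two-sum rolling checksum (a running byte sum `s1` and a running sum of sums `s2`) and shows how a fixed-width block can be folded in with one sum and one weighted sum, which is the shape that vectorizes well. The struct layout, the `LANES` width, and `accumulate_scalar` are assumptions made for this example, and no real SIMD intrinsics are used here; it is not the code merged in this pull request.

```rust
/// Hypothetical sketch of an rsync-style rolling checksum state.
/// Field names and layout are assumptions, not the oc-rsync definition.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct RollingChecksum {
    s1: u32, // running sum of bytes
    s2: u32, // running sum of every intermediate s1
}

impl RollingChecksum {
    /// Scalar fallback: fold in one byte at a time.
    pub fn accumulate_scalar(&mut self, data: &[u8]) {
        for &b in data {
            self.s1 = self.s1.wrapping_add(u32::from(b));
            self.s2 = self.s2.wrapping_add(self.s1);
        }
    }

    /// Block-wise form. For a block of n bytes b[0..n]:
    ///   s1' = s1 + sum(b[i])
    ///   s2' = s2 + n * s1 + sum((n - i) * b[i])
    /// The two sums are independent per byte, so a SIMD path could compute
    /// them with vector multiplies and horizontal adds. Here they are plain
    /// loops; the tail that does not fill a block uses the scalar fallback.
    pub fn accumulate_chunk(&mut self, data: &[u8]) {
        const LANES: usize = 16; // assumed block width for illustration
        let mut blocks = data.chunks_exact(LANES);
        for block in &mut blocks {
            let mut sum = 0u32;
            let mut weighted = 0u32;
            for (i, &b) in block.iter().enumerate() {
                // Cannot overflow: at most 16 * 255 and 16 * 16 * 255.
                sum += u32::from(b);
                weighted += (LANES - i) as u32 * u32::from(b);
            }
            self.s2 = self
                .s2
                .wrapping_add(self.s1.wrapping_mul(LANES as u32))
                .wrapping_add(weighted);
            self.s1 = self.s1.wrapping_add(sum);
        }
        self.accumulate_scalar(blocks.remainder());
    }

    /// Pack the two sums into one 32-bit value: low 16 bits of s1,
    /// low 16 bits of s2 in the upper half.
    pub fn value(&self) -> u32 {
        (self.s1 & 0xffff) | (self.s2 << 16)
    }
}
```

Because the block-wise update is algebraically identical to the byte-by-byte loop, a fast path written this way can be checked against the scalar fallback on arbitrary inputs, which is one plausible way to test a change of this kind.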