test(interop): add --iconv interop test against upstream rsync 3.4.1 (#1916)#3534
Merged
Conversation
6ecc4e7 to
fa6856f
Compare
…1916) Add test_iconv_local_ssh_interop to tools/ci/run_interop.sh covering the SSH/local-mode side of --iconv interop with upstream rsync 3.4.1, the path PR #3458 wired up via IconvSetting -> FilenameConverter. Two directions are exercised through a fake remote-shell wrapper that discards the host argument and exec's the rest locally: a) oc-rsync sender -> upstream receiver (--rsh=fake --rsync-path=upstream) b) upstream sender -> oc-rsync receiver (--rsh=fake --rsync-path=oc-rsync) The fixture is UTF-8 source filenames whose code points all fit in ISO-8859-1 (cafe, uber, naive, Zurich); --iconv=UTF-8,ISO-8859-1 forces Latin-1 wire encoding while the local charset stays UTF-8. The companion daemon-mode scenario, "standalone:iconv-upstream", stays in known_failures.conf until daemon-side `charset =` plumbing lands (#1911-#1917 per docs/audits/iconv-pipeline.md). Comments updated to make the SSH/local vs daemon split explicit. Pre-checks: - upstream binary version availability (graceful skip). - upstream iconv compile-time support (graceful skip on --disable-iconv). - host filesystem accepts UTF-8 names (graceful skip). References: upstream: options.c:recv_iconv_settings, flist.c:738-754, 1579-1603 oc-rsync: docs/audits/iconv-pipeline.md (Findings 1-7)
fa6856f to
7fbc0a6
Compare
Per docs/audits/iconv-pipeline.md Finding 4, the IconvSetting -> protocol::FilenameConverter bridge does not exist in production code, so --iconv is a no-op end to end in SSH/local mode and the test hangs on direction (a) (oc-rsync sender -> upstream receiver). The wiring lands in #1911 (config build), #1912 (sender flist emit), and #1913 (receiver flist ingest). Once those merge, remove this entry so the test starts gating regressions.
5 tasks
oferchen
added a commit
that referenced
this pull request
May 3, 2026
…arser (#3566) * test(interop): align iconv-local-ssh with upstream Latin-1 dest semantics PR #3554 redesigned the daemon iconv test (test_iconv_upstream_interop) to use a single-file Value Object that asserts the destination disk holds the ISO-8859-1 byte name (caf\xe9.txt), not the UTF-8 byte name (caf\xc3\xa9.txt). That redesign was missed for the SSH/local-mode test (test_iconv_local_ssh_interop, added separately in PR #3534), which still asserted UTF-8 dest names and failed in CI on a clean Linux host where ext4 happily stores the Latin-1 single-byte names upstream's ic_recv writes. Per upstream rsync.c:85-147 setup_iconv() and options.c:2716-2723 server_options(): with --iconv=UTF-8,ISO-8859-1, the client (charset UTF-8) keeps the LOCAL half and forwards the post-comma REMOTE half (ISO-8859-1) to the spawned peer. The spawned receiver opens ic_recv = iconv_open(charset, UTF8_CHARSET), so it transcodes UTF-8 wire bytes into single-byte Latin-1 on the local disk. The dest filename is therefore caf\xe9.txt, not caf\xc3\xa9.txt. Refactor test_iconv_local_ssh_interop to reuse the single-file Value Object helpers from the daemon test (_ic_init_fixtures / _ic_write_fixtures / _ic_verify_dest at run_interop.sh:3142-3175), expand the docblock to explain the am_server LOCAL/REMOTE split with explicit upstream references, and drop the multi-file UTF-8-expecting fixture that masked the real wire semantics. No production code changes; this fixes the test expectation only. * fix(cli): wire --iconv into server-mode flag parser Server mode previously dropped --iconv=CHARSET passed by the client, falling back to no transcoding. This broke --iconv=LOCAL,REMOTE transfers over SSH and local-shell where the on-disk filename encoding differs between sides: receiver/generator skipped the iconv hook and wrote/read wire UTF-8 bytes verbatim instead of transcoding to the server's local charset. Parse --iconv=VALUE in ServerLongFlags, resolve via IconvSetting, and assign to config.connection.iconv so receiver and generator pick up the converter. Mirrors the daemon path (daemon/sections/module_access/client_args.rs) and the SSH-client path (client/remote/flags.rs). upstream: rsync.c:85-147 setup_iconv() upstream: options.c:2716-2723 server-side iconv forwarding * fix(cli): recognise --timeout= in server-mode flag parser The server-mode argument parser did not recognise `--timeout=N`, so the flag was treated as a positional argument by `parse_server_flag_string_and_args` and consumed in place of the destination path. With the iconv-local-ssh scenario, strace showed the receiver creating `--timeout=10/` instead of `dest/` because upstream rsync emits `--timeout=10` after `--iconv=...` in the server arg list. Add `--timeout=` to `ServerLongFlags`, `parse_value_bearing_flag`, and `is_known_server_long_flag` so the flag is captured (its value remains unwired for now, matching the prior silent-drop behaviour) and excluded from positional extraction. upstream: options.c - server_options() emits `--timeout=%d` from io_timeout
oferchen
added a commit
that referenced
this pull request
May 5, 2026
…1916) (#3534) * test(interop): add --iconv interop test against upstream rsync 3.4.1 (#1916) Add test_iconv_local_ssh_interop to tools/ci/run_interop.sh covering the SSH/local-mode side of --iconv interop with upstream rsync 3.4.1, the path PR #3458 wired up via IconvSetting -> FilenameConverter. Two directions are exercised through a fake remote-shell wrapper that discards the host argument and exec's the rest locally: a) oc-rsync sender -> upstream receiver (--rsh=fake --rsync-path=upstream) b) upstream sender -> oc-rsync receiver (--rsh=fake --rsync-path=oc-rsync) The fixture is UTF-8 source filenames whose code points all fit in ISO-8859-1 (cafe, uber, naive, Zurich); --iconv=UTF-8,ISO-8859-1 forces Latin-1 wire encoding while the local charset stays UTF-8. The companion daemon-mode scenario, "standalone:iconv-upstream", stays in known_failures.conf until daemon-side `charset =` plumbing lands (#1911-#1917 per docs/audits/iconv-pipeline.md). Comments updated to make the SSH/local vs daemon split explicit. Pre-checks: - upstream binary version availability (graceful skip). - upstream iconv compile-time support (graceful skip on --disable-iconv). - host filesystem accepts UTF-8 names (graceful skip). References: upstream: options.c:recv_iconv_settings, flist.c:738-754, 1579-1603 oc-rsync: docs/audits/iconv-pipeline.md (Findings 1-7) * test(interop): mark iconv-local-ssh known failure pending #1911-#1913 Per docs/audits/iconv-pipeline.md Finding 4, the IconvSetting -> protocol::FilenameConverter bridge does not exist in production code, so --iconv is a no-op end to end in SSH/local mode and the test hangs on direction (a) (oc-rsync sender -> upstream receiver). The wiring lands in #1911 (config build), #1912 (sender flist emit), and #1913 (receiver flist ingest). Once those merge, remove this entry so the test starts gating regressions.
oferchen
added a commit
that referenced
this pull request
May 5, 2026
…arser (#3566) * test(interop): align iconv-local-ssh with upstream Latin-1 dest semantics PR #3554 redesigned the daemon iconv test (test_iconv_upstream_interop) to use a single-file Value Object that asserts the destination disk holds the ISO-8859-1 byte name (caf\xe9.txt), not the UTF-8 byte name (caf\xc3\xa9.txt). That redesign was missed for the SSH/local-mode test (test_iconv_local_ssh_interop, added separately in PR #3534), which still asserted UTF-8 dest names and failed in CI on a clean Linux host where ext4 happily stores the Latin-1 single-byte names upstream's ic_recv writes. Per upstream rsync.c:85-147 setup_iconv() and options.c:2716-2723 server_options(): with --iconv=UTF-8,ISO-8859-1, the client (charset UTF-8) keeps the LOCAL half and forwards the post-comma REMOTE half (ISO-8859-1) to the spawned peer. The spawned receiver opens ic_recv = iconv_open(charset, UTF8_CHARSET), so it transcodes UTF-8 wire bytes into single-byte Latin-1 on the local disk. The dest filename is therefore caf\xe9.txt, not caf\xc3\xa9.txt. Refactor test_iconv_local_ssh_interop to reuse the single-file Value Object helpers from the daemon test (_ic_init_fixtures / _ic_write_fixtures / _ic_verify_dest at run_interop.sh:3142-3175), expand the docblock to explain the am_server LOCAL/REMOTE split with explicit upstream references, and drop the multi-file UTF-8-expecting fixture that masked the real wire semantics. No production code changes; this fixes the test expectation only. * fix(cli): wire --iconv into server-mode flag parser Server mode previously dropped --iconv=CHARSET passed by the client, falling back to no transcoding. This broke --iconv=LOCAL,REMOTE transfers over SSH and local-shell where the on-disk filename encoding differs between sides: receiver/generator skipped the iconv hook and wrote/read wire UTF-8 bytes verbatim instead of transcoding to the server's local charset. Parse --iconv=VALUE in ServerLongFlags, resolve via IconvSetting, and assign to config.connection.iconv so receiver and generator pick up the converter. Mirrors the daemon path (daemon/sections/module_access/client_args.rs) and the SSH-client path (client/remote/flags.rs). upstream: rsync.c:85-147 setup_iconv() upstream: options.c:2716-2723 server-side iconv forwarding * fix(cli): recognise --timeout= in server-mode flag parser The server-mode argument parser did not recognise `--timeout=N`, so the flag was treated as a positional argument by `parse_server_flag_string_and_args` and consumed in place of the destination path. With the iconv-local-ssh scenario, strace showed the receiver creating `--timeout=10/` instead of `dest/` because upstream rsync emits `--timeout=10` after `--iconv=...` in the server arg list. Add `--timeout=` to `ServerLongFlags`, `parse_value_bearing_flag`, and `is_known_server_long_flag` so the flag is captured (its value remains unwired for now, matching the prior silent-drop behaviour) and excluded from positional extraction. upstream: options.c - server_options() emits `--timeout=%d` from io_timeout
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds an interop test for
--iconvagainst upstream rsync 3.4.1 (issue #1916). The test pins down the SSH/local-mode behaviour bridged by PR #3458, while the daemon-mode gap stays tracked as a known failure.Scenarios covered
New
test_iconv_local_ssh_interopintools/ci/run_interop.sh(registered asstandalone:iconv-local-ssh):--rsh=<fake-rsh> --rsync-path=<upstream>with--iconv=UTF-8,ISO-8859-1. Expected: PASS (PR feat: bridge IconvSetting to FilenameConverter #3458 wiredIconvSetting -> FilenameConverterfor this path; failure here is a regression).--rsh=<fake-rsh> --rsync-path=<oc-rsync>with--iconv=UTF-8,ISO-8859-1. Expected: PASS (same bridge, receiver side).charset = ISO-8859-1-- existingtest_iconv_upstream_interop, kept inKNOWN_FAILURESasstandalone:iconv-upstream. Expected: FAIL until Propagate fallback signal exit codes #1911-Handle files-from relative entries with explicit source directory #1917 land.The fake remote-shell is a
/bin/shscript that doesshift; exec "$@". It discards the host argument that rsync passes as$1and exec's the rest of the command line locally, mirroring how upstream's own testsuite drives "remote" mode without a real sshd.Fixture
UTF-8 source filenames whose code points all fit in ISO-8859-1 (Latin-1):
café.txtüber.txtnaïve.txtZürich.txtPlus
plain.txtas an ASCII baseline. With--iconv=UTF-8,ISO-8859-1, the wire encoding is Latin-1 and both peers must transcode back to UTF-8 on disk.Pre-checks (graceful skip, never a fake failure)
find_upstream_binarypath).--disable-iconv-> probe via no-op--iconv=UTF-8,UTF-8and skip oniconv|not.*compiled|--disable-iconvin stderr.How the gap closes when #1911-#1917 land
charset =)IconvSetting -> FilenameConverterbridgeWhen #1917 wires
charset =into the daemon's iconv runtime, the existingtest_iconv_upstream_interopwill pass andstandalone:iconv-upstreamshould be removed fromKNOWN_FAILURESandDASHBOARD_ENTRIESintools/ci/known_failures.conf.Files modified
tools/ci/run_interop.sh-- newtest_iconv_local_ssh_interopfunction (181 lines), registered intest_namesandtest_funcsarrays.tools/ci/known_failures.conf-- comment refresh onstandalone:iconv-upstreamto reference the new SSH/local test and the Propagate fallback signal exit codes #1911-Handle files-from relative entries with explicit source directory #1917 follow-up chain.Test plan
iconv-local-sshscenario and reports PASS in both directions against upstream 3.4.1.iconv-upstreamdaemon scenario still reports as a known failure (no behaviour change).bash -n tools/ci/run_interop.shclean (verified locally; see commit).References: upstream
options.c:recv_iconv_settings,flist.c:738-754,flist.c:1579-1603;docs/audits/iconv-pipeline.md.