Skip to content

test(interop): add --iconv interop test against upstream rsync 3.4.1 (#1916)#3534

Merged
oferchen merged 2 commits into
masterfrom
test/iconv-interop-upstream-1916
May 2, 2026
Merged

test(interop): add --iconv interop test against upstream rsync 3.4.1 (#1916)#3534
oferchen merged 2 commits into
masterfrom
test/iconv-interop-upstream-1916

Conversation

@oferchen
Copy link
Copy Markdown
Owner

@oferchen oferchen commented May 1, 2026

Summary

Adds an interop test for --iconv against upstream rsync 3.4.1 (issue #1916). The test pins down the SSH/local-mode behaviour bridged by PR #3458, while the daemon-mode gap stays tracked as a known failure.

Scenarios covered

New test_iconv_local_ssh_interop in tools/ci/run_interop.sh (registered as standalone:iconv-local-ssh):

The fake remote-shell is a /bin/sh script that does shift; exec "$@". It discards the host argument that rsync passes as $1 and exec's the rest of the command line locally, mirroring how upstream's own testsuite drives "remote" mode without a real sshd.

Fixture

UTF-8 source filenames whose code points all fit in ISO-8859-1 (Latin-1):

Filename Code points
café.txt U+00E9
über.txt U+00FC
naïve.txt U+00EF
Zürich.txt U+00DC

Plus plain.txt as an ASCII baseline. With --iconv=UTF-8,ISO-8859-1, the wire encoding is Latin-1 and both peers must transcode back to UTF-8 on disk.

Pre-checks (graceful skip, never a fake failure)

  • Upstream binary missing -> standalone harness already skips (existing find_upstream_binary path).
  • Upstream rsync built with --disable-iconv -> probe via no-op --iconv=UTF-8,UTF-8 and skip on iconv|not.*compiled|--disable-iconv in stderr.
  • Host filesystem cannot store UTF-8 names (e.g. non-UTF-8 locale, NFC-normalising fs) -> skip with diagnostic.

How the gap closes when #1911-#1917 land

Issue Pipeline part Status today
#1912 Sender file-list emit pending
#1913 Receiver file-list ingest pending
#1914 Filter-rule path matching pending
#1917 Daemon module config (charset =) parsed but inert
#3458 SSH/local IconvSetting -> FilenameConverter bridge landed (this PR pins it)

When #1917 wires charset = into the daemon's iconv runtime, the existing test_iconv_upstream_interop will pass and standalone:iconv-upstream should be removed from KNOWN_FAILURES and DASHBOARD_ENTRIES in tools/ci/known_failures.conf.

Files modified

Test plan

  • CI green on Linux musl, macOS, Windows (no Windows code touched; bash-only change).
  • Interop Validation workflow runs the new iconv-local-ssh scenario and reports PASS in both directions against upstream 3.4.1.
  • Existing iconv-upstream daemon scenario still reports as a known failure (no behaviour change).
  • bash -n tools/ci/run_interop.sh clean (verified locally; see commit).

References: upstream options.c:recv_iconv_settings, flist.c:738-754, flist.c:1579-1603; docs/audits/iconv-pipeline.md.

@github-actions github-actions Bot added the test label May 1, 2026
@oferchen oferchen force-pushed the test/iconv-interop-upstream-1916 branch 2 times, most recently from 6ecc4e7 to fa6856f Compare May 1, 2026 22:36
…1916)

Add test_iconv_local_ssh_interop to tools/ci/run_interop.sh covering
the SSH/local-mode side of --iconv interop with upstream rsync 3.4.1,
the path PR #3458 wired up via IconvSetting -> FilenameConverter. Two
directions are exercised through a fake remote-shell wrapper that
discards the host argument and exec's the rest locally:

  a) oc-rsync sender -> upstream receiver  (--rsh=fake --rsync-path=upstream)
  b) upstream sender -> oc-rsync receiver  (--rsh=fake --rsync-path=oc-rsync)

The fixture is UTF-8 source filenames whose code points all fit in
ISO-8859-1 (cafe, uber, naive, Zurich); --iconv=UTF-8,ISO-8859-1
forces Latin-1 wire encoding while the local charset stays UTF-8.

The companion daemon-mode scenario, "standalone:iconv-upstream", stays
in known_failures.conf until daemon-side `charset =` plumbing lands
(#1911-#1917 per docs/audits/iconv-pipeline.md). Comments updated to
make the SSH/local vs daemon split explicit.

Pre-checks:
  - upstream binary version availability (graceful skip).
  - upstream iconv compile-time support (graceful skip on --disable-iconv).
  - host filesystem accepts UTF-8 names (graceful skip).

References:
  upstream: options.c:recv_iconv_settings, flist.c:738-754, 1579-1603
  oc-rsync: docs/audits/iconv-pipeline.md (Findings 1-7)
@oferchen oferchen force-pushed the test/iconv-interop-upstream-1916 branch from fa6856f to 7fbc0a6 Compare May 2, 2026 01:27
Per docs/audits/iconv-pipeline.md Finding 4, the IconvSetting ->
protocol::FilenameConverter bridge does not exist in production code,
so --iconv is a no-op end to end in SSH/local mode and the test hangs
on direction (a) (oc-rsync sender -> upstream receiver).

The wiring lands in #1911 (config build), #1912 (sender flist emit),
and #1913 (receiver flist ingest). Once those merge, remove this
entry so the test starts gating regressions.
@oferchen oferchen merged commit 1a1740f into master May 2, 2026
41 checks passed
@oferchen oferchen deleted the test/iconv-interop-upstream-1916 branch May 2, 2026 03:50
oferchen added a commit that referenced this pull request May 3, 2026
…arser (#3566)

* test(interop): align iconv-local-ssh with upstream Latin-1 dest semantics

PR #3554 redesigned the daemon iconv test (test_iconv_upstream_interop)
to use a single-file Value Object that asserts the destination disk
holds the ISO-8859-1 byte name (caf\xe9.txt), not the UTF-8 byte name
(caf\xc3\xa9.txt). That redesign was missed for the SSH/local-mode
test (test_iconv_local_ssh_interop, added separately in PR #3534),
which still asserted UTF-8 dest names and failed in CI on a clean
Linux host where ext4 happily stores the Latin-1 single-byte names
upstream's ic_recv writes.

Per upstream rsync.c:85-147 setup_iconv() and options.c:2716-2723
server_options(): with --iconv=UTF-8,ISO-8859-1, the client (charset
UTF-8) keeps the LOCAL half and forwards the post-comma REMOTE half
(ISO-8859-1) to the spawned peer. The spawned receiver opens
ic_recv = iconv_open(charset, UTF8_CHARSET), so it transcodes UTF-8
wire bytes into single-byte Latin-1 on the local disk. The dest
filename is therefore caf\xe9.txt, not caf\xc3\xa9.txt.

Refactor test_iconv_local_ssh_interop to reuse the single-file Value
Object helpers from the daemon test (_ic_init_fixtures /
_ic_write_fixtures / _ic_verify_dest at run_interop.sh:3142-3175),
expand the docblock to explain the am_server LOCAL/REMOTE split with
explicit upstream references, and drop the multi-file UTF-8-expecting
fixture that masked the real wire semantics.

No production code changes; this fixes the test expectation only.

* fix(cli): wire --iconv into server-mode flag parser

Server mode previously dropped --iconv=CHARSET passed by the client,
falling back to no transcoding. This broke --iconv=LOCAL,REMOTE
transfers over SSH and local-shell where the on-disk filename encoding
differs between sides: receiver/generator skipped the iconv hook and
wrote/read wire UTF-8 bytes verbatim instead of transcoding to the
server's local charset.

Parse --iconv=VALUE in ServerLongFlags, resolve via IconvSetting,
and assign to config.connection.iconv so receiver and generator pick
up the converter. Mirrors the daemon path
(daemon/sections/module_access/client_args.rs) and the SSH-client path
(client/remote/flags.rs).

upstream: rsync.c:85-147 setup_iconv()
upstream: options.c:2716-2723 server-side iconv forwarding

* fix(cli): recognise --timeout= in server-mode flag parser

The server-mode argument parser did not recognise `--timeout=N`, so the
flag was treated as a positional argument by `parse_server_flag_string_and_args`
and consumed in place of the destination path. With the iconv-local-ssh
scenario, strace showed the receiver creating `--timeout=10/` instead of
`dest/` because upstream rsync emits `--timeout=10` after `--iconv=...`
in the server arg list.

Add `--timeout=` to `ServerLongFlags`, `parse_value_bearing_flag`, and
`is_known_server_long_flag` so the flag is captured (its value remains
unwired for now, matching the prior silent-drop behaviour) and excluded
from positional extraction.

upstream: options.c - server_options() emits `--timeout=%d` from io_timeout
oferchen added a commit that referenced this pull request May 5, 2026
…1916) (#3534)

* test(interop): add --iconv interop test against upstream rsync 3.4.1 (#1916)

Add test_iconv_local_ssh_interop to tools/ci/run_interop.sh covering
the SSH/local-mode side of --iconv interop with upstream rsync 3.4.1,
the path PR #3458 wired up via IconvSetting -> FilenameConverter. Two
directions are exercised through a fake remote-shell wrapper that
discards the host argument and exec's the rest locally:

  a) oc-rsync sender -> upstream receiver  (--rsh=fake --rsync-path=upstream)
  b) upstream sender -> oc-rsync receiver  (--rsh=fake --rsync-path=oc-rsync)

The fixture is UTF-8 source filenames whose code points all fit in
ISO-8859-1 (cafe, uber, naive, Zurich); --iconv=UTF-8,ISO-8859-1
forces Latin-1 wire encoding while the local charset stays UTF-8.

The companion daemon-mode scenario, "standalone:iconv-upstream", stays
in known_failures.conf until daemon-side `charset =` plumbing lands
(#1911-#1917 per docs/audits/iconv-pipeline.md). Comments updated to
make the SSH/local vs daemon split explicit.

Pre-checks:
  - upstream binary version availability (graceful skip).
  - upstream iconv compile-time support (graceful skip on --disable-iconv).
  - host filesystem accepts UTF-8 names (graceful skip).

References:
  upstream: options.c:recv_iconv_settings, flist.c:738-754, 1579-1603
  oc-rsync: docs/audits/iconv-pipeline.md (Findings 1-7)

* test(interop): mark iconv-local-ssh known failure pending #1911-#1913

Per docs/audits/iconv-pipeline.md Finding 4, the IconvSetting ->
protocol::FilenameConverter bridge does not exist in production code,
so --iconv is a no-op end to end in SSH/local mode and the test hangs
on direction (a) (oc-rsync sender -> upstream receiver).

The wiring lands in #1911 (config build), #1912 (sender flist emit),
and #1913 (receiver flist ingest). Once those merge, remove this
entry so the test starts gating regressions.
oferchen added a commit that referenced this pull request May 5, 2026
…arser (#3566)

* test(interop): align iconv-local-ssh with upstream Latin-1 dest semantics

PR #3554 redesigned the daemon iconv test (test_iconv_upstream_interop)
to use a single-file Value Object that asserts the destination disk
holds the ISO-8859-1 byte name (caf\xe9.txt), not the UTF-8 byte name
(caf\xc3\xa9.txt). That redesign was missed for the SSH/local-mode
test (test_iconv_local_ssh_interop, added separately in PR #3534),
which still asserted UTF-8 dest names and failed in CI on a clean
Linux host where ext4 happily stores the Latin-1 single-byte names
upstream's ic_recv writes.

Per upstream rsync.c:85-147 setup_iconv() and options.c:2716-2723
server_options(): with --iconv=UTF-8,ISO-8859-1, the client (charset
UTF-8) keeps the LOCAL half and forwards the post-comma REMOTE half
(ISO-8859-1) to the spawned peer. The spawned receiver opens
ic_recv = iconv_open(charset, UTF8_CHARSET), so it transcodes UTF-8
wire bytes into single-byte Latin-1 on the local disk. The dest
filename is therefore caf\xe9.txt, not caf\xc3\xa9.txt.

Refactor test_iconv_local_ssh_interop to reuse the single-file Value
Object helpers from the daemon test (_ic_init_fixtures /
_ic_write_fixtures / _ic_verify_dest at run_interop.sh:3142-3175),
expand the docblock to explain the am_server LOCAL/REMOTE split with
explicit upstream references, and drop the multi-file UTF-8-expecting
fixture that masked the real wire semantics.

No production code changes; this fixes the test expectation only.

* fix(cli): wire --iconv into server-mode flag parser

Server mode previously dropped --iconv=CHARSET passed by the client,
falling back to no transcoding. This broke --iconv=LOCAL,REMOTE
transfers over SSH and local-shell where the on-disk filename encoding
differs between sides: receiver/generator skipped the iconv hook and
wrote/read wire UTF-8 bytes verbatim instead of transcoding to the
server's local charset.

Parse --iconv=VALUE in ServerLongFlags, resolve via IconvSetting,
and assign to config.connection.iconv so receiver and generator pick
up the converter. Mirrors the daemon path
(daemon/sections/module_access/client_args.rs) and the SSH-client path
(client/remote/flags.rs).

upstream: rsync.c:85-147 setup_iconv()
upstream: options.c:2716-2723 server-side iconv forwarding

* fix(cli): recognise --timeout= in server-mode flag parser

The server-mode argument parser did not recognise `--timeout=N`, so the
flag was treated as a positional argument by `parse_server_flag_string_and_args`
and consumed in place of the destination path. With the iconv-local-ssh
scenario, strace showed the receiver creating `--timeout=10/` instead of
`dest/` because upstream rsync emits `--timeout=10` after `--iconv=...`
in the server arg list.

Add `--timeout=` to `ServerLongFlags`, `parse_value_bearing_flag`, and
`is_known_server_long_flag` so the flag is captured (its value remains
unwired for now, matching the prior silent-drop behaviour) and excluded
from positional extraction.

upstream: options.c - server_options() emits `--timeout=%d` from io_timeout
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant