Skip to content

Cache proxy authorization headers#1912

Merged
oferchen merged 1 commit into
masterfrom
implement-missing-components-for-oc-rsync-wdabsb
Nov 3, 2025
Merged

Cache proxy authorization headers#1912
oferchen merged 1 commit into
masterfrom
implement-missing-components-for-oc-rsync-wdabsb

Conversation

@oferchen
Copy link
Copy Markdown
Owner

@oferchen oferchen commented Nov 3, 2025

Summary

  • cache the HTTP proxy authorization header so CONNECT requests reuse the encoded value
  • update proxy parsing tests to expect borrowed header values and verify the cached storage is reused

Testing

  • cargo test -p rsync-core parse_proxy_spec_caches_authorization_header

https://chatgpt.com/codex/tasks/task_e_69091b85ab588323943e360e09eb5771

@oferchen oferchen merged commit 23355aa into master Nov 3, 2025
@oferchen oferchen deleted the implement-missing-components-for-oc-rsync-wdabsb branch November 3, 2025 21:30
oferchen added a commit that referenced this pull request May 1, 2026
…1911)

Closes #1911 by completing the local-copy bridge between the parsed
`IconvSetting` on `ClientConfig` and the engine's `LocalCopyOptions`,
mirroring the SSH/daemon bridge that PR #3458 wired through
`apply_common_server_flags`.

The audit at `docs/audits/iconv-inert.md` (PR #3526) labels #1911 as
"Partial" because the local-copy path bypasses
`apply_common_server_flags` and therefore had no route by which a
resolved `FilenameConverter` could reach the engine.

Changes:

- `engine::local_copy::LocalCopyOptions` and
  `LocalCopyOptionsBuilder` gain an `iconv: Option<FilenameConverter>`
  field with a `with_iconv` setter and `iconv()` accessor on the
  options struct, plus an `iconv()` setter on the builder.
- `core::client::run::LocalCopyOptionsBuilder::build` (the internal
  wrapper around the engine builder) calls
  `config.iconv().resolve_converter()` and propagates the result via
  the new setter.
- Unit tests in `crates/engine/src/local_copy/options/filters.rs`
  cover default-none, attach, and clear.
- Unit tests in `crates/core/src/client/run/mod.rs` cover the
  `IconvSetting -> LocalCopyOptions.iconv` mapping for unspecified,
  disabled, locale-default, explicit, and unsupported-charset variants.

This PR plumbs the converter only; it does not yet apply it on
sender file-list emit (#1912), receiver file-list ingest (#1913),
or filter-rule path matching (#1914). Those are tracked separately.
oferchen added a commit that referenced this pull request May 2, 2026
Wires the existing `apply_encoding_conversion` hook on the sender's
file-list emit path so filenames are transcoded from local to remote
charset before any prefix-compression bookkeeping or wire emission
when --iconv=remote,local is in effect. The producer wiring tracker
in `crates/engine/src/local_copy/options/filters.rs` (#1911) named
this gap explicitly, and the SSH/daemon producer in
`transfer::generator::build_flist_writer` already calls
`writer.with_iconv(...)` from `connection.iconv`, so the hook now
takes effect end-to-end whenever the runtime config carries a
resolved converter.

Changes:

- Annotate the iconv conversion call site in
  `FileListWriter::write_entry` with the upstream reference
  `flist.c send_file_entry() iconv_buf(ic_send, ...)` so the
  protocol-fidelity contract is visible at the wire boundary.
- Mirror `FileListWriter::with_iconv` on `BatchedFileListWriter`
  so the batched sender path can attach a converter through the
  same builder surface.
- Add unit tests in `crates/protocol/src/flist/write/tests.rs`
  covering the with-converter round-trip (UTF-8 source name -> wire
  bytes are ISO-8859-1, no UTF-8 multi-byte sequence remains) and
  the without-converter path (wire bytes match the original
  filename, preserving current behaviour).

upstream: flist.c send_file_entry() iconv_buf(ic_send, ...)
audit: docs/audits/iconv-pipeline.md, Finding 4
oferchen added a commit that referenced this pull request May 2, 2026
Per docs/audits/iconv-pipeline.md Finding 4, the IconvSetting ->
protocol::FilenameConverter bridge does not exist in production code,
so --iconv is a no-op end to end in SSH/local mode and the test hangs
on direction (a) (oc-rsync sender -> upstream receiver).

The wiring lands in #1911 (config build), #1912 (sender flist emit),
and #1913 (receiver flist ingest). Once those merge, remove this
entry so the test starts gating regressions.
oferchen added a commit that referenced this pull request May 2, 2026
…3548)

Wires the existing `apply_encoding_conversion` hook on the sender's
file-list emit path so filenames are transcoded from local to remote
charset before any prefix-compression bookkeeping or wire emission
when --iconv=remote,local is in effect. The producer wiring tracker
in `crates/engine/src/local_copy/options/filters.rs` (#1911) named
this gap explicitly, and the SSH/daemon producer in
`transfer::generator::build_flist_writer` already calls
`writer.with_iconv(...)` from `connection.iconv`, so the hook now
takes effect end-to-end whenever the runtime config carries a
resolved converter.

Changes:

- Annotate the iconv conversion call site in
  `FileListWriter::write_entry` with the upstream reference
  `flist.c send_file_entry() iconv_buf(ic_send, ...)` so the
  protocol-fidelity contract is visible at the wire boundary.
- Mirror `FileListWriter::with_iconv` on `BatchedFileListWriter`
  so the batched sender path can attach a converter through the
  same builder surface.
- Add unit tests in `crates/protocol/src/flist/write/tests.rs`
  covering the with-converter round-trip (UTF-8 source name -> wire
  bytes are ISO-8859-1, no UTF-8 multi-byte sequence remains) and
  the without-converter path (wire bytes match the original
  filename, preserving current behaviour).

upstream: flist.c send_file_entry() iconv_buf(ic_send, ...)
audit: docs/audits/iconv-pipeline.md, Finding 4
oferchen added a commit that referenced this pull request May 2, 2026
…1916) (#3534)

* test(interop): add --iconv interop test against upstream rsync 3.4.1 (#1916)

Add test_iconv_local_ssh_interop to tools/ci/run_interop.sh covering
the SSH/local-mode side of --iconv interop with upstream rsync 3.4.1,
the path PR #3458 wired up via IconvSetting -> FilenameConverter. Two
directions are exercised through a fake remote-shell wrapper that
discards the host argument and exec's the rest locally:

  a) oc-rsync sender -> upstream receiver  (--rsh=fake --rsync-path=upstream)
  b) upstream sender -> oc-rsync receiver  (--rsh=fake --rsync-path=oc-rsync)

The fixture is UTF-8 source filenames whose code points all fit in
ISO-8859-1 (cafe, uber, naive, Zurich); --iconv=UTF-8,ISO-8859-1
forces Latin-1 wire encoding while the local charset stays UTF-8.

The companion daemon-mode scenario, "standalone:iconv-upstream", stays
in known_failures.conf until daemon-side `charset =` plumbing lands
(#1911-#1917 per docs/audits/iconv-pipeline.md). Comments updated to
make the SSH/local vs daemon split explicit.

Pre-checks:
  - upstream binary version availability (graceful skip).
  - upstream iconv compile-time support (graceful skip on --disable-iconv).
  - host filesystem accepts UTF-8 names (graceful skip).

References:
  upstream: options.c:recv_iconv_settings, flist.c:738-754, 1579-1603
  oc-rsync: docs/audits/iconv-pipeline.md (Findings 1-7)

* test(interop): mark iconv-local-ssh known failure pending #1911-#1913

Per docs/audits/iconv-pipeline.md Finding 4, the IconvSetting ->
protocol::FilenameConverter bridge does not exist in production code,
so --iconv is a no-op end to end in SSH/local mode and the test hangs
on direction (a) (oc-rsync sender -> upstream receiver).

The wiring lands in #1911 (config build), #1912 (sender flist emit),
and #1913 (receiver flist ingest). Once those merge, remove this
entry so the test starts gating regressions.
oferchen added a commit that referenced this pull request May 2, 2026
… (#3552)

Locks the byte-level wire encoding of filenames that pass through the
--iconv pipeline (#1911 config, #1912 sender, #1913 receiver), so future
regressions in either the sender or the receiver are caught at the byte
level rather than at the round-trip level.

Coverage:
  - sender (UTF-8 -> ISO-8859-1, KOI8-R, windows-1252) emits exact
    remote-charset bytes; suffix_len header reflects the post-iconv
    length, not the UTF-8 source length
  - receiver decodes ISO-8859-1, KOI8-R, and windows-1252 wire bytes
    into UTF-8 disk names byte-for-byte
  - identity converter (UTF-8 <-> UTF-8) preserves raw UTF-8 bytes
  - ASCII-only names pass through both directions unchanged
  - sender + receiver wire bytes match exactly (iconv writer output
    equals raw remote-charset entry output; iconv reader output equals
    UTF-8-native plain reader output)
  - round-trip preserves "café", "файл", "Ångström.txt" through their
    respective remote charsets
  - directory-prefixed compression operates on post-iconv bytes so
    sender and receiver agree on the iconv-then-compress order
  - unmappable source characters (Greek alpha into ISO-8859-1) and
    invalid wire bytes (malformed UTF-8) surface InvalidData errors,
    mirroring upstream's IOERR_GENERAL path in flist.c

Tests gated on `feature = "iconv"` and Unix where non-UTF-8 path bytes
are preserved by OsStr. Wire-format bytes verified against upstream
rsync 3.4.1's flist.c send_file_entry/recv_file_entry encoding.
oferchen added a commit that referenced this pull request May 2, 2026
#1911) (#3555)

* feat(transfer): wire IconvSetting -> FilenameConverter at config build (#1911)

Per docs/audits/iconv-pipeline.md Finding 4, --iconv was previously a
no-op end to end: the CLI parsed --iconv into IconvSetting and forwarded
the value to the remote peer, but the local process never converted that
setting into a protocol::FilenameConverter, so the file-list reader and
writer ran with iconv: None and copied raw bytes through transfer.

The bridge is now in place across all three transfer modes:

- SSH push/pull (and embedded SSH): apply_common_server_flags writes
  `server_config.connection.iconv = config.iconv().resolve_converter()`
  in crates/core/src/client/remote/flags.rs:228, invoked from
  build_server_config_for_receiver and build_server_config_for_generator
  (ssh_transfer.rs and embedded_ssh_transfer.rs).
- Daemon push/pull: apply_common_server_flags is also called from
  daemon_transfer/orchestration/server_config.rs (both paths). On the
  daemon side, module_access/client_args.rs:283 resolves the module's
  `charset =` directive into the same FilenameConverter slot.
- Local copy: LocalCopyOptionsBuilder::apply_iconv in
  crates/core/src/client/run/mod.rs:399 writes the converter onto
  LocalCopyOptions::iconv, since the local-copy executor in the engine
  crate bypasses the SSH/daemon ServerConfig builder.

The transfer crate consumes the converter at:

- crates/transfer/src/receiver/mod.rs:369 - reader.with_iconv(...)
- crates/transfer/src/generator/mod.rs:570 - writer.with_iconv(...)

These hand the converter to the protocol crate's FileListReader /
FileListWriter, where #1912 and #1913 already apply it via
apply_encoding_conversion on every emitted and ingested filename. With
the producer-side wiring landed, the test_iconv_local_ssh interop
scenario starts gating regressions, so the standalone:iconv-local-ssh
entry is removed from KNOWN_FAILURES and DASHBOARD_ENTRIES.

Upstream references:

- flist.c:1579-1603 send_file_entry filename iconvbufs(ic_send, ...)
- flist.c:738-754 recv_file_entry filename iconvbufs(ic_recv, ...)
- options.c::recv_iconv_settings parses --iconv=LOCAL,REMOTE

* revert(ci): restore standalone:iconv-local-ssh known failure

PR #3458 (cd76ec2) wired IconvSetting -> FilenameConverter for the
local generator/receiver, but two architectural gaps prevent the
SSH iconv scenario from passing end-to-end:

- --iconv is not forwarded to the spawned remote rsync. Upstream
  options.c:2715-2723 forwards the post-comma half of iconv_opt so
  the remote setup_iconv() agrees on charsets.
- FilenameConverter (crates/protocol/src/iconv/converter.rs) treats
  remote_encoding as the literal wire charset. Upstream rsync.c:87-147
  always uses UTF-8 on the wire and treats `charset` as each peer's
  local charset.

Audit findings 1-3 and 5 (docs/audits/iconv-pipeline.md) also remain
open: symlink target transcoding, --files-from line forwarding,
secluded-args argv re-encoding, CF_SYMLINK_ICONV gating.

Restoring the KF entry until the iconv pipeline rework lands.
oferchen added a commit that referenced this pull request May 5, 2026
…1911)

Closes #1911 by completing the local-copy bridge between the parsed
`IconvSetting` on `ClientConfig` and the engine's `LocalCopyOptions`,
mirroring the SSH/daemon bridge that PR #3458 wired through
`apply_common_server_flags`.

The audit at `docs/audits/iconv-inert.md` (PR #3526) labels #1911 as
"Partial" because the local-copy path bypasses
`apply_common_server_flags` and therefore had no route by which a
resolved `FilenameConverter` could reach the engine.

Changes:

- `engine::local_copy::LocalCopyOptions` and
  `LocalCopyOptionsBuilder` gain an `iconv: Option<FilenameConverter>`
  field with a `with_iconv` setter and `iconv()` accessor on the
  options struct, plus an `iconv()` setter on the builder.
- `core::client::run::LocalCopyOptionsBuilder::build` (the internal
  wrapper around the engine builder) calls
  `config.iconv().resolve_converter()` and propagates the result via
  the new setter.
- Unit tests in `crates/engine/src/local_copy/options/filters.rs`
  cover default-none, attach, and clear.
- Unit tests in `crates/core/src/client/run/mod.rs` cover the
  `IconvSetting -> LocalCopyOptions.iconv` mapping for unspecified,
  disabled, locale-default, explicit, and unsupported-charset variants.

This PR plumbs the converter only; it does not yet apply it on
sender file-list emit (#1912), receiver file-list ingest (#1913),
or filter-rule path matching (#1914). Those are tracked separately.
oferchen added a commit that referenced this pull request May 5, 2026
…3548)

Wires the existing `apply_encoding_conversion` hook on the sender's
file-list emit path so filenames are transcoded from local to remote
charset before any prefix-compression bookkeeping or wire emission
when --iconv=remote,local is in effect. The producer wiring tracker
in `crates/engine/src/local_copy/options/filters.rs` (#1911) named
this gap explicitly, and the SSH/daemon producer in
`transfer::generator::build_flist_writer` already calls
`writer.with_iconv(...)` from `connection.iconv`, so the hook now
takes effect end-to-end whenever the runtime config carries a
resolved converter.

Changes:

- Annotate the iconv conversion call site in
  `FileListWriter::write_entry` with the upstream reference
  `flist.c send_file_entry() iconv_buf(ic_send, ...)` so the
  protocol-fidelity contract is visible at the wire boundary.
- Mirror `FileListWriter::with_iconv` on `BatchedFileListWriter`
  so the batched sender path can attach a converter through the
  same builder surface.
- Add unit tests in `crates/protocol/src/flist/write/tests.rs`
  covering the with-converter round-trip (UTF-8 source name -> wire
  bytes are ISO-8859-1, no UTF-8 multi-byte sequence remains) and
  the without-converter path (wire bytes match the original
  filename, preserving current behaviour).

upstream: flist.c send_file_entry() iconv_buf(ic_send, ...)
audit: docs/audits/iconv-pipeline.md, Finding 4
oferchen added a commit that referenced this pull request May 5, 2026
…1916) (#3534)

* test(interop): add --iconv interop test against upstream rsync 3.4.1 (#1916)

Add test_iconv_local_ssh_interop to tools/ci/run_interop.sh covering
the SSH/local-mode side of --iconv interop with upstream rsync 3.4.1,
the path PR #3458 wired up via IconvSetting -> FilenameConverter. Two
directions are exercised through a fake remote-shell wrapper that
discards the host argument and exec's the rest locally:

  a) oc-rsync sender -> upstream receiver  (--rsh=fake --rsync-path=upstream)
  b) upstream sender -> oc-rsync receiver  (--rsh=fake --rsync-path=oc-rsync)

The fixture is UTF-8 source filenames whose code points all fit in
ISO-8859-1 (cafe, uber, naive, Zurich); --iconv=UTF-8,ISO-8859-1
forces Latin-1 wire encoding while the local charset stays UTF-8.

The companion daemon-mode scenario, "standalone:iconv-upstream", stays
in known_failures.conf until daemon-side `charset =` plumbing lands
(#1911-#1917 per docs/audits/iconv-pipeline.md). Comments updated to
make the SSH/local vs daemon split explicit.

Pre-checks:
  - upstream binary version availability (graceful skip).
  - upstream iconv compile-time support (graceful skip on --disable-iconv).
  - host filesystem accepts UTF-8 names (graceful skip).

References:
  upstream: options.c:recv_iconv_settings, flist.c:738-754, 1579-1603
  oc-rsync: docs/audits/iconv-pipeline.md (Findings 1-7)

* test(interop): mark iconv-local-ssh known failure pending #1911-#1913

Per docs/audits/iconv-pipeline.md Finding 4, the IconvSetting ->
protocol::FilenameConverter bridge does not exist in production code,
so --iconv is a no-op end to end in SSH/local mode and the test hangs
on direction (a) (oc-rsync sender -> upstream receiver).

The wiring lands in #1911 (config build), #1912 (sender flist emit),
and #1913 (receiver flist ingest). Once those merge, remove this
entry so the test starts gating regressions.
oferchen added a commit that referenced this pull request May 5, 2026
… (#3552)

Locks the byte-level wire encoding of filenames that pass through the
--iconv pipeline (#1911 config, #1912 sender, #1913 receiver), so future
regressions in either the sender or the receiver are caught at the byte
level rather than at the round-trip level.

Coverage:
  - sender (UTF-8 -> ISO-8859-1, KOI8-R, windows-1252) emits exact
    remote-charset bytes; suffix_len header reflects the post-iconv
    length, not the UTF-8 source length
  - receiver decodes ISO-8859-1, KOI8-R, and windows-1252 wire bytes
    into UTF-8 disk names byte-for-byte
  - identity converter (UTF-8 <-> UTF-8) preserves raw UTF-8 bytes
  - ASCII-only names pass through both directions unchanged
  - sender + receiver wire bytes match exactly (iconv writer output
    equals raw remote-charset entry output; iconv reader output equals
    UTF-8-native plain reader output)
  - round-trip preserves "café", "файл", "Ångström.txt" through their
    respective remote charsets
  - directory-prefixed compression operates on post-iconv bytes so
    sender and receiver agree on the iconv-then-compress order
  - unmappable source characters (Greek alpha into ISO-8859-1) and
    invalid wire bytes (malformed UTF-8) surface InvalidData errors,
    mirroring upstream's IOERR_GENERAL path in flist.c

Tests gated on `feature = "iconv"` and Unix where non-UTF-8 path bytes
are preserved by OsStr. Wire-format bytes verified against upstream
rsync 3.4.1's flist.c send_file_entry/recv_file_entry encoding.
oferchen added a commit that referenced this pull request May 5, 2026
#1911) (#3555)

* feat(transfer): wire IconvSetting -> FilenameConverter at config build (#1911)

Per docs/audits/iconv-pipeline.md Finding 4, --iconv was previously a
no-op end to end: the CLI parsed --iconv into IconvSetting and forwarded
the value to the remote peer, but the local process never converted that
setting into a protocol::FilenameConverter, so the file-list reader and
writer ran with iconv: None and copied raw bytes through transfer.

The bridge is now in place across all three transfer modes:

- SSH push/pull (and embedded SSH): apply_common_server_flags writes
  `server_config.connection.iconv = config.iconv().resolve_converter()`
  in crates/core/src/client/remote/flags.rs:228, invoked from
  build_server_config_for_receiver and build_server_config_for_generator
  (ssh_transfer.rs and embedded_ssh_transfer.rs).
- Daemon push/pull: apply_common_server_flags is also called from
  daemon_transfer/orchestration/server_config.rs (both paths). On the
  daemon side, module_access/client_args.rs:283 resolves the module's
  `charset =` directive into the same FilenameConverter slot.
- Local copy: LocalCopyOptionsBuilder::apply_iconv in
  crates/core/src/client/run/mod.rs:399 writes the converter onto
  LocalCopyOptions::iconv, since the local-copy executor in the engine
  crate bypasses the SSH/daemon ServerConfig builder.

The transfer crate consumes the converter at:

- crates/transfer/src/receiver/mod.rs:369 - reader.with_iconv(...)
- crates/transfer/src/generator/mod.rs:570 - writer.with_iconv(...)

These hand the converter to the protocol crate's FileListReader /
FileListWriter, where #1912 and #1913 already apply it via
apply_encoding_conversion on every emitted and ingested filename. With
the producer-side wiring landed, the test_iconv_local_ssh interop
scenario starts gating regressions, so the standalone:iconv-local-ssh
entry is removed from KNOWN_FAILURES and DASHBOARD_ENTRIES.

Upstream references:

- flist.c:1579-1603 send_file_entry filename iconvbufs(ic_send, ...)
- flist.c:738-754 recv_file_entry filename iconvbufs(ic_recv, ...)
- options.c::recv_iconv_settings parses --iconv=LOCAL,REMOTE

* revert(ci): restore standalone:iconv-local-ssh known failure

PR #3458 (cd76ec2) wired IconvSetting -> FilenameConverter for the
local generator/receiver, but two architectural gaps prevent the
SSH iconv scenario from passing end-to-end:

- --iconv is not forwarded to the spawned remote rsync. Upstream
  options.c:2715-2723 forwards the post-comma half of iconv_opt so
  the remote setup_iconv() agrees on charsets.
- FilenameConverter (crates/protocol/src/iconv/converter.rs) treats
  remote_encoding as the literal wire charset. Upstream rsync.c:87-147
  always uses UTF-8 on the wire and treats `charset` as each peer's
  local charset.

Audit findings 1-3 and 5 (docs/audits/iconv-pipeline.md) also remain
open: symlink target transcoding, --files-from line forwarding,
secluded-args argv re-encoding, CF_SYMLINK_ICONV gating.

Restoring the KF entry until the iconv pipeline rework lands.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant