test(protocol): golden byte tests for iconv-converted filenames (#1919) #3552
Merged
Locks the byte-level wire encoding of filenames that pass through the `--iconv` pipeline (#1911 config, #1912 sender, #1913 receiver), so future regressions in either the sender or the receiver are caught at the byte level rather than at the round-trip level.

Coverage:

- sender (UTF-8 -> ISO-8859-1, KOI8-R, windows-1252) emits exact remote-charset bytes; the `suffix_len` header reflects the post-iconv length, not the UTF-8 source length
- receiver decodes ISO-8859-1, KOI8-R, and windows-1252 wire bytes into UTF-8 disk names byte-for-byte
- identity converter (UTF-8 <-> UTF-8) preserves raw UTF-8 bytes
- ASCII-only names pass through both directions unchanged
- sender + receiver wire bytes match exactly (iconv writer output equals raw remote-charset entry output; iconv reader output equals UTF-8-native plain reader output)
- round-trip preserves "café", "файл", "Ångström.txt" through their respective remote charsets
- directory-prefixed compression operates on post-iconv bytes so sender and receiver agree on the iconv-then-compress order
- unmappable source characters (Greek alpha into ISO-8859-1) and invalid wire bytes (malformed UTF-8) surface `InvalidData` errors, mirroring upstream's `IOERR_GENERAL` path in `flist.c`

Tests are gated on `feature = "iconv"` and on Unix, where `OsStr` preserves non-UTF-8 path bytes. Wire-format bytes verified against upstream rsync 3.4.1's `flist.c` `send_file_entry`/`recv_file_entry` encoding.
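To make the sender-side expectations concrete, here is a minimal sketch of the kind of bytes the golden tests pin down. It is not the code under test: `utf8_to_latin1` and `post_iconv_len` are hypothetical helpers that stand in for the real iconv converter and flist writer, and the conversion relies only on the fact that ISO-8859-1 maps U+0000..=U+00FF to the identical byte value.

```rust
use std::io;

/// Hypothetical stand-in for an iconv UTF-8 -> ISO-8859-1 conversion.
/// Code points U+0000..=U+00FF become the identical byte; anything
/// above that range has no ISO-8859-1 representation.
fn utf8_to_latin1(name: &str) -> io::Result<Vec<u8>> {
    name.chars()
        .map(|c| {
            u8::try_from(u32::from(c)).map_err(|_| {
                io::Error::new(
                    io::ErrorKind::InvalidData,
                    format!("{c:?} is not representable in ISO-8859-1"),
                )
            })
        })
        .collect()
}

/// Illustrative only: the name length a sender would announce,
/// measured on the converted bytes rather than on the UTF-8 source.
fn post_iconv_len(name: &str) -> io::Result<usize> {
    Ok(utf8_to_latin1(name)?.len())
}
```

Under these assumptions, `utf8_to_latin1("caf\u{e9}")` yields `63 61 66 e9` (four bytes, while the UTF-8 source is five), and a Greek alpha fails with `io::ErrorKind::InvalidData`, which matches the behavior the description above says the tests lock.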
oferchen added a commit that referenced this pull request on May 5, 2026
Summary
- Locks the byte-level wire encoding of filenames that pass through the `--iconv` pipeline (config build #1911, sender flist emit #1912, receiver flist ingest #1913) so future regressions in either direction are caught at the byte level.
- Mirrors upstream's `IOERR_GENERAL` path in `flist.c`: a Greek alpha source name targeting ISO-8859-1 (sender) and malformed UTF-8 wire bytes (receiver) both surface `io::ErrorKind::InvalidData`.

Coverage

| Test | Expectation |
| --- | --- |
| `golden_sender_utf8_to_latin1_cafe` | `café` -> `63 61 66 e9` |
| `golden_sender_utf8_to_koi8r_cyrillic` | `файл` -> `c6 c1 ca cc` |
| `golden_sender_utf8_to_windows1252_angstrom` | `Ångström.txt` -> 12-byte CP1252 form, `suffix_len=12` |
| `golden_sender_ascii_passthrough_under_iconv` | |
| `golden_sender_identity_converter_preserves_utf8` | |
| `golden_sender_unmappable_char_fails_conversion` | `InvalidData` |
| `golden_sender_suffix_len_is_post_iconv_for_shrinking_conversion` | `suffix_len = 4` for `café` over ISO-8859-1 |
| `golden_sender_suffix_len_is_post_iconv_for_cyrillic` | `suffix_len = 4` for `файл` over KOI8-R |
| `golden_receiver_latin1_to_utf8_cafe` | `63 61 66 e9` -> UTF-8 `café` |
| `golden_receiver_koi8r_to_utf8_cyrillic` | `c6 c1 ca cc` -> UTF-8 `файл` |
| `golden_receiver_windows1252_to_utf8_angstrom` | |
| `golden_receiver_ascii_passthrough_under_iconv` | |
| `golden_receiver_invalid_remote_bytes_fail` | `[0xc3, 0x28]` (bad UTF-8) -> `InvalidData` |
| `golden_round_trip_cafe_via_latin1` | `café` survives ISO-8859-1 hop |
| `golden_round_trip_cyrillic_via_koi8r` | `файл` survives KOI8-R hop |
| `golden_round_trip_angstrom_via_windows1252` | `Ångström.txt` survives windows-1252 hop |
| `golden_round_trip_dir_then_file_compressed_after_iconv` | |
| `golden_sender_wire_equals_raw_remote_bytes_for_cafe` | |
| `golden_receiver_post_iconv_equals_utf8_native_decode` | |

Tests gated on `feature = "iconv"`. Receiver-side and round-trip tests additionally gated on Unix, where `OsStr` preserves arbitrary byte sequences (required to thread non-UTF-8 wire bytes through the writer for the receiver tests).

Wire-format expectations cross-referenced against upstream rsync 3.4.1 `flist.c`:

- `send_file_entry()` lines 1580-1602: `iconvbufs(ic_send, ...)` on `file->dirname` and `file->basename` before they reach the wire.
- `recv_file_entry()` lines 738-753: `iconvbufs(ic_recv, ...)` after `read_sbuf()` into `thisname`, before `clean_fname()`.

Test plan

- `cargo nextest run -p protocol --all-features -E 'test(iconv) or test(golden)'` -> 443 tests run, 443 passed
- `cargo fmt --all -- --check` clean
- `cargo clippy -p protocol --all-features --tests --no-deps -- -D warnings` clean
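For readers who want to sanity-check the receiver-side rows by hand, the following sketch decodes the KOI8-R golden bytes and rejects the malformed UTF-8 pair using only the standard library. It is an illustration, not the converter exercised by the tests: `koi8r_to_utf8` and `utf8_wire_to_name` are hypothetical names, and the KOI8-R helper deliberately covers only ASCII and the lowercase Cyrillic row (bytes 0xC0..=0xDF).

```rust
use std::io;

/// Lowercase Cyrillic row of KOI8-R: bytes 0xC0..=0xDF, in order.
const KOI8R_LOWER: &str = "юабцдефгхийклмнопярстужвьызшэщчъ";

/// Hypothetical partial KOI8-R -> UTF-8 decoder (ASCII plus the
/// lowercase row only); every other byte is reported as InvalidData.
fn koi8r_to_utf8(wire: &[u8]) -> io::Result<String> {
    let row: Vec<char> = KOI8R_LOWER.chars().collect();
    wire.iter()
        .map(|&b| match b {
            0x00..=0x7f => Ok(char::from(b)),
            0xc0..=0xdf => Ok(row[usize::from(b - 0xc0)]),
            _ => Err(io::Error::new(
                io::ErrorKind::InvalidData,
                format!("byte {b:#04x} is not covered by this sketch"),
            )),
        })
        .collect()
}

/// Hypothetical UTF-8 wire check: malformed sequences surface as
/// InvalidData instead of being passed through to a disk name.
fn utf8_wire_to_name(wire: &[u8]) -> io::Result<&str> {
    std::str::from_utf8(wire).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

With these helpers, `koi8r_to_utf8(&[0xc6, 0xc1, 0xca, 0xcc])` produces `файл`, and `utf8_wire_to_name(&[0xc3, 0x28])` fails because `0x28` is not a valid continuation byte after the `0xc3` lead byte.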