Various patches motivated by https://github.com/bootc-dev/bootc/pull/2073 #19
Merged
cgwalters merged 3 commits into composefs:main on Mar 17, 2026
Motivated by bootc-dev/bootc#2073, where Go's archive/tar (used by Docker/BuildKit) emits PAX path headers for non-ASCII filenames like Főtanúsítvány.pem (valid UTF-8, but non-ASCII). PAX headers take precedence over basic tar headers per POSIX, so code that remaps paths by rewriting the basic header must also update or strip PAX path/linkpath records.

tar-core already handles non-UTF-8 PAX path values correctly (raw `&[u8]` throughout, matching Go archive/tar and the Rust tar crate), but this was untested. Add tests covering: parser acceptance of non-UTF-8 PAX path bytes, lossy conversion, builder->parser roundtrip with a >100 byte path (to actually trigger PAX emission), linkpath preservation, and PaxExtension value_bytes() vs value() behavior.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
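
To illustrate why remapping code has to touch the PAX payload as well as the basic header, here is a minimal std-only sketch that drops `path` and `linkpath` records from an extended header payload while treating values as raw bytes. The helper names `pax_record` and `strip_pax_paths` are hypothetical and are not part of tar-core, tar-rs, or bootc; the record layout (`<len> <key>=<value>\n`, with `<len>` counting the whole record) follows the POSIX pax format.

```rust
/// Hypothetical helper: encode one PAX record as "<len> <key>=<value>\n".
/// <len> is the decimal length of the whole record, including its own digits.
fn pax_record(key: &str, value: &[u8]) -> Vec<u8> {
    let base = key.len() + value.len() + 3; // ' ', '=', '\n'
    let mut len = base + 1;
    // The digit count of <len> feeds back into <len>; iterate to a fixed point.
    while base + len.to_string().len() != len {
        len = base + len.to_string().len();
    }
    let mut rec = format!("{} {}=", len, key).into_bytes();
    rec.extend_from_slice(value); // value stays raw bytes, no UTF-8 check
    rec.push(b'\n');
    rec
}

/// Hypothetical helper: copy a PAX payload, omitting `path` and `linkpath`
/// records. Returns None if the payload is malformed.
fn strip_pax_paths(mut data: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(data.len());
    while !data.is_empty() {
        // The length field is ASCII decimal, terminated by the first space.
        let sp = data.iter().position(|&b| b == b' ')?;
        let len: usize = std::str::from_utf8(&data[..sp]).ok()?.parse().ok()?;
        if len < sp + 2 || len > data.len() || data[len - 1] != b'\n' {
            return None;
        }
        let record = &data[..len];
        let body = &record[sp + 1..len - 1]; // "key=value" without the newline
        let eq = body.iter().position(|&b| b == b'=')?;
        let key = &body[..eq];
        if !matches!(key, b"path" | b"linkpath") {
            out.extend_from_slice(record);
        }
        data = &data[len..];
    }
    Some(out)
}
```

Stripping is only one of the two options named above; a remapper could instead rewrite the `path` value and re-encode the record, which changes its length prefix.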
Test the PAX 'x' -> GNU 'L' -> real entry ordering, which is what tar-rs's builder produces when you call append_pax_extensions() followed by append_data() with a long path. This matters for ecosystem compatibility -- bootc's copy_entry (bootc-dev/bootc#2073) generates exactly this layout when filtering PAX extensions during path remapping.

The parser already handles this correctly via PendingMetadata accumulation across recursive parse_header calls, but the reversed ordering was untested. Also test that PAX path still wins over GNU long name regardless of which comes first in the byte stream.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
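
The order-independence described in this commit can be pictured with a small accumulator. This is a sketch only: `PendingNames` and its methods are hypothetical stand-ins and do not reproduce tar-core's actual PendingMetadata type. The point it shows is that if each extension entry is stored in its own slot and precedence is applied only when the real entry arrives, the result does not depend on which extension came first.

```rust
/// Hypothetical accumulator for name overrides seen before the real entry.
#[derive(Default)]
struct PendingNames {
    gnu_long_name: Option<Vec<u8>>, // from a GNU 'L' entry
    pax_path: Option<Vec<u8>>,      // from the `path` record of a PAX 'x' entry
}

impl PendingNames {
    fn on_gnu_long_name(&mut self, name: &[u8]) {
        self.gnu_long_name = Some(name.to_vec());
    }

    fn on_pax_path(&mut self, path: &[u8]) {
        self.pax_path = Some(path.to_vec());
    }

    /// Called when the real entry's header is reached.
    /// Precedence: PAX path, then GNU long name, then the basic header name.
    fn resolve(self, header_name: &[u8]) -> Vec<u8> {
        self.pax_path
            .or(self.gnu_long_name)
            .unwrap_or_else(|| header_name.to_vec())
    }
}
```

Because precedence is decided in `resolve` rather than at arrival time, both the 'x' -> 'L' and 'L' -> 'x' layouts yield the PAX path.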
Previously, add_pax() accepted calls regardless of ExtensionMode but finish() only emitted PAX records in Pax mode -- silently discarding any PAX data added in Gnu mode. The doc comment incorrectly claimed PAX extensions would be emitted regardless of mode.

Return HeaderError::IncompatibleMode instead, so callers get a clear error rather than quietly losing xattrs or other PAX metadata.

This is a breaking API change: add_pax() now returns Result<&mut Self> instead of &mut Self.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
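
For callers, the shape of this change might look roughly like the toy model below. `SketchBuilder`, `Mode`, and `SketchError` are hypothetical types written for illustration; they only mirror the behavior described in the commit message and are not tar-core's real builder, ExtensionMode, or HeaderError definitions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Gnu,
    Pax,
}

#[derive(Debug, PartialEq, Eq)]
enum SketchError {
    /// PAX data was added to a builder that will not emit PAX records.
    IncompatibleMode,
}

/// Hypothetical builder that queues PAX records for later emission.
struct SketchBuilder {
    mode: Mode,
    pax: Vec<(String, Vec<u8>)>,
}

impl SketchBuilder {
    fn new(mode: Mode) -> Self {
        SketchBuilder { mode, pax: Vec::new() }
    }

    /// Fails up front in Gnu mode instead of accepting data that would be
    /// dropped later. Returning `&mut Self` inside the Result keeps chaining.
    fn add_pax(&mut self, key: &str, value: &[u8]) -> Result<&mut Self, SketchError> {
        if self.mode != Mode::Pax {
            return Err(SketchError::IncompatibleMode);
        }
        self.pax.push((key.to_string(), value.to_vec()));
        Ok(self)
    }
}
```

With this signature, existing call chains need a `?` or explicit error handling after each `add_pax` call, which is the breaking part of the change.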
See bootc-dev/bootc#2073