Fix tar upload content-length mismatch (#1086) #1089

Closed
noamzbr wants to merge 2 commits into e2b-dev:main from noamzbr:fix-tar-upload-content-length-mismatch

@noamzbr
Contributor

@noamzbr noamzbr commented Jan 21, 2026

Fixes #1086

Use portable mode for deterministic gzip output

@changeset-bot

changeset-bot bot commented Jan 21, 2026

⚠️ No Changeset found

Latest commit: e8db0f6

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2be4177c0c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Top-level portable strips executable bits; gzip option only affects header
@mishushakov
Member

mishushakov commented Jan 22, 2026

I don't think we can do it this way due to how hashes are calculated. Is there another way to ensure these are the same?
Maybe pipe the stream to two readers at the same time?

@noamzbr
Contributor Author

noamzbr commented Jan 23, 2026

Note: I've changed it from the tar-level `portable: true` to `gzip: { portable: true }`. With `gzip: { portable: true }`, we expect only the gzip wrapper headers to be tweaked, so the fileHash should be identical.
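To illustrate the claim above, here is a minimal sketch using only Node's built-in `zlib` and `crypto`. It does not call the tar library. It assumes the portable gzip option only rewrites the OS byte of the gzip header (offset 9 in RFC 1952), which is an assumption about the library and is not confirmed in this thread, and it assumes the hash is taken over the archive contents rather than the compressed bytes. The `payload` buffer is a stand-in for real tar data.

```typescript
import { createHash } from "node:crypto";
import { gzipSync, gunzipSync } from "node:zlib";

// Stand-in for tar bytes; a real archive would come from the tar library.
const payload: Buffer = Buffer.from("pretend this is a tar archive\n".repeat(64));

const sha256 = (data: Buffer): string =>
  createHash("sha256").update(data).digest("hex");

// Gzip once, then copy the result and overwrite the OS byte (offset 9 in the
// RFC 1952 header) with 0xff ("unknown"), imitating a portable-style header.
const original: Buffer = gzipSync(payload);
const portableLike: Buffer = Buffer.from(original);
portableLike[9] = 0xff;

// The wrapper differs by at most one header byte, but the decompressed
// payload, and so any hash taken over it, is unchanged.
const sameLength: boolean = original.length === portableLike.length;
const samePayloadHash: boolean =
  sha256(gunzipSync(original)) === sha256(gunzipSync(portableLike));
```

If the hash were instead computed over the compressed bytes, a header change would alter it, so this only supports the comment under the stated assumption.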

But sure, if you prefer to avoid changing the archive mechanism entirely, can you tell me more about what you had in mind? If the presigned PUT requires Content-Length (the SDK currently assumes it does), we can't really stream once to upload + counter, because headers must be sent before the body. Did you mean switching the upload to chunked (no Content-Length), making the two-pass tar generation stable (e.g. normalize/strip volatile pax atime/ctime via onWriteEntry) so both passes produce identical bytes, or something else?
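The "make both passes identical" option can be pictured with a toy model. This sketch does not use the tar library or its `onWriteEntry` hook. `EntryMeta`, `normalize`, and `digest` are hypothetical names, and JSON stands in for real header serialization. It only shows why a field such as atime, which can change when a file is read, makes two passes diverge unless it is stripped.

```typescript
import { createHash } from "node:crypto";

// Hypothetical, simplified entry metadata; real tar headers carry more fields.
interface EntryMeta {
  path: string;
  mode: number;
  mtime: number;
  atime?: number; // volatile: may change between the two passes
  ctime?: number; // volatile
}

// Drop the volatile fields; keep mode so permissions are preserved.
function normalize(meta: EntryMeta): EntryMeta {
  return { path: meta.path, mode: meta.mode, mtime: meta.mtime };
}

// Stand-in for "the bytes this pass would emit".
function digest(entries: EntryMeta[]): string {
  return createHash("sha256").update(JSON.stringify(entries)).digest("hex");
}

// Same file seen in two passes; only atime moved in between.
const firstPass: EntryMeta[] = [
  { path: "app/run.sh", mode: 0o755, mtime: 1700000000, atime: 1700000100 },
];
const secondPass: EntryMeta[] = [
  { path: "app/run.sh", mode: 0o755, mtime: 1700000000, atime: 1700000105 },
];

const rawMatch: boolean = digest(firstPass) === digest(secondPass);
const normalizedMatch: boolean =
  digest(firstPass.map(normalize)) === digest(secondPass.map(normalize));
```

Here `rawMatch` is false and `normalizedMatch` is true, while `mode` is untouched, which matches the requirement that permissions survive.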

@mishushakov
Member

Okay, I see.
What we require is that the files inside the tar keep the same permissions as the source.

As far as I understand, with `gzip: { portable: true }` the contents of the tar should be unchanged?

For the upload of the files we are using pre-signed URLs that require Content-Length to be specified before the upload. My initial idea was to have 2 streams - one for counting the archive length and another for actual upload.
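The two-stream idea might be sketched as below, using only Node's `stream` module. `ArchiveFactory`, `countBytes`, `enforceLength`, and `prepareUpload` are hypothetical names, not SDK functions, and the sketch assumes the archive source can be created twice. The first pass only counts bytes; the second pass carries the body and fails fast if its length drifts from the announced Content-Length.

```typescript
import { Readable } from "node:stream";

// Hypothetical archive source. Each call must yield the same bytes for the
// two-pass approach to be safe.
type ArchiveFactory = () => Readable;

// Pass 1: drain a stream and count its bytes without keeping them in memory.
async function countBytes(stream: Readable): Promise<number> {
  let total = 0;
  for await (const chunk of stream) {
    total += (chunk as Buffer).length;
  }
  return total;
}

// Pass 2: re-emit chunks, throwing if the body length diverges from the
// length that was announced in the headers.
async function* enforceLength(
  stream: Readable,
  expected: number,
): AsyncGenerator<Buffer> {
  let sent = 0;
  for await (const chunk of stream) {
    const buf = chunk as Buffer;
    sent += buf.length;
    if (sent > expected) {
      throw new Error(`body exceeds Content-Length ${expected}`);
    }
    yield buf;
  }
  if (sent !== expected) {
    throw new Error(`body is ${sent} bytes, expected ${expected}`);
  }
}

// Build the headers from pass 1 and the guarded body from pass 2.
async function prepareUpload(makeArchive: ArchiveFactory) {
  const contentLength = await countBytes(makeArchive());
  return {
    headers: { "Content-Length": String(contentLength) },
    body: enforceLength(makeArchive(), contentLength),
  };
}
```

The guard does not make the passes deterministic; it only turns a silent mismatch into an explicit error before the HTTP client reports one.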

@noamzbr
Contributor Author

noamzbr commented Jan 26, 2026

> Okay, I see. What we require is that the files inside the tar keep the same permissions as the source.
>
> As far as I understand, with `gzip: { portable: true }` the contents of the tar should be unchanged?
>
> For the upload of the files we are using pre-signed URLs that require Content-Length to be specified before the upload. My initial idea was to have 2 streams - one for counting the archive length and another for actual upload.

Yes, `gzip: { portable: true }` will not change file modes (permissions).

@mishushakov
Member

sounds good, did you verify this fixes #1086 for you?

@noamzbr
Contributor Author

noamzbr commented Jan 26, 2026

> sounds good, did you verify this fixes #1086 for you?

Yes, I've been using it for a couple of days now.

mishushakov added a commit that referenced this pull request Jan 26, 2026
<!-- CURSOR_SUMMARY -->
> [!NOTE]
> Ensures deterministic tar.gz archives during upload to prevent content-length mismatches.
>
> - In `tarFileStream`, switch gzip option from `true` to `gzip: { portable: true }` to produce stable gzip headers without altering file modes
> - Adds a changeset noting a patch release for this behavior change
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit ec90756. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: noamzbr <noamzbr@users.noreply.github.com>
@mishushakov
Member

Thanks for your contribution. We merged it in #1095.

Development

Successfully merging this pull request may close these issues.

[Bug]: tarFileStreamUpload() creates gzipped tar stream twice causing UND_ERR_REQ_CONTENT_LENGTH_MISMATCH
