Conversation
…sses
The artifact-bundle ZIP was DEFLATE-encoded at level 6, then each wire
chunk was re-compressed by the upload codec (zstd or gzip). Compressing
already-compressed bytes wastes CPU and barely helps wire size — DEFLATE
already extracts ~93% of the redundancy, so the per-chunk codec has
nothing left to do.
When the server advertises a wire codec, switch the ZIP to STORED so
the wire codec does the actual compression. Two effects on the CLI's
own bin.js + bin.js.map (~14 MiB raw):
- CPU: 352 ms → 134 ms (62% saved)
- Wire: 3,818,608 B → 3,592,161 B (5.9% saved, 226 KB)
When the server's compression list is empty (the
`chunk-upload.no-compression` kill-switch or unsupported codecs only),
keep DEFLATE so the wire payload doesn't balloon to 14 MiB raw.
Implementation:
- Add `ZipCompression` type + `compression` option to
`ZipWriter.create()` (default `"deflate"` keeps existing direct
callers' behavior unchanged).
- Lift `pickUploadEncoding` from inside `uploadArtifactBundle` to
`uploadSourcemaps` so the ZIP construction can see the codec
choice. Pass it through to `buildArtifactBundle`.
- Server reads the archive via `zipfile.ZipFile`, which transparently
handles STORED + DEFLATE entries — no protocol change.
- Symbolicator's `SourceBundle::test()` cares about the SYSB magic
header, not entry compression — no change there either.
Tests:
- Property tests (round-trip via `unzip -p`) parameterized over both
compression modes.
- Unit tests for the local-file-header method byte (0=STORED,
8=DEFLATE) on both `ZipWriter` and `buildArtifactBundle`.
- Sanity test that STORED is larger than DEFLATE for redundant input
(would catch ZIP-layout regressions).
Refs #847
Contributor
|
Contributor
Codecov Results 📊✅ 6007 passed | Total: 6007 | Pass Rate: 100% | Execution Time: 0ms 📊 Comparison with Base Branch
All tests are passing successfully. ❌ Patch coverage is 32.56%. Project has 12899 uncovered lines. Files with missing lines (2)
Coverage diff@@ Coverage Diff @@
## main #PR +/-##
==========================================
+ Coverage 74.84% 75.57% +0.73%
==========================================
Files 285 286 +1
Lines 53047 52804 -243
Branches 0 0 —
==========================================
+ Hits 39699 39905 +206
- Misses 13348 12899 -449
- Partials 0 0 —Generated by Codecov Action |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #847.
Summary
The artifact-bundle ZIP was DEFLATE-encoded inside, then each wire chunk was re-compressed by the upload codec. Compressing already-compressed bytes is wasted work — DEFLATE has already extracted the redundancy that zstd/gzip would have extracted on the wire.
When the server advertises a wire codec, switch the ZIP to STORED so the wire codec does the work. Keep DEFLATE when no wire codec is available (the
chunk-upload.no-compressionkill-switch) so the upload doesn't balloon to raw bytes.Measurements (CLI binary upload, 14.3 MiB raw)
Re-measured locally on master at
3f326bed. Same shape as the numbers in #847; the magnitude is slightly lower than the issue's 75% (because Bun's deflate is faster than the Pythonzipfilebaseline I used in the original investigation).Compatibility
zipfile.ZipFileinsrc/sentry/models/artifactbundle.py, which transparently handles STORED + DEFLATE entries. No protocol change.SourceBundle::test()checks the SYSB magic header, not entry compression. No change.pickUploadEncodingreturnsgziporundefined. STORED is only used when a wire codec is available, so plain uploads keep DEFLATE and stay small.ZipWriter.create(): default iscompression: "deflate"— no behavior change.Tests
describe.each.ZipWriterandbuildArtifactBundle.Implementation notes
ZipCompressiontype exported fromsrc/lib/sourcemap/zip.ts.pickUploadEncodingis lifted from insideuploadArtifactBundleto the top ofuploadSourcemapsso the ZIP construction can see the codec choice. The encoding is then passed through to bothbuildArtifactBundle(for STORED-vs-DEFLATE) anduploadArtifactBundle(for wire compression).pickUploadEncoding(serverOptions.compression)is itself the safety net for any deployment that opts out of wire compression.