perf(sourcemap-upload): switch artifact-bundle ZIP to STORED to let per-chunk zstd actually compress (75% CPU saved, 5% wire saved) #847

Summary

The chunk-upload code path double-compresses: (1) DEFLATE inside the artifact-bundle ZIP, (2) zstd/gzip on each wire chunk. With zstd now in the protocol (#823), the ZIP-level DEFLATE adds significant CPU cost while contributing essentially nothing to wire size — zstd extracts the same redundancy in a fraction of the time.

Switching the ZIP to STORED (compression=zipfile.ZIP_STORED) and letting per-chunk zstd L3 do all the compression is strictly better on every axis I measured: less CPU, less wire, simpler code, no protocol change.

Measurements

Three real-world payload shapes, comparing the current architecture (DEFLATE inside the ZIP, then zstd on the wire) against the proposed one (STORED, then zstd on the wire):

Payload                                 Current (DEFLATE + zstd)   Proposed (STORED + zstd)   CPU saved        Wire saved
CLI binary (3.2 MB JS + 11.2 MB map)    717 ms / 3,797,107 B       177 ms / 3,591,491 B       −540 ms (75%)    −205 KB (5.4%)
Docs site (5 pairs, 2.77 MiB)           173 ms / 789,228 B          56 ms / 784,954 B         −117 ms (67%)    −4 KB (0.5%)
Synthetic JS+map (10 MiB)               486 ms / 7,686,289 B       134 ms / 7,626,670 B       −352 ms (72%)    −60 KB (0.8%)

Methodology: rebuilt the actual artifact-bundle ZIPs from real source files, then chunked at the server-advertised 8 MiB and compressed each chunk per the wire codec. Times are encode-only (decompression measured separately in AGENTS.md lore: zstd L3 ~13 ms vs gzip L6 ~22 ms on equivalent server-side workload).
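The harness was roughly the following. This is a sketch rather than the exact script: it assumes the `zstandard` Python package, uses placeholder file names, and builds the ZIP with the stdlib `zipfile` module instead of the CLI's own writer.

```python
# Benchmark sketch: build the artifact bundle two ways, chunk at 8 MiB,
# zstd-compress each chunk, and report encode time + wire bytes.
# Assumes the `zstandard` package; file names below are placeholders.
import io
import time
import zipfile

import zstandard

CHUNK_SIZE = 8 * 1024 * 1024  # server-advertised chunk size


def build_bundle(files: dict[str, bytes], compression: int) -> bytes:
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression, compresslevel=6) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def wire_encode(bundle: bytes, level: int = 3) -> int:
    # Compress each 8 MiB chunk independently, as the upload protocol does.
    cctx = zstandard.ZstdCompressor(level=level)
    return sum(
        len(cctx.compress(bundle[off:off + CHUNK_SIZE]))
        for off in range(0, len(bundle), CHUNK_SIZE)
    )


files = {
    "bin.js": open("bin.js", "rb").read(),
    "bin.js.map": open("bin.js.map", "rb").read(),
}

for label, method in [("DEFLATE", zipfile.ZIP_DEFLATED), ("STORED", zipfile.ZIP_STORED)]:
    start = time.perf_counter()
    wire_bytes = wire_encode(build_bundle(files, method))
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{label:7s}  {wire_bytes:>12,d} wire bytes  {elapsed_ms:6.0f} ms encode")
```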

Why the wire size barely changes

The current architecture's ZIP-level DEFLATE L6 already squeezes bin.js + bin.js.map down to ~27% of their raw size, within about 6% of what zstd L3 manages on the raw bytes:

RAW total                14,028,274 bytes  100.0%
DEFLATE L6 inside ZIP     3,797,010 bytes   27.1%   ← what the ZIP becomes
gzip L6 on top            3,797,267 bytes   27.1%   ← essentially unchanged
zstd L3 on top            3,797,107 bytes   27.1%   ← essentially unchanged

The per-chunk wire codec has near-zero work to do because DEFLATE has already done it. Switch to STORED and the wire codec does the actual work:

RAW total                14,028,274 bytes  100.0%
zstd L3 on raw chunks     3,591,491 bytes   25.6%   ← strictly smaller

The 5.4% savings on the CLI payload is consistent with the prior AGENTS.md benchmark ("zstd L3 vs gzip L6 on real 8 MiB sourcemap chunks: ~5% smaller"). That benchmark was on uncompressed input — it described the codec's behavior in isolation, not the architecture's behavior end-to-end.
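The plateau is easy to check directly on any large, redundant input. A minimal sketch, with the same `zstandard` assumption and a placeholder file name:

```python
import zlib

import zstandard

raw = open("bin.js.map", "rb").read()   # placeholder: any large, redundant file
cctx = zstandard.ZstdCompressor(level=3)

deflated = zlib.compress(raw, 6)        # roughly what DEFLATE L6 inside the ZIP produces
print(len(deflated))                    # far smaller than raw
print(len(cctx.compress(deflated)))     # ~ len(deflated): nothing left for zstd to extract
print(len(cctx.compress(raw)))          # smaller than both: zstd does the real work
```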

Why CPU drops 67–75%

DEFLATE L6 is doing real work compressing the ZIP entries on the encode side; that work is wasted because zstd then operates on already-compressed bytes (which also takes time, even though the output is essentially the same size). Skipping the DEFLATE pass eliminates the wasted work and cuts total encode time for the CLI binary case from 717 ms to 177 ms.

Server side

Sentry's ArtifactBundle reader uses zipfile.ZipFile(fileobj) (in src/sentry/models/artifactbundle.py), which transparently handles both DEFLATE and STORED entries. No protocol change. STORED entries are also marginally cheaper to read on the server (skip the per-entry DEFLATE decompress on lookup).
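For illustration only (the real reader is Sentry's, not this snippet): Python's `zipfile` selects the decoder per entry from the entry's compression method, so a STORED bundle is read with the same call as a DEFLATE one.

```python
import io
import zipfile

def read_entry(bundle_bytes: bytes, name: str) -> bytes:
    # ZipFile inspects each entry's compress_type (ZIP_STORED or ZIP_DEFLATED)
    # and decompresses accordingly; the caller never sees the difference.
    # STORED entries skip the inflate step entirely.
    with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as zf:
        return zf.read(name)
```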

Rollout considerations

  • Pre-zstd self-hosted servers (gzip-only): STORED + gzip L6 yields essentially the same wire size as the current architecture; it just moves the compression work from the ZIP step to the per-chunk gzip step, still on the client. Net: break-even on those servers; no regression.
  • Concurrency: the larger STORED bundle (14 MiB vs 3.6 MiB) splits into more 8 MiB chunks (2 vs 1 for the CLI binary). Servers advertise concurrency ≥ 8, so parallel uploads compensate. No latency penalty expected.
  • Chunk dedup: STORED chunks contain raw bytes; DEFLATE chunks contain compressed bytes. Both reflow on tiny edits to the source, so dedup behavior is comparable.
  • Memory: ZipWriter currently holds one entry's compressed output in memory at a time. STORED removes the DEFLATE buffer, reducing peak memory. Strictly better.

Proposed implementation

Add compress: boolean (default false post-zstd, gated on a flag while gathering data) to ZipWriter.addEntry(), or simpler — add a static factory ZipWriter.createStored() that hardcodes STORED. Update buildArtifactBundle() in src/lib/api/sourcemaps.ts to use the STORED variant.

I'm happy to follow up with a PR. Filing as an issue first because:

  1. The numbers come from local benchmarks; I'd like a sanity check on whether anyone sees a payload shape where DEFLATE inside the ZIP does real work that zstd doesn't replicate.
  2. We may want a small kill-switch (SENTRY_CHUNK_UPLOAD_FORCE_DEFLATE_ZIP=1) for early debugging in case a server somewhere chokes on STORED entries.

Related

  • feat(sourcemap-upload): Add zstd compression support #823: zstd compression for chunk uploads. This issue is the follow-up: #823 lands the codec capability; this change makes it actually pay off.
  • AGENTS.md lore entry "Chunk-upload compression level choices (CLI side)" — the prior benchmark, on isolated uncompressed input, missed that the architecture pre-compresses
