Optional `--compress`: ship native addons ~70% smaller at the same runtime speed #3350

jdalton · 2026-06-25T21:10:26Z


napi build --compress ships any native addon compressed and self-extracting, so the install is smaller at the same runtime speed. As an example: a vite 8.1.0 install pulls in two big napi natives — lightningcss and the rolldown binding — and --compress takes their footprint from ~26 MB to ~8 MB on disk (darwin-arm64).

Measured against vite 8.1.0's resolved natives, default codec (zstd-16):

lightningcss 1.32.0: 8.52 MB → 2.54 MB (3.4×)
rolldown 1.1.3 binding: 17.22 MB → 5.61 MB (3.1×)
vite 8.1.0, both natives: 25.74 MB → 8.15 MB — ~17.6 MB off the install
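
As a quick check on the figures above, the ratios and the combined saving can be recomputed from the listed sizes. This is only a worked-arithmetic sketch: the `natives` array and the `summarize` helper are illustrative names, not part of napi-rs.

```typescript
// Sizes in MB, copied from the measurements listed above.
interface NativeSize {
  name: string;
  rawMB: number;
  compressedMB: number;
}

const natives: NativeSize[] = [
  { name: "lightningcss 1.32.0", rawMB: 8.52, compressedMB: 2.54 },
  { name: "rolldown 1.1.3 binding", rawMB: 17.22, compressedMB: 5.61 },
];

// Hypothetical helper: totals, per-addon ratios, and the overall percentage saved.
function summarize(items: NativeSize[]) {
  const rawMB = items.reduce((sum, n) => sum + n.rawMB, 0);
  const compressedMB = items.reduce((sum, n) => sum + n.compressedMB, 0);
  return {
    ratios: items.map((n) => (n.rawMB / n.compressedMB).toFixed(1)),
    rawMB: rawMB.toFixed(2),
    compressedMB: compressedMB.toFixed(2),
    savedMB: (rawMB - compressedMB).toFixed(1),
    percentSmaller: ((1 - compressedMB / rawMB) * 100).toFixed(0),
  };
}

console.log(summarize(natives));
```

For these two natives the combined result is roughly 68% smaller, which is consistent with the ~70% in the title.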

Output is byte-identical and runtime speed is unchanged — it's the same binary once loaded. PoC + tests on the feat/compress-native-addons branch; numbers reproduce on Node 26.3.1 with benchmark.mts.

`napi warm` — make the first require a cache hit

--compress adds one cost: the first time your app loads an addon, it unpacks it (~12 ms for lightningcss, ~21 ms for the rolldown binding), a one-time stall on the first run. napi warm moves that off the hot path: run it in postinstall or CI and it unpacks every installed addon up front, so your app's first require is already a cache hit. It unpacks them all at once, so the warm step takes about as long as the slowest single addon (~31 ms here), not the sum. It reads the same NAPI_RS_NATIVE_CACHE setting as the loader, so NAPI_RS_NATIVE_CACHE=workspace warms the shared monorepo root once.
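
The "slowest addon, not the sum" behaviour follows from starting every unpack before waiting on any of them. A minimal sketch of that idea, assuming the per-addon unpack is genuinely asynchronous; `warmAll`, `Unpack`, `sleep`, and the two file names are hypothetical stand-ins, not the PoC's actual code:

```typescript
// Hypothetical callback that decompresses one addon into the cache.
type Unpack = (addonPath: string) => Promise<void>;

// Start every unpack before awaiting any of them, so the work overlaps.
async function warmAll(addonPaths: string[], unpack: Unpack): Promise<number> {
  const startedAt = Date.now();
  await Promise.all(addonPaths.map((addonPath) => unpack(addonPath)));
  return Date.now() - startedAt; // elapsed milliseconds for the whole batch
}

// Stand-in for real work: resolves after `ms` milliseconds.
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Simulated costs, using the per-addon figures quoted above (placeholder file names).
const simulatedCost: Record<string, number> = {
  "lightningcss.node": 12,
  "rolldown-binding.node": 21,
};

warmAll(Object.keys(simulatedCost), (p) => sleep(simulatedCost[p] ?? 0)).then(
  (ms) => console.log(`warmed in ~${ms} ms`),
);
```

With overlapping work the batch finishes close to the longest individual delay; a sequential loop would instead take roughly the sum.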

Each warmed binary then gets a best-effort, transparent filesystem-compression pass (--fs-compress, on by default): APFS via afsctool, btrfs via btrfs filesystem defragment -czstd, NTFS via compact /c /exe:LZX. It is "use if it works": the pass is skipped silently where unsupported, and it never alters file contents (the OS serves the bytes decompressed on read, so dlopen is unaffected). A caching machine therefore keeps the on-disk win instead of holding a full expanded copy. --no-fs-compress opts out.

--in-place goes a step further: instead of using a separate cache, it expands each addon over its own .node in node_modules and filesystem-compresses it. That leaves a single copy (no compressed-plus-expanded duplication), and the loader then sniffs it as a native binary and loads it directly, with no decompress ever. It does mutate node_modules: pnpm imports build-script packages with clone-or-copy and caches the result via its side-effects cache, and a reinstall restores the compressed file, so run it from postinstall. On a filesystem without transparent compression the single copy is full-size. That is the trade for no duplication, which is why it's opt-in.

FAQ

What it does, and the one-time cost

napi build --compress stores each .node as a self-describing NAPC container. The addon keeps its name (no .zst/.br/.json sidecars), so a package's file list and optionalDependencies don't change. The file starts with the ASCII magic NAPC and a short human-readable note (so head foo.node explains it), followed by a small header (codec, original size, payload sha256) and the compressed payload; the build also drops an llms.txt alongside. The generated binding sniffs the first bytes. A real binary (ELF/Mach-O/PE) loads directly. A NAPC addon has its payload sha256 verified (over the small shipped bytes, not the larger output) and is then decompressed to the cache; zstd's frame checksum validates the decode for free. A mismatch throws loudly, so a corrupt or tampered addon never gets dlopen'd. (The NAPC magic plus a codec byte is also how brotli is detected, since brotli has no magic of its own.) After that it's a cache hit.
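
The sniff-and-verify step can be sketched roughly as follows. This is an illustration under assumptions, not the generated binding: `sniffAddon` and `verifyPayload` are hypothetical names, the exact NAPC header layout is not reproduced here, and the Mach-O check covers only the thin 64-bit little-endian and universal (fat) magics.

```typescript
import { createHash } from "node:crypto";

type AddonKind = "napc" | "elf" | "macho" | "pe" | "unknown";

// Classify a .node file by its leading bytes.
function sniffAddon(head: Uint8Array): AddonKind {
  const startsWith = (...bytes: number[]) =>
    bytes.every((b, i) => head[i] === b);

  if (startsWith(0x4e, 0x41, 0x50, 0x43)) return "napc"; // ASCII "NAPC"
  if (startsWith(0x7f, 0x45, 0x4c, 0x46)) return "elf"; // 0x7F "ELF"
  if (startsWith(0x4d, 0x5a)) return "pe"; // "MZ" stub that precedes a PE image
  if (
    startsWith(0xcf, 0xfa, 0xed, 0xfe) || // 0xFEEDFACF as stored on little-endian
    startsWith(0xca, 0xfe, 0xba, 0xbe) // universal (fat) binary
  ) {
    return "macho";
  }
  return "unknown";
}

// Hash the compressed payload (the small shipped bytes) and fail before any
// decompression or dlopen if it does not match the digest from the header.
function verifyPayload(payload: Uint8Array, expectedSha256Hex: string): void {
  const actual = createHash("sha256").update(payload).digest("hex");
  if (actual !== expectedSha256Hex) {
    throw new Error(
      `NAPC payload sha256 mismatch: expected ${expectedSha256Hex}, got ${actual}`,
    );
  }
}
```

Hashing the compressed bytes rather than the expanded output keeps the check proportional to the download size, which matches the "small shipped bytes" note above.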

first-load cost: ~7.6 ms decompress + ~5.1 ms write + ~0.9 ms verify + ~1 ms dlopen ≈ 15 ms

Once decompressed it's byte-for-byte a raw .node, so resident memory and runtime speed don't change. The only cost is the first-load expansion above — and napi warm moves even that to install time.

Codec: zstd vs brotli (--compress-codec)

codec comparison: zstd decodes ~2.5x faster than brotli at near-identical size

zstd is the default: it decodes ~2.5× faster at near-identical size and is in every maintained Node (22.15+). --compress-codec brotli forces brotli for the smallest blob (lightningcss 2.15 MB at q11 vs 2.31 MB at zstd max) or for consumers on Node < 22.15 (no zstd in node:zlib); if the build runtime lacks zstd it auto-falls-back to brotli. Decode is level-independent, so --compress-level only trades build time for a smaller blob — never load time.
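
The codec choice and its fallback amount to a small decision function. A sketch, assuming zstd support is detected by checking whether node:zlib exposes `zstdCompressSync` and `zstdDecompressSync`; `hasZstd` and `pickCodec` are hypothetical names, not the PoC's API:

```typescript
import * as zlib from "node:zlib";

type Codec = "zstd" | "brotli";

// True when this runtime's node:zlib exposes the zstd functions.
// The cast avoids depending on type definitions that already include zstd.
function hasZstd(): boolean {
  const z = zlib as unknown as Record<string, unknown>;
  return (
    typeof z["zstdCompressSync"] === "function" &&
    typeof z["zstdDecompressSync"] === "function"
  );
}

// An explicit brotli request always wins; otherwise prefer zstd and fall
// back to brotli when the runtime cannot provide zstd.
function pickCodec(
  requested: Codec | undefined,
  zstdAvailable: boolean = hasZstd(),
): Codec {
  if (requested === "brotli") return "brotli";
  return zstdAvailable ? "zstd" : "brotli";
}

console.log(`codec on this runtime: ${pickCodec(undefined)}`);
```

Passing `zstdAvailable` explicitly makes the fallback easy to exercise on a runtime that does have zstd.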

Why not just shrink the Rust binary?

binary size vs minify throughput: opt-level=z is about 3x slower; --compress keeps full speed

These binaries are already stripped and built with fat LTO, so there's no free size left at the Rust level. opt-level=z gets lightningcss to 3.4 MB but runs ~3× slower (transform() minifying a 1.16 MB stylesheet: ~60 → ~19 ops/sec, best-of-3). At that point the bytes are just code. Storing them compressed and unpacking once is the only thing that shrinks the published size without making it slower.

Cache location & env vars

The size win is in the published package, the npm download, and node_modules — they hold the small blob. The decompressed copy is a separate, optional layer, and by default it's ephemeral: the loader expands into the same OS temp dir Node's own V8 compile cache uses — the one tools like vite already turn on via module.enableCompileCache() (that defaults to <os.tmpdir()>/node-compile-cache; ours sits beside it as <os.tmpdir()>/napi-rs-native). It's content-addressed, so it's a hit until temp is cleared (e.g. a reboot), then re-decompressed once. That keeps disk honest — a persistent home-dir cache would leave a machine holding the 2.5 MB blob plus the 8.5 MB expanded copy, i.e. more than shipping the raw .node.
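
A content-addressed cache of this kind can be sketched as "look up by digest, expand on a miss". The following is a minimal illustration, not the loader's code: `ensureExpanded`, the `<digest>.node` file naming, and the temp-file-then-rename write are assumptions.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Return the cached expanded addon for `digest`, expanding it on a miss.
// `expand` is a hypothetical callback that produces the decompressed bytes.
function ensureExpanded(
  digest: string,
  expand: () => Uint8Array,
  cacheDir: string = path.join(os.tmpdir(), "napi-rs-native"),
): { file: string; hit: boolean } {
  const file = path.join(cacheDir, `${digest}.node`);
  if (fs.existsSync(file)) return { file, hit: true };

  fs.mkdirSync(cacheDir, { recursive: true });
  // Write to a per-process temp name, then rename into place, so another
  // process never observes a half-written file (rename is atomic on POSIX).
  const tmp = `${file}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, expand());
  fs.renameSync(tmp, file);
  return { file, hit: false };
}
```

Because the key is derived from the content, clearing the temp dir only costs one more expansion; nothing can go stale.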

NAPI_RS_NATIVE_CACHE picks the trade:

unset (default): ephemeral OS-temp cache — fast repeat loads within a session, disk self-bounds.
=<path>: persist there — a CI cache volume, a tmpfs, a shared mount.
=node_modules: the nearest node_modules/.cache (per-package).
=workspace: the workspace root's node_modules/.cache, shared by every package — root found by its manifest (pnpm-workspace.yaml, package.json "workspaces" for npm/yarn/bun, vlt-workspaces.json, aube-workspace.yaml, lerna.json, rush.json), falling back to the topmost node_modules.
=0: no cache — decompress to a per-process temp, unlinked right after load, so disk never holds more than the blob.
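
The values above map onto a small set of modes. A sketch of that mapping, with hypothetical names (`CacheMode`, `parseCacheMode`) and one labeled assumption: an empty string is treated the same as unset. Locating the nearest node_modules or the workspace root is left out.

```typescript
import * as os from "node:os";
import * as path from "node:path";

type CacheMode =
  | { kind: "temp"; dir: string } // default: ephemeral OS-temp cache
  | { kind: "path"; dir: string } // explicit persistent directory
  | { kind: "node_modules" } // nearest node_modules/.cache
  | { kind: "workspace" } // workspace root's node_modules/.cache
  | { kind: "none" }; // =0: per-process temp, unlinked after load

function parseCacheMode(
  value: string | undefined,
  tmp: string = os.tmpdir(),
): CacheMode {
  // Assumption: an empty string behaves like an unset variable.
  if (value === undefined || value === "") {
    return { kind: "temp", dir: path.join(tmp, "napi-rs-native") };
  }
  if (value === "0") return { kind: "none" };
  if (value === "node_modules") return { kind: "node_modules" };
  if (value === "workspace") return { kind: "workspace" };
  return { kind: "path", dir: path.resolve(value) };
}

console.log(parseCacheMode(process.env["NAPI_RS_NATIVE_CACHE"]));
```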

CI / depot.dev / Docker

On CI the compressed form is a win before you tune anything: a vite 8.1.0 install pulls ~17.6 MB less native down the wire, so fresh installs are smaller and the package store holds less. The only added cost is the one-time decompress.

To skip even that, the cache is just a dir, so it's cacheable and bakeable:

actions/cache: point NAPI_RS_NATIVE_CACHE at a persisted path keyed on the lockfile. First job decompresses, the rest restore warm.
depot.dev: mount that path on a cache volume, or bake it with depot bake + remote cache, so it's decompressed once across every build.
Docker / base images: a small napi warm build step pre-decompresses into the cache as a layer; containers start warm.

Bonus: the JS side adds up too

Separate from the native, a lot of toolchains ship their JS dist unminified for readable stack traces. That's reclaimable, and it isn't all-or-nothing — vite 8.1.0 has 1.88 MB of bundled JS:

| level | dist JS | identifiers |
| --- | --- | --- |
| whitespace + comments only | 1.88 MB → 1.47 MB (−21%) | kept, stack traces stay readable |
| full minify | 1.88 MB → 0.97 MB (−48%) | mangled |

Strip whitespace + comments globally (names intact), or fully minify only the bundled-dependency chunks (nobody steps into chokidar/postcss internals) while leaving your own entry points readable. Combined with --compress, a full vite install goes 29.2 MB → ~10.7 MB (−63%).

Two open calls: for the multi-platform CI flow this belongs in artifacts (the PoC wires it into build for the single-platform case), and codec/level could move into the napi config so projects commit the choice instead of passing flags. Worth pursuing? Happy to PR it.
