feat(compile): V2.2 on-disk per-module object cache (follow-up to #131, #132) #134

Closed
TheHypnoo wants to merge 2 commits into PerryTS:main from TheHypnoo:feat/object-cache

@TheHypnoo TheHypnoo commented Apr 22, 2026

Summary

Implements the V2.2 on-disk object cache scoped in #131 and follows V2.1's in-memory AST cache from #132. Each .ts module's compiled .o bytes now land at .perry-cache/objects/<target>/<key:016x>.o, shared across every perry compile / perry run / perry dev invocation on that project.
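
As a rough illustration of the on-disk layout described above, the path for one entry could be assembled as follows. This is a sketch only: `object_cache_path` is a hypothetical helper name, and treating the key as a `u64` is an assumption inferred from the `<key:016x>` format in the path.

```rust
use std::path::{Path, PathBuf};

/// Builds `.perry-cache/objects/<target>/<key:016x>.o` under a project root.
/// `object_cache_path` is a hypothetical name, not Perry's actual API.
fn object_cache_path(project_root: &Path, target: &str, key: u64) -> PathBuf {
    project_root
        .join(".perry-cache")
        .join("objects")
        .join(target)
        // `{:016x}` zero-pads the 64-bit key to 16 lowercase hex digits.
        .join(format!("{:016x}.o", key))
}
```

Because the target is a directory component, entries for different targets never share a file name space.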

The real perf win lives here, not in V2.1. On a 30-module synthetic project on my M1:

| Scenario | Median (5 runs) | vs baseline |
| --- | --- | --- |
| `--no-cache` baseline | 714 ms | |
| Cold cache (empty, warmed in run) | 709 ms | ~0% overhead |
| Warm cache (all hits) | 509 ms | −29% (205 ms saved) |
| 1 module edited, 29 cached | 512 ms | −28% |

Cost scales with changed modules, not total — the perry dev watch loop is the primary beneficiary.

Design

  • ObjectCache (thread-safe via AtomicUsize counters, one file per entry so rayon workers don't contend) uses atomic tmp-then-rename writes. IO errors are counted and the codepath degrades to uncached — the cache is strictly an optimization, never a correctness dependency.
  • compute_object_cache_key is a streaming djb2 (mirroring build_optimized_libs's prior-art pattern) over every CompileOptions field that affects compile_module's output bytes:
    • module source hash (djb2 computed once in collect_modules and stored on CompilationContext::module_source_hashes — no second file read)
    • target triple, is_entry_module, output_type, feature flags (needs_stdlib/needs_ui/needs_geisterhand/needs_js_runtime)
    • all import maps/sets — sorted so HashMap iteration order doesn't leak into the key
    • imported classes with full signature (name, source_prefix, ctor arity, fields, methods, parent, source_class_id)
    • imported enums, type aliases, imported async funcs, imported vars, imported param counts, imported return types
    • enabled features, i18n snapshot, CARGO_PKG_VERSION
    • topologically-sorted lists (non_entry_module_prefixes, native_module_init_names) preserve order so a link-ordering change (v0.5.127-128 bug class) invalidates consumers; this is the acceptance criterion @ralphhempel called out in issue #131
  • Disabled automatically in bitcode-link mode (PERRY_LLVM_BITCODE_LINK=1) since compile_module emits .ll text, not object bytes.
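
The two ordering rules in the list above (sort maps before hashing, keep topologically sorted lists in their given order) can be sketched with a small streaming djb2. This is a heavily reduced stand-in, not the PR's `compute_object_cache_key`: the function name `sketch_cache_key`, the 64-bit width, the zero-byte field separator, and the choice of fields are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Streaming djb2: h = h * 33 + byte, seeded with 5381, using wrapping arithmetic.
struct Djb2(u64);

impl Djb2 {
    fn new() -> Self {
        Djb2(5381)
    }

    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.0 = self.0.wrapping_mul(33).wrapping_add(b as u64);
        }
    }

    /// Writes a string followed by a 0 byte so adjacent fields cannot run together.
    fn write_str(&mut self, s: &str) {
        self.write(s.as_bytes());
        self.write(&[0]);
    }
}

/// Hypothetical, reduced key function covering only a handful of fields.
fn sketch_cache_key(
    source_hash: u64,
    target: &str,
    is_entry_module: bool,
    import_param_counts: &HashMap<String, usize>,
    non_entry_module_prefixes: &[String],
) -> u64 {
    let mut h = Djb2::new();
    h.write(&source_hash.to_le_bytes());
    h.write_str(target);
    h.write(&[is_entry_module as u8]);

    // Maps: sort by key so HashMap iteration order cannot change the result.
    let mut entries: Vec<(&String, &usize)> = import_param_counts.iter().collect();
    entries.sort();
    for (name, count) in entries {
        h.write_str(name);
        h.write(&(*count as u64).to_le_bytes());
    }

    // Topologically sorted lists: hash in the given order, because order is semantic.
    for prefix in non_entry_module_prefixes {
        h.write_str(prefix);
    }
    h.0
}
```

With this shape, two maps holding the same entries produce the same key regardless of insertion order, while swapping two module prefixes produces a different key.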

CLI surface

  • --no-cache flag on compile / run / dev
  • PERRY_NO_CACHE=1 env var (CI-friendly override)
  • PERRY_DEV_VERBOSE=1 prints "• codegen cache: H/T hit (M miss)" once per build (the same env var V2.1 uses for its "parse cache:" line, so one lever turns on both lines in perry dev)
  • perry cache info — location, total size, per-target breakdown
  • perry cache clean — wipe .perry-cache/ for the current project
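
The three switches that turn the cache off (the `--no-cache` flag, `PERRY_NO_CACHE=1`, and bitcode-link mode via `PERRY_LLVM_BITCODE_LINK=1`) can be pictured as a single predicate. This is a sketch under assumptions: `object_cache_enabled` is a hypothetical name, and treating only the exact value `1` as "set" is a guess, since the PR does not say how other values are handled.

```rust
/// Decides whether the object cache is active for one invocation.
/// Inputs are passed explicitly so the function stays pure; a caller would
/// take them from the CLI parser and from `std::env::var`.
fn object_cache_enabled(
    no_cache_flag: bool,
    perry_no_cache: Option<&str>,
    perry_llvm_bitcode_link: Option<&str>,
) -> bool {
    // Assumption: only the literal value "1" counts as enabled for these env vars.
    let is_one = |v: Option<&str>| v == Some("1");
    !no_cache_flag && !is_one(perry_no_cache) && !is_one(perry_llvm_bitcode_link)
}
```

A caller could feed it real environment values with `std::env::var("PERRY_NO_CACHE").ok().as_deref()` and the equivalent lookup for `PERRY_LLVM_BITCODE_LINK`.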

Deviations from issue #131 scoping

Two worth calling out so the maintainer can veto if desired:

  1. Scope: the issue said "dev-only first, promote to compile later". This PR wires the cache into perry compile / perry run / perry dev all at once. Rationale: the cache has no dev-specific coupling (pure CompileOptions + source_hash key), and perry run / CI batch builds benefit from the same warm-hit path. Happy to gate behind a dev-only flag in a follow-up if preferred.
  2. Plumbing: CompileResult gained an optional codegen_cache_stats: Option<(hits, misses, stores, store_errors)> so dev.rs can print the codegen cache: line alongside parse cache: after run_with_parse_cache returns. Widget/web/wasm helper returns get None (they never touch the codegen cache); the three codegen paths (--no-link, is_dylib, main success) populate stats from ObjectCache.
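
To make the plumbing concrete, here is a minimal sketch of how the optional stats tuple could be turned into the verbose line. The names `CodegenCacheStats` and `codegen_cache_line` are hypothetical, and reading T in `H/T` as hits plus misses is an assumption, chosen because it matches the "3/4 hit (1 miss)" shape reported by the integration test.

```rust
/// (hits, misses, stores, store_errors), in the tuple order given in the PR text.
type CodegenCacheStats = (usize, usize, usize, usize);

/// Returns the verbose line, or None when the build never touched the codegen cache.
fn codegen_cache_line(stats: Option<CodegenCacheStats>) -> Option<String> {
    let (hits, misses, _stores, _store_errors) = stats?;
    // Assumption: the total is hits + misses.
    let total = hits + misses;
    Some(format!("• codegen cache: {}/{} hit ({} miss)", hits, total, misses))
}
```

Returning `None` for `None` input mirrors the widget/web/wasm paths, which have no stats to print.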

Tests

15 unit tests in object_cache_tests (all passing):

  • Key stability across HashMap-insertion-order permutations
  • Key divergence on every invalidation axis (source hash, perry version, target, entry flag, non-entry-prefix order, imported-class arity, bitcode mode)
  • Disabled cache no-ops cleanly; no counters bumped, no files created
  • Cross-target isolation (target-a and target-b keyed separately)
  • Store → lookup round-trips bytes verbatim
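
The store/lookup round-trip and the counter behaviour these tests exercise can be pictured with a reduced cache. This is a sketch, not the PR's `ObjectCache`: the type name `SketchObjectCache`, the temp-file naming scheme, and the flat directory layout are assumptions, and the disabled mode is not modelled.

```rust
use std::fs;
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical reduced cache: one file per key, failures counted instead of returned.
struct SketchObjectCache {
    dir: PathBuf,
    hits: AtomicUsize,
    misses: AtomicUsize,
    stores: AtomicUsize,
    store_errors: AtomicUsize,
}

impl SketchObjectCache {
    fn new(dir: PathBuf) -> Self {
        SketchObjectCache {
            dir,
            hits: AtomicUsize::new(0),
            misses: AtomicUsize::new(0),
            stores: AtomicUsize::new(0),
            store_errors: AtomicUsize::new(0),
        }
    }

    fn entry_path(&self, key: u64) -> PathBuf {
        self.dir.join(format!("{:016x}.o", key))
    }

    /// Returns cached bytes; any read failure is treated as a miss.
    fn lookup(&self, key: u64) -> Option<Vec<u8>> {
        match fs::read(self.entry_path(key)) {
            Ok(bytes) => {
                self.hits.fetch_add(1, Ordering::Relaxed);
                Some(bytes)
            }
            Err(_) => {
                self.misses.fetch_add(1, Ordering::Relaxed);
                None
            }
        }
    }

    /// Writes to a temp file, then renames it into place so readers never see
    /// a partially written entry. Failures only bump a counter.
    fn store(&self, key: u64, bytes: &[u8]) {
        let final_path = self.entry_path(key);
        // Assumption: the process id keeps concurrent processes off each other's temp files.
        let tmp_path = self
            .dir
            .join(format!("{:016x}.o.tmp.{}", key, std::process::id()));
        let result = fs::create_dir_all(&self.dir)
            .and_then(|_| fs::write(&tmp_path, bytes))
            .and_then(|_| fs::rename(&tmp_path, &final_path));
        match result {
            Ok(()) => {
                self.stores.fetch_add(1, Ordering::Relaxed);
            }
            Err(_) => {
                let _ = fs::remove_file(&tmp_path);
                self.store_errors.fetch_add(1, Ordering::Relaxed);
            }
        }
    }
}
```

All methods take `&self`, so a single instance can be shared across worker threads; the sketch assumes each thread works on a distinct key, as with one module per worker.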

Integration test (scripts/run_cache_tests.sh) on test-files/module-init-order/ (the real multi-module fixture that exercises topological init order):

  • Baseline --no-cache → records expected output
  • Cold cache: 0/4 hit (4 miss), output matches baseline
  • Warm cache: 4/4 hit (0 miss), output matches baseline
  • Source edit on one module: 3/4 hit (1 miss), output reflects the edit (the stale-bytes anti-regression)
  • Restore source: 4/4 hit returns to full-hit state
  • perry cache info / perry cache clean smoke-tested at the end
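
For the `perry cache info` step smoke-tested above, the per-target breakdown could be computed roughly as below. This is a sketch that assumes the `objects/<target>/` layout from the Summary; `per_target_sizes` is a hypothetical name, and the real subcommand may gather or report its numbers differently.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::Path;

/// Sums file sizes under `<objects_dir>/<target>/`, one total per target.
/// A missing directory is reported as an empty cache rather than an error.
fn per_target_sizes(objects_dir: &Path) -> io::Result<BTreeMap<String, u64>> {
    let mut totals = BTreeMap::new();
    let targets = match fs::read_dir(objects_dir) {
        Ok(iter) => iter,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(totals),
        Err(e) => return Err(e),
    };
    for target in targets {
        let target = target?;
        // Only directories are treated as targets; stray files are ignored.
        if !target.file_type()?.is_dir() {
            continue;
        }
        let mut bytes = 0u64;
        for entry in fs::read_dir(target.path())? {
            let meta = entry?.metadata()?;
            if meta.is_file() {
                bytes += meta.len();
            }
        }
        totals.insert(target.file_name().to_string_lossy().into_owned(), bytes);
    }
    Ok(totals)
}
```

Summing the map's values gives the total size; a `BTreeMap` keeps the targets in a stable, sorted order for printing.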

What does NOT ship in this PR

Per CONTRIBUTING.md: no [workspace.package] version bump, no **Current Version:** edit on CLAUDE.md, no "Recent Changes" entry — maintainer folds those in at merge time.

Test plan

  • `cargo test -p perry --bin perry object_cache_tests` — 15/15 pass
  • `cargo test -p perry --bin perry parse_cache_tests` — V2.1's 8 tests still pass (no regressions)
  • `scripts/run_cache_tests.sh` end-to-end — PASS
  • Benchmark on 30-module project: ~29% warm-rebuild speedup, ~0% cold-cache overhead
  • Maintainer review of cache-key field selection vs any `CompileOptions` additions landed since branching

Closes (partial) #131.

Adds `.perry-cache/objects/<target>/<key:016x>.o`, shared across
`perry compile` / `perry run` / `perry dev` invocations. Each rayon
codegen worker computes a djb2 key from (source hash, every codegen-
affecting `CompileOptions` field, perry version) and reuses the cached
bytes instead of re-invoking LLVM on unchanged modules. On a 30-module
bench, warm rebuilds drop from ~714 ms → ~509 ms (~29% faster); a
single-module edit rebuilds in the same ~512 ms (cost scales with
changed modules, not total).

Follows v2.1 (PerryTS#132) — v2.1's in-memory AST cache only helps within a
single `perry dev` session and didn't pay off against SWC's ~1ms/file
parse cost; v2.2 is the real win because it skips the whole LLVM
pipeline, not just parsing.

Architecture:
- `ObjectCache` (thread-safe via AtomicUsize counters, file-per-entry
  so rayon workers don't contend) with atomic tmp-then-rename writes.
  IO errors are silently counted and degrade to the uncached codepath
  — the cache is strictly an optimization.
- `compute_object_cache_key` serializes every `CompileOptions` field
  that affects `compile_module`'s bytes: source hash, target triple,
  is_entry_module, all import maps/sets (sorted so HashMap iteration
  order doesn't leak in), imported classes (full signature incl. ids),
  imported enums, type aliases, enabled features, i18n snapshot,
  CARGO_PKG_VERSION. Topologically-sorted lists (non_entry_prefixes,
  native_module_init_names) preserve order so a link-ordering change
  (the v0.5.127-128 bug class) invalidates consumers.
- Disabled automatically in bitcode-link mode (PERRY_LLVM_BITCODE_LINK=1)
  since compile_module emits .ll text, not object bytes.

CLI:
- `--no-cache` flag on `perry compile` / `perry run` / `perry dev`.
- `PERRY_NO_CACHE=1` env var (CI-friendly override).
- `PERRY_CACHE_VERBOSE=1` prints `• object cache: H/T hit (M miss, S store, E store-err)`
- `perry cache info` — cache location, total size, per-target breakdown.
- `perry cache clean` — wipe `.perry-cache/` for the current project.

Tests:
- 15 unit tests in `object_cache_tests` covering key stability across
  HashMap-insertion-order permutations, key divergence on every
  invalidation axis (source hash, perry version, target, entry flag,
  non-entry-prefix order, imported class arity, bitcode mode),
  disabled-cache no-op semantics, cross-target isolation, and store
  round-trip.
- `scripts/run_cache_tests.sh` end-to-end smoke: 4-module project
  (test-files/module-init-order), asserts cold→warm→partial→rewarm
  hit/miss shapes and that a source edit is never served stale bytes.
@TheHypnoo TheHypnoo marked this pull request as draft April 22, 2026 09:53
…S#131 spec

Issue PerryTS#131 asked for an env-gated one-line cache summary using the same
format as V2.1's parse cache, alongside it in `perry dev` verbose mode.
The V2.2 PR shipped a different env var (`PERRY_CACHE_VERBOSE`), label
(`object cache`), and extra fields (store / store-err counts), and
printed from compile.rs so the two lines didn't appear together.

Align to the spec:

- Env var: `PERRY_CACHE_VERBOSE` → `PERRY_DEV_VERBOSE` (one lever turns
  on both parse + codegen cache diagnostics).
- Label: `• object cache: H/T hit (M miss, S store, E store-err)` →
  `• codegen cache: H/T hit (M miss)` matching parse-cache format.
- Print in dev.rs right after the parse-cache line when
  `run_with_parse_cache` returns a `Some(codegen_cache_stats)` — so
  `perry dev` users see both lines together. compile.rs still prints
  for batch invocations (guarded on `parse_cache.is_none()` to avoid
  a duplicate line under dev), keeping `perry compile --no-cache=false`
  observable without having to go through dev.

Plumbing: `CompileResult` gains an optional
`codegen_cache_stats: Option<(hits, misses, stores, store_errors)>`
tuple, populated from `ObjectCache` on the main success path and on the
`--no-link` / `is_dylib` early-return paths; `None` everywhere else
(widget/web/wasm helpers that don't touch the codegen cache).

Integration test (`scripts/run_cache_tests.sh`) updated for the new
label and env var. All 5 phases pass (baseline / cold / warm / partial /
rewarm) and all 15 `object_cache_tests` unit tests still pass.

The over-delivery on scope (cache wired into compile/run/dev, not just
dev as the issue scoped it) is preserved — per maintainer feedback,
that's fine since `perry run` and CI benefit from the same cache.
@TheHypnoo TheHypnoo marked this pull request as ready for review April 22, 2026 10:00
proggeramlug added a commit that referenced this pull request Apr 22, 2026
Three post-review fixes on top of PR #134's V2.2 object cache (hypnoo):

1. Hash PERRY_DEBUG_INIT, PERRY_DEBUG_SYMBOLS, PERRY_LLVM_CLANG into the
   cache key. These env vars alter compile_module output bytes (debug
   puts in module init, DWARF sections in .o, clang binary selection)
   but weren't captured — so running once with PERRY_DEBUG_SYMBOLS=1
   would silently poison the cache for subsequent default-env runs.
   Values (not presence) are hashed so persistent overrides like
   PERRY_LLVM_CLANG=/opt/llvm/bin/clang in a shell rc still get reuse.

2. Add .perry-cache/ to .gitignore. The cache bakes in host CPU features
   via clang's -mcpu=native / -march=native, so committing it would ship
   .o files that SIGILL on other developers' machines.

3. Fix run_cache_tests.sh portability: replace the ([0-9]+)/\1 backref
   (not supported by BSD grep -E on macOS) with [0-9]+/[0-9]+. Same
   semantic — "miss count == 0" is what "all hits" actually means.

Tests: 16/16 pass (15 original + new key_changes_with_codegen_env_vars).
Integration smoke: cold 0/4 → warm 4/4 → partial 3/4 → rewarm 4/4.
End-to-end manual: single-file cold/warm/--no-cache/PERRY_NO_CACHE=1,
multi-module cold/warm/edit/rewarm, env-var-flip invalidation, cache
info/clean subcommands — all green.
@proggeramlug

Merged to main as fast-forward at v0.5.160 (commits 50d09ed + de514d5 + 1a12afe).

Thanks @TheHypnoo! Folded in at merge time per CONTRIBUTING:

  • Rebased onto latest main (on top of v0.5.159's winget fix).
  • Added three post-audit fixes in 1a12afe: (1) PERRY_DEBUG_INIT / PERRY_DEBUG_SYMBOLS / PERRY_LLVM_CLANG hashed into the cache key so these env vars can't silently poison subsequent runs; (2) .perry-cache/ added to .gitignore (cache is machine-local because clang's -mcpu=native bakes host CPU features); (3) scripts/run_cache_tests.sh backreference ([0-9]+)/\1 → [0-9]+/[0-9]+ for BSD grep (macOS) portability.
  • Bumped to v0.5.160 + CLAUDE.md Recent Changes entry.

16/16 unit tests pass, integration smoke passes, manual end-to-end (single-file, multi-module, cache info/clean, env-var-flip invalidation) all green.
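
The "values, not presence" rule for the three env vars can be sketched as follows. This is an illustration under assumptions: `mix_codegen_env` is a hypothetical name, the tag and separator bytes are invented here, and the real change may fold the values into the key differently.

```rust
/// Env vars that the merge commit says affect `compile_module` output.
const CODEGEN_ENV_VARS: [&str; 3] =
    ["PERRY_DEBUG_INIT", "PERRY_DEBUG_SYMBOLS", "PERRY_LLVM_CLANG"];

/// Folds each variable's name and value into a djb2-style running hash.
/// `lookup` stands in for `std::env::var(name).ok()` so the sketch is testable.
fn mix_codegen_env(mut hash: u64, lookup: impl Fn(&str) -> Option<String>) -> u64 {
    let mut feed = |bytes: &[u8]| {
        for &b in bytes {
            hash = hash.wrapping_mul(33).wrapping_add(b as u64);
        }
    };
    for &name in CODEGEN_ENV_VARS.iter() {
        feed(name.as_bytes());
        match lookup(name) {
            // Hash the value itself, so a persistent override keeps a stable key.
            Some(value) => {
                feed(&[1]);
                feed(value.as_bytes());
            }
            // Unset is hashed distinctly from set-to-empty.
            None => feed(&[0]),
        }
        // 0xff never occurs in UTF-8 text, so it works as a field terminator.
        feed(&[0xff]);
    }
    hash
}
```

Under this scheme a run with `PERRY_DEBUG_SYMBOLS=1` gets a different key from a default-environment run, while a fixed `PERRY_LLVM_CLANG` path in a shell rc file yields the same key on every run and so still benefits from cache hits.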
