Conversation
…ld silent drop Phase 1 of the post-PR-#31 audit. The PR #31 state-local bug had a specific shape: analyzer allocated a slot, codegen looked it up by VarId, silently emitted nothing on miss. Six sites in gen_op plus the global-initializer loop and the parameter-shuffle prologue all used the same `if let Some(&addr) = self.var_addrs.get(var) { ... }` pattern with no else branch. Any future allocation-map desync would slip through the same crack. Replace every site with a new `IrCodeGen::var_addr(VarId) -> u16` helper that panics with an explicit "compiler bug" message on miss. An IR op referencing an unmapped VarId is not valid input — it means the analyzer and lowerer disagreed on what to allocate, and we want that crash to surface in CI rather than be absorbed by whatever zero-filled RAM happened to sit at the read. Running cargo test against the hardened lookup surfaced exactly the bug shape the plan predicted: uninitialized struct globals (e.g. `var p: Point` with no literal initializer) never had their flattened field VarIds (`"p.x"`, `"p.y"`) registered in var_addrs. The IR lowerer's `get_or_create_var("p.x")` minted a VarId, the analyzer's `flatten_struct_fields` allocated an address for it, but IrCodeGen::new only populated var_addrs from `ir.globals`, which doesn't contain synthesized field entries for uninitialized structs. Every `p.x = N` silently compiled to nothing. Fix by exposing the IR lowerer's name→VarId map on IrProgram and joining it with the analyzer's allocations in IrCodeGen::new. Every allocated name that the lowerer knows about now gets a var_addrs entry. Example ROMs are byte-identical (no example relied on the dropped writes), but the bug was reachable — any user program with a plain `var pos: Point` declaration and field writes would have hit it silently. Add `uninitialized_struct_field_store_emits_sta_to_allocated_address` as a byte-level regression guard: compile `p.x = 123` and scan PRG for `LDA #\$7B / STA <addr>`. Fails against the old silently-dropping codegen. https://claude.ai/code/session_01AoQ678uVeqpyayvWHpfDhC
…res (or fix them) Phase 2 of the post-PR-#31 audit. The codebase had four documented "silently skip" paths that parsed user intent but produced no code. Each one was the same shape as the state-local bug: the analyzer accepted the program, the IR lowered the construct, but somewhere downstream the emitted code was dropped on the floor — and a pixel golden that captured the broken behaviour locked it in as correct. Fix each per the plan, either by implementing the feature or rejecting the program at the analyzer. ### on_exit handlers now actually run `IrOp::Transition` used to comment "on_exit of the current state isn't called here because we don't know from an IR op alone which state we're leaving." The codegen emitted the exit handler's body as an IR function but never JSR'd it. Three example programs (pong, war, state_machine) relied on `stop_music` or mode-flag translation inside `on exit` that had been silently never running. Emit a small CMP-chain against `ZP_CURRENT_STATE` before each transition: for every state that declares an on_exit, compare the current index, branch past on miss, JSR the exit handler on match, then JMP to the shared done-label so only the leaving state's handler fires. The chain is inlined at each transition site (bounded by the number of states declaring on_exit) rather than factored into a single trampoline — simpler to reason about, and transitions are rare enough that the extra bytes don't matter. Pong / war / state_machine ROMs change because the dispatch code is now emitted. Video goldens stay byte-identical (no transitions happen within the 180-frame harness window under no-input). Pong and war audio hashes shifted from pure code-layout timing and are regenerated. `docs/pong.gif` and `docs/war.gif` are byte-identical. ### State-local array initializers now refuse to compile (E0601) `src/ir/lowering.rs:887` had the comment "Array initializers for state-locals aren't supported yet... Programs that try this should get a diagnostic from the analyzer; for now, silently skip." The analyzer never actually emitted that diagnostic. Verified by compiling `state Main { var buf: u8[4] = [10,20,30,40] ... }`: the program built a valid ROM with no trace of 10/20/30/40 in PRG. Add E0601 to the analyzer's state-local pass. The IR lowerer's defensive `continue` stays in place as a belt-and-braces guard. ### `on scanline` without MMC3 is now E0603 Previously E0203 ("invalid operation for type") which is a miscategorisation — the feature is unsupported on the current mapper, not a type error. Dedicated E0603 makes the future-work shape explicit. ### `slow` variables now actually live outside zero page `Placement::Slow` was parsed into the AST but `allocate_ram` ignored it, so `slow var cold: u8` still landed in ZP like any other u8. Wire `var.placement` through `allocate_ram_with_placement` and skip the ZP branch when `Slow` is set. `Fast` remains advisory (the existing default already prefers ZP for u8 vars), validated by W0107. ### Other address-map silent drops hardened Alongside the var_addrs hardening from phase 1, three `state_indices` lookup sites that did `.copied().unwrap_or(0)` or silent `if let` are now explicit panics: scanline IRQ dispatch, MMC3 reload, and `IrOp::Transition`. A miss in any of them is a compiler bug, not valid input — the analyzer catches unknown state names upstream. ### Regression guards Four new tests would have failed against the old silently-dropping code paths: - `analyze_state_local_array_initializer_rejected` — expects E0601. - `analyze_on_exit_declaration_accepted` — expects no errors. - `analyze_slow_var_forced_out_of_zero_page` — expects alloc address >= $0100. - `transition_dispatches_leaving_states_on_exit_handler` — counts distinct JSR targets in the PRG before/after adding `on exit` to a state; the exit-bearing build must have more. All 720 tests pass. All 34 emulator goldens pass after the pong/war audio hash refresh. https://claude.ai/code/session_01AoQ678uVeqpyayvWHpfDhC
… checklist Phase 3/4 of the post-PR-#31 audit. ### Call args > 4 is now an assert `IrOp::Call` silently `.take(4)`-d the arg list with a comment claiming the analyzer's E0506 check made the extras unreachable. Replace with an explicit `assert!(args.len() <= 4, ...)` so if the analyzer ever regresses, the codegen crashes loudly instead of miscompiling the call. Iterate over all args (not just the first 4) since the assert guarantees correctness. ### CLAUDE.md: new-feature PR checklist Document the lesson the audit taught: every new language-feature PR must include (1) an example exercising it, (2) a runtime *behaviour* assertion (not just a "ROM validates" shape check), (3) a negative test for invalid use. Call out the specific address-map lookup pattern (`if let Some(&addr) = map.get(..)` with no else) that shipped the state-local bug, and recommend the `IrCodeGen::var_addr` / explicit `.unwrap_or_else(|| panic!)` idiom instead. Chose not to add a regex-based CI tripwire for "silently" / "for now" comments because the false-positive rate against legitimate design decisions ("silently truncate to 8 bits per the cast spec", etc.) would train contributors to ignore it. The durable checklist in CLAUDE.md is what next agents need. https://claude.ai/code/session_01AoQ678uVeqpyayvWHpfDhC
… regression Phase 5 of the post-PR-#31 audit, and the structural piece that closes the failure mode the earlier phases couldn't fix alone. The audit's recurring diagnosis: pixel/audio goldens capture *whatever* the program does, not what it *should* do. A silent drop in codegen is still deterministic — the golden locks in the broken behaviour and every future run agrees with it. That's how state-locals, uninitialized struct-field writes, `on exit` handlers, and `slow` placement each sat broken for months-to-a- year in a green CI. The canary inverts the relationship: the committed golden is a solid-green universal backdrop that only appears when every round-trip check passes. Each check writes a distinctive constant through one language construct, reads it back, and clears `all_ok` on mismatch. A final `if all_ok == 0 { set_palette Fail }` flips the entire screen red for the rest of the run. Checks cover the silent-drop shapes caught by this audit: - state-local variable write-read (PR #31) - uninitialized struct-field write-read (caught by phase 1) - u8 / u16 globals (u16 exercises both StoreVar + StoreVarHi) - array-element write at nonzero index - `slow`-placed global still round-trips - function call return value The canary doesn't use `debug.assert` on purpose — debug-only ops get stripped in release and the emulator harness runs release builds. The palette swap works in release and is what the harness pixel-diff sees. ### Why this matters as a long-lived test The harness already had 34 pixel goldens covering full-program behaviour, but none of them exist specifically to fail if a *specific language feature* silently drops. The canary does. Every silent-drop bug the audit found would have flipped it red the moment the check was added, which is the "behaviour assertion that can't be satisfied by silence" the plan called for. ### Harness footprint `tests/emulator/goldens/feature_canary.{png,audio.hash}` + `examples/feature_canary.{ne,nes}`. 35/35 ROMs match their goldens with the canary added. Listed in both README tables. https://claude.ai/code/session_01AoQ678uVeqpyayvWHpfDhC
… convention Follow-up to the silent-drop audit. The old ABI passed every parameter through four fixed zero-page transport slots `$04-$07`, imposing a hard 4-param cap (E0506) that didn't compose with structs/arrays/u16s and fell back to "pack args into a global" workarounds whenever a function needed five things. The transport scheme also cost every non-leaf call a 4-LDA/STA spill prologue (~28 cycles, 16 bytes) to copy args out of ZP before the next nested `JSR` could clobber them. Replace it with a hybrid convention keyed on leaf-ness: - **Leaf callees** (no nested `JSR` in body, ≤4 params): unchanged. Caller stages args into `$04-$07`; body reads those slots directly for its entire lifetime. No prologue copy. Fastest path, 3-cycle ZP stores + 3-cycle ZP loads, preserves the SHA-256 leaf-primitive optimisation that motivated the original fast path. - **Non-leaf callees** (body contains a nested `JSR`, OR ≥5 params): direct-write. Caller stages each argument straight into the callee's analyzer-allocated parameter RAM slot, bypassing the transport slots entirely. No prologue copy on the callee side. Saves ~24 cycles and ~16 bytes per call vs the old transport-then-spill path, and — crucially — scales past 4 params because the per-param slots live wherever the analyzer put them rather than in a fixed ZP window. The analyzer's ceiling moves from 4 to 8. Functions with 5–8 params are silently promoted to the non-leaf convention (even if their body has no nested `JSR`), which pays the direct-write cost rather than the prologue-copy cost — still cheaper than the old ABI. Declarations with 9+ params still emit E0506. ### Implementation - `function_is_leaf` now also requires `param_count <= 4`. - `IrCodeGen::new` populates `non_leaf_param_addrs: HashMap<String, Vec<u16>>` — for every non-leaf function, the ordered list of addresses its parameters occupy. Callers use this to route each arg directly to the right slot. - `IrOp::Call` branches on presence in the map: non-leaf → direct- write, leaf (or absent — 0-arg case) → ZP transport. - `gen_function` no longer emits a prologue. Leaves didn't have one; non-leaves had a 4-LDA/STA copy that is now unnecessary because args arrive pre-written to the slot. - The previous `leaf_functions: HashSet<String>` field is removed; leaf-ness is now inferred from absence-in- `non_leaf_param_addrs` at the call site. ### Tests and regressions - `eight_param_non_leaf_function_stages_every_arg_at_its_allocated_slot` compiles an 8-param function, scans PRG for a distinct `LDA #\$NN / STA <addr>` per arg (immediates `0x11..0x88`), and asserts that STAs to the `$04-$07` range are strictly fewer than 8 — proof the old transport path is gone for this call. - `non_leaf_call_direct_writes_args_to_callee_param_slots` replaces the old `gen_function_prologue_spills_params_to_local_ram` test with a dual assertion: (a) no `LDA \$04` prologue at the callee entry, and (b) the caller-side STA lands at the analyzer-allocated param slot, not at `\$04-\$07`. - `analyze_rejects_function_with_more_than_4_params` renamed and rewritten for the new 8-param cap. - `feature_canary.ne` gains a 6-param `sum6` call (1+2+3+4+5+6 = 21) as check 8. The canary stays green (all eight checks pass), so the committed golden is unchanged. ### Blast radius - Six example ROMs change bytes (arrays_and_functions, function_chain, mmc1_banked, pong, sha256, war) because their non-leaf call sites pick up the shorter staging sequence. - Pong and war audio hashes refresh (pure layout-timing shift; no behavioural change in the 180-frame no-input window). docs/pong.gif and docs/war.gif stay byte-identical. - `examples/function_chain.ne`'s header comment updated to document the leaf vs non-leaf split it exercises. - `docs/language-guide.md` parameter-count section and E0506 entry updated to reflect the new rule. All 720 Rust tests pass; all 35 emulator goldens pass. https://claude.ai/code/session_01AoQ678uVeqpyayvWHpfDhC
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes the audit prompted by PR #31, where
state Foo { var x }allocated a ZP slot but everyLoadVar/StoreVaron it silently emitted zero bytes for months before anyone noticed. The bug survived because (a) the codegen'sif let Some(&addr) = self.var_addrs.get(var) { }guards silently dropped unmapped VarIds, (b) pixel/audio goldens captured the broken behaviour as ground truth, and (c) no test wrote→waited→read through the feature.This PR attacks all three.
What's in it
Five commits, one per phase of the plan.
1. Address-map misses panic loudly (
f7012c6)New
IrCodeGen::var_addr(VarId) -> u16helper that panics with an explicit "compiler bug" message on miss. Replaces seven silentif let Some(..)sites ingen_opplus the global-initializer loop and parameter-shuffle prologue. Threestate_indices.get().copied().unwrap_or(0)misdispatch sites (MMC3 IRQ, MMC3 reload,IrOp::Transition) similarly converted to.unwrap_or_else(|| panic!(...)).Running the hardened lookup against
cargo testimmediately found a latent bug with the same shape: uninitialized struct globals (var pos: Vec2with no literal initializer) never had their flattened field VarIds registered invar_addrs. The analyzer'sflatten_struct_fieldsallocated addresses forpos.x/pos.y; the lowerer'sget_or_create_var("pos.x")minted VarIds for them; butIrCodeGen::newonly walkedir.globals, which doesn't contain synthesized field entries for uninitialized structs. Everypos.x = Nsilently compiled to nothing.Fixed by exposing the lowerer's name→VarId map on
IrProgramand joining with the analyzer's allocations inIrCodeGen::new.uninitialized_struct_field_store_emits_sta_to_allocated_addressscans the PRG forLDA #$7B / STA <addr>as a regression guard.2. Four "silently skip" paths turned into real behaviour or hard errors (
48bae97)on exitnow actually runs.IrOp::Transitionused to comment "on_exit of the current state isn't called here because we don't know from an IR op alone which state we're leaving" and skipped the JSR. Pong, war, and state_machine all relied onstop_musicand mode-flag translation insideon exitthat had been silently never running — the goldens captured the broken state. Emit a runtime CMP-chain againstZP_CURRENT_STATEbefore every transition to JSR the leaving state's exit handler. Video goldens stay byte-identical (no transitions fire in the 180-frame no-input window), pong/war audio hashes refreshed, gifs byte-identical.continue'd past them with a comment saying the analyzer should refuse, but the analyzer never did. Added the diagnostic.on scanlinewithout MMC3 moved from E0203 to dedicated E0603.slow varnow actually leaves zero page.Placement::Slowwas parsed but ignored byallocate_ram. Routedvar.placementthroughallocate_ram_with_placementsoslow var cold: u8lands at$0300+per the docs' promise.fastremains advisory (W0107 already warns when wasted), since the default allocator already prefers ZP for u8s.Three regression tests:
analyze_state_local_array_initializer_rejected,analyze_on_exit_declaration_accepted,analyze_slow_var_forced_out_of_zero_page, plus the byte-leveltransition_dispatches_leaving_states_on_exit_handlerthat compares JSR-target counts between the exit-bearing and baseline builds.3. Call arity + CLAUDE.md policy (
00f1305)IrOp::Callused to.take(4)the arg list with a comment claiming E0506 made extras unreachable. Replaced with an explicitassert!(args.len() <= 4). Iterate over all args (no.take(4)needed once the assert stands guard).CLAUDE.md gains a "new-feature PR checklist (don't ship silent drops)" section documenting the three ingredients every language-feature PR must include — example, runtime behaviour assertion (not just a
rom::validate_ines(&rom)shape test), negative test — and calls out the specific address-map-lookup pattern reviewers should reject in favour ofIrCodeGen::var_addr.Deliberately skipped a regex-based CI tripwire for "silently" / "for now" comments. The false-positive rate against legitimate design decisions like "silently truncate to 8 bits per the cast spec" would train contributors to ignore the signal. A durable CLAUDE.md checklist is what next agents actually need.
4.
feature_canary.ne: a golden that fails on absence of correctness (ee3dddb)The structural piece. The audit's recurring diagnosis was that pixel/audio goldens capture whatever happens, not what should happen — so a silent drop is just as stable as correct code and locks in as the ground truth. The canary inverts this: its golden is a solid green universal backdrop that only appears when every memory construct round-trips correctly. Any check failure flips the whole screen red.
Checks cover every silent-drop shape this audit found: state-local variable write-read (PR #31), uninitialized struct-field write-read (phase 1), u8 / u16 globals (u16 exercises both
StoreVar+StoreVarHi), array-element write at nonzero index,slow-placed global, function call return value.Doesn't use
debug.asserton purpose — those get stripped in release and the emulator harness runs release.set_palette Failworks in release and is what the pixel-diff sees.Impact
Tests
feature_canary).ROMs
pong.nes,state_machine.nes,war.nesbytes change — the exit-dispatch code now lands in every transition site. Video is pixel-identical at frame 180;pong.audio.hashandwar.audio.hashrefreshed (pure code-layout timing shift, no behavioural change in the no-input window).Bugs the audit fixed that nobody had filed yet
var pos: Vec2+pos.x = 42silently compiled to no code.var buf: u8[4] = [1,2,3,4]at state scope silently compiled with no initializer (buffer was stale RAM).on exit { stop_music }in pong/war/state_machine never ran.slow var cold: u8still landed in zero page.None had a Github issue. The audit script that wrote this plan found them via the same code-smell pattern PR #31 exhibited.
Review notes
The individual commits are structured to be reviewable independently. The interesting one is probably the struct-field fix in (1) — the rest is mechanical once you accept the "panic on miss" axiom.
https://claude.ai/code/session_01AoQ678uVeqpyayvWHpfDhC