Conversation
Before this change, state-local variables (`state Foo { var x: u8 = 0 }`)
were silently no-ops: the analyzer allocated a ZP slot for them, but
the codegen's `var_addrs` map only covered IR globals and scope-qualified
function locals — so every `LoadVar` / `StoreVar` whose `VarId` pointed
at a state-local resolved to no address and emitted nothing. Existing
examples compiled and matched their goldens because none of them observed
the dropped writes within the 180-frame harness window.
The overlay changes the analyzer's state-local pass to snapshot both the
ZP and RAM cursors after the globals have been laid out, then rewind to
that snapshot before each state's locals and track the running max.
`ZP_CURRENT_STATE` keeps exactly one state active at runtime, so every
state's locals are mutually exclusive with every other state's and can
share the same bytes. The IR lowerer now pushes each state's locals into
the IR globals table (with `init_value=None`) so the codegen resolves
their addresses the same way it does program globals, and prepends the
declared initializers to each state's `on_enter` handler (synthesizing
an empty one where needed) so a freshly-entered state re-establishes its
bytes before user code runs.
`--memory-map` now tags each allocation with its owning state
(`[@title]`, `[@playing]`, ...) and counts distinct bytes rather than
summed allocation sizes so overlaid slots don't double-count. The
`AnalysisResult.state_local_owners` map exposes the ownership to any
tool that wants to group allocations the same way.
Only `state_machine.ne` and `platformer.ne` declare state-level vars,
so they're the only example ROMs whose bytes change. `platformer.ne`'s
audio golden shifts slightly (the now-working `blink` counter in Title
adds a few cycles per frame before the auto-transition to Playing, which
offsets APU register writes within each frame); its video golden and
every other example ROM stay byte-for-byte identical.
Fixes #22.
https://claude.ai/code/session_015kvJu3iEFLSRJoShPBfm3X
…latformer Both examples declared their gameplay variables at the top level even though every read and write happened inside one specific state. That pattern hid the overlay feature from new users and kept the state-local code path from being exercised outside the dedicated `state_machine.ne` demo (which is how the "state-locals silently drop their writes" bug survived so long). `coin_cavern.ne`: the five Playing-only physics/position/inventory vars (`player_x`, `player_y`, `player_vy`, `on_ground`, `coins_left`) move onto Playing's state block. `score` stays global because GameOver-era code could reasonably grow to read it. The `on_enter` body loses its redundant resets — the declared initializers on the state-locals re-run on every entry, so retrying after `transition Title` comes back to a fresh state. `platformer.ne`: player physics, camera, liveness, animation phase, and the autopilot budget (`player_y`, `on_ground`, `rise_count`, `fall_vy`, `camera_x`, `anim_tick`, `alive`, `auto_jumps`) all move onto Playing. `frame_tick` and `stomp_count` stay global — Title reads the former to auto-advance, GameOver reads the latter to tally coins on the death screen. The analyzer now overlays Title's `blink`, Playing's eight physics vars, and GameOver's `linger` starting at the same ZP byte (`$1A`), so the three scenes share a 9-byte window instead of each claiming their own slots. Byte-level ROM bytes for both examples shift because variable addresses moved. Video goldens stay pixel-identical (the harness doesn't see Playing in coin_cavern, and the pre-transition Title→Playing timing in platformer is preserved); the platformer audio hash needed one more refresh because the now-slightly-shorter reset prologue shifts APU writes within each frame. https://claude.ai/code/session_015kvJu3iEFLSRJoShPBfm3X
imjasonh
pushed a commit
that referenced
this pull request
Apr 18, 2026
…ld silent drop Phase 1 of the post-PR-#31 audit. The PR #31 state-local bug had a specific shape: analyzer allocated a slot, codegen looked it up by VarId, silently emitted nothing on miss. Six sites in gen_op plus the global-initializer loop and the parameter-shuffle prologue all used the same `if let Some(&addr) = self.var_addrs.get(var) { ... }` pattern with no else branch. Any future allocation-map desync would slip through the same crack. Replace every site with a new `IrCodeGen::var_addr(VarId) -> u16` helper that panics with an explicit "compiler bug" message on miss. An IR op referencing an unmapped VarId is not valid input — it means the analyzer and lowerer disagreed on what to allocate, and we want that crash to surface in CI rather than be absorbed by whatever zero-filled RAM happened to sit at the read. Running cargo test against the hardened lookup surfaced exactly the bug shape the plan predicted: uninitialized struct globals (e.g. `var p: Point` with no literal initializer) never had their flattened field VarIds (`"p.x"`, `"p.y"`) registered in var_addrs. The IR lowerer's `get_or_create_var("p.x")` minted a VarId, the analyzer's `flatten_struct_fields` allocated an address for it, but IrCodeGen::new only populated var_addrs from `ir.globals`, which doesn't contain synthesized field entries for uninitialized structs. Every `p.x = N` silently compiled to nothing. Fix by exposing the IR lowerer's name→VarId map on IrProgram and joining it with the analyzer's allocations in IrCodeGen::new. Every allocated name that the lowerer knows about now gets a var_addrs entry. Example ROMs are byte-identical (no example relied on the dropped writes), but the bug was reachable — any user program with a plain `var pos: Point` declaration and field writes would have hit it silently. Add `uninitialized_struct_field_store_emits_sta_to_allocated_address` as a byte-level regression guard: compile `p.x = 123` and scan PRG for `LDA #\$7B / STA <addr>`. Fails against the old silently-dropping codegen. https://claude.ai/code/session_01AoQ678uVeqpyayvWHpfDhC
imjasonh
pushed a commit
that referenced
this pull request
Apr 18, 2026
…res (or fix them) Phase 2 of the post-PR-#31 audit. The codebase had four documented "silently skip" paths that parsed user intent but produced no code. Each one was the same shape as the state-local bug: the analyzer accepted the program, the IR lowered the construct, but somewhere downstream the emitted code was dropped on the floor — and a pixel golden that captured the broken behaviour locked it in as correct. Fix each per the plan, either by implementing the feature or rejecting the program at the analyzer. ### on_exit handlers now actually run `IrOp::Transition` used to comment "on_exit of the current state isn't called here because we don't know from an IR op alone which state we're leaving." The codegen emitted the exit handler's body as an IR function but never JSR'd it. Three example programs (pong, war, state_machine) relied on `stop_music` or mode-flag translation inside `on exit` that had been silently never running. Emit a small CMP-chain against `ZP_CURRENT_STATE` before each transition: for every state that declares an on_exit, compare the current index, branch past on miss, JSR the exit handler on match, then JMP to the shared done-label so only the leaving state's handler fires. The chain is inlined at each transition site (bounded by the number of states declaring on_exit) rather than factored into a single trampoline — simpler to reason about, and transitions are rare enough that the extra bytes don't matter. Pong / war / state_machine ROMs change because the dispatch code is now emitted. Video goldens stay byte-identical (no transitions happen within the 180-frame harness window under no-input). Pong and war audio hashes shifted from pure code-layout timing and are regenerated. `docs/pong.gif` and `docs/war.gif` are byte-identical. ### State-local array initializers now refuse to compile (E0601) `src/ir/lowering.rs:887` had the comment "Array initializers for state-locals aren't supported yet... Programs that try this should get a diagnostic from the analyzer; for now, silently skip." The analyzer never actually emitted that diagnostic. Verified by compiling `state Main { var buf: u8[4] = [10,20,30,40] ... }`: the program built a valid ROM with no trace of 10/20/30/40 in PRG. Add E0601 to the analyzer's state-local pass. The IR lowerer's defensive `continue` stays in place as a belt-and-braces guard. ### `on scanline` without MMC3 is now E0603 Previously E0203 ("invalid operation for type") which is a miscategorisation — the feature is unsupported on the current mapper, not a type error. Dedicated E0603 makes the future-work shape explicit. ### `slow` variables now actually live outside zero page `Placement::Slow` was parsed into the AST but `allocate_ram` ignored it, so `slow var cold: u8` still landed in ZP like any other u8. Wire `var.placement` through `allocate_ram_with_placement` and skip the ZP branch when `Slow` is set. `Fast` remains advisory (the existing default already prefers ZP for u8 vars), validated by W0107. ### Other address-map silent drops hardened Alongside the var_addrs hardening from phase 1, three `state_indices` lookup sites that did `.copied().unwrap_or(0)` or silent `if let` are now explicit panics: scanline IRQ dispatch, MMC3 reload, and `IrOp::Transition`. A miss in any of them is a compiler bug, not valid input — the analyzer catches unknown state names upstream. ### Regression guards Four new tests would have failed against the old silently-dropping code paths: - `analyze_state_local_array_initializer_rejected` — expects E0601. - `analyze_on_exit_declaration_accepted` — expects no errors. - `analyze_slow_var_forced_out_of_zero_page` — expects alloc address >= $0100. - `transition_dispatches_leaving_states_on_exit_handler` — counts distinct JSR targets in the PRG before/after adding `on exit` to a state; the exit-bearing build must have more. All 720 tests pass. All 34 emulator goldens pass after the pong/war audio hash refresh. https://claude.ai/code/session_01AoQ678uVeqpyayvWHpfDhC
imjasonh
pushed a commit
that referenced
this pull request
Apr 18, 2026
… checklist Phase 3/4 of the post-PR-#31 audit. ### Call args > 4 is now an assert `IrOp::Call` silently `.take(4)`-d the arg list with a comment claiming the analyzer's E0506 check made the extras unreachable. Replace with an explicit `assert!(args.len() <= 4, ...)` so if the analyzer ever regresses, the codegen crashes loudly instead of miscompiling the call. Iterate over all args (not just the first 4) since the assert guarantees correctness. ### CLAUDE.md: new-feature PR checklist Document the lesson the audit taught: every new language-feature PR must include (1) an example exercising it, (2) a runtime *behaviour* assertion (not just a "ROM validates" shape check), (3) a negative test for invalid use. Call out the specific address-map lookup pattern (`if let Some(&addr) = map.get(..)` with no else) that shipped the state-local bug, and recommend the `IrCodeGen::var_addr` / explicit `.unwrap_or_else(|| panic!)` idiom instead. Chose not to add a regex-based CI tripwire for "silently" / "for now" comments because the false-positive rate against legitimate design decisions ("silently truncate to 8 bits per the cast spec", etc.) would train contributors to ignore it. The durable checklist in CLAUDE.md is what next agents need. https://claude.ai/code/session_01AoQ678uVeqpyayvWHpfDhC
imjasonh
pushed a commit
that referenced
this pull request
Apr 18, 2026
… regression Phase 5 of the post-PR-#31 audit, and the structural piece that closes the failure mode the earlier phases couldn't fix alone. The audit's recurring diagnosis: pixel/audio goldens capture *whatever* the program does, not what it *should* do. A silent drop in codegen is still deterministic — the golden locks in the broken behaviour and every future run agrees with it. That's how state-locals, uninitialized struct-field writes, `on exit` handlers, and `slow` placement each sat broken for months-to-a- year in a green CI. The canary inverts the relationship: the committed golden is a solid-green universal backdrop that only appears when every round-trip check passes. Each check writes a distinctive constant through one language construct, reads it back, and clears `all_ok` on mismatch. A final `if all_ok == 0 { set_palette Fail }` flips the entire screen red for the rest of the run. Checks cover the silent-drop shapes caught by this audit: - state-local variable write-read (PR #31) - uninitialized struct-field write-read (caught by phase 1) - u8 / u16 globals (u16 exercises both StoreVar + StoreVarHi) - array-element write at nonzero index - `slow`-placed global still round-trips - function call return value The canary doesn't use `debug.assert` on purpose — debug-only ops get stripped in release and the emulator harness runs release builds. The palette swap works in release and is what the harness pixel-diff sees. ### Why this matters as a long-lived test The harness already had 34 pixel goldens covering full-program behaviour, but none of them exist specifically to fail if a *specific language feature* silently drops. The canary does. Every silent-drop bug the audit found would have flipped it red the moment the check was added, which is the "behaviour assertion that can't be satisfied by silence" the plan called for. ### Harness footprint `tests/emulator/goldens/feature_canary.{png,audio.hash}` + `examples/feature_canary.{ne,nes}`. 35/35 ROMs match their goldens with the canary added. Listed in both README tables. https://claude.ai/code/session_01AoQ678uVeqpyayvWHpfDhC
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #22.
What changed
Two things, tightly coupled: a long-latent bug fix and the overlay feature it unlocked.
1. State-local variables actually work now
Before this change, state-level
vardeclarations (state Foo { var x: u8 = 0 }) were silently no-ops at codegen time. The analyzer allocated a ZP slot, butIrCodeGen::newonly populated itsvar_addrsmap from IR globals and scope-qualified function locals — state-locals lived in neither list, so everyLoadVar/StoreVarwhoseVarIdpointed at one resolved to no address and emitted zero bytes. Initializers at reset were similarly dropped.The lowerer now pushes each state's locals into the IR globals table (with
init_value = None) so the codegen resolves their addresses the same way it does program globals.2. Overlay allocation for mutually-exclusive states
ZP_CURRENT_STATEkeeps exactly one state active at runtime, so every state's locals are mutually exclusive with every other state's and can safely share the same bytes. The analyzer now snapshots both cursors after globals are laid out, rewinds to the snapshot before each state's locals, and advances to the running max at the end. The IR lowerer prepends each state-local's declared initializer to the owningon_enterhandler (synthesizing an empty one where needed) so a freshly-entered state re-establishes its bytes before user code runs — which is what makes the overlay safe when another state has written the same ZP byte in the interim.--memory-mapnow tags each allocation with its owning state ([@Title],[@Playing], ...) and counts distinct bytes rather than summed sizes so overlaid slots don't double-count.AnalysisResult.state_local_ownersexposes the ownership to any tool that wants to group allocations.3. Converted existing examples to exercise the feature
Only
state_machine.neused state-locals before this PR — which is exactly how the silent-drop bug survived so long: no example with state-locals observed the dropped writes inside the 180-frame harness window. As a follow-up commit,coin_cavern.neandplatformer.nemove their state-scoped globals onto the owning state blocks so the overlay code path sees real use.Why the bug wasn't caught earlier
state_machine.nestays in Title forever (no auto-advance, no buttons in the harness);platformer.neauto-advances on a globalframe_tick, not on Title's state-localblink. The wrong state-local values never reached a test assertion.program_with_state_machinecovered allocation and scoping but nothing wrote a value to a state-local, did await_frame, and then read it back to assert persistence. The newstate_local_store_round_trips_through_zero_pagetest asserts the specificLDA #N / STA $10byte sequences land in the ROM, which would have failed against the old silently-dropping codegen.Measured impact
Overlay savings across the three examples that declare state-locals today:
4 bytes of ZP reclaimed across the suite — meaningful on a 2 KB system but not dramatic, because the savings scale with
total_state_local_bytes − max_state_local_bytesand these examples have one dominant state (Playing) with lightweight siblings. The shape where the overlay really pays is "many fat states" (Level1/Level2/Level3 each with ~10 bytes, or symmetric menu/play/score turn state) — we don't have an example like that yet, but the plumbing is in place.Blast radius
coin_cavern.nes,platformer.nes, andstate_machine.nesshift because their variable addresses moved.platformer.audio.hashneeded refreshing — the now-workingblinkcounter in Title adds cycles per frame before auto-transitioning to Playing, which offsets APU register writes within each frame.docs/platformer.gifwas also regenerated per the CLAUDE.md rule.Tests
tests/integration_test.rscovering: overlay at the same base address, globals don't alias state-locals, state-local writes round-trip through ZP, initializers fire on entry (not reset), synthesizedon_enterfor states that only have initializers.Documentation
docs/language-guide.md— new "State-Local Variables and Memory Overlays" section with example.CLAUDE.md— zero-page convention updated with the overlay rule and[@State]tagging.docs/future-work.md— added a "State-local memory overlay follow-ups" section covering same-named locals across states, array/struct initializers on state-locals, and handler-local overlay.https://claude.ai/code/session_015kvJu3iEFLSRJoShPBfm3X