cc65/nesdoug parity: PRNG, input, fades, mappers, i16, SRAM, VRAM buffer, HUD #33

Merged
imjasonh merged 11 commits into main from claude/analyze-cc65-gaps-255S6
Apr 18, 2026

@imjasonh (Owner) commented Apr 18, 2026

Closes a broad swathe of the cc65/nesdoug parity gaps catalogued in
docs/future-work.md, plus the planning doc itself. The batches were
shipped incrementally with design review between each; docs/future-work.md
has been trimmed as items completed so the remaining contents track
only what's still open.

Features shipped

Language / compiler

  • i16 type — full 16-bit signed arithmetic, comparisons, and IR
    lowering. Two known follow-ups (signed-aware Cmp16 branches and
    run-time i8 → i16 sign-extension) remain documented in §A.
    (examples/i16_demo.ne)
  • Inline-asm dot labels: `.label:` inside an `asm { ... }` block
    is mangled per-block (`__ilab_<N>_label`) so two inline-asm blocks
    in the same function can both use `.loop:` without colliding.
    (tests/integration_test.rs::inline_asm_dot_labels_are_per_block_unique)
  • VRAM update buffer: `nt_set(x, y, tile)`, `nt_attr(x, y, value)`,
    and `nt_fill_h(x, y, len, tile)` intrinsics lay down records in a
    256-byte ring at $0400-$04FF; the NMI drains them at vblank via
    `LDA $0400,X` / `STA $2007` (roughly 8 cycles per data byte, no ZP
    cost) and then resets $2006/$2005 so the PPU scroll latch stays
    clean. Gated on a `__vram_buf_used` marker so programs that never
    touch it pay no runtime cost. (examples/vram_buffer_demo.ne,
    examples/hud_demo.ne)

Runtime intrinsics

  • PRNG: `rand8()`, `rand16()`, and `seed_rand(seed)`, backed by a
    16-bit Galois LFSR; gated so non-random programs skip the
    driver. (examples/prng_demo.ne)
  • Edge-triggered input: `p1.button.a.pressed` / `.released`, built
    on a "previous frame" shadow latched in the NMI.
    (examples/edge_input_demo.ne)
  • Palette fades: `set_palette_brightness(level)` plus the
    blocking `fade_out(step_frames)` / `fade_in(step_frames)` builtins,
    driven by the $2001 emphasis bits. (examples/palette_brightness_demo.ne,
    examples/fade_demo.ne)
  • Auto sprite flicker: `game { sprite_flicker: true }` injects an
    `IrOp::CycleSprites` at the top of every `on frame` handler so
    scenes past the 8-sprites-per-scanline budget flicker instead of
    dropping the same sprite every frame. (examples/auto_sprite_flicker.ne;
    manual `cycle_sprites` still available, examples/sprite_flicker_demo.ne)
  • Sprite-0 split: `sprite_0_split(scroll_x, scroll_y)` builtin that
    polls the sprite-0 hit flag to let the status-bar row use a
    different scroll from the playfield. (examples/sprite_0_split_demo.ne)

Mappers

  • AxROM (mapper 7), CNROM (mapper 3), GNROM (mapper 66):
    reset-time init, `__bank_select` support, iNES mapper number wiring.
    CHR bankswitching plumbing is in place for CNROM/GNROM (user-visible
    CHR swaps from source still TODO). (examples/axrom_simple.ne,
    examples/cnrom_simple.ne, examples/gnrom_simple.ne)

Persistence

  • SRAM / battery-backed saves: `save { var ... }` declarations
    allocate from a separate $6000+ bump pointer, the linker flips
    iNES byte-6 bit-1, and W0111 warns when a save var carries an
    initializer (SRAM persists across power cycles). Struct fields and
    first-power-on detection remain as §S follow-ups.
    (examples/sram_demo.ne)

Debugging

  • Configurable debug port: `game { debug_port: fceux | mesen | 0xXXXX }`
    routes `debug.log(...)` / `debug.assert(...)` to the chosen
    emulator-specific or arbitrary-address port instead of the
    fixed $4800. (tests/integration_test.rs::debug_log_targets_configured_port)
  • FCEUX .nl labels: `--fceux-labels <prefix>` emits
    `<prefix>.<bank-index>.nl` + `<prefix>.ram.nl` so FCEUX picks up
    function/state/var labels.

Docs

  • docs/future-work.md was authored at the start of the PR, then
    trimmed after each batch so it reflects only what's still open.
  • docs/language-guide.md gained a VRAM-buffer section between
    "Hardware Intrinsics" and "Inline Assembly" covering the three
    intrinsics, the shadow-compare HUD pattern, the per-frame write
    budget, and the known limits.
  • examples/README.md rows for every new example with a one-line
    "what it demonstrates" note.

Test plan

  • cargo fmt — clean
  • cargo clippy --all-targets -- -D warnings — clean
  • cargo test --all-targets — 758 tests pass
  • tests/emulator/run_examples.mjs: 48/48 ROMs match their
    pixel + audio goldens (every new example has both a PNG and an
    audio-hash golden committed)
  • Committed examples/*.nes match a rebuild with
    cargo run -- build examples/<name>.ne (reproducibility diff)

https://claude.ai/code/session_01F7dHsgh7UX7SAK3wZ7JiKc

claude added 11 commits April 18, 2026 17:41
Enumerates the gaps between NEScript today and what the cc65/nesdoug
ecosystem exposes: i16/pointers/bitfields, VRAM update buffer,
metatiles, edge-triggered input, PRNG, palette fade, sprite-0 split,
additional mappers (AxROM/CNROM/UNROM-512/MMC5), FamiStudio import,
SRAM saves, PAL/NTSC abstraction, NSF output, Zapper/Power Pad,
configurable debug port, FCEUX .nl labels, and explicit bank hints.
Each item has a design sketch and the section ends with a priority
ranking. This is the planning doc the follow-up implementation
commits will chip away at.
…bels

Closes seven of the cc65/nesdoug parity gaps catalogued in
docs/future-work.md in a single pass. All of the new features are
gated on marker labels so programs that don't use them produce
byte-identical ROM output (every pre-existing committed .nes file
round-trips unchanged).

Language / runtime additions:
- `rand8()` / `rand16()` / `seed_rand(u16)` intrinsics backed by a
  16-bit Galois LFSR (~30 bytes of runtime, ~40 cycles per draw).
  Reset path seeds state to 0xACE1 so the first draw is useful even
  without explicit seeding.
- `p1.button.a.pressed` / `.released` edge-triggered input via a
  new ReadInputEdge IR op plus an NMI-side prev-frame snapshot into
  $07E6/$07E7, gated on the `__edge_input_used` marker.
- `set_palette_brightness(level)` builtin mapping levels 0..8 to
  PPU mask emphasis bytes (`$2001`) for neslib-style screen fades.
- `mapper: AxROM` (iNES 7) with automatic 32 KB PRG padding so
  emulators that enforce mapper-7's 32 KB page size boot cleanly.
- `mapper: CNROM` (iNES 3) with a reset-time CHR bank 0 select.
- `--fceux-labels <prefix>` CLI flag emitting per-bank `.nl` label
  files and a `.ram.nl` file for FCEUX's debugger.
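
The edge-triggered input item above relies on comparing this frame's buttons against a previous-frame snapshot. A minimal Rust sketch of that comparison follows; the `EdgeInput` type and its one-bit-per-button layout are hypothetical and only illustrate the idea, not the generated 6502 code or the real `$07E6/$07E7` layout.

```rust
/// Hypothetical per-controller state, one bit per button.
#[derive(Clone, Copy, Default)]
pub struct EdgeInput {
    prev: u8, // snapshot of the previous frame's buttons
    cur: u8,  // buttons read this frame
}

impl EdgeInput {
    /// Called once per frame (the NMI in the real runtime): move the
    /// current state into the previous-frame shadow, then store the new read.
    pub fn latch(&mut self, buttons: u8) {
        self.prev = self.cur;
        self.cur = buttons;
    }

    /// Buttons that went from up to down this frame.
    pub fn pressed(&self) -> u8 {
        self.cur & !self.prev
    }

    /// Buttons that went from down to up this frame.
    pub fn released(&self) -> u8 {
        !self.cur & self.prev
    }
}
```

A held button reports `pressed` only on the first frame, which is the property the `.pressed` / `.released` accessors are described as providing.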

Tests + examples:
- Five new example programs with committed .nes ROMs and
  pixel+audio goldens: prng_demo, edge_input_demo,
  palette_brightness_demo, axrom_simple, cnrom_simple.
- Seven integration tests covering JSR emission, the
  omitted-when-unused invariant, the NMI prev-input snapshot, the
  correct mapper numbers for AxROM/CNROM, and negative tests for
  unknown button names and bad rand8 arity.
- `is_intrinsic()` now runs in expression-position Call paths too,
  so `var x = rand8(1, 2)` errors at compile time instead of
  silently dropping the extra arguments.
…ipped

Updates the cc65/nesdoug-gap catalogue sections to reflect what
landed in 7507459. Remaining items reshuffle: i16 and the VRAM
update buffer stay at the top of the priority ranking.

Follow-up cleanup on the cc65 parity batch. Addresses issues found
during a post-commit code review.

**Correctness fixes:**

- `rand8()` / `rand16()` at statement position (result discarded)
  were being eliminated by DCE because `op_dest` returned
  `Some(dest)` for Rand8/Rand16 even though the ops have a visible
  side effect — advancing the PRNG state. Now `op_dest` returns
  `None` for both, keeping the JSR regardless of liveness. New
  regression test `rand8_statement_survives_dce`.
- Void-only intrinsics (`poke`, `seed_rand`, `set_palette_brightness`)
  used in expression position (e.g. `var x = seed_rand(42)`) were
  panicking the linker with an unresolved `__ir_fn_X` label. The
  analyzer now emits E0203 with a clear message; new
  `void_intrinsic_in_expression_position_errors` test covers all
  three names.
- Statement-position `rand8()` / `rand16()` weren't lowered at all
  (they fell through to the default Call path). Now both lower to
  their IR op with a fresh temp that nothing reads; the JSR still
  runs so the PRNG state advances.
- `--fceux-labels foo.nes` was producing `foo.0.nl` because
  `PathBuf::with_extension` replaces instead of appends. Rewritten
  to literally append `.<bank>.nl` / `.ram.nl` to the OsString, so
  users get the FCEUX-expected `foo.nes.<bank>.nl` naming.
- Linker now asserts CNROM / AxROM don't accept user-declared
  switchable PRG banks — their page sizes don't fit the 16 KB per
  bank model, and silently producing a mis-sized ROM is worse than
  a loud panic.
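
The `--fceux-labels` naming fix above comes down to how `Path::with_extension` behaves. The sketch below contrasts the two approaches using only the standard library; the function names are hypothetical and are not taken from the compiler source.

```rust
use std::ffi::OsString;
use std::path::{Path, PathBuf};

/// Replaces the final extension: "foo.nes" becomes "foo.0.nl".
pub fn label_path_replacing(prefix: &Path, bank: usize) -> PathBuf {
    prefix.with_extension(format!("{bank}.nl"))
}

/// Appends to the whole name: "foo.nes" becomes "foo.nes.0.nl".
pub fn label_path_appending(prefix: &Path, bank: usize) -> PathBuf {
    let mut name: OsString = prefix.as_os_str().to_owned();
    name.push(format!(".{bank}.nl"));
    PathBuf::from(name)
}
```

Appending to the `OsString` keeps the `.nes` part of the prefix intact, which is the naming the commit says FCEUX expects.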

**PRNG cleanup:**

- Removed the stream-of-consciousness comment block in `gen_prng`
  that described three abandoned algorithms before landing on the
  actual Galois LFSR.
- Simplified `__rand16` to a single JSR + LDX instead of two
  JSRs + TAY/TYA round-trip — a single shift already produces 16
  fresh bits, the doubled call just burned ~40 cycles. The golden
  PNG for `prng_demo` was regenerated to reflect the new sequence.
- Rewrote the `gen_prng` doc comment to accurately describe the
  algorithm as a Galois LFSR (it was mislabelled as xorshift).
- Rewrote the `gen_palette_brightness` doc comment with a proper
  table of level→mask mappings — the prior prose description
  didn't match the actual table values.
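
For readers unfamiliar with the algorithm named above, here is a minimal Rust sketch of a 16-bit Galois LFSR. The seed `0xACE1` comes from the earlier commit message; the tap mask `0xB400` and the zero-seed handling are assumptions, since the commits do not state them, and the `Lfsr16` type is hypothetical.

```rust
/// 16-bit Galois LFSR sketch. The tap mask is an assumption (a commonly
/// cited maximal-length choice), not necessarily what `gen_prng` emits.
pub struct Lfsr16 {
    state: u16,
}

impl Lfsr16 {
    /// Starts from 0xACE1, the reset-path seed named in the commit.
    pub fn new() -> Self {
        Lfsr16 { state: 0xACE1 }
    }

    /// A zero state would stay zero forever, so this sketch maps a
    /// zero seed back to the default (an assumption, not from the PR).
    pub fn seed(&mut self, seed: u16) {
        self.state = if seed == 0 { 0xACE1 } else { seed };
    }

    /// One Galois step: shift right, XOR the taps in if a 1 bit fell out.
    pub fn step(&mut self) -> u16 {
        let lsb = self.state & 1;
        self.state >>= 1;
        if lsb != 0 {
            self.state ^= 0xB400;
        }
        self.state
    }
}
```

With this mask the state visits every non-zero 16-bit value before repeating, so the sequence has period 65535.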

**Tests:**

- Three new unit tests in `linker::debug_symbols` covering the
  FCEUX `.nl` renderer: user-facing labels only, empty output when
  no user labels exist, and deterministic sorting in `.ram.nl`.
- Extended `nes2_mapper_high_nibble_in_byte_8_is_zero_for_small_mappers`
  to cover AxROM + CNROM.
- Renumbered priority list in future-work.md after removing the
  shipped sections (J, K, N, parts of V and Y).

All 737 tests + 40/40 emulator goldens still green.
… + docs

Another batch from the cc65/nesdoug gap catalogue. All six items
gated on marker labels (or default-false attributes) so existing
programs produce byte-identical ROMs — every pre-existing .nes
file round-trips unchanged.

**Language / runtime additions:**

- `mapper: GNROM` (iNES 66). Combines AxROM's 32 KB PRG pages with
  CNROM's 8 KB CHR banks in a single `$8000` register. Linker
  pads single-page ROMs to 32 KB to match mapper-66 expectations.
- `game { debug_port: fceux | mesen | 0xXXXX }`. `debug.log`,
  `debug.assert`, and the `__debug_halt` sentinel now target a
  user-selected address. `fceux` (default, $4800) and `mesen`
  ($4018) are named aliases; custom hex addresses are accepted
  for unusual debuggers.
- `game { sprite_flicker: true }`. IR lowerer injects an
  `IrOp::CycleSprites` at the top of every `on frame` handler,
  which flips on the rotating-OAM NMI variant with no per-site
  boilerplate. Default false so existing ROMs keep their layout.
- `fade_out(step_frames)` / `fade_in(step_frames)` builtins.
  Blocking helpers that walk brightness 4 → 0 or 0 → 4 with
  `step_frames` frames between each step. Runtime splices
  `__fade_out`, `__fade_in`, and a callable `__wait_frame_rt`
  helper when the builtin is used. Zero-guard on step_frames
  prevents a pathological 256-frame spin when the caller
  accidentally passes 0.
- `sprite_0_split(scroll_x, scroll_y)` intrinsic. Emits a
  two-phase busy-wait on `$2002` bit 6 (wait-for-clear,
  wait-for-set) then writes the new scroll values to `$2005`.
  Works on any mapper — unlike `on_scanline(N)` which requires
  MMC3. Enables HUD-over-playfield scrolling on NROM/UxROM/MMC1.
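
The zero-guard on `step_frames` mentioned above is easier to see with a small model. This Rust sketch assumes the wait loop is an 8-bit "decrement, then branch if non-zero" countdown and that the guard treats 0 as 1; both are assumptions, and the function names are hypothetical.

```rust
/// Frames an 8-bit decrement-then-test countdown waits for a given start
/// value. Starting at 0 wraps to 255 first, so the loop runs 256 times.
pub fn countdown_frames(start: u8) -> u32 {
    let mut counter = start;
    let mut frames = 0;
    loop {
        frames += 1; // one wait-for-vblank per iteration
        counter = counter.wrapping_sub(1);
        if counter == 0 {
            return frames;
        }
    }
}

/// Hypothetical fade-out model: returns the brightness levels applied
/// (4 down to 0) and the total number of frames waited.
pub fn fade_out(step_frames: u8) -> (Vec<u8>, u32) {
    let step = step_frames.max(1); // the zero-guard (assumed behaviour)
    let mut levels = Vec::new();
    let mut frames = 0;
    for level in (0..=4u8).rev() {
        levels.push(level);
        frames += countdown_frames(step);
    }
    (levels, frames)
}
```

Without the guard, `fade_out(0)` in this model would wait 256 frames per brightness step instead of one.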

**Docs:**

- New paragraph in the language guide explaining the no-recursion
  design choice and the explicit-stack workaround pattern.
- `future-work.md` updated to mark the shipped items out of the
  catalogue; remaining items reshuffled in the priority ranking.
- README + examples/README updated with the new mapper and
  builtins.

**Tests:**

- 12 new integration tests covering: GNROM header emission,
  debug-port targeting (fceux/mesen/custom), unknown-alias
  rejection, sprite_flicker on/off/bad-value, fade_out JSR + marker
  coupling, fade omitted-when-unused, fade-in-expression rejected,
  sprite_0_split byte-level busy-wait verification, sprite_0_split
  arity enforcement, sprite_0_split omitted-when-unused, and an
  extended void-intrinsic-in-expression-position test covering the
  three new void builtins.
- `nes2_mapper_high_nibble_in_byte_8_is_zero_for_small_mappers`
  extended to include GNROM.
- Four new examples with committed .nes ROMs + pixel/audio
  goldens: `gnrom_simple`, `auto_sprite_flicker`, `fade_demo`,
  `sprite_0_split_demo`.

All 752 tests pass. Clippy clean. 44/44 emulator goldens match.

Another batch from the cc65/nesdoug catalogue. All gated on
parser-level opt-in or default-false attributes so existing
programs produce byte-identical ROMs (no committed .nes file
changed).

**§A — `i16` signed 16-bit type:**
- New `KwI16` lexer token, `NesType::I16` AST variant, parser
  case in `parse_type`. Type-size and integer-type tables
  treat `i16` like `u16` (2 bytes, integer).
- IR lowering accepts `i16` everywhere it accepts `u16` for
  wide-load / wide-store / widen-narrow paths.
- New constant fold for `UnaryOp::Negate(IntLiteral(v))` that
  emits the wide two's-complement form. Without it, `var vy:
  i16 = -10` would zero-extend to `$00F6` (= 246) instead of
  sign-extending to `$FFF6` (= -10). Negative literals now
  store the right bytes.
- Comparisons reuse the existing unsigned 16-bit compare ops
  (matching the existing `i8` behaviour). Documented in the
  `NesType::I16` doc comment and in `future-work.md` §A.
- Example `examples/i16_demo.ne` with committed golden.
- Tests cover the literal-fold sign-extension and end-to-end
  compile of the example.
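
The literal-fold bug described above can be reproduced with plain integer arithmetic. This Rust sketch shows the difference between negating at 8-bit width and zero-extending, versus negating at 16-bit width; the function names are hypothetical and do not mirror the compiler's fold code.

```rust
/// Negates at 8-bit width, then zero-extends (the behaviour the fold avoids).
pub fn negate_then_zero_extend(v: u8) -> u16 {
    u16::from(v.wrapping_neg())
}

/// Negates at 16-bit width, so the high byte carries the sign.
pub fn negate_wide(v: u16) -> u16 {
    v.wrapping_neg()
}

/// Low byte first, as a 6502 would store a 16-bit value in RAM.
pub fn stored_bytes(v: u16) -> [u8; 2] {
    v.to_le_bytes()
}
```

For the literal `-10`, the first form yields `$00F6` (246) and the second yields `$FFF6`, stored as `$F6, $FF`.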

**§S — SRAM / battery-backed saves:**
- New `save { var ... }` top-level block. Lexer + parser opt
  into a dedicated `KwSave` token. Analyzer allocates save
  vars from a separate `next_sram_addr` bump pointer starting
  at `$6000`, capped at `$8000` (8 KB cartridge SRAM window).
- Linker reads `analysis.has_battery_saves` and flips iNES
  byte-6 bit-1 via the new `RomBuilder::set_battery` /
  `Linker::with_battery` chain.
- New `W0111` warning for save-var initializers — SRAM is
  preserved across power cycles, so an init expression would
  either silently never run or clobber persisted data on
  every boot. The warning teaches the user about the
  magic-byte sentinel pattern.
- Struct fields in save blocks are explicitly rejected for now
  (the field-flattening path uses the main-RAM allocator).
- Example `examples/sram_demo.ne` with committed golden, plus
  4 integration tests.
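
As a rough illustration of the allocation and header change described in §S, the Rust sketch below models a bump allocator over the `$6000` to `$8000` window and the battery bit in iNES header byte 6. The `SramAllocator` type and `set_battery` helper are hypothetical stand-ins, not the analyzer's or `RomBuilder`'s actual code.

```rust
pub const SRAM_START: u32 = 0x6000;
pub const SRAM_END: u32 = 0x8000; // exclusive end of the 8 KB window

/// Hypothetical bump allocator for save variables.
pub struct SramAllocator {
    next: u32,
}

impl SramAllocator {
    pub fn new() -> Self {
        SramAllocator { next: SRAM_START }
    }

    /// Returns the address for a `size`-byte save var, or None once the
    /// window is exhausted.
    pub fn alloc(&mut self, size: u16) -> Option<u16> {
        let start = self.next;
        let end = start + u32::from(size);
        if end > SRAM_END {
            return None;
        }
        self.next = end;
        Some(start as u16)
    }
}

/// Sets the battery flag: bit 1 of iNES header byte 6.
pub fn set_battery(header: &mut [u8; 16]) {
    header[6] |= 0x02;
}
```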

**§D (partial) — inline-asm `.label:` syntax:**
- Codegen-side mangler rewrites `.IDENT` → `__ilab_<N>_IDENT`
  per inline-asm block, where `<N>` is the call site's
  monotonic suffix. Two `asm { .loop: ... }` blocks in the
  same function now coexist without colliding in the linker's
  label table.
- Bounds checks on `.` placement: `$2002` and `name.field`
  are unaffected; only `.IDENT` in label / branch context
  triggers the rewrite. Two integration tests pin the
  uniqueness and dollar-vs-dot disambiguation.
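
A simplified Rust sketch of the `.IDENT` rewrite follows. It assumes a dot counts as a label only when it starts a token (not preceded by an identifier character) and is followed by an identifier start; the real mangler's context rules may differ, and `mangle_dot_labels` is a hypothetical name.

```rust
/// Rewrites `.IDENT` to `__ilab_<N>_IDENT` when the dot starts a token.
/// `name.field` is left alone because the dot follows an identifier.
pub fn mangle_dot_labels(asm: &str, block_id: u32) -> String {
    let chars: Vec<char> = asm.chars().collect();
    let mut out = String::with_capacity(asm.len());
    for (i, &c) in chars.iter().enumerate() {
        // True at the start of the text or after a non-identifier character.
        let starts_token =
            i == 0 || !(chars[i - 1].is_ascii_alphanumeric() || chars[i - 1] == '_');
        // True when the next character could begin an identifier.
        let ident_follows = chars
            .get(i + 1)
            .map_or(false, |n| n.is_ascii_alphabetic() || *n == '_');
        if c == '.' && starts_token && ident_follows {
            out.push_str(&format!("__ilab_{block_id}_"));
        } else {
            out.push(c);
        }
    }
    out
}
```

Two blocks with different `block_id` values produce distinct labels from the same `.loop:` source text, which is the collision the commit describes avoiding.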

**§X follow-up — Mesen trace-log docs:**
- New "Debugger-assisted workflows" section in
  `docs/nes-reference.md` walking through the Mesen / FCEUX
  log workflows alongside the new `debug_port:` attribute.

**Misc:**
- `future-work.md` updated to mark the shipped items out of
  the catalogue and reshuffle the priority ranking. Remaining
  niche follow-ups (signedness on Cmp16, struct save fields,
  inline-asm format specifiers) documented inline so future
  passes know the design.

All 757 tests pass. Clippy clean. 46/46 emulator goldens match.

Closes the highest-priority remaining catalogue item (§G). User
code queues PPU writes during `on frame` via three new intrinsics;
the NMI drains the 256-byte ring at `$0400-$04FF` to `$2007`
during vblank. Programs that never touch the buffer pay zero
bytes and zero cycles for the feature — verified by the existing
46 ROMs all matching their goldens with no drift.

Also fixes the failing CI Format check from 7b4570e by running
cargo fmt across the working tree.

**Runtime:**
- New `runtime::gen_vram_buf_drain` emits the drain routine
  (`__vram_buf_drain`). Walks entries `[len][addr_hi][addr_lo]
  [byte_0]...[byte_(len-1)]` and stops at `len == 0`. Uses
  `LDA $0400,X` indexed-absolute so no ZP scratch is needed.
  Drain costs ~12 setup cycles + 8 cycles per data byte; the
  256-byte buffer can hold ~50 single-tile writes that drain
  in roughly 1000 cycles, well inside the ~2273-cycle vblank.
- `NmiOptions` gains `has_vram_buf`. The NMI JSRs the drain
  after the existing palette/background handshake (compiler-
  queued PPU writes win priority for vblank cycles).
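
The drain budget quoted above can be checked with a few lines of arithmetic. This Rust sketch assumes the ~12 setup cycles apply per entry (that reading is what makes 50 single-tile writes come to roughly 1000 cycles); the constants are the approximate figures from the commit, and the function names are hypothetical.

```rust
pub const VBLANK_CYCLES: u32 = 2273; // approximate figure from the commit
pub const ENTRY_SETUP_CYCLES: u32 = 12; // assumed to be per entry
pub const CYCLES_PER_BYTE: u32 = 8;

/// Estimated drain cost, given each queued entry's data length in bytes.
pub fn drain_cycles(entry_lens: &[u32]) -> u32 {
    entry_lens
        .iter()
        .map(|&len| ENTRY_SETUP_CYCLES + CYCLES_PER_BYTE * len)
        .sum()
}

/// Buffer bytes used: a 3-byte header per entry, its data, and one
/// trailing 0 sentinel.
pub fn buffer_bytes(entry_lens: &[u32]) -> u32 {
    entry_lens.iter().map(|&len| 3 + len).sum::<u32>() + 1
}
```

Under these assumptions, fifty single-tile entries cost 1000 cycles and occupy 201 of the 256 buffer bytes.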

**IR + codegen:**
- Three new ops `IrOp::NtSet`, `IrOp::NtAttr`, `IrOp::NtFillH`.
- The codegen helpers compute the PPU address inline:
  `$2000 + y*32 + x` for nametable, `$23C0 + (y/4)*8 + (x/4)`
  for attribute. Each append lays down a fresh `0` sentinel so
  the NMI sees a well-formed buffer regardless of whether more
  entries get appended later in the frame.
- `__vram_buf_used` marker drops on first use; gates the
  runtime splice + NMI JSR.
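
The address formulas and entry layout above can be sketched in Rust as follows. The `VramBuf` type is a hypothetical host-side model of the buffer, and the overflow check is added for the sketch only; it is not part of the shipped runtime.

```rust
/// PPU nametable address for tile cell (x, y), using the formula above.
pub fn nametable_addr(x: u8, y: u8) -> u16 {
    0x2000 + u16::from(y) * 32 + u16::from(x)
}

/// PPU attribute-table address covering tile cell (x, y).
pub fn attribute_addr(x: u8, y: u8) -> u16 {
    0x23C0 + u16::from(y / 4) * 8 + u16::from(x / 4)
}

/// Hypothetical model of the 256-byte buffer.
pub struct VramBuf {
    pub bytes: [u8; 256],
    pub head: usize,
}

impl VramBuf {
    pub fn new() -> Self {
        VramBuf { bytes: [0; 256], head: 0 }
    }

    /// Appends [len][addr_hi][addr_lo][data...] followed by a fresh 0
    /// sentinel. Returns false if the entry would not fit.
    pub fn append(&mut self, addr: u16, data: &[u8]) -> bool {
        let needed = 3 + data.len() + 1; // header + data + sentinel
        if data.is_empty() || data.len() > 255 || self.head + needed > self.bytes.len() {
            return false;
        }
        let [hi, lo] = addr.to_be_bytes();
        self.bytes[self.head] = data.len() as u8;
        self.bytes[self.head + 1] = hi;
        self.bytes[self.head + 2] = lo;
        self.bytes[self.head + 3..self.head + 3 + data.len()].copy_from_slice(data);
        self.head += 3 + data.len();
        self.bytes[self.head] = 0; // sentinel, overwritten by the next append
        true
    }
}
```

Because every append ends with a sentinel, a drain that stops at `len == 0` sees a well-formed buffer no matter when in the frame it runs.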

**Analyzer:**
- AST-walking helper `program_uses_vram_buf` detects intrinsic
  use at analyze-init time so the user-RAM bump pointer can
  start at `$0500` (past the buffer) rather than the legacy
  `$0300`. Programs that don't use the buffer keep the legacy
  start.
- Three intrinsic names registered in `is_intrinsic` /
  `is_void_intrinsic` with arity checks.

**Tests + example:**
- `examples/vram_buffer_demo.ne` exercises all three intrinsics
  on a backgrounded program — three single-tile score writes,
  a 16-tile horizontal fill, and an attribute write that flips
  the top-left metatile group's palette to red. Committed
  golden + audio hash.
- Four new integration tests: byte-level JSR-to-drain
  assertion, drain-omitted-when-unused, RAM-bump assertion for
  programs that DO use the buffer, and arity enforcement for
  `nt_set`.

**CI fix:**
- `cargo fmt` ran across the tree. Picks up a one-line fmt
  diff in `tests/integration_test.rs` that the prior commit
  shipped without running fmt, causing the Format CI job to
  fail on `7b4570e`.

All 758 tests pass. Clippy clean. 47/47 emulator goldens match.

Follow-up to 807c9c7 (the VRAM update buffer core). Adds the
realistic-HUD example the core was missing, plus a language-guide
section that explains when and how to use the three buffer
intrinsics.

**examples/hud_demo.ne**

A bouncing-ball playfield with a classic status bar across the
top:
- 5-cell lives indicator that ticks down once per second and
  resets at zero, drawn via `nt_fill_h` (plus a second
  `nt_fill_h` to erase the stale tail).
- Score counter at the right edge that bumps on every wall
  bounce, drawn via `nt_set`.
- One-shot `nt_attr` call on the first frame flipping the
  top-left metatile group to sub-palette 1 (the red HUD
  palette) so the UI chrome reads as distinct from the
  playfield.

The demo's point is the `last_score != score` / `last_lives !=
lives` shadow-compare pattern: on the ~58-of-60 frames where
nothing changed, the buffer stays empty and drain work is zero.
That's the whole reason the VRAM buffer exists — per-frame cost
scales with what moved, not with HUD complexity. Committed
`.nes` + pixel/audio goldens.
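
The shadow-compare pattern described above can be sketched in Rust as a small state machine. The `HudShadow` and `HudWrite` types are hypothetical; in the demo the equivalent logic is written in NEScript and queues `nt_set` / `nt_fill_h` calls instead of returning a list.

```rust
/// A write the HUD would queue into the VRAM buffer this frame.
#[derive(Debug, PartialEq)]
pub enum HudWrite {
    Score(u8),
    Lives(u8),
}

/// Remembers what was last drawn so unchanged values queue nothing.
pub struct HudShadow {
    last_score: Option<u8>,
    last_lives: Option<u8>,
}

impl HudShadow {
    pub fn new() -> Self {
        HudShadow { last_score: None, last_lives: None }
    }

    /// Returns only the writes needed to bring the screen up to date.
    pub fn update(&mut self, score: u8, lives: u8) -> Vec<HudWrite> {
        let mut queued = Vec::new();
        if self.last_score != Some(score) {
            queued.push(HudWrite::Score(score));
            self.last_score = Some(score);
        }
        if self.last_lives != Some(lives) {
            queued.push(HudWrite::Lives(lives));
            self.last_lives = Some(lives);
        }
        queued
    }
}
```

On frames where neither value changed, `update` returns an empty list, which corresponds to the empty buffer and zero drain work described above.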

**docs/language-guide.md**

New "VRAM Update Buffer" section between "Hardware Intrinsics"
and "Inline Assembly". Covers:
- Why user code can't just poke `$2006` / `$2007` directly.
- The three intrinsics + their coordinate systems (cell, not
  pixel).
- The HUD pattern with a ready-to-paste code snippet and a
  pointer at `examples/hud_demo.ne`.
- A per-entry budget table + worked 1000-cycle drain example
  against the ~2273-cycle vblank budget.
- Known limits: horizontal-only, no overflow check,
  no coalescing — all already tracked under `future-work.md` §G.

**examples/README.md**

`vram_buffer_demo.ne` reframed as the minimal test-case exercise
it actually is, with a pointer at `hud_demo.ne` for the realistic
pattern. New table row for `hud_demo.ne`.

All 758 tests pass. Clippy clean. 48/48 emulator goldens match.

The previous version of hud_demo passed `score & 0x0F` and tile
index `1` (= Heart) to nt_set / nt_fill_h, but the demo had no
Heart sprite declared and tile 1 in CHR was uninitialized garbage.
The result was a screen of blue smileys with a tiny red strip in
the corner — the buffer mechanism worked, but the visual gave no
sense that anything HUD-shaped was happening.

This commit makes the HUD actually look like a HUD:

- 12 sprite declarations (Bar, Heart, Digit0..9, Ball) that the
  compiler lays into CHR at known tile indices in declaration
  order. Tile-index constants (`BAR_TILE`, `HEART_TILE`,
  `DIGIT_BASE`) match that order so the call sites can use names
  instead of magic numbers.
- bg1 palette restructured to `[red, white, yellow]` so pixel-art
  characters resolve to visible colours: `#` = red (background
  fill), `%` = white (heart shape), `@` = yellow (digit strokes).
- Background pre-paints row 1 with the solid `Bar` (red) tile
  via a `legend { "B": 1 }` entry, giving the HUD a uniform red
  canvas for individual cell writes to land on.
- Eight `nt_attr` calls at startup paint the entire top metatile
  row (4 rows × 32 cols) with sub-palette 1 so the HUD chrome
  reads as visually distinct from the playfield.

The result at frame 180 is unmistakably HUD-shaped: a yellow-on-
red status bar at the top of the screen above blue playfield with
a yellow ball bouncing around. Per-frame cost still scales with
what changed — `last_score` / `last_lives` shadow-compares mean
the buffer stays empty on the ~58 of 60 frames where nothing
ticks.

Tests: 758 pass. Clippy clean. 48/48 emulator goldens match.

Each $2006 write inside __vram_buf_drain updates the PPU's `t`
(scroll) register, so leaving it pointing at the last buffer
entry's address shifted the next frame's rendering up/right by
however many cells we wrote past $2000. Reset by writing $00 to
$2006 twice (clears `t` and resets the write-toggle to high)
followed by $00 to $2005 twice (zero X/Y scroll). The HUD demo
golden flips from "smileys offset by ~16px" to the intended
red bar with white hearts and a yellow score digit.
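
To make the scroll-latch problem above concrete, here is a small Rust model of the PPU's `t` register and write toggle. It follows the commonly documented register behaviour and omits the fine-X register and the `v` copy; the `PpuLatch` type is hypothetical and is only a sketch of why the reset works.

```rust
/// Minimal model of the PPU's internal `t` register and write toggle `w`.
#[derive(Default)]
pub struct PpuLatch {
    pub t: u16,
    pub w: bool, // false: next write is the first of a pair
}

impl PpuLatch {
    /// Write to $2006: high six address bits first, then the low byte.
    pub fn write_2006(&mut self, d: u8) {
        if !self.w {
            self.t = (self.t & 0x00FF) | (u16::from(d & 0x3F) << 8);
        } else {
            self.t = (self.t & 0xFF00) | u16::from(d);
        }
        self.w = !self.w;
    }

    /// Write to $2005: X scroll first, then Y scroll (fine X omitted).
    pub fn write_2005(&mut self, d: u8) {
        if !self.w {
            self.t = (self.t & 0xFFE0) | u16::from(d >> 3);
        } else {
            self.t = (self.t & 0x8C1F)
                | (u16::from(d & 0x07) << 12)
                | (u16::from(d & 0xF8) << 2);
        }
        self.w = !self.w;
    }

    /// The reset sequence described above: $00 twice to $2006, then
    /// $00 twice to $2005.
    pub fn reset_scroll(&mut self) {
        self.write_2006(0);
        self.write_2006(0);
        self.write_2005(0);
        self.write_2005(0);
    }
}
```

After a drain that last addressed `$2041`, `t` holds non-zero coarse-scroll bits; the reset sequence returns it to zero with the toggle back at the first-write state.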

https://claude.ai/code/session_01F7dHsgh7UX7SAK3wZ7JiKc

Two stale items cleaned up after the VRAM-buffer HUD work:

- §G mentions `examples/hud_demo.ne` as the realistic companion
  to the minimal `vram_buffer_demo.ne`, and documents the
  `$2006`/`$2005` reset the NMI drain now performs so the PPU's
  scroll latch stays clean across frames.
- Design question 3's sprite-flicker half is answered — both
  `cycle_sprites` and `game { sprite_flicker: true }` ship — so
  the question narrows to the still-open `draw ... priority:
  pinned` modifier.

https://claude.ai/code/session_01F7dHsgh7UX7SAK3wZ7JiKc
@imjasonh changed the title from "docs: catalog cc65/nesdoug parity gaps in future-work.md" to "cc65/nesdoug parity: PRNG, input, fades, mappers, i16, SRAM, VRAM buffer, HUD" on Apr 18, 2026
@imjasonh merged commit 3719b6c into main on Apr 18, 2026
7 checks passed
@imjasonh mentioned this pull request on Apr 20, 2026