fix(selector): spill i64 register pairs + call results under pressure (#171) #198
Merged
…#171)

i64-heavy functions that kept more i64 values live than the 5 consecutive register pairs hold hit "no consecutive pair" exhaustion and were dropped by the #168 skip-and-continue net. #188's caller-saved preservation added pressure that pushed z_impl_k_sem_give over the edge (it compiled on v0.11.6, was skipped on v0.11.8). This implements the real spill.

- Operand stack `Vec<Reg>` → `Vec<StackVal>` (`Reg | Spilled { lo_slot }`). `stack_live_regs` counts only register-resident entries.
- `alloc_consecutive_pair` spills the deepest register-resident entry to a frame slot (STR lo/hi) when no pair is free, freeing it, and retries. `pop_operand` / `peek_operand` reload a spilled value into a fresh pair (LDR lo/hi) on consume.
- `LocalLayout` reserves an 8-slot i64 spill area (`i64_spill_base`), only when the function has i64 ops (keeps i32-only frames unchanged); slots are reused.
- Call path under pressure: `restore_caller_saved` spills the call result to the frame (returns `StackVal::Spilled`) when no callee-saved reg is free to park it; `emit_arg_moves` acquires its cycle-break scratch lazily (only for a genuine cycle), so an acyclic arg move needs no free reg.

Acceptance: the 6-live-i64 case now compiles via spill (was Err); a z_impl-shaped i64+calls function compiles end-to-end (was skipped). 1292 workspace tests pass, fmt + clippy clean; #188/#195/#170 non-regression confirmed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
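The spill-and-retry behaviour described in the commit message can be illustrated with a small toy model. This is a sketch under stated assumptions, not the selector's actual code: the `Selector` struct, its `pair_busy` / `slot_busy` / `asm` fields, and the `push_i64` and `release_pair` helpers are hypothetical, a "pair" is modelled as an index rather than two real registers, and a single `STR` / `LDR` line stands in for the lo/hi instruction pair. Only `StackVal`, `stack_live_regs`, `alloc_consecutive_pair`, `pop_operand`, the 5-pair limit, and the 8-slot spill area come from the PR text.

```rust
// Toy model (assumption): pairs and slots are plain indices, and "asm" is a
// list of strings instead of real instructions.
const NUM_PAIRS: usize = 5; // from the PR: 5 consecutive register pairs
const SPILL_SLOTS: usize = 8; // from the PR: 8-slot i64 spill area

#[derive(Debug, Clone, Copy, PartialEq)]
enum StackVal {
    /// Value lives in a register pair (hypothetical pair index).
    Reg(usize),
    /// Value lives in the frame, starting at `lo_slot`.
    Spilled { lo_slot: usize },
}

#[derive(Default)]
struct Selector {
    stack: Vec<StackVal>,
    pair_busy: [bool; NUM_PAIRS],
    slot_busy: [bool; SPILL_SLOTS],
    asm: Vec<String>,
}

impl Selector {
    /// Counts only register-resident operand-stack entries.
    fn stack_live_regs(&self) -> usize {
        self.stack.iter().filter(|v| matches!(v, StackVal::Reg(_))).count()
    }

    /// Returns a free pair; if none is free, spills the deepest
    /// register-resident stack entry to a frame slot and retries.
    fn alloc_consecutive_pair(&mut self) -> Result<usize, String> {
        loop {
            if let Some(p) = self.pair_busy.iter().position(|busy| !busy) {
                self.pair_busy[p] = true;
                return Ok(p);
            }
            // Deepest entry = lowest stack index that is still in registers.
            let victim = self.stack.iter().enumerate().find_map(|(i, v)| match v {
                StackVal::Reg(p) => Some((i, *p)),
                StackVal::Spilled { .. } => None,
            });
            let (idx, pair) = victim.ok_or("no consecutive pair")?;
            let slot = self
                .slot_busy
                .iter()
                .position(|busy| !busy)
                .ok_or("i64 spill area full")?;
            self.asm.push(format!("STR pair{pair} -> slot{slot}"));
            self.pair_busy[pair] = false;
            self.slot_busy[slot] = true;
            self.stack[idx] = StackVal::Spilled { lo_slot: slot };
        }
    }

    /// Hypothetical helper: produce a new i64 value on the operand stack.
    fn push_i64(&mut self) -> Result<(), String> {
        let p = self.alloc_consecutive_pair()?;
        self.stack.push(StackVal::Reg(p));
        Ok(())
    }

    /// Pops the top operand, reloading it into a fresh pair if it was spilled.
    /// The caller owns the returned pair and must release it when done.
    fn pop_operand(&mut self) -> Result<usize, String> {
        match self.stack.pop().ok_or("operand stack underflow")? {
            StackVal::Reg(p) => Ok(p),
            StackVal::Spilled { lo_slot } => {
                let p = self.alloc_consecutive_pair()?;
                self.asm.push(format!("LDR slot{lo_slot} -> pair{p}"));
                self.slot_busy[lo_slot] = false; // slot becomes reusable
                Ok(p)
            }
        }
    }

    /// Hypothetical helper: the consumer is finished with a popped pair.
    fn release_pair(&mut self, p: usize) {
        self.pair_busy[p] = false;
    }
}
```

In this model, pushing a sixth i64 while five are live spills the bottom-most entry instead of failing, which mirrors the "6-live-i64" acceptance case; popping that entry later emits the reload.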
Summary
Closes #171: the real i64 register-pair spill. i64-heavy functions that keep more i64 values live than the 5 consecutive register pairs can hold hit "no consecutive pair" exhaustion and were dropped by the #168 skip net. #188's caller-saved preservation added pressure that regressed `z_impl_k_sem_give` from compiling (v0.11.6) to skipped (v0.11.8). This implements the spill so such functions compile.

Design
- Operand stack `Vec<Reg>` → `Vec<StackVal>` (`Reg | Spilled { lo_slot }`); `stack_live_regs` counts only register-resident entries (~200 sites converted, every handler's logic preserved).
- `alloc_consecutive_pair` spills the deepest register-resident entry to a frame slot (STR lo/hi) when no pair is free, then retries; `pop_operand` / `peek_operand` reload a spilled value into a fresh pair (LDR lo/hi) on consume.
- `LocalLayout` reserves an 8-slot i64 spill area (`i64_spill_base`) only when the function has i64 ops (i32-only frames unchanged); slots are reused.
- `restore_caller_saved` spills the call result to the frame (returns `Spilled`) when no callee-saved reg is free; `emit_arg_moves` acquires its cycle-break scratch lazily (only a genuine cycle needs it).

Verification
… `Err`), asserts STR/LDR frame traffic.

Note
Three background agents failed on this all-or-nothing refactor (each crashed mid-pass, leaving the i64-critical selector non-compiling); it was completed directly. The related #197 (the optimized path doesn't preserve caller-saved registers across import calls; gale's `z_impl` uses that path) is a distinct fix tracked separately.

Closes #171.
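For readers unfamiliar with the lazy cycle-break scratch mentioned under Design, the idea can be sketched as a parallel-move resolver that asks for a scratch register only when it actually finds a cycle. This is an illustrative sketch, not the PR's `emit_arg_moves`: the `Move` struct, the `u8` register numbers, the `alloc_scratch` callback, and the choice of which move to break a cycle on are all assumptions made for the example.

```rust
/// One pending move `dst <- src` between hypothetical register numbers.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Move {
    dst: u8,
    src: u8,
}

/// Orders a set of parallel moves into a safe sequence.
/// `alloc_scratch` (assumed callback) is invoked only when a genuine cycle
/// is found, so an acyclic move set needs no free register at all.
fn emit_arg_moves(
    mut pending: Vec<Move>,
    mut alloc_scratch: impl FnMut() -> Option<u8>,
) -> Result<Vec<Move>, String> {
    let mut out = Vec::new();
    pending.retain(|m| m.dst != m.src); // drop no-op moves

    while !pending.is_empty() {
        // A move is safe to emit when no pending move still reads its dst.
        let safe = (0..pending.len())
            .find(|&i| !pending.iter().any(|m| m.src == pending[i].dst));
        if let Some(i) = safe {
            out.push(pending.remove(i));
            continue;
        }

        // Every remaining dst is still needed as a source: a real cycle.
        // Only now is a scratch register requested.
        let scratch = alloc_scratch().ok_or("no scratch register for cycle")?;
        let victim = pending[0].src;
        out.push(Move { dst: scratch, src: victim });
        // Readers of the saved register now read the scratch copy instead.
        for m in pending.iter_mut() {
            if m.src == victim {
                m.src = scratch;
            }
        }
    }
    Ok(out)
}
```

With an acyclic input such as `r0 <- r1, r1 <- r2` the callback is never called; with a swap such as `r0 <- r1, r1 <- r0` it is called once, and the function returns an error if the callback cannot supply a register.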
🤖 Generated with Claude Code