The default VBlank and Timer IRQ handlers in runtime/ngdevkit-crt0.S save
and restore all 16 m68k registers around the user C callback and SYSTEM_IO:
;; rom_handler_VBlank_default (crt0.S:138-146)
movem.l %d0-%d7/%a0-%a7, -(%sp)
move.l #rom_callback_VBlank, %d0
tst.l %d0
beq .Ldone_Vblank
move.l %d0, %a0
jsr (%a0)
.Ldone_Vblank:
jsr SYSTEM_IO.l
movem.l (%sp)+, %d0-%d7/%a0-%a7
rte
The m68k SysV ABI declares d0-d1/a0-a1 as caller-saved (scratch) and
d2-d7/a2-a6 as callee-saved. C-compiled callbacks already preserve the
callee-saved set themselves, so the IRQ entry only needs to save scratch.
SYSTEM_IO in this BIOS dispatches to C functions
(controllers_update_status, credits_update_status,
credits_check_game_start at nullbios/biosstart.S:242), so it is also
ABI-conformant.
These symbols are pure IRQ entry points - referenced only by the linker-
script aliases at runtime/ngdevkit.ld:66-67 that route the vector table
to them. No code path calls them via jsr, so there is no caller relying
on the wider save.
Cost on 68000 (movem.l predecrement = 8 + 8n, postincrement = 12 + 8n):
current 16 regs: 136 + 140 = 276 cyc / IRQ
minimal 4 regs: 40 + 44 = 84 cyc / IRQ
savings: ~192 cyc / IRQ
At 60 Hz vblank plus any timer use, this is roughly 0.1-0.3% of the
200k-cyc/frame budget on a 12 MHz 68000 - small in absolute terms but free
since it is pure ABI bookkeeping. The same fix applies symmetrically to the
Timer handler at line 159.
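The arithmetic above can be re-checked with a short script. This is only a sketch of the cost model quoted in this issue: the function names are made up for illustration, and the 12 MHz clock and 60 Hz frame rate are the nominal figures used in the text, not measured values.

```python
# Cycle model for movem.l on a 68000, using the formulas quoted in this issue.
CPU_HZ = 12_000_000   # nominal clock assumed in the text
FRAME_HZ = 60         # nominal vblank rate assumed in the text


def movem_save_cycles(n):
    """movem.l <regs>, -(%sp): 8 + 8n cycles for n registers."""
    return 8 + 8 * n


def movem_restore_cycles(n):
    """movem.l (%sp)+, <regs>: 12 + 8n cycles for n registers."""
    return 12 + 8 * n


def irq_overhead(n):
    """Save plus restore cost per IRQ for an n-register mask."""
    return movem_save_cycles(n) + movem_restore_cycles(n)


def frame_share(cycles_per_irq, irqs_per_frame=1):
    """Fraction of one frame's cycle budget spent on that overhead."""
    return cycles_per_irq * irqs_per_frame / (CPU_HZ // FRAME_HZ)


saved = irq_overhead(16) - irq_overhead(4)
print(irq_overhead(16), irq_overhead(4), saved)
print(f"{frame_share(saved):.3%} of a frame at one IRQ per frame")
print(f"{frame_share(saved, 3):.3%} of a frame at three IRQs per frame")
```

With one IRQ per frame the saving sits at the low end of the quoted range. The three-IRQ case is a hypothetical example of VBlank plus two timer interrupts and lands near the high end.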
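Side note: including %a7 (SP) in the movem regmask also pushes a copy of
SP itself (on the 68000, the value from before the pre-decrement). On
restore the popped copy is ignored, because SP ends up holding the
post-incremented address, so that stack slot is functionally meaningless.
Dropping the mask down to d0-d1/a0-a1 removes that quirk too.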
Suggested fix - same change in both handlers:
movem.l %d0-%d1/%a0-%a1, -(%sp)
... (callback + SYSTEM_IO unchanged) ...
movem.l (%sp)+, %d0-%d1/%a0-%a1
Verification:
m68k-neogeo-elf-objdump -d ngdevkit-crt0.o | grep -A1 'rom_handler'
currently shows the 16-register movem in both handlers; after the change it
should show the d0-d1/a0-a1 form instead.
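If the disassembly is checked by raw opcode words rather than by mnemonic, the expected encodings can be derived from the movem register-mask layout. The sketch below assumes the standard 68000 encoding: opcode word 0x48E7 for `movem.l <list>, -(%sp)` and 0x4CDF for `movem.l (%sp)+, <list>`, with the mask bit order reversed in predecrement mode. The helper names are hypothetical.

```python
# Register order used by the movem mask word.
REGS = [f"d{i}" for i in range(8)] + [f"a{i}" for i in range(8)]

SAVE_OPCODE = 0x48E7      # movem.l <list>, -(%sp)
RESTORE_OPCODE = 0x4CDF   # movem.l (%sp)+, <list>


def movem_mask(regs, predecrement):
    """Return the 16-bit movem register mask for the given register names.

    Postincrement/control modes: bit 0 = d0 ... bit 15 = a7.
    Predecrement mode reverses the order: bit 15 = d0 ... bit 0 = a7.
    """
    mask = 0
    for name in regs:
        index = REGS.index(name)
        mask |= 1 << ((15 - index) if predecrement else index)
    return mask


def encoding(regs):
    """Return (save, restore) opcode/mask pairs as hex strings."""
    save = f"{SAVE_OPCODE:04x} {movem_mask(regs, True):04x}"
    restore = f"{RESTORE_OPCODE:04x} {movem_mask(regs, False):04x}"
    return save, restore


CURRENT = REGS                       # d0-d7/a0-a7
MINIMAL = ["d0", "d1", "a0", "a1"]   # scratch registers only

print("current:", encoding(CURRENT))
print("minimal:", encoding(MINIMAL))
```

Under these assumptions the current handlers encode as `48e7 ffff` / `4cdf ffff` and the proposed ones as `48e7 c0c0` / `4cdf 0303`. The way objdump groups the bytes in its output may differ.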
Reference: ngdevkit/runtime/ngdevkit-crt0.S, line 138 in d65da8b