PPU timing fixes

Pre-release
@qguv qguv released this 04 Jul 15:31
· 55 commits to master since this release

Play this version directly in your browser

The previous version was broken in some emulators because the game wasn't sensitive to the timing of the Pixel Processing Unit (PPU), which is responsible for sending pixels to the screen. When the LCD is on, the Gameboy PPU continuously cycles between four modes:

  1. hblank (hardware mode 0), a short period after each horizontal line of pixels is drawn and before the next one starts
  2. vblank (hardware mode 1), a very long period after the last horizontal line of pixels is drawn
  3. searching objects (hardware mode 2), a very short period before drawing a line where the PPU is reading sprite RAM (OAM RAM). The CPU can't access sprite RAM when the PPU is in this mode.
  4. drawing (hardware mode 3), a short period where pixels are being drawn to the screen. The CPU can't access sprite RAM or tile RAM when the PPU is in this mode. On the Gameboy Color (CGB), the CPU can't access palette data here either.

PPU modes diagram
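
To make the cycle concrete, here is a rough Python model of which mode the PPU is in at a given point in a frame. It is only a sketch, not code from the game. The dot counts (456 dots per line, 154 lines per frame, 80 dots of object search) follow the gbdev pandocs. The fixed 172-dot drawing period is an assumption: on real hardware the drawing mode varies in length. The function names are made up for illustration.

```python
# Timing in "dots" (4 dots = 1 machine cycle at normal speed).
DOTS_PER_LINE = 456
VISIBLE_LINES = 144
TOTAL_LINES = 154        # 144 visible lines + 10 vblank lines
OAM_SCAN_DOTS = 80
DRAW_DOTS = 172          # assumption: shortest possible drawing period

# Hardware mode numbers, as reported in the low bits of the STAT register.
HBLANK, VBLANK, OAM_SCAN, DRAWING = 0, 1, 2, 3


def ppu_mode(dot_in_frame):
    """Return the PPU mode at the given dot offset from the start of a frame."""
    dot_in_frame %= DOTS_PER_LINE * TOTAL_LINES
    line, dot = divmod(dot_in_frame, DOTS_PER_LINE)
    if line >= VISIBLE_LINES:
        return VBLANK
    if dot < OAM_SCAN_DOTS:
        return OAM_SCAN
    if dot < OAM_SCAN_DOTS + DRAW_DOTS:
        return DRAWING
    return HBLANK


def cpu_can_access_oam(mode):
    """Sprite RAM is only reachable by the CPU during hblank and vblank."""
    return mode in (HBLANK, VBLANK)


def cpu_can_access_vram(mode):
    """Tile RAM is reachable in every mode except drawing."""
    return mode != DRAWING
```

In this model, a sprite write issued at a random moment during the visible part of the frame lands in a mode where `cpu_can_access_oam` is false more often than not, which is why direct OAM writes outside vblank are unreliable.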

All the functions of the game were running from the VBLANK interrupt alone; the main loop was simply spinning. This caused several problems:

  1. Operations necessary to write sprites to the Gameboy sprite memory (OAM) trigger the Gameboy Sprite RAM bug, which is a bug in the physical hardware of the Gameboy.
  2. Unfortunate timing (PPU in hardware mode 2, searching objects) causes sprite writes to fail. On hardware and bgb this only happens occasionally, but this sort of direct OAM write isn't supported at all by wasmBoy, the emulator that's used for the undercooked in-browser demo links.
  3. Behavior is inconsistent whenever the undercooked game and engine logic takes longer than the vblank period.
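
The third problem is easier to see with numbers. The arithmetic below is a sketch based on the pandocs timing figures (10 vblank lines of 456 dots each, 4 dots per machine cycle at normal speed). The handler cycle counts in the usage note are hypothetical, not measurements of undercooked.

```python
DOTS_PER_LINE = 456
VBLANK_LINES = 10
DOTS_PER_M_CYCLE = 4     # normal-speed mode

# Machine cycles available before the PPU starts searching objects again.
VBLANK_M_CYCLES = DOTS_PER_LINE * VBLANK_LINES // DOTS_PER_M_CYCLE


def overrun_m_cycles(handler_m_cycles):
    """How many machine cycles a vblank handler spills past the vblank period."""
    return max(0, handler_m_cycles - VBLANK_M_CYCLES)
```

A handler that takes 1000 machine cycles fits, while one that takes 1500 is still running 360 machine cycles into the next frame, when sprite RAM is no longer reliably accessible.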

To fix this, several pieces need to be moved around:

  1. All writes to the Gameboy sprite RAM (OAM) are instead directed at a buffer table at $c000, the bottom of WRAM bank 0. Because this region is (almost) always accessible, writes here will (almost) always succeed.
  2. All game and engine logic is moved from the vblank interrupt handler to the main loop.
  3. A 5-byte snippet of code is written to $ffa0 in HRAM (which spans $ff80 to $fffe). When this snippet is called, an accelerated direct memory access (DMA) routine in hardware is activated to copy the OAM buffer table at $c000 into the real OAM RAM. The HRAM subroutine then waits the appropriate amount of time to allow the hardware routine to complete, then returns to the caller. This subroutine must live in HRAM because the CPU can only access HRAM while the hardware DMA routine is active. My sincere thanks to the maintainers of the constantly updated gbdev pandocs for their detailed description of OAM DMA.
  4. The vblank interrupt handler is replaced with register save/restore and a call to the aforementioned OAM DMA subroutine in HRAM. This completes the fix: if we can treat the OAM buffer table at $c000 as the "real" OAM RAM (since we can trust that changes to this region will propagate at the next vblank), then we will never hit the sprite RAM bug, since $c000 is accessible regardless of mode, and we'll never have to edit OAM RAM directly.
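
The buffer-plus-DMA arrangement described above can be sketched as a small Python model. This is an illustration of the idea, not the game's assembly. The `Machine` class and the `set_sprite` and `vblank_handler` helpers are hypothetical names, and the copy happens instantly here, whereas the real transfer takes about 160 machine cycles, during which the CPU waits in HRAM. The addresses ($c000 buffer, $fe00 OAM, $ff46 DMA register, 160 bytes of OAM) are the standard Gameboy ones.

```python
OAM_BASE = 0xFE00
OAM_SIZE = 160           # 40 sprites x 4 bytes each
SHADOW_OAM = 0xC000      # buffer table at the bottom of WRAM bank 0
DMA_REGISTER = 0xFF46


class Machine:
    """Toy memory model: a flat 64 KiB address space plus the OAM DMA register."""

    def __init__(self):
        self.mem = bytearray(0x10000)

    def write(self, addr, value):
        self.mem[addr] = value & 0xFF
        if addr == DMA_REGISTER:
            # Writing the high byte of a source address starts the transfer:
            # $XX00-$XX9F is copied to $FE00-$FE9F (modelled as instantaneous).
            src = (value & 0xFF) << 8
            self.mem[OAM_BASE:OAM_BASE + OAM_SIZE] = self.mem[src:src + OAM_SIZE]


def set_sprite(machine, index, y, x, tile, attrs=0):
    """Game logic writes sprites to the buffer table, never to real OAM."""
    base = SHADOW_OAM + index * 4
    for offset, value in enumerate((y, x, tile, attrs)):
        machine.write(base + offset, value)


def vblank_handler(machine):
    """The only thing vblank does: kick off the OAM DMA from the buffer table."""
    machine.write(DMA_REGISTER, SHADOW_OAM >> 8)
```

In this model, `set_sprite` can be called at any point in the frame and real OAM stays untouched until `vblank_handler` runs, at which point the whole buffer table lands in OAM at once.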

You may be thinking:

Whew! That's a lot of core structural changes! So does the game look or behave any differently?

Absolutely not. Aside from slightly better compatibility with naïve emulators and a somewhat lower chance of coincidental sprite errors, it looks pixel-for-pixel identical to the previous version. This is the joy of programming directly in assembly~ "yay"