tbsp/varaboy

Varaboy

Varaboy is a Game Boy emulator written in uxntal for the Varvara system.

Special Thanks

  • Devine Lu Linvega (and collaborators) for uxn and Varvara, the reason this exists at all
  • Andrew Richards for uxn32, without which debugging this would have been a nightmare
  • Calindro for Emulicious and all his guidance on this silly project
  • binji for his writeup on POKEGB, and the emulator code itself
  • Everyone that's contributed to the Pandocs
  • Nintendo for making the Game Boy, which has consumed far too much of my life

Features

  • ROM/MBC1/MBC3/MBC5 games (only basic ROM/RAM banking support, extended bank bits not handled, no RTC for MBC3)
  • .theme file support
  • No sound support

Controls

Key      Function
Home     Start
Shift    Select
Ctrl     A
Alt      B
Arrows   D-pad
0-9      Set frame skip (default: 0)
Escape   Write SRAM to disk and quit

Stability

Because the MBC implementations are incomplete, and because SRAM is only repacked when you quit with the Escape key, I strongly discourage playing any game seriously with this emulator at this time. Games will almost certainly break, and save files will almost certainly be lost or corrupted.

Usage

This emulator requires a UXN emulator to run. The name of the Game Boy ROM to run is provided on the command line, for example:

uxnemu varaboy.rom tetris.gb

Note: Since uxngb (my UXN VM written for Game Boy) only supports up to 8KiB UXN ROMs, it's not possible to run this Game Boy emulator inside uxngb. It is entirely possible to run UXN ROMs 8KiB or smaller inside uxngb inside varaboy though!

Screenshots

[Screenshots: cpu_instrs, dmg_acid2, sml, tetris, megaman, sml2, megaman5, ffl3, shocklobster, fruitpursuit, deathplanet, uxn_screen]

Compatibility

Note that unless otherwise noted, only a few minutes of testing per game was performed.

Game                                    Notes
Tetris                                  Playable
Super Mario Land                        Playable
The Legend of Zelda: Link's Awakening   Playable
Mega Man: Dr. Wily's Revenge            Playable
Mega Man V                              Playable
Mario's Picross                         Playable
Final Fantasy Legend III                Playable
Wario Land II                           Playable
Super Mario Land 2: Six Golden Coins    Map corrupt, levels playable
Donkey Kong                             Freezes after first level, visual glitches in cutscenes
Dr. Mario                               Freezes when you try to start the game
Shock Lobster (Homebrew)                Playable
Fruit Pursuit Beta (Homebrew)           Playable
Adjustris (Homebrew)                    Playable
uxngb (Homebrew)                        Playable
Death Planet (Homebrew)                 Playable
Libbet (Homebrew)                       Playable
Geometrix (Homebrew)                    Playable
Sam Mallard (Homebrew)                  Hangs on startup
Quartet (Homebrew)                      Works, but RNG doesn't function due to the unimplemented joypad interrupt

How it works

Both the Game Boy and UXN use a 16-bit address space ($0000-$ffff). The Game Boy has a large region of "echo RAM" from $e000 to $fdff, which mirrors the contents of WRAM ($c000-$ddff) and was considered off-limits for Game Boy software by Nintendo.
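To make the address arithmetic concrete, here is a minimal Python sketch (not uxntal, and not taken from Varaboy's source) of how echo RAM maps onto WRAM on real hardware, and how many bytes the region provides. The names `resolve_echo` and `ECHO_SIZE` are hypothetical and exist only for illustration.

```python
ECHO_START, ECHO_END = 0xE000, 0xFDFF  # echo RAM range given in the text
WRAM_START = 0xC000                    # start of the WRAM range it mirrors

def resolve_echo(addr: int) -> int:
    """Return the WRAM address that an echo RAM address mirrors on hardware.

    Addresses outside echo RAM are returned unchanged.
    """
    if ECHO_START <= addr <= ECHO_END:
        return addr - (ECHO_START - WRAM_START)  # echo RAM sits $2000 above WRAM
    return addr

# Number of bytes in the echo RAM region ($1e00 = 7680 bytes).
ECHO_SIZE = ECHO_END - ECHO_START + 1
```

Since well-behaved Game Boy software never touches this range, those 7680 bytes are effectively free for something else.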

Varaboy starts at the UXN entry point ($0100, the same as the Game Boy entry point), sets up some basic stuff, reads the GB ROM header to load the appropriate MBC handler code, and then jumps to the main runtime code inside echo RAM. As long as we can fit all UXN runtime code inside echo RAM the Game Boy code is able to access the rest of memory using native addresses, which I find super fun!

Note that this means we don't have access to the UXN zero page, which contains the Game Boy RST and interrupt vectors.
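As an illustration of the header lookup mentioned above, the following Python sketch reads the cartridge type byte at $0147 and groups it into an MBC family using the values listed in the Pandocs. The function name `mbc_family` and the exact grouping are assumptions for illustration; Varaboy's actual uxntal logic may differ.

```python
CART_TYPE_OFFSET = 0x0147  # cartridge type byte in the Game Boy ROM header

def mbc_family(rom: bytes) -> str:
    """Classify a ROM image by its cartridge type byte (hypothetical helper)."""
    cart_type = rom[CART_TYPE_OFFSET]
    if cart_type == 0x00:
        return "ROM"          # no mapper
    if 0x01 <= cart_type <= 0x03:
        return "MBC1"
    if 0x0F <= cart_type <= 0x13:
        return "MBC3"         # some of these variants include an RTC
    if 0x19 <= cart_type <= 0x1E:
        return "MBC5"
    return "unsupported"      # e.g. MBC2 ($05, $06) and other mappers
```

A loader could use the returned family to decide which banking handler to install before jumping to the main runtime.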

Performance

This emulator performs fairly well on a recent/fast CPU. On a Ryzen 5600X running in uxnemu most games are playable with no frameskip. Emulation speed in uxn32 is slightly slower, though it's not clear why at this time. An i5-540M with a frameskip of 3 could be considered playable for some games, but action games are pushing it. Performance on the Nintendo DS UXN VM is even slower, which isn't surprising.

I've sped up instruction dispatch by using jump tables, which in certain cases "wastes" as much as 126 bytes for the ~64 "ld r8,r8" instructions, but overall I believe the performance gain is worth it. I've also tried to pre-calculate as much as possible in the PPU scanline renderer to reduce redundant calculations, since I'm not handling mid-scanline register writes. Background/window tiles are cached for reuse for up to 8 pixels, which provides a slight performance gain, though the presence of that code also slows things down a bit, so the net gain isn't huge. In addition, several common operations (ticks, reads, etc.) have been converted to macros, trading size for speed, though the gains are minor. The initial release had a very inefficient OAM scan approach, which has since been resolved. I've also reworked the PPU mode advancement to only check a single PPU transition dot per mode, and to only check once per instruction, which speeds things up quite a bit and doesn't affect accuracy with a simple scanline renderer.
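The jump-table idea can be sketched in Python as a 256-entry table with one handler per opcode. This is only an analogy for the technique, not Varaboy's uxntal code: the `CPU` class, `make_ld`, `DISPATCH`, and `step` are hypothetical names, and only the "ld r8,r8" block ($40-$7f, with $76 being HALT) is filled in.

```python
# Standard Game Boy r8 operand encoding order; index 6 means memory at HL.
REG_NAMES = ["b", "c", "d", "e", "h", "l", "(hl)", "a"]

class CPU:
    def __init__(self):
        self.regs = {name: 0 for name in REG_NAMES if name != "(hl)"}
        self.mem = bytearray(0x10000)
        self.halted = False

    def _hl(self):
        return (self.regs["h"] << 8) | self.regs["l"]

    def read_r8(self, idx):
        name = REG_NAMES[idx]
        return self.mem[self._hl()] if name == "(hl)" else self.regs[name]

    def write_r8(self, idx, value):
        name = REG_NAMES[idx]
        if name == "(hl)":
            self.mem[self._hl()] = value & 0xFF
        else:
            self.regs[name] = value & 0xFF

def make_ld(dst, src):
    """Build a dedicated handler for one ld r8,r8 opcode."""
    def op(cpu):
        cpu.write_r8(dst, cpu.read_r8(src))
    return op

def op_halt(cpu):
    cpu.halted = True

def op_unimplemented(cpu):
    raise NotImplementedError("opcode not covered by this sketch")

# One table entry per opcode: dispatch is a single indexed lookup, with no
# decoding at run time, at the cost of many near-identical entries.
DISPATCH = [op_unimplemented] * 256
for opcode in range(0x40, 0x80):
    DISPATCH[opcode] = make_ld((opcode >> 3) & 7, opcode & 7)
DISPATCH[0x76] = op_halt  # $76 sits in the ld block but encodes HALT

def step(cpu, opcode):
    DISPATCH[opcode](cpu)
```

The trade-off is the same one described above: the table spends memory on entries that a decoder could compute, in exchange for less work per instruction.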

Save files are unpacked into a file per bank on startup for faster access during SRAM banking. The individual bank files are repacked on shutdown if you quit by pressing the Escape key. Quitting by closing the VM any other way will not properly write SRAM contents back to the SAV file. Without file seeking, games which use lots of ROM banks (and bank often) could also suffer a notable performance hit which could be reduced by unpacking ROM banks in a similar manner.
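The unpack/repack cycle described above can be sketched in Python as follows. This is a minimal illustration assuming 8 KiB SRAM banks; the file naming scheme (`<name>.bank<N>`) and the function names are hypothetical and do not reflect the files Varaboy actually writes.

```python
from pathlib import Path

BANK_SIZE = 0x2000  # 8 KiB per cartridge SRAM bank

def unpack_sram(sav_path: Path, out_dir: Path) -> list:
    """Split a save file into one file per bank and return the bank paths."""
    data = sav_path.read_bytes()
    bank_paths = []
    for bank, start in enumerate(range(0, len(data), BANK_SIZE)):
        bank_path = out_dir / f"{sav_path.stem}.bank{bank}"  # hypothetical name
        bank_path.write_bytes(data[start:start + BANK_SIZE])
        bank_paths.append(bank_path)
    return bank_paths

def repack_sram(bank_paths: list, sav_path: Path) -> None:
    """Concatenate the per-bank files, in order, back into a single save file."""
    sav_path.write_bytes(b"".join(p.read_bytes() for p in bank_paths))
```

If `repack_sram` never runs, for example because the VM window was closed, the per-bank files hold the latest data while the save file still holds the old contents. That is the failure mode described above.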

The only "big" idea I have to speed things up right now is to write 8 full rows of pixels to a buffer of 20 2bpp UXN tiles and draw them to screen with two .Screen/sprite (auto) writes instead of 1280 .Screen/pixel writes. It's unclear if the extra VM instructions to juggle the buffering would be worth the reduction in .Screen calls though, and the benefits may vary by VM implementation. I've tried this in the sprite-ppu branch and my current attempt runs slower than the .Screen/pixel approach.
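The numbers above follow from the screen geometry: 8 rows of 160 pixels is 1280 pixels, and 160 / 8 = 20 tiles. Below is a rough Python sketch of the buffering step, not the code from the sprite-ppu branch. It assumes the Varvara 2bpp layout of 16 bytes per tile (8 bytes for the low bit plane followed by 8 bytes for the high bit plane, leftmost pixel in the most significant bit); `pack_rows_to_tiles` is a hypothetical name.

```python
SCREEN_WIDTH = 160
TILE_SIZE = 8
TILES_PER_ROW = SCREEN_WIDTH // TILE_SIZE  # 20 tiles across the screen

def pack_rows_to_tiles(rows):
    """Convert 8 rows of 160 colour indices (0-3) into 20 tiles of 16 bytes.

    Assumed layout: bytes 0-7 hold the low bit plane, bytes 8-15 the high
    bit plane, one byte per tile row.
    """
    if len(rows) != TILE_SIZE or any(len(r) != SCREEN_WIDTH for r in rows):
        raise ValueError("expected 8 rows of 160 pixels")
    tiles = []
    for t in range(TILES_PER_ROW):
        tile = bytearray(16)
        for y in range(TILE_SIZE):
            lo = hi = 0
            for x in range(TILE_SIZE):
                colour = rows[y][t * TILE_SIZE + x]
                lo = (lo << 1) | (colour & 1)         # low bit of colour index
                hi = (hi << 1) | ((colour >> 1) & 1)  # high bit of colour index
            tile[y] = lo
            tile[TILE_SIZE + y] = hi
        tiles.append(bytes(tile))
    return tiles
```

The inner loops show where the extra per-pixel work comes from: every pixel still costs a few shifts and masks before the tiles can be drawn, which is the overhead weighed above against the saving in .Screen writes.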

In addition, I'm still very new at writing uxntal, so there are likely a whole bunch of smart optimizations which could be done to speed things up. Anything that could speed up the main CPU and PPU loops would likely yield huge speed benefits.

Accuracy

  • Passes blargg's cpu_instrs tests
  • Fails blargg's instr_timing test, possibly due to a flawed timer implementation which mis-measures instruction timing
  • Passes most of Matt Currie's dmg-acid2, except for sprite x priority (which is not implemented)
  • Fails most other test ROMs
  • Still manages to run a surprising number of commercial/homebrew games, despite the above!
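For reference, the sprite x priority rule that dmg-acid2 exercises on the original Game Boy can be sketched in a few lines of Python: among overlapping sprites, the one with the smaller X coordinate wins, and ties go to the sprite that appears earlier in OAM. The function name and tuple layout are hypothetical, and this is not part of Varaboy.

```python
def sort_by_dmg_priority(sprites):
    """Order (oam_index, x) tuples from highest to lowest drawing priority.

    DMG rule: smaller X first; equal X falls back to the lower OAM index.
    """
    return sorted(sprites, key=lambda s: (s[1], s[0]))
```

A scanline renderer could apply this ordering to the sprites selected for a line before resolving which one supplies each pixel.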