## Emulation

Takanen Edoardo March 31, 2025

## Abstract

#### MOVE TO CARTRIDGE SECTION

Another interesting thing I found out while making this emulator is what is inside a cartridge. It is fascinating that some cartridges include not only the ROM banks with the game code, but also their own additional SRAM (Static Random-Access Memory), as well as a battery to preserve the game saves. Because the address space is limited, a cartridge could also carry a Memory Bank Controller (MBC) to select which ROM bank is mapped into the memory region 4000-7FFF.

Notice that all these additional components are **not** supported by my emulator, since the original Tetris DMG cartridge just had a 32 KB ROM. (still to be placed) images from:

1. https://www.pastraiser.com/cpu/gameboy/gameboyopcodes.html

## Contents

- 1 Introduction
- 2 Premises
- 3 General Structure
- 4 Memory
  - 4.1 Memory mapping
  - 4.2 Choices for this project
  - 4.3 Implementation
- 5 CPU
  - 5.1 Architecture and considerations
  - 5.2 The op-tables
  - 5.3 Implementation
- 6 Debugging the CPU
  - 6.1 The boot ROM
  - 6.2 Boot code analysis
  - 6.3 The execution so far
- 7 Timers
  - 7.1 Structure
    - 7.1.1 DIV
    - 7.1.2 TIMA
    - 7.1.3 TAC
    - 7.1.4 TMA
    - 7.1.5 Timing behaviors
  - 7.2 Implementation
- 8 Interrupts

## 1 Introduction

As a kid, I used to play on Nintendo consoles like the Wii and the DS, and I have always been keen on the games made for them. This passion for video games grew to the point that I became interested in how they are made, which led me to game development. Until this year, though, I had never asked myself one question: how are these games able to run on these consoles, and how can people make emulators so that I can play them on my personal computer? Thus, I decided to embrace the unknown world of emulation, which has always fascinated me but which I had always taken for granted.

Emulation is not well explained on the Internet. Mainly, the results you will find if you search for it are "the program pretends to be the console" or "you will be able to play old titles". Unfortunately I was not satisfied with these responses and I wanted to know more. Thus, my emulation journey started with looking for a full definition of this process.

I will try to give my own definition of emulation, to lay a starting point that will be deepened throughout the paper. To "make an emulator" means to develop software that does exactly what the hardware of the console does, so that when the game data is plugged in, the program knows how to read and handle it.

Emulation can only happen when the machine on which we run the software is more powerful than the hardware we want to emulate. For example, if our console has 2 KB of memory, we certainly cannot emulate it on a computer that has 2 KB or less, since the host computer also runs an operating system (which uses some of the RAM). I chose to make a Game Boy emulator because, among the retro consoles I looked at, it seemed to have the least complex hardware structure, making it a good way to start tackling this topic.

## 2 Premises

This emulator project is purely meant for understanding the concepts and theory behind how a machine like a console (or, similarly, a computer) is made, and also for fun. There are many better-developed Game Boy emulators online, and competing with the popular ones is nowhere near my goal. In addition, I could not reach the level of knowledge I am aiming for by just looking at other people's code; I wanted to fully understand the subject. Obviously, I have to start somewhere: I do not have the skills to reverse-engineer a real Game Boy (although it would be an extremely interesting challenge). For this reason, I only consulted theory guides written by the many passionate developers and hackers who have already done the work of studying the Game Boy from scratch. I used two sources for this project, in order to have a dual perspective on the subject. For anyone who would like to take on this challenge too, the guides are GBDev and GBEDG.

#### What this paper is not?

This paper is not, and was never intended to be, a guide; the real references are the ones mentioned above. This document is a report of my journey throughout the development of the emulator, made to understand the fundamentals of what is around us, from personal computers to smartphones. It could also be a way for readers to get passionate about this topic and an inspiration for them to make their own emulators (or even better, their own consoles!).

## 3 General Structure

The first thing I want to cover is in what way we want to structure our emulator.

```
int main() {
    // Hardware components definition
    Memory mem;
    CPU cpu;

    // Components initialization
    mem.init();
    cpu.init();
    cpu.load_boot();

    // ...

    while (true) {
        cpu.execute(mem); // Executing an operation
        // emulate all the other components
    }

    return 0;
}
```

In the real Game Boy circuit, each component works on its own, and all of them run at the same time. The CPU could be executing a simple addition while the PPU is rendering graphics onto the screen; all of these things happen simultaneously. This **can** be done in software, but it would add complexity. Hence we will pick a less complicated path and execute the components one at a time.

```
while (true) {
    // Returns how many clock cycles the instruction took
    int cycles = cpu.execute();

    // Updating all the other components
    timers.update(cycles);
    ppu.update(cycles);
    // ...
}
```

The real hardware is driven by the clock, while my implementation is driven by how many clock cycles each instruction took. This may cause some bugs and inaccuracies in the emulator (and that was my main concern), but in the end it worked just fine.
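To make the idea of "catching up" concrete, here is a minimal sketch of a component that should tick once every fixed number of clock cycles but only hears about time in instruction-sized lumps. The `PeriodicCounter` name and its fields are hypothetical and not taken from the emulator's code; the point is only that leftover cycles are carried over to the next update instead of being lost.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical component that should tick once every `period` clock cycles.
struct PeriodicCounter {
    int period;               // clock cycles per tick (must be > 0)
    int pending = 0;          // cycles received but not yet turned into a tick
    std::uint32_t ticks = 0;  // how many ticks have happened so far

    explicit PeriodicCounter(int p) : period(p) { assert(p > 0); }

    // Called after each instruction with the cycles that instruction took.
    void update(int cycles) {
        pending += cycles;
        while (pending >= period) {  // one instruction may span several ticks
            pending -= period;
            ++ticks;
        }
    }
};
```

With a period of 16, two 12-cycle instructions produce one tick and leave 8 cycles pending. The tick is only noticed after the second instruction has finished, which is the kind of small inaccuracy this approach accepts.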

## 4 Memory

Before looking at the main components that shape the Game Boy hardware, I would like to focus on how memory is subdivided inside the console.

### 4.1 Memory mapping

The address bus had 16 bits, meaning there could be 65'536 unique addresses (64 KB).

Since the Game Boy did not have any flash memory, those 64 KB were everything the console could access (this includes all the different RAMs, the cartridge data, and the registers made to control various components). Inside the Game Boy, some logic decides which component is activated based on the requested address, but we do not have to worry about it, since we are not dealing with actual hardware this time.

These are the regions into which the memory is split, along with a brief description of their use. Notice that some areas are marked as "Prohibited", though Nintendo has not provided an explanation for this.

| Start | End  | Description          |
|-------|------|----------------------|
| 0000  | 3FFF | 16 KB cartridge ROM  |
| 4000  | 7FFF | 16 KB cartridge ROM* |
| 8000  | 9FFF | 8 KB VRAM            |
| A000  | BFFF | 8 KB External RAM    |
| C000  | CFFF | 4 KB Work RAM        |
| D000  | DFFF | 4 KB Work RAM        |
| E000  | FDFF | Prohibited area      |
| FE00  | FE9F | OAM                  |
| FEA0  | FEFF | Prohibited area      |
| FF00  | FF7F | I/O Registers        |
| FF80  | FFFE | High RAM             |
| FFFF  | FFFF | IE register          |

<sup>\*</sup>switchable

Table 1: Game Boy's memory mapping

Most of these regions will be discussed later, based on the components that use them.
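The table can also be read as a decoding function. The sketch below maps an address to the row of Table 1 that contains it. The `Region` enum and `region_of` are illustrative names that do not appear in the emulator, which, as explained in the next section, does not split memory this way.

```cpp
#include <cstdint>

// Region names are illustrative; they follow the rows of Table 1.
enum class Region {
    CartridgeRomFixed,       // 0000-3FFF
    CartridgeRomSwitchable,  // 4000-7FFF
    VideoRam,                // 8000-9FFF
    ExternalRam,             // A000-BFFF
    WorkRam,                 // C000-DFFF
    Prohibited,              // E000-FDFF and FEA0-FEFF
    Oam,                     // FE00-FE9F
    IoRegisters,             // FF00-FF7F
    HighRam,                 // FF80-FFFE
    InterruptEnable          // FFFF
};

// Returns the region of Table 1 that contains `addr`.
inline Region region_of(std::uint16_t addr) {
    if (addr <= 0x3FFF) return Region::CartridgeRomFixed;
    if (addr <= 0x7FFF) return Region::CartridgeRomSwitchable;
    if (addr <= 0x9FFF) return Region::VideoRam;
    if (addr <= 0xBFFF) return Region::ExternalRam;
    if (addr <= 0xDFFF) return Region::WorkRam;
    if (addr <= 0xFDFF) return Region::Prohibited;
    if (addr <= 0xFE9F) return Region::Oam;
    if (addr <= 0xFEFF) return Region::Prohibited;
    if (addr <= 0xFF7F) return Region::IoRegisters;
    if (addr <= 0xFFFE) return Region::HighRam;
    return Region::InterruptEnable;
}
```

Because the checks run in ascending order, each branch only needs the upper bound of its region.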

### 4.2 Choices for this project

For simplicity, I decided not to break down all these areas into different regions of memory in the emulator. Instead, I opted for an easier solution: a single array of 65'536 bytes. Since the data bus was 8 bits wide, every address holds exactly one byte of data.

There is a trade-off in choosing this approach, though. On one hand, it makes things simpler to manage: all the memory is in one place, which really helps when debugging. On the other hand, it is not entirely correct. Each region of memory has different restrictions: cartridge memory should be read-only, and some areas might not be fully readable or writable at certain times (we will see an example when implementing the PPU).

By choosing this option, we make all the memory readable and writable by everyone, so technically the game could edit its own code (and this actually happened when I was emulating Tetris!).
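For comparison, this is roughly what the stricter alternative could look like. It is a sketch of the option I did not take: `GuardedMemory`, `load_rom`, `read` and `write` are hypothetical names, and silently dropping writes to 0000-7FFF is an assumption that only makes sense for a plain 32 KB cartridge without an MBC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Byte = std::uint8_t;
using Word = std::uint16_t;

// Hypothetical alternative to the flat array: same storage, but writes
// to the cartridge ROM region (0000-7FFF) are silently dropped.
struct GuardedMemory {
    static constexpr std::size_t SIZE = 64 * 1024;
    static constexpr Word ROM_END = 0x7FFF;
    Byte data[SIZE] = {};

    // Copies a ROM image of at most 32 KB to the start of the address space.
    void load_rom(const Byte* rom, std::size_t size) {
        assert(size <= static_cast<std::size_t>(ROM_END) + 1);
        std::memcpy(data, rom, size);
    }

    Byte read(Word addr) const { return data[addr]; }

    void write(Word addr, Byte value) {
        if (addr <= ROM_END) return;  // cartridge ROM is read-only
        data[addr] = value;
    }
};
```

The cost is that every access now goes through a function with a branch, and debugging a stray write becomes harder because it simply disappears.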

### 4.3 Implementation

```
struct Memory {
private:
    static constexpr u32 MAX_MEM = 64 * 1024;
    // Array of bytes to emulate all the Gameboy's addresses
    Byte Data[MAX_MEM] = {};

public:
    void init();

    /*
     * Functions used for setting and accessing memory as
     *   mem[addr] = value        to set
     *   Byte value = mem[addr]   to access
     * */
    Byte operator[](u32 addr) const;
    Byte& operator[](u32 addr);

    /*
     * Dumps all the memory in a file,
     * used for debugging purposes
     * */
    void dump(const char* filename);
};
```

This is the entire memory structure. Every component we create will have access to this structure in order to read from and write to memory using the two operator[] functions. I have also added a **dump** function that writes all the bytes to a binary file at the moment the function is called, for easier debugging. The **init** function just initializes the array by setting all values to 0. This is **not** actually done in a real Game Boy, as the memory typically contains random values when powered on. However, I decided to initialize it with all zeros to make debugging easier, since it lets me see whether any memory has changed.
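The declarations above leave the function bodies out. The following is one plausible way to fill them in, written as a self-contained sketch rather than the emulator's actual code: the `Byte` and `u32` aliases are assumed to be fixed-width integer types, and the bounds assertions are an addition of this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <fstream>

using Byte = std::uint8_t;  // assumed definitions of the aliases used above
using u32 = std::uint32_t;

struct Memory {
private:
    static constexpr u32 MAX_MEM = 64 * 1024;
    Byte Data[MAX_MEM] = {};

public:
    // Clears every byte, as described above.
    void init() { std::memset(Data, 0, MAX_MEM); }

    // Read-only access: Byte value = mem[addr]
    Byte operator[](u32 addr) const {
        assert(addr < MAX_MEM);
        return Data[addr];
    }

    // Writable access: mem[addr] = value
    Byte& operator[](u32 addr) {
        assert(addr < MAX_MEM);
        return Data[addr];
    }

    // Writes the raw 64 KB image to `filename` in binary mode.
    void dump(const char* filename) {
        std::ofstream out(filename, std::ios::binary);
        out.write(reinterpret_cast<const char*>(Data), MAX_MEM);
    }
};
```

Returning a reference from the non-const operator is what makes `mem[addr] = value` work, while the const overload returns a copy so that read-only views cannot modify memory.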

## 5 CPU

The first component we likely want to implement is the Central Processing Unit (CPU). It is the most important component in our circuit: it coordinates the other components and executes the program we give it. Thus, the first thing I had to implement was the full set of instructions that the processor can run.

### 5.1 Architecture and considerations

The Game Boy's CPU is a custom chip made by Sharp Corporation (which had a close relationship with Nintendo at that time). It is often referred to as **DMG-CPU** or **Sharp SM83** and runs at around 4.19 MHz. Its design took a lot of inspiration from the Zilog Z80 and the Intel 8080. Personally, I recently had the possibility to work with a real Z80, and over the past few months, I have gained hands-on experience with its architecture. Specifically, when studying the Zilog, I noticed some differences and similarities with the Game Boy's processor.

For example, the Nintendo processor lacks the IX and IY registers, which on the Zilog were used to set a base address that could be offset with the LD (IX+d), r and LD (IY+d), r instructions to save instruction bytes. Instead, the DMG-CPU introduced a brand new load instruction, LDH (load from high memory), which always offsets from address **FF00**, pointing to **High-RAM** and the **I/O registers**. I think custom-making their own CPU was the perfect choice for Nintendo, as it allowed them to implement changes like these to better suit their needs, save instruction bytes, and increase performance.
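The address arithmetic behind LDH is small enough to show in full. This is a minimal sketch assuming the flat 64 KB memory described earlier; `AddressSpace`, `ldh_address`, `ldh_store` and `ldh_load` are names made up for this example.

```cpp
#include <array>
#include <cstdint>

using Byte = std::uint8_t;
using Word = std::uint16_t;
using AddressSpace = std::array<Byte, 0x10000>;  // flat 64 KB memory

// The one-byte operand a8 selects an address in the range FF00-FFFF.
inline Word ldh_address(Byte a8) {
    return static_cast<Word>(0xFF00u | a8);
}

// LDH (a8),A : store the accumulator into high memory.
inline void ldh_store(AddressSpace& mem, Byte a8, Byte a) {
    mem[ldh_address(a8)] = a;
}

// LDH A,(a8) : load the accumulator from high memory.
inline Byte ldh_load(const AddressSpace& mem, Byte a8) {
    return mem[ldh_address(a8)];
}
```

Since the upper byte of the address is implied, the instruction needs only one operand byte instead of the two required by a load with a full 16-bit address.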

CPUs are the main core of every computer and what they do most of the time is execute instructions defined in some memory. In the next section, we will give a brief summary of the different types of instructions. But what is most important for now is to understand that the majority of these operate on internal registers and external memory.

| 8-bit registers |     | 16-bit pairs | 16-bit register | Description     |
|-----------------|-----|--------------|-----------------|-----------------|
| A*              | F** | AF           | SP              | Stack Pointer   |
| B               | C   | BC           | PC              | Program Counter |
| D               | E   | DE           |                 |                 |
| H               | L   | HL           |                 |                 |

<sup>\*</sup>Accumulator

<sup>\*\*</sup>Flags

Table 2: Game Boy's registers

Registers are the fastest memory to access because they are already inside the processor. However, their size is very limited, so we cannot keep everything in them. That is why the CPU has a set of instructions for loading data to and from larger memory.
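One way to model Table 2 in code is to store the eight 8-bit registers separately and build the 16-bit pairs on demand. The sketch below is only one possible layout, and the `Registers` struct with its accessor names is hypothetical. Keeping the low four bits of F at zero is an assumption based on how the flags register is commonly documented for this CPU.

```cpp
#include <cstdint>

using Byte = std::uint8_t;
using Word = std::uint16_t;

struct Registers {
    Byte a = 0, f = 0, b = 0, c = 0, d = 0, e = 0, h = 0, l = 0;
    Word sp = 0, pc = 0;

    // Combines two 8-bit registers into one 16-bit value (high byte first).
    static Word join(Byte high, Byte low) {
        return static_cast<Word>((high << 8) | low);
    }
    static Byte high_of(Word v) { return static_cast<Byte>(v >> 8); }
    static Byte low_of(Word v) { return static_cast<Byte>(v & 0xFF); }

    Word af() const { return join(a, f); }
    Word bc() const { return join(b, c); }
    Word de() const { return join(d, e); }
    Word hl() const { return join(h, l); }

    void set_bc(Word v) { b = high_of(v); c = low_of(v); }
    void set_de(Word v) { d = high_of(v); e = low_of(v); }
    void set_hl(Word v) { h = high_of(v); l = low_of(v); }

    // Assumption: only the upper four bits of F hold flags.
    void set_af(Word v) { a = high_of(v); f = static_cast<Byte>(v & 0xF0); }
};
```

For example, after `set_hl(0x9FFF)` the H register holds 9F and L holds FF, so instructions that address memory through HL and instructions that touch H or L alone see the same state.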

### 5.2 The op-tables

We can arrange all the instructions in tables, called opcode-tables, based on their identifier byte.

The Game Boy has two op-tables, shown in Figure 1, where instructions with similar behavior are marked with the same color. Since a single byte (one op-table) was insufficient to cover all instructions, a second table was added. Its instructions are 2 bytes long: the first byte is always 0xCB (acting as a prefix), and the second byte selects an entry in the table, which covers all bit operations.
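The prefix mechanism can be sketched as a small fetch step that decides which table the opcode belongs to. This assumes the flat 64 KB memory from Section 4; `Fetched` and `fetch_opcode` are hypothetical names, and the actual lookup in the two tables is left out.

```cpp
#include <array>
#include <cstdint>

using Byte = std::uint8_t;
using Word = std::uint16_t;
using AddressSpace = std::array<Byte, 0x10000>;  // flat 64 KB memory

struct Fetched {
    bool cb_prefixed;  // true: look the opcode up in the 0xCB table
    Byte opcode;       // index into the selected table
    Word next_pc;      // address of the first byte after the opcode
};

// Reads the opcode at `pc`, consuming the 0xCB prefix when present.
inline Fetched fetch_opcode(const AddressSpace& mem, Word pc) {
    const Byte first = mem[pc];
    if (first == 0xCB) {
        const Word second_at = static_cast<Word>(pc + 1);
        return Fetched{true, mem[second_at], static_cast<Word>(pc + 2)};
    }
    return Fetched{false, first, static_cast<Word>(pc + 1)};
}
```

Operand bytes, such as the d8 in LD B,d8, would be read starting from `next_pc` by whichever instruction the opcode selects.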

By briefly examining the different instructions, you can see that most of them perform the following operations:

1. Loading values into registers
2. Adding and subtracting values between registers
3. Reading from and writing to memory
4. Comparing values and manipulating individual bits in registers

Of course, some instructions perform other kinds of operations but, as I said above, most of them operate on the CPU's registers.
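As an illustration of the second kind of operation, here is a sketch of an 8-bit addition such as ADD A,B, including the flags it produces. I am reading the four flag characters listed under each instruction in the op-table as Z (zero), N (subtract), H (half carry) and C (carry), stored in the upper four bits of F. That reading, and the names `AluResult` and `add8`, are assumptions of this example.

```cpp
#include <cstdint>

using Byte = std::uint8_t;

// Assumed positions of the flags inside the F register.
constexpr Byte FLAG_Z = 0x80;  // result is zero
constexpr Byte FLAG_N = 0x40;  // last operation was a subtraction
constexpr Byte FLAG_H = 0x20;  // carry out of bit 3
constexpr Byte FLAG_C = 0x10;  // carry out of bit 7

struct AluResult {
    Byte value;  // new content of the accumulator
    Byte flags;  // new content of F
};

// 8-bit addition with the "Z 0 H C" flag pattern: N is always cleared.
inline AluResult add8(Byte a, Byte operand) {
    const unsigned sum = static_cast<unsigned>(a) + operand;
    AluResult r;
    r.value = static_cast<Byte>(sum & 0xFF);
    r.flags = 0;  // FLAG_N stays cleared for additions
    if (r.value == 0) r.flags |= FLAG_Z;
    if (((a & 0x0F) + (operand & 0x0F)) > 0x0F) r.flags |= FLAG_H;
    if (sum > 0xFF) r.flags |= FLAG_C;
    return r;
}
```

Adding 3A and C6, for instance, wraps around to 00 and sets Z, H and C at once, while adding 0F and 01 gives 10 and sets only H.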

## 8-bit opcodes

|    | x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | xA | xB | xC | xD | xE | xF |
|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
|    | NOP | LD BC,d16 | LD (BC),A | INC BC | INC B | DEC B | LD B,d8 | RLCA | LD (a16),SP | ADD HL,BC | LD A,(BC) | DEC BC | INC C | DEC C | LD C,d8 | RRCA |
| 0x | 1 4 | 3 12 | 1 8 | 1 8 | 1 4 | 1 4 | 2 8 | 1 4 | 3 20 | 1 8 | 1 8 | 1 8 | 1 4 | 1 4 | 2 8 | 1 4 |
|    |  |  |  |  | Z 0 H - | Z 1 H - |  | 0 0 0 C |  | - 0 H C |  |  | Z 0 H - | Z 1 H - |  | 0 0 0 C |
|    | STOP 0 | LD DE,d16 | LD (DE),A | INC DE | INC D | DEC D | LD D,d8 | RLA | JR r8 | ADD HL,DE | LD A,(DE) | DEC DE | INC E | DEC E | LD E,d8 | RRA |
| 1x | 2 4 | 3 12 | 1 8 | 1 8 | 1 4 | 1 4 | 2 8 | 1 4 | 2 12 | 1 8 | 1 8 | 1 8 | 1 4 | 1 4 | 2 8 | 1 4 |
|    |  |  |  |  | Z 0 H - | Z 1 H - |  | 0 0 0 C |  | - 0 H C |  |  | Z 0 H - | Z 1 H - |  | 0 0 0 C |
|    | JR NZ,r8 | LD HL,d16 | LD (HL+),A | INC HL | INC H | DEC H | LD H,d8 | DAA | JR Z,r8 | ADD HL,HL | LD A,(HL+) | DEC HL | INC L | DEC L | LD L,d8 | CPL |
| 2x | 2 12/8 | 3 12 | 1 8 | 1 8 | 1 4 | 1 4 | 2 8 | 1 4 | 2 12/8 | 1 8 | 1 8 | 1 8 | 1 4 | 1 4 | 2 8 | 1 4 |
|    |  |  |  |  | Z 0 H - | Z 1 H - |  | Z - 0 C |  | - 0 H C |  |  | Z 0 H - | Z 1 H - |  | - 1 1 - |
|    | JR NC,r8 | LD SP,d16 | LD (HL-),A | INC SP | INC (HL) | DEC (HL) | LD (HL),d8 | SCF | JR C,r8 | ADD HL,SP | LD A,(HL-) | DEC SP | INC A | DEC A | LD A,d8 | CCF |
| 3x | 2 12/8 | 3 12 | 1 8 | 1 8 | 1 12 | 1 12 | 2 12 | 1 4 | 2 12/8 | 1 8 | 1 8 | 1 8 | 1 4 | 1 4 | 2 8 | 1 4 |
|    |  |  |  |  | Z 0 H - | Z 1 H - |  | - 0 0 1 |  | - 0 H C |  |  | Z 0 H - | Z 1 H - |  | - 0 0 C |
|    | LD B,B | LD B,C | LD B,D | LD B,E | LD B,H | LD B,L | LD B,(HL) | LD B,A | LD C,B | LD C,C | LD C,D | LD C,E | LD C,H | LD C,L | LD C,(HL) | LD C,A |
| 4x | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 |
|    |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
|    | LD D,B | LD D,C | LD D,D | LD D,E | LD D,H | LD D,L | LD D,(HL) | LD D,A | LD E,B | LD E,C | LD E,D | LD E,E | LD E,H | LD E,L | LD E,(HL) | LD E,A |
| 5x | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 |
|    |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
|    | LD H,B | LD H,C | LD H,D | LD H,E | LD H,H | LD H,L | LD H,(HL) | LD H,A | LD L,B | LD L,C | LD L,D | LD L,E | LD L,H | LD L,L | LD L,(HL) | LD L,A |
| 6x | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 |
|    |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
|    | LD (HL),B | LD (HL),C | LD (HL),D | LD (HL),E | LD (HL),H | LD (HL),L | HALT | LD (HL),A | LD A,B | LD A,C | LD A,D | LD A,E | LD A,H | LD A,L | LD A,(HL) | LD A,A |
| 7x | 1 8 | 1 8 | 1 8 | 1 8 | 1 8 | 1 8 | 1 4 | 1 8 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 |
|    |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
|    | ADD A,B | ADD A,C | ADD A,D | ADD A,E | ADD A,H | ADD A,L | ADD A,(HL) | ADD A,A | ADC A,B | ADC A,C | ADC A,D | ADC A,E | ADC A,H | ADC A,L | ADC A,(HL) | ADC A,A |
| 8x | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 |
|    | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C | Z 0 H C |
|    | SUB B | SUB C | SUB D | SUB E | SUB H | SUB L | SUB (HL) | SUB A | SBC A,B | SBC A,C | SBC A,D | SBC A,E | SBC A,H | SBC A,L | SBC A,(HL) | SBC A,A |
| 9x | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 |
|    | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C |
|    | AND B | AND C | AND D | AND E | AND H | AND L | AND (HL) | AND A | XOR B | XOR C | XOR D | XOR E | XOR H | XOR L | XOR (HL) | XOR A |
| Ax | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 |
|    | Z 0 1 0 | Z 0 1 0 | Z 0 1 0 | Z 0 1 0 | Z 0 1 0 | Z 0 1 0 | Z 0 1 0 | Z 0 1 0 | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 |
|    | OR B | OR C | OR D | OR E | OR H | OR L | OR (HL) | OR A | CP B | CP C | CP D | CP E | CP H | CP L | CP (HL) | CP A |
| Bx | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 4 | 1 8 | 1 4 |
|    | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 | Z 0 0 0 | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C | Z 1 H C |
|    | RET NZ | POP BC | JP NZ,a16 | JP a16 | CALL NZ,a16 | PUSH BC | ADD A,d8 | RST 00H | RET Z | RET | JP Z,a16 | PREFIX CB | CALL Z,a16 | CALL a16 | ADC A,d8 | RST 08H |
| Cx | 1 20/8 | 1 12 | 3 16/12 | 3 16 | 3 24/12 | 1 16 | 2 8 | 1 16 | 1 20/8 | 1 16 | 3 16/12 | 1 4 | 3 24/12 | 3 24 | 2 8 | 1 16 |
|    |  |  |  |  |  |  | Z 0 H C |  |  |  |  |  |  |  | Z 0 H C |  |
|    | RET NC | POP DE | JP NC,a16 |  | CALL NC,a16 | PUSH DE | SUB d8 | RST 10H | RET C | RETI | JP C,a16 |  | CALL C,a16 |  | SBC A,d8 | RST 18H |
| Dx | 1 20/8 | 1 12 | 3 16/12 |  | 3 24/12 | 1 16 | 2 8 | 1 16 | 1 20/8 | 1 16 | 3 16/12 |  | 3 24/12 |  | 2 8 | 1 16 |
|    |  |  |  |  |  |  | Z 1 H C |  |  |  |  |  |  |  | Z 1 H C |  |
|    | LDH (a8),A | POP HL | LD (C),A |  |  | PUSH HL | AND d8 | RST 20H | ADD SP,r8 | JP (HL) | LD (a16),A |  |  |  | XOR d8 | RST 28H |
| Ex | 2 12 | 1 12 | 2 8 |  |  | 1 16 | 2 8 | 1 16 | 2 16 | 1 4 | 3 16 |  |  |  | 2 8 | 1 16 |
|    |  |  |  |  |  |  | Z 0 1 0 |  | 0 0 H C |  |  |  |  |  | Z 0 0 0 |  |
|    | LDH A,(a8) | POP AF | LD A,(C) | DI |  | PUSH AF | OR d8 | RST 30H | LD HL,SP+r8 | LD SP,HL | LD A,(a16) | EI |  |  | CP d8 | RST 38H |
| Fx | 2 12 | 1 12 | 2 8 | 1 4 |  | 1 16 | 2 8 | 1 16 | 2 12 | 1 8 | 3 16 | 1 4 |  |  | 2 8 | 1 16 |
|    |  | Z N H C |  |  |  |  | Z 0 0 0 |  | 0 0 H C |  |  |  |  |  | Z 1 H C |  |

## 16-bit opcodes, where the first 8 bits are 0xCB

|     | x0             | x1             | x2             | x3             | x4             | x5             | x6                 | x7             | x8             | x9             | xA             | xB             | xC             | xD             | xE                 | xF             |
|-----|----------------|----------------|----------------|----------------|----------------|----------------|--------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|--------------------|----------------|
|     | RLC B          | RLC C          | RLC D          | RLC E          | RLC H          | RLC L          | RLC (HL)           | RLC A          | RRC B          | RRC C          | RRC D          | RRC E          | RRC H          | RRC L          | RRC (HL)           | RRC A          |
| 0x  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C            | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C            | Z 0 0 C        |
|     | RL B           | RL C           | RL D           | RL E           | RL H           | RL L           | RL (HL)            | RL A           | RR B           | RR C           | RR D           | RR E           | RR H           | RR L           | RR (HL)            | RR A           |
| 1x  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C            | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C            | Z 0 0 C        |
|     | SLA B          | SLA C          | SLA D          | SLA E          | SLA H          | SLA L          | SLA (HL)           | SLA A          | SRA B          | SRA C          | SRA D          | SRA E          | SRA H          | SRA L          | SRA (HL)           | SRA A          |
| 2x  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C            | Z 0 0 C        | Z 0 0 0        | Z 0 0 0        | Z 0 0 0        | Z 0 0 0        | Z 0 0 0        | Z 0 0 0        | Z 0 0 0            | Z 0 0 0        |
|     | SWAP B         | SWAP C         | SWAP D         | SWAP E         | SWAP H         | SWAP L         | SWAP (HL)          | SWAP A         | SRL B          | SRL C          | SRL D          | SRL E          | SRL H          | SRL L          | SRL (HL)           | SRL A          |
| 3x  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     | Z 0 0 0        | Z 0 0 0        | Z 0 0 0        | Z 0 0 0        | Z 0 0 0        | Z 0 0 0        | Z 0 0 0            | Z 0 0 0        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C        | Z 0 0 C            | Z 0 0 C        |
|     | BIT 0,B        | BIT 0,C        | BIT 0,D        | BIT 0,E        | BIT 0,H        | BIT 0,L        | BIT 0,(HL)         | BIT 0,A        | BIT 1,B        | BIT 1,C        | BIT 1,D        | BIT 1,E        | BIT 1,H        | BIT 1,L        | BIT 1,(HL)         | BIT 1,A        |
| 4x  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -            | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -            | Z 0 1 -        |
|     | BIT 2,B        | BIT 2,C        | BIT 2,D        | BIT 2,E        | BIT 2,H        | BIT 2,L        | BIT 2,(HL)         | BIT 2,A        | BIT 3,B        | BIT 3,C        | BIT 3,D        | BIT 3,E        | BIT 3,H        | BIT 3,L        | BIT 3,(HL)         | BIT 3,A        |
| 5x  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -            | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -            | Z 0 1 -        |
|     | BIT 4,B        | BIT 4,C        | BIT 4,D        | BIT 4,E        | BIT 4,H        | BIT 4,L        | BIT 4,(HL)         | BIT 4,A        | BIT 5,B        | BIT 5,C        | BIT 5,D        | BIT 5,E        | BIT 5,H        | BIT 5,L        | BIT 5,(HL)         | BIT 5,A        |
| 6x  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -            | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -            | Z 0 1 -        |
|     | BIT 6,B        | BIT 6,C        | BIT 6,D        | BIT 6,E        | BIT 6,H        | BIT 6,L        | BIT 6,(HL)         | BIT 6,A        | BIT 7,B        | BIT 7,C        | BIT 7,D        | BIT 7,E        | BIT 7,H        | BIT 7,L        | BIT 7,(HL)         | BIT 7,A        |
| 7x  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -            | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -        | Z 0 1 -            | Z 0 1 -        |
|     | RES 0,B        | RES 0,C        | RES 0,D        | RES 0,E        | RES 0,H        | RES 0,L        | RES 0,(HL)         | RES 0,A        | RES 1,B        | RES 1,C        | RES 1,D        | RES 1,E        | RES 1,H        | RES 1,L        | RES 1,(HL)         | RES 1,A        |
| 8x  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     |                |                |                |                |                |                |                    |                |                |                |                |                |                |                |                    |                |
|     | RES 2,B        | RES 2,C        | RES 2,D        | RES 2,E        | RES 2,H        | RES 2,L        | RES 2,(HL)         | RES 2,A        | RES 3,B        | RES 3,C        | RES 3,D        | RES 3,E        | RES 3,H        | RES 3,L        | RES 3,(HL)         | RES 3,A        |
| 9x  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     |                |                |                |                |                |                |                    |                |                |                |                |                |                |                |                    |                |
|     | RES 4,B        | RES 4,C        | RES 4,D        | RES 4,E        | RES 4,H        | RES 4,L        | RES 4,(HL)         | RES 4,A        | RES 5,B        | RES 5,C        | RES 5,D        | RES 5,E        | RES 5,H        | RES 5,L        | RES 5,(HL)         | RES 5,A        |
| Ax  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     |                |                |                |                |                |                |                    |                |                |                |                |                |                |                |                    |                |
|     | RES 6,B        | RES 6,C        | RES 6,D        | RES 6,E        | RES 6,H        | RES 6,L        | RES 6,(HL)         | RES 6,A        | RES 7,B        | RES 7,C        | RES 7,D        | RES 7,E        | RES 7,H        | RES 7,L        | RES 7,(HL)         | RES 7,A        |
| Bx  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     |                |                |                |                |                |                |                    |                |                |                |                |                |                |                |                    |                |
|     | SET 0,B        | SET 0,C        | SET 0,D        | SET 0,E        | SET 0,H        | SET 0,L        | SET 0,(HL)         | SET 0,A        | SET 1,B        | SET 1,C        | SET 1,D        | SET 1,E        | SET 1,H        | SET 1,L        | SET 1,(HL)         | SET 1,A        |
| Cx  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     |                |                |                |                |                |                |                    |                |                |                |                |                |                |                |                    |                |
|     | SET 2,B        | SET 2,C        | SET 2,D        | SET 2,E        | SET 2,H        | SET 2,L        | SET 2,(HL)         | SET 2,A        | SET 3,B        | SET 3,C        | SET 3,D        | SET 3,E        | SET 3,H        | SET 3,L        | SET 3,(HL)         | SET 3,A        |
| Dx  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     |                |                |                |                |                |                |                    |                |                |                |                |                |                |                |                    |                |
|     | SET 4,B        | SET 4,C        | SET 4,D        | SET 4,E        | SET 4,H        | SET 4,L        | SET 4,(HL)         | SET 4,A        | SET 5,B        | SET 5,C        | SET 5,D        | SET 5,E        | SET 5,H        | SET 5,L        | SET 5,(HL)         | SET 5,A        |
| Ex  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     |                |                |                |                |                |                |                    |                |                |                |                |                |                |                |                    |                |
|     | SET 6,B        | SET 6,C        | SET 6,D        | SET 6,E        | SET 6,H        | SET 6,L        | SET 6,(HL)         | SET 6,A        | SET 7,B        | SET 7,C        | SET 7,D        | SET 7,E        | SET 7,H        | SET 7,L        | SET 7,(HL)         | SET 7,A        |
| Fx  | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 8            | 2 16               | 2 8            |
|     |                |                |                |                |                |                |                    |                |                |                |                |                |                |                |                    |                |

Figure 1: Game Boy's opcode-tables

It is important to note that some operations depend on the results of previous instructions. These results are saved in the so-called flags. Each flag is represented by a single bit, which is set to 1 when active, and all the flags are stored together inside the F register. Later, we will see that for simplicity I chose to use a separate variable for each flag instead of a single F variable.

These flags are:

| Bit* | Name | Description     |
|------|------|-----------------|
| 7    | zf   | Zero flag       |
| 6    | n    | Add/sub flag    |
| 5    | h    | Half carry flag |
| 4    | cy   | Carry flag      |
| 3-0  | -    | Not used        |

<sup>\*</sup>bit position inside the F register

Table 3: DMG-CPU's flags
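
When the flags live in separate variables, the F register has to be rebuilt whenever an instruction needs it as a whole byte (for example when AF is pushed onto the stack). The following is a minimal sketch of that conversion, based only on the bit positions in Table 3. The `Flags` struct and the helpers `pack_f` and `unpack_f` are hypothetical names used for illustration and are not taken from the emulator's source.

```cpp
#include <cstdint>

using Byte = std::uint8_t;

// Hypothetical container for the four flags, one variable each (0 or 1).
struct Flags {
    Byte z, n, h, c;
};

// Builds the F register: Z in bit 7, N in bit 6, H in bit 5, C in bit 4.
// Bits 3-0 are not used and are left at 0.
constexpr Byte pack_f(Flags f) {
    return static_cast<Byte>((f.z << 7) | (f.n << 6) | (f.h << 5) | (f.c << 4));
}

// Splits an F value back into separate flags, ignoring bits 3-0.
constexpr Flags unpack_f(Byte value) {
    return Flags{
        static_cast<Byte>((value >> 7) & 1),
        static_cast<Byte>((value >> 6) & 1),
        static_cast<Byte>((value >> 5) & 1),
        static_cast<Byte>((value >> 4) & 1),
    };
}
```

With this layout, a zero result that also produced a half carry would be packed as `0xA0`.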

#### 1. Zero flag

Set if the result of an operation is 0. Used for conditional jumps.

#### 2. Add/sub flag

Set to 1 if the previous operation was a subtraction, 0 if it was an addition. Used by the DAA instruction only.

#### 3. Half carry flag

Set when an arithmetic operation produces a carry out of the lower 4 bits of the operands (from bit 3 into bit 4), or a borrow into them when subtracting. It indicates that the lower nibble (4 bits) has overflowed.

#### 4. Carry flag

Set when an addition causes a carry beyond the most significant bit of a byte (either the first or the second one in 16-bit operations), or when a subtraction needs a borrow. Also set when a rotate/shift operation has shifted out a 1.
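
To make the four definitions concrete, here is a small sketch of how the flags could be computed for an 8-bit addition and an 8-bit subtraction. The `AluResult` struct and the functions `add8` and `sub8` are hypothetical helpers written for this illustration; the emulator itself sets the flags inside each instruction's `case`.

```cpp
#include <cstdint>

using Byte = std::uint8_t;

// Hypothetical result type: the 8-bit result plus the four flags.
struct AluResult {
    Byte value;
    Byte z, n, h, c;
};

// 8-bit addition: H is the carry out of bit 3, C the carry out of bit 7.
constexpr AluResult add8(Byte a, Byte b) {
    return AluResult{
        static_cast<Byte>(a + b),
        static_cast<Byte>(static_cast<Byte>(a + b) == 0),
        0,
        static_cast<Byte>(((a & 0xF) + (b & 0xF)) > 0xF),
        static_cast<Byte>((a + b) > 0xFF),
    };
}

// 8-bit subtraction: H signals a borrow into the low nibble, C a borrow into the byte.
constexpr AluResult sub8(Byte a, Byte b) {
    return AluResult{
        static_cast<Byte>(a - b),
        static_cast<Byte>(a == b),
        1,
        static_cast<Byte>((a & 0xF) < (b & 0xF)),
        static_cast<Byte>(a < b),
    };
}
```

For instance, `add8(0x0F, 0x01)` yields `0x10` with only the half carry set, while `add8(0xFF, 0x01)` wraps to `0x00` and sets the zero, half carry and carry flags together.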

Instructions can also take a different number of clock cycles to execute. (TODO)

## 5.3 Implementation

Thus, the first task I had to do was to implement every single instruction shown above, so that my virtual CPU would imitate the original Game Boy's processor behavior.

```
struct CPU {
    // 8-bit registers
    Byte A, B, C, D, E, H, L;
    Word SP; // stack pointer
    Word PC; // program counter

    // flags (IME is the interrupt master enable)
    Byte z, n, h, c, IME;

    Byte fetch_byte(u32& cycles, Memory& mem);
    Word fetch_word(u32& cycles, Memory& mem);

    Byte read_byte(Word addr, u32& cycles, Memory& mem);
    void write_byte(Word addr, Byte data, u32& cycles, Memory& mem);
    void write_word(Word addr, Word data, u32& cycles, Memory& mem);

    // ... Functions to execute different bit manipulations

    // ... List of all instructions written as
    // static constexpr Byte INS_[INSTRUCTION] = [OP-CODE];

    // ... Functions to handle interrupts (we will discuss them later)

    // Executes an instruction
    void exec_op(u32&, Memory&);

    // Gets called by the main loop
    u32 execute(Memory& mem);
};
```

This is the CPU structure. As you can see, I defined all the registers and flags, and I also declared some utility functions.

The two functions we need to focus on now are **execute** and **exec\_op**.

The **execute** function is called by the main loop and, besides executing an instruction, it also handles interrupts.

```
u32 CPU::execute(Memory& mem) {
    u32 cycles = 0;

    handle_interrupts(mem);

    exec_op(cycles, mem);

    // Handling the cartridge after the boot program is done
    // (we will see it later)
    if (PC == 0x100 && is_boot) {
        for (int i = 0; i < 0x100; ++i) {
            mem[i] = rom_first[i];
        }
        is_boot = false;
    }

    return cycles;
}
```

The **exec\_op** function, on the other hand, is responsible for executing the individual operations.

```
void CPU::exec_op(u32 &cycles, Memory& mem) {
    switch (Byte ins = fetch_byte(cycles, mem)) {
        case INS_LD_BL: {
            B = L;
            break;
        }
        case INS_LD_BHL: {
            Word addr = L | (H << 8);
            B = read_byte(addr, cycles, mem);
            break;
        }
        case INS_LD_BA: {
            B = A;
            break;
        }
        case INS_LD_BN: {
            B = fetch_byte(cycles, mem);
            break;
        }
        case INS_ADD_AB: {
            n = 0;
            h = ((A & 0xF) + (B & 0xF)) > 0xF;
            c = (u32)((A & 0xFF) + (B & 0xFF)) > 0xFF;
            A += B;
            z = A == 0;
            break;
        }
        // Just some examples of instructions, you can see the
        // whole implementation in cpu/cpu_ops.cpp
    }
}
```

## 6 Debugging the CPU

It was now time to test whether my CPU worked. I decided to do so by giving the Game Boy boot program to my emulator and seeing how it would behave.

#### 6.1 The boot ROM

The Game Boy has a little program burned inside the CPU that gets executed when the console is powered on and, among other things, shows the Nintendo® logo. This code is exactly **256 bytes** long and is stored in the first 256 addresses (0000 to 00FF in hexadecimal).

I decided to download the binary file, disassemble it myself, and study it from scratch.

(DISASSEMBLY MAYBE)

## 6.2 Boot code analysis

For anyone interested, I will attach my disassembly along with some comments and thoughts I jotted down while studying it.

Anyway, here is what the code does:

1. Resets VRAM
2. Sets up the audio to play the famous "ba-ding!" sound
3. Loads the Nintendo logo from the game cartridge into VRAM to display it on screen
4. Scrolls the logo
5. Checks if the Nintendo logo is correct by comparing it with its own version; if not, the Game Boy stops executing.

Some peculiar things are happening in this code that I have not been able to explain. The Game Boy contains the entire Nintendo logo (including the registered trademark), but it only displays the ® symbol from its own copy, while the "Nintendo" text is loaded from the game cartridge. Additionally, the logo is displayed on screen **before** it is checked for correctness.

#### 6.3 The execution so far

With this being said, I finally loaded the boot ROM into memory and started executing.

```
void CPU::load_bootup(const char *filename, Memory &mem) {
    std::ifstream file;
    file.open(filename, std::ios::in | std::ios::binary);

    if (file.is_open()) {
        // Copy the 256 bytes of the boot ROM to addresses 0x0000-0x00FF
        for (size_t i = 0; i < 0x100; ++i) {
            Byte value = 0;
            file.read((char*)&value, sizeof(char));
            mem[i] = value;
        }
    } else {
        std::cerr << "Failed to open file " << filename << std::endl;
    }

    file.close();
    is_boot = true;
}
```

To verify that my CPU implementation was accurate, I checked whether the registers that were supposed to be modified had the correct values, and they did!

The only issue now is that execution gets stuck between addresses 0x64 and 0x68. Looking at my disassembly, I noticed that the code was looping until register FF44 reached 0x90. However, after examining the rest of the code, I saw that this register was never modified, meaning it must be a read-only register managed by another component. This component is the PPU (Pixel Processing Unit), which handles rendering on the display. Since the PPU is rather complex and takes a long time to implement, I decided to break it down into sections that follow the order in which I implemented it.

Before working on the PPU, though, I first implemented two simpler components.

## 7 Timers

As the name suggests, timers are in charge of measuring time and triggering some code at regular intervals. One classic application of timers is a game that involves (pseudo) randomness. For example, reading the DIV register (the core counter) gives a fairly random value: because a game's execution follows an unpredictable order and instructions take different numbers of clock cycles to complete, DIV will likely hold a different value each time it is read.

#### 7.1 Structure

The timer has **four** memory-mapped registers: two of them do the counting, while the other two configure them.

#### 7.1.1 DIV

The DIV register is mapped to address **0xFF04** and is the core of the whole system. Internally, it is a 16-bit counter that is incremented every single clock cycle, although only the upper 8 bits are mapped to memory. The DIV register can be read at any time, while writing any value to it resets the whole 16-bit counter to 0.
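
The behavior described above can be sketched as a tiny stand-alone model. This is only an illustration under the assumptions stated in the text (one increment per clock cycle, upper byte visible, any write clears the counter); `DivCounter` and `div_after` are hypothetical names, and the emulator's real implementation is the `update` function shown in section 7.2.

```cpp
#include <cstdint>

using Byte = std::uint8_t;
using Word = std::uint16_t;

// Hypothetical stand-alone model of the internal 16-bit divider.
struct DivCounter {
    Word counter = 0;

    // Called once per clock cycle; wraps around to 0 after 0xFFFF.
    constexpr void tick() { counter = static_cast<Word>(counter + 1); }

    // What a read from 0xFF04 returns: only the upper 8 bits.
    constexpr Byte read() const { return static_cast<Byte>(counter >> 8); }

    // Any write to 0xFF04 clears the whole counter; the written value is ignored.
    constexpr void write(Byte) { counter = 0; }
};

// Small helper for experimenting: the visible DIV value after `ticks` clock cycles.
constexpr Byte div_after(std::uint32_t ticks) {
    DivCounter div;
    for (std::uint32_t i = 0; i < ticks; ++i) {
        div.tick();
    }
    return div.read();
}
```

Because only the upper byte is visible, the value read from 0xFF04 changes once every 256 clock cycles (64 M-Cycles).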

#### 7.1.2 TIMA

TIMA is a little more complex and gives us the possibility to count at different rates. It is mapped to address **0xFF05** and can be configured using the two registers TMA and TAC.

#### 7.1.3 TAC

This register controls the behavior of TIMA.

| Bit | 7-3      | 2      | 1-0          |
|-----|----------|--------|--------------|
| TAC | Not used | Enable | Clock select |

Table 4: TAC flags

Bit 2 just enables or disables TIMA's counting, while bits 1 and 0 set TIMA's incrementing frequency. Notice that 1 M-Cycle is equal to 4 clock cycles.

| Clock select | TIMA increments every |
|--------------|-----------------------|
| 00           | 256 M-Cycles          |
| 01           | 4 M-Cycles            |
| 10           | 16 M-Cycles           |
| 11           | 64 M-Cycles           |

Table 5: TAC clock select values
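
Decoding TAC amounts to a mask and a four-entry lookup. The sketch below mirrors the two tables above; the function names `tima_enabled`, `tima_period_mcycles` and `tima_period_clocks` are hypothetical and chosen only for this example.

```cpp
#include <cstdint>

using Byte = std::uint8_t;

// Bit 2 of TAC: TIMA only counts while this bit is set.
constexpr bool tima_enabled(Byte tac) {
    return (tac & 0x04) != 0;
}

// Bits 1-0 of TAC: number of M-Cycles between two TIMA increments.
constexpr std::uint32_t tima_period_mcycles(Byte tac) {
    return (tac & 0x03) == 0 ? 256
         : (tac & 0x03) == 1 ? 4
         : (tac & 0x03) == 2 ? 16
         : 64;
}

// Same period expressed in clock cycles (1 M-Cycle = 4 clock cycles).
constexpr std::uint32_t tima_period_clocks(Byte tac) {
    return tima_period_mcycles(tac) * 4;
}
```

For example, a TAC value of `0x05` enables the timer and selects the fastest rate, one increment every 4 M-Cycles (16 clock cycles).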

#### 7.1.4 TMA

When TIMA overflows, it is reset to the value stored in the TMA register and an interrupt is requested (we will see them later). An example of use is the following: if TMA is set to 0xFF and the frequency set in TAC is 256 M-Cycles, TIMA overflows again on its very next increment, so some piece of code gets executed every 256 M-Cycles.
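
The example generalizes to a simple formula: after a reload, TIMA starts at TMA and needs `0x100 - TMA` increments to overflow again. The sketch below computes the resulting interval between two timer interrupts. It deliberately ignores the short reload delay discussed in the next subsection, and `interrupt_interval_mcycles` is a hypothetical helper, not a function from the emulator.

```cpp
#include <cstdint>

using Byte = std::uint8_t;

// M-Cycles between two timer interrupts in steady state:
// TIMA is reloaded with TMA and overflows after (0x100 - TMA) increments,
// each of which takes `period_mcycles` M-Cycles (the value selected in TAC).
constexpr std::uint32_t interrupt_interval_mcycles(Byte tma, std::uint32_t period_mcycles) {
    return (0x100u - tma) * period_mcycles;
}
```

With TMA at `0xFF` and a period of 256 M-Cycles this gives 256 M-Cycles, matching the example above; with TMA at `0x00` and a period of 4 M-Cycles it gives 1024 M-Cycles.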

#### 7.1.5 Timing behaviors

When TIMA overflows, it is not reloaded instantly. Instead, it holds a value of zero for four clock cycles before it is updated. This update can be **aborted** by writing **any** value to TIMA during these four clock cycles. In this case, TIMA keeps the value that was written and an interrupt does **not** get requested. However, if TIMA is written on the **same** clock cycle on which the reload occurs, the write is ignored. If, instead, TMA is written on the same clock cycle on which the reload occurs, TMA is updated **before** its value is loaded into TIMA.

## 7.2 Implementation

I decided not to implement these oddities, although I **did** implement the TIMA overflow abort.

The **update** function structure is the following.

```
void Timers::update(u32 cycles, Memory& mem) {
    // The cycles parameter is the number of M-Cycles,
    // which then gets multiplied by 4 to get the number of clock cycles
    for (u32 i = 0; i < cycles * 4; ++i) {
        // If someone writes into DIV, the register gets reset to 0
        if (mem[DIV_REG] != ((DIV >> 8) & 0xFF))
            DIV = 0;

        // Incrementing DIV
        DIV++;
        mem[DIV_REG] = (DIV >> 8) & 0xFF;

        if (!tima_overflow) {
            // Check if TIMA needs to be incremented

            // ... condition logic

            if (is_increment) {
                const Byte tima_value = ++mem[TIMA_REG];

                if (tima_value == 0) {
                    tima_overflow = true;
                }
            }
        } else {
            // Handle TIMA overflow
            tima_overflow_cycles++;

            if (tima_overflow_cycles == 4) {
                mem[TIMA_REG] = mem[TMA_REG];
                mem[IF_REG] |= 1 << 2; // Requesting the timer interrupt
                tima_overflow = false;
                tima_overflow_cycles = 0;
            } else {
                if (mem[TIMA_REG] != 0) {
                    // Overflow aborted
                    tima_overflow = false;
                    tima_overflow_cycles = 0;
                }
            }
        }
    }
}
```

# 8 Interrupts