## 1 Design

A single-instruction, multi-cycle MIPS processor (figure 1) was implemented as a simulator in C++ using the CLion IDE.



Figure 1: Multi-Cycle, single instruction MIPS processor schematic diagram.

The register topology used in a MIPS processor (figure 2) was implemented. The simulator refers to registers only by number, not by name. \$gp and \$fp (\$28 and \$30) were not implemented, and \$sp was set to the maximum value of the **PC** (slightly changing its true function). All values in the register file are converted to decimal for debugging purposes.





Figure 2: Register file structure.

Figure 3: A section of the memory file.

The memory layout (figure 3) does not mimic that of a MIPS processor; it simply starts at address 0, with a maximum address value of 1000 (this can be raised by increasing the array size of **memoryFiler**). The memory is still byte addressed and word aligned. The limited instruction set implemented includes the following instructions:

- beq
- add
- jr
- j
- jal
- sw
- lw
- addi

These instructions are sufficient to run the showcase program, as well as programs that use leaf and non-leaf procedures and arrays.
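The byte-addressed, word-aligned memory described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `MemoryFile`, its methods and the exception-based error handling are hypothetical and are not taken from the simulator's source.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the simulator's memory module (names are assumptions).
class MemoryFile {
public:
    static constexpr std::uint32_t kSizeBytes = 1000;  // maximum address value from the text

    // Store a 32-bit word at a byte address; the address must be a multiple of 4.
    void storeWord(std::uint32_t address, std::int32_t value) {
        words_.at(indexOf(address)) = value;
    }

    // Load the 32-bit word held at a byte address.
    std::int32_t loadWord(std::uint32_t address) const {
        return words_.at(indexOf(address));
    }

private:
    // Convert a byte address to a word index, rejecting misaligned or
    // out-of-range addresses.
    static std::size_t indexOf(std::uint32_t address) {
        if (address % 4 != 0) throw std::invalid_argument("address is not word aligned");
        if (address >= kSizeBytes) throw std::out_of_range("address beyond memory size");
        return address / 4;
    }

    std::array<std::int32_t, kSizeBytes / 4> words_{};  // zero-initialised words
};
```

With this layout, byte address 100 maps to word 25, which is why a load from address 100 returns the 26th stored value.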

## 2 Functionality

The system has an automated MIPS assembly to machine code translator. This is implemented by a class (**Translator**), which uses multiple methods to differentiate between the instructions of the **IS** (Instruction Set) and produce the binary string instruction (machine code) for each. The MIPS assembly .txt file is passed as input to the **Translator** class.
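As a rough illustration of producing a binary string instruction, the sketch below packs one I-type instruction, assuming the standard MIPS field layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate). `encodeIType` is a hypothetical helper and is not a method of the **Translator** class.

```cpp
#include <bitset>
#include <cstdint>
#include <string>

// Hypothetical helper: pack an I-type instruction (opcode | rs | rt | imm)
// into a 32-character binary string.
std::string encodeIType(unsigned opcode, unsigned rs, unsigned rt, int imm) {
    const std::uint32_t word =
        (static_cast<std::uint32_t>(opcode & 0x3F) << 26) |
        (static_cast<std::uint32_t>(rs & 0x1F) << 21) |
        (static_cast<std::uint32_t>(rt & 0x1F) << 16) |
        (static_cast<std::uint32_t>(imm) & 0xFFFF);  // keep the low 16 bits, two's complement
    return std::bitset<32>(word).to_string();
}
```

For example, `encodeIType(0x08, 0, 9, 1)` encodes `addi $9, $0, 1` as the fields 001000, 00000, 01001 and 0000000000000001 joined together.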

The menu of the program allows you to pick from 3 different running modes, with each running mode being selected by entering the accompanying number into the terminal and pressing enter.

- 0. Picking option 0 will run through the program cycle by cycle. When an instruction completes after its number of cycles, the instruction is output to the console, letting the user know which instruction has just been performed. The values of all the registers in the register file are output after each clock cycle is executed. This can be very useful to the user when debugging their own MIPS assembly code.
- 1. Picking option 1 will simply run through the entire program and output the final values in memory along with the final values in the register file.
- 2. Picking option 2 will take the user to the testing suite (figure 4). This suite does not test their code but instead tests the simulator and shows information about the code that was just executed. This information includes:
  - (a) The total number of clock cycles executed by the simulator
  - (b) The number of clock cycle errors
    - i. These would arise if an instruction (for example a j) took more or fewer cycles than expected for that instruction
    - ii. During correct operation of the simulator, this value should be 0
  - (c) The total number of instructions executed

```
Testing Mode:
Total Number of Clock Cycles: 2602
Number of Clock Cycle Errors: 0
Total Number of Instructions: 801
```

Figure 4: A snippet of the testing suite.
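The three values reported by the testing suite could be gathered along the following lines. This is a sketch only: `TestStats`, `record` and `kExpectedCycles` are hypothetical names, the expected counts are the per-instruction totals stated in sections 2.1 to 2.5, and addi is assumed to take 4 cycles.

```cpp
#include <map>
#include <string>

// Expected cycle counts per instruction, as stated in sections 2.1 to 2.5.
// addi is assumed to take 4 cycles (fetch, decode, execute, write-back).
const std::map<std::string, int> kExpectedCycles = {
    {"beq", 3}, {"add", 4}, {"jr", 4}, {"j", 3}, {"jal", 3},
    {"sw", 4},  {"lw", 5},  {"addi", 4}};

// Hypothetical record mirroring the three values printed in testing mode.
struct TestStats {
    int totalCycles = 0;
    int cycleErrors = 0;
    int totalInstructions = 0;

    // Record one finished instruction and flag any mismatch against the table.
    void record(const std::string& mnemonic, int cyclesTaken) {
        totalCycles += cyclesTaken;
        ++totalInstructions;
        const auto it = kExpectedCycles.find(mnemonic);
        if (it == kExpectedCycles.end() || it->second != cyclesTaken) {
            ++cycleErrors;
        }
    }
};
```

A j that took 4 cycles instead of 3 would raise `cycleErrors` by one while still contributing to both totals.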

The output of the simulator consists of the contents of the memory file, followed by a list containing the contents of the register file. The contents of the register file are what the user will usually use.

The processor consists of modules, and each module (ALU, ALU Out, Control, Instruction Register, Memory File, Memory Data Register, PC, Register File, and registers A and B) has its own class, whose job is to filter inputs and carry out the operations of that module. These include multiplexer operations (implemented with switch/case statements), arithmetic, binary sorting, etc. Each of these modules is called by the runInstructions function, which steps through the instructions (based on the value in the **PC**) and calls nested functions that each carry out a single clock cycle/stage. At the beginning of each nested function, the control signals for that clock cycle are set up, ensuring the correct dataflow through the datapath. These control signals are listed in appendix A. All instructions perform stages 1 (runIF, instruction fetch) and 2 (runID, instruction decode). runIF stores the instruction at the **PC** address in the instruction register, then increments the **PC** by 4.

$$PC = PC + 4 \tag{1}$$

runID ensures that the temporary registers **A** and **B** store the **rs** and **rt** values for the next clock cycle. It also stores the branch target in the **ALU Out** temporary register, i.e. the **PC** plus the sign-extended immediate shifted left by 2, for use in the next clock cycle.

$$A = rs, \quad B = rt \tag{2}$$

$$ALUOut = PC + (\text{sign-extend}(imm) << 2) \tag{3}$$
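Equations 1 to 3 can be sketched in code as below. The `Datapath` struct and the function bodies are illustrative assumptions that reuse the stage names from the text; they are not the simulator's actual classes, and the program is modelled as a plain vector of instruction words.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical datapath state; fields follow the registers named in the text.
struct Datapath {
    std::uint32_t pc = 0;
    std::uint32_t ir = 0;       // instruction register
    std::int32_t a = 0, b = 0;  // temporary registers A and B
    std::uint32_t aluOut = 0;   // ALU Out temporary register
    std::int32_t reg[32] = {};  // register file
};

// Stage 1: latch the instruction at PC, then PC = PC + 4 (equation 1).
void runIF(Datapath& d, const std::vector<std::uint32_t>& program) {
    d.ir = program.at(d.pc / 4);
    d.pc += 4;
}

// Stage 2: read rs and rt into A and B (equation 2) and precompute the
// branch target from the sign-extended immediate (equation 3).
void runID(Datapath& d) {
    const unsigned rs = (d.ir >> 21) & 0x1F;
    const unsigned rt = (d.ir >> 16) & 0x1F;
    const std::uint32_t raw = d.ir & 0xFFFFu;
    // Portable 16-bit to 32-bit sign extension.
    const std::int32_t imm = static_cast<std::int32_t>(raw ^ 0x8000u) - 0x8000;
    d.a = d.reg[rs];
    d.b = d.reg[rt];
    // Unsigned arithmetic so a negative offset wraps around correctly.
    d.aluOut = d.pc + (static_cast<std::uint32_t>(imm) << 2);
}
```

Because the **PC** has already been incremented in runIF, the branch target is relative to the address of the following instruction.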

From here on, the cycles become instruction-dependent. The opcode is used to differentiate between instructions, and from there the correct nested functions can be called.

#### 2.1 BEQ Command

The beq command (opcode 000100) is implemented using runBeqEXE. This carries out a **PC**-relative jump, depending on whether the zero flag has been raised by the **ALU**. The inputs to the **ALU** are **A** and **B**; the **ALU** subtracts them and raises the zero flag if they are equal.

$$\text{if } (A == B), \quad PC = ALUOut \tag{4}$$

This instruction takes a total of 3 clock cycles.
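A minimal sketch of the zero-flag comparison behind equation 4 follows. `AluResult` and `aluSubtract` are hypothetical names, and this version of runBeqEXE takes its inputs as plain arguments for illustration rather than reading them from the simulator's module classes.

```cpp
#include <cstdint>

// Result of the ALU subtraction used by beq.
struct AluResult {
    std::int32_t value;
    bool zero;  // raised when the result is 0, i.e. the operands were equal
};

AluResult aluSubtract(std::int32_t a, std::int32_t b) {
    // Subtract in unsigned arithmetic so overflow wraps instead of being undefined.
    const std::uint32_t diff =
        static_cast<std::uint32_t>(a) - static_cast<std::uint32_t>(b);
    return {static_cast<std::int32_t>(diff), diff == 0};
}

// Stage 3 of beq: return the new PC, taking the branch only when the
// zero flag is raised (equation 4).
std::uint32_t runBeqEXE(std::uint32_t pc, std::uint32_t aluOut,
                        std::int32_t a, std::int32_t b) {
    return aluSubtract(a, b).zero ? aluOut : pc;
}
```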

#### 2.2 ADD Command

The add command (opcode 000000) uses the funct section of its instruction (100000) to set up the correct **ALU** operation. In clock cycle 3, runAddEXE is called so that the values in **A** and **B** can be added together to give the **ALU Out** temporary register value.

$$ALUOut = A + B \tag{5}$$

runAddWB is then called to write ALU Out to register rd in the register file.

$$reg[rd] = ALUOut \tag{6}$$

This instruction takes a total of 4 clock cycles.

#### 2.3 JR Command

The jr command (opcode 000000) also uses the funct section of its instruction (001000) to set up the correct **ALU** operation. In clock cycle 3, runJrEXE is called so that the values in **A** and **B** can be added together to give the **ALU Out** temporary register value. In the standard jr encoding the **rt** field is 0, so **B** holds the value of \$0 and the sum equals **A**.

$$ALUOut = A + B \tag{7}$$

runJrWB is then called to write ALU Out to the PC, as a form of direct addressing.

$$PC = ALUOut \tag{8}$$

This instruction takes a total of 4 clock cycles.
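Since add and jr share opcode 000000, the last two stages can be pictured as one routine that switches on the funct field. The sketch below is an assumption-laden illustration: `CpuState` and `runRType` are hypothetical names, and it does not protect register \$0 from being overwritten.

```cpp
#include <cstdint>

// Hypothetical architectural state: only the PC and the 32 registers.
struct CpuState {
    std::uint32_t pc = 0;
    std::int32_t reg[32] = {};
};

// Stages 3 and 4 of an R-type instruction (opcode 000000).
// The funct field selects between add (100000) and jr (001000).
void runRType(CpuState& s, std::uint32_t ir) {
    const unsigned rs = (ir >> 21) & 0x1F;
    const unsigned rt = (ir >> 16) & 0x1F;
    const unsigned rd = (ir >> 11) & 0x1F;
    const unsigned funct = ir & 0x3F;

    // EXE: ALUOut = A + B, in unsigned arithmetic so overflow wraps.
    const std::uint32_t aluOut = static_cast<std::uint32_t>(s.reg[rs]) +
                                 static_cast<std::uint32_t>(s.reg[rt]);

    switch (funct) {
        case 0x20:  // add: reg[rd] = ALUOut (equation 6)
            s.reg[rd] = static_cast<std::int32_t>(aluOut);
            break;
        case 0x08:  // jr: PC = ALUOut (equation 8)
            s.pc = aluOut;
            break;
        default:    // other funct values are outside the implemented IS
            break;
    }
}
```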

#### 2.4 J and JAL Commands

The **EXE** stage of the j and jal commands (opcodes 000010 and 000011 respectively) is implemented using runJEXE. This carries out a direct jump, based on the value in the 26-bit immediate.

$$PC = PC[31 - 28] \mid (IR[25 - 0] << 2) \tag{9}$$

Each of these instructions takes a total of 3 clock cycles.
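Equation 9 amounts to two masks and a shift. The helper below is a hypothetical illustration (`jumpTarget` is not a name from the simulator) that keeps the top 4 bits of the **PC** and fills the rest from the 26-bit field.

```cpp
#include <cstdint>

// Jump target per equation 9: the upper 4 bits of PC joined with the
// 26-bit field of the instruction shifted left by 2.
std::uint32_t jumpTarget(std::uint32_t pc, std::uint32_t ir) {
    return (pc & 0xF0000000u) | ((ir & 0x03FFFFFFu) << 2);
}
```

For a j whose 26-bit field is 2, the target is byte address 8 whenever the upper 4 bits of the **PC** are zero, as they always are in a memory that ends at address 1000.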

#### 2.5 Sw, Lw, and Addi Commands

The **EXE** stage of the sw, lw and addi commands (opcodes 101011, 100011 and 001000 respectively) is implemented using runSwLwAddiEXE. This sign-extends the immediate value, adds it to the value in the temporary register **A**, and stores the result in **ALU Out**.

$$ALUOut = A + \text{sign-extend}(imm) \tag{10}$$

Afterwards, the stages become instruction-dependent.
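The sign extension in equation 10 can be written without relying on implementation-defined casts. In this sketch `signExtend16` is a hypothetical helper, and runSwLwAddiEXE is shown as a free function taking **A** and the instruction word as arguments purely for illustration.

```cpp
#include <cstdint>

// Sign-extend the low 16 bits of an instruction word to 32 bits.
std::int32_t signExtend16(std::uint32_t ir) {
    const std::uint32_t raw = ir & 0xFFFFu;
    // Flipping bit 15 and subtracting 0x8000 maps 0x8000..0xFFFF to negatives.
    return static_cast<std::int32_t>(raw ^ 0x8000u) - 0x8000;
}

// EXE stage shared by sw, lw and addi (equation 10).
std::int32_t runSwLwAddiEXE(std::int32_t a, std::uint32_t ir) {
    // Add in unsigned arithmetic so overflow wraps instead of being undefined.
    const std::uint32_t sum = static_cast<std::uint32_t>(a) +
                              static_cast<std::uint32_t>(signExtend16(ir));
    return static_cast<std::int32_t>(sum);
}
```

An immediate field of 0xFFF4 extends to -12, so with **A** = 15 the result is 3.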

#### 2.5.1 Sw Command

sw uses runSwMA to write the value of register **B** to memory at the address held in **ALU Out**.

$$memory[ALUOut] = B \tag{11}$$

This instruction takes a total of 4 clock cycles.

#### 2.5.2 Lw command

lw passes the data in memory at the position ALU Out to the MDR using runLwMA.

$$MDR = memory[ALUOut] \tag{12}$$

It then uses runLwwb to write the value in the MDR to the register file at position rt.

$$reg[rt] = MDR \tag{13}$$

This instruction takes a total of 5 clock cycles.

#### 2.5.3 Addi Command

addi writes the value in **ALU Out** to the register file, also at position **rt**.

$$reg[rt] = ALUOut \tag{14}$$

This instruction takes a total of 4 clock cycles.

## 3 Basic Testing

At the end of every function there is a commented-out section which starts with 'proof working'. Each of these checks that the clock cycle (depending on the instruction) performs the correct operation and produces the correct results. The correct results are given by the equations above, so 'proof working' checks that each function fulfils its equation.

Along with this, the clock cycle method (mode 0), was implemented early on to ensure that each clock cycle had the correct 'proof working' results, that each instruction had the correct number of clock cycles and (by outputting all of the values in the register file) that each register had the correct value associated with it. Using these in conjunction with the IDE debugger allowed me to fully debug any potential issues that could arise in the simulator.

I created 3 test programs to verify the simulator, and checked each with the methods described above. Together, these test programs checked that every instruction in my IS operated as expected. I first used mode 2 to confirm that the number of clock cycles was correct for each instruction. I then used mode 0 to confirm that the register file values updated correctly after each cycle. Finally, mode 1 was used to check that the final register values were correct and that the result had been stored correctly. Full details of testing these programs are in appendix B.

## 4 Showcase Testing

The showcase program uses the following algorithm.

$$i_n = i_{n-1} + n, \quad \text{where } n = 1, 3, 5, 7, \dots \tag{15}$$

This program was initially written in C and then translated into MIPS assembly code, shown in figure 5.

```
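addi $t1, $zero, 1        # $t1 = 1, the first odd number
addi $s0, $zero, 99       # $s0 = 99, the last loop index
LoopForA:
sw $t2, 0($t3)            # store the current square
addi $t3, $t3, 4          # advance to the next word
beq $t0, $s0, ExitA       # leave once index 99 has been stored
add $t2, $t2, $t1         # next square = square + odd number
addi $t1, $t1, 2          # next odd number
addi $t0, $t0, 1          # increment the loop index
j LoopForA
ExitA:
addi $t4, $zero, 100      # byte address of word 25
lw $t5, 0($t4)            # load 25 squared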
```

Figure 5: MIPS assembly showcase program

This program builds an array holding the squares of all integers from 0 to 99, stored in memory starting at address 0. The program ends with a load word from byte address 100, i.e. word 25 of the array, which tests that the correct value of 25 squared was stored. This is checked by looking at register \$t5 (\$13), which gives 625. \$t2 (\$10) holds the final result and should therefore be 99 squared, which is 9801.
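The recurrence in equation 15 can be checked with a few lines of C++. This is a reconstruction for illustration, not the original C program; `squaresByOddSums` is a hypothetical name, and the comments map each variable to the register that plays the same role in the assembly.

```cpp
#include <vector>

// Build the first `count` squares by repeatedly adding consecutive odd
// numbers, mirroring the loop in the showcase program.
std::vector<int> squaresByOddSums(int count) {
    std::vector<int> squares;
    int sum = 0;  // plays the role of $t2
    int odd = 1;  // plays the role of $t1
    for (int i = 0; i < count; ++i) {
        squares.push_back(sum);  // corresponds to sw $t2, 0($t3)
        sum += odd;              // corresponds to add $t2, $t2, $t1
        odd += 2;                // corresponds to addi $t1, $t1, 2
    }
    return squares;
}
```

With `count` set to 100, element 25 is 625 and element 99 is 9801, matching the register values quoted above.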

I used mode 2 to see how many total clock cycles were required to implement the program, giving me 2602 clock cycles, as shown in figure 6. Further evidence of results is in appendix C.

```
Testing Mode:
Total Number of Clock Cycles: 2602
Number of Clock Cycle Errors: 0
Total Number of Instructions: 801
```

Figure 6: Testing mode showcase program output
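The 2602-cycle total can be cross-checked by hand from the per-instruction cycle counts in section 2. The sketch below does that arithmetic, assuming addi takes 4 cycles and that labels cost no cycles; `showcaseCycles` is a hypothetical helper and is not part of the simulator.

```cpp
// Cycles for the showcase program, using the per-instruction counts from
// section 2 (addi assumed to take 4 cycles, labels assumed to cost none).
int showcaseCycles() {
    const int addi = 4, sw = 4, beq = 3, add = 4, j = 3, lw = 5;
    const int setup = 2 * addi;  // the two addi before the loop
    const int fullIteration = sw + addi + beq + add + addi + addi + j;
    const int lastIteration = sw + addi + beq;  // branch taken on the 100th pass
    const int tail = addi + lw;                 // the two instructions after ExitA
    return setup + 99 * fullIteration + lastIteration + tail;
}
```

The loop body runs in full 99 times at 26 cycles each, so the sum is 8 + 2574 + 11 + 9 = 2602, in agreement with figure 6.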

## 5 Conclusion

Overall, I am very content with my final simulator. I believe it gives the user an easy-to-use interface with a lot of functionality. The different modes allow debugging of both the simulator itself and the MIPS assembly code given to it. All of the functionality I planned to achieve by the end of the coursework (with regard to the simulator) was accomplished, and the code is presented in an easy-to-understand format, with comments and commented-out extra testing to help a reader of the code understand how the simulator works.

The simulator can successfully implement all of the testing programs along with the showcase program, while allowing the user to do the following tasks:

- Run the simulator cycle by cycle
- Run the simulator straight to the end
- Run a testing booth of the simulated MIPS code

If I were to continue developing the processor, I would expand the IS to include more instructions such as la and sll, and extend the translator to cover the wider IS. The testing booth could also report more data about the simulation. However, none of these improvements are necessities, and I believe the finished simulator is a more than sufficient MIPS processor simulator.

# **Appendix**

## **A** Control Signals

The control signals used to control the simulator include:

- PCWriteCond
- PCWrite
- IorD
- MemRead
- MemWrite
- MemtoReg
- IRWrite
- PCSource
- ALUOp
- ALUSrcA
- ALUSrcB
- RegWrite
- RegDst
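One way to hold these signals is a plain struct that each stage function fills in before the datapath is evaluated. The sketch below is an assumption: `ControlSignals` and `fetchSignals` are hypothetical names, and the instruction fetch values follow the standard textbook multi-cycle datapath (Patterson and Hennessy), which may differ from the settings used in this simulator.

```cpp
// Hypothetical container for the control signals listed above.
struct ControlSignals {
    bool pcWriteCond = false;
    bool pcWrite = false;
    bool iorD = false;
    bool memRead = false;
    bool memWrite = false;
    bool memToReg = false;
    bool irWrite = false;
    unsigned pcSource = 0;  // 2-bit multiplexer select
    unsigned aluOp = 0;     // 2-bit ALU operation class
    bool aluSrcA = false;
    unsigned aluSrcB = 0;   // 2-bit multiplexer select
    bool regWrite = false;
    bool regDst = false;
};

// Signals for the instruction fetch stage in the textbook datapath:
// read memory at PC, latch the instruction, and compute PC + 4.
ControlSignals fetchSignals() {
    ControlSignals c;
    c.memRead = true;  // read the instruction from memory
    c.irWrite = true;  // latch it into the instruction register
    c.pcWrite = true;  // write PC + 4 back to the PC
    c.aluSrcB = 1;     // select the constant 4 as the second ALU input
    return c;          // iorD, aluSrcA, aluOp and pcSource stay 0
}
```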

## **B** Detailed Testing

### **B.1** Test Program 1

The output of the program should be in \$s1 (\$17) and should be 125. This program produces 2 variables (**x** and **y**) and then sums them. If the sum == 25 (which it is), the program then increments the sum 5 times in a for loop and stores the result in \$17 (125).



Figure 7: Register file final output for test program 1



Figure 8: Test Program 1 MIPS assembly code

### **B.2** Test Program 2

This sets a variable a (\$a0) to 15, and a variable b to -12. It then runs a leaf procedure which increments the sum of the 2 variables.



Figure 9: Register file final output for test program 2



Figure 10: Test Program 2 MIPS assembly code

The expected result is in \$s0 (\$16) and should be 3.

### **B.3** Test Program 3

This is a very simple program and checks that addi and add can use negative values.



Figure 11: Register file final output for test program 3



Figure 12: Test Program 3 MIPS assembly code

The expected result in \$s0 (\$16) is -27.

## **C** Showcase Program

This program computes the squares of all integers from 0 to 99. The final square (that of 99) is stored in \$t2 (\$10) and is 9801, and the loaded square of 25 is stored in \$t5 (\$13) and is 625.

This was used in conjunction with modes 0 and 2 for debugging.

Figure 13: Register file final output for Showcase Program

```
Total Number of Clock Cycles: 2602
Number of Clock Cycle Errors: 0
Total Number of Instructions: 801
```

Figure 14: Showcase Program testing mode

```
addi $t1, $zero, 1
addi $s0, $zero, 99
LoopForA:
sw $t2, 0($t3)
addi $t3, $t3, 4
beq $t0, $s0, ExitA
add $t2, $t2, $t1
addi $t1, $t1, 2
addi $t0, $t0, 1
j LoopForA
ExitA:
addi $t4, $zero, 100
lw $t5, 0($t4)
```

Figure 15: Showcase Program MIPS assembly code

## **D** Simulator Menu

The user can use this menu to input the mode they wish to use the simulator in.

- 0 → cycle-by-cycle
- 1 → normal run
- 2 → testing booth

Figure 16: Simulator main menu