# armish Processor

# **Karlo Godfrey Escalona Gregorio**

## **Contents**

- 1 Preface
- 2 ISA Design
  - 2.1 RX-type: Fixed Point Operations
    - 2.1.1 Overview
    - 2.1.2 addx/subx
    - 2.1.3 adcx/sbcx
    - 2.1.4 mulx/divx
    - 2.1.5 absx
    - 2.1.6 cmpx
    - 2.1.7 notx
    - 2.1.8 andx/orrx/xorx
    - 2.1.9 Miscellaneous Notes: R-type Fixed-Point Instructions
  - 2.2 RF-type: Floating Point Operations
    - 2.2.1 Overview
  - 2.3 D-type: Data Movement Operations
    - 2.3.1 Overview
    - 2.3.2 ld
    - 2.3.3 st
    - 2.3.4 Miscellaneous Notes about D-type Instructions
  - 2.4 B-type: Branching Operations
    - 2.4.1 Overview
    - 2.4.2 bx
    - 2.4.3 b
    - 2.4.4 bl
    - 2.4.5 Important Notes for B-type Instructions
  - 2.5 Important Notes
- 3 Assembler
- 4 Architecture Design
  - 4.1 Execution Control
  - 4.2 Program Counter Adder
    - 4.2.1 Design
    - 4.2.2 Verification
  - 4.3 Instruction Memory
    - 4.3.1 Design
    - 4.3.2 Verification
  - 4.4 Main Register File
    - 4.4.1 Design
    - 4.4.2 Verification
  - 4.5 ALU + ALU Control
    - 4.5.1 Design
    - 4.5.2 Verification
  - 4.6 Data Memory
    - 4.6.1 Design
    - 4.6.2 Verification
  - 4.7 Pipelining and Hazard Control
    - 4.7.1 Design
    - 4.7.2 Verification
  - 4.8 FPU
    - 4.8.1 Design
    - 4.8.2 Verification
  - 4.9 FPU Control
    - 4.9.1 Design
    - 4.9.2 Verification
  - 4.10 Branch Prediction
    - 4.10.1 Design
    - 4.10.2 Verification
- 5 Performance

## 1 Preface

This project explores a custom instruction set inspired by a subset of the ARMv7 instruction set, together with a custom assembler. The architecture is implemented in hardware as an RTL model, whose functionality is verified.

The assembler is implemented in Python, and the RTL model is implemented in SystemVerilog, with an Arty-S7 25 as the example hardware target.

It should be noted that this architecture is an educational project inspired by ARM-style RISC design, using the ARM7TDMI-S data sheet as a reference. It is not ARM-compatible and does not use proprietary ARM encoding or IP.

## 2 ISA Design

All instruction words are 32 bits wide. Each instruction carries 4 condition bits that determine whether or not the instruction executes, based on the CPSR condition flags (N, Z, C, V). This makes it simpler to write conditional statements built from simple instructions. The condition codes are listed below.

| Condition Code | Instruction Suffix | Flags (NZCV) | Explanation |
|----------------|--------------------|--------------|-------------|
| 0000 | halt | N/A | Terminate program |
| 0001 | al | flags ignored | Always Executed |
| 0010 | le | Z set OR (N not equal to V) | Less Than or Equal |
| 0011 | gt | Z clear AND (N equals V) | Greater Than |
| 0100 | lt | N not equal to V | Less Than |
| 0101 | ge | N equals V | Greater Or Equal |
| 0110 | ls | C clear or Z set | Unsigned Lower or Same |
| 0111 | hi | C set and Z clear | Unsigned Higher |
| 1000 | vc | V clear | No Overflow |
| 1001 | vs | V set | Overflow |
| 1010 | pl | N clear | Positive or Zero |
| 1011 | mi | N set | Negative |
| 1100 | cc | C clear | Unsigned Lower |
| 1101 | cs | C set | Unsigned Higher or Equal |
| 1110 | neq | Z clear | Not Equal |
| 1111 | eq | Z set | Equal |

The halt condition code is used to terminate the program. An instruction carrying the halt suffix is not executed.
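The condition table can be read as a small lookup from the 4-bit code to a predicate on the flags. The following is a minimal sketch in Python, not the project's RTL or assembler code; the function name `condition_passes` is hypothetical, and halt is modeled simply as "does not execute".

```python
def condition_passes(cond: int, n: bool, z: bool, c: bool, v: bool) -> bool:
    """Return True if an instruction with 4-bit condition code `cond` executes.

    Illustrative sketch only: mirrors the condition-code table row by row.
    """
    checks = {
        0b0000: False,                 # halt: the instruction does not execute
        0b0001: True,                  # al:  flags ignored
        0b0010: z or (n != v),         # le:  Z set OR (N not equal to V)
        0b0011: (not z) and (n == v),  # gt:  Z clear AND (N equals V)
        0b0100: n != v,                # lt:  N not equal to V
        0b0101: n == v,                # ge:  N equals V
        0b0110: (not c) or z,          # ls:  C clear or Z set
        0b0111: c and (not z),         # hi:  C set and Z clear
        0b1000: not v,                 # vc:  V clear
        0b1001: v,                     # vs:  V set
        0b1010: not n,                 # pl:  N clear
        0b1011: n,                     # mi:  N set
        0b1100: not c,                 # cc:  C clear
        0b1101: c,                     # cs:  C set
        0b1110: not z,                 # neq: Z clear
        0b1111: z,                     # eq:  Z set
    }
    return checks[cond & 0xF]
```

For example, `condition_passes(0b0011, n=True, z=False, c=False, v=True)` evaluates the gt condition with N equal to V and Z clear, so the instruction would execute.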

## 2.1 RX-type: Fixed Point Operations

#### 2.1.1 Overview

The R-type instructions perform fixed-point arithmetic and data processing. Values are represented in Q8.8 format (8 integer bits and 8 fractional bits in a 16-bit register). A summary of the format can be seen in Figure 1, and explanations of the fields are given below the figure.



Figure 1: R-type format for fixed-point instructions.

| Field  | Bits    | Description |
|--------|---------|-------------|
| cond   | [31:28] | State of CPSR condition codes (based on NZCV flags) |
| type   | [27:26] | Encoding specific to instruction type |
| opcode | [25:22] | Determines the operation performed on operands |
| I      | 21      | Determines whether or not op2 is an immediate (I = 0 means op2 is not an immediate, but a shifted register) |
| S      | 20      | Determines whether or not to alter condition codes (S = 0 means do not alter) |
| $R_n$  | [19:16] | First source register |
| $R_d$  | [15:12] | Destination register |
| op2    | [11:0]  | Varying field depending on the instruction |

The layout of the op2 field in R-type instructions depends on whether or not the instruction uses an immediate. Each R-type instruction's own section takes a closer look at its op2 layout.

| Field  | Bits   | Description |
|--------|--------|-------------|
| shift  | [11:4] | Used for instructions using two source registers. The amount to shift the value in $R_m$ |
| $R_m$  | [3:0]  | Used for instructions using two source registers. The second source register |
| rotate | [11:8] | Used for instructions using one source register and one immediate. Rotates the immediate a specific number of positions |
| imm    | [7:0]  | Used for instructions using one source register and one immediate. The 8-bit constant that is rotated by rotate to form the operand |
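The two field tables above translate directly into shifts and masks. The sketch below is a hypothetical Python helper (the name `decode_rtype` and the dictionary layout are assumptions for illustration, not part of the project's assembler) that splits a 32-bit R-type word into its fields.

```python
def decode_rtype(word: int) -> dict:
    """Split a 32-bit R-type instruction word into its named fields.

    Illustrative sketch: bit positions follow the field tables above.
    """
    fields = {
        "cond":   (word >> 28) & 0xF,  # bits [31:28]
        "type":   (word >> 26) & 0x3,  # bits [27:26]
        "opcode": (word >> 22) & 0xF,  # bits [25:22]
        "I":      (word >> 21) & 0x1,  # bit 21
        "S":      (word >> 20) & 0x1,  # bit 20
        "Rn":     (word >> 16) & 0xF,  # bits [19:16]
        "Rd":     (word >> 12) & 0xF,  # bits [15:12]
        "op2":    word & 0xFFF,        # bits [11:0]
    }
    if fields["I"]:
        # Immediate form: 4-bit rotate and 8-bit immediate
        fields["rotate"] = (fields["op2"] >> 8) & 0xF
        fields["imm"] = fields["op2"] & 0xFF
    else:
        # Register form: 8-bit shift specification and second source register
        fields["shift"] = (fields["op2"] >> 4) & 0xFF
        fields["Rm"] = fields["op2"] & 0xF
    return fields
```

Returning a dictionary keeps the sketch short; the register-form `shift` byte is left undecoded here because its sub-fields are described separately for each instruction.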

Instructions take the following form:

```
(mnemonic) - (instruction suffix) (rd), (rn), (rm)
```

where, in each set of parentheses:

- mnemonic: the type of instruction (e.g. addx, subx, etc.)
- instruction suffix: the suffix that specifies the condition under which the instruction is executed
- rd: destination register
- rn: source register 1
- rm: source register 2/immediate

The supported instructions are listed below. It should be noted that because of some complex instructions, the ALU is pipelined to [insert how many stages here] stages.

| Instruction | Opcode | Description |
|-------------|--------|-------------|
| addx | 0000 | Adds two fixed-point values |
| subx | 0001 | Subtracts two fixed-point values |
| mulx | 0010 | Multiplies two fixed-point values |
| divx | 0011 | Divides two fixed-point values |
| absx | 0100 | Takes the absolute value of an operand |
| adcx | 0101 | Adds two fixed-point values and a carry flag from a previous instruction |
| sbcx | 0110 | Subtracts two fixed-point values and a carry flag from a previous instruction |
| cmpx | 0111 | Compares two operands and outputs the appropriate flag |
| notx | 1000 | Takes the bitwise NOT of an operand |
| andx | 1001 | Takes the bitwise AND of two operands |
| orrx | 1010 | Takes the bitwise OR of two operands |
| xorx | 1011 | Takes the bitwise XOR of two operands |

#### 2.1.2 addx/subx

The addx and subx instructions add or subtract two numbers and store the result in a destination register. The following snippet shows the cases for addx; subx follows the same format.

```
// add the values stored in r1 and r2 and store the result
// into r3
addx-al r3, r1, r2
// add 8 to the value stored in r1 and store the result into r3
addx-al r3, r1, #8
// add the values stored in r1 and r2, store the result
// into r3, and use the result to set NZCV flags
addx.s-al r3, r1, r2
// r3 = r1 - r2
subx-al r3, r1, r2
```

The op2 field in the instruction format for addx/subx takes on different forms depending on the value of bit 21 (I). For I=0, the 3rd operand is held in a register. For I=1, the 3rd operand is an immediate value.

**3rd Operand: Register** When the 3rd operand is a register, the value in the register can be manipulated through shifting before carrying out addition or subtraction.

```
// add the values stored in r1 and r2 (whose value is
// shifted logically to the left by a value specified in
// r4) and store them into r3
addx r3, r1, r2, lsl r4
// add the values stored in r1 and r2 (whose value is
// shifted logically to the left by 8) and store them
// into r3
addx r3, r1, r2, lsl #8
```

The op2 field specifications are as follows:



Figure 2: op2 field when the 3rd operand is a register. The top field is the format when the 3rd operand is shifted by a constant. The bottom field is the format when the 3rd operand is shifted by an amount specified in a register.

Field list for op2 (3rd operand register, shifted by immediate):

| Field       | Bits    | Description |
|-------------|---------|-------------|
| shtype      | [11:10] | The shift type performed on the 3rd operand |
| shamt       | [9:5]   | The amount that the 3rd operand is shifted by |
| $r_{shift}$ | 4       | The bit that specifies whether the shifting operand is a register or an immediate (value after lsl) |
| $R_m$       | [3:0]   | The register holding the second operand |

Field list for op2 (3rd operand register, shifted by register value):

| Field       | Bits    | Description |
|-------------|---------|-------------|
| shtype      | [11:10] | The shift type performed on the 3rd operand |
| $R_s$       | [9:6]   | The register that contains the amount that the 3rd operand is shifted by |
| unused      | 5       | unused |
| $r_{shift}$ | 4       | The bit that specifies whether the shifting operand is a register or an immediate (value after lsl) |
| $R_m$       | [3:0]   | The register holding the second operand |

The shift type (shtype) determines what kind of shift the 3rd operand goes through. The specifications for the shift type are as follows:

| Shift Type | Encoding | Description |
|------------|----------|-------------|
| ror | 00 | Rotate right |
| asr | 01 | Arithmetic shift right |
| lsr | 10 | Logical shift right |
| lsl | 11 | Logical shift left |

To carry out the operation without any shifting, simply omit the shift from the instruction. The shift is then assumed to be lsl #0, which performs no shift.
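To make the four shift types concrete, the following Python sketch applies a shift to a 16-bit register value using the shtype encodings from the table above. The function name `apply_shift` is hypothetical, and the treatment of shift amounts of 16 or more (rotation wraps modulo 16, asr saturates to the sign bit) is an assumption, since the text does not specify it.

```python
MASK16 = 0xFFFF

def apply_shift(value: int, shtype: int, amount: int) -> int:
    """Apply the shift selected by the 2-bit shtype code to a 16-bit value.

    Illustrative sketch; behavior for amount >= 16 is an assumption.
    """
    value &= MASK16
    if amount == 0:
        return value                       # e.g. the default lsl #0
    if shtype == 0b00:                     # ror: rotate right
        amount %= 16
        return ((value >> amount) | (value << (16 - amount))) & MASK16
    if shtype == 0b01:                     # asr: arithmetic shift right
        signed = value - 0x10000 if value & 0x8000 else value
        return (signed >> min(amount, 15)) & MASK16
    if shtype == 0b10:                     # lsr: logical shift right
        return value >> amount
    if shtype == 0b11:                     # lsl: logical shift left
        return (value << amount) & MASK16
    raise ValueError("shtype must be a 2-bit code")
```

With this sketch, `apply_shift(0x00FF, 0b11, 8)` models the `lsl #8` case from the snippet above.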

**3rd Operand: Immediate** When the 3rd operand is an immediate, it is encoded as an 8-bit value together with a 4-bit rotation, which allows the immediate to take a variety of values.



Figure 3: op2 format for when the 3rd operand is an immediate.

Field list for op2 (3rd operand immediate):

| Field  | Bits   | Description |
|--------|--------|-------------|
| rotate | [11:8] | Number defining how many times the immediate is rotated right in a 16-bit value |
| imm    | [7:0]  | Immediate to be encoded |

The 8-bit immediate field alone can encode values from 0 to 255. Since each register is 16 bits wide, an 8-bit immediate isn't enough to reach the full range of values a register can hold. With the rotate field, the remaining bits of the register can be reached, and a wider range of immediates can be used.

| rotate | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|--------|----|----|----|----|----|----|---|---|---|---|---|---|---|---|---|---|
| 0  |   |   |   |   |   |   |   |   | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| 1  | 0 |   |   |   |   |   |   |   |   | 7 | 6 | 5 | 4 | 3 | 2 | 1 |
| 2  | 1 | 0 |   |   |   |   |   |   |   |   | 7 | 6 | 5 | 4 | 3 | 2 |
| 3  | 2 | 1 | 0 |   |   |   |   |   |   |   |   | 7 | 6 | 5 | 4 | 3 |
| 4  | 3 | 2 | 1 | 0 |   |   |   |   |   |   |   |   | 7 | 6 | 5 | 4 |
| 5  | 4 | 3 | 2 | 1 | 0 |   |   |   |   |   |   |   |   | 7 | 6 | 5 |
| 6  | 5 | 4 | 3 | 2 | 1 | 0 |   |   |   |   |   |   |   |   | 7 | 6 |
| 7  | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   |   |   |   |   |   |   |   | 7 |
| 8  | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   |   |   |   |   |   |   |   |
| 9  |   | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   |   |   |   |   |   |   |
| 10 |   |   | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   |   |   |   |   |   |
| 11 |   |   |   | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   |   |   |   |   |
| 12 |   |   |   |   | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   |   |   |   |
| 13 |   |   |   |   |   | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   |   |   |
| 14 |   |   |   |   |   |   | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   |   |
| 15 |   |   |   |   |   |   |   | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |   |

Figure 4: How the rotate value determines which bit positions of the 16-bit operand (columns 15 to 0) are occupied by bits 7 to 0 of the immediate.

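One difference from ARMv7 is that the 8-bit immediate is rotated within a 16-bit value, and every one of the 16 rotation amounts is available, so the immediate can be placed at any bit position of a 16-bit operand. The directly encodable immediates therefore reach across the whole 16-bit width rather than only 0 to 255. Note that only values whose set bits fit within a single (possibly wrapped-around) 8-bit window can be encoded, so the 12-bit op2 field cannot represent every value from 0 to $2^{16}-1$. This makes the encodable range for the immediate much wider at the cost of higher hardware complexity.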
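The rotation scheme can be sketched in Python as a decoder (what the hardware would compute from rotate and imm) and a search-based encoder (what an assembler might do for a literal). Both function names are hypothetical, and the brute-force search over the 16 rotations is an assumption for illustration, not the project's actual assembler logic.

```python
def decode_immediate(rotate: int, imm: int) -> int:
    """Rotate the 8-bit `imm` right by `rotate` positions within 16 bits."""
    rotate &= 0xF
    imm &= 0xFF
    return ((imm >> rotate) | (imm << (16 - rotate))) & 0xFFFF

def encode_immediate(value: int):
    """Return a (rotate, imm) pair for a 16-bit value, or None if none exists.

    Illustrative sketch: tries each rotation and keeps the first that fits.
    """
    value &= 0xFFFF
    for rotate in range(16):
        # Undo a right rotation by rotating left by the same amount.
        imm = ((value << rotate) | (value >> (16 - rotate))) & 0xFFFF
        if imm <= 0xFF:
            return rotate, imm
    return None
```

A value such as 0xFF00 is found at rotate = 8 with imm = 0xFF, while a value such as 0xFFFF has more than eight set bits and returns `None`, which an assembler would need to report as an unencodable immediate.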

#### 2.1.3 adcx/sbcx

The adcx and sbcx instructions add or subtract two numbers together with the carry from a previous instruction and store the result in a destination register. The format is the same as for addx/subx; the only difference between the instructions is the opcode.

```
// add the values stored in r1 and r2, plus the carry flag,
// and store the result into r3
adcx-al r3, r1, r2
// add 8 and the carry flag to the value stored in r1 and
// store the result into r3
adcx-al r3, r1, #8
// add the values stored in r1 and r2, plus the carry flag,
// store the result into r3, and use the result to set NZCV flags
adcx.s-al r3, r1, r2
// r3 = r1 - r2, taking the carry flag into account
sbcx-al r3, r1, r2
```

#### 2.1.4 mulx/divx

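The mulx instruction multiplies two numbers and stores the product in two destination registers. The divx instruction divides two numbers and stores the quotient, and optionally the remainder, in destination registers.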

```
// multiply the values stored in r1 and r2 and store the
// product into r3 and r4
mulx-al r3, r4, r1, r2
// divide the values stored in r1 and r2 and store the
// quotient into r3 and the remainder in r4
divx-al r3, r4, r1, r2
// integer division of r1 and r2 and store the quotient
// into r3
divx-al r3, r1, r2
```

The op2 format for mulx is shown below. For full context, part of the rest of the instruction encoding is also shown.



Figure 5: *op*2 encoding for the mul instruction

Field list for op2 (mul):

| Field    | Bits    | Description |
|----------|---------|-------------|
| $R_{du}$ | [15:12] | Register to hold the upper 16 bits of the product (technically not part of $op2$) |
| $R_{dl}$ | [11:8]  | Register to hold the lower 16 bits of the product |
| unused   | [7:4]   | unused |
| $R_m$    | [3:0]   | The register holding the second operand |

Because the product of two 16-bit numbers is 32 bits wide, two registers are necessary to hold the entire product.
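As a worked illustration of why two registers are needed, the Python sketch below multiplies two raw 16-bit values and splits the 32-bit product into an upper and a lower half. The helper names `to_q88` and `mulx_split` are hypothetical, operands are treated as unsigned for simplicity, and any rescaling the hardware applies to the Q8.8 product is not modeled; all of these are assumptions.

```python
def to_q88(x: float) -> int:
    """Convert a real number to its raw 16-bit Q8.8 representation."""
    return round(x * 256) & 0xFFFF

def mulx_split(a: int, b: int) -> tuple:
    """Multiply two raw 16-bit values and split the 32-bit product.

    Illustrative sketch: unsigned operands, no rescaling of the result.
    """
    product = (a & 0xFFFF) * (b & 0xFFFF)
    upper = (product >> 16) & 0xFFFF   # half destined for R_du
    lower = product & 0xFFFF           # half destined for R_dl
    return upper, lower
```

For instance, 1.5 and 2.0 are 0x0180 and 0x0200 in Q8.8; their raw product is 0x00030000, which splits into an upper half of 0x0003 and a lower half of 0x0000. Interpreted with 16 fractional bits, that raw product is 3.0.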

The op2 format for div is shown below. For full context, part of the rest of the instruction encoding is also shown.



Figure 6: *op*2 encoding for the div/divi instructions.

Field list for op2 (div):

| Field    | Bits    | Description |
|----------|---------|-------------|
| $R_{dq}$ | [15:12] | Register to hold the quotient (technically not part of $op2$) |
| $R_{dr}$ | [11:8]  | Register to hold the remainder |
| rem      | 7       | Bit to decide whether or not to keep the remainder (rem = 1 means keep the remainder) |
| unused   | [6:4]   | unused |
| $R_m$    | [3:0]   | The register holding the second operand |

One register is used to store the quotient, and one register is used to store the remainder of the division. For integer division, where only the quotient is kept, the rem bit is set to 0, and the bits of $R_{dr}$ are all set to 1.

Some things to note about mul/div:

- Immediates cannot be used. The 32-bit instruction word does not have room for an immediate, since op2 also holds the second destination register. This means I is always set to 0
- NZCV flags cannot be set with mul and div. Allowing this would add too much hardware complexity. This means S is always set to 0
- Using the same register for both destination registers can result in unpredictable behavior

#### 2.1.5 absx

The absx instruction takes the absolute value of a register.

```
// take the absolute value of r1 and store into r2
absx-al r2, r1
```



Figure 7: op2 encoding for the absx instruction

op2 is set to all 0s, since absx is a unary operator. This also means that immediates have no purpose for this instruction, and neither does setting NZCV flags (I = 0, S = 0).

#### 2.1.6 cmpx

The cmpx instruction compares a register with a second register or with an immediate.

```
// compares r2 and r1 and sets the appropriate flags
cmpx-al r2, r1
// compares r2 and 145 and sets the appropriate flags
cmpx-al r2, #145
```



Figure 8: Encoding for bits [19:0] for cmpx

For op2, cmpx uses the same immediate encoding as add/sub instructions (see 2.1.2), allowing for both register and immediate comparison.  $R_d$  is set to 0.

Field list for op2 of cmpx (3rd operand register, I = 0):

| Field | Bits    | Description |
|-------|---------|-------------|
| r0    | [15:12] | The r0 register to send the final result to (to be ignored) |
| 0     | [11:4]  | Zero filling |
| $R_m$ | [3:0]   | The register holding the second operand |

Field list for op2 of cmpx (3rd operand immediate, I = 1):

| Field  | Bits    | Description |
|--------|---------|-------------|
| r0     | [15:12] | The r0 register to send the final result to (to be ignored) |
| rotate | [11:8]  | Number defining how many times the immediate is rotated right in a 16-bit value |
| imm    | [7:0]   | Immediate to be encoded |

The destination register is set to r0, the zero register, which is hardwired to the value 0, so the result of the comparison is discarded.

Some notes about cmpx:

• cmpx will always set flags. Do not add '.s'
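The immediate form of op2 (I = 1) can be illustrated with a small Python sketch. It assumes the rotate field is used directly as the rotation amount; the text does not say whether the field is scaled, so treat that as an assumption. The names `ror16` and `decode_cmpx_immediate` are hypothetical.

```python
def ror16(value: int, amount: int) -> int:
    """Rotate a 16-bit value right by `amount` bit positions."""
    value &= 0xFFFF
    amount %= 16
    return ((value >> amount) | (value << (16 - amount))) & 0xFFFF


def decode_cmpx_immediate(op2: int) -> int:
    """Expand the low 12 bits of op2 (I = 1) into a 16-bit operand."""
    rotate = (op2 >> 8) & 0xF  # bits [11:8]
    imm8 = op2 & 0xFF          # bits [7:0]
    # Assumption: the rotate field is the rotation amount itself (not scaled).
    return ror16(imm8, rotate)
```

With a rotate of 0, the operand is just the 8-bit immediate, so `#145` from the example above would be encoded as rotate = 0, imm = 0x91 under this assumption.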

#### 2.1.7 notx

The notx instruction can take the bitwise not of what is stored in the source register.

```
// take the bitwise not of r1 and store into r2
notx-al r2, r1
```



Figure 9: *op*2 encoding for the not instruction

op2 is set to all 0s, since not is a unary operator. This also means that neither immediates nor updates to the NZCV flags apply to this instruction (I = 0, S = 0).

#### 2.1.8 andx/orrx/xorx

The andx, orrx, and xorx instructions take the bitwise AND, OR, and XOR, respectively, of the source register and the 3rd operand.

```
// take the bitwise and of r1 and r2 and store into r3
andx-al r3, r1, r2
// take the bitwise and of r1 and #9 and store into r2
andx-al r2, r1, #9
```

```
// take the bitwise or of r1 and r2 and store into r3
orrx-al r3, r1, r2
// take the bitwise or of r1 and #0x00ff and store into r2
orrx-al r2, r1, #0x00ff
// take the bitwise xor of r1 and r2 and store into r3
xorx-al r3, r1, r2
// take the bitwise xor of r1 and #0x00ff and store into r2
xorx-al r2, r1, #0x00ff
```

The encoding done for op2 is identical to that of addx/subx instructions, so shifting operations can be applied to the 3rd operand (given that it is a register), if desired (see 2.1.2).
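As a rough reference model, the 16-bit behaviour of the logical instructions can be sketched in Python. The function names mirror the mnemonics, but the functions themselves are illustrative and not part of the assembler or the hardware. The mask matters mainly for notx, because Python integers are unbounded and `~x` would otherwise be negative.

```python
MASK16 = 0xFFFF  # registers are 16 bits wide


def notx(rn: int) -> int:
    """Bitwise NOT, truncated to 16 bits."""
    return ~rn & MASK16


def andx(rn: int, op2: int) -> int:
    """Bitwise AND of two 16-bit operands."""
    return (rn & op2) & MASK16


def orrx(rn: int, op2: int) -> int:
    """Bitwise OR of two 16-bit operands."""
    return (rn | op2) & MASK16


def xorx(rn: int, op2: int) -> int:
    """Bitwise XOR of two 16-bit operands."""
    return (rn ^ op2) & MASK16
```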

## 2.1.9 Miscellaneous Notes: R-type Fixed-Point Instructions

A few things to note about R-type instructions:

- To update the NZCV flags after addx or subx, append an 's' to the mnemonic (e.g. adds, subs).

## 2.2 RF-type: Floating Point Operations

Note: This section will be implemented if time allows for it. Implemented as a coprocessor according to ARM7-TDMI-S (Ch 4

#### 2.2.1 Overview

The RF-type instructions are floating-point arithmetic data-processing instructions that use the IEEE-754 floating-point format. A summary of the format can be seen in Figure 10, and explanations of the fields are given below the figure.



Figure 10: RF instruction type format.

|            | Field List |                                                              |
|------------|------------|--------------------------------------------------------------|
| Field      | Bits       | Description                                                  |
| cond       | [31:28]    | State of CPSR condition codes (based on NZCV flags)          |
| type       | [27:26]    | Encoding specific to instruction type                        |
| opcode     | [25:22]    | Determines the operation performed on operands               |
| unused     | 21         | unused                                                       |
| S          | 20         | Determines whether or not to alter condition codes ($S = 0$ means do not alter) |
| $R_n$      | [19:16]    | First source register                                        |
| $R_d$      | [15:12]    | Destination register                                         |
| $r_{mode}$ | [11:9]     | Specifies the rounding mode of the floating point operation. See the $r_{mode}$ table below for details. |
| unused     | [8:4]      | unused (might do flags for invalid operations)               |
| $R_m$      | [3:0]      | Varying field depending on the value of opcode               |
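The field list above can be read as a set of bit slices over a 32-bit instruction word. The following Python sketch extracts them; `decode_rf` and its dictionary keys are hypothetical names, and only the bit positions are taken from the table.

```python
def decode_rf(word: int) -> dict:
    """Split a 32-bit RF-type instruction word into its fields."""

    def bits(hi: int, lo: int) -> int:
        # Extract the inclusive bit range [hi:lo] from the word.
        return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

    return {
        "cond": bits(31, 28),
        "type": bits(27, 26),
        "opcode": bits(25, 22),
        "S": bits(20, 20),
        "rn": bits(19, 16),
        "rd": bits(15, 12),
        "r_mode": bits(11, 9),
        "rm": bits(3, 0),
    }
```

Bits 21 and [8:4] are listed as unused, so the sketch ignores them.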

| $r_{mode}$ | Description                                       |
|------------|---------------------------------------------------|
| 000        | Operation rounds toward 0                         |
| 001        | Operation rounds toward nearest, ties away from 0 |
| 010        | Operation rounds toward nearest, ties to even     |
| 011        | Operation rounds toward $+\infty$                 |
| 100        | Operation rounds toward $-\infty$                 |

Instructions take the following form:

```
(mnemonic).f-(instruction suffix) (rd), (rn), (rm), #(r_mode)
```

where, in each set of parentheses:

- mnemonic - the type of instruction (e.g. add, sub, etc.)
- instruction suffix - the instruction suffix that details the condition that the instruction is executed under
- rd - destination register
- rn - source register 1
- rm - source register 2
- $r_{mode}$ - the rounding mode for the floating point operation

A list of supported instructions is given below.

|             | Instructions |                                                          |
|-------------|--------------|----------------------------------------------------------|
| Instruction | opcode       | Description                                              |
| addf        | 1000         | Adds two floating-point values                           |
| subf        | 1001         | Subtracts two floating-point values                      |
| mulf        | 1010         | Multiplies two floating-point values                     |
| divf        | 1011         | Divides two floating-point values                        |
| cmpf        | 1100         | Compares two floating-point values                       |
| cnvf        | 1101         | Convert value to IEEE-754 floating-point standard format |
| sqrf        | 1110         | Takes square root of a floating-point value              |
| recf        | 1111         | Takes reciprocal of a floating-point value               |

A few things to note about RF-type instructions:

- The instructions cannot be used to set CPSR condition codes, and are undefined for immediate type instructions.
- To choose the rounding mode for a floating point operation, append '#' followed by the value of $r_{mode}$ as the last operand (e.g. add.f-al r1, r2, r3, #4 to round toward $-\infty$).
- The rounding mode is undefined for the cmp instruction; supply only the two operands being compared.
- Note the lack of immediate operations. To use an immediate value, create it in fixed-point representation with addi, and then convert it with convx2f.
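The five rounding modes differ only in how they treat values that fall between two representable results. The sketch below illustrates the rules using Python's `decimal` module to round to an integer. This is only an analogy on decimal values, not a model of the 16-bit floating-point hardware, and the `R_MODE` mapping and `round_to_integer` helper are hypothetical names.

```python
from decimal import (
    Decimal,
    ROUND_CEILING,
    ROUND_DOWN,
    ROUND_FLOOR,
    ROUND_HALF_EVEN,
    ROUND_HALF_UP,
)

# r_mode encoding from the table, mapped to the equivalent decimal rounding rule.
R_MODE = {
    0b000: ROUND_DOWN,       # toward 0
    0b001: ROUND_HALF_UP,    # to nearest, ties away from 0
    0b010: ROUND_HALF_EVEN,  # to nearest, ties to even
    0b011: ROUND_CEILING,    # toward +infinity
    0b100: ROUND_FLOOR,      # toward -infinity
}


def round_to_integer(value: str, r_mode: int) -> int:
    """Round a decimal string to an integer using the given r_mode."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=R_MODE[r_mode]))


for mode in sorted(R_MODE):
    print(format(mode, "03b"), round_to_integer("2.5", mode), round_to_integer("-2.5", mode))
```

The tie case 2.5 is useful because it separates all of the modes: modes 000, 010, and 100 give 2, while 001 and 011 give 3. For -2.5, only 001 and 100 give -3.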

## 2.3 D-type: Data Movement Operations

#### 2.3.1 Overview

The D-type instructions are used for loading and storing data from and into memory.



Figure 11: D instruction type format.

|        | Field List |                                                                  |
|--------|------------|------------------------------------------------------------------|
| Field  | Bits       | Description                                                      |
| cond   | [31:28]    | State of CPSR condition codes (based on NZCV flags)              |
| type   | [27:26]    | Encoding specific to instruction type                            |
| opcode | [25:22]    | Encoding specific to instruction                                 |
| U      | 21         | Determines whether the offset is added or subtracted (U = 1 means that the offset is added, U = 0 means that the offset is subtracted) |
| I      | 20         | Determines whether the offset is an immediate value or a register (I = 1 means that it is an immediate value, I = 0 means that the offset is stored in a register) |
| $R_n$  | [19:16]    | Address register used to interact with memory                    |
| $R_d$  | [15:12]    | Destination register                                             |
| offset | [11:0]     | Offset used to calculate where to load/store data. For a register offset, the register would be the least significant 4 bits |

Instructions take the following form:

```
(mnemonic)-(instruction suffix) (rd), [(rn), (offset)]
```

where, in each set of parentheses:

- mnemonic - the type of instruction (e.g. ldw, stw, etc.)
- instruction suffix - the instruction suffix that details the condition that the instruction is executed under
- rd - Register in register file to load or store to
- rn - Register holding the address to interact with in data memory
- offset - offset used to calculate where to load/store data

A list of supported instructions is given below.

|             | Instructions |                                                                                      |
|-------------|--------------|--------------------------------------------------------------------------------------|
| Instruction | opcode       | Description                                                                          |
| ldw         | 0000         | Loads a 16-bit word from data memory into a register in the register file            |
| ldb2l       | 0001         | Loads a byte from data memory into the lower byte of a register in the register file |
| ldb2h       | 0010         | Loads a byte from data memory into the upper byte of a register in the register file |
| stw         | 0011         | Stores a 16-bit word from a register in the register file into data memory           |
| stb2l       | 0100         | Stores the lower byte of a register in the register file into data memory            |
| stb2h       | 0101         | Stores the upper byte of a register in the register file into data memory            |

#### 2.3.2 ld

Load instructions are used to load data from data memory into the register file. Users have the option of loading an entire 16-bit word or just a byte, which can be written into the upper or lower byte of a register. Additionally, it is possible to choose whether or not the offset is defined by an immediate or by a register.

```
// Load a 16-bit word from data memory at address r1 + 2
ldw r2, [r1, #2]
// Load a 16-bit word from data memory at address r1 - 2
ldw r2, [r1, #-2]
// Load a 16-bit word from memory, offset by the value stored in r3
ldw r2, [r1, r3]
// Load a 16-bit word from memory, offset by r3 shifted left by 4
ldw r2, [r1, r3, lsl #4]
// Load the byte stored at address r1 into the lower byte of r2
ldb2l r2, [r1, #0]
// Load the byte stored at address r1 into the upper byte of r2
ldb2h r2, [r1, #0]
```

The value of the offset can be determined by a register or by an immediate value. These follow the same format as the 3rd operand in R-type instructions like addx and subx (see Section 2.1.2).
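The three load variants can be modelled in a few lines of Python. The sketch assumes a byte-addressed, little-endian data memory (little-endian is stated in Section 2.3.4), and it assumes that the byte loads leave the other byte of the destination register unchanged, which the text does not state explicitly. The function signatures are hypothetical.

```python
def ldw(mem: bytearray, addr: int) -> int:
    """Load a 16-bit little-endian word starting at `addr`."""
    return mem[addr] | (mem[addr + 1] << 8)


def ldb2l(mem: bytearray, addr: int, rd_old: int) -> int:
    """Load one byte into the lower byte of rd.

    Assumption: the upper byte of rd is preserved.
    """
    return (rd_old & 0xFF00) | mem[addr]


def ldb2h(mem: bytearray, addr: int, rd_old: int) -> int:
    """Load one byte into the upper byte of rd.

    Assumption: the lower byte of rd is preserved.
    """
    return (mem[addr] << 8) | (rd_old & 0x00FF)
```

For example, with memory bytes `34 12` at addresses 0 and 1, `ldw` at address 0 yields `0x1234` because the low byte is stored first.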

#### 2.3.3 st

Store instructions are used to store data from the register file into data memory. Users have the option of storing an entire 16-bit word or just a byte, which can be read from the upper or lower byte of a register. Additionally, it is possible to choose whether or not the offset is defined by an immediate or by a register.

```
// Store the 16-bit word in r2 into data memory at address r1 + 2
stw r2, [r1, #2]
// Store the 16-bit word in r2 into data memory at address r1 - 2
stw r2, [r1, #-2]
// Store the 16-bit word in r2 into memory, offset by the value stored in r3
stw r2, [r1, r3]
// Store the lower byte of r2 at address r1
stb2l r2, [r1, #0]
// Store the upper byte of r2 at address r1
stb2h r2, [r1, #0]
```

The value of the offset can be determined by a register or by an immediate value. These follow the same format as the 3rd operand in R-type instructions like addx and subx (see Section 2.1.2).

## 2.3.4 Miscellaneous Notes about D-type Instructions

- To load or store a byte, use the byte forms of the mnemonic (ldb2l/ldb2h, stb2l/stb2h); the word forms (ldw, stw) load/store a full 16-bit word.
- To specify whether an offset is added or subtracted, use positive offset values for adding, and negative offset values for subtracting (e.g. ldw r0, [r1, #8] for the address r1 + 8, ldw r0, [r1, #-8] for the address r1 - 8).
- To specify whether an offset is an immediate value or a register, use '#' to specify an immediate, or 'r' to specify a register (e.g. ldw r0, [r1, #8] for an immediate offset or ldw r0, [r1, r2] for a register offset).
- The hardware uses little-endian formatting.
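The notes above imply that a signed offset in assembly is split into the U bit and an unsigned magnitude. A minimal Python sketch of that split and of the resulting effective address follows. It ignores the rotate/immediate packing of the 12-bit offset field and assumes 16-bit address wrap-around; `split_offset` and `effective_address` are hypothetical helpers.

```python
def split_offset(offset: int):
    """Return (U, magnitude) for a signed assembly-level offset."""
    u = 1 if offset >= 0 else 0  # U = 1 adds the offset, U = 0 subtracts it
    return u, abs(offset)


def effective_address(base: int, u: int, magnitude: int) -> int:
    """Compute the address used by a load/store.

    Assumption: addresses wrap around at 16 bits.
    """
    delta = magnitude if u else -magnitude
    return (base + delta) & 0xFFFF
```

Under these assumptions, `[r1, #-8]` with r1 = 0x0100 is encoded as U = 0 with magnitude 8 and accesses address 0x00F8.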

## 2.4 B-type: Branching Operations

## 2.4.1 Overview

B-type instructions are used for procedure calls. The ISA uses relative branching.



Figure 12: B instruction type format for BX instruction

| Field List (BX) |         |                                                              |  |
|-----------------|---------|--------------------------------------------------------------|--|
| Field           | Bits    | Description                                                  |  |
| cond            | [31:28] | State of CPSR condition codes (based on NZCV flags)          |  |
| type            | [27:26] | Encoding specific to instruction type                        |  |
| R               | 25      | Determines whether the instruction is a BX instruction vs B or BL instructions (R = 0 means that it is a BX instruction, while R = 1 means that it is either a B or a BL instruction) |  |
| $R_b$           | [3:0]   | Address of the register containing the address to branch to  |  |



Figure 13: B instruction type format for B and BL instruction

|        | Field List (B or BL) |                                                                                                                                                                                       |  |
|--------|----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|
| Field  | Bits                 | Description                                                                                                                                                                           |  |
| cond   | [31:28]              | State of CPSR condition codes (based on NZCV flags)                                                                                                                                   |  |
| type   | [27:26]              | Encoding specific to instruction type                                                                                                                                                 |  |
| R      | 25                   | Determines whether the instruction is a BX instruction vs B or BL instructions (R = 0 means that it is a BX instruction, while R = 1 means that it is either a B or a BL instruction) |  |
| L      | 24                   | Determines whether the instruction is a B instruction vs a BL instruction ( $L = 0$ means that it is a B instruction, while $L = 1$ means that it is a BL instruction)                |  |
| offset | [23:0]               | Relative address of the label to branch to                                                                                                                                            |  |

Instructions take the following form:

```
(mnemonic)-(instruction suffix) (label)
```

where, in each set of parentheses:

- mnemonic - the type of instruction (e.g. b, bl, bx)
- instruction suffix - the instruction suffix that details the condition that the instruction is executed under
- label - the label or register containing the program counter value to branch to

| Instructions |                                                |  |
|--------------|------------------------------------------------|--|
| Instruction  | Description                                    |  |
| bx           | Branches to an address specified by a register |  |
| b            | Branch to a label                              |  |
| bl           | Branch and link                                |  |

#### 2.4.2 bx

Branch and exchange is a branching instruction that branches to an address stored in a register. It is commonly used to return from a procedure using the link register (r14).

```
// Return from procedure
bx lr
```

Some things to note:

- This is an absolute branching instruction: it loads the address stored in the register directly into the program counter.

#### 2.4.3 b

The general relative branch instruction branches to the address of a label. For conditional branching, the NZCV flags must be set by a previous instruction.

```
// go to label
b label
// go to label if registers r1 and r2 are equal
subs r0, r1, r2
beq label
```

#### 2.4.4 bl

The branch and link instruction stores the address of the next instruction before branching to a label.

```
// go to label and save the address of the instruction after this bl in lr
bl label
```

## 2.4.5 Important Notes for B-type Instructions

• B and BL instructions contain a signed 2's complement 24 bit offset.
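A decoder has to sign-extend the 24-bit offset field before adding it to the program counter, and it can tell the three branch forms apart from the R and L bits. The Python sketch below shows both steps using the bit positions from the field lists; the function names are hypothetical, and how the offset is scaled or which PC value it is relative to is not specified here, so the sketch stops at the signed offset.

```python
def sign_extend_24(field: int) -> int:
    """Interpret a 24-bit field as a signed 2's complement integer."""
    field &= 0xFFFFFF
    return field - (1 << 24) if field & (1 << 23) else field


def classify_branch(word: int) -> str:
    """Return 'bx', 'b', or 'bl' based on bits 25 (R) and 24 (L)."""
    r = (word >> 25) & 1
    l = (word >> 24) & 1
    if r == 0:
        return "bx"
    return "bl" if l else "b"
```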

## 2.5 Important Notes

• A label must be alone on its own line. In other words, this is allowed:

```
label:
addx-al r1, r2, r3
```

But this is not:

```
label: addx-al r1, r2, r3
```

- Labels don't have a specific syntax defined. As long as the label is before a ':', it is a valid label. Using multiple colons for a label causes undefined behavior.
- Negative shift amounts are undefined for instructions that involve shifting.
- All RX-type instructions operate on 16-bit fixed point 2's complement numbers. Likewise, all RF instructions operate on 16-bit floating point numbers. Using an instruction on an incompatible number will yield unexpected results.
- The only registers that should hold unsigned integers are PC and LR.

## 3 Assembler

The assembler is implemented as a two-pass assembler in Python. In the first pass, labels are assigned location counter (LC) values to represent where they will be stored in instruction memory. For an instruction memory of  $2^{16}$  addresses, 16 bits are used to represent the addresses. These values are stored in a symbol table implemented as a hash table. In the second pass, all instructions are put into their machine code counterpart in the following format (similar to .bin files):

```
0x##: ## ## ##
```

The number before the colon is a hexadecimal representation of the LC value, and the numbers after it are the hexadecimal representation of the instruction encoding. A binary version of this is also produced. Consider the following example instruction:

```
addx-al r0, #9
```

A few things to note about the assembler:

- Defining the same label more than once has undefined behavior. Since the symbol table is implemented as a Python dictionary, the most recent definition of the label will probably be the one that is used.
- There is nothing to check for invalid syntax. The programmer takes responsibility for making sure everything is correct.
- No bounds checking exists in the assembler.
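The first pass described above can be sketched as follows. This is an illustration, not the actual assembler source: the function name `first_pass`, the `//` comment stripping, and the assumption that the location counter advances by 4 bytes per 32-bit instruction are all hypothetical.

```python
def first_pass(lines, bytes_per_instruction=4):
    """Build the symbol table and collect instructions with their LC values."""
    symbols = {}        # label name -> location counter (LC) value
    instructions = []   # (LC, instruction text) pairs for the second pass
    lc = 0
    for raw in lines:
        line = raw.split("//")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith(":"):
            # A label must sit alone on its line; a repeated label simply
            # overwrites the earlier dictionary entry.
            symbols[line[:-1].strip()] = lc
            continue
        instructions.append((lc, line))
        lc += bytes_per_instruction  # assumption: 4 bytes per instruction
    return symbols, instructions
```

Because a plain dictionary assignment is used, a duplicated label silently takes the LC value of its last definition, which matches the behavior described in the first note.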

## 4 Architecture Design



Figure 14: Main datapath for the armish Processor

## 4.1 Program Execution Control

The hardware needs to know when to load the program into instruction memory and when to execute the program. To do this, a simple Moore FSM controls the top-level flow, deciding when to load the instructions and when to execute the program.



## 4.2 Program Counter Adder

The program counter adder calculates the next value of the program counter, a register that keeps track of the address of the next instruction.

## 4.2.1 Design

#### **Design Specification**

- Purpose and Scope
  - The program counter adder is an adder that calculates the next value of the program counter, a register that keeps track of the address of the current instruction.
  - The adder should produce a new address that will be a possible value for the program counter.
- Functional Requirements
  - Adds two 16-bit 2's complement numbers and outputs a positive 2's complement number representing the next value of PC.
  - While the inputs are two 2's complement numbers, the output should always be positive, as addresses can never be negative.
- Interface Specification
  - Inputs
    - PC: 16-bit 2's complement number that represents the current address of the program
    - offset: 16-bit 2's complement number that represents how much to add to PC
  - Outputs
    - PC\_next: 16-bit 2's complement number that represents the next address of the program

## Implementation

The PC adder is a simple adder.
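A behavioural model of the adder is a single masked addition. The Python sketch below is a reference model only, not the RTL, and `pc_adder` is a hypothetical name. Both inputs are treated as 16-bit patterns, so a negative offset is passed in its 2's complement form.

```python
def pc_adder(pc: int, offset: int) -> int:
    """Return PC_next as a 16-bit value; offset may be a 2's complement pattern."""
    return (pc + offset) & 0xFFFF


# A negative offset of -4 is the 16-bit pattern 0xFFFC.
print(hex(pc_adder(512, -4 & 0xFFFF)))
```

The final mask discards the carry out of bit 15, which is what makes adding the 2's complement pattern equivalent to subtracting.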



## 4.2.2 Verification

## **Test Plan**

| Program Counter Adder Test Plan |               |             |
|---------------------------------|---------------|-------------|
| #                               | Title         | Description |
| 1                               | Core Features |             |
| 1.1                             | Addition      | Perform addition of two 2's complement numbers for the following combinations: (a) positive number (address) with positive number (offset); (b) positive number (address) with negative number (offset), where the negative number should not be larger in magnitude than the positive number |

## **Tests**

The following tests were performed on the adder, and were successfully completed:

- Addition of a positive offset (ranging from 0 to 512) onto a 0 address.
- Addition of a positive offset (ranging from 0 to 1024) onto a positive address (512 in this case)
- Addition of a negative offset (ranging from 0 to -512) on a positive address (512 in this case)



Figure 15: Example waveform output for the pc adder

## 4.3 Instruction Memory

The instruction memory is a memory that holds the program instructions.

## 4.3.1 Design

## **Design Specification**

- Purpose and Scope
  - The instruction memory is a memory that holds the program instructions.
  - A central location to store and output instructions is necessary for an organized execution of instructions.
- Functional Requirements
  - Instruction memory is composed of 1024 possible memory locations, each location holding a 32-bit instruction.
  - When the write signal is asserted, the instruction memory should be loading instructions from a testbench.
  - When the write signal is deasserted, the instruction memory should be outputting instructions from an address given to it.
  - To be synthesizable, the instruction memory must be able to load instructions from an outside source (test bench).
- Interface Specification
  - Inputs
    - r\_address: the address of instruction memory to read from; determines what instruction is outputted (read mode)
    - w\_instruction: the instruction to be written into instruction memory (write mode)
    - w\_address: the address to be written into instruction memory (write mode)
    - w\_e: signal determining whether the instruction memory is in read mode or write mode
  - Outputs
    - instruction: the instruction to be outputted from instruction memory

## Implementation

The instruction memory has a memory size of 1024 32-bit words. The instruction memory loads instructions serially. When w\_e is asserted, it stores an input instruction at the given address every clock cycle until all words are loaded (load completion is determined by the testbench). After loading, instructions can be accessed by inputting a read address into the instruction memory. Because the instructions are byte addressable, the instruction memory also takes this into account when outputting the instruction.
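The read/write behaviour can be summarised with a small Python reference model. It is a sketch under two assumptions that the text only hints at: a byte address is converted to a word index by dropping its two low bits, and the output is all 0s while w\_e is asserted (as in the test plan). The class and method names are hypothetical.

```python
class InstructionMemory:
    """Behavioural model: 1024 words of 32 bits, one access per clock."""

    DEPTH = 1024

    def __init__(self):
        self.words = [0] * self.DEPTH

    def step(self, w_e, w_address=0, w_instruction=0, r_address=0):
        """Model one clock cycle and return the `instruction` output."""
        if w_e:
            # Write mode: store the instruction; the output is held at 0.
            index = (w_address >> 2) % self.DEPTH  # assumption: byte address / 4
            self.words[index] = w_instruction & 0xFFFFFFFF
            return 0
        # Read mode: output the word selected by the byte address.
        return self.words[(r_address >> 2) % self.DEPTH]
```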



Figure 16: RTL model of instruction mem.

#### 4.3.2 Verification

#### **Test Plan**

| Instruction Memory Test Plan |               |                                                                                                                                                                                                                  |  |
|------------------------------|---------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|
| #                            | Title         | Description                                                                                                                                                                                                      |  |
| 1                            | Core Features |                                                                                                                                                                                                                  |  |
| 1.1                          |               | When the write signal is asserted, and with an input address and an input instruction, the instruction should be written into the location specified in instruction memory. Output instruction should be all 0s. |  |
| 1.2                          |               | When the write signal is deasserted, and with an input address, an instruction should be fetched from the address in instruction memory and outputted.                                                           |  |

## **Tests**

To test these features, it is necessary to have a way for the testbench to communicate with the instruction memory when it is finished writing to it. To do this, 3 tasks were written. The first task reads memory from the machine code file into a memory located in the testbench. The second task counts the number of lines in the machine code file, so the processor knows when loading is completed. The third task loads an instruction into instruction memory every clock cycle. To test the functionality of the instruction memory, a program of 6 instructions was loaded into the instruction memory, and read in different ways.

All 6 instructions were loaded into the instruction memory and then read back in three ways: sequentially, at even addresses only, and at odd addresses only.



Figure 17: Waveform view of the instruction memory being loaded by instructions specified by w\_instruction at address w\_address.



Figure 18: Instructions being read out at addresses specified by r\_address.

## 4.4 Main Register File

A register file is a set of registers that can be quickly accessed to store data.

#### 4.4.1 Design

There are a total of 16 16-bit registers in the main register file, including the link register, program counter, and zero/discard register. The remaining 13 registers are general-purpose. 16-bit registers were chosen due to the goal of designing a processor that performs floating point operations, which are too complex to be done in 1 clock cycle for 32-bit operands. 16-bit operands can get very close to IEEE-754 compliance.

| Main Register File |                                |  |  |
|--------------------|--------------------------------|--|--|
| Register           | Purpose                        |  |  |
| r0                 | Zero Register/Discard Register |  |  |
| r1-r13             | General Purpose                |  |  |
| r14                | Link Register                  |  |  |
| r15                | Program Counter (PC)           |  |  |

## **Design Specification**

- Purpose and Scope
  - A central place to store intermediate values is needed to ensure timely calculations.
  - The main register file mainly stores data from fixed point operations. Floating point operations are handled in a separate register file.
  - Same cycle read/write is not covered for now, but will be covered in the pipelining section.
  - It is assumed that an instruction will never use both of its write ports to write to the same register.
- Functional Requirements
  - The register file outputs values when requested (read mode).
  - The register file stores values when requested (write mode).
    - The register file should be able to overwrite registers.
  - The register file should be capable of storing 16 16-bit values for all instructions specified by the ISA.
    - For ALU outputs that require more than 16 bits, the outputs should be stored in multiple registers in a known order.
    - r0 should always hold value 0, even when written to.
- Interface Specification
  - Inputs
    - r\_reg1: Register to read specified by the 2nd operand of the instruction ($R_n$)
    - r\_reg2: Register to read specified by the 3rd operand of the instruction ($R_m$)
    - w\_data1: Data to write to register specified by 1st operand of the instruction
    - w\_data2: Data to write to register specified by w\_reg2 (often optional)
    - w\_reg1: Register to write to specified by 1st operand of the instruction ($R_{d1}$)
    - w\_reg2: Register to write to specified by 4th operand of the instruction (often optional) ($R_{d2}$)
    - reg\_write1: Signal to allow for writing to the register file through the first write port
    - reg\_write2: Signal to allow for writing to the register file through the second write port
    - reset: Synchronous reset, sets all registers to 16'b0
    - clk: clock
  - Outputs
    - r\_data1: Data stored in register specified by the 2nd operand of the instruction
    - r\_data2: Data stored in register specified by the 3rd operand of the instruction

## Implementation

The register file is modeled as a memory with depth 16, each register with width 16. At the positive edge of each clock cycle, it checks for a reset signal. If the reset signal is high, then all the registers in the register file are set to 0. In absence of a reset signal, the register file checks its two reg\_write signals to decide whether it needs to write data to the register file. If either signal is high, data will be written to the register corresponding to the reg\_write signal that is high. If two writes are done at the same time, and if they are writing to the same register, the data specified by w\_reg1 will be prioritized. Data is read by requesting data for the registers specified by r\_reg1 and r\_reg2. The read is asynchronous, in that it will update combinationally if a write is performed on a register being read.
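The behaviour described above can be captured in a short Python reference model, which is handy for checking expected values in a testbench. It is a sketch rather than the RTL: the class name and method signatures are hypothetical, and r0 is forced back to 0 after every clock edge to model the hardwired zero register.

```python
class RegisterFile:
    """Behavioural model: 16 registers of 16 bits with two write ports."""

    def __init__(self):
        self.regs = [0] * 16

    def clock(self, reset=False,
              reg_write1=False, w_reg1=0, w_data1=0,
              reg_write2=False, w_reg2=0, w_data2=0):
        """Model one positive clock edge."""
        if reset:
            self.regs = [0] * 16  # synchronous reset clears every register
            return
        if reg_write2:
            self.regs[w_reg2] = w_data2 & 0xFFFF
        if reg_write1:
            # Port 1 is applied last, so it wins when both ports target
            # the same register.
            self.regs[w_reg1] = w_data1 & 0xFFFF
        self.regs[0] = 0  # r0 always reads as 0, even when written to

    def read(self, r_reg1, r_reg2):
        """Asynchronous read of two registers."""
        return self.regs[r_reg1], self.regs[r_reg2]
```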

#### 4.4.2 Verification

#### **Test Plan**

**Register File Test Plan**

| #     | Title                                           | Description |
|-------|-------------------------------------------------|-------------|
| 1     | Core Features                                   |             |
| 1.1   | Reset                                           | Reset should set all registers in the register file to 0 |
| 1.2   | Read/Write Functionality                        | When the write signal is asserted, the register file should store values, regardless of the instruction given. When data is requested, it should output the data stored in the requested register |
| 1.2.1 | Separate Cycle Write/Read (Same Register)       | Be able to write a value to a register in one cycle and then read it in the next |
| 1.2.2 | Separate Cycle Write/Read (Different Registers) | Be able to write a value to a register in one cycle and then read a different register in the next |
| 1.2.3 | r0 Constant                                     | r0 should remain constantly 0 even when written to |
| 1.2.4 | Overwriting                                     | Registers should be able to be overwritten (with the exception of r0) |
| 1.2.5 | Multi-Operand Function                          | Multiple registers should be able to be read and overwritten in the same clock cycle (MUL) or in different clock cycles |
| 1.2.6 | Unary Operations                                | When given a unary operation instruction (absx, notx), the 3rd operand position should output 0 |

#### **Tests**

A scoreboard was created to keep track of the expected values of the register file whenever it was written to or read from. It is also used to check directly against the register file that the data read from it matches what is expected. The tests were conducted to satisfy the test plan requirements as follows:

- 1.1: The reset signal was asserted and then deasserted in two separate clock cycles. Iterating through the first 4 registers, each was checked to hold 0.
- 1.2.1: Both reg\_write signals were set high, and data was written to the register file. After one clock cycle, data was read from the same register. This was done for both read and write ports.
- 1.2.2: Both reg\_write signals were set high, and data was written to the register file. After one clock cycle, data was read from a different register. This was done for both read and write ports.
- 1.2.3: Both reg\_write signals were set high. Then, w\_reg1 was set to 0 to write to register r0 through the first write port. Register r0 was checked while cycling through different write values. This was repeated through the second write port.
- 1.2.4: Both reg\_write signals were set high. The register file was preloaded with values. Each register in the register file was then overwritten with different values through the first write port. This was then repeated through the second write port.
- 1.2.5: Both reg\_write signals were asserted. Iterating through the first 8 registers, each write port was assigned a register number. When both write ports targeted the same register, it was checked that the first write port took priority. Otherwise, the registers specified by both write ports were written with the data corresponding to each write port.
- 1.2.6: Both reg\_write signals were set high. The register specified by r\_reg1 iterated over the register file while r\_reg2 was set to 0.
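The scoreboard idea used in these tests can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the testbench that was actually used: the class and method names (`Scoreboard`, `on_reset`, `on_write`, `check_read`) and the `w_data1`/`w_data2` argument names are hypothetical. The scoreboard mirrors the expected register contents on every write and compares each observed read against them.

```python
class Scoreboard:
    """Tracks expected register contents and checks observed reads."""

    def __init__(self, depth=16, width=16):
        self.mask = (1 << width) - 1
        self.expected = [0] * depth
        self.mismatches = []  # (register, expected, observed) tuples

    def on_reset(self):
        # After reset, every register is expected to hold 0.
        self.expected = [0] * len(self.expected)

    def on_write(self, reg_write1, w_reg1, w_data1,
                 reg_write2, w_reg2, w_data2):
        # Port 2 is recorded first so that port 1 wins when both ports
        # target the same register.
        for enable, reg, data in ((reg_write2, w_reg2, w_data2),
                                  (reg_write1, w_reg1, w_data1)):
            if enable and reg != 0:  # r0 is expected to stay 0
                self.expected[reg] = data & self.mask

    def check_read(self, reg, observed):
        # Compare a value read from the register file with the expectation.
        ok = observed == self.expected[reg]
        if not ok:
            self.mismatches.append((reg, self.expected[reg], observed))
        return ok
```

A test would call `on_write` with the same stimulus it drives into the register file and then pass each value read back to `check_read`. Any entry left in `mismatches` at the end of the run indicates a failed check.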

## 4.5 ALU + ALU Control

#### 4.5.1 Design

#### 4.5.2 Verification

## 4.6 Data Memory

#### 4.6.1 Design

Data memory is composed of 256 possible locations, each location holding 8 bits.

#### 4.6.2 Verification

## 4.7 FPU + FPU Control

#### 4.7.1 Design

#### 4.7.2 Verification

## 4.8 Pipelining and Hazard Control

#### 4.8.1 Design

#### 4.8.2 Verification

## 4.9 Branch Prediction

#### 4.9.1 Design

#### 4.9.2 Verification

## 5 Performance