# **Computer Architecture 2022 Fall Lab 2**

## 1. Problem Description

In this lab, you are extending your Lab 1 to a pipelined CPU with a 2-bit dynamic global branch predictor. In Lab 1, we let the actual branch decision be determined at ID stage with an "equal" module added to the datapath. However, in this lab, we are replacing that "equal" module with a global branch predictor and utilizing the original resource of ALU (i.e., operation of subtraction) in the EX stage to examine the actual branch decision. We will test the correctness of your implementation by dumping the values specified in the provided reference "testbench.v" after each cycle.

## 1.1.2-bit Dynamic Branch Predictor

The major difference from Lab 1 is that we are replacing the branch decision at ID stage with a global branch predictor and verify the actual branch decision at EX stage using the original resource of ALU (i.e., operation of subtraction). All branches share the same predictor, that is, all branches receive the predictions from the same predictor and update the predictor with their actual branch decisions according to the state transition diagram shown in Figure 1.



Figure 1: state transition diagram used for the 2-bit dynamic branch predictor

## 1.2.Instructions

Instructions needed to be support in this lab are the same as Lab 1, where the machine codes are as follows:

| funct7       | rs2      | rs1 | funct3 | rd          | opcode  | function |
|--------------|----------|-----|--------|-------------|---------|----------|
| 0000000      | rs2      | rs1 | 111    | rd          | 0110011 | and      |
| 0000000      | rs2      | rs1 | 100    | rd          | 0110011 | xor      |
| 0000000      | rs2      | rs1 | 001    | rd          | 0110011 | sll      |
| 0000000      | rs2      | rs1 | 000    | rd          | 0110011 | add      |
| 0100000      | rs2      | rs1 | 000    | rd          | 0110011 | sub      |
| 0000001      | rs2      | rs1 | 000    | rd          | 0110011 | mul      |
| imm[11       | :0]      | rs1 | 000    | rd          | 0010011 | addi     |
| 0100000      | imm[4:0] | rs1 | 101    | rd          | 0010011 | srai     |
| imm[11:0]    |          | rs1 | 010    | rd          | 0000011 | lw       |
| imm[11:5]    | rs2      | rs1 | 010    | imm[4:0]    | 0100011 | SW       |
| imm[12,10:5] | rs2      | rs1 | 000    | imm[4:1,11] | 1100011 | ped      |

## 1.3.Input / Output Format

After you finish your modules and CPU, you should compile all of them including your "testbench.v". A recommended compilation command would be

Then by default, your CPU loads "instruction.txt", which should be placed in the same directory as CPU.out, into the instruction memory. "instruction.txt" is a plain text file that consists of 32 bits (ASCII 0 or 1) per line, representing one instruction per line. For example, the first 3 lines in "instruction.txt" are

```
0000000_00000_00000_000_01000_0110011 //add $t0,$0,$0
000000001010_00000_000_01001_0010011 //addi $t1,$0,10
000000001101 00000 000 01010 0010011 //addi $t2,$0,13
```

Note that underlines and texts after "//" (i.e. comments) are neglected. They are inserted simply for human readability. Therefore, the CPU should take

<sup>&</sup>quot;000000000000000000001000110011" and execute it in the first cycle, then

<sup>&</sup>quot;0000000101000000000010010010011" in the second cycle, and

<sup>&</sup>quot;0000000110100000000010100010011" in the third, and so on.

## 1.4. Modules You Need to Add or Modify

## 1.4.1. branch predictor

The predictor gives the prediction of whether the upcoming branch is a taken or not taken action. Then, the predictor is updated by the actual branch decision examined at the EX stage according to the state transition diagram shown in Figure 1. Initial state of the branch predictor is strongly taken (Top left of Figure 1).

## 1.4.2. ALU Control & ALU

You have to add additional control signals in the ALU\_Control and ALU modules to support the examination of the beq condition.

### 1.4.3. Additional Flushes

Since the the actual branch decision is determined at the EX stage now, you need to add proper flushes to the pipeline registers (e.g., ID/EX) if any misprediction occurs.

#### 1.4.4. testbench

You have to initialize reg in your pipeline registers before any instruction is executed. Please do not change the output format (\$fdisplay part) of the provided reference "testbench.v" file.

### 1.4.5. Others

You can add more modules than listed above if you want. You are also required to change any detail to let your CPU perform correctly.

## 2. Report

### 2.1. Modules Explanation

You should briefly explain how the modules you implement work in the report. You have to explain them in human-readable sentences. Either English or Chinese is welcome, but no Verilog. Explaining Verilog modules in Verilog is nonsense. Simply pasting your codes into the report with no or little explanation will get zero points for the report.

Take "PC.v" as an example, an acceptable report would be:

PC module reads clock signals, reset bit, start bit, and next cycle PC as input, and outputs the PC of the current cycle. This module changes its internal register "pc\_o" at the positive edge of the clock signal. When the reset signal is set, PC is reset to 0. And PC will only be updated by next PC when the start bit is on.

And following report will get zero points.

The inputs of PC are clk i, rst i, start i, pc i, and ouput pc o.

It works as follows:

```
always@(posedge clk_i or negedge rst_i) begin
    if(rst_i) begin
        pc_o <= 32'b0;
    end
    else begin
        if(start_i)
            pc_o <= pc_i;
        else
            pc_o <= pc_o;
    end
end</pre>
```

## 2.2. Difficulties Encountered and Solutions in This Lab

Write down the difficulties if any you encountered in doing this lab, and the final solution to them.

## 2.3. Development Environment

Please specify the OS (e.g. MacOS, Windows, Ubuntu) and compiler (e.g. iverilog) or IDE (e.g. ModelSim) you use in the report, in case that we cannot reproduce the same result as the one in your computer.

### 3. Submission Rules

Put all your Verilog codes into a directory named "codes", then put "codes" and your report (should be named in format "studentID\_lab2\_report.pdf") into a directory named "studentID\_lab2". Note that you have to REMOVE Instruction\_Memory.v, Data\_Memory.v, Registers.v, PC.v, instruction.txt, output.txt, which are provided by TA, in your submission. Please use ASCII-printable characters only. Finally, zip this directory, and upload this zip file onto NTU COOL before 12/27/2022 (Tue.) 23:59.

In short, we should see a single directory like the following structure after we type

```
$ unzip studentID_lab2.zip
```

in Linux terminal:

- studentID lab2/
  - o studentID\_lab2/codes
    - studentID lab2/codes/CPU.v

- studentID\_lab2/codes/ALU.v
- •
- o studentID lab2/studentID lab2 report.pdf

Make sure you remove the following files before submission: Instruction\_Memory.v,

Data\_Memory.v, PC.v, Registers.v, instruction.txt, output.txt. And make sure your testbench.v

reads instructions from "instruction.txt" and output to "output.txt".

## 4. Evaluation Criteria

We will compile your program in following command:

```
studentID lab2/codes/ $ iverilog *.v ../../*.v -o CPU.out
```

, where ../../\*.v includes Instruction\_Memory.v, Data\_Memory.v, PC.v, Registers.v, and the working directory is in your "codes/" directory. That is, some modules are provided outside the directory you submit.

- 4.1. Programming Part (80%)
  - We will have a demonstration session on both labs. You have to come up to briefly explain how your program works to get the credits of the programming part.
- 4.2. Report (20%)
- 4.3. Other
  - Minor mistakes (examples below) causing compilation error: -10 pts
    - wrong usage of "`include"
    - submitting unnecessary files (those provided by TA except CPU.v and testbench.v)
    - other mistakes that can be fixed within 5 lines
  - o Major mistakes causing compilation error: 0 pts on programming part
  - O No show up at demonstration: 0 pts on programming part
  - Wrong directory format: -10 pts
  - Wrong I/O paths: -10 pts
  - Late submission: -10 pts per day
  - o Plagiarism: 0 pts.

email to eclab.ca.ta@gmail.com if you have any questions.