# Lab 3: Single-cycle CPU Lab

EE312 Computer Architecture

Professor: John Kim

TA (Lab 3): Bongjoon Hyun, Taehun Kim, Jiin Kim

EMAIL: kaist.ee312.ta@gmail.com

## 1. Overview

Lab 3 is intended to give you hands-on experience in designing the simplest form of a modern microprocessor, in which each instruction is executed within a single clock cycle. Based on the concepts and skills you have acquired through Lab 1 and Lab 2, you are now ready to implement a single-cycle CPU. By the end of this lab assignment, you will have gained an in-depth understanding of the fundamentals of designing a CPU microarchitecture.

## 2. Background

### Control Path & Data Path

A CPU is composed of two kinds of units: control units and data units. The control path consists of the control units, which "control" the data path of the processor (e.g., ALUs, memory) and determine how the processor responds to each fetched instruction. The data path is the collection of functional units that perform data-processing operations, such as arithmetic logic units (ALUs) or multipliers, together with registers and buses. As its name suggests, the data path is the path along which data flows inside the processor.
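As a rough illustration of this split, the sketch below shows a control unit as its own module that turns the 7-bit opcode into a few control signals for the data path. The module name *ctrl\_sketch* and all signal names are hypothetical and are not taken from the given template. The opcode values are the standard RV32I major opcodes; the custom instructions of this lab are not covered here.

```verilog
// Minimal control-path sketch (hypothetical names, not the template's).
// It only derives four example control signals from the RV32I major opcode.
module ctrl_sketch (
    input  wire [6:0] opcode,     // inst[6:0]
    output reg        reg_write,  // instruction writes a destination register
    output reg        mem_read,   // instruction reads data memory (loads)
    output reg        mem_write,  // instruction writes data memory (stores)
    output reg        is_branch   // instruction is a conditional branch
);
    always @(*) begin
        // Defaults avoid inferring latches in combinational logic.
        reg_write = 1'b0;
        mem_read  = 1'b0;
        mem_write = 1'b0;
        is_branch = 1'b0;
        case (opcode)
            7'b0110011,                     // OP      (ADD, SUB, ...)
            7'b0010011,                     // OP-IMM  (ADDI, ...)
            7'b0110111,                     // LUI
            7'b0010111,                     // AUIPC
            7'b1101111,                     // JAL
            7'b1100111: reg_write = 1'b1;   // JALR
            7'b0000011: begin               // LOAD
                reg_write = 1'b1;
                mem_read  = 1'b1;
            end
            7'b0100011: mem_write = 1'b1;   // STORE
            7'b1100011: is_branch = 1'b1;   // BRANCH
            default: ;                      // unknown opcode: keep defaults
        endcase
    end
endmodule
```

A real control unit would also need to produce signals such as the ALU operation and operand selects, which depend on your own data path design.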

### Single-cycle CPU

A single-cycle CPU executes one instruction per clock cycle. Within each clock cycle, it must carry out the Instruction Fetch (IF), Instruction Decode (ID), Execute (EX), Memory (MEM), and Write Back (WB) stages of a single instruction, so that the architectural state is updated as the instruction "instructs" the CPU. The IF stage fetches the instruction that the Program Counter (PC) points to in the instruction memory. The ID stage decodes the fetched instruction and determines which operations are required to carry it out, as defined by the Instruction Set Architecture (ISA). The EX stage performs primitive operations (e.g., add, multiply, bit-wise operations) as well as memory address calculations. The MEM stage performs memory operations (e.g., load, store) at the address specified by the instruction. The WB stage updates the architectural state of the processor: register or memory values are updated as a result of executing the instruction. The exact control and data paths are shown in Fig. 1, which is the MIPS version of a single-cycle CPU design. Remember that the goal of Lab 3 is to design your own "RISC-V" version of such a single-cycle CPU microarchitecture.



Figure 1. Control & Data Path of the Single-cycle CPU (MIPS ver)
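To make the ID stage described above more concrete for RISC-V, the following sketch splits a fetched 32-bit instruction into the fixed-position fields defined by the RV32I base instruction formats. The module name *rv32i\_fields* and its port names are hypothetical; only the bit positions come from the RV32I specification.

```verilog
// Field extraction for a 32-bit RV32I instruction word.
// Module and port names are illustrative, not part of the given template.
module rv32i_fields (
    input  wire [31:0] inst,
    output wire [6:0]  opcode,  // major opcode
    output wire [4:0]  rd,      // destination register index
    output wire [2:0]  funct3,  // minor opcode
    output wire [4:0]  rs1,     // first source register index
    output wire [4:0]  rs2,     // second source register index
    output wire [6:0]  funct7   // extra function bits (R-type)
);
    assign opcode = inst[6:0];
    assign rd     = inst[11:7];
    assign funct3 = inst[14:12];
    assign rs1    = inst[19:15];
    assign rs2    = inst[24:20];
    assign funct7 = inst[31:25];
endmodule
```

Because these fields always sit at the same bit positions, the register file read addresses can be taken directly from the instruction word without waiting for the rest of the decode logic.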

### Instruction Set Architecture (ISA)

An Instruction Set Architecture (ISA) is an abstract model of a computer. An ISA defines everything that a machine language programmer needs to know to program the computer. "x86", "ARM", "MIPS", and "RISC-V" are common ISAs that you have probably heard of many times. As already announced in the lectures, you are going to implement a CPU that follows the RISC-V ISA throughout the lab assignments.

RISC-V is an open ISA based on established reduced instruction set computing (RISC) principles. A RISC-V tutorial is available from Lab 1. If you have not done so yet, please watch the pre-recorded video from the TAs; we hope it gives you enough knowledge to complete *Lab 3*. If anything about the RISC-V ISA is still unclear, please review the RISC-V tutorial and the RISC-V manual. In *Lab 3* and the rest of the lab assignments, we focus on the base 32-bit integer instruction set, RV32I, with the addition of a few custom instructions. The full RV32I documentation is available on KLMS (riscv-spec-v2.2.pdf), so please refer to it to make sure your single-cycle CPU follows our customized RV32I correctly. You can also refer to the following manual to get an idea of the semantics of some of the RV32I instructions. Remember, though, that your goal is to implement the RV32I instruction set (plus the custom instructions), not the TinyRV version below, which covers only a subset of RV32I.

https://www.csl.cornell.edu/courses/ece4750/handouts/ece4750-tinyrv-isa.txt
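One part of the RV32I encoding that is easy to get wrong is the immediate, whose bits are laid out differently in each instruction format. The sketch below reconstructs the sign-extended 32-bit immediate for the I, S, B, U, and J formats as defined in the RV32I specification. The module name *rv32i\_imm* and the output names are hypothetical, and how you select among these values is up to your own design.

```verilog
// Immediate generation for the RV32I instruction formats.
// Names are illustrative; bit layouts follow the RV32I base formats.
module rv32i_imm (
    input  wire [31:0] inst,
    output wire [31:0] imm_i,  // loads, JALR, OP-IMM
    output wire [31:0] imm_s,  // stores
    output wire [31:0] imm_b,  // conditional branches
    output wire [31:0] imm_u,  // LUI, AUIPC
    output wire [31:0] imm_j   // JAL
);
    // inst[31] is the sign bit of the immediate in every format.
    assign imm_i = {{20{inst[31]}}, inst[31:20]};
    assign imm_s = {{20{inst[31]}}, inst[31:25], inst[11:7]};
    assign imm_b = {{19{inst[31]}}, inst[31], inst[7],
                    inst[30:25], inst[11:8], 1'b0};
    assign imm_u = {inst[31:12], 12'b0};
    assign imm_j = {{11{inst[31]}}, inst[31], inst[19:12],
                    inst[20], inst[30:21], 1'b0};
endmodule
```

Note that the B and J immediates always have bit 0 equal to zero, since branch and jump offsets are multiples of two bytes.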

## 3. Files

In Lab 3, you are given files that are in three folders:

- 1. *template* folder includes templates of the single-cycle CPU.
- 2. *testbench* folder includes testbench files you can test with.
- 3. *testcase* folder includes instruction streams, in either assembly code or binary format, that you can test with.

In the *template* folder, you are given four files:

- 1. *Mem\_Model.v* defines the memory model.
- 2. *REG\_FILE.v* defines the register file.
- 3. *RISCV\_CLKRST.v* defines the clock signal.
- 4. *RISCV\_TOP.v* includes templates you can start with.

You **SHOULD NOT** modify *Mem\_Model.v*, *REG\_FILE.v*, and *RISCV\_CLKRST.v*. You only need to implement the modules that make up the data path in *RISCV\_TOP.v*. You may create additional files in the *template* folder for the modules that make up the control path.

In the *testbench* folder, you are given three *testbench* files:

- 1. TB\_RISCV\_forloop.v
- 2. TB\_RISCV\_inst.v
- 3. TB\_RISCV\_sort.v

In the *testcase* folder, you are given five files:

- 1. asm/forloop.asm
- 2. asm/sort.asm
- 3. hex/forloop.hex
- 4. hex/inst.hex
- 5. hex/sort.hex

You can test your single-cycle CPU with the *testbench* files in the *testbench* folder. Before running a *testbench* file, you need to specify the instruction stream stored in the *testcase* folder. For example, if you want to test with *TB\_RISCV\_forloop.v*, you must change the file path in it to *testcase/hex/forloop.hex*. The corresponding human-readable assembly code is in *testcase/asm/forloop.asm*, which can be helpful for debugging.

The testbench (TB) file reads the hex file and loads the instructions into the instruction memory. Execution then starts from the first instruction in memory and follows the program's instruction flow. During execution, whenever the number of executed instructions reaches a pre-defined value, your output port is compared with the expected value.

## 4. Single-cycle CPU Lab

In Lab 3, you are required to implement a single-cycle CPU, which is the simplest form of CPU.

Your implementation **MUST** comply with the following rules:

- 1. The instruction memory and data memory are physically separated as two independent modules (check the testbench files).
- 2. The overall memory size is 4KB for instructions and 16KB for data.
- 3. The instruction memory follows byte addressing; however, the data memory follows word addressing (check the testbench files and the *RISCV\_TOP.v* file).
- 4. The byte order is little-endian.
- 5. The control path and data path should be in separate modules in the Verilog implementation.
- 6. As we have not yet covered the concept of "Virtual Memory" in this course, you can assume that the lower N bits of the instruction and data memory addresses are used as-is to access instruction memory and data memory.
  - a. For accessing instructions, use (Effective\_Address & 0xFFF) as the translation function.
  - b. For accessing data, use (Effective\_Address & 0x3FFF) as the translation function.
- 7. The initial value of the Program Counter (PC) is 0x000.
- 8. The initial value of the stack pointer is 0xF00 (this is the initial value of 'RF[2]' in *REG\_FILE.v*).
- 9. You need to implement only the instructions below.
  - a. LUI, AUIPC
  - b. JAL
  - c. JALR
  - d. BEQ, BNE, BLT, BGE, BLTU, BGEU
  - e. LB, LH, LW, LBU, LHU
  - f. SB, SH, SW
  - g. ADDI, SLTI, SLTIU, XORI, ORI, ANDI, SLLI, SRLI, SRAI
  - h. ADD, SUB, SLL, SLT, SLTU, XOR, SRL, SRA, OR, AND
  - i. MULT, MODULO, IS\_EVEN (ISA details for these instructions can be found in RISC V 101.pdf)
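The address translation and PC reset rules above can be sketched as follows. This is only an illustration under stated assumptions: the module name *pc\_and\_addr*, the port names, and the active-low synchronous reset *rst\_n* are all hypothetical, so check *RISCV\_TOP.v* and the testbench files for the actual port names, reset polarity, and the address format the memories expect.

```verilog
// Sketch of rules 6 and 7: PC reset value and address translation.
// All names and the reset style are assumptions, not the template's.
module pc_and_addr (
    input  wire        clk,
    input  wire        rst_n,      // assumed active-low synchronous reset
    input  wire [31:0] next_pc,    // next PC chosen by the data path
    input  wire [31:0] data_ea,    // effective data address (e.g., rs1 + imm)
    output reg  [31:0] pc,
    output wire [31:0] imem_addr,  // translated instruction address
    output wire [31:0] dmem_addr   // translated data address
);
    // Rule 7: the PC starts at 0x000 and is updated once per clock cycle.
    always @(posedge clk) begin
        if (!rst_n)
            pc <= 32'h0000_0000;
        else
            pc <= next_pc;
    end

    // Rule 6: keep only the lower bits of the effective address.
    assign imem_addr = pc      & 32'h0000_0FFF;  // 4KB instruction memory
    assign dmem_addr = data_ea & 32'h0000_3FFF;  // 16KB data memory
endmodule
```

The masks follow directly from the memory sizes: 4KB needs 12 address bits (0xFFF) and 16KB needs 14 address bits (0x3FFF).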

The templates of the memory and register file are already given to you. You are only required to implement the control and data paths of the single-cycle CPU. If your implementation is correct, you will see a "Success" message in the console log when you run the testbench file, as shown in Fig. 2. Before you start writing code, the TAs recommend that you carefully plan which modules are needed, how to connect them, and which control signals must be sent to the data units.

```
VSIM 2> run -all
# Finish: 73 cycle
# Success.
# ** Note: $finish : C:/Users/sh520/Desktop/lab/rv32i-verilog/3. single-cycle/TB_RISCV.v(166)
# Time: 845 ns Iteration: 1 Instance: /TB_RISCV
```

Figure 2. Success Message

### Terminal condition

Since we do not implement the instructions that transfer control to the operating system, a special flag instruction sequence is used to end the program. If the two instructions 0x00c00093 (addi x1, x0, 0xc) and 0x00008067 (jalr x0, x1, 0) are executed back to back, you should halt the program by setting the HALT output wire to 1.
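One possible way to detect this terminal condition is to remember whether the previously executed instruction was the first word of the sequence. The sketch below is a minimal version of that idea. The module name *halt\_detect*, the port names, and the active-low synchronous reset are assumptions, and whether HALT should be asserted combinationally (as here) or from a register depends on how the testbench samples it.

```verilog
// Halt detection sketch: HALT goes high when 0x00c00093 is immediately
// followed by 0x00008067. Names and reset style are hypothetical.
module halt_detect (
    input  wire        clk,
    input  wire        rst_n,  // assumed active-low synchronous reset
    input  wire [31:0] inst,   // instruction executed in the current cycle
    output wire        halt
);
    // 1 if the instruction executed in the previous cycle was
    // "addi x1, x0, 0xc" (0x00c00093).
    reg prev_is_flag;

    always @(posedge clk) begin
        if (!rst_n)
            prev_is_flag <= 1'b0;
        else
            prev_is_flag <= (inst == 32'h00c00093);
    end

    // Current instruction is "jalr x0, x1, 0" right after the flag addi.
    assign halt = prev_is_flag && (inst == 32'h00008067);
endmodule
```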

### Simulation

To test your single-cycle CPU, you need to implement two additional outputs, *NUM\_INST* and *OUTPUT\_PORT*.

- 1. *NUM\_INST*: the number of executed instructions.
- 2. *OUTPUT\_PORT*: the output result.
  - a. If the instruction has a destination register (*rd*), then the value that is supposed to be written to the destination register should also be written to *OUTPUT\_PORT*.
  - b. If the instruction is a branch instruction, 1 is written to *OUTPUT\_PORT* if the branch is taken; otherwise, 0 is written.
  - c. If the instruction is a store instruction, the target address of the store instruction is written to *OUTPUT\_PORT*.
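The rules above can be sketched as a small counter plus a multiplexer. Everything except the two output names is an assumption here: the module name *sim\_outputs*, the input signal names, the 32-bit width of *NUM\_INST*, the reset style, and the exact cycle in which the counter is expected to change all need to be checked against *RISCV\_TOP.v* and the testbench files.

```verilog
// Sketch of the two simulation outputs. Input names, widths, and reset
// style are hypothetical; only NUM_INST/OUTPUT_PORT come from the handout.
module sim_outputs (
    input  wire        clk,
    input  wire        rst_n,         // assumed active-low synchronous reset
    input  wire        is_branch,     // current instruction is a branch
    input  wire        branch_taken,  // branch condition evaluated true
    input  wire        is_store,      // current instruction is a store
    input  wire        writes_rd,     // current instruction writes rd
    input  wire [31:0] rd_wdata,      // value to be written to rd
    input  wire [31:0] store_addr,    // target address of the store
    output reg  [31:0] NUM_INST,
    output reg  [31:0] OUTPUT_PORT
);
    // Combinational selection of the value shown on OUTPUT_PORT.
    always @(*) begin
        if (is_branch)
            OUTPUT_PORT = {31'b0, branch_taken};  // 1 if taken, else 0
        else if (is_store)
            OUTPUT_PORT = store_addr;
        else if (writes_rd)
            OUTPUT_PORT = rd_wdata;
        else
            OUTPUT_PORT = 32'b0;
    end

    // One instruction completes per clock cycle in a single-cycle CPU.
    always @(posedge clk) begin
        if (!rst_n)
            NUM_INST <= 32'b0;
        else
            NUM_INST <= NUM_INST + 32'd1;
    end
endmodule
```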

## 5. Grading

The TAs will grade your lab assignments with the three *testbench* files in the *testbench* folder that are already given to you and one additional *testbench* file that is hidden from you. Your score is determined by how many tests you pass. However, the following guidelines can also affect your grade.

- **Do not** use blocking assignments (=) in sequential logic (e.g., always @(posedge clk)). Use non-blocking assignments (<=) for sequential logic and blocking assignments for combinational logic. If you do not abide by this rule, points will be deducted.
- **Do not** use delays (e.g., #50, #100). If you use a delay in your code (except in the given testbench code), you will automatically get zero points for that lab assignment.
- **Do not** use the "wait()" statement in your design. We will not give any points for a design that uses "wait()".
- Do not cheat.
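The assignment rule in the first guideline can be illustrated with a small example that is unrelated to the CPU itself. The module name *assign\_style* and its ports are hypothetical.

```verilog
// Blocking (=) in combinational logic, non-blocking (<=) in sequential
// logic. The module is a toy example with made-up names.
module assign_style (
    input  wire       clk,
    input  wire [7:0] a,
    input  wire [7:0] b,
    output reg  [7:0] sum_comb,  // combinational result
    output reg  [7:0] sum_reg    // registered copy of the result
);
    // Combinational logic: blocking assignment.
    always @(*) begin
        sum_comb = a + b;
    end

    // Sequential logic: non-blocking assignment, no delays, no wait().
    always @(posedge clk) begin
        sum_reg <= sum_comb;
    end
endmodule
```

Using non-blocking assignments in clocked blocks makes all registers sample their inputs before any of them changes, so the result does not depend on the order in which the simulator evaluates the blocks.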

## 6. Lab Report Guidance

You are required to submit a lab report for every lab assignment. You can write your report in either Korean or English. We don't want you to waste your time writing a lab report, so please keep it short. **Three pages** are enough unless you have more to show. You do not need to worry too much about the report.

Your lab report **MUST** include the following sections:

- 1. Introduction
  - a. *Introduction* includes what you think you are required to accomplish from the lab assignment and a brief description of your design and implementation.
- 2. Design (Try to assign most of your lab report pages explaining your design)
  - a. *Design* includes a high-level description of your design of the Verilog modules (e.g., the relationship between the modules).
  - b. Figures are very helpful for the TAs to understand your Verilog code.
  - c. The TAs recommend that you include figures because drawing them helps you *design* your modules.
- 3. Implementation
  - a. *Implementation* includes a detailed description of how you implemented what you designed.
  - b. Describing the overall structure and the meaningful information is enough; you do not need to explain in detail the minor issues that you solved.
  - c. Do not copy and paste your source code.
- 4. Evaluation
  - a. *Evaluation* includes how you evaluated your design and implementation, and the simulation results.
  - b. *Evaluation* must include how many of the tests in the *testbench* folder you pass.
- 5. Discussion
  - a. *Discussion* includes any problems that you experienced while working through the lab assignment or any feedback for the TAs.
  - b. Your feedback is very helpful for the TAs to further improve the EE312 course!
- 6. Conclusion
  - a. *Conclusion* includes any concluding remarks on your work or on what you accomplished through the lab assignment.

## 7. Requirements

You MUST comply with the following rules:

- You should implement the lab assignment in Verilog.
- You should only implement the **TODO** parts of the given template.
- You should name your lab report as Lab3\_YourName\_StudentID.pdf.
- You should compress the honor pledge(s), lab report, simulation results, and source code, then name the compressed zip file as Lab3\_YourName\_StudentID.zip, and submit the zip file on KLMS. All names of your submitted files, including your zipped file, should not contain any spaces in them. If you need to put a space in the name, substitute it with an underscore ("\_") instead. You will be penalized if you fail to comply with this rule.
- Make sure that the name of your submission zip file and the names of the files inside it contain only English characters.

```
      Lab3_GilDongHong_20220000.zip (O)
      Lab3_홍길동_20220000.zip (X)
      Lab3_GilDongHong_20220000_JohnDoe_20220001.zip (O)
      Lab3_홍길동_20220000_존도_20220001.zip (X)
```

## Reference

1. https://www.csl.cornell.edu/courses/ece4750/handouts/ece4750-tinyrv-isa.txt

## How To Start Guide

1. Load all files in the template folder, your CPU implementation, and a testbench that you want to test.

The *CTRL.v* and *ALU.v* files belong to one TA's implementation of the single-cycle CPU.

*CTRL.v* generates the control signals, while *RISCV\_TOP.v* and *ALU.v* make up the data path.



2. Change the hex file path in the testbench file to the location of the hex file on your machine.

```
.NUM_INST (NUM_INST),
.OUTPUT_PORT (OUTPUT_PORT)
);

//I-Memory

SP_SRAM #(
.ROMDATA ("C:\\Users\\jjeong\\Lab3\\release\\testcase\\inst.hex"), //Initialize I-Memory
.AWIDTH (10),
.SIZE (1024)
) i_meml (
```

3. Compile all the files.



4. Select all files relevant to the single-cycle CPU when you start the simulation.



5. Run the simulation.

6. Check the simulation result.

```
# Finish: 24 cycle
# Success.
# ** Note: $finish : C:/Users/jjeong/Lab3/release/testbench/TB_RISCV_inst.v(175)
# Time: 355 ns Iteration: 1 Instance: /TB_RISCV
```