# **RISC-V Design**

# A non-pipelined RISC architecture



# **Opcodes**

- Defines the "opcode" for various classes of operations
- For 32-bit instruction words, the 2 LSBs are always 11

| inst[4:2]<br>inst[6:5] | 000    | 001      | 010      | 011      | 100    | 101      | 110            | 111    |
|------------------------|--------|----------|----------|----------|--------|----------|----------------|--------|
| 00                     | LOAD   | LOAD-FP  | custom-0 | MISC-MEM | OP-IMM | AUIPC    | OP-IMM-32      | 48b    |
| 01                     | STORE  | STORE-FP | custom-1 | AMO      | OP     | LUI      | OP-32          | 64b    |
| 10                     | MADD   | MSUB     | NMSUB    | NMADD    | OP-FP  | reserved | custom-2/rv128 | 48b    |
| 11                     | BRANCH | JALR     | reserved | JAL      | SYSTEM | reserved | custom-3/rv128 | >= 80b |
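The opcode map above can be read as a two-level lookup: inst[6:5] selects the row and inst[4:2] selects the column. The following is a minimal sketch of that lookup, not the decoder of the design described here; the names `OPCODE_MAP` and `opcode_class` are made up for illustration.

```python
# Opcode map for 32-bit instruction words.
# Rows are indexed by inst[6:5], columns by inst[4:2] (same layout as the table).
OPCODE_MAP = [
    ["LOAD", "LOAD-FP", "custom-0", "MISC-MEM", "OP-IMM", "AUIPC", "OP-IMM-32", "48b"],
    ["STORE", "STORE-FP", "custom-1", "AMO", "OP", "LUI", "OP-32", "64b"],
    ["MADD", "MSUB", "NMSUB", "NMADD", "OP-FP", "reserved", "custom-2/rv128", "48b"],
    ["BRANCH", "JALR", "reserved", "JAL", "SYSTEM", "reserved", "custom-3/rv128", ">= 80b"],
]


def opcode_class(inst: int) -> str:
    """Return the opcode class of an instruction word (hypothetical helper)."""
    if inst & 0b11 != 0b11:
        # The two LSBs of every 32-bit instruction word are 11.
        raise ValueError("inst[1:0] must be 11 for a 32-bit instruction word")
    row = (inst >> 5) & 0b11   # inst[6:5]
    col = (inst >> 2) & 0b111  # inst[4:2]
    return OPCODE_MAP[row][col]
```

For instance, the word `0x00052703` has inst[6:2] = 00000 and therefore falls in the LOAD class.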

| Instruction | Type   | inst[31:25]           | inst[24:20] | inst[19:15] | inst[14:12] | inst[11:7]  | inst[6:0] |
|-------------|--------|-----------------------|-------------|-------------|-------------|-------------|-----------|
| LUI         | U-type | imm[31:12]            | (cont.)     | (cont.)     | (cont.)     | rd          | 0110111   |
| AUIPC       | U-type | imm[31:12]            | (cont.)     | (cont.)     | (cont.)     | rd          | 0010111   |
| JAL         | J-type | imm[20 10:1 11 19:12] | (cont.)     | (cont.)     | (cont.)     | rd          | 1101111   |
| JALR        | I-type | imm[11:0]             | (cont.)     | rs1         | 000         | rd          | 1100111   |
| BEQ         | B-type | imm[12 10:5]          | rs2         | rs1         | 000         | imm[4:1 11] | 1100011   |
| BNE         | B-type | imm[12 10:5]          | rs2         | rs1         | 001         | imm[4:1 11] | 1100011   |
| BLT         | B-type | imm[12 10:5]          | rs2         | rs1         | 100         | imm[4:1 11] | 1100011   |
| BGE         | B-type | imm[12 10:5]          | rs2         | rs1         | 101         | imm[4:1 11] | 1100011   |
| BLTU        | B-type | imm[12 10:5]          | rs2         | rs1         | 110         | imm[4:1 11] | 1100011   |
| BGEU        | B-type | imm[12 10:5]          | rs2         | rs1         | 111         | imm[4:1 11] | 1100011   |
| LB          | I-type | imm[11:0]             | (cont.)     | rs1         | 000         | rd          | 0000011   |
| LH          | I-type | imm[11:0]             | (cont.)     | rs1         | 001         | rd          | 0000011   |
| LW          | I-type | imm[11:0]             | (cont.)     | rs1         | 010         | rd          | 0000011   |
| LBU         | I-type | imm[11:0]             | (cont.)     | rs1         | 100         | rd          | 0000011   |
| LHU         | I-type | imm[11:0]             | (cont.)     | rs1         | 101         | rd          | 0000011   |
| SB          | S-type | imm[11:5]             | rs2         | rs1         | 000         | imm[4:0]    | 0100011   |
| SH          | S-type | imm[11:5]             | rs2         | rs1         | 001         | imm[4:0]    | 0100011   |
| SW          | S-type | imm[11:5]             | rs2         | rs1         | 010         | imm[4:0]    | 0100011   |
| ADDI        | I-type | imm[11:0]             | (cont.)     | rs1         | 000         | rd          | 0010011   |
| SLTI        | I-type | imm[11:0]             | (cont.)     | rs1         | 010         | rd          | 0010011   |
| SLTIU       | I-type | imm[11:0]             | (cont.)     | rs1         | 011         | rd          | 0010011   |
| XORI        | I-type | imm[11:0]             | (cont.)     | rs1         | 100         | rd          | 0010011   |
| ORI         | I-type | imm[11:0]             | (cont.)     | rs1         | 110         | rd          | 0010011   |
| ANDI        | I-type | imm[11:0]             | (cont.)     | rs1         | 111         | rd          | 0010011   |
| SLLI        | I-type | 0000000               | shamt       | rs1         | 001         | rd          | 0010011   |
| SRLI        | I-type | 0000000               | shamt       | rs1         | 101         | rd          | 0010011   |
| SRAI        | I-type | 0100000               | shamt       | rs1         | 101         | rd          | 0010011   |
| ADD         | R-type | 0000000               | rs2         | rs1         | 000         | rd          | 0110011   |
| SUB         | R-type | 0100000               | rs2         | rs1         | 000         | rd          | 0110011   |
| SLL         | R-type | 0000000               | rs2         | rs1         | 001         | rd          | 0110011   |
| SLT         | R-type | 0000000               | rs2         | rs1         | 010         | rd          | 0110011   |
| SLTU        | R-type | 0000000               | rs2         | rs1         | 011         | rd          | 0110011   |
| XOR         | R-type | 0000000               | rs2         | rs1         | 100         | rd          | 0110011   |
| SRL         | R-type | 0000000               | rs2         | rs1         | 101         | rd          | 0110011   |
| SRA         | R-type | 0100000               | rs2         | rs1         | 101         | rd          | 0110011   |
| OR          | R-type | 0000000               | rs2         | rs1         | 110         | rd          | 0110011   |
| AND         | R-type | 0000000               | rs2         | rs1         | 111         | rd          | 0110011   |

| Type   | inst[31:25]           | inst[24:20] | inst[19:15] | inst[14:12] | inst[11:7]  | inst[6:0] |
|--------|-----------------------|-------------|-------------|-------------|-------------|-----------|
| R-type | funct7                | rs2         | rs1         | funct3      | rd          | opcode    |
| I-type | imm[11:0]             | (cont.)     | rs1         | funct3      | rd          | opcode    |
| S-type | imm[11:5]             | rs2         | rs1         | funct3      | imm[4:0]    | opcode    |
| B-type | imm[12 10:5]          | rs2         | rs1         | funct3      | imm[4:1 11] | opcode    |
| U-type | imm[31:12]            | (cont.)     | (cont.)     | (cont.)     | rd          | opcode    |
| J-type | imm[20 10:1 11 19:12] | (cont.)     | (cont.)     | (cont.)     | rd          | opcode    |
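Because opcode, rd, funct3, rs1, rs2 and funct7 sit at the same bit positions in every format that has them, a decoder can slice all six fields unconditionally and decide afterwards which ones are meaningful. A minimal sketch of that slicing follows; `Fields` and `split_fields` are illustrative names, not part of the design.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    """Fixed-position fields of a 32-bit instruction word."""
    funct7: int  # inst[31:25]
    rs2: int     # inst[24:20]
    rs1: int     # inst[19:15]
    funct3: int  # inst[14:12]
    rd: int      # inst[11:7]
    opcode: int  # inst[6:0]


def split_fields(inst: int) -> Fields:
    """Slice an instruction word at the R-type field boundaries."""
    return Fields(
        funct7=(inst >> 25) & 0x7F,
        rs2=(inst >> 20) & 0x1F,
        rs1=(inst >> 15) & 0x1F,
        funct3=(inst >> 12) & 0x7,
        rd=(inst >> 7) & 0x1F,
        opcode=inst & 0x7F,
    )
```

For formats without a given field (for example rs2 in an I-type word), the corresponding slice simply holds immediate bits and is ignored.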

| Instruction | Type   | inst[31:25]           | inst[24:20] | inst[19:15] | inst[14:12] | inst[11:7]  | inst[6:0] |
|-------------|--------|-----------------------|-------------|-------------|-------------|-------------|-----------|
| JAL         | J-type | imm[20 10:1 11 19:12] | (cont.)     | (cont.)     | (cont.)     | rd          | 1101111   |
| BEQ         | B-type | imm[12 10:5]          | rs2         | rs1         | 000         | imm[4:1 11] | 1100011   |
| BNE         | B-type | imm[12 10:5]          | rs2         | rs1         | 001         | imm[4:1 11] | 1100011   |
| BLT         | B-type | imm[12 10:5]          | rs2         | rs1         | 100         | imm[4:1 11] | 1100011   |
| BGE         | B-type | imm[12 10:5]          | rs2         | rs1         | 101         | imm[4:1 11] | 1100011   |
| LW          | I-type | imm[11:0]             | (cont.)     | rs1         | 010         | rd          | 0000011   |
| SW          | S-type | imm[11:5]             | rs2         | rs1         | 010         | imm[4:0]    | 0100011   |
| ADDI        | I-type | imm[11:0]             | (cont.)     | rs1         | 000         | rd          | 0010011   |
| XORI        | I-type | imm[11:0]             | (cont.)     | rs1         | 100         | rd          | 0010011   |
| ORI         | I-type | imm[11:0]             | (cont.)     | rs1         | 110         | rd          | 0010011   |
| ANDI        | I-type | imm[11:0]             | (cont.)     | rs1         | 111         | rd          | 0010011   |
| ADD         | R-type | 0000000               | rs2         | rs1         | 000         | rd          | 0110011   |
| SUB         | R-type | 0100000               | rs2         | rs1         | 000         | rd          | 0110011   |
| XOR         | R-type | 0000000               | rs2         | rs1         | 100         | rd          | 0110011   |
| OR          | R-type | 0000000               | rs2         | rs1         | 110         | rd          | 0110011   |
| AND         | R-type | 0000000               | rs2         | rs1         | 111         | rd          | 0110011   |



#### **RISC-V** online resources

#### Online RISC-V assembler

- https://riscvasm.lucasteske.dev/#
- Make sure you select only RV32I instructions!

#### Online RISC-V C compiler

- https://godbolt.org/
- Select the C language
- Select a RISC-V compiler (for example, gcc) for the 32-bit architecture
- Unfortunately we can't constrain the compiler to generate code for the base ISA only (it might include instructions that you have not implemented!)

# **RISC-V** online compiler



# **RISC-V** online assembler

- Copy the assembly code into the input box of the Online RISC-V Assembler (riscvasm.lucasteske.dev)
- Hit "BUILD"
- Get the hex dump, to be copied into the instruction memory initialization
- Also get the hex-to-code correspondence

## **Example**

```
void iaxpy( int data[ ] ) {
    for ( int i = 0; i < 1000; i++ ) {
        data[ i ] = data[ i ] + i;
    }
}
```

| Program             | Hex        | inst[31:25]                            | inst[24:20]       | inst[19:15]        | inst[14:12]  | inst[11:7]               | inst[6:0]          |
|---------------------|------------|----------------------------------------|-------------------|--------------------|--------------|--------------------------|--------------------|
| `li a5,0`           | 0x00000793 | `0000000` (imm = 0)                    | `00000` (cont.)   | `00000` (rs1 = 0)  | `000` (ADDI) | `01111` (rd = 15)        | `0010011` (OP-IMM) |
| `li a3,1000`        | 0x3e800693 | `0011111` (imm = 1000)                 | `01000` (cont.)   | `00000` (rs1 = 0)  | `000` (ADDI) | `01101` (rd = 13)        | `0010011` (OP-IMM) |
| `loop: lw a4,0(a0)` | 0x00052703 | `0000000` (imm = 0)                    | `00000` (cont.)   | `01010` (rs1 = 10) | `010` (LW)   | `01110` (rd = 14)        | `0000011` (LOAD)   |
| `add a4,a4,a5`      | 0x00f70733 | `0000000` (funct7 = 0)                 | `01111` (rs2 = 15) | `01110` (rs1 = 14) | `000` (ADD)  | `01110` (rd = 14)        | `0110011` (OP)     |
| `sw a4,0(a0)`       | 0x00e52023 | `0000000` (imm[11:5] = 0)              | `01110` (rs2 = 14) | `01010` (rs1 = 10) | `010` (SW)   | `00000` (imm[4:0] = 0)   | `0100011` (STORE)  |
| `addi a5,a5,1`      | 0x00178793 | `0000000` (imm = 1)                    | `00001` (cont.)   | `01111` (rs1 = 15) | `000` (ADDI) | `01111` (rd = 15)        | `0010011` (OP-IMM) |
| `addi a0,a0,4`      | 0x00450513 | `0000000` (imm = 4)                    | `00100` (cont.)   | `01010` (rs1 = 10) | `000` (ADDI) | `01010` (rd = 10)        | `0010011` (OP-IMM) |
| `bne a5,a3,loop`    | 0xfed796e3 | `1111111` (imm[12 10:5], offset = -20) | `01101` (rs2 = 13) | `01111` (rs1 = 15) | `001` (BNE)  | `01101` (imm[4:1 11])    | `1100011` (BRANCH) |
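The hex words of the example program can be cross-checked by packing the fields of each format by hand. The sketch below does this for the four formats the loop uses; the `encode_*` helpers and the `PROGRAM` list are illustrative names, and `li rd,imm` is expanded to `addi rd,x0,imm` as the assembler does for small constants.

```python
OP_IMM, LOAD, STORE, OP, BRANCH = 0b0010011, 0b0000011, 0b0100011, 0b0110011, 0b1100011


def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack an R-type word: funct7 | rs2 | rs1 | funct3 | rd | opcode."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_i(imm, rs1, funct3, rd, opcode):
    """Pack an I-type word: imm[11:0] | rs1 | funct3 | rd | opcode."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_s(imm, rs2, rs1, funct3, opcode):
    """Pack an S-type word: imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode."""
    imm &= 0xFFF
    return ((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | ((imm & 0x1F) << 7) | opcode


def encode_b(offset, rs2, rs1, funct3, opcode):
    """Pack a B-type word: imm[12|10:5] | rs2 | rs1 | funct3 | imm[4:1|11] | opcode."""
    imm = offset & 0x1FFF  # 13-bit two's complement byte offset, bit 0 is always 0
    return ((((imm >> 12) & 1) << 31) | (((imm >> 5) & 0x3F) << 25) | (rs2 << 20)
            | (rs1 << 15) | (funct3 << 12) | (((imm >> 1) & 0xF) << 8)
            | (((imm >> 11) & 1) << 7) | opcode)


# The loop, with register numbers a0 = x10, a3 = x13, a4 = x14, a5 = x15.
PROGRAM = [
    encode_i(0, 0, 0b000, 15, OP_IMM),      # li a5,0
    encode_i(1000, 0, 0b000, 13, OP_IMM),   # li a3,1000
    encode_i(0, 10, 0b010, 14, LOAD),       # loop: lw a4,0(a0)
    encode_r(0, 15, 14, 0b000, 14, OP),     # add a4,a4,a5
    encode_s(0, 14, 10, 0b010, STORE),      # sw a4,0(a0)
    encode_i(1, 15, 0b000, 15, OP_IMM),     # addi a5,a5,1
    encode_i(4, 10, 0b000, 10, OP_IMM),     # addi a0,a0,4
    encode_b(-20, 13, 15, 0b001, BRANCH),   # bne a5,a3,loop (5 instructions back)
]
```

The branch offset is -20 because `bne` sits 5 instructions (20 bytes) after the `loop` label.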

#### Notes:

- Registers a0, a1, a2, etc. correspond to x10, x11, x12, etc. (it is a standard naming convention)
- **a0** is the pointer to the array in memory (presumably zero initially); it advances by 4 bytes at each iteration
- a5 contains the current loop index (variable i in the C code)
- a3 holds the loop end value (1000)
- **a4** is a temporary: it holds the value loaded from memory (data[i]), has **a5** (the index) added to it, and is then stored back to memory (data[i])
- The branch (bne) jumps back to loop as long as the index (a5) has not reached the end value (a3)

# **Example architecture**

#### **Instruction Fetch**

#### Instruction Memory

- Defined as 1024 words of 32 bits each, total of 4 kB
  - Address is 10 bits
- Implemented in a BRAM, which has an output register
  - The instruction code is available only in the clock cycle that follows a new value of the program counter
- Could potentially be placed outside, to be more general

#### Program Counter (PC)

- Must address 4 kB, hence it is 12 bits if addressing bytes
  - The two least significant bits are always 0 and are dropped when connecting to the instruction memory
  - PC arithmetic in the ALU done at byte level
- The stage computes the address of the next instruction by adding 4 to the PC
- Load enable (pc\_load) activated by control state machine
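The relation between the byte-level PC and the word-level instruction memory address can be sketched as follows. This is a behavioural model only; the constant and function names are invented here, and the wrap-around at 4 kB is an assumption that follows from modelling the PC as a 12-bit value.

```python
PC_BITS = 12         # byte address covering 4 kB
IMEM_ADDR_BITS = 10  # 1024 words of 32 bits


def next_pc(pc: int) -> int:
    """Address of the following instruction: PC + 4, kept to 12 bits (assumed wrap)."""
    return (pc + 4) & ((1 << PC_BITS) - 1)


def imem_address(pc: int) -> int:
    """Word address for the instruction memory: drop the two LSBs of the PC."""
    return (pc >> 2) & ((1 << IMEM_ADDR_BITS) - 1)
```

With this mapping the `bne` of the example program, at byte address 28, is read from instruction memory word 7.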

### **Instruction fetch**



#### **Instruction Decode**

#### Register file

- Defined as 32 words of 32 bits each, two ports, distributed memory
  - First port used to read and write, at different times
  - Address (source or destination) chosen depending on instruction execution phase
  - write\_en generated by control state machine
  - Output is latched, since a register would be needed at that point anyway

#### Decoding logic and sign extension

- Checks instruction class and defines operand selection signals
- Defines operation signal for ALU and comparator
- Selects immediate fields and reconstructs the sign extended immediate according to the instruction class
- Outputs are latched
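Reconstructing the sign-extended immediate amounts to gathering the scattered immediate bits of each format and extending from the top bit. The following sketch models this in software under the standard RV32I field layout; `sign_extend` and `immediate` are illustrative names, and the format letter is assumed to come from the instruction class check.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def immediate(inst: int, fmt: str) -> int:
    """Return the sign-extended immediate of a 32-bit word for format I, S, B, U or J."""
    if fmt == "I":   # imm[11:0] = inst[31:20]
        return sign_extend(inst >> 20, 12)
    if fmt == "S":   # imm[11:5] = inst[31:25], imm[4:0] = inst[11:7]
        return sign_extend(((inst >> 25) << 5) | ((inst >> 7) & 0x1F), 12)
    if fmt == "B":   # imm[12|10:5] = inst[31:25], imm[4:1|11] = inst[11:7]
        imm = ((((inst >> 31) & 1) << 12) | (((inst >> 7) & 1) << 11)
               | (((inst >> 25) & 0x3F) << 5) | (((inst >> 8) & 0xF) << 1))
        return sign_extend(imm, 13)
    if fmt == "U":   # imm[31:12] = inst[31:12], low 12 bits are zero
        return sign_extend(inst & 0xFFFFF000, 32)
    if fmt == "J":   # imm[20|10:1|11|19:12] = inst[31:12]
        imm = ((((inst >> 31) & 1) << 20) | (((inst >> 12) & 0xFF) << 12)
               | (((inst >> 20) & 1) << 11) | (((inst >> 21) & 0x3FF) << 1))
        return sign_extend(imm, 21)
    raise ValueError(f"unknown format: {fmt}")
```

Applied to the `bne` word of the example program (`0xfed796e3`), the B-type branch yields an offset of -20 bytes.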

#### Instruction decode



#### **Instruction Execute**

#### ALU

- Implements a subset of arithmetic and logic operations
- Muxes at inputs select the operands
- Opcode generally follows the RISC-V standard convention
- Both latched and unlatched outputs available

#### Comparator

- Used for branch operations, compares register values according to opcode
- Output determines if branch is to be taken or not
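A behavioural sketch of the execute stage for the ADD, SUB, XOR, OR, AND and BEQ, BNE, BLT, BGE instructions is given below. It assumes that the operation select reuses the RISC-V funct3 codes and that a separate `subtract` flag (standing in for bit 30 of the instruction) distinguishes SUB from ADD; the actual signal names and encoding of the design are not given in the text.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def alu(funct3: int, a: int, b: int, subtract: bool = False) -> int:
    """32-bit ALU for the add/sub and bitwise operations (assumed funct3 encoding)."""
    if funct3 == 0b000:
        result = a - b if subtract else a + b
    elif funct3 == 0b100:
        result = a ^ b
    elif funct3 == 0b110:
        result = a | b
    elif funct3 == 0b111:
        result = a & b
    else:
        raise ValueError(f"unsupported ALU operation: {funct3:03b}")
    return result & MASK32


def branch_taken(funct3: int, rs1_val: int, rs2_val: int) -> bool:
    """Comparator for BEQ, BNE, BLT and BGE (signed comparisons)."""
    a, b = to_signed(rs1_val), to_signed(rs2_val)
    if funct3 == 0b000:
        return a == b
    if funct3 == 0b001:
        return a != b
    if funct3 == 0b100:
        return a < b
    if funct3 == 0b101:
        return a >= b
    raise ValueError(f"unsupported branch condition: {funct3:03b}")
```

Keeping the comparator separate from the ALU leaves the ALU free to compute the branch target while the comparator evaluates the condition.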

#### Instruction execute



### Memory access and write back

#### Data Memory

- Implemented as 4096 words of 32 bits
  - Address is 12 bits, take only the required bits from effective address (coming from the ALU)
  - write\_en generated by control state machine
- Memory introduces one clock cycle delay
  - Take the ALU output from previous clock cycle, to get memory output during the memory phase
  - Write value (rs2) already available since end of decode phase

#### Destination register value selection

- Could be the next\_pc (for JAL), the ALU result (for ALU operations), or the memory output (for loads)

#### Value of program counter

- Next instruction at next\_pc (for ALU operations, loads, stores, and not taken branches)
- Next instruction at alu\_result (for Jumps and taken branches)
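The two selections described above behave like multiplexers driven by the instruction class. A minimal sketch follows, assuming the class is available as one of the opcode map names (JAL, OP, OP-IMM, LOAD, STORE, BRANCH); the function names and the string encoding of the class are illustrative only.

```python
from typing import Optional


def writeback_value(op_class: str, next_pc: int, alu_result: int, mem_out: int) -> Optional[int]:
    """Value written to the destination register, or None if nothing is written."""
    if op_class == "JAL":
        return next_pc        # return address
    if op_class in ("OP", "OP-IMM"):
        return alu_result     # ALU operations
    if op_class == "LOAD":
        return mem_out        # loads
    return None               # stores and branches have no destination register


def select_pc(op_class: str, next_pc: int, alu_result: int, branch_taken: bool) -> int:
    """New program counter: alu_result for jumps and taken branches, else next_pc."""
    if op_class == "JAL" or (op_class == "BRANCH" and branch_taken):
        return alu_result
    return next_pc
```

For the `bne` at byte address 28 in the example program, a taken branch selects `alu_result` = 8 (the `loop` label) and a not taken branch selects `next_pc` = 32.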

## Memory and write back

Select next pc according to instruction class and branch condition



#### **Control state machine**

### Schedules certain write operations

- Updating the program counter
- Writing into the register file
- Writing into memory
- Uses op\_class to decide
  - Plus keeps track of the phase of execution
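The scheduling role of the control state machine can be illustrated with a small model. The phase names and the phase in which each enable is asserted are assumptions made for this sketch, since the text does not list them; only the three kinds of write operations and the use of the instruction class come from the description above.

```python
# Hypothetical phase sequence for a non-pipelined core (an assumption of this sketch).
PHASES = ("fetch", "decode", "execute", "memory", "writeback")


def next_phase(phase: str) -> str:
    """Advance to the following phase, wrapping back to fetch."""
    return PHASES[(PHASES.index(phase) + 1) % len(PHASES)]


def control(op_class: str, phase: str) -> dict:
    """Write enables for a given instruction class and execution phase."""
    last = phase == "writeback"
    return {
        # The program counter is updated once per instruction (assumed: last phase).
        "pc_load": last,
        # Only instructions with a destination register write the register file.
        "reg_write_en": last and op_class in ("JAL", "OP", "OP-IMM", "LOAD"),
        # Only stores write the data memory (assumed: during the memory phase).
        "mem_write_en": phase == "memory" and op_class == "STORE",
    }
```

In this model every enable is low in all other phases, so each instruction updates the PC, the register file, and the data memory at most once.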

### **Phases**

