

# Computer Organization



Lab10 CPU(2) Data-Memory, IFetch, ALU



# 2 Topics

- > CPU(2) -DataPath (2)
  - Data-Memory
  - > IFetch
  - > ALU



# **Data-Memory**



module **dmemory32**(readData,address, writedata,memWrite,clock);

input **clock**; // Clock signal

/\* used to determine to write the memory unit or not, in the left screenshot its name is 'WE' \*/

input **memWrite**;

// the address of memory unit which is tobe read/writen input[31:0] address;

// data tobe wirten to the memory unit input[31:0] writeData;

/\*data to be read from the memory unit, in the left screenshot its name is 'MemData' \*/

output[31:0] readData;

# **Using IP core Block Memory**

**Using** the **IP core Block Memory** of Xilinx to implement the core of the Data-memory.



**Import** the **IP core** in vivado project

1) in "PROJECT MANAGER" window click "IP Catalog"



- 2) in "IP Catalog" window
  - > Vivado Repository
    - > Memories & Storage Elements
      - > RAMs & ROMs & BRAM
        - > Block Memory Generator

# **Customize Memory IP core**



#### **Customize memory IP core**

#### 1) Component Name: RAM

TIPS: 'RAM' here is just an example for the name, not a requirement.

- 2) Basic settings:
  - ➤ Interface Type: Native
  - ➤ Memory Type: Single-port RAM
  - ➤ ECC options: no ECC check
  - ➤ Algorithm options: Minimum area
  - ➤ Write Enable: **NOT SET**

# Customize Memory IP core continued



- 3) PortA Options settings:
  - Data read and write bit width: 32 bits
    (4Byte)
  - Write/Read Depth: 16384 (64KB)
  - ➤ Operating Mode: Write First
  - > Enable Port Type: Always Enabled
  - ➤ PortA Optional Output Registers: **NOT SET**



Q1: How to determine the Write/Read Width?

Q2: How to determine the Write/Read Depth?

## Customize Memory IP core continued

- 4) Other Options settings:
  - > 1. When specifying the initialization file for customize the RAM on the 1st time, the IP core RAM just customized WITHOUT initial file and corresponding path, so set it to no initial file when creating RAM.
  - > 2. After the RAM IP core created
    - > 2-1. COPY the initialization file dmem32.coe to projectName.srcs/sources\_1/ip/ComponentName. ("projectName.srcs" is under the project folder, "componentName" here is 'RAM')
    - > 2-2. Double-click the newly created RAM IP core, **RESET** it with the **initialization file**, select the dmem32.coe file that has been in the directory of projectName.srcs/sources 1/ip/RAM.





Tips: "dmem32.coe" file could be found in the directory "lab\_tools" of course blackboard site https://bb.sustech.edu.cn/webapps/blackboard/content/listContentEditable.jsp?content\_id=\_281672\_1&course\_id=\_3609\_1

# Design Module With Memory IP Instanced

```
// Part of dmemory32 module
//Create a instance of RAM(IP core), binding the ports
RAM ram (
                                      BRAM PORTA
    .clka(clk),
                                                       // input wire clka
                                    addra[13:0]
    .wea(memWrite),
                                                       // input wire [0 : 0] wea
                                    clka
    .addra(address[15:2]),
                                                       // input wire [13:0] addra
                                    dina[31:0]
    .dina(writeData),
                                                       // input wire [31:0] dina
                                    douta[31:0]
    .douta(readData)
                                                       // output wire [31:0] douta
                                    wea[0:0]
);
/*The clock is from CPU-TOP, suppose its one edge has been used at the upstream module of data memory,
such as IFetch, Why Data-Memroy DO NOT use the same edge as other module? */
assign clk = !clock;
```

**Q**: In the five stages of instruction processing, what operations must be arranged on the edge of the clock? What's your design for a one-cycle CPU?

#### **Function Verification**

```
//The testbench module for dmemory32
module ramTb( );
reg clock = 1'b0;
reg memWrite = 1'b0;
reg [31:0] addr = 32'h0000 0010;
reg [31:0] writeData = 32'ha000 0000;
wire [31:0] readData;
dmemory32 uram
            (clock,memWrite,addr,writeData,readData);
always #50 clock = ~clock;
initial fork
  #120 memWrite = 1'b1;
  #200
     writeData = 32'h0000 00f5;
  #400
    memWrite = \frac{1'b0}{1}
  // ... to be completed
join
endmodule
```

#### TIPS:

#### Using bind port with name is Suggested!!

- 1) Set "memWrite" to 1'b0 means to read the data from the RAM unit identified by "addr".
- 2) Set "memWrite" to 1'b1 and "writeData" to 0x0000\_00f5 which means to write data 0xa000\_00f5 to the RAM unit identified by "addr".
- **Q1.** While instance the module on page 3(module dmemory32(readData,address,writedata,memWrit e,clock)) and using sequential binding as the testbench on the left hand, What will hanppen?
- **Q2.** While the data has been written to the RAM unit, would it be recorded to the initial data file(dmem32.coe)?

#### Function Verification continued



Q1: On which edge of clock does the read and write operations occur? posedge or negedge?

Q2: What's value will be get while read the memory according to the "addr" 0x0000\_0020?

how about "addr" 0x0000 0016?

Tips: "dmem32.coe" file could be found in the directory "lab tools" of course blackboard sites https://bb.sustech.edu.cn/webapps/blackboard/content/listContentEditable.jsp?content\_id= 281672 1&course\_id= 3609 1

#### Practice1

- 1. Build the data memory module.
- 2. Verify its function by simulation(TIPS: The testbench on page 9 is JUST a reference)
  - > Read the word one by one from memory which are specified in the red box of the screenshot on the right hand.
  - ➤ Write a word(value is 0x000f\_fff0) to the memory unit where is specified in the green box of the screeshot on the right hand, then read it out to check if the data has been written or not.
- 3. List all the signals which are needed for data-memory module

| -  |                                              |
|----|----------------------------------------------|
| 1  | <pre>memory_initialization_radix = 16;</pre> |
| 2  | <pre>memory_initialization_vector =</pre>    |
| 3  | 007f2812,                                    |
| 4  | 007f2811,                                    |
| 5  | 007f2810, read these initial value           |
| 6  | 007f2810, read triese illitial value         |
| 7  | 007f280f, /                                  |
| 8  | 00000001,                                    |
| 9  | 00000002,                                    |
| 10 | oooooooo, write the word                     |
| 11 | 00000005.                                    |
| 12 | ooooooo, with value                          |
| 13 | 00000007, <b>0x000f fff0</b>                 |
| 14 | 0000ffff, OXOOOI_IIIO                        |
| 15 | 00000000,                                    |
| 16 | 00000000,                                    |
| 17 | 00000000,                                    |
| 18 | 00000000,                                    |
| 19 | 00000000,                                    |
| 20 | 00000000,                                    |

| name        | from        | to          | bits | function                                                                    |
|-------------|-------------|-------------|------|-----------------------------------------------------------------------------|
| clock       | CPU-TOP     | Data Memory | 1    | data memory write is sensitive with its ? edge                              |
| dmemRdata   | Data Memory | Decoder     | 32   | the word read from the data memory and send to Decoder                      |
| memoryWrite | Controller  | Data Memory | 1    | 1'b1 means to write the memory unit, else means not to write                |
| dmemAddress | ALU         | Data Memory | 32   | the address which is used to identify the memory unit to be read or written |
|             |             |             |      |                                                                             |

Tips: "dmem32.coe" file could be found in the directory "lab\_tools" of course blackboard site https://bb.sustech.edu.cn/webapps/blackboard/content/listContentEditable.jsp?content\_id=\_281672\_1&course\_id=\_3609\_1

## Instruction Fetch



#### Instruction Fetch

- 1. Store the instructions(machine-code)
- 2. **Update** the value of the PC register
  - Reset
  - PC+4
  - Update the value of the PC register according to the jump instructions
    - branch(beq,bne) [I-type]
    - jal, j [J-type]
    - jr [R-type]
- 3. **Fetch** the instructions according to the value of the PC register

# **Using IP core As Instruction Memory**

- Step1: Find the IP core(Block Memory Generator) in IP Catalog
- Step2: Customize the IP core
  - set name(component name), type(ROM)
  - set features of the ROM(width and depth), operation
     mode and register output
- Step3: Generate the IP core, then it will be added to vivado project automatically
- Step4: Re-customize the generated IP core to set it's initial file.

Tips: The setting steps of ROM IP core are same with which of the RAM IP core in Data-memory



#### Customize the IP core



#### TIPS:

- 1. The component name "prgrom" is just an example, not a requirement.
- 2. set the init file of prgrom after this IP core has been generated and added into vivado project. Same steps as the RAM IP core used in data memory.

#### Instance the IP core



```
prgrom instmem(
    .clka(clock),
    .addra(PC[15:2]),
    .douta(Instruction)
):
```

#### TIPS:

"prgrom" is the IP core which is generated in vivado follow the steps on page 13, 14 of this slides.

In One Cycle CPU, the process of **geting instrcution** should **happen** on the **posedge** of the clock. At this moment, IFetch module gets the instruction which is store at "**addra**" from the instruction memory "Instmem"

Q: Why using PC[15:2] instead of PC[13:0] to bind with port "addra"?

TIPS: The same reason as the address bus used in data memory

# The Function Verification of "prgrom"



```
reg clock=1'b0;
wire [31:0] Instruction;
prgrom instmem(.clka(clock),.addra(PC[15:2]),.douta(Instruction));
always #5 clock = ~clock;
initial begin
clock = 1'b0;
#2 PC = 32'h0000_0000;
repeat(5) begin
#10 PC = PC+4;
#10 $finish;
end

PC[15:2]),.douta(Instruction)

Q1 : How many instruction right hand)?

Q2. If the testbench(on the change it to do the function between them.
```

**reg**[31:0] PC;

end

endmodule

340c000c,

- Q1 : How many instructions would be fetched in this testbench(on the right hand)?
- Q2. If the testbench(on the right hand) could not meet your requirements, change it to do the function verification.
- Q3. Compare the behavior of the ROM with the RAM, list at 2 differences between them.

### **IFetch Module**

```
module IFetc32(Instruction, branch_base_addr, link_addr,
clock, reset.
Addr_result, Read_data_1, Branch, nBranch, Jmp, Jal, Jr, Zero);
 output[31:0] Instruction;
                               // the instruction fetched from this module
  output[31:0] branch_base_addr; // (pc+4) to ALU which is used by branch type instruction
  output[31:0] link_addr;
                                     // (pc+4) to Decoder which is used by jal instruction
              clock, reset:
                                          // Clock and reset
 input
// from ALU
 input[31:0] Addr_result;
                                    // the calculated address from ALU
                                    // while Zero is 1, it means the ALUresult is zero
 input
              Zero:
// from Decoder
 input[31:0] Read_data_1;
                                    // the address of instruction used by jr instruction
// from Controller
           Branch;
                                    // while Branch is 1,it means current instruction is beq
 input
           nBranch;
                                    // while nBranch is 1,it means current instruction is bnq
 input
                                    // while Jmp 1, it means current instruction is jump
 input
           Jmp;
                                    // while Jal is 1, it means current instruction is jal
           Jal;
 input
                                    // while Ir is 1, it means current instruction is ir
 input
```

# Update the Value of the PC register

TIPS: The code here is JUST refence, NOT request. reg[31:0] PC, Next\_PC; always @\* begin if(((Branch == 1) && (Zero == 1)) || ((nBranch == 1) && (Zero == 0))) // beq, bne**Next\_PC** = ... // the calculated new value for PC else if(Jr == 1) **Next\_PC** = ... // the value of \$31 register Next PC else **Next\_PC** = ... // PC+4 end Clock . always @(... clock) begin if(reset == 1)5. Pseudodirect addressing **PC** <= 32'h0000 0000; Address Memory else begin Word if((Jmp == 1) || (Jal == 1)) begin**PC** <= ...: Q1: Complete the code to update 'Next PC' end else **PC** <= ...: Q2: Could be 'PC' ready while read the 'prgrom'? Determine when to update the value of the PC register. end Q3: Is this Minisys ISA a Harvard structure or Von Neumann structure

take a look at the initial value of PC

# Outputs of IFetch: Prepare for Decoder and ALU

```
output[31:0] branch_base_addr; // (pc+4) to ALU which is used by 'beq' and 'bne' instructions output[31:0] link_addr; // (pc+4) to Decoder which is used by 'jal' instruction
```

Here for "pc+4", the value of pc is the address of current processing instruction.



Don't forget to instance instruction memory, complete the port binding.

TIPS: The design here is for reference ONLY, NOT request.

#### Practice2

- 1. Make a Minisys source file with j, jal, jr, beq,bne and other NON-jumping instructions included.
- 2. Using Minisys1AssemblerV2.2 to assembler the source file on step 1, get the coe files.
- 3. Using the "prgmip32.coe" generated on step 2 as the initial file for the ROM in IFetch submodule to verify the its funciton:
- 3-1) What's the value of register PC while the reset is valid.
- 3-2) While reset is invalid, on which edge of clock would the value of register PC be updated?
- 3-3) What's the updated value to register PC while the current instruction is j, jal, jr, beq,bne and other NON-jumping instructions.
- 3-4) On which edge of clock would the instruction be fetched out?
- 3-5) Is there any difference between the two output ports ("branch\_base\_addr" and "link\_addr")

Tips:1) There are j, jal, jr, beq,bne and other NON-jumping instructions in cputest.asm(which is in the Minisys1AssemblerV2.2.rar), you can modify it as an alternative to the 1st step.

2) "Minisys1AssemblerV2.2.rar" could be found in the directory "lab\_tools" of course blackboard site

https://bb.sustech.edu.cn/webapps/blackboard/content/listContentEditable.jsp?content\_id=\_281672\_1&course\_id=\_3609\_1

#### **ALU**

- ➤ Determine the function and the inputs and outputs
  - > A MUX for operand selection
  - 'ALU\_control'
  - ➤ Operation
    - Arithmetic and Logic calculation
    - > Shift calculation
    - > Special calculation (slt,lui)
    - > Address calculation

## Q: Is the ALU a commbinatorial logic and sequential logic?



Tips: follow design is a reference ONLY, not required.

# Minisys - A subset of MIPS32

| Туре | Name | funC(ins[5:0])  |
|------|------|-----------------|
| R    | sII  | <b>00_00</b> 00 |
|      | srl  | <b>00_0</b> 010 |
|      | sllv | <b>00_0</b> 100 |
|      | srlv | <b>00_0</b> 110 |
|      | sra  | <b>00_0</b> 011 |
|      | srav | 00_0111         |
|      | jr   | <b>00_1</b> 000 |
|      | add  | <b>10_000</b> 0 |
|      | addu | 10_0001         |
|      | sub  | <b>10_001</b> 0 |
|      | subu | 10_0011         |
|      | and  | <b>10_01</b> 00 |
|      | or   | <b>10_01</b> 01 |
|      | xor  | <b>10_01</b> 10 |
|      | nor  | 10_0111         |
|      | slt  | <b>10_1</b> 010 |
|      | sltu | <b>10_1</b> 011 |

| Type | Name  | opC(Ins[31:26]) |
|------|-------|-----------------|
|      | beq   | <b>00</b> _0100 |
|      | bne   | <b>00</b> _0101 |
|      | lw    | <b>10</b> _0011 |
|      | sw    | <b>10</b> _1011 |
|      |       |                 |
|      |       |                 |
|      |       |                 |
|      | add i | <b>00_1</b> 000 |
|      | addiu | <b>00_1</b> 001 |
|      | slti  | <b>00_1</b> 010 |
|      | sltiu | <b>00_1</b> 011 |
|      | and i | <b>00_1</b> 100 |
|      | or i  | <b>00_1</b> 101 |
|      | xori  | <b>00_1</b> 110 |
|      | lui   | <b>00_1</b> 111 |

| Туре | Name | opC(Ins[31:26]) |
|------|------|-----------------|
|      | jump | <b>00_001</b> 0 |
| J    | jal  | 00_0011         |
|      |      |                 |



NOTE:

Minisys is a subset of MIPS32.

The opC of R-Type instruction is 6'b00\_0000

#### **BASIC INSTRUCTION FORMATS**

| R | opco | ode   | rs   | rt |       | rd     | shar | nt     | funct |
|---|------|-------|------|----|-------|--------|------|--------|-------|
|   | 31   | 26 25 | 21 2 | 0  | 16 15 | 11     | 10   | 6.5    | 0     |
| I | opco | ode   | rs   | rt |       |        | imme | ediate |       |
|   | 31   | 26 25 | 21 2 | 0  | 16 15 |        |      |        | 0     |
| J | opco | ode   |      |    | а     | ddress |      |        |       |
|   | 31   | 26 25 |      |    |       |        |      |        | 0     |

# Inputs Of ALU

```
module Executs32 ();
// from Decoder
  input[31:0] Read data 1;
                                      //the source of Ainput
  input[31:0] Read_data_2;
                                      //one of the sources of Binput
  input[31:0] Sign_extend;
                                      //one of the sources of Binput
// from IFetch
  input[5:0] Opcode;
                                       //instruction[31:26]
  input[5:0] Function_opcode;
                                       //instructions[5:0]
  input[4:0] Shamt;
                                       //instruction[10:6], the amount of shift bits
  input[31:0] PC_plus_4;
                                       //pc+4
// from Controller
  input[1:0] ALUOp;
                            //{ (R format | I format) , (Branch | nBranch) }
            ALUSrc;
                            // 1 means the 2nd operand is an immediate (except beg,bne)
  input
  input
           I format;
                            // 1 means I-Type instruction except beg, bne, LW, SW
             Sftmd:
                             // 1 means this is a shift instruction
  input
```

# Outputs And Variable of ALU continued

Q1: What's the destination of the outputs of ALU?

```
output[31:0] reg ALU_Result; // the ALU calculation result
output Zero; // 1 means the ALU_reslut is zero, 0 otherwise
output[31:0] Addr_Result; // the calculated instruction address
```

Q2: How to determine the data type of following variable?

```
wire[31:0]
             Ainput, Binput;
                                         // two operands for calculation
wire[5:0]
             Exe code; // use to generate ALU_ctrl. (I_format==0) ? Function_opcode : { 3'b000 , Opcode[2:0] };
             ALU ctl; // the control signals which affact operation in ALU directely
wire[2:0]
wire[2:0]
                         // identify the types of shift instruction, equals to Function opcode[2:0]
             Sftm;
                                     // the result of shift operation
reg[31:0]
             Shift_Result;
reg[31:0]
                                      // the result of arithmetic or logic calculation
             ALU output mux;
wire[32:0]
             Branch Addr;
                               // the calculated address of the instruction, Addr Result is Branch Addr[31:0]
```

# The Selection On Operand2

- Two operands: Ainput and Binput.
- Binput is the output of 2-1 MUX:
  - "Sign\_extend" and "Read\_data\_2" are from Decoder.
  - The output of the MUX is determined by "ALUSrc".



## **ALU\_ctrl** generation

- ➤ Design:
  - lots of operations need to be processed in ALU
  - To reduce the burden of the Controller, the Controller and ALU produce control signals which affect the ALU operation together

Function code

- > Implements(1):
  - ALUOp(1st level control signal):

generated by Controller (the basic relationship between instruction and operation)

- > bit1 to identify if the instruction is R\_format/ I\_format, otherwise means neither
- > bit0 to identify if the instruction is beg/bne, otherwise means neither
- ALUOp = { (R\_format || I\_format) , (Branch || nBranch) }

```
// R_format = (Opcode==6'b000000)? 1'b1:1'b0;
```

// "I\_format" is used to identify if the instruction is I\_type(except for beq, bne, lw and sw).

Exe code

## ALU\_ctrl generation continued

- Implements(2) :
  - > Exe\_code(2nd level control signal): according to the instruction type( I-format or not):

function\_opcode:

{ 3'b000, Opcode[2:0] };

#### Tips

- I\_format is 1 means this is the I-type instruction
   except beq,bne,Iw and sw.
- 2) Opcode is instruction[31:26]
- 3) function\_opcode is instruction[5:0]
- **Q.** Could the 'Exe\_code' be generated by Controller or by ALU? What's your choic?

| Type | Name | funC(ins[5:0])  |
|------|------|-----------------|
| R    | sll  | 00_0000         |
|      | srl  | <b>00_0</b> 010 |
|      | sllv | <b>00_0</b> 100 |
|      | srlv | <b>00_0</b> 110 |
|      | sra  | <b>00_0</b> 011 |
|      | srav | 00_0111         |
|      | jr   | 00_1000         |
|      | add  | <b>10_000</b> 0 |
|      | addu | 10_0001         |
|      | sub  | <b>10_001</b> 0 |
|      | subu | 10_0011         |
|      | and  | <b>10_01</b> 00 |
|      | or   | <b>10_01</b> 01 |
|      | xor  | <b>10_01</b> 10 |
|      | nor  | 10_0111         |
|      | slt  | <b>10_1</b> 010 |
|      | sltu | <b>10_1</b> 011 |

| e | Name  | opC(Ins[31:26]) |  |  |  |  |  |
|---|-------|-----------------|--|--|--|--|--|
|   | beq   | <b>00</b> _0100 |  |  |  |  |  |
|   | bne   | <b>00</b> _0101 |  |  |  |  |  |
|   | lw    | <b>10</b> _0011 |  |  |  |  |  |
|   | sw    | <b>10</b> _1011 |  |  |  |  |  |
|   |       | I-Format        |  |  |  |  |  |
|   |       |                 |  |  |  |  |  |
| ſ | addi  | <b>00_1</b> 000 |  |  |  |  |  |
| ı | addiu | <b>00_1</b> 001 |  |  |  |  |  |
| ı | slti  | <b>00_1</b> 010 |  |  |  |  |  |
| ı | sltiu | <b>00_1</b> 011 |  |  |  |  |  |
| ı | andi  | <b>00_1</b> 100 |  |  |  |  |  |
| ١ | ori   | <b>00_1</b> 101 |  |  |  |  |  |
| 1 | xor i | <b>00_1</b> 110 |  |  |  |  |  |
| 1 | lui   | <b>00 1</b> 111 |  |  |  |  |  |

## ALU\_ctrl generation continued

| Exe_code[30] | ALUOp[10] | ALU_ctl[20] | 指令助记符       |
|--------------|-----------|-------------|-------------|
| 0100         | 10        | 000         | and,andi    |
| 0101         | 10        | 001         | or,ori      |
| 0000         | 10        | 010         | add,addi    |
| XXXX         | 00        | 010         | lw, sw      |
| 0001         | 10        | 011         | addu, addiu |
| 0110         | 10        | 100         | xor,xori    |
| 0111         | 10        | 101         | nor,lui     |
| 0010         | 10        | 110         | sub, slti   |
| xxxx         | 01        | 110         | beq, bne    |
| 0011         | 10        | 111         | subu, sltiu |
| 1010         | 10        | 111         | slt         |
| 1011         | 10        | 111         | sltu        |

ALU\_ctrl: based on ALUOp and Exe\_code, specify most of the operation details in ALU

```
Exe_code =
(I_format==0) ?
Function_opcode :
{ 3'b000 , Opcode[2:0] };

ALUOp =
{ (R_format || I_format) , (Branch || nBranch) }
```



```
assign ALU_ctl[0] = (Exe_code[0] | Exe_code[3]) & ALUOp[1];

assign ALU_ctl[1] = ((!Exe_code[2]) | (!ALUOp[1]));

assign ALU_ctl[2] = (Exe_code[1] & ALUOp[1]) | ALUOp[0];
```

# ALU\_ctrl usage

> Type1: The same operation in ALU with different operand source sometimes the instructions share the same calculation operation but with different operand source, such as "and" and "andi", "addu" and "addui".

# The same operation but different operand source: ALU\_ctrl is same

- add vs addi
- addu vs addiu
- and vs andi
- or vs ori
- xor vs xori
- slt vs sltu vs sltiu

| Exe_code[30] | ALUOp[10] | ALU_ctl[20] | 指令助记符       |  |
|--------------|-----------|-------------|-------------|--|
| 0100         | 10        | 000         | and,andi    |  |
| 0101         | 10        | 001         | or,ori      |  |
| 0000         | 10        | 010         | add,addi    |  |
| XXXX         | 00        | 010         | lw, sw      |  |
| 0001         | 10        | 011         | addu, addiu |  |
| 0110         | 10        | 100         | xor,xori    |  |
| 0111         | 10        | 101         | nor,lui     |  |
| 0010         | 10        | 110         | sub, slti   |  |
| XXXX         | 01        | 110         | beq, bne    |  |
| 0011         | 10        | 111         | subu, sltiu |  |
| 1010         | 10        | 111         | slt         |  |
| 1011 10      |           | 111         | sltu        |  |

# **ALU\_ctrl** usage continued

> Type2: The same operation in ALU with different destination

The ALU\_ctrl code is same(3'b010) for both "Iw", "sw", "add" and "andi":

the operation of "Iw" and "sw" in ALU
is calcuation the address based on
the base address and offset which is
same as in "add" operation.

| Exe_code[30] | ALUOp[10] | ALU_ctl[20] | 指令助记符       |  |
|--------------|-----------|-------------|-------------|--|
| 0100         | 10        | 000         | and,andi    |  |
| 0101         | 10        | 001         | or,ori      |  |
| 0000         | 10        | 010         | add,addi    |  |
| XXXX         | 00        | 010         | lw, sw      |  |
| 0001         | 10        | 011         | addu, addiu |  |
| 0110         | 10        | 100         | xor,xori    |  |
| 0111         | 10        | 101         | nor,lui     |  |
| 0010         | 10        | 110         | sub, slti   |  |
| XXXX         | 01        | 110         | beq, bne    |  |
| 0011         | 10        | 111         | subu, sltiu |  |
| 1010         | 10        | 111         | slt         |  |
| 1011         | 10        | 111         | sltu        |  |

# ALU\_ctrl usage continued

- > Type2 continued: The same operation in ALU with different destination
  - "beq", "bne" vs "sub" (destionation ):
    - "beq" and "bne": Addr\_reslut
    - "sub": "ALU\_reslut"
  - "subu" vs "slt", "sltu" (destionation)
    - "slti" and "sltiu": Zero.

I\_format is used here to distinguish these two types

• "sub" vs "slti", "subu" vs "sltiu":

same as upper instructions,
Function\_opcode(3)=1 of slt and sltu
could be used as distinguishment

| Exe_code[30] | ALUOp[10] | ALU_cti[20] 指令助记符 |             |  |
|--------------|-----------|-------------------|-------------|--|
| 0100         | 10        | 000               | and,andi    |  |
| 0101         | 10        | 001               | or,ori      |  |
| 0000         | 10        | 010               | add,addi    |  |
| XXXX         | 00        | 010               | lw, sw      |  |
| 0001         | 10        | 011               | addu, addiu |  |
| 0110         | 10        | 100               | xor,xori    |  |
| 0111         | 10        | 101               | nor,lui     |  |
| 0010         | 10        | 110               | sub, slti   |  |
| XXXX         | 01        | 110               | beq, bne    |  |
| 0011         | 10        | 111               | subu, sltiu |  |
| 1010         | 10        | 111               | slt         |  |
| 1011         | 10        | 111               | sltu        |  |

# ALU\_ctrl usage continued

> Type3 : Some instructions' ALU\_ctrl code is the same as others, but with different operation in ALU.

For these instructions, make sure they can be identified to avoid wrong operations:

- shift instrucitons: could be identified by the input port "sftmd"
- lui: whose ALU\_ctrl code is the same as "nor", but could be identified by "I\_format"
- jr : could be identified by the input port "jr", not excute in ALU
- j : could be identified by the input port "jmp", not excute in ALU
- jal: could be identified by the input port "jal", not excute in ALU

# Practice3-1: Arithmatic and Logic calculation

Complete the following code according to the table on the right hand

```
reg[31:0] ALU output mux;
always @ (ALU ctl or Ainput or Binput)
begin
case (ALU ctl)
    3'b000:ALU output mux =? ? ?
    3'b001:ALU output mux =? ? ?
    3'b010:ALU output mux =? ? ?
    3'b011:ALU output mux =? ? ?
    3'b100:ALU output mux =? ? ?
    3'b101:ALU output mux =? ? ?
    3'b110:ALU output mux =? ? ?
    3'b111:ALU output mux =? ? ?
    default:ALU output mux = 32'h00000000;
endcase
end
```

| Exe_code[30] | ALUOp[10] | ALU_ctl[20] | 指令助记符       |
|--------------|-----------|-------------|-------------|
| 0100         | 10        | 000         | and,andi    |
| 0101         | 10        | 001         | or,ori      |
| 0000         | 10        | 010         | add,addi    |
| xxxx 00      |           | 010         | lw, sw      |
| 0001         | 10        | 011         | addu, addiu |
| 0110         | 10        | 100         | xor,xori    |
| 0111         | 10        | 101         | nor,lui     |
| 0010         | 10        | 110         | sub, slti   |
| xxxx 01      |           | 110         | beq, bne    |
| 0011         | 10        | 111         | subu, sltiu |
| 1010         | 10        | 111         | slt         |
| 1011         | 10        | 111         | sltu        |

**Tips**: While ALU\_ctrl is 3'b101, One of the implements is to execute only 'nor', make other procedure do the 'lui'

# **Shift Operation**

| Type | Name | funC(ins[5:0])  |
|------|------|-----------------|
| R    | sII  | 00_0000         |
|      | srl  | <b>00_0</b> 010 |
|      | sllv | 00_0100         |
|      | srlv | 00_0110         |
|      | sra  | <b>00_0</b> 011 |
|      | srav | 00_0111         |

| sftm[2:0] | process           |
|-----------|-------------------|
| 3'b000    | sll rd, rt, shamt |
| 3'b010    | srl rd, rt, shamt |
| 3'b100    | sllv rd, rt, rs   |
| 3'b110    | srlv rd, rt, rs   |
| 3'b011    |                   |
|           | sra rd, rt, shamt |
| 3'b111    | srav rd, rt, rs   |
| other     | not shift         |

There are 6 shift instructions, listed in the table on the left hand.

Ainput, Binput/shamt are the operand of shift operation



## **Practice3-2: Shift Operation**

Complete the following code, taking the table on the left hand as reference

| sftm[2:0] | process           |
|-----------|-------------------|
| 3'b000    | sll rd, rt, shamt |
| 3'b010    | srl rd, rt, shamt |
| 3'b100    | sllv rd, rt, rs   |
| 3'b110    | srlv rd, rt, rs   |
| 3'b011    | sra rd, rt, shamt |
| 3'b111    | srav rd, rt, rs   |
| other     | not shift         |

```
always @* begin // six types of shift instructions
   if(Sftmd)
         case(Sftm[2:0])
           3'b000:Shift_Result = Binput << Shamt;
                                                            //Sll rd,rt,shamt 00000
           3'b010:Shift_Result = ???;
                                                            //Srl rd,rt,shamt 00010
           3'b100:Shift_Result = Binput << Ainput;
                                                            //Sllv rd,rt,rs 000100
           3'b110:Shift Result = ???;
                                                            //Srlv rd,rt,rs 000110
                                                            //Sra rd,rt,shamt 00011
           3'b011:Shift Result = ???;
           3'b111:Shift_Result = ???;
                                                            //Srav rd,rt,rs 00111
           default:Shift_Result = Binput;
         endcase
   else
         Shift_Result = Binput;
  end
```

# Get the Output of ALU

The operations of ALU include:

- 1) execute the **setting** type instructions ( **slt, sltu, slti** and **sltiu** )
  - > get ALU\_output\_mux, and set the value of the output port "ALU\_result"
- 2) execute the lui operation
  - get result of "lui" execution, and set the value to the output port "ALU\_result"
- 3) execute the **shift** operation
  - > get "Shift\_Result", set its value to the output port "ALU\_result"
- 4) do the basic arithmetic and logic calculation
  - get ALU\_output\_mux, set its value to the output port "ALU\_result"

Tips: Exe\_code[3..0], ALUOp[1..0] and ALU\_ctl[2..0] are used to identify the types of operation

# Practice 3-3: determine the output "ALU\_Result"

Complete the following code according to the code annotation

```
always @* begin
     //set type operation (slt, slti, sltu, sltiu)
      if( ((ALU_ctl==3'b111) && (Exe_code[3]==1)) || /*to be completed*/ )
                  ALU Result = (Ainput-Binput<0)?1:0;
                                                                                                                             指令助记符
                                                                                                              ALU_ctl[2..0]
                                                                                         Exe_code[3..0]
                                                                                                    ALUOp[1..0]
                                                                                            0100
                                                                                                                              and, andi
    //lui operation
                                                                                            0101
                                                                                                       10
                                                                                                                 001
                                                                                                                              or, ori
    else if((ALU_ctl==3'b101) && (I_format==1))
                                                                                            0000
                                                                                                                 010
                                                                                                                              add,addi
                                                                                                       10
                  ALU_Result[31:0] = /*to be completed*/;
                                                                                                                 010
                                                                                            XXXX
                                                                                                                              lw, sw
                                                                                            0001
                                                                                                       10
                                                                                                                 011
                                                                                                                             addu, addiu
                                                                                            0110
                                                                                                       10
                                                                                                                 100
    //shift operation
                                                                                                       10
                                                                                                                 101
                                                                                            0111
                                                                                                                              nor,lui
                                                                                            0010
                                                                                                       10
                                                                                                                 110
    else if(Sftmd==1)
                                                                                                                              sub, slti
                                                                                                                 110
                                                                                            XXXX
                  ALU_Result = Shift_Result;
                                                                                                       10
                                                                                            0011
                                                                                                                 111
                                                                                                                             subu, sltiu
                                                                                            1010
                                                                                                       10
                                                                                                                 111
                                                                                                       10
                                                                                            1011
                                                                                                                 111
    //other types of operation in ALU (arithmatic or logic calculation)
    else
                  ALU_Result = ALU_output_mux[31:0];
  end
```

# Practice 3-4: determine the output "Addr\_result" and "Zero"

The values of "Addr\_result" and "Zero" are still not detemined.

```
output[31:0] reg ALU Result; // the ALU calculation result
output Zero; // 1 means the ALU_reslut is zero, 0 otherwise
output[31:0] Addr_Result; // the calculated instruction address
```

> "Zero" is a signal used by "IFetch" to determine whether to use the value of "Addr\_reslut" to update PC register or not.

TIPS: Minisys only support "beq" and "bne" in the conditional jump instruction.

> "Addr\_result" is calculated by ALU when the instruction is "beq" or "bne".

TIPS: Addr\_reslut should be the sum of pc+4(could be get from PC\_plus\_4) and the immediate in the instruction.

#### Practice 3-5: Function Verification on ALU

Build a testbench to verify the function of ALU.

Take the testcases described in bellow table as reference, More testcases are suggested for function verification.

| Time (ns) | Instruction | A input                                                                                                                                                                                      | B input     | Results(includes 'Zero')                 |
|-----------|-------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|------------------------------------------|
| 0         | add         | 0x5                                                                                                                                                                                          | 0x6         | ALU_Result = 0x0000_000b, Zero=1'b0      |
| 200       | addi        | 0xffff_ff40                                                                                                                                                                                  | 0x3         | ALU_Result = 0xffff_ff43, Zero=1'b0      |
| 400       | and         | 0x0000_00ff                                                                                                                                                                                  | 0x0000_0ff0 | $ALU_Result = 0x0000_00f0, Zero=1'b0$    |
| 600       | sll         | 0x0000_0002                                                                                                                                                                                  | 0x3         | $ALU_Result = 0x0000_0010$ , $Zero=1'b0$ |
| 800       | lui         | 0x0000_0040                                                                                                                                                                                  | 0x10 (16)   | $ALU_Result = 0x0040_0000$ , $Zero=1'b0$ |
| 1000      | beq         | The value of Ainput is same with that of Binput. Zero = 1'b1 Depends on your design Addr_Result: should be the sum of pc+4(could be get from PC_plus_4) and the immediate in the instruction |             |                                          |