

# Digital Systems Design with Verilog

### MIPS Processor Design Assignment 3

## Assignment 3 – Two/Three Parts

#### Assignment 3 Part A

- 1. Modify the MIPS assembly language program so that it displays the lowest 8 digits of your ID on the DE2 board's 7-segment displays.
- 2. Show that your program functions correctly by taking screenshot(s) of the ModelSim simulation or the SignalTap logic analyser.
- In your report, include your assembly language code and a screenshot of the ModelSim simulation or SignalTap logic analyser. Also include a photograph of the 7-segment displays showing your ID if you used a DE2 board.

## Assignment 3 – Part B

- The MIPS design presented in MIPS\_System implements only a limited number of the MIPS instructions. Of the R-type instructions, only ADD, ADDU, SUB, SUBU, AND, OR and SLT are implemented. Your task is to modify the MIPS design so that it implements the additional instructions shown in Table 1.
- Once you have modified your design, write a program to demonstrate that your hardware correctly implements the instructions. Your results should include printouts of the SignalTap logic analyser showing your program operating. Annotate the printouts to explain what is happening.
- (Instructions may include nor, xor, andi, xori, lb, lbu, lh)

## Overview of Our Design



# Design single cycle – From previous lectures



## RTL View of MIPS Design



#### Features of the MIPS Design

- Uses a PLL to generate 3 clock signals
- Includes Dual Port memory (for data and for instructions)
- Includes a Timer (not essential for this assignment)
- Includes GPIO to access the LEDs and Switches on the DE2 Board
- Has an address decoder to select between
  - Memory
  - GPIO
  - Timer

## Why 3 clocks?

This is supposed to be a single-cycle design, so why do we have three clocks?


- Remember the order of events after the main clock edge in this single cycle processor.
  - Latch the new PC into the PC latches
  - Read the Instruction from the Instruction memory
  - For a lw or sw instruction there is a data memory read or a data memory write once the address has been calculated
- As the Dual port memory accesses are synchronised with "clock edges" we need three clock "edges" within the one clock period.
  - Latch PC
  - Read Instruction
  - Read/Write data value
- We can use one of the Phase Locked Loops (PLLs) within the FPGA to generate the edges or split the clock into a number of cycles with a state machine.
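The state-machine alternative mentioned above can be sketched as follows. This is only an illustrative sketch, not the course's ClockBlk (which uses a PLL): the module name `phase_gen`, its port names, and the assumption that `clk_fast` runs at three times the processor cycle rate are all hypothetical.

```verilog
// Hypothetical sketch: split one processor cycle into three phases using a
// counter clocked three times faster than the processor cycle.
module phase_gen (
    input  wire clk_fast,   // assumed: 3x the single-cycle processor rate
    input  wire reset,      // synchronous, active high
    output wire pc_en,      // phase 0: latch the new PC
    output wire inst_en,    // phase 1: read the instruction memory
    output wire data_en     // phase 2: read/write the data memory
);
    reg [1:0] phase;

    // Count 0 -> 1 -> 2 -> 0 ...
    always @(posedge clk_fast) begin
        if (reset)
            phase <= 2'd0;
        else if (phase == 2'd2)
            phase <= 2'd0;
        else
            phase <= phase + 2'd1;
    end

    // Exactly one enable is high in each fast-clock cycle.
    assign pc_en   = (phase == 2'd0);
    assign inst_en = (phase == 2'd1);
    assign data_en = (phase == 2'd2);
endmodule
```

Each enable would gate one of the three synchronous steps listed above, so the three "edges" occur in order within one processor cycle.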

#### Dual Port RAM module



#### Can we remove the Registers on the inputs?



Not on the Cyclone II: the option is greyed out.

## The Design Files



- MIPS\_System.v is the top-level file that combines all the other modules.
- The memory is implemented using an Altera dual-port RAM megafunction.
  - The RAM is initialised using the "insts\_data.mif" file. You need to edit this file with the hex code generated by the MIPS assembler.
- The ClockBlk generates the shifted clocks required by the Dual Port Memory.

## The Memory Map

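```
module Addr_Decoder (input [31:0] Addr,
                     output reg CS_MEM_N,
                     output reg CS_TC_N,
                     output reg CS_UART_N,
                     output reg CS_GPIO_N);

  // Memory map (top of the address space first)
  //
  // 0xFFFF_FFFF ------
  //
  // 0xFFFF_3000 ------
  //                General Purpose IO
  // 0xFFFF_2000 ------
  //                Universal Asynchronous Receiver/Transmitter (4KB)
  // 0xFFFF_1000 ------
  //                Timer Counter
  // 0xFFFF_0000 ------
  //                Reserved
  //
  //                mem: Instruction & Data Memory
  // 0x0000_0000 ------

  always @(*)
  begin
    if (Addr[31:13] == 19'h0000)        // Instruction & Data Memory
      begin
        CS_MEM_N  <= 0;
        CS_TC_N   <= 1;
        CS_UART_N <= 1;
        CS_GPIO_N <= 1;
      end
    else if (Addr[31:12] == 20'hFFFF0)  // Timer
      begin
        CS_MEM_N  <= 1;
        CS_TC_N   <= 0;
        CS_UART_N <= 1;
        CS_GPIO_N <= 1;
      end
    else if (Addr[31:12] == 20'hFFFF1)  // UART
      // The slide is cut off here. The rest of this block is an assumed
      // completion that follows the same pattern and the memory map above.
      begin
        CS_MEM_N  <= 1;
        CS_TC_N   <= 1;
        CS_UART_N <= 0;
        CS_GPIO_N <= 1;
      end
    else if (Addr[31:12] == 20'hFFFF2)  // GPIO (assumed)
      begin
        CS_MEM_N  <= 1;
        CS_TC_N   <= 1;
        CS_UART_N <= 1;
        CS_GPIO_N <= 0;
      end
    else                                // nothing selected (assumed)
      begin
        CS_MEM_N  <= 1;
        CS_TC_N   <= 1;
        CS_UART_N <= 1;
        CS_GPIO_N <= 1;
      end
  end
endmodule
```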

The Address Decoder uses the upper bits of the address bus to decide which device should be selected. It's normal for device select and control signals to be "active-low", i.e. the device is selected when the control signal is low.

### Given Insts\_data.mif



0x3C020000 0x24420055 0x3C03FFFF 0x24632008 0xAC620000 0x08000005

## Manual Decoding - 0x3C020000

There are 3 instruction formats, all 32 bits wide:

| Format | Bits 31–26 | Bits 25–21 | Bits 20–16 | Bits 15–11 | Bits 10–6 | Bits 5–0 |
|--------|------------|------------|------------|------------|-----------|----------|
| R      | opcode     | rs         | rt         | rd         | sa        | funct    |
| I      | opcode     | rs         | rt         | immediate (bits 15–0) | | |
| J      | opcode     | jump target (bits 25–0) | | | | |

What is the opcode (upper 6 bits)?

0x3C020000 = 0b0011\_1100\_0000\_0010\_0000\_0000\_0000\_0000

Opcode = 0011\_11

What format is the instruction?


An I-type instruction: lui rt, immediate

Load upper immediate: set the upper 16 bits of register rt to the immediate value (the lower 16 bits are cleared).

- What register is rt? Bits 20->16 = 0\_0010, so rt = register 2.
- What is the immediate value? 0x0000 (the low 16 bits of the instruction).

Instruction is: lui \$2, 0x0000

## Manual Decoding - 0x24420055


0x24420055 = 0b0010\_0100\_0100\_0010\_0000\_0000\_0101\_0101

Opcode = 0010\_01

An I-type instruction: addiu rt, rs, immediate

Add immediate unsigned: add the 16-bit immediate value to rs and place the result in rt.

- What register is rs? Bits 25->21 = 00\_010, so rs = register 2.
- What register is rt? Bits 20->16 = 0\_0010, so rt = register 2.
- What is the immediate value? 0x0055 (bits 15->0 of the instruction).

Instruction is: addiu \$2, \$2, 0x0055

## Manual Decoding - 0x3C03FFFF



0x3C03FFFF = 0b0011\_1100\_0000\_0011\_1111\_1111\_1111\_1111

Opcode = 0011\_11 (upper 6 bits)

An I-type instruction: lui rt, immediate

Load upper immediate: set the upper 16 bits of register rt to the immediate value stored in the lower 16 bits of the instruction.

- What register is rt? Bits 20->16 = 0\_0011, so rt = register 3.
- What is the immediate value? 0xFFFF (the low 16 bits of the instruction).

Instruction is: lui \$3, 0xFFFF

## Manual Decoding - 0x24632008



0x24632008 = 0b0010\_0100\_0110\_0011\_0010\_0000\_0000\_1000

Opcode = 0010\_01

An I-type instruction: addiu rt, rs, immediate

Add immediate unsigned: add the 16-bit immediate value to rs and place the result in rt.

- What register is rs? Bits 25->21 = 00\_011, so rs = register 3.
- What register is rt? Bits 20->16 = 0\_0011, so rt = register 3.
- What is the immediate value? 0x2008 (the low 16 bits of the instruction).

Instruction is: addiu \$3, \$3, 0x2008

## Manual Decoding - 0xAC620000



0xAC620000 = 0b1010\_1100\_0110\_0010\_0000\_0000\_0000\_0000

Opcode = 1010\_11

An I-type instruction: sw rt, immediate(rs)

Store register rt at the memory location given by rs offset by the immediate value.

- What register is rs? Bits 25->21 = 00\_011, so rs = register 3.
- What register is rt? Bits 20->16 = 0\_0010, so rt = register 2.
- What is the immediate value? 0x0000 (the low 16 bits of the instruction).

Instruction is: sw \$2, 0x0000(\$3)

## Manual Decoding – 0x08000005



0x08000005 = 0b0000\_1000\_0000\_0000\_0000\_0000\_0000\_0101

Opcode = 0000\_10

A J-type instruction: j label (jump to the address of label, which is stored in coded form)

What address is label?

The coded 26-bit jump target is 0x0000005. This is a word address.

Instruction is: j 0x00000014 (this is a byte address: 4 \* 5 = 20 decimal = 0x14)
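The manual decoding steps above amount to slicing fixed bit ranges out of the 32-bit instruction word. A minimal sketch of that slicing is shown below; the module name `instr_fields` and its port names are hypothetical and are not taken from the MIPS\_System sources.

```verilog
// Hypothetical sketch: extract every instruction field by bit position.
// Which fields are meaningful depends on the format (R, I or J).
module instr_fields (
    input  wire [31:0] instr,
    input  wire [31:0] pc_plus4,   // assumed input: address of the next instruction
    output wire [5:0]  opcode,
    output wire [4:0]  rs, rt, rd, sa,
    output wire [5:0]  funct,
    output wire [15:0] imm,
    output wire [25:0] target,
    output wire [31:0] jump_addr
);
    assign opcode = instr[31:26];  // all formats
    assign rs     = instr[25:21];  // R and I formats
    assign rt     = instr[20:16];  // R and I formats
    assign rd     = instr[15:11];  // R format
    assign sa     = instr[10:6];   // R format
    assign funct  = instr[5:0];    // R format
    assign imm    = instr[15:0];   // I format
    assign target = instr[25:0];   // J format (word address)

    // Byte address of a jump: word address * 4, with the upper 4 bits
    // taken from PC + 4.
    assign jump_addr = {pc_plus4[31:28], target, 2'b00};
endmodule
```

For 0x08000005 with the program located near address 0, `target` is 5 and `jump_addr` is 0x00000014, matching the hand calculation.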

#### MARS Assembler

- If the MARS jar file does not run when you double-click on it, you can start it from the command line:
- java -jar mars.jar (if you saved it as mars.jar)

#### Assembled with MARS



To get the assembly code starting at location 0x00000000, select "Settings->Memory Configuration->Compact, Text at Address 0"

#### What is stored where?

What does \$2 contain?

\$2 = 0x00000055

What does \$3 contain?

\$3 = 0xFFFF 2008
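These register values follow from the lui/addiu pairs decoded earlier. The arithmetic can be checked with a small simulation-only sketch; the module and function names here are hypothetical, and the registers are modelled as plain variables rather than the real register file.

```verilog
// Hypothetical simulation-only sketch of the lui/addiu arithmetic.
module lui_addiu_demo;
    reg [31:0] r2, r3;

    // Sign-extend a 16-bit immediate to 32 bits, as addiu does.
    function [31:0] sext16(input [15:0] imm);
        sext16 = {{16{imm[15]}}, imm};
    endfunction

    initial begin
        r2 = {16'h0000, 16'h0000};       // lui   $2, 0x0000
        r2 = r2 + sext16(16'h0055);      // addiu $2, $2, 0x0055
        r3 = {16'hFFFF, 16'h0000};       // lui   $3, 0xFFFF
        r3 = r3 + sext16(16'h2008);      // addiu $3, $3, 0x2008
        $display("reg 2 = %h", r2);      // r2 holds 32'h00000055
        $display("reg 3 = %h", r3);      // r3 holds 32'hFFFF2008
    end
endmodule
```

The following sw \$2, 0x0000(\$3) therefore writes 0x55 to address 0xFFFF2008.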

```
wire sw14_pressed;
wire sw15_pressed;
wire sw16_pressed;
wire sw17_pressed;

// Register map in the gpio.v file
// FFFF 202C  HEX7_R
// FFFF 2028  HEX6_R
// FFFF 2024  HEX5_R
// FFFF 2020  HEX4_R
// FFFF 201C  HEX3_R
// FFFF 2018  HEX2_R
// FFFF 2014  HEX1_R
// FFFF 2010  HEX0_R
// FFFF 200C  LEDG_R
// FFFF 2008  LEDR_R
// FFFF 2004  SW_StatusR
// FFFF 2000  KEY_StatusR
//
// LEDG register (32 bit)
// ZZZZ ZZZ|LEDG8| |LEDG7|LEDG6|LEDG5|LEDG4|
//         |LEDG3|LEDG2|LEDG1|LEDG0|
// LEDR register (32 bit)
// ZZZZ ZZZZ ZZZZ ZZ|LEDR17|LEDR16|
//         |LEDR15|LEDR14|LEDR13|LEDR12|
//         |LEDR11|LEDR10|LEDR9|LEDR8|
//         |LEDR7|LEDR6|LEDR5|LEDR4|
//         |LEDR3|LEDR2|LEDR1|LEDR0|
// SW Status register (32 bit)
// ZZZZ ZZZZ ZZZZ ZZ|SW17|SW16|
```

## What signals should be shown with ModelSim?






## Part B – What to change

- Part B requires you to encode some new instructions.
- What instruction type is your "instruction 1"?

#### **Control Unit**

Figure: the Control Unit. The Main Decoder takes Opcode<sub>5:0</sub> and produces MemtoReg, MemWrite, Branch, ALUSrc, RegDst, RegWrite and ALUOp<sub>1:0</sub>. The ALU Decoder takes ALUOp<sub>1:0</sub> and Funct<sub>5:0</sub> and produces ALUControl<sub>2:0</sub>. The opcode and funct fields come from the fetched instruction.

#### Control Unit - ALU Control

- The encoding of the ALU control signals is entirely up to the hardware designer
- However, the designer should make sure the chosen encoding is sensible
  - Memory access instructions (lw, sw) need to use the ALU to calculate the memory target address (addition)
  - Branch instructions (beq, bne) need to use the ALU for the equality check (subtraction)

| ALUOp <sub>1:0</sub> | Meaning       |
|----------------------|---------------|
| 00                   | Add           |
| 01                   | Subtract      |
| 10                   | Look at Funct |
| 11                   | Not Used      |



| ALUOp <sub>1:0</sub> | Funct        | ALUControl <sub>2:0</sub> |
|----------------------|--------------|---------------------------|
| 00                   | X            | 010 (add)                 |
| X1                   | X            | 110 (subtract)            |
| 1X                   | 100000 (add) | 010 (add)                 |
| 1X                   | 100010 (sub) | 110 (subtract)            |
| 1X                   | 100100 (and) | 000 (and)                 |
| 1X                   | 100101 (or)  | 001 (or)                  |
| 1X                   | 101010(slt)  | 111 (slt)                 |

## Verilog Code – ALU



```
module alu(input      [31:0] a, b,
           input      [2:0]  alucont,
           output reg [31:0] result,
           output            zero);

  wire [31:0] b2, sum, slt;

  assign b2 = alucont[2] ? ~b : b;
  // addition (sub)
  assign sum = a + b2 + alucont[2];
  assign slt = sum[31];

  always @(*)
  begin
    case(alucont[1:0])
      2'b00: result <= a & b2;  // A & B
      2'b01: result <= a | b2;  // A | B
      2'b10: result <= sum;     // A + B, A - B
      2'b11: result <= slt;     // SLT
    endcase
  end

  // for branch
  assign zero = (result == 32'b0);
endmodule
```

| F <sub>2:0</sub> | Function |
|------------------|----------|
| 000              | A & B    |
| 001              | A \| B   |
| 010              | A + B    |
| 011              | not used |
| 100              | A & ~B   |
| 101              | A \| ~B  |
| 110              | A - B    |
| 111              | SLT      |

## Where are "R-Type" Instructions decoded?

```
  // Main decoder excerpt: the module header and the declaration of
  // controls are not shown on the slide.
  assign {signext, shiftl16, regwrite, regdst,
          alusrc, branch, memwrite,
          memtoreg, jump, aluop} = controls;

  always @(*)
    case (op)
      6'b000000: controls <= 11'b00110000011; // Rtype
      6'b100011: controls <= 11'b10101001000; // LW
      6'b101011: controls <= 11'b10001010000; // SW
      6'b000100: controls <= 11'b10000100001; // BEQ
      6'b001001: controls <= 11'b10101000000; // ADDI, ADDIU: only difference is exception
      6'b001101: controls <= 11'b00101000010; // ORI
      6'b001111: controls <= 11'b01101000000; // LUI
      6'b000010: controls <= 11'b00000000100; // J
      default:   controls <= 11'bxxxxxxxxxxx; // ???
    endcase
endmodule

module aludec(input      [5:0] funct,
              input      [1:0] aluop,
              output reg [2:0] alucontrol);

  always @(*)
    case(aluop)
      2'b00: alucontrol <= 3'b010;  // add
      2'b01: alucontrol <= 3'b110;  // sub
      2'b10: alucontrol <= 3'b001;  // or
      default: case(funct)          // aluop = 2'b11: look at funct
          6'b100000,
          6'b100001: alucontrol <= 3'b010; // ADD, ADDU: only difference is exception
          6'b100010,
          6'b100011: alucontrol <= 3'b110; // SUB, SUBU: only difference is exception
          6'b100100: alucontrol <= 3'b000; // AND
          6'b100101: alucontrol <= 3'b001; // OR
          6'b101010: alucontrol <= 3'b111; // SLT
          default:   alucontrol <= 3'bxxx; // ???
        endcase
    endcase
endmodule
```

### What do lb, lbu, lh, lhu do?

- lb = load byte
  - Load a single 8-bit byte from any memory location (not just word aligned) and place it in the lower 8 bits of the specified register. The upper 24 bits should be the sign extension of bit 7.
- lbu = load byte unsigned
  - Load a single 8-bit byte from any memory location (not just word aligned) and place it in the lower 8 bits of the specified register. The upper 24 bits should be set to zero.
- lh = load half word
  - Load a 16-bit value from any half-word-aligned memory location (not just word aligned) and place it in the lower 16 bits of the specified register. The upper 16 bits should be the sign extension of bit 15.
- lhu = load half word unsigned
  - Load a 16-bit value from any half-word-aligned memory location (not just word aligned) and place it in the lower 16 bits of the specified register. The upper 16 bits should be set to zero.
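The difference between the signed and unsigned loads is only in how the upper bits are filled. A minimal sketch of that extension step is given below; the module name `load_extend` and the control input `is_unsigned` are hypothetical and would need to be produced by your own decoder.

```verilog
// Hypothetical sketch: sign or zero extension of a loaded byte / half word.
module load_extend (
    input  wire [7:0]  byte_in,      // the selected byte
    input  wire [15:0] half_in,      // the selected half word
    input  wire        is_unsigned,  // assumed control: 1 for lbu/lhu, 0 for lb/lh
    output wire [31:0] byte_ext,
    output wire [31:0] half_ext
);
    // Fill bit: the sign bit for lb/lh, zero for lbu/lhu.
    wire byte_fill = ~is_unsigned & byte_in[7];
    wire half_fill = ~is_unsigned & half_in[15];

    assign byte_ext = {{24{byte_fill}}, byte_in};
    assign half_ext = {{16{half_fill}}, half_in};
endmodule
```

With `byte_in` = 0x80, lb gives 0xFFFFFF80 and lbu gives 0x00000080, which is why a test value with bit 7 set distinguishes the two instructions.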

## What extra hardware is needed for lb etc.?



# Currently which address lines go to the Memory?



## Dual Port RAM Megafunction



A: Instruction

#### Full Data Path



#### **Current Data Path**



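The value written to the destination register:

- Could be the ALU result
- Could be the 32-bit word read from memory
- For LH/LHU
  - Could be the sign/zero-extended low half word
  - Could be the sign/zero-extended high half word
- For LB/LBU
  - Could be the sign/zero-extended Byte[0]
  - Could be the sign/zero-extended Byte[1]
  - Could be the sign/zero-extended Byte[2]
  - Could be the sign/zero-extended Byte[3]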




What selects High or Low Half? What selects which Byte?


## How many multiplexers are needed?



- 1. One to select between LW and LH/LB
  - What is its selection signal?
- 2. One to select the appropriate half word or byte
  - What is its selection signal(s)?
- Also need a sign or zero extender




The multiplexer shown is for LH/LHU. LB and LBU would need a 4-input, 8-bit mux with two select lines to select the correct byte.
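One possible form of these selection multiplexers is sketched below. It assumes that the two lowest address bits act as the select lines and that bytes are numbered little-endian (Byte[0] in bits 7:0), as in MARS; both are assumptions to check against your own design. The module and port names are hypothetical.

```verilog
// Hypothetical sketch: pick the addressed byte or half word out of the
// 32-bit word read from memory.
module load_select (
    input  wire [31:0] word,       // 32-bit word read from data memory
    input  wire [1:0]  addr_lo,    // assumed select: low two bits of the byte address
    output reg  [7:0]  byte_sel,   // for LB/LBU
    output wire [15:0] half_sel    // for LH/LHU
);
    // 4-input, 8-bit mux with two select lines.
    always @(*) begin
        case (addr_lo)
            2'b00: byte_sel = word[7:0];    // Byte[0]
            2'b01: byte_sel = word[15:8];   // Byte[1]
            2'b10: byte_sel = word[23:16];  // Byte[2]
            2'b11: byte_sel = word[31:24];  // Byte[3]
        endcase
    end

    // 2-input, 16-bit mux with one select line.
    assign half_sel = addr_lo[1] ? word[31:16] : word[15:0];
endmodule
```

The selected byte or half word would then pass through the sign or zero extender before reaching the register file.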


## Code for testing instructions

- Your code should cover all combinations of inputs.
  - For Boolean operators this means the bit pairs 0,0; 0,1; 1,0; 1,1.
- Your operations should be easy to spot in the ModelSim / SignalTap waveforms.
- The numbers used shouldn't require too much thought to work out whether the result is correct. Which is easier to calculate?
  - 0x01234567 & 0x76543210
  - 0xFFFF0000 & 0xFF00FF00
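The second pair of operands is easier because each byte position holds one of the four input combinations (1,1; 1,0; 0,1; 0,0), so the expected result can be read off byte by byte. The simulation-only sketch below prints the expected results for several Boolean operators; the module name is hypothetical and the code is independent of the MIPS design.

```verilog
// Hypothetical simulation-only sketch: expected results for easy test operands.
module bool_test_values;
    reg [31:0] a, b;

    initial begin
        a = 32'hFFFF0000;
        b = 32'hFF00FF00;
        $display("and = %h", a & b);     // 32'hFF000000
        $display("or  = %h", a | b);     // 32'hFFFFFF00
        $display("xor = %h", a ^ b);     // 32'h00FFFF00
        $display("nor = %h", ~(a | b));  // 32'h000000FF
    end
endmodule
```

Each operator produces a different, easily recognised pattern, so a wrong ALU operation stands out in the waveform.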

## Testing LB, LBU, LH, LHU

- You need to store a 32-bit value in an appropriate memory location.
  - The memory is mapped from 0x0 up to (but not including) 0x2000.
  - You should pick a word-aligned location higher than where your program is stored.
    - Something like 0x200
  - You should use SW to store an appropriate word-aligned 32-bit value.
    - The value used should allow the result to differentiate between LB and LBU (or LH and LHU).
  - For LB/LBU you need to load the 4 bytes using four LB/LBU instructions.
  - For LH/LHU you need to load the 2 half words using two LH/LHU instructions.