

GMU CS695 Advanced Computer Architecture
Spring 2023
Lecture 2

# **Processor Instruction Set Architecture & Language**

Reference:

CMPE 140, Hyeran Jeon

CMPE 140, Donald Hung

CMPE 140, Haonan Wang

Computer Architecture: A Quantitative Approach

Licensed for use under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

Lishan Yang

#### Instruction Set Architecture & Microarchitecture

#### Instruction Set Architecture

 the programmer's view of the computer, defined by a set of instructions that each specifies an operation and its operand(s)

#### Microarchitecture

 the way that the ISA is implemented in hardware



#### What Does the ISA Deal With Specifically?

**Example:** a C program that reads two integer values from the file "mynumbers.txt" and prints their sum.

```
#include <stdio.h>
#include <string.h>
int numbers[2];
void myfunction(void) {
   FILE *fp;
   int size = 2;
   int sum = 0;
   /* Open file for reading */
   fp = fopen("mynumbers.txt", "r");
   /* Read and display data */
   fread(numbers, sizeof(int), size, fp);
   fclose(fp);
   sum = numbers[0] + numbers[1];
   printf("Sum = %d\n", sum);
}
int main (void) {
   myfunction();
   return(0);
}
```

- Processor (CPU)
  - Understands and executes each line of the code
  - Uses fast on-chip memories
- Memory (DRAM)
  - Provides operands to the CPU (\*fp, size, sum, numbers[2])
- Storage (HDD)
  - Provides file inputs and program code (mynumbers.txt)



- Text: program code
- Static data: global variables
  - e.g., static variables in C, constant arrays and strings
- Dynamic data: heap
  - e.g., malloc in C, new in Java
  - Grows from bottom (lower address) to top (higher address)
- Stack: temporary storage for functions
  - e.g., return address of sub-functions, local variables
  - Grows from top to bottom



Figure: the example program above annotated with its memory regions, including the stack region for myfunction.





# Memory







#### Memory Access Is Slow

- All variables are stored in memory, which is outside the CPU
  - Slow...

- How can we execute the operations faster?
  - Use on-chip memory





#### Register File

On-chip memory that stores "Registers"

#### Registers

- Temporary space to hold operand values and calculation results before they are stored back to memory
- Written as "\$" + "register id" in MIPS processors
  - e.g. \$0 : 0<sup>th</sup> register, \$1 : 1<sup>st</sup> register



# **Typical Execution Steps**

1. Load operand values from memory to registers
2. Do computation on registers
3. Move the results from registers to memory



### **Operations in Hardware**

- Hardware can do one operation at a time
  - Arithmetic operations
    - add, sub, mult, div, ...
  - Data movement
    - move, load data, store data, ...
  - Logical operations
    - shift, and, or, xor, ...
  - Conditional operations
    - jump, branch on condition, ...



**ALU (Arithmetic Logic Unit)** 

- Format of assembly instructions that use register operands
  - Command Result, Operand 1, Operand 2
  - E.g. C = A + B → add C, A, B (A, B, C should be replaced by register ids)

#### C code



#### **Operations involved:**

- Value of 'a' is loaded from mem to \$3
- Value of 'b' is loaded from mem to \$2
- Add operation done on \$2 and \$3
- Value of 'c' is stored to mem

#### MIPS Assembly code

```
test():
        addiu   $sp,$sp,-32
        sw      $fp,28($sp)
        move    $fp,$sp
        li      $2,1                    # 0x1
        sw      $2,8($fp)
        li      $2,2                    # 0x2
        sw      $2,12($fp)
        lw      $3,8($fp)
        lw      $2,12($fp)
        addu    $2,$3,$2
        sw      $2,16($fp)
        nop
        move    $sp,$fp
        lw      $fp,28($sp)
        addiu   $sp,$sp,32
        jr      $31                     # return to the caller
        nop
```
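A C function consistent with the listing above can be sketched as follows. This is only a plausible reconstruction, assuming `a`, `b`, and `c` are local `int` variables stored at offsets 8, 12, and 16 from `$fp`; the name `test_sum` and the return value are added here so the result can be checked, and are not part of the lecture.

```c
#include <assert.h>

/* Hypothetical reconstruction: three locals and one addition. */
int test_sum(void) {
    int a = 1;      /* li $2,1 ; sw $2,8($fp)                 */
    int b = 2;      /* li $2,2 ; sw $2,12($fp)                */
    int c = a + b;  /* lw $3 ; lw $2 ; addu ; sw $2,16($fp)   */
    assert(c == 3); /* sanity check on the sketch itself      */
    return c;       /* assumption: added only for checking    */
}
```

One line of C (`c = a + b`) corresponds to two loads, one add, and one store in the listing.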















# Registers of MIPS CPUs

| Assembler Name | Register Number | Description                                       |
|----------------|-----------------|---------------------------------------------------|
| \$zero         | \$0             | Constant 0 value                                  |
| \$at           | \$1             | Assembler temporary                               |
| \$v0-\$v1      | \$2-\$3         | Function return values                            |
| \$a0-\$a3      | \$4-\$7         | Function Arguments                                |
| \$t0-\$t7      | \$8-\$15        | Temporaries                                       |
| \$s0-\$s7      | \$16-\$23       | Saved Temporaries                                 |
| \$t8-\$t9      | \$24-\$25       | Temporaries                                       |
| \$k0-\$k1      | \$26-\$27       | Reserved for OS kernel                            |
| \$gp           | \$28            | Global Pointer (Global and static variables/data) |
| \$sp           | \$29            | Stack Pointer                                     |
| \$fp           | \$30            | Frame Pointer                                     |
| \$ra           | \$31            | Return Address                                    |

#### **HLL vs. Hardware Operations**

- HLL: High-Level Language
- HLLs are designed to be easily understood and written by programmers
- One HLL line may correspond to a combination of multiple assembly instructions



#### A Brief Review

One line of HLL can involve:



#### **Operations of MIPS CPUs**

- Note: A, B, C in the table should be replaced by proper register ids

| C operator                   | Assembly Operations    | Comments                  |
|------------------------------|------------------------|---------------------------|
| C = A + B                    | add C, A, B            | Add two values            |
| C = A - B                    | sub C, A, B            | Subtract one from another |
| C = A * B                    | mul C, A, B            | Multiply two values       |
| C = A & B                    | and C, A, B            | Logical AND operation     |
| C = A \| B                   | or C, A, B             | Logical OR operation      |
| C = A ^ B                    | xor C, A, B            | Logical XOR operation     |
| C = A << shamt               | **sll** C, A, shamt    | Shift left by shamt       |
| C = A >> shamt               | **srl** C, A, shamt    | Shift right by shamt      |
| If (A < B) C = 1; else C = 0 | slt C, A, B            | Set if less than          |
| C = Memory                   | lw C, Memory address   | Load value from memory    |
| Memory = C                   | sw C, Memory address   | Store value to memory     |
| If (A == B) go to Addr       | **beq** A, B, address  | Branch if A == B          |
| Many more                    |                        |                           |

### **Exercise: Register Reuse**

- Example:
  - Assume g, h, i and j are loaded to registers, \$t0, \$t1, \$t2, \$t3
  - HLL:

$$f = (g + h) - (i + j);$$

- Assembly (with add & sub operations)?

- add \$t0, \$t0, \$t1 # \$t0 = g + h
- add \$t2, \$t2, \$t3 # \$t2 = i + j
- sub \$t2, \$t0, \$t2 # \$t2 = (g + h) - (i + j) = f

Register values are maintained unless overwritten
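The register-reuse idea can be checked with a small C model. This is a sketch only: the array `t` stands in for `$t0`..`$t3`, and the function name `compute_f` is an illustrative assumption.

```c
#include <assert.h>
#include <stdint.h>

/* t[0]..t[3] model $t0..$t3, initially holding g, h, i, j. */
int32_t compute_f(int32_t g, int32_t h, int32_t i, int32_t j) {
    int32_t t[4] = { g, h, i, j };
    t[0] = t[0] + t[1];              /* add $t0, $t0, $t1 : g is overwritten */
    t[2] = t[2] + t[3];              /* add $t2, $t2, $t3 : i is overwritten */
    t[2] = t[0] - t[2];              /* sub $t2, $t0, $t2 : f ends up in $t2 */
    assert(t[1] == h && t[3] == j);  /* registers never written keep their values */
    return t[2];
}
```

Reusing `$t0` and `$t2` saves registers, at the cost of losing the original values of g and i.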

### **About MIPS Assembly**

- CPUs use their own assembly languages
  - Assembly language of ARM, Pentium, Opteron... are all different
- MIPS Assembly Features
  - Very similar to ARM Assembly
  - One of the earliest RISC architectures that used pipelined instruction processing

#### Developed by MIPS Technologies

- Now maintained by Wave Computing
- Various generations
  - MIPS I~V
  - MIPS32 (32-bit processor)
  - MIPS64 (64-bit processor)
  - microMIPS...



# **Two types of Computers**

- Reduced Instruction Set Computer (RISC): MIPS, ARM, ...
  - Operands are in registers
  - Each instruction can do only one operation: simple, but longer code
  - Check RISC-V https://riscv.org/
- Complex Instruction Set Computer (CISC): Intel, AMD, ...
  - Operands can be in registers, stacks, accumulators, or memory
  - Each instruction can do multiple operations: complex, but shorter code
  - Preferred when memory was very expensive

#### **MIPS Processor Organization**



#### **About MIPS32 Processor**

- Instructions are 32-bit wide
- Registers and Computing Logic use 32-bit data
- Memory bus is logically 32-bit wide
- 32 general purpose registers (GPRs) for integer and address values
  - A few special ones (e.g. \$zero: constant 0, \$fp: frame pointer, \$sp: stack pointer, ...)
- 32 floating point registers for floating point operations (not our focus)

#### **MIPS Data Sizes**

- Integer: 3 Sizes Defined
  - Byte (B)
    - 8-bits
  - Halfword (H)
    - 16-bits = 2 bytes
  - Word (W)
    - 32-bits = 4 bytes

- Floating-point: 2 Sizes Defined
  - Single (S)
    - 32-bits = 4 bytes
  - Double (D)
    - 64-bits = 8 bytes
    - For a 32-bit data bus, a double needs 2 memory reads

#### Byte-oriented vs. Word-oriented Memory

- Most processors are byte-oriented
  - Can access a word from any byte address
- MIPS: Word-oriented
  - Words must be aligned to multiples of their size
  - Still byte-addressable!
  - Provides some simplicity in design

 Logical views can be arranged in rows of 4-bytes for word-oriented memories





#### **Endian-ness**

- Endian-ness refers to the two alternate methods of ordering the bytes in a larger unit (word, long, etc.)
  - Big-Endian: IBM, SPARC, Motorola
    - Most Significant byte (MSB) is put at the starting (low) address
  - Little-Endian: Intel, DEC
    - Least Significant byte (LSB) is put at the starting (low) address
  - Supporting both
    - MIPS, PowerPC, ARM



#### **Memory Characteristics & Assumptions**



- Half-word and word data are addressed by the lowest byte address among the bytes in the data
- Addresses from left to right follow the same order as addresses from top to bottom
- We will use Little-Endian for MIPS in this course

## **Memory Organization Example 1**

**Example:** If the memory layout is given like below

#### The byte value in address

```
0x0000 is ( 0x3B )
0x0001 is ( 0xF3 )
0x0002 is ( 0x12 )
0x0003 is ( 0x7C )
0x0006 is ( 0x47 )
```



## **Memory Organization Example 2**

Example: If the memory layout in MIPS is given like below

The half-word value in address

```
0x0000 is ( 0xF33B )
0x0001 is ( Incorrect addressing )
0x0002 is ( 0x7C12 )
```

The word value in address

```
0x0002 is ( Incorrect addressing )
0x0004 is ( 0x8E471300 )
```
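The two memory examples can be reproduced with a short C sketch of aligned little-endian loads. The bytes at 0x0000 to 0x0003 and 0x0006 come from Example 1; the bytes at 0x0004, 0x0005, and 0x0007 are derived here from the word value 0x8E471300, assuming little-endian order. The function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes at addresses 0x0000..0x0007 (little-endian layout). */
static const uint8_t mem[8] = { 0x3B, 0xF3, 0x12, 0x7C, 0x00, 0x13, 0x47, 0x8E };

/* Returns false for a misaligned or out-of-range half-word address. */
bool load_half(uint32_t addr, uint16_t *out) {
    assert(out);
    if (addr % 2 != 0 || addr + 2 > sizeof mem) return false;
    *out = (uint16_t)(mem[addr] | (mem[addr + 1] << 8));
    return true;
}

/* Returns false for a misaligned or out-of-range word address. */
bool load_word(uint32_t addr, uint32_t *out) {
    assert(out);
    if (addr % 4 != 0 || addr + 4 > sizeof mem) return false;
    *out = (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
    return true;
}
```

The byte at the lowest address becomes the least significant byte of the loaded value, which is why the half-word at 0x0000 reads as 0xF33B rather than 0x3BF3.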



## Tips to Remember Endianness



### **Basic Instruction Formats**

Example: Format with 3 registers



If \$2 and \$3 hold the integer values 3 and 2, respectively, \$4 will hold ( 5 ) after executing this instruction

Register file: \$0, \$1, \$2 (3), \$3 (2), \$4 (5), ..., \$31

### **R-Type Instructions**

- We call the instructions that use three registers as R-type Instructions
  - 2 registers as source, 1 register for result (destination)

#### • Examples:

```
add  Rd, Rs, Rt    # Rd = Rs + Rt
sub  Rd, Rs, Rt    # Rd = Rs - Rt
mul  Rd, Rs, Rt    # Rd = Rs * Rt
and  Rd, Rs, Rt    # Rd = Rs & Rt
or   Rd, Rs, Rt    # Rd = Rs | Rt
xor  Rd, Rs, Rt    # Rd = Rs ^ Rt
slt  Rd, Rs, Rt    # if (Rs < Rt) Rd = 1; else Rd = 0;
```

- many more

Replace Rd, Rs, Rt with actual registers

#### • Exercise:

- Assume that the initial state of register file is like below
- What is \$2 value after executing all three instructions?

- 1st instruction: 2 = 10 - 8
- 2nd instruction: 10 = 8 + 2
- 3rd instruction: (10 < 2)? No → \$2 = 0

#### Register File

| Register | Value |
|----------|-------|
| \$0      |       |
| \$1      |       |
| \$2      | 0     |
| \$3      | 10    |
| \$4      | 2     |
| ...      |       |
| \$31     |       |
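The exercise can be traced in C. This sketch assumes the three instructions are `sub $4, $2, $3`, `add $3, $3, $4`, and `slt $2, $3, $4` (the sequence listed in the branch example later in this lecture) and that the initial values are \$2 = 10 and \$3 = 8, which is what the worked arithmetic implies; neither is stated explicitly in the text.

```c
#include <assert.h>
#include <stdint.h>

/* r[n] models register $n. */
void run_exercise(int32_t r[32]) {
    assert(r[0] == 0);                /* $0 is the constant zero       */
    r[4] = r[2] - r[3];               /* sub $4, $2, $3 : 2 = 10 - 8   */
    r[3] = r[3] + r[4];               /* add $3, $3, $4 : 10 = 8 + 2   */
    r[2] = (r[3] < r[4]) ? 1 : 0;     /* slt $2, $3, $4 : 10 < 2 ? no  */
}
```

Under those assumptions the final state matches the register file shown: \$2 = 0, \$3 = 10, \$4 = 2.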

### Instructions Using Immediate Value

- What if you want to use an immediate value?
  - E.g. i = i + 1; // add immediate value 1 to i



Example: Instruction format with immediate value

If \$3 holds the integer value 2, \$4 will hold ( 3 ) after executing this instruction

Register file: \$0, \$1, \$2 (3), \$3 (2), \$4 (3), ..., \$31

## **I-Type Instructions**

- We call the instructions that use two registers and one immediate value as I-type Instructions
  - 1 register and 1 immediate value as source, 1 register for result

#### • Examples:

```
addi  Rt, Rs, imm    # Rt = Rs + imm
subi  Rt, Rs, imm    # Rt = Rs - imm
andi  Rt, Rs, imm    # Rt = Rs & imm
ori   Rt, Rs, imm    # Rt = Rs | imm
xori  Rt, Rs, imm    # Rt = Rs ^ imm
beq   Rt, Rs, imm    # if (Rt == Rs) goto imm; else continue
lw    Rt, imm(Rs)    # load a word from (imm + Rs) address to Rt
slti  Rt, Rs, imm    # if (Rs < imm) Rt = 1; else Rt = 0;
# ... many more
```

## I-Type: Branch Instructions

#### Conditional Branches

- Branches only if a particular condition is true
  - E.g., Compares Rs, Rt. If EQ/NE, branch to label, else continue

#### Example:



Check the condition of the if statement. Depending on the result, either a++ or a-- is executed.

```
sub $4, $2, $3
add $3, $3, $4
slt $2, $3, $4
```

Without a branch, instructions are executed sequentially, line by line.

A branch instruction checks the condition and either executes the next line or jumps to the specified line.

#### Conditional Branches

Branch if the condition is satisfied

```
    beq Rs, Rt, imm # if (Rs == Rt) goto imm; else continue
    bne Rs, Rt, imm # if (Rs != Rt) goto imm; else continue
```

#### Unconditional Branches

Always branch to label

```
    b imm # goto imm
    beq $0, $0, imm # goto imm
```

Comparing the same register values
→ always equal

#### **Pseudo-instruction**

The instruction "b imm" is not actually supported by the hardware.

The assembler replaces "b imm" with "beq \$0, \$0, imm"

- Memory operations use special format to present memory address
- Example:



Memory Instruction Example:

lw \$4, 4(\$3) # \$4 = the word of data at address (\$3 + 4)

If the initial state of the register file and memory is as illustrated, the word at 0x7004 is loaded into \$4, so \$4 holds (10) after executing this instruction



Load: move data from memory to register file

```
    lb Rt, imm(Rs) # Rt = 1-byte data in (Rs + imm)
    lh Rt, imm(Rs) # Rt = 2-byte (half-word) data in (Rs + imm)
    lw Rt, imm(Rs) # Rt = 4-byte (word) data in (Rs + imm)
```

Store: move data from register file to memory

```
    sb Rt, imm(Rs) # (Rs + imm) = 1-byte data in Rt
    sh Rt, imm(Rs) # (Rs + imm) = 2-byte (half-word) data in Rt
    sw Rt, imm(Rs) # (Rs + imm) = 4-byte (word) data in Rt
```

#### • Example:

- The initial state of register file and memory are like below
- What is \$2 value after executing each of the following instructions?

```
lw $2, 0x40($3) # $2 = word in 0x7040
lb $2, -1($4) # $2 = byte in 0x704B
lh $2, -6($4) # $2 = 2 bytes in 0x7046
```

Note that lb and lh load signed data that is shorter than a 4-byte word, so the data must be sign-extended to fill the 32 bits of the register.



#### • Example:

- The initial state of register file and memory are like below
- How is the memory updated?

```
sw $2, 0x40($3) # 0x7040 = word in $2
sb $2, -5($4) # 0x7047 = byte in $2
sh $2, -2($4) # 0x704A = 2 bytes in $2
```

When you store 1 or 2 bytes to memory, the other bytes in the same word remain unchanged



### Signed vs. Unsigned Instructions

- We use sign extension for lb and lh
  - This is because lb and lh are signed instructions

#### Unsigned instructions

- No sign extension is needed because the values are always regarded as positive; just fill the remaining bytes with zeros
- lbu \$2, -1(\$4) # load byte unsigned
  - If lb in the earlier example were lbu, \$2 would be updated with 0x000000F8
- lhu \$2, -6(\$4) # load half-word unsigned
  - If lh in the earlier example were lhu, \$2 would be updated with 0x00001349
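The difference between sign extension (lb, lh) and zero extension (lbu, lhu) can be sketched in C. The byte 0xF8 and half-word 0x1349 are the values implied by the unsigned results quoted above; the function names are illustrative and take the already-fetched data rather than an address.

```c
#include <assert.h>
#include <stdint.h>

/* lb: copy bit 7 into the upper 24 bits. */
uint32_t extend_byte_signed(uint8_t b) {
    return (b & 0x80u) ? (0xFFFFFF00u | b) : b;
}

/* lbu: upper 24 bits are filled with zeros. */
uint32_t extend_byte_unsigned(uint8_t b) {
    return b;
}

/* lh: copy bit 15 into the upper 16 bits. */
uint32_t extend_half_signed(uint16_t h) {
    return (h & 0x8000u) ? (0xFFFF0000u | h) : h;
}

/* lhu: upper 16 bits are filled with zeros. */
uint32_t extend_half_unsigned(uint16_t h) {
    return h;
}
```

For 0xF8 the two byte loads differ (0xFFFFFFF8 vs. 0x000000F8); for 0x1349 the sign bit is 0, so lh and lhu give the same result.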

- Other instructions also have unsigned versions
  - addu, addiu, subu, divu, multu, ...

#### Example:

Assume that \$2 = 0xFFFFFFFF, \$3 = 0x00000001, what is \$1 in each instruction?

• slt \$1, \$2, \$3 # signed set less than # -1 < +1 → \$1 = 1

• sltu \$1, \$2, \$3 # unsigned set less than # +4,294,967,295 > +1 → \$1 = 0
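The same comparison can be sketched in C, where the only difference between the two operations is how the 32-bit patterns are interpreted. The function names mirror the instructions; the cast to `int32_t` assumes the usual two's-complement conversion.

```c
#include <assert.h>
#include <stdint.h>

/* slt: compare the bit patterns as signed integers. */
uint32_t slt(uint32_t rs, uint32_t rt) {
    return ((int32_t)rs < (int32_t)rt) ? 1u : 0u;
}

/* sltu: compare the same bit patterns as unsigned integers. */
uint32_t sltu(uint32_t rs, uint32_t rt) {
    return (rs < rt) ? 1u : 0u;
}
```

With `rs = 0xFFFFFFFF` and `rt = 0x00000001`, `slt` sees -1 < +1 while `sltu` sees 4,294,967,295 > 1.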

### **Exercise**

- Translate the given high-level language code to MIPS assembly.
  - Assume that the address of integer variables x, y, and z are in 0x400, 0x404, and 0x408 in the memory, respectively
  - \$1's initial value is 0x400

#### High-level language

x = y + z;

#### Tasks to do:

- 1) Load y value to a register
- 2) Load z value to another register
- 3) Run add operation
- 4) Store the result to memory
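The four tasks can be traced with a small C model of the load/store architecture. This is one possible solution sketch: the register choices (\$2, \$3, \$4) and the helper names are assumptions, while the addresses and the initial value of \$1 come from the exercise.

```c
#include <assert.h>
#include <stdint.h>

#define BASE 0x400u
static int32_t words[3];   /* x at 0x400, y at 0x404, z at 0x408 */
static int32_t reg[32];    /* reg[n] models register $n          */

/* Maps a word-aligned address in [0x400, 0x40C) to its storage. */
static int32_t *word_at(uint32_t addr) {
    assert(addr % 4 == 0 && addr >= BASE && addr < BASE + sizeof words);
    return &words[(addr - BASE) / 4];
}

static void lw(int rt, int32_t imm, int rs) { reg[rt] = *word_at((uint32_t)(reg[rs] + imm)); }
static void sw(int rt, int32_t imm, int rs) { *word_at((uint32_t)(reg[rs] + imm)) = reg[rt]; }

int32_t run_x_equals_y_plus_z(int32_t y, int32_t z) {
    words[1] = y;
    words[2] = z;
    reg[1] = 0x400;             /* given: $1 = 0x400              */
    lw(2, 4, 1);                /* lw  $2, 4($1)   # $2 = y       */
    lw(3, 8, 1);                /* lw  $3, 8($1)   # $3 = z       */
    reg[4] = reg[2] + reg[3];   /* add $4, $2, $3  # $4 = y + z   */
    sw(4, 0, 1);                /* sw  $4, 0($1)   # x = $4       */
    return words[0];
}
```

Each variable is reached as an offset from the base address in \$1, so a single base register serves all three.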

#### MIPS

- All MIPS instructions are 32 bits wide
  - Within those 32 bits, the command and three register ids must be encoded

### add Rd, Rs, Rt





#### Opcode

- Together with the Function field, indicates the command/operation of an instruction
- Example:

| Instruction | Opcode | Rs | Rt | Rd | Shamt | Function |
|-------------|--------|----|----|----|-------|----------|
| add         | 0      | Rs | Rt | Rd | Shamt | 0x20     |
| and         | 0      | Rs | Rt | Rd | Shamt | 0x24     |
| or          | 0      | Rs | Rt | Rd | Shamt | 0x25     |

- many more
- R-type instructions <u>always</u> have all zeros in the Opcode field and use the Function field to distinguish the instructions



#### Rs, Rt, Rd

- Indicate the source and destination register ids used by an instruction
- Example:

| Instruction           | Opcode | Rs        | Rt        | Rd        | Shamt | Function |
|-----------------------|--------|-----------|-----------|-----------|-------|----------|
| add \$5, \$4, \$3     | 000000 | \$4       | \$3       | \$5       | Shamt | 100000   |
| and \$13, \$8, \$2    | 000000 | \$8       | \$2       | \$13      | Shamt | 100100   |
| or \$t8, \$s0, \$s1   | 000000 | \$s0 (16) | \$s1 (17) | \$t8 (24) | Shamt | 100101   |

- Each register field has 5 bits because MIPS has 32 (= 2<sup>5</sup>) registers



#### Shamt

- Indicates the shift amount, which is used by shift instructions only; the other R-type instructions have all zeros in this field
- Example:

| Instruction           | Opcode | Rs    | Rt    | Rd    | Shamt | Function |
|-----------------------|--------|-------|-------|-------|-------|----------|
| add \$5, \$4, \$3     | 000000 | 00100 | 00011 | 00101 | 00000 | 100000   |
| and \$13, \$8, \$2    | 000000 | 01000 | 00010 | 01101 | 00000 | 100100   |
| or \$t8, \$s0, \$s1   | 000000 | 10000 | 10001 | 11000 | 00000 | 100101   |
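The field layout above (6-bit opcode, three 5-bit register ids, 5-bit shamt, 6-bit function) can be packed into a 32-bit word with shifts and ORs. The following C sketch uses an illustrative function name, `encode_r`.

```c
#include <assert.h>
#include <stdint.h>

/* Packs an R-type instruction: opcode(6)=0 | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6). */
uint32_t encode_r(uint32_t rs, uint32_t rt, uint32_t rd, uint32_t shamt, uint32_t funct) {
    assert(rs < 32 && rt < 32 && rd < 32 && shamt < 32 && funct < 64);
    return (0u << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct;
}
```

For example, `encode_r(4, 3, 5, 0, 0x20)` packs `add $5, $4, $3` into the single word 0x00832820.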

### **Shift Instructions**

- Special format R-type Instruction
  - The second operand is an immediate value (shamt) that indicates the amount of shift

#### Logical Shifting

- Shift towards left/right with filling in 0s
- Example:

```
    sll Rd, Rt, shamt # Rd = Rt << shamt (Rt: unsigned data)
    srl Rd, Rt, shamt # Rd = Rt >> shamt (Rt: unsigned data)
```

#### Arithmetic Shifting

- Shift right while preserving the value's sign bit
- Example:

```
sra Rd, Rt, shamt # Rd = Rt >> shamt (Rt: signed data)
```

# **Example: Logical Shift**

- Shift towards left/right with filling in 0s
- Example:
  - Assume that the original value of \$4 is 0x0000000C (= +12)
  - What is the value of \$5 after executing each of the following shift instructions?
  - sll \$5, \$4, 3 # shift left logical the value of \$4 by 3 bits
  - srl \$5, \$4, 2 # shift right logical the value of \$4 by 2 bits
- Shift operations work at the bit level, so translate the given value to binary first
- Logical right shift by 2 bits: \$5 = 0 0 ... 0 0 1 1 = 0x00000003 = +3
  - "Shift right N bits" is used for division by 2<sup>N</sup>
- Logical left shift by 3 bits: \$5 = 0 ... 0 1 1 0 0 0 0 0 = 0x00000060 = +96
  - "Shift left N bits" is used for multiplication by 2<sup>N</sup>. Careful: could overflow!

# **Example: Arithmetic Shift**

- Shift right with replicating MSB
- Shift left with filling in 0s
- Example:
  - Assume that the original value of \$4 is 0xFFFFFFFC (= -4)
  - What is the value of \$5 after executing the following shift instruction?
  - sra \$5, \$4, 2 # shift right arithmetic the value of \$4 by 2 bits
- Arithmetic right shift by 2 bits: the MSB is replicated and shifted in
  - \$5 = 1 1 ... 1 1 1 1 = 0xFFFFFFFF = -1
  - If we shifted in 0s (like a logical right shift), our result would be a positive number and the division wouldn't work
- Arithmetic left shift by 3 bits (sla \$5, \$4, 3?): 0s are shifted in
  - \$5 = 1 ... 1 1 1 0 0 0 0 0 = 0xFFFFFFE0 = -32
  - Notice there is no difference between an arithmetic and a logical left shift. We always shift in 0s.
  - MIPS uses sll for both logical and arithmetic left shifts (there is no separate sla instruction)



#### Shamt

- Indicates the shift amount, which is used by shift instructions only; the other R-type instructions have all zeros in this field
- Example:

| Instruction       | Opcode | Rs    | Rt    | Rd    | Shamt | Function |
|-------------------|--------|-------|-------|-------|-------|----------|
| sll \$5, \$4, 3   | 000000 | 00000 | 00100 | 00101 | 3     | 000000   |
| srl \$5, \$4, 2   | 000000 | 00000 | 00100 | 00101 | 2     | 000010   |
| sra \$5, \$4, 2   | 000000 | 00000 | 00100 | 00101 | 2     | 000011   |
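The behavior of the three shift instructions can be sketched in C. Because right-shifting a negative signed value is implementation-defined in C, this sketch works on `uint32_t` and replicates the sign bit explicitly for `sra`; the function names simply mirror the instruction names.

```c
#include <assert.h>
#include <stdint.h>

/* sll: shift left, filling with 0s. */
uint32_t sll(uint32_t rt, unsigned shamt) {
    assert(shamt < 32);
    return rt << shamt;
}

/* srl: shift right, filling with 0s. */
uint32_t srl(uint32_t rt, unsigned shamt) {
    assert(shamt < 32);
    return rt >> shamt;
}

/* sra: shift right, replicating the MSB into the vacated bits. */
uint32_t sra(uint32_t rt, unsigned shamt) {
    assert(shamt < 32);
    uint32_t shifted = rt >> shamt;
    if ((rt & 0x80000000u) && shamt > 0) {
        shifted |= ~(0xFFFFFFFFu >> shamt);  /* set the top shamt bits to 1 */
    }
    return shifted;
}
```

Applied to the lecture's values, `sll(0x0000000C, 3)` gives 0x60 (+96), `srl(0x0000000C, 2)` gives 0x3 (+3), and `sra(0xFFFFFFFC, 2)` gives 0xFFFFFFFF (-1).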

- All MIPS instructions are 32 bits wide
  - Within those 32 bits, the command, two register ids, and an immediate value must be encoded

### addi Rt, Rs, imm





#### Opcode

- Indicates the command/operation of an instruction
- Example:

| Instruction | Opcode | Rs | Rt | Immediate |
|-------------|--------|----|----|-----------|
| addi        | 001000 | Rs | Rt | Immediate |
| andi        | 001100 | Rs | Rt | Immediate |
| ori         | 001101 | Rs | Rt | Immediate |

- many more



#### Rs, Rt

- Indicate the source and destination register ids used by an instruction
- Example:

| Instruction            | Opcode | Rs        | Rt        | Immediate |
|------------------------|--------|-----------|-----------|-----------|
| addi \$5, \$4, 1       | 001000 | \$4       | \$5       | Immediate |
| andi \$13, \$12, 0xA   | 001100 | \$12      | \$13      | Immediate |
| ori \$t3, \$t2, 12     | 001101 | \$t2 (10) | \$t3 (11) | Immediate |



#### Immediate

- Indicates the immediate value
- Example:

| Instruction            | Opcode | Rs    | Rt    | Immediate |
|------------------------|--------|-------|-------|-----------|
| addi \$5, \$4, 1       | 001000 | 00100 | 00101 | 1         |
| andi \$13, \$12, 0xA   | 001100 | 01100 | 01101 | 0xA       |
| ori \$t3, \$t2, 12     | 001101 | 01010 | 01011 | 12        |
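The I-type layout (6-bit opcode, two 5-bit register ids, 16-bit immediate) can be packed the same way as the R-type fields. In this C sketch the name `encode_i` is illustrative; negative immediates are stored as their low 16 bits in two's complement.

```c
#include <assert.h>
#include <stdint.h>

/* Packs an I-type instruction: opcode(6) | rs(5) | rt(5) | immediate(16). */
uint32_t encode_i(uint32_t opcode, uint32_t rs, uint32_t rt, int32_t imm) {
    assert(opcode < 64 && rs < 32 && rt < 32);
    assert(imm >= -32768 && imm <= 65535);  /* must fit in 16 bits */
    return (opcode << 26) | (rs << 21) | (rt << 16) | ((uint32_t)imm & 0xFFFFu);
}
```

For example, `encode_i(0x08, 4, 5, 1)` packs `addi $5, $4, 1` into 0x20850001. Note that the destination register sits in the Rt field, not in a separate Rd field.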

- Special I-Type instructions: Load/Store
  - Within the 32 bits, the command, two register ids, and an offset value must be encoded





#### Load/Store in I-Type machine code format

- Example:

| Instruction            | Opcode | Rs    | Rt    | Immediate |
|------------------------|--------|-------|-------|-----------|
| lb \$5, 4(\$4)         | 0x20   | 00100 | 00101 | 4         |
| lh \$13, 0x10(\$12)    | 0x21   | 01100 | 01101 | 0x10      |
| lw \$t3, -16(\$t2)     | 0x23   | 01010 | 01011 | -16       |
| sb \$3, 0xfff0(\$4)    | 0x28   | 00100 | 00011 | 0xfff0    |
| sh \$2, 8(\$8)         | 0x29   | 01000 | 00010 | 8         |
| sw \$s1, 10(\$s0)      | 0x2B   | 10000 | 10001 | 10        |

- Special I-Type instructions: Branch
  - uses the same format but needs a special treatment for immediate field value



# **Branch Target Addressing**

#### PC-relative addressing

- MIPS calculates the branch target address based on the branch instruction's next instruction's address
- "Branch to N lines above or below"; not branch to absolute address
- necessary because in many cases, we do not know (at compile time) where our code will be loaded in the memory space

Assume that the code is loaded to **0x00080000** in memory

| Address | Instruction             |
|---------|-------------------------|
| 80000   | Loop: sll \$t1, \$s3, 2 |
| 80004   | add \$t1, \$t1, \$s6    |
| 80008   | lw \$t0, 0(\$t1)        |
| 8000C   | bne \$t0, \$s5, Exit    |
| 80010   | addi \$s3, \$s3, 1      |
| 80014   | b Loop                  |
| 80018   | Exit: ...               |

Each instruction is 4 bytes long, so the address of each assembly line increases by 4 bytes

- Why should the branch's next instruction be considered?
  - Because MIPS increments the program counter (PC) by 4 before executing an instruction
  - This already-incremented PC value is used for the branch target address calculation
- Branch Target Address Calculation
  - Target address = branch's next instruction address + offset x 4
  - bne \$t0, \$s5, Exit
    - Exit (0x00080018) = addi address (0x00080010) + distance (2) x 4 bytes/inst
  - b Loop
    - Loop (0x00080000) = Exit address (0x00080018) + distance (-6) x 4 bytes/inst

| 80000 | Loop: sll \$t1, \$s3, 2 |
|-------|-------------------------|
| 80004 | add \$t1, \$t1, \$s6    |
| 80008 | lw \$t0, 0(\$t1)        |
| 8000C | bne \$t0, \$s5, Exit    |
| 80010 | addi \$s3, \$s3, 1      |
| 80014 | b Loop                  |
| 80018 | Exit:                   |
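The PC-relative calculation can be sketched in C in both directions: from addresses to the offset stored in the instruction, and from the offset back to the target. The function names are illustrative; the addresses in the usage note are those of the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* Offset (in instructions) stored in the branch's immediate field. */
int32_t branch_offset(uint32_t branch_addr, uint32_t target_addr) {
    assert(branch_addr % 4 == 0 && target_addr % 4 == 0);
    return ((int32_t)target_addr - (int32_t)(branch_addr + 4)) / 4;
}

/* Target address = address of the branch's next instruction + offset x 4. */
uint32_t branch_target(uint32_t branch_addr, int32_t offset) {
    assert(branch_addr % 4 == 0);
    return branch_addr + 4 + (uint32_t)offset * 4u;
}
```

For `bne` at 0x0008000C targeting Exit at 0x00080018 the offset is 2; for `b Loop` at 0x00080014 targeting 0x00080000 it is -6.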

## **Machine Code of I-Type Instructions**



| 80000  | Loop: sll \$t1, \$s3, 2 |
|--------|-------------------------|
| 80004  | add \$t1, \$t1, \$s6    |
| 80008  | lw \$t0, 0(\$t1)        |
| 8000C  | bne \$t0, \$s5, Exit    |
| 80010  | addi \$s3, \$s3, 1      |
| 80014  | b Loop                  |
| 80018  | Exit:                   |

### Branch in I-Type machine code format

- Example:

| Instruction            | Opcode | Rs    | Rt    | Immediate |
|------------------------|--------|-------|-------|-----------|
| bne \$t0, \$s5, Exit   | 0x5    | 01000 | 10101 | 2         |
| beq \$0, \$0, Loop     | 0x4    | 00000 | 00000 | -6        |

# Loading an Immediate

- What if you want to load an immediate value to a register?
- If immediate (constant) is 16 bits or less
  - Use ori or addi instruction with \$0 register
  - Examples: You want to load value 1 to \$2

```
addi $2, $0, 1  # R[2] = 0 + 1 = 1
ori $2, $0, 0x1  # R[2] = 0 | 1 = 1
```

- If immediate is more than 16 bits
  - Immediates are limited to 16 bits, so we must load the constant with a 2-instruction sequence using the special LUI (Load Upper Immediate) instruction
  - LUI: the immediate value is loaded into the most significant 16 bits of the target register
  - To load \$2 with 0x12345678
    - lui \$2, 0x1234 # R[2] = 0x12340000
    - ori \$2, \$2, 0x5678 # R[2] = 0x12340000 | 0x00005678 = 0x12345678
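The two-instruction sequence can be modeled in C as a shift followed by an OR. The helper names mirror the instructions, and `load_constant` is an illustrative wrapper that splits any 32-bit constant into its upper and lower halves.

```c
#include <assert.h>
#include <stdint.h>

/* lui: place the 16-bit immediate in the upper half; lower half becomes 0. */
uint32_t lui(uint16_t imm) {
    return (uint32_t)imm << 16;
}

/* ori: OR the zero-extended 16-bit immediate into the register value. */
uint32_t ori(uint32_t rs, uint16_t imm) {
    return rs | imm;
}

/* Builds a 32-bit constant the way "lui $2, hi; ori $2, $2, lo" would. */
uint32_t load_constant(uint32_t value) {
    uint32_t r2 = lui((uint16_t)(value >> 16));
    r2 = ori(r2, (uint16_t)(value & 0xFFFFu));
    assert(r2 == value);  /* the two halves recombine exactly */
    return r2;
}
```

`ori` is the natural partner for `lui` because it zero-extends its immediate, so the upper half written by `lui` is left untouched.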

### **Mult & Div Instructions**



Mult & Div use special registers, lo and hi

### Multiplication

```
mult Rs, Rt   # lo = lower 32 bits of Rs * Rt
              # hi = upper 32 bits of Rs * Rt
```

### Division

```
div Rs, Rt    # lo = quotient of Rs / Rt
              # hi = remainder of Rs / Rt
```

mflo and mfhi move the contents of the lo/hi registers to a GPR (General Purpose Register)

- mflo \$2 # \$2 = lo
- mfhi \$3 # \$3 = hi
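How the result is split across hi and lo can be sketched in C with a 64-bit intermediate. The struct `HiLo` and the names `mips_mult` and `mips_div` are illustrative, and division by zero is excluded by an assertion here rather than modeled.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t hi; uint32_t lo; } HiLo;

/* mult: signed 32 x 32 -> 64-bit product, split into hi and lo. */
HiLo mips_mult(int32_t rs, int32_t rt) {
    uint64_t product = (uint64_t)((int64_t)rs * (int64_t)rt);
    HiLo r = { (uint32_t)(product >> 32), (uint32_t)(product & 0xFFFFFFFFu) };
    return r;
}

/* div: lo = quotient, hi = remainder. */
HiLo mips_div(int32_t rs, int32_t rt) {
    assert(rt != 0);
    assert(!(rs == INT32_MIN && rt == -1));  /* would overflow in C */
    HiLo r = { (uint32_t)(rs % rt), (uint32_t)(rs / rt) };
    return r;
}
```

A product such as 0x10000 x 0x10000 does not fit in 32 bits: lo is 0 and the result is only visible in hi, which is why both registers exist.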

### Machine Code of Mult/Div/Mflo/Mfhi



### Mult/Div/Mflo/Mfhi

- Example:

| Instruction      | Opcode | Rs    | Rt    | Rd    | Shamt | Function |
|------------------|--------|-------|-------|-------|-------|----------|
| mult \$5, \$4    | 000000 | 00101 | 00100 | 00000 | 00000 | 0x18     |
| div \$5, \$4     | 000000 | 00101 | 00100 | 00000 | 00000 | 0x1a     |
| mflo \$5         | 000000 | 00000 | 00000 | 00101 | 00000 | 0x12     |
| mfhi \$4         | 000000 | 00000 | 00000 | 00100 | 00000 | 0x10     |

# J-Type Instruction

- There is another type, J, in MIPS for Jump instruction
  - Similar to unconditional branch

### j target # Jump to target



# **Jump Target Addressing**

- The jump instruction provides a larger jump range than a branch
- Target address = first 4 bits of the jump's next instruction address : last 28 bits of (target x 4), where ":" denotes concatenation
- Example: j Loop (0x00080000)
  - The first 4 bits of Exit (the jump's next instruction address, 0x00080018) = 0x0
  - Last 28 bits of target x 4 = 0x0080000
  - target field = 0x0020000



32 bits



## **Machine Code of J-Type Instructions**



- Jump in J-Type machine code format
  - Example:

| Instruction | Opcode | Target    |
|-------------|--------|-----------|
| j Loop      | 0x2    | 0x0020000 |
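The jump target calculation and the J-type encoding can be sketched in C. The function names are illustrative; the 26-bit target field holds the word address (byte address divided by 4), and the top 4 bits come from the address of the instruction after the jump.

```c
#include <assert.h>
#include <stdint.h>

/* 26-bit target field stored in the instruction for a given byte address. */
uint32_t jump_field(uint32_t target_addr) {
    assert(target_addr % 4 == 0);
    return (target_addr >> 2) & 0x03FFFFFFu;
}

/* Target address = top 4 bits of next_pc : (field x 4) as the low 28 bits. */
uint32_t jump_target(uint32_t next_pc, uint32_t field) {
    return (next_pc & 0xF0000000u) | ((field & 0x03FFFFFFu) << 2);
}

/* Packs a j instruction: opcode(6)=0x2 | target(26). */
uint32_t encode_j(uint32_t field) {
    return (0x2u << 26) | (field & 0x03FFFFFFu);
}
```

For `j Loop` with Loop at 0x00080000, the field is 0x0020000 and the full instruction word is 0x08020000.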

# Loops in MIPS: While Loops

- Loop code consists of
  - Condition check code
    - To decide to continue the loop iteration or exit
  - Jump to the loop entry to run the next iteration

### Example: Find x where $2^x = 128$

High-level language

```
int pow = 1;
int x = 0;

while (pow != 128)
{
    pow = pow * 2;
    x = x + 1;
}
```

```
# $s0 = pow, $s1 = x

        addi $s0, $0, 1        # pow = 1
        add  $s1, $0, $0       # x = 0
        addi $t0, $0, 128      # $t0 = 128
while:  beq  $s0, $t0, done    # exit the loop when pow == 128
        sll  $s0, $s0, 1       # pow = pow * 2
        addi $s1, $s1, 1       # x = x + 1
        j    while

done:
```

## Loops in MIPS: For Loop

- Loop code consists of
  - Condition check code
    - To decide to continue the loop iteration or exit
  - Jump to the loop entry to run the next iteration

### **Example: Add numbers from 0 to 9**

High-level language

```
int sum = 0;
int i;

for (i = 0; i != 10; i++)
{
    sum = sum + i;
}
```

```
# $s0 = i, $s1 = sum

        add  $s0, $0, $0       # i = 0
        add  $s1, $0, $0       # sum = 0
        addi $t0, $0, 10       # $t0 = 10
for:    beq  $s0, $t0, done    # exit the loop when i == 10
        add  $s1, $s1, $s0     # sum = sum + i
        addi $s0, $s0, 1       # i++
        j    for

done:
```

# **Less Than Comparisons**

The previous for loop can be rewritten by using "less than" operation like below

**Example: Add numbers from 0 to 9** 

High-level language

```
int sum = 0;
int i;

for (i = 0; i < 10; i++)
{
    sum = sum + i;
}
```

```
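# $s0 = i, $s1 = sum
        add  $s0, $0, $0       # i = 0
        add  $s1, $0, $0       # sum = 0
        addi $t0, $0, 10       # $t0 = 10
for:    slt  $t1, $s0, $t0     # $t1 = (i < 10) ? 1 : 0
        beq  $t1, $0, done     # exit the loop when i >= 10
        add  $s1, $s1, $s0     # sum = sum + i
        addi $s0, $s0, 1       # i++
        j    for
done: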
```

# **Less Than Comparisons**

To reduce the number of instructions, you can also use slti or sltiu instead of slt

**Example: Add numbers from 0 to 9** 

### High-level language

```
int sum = 0;
int i;

for (i = 0; i < 10; i++)
{
    sum = sum + i;
}
```

```
# $s0 = i, $s1 = sum
        add  $s0, $0, $0       # i = 0
        add  $s1, $0, $0       # sum = 0
for:    slti $t1, $s0, 10      # $t1 = (i < 10) ? 1 : 0
        beq  $t1, $0, done     # exit the loop when i >= 10
        add  $s1, $s1, $s0     # sum = sum + i
        addi $s0, $s0, 1       # i++
        j    for
done:
```

# Calling a Subroutine

How are subroutines executed?

How can we jump back to the caller and resume execution from the line immediately following the subroutine call?

→ We should record the return address before jumping to the subroutine

# **Special Registers**

| Assembler Name     | Register Number | Description                                       |
|--------------------|-----------------|---------------------------------------------------|
| \$zero             | \$0             | Constant 0 value                                  |
| \$at               | \$1             | Assembler temporary                               |
| \$v0-\$v1          | \$2-\$3         | Function return values                            |
| \$a0-\$a3          | \$4-\$7         | Function Arguments                                |
| \$t0-\$t7          | \$8-\$15        | Temporaries                                       |
| \$s0-\$s7          | \$16-\$23       | Saved Temporaries                                 |
| \$t8-\$t9          | \$24-\$25       | Temporaries                                       |
| \$k0-\$k1          | \$26-\$27       | Reserved for OS kernel                            |
| \$gp               | \$28            | Global Pointer (Global and static variables/data) |
| \$sp               | \$29            | Stack Pointer                                     |
| \$fp               | \$30            | Frame Pointer                                     |
| \$ra               | \$31            | Return Address                                    |

The **\$ra** register holds the return address

# **Jump and Link Instruction**

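- Most assembly languages provide a special jump (or branch) instruction that jumps and also updates the return address register
- In MIPS, this is the jump and link instruction: jal Label, where Label is the target code line to jump to. It does two things:
  - 1. Jumps to Label
  - 2. Updates \$ra with the return address (the address of the instruction following the jal)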

# Calling a Subroutine

How can we jump back to address in \$ra?

```
int sub_func(void)
{
    /* ... */
}

int main ()
{
    a = 1;
    b = sub_func();
    a += b;
}

When main calls sub_func:
$pc ← address of first line of sub_func
$ra ← address of a += b;
```

# Jump with Register

We can return to the Caller by executing the Jump with Register instruction (jr) with \$ra as its operand

jr \$ra (jump to the address in \$ra, i.e. \$pc = \$ra)

# Jump with Register

Assume that the addresses in the previous example are like below





- Translate the given high-level language code to MIPS assembly.
  - Assume that the addresses of integer variables x and y are in \$4 and \$1, respectively.

### High-level language

```
if (x > y)
    x = x + y;
else
    x = 1;
```

#### Tasks to do:

- 1) Load the value of x into a register
- 2) Load the value of y into another register
- 3) Check if y is less than x
- 4) If true, run the add operation
- 5) Else, put 1 in the result register
- 6) Store the result value to x in memory


After the add operation in the true case, execution should continue to the store without executing the else code

→ Add an unconditional branch after the add to skip the addi
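
The six tasks, including the unconditional branch that skips the else code, can be sketched in C with goto so that each statement corresponds to one step. This is an illustration under stated assumptions, not the exercise's reference solution: the function name `update_x`, the pointer parameters standing in for the addresses held in registers, and the label names are all hypothetical.

```c
/* Sketch: if/else lowered to a conditional branch plus an
   unconditional branch. x_addr and y_addr stand in for the
   registers that hold the addresses of x and y. */
void update_x(int *x_addr, const int *y_addr)
{
    int x = *x_addr;        /* 1) load x into a "register"          */
    int y = *y_addr;        /* 2) load y into another "register"    */
    int result;

    if (!(y < x))           /* 3) check y < x; if false, go to else */
        goto else_part;
    result = x + y;         /* 4) true case: run the add            */
    goto store;             /* unconditional branch: skip the else  */

else_part:
    result = 1;             /* 5) else case: result is 1            */

store:
    *x_addr = result;       /* 6) store the result to x in memory   */
}
```

Without the `goto store;` line, the true case would fall through into the else code and overwrite the sum with 1.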



## **Parameter Passing**

How can we pass the parameters to/from a subroutine?

```
int sub_func(int a)
{
    /* ... */
}

int main ()
{
    a = 1;
    b = sub_func(c);
    a += b;
}

Input parameter "c" should be passed
Output "b" should be returned
```

## Registers For Parameter Passing

### Input Parameters

- Up to 4 parameters in \$a0 ~ \$a3
- If there are more than 4 parameters, use the stack for the 5th parameter onward

### Return Value

- For 32-bit return value, \$v0 is used
- \$v1 is also used when the return value is 64 bits long
  - i.e. \$v0 holds the bottom 32 bits and \$v1 holds the top 32 bits
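
The split of a 64-bit return value across the two registers can be sketched in C as follows. The struct `ret_regs` and the functions `split_return` and `join_return` are hypothetical names used only for illustration; they model \$v0 as the bottom 32 bits and \$v1 as the top 32 bits, as stated above.

```c
#include <stdint.h>

/* Hypothetical model of the two return-value registers. */
struct ret_regs {
    uint32_t v0;   /* bottom 32 bits of the return value */
    uint32_t v1;   /* top 32 bits of the return value    */
};

/* Callee side: place a 64-bit value into the $v0/$v1 pair. */
struct ret_regs split_return(uint64_t value)
{
    struct ret_regs r;
    r.v0 = (uint32_t)(value & 0xFFFFFFFFu);
    r.v1 = (uint32_t)(value >> 32);
    return r;
}

/* Caller side: rebuild the 64-bit value from the pair. */
uint64_t join_return(struct ret_regs r)
{
    return ((uint64_t)r.v1 << 32) | r.v0;
}
```

For a 32-bit return value only the `v0` half would be meaningful.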

# Example

### High-level language

```
int t;
int diffofsums(int f, int g, int h, int i);   /* prototype */

int main(void) {
      int y;
      y = diffofsums(2, 3, 4, 5);
}

int diffofsums (int f, int g, int h, int i) {
      int result;
      result = (f + g) - (h + i);
      return result;
}
```

### MIPS Assembly

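```
main:
      addi $a0, $0, 2       # arg 0 = 2
      addi $a1, $0, 3       # arg 1 = 3
      addi $a2, $0, 4       # arg 2 = 4
      addi $a3, $0, 5       # arg 3 = 5
      jal  diffofsums       # call subroutine
      add  $s0, $v0, $0     # y = returned value
diffofsums:
      add  $t0, $a0, $a1    # $t0 = f + g
      add  $t1, $a2, $a3    # $t1 = h + i
      sub  $s0, $t0, $t1    # result = (f + g) - (h + i)
      add  $v0, $s0, $0     # put return value in $v0
      jr   $ra              # return to caller
```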

# **Example (Addressing)**

Question: What values will the following registers hold at each point?

- After jal
  - \$pc
  - \$ra
- After jr
  - \$pc

```
Address       MIPS Assembly
              main:
                     addi $a0, $0, 2        # arg 0 = 2
                     addi $a1, $0, 3        # arg 1 = 3
                     addi $a2, $0, 4        # arg 2 = 4
                     addi $a3, $0, 5        # arg 3 = 5
0x0000015C           jal  diffofsums        # call subroutine
0x00000160           add  $s0, $v0, $0      # y = returned value
              diffofsums:
0x00000168           add  $t0, $a0, $a1     # $t0 = f + g
                     add  $t1, $a2, $a3     # $t1 = h + i
                     sub  $s0, $t0, $t1     # result = (f + g) - (h + i)
                     add  $v0, $s0, $0      # put return value in $v0
0x00000178           jr   $ra               # return to caller
```

Register file is a shared resource

### Example:

- In the previous code, \$t0, \$t1, and \$s0 are updated by the Callee
- Assume that the main function wants to compare the diffofsums return value with 10, and that \$t0 holds 10 before the subroutine call
- After the subroutine call, \$t0 holds an intermediate result computed by the Callee → incorrect comparison

```
main:
      addi $t0, $0, 10      # main wanted to use $t0
      addi $a0, $0, 2       # arg 0 = 2
      addi $a1, $0, 3       # arg 1 = 3
      addi $a2, $0, 4       # arg 2 = 4
      addi $a3, $0, 5       # arg 3 = 5
      jal  diffofsums       # call subroutine
      add  $s0, $v0, $0     # y = returned value
      slt  $t1, $s0, $t0    # to compare with comp result
diffofsums:
      add  $t0, $a0, $a1    # $t0 = f + g
      add  $t1, $a2, $a3    # $t1 = h + i
      sub  $s0, $t0, $t1    # result = (f + g) - (h + i)
      add  $v0, $s0, $0     # put return value in $v0
      jr   $ra              # return to caller
```

After returning from diffofsums, the main function will find that the values of \$t0, \$t1, and \$s0 have been changed unexpectedly

- Use stack memory to protect register values
  - The Callee backs up to stack memory the values of the registers that will be updated in its function body
  - The Callee restores the values from the stack to the original registers before returning to the Caller
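
The backup and restore discipline described above can be modeled in C with a small register array and a stack that grows downward. This is a rough sketch under stated assumptions, not a MIPS simulator: the names `regs`, `stack_mem`, `sp`, and `diffofsums_model` are hypothetical, and the stack is indexed in 4-byte words, so moving \$sp by 12 bytes becomes moving the index by 3.

```c
#include <stdint.h>

/* Hypothetical model: a few registers and a word-indexed stack. */
enum { T0, T1, S0, V0, NUM_REGS };

int32_t regs[NUM_REGS];
int32_t stack_mem[16];
int sp = 16;                        /* stack is empty; grows downward */

int32_t diffofsums_model(int32_t f, int32_t g, int32_t h, int32_t i)
{
    sp -= 3;                        /* make space for 3 registers */
    stack_mem[sp + 2] = regs[S0];   /* save $s0 */
    stack_mem[sp + 1] = regs[T0];   /* save $t0 */
    stack_mem[sp + 0] = regs[T1];   /* save $t1 */

    regs[T0] = f + g;               /* function body updates registers */
    regs[T1] = h + i;
    regs[S0] = regs[T0] - regs[T1];
    regs[V0] = regs[S0];            /* return value goes in $v0 */

    regs[T1] = stack_mem[sp + 0];   /* restore $t1 */
    regs[T0] = stack_mem[sp + 1];   /* restore $t0 */
    regs[S0] = stack_mem[sp + 2];   /* restore $s0 */
    sp += 3;                        /* deallocate stack space */
    return regs[V0];
}
```

If the Caller sets `regs[T0] = 10` before the call, it still reads 10 afterward, so a comparison against \$t0 in main would use the intended value.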



After computations in the function, registers are updated

| diffofsums: |                          |                                              |
|-------------|--------------------------|----------------------------------------------|
| addi        | \$sp, \$sp, -12          | # make space on stack to store 3 registers   |
| sw          | \$s0, 8(\$sp)            | # save \$s0 on stack                         |
| sw          | \$t0, 4(\$sp)            | # save \$t0 on stack                         |
| sw          | \$t1, 0(\$sp)            | # save \$t1 on stack                         |
| add         | **\$t0**, \$a0, \$a1     | # \$t0 = f + g                               |
| add         | **\$t1**, \$a2, \$a3     | # \$t1 = h + i                               |
| sub         | **\$s0**, \$t0, \$t1     | # result = (f + g) - (h + i)                 |
| add         | \$v0, \$s0, \$0          | # put return value in \$v0                   |
| lw          | \$t1, 0(\$sp)            | # restore \$t1 from stack                    |
| lw          | \$t0, 4(\$sp)            | # restore \$t0 from stack                    |
| lw          | \$s0, 8(\$sp)            | # restore \$s0 from stack                    |
| addi        | \$sp, \$sp, 12           | # deallocate stack space                     |
| jr          | \$ra                     | # return to caller                           |



Before returning to the Caller, the register values are restored

| diffofsums: |                          |                                              |
|-------------|--------------------------|----------------------------------------------|
| addi        | \$sp, \$sp, -12          | # make space on stack to store 3 registers   |
| sw          | \$s0, 8(\$sp)            | # save \$s0 on stack                         |
| sw          | \$t0, 4(\$sp)            | # save \$t0 on stack                         |
| sw          | \$t1, 0(\$sp)            | # save \$t1 on stack                         |
| add         | \$t0, \$a0, \$a1         | # \$t0 = f + g                               |
| add         | \$t1, \$a2, \$a3         | # \$t1 = h + i                               |
| sub         | \$s0, \$t0, \$t1         | # result = (f + g) - (h + i)                 |
| add         | \$v0, \$s0, \$0          | # put return value in \$v0                   |
| lw          | \$t1, 0(\$sp)            | # restore \$t1 from stack                    |
| lw          | \$t0, 4(\$sp)            | # restore \$t0 from stack                    |
| lw          | \$s0, 8(\$sp)            | # restore \$s0 from stack                    |
| addi        | \$sp, \$sp, 12           | # deallocate stack space                     |
| jr          | \$ra                     | # return to caller                           |



# **Example: Nested Function Call**

| func1: |      |                 |                                                  |
|--------|------|-----------------|--------------------------------------------------|
|        | addi | \$sp, \$sp, -4  | # make space on stack to store \$ra register     |
|        | sw   | \$ra, 0(\$sp)   | # save \$ra on stack                             |
|        | jal  | func2           | # jump to func2                                  |
|        | lw   | \$ra, 0(\$sp)   | # restore \$ra from stack                        |
|        | addi | \$sp, \$sp, 4   | # deallocate stack space                         |
|        | jr   | \$ra            | # return to caller                               |
| func2: |      |                 |                                                  |
|        | addi | \$sp, \$sp, -8  | # make space on stack to store 2 registers       |
|        | addi | \$sp, \$sp, 8   | # deallocate stack space                         |
|        | jr   | \$ra            | # return to func1 (caller)                       |



Why does func1 store \$ra on the stack before calling func2?

# **Example: Recursion**

A recursive function should save its input parameters and return address on the stack, because every level of the recursion uses the same registers for them.

### High-level language

```
int factorial (int n)
{
    if (n <= 1)
        return 1;
    else
        return (n * factorial (n-1));
}
```

### MIPS Assembly

```
factorial:
            addi $sp, $sp, -8     # make room
            sw   $a0, 4($sp)      # store input (n)
            sw   $ra, 0($sp)      # store return address
            addi $t0, $0, 2       # $t0 = 2
            slt  $t0, $a0, $t0    # n <= 1?
            beq  $t0, $0, else    # no: go to else (recursion)
            addi $v0, $0, 1       # yes: return 1
            addi $sp, $sp, 8      # restore $sp
            jr   $ra              # return
else:
            addi $a0, $a0, -1     # n = n - 1
            jal  factorial        # recursive call
            lw   $ra, 0($sp)      # restore return address
            lw   $a0, 4($sp)      # restore input
            addi $sp, $sp, 8      # restore $sp
            mul  $v0, $a0, $v0    # n * factorial(n-1)
            jr   $ra              # return
```

# **Example: Recursion for 3!**



```
factorial:
            addi $sp, $sp, -8     # make room
            sw   $a0, 4($sp)      # store input (n)
            sw   $ra, 0($sp)      # store return address
            addi $t0, $0, 2       # $t0 = 2
            slt  $t0, $a0, $t0    # n <= 1?
            beq  $t0, $0, else    # no: go to else (recursion)
            addi $v0, $0, 1       # yes: return 1
            addi $sp, $sp, 8      # restore $sp
            jr   $ra              # return
else:
            addi $a0, $a0, -1     # n = n - 1
            jal  factorial        # recursive call
            lw   $ra, 0($sp)      # restore return address
            lw   $a0, 4($sp)      # restore input
            addi $sp, $sp, 8      # restore $sp
            mul  $v0, $a0, $v0    # n * factorial(n-1)
            jr   $ra              # return
```
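
To see why each level of the recursion needs its own saved copy of n, the stack traffic for 3! can be modeled in C with an explicit array. This is a simplified sketch, not a translation of the assembly: the name `factorial_model` and the array `stack_mem` are hypothetical, only n is saved (loops replace the calls, so no return address is needed), the base case pushes nothing, and n is assumed small enough (at most 12) that the result fits in a 32-bit int.

```c
/* Sketch: factorial with an explicit stack of saved n values. */
int factorial_model(int n)
{
    int stack_mem[32];          /* saved copies of n, one per level  */
    int sp = 32;                /* stack is empty; grows downward    */
    int v0;

    /* Going down: each level saves its own n, then recurses on n-1. */
    while (n > 1) {
        stack_mem[--sp] = n;    /* like sw $a0, 4($sp)               */
        n = n - 1;              /* like addi $a0, $a0, -1            */
    }
    v0 = 1;                     /* base case returns 1               */

    /* Coming back up: each level restores its n and multiplies.     */
    while (sp < 32) {
        n = stack_mem[sp++];    /* like lw $a0, 4($sp)               */
        v0 = n * v0;            /* like mul $v0, $a0, $v0            */
    }
    return v0;
}
```

For n = 3 the model pushes 3 and then 2, reaches the base case with a result of 1, and then pops 2 and 3 while multiplying. Without the saved copies, the only value of n left in the register on the way back would be 1.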

- Translate the given high-level language code to MIPS assembly
  - Assume that the base address of an integer array a is in register \$s0
    - base address of an array: the address of the first element of the array

High-level language

a[1]++;



#### **MIPS**

lw \$t3, 4(\$s0)

addi \$t3, \$t3, 1

sw \$t3, 4(\$s0)


- Translate the given high-level language code to MIPS assembly
  - Assume that the base address of a char array a is in register \$s0
    - base address of an array: the address of the first element of the array
  - Assume that the value of i is in \$t0.

High-level language

a[i]--;

Now we know that the address of a[i] is \$s0 + 1 byte \* i = \$s0 + \$t0

Is this a valid operation to get one byte from \$s0 + \$t0?

lb \$t1, **\$t0(\$s0)**

No, because the offset must be an immediate value, not a register id

→ We should change the base address, not the offset, i.e. compute a new base **\$t2 = \$s0 + \$t0** and then load one byte from 0(\$t2)

**MIPS**

add \$t2, \$s0, \$t0

lb \$t3, 0(\$t2)

addi \$t3, \$t3, -1

sb \$t3, 0(\$t2)
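
The address arithmetic behind both array exercises can be sketched in C. The helper names `element_address` and `decrement_char` are hypothetical; the first computes base + index × element size (4 bytes per int, 1 byte per char, which is why the int exercise uses offset 4 for a[1]), and the second forms the new base address before the load and store, as in the MIPS answer above.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte address of element `index`, given the array's base address
   and the size of one element in bytes. */
uintptr_t element_address(uintptr_t base, size_t index, size_t elem_size)
{
    return base + index * elem_size;
}

/* a[i]-- for a char array, one step per MIPS instruction. */
void decrement_char(char *a, int i)
{
    char *addr = a + i;     /* add $t2, $s0, $t0 : new base address   */
    char value = *addr;     /* load one byte from 0($t2)              */
    value = value - 1;      /* subtract 1                             */
    *addr = value;          /* store one byte back to 0($t2)          */
}
```

For example, with an int array at base address 0x1000, element 1 is at 0x1004, while element 3 of a char array at the same base is at 0x1003.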