

# Overview of Computer Architecture

Processors and Execution

Carl Henrik Ek - carlhenrik.ek@bristol.ac.uk November 8, 2019

http://carlhenrik.com

# **Computer Science**

# **Definition (Computer Science)**

The systematic study (science) of algorithms

# Definition (Algorithms)

An ordered set of unambiguous, executable steps that defines a terminating process

# Computing



# **Computer Architecture**



# Start






# Memory

# RAM as an array of bytes

| Content: | FF          | 00          | 57          | 92          | B3          | 8A          | ... | 10          | 46          | DC          |
|----------|-------------|-------------|-------------|-------------|-------------|-------------|-----|-------------|-------------|-------------|
| Address: | 000 000 000 | 000 000 001 | 000 000 002 | 000 000 003 | 000 000 004 | 000 000 005 | ... | 134 217 725 | 134 217 726 | 134 217 727 |
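The table can be mimicked with a byte array, where the index plays the role of the address. This is only a sketch: `RAM_SIZE`, `read_byte` and `read_word` are hypothetical names, the array is far smaller than the 134 217 728 bytes on the slide, and the little-endian byte order used for words is an assumption.

```python
# A small stand-in for RAM: a mutable array of bytes indexed by address.
RAM_SIZE = 16  # hypothetical size, much smaller than the RAM on the slide

ram = bytearray(RAM_SIZE)
# First six bytes copied from the table above.
ram[0:6] = bytes([0xFF, 0x00, 0x57, 0x92, 0xB3, 0x8A])


def read_byte(address):
    """Return the single byte stored at the given address."""
    return ram[address]


def read_word(address):
    """Return the 32-bit word made of the four bytes starting at address.

    Little-endian byte order is assumed here.
    """
    return int.from_bytes(ram[address:address + 4], "little")


print(hex(read_byte(2)))  # 0x57
print(hex(read_word(0)))  # 0x925700ff
```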



# Register



# **Program Counter**

## **Program Counter**

- Holds the memory address of the next instruction to fetch
- Initially set to the start position of the program

## Instruction Register

- Stores the instruction that has just been fetched







a. At the beginning of the fetch step the instruction starting at address A0 is retrieved from memory and placed in the instruction register.



b. Then the program counter is incremented so that it points to the next instruction.
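Steps a and b can be sketched as a tiny simulation. The `Cpu` class and its `fetch` method are hypothetical, A0 is read as the hexadecimal address `0xA0`, instructions are assumed to be 4 bytes stored little-endian, and the pipeline effect that makes a real ARM PC read 8 bytes ahead is ignored.

```python
class Cpu:
    """Minimal model of the program counter and instruction register."""

    def __init__(self, memory, start_address):
        self.memory = memory
        self.pc = start_address  # program counter: address of the next instruction
        self.ir = 0              # instruction register: the fetched instruction

    def fetch(self):
        # a. Copy the 4-byte instruction at the address in PC into IR.
        self.ir = int.from_bytes(self.memory[self.pc:self.pc + 4], "little")
        # b. Increment PC so that it points to the next instruction.
        self.pc += 4
        return self.ir


memory = bytearray(0x100)
memory[0xA0:0xA4] = (0xE0810002).to_bytes(4, "little")  # add r0, r1, r2
memory[0xA4:0xA8] = (0xE2800004).to_bytes(4, "little")  # add r0, r0, #4

cpu = Cpu(memory, 0xA0)
print(hex(cpu.fetch()), hex(cpu.pc))  # 0xe0810002 0xa4
```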



# **Processors**




# State Machine



## Status Register



Copies of the ALU status flags (latched if the instruction has the "S" bit set).

- Condition Code Flags
  - N = **N**egative result from ALU flag.
  - Z = **Z**ero result from ALU flag.
  - C = ALU operation **C**arried out
  - V = ALU operation o**V**erflowed
- Mode Bits
  - M[4:0] define the processor mode.
- Interrupt Disable bits.
  - **I** = 1, disables the IRQ.
  - **F** = 1, disables the FIQ.
- T Bit (Architecture v4T only)
  - T = 0, Processor in ARM state
  - T = 1, Processor in Thumb state
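The fields listed above can be pulled out of a 32-bit status register value with shifts and masks. The slide does not give bit positions; the sketch below assumes the usual ARM layout (N, Z, C, V in bits 31 to 28, I, F, T in bits 7 to 5, mode in bits 4 to 0), and `decode_cpsr` is a hypothetical helper name.

```python
def decode_cpsr(cpsr):
    """Split a 32-bit status register value into its named fields.

    Bit positions follow the usual ARM layout (an assumption here).
    """
    return {
        "N": (cpsr >> 31) & 1,  # negative
        "Z": (cpsr >> 30) & 1,  # zero
        "C": (cpsr >> 29) & 1,  # carry
        "V": (cpsr >> 28) & 1,  # overflow
        "I": (cpsr >> 7) & 1,   # 1 disables IRQ
        "F": (cpsr >> 6) & 1,   # 1 disables FIQ
        "T": (cpsr >> 5) & 1,   # 0 = ARM state, 1 = Thumb state
        "mode": cpsr & 0x1F,    # M[4:0]
    }


fields = decode_cpsr(0x600000D3)
print(fields["Z"], fields["C"], fields["I"], fields["F"], hex(fields["mode"]))
# 1 1 1 1 0x13
```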

# Flags

| Flag             | Logical Instruction                              | Arithmetic Instruction                                                                              |
|------------------|--------------------------------------------------|-----------------------------------------------------------------------------------------------------|
| Negative (N='1') | No meaning                                       | Bit 31 of the result has been set. Indicates a negative number in signed operations                 |
| Zero (Z='1')     | Result is all zeroes                             | Result of operation was zero                                                                        |
| Carry (C='1')    | After Shift operation '1' was left in carry flag | Result was greater than 32 bits                                                                     |
| oVerflow (V='1') | No meaning                                       | Result was greater than 31 bits. Indicates a possible corruption of the sign bit in signed numbers  |
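As an illustration of the arithmetic column, the following sketch computes the four flags for a 32-bit addition. `add_with_flags` is a hypothetical helper, not part of any ARM tool; it only models how the flags relate to the result.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a, b):
    """Add two 32-bit values and return (result, flags)."""
    a &= MASK32
    b &= MASK32
    full = a + b              # may need 33 bits
    result = full & MASK32    # what fits in a 32-bit register
    flags = {
        "N": result >> 31,            # bit 31 of the result
        "Z": int(result == 0),        # result is zero
        "C": int(full > MASK32),      # result was greater than 32 bits
        # Overflow: both operands have the same sign but the result's sign differs.
        "V": ((~(a ^ b) & (a ^ result)) >> 31) & 1,
    }
    return result, flags


print(add_with_flags(0x7FFFFFFF, 1))
# (2147483648, {'N': 1, 'Z': 0, 'C': 0, 'V': 1})
```

Adding 1 to the largest positive signed value sets N and V, since the sign bit has been corrupted, while adding 1 to `0xFFFFFFFF` sets Z and C instead.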

### **Conditions**



### Instructions

- OP-Code
  - Decides operation
- Operand
  - OP-Code dependent
  - "Parameters"
- Each ARM instruction is 32 bits wide
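To make the op-code and operand split concrete, here is a sketch that takes apart one 32-bit ARM data-processing instruction. The field layout is the standard ARM data-processing encoding, which the slide itself does not show, and `decode_data_processing` is a hypothetical helper name.

```python
def decode_data_processing(word):
    """Split a 32-bit ARM data-processing instruction into its fields."""
    return {
        "cond": (word >> 28) & 0xF,      # condition under which it executes
        "immediate": (word >> 25) & 1,   # 1 if operand 2 is an immediate
        "opcode": (word >> 21) & 0xF,    # decides the operation (0x4 = ADD)
        "set_flags": (word >> 20) & 1,   # the "S" bit
        "rn": (word >> 16) & 0xF,        # first operand register
        "rd": (word >> 12) & 0xF,        # destination register
        "operand2": word & 0xFFF,        # second operand, op-code dependent
    }


# 0xE0810002 is the encoding of: add r0, r1, r2
print(decode_data_processing(0xE0810002))
```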

### Instruction ARM



## Reference Manual



### Instructions

- Data Transfer
  - LDR, STR
- Flow Control
  - B, CMP
- Arithmetic
  - ADD, MUL

## LDR/STR

### 4.9.8 Assembler syntax

`<LDR|STR>{cond}{B}{T} Rd,<Address>`

#### where:

LDR load from memory into a register

STR store from a register into memory

{cond} two-character condition mnemonic. See Table 4-2: Condition code summary on page 4-5.

{B} if B is present then byte transfer, otherwise word transfer

{T} if T is present the W bit will be set in a post-indexed instruction, forcing non-privileged mode for the transfer cycle. T is not allowed when a pre-indexed addressing mode is specified or implied.

Rd is an expression evaluating to a valid register number.

Rn and Rm are expressions evaluating to a register number. If Rn is R15 then the assembler will subtract 8 from the offset value to allow for pipelining. In this case base write-back should not be specified.

### `<Address>` can be:

#### 1 An expression which generates an address:

`<expression>`

The assembler will attempt to generate an instruction using the PC as a base and a corrected immediate offset to address the location given by evaluating the expression. This will be a PC relative, pre-indexed address. If the address is out of range, an error will be generated.

#### 2 A pre-indexed addressing specification:

`[Rn]` offset of zero

`[Rn,<#expression>]{!}` offset of `<expression>` bytes

`[Rn,{+/-}Rm{,<shift>}]{!}` offset of +/- contents of index register, shifted by `<shift>`

#### 3 A post-indexed addressing specification:

`[Rn],<#expression>` offset of `<expression>` bytes

`[Rn],{+/-}Rm{,<shift>}` offset of +/- contents of index register, shifted as by `<shift>`.

`<shift>` general shift operation (see data processing instructions) but you cannot specify the shift amount by a register.

# Addressing Modes

- `ldr r0,[r1,#4]` Load word addressed by R1+4.
- `str r0,[r1],#4` Store R0 to word addressed by R1. Increment R1 by 4.
- `ldr r0,[r1,#4]!` Load word addressed by R1+4. Increment R1 by 4.
- `ldr r0,=label` Load the address of `label` into R0.
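The difference between the first three addressing modes is when, and whether, the base register R1 is updated. The sketch below models that with a register dictionary and a byte array; all function names are hypothetical and little-endian words are assumed.

```python
def load_word(memory, address):
    return int.from_bytes(memory[address:address + 4], "little")


def store_word(memory, address, value):
    memory[address:address + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")


def ldr_offset(regs, memory, rd, rn, offset):
    """ldr rd, [rn, #offset]: use rn + offset, leave rn unchanged."""
    regs[rd] = load_word(memory, regs[rn] + offset)


def str_post_indexed(regs, memory, rd, rn, offset):
    """str rd, [rn], #offset: use rn as the address, then add offset to rn."""
    store_word(memory, regs[rn], regs[rd])
    regs[rn] += offset


def ldr_pre_indexed(regs, memory, rd, rn, offset):
    """ldr rd, [rn, #offset]!: add offset to rn first, then use rn."""
    regs[rn] += offset
    regs[rd] = load_word(memory, regs[rn])


regs = {"r0": 0, "r1": 8}
memory = bytearray(32)
store_word(memory, 12, 0xCAFE)
store_word(memory, 16, 0xBEEF)

ldr_offset(regs, memory, "r0", "r1", 4)        # r0 = word at 12, r1 stays 8
str_post_indexed(regs, memory, "r0", "r1", 4)  # word at 8 = r0, r1 becomes 12
ldr_pre_indexed(regs, memory, "r0", "r1", 4)   # r1 becomes 16, r0 = word at 16
print(hex(regs["r0"]), regs["r1"])             # 0xbeef 16
```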

# **ADD**

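```
_start:
    add    r0, r0, r1      @ r0 = r0 + r1 (register operand)
    add    r0, r0, #4      @ r0 = r0 + 4 (immediate operand)
    ldr    r2, [r1, #3]    @ ADD cannot take a memory operand, so load it first
    add    r0, r0, r2      @ r0 = r0 + loaded value
```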

### Multiply and Multiply-Accumulate (MUL, MLA)

The instruction is only executed if the condition is true. The various conditions are defined in Table 4-2: Condition code summary on page 4-5. The instruction encoding is shown in Figure 4-12: Multiply instructions.

The multiply and multiply-accumulate instructions use an 8 bit Booth's algorithm to perform integer multiplication.



Figure 4-12: Multiply instructions

The multiply form of the instruction gives Rd:=Rm\*Rs. Rn is ignored, and should be set to zero for compatibility with possible future upgrades to the instruction set.

The multiply-accumulate form gives Rd:=Rm\*Rs+Rn, which can save an explicit ADD instruction in some circumstances.

Both forms of the instruction work on operands which may be considered as signed (2's complement) or unsigned integers. The results of a signed multiply and of an unsigned multiply of 32 bit operands differ only in the upper 32 bits - the low 32 bits of the signed and unsigned results are identical. As these instructions only produce the low 32 bits of a multiply, they can be used for both signed and unsigned multiplies.

For example consider the multiplication of the operands:

| Operand A  | Operand B  | Result     |
|------------|------------|------------|
| 0xFFFFFFF6 | 0x00000014 | 0xFFFFFF38 |

If the operands are interpreted as signed:

Operand A has the value -10, operand B has the value 20, and the result is -200, which is correctly represented as 0xFFFFFF38.

If the operands are interpreted as unsigned:

Operand A has the value 4294967286, operand B has the value 20 and the result is 85899345720, which is represented as 0x13FFFFFF38, so the least significant 32 bits are 0xFFFFFF38.
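The worked example can be checked numerically. In this sketch `mul32` and `to_signed` are hypothetical helpers: the first keeps only the low 32 bits of a product, as MUL does, and the second reinterprets a 32-bit pattern as a 2's complement value.

```python
MASK32 = 0xFFFFFFFF


def mul32(rm, rs):
    """Return the low 32 bits of rm * rs."""
    return (rm * rs) & MASK32


def to_signed(value):
    """Interpret a 32-bit pattern as a signed 2's complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


a, b = 0xFFFFFFF6, 0x00000014

print(hex(a * b))               # 0x13ffffff38 (full unsigned product)
print(hex(mul32(a, b)))         # 0xffffff38   (what MUL writes to Rd)
print(to_signed(a), to_signed(b), to_signed(mul32(a, b)))  # -10 20 -200
```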

#### 4.7.1 Operand restrictions

The destination register Rd must not be the same as the operand register Rm. R15 must not be used as an operand or as the destination register.

### 4.4 Branch and Branch with Link (B, BL)

The instruction is only executed if the condition is true. The various conditions are defined in *Table 4-2: Condition code summary* on page 4-5. The instruction encoding is shown in *Figure 4-3: Branch instructions*, below.



Figure 4-3: Branch instructions

Branch instructions contain a signed 2's complement 24 bit offset. This is shifted left two bits, sign extended to 32 bits, and added to the PC. The instruction can therefore specify a branch of +/- 32Mbytes. The branch offset must take account of the prefetch operation, which causes the PC to be 2 words (8 bytes) ahead of the current instruction. Branches beyond +/- 32Mbytes must use an offset or absolute destination which has been previously loaded into a register. In this case the PC should be manually saved in R14 if a Branch with Link type operation is required.
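The offset arithmetic described above can be sketched in both directions. `encode_branch_offset` and `branch_target` are hypothetical helpers; they model only the 24 bit offset field, not the full instruction word.

```python
def encode_branch_offset(branch_address, target_address):
    """Return the 24 bit offset field for a branch at branch_address."""
    # The PC is 8 bytes ahead of the branch instruction because of prefetch.
    byte_offset = target_address - (branch_address + 8)
    if byte_offset % 4 != 0:
        raise ValueError("target must be word aligned relative to the branch")
    word_offset = byte_offset >> 2
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("target is beyond +/- 32Mbytes")
    return word_offset & 0xFFFFFF  # 2's complement, truncated to 24 bits


def branch_target(branch_address, offset_field):
    """Recover the target address from the 24 bit offset field."""
    if offset_field & 0x800000:       # sign extend the 24 bit field
        offset_field -= 1 << 24
    return branch_address + 8 + (offset_field << 2)  # shift left two bits


field = encode_branch_offset(0x8000, 0x8000)  # a branch to itself
print(hex(field))                             # 0xfffffe
print(hex(branch_target(0x8000, field)))      # 0x8000
```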

### 4.4.1 The link bit

Branch with Link (BL) writes the old PC into the link register (R14) of the current bank. The PC value written into R14 is adjusted to allow for the prefetch, and contains the address of the instruction following the branch and link instruction. Note that the CPSR is not saved with the PC and R14[1:0] are always cleared.

To return from a routine called by Branch with Link use MOV PC,R14 if the link register is still valid or LDM Rn!,{..,PC} if the link register has been saved onto a stack pointed to by Rn.

### 4.4.2 Instruction cycle times

Branch and Branch with Link instructions take 2S + 1N incremental cycles, where S and N are as defined in 6.2 Cycle Types on page 6-3.

## Code

```
        .section .text
        .align
        .global _start
_start:
        ldr     r0, =matrix       @ r0 = address of matrix
        ldr     r1, [r0]          @ a
        ldr     r2, [r0, #12]     @ d
        mul     r7, r2, r1        @ a*d
        ldr     r1, [r0, #4]      @ b
        ldr     r2, [r0, #8]      @ c
        mul     r6, r2, r1        @ b*c
```
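For comparison, the same loads and multiplies can be written out in Python. This is only a sketch: the slide does not show the data behind the `matrix` label, so the four words below are hypothetical, and 32-bit little-endian storage at offsets 0, 4, 8 and 12 is assumed.

```python
import struct

# Hypothetical contents of `matrix`: four 32-bit words a, b, c, d.
matrix = struct.pack("<4i", 1, 2, 3, 4)


def word_at(data, offset):
    """Mimic ldr rX, [r0, #offset]: read one 32-bit word at a byte offset."""
    return struct.unpack_from("<i", data, offset)[0]


def products(data):
    """Return (a*d, b*c) truncated to 32 bits, as the two MUL instructions do."""
    a = word_at(data, 0)
    d = word_at(data, 12)
    r7 = (a * d) & 0xFFFFFFFF   # mul r7, r2, r1
    b = word_at(data, 4)
    c = word_at(data, 8)
    r6 = (b * c) & 0xFFFFFFFF   # mul r6, r2, r1
    return r7, r6


print(products(matrix))  # (4, 6)
```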

# **Summary**


- This was just a very quick intro to CPUs
- You learn this by doing
- In the lab on Monday we will play with these things