#### **ARCOS Group**

# uc3m | Universidad Carlos III de Madrid

## Lesson 4 (I) The processor

Computer Structure Bachelor in Computer Science and Engineering



#### Contents

- 1. Computer elements
- 2. Processor organization
- 3. Control unit
- 4. Execution of instructions
- 5. Control unit design
- 6. Execution modes
- 7. Interrupts
- 8. Computer startup
- 9. Performance and parallelism

### Contents

- 1. Computer elements
  - 1) Motivation and goals
  - 2) Basic functionality of the control unit
  - 3) Control signals and elemental operations
  - 4) Introduction of the elemental processor
- 2. Processor organization
- 3. Control unit
- 4. Execution of instructions
- 5. Control unit design
- 6. Execution modes
- 7. Interrupts
- 8. Computer startup
- 9. Performance and parallelism

## Computer components

#### review



## Processor components

#### review



#### Main motivation



- In lesson 3, we studied what the processor executes: assembly programs.
- In lesson 4 we are going to study **how** the instructions are executed in the computer.

## How C.U. works:

#### Execute machine instructions



- Each element of the computer has inputs, outputs and control signals.
- At each clock cycle, the Control Unit (C.U.) sends the control signals via the control bus wires.
- Control signals indicate what value to output:
  - Move from an input to an output: S=Ex
  - Transform an input: S=f(E)






In every instruction cycle, the control unit performs the following steps:

- Read from main memory the instruction pointed to by PC
- Increment PC
- Decode the instruction
- Execute the instruction

Two registers support this cycle:

- The PC register (program counter) holds the address of the next instruction to be executed.
- The IR register (instruction register) holds the instruction currently being executed.
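The four steps can be sketched as a loop. This is only an illustration: the tuple-based instruction encoding, the `acc` accumulator and the `run` function are assumptions made for the example, not part of the elemental processor.

```python
# Toy fetch-decode-execute loop. The instruction encoding is hypothetical.
def run(memory, max_steps=100):
    pc = 0     # program counter: address of the next instruction
    acc = 0    # hypothetical accumulator used by the toy instructions
    for _ in range(max_steps):
        ir = memory[pc]    # fetch: IR <- memory[PC]
        pc += 4            # increment PC (4 bytes per instruction)
        op, arg = ir       # decode
        if op == "add":    # execute
            acc += arg
        elif op == "halt":
            break
    return pc, acc


program = {0: ("add", 2), 4: ("add", 3), 8: ("halt", 0)}
```

Note that PC is incremented before the instruction is executed, so during execution it already points to the following instruction.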






## Other functions of the C.U.



- Resolving anomalous situations
  - Illegal instructions
  - Illegal memory accesses
  - ...
- Handling interrupts
- Controlling the communication with the peripherals


## Register and bus



#### Register

Stores a list of bits.

#### Bus

Carries a list of bits between the elements connected through it.

## Signals: output tristate





#### Tri-state

- Sits between an element and the bus.
- When activated, it lets the element send its data to the bus.

#### IMPORTANT

Two or more tri-states must not be activated on the same bus at the same time.
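The rule can be illustrated with a small model of a shared bus. The `Bus` class and its method names are hypothetical; they only mimic the constraint that a single tri-state drives the bus in each cycle.

```python
class Bus:
    """Shared bus: at most one tri-state may drive it per cycle."""

    def __init__(self):
        self.value = None
        self.driver = None

    def drive(self, name, value):
        # A second active tri-state in the same cycle is a conflict.
        if self.driver is not None:
            raise RuntimeError(f"bus conflict: {self.driver} and {name}")
        self.driver, self.value = name, value

    def end_cycle(self):
        # All tri-states are released at the end of the cycle.
        self.driver, self.value = None, None
```

Activating a second tri-state before `end_cycle()` raises an error, which corresponds to two elements trying to put different values on the same wires.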

## Signals: load in register





#### Load in register

- Stores the input value at the falling edge of the clock
  - While the clock level is held, the register keeps its inner (old) value.
  - The inner value is updated at the end of the clock cycle (falling edge).

#### IMPORTANT

Therefore, the new value is seen at the output in the following cycle.
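A minimal sketch of this timing, assuming a hypothetical `Register` class in which `load` models the active load signal and `falling_edge` models the end of the cycle:

```python
class Register:
    def __init__(self, value=0):
        self.output = value    # value visible during the current cycle
        self._pending = None   # input captured while the load signal is active

    def load(self, value):
        """Load signal active: remember the input until the falling edge."""
        self._pending = value

    def falling_edge(self):
        """End of the cycle: the inner value is updated if a load was requested."""
        if self._pending is not None:
            self.output = self._pending
            self._pending = None
```

Reading `output` in the same cycle as `load` still returns the old value, which is why a value cannot pass through a register within one cycle.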

# Sequence of signals






## Example of *transfer* elemental operation



# Elementary transfer operation:

- Source storage element
- Target storage element
- A path is established





#### **IMPORTANT**

- Establish the path between origin and destination within a single cycle.
- In the same cycle it is NOT possible to:
  - Traverse a register
  - Put two values on a bus at the same time






## Example of *process* elemental operation





# Elementary processing operation:

- Source element(s)
- Target element
- Transformation operation on the path

yy: $A \leftarrow B + D$ [SC1=b, SC2=d, Ca]


## RT Language and Elementary Operations

#### RT Language:

- Register transfer level language.
- It specifies what happens in the computer in terms of elementary operations.

### Elementary operations:

- Transfer operations
  - MAR ← PC
- Processing operations
  - R1 ← R2 + RT2


## Review all components...

- Binary system based on 0 and 1
- Building blocks:



# Review all components... Registers

- Element storing n bits at a time
  - Output: 1
    - During the clock level, the output is the value stored in the register.
  - Input: 1
    - Possible new value to be stored
  - Control: 1 or 2
    - Load: on the falling edge, the possible new value is stored
    - Reset: there may be a signal to set the register to zero



# Control Unit (C.U.)




# Structure of an elementary computer and WepSIM Simulator

▶ Elemental Processor (E.P.):



- WepSIM simulates the E.P.:
  - https://wepsim.github.io/wepsim/

## Main features of the elemental processor

- 32-bit computer
- Main memory:
  - Byte-addressed
  - One clock cycle for read and write operations
- Different types of registers available:
  - Register file of 32 registers visible to programmers (R0...R31)
    - Similar to MIPS: R0 = 0 and SP = R29
  - Registers not visible to programmers (RT1, RT2 and RT3)
    - Possible use for intermediate calculations within an instruction
  - Control registers (PC, IR, MAR, MBR) and status register (SR)

# Structure of an elementary computer



## Control signals



#### General nomenclature: control signals

- Mx: selection in multiplexer
- Tx: tri-state activation signal
- Cx: register load signal
- Ry: register file selection on point y

## Elemental Processor: control signals



## Elemental Processor: registers



# Registers

- Registers visible to programmers
  - Registers of the register file (as in MIPS: \$t0, \$t1, etc.)
- Control and status registers:
  - PC: program counter
  - IR: instruction register
  - SP: stack pointer (in the register file)
  - MAR: memory address register
  - MBR: memory data register
  - SR: status register
- Registers not visible to the user:
  - RT1, RT2 and RT3 (internal temporary registers)







# Structure of an elementary computer **Registers**



#### Nomenclature:

- Ry: Register file selection
- Tx: Tri-state activation signal
- Cx: Register load signal

#### Register file

- RA: selects the register sent to output A
- RB: selects the register sent to output B
- RC: selects the register written from input C
- LC: activates writing to the register selected by RC
- T9: copies output A to the internal bus
- T10: copies output B to the internal bus

#### RT1 and RT2

- C4: from the internal bus to RT1
- T4: RT1 output to the internal bus
- C5: from the internal bus to RT2
- T5: RT2 output to the internal bus


# Example elemental operations in registers

#### **SWAP R1 R2**

| Elemental Op. | Signals                            |
|---------------|------------------------------------|
| RT1 ← R1      | RA=1 (00001), T9, C4               |
| R1 ← R2       | RA=2 (00010), T9, RC=1 (00001), LC |
| R2 ← RT1      | T4, RC=2 (00010), LC               |

The data is loaded into RT1 on the falling edge. It will be available in RT1 during the **next** cycle.
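The three elemental operations can be traced with a short sketch, one assignment per clock cycle. The dictionary of registers and the function name `swap_with_rt1` are illustrative assumptions; the signals in the comments are the ones listed in the table.

```python
def swap_with_rt1(regs):
    """Apply the three elemental operations of SWAP R1 R2, one per cycle."""
    regs = dict(regs)          # work on a copy of the register contents
    regs["RT1"] = regs["R1"]   # cycle 1: RT1 <- R1  (RA=1, T9, C4)
    regs["R1"] = regs["R2"]    # cycle 2: R1  <- R2  (RA=2, T9, RC=1, LC)
    regs["R2"] = regs["RT1"]   # cycle 3: R2  <- RT1 (T4, RC=2, LC)
    return regs
```

Three cycles are needed because each transfer uses the internal bus, and only one value can be on the bus per cycle.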

# ALU: Arithmetic Logic Unit





#### **ALU**

- MA: selection of operand A
- MB: selection of operand B
- Cop: operation code

| Cop (Cop<sub>3</sub>-Cop<sub>0</sub>) | Operation                                               |
|---------------------------------------|---------------------------------------------------------|
| 0000                                  | NOP                                                     |
| 0001                                  | A and B                                                 |
| 0010                                  | A or B                                                  |
| 0011                                  | not (A)                                                 |
| 0100                                  | A xor B                                                 |
| 0101                                  | Shift Right Logical (A), B = number of bits to shift    |
| 0110                                  | Shift Right Arithmetic (A), B = number of bits to shift |
| 0111                                  | Shift Left (A), B = number of bits to shift             |
| 1000                                  | Rotate Right (A), B = number of bits to rotate          |
| 1001                                  | Rotate Left (A), B = number of bits to rotate           |
| 1010                                  | A + B                                                   |
| 1011                                  | A - B                                                   |
| 1100                                  | A * B (with overflow)                                   |
| 1101                                  | A / B (integer division)                                |
| 1110                                  | A % B (remainder of integer division)                   |
| 1111                                  | LUI (A)                                                 |








| Result                              | C | V | N | Z |
|-------------------------------------|---|---|---|---|
| Positive result (0 is considered +) | 0 | 0 | 0 | 0 |
| Result == 0                         | 0 | 0 | 0 | 1 |
| Negative result                     | 0 | 0 | 1 | 0 |
| Overflow                            | 0 | 1 | 0 | 0 |
| Division by zero                    | 0 | 1 | 0 | 1 |
| Carry at bit 32                     | 1 | 0 | 0 | 0 |
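As an illustration of how such flags can be derived, the sketch below computes C, V, N and Z for a 32-bit addition using the usual two's-complement definitions. This is an assumption made for the example: the function `add_flags` is hypothetical, and it may set several flags at once, whereas the table above lists one situation per row.

```python
MASK = 0xFFFFFFFF  # 32-bit word


def add_flags(a, b):
    """Add two 32-bit values and return (result, C, V, N, Z)."""
    a, b = a & MASK, b & MASK
    full = a + b
    result = full & MASK
    c = int(full > MASK)       # carry out of the most significant bit
    n = (result >> 31) & 1     # sign bit of the result
    z = int(result == 0)
    # Signed overflow: both operands have a sign that differs from the result's.
    v = (((a ^ result) & (b ^ result)) >> 31) & 1
    return result, c, v, n, z
```

For example, adding 1 to `0x7FFFFFFF` produces a negative result from two positive operands, so V is set.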



# Example elemental operations in ALU

#### ADD R3 R1 R2

| Elem. Op.    | Signals                              |
|--------------|--------------------------------------|
| R3 ← R1 + R2 | RA=R1, RB=R2, Cop=+, T6, RC=R3, LC=1 |

The load into R3 is performed on the falling edge, so the data is available in register R3 in the next cycle.

# Example elemental operations in ALU

#### **SWAP R1 R2**

Using the temporary register RT1:

| Elem. Op. | Signals            |
|-----------|--------------------|
| RT1 ← R1  | RA=1, T9, C4       |
| R1 ← R2   | RA=2, T9, RC=1, LC |
| R2 ← RT1  | T4, RC=2, LC       |

Using XOR in the ALU, without a temporary register (the right-hand column is written in terms of the original values of R1 and R2):

| Elem. Op.    | Effect              |
|--------------|---------------------|
| R1 ← R1 ^ R2 | R1 ← (R1 ^ R2)      |
| R2 ← R1 ^ R2 | R2 ← (R1 ^ R2) ^ R2 |
| R1 ← R1 ^ R2 | R1 ← (R1 ^ R2) ^ R1 |


Control signals for the XOR version of SWAP R1 R2:

| Elem. Op.    | Signals                         |
|--------------|---------------------------------|
| R1 ← R1 ^ R2 | RA=1, RB=2, Cop=^, T6, RC=1, LC |
| R2 ← R1 ^ R2 | RA=1, RB=2, Cop=^, T6, RC=2, LC |
| R1 ← R1 ^ R2 | RA=1, RB=2, Cop=^, T6, RC=1, LC |
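That three XOR operations exchange the two registers can be checked with a few lines. The function name `xor_swap` is just for illustration; each statement corresponds to one elemental operation.

```python
def xor_swap(r1, r2):
    """Swap two values with three XOR operations and no temporary register."""
    r1 = r1 ^ r2   # R1 <- R1 ^ R2
    r2 = r1 ^ r2   # R2 <- (R1 ^ R2) ^ R2, which is the original R1
    r1 = r1 ^ r2   # R1 <- (R1 ^ R2) ^ R1, which is the original R2
    return r1, r2
```

It takes the same three cycles as the RT1 version, but it leaves RT1 untouched at the cost of using the ALU in every cycle.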

# Elemental Processor: memory access





#### **Nomenclature:**

- MAR: address register
- MBR: data register

#### Main Memory

- R: read
- W: write
- DATA: data from/to memory
- ADDR: address
- BE3-BE0 = A1A0 + BW
  - Access size (byte, half word, word)

#### BW: byte selector

It selects which bytes are stored in MBR on reads and copied to the bus on writes.

- **BW=00**: byte access
- **BW=01**: half-word access
- **BW=11**: word access

#### SE: sign extension

- 0: does not extend the sign in accesses smaller than a word
- 1: extends the sign in accesses smaller than a word



#### Synchronous:

- The memory requires a fixed number of clock cycles for every operation.

#### Asynchronous:

- The number of clock cycles per memory operation is not fixed.
- The memory indicates when the operation has ended by means of the MRdy signal.




#### Summary

#### **Nomenclature:**

- MAR: address register
- MBR: data register

#### Main Memory

- R: read
- W: write
- DATA: data from/to memory
- ADDR: address
- BE3-BE0 = A1A0 + BW
  - Access size (byte, half word, word)
- MRdy: operation ended (only in asynchronous memory)

#### MAR & MBR

- Ta: MAR output to the address bus
- Td: MBR output to the data bus
- T1: MBR output to the internal bus
- M1: selection of the MBR input: memory or internal bus
- C1: loads MBR (from the data bus or, depending on M1, from the internal bus)
- C0: from the internal bus to MAR


#### Elemental operations in main memory

In the tables below, "..." stands for the signals that place the address or the data on the internal bus; they depend on where the value comes from.

#### Access to 1-cycle synchronous main memory

#### Reading a word

| Elem. Op.         | Signals              |
|-------------------|----------------------|
| MAR ← \<address\> | ..., C0              |
| MBR ← MP[MAR]     | Ta, R, M1, C1, BW=11 |

#### Writing a word

| Elem. Op.         | Signals          |
|-------------------|------------------|
| MAR ← \<address\> | ..., C0          |
| MBR ← \<data\>    | ..., C1          |
| Writing cycle     | Ta, Td, W, BW=11 |

#### Access to 2-cycle synchronous main memory

#### Reading a word

| Elem. Op.                    | Signals              |
|------------------------------|----------------------|
| MAR ← \<address\>            | ..., C0              |
| Reading cycle                | Ta, R                |
| Reading cycle, MBR ← MP[MAR] | Ta, R, M1, C1, BW=11 |

## BE (Byte-Enable) signals for reading

Columns prefixed with "Mem" are the bytes in memory; columns prefixed with "Bus" are the output to the bus. X means "don't care".

| Mem D31-D24 | Mem D23-D16 | Mem D15-D8 | Mem D7-D0 | BE3 | BE2 | BE1 | BE0 | Bus D31-D24 | Bus D23-D16 | Bus D15-D8 | Bus D7-D0 |
|-------------|-------------|------------|-----------|-----|-----|-----|-----|-------------|-------------|------------|-----------|
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 0   | 0   | 0   |             |             |            | Byte 0    |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 0   | 0   | 1   |             |             | Byte 1     |           |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 0   | 1   | 0   |             | Byte 2      |            |           |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 0   | 1   | 1   | Byte 3      |             |            |           |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 1   | 0   | X   |             |             | Byte 1     | Byte 0    |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 1   | 1   | X   | Byte 3      | Byte 2      |            |           |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 1   | 1   | X   | X   | Byte 3      | Byte 2      | Byte 1     | Byte 0    |

### BE (Byte-Enable) signals for writing

Columns prefixed with "Bus" are the data in the bus; columns prefixed with "Mem" are the bytes written in memory.

| Bus D31-D24 | Bus D23-D16 | Bus D15-D8 | Bus D7-D0 | BE3 | BE2 | BE1 | BE0 | Mem D31-D24 | Mem D23-D16 | Mem D15-D8 | Mem D7-D0 |
|-------------|-------------|------------|-----------|-----|-----|-----|-----|-------------|-------------|------------|-----------|
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 0   | 0   | 0   |             |             |            | Byte 0    |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 0   | 0   | 1   |             |             | Byte 1     |           |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 0   | 1   | 0   |             | Byte 2      |            |           |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 0   | 1   | 1   | Byte 3      |             |            |           |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 1   | 0   | X   |             |             | Byte 1     | Byte 0    |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 0   | 1   | 1   | X   | Byte 3      | Byte 2      |            |           |
| Byte 3      | Byte 2      | Byte 1     | Byte 0    | 1   | 1   | X   | X   | Byte 3      | Byte 2      | Byte 1     | Byte 0    |
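The rows of both tables follow the same decoding of BE3-BE0, which can be written as a small function. This is a sketch under the assumption that byte index 0 corresponds to D7-D0; the name `selected_bytes` is hypothetical, and the combination 10XX is rejected because it does not appear in the tables.

```python
def selected_bytes(be):
    """Return the byte indices (0 = D7-D0, 3 = D31-D24) enabled by BE3..BE0."""
    be3 = (be >> 3) & 1
    be2 = (be >> 2) & 1
    be1 = (be >> 1) & 1
    if be3 == 1 and be2 == 1:      # 11XX: whole word
        return [0, 1, 2, 3]
    if be3 == 0 and be2 == 1:      # 01bX: half word chosen by BE1
        return [2, 3] if be1 else [0, 1]
    if be3 == 0 and be2 == 0:      # 00bb: single byte chosen by BE1 BE0
        return [be & 0b11]
    raise ValueError("BE combination not listed in the tables")
```

The "don't care" bits show up as inputs that the function never reads: BE0 for half words, and BE1 and BE0 for words.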

# Memory Access size



#### **Nomenclature:**

- MAR: address register
- MBR: data register
- BW (byte selector): selects which bytes are stored in MBR on reads and copied to the bus on writes.
  - **BW=00**: byte access
  - **BW=01**: half-word access
  - **BW=11**: word access
- SE: sign extension
  - 0: does not extend the sign in accesses smaller than a word
  - 1: extends the sign in accesses smaller than a word
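The effect of BW and SE on a read can be sketched as follows. The function `load_value` and the assumption that the selected bytes arrive in the low bits of `raw` are illustrative only; the BW encodings are the ones listed above.

```python
def load_value(raw, bw, se):
    """Return the 32-bit value kept for a read of the size given by BW."""
    bits = {0b00: 8, 0b01: 16, 0b11: 32}[bw]   # byte, half word, word
    value = raw & ((1 << bits) - 1)            # keep only the accessed bits
    # SE=1: copy the sign bit of the small value into the upper bits.
    if se and bits < 32 and (value >> (bits - 1)) & 1:
        value |= (0xFFFFFFFF << bits) & 0xFFFFFFFF
    return value
```

With SE=0 the upper bits are left at zero, which is the behaviour needed for unsigned loads.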

#### Elemental Processor: PC and IR



### PC: Program Counter

#### **PC**

- C2: load value into PC
- T2: from PC to internal bus
- M2: selects the value loaded into PC (internal bus or PC + 4)
  - C2, M2=1: PC ← PC + 4
  - C2, M2=0: PC ← \<internal bus\>

### IR: Instruction register



#### IR:

- C3: from internal bus to IR
- SELEC: sends part of the IR content to the bus
  - Offset: displacement
    - Start bit (least significant)
  - Size: size
    - Number of bits
  - SE: sign extension
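What SELEC does with Offset, Size and SE can be sketched as a bit-field extraction. The function `selec` is a hypothetical model, assuming that bit 0 is the least significant bit of IR and that the result is a 32-bit value.

```python
def selec(ir, offset, size, se):
    """Extract `size` bits of IR starting at bit `offset`; optionally sign-extend."""
    field = (ir >> offset) & ((1 << size) - 1)
    # SE=1: replicate the most significant bit of the field up to bit 31.
    if se and size < 32 and (field >> (size - 1)) & 1:
        field |= (0xFFFFFFFF << size) & 0xFFFFFFFF
    return field
```

For an instruction with a 6-bit operation code in the upper bits and a 26-bit address below it, `selec(ir, 0, 26, 0)` would return the address field and `selec(ir, 26, 6, 0)` the operation code.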

#### SELEC: selector circuit

Instruction format with a 26-bit address field (for example, `j addr`):

| op.    | address |
|--------|---------|
| 6 bits | 26 bits |

TIP

General phases:

- A. Fetch + Decode
- B. Fetch operands
- C. Execution
- D. Store results

Not possible in the same clock cycle:

- 1. Passing through a register (C4, MA=1)
- 2. Sending several values to a bus (T4, T5)
- 3. Using a datapath that the circuitry does not provide (IDB -> RT3)

Instruction format with a 16-bit offset field:

| op.    | ... | offset  |
|--------|-----|---------|
| 6 bits | ... | 16 bits |

#### Elemental Processor: SR



### SR: Status Register



- Stores information (status bits) about the state of the program being executed on the processor.
- Typical status bits:
  - C, V, N, Z: result of the last ALU operation
  - U: CPU running in kernel or user mode
  - I: interrupts enabled or not

### SR: Status Register

- T8: from SR to internal data bus
- C7: from internal data bus to SR
- M7: flags from IDB or SELEC
- SelP: selects the flags to update: ALU / I / U

In the pseudocode below, a primed name (C', V', ...) is a newly written bit and an unprimed name is a bit that keeps its previous value:

```
if (SelP == 11)
   Output = C'V'N'Z'IU
if (SelP == 10)
   Output = C V N Z I'U
if (SelP == 01)
   Output = C V N Z I U'
```

### SR: Save/Restore + Update from ALU

- T8, C4
  - RT1 ← SR
- T4, M7=0, C7
  - SR ← RT1
- SelP=11, M7, C7
  - SR ← \<ALU flags\>
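The selective update controlled by SelP can be modelled with a dictionary of flags. This is a sketch: the function `update_sr`, the dictionary representation and the treatment of SelP=00 as "no update" are assumptions made for the example.

```python
def update_sr(sr, new, selp):
    """Return SR after applying SelP; `sr` and `new` map flag names to bits."""
    groups = {0b11: "CVNZ", 0b10: "I", 0b01: "U"}   # flags replaced per SelP value
    out = dict(sr)
    for flag in groups.get(selp, ""):               # other values change nothing here
        out[flag] = new[flag]
    return out
```

With SelP=11 only C, V, N and Z take the new values, so the I and U bits survive an ALU operation unchanged.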

### SR: check some flag...



- C: condition to select
- B: condition or negated condition
- A0: 1: from fetch/co2maddr; 0: from MADDR/maddr+1
- C=[4...9], B=0, A0=0
  - If condition C holds, maddr ← MADDR; else maddr ← maddr+1
- C=0, B=1, A0=0
  - maddr ← MADDR

If r1 == r2: PC ← PC + offset

| Cycle | Elem. Op.                            | Control Signals                                               |
|-------|--------------------------------------|---------------------------------------------------------------|
| C1    | MAR ← PC                             | T2, C0                                                        |
| C2    | PC ← PC + 4, MBR ← MP                | C2, M2,<br>Ta, R, C1, M1, BW=11                               |
| C3    | IR ← MBR                             | T1, C3                                                        |
| C4    | Decode                               | A0, B=0, C=0                                                  |
| C5    | MBR ← SR                             | T8, C1                                                        |
| C6    | \$r1 - \$r2                          | SELA=10101, SELB=10000, MC=1,<br>SELCOP=1011, SELP=11, M7, C7 |
| C7    | If SR.Z == 0 (r1 != r2), jump to C11 | A0=0, B=1, C=110, MADDR=beq11                                 |
| C8    | RT1 ← PC                             | T2, C4                                                        |
| C9    | RT2 ← IR(offset)                     | Size=10000, Offset=0, SE=1, T3, C5                            |
| C10   | PC ← RT1 + RT2                       | MA=1, MB=1, MC=1, SELCOP=+, T6, C2                            |
| C11   | SR ← MBR                             | T1, M7=0, C7                                                  |
| C12   | Jump to fetch                        | A0, B=1, C=0                                                  |
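The effect of this microprogram can be followed with a small Python sketch. It is only an illustration under assumptions: it models PC and the Z flag alone, it takes the branch only when the subtraction yields zero, and the function name `beq` and its parameters are chosen here for the example.

```python
def beq(pc, r1, r2, offset, sr_z=0):
    """Follow cycles C1..C12 of the table; returns (new_pc, restored_z)."""
    pc = pc + 4                      # C2: PC <- PC + 4
    mbr = sr_z                       # C5: MBR <- SR (save the flags)
    sr_z = 1 if r1 - r2 == 0 else 0  # C6: $r1 - $r2 updates SR
    if sr_z == 1:                    # C7: otherwise jump straight to C11
        rt1 = pc                     # C8: RT1 <- PC
        rt2 = offset                 # C9: RT2 <- IR(offset)
        pc = rt1 + rt2               # C10: PC <- RT1 + RT2
    sr_z = mbr                       # C11: SR <- MBR (restore the flags)
    return pc, sr_z


print(beq(100, 5, 5, 16))  # branch taken
print(beq(100, 5, 6, 16))  # branch not taken
```

Because SR is saved in C5 and restored in C11, the comparison does not change the flags seen by the following instruction. The offset is added to the already incremented PC.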

# Structure of an elementary computer



# Control unit: phases of execution of an instruction



#### Instruction reading or fetch

- Read the instruction stored at the memory address held in PC and load it into IR.
- Update PC so that it points to the next instruction.

#### Decoding

- Analysis of the instruction in IR to determine:
  - The operation to be performed.
  - The control signals to be activated.

#### Execution

- Generation of the control signals in each clock cycle.

#### Clock

- A computer is a synchronous element.
- The clock controls its operation.
- The clock times the operations:
  - In one clock cycle, one or more elementary operations can be executed as long as they do not conflict.
  - The necessary control signals are kept active during the whole cycle.
- In the same cycle it is possible to perform:
  - MAR ← PC and RT3 ← RT2 + RT1
- In the same cycle it is not possible to perform:
  - MAR ← PC and IR ← RT3. Why?
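One way to reason about such conflicts is to list the shared resources each elementary operation needs. The sketch below is hypothetical: it assumes that register-to-register transfers travel over the internal bus and that the ALU writes RT3 directly, and the `RESOURCES` table and `can_share_cycle` function are illustrative names, not part of the processor description. IR denotes the instruction register.

```python
# Assumed resource usage per elemental operation (an illustration only).
RESOURCES = {
    "MAR <- PC": {"internal_bus"},
    "RT3 <- RT2 + RT1": {"alu"},
    "IR <- RT3": {"internal_bus"},
}


def can_share_cycle(*ops):
    """True when no shared resource is needed by two operations at once."""
    used = set()
    for op in ops:
        needed = RESOURCES[op]
        if used & needed:
            return False
        used |= needed
    return True


print(can_share_cycle("MAR <- PC", "RT3 <- RT2 + RT1"))  # True
print(can_share_cycle("MAR <- PC", "IR <- RT3"))         # False
```

Under these assumptions the second pair fails because both transfers would drive the internal bus in the same cycle.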

# Description of the Control Unit activity



### Fetch (Elemental Operations)

| Cycle | Elem. Op.   |
|-------|-------------|
| C1    | MAR ← PC    |
| C2    | PC ← PC + 4 |
| C3    | MBR ← MP    |
| C4    | IR ← MBR    |

Operations that do not conflict can be performed simultaneously, so the fetch fits in three cycles:

| Cycle | Elem. Op.             |
|-------|-----------------------|
| C1    | MAR ← PC              |
| C2    | PC ← PC + 4, MBR ← MP |
| C3    | IR ← MBR              |

### Fetch (Control Signals)

- Specification of the control signals that are active in each clock cycle.
  - They can be derived from the RT-level (register transfer) description.

| Cycle | Elem. Op.             | Control Signals                 |
|-------|-----------------------|---------------------------------|
| C1    | MAR ← PC              | T2, C0                          |
| C2    | PC ← PC + 4, MBR ← MP | C2, M2,<br>Ta, R, C1, M1, BW=11 |
| C3    | IR ← MBR              | T1, C3                          |
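The three fetch cycles can be traced with a short Python sketch. It is a minimal model, assuming the registers are entries of a dictionary and main memory is a dictionary indexed by byte address; the function name `fetch` and the sample instruction word are arbitrary.

```python
def fetch(state, memory):
    """Run the three fetch cycles on a dict of registers."""
    state["MAR"] = state["PC"]           # C1: MAR <- PC
    state["PC"] = state["PC"] + 4        # C2: PC <- PC + 4 ...
    state["MBR"] = memory[state["MAR"]]  # ... and MBR <- MP in the same cycle
    state["IR"] = state["MBR"]           # C3: IR <- MBR
    return state


memory = {0x1000: 0x12345678}            # one instruction word (arbitrary value)
state = fetch({"PC": 0x1000}, memory)
print(hex(state["IR"]), hex(state["PC"]))
```

The memory read in C2 uses the address latched in MAR during C1, which is why PC can be incremented in the same cycle.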

### Example

- lw \$reg, dir

| op.    | rs     | rt     | dir     |
|--------|--------|--------|---------|
| 6 bits | 5 bits | 5 bits | 16 bits |

| Cycle | Elem. Op.             | Control Signals                       |
|-------|-----------------------|---------------------------------------|
| C1    | MAR ← PC              | T2, C0                                |
| C2    | PC ← PC + 4, MBR ← MP | C2, M2,<br>Ta, R, C1, M1, BW=11       |
| C3    | IR ← MBR              | T1, C3                                |
| C4    | Decoding              | A0, B=0, C=0                          |
| C5    | MAR ← IR(dir)         | C0, T3, Size=10000,<br>Offset=00000   |
| C6    | MBR ← MP              | Ta, R, C1, M1, BW=11                  |
| C7    | \$reg ← MBR           | T1, RC=id \$reg, LC                   |
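Cycles C5 to C7 can be sketched in Python. This is an illustration under assumptions: Size and Offset are read as "take Size bits of IR starting at bit Offset, counted from the least significant bit", the opcode value in the example is arbitrary, and `ir_field` and `lw_execute` are hypothetical helper names.

```python
def ir_field(ir, size, offset):
    """Extract `size` bits of IR starting at bit `offset` (assumed meaning)."""
    return (ir >> offset) & ((1 << size) - 1)


def lw_execute(ir, regs, memory, reg_id):
    """Cycles C5..C7 of the table: load MP[dir] into register reg_id."""
    mar = ir_field(ir, size=0b10000, offset=0b00000)  # C5: MAR <- IR(dir)
    mbr = memory[mar]                                 # C6: MBR <- MP
    regs[reg_id] = mbr                                # C7: $reg <- MBR
    return regs


# 6-bit op, 5-bit rs, 5-bit rt, 16-bit dir; the field values are made up.
ir = (0b100011 << 26) | (2 << 16) | 0x0040
regs = lw_execute(ir, {}, {0x0040: 99}, reg_id=2)
print(regs)
```

With Size=10000 (16) and Offset=00000 (0), the extracted field is the 16-bit dir at the low end of the instruction word.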



### Instructions that take up several words

**Example:** addm R1, addr performs R1 ← R1 + MP[addr]

Format:

| 1st word | 2nd word       |
|----------|----------------|
| addm R1  | addr (address) |

| Cycle | Elem. Op.             |
|-------|-----------------------|
| C1    | MAR ← PC              |
| C2    | PC ← PC + 4, MBR ← MP |
| C3    | IR ← MBR              |
| C4    | Decoding              |
| C5    | MAR ← PC              |
| C6    | MBR ← MP, PC ← PC + 4 |
| C7    | MAR ← MBR             |
| C8    | MBR ← MP              |
| C9    | RT1 ← MBR             |
| C10   | R1 ← R1 + RT1         |
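The ten cycles can be traced with a short Python sketch. It is a simplified model, assuming registers and memory are dictionaries and that the first word is an opaque token; the function name `addm` and the sample addresses and values are made up for the example.

```python
def addm(state, memory):
    """Execute a two-word addm R1, addr following cycles C1..C10."""
    state["MAR"] = state["PC"]                # C1
    state["PC"] += 4                          # C2: PC now points to the 2nd word
    state["MBR"] = memory[state["MAR"]]       # C2: read the 1st word
    state["IR"] = state["MBR"]                # C3 (C4 decodes it)
    state["MAR"] = state["PC"]                # C5
    state["MBR"] = memory[state["MAR"]]       # C6: the 2nd word holds addr
    state["PC"] += 4                          # C6: PC points to the next instruction
    state["MAR"] = state["MBR"]               # C7
    state["MBR"] = memory[state["MAR"]]       # C8: operand MP[addr]
    state["RT1"] = state["MBR"]               # C9
    state["R1"] = state["R1"] + state["RT1"]  # C10
    return state


memory = {0: "addm R1", 4: 0x100, 0x100: 7}
state = addm({"PC": 0, "R1": 5}, memory)
print(state["R1"], state["PC"])
```

PC is incremented twice, once per word, so after the instruction it points past both words.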

# Simple tips (1/2): general phases in an instruction

- A. Fetch + decode
- B. Fetch operands
- C. Execution
- D. Store results

# Example

ADD (R2) R3 (R4)

A. Fetch + decode

1. MAR ← PC
2. IR ← Memory(MAR)
3. PC ← PC + 4
4. Decoding

B. Fetch operands

5. MAR ← R4
6. MBR ← Memory(MAR)
7. RT1 ← MBR

C. Execution

8. MBR ← R3 + RT1

D. Store results

9. MAR ← R2
10. Memory(MAR) ← MBR
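Steps 5 to 10 amount to Memory[R2] ← R3 + Memory[R4]. A minimal Python sketch of those steps follows, assuming registers and memory are dictionaries and omitting fetch and decode; the function name `add_indirect` and the sample values are hypothetical.

```python
def add_indirect(regs, memory):
    """Steps 5..10: Memory[R2] <- R3 + Memory[R4]."""
    mar = regs["R4"]        # 5: MAR <- R4
    mbr = memory[mar]       # 6: MBR <- Memory(MAR)
    rt1 = mbr               # 7: RT1 <- MBR
    mbr = regs["R3"] + rt1  # 8: MBR <- R3 + RT1
    mar = regs["R2"]        # 9: MAR <- R2
    memory[mar] = mbr       # 10: Memory(MAR) <- MBR
    return memory


memory = add_indirect({"R2": 0x20, "R3": 3, "R4": 0x10}, {0x10: 4, 0x20: 0})
print(memory)
```

Each phase (operand fetch, execution, store) maps to its own group of register transfers, which is the pattern the tips describe.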

# Simple tips (2/2): remember the don'ts, everything else is allowed

1. It is not possible to pass a value through a register within one clock cycle (a value written to a register is available only from the next cycle).
2. It is not possible to put two or more values on the same bus at the same time.
3. It is not possible to use a datapath that the circuitry does not provide.

Félix García-Carballeira, Alejandro Calderón Mateos
