### CS 61C:

# Great Ideas in Computer Architecture: Single-Cycle CPU Datapath & Control

#### Instructors:

John Wawrzynek & Vladimir Stojanovic

http://inst.eecs.Berkeley.edu/~cs61c/fa15

## Step 3b: Add & Subtract

- R[rd] = R[rs] op R[rt] (addu rd,rs,rt)
  - Ra, Rb, and Rw come from instruction's Rs, Rt, and Rd fields

| Bits  | 31..26 | 25..21 | 20..16 | 15..11 | 10..6  | 5..0   |
|-------|--------|--------|--------|--------|--------|--------|
| Field | op     | rs     | rt     | rd     | shamt  | funct  |
| Width | 6 bits | 5 bits | 5 bits | 5 bits | 5 bits | 6 bits |

ALUctr and RegWr: control logic after decoding the instruction



... Already defined the register file & ALU
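The register transfer for `addu` can be sketched in software. This is a minimal Python model, not the hardware: the helper names `decode_rtype` and `execute_addu` are hypothetical, the register file is assumed to be a plain list of 32 integers, and the special handling of register 0 is ignored.

```python
def decode_rtype(instr: int) -> dict:
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "op":    (instr >> 26) & 0x3F,  # bits 31..26
        "rs":    (instr >> 21) & 0x1F,  # bits 25..21
        "rt":    (instr >> 16) & 0x1F,  # bits 20..16
        "rd":    (instr >> 11) & 0x1F,  # bits 15..11
        "shamt": (instr >> 6) & 0x1F,   # bits 10..6
        "funct": instr & 0x3F,          # bits 5..0
    }


def execute_addu(regs: list, instr: int) -> None:
    """R[rd] = R[rs] + R[rt], wrapping to 32 bits (Ra=rs, Rb=rt, Rw=rd)."""
    f = decode_rtype(instr)
    regs[f["rd"]] = (regs[f["rs"]] + regs[f["rt"]]) & 0xFFFFFFFF


# addu $3, $1, $2: op=0, rs=1, rt=2, rd=3, shamt=0, funct=0x21
word = (0 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | (0 << 6) | 0x21
regs = [0] * 32
regs[1], regs[2] = 7, 0xFFFFFFFF
execute_addu(regs, word)  # 7 + 0xFFFFFFFF wraps around to 6
```

The mask to 32 bits stands in for the fixed width of the ALU output bus.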

# Register-Register Timing: One Complete Cycle (Add/Sub)




## 3c: Logical Op (or) with Immediate

R[rt] = R[rs] op ZeroExt[imm16]
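As a rough illustration of this transfer, the sketch below models `ori` in Python. The function names `zero_ext16` and `execute_ori` are hypothetical, and the register file is assumed to be a list of 32 integers.

```python
def zero_ext16(imm16: int) -> int:
    """Zero-extend a 16-bit immediate: the upper 16 bits are always 0."""
    return imm16 & 0xFFFF


def execute_ori(regs: list, rs: int, rt: int, imm16: int) -> None:
    """R[rt] = R[rs] | ZeroExt[imm16]. Note the destination is rt, not rd."""
    regs[rt] = regs[rs] | zero_ext16(imm16)


regs = [0] * 32
regs[1] = 0x12340000
execute_ori(regs, rs=1, rt=2, imm16=0x8001)
```

Because the immediate is zero-extended, bit 15 being set (as in `0x8001`) does not disturb the upper half of the result.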



## 3d: Load Operations

R[rt] = Mem[R[rs] + SignExt[imm16]]

Example: lw rt, rs, imm16



## 3e: Store Operations

Mem[ R[rs] + SignExt[imm16] ] = R[rt]
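The load and store transfers share the same address calculation, which the following Python sketch makes explicit. It is a simplified model under stated assumptions: memory is a dictionary keyed by address, unwritten locations read as 0, and the names `sign_ext16`, `effective_address`, `execute_lw`, and `execute_sw` are hypothetical.

```python
def sign_ext16(imm16: int) -> int:
    """Sign-extend a 16-bit immediate to a signed Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(regs: list, rs: int, imm16: int) -> int:
    """R[rs] + SignExt[imm16], wrapped to 32 bits."""
    return (regs[rs] + sign_ext16(imm16)) & 0xFFFFFFFF


def execute_lw(regs: list, mem: dict, rs: int, rt: int, imm16: int) -> None:
    """R[rt] = Mem[R[rs] + SignExt[imm16]]"""
    regs[rt] = mem.get(effective_address(regs, rs, imm16), 0)


def execute_sw(regs: list, mem: dict, rs: int, rt: int, imm16: int) -> None:
    """Mem[R[rs] + SignExt[imm16]] = R[rt]"""
    mem[effective_address(regs, rs, imm16)] = regs[rt]


regs = [0] * 32
mem = {}
regs[1] = 0x1004
regs[2] = 0xDEADBEEF
execute_sw(regs, mem, rs=1, rt=2, imm16=0xFFFC)  # offset -4, address 0x1000
execute_lw(regs, mem, rs=1, rt=3, imm16=0xFFFC)  # reads the same word back
```

In both cases the ALU performs an add; only the direction of the data transfer differs.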




## 3f: The Branch Instruction



beq rs, rt, imm16

- mem[PC]: Fetch the instruction from memory
- Equal = (R[rs] == R[rt]): Calculate branch condition
- if (Equal): Calculate the next instruction's address
  - PC = PC + 4 + (SignExt(imm16) x 4)
- else
  - PC = PC + 4
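The next-PC selection for `beq` can be sketched as a small Python function. The names `sign_ext16` and `next_pc_beq` are hypothetical, and register values are passed in directly rather than read from a register file.

```python
def sign_ext16(imm16: int) -> int:
    """Sign-extend a 16-bit immediate to a signed Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc_beq(pc: int, rs_val: int, rt_val: int, imm16: int) -> int:
    """Return the next PC for beq, following the register transfers above."""
    equal = rs_val == rt_val  # branch condition
    if equal:
        # The immediate counts instructions, so it is multiplied by 4.
        return (pc + 4 + sign_ext16(imm16) * 4) & 0xFFFFFFFF
    return (pc + 4) & 0xFFFFFFFF


taken = next_pc_beq(0x00400000, 5, 5, 3)           # 0x00400000 + 4 + 12
not_taken = next_pc_beq(0x00400000, 5, 6, 3)       # falls through to PC + 4
backward = next_pc_beq(0x00400010, 1, 1, 0xFFFF)   # offset -1 branches to itself
```

The offset is relative to PC + 4, which is why an immediate of -1 targets the branch instruction itself.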

# Datapath for Branch Operations

beq rs, rt, imm16



## Datapath generates condition (Equal)



Already have mux, adder, need special sign extender for PC, need equal compare (sub?)

## Instruction Fetch Unit including Branch



• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]\*4; else PC = PC + 4



How to encode nPC\_sel?

- Direct MUX select?
- Branch inst. / not branch inst.
- Let's pick 2nd option

| nPC_sel | zero? | MUX |
|---------|-------|-----|
| 0       | X     | 0   |
| 1       | 0     | 0   |
| 1       | 1     | 1   |

Q: What logic gate?
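One way to check a candidate gate against the truth table is to enumerate every input combination. The sketch below tries a two-input AND; the function name `mux_select` is hypothetical, and the don't-care row is expanded into both values of zero.

```python
def mux_select(npc_sel: int, zero: int) -> int:
    """Candidate gate: the MUX selects the branch target only if both are 1."""
    return npc_sel & zero


# (nPC_sel, zero, expected MUX select), with the X row written out twice
truth_table = [
    (0, 0, 0),
    (0, 1, 0),
    (1, 0, 0),
    (1, 1, 1),
]
matches = all(mux_select(n, z) == m for n, z, m in truth_table)
```

If `matches` is true, the candidate gate reproduces every row of the table.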



## Putting it All Together: A Single Cycle Datapath



## Clickers/Peer Instruction

What new instruction would need no new datapath hardware?

- A: branch if reg==immediate
- B: add two registers and branch if result zero
- C: store with auto-increment of base address:
  - sw rt, rs, offset // rs incremented by offset after store
- D: shift left logical by two bits



## Administrivia

- Midterm
  - A: Super Hard
  - ...
  - E: Too Easy
- Project 2-2 due 10/11
  - A: Haven't started
  - B: Done with Step 1
  - C: Done with Step 2
  - D: Done with Step 3
  - E: Finished

## Processor Design: 5 steps

- Step 1: Analyze instruction set to determine datapath requirements
  - Meaning of each instruction is given by register transfers
  - Datapath must include storage element for ISA registers
  - Datapath must support each register transfer
- Step 2: Select set of datapath components & establish clock methodology
- Step 3: Assemble datapath components that meet the requirements
- Step 4: Analyze implementation of each instruction to determine setting of control points that realizes the register transfer
- Step 5: Assemble the control logic

## **Datapath Control Signals**

- ExtOp: "zero", "sign"
- ALUsrc: $0 \Rightarrow$ regB; $1 \Rightarrow$ immed
- ALUctr: "ADD", "SUB", "OR"
- MemWr: $1 \Rightarrow$ write memory
- MemtoReg: $0 \Rightarrow$ ALU; $1 \Rightarrow$ Mem
- RegDst: $0 \Rightarrow$ "rt"; $1 \Rightarrow$ "rd"
- RegWr: $1 \Rightarrow$ write register
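To see how these control signals steer data, the following Python sketch applies one set of signals to a simplified datapath. It is a behavioral model under assumptions, not the lecture's circuit: the function name `datapath_step`, the dictionary-based memory, and the `ORI` and `SW` signal settings are illustrative, and the instruction fetch unit (nPC_sel) is left out.

```python
def datapath_step(ctrl, regs, mem, rs, rt, rd, imm16):
    """Apply one instruction's control signals to the register file and memory."""
    imm16 &= 0xFFFF
    # ExtOp: "sign" or "zero" extension of the 16-bit immediate
    if ctrl["ExtOp"] == "sign" and imm16 & 0x8000:
        ext = imm16 - 0x10000
    else:
        ext = imm16
    bus_a, bus_b = regs[rs], regs[rt]
    # ALUsrc: 0 => regB, 1 => immed
    alu_b = ext if ctrl["ALUsrc"] else bus_b
    # ALUctr: "ADD", "SUB", or "OR"
    if ctrl["ALUctr"] == "ADD":
        alu_out = bus_a + alu_b
    elif ctrl["ALUctr"] == "SUB":
        alu_out = bus_a - alu_b
    else:
        alu_out = bus_a | alu_b
    alu_out &= 0xFFFFFFFF
    # MemWr: 1 => write memory (the ALU output is the address)
    if ctrl["MemWr"]:
        mem[alu_out] = bus_b
    # MemtoReg: 0 => ALU, 1 => Mem
    data = mem.get(alu_out, 0) if ctrl["MemtoReg"] else alu_out
    # RegWr: 1 => write register; RegDst: 0 => "rt", 1 => "rd"
    if ctrl["RegWr"]:
        regs[rd if ctrl["RegDst"] else rt] = data
    return alu_out


ORI = dict(ExtOp="zero", ALUsrc=1, ALUctr="OR", MemWr=0, MemtoReg=0, RegDst=0, RegWr=1)
SW = dict(ExtOp="sign", ALUsrc=1, ALUctr="ADD", MemWr=1, MemtoReg=0, RegDst=0, RegWr=0)

regs, mem = [0] * 32, {}
regs[1] = 0xF0
datapath_step(ORI, regs, mem, rs=1, rt=2, rd=0, imm16=0x0F)    # R[2] = 0xF0 | 0x0F
datapath_step(SW, regs, mem, rs=1, rt=2, rd=0, imm16=0xFFF0)   # Mem[0xF0 - 16] = R[2]
```

The same function body runs for every instruction; only the control dictionary changes, which mirrors how one datapath serves the whole instruction subset.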



# Given Datapath: RTL -> Control



## RTL: The Add Instruction



#### add rd, rs, rt

- MEM[PC]: Fetch the instruction from memory
- R[rd] = R[rs] + R[rt]: The actual operation
- PC = PC + 4: Calculate the next instruction's address
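The three register transfers for `add` map onto three lines of a behavioral model. The sketch below is a simplification: the name `step_add` is hypothetical, instruction memory is assumed to be a dictionary keyed by address, and overflow detection is ignored.

```python
def step_add(pc: int, imem: dict, regs: list) -> int:
    """Run one add instruction and return the next PC."""
    instr = imem[pc]                                   # MEM[PC]: fetch
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F
    rd = (instr >> 11) & 0x1F
    regs[rd] = (regs[rs] + regs[rt]) & 0xFFFFFFFF      # R[rd] = R[rs] + R[rt]
    return pc + 4                                      # PC = PC + 4


# add $3, $1, $2: op=0, rs=1, rt=2, rd=3, funct=10 0000
imem = {0x00400000: (1 << 21) | (2 << 16) | (3 << 11) | 0x20}
regs = [0] * 32
regs[1], regs[2] = 10, 32
next_pc = step_add(0x00400000, imem, regs)
```

In the single-cycle design all three transfers complete within one clock period.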

## Instruction Fetch Unit at the Beginning of Add

Fetch the instruction from Instruction memory: Instruction = MEM[PC]



# Single Cycle Datapath during Add





## Instruction Fetch Unit at End of Add

- PC = PC + 4 (nPC_sel = +4)
  - Same for all instructions except Branch and Jump

## Single Cycle Datapath during Jump



New PC = { PC[31..28], target address, 00 }
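The concatenation for the new PC can be written with masks and shifts. This is a minimal sketch following the formula on the slide; the function name `jump_pc` is hypothetical.

```python
def jump_pc(pc: int, target26: int) -> int:
    """New PC = { PC[31..28], target address, 00 }"""
    upper = pc & 0xF0000000                   # keep PC[31..28]
    middle = (target26 & 0x03FFFFFF) << 2     # 26-bit target followed by 00
    return upper | middle


new_pc = jump_pc(0x40000008, 0x0100000)
```

Shifting the 26-bit target left by two supplies the trailing `00`, so the result is always word-aligned.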





## Instruction Fetch Unit at the End of Jump



New PC = { PC[31..28], target address, 00 }





## Clicker Question

## Which of the following is TRUE?

- A. The clock can have a shorter period for instructions that don't use memory
- B. The ALU is used to set PC to PC+4 when necessary
- C. Worst-delay path in Instruction Fetch unit is Add+mux delay
- D. The CPU's control needs only *opcode* to determine the next PC value to select
- E. nPC\_sel affects the next PC address on a *jump*

# In The News: Tile-Mx100 100 64-bit ARM cores on one chip



EZChip (bought Tilera)

100 64-bit ARM Cortex A53

Dual-issue, in-order

## Summary of the Control Signals (1/2)

```
inst   Register Transfer
add    R[rd] <- R[rs] + R[rt]; PC <- PC + 4
       ALUsrc=RegB, ALUctr="ADD", RegDst=rd, RegWr, nPC_sel="+4"
sub    R[rd] <- R[rs] - R[rt]; PC <- PC + 4
       ALUsrc=RegB, ALUctr="SUB", RegDst=rd, RegWr, nPC_sel="+4"
ori    R[rt] <- R[rs] | zero_ext(Imm16); PC <- PC + 4
       ALUsrc=Im, ExtOp="Z", ALUctr="OR", RegDst=rt, RegWr, nPC_sel="+4"
lw     R[rt] <- MEM[R[rs] + sign_ext(Imm16)]; PC <- PC + 4
       ALUsrc=Im, ExtOp="sn", ALUctr="ADD", MemtoReg, RegDst=rt, RegWr,
       nPC_sel="+4"
sw     MEM[R[rs] + sign_ext(Imm16)] <- R[rt]; PC <- PC + 4
       ALUsrc=Im, ExtOp="sn", ALUctr="ADD", MemWr, nPC_sel="+4"
beq    if (R[rs] == R[rt]) then PC <- PC + 4 + (sign_ext(Imm16) || 00)
       else PC <- PC + 4
       nPC_sel="br", ALUctr="SUB"
```

# Summary of the Control Signals (2/2)

See Appendix A for the func and op encodings. X marks a don't-care ("We Don't Care :-)").

|             | add     | sub      | ori     | lw      | sw      | beq      | jump    |
|-------------|---------|----------|---------|---------|---------|----------|---------|
| func        | 10 0000 | 10 0010  | X       | X       | X       | X        | X       |
| op          | 00 0000 | 00 0000  | 00 1101 | 10 0011 | 10 1011 | 00 0100  | 00 0010 |
| RegDst      | 1       | 1        | 0       | 0       | X       | X        | X       |
| ALUSrc      | 0       | 0        | 1       | 1       | 1       | 0        | X       |
| MemtoReg    | 0       | 0        | 0       | 1       | X       | X        | X       |
| RegWrite    | 1       | 1        | 1       | 1       | 0       | 0        | 0       |
| MemWrite    | 0       | 0        | 0       | 0       | 1       | 0        | 0       |
| nPCsel      | 0       | 0        | 0       | 0       | 0       | 1        | ?       |
| Jump        | 0       | 0        | 0       | 0       | 0       | 0        | 1       |
| ExtOp       | X       | X        | 0       | 1       | 1       | X        | X       |
| ALUctr<2:0> | Add     | Subtract | Or      | Add     | Add     | Subtract | X       |

| Type   | 31..26 | 25..21 | 20..16 | 15..11 | 10..6 | 5..0  | Instructions     |
|--------|--------|--------|--------|--------|-------|-------|------------------|
| R-type | op     | rs     | rt     | rd     | shamt | funct | add, sub         |
| I-type | op     | rs     | rt     | immediate (15..0) | | | ori, lw, sw, beq |
| J-type | op     | target address (25..0) | | | | | jump             |
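The control summary is, in effect, a lookup table indexed by the decoded instruction. The Python sketch below transcribes it that way. The names `CONTROL`, `OPCODES`, `FUNCTS`, and `decode` are hypothetical, don't-care entries are stored as `None`, the `?` for nPCsel on jump is also stored as `None`, and the Jump entry for add is assumed to be 0.

```python
X = None  # don't care

FIELDS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite", "MemWrite",
          "nPCsel", "Jump", "ExtOp", "ALUctr")

# One row per instruction, in the FIELDS order.
CONTROL = {
    "add":  (1, 0, 0, 1, 0, 0, 0, X, "Add"),       # Jump assumed 0
    "sub":  (1, 0, 0, 1, 0, 0, 0, X, "Subtract"),
    "ori":  (0, 1, 0, 1, 0, 0, 0, 0, "Or"),
    "lw":   (0, 1, 1, 1, 0, 0, 0, 1, "Add"),
    "sw":   (X, 1, X, 0, 1, 0, 0, 1, "Add"),
    "beq":  (X, 0, X, 0, 0, 1, 0, X, "Subtract"),
    "jump": (X, X, X, 0, 0, X, 1, X, X),            # nPCsel is "?" in the table
}

OPCODES = {0b001101: "ori", 0b100011: "lw", 0b101011: "sw",
           0b000100: "beq", 0b000010: "jump"}
FUNCTS = {0b100000: "add", 0b100010: "sub"}


def decode(op: int, funct: int = 0) -> str:
    """Map the op field (and funct for R-type) to an instruction name."""
    return FUNCTS[funct] if op == 0 else OPCODES[op]


def control_for(op: int, funct: int = 0) -> dict:
    """Return the control signals for an instruction as a name->value dict."""
    return dict(zip(FIELDS, CONTROL[decode(op, funct)]))


lw_signals = control_for(0b100011)
sub_signals = control_for(0b000000, 0b100010)
```

Note that RegWrite and MemWrite are never don't-cares: a stray write would corrupt state, so they must be 0 whenever the instruction does not write.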

# **Boolean Expressions for Controller**

```
RegDst    = add + sub
ALUSrc    = ori + lw + sw
MemtoReg  = lw
RegWrite  = add + sub + ori + lw
MemWrite  = sw
nPCsel    = beq
Jump      = jump
ExtOp     = lw + sw
ALUctr[0] = sub + beq   (assume ALUctr is 00 ADD, 01 SUB, 10 OR)
ALUctr[1] = ori

Where:
rtype = ~op5 • ~op4 • ~op3 • ~op2 • ~op1 • ~op0
ori   = ~op5 • ~op4 •  op3 •  op2 • ~op1 •  op0
lw    =  op5 • ~op4 • ~op3 • ~op2 •  op1 •  op0
sw    =  op5 • ~op4 •  op3 • ~op2 •  op1 •  op0
beq   = ~op5 • ~op4 • ~op3 •  op2 • ~op1 • ~op0
jump  = ~op5 • ~op4 • ~op3 • ~op2 •  op1 • ~op0

add   = rtype • func5 • ~func4 • ~func3 • ~func2 • ~func1 • ~func0
sub   = rtype • func5 • ~func4 • ~func3 • ~func2 •  func1 • ~func0
```

How do we implement this in gates?
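These equations have a two-level shape: each instruction signal is an AND of opcode or func bits (possibly inverted), and each control signal is an OR of instruction signals. The Python sketch below mirrors that shape with bitwise operators. It is a behavioral check rather than a gate netlist, and the function name `controller` is hypothetical.

```python
def controller(op: int, func: int) -> dict:
    """Evaluate the controller equations for a 6-bit op and 6-bit func."""
    o = [(op >> i) & 1 for i in range(6)]     # o[i] is op bit i
    f = [(func >> i) & 1 for i in range(6)]   # f[i] is func bit i

    def n(bit: int) -> int:
        return bit ^ 1                        # NOT gate

    # AND level: one product term per instruction
    rtype = n(o[5]) & n(o[4]) & n(o[3]) & n(o[2]) & n(o[1]) & n(o[0])
    ori = n(o[5]) & n(o[4]) & o[3] & o[2] & n(o[1]) & o[0]
    lw = o[5] & n(o[4]) & n(o[3]) & n(o[2]) & o[1] & o[0]
    sw = o[5] & n(o[4]) & o[3] & n(o[2]) & o[1] & o[0]
    beq = n(o[5]) & n(o[4]) & n(o[3]) & o[2] & n(o[1]) & n(o[0])
    jump = n(o[5]) & n(o[4]) & n(o[3]) & n(o[2]) & o[1] & n(o[0])
    add = rtype & f[5] & n(f[4]) & n(f[3]) & n(f[2]) & n(f[1]) & n(f[0])
    sub = rtype & f[5] & n(f[4]) & n(f[3]) & n(f[2]) & f[1] & n(f[0])

    # OR level: one sum per control signal
    return {
        "RegDst": add | sub,
        "ALUSrc": ori | lw | sw,
        "MemtoReg": lw,
        "RegWrite": add | sub | ori | lw,
        "MemWrite": sw,
        "nPCsel": beq,
        "Jump": jump,
        "ExtOp": lw | sw,
        # ALUctr encoding assumed as in the equations: 00 ADD, 01 SUB, 10 OR
        "ALUctr": (ori << 1) | (sub | beq),
    }


lw_ctrl = controller(0b100011, 0)
sub_ctrl = controller(0b000000, 0b100010)
ori_ctrl = controller(0b001101, 0)
```

Each `&` chain corresponds to one wide AND gate and each `|` chain to one OR gate, which is the structure of a programmable logic array.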

# Controller Implementation



# P&H Figure 4.17



# Summary: Single-cycle Processor

- Five steps to design a processor:
  1. Analyze instruction set $\rightarrow$ datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements
  4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
  5. Assemble the control logic
     - Formulate Logic Equations
     - **Design Circuits**



# Levels of Representation/Interpretation





## No More Magic!

