# Computer Organization and Architecture Laboratory, Autumn 2021

Assignment 1

Problem 1

1a: Half Adder



**Circuit Diagram of Half Adder** 

**Truth Table for Half Adder** 

| а | b | sum = a ^ b | carry = a & b |
|---|---|-------------|---------------|
| 0 | 0 | 0           | 0             |
| 0 | 1 | 1           | 0             |
| 1 | 0 | 1           | 0             |
| 1 | 1 | 0           | 1             |

1b: Full Adder



Circuit Diagram for Full Adder implemented using Half Adders

**Truth Table for Full Adder** 

| а | b | carry_in | sum = a ^ b ^ carry_in | carry_out = (a & b)   (c & (a ^ b)) |
|---|---|----------|------------------------|-------------------------------------|
| 0 | 0 | 0        | 0                      | 0                                   |
| 0 | 0 | 1        | 1                      | 0                                   |
| 0 | 1 | 0        | 1                      | 0                                   |
| 0 | 1 | 1        | 0                      | 1                                   |
| 1 | 0 | 0        | 1                      | 0                                   |
| 1 | 0 | 1        | 0                      | 1                                   |
| 1 | 1 | 0        | 0                      | 1                                   |
| 1 | 1 | 1        | 1                      | 1                                   |

1c: n-bit Ripple Carry Adder Speeds



Circuit Diagram for 8-bit Ripple Carry Adder implemented using full adders



Circuit Diagram for 16-bit Ripple Carry Adder implemented using 8-bit RCAs



Circuit Diagram for 32-bit Ripple Carry Adder implemented using 16-bit RCAs



Circuit Diagram for 64-bit Ripple Carry Adder implemented using 32-bit RCAs

Longest Delays Observed during Xilinx simulation

- 1. 8-bit Ripple Carry Adder = 9.949ns (Levels of Logic = 36)
- 2. 16-bit Ripple Carry Adder = 18.717ns (Levels of Logic = 70)
- 3. 32-bit Ripple Carry Adder = 36.253ns (Levels of Logic = 138)
- 4. 64-bit Ripple Carry Adder = 71.325ns (Levels of Logic = 274)

#### 1d: Subtraction using RCA

In order to perform a - b, we can write it as a + (2s complement of b)

= a + (1s complement of b) + 1

where the 1s complement of b is obtained by inverting every bit of b.

Thus, we pass every bit of b through a NOT gate to get the inverse of b. In order to do the addition of 1, we can set the **initial carry to 1** in our RCA. This way, we get a + (inverse of b) + 1 using an RCA.

In order to perform a subtraction of two n bit numbers, we can ripple in a series of n full adders to make an n bit RCA, and use it to perform the addition of a + (1s complement of b) + 1 by setting the input carry to 1 instead of 0 and completenting bits of b using NOT gates. Thus, subtraction can be performed using addition using the 2s complement method. We do not have to perform two additions to add the extra 1 as it can be passed as the carry.

## Problem 2

# 2a: 4-bit Carry Lookahead Adder



Design of the 4-bit CLA

# Inputs:

A[3..0], B[3..0], C0 (input carry)

## **Outputs:**

Sum[3..0] : Sum of A[3..0] and B[3..0] C4 (Output carry)

P[0..3] = Propagate signals G[0..3] = Generate signals

```
carry[1..4] = Generated carry bits
```

**Equations** for the generate and propagate signals along with sum:

```
P[i] = A[i] ^ B[i], for 0<=i<=3
G[i] = A[i] & B[i], for 0<=i<=3
sum[i] = P[i] ^ carry[i], for 0<=i<=3
```

Boolean equations of the Look-ahead carry generation for the 4 carry bits, C1, C2, C3, and C4:

```
carry[0] = input carry
carry[i+1] = G[i] | (P[i] & carry[i]), for 0<=i<=3
```

More specifically, on expanding:

```
 \begin{array}{l} \text{carry[1] = G[0] \mid (P[0] \& carry[0])} \\ \text{carry[2] = G[1] \mid (P[1] \& G[0]) \mid (P[1] \& P[0] \& carry[0])} \\ \text{carry[3] = G[2] \mid (P[2] \& G[1]) \mid (P[2] \& P[1] \& G[0]) \mid (P[2] \& P[1] \& P[0] \& carry[0])} \\ \text{carry[4] = G[3] \mid (P[3] \& G[2]) \mid (P[3] \& P[2] \& G[1]) \mid (P[3] \& P[2] \& P[1] \& G[0]) \mid (P[3] \& P[2] \& P[1] \& P[0]} \\ \text{\& carry[0])} \\ \end{array}
```

2b: Comparison of 4-bit CLA with 4-bit RCA

```
Longest delay in 4-bit RCA = 5.565ns (Levels of Logic = 20)
Longest delay in 4-bit CLA = 2.123ns (Levels of Logic = 4)
```

**Hence, the 4-bit CLA is much faster than the 4-bit RCA.** The reason for this is that we have eliminated the rippling of carry from one bit to another by introducing generate and propagate signals (and computing them in an optimised way using multiple levels of **lookahead circuit**) which speed up the process of carry generation and propagation through the circuit.

# 2c (i): Augmented 4-bit CLA



Design of the augmented 4-bit CLA

## Inputs:

A[3..0], B[3..0], C0 (input carry)

#### **Outputs:**

Sum[3..0] : Sum of A[3..0] and B[3..0]

P\_Block : Block Propagate G\_Block : Block Generate

P[0..3] = Propagate signals G[0..3] = Generate signals carry[1..4] = Generated carry bits **Equations** for the generate and propagate signals:

Boolean equations of the Look-ahead carry generation for the 4 carry bits, C1, C2, C3, and C4:

```
carry[0] = input carry
carry[i+1] = G[i] | (P[i] & carry[i]), for 0<=i<=3
More specifically, on expanding:
carry[1] = G[0] | (P[0] & carry[0])
carry[2] = G[1] | (P[1] & G[0]) | (P[1] & P[0] & carry[0])
carry[3] = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[0] & carry[0])
```

No need to calculate c[4] as the LCU does that using P\_Block and G\_Block.

 $sum[i] = P[i] ^ carry[i], for 0 <= i <= 3$ 

Calculation of Block Propagate and Generate for the entire 4-bit block (independent of carry[0]):

P\_block = P[3] & P[2] & P[1] & P[0] G\_block = G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1]) | (P[3] & P[2] & P[1] & G[0])

2c (ii): 16-bit CLA with Lookahead Carry Unit



Design 1: Design of 16-bit CLA using block propagates and block generates passing through a second layer of lookahead



Design 2: Design of 16-bit CLA by rippling carries from the 4-bit CLA blocks

Longest delay in 16-bit CLA without the Lookahead Carry Unit = **6.167ns** (Levels of Logic =14) Longest delay in 16-bit CLA with the Lookahead Carry Unit = **5.243ns** (Levels of Logic =11)

Hence, the inclusion of the Lookahead Carry Unit improves the speed of the 16-bit CLA in comparison to cascaded 4-bit CLA structure where the carry bits are rippled in. The reason is that the former has an extra level of lookahead circuit which optimises the generation/propagation of carry bits even more.

2c (iv): Comparison between the 16-bit CLA and the 16-bit RCA

|                             | 16-bit Ripple Carry<br>Adder    | 16-bit Carry Lookahead<br>Adder |
|-----------------------------|---------------------------------|---------------------------------|
| Longest Delay               | 18.717ns (Levels of Logic = 70) | 5.243ns (Levels of Logic = 11)  |
| Number of Slice LUTs        | 48 / 63400                      | 29 / 63400                      |
| Number of Bonded IO Buffers | 50 / 210                        | 52 / 210                        |
| Number of LUT-FF Pairs used | 48                              | 29                              |
| Number of Occupied Slices   | 48 / 15850                      | 14 / 15850                      |

Hence, we see that:

- 1. The **lookup table (LUT) cost** of the 16-bit CLA is much **lesser** in comparison to the 16-bit RCA.
- 2. The **speed** of the 16-bit CLA is much **greater** (as delay is lesser) in comparison to the 16-bit RCA.

The reason for this is that we have eliminated the rippling of carry from one bit to another by introducing generate and propagate signals (and computing them in an optimised way using multiple levels of **lookahead circuit**) which speed up the process of carry generation and propagation through the circuit.