# A Novel Implementation of 4-bit Carry Look-ahead Adder

Jia Miao and Shuguo Li

Institute of Microelectronics, Tsinghua University, Beijing, China Email: miaoj15@mails.tsinghua.edu.cn, lisg@tsinghua.edu.cn

#### Abstract

The carry look-ahead adder (CLA) is a fast adder. This paper is dedicated to accelerating the 4-bit CLA circuit. In the proposed circuit structure, the XOR gate is replaced by a NOR gate. Moreover, the logic gates have less fan-in and fan-out, and the signal passes through one fewer MOS transistor in the critical path. All designs are built from dynamic CMOS logic gates. Simulation results show that the proposed architecture has advantages over conventional circuits in speed, area and power consumption.

Key words: CLA, CMOS, HSpice and logic gate.

### I. Introduction

Addition is a basic mathematical operation and an essential operation in Arithmetic Logic Units (ALU) [1]. Currently, addition has many logic designs, such as the ripple-carry adder, the carry-select adder and so on [2]. In these circuits, the sum and carry of the higher bits depend on the carry-out of the less significant bits, which causes a long propagation delay. The carry look-ahead adder (CLA) solves this problem by calculating the carry of each bit directly from the inputs. Some articles implement the logic gates in different technologies, such as BiCMOS [3], in order to speed up the CLA operation, and some build the logic gates only from NMOS transistors [4]. The circuits in [5]-[7] are implemented with CMOS logic gates. They are constructed from XOR, AND and OR gates, and all of them use the same formulas. However, the design in [8] uses NAND gates to replace the AND and NOT gates, which efficiently decreases the cost and increases the speed. Therefore [8] is the fastest CLA design at present, and it has lower cost in area and power.

Our work accelerates the CLA circuit built from dynamic CMOS logic gates. The simplification scheme presented here achieves this acceleration and also reduces power consumption.

In this paper, Section I introduces the background of the CLA. In Section II, part A describes the principle of the 4-bit CLA circuit and part B gives the newly derived functions and their analysis. Comparisons of the simulation results are presented in Section III. Finally, a conclusion is given.

### II. Implementation of Carry Look-ahead Adder

### A. The principle of 4-bit CLA

The function of the CLA is to calculate the carry-in of each bit directly from the inputs. For an addition with two inputs, the external carry-in is denoted as  $C_0$ . Both inputs have 4 bits and are represented as  $A_3A_2A_1A_0$  and  $B_3B_2B_1B_0$  respectively. The sum of this addition,  $S_3S_2S_1S_0$ , can then be calculated by full adders. Thus, bit by bit,  $S_3S_2S_1S_0 = A_3A_2A_1A_0 \oplus B_3B_2B_1B_0 \oplus C_3C_2C_1C_0$ , where  $C_3$ ,  $C_2$  and  $C_1$  are the carry-outs of the previous bits.

The logic relationships between the carry-outs and the inputs are expressed by equations (1)-(3). In them, the Carry-Propagation signal (abbreviated  $P_i$ ) and the Carry-Generate signal (abbreviated  $G_i$ ) are obtained as  $P_i = A_i \oplus B_i$  and  $G_i = A_i \& B_i = A_i B_i$ , where  $i$  is 0, 1, 2 or 3. The symbol '&' represents logic AND and is sometimes omitted in formulas. The symbol '+' denotes logic OR.

$$C_1 = G_0 + P_0 C_0 \tag{1}$$

$$C_2 = G_1 + P_1G_0 + P_1P_0C_0 \tag{2}$$

$$C_3 = G_2 + P_2G_1 + P_2P_1G_0 + P_2P_1P_0C_0 \tag{3}$$

$$G = G_3 + P_3G_2 + P_3P_2G_1 + P_3P_2P_1G_0 \tag{4}$$

$$P = P_3 P_2 P_1 P_0 \tag{5}$$

In equations (4) and (5),  $G$  represents the carry-generate signal of the whole 4-bit CLA and  $P$  represents its carry-propagation signal. Based on them, the carry-out of this addition,  $C_4$ , is given by the formula  $C_4 = G + PC_0$ .
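Equations (1)-(5) can be checked against ordinary integer addition. The following is a minimal Python sketch written for illustration only; the function names `cla4` and `check_cla4` are hypothetical and do not come from the paper or from [8].

```python
from itertools import product


def cla4(a, b, c0):
    """Conventional 4-bit CLA following equations (1)-(5).

    a and b are integers in 0..15, c0 is the external carry-in (0 or 1).
    Returns (sum as a 4-bit integer, carry-out C4).
    """
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    P = [x ^ y for x, y in zip(A, B)]  # P_i = A_i xor B_i
    G = [x & y for x, y in zip(A, B)]  # G_i = A_i and B_i

    c1 = G[0] | (P[0] & c0)                                              # (1)
    c2 = G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0)                       # (2)
    c3 = (G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0])
          | (P[2] & P[1] & P[0] & c0))                                   # (3)
    g = (G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
         | (P[3] & P[2] & P[1] & G[0]))                                  # (4)
    p = P[3] & P[2] & P[1] & P[0]                                        # (5)
    c4 = g | (p & c0)

    carries = [c0, c1, c2, c3]
    s = sum((P[i] ^ carries[i]) << i for i in range(4))  # S_i = P_i xor C_i
    return s, c4


def check_cla4():
    """Compare the CLA result with a + b + c0 for all 512 input combinations."""
    for a, b, c0 in product(range(16), range(16), range(2)):
        s, c4 = cla4(a, b, c0)
        if ((c4 << 4) | s) != a + b + c0:
            return False
    return True
```

Calling `check_cla4()` exercises every input combination, so it serves as a quick sanity check of the equations rather than as a model of the circuit timing.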

### B. Proposed CLA Design

The basic 4-bit CLA circuit can be accomplished according to equations (1)-(5) above. Our proposed structure is implemented with the following newly derived functions.

On the one hand, substituting  $G_0$  and  $P_0$  into equation (1) gives  $C_1 = A_0 B_0 + (A_0 \oplus B_0)C_0$ . According to Table I,  $C_1$  simplifies to  $C_1 = A_0 B_0 + (A_0 + B_0)C_0$ . Table I shows that the values of  $A_0 + B_0$  and  $A_0 \oplus B_0$  differ only when  $A_0B_0=11$ . In this case the final value of  $C_1$  equals '1' regardless, because  $A_0 \& B_0 = 1$ .
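The replacement of XOR by OR in the expression for  $C_1$  can be confirmed by enumerating all eight input combinations. This is a small illustrative sketch; the names `c1_xor`, `c1_or`, `rows` and `equivalent` are hypothetical.

```python
from itertools import product


def c1_xor(a0, b0, c0):
    """C1 with the conventional propagate signal P0 = A0 xor B0."""
    return (a0 & b0) | ((a0 ^ b0) & c0)


def c1_or(a0, b0, c0):
    """C1 with the propagate signal replaced by A0 or B0."""
    return (a0 & b0) | ((a0 | b0) & c0)


# One row per (A0, B0, C0) combination: inputs, then both versions of C1.
rows = [
    (a0, b0, c0, c1_xor(a0, b0, c0), c1_or(a0, b0, c0))
    for a0, b0, c0 in product((0, 1), repeat=3)
]

# True when the two expressions agree on every input combination.
equivalent = all(r[3] == r[4] for r in rows)
```

The only inputs where XOR and OR themselves differ are those with  $A_0B_0=11$ , and there the generate term already forces  $C_1 = 1$ , which is why `equivalent` holds.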

On the other hand, complementary CMOS logic gates are inherently inverting. Based on De Morgan's theorems,  $C_1$  is implemented as  $C_1 = \overline{\overline{A_0B_0}\left(\overline{A_0 + B_0} + \overline{C_0}\right)}$ . We then replace  $\overline{A_0B_0}$  and  $\overline{A_0 + B_0}$  by  $G'_0$  and  $P'_0$ . In other words,  $G'_0 = \overline{A_0 \& B_0} = \overline{A_0B_0}$  and  $P'_0 = \overline{A_0 + B_0}$ . Thus the simplified formula of  $C_1$  is finally written as equation (6).
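For reference, the De Morgan step can be spelled out line by line. This is a worked restatement assuming  $G'_0 = \overline{A_0B_0}$  (a NAND gate) and  $P'_0 = \overline{A_0 + B_0}$  (a NOR gate), not an additional result.

$$\begin{aligned}
C_1 &= A_0B_0 + (A_0 + B_0)C_0 \\
    &= \overline{\overline{A_0B_0}} + \overline{\overline{A_0 + B_0} + \overline{C_0}} \\
    &= \overline{\overline{A_0B_0}\left(\overline{A_0 + B_0} + \overline{C_0}\right)} \\
    &= \overline{G'_0\left(P'_0 + \overline{C_0}\right)}
\end{aligned}$$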

For the remaining carry-outs  $C_2$  and  $C_3$ , the same approach gives expressions (7)-(8). In particular,  $G$  and  $P$  are replaced by  $G'_{out}$  and  $P'_{out}$ , which are shown in equations (9)-(10). In them,  $G'_i = \overline{A_i \& B_i} = \overline{A_i B_i}$  and  $P'_i = \overline{A_i + B_i}$ .  $C_4$  can be built as the function  $C_4 = \overline{G'_{out}(P'_{out} + \overline{C_0})}$ . Finally, Fig. 1 shows the detailed circuit structure of the proposed 4-bit CLA.

$$C_1 = \overline{G'_0 \left( P'_0 + \overline{C_0} \right)} \tag{6}$$

$$C_2 = \overline{G'_1(P'_1 + G'_0)} + \overline{P'_1 + P'_0}\, C_0 \tag{7}$$

$$C_3 = \overline{G'_2(P'_2 + G'_1)} + \overline{P'_2 + P'_1}\;\overline{G'_0(P'_0 + \overline{C_0})} \tag{8}$$

$$G'_{out} = \overline{\overline{G'_{3} (P'_{3} + G'_{2})} + \overline{P'_{3} + P'_{2}}\;\overline{G'_{1} (P'_{1} + G'_{0})}} \tag{9}$$

$$P'_{out} = \overline{\overline{P'_{3} + P'_{2}}\;\overline{P'_{1} + P'_{0}}} \tag{10}$$
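The NAND/NOR formulation can be checked exhaustively in the same way as the conventional one. The sketch below assumes  $G'_i$  is the NAND and  $P'_i$  the NOR of  $A_i$  and  $B_i$ , and that  $G'_{out}$  and  $P'_{out}$  are the complements of the group signals  $G$  and  $P$  with OR-based propagation. It also assumes the sum bits are still formed as  $A_i \oplus B_i \oplus C_i$  as in Section II.A, since the sum logic of Fig. 1 is not reproduced here. The names `proposed_cla4`, `inv` and `check_proposed` are hypothetical.

```python
from itertools import product


def inv(x):
    """Logical NOT of a single bit."""
    return x ^ 1


def proposed_cla4(a, b, c0):
    """Carry logic in the NAND/NOR form of equations (6)-(10).

    a and b are integers in 0..15, c0 is the external carry-in (0 or 1).
    Returns (sum as a 4-bit integer, carry-out C4).
    """
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    Gp = [inv(x & y) for x, y in zip(A, B)]  # G'_i: NAND (assumed definition)
    Pp = [inv(x | y) for x, y in zip(A, B)]  # P'_i: NOR (assumed definition)
    nc0 = inv(c0)

    c1 = inv(Gp[0] & (Pp[0] | nc0))                                   # (6)
    c2 = inv(Gp[1] & (Pp[1] | Gp[0])) | (inv(Pp[1] | Pp[0]) & c0)     # (7)
    c3 = inv(Gp[2] & (Pp[2] | Gp[1])) | (inv(Pp[2] | Pp[1]) & c1)     # (8)
    g_out = inv(inv(Gp[3] & (Pp[3] | Gp[2]))
                | (inv(Pp[3] | Pp[2]) & inv(Gp[1] & (Pp[1] | Gp[0]))))  # (9)
    p_out = inv(inv(Pp[3] | Pp[2]) & inv(Pp[1] | Pp[0]))              # (10)
    c4 = inv(g_out & (p_out | nc0))

    carries = [c0, c1, c2, c3]
    # Sum bits use the full-adder XOR (an assumption of this sketch).
    s = sum(((A[i] ^ B[i]) ^ carries[i]) << i for i in range(4))
    return s, c4


def check_proposed():
    """Compare with a + b + c0 for all 512 input combinations."""
    return all(
        ((c4 << 4) | s) == a + b + c0
        for a, b, c0 in product(range(16), range(16), range(2))
        for s, c4 in [proposed_cla4(a, b, c0)]
    )
```

In equation (8) the second product term contains exactly the expression for  $C_1$  from equation (6), so the sketch reuses `c1` there instead of recomputing it.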

The path from  $B_0$  through  $C_3$  to  $S_3$  is the longest. For the function  $\overline{C_3}$  of our design and  $C_3$  of [8], Fig. 2 illustrates a dynamic CMOS structure for each. By contrast, the logic gate in Fig. 2(a) has less fan-in, and its signal passes through one fewer transistor, than the logic gate in Fig. 2(b). Moreover, the proposed logic gates  $G_i$  and  $P_i$  have less fan-out and the logic gates  $C_i$  have less fan-in compared with the corresponding gates in [8].

TABLE I LOGIC RELATIONSHIP TRUTH TABLE OF  $C_1$ 

| $C_0$  | $A_0B_0$ | $A_0 \& B_0$ | $A_0 \oplus B_0$ | $A_0 + B_0$ | $C_1$ |
|--------|----------|--------------|------------------|-------------|-------|
| 0 or 1 | 00       | 0            | 0                | 0           | 0     |
| 0 or 1 | 01       | 0            | 1                | 1           | $C_0$ |
| 0 or 1 | 10       | 0            | 1                | 1           | $C_0$ |
| 0 or 1 | 11       | 1            | 0                | 1           | 1     |

### III. Results and Comparisons

Based on the  $0.25\mu m$  technology, the circuits are implemented with dynamic logic gates of minimal size, using the typical SPICE transistor models of [9]. The inverter has a  $0.5\mu m$  PMOS width and a  $0.25\mu m$  NMOS width. The supply voltage is 2.5 V and the outputs have no extra load. Fig. 3 shows the HSpice waveforms of the longest paths.

The simulation results of the circuits are listed in Table II. The data show that the delay of the 4-bit CLA is reduced from 1.008ns to 0.732ns. In addition, our design uses fewer transistors, so the transistor area is smaller. Consequently, the power consumption of the proposed design over the latency is lower than that of [8].
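The relative improvements implied by Table II can be computed directly from its entries. The snippet below is only an arithmetic aid; the dictionary keys and the helper `reduction_percent` are hypothetical names, and the values are copied from Table II.

```python
def reduction_percent(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


# (value for [8], value for the proposed design), taken from Table II.
table_ii = {
    "delay_ns": (1.008, 0.732),
    "area_um2": (37.937, 30.125),
    "power_consumption_J": (6.988e-13, 5.075e-13),
}

reductions = {
    name: round(reduction_percent(ref, ours), 1)
    for name, (ref, ours) in table_ii.items()
}
```

With these figures the delay and the power consumption each drop by roughly 27%, and the area by roughly 21%.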

### IV. Conclusion

In this paper, we implement a novel 4-bit CLA circuit. The logic gates in the proposed design have less fan-in and fan-out, and the XOR gate is replaced by a NOR gate. Meanwhile, the signal passes through one fewer MOS transistor in the critical paths. Power consumption and area have also decreased. Comparisons show that our design has better circuit performance than the reported designs. This scheme can also be applied to other architectures.

TABLE II RESULTS OF 4-BIT CLA DESIGNS

| Design | Delay   | Area             | Power consumption         |
|--------|---------|------------------|---------------------------|
| [8]    | 1.008ns | $37.937 \mu m^2$ | $6.988 \times 10^{-13} J$ |
| Our    | 0.732ns | $30.125 \mu m^2$ | $5.075 \times 10^{-13} J$ |

### Acknowledgment

This work is supported by the National Natural Science Foundation of China under Grant No.61674086.

Fig. 1. Proposed circuit structure of 4-bit CLA.

Fig. 2. (a) A dynamic CMOS logic structure of  $\overline{C_3}$  in our design. (b) A dynamic CMOS logic structure of  $C_3$  in [8].

Fig. 3. The simulation waves of  $S_3$  in HSpice.

#### References

- [1] Morrison M, Lewandowski M, Meana R, et al. Design of a novel reversible ALU using an enhanced carry look-ahead adder[C]//Nanotechnology (IEEE-NANO), 2011 11th IEEE Conference on. IEEE, 2011: 1436-1440.
- [2] Knowles S. A family of adders[C]//Computer Arithmetic, 2001. Proceedings, 15th IEEE Symposium on. IEEE, 2001: 277-281.
- [3] Kuo J B, Liao H J, Chen H P. A BiCMOS dynamic carry lookahead adder circuit for VLSI implementation of high-speed arithmetic unit[J]. IEEE journal of solid-state circuits, 1993, 28(3): 375-378.
- [4] Hwang B K. Carry lookahead adder: U.S. Patent 5,951,631[P]. 1999-9-14.
- [5] Mano M M, Kime C R, Martin T. Logic and computer design fundamentals[M]. Prentice Hall, 2008.
- [6] Preethi K, Balasubramanian P. FPGA implementation of synchronous section-carry based carry look-ahead adders[C]//Devices, Circuits and Systems (ICDCS), 2014 2nd International Conference on. IEEE, 2014: 1-4.
- [7] Singh R P P, Kumar P, Singh B. Performance analysis of 32-bit array multiplier with a carry save adder and with a carry-look-ahead adder[J]. International Journal of Recent Trends in Engineering, 2009, 2(6): 83-86.
- [8] Pai Y T, Chen Y K. The Fastest Carry Lookahead Adder[C]//IEEE International Workshop on Electronic Design, Test and Applications. IEEE Computer Society, 2004: 434.
- [9] Rabaey J M, Chandrakasan A, Nikolic B. Digital Integrated Circuits: A Design Perspective[M]. Second Edition. Prentice Hall, 2003.