Vol.No:03 doi:

Received: 15/12/2018 Revised: 15/01/2019 Accepted: 15/02/2019

# Performance Analysis of Ultra Low Power and PDP Efficient 1-Bit Full Adder Circuit for Arithmetic Blocks

S.Vijayakumar<sup>1</sup>, Senior Member IEEE, V.Jayaprakasan<sup>2</sup>, Senior Member IEEE, and Reeba Korah<sup>3</sup>

<sup>1</sup>Department of ECE, Sreenivasa Institute of Technology and Management Studies, Autonomous, Chittoor, AP, India-517127

<sup>2</sup>Department of ECE, Sreenidhi Institute of Science & Technology, Autonomous, Hyderabad, Telangana, India-501301.

<sup>3</sup>Department of ECE, Alliance College of Engineering & Design, Alliance University, Anekal, Bangalore, Karnataka, India-562106.

dr.vijay-vlsi@ieee.org, jprakasan@gmail.com, reeba.korah@alliance.edu.in

Abstract - Technology scaling, now approaching 10nm, makes it necessary to account for the power consumed by interconnects as well. For devices with feature sizes below 100nm, estimating their characteristics is a tedious job. In this work, the TGA, TFA, SERF, ULPFA and BBL-PT types of 1-bit full adder cells are analyzed along with the proposed 8-transistor Ultra Low Power (8T-ULP) full adder to compare their performance. The 32nm BPTM model file is used to implement the designs, and an RCA structure is used as the top module to analyze the behavior of the adders. The comparison shows that the 8T-ULP adder consumes at least 2.7 times less power than the SERF and up to 10 times less power than the ULPFA. The TGA and TFA switch 4 times faster than the proposed adder, while the ULPFA switches 7 times slower. However, the PDP of the proposed 8T-ULP is 35% more efficient than that of the TGA and TFA. Since the 8T-ULP is simple, it occupies 1600λ², the smallest area among the leaf cells.

Keywords - Ultra Low Power, 8T-ULP, 1-bit Full Adder, RCA, Multiplier, VLSI, Circuit Level Approach.

# I. Introduction

Apart from the trade-off among power, area and delay, low power is a leading constraint in many miniaturized digital systems. Low power designs allow designers to shrink the backup power source, such as a battery, while still offering prolonged standby. The popular techniques for miniaturization and low power are

- Technology scaling
- Circuit level optimization
- Logic optimization
- Technology innovation for low power
- Parallelism and pipelining concepts, and so on.

Scaling offers many advantages for low-power operation. The first and foremost benefit is the improved device characteristic for low-voltage operation, which is due to the improved current driving capability. Scaling is done by a dimensionless factor S. The scalable parameters are the dimensions in all directions (x, y and z), the device voltage and the doping concentration. Smaller geometries also reduce the device and junction capacitances. Lateral scaling is another approach in which only the lengths of the transistors are scaled down, popularly termed scaling the feature size. Industry generally scales process generations with  $S = \sqrt{2}$ , which doubles the number of transistors per unit area between two generations.
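The density claim can be checked with a few lines of Python. This is a minimal sketch, assuming that the transistor density grows with the square of the lateral scaling factor; the helper name `density_gain` is hypothetical and only used for illustration.

```python
import math


def density_gain(scale_factor: float) -> float:
    """Transistor density gain when both lateral dimensions shrink by scale_factor."""
    return scale_factor ** 2


S = math.sqrt(2)  # per-generation scaling factor quoted in the text
print(f"Density gain for S = sqrt(2): {density_gain(S):.2f}x")

# Ratio of two feature sizes mentioned in this paper (45 nm and 32 nm).
node_ratio = 45 / 32
print(f"45 nm / 32 nm = {node_ratio:.3f}, density gain = {density_gain(node_ratio):.2f}x")
```

The step from 45nm to 32nm is close to, but not exactly, a factor of √2, so the density gain it implies is slightly below 2.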

However, scaling down CMOS technology makes high performance more difficult to achieve, because velocity saturation sets in earlier than at larger feature sizes [1] – [2]. At short channel lengths L, the drain current per channel width (W) is no longer proportional to (Vgs – Vt)²/L, where Vgs is the gate-to-source voltage and Vt is the threshold voltage. Instead, the drain current is proportional to Vsat(Vgs – Vt), where Vsat is the saturation velocity [3] – [4]. In addition, at the sub-micron level the difference between the operating voltage and the threshold voltage is very small, so static leakage becomes dominant and increases the static power consumption. This increases the design complexity, and care must be taken to design the system for better performance.

According to the 2012 predictions of the International Technology Roadmap for Semiconductors (ITRS), 32nm devices are the current trend, with nearly 3100 million transistors per die against the 1550 million CMOS transistors per die of the 45nm technology in use during 2010 [5]. The supply voltage is also scaled down to less than 1 volt, which drastically reduces the dynamic power consumption, since it is proportional to the square of the supply voltage.

Along with technology scaling, circuit level power optimization offers a number of choices to limit the operating power, such as multi-Vt, multi-Vdd, clock gating and power gating. The availability of multiple and variable threshold devices leads to MTCMOS technology. These techniques balance the critical path by using low-Vt cells to speed up switching in the logic path and high-Vt cells as power switches. This results in good management of the active and standby power trade-off and a higher density of integration. Sleep transistors and power gating are the familiar methods used in connection with device scaling. Power gating saves energy during the circuit's idle state.

The paper is organized as follows. The literature on various low power logic styles for 1-bit full adder design is reviewed in the next section. The proposed 8T-ULP full adder is presented in section III. The simulation and performance comparison of the proposed work with the other low power approaches are given in section IV. The concluding remarks are given in section V.

# II. Earlier Approaches to Low Power Full Adder

In a CMOS circuit, designers aim to achieve low power dissipation along with high throughput. Technology, circuit design style, architectural abstraction and algorithmic abstraction are the factors that influence the power dissipation of a CMOS circuit. The static CMOS circuit is robust because its pull-up and pull-down networks play a vital role in ensuring full swing, which makes it immune to noise. A full adder in which PMOS transistors pull the output up to logic 1, equal in magnitude to the supply voltage, and NMOS transistors discharge the output node to ground is of the CMOS style.

The PMOS pull-up networks drive the output to a full logic high level, while the NMOS pull-down networks drive it to a full logic low level equal to ground. Combined, they make the CMOS circuit produce a full swing output voltage. Many CMOS circuit styles exist, of both static and dynamic kinds. The remainder of this section describes logic styles, mostly other than the regular CMOS implementation, for 1-bit adder cells that offer low power consumption.

## 1. Ultra Low Power Full Adder

The Ultra-Low Power Full Adder (ULPFA) is built from XOR gates that consume less energy. The PMOS and NMOS transistors are connected like complementary pass transistors to produce the sum output. Two PMOS transistors in series and two NMOS transistors, together with an inverter, form a two-input XOR gate. This arrangement is replicated and cascaded to form a three-input XOR gate as shown in fig. 1(a). The carry uses the complementary static CMOS design method as in fig. 1(b). The design operates at a supply voltage of less than 1 volt and is hence named the Ultra-Low Power Full Adder. It performs consistently well even at a supply below 0.6V, as mentioned in [6]. The leakage power consumption of the ULPFA is negligibly small.



Fig. 1. ULPFA (a) Sum circuit. (b) Carry circuit

## 2. BBL-PT Adder

It combines Branch Based Logic and Pass Transistor circuits to form a hybrid circuit, abbreviated as the BBL-PT full adder. Its primary advantage is low operating power. Only a few transistors conduct at a time and the circuit has few nodes, which reduces the switching activity and hence the power. The branch-based implementation meets the requirements of an adder and ensures robustness with respect to voltage and device scaling.

At the physical design level, the circuit reduces the diffusion capacitance since it eases diffusion sharing, which leads to a regular and compact layout. The series connection of transistors between the output node and the supply rail forms a branch-like circuit, hence the name BBL. Using Karnaugh maps, the circuit is optimized with PMOS and NMOS networks to produce the sum-of-products form of the Boolean expression.



Fig. 2. Sum circuit of BBL – PT adder

The carry block is constructed with the same kind of logic as the carry block of the ULPFA shown in fig. 1(b). The sum block does not follow this method of implementation; it uses Pass Transistor (PT) logic to achieve the functionality with a lower device count, and in total 24 transistors are used to construct the complete adder [6]. The sum block with the PT structure, shown in fig. 2, hence computes the arithmetic function AB + BCin + ACin. The 1-bit adder of this type is therefore termed BBL-PT, because it uses the BBL method to compute the carry and the PT method to compute the sum output.

## 3. Static Energy Recovery Full Adder

The Static Energy Recovery Full Adder (SERF) is another method that operates with lower energy requirements. In regular CMOS, the load capacitance charged to Vdd is discharged to ground when the output switches from logic '1' to logic '0'.



Fig. 3. Static Energy Recovery Full Adder

This kind of non-energy-recovering operation results in a higher short circuit power Psc. In an energy recovery circuit, in contrast, eliminating the path to ground reduces the power consumption by removing the term Psc from the total power equation. The energy is discharged from the load capacitance to the control gates. The combination of having no direct path to ground and re-applying the load charge to the control gates makes the energy recovery full adder a power efficient design, as represented in fig. 3. This idea leads to a full adder with minimal transistors and a drastic reduction in total power utilization.

Just 10 transistors are enough to construct the complete SERF type of 1-bit adder, the lowest count among the styles described so far, which leads to the lowest die area requirement [7]. The transient response of this adder is fast and competitive with the other logic methods taken for comparison. However, it suffers from threshold loss, i.e., the circuit produces a degraded output. The use of this type of adder hence becomes critical when it is used to build larger blocks [8], [9].

## 4. Transmission Gate Adder

Another unconventional static method is the Transmission Gate Adder (TGA). Only transmission gates and inverters are used to implement the XOR and multiplexing functions, which requires 20 MOS transistors. The design avoids the threshold voltage drop problem, which is its advantage over the TFA. An NMOS transistor passes a strong 0 but a weak 1, and a PMOS transistor passes a strong 1 but a weak 0.

Fig. 4. Transmission Gate Full Adder





Fig. 5. Functions of Transmission gate (a) Charging node B, (b) Discharging node B

CMOS uses a PMOS pull-up and an NMOS pull-down, whereas the transmission gate combines the advantages of both devices [10] - [13] by placing an NMOS device in parallel with a PMOS device, as in fig. 4. The transmission gate acts as a bi-directional switch controlled by the gate signal C, as shown in fig. 5(a). When C = 1, both MOSFETs are on, allowing the signal to pass through the gate. It is expressed as,

$$A = B \quad \text{if } C = 1$$
(1)

Otherwise, C = 0 places both transistors in cutoff, creating an open circuit between nodes A and B.

To understand the function of a transmission gate, consider charging node B to Vdd as in fig. 5(a). Node A is set to Vdd and the transmission gate is enabled (C = 1 and C' = 0). Since both the NMOS and PMOS transistors are on, node B is fully charged to Vdd. Conversely, consider discharging node B to 0 as in fig. 5(b): B is initially at Vdd when node A is driven low. The PMOS can pull node B down only to |VTP|, at which point it turns off. The parallel NMOS device stays on with VGS = Vdd and pulls node B down to GND. Though the transmission gate needs two transistors to achieve this, it ensures a full swing.
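The charging and discharging behavior described above can be sketched as a simple voltage-level model. This is only an idealized illustration: the supply and threshold values (`VDD`, `VTN`, `VTP`) are assumed, not taken from the 32nm model file, and the function names are hypothetical.

```python
VDD = 1.0  # assumed supply voltage in volts
VTN = 0.3  # assumed NMOS threshold voltage in volts (illustrative only)
VTP = 0.3  # assumed magnitude of the PMOS threshold voltage in volts


def nmos_pass(v_in: float) -> float:
    """NMOS with its gate at VDD: strong 0, but a high level is clipped at VDD - VTN."""
    return min(v_in, VDD - VTN)


def pmos_pass(v_in: float) -> float:
    """PMOS with its gate at 0 V: strong 1, but a low level is clipped at |VTP|."""
    return max(v_in, VTP)


def transmission_gate(v_in: float, c: int):
    """Parallel NMOS/PMOS pair; returns None when the gate is open (C = 0)."""
    if c != 1:
        return None
    n, p = nmos_pass(v_in), pmos_pass(v_in)
    # The device that is not threshold-limited completes the transition.
    return n if abs(n - v_in) <= abs(p - v_in) else p


print("NMOS alone passing VDD:", nmos_pass(VDD))
print("PMOS alone passing 0 V:", pmos_pass(0.0))
print("Transmission gate passing VDD:", transmission_gate(VDD, 1))
print("Transmission gate passing 0 V:", transmission_gate(0.0, 1))
```

The single devices lose one threshold voltage on one of the two levels, while the parallel pair reproduces both rails.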

## 5. Transmission Function Adder

A structure closely similar to the TGA is the Transmission Function Adder (TFA), given in fig. 6. It is built with 16 MOS transistors. The basic XOR circuit used in the design requires one of its two inputs in complementary form. An additional inverter is thus needed, which leads to a 6-transistor XOR design. Additionally, four CMOS transmission gates are employed to achieve a full swing output.



Fig. 6. Transmission Function Adder

The method is a good choice for building XOR or XNOR gates, and it ensures low energy requirements [14] – [17]. Its drawback is the poor driving capability of the design in a hierarchical architecture. Hence there is significant performance degradation due to threshold voltage loss when it is cascaded over many stages.

# III. The Proposed 8T – ULP Adder

This is a novel approach which is designed with only 8 MOS transistors to achieve the function of a 1-bit full adder.

In regular CMOS implementations, the inputs are applied to the gate terminals of the transistors, with the sources of the PMOS and NMOS transistors connected to Vdd and GND respectively. In pass transistor circuits, the inputs are also applied to the source and/or drain terminals. Pass transistor logic is a ratioless logic class: the dc characteristics are not affected by the size of the transistors. Increasing the size reduces the resistance, but this can be offset by the increase in diffusion capacitance. Pass transistors in a cascaded chain can be arranged from largest to smallest to reduce delay. The pass transistors have a series resistance associated with them. This logic style is built with NMOS transistors alone or with parallel combinations of PMOS and NMOS transistors as transmission gates, as in the TGA logic style. The switching activities of the transistors control parts of the entire circuit. Complex gates can be implemented with a minimum number of transistors, which results in a simpler circuit and reduced node capacitances [18], [19].



Fig. 7. The proposed 8T – ULP Adder

The implementation of the proposed 8T-ULP method is simple: it has two PMOS transistors at the first stage, a static inverter at the intermediate stage and four transistors at the output stage, as in fig. 7. As discussed above, a conventional PTL design uses either NMOS transistors alone or transmission gates built from a parallel combination of PMOS and NMOS transistors to realize a specific function.

In the proposed method, however, the drains of the two PMOS transistors at the first stage are coupled together and the inputs A and B are applied to the sources. These transistors act as switches to give the sum output S. If Cin = 0, S is driven by node q, and for Cin = 1 it is driven by node r, which is the inverted signal of q. At the final stage of the sum output, the signals from nodes q and r are fed into the sources of a PMOS and an NMOS transistor whose gates are coupled together and driven by the input Cin.

When Cin = 0, the PMOS is on to produce the sum output S. For Cin = 1, on the other hand, the NMOS transistor is on to give the output S. Hence the arrangement of transistors clearly differs from the traditional XOR gate. At node q, the top PMOS gives A'B while the bottom one produces AB'. The drains of the two PMOS transistors are coupled and produce a wired-OR logic. This is expressed with respect to node q as

$$q = A'B + AB'$$
(2)

The static CMOS inverter at intermediate stage offers an inverted signal with respect to node q at its output node r, given by

$$r = (A'B + AB')'$$
(3)

The sum output S is driven by the PMOS for Cin at logic 0, and the NMOS is on to give S for Cin equal to logic 1. Hence it is a wired-OR logic expressed as

$$S = Cin'\,q + Cin\,r$$
(4)

Recalling expressions (2) and (3),

$$S = Cin' (A'B + AB') + Cin(A'B + AB')'$$

(5)

It is also expressed as

$$S = A \oplus B \oplus Cin$$

(6)

The carry output Cout is obtained with the help of the intermediate signal already computed for the sum output (node r). The resulting reduction in device count makes the circuit simpler than the earlier methods. The signal from node r is connected to the coupled gate terminals of the PMOS and NMOS transistors of the carry-out path (at the bottom of the circuit). The sources of the PMOS and NMOS are driven by the inputs Cin and B respectively.

The drains of the two devices are shorted to give a wired-OR function for Cout, which can be derived as

$$Cout = r'Cin + rB$$

(7)

Since r = q', it follows that q = r'. Hence (7) can be written as

$$Cout = qCin + rB$$

(8)

$$= (AB' + A'B)Cin + (A'B' + AB)B$$

$$= AB'Cin + A'BCin + A'B'B + AB$$

$$= AB'Cin + A'BCin + 0 + AB$$

$$= A(B + B'Cin) + A'BCin$$

$$= A (B + Cin) + A'BCin$$

$$Cout = AB + BCin + ACin$$
(9)

From (6) and (9) it is proven that the adder's outputs are logically correct.
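The derivation can also be checked exhaustively over all eight input combinations. The following is a minimal Boolean sketch of expressions (2), (3), (4) and (8); it models only the logic, not the transistor-level behavior such as threshold loss, and the function name `adder_8t_ulp_logic` is hypothetical.

```python
from itertools import product


def adder_8t_ulp_logic(a: int, b: int, cin: int):
    """Boolean model of the 8T-ULP equations for one set of inputs."""
    q = ((1 - a) & b) | (a & (1 - b))  # node q, expression (2)
    r = 1 - q                          # node r, expression (3)
    s = ((1 - cin) & q) | (cin & r)    # sum output, expression (4)
    cout = (q & cin) | (r & b)         # carry output, expression (8)
    return s, cout


for a, b, cin in product((0, 1), repeat=3):
    s, cout = adder_8t_ulp_logic(a, b, cin)
    assert s == a ^ b ^ cin                          # expression (6)
    assert cout == (a & b) | (b & cin) | (a & cin)   # expression (9)
    assert 2 * cout + s == a + b + cin               # arithmetic check
    print(a, b, cin, "->", "S =", s, "Cout =", cout)
```

Each row of the printed truth table satisfies both the XOR form of the sum and the majority form of the carry.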

# IV. Performance Analysis and Comparison

Each full adder taken for comparison with the proposed 8T – ULP is used as the leaf cell of an n x n Ripple Carry Adder (RCA) structure (here 4x4, with n = 4), as in fig. 8, to observe its performance. The simulation is carried out with the 32nm Berkeley Predictive Technology Model (BPTM) to analyze the discussed 1-bit adders, and the results are listed in table 1.
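The role of the 1-bit cell as a leaf cell can be illustrated with a behavioral ripple-carry chain. This sketch assumes a plain 4-bit chain of full adders; it is not the exact top module of fig. 8, and all function names are hypothetical.

```python
from itertools import product


def full_adder(a: int, b: int, cin: int):
    """Behavioral 1-bit full adder used as the leaf cell."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin: int = 0):
    """Add two LSB-first bit lists; the carry ripples from one leaf cell to the next."""
    sum_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value: int, width: int = 4):
    """LSB-first bit list of an unsigned integer."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits) -> int:
    """Unsigned integer from an LSB-first bit list."""
    return sum(bit << i for i, bit in enumerate(bits))


# Cover every pair of 4-bit operands, in the spirit of covering all test cases.
for x, y in product(range(16), repeat=2):
    s_bits, cout = ripple_carry_add(to_bits(x), to_bits(y))
    assert from_bits(s_bits) + (cout << 4) == x + y
```

Swapping `full_adder` for a model of any of the compared cells leaves the chain unchanged, which is why a common top module gives a like-for-like comparison.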



Fig. 8. The Ripple Carry Adder Structure to simulate the adders



Fig. 9. Average power Vs supply voltage

The following procedure is carried out to simulate the RCA:

- The lengths and widths of the PMOS and NMOS transistors of the adders are kept uniform, like those of a unit sized inverter. Here the width of the NMOS transistors is twice the channel length, and the PMOS transistors have twice the NMOS width [1]. This ensures a 50% duty cycle and un-skewed operation of the circuits with equal rise and fall resistance.
- Though the adders are compared using the values measured at Vdd = 0.8V, the supply voltage is varied from 0.8V to 1.6V to check the performance of the discussed logic styles with the increased device threshold levels.
- All the adders are simulated and the performance measures are tabulated under the same test conditions; most importantly, all the test cases are covered.
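The sizing rule and the supply sweep of the procedure can be written down as a short sketch. It assumes that the drawn channel length equals the 32nm feature size, which the text does not state explicitly; the helper name `unit_inverter_widths` is hypothetical.

```python
def unit_inverter_widths(channel_length_nm: float) -> dict:
    """Widths following the stated rule: Wn = 2 * L and Wp = 2 * Wn."""
    w_n = 2 * channel_length_nm
    w_p = 2 * w_n
    return {"L": channel_length_nm, "Wn": w_n, "Wp": w_p}


# Assumption: the channel length is taken equal to the 32 nm node.
sizes = unit_inverter_widths(32.0)
print(sizes)

# Supply voltages used in the comparison: 0.8 V to 1.6 V in steps of 0.2 V.
vdd_sweep = [round(0.8 + 0.2 * step, 1) for step in range(5)]
print(vdd_sweep)
```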

Table 1. Average Power, Delay and PDP of the adders used in RCA

| $V_{dd}$           | 0.8V  | 1.0V   | 1.2V    | 1.4V   | 1.6V    |
|--------------------|-------|--------|---------|--------|---------|
| Average Power (nW) |       |        |         |        |         |
| 8T - ULP           | 8.95  | 27.32  | 63.19   | 126.71 | 226.3   |
| TGA                | 58.79 | 185.94 | 435.42  | 877.52 | 1533.3  |
| TFA                | 64.18 | 202.98 | 487.152 | 949.94 | 1572.41 |
| BBL-PT             | 24.84 | 104.96 | 342.71  | 815.23 | 1567.7  |
| SERF               | 54.17 | 127.99 | 246.76  | 413.93 | 643.26  |
| ULPFA              | 90.6  | 311.42 | 730.62  | 1514.3 | 2811.3  |
| Delay (ns)         |       |        |         |        |         |
| 8T - ULP           | 23.75 | 22.66  | 25.24   | 25.22  | 25.22   |
| TGA                | 5.41  | 4.91   | 4.93    | 4.92   | 4.93    |
| TFA                | 5.32  | 4.87   | 4.86    | 4.84   | 4.83    |
| BBL-PT             | 25.54 | 25.32  | 25.23   | 25.2   | 25.17   |
| SERF               | 15.01 | 25.35  | 25.26   | 25.12  | 35      |
| ULPFA              | 35.14 | 35.09  | 35.08   | 35.07  | 35.07   |
| PDP (pWS)          |       |        |         |        |         |
| 8T - ULP           | 0.21  | 0.62   | 1.59    | 3.19   | 5.71    |
| TGA                | 0.32  | 0.91   | 2.15    | 4.32   | 7.56    |
| TFA                | 0.34  | 0.99   | 2.37    | 4.6    | 7.6     |
| BBL-PT             | 0.63  | 2.66   | 8.65    | 20.54  | 39.46   |
| SERF               | 0.81  | 3.24   | 6.23    | 10.4   | 22.51   |
| ULPFA              | 3.18  | 10.93  | 25.63   | 53.11  | 98.58   |



Fig. 10. Delay Vs supply voltage

The 8T-ULP adder consumes nearly 3 times less power than the BBL-PT and up to 10 times less power than the ULPFA at Vdd=0.8V, as per the measured values in table 1. The comparison at Vdd=0.8V is sufficient with respect to technology scaling, since devices with feature sizes below 100nm operate at a Vdd under 1 volt. Nevertheless, the circuits are tested by varying Vdd from 0.8V to 1.6V in steps of 0.2V for two reasons: one is to find the effect of Vdd versus threshold voltage shift for the various logic methods, and the other is to check the performance differences between the adders under the impact of Vdd on the devices [20].
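The power ratios quoted above can be recomputed directly from the average power rows of table 1. This is a small sketch that only divides the tabulated values by those of the 8T-ULP; the dictionary and function names are hypothetical.

```python
VDD_STEPS = [0.8, 1.0, 1.2, 1.4, 1.6]

# Average power in nW, copied from table 1.
AVG_POWER_NW = {
    "8T-ULP": [8.95, 27.32, 63.19, 126.71, 226.3],
    "TGA": [58.79, 185.94, 435.42, 877.52, 1533.3],
    "TFA": [64.18, 202.98, 487.152, 949.94, 1572.41],
    "BBL-PT": [24.84, 104.96, 342.71, 815.23, 1567.7],
    "SERF": [54.17, 127.99, 246.76, 413.93, 643.26],
    "ULPFA": [90.6, 311.42, 730.62, 1514.3, 2811.3],
}


def power_ratios(reference: str = "8T-ULP") -> dict:
    """Power of each adder divided by the power of the reference adder, per Vdd."""
    ref = AVG_POWER_NW[reference]
    return {
        name: [p / r for p, r in zip(values, ref)]
        for name, values in AVG_POWER_NW.items()
        if name != reference
    }


ratios = power_ratios()
for i, vdd in enumerate(VDD_STEPS):
    row = ", ".join(f"{name}: {r[i]:.2f}x" for name, r in ratios.items())
    print(f"Vdd = {vdd:.1f} V -> {row}")
```

Every ratio is greater than 1 at every supply voltage, which matches the statement that the 8T-ULP consumes the lowest power in all cases.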



Fig. 11. PDP Vs supply voltage

The average power consumption under these conditions is given in fig. 9. With  $Vin < Vgs-Vt$ , the leakage power is dominant, whereas at  $Vin > Vgs-Vt$ , the dynamic power increases. Both depend on the drain current, which follows the expression below.

$$I_d = \mu_{eff} C_{ox} W (V_{gs}-V_t)^2 / L$$
(10)

Where,  $\mu_{eff}$  is the effective mobility of electrons, Cox is the oxide capacitance, W is the width of the NMOS transistor and L is the channel length. Since Vgs and Vt are the only quantities that change with time, (10) can also be expressed as

$$I_d \propto (V_{gs}-V_t)^2$$
(11)

With higher Vdd, the value of (Vgs-Vt) is larger and the drain current follows the square law [21]-[24]. Hence all the logic methods follow (11). From Vdd = 1V upwards in steps of 200mV, the dynamic power of every design roughly doubles with respect to its value at the previous supply voltage. This does not hold for the 200mV increase from 0.8V to 1V, because the difference between Vgs and Vt is less than 1V and hence the dynamic power is lower; at this point the leakage current is dominant. In all cases the proposed design consumes by far the lowest power. Though the 8T-ULP has a circuit integrity similar to that of the compared designs, it consumes the lowest power because its node transition activity factor  $\alpha$  is the lowest among the compared logic styles.  $\alpha$  is the average number of times a node makes a power consuming transition with respect to frequency [25]. The dynamic power dissipation related to  $\alpha$  is expressed as

$$P = \alpha C_L V_{dd}^2 f$$
(12)
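Expression (12) can be evaluated numerically to see the quadratic dependence on the supply voltage. In this sketch the activity factor and frequency are the 8T-ULP values from table 2, while the load capacitance `C_LOAD` is an assumed, purely illustrative value; the function name is hypothetical.

```python
def dynamic_power(alpha: float, c_load: float, vdd: float, freq: float) -> float:
    """Dynamic power in watts according to P = alpha * C_L * Vdd^2 * f."""
    return alpha * c_load * vdd ** 2 * freq


ALPHA = 0.42      # activity factor of the 8T-ULP row in table 2
FREQ = 42.11e6    # frequency of the 8T-ULP row in table 2, in Hz
C_LOAD = 1e-15    # assumed load capacitance of 1 fF (not given in the paper)

p_low = dynamic_power(ALPHA, C_LOAD, 0.8, FREQ)
p_high = dynamic_power(ALPHA, C_LOAD, 1.6, FREQ)
print(f"P(0.8 V) = {p_low:.3e} W")
print(f"P(1.6 V) = {p_high:.3e} W")
print(f"Ratio = {p_high / p_low:.2f}")
```

With α, C_L and f held constant, doubling Vdd quadruples the dynamic power; the measured values in table 1 additionally reflect how frequency and leakage change with the supply.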

The device counts, number of nodes, frequency and  $\alpha$  values for various adders implemented in the top module of fig. 8 are listed in table 2.

Table 2. Frequency and node transition activity factor  $\alpha$  of various Adders

| Type of Adder ($V_{dd}$ = 1V) | Device Count | Number of Nodes | Area ($\lambda^2$) | Frequency (MHz) | $\alpha$ |
|---------------|--------|-----------|---------------|-----------|------|
| 8T-ULPFA      | 171    | 794       | 1600          | 42.11     | 0.42 |
| SERF          | 188    | 850       | 1920          | 66.64     | 1.49 |
| TFA           | 236    | 1050      | 2880          | 188       | 0.51 |
| TGA           | 268    | 1194      | 2880          | 185       | 0.42 |
| BBL-PT        | 300    | 1370      | 4480          | 39.15     | 0.72 |
| ULPFA         | 348    | 1594      | 5440          | 28.46     | 3.12 |


Interpreting the  $\alpha$  and frequency values from table 2 together with the average power consumption given in table 1, one can identify that the operating frequency of a circuit increases with its number of nodes. A larger number of nodes also increases the switching activity, so that the circuit consumes more power than a circuit with fewer nodes. In other words, an increase in the operating frequency of a design yields an increase in its power consumption [26]-[29]. Hence, with the least number of nodes (nearly half that of the ULPFA), the 8T-ULP consumes 10 times less average power than the ULPFA at Vdd=0.8V.

Apart from the contribution of wiring parasitics, the device resistance R and the capacitances between its various terminals, such as Cgs, Csd, Cgb and Csb, determine how the signal propagates to the output. The delay values of the various methods for Vdd ranging from 0.8V to 1.6V are plotted on the y-axis of fig. 10, with the supply voltage on the x-axis. Since the TGA and TFA are similar, their curves nearly coincide. Due to the availability of Vdd at intermediate nodes, the TGA and TFA switch 3 times faster than the SERF and 7 times faster than the ULPFA. This results from the nature of the circuits and depends on the applied electric field, the velocity of electrons and the mobility of charge carriers [27]. Here, the average velocity of electrons equals the product of the mobility of charge carriers and the electric field, which governs the switching activity of a circuit and is expressed as,

$$\upsilon = \mu E$$
(13)

Here, E is the electric field defined as

$$E = V_{ds} / L$$
(14)

Where, Vds is the drain-source voltage and L is the channel length of MOS transistors. The drain current is expressed as,

$$I_{ds} = Q_{channel}\,\upsilon / L$$
(15)

Here, Qchannel is the charge across the channel. Hence the velocity of electrons causes a circuit to switch accordingly [27],[28]. The propagation of the signal also depends on the delay caused by channel resistance Rch [30] expressed as

$$R_{ch} = \frac{L}{\mu_{eff} C_{ox} W (V_{gs}-V_t)}$$
(16)
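Expressions (13), (14) and (16) can be evaluated together to see how the supply voltage affects carrier velocity and channel resistance. All device parameters below (`MU_EFF`, `COX`, `VT`, and the dimensions) are assumed illustrative values rather than BPTM model parameters, and the simple relation (13) ignores velocity saturation; the function names are hypothetical.

```python
def electric_field(vds: float, length: float) -> float:
    """Lateral electric field E = Vds / L, expression (14)."""
    return vds / length


def drift_velocity(mobility: float, field: float) -> float:
    """Average carrier velocity v = mu * E, expression (13)."""
    return mobility * field


def channel_resistance(length, mu_eff, cox, width, vgs, vt) -> float:
    """Channel resistance Rch = L / (mu_eff * Cox * W * (Vgs - Vt)), expression (16)."""
    overdrive = vgs - vt
    if overdrive <= 0:
        raise ValueError("the transistor is off: Vgs must exceed Vt")
    return length / (mu_eff * cox * width * overdrive)


# Assumed illustrative parameters in SI units (not taken from the model file).
LENGTH = 32e-9   # channel length in m
WIDTH = 64e-9    # NMOS width in m (twice the channel length)
MU_EFF = 0.03    # effective mobility in m^2/(V s)
COX = 0.02       # oxide capacitance per unit area in F/m^2
VT = 0.3         # threshold voltage in V

r_low_vdd = channel_resistance(LENGTH, MU_EFF, COX, WIDTH, 0.8, VT)
r_high_vdd = channel_resistance(LENGTH, MU_EFF, COX, WIDTH, 1.6, VT)
velocity = drift_velocity(MU_EFF, electric_field(0.8, LENGTH))

print(f"Rch at Vgs = 0.8 V: {r_low_vdd:.1f} ohm")
print(f"Rch at Vgs = 1.6 V: {r_high_vdd:.1f} ohm")
print(f"Velocity at Vds = 0.8 V: {velocity:.3e} m/s")
```

A larger gate overdrive lowers the channel resistance, which is consistent with faster switching at higher supply voltages.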

In line with the points discussed for average power, the designs that operate with low power are slower, while the faster circuits consume more power. The proposed design switches at moderate speed due to its simple form and few nodes, which enable it to manage the electric field and hence achieve a better velocity. The power delay product (PDP) curves plotted in fig. 11 give another dimension for choosing the adder according to the requirement. PDP is needed because of the pronounced trade-off between average power and delay. If low power is the major design objective, it is achieved at the expense of slower switching, due to the lower electric field available throughout the devices and the resulting lower mobility of electrons. Conversely, higher speed is possible at the cost of higher dynamic power consumption as a result of frequent switching [31] -[34]. Hence PDP is calculated to obtain an additional view. Though the 8T-ULP switches at moderate speed, its low power makes it 15 times more PDP efficient than the ULPFA. The PDP of the 8T-ULP is about 35% lower than that of the TGA and TFA, while the BBL-PT has 3 times and the SERF 4 times the PDP of the proposed design, as in table 1.
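The PDP column of table 1 at Vdd = 0.8V can be reproduced from the power and delay columns. In this sketch the division by 1000 is inferred from the tabulated numbers and is not stated in the text; the dictionary and function names are hypothetical.

```python
# Values at Vdd = 0.8 V, copied from table 1.
POWER_NW = {"8T-ULP": 8.95, "TGA": 58.79, "TFA": 64.18,
            "BBL-PT": 24.84, "SERF": 54.17, "ULPFA": 90.6}
DELAY_NS = {"8T-ULP": 23.75, "TGA": 5.41, "TFA": 5.32,
            "BBL-PT": 25.54, "SERF": 15.01, "ULPFA": 35.14}
PDP_TABLE = {"8T-ULP": 0.21, "TGA": 0.32, "TFA": 0.34,
             "BBL-PT": 0.63, "SERF": 0.81, "ULPFA": 3.18}


def pdp(power_nw: float, delay_ns: float) -> float:
    """Power delay product; the factor 1000 is inferred from the table values."""
    return power_nw * delay_ns / 1000.0


computed = {name: round(pdp(POWER_NW[name], DELAY_NS[name]), 2) for name in POWER_NW}
relative = {name: computed[name] / computed["8T-ULP"] for name in computed}
speedup_tga = DELAY_NS["8T-ULP"] / DELAY_NS["TGA"]

for name in computed:
    print(f"{name}: PDP = {computed[name]:.2f} (table: {PDP_TABLE[name]:.2f}), "
          f"{relative[name]:.2f}x the 8T-ULP")
print(f"TGA is {speedup_tga:.2f}x faster than the 8T-ULP")
```

The recomputed products agree with the tabulated PDP values to two decimal places, and the ratios show how the low power of the 8T-ULP outweighs its longer delay.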

Lambda (λ) based design rules are used to estimate the area of the leaf cells. For the area calculations, only the adders are considered. The P and N type diffusion layers (for PMOS and NMOS transistors respectively) are placed horizontally. Two more horizontal Metal 1 layers carry Vdd and gnd, and one layer is needed for the connections between nodes. Hence a total of 5 horizontal layers are needed. A wiring track requires a 4λ width and 4λ spacing from its neighbor, for a total pitch of 8λ. Hence the 5 horizontal layers require 40λ. The inputs and terminals (gate, source, drain) are accounted for vertically [35] - [37]. A 3-input 1-bit full adder (A, B, Cin) requires 3x8λ in addition to another 8λ for the supply connections (Vdd and gnd), a total of 32λ. The 8T-ULP has five horizontal layers and hence needs 40λ. Vertically it has 3 inputs and an inverter along with the supply terminals, which needs 5x8λ for a total of 40λ. The overall area of the design is 40λ by 40λ, which is 1600λ². The areas of all the other adders are calculated similarly and given in table 2. The 8T-ULP is hence termed an Ultra Low Power design, an optimized method with the least layout area.
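The track-based estimate for the 8T-ULP can be reproduced with a short calculation. This sketch assumes that the cell area is simply the product of the horizontal and vertical track counts times the 8λ pitch; the track counts of the other adders are not given in the text, so their areas are taken from table 2. The function name `cell_area` is hypothetical.

```python
TRACK_PITCH = 8  # 4 lambda width + 4 lambda spacing per wiring track


def cell_area(horizontal_tracks: int, vertical_tracks: int) -> int:
    """Cell area in lambda^2 from the number of tracks in each direction."""
    height = horizontal_tracks * TRACK_PITCH
    width = vertical_tracks * TRACK_PITCH
    return height * width


# 8T-ULP: 5 horizontal layers and 5 vertical tracks (3 inputs, inverter, supply).
area_8t_ulp = cell_area(5, 5)

# Areas in lambda^2, copied from table 2.
AREA_TABLE = {"8T-ULPFA": 1600, "SERF": 1920, "TFA": 2880,
              "TGA": 2880, "BBL-PT": 4480, "ULPFA": 5440}

saving_vs_serf = 1 - area_8t_ulp / AREA_TABLE["SERF"]
ratio_vs_ulpfa = AREA_TABLE["ULPFA"] / area_8t_ulp

print(f"8T-ULP area: {area_8t_ulp} lambda^2")
print(f"Saving against SERF: {saving_vs_serf:.1%}")
print(f"ULPFA area relative to 8T-ULP: {ratio_vs_ulpfa:.2f}x")
```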

# V. Conclusion

PMOS and NMOS transistors functioning as simple switches are integrated into a novel 1-bit ultra low power full adder circuit with only 8 transistors, termed 8T-ULP. It is compared with similar 1-bit adder structures. To test the performance of the adders, an RCA structure is used in which all the designs serve as leaf cells. The proposed method consumes 2.7 times less power than the SERF and 10 times less power than the ULPFA. The other logic styles also consume more power than the 8T-ULP at all values of Vdd. Though the proposed circuit switches at a typical speed compared with the rest of the adders, of which the TGA and TFA are faster, it has the additional benefit of a PDP up to 15 times more efficient. The area of the 8T-ULP circuit is also at least 17% smaller than that of the SERF, and the ULPFA needs more than 3 times its design space.

As future scope, the dominance of the leakage components of power under technology scaling needs attention in order to control the energy consumption, particularly at 32nm.

# References

- [1] J.M.Rabaey, A.Chandrakasan and B.Nikolic, Digital Integrated Circuits: A Design Perspective, 2nd edition, Pearson Education, 2008.
- [2] N.Weste and K.Eshraghian, Principles of CMOS VLSI Design: A Systems Perspective, 2nd edition, Addison-Wesley, 1993.
- [3] Y.Taur and T.H.Ning, Fundamentals of Modern VLSI Devices, Cambridge University Press, Reprint-2004.
- [4] Behzad Razavi, Design of Analog CMOS Integrated Circuits, Tata Mc Graw Hill, 2002.
- [5] International Technology Road map for Semiconductors predictions (ITRS), 2012.
- [6] Ilham Hassoune, Denis Flandre, Ian O'Connor and Jean-Didier Legat, "ULPFA: A New Efficient Design of a Power-Aware Full Adder," IEEE Trans. Circuits Syst. I, Vol. 57, No. 8, August 2010, pp. 2066-2074.
- [7] R.Shalem, E.John and L.K.John, "A Novel Low Power Energy Recovery Full Adder cell," Proceedings of IEEE 9th Great Lakes symposium on VLSI, 1999, pp. 380-383.
- [8] Zine Abid, Wei Wang, "New Designs of Redundant binary Full Adders and It's Applications," IEEE International Symposium on Circuits and Systems, pp.3366-3369, 2008.
- [9] Konduru Lakshmi Bhanu Prakash Reddy, S.Vijayakumar, "High Speed Low Area and Energy Efficient 32 bit Carry Skip Adder using Verilog HDL", International Journal of Engineering Research in Electronics and Communication Engineering, ISSN: 2394-6849, DOI: 01.1617/vol4iss3pid041, vol. 4, no. 3, pp. 205-209, March 2017.
- [10] Chien-Hung Lin and Shu-Chung Yi, Jin-Jia Chen, "Low Power Adders Design for Portable Video Terminals," IEEE Int. Conf. on Intelligent Information Hiding and Signal Processing, pp.651-654, 2008.
- [11] A.Shams and M.Bayoumi, "Performance Evaluation of 1-bit CMOS adder cells," Proc. of IEEE Int. Symposium on Circuit and Systems, pp.27-30, 1999.
- [12] Kuo-Hsing Cheng, Chih-Sheng hang, "The Novel Efficient Design of XOR/XNOR Function for Adder Applications," 6th IEEE Conf. Proceedings of ICECS '99, pp. 29-32, 1999.
- [13] J.Archana, S.Vijayakumar, P.Saravanan, A.Manimegalai, "Modeling a Mobile Robot to Reduce Energy Consumption", International Journal of Science and Innovative Engineering & Technology (IJSIET), ISBN 978-81-904760-7-2, vol. 2, May 2015.
- [14] T.Vigneswaran, B. Mukundhan, and P. Subbarami Reddy, "A Novel Low Power, High Speed 14 Transistor CMOS Full Adder Cell with 50% Improvement in Threshold Loss Problem," 13th conf. World Academy of Science, Engineering and Technology, pp.81-85, 2006.
- [15] Nan Zhuang and Haomin Wu, "A New Design of the CMOS Full Adder," IEEE Journal of Solid State Circuits, Vol.27, No.5, pp.840-844, 1992.
- [16] S.Vijayakumar, V.Jayaprakasan, Reeba Korah, "Design and Analysis of Low Power, Area Efficient Skip Logic for CSKA Circuit in Arithmetic Unit", Proceedings of IEEE Conference on Emerging Devices and Smart Systems, ICEDSS 2018, 2nd and 3rd March 2018.
- [17] A.M.Shams, Tarek K.Darwish, Magdy A.Bayoumi, "Performance Analysis of Low Power 1-bit CMOS Full Adder Cells," IEEE Trans. on VLSI Systems, Vol.10, No.1, pp.20-29, Feb. 2002.
- [18] A.Chandrakasan, R.W.Brodersen, Low power Digital CMOS Design, Kluwer Academic Publishers, 2002.
- [19] Vijayakumar, S & Reeba Korah 2014, 'Circuit level, 32 nm, 1-bit MOSSI-ULP adder: power, PDP and area efficient base cell for unsigned multiplier', IEICE Electronics Express, vol. 11, no. 6, pp. 1-7, DOI: 10.1587/elex.11.20140109.
- [20] J.M.Rabaey and M.Pedram, Low Power Design Methodologies, Kluwer Academic Publishers, 2002.
- [21] G.K. Yeap and F.N. Najm, Low Power VLSI Design and Technology, World Scientific Publishers, 1996.
- [22] D.Habeeba Sulthana, S.Vijayakumar, "An Efficient Implementation of High Speed, Low Power Vedic Multipliers using Reversible Gates", Journal of Emerging Technologies and Innovative Research (JETIR), ISSN: 2349-5162, vol. 5, no. 9, pp. 69-73, September 2018.
- [23] M.Sravani, S.Vijayakumar, "Design and Implementation of Hybrid LUT/Multiplexer FPGA Logic Architectures using Verilog HDL", International Journal of Research, DOI: 16.10089.IJR.2018.V7I9.285311.22836, vol. 7, no. 9, pp. 820-828, September 2018.
- [24] K.Roy and S.C.Prasad, Low Power CMOS VLSI Circuit Design, Wiley Interscience Publishers, 2000.
- [25] A.Bellaouar and M.I.Elmasry, Low Power Digital VLSI Design: Circuits and Systems, Kluwer Academic Publishers, 5th printing 2000.
- [26] G.Hang and X.Wu, "Improved Structure for Adiabatic CMOS Circuit Design," Microelectronics Journal, Vol. 33, pp.403-407, 2002.

- [27] S.Vijayakumar and Reeba Korah, "Area and Power Efficient Hybrid PTCSL MUX Design," EJSR, Vol. 83 No. 1, pp.39-52, 2012.
- [28] Venkatesan, RG, Vijayakumar, S & Yashodha, P 2013, 'Power harvesting and area efficient clock gating method for a de-composed MUX controller', International Journal of Advanced Research in Computer Science, vol. 4, no. 4, pp. 45-51.
- [29] S.Vishnupriya, A.Manimegalai, P.Saravanan, S.Vijayakumar, "Design of an Intelligent Human Interacting Mobile Robot with Sensing System", International Journal of Science and Innovative Engineering & Technology (IJSIET), ISBN 978-81-904760-7-2, vol. 2, May 2015.
- [30] M.S.Lundstrom and J.Guo, Nanoscale Transistors: Device Physics, Modeling and Simulation, Springer, 2008.
- [31] H.Kaul et al., "A 300mV 494 GOPS/W Reconfigurable Dual-Supply 4-Way SIMD Vector Processing Accelerator in 45nm CMOS," IEEE Journal of Solid State Circuits, Vol. 45, No. 1, pp. 95-102, 2010.
- [32] Nivitha, D, Vijayakumar, S & Reeba Korah 2014 'Design and implementation of microstrip antenna for wireless application in UWB region', International Journal of Research in Engineering Technology, vol. 3, Special Issue 7, pp. 260-264.
- [33] K.S.Chong, B.H.Gwee and J.S.Chang, "Low Energy 16-bit Booth Leapfrog Array Multiplier using Dynamic Adders," IET Circuits Devices Syst., Vol. 1, No.2, pp. 170-174, 2007.
- [34] S.Rohith Kumar, S.Vijayakumar, "An Optimized Power Efficient Programmable PRPG with Test Compression Capabilities", International Research Journal in Advanced Engineering and Technology, E-ISSN: 2454-4752, vol. 2, no. 3, pp. 1061-1066, 2016.
- [35] R.Jacob Baker, CMOS Circuit Design Layout and Simulation, John Wiley and Sons, 2010.
- [36] A.Manikanta, D.Y.Pushpamitra, S.Vijayakumar, "Design of CMOS PLC Receiver Using Dual Power Lines for Design For Testability", Journal of Emerging Technologies and Innovative Research (JETIR), ISSN: 2349-5162, vol. 5, no. 9, pp. 74-82, September 2018.

[37] J.P.Uyemura, Chip Design for Submicron VLSI:
