# Design of Efficient Multiplier with Low Power and High-Speed Using PTL (Pass Transistor Logic)

Dr. D. Satyanarayana<sup>1</sup>, Dr. M. Chennakesavulu<sup>2</sup>, Ms. N. Fouzia Sulthana<sup>3</sup>, K. Upendra<sup>4</sup>, D. Sashidhar<sup>5</sup>, K. Ramachandra Reddy<sup>6</sup>, N. Naga Sai Vikranth<sup>7</sup>, V. Devendra<sup>8</sup>

<sup>1</sup> Professor, <sup>2</sup> Associate Professor, <sup>3</sup> Assistant Professor, Dept. of ECE, Rajeev Gandhi Memorial College of Engineering and Technology, Andhra Pradesh, India

<sup>4, 5, 6, 7, 8</sup> Students, Dept. of ECE, Rajeev Gandhi Memorial College of Engineering and Technology, Andhra Pradesh, India

Email: upendrakurni@gmail.com

Abstract—The demand for low-power, high-speed, and area-efficient digital circuits has driven the exploration of alternative logic families such as Pass Transistor Logic (PTL). This paper presents the design of a multiplier circuit that leverages the inherent advantages of PTL to improve power consumption, operating speed, and silicon area. The proposed multiplier uses PTL-based logic gates to generate partial products, followed by a reduction tree and a final addition stage, all optimized for performance and efficiency. Key design challenges inherent in PTL circuits, such as voltage degradation and the need for level restoration, are addressed through carefully designed voltage restoration techniques and custom PTL cells. The architecture is compared against conventional CMOS-based multipliers in terms of power efficiency and speed. All circuits are simulated using ECAD tools to analyze the power, delay, area, and Power-Delay Product (PDP) of the multiplier. The results show a substantial reduction in power consumption and faster operation, making the PTL-based multiplier well suited to high-performance, low-power applications in modern digital systems. The proposed work contributes to low-power digital design by showing the potential of PTL for building multipliers that are both efficient and scalable to future technology nodes.

Key Words: Low power - Full adder - Multiplier - Delay - Pass Transistor.

#### I. INTRODUCTION

In today's world of digital technology, electronic systems play an integral role in many everyday activities. Core components such as microprocessors, digital communication modules, and signal processing units form the backbone of these systems. Technology scaling has introduced several challenges, chief among them power consumption and silicon area constraints, which limit the viability and widespread application of such systems. With the rapid advancement of emerging technologies such as the Internet of Things (IoT), smart homes, and intelligent urban infrastructure, the demand for energy-efficient and compact circuit designs has become more pressing than ever. Modern digital circuits must therefore balance reduced power consumption, minimized chip area, and high computational efficiency. Arithmetic circuits, which include adders, multipliers, and dividers, are the basic building blocks of many digital applications. Of these, multipliers are the most important, since their efficiency determines the overall computational speed and power consumption of a system. Optimizing multiplier topologies and developing new computation methods are thus critical for achieving fast, low-energy multiplication, which improves the functionality of electronic devices in real-world applications. In this work, we propose a power-aware multiplier for operands of up to eight bits that uses Pass Transistor Logic (PTL) to operate at high speed while occupying little area. It uses the Modified Booth Encoding (MBE) technique to improve speed and to reduce both power dissipation and circuit complexity. With these optimizations, the proposed multiplier achieves substantial improvements in speed and energy efficiency, making it well suited to modern digital applications.

#### II. BACKGROUND

Multipliers play a crucial role in various computational applications, making them a significant area of research [2], [3]. Over the years, researchers have proposed various approaches to improve multiplier efficiency using different algorithms, including array multipliers, carry-save architectures, Wallace tree structures, and optimized Booth encoding techniques [4], [5]. These design strategies aim at an optimal balance between power consumption, processing speed, and silicon area. Modern multiplier architectures have become increasingly sophisticated, but further performance gains require novel standard cells and circuit optimization techniques. Traditionally, standard cell designs are based on Complementary Metal-Oxide-Semiconductor (CMOS) logic. To improve circuit performance further, researchers have turned to Pass Transistor Logic (PTL) [6]. For example, [6] reported an all-digital compute-in-memory macro in which both CMOS-based and PTL-based full adders were integrated in the adder tree, increasing computational efficiency by 30%. In [7], a PTL-based full adder was integrated into an RCA circuit to reduce the transistor count. Furthermore, in [8], a PTL-based Booth selection circuit incorporated into a 54×54 multiplier lowered transistor usage by 45%, demonstrating the potential of PTL in arithmetic circuits. Despite these advantages, PTL-based designs present certain challenges. While they require fewer transistors, this does not always translate into a smaller layout area, because their intricate connectivity can lower layout utilization [9]. Furthermore, cascading PTL circuits causes large delays, as observed in [10]. To mitigate this problem, the authors of [11] placed inverters at the output of PTL circuits, which doubled the power and area. The hybrid approach of [9] alternated PTL and CMOS full adders, reducing the cascading delay while remaining efficient. Moreover, PTL circuits often suffer from threshold-voltage degradation of their output levels and have poor driving capability, so some design modifications are needed. In [9], for example, inverters were placed after every two PTL units to improve logic restoration and reliability. Given these trade-offs, the successful use of PTL in arithmetic circuits requires careful design to maximize performance while minimizing the associated drawbacks.

#### III. PROPOSED DESIGN

In this study, we present a multiplier for operands of up to eight bits that uses Pass Transistor Logic (PTL) circuits to optimize its three main portions: partial product generation, the reduction tree, and the final addition stage. The circuits were implemented in a 45 nm technology node, with improved power, area, and speed.

TABLE I
A COMPARATIVE EVALUATION OF CMOS AND PTL DESIGNS IN 45 NM TECHNOLOGY WITH A 0.9 V SUPPLY AT 500 MHZ

| Design   | Transistor Count | Power (µW) | Delay (ps) | Area (µm²) |
|----------|------------------|-----------|-----------|-----------|
| CMOS-XOR | 12               | 17.16     | 65.0      | 0.78      |
| PTL-XOR  | 6                | 0.389     | 34.20     | 0.39      |
| CMOS-HA  | 20               | 6.929     | 87.72     | 1.47      |
| PTL-HA   | 12               | 1.964     | 87.71     | 0.88      |
| CMOS-FA  | 28               | 2.505     | 147.0     | 1.67      |
| PTL-FA   | 16               | 11.06     | 43.39     | 1.47      |



Figure.1(a) 6/8-T PTL-XOR

##### A. Design and Evaluation of PTL-Based Circuits for 45nm Technology

Figures 1(a), 1(b), and 1(c) illustrate the PTL architectures of the fundamental logic cells: the XOR gate, the half adder (HA), and the full adder (FA) [11], [12]. To assess the advantages of PTL over traditional CMOS-based implementations, we designed standard cells for these units based on the 45 nm process design rules with a 7-track architecture. Simulations were performed at an operating voltage of 0.9 V and a frequency of 500 MHz, and the performance was compared against equivalent CMOS-based standard cells. The comparative analysis is summarized in Table I. The results indicate that while the PTL-based full adder (FA) achieves reductions in area and power consumption, it exhibits a slightly higher critical path delay than its CMOS counterpart. This implies that although the PTL FA contributes to energy and area efficiency, its impact on overall circuit speed must be considered carefully in high-performance applications. On the other hand, the PTL-based XOR and HA structures show improvements in power efficiency, delay, and area compared to their CMOS equivalents. An important aspect of practical implementations is the driving capability of PTL circuits. Because of their inherent design, PTL-based structures often have limited load-driving capacity.

Figure.1(b) PTL-HA

Figure.1(c) PTL-FA

Figure.2(a) CMOS-XOR



Figure.2(b) CMOS-HA



Figure.2(c) CMOS-FA



Figure.3(a) PTL-NAND

Figure.3(b) PTL-OR



Figure.4(a) CMOS-NAND



Figure.4(b) CMOS-OR



Figure.5(a) PTL-MBE



Figure.5(b) CMOS-MBE



Figure.6 Synthesized Multiplier using CMOS logic

To increase signal integrity and circuit stability, design adjustments such as integrating inverters at the output can be employed, as seen in the 8T XOR gate in Figure 1(a).

##### B. Using PTL-Based Cells

The Modified Booth Encoder (MBE), shown in Figure 5(a), depends mainly on the XOR gate. Integrating PTL-based XOR gates into the MBE requires buffers on $X_{2i+1}$ and $X_{2i-1}$ to ensure that $B$ and $\overline{B}$ are generated correctly. Booth encoding relies on these signals, which are buffered globally to preserve signal integrity at a minimal overhead in total transistor count. In most cases the performance of a 6T XOR gate suffices; where additional drive strength is needed, the 8T XOR gate of Figure 1(a) is advantageous. This arrangement consists of a full-swing PTL XNOR gate followed by an inverter for increased driving strength. The use of PTL XOR gates in the Booth encoder significantly decreases the multiplier's circuit area and power consumption. As shown in Figure 6, the multiplier is synthesized using half adders (HA) and full adders (FA). The use of PTL-based HAs significantly reduces the energy consumption and latency of the multiplier design. Although integrating PTL FAs lowers circuit power further, it can slightly increase the critical path delay. Although the maximum delay of the PTL FA is longer than that of the CMOS FA, the PTL\_FA has a lower carry propagation delay. This characteristic makes the PTL\_FA a suitable candidate for improving the overall efficiency of the multiplier circuit.

Figure. 7 PTL FAinv

To improve the circuit further, we present a PTL-based FA with inversion (FAinv) in Figure 7. The major difference between the PTL\_FA and FAinv lies in the polarity of the carry input ($C_{in}$) and carry output ($C_o$). Whereas the PTL\_FA uses true carry signals, FAinv uses an inverted carry input and output to achieve better signal propagation. The corresponding Boolean equations are as follows. For the PTL\_FA:

$$S = A \oplus B \oplus C_{in} \tag{i}$$

$$C_o = (A \odot B) \cdot B + (A \oplus B) \cdot C_{in} \tag{ii}$$

For the proposed FAinv:

$$S = A \oplus B \oplus \overline{C_{in}} \tag{iii}$$

$$C_o = (A \odot B) \cdot \overline{B} + (A \oplus B) \cdot \overline{C_{in}} \tag{iv}$$
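The equations above can be checked exhaustively over all eight input combinations. The following is a minimal sketch, not the authors' verification flow; the function names `ptl_fa`, `fainv_rhs`, and `check_all` are hypothetical. It assumes that the barred carry input in (iii) and (iv) is the complement of the true carry-in. Under that assumption, the right-hand side of (iv) evaluates to the complement of the conventional carry-out, which is consistent with an inverted carry chain, and the right-hand side of (iii) evaluates to the complement of the conventional sum.

```python
from itertools import product


def xnor(a, b):
    """One-bit XNOR (the ⊙ operator in the equations)."""
    return 1 - (a ^ b)


def ptl_fa(a, b, cin):
    """Equations (i) and (ii): sum and carry-out of the PTL_FA."""
    s = a ^ b ^ cin
    co = (xnor(a, b) & b) | ((a ^ b) & cin)
    return s, co


def fainv_rhs(a, b, cin_bar):
    """Right-hand sides of (iii) and (iv), fed with the complemented carry.

    Assumption: cin_bar is the complement of the true carry-in.
    """
    s_rhs = a ^ b ^ cin_bar
    co_rhs = (xnor(a, b) & (1 - b)) | ((a ^ b) & cin_bar)
    return s_rhs, co_rhs


def check_all():
    """Compare both cells against integer addition for every input."""
    for a, b, cin in product((0, 1), repeat=3):
        total = a + b + cin
        s, co = ptl_fa(a, b, cin)
        # (i) and (ii) must reproduce a conventional full adder.
        assert (s, co) == (total & 1, total >> 1)
        # With an inverted carry-in, (iii) and (iv) give complemented outputs.
        s_rhs, co_rhs = fainv_rhs(a, b, 1 - cin)
        assert s_rhs == 1 - s
        assert co_rhs == 1 - co
    return True


if __name__ == "__main__":
    print(check_all())
```

Because the carry term only selects between $B$ and $C_{in}$ (or their complements), it maps naturally onto a pass-transistor multiplexer controlled by $A \oplus B$.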

Implementing these basic arithmetic blocks in PTL-based logic significantly improves power efficiency and reduces the required area, which motivates their use in the design of low-power, high-speed multipliers.
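The arithmetic behind the Modified Booth Encoding used in this section can be illustrated behaviorally. The sketch below is an assumption-laden model, not the gate-level encoder of Figure 5(a): it uses the standard radix-4 recoding in which digit $i$ equals $-2X_{2i+1} + X_{2i} + X_{2i-1}$ with $X_{-1} = 0$, and the function names `booth_digits` and `booth_multiply` are hypothetical.

```python
def booth_digits(x, bits=8):
    """Radix-4 Booth digits (each in -2..2) of a two's-complement value.

    `bits` is assumed to be even; digit i is -2*x[2i+1] + x[2i] + x[2i-1].
    """
    x &= (1 << bits) - 1          # two's-complement bit pattern of x
    digits = []
    prev = 0                      # x[-1] is defined as 0
    for i in range(bits // 2):
        x_lo = (x >> (2 * i)) & 1
        x_hi = (x >> (2 * i + 1)) & 1
        digits.append(-2 * x_hi + x_lo + prev)
        prev = x_hi
    return digits


def booth_multiply(x, y, bits=8):
    """Signed product formed from one partial product per Booth digit."""
    return sum(d * y * 4 ** i for i, d in enumerate(booth_digits(x, bits)))


if __name__ == "__main__":
    print(booth_digits(7))          # [-1, 2, 0, 0], i.e. -1 + 2*4 = 7
    print(booth_multiply(-13, 11))  # -143
```

For 8-bit operands the recoding yields four partial products instead of eight, which is why the encoder and its XOR gates sit at the front of the reduction tree.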

##### C. Proposed PTL-Based Multiplier



Figure.8 Proposed Multiplier

The proposed multiplier, shown in Figure 8, uses PTL full adders (FA) and half adders (HA) at selected positions in the synthesized multiplier to achieve a low-power, area-efficient design.



Figure.9 Comparison of Total Transistor Count

Although including extra PTL FAs can significantly reduce energy consumption, it may also increase the critical path delay. Achieving an optimal balance between power and speed is therefore essential. To address this trade-off, we design an optimized multiplier structure, shown in Figure 8, by introducing four PTL-based half adders (HAs) in the adder tree. This improves power efficiency with minimal performance loss. Moreover, careful placement of PTL-based arithmetic components helps reduce the transistor count without compromising computational accuracy. As illustrated in Figure 9, the proposed signed multiplier for operands of up to eight bits consists of 1236 transistors, a 26.16% reduction in transistor count compared with a conventionally synthesized multiplier. These improvements highlight the effectiveness of PTL in designing compact, low-power, and high-performance multipliers.

TABLE II
PERFORMANCE ANALYSIS OF DIFFERENT MULTIPLIER ARCHITECTURES

| Multiplier  | Supply Voltage (V)   | Power (µW) | Delay (ps) | Area (µm²) | PDP (fJ) |
|-------------|----------------------|-----------|-----------|-----------|---------|
| Synthesized | 0.9                  | 144       | 240.8     | 144.94    | 34.67   |
| Proposed    | 0.9                  | 90.14     | 160.4     | 125.44    | 14.45   |
| [1]         | 0.9                  | 124       | 760       | 749.12    | 94.24   |
| [9]         | 0.9                  | 128.89    | 950.8     | 1036      | 121.17  |

#### IV. EXPERIMENTAL RESULTS

To compare our design with existing implementations, we designed a multiplier for operands of up to eight bits in Cadence Virtuoso using a 45 nm, 0.9 V standard cell library. Simulations were run at 27°C with a supply voltage of 0.9 V. Power consumption is estimated using 1000 random input patterns at a frequency of 500 MHz, and the power-delay product (PDP) is obtained by multiplying the worst-case delay by the average power consumption of the main circuit. The performance of the multipliers is also analyzed over a range of supply voltages (0.5 V to 0.9 V) at 500 MHz.

Figure. 10(a) Comparison of Power Consumption

Figure. 10(b) Comparison of Delay

Figure. 10(c) Comparison of PDP

We compare the performance of the proposed multiplier with previously published works [1], [9], as summarized in Table II. The reported metrics are delay, area, power, and PDP. The designs in [9] and [1] were originally implemented in 65nm technology; for a fair comparison, they were scaled to 32nm following the technique from [14], and the power and delay data were updated accordingly. As can be seen from Table II, the proposed multiplier achieves significant improvements over the best counterpart: a 9.72% delay reduction, a 13.45% area reduction, a 15.19% power reduction, and a 23.44% PDP improvement. As shown in Figure 10(a)–(c), the proposed multiplier outperforms the synthesized one at all supply voltages (0.5 V to 0.9 V) in terms of speed, energy, and PDP. The simulation results in Figure 10(a) verify that the proposed multiplier works down to 0.5 V, and Figure 10(c) shows that the optimal PDP is obtained at 0.8 V.
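The PDP calculation described above is a single multiplication with a unit conversion. As a minimal sketch (the helper names `pdp_fj` and `reduction_percent` are hypothetical, and the inputs are taken from Table II), power in µW multiplied by delay in ps gives units of $10^{-18}$ J, so dividing by 1000 expresses the result in fJ:

```python
def pdp_fj(power_uw, delay_ps):
    """Power-delay product in fJ; 1 µW x 1 ps = 1e-18 J = 1e-3 fJ."""
    return power_uw * delay_ps / 1000.0


def reduction_percent(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


if __name__ == "__main__":
    # Synthesized multiplier row of Table II: 144 µW, 240.8 ps.
    print(pdp_fj(144, 240.8))
    # Multiplier [1] row of Table II: 124 µW, 760 ps.
    print(pdp_fj(124, 760))
    # Area of the proposed design relative to the synthesized one (µm²).
    print(reduction_percent(144.94, 125.44))
```

For these rows the computed values agree with the tabulated PDP entries to within rounding in the last digit, and the area figures reproduce the 13.45% reduction quoted in the text.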

#### V. CONCLUSION

In this brief, we demonstrate a PTL-based multiplier for operands of up to eight bits. Simulation results show that the multiplier achieves a PDP improvement of 23.44% over the best of the compared designs. In addition, across supply voltages from 0.5 V to 0.9 V, the proposed multiplier is faster and more energy efficient, and it outperforms the other multiplier designs in power-delay trade-off.

#### REFERENCES

- [1] G. G. Kumar and S. K. Sahoo, "Implementation of a high speed multiplier for high-performance and low power applications," in Proc. 19th Int. Symp. VLSI Design Test, 2015, pp. 1–4.
- [2] C. Senthilpari, A. K. Singh, and K. Diwakar, "Low power and high speed 8x8 bit multiplier using non-clocked pass transistor logic," in Proc. Int. Conf. Intell. Adv. Syst., 2007, pp. 1374–1378.
- [3] J. Pihl and E. J. Aas, "A multiplier and squarer generator for high performance DSP applications," in Proc. 39th Midwest Symp. Circuits Syst., vol. 1, 1996, pp. 109–112.
- [4] Z. Huang and M. D. Ercegovac, "High-performance low-power left-to-right array multiplier design," IEEE Trans. Comput., vol. 54, no. 3, pp. 272–283, Mar. 2005.
- [5] S.-R. Kuang, J.-P. Wang, and C.-Y. Guo, "Modified Booth multipliers with a regular partial product array," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 56, no. 5, pp. 404–408, May 2009.

- [6] Y.-D. Chih et al., "16.4 An 89TOPS/W and 16.3TOPS/mm² all-digital SRAM-based full-precision compute-in memory macro in 22nm for machine-learning edge applications," in Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC), 2021, pp. 252–254.
- [7] D. Wang, C.-T. Lin, G. K. Chen, P. Knag, R. K. Krishnamurthy, and M. Seok, "DIMC: 2219TOPS/W 2569F²/b digital in-memory computing macro in 28nm based on approximate arithmetic hardware," in Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC), 2022, pp. 266–268.
- [8] G. Goto et al., "A 4.1-ns compact 54×54-b multiplier utilizing sign-select Booth encoders," IEEE J. Solid-State Circuits, vol. 32, no. 11, pp. 1676–1682, Nov. 1997.
- [9] A. C. Ranasinghe and S. H. Gerez, "Glitch-optimized circuit blocks for low-power high-performance Booth multipliers," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 28, no. 9, pp. 2028–2041, Sep. 2020.
- [10] H. Naseri and S. Timarchi, "Low-power and fast full adder by exploring new XOR and XNOR gates," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 26, no. 8, pp. 1481–1493, Aug. 2018.
- [11] J.-M. Wang, S.-C. Fang, and W.-S. Feng, "New efficient designs for XOR and XNOR functions on the transistor level," IEEE J. Solid-State Circuits, vol. 29, no. 7, pp. 780–786, Jul. 1994.
- [12] A. M. Shams and M. A. Bayoumi, "A novel high-performance CMOS 1-bit full-adder cell," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 47, no. 5, pp. 478–481, May 2000.
- [13] N. V. V. K. Boppana, J. Kommareddy, and S. Ren, "Low-cost and high performance 8 × 8 Booth multiplier," Circuits, Syst., Signal Process., vol. 38, pp. 4357–4368, Jan. 2019.
- [14] A. Stillmaker and B. Baas, "Scaling equations for the accurate prediction of CMOS device performance from 180nm to 7nm," Integration, vol. 58, pp. 74–81, Jun. 2017.