# EECS 151/251A Discussion 10/worksheet

Alisha Menon

#### Administrivia

- Midterm 2 on Thursday 11/4
- Midterm review session Tuesday 11/2 (problems 4-7)
- How is the project?

### Agenda

Midterm worksheet/problem examples

#### Midterm 2 Topics!

#### Including:

- CMOS
- Delay optimization and logical effort
- Power and energy
- Wires and RC models
- Arithmetic

#### Problem 1 – CMOS Gate (Fa20 m2)

In this problem, use the process that has  $W_n = W_p = 1$  in a reference inverter, and  $\gamma = 1$ .

(a) Design a complex CMOS gate:  $Y = \overline{(A+B) \cdot C + D}$ . Size the transistors so that the gate has the same pull-up and pull-down strength as a reference inverter.

**251A only:** Also make sure the parasitic delay (P) is minimized in your design.



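Sizing from the worst-case path guarantees the minimum overall area. In the PMOS network, A-B-D is the worst-case path, so each of those transistors is sized to 3; C then only needs a width of 1.5 to match the resistance of A in series with B. In the NMOS network, A/B-C is the worst-case path, so A, B, and C are each sized to 2 and D is sized to 1. The overall PMOS area (width) is 3 + 3 + 3 + 1.5 = 10.5.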

(b) Calculate the logical effort for each input  $(LE_A, LE_B, LE_C, LE_D)$ .

$$LE_A = LE_B = \frac{3+2}{2} = 2.5$$

$$LE_C = \frac{1.5+2}{2} = 1.75$$

$$LE_D = \frac{3+1}{2} = 2$$

$$P = \frac{3+2+1}{2} = 3$$
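The arithmetic for parts (a) and (b) can be checked with a short Python sketch. The widths are the ones from the solution; the dictionary names and the helper `logical_effort` are illustrative only. It assumes the reference inverter has a total input width of $W_n + W_p = 2$ and that PMOS D, NMOS C, and NMOS D are the transistors touching the output node.

```python
from fractions import Fraction

# Transistor widths from the part (a) solution (reference inverter: Wn = Wp = 1).
pmos = {"A": Fraction(3), "B": Fraction(3), "C": Fraction(3, 2), "D": Fraction(3)}
nmos = {"A": Fraction(2), "B": Fraction(2), "C": Fraction(2), "D": Fraction(1)}
REF_INPUT_WIDTH = Fraction(2)  # Wn + Wp of the reference inverter


def logical_effort(pin):
    """Input width of one pin relative to the reference inverter."""
    return (pmos[pin] + nmos[pin]) / REF_INPUT_WIDTH


# Assumed layout: PMOS D, NMOS C and NMOS D connect to the output node.
output_drains = pmos["D"] + nmos["C"] + nmos["D"]
parasitic = output_drains / REF_INPUT_WIDTH  # gamma = 1

for pin in "ABCD":
    print(pin, float(logical_effort(pin)))
print("P", float(parasitic))
```

Using `Fraction` keeps values such as 1.75 exact, which makes it easy to compare against the hand calculation.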

(c) 251A only: Calculate the optimized parasitic delay P.

To minimize the parasitic delay, we need to reduce the total parasitic capacitance at the output node. Placing the smaller transistors of a series stack next to the output helps. For example, in the schematic above, if NMOS A//B swapped places with NMOS C, the parasitic delay would instead be P = (3 + 2 + 2 + 1)/2 = 4.

We didn't require minimum area in the description, so other solutions are also accepted. For example, in the PMOS network, if you size C and D to 2, then A in series with B must have the same resistance as C, so A = B = 4 (1/4 + 1/4 = 1/2). The overall PMOS area is then 4 + 4 + 2 + 2 = 12 > 10.5.

As long as the logical effort calculation matches your schematic, you'll also get full credit in the second (and third) part.

Reference inverter:  $W_p = 2W_n = W$ ,  $t_{p,inv} = 32ps$ ,  $\gamma = 2$ 

Determine path effort and optimal stage effort



b) Find optimal a and b for minimum delay

$$b = \frac{\frac{4}{3} \cdot 38C_{in}}{6.3} \approx 8C_{in}$$

$$a = \frac{\frac{5}{3} \cdot 3 \cdot 8C_{in}}{6.3} \approx 6.3C_{in}$$

Is this the optimal number of stages for delay? If not, what is?

No. With a target stage effort of 4, the optimal number of stages is

$$N = \frac{\log(253.3)}{\log(4)} \approx 4$$

So we should add an extra stage to minimize the delay.
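A minimal Python sketch of the sizing and stage-count arithmetic follows. The factorization of the path effort as $\frac{4}{3} \cdot \frac{5}{3} \cdot 3 \cdot 38$ is inferred from the two sizing equations (the schematic itself is not reproduced here), so treat it as an assumption; the variable names are illustrative.

```python
import math

# Values inferred from the worked solution (capacitances in units of C_in).
path_effort = (4 / 3) * (5 / 3) * 3 * 38   # G * B * F, about 253.3
stage_effort = path_effort ** (1 / 3)       # three stages in the original chain

# Size backwards from the load: C_in,i = LE_i * b_i * C_out,i / stage_effort
b = (4 / 3) * 38 / stage_effort             # input cap of the last stage
a = (5 / 3) * 3 * b / stage_effort          # input cap of the middle stage

# Stage count that brings the stage effort closest to a target of 4.
optimal_stages = round(math.log(path_effort) / math.log(4))

print(round(stage_effort, 2), round(b, 2), round(a, 2), optimal_stages)
```

As a sanity check, `a` comes out equal to the stage effort itself, which is what you would expect if the first stage is a unit inverter with input capacitance $C_{in}$.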

d) If we could add a single inverter anywhere along the critical path, where would be the best for the minimum total area?



We would want to add the extra inverter at the end of the chain to minimize the area. The intuition is that all four stages end up with the same effective fanout ($EF = LE \cdot f$). A gate with a smaller LE therefore gets a larger electrical fanout $f$, and the inverter has the smallest LE of all. Placing the stage with the largest fanout at the end of the chain reduces the input capacitance of that stage the most, which results in a smaller total area.

e) "Better" NAND gate logical effort?

$$LE_{A} = \frac{R_{n} (2C_{g} + 2C_{g})}{R_{n} (2C_{g} + C_{g})} = \frac{4}{3}$$

$$LE_{B,LH} = \frac{2R_{n} (C_{g} + 2C_{g})}{R_{n} (2C_{g} + C_{g})} = 2$$

$$LE_{B,HL} = \frac{R_{n} (C_{g} + 2C_{g})}{R_{n} (2C_{g} + C_{g})} = 1$$

$$LE_{B} = \frac{LE_{B,LH} + LE_{B,HL}}{2} = \frac{2+1}{2} = \frac{3}{2}$$

f) "Better" NAND gate intrinsic delay parameter (p)?

$$t_{p,inv} = \ln(2) R_{eq,inv} C_{in,inv}$$

Note that $p$ has to include $\gamma$.

$$p_{A} = \frac{R_{n}(5C_{d})}{R_{n}(3C_{d})}\gamma = \frac{5}{3} \cdot 2 = \frac{10}{3}$$

$$p_{B,LH} = \frac{2R_{n}(5C_{d})}{R_{n}(3C_{d})}\gamma = \frac{10}{3} \cdot 2 = \frac{20}{3}$$

$$p_{B,HL} = \frac{R_{n}(5C_{d})}{R_{n}(3C_{d})}\gamma = \frac{5}{3} \cdot 2 = \frac{10}{3}$$

$$p_{B} = \frac{p_{B,LH} + p_{B,HL}}{2} = 5$$

g) How does this affect the delay of our original chain?

p<sub>A</sub> has been reduced from 4 to 10/3 ≈ 3.3, which means a faster critical path. However, the other path, through input B, is now slower (p<sub>B</sub> = 5). The logical effort of B has also increased, which means the gate driving B needs more drive strength.
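The logical effort and parasitic delay of the modified NAND can be reproduced with a short Python sketch. The widths and resistances below are read off the capacitance and resistance terms in the equations for parts (e) and (f), so they are assumptions about the schematic rather than values taken from it; the names `gate_cap`, `resistance`, and `effort_and_parasitic` are illustrative.

```python
from fractions import Fraction

GAMMA = 2
# Reference inverter: Wp = 2, Wn = 1, drive resistance R_n (normalized to 1).
REF_CIN = Fraction(3)   # 2*C_g + C_g
REF_CD = Fraction(3)    # 2*C_d + C_d

# Assumed totals for the modified NAND2, inferred from the equations.
gate_cap = {"A": Fraction(4), "B": Fraction(3)}   # gate width seen by each input
output_drain_cap = Fraction(5)                     # total drain width at the output
# Drive resistance relative to R_n for (low-to-high, high-to-low) transitions.
resistance = {"A": (Fraction(1), Fraction(1)), "B": (Fraction(2), Fraction(1))}


def effort_and_parasitic(pin):
    """Return (LE, p), each averaged over the two output transitions."""
    r_avg = sum(resistance[pin]) / 2
    le = r_avg * gate_cap[pin] / REF_CIN
    p = r_avg * output_drain_cap / REF_CD * GAMMA
    return le, p


for pin in ("A", "B"):
    le, p = effort_and_parasitic(pin)
    print(pin, le, p)
```

Averaging the resistance before multiplying gives the same result as averaging the per-transition values, because the capacitance is the same for both transitions.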

#### 4) Delays and Adders (18 points, 22 minutes)

You are designing a datapath in a brand-new CMOS FinFET technology, where NMOS and PMOS devices have equal strength. In particular, it has only four CMOS gates: 2-input NAND, 2-input NOR, 2-input XOR, and an OAI21 gate ( $Y = \overline{(A+B) \cdot C}$ ). Gate capacitance equals drain capacitance per unit area ( $\gamma = 1$ ), and the four gates come in only one size each, i.e., you do not need to size gates in this problem.

a) If both the 2-input NAND gate and the 2-input NOR gate have a delay dependence on a fanout, f, given as  $t_{NAND2} = t_{NOR2} = 2ns(4 + 3f)$ , what is the delay dependence on the fanout of the input C of the OAI21 gate (C is in the critical path)?



$$p = \frac{3+2+2}{2} = \frac{7}{2}$$

$$g = \frac{2+2}{2} = 2$$

b) Draw a fast full-adder using the four types of gates and calculate the ci->sum and ci->cout delay. Note that using an OAI21 gate could potentially simplify the logic for cout. Assume the capacitance driven by the sum and the carry bits is equal to the capacitance of inputs  $a_i$ ,  $b_i$ , and  $t_{XOR2} = 2ns(8 + 8f)$ .



c) Complete the drawing and label all inputs and outputs for a 4-bit ripple-carry adder in the figure below by using full-adder (FA) cells from part b). Inputs are a[3:0] and b[3:0], and there is no carry-in to the least-significant bit.



d) Find the critical path delay for the circuit in part c), if the capacitance driven by the sum and the carry bits is equal to the input a<sub>i</sub>, b<sub>i</sub> capacitance. If you are not confident in your answer in part b), you can express the delay in terms of t<sub>FA,Ci→Co</sub>, and t<sub>FA,Ci→S</sub>

Critical path = \_\_\_\_\_.

#### Problem 4 Logical effort (Sp13)

## Logical effort and intrinsic delay?





 Made a mistake in layout! What is the effect on L-H transition? H-L transition?



Delay optimization with fixed gate

Using input A with  $C_{in} = 24\,\text{fF}$ ,  $C_L = 75C_{in}$ 



$$PE = F \cdot B \cdot G = 1000$$

$$EF = \sqrt[5]{1000} \approx 4$$

Sizing backwards from the load (capacitances in units of $C_{in}$):

$$C_5 = \frac{1 \cdot 1 \cdot 75}{4} = 18.75$$

$$C_4 = \frac{1 \cdot 1 \cdot 18.75}{4} \approx 4.69$$

$$C_3 = \frac{1 \cdot \frac{5}{3} \cdot 4.69}{4} \approx 1.95$$

$$C_2 = \frac{4 \cdot 1 \cdot 1.95}{4} = 1.95$$
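The backward sizing pass can be sketched in Python as follows. The stage numbering (2 through 5) and the per-stage products $LE \cdot b$ are inferred from the worked numbers, so both are assumptions; `size_chain` is an illustrative helper, and capacitances are expressed in units of $C_{in}$.

```python
# Assumed (LE * b) product for stages 2..5, inferred from the worked numbers.
effort_products = {2: 4.0, 3: 5 / 3, 4: 1.0, 5: 1.0}
load = 75.0  # C_L in units of C_in


def size_chain(effort_products, load, stage_effort):
    """Walk from the load back to the input, sizing each stage's input cap."""
    sizes = {}
    c_out = load
    for stage in sorted(effort_products, reverse=True):
        # C_in of this stage = LE * b * C_out / EF
        c_out = effort_products[stage] * c_out / stage_effort
        sizes[stage] = c_out
    return sizes


# The solution rounds the fifth root of 1000 (about 3.98) to 4.
sizes = size_chain(effort_products, load, stage_effort=4.0)
for stage in sorted(sizes, reverse=True):
    print(stage, round(sizes[stage], 2))
```

Passing `stage_effort=1000 ** (1 / 5)` instead of the rounded value shows how little the rounding changes the sizes.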

Min. inverter: drive resistance R<sub>d</sub>, input cap C<sub>in</sub>, intrinsic cap C<sub>int</sub>

Wire: r<sub>w</sub>, c<sub>w</sub> (per unit length)

a + b) Propagation delay from in to out (using distributed model for wires)



c) Derive reduction in delay by increasing the first inverter as much as possible:

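$$R_{d,1} \downarrow, \quad C_{int,1} \uparrow$$

Upsizing the first inverter lowers its drive resistance but raises its intrinsic capacitance by the same factor, so the self-loading term $R_d C_{int}$ stays constant.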

#### Problem 6 - Wires and Energy (Sp16)

A NOR gate (G1) with input capacitance  $C_1$  is driving a wire with total resistance  $R_W$  and total capacitance  $C_W$ , and an inverter (G2) with input capacitance  $C_2$ . Inverter G2 is driving an external load of  $C_L$ . The on-resistances of G1 and G2 are  $R_1$  and  $R_2$ , respectively. Assume  $\gamma = 1$  and  $2R_N = R_P$ . The power supply has a voltage of  $V_{DD}$ .



a) Determine the delay from input A to Out

$$C_{int} = 6C_d = 6C_g = \frac{6}{5}C_{in}$$

b) Assume input A is driven by a square-wave signal of frequency  $f_{clk}$  and input B is driven by a square wave signal of frequency  $\frac{1}{3}f_{clk}$ . Derive an expression for total dynamic power consumption of the circuit.

$$P_{in} = f_{clk} C_1 V_{DD}^2 + \frac{f_{clk}}{3} C_1 V_{DD}^2$$

$$P_{out,1} = \frac{2}{3} f_{clk} \left(\frac{6}{5} C_1 + C_W + C_2\right) V_{DD}^2$$

$$P_{out,2} = \frac{2}{3} f_{clk} (C_2 + C_L) V_{DD}^2$$
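The three power terms can be summed numerically with a small Python sketch. It reads the capacitance in the G1 output term as the wire capacitance $C_W$ and the one in the G2 output term as the load $C_L$, which is an interpretation of the solution; the function name and the example component values are hypothetical.

```python
def dynamic_power(f_clk, v_dd, c1, c2, c_w, c_l):
    """Total dynamic power of the G1 -> wire -> G2 circuit, term by term."""
    # Inputs A (at f_clk) and B (at f_clk / 3) each charge a NOR input cap C1.
    p_in = f_clk * c1 * v_dd**2 + (f_clk / 3) * c1 * v_dd**2
    # G1 output: NOR intrinsic cap (6/5)*C1, the wire, and the inverter input.
    p_out1 = (2 / 3) * f_clk * ((6 / 5) * c1 + c_w + c2) * v_dd**2
    # G2 output: inverter intrinsic cap (gamma = 1, so C2) plus the load.
    p_out2 = (2 / 3) * f_clk * (c2 + c_l) * v_dd**2
    return p_in + p_out1 + p_out2


# Hypothetical values: 1 GHz clock, 1 V supply, femtofarad-scale capacitances.
total = dynamic_power(f_clk=1e9, v_dd=1.0, c1=5e-15, c2=3e-15, c_w=10e-15, c_l=20e-15)
print(f"{total * 1e6:.2f} uW")
```

Keeping the three terms separate makes it easy to see which node dominates for a given wire or load capacitance.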

#### Problem 7 - adder (Sp19 final)

- 22. Adder Design [5pts]: The carry-select technique presented in lecture can be applied hierarchically. Imagine applying the technique in a binary way—a larger adder is divided in half using the carry-select and then the same technique is applied to the sub-adders, etc. The process would stop at 4-bit ripple adders.
  - (a) Assuming a FA and a MUX both have unit delay (delay = 1), what would be the delay (from data in to carry out) of a 32-bit adder?

(b) How would the delay scale with the size of the adder?
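One way to reason about both parts is to write the delay as a recurrence and evaluate it. The sketch below assumes the adder width is a power of two, that both halves of every split compute in parallel, and that only one MUX delay is added per level of the hierarchy; `carry_select_delay` is an illustrative name, not something from the exam.

```python
def carry_select_delay(bits, base_bits=4, fa_delay=1, mux_delay=1):
    """Data-in to carry-out delay of a binary hierarchical carry-select adder."""
    if bits <= base_bits:
        # Base case: a ripple-carry adder, one FA delay per bit.
        return bits * fa_delay
    # Both halves finish at the same time; the lower half's carry-out then
    # steers one MUX that picks the precomputed upper-half result.
    return carry_select_delay(bits // 2, base_bits, fa_delay, mux_delay) + mux_delay


for width in (4, 8, 16, 32, 64):
    print(width, carry_select_delay(width))
```

Under these assumptions each doubling of the width adds a single MUX delay, so the delay grows with the logarithm of the adder size.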

#### Problem 7 - adders (Sp19 final)

- Adder Design [6pts]: A carry-lookahead adder used to add A to B is organized into groups of 4-bits. A group has carry into the group, c<sub>i-1</sub> and carry out of the group, c<sub>i+3</sub>.
  - (a) In terms of the propagate and generate signals, p<sub>i</sub> and g<sub>i</sub>, of the individual bit stages, write an expression for the group propagate signal, P and group generate signal, G.

(b) Write an expression for c<sub>i+3</sub> based on the group P and G.

(c) Write an expression for the bit stage sum output, s<sub>i</sub>.
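A candidate answer for parts (a) and (b) can be checked by brute force against a ripple-carry reference. The sketch below uses the standard textbook expressions for the group signals and assumes $p_i = a_i \oplus b_i$ and $g_i = a_i b_i$; list index 0 stands for bit $i$ and index 3 for bit $i+3$, and all function names are illustrative.

```python
from itertools import product


def group_signals(p, g):
    """Group propagate and generate for four bit stages (index 0 = bit i)."""
    group_p = p[0] & p[1] & p[2] & p[3]
    group_g = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
               | (p[3] & p[2] & p[1] & g[0]))
    return group_p, group_g


def ripple_carry_out(p, g, c_in):
    """Reference: propagate the carry one bit at a time."""
    carry = c_in
    for p_i, g_i in zip(p, g):
        carry = g_i | (p_i & carry)
    return carry


def check_all():
    """Compare G + P*c_in with the ripple carry for every 4-bit input pair."""
    for a, b in product(product((0, 1), repeat=4), repeat=2):
        p = [x ^ y for x, y in zip(a, b)]
        g = [x & y for x, y in zip(a, b)]
        group_p, group_g = group_signals(p, g)
        for c_in in (0, 1):
            if (group_g | (group_p & c_in)) != ripple_carry_out(p, g, c_in):
                return False
    return True


print(check_all())
```

With the same definition of $p_i$, the sum bit follows as the XOR of $p_i$ with the carry into that bit.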