## EE 241B HW1 Writeup

## Vighnesh Iyer

#### Contents

| Section | Title                                   |
|---------|-----------------------------------------|
| 1       | Models - MOSFET Characterization        |
| 1.1     | Threshold Voltages                      |
| 1.2     | Fitting Velocity Saturation Model       |
| 1.3     | Fitting Alpha-Power Law                 |
| 1.4     | Fitting Alpha-Power Law $(\alpha = 1)$  |
| 2       | Transistor Sizing                       |
| 3       | 5-32 Decoder Design + VLSI Flow         |
| 3.1     | GCD: VLSI's Hello World                 |
| 3.2     | Decoder Design                          |
| A       | PMOS/NMOS DC Characterization SPICE Sim |

## 1 Models - MOSFET Characterization

We are using a 32nm LP CMOS process for this class. The devices being characterized are n105 and p105 (TT corner) with a nominal supply voltage of 1.05V.

#### 1.1 Threshold Voltages

We want to determine the threshold voltage $V_{th}$ for the NMOS and PMOS devices (for $V_{BS} = 0$, $L = 32$nm, and $W = 1\mu$m) by extrapolating from the $I_{DS}$ vs. $V_{GS}$ curve at low $V_{DS}$. We compare the threshold voltage derived from DC sweeps to the values reported in the model file and the DC operating point analysis.

To perform this characterization, we first collect a full range of DC operating points for both transistors to make analysis easier for this entire section. Each transistor's drain is connected to a variable DC supply and its gate is connected to another, independent variable DC supply. The source is held at ground (0V) for the NMOS and at VDD (1.05V) for the PMOS. We perform a nested DC analysis: for each value of $V_{DS}$ from $0 \to 1.05 \text{V}$ in 10mV increments, we sweep $V_{GS}$ from $0 \to 1.05 \text{V}$ in 10mV increments.

The gathered I-V curves are shown below.



From the DC OP analysis, the $V_{th}$ of the NMOS is reported to be 324.4 mV, and the $V_{th}$ of the PMOS is reported to be -208.1 mV. From the model files, the NMOS $V_{th0}$ is 370 mV and the PMOS $V_{th0}$ is -213 mV. However, $V_{th0}$ is specified for long-channel devices, so it does not accurately describe the short-channel devices in this process.

To extract the threshold voltage from the I-V curves, we take the $I_{DS}$ vs. $V_{GS}$ curves at low values of $V_{DS}$, which keeps the transistor in the linear region of operation, and fit a line to the linear part of each curve. We treat the x-intercept of that line as the $V_{th}$ of the transistor for the selected value of $V_{DS}$. This method is shown for the NMOS and PMOS transistors.



The images on the left side show the linear fit to each curve, while the images on the right side show the extrapolated $V_{th}$ for each $V_{DS}$ curve. As expected, the magnitude of the extrapolated threshold voltage decreases as $V_{DS}$ increases for both devices.

The extrapolated results for the NMOS match the model files and the DC operating point measurement well. However, the PMOS threshold voltage is off by around 100 mV.
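The extrapolation described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ids` data is a synthetic stand-in (not the simulated curves), the 0.30 V threshold is made up, and the fit window `fit_lo`/`fit_hi` is a hypothetical choice for "the linear part of the curve".

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extract_vth(vgs, ids, fit_lo, fit_hi):
    """Fit a line to the points with fit_lo <= V_GS <= fit_hi; return its x-intercept."""
    pts = [(v, i) for v, i in zip(vgs, ids) if fit_lo <= v <= fit_hi]
    slope, intercept = linear_fit([p[0] for p in pts], [p[1] for p in pts])
    return -intercept / slope


# Synthetic stand-in for one low-V_DS curve (NOT simulation data):
# zero current below a made-up threshold of 0.30 V, linear above it.
vgs = [k * 0.01 for k in range(106)]             # 0 to 1.05 V in 10 mV steps
ids = [max(0.0, 2e-4 * (v - 0.30)) for v in vgs]

vth = extract_vth(vgs, ids, fit_lo=0.5, fit_hi=0.8)
print(f"extrapolated V_th = {vth * 1e3:.1f} mV")
```

On real curves the result depends on where the fit window is placed, so the window should be checked against the plotted data rather than fixed in advance.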

#### 1.2 Fitting Velocity Saturation Model

In class, we use this model for  $I_{DSat}$ :

$$I_{DSat} = \frac{W}{L} \frac{\mu_{eff} C_{ox} E_C L}{2} \frac{(V_{GS} - V_{th})^2}{(V_{GS} - V_{th}) + E_C L}$$

We want to find the values of $E_C L$ that best fit the NMOS and PMOS I-V curves, using the $V_{th}$ value from the previous section. We take the curve where $V_{DS} = V_{DD}$ to keep the transistors fully saturated, and we sweep $V_{GS}$. We then fit our data where $V_{GS} > V_{th}$ to this model with $E_C L$ and $k = \mu_{eff} C_{ox}$ as free variables.

To get starting values for fit iteration, I used model parameters to estimate k, with $\frac{W}{L} = 31.25$. For the NMOS, $\mu_0$ (u0\_n105) is 30.4, $t_{ox}$ (toxe\_n105) is 2.39e-9, and $\epsilon_{ox}$ (epsrox) is 3.9. For the PMOS, $\mu_0$ (u0\_p105) is 60.95, $t_{ox}$ (toxe\_p105) is 2.62e-9, and $\epsilon_{ox}$ (epsrox) is 3.9. We also know that $C_{ox} = \frac{\epsilon_{ox}}{t_{ox}}$; since epsrox is a relative permittivity, the absolute permittivity used here is $3.9\,\epsilon_0$.
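As a quick check of the arithmetic, the oxide capacitance per unit area can be computed from the quoted parameters. This sketch assumes that toxe is given in meters and that epsrox is relative to the vacuum permittivity; the function name `cox_per_area` is hypothetical. The mobility is left out because its units in the model file are not stated here.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m


def cox_per_area(epsrox, toxe_m):
    """Gate-oxide capacitance per unit area in F/m^2 (assumes toxe is in meters)."""
    return epsrox * EPS0 / toxe_m


cox_n = cox_per_area(3.9, 2.39e-9)   # NMOS, toxe_n105
cox_p = cox_per_area(3.9, 2.62e-9)   # PMOS, toxe_p105
w_over_l = 1e-6 / 32e-9              # W = 1 um, L = 32 nm

print(f"C_ox (NMOS) = {cox_n:.4e} F/m^2")
print(f"C_ox (PMOS) = {cox_p:.4e} F/m^2")
print(f"W/L = {w_over_l:.2f}")
```

Under these assumptions $C_{ox}$ comes out to about $1.44 \times 10^{-2}$ F/m² for the NMOS and about $1.32 \times 10^{-2}$ F/m² for the PMOS.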

Through experimentation, I found that  $V_{th}$  from the previous section doesn't match this curve that well since the higher  $V_{DS}$  results in a smaller effective threshold voltage. I then made  $V_{th}$  another free variable to see if I could get a better fit.

After several iterations, it became clear that small changes in the initial guesses for the free variables resulted in very different curve fits. I settled on values that would give me a realistic  $E_CL$  and k, but gave me a lower  $V_{th}$  than expected.
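One way to reduce the sensitivity to initial guesses is to replace the iterative fit with a grid search over $E_C L$: for a fixed $V_{th}$ and $E_C L$ the model is linear in k, so the least-squares k has a closed form. The sketch below is not the procedure used for the results in this writeup; it swaps in a grid search, holds $V_{th}$ fixed, and runs on synthetic data generated from made-up parameters. All function names are hypothetical.

```python
def idsat(vgs, k, ecl, vth, w_over_l):
    """Velocity-saturation model from this section; zero at or below threshold."""
    vov = vgs - vth
    if vov <= 0:
        return 0.0
    return w_over_l * k * ecl / 2 * vov ** 2 / (vov + ecl)


def fit_velocity_saturation(vgs, ids, vth, w_over_l, ecl_grid):
    """Grid-search E_C*L; for each candidate, solve for k in closed form."""
    pts = [(v, i) for v, i in zip(vgs, ids) if v > vth]
    best = None
    for ecl in ecl_grid:
        # With k = 1 the model is a basis function g(V); the data are fit as k * g(V).
        g = [idsat(v, 1.0, ecl, vth, w_over_l) for v, _ in pts]
        k = sum(i * gi for (_, i), gi in zip(pts, g)) / sum(gi * gi for gi in g)
        sse = sum((i - k * gi) ** 2 for (_, i), gi in zip(pts, g))
        if best is None or sse < best[0]:
            best = (sse, k, ecl)
    return best[1], best[2]


# Synthetic data from made-up parameters (NOT the values obtained in this writeup).
W_OVER_L = 31.25
TRUE_K, TRUE_ECL, VTH = 3e-4, 0.60, 0.30
vgs = [n * 0.01 for n in range(106)]              # 0 to 1.05 V in 10 mV steps
ids = [idsat(v, TRUE_K, TRUE_ECL, VTH, W_OVER_L) for v in vgs]

ecl_grid = [0.05 + 0.01 * n for n in range(200)]  # candidates from 0.05 V to 2.04 V
k_fit, ecl_fit = fit_velocity_saturation(vgs, ids, VTH, W_OVER_L, ecl_grid)
print(f"k = {k_fit:.3e} A/V^2, E_C*L = {ecl_fit:.2f} V")
```

Treating $V_{th}$ as a free variable, as done above, would add a second grid dimension; the closed-form step for k stays the same.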

#### 1.3 Fitting Alpha-Power Law

#### 1.4 Fitting Alpha-Power Law ( $\alpha = 1$ )

## 2 Transistor Sizing

## 3 5-32 Decoder Design + VLSI Flow

#### 3.1 GCD: VLSI's Hello World

Here are the first 45 lines of the post-PAR Verilog netlist:

```
module gcdGCDUnitCtrl (clk , reset_BAR , operands_val , result_rdy ,
   B_zero , A_lt_B , result_val , operands_rdy , A_mux_sel , B_mux_sel ,
   A_en , B_en , IN0 );
   input clk;
   input reset_BAR ;
   input operands_val ;
   input result_rdy ;
   input B_zero;
   input A_lt_B;
   output result_val ;
   output operands_rdy ;
   output [1:0] A_mux_sel ;
   output B_mux_sel ;
   output A_en ;
   output B_en ;
   input IN0;


   wire [1:0] state ;

   AND2X1_RVT icc_clock3 (.Y ( n16 ) , .A1 ( A_lt_B ) , .A2 ( n5 ) );
   NBUFFX8_RVT CTSNBUFFX32_RVT_G1B3I1 (.A ( clk_G1B5I1 ) , .Y ( clk_G1B6I1 ) ) ;
   INVX2_RVT CTSINVX32_RVT_G1B13I1 (.A ( clk ) , .Y ( clk_G1B1I1 ) ) ;
   NBUFFX8_RVT CTSNBUFFX2_RVT_G1B1I1 (.A ( clk_G1B6I1 ) , .Y ( clk_G1B7I1 ) );
   NBUFFX8_RVT CTSNBUFFX2_RVT_G1B5I1 (.A ( clk_G1B2I1 ) , .Y ( clk_G1B5I1 ) ) ;
   INVX16_RVT CTSINVX32_RVT_G1B11I1 (.A ( clk_G1B1I1 ) , .Y ( clk_G1B2I1 ) );
   INVX0_RVT icc_place4 (.A ( n7 ) , .Y ( n10 ) );
   DELLN1X2_RVT icc_place3 (.A ( operands_val ) , .Y ( n7 ) );
   DFFX1_RVT state_reg_1_ (.QN ( n17 ) , .Q ( state[1] ) , .D ( n11 )
   , .CLK ( clk_G1B7I1 ) );
   NAND2X0_RVT U24 (.A2 ( n14 ) , .A1 ( n15 ) , .Y ( A_en ) ) ;
   INVX0_RVT U23 (.A ( B_en ) , .Y ( n15 ) );
   NAND4X0_RVT U22 (.A2 ( IN0 ) , .A4 ( n8 ) , .A3 ( n9 ) , .Y ( n11 ) , .A1 ( n13 ) ) ;
   INVX0_RVT U21 (.A ( result_val ) , .Y ( n8 ) );
   NAND2X0_RVT U20 (.A2 ( state[1] ) , .A1 ( n10 ) , .Y ( n9 ) ) ;
   NAND3X0_RVT U17 (.Y ( n13 ) , .A1 ( n6 ) , .A2 ( n5 ) , .A3 ( B_zero ) ) ;
   AOI21X1_RVT U16 (.Y ( n12 ) , .A2 ( n3 ) , .A3 ( reset_BAR ) , .A1 ( n4 ) ) ;
   NAND2X0_RVT U15 (.A2 ( state[0] ) , .A1 ( n2 ) , .Y ( n3 ) ) ;
   NAND2X0_RVT U14 (.A2 ( state[1] ) , .A1 ( result_rdy ) , .Y ( n2 ) ) ;
   NAND3X0_RVT U13 (.Y ( n4 ) , .A1 ( n6 ) , .A2 ( B_zero ) , .A3 ( n17 ) );
   INVX0_RVT U12 (.A ( A_lt_B ) , .Y ( n6 ) );
   AO21X1_RVT U11 (.A1 ( operands_rdy ) , .Y ( B_en ) , .A2 ( n7 ) , .A3 ( n16 ) );
   NOR2X0_RVT U10 (.Y ( A_mux_sel[1] ) , .A2 ( A_lt_B ) , .A1 ( n14 ) ) ;
   OR2X1_RVT U9 (.Y ( n14 ) , .A2 ( B_zero ) , .A1 ( n1 ) );
   INVX0_RVT U8 (.A ( n5 ) , .Y ( n1 ) );
   AND2X2_RVT U7 (.Y ( operands_rdy ) , .A1 ( n18 ) , .A2 ( state[1] ) ) ;
```

- 1. Why are you generally not worried when you have hold time violations after synthesis? Hold time violations can be easily resolved later in the flow by inserting dummy logic or delay cells, or by allowing a longer wire (with greater, non-zero delay) on the path that has the hold time violation.
- 2. Why does the tutorial example (page 9) fail timing even though the delay is less than the clock period (0.5ns)? Even though the data arrival time is below 0.5ns, the data required time is even further below 0.5ns, so the slack is negative. The required time is the clock period reduced by the clock uncertainty (an allowance for jitter and for skew introduced later in the flow) and by the setup time of the flip-flop.
- 3. Why might you want to set a conservative clock\_uncertainty? A conservative clock\_uncertainty will account for any clock skew and jitter added during PAR. If the clock\_uncertainty is too optimistic, you may find that timing passes after synthesis but fails after PAR.
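The setup check described in answer 2 comes down to simple arithmetic, sketched here in Python. The numbers are hypothetical and are not taken from the tutorial's timing report; the function name `setup_slack` is also made up, and clock latency is assumed to be zero (ideal clock).

```python
def setup_slack(clock_period, clock_uncertainty, setup_time, data_arrival):
    """Setup slack in ns: data required time minus data arrival time."""
    data_required = clock_period - clock_uncertainty - setup_time
    return data_required - data_arrival


# Hypothetical values in ns, chosen only to illustrate the effect.
slack = setup_slack(clock_period=0.5, clock_uncertainty=0.05,
                    setup_time=0.04, data_arrival=0.45)
print(f"slack = {slack:+.3f} ns")  # a negative slack means the path fails timing
```

With these values the arrival time is below the clock period, yet the slack is negative because the uncertainty and setup time are subtracted from the period first.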

#### 3.2 Decoder Design

Here is my decoder implementation in decoder.v:

```
module decoder(
  input [4:0] A,
  output reg [31:0] Z,
  input clk
);

// NOTE: Testbench expects that the output Z is registered!
// One flip-flop per output bit: Z[i] goes high one cycle after A == i.
genvar i;
generate
  for (i = 0; i < 32; i = i + 1) begin:decoder_loop
    always @ (posedge clk) begin
      Z[i] <= A == i;
    end
  end
endgenerate
endmodule
```

I had to modify the `` `timescale `` directive in decoder\_tb to be 1ns/10ps to match the vcd invocation and to ensure that a small clock period under 100ps would simulate without issues. The behavioral simulation succeeded.

After running dc-syn, I found that there were timing violations, and I fixed those by increasing the clock period to 0.2ns. After this, I re-ran synthesis, verified that there were no timing violations, and ran the post-synthesis VCS sim to make sure the functionality of the decoder was still in order.

I then ran icc-par and found several timing violations after PAR. They weren't internal to the decoder block, but rather involved the decoder's Z register driving an output load capacitance of 30fF. This load capacitance is much higher than the internal capacitances in the circuit, so it takes a longer time to drive. The register needs to drive the output cap to its proper value before the next rising edge of the clock. I found that this will take at least 0.38ns, so I modified my clock period to 0.5ns to give the tools some slack.

After running post-PAR simulation, I found I needed to further increase the clock period to 1.0ns to work around a testbench issue: the testbench supplies the decoder inputs on the falling edge of the clock, so the design only has half a clock period before the next rising edge. Supplying the inputs one clk-to-q delay after the rising edge would model the actual design more accurately.

Here are the final results:

- 1. A screenshot of the layout
- 2. Critical path in ns
- 3. Post-PAR DVE sim showing correct functionality and critical path

## A PMOS/NMOS DC Characterization SPICE Sim

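```
Sweep of V_GS with constant V_DS for N-MOSFET
* TT-corner models for the 32nm process
.lib '/home/ff/ee241/synopsys-32nm/hspice/saed32nm.lib' TT
* Independent drain and gate supplies (1.05 V for .op, swept by .dc)
vds vds gnd 1.05
vgs vgs gnd 1.05
* NMOS under test, nodes in order: drain, gate, source, bulk
x1 vds vgs gnd gnd n105 (w=1u l=32n)
.op
* Nested sweep: V_GS is the inner sweep and V_DS the outer sweep, both 0 to 1.05 V in 10 mV steps
.dc vgs 0 1.05 10m vds 0 1.05 10m
.option post=2 nomod
.end
```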

 $\mathbf{B}$