## Introduction to the Custom Design Flow: Building a standard cell

EE241 Tutorial 3 Written by Brian Zimmer (2013) Updated by Bonjern Yang (2018)

## Overview

In Tutorial 1 (GCD: VLSI's Hello World), you used the digital design flow to place-and-route a pre-existing library of standard cells based on an RTL description. In Tutorial 2 (Using VLSI Flow Outputs), you place-and-routed a 4-to-16 decoder, imported the design into Cadence Virtuoso, and investigated the difference in timing and power measurements from the digital design tools, non-parasitic transistor-level simulation, and parasitic transistor-level simulation. In this tutorial, you will use Cadence Virtuoso to create your own standard cell that can be used in the digital design flow, and learn how to run Monte Carlo simulations in order to investigate the effects of variability.

The custom design flow is shown in Figure 1. You start with a schematic representation of the circuit, and run simulations to verify functionality and performance. Then, you layout your design, and run Design Rule Checks (DRC) to verify that the layout is manufacturable, and Layout-Versus-Schematic (LVS) to verify that your layout matches your schematic. Last, you run parasitic extraction on your design to account for the contribution of wires and vias, and run simulation again on the extracted design.

To analyze the effect that variability has on your design, you can run a Monte Carlo analysis, which simulates your circuit hundreds of times while varying process parameters according to expected process variability.

In order to use your standard cell in a digital flow, you will need to describe the cell's pre-computed timing characteristics (so it can be synthesized) and physical attributes (so it can be place-and-routed). We will use Synopsys Library NCX to automatically simulate the cell at different temperatures, voltages, and process corners for different input slopes and output loads. NCX will create a lib file that can be compiled into a binary .db file for use in the flow. We will use Cadence Virtuoso to generate an abstract view that only contains pins and metal layers, export this to a textual .lef file, then use Synopsys Library Compiler to generate a binary Milkyway file for use in the flow.

# Getting Started

Load the EE241 environment, then download a set of scripts that will allow you to characterize your own standard cell. On BWRC machines, replace  $\sim$ ee241 with /tools/designs/ee241 throughout this tutorial.

```
% source ~ee241/tutorials/ee241.bashrc
% cd /scratch/userA
% git clone ~ee241/tutorials/gen_stdcells
```



Figure 1: Custom design "flow."

Also, if you have not completed tutorial 1 and tutorial 2, you need to download a directory to run Virtuoso inside, and the decoder place-and-route directory.

```
% git clone ~ee241/tutorials/ee241_virtuoso
% git clone ~ee241/tutorials/decoder_analysis
```

# Schematic Entry

Start Cadence Virtuoso.

```
% cd ee241_virtuoso
% virtuoso &
```

In Virtuoso, make a new library by going to File — New — Library. Call it (customcells), then select "Attach to an existing technology library" and choose SAED\_PDK\_32\_28.

Then go to File — New — Cell, and name the cell NAND2X1B\_RVT, and set the view to "Schematic."

First, add the pins to the file. Go to Create — Pins (or press p). Enter "A1 A2" under "Pin Names", then click on the schematic to place them. Next, change the "Direction" to "Output" and place "Y". Last, change "Direction" to "inputOutput" and place "VDD" and "VSS."

Now you need to enter the devices in the NAND2. Go to Create — Instance (or press i), set the Library to "SAED\_PDK\_32\_28", the cell to "nmos4t", and the view to "symbol."

Change the width of each NMOS to 630n M (note: "n M", not "nM"). Click in another box, and notice how the source/drain area and periphery change automatically. Virtuoso will try to guess parasitics from the source and drain before layout based on the length and width and design rules. Post-layout extraction will replace these values in the netlist (AS, AD, PS, PD) with the actual values. For example, in a NAND gate the diffusion for the NMOS can be shared, and no contact is needed, decreasing the parasitics from what the schematic would estimate.

Place two PMOS with a width of 760n. Now, go to Create — Wire, then click on the starting and ending point of a wire. If a diamond appears over the desired terminal, press "s" to snap the wire to that location. Finish wiring up the design until it looks like Figure 2. Note that both NMOS bulk terminals must be connected to VSS. Go to File — Check and Save to finish.



Figure 2: Schematic of the 2-input NAND gate.

Now make a symbol view so your cell can be instantiated in other designs and netlisted. Go to Create — Cellview — From Cellview, and click ok in the dialog box. You can leave the symbol as is, or use the drawing tools to draw a NAND symbol. When you are done, save and close the window.

## **Schematic Simulation**

The best way to test design is to make a "testbench" schematic, then instantiate your design within it. This allows different tests for the same design, and won't cause LVS conflicts. Go to File — New — Cell, and name the cell nand2\_tb, and set the view to "Schematic."

Press i, and place your cell (NAND2X1b\_RVT). Using the "vdc" and "vpwl" components in analogLib and the "vdd", "gnd", "noConn" components in basic, hook up VDD, VSS, and the inputs as shown in Figure 3. Add a 1fF capacitor as a load. Set the vpwl source on your input as follows

• A: 3 pairs of points, T1: 0, V1: 0, T2: 100p, V2: 0, T3: 110p, V3: VDD\_val



Figure 3: Testbench to measure the delay of an inverter.

Note that "VDD\_val" is not a typo—we are defining it as a parameter so it can be changed from the testbench. Go to File — Check and Save and fix any warnings or errors.

Now to simulate, go to Launch — ADE L.

Click on Variables — Copy from Cellview, then in the right pane, enter 1.05 as the value for VDD\_val. Go to Analysis — Choose, and set the stop time to 4n, and click ok. Then go to Outputs — To be Plotted — Select on Schematic, and click on both the A and Z bus (and select all of their signals), then press "Esc" when finished. Last, go to Simulation — Netlist and Run, and make sure the design simulated as expected.

The default process corner for simulation is TT. You can change the corner by going to Setup — Model Libraries, and changing the section to "SF", "FS", "SS", or "FF". You can change the temperature by going to Setup — Temperature.

# Monte Carlo Analysis

While a real design kit will use measured silicon data to calibrate variability for many device parameters, our educational design kit will only vary the threshold voltage of a device. Threshold voltage variation has been found to be Gaussian, so during Monte Carlo simulation, your circuit will be simulated hundreds of times, but each time the threshold of each device will be shifted by an amount determined from sampling a Gaussian distribution for each device. The standard deviation of each device is inversely proportional to its size. In our process, Avt is assumed to be 2 mV·um.

$$\sigma_{Vth} = \frac{2}{\sqrt{W * L * 1e6 * 1e6}} mV \tag{1}$$

Carlo in ADE XL, but only with the Spectre simulator, not HSPICE. Therefore to run Monte Carlo, there are two options: 1) Export the netlist and run HSPICE manually from the command line or 2) use HSPICE built in utility in ADE-L to run Monte Carlo. The downside of the 2nd approach is that it is no longer possible to run sweeps of a variable (such as the supply voltage), so you will need to run on the command line to both sweep a variable and run Monte Carlo analysis.

To run Monte Carlo, in your open ADE L window, go to Tools — HSPICE Statistical. Under "Number of runs" enter 300, and check the box next to "Enabled." Under the outputs tab in the Measurements section, click on "Open." Enter the settings shown in Figure 4, and click "Add", then "Close." Now inside the HSPICE Monte Carlo Analysis window, under the "Simulation" section click "Netlist-Run." You should see different delays for different iterations as shown in Figure 5.



Figure 4: Settings to measure the delay through an inverter.

For any measurement, statistics are automatically calculated in the Measurement Variation Statistics section as shown in Figure 6. Here you can get a quick mean and sigma measurement. To view the distribution, click "Get Variation Statistics", then click "Plotting as." To export this data for more detailed processing, change the mode from "Hist" to "Scatter" and set Y as your variable and X as index as shown in Figure Then in the plotting window, right click on the signal name — Send To — Export as shown in Figure 7. You can also find the .mt file in /simulation/inverter\_tb/HSPICE/schematic/psf/input.mt0.

In the HSPICE window, go to File — Save Setup so it can be easily loaded later. If you do this from the ADE L window, you will lose HSPICE-specific information.



Figure 5: Different inverter delays due to local variation.



Figure 6: Generate a scatterplot of a measured value.

# Layout Entry

In this tutorial, we will design a new NAND2 gate. To save time, we will use an existing cell as a template. In the CIW, go to Tools — Library Manager, choose the saed32nm\_rvt library , and choose the cell as NAND2X1\_RVT, then right click on "Layout" and choose "Copy". Change the "To" library to "customcells" and cell to NAND2X1B\_RVT and click ok. Note that you are only copying the layout view—do not override the schematic you just created!

Now open the layout view NAND2X1B\_RVT in the customcells library, and you should see the layout in Figure 8. Based on the naming of the cell, the "X1" means that the drive strength should



Figure 7: Export variable to a CSV file.

be larger than NAND2X0\_RVT. However, the designers accomplished this by adding 2 extra inverter stages as a buffer. The fanout between the 1st/2nd stage and the 2nd/3rd stage is less than 4, and therefore is not optimal.

We can improve this gate both in terms of performance and area by reverting it to a single stage while retaining drive strength by increasing the NMOS and PMOS size. To accomplish this, each PMOS will need to be 760nm wide and the NMOS will need to be about 630nm wide (1.5 increase in size because of the stack). However, these devices will not fit on the standard cell pitch, so we need to multi-finger both the NMOS and PMOS as shown in Figure 9. Following the guidelines below, modify your layout until it looks like the figure.

You only need a few commands to change the layout. By pressing  $\sim$ , 1, 2, 3 etc you can show only certain layers. Press Shift-1 to add M1 to whatever is visible. To move something, hover over the object, then press "m" on the keyboard. To make something bigger or smaller, hover over the edge of the object, then press "s" on the keyboard, then click again at the final location. Press "Esc" if you selected something incorrectly. To add new wire, press Ctrl-Shift-W (then F3 will change the options such as the width). To just add a rectangle, select the desired layer in the layer pallete, and press "r". Then click on the two corners to create.



Figure 8: Educational standard cell library NAND2X1\_RVT

Make sure you resize all of the wells as well as the yellow boundary to the new cell dimensions (try to emulate the left end of the cell.) Make all of the layers visible to ensure that you didn't miss anything. All of the pin labels should already exist, but if deleted one, go to Create — Label, enter the terminal name, then select the layer to be M1PIN.

#### LVS

When you believe the layout matches the schematic, go to Hercules — Run LVS. Wait a few seconds for the "Block" entry to be filled in automatically as shown in Figure 10, then press "Execute." In the options tab, choose both "General" and "LVS" from the menu, and make sure that the Block and Library entries are correct. If you are having problems running, close all of your designs in the CIW by going to File — Close Data — All — Ok, and make sure there are no abstract views in your design.

If you are having problems passing LVS, you will see an error like Figure 11. Click on the "LVS Errors" tab, then the design under "First Priority." The best approach for small circuits is to look directly at the netlists as shown in Figure 12. First check to make sure that all of the ports are defined and match, then look at the rest of the netlist and see if either the number of devices, the properties, or the connections are wrong.



Figure 9: An improved 2-input NAND layout

### DRC

When you believe the layout matches the schematic, go to Hercules — Run DRC. Wait a few seconds for the "Block" entry to be filled in automatically then press "Execute."

Double click on the DRC error to see the error on the layout. Each error entry will provide additional details about what failed, as shown in Figure 13.

#### Extraction

To run parasitic extraction in Virtuoso, open the layout view and go to StarRC — Parasitic Generation Cockpit. Most of the settings should be automatically loaded for you as shown in Figure 14, but some options are useful to understand. Make sure you manually fill out Output Parasitics — Ports Annotation like in Lab 2. Without this step, extracted simulations will not be properly netlisted.

• Extract Parasitics Tab — Extraction. RC will extract both resistance and capacitance.



Figure 10: Hercules LVS dialog settings.



Figure 11: Typical error dialog for LVS.

- Extract Parasitics Tab Couple to Ground = NO will include coupling between nets in your extraction
- Extract Parasitics Tab Additional Options, allows you to set a threshold to try to reduce device count



Figure 12: Debugging LVS using the netlist



Figure 13: Typical DRC errors.

Now click "Apply" and wait a minute while extraction runs. When it finishes, go to your new library and open the "starrc" view of the NAND2 cell. If you zoom in, you can see the annotated parasitics overlaid on your layout. By going to StarRC — Parasitic Prober you can explore the parasitics in your design.



Figure 14: StarRC settings.

### **Extracted Simulation**

Next, we need to make a "config" view that will let us tell the simulator whether to simulate with the schematic view or the starrc view. Go to File — New — Cellview, and enter

- Cell: nand2\_tb (the testbench you just created)
- View: config

Then click enter the settings shown in Figure 15. When the next window opens, go to File — Save. Now in the schematic window, go to Launch — ADE L. Go to Setup — Simulator, and make sure HSPICE is chosen. Click on Setup — Design, and choose View: config, and press ok. Enter the same settings as your schematic level, and run the simulation.



Figure 15: Settings for the new config view.

Next, you need to tell the netlister to choose the extracted view instead of the schematic view. Open the config view of nand2\_tb, (click "yes" for Configuration config), then in the "View to Use" column for the "NAND2X1B\_RVT" instance, enter starrc as shown in Figure 16. Then go to File — Save.

## Cell Characterization

#### .lef/Milkyway generation

Open the layout view of your new cell, and go to Tools — Abstract Generation. Click on your cell, then "Move Cell" to the "Core" bin. Click on the Options tab, then click "..." next to Import Options, navigate to gen\_stdcells/abstract.options, then click "Import". Now click "Start." In the CIW, go to File — Export — LEF, and create the file gen\_stdcells/lefs/nand2x1b\_rvt.lef. Set the library name to "customcells", output cell to "NAND2X1B\_RVT", and view to "abstract." Then check the box next to "no technology", and click "ok."

| Library        | Cell        | View Found | View To Use | Inherited View List |
|----------------|-------------|------------|-------------|---------------------|
| SAED_PDK_32_28 | pmos4t      | hspice     |             | hspiceD hspice      |
| analogLib      | vdc         | hspiceD    |             | hspiceD hspice      |
| analogLib      | vpulse      | hspiceD    |             | hspiceD hspice      |
| saed32nm_rvt   | AND2X1_RVT  | schematic  |             | hspiceD hspice      |
| saed32nm_rvt   | AND4X1_RVT  | schematic  |             | hspiceD hspice      |
| saed32nm_rvt   | INVX0_RVT   | schematic  |             | hspiceD hspice      |
| saed32nm_rvt   | INVX1_RVT   | schematic  |             | hspiceD hspice      |
| saed32nm_rvt   | INVX2_RVT   | schematic  |             | hspiceD hspice      |
| saed32nm_rvt   | NBUFFX2_RVT | schematic  |             | hspiceD hspice      |
| saed32nm_rvt   | NOR2X0_RVT  | schematic  |             | hspiceD hspice      |
| saed32nm_rvt   | NOR2X2_RVT  | schematic  |             | hspiceD hspice      |
| saed32nm_rvt   | SHFILL1_RVT | schematic  |             | hspiceD hspice      |
| saed32nm_rvt   | SHFILL2_RVT | schematic  |             | hspiceD hspice      |
| saed32nm_rvt   | SHFILL3_RVT | schematic  | I           | hspiceD hspice      |
| test           | decoder     | starrc     | starrc      | hspiceD hspice      |
| test           | decoder_tb  | schematic  |             | hspiceD hspice      |

Figure 16: Change the config view to use the extracted netlist.

Now you need to convert the LEF file to a MW file. There is a Makefile that performs this process. First look at the generated LEF file to make sure all of the pins exist and have the correct direction.

```
% cd /scratch/userA/gen_stdcell/lefs/
% vim nand2x1b_rvt.lef
% make
```

### .lib/.db generation

You will need a netlist of devices to characterize. Open the starrc view of your design, go to Launch — ADE L, then Setup — Environment, and check the box next to "setTopLevelAsSubckt" and "HSPICE Case Sensitivity". Now, go to Simulation — Netlist — Create, then File — Save As, and place it in the gen\_stdcells/netlists directory as NAND2X1B\_RVT.sp. Then, open this file in an editor, and delete everything before the .subckt line (but maintain the first line as a comment!).

Library NCX automatically recognizes the functions of different cells, and based on the existing technology settings, creates a set of HSPICE simulations that will generate a .lib file for you. After that, the Makefile calls IC Compiler to convert the .lib file to a .db file. Generate timing information for your cell, then open the .lib file to see what happened.

```
% cd /scratch/userA/gen_stdcell/
% make
% vim customcells.lib
...
pin(Y) {
  related_power_pin : VDD ;
  related_ground_pin : VSS ;
  direction : "output" ;
  output_voltage : "DC_0" ;
  power_down_function : "!VDD + VSS" ;
```

```
function : "(A2*A1)'";
 max_capacitance : 8 ;
 max_transition : 0.4527 ;
 timing() {
     related_pin : "A1" ;
      timing_sense : "negative_unate" ;
      cell_rise(del_1_7_7) {
        index_1("0.016, 0.032, 0.064, 0.128, 0.256, 0.512, 1.024");
        index_2("0.1, 0.25, 0.5, 1, 2, 4, 8");
        values("0.01203, 0.01257, 0.01344, 0.01501, 0.0181, 0.02438, 0.03723",\
               "0.01584, 0.0166, 0.01781, 0.02005, 0.02371, 0.03007, 0.0424", \setminus
               "0.02184, 0.02271, 0.02417, 0.02706, 0.03172, 0.04, 0.05359", \setminus
               "0.02858, 0.02928, 0.03026, 0.03353, 0.04053, 0.05237, 0.07068",
               "0.04518, 0.04657, 0.04883, 0.05299, 0.06135, 0.0764, 0.0999", \
               "0.07155, 0.07241, 0.07387, 0.07782, 0.0875, 0.1041, 0.1339", \setminus
               "0.128, 0.13, 0.1312, 0.1337, 0.1395, 0.1558, 0.187");
. . .
```

The lib file holds a 2d table indexed by index\_1 (rise/fall time on the input) and index\_2 (load capacitance). It is important to have a feeling for how this was generated. Open the spice file that was used to generate this data.

```
% cd /scratch/userA/gen_stdcells/customcells/runtime/spice/NAND2X1B_RVT
% tar -xf delay__A1__hl__Y__lh__ACQ_1.tar.gz
% cd delay__A1__hl__Y__lh__ACQ_1.sif_0
% vim deck.cir
%
.data arc_data
+ slew_a1 load_y temperature_tag __param_vdd __param_vss
+ 1.6e-11 1e-16 25 1.05 0
+ 1.6e-11 2.5e-16 25 1.05 0
+ 1.6e-11 5e-16 25 1.05 0
+ 1.6e-11 1e-15 25 1.05 0
+ 1.6e-11 2e-15 25 1.05 0
+ 1.6e-11 4e-15 25 1.05 0
+ 1.6e-11 8.00000000000001e-15 25 1.05 0
+ 3.2e-11 1e-16 25 1.05 0
+ 3.2e-11 2.5e-16 25 1.05 0
+ 3.2e-11 5e-16 25 1.05 0
+ 3.2e-11 1e-15 25 1.05 0
+ 3.2e-11 2e-15 25 1.05 0
+ 3.2e-11 4e-15 25 1.05 0
+ 3.2e-11 8.00000000000001e-15 25 1.05 0
+ 6.4e-11 1e-16 25 1.05 0
+ 6.4e-11 2.5e-16 25 1.05 0
```

```
+ 6.4e-11 5e-16 25 1.05 0

+ 6.4e-11 1e-15 25 1.05 0

+ 6.4e-11 2e-15 25 1.05 0

+ 6.4e-11 4e-15 25 1.05 0

+ 6.4e-11 8.00000000000001e-15 25 1.05 0
```

This structure will run a simulation for each line in the .data section and effectively sweeps both dimensions of the table. Notice the second value in each line (1e-16, 2.5e-16, 5e-16, etc.) for the parameter load\_y. This matches the values in index\_2.

Now use a waveform viewer to see what happened when the simulation ran.

```
% cd /scratch/userA/gen_stdcells/
% cd customcells/runtime/spice/NAND2X1B_RVT/delay__A1__hl__Y__lh__ACQ_1.sif_0/
% wv deck.mt0 &
```

Then plot delay\_a1\_hl\_y\_lh.trig and delay\_a1\_hl\_y\_lh.targ, overlay them in the same plot, and you should see something similar to Figure 17.



Figure 17: NCX characterization of a NAND2 gate

## Place-and-route

First, optimize the timing for the existing 4-to-16 decoder and note the critical path.

```
% cd /scratch/userA/decoder_analysis/build-rvt/
% vim Makefrag
% ....
clock_period = 0.08
...
# add
```

Then run place-and-route and record the critical path length and the gates that it goes through.

Now you need to let Design Compiler and IC Compiler know about your new library. Add the following lines to both dc-syn/Makefile and icc-par/Makefile (replace it with the correct path)

```
% cd /scratch/userA/decoder_analysis/build-rvt/dc-syn
% vim Makefile
% ....

toplevelinst = ...

# Hack for custom cells
mw_sram_libs = /scratch/userA/gen_stdcells/lefs/customcells
db_sram_libs = /scratch/userA/gen_stdcells/customcells.db

% cd /scratch/userA/decoder_analysis/build-rvt/icc-par
% vim Makefile
% ....
toplevelinst = ...

# Hack for custom cells
mw_sram_libs = /scratch/userA/gen_stdcells/lefs/customcells
db_sram_libs = /scratch/userA/gen_stdcells/lefs/customcells.db
```

Now rerun both synthesis and place-and-route, and check the Verilog output to make sure your gate was instantiated. If it was not, check the log file for errors (search for your cell name: NAND2X1B\_RVT).