







# ICS904/CD2IC : Cell Design For Digital Integrated Circuits

L3: Structural design of digital circuits (2/2)

Yves MATHIEU
yves.mathieu@telecom-paris.fr

### **Outline**

Dynamic logic

Differential logic

Application specific logic

Sequential cells

Standard cell design

Performance optimization

Practical work



### Dynamic logic

#### Pseudo-NMOS logic

- The PMOS network is replaced by a passive load.
- Smaller: the "1" values of the truth table are implicit.
- Conflict: when the NMOS network is ON, a steady-state current flows.






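### PMOS/NMOS width balancing:

- The low output level must stay below the threshold voltage of the NMOS transistors.
- The propagation time for the rising edge of the output must be kept small.

A narrow (weak) PMOS load lowers the low output level and the static current but slows the rising edge; a wide (strong) one does the opposite. How can we avoid having to choose between speed and low power/robustness?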
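The trade-off can be illustrated with a rough numerical sketch. It assumes that the ON NMOS network and the PMOS load both behave as linear resistors, which is a first-order simplification that does not come from the slides; the function name, the parameters `r_n`, `r_p`, `c_load` and the 0.69 RC factor are illustrative assumptions.

```python
def pseudo_nmos(r_n, r_p, c_load, vdd=1.0):
    """First-order pseudo-NMOS figures (resistor model, an assumption).

    r_n    : equivalent resistance of the ON NMOS network (ohms)
    r_p    : equivalent resistance of the PMOS load (ohms)
    c_load : output load capacitance (farads)
    """
    v_ol = vdd * r_n / (r_n + r_p)   # low level set by the resistive divider
    i_static = vdd / (r_n + r_p)     # steady-state current when the output is low
    t_rise = 0.69 * r_p * c_load     # rising edge is driven by the load only
    return v_ol, t_rise, i_static


# Hypothetical values: a weak load versus a strong load
weak = pseudo_nmos(r_n=1e3, r_p=9e3, c_load=1e-15)
strong = pseudo_nmos(r_n=1e3, r_p=3e3, c_load=1e-15)
print("weak load  :", weak)
print("strong load:", strong)
```

With these values the weak load gives a lower output low level and a lower static current, while the strong load gives a faster rising edge.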









#### **Precharge logic**



- Only one NMOS network.
- One clock (synchronous context).
- $\Phi = 0$: the output is precharged to "1" (precharge phase).
- $\Phi = 1$: the output is conditionally computed (evaluation phase).
- State "1" is a high-impedance state.
- The leakage current of the transistors limits the minimum clock frequency (state "1" eventually disappears).
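The minimum clock frequency can be estimated with a simple sketch. It assumes a constant leakage current discharging the output capacitance and a maximum tolerated voltage droop; these modelling choices and all names (`c_node`, `i_leak`, `max_droop`, `duty`) are illustrative assumptions, not values from the course.

```python
def min_clock_frequency(c_node, i_leak, max_droop, duty=0.5):
    """Lowest clock frequency that keeps a floating "1" valid.

    c_node    : capacitance of the precharged output node (farads)
    i_leak    : leakage current, assumed constant (amperes)
    max_droop : tolerated voltage loss on the stored "1" (volts)
    duty      : fraction of the period spent in the evaluation phase
    """
    # Longest time the node may float before losing max_droop volts
    t_float_max = c_node * max_droop / i_leak
    # The evaluation phase lasts duty / f, which must not exceed t_float_max
    return duty / t_float_max


# Hypothetical values: 1 fF node, 1 pA leakage, 0.1 V tolerated droop
print(min_clock_frequency(1e-15, 1e-12, 0.1), "Hz")
```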








#### **Constraints**

- If the final state is "1", the output must stay at "1" during the whole evaluation phase.
- So the inputs must be stable during the evaluation phase.
- Consequently, the output of such a cell cannot be used as an input of another cell.
- Even if these constraints are met, the final output voltage may be lower than $V_{dd}$ (charge sharing inside the NMOS network).
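The charge-sharing effect can be quantified by charge conservation. The sketch below assumes one internal node of the NMOS network, with capacitance `c_int`, gets connected to the precharged output while the path to ground stays open; the names and numerical values are hypothetical.

```python
def charge_sharing_voltage(c_out, c_int, vdd=1.0, v_int=0.0):
    """Output voltage after the output node shares its charge.

    c_out : capacitance of the precharged output node
    c_int : capacitance of the internal node that gets connected
    v_int : initial voltage of the internal node (0 V is the worst case)
    """
    # Total charge is conserved when the two capacitors are connected
    return (c_out * vdd + c_int * v_int) / (c_out + c_int)


# Hypothetical example: the internal node is one third of the output capacitance
print(charge_sharing_voltage(c_out=3e-15, c_int=1e-15))  # output drops below Vdd
```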

















#### **Domino logic**



- During the precharge phase, all gate inputs are "0" (all NMOS networks are OFF).
- During the evaluation phase, some inputs switch to "1".
- Then some NMOS networks switch to the ON state.
- Then some gate outputs switch to "0".
- Then some gate inputs switch to "1", and so on.
- $T_{cycle} > \sum T_{propagation}$
- Warning: only non-inverting gates can be implemented.
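The ripple behaviour can be sketched with a small behavioural model. It assumes that each domino gate is a precharged NMOS stage followed by a static inverter, so that every gate output is "0" after precharge; the helper names (`domino_gate`, `and2`, `or2`, `chain`) are purely illustrative.

```python
def domino_gate(pulldown, inputs, phi):
    """One domino gate: dynamic NMOS stage followed by a static inverter.

    pulldown(inputs) tells whether the NMOS network conducts.
    Returns the gate output (after the inverter).
    """
    if phi == 0:
        dynamic_node = 1                              # precharge phase
    else:
        dynamic_node = 0 if pulldown(inputs) else 1   # evaluation phase
    return 1 - dynamic_node                           # static inverter


def and2(ins):
    return bool(ins[0] and ins[1])   # two NMOS transistors in series


def or2(ins):
    return bool(ins[0] or ins[1])    # two NMOS transistors in parallel


def chain(a, b, c, phi):
    """Two cascaded domino gates computing (a AND b) OR c."""
    g1 = domino_gate(and2, (a, b), phi)
    g2 = domino_gate(or2, (g1, c), phi)
    return g1, g2


print(chain(1, 1, 0, phi=0))  # precharge: every gate output is 0
print(chain(1, 1, 0, phi=1))  # evaluation: the "1" ripples through the chain
```

Because the outputs can only rise during evaluation, the model can only express non-inverting functions such as AND and OR.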






#### **Domino N-P logic**

- Simplified: no more inverter.
- Warning: only inverting gates can be implemented.
- Warning: NMOS gates must alternate with PMOS gates.

#### Summary

- Very high-speed logic (very small capacitive loads).
- Widely used in the 1980s and 1990s.
- Example: high-speed carry chains for arithmetic computation in microprocessors.
- But: the clock is loaded by all cells (power consumption of the clock network).
- But: the precharge phase is wasted time.
- But: design automation is complex (no direct RTL synthesis tool).
- Only used for optimized "full custom" designs.






### Differential logic



#### Introduction

- All signals are duplicated: $A \Rightarrow (A_T, A_F)$
- $F$ and $\overline{F}$ are computed in parallel.
- No inverter needed.
- Reduces the complexity of arithmetic computation.
- Two NMOS transistor networks of equal size.
- Not limited to inverting or non-inverting gates.
- Fewer gates, but more wires.
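A minimal sketch of the dual-rail convention, assuming each signal is represented as a pair `(A_T, A_F)` of complementary bits. The function names are hypothetical; the point is that inversion costs no gate, only a swap of the two wires.

```python
def encode(a):
    """Dual-rail encoding of a bit: (A_T, A_F)."""
    return (a, 1 - a)


def dr_not(x):
    """Inversion is free: just swap the two wires."""
    return (x[1], x[0])


def dr_and(x, y):
    """True rail computes AND, false rail computes its complement."""
    return (x[0] & y[0], x[1] | y[1])


def dr_xor(x, y):
    """Both rails are built from the same number of product terms."""
    true_rail = (x[0] & y[1]) | (x[1] & y[0])
    false_rail = (x[0] & y[0]) | (x[1] & y[1])
    return (true_rail, false_rail)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, dr_and(encode(a), encode(b)), dr_xor(encode(a), encode(b)))
```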




#### **Cascode Voltage Switch Logic**





#### **DCVSL: Dynamic Cascode Voltage Switch Logic**

- No conflict during output transitions.
- Q1: Design a 2-input CVSL XOR gate. Try to minimize the number of transistors.






#### **CPL: Complementary Pass Logic**

- Pass-transistor logic using only NMOS: slow, with a degraded logic "1", but...
- "Weak" PMOS pull-ups restore the full voltage swing, and outputs are buffered by CMOS inverters.
- Using low-$V_t$ transistors for the network and high-$V_t$ transistors for the inverters helps speed optimization.
- Said to be one of the fastest logic styles.
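The degraded logic "1" can be illustrated with a first-order sketch. It assumes an ideal NMOS pass transistor that stops conducting once its output reaches the gate voltage minus the threshold voltage, and it ignores body effect; the function name and the voltages are hypothetical.

```python
def nmos_pass_level(v_in, v_gate, v_t):
    """Steady-state output of an ideal NMOS pass transistor (no body effect)."""
    # The transistor turns off when the output reaches v_gate - v_t
    return min(v_in, max(v_gate - v_t, 0.0))


vdd, v_t = 1.0, 0.3   # hypothetical supply and threshold voltages
print(nmos_pass_level(0.0, vdd, v_t))  # a "0" is passed without loss
print(nmos_pass_level(vdd, vdd, v_t))  # a "1" is degraded to about vdd - v_t
```

This is why the weak PMOS pull-ups are needed to restore the full swing, and why a lower threshold voltage in the network reduces the loss.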






### Application specific logic



#### **Application specific logic styles**

- Current-mode logic
- Adiabatic logic
- Subthreshold logic
- ...




### Sequential cells





- Muxes with feedback loops define two storage elements (master and slave).
- The master (resp. slave) holds its value while the slave (resp. master) is transparent.
- Skew between the two clocks must be avoided (race condition).
- The D signal must respect timing constraints:
  - Setup time / hold time
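The master/slave behaviour can be sketched with a zero-delay behavioural model. It assumes the master is transparent when the clock is low and the slave when it is high (a rising-edge flip-flop); the class names are illustrative, and setup/hold times are not modelled.

```python
class MuxLatch:
    """Latch built from a mux with a feedback loop."""

    def __init__(self):
        self.q = 0

    def update(self, d, transparent):
        # The mux selects D when transparent, else its own output (hold)
        if transparent:
            self.q = d
        return self.q


class MasterSlaveDFF:
    """Rising-edge D flip-flop made of two latches with opposite clock phases."""

    def __init__(self):
        self.master = MuxLatch()
        self.slave = MuxLatch()

    def step(self, d, clk):
        m = self.master.update(d, transparent=(clk == 0))
        return self.slave.update(m, transparent=(clk == 1))


ff = MasterSlaveDFF()
for d, clk in [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]:
    print(f"D={d} CLK={clk} -> Q={ff.step(d, clk)}")
```

In this model Q only changes when the clock goes from low to high, and changes of D while the clock is high are ignored.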










- Clocks are generated internally in order to avoid unwanted skew.
- Warning: the D input is connected to the drain of a transistor.
- CMOS logic and pass-transistor logic can be mixed (using tristate inverters).
- Q1: Design a D flip-flop using tristate inverters.










### Standard cell design



#### **Full Custom**

- Manual layout of all needed cells.
- Long electrical simulations to verify a whole block.
- Wide choice of logic styles.
- Scripting may help the layout phases.
- Ultimate optimisation for speed, power or area.
- Only for high value-added digital or analog blocks.

#### **Standard Cells**

- Layout design limited to generic cells.
- Electrical simulation to extract cell properties.
- Small choice of logic styles.
- Automation of the synthesis, place and route phases.
- Suboptimal for speed, power and area.
- When time-to-market is the main criterion.








- All cells have the same height.
- Power supply and ground are connected by abutment.
- Each cell must be free of DRC errors.
- Any abutment of any pair of cells must be free of DRC errors.
- Wiring inside a cell is limited to the Metal1 level.








#### **Reference technology profile**









- NMOS areas and PMOS areas are already filled.
- Body-tie areas for NMOS and PMOS are already filled.
- Body ties are already connected to $V_{dd}$ (for PMOS) or $V_{ss}$ (for NMOS).
- Simple abutment of cells fills a row of cells with NMOS and PMOS areas.










#### **gpdk045 adder: practical design**

- All transistors have a horizontal orientation.
- The maximum width is defined by the height of the NMOS and PMOS areas.
- Use parallel transistors for larger widths.
- Drain/source implants may be used for short local wires (beware of their resistivity).
- Global optimisation of Eulerian paths (the NMOS and PMOS subcircuits are seen as graphs, and the path visits every edge exactly once).
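The Eulerian path search can be sketched as follows. The sketch assumes one network (NMOS or PMOS) described as a graph whose nodes are nets and whose edges are transistors, and it uses Hierholzer's algorithm, which is a choice made here and not necessarily the method used in the course. The net names and the `euler_path` function are hypothetical. The resulting gate order is a candidate for placing transistors side by side with shared diffusion; finding one order valid for both networks at once is not handled here.

```python
from collections import defaultdict


def euler_path(edges):
    """Return the gate names along an Eulerian path, or None if none exists.

    edges: list of (net_a, net_b, gate_name), one entry per transistor.
    """
    adj = defaultdict(list)                  # net -> [(neighbour, edge index)]
    for i, (a, b, _) in enumerate(edges):
        adj[a].append((b, i))
        adj[b].append((a, i))
    odd = [n for n in adj if len(adj[n]) % 2 == 1]
    if len(odd) not in (0, 2):               # Euler's condition on node degrees
        return None
    start = odd[0] if odd else edges[0][0]
    used = [False] * len(edges)
    stack, path = [(start, None)], []
    while stack:                             # Hierholzer's algorithm
        node, via = stack[-1]
        while adj[node] and used[adj[node][-1][1]]:
            adj[node].pop()                  # drop edges already walked
        if adj[node]:
            nxt, i = adj[node].pop()
            used[i] = True
            stack.append((nxt, i))
        else:
            stack.pop()
            if via is not None:
                path.append(edges[via][2])
    if len(path) != len(edges):              # graph was not connected
        return None
    return path[::-1]


# NMOS network of a 2-input NAND: A and B in series between out and gnd
print(euler_path([("out", "x", "A"), ("x", "gnd", "B")]))
```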










- Only the information needed for place and route is kept:
  - Wires connected to the input/output pins of the cell.
  - Wires that are obstacles for routing.
- The router may use Metal1 as a wiring layer if there is enough room inside the cell.










### Performance optimization






#### Introduction

- Area of the logic gates.
- Speed of the logic gates.
- Power consumption of the logic gates.
- Noise margin of the logic gates.
- EDP: "Energy Delay Product" of the logic gates.
- Near-threshold or sub-threshold behavior.
- Robustness of the design.
- ...






#### **Buffer optimization example: simple inverter**

- Problem definition: what is the fastest way to transmit data from the input of gate A to the inputs of the gates connected to A?
- The timing model of gate A is known: $T_{pA} = T_{p0A} + R_A \cdot C_{load}$
- The inputs of the gates connected to A are modeled by a load capacitor $C_{LdA}$.
- A parametrized inverter can be used:
  - $T_{pIV}(\alpha) = T_{p0IV} + (R_{0IV}/\alpha) \cdot C_{load}$
  - $C_{InIV}(\alpha) = C_{0InIV} \cdot \alpha$
  - with $\alpha \geq 1.0$
- The inverter is inserted between gate A and the other gates.
- Compute the propagation time through the gates as a function of $\alpha$.
- Compute the value of $\alpha$ giving the minimum propagation time.
- Compute the value of the minimum propagation time.
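One possible way to work through the exercise is sketched below. With the inverter inserted, gate A drives $C_{InIV}(\alpha)$ and the inverter drives $C_{LdA}$, so the total delay is $T_{p0A} + R_A C_{0InIV}\,\alpha + T_{p0IV} + R_{0IV} C_{LdA}/\alpha$. Setting its derivative to zero gives $\alpha = \sqrt{R_{0IV} C_{LdA} / (R_A C_{0InIV})}$, clamped to $\alpha \geq 1$. The function names and the numerical values are hypothetical.

```python
import math


def total_delay(alpha, tp0a, ra, tp0iv, r0iv, c0in, cld):
    """Delay of gate A loaded by the inverter, plus inverter loaded by C_LdA."""
    t_a = tp0a + ra * (c0in * alpha)        # gate A drives C_InIV(alpha)
    t_iv = tp0iv + (r0iv / alpha) * cld     # inverter drives C_LdA
    return t_a + t_iv


def optimal_alpha(ra, r0iv, c0in, cld):
    """Zero of the derivative of total_delay, limited to alpha >= 1."""
    return max(1.0, math.sqrt((r0iv * cld) / (ra * c0in)))


# Hypothetical normalized values: the load is 16 times the unit input capacitance
params = dict(tp0a=1.0, ra=1.0, tp0iv=1.0, r0iv=1.0, c0in=1.0, cld=16.0)
a_opt = optimal_alpha(params["ra"], params["r0iv"], params["c0in"], params["cld"])
print("alpha_opt =", a_opt)
print("T_min     =", total_delay(a_opt, **params))
print("T(alpha=1) =", total_delay(1.0, **params))
```

When the unconstrained optimum is at least 1, the minimum delay is $T_{p0A} + T_{p0IV} + 2\sqrt{R_A C_{0InIV} R_{0IV} C_{LdA}}$.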



- Problem definition: What is the fastest way to transmit data from the input of gate A to the inputs of the gates connected to A?
- The timing model of gate A is known: $T_{pA} = T_{p0A} + R_A \cdot C_{load}$
- The inputs of the gates connected to A are modeled by a load capacitor $C_{LdA}$.
- A parametrized inverter can be used:
  - $T_{pIV}(\alpha) = T_{p0IV} + (R_{0IV}/\alpha) \cdot C_{load}$
  - $C_{InIV}(\alpha) = C_{0InIV} \cdot \alpha$
  - with $\alpha \geq 1.0$
- The inverter is inserted between gate A and the other gates.
- Compute the propagation time through the gates as a function of $\alpha$.
- Compute the value of $\alpha$ giving the minimum propagation time.
- Compute the value of the minimum propagation time.
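
One possible way to check a hand derivation for this exercise is sketched below. It assumes the total delay is the sum of the delay of gate A loaded by the inverter input and the delay of the inverter loaded by $C_{LdA}$, so that setting the derivative with respect to $\alpha$ to zero gives $\alpha_{opt} = \sqrt{R_{0IV} C_{LdA} / (R_A C_{0InIV})}$, clamped to the constraint $\alpha \geq 1$. The function and argument names are illustrative and do not come from the course material.

```python
import math


def total_delay(alpha, tp0_a, r_a, tp0_iv, r0_iv, c0_in_iv, c_ld_a):
    """Delay of gate A followed by one inverter of size alpha (linear models)."""
    # Gate A is loaded by the input capacitance of the inverter.
    gate_a = tp0_a + r_a * (c0_in_iv * alpha)
    # The inverter is loaded by the gates originally connected to A.
    inverter = tp0_iv + (r0_iv / alpha) * c_ld_a
    return gate_a + inverter


def optimal_alpha(r_a, r0_iv, c0_in_iv, c_ld_a):
    """Size cancelling d(total_delay)/d(alpha), clamped to alpha >= 1."""
    alpha = math.sqrt((r0_iv * c_ld_a) / (r_a * c0_in_iv))
    # The delay is convex in alpha, so the constrained optimum is the clamp.
    return max(1.0, alpha)


if __name__ == "__main__":
    # Hypothetical values: 1 kOhm resistances, 1 fF unit input, 100 fF load.
    params = dict(tp0_a=10e-12, r_a=1e3, tp0_iv=10e-12,
                  r0_iv=1e3, c0_in_iv=1e-15, c_ld_a=100e-15)
    best = optimal_alpha(params["r_a"], params["r0_iv"],
                         params["c0_in_iv"], params["c_ld_a"])
    print(best, total_delay(best, **params))
```

When the unconstrained optimum is above 1, the minimum delay under this model is $T_{p0A} + T_{p0IV} + 2\sqrt{R_A C_{0InIV} R_{0IV} C_{LdA}}$. Inserting the inverter is only worthwhile if this value is smaller than the delay $T_{p0A} + R_A \cdot C_{LdA}$ of gate A driving the load directly.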





- The inverter is replaced by two successive inverters with parameters $\alpha_0$ and $\alpha_1$.
- What are the optimal values of $\alpha_0$ and $\alpha_1$ for a minimum propagation time?
- Compute the value of the minimum propagation time.
- The two inverters are replaced by *N* successive inverters with parameters $\alpha_0 \dots \alpha_{N-1}$.
- What are the optimal values of the *N* parameters $\alpha_i$ for a minimum propagation time?
- Compute the value of the minimum propagation time.
- Compute the value of *N* that minimizes the propagation time.
- Compute the value of the minimum propagation time.
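
The following sketch is one way to explore these questions numerically. It assumes the same linear timing models as in the single-inverter problem, and it uses the standard result that the delay of such a chain is minimal when every stage sees the same ratio between its load and its drive, which gives geometrically increasing sizes. The constraint $\alpha_i \geq 1$ is not enforced here, and all function names and numerical values are hypothetical.

```python
def chain_delay(alphas, tp0_a, r_a, tp0_iv, r0_iv, c0_in_iv, c_ld_a):
    """Delay of gate A followed by inverters of sizes alphas[0..N-1]."""
    # Load of each stage: next inverter input, or C_LdA for the last stage.
    loads = [c0_in_iv * a for a in alphas] + [c_ld_a]
    delay = tp0_a + r_a * loads[0]
    for alpha, load in zip(alphas, loads[1:]):
        delay += tp0_iv + (r0_iv / alpha) * load
    return delay


def optimal_sizes(n, r_a, r0_iv, c0_in_iv, c_ld_a):
    """Geometric sizing of n inverters (alpha >= 1 is NOT enforced)."""
    # Gate A behaves like an inverter of size r0_iv / r_a on the drive side,
    # and the load behaves like an inverter input of size c_ld_a / c0_in_iv.
    total_ratio = (c_ld_a / c0_in_iv) * (r_a / r0_iv)
    step = total_ratio ** (1.0 / (n + 1))
    return [(r0_iv / r_a) * step ** (i + 1) for i in range(n)]


def best_inverter_count(max_n, tp0_a, r_a, tp0_iv, r0_iv, c0_in_iv, c_ld_a):
    """Scan N = 0..max_n and return (N, delay, sizes) with the smallest delay."""
    best = None
    for n in range(max_n + 1):
        sizes = optimal_sizes(n, r_a, r0_iv, c0_in_iv, c_ld_a)
        delay = chain_delay(sizes, tp0_a, r_a, tp0_iv, r0_iv, c0_in_iv, c_ld_a)
        if best is None or delay < best[1]:
            best = (n, delay, sizes)
    return best


if __name__ == "__main__":
    # Hypothetical values: 1 pF load, 1 fF unit input, 10 ps intrinsic delays.
    print(best_inverter_count(6, 10e-12, 1e3, 10e-12, 1e3, 1e-15, 1000e-15))
```

The scan over *N* makes the trade-off visible: each added inverter reduces the load-dependent part of the delay but adds one intrinsic delay $T_{p0IV}$. An odd number of inverters also inverts the signal, which may restrict the usable values of *N* in practice.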






Several inverters




How to transmit data on a long wire

- The long wire is modeled as a distributed RC line.
- Inverters distributed along the line may help minimize the overall propagation time.
- Same kind of optimization, but with a nonlinear model of the propagation time through the line.
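
As an illustration only, the sketch below evaluates the delay of a wire cut into `k` equal segments, each driven by an identical inverter. The slides do not give the wire model, so the code assumes an Elmore-style approximation in which a segment of resistance $R_{seg}$ and capacitance $C_{seg}$ contributes $R_{seg} \cdot (C_{seg}/2 + C_{in})$. It also assumes that the far end of the line is loaded by one inverter input. All names and values are hypothetical.

```python
def repeated_wire_delay(k, alpha, length, r_wire, c_wire,
                        tp0_iv, r0_iv, c0_in_iv):
    """Delay of a wire split into k segments, each driven by an inverter.

    r_wire and c_wire are per-unit-length values; length uses the same unit.
    """
    r_seg = r_wire * length / k
    c_seg = c_wire * length / k
    c_in = c0_in_iv * alpha  # input capacitance of the next inverter
    # Inverter driving the whole segment plus the next inverter input.
    driver = tp0_iv + (r0_iv / alpha) * (c_seg + c_in)
    # Assumed Elmore-style delay of the distributed RC segment itself.
    wire = r_seg * (0.5 * c_seg + c_in)
    return k * (driver + wire)


def best_segment_count(max_k, alpha, length, r_wire, c_wire,
                       tp0_iv, r0_iv, c0_in_iv):
    """Scan k = 1..max_k and return (k, delay) with the smallest delay."""
    delays = [
        (repeated_wire_delay(k, alpha, length, r_wire, c_wire,
                             tp0_iv, r0_iv, c0_in_iv), k)
        for k in range(1, max_k + 1)
    ]
    delay, k = min(delays)
    return k, delay


if __name__ == "__main__":
    # Hypothetical 10 mm wire: 100 Ohm/mm, 0.2 pF/mm, inverters of size 50.
    print(best_segment_count(20, 50.0, 10.0, 100.0, 2e-13,
                             10e-12, 10e3, 1e-15))
```

The wire term grows with the square of the segment length, which is why splitting the line can reduce the total delay even though every added inverter brings its own delay.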








#### Outline

Dynamic logic

Differential logic

Application specific logic

Sequential cells

Standard cell design

Performance optimization

Practical work



**TSPC: True Single Phase Clock logic** 



- Very compact structure.
- Expected to be faster than a standard CMOS cell.






- Question: How does this cell work?
- Set up a table with D, Q, Ck and all internal nodes of the cell.
- Also add the states of the MOS transistors (ON or OFF).
- Each internal node may have the states 0, 1, U (unknown), Z0 (high-impedance 0), Z1 (high-impedance 1), or ZU (high-impedance unknown).
- Imagine the following input sequence.



- Fill the table with the successive values of the signals
- Check if the cell is a D flip-flop.
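
The table can be filled by hand, but the bookkeeping rules behind the six node states can also be written down as a small helper, sketched below. The code does not encode the schematic of the TSPC cell, which is given in the figure; it only shows one way to resolve a single node from the conduction state of its pull-up and pull-down paths. Treating a conflict between the two paths as unknown is a simplifying assumption, and all names are illustrative.

```python
def nmos_state(gate):
    """Conduction state of an NMOS transistor for a given gate node state."""
    if gate in ("1", "Z1"):
        return "ON"
    if gate in ("0", "Z0"):
        return "OFF"
    return "U"  # gate value is U or ZU


def pmos_state(gate):
    """Conduction state of a PMOS transistor (opposite of the NMOS)."""
    return {"ON": "OFF", "OFF": "ON", "U": "U"}[nmos_state(gate)]


def series(*states):
    """Conduction state of transistors connected in series."""
    if "OFF" in states:
        return "OFF"
    if all(s == "ON" for s in states):
        return "ON"
    return "U"


def resolve_node(pull_up, pull_down, previous):
    """New state of a node from its two paths and its previous state."""
    if pull_up == "ON" and pull_down == "OFF":
        return "1"
    if pull_up == "OFF" and pull_down == "ON":
        return "0"
    if pull_up == "OFF" and pull_down == "OFF":
        # Floating node: it keeps the stored value, marked high-impedance.
        return "Z" + previous[-1]
    # Conflict or unknown conduction: treated as unknown (assumption).
    return "U"


if __name__ == "__main__":
    # Example: a clocked NMOS stack with Ck = 1 and a data node holding Z0.
    down = series(nmos_state("1"), nmos_state("Z0"))
    print(down, resolve_node("OFF", down, "1"))
```

Applying `resolve_node` to every internal node, one row of the table at a time, gives the same kind of sequence as the manual exercise, provided the pull-up and pull-down paths of each node are read from the schematic.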




