Installation of RISC-V Toolchain
https://github.com/kunalg123/riscv_workshop_collaterals/blob/master/run.sh
- Execute the commands in run.sh
- Check that the gcc version on your system is 12
If it is not, the build may fail with an error. In that case, install gcc 12 with the commands below:

sudo apt upgrade
sudo apt install build-essential
sudo apt -y install gcc-12 g++-12
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 12
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-12 12
sudo update-alternatives --config gcc
sudo update-alternatives --config g++
gcc --version; g++ --version
The above commands switch the default gcc to version 12. To check that the RISC-V toolchain was installed successfully, run the command below; it should print the toolchain's version information
riscv64-unknown-elf-gcc --version
Instruction Set Architecture
Instruction set architecture or computer architecture is an abstract model of the computer that defines how the CPU is controlled by the software. It acts as an interface between languages like C, C++, Java, and the hardware. The type of instructions depends on the type of hardware.
From Apps to Hardware
Application software ---> System software ---> Hardware
- System Software converts application software into binary language
- It has three major parts:
- Operating system
- Compiler
- Assembler
- The operating system handles low-level functions such as I/O and memory allocation for programs written in C, C++, Java, or any other language and passes the code to the compiler. The compiler translates the code into instructions for the target hardware (referred to here as the .exe file). These instructions are fed into the assembler, which generates the binary machine-language code that the hardware executes
- Pseudo Instructions
- Base Integer Instructions(RV64I)
- Multiply Extension(RV64M)
- Single and Double precision floating point Extension(RV64F and RV64D)
These are the keywords through which programmers can access the registers of RISC-V. They are basically the System functions associated with the RISC-V registers
Labwork for RISC-V Toolchain
Write a program to calculate the sum of numbers from 1 to n
#include <stdio.h>
int main(){
int i,sum=0,n=10;
for(i=1;i<=n;i++){
sum=sum+i;
}
printf("Sum of numbers from 1 to %d is %d",n,sum);
}
- To execute the above type in the following commands
gcc sum.c
./a.out
- To display the code present in the .c file
cat sum.c
- To compile the C language code using RISC-V Compiler
- O1 optimization
riscv64-unknown-elf-gcc -O1 -mabi=lp64 -march=rv64i -o sum.o sum.c
- Ofast optimization
riscv64-unknown-elf-gcc -Ofast -mabi=lp64 -march=rv64i -o sum.o sum.c
If an error is reported at this step (for example, the compiler cannot be found), open ~/.bashrc, add the two export lines below, and then re-run the compilation command
vim ~/.bashrc
export PATH=~/riscv_toolchain/riscv64-unknown-elf-gcc-8.3.0-2019.08.0-x86_64-linux-ubuntu14/bin:$PATH
export PATH=~/riscv_toolchain/riscv64-unknown-elf-gcc-8.3.0-2019.08.0-x86_64-linux-ubuntu14/riscv64-unknown-elf/bin:$PATH
- To view the assembly-level code for the C program, which is compiled using RISC-V
riscv64-unknown-elf-objdump -d sum.o
riscv64-unknown-elf-objdump -d sum.o | less
spike pk sum.o
spike -d pk sum.o
The above command is used for debugging

- Press ENTER to show the first line and ENTER again to step through successive lines
- Press q to quit the debug process
Integer Number Representation
- Unsigned numbers: non-negative integers that have no + or - sign associated with them. Range: [0, (2^n)-1]
- Signed numbers: a set of both positive and negative numbers. Range: [-2^(n-1), 2^(n-1)-1]
To represent negative numbers in binary, the 2's complement method is used.
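The ranges above can be checked for any width with a few integer helpers. This is a minimal sketch; the function names are made up for illustration, and it assumes widths between 1 and 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Largest n-bit unsigned value: 2^n - 1 (valid for 1 <= n <= 64). */
uint64_t unsigned_max(int n) {
    assert(n >= 1 && n <= 64);
    /* Shifting a 64-bit value by 64 is undefined in C, so n = 64 is special-cased. */
    return (n == 64) ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

/* Largest n-bit two's complement value: 2^(n-1) - 1. */
int64_t signed_max(int n) {
    return (int64_t)(unsigned_max(n) >> 1);
}

/* Smallest n-bit two's complement value: -2^(n-1). */
int64_t signed_min(int n) {
    return -signed_max(n) - 1;
}

/* n-bit two's complement encoding of -x: invert all bits, then add 1. */
uint64_t twos_complement_negate(uint64_t x, int n) {
    return (~x + 1) & unsigned_max(n);
}
```

For n = 8 these helpers give 255, 127 and -128, and the 8-bit encoding of -5 is 0xFB.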
Lab
- Write a C program that shows the maximum and minimum values of "n" bit unsigned numbers. Here n=64 is considered
#include <stdio.h>
#include <math.h>
int main(){
int n=64;
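/* pow(2,n) is a double, and converting an out-of-range double to an integer is undefined, so integer operations are used */
unsigned long long int max = (n >= 64) ? ~0ULL : ((1ULL << n) - 1);
unsigned long long int min = 0; /* the smallest unsigned value is always 0 */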
printf("Minimum value is %llu\n",min);
printf("Maximum value is %llu\n",max);
return 0;
}
- Write a C program that shows the maximum and minimum values of "n" bit signed numbers
#include <stdio.h>
#include <math.h>
int main(){
int n=64;
/* pow(2,n-1)-1 rounds up to 2^63 in double, which does not fit in long long, so integer operations are used */
long long int max = (long long int) ((1ULL << (n-1)) - 1);
long long int min = -max - 1;
printf("Minimum value is %lld\n",min);
printf("Max value is %lld\n",max);
return 0;
}
Application Binary Interface
- An Application Binary Interface (ABI) is the interface between two binary program modules that allows them to work together. It defines the interface between two software components or systems that are written in different programming languages, compiled by different compilers, or running on different hardware architectures.
- The ABI defines how your code is stored inside the library file so that any program using your library can locate the desired function and execute it.
Double Words- Memory allocation
Architectures can also be divided into two types based on how multi-byte data is laid out in memory:
1. Little Endian: Here, the least significant byte is at the lowest memory address, and the most significant byte is at the highest memory address.
2. Big Endian: Here, the most significant byte is at the lowest memory address, and the least significant byte is at the highest memory address.
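The two byte orders can be made concrete by writing a 64-bit double word into a byte array both ways. This is a small sketch; the function names are illustrative and index 0 of the array stands for the lowest memory address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little endian: least significant byte at the lowest address (index 0). */
void store_little_endian(uint64_t value, uint8_t out[8]) {
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Big endian: most significant byte at the lowest address (index 0). */
void store_big_endian(uint64_t value, uint8_t out[8]) {
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(value >> (8 * (7 - i)));
    }
}
```

Storing 0x0102030405060708 puts 0x08 at index 0 in the little-endian layout and 0x01 at index 0 in the big-endian layout.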
Load, Add, and Store Instructions
- Load Instruction
Considering the instruction
ld x8,16(x23)
- Here ld represents the loading of a double word
- x8 is the destination register
- x23 is the source register which holds the base address
- 16 is the offset which is added to the base address
- The base address and the offset are added to generate the effective memory address
- The content at that address is read and loaded into the destination register, i.e. x8 here
- Add Instruction
Instruction: add x8,x24,x8
- Here add represents a normal addition operation
- x8 is the destination register
- x24 is the source register 1
- x8 is the source register 2
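The behaviour of the two instructions can be sketched with a toy register file and memory in C. This is only an illustration: the toy_cpu type, the 64-byte memory and the function names are assumptions, and little-endian byte order is used because that is what RISC-V uses.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 64  /* hypothetical toy memory size in bytes */

typedef struct {
    uint64_t x[32];          /* integer registers x0..x31 */
    uint8_t  mem[MEM_SIZE];  /* byte-addressable memory */
} toy_cpu;

/* ld rd, offset(rs1): read 8 bytes from address x[rs1] + offset into x[rd]. */
void exec_ld(toy_cpu *cpu, int rd, int64_t offset, int rs1) {
    uint64_t addr = cpu->x[rs1] + (uint64_t)offset;  /* base + offset */
    assert(addr <= MEM_SIZE - 8);                    /* stay inside the toy memory */
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | cpu->mem[addr + i];   /* byte at lowest address is least significant */
    }
    if (rd != 0) cpu->x[rd] = value;                 /* x0 is hard-wired to zero */
}

/* add rd, rs1, rs2: x[rd] = x[rs1] + x[rs2], wrapping on overflow. */
void exec_add(toy_cpu *cpu, int rd, int rs1, int rs2) {
    if (rd != 0) cpu->x[rd] = cpu->x[rs1] + cpu->x[rs2];
}
```

With x23 = 8, exec_ld(&cpu, 8, 16, 23) reads the double word that starts at address 24, and exec_add(&cpu, 8, 24, 8) then adds x24 to it in place.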
32 Registers and their general ABI Names
Through the ABI names, we reserve some of these registers for certain purposes

Labwork
Using ABI function calls (re-writing the C program using assembly language)
C program - .c file
#include <stdio.h>
extern int load(int x, int y);
int main()
{
int result = 0;
int count = 9;
result = load(0x0, count+1);
printf("Sum of numbers from 1 to 9 is %d\n", result);
}
Assembly file - .s file
.section .text
.global load
.type load, @function
load:
add a4, a0, zero
add a2, a0, a1
add a3, a0, zero
loop:
add a4, a3, a4
addi a3, a3, 1
blt a3, a2, loop
add a0, a4, zero
ret
Compile the above using
riscv64-unknown-elf-gcc -Ofast -mabi=lp64 -march=rv64i -o custom1to9.o custom1to9.c load.S
To get the assembly-level code
riscv64-unknown-elf-objdump -d custom1to9.o |less
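For reference, the assembly routine load can be read as the C function below, with each statement annotated with the instruction it mirrors. The name load_reference is made up for this sketch, and it assumes the caller passes y >= 1, as main does.

```c
#include <assert.h>

/* Hypothetical C equivalent of the assembly routine `load`. */
int load_reference(int x, int y) {
    assert(y >= 1);           /* assumption: at least one iteration is intended */
    int sum   = x;            /* add a4, a0, zero */
    int limit = x + y;        /* add a2, a0, a1   */
    int i     = x;            /* add a3, a0, zero */
    do {
        sum += i;             /* add a4, a3, a4   */
        i   += 1;             /* addi a3, a3, 1   */
    } while (i < limit);      /* blt a3, a2, loop */
    return sum;               /* add a0, a4, zero ; ret */
}
```

Calling load_reference(0x0, 9 + 1) adds 0 through 9 and returns 45, which is the value main prints.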
RISC-V CPU (PICORV-32)
PicoRV-32 is a size-optimized RISC-V CPU Core that implements the RISC-V RV32IMC Instruction Set.

Introduction
- RTL design is checked for adherence to the spec by simulating the design
- Simulator (Iverilog here) is a tool used for checking the design (a set of Verilog codes here)
- Working of the simulator: the simulator looks for changes in the input signals and evaluates the output. The output is re-evaluated only when an input value changes
Testbench
Iverilog Based Simulation flow
- vcd file: A Value Change Dump file stores all the information about value changes in the simulator
- GTKwave: a waveform viewer used to inspect the VCD file produced by the simulation, in order to verify the Verilog design code through a testbench.
sudo apt install gtkwave
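The idea that a simulator evaluates the output only when an input changes, and that a VCD file records only value changes, can be sketched in C. The 2:1 mux, the mux_inputs type and the function name are assumptions for illustration; this is not how Iverilog is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* One input sample for a hypothetical 2:1 mux: y = sel ? i1 : i0. */
typedef struct { int i0, i1, sel; } mux_inputs;

/*
 * Walk through the samples, re-evaluate the output only when an input
 * differs from the previous sample, and return how many output values
 * would be dumped (the initial value plus every later change).
 */
size_t count_output_changes(const mux_inputs *samples, size_t n, size_t *evaluations) {
    assert(samples != NULL && evaluations != NULL);
    size_t changes = 0;
    int y = 0;
    *evaluations = 0;
    for (size_t t = 0; t < n; t++) {
        int changed = (t == 0) ||
                      samples[t].i0  != samples[t - 1].i0 ||
                      samples[t].i1  != samples[t - 1].i1 ||
                      samples[t].sel != samples[t - 1].sel;
        if (!changed) continue;                 /* no input activity: nothing to evaluate */
        int new_y = samples[t].sel ? samples[t].i1 : samples[t].i0;
        (*evaluations)++;
        if (t == 0 || new_y != y) changes++;    /* only value changes are recorded */
        y = new_y;
    }
    return changes;
}
```

Samples in which no input toggles cost no evaluation at all, and an evaluation that leaves the output unchanged adds nothing to the dump.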
Labs using Iverilog and GTKwave
mkdir VLSI
cd VLSI
git clone https://github.com/kunalg123/sky130RTLDesignAndSynthesisWorkshop.git
- The library files are stored in my_lib
- All the Verilog models of the standard cells are present in verilog_model

- verilog_files has all the source files and testbench files of the required standard cells ( has the design files)
- For every design file, for example good_mux.v, there is a tb_good_mux.v file. There is a one-to-one mapping between a Verilog design file and its testbench file
- Load both the design source file and the testbench file into the Verilog simulator (iverilog here)
iverilog good_mux.v tb_good_mux.v
- An a.out file is created.
- On executing this file, a VCD file is dumped out of the simulator
./a.out
- Load the VCD file into GTKwave using the command
gtkwave tb_good_mux.vcd

- For looking into the file structure `gvim tb_good_mux.v -o good_mux.v`
Logic Synthesis
- RTL Design: Behavioural representation of the required specification
- .lib: Collection of logical modules
- Synthesizer: Tool used for converting RTL to netlist
- Netlist: Representation of design in the form of standard cells present in .lib

**Need of different flavors of gates**
- Combinational logic delay (propagation delay) determines the maximum speed of operation of the digital logic circuit
- T_clock > T_cq + T_pd + T_setup
- To achieve the maximum clock frequency, the delays should be as small as possible. This alone would suggest that only fast cells are needed
- But to ensure that there are no hold issues, some gates are required to work slowly, creating a contradictory requirement
- Therefore, fast cells are used for better performance, while slow cells are used to avoid hold-time violations.
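The setup relation T_clock > T_cq + T_pd + T_setup can be turned into a small calculator. The hold check below uses the usual relation T_cq + T_pd(min) > T_hold, which is not spelled out in these notes and is included here as an assumption. Function names are illustrative and all delays are in nanoseconds.

```c
#include <assert.h>

/* Lower bound on the clock period for a register-to-register path (ns). */
double min_clock_period_ns(double t_cq, double t_pd, double t_setup) {
    assert(t_cq >= 0 && t_pd >= 0 && t_setup >= 0);
    return t_cq + t_pd + t_setup;
}

/* Clock frequency in MHz that corresponds to a period given in ns. */
double max_frequency_mhz(double period_ns) {
    assert(period_ns > 0);
    return 1000.0 / period_ns;
}

/* Hold check: the fastest path must not change the data before t_hold has passed. */
int hold_is_met(double t_cq, double t_pd_min, double t_hold) {
    return (t_cq + t_pd_min) > t_hold;
}
```

A faster cell lowers T_pd and therefore the minimum period, but if the shortest path becomes too fast the hold check fails, which is where slower cells are needed.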
**Fast Cells v/s Slow Cells**
- Fast Cells
- Fast cells use wider transistors to enable higher current carrying capacity. This allows for quicker charging and discharging of capacitive loads, resulting in faster signal transitions.
- Wider transistors generally consume more power compared to narrower ones due to the increased current flow and larger gate capacitance.
- While faster cells offer improved performance, they require more silicon area because of their wider transistors, and they consume more power.
- Slow Cells
- Slow cells use narrower transistors to reduce power consumption and minimize power dissipation.
- Narrower transistors consume less power due to their lower current carrying capacity and reduced gate capacitance.
- While slower cells consume less power, they have longer signal propagation delays and therefore support lower clock frequencies. This can impact their ability to process data quickly.
The choice between faster and slower cells depends on the specific requirements of the digital logic circuit's application. Designers often need to strike a balance between performance, power consumption, and area constraints.
Yosys
Installation of Yosys
git clone https://github.com/YosysHQ/yosys.git
cd yosys
sudo apt install make
sudo apt-get update
sudo apt-get install build-essential clang bison flex libreadline-dev gawk tcl-dev libffi-dev git graphviz xdot pkg-config python3 libboost-system-dev libboost-python-dev libboost-filesystem-dev zlib1g-dev
make config-gcc
make
sudo make install
- To invoke Yosys
cd VLSI/sky130RTLDesignAndSynthesisWorkshop/verilog_files
yosys
- To read the library
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
- To read the design and synthesize it
read_verilog good_mux.v
synth -top good_mux
- To map the logic in the Verilog file onto the cells of the library
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
- The number of input signals, output signals, and internal signals is reported by the above command
- To get the graphical version of the realized logic
show
- The mux is completely realised in the form of sky130 library cells.
- To write the netlist
write_verilog good_mux_netlist.v
!gvim good_mux_netlist.v
- To get a simplified version
write_verilog -noattr good_mux_netlist.v
!gvim good_mux_netlist.v
Timing Dot libs
- .lib files
- To view the contents of .lib file
gvim ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
- .lib files are used in digital circuit design to provide detailed information about the timing, power, and other characteristics of standard cells.
- The first line is library("sky130_fd_sc_hd__tt_025C_1v80")
- Libraries can be slow, fast, or typical. Here tt stands for typical, which refers to the standard or average performance characteristics of a component or circuit under normal operating conditions.
- 025C refers to the temperature (25 degrees Celsius) at which the library's characteristics are specified.
- 1v80 represents the supply voltage, 1.80 V. This voltage level serves as a reference point for understanding the circuit's behavior and performance under that specific operating voltage.
- sc stands for standard cells and signifies that the library contains standard cell information and characteristics for use in circuit design.
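The naming convention can be illustrated by pulling the corner, temperature and voltage out of the library name in C. This is a sketch that assumes names shaped exactly like the one above, i.e. a double underscore followed by corner, temperature and voltage fields; the lib_corner type and parse_lib_name function are hypothetical, and names that encode the fields differently are simply rejected.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char   corner[3];      /* process corner, e.g. "tt" */
    int    temperature_c;  /* temperature in degrees Celsius */
    double voltage;        /* supply voltage in volts */
} lib_corner;

/*
 * Parse the "<corner>_<temp>C_<volts>v<fraction>" suffix that follows the
 * double underscore in a name such as sky130_fd_sc_hd__tt_025C_1v80.
 * Returns 1 on success, 0 if the name does not match this pattern.
 */
int parse_lib_name(const char *name, lib_corner *out) {
    assert(name != NULL && out != NULL);
    const char *suffix = strstr(name, "__");
    if (suffix == NULL) return 0;
    suffix += 2;

    int  volts = 0;
    char frac[8] = "";
    if (sscanf(suffix, "%2[a-z]_%dC_%dv%7[0-9]",
               out->corner, &out->temperature_c, &volts, frac) != 4) {
        return 0;
    }
    /* "1v80" means 1.80 V: scale the fraction by its number of digits. */
    double scale = 1.0;
    for (size_t i = 0; i < strlen(frac); i++) scale *= 10.0;
    out->voltage = volts + atoi(frac) / scale;
    return 1;
}
```

For sky130_fd_sc_hd__tt_025C_1v80 this yields the corner tt, a temperature of 25 and a voltage of 1.80.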
Hierarchy v/s Flat Synthesis
For synthesizing the module we used the command synth -top good_mux. To see what type of synthesis takes place when a design contains sub-modules, the multiple_modules.v file is used.

- There are two sub-modules
- AND Gate
- OR Gate
- The module multiple_modules instantiates sub-modules 1 and 2
- As per the module, the gate-level logic would be as below

- But after synthesis
yosys
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog multiple_modules.v
synth -top multiple_modules
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show multiple_modules
- The synthesized design above is in hierarchical form
- To get the netlist
write_verilog -noattr multiple_modules_hier.v
!gvim multiple_modules_hier.v
- A NAND implementation is seen here.
Stacked PMOS Circuits
- In a CMOS NOR gate the PMOS transistors of the pull-up network are connected in series (stacked), while in a NAND gate they are connected in parallel.
- PMOS transistors use holes as charge carriers, whose mobility is lower than that of the electrons in NMOS transistors, so a PMOS transistor is slower than an NMOS transistor of the same size.
- Stacking PMOS transistors adds their resistances in series, so they must be made wider to keep the rise time acceptable, which costs area.
- For this reason the OR logic is realised here with a NAND gate and inverted inputs rather than with a NOR gate followed by an inverter.
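That an OR function can be built from a NAND gate with inverted inputs follows from De Morgan's law, a | b = ~(~a & ~b). A quick C sketch of the identity on single-bit values, with illustrative function names:

```c
#include <assert.h>

/* Two-input NAND on single-bit values (0 or 1). */
int nand2(int a, int b) {
    assert((a == 0 || a == 1) && (b == 0 || b == 1));
    return !(a && b);
}

/* OR built from a NAND with inverted inputs: a | b == ~(~a & ~b). */
int or_via_nand(int a, int b) {
    return nand2(!a, !b);
}
```

Checking all four input combinations confirms that or_via_nand matches a plain OR.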
- To get the flat netlist, flatten the hierarchy first
flatten
write_verilog -noattr multiple_modules_flat.v
!gvim multiple_modules_flat.v
- The AND and OR gates are instantiated directly.
In hierarchical synthesis, the design is organized into a hierarchy of modules, with each module representing a functional block or sub-component. Each module is synthesized independently, and then these synthesized modules are connected together to form the complete design.
- Advantages
- Encourages modular design, making it easier to manage and maintain complex designs.
- Supports the reuse of modules, as synthesized blocks can be used in multiple designs.
- Enables concurrent development and optimization of different modules.
- Can help manage complexity and reduce the size of intermediate files.
- Disadvantages
- Introduces the challenge of correctly integrating modules and ensuring proper connectivity.
- Some high-level optimizations might be more challenging due to module-level synthesis.
In flat synthesis, the entire design is treated as a single, monolithic unit. This means that the entire design hierarchy, including all sub-modules, is flattened into a single-level representation. All optimizations, logic synthesis, and technology mapping are performed on this single-level design.
- Advantages
- Simplifies the synthesis process, as the entire design is treated as a single unit.
- Can lead to high-level optimizations across the entire design.
- Disadvantages
- Can result in large intermediate files and complex optimization problems.
- Limited ability to reuse common logic structures across different parts of the design.
- Can lead to inefficient use of resources if the design is very large and complex.
In practice, a combination of both flat and hierarchical synthesis is often used. Hierarchical synthesis is employed for managing the complexity of large designs, and then certain modules might be synthesized flat to achieve specific optimizations.
Flop Coding Styles
- A flip-flop is a bistable multivibrator circuit element that can store one bit of data. It has two stable states and can be used to represent binary information.
Glitches are unwanted and unpredictable transitions in digital circuits that can occur due to variations in signal propagation delays.
- Reasons for Glitches
- Different gates have different propagation delays, and these delays can lead to temporary imbalances in signal timing. If inputs to different gates change at slightly different times, it can result in momentary glitches in the output.
- Signals may take different path lengths to reach different gates. Longer paths can introduce larger propagation delays, potentially causing timing mismatches and glitches.
- Race conditions occur when two or more signals arrive at a gate at nearly the same time, and the output of the gate depends on which signal arrives first. This can lead to unpredictable temporary output values before the circuit settles into a stable state.
- Flip-flops are used in sequential circuits to store data and create a controlled timing mechanism. They can help eliminate glitches that may occur in combinational circuits
- The asynchronous reset feature allows the user to reset the flip-flop's state to a specific value, irrespective of the clock signal
- When the reset input is not active i.e. 0, the flip-flop operates as a standard D flip-flop, capturing the value at the D input on the rising edge of the clock.
- When the reset input is active i.e. 1, the flip-flop's output is forced to 0 regardless of the clock or D input.
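The asynchronous reset behaviour described above can be modelled in a few lines of C. This is a behavioural sketch with made-up names, assuming an active-high reset and a rising-edge clock; it is not the workshop's Verilog.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int q;         /* stored bit */
    int prev_clk;  /* clock level seen at the previous evaluation */
} dff;

/* Evaluate an active-high asynchronous-reset D flip-flop for one set of inputs. */
void dff_async_reset_eval(dff *ff, int clk, int reset, int d) {
    assert(ff != NULL);
    if (reset) {
        ff->q = 0;                       /* reset acts immediately, no clock edge needed */
    } else if (clk && !ff->prev_clk) {
        ff->q = d;                       /* rising edge: capture D */
    }
    ff->prev_clk = clk;
}
```

Asserting reset while the clock is low clears q at once; after reset is released, q follows d only on the next rising edge.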
Simulation
cd VLSI/sky130RTLDesignAndSynthesisWorkshop/verilog_files
iverilog dff_asyncres.v tb_dff_asyncres.v
./a.out
gtkwave tb_dff_asyncres.vcd
Synthesis
cd VLSI/sky130RTLDesignAndSynthesisWorkshop/verilog_files
yosys
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog dff_asyncres.v
synth -top dff_asyncres
dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show dff_asyncres
- When the set is high, the output of the flip-flop is forced to 1, irrespective of the clock signal.
- When the set is low, the flip-flop operates as a standard D flip-flop, capturing the value at the D input on the rising edge of the clock
!gvim dff_async_set.v
Simulation
cd VLSI/sky130RTLDesignAndSynthesisWorkshop/verilog_files
iverilog dff_async_set.v tb_dff_async_set.v
./a.out
gtkwave tb_dff_async_set.vcd
Synthesis
cd VLSI/sky130RTLDesignAndSynthesisWorkshop/verilog_files
yosys
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog dff_async_set.v
synth -top dff_async_set
dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show dff_async_set
- A synchronous reset D flip-flop is a type of flip-flop that includes a reset input that is synchronized with the clock signal. This means that the reset input will only take effect on a specific clock edge, typically the rising or falling edge of the clock.
- During normal operation, when the reset input is not asserted, the flip-flop operates like a standard D flip-flop
- When the reset input is asserted (active), the flip-flop's output is forced to 0
!gvim dff_syncres.v
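A synchronous reset differs from an asynchronous one only in when the reset is looked at. The C sketch below models this, using made-up names and assuming an active-high reset sampled on the rising clock edge.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int q;         /* stored bit */
    int prev_clk;  /* clock level seen at the previous evaluation */
} sync_dff;

/* Evaluate a synchronous-reset D flip-flop: reset is only sampled on a rising edge. */
void dff_sync_reset_eval(sync_dff *ff, int clk, int reset, int d) {
    assert(ff != NULL);
    if (clk && !ff->prev_clk) {
        ff->q = reset ? 0 : d;   /* on the edge, reset has priority over D */
    }
    ff->prev_clk = clk;
}
```

If reset is asserted while the clock is low, q keeps its old value until the next rising edge, at which point it is cleared.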
Simulation
cd VLSI/sky130RTLDesignAndSynthesisWorkshop/verilog_files
iverilog dff_syncres.v tb_dff_syncres.v
./a.out
gtkwave tb_dff_syncres.vcd
Synthesis
cd VLSI/sky130RTLDesignAndSynthesisWorkshop/verilog_files
yosys
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog dff_syncres.v
synth -top dff_syncres
dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show dff_syncres
Optimizations
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog mult_2.v
synth -top mult2
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
write_verilog -noattr mul2_netlist.v
!gvim mul2_netlist.v
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog mult_8.v
synth -top mult8
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
Combinational Optimization
- Combinational logic optimization simplifies the combinational part of a design so that the same function is implemented with less logic.
- It focuses on finding the most efficient implementation, in terms of area and power, for logic that has no memory elements and no inherent notion of time.
- Two methods of combinational optimization are
- Constant Propagation is a method of optimization that identifies signals whose values are constant and known at synthesis time and replaces them with those constants. This removes redundant logic and simplifies the remaining expressions.
- Boolean logic optimization is a process of simplifying and improving logical expressions in Boolean algebra. It aims to simplify Boolean expressions or logic circuits by reducing the number of terms, literals, and gates required to implement a given logical function.
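Both methods can be checked exhaustively on small examples. The two expressions below are illustrative choices and are not taken from the opt_check files: Y = NOT((A AND B) OR C) with A tied to 0 collapses to NOT C, and a nested multiplexer expression reduces to an XNOR of a and c.

```c
#include <assert.h>

/* Original logic: Y = NOT((A AND B) OR C), on single-bit inputs. */
int y_original(int a, int b, int c) {
    assert((a == 0 || a == 1) && (b == 0 || b == 1) && (c == 0 || c == 1));
    return !((a && b) || c);
}

/* Constant propagation: with A tied to 0 the AND term vanishes and Y = NOT C. */
int y_a_tied_low(int c) {
    return !c;
}

/* A nested multiplexer expression before Boolean optimization. */
int nested_mux(int a, int b, int c) {
    return a ? (b ? c : (c ? a : 0)) : !c;
}

/* The same function after simplification: XNOR of a and c (b drops out). */
int xnor2(int a, int c) {
    return !(a ^ c);
}
```

Looping over every input combination shows that each simplified form matches its original, which is the equivalence a synthesis tool must preserve when it optimizes.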
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog opt_check.v
synth -top opt_check
opt_clean -purge
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog opt_check2.v
synth -top opt_check2
opt_clean -purge
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog opt_check3.v
synth -top opt_check3
opt_clean -purge
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog opt_check4.v
synth -top opt_check4
opt_clean -purge
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog multiple_module_opt.v
synth -top multiple_module_opt
flatten
opt_clean -purge
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
Sequential Logic Optimization
- Sequential logic optimization is the process of enhancing digital circuits that incorporate memory elements and time-dependent behavior, with the aim of improving performance, efficiency, and other key characteristics
- Sequential logic optimization directly impacts the performance and reliability of digital circuits and systems.
- Methods of sequential optimization are
- Sequential constant propagation identifies flip-flops whose outputs can be shown to hold a constant value and replaces them with that constant, which is then propagated through the downstream logic, optimizing the design for better performance and resource utilization.
- State optimization is an optimization technique used in digital design to reduce the number of states in finite state machines (FSMs) while preserving the original functionality.
- Sequential Logic Cloning replicates portions of sequential logic to alleviate bottlenecks and improve circuit throughput.
- Retiming adjusts the placement of flip-flops within a circuit to optimize timing, balance critical paths, and enhance overall performance
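Whether a flip-flop can be replaced by a constant depends on whether its output ever differs from that constant. The C sketch below illustrates this for a hypothetical flop whose D input is tied to 1; the function name and the simplified reset model are assumptions for illustration.

```c
#include <assert.h>

/*
 * Simulate a flip-flop whose D input is tied to 1. While reset is asserted
 * Q holds reset_value; after reset is released, every clock edge loads D.
 * Returns 1 if Q held the same value throughout, 0 otherwise.
 */
int q_is_constant(int reset_value, int cycles) {
    assert(reset_value == 0 || reset_value == 1);
    assert(cycles >= 1);
    int q = reset_value;       /* value of Q during reset */
    int constant = 1;
    for (int i = 0; i < cycles; i++) {
        int next_q = 1;        /* D is tied to 1 */
        if (next_q != q) constant = 0;
        q = next_q;
    }
    return constant;
}
```

A flop that resets to 1 never leaves 1 and can be removed, while a flop that resets to 0 changes value on the first clock edge after reset and has to be kept.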
- Simulation
iverilog dff_const1.v tb_dff_const1.v
./a.out
gtkwave tb_dff_const1.vcd
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog dff_const1.v
synth -top dff_const1
dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
- Simulation
iverilog dff_const2.v tb_dff_const2.v
./a.out
gtkwave tb_dff_const2.vcd
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog dff_const2.v
synth -top dff_const2
dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
- Simulation
iverilog dff_const3.v tb_dff_const3.v
./a.out
gtkwave tb_dff_const3.vcd
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog dff_const3.v
synth -top dff_const3
dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
- Simulation
iverilog dff_const4.v tb_dff_const4.v
./a.out
gtkwave tb_dff_const4.vcd
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog dff_const4.v
synth -top dff_const4
dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
- Simulation
iverilog dff_const5.v tb_dff_const5.v
./a.out
gtkwave tb_dff_const5.vcd
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog dff_const5.v
synth -top dff_const5
dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
Sequential optimisations for unused outputs
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog counter_opt.v
synth -top counter_opt
dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog counter_opt2.v
synth -top counter_opt2
dfflibmap -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
show
GLS Concepts and Flow using Iverilog
- Gate-level simulation is a method used in electronics design to test and verify digital circuits at the level of individual logic gates and flip-flops.
- It involves simulating the circuit using the actual logic gates and flip-flops that make up the design, as opposed to higher-level abstractions like RTL (Register Transfer Level) descriptions.
- Gate-level simulation is commonly used in designs where precise timing and functionality are critical.
- It operates at a lower abstraction level than higher-level simulations and is essential for debugging and ensuring circuit correctness.
- Write RTL code.
- Synthesize to generate gate-level netlist.
- Create a testbench in Verilog.
- Compile both netlist and testbench.
- Run the simulation with compiled files. Debug and iterate as needed.
- Perform timing analysis if necessary.
- Generate test vectors for manufacturing tests.
Synthesis-Simulation Mismatch
- Synthesis-simulation mismatch is when there are differences between how a digital circuit behaves in simulation at the RTL level and how it behaves after gate-level synthesis.
- This discrepancy can occur due to various reasons, such as timing issues, optimization conflicts, and differences in modeling between the simulation and synthesis tools.
- To address it, ensure consistent tool versions, check synthesis settings, debug with simulation tools, and follow best practices in RTL coding and design.
- Resolving these mismatches is crucial for reliable hardware implementation.
Blocking and Non-blocking statements
- Blocking Statements
- Blocking statements are executed sequentially in the order they appear in the code and have an immediate effect on signal assignments.
- They are called "blocking" because they block the execution of subsequent statements until they are completed. Blocking statements are typically used within procedural blocks, such as always or initial blocks, to describe sequential behavior.
- Blocking assignments are typically used to describe combinational logic, where each assignment may depend on the result of the previous one, so the order of the statements matters.
always @(posedge clk) begin
// Blocking assignments
a = b;
c = a + 1;
end
- Non-blocking statements
- Non-blocking statements allow concurrent execution within a procedural block or always block, making them suitable for describing synchronous digital circuits.
- Non-blocking assignments are typically used to model sequential logic, like flip-flops and registers, where parallel execution is required.
reg [2:0] state;
always @(posedge clk or posedge reset) begin
if (reset) begin
state <= 3'b000;
end else begin
// Non-blocking assignment to advance the state on each clock edge
state <= state + 1;
end
end
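The difference between the two assignment styles can be emulated in C for a hypothetical two-register chain clocked by the same edge, written as q0 = d; q = q0; in blocking style and q0 <= d; q <= q0; in non-blocking style. The type and function names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int q0, q; } shift_regs;

/* Blocking style (q0 = d; q = q0;): the second statement sees the new q0. */
void clock_edge_blocking(shift_regs *r, int d) {
    assert(r != NULL);
    r->q0 = d;
    r->q  = r->q0;
}

/* Non-blocking style (q0 <= d; q <= q0;): all right-hand sides are sampled
   first, then both registers update together. */
void clock_edge_nonblocking(shift_regs *r, int d) {
    assert(r != NULL);
    int next_q0 = d;
    int next_q  = r->q0;   /* old value of q0 */
    r->q0 = next_q0;
    r->q  = next_q;
}
```

Starting from zeros and clocking in a 1, the blocking version sets q to 1 after a single edge, so it behaves like one flip-flop, while the non-blocking version needs two edges, which is the intended two-stage shift register.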
- Blocking statements in Verilog are useful for specifying the order of operations within procedural blocks, but they come with the following caveats:
- Race Conditions: When multiple blocking assignments are used within the same always block or initial block, there is a potential for race conditions. Race conditions occur when the final value of a signal depends on the execution order of assignments. To avoid race conditions, use non-blocking assignments or ensure that assignments do not depend on the order of execution.
- Combinational Loops: Using blocking assignments to create combinational feedback loops (combinational loops with no flip-flops) can lead to undefined behavior and simulation issues.
- Debugging Challenges: Debugging code with many blocking assignments can be challenging, especially when trying to track down timing-related issues.
- Limited for Testbenches: In testbench code, excessive use of blocking statements can lead to simulation race conditions that don't reflect real-world hardware behavior.
- Order Dependency: The order of blocking statements can affect simulation results, leading to race conditions or unintended behavior.
- Lack of Parallelism: Blocking statements do not accurately represent the parallel nature of hardware. In hardware, multiple signals can update concurrently, but blocking statements model sequential behavior. As a result, using blocking statements for modeling complex concurrent logic can lead to incorrect simulations.
Labwork
- Simulation
iverilog ternary_operator_mux.v tb_ternary_operator_mux.v
./a.out
gtkwave tb_ternary_operator_mux.vcd
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog ternary_operator_mux.v
synth -top ternary_operator_mux
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
write_verilog -noattr ternary_operator_mux_netlist.v
show
- GLS (gate-level simulation)
iverilog ../my_lib/verilog_model/primitives.v ../my_lib/verilog_model/sky130_fd_sc_hd.v ternary_operator_mux_netlist.v tb_ternary_operator_mux.v
./a.out
gtkwave tb_ternary_operator_mux.vcd
- Simulation
iverilog bad_mux.v tb_bad_mux.v
./a.out
gtkwave tb_bad_mux.vcd
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog bad_mux.v
synth -top bad_mux
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
write_verilog -noattr bad_mux_netlist.v
show
- GLS (gate-level simulation)
iverilog ../my_lib/verilog_model/primitives.v ../my_lib/verilog_model/sky130_fd_sc_hd.v bad_mux_netlist.v tb_bad_mux.v
./a.out
gtkwave tb_bad_mux.vcd
- Simulation
iverilog blocking_caveat.v tb_blocking_caveat.v
./a.out
gtkwave tb_blocking_caveat.vcd
- Synthesis
read_liberty -lib ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog blocking_caveat.v
synth -top blocking_caveat
abc -liberty ../lib/sky130_fd_sc_hd__tt_025C_1v80.lib
write_verilog -noattr blocking_caveat_netlist.v
show
- GLS (gate-level simulation)
iverilog ../my_lib/verilog_model/primitives.v ../my_lib/verilog_model/sky130_fd_sc_hd.v blocking_caveat_netlist.v tb_blocking_caveat.v
./a.out
gtkwave tb_blocking_caveat.vcd