In this repository we begin our journey with simulating basic C code for a RISC-V processor.

JoyenBenitto/pes_asic_class

VSD_ASIC_DESIGN

This repository guides you through the complete ASIC flow: we begin our journey by simulating basic C code and go all the way to putting together an SoC ready to be fabbed out.
(FACULTY : Mahesh Awati, GUIDE : Kunal Ghosh)

The course files are present under their respective lab directories. Make sure to download all the dependencies before proceeding with a lab.

Introduction

Flow :

  +------+     +------+     +---------+     +------+     +------+
  |  HLL | --> |  ALP | --> |  Binary | --> |  HDL | --> |  GDS |
  +------+     +------+     +---------+     +------+     +------+

1. HLL -> High-level language (C, C++)

  • A high-level programming language is a type of programming language that is designed to be more human-readable and user-friendly compared to low-level languages like assembly or machine code.

2. ALP -> Assembly level program

  • Assembly language is a low-level programming language that is closely related to the architecture of a specific computer's central processing unit (CPU). Assembly language programs are written using mnemonic codes that represent specific machine instructions which the machine can understand.

3. HDL -> Hardware Description Language

  • It is a specialized programming language used for designing and describing digital hardware circuits. HDLs allow engineers to model and simulate complex digital systems before physical implementation, aiding in the design and verification of integrated circuits and FPGA configurations.
  • Verilog, System Verilog, VHDL

4. GDS -> Graphic Data System (layout)

  • Format used in the semiconductor industry to represent the layout and design of integrated circuits at a highly detailed level. GDSII files contain information about the geometric shapes, layers, masks, and other essential details that make up the physical layout of a chip.
  • Tools : Klayout, Magic
The hardware needs to understand what the application software is telling it to do. This relationship is established by the system software.

System Software

  • OS : Operating System : Handles I/O, memory allocation, and low-level system functions
  • Compiler : Converts the input program into hardware-dependent instructions
  • Assembler : Converts the instructions produced by the compiler into binary format
  • HDL : Describes the hardware that understands the binary pattern; it is mapped to a netlist
  • GDS : Layout

COURSE

LAB 1

LAB 1: Introduction to RISCV ISA and GNU Compiler Toolchain

Dependencies

C Code

  • Open your terminal and run the following command to create a .c file.
vim lab1/sum1ton.c 

We are writing a C program that finds the sum of the numbers from 1 to n. We are going to use the Spike simulator to simulate the code.

#include<stdio.h>

int main(){
	int sum =0, n=5;

	for(int i= 0; i <= n; ++i){
		sum += i;
	}

	printf("Sum of the series is %d", sum);
	return 0;
}
  • To compile the above code and run it on the host machine use :
gcc lab1/sum1ton.c
./a.out

However, this produces an executable for the system it is compiled on. I am running it on an Intel i5, so the generated assembly is x86 and not RISC-V assembly.

cat lab1/sum1ton.c

C to Disassembly

  • To generate a RISC-V object file we need to use the riscv64-unknown-elf-gcc
cd lab1
riscv64-unknown-elf-gcc -O1 -mabi=lp64 -march=rv64i -o sum1ton.o sum1ton.c
ls -ltr sum1ton.o
riscv64-unknown-elf-gcc -S sum1ton.c
vim sum1ton.c

Screenshot from 2023-08-21 22-12-02

A few of the flags we used: -O1 selects level 1 optimization; -mabi=lp64 selects the ABI in which long integers (l) and pointers (p) are 64 bits wide.

  • To view the assembly code:
riscv64-unknown-elf-objdump -d  sum1ton.o 
  • Repeat the above but with -Ofast :
riscv64-unknown-elf-gcc -Ofast -mabi=lp64 -march=rv64i -o sum1ton.o sum1ton.c

image

Integer number representation

  • The C code below finds the largest unsigned 64-bit number
#include <stdio.h>
#include <limits.h>

int main(){
	/* ULLONG_MAX is 2^64 - 1 here; a double cannot hold 2^64 - 1 exactly, so pow() is avoided */
	unsigned long long int max = ULLONG_MAX;
	printf("highest number represented by unsigned long long int is %llu \n",max);
	return 0;
}
  • The C code below finds the range of signed 64-bit numbers
#include <stdio.h>
#include <limits.h>

int main(){
	/* LLONG_MAX is 2^63 - 1 and LLONG_MIN is -2^63 here; a double cannot hold 2^63 - 1 exactly */
	long long int max = LLONG_MAX;
	long long int min = LLONG_MIN;
	printf("highest number represented by long long int is %lld \n",max);
	printf("lowest number represented by long long int is %lld \n",min);
	return 0;
}
Both programs can be compiled and run in the same way as sum1ton.c above.

Debug

spike -d $(which pk) sum1ton.o

image

  • press ENTER : shows the first instruction; each successive ENTER steps to and shows the next one
  • reg 0 a2 : shows the content of register a2 on core 0
  • q : quits the debug session

Difference between the ALP instructions generated at different optimization levels

  • use the command riscv64-unknown-elf-objdump -d 1_to_N.o | less
  • use /instance to search for an instance
  • press ENTER
  • press n to jump to the next occurrence
  • press N to jump to the previous occurrence
  • press q to quit

O1:

Screenshot from 2023-08-22 08-57-52

The same command can be run on the object file built with the Ofast flag to view its disassembly in a similar way.

Integer number Representation (n-bit)

  • Range of unsigned numbers : [0, (2^n)-1]
  • Range of signed numbers : Positive : [0, 2^(n-1)-1] Negative : [-2^(n-1), -1]

Lab 2

LAB 2: Introduction to ABI and Basic Verification Flow

RISC-V Instructions:

RISC-V instructions can be categorized into different categories based on their functionality. Here are some common types of RISC-V instructions:

  • R-Type Instructions: These are used for arithmetic and logic operations between two source registers, storing the result in a destination register. Common R-type instructions include:

    • add: Addition
    • sub: Subtraction
    • and: Bitwise AND
    • or: Bitwise OR
    • xor: Bitwise XOR
    • slt: Set Less Than (signed)
    • sltu: Set Less Than (unsigned)
  • I-Type Instructions: These are used for arithmetic and logic operations that involve an immediate value (constant) and a source register, and for loads. Common I-type instructions include:

    • addi: Add Immediate
    • andi: Bitwise AND with Immediate
    • ori: Bitwise OR with Immediate
    • xori: Bitwise XOR with Immediate
    • lw: Load Word (load data from memory)
  • B-Type Instructions: These are conditional branches that compare two source registers and jump to a PC-relative offset. Common B-type instructions include:

    • beq: Branch if Equal
    • bne: Branch if Not Equal
  • S-Type Instructions: These are similar to I-type instructions but are used for storing values from a source register into memory. Common S-type instructions include:

    • sb: Store Byte
    • sh: Store Halfword
    • sw: Store Word
  • U-Type Instructions: These carry a 20-bit immediate that fills the upper bits of a register, which is how large constants and PC-relative addresses are built. Common U-type instructions include:

    • lui: Load Upper Immediate (set the upper bits of a register)
    • auipc: Add Upper Immediate to PC (used for address calculations)
  • J-Type Instructions: These are used for unconditional jumps to a PC-relative target. Jump instructions include:

    • jal: Jump and Link (used for function calls)
    • jalr: Jump and Link Register (used for indirect calls and returns; it is encoded in the I-type format)
  • F and D Extension Instructions (Floating-Point): These instructions are used for floating-point arithmetic and include operations like addition, multiplication, and comparison. The mnemonics start with 'f' and end with '.s' for single precision or '.d' for double precision.

    • fadd.s: Single-precision floating-point addition
    • fmul.d: Double-precision floating-point multiplication
    • feq.s, flt.s, fle.s: Single-precision floating-point comparisons
  • RV64 and RV32 Variants: RISC-V can be configured in different bit-widths, with RV64 using 64-bit registers and RV32 using 32-bit registers. Base instructions are 32 bits wide in both; RV64 adds instructions such as ld, sd, and addw for 64-bit loads/stores and 32-bit word arithmetic.

instr

RISC-V instructions have a common structure with several fields that serve different purposes. The specific fields may vary depending on the type of instruction (R-type, I-type, S-type, U-type, etc.), but here are the typical fields you'll find in a RISC-V instruction:

  • Opcode (OP): The opcode field specifies the operation to be performed by the instruction. It indicates what type of instruction it is (e.g., arithmetic, load, store, branch) and what specific operation within that type is being carried out.

  • Destination Register (rd): This field specifies the destination register where the result of the instruction is stored. It is present in R-type, I-type, U-type, and J-type instructions; S-type and B-type instructions have no rd field.

  • Source Register(s) (rs1, rs2): These fields specify the source registers for the operation. R-type instructions use rs1 and rs2 as the two source operands. I-type instructions use only rs1, with an immediate in place of the second operand. S-type instructions use rs1 as the base address and rs2 as the value to store.

  • Immediate (imm): The immediate field contains a constant encoded in the instruction itself. It serves as an arithmetic or logical operand, a load/store address offset, or a branch/jump offset in I-type, S-type, B-type, U-type, and J-type instructions.

  • Function Code (funct3, funct7): In R-type and some I-type instructions, these fields further specify the operation. funct3 typically selects a particular function within an opcode category, and funct7 may provide additional information or further differentiate the instruction.

  • Extension-specific Fields: Depending on the RISC-V extension being used (e.g., F for floating-point, M for integer multiplication/division), there may be additional fields to accommodate the specific requirements of that extension. These fields are not part of the base RISC-V instruction format.

C Program - Sum of numbers from 1 to 9

#include <stdio.h>

extern int load(int x, int y);

int main()
{
  int result = 0;
  int count = 9;
  result = load(0x0, count+1);
  printf("Sum of numbers from 1 to 9 is %d\n", result);
}
  • Using gcc Screenshot from 2023-08-21 23-59-18
  • Using spike: Screenshot from 2023-08-22 00-07-36

Assembly

.section .text
.global load
.type load, @function

load:

add a4, a0, zero
add a2, a0, a1
add a3, a0, zero

loop:

add a4, a3, a4
addi a3, a3, 1
blt a3, a2, loop
add a0, a4, zero
ret

Screenshot from 2023-08-22 10-09-48

The CPU processes the contents of the memory and produces the output; the run is simulated using iverilog.

  • RISC-V CPU : Picorv32.v
  • Testbench for verification : testbench.v
  • Tool : iverilog
  • Script : rv32im.sh : has the commands to compile the C program and the ALP, convert the result into hex format, load it into the memory of the RISC-V CPU, pass it to iverilog, and display the output

Screenshot from 2023-08-22 09-57-17

Register Naming in RISC-V according to ABI

abi-reg

RTL Design using Verilog with SKY130 Technology

Day 1

Introduction to verilog RTL Design and Synthesis

Tools Required:

  • gtkwave
  • yosys
  • iverilog

On Fedora, all of the above can be installed using sudo dnf install.

Terminologies

  • Simulator : The RTL must be checked against the provided specification. A simulator does this by simulating the design to verify its functionality. Example : iverilog. The simulator looks for changes in the inputs and evaluates the output on each change; if no input changes, the output is not re-evaluated.

  • Design : The Verilog code (or set of Verilog files) that implements the provided functionality/specification.

  • Testbench : The setup that applies stimulus (test vectors) to the design in order to check its functionality from the response obtained. The response is produced by iverilog in the form of a VCD file, which is visualised using gtkwave.

  • Synthesizer : The tool required to convert RTL to a netlist. Example : yosys

  • Netlist : In synthesis, the RTL design is converted to a gate-level netlist, i.e., the design is converted into gates and the connections between those gates. This is written out as a file called the netlist.

  • liberty(.lib) : It contains all the cells required to represent any logic, and the cells come in different flavours (different power, delay, operating conditions, etc.)

Simulation and Results:

  • write a simple mux
  • write a testbench
  • and check functionality
     

cd /path/to/file/location
iverilog mux.v mux_tb.v -o mux.out
./mux.out
gtkwave mux_tb.vcd

mux img

yosys synth

rtl_diag

/* Generated by Yosys 0.32+1 (git sha1 389b8d0f94a, gcc 13.2.1 -O2 -fexceptions -fstack-protector-strong -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -fPIC -Os) */

module mux(i0, i1, s, y);
  input i0;
  wire i0;
  input i1;
  wire i1;
  input s;
  wire s;
  output y;
  wire y;
  assign y = s ? i1 : i0;
endmodule

The above is the generated mapped file.

Flops

Due to the propagation delays of gates, a combinational block may produce glitches at its output. A single glitch may be negligible, but when "n" combinational blocks are connected in series the glitches accumulate, and the output no longer shows a brief glitch but a false state. To avoid this cumulative effect we place flops at the end of each combinational block: the flop stores the final, settled value, so the glitch is eliminated before the value is passed to the next block.

Steps to synthesize flops

read_liberty -lib <PATH_TO_.lib_FILE>/sky130_fd_sc_hd__tt_025C_1v80.lib
read_verilog flop.v
synth -top flop
dfflibmap -liberty <PATH_TO_.lib_FILE>/sky130_fd_sc_hd__tt_025C_1v80.lib
abc -liberty <PATH_TO_.lib_FILE>/sky130_fd_sc_hd__tt_025C_1v80.lib
write_verilog -noattr flop_mapped.v
show 

To view the waveform

iverilog flop.v flop_tb.v -o flop.out
./flop.out
gtkwave flop_tb.vcd

D-flip-flop with an asynchronous reset asyncres.v asyncres_tb.v

Alt text Alt text

Since we see a D flip-flop getting inferred, we use the dfflibmap command mentioned above to map the flops accurately. View the output waveforms:

Alt text To Check the functionality, We refer to this waveform

D-flip-flop with an asynchronous set asyncset.v asyncset_tb.v

Alt text

Alt text

D-flip-flop with both synchronous and asynchronous reset sync_async_res.v sync_async_res_tb.v

Alt text Alt text


DAY 3: Combinational and sequential optimizations

In order to optimise a Verilog file that has submodules, we first have to flatten it, then optimise it with opt_clean -purge, and then complete the synthesis process.

Here we can observe that instead of using separate AND and OR gates, the synthesizer uses an AOI (AND-OR-Invert) cell. The synthesis scripts can be found in the verilog folder; the same steps as above can be followed to visualize the synthesized Verilog using yosys.

Find all the verilog used in the verilog directory

opt_check1.v

Alt text

opt_check2.v

Alt text

opt_check3.v

Alt text

opt_check4.v

Alt text

multipe_modules_opt.v

Alt text

The waveforms can be visualized in GTKWave using the VCD files.

dff_const1 dff_const1_tb.v

Alt text Alt text

dff_const2 dff_const2_tb.v

Alt text Alt text

dff_const3 dff_const3_tb.v

Alt text Alt text

dff_const4 dff_const4_tb.v

Alt text Alt text

dff_const5 dff_const5_tb.v

Alt text Alt text

counter_opt1.v

Alt text

counter_opt2.v

Alt text

Usually a 3-bit counter requires 3 flops, but here the output depends only on the LSB and the other 2 bits are unused. Therefore only one flop is used, and since the LSB toggles every clock cycle, an inverter is all that is needed to invert the output at every clock cycle. The dot diagram for counter1.v shows this.

In the second one, counter2, the logic is changed in such a way that the output depends on all 3 bits, so 3 flops are inferred.


DAY 4: GLS, Synthesis-Simulation Mismatch

Gate Level Simulation(GLS)

  • Used for post-synthesis verification to ensure that functionality and timing requirements are met
  • Inputs : testbench, synthesized netlist of the design, and gate-level Verilog models (the synthesized design instantiates library gates, so the Verilog models of those gates have to be passed too)
  • Sometimes the simulation results of the post-synthesis netlist do not match those of the RTL; this is called a synthesis-simulation mismatch

Synthesis Simulation Mismatch

Reasons :

  • Missing Sensitivity list
  • Blocking(sequential execution) vs Non Blocking assignments(parallel Execution)
  • Non-standard Verilog coding

GLS Lab

synthesize conditional_mux and write its netlist
iverilog <Path_to_primitives.v>/primitives.v <path_to_sky130_fd_sc_hd.v>/sky130_fd_sc_hd.v <path_to_synthesized_netlist>/conditional_mux_mapped.v <path_to_original_file>/conditional_mux_tb.v -o conditional_mux_gls.out
./conditional_mux_gls.out
gtkwave conditional_mux_tb.vcd

On running the above commands we can view the waveforms using GTKWave.

presynthesis(above) and post-synthesis simulation(below) conditional_mux.v conditional_mux_tb.v

Alt text Alt text

The above shows the pre-synthesis and post-synthesis waveforms. As we can see, both are the same, hence we can confirm that the synthesized netlist is functionally correct.

presynthesis(above) and post-synthesis(below) bad_mux.v bad_mux_tb.v

Checking functionality with waveforms Alt text Alt text Alt text

presynthesis(above) and post-synthesis(below) Blocking_error.v Blocking_error_tb.v

Alt text

Find all the testbenches, waveforms (vwf files) and Verilog files in the verilog directory.
