alex96295/pulp-verilog-eval

PULP SystemVerilog evaluation benchmark and library for large language models

This repository builds a benchmark and library dataset from selected open-source PULP platform designs written in SystemVerilog, with a dual goal:

  1. Leveraging the generated benchmark to assess LLM-based RTL design and verification capabilities
  2. Leveraging the generated library as a retrieval database for LLM-based hardware design agents, enabling them to consult and reuse modules in a way that mirrors how human designers reference existing third-party IPs.

The reference RTL designs and testbenches are single-source, self-contained files from the original PULP codebase, generated with the open-source tools Bender and Morty. Specification prompts are generated by an LLM.

This repository is inspired by NVIDIA's verilog-eval.

Dependencies

  • Bender >= 0.27.1
  • Morty >= 0.9.0
  • Python >= 3.11

Additionally, set "OPENAI_API_KEY", "ANTHROPIC_API_KEY", or other provider keys as environment variables to use a cloud LLM provider's API, or create a key.cfg file in the following format:

OPENAI_API_KEY='xxxxxxx'
ANTHROPIC_API_KEY='xxxxxxx'
VERTEX_SERVICE_ACCOUNT_PATH='xxxxxxx'
VERTEX_REGION='xxxxxxx'
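As an illustration, a key.cfg in this format can be loaded into the environment with a few lines of Python. This loader is a hypothetical helper, not part of the repository:

```python
import os
import re

# Matches lines of the form KEY='value', tolerating spaces around '=',
# as in the sample key.cfg above.
_LINE = re.compile(r"^\s*(\w+)\s*=\s*'([^']*)'\s*$")

def load_key_cfg(path="key.cfg"):
    """Read provider keys from a key.cfg file into os.environ.

    Hypothetical helper: keys already present in the environment are
    left untouched so explicit exports take precedence.
    """
    with open(path) as fh:
        for line in fh:
            match = _LINE.match(line)
            if match:
                os.environ.setdefault(match.group(1), match.group(2))
```

Environment variables set in the shell still win, since `setdefault` never overwrites an existing key.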

Getting started

This project uses Git submodules that have to be initialized. Either clone the repository recursively using:

git clone --recursive <url>

or fetch the submodules afterwards in the repository:

git submodule update --init --recursive

Setup

Install the dependencies:

# get latest stable rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# bender
cargo install bender

# morty
cargo install --git https://github.com/pulp-platform/morty.git

# python environment
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
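After installation, a quick sanity check against the minimum versions listed under Dependencies can be sketched in Python. The `--version` output format of each tool is an assumption; adjust the parsing if your tools print something different:

```python
import re
import shutil
import subprocess

# Minimum versions from the Dependencies section above.
MINIMUMS = {"bender": (0, 27, 1), "morty": (0, 9, 0)}

def version_tuple(text):
    """Extract the first X.Y.Z version triple from a --version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(map(int, match.groups())) if match else None

def check_tools(minimums=MINIMUMS):
    """Report whether each tool is installed and recent enough."""
    results = {}
    for tool, minimum in minimums.items():
        if shutil.which(tool) is None:
            results[tool] = "not found"
            continue
        out = subprocess.run([tool, "--version"],
                             capture_output=True, text=True).stdout
        found = version_tuple(out)
        results[tool] = "ok" if found and found >= minimum else f"too old: {found}"
    return results
```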

Usage

The list of (RTL DUT, TB) pairs and their original assets must be listed in a JSON file. A sample file is provided at $ROOT/assets.json. Then call:

./scripts/bench-lib-gen.sh \
  --json assets.json \
  --out out \
  --provider openai \
  --model gpt-4o-2024-08-06 \
  --key-cfg ./key.cfg \
  --max-token 8192 \
  --tokens 60000 \
  --temperature 0.6 \
  --top-p 0.95

Output

| Name | Description |
| --- | --- |
| $ROOT/out/bench/ProbXXX_<dut_name>_ref.sv | Reference DUT RTL design for automatic spec generation. |
| $ROOT/out/bench/ProbXXX_<dut_name>_test.sv | Reference testbench for the DUT, used by the assessed LLM to verify its generated RTL DUT in the loop. TBs are self-checking. |
| $ROOT/out/bench/ProbXXX_<dut_name>_test_golden.sv | Reference testbench instantiating the reference DUT; serves as a golden reference for comparison with the LLM-generated RTL DUT. |
| $ROOT/out/bench/ProbXXX_<dut_name>_prompt.txt | LLM-generated natural-language input spec based on the reference design and, if one is specified, the testbench. Can also be written from scratch if a reference design is missing. |

| Name | Description |
| --- | --- |
| $ROOT/out/lib/<dut_name>.json | LLM-generated structured input spec (JSON) based on the reference design. |
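As a small illustration, generated benchmark problems can be enumerated by globbing the file layout listed above. This is a sketch, assuming the out/ directory produced by the Usage step:

```python
from pathlib import Path

def list_problems(out_dir="out"):
    """Return the ProbXXX_<dut_name> stems found under out/bench/.

    Each problem contributes one *_ref.sv file per the output layout
    above, so the _ref.sv files serve as an index of problems.
    """
    bench = Path(out_dir) / "bench"
    suffix = "_ref.sv"
    return sorted(p.name[: -len(suffix)] for p in bench.glob("Prob*" + suffix))
```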

An example output for the designs fifo_v3 and credit_counter from the PULP platform's common_cells is provided in out/{bench,lib}.
