# **VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification**

This is the repository for the VeriThoughts dataset, the first large-scale, formally verified Verilog reasoning dataset. It contains all of the code needed to generate VeriThoughts, as well as our model training and evaluation code.

Our datasets can be found on HuggingFace: [Link](https://huggingface.co/collections/wilyub/verithoughts-datasets-6826de76e798014f05de6c0f)

Our fine-tuned Verilog models can be found on HuggingFace: [Link](https://huggingface.co/collections/nyu-dice-lab/verithoughts-models-681eead7cd13abeb5957baf3)

In [1]:
!git clone https://github.com/wilyub/VeriThoughts.git

Cloning into 'VeriThoughts'...
remote: Enumerating objects: 121, done.
remote: Counting objects: 100% (121/121), done.
remote: Compressing objects: 100% (106/106), done.
remote: Total 121 (delta 54), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (121/121), 1.99 MiB | 8.75 MiB/s, done.
Resolving deltas: 100% (54/54), done.


**Download our evaluation dataset (optional)**

In [2]:
!cd VeriThoughts/evaluation_verithoughts && wget https://huggingface.co/datasets/wilyub/VeriThoughtsBenchmark/resolve/main/hf_benchmark.jsonl


--2025-08-22 00:51:06--  https://huggingface.co/datasets/wilyub/VeriThoughtsBenchmark/resolve/main/hf_benchmark.jsonl
Resolving huggingface.co (huggingface.co)... 108.156.211.90, 108.156.211.51, 108.156.211.95, ...
Connecting to huggingface.co (huggingface.co)|108.156.211.90|:443... connected.
HTTP request sent, awaiting response... 307 Temporary Redirect
Location: /api/resolve-cache/datasets/wilyub/VeriThoughtsBenchmark/ecae68df53c50b113db57c198465d891fcf7736e/hf_benchmark.jsonl?%2Fdatasets%2Fwilyub%2FVeriThoughtsBenchmark%2Fresolve%2Fmain%2Fhf_benchmark.jsonl=&etag=%229f486e6165e9a635087e3eeaa6990949d927537a%22 [following]
--2025-08-22 00:51:06--  https://huggingface.co/api/resolve-cache/datasets/wilyub/VeriThoughtsBenchmark/ecae68df53c50b113db57c198465d891fcf7736e/hf_benchmark.jsonl?%2Fdatasets%2Fwilyub%2FVeriThoughtsBenchmark%2Fresolve%2Fmain%2Fhf_benchmark.jsonl=&etag=%229f486e6165e9a635087e3eeaa6990949d927537a%22
Reusing existing connection to huggingface.co:443.
HTTP request sen

**Install required packages**

In [1]:
!pip install -r VeriThoughts/evaluation_verithoughts/requirements.txt



**Example input file format (test.jsonl):**

Each row corresponds to one Verilog generation task.

1.   question: the prompt for generating Verilog code (the prompt should contain the declarations of the input/output ports).
2.   ground_truth (optional): the golden Verilog implementation, used during evaluation.
3.   generated_verilog: not used here.
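
A custom input file with these fields can be produced with Python's standard library. The following is a minimal sketch, not the repository's own tooling: the file name `my_tasks.jsonl`, the helper names, and the example prompt are all hypothetical, and the generation script may expect additional columns, so compare the result against the provided test.jsonl before using it.

```python
import json

# Hypothetical example task; the prompt and module are illustrative only.
tasks = [
    {
        "question": (
            "Design a Verilog module named `and2` with inputs `a` and `b` "
            "and output `y`, where y = a & b."
        ),
        # Optional golden implementation, used during evaluation.
        "ground_truth": (
            "module and2(input a, input b, output y);\n"
            "  assign y = a & b;\n"
            "endmodule\n"
        ),
        # Not used at generation time, so it is left empty.
        "generated_verilog": "",
    }
]


def write_jsonl(path, rows):
    """Write one JSON object per line (the JSON Lines layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


write_jsonl("my_tasks.jsonl", tasks)
print(read_jsonl("my_tasks.jsonl")[0]["question"])
```

The resulting file can also be loaded with `pd.read_json(path, lines=True)` for inspection.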



In [48]:
benchmark_path = "/content/VeriThoughts/evaluation_verithoughts/test.jsonl"
# benchmark_path = "/content/VeriThoughts/evaluation_verithoughts/hf_benchmark.jsonl"

In [39]:
import pandas as pd
df = pd.read_json(benchmark_path, lines=True)
df

| Unnamed: 0 | ground_truth | question | generated_verilog | verified |
|---|---|---|---|---|
| 0 | | \nDesign the following Verilog modules:\n\n1. ... | | True |


In [None]:
model_id = "nyu-dice-lab/Qwen-2.5-Instruct-Verilog-Reasoning-7B"
hf_token = ""

**Query VeriThoughts LLMs**:


1.   verilog_vllm_multi.py: batch-processes the Verilog generation prompts in the input benchmark file.
2.   model_id: one of the three types of VeriThoughts LLMs (https://huggingface.co/collections/nyu-dice-lab/verithoughts-models-681eead7cd13abeb5957baf3)
3.   sample_number: because LLM generation is non-deterministic, each prompt can be queried sample_number times.
4.   batch_size: vLLM processes batch_size prompts concurrently.
5.   reasoning_mode: output the reasoning trace.
6.   hf_read_token: your Hugging Face token.
7.   tensor_parallel_size: the number of GPUs to use.
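
When the script is launched from Python rather than a notebook cell, the command line can be assembled from these parameters. This is a minimal sketch under two assumptions: the flag names are taken from the example invocation in this notebook, and `--reasoning_mode` is a plain on/off switch that can be omitted to disable it. The helper `build_command` is hypothetical and not part of the repository.

```python
import shlex


def build_command(model_id, benchmark_path, hf_read_token,
                  sample_number=1, batch_size=1,
                  tensor_parallel_size=1, reasoning_mode=True):
    """Assemble the argument list for verilog_vllm_multi.py (hypothetical helper)."""
    args = [
        "python", "verilog_vllm_multi.py",
        "--model_id", model_id,
        "--sample_number", str(sample_number),
        "--batch_size", str(batch_size),
        "--hf_read_token", hf_read_token,
        "--benchmark_path", benchmark_path,
        "--tensor_parallel_size", str(tensor_parallel_size),
    ]
    # Assumption: the flag takes no value and is simply left out to disable it.
    if reasoning_mode:
        args.append("--reasoning_mode")
    return args


cmd = build_command(
    model_id="nyu-dice-lab/Qwen-2.5-Instruct-Verilog-Reasoning-7B",
    benchmark_path="test.jsonl",
    hf_read_token="YOUR_HF_TOKEN",  # placeholder, replace with a real token
)
# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, cwd="VeriThoughts/evaluation_verithoughts", check=True)` to run the script from its own directory.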



In [43]:
!cd VeriThoughts/evaluation_verithoughts/ && python verilog_vllm_multi.py --model_id nyu-dice-lab/Qwen-2.5-Instruct-Verilog-Reasoning-7B --sample_number 1 --batch_size 1 --reasoning_mode --hf_read_token $hf_token --benchmark_path $benchmark_path --tensor_parallel_size 1

2025-08-22 02:28:00.508664: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-08-22 02:28:00.527919: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1755829680.549789   32370 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1755829680.556324   32370 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1755829680.572848   32370 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking 

In [44]:
results_path = "/content/VeriThoughts/evaluation_verithoughts/benchmark_results/nyu-dice-lab/Qwen-2.5-Instruct-Verilog-Reasoning-7B/results.jsonl"

In [45]:
import pandas as pd
result = pd.read_json(results_path, lines=True)
result

| Unnamed: 0 | question | full_response | generated_code | ground_truth |
|---|---|---|---|---|
| 0 | \nDesign the following Verilog modules:\n\n1. ... | Okay, I need to design three Verilog modules: ... | \n\nmodule byte_splitter (\n input wire [15... | |


**Example output file format (results.jsonl):**



1.   question: the original prompt.
2.   full_response: the model's complete response (in the example above it begins with the reasoning trace).
3.   generated_code: the Verilog code extracted from full_response.
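
To simulate or lint the generated designs, each `generated_code` entry can be written to its own `.v` file. The sketch below assumes only the field names listed above. The helper `export_generated_code`, the `sample_<i>.v` naming, and the synthetic demo record are hypothetical; they stand in for a real results file so the example runs on its own.

```python
import json
import tempfile
from pathlib import Path


def export_generated_code(results_path, out_dir):
    """Write the generated_code of every record to out_dir/sample_<i>.v."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(results_path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            code = record.get("generated_code") or ""
            target = out_dir / f"sample_{i}.v"
            target.write_text(code, encoding="utf-8")
            written.append(target)
    return written


# Synthetic one-record results file, for demonstration only.
demo_dir = Path(tempfile.mkdtemp())
demo_results = demo_dir / "results.jsonl"
demo_record = {
    "question": "placeholder prompt",
    "full_response": "placeholder response",
    "generated_code": "module t;\nendmodule\n",
}
demo_results.write_text(json.dumps(demo_record) + "\n", encoding="utf-8")

written = export_generated_code(demo_results, demo_dir / "rtl")
print([p.name for p in written])
```

For a real run, pass the `results_path` defined earlier in place of the demo file.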