Open benchmark and evaluation suite for first-pass Verilog coding.

This repository is the public code release for VeriScope. It includes:
- benchmark-side evaluation code
- benchmark bundle validators
- execution harness integration
- judge prompts and scoring rules
- scripts for reproducing the main reported comparisons
- documentation for the released benchmark package
```
veriscope/
├── README.md
├── LICENSE
├── requirements.txt
├── REPRODUCIBILITY.md
├── docs/
│   ├── DATASET_LAYOUT.md
│   ├── THIRD_PARTY_ASSETS.md
│   └── SAFETY_AND_LIMITATIONS.md
├── evaluator/
├── judge/
├── scripts/
├── tools/
├── prompts/
├── examples/
└── results/
```
VeriScope evaluates first-pass Verilog generation in a deployment-style workflow. Each task is packaged as a reusable problem bundle, and each model run is scored with:
- objective execution
- RTL-level review
- artifact-level review
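As an illustration only, the three signals above might be combined into a single per-task score along these lines. The weights, gating behavior, and field names here are assumptions for the sketch, not the released scoring rules (those live under `judge/`):

```python
# Hypothetical aggregation of the three scoring signals for one task.
# Weights and the execution gate are illustrative assumptions.

def aggregate_score(execution_pass: bool,
                    rtl_review: float,
                    artifact_review: float) -> float:
    """Combine objective execution with the two judge reviews.

    execution_pass  -- did the generated RTL pass the testbench?
    rtl_review      -- RTL-level judge score in [0, 1]
    artifact_review -- artifact-level judge score in [0, 1]
    """
    if not execution_pass:
        return 0.0  # objective execution gates the judged scores
    return 0.5 * rtl_review + 0.5 * artifact_review

print(aggregate_score(True, 0.8, 0.6))
print(aggregate_score(False, 1.0, 1.0))
```

Gating on execution (rather than averaging it in) reflects the first-pass framing: RTL that fails the testbench scores zero regardless of how clean it looks.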
The public paper reports results on 568 benchmark problems, with analysis over both the full benchmark and the stricter L3--L4+ slice.
The companion Hugging Face dataset release excludes `reference.v` by default. Public evaluation needs only the task brief, interface metadata, and testbench. Reference RTL is treated as a benchmark-side validation anchor rather than as part of the default public evaluation package.
- Code repository: https://github.com/zwpride/VeriScope
- Dataset repository: https://huggingface.co/datasets/zwpride/VeriScope
```
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

python3 scripts/run_public_eval.py \
  --problems data/public_subset \
  --predictions examples/sample_predictions.json \
  --output outputs/sample_eval.json
```

See REPRODUCIBILITY.md for:
- exact environment setup
- required external tools
- minimal reproduction commands
- which results are fully reproducible from the public release
- which evaluations depend on third-party closed APIs
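For orientation, a predictions file passed via `--predictions` might look like the sketch below. The key names (`problem_id`, `generated_rtl`) are assumptions for illustration; the authoritative schema is the shipped `examples/sample_predictions.json`:

```python
# Hypothetical shape for a predictions file: one entry per benchmark
# problem, pairing a problem id with the model's generated RTL.
# Key names are assumptions, not the released schema.
import json

predictions = [
    {
        "problem_id": "example_counter",  # must match a bundle under --problems
        "generated_rtl": "module counter(input clk, output reg [3:0] q);\n"
                         "endmodule\n",
    },
]

print(json.dumps(predictions, indent=2))
```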
The benchmark bundles live in a companion Hugging Face dataset repository. This code release should link to that dataset and pin the expected dataset version or tag. The default public dataset is expected to contain:

- `problem.md`
- `config.yaml`
- `testbench.v`

and to exclude `reference.v` unless a separate maintainer-oriented validation package is released.
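The expected-contents rule above can be checked mechanically. A minimal sketch of such a bundle check, assuming each problem is a directory containing exactly the three public files named above (the function name and layout are assumptions, not the released validator under `tools/`):

```python
# Minimal sketch of a public-bundle check: require the three public
# files and flag reference.v if it leaked into a public bundle.
from pathlib import Path

PUBLIC_FILES = ("problem.md", "config.yaml", "testbench.v")

def check_bundle(bundle_dir: Path) -> list[str]:
    """Return a list of issues found in one problem bundle."""
    issues = [f"missing {name}" for name in PUBLIC_FILES
              if not (bundle_dir / name).is_file()]
    if (bundle_dir / "reference.v").is_file():
        issues.append("reference.v present in a public bundle")
    return issues
```

A passing bundle returns an empty list, so the check composes cleanly over a whole dataset directory with one loop.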
- replace the draft public entrypoint names with the final scripts
- update the repository tree to match the final exported layout
- add a pinned release tag that matches the published release snapshot