Closed
Labels
enhancement: New feature or request
Description
Dataset Collection
- Gather questions from HDLBits across categories:
  - Basics
  - Vectors
  - Modules & Hierarchy
  - Procedures
  - Combinational Logic (Gates, Multiplexers, Arithmetic Circuits, etc.)
  - Sequential Logic (Latches, Flip-flops, Counters, Shift Registers, etc.)
- Expand coverage with augmented questions (variation, rephrasing, scaling difficulty)
- Prepare additional sources if needed (e.g., RTL-Repo, custom prompts)
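
As a rough illustration of the "scaling difficulty" style of augmentation, the sketch below generates width-scaled variants of a single seed problem. The prompt wording, the `top_module` interface, and the helper name `scale_width` are assumptions made for this example, not part of any agreed pipeline.

```python
from string import Template

# Hypothetical seed problem: the wording and module interface are
# illustrative and not copied from HDLBits.
PROMPT = Template(
    "Build a $width-bit 2-to-1 multiplexer. "
    "When sel=0, choose a. When sel=1, choose b."
)
SOLUTION = Template(
    "module top_module(\n"
    "    input [$msb:0] a, b,\n"
    "    input sel,\n"
    "    output [$msb:0] out\n"
    ");\n"
    "    assign out = sel ? b : a;\n"
    "endmodule\n"
)


def scale_width(widths):
    """Return one question/solution pair per requested bus width."""
    variants = []
    for width in widths:
        variants.append({
            "question": PROMPT.substitute(width=width),
            "solution": SOLUTION.substitute(msb=width - 1),
        })
    return variants


variants = scale_width([8, 32, 100])
```

Generating the prompt and the reference solution from the same parameter keeps the two in sync, which matters once the solutions are used as ground truth for reward functions.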
Dataset Structuring
- Define consistent schema:
  - Question / Problem statement
  - Expected input/output behavior (testbench or truth table)
  - Ground truth solution (reference Verilog code)
- Ensure compatibility with reward functions (compilation, synthesis, functional correctness, etc.)
- Hold out 5-10% of the problems as an evaluation set, kept separate from the training data
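
One possible shape for a record and for the evaluation hold-out is sketched below. The field names (`question`, `expected_io`, `solution`, `category`), the structural checks, and the fixed seed are all assumptions for illustration; the structural check is only a cheap sanity test and does not replace compilation or simulation.

```python
import json
import random
from dataclasses import dataclass, asdict


@dataclass
class Problem:
    # Field names are assumptions that mirror the schema bullets above.
    question: str      # problem statement
    expected_io: str   # testbench source or truth table
    solution: str      # reference Verilog code
    category: str = "Basics"


def validate(problem):
    """Return a list of structural errors (empty list means it looks sane)."""
    errors = []
    for name, value in asdict(problem).items():
        if not value.strip():
            errors.append(f"empty field: {name}")
    body = problem.solution.strip()
    if not (body.startswith("module") and body.endswith("endmodule")):
        errors.append("solution is not a complete module")
    return errors


def split_holdout(problems, eval_fraction=0.1, seed=0):
    """Shuffle deterministically and reserve eval_fraction for evaluation."""
    shuffled = list(problems)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, round(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


example = Problem(
    question="Create a module that implements an AND gate.",
    expected_io="a b | out\n0 0 | 0\n0 1 | 0\n1 0 | 0\n1 1 | 1",
    solution=(
        "module top_module(input a, input b, output out);\n"
        "    assign out = a & b;\n"
        "endmodule\n"
    ),
    category="Combinational Logic",
)
line = json.dumps(asdict(example))  # one JSON Lines record
train, evals = split_holdout([example] * 20, eval_fraction=0.1)
```

Storing one JSON object per line keeps the example subset easy to diff in the repo, and a seeded shuffle makes the evaluation split reproducible across runs.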
Documentation
- Document dataset structure (fields, formatting rules)
- Provide small example subset in repo for reference
- Note augmentation methods used and rationale
Notes
- Initial dataset size: ~20–50 HDLBits problems
- The initial set is then augmented to increase volume while retaining diversity
- Benchmarking targets: VerilogEval, VeriReason Benchmarks, and RTL-Repo (for external validation)