# LLM-aided Testbench Generation - Complete Tutorial

## Introduction

Welcome to the **LLM-aided Testbench Generation** tutorial! This notebook provides an interactive, hands-on demonstration of an automated testbench generation system that leverages Large Language Models (LLMs) to create comprehensive Verilog testbenches.

### What You'll Learn

In this notebook, you will:
- Understand the 5-step automated pipeline for testbench generation
- See how natural language descriptions are transformed into working testbenches
- Generate testbenches for two example Verilog modules (a multiplexer and an adder)
- Learn how golden reference models are created and used for verification
- Explore the generated files and understand their purpose

### Prerequisites

- Basic understanding of digital logic design
- Familiarity with Verilog (helpful but not required)
- Python 3.7 or higher

### Key Concepts

**Testbench**: A Verilog module that instantiates and tests another module by applying input patterns and checking outputs.

**Golden Model**: A reference implementation (in Python) that defines the expected behavior of the module under test.

**Test Patterns**: A comprehensive set of input combinations designed to thoroughly verify the module's functionality.

Let's get started!

---
## Section 1: Environment Setup

### Introduction to Setup

Before we can generate testbenches, we need to set up our environment. This section installs the necessary Python packages and imports the required modules from our testbench generation system.

The main components we'll use are:
- **TestbenchPipeline**: The orchestrator that runs the complete generation process
- **LLMClient**: Interfaces with the Language Model (optional for this demo)

**Note**: This demo can run in two modes:
1. **Full LLM mode**: Requires an OpenAI API key for real LLM-powered generation
2. **Demo mode**: Works without an API key using mock generation for demonstration purposes

In [None]:
# Install required packages (if not already installed)
import sys
import subprocess

try:
    import openai
    print("✓ OpenAI package is already installed")
except ImportError:
    print("Installing required packages...")
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", "openai>=0.27.0"])
    print("✓ Packages installed successfully")

In [None]:
# Import required modules
import os
import json
from src.testbench_pipeline import TestbenchPipeline
from src.llm_client import LLMClient

print("✓ All modules imported successfully")
print("\nReady to generate testbenches!")

### Optional: Configure LLM API Key

If you have an OpenAI API key and want to use full LLM-powered generation, uncomment and run the cell below. Otherwise, the system falls back to mock generation, which is sufficient for demonstration purposes.

In [None]:
# Uncomment the next line and add your API key if you want to use real LLM generation
# os.environ['OPENAI_API_KEY'] = 'your-api-key-here'

# Check if LLM is configured
llm = LLMClient()
if llm.is_available():
    print("✓ LLM is configured and ready to use")
else:
    print("ℹ Running in DEMO MODE (without LLM)")
    print("  The examples will still work with mock generation.")

---
## Section 2: Understanding the Pipeline

### Introduction to the 5-Step Pipeline

The LLM-aided Testbench Generation system follows a systematic 5-step process:

**Steps 1-2: Input Handling**
- Accept a natural language description of what the module should do
- Accept the Verilog code to be tested (which may contain bugs)

**Step 3: Testbench Generation**
- Analyze the Verilog module structure (inputs, outputs, bit widths)
- Use the LLM to generate test patterns covering corner cases, boundary values, typical cases, and random values
- Create an initial testbench that applies these patterns but doesn't know the expected outputs yet

**Step 4: Golden Model Creation**
- Use the LLM to convert the natural language description into executable Python code (the "golden model")
- Run all test patterns through this Python model to compute the expected outputs
- This gives us the "golden" reference outputs for verification

**Step 5: Testbench Enhancement**
- Inject the golden outputs into the testbench
- Add verification logic that compares actual vs. expected outputs
- Include pass/fail checking and test summary reporting
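
Steps 4 and 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: `attach_golden_outputs`, `check`, and `and_gate_golden` are hypothetical names, and the record layout (`test_num`, `inputs`, `expected_outputs`) mirrors the JSON file examined later in this notebook.

```python
from typing import Callable, Dict, List


def attach_golden_outputs(
    patterns: List[dict],
    golden_model: Callable[..., Dict[str, int]],
) -> List[dict]:
    """Step 4: run every input pattern through the golden model."""
    enriched = []
    for number, pattern in enumerate(patterns, start=1):
        expected = golden_model(**pattern["inputs"])
        enriched.append(
            {
                "test_num": number,
                "inputs": pattern["inputs"],
                "expected_outputs": expected,
            }
        )
    return enriched


def check(expected: Dict[str, int], actual: Dict[str, int]) -> bool:
    """Step 5 (conceptually): a test passes when every output matches."""
    return all(actual.get(name) == value for name, value in expected.items())


# Hypothetical golden model for an AND gate, used only for illustration.
def and_gate_golden(a: int, b: int) -> Dict[str, int]:
    return {"y": a & b}


patterns = [{"inputs": {"a": a, "b": b}} for a in (0, 1) for b in (0, 1)]
enriched = attach_golden_outputs(patterns, and_gate_golden)
print(enriched[-1])
```

In the real pipeline the comparison in Step 5 happens inside the Verilog testbench during simulation; `check` only shows the idea in Python.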

### Why This Approach?

This approach has several advantages:
- **Automation**: Reduces manual effort in writing testbenches
- **Comprehensive Coverage**: The LLM helps identify important test cases
- **Natural-Language Specification**: The intended behavior is written in plain language rather than in code
- **Separation of Concerns**: The golden model is independent of the Verilog implementation

---
## Section 3: Example 1 - 2-to-1 Multiplexer

### Introduction to the Multiplexer Example

A multiplexer (MUX) is one of the most fundamental building blocks in digital circuits. It selects one of several input signals and forwards it to a single output based on a control signal.

Our example is a simple 2-to-1 MUX with:
- Two data inputs: `a` and `b` (1-bit each)
- One select input: `sel` (1-bit)
- One output: `y` (1-bit)

**Behavior**: When `sel` is 0, output `y` equals input `a`. When `sel` is 1, output `y` equals input `b`.

This is an excellent starting example because:
- It's simple and easy to understand
- It has only 8 possible input combinations (2³)
- The expected behavior is straightforward
- It demonstrates pure combinational logic
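
Because there are only eight input combinations, the expected behavior can be written out exhaustively. The sketch below does so in Python; `mux2to1_reference` is a hypothetical name chosen for this illustration and is not necessarily what the pipeline generates.

```python
from itertools import product


def mux2to1_reference(a: int, b: int, sel: int) -> int:
    """Return b when sel is 1, otherwise a (all values are single bits)."""
    return b if sel else a


# Enumerate all 2**3 = 8 input combinations as (a, b, sel, y) rows.
truth_table = [
    (a, b, sel, mux2to1_reference(a, b, sel))
    for a, b, sel in product((0, 1), repeat=3)
]

for a, b, sel, y in truth_table:
    print(f"a={a} b={b} sel={sel} -> y={y}")
```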

In [None]:
# Define the natural language description for the 2-to-1 MUX
mux_description = """A 2-to-1 multiplexer (MUX).

The module takes two 1-bit input signals (a and b) and one 1-bit select signal (sel).

Functionality:
- Input 'a': First data input (1-bit)
- Input 'b': Second data input (1-bit)
- Input 'sel': Select signal (1-bit)
- Output 'y': Selected output (1-bit)

When sel is 0, the output y should be equal to input a.
When sel is 1, the output y should be equal to input b.

This is a combinational logic circuit with no state or memory."""

print("Natural Language Description:")
print("=" * 70)
print(mux_description)
print()

In [None]:
# Define the Verilog code for the 2-to-1 MUX
mux_verilog = """module mux2to1 (
    input wire a,
    input wire b,
    input wire sel,
    output wire y
);
    assign y = sel ? b : a;
endmodule"""

print("Verilog Code:")
print("=" * 70)
print(mux_verilog)
print()

In [None]:
# Run the testbench generation pipeline for the MUX
print("\n" + "="*80)
print("Starting Testbench Generation for 2-to-1 Multiplexer")
print("="*80 + "\n")

# Initialize pipeline
pipeline = TestbenchPipeline()

# Run the complete pipeline
mux_result = pipeline.run(
    description=mux_description,
    verilog_code=mux_verilog,
    output_dir="notebook_output/mux"
)

print("\n✓ Testbench generation completed!")

### Examining the Generated Files

Let's explore what was generated. The pipeline creates four main files:

1. **testbench_initial.v**: The initial testbench with test patterns but no expected outputs
2. **golden_model.py**: The Python reference implementation
3. **test_patterns_with_golden.json**: Test patterns with expected outputs
4. **testbench_final.v**: The complete testbench with verification logic

In [None]:
# Display the generated Python golden model
print("Generated Python Golden Model:")
print("=" * 70)
with open("notebook_output/mux/golden_model.py", "r") as f:
    golden_model_code = f.read()
    print(golden_model_code)

In [None]:
# Display the test patterns with golden outputs
print("\nTest Patterns with Expected Outputs:")
print("=" * 70)
with open("notebook_output/mux/test_patterns_with_golden.json", "r") as f:
    test_patterns = json.load(f)
    print(json.dumps(test_patterns, indent=2))

In [None]:
# Display summary statistics
print("\nTestbench Generation Summary:")
print("=" * 70)
print(f"Module Name: {mux_result['module_info']['module_name']}")
print(f"Number of Inputs: {len(mux_result['module_info']['inputs'])}")
print(f"Number of Outputs: {len(mux_result['module_info']['outputs'])}")
print(f"Total Test Patterns: {len(test_patterns)}")
successful_tests = sum(1 for p in test_patterns if p.get('expected_outputs'))
print(f"Successful Pattern Computations: {successful_tests}/{len(test_patterns)}")
print(f"\nOutput Directory: {mux_result['output_dir']}")

---
## Section 4: Example 2 - 4-bit Adder

### Introduction to the Adder Example

Now let's try a more complex example: a 4-bit unsigned adder. This demonstrates how the system handles:
- Multi-bit signals (4-bit inputs and outputs)
- Arithmetic operations
- Multiple outputs (sum and carry)
- Overflow detection

The 4-bit adder:
- Takes two 4-bit unsigned inputs: `a` and `b` (range: 0-15 each)
- Produces a 4-bit sum output: `sum` (lower 4 bits of the result)
- Produces a 1-bit carry output: `carry` (1 if result > 15, else 0)

**Example**: If `a=15` (0xF) and `b=1`, then `sum=0` (0x0) and `carry=1` (overflow occurred)

This example is more interesting because:
- It has 256 possible input combinations (2⁸)
- It tests boundary conditions (overflow cases)
- It requires understanding of binary arithmetic
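
The sum/carry split described above can be checked with a small Python sketch. The function name `adder4bit_reference` and the dictionary return format are assumptions made for this illustration; the generated golden model may look different.

```python
def adder4bit_reference(a: int, b: int) -> dict:
    """Unsigned 4-bit add: return the low 4 bits and the carry-out."""
    if not (0 <= a <= 15 and 0 <= b <= 15):
        raise ValueError("a and b must be in the range 0-15")
    total = a + b
    return {"sum": total & 0xF, "carry": total >> 4}


# Enumerate all 2**8 = 256 input combinations.
all_results = {
    (a, b): adder4bit_reference(a, b) for a in range(16) for b in range(16)
}
# Input pairs whose true sum exceeds 15, i.e. the overflow cases.
overflow_cases = [pair for pair, out in all_results.items() if out["carry"]]

print(adder4bit_reference(15, 1))  # {'sum': 0, 'carry': 1}
print(len(all_results), len(overflow_cases))
```

Masking with `0xF` keeps the lower four bits, and shifting right by four leaves only the carry bit, since the largest possible total is 30.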

In [None]:
# Define the natural language description for the 4-bit adder
adder_description = """A simple 4-bit adder module.

The module takes two 4-bit input signals (a and b) and produces a 4-bit sum output and a 1-bit carry output.

Functionality:
- Input 'a': 4-bit unsigned number
- Input 'b': 4-bit unsigned number  
- Output 'sum': 4-bit result of a + b (lower 4 bits)
- Output 'carry': 1-bit carry-out flag (set to 1 if result exceeds 15)

The adder performs unsigned addition of the two 4-bit inputs.
If the result is greater than 15 (0xF), the carry output should be set to 1."""

print("Natural Language Description:")
print("=" * 70)
print(adder_description)
print()

In [None]:
# Define the Verilog code for the 4-bit adder
adder_verilog = """module adder4bit (
    input wire [3:0] a,
    input wire [3:0] b,
    output wire [3:0] sum,
    output wire carry
);
    wire [4:0] result;
    assign result = a + b;
    assign sum = result[3:0];
    assign carry = result[4];
endmodule"""

print("Verilog Code:")
print("=" * 70)
print(adder_verilog)
print()

In [None]:
# Run the testbench generation pipeline for the adder
print("\n" + "="*80)
print("Starting Testbench Generation for 4-bit Adder")
print("="*80 + "\n")

# Initialize pipeline (can reuse or create new instance)
pipeline_adder = TestbenchPipeline()

# Run the complete pipeline
adder_result = pipeline_adder.run(
    description=adder_description,
    verilog_code=adder_verilog,
    output_dir="notebook_output/adder"
)

print("\n✓ Testbench generation completed!")

In [None]:
# Display the generated Python golden model for the adder
print("Generated Python Golden Model:")
print("=" * 70)
with open("notebook_output/adder/golden_model.py", "r") as f:
    adder_golden_model = f.read()
    print(adder_golden_model)

In [None]:
# Display a sample of test patterns with golden outputs
print("\nSample Test Patterns (first 10):")
print("=" * 70)
with open("notebook_output/adder/test_patterns_with_golden.json", "r") as f:
    adder_patterns = json.load(f)
    # Show the first 10 patterns
    print(json.dumps(adder_patterns[:10], indent=2))
    remaining = len(adder_patterns) - 10
    if remaining > 0:
        print(f"\n... and {remaining} more patterns")

In [None]:
# Display summary statistics for the adder
print("\nTestbench Generation Summary:")
print("=" * 70)
print(f"Module Name: {adder_result['module_info']['module_name']}")
print(f"Number of Inputs: {len(adder_result['module_info']['inputs'])}")
print(f"Number of Outputs: {len(adder_result['module_info']['outputs'])}")
print(f"Total Test Patterns: {len(adder_patterns)}")
successful_adder_tests = sum(1 for p in adder_patterns if p.get('expected_outputs'))
print(f"Successful Pattern Computations: {successful_adder_tests}/{len(adder_patterns)}")
print(f"\nOutput Directory: {adder_result['output_dir']}")

---
## Section 5: Understanding Generated Files

### Introduction to Output Files

The testbench generation pipeline produces four types of files. Let's examine each one to understand its purpose and structure.

### File 1: testbench_initial.v

This is the first-stage testbench that:
- Declares all necessary signals
- Instantiates the module under test
- Applies test patterns systematically
- Displays outputs (but doesn't verify them yet)

It serves as an intermediate artifact that shows the test structure before verification logic is added.
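
To make the "apply patterns and display outputs" idea concrete, here is a rough Python sketch that renders such a stimulus block as Verilog text. `render_stimulus`, its parameters, and the 10-time-unit delay are all hypothetical; the file the pipeline actually writes may be structured differently.

```python
from typing import Dict, List


def render_stimulus(
    patterns: List[Dict[str, int]], outputs: List[str], delay: int = 10
) -> str:
    """Render an `initial` block that drives each pattern and displays outputs."""
    lines = ["initial begin"]
    for number, inputs in enumerate(patterns, start=1):
        # Drive all inputs, then wait so combinational outputs can settle.
        assignments = " ".join(f"{name} = {value};" for name, value in inputs.items())
        lines.append(f"    {assignments} #{delay};")
        shown = ", ".join(f"{name}=%b" for name in outputs)
        args = ", ".join(outputs)
        lines.append(f'    $display("Test {number}: {shown}", {args});')
    lines.append("    $finish;")
    lines.append("end")
    return "\n".join(lines)


stimulus = render_stimulus(
    [{"a": 0, "b": 1, "sel": 0}, {"a": 0, "b": 1, "sel": 1}],
    outputs=["y"],
)
print(stimulus)
```

Note that nothing here compares `y` against an expected value; that is exactly what the final testbench adds.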

In [None]:
# Display the initial testbench (first 50 lines)
print("Initial Testbench (excerpt):")
print("=" * 70)
with open("notebook_output/mux/testbench_initial.v", "r") as f:
    lines = f.readlines()
    print(''.join(lines[:50]))
    if len(lines) > 50:
        print(f"\n... ({len(lines) - 50} more lines)")

### File 2: golden_model.py

The Python golden model is a reference implementation that:
- Implements the expected behavior based on the natural language description
- Takes the same inputs as the Verilog module
- Returns the expected outputs
- Can be used independently for verification

This model is crucial because it provides an executable specification of the desired behavior that is independent of the Verilog implementation.

In [None]:
# Test the golden model directly
print("Testing Golden Model Directly:")
print("=" * 70)

# Import and test the MUX golden model
import sys
sys.path.insert(0, 'notebook_output/mux')
try:
    from golden_model import mux2to1_golden
    
    print("\nMUX Golden Model Test Cases:")
    print(f"mux2to1_golden(a=0, b=0, sel=0) = {mux2to1_golden(0, 0, 0)}")
    print(f"mux2to1_golden(a=1, b=0, sel=0) = {mux2to1_golden(1, 0, 0)}")
    print(f"mux2to1_golden(a=0, b=1, sel=1) = {mux2to1_golden(0, 1, 1)}")
    print(f"mux2to1_golden(a=1, b=1, sel=1) = {mux2to1_golden(1, 1, 1)}")
except Exception as e:
    print(f"Note: Golden model test skipped ({type(e).__name__}: {e})")
finally:
    sys.path.pop(0)

### File 3: test_patterns_with_golden.json

This JSON file contains:
- All test patterns (inputs)
- Expected outputs for each pattern (computed by the golden model)
- Test numbering for easy reference

It serves as a complete test specification that bridges the Python and Verilog domains.
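
As a sketch of how such a file could be sanity-checked before it is used, the snippet below flags records that are missing a key or have no golden outputs. The three key names come from the pattern structure used in this notebook; the sample values and the helper `find_incomplete` are assumptions for illustration only.

```python
import json

REQUIRED_KEYS = ("test_num", "inputs", "expected_outputs")


def find_incomplete(patterns):
    """Return test numbers (or list positions) of unusable records."""
    incomplete = []
    for index, pattern in enumerate(patterns):
        missing = [key for key in REQUIRED_KEYS if key not in pattern]
        # An empty expected_outputs dict means the golden model produced nothing.
        if missing or not pattern["expected_outputs"]:
            incomplete.append(pattern.get("test_num", index))
    return incomplete


# Hypothetical sample data in the assumed layout.
sample = json.loads("""
[
  {"test_num": 1, "inputs": {"a": 0, "b": 1, "sel": 0}, "expected_outputs": {"y": 0}},
  {"test_num": 2, "inputs": {"a": 0, "b": 1, "sel": 1}, "expected_outputs": {}}
]
""")
print(find_incomplete(sample))  # [2]
```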

In [None]:
# Analyze the test patterns
print("Test Pattern Analysis:")
print("=" * 70)

with open("notebook_output/mux/test_patterns_with_golden.json", "r") as f:
    patterns = json.load(f)

print(f"Total number of test patterns: {len(patterns)}")
print(f"\nPattern structure:")
if patterns:
    print(f"  - test_num: Test identifier")
    print(f"  - inputs: {list(patterns[0]['inputs'].keys())}")
    print(f"  - expected_outputs: {list(patterns[0].get('expected_outputs', {}).keys())}")
    
    print(f"\nExample pattern:")
    print(json.dumps(patterns[0], indent=2))

### File 4: testbench_final.v

The final testbench is the complete, ready-to-simulate file that includes:
- Everything from testbench_initial.v
- Expected output values for each test
- Comparison logic (actual vs. expected)
- Pass/fail tracking for each test
- Summary report showing total tests passed/failed

This is the file you would compile and simulate with a Verilog simulator (Icarus Verilog, ModelSim, etc.).
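
The comparison logic might be produced along the following lines. This is only a sketch: `render_check`, the `widths` argument, and the `pass_count` / `fail_count` counters (assumed to be declared as integers elsewhere in the testbench) are hypothetical and may not match the generated file.

```python
def render_check(test_num: int, expected: dict, widths: dict) -> str:
    """Render a Verilog check for one test.

    `!==` (case inequality) is used so that x or z on an output
    counts as a mismatch instead of being silently ignored.
    """
    condition = " || ".join(
        f"{name} !== {widths[name]}'d{value}" for name, value in expected.items()
    )
    return (
        f"if ({condition}) begin\n"
        f'    $display("Test {test_num}: FAIL");\n'
        f"    fail_count = fail_count + 1;\n"
        f"end else begin\n"
        f"    pass_count = pass_count + 1;\n"
        f"end"
    )


check_block = render_check(7, {"sum": 0, "carry": 1}, {"sum": 4, "carry": 1})
print(check_block)
```

With plain `!=`, an unknown (`x`) output would make the condition evaluate to `x`, which an `if` treats as false, so a broken design could be counted as passing.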

In [None]:
# Display the final testbench (excerpt)
print("Final Testbench with Verification (excerpt):")
print("=" * 70)
with open("notebook_output/mux/testbench_final.v", "r") as f:
    lines = f.readlines()
    # Show first 60 lines to include some verification logic
    print(''.join(lines[:60]))
    if len(lines) > 60:
        print(f"\n... ({len(lines) - 60} more lines)")

---
## Section 6: Comparing Testbench Stages

### Introduction to Testbench Evolution

Let's compare the initial and final testbenches to see what was added in Step 5. This helps understand how the verification logic is integrated.

In [None]:
# Compare file sizes
import os

print("File Size Comparison:")
print("=" * 70)

initial_size = os.path.getsize("notebook_output/mux/testbench_initial.v")
final_size = os.path.getsize("notebook_output/mux/testbench_final.v")

print(f"testbench_initial.v: {initial_size} bytes")
print(f"testbench_final.v:   {final_size} bytes")
print(f"Additional content:  {final_size - initial_size} bytes")
if initial_size:  # avoid dividing by zero if the initial file is empty
    print(f"Size increase:       {((final_size / initial_size - 1) * 100):.1f}%")

print("\nThe additional content includes:")
print("  - Expected output declarations")
print("  - Comparison statements")
print("  - Pass/fail counters")
print("  - Test summary report")

---
## Section 7: Simulating the Testbench (Optional)

### Introduction to Simulation

If you have a Verilog simulator installed (like Icarus Verilog), you can actually run the generated testbench to verify the module. This section shows how to do that.

**Note**: This requires having `iverilog` and `vvp` installed on your system. If you don't have them, you can skip this section - the testbench generation is complete regardless.

In [None]:
# Check if Icarus Verilog is available
import subprocess
import shutil

iverilog_available = shutil.which('iverilog') is not None
vvp_available = shutil.which('vvp') is not None

if iverilog_available and vvp_available:
    print("✓ Icarus Verilog is available")
    print("  You can simulate the generated testbenches")
else:
    print("ℹ Icarus Verilog is not installed")
    print("  To install it:")
    print("    - Ubuntu/Debian: sudo apt-get install iverilog")
    print("    - macOS: brew install icarus-verilog")
    print("  You can still generate testbenches without simulation")

In [None]:
# Simulate the MUX testbench (if simulator is available)
if iverilog_available and vvp_available:
    print("Simulating MUX Testbench:")
    print("=" * 70)
    
    # First, save the Verilog module to a file
    with open("notebook_output/mux/mux2to1.v", "w") as f:
        f.write(mux_verilog)
    
    try:
        # Compile the testbench
        compile_cmd = [
            'iverilog',
            '-o', 'notebook_output/mux/simulation',
            'notebook_output/mux/mux2to1.v',
            'notebook_output/mux/testbench_final.v'
        ]
        result = subprocess.run(compile_cmd, capture_output=True, text=True, timeout=30)
        
        if result.returncode == 0:
            print("✓ Compilation successful")
            
            # Run the simulation
            sim_cmd = ['vvp', 'notebook_output/mux/simulation']
            result = subprocess.run(sim_cmd, capture_output=True, text=True, timeout=30)
            
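            print("\nSimulation Output:")
            print("-" * 70)
            print(result.stdout)
            # Surface simulator warnings or runtime errors as well
            if result.stderr:
                print("Simulator messages (stderr):")
                print(result.stderr)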
        else:
            print(f"Compilation failed: {result.stderr}")
    except Exception as e:
        print(f"Simulation error: {e}")
else:
    print("Simulation skipped (Icarus Verilog not available)")

---
## Section 8: Summary and Next Steps

### What We've Accomplished

In this notebook, you have:

1. ✓ Set up the LLM-aided Testbench Generation environment
2. ✓ Understood the 5-step automated pipeline
3. ✓ Generated a testbench for a 2-to-1 multiplexer
4. ✓ Generated a testbench for a 4-bit adder
5. ✓ Examined all generated files and understood their purpose
6. ✓ Learned about golden models and test patterns
7. ✓ (Optionally) Simulated the testbench with a Verilog simulator

### Key Takeaways

- **Automation**: LLMs can significantly reduce manual effort in testbench creation
- **Natural Language Specs**: The description drives both the test patterns and the golden model, so a clear description matters
- **Golden Models**: Python reference models provide independent verification
- **Systematic Testing**: The pipeline generates test patterns by category (corner cases, boundary values, typical cases, random values) rather than ad hoc

### Next Steps

To continue exploring this system:

1. **Try Your Own Modules**: Use the pipeline with your own Verilog designs
2. **Experiment with Descriptions**: See how different descriptions affect test generation
3. **Add More Complex Modules**: Try sequential circuits or state machines
4. **Customize the Pipeline**: Modify the generation logic for specific needs

### Using the System from Command Line

You can also use the system from the command line:

```bash
# With example
python main.py --example

# With custom files
python main.py --description my_desc.txt --verilog my_module.v

# Run the interactive demo
python demo.py
```

### Additional Resources

- `README.md`: Comprehensive project documentation
- `USAGE_GUIDE.md`: Detailed usage instructions and best practices
- `PROJECT_SUMMARY.md`: Technical implementation details
- `examples/`: Sample input and output files

Thank you for using LLM-aided Testbench Generation!

---
## Appendix: File Listings

### All Generated Files

In [None]:
# List all generated files with sizes
import os
from pathlib import Path

print("Generated Files Summary:")
print("=" * 70)

for example in ['mux', 'adder']:
    output_dir = f"notebook_output/{example}"
    if os.path.exists(output_dir):
        print(f"\n{example.upper()}:")
        for file in sorted(Path(output_dir).glob('*')):
            if file.is_file():
                size = file.stat().st_size
                print(f"  - {file.name:30s} ({size:6d} bytes)")