# Fair Forge Generators - Groq Example

This notebook demonstrates how to use the Fair Forge generators module with **Groq Cloud** for ultra-fast synthetic test dataset generation.

## Overview

The `GroqGenerator` uses LangChain to call Groq's inference API, which serves open-source LLMs with very low latency.

### Why Groq?
- **Speed**: Up to 10x faster than traditional cloud providers
- **Cost**: Competitive pricing for high-volume usage
- **Models**: Access to popular open-source models (Llama 3, GPT-OSS)

## Setup

1. Get your free API key from [Groq Console](https://console.groq.com/)

2. Set your Groq API key as an environment variable:

```bash
export GROQ_API_KEY="your-api-key"
```

Or create a `.env` file:
```bash
GROQ_API_KEY=your-api-key
```

3. Install required dependencies:
```bash
uv venv
source .venv/bin/activate
uv pip install ".[generators-groq]" python-dotenv
uv run jupyter lab
```

**Note:** Use `.[generators-groq]` to get the correct LangChain dependencies. The base `.[generators]` group does not include `langchain-groq`.

If you install packages while Jupyter is already running, **restart the kernel** so the changes take effect.

## Imports

In [1]:
import os
import json
from pathlib import Path
from dotenv import load_dotenv

from fair_forge.generators import (
    create_groq_generator,
    create_markdown_loader,
    GroqGenerator,
)
from fair_forge.schemas import Dataset, Batch

# Load environment variables
load_dotenv()

print("Imports loaded successfully")

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.


Imports loaded successfully


## Create Sample Content

Let's create a sample markdown document for testing:

In [2]:
sample_content = """# Machine Learning Fundamentals

This guide covers the basics of machine learning for beginners.

## Types of Machine Learning

Machine learning can be categorized into three main types:

### Supervised Learning
- Uses labeled training data
- Predicts outcomes based on input features
- Examples: Classification, Regression

### Unsupervised Learning
- Works with unlabeled data
- Discovers hidden patterns and structures
- Examples: Clustering, Dimensionality Reduction

### Reinforcement Learning
- Agent learns through interaction with environment
- Maximizes cumulative reward
- Examples: Game playing, Robotics

## Model Evaluation

Key metrics for evaluating ML models:

- **Accuracy**: Proportion of correct predictions
- **Precision**: True positives among predicted positives
- **Recall**: True positives among actual positives
- **F1 Score**: Harmonic mean of precision and recall

## Best Practices

1. Split data into train/validation/test sets
2. Use cross-validation for robust evaluation
3. Monitor for overfitting
4. Document your experiments
"""

# Save to file
sample_file = Path("./ml_fundamentals.md")
sample_file.write_text(sample_content, encoding="utf-8")
print(f"Sample content saved to: {sample_file}")

Sample content saved to: ml_fundamentals.md


## Create Context Loader

In [3]:
# Create markdown loader
loader = create_markdown_loader(
    max_chunk_size=2000,
    header_levels=[1, 2, 3],
)

# Preview chunks
chunks = loader.load(str(sample_file))
print(f"Created {len(chunks)} chunks:\n")
for chunk in chunks:
    print(f"- {chunk.chunk_id}: {len(chunk.content)} chars")

2026-01-14 10:33:46.519 | INFO     | fair_forge.generators:create_markdown_loader:144 - Creating local markdown loader
2026-01-14 10:33:46.521 | INFO     | fair_forge.generators.context_loaders.local_markdown:load:138 - Loading markdown file: ml_fundamentals.md
2026-01-14 10:33:46.523 | INFO     | fair_forge.generators.context_loaders.local_markdown:load:186 - Created 7 chunks from ml_fundamentals.md


Created 7 chunks:

- machine_learning_fundamentals: 63 chars
- types_of_machine_learning: 58 chars
- supervised_learning: 111 chars
- unsupervised_learning: 119 chars
- reinforcement_learning: 116 chars
- model_evaluation: 252 chars
- best_practices: 147 chars
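
The chunk IDs above appear to be derived from the markdown headers. As a rough illustration of header-based chunking (a standalone sketch, not Fair Forge's actual implementation), the hypothetical helpers `slugify` and `split_by_headers` below split a document at the chosen header levels. The sketch ignores `max_chunk_size`, drops any text before the first header, and does not special-case `#` lines inside fenced code.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric words with underscores."""
    return "_".join(re.findall(r"[a-z0-9]+", title.lower()))


def split_by_headers(text: str, header_levels=(1, 2, 3)):
    """Split markdown into (chunk_id, content) pairs at the given header levels."""
    chunks = []
    current_id, current_lines = None, []
    for line in text.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match and len(match.group(1)) in header_levels:
            # A new header closes the chunk that was being collected.
            if current_id is not None:
                chunks.append((current_id, "\n".join(current_lines).strip()))
            current_id, current_lines = slugify(match.group(2)), []
        elif current_id is not None:
            current_lines.append(line)
    # Flush the final chunk.
    if current_id is not None:
        chunks.append((current_id, "\n".join(current_lines).strip()))
    return chunks


demo = "# Guide\n\nIntro text.\n\n## Part One\n\n- item\n"
for chunk_id, content in split_by_headers(demo):
    print(f"- {chunk_id}: {len(content)} chars")
```

Character counts from this sketch will not necessarily match the loader's output, since the real loader may treat whitespace and header text differently.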


## Create Groq Generator

The generator reads the API key from the `GROQ_API_KEY` environment variable.
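
Because the key comes from the environment, a missing variable would otherwise only surface at the first API call. A small guard can fail earlier with a clearer message. The helper name `require_env` is hypothetical and not part of Fair Forge.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it or add it to your .env file"
        )
    return value


# Intended use, before creating the generator:
#     require_env("GROQ_API_KEY")
```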

In [4]:
# Create Groq generator
generator = create_groq_generator(
    model_name="llama-3.1-8b-instant",
    temperature=0.4,
    max_tokens=2048,
    use_structured_output=True,
)

print(f"Groq generator created with model: {generator.model_name}")

2026-01-14 10:33:47.093 | INFO     | fair_forge.generators:create_groq_generator:117 - Creating Groq generator with model: llama-3.1-8b-instant
2026-01-14 10:33:47.119 | INFO     | fair_forge.generators.groq_generator:__init__:73 - Initializing Groq generator with model: llama-3.1-8b-instant


Groq generator created with model: llama-3.1-8b-instant


## Generate Test Dataset

With Groq's fast inference, generating queries for all seven chunks takes only a few seconds (4.48 seconds in the run below).

In [5]:
import time

async def generate_dataset():
    print("Generating test dataset with Groq...\n")
    
    start_time = time.time()
    
    dataset = await generator.generate_dataset(
        context_loader=loader,
        source=str(sample_file),
        assistant_id="ml-assistant",
        num_queries_per_chunk=3,
        language="english",
    )
    
    elapsed = time.time() - start_time
    
    print(f"Generated dataset in {elapsed:.2f} seconds:")
    print(f"  Session ID: {dataset.session_id}")
    print(f"  Total queries: {len(dataset.conversation)}\n")
    
    print("Generated queries:")
    for batch in dataset.conversation:
        difficulty = batch.agentic.get('difficulty', 'N/A')
        query_type = batch.agentic.get('query_type', 'N/A')
        print(f"  [{batch.qa_id}] ({difficulty}/{query_type})")
        print(f"    {batch.query}\n")
    
    return dataset

# Execute
dataset = await generate_dataset()

2026-01-14 10:33:47.418 | INFO     | fair_forge.generators.langchain_generator:generate_dataset:196 - Loading context from: ml_fundamentals.md
2026-01-14 10:33:47.420 | INFO     | fair_forge.generators.context_loaders.local_markdown:load:138 - Loading markdown file: ml_fundamentals.md
2026-01-14 10:33:47.422 | INFO     | fair_forge.generators.context_loaders.local_markdown:load:186 - Created 7 chunks from ml_fundamentals.md
2026-01-14 10:33:47.424 | INFO     | fair_forge.generators.langchain_generator:generate_dataset:198 - Loaded 7 chunks from source
2026-01-14 10:33:47.425 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:133 - Generating 3 queries for chunk machine_learning_fundamentals


Generating test dataset with Groq...



2026-01-14 10:33:48,076 - httpx - INFO - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2026-01-14 10:33:48.100 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:164 - Generated 3 queries for chunk machine_learning_fundamentals
2026-01-14 10:33:48.100 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:133 - Generating 3 queries for chunk types_of_machine_learning
2026-01-14 10:33:48,715 - httpx - INFO - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2026-01-14 10:33:48.718 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:164 - Generated 1 queries for chunk types_of_machine_learning
2026-01-14 10:33:48.719 | DEBUG    | fair_forge.generato

Generated dataset in 4.48 seconds:
  Session ID: 8b5e0dc1-5e50-4676-bf0a-f3d53b0552fd
  Total queries: 19

Generated queries:
  [machine_learning_fundamentals_q1] (medium/factual)
    What are the fundamental concepts of machine learning?

  [machine_learning_fundamentals_q2] (hard/application)
    How can you apply machine learning to real-world problems?

  [machine_learning_fundamentals_q3] (medium/comparative)
    What are the differences between supervised and unsupervised learning?

  [types_of_machine_learning_q1] (easy/factual)
    What are the three main categories of machine learning?

  [supervised_learning_q1] (medium/factual)
    What type of problems does the model predict outcomes for?

  [supervised_learning_q2] (hard/inferential)
    How does the model use labeled training data to make predictions?

  [supervised_learning_q3] (easy/application)
    Can you give an example of a problem type where the model's prediction is useful?

  [unsupervised_learning_q1] (medium/fa
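
The log above shows `Generated 1 queries for chunk types_of_machine_learning` even though three were requested, so the total can fall short of chunks × `num_queries_per_chunk`. The sketch below counts queries per chunk to spot such shortfalls. It assumes IDs follow the `<chunk_id>_q<N>` pattern seen in the output; `queries_per_chunk` and `short_chunks` are hypothetical helper names.

```python
from collections import Counter


def queries_per_chunk(qa_ids):
    """Count queries per chunk, assuming IDs look like '<chunk_id>_q<N>'."""
    # rsplit with maxsplit=1 cuts at the last "_q", so chunk IDs that
    # themselves contain "_q" are still handled.
    return Counter(qa_id.rsplit("_q", 1)[0] for qa_id in qa_ids)


def short_chunks(qa_ids, expected):
    """Return {chunk_id: count} for chunks with fewer queries than expected."""
    counts = queries_per_chunk(qa_ids)
    return {chunk: n for chunk, n in counts.items() if n < expected}


sample_ids = [
    "machine_learning_fundamentals_q1",
    "machine_learning_fundamentals_q2",
    "machine_learning_fundamentals_q3",
    "types_of_machine_learning_q1",
]
print(short_chunks(sample_ids, expected=3))
```

In the notebook, the same check could be run with `short_chunks([b.qa_id for b in dataset.conversation], expected=3)`. Chunks that produced no queries at all will not appear in the result, because they have no IDs to count.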

## Generate with Seed Examples

In [6]:
async def generate_with_seeds():
    seed_examples = [
        "What is the difference between supervised and unsupervised learning?",
        "How do you prevent overfitting in a machine learning model?",
        "When should you use precision vs recall as your primary metric?",
    ]
    
    print("Generating with seed examples...\n")
    
    dataset = await generator.generate_dataset(
        context_loader=loader,
        source=str(sample_file),
        assistant_id="ml-assistant",
        num_queries_per_chunk=2,
        seed_examples=seed_examples,
    )
    
    print(f"Generated {len(dataset.conversation)} queries:")
    for batch in dataset.conversation[:5]:
        print(f"  - {batch.query}")
    
    return dataset

# Execute
dataset_with_seeds = await generate_with_seeds()

2026-01-14 10:33:51.909 | INFO     | fair_forge.generators.langchain_generator:generate_dataset:196 - Loading context from: ml_fundamentals.md
2026-01-14 10:33:51.910 | INFO     | fair_forge.generators.context_loaders.local_markdown:load:138 - Loading markdown file: ml_fundamentals.md
2026-01-14 10:33:51.911 | INFO     | fair_forge.generators.context_loaders.local_markdown:load:186 - Created 7 chunks from ml_fundamentals.md
2026-01-14 10:33:51.912 | INFO     | fair_forge.generators.langchain_generator:generate_dataset:198 - Loaded 7 chunks from source
2026-01-14 10:33:51.913 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:133 - Generating 2 queries for chunk machine_learning_fundamentals


Generating with seed examples...



2026-01-14 10:33:52,813 - httpx - INFO - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2026-01-14 10:33:52.818 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:164 - Generated 2 queries for chunk machine_learning_fundamentals
2026-01-14 10:33:52.821 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:133 - Generating 2 queries for chunk types_of_machine_learning
2026-01-14 10:33:53,429 - httpx - INFO - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2026-01-14 10:33:53.433 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:164 - Generated 1 queries for chunk types_of_machine_learning
2026-01-14 10:33:53.434 | DEBUG    | fair_forge.generato

Generated 12 queries:
  - What are the fundamental concepts of machine learning that a beginner should understand?
  - How can a machine learning model be designed to handle imbalanced datasets?
  - What are the three main types of machine learning?
  - What are the key differences between supervised and unsupervised learning?
  - How can you balance the trade-off between precision and recall in a machine learning model?


## Save Generated Dataset

In [7]:
# Save dataset to JSON
output_file = Path("./generated_tests_groq.json")
with open(output_file, "w", encoding="utf-8") as f:
    json.dump(dataset.model_dump(), f, indent=2)

print(f"Dataset saved to: {output_file}")

Dataset saved to: generated_tests_groq.json
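
To confirm that the file round-trips, the saved JSON can be read back with the standard library. This sketch assumes the dump has a top-level `conversation` list whose items carry a `query` field, mirroring the attributes used earlier in the notebook. The actual schema may contain more fields, and `load_queries` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path


def load_queries(path):
    """Read a saved dataset file and return its query strings."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumption: a top-level "conversation" list of objects with a "query" key.
    return [item["query"] for item in data.get("conversation", [])]


# Self-contained demo using a temporary file shaped like the assumed dump.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.json"
    demo_payload = {"conversation": [{"qa_id": "intro_q1", "query": "What is ML?"}]}
    demo_path.write_text(json.dumps(demo_payload), encoding="utf-8")
    demo_queries = load_queries(demo_path)

print(demo_queries)
```

In the notebook, `load_queries(output_file)` should then return as many entries as `len(dataset.conversation)`, provided the assumed schema holds.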


## Available Groq Models

See the [Groq model documentation](https://console.groq.com/docs/models) for the current list of available models.

## Speed Comparison

To put a number on Groq's inference speed, the quick benchmark below times three consecutive `generate_queries` calls on the first chunk:

In [8]:
import time

async def benchmark_generation():
    """Benchmark generation speed."""
    times = []
    
    for i in range(3):
        start = time.perf_counter()  # monotonic clock, better suited to timing
        await generator.generate_queries(
            chunk=chunks[0],
            num_queries=3,
        )
        elapsed = time.perf_counter() - start
        times.append(elapsed)
        print(f"Run {i+1}: {elapsed:.2f}s")
    
    avg = sum(times) / len(times)
    print(f"\nAverage: {avg:.2f}s per chunk (3 queries)")

# Execute
await benchmark_generation()

2026-01-14 10:33:56.617 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:133 - Generating 3 queries for chunk machine_learning_fundamentals
2026-01-14 10:33:57,217 - httpx - INFO - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2026-01-14 10:33:57.221 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:164 - Generated 3 queries for chunk machine_learning_fundamentals
2026-01-14 10:33:57.223 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:133 - Generating 3 queries for chunk machine_learning_fundamentals


Run 1: 0.61s


2026-01-14 10:33:57,831 - httpx - INFO - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2026-01-14 10:33:57.834 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:164 - Generated 3 queries for chunk machine_learning_fundamentals
2026-01-14 10:33:57.835 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:133 - Generating 3 queries for chunk machine_learning_fundamentals


Run 2: 0.61s


2026-01-14 10:33:58,343 - httpx - INFO - HTTP Request: POST https://api.groq.com/openai/v1/chat/completions "HTTP/1.1 200 OK"
2026-01-14 10:33:58.348 | DEBUG    | fair_forge.generators.langchain_generator:generate_queries:164 - Generated 3 queries for chunk machine_learning_fundamentals


Run 3: 0.51s

Average: 0.58s per chunk (3 queries)
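
Three runs give only a rough picture. The same pattern can be factored into a small reusable helper that also reports the spread. This is a generic sketch: `time_async` is a hypothetical name, and `fake_request` stands in for a real generator call so the example runs without network access.

```python
import asyncio
import statistics
import time


async def time_async(make_call, runs=3):
    """Await make_call() `runs` times and return the elapsed seconds of each run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        await make_call()
        durations.append(time.perf_counter() - start)
    return durations


async def fake_request():
    # Stand-in for a network call such as generator.generate_queries(...).
    await asyncio.sleep(0.01)


durations = asyncio.run(time_async(fake_request, runs=3))
print(
    f"mean={statistics.mean(durations):.3f}s "
    f"min={min(durations):.3f}s max={max(durations):.3f}s"
)
```

Inside a notebook, where an event loop is already running, use `await time_async(...)` instead of `asyncio.run(...)`.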


## Cleanup
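
A minimal cleanup sketch, assuming the only files to remove are the two this notebook wrote (`ml_fundamentals.md` and `generated_tests_groq.json`). The helper `remove_files` is hypothetical and skips paths that do not exist.

```python
from pathlib import Path


def remove_files(paths):
    """Delete each path that is an existing file and return the ones removed."""
    removed = []
    for path in map(Path, paths):
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed


# File names match those written earlier in this notebook.
removed = remove_files(["./ml_fundamentals.md", "./generated_tests_groq.json"])
for path in removed:
    print(f"Removed: {path}")
```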