# Fair Forge Generators - Alquimia Example

This notebook demonstrates how to use the Fair Forge generators module with **Alquimia** to create synthetic test datasets from context documents.

## Overview

The generators module provides:
- **BaseGenerator**: Base class that accepts any LangChain-compatible chat model
- **AlquimiaGenerator**: Adapter that wraps Alquimia's agent API as a LangChain model
- **AlquimiaChatModel**: LangChain-compatible adapter for Alquimia agents
- **BaseContextLoader**: Abstract interface for loading and chunking context documents
- **LocalMarkdownLoader**: Implementation for loading local markdown files with hybrid chunking

## How AlquimiaGenerator Works

The `AlquimiaGenerator` wraps the Alquimia client as a LangChain-compatible model. The `context`, `seed_examples`, and `num_queries` values are extracted from the system prompt and passed to the agent as extra data kwargs.

## Setup

Create a `.env` file:
```.env
ALQUIMIA_API_KEY=your-api-key
ALQUIMIA_URL=https://api.alquimia.ai
ALQUIMIA_AGENT_ID=your-agent-id
ALQUIMIA_CHANNEL_ID=your-channel-id
```

Install required dependencies:
```bash
uv venv
source .venv/bin/activate

# Option 1: Install fair-forge with Alquimia support
uv pip install "alquimia-fair-forge[generators-alquimia]" python-dotenv

# Option 2: Install separately
# uv pip install alquimia-fair-forge alquimia-client python-dotenv

uv run jupyter lab
```

**Note:** The `AlquimiaGenerator` requires an agent configured in your Alquimia workspace. The `context`, `seed_examples`, and `num_queries` values are passed to the agent as extra data and injected into the agent's system prompt.
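
Before launching the notebook, it can help to confirm that the four variables from the `.env` template are visible to Python. The sketch below uses only the standard library; the helper name `missing_env_vars` is illustrative and not part of Fair Forge.

```python
import os

# Variable names taken from the .env template above.
REQUIRED_VARS = (
    "ALQUIMIA_API_KEY",
    "ALQUIMIA_URL",
    "ALQUIMIA_AGENT_ID",
    "ALQUIMIA_CHANNEL_ID",
)


def missing_env_vars(environ=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All Alquimia environment variables are set")
```

Run it after `load_dotenv()` so that values from the `.env` file are included in the check.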

## Imports

In [None]:
import os
import json
from pathlib import Path
from dotenv import load_dotenv

from fair_forge.generators import (
    create_alquimia_generator,
    create_markdown_loader,
    AlquimiaGenerator,
    AlquimiaChatModel,
    BaseGenerator,
    LocalMarkdownLoader,
    # Strategies for chunk selection
    SequentialStrategy,
    RandomSamplingStrategy,
)
from fair_forge.schemas import Dataset, Batch

# Load environment variables
load_dotenv()

print("Imports loaded successfully")

## Step 1: Create Context Loader

The context loader reads source documents and splits them into chunks for query generation.

The `LocalMarkdownLoader` uses a hybrid chunking strategy:
1. **Primary**: Split by markdown headers (H1, H2, H3)
2. **Fallback**: Split by character count for long sections without headers
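
The two-stage idea can be sketched in plain Python. This is not the `LocalMarkdownLoader` implementation, only an illustration under simplifying assumptions: the function name `hybrid_chunks` is hypothetical, `min_chunk_size` is ignored, and `#` lines inside fenced code blocks would be mistaken for headers.

```python
import re


def hybrid_chunks(text, max_chunk_size=2000, overlap=100, header_levels=(1, 2, 3)):
    """Split markdown by headers, then by size for oversized sections."""
    if overlap >= max_chunk_size:
        raise ValueError("overlap must be smaller than max_chunk_size")

    # Match lines starting with exactly the requested number of '#' plus a space.
    hashes = "|".join("#" * level for level in sorted(header_levels, reverse=True))
    header_re = re.compile(rf"^(?:{hashes}) ", flags=re.MULTILINE)

    starts = [m.start() for m in header_re.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text that precedes the first header
    bounds = starts + [len(text)]
    sections = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]

    chunks = []
    step = max_chunk_size - overlap
    for section in sections:
        if not section:
            continue
        if len(section) <= max_chunk_size:
            chunks.append(section)  # primary: one chunk per header section
            continue
        # Fallback: fixed-size windows that share `overlap` characters.
        for start in range(0, len(section), step):
            chunks.append(section[start:start + max_chunk_size])
            if start + max_chunk_size >= len(section):
                break
    return chunks
```

Short sections stay intact as header-based chunks, while a long section is cut into overlapping windows so that text near a cut point appears in both neighbouring chunks.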

In [None]:
# Create context loader with default settings
loader = create_markdown_loader(
    max_chunk_size=2000,   # Maximum characters per chunk
    min_chunk_size=200,    # Minimum characters per chunk
    overlap=100,           # Overlap between size-based chunks
    header_levels=[1, 2, 3],  # Split on H1, H2, H3 headers
)

print("Context loader created successfully")

## Step 2: Create Sample Markdown Content

Let's create a sample markdown file to demonstrate the generator:

In [None]:
# Create sample markdown content
sample_content = """# Fair Forge Documentation

Fair Forge is a performance-measurement library for evaluating AI models and assistants.

## Key Features

The library provides comprehensive metrics for:
- **Fairness**: Measure bias across different demographic groups
- **Toxicity**: Detect harmful or offensive language
- **Conversational Quality**: Evaluate dialogue coherence and relevance
- **Context Adherence**: Check if responses align with provided context

## Getting Started

To get started with Fair Forge, install the package using pip:

```bash
pip install alquimia-fair-forge
```

Then create a retriever to load your test datasets and run metrics.

### Basic Usage

Here's a simple example of running the toxicity metric:

```python
from fair_forge.metrics import Toxicity

results = Toxicity.run(MyRetriever)
```

## Architecture

Fair Forge follows a modular architecture with the following components:

1. **Core**: Base classes and interfaces
2. **Metrics**: Individual metric implementations
3. **Runners**: Test execution against AI systems
4. **Storage**: Backend for test datasets and results

Each component can be extended to support custom implementations.
"""

# Save to file
sample_file = Path("./sample_docs.md")
sample_file.write_text(sample_content)
print(f"Sample content saved to: {sample_file}")

## Step 3: Load and Chunk Content

Let's see how the loader chunks the markdown content:

In [None]:
# Load and chunk the markdown file
chunks = loader.load(str(sample_file))

print(f"Created {len(chunks)} chunks:\n")
for i, chunk in enumerate(chunks, 1):
    print(f"Chunk {i}: {chunk.chunk_id}")
    print(f"  Header: {chunk.metadata.get('header', 'N/A')}")
    print(f"  Method: {chunk.metadata.get('chunking_method', 'N/A')}")
    print(f"  Length: {len(chunk.content)} chars")
    print(f"  Preview: {chunk.content[:80]}...\n")

## Step 4: Create Alquimia Generator

The `AlquimiaGenerator` wraps the Alquimia client as a LangChain-compatible model, allowing it to be used with the `BaseGenerator` interface.

In [None]:
# Create Alquimia generator using factory function
# NOTE: Set these environment variables or replace with your actual values
generator = create_alquimia_generator(
    base_url=os.getenv("ALQUIMIA_URL", "https://api.alquimia.ai"),
    api_key=os.getenv("ALQUIMIA_API_KEY", "your-api-key"),
    agent_id=os.getenv("ALQUIMIA_AGENT_ID", "your-agent-id"),
    channel_id=os.getenv("ALQUIMIA_CHANNEL_ID", "your-channel-id"),
    use_structured_output=True,
)

print("Generator created successfully")
print(f"  Base URL: {generator.base_url}")
print(f"  Agent ID: {generator.agent_id}")

### Alternative: Direct Instantiation

You can also create the generator directly, or use the `AlquimiaChatModel` with `BaseGenerator`:

In [None]:
# Direct instantiation
# generator = AlquimiaGenerator(
#     base_url="https://api.alquimia.ai",
#     api_key="your-api-key",
#     agent_id="your-agent-id",
#     channel_id="your-channel-id",
# )

# Or use AlquimiaChatModel directly with BaseGenerator
# model = AlquimiaChatModel(
#     base_url="https://api.alquimia.ai",
#     api_key="your-api-key",
#     agent_id="your-agent-id",
#     channel_id="your-channel-id",
# )
# generator = BaseGenerator(model=model)

print("See code comments for alternative approaches")

## Step 5: Generate Queries from Single Chunk

Let's generate queries for a single chunk first:

In [None]:
# Generate queries for a single chunk
async def generate_from_chunk():
    chunk = chunks[0]  # Use first chunk
    print(f"Generating queries for chunk: {chunk.chunk_id}")
    print(f"Content preview: {chunk.content[:100]}...\n")
    
    queries = await generator.generate_queries(
        chunk=chunk,
        num_queries=3,
    )
    
    print(f"Generated {len(queries)} queries:\n")
    for i, q in enumerate(queries, 1):
        print(f"{i}. {q.query}")
        print(f"   Difficulty: {q.difficulty}")
        print(f"   Type: {q.query_type}\n")
    
    return queries

# Execute (uncomment to run)
# queries = await generate_from_chunk()
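
The top-level `await` in the cell above works because Jupyter already runs an event loop. In a plain Python script there is no running loop, so the coroutine has to be started explicitly. A minimal sketch with a stand-in coroutine (`fake_generate` is hypothetical and replaces the real `generator.generate_queries` call):

```python
import asyncio


async def fake_generate(num_queries):
    """Stand-in for a network call; returns placeholder query strings."""
    await asyncio.sleep(0)  # yield control, as a real request would
    return [f"placeholder query {i}" for i in range(1, num_queries + 1)]


def main():
    # asyncio.run creates an event loop, runs the coroutine, then closes the loop.
    return asyncio.run(fake_generate(3))
```

Call `main()` from a script entry point. Inside the notebook, keep using `await`, because `asyncio.run` raises an error when a loop is already running.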

## Step 6: Generate Complete Dataset

Generate a complete test dataset from all chunks:

In [None]:
# Generate complete dataset
async def generate_full_dataset():
    print("Generating complete dataset from markdown file...\n")
    
    # generate_dataset returns list[Dataset]
    datasets = await generator.generate_dataset(
        context_loader=loader,
        source=str(sample_file),
        assistant_id="test-assistant",
        num_queries_per_chunk=3,
        language="english",
    )
    
    # With default SequentialStrategy, we get one dataset
    dataset = datasets[0]
    
    print(f"Generated {len(datasets)} dataset(s):")
    print(f"  Session ID: {dataset.session_id}")
    print(f"  Assistant ID: {dataset.assistant_id}")
    print(f"  Language: {dataset.language}")
    print(f"  Total queries: {len(dataset.conversation)}")
    print(f"  Context length: {len(dataset.context)} chars\n")
    
    print("Sample queries:")
    for batch in dataset.conversation[:5]:
        print(f"  - [{batch.qa_id}] {batch.query}")
    
    return datasets

# Execute (uncomment to run)
# datasets = await generate_full_dataset()

## Step 7: Generate with Seed Examples

Guide the query generation style using seed examples:

In [None]:
# Generate with seed examples for style guidance
async def generate_with_seeds():
    seed_examples = [
        "What are the main components of Fair Forge's architecture?",
        "How can I measure bias in my AI assistant's responses?",
        "What steps are needed to integrate Fair Forge with an existing pipeline?",
    ]
    
    print("Generating with seed examples...")
    print(f"Seed examples provided: {len(seed_examples)}\n")
    
    datasets = await generator.generate_dataset(
        context_loader=loader,
        source=str(sample_file),
        assistant_id="test-assistant",
        num_queries_per_chunk=2,
        language="english",
        seed_examples=seed_examples,
    )
    
    dataset = datasets[0]
    print(f"Generated {len(dataset.conversation)} queries")
    return datasets

# Execute (uncomment to run)
# datasets_with_seeds = await generate_with_seeds()

## Chunk Selection Strategies

Strategies control how chunks are selected and grouped during generation.

### RandomSamplingStrategy

Randomly samples chunks multiple times to create diverse test datasets:

In [None]:
async def generate_with_random_sampling():
    """Generate multiple datasets using random chunk sampling."""
    
    strategy = RandomSamplingStrategy(
        num_samples=2,       # Create 2 datasets
        chunks_per_sample=2, # Each with 2 random chunks
        seed=42,             # For reproducibility
    )
    
    print(f"Strategy: {strategy}\n")
    
    datasets = await generator.generate_dataset(
        context_loader=loader,
        source=str(sample_file),
        assistant_id="test-assistant",
        num_queries_per_chunk=2,
        selection_strategy=strategy,
    )
    
    print(f"Generated {len(datasets)} datasets:\n")
    for i, ds in enumerate(datasets):
        print(f"Dataset {i+1}: {len(ds.conversation)} queries")
    
    return datasets

# Execute (uncomment to run)
# random_datasets = await generate_with_random_sampling()
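
To make the sampling behaviour concrete, here is a small stand-alone sketch of seeded random sampling. It is an assumption about how such a strategy could work, not the actual `RandomSamplingStrategy` code, and the function name `sample_chunk_groups` is hypothetical.

```python
import random


def sample_chunk_groups(chunks, num_samples, chunks_per_sample, seed=None):
    """Draw `num_samples` groups, each without repeated chunks inside the group."""
    rng = random.Random(seed)  # private RNG, so the global random state is untouched
    k = min(chunks_per_sample, len(chunks))
    return [rng.sample(chunks, k) for _ in range(num_samples)]


groups = sample_chunk_groups(["c1", "c2", "c3", "c4", "c5"], 2, 2, seed=42)
```

With a fixed `seed` the same groups are drawn on every run, which is what makes a generated dataset reproducible. The same chunk may still appear in more than one group.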

## Conversation Mode

Generate coherent multi-turn conversations in which each question builds on the previous one:

In [None]:
async def generate_conversations():
    """Generate coherent multi-turn conversations."""
    
    print("Generating conversations (each turn builds on the previous)...\n")
    
    datasets = await generator.generate_dataset(
        context_loader=loader,
        source=str(sample_file),
        assistant_id="test-assistant",
        num_queries_per_chunk=3,  # 3-turn conversations
        conversation_mode=True,   # Enable conversation mode
    )
    
    dataset = datasets[0]
    print(f"Generated {len(dataset.conversation)} conversation turns:\n")
    
    # Group by chunk to show conversation flow
    current_chunk = None
    for batch in dataset.conversation:
        chunk_id = batch.agentic.get('chunk_id', 'N/A')
        turn_num = batch.agentic.get('turn_number', 0)
        
        if chunk_id != current_chunk:
            print(f"\n--- Conversation for: {chunk_id} ---")
            current_chunk = chunk_id
        
        print(f"  Turn {turn_num}: {batch.query}")
    
    return datasets

# Execute (uncomment to run)
# conversation_datasets = await generate_conversations()
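
The loop above prints a new header whenever the chunk ID changes, which assumes that turns from the same chunk are adjacent. If that is not guaranteed, grouping into a dictionary first is more robust. A sketch using plain tuples as stand-ins for the batches (the helper name `group_turns` is hypothetical):

```python
from collections import defaultdict


def group_turns(turns):
    """Group (chunk_id, turn_number, query) tuples by chunk, ordered by turn."""
    grouped = defaultdict(list)
    for chunk_id, turn_number, query in turns:
        grouped[chunk_id].append((turn_number, query))
    # Sort each conversation by turn number and keep only the query text.
    return {cid: [q for _, q in sorted(items)] for cid, items in grouped.items()}


demo_turns = [
    ("chunk_a", 2, "How is it installed?"),
    ("chunk_b", 1, "Which metrics exist?"),
    ("chunk_a", 1, "What is Fair Forge?"),
]
conversations = group_turns(demo_turns)
```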

## Note on Custom System Prompts

**Important:** The AlquimiaGenerator does not support custom system prompts in the same way as direct LangChain models, because the agent's system prompt is configured in the Alquimia workspace.

Instead, you can:
1. Use **seed examples** to guide the style of generated queries
2. Configure the agent's system prompt directly in your Alquimia workspace to accept `context`, `num_queries`, and `seed_examples` as template variables
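
As an illustration of the second option, the sketch below renders a prompt from the three variables with Python's `string.Template`. The prompt wording and the `$name` placeholder syntax are assumptions made for this example; the real template syntax is whatever your Alquimia workspace uses.

```python
from string import Template

# Hypothetical agent prompt; the real one is configured in the Alquimia workspace.
AGENT_PROMPT = Template(
    "You write test questions for an AI assistant.\n"
    "Context:\n$context\n\n"
    "$seed_examples\n"
    "Generate exactly $num_queries questions."
)


def render_prompt(context, num_queries, seed_examples=None):
    """Fill the template; the seed section is empty when no examples are given."""
    if seed_examples:
        bullet_list = "\n".join(f"- {example}" for example in seed_examples)
        seed_section = f"Example questions:\n{bullet_list}\n"
    else:
        seed_section = ""
    return AGENT_PROMPT.substitute(
        context=context,
        num_queries=num_queries,
        seed_examples=seed_section,
    )


prompt = render_prompt("Fair Forge docs", 3, ["What is Fair Forge?"])
```

Keeping the seed section optional means the same template works whether or not seed examples are supplied.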

For full control over the system prompt, use a LangChain model directly with `BaseGenerator` (see the OpenAI or Groq example notebooks).

In [None]:
# For custom system prompts, use LangChain models directly with BaseGenerator:
# 
# from langchain_openai import ChatOpenAI
# 
# model = ChatOpenAI(model="gpt-4o-mini")
# generator = BaseGenerator(model=model)
# 
# custom_prompt = """You are a QA specialist creating test questions...
# Context: {context}
# {seed_examples_section}
# Generate exactly {num_queries} technical questions.
# """
# 
# datasets = await generator.generate_dataset(
#     context_loader=loader,
#     source=str(sample_file),
#     assistant_id="test-assistant",
#     num_queries_per_chunk=2,
#     custom_system_prompt=custom_prompt,
# )

print("See generators_openai.ipynb or generators_groq.ipynb for custom prompt examples")

## Step 8: Save Generated Dataset

Save the generated dataset to JSON for use with runners and metrics:

In [None]:
# Save generated dataset to JSON
async def save_dataset(dataset: Dataset, output_path: str):
    output_file = Path(output_path)
    
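    # mode="json" converts non-JSON-native values (e.g. datetimes) to strings
    with open(output_file, "w", encoding="utf-8") as f:
        json.dump(dataset.model_dump(mode="json"), f, indent=2)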
    
    print(f"Dataset saved to: {output_file}")
    print(f"Total queries: {len(dataset.conversation)}")
    return output_file

# Example usage (uncomment after generating datasets in Step 6)
# await save_dataset(datasets[0], "./generated_tests.json")
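
Because `generate_dataset` returns a list, a strategy such as `RandomSamplingStrategy` can produce several datasets at once. The sketch below writes one JSON file per dataset and reads them back. It uses plain dictionaries as stand-ins for `dataset.model_dump()`, and the helper name `save_all` is hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_all(records, output_dir, prefix="generated_tests"):
    """Write each record to <prefix>_<n>.json and return the created paths."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, record in enumerate(records, start=1):
        path = out / f"{prefix}_{index}.json"
        path.write_text(json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8")
        paths.append(path)
    return paths


# Stand-in records; with Fair Forge these would come from dataset.model_dump().
demo_records = [
    {"session_id": "s1", "conversation": [{"qa_id": "q1", "query": "What is Fair Forge?"}]},
    {"session_id": "s2", "conversation": []},
]

with tempfile.TemporaryDirectory() as tmp:
    written = save_all(demo_records, tmp)
    reloaded = [json.loads(p.read_text(encoding="utf-8")) for p in written]
```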

## Step 9: Integration with Runners

Use the generated dataset with Fair Forge runners:

In [None]:
# Example integration with runners
# from fair_forge.runners import AlquimiaRunner
# from fair_forge.storage import create_local_storage

async def run_generated_tests(dataset: Dataset):
    """
    Example of running generated tests against an AI assistant.
    
    Uncomment and configure to use.
    """
    # # Configure runner
    # runner = AlquimiaRunner(
    #     base_url=os.getenv("ALQUIMIA_URL"),
    #     api_key=os.getenv("ALQUIMIA_API_KEY"),
    #     agent_id=os.getenv("ALQUIMIA_AGENT_ID"),
    #     channel_id=os.getenv("ALQUIMIA_CHANNEL_ID"),
    # )
    # 
    # # Run dataset
    # updated_dataset, summary = await runner.run_dataset(dataset)
    # 
    # print(f"Completed: {summary['successes']}/{summary['total_batches']} passed")
    # return updated_dataset
    pass

print("Integration example ready (uncomment to use)")

## Creating Custom Context Loaders

You can create custom context loaders by extending `BaseContextLoader`:

In [None]:
import json
from pathlib import Path

from fair_forge.schemas.generators import BaseContextLoader, Chunk


class JsonContextLoader(BaseContextLoader):
    """Example custom loader for JSON documents."""

    def load(self, source: str) -> list[Chunk]:
        """Load a JSON file and turn each top-level key into a chunk."""
        path = Path(source)
        with open(path, encoding="utf-8") as f:
            data = json.load(f)

        chunks = []
        # Example: each top-level key becomes a chunk
        for key, value in data.items():
            content = f"{key}: {json.dumps(value, indent=2)}"
            chunks.append(Chunk(
                content=content,
                chunk_id=f"json_{key}",
                metadata={"key": key, "source": str(path)},
            ))

        return chunks

print("Custom JsonContextLoader defined")
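
To see what such a loader produces without installing the library, here is a stand-alone sketch of the same one-chunk-per-key idea. `DemoChunk` is a stand-in for Fair Forge's `Chunk` and `chunk_json_text` is a hypothetical helper; both exist only for this illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class DemoChunk:
    """Stand-in for fair_forge's Chunk, for illustration only."""
    content: str
    chunk_id: str
    metadata: dict = field(default_factory=dict)


def chunk_json_text(raw, source="inline"):
    """Turn each top-level key of a JSON object into one DemoChunk."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        # A top-level list has no keys, so this chunking scheme does not apply.
        raise ValueError("expected a JSON object at the top level")
    return [
        DemoChunk(
            content=f"{key}: {json.dumps(value, indent=2)}",
            chunk_id=f"json_{key}",
            metadata={"key": key, "source": source},
        )
        for key, value in data.items()
    ]


demo = chunk_json_text('{"install": "pip install", "metrics": ["toxicity", "bias"]}')
```

The explicit type check matters because a JSON file whose top level is a list would otherwise fail on `.items()` with a less helpful error.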

## Cleanup

In [None]:
# Clean up sample files
if sample_file.exists():
    sample_file.unlink()
    print("Sample files cleaned up")