# Generators - Alquimia Example

This notebook demonstrates how to use the Fair Forge generators module with **Alquimia** to create synthetic test datasets from context documents.

## Overview

The generators module provides tools for creating test datasets:
- **AlquimiaGenerator**: Wraps Alquimia's agent API for query generation
- **LocalMarkdownLoader**: Loads and chunks markdown files with hybrid splitting
- **Selection Strategies**: Control how chunks are selected and grouped (`SequentialStrategy`, `RandomSamplingStrategy`)
- **Conversation Mode**: Generate coherent multi-turn conversations

## Installation

First, install Fair Forge with Alquimia support and its dependencies. The cell below installs the locally built wheel from `../../dist` with the `generators-alquimia` extra.

In [None]:
import sys
!uv pip install --python {sys.executable} --force-reinstall "$(ls ../../dist/*.whl)[generators-alquimia]" -q

## Setup

Import the required modules and configure your Alquimia credentials.

**Note:** The AlquimiaGenerator requires an agent configured in your Alquimia workspace.
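
For non-interactive runs (for example in CI), the credentials could be read from environment variables before falling back to a prompt. The sketch below is only one way to do this: `read_setting` is a hypothetical helper, and the environment variable names in the comments are assumptions, not names that Fair Forge or Alquimia define.

```python
import getpass
import os


def read_setting(env_name, prompt, default=None, secret=False):
    """Return the value of env_name, or ask the user if it is unset or empty."""
    value = os.environ.get(env_name)
    if value:
        return value
    # Hide typed input for secrets such as API keys.
    ask = getpass.getpass if secret else input
    return ask(prompt) or default


# Example usage (hypothetical variable names):
# ALQUIMIA_API_KEY = read_setting("ALQUIMIA_API_KEY", "Enter your Alquimia API key: ", secret=True)
# ALQUIMIA_URL = read_setting("ALQUIMIA_URL", "Enter Alquimia URL: ", default="https://api.alquimia.ai")
```

With the variables exported in the environment, the helper returns them without prompting, so the notebook can run unattended.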

In [None]:
import os
import sys
import json
from pathlib import Path

sys.path.insert(0, os.path.dirname(os.getcwd()))

from fair_forge.generators import (
    create_alquimia_generator,
    create_markdown_loader,
    RandomSamplingStrategy,
)
from fair_forge.schemas import Dataset

In [None]:
import getpass

ALQUIMIA_API_KEY = getpass.getpass("Enter your Alquimia API key: ")
ALQUIMIA_URL = input("Enter Alquimia URL (default: https://api.alquimia.ai): ") or "https://api.alquimia.ai"
ALQUIMIA_AGENT_ID = input("Enter your Agent ID: ")
ALQUIMIA_CHANNEL_ID = input("Enter your Channel ID: ")

## Create Context Loader

The context loader reads source documents and splits them into chunks for query generation.

The `LocalMarkdownLoader` uses a hybrid chunking strategy:
1. **Primary**: Split by markdown headers (H1, H2, H3)
2. **Fallback**: Split by character count for long sections without headers
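
The two steps above can be sketched roughly as follows. This is not `LocalMarkdownLoader`'s actual implementation: `hybrid_chunk` is a hypothetical helper, the `"header"` and `"size"` labels are assumed values, and the sketch ignores `min_chunk_size` as well as header-like lines that appear inside fenced code blocks.

```python
import re


def hybrid_chunk(text, max_chunk_size=2000, overlap=100, header_levels=(1, 2, 3)):
    """Split on markdown headers first, then by character count for long sections."""
    if overlap >= max_chunk_size:
        raise ValueError("overlap must be smaller than max_chunk_size")

    # Match header lines at exactly the requested levels, e.g. "# ", "## ", "### ".
    hashes = "|".join("#" * level for level in sorted(header_levels, reverse=True))
    header_re = re.compile(rf"^(?:{hashes})[ \t]+\S.*$", re.MULTILINE)

    # Section boundaries: every header start, plus offset 0 for any preamble.
    starts = [m.start() for m in header_re.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    bounds = starts + [len(text)]
    sections = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]

    chunks = []
    step = max_chunk_size - overlap
    for section in sections:
        if not section:
            continue
        if len(section) <= max_chunk_size:
            # Primary path: the header-delimited section fits in one chunk.
            chunks.append({"content": section, "chunking_method": "header"})
            continue
        # Fallback path: sliding window of max_chunk_size characters with overlap.
        for start in range(0, len(section), step):
            chunks.append(
                {"content": section[start:start + max_chunk_size], "chunking_method": "size"}
            )
            if start + max_chunk_size >= len(section):
                break
    return chunks


sample = "# Title\nShort intro.\n## Long section\n" + "word " * 1000
for piece in hybrid_chunk(sample, max_chunk_size=2000, overlap=100):
    print(piece["chunking_method"], len(piece["content"]))
```

Splitting on headers first keeps each chunk aligned with a topic in the document; the character-count fallback only applies when a single section is too long to fit in one chunk.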

In [None]:
# Create context loader with default settings
loader = create_markdown_loader(
    max_chunk_size=2000,
    min_chunk_size=200,
    overlap=100,
    header_levels=[1, 2, 3],
)

print("Context loader created successfully")

## Create Sample Content

Let's create a sample markdown file to demonstrate the generator:

In [None]:
# Create sample markdown content
sample_content = """# Fair Forge Documentation

Fair Forge is a performance-measurement library for evaluating AI models and assistants.

## Key Features

The library provides comprehensive metrics for:
- **Fairness**: Measure bias across different demographic groups
- **Toxicity**: Detect harmful or offensive language
- **Conversational Quality**: Evaluate dialogue coherence and relevance
- **Context Adherence**: Check if responses align with provided context

## Getting Started

To get started with Fair Forge, install the package using pip and create a retriever to load your test datasets.

### Basic Usage

Here's a simple example of running the toxicity metric:

```python
from fair_forge.metrics import Toxicity
results = Toxicity.run(MyRetriever)
```

## Architecture

Fair Forge follows a modular architecture with the following components:

1. **Core**: Base classes and interfaces
2. **Metrics**: Individual metric implementations
3. **Runners**: Test execution against AI systems
4. **Storage**: Backend for test datasets and results

Each component can be extended to support custom implementations.
"""

# Save to file
sample_file = Path("./sample_docs.md")
sample_file.write_text(sample_content, encoding="utf-8")
print(f"Sample content saved to: {sample_file}")

## Load and Chunk Content

Let's see how the loader chunks the markdown content:

In [None]:
# Load and chunk the markdown file
chunks = loader.load(str(sample_file))

print(f"Created {len(chunks)} chunks:\n")
for i, chunk in enumerate(chunks, 1):
    print(f"Chunk {i}: {chunk.chunk_id}")
    print(f"  Header: {chunk.metadata.get('header', 'N/A')}")
    print(f"  Method: {chunk.metadata.get('chunking_method', 'N/A')}")
    print(f"  Length: {len(chunk.content)} chars")
    print(f"  Preview: {chunk.content[:80]}...\n")

## Create Alquimia Generator

The `AlquimiaGenerator` wraps the Alquimia client as a LangChain-compatible model.

In [None]:
# Create Alquimia generator
generator = create_alquimia_generator(
    base_url=ALQUIMIA_URL,
    api_key=ALQUIMIA_API_KEY,
    agent_id=ALQUIMIA_AGENT_ID,
    channel_id=ALQUIMIA_CHANNEL_ID,
    use_structured_output=True,
)

print("Generator created successfully")
print(f"  Base URL: {generator.base_url}")
print(f"  Agent ID: {generator.agent_id}")

## Generate Queries from Single Chunk

Let's generate queries for a single chunk first:

In [None]:
# Generate queries for a single chunk
chunk = chunks[0]
print(f"Generating queries for chunk: {chunk.chunk_id}")
print(f"Content preview: {chunk.content[:100]}...\n")

queries = await generator.generate_queries(
    chunk=chunk,
    num_queries=3,
)

print(f"\nGenerated {len(queries)} queries:\n")
for i, q in enumerate(queries, 1):
    print(f"{i}. {q.query}")
    print(f"   Difficulty: {q.difficulty}")
    print(f"   Type: {q.query_type}\n")

## Generate Complete Dataset

Generate a complete test dataset from all chunks:

In [None]:
# Generate complete dataset
print("Generating complete dataset from markdown file...\n")

datasets = await generator.generate_dataset(
    context_loader=loader,
    source=str(sample_file),
    assistant_id="test-assistant",
    num_queries_per_chunk=3,
    language="english",
)

# With default SequentialStrategy, we get one dataset
dataset = datasets[0]

print(f"Generated {len(datasets)} dataset(s):")
print(f"  Session ID: {dataset.session_id}")
print(f"  Assistant ID: {dataset.assistant_id}")
print(f"  Language: {dataset.language}")
print(f"  Total queries: {len(dataset.conversation)}")
print(f"  Context length: {len(dataset.context)} chars\n")

print("Sample queries:")
for batch in dataset.conversation[:5]:
    print(f"  - [{batch.qa_id}] {batch.query}")

## Generate with Seed Examples

Guide the query generation style using seed examples:

In [None]:
# Generate with seed examples for style guidance
seed_examples = [
    "What are the main components of Fair Forge's architecture?",
    "How can I measure bias in my AI assistant's responses?",
    "What steps are needed to integrate Fair Forge with an existing pipeline?",
]

print("Generating with seed examples...")
print(f"Seed examples provided: {len(seed_examples)}\n")

datasets_with_seeds = await generator.generate_dataset(
    context_loader=loader,
    source=str(sample_file),
    assistant_id="test-assistant",
    num_queries_per_chunk=2,
    language="english",
    seed_examples=seed_examples,
)

dataset_seeds = datasets_with_seeds[0]
print(f"Generated {len(dataset_seeds.conversation)} queries with seed examples:")
for batch in dataset_seeds.conversation[:3]:
    print(f"  - {batch.query}")

## Chunk Selection Strategies

Strategies control how chunks are selected and grouped during generation.
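
As a rough mental model, a strategy can be thought of as a function that turns the full list of chunks into one or more groups, with each group becoming one dataset. The sketch below illustrates that idea with plain lists; `sequential_select` and `random_select` are hypothetical helpers, and sampling without replacement within a group is an assumption rather than confirmed library behavior.

```python
import random


def sequential_select(chunks):
    """Return a single group containing every chunk in document order."""
    return [list(chunks)]


def random_select(chunks, num_samples, chunks_per_sample, seed=None):
    """Return num_samples groups, each a random draw of chunks_per_sample chunks."""
    rng = random.Random(seed)  # a fixed seed makes the draws reproducible
    pool = list(chunks)
    k = min(chunks_per_sample, len(pool))  # never ask for more chunks than exist
    return [rng.sample(pool, k) for _ in range(num_samples)]


chunk_ids = ["intro", "features", "getting-started", "architecture"]
print(sequential_select(chunk_ids))
print(random_select(chunk_ids, num_samples=2, chunks_per_sample=2, seed=42))
```

Under this model, sequential selection yields one group (hence one dataset), while random sampling yields `num_samples` groups.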

### RandomSamplingStrategy

Randomly samples chunks multiple times to create diverse test datasets:

In [None]:
# Generate multiple datasets using random chunk sampling
strategy = RandomSamplingStrategy(
    num_samples=2,       # Create 2 datasets
    chunks_per_sample=2, # Each with 2 random chunks
    seed=42,             # For reproducibility
)

print(f"Strategy: {strategy}\n")

random_datasets = await generator.generate_dataset(
    context_loader=loader,
    source=str(sample_file),
    assistant_id="test-assistant",
    num_queries_per_chunk=2,
    selection_strategy=strategy,
)

print(f"Generated {len(random_datasets)} datasets:\n")
for i, ds in enumerate(random_datasets, 1):
    print(f"Dataset {i}: {len(ds.conversation)} queries")

## Conversation Mode

Generate coherent multi-turn conversations in which each question builds on the previous one:

In [None]:
# Generate coherent multi-turn conversations
print("Generating conversations (each turn builds on the previous)...\n")

conversation_datasets = await generator.generate_dataset(
    context_loader=loader,
    source=str(sample_file),
    assistant_id="test-assistant",
    num_queries_per_chunk=3,  # 3-turn conversations
    conversation_mode=True,   # Enable conversation mode
)

dataset_conv = conversation_datasets[0]
print(f"Generated {len(dataset_conv.conversation)} conversation turns:\n")

# Group by chunk to show conversation flow
current_chunk = None
for batch in dataset_conv.conversation:
    chunk_id = batch.agentic.get('chunk_id', 'N/A')
    turn_num = batch.agentic.get('turn_number', 0)
    
    if chunk_id != current_chunk:
        print(f"\n--- Conversation for: {chunk_id} ---")
        current_chunk = chunk_id
    
    print(f"  Turn {turn_num}: {batch.query}")

## Save Generated Dataset

Save the generated dataset to JSON for use with runners and metrics:

In [None]:
# Save dataset to JSON
output_file = Path("./generated_tests_alquimia.json")

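# mode="json" converts non-JSON types (e.g. datetimes) to serializable values
with open(output_file, "w", encoding="utf-8") as f:
    json.dump(dataset.model_dump(mode="json"), f, indent=2)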

print(f"Dataset saved to: {output_file}")
print(f"Total queries: {len(dataset.conversation)}")

## Note on Custom System Prompts

**Important:** The AlquimiaGenerator does not support custom system prompts in the same way as direct LangChain models, because the agent's system prompt is configured in the Alquimia workspace.

Instead, you can:
1. Use **seed examples** to guide the style of generated queries
2. Configure the agent's system prompt directly in your Alquimia workspace to accept `context`, `num_queries`, and `seed_examples` as template variables
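
To make the second option more concrete, the sketch below shows how a prompt template might consume the three variables. Alquimia's own template syntax is not shown in this notebook, so Python's `string.Template` is used purely as a stand-in; the prompt wording and the `render_prompt` helper are hypothetical.

```python
from string import Template

# Hypothetical prompt text; only the three variable names come from this notebook.
PROMPT = Template(
    "Generate $num_queries questions that can be answered from the context below.\n"
    "Match the style of these examples:\n$seed_examples\n\n"
    "Context:\n$context"
)


def render_prompt(context, num_queries, seed_examples=()):
    """Fill the template, formatting seed examples as a bullet list."""
    examples = "\n".join(f"- {e}" for e in seed_examples) or "- (none provided)"
    return PROMPT.substitute(
        context=context,
        num_queries=num_queries,
        seed_examples=examples,
    )


print(render_prompt(
    context="Fair Forge follows a modular architecture.",
    num_queries=2,
    seed_examples=["What are the main components of Fair Forge's architecture?"],
))
```

The point of the sketch is only that the workspace prompt needs a placeholder for each variable the generator sends; the exact placeholder syntax depends on Alquimia.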

For full control over the system prompt, use a LangChain model directly with `BaseGenerator` (see the OpenAI or Groq example notebooks).

## Cleanup

In [None]:
# Clean up sample files (missing_ok avoids an error if a file was never created)
sample_file.unlink(missing_ok=True)
output_file.unlink(missing_ok=True)
print("Sample files cleaned up")