# 🚀 Gretel to Opik Integration: Creating Q&A Datasets for Model Evaluation

**The Story**: You need high-quality Q&A datasets to evaluate your AI models, but creating them manually is time-consuming and expensive. This cookbook shows you how to use Gretel's synthetic data generation to create diverse, realistic Q&A datasets and import them into Opik for model evaluation and optimization.

**What you'll accomplish**:
1. Generate synthetic Q&A data using Gretel Data Designer
2. Convert it to Opik format
3. Import into Opik for model evaluation
4. See your dataset in the Opik UI

---

## 📋 Prerequisites

- **Gretel Account**: Sign up at [gretel.ai](https://gretel.ai) and get your API key
- **Comet Account**: Sign up at [comet.com](https://comet.com) for Opik access

Let's get started! 🎯

## 🛠️ **Two Approaches Available**

This cookbook demonstrates **two methods** for generating synthetic data with Gretel:

1. **Data Designer** (recommended for custom datasets): Create datasets from scratch with precise control
2. **Safe Synthetics** (recommended for existing data): Generate synthetic versions of existing datasets

We'll start with Data Designer, then show Safe Synthetics as an alternative.

## 💾 Step 1: Install Required Packages

We'll install the Gretel client, the Opik SDK, and pandas:

In [12]:
%pip install gretel-client opik pandas --upgrade --quiet

Note: you may need to restart the kernel to use updated packages.


## 🔐 Step 2: Authentication Setup

Let's authenticate with both Gretel and Opik:


In [13]:
import os
import getpass
import opik
import pandas as pd

print("🔐 Setting up authentication...")

# Set up Gretel API key
if "GRETEL_API_KEY" not in os.environ:
    os.environ["GRETEL_API_KEY"] = getpass.getpass("Enter your Gretel API key: ")

# Set up Opik (will prompt for API key if not configured)
opik.configure()

print("✅ Authentication completed!")

OPIK: Opik is already configured. You can check the settings by viewing the config file at /home/mavrick/.opik.config


🔐 Setting up authentication...
✅ Authentication completed!
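
In non-interactive runs (CI jobs, scheduled scripts) `getpass` has no terminal to prompt on, so it can be useful to fail fast when a key is missing. A minimal sketch of such a check, assuming a hypothetical helper named `require_env` that is not part of the Gretel or Opik SDKs:

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper for this cookbook,
    not a Gretel or Opik API.
    """
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the notebook")
    return value


# Usage with a stand-in mapping so the example runs without real credentials
fake_env = {"GRETEL_API_KEY": "example-key"}
print(require_env("GRETEL_API_KEY", env=fake_env))
```

Calling `require_env("GRETEL_API_KEY")` with no `env` argument reads from the real process environment.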


## 📊 Step 3: Generate Q&A Dataset with Gretel Data Designer

Now we'll use Gretel Data Designer to generate synthetic Q&A data. We'll create questions and answers about AI and machine learning:


In [14]:
from gretel_client.navigator_client import Gretel  # Data Designer is reached through the navigator client
from gretel_client.data_designer import columns as C
from gretel_client.data_designer import params as P

print("🤖 Setting up Q&A dataset generation with Gretel Data Designer...")

# Initialize Data Designer using the navigator_client and factory method
gretel_navigator = Gretel()  # This creates the navigator client
dd = gretel_navigator.data_designer.new(model_suite="apache-2.0")

# Add topic column (categorical sampler)
dd.add_column(
    C.SamplerColumn(
        name="topic",
        type=P.SamplerType.CATEGORY,
        params=P.CategorySamplerParams(
            values=[
                "neural networks", "deep learning", "machine learning", "NLP", 
                "computer vision", "reinforcement learning", "AI ethics", "data science"
            ]
        )
    )
)

# Add difficulty column
dd.add_column(
    C.SamplerColumn(
        name="difficulty",
        type=P.SamplerType.CATEGORY,
        params=P.CategorySamplerParams(
            values=["beginner", "intermediate", "advanced"]
        )
    )
)

# Add question column (LLM-generated)
dd.add_column(
    C.LLMTextColumn(
        name="question",
        prompt=(
            "Generate a challenging, specific question about {{ topic }} "
            "at {{ difficulty }} level. The question should be clear, focused, "
            "and something a student or practitioner might actually ask."
        )
    )
)

# Add answer column (LLM-generated)
dd.add_column(
    C.LLMTextColumn(
        name="answer",
        prompt=(
            "Provide a clear, accurate, and comprehensive answer to this {{ difficulty }}-level "
            "question about {{ topic }}: '{{ question }}'. The answer should be educational "
            "and directly address all aspects of the question."
        )
    )
)

print("📊 Generating Q&A dataset...")

# Generate the dataset
workflow_run = dd.create(num_records=20, wait_until_done=True)
synthetic_df = workflow_run.dataset.df

print(f"✅ Generated {len(synthetic_df)} Q&A pairs!")
print(f"\n📊 Dataset shape: {synthetic_df.shape}")
print(f"📋 Columns: {list(synthetic_df.columns)}")

# Display first few rows
print("\n📄 Sample data:")
synthetic_df.head(3)

🤖 Setting up Q&A dataset generation with Gretel Data Designer...
Logged in as mavrickrishi@gmail.com ✅
Using project: default-sdk-project-23e56962f6cd2e8
Project link: https://console.gretel.ai/proj_2yqdYGW3Ez9CKToEZQYN8VFMPnN
📊 Generating Q&A dataset...
[00:04:43] [INFO] 🚀 Submitting batch workflow
▶️ Creating Workflow: w_2zy6IaaJgChytHd3dG2KyIYstxL
▶️ Created Workflow Run: wr_2zy6Ibf5N7fbYalvzwoKNVe52vq
🔗 Workflow Run console link: https://console.gretel.ai/workflows/w_2zy6IaaJgChytHd3dG2KyIYstxL/runs/wr_2zy6Ibf5N7fbYalvzwoKNVe52vq
Fetching task logs for workflow run wr_2zy6Ibf5N7fbYalvzwoKNVe52vq
Got task wt_2zy6IdJs3BBeGquO4xgJpfnrUNo
Workflow run is now in status: RUN_STATUS_ACTIVE
[using-samplers-to-generate-2-columns] Task Status is now: RUN_STATUS_ACTIVE
[using-samplers-to-generate-2-columns] 2025-07-16 18:35:04.057095+00:00 Preparing step 'using-samplers-to-generate-2-columns'
[using-samplers-to-generate-2-columns] 2025-07-16 18:35:17.319534+00:00 Starting 'generate_columns_us

| | topic | difficulty | question | answer |
|---|-------|------------|----------|--------|
| 0 | AI ethics | beginner | Should AI systems be designed to always priori... | The design of AI systems should indeed priorit... |
| 1 | deep learning | beginner | How does the learning rate affect the converge... | The learning rate in a neural network determin... |
| 2 | AI ethics | beginner | Should a self-driving car be programmed to pri... | The question of whether a self-driving car sho... |
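
The LLM column prompts refer to earlier columns through `{{ column }}` placeholders, so each row's prompt is filled from values generated before it. A rough local sketch of that substitution, assuming simple Jinja-style name replacement; `render_prompt` and `missing_columns` are hypothetical helpers and are not part of the Gretel client:

```python
import re

# Matches {{ name }} with optional whitespace inside the braces
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def missing_columns(template, available):
    """Names referenced in the template that no earlier column provides."""
    return set(PLACEHOLDER.findall(template)) - set(available)


def render_prompt(template, row):
    """Fill {{ name }} placeholders from a row; raises KeyError on unknown names."""
    return PLACEHOLDER.sub(lambda m: str(row[m.group(1)]), template)


question_prompt = (
    "Generate a challenging, specific question about {{ topic }} "
    "at {{ difficulty }} level."
)
row = {"topic": "NLP", "difficulty": "beginner"}

print(missing_columns(question_prompt, set(row)))  # empty set: all names resolve
print(render_prompt(question_prompt, row))
```

Checking `missing_columns` before submitting a workflow can catch a prompt that references a column that was never added, such as using `{{ question }}` before the `question` column exists.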


## 🔄 Step 4: Convert to Opik Format

Let's convert our Gretel-generated data to the format Opik expects:

In [15]:
def convert_to_opik_format(df):
    """Convert Gretel Q&A data to Opik dataset format"""
    opik_items = []
    
    for _, row in df.iterrows():
        # Create Opik dataset item
        item = {
            "input": {
                "question": row["question"]
            },
            "expected_output": row["answer"],
            "metadata": {
                "topic": row.get("topic", "AI/ML"),
                "difficulty": row.get("difficulty", "unknown"),
                "source": "gretel_navigator"
            }
        }
        opik_items.append(item)
    
    return opik_items

print("🔄 Converting to Opik format...")

opik_data = convert_to_opik_format(synthetic_df)

print(f"✅ Converted {len(opik_data)} items to Opik format!")
print("\n📋 Sample converted item:")
import json
print(json.dumps(opik_data[0], indent=2))

🔄 Converting to Opik format...
✅ Converted 20 items to Opik format!

📋 Sample converted item:
{
  "input": {
    "question": "Should AI systems be designed to always prioritize human safety, even if it means sacrificing other values such as privacy or efficiency?"
  },
  "expected_output": "The design of AI systems should indeed prioritize human safety, but it's not accurate to frame the question as an absolute choice between safety and other values like privacy or efficiency. Here's why:\n\n1. **Human Safety**: AI systems should be designed to minimize harm to humans. This is a fundamental principle in AI ethics, as outlined in guidelines from organizations like the European Commission and the IEEE.\n\n2. **Other Values**: However, other values are also important. Privacy, for example, is a fundamental human right. Efficiency is crucial for the practical deployment of AI systems.\n\n3. **Balance**: Instead of prioritizing one value over the others, AI systems should be designed to bal

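Before uploading, it can help to check that every converted item has a usable question and answer, since LLM-generated columns may occasionally come back empty. A minimal sketch, assuming the item layout produced by `convert_to_opik_format`; `validate_items` is a hypothetical helper, not an Opik function:

```python
def validate_items(items):
    """Return (index, problem) pairs for items that do not match the expected layout."""
    problems = []
    for i, item in enumerate(items):
        question = item.get("input", {}).get("question")
        answer = item.get("expected_output")
        # isinstance also rejects NaN floats that pandas uses for missing values
        if not isinstance(question, str) or not question.strip():
            problems.append((i, "missing or empty question"))
        if not isinstance(answer, str) or not answer.strip():
            problems.append((i, "missing or empty expected_output"))
    return problems


sample_items = [
    {"input": {"question": "What is NLP?"}, "expected_output": "Natural language processing.", "metadata": {}},
    {"input": {"question": "  "}, "expected_output": None, "metadata": {}},
]
for index, problem in validate_items(sample_items):
    print(f"item {index}: {problem}")
```

An empty result from `validate_items(opik_data)` would suggest the batch is safe to insert.
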
## 📤 Step 5: Push Dataset to Opik

Now let's upload our dataset to Opik where it can be used for model evaluation:

In [16]:
print("📤 Pushing dataset to Opik...")

# Initialize Opik client
opik_client = opik.Opik()

# Create the dataset
dataset_name = "gretel-ai-qa-dataset"
dataset = opik_client.get_or_create_dataset(
    name=dataset_name,
    description="Synthetic Q&A dataset generated using Gretel Data Designer for AI/ML evaluation"
)

# Insert the data
dataset.insert(opik_data)

print(f"✅ Successfully created dataset: {dataset.name}")
print(f"🆔 Dataset ID: {dataset.id}")
print(f"📊 Total items: {len(opik_data)}")

📤 Pushing dataset to Opik...
HTTP Request: POST https://www.comet.com/opik/api/v1/private/datasets/retrieve "HTTP/1.1 200 OK"
HTTP Request: POST https://www.comet.com/opik/api/v1/private/datasets/items/stream "HTTP/1.1 200 OK"
HTTP Request: POST https://www.comet.com/opik/api/v1/private/datasets/items/stream "HTTP/1.1 200 OK"
HTTP Request: PUT https://www.comet.com/opik/api/v1/private/datasets/items "HTTP/1.1 204 No Content"
✅ Successfully created dataset: gretel-ai-qa-dataset
HTTP Request: POST https://www.comet.com/opik/api/v1/private/datasets/retrieve "HTTP/1.1 200 OK"
🆔 Dataset ID: 0198135a-fadc-7d0d-8900-6a475c9523ad
📊 Total items: 20


The dataset can now be viewed in the Opik UI:

![gretel_opik_integration](https://raw.githubusercontent.com/comet-ml/opik/main/apps/opik-documentation/documentation/fern/img/cookbook/gretel_opik_integration_cookbook.png)

## ✅ Step 6: Verify Your Dataset

Let's confirm the dataset was created successfully and see how to use it:


In [17]:
print("🔍 Verifying dataset creation...")

# Try to retrieve the dataset
try:
    retrieved_dataset = opik_client.get_dataset(dataset_name)
    print(f"✅ Dataset verified: {retrieved_dataset.name}")
    print(f"🆔 Dataset ID: {retrieved_dataset.id}")
    
    print("\n🎯 Next steps:")
    print("1. Go to https://www.comet.com")
    print("2. Navigate to Opik → Datasets")
    print(f"3. Find your dataset: {dataset_name}")
    print("4. Use it to evaluate your AI models!")
    
except Exception as e:
    print(f"❌ Could not verify dataset: {e}")
    print("Please check your Opik configuration and try again.")

🔍 Verifying dataset creation...
HTTP Request: POST https://www.comet.com/opik/api/v1/private/datasets/retrieve "HTTP/1.1 200 OK"
HTTP Request: POST https://www.comet.com/opik/api/v1/private/datasets/items/stream "HTTP/1.1 200 OK"
HTTP Request: POST https://www.comet.com/opik/api/v1/private/datasets/items/stream "HTTP/1.1 200 OK"
✅ Dataset verified: gretel-ai-qa-dataset
HTTP Request: POST https://www.comet.com/opik/api/v1/private/datasets/retrieve "HTTP/1.1 200 OK"
🆔 Dataset ID: 0198135a-fadc-7d0d-8900-6a475c9523ad

🎯 Next steps:
1. Go to https://www.comet.com
2. Navigate to Opik → Datasets
3. Find your dataset: gretel-ai-qa-dataset
4. Use it to evaluate your AI models!


## 🧪 Step 7: Example Model Evaluation

Here's how you can use your new dataset to evaluate a model with Opik:

In [18]:
# Example: Simple Q&A model evaluation
@opik.track
def simple_qa_model(input_data):
    """A simple example model that generates responses to questions"""
    question = input_data.get('question', '')
    
    # This is just an example - replace with your actual model
    if 'neural network' in question.lower():
        return "A neural network is a computational model inspired by biological neural networks."
    elif 'machine learning' in question.lower():
        return "Machine learning is a subset of AI that enables systems to learn from data."
    else:
        return "This is a complex AI/ML topic that requires detailed explanation."

print("🧪 Example model evaluation setup:")
print(f"Dataset: {dataset_name}")
print("Model: simple_qa_model (replace with your actual model)")
print("\n💡 To run an evaluation, pass this dataset and your model to Opik's evaluation tooling.")
print("\n🎉 Integration complete! Your Gretel-generated dataset is ready for model evaluation in Opik.")

🧪 Example model evaluation setup:
Dataset: gretel-ai-qa-dataset
Model: simple_qa_model (replace with your actual model)

💡 To run an evaluation, pass this dataset and your model to Opik's evaluation tooling.

🎉 Integration complete! Your Gretel-generated dataset is ready for model evaluation in Opik.
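
An evaluation ultimately compares each model output with the item's `expected_output`. A minimal local sketch of that loop, assuming items shaped like the converted data and using `difflib.SequenceMatcher` as a stand-in similarity score. This is not Opik's evaluation API, and `toy_model` and `score_items` are hypothetical names:

```python
from difflib import SequenceMatcher


def toy_model(input_data):
    """Placeholder model; swap in a real model call."""
    question = input_data.get("question", "").lower()
    if "machine learning" in question:
        return "Machine learning is a subset of AI that enables systems to learn from data."
    return "This topic requires a detailed explanation."


def score_items(items, model):
    """Return one character-level similarity score in [0, 1] per dataset item."""
    scores = []
    for item in items:
        output = model(item["input"])
        expected = item["expected_output"]
        scores.append(SequenceMatcher(None, output.lower(), expected.lower()).ratio())
    return scores


items = [
    {
        "input": {"question": "What is machine learning?"},
        "expected_output": "Machine learning is a subset of AI that enables systems to learn from data.",
    },
    {
        "input": {"question": "What is NLP?"},
        "expected_output": "NLP lets computers process human language.",
    },
]
scores = score_items(items, toy_model)
print([round(s, 2) for s in scores])
print(f"mean similarity: {sum(scores) / len(scores):.2f}")
```

Character-level similarity is a crude proxy for answer quality. It is used here only to keep the sketch dependency-free; the Opik documentation describes the metrics available for real evaluations.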


**Congratulations!** 🎉 You've successfully:

1. **Generated synthetic Q&A data** using Gretel Data Designer's advanced column types
2. **Converted the data** to Opik's expected format
3. **Created a dataset** in Opik for model evaluation
4. **Set up the foundation** for AI model testing and optimization

The key advantage of Gretel Data Designer is its modular approach: samplers define the categorical columns and LLM columns generate the text, so you control exactly what goes into your synthetic dataset.

---

## 🔗 Next Steps

- **View your dataset**: Go to your Comet workspace → Opik → Datasets
- **Evaluate models**: Use the dataset to test your Q&A models
- **Optimize prompts**: Use Opik's Agent Optimizer with your synthetic data
- **Scale up**: Generate larger datasets for more comprehensive testing

## 📚 Resources

- [Gretel Documentation](https://docs.gretel.ai/)
- [Opik Documentation](https://www.comet.com/docs/opik/)
- [Gretel Data Designer Guide](https://docs.gretel.ai/create-synthetic-data/gretel-data-designer/)

**Happy evaluating!** 🚀


# 🔄 Alternative: Using Gretel Safe Synthetics

If you have an existing Q&A dataset and want to create a synthetic version, you can use **Gretel Safe Synthetics** instead:

In [19]:
%%capture
%pip install -U gretel-client

## Step A: Prepare Sample Data

In [20]:
import pandas as pd
from gretel_client.navigator_client import Gretel

# Initialize Gretel client
gretel = Gretel(api_key="prompt")

# Option 1: Use Gretel's sample ecommerce dataset (has 200+ records)
# Defined for reference only; the steps below use original_df from Option 2.
my_data_source = "https://gretel-datasets.s3.us-west-2.amazonaws.com/ecommerce_customers.csv"

# Option 2: Create your own Q&A dataset (needs 200+ records for holdout)
# For demonstration, we'll create a larger dataset
sample_questions = [
    'What is machine learning?',
    'How do neural networks work?',
    'What is the difference between AI and ML?',
    'Explain deep learning concepts',
    'What are the applications of NLP?'
] * 50  # Repeat to get 250 records

sample_answers = [
    'Machine learning is a subset of AI that enables systems to learn from data.',
    'Neural networks are computational models inspired by biological neural networks.',
    'AI is the broader concept while ML is a specific approach to achieve AI.',
    'Deep learning uses multi-layer neural networks to model complex patterns.',
    'NLP applications include chatbots, translation, sentiment analysis, and text generation.'
] * 50  # Repeat to get 250 records

sample_data = {
    'question': sample_questions,
    'answer': sample_answers,
    'topic': (['ML', 'Neural Networks', 'AI/ML', 'Deep Learning', 'NLP'] * 50),
    'difficulty': (['beginner', 'intermediate', 'beginner', 'advanced', 'intermediate'] * 50)
}

original_df = pd.DataFrame(sample_data)
print(f"📄 Original dataset: {len(original_df)} records")
print(original_df.head())

# Important: Gretel requires at least 200 records to use holdout
if len(original_df) < 200:
    print("⚠️ Warning: Dataset has fewer than 200 records. Pass holdout=None to from_data_source.")

Found cached Gretel credentials
Logged in as mavrickrishi@gmail.com ✅
Using project: default-sdk-project-23e56962f6cd2e8
Project link: https://console.gretel.ai/proj_2yqdYGW3Ez9CKToEZQYN8VFMPnN
📄 Original dataset: 250 records
                                    question  \
0                  What is machine learning?   
1               How do neural networks work?   
2  What is the difference between AI and ML?   
3             Explain deep learning concepts   
4          What are the applications of NLP?   

                                              answer            topic  \
0  Machine learning is a subset of AI that enable...               ML   
1  Neural networks are computational models inspi...  Neural Networks   
2  AI is the broader concept while ML is a specif...            AI/ML   
3  Deep learning uses multi-layer neural networks...    Deep Learning   
4  NLP applications include chatbots, translation...              NLP   

     difficulty  
0      beginner  
1  interme
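
Repeating five rows 50 times reaches the 200-record threshold, but the dataset still contains only five distinct Q&A pairs, so a model trained on it has very little variety to learn from. A quick sketch for checking how much unique content a dataset holds, using plain Python lists rather than the DataFrame; the `count_unique_pairs` name and the shortened sample strings are illustrative assumptions:

```python
from collections import Counter


def count_unique_pairs(questions, answers):
    """Count how often each (question, answer) pair occurs."""
    return Counter(zip(questions, answers))


questions = ["What is machine learning?", "How do neural networks work?"] * 50
answers = ["ML learns from data.", "They pass signals through layers."] * 50

pair_counts = count_unique_pairs(questions, answers)
print(f"{len(questions)} records, {len(pair_counts)} unique pairs")
```

For a real evaluation set, a source dataset with many distinct pairs is likely to produce more useful synthetic records than a small set repeated to meet the size requirement.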

## Step B: Generate Synthetic Version

In [21]:
# Quick demo: run without holdout and without a transform step
synthetic_dataset = gretel.safe_synthetic_dataset \
    .from_data_source(original_df, holdout=None) \
    .synthesize(num_records=5) \
    .create()

# Wait for completion and get results
synthetic_dataset.wait_until_done()
synthetic_df_safe = synthetic_dataset.dataset.df

print(f"✅ Generated {len(synthetic_df_safe)} synthetic Q&A pairs using Safe Synthetics!")
print(synthetic_df_safe.head())

Configuring generator for data source: DataFrame (250, 4)
Configuring synthetic data generation model: tabular_ft/default
▶️ Creating Workflow: w_2zy6n4dFIDX4rrumaRrkAs83vkK
▶️ Created Workflow Run: wr_2zy6n9edEFTMeq6dI5dztpQj6TE
🔗 Workflow Run console link: https://console.gretel.ai/workflows/w_2zy6n4dFIDX4rrumaRrkAs83vkK/runs/wr_2zy6n9edEFTMeq6dI5dztpQj6TE
Fetching task logs for workflow run wr_2zy6n9edEFTMeq6dI5dztpQj6TE
Got task wt_2zy6n5psdskpY9SMCSmwf8TVdky
Workflow run is now in status: RUN_STATUS_ACTIVE
[read-data-source] Task Status is now: RUN_STATUS_ACTIVE
[read-data-source] 2025-07-16 18:39:07.251095+00:00 Preparing step 'read-data-source'
[read-data-source] 2025-07-16 18:39:14.903445+00:00 Starting 'data_source' task execution
[read-data-source] 2025-07-16 18:39:16.868988+00:00 Task 'data_source' executed successfully
[read-data-source] 2025-07-16 18:39:16.869498+00:00 Task execution completed. Saving task outputs.
[read-data-source] 2025-07-16 18:39:17.457615+00:00 Task o

## Step C: View Results and Quality Report

In [22]:
# Preview synthetic data
print("🔍 Synthetic dataset preview:")
print(synthetic_dataset.dataset.df.head())

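# View quality report table
# Note: print() shows only the rich Table's repr (as in the output below);
# rich.print(synthetic_dataset.report.table) renders the table itself.
print("📊 Quality Report Summary:")
print(synthetic_dataset.report.table)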

# View detailed HTML report in notebook
# synthetic_dataset.report.display_in_notebook()

# Access workflow details
print("\n🔧 Workflow Configuration:")
print(synthetic_dataset.config_yaml)

# List all workflow steps
print("\n📋 Workflow Steps:")
for step in synthetic_dataset.steps:
    print(f"- {step.name}")

🔍 Synthetic dataset preview:
                                    question  \
0                  What is machine learning?   
1             Explain deep learning concepts   
2               How do neural networks work?   
3             Explain deep learning concepts   
4  What is the difference between AI and ML?   

                                              answer            topic  \
0  Machine learning is a subset of AI that enable...               ML   
1  Deep learning uses multi-layer neural networks...    Deep Learning   
2  Neural networks are computational models inspi...  Neural Networks   
3  Deep learning uses multi-layer neural networks...    Deep Learning   
4  AI is the broader concept while ML is a specif...            AI/ML   

     difficulty  
0      beginner  
1      advanced  
2  intermediate  
3      advanced  
4      beginner  
📊 Quality Report Summary:
<rich.table.Table object at 0x720e11fbe9f0>

🔧 Workflow Configuration:
globals: {}
name: tabular-ft--evaluate

## Step D: Convert to Opik and Upload

In [23]:
def convert_to_opik_format(df):
    """Convert Gretel Q&A data to Opik dataset format"""
    opik_items = []
    
    for _, row in df.iterrows():
        # Create Opik dataset item
        item = {
            "input": {
                "question": row["question"]
            },
            "expected_output": row["answer"],
            "metadata": {
                "topic": row.get("topic", "AI/ML"),
                "difficulty": row.get("difficulty", "unknown"),
                "source": "gretel_safe_synthetics"
            }
        }
        opik_items.append(item)
    
    return opik_items

import opik  # already imported earlier; needed if this section runs on its own

# Initialize the Opik client
opik_client = opik.Opik()
# Convert and upload to Opik (same process as before)
opik_data_safe = convert_to_opik_format(synthetic_df_safe)

# Create dataset in Opik
dataset_safe = opik_client.get_or_create_dataset(
    name="gretel-safe-synthetics-qa-dataset",
    description="Synthetic Q&A dataset generated using Gretel Safe Synthetics"
)

dataset_safe.insert(opik_data_safe)
print(f"✅ Safe Synthetics dataset created: {dataset_safe.name}")

HTTP Request: POST https://www.comet.com/opik/api/v1/private/datasets/retrieve "HTTP/1.1 404 Not Found"
HTTP Request: POST https://www.comet.com/opik/api/v1/private/datasets "HTTP/1.1 201 Created"
HTTP Request: POST https://www.comet.com/opik/api/v1/private/datasets/retrieve "HTTP/1.1 200 OK"


OPIK: Created a "gretel-safe-synthetics-qa-dataset" dataset at https://www.comet.com/opik/api/v1/session/redirect/datasets/?dataset_id=01981490-5a78-77d9-8e53-016c576373e3&path=aHR0cHM6Ly93d3cuY29tZXQuY29tL29waWsvYXBpLw==.


HTTP Request: PUT https://www.comet.com/opik/api/v1/private/datasets/items "HTTP/1.1 204 No Content"
✅ Safe Synthetics dataset created: gretel-safe-synthetics-qa-dataset


The dataset can now be viewed in the Opik UI:

![gretel opik integration synthetics](https://raw.githubusercontent.com/comet-ml/opik/main/apps/opik-documentation/documentation/fern/img/cookbook/gretel_opik_integration_cookbook_synthetics.png)


## 🚨 **Important: Dataset Size Requirements**

| Dataset Size | Holdout Setting | Example |
|--------------|----------------|---------|
| **< 200 records** | `holdout=None` | `from_data_source(df, holdout=None)` |
| **200+ records** | Default (5%) or custom | `from_data_source(df)` or `from_data_source(df, holdout=0.1)` |
| **Large datasets** | Custom percentage/count | `from_data_source(df, holdout=250)` |
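
The table can be folded into a small helper that picks the `holdout` argument from the record count. A sketch under the thresholds stated in the table; `choose_holdout` is a hypothetical convenience function, not part of the Gretel client:

```python
def choose_holdout(num_records, requested=None, min_records=200):
    """Pick a holdout value for from_data_source based on dataset size.

    Returns None (holdout disabled) below `min_records`; otherwise returns the
    requested fraction or count, or the 5% default listed in the table.
    """
    if num_records < min_records:
        return None
    if requested is None:
        return 0.05
    return requested


for n in (50, 250, 10_000):
    print(n, choose_holdout(n))
print(10_000, choose_holdout(10_000, requested=250))
```

The result could then be passed along as `from_data_source(df, holdout=choose_holdout(len(df)))`.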

## 🤔 **When to Use Which Approach?**

| Use Case | Recommended Approach | Why |
|----------|---------------------|-----|
| **Creating new datasets from scratch** | **Data Designer** | More control, custom column types, guided generation |
| **Synthesizing existing datasets** | **Safe Synthetics** | Preserves statistical relationships, privacy-safe |
| **Custom data structures** | **Data Designer** | Flexible column definitions, template system |
| **Production data replication** | **Safe Synthetics** | Maintains data utility while ensuring privacy |

Both approaches integrate seamlessly with Opik for model evaluation! 🎯
