# Synthetic Question Generation: The Power of Prompts

## Why This Notebook Matters

When building RAG systems, one of the biggest challenges is generating good synthetic questions for evaluation and training. Most people focus on getting better models or more data, but **prompt design is often the factor that matters most**.

In this notebook, we'll compare two approaches to synthetic question generation and show how strongly the results depend on your prompt strategy.

## What You'll Learn

1. **How prompt design shapes everything** - Small changes in prompts lead to completely different outputs
2. **The importance of understanding user intent** - What kinds of questions will your users actually ask?
3. **How to choose the right approach** - Different prompts for different use cases
4. **Practical guidance** - How to design prompts for your specific domain

## The Setup

We'll use real conversations from the WildChat dataset and generate synthetic queries using two different approaches:
- **V1**: Search-focused (helping users find information)
- **V2**: Pattern-focused (finding similar conversations)

Pay close attention to how different the results are - this will help you understand why prompt engineering is so critical for your RAG system's success.


In [1]:
# Comparing Synthetic Question Generation Methods
# This notebook loads examples from the WildChat dataset and compares the v1 and v2 processors

import asyncio
import sys

# Add the utils directory to path
sys.path.append('../utils')
sys.path.append('..')

# Import our modules
from utils.dataloader import WildChatDataLoader

# Setup instructor client
import instructor

# Initialize instructor-patched OpenAI client
client = instructor.from_provider("openai/gpt-4o-mini", async_client=True)

print("Setup complete!")


Setup complete!


In [2]:
from typing import List, Dict, Any
from pydantic import BaseModel, Field

class SearchQueries(BaseModel):
    """Generated search queries that could lead to discovering a conversation."""
    chain_of_thought: str = Field(
        description="Chain of thought process for generating the search queries"
    )
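    queries: List[str] = Field(
        description="4-7 diverse search queries that users might type to find this conversation",
        # Pydantic v2 names for list bounds (min_items/max_items are deprecated).
        # The bounds are slightly looser than the 4-7 range requested above.
        min_length=3,
        max_length=8
    )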


async def synthetic_question_generation_v1(
    client,  # instructor-patched client
    messages: List[Dict[str, Any]],
) -> SearchQueries:
    """
    Generate diverse synthetic search queries from a chat conversation.
    
    As a product manager analyzing ChatGPT usage patterns, this function creates
    search queries that users might have typed to discover similar conversations.
    The queries should be diverse and cover different aspects of the conversation.
    
    Args:
        client: instructor-patched client
        messages: List of messages in the conversation (each with 'role' and 'content')
        
    Returns:
        SearchQueries object with 4-5 diverse search queries and reasoning
    """
    
    prompt = """
    You are a product manager analyzing ChatGPT usage patterns. Your goal is to understand 
    how users might search to find conversations like this one.
    
    Given this conversation, generate 4-5 diverse search queries that different users might 
    type when looking for similar help or information. The queries should:
    
    1. Cover different aspects of the conversation (technical terms, problem description, solution type)
    2. Vary in specificity (some broad, some specific)
    3. Use different phrasings and vocabulary levels
    4. Reflect natural user search behavior
    5. Include both question-style and keyword-style queries
    
    <conversation>
    {% for message in messages %}
        <message role="{{ message.role }}">
            {{ message.content }}
        </message>
    {% endfor %}
    </conversation>
    
    Generate queries that would realistically lead someone to discover this conversation.
    """
    
    response = await client.chat.completions.create(
        response_model=SearchQueries,
        messages=[
            {
                "role": "user", 
                "content": prompt
            }
        ],
        context={
            "messages": messages
        }
    )
    
    return response


async def synthetic_question_generation_v2(
    client,  # instructor-patched client
    messages: List[Dict[str, Any]],
) -> SearchQueries:
    """
    Generate search queries for finding conversations with similar patterns and characteristics.
    
    This version focuses on identifying conversation types, themes, and patterns that would be
    useful for researchers, content moderators, or analysts studying human-AI interactions.
    
    Args:
        client: instructor-patched client
        messages: List of messages in the conversation
        
    Returns:
        SearchQueries object with pattern-focused search queries
    """
    
    prompt = """
    You are a research analyst studying patterns in human-AI conversations from the WildChat dataset.
    Your goal is to identify the key characteristics and patterns in this conversation that would help
    researchers find similar types of conversations.
    
    Analyze this conversation and generate search queries that would help find conversations with:
    - Similar content themes or domains (medical, creative, technical, etc.)
    - Similar user intents (seeking advice, creative collaboration, testing AI limits, etc.)
    - Similar interaction patterns (role-playing, Q&A, refusal situations, etc.)
    - Similar AI behaviors or response types
    
    Focus on generating queries that capture the ESSENCE and PATTERNS rather than specific details.
    
    Examples of good pattern queries:
    - "conversations where users ask about medical diagnoses"
    - "role-playing scenarios with fictional characters"
    - "conversations where AI refuses medical advice"
    - "creative writing collaborations"
    - "technical troubleshooting discussions"
    - "conversations testing AI content policies"
    - "users seeking relationship advice"
    - "educational Q&A about scientific concepts"
    
    <conversation>
    {% for message in messages %}
        <message role="{{ message.role }}">
            {{ message.content }}
        </message>
    {% endfor %}
    </conversation>
    
    Generate 5-7 search queries that focus on conversation patterns, themes, and characteristics
    rather than specific content details. Think about what makes this conversation type distinct
    and how researchers would categorize it.
    """
    
    response = await client.chat.completions.create(
        response_model=SearchQueries,
        messages=[
            {
                "role": "system",
                "content": "You are an expert conversation analyst specializing in categorizing and understanding patterns in human-AI interactions. Focus on identifying conversation types, themes, and structural patterns rather than specific content details."
            },
            {
                "role": "user", 
                "content": prompt
            }
        ],
        context={
            "messages": messages
        }
    )
    
    return response
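
Both prompts embed the conversation through the same `{% for message in messages %}` loop, filled in from the `context` argument. To get a feel for what the model actually receives, the sketch below approximates that expansion with plain string formatting so it runs without Jinja. The helper `render_conversation` and the sample messages are hypothetical, and the exact whitespace of the real rendered prompt will differ.

```python
from typing import Any, Dict, List


def render_conversation(messages: List[Dict[str, Any]]) -> str:
    """Approximate the <conversation> block that the prompt's Jinja loop expands to."""
    lines = ["<conversation>"]
    for message in messages:
        lines.append(f'    <message role="{message["role"]}">')
        lines.append(f'        {message["content"]}')
        lines.append("    </message>")
    lines.append("</conversation>")
    return "\n".join(lines)


# Hypothetical two-message conversation, not taken from WildChat.
sample_messages = [
    {"role": "user", "content": "How do I reverse a list in Python?"},
    {"role": "assistant", "content": "Use reversed(my_list) or my_list[::-1]."},
]

rendered = render_conversation(sample_messages)
print(rendered)
```

Printing the expanded prompt for one or two conversations is a cheap way to check that roles and contents land where you expect before spending API calls on generation.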


In [3]:
# Load examples from the WildChat dataset
print("Loading WildChat examples...")

# Initialize the dataloader with a reasonable limit for initial loading
loader = WildChatDataLoader(limit=50)  # Load the first 50 conversations to pick examples from

# Stream conversations and collect examples
examples = []
target_count = 5  # Aim for 5 good examples

for conversation in loader.stream_conversations(
    limit=target_count,
    min_message_length=50,
    filter_language='English',
    filter_toxic=True
):
    examples.append(conversation)
    print(f"Loaded example {len(examples)}: {conversation['conversation_hash'][:8]}...")
    
    if len(examples) >= target_count:
        break

Loading WildChat examples...
Loaded example 1: c9ec5b44...
Loaded example 2: cf1267ca...
Loaded example 3: e98d3e74...
Loaded example 4: 2e8fd255...
Loaded example 5: 59c72510...
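
Different hashes do not guarantee different content: in the results further down, examples 1, 3, and 4 show the same conversation preview. If you want each example to contribute something new, one option is to drop conversations whose opening message repeats. The sketch below is a minimal version of that idea. The helper `dedupe_by_first_message` and the demo records are hypothetical, and it assumes each example carries the `first_message` key used elsewhere in this notebook.

```python
from typing import Any, Dict, List


def dedupe_by_first_message(
    examples: List[Dict[str, Any]], prefix_length: int = 150
) -> List[Dict[str, Any]]:
    """Keep only the first example for each normalized opening-message prefix."""
    seen = set()
    unique = []
    for example in examples:
        # Normalize case and whitespace, then compare only the opening prefix.
        key = " ".join(example["first_message"].lower().split())[:prefix_length]
        if key in seen:
            continue
        seen.add(key)
        unique.append(example)
    return unique


# Hypothetical records that mimic the shape of the loader's output.
demo_examples = [
    {"conversation_hash": "aaaa1111", "first_message": "Help me plan a trip"},
    {"conversation_hash": "bbbb2222", "first_message": "help me  plan a trip"},
    {"conversation_hash": "cccc3333", "first_message": "Explain ECG leads"},
]

unique_examples = dedupe_by_first_message(demo_examples)
print([example["conversation_hash"] for example in unique_examples])
```

Matching on a prefix is a deliberately blunt heuristic. Conversations that start identically may still diverge later, so whether to drop them depends on what you want the examples to demonstrate.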


In [4]:
async def process_examples_async(examples_to_process: List[Dict], num_examples: int = 5):
    """Process examples using both v1 and v2 processors and return results for comparison"""
    
    results = {
        'v1_results': [],
        'v2_results': [],
        'conversations': [],
        'processing_times': {'v1': [], 'v2': []},  # reserved; not filled in below
        'errors': {'v1': [], 'v2': []}  # reserved; gather() below raises on the first failure instead
    }
    
    print(f"Processing {min(num_examples, len(examples_to_process))} examples with both processors...\n")
    
    # Build the coroutines up front; nothing runs until they are awaited via gather()
    tasks = []
    for i, example in enumerate(examples_to_process[:num_examples]):
        print(f"Preparing example {i+1}/{min(num_examples, len(examples_to_process))}: {example['conversation_hash'][:8]}")
        
        results['conversations'].append({
            'hash': example['conversation_hash'],
            'first_message': example['first_message'][:200],
            'length': example['conversation_length']
        })
        
        # Create tasks for both v1 and v2 processing
        v1_task = synthetic_question_generation_v1(client, example['conversation'])
        v2_task = synthetic_question_generation_v2(client, example['conversation'])
        tasks.extend([v1_task, v2_task])
    
    # Run all coroutines concurrently; gather() returns results in submission order
    print("Executing all processing tasks in parallel...")
    all_results = await asyncio.gather(*tasks)
    
    # Separate v1 and v2 results (even indices are v1, odd indices are v2)
    for i in range(0, len(all_results), 2):
        results['v1_results'].append(all_results[i])
        results['v2_results'].append(all_results[i + 1])
    
    return results

# Run the async processing
if examples:
    print("Starting async processing...")
    results = await process_examples_async(examples, num_examples=5)
    print("Processing complete!")
else:
    print("No examples to process!")


Starting async processing...
Processing 5 examples with both processors...

Preparing example 1/5: c9ec5b44
Preparing example 2/5: cf1267ca
Preparing example 3/5: e98d3e74
Preparing example 4/5: 2e8fd255
Preparing example 5/5: 59c72510
Executing all processing tasks in parallel...
Processing complete!
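
The `results` dictionary above reserves `processing_times` and `errors`, but the cell never fills them, and a single failed call would abort the whole `asyncio.gather`. The sketch below shows one way the bookkeeping could be done: each call is timed, concurrency is capped with a semaphore, and `return_exceptions=True` keeps one failure from cancelling the rest. `fake_generator`, `timed_call`, and `run_with_bookkeeping` are hypothetical names, and `fake_generator` stands in for the real V1/V2 functions so the example runs without an API key.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict, List, Tuple

Messages = List[Dict[str, Any]]


async def fake_generator(messages: Messages) -> List[str]:
    """Hypothetical stand-in for a V1/V2 call; fails on empty conversations."""
    await asyncio.sleep(0.01)
    if not messages:
        raise ValueError("empty conversation")
    return [f"query about {messages[0]['content']}"]


async def timed_call(
    generator: Callable[[Messages], Awaitable[Any]],
    messages: Messages,
    semaphore: asyncio.Semaphore,
) -> Tuple[Any, float]:
    """Run one generation call under the concurrency limit and time it."""
    async with semaphore:
        start = time.perf_counter()
        result = await generator(messages)
        return result, time.perf_counter() - start


async def run_with_bookkeeping(
    generator: Callable[[Messages], Awaitable[Any]],
    conversations: List[Messages],
    max_concurrency: int = 4,
) -> Dict[str, list]:
    """Run all calls concurrently, recording timings and errors per conversation."""
    semaphore = asyncio.Semaphore(max_concurrency)
    calls = [timed_call(generator, conv, semaphore) for conv in conversations]
    # return_exceptions=True returns failures as values instead of raising.
    outcomes = await asyncio.gather(*calls, return_exceptions=True)

    report: Dict[str, list] = {"results": [], "processing_times": [], "errors": []}
    for index, outcome in enumerate(outcomes):
        if isinstance(outcome, Exception):
            report["errors"].append((index, repr(outcome)))
            report["results"].append(None)  # keep positions aligned with inputs
        else:
            result, elapsed = outcome
            report["results"].append(result)
            report["processing_times"].append(elapsed)
    return report


demo_conversations = [[{"role": "user", "content": "ECG basics"}], []]
report = asyncio.run(run_with_bookkeeping(fake_generator, demo_conversations))
print(report["results"], report["errors"])
```

Inside a notebook, where an event loop is already running, you would `await run_with_bookkeeping(...)` directly instead of calling `asyncio.run`.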


# Before We Compare: Set Your Expectations

We've just run the same 5 conversations through two different prompts.

**What you should watch for:**

1. **Completely different query types** - V1 and V2 will produce queries that serve entirely different purposes
2. **Different vocabulary and phrasing** - Same content, totally different language
3. **Different levels of specificity** - V1 is asked to mix broad and specific queries, while V2 is asked to generalize toward patterns
4. **Different user assumptions** - Each approach assumes different user goals

**The key insight**: Neither approach is "better" - they're solving different problems. The "right" approach depends on understanding what your users are actually trying to accomplish when they search your system.

Let's see this in action...


In [8]:
from IPython.display import Markdown, display
from jinja2 import Template

# Define the Jinja template
template_str = """# Synthetic Question Generation Comparison: V1 vs V2

## Overview
Comparing two different approaches to generating search queries from conversation data:
- **V1**: Direct query generation focused on search intent
- **V2**: Query generation focused on conversation patterns and user intents

---

{% for i in range(conversations|length) %}
## Example {{ i + 1 }}: {{ conversations[i].hash[:8] }}

**Conversation Preview:** {{ conversations[i].first_message[:150] }}...

**Length:** {{ conversations[i].length }} messages

### V1 Results (Search-Focused)
**Chain of Thought:** {{ v1_results[i].chain_of_thought }}

**Generated Queries:**
{% for query in v1_results[i].queries %}
{{ loop.index }}. {{ query }}
{% endfor %}

### V2 Results (Conversation Pattern-Focused)
**Chain of Thought:** {{ v2_results[i].chain_of_thought }}

**Generated Queries:**
{% for query in v2_results[i].queries %}
{{ loop.index }}. {{ query }}
{% endfor %}

---

{% endfor %}"""

# Create and render the template
template = Template(template_str)
formatted_results = template.render(
    conversations=results['conversations'],
    v1_results=results['v1_results'],
    v2_results=results['v2_results']
)

# Display the results
display(Markdown(formatted_results))


# Synthetic Question Generation Comparison: V1 vs V2

## Overview
Comparing two different approaches to generating search queries from conversation data:
- **V1**: Direct query generation focused on search intent
- **V2**: Query generation focused on conversation patterns and user intents

---


## Example 1: c9ec5b44

**Conversation Preview:** Hey there! Are you familiar with reality shifting? So, I’m refining a foolproof method for reality shifting and want to pick a destination. Want to he...

**Length:** 2 messages

### V1 Results (Search-Focused)
**Chain of Thought:** The conversation revolves around reality shifting, a popular concept where individuals attempt to enter a different reality or dimension mentally. The user is looking for creative ideas for a personalized reality-shifting destination, including specific elements like quests, characters, and the dynamics of unconsciousness. Queries should reflect these themes, including both technical terms and casual language.

**Generated Queries:**

1. what is reality shifting and how to customize it

2. unique reality shifting destination ideas

3. help me create a character for reality shifting

4. reality shifting goals and quests ideas

5. how to write a reality shifting story with specific requirements


### V2 Results (Conversation Pattern-Focused)
**Chain of Thought:** The conversation revolves around fantasy and imagination regarding reality shifting, with an emphasis on detailed world-building and personalized adventures. It shows a creative collaboration between the user and AI, where the user specifies unique requirements for a fictional narrative. The interaction pattern is conversational, with an engaging and detailed response from the AI that aligns with the user's intent. The themes include creativity, role-playing in fantastical contexts, and an emphasis on desired experiences rather than mundane details. This understanding of themes and patterns can guide researchers in finding similar conversations.

**Generated Queries:**

1. conversations about crafting personalized fantasy narratives

2. role-playing scenarios focused on imaginative realities

3. creative collaboration for fictional world-building

4. conversations about reality shifting experiences

5. users seeking adventure planning and storytelling

6. engaging dialogues about creating fictional quests

7. discussions on character development in fantasy settings


---


## Example 2: cf1267ca

**Conversation Preview:** Old age PT hx of DM, HTN, dyslipidemia His ECG I.II, aVF (MI) what is the highest risk factor for this condition?...

**Length:** 2 messages

### V1 Results (Search-Focused)
**Chain of Thought:** The conversation revolves around a patient with multiple risk factors for myocardial infarction. Queries should reflect medical terminology, patient history, and risk factors for heart conditions. Some queries should be broad, while others can be more specific to the medical condition discussed. Phrasing will vary to reflect how users might ask questions or use keywords.

**Generated Queries:**

1. What are the risk factors for myocardial infarction in elderly patients?

2. Myocardial infarction causes and risk factors

3. Old age diabetes hypertension dyslipidemia heart attack risk

4. ECG interpretations and myocardial infarction risk factors

5. How does age affect heart attack risk in diabetic patients?


### V2 Results (Conversation Pattern-Focused)
**Chain of Thought:** This conversation involves a user asking a medical question related to a specific patient scenario, requiring the AI to analyze risk factors for a condition (myocardial infarction). The themes include medical history, risk factors, and ECG interpretation. The user intent is clearly seeking medical advice, while the interaction pattern is a Q&A format. The AI's behavior is informative and analytical, providing a clear response based on the user's query. Therefore, the search queries should cover similar medical advice scenarios, risk factor assessments, Q&A interactions in healthcare, and other related patterns.

**Generated Queries:**

1. conversations seeking medical advice about cardiovascular risk

2. Q&A discussions about patient history and diagnostics

3. medical scenarios analyzing risk factors for diseases

4. AI providing analysis on ECG readings

5. conversations focused on elderly health risks and management

6. users asking about conditions linked to diabetes and hypertension

7. discussions regarding old age and its impact on health conditions


---


## Example 3: e98d3e74

**Conversation Preview:** Hey there! Are you familiar with reality shifting? So, I’m refining a foolproof method for reality shifting and want to pick a destination. Want to he...

**Length:** 2 messages

### V1 Results (Search-Focused)
**Chain of Thought:** The conversation revolves around reality shifting, a fantasy concept involving immersive adventures in imagined worlds, with specific user requests such as a quest, attractive characters, and a unique entry mechanism. Users might look for information on reality shifting, detailed world-building ideas, character creation for these adventures, or guidance on crafting engaging quests. The varied vocabulary and phrasing will help capture the diverse ways people might express their search needs.

**Generated Queries:**

1. How to create a personalized reality shifting scenario

2. Ideas for reality shifting quests

3. What are popular themes in reality shifting

4. How to start a reality shifting adventure

5. Unique character ideas for reality shifting


### V2 Results (Conversation Pattern-Focused)
**Chain of Thought:** This conversation centers around a request for creative collaboration in the context of reality shifting, emphasizing user-defined aspects of a fictional world. It exemplifies themes of imaginative storytelling, user intent for co-creation, and detailed response patterns typical in role-playing scenarios. To help researchers find similar conversations, the search queries should focus on creative collaborations, world-building prompts, role-playing interactions, and specific user intents around fictional narratives. Additionally, the format of the user's request indicates a desire for extensive detail, which should also be captured in the search queries.

**Generated Queries:**

1. creative collaboration in world-building

2. role-playing conversations about fictional realities

3. users seeking detailed adventures in imaginary settings

4. conversations where AI assists in developing personalized fantasy narratives

5. discussions about reality shifting and fictional quests

6. interactive storytelling with user-defined parameters

7. conversational prompts for imaginative journeys and quests


---


## Example 4: 2e8fd255

**Conversation Preview:** Hey there! Are you familiar with reality shifting? So, I’m refining a foolproof method for reality shifting and want to pick a destination. Want to he...

**Length:** 2 messages

### V1 Results (Search-Focused)
**Chain of Thought:** The conversation revolves around reality shifting, a technique for exploring personalized dream-like scenarios. It includes detailed requirements for crafting a fictional world, with specific elements like quests, characters, and unconsciousness. Users might search for terms related to reality shifting, personalized world-building, or enchanting narratives. Broad queries can capture general interest, while specific ones can focus on detailed aspects like character qualities or storytelling techniques.

**Generated Queries:**

1. What is reality shifting and how can I create my own world?

2. Tips for reality shifting to personalized fantasies

3. How to create quests and characters for a reality shift

4. Ideas for unique reality shifting destinations

5. What are the requirements for reality shifting adventures?


### V2 Results (Conversation Pattern-Focused)
**Chain of Thought:** This conversation features a user seeking creative collaboration with an AI about a fantasy scenario, specifically focusing on reality shifting. The user outlines specific requirements for their imagined world, demonstrating a high level of engagement in a role-playing exercise. The AI responds with detailed descriptions, fostering a rich narrative environment. The key elements include the themes of fantasy world-building, user intent of creative empowerment, and interactive storytelling mechanics. Therefore, the search queries should capture these elements and patterns in human-AI interactions effectively.

**Generated Queries:**

1. creative world-building scenarios with AI

2. role-playing conversations about fantasy realms

3. user-driven narrative development in AI chats

4. fantasy adventure planning with AI guidance

5. conversations about personalized reality shifting

6. AI-assisted storytelling for immersive experiences

7. exploring user fantasies through interactive dialogue


---


## Example 5: 59c72510

**Conversation Preview:** i wanna you to write me terms & conditions and policies for my  website...

**Length:** 2 messages

### V1 Results (Search-Focused)
**Chain of Thought:** Users might be looking for assistance in creating legal documents like terms and conditions for their website, or they might seek advice on the legal requirements for online businesses. Additionally, they could be searching for templates or legal services. The queries should reflect different levels of detail, some being direct questions while others are keyword-focused.

**Generated Queries:**

1. how to write terms and conditions for my website

2. legal advice for website policies

3. professional help for website terms and conditions

4. customizable legal document templates for websites

5. importance of legal policies for online businesses


### V2 Results (Conversation Pattern-Focused)
**Chain of Thought:** The conversation involves a user requesting legal documentation for their website, which the AI refuses to provide due to its limitations. This indicates a discussion centered around legal themes and emphasizes the AI's role in setting boundaries regarding its capabilities. The primary user intent appears to be seeking assistance in creating legal documents, while the interaction pattern reflects a refusal situation where the AI delineates its capabilities. To capture the essence of this interaction, search queries should focus on similar themes of legal advice, AI limitations, and refusal interactions.

**Generated Queries:**

1. conversations where users request legal document creation

2. AI refusal scenarios for legal advice

3. discussions about AI limitations in providing legal assistance

4. seeking advice on website terms and policies

5. interactions where users expect professional legal help from AI

6. conversations involving AI advice on legal matters

7. users asking for automated legal document generation


---
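
Reading the two query sets side by side gives a qualitative sense of how far apart they are. A rough way to put a number on the vocabulary difference is the Jaccard overlap of the word sets. The sketch below applies it to the Example 5 queries copied from the output above. The helpers `vocabulary` and `jaccard` are illustrative names, and word overlap is only a crude proxy: it ignores synonyms and word order, so treat the score as a sanity check rather than a quality metric.

```python
import re
from typing import List, Set


def vocabulary(queries: List[str]) -> Set[str]:
    """Collect the lowercased word set across a list of queries."""
    words: Set[str] = set()
    for query in queries:
        words.update(re.findall(r"[a-z0-9]+", query.lower()))
    return words


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Example 5 queries, copied from the rendered results above.
v1_queries = [
    "how to write terms and conditions for my website",
    "legal advice for website policies",
    "professional help for website terms and conditions",
    "customizable legal document templates for websites",
    "importance of legal policies for online businesses",
]
v2_queries = [
    "conversations where users request legal document creation",
    "AI refusal scenarios for legal advice",
    "discussions about AI limitations in providing legal assistance",
    "seeking advice on website terms and policies",
    "interactions where users expect professional legal help from AI",
    "conversations involving AI advice on legal matters",
    "users asking for automated legal document generation",
]

v1_vocab = vocabulary(v1_queries)
v2_vocab = vocabulary(v2_queries)
overlap = jaccard(v1_vocab, v2_vocab)

print(f"Vocabulary overlap (Jaccard): {overlap:.2f}")
print("Only in V2:", sorted(v2_vocab - v1_vocab))
```

The words that appear only in the V2 set are often the most informative part of the output, since they show which framing terms the pattern-focused prompt introduces.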

