# Week 2 Video 1: Structured Outputs with Instructor

## Introduction

This notebook demonstrates how to use the **instructor** library to enforce structured outputs from OpenAI models using Pydantic schemas.

**Key Concepts:**
- **Instructor Library**: Patches the OpenAI client so that it accepts a `response_model` parameter
- **Pydantic Models**: Define the structure and validation rules for LLM responses
- **Structured Outputs**: LLM responses that are validated against a specific schema

**Why Use Structured Outputs?**
- **Type Safety**: Downstream code receives a typed object with known fields
- **Validation**: LLM responses are validated against the schema automatically
- **Reliability**: Fewer parsing errors and inconsistent response formats
- **Integration**: LLM outputs plug directly into existing code

---

## Imports

In [None]:
import openai
import instructor
from qdrant_client import QdrantClient
from pydantic import BaseModel, Field


## Part 1: Baseline - Standard OpenAI Response

First, let's see how a standard OpenAI API call works **without** structured outputs.

**Characteristics:**
- Returns raw text string
- No schema enforcement
- Requires manual parsing
- Response format can vary
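
To make the "requires manual parsing" point concrete, here is a minimal sketch of what hand-rolled parsing tends to look like. The two raw strings are hypothetical examples of how a model might phrase the same answer, not captured API output, and `parse_answer` is an illustrative helper that does not appear elsewhere in this notebook.

```python
import json


def parse_answer(raw_text):
    """Try to pull an "answer" field out of raw model text (hypothetical helper)."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        # The model replied with prose instead of JSON
        return None
    if not isinstance(data, dict):
        # Valid JSON, but not the object shape we hoped for
        return None
    return data.get("answer")


# Hypothetical raw responses to the same prompt
json_style = '{"answer": "I am an AI assistant."}'
prose_style = "Sure! My name is Assistant."

print(parse_answer(json_style))
print(parse_answer(prose_style))
```

Every caller has to handle the `None` case, which is the kind of boilerplate a schema-enforced response is meant to remove.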

In [None]:
prompt = """
You are a helpful assistant.
Return an answer to the question.
Question: What is your name?
"""

In [None]:
openai_response = openai.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[{"role": "system", "content": prompt}],
    temperature=0,
)

print(openai_response.choices[0].message.content)

In [None]:
openai_response

## Part 2: Add Instructor for Structured Outputs

Now let's use **instructor** to enforce a structured response format using Pydantic models.

**How Instructor Works:**
1. Wraps OpenAI client with `instructor.from_openai()`
2. Accepts `response_model` parameter (a Pydantic BaseModel)
3. Automatically validates LLM response against the schema
4. Returns a typed Pydantic object instead of raw text

**Key Method:** `create_with_completion()`
- Returns both the structured model AND the raw API response
- Useful for debugging and accessing raw metadata
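
Steps 2 and 3 above rest on Pydantic. The sketch below shows the validation step in isolation, without calling the API. It assumes Pydantic v2, and `AnswerModel` is a hypothetical stand-in for a `response_model`; it illustrates the idea only and is not instructor's actual implementation.

```python
from pydantic import BaseModel, Field, ValidationError


class AnswerModel(BaseModel):  # hypothetical stand-in for a response_model
    answer: str = Field(description="The answer to the question")


# JSON schema derived from the model; this is what describes the expected shape
schema = AnswerModel.model_json_schema()
print(schema["required"])

# A payload that matches the schema becomes a typed object
good = AnswerModel.model_validate_json('{"answer": "I am an assistant."}')
print(good.answer)

# A payload with the wrong field name is rejected
try:
    AnswerModel.model_validate_json('{"reply": "I am an assistant."}')
    is_valid = True
except ValidationError as exc:
    is_valid = False
    print(len(exc.errors()), "validation error(s)")
```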

In [None]:
client = instructor.from_openai(openai.OpenAI())

In [None]:
class RAGGenerationResponse(BaseModel):
    answer: str = Field(description="The answer to the question")


In [None]:
response, raw_response = client.chat.completions.create_with_completion(
    model="gpt-4.1-mini",
    messages=[{"role": "system", "content": prompt}],
    temperature=0,
    response_model=RAGGenerationResponse,
)


In [None]:
response

In [None]:
raw_response

In [None]:
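class RAGGenerationResponse(BaseModel):
    answer: str = Field(description="The answer to the question")
    reasoning: str = Field(description="The reasoning behind the answer")

In [None]:
# Re-run the request so the response reflects the updated model with `reasoning`
response, raw_response = client.chat.completions.create_with_completion(
    model="gpt-4.1-mini",
    messages=[{"role": "system", "content": prompt}],
    temperature=0,
    response_model=RAGGenerationResponse,
)

response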

## Part 3: Integrating Instructor with RAG Pipeline

Now let's integrate structured outputs into our RAG pipeline from Week 1.

**Changes from Week 1:**
- `generate_answer()` now uses instructor client
- Returns `RAGGenerationResponse` Pydantic model instead of raw string
- `rag_pipeline()` returns structured dict with both model and extracted fields

**Benefits:**
- Type-safe responses from RAG pipeline
- Automatic validation of LLM outputs
- Easier to extend with additional fields (reasoning, confidence, etc.)
- Better integration with downstream systems
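
As a rough illustration of the "easier to extend" point, the sketch below adds a bounded `confidence` field. `ExtendedRAGResponse` and its `confidence` field are hypothetical and are not used by the pipeline in this notebook; the example assumes Pydantic v2.

```python
from pydantic import BaseModel, Field, ValidationError


class ExtendedRAGResponse(BaseModel):  # hypothetical extension, not used below
    answer: str = Field(description="The answer to the question")
    reasoning: str = Field(description="The reasoning behind the answer")
    confidence: float = Field(
        ge=0.0, le=1.0, description="Self-reported confidence between 0 and 1"
    )


# A value inside the allowed range is accepted
ok = ExtendedRAGResponse(
    answer="The 10-inch tablet", reasoning="Highest rated match", confidence=0.8
)
print(ok.confidence)

# A value outside the range fails validation instead of leaking downstream
try:
    ExtendedRAGResponse(answer="x", reasoning="y", confidence=1.7)
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```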

**Pipeline Flow:**
1. **get_embedding()** - Convert query to vector
2. **retrieve_data()** - Semantic search in Qdrant
3. **process_context()** - Format retrieved products
4. **build_prompt()** - Construct system prompt
5. **generate_answer()** - LLM generates structured response ← **NEW: Uses instructor**
6. **rag_pipeline()** - Orchestrates all steps and returns structured result
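
Step 2 of the flow relies on nearest-neighbor search over embeddings. The toy sketch below shows the underlying idea with brute-force cosine similarity on hand-made 3-dimensional vectors. The vectors, item names, and the `top_k` helper are all made up for illustration; Qdrant performs this search with its own index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, items, k=2):
    """Return the k (item_id, score) pairs most similar to the query."""
    scored = [
        (item_id, cosine_similarity(query_vec, vec))
        for item_id, vec in items.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up toy "embeddings"; real ones have far more dimensions
toy_items = {
    "tablet": [0.9, 0.1, 0.0],
    "cable": [0.1, 0.9, 0.0],
    "kettle": [0.0, 0.1, 0.9],
}

results = top_k([1.0, 0.2, 0.0], toy_items, k=2)
for item_id, score in results:
    print(item_id, round(score, 3))
```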

In [None]:
class RAGGenerationResponse(BaseModel):
    answer: str = Field(description="The answer to the question")

In [None]:
# ============================================================================
# RAG PIPELINE FUNCTIONS WITH INSTRUCTOR STRUCTURED OUTPUTS
# ============================================================================

def get_embedding(text, model="text-embedding-3-small"):
    """
    Convert text to 1536-dimensional embedding vector using OpenAI.
    
    Args:
        text: Input text to embed (query or document)
        model: OpenAI embedding model name
        
    Returns:
        List of floats representing the embedding vector
    """
    response = openai.embeddings.create(
        input=text,
        model=model,
    )
    
    return response.data[0].embedding


def retrieve_data(query, qdrant_client, k=5):
    """
    Retrieve top-k most relevant products from Qdrant vector database.
    
    Args:
        query: User's question text
        qdrant_client: Connected Qdrant client instance
        k: Number of similar products to retrieve
        
    Returns:
        Dict with retrieved_context_ids, retrieved_context, 
        retrieved_context_ratings, and similarity_scores
    """
    # Convert query to embedding for semantic search
    query_embedding = get_embedding(query)
    
    # Search Qdrant for nearest neighbors using cosine similarity
    results = qdrant_client.query_points(
        collection_name="Amazon-items-collection-00",
        query=query_embedding,
        limit=k,
    )
    
    # Extract metadata from matching results
    retrieved_context_ids = []
    retrieved_context = []
    similarity_scores = []
    retrieved_context_ratings = []
    
    for result in results.points:
        retrieved_context_ids.append(result.payload["parent_asin"])
        retrieved_context.append(result.payload["description"])
        retrieved_context_ratings.append(result.payload["average_rating"])
        similarity_scores.append(result.score)
    
    return {
        "retrieved_context_ids": retrieved_context_ids,
        "retrieved_context": retrieved_context,
        "retrieved_context_ratings": retrieved_context_ratings,
        "similarity_scores": similarity_scores,
    }


def process_context(context):
    """
    Format retrieved product data into readable text for LLM.
    
    Args:
        context: Dict with retrieved_context_ids, retrieved_context, 
                and retrieved_context_ratings
                
    Returns:
        Formatted string with product information
    """
    formatted_context = ""
    
    # Use product_id rather than id to avoid shadowing the built-in id()
    for product_id, chunk, rating in zip(
        context["retrieved_context_ids"],
        context["retrieved_context"],
        context["retrieved_context_ratings"],
    ):
        formatted_context += f"- ID: {product_id}, rating: {rating}, description: {chunk}\n"
    
    return formatted_context


def build_prompt(preprocessed_context, question):
    """
    Construct the final prompt sent to the language model.
    
    Args:
        preprocessed_context: Formatted string of retrieved products
        question: User's original question
        
    Returns:
        Complete prompt with system instructions, context, and question
    """
    prompt = f"""
You are a shopping assistant that can answer questions about the products in stock.

You will be given a question and a list of context.

Instructions:
- You need to answer the question based on the provided context only.
- Never use the word "context"; refer to it as the available products.

Context:
{preprocessed_context}

Question:
{question}
"""
    
    return prompt


def generate_answer(prompt):
    """
    Generate structured answer using instructor library for type-safe responses.
    
    **KEY CHANGE FROM WEEK 1:** Now returns structured Pydantic model instead of raw string.
    
    Args:
        prompt: Complete prompt with instructions, context, and question
        
    Returns:
        RAGGenerationResponse Pydantic model with validated answer field
        
    Note:
        - Must use instructor client, NOT raw openai client
        - create_with_completion() returns (model, raw_response) tuple
        - Response IS the Pydantic model, not nested in .choices array
    """
    # CRITICAL: Create instructor client for structured outputs
    # This patches the OpenAI client to support response_model parameter
    instructor_client = instructor.from_openai(openai.OpenAI())
    
    # Use instructor's create_with_completion for structured output
    # Returns both the validated Pydantic model and raw API response
    response, raw_response = instructor_client.chat.completions.create_with_completion(
        model="gpt-4.1-mini",
        messages=[{"role": "system", "content": prompt}],
        temperature=0,
        response_model=RAGGenerationResponse  # Pydantic model enforces structure
    )
    
    # response is already a RAGGenerationResponse object - no need to extract from .choices
    return response


def rag_pipeline(question, top_k=5):
    """
    Complete RAG pipeline with structured outputs.
    
    **ENHANCED IN WEEK 2:** Now returns structured Pydantic models instead of raw text.
    
    Args:
        question: User's question about products
        top_k: Number of products to retrieve (default: 5)
        
    Returns:
        Dict with:
            - datamodel: RAGGenerationResponse Pydantic model
            - answer: Extracted answer string (for convenience)
            - question: Original user query
            - retrieved_context_ids: Product ASINs used
            - retrieved_context: Product descriptions used
            - similarity_scores: Relevance scores for each product
    """
    # Initialize Qdrant client
    qdrant_client = QdrantClient(url="http://localhost:6333")
    
    # Step 1: Retrieve relevant products from vector database
    retrieved_context = retrieve_data(question, qdrant_client, top_k)
    
    # Step 2: Format products into readable text
    preprocessed_context = process_context(retrieved_context)
    
    # Step 3: Build complete prompt with instructions and context
    prompt = build_prompt(preprocessed_context, question)
    
    # Step 4: Generate structured answer using instructor (NEW IN WEEK 2)
    answer = generate_answer(prompt)
    
    # Step 5: Package result with both structured model and metadata
    final_result = {
        "datamodel": answer,  # Full Pydantic model
        "answer": answer.answer,  # Extracted answer string for convenience
        "question": question,
        "retrieved_context_ids": retrieved_context["retrieved_context_ids"],
        "retrieved_context": retrieved_context["retrieved_context"],
        "similarity_scores": retrieved_context["similarity_scores"]
    }
    
    return final_result

In [None]:
# rag_pipeline() creates its own Qdrant client, so only the question is passed
output = rag_pipeline("Can I get a tablet? Please suggest me a good one.")

In [None]:
# The second positional argument is top_k (an int), not a Qdrant client
output = rag_pipeline("Can I get a charging cable? Please suggest me a good one.")


In [None]:
output

In [None]:
print(output["answer"])