# Pydantic with Hugging Face Models for Structured Outputs

This tutorial demonstrates how to implement structured data validation and tool calling with Pydantic using free or low-cost Hugging Face models.

## Overview

We'll build a customer support system that:
1. Validates user input with Pydantic
2. Generates structured customer queries
3. Performs tool calling with validation
4. Creates final support tickets

**Key difference:** We'll use free Hugging Face models (like Mistral-7B or Llama-3-8B) via the Hugging Face Inference API instead of paid APIs.

## Setup

First, let's install the required packages.

In [None]:
# Install required packages (the models below use the Pydantic v2 API: field_validator, model_validate_json)
!pip install "pydantic>=2" huggingface_hub -q

In [None]:
import os
from datetime import datetime
from typing import Optional, Literal
import json
import re
from pydantic import BaseModel, Field, field_validator, ValidationError
from huggingface_hub import InferenceClient

## Step 1: Configure Hugging Face

Get your free API token from [https://huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)

Then either:
- Set it as an environment variable: `export HF_TOKEN=your_token_here`
- Or paste it directly in the cell below (not recommended for shared notebooks)

In [None]:
# Option 1: Get from environment variable
HF_TOKEN = os.environ.get("HF_TOKEN")

# Option 2: If not set as environment variable, uncomment and paste your token here:
# HF_TOKEN = "your_token_here"

if not HF_TOKEN:
    raise ValueError("Please set your HF_TOKEN either as environment variable or in the cell above")

# Initialize the Hugging Face client
client = InferenceClient(token=HF_TOKEN)

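# We'll use Mistral-7B-Instruct: a small open model that follows instructions reasonably well.
# Availability on the free tier can change, so swap in one of the alternatives below if needed.
MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.2"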

# Alternative free models you can try:
# "meta-llama/Meta-Llama-3-8B-Instruct"
# "microsoft/Phi-3-mini-4k-instruct"
# "HuggingFaceH4/zephyr-7b-beta"

print(f"✓ Configured to use model: {MODEL_NAME}")

## Step 2: Define Pydantic Models

These models define the structure and validation rules for our data.

In [None]:
# Base User Input Model with custom validation
class UserInput(BaseModel):
    name: str = Field(..., description="Customer's full name")
    email: str = Field(..., pattern=r'^[\w\.-]+@[\w\.-]+\.\w+$', description="Valid email address")
    message: str = Field(..., min_length=5, description="Customer's message")
    order_id: Optional[str] = Field(None, description="Order ID format: ABC-12345 (3 letters, dash, 5 numbers)")
    
    @field_validator('order_id')
    @classmethod
    def validate_order_id(cls, v):
        if v is None:
            return v
        pattern = r'^[A-Z]{3}-\d{5}$'
        if not re.match(pattern, v):
            raise ValueError(f"Order ID must match format ABC-12345. Got: {v}")
        return v

# Customer Query Model (extends UserInput)
class CustomerQuery(UserInput):
    category: Literal["order_status", "product_question", "complaint", "password_reset"] = Field(
        ..., description="Category of customer query"
    )
    urgency: Literal["low", "medium", "high"] = Field(
        ..., description="Urgency level of the query"
    )
    sentiment: Literal["positive", "neutral", "negative"] = Field(
        ..., description="Sentiment of the customer message"
    )
    tags: list[str] = Field(
        default_factory=list, description="Relevant tags for categorization"
    )

# Tool argument models
class FAQLookupArgs(BaseModel):
    query: str = Field(..., description="Search query for FAQ lookup")
    tags: list[str] = Field(default_factory=list, description="Tags to filter FAQ results")

class CheckOrderStatusArgs(BaseModel):
    order_id: str = Field(..., description="Order ID format: ABC-12345")
    email: str = Field(..., description="Customer email for verification")
    
    @field_validator('order_id')
    @classmethod
    def validate_order_id(cls, v):
        pattern = r'^[A-Z]{3}-\d{5}$'
        if not re.match(pattern, v):
            raise ValueError(f"Order ID must match format ABC-12345. Got: {v}")
        return v

# Final output model
class OrderDetails(BaseModel):
    status: str
    estimated_delivery: Optional[str] = None
    note: Optional[str] = None

class SupportTicket(CustomerQuery):
    recommended_next_action: Literal["escalate_to_agent", "send_faq_response", "send_order_status", "no_action_needed"]
    order_details: Optional[OrderDetails] = None
    faq_response: Optional[str] = None
    creation_date: Optional[str] = None

print("✓ Pydantic models defined successfully")
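Later steps pass `model_json_schema()` output to the LLM, so it helps to see what that schema looks like. The following is a minimal sketch using a hypothetical, trimmed-down `MiniQuery` model (not part of the notebook) to show how a `Literal` field and a `min_length` constraint appear in the generated JSON Schema.

```python
import json
from typing import Literal

from pydantic import BaseModel, Field


# Hypothetical, trimmed-down model used only to illustrate the schema output.
class MiniQuery(BaseModel):
    message: str = Field(..., min_length=5, description="Customer's message")
    urgency: Literal["low", "medium", "high"] = Field(
        ..., description="Urgency level of the query"
    )


schema = MiniQuery.model_json_schema()

# The Literal field becomes an "enum", which tells the model the allowed values.
print(json.dumps(schema["properties"]["urgency"], indent=2))

# Fields declared with `...` are listed as required.
print("Required fields:", schema["required"])
```

Printing the schema for the real `CustomerQuery` model in the same way is a quick check of what the prompt in Step 5 will contain.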

## Step 3: Helper Functions for Hugging Face LLM Calls

In [None]:
def call_hf_model(prompt: str, model_name: Optional[str] = None, max_tokens: int = 1000, temperature: float = 0.3) -> Optional[str]:
    """
    Call a Hugging Face model and return the response text, or None on failure.
    Lower temperature gives more consistent structured outputs.
    """
    # Resolve the model at call time so later changes to MODEL_NAME take effect
    # (a default argument value would be frozen when the function is defined).
    model_name = model_name or MODEL_NAME
    try:
        messages = [{"role": "user", "content": prompt}]
        
        response = client.chat_completion(
            model=model_name,
            messages=messages,
            max_tokens=max_tokens,
            temperature=temperature
        )
        
        return response.choices[0].message.content
    except Exception as e:
        print(f"Error calling model: {e}")
        return None

def extract_json_from_response(response: str) -> str:
    """
    Extract JSON from model response that might contain additional text.
    """
    # Try to find JSON between code fences, preferring a fence tagged as json
    for fence in ("```json", "```"):
        if fence in response:
            start = response.find(fence) + len(fence)
            end = response.find("```", start)
            # If the closing fence is missing (e.g. truncated output), keep the rest
            if end == -1:
                end = len(response)
            return response[start:end].strip()
    
    # Otherwise take the span from the first "{" to the last "}"
    start = response.find("{")
    end = response.rfind("}") + 1
    if start != -1 and end > start:
        return response[start:end]
    
    return response.strip()

print("✓ Helper functions defined")
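The helper above locates JSON with string searches, which can misfire when the model adds trailing text that contains braces. As an alternative sketch (the `find_first_json_object` name is hypothetical and not part of the notebook), the standard library's `json.JSONDecoder.raw_decode` can parse the first valid JSON object and ignore whatever follows it.

```python
import json
from typing import Optional


def find_first_json_object(text: str) -> Optional[dict]:
    """Return the first parseable JSON object in `text`, or None."""
    decoder = json.JSONDecoder()
    position = text.find("{")
    while position != -1:
        try:
            # raw_decode parses one JSON value starting at `position`
            # and ignores any text that comes after it.
            parsed, _end = decoder.raw_decode(text, position)
            return parsed
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            position = text.find("{", position + 1)
    return None


sample = (
    'Sure! Here is the result: {"urgency": "high", "tags": ["order"]} '
    "Let me know {if} you need more."
)
print(find_first_json_object(sample))
```

Because this returns a parsed `dict`, it would pair with `model_validate` rather than `model_validate_json` in the later steps.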

## Step 4: Validate User Input

Let's test our validation with a sample input.

In [None]:
def validate_user_input(user_input_json: str) -> Optional[UserInput]:
    """
    Validate raw user input against UserInput model.
    """
    try:
        user_input = UserInput.model_validate_json(user_input_json)
        print("✓ User input validated successfully")
        return user_input
    except ValidationError as e:
        print("✗ Validation Error:")
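        for error in e.errors():
            # 'loc' can be empty (e.g. for malformed JSON), so join the path instead of indexing it
            location = ".".join(str(part) for part in error['loc']) or "input"
            print(f"  - {location}: {error['msg']}")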
        return None

# Test with valid input
user_input_json = json.dumps({
    "name": "Jane Smith",
    "email": "jane.smith@example.com",
    "message": "What is the status of my order?",
    "order_id": "ABC-12345"
})

validated_input = validate_user_input(user_input_json)
if validated_input:
    print(f"\nValidated data:\n{validated_input.model_dump_json(indent=2)}")

## Step 5: Create Customer Query with LLM

Use the LLM to analyze the user input and generate a structured CustomerQuery.

In [None]:
def create_customer_query(validated_user_input: UserInput, max_retries: int = 3) -> Optional[CustomerQuery]:
    """
    Use LLM to analyze user input and create a CustomerQuery with retries.
    """
    schema = CustomerQuery.model_json_schema()
    
    prompt = f"""You are a customer support AI. Analyze the following customer input and return ONLY a JSON object matching this schema:

Schema: {json.dumps(schema, indent=2)}

Customer Input:
{validated_user_input.model_dump_json(indent=2)}

Requirements:
- Return ONLY valid JSON, no additional text
- Include all required fields
- Choose appropriate category, urgency, and sentiment
- Add relevant tags

JSON Response:"""

    for attempt in range(max_retries):
        try:
            print(f"\nAttempt {attempt + 1} to generate CustomerQuery...")
            response = call_hf_model(prompt, temperature=0.2)
            
            if not response:
                continue
            
            # Extract JSON from response
            json_str = extract_json_from_response(response)
            print(f"Extracted JSON: {json_str[:200]}...")
            
            # Validate with Pydantic
            customer_query = CustomerQuery.model_validate_json(json_str)
            print("✓ CustomerQuery generated and validated successfully")
            return customer_query
            
        except ValidationError as e:
            print(f"✗ Validation error on attempt {attempt + 1}:")
            for error in e.errors():
                print(f"  - {error['loc']}: {error['msg']}")
            if attempt == max_retries - 1:
                print("Max retries reached")
                return None
        except json.JSONDecodeError as e:
            print(f"✗ JSON decode error: {e}")
            if attempt == max_retries - 1:
                return None
    
    return None

# Test it
customer_query = create_customer_query(validated_input)
if customer_query:
    print(f"\nFinal CustomerQuery:\n{customer_query.model_dump_json(indent=2)}")
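The retry loop above resends the same prompt on every attempt. One possible refinement, not part of the original notebook, is to tell the model what failed. The sketch below assumes a hypothetical `build_retry_prompt` helper that takes error entries in the shape returned by `ValidationError.errors()` (dicts with `loc` and `msg` keys) and appends them to the prompt.

```python
from typing import Sequence


def build_retry_prompt(base_prompt: str, errors: Sequence[dict]) -> str:
    """Append validation feedback to the original prompt."""
    lines = []
    for error in errors:
        # `loc` is a tuple path such as ("tags", 0); join it into "tags.0".
        location = ".".join(str(part) for part in error.get("loc", ())) or "(root)"
        lines.append(f"- {location}: {error.get('msg', 'invalid value')}")
    feedback = "\n".join(lines)
    return (
        f"{base_prompt}\n\n"
        "Your previous answer failed validation with these errors:\n"
        f"{feedback}\n"
        "Return corrected JSON only."
    )


# Example error entries in the shape Pydantic reports them.
example_errors = [
    {"loc": ("urgency",), "msg": "Input should be 'low', 'medium' or 'high'"},
    {"loc": ("email",), "msg": "Field required"},
]
retry_prompt = build_retry_prompt("Return a CustomerQuery as JSON.", example_errors)
print(retry_prompt)
```

In the loop above, this would mean replacing `prompt` with the returned string inside the `except ValidationError` branch before the next attempt.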

## Step 6: Set Up Mock Databases

Create mock databases for FAQ and orders to simulate real data.

In [None]:
# Mock FAQ Database
FAQ_DATABASE = [
    {
        "question": "How do I reset my password?",
        "answer": "Visit the 'Forgot Password' page, enter your email, and follow the reset link sent to you.",
        "keywords": ["password", "reset", "login", "access"]
    },
    {
        "question": "What is your return policy?",
        "answer": "We offer 30-day returns for unopened items with receipt. Contact support to initiate.",
        "keywords": ["return", "refund", "policy", "money back"]
    },
    {
        "question": "How long does shipping take?",
        "answer": "Standard shipping takes 5-7 business days. Express shipping is 2-3 business days.",
        "keywords": ["shipping", "delivery", "how long", "tracking"]
    }
]

# Mock Order Database
ORDER_DATABASE = {
    "ABC-12345": {
        "status": "In Transit",
        "estimated_delivery": "2025-10-08",
        "purchase_date": "2025-10-01",
        "email": "jane.smith@example.com"
    },
    "XYZ-98765": {
        "status": "Delivered",
        "estimated_delivery": "2025-10-02",
        "purchase_date": "2025-09-25",
        "email": "joe.user@example.com"
    },
    "DEF-55555": {
        "status": "Processing",
        "estimated_delivery": "2025-10-10",
        "purchase_date": "2025-10-03",
        "email": "bob.jones@example.com"
    }
}

print("✓ Mock databases created")

## Step 7: Define Tool Functions

These are the actual functions that will be called based on LLM decisions.

In [None]:
def lookup_faq_answer(args: FAQLookupArgs) -> str:
    """
    Search FAQ database for relevant answers.
    """
    query_lower = args.query.lower()
    tags_lower = [tag.lower() for tag in args.tags]
    
    for faq in FAQ_DATABASE:
        # Check if query matches keywords or tags match
        keywords_match = any(keyword in query_lower for keyword in faq["keywords"])
        tags_match = any(tag in faq["keywords"] for tag in tags_lower)
        
        if keywords_match or tags_match:
            return f"FAQ Answer: {faq['answer']}"
    
    return "Sorry, I couldn't find an FAQ answer for your question."

def check_order_status(args: CheckOrderStatusArgs) -> dict:
    """
    Look up order status from database.
    """
    order = ORDER_DATABASE.get(args.order_id)
    
    if not order:
        return {"error": "Order not found", "order_id": args.order_id}
    
    if order["email"] != args.email:
        return {"error": "Email does not match order records", "order_id": args.order_id}
    
    return {
        "order_id": args.order_id,
        "status": order["status"],
        "estimated_delivery": order["estimated_delivery"],
        "note": "Order ID and email match our records."
    }

# Test the tools
print("\n--- Testing FAQ Lookup ---")
faq_args = FAQLookupArgs(query="forgot password", tags=["password"])
print(lookup_faq_answer(faq_args))

print("\n--- Testing Order Status Check ---")
order_args = CheckOrderStatusArgs(order_id="ABC-12345", email="jane.smith@example.com")
print(json.dumps(check_order_status(order_args), indent=2))

## Step 8: Tool Calling with LLM

Let the LLM decide which tool to call based on the customer query.

In [None]:
# Define tool schemas
TOOL_DEFINITIONS = [
    {
        "name": "lookup_faq_answer",
        "description": "Search the FAQ database for answers to common questions",
        "parameters": FAQLookupArgs.model_json_schema()
    },
    {
        "name": "check_order_status",
        "description": "Check the status of a customer order using order ID and email",
        "parameters": CheckOrderStatusArgs.model_json_schema()
    }
]

def decide_next_action_with_tools(customer_query: CustomerQuery, max_retries: int = 3) -> Optional[dict]:
    """
    Use LLM to decide which tool to call based on customer query.
    """
    tools_description = json.dumps(TOOL_DEFINITIONS, indent=2)
    
    prompt = f"""You are a customer support AI. Based on the customer query below, decide if you should call any tools.

Available Tools:
{tools_description}

Customer Query:
{customer_query.model_dump_json(indent=2)}

Instructions:
- If the query mentions an order_id, call check_order_status
- If the query is about password, returns, or shipping, call lookup_faq_answer
- Return ONLY a JSON object in this format:

{{
    "should_call_tool": true/false,
    "tool_name": "tool_name_here" or null,
    "tool_arguments": {{}} or null,
    "reasoning": "brief explanation"
}}

JSON Response:"""

    for attempt in range(max_retries):
        try:
            print(f"\nAttempt {attempt + 1} to decide tool calling...")
            response = call_hf_model(prompt, temperature=0.1)
            
            if not response:
                continue
            
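            json_str = extract_json_from_response(response)
            tool_decision = json.loads(json_str)
            
            # The model may return valid JSON that is not an object (e.g. a list)
            if not isinstance(tool_decision, dict):
                print(f"✗ Expected a JSON object on attempt {attempt + 1}, got {type(tool_decision).__name__}")
                continue
            
            print(f"✓ Tool decision: {tool_decision.get('reasoning', 'No reasoning provided')}")
            return tool_decision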
            
        except json.JSONDecodeError as e:
            print(f"✗ JSON decode error on attempt {attempt + 1}: {e}")
            if attempt == max_retries - 1:
                print("Max retries reached")
                return None
    
    return None

# Test tool decision (skipped if Step 5 did not produce a query)
tool_decision = decide_next_action_with_tools(customer_query) if customer_query else None
if tool_decision:
    print(f"\nTool Decision:\n{json.dumps(tool_decision, indent=2)}")
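Step 8 parses the decision with `json.loads`, so the decision itself is not checked against a schema. A minimal sketch, assuming a hypothetical `ToolDecision` model that mirrors the JSON format requested in the prompt, shows how the same Pydantic validation could cover this step too.

```python
from typing import Any, Literal, Optional

from pydantic import BaseModel, ValidationError


# Hypothetical model mirroring the decision format requested in the prompt.
class ToolDecision(BaseModel):
    should_call_tool: bool
    tool_name: Optional[Literal["lookup_faq_answer", "check_order_status"]] = None
    tool_arguments: Optional[dict[str, Any]] = None
    reasoning: str = ""


good = ToolDecision.model_validate_json(
    '{"should_call_tool": true, "tool_name": "check_order_status", '
    '"tool_arguments": {"order_id": "ABC-12345", "email": "jane.smith@example.com"}, '
    '"reasoning": "The query includes an order ID."}'
)
print(good.tool_name)

# A tool name outside the Literal is rejected before anything is executed.
try:
    ToolDecision.model_validate_json(
        '{"should_call_tool": true, "tool_name": "delete_account"}'
    )
except ValidationError:
    print("Rejected unknown tool name")
```

Restricting `tool_name` to a `Literal` means an invented tool name fails validation early, and the existing retry loop could handle it like any other `ValidationError`.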

## Step 9: Execute Tool Calls with Validation

In [None]:
def execute_tool_call(tool_decision: dict) -> Optional[dict]:
    """
    Execute the tool call with Pydantic validation of arguments.
    """
    if not tool_decision.get("should_call_tool"):
        print("No tool call needed")
        return None
    
    tool_name = tool_decision.get("tool_name")
    tool_args = tool_decision.get("tool_arguments")
    
    if not tool_name or not tool_args:
        print("Invalid tool decision format")
        return None
    
    try:
        if tool_name == "lookup_faq_answer":
            # Validate arguments with Pydantic
            validated_args = FAQLookupArgs.model_validate(tool_args)
            print(f"✓ Tool arguments validated for {tool_name}")
            result = lookup_faq_answer(validated_args)
            return {"tool_name": tool_name, "result": result}
            
        elif tool_name == "check_order_status":
            # Validate arguments with Pydantic
            validated_args = CheckOrderStatusArgs.model_validate(tool_args)
            print(f"✓ Tool arguments validated for {tool_name}")
            result = check_order_status(validated_args)
            return {"tool_name": tool_name, "result": result}
            
        else:
            print(f"Unknown tool: {tool_name}")
            return None
            
    except ValidationError as e:
        print(f"✗ Tool argument validation error:")
        for error in e.errors():
            print(f"  - {error['loc']}: {error['msg']}")
        return None

# Execute the tool (skipped if Step 8 did not return a decision)
tool_output = execute_tool_call(tool_decision) if tool_decision else None
if tool_output:
    print(f"\nTool Output:\n{json.dumps(tool_output, indent=2)}")
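The `if`/`elif` chain above grows by one branch per tool. As the number of tools increases, a lookup table is one way to keep dispatch and validation in a single place. The sketch below uses hypothetical toy tools (`echo`, `add`) and a hypothetical `dispatch_tool` function; it is an illustration of the pattern, not the notebook's implementation.

```python
from typing import Any, Callable, Optional

from pydantic import BaseModel, ValidationError


class EchoArgs(BaseModel):  # hypothetical toy tool arguments
    text: str


class AddArgs(BaseModel):  # hypothetical toy tool arguments
    a: int
    b: int


def echo(args: EchoArgs) -> str:
    return args.text


def add(args: AddArgs) -> int:
    return args.a + args.b


# Maps each tool name to (argument model, function).
TOOL_REGISTRY: dict[str, tuple[type[BaseModel], Callable[[Any], Any]]] = {
    "echo": (EchoArgs, echo),
    "add": (AddArgs, add),
}


def dispatch_tool(tool_name: str, tool_args: dict) -> Optional[dict]:
    """Validate `tool_args` against the registered model, then run the tool."""
    entry = TOOL_REGISTRY.get(tool_name)
    if entry is None:
        print(f"Unknown tool: {tool_name}")
        return None
    args_model, tool_function = entry
    try:
        validated_args = args_model.model_validate(tool_args)
    except ValidationError as e:
        print(f"Invalid arguments for {tool_name}: {e.error_count()} error(s)")
        return None
    return {"tool_name": tool_name, "result": tool_function(validated_args)}


print(dispatch_tool("add", {"a": 2, "b": 3}))
print(dispatch_tool("add", {"a": "two", "b": 3}))
print(dispatch_tool("missing", {}))
```

With this layout, registering `lookup_faq_answer` and `check_order_status` would be two dictionary entries, and adding a tool would not require touching the dispatch logic.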

## Step 10: Generate Final Support Ticket

In [None]:
def generate_support_ticket(
    customer_query: CustomerQuery,
    tool_output: Optional[dict] = None,
    max_retries: int = 3
) -> Optional[SupportTicket]:
    """
    Generate final structured support ticket.
    """
    schema = SupportTicket.model_json_schema()
    
    tool_info = "No tools were called."
    if tool_output:
        tool_info = f"Tool called: {tool_output['tool_name']}\nResult: {json.dumps(tool_output['result'], indent=2)}"
    
    prompt = f"""You are a customer support AI. Create a final support ticket based on all information below.

Support Ticket Schema:
{json.dumps(schema, indent=2)}

Customer Query:
{customer_query.model_dump_json(indent=2)}

Tool Information:
{tool_info}

Instructions:
- Return ONLY valid JSON matching the SupportTicket schema
- Include order_details if order information was retrieved
- Include faq_response if FAQ was looked up
- Choose appropriate recommended_next_action
- Do NOT include creation_date (will be added automatically)

JSON Response:"""

    for attempt in range(max_retries):
        try:
            print(f"\nAttempt {attempt + 1} to generate support ticket...")
            response = call_hf_model(prompt, temperature=0.2)
            
            if not response:
                continue
            
            json_str = extract_json_from_response(response)
            
            # Parse and validate
            ticket_data = json.loads(json_str)
            
            # Add creation date
            ticket_data['creation_date'] = datetime.now().isoformat()
            
            # Validate with Pydantic
            support_ticket = SupportTicket.model_validate(ticket_data)
            print("✓ Support ticket generated and validated successfully")
            return support_ticket
            
        except ValidationError as e:
            print(f"✗ Validation error on attempt {attempt + 1}:")
            for error in e.errors():
                print(f"  - {error['loc']}: {error['msg']}")
            if attempt == max_retries - 1:
                print("Max retries reached")
                return None
        except json.JSONDecodeError as e:
            print(f"✗ JSON decode error: {e}")
            if attempt == max_retries - 1:
                return None
    
    return None

# Generate final ticket (skipped if Step 5 did not produce a query)
support_ticket = generate_support_ticket(customer_query, tool_output) if customer_query else None
if support_ticket:
    print(f"\n{'='*60}")
    print("FINAL SUPPORT TICKET")
    print('='*60)
    print(support_ticket.model_dump_json(indent=2))
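Because the ticket is regenerated by the model, fields that were already validated (name, email, order ID) can come back altered while still passing validation. One hedged safeguard, not in the original notebook, is to overwrite those fields with the trusted values before calling `SupportTicket.model_validate`. The sketch below uses plain dictionaries and a hypothetical `merge_trusted_fields` helper.

```python
def merge_trusted_fields(llm_data: dict, trusted: dict) -> dict:
    """Return a copy of `llm_data` where every key in `trusted` wins."""
    return {**llm_data, **trusted}


# Fields that were already validated before the model was called.
trusted_fields = {
    "name": "Jane Smith",
    "email": "jane.smith@example.com",
    "order_id": "ABC-12345",
}

# Hypothetical model output that altered the email and the order ID.
llm_ticket = {
    "name": "Jane Smith",
    "email": "jane@example.com",
    "order_id": "ABC-1234",
    "recommended_next_action": "send_order_status",
}

ticket_data = merge_trusted_fields(llm_ticket, trusted_fields)
print(ticket_data)
```

In the notebook, the trusted dictionary could plausibly come from `customer_query.model_dump(include={"name", "email", "message", "order_id"})`, applied just before the creation date is added.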

## Step 11: Complete End-to-End Pipeline

Now let's put everything together in a single function and test with multiple scenarios.

In [None]:
def process_customer_request(user_input_json: str) -> Optional[SupportTicket]:
    """
    Complete pipeline: validate input -> create query -> call tools -> generate ticket.
    """
    print("\n" + "="*60)
    print("PROCESSING CUSTOMER REQUEST")
    print("="*60)
    
    # Step 1: Validate input
    print("\n[1/5] Validating user input...")
    validated_input = validate_user_input(user_input_json)
    if not validated_input:
        return None
    
    # Step 2: Create customer query
    print("\n[2/5] Creating customer query...")
    customer_query = create_customer_query(validated_input)
    if not customer_query:
        return None
    
    # Step 3: Decide on tool calling
    print("\n[3/5] Deciding on tool calls...")
    tool_decision = decide_next_action_with_tools(customer_query)
    
    # Step 4: Execute tools if needed
    tool_output = None
    if tool_decision and tool_decision.get("should_call_tool"):
        print("\n[4/5] Executing tool call...")
        tool_output = execute_tool_call(tool_decision)
    else:
        print("\n[4/5] No tool call needed")
    
    # Step 5: Generate final ticket
    print("\n[5/5] Generating support ticket...")
    support_ticket = generate_support_ticket(customer_query, tool_output)
    
    return support_ticket

### Test Case 1: Order Status Query

In [None]:
print("\n\n" + "#"*60)
print("TEST CASE 1: Order Status Query")
print("#"*60)

test_input_1 = json.dumps({
    "name": "Jane Smith",
    "email": "jane.smith@example.com",
    "message": "What is the status of my order?",
    "order_id": "ABC-12345"
})

ticket_1 = process_customer_request(test_input_1)

### Test Case 2: Password Reset Question

In [None]:
print("\n\n" + "#"*60)
print("TEST CASE 2: Password Reset Question")
print("#"*60)

test_input_2 = json.dumps({
    "name": "Bob Jones",
    "email": "bob.jones@example.com",
    "message": "I forgot my password and can't log in. How do I reset it?",
    "order_id": None
})

ticket_2 = process_customer_request(test_input_2)

### Test Case 3: Product Complaint

In [None]:
print("\n\n" + "#"*60)
print("TEST CASE 3: Product Complaint")
print("#"*60)

test_input_3 = json.dumps({
    "name": "Joe User",
    "email": "joe.user@example.com",
    "message": "I'm really not happy with the product I bought. It doesn't work as advertised.",
    "order_id": "XYZ-98765"
})

ticket_3 = process_customer_request(test_input_3)

## Step 12: Test Invalid Input (Demonstrating Pydantic Validation)

Let's see how Pydantic catches validation errors.

### Test Case 4: Invalid Order ID Format

In [None]:
print("\n\n" + "#"*60)
print("TEST CASE 4: Invalid Input (Wrong Order ID Format)")
print("#"*60)

invalid_input = json.dumps({
    "name": "Test User",
    "email": "test@example.com",
    "message": "Check my order please",
    "order_id": "12345"  # Wrong format! Should be ABC-12345
})

ticket_invalid = process_customer_request(invalid_input)

### Test Case 5: Invalid Email

In [None]:
print("\n\n" + "#"*60)
print("TEST CASE 5: Invalid Email")
print("#"*60)

invalid_email = json.dumps({
    "name": "Test User",
    "email": "not-an-email",  # Invalid email format
    "message": "I need help with my account",
    "order_id": None
})

ticket_invalid_2 = process_customer_request(invalid_email)

## Summary and Key Takeaways

In [None]:
print("\n" + "="*60)
print("SUMMARY: How Pydantic Works with Free Hugging Face Models")
print("="*60)
print("""
✓ Pydantic DOES work with free Hugging Face models!

Key Points:
1. DATA VALIDATION: Pydantic validates all inputs/outputs at every stage
2. STRUCTURED OUTPUTS: We use JSON schemas to guide the LLM
3. ERROR HANDLING: Retry logic handles when models produce invalid JSON
4. TOOL CALLING: Pydantic validates tool arguments before execution
5. COST: Runs within the free tier of the Hugging Face Inference API (rate limits apply)

Challenges with Smaller Models:
- 7B models are less consistent than GPT-4/Claude at producing valid JSON
- Require more retries and careful prompt engineering
- May need temperature tuning (lower = more consistent)
- JSON extraction from responses can be messier

Best Practices:
✓ Use clear, explicit prompts with schema examples
✓ Implement retry logic with Pydantic validation
✓ Extract JSON carefully from responses
✓ Use lower temperature (0.1-0.3) for structured outputs
✓ Validate at every step with Pydantic models

Models that work well (free on Hugging Face):
- mistralai/Mistral-7B-Instruct-v0.2 ✓
- meta-llama/Meta-Llama-3-8B-Instruct ✓
- microsoft/Phi-3-mini-4k-instruct ✓
- HuggingFaceH4/zephyr-7b-beta ✓
""")

## Bonus: Model Comparison Function

Test the same input across different models to see which performs best.

In [None]:
def compare_model_performance(user_input_json: str, models: list[str]):
    """
    Test the same input across different models to see which performs best.
    """
    print("\n" + "="*60)
    print("MODEL COMPARISON")
    print("="*60)
    
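    global MODEL_NAME
    original_model = MODEL_NAME  # remember the configured model so it can be restored
    results = {}
    
    for model_name in models:
        print(f"\n\nTesting with {model_name}...")
        print("-" * 60)
        
        # Switch the module-level model name for this run
        MODEL_NAME = model_name
        
        try:
            ticket = process_customer_request(user_input_json)
            results[model_name] = {
                "success": ticket is not None,
                "ticket": ticket.model_dump() if ticket else None
            }
        except Exception as e:
            results[model_name] = {
                "success": False,
                "error": str(e)
            }
    
    # Restore the originally configured model
    MODEL_NAME = original_model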
    
    print("\n\n" + "="*60)
    print("COMPARISON RESULTS")
    print("="*60)
    for model, result in results.items():
        status = "✓ SUCCESS" if result["success"] else "✗ FAILED"
        print(f"\n{model}: {status}")
    
    return results

### Run Model Comparison (Optional - Takes Time)

Uncomment and run to compare different models. This will take several minutes!

In [None]:
# Uncomment to test multiple models (this will take a while!)
# test_models = [
#     "mistralai/Mistral-7B-Instruct-v0.2",
#     "microsoft/Phi-3-mini-4k-instruct",
# ]
# 
# test_input = json.dumps({
#     "name": "Test User",
#     "email": "test@example.com",
#     "message": "What is the status of my order?",
#     "order_id": "ABC-12345"
# })
# 
# comparison = compare_model_performance(test_input, test_models)
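A single run per model is a noisy signal, since sampling with a non-zero temperature can succeed on one attempt and fail on the next. As a rough sketch (the `measure_success_rate` function and the stand-in `fake_pipeline` are hypothetical), repeating each run a few times and reporting the fraction of successes gives a steadier comparison.

```python
from typing import Callable, Optional


def measure_success_rate(
    run_pipeline: Callable[[str], Optional[object]],
    model_names: list[str],
    trials: int = 3,
) -> dict[str, float]:
    """Run the pipeline `trials` times per model and report the success fraction."""
    rates = {}
    for model_name in model_names:
        successes = 0
        for _ in range(trials):
            try:
                if run_pipeline(model_name) is not None:
                    successes += 1
            except Exception:
                # Treat any crash as a failed trial.
                pass
        rates[model_name] = successes / trials
    return rates


# Stand-in pipeline for demonstration: "model-a" always succeeds, others fail.
def fake_pipeline(model_name: str) -> Optional[object]:
    return {"ticket": "ok"} if model_name == "model-a" else None


rates = measure_success_rate(fake_pipeline, ["model-a", "model-b"], trials=4)
print(rates)
```

With real models, `run_pipeline` would wrap a call that selects the model and runs `process_customer_request`; each trial costs several API calls, so small trial counts are advisable on the free tier.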

---

## Conclusion

This tutorial demonstrates that **Pydantic works well with free Hugging Face models**, though with some caveats:

1. **Validation is dependable** - Pydantic rejects malformed or out-of-schema data at every stage
2. **Smaller models require more hand-holding** - Expect more retries, more explicit prompts, and JSON extraction from noisy responses
3. **The pattern is the same** - Whether using GPT-4, Claude, or Mistral-7B, the Pydantic workflow is identical
4. **Cost-effective** - Everything in this tutorial is designed to run on Hugging Face's free tier, within its rate limits

The key insight remains true: **It's Pydantic all the way down** in LLM workflows, regardless of which model you use!

### Next Steps

- Try different Hugging Face models
- Experiment with temperature settings
- Add more tools to the system
- Build your own custom validation rules
- Deploy this to a real application!