# Notebook 04: Structured Outputs & JSON Schema

**Objectives:**
- Extract structured data with the `json_extract.v1` prompt
- Validate outputs against JSON schemas
- Repair malformed JSON
- Log token costs for validation and repair cycles

In [1]:
import json
import sys

# Make the repository's utils package importable from the notebook directory.
sys.path.append('..')

from utils.prompts import render
from utils.llm_client import LLMClient
from utils.logging_utils import log_llm_call
from utils.router import pick_model
from utils.json_utils import (
    safe_parse_json,
    validate_json_schema,
    create_simple_schema,
    format_schema_for_prompt,
)

## Part 1: JSON Extraction with Schema

Define a schema and extract structured data.
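
The original cell for this step is not included here, so the following is only a minimal sketch of the schema-first idea using the standard library. The names `PERSON_SCHEMA` and `check_against_schema` are hypothetical and stand in for the notebook's own `create_simple_schema` and `validate_json_schema` helpers, whose signatures are not shown. The checker covers only `required` and top-level `type` rules, not the full JSON Schema specification.

```python
import json

# Hypothetical schema for illustration; the notebook's real schema is not shown.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "email": {"type": "string"},
    },
    "required": ["name", "age"],
}

# Map a subset of JSON Schema type names to Python types.
_TYPE_MAP = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_against_schema(data, schema):
    """Return a list of error strings; an empty list means the data passed."""
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    errors = []
    for key in schema.get("required", []):
        if key not in data:
            errors.append(f"missing required field: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key not in data:
            continue
        value = data[key]
        type_name = rule["type"]
        ok = isinstance(value, _TYPE_MAP[type_name])
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        if isinstance(value, bool) and type_name != "boolean":
            ok = False
        if not ok:
            errors.append(f"field {key!r} should be {type_name}")
    return errors


# Stand-in for a model response; a real run would call the LLM here.
raw_response = '{"name": "Ada Lovelace", "age": 36}'
parsed = json.loads(raw_response)
errors = check_against_schema(parsed, PERSON_SCHEMA)
print(errors)
```

Returning a list of messages instead of raising makes it easy to feed the errors back into a retry prompt.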

Model output is not always valid JSON, so handle malformed responses with automatic repair.
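
As a rough illustration of what automatic repair can involve, the sketch below tries a strict parse first and then applies two conservative fixes: trimming any text around the outermost braces (which also drops Markdown fences) and removing trailing commas. The function name `repair_json_sketch` is hypothetical; it is not the notebook's `safe_parse_json` or `repair_json()`, whose behavior is not shown here.

```python
import json
import re


def repair_json_sketch(text):
    """Parse text as JSON, applying simple fixes if needed; return None on failure."""
    # Try a strict parse first so valid output is never altered.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Fix 1: keep only the span from the first "{" to the last "}".
    # This drops surrounding prose and Markdown code fences.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None
    candidate = text[start:end + 1]

    # Fix 2: remove trailing commas before a closing brace or bracket.
    # Caveat: this regex would also touch ",}" inside a string value.
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)

    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


messy = 'Here you go: {"name": "Ada", "tags": ["math",],}'
print(repair_json_sketch(messy))
```

When the repair returns `None`, the usual fallback is to re-prompt the model, which costs additional tokens.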

## Part 2: Pydantic Models (Recommended Approach)

Use Pydantic for type-safe structured outputs with automatic validation and IDE support.
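
A minimal sketch of the Pydantic approach follows, assuming the `pydantic` package is installed. The `Person` model and the `parse_person` helper are hypothetical names chosen for illustration. The model is built with `Person(**data)` so that the example does not depend on version-specific parsing methods.

```python
import json
from typing import Optional

from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    # Hypothetical fields for illustration.
    name: str
    age: int
    email: Optional[str] = None


def parse_person(raw):
    """Return (person, None) on success or (None, error_text) on failure."""
    try:
        return Person(**json.loads(raw)), None
    except (json.JSONDecodeError, TypeError, ValidationError) as exc:
        # TypeError covers valid JSON that is not an object, such as a list.
        return None, str(exc)


person, error = parse_person('{"name": "Ada Lovelace", "age": 36}')
print(person, error)

bad_person, bad_error = parse_person('{"name": "Ada Lovelace", "age": "unknown"}')
print(bad_person, bad_error)
```

The error text from a failed validation names the offending field, so it can be included in a follow-up prompt that asks the model to correct its output.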


## Key Takeaways

1. **Schema-first**: Define the JSON schema before extraction
2. **Pydantic recommended**: Use Pydantic models for better type safety and IDE support
3. **Validation**: Always validate LLM outputs against the schema
4. **Repair**: Common JSON errors can be auto-fixed with `repair_json()`
5. **Token cost**: Each validation failure triggers a retry, which increases token cost
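
To make the token-cost point concrete, here is a small sketch of a retry loop that accumulates tokens across attempts. Everything in it is an assumption for illustration: `extract_with_retries` is a hypothetical helper, the model is faked with canned responses, and the token counts are made-up numbers, not measurements.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def extract_with_retries(call_model, validate, max_attempts=3):
    """Call the model until validate() passes, summing tokens over all attempts.

    call_model() must return a (text, tokens_used) pair.
    """
    total_tokens = 0
    for attempt in range(1, max_attempts + 1):
        text, tokens = call_model()
        total_tokens += tokens
        if validate(text):
            return {"text": text, "attempts": attempt, "total_tokens": total_tokens}
    return {"text": None, "attempts": max_attempts, "total_tokens": total_tokens}


# Fake model: the first response is truncated, the second is valid.
_responses = iter([('{"name": "Ada"', 120), ('{"name": "Ada"}', 130)])


def fake_model():
    return next(_responses)


result = extract_with_retries(fake_model, is_valid_json)
print(result)
```

Logging `attempts` and `total_tokens` per request shows how much of the spend comes from retries, as opposed to first-pass extraction.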

**Comparison:**
- **Manual JSON Schema**: More control, works with all providers
- **Pydantic + Prompts**: Type-safe, better DX, works with all providers