# Structured Outputs

- https://docs.langchain.com/oss/python/langchain/models#structured-output

Structured output allows agents to return data in a specific, predictable format. Instead of parsing natural language responses, you get structured data in the form of JSON objects, Pydantic models, or dataclasses that your application can use directly.

Models can be requested to provide their response in a format matching a given schema. This is useful for ensuring the output can be easily parsed and used in subsequent processing. LangChain supports multiple schema types and methods for enforcing structured output.

- Some models support structured output natively. For these models, use the `with_structured_output()` method with one of the following schema types:
    - TypedDict
    - Pydantic
    - JSON Schema

- For models that do not support structured output natively, use output parsers to arrange the output.
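
To make the output-parser case concrete, the sketch below shows the idea behind one: pull a JSON object out of a free-form reply and check it. It is a hand-written illustration using only the standard library, not one of LangChain's parser classes. The names `raw_reply`, `REQUIRED_KEYS`, and `parse_movie` are hypothetical, and the reply text is made up for the example.

```python
import json
import re

# Hypothetical reply from a model without native structured output:
# the JSON is wrapped in ordinary prose.
raw_reply = (
    "Sure! Here are the details:\n"
    '{"title": "Inception", "year": 2010, '
    '"director": "Christopher Nolan", "rating": 8.8}\n'
    "Let me know if you need anything else."
)

REQUIRED_KEYS = {"title", "year", "director", "rating"}


def parse_movie(text: str) -> dict:
    """Extract a JSON object from free-form text and check its keys."""
    # Greedy match from the first "{" to the last "}" in the reply.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in the reply")
    data = json.loads(match.group(0))
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


movie = parse_movie(raw_reply)
print(movie["title"], movie["year"])
```

A real parser would typically also add format instructions to the prompt so the model knows which shape to produce; this sketch covers only the parsing side.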


## TypedDict

- A TypedDict is a way to define a dictionary in Python where you specify which keys should exist and what types their values should have.
- It helps ensure that your dictionary follows a specific structure.
- It tells type checkers and editors which keys are required and what types of values they should have.
- It does not validate data at runtime; it only provides type hints for better tooling.
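
The last point is easy to check without a model. The following is a minimal sketch using only the standard library; `SimpleMovie` and `bad_movie` are hypothetical names introduced for the illustration.

```python
from typing import TypedDict, get_type_hints


class SimpleMovie(TypedDict):
    """A trimmed-down movie schema for the illustration."""
    title: str
    year: int


# A static type checker such as mypy would flag this assignment, but Python
# itself accepts it: a TypedDict is a plain dict at runtime.
bad_movie: SimpleMovie = {"title": "Inception", "year": "not a number"}

print(type(bad_movie))  # <class 'dict'>
print(bad_movie["year"])  # not a number

# The declared types are still available as metadata, which is what
# schema-generating tools read.
print(get_type_hints(SimpleMovie))
```

Because nothing is checked at runtime, a wrongly typed value returned by the model would pass through silently unless you validate it yourself.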

In [None]:
from langchain.chat_models import init_chat_model

model = init_chat_model(
    model="qwen2.5-coder:7b",
    model_provider="ollama",
    temperature=0.0,
)

In [None]:
from typing_extensions import TypedDict, Annotated

class MovieDict(TypedDict):
    """A movie with details."""
    title: Annotated[str, ..., "The title of the movie"]
    year: Annotated[int, ..., "The year the movie was released"]
    director: Annotated[str, ..., "The director of the movie"]
    rating: Annotated[float, ..., "The movie's rating out of 10"]

model_with_structure = model.with_structured_output(MovieDict)
response = model_with_structure.invoke("Provide details about the movie Inception")
print(response)  # {'title': 'Inception', 'year': 2010, 'director': 'Christopher Nolan', 'rating': 8.8}

{'title': 'Inception', 'year': 2010, 'director': 'Christopher Nolan', 'rating': 8.8}


## Pydantic

- Pydantic is a data validation and parsing library for Python.
- It ensures that the data you work with is correct, structured, and type-safe.

In [2]:
from pydantic import BaseModel, Field

class Movie(BaseModel):
    """A movie with details."""
    title: str = Field(..., description="The title of the movie")
    year: int = Field(..., description="The year the movie was released")
    director: str = Field(..., description="The director of the movie")
    rating: float = Field(..., description="The movie's rating out of 10")

model_with_structure = model.with_structured_output(Movie)
response = model_with_structure.invoke("Provide details about the movie Inception")
print(response)  # a Movie instance; print() shows it as field=value pairs

title='Inception' year=2010 director='Christopher Nolan' rating=8.8


## JSON Schema

In [3]:
json_schema = {
    "title": "Movie",
    "description": "A movie with details",
    "type": "object",
    "properties": {
        "title": {
            "type": "string",
            "description": "The title of the movie"
        },
        "year": {
            "type": "integer",
            "description": "The year the movie was released"
        },
        "director": {
            "type": "string",
            "description": "The director of the movie"
        },
        "rating": {
            "type": "number",
            "description": "The movie's rating out of 10"
        }
    },
    "required": ["title", "year", "director", "rating"]
}

model_with_structure = model.with_structured_output(
    json_schema,
    method="json_schema",
)
response = model_with_structure.invoke("Provide details about the movie Inception")
print(response)  # {'title': 'Inception', 'year': 2010, ...}

{'title': 'Inception', 'year': 2010, 'director': 'Christopher Nolan', 'rating': 8.8}


## Use TypedDict if:
- You only need type hints
- You don't need validation
- You trust the LLM to return correct data

## Use Pydantic if:
- You need data validation
- You need default values if the LLM misses fields
- You need automatic type conversion ("100" -> 100)
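
These three behaviours can be tried without calling a model. The sketch below assumes Pydantic is installed; `MovieWithDefaults` is a hypothetical variant of the `Movie` class with an optional field, and the input values are made up.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class MovieWithDefaults(BaseModel):
    """Hypothetical variant of Movie with an optional field."""
    title: str
    year: int
    rating: Optional[float] = None  # default used when the field is missing


# Type conversion: the string "2010" is coerced to the integer 2010.
# Default value: rating is absent, so it falls back to None.
movie = MovieWithDefaults(title="Inception", year="2010")
print(movie.year, movie.rating)  # 2010 None

# Validation: a value that cannot be converted raises ValidationError.
try:
    MovieWithDefaults(title="Inception", year="twenty-ten")
    error_count = 0
except ValidationError as exc:
    error_count = len(exc.errors())

print(f"{error_count} validation error(s)")
```

Coercion is Pydantic's default (lax) behaviour; if silent conversion is not wanted, Pydantic also offers a strict mode.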

## Use JSON Schema if:
- You don't want to import extra Python libraries
- You need validation but don't need Python objects
- You want to define the structure in a standard JSON format
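
Because a JSON Schema is plain data, it can be stored, shared, and checked against ordinary dictionaries. The sketch below is a deliberately simplified stand-in for a real validator (such as the third-party `jsonschema` package): it handles only `required` and primitive `type` keywords. The trimmed `schema` and the names `PY_TYPES` and `check` are hypothetical.

```python
import json

# Assumed: a trimmed copy of the movie schema from this section.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "year": {"type": "integer"},
    },
    "required": ["title", "year"],
}

# Maps JSON Schema primitive type names to Python types (simplified;
# for example, Python's bool would also pass as "integer" here).
PY_TYPES = {"string": str, "integer": int, "number": (int, float)}


def check(instance: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the dict passed."""
    problems = [f"missing: {key}" for key in schema["required"] if key not in instance]
    for key, rules in schema["properties"].items():
        if key in instance and not isinstance(instance[key], PY_TYPES[rules["type"]]):
            problems.append(f"wrong type: {key}")
    return problems


# The schema is plain data, so it survives a JSON round trip unchanged.
assert json.loads(json.dumps(schema)) == schema

print(check({"title": "Inception", "year": 2010}, schema))  # []
print(check({"title": "Inception", "year": "2010"}, schema))  # ['wrong type: year']
print(check({"title": "Inception"}, schema))  # ['missing: year']
```

Unlike the Pydantic route, nothing is converted here: the string "2010" is reported as a problem rather than coerced.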