### Structured Output

You can ask a model to return its response in a format that matches a given schema. This makes the output easy to parse and use in later processing. LangChain supports several schema types and several methods for enforcing structured output.

### Pydantic

Pydantic models provide the richest feature set, with field validation, descriptions, and nested structures.
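To make the field validation point concrete, here is a minimal sketch that runs without any model call. It assumes Pydantic v2 (`model_validate`, `model_json_schema`). The `RatedMovie` class and its numeric bounds are hypothetical and are not part of the notebook's `Movie` schema.

```python
from pydantic import BaseModel, Field, ValidationError


class RatedMovie(BaseModel):
    """Hypothetical movie schema with numeric constraints (illustration only)."""

    title: str = Field(description="The title of the movie")
    year: int = Field(ge=1888, description="The release year of the movie")
    rating: float = Field(ge=0, le=10, description="The rating of the movie out of 10")


# Data that satisfies the constraints is accepted.
ok = RatedMovie.model_validate({"title": "The Godfather", "year": 1972, "rating": 9.2})

# A rating above the upper bound is rejected with a ValidationError.
try:
    RatedMovie.model_validate({"title": "The Godfather", "year": 1972, "rating": 11})
    rejected = False
except ValidationError as exc:
    rejected = True
    print(exc.errors()[0]["loc"])

# Descriptions and constraints are carried into the generated JSON schema.
schema = RatedMovie.model_json_schema()
print(schema["properties"]["rating"])
```

The same kind of check applies to values a model returns for a Pydantic schema, so out-of-range values raise a validation error instead of passing through silently.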

In [1]:
import os
from langchain.chat_models import init_chat_model


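# ChatGroq reads GROQ_API_KEY from the environment; fail early with a clear message if it is missing.
if not os.getenv("GROQ_API_KEY"):
    raise RuntimeError("Set the GROQ_API_KEY environment variable before running this cell.")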

model = init_chat_model("groq:qwen/qwen3-32b")

In [2]:
model

ChatGroq(profile={'max_input_tokens': 131072, 'max_output_tokens': 16384, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': True, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x00000220F85A7290>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x00000220F8717710>, model_name='qwen/qwen3-32b', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [3]:
from pydantic import BaseModel, Field

class Movie(BaseModel):
    title: str = Field(description="The title of the movie")
    year: int = Field(description="The release year of the movie")
    director: str = Field(description="The director of the movie")
    rating: float = Field(description="The rating of the movie out of 10")


In [4]:
model_with_structured_output = model.with_structured_output(Movie)

In [5]:
model_with_structured_output

RunnableBinding(bound=ChatGroq(profile={'max_input_tokens': 131072, 'max_output_tokens': 16384, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': True, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x00000220F85A7290>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x00000220F8717710>, model_name='qwen/qwen3-32b', model_kwargs={}, groq_api_key=SecretStr('**********')), kwargs={'tools': [{'type': 'function', 'function': {'name': 'Movie', 'description': '', 'parameters': {'properties': {'title': {'description': 'The title of the movie', 'type': 'string'}, 'year': {'description': 'The release year of the movie', 'type': 'integer'}, 'director': {'description': 'The director of the movie', 'type': 'string'}, 'rating': {'description': 'The rating of the movie out of 10', 'type': 'number'}}, 'required': ['title', 'year'

In [6]:
model.invoke("Provide details about the movie 'The Origen'")

AIMessage(content='<think>\nOkay, the user is asking about the movie "The Origen." Let me start by checking if that\'s the correct title. Sometimes typos can occur. "The Origen" doesn\'t ring a bell immediately. Maybe they meant "The Orphanage" (El Orfelinato), which is a 2007 Spanish horror film? Or perhaps "Origins," but that\'s not a specific title. Let me verify.\n\nFirst, I\'ll search for "The Origen" to see if there\'s a known movie by that name. Hmm, not finding much. Maybe it\'s a recent film or an independent production. Alternatively, the user might have misspelled the title. Let me consider other possibilities. "The Origen" sounds Spanish, but "origen" in Spanish means "origin." Maybe it\'s a translation issue. \n\nWait, there\'s a 2018 movie titled "Origins" directed by Nicolas Cage. But that\'s "Origins," not "The Origen." Another angle: maybe it\'s a TV series? Not that I can recall. Let me check databases like IMDb. Searching IMDb for "The Origen" returns no results. Hmm

In [7]:
output = model_with_structured_output.invoke("Provide details about the movie 'The Origen'")
print(output)

title='Origin' year=2023 director='Ridley Scott' rating=7.0


#### Message output alongside parsed structure

In [8]:
from pydantic import BaseModel, Field

model_with_structured = model.with_structured_output(Movie, include_raw=True)

In [9]:
response = model_with_structured.invoke("Provide details about the movie 'The Origin'")

In [10]:
response

{'raw': AIMessage(content='', additional_kwargs={'reasoning_content': 'Okay, the user is asking for details about the movie \'The Origin\'. Let me check what tools I have available. There\'s a Movie function that requires title, year, director, and rating. But wait, the user didn\'t specify the year or director. Hmm, maybe I need to ask for more information. Wait, the function requires those parameters, so I can\'t proceed without them. But the user might not know the exact details. Should I make an assumption or ask for clarification? Since the function needs all four parameters, I can\'t generate the response without them. I need to inform the user that I need more details like the director and release year. Alternatively, maybe there\'s a way to look it up, but according to the provided tools, I don\'t have a search function. So the correct approach is to tell the user I need more information to provide the details. Wait, but maybe there\'s a mistake here. The function is supposed t
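With `include_raw=True`, the result is a dictionary with the keys `raw`, `parsed`, and `parsing_error`. The sketch below shows one way to handle that shape. The helper name `unwrap_structured` is hypothetical, and the sample dictionaries are stand-ins, not real model output.

```python
from typing import Any


def unwrap_structured(result: dict[str, Any]) -> Any:
    """Return the parsed object, or raise if parsing failed.

    Assumes the three-key shape produced with include_raw=True:
    'raw', 'parsed', and 'parsing_error'.
    """
    error = result.get("parsing_error")
    if error is not None:
        raise ValueError(f"Structured output could not be parsed: {error}")
    parsed = result.get("parsed")
    if parsed is None:
        raise ValueError("Model returned no structured output")
    return parsed


# Stand-in dictionaries (not real model output) to exercise both paths.
good = {"raw": "<AIMessage>", "parsed": {"title": "The Godfather"}, "parsing_error": None}
bad = {"raw": "<AIMessage>", "parsed": None, "parsing_error": "invalid JSON"}

print(unwrap_structured(good))

try:
    unwrap_structured(bad)
    failed = False
except ValueError:
    failed = True
```

In the notebook, the equivalent call would be `unwrap_structured(response)`. The `raw` entry stays available for inspecting the reasoning content or the tool call when parsing fails.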

#### Nested Structure

In [11]:
from pydantic import BaseModel, Field

class Actor(BaseModel):
    name: str = Field(description="Name of the actor")
    role: str = Field(description="Role of the actor")

class MovieDetails(BaseModel):
    title: str = Field(description="Title of the movie")
    year: int = Field(description="Year of the movie")
    genre: list[str] = Field(description="Genre of the movie")
    cast: list[Actor] = Field(description="Actors in the movie")
    budget: float | None = Field(description="Budget of the movie in USD")
    
model_with_structured = model.with_structured_output(MovieDetails)


In [12]:
response = model_with_structured.invoke("Tell me about the movie The Godfather")
response

MovieDetails(title='The Godfather', year=1972, genre=['Crime', 'Drama'], cast=[Actor(name='Marlon Brando', role='Don Vito Corleone'), Actor(name='Al Pacino', role='Michael Corleone'), Actor(name='James Caan', role='Sonny Corleone'), Actor(name='Diane Keaton', role='Kaye Corleone')], budget=6000000.0)

### TypedDict

TypedDict provides a simpler alternative based on Python's built-in typing. It is ideal when you don't need runtime validation.

In [13]:
from typing_extensions import TypedDict, Annotated

class MovieDict(TypedDict):
    """ A movie with details """
    title: Annotated[str, ..., "The title of the movie"]
    year: Annotated[int, ..., "The release year of the movie"]
    director: Annotated[str, ..., "The director of the movie"]
    rating: Annotated[float, ..., "The rating of the movie out of 10"]
    

model_typeddict = model.with_structured_output(MovieDict)
response = model_typeddict.invoke("Tell me about the movie The Godfather details")
response

{'director': 'Francis Ford Coppola',
 'rating': 9.2,
 'title': 'The Godfather',
 'year': 1972}
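The lack of runtime validation can be seen without calling a model. This sketch uses the standard-library `typing` module instead of `typing_extensions` to stay dependency-free, which assumes Python 3.9 or later. It shows that a TypedDict instance is a plain `dict` and that the `Annotated` descriptions remain available as metadata.

```python
from typing import Annotated, TypedDict, get_type_hints


class MovieDict(TypedDict):
    """A movie with details."""

    title: Annotated[str, ..., "The title of the movie"]
    year: Annotated[int, ..., "The release year of the movie"]


# A TypedDict is a plain dict at runtime, so a wrongly typed value is not rejected.
bad = MovieDict(title="The Godfather", year="nineteen seventy-two")
print(type(bad))

# The Annotated metadata (default marker and description) is kept on the type hints.
hints = get_type_hints(MovieDict, include_extras=True)
description = hints["year"].__metadata__[1]
print(description)
```

A static type checker would flag the `year` value, but nothing stops it at runtime. If invalid values must be rejected when the response arrives, a Pydantic schema is the safer choice.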

In [16]:
class Actor(TypedDict):
    name: str
    role: str
    

class MovieDetails(TypedDict):
    title: str
    year: int
    genres: list[str]
    actors: list[Actor]
    # TypedDict fields cannot take a pydantic Field default; use Annotated[type, default, description].
    budget: Annotated[float | None, None, "Budget of the movie in USD"]

model_with_structured = model.with_structured_output(MovieDetails)

response = model_with_structured.invoke("Tell me about the movie The Matrix")
print(response)
    

{'actors': [{'name': 'Keanu Reeves', 'role': 'Neo'}, {'name': 'Laurence Fishburne', 'role': 'Morpheus'}, {'name': 'Carrie-Anne Moss', 'role': 'Trinity'}, {'name': 'Hugo Weaving', 'role': 'Agent Smith'}], 'budget': 63000000, 'genres': ['Action', 'Science Fiction', 'Thriller'], 'title': 'The Matrix', 'year': 1999}


In [17]:
model.profile

{'max_input_tokens': 131072,
 'max_output_tokens': 16384,
 'image_inputs': False,
 'audio_inputs': False,
 'video_inputs': False,
 'image_outputs': False,
 'audio_outputs': False,
 'video_outputs': False,
 'reasoning_output': True,
 'tool_calling': True}

### DataClasses

A data class is a class that mainly holds data, although there aren't really any restrictions on what it can contain. You create one with the `@dataclass` decorator.
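As a quick illustration of what the decorator provides, the sketch below uses only the standard library. It reuses the `ContactInfo` shape from this section and shows the generated constructor and `repr`, plus the `asdict` and `fields` helpers.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class ContactInfo:
    """Information about a contact."""

    name: str
    email: str
    phone: str


# @dataclass generates __init__, __repr__, and __eq__ from the annotations.
contact = ContactInfo(name="John Doe", email="joedone@gmail.com", phone="1234567890")
print(contact)

# asdict converts an instance to a plain dictionary; fields lists the declared fields.
print(asdict(contact))
field_names = [f.name for f in fields(ContactInfo)]

# Like TypedDict, a plain dataclass does not check types at runtime.
unchecked = ContactInfo(name="John Doe", email="joedone@gmail.com", phone=1234567890)
```

The field names and type annotations are the information a schema can be derived from, which is why a dataclass can serve as a response format.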

In [19]:
import os 

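# The OpenAI model used below reads OPENAI_API_KEY from the environment; fail early if it is missing.
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("Set the OPENAI_API_KEY environment variable before running this cell.")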

In [None]:
from pydantic import BaseModel, Field
from langchain.agents import create_agent

class ContactInfo(BaseModel):
    """Information about a contact."""
    name: str = Field(description="The name of the contact")
    email: str = Field(description="The email of the contact")
    phone: str = Field(description="The phone number of the contact")

agent = create_agent(
    model="gpt-4o",
    response_format=ContactInfo # auto selects Provider Strategy
)

result = agent.invoke({"messages": [{
    "role": "user", 
    "content": "Extract info from John Doe and my email is joedone@gmail.com and my phone number is 1234567890"}]})

print(result['structured_response'])

name='John Doe' email='joedone@gmail.com' phone='1234567890'


In [None]:
### TypedDict

from typing_extensions import TypedDict
from langchain.agents import create_agent

class ContactInfo(TypedDict):
    """Information about a contact."""
    name: str
    email: str
    phone: str

agent = create_agent(
    model="gpt-4o",
    response_format=ContactInfo # auto selects Provider Strategy
)

result = agent.invoke({
    "messages": [{
        "role": "user", 
        "content": "Extract info from John Doe and my email is joedone@gmail.com and my phone number is 1234567890"
        }
    ]
})

print(result['structured_response'])

{'name': 'John Doe', 'email': 'joedone@gmail.com', 'phone': '1234567890'}


In [21]:
### DataClass

from dataclasses import dataclass
from langchain.agents import create_agent

@dataclass
class ContactInfo:
    """Information about a contact."""
    name: str
    email: str
    phone: str

agent = create_agent(
    model="gpt-4o",
    response_format=ContactInfo # auto selects Provider Strategy
)

result = agent.invoke({
    "messages": [{
        "role": "user", 
        "content": "Extract info from John Doe and my email is joedone@gmail.com and my phone number is 1234567890"
        }
    ]
})

print(result['structured_response'])


ContactInfo(name='John Doe', email='joedone@gmail.com', phone='1234567890')
