# Structured Output

Models can be asked to return a response that matches a given schema, which makes the output easy to parse and use in downstream processing. LangChain supports several schema types and several methods for enforcing structured output.

## Pydantic
Pydantic models provide the richest feature set with field validation, descriptions, and nested structures.

In [1]:
import os
from langchain.chat_models import init_chat_model

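# Assigning None to os.environ raises TypeError, so check for the key explicitly.
if not os.getenv("GROQ_API_KEY"):
    raise RuntimeError("GROQ_API_KEY is not set in the environment")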

model = init_chat_model("groq:qwen/qwen3-32b")
model

ChatGroq(profile={'max_input_tokens': 131072, 'max_output_tokens': 16384, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': True, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x000002F1D49F4EC0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x000002F1D49F5BE0>, model_name='qwen/qwen3-32b', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [2]:
from pydantic import BaseModel, Field

class Movie(BaseModel):
    name: str = Field(description="Name of the movie")
    year: int = Field(description="Year of release")
    director: str = Field(description="Director of the movie")
    rating: float = Field(description="Rating of the movie")
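
What the Movie class provides on its own can be checked before any model is called. The sketch below assumes Pydantic v2 (model_json_schema, model_validate, model_validate_json); both payloads are made up for illustration and do not come from a model.

```python
from pydantic import BaseModel, Field, ValidationError


class Movie(BaseModel):
    name: str = Field(description="Name of the movie")
    year: int = Field(description="Year of release")
    director: str = Field(description="Director of the movie")
    rating: float = Field(description="Rating of the movie")


# JSON schema derived from the class: field types, descriptions, required keys.
schema = Movie.model_json_schema()
required_fields = schema["required"]
year_description = schema["properties"]["year"]["description"]

# Illustrative valid payload: parsed and type-checked in one step.
movie = Movie.model_validate_json(
    '{"name": "Inception", "year": 2010, "director": "Christopher Nolan", "rating": 8.8}'
)

# Illustrative invalid payload: a non-numeric year is rejected at runtime.
try:
    Movie.model_validate(
        {"name": "Inception", "year": "unknown", "director": "Christopher Nolan", "rating": 8.8}
    )
    bad_fields = []
except ValidationError as exc:
    bad_fields = [error["loc"][0] for error in exc.errors()]

print(required_fields)
print(movie)
print(bad_fields)
```

The descriptions passed to Field end up in the schema, which is why they are worth writing carefully.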

In [3]:
model_with_structure = model.with_structured_output(Movie)

model_with_structure

RunnableBinding(bound=ChatGroq(profile={'max_input_tokens': 131072, 'max_output_tokens': 16384, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': True, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x000002F1D49F4EC0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x000002F1D49F5BE0>, model_name='qwen/qwen3-32b', model_kwargs={}, groq_api_key=SecretStr('**********')), kwargs={'tools': [{'type': 'function', 'function': {'name': 'Movie', 'description': '', 'parameters': {'properties': {'name': {'description': 'Name of the movie', 'type': 'string'}, 'year': {'description': 'Year of release', 'type': 'integer'}, 'director': {'description': 'Director of the movie', 'type': 'string'}, 'rating': {'description': 'Rating of the movie', 'type': 'number'}}, 'required': ['name', 'year', 'director', 'rating'], 'type': 'objec

In [4]:
## Test-1
## Plain model call, no schema enforced

model.invoke("Provide details about the movie inception")

AIMessage(content='<think>\nOkay, so I need to provide details about the movie Inception. Let me start by recalling what I know about it. Inception is a 2010 film directed by Christopher Nolan. It\'s a science fiction action movie. The main character is Dom Cobb, played by Leonardo DiCaprio. The plot involves some concept about entering people\'s dreams to plant or extract ideas. I think the term used is "inception," which is the act of planting an idea. \n\nThe movie has a complex narrative structure, maybe with multiple layers of dreams. There\'s a team that uses a device called the dream-sharing machine, which requires a subject to enter a dream state. I remember there\'s a scene with a spinning top that Cobb uses to determine if he\'s in reality or a dream. The spinning top is a totem, a personal object that helps characters distinguish between reality and the dream world.\n\nThe cast includes other notable actors like Joseph Gordon-Levitt, who plays Arthur, Cobb\'s right-hand man.

In [5]:
## Test-2
## Structured-Output

response = model_with_structure.invoke("Provide details about the movie inception")

response

Movie(name='Inception', year=2010, director='Christopher Nolan', rating=8.8)

### Message output alongside parsed structure

In [6]:
model_op_with_structure = model.with_structured_output(Movie, include_raw=True)

response = model_op_with_structure.invoke("Provide details about the movie inception")

response

{'raw': AIMessage(content='', additional_kwargs={'reasoning_content': 'Okay, the user is asking for details about the movie "Inception". Let me check what tools I have available. There\'s a Movie function that requires name, year, director, and rating. I need to fill in those parameters. I know the movie "Inception" was directed by Christopher Nolan and released in 2010. The rating is probably around 8.8 on IMDb. Let me confirm the director and year. Yep, Nolan directed it, and it came out in 2010. The rating is indeed 8.8. So I\'ll structure the tool call with those details.\n', 'tool_calls': [{'id': 'vmrm5pw03', 'function': {'arguments': '{"director":"Christopher Nolan","name":"Inception","rating":8.8,"year":2010}', 'name': 'Movie'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 175, 'prompt_tokens': 215, 'total_tokens': 390, 'completion_time': 0.300927659, 'completion_tokens_details': {'reasoning_tokens': 127}, 'prompt_time': 0.009133486, 'prompt_tok
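
With include_raw=True the result is a dictionary rather than a Movie instance. In current LangChain releases that dictionary carries the keys raw, parsed, and parsing_error, although only raw is visible in the truncated output above. A minimal sketch of handling it follows; the helper name unwrap_structured and the stand-in dictionaries are hypothetical, and no model is called.

```python
from typing import Any


def unwrap_structured(result: dict[str, Any]) -> Any:
    """Return the parsed object, or raise if parsing failed."""
    error = result.get("parsing_error")
    if error is not None:
        raise ValueError(f"structured output could not be parsed: {error}")
    return result["parsed"]


# Stand-in dictionaries shaped like an include_raw=True result (hypothetical values).
ok_result = {
    "raw": "<AIMessage>",
    "parsed": {"name": "Inception", "year": 2010},
    "parsing_error": None,
}
failed_result = {
    "raw": "<AIMessage>",
    "parsed": None,
    "parsing_error": ValueError("missing field 'year'"),
}

parsed = unwrap_structured(ok_result)

try:
    unwrap_structured(failed_result)
    failure_message = None
except ValueError as exc:
    failure_message = str(exc)

print(parsed)
print(failure_message)
```

Keeping the raw message around is mainly useful for inspecting token usage or debugging a response that failed to parse.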

## Nested Structured Output

In [7]:
class Actor(BaseModel):
    name: str
    role: str

class MovieDetails(BaseModel):
    title: str
    year: int
    cast: list[Actor]
    genres: list[str]
    budget: float | None = Field(None, description="Budget in millions USD")

In [8]:
model_with_structure = model.with_structured_output(MovieDetails)

response = model_with_structure.invoke("Provide details about the movie Inception")
response

MovieDetails(title='Inception', year=2010, cast=[Actor(name='Leonardo DiCaprio', role='Dom Cobb'), Actor(name='Joseph Gordon-Levitt', role='Arthur'), Actor(name='Ellen Page', role='Ariadne'), Actor(name='Tom Hardy', role='Jack')], genres=['Science Fiction', 'Action', 'Thriller'], budget=160.0)

## TypedDict
TypedDict provides a simpler alternative using Python’s built-in typing, ideal when you don’t need runtime validation.

In [9]:
from typing_extensions import TypedDict, Annotated

class MovieDict(TypedDict):
    """A movie with details."""
    title: Annotated[str, ..., "The title of the movie"]
    year: Annotated[int, ..., "The year the movie was released"]
    director: Annotated[str, ..., "The director of the movie"]
    rating: Annotated[float, ..., "The movie's rating out of 10"]

In [10]:
model_with_typedict = model.with_structured_output(MovieDict)
response = model_with_typedict.invoke("Please provide the details of the movie avengers")
response

{'director': 'Joss Whedon',
 'rating': 7.9,
 'title': 'The Avengers',
 'year': 2012}
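
The result above is a plain dict, which is what "no runtime validation" means in practice. The sketch below uses the standard library typing module instead of typing_extensions (an assumption that Python 3.9+ is available) and a deliberately wrong value to show that nothing is checked when the dict is built.

```python
from typing import Annotated, TypedDict, get_type_hints


class MovieDict(TypedDict):
    """A movie with details."""
    title: Annotated[str, ..., "The title of the movie"]
    year: Annotated[int, ..., "The year the movie was released"]


# A TypedDict value is an ordinary dict; a static type checker would flag the
# string year below, but Python itself accepts it.
movie: MovieDict = {"title": "The Avengers", "year": "twenty-twelve"}

# The default marker and description live in the Annotated metadata.
hints = get_type_hints(MovieDict, include_extras=True)
title_metadata = hints["title"].__metadata__

print(type(movie))
print(title_metadata)
```

If bad values must be rejected at runtime, the Pydantic version of the schema is the safer choice.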

In [11]:
class Actor(TypedDict):
    name: str
    role: str

class MovieDetails(TypedDict):
    title: str
    year: int
    cast: list[Actor]
    genres: list[str]
    # TypedDict fields cannot take a pydantic Field default; use Annotated[type, default, description] instead.
    budget: Annotated[float | None, None, "Budget in millions USD"]

In [12]:
model_with_structure = model.with_structured_output(MovieDetails)

response = model_with_structure.invoke("Provide details about the movie Avengers")
response

{'budget': 220000000,
 'cast': [{'name': 'Robert Downey Jr.', 'role': 'Tony Stark / Iron Man'},
  {'name': 'Chris Evans', 'role': 'Steve Rogers / Captain America'},
  {'name': 'Scarlett Johansson', 'role': 'Natasha Romanoff / Black Widow'},
  {'name': 'Mark Ruffalo', 'role': 'Bruce Banner / Hulk'},
  {'name': 'Chris Hemsworth', 'role': 'Thor'},
  {'name': 'Tom Hiddleston', 'role': 'Loki'}],
 'genres': ['Action', 'Superhero', 'Adventure'],
 'title': 'Avengers Assemble',
 'year': 2012}

In [13]:
model.profile

{'max_input_tokens': 131072,
 'max_output_tokens': 16384,
 'image_inputs': False,
 'audio_inputs': False,
 'video_inputs': False,
 'image_outputs': False,
 'audio_outputs': False,
 'video_outputs': False,
 'reasoning_output': True,
 'tool_calling': True}

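## Dataclasses
A dataclass is a class that mainly holds data; you create one with the @dataclass decorator from the Python standard library. The cells below pass a Pydantic model, a TypedDict, and a dataclass in turn to create_agent through response_format, and read the parsed result from the structured_response key.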
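
What the decorator generates can be seen without an agent. The following is a minimal standard-library sketch; the contact values are sample data and the unchecked instance is only there to show that type hints are not enforced.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class ContactInfo:
    """Contact information for a person."""
    name: str
    email: str
    phone: str


# @dataclass generates __init__, __repr__ and __eq__ from the annotations.
contact = ContactInfo(name="John Doe", email="john@example.com", phone="(555) 123-4567")

field_names = [f.name for f in fields(ContactInfo)]
as_plain_dict = asdict(contact)

# Type hints on a dataclass are not enforced: an int phone is accepted.
unchecked = ContactInfo(name="John Doe", email="john@example.com", phone=5551234567)

print(contact)
print(field_names)
print(as_plain_dict)
```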

In [14]:
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set in the environment")

In [15]:
from pydantic import BaseModel, Field
from langchain.agents import create_agent

class ContactInfo(BaseModel):
    """Contact information for a person."""
    name: str = Field(description="Name of the person")
    email: str = Field(description="Email of the person")
    phone: str = Field(description="Phone number of the person")

agent = create_agent(
    model="gpt-5",
    response_format=ContactInfo  # Auto-selects ProviderStrategy
)

In [16]:
result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
})

result

{'messages': [HumanMessage(content='Extract contact info from: John Doe, john@example.com, (555) 123-4567', additional_kwargs={}, response_metadata={}, id='e0611352-9037-4add-bf89-d1e02e6d33db'),
  AIMessage(content='{"name":"John Doe","email":"john@example.com","phone":"(555) 123-4567"}', additional_kwargs={'parsed': None, 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 1188, 'prompt_tokens': 200, 'total_tokens': 1388, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 1152, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-5-2025-08-07', 'system_fingerprint': None, 'id': 'chatcmpl-Czk54lG8c8PXmRQM9XUJHHfzFpAAD', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019bd68d-d091-76b2-98e9-f5bee048a445-0', tool_calls=[], invalid_tool_calls=[], usage_metadata={'input_tokens': 200, 'out

In [17]:
result["structured_response"]

ContactInfo(name='John Doe', email='john@example.com', phone='(555) 123-4567')

In [18]:
## TypedDict
from typing_extensions import TypedDict
from langchain.agents import create_agent


class ContactInfo(TypedDict):
    """Contact information for a person."""
    name: str 
    email: str 
    phone: str 

agent = create_agent(
    model="gpt-5",
    response_format=ContactInfo  # Auto-selects ProviderStrategy
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
})

result["structured_response"]
# {'name': 'John Doe', 'email': 'john@example.com', 'phone': '(555) 123-4567'}

{'name': 'John Doe', 'email': 'john@example.com', 'phone': '(555) 123-4567'}

In [19]:
## Dataclass

from dataclasses import dataclass
from langchain.agents import create_agent

@dataclass
class ContactInfo:
    """Contact information for a person."""
    name: str 
    email: str 
    phone: str 


agent = create_agent(
    model="gpt-5",
    response_format=ContactInfo  # Auto-selects ProviderStrategy
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
})

result["structured_response"]

ContactInfo(name='John Doe', email='john@example.com', phone='(555) 123-4567')
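
The three agent runs return three different Python types for the same data: a Pydantic instance, a dict, and a dataclass instance. If downstream code wants one shape, a small normalizer can help. The helper to_plain_dict below is hypothetical and not part of LangChain; it detects Pydantic v2 models by their model_dump method so the sketch itself needs only the standard library.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any


def to_plain_dict(value: Any) -> dict[str, Any]:
    """Convert a Pydantic instance, dataclass instance, or dict into a plain dict."""
    if isinstance(value, dict):  # TypedDict results are ordinary dicts
        return dict(value)
    if is_dataclass(value) and not isinstance(value, type):
        return asdict(value)
    if hasattr(value, "model_dump"):  # Pydantic v2 models expose model_dump()
        return value.model_dump()
    raise TypeError(f"unsupported structured response type: {type(value).__name__}")


@dataclass
class ContactInfo:
    name: str
    email: str
    phone: str


from_dataclass = to_plain_dict(ContactInfo("John Doe", "john@example.com", "(555) 123-4567"))
from_dict = to_plain_dict(
    {"name": "John Doe", "email": "john@example.com", "phone": "(555) 123-4567"}
)

print(from_dataclass == from_dict)
```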