### Structured Output

Models can be asked to return their response in a format that matches a given schema. This is useful for 
ensuring the output can be easily parsed and used in subsequent processing.

LangChain supports multiple schema types and methods for enforcing structured output.

### Pydantic, TypedDict, Dataclasses

- Pydantic models (the recommended choice) provide the richest feature set: field validation, descriptions, and nested structures.

- TypedDict provides a simpler alternative using Python’s built-in typing, ideal when you don’t need runtime validation.

- A dataclass is a class that mainly holds data, although it can contain anything an ordinary class can. You create one with the @dataclass decorator.
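To make the difference between the last two options concrete, here is a minimal standard-library sketch of how a TypedDict and a dataclass behave at runtime. The names `MovieDict` and `MovieData` are illustrative and not part of LangChain.

```python
from dataclasses import dataclass
from typing import TypedDict


class MovieDict(TypedDict):
    title: str
    year: int


@dataclass
class MovieData:
    title: str
    year: int


as_dict = MovieDict(title="Inception", year=2010)
as_obj = MovieData(title="Inception", year=2010)

# A TypedDict "instance" is an ordinary dict, so fields are read by key.
print(type(as_dict).__name__, as_dict["title"])

# A dataclass instance is a regular object, so fields are read as attributes.
print(type(as_obj).__name__, as_obj.title)

# Neither checks types at runtime: this wrong-typed year is accepted silently.
unchecked = MovieData(title="Inception", year="not a year")
print(unchecked.year)
```

Pydantic is the only one of the three that would reject the last construction, which is the main reason to prefer it when the model's output feeds later processing.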


In [12]:
import os 
from langchain.chat_models import init_chat_model
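# Assumes GROQ_API_KEY is already set in the environment; this line raises a TypeError if it is not.
os.environ["GROQ_API_KEY"]=os.getenv("GROQ_API_KEY")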
model = init_chat_model("groq:llama-3.1-8b-instant")
model.invoke("Hello, how are you?")


AIMessage(content="I'm functioning properly, thank you for asking. How can I assist you today?", additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 41, 'total_tokens': 59, 'completion_time': 0.021499951, 'completion_tokens_details': None, 'prompt_time': 0.005981249, 'prompt_tokens_details': None, 'queue_time': 0.054774031, 'total_time': 0.0274812}, 'model_name': 'llama-3.1-8b-instant', 'system_fingerprint': 'fp_4387d3edbb', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None, 'model_provider': 'groq'}, id='lc_run--019bd094-9893-7553-be2e-43103dbc529a-0', tool_calls=[], invalid_tool_calls=[], usage_metadata={'input_tokens': 41, 'output_tokens': 18, 'total_tokens': 59})

-- We want the details of the movie "Inception" returned with the following schema:

-- title, year, director, rating

In [20]:
model.invoke("Provide the details about the movie Inception")

AIMessage(content='**Inception (2010)**\n\n**Directed by:** Christopher Nolan\n**Starring:** Leonardo DiCaprio, Joseph Gordon-Levitt, Marion Cotillard, Ellen Page, Tom Hardy, Ken Watanabe, Dileep Rao, Cillian Murphy, Tom Berenger, and Pete Postlethwaite\n\n**Genre:** Science Fiction, Action, Thriller\n\n**Plot:**\n\nInception is a mind-bending sci-fi action film that delves into the concept of shared dreaming. Cobb (Leonardo DiCaprio), a skilled thief, specializes in entering people\'s dreams and stealing their secrets. He is hired by a wealthy businessman named Saito (Ken Watanabe) to perform a task known as "inception" - planting an idea in someone\'s mind instead of stealing one.\n\nSaito wants Cobb to convince Robert Fischer (Cillian Murphy), the son of a dying business magnate, to dissolve his father\'s company. In return, Saito offers Cobb a chance to return to the United States and reunite with his children, whom he has not seen in years.\n\nCobb assembles a team of experts, inc

Message output alongside Parsed Structured Output

In [None]:
from pydantic import BaseModel,Field
class Movie(BaseModel):
    title:str=Field(description="The title of the movie")
    year:int=Field(description="The year of the movie")
    director:str=Field(description="The director of the movie")
    rating:float=Field(description="The rating of the movie out of 10 ")

# model_with_structure = model.with_structured_output(Movie, include_raw=True)
# include_raw=True is optional: it returns the raw AIMessage alongside the parsed output
model_with_structure = model.with_structured_output(Movie)
# model_with_structure

In [None]:
response=model_with_structure.invoke("Provide details about the movie Inception")
response

Movie(title='Inception', year=2010, director='Christopher Nolan', rating=8.8)
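When `include_raw=True` is passed to `with_structured_output`, the call returns a dict with the keys `raw`, `parsed`, and `parsing_error` instead of the parsed object alone. The sketch below shows one way to unpack that dict. It uses a hand-built stand-in instead of a real model call, and `unwrap_structured` is a hypothetical helper, not a LangChain function.

```python
from typing import Any


def unwrap_structured(result: dict[str, Any]) -> Any:
    """Return the parsed object, or raise if parsing failed."""
    error = result.get("parsing_error")
    if error is not None:
        raise ValueError(f"Structured output could not be parsed: {error}")
    return result["parsed"]


# Stand-in for model.with_structured_output(Movie, include_raw=True).invoke(...);
# a real call would put an AIMessage under "raw" and a Movie under "parsed".
fake_result = {
    "raw": "<AIMessage placeholder>",
    "parsed": {"title": "Inception", "year": 2010},
    "parsing_error": None,
}

movie = unwrap_structured(fake_result)
print(movie["title"])
```

Keeping the raw message is mainly useful for inspecting token usage or for debugging a response that failed to parse.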

In [61]:
from pydantic import BaseModel, Field

# Nested model: each entry in MovieDetails.actors is validated as an Actor
class Actor(BaseModel):
    name:str
    role:str

class MovieDetails(BaseModel):
    title:str=Field(description="The title of the movie")
    year:int=Field(description="The year of the movie")
    director:str=Field(description="The director of the movie")
    rating:float=Field(description="The rating of the movie out of 10")
    actors:list[Actor]
    # Optional field: defaults to None when the model does not supply a budget
    budget:float|None=Field(None,description="Budget in million USD")


In [62]:
model_with_structure=model.with_structured_output(MovieDetails)
model_with_structure.invoke("What's the details of the movie Avatar?")

MovieDetails(title='Avatar', year=2009, director='James Cameron', rating=8.5, actors=[Actor(name='Sam Worthington', role='Jake Sully'), Actor(name='Zoe Saldana', role='Neytiri'), Actor(name='Sigourney Weaver', role='Dr. Norma Spellman')], budget=237.0)

## Pydantic and agents

In [72]:
from pydantic import BaseModel, Field
from langchain.agents import create_agent


class ContactInfo(BaseModel):
    """Contact information for a person."""
    name: str = Field(description="The name of the person")
    email: str = Field(description="The email address of the person")
    phone: str = Field(description="The phone number of the person")

agent = create_agent(
    model="gpt-5",
    response_format=ContactInfo  # Auto-selects ProviderStrategy
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
})

result

### TypedDict
- TypedDict provides a simpler alternative using Python’s built-in typing, ideal when you don’t need runtime validation.

- The structured output is returned as a plain dictionary.

- No runtime validation is performed on the values the model returns.


In [53]:
from typing_extensions import TypedDict,Annotated

# In Annotated[type, default, description], `...` in the default slot marks the field as required (no default value)
class MovieDict(TypedDict):
    """A movie with details."""

    title: Annotated[str, ..., "The title of the movie"]
    year: Annotated[int, ..., "The year the movie was released"]
    director: Annotated[str, ..., "The director of the movie"]
    rating: Annotated[float, ..., "The movie's rating out of 10"]

model_with_structure=model.with_structured_output(MovieDict)
response=model_with_structure.invoke("Please provide the details of the movie avengers")
response


{'director': 'Joss Whedon', 'rating': 8, 'title': 'The Avengers', 'year': 2012}
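The output above illustrates the missing runtime validation: `rating` came back as `8`, an `int`, although the schema declares `float`. If such mismatches matter, a manual check is one option. The sketch below uses only the standard library, and `find_type_mismatches` is a hypothetical helper that handles plain types such as `str`, `int`, and `float` only.

```python
from typing import TypedDict, get_type_hints


class MovieDict(TypedDict):
    title: str
    year: int
    director: str
    rating: float


def find_type_mismatches(data: dict, schema: type) -> list[str]:
    """List fields that are missing or whose value is not of the declared type.

    Only handles plain classes such as str, int, and float, not generics like list[str].
    """
    problems = []
    for field, expected in get_type_hints(schema).items():
        if field not in data:
            problems.append(f"{field}: missing")
        elif not isinstance(data[field], expected):
            actual = type(data[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    return problems


# The dict returned by the model in the cell above.
response = {"director": "Joss Whedon", "rating": 8, "title": "The Avengers", "year": 2012}

# isinstance is strict: an int is not a float here, even though static
# type checkers accept an int where a float is expected.
print(find_type_mismatches(response, MovieDict))
```

When this kind of checking is needed regularly, switching the schema to a Pydantic model is usually simpler than maintaining a helper like this.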

In [68]:
from typing import TypedDict, List

class Actor(TypedDict):
    name: str
    role: str

# total=False means no field is required
class MovieDetails(TypedDict, total=False):
    title: str
    year: int
    director: str
    rating: float
    actors: List[Actor]
    budget: float


model_with_structure=model.with_structured_output(MovieDetails)
response=model_with_structure.invoke("Provide the details of the movie The Avengers (2012)")
response


{'actors': [{'name': 'Robert Downey Jr.', 'role': 'Tony Stark / Iron Man'},
  {'name': 'Chris Evans', 'role': 'Steve Rogers / Captain America'},
  {'name': 'Mark Ruffalo', 'role': 'Bruce Banner / Hulk'},
  {'name': 'Chris Hemsworth', 'role': 'Thor'},
  {'name': 'Scarlett Johansson', 'role': 'Natasha Romanoff / Black Widow'},
  {'name': 'Jeremy Renner', 'role': 'Clint Barton / Hawkeye'}],
 'budget': 220000000,
 'director': 'Joss Whedon',
 'rating': 8.1,
 'title': 'The Avengers',
 'year': 2012}
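`total=False` makes every key declared in that class optional. When only some keys should be optional, one approach is to split the schema into a required base class and an optional subclass. This is a standard-library sketch, and `MovieCore` is an illustrative name that does not appear in the notebook.

```python
from typing import TypedDict


class MovieCore(TypedDict):
    # total defaults to True, so these keys are required
    title: str
    year: int


class MovieDetails(MovieCore, total=False):
    # total=False applies only to the keys declared in this class body
    rating: float
    budget: float


print(sorted(MovieDetails.__required_keys__))
print(sorted(MovieDetails.__optional_keys__))

# Omitting the optional keys still gives a valid MovieDetails for a type checker.
partial: MovieDetails = {"title": "The Avengers", "year": 2012}
```

The `__required_keys__` and `__optional_keys__` attributes are available on TypedDict classes from Python 3.9 onward.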

## TypedDict and agents

In [75]:
from typing_extensions import TypedDict
from langchain.agents import create_agent


class ContactInfo(TypedDict):
    """Contact information for a person."""
    name: str # The name of the person
    email: str # The email address of the person
    phone: str # The phone number of the person

agent = create_agent(
    model="groq:llama-3.1-8b-instant",
    response_format=ContactInfo  # Strategy is auto-selected: ProviderStrategy if the model supports native structured output, otherwise ToolStrategy
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
})

result["structured_response"]

{'name': 'John Doe', 'email': 'john@example.com', 'phone': '(555) 123-4567'}

## Dataclasses and agents

In [70]:
## Dataclass

from dataclasses import dataclass
from langchain.agents import create_agent

@dataclass
class ContactInfo:
    """Contact information for a person."""
    name: str # The name of the person
    email: str # The email address of the person
    phone: str # The phone number of the person


agent = create_agent(
    model="groq:llama-3.1-8b-instant",
    response_format=ContactInfo  # Strategy is auto-selected: ProviderStrategy if the model supports native structured output, otherwise ToolStrategy
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
})

result

{'messages': [HumanMessage(content='Extract contact info from: John Doe, john@example.com, (555) 123-4567', additional_kwargs={}, response_metadata={}, id='47df9175-553a-4049-b5a3-8ce285487e3d'),
  AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'nxjkwvvs3', 'function': {'arguments': '{"email":"john@example.com","name":"John Doe","phone":"(555) 123-4567"}', 'name': 'ContactInfo'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 31, 'prompt_tokens': 262, 'total_tokens': 293, 'completion_time': 0.037500328, 'completion_tokens_details': None, 'prompt_time': 0.016488141, 'prompt_tokens_details': None, 'queue_time': 0.054469978, 'total_time': 0.053988469}, 'model_name': 'llama-3.1-8b-instant', 'system_fingerprint': 'fp_4387d3edbb', 'service_tier': 'on_demand', 'finish_reason': 'tool_calls', 'logprobs': None, 'model_provider': 'groq'}, id='lc_run--019bd0c7-b1d3-7d02-aab4-084d520fcec4-0', tool_calls=[{'name': 'ContactInfo', 'args': {'email': 'john

In [None]:
result["structured_response"]

ContactInfo(name='John Doe', email='john@example.com', phone='(555) 123-4567')
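Because the structured response here is a dataclass instance and not a dict, it usually needs converting before it can be stored or sent elsewhere. A minimal standard-library sketch follows, with a hand-built `ContactInfo` standing in for `result["structured_response"]`.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ContactInfo:
    """Contact information for a person."""
    name: str
    email: str
    phone: str


# Stand-in for result["structured_response"] from the agent call.
contact = ContactInfo(name="John Doe", email="john@example.com", phone="(555) 123-4567")

# asdict() converts the dataclass (recursively) into a plain dict that json can serialize.
payload = json.dumps(asdict(contact))
print(payload)

# Round-trip: rebuild the dataclass from the decoded JSON.
restored = ContactInfo(**json.loads(payload))
```

The round-trip relies on the generated `__init__` accepting the field names as keyword arguments, so it only works while the JSON keys match the dataclass fields.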