## Structured Output
You can ask a model to return its response in a format that matches a given schema, so the output can be parsed reliably and used in subsequent processing. LangChain supports multiple schema types (such as Pydantic models and TypedDict) and multiple methods for enforcing structured output.

## Pydantic
Pydantic models provide the richest feature set with field validation, descriptions and nested structures.

In [1]:
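import os
from langchain.chat_models import init_chat_model

# The Groq integration reads GROQ_API_KEY from the environment; fail early if it is not set
if not os.getenv("GROQ_API_KEY"):
    raise RuntimeError("Set the GROQ_API_KEY environment variable before running this notebook")
model = init_chat_model("groq:llama-3.3-70b-versatile")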

In [2]:
from pydantic import BaseModel, Field

class Movie(BaseModel):  # BaseModel is the base class for Pydantic models, so Movie is a validated data model
    # The type annotation says title must be a string; Field(description=...) adds metadata
    # that goes into the schema sent to the LLM, telling it what the field should contain
    title: str = Field(description="the title of the movie")
    year: int = Field(description="This year the movie was released")
    director: str = Field(description="The director of the movie")
    rating: float = Field(description="The movie's rating out of 10")

In [3]:
model_with_structure=model.with_structured_output(Movie)

In [4]:
model_with_structure

RunnableBinding(bound=ChatGroq(profile={'max_input_tokens': 131072, 'max_output_tokens': 32768, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': False, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x000001EE6F2D4380>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x000001EE6F59F560>, model_name='llama-3.3-70b-versatile', model_kwargs={}, groq_api_key=SecretStr('**********')), kwargs={'tools': [{'type': 'function', 'function': {'name': 'Movie', 'description': '', 'parameters': {'properties': {'title': {'description': 'the title of the movie', 'type': 'string'}, 'year': {'description': 'This year the movie was released', 'type': 'integer'}, 'director': {'description': 'The director of the movie', 'type': 'string'}, 'rating': {'description': "The movie's rating out of 10", 'type': 'number'}}, 'required': ['title'
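
The `tools` entry in the output above is derived from the schema of `Movie`. As a rough offline illustration, the JSON schema can be inspected without calling the model. This sketch assumes Pydantic v2, where `model_json_schema()` is available; it is not necessarily the exact conversion LangChain performs internally.

```python
from pydantic import BaseModel, Field


class Movie(BaseModel):
    title: str = Field(description="the title of the movie")
    year: int = Field(description="This year the movie was released")
    director: str = Field(description="The director of the movie")
    rating: float = Field(description="The movie's rating out of 10")


# Pydantic v2 API (assumption: v2 is installed)
schema = Movie.model_json_schema()

# Each field becomes a JSON-schema property carrying its type and description
for name, spec in schema["properties"].items():
    print(name, spec["type"], "-", spec["description"])

# Fields without a default are listed as required
print(schema["required"])
```

The field types (`string`, `integer`, `number`) and descriptions match what appears in the bound tool definition.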

In [5]:
model_with_structure.invoke("provide details about the movie inception")

Movie(title='Inception', year=2010, director='Christopher Nolan', rating=8.5)

In [6]:
model.invoke("provide details about the movie inception")  # without structured output the model returns a free-text AIMessage

AIMessage(content='**Inception (2010)**\n\nInception is a science fiction action film written, co-produced, and directed by Christopher Nolan. The movie explores the concept of shared dreaming, where a team of thieves, led by Cobb (Leonardo DiCaprio), navigate the subconscious mind to plant an idea instead of stealing one.\n\n**Plot**\n\nThe film begins with Cobb, a skilled extractor, who is hired by a wealthy businessman named Saito (Ken Watanabe) to perform a task known as "inception" - planting an idea in someone\'s mind instead of stealing one. Saito wants Cobb to convince Robert Fischer (Cillian Murphy), the son of a dying business magnate, to dissolve his father\'s company.\n\nTo accomplish this task, Cobb assembles a team of experts:\n\n1. Arthur (Joseph Gordon-Levitt), a point man who researches the target.\n2. Ariadne (Ellen Page), an architect who designs the dreamscapes.\n3. Eames (Tom Hardy), a forger who can impersonate people in the dream world.\n4. Saito, who joins the t

In [None]:
# Pydantic validates the type of each returned value: if the model returned a non-string value for title
# (for example a bare number), or a year that cannot be parsed as an integer, validation would fail with an error.
# A title that merely contains digits or symbols, such as "Se7en", is still a string and passes.
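
Field validation can be tried offline, without calling the model. The following is a minimal sketch; the sample values (including the rating for "Se7en") are illustrative and were not produced by the LLM.

```python
from pydantic import BaseModel, Field, ValidationError


class Movie(BaseModel):
    title: str = Field(description="the title of the movie")
    year: int = Field(description="The year the movie was released")
    director: str = Field(description="The director of the movie")
    rating: float = Field(description="The movie's rating out of 10")


# A title containing digits is still a string, so this validates
ok = Movie(title="Se7en", year=1995, director="David Fincher", rating=8.6)
print(ok.title)

# A year that cannot be interpreted as an integer is rejected
try:
    Movie(title="Inception", year="twenty ten", director="Christopher Nolan", rating=8.5)
    failed = False
except ValidationError as exc:
    failed = True
    print(exc)
```

The error is raised because of the value's type, not because of which characters a string happens to contain.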


#### Message output alongside parsed structure

In [9]:
from pydantic import BaseModel,Field
class Movie(BaseModel):
    """A movie with details"""
    title: str = Field(..., description="the title of the movie")  # ... (Ellipsis) as the default marks the field as required
    year: int = Field(..., description="The year the movie was released")
    director: str = Field(..., description="The director of the movie")
    rating: float = Field(..., description="The movie's rating out of 10")

# include_raw=True returns a dict with the raw AIMessage ('raw'), the parsed Movie ('parsed') and any parsing error ('parsing_error')
model_with_structure = model.with_structured_output(Movie, include_raw=True)
response=model_with_structure.invoke("Provide details about the movie inception")
response

{'raw': AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'ewkf319cd', 'function': {'arguments': '{"director":"Christopher Nolan","rating":8.5,"title":"Inception","year":2010}', 'name': 'Movie'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 32, 'prompt_tokens': 289, 'total_tokens': 321, 'completion_time': 0.07140735, 'completion_tokens_details': None, 'prompt_time': 0.014665531, 'prompt_tokens_details': None, 'queue_time': 0.049347308, 'total_time': 0.086072881}, 'model_name': 'llama-3.3-70b-versatile', 'system_fingerprint': 'fp_dae98b5ecb', 'service_tier': 'on_demand', 'finish_reason': 'tool_calls', 'logprobs': None, 'model_provider': 'groq'}, id='lc_run--019c2dac-aeb4-78c1-80ac-f2a5b4c9cfdb-0', tool_calls=[{'name': 'Movie', 'args': {'director': 'Christopher Nolan', 'rating': 8.5, 'title': 'Inception', 'year': 2010}, 'id': 'ewkf319cd', 'type': 'tool_call'}], invalid_tool_calls=[], usage_metadata={'input_tokens': 289, 'output_tokens': 32, 
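
With `include_raw=True` the result is a dict, so the caller decides what to do when parsing fails. The sketch below uses a hand-written stand-in for that dict: only the three top-level keys (`raw`, `parsed`, `parsing_error`) mirror LangChain, the values are simplified to plain Python objects, and `unwrap` is a hypothetical helper rather than a LangChain function.

```python
import json

# Tool-call arguments copied from the raw message shown above
arguments = '{"director":"Christopher Nolan","rating":8.5,"title":"Inception","year":2010}'

# Simplified stand-in for the include_raw=True result (assumption: real values are
# an AIMessage under "raw" and a Movie instance under "parsed")
result = {
    "raw": {"tool_calls": [{"name": "Movie", "arguments": arguments}]},
    "parsed": json.loads(arguments),
    "parsing_error": None,
}


def unwrap(result: dict):
    """Hypothetical helper: return the parsed value or raise the parsing error."""
    if result["parsing_error"] is not None:
        raise result["parsing_error"]
    return result["parsed"]


movie = unwrap(result)
print(movie["title"], movie["year"])
```

Keeping the raw message around is mainly useful for logging token usage or debugging a response that failed to parse.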

#### Nested Structure

In [10]:
from pydantic import BaseModel,Field
class Actor(BaseModel):
    name:str
    role:str

class MovieDetails(BaseModel):
    title:str
    year:int
    cast:list[Actor] #List of actors, so the actor class is nested here
    genres:list[str]
    budget: float | None = Field(None, description="Budget in millions USD")  # optional: defaults to None when no budget is supplied

In [11]:
model_with_structure=model.with_structured_output(MovieDetails)

response = model_with_structure.invoke("Provide details about the movie inception")
response

MovieDetails(title='Inception', year=2010, cast=[Actor(name='Leonardo DiCaprio', role='Cobb'), Actor(name='Joseph Gordon-Levitt', role='Arthur')], genres=['Action', 'Sci-Fi'], budget=160.0)
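
To see how nesting behaves without calling the model, a dict shaped like tool-call arguments can be validated directly. This is a minimal sketch with an illustrative payload; `Optional` and `List` are used instead of `float | None` and `list[...]` only so that it also runs on older Python versions.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class Actor(BaseModel):
    name: str
    role: str


class MovieDetails(BaseModel):
    title: str
    year: int
    cast: List[Actor]
    genres: List[str]
    budget: Optional[float] = Field(None, description="Budget in millions USD")


# Illustrative payload (budget deliberately omitted)
payload = {
    "title": "Inception",
    "year": 2010,
    "cast": [{"name": "Leonardo DiCaprio", "role": "Cobb"}],
    "genres": ["Action", "Sci-Fi"],
}

details = MovieDetails(**payload)
print(type(details.cast[0]).__name__)  # nested dicts are converted to Actor instances
print(details.budget)                  # falls back to the default
```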

#### TypedDict
TypedDict provides a simpler alternative based on Python's standard typing tools, ideal when you don't need runtime validation.

In [14]:
from typing_extensions import TypedDict,Annotated

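class MovieDict(TypedDict):  # no runtime validation: a value of the wrong type (e.g. an integer title) does not raise an error
    """A movie with details."""
    # LangChain reads Annotated[type, default, description]; ... as the default marks the field as required
    title: Annotated[str, ..., "The title of the movie"]
    year: Annotated[int, ..., "The year the movie was released"]
    # NOTE: director is annotated as int here although str is the natural type; the schema therefore
    # asks for an integer, which is presumably why the output below contains 'director': 1
    director: Annotated[int, ..., "The director of the movie"]
    rating: Annotated[float, ..., "The movie's rating out of 10"]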

In [16]:
model_withtypedict=model.with_structured_output(MovieDict)
response=model_withtypedict.invoke("Please provide the details of the movie avengers")
response

{'director': 1, 'rating': 8.1, 'title': 'Avengers', 'year': 2012}
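
The lack of runtime validation can be shown with the standard library alone. In this minimal sketch the wrong types are intentional, and the two-field `MovieDict` is a shortened version of the class above.

```python
from typing import Annotated, TypedDict, get_type_hints


class MovieDict(TypedDict):
    """A movie with details."""
    title: Annotated[str, ..., "The title of the movie"]
    year: Annotated[int, ..., "The year the movie was released"]


# Wrong types on purpose: a TypedDict is a plain dict at runtime, so nothing is raised
movie: MovieDict = {"title": 123, "year": "not a year"}
print(type(movie).__name__)

# The Annotated metadata (default marker and description) stays available for tools to read
hints = get_type_hints(MovieDict, include_extras=True)
print(hints["title"].__metadata__)
```

The annotations only describe the expected shape. A static type checker would flag the assignment, but Python itself does not.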

In [19]:
#I can use nested structures here also
class Actor(TypedDict):
    name:str
    role:str

class MovieDetails(TypedDict):
    title:str
    year:int
    cast:list[Actor] #List of actors, so the actor class is nested here
    genres:list[str]
    budget: Annotated[float | None, None, "Budget in millions USD"]  # a TypedDict cannot use pydantic's Field; the default (None) and description go in Annotated


In [20]:
model_with_structure=model.with_structured_output(MovieDetails)
response=model_with_structure.invoke("Provide me details about inception")
response

{'budget': 160000000,
 'cast': [{'name': 'Leonardo DiCaprio', 'role': 'Cobb'},
  {'name': 'Joseph Gordon-Levitt', 'role': 'Arthur'}],
 'genres': ['Action', 'Sci-Fi'],
 'title': 'Inception',
 'year': 2010}

In [None]:
# model_withtypedict.profile  # raises an error: with_structured_output returns a wrapped runnable, not the chat model itself
model.profile  # works on the chat model itself
# The profile lists the capabilities of the LLM that was initialised

{'max_input_tokens': 131072,
 'max_output_tokens': 32768,
 'image_inputs': False,
 'audio_inputs': False,
 'video_inputs': False,
 'image_outputs': False,
 'audio_outputs': False,
 'video_outputs': False,
 'reasoning_output': False,
 'tool_calling': True}
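
Since the structured output in this notebook is delivered through tool calls, the `tool_calling` entry is the one worth checking before relying on it. A small sketch of such a check follows; the dict is copied in part from the output above (in a live session it would come from `model.profile`), and `supports` is a hypothetical helper.

```python
# Subset of the profile shown above
profile = {
    "max_input_tokens": 131072,
    "max_output_tokens": 32768,
    "image_inputs": False,
    "tool_calling": True,
}


def supports(profile: dict, capability: str) -> bool:
    """Hypothetical helper: treat a missing or falsy entry as unsupported."""
    return bool(profile.get(capability, False))


print(supports(profile, "tool_calling"))
print(supports(profile, "image_inputs"))
```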