## Chat Models - Output Parsing


Language models output text, but you often want something more structured than plain text back. This is where output parsers come in.

Output parsers are classes that help structure language model responses. There are two main methods an output parser must implement:

- "Get format instructions": A method which returns a string containing instructions for how the output of a language model should be formatted.
- "Parse": A method which takes in a string (assumed to be the response from a language model) and parses it into some structure.

And then one optional one:

- "Parse with prompt": A method which takes in a string (assumed to be the response from a language model) and a prompt (assumed to be the prompt that generated that response) and parses it into some structure. The prompt is provided mainly so that the output parser can retry or fix the output in some way when it needs information from the prompt to do so.
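
To make the three methods concrete, here is a minimal, dependency-free sketch of a parser with that shape. The class `CommaSeparatedListParser` and the variable `list_parser` are hypothetical names invented for this illustration. The class does not subclass any LangChain base class; it only mirrors the interface described above.

```python
from typing import List


class CommaSeparatedListParser:
    """Hypothetical parser for illustration only (not a LangChain class)."""

    def get_format_instructions(self) -> str:
        # Text that would be injected into the prompt to tell the model
        # how to format its answer.
        return "Your response should be a list of comma separated values, e.g. `foo, bar, baz`"

    def parse(self, text: str) -> List[str]:
        # Split on commas and drop surrounding whitespace and empty items.
        items = [part.strip() for part in text.split(",")]
        items = [item for item in items if item]
        if not items:
            raise ValueError(f"No items found in response: {text!r}")
        return items

    def parse_with_prompt(self, completion: str, prompt: str) -> List[str]:
        # The prompt is only used here to enrich the error message; a real
        # implementation might use it to ask the model to try again.
        try:
            return self.parse(completion)
        except ValueError as exc:
            raise ValueError(
                f"Could not parse the response to prompt {prompt!r}: {exc}"
            ) from exc


list_parser = CommaSeparatedListParser()
colors = list_parser.parse("red, green , blue")
print(colors)
```

In this sketch `parse_with_prompt` simply delegates to `parse`, which is a reasonable default when the parser has no retry logic of its own.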


In [1]:
%pip install langchain langchain_openai --upgrade

Collecting langchain
  Downloading langchain-0.3.1-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain_openai
  Downloading langchain_openai-0.2.1-py3-none-any.whl.metadata (2.6 kB)
Collecting langchain-core<0.4.0,>=0.3.6 (from langchain)
  Downloading langchain_core-0.3.6-py3-none-any.whl.metadata (6.3 kB)
Collecting langchain-text-splitters<0.4.0,>=0.3.0 (from langchain)
  Downloading langchain_text_splitters-0.3.0-py3-none-any.whl.metadata (2.3 kB)
Collecting langsmith<0.2.0,>=0.1.17 (from langchain)
  Downloading langsmith-0.1.129-py3-none-any.whl.metadata (13 kB)
Collecting tenacity!=8.4.0,<9.0.0,>=8.1.0 (from langchain)
  Downloading tenacity-8.5.0-py3-none-any.whl.metadata (1.2 kB)
Collecting openai<2.0.0,>=1.40.0 (from langchain_openai)
  Downloading openai-1.50.2-py3-none-any.whl.metadata (24 kB)
Collecting tiktoken<1,>=0.7 (from langchain_openai)
  Downloading tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)
Collecting jsonpatch<

In [2]:
import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()

··········


In [3]:
from langchain_core.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain_openai.chat_models import ChatOpenAI

from langchain.output_parsers import PydanticOutputParser
# LangChain 0.3 uses Pydantic v2 internally, so import the models from `pydantic` directly.
from pydantic import BaseModel, Field

In [4]:
# chat = ChatOpenAI(openai_api_key="...")

# If you have an environment variable set for OPENAI_API_KEY, you can just do:
chat = ChatOpenAI(temperature=0)

In [5]:
from typing import List


class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")


class Jokes(BaseModel):
    jokes: List[Joke] = Field(description="list of jokes")

In [10]:
import pydantic
print(pydantic.__version__)


2.9.2


In [12]:
%pip install --upgrade pydantic




In [15]:
from typing import List
from pydantic import BaseModel, Field

class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

class Jokes(BaseModel):
    jokes: List[Joke] = Field(description="list of jokes")

# Get the schema for Pydantic v2.x
joke_schema = Jokes.model_json_schema()

print(joke_schema)


{'$defs': {'Joke': {'properties': {'setup': {'description': 'question to set up a joke', 'title': 'Setup', 'type': 'string'}, 'punchline': {'description': 'answer to resolve the joke', 'title': 'Punchline', 'type': 'string'}}, 'required': ['setup', 'punchline'], 'title': 'Joke', 'type': 'object'}}, 'properties': {'jokes': {'description': 'list of jokes', 'items': {'$ref': '#/$defs/Joke'}, 'title': 'Jokes', 'type': 'array'}}, 'required': ['jokes'], 'title': 'Jokes', 'type': 'object'}


In [16]:
# Set up a parser + inject instructions into the prompt template.
parser = PydanticOutputParser(pydantic_object=Jokes)

In [17]:
template = "Answer the user query.\n{format_instructions}\n{query}\n"
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt])

In [18]:
parser.get_format_instructions()

'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"$defs": {"Joke": {"properties": {"setup": {"description": "question to set up a joke", "title": "Setup", "type": "string"}, "punchline": {"description": "answer to resolve the joke", "title": "Punchline", "type": "string"}}, "required": ["setup", "punchline"], "title": "Joke", "type": "object"}}, "properties": {"jokes": {"description": "list of jokes", "items": {"$ref": "#/$defs/Joke"}, "title": "Jokes", "type": "array"}}, "required": ["jokes"]}\n```'
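
The string above is a fixed preamble followed by the model's JSON schema. As a rough sketch of how such instructions could be assembled (this is not LangChain's actual implementation), the hypothetical helper `build_format_instructions` below wraps a schema dictionary in a shortened version of that preamble. The tiny `example_schema` is invented for illustration. Dropping the top-level `title` and `type` keys is an assumption based on the output shown above, where the schema ends at `"required": ["jokes"]`.

```python
import json

FENCE = "`" * 3  # built programmatically so the example is easy to embed


def build_format_instructions(schema: dict) -> str:
    """Hypothetical helper: wrap a JSON schema in plain-language instructions."""
    # Assumption: the top-level "title" and "type" keys are omitted, as the
    # output above suggests, so this sketch drops them too.
    reduced = {k: v for k, v in schema.items() if k not in ("title", "type")}
    return (
        "The output should be formatted as a JSON instance that conforms "
        "to the JSON schema below.\n\n"
        f"Here is the output schema:\n{FENCE}\n{json.dumps(reduced)}\n{FENCE}"
    )


# A deliberately tiny schema, invented for illustration.
example_schema = {
    "title": "Answer",
    "type": "object",
    "properties": {"text": {"type": "string"}},
    "required": ["text"],
}
instructions = build_format_instructions(example_schema)
print(instructions)
```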

In [19]:
# Format the chat prompt:
messages = chat_prompt.format_prompt(
    format_instructions=parser.get_format_instructions(),
    query="What's really funny about Python programming?",
).to_messages()

In [20]:
result = chat.invoke(messages)

In [21]:
print(result.content)

{
    "jokes": [
        {
            "setup": "Why do programmers prefer Python over C++?",
            "punchline": "Because it has fewer bugs!"
        }
    ]
}


In [22]:
joke_pydantic_object = parser.parse(result.content)
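
Conceptually, the parse step decodes the JSON in the response and checks it against the expected structure. The sketch below illustrates that idea using only the standard library, with dataclasses standing in for the Pydantic models. The names `parse_jokes`, `ParseError`, and `response` are hypothetical, and this is not how `PydanticOutputParser` is implemented. Unlike Pydantic, the dataclasses here check only that the expected fields are present, not their types.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Joke:
    setup: str
    punchline: str


@dataclass
class Jokes:
    jokes: List[Joke]


class ParseError(ValueError):
    """Raised when a model response does not match the expected structure."""


def parse_jokes(text: str) -> Jokes:
    # Models sometimes wrap JSON in extra prose, so keep only the span from
    # the first "{" to the last "}" before decoding.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ParseError(f"No JSON object found in: {text!r}")
    try:
        data = json.loads(text[start : end + 1])
    except json.JSONDecodeError as exc:
        raise ParseError(f"Invalid JSON: {exc}") from exc
    try:
        # Missing or unexpected keys surface as KeyError or TypeError.
        return Jokes(jokes=[Joke(**item) for item in data["jokes"]])
    except (KeyError, TypeError) as exc:
        raise ParseError(f"JSON does not match the Jokes structure: {exc}") from exc


# Made-up response text, used only to exercise the sketch.
response = 'Sure! {"jokes": [{"setup": "Why?", "punchline": "Because."}]}'
parsed = parse_jokes(response)
print(parsed.jokes[0].punchline)
```

Raising a single error type for every failure mode makes it straightforward for calling code to catch the failure and retry or repair the response.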

In [23]:
try:
    print(joke_pydantic_object.model_dump())
except AttributeError:
    print(joke_pydantic_object.dict())

{'jokes': [{'setup': 'Why do programmers prefer Python over C++?', 'punchline': 'Because it has fewer bugs!'}]}


In [24]:
joke_pydantic_object.jokes

[Joke(setup='Why do programmers prefer Python over C++?', punchline='Because it has fewer bugs!')]

In [25]:
joke_pydantic_object.jokes[0].punchline

'Because it has fewer bugs!'