## Chat Models - Output Parsing


Language models output text, but you will often want more structured information back than plain text. This is where output parsers come in.

Output parsers are classes that help structure language model responses. An output parser must implement two main methods:

- "Get format instructions": A method that returns a string containing instructions for how the output of a language model should be formatted.
- "Parse": A method that takes in a string (assumed to be the response from a language model) and parses it into some structure.

There is also one optional method:

- "Parse with prompt": A method that takes in a string (assumed to be the response from a language model) and a prompt (assumed to be the prompt that generated that response) and parses it into some structure. The prompt is provided mainly in case the output parser wants to retry or fix the output in some way and needs information from the prompt to do so.

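The three methods described above can be sketched in plain Python. This is only an illustration of the interface shape: `CommaSeparatedListParser` and `list_parser` are hypothetical names invented for this example, and the class deliberately does not subclass anything from LangChain.

```python
from typing import List


class CommaSeparatedListParser:
    """Hypothetical parser (not a LangChain class): turns 'a, b, c' into ['a', 'b', 'c']."""

    def get_format_instructions(self) -> str:
        # Text meant to be placed in the prompt so the model knows the expected format.
        return "Your response should be a list of comma separated values, e.g. `foo, bar, baz`"

    def parse(self, text: str) -> List[str]:
        # Split the raw model response, trim whitespace, and drop empty items.
        return [item.strip() for item in text.split(",") if item.strip()]

    def parse_with_prompt(self, completion: str, prompt: str) -> List[str]:
        # The prompt is available here in case a parser wants to retry or repair
        # the output. This sketch ignores it and defers to parse().
        return self.parse(completion)


list_parser = CommaSeparatedListParser()
prompt = "List three colors.\n" + list_parser.get_format_instructions()
colors = list_parser.parse("red, green , blue")
print(colors)
```

The rest of this notebook uses LangChain's `PydanticOutputParser`, which follows the same pattern but derives its format instructions from a Pydantic model.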

In [None]:
import os
%pip install langchain
os.environ['OPENAI_API_KEY'] = 'API_KEY_HERE'

In [1]:
from langchain.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.chat_models import ChatOpenAI

from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

In [2]:
# chat = ChatOpenAI(openai_api_key="...")

# If you have an environment variable set for OPENAI_API_KEY, you can just do:
chat = ChatOpenAI(temperature=0)

In [18]:
from typing import List


class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")


class Jokes(BaseModel):
    jokes: List[Joke] = Field(description="list of jokes")

In [19]:
# Set up a parser. Its format instructions are injected into the prompt template below.
parser = PydanticOutputParser(pydantic_object=Jokes)

In [20]:
template = "Answer the user query.\n{format_instructions}\n{query}\n"
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt])

In [21]:
parser.get_format_instructions()

'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"jokes": {"title": "Jokes", "description": "list of jokes", "type": "array", "items": {"$ref": "#/definitions/Joke"}}}, "required": ["jokes"], "definitions": {"Joke": {"title": "Joke", "type": "object", "properties": {"setup": {"title": "Setup", "description": "question to set up a joke", "type": "string"}, "punchline": {"title": "Punchline", "description": "answer to resolve the joke", "type": "string"}}, "required": ["setup", "punchline"]}}}\n```'

In [22]:
# Format the chat prompt:
messages = chat_prompt.format_prompt(
    format_instructions=parser.get_format_instructions(),
    query="What do you call a pig that does karate?",
).to_messages()

In [23]:
result = chat(messages)

In [24]:
print(result.content)

{"jokes": [{"setup": "What do you call a pig that does karate?", "punchline": "A pork chop!"}]}


In [25]:
joke_pydantic_object = parser.parse(result.content)

In [26]:
joke_pydantic_object.dict()

{'jokes': [{'setup': 'What do you call a pig that does karate?',
   'punchline': 'A pork chop!'}]}

In [28]:
joke_pydantic_object.jokes

[Joke(setup='What do you call a pig that does karate?', punchline='A pork chop!')]

In [29]:
joke_pydantic_object.jokes[0].punchline

'A pork chop!'
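
Model output is not guaranteed to follow the format instructions, so the parse step can fail. The sketch below uses only the standard library to show the general idea of decoding a response and validating its shape. It is an assumption about what such a step involves, not LangChain's implementation; `JokeRecord` and `parse_jokes` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class JokeRecord:
    setup: str
    punchline: str


def parse_jokes(text: str) -> List[JokeRecord]:
    """Decode a response shaped like {"jokes": [{"setup": ..., "punchline": ...}]}."""
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        # Surface malformed JSON as a single, predictable error type.
        raise ValueError(f"Response is not valid JSON: {exc}") from exc

    if not isinstance(payload, dict) or not isinstance(payload.get("jokes"), list):
        raise ValueError('Response is missing the "jokes" list')

    records = []
    for item in payload["jokes"]:
        # Each joke must be an object carrying both required fields.
        if not isinstance(item, dict) or not {"setup", "punchline"}.issubset(item):
            raise ValueError("Each joke needs both a setup and a punchline")
        records.append(JokeRecord(setup=str(item["setup"]), punchline=str(item["punchline"])))
    return records


response = '{"jokes": [{"setup": "What do you call a pig that does karate?", "punchline": "A pork chop!"}]}'
parsed = parse_jokes(response)
print(parsed[0].punchline)
```

Raising a single error type for every failure mode makes it easier for calling code to decide whether to retry the request or attempt a repair, which is the situation the optional "parse with prompt" method is meant for.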