# Output Parsers
Language models output text. Often, though, you want structured information back rather than free-form text. This is where output parsers come in.

Output parsers are classes that help structure language model responses. There are two main methods an output parser must implement:

- "Get format instructions": A method which returns a string containing instructions for how the output of a language model should be formatted.
- "Parse": A method which takes in a string (assumed to be the response from a language model) and parses it into some structure.

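To make the two methods concrete, here is a minimal sketch of a parser that exposes both. The class `CommaListParser`, the prompt text, and the stand-in response are hypothetical and written only for illustration; this is not a LangChain class, and no model is called.

```python
from typing import List


class CommaListParser:
    """Hypothetical parser (not part of LangChain) exposing the two methods."""

    def get_format_instructions(self) -> str:
        # Text intended to be appended to a prompt so the model knows the target format.
        return "Respond with a comma-separated list of values, e.g. `a, b, c`."

    def parse(self, text: str) -> List[str]:
        # Split the raw response on commas, trim whitespace, and drop empty items.
        return [item.strip() for item in text.strip().split(",") if item.strip()]


list_parser = CommaListParser()

# The format instructions are simply appended to the question.
prompt_text = "List three primary colors.\n" + list_parser.get_format_instructions()

# Stand-in for a model response; a real response may look different.
fake_response = "red, yellow, blue\n"
colors = list_parser.parse(fake_response)
print(colors)
```

The same pattern applies to richer formats: the instructions describe the format to the model, and `parse` assumes the model followed them.
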
In [1]:
!pip install python-dotenv langchain openai



# Pydantic (JSON) parser
This output parser allows users to specify an arbitrary JSON schema and query LLMs for JSON outputs that conform to that schema.

In [2]:
from langchain.prompts import (
    PromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI

from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field, validator
from typing import List

# Completion model used for the examples; temperature 0.0 keeps sampling as stable as possible.
model_name = "text-davinci-003"
temperature = 0.0
model = OpenAI(model_name=model_name, temperature=temperature)

Example with a compound-typed field (a list of strings):

In [3]:
class Actor(BaseModel):
    name: str = Field(description="name of an actor")
    film_names: List[str] = Field(description="list of names of films they starred in")


actor_query = "Generate the filmography for a random actor."

parser = PydanticOutputParser(pydantic_object=Actor)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

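# Render the prompt: the query is inserted next to the parser's format instructions.
_input = prompt.format_prompt(query=actor_query)

# Call the model with the rendered prompt string.
output = model(_input.to_string())

# Parse the raw completion into an Actor instance.
parser.parse(output)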

Actor(name='Tom Hanks', film_names=['Forrest Gump', 'Saving Private Ryan', 'The Green Mile', 'Cast Away', 'Toy Story'])
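
As a rough illustration of what the parsing step involves, the sketch below pulls a JSON object out of a raw completion and checks it against the same two fields as `Actor`. It uses only the standard library; `ActorRecord`, `parse_actor`, and the sample completion are hypothetical stand-ins, and this is not how `PydanticOutputParser` is implemented.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ActorRecord:
    # Plain dataclass stand-in with the same fields as the Actor model above.
    name: str
    film_names: List[str]


def parse_actor(text: str) -> ActorRecord:
    """Take the span from the first '{' to the last '}' and validate its fields."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(text[start : end + 1])
    if not isinstance(data.get("name"), str):
        raise ValueError("'name' must be a string")
    films = data.get("film_names")
    if not isinstance(films, list) or not all(isinstance(f, str) for f in films):
        raise ValueError("'film_names' must be a list of strings")
    return ActorRecord(name=data["name"], film_names=films)


# Hypothetical raw completion with some text around the JSON; real output may differ.
raw = 'Here is the result:\n{"name": "Tom Hanks", "film_names": ["Forrest Gump", "Cast Away"]}'
actor = parse_actor(raw)
print(actor)
```

Raising an error on malformed output, rather than returning partial data, lets the caller decide whether to retry the request or fail.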

Compare this with the output straight from the LLM when the prompt contains no format instructions:

In [9]:
# Create a prompt template that omits the format instructions.
no_formatting_template = PromptTemplate(
    template="Answer the user query.\n{query}\n",
    input_variables=["query"],
)

_input = no_formatting_template.format_prompt(query=actor_query)

output = model(_input.to_string())

In [10]:
print(output)


Filmography for Random Actor:
1. The Departed (2006)
2. The Aviator (2004)
3. Gangs of New York (2002)
4. Catch Me If You Can (2002)
5. Shutter Island (2010)
6. The Wolf of Wall Street (2013)
7. The Irishman (2019)
8. Hugo (2011)
9. The Great Gatsby (2013)
10. Django Unchained (2012)
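
Without format instructions, recovering structure from a response like the one above means writing ad hoc parsing code. The sketch below does this for the first three lines of that output; the regular expression and the `extract_films` helper are assumptions made for illustration, and they would break if the model chose a different layout on another run.

```python
import re
from typing import List, Tuple

# First lines of the unformatted output shown above.
raw_output = """Filmography for Random Actor:
1. The Departed (2006)
2. The Aviator (2004)
3. Gangs of New York (2002)
"""

# Matches lines such as "1. The Departed (2006)".
LINE_PATTERN = re.compile(r"^\s*\d+\.\s*(?P<title>.+?)\s*\((?P<year>\d{4})\)\s*$")


def extract_films(text: str) -> List[Tuple[str, int]]:
    """Return (title, year) pairs for every line matching the numbered-list layout."""
    films = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            films.append((match.group("title"), int(match.group("year"))))
    return films


films = extract_films(raw_output)
print(films)
```

Note that this free-form output never names the actor, so the `name` field could not be recovered at all, which is part of why an explicit schema in the prompt is useful.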
