# Output Parsers
Language models output text, but you will often want something more structured than a raw string. This is where output parsers come in.

Output parsers are classes that help structure language model responses. There are two main methods an output parser must implement:

- `get_format_instructions`: A method that returns a string containing instructions for how the output of a language model should be formatted.
- `parse`: A method that takes a string (assumed to be the response from a language model) and parses it into some structure.
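
As a rough illustration of that two-method contract, here is a minimal sketch in plain Python. `YesNoParser` is a hypothetical class written for this example and is not part of LangChain; it only mirrors the shape of the two methods described above.

```python
class YesNoParser:
    """Hypothetical parser that turns a yes/no reply into a bool."""

    def get_format_instructions(self) -> str:
        # Text meant to be added to the prompt so the model knows the expected format.
        return "Answer with a single word: 'yes' or 'no'."

    def parse(self, text: str) -> bool:
        # Normalize the raw reply (whitespace, trailing punctuation, case) before reading it.
        cleaned = text.strip().strip(".!").lower()
        if cleaned not in {"yes", "no"}:
            raise ValueError(f"Expected 'yes' or 'no', got: {text!r}")
        return cleaned == "yes"


parser = YesNoParser()
print(parser.get_format_instructions())
print(parser.parse(" Yes.\n"))
```

The instructions go into the prompt and `parse` runs on whatever comes back, which is the same division of labor the parsers below follow.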

In [1]:
from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv

In [2]:
load_dotenv()

True

In [3]:
model = ChatOpenAI(temperature=0)

### CSV Parser
This output parser can be used when you want to return a list of comma-separated items.

In [4]:
output_parser = CommaSeparatedListOutputParser()

format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
    template="List five places {places}.\n{format_instructions}",
    input_variables=["places"],
    partial_variables={"format_instructions": format_instructions},
)

In [5]:
chain = prompt | model | output_parser

In [6]:
chain.invoke({"places": "for summer tourism in India"})

['Goa', 'Manali', 'Jaipur', 'Kerala', 'Rishikesh']
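
To see what each stage of the chain contributes, the sketch below replays the same flow with plain functions and no API call. `build_prompt`, `fake_model`, and `parse_comma_separated` are hypothetical stand-ins: the model step returns a canned reply, the instruction text is a placeholder, and the parsing step simply splits on commas, which may differ from what `CommaSeparatedListOutputParser` does internally.

```python
from typing import List


def build_prompt(places: str, format_instructions: str) -> str:
    # Same template as the prompt above, filled in by hand.
    return f"List five places {places}.\n{format_instructions}"


def fake_model(prompt: str) -> str:
    # Stand-in for the chat model: ignores the prompt and returns a canned reply.
    return "Goa, Manali, Jaipur, Kerala, Rishikesh"


def parse_comma_separated(text: str) -> List[str]:
    # Split on commas and drop surrounding whitespace and empty items.
    return [item.strip() for item in text.split(",") if item.strip()]


# Placeholder instruction text, assumed for this example.
instructions = "Your response should be a list of comma separated values."
prompt_text = build_prompt("for summer tourism in India", instructions)
result = parse_comma_separated(fake_model(prompt_text))
print(result)
```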

### JSON Parser
This output parser allows users to specify a JSON schema and query LLMs for outputs that conform to that schema.
Keep in mind that large language models are leaky abstractions! You’ll need an LLM with sufficient capacity to generate well-formed JSON. In the OpenAI family, DaVinci can do this reliably, but Curie’s ability already drops off dramatically.
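
A reply can fail to be well-formed JSON in several ways, for example by being cut off mid-string, and models also tend to wrap JSON in a Markdown code fence. The sketch below shows one way such replies could be handled with the standard library alone. `extract_json` is a hypothetical helper written for this example, not the logic `JsonOutputParser` uses.

```python
import json
import re

# Three backticks, built indirectly so this example can sit inside a fenced block.
FENCE = "`" * 3


def extract_json(reply: str) -> dict:
    """Return the JSON object contained in a model reply, or raise ValueError."""
    # If the reply wraps its JSON in a Markdown code fence, keep only the fenced part.
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, reply, re.DOTALL)
    candidate = match.group(1) if match else reply
    try:
        data = json.loads(candidate.strip())
    except json.JSONDecodeError as exc:
        raise ValueError(f"Reply is not well-formed JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    return data


good_reply = "Sure!\n" + FENCE + 'json\n{"place": "Leh-Ladakh"}\n' + FENCE
print(extract_json(good_reply))

truncated_reply = '{"place": "Leh-'
try:
    extract_json(truncated_reply)
except ValueError as err:
    print("Rejected:", err)
```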

The following example uses Pydantic to declare the data model.

In [7]:
from langchain_core.output_parsers import JsonOutputParser

In [9]:
from pydantic import BaseModel, Field

# Define your desired data structure.
class Travel(BaseModel):
    place: str = Field(description="name of the place")
    description: str = Field(description="description of the place")
    activities: str = Field(description="what to do in that place")

In [40]:
# And a query intended to prompt a language model to populate the data structure.
travel_query = "Suggest a place in India for going on a trip this summer to avoid heat."

# Set up a parser + inject instructions into the prompt template.
parser = JsonOutputParser(pydantic_object=Travel)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | model | parser

chain.invoke({"query": travel_query})

{'place': 'Leh-Ladakh',
 'description': 'A high-altitude desert region in the northern part of India, known for its stunning landscapes, Buddhist monasteries, and adventure activities like trekking and river rafting.',
 'activities': 'Explore the monasteries, go trekking in the Himalayas, visit Pangong Lake, and experience the unique culture of the region.'}

#### Without Pydantic

In [10]:
travel_query = "Suggest a place in India for going on a trip this summer to avoid heat."

parser = JsonOutputParser()

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | model | parser

chain.invoke({"query": travel_query})

{'destination': 'Leh-Ladakh, Jammu and Kashmir',
 'description': 'Leh-Ladakh is a high-altitude desert region in the northernmost part of India, known for its stunning landscapes, Buddhist monasteries, and adventurous activities like trekking and river rafting. The weather in Leh-Ladakh during the summer months is cool and pleasant, making it an ideal destination to escape the heat.',
 'highlights': ['Visit the ancient monasteries like Hemis, Thiksey, and Diskit',
  'Explore the breathtaking Pangong Lake and Nubra Valley',
  'Embark on a thrilling road trip on the scenic Manali-Leh Highway']}
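
Note that without a schema the choice of keys is left to the model: this run returned `destination` and `highlights` rather than the `place` and `activities` fields of the `Travel` model. Downstream code that relies on specific fields may want a guard. A minimal sketch follows, where `missing_keys` is a hypothetical helper and not a LangChain function.

```python
from typing import Dict, Iterable, List


def missing_keys(result: Dict[str, object], required: Iterable[str]) -> List[str]:
    # Report which expected fields the parsed reply does not contain.
    return [key for key in required if key not in result]


# Abbreviated copy of the schema-less output above.
result = {
    "destination": "Leh-Ladakh, Jammu and Kashmir",
    "description": "Leh-Ladakh is a high-altitude desert region",
    "highlights": ["Explore the breathtaking Pangong Lake and Nubra Valley"],
}

print(missing_keys(result, ["place", "description", "activities"]))
```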

### Structured Output Parser
This output parser can be used when you want to return multiple fields. While the Pydantic/JSON parser is more powerful, this one is useful for less powerful models.

In [11]:
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

In [12]:
response_schemas = [
    ResponseSchema(name="answer", description="answer to the user's question"),
    ResponseSchema(name="description", description="detailed description on the answer topic"),
    ResponseSchema(
        name="applications",
        description="real world applications of the answer topic",
    ),
]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

In [13]:
format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
    template="Answer the user's question as best as possible.\n{format_instructions}\n{question}",
    input_variables=["question"],
    partial_variables={"format_instructions": format_instructions},
)

In [14]:
chain = prompt | model | output_parser

chain.invoke({"question": "Name an invention in Healthcare that has caused revolution in twenty first century."})

{'answer': 'CRISPR-Cas9 gene editing technology',
 'description': 'CRISPR-Cas9 is a revolutionary gene-editing tool that allows scientists to precisely modify genes within living organisms. It has the potential to treat genetic disorders, develop new therapies, and even create genetically modified organisms.',
 'applications': 'CRISPR-Cas9 has applications in gene therapy, agriculture, drug development, and disease research. It has the potential to revolutionize healthcare by providing personalized treatments for genetic diseases.'}
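
To make the role of the response schemas more concrete, here is a sketch of how a list of field names and descriptions could be rendered into format instructions. `build_format_instructions` and its wording are assumptions made for this example; the text that `StructuredOutputParser.get_format_instructions()` actually produces may look different.

```python
from typing import List, Tuple


def build_format_instructions(fields: List[Tuple[str, str]]) -> str:
    # One line per field: the key the model should emit, plus a hint about its content.
    lines = [f'  "{name}": string  // {description}' for name, description in fields]
    body = "\n".join(lines)
    return "Reply with a JSON object containing exactly these keys:\n{\n" + body + "\n}"


fields = [
    ("answer", "answer to the user's question"),
    ("description", "detailed description on the answer topic"),
    ("applications", "real world applications of the answer topic"),
]

instructions = build_format_instructions(fields)
print(instructions)
```

Each field costs the model only one short line of instruction, which is one plausible reason a flat list of named fields is easier for a weaker model to follow than a full JSON schema.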