## Output Parsers
Language models output text, but you often want more structured information back than plain text. This is where output parsers come in.
Output parsers are classes that help structure language model responses. An output parser must implement two main methods:
- **Get format instructions**: A method that returns a string containing instructions for how the output of a language model should be formatted.
- **Parse**: A method that takes in a string (assumed to be the response from a language model) and parses it into some structure.
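
To make the two methods concrete, here is a minimal sketch in plain Python that does not depend on LangChain. The class name `BulletListParser`, its instruction text, and the `fake_response` string are all hypothetical, chosen only for illustration; a real LangChain parser would subclass one of the library's base classes instead.

```python
from typing import List


class BulletListParser:
    """Hypothetical parser: asks for a bulleted list and parses it into a Python list."""

    def get_format_instructions(self) -> str:
        # Text appended to the prompt so the model knows the expected layout.
        return "Respond with one item per line, each line starting with '- '."

    def parse(self, text: str) -> List[str]:
        # Keep only lines that follow the requested "- item" layout.
        items = []
        for line in text.splitlines():
            line = line.strip()
            if line.startswith("- "):
                items.append(line[2:].strip())
        return items


parser = BulletListParser()
prompt = "List three Nordic capitals.\n" + parser.get_format_instructions()
fake_response = "- Helsinki\n- Oslo\n- Stockholm"  # stand-in for a model reply
print(parser.parse(fake_response))  # ['Helsinki', 'Oslo', 'Stockholm']
```

The format instructions travel with the prompt, and `parse` only has to understand the layout those instructions asked for.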

In [1]:
from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import PromptTemplate
from langchain_google_genai import GoogleGenerativeAI
from dotenv import load_dotenv
import os

In [2]:
load_dotenv()

True

In [3]:
model = GoogleGenerativeAI(
  model="gemini-1.5-pro-latest", 
  google_api_key=os.getenv("GOOGLE_API_KEY"), 
)

### CSV Parser
This output parser can be used when you want to return a list of comma-separated items.

In [4]:
output_parser = CommaSeparatedListOutputParser()

format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
    template="List five places {places}.\n{format_instructions}",
    input_variables=["places"],
    partial_variables={"format_instructions": format_instructions},
)

In [5]:
chain = prompt | model | output_parser

In [6]:
chain.invoke({"places": "for summer tourism in Finland"})

['Helsinki',
 'Tampere',
 'Turku Archipelago',
 'Lake Saimaa',
 'Rovaniemi (Lapland)']
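
The parsing step for a comma-separated reply can be approximated with the standard library. The sketch below is a simplified stand-in, not LangChain's actual implementation, and the function name `parse_comma_separated` is hypothetical. It uses `csv.reader` so that quoted items containing commas stay intact.

```python
import csv
from io import StringIO
from typing import List


def parse_comma_separated(text: str) -> List[str]:
    """Split a comma-separated model reply into a list of trimmed items."""
    # skipinitialspace lets a quoted item follow ", " without breaking the quoting.
    reader = csv.reader(StringIO(text.strip()), skipinitialspace=True)
    return [item.strip() for row in reader for item in row if item.strip()]


print(parse_comma_separated("Helsinki, Tampere, Turku Archipelago"))
# ['Helsinki', 'Tampere', 'Turku Archipelago']
```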

### JSON Parser
This output parser allows users to specify a JSON schema and query LLMs for outputs that conform to that schema. Keep in mind that large language models are leaky abstractions! You'll need an LLM with sufficient capacity to generate well-formed JSON. In the OpenAI family, DaVinci can do this reliably, but Curie's ability already drops off dramatically.
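
Because a model may wrap its JSON in a Markdown code fence or emit something that is not valid JSON at all, the parsing step has to tolerate both. The following is a minimal sketch of that idea using only the standard library; the helper name `extract_json` is hypothetical and this is not how `JsonOutputParser` is implemented internally.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable

# Matches an optional "json" tag and captures everything up to the closing fence.
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(text: str):
    """Return the JSON value found in a model reply, fenced or not."""
    match = FENCED_BLOCK.search(text)
    candidate = match.group(1) if match else text
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed input.
    return json.loads(candidate.strip())


reply = "Here you go:\n" + FENCE + 'json\n{"place": "Da Lat"}\n' + FENCE
print(extract_json(reply))  # {'place': 'Da Lat'}
```

Letting the `ValueError` propagate makes a malformed reply visible to the caller, who can then retry the request or fall back to another model.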

#### With Pydantic
The following example uses Pydantic to declare the data model.

In [27]:
from typing import List
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field

In [28]:
class Travel(BaseModel):
    place: str = Field(description="name of the place")
    description: str = Field(description="description of the place")
    activities: str = Field(description="what to do in that place")

In [30]:
# A query intended to prompt a language model to populate the data structure.
travel_query = "Suggest a place in Vietnam for going on a trip this summer to avoid heat."

# Set up a parser + inject instructions into the prompt template.
parser = JsonOutputParser(pydantic_object=Travel)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | model | parser

chain.invoke({"query": travel_query})

{'place': 'Da Lat',
 'description': 'Located in the Central Highlands, Da Lat enjoys a year-round cool climate, making it a perfect escape from the summer heat. ',
 'activities': 'Visit flower gardens, explore waterfalls, enjoy scenic hikes, experience the French colonial architecture.'}
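
The result above is a plain dictionary, so it can be worth checking that every expected field is actually present before using it. The sketch below does this with a standard-library dataclass; `TravelRecord` and `missing_fields` are hypothetical names that mirror the fields of `Travel` for illustration only.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, List


@dataclass
class TravelRecord:  # hypothetical stand-in mirroring the Travel model's fields
    place: str
    description: str
    activities: str


def missing_fields(parsed: Dict[str, Any]) -> List[str]:
    """Return the names of expected fields that the parsed dict lacks."""
    return [f.name for f in fields(TravelRecord) if f.name not in parsed]


parsed = {"place": "Da Lat", "description": "Cool climate.", "activities": "Hiking."}
if not missing_fields(parsed):
    record = TravelRecord(**parsed)  # safe to build once all fields are present
    print(record.place)  # Da Lat
```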

#### Without Pydantic

In [31]:
travel_query = "Suggest a place in Finland for going on a trip this summer to avoid heat."

parser = JsonOutputParser()

prompt = PromptTemplate(
  template="Answer the user query.\n{format_instructions}\n{query}\n",
  input_variables=["query"],
  partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | model | parser
chain.invoke({"query": travel_query})

{'suggestion': 'Lapland, Finland',
 'reason': 'Lapland is located in the northernmost part of Finland, well within the Arctic Circle. This makes it an ideal destination for escaping the summer heat, as temperatures tend to be much cooler than in other parts of the country. ',
 'activities': ['Hiking in stunning national parks like Pallas-Yllästunturi or Urho Kekkonen National Park',
  "Experiencing the midnight sun, a natural phenomenon where the sun doesn't set for weeks during the summer solstice",
  'Going white water rafting or kayaking in pristine rivers and lakes',
  'Spotting wildlife like reindeer, bears, and a variety of bird species',
  'Visiting Santa Claus Village in Rovaniemi for a unique and festive experience']}

### Structured Output Parser
This output parser can be used when you want to return multiple fields. While the Pydantic/JSON parser is more powerful, this one is useful for less powerful models.

In [17]:
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

In [20]:
response_schemas = [
  ResponseSchema(name="answer", description="Answer to the user's question"),
  ResponseSchema(name="description", description="Detailed description of the answer topic"),
  ResponseSchema(name="applications", description="Real World applications of the answer topic"),
]
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)


In [25]:
format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
  template="Answer the user's question as best as possible.\n{format_instructions}\n{question}",
  input_variables=["question"],
  partial_variables={"format_instructions": format_instructions}
)
print("Output parser: ", output_parser)
print("Format instruction: ", format_instructions)
print("Prompt template: ", prompt)

Output parser:  response_schemas=[ResponseSchema(name='answer', description="Answer to the user's question", type='string'), ResponseSchema(name='description', description='Detailed description of the answer topic', type='string'), ResponseSchema(name='applications', description='Real World applications of the answer topic', type='string')]
Format instruction:  The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"answer": string  // Answer to the user's question
	"description": string  // Detailed description of the answer topic
	"applications": string  // Real World applications of the answer topic
}
```
Prompt template:  input_variables=['question'] input_types={} partial_variables={'format_instructions': 'The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":\n\n```json\n{\n\t"answer": string  // Answer to th
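
The format instructions printed above follow a simple pattern: one line per schema entry, wrapped in a fenced JSON skeleton. As a rough sketch of how such a string could be assembled from name/description pairs, consider the hypothetical helper `build_format_instructions` below. It mimics the printed text but is not the library's own code.

```python
from typing import List, Tuple

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def build_format_instructions(schema_fields: List[Tuple[str, str]]) -> str:
    """Assemble instructions asking for a fenced JSON object with the given keys."""
    lines = [
        f'\t"{name}": string  // {description}'
        for name, description in schema_fields
    ]
    schema = "{\n" + "\n".join(lines) + "\n}"
    return (
        "The output should be a markdown code snippet formatted in the following "
        f'schema, including the leading and trailing "{FENCE}json" and "{FENCE}":\n\n'
        f"{FENCE}json\n{schema}\n{FENCE}"
    )


instructions = build_format_instructions(
    [
        ("answer", "Answer to the user's question"),
        ("description", "Detailed description of the answer topic"),
    ]
)
print(instructions)
```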

In [26]:
chain = prompt | model | output_parser
chain.invoke({"question": "Name an invention in healthcare that has caused a revolution in the twenty-first century."})

{'answer': 'One of the most revolutionary healthcare inventions of the 21st century is **Artificial Intelligence (AI)**.',
 'description': "AI in healthcare encompasses the use of complex algorithms and software to emulate human cognition in the analysis, interpretation, and comprehension of complex medical data. It's applied in various areas like disease diagnosis, drug discovery, personalized medicine, and patient monitoring.",
 'applications': '**Real-world applications of AI in healthcare include:** \n\n* **Medical Imaging Analysis:** AI algorithms can analyze X-rays, CT scans, and MRIs to detect abnormalities like tumors with high accuracy.\n* **Drug Discovery and Development:** AI accelerates the process of identifying potential drug candidates and predicting their effectiveness.\n* **Personalized Treatment:** AI helps tailor treatment plans based on individual patient data, genetics, and lifestyle.\n* **Virtual Health Assistants:** AI-powered chatbots provide medical advice, sch