## Output parser 

An output parser takes the output of a model and transforms it into a format better suited to downstream tasks. It is useful when you use LLMs to generate structured data, or when you need to normalize output from chat models and LLMs. LangChain provides many output parsers, including JSON, XML, CSV, OutputFixing, RetryWithError, Pydantic, YAML, PandasDataFrame, Enum, Datetime, and Structured.

### Output parsing with Prompt Template

First, try output parsing without LangChain's output parser classes: use the prompt alone to tell the model how to structure its output.

In [17]:
# Chat Prompt Template
from langchain_core.prompts import ChatPromptTemplate
import os
from langchain_core.output_parsers import StrOutputParser
from langchain_google_genai import ChatGoogleGenerativeAI
from dotenv import load_dotenv

# Load environment variables from the .env file
load_dotenv()


prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. You will response the output in a CSV format"),
    ("user", "Tell me about {domain} in {city}")
])

print(prompt_template.invoke({"domain": "tourist places, culture, Facilities & issues", "city": "Dhaka"}))

# Access the API key from the environment
api_key = os.getenv("GOOGLE_GEN_API")
llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro", api_key=api_key)

chain = prompt_template | llm 

result = chain.invoke({"domain": "tourist places, culture, Facilities & issues", "city": "Dhaka"})
print(result.content)

messages=[SystemMessage(content='You are a helpful assistant. You will response the output in a CSV format', additional_kwargs={}, response_metadata={}), HumanMessage(content='Tell me about tourist places, culture, Facilities & issues in Dhaka', additional_kwargs={}, response_metadata={})]
Place Name,Type,Description,Facilities,Issues
Lalbag Fort,Historical & Architectural,A 17th-century Mughal fort with gardens, mosques, and a museum. It showcases Mughal architecture and offers historical insights.,Guided tours, museums, gardens, restrooms, food stalls.,Limited accessibility for people with disabilities, maintenance issues, overcrowding during peak hours.
Ahsan Manzil,Historical & Architectural,A 19th-century palace known as the "Pink Palace," it served as the residence of the Nawabs of Dhaka. It features a grand staircase, ballroom, and exhibits on the city's history.,Guided tours, museums, restrooms, souvenir shops.,Limited accessibility for people with disabilities, preservation ch
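
The CSV above shows why prompt-only formatting is fragile: the Description, Facilities and Issues fields contain commas but are not quoted, so a CSV reader splits them into extra columns. A minimal sketch of the problem using Python's `csv` module; the row is abbreviated from the output above and the variable names are only for illustration:

```python
import csv
import io

# Header and one abbreviated row in the same unquoted style as the model output above.
raw_output = (
    "Place Name,Type,Description,Facilities,Issues\n"
    "Lalbag Fort,Historical & Architectural,"
    "A 17th-century Mughal fort with gardens, mosques, and a museum.,"
    "Guided tours, museums.,"
    "Overcrowding during peak hours.\n"
)

rows = list(csv.reader(io.StringIO(raw_output)))
header, first_row = rows[0], rows[1]

# A row whose field count differs from the header cannot be mapped to columns safely.
column_mismatch = len(first_row) != len(header)
print(len(header), len(first_row), column_mismatch)
```

The header has 5 columns but the row splits into 8 fields, so the data cannot be loaded reliably without extra instructions to the model (for example, asking it to quote every field) or a dedicated parser.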

In [18]:
prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. You will response the output in a JSON format"),
    ("user", "Tell me about {domain} in {city}")
])

chain = prompt_template | llm 

result = chain.invoke({"domain": "tourist places, culture, Facilities & issues", "city": "Dhaka"})
print(result.content)

```json
{
  "tourist_places": [
    {
      "name": "Ahsan Manzil",
      "description": "A magnificent Mughal-era palace, now a museum showcasing the history of Dhaka's Nawabs.",
      "type": "Historical",
      "recommended_visit_duration": "2-3 hours"
    },
    {
      "name": "Lalbagh Fort",
      "description": "An incomplete 17th-century Mughal fort with beautiful gardens, mosques, and the tomb of Pari Bibi.",
      "type": "Historical",
      "recommended_visit_duration": "2-3 hours"
    },
    {
      "name": "Dhakeshwari Temple",
      "description": "An important Hindu temple and one of the oldest structures in Dhaka, dating back to the 12th century.",
      "type": "Religious",
      "recommended_visit_duration": "1-2 hours"
    },
    {
      "name": "Armenian Church",
      "description": "A historic church built in the 18th century, showcasing Armenian architecture and heritage in Dhaka.",
      "type": "Historical/Religious",
      "recommended_visit_duration": "1-2 ho
```

### Output Parsing with LangChain

An output parser lets you specify an arbitrary JSON schema via the prompt, query a model for output that conforms to that schema, and finally parse that output as JSON.

The JsonOutputParser is one built-in option for prompting for and then parsing JSON output. While it is similar in functionality to the PydanticOutputParser, it also supports streaming back partial JSON objects.
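
One reason to prefer a parser over reading result.content directly: the JSON reply in the previous section came back wrapped in a Markdown code fence tagged json, which `json.loads` rejects. A minimal sketch of handling that by hand, using a short hypothetical reply rather than the real (truncated) one; `strip_code_fence` is an illustrative helper, not a LangChain function:

```python
import json
import re

FENCE = "`" * 3  # built this way so the example itself contains no literal fence

# Hypothetical, shortened reply in the same fenced style as the earlier output.
raw_reply = (
    FENCE + "json\n"
    '{"tourist_places": [{"name": "Ahsan Manzil", "type": "Historical"}]}\n'
    + FENCE
)


def strip_code_fence(text: str) -> str:
    """Return the text inside a surrounding Markdown code fence, if there is one."""
    match = re.match(r"^\s*`{3}[a-zA-Z]*\s*\n(.*?)\n?\s*`{3}\s*$", text, re.DOTALL)
    return match.group(1) if match else text.strip()


# Parsing the raw reply directly fails because of the fence.
try:
    json.loads(raw_reply)
    direct_parse_ok = True
except json.JSONDecodeError:
    direct_parse_ok = False

data = json.loads(strip_code_fence(raw_reply))
print(direct_parse_ok, data["tourist_places"][0]["name"])
```

This only covers a single surrounding fence and complete JSON. A reply that is cut off mid-object would still fail to parse.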

Pydantic is used in LangChain to define and validate data models, which helps structure the data passed between the components of a LangChain application. It is often used to create structured data schemas like the one in the example below, where a Joke model defines the fields setup, description, and tourist_places.

In [19]:
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate
from pydantic import BaseModel, Field


# Define your desired data structure.
class Joke(BaseModel):
    setup: str = Field(description="question to set up about places description")
    description: str = Field(description="answer to provide some description of that place")
    tourist_places: str = Field(description="answer to provide the best tourist spots of that city")


# And a query intended to prompt a language model to populate the data structure.
joke_query = "Dhaka"

# Set up a parser + inject instructions into the prompt template.
parser = JsonOutputParser(pydantic_object=Joke)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\nTell me something about {query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | llm | parser

chain.invoke({"query": joke_query})

{'setup': 'Dhaka, the vibrant capital of Bangladesh, is a city where history, culture, and modernity intertwine.',
 'description': "Known for its bustling streets, Mughal architecture, and delicious cuisine, Dhaka offers a sensory overload. From the ancient Mughal fort of Lalbagh to the modern Hatirjheel lake, there's something for everyone.",
 'tourist_places': "Some must-visit tourist spots include the Pink Palace (Ahsan Manzil), the Star Mosque, the Liberation War Museum, and the Sadarghat river port. Don't forget to explore the local markets and try traditional Bangladeshi food."}

In [20]:
parser.get_format_instructions()

'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"setup": {"description": "question to set up about places description", "title": "Setup", "type": "string"}, "description": {"description": "answer to provide some description of that place", "title": "Description", "type": "string"}, "tourist_places": {"description": "answer to provide the best tourist spots of that city", "title": "Tourist Places", "type": "string"}}, "required": ["setup", "description", "tourist_places"]}\n```'
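
The format instructions embed a JSON schema with a `required` list. The sketch below assumes the parser hands back a plain dict that has not been checked against that schema, so it verifies the required fields by hand. `missing_or_mistyped` is a hypothetical helper, the schema is an abbreviated copy of the one above, and the two sample dicts are made up for illustration:

```python
# Abbreviated copy of the schema from the format instructions above.
schema = {
    "properties": {
        "setup": {"type": "string"},
        "description": {"type": "string"},
        "tourist_places": {"type": "string"},
    },
    "required": ["setup", "description", "tourist_places"],
}


def missing_or_mistyped(parsed: dict, schema: dict) -> list:
    """List required fields that are absent or are not strings."""
    problems = []
    for field in schema["required"]:
        if field not in parsed:
            problems.append(f"missing: {field}")
        elif schema["properties"][field]["type"] == "string" and not isinstance(parsed[field], str):
            problems.append(f"not a string: {field}")
    return problems


# Hypothetical parser results: one complete, one with a missing and a mistyped field.
good = {"setup": "Dhaka is the capital.", "description": "A busy city.", "tourist_places": "Lalbagh Fort"}
bad = {"setup": "Dhaka is the capital.", "tourist_places": ["Lalbagh Fort", "Ahsan Manzil"]}

print(missing_or_mistyped(good, schema))
print(missing_or_mistyped(bad, schema))
```

If strict validation matters, the PydanticOutputParser mentioned earlier is likely the better fit, since it is designed around the Pydantic model itself.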

You can also use the JsonOutputParser without Pydantic. This will prompt the model to return JSON, but doesn't provide specifics about what the schema should be.

In [21]:
query = "Dhaka"

parser = JsonOutputParser()
print(parser.get_format_instructions())

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n tell me something about {query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | llm | parser

chain.invoke({"query": query})

Return a JSON object.


{'response': 'Dhaka, the capital of Bangladesh, is a vibrant and bustling megacity known for its rich history, diverse culture, and delicious cuisine. Here are some interesting facts about Dhaka:\n\n* **Historical Significance:** Dhaka has a long and storied past, having been ruled by Mughal emperors, British colonists, and serving as the capital of both Pakistan and Bangladesh.\n* **Cultural Hub:** Dhaka is home to numerous museums, historical sites, and cultural institutions, including the Lalbagh Fort, the Ahsan Manzil (Pink Palace), and the National Museum of Bangladesh.\n* **Culinary Delights:** From street food to fine dining, Dhaka offers a wide array of culinary experiences. Be sure to try traditional dishes like biryani, fish curry, and sweets like roshogolla and mishti doi.\n* **Rickshaw Capital of the World:** Dhaka is famous for its colorful and ubiquitous rickshaws, which provide a unique and chaotic mode of transportation.\n* **Booming Economy:** As a major South Asian ci

### XML Parsing

This guide shows you how to use the XMLOutputParser to prompt models for XML output, then parse that output into a usable format.

In [22]:
from langchain_core.output_parsers import XMLOutputParser

actor_query = "Generate the shortened filmography for Tom Hanks."

parser = XMLOutputParser(tags=["movies", "actor", "film", "name", "genre"])

# We will add these instructions to the prompt below
parser.get_format_instructions()

prompt = PromptTemplate(
    template="""{query}\n{format_instructions}""",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | llm | parser

output = chain.invoke({"query": actor_query})
print(output)

Retrying langchain_google_genai.chat_models._chat_with_retry.<locals>._chat_with_retry in 2.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..


In [16]:
parser.get_format_instructions()

'The output should be formatted as a XML file.\n1. Output should conform to the tags below. \n2. If tags are not given, make them on your own.\n3. Remember to always open and close all the tags.\n\nAs an example, for the tags ["foo", "bar", "baz"]:\n1. String "<foo>\n   <bar>\n      <baz></baz>\n   </bar>\n</foo>" is a well-formatted instance of the schema. \n2. String "<foo>\n   <bar>\n   </foo>" is a badly-formatted instance.\n3. String "<foo>\n   <tag>\n   </tag>\n</foo>" is a badly-formatted instance.\n\nHere are the output tags:\n```\n[\'movies\', \'actor\', \'film\', \'name\', \'genre\']\n```'
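
The run above hit a quota error (429), so no parsed result is shown. As a rough illustration of what turning XML text into a usable structure involves, here is a stdlib sketch using `xml.etree.ElementTree` on hand-written sample XML (not real model output). It is not XMLOutputParser's implementation, and the shape of its result may differ from what the parser returns; `element_to_dict` is a hypothetical helper:

```python
import xml.etree.ElementTree as ET

# Hand-written sample that follows the tags requested in the prompt above.
sample_xml = """
<movies>
  <actor>
    <name>Tom Hanks</name>
    <film><name>Forrest Gump</name><genre>Drama</genre></film>
    <film><name>Toy Story</name><genre>Animation</genre></film>
  </actor>
</movies>
"""


def element_to_dict(element):
    """Convert an element into nested dicts: leaves map to text, parents to lists."""
    children = list(element)
    if not children:
        return {element.tag: (element.text or "").strip()}
    return {element.tag: [element_to_dict(child) for child in children]}


parsed = element_to_dict(ET.fromstring(sample_xml.strip()))
print(parsed)
```

Each parent tag maps to a list of single-key dicts, which preserves the order of repeated tags such as film.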

### Create a custom Output Parser
In some situations you may want to implement a custom parser to structure the model output into a custom format. Here, we will make a simple parser that inverts the case of the output from the model.

For example, if the model outputs: "Meow", the parser will produce "mEOW".

In [14]:
from langchain_core.messages import AIMessage


def parse(ai_message: AIMessage) -> str:
    """Swap the case of every character in the model's reply."""
    return ai_message.content.swapcase()


chain = llm | parse
chain.invoke("hello")

'hELLO! hOW CAN i HELP YOU TODAY? \n'
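
Because the parser is a plain function, its logic can be checked without calling the model, which is handy when the API quota is exhausted as in the XML run above. A minimal sketch, where `FakeMessage` is a hypothetical stand-in that models only the content attribute of AIMessage:

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for AIMessage: only the .content attribute is modeled."""
    content: str


def parse(ai_message) -> str:
    """Same logic as the parser above: swap the case of the reply text."""
    return ai_message.content.swapcase()


swapped = parse(FakeMessage(content="Meow"))
print(swapped)  # mEOW
```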