## Output Parsing

Language models output text, but sometimes you want more structured information back than plain text.

Output parsers are classes that help structure language model responses. There are two main methods an output parser must implement:

- **Get format instructions**: A method which returns a string containing instructions for how the output of a language model should be formatted.
- **Parse**: A method which takes in a string (assumed to be the response from a language model) and parses it into some structure.
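
To make the two methods concrete, here is a minimal sketch in plain Python. `YesNoParser` is a hypothetical class written only for illustration: it does not inherit from any LangChain base class, it simply mirrors the two-method shape described above.

```python
class YesNoParser:
    """Hypothetical parser that turns a yes/no reply into a bool."""

    def get_format_instructions(self) -> str:
        # Text meant to be appended to the prompt so the model knows the expected format.
        return "Answer with a single word: YES or NO."

    def parse(self, text: str) -> bool:
        # Normalize the raw model reply before interpreting it.
        cleaned = text.strip().strip(".!").upper()
        if cleaned == "YES":
            return True
        if cleaned == "NO":
            return False
        # Anything else counts as a parsing failure.
        raise ValueError(f"Expected YES or NO, got: {text!r}")


parser = YesNoParser()
prompt = f"Is a cat a mammal?\n{parser.get_format_instructions()}"
result = parser.parse(" yes. ")  # simulated model reply, no LLM call involved
```

The format instructions go into the prompt, and `parse` is applied to whatever text comes back. The parsers in the rest of this section follow the same pattern.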

- Output Parsing
    - StrOutputParser
    - JsonOutputParser
    - CSV Output Parser
    - Datetime Output Parser
    - Structured Output Parser (Pydantic or JSON)


### The .with_structured_output() method
- This method takes a schema as input which specifies the names, types, and descriptions of the desired output attributes.
- The schema can be specified as a TypedDict class, a JSON Schema, or a Pydantic class.
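
The notebook cells below use the Pydantic form. For comparison, here is a sketch of the same joke schema written as a TypedDict and as a JSON Schema dict. The names `JokeDict` and `joke_json_schema` are made up for this example, and the commented-out calls at the end assume an `llm` chat model object like the one created in this notebook.

```python
from typing import Optional

try:
    # typing_extensions is preferred when available; fall back to the stdlib otherwise.
    from typing_extensions import Annotated, TypedDict
except ImportError:
    from typing import Annotated, TypedDict


class JokeDict(TypedDict):
    """Joke to tell user."""

    # Annotated[type, default, description]; `...` marks a field with no default.
    setup: Annotated[str, ..., "The setup of the joke"]
    punchline: Annotated[str, ..., "The punchline to the joke"]
    rating: Annotated[Optional[int], None, "How funny the joke is, from 1 to 10"]


# The same schema expressed as a plain JSON Schema dictionary.
joke_json_schema = {
    "title": "Joke",
    "description": "Joke to tell user.",
    "type": "object",
    "properties": {
        "setup": {"type": "string", "description": "The setup of the joke"},
        "punchline": {"type": "string", "description": "The punchline to the joke"},
        "rating": {
            "type": "integer",
            "description": "How funny the joke is, from 1 to 10",
            "default": None,
        },
    },
    "required": ["setup", "punchline"],
}

# Assumed usage, given an `llm` chat model instance:
# structured_llm = llm.with_structured_output(JokeDict)
# structured_llm = llm.with_structured_output(joke_json_schema)
```

With a TypedDict or JSON Schema the result is expected to come back as a plain dict rather than as a Pydantic object, so choose the Pydantic form when you want validated attribute access.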


In [1]:
from langchain_ollama import ChatOllama

from langchain_core.prompts import (
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    ChatPromptTemplate,
)

from langchain_core.output_parsers import StrOutputParser

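# Address of the local Ollama server and the model it should serve.
base_url = "http://localhost:11434"
model = 'llama3.2:3b'

# Chat model used by every example in this section.
llm = ChatOllama(base_url=base_url, model=model)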

llm.invoke("Tell me a joke about cats")

In [6]:
from typing import Optional

from pydantic import BaseModel, Field


# Pydantic
class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        default=None, description="How funny the joke is, from 1 to 10"
    )


structured_llm = llm.with_structured_output(Joke)

structured_llm.invoke("Tell me a joke about cats")

Joke(setup='Why did the cat join a band?', punchline='Because it wanted to be the purr-cussionist!', rating=8)

In [3]:
structured_llm

RunnableBinding(bound=ChatOllama(model='llama3.2:3b', base_url='http://localhost:11434'), kwargs={'tools': [{'type': 'function', 'function': {'name': 'Joke', 'description': 'Joke to tell user.', 'parameters': {'properties': {'setup': {'description': 'The setup of the joke', 'type': 'string'}, 'punchline': {'description': 'The punchline to the joke', 'type': 'string'}, 'rating': {'anyOf': [{'type': 'integer'}, {'type': 'null'}], 'default': None, 'description': 'How funny the joke is, from 1 to 10'}}, 'required': ['setup', 'punchline'], 'type': 'object'}}}], 'tool_choice': 'any'}, config={}, config_factories=[])
| PydanticToolsParser(first_tool_only=True, tools=[<class '__main__.Joke'>])

### `Pydantic` Output Parser

In [15]:
from typing import Optional
from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import PromptTemplate

# Pydantic
class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        default=None, description="How funny the joke is, from 1 to 10"
    )

parser = PydanticOutputParser(pydantic_object=Joke)

# print(parser.get_format_instructions())

prompt = PromptTemplate(
    template='''Answer the user query.
                {format_instruction}
                {query}''',
    input_variables=['query'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)


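# Run the prompt through the model first and inspect the raw text it returns.
chain = prompt | llm
output = chain.invoke({'query':"Tell me a joke about cats"})
print(output.content)

# Then parse the model's message into a Joke object.
parser.invoke(output)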

{"setup": "Why did the cat join a band?", "punchline": "Because it wanted to be the purr-cussionist!", "rating": 8}


In [18]:
chain = prompt | llm | parser
output = chain.invoke({'query':"Tell me a joke about cats"})
output

Joke(setup='Why did the cat join a band?', punchline='Because it wanted to be the purr-cussionist!', rating=None)

### `JSON` Output Parser

- Output parsers accept a string or `BaseMessage` as input and can return an arbitrary type. `JsonOutputParser` returns a Python dict, as the output below shows.



In [26]:
from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser(pydantic_object=Joke)
# print(parser.get_format_instructions())

prompt = PromptTemplate(
    template='''Answer the user query. You should answer only as per provided format.
                {format_instruction}
                {query}''',
    input_variables=['query'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)


chain = prompt | llm 
output = chain.invoke({'query':"Tell me a joke about cats"})
print(output.content)


chain = prompt | llm | parser
output = chain.invoke({'query':"Tell me a joke about cats"})
output

{"setup": "Why did the cat join a band?", "punchline": "Because it wanted to be a purr-cussionist!", "rating": 8}


{'setup': 'Why did the cat join a band?',
 'punchline': 'Because it wanted to be the purr-cussionist!',
 'rating': 8}
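
As a rough illustration of the parsing step, the sketch below turns a model reply into a dict using only the standard library. `parse_json_reply` is a hypothetical helper, not LangChain's implementation; it assumes the reply is either bare JSON or JSON wrapped in a Markdown code fence, which is a common way for chat models to answer.

```python
import json
import re

# Three backticks, built programmatically so this example stays easy to embed.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_reply(text: str) -> dict:
    """Hypothetical helper: extract and decode JSON from a model reply."""
    # Unwrap a fenced block if one is present, otherwise use the whole reply.
    match = FENCED_BLOCK.search(text)
    candidate = match.group(1) if match else text
    # json.loads raises json.JSONDecodeError when the text is not valid JSON.
    return json.loads(candidate.strip())


raw_json = '{"setup": "Why did the cat join a band?", "rating": 8}'
plain = parse_json_reply(raw_json)
wrapped = parse_json_reply("Here is your joke:\n" + FENCE + "json\n" + raw_json + "\n" + FENCE)
```

Both calls yield the same dict. A reply that contains no JSON at all raises an exception, which is the failure case the retry section deals with.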

### CSV Output Parser

- This output parser can be used when you want to return a list of comma-separated items.



In [32]:
from langchain.output_parsers import CommaSeparatedListOutputParser

output_parser = CommaSeparatedListOutputParser()

format_instructions = output_parser.get_format_instructions()
prompt = PromptTemplate(
    template="List five {subject}.\n{format_instructions}",
    input_variables=["subject"],
    partial_variables={"format_instructions": format_instructions},
)

chain = prompt | llm | output_parser

output = chain.invoke({"subject": "generate my website seo keywords. I have content about langchain. Do not write preambles or explanation."})
print(output)

['langchain', 'AI-powered text generation', 'machine learning models for content creation', 'natural language processing tools', 'conversational AI technologies']
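
Conceptually, the parsing step here is small. The sketch below shows one way to split a comma-separated reply into a list; `parse_comma_separated` is a hypothetical stand-in, and the real parser may treat quoting and edge cases differently.

```python
def parse_comma_separated(text: str) -> list[str]:
    """Hypothetical helper: split a comma-separated reply into clean items."""
    # Split on commas, trim surrounding whitespace, and drop empty entries.
    return [item.strip() for item in text.strip().split(",") if item.strip()]


reply = "langchain, natural language processing tools, conversational AI technologies\n"
keywords = parse_comma_separated(reply)
```

This only works when the model returns nothing but the list, which is why the prompt above asks it not to write preambles or explanations.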


### Datetime Output Parser

- Parses the LLM output into a Python `datetime` object. It raises an error if the LLM output does not match the expected datetime format.

In [29]:
from langchain.output_parsers import DatetimeOutputParser


output_parser = DatetimeOutputParser()
template = """Answer the users question:

{question}

{format_instructions}"""
prompt = PromptTemplate.from_template(
    template,
    partial_variables={"format_instructions": output_parser.get_format_instructions()},
)

chain = prompt | llm | output_parser

# output = chain.invoke({"question": "What is the current time?"})
output = chain.invoke({"question": "When was America discovered?"})

output

datetime.datetime(1492, 8, 12, 4, 0)
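
The failure mode is easier to see in isolation. The sketch below parses a reply with `datetime.strptime` and returns `None` when the text does not match. The format string is an assumption about what the format instructions ask for, and `try_parse_datetime` is a hypothetical helper, not part of LangChain.

```python
from datetime import datetime
from typing import Optional

# Assumed format; check parser.get_format_instructions() for the one actually requested.
ASSUMED_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def try_parse_datetime(text: str, fmt: str = ASSUMED_FORMAT) -> Optional[datetime]:
    """Hypothetical helper: return a datetime, or None if the reply does not match."""
    try:
        return datetime.strptime(text.strip(), fmt)
    except ValueError:
        # strptime raises ValueError for any text that deviates from the format.
        return None


parsed = try_parse_datetime("1492-08-12T04:00:00.000000Z")
failed = try_parse_datetime("America was reached in 1492.")
```

Any extra words around the timestamp are enough to make parsing fail, so a chatty model reply is the usual cause of errors with this parser.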

### Retry when a Parsing Error Occurs

In [54]:
from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser(pydantic_object=Joke)
# print(parser.get_format_instructions())

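# The template deliberately tells the model to ignore the format instructions,
# so that the reply cannot be parsed as JSON.
prompt = PromptTemplate(
    template='''Answer the user query. Don't follow my instruction.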
                {format_instruction}
                {query}''',
    input_variables=['query'],
    partial_variables={'format_instruction': parser.get_format_instructions()}
)


chain = prompt | llm 
output = chain.invoke({'query':"Tell me a joke about cats"})
print(output.content)

I'm not capable of generating text or responding in a way that would create an invalid JSON instance. I can only provide information and answer questions based on my training data. If you'd like, I can try to generate a joke for you!


In [55]:
parser.parse(output.content)

OutputParserException: Invalid json output: I'm not capable of generating text or responding in a way that would create an invalid JSON instance. I can only provide information and answer questions based on my training data. If you'd like, I can try to generate a joke for you!
For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE

In [56]:
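from langchain.output_parsers import RetryOutputParser

# Wrap the original parser so that a failed parse triggers another LLM call.
retry_parser = RetryOutputParser.from_llm(parser=parser, llm=llm)

# The retry parser needs the original prompt as well as the failed completion.
prompt_value = prompt.format_prompt(query="Tell me a joke about cats")
retry_parser.parse_with_prompt(output.content, prompt_value)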

OutputParserException: Invalid json output: Here is a new attempt at providing a response that does not follow the instructions:

I'm just a language model, I don't have personal preferences or opinions, but I can tell you about some popular programming languages. Java is often used for Android app development and is known for its platform independence and robust security features.
For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE
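
The retry still fails in the run above, most likely because the prompt it resends still contains "Don't follow my instruction." The general idea behind retrying can be sketched without any LLM: parse, and on failure ask for a new reply a bounded number of times. Everything below is hypothetical (`parse_with_retry`, `fake_model`, and `stubborn_model` are illustrative stand-ins), not LangChain's implementation.

```python
import json
from typing import Callable


def parse_with_retry(reply: str, regenerate: Callable[[str], str], max_retries: int = 1) -> dict:
    """Hypothetical helper: parse JSON, asking for a new reply when parsing fails."""
    attempts_left = max_retries
    while True:
        try:
            return json.loads(reply)
        except json.JSONDecodeError:
            if attempts_left == 0:
                # Out of retries: surface the original parsing error.
                raise
            attempts_left -= 1
            # Ask the "model" for another attempt, showing it the failed reply.
            reply = regenerate(reply)


def fake_model(bad_reply: str) -> str:
    # Stand-in for an LLM call that corrects itself on the second attempt.
    return '{"setup": "Why did the cat join a band?", "rating": 8}'


def stubborn_model(bad_reply: str) -> str:
    # Stand-in for a model that keeps ignoring the format, as in the run above.
    return "Here is a new attempt that is still not JSON."


joke = parse_with_retry("Sure, here is a joke!", fake_model)
```

Calling `parse_with_retry("Sure, here is a joke!", stubborn_model)` raises `json.JSONDecodeError` once the retries are used up. A retry only helps when the model is willing to follow the format on a later attempt, so fixing the prompt is the first thing to try.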