#### Output Parsers
An output parser takes the raw output of a model and transforms it into a format better suited to downstream tasks.
This is useful when you use LLMs to generate structured data, or when you need to normalize output from chat models and LLMs.

There are two types of output parsers:
1. Predefined parsers
2. Custom output parsers

#### String Output Parser

In [8]:
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()
print(output_parser.get_output_jsonschema())

{'title': 'StrOutputParserOutput', 'type': 'string'}


#### Comma-Separated List Output Parser

In [10]:
from langchain_core.output_parsers import CommaSeparatedListOutputParser

output_parser = CommaSeparatedListOutputParser()
format_instructions = output_parser.get_format_instructions()
format_instructions

'Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`'

#### JSON Output Parser

In [12]:
from langchain_core.output_parsers import JsonOutputParser
json_output_parser = JsonOutputParser()
json_output_parser.get_format_instructions()

'Return a JSON object.'

#### XML Output Parser

In [17]:
from langchain_core.output_parsers import XMLOutputParser

# Using an XML output parser
xml_output_parser = XMLOutputParser()
xml_output_parser.get_format_instructions()

'The output should be formatted as a XML file.\n1. Output should conform to the tags below.\n2. If tags are not given, make them on your own.\n3. Remember to always open and close all the tags.\n\nAs an example, for the tags ["foo", "bar", "baz"]:\n1. String "<foo>\n   <bar>\n      <baz></baz>\n   </bar>\n</foo>" is a well-formatted instance of the schema.\n2. String "<foo>\n   <bar>\n   </foo>" is a badly-formatted instance.\n3. String "<foo>\n   <tag>\n   </tag>\n</foo>" is a badly-formatted instance.\n\nHere are the output tags:\n```\nNone\n```'

#### Pydantic Output Parser 

In [14]:
from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

class Joke(BaseModel):
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")

# Using the Pydantic model with the Pydantic output parser
parser = PydanticOutputParser(pydantic_object=Joke)
parser.get_format_instructions()

'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"setup": {"description": "The setup of the joke", "title": "Setup", "type": "string"}, "punchline": {"description": "The punchline of the joke", "title": "Punchline", "type": "string"}}, "required": ["setup", "punchline"]}\n```'

In [13]:
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field

class Joke(BaseModel):
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")

# Using the Pydantic model with the JSON output parser
parser = JsonOutputParser(pydantic_object=Joke)
parser.get_format_instructions()

'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"setup": {"description": "The setup of the joke", "title": "Setup", "type": "string"}, "punchline": {"description": "The punchline of the joke", "title": "Punchline", "type": "string"}}, "required": ["setup", "punchline"]}\n```'

In [15]:
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import PromptTemplate
from pydantic import BaseModel, Field, model_validator


# Define your desired data structure.
class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

    # You can add custom validation logic easily with Pydantic.
    @model_validator(mode="before")
    @classmethod
    def question_ends_with_question_mark(cls, values: dict) -> dict:
        setup = values.get("setup")
        if setup and setup[-1] != "?":
            raise ValueError("Badly formed question!")
        return values


# Set up a parser + inject instructions into the prompt template.
parser = PydanticOutputParser(pydantic_object=Joke)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

print(parser.get_format_instructions())
print(prompt)


The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"setup": {"description": "question to set up a joke", "title": "Setup", "type": "string"}, "punchline": {"description": "answer to resolve the joke", "title": "Punchline", "type": "string"}}, "required": ["setup", "punchline"]}
```
input_variables=['query'] input_types={} partial_variables={'format_instructions': 'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "

### Custom Output Parser

In [22]:
from langchain_core.output_parsers import BaseOutputParser


class CustomOutputParser(BaseOutputParser[str]):
    """Parser that strips whitespace and upper-cases the model output."""

    def parse(self, text: str) -> str:
        # Custom parsing logic (example: normalize to uppercase)
        return text.strip().upper()


# Instantiate the custom output parser
custom_parser = CustomOutputParser()
custom_parser

CustomOutputParser()

In [25]:
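from langchain_core.output_parsers import JsonOutputParser
from langchain_core.outputs import Generation


# Custom output parser that post-processes the parsed JSON
class CustomJsonOutputParser(JsonOutputParser):
    # Override parse_result rather than parse: invoke() (and so any chain)
    # calls parse_result, and JsonOutputParser.parse delegates to it as well.
    def parse_result(self, result: list[Generation], *, partial: bool = False):
        parsed = super().parse_result(result, partial=partial)
        # Example: add a custom field
        if isinstance(parsed, dict):
            parsed["parsed"] = "true"
        return parsed


# Instantiate the custom JSON output parser
custom_json_parser = CustomJsonOutputParser()
custom_json_parser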

CustomJsonOutputParser()