#### What are Output Parsers?

Language models return text, but you will often want that text parsed into a structured form. Output parsers are used in this case.

Output parsers are classes that help structure language model responses.

An OutputParser <b>must</b> implement the following two methods:
1. <b>Get Format Instructions</b> : Returns a string containing instructions for how the output of the language model should be formatted.
2. <b>Parse</b> : Takes an input string and parses it into some structure.
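
To make the two methods concrete, here is a minimal sketch in plain Python. The class `CommaSeparatedListParser` is hypothetical and does not subclass any LangChain base class; it only mirrors the two-method shape described above. The `raw_response` string stands in for text a model might return, so no model is called.

```python
from typing import List


class CommaSeparatedListParser:
    """Hypothetical parser for illustration; not a LangChain class."""

    def get_format_instructions(self) -> str:
        # Text meant to be appended to a prompt so the model knows the expected format.
        return "Your response should be a list of comma separated values, e.g. `foo, bar, baz`"

    def parse(self, text: str) -> List[str]:
        # Split on commas, trim whitespace, and drop empty entries.
        return [item.strip() for item in text.split(",") if item.strip()]


parser = CommaSeparatedListParser()

# The format instructions are injected into the prompt text.
prompt = f"List three primary colors.\n{parser.get_format_instructions()}"

# Stand-in for a model response (assumed for this sketch).
raw_response = "red, green , blue"
colors = parser.parse(raw_response)
print(colors)
```

The same pattern applies to richer parsers: the instructions tell the model what to produce, and `parse` turns the returned text into a Python object.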

#### <b><u>Streaming with Parsers</u></b>

While all parsers support the streaming interface, only certain parsers can stream through partially parsed objects, since this is highly dependent on the output type. Parsers which cannot construct partial objects will simply yield the fully parsed output.
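
The difference between the two behaviours can be sketched with plain Python generators. This is an illustration only, not how LangChain implements streaming; the function names `stream_items` and `stream_whole` and the sample `chunks` are assumptions made for this example.

```python
from typing import Iterable, Iterator, List


def stream_items(chunks: Iterable[str]) -> Iterator[str]:
    """Yield each comma-separated item as soon as it is complete."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last comma is complete; the tail may still grow.
        *done, buffer = buffer.split(",")
        for item in done:
            if item.strip():
                yield item.strip()
    # Flush whatever remains once the input is exhausted.
    if buffer.strip():
        yield buffer.strip()


def stream_whole(chunks: Iterable[str]) -> Iterator[List[str]]:
    """Buffer all chunks and yield a single, fully parsed list."""
    text = "".join(chunks)
    yield [item.strip() for item in text.split(",") if item.strip()]


# Simulated model output arriving in arbitrary pieces.
chunks = ["re", "d, gr", "een, ", "blue"]

partial = list(stream_items(chunks))  # one result per completed item
whole = list(stream_whole(chunks))    # a single result at the end
print(partial)
print(whole)
```

A list of strings can be emitted piece by piece, while an output type that needs the whole text before it can be validated can only be yielded once.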


In [27]:
from langchain_core.pydantic_v1 import BaseModel, Field, validator
from langchain_core.output_parsers import PydanticOutputParser
from langchain_openai import OpenAI
from langchain_core.prompts import PromptTemplate

In [28]:
model = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0.0)


# Define your desired data structure.
class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

    # You can add custom validation logic easily with Pydantic.
    @validator("setup")
    def question_ends_with_question_mark(cls, field):
        if not field.endswith("?"):
            raise ValueError("Badly formed question!")
        return field


# Set up a parser + inject instructions into the prompt template.
parser = PydanticOutputParser(pydantic_object=Joke)
print(f"Format instructions: \n{parser.get_format_instructions()}")

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

# Compose the prompt, model, and parser into a single chain.
prompt_and_model = prompt | model | parser

# PydanticOutputParser cannot stream partial objects, so invoke the chain
# with a query intended to prompt the model to populate the data structure.
output = prompt_and_model.invoke({"query": "Tell me a joke."})
output


Format instructions: 
The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"setup": {"title": "Setup", "description": "question to set up a joke", "type": "string"}, "punchline": {"title": "Punchline", "description": "answer to resolve the joke", "type": "string"}}, "required": ["setup", "punchline"]}
```


Joke(setup='Why did the tomato turn red?', punchline='Because it saw the salad dressing!')

In [29]:
# SimpleJsonOutputParser can stream partially parsed objects
from langchain_core.output_parsers import SimpleJsonOutputParser

parser = SimpleJsonOutputParser()
prompt = PromptTemplate.from_template("Answer the following question in JSON format:\n {question}")
chain = prompt | model | parser
# Pass the template variable explicitly; stream() returns an iterator of partial results.
output = chain.stream({"question": "Tell me a joke."})

In [30]:
for chunk in output:
    print(chunk, flush=True)

{}
{'joke': ''}
{'joke': 'Why'}
{'joke': 'Why don'}
{'joke': "Why don't scientists trust"}
{'joke': "Why don't scientists trust atoms"}
{'joke': "Why don't scientists trust atoms?"}
{'joke': "Why don't scientists trust atoms? Because"}
{'joke': "Why don't scientists trust atoms? Because they"}
{'joke': "Why don't scientists trust atoms? Because they make up everything"}
{'joke': "Why don't scientists trust atoms? Because they make up everything."}
