### Output Parsers

Output parsers convert a language model's raw text response into a structured format that is easier to work with in code. They are especially useful when you need the model to produce structured data.

There are many output parser types. The examples here use the _Structured output parser_, which returns a Python dictionary.

Without Output Parser:

In [82]:
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI

output_format = {
  "answer": "string //answer to the user's question",
  "source": "string // source used to answer the user's question, should be a website.",
}

string_template = """answer the users question as best as possible. Output should be a python dictionary. \n{output_format}\n{question}"""
prompt_template = ChatPromptTemplate.from_template(string_template)

print(prompt_template)

input_variables=['output_format', 'question'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['output_format', 'question'], template='answer the users question as best as possible. Output should be a python dictionary. \n{output_format}\n{question}'))]


In [83]:
question = "What is the capital of the Philippines?"

messages = prompt_template.format_messages(output_format=output_format, question=question)

llm = ChatOpenAI(temperature=0.0, model="gpt-3.5-turbo")

response = llm.invoke(messages)

print("Prompt: ", question)
print("Output: ", response.content)

Prompt:  What is the capital of the Philippines?
Output:  {'answer': 'Manila', 'source': 'https://www.worldatlas.com/as/ph/where-is-the-philippines.html'}


The output looks like a Python dictionary, but its data type is still `str`.

In [84]:
type(response.content)

str

We get an error if we try to access the `answer` key, because `response.content` is a string, not a dictionary.

In [85]:
response.content.get('answer')

AttributeError: 'str' object has no attribute 'get'
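
Without an output parser, one workaround is to convert the dictionary-formatted string yourself. The following is a minimal sketch using the standard library's `ast.literal_eval`. The `raw` string is copied from the output above as a stand-in for `response.content`, so no model call is needed to run it.

```python
import ast

# Stand-in for response.content, copied from the model output shown above.
raw = "{'answer': 'Manila', 'source': 'https://www.worldatlas.com/as/ph/where-is-the-philippines.html'}"

# A str has no .get method, which is why the call above raises AttributeError.
assert not hasattr(raw, "get")

# literal_eval evaluates Python literals (dicts, lists, strings, numbers)
# without executing arbitrary code, unlike eval().
parsed = ast.literal_eval(raw)

print(type(parsed))
print(parsed.get("answer"))
```

This only works when the model happens to return a valid Python literal, with no extra text around it. That fragility is the reason to let an output parser control both the requested format and the parsing.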

With Output Parser:

In [105]:
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

answer_schema = ResponseSchema(name="answer", description="answer to the user's question")
source_schema = ResponseSchema(
        name="source",
        description="source used to answer the user's question, should be a website.",
)

response_schemas = [answer_schema, source_schema]

# or directly
# response_schemas = [
#     ResponseSchema(name="answer", description="answer to the user's question"),
#     ResponseSchema(
#         name="source",
#         description="source used to answer the user's question, should be a website."
#     )
# ]

In [90]:
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

In [91]:
format_instructions = output_parser.get_format_instructions()
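
To get a feel for what format instructions are, here is a rough sketch that builds a similar prompt fragment from name/description pairs. The helper `build_format_instructions` and the `schemas` list are hypothetical, and the wording it produces is an approximation, not the text LangChain generates. Run `print(format_instructions)` in the notebook to see the real instructions.

```python
# Hypothetical stand-ins for the ResponseSchema objects defined above:
# plain (name, description) pairs.
schemas = [
    ("answer", "answer to the user's question"),
    ("source", "source used to answer the user's question, should be a website."),
]


def build_format_instructions(pairs):
    """Render (name, description) pairs as a JSON-shaped template for the prompt."""
    fields = "\n".join(
        f'\t"{name}": string  // {description}' for name, description in pairs
    )
    return "Return a JSON object with exactly these keys:\n{\n" + fields + "\n}"


instructions = build_format_instructions(schemas)
print(instructions)
```

The idea is the same as in the parser: the schema is written once and reused both to tell the model what to produce and to know what to expect when parsing.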

In [92]:
question = "What is the capital of the Philippines?"
string_template = """answer the users question as best as possible.\n{format_instructions}\n{question}"""

In [113]:
prompt_template = ChatPromptTemplate.from_template(string_template)
question_message = prompt_template.format_messages(format_instructions=format_instructions, question=question)

response = llm.invoke(question_message)

print(response.content)
print(type(response.content))

```json
{
	"answer": "Manila",
	"source": "https://en.wikipedia.org/wiki/Manila"
}
```
<class 'str'>


In [114]:
output_dict = output_parser.parse(response.content)
type(output_dict)
output_dict.get('answer')

# https://python.langchain.com/docs/modules/model_io/output_parsers/#output-parser-types

'Manila'
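
Conceptually, the parsing step pulls the JSON payload out of the fenced block and decodes it into a dictionary. The sketch below illustrates that idea with the standard library only. It is not LangChain's actual implementation: `parse_fenced_json` is a hypothetical helper, and `raw` is a hand-built stand-in for `response.content`.

```python
import json
import re

# Three backticks, built programmatically so the marker does not appear literally here.
FENCE = "`" * 3

# Stand-in for response.content, mirroring the model output shown above.
payload = '{\n\t"answer": "Manila",\n\t"source": "https://en.wikipedia.org/wiki/Manila"\n}'
raw = FENCE + "json\n" + payload + "\n" + FENCE


def parse_fenced_json(text: str) -> dict:
    """Decode the first fenced block in text as JSON; fall back to the whole text."""
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    body = match.group(1) if match else text
    return json.loads(body)


output_dict = parse_fenced_json(raw)
print(type(output_dict))
print(output_dict.get("answer"))
```

Because the result is a real `dict`, the `.get('answer')` call that failed on the raw string now works.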