### Output Parsers
Output parsers take the raw output of an LLM and transform it into a more suitable format.

This is very useful when you are using LLMs to generate any form of structured data.

LangChain ships a large collection of output parsers, and one distinguishing benefit is that many of them support streaming.

LLM -> text -> OutputParser -> Structured text

`https://python.langchain.com/docs/modules/model_io/output_parsers/quick_start/` -> more on built-in parsers (CSV, JSON, Pandas DataFrame, ...)
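To make the streaming point concrete, here is a minimal standard-library sketch of a parser that emits list items as soon as each one is complete, instead of waiting for the full response. It is not LangChain's implementation; the function name `stream_comma_separated` and the sample chunks are assumptions made up for illustration.

```python
from typing import Iterable, Iterator


def stream_comma_separated(chunks: Iterable[str]) -> Iterator[str]:
    """Yield comma-separated items as soon as each one is complete.

    Hypothetical helper for illustration only, not part of LangChain.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last comma is complete; the remainder stays buffered.
        *complete, buffer = buffer.split(",")
        for item in complete:
            item = item.strip()
            if item:
                yield item
    # Whatever is left once the stream ends is the final item.
    tail = buffer.strip()
    if tail:
        yield tail


# Simulated token chunks, as a streaming LLM might deliver them (made-up data).
chunks = ["foo, b", "ar,", " baz"]
items = list(stream_comma_separated(chunks))
print(items)
```

Because the function is a generator, a caller can start working with the first item before the last chunk has arrived.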

In [4]:
import os
from dotenv import load_dotenv
load_dotenv()

openai_key = os.getenv('OPEN_AI_KEY') # or
openai_keyy = os.environ.get('OPEN_AI_KEY')

openai_key == openai_keyy

True

In [23]:
from langchain.output_parsers import DatetimeOutputParser, CommaSeparatedListOutputParser, PydanticOutputParser
from langchain_openai import ChatOpenAI
from langchain.prompts import SystemMessagePromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate

chat = ChatOpenAI(openai_api_key=openai_key)

date_time = DatetimeOutputParser()
print(date_time.get_format_instructions())
# The format instructions you would otherwise write by hand in a system message are already baked into each parser class

Write a datetime string that matches the following pattern: '%Y-%m-%dT%H:%M:%S.%fZ'.

Examples: 1405-08-09T00:47:39.537155Z, 1560-06-03T00:10:57.956696Z, 1025-11-12T11:07:12.337299Z

Return ONLY this string, no other words!
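The pattern in these instructions is a standard `strptime` format string, so the parsing step can be sketched with the standard library alone. This is only an approximation of the idea, not `DatetimeOutputParser` itself; `parse_datetime_reply` is a hypothetical helper, and the reply string reuses the first example from the instructions above rather than a real model answer.

```python
from datetime import datetime

# Same pattern that the format instructions ask the model to follow.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_datetime_reply(text: str) -> datetime:
    """Parse a reply that should contain only a datetime string.

    Hypothetical helper for illustration; raises ValueError on any other text.
    """
    # strip() tolerates surrounding whitespace or a trailing newline.
    return datetime.strptime(text.strip(), DATETIME_FORMAT)


# Stand-in for a model reply (assumption: taken from the instruction examples).
reply = "1405-08-09T00:47:39.537155Z\n"
parsed = parse_datetime_reply(reply)
print(parsed)
```

A reply with any extra words fails to parse, which is why the instructions end with "Return ONLY this string".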


In [24]:
comma_separated = CommaSeparatedListOutputParser()
print(comma_separated.get_format_instructions())

Your response should be a list of comma separated values, eg: `foo, bar, baz`


In [25]:
human_template = "{request}\n{format_instruction}"
chat_promt = ChatPromptTemplate.from_messages([
    HumanMessagePromptTemplate.from_template(human_template)
])

print(chat_promt)

input_variables=['format_instruction', 'request'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['format_instruction', 'request'], template='{request}\n{format_instruction}'))]


In [26]:
formatted_prompt = chat_promt.format_messages(
    request = "When was world war 2 declared ?",
    format_instruction = date_time.get_format_instructions()
)
print(formatted_prompt)

[HumanMessage(content="When was world war 2 declared ?\nWrite a datetime string that matches the following pattern: '%Y-%m-%dT%H:%M:%S.%fZ'.\n\nExamples: 0223-07-24T04:54:33.860138Z, 0068-03-17T04:23:16.371689Z, 1032-08-05T08:55:46.108810Z\n\nReturn ONLY this string, no other words!")]


In [None]:
response = chat.invoke(formatted_prompt)
print(response.content)
print(date_time.parse(response.content))

#### Pydantic Output Parser

In [28]:
# define your data structure
from langchain_core.pydantic_v1 import BaseModel, Field
class Cricketer(BaseModel):
    name:str = Field(description = "Name of Cricketer")
    records:list = Field(description="Python list of records")

# Defining a custom data structure tells the LLM what structure the output should have

In [29]:
parser = PydanticOutputParser(pydantic_object=Cricketer)
print(parser.get_format_instructions())

The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"name": {"title": "Name", "description": "Name of Cricketer", "type": "string"}, "records": {"title": "Records", "description": "Python list of records", "type": "array", "items": {}}}, "required": ["name", "records"]}
```
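What the parser later does with a reply that follows this schema can be approximated with the standard library: decode the JSON, check the required keys and types, and build a typed object. The sketch below is not how `PydanticOutputParser` is implemented; `CricketerRecord`, `parse_cricketer`, and the placeholder reply are all assumptions for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class CricketerRecord:
    """Hypothetical stand-in for the Cricketer model defined above."""
    name: str
    records: list


def parse_cricketer(text: str) -> CricketerRecord:
    """Decode a JSON reply and check it against the two required fields."""
    data = json.loads(text)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in ("name", "records") if key not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    if not isinstance(data["name"], str) or not isinstance(data["records"], list):
        raise ValueError("'name' must be a string and 'records' a list")
    return CricketerRecord(name=data["name"], records=data["records"])


# Placeholder reply with made-up values, not real model output.
reply = '{"name": "Example Player", "records": ["record one", "record two"]}'
player = parse_cricketer(reply)
print(player)
```

Pydantic goes further than this sketch, handling type coercion and nested models, so treat it as a picture of the idea rather than a replacement.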


In [35]:
human_template = "{request}\n{format_instruction}"
chat_promt = ChatPromptTemplate.from_messages([
    HumanMessagePromptTemplate.from_template(human_template)
])

formatted_prompt = chat_promt.format_messages(
    request = " tell me about virat kholi",
    format_instruction = parser.get_format_instructions()
)
print(formatted_prompt)

[HumanMessage(content=' tell me about virat kholi\nThe output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"name": {"title": "Name", "description": "Name of Cricketer", "type": "string"}, "records": {"title": "Records", "description": "Python list of records", "type": "array", "items": {}}}, "required": ["name", "records"]}\n```')]


In [None]:
response = chat.invoke(formatted_prompt)
print(response.content)
print(parser.parse(response.content))