3. Output Parsing

# Output Parsing: Generating flower recommendation lists with OutputParser and Pydantic

## Output parsers in LangChain: two core methods and one optional method
- get_format_instructions
- parse
- parse_with_prompt (optional)

1. List Parser
2. Datetime Parser
3. Enum Parser
4. Structured output parser
5. Pydantic (Json) parser
6. Auto-Fixing Parser
7. RetryWithErrorOutputParser
...

Output parsers are classes that help structure language model responses. There are two main methods an output parser must implement:

- "Get format instructions": A method that returns a string containing instructions for how the output of a language model should be formatted.
- "Parse": A method that takes in a string (assumed to be the response from a language model) and parses it into some structure.

There is also one optional method:

- "Parse with prompt": A method that takes in a string (assumed to be the response from a language model) and a prompt (assumed to be the prompt that generated that response) and parses it into some structure. The prompt is provided mainly so that the OutputParser can retry or fix the output when it needs information from the prompt to do so.

In [None]:
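class OutputParser:
    """Skeleton of the output parser interface described above."""
    def __init__(self):
        pass
    def get_format_instructions(self):
        # Return a string that tells the model how to format its output
        pass
    def parse(self, model_output):
        # Turn the model's raw text into a structured object
        pass
    def parse_with_prompt(self, model_output, prompt):
        # Optional: parse with access to the original prompt (useful for retries and fixes)
        pass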
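
As a concrete illustration of the two required methods, here is a minimal sketch of a parser for comma-separated lists. The class name `CommaSeparatedListParser`, the variable names, and the instruction wording are assumptions made for this example; the class does not subclass any LangChain base class.

```python
from typing import List


class CommaSeparatedListParser:
    """Hypothetical parser that turns 'a, b, c' into ['a', 'b', 'c']."""

    def get_format_instructions(self) -> str:
        # This text would be appended to the prompt sent to the model.
        return "Your response should be a list of comma separated values, e.g. `foo, bar, baz`"

    def parse(self, model_output: str) -> List[str]:
        # Split on commas, strip surrounding whitespace, and drop empty items.
        return [item.strip() for item in model_output.split(",") if item.strip()]


list_parser = CommaSeparatedListParser()

# The format instructions go into the prompt; the model's reply goes into parse().
demo_prompt = "List three flowers.\n" + list_parser.get_format_instructions()
parsed_flowers = list_parser.parse("rose, lily , carnation")
print(parsed_flowers)
```

The same split between "tell the model the format" and "read the format back" applies to every parser in the list above; only the target structure changes.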

## 5. Pydantic (Json) parser
Pydantic is a Python library for data validation based on type-annotated classes.

In [1]:
# 1. Create the model instance
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
from langchain_openai import OpenAI

model = OpenAI(model_name='gpt-3.5-turbo-instruct')

In [3]:
# 2. Define the format of the output data
import pandas as pd
df = pd.DataFrame(columns=["flower_type","price","description","reason"])

flowers = ["rose","lily","carnation"]
prices = ["50","30","20"]

In [5]:
# from pydantic import BaseModel,Field
# from langchain_core.pydantic_v1 import BaseModel,Field
from pydantic import BaseModel, Field

class flowerDescription(BaseModel):
    flower_type:str = Field(description="Flower Type")
    price:int = Field(description="Price of the flower")
    description:str = Field(description="Description of the flower")
    reason:str = Field(description="Why writer this description like this?")


In [6]:
# 3. Create the output parser
from langchain.output_parsers import PydanticOutputParser
output_parser = PydanticOutputParser(pydantic_object=flowerDescription)

format_instruction = output_parser.get_format_instructions()
print(format_instruction)

The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"flower_type": {"description": "Flower Type", "title": "Flower Type", "type": "string"}, "price": {"description": "Price of the flower", "title": "Price", "type": "integer"}, "description": {"description": "Description of the flower", "title": "Description", "type": "string"}, "reason": {"description": "Why writer this description like this?", "title": "Reason", "type": "string"}}, "required": ["flower_type", "price", "description", "reason"]}
```
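
To make the "well-formatted instance" wording in these instructions concrete, the following sketch checks a candidate JSON string against the required keys and basic types of the schema. It is a shallow, standard-library check written for illustration, not the validation that `PydanticOutputParser` performs; the helper name `conforms` and the sample strings are assumptions.

```python
import json

# Trimmed copy of the schema printed above (only the parts this check needs).
schema = {
    "properties": {
        "flower_type": {"type": "string"},
        "price": {"type": "integer"},
        "description": {"type": "string"},
        "reason": {"type": "string"},
    },
    "required": ["flower_type", "price", "description", "reason"],
}

# Map JSON Schema type names to Python types for a shallow check.
PY_TYPES = {"string": str, "integer": int}


def conforms(raw: str, schema: dict) -> bool:
    """Return True if raw is a JSON object with every required key and a matching type."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(obj, dict):
        return False
    for key in schema["required"]:
        expected = PY_TYPES[schema["properties"][key]["type"]]
        if key not in obj or not isinstance(obj[key], expected):
            return False
    return True


# Illustrative instance: the fields appear at the top level of the object.
good = '{"flower_type": "rose", "price": 50, "description": "A classic.", "reason": "Short and clear."}'
# Wrapping the fields in "properties" copies the schema's shape instead of giving an instance.
bad = '{"properties": {"flower_type": "rose", "price": 50}}'

print(conforms(good, schema), conforms(bad, schema))
```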



In [7]:
# 4. Create the prompt template
from langchain.prompts import PromptTemplate
prompt_template = """You are a professional florist copywriter.\n
Can you provide an attractive brief description for the {flower} priced at ${price}?
{format_instructions}
"""

prompt = PromptTemplate.from_template(prompt_template,
                                      partial_variables={"format_instructions":format_instruction})

print(prompt)

input_variables=['flower', 'price'] input_types={} partial_variables={'format_instructions': 'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"flower_type": {"description": "Flower Type", "title": "Flower Type", "type": "string"}, "price": {"description": "Price of the flower", "title": "Price", "type": "integer"}, "description": {"description": "Description of the flower", "title": "Description", "type": "string"}, "reason": {"description": "Why writer this description like this?", "title": "Reason", "type": "string"}}, "required": ["flower_type", "price", "descripti


In [10]:
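# 5. Generate prompts, pass them to the model, and parse the outputs

for flower, price in zip(flowers, prices):
    # Fill in the template (named prompt_text to avoid shadowing the built-in input())
    prompt_text = prompt.format(flower=flower, price=price)

    # Call the model and get the raw completion string
    output = model.invoke(prompt_text)

    # Parse the completion into a flowerDescription instance and store it as a row
    parsed_output = output_parser.parse(output)
    parsed_output_dict = parsed_output.model_dump()
    df.loc[len(df)] = parsed_output_dict

print(df.to_dict(orient='records'))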

[{'flower_type': 'Rose', 'price': 50, 'description': 'This stunning rose is the perfect choice for any occasion. With its vibrant red petals and delicate fragrance, it is sure to make a statement. Hand-picked and expertly arranged by our skilled florists, this rose is a symbol of love, passion, and beauty. Give the gift of a $50 rose and show your loved one just how much they mean to you.', 'reason': 'This description highlights the beauty and significance of the rose, making it an irresistible choice for customers who want to express their love and affection through flowers.'}, 

{'flower_type': 'Lily', 'price': 30, 'description': 'The elegant and graceful Lily is a classic choice for any occasion. With its large, trumpet-shaped blooms and delicate fragrance, this flower exudes beauty and sophistication. Its deep green foliage adds a touch of contrast and complements the pure white petals perfectly. Whether used in a bouquet or as a standalone arrangement, the Lily is sure to make a statement and leave a lasting impression.', 'reason': 'This description highlights the timeless beauty and versatility of the Lily, making it an attractive choice for customers looking for a sophisticated and classic floral option at an affordable price.'}, 

{'flower_type': 'Carnation', 'price': 20, 'description': "Bright and versatile, the carnation is a classic choice for any occasion. With its ruffled petals and sweet scent, this flower brings a touch of elegance to any bouquet. At $20, it's an affordable option that doesn't compromise on beauty.", 'reason': "This description highlights the carnation's beauty and affordability, making it an attractive option for customers looking for a versatile and budget-friendly flower."}]


In [12]:
## 6. Auto-Fixing Parser

from langchain.output_parsers import PydanticOutputParser
from typing import List
# from langchain_core.pydantic_v1 import BaseModel,Field
from pydantic import BaseModel, Field

class Flower(BaseModel):
    name:str = Field(description="name of a flower")
    colors:List[str] = Field(description="the colors of this flower")

In [14]:
flower_query = "Generate the characteristics of a random flower."

# Well-formatted JSON would use double quotes:
# '{"name":"Carnation","colors":["Pink","White","Red","Purple","Yellow"]}'
# This string uses single quotes, so it is not valid JSON:
misformatted = "{'name':'Carnation','colors':['Pink','White','Red','Purple','Yellow']}"

parser = PydanticOutputParser(pydantic_object=Flower)
parser.parse(misformatted)

{
	"name": "OutputParserException",
	"message": "Invalid json output: {'name':'Carnation','colors':['Pink','White','Red','Purple','Yellow']}",
	"stack": "---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
File e:  langchain\\Lib\\site-packages\\langchain_core\\output_parsers\\json.py:212, in JsonOutputParser.parse_result(self, result, partial)
    211 try:
--> 212     return parse_json_markdown(text)
    213 except JSONDecodeError as e:

File e:  langchain\\Lib\\site-packages\\langchain_core\\output_parsers\\json.py:157, in parse_json_markdown(json_string, parser)
    156 # Parse the JSON string into a Python dictionary
--> 157 parsed = parser(json_str)
    159 return parsed

File e:  langchain\\Lib\\site-packages\\langchain_core\\output_parsers\\json.py:125, in parse_partial_json(s, strict)
    122 # If we got here, we ran out of characters to remove
    123 # and still couldn't parse the string as JSON, so return the parse error
    124 # for the original string.
--> 125 return json.loads(s, strict=strict)

File e:  langchain\\Lib\\json\\__init__.py:359, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    358     kw['parse_constant'] = parse_constant
--> 359 return cls(**kw).decode(s)

File e:  langchain\\Lib\\json\\decoder.py:337, in JSONDecoder.decode(self, s, _w)
    333 \"\"\"Return the Python representation of ``s`` (a ``str`` instance
    334 containing a JSON document).
    335 
    336 \"\"\"
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338 end = _w(s, end).end()

File e:  langchain\\Lib\\json\\decoder.py:353, in JSONDecoder.raw_decode(self, s, idx)
    352 try:
--> 353     obj, end = self.scan_once(s, idx)
    354 except StopIteration as err:

JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

The above exception was the direct cause of the following exception:

OutputParserException                     Traceback (most recent call last)
Cell In[16], line 7
      4 misformatted = \"{'name':'Carnation','colors':['Pink','White','Red','Purple','Yellow']}\"
      6 parser = PydanticOutputParser(pydantic_object=Flower)
----> 7 parser.parse(misformatted)

File e:  langchain\\Lib\\site-packages\\langchain_core\\output_parsers\\json.py:218, in JsonOutputParser.parse(self, text)
    217 def parse(self, text: str) -> Any:
--> 218     return self.parse_result([Generation(text=text)])

File e:  langchain\\Lib\\site-packages\\langchain\\output_parsers\\pydantic.py:23, in PydanticOutputParser.parse_result(self, result, partial)
     22 def parse_result(self, result: List[Generation], *, partial: bool = False) -> Any:
---> 23     json_object = super().parse_result(result)
     24     try:
     25         return self.pydantic_object.parse_obj(json_object)

File e:  langchain\\Lib\\site-packages\\langchain_core\\output_parsers\\json.py:215, in JsonOutputParser.parse_result(self, result, partial)
    213 except JSONDecodeError as e:
    214     msg = f\"Invalid json output: {text}\"
--> 215     raise OutputParserException(msg, llm_output=text) from e

OutputParserException: Invalid json output: {'name':'Carnation','colors':['Pink','White','Red','Purple','Yellow']}"
}
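
The traceback above bottoms out in a `JSONDecodeError` because JSON requires double-quoted strings, while the misformatted string uses Python-style single quotes. The sketch below reproduces that failure with the standard `json` module alone; the variable names are chosen for this example.

```python
import json

misformatted_demo = "{'name':'Carnation','colors':['Pink','White','Red','Purple','Yellow']}"
well_formatted_demo = '{"name":"Carnation","colors":["Pink","White","Red","Purple","Yellow"]}'

try:
    json.loads(misformatted_demo)
    error_position = None
except json.JSONDecodeError as err:
    # err.pos is the character index where decoding failed (the first single quote).
    error_position = err.pos
    print(f"JSON decoding failed at char {err.pos}: {err.msg}")

# The double-quoted version decodes without error.
flower_dict = json.loads(well_formatted_demo)
print(flower_dict["name"], flower_dict["colors"])
```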

In [15]:
from langchain.output_parsers import OutputFixingParser
from langchain_openai import ChatOpenAI
new_parser = OutputFixingParser.from_llm(parser=parser,llm=ChatOpenAI())

print(new_parser.parse(misformatted))

name='Rose' colors=['Red', 'Pink', 'White']
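
The idea behind an auto-fixing parser can be sketched as a loop: try to parse, and on failure send the bad completion and the error to a second model, then parse its answer. The code below is a simplified illustration under stated assumptions, not LangChain's implementation: `parse_flower`, `parse_with_fix`, and `fake_fix_llm` are hypothetical names, and the "LLM" is a stub that only swaps quote characters. Note that the repair request here does not contain the original question.

```python
import json
from typing import Callable


def parse_flower(text: str) -> dict:
    """Strict parser: valid JSON with 'name' and 'colors' keys, else ValueError."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(f"Invalid json output: {text}") from err
    if not isinstance(obj, dict) or not {"name", "colors"} <= obj.keys():
        raise ValueError(f"Missing fields in: {text}")
    return obj


def parse_with_fix(text: str, fix_llm: Callable[[str], str], max_retries: int = 1) -> dict:
    """Try to parse; on failure, ask fix_llm for a corrected completion and try again."""
    for attempt in range(max_retries + 1):
        try:
            return parse_flower(text)
        except ValueError as err:
            if attempt == max_retries:
                raise
            # The repair request carries the bad completion and the error only.
            text = fix_llm(f"Completion:\n{text}\n\nError:\n{err}\n\nReturn corrected JSON only.")


def fake_fix_llm(request: str) -> str:
    # Stand-in for a chat model: fixes the quotes on the line after "Completion:".
    bad_completion = request.split("\n")[1]
    return bad_completion.replace("'", '"')


fixed = parse_with_fix("{'name':'Carnation','colors':['Pink','White']}", fake_fix_llm)
print(fixed)
```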



In [46]:
# 7. RetryWithErrorOutputParser (uses parse_with_prompt)

template = """Based on the user question, provide an Action and Action input for what step should be taken.
{format_instructions}
Question:{query}
Response:
"""

class Action(BaseModel):
    action:str = Field(description="action to take")
    action_input:str = Field(description="input to the action")
    action_output: str = Field(description="Response or result from executing the action")

In [47]:
parser = PydanticOutputParser(pydantic_object=Action)

In [58]:
prompt = PromptTemplate(
    template="""
You are a helpful assistant.

Given a user question, decide the action, describe the action input, and **provide the factual response in `action_output`**.

{format_instructions}

Question: {query}
Response:
""",
    # template=template,
    input_variables=["query"],
    partial_variables={"format_instructions":parser.get_format_instructions()}
)

prompt_value = prompt.format_prompt(query="What are the colors of Orchid?")

In [59]:
response = '{"action":"search"}'
parser.parse(response)

{
	"name": "OutputParserException",
	"message": "Failed to parse Action from completion {'action': 'search'}. Got: 1 validation error for Action
action_input
  field required (type=value_error.missing)",
	"stack": "---------------------------------------------------------------------------
ValidationError                           Traceback (most recent call last)
File site-packages\\langchain\\output_parsers\\pydantic.py:25, in PydanticOutputParser.parse_result(self, result, partial)
     24 try:
---> 25     return self.pydantic_object.parse_obj(json_object)
     26 except ValidationError as e:

File site-packages\\pydantic\\v1\\main.py:526, in BaseModel.parse_obj(cls, obj)
    525         raise ValidationError([ErrorWrapper(exc, loc=ROOT_KEY)], cls) from e
--> 526 return cls(**obj)

File site-packages\\pydantic\\v1\\main.py:341, in BaseModel.__init__(__pydantic_self__, **data)
    340 if validation_error:
--> 341     raise validation_error
    342 try:

ValidationError: 1 validation error for Action
action_input
  field required (type=value_error.missing)

During handling of the above exception, another exception occurred:

OutputParserException                     Traceback (most recent call last)
Cell In[21], line 2
      1 response = '{\"action\":\"search\"}'
----> 2 parser.parse(response)

File site-packages\\langchain_core\\output_parsers\\json.py:218, in JsonOutputParser.parse(self, text)
    217 def parse(self, text: str) -> Any:
--> 218     return self.parse_result([Generation(text=text)])

File site-packages\\langchain\\output_parsers\\pydantic.py:29, in PydanticOutputParser.parse_result(self, result, partial)
     27 name = self.pydantic_object.__name__
     28 msg = f\"Failed to parse {name} from completion {json_object}. Got: {e}\"
---> 29 raise OutputParserException(msg, llm_output=json_object)

OutputParserException: Failed to parse Action from completion {'action': 'search'}. Got: 1 validation error for Action
action_input
  field required (type=value_error.missing)"
}

In [60]:
fix_parser = OutputFixingParser.from_llm(parser=parser,llm=ChatOpenAI())
result = fix_parser.parse(response)
print(result)

action='search' action_input='keyword' action_output='results'




In [54]:
from langchain.output_parsers import RetryWithErrorOutputParser

retry_parser = RetryWithErrorOutputParser.from_llm(parser=parser,llm=OpenAI(temperature=0))
parse_result = retry_parser.parse_with_prompt(response,prompt_value)
print(parse_result)

In [61]:
from langchain.output_parsers import RetryOutputParser

retry_parser = RetryOutputParser.from_llm(parser=parser,llm=OpenAI(temperature=0))
parse_result = retry_parser.parse_with_prompt(response,prompt_value)
print(parse_result)

action='search' action_input='colors of Orchid' action_output='Orchid comes in a variety of colors including pink, purple, and white.'
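
The difference from the fixing parser is that `parse_with_prompt` also receives the prompt, so the retry can answer the actual question instead of filling in placeholder values such as `'keyword'`. The sketch below illustrates that idea with the standard library only. It is not LangChain's implementation: `parse_action`, the retry message wording, and `fake_retry_llm` are assumptions, and the stub model returns a canned answer.

```python
import json
from typing import Callable

REQUIRED_KEYS = {"action", "action_input", "action_output"}


def parse_action(text: str) -> dict:
    """Strict parser for the Action fields; raises ValueError if any are missing."""
    obj = json.loads(text)
    missing = REQUIRED_KEYS - obj.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return obj


def parse_with_prompt(completion: str, prompt_text: str, retry_llm: Callable[[str], str]) -> dict:
    """Parse the completion; on failure, ask again with the original prompt attached."""
    try:
        return parse_action(completion)
    except ValueError as err:
        # The retry request includes the prompt, the failed completion, and the error.
        retry_request = (
            f"Prompt:\n{prompt_text}\n\nCompletion:\n{completion}\n\n"
            f"The completion did not satisfy the prompt because: {err}\nPlease try again:"
        )
        return parse_action(retry_llm(retry_request))


def fake_retry_llm(request: str) -> str:
    # Stand-in for a real model: returns a canned answer once the question reaches it.
    assert "Orchid" in request
    return json.dumps({
        "action": "search",
        "action_input": "colors of Orchid",
        "action_output": "placeholder answer",
    })


result_demo = parse_with_prompt(
    '{"action":"search"}',
    "Question: What are the colors of Orchid?",
    fake_retry_llm,
)
print(result_demo["action_input"])
```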


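Homework:
1. Try other types of output parsers and share your results.

Output parsers are runnables, so they can be composed with prompts and models using LCEL (LangChain Expression Language).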

In [53]:
from langchain.output_parsers import RetryWithErrorOutputParser

llm = OpenAI(temperature=0)

retry_parser = RetryWithErrorOutputParser.from_llm(parser=parser, llm=llm)

# Now run:
response = llm.invoke(prompt_value)
result = retry_parser.parse_with_prompt(response, prompt_value)
print(result)

action='search' action_input='Orchid' action_output='Orchids come in a variety of colors, including pink, purple, white, and yellow.'


In [55]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0)
# The | operator composes the prompt, model, and parser into a single runnable chain
chain = prompt | llm | parser

result = chain.invoke({"query": "What are the colors of Orchid?"})
print(result)


action='search' action_input='Orchid colors' action_output='Orchids come in a variety of colors including purple, white, pink, and yellow.'
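
The `prompt | llm | parser` syntax relies on Python's `|` operator being overloaded so that each stage feeds its output to the next. The toy below sketches that composition pattern with a hypothetical `Step` class; it is an illustration of the idea, not LangChain's runnable implementation, and the "model" stage is a stub that echoes part of its input as JSON.

```python
import json


class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Three toy stages mirroring prompt | llm | parser.
prompt_step = Step(lambda inputs: f"Question: {inputs['query']}\nResponse:")
fake_llm_step = Step(lambda text: json.dumps({"echo": text.splitlines()[0]}))
parser_step = Step(json.loads)

toy_chain = prompt_step | fake_llm_step | parser_step
toy_result = toy_chain.invoke({"query": "What are the colors of Orchid?"})
print(toy_result)
```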
