# Fixing & Retry Output Parsers

**Results:**

* GPT & AI21 worked well
* Cohere gave mixed results

https://python.langchain.com/docs/modules/model_io/output_parsers/types/output_fixing

https://api.python.langchain.com/en/stable/output_parsers/langchain.output_parsers.fix.OutputFixingParser.html#

## Setup the environment

In [1]:
from IPython.display import JSON
from dotenv import load_dotenv
import os

import warnings

warnings.filterwarnings("ignore")

# Load the file that contains the API keys
load_dotenv('C:\\Users\\raj\\.jupyter\\.env')

True

## Create LLM

In [2]:
import sys
 
# setting path
sys.path.append('../')

from utils.create_llm import create_gpt_llm, create_cohere_llm, create_ai21_llm

# OpenAI GPT args
openai_args = {"max_tokens": -1, "temperature": 0}
llm_gpt = create_gpt_llm(openai_args)

llm_cohere = create_cohere_llm()

llm_ai21 = create_ai21_llm()

## 1. Output fixing Parser

1. Needs the actual parser
2. Uses an LLM to fix structural issues with data
   
The fixing parser first tries the actual parser, if there is an issue it requests the LLM to fix the issue. Once the issue is addressed, the parser tries to parse again. The number of retries made by fixing parse is controlled by the max-retries




In [3]:
from langchain_core.pydantic_v1 import BaseModel, Field, validator
from langchain.output_parsers import PydanticOutputParser
from langchain.output_parsers import OutputFixingParser
from langchain import PromptTemplate

### Create Pydantic parser

In [4]:
# Create the class for JSON representation
class Actor(BaseModel):
    name: str = Field(description="name of an actor")
    film_names: list[str] = Field(description="list of names of films they starred in")

pydantic_output_parser = PydanticOutputParser(pydantic_object=Actor)

### Create Output Fixing parser

In [5]:
# Create fixing output parser with retries
max_fixing_retries = 2
fixing_parser = OutputFixingParser.from_llm(parser=pydantic_output_parser, llm=llm_cohere, max_retries=max_fixing_retries)

### Utility functions

In [6]:
def   run_with_pydantic(parse_string):
    # Pydantic : Handle the parse error with try-except
    try:
        object = pydantic_output_parser.parse(parse_string)
        print("Pydantic parser : Successful : object")
    except:
        print("Pydantic parser : Exception")

def  run_with_fixing(parse_string):
    # Fixing : Handle the parse error with try-except
    try:
        object = fixing_parser.parse(parse_string)
        print("Fixing parser : Success : ", object)
    except:
        print("Fixing parser : Exception")

### Test with Pydantic & Fixing parsers

In [9]:
# LLM Query
actor_query = "Generate the filmography for a Tom Hanks."

# Example of well formed JSON from LLM
well_formed = '{"name": "Tom Hanks", "film_names": ["Forrest Gump"]}'

# Example of mis-formatted outputs from LLM
mis_formatted = [
    "{'name': 'Tom Hanks', 'film_names': ['Forrest Gump']}",                   # Use of single quote for attributes (instead of double quotes)
    '{"name": "Tom Hanks", "film_names": ["Forrest',                           # Partial JSON !!
    '{"name": "Tom Hanks", "years": [1994],  "film_names": ["Forrest Gump"]}', # Hallucination : Extra field year
        
    '{"name": "Tom Hanks"}',                                                   # Missed out field   - NOT fixed by fixing parser
    '{"nam_87uiy3gjfhj": "Tom Hanks", "film_names": ["Forrest Gump"]}'           # Mispelled namefield     - NOT fixed by fixing parser
]

# Change index to try out different jsons in the array above
for i, parse_string in enumerate(mis_formatted):
    print("String-",i)
    run_with_pydantic(parse_string)
    run_with_fixing(parse_string)

String- 0
Pydantic parser : Exception
Fixing parser : Exception
String- 1
Pydantic parser : Successful : object
Fixing parser : Success :  name='Tom Hanks' film_names=['Forrest']
String- 2
Pydantic parser : Successful : object
Fixing parser : Success :  name='Tom Hanks' film_names=['Forrest Gump']
String- 3
Pydantic parser : Exception
Fixing parser : Success :  name='Tom Hanks' film_names=['Forrest Gump', 'Cast Away', 'A League of Their Own']
String- 4
Pydantic parser : Exception
Fixing parser : Success :  name='Tom Hanks' film_names=['Forrest Gump']


## 2. Retry Parser

Use if the Fixing parser is also giving an exception !!

This sends the prompt and the original output to the LLM to try again to get a better result.

https://python.langchain.com/docs/modules/model_io/output_parsers/types/retry

https://api.python.langchain.com/en/stable/output_parsers/langchain.output_parsers.retry.RetryOutputParser.html#

https://api.python.langchain.com/en/stable/output_parsers/langchain.output_parsers.retry.RetryWithErrorOutputParser.html#langchain.output_parsers.retry.RetryWithErrorOutputParser

In [None]:
from langchain.output_parsers import RetryWithErrorOutputParser
from langchain.output_parsers import RetryOutputParser

max_retries = 2
retry_parser = RetryWithErrorOutputParser.from_llm(parser=pydantic_output_parser, llm=llm_cohere, max_retries=max_retries)


### Utility function

In [None]:
def   run_with_retry(parse_string, prompt):
    # Pydantic : Handle the parse error with try-except
    try:
        object = retry_parser.parse_with_prompt(parse_string, prompt)
        print("Retry parser : Successful : ", object)
    except:
        print("Retry parser : Exception")


### Create a prompt for testing

In [None]:
actor_query_template = """
Generate the filmography for {actor}.
Format Instructions:
{format_instructions}
"""

actor_query_prompt_template = PromptTemplate(
    template = actor_query_template,
    input_variables = ['actor'],
    partial_variables = {"format_instructions": pydantic_output_parser.get_format_instructions()}
)

# Responses at index= 4,5
parse_string = mis_formatted[4]

# Pay attention to the use of format_prompt instead of format. 
# https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.prompt.PromptTemplate.html#
prompt_value = actor_query_prompt_template.format_prompt(actor='Tom Hanks')

# type(prompt_value)

### Run the test

In [None]:
for i, parse_string in enumerate(mis_formatted):
    print("String-",i)
    run_with_retry(parse_string, prompt_value)