# Detailed, in-depth practice with all the output parsers available in LangChain

### 1. Initial import

In [357]:
from langchain_groq import ChatGroq
import os
from dotenv import load_dotenv
load_dotenv()

True

### 2. Setup

In [358]:
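# load_dotenv() has already copied the .env values into os.environ, so the keys
# only need to be checked. Assigning os.getenv(...) back into os.environ would
# raise a TypeError if a key were missing.

# GROQ key, LangSmith tracking/tracing key, and LangChain project name
for key in ("GROQ_API_KEY", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT"):
    if not os.getenv(key):
        raise RuntimeError(f"{key} is not set; add it to your .env file")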

In [359]:
# Model setup
model = ChatGroq(model="llama-3.3-70b-versatile", temperature=0)
print(model)

client=<groq.resources.chat.completions.Completions object at 0x0000017CC0B2EC90> async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x0000017CC303E350> model_name='llama-3.3-70b-versatile' temperature=1e-08 model_kwargs={} groq_api_key=SecretStr('**********')


In [360]:
# Initial result check from model
result = model.invoke("What is langchain?")
print(result.content)

LangChain is an open-source framework designed to help developers build applications that utilize large language models (LLMs) more efficiently. It was created to simplify the process of integrating LLMs into various projects, making it easier for developers to focus on building their applications rather than worrying about the complexities of working with these powerful models.

LangChain provides a set of tools and APIs that allow developers to interact with LLMs in a more structured and standardized way. This includes features such as:

1. **Model management**: LangChain allows developers to easily switch between different LLMs, including popular models like LLaMA, PaLM, and BERT.
2. **Prompt engineering**: The framework provides tools to help developers craft effective prompts that elicit the desired responses from the LLMs.
3. **Chain management**: LangChain enables developers to create complex workflows, or "chains," that involve multiple LLMs and other components, such as databa

### 3. Prompt Engineering - Technically

#### I am going to use multiple prompt templates with each output parser.

StringPromptTemplate: Formats a single string; suited to simpler inputs.

ChatPromptTemplate: Formats a list of messages.

MessagesPlaceholder: Inserts a list of messages (for example, a conversation history) at a given position in a chat prompt.
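
To make the difference between the three concrete, here is a minimal plain-Python sketch of what each one produces. It does not use LangChain; `format_string_prompt`, `format_chat_prompt`, and the `"placeholder"` role marker are hypothetical names chosen for illustration only.

```python
# Plain-Python sketch of the three prompt styles (no LangChain involved).
# All names here are illustrative assumptions, not LangChain APIs.

def format_string_prompt(template: str, **variables) -> str:
    """String prompt: one template in, one formatted string out."""
    return template.format(**variables)


def format_chat_prompt(messages, **variables):
    """Chat prompt: a list of (role, template) pairs becomes a list of messages.

    An entry of the form ("placeholder", name) is replaced by the list of
    (role, text) messages passed in under that name.
    """
    formatted = []
    for role, content in messages:
        if role == "placeholder":
            # Splice the supplied conversation history in at this position.
            formatted.extend(variables[content])
        else:
            formatted.append((role, content.format(**variables)))
    return formatted


single = format_string_prompt("Tell me a joke about {topic}", topic="cats")

history = [
    ("human", "What is the capital of France?"),
    ("ai", "The capital of France is Paris."),
]
chat = format_chat_prompt(
    [("system", "You know every capital city."), ("placeholder", "msgs")],
    msgs=history,
)

print(single)
print(chat)
```

The string prompt yields a single string, while the chat prompt yields an ordered list of role-tagged messages with the history spliced in where the placeholder sat.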

In [361]:
# Import prompt templates
from langchain_core.prompts import PromptTemplate, ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage

In [362]:
# Sample StringPromptTemplate
simple_prompt = PromptTemplate.from_template("Tell me a joke about {topic}")

# Chaining
chain_simple_prompt = simple_prompt | model

# Result
response = chain_simple_prompt.invoke({"topic": "Prompt Engineering"})
print(response.content)

Why did the prompt engineer break up with his girlfriend?

Because he wanted a more "specific" and "well-defined" relationship, but she kept giving him "ambiguous" and "open-ended" responses! Now he's just trying to "fine-tune" his dating life.


In [363]:
# ChatPromptTemplate
chat_prompt =   ChatPromptTemplate(
    [
        ("system", "You are best standup comedian of world."),
        ("user", "Tell me a joke about {topic}")
    ]
)

# Chaining
chain_chat = chat_prompt | model

# Result
response = chain_chat.invoke({"topic": "Prompt Engineering"})
print(response.content)

You know, Prompt Engineering is like a relationship. You try to craft the perfect prompt, and it's like writing a love letter. But sometimes, the AI just interprets it in a way that's like, 'I said I love you, not I love pineapples on pizza!' (laughs)

I mean, have you ever tried to get an AI to do what you want? It's like trying to get a cat to do tricks for treats. You're like, 'Okay, I'll give you a cookie if you just generate a simple text...' And the AI is like, 'Sure, here's a 10-page essay on the history of cookies...' (laughs)

But seriously, Prompt Engineering is an art. It's like trying to solve a puzzle while blindfolded and being attacked by a swarm of bees. You're like, 'I think I've got it... nope, just got stung again!' (laughs)


In [364]:
# MessagesPlaceholder
message_prompt = ChatPromptTemplate(
    [
        ("system", "You are a very good student of Geography who has knowledge of all countries and their capital names"),
        MessagesPlaceholder("msgs")
    ]
)

# Conversation history
messages_to_pass = [
    HumanMessage(content="What is the capital of France?"),
    AIMessage(content= "The capital of France is 'Paris',"),
    HumanMessage(content="And what about Germany?")
]

# Chaining
chain_messages = message_prompt | model

# Result
response = chain_messages.invoke({"msgs": messages_to_pass})

print(response.content)

The capital of Germany is 'Berlin'.


### 4. Output parsers

String: Parses the LLM result into a plain string.

JSON: Parses the LLM result into a JSON object (a Python dict).

XML: Parses an XML-formatted LLM result into a Python dict.

CSV: Parses a comma-separated LLM result into a list of values.

Output Fixing parser: Wraps another output parser. If the output fails to parse, it passes the error and the bad output to an LLM and asks it to fix the output.

RetryWithError: Like the output-fixing parser, but it also sends the original instructions.

Pydantic: Takes a user-defined Pydantic model and returns data in that format.

YAML: Takes a user-defined Pydantic model and returns data in that format, using YAML to encode it.

PandasDataframe: Useful for doing operations on a Pandas DataFrame.

Enum: Parses the response into one of the provided enum values.

Datetime: Parses the response into a datetime object.

Structured: Returns structured information based on user-defined response schemas.

**Most parsers will be tried with both a simple prompt template and a chat prompt template.**
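
Conceptually, each parser is a function from the model's text to a Python value. The sketch below approximates three of the simpler entries in the list (CSV, Enum, Datetime) with the standard library only. The function names, the `Sentiment` enum, and the timestamp format are assumptions made for illustration; this is not LangChain's implementation.

```python
import csv
import io
from datetime import datetime
from enum import Enum


class Sentiment(Enum):  # hypothetical enum, used only in this example
    POSITIVE = "positive"
    NEGATIVE = "negative"


def parse_comma_separated(text: str) -> list:
    """Split one line of comma-separated values into a list of strings."""
    reader = csv.reader(io.StringIO(text.strip()), skipinitialspace=True)
    return [item.strip() for item in next(reader)]


def parse_enum(text: str, enum_cls):
    """Map the reply onto one enum member; raises ValueError if none matches."""
    return enum_cls(text.strip().lower())


def parse_datetime(text: str, fmt: str = "%Y-%m-%dT%H:%M:%S") -> datetime:
    """Turn a timestamp string into a datetime object (format is assumed)."""
    return datetime.strptime(text.strip(), fmt)


items = parse_comma_separated("foo, bar, baz")
label = parse_enum(" Positive\n", Sentiment)
when = parse_datetime("2024-01-31T09:30:00")

print(items, label, when)
```

In each case a reply that does not fit the expected shape raises an exception, which is the hook that the fixing and retry parsers build on.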

In [365]:
# Import all necessary parsers

from langchain_core.output_parsers.string import StrOutputParser
from langchain_core.output_parsers.json import JsonOutputParser
from langchain_core.output_parsers.xml import XMLOutputParser
from langchain_core.output_parsers.list import CommaSeparatedListOutputParser
from langchain.output_parsers.fix import OutputFixingParser
from langchain.output_parsers.retry import RetryWithErrorOutputParser
from langchain_core.output_parsers import PydanticOutputParser
from langchain.output_parsers import YamlOutputParser
from langchain.output_parsers.pandas_dataframe import PandasDataFrameOutputParser
from langchain.output_parsers.datetime import DatetimeOutputParser
from langchain.output_parsers.structured import StructuredOutputParser, ResponseSchema

##### 4.1.1. With StrOutputParser

In [366]:
# Define the object
str_output_parser = StrOutputParser()

In [367]:
# Set up prompt templates

simple_prompt_str = PromptTemplate(
    template="You are the best AI Engineer of the world. Answer the query \n {query} \n",
    input_variables=["query"]
)



chat_prompt_str = ChatPromptTemplate.from_messages(
    [
        ("system", "You are the best AI engineer of the world. Please response to the user as per query from user in normal string format."),
        ("user", "{query}")
    ]
)

With Simple prompt template

In [368]:
# Chaining
chain_str = simple_prompt_str | model | str_output_parser

# Result
response = chain_str.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response)



As the best AI Engineer in the world, I'm delighted to share my expertise on Prompt Engineering.

**What is Prompt Engineering?**

Prompt Engineering is a subfield of Artificial Intelligence (AI) and Natural Language Processing (NLP) that focuses on designing, optimizing, and fine-tuning the input prompts or queries that are used to interact with language models, chatbots, or other AI systems. The goal of Prompt Engineering is to craft high-quality prompts that elicit specific, accurate, and relevant responses from the AI system.

**Why is Prompt Engineering important?**

Prompt Engineering is crucial because the quality of the input prompt significantly impacts the quality of the output response. A well-designed prompt can:

1. **Improve accuracy**: By providing clear and concise context, a good prompt can help the AI system understand the intent and scope of the query, leading to more accurate responses.
2. **Increase relevance**: A well-crafted prompt can guide the AI system to prov

With Chat Prompt template

In [369]:
# Chaining
chain_str = chat_prompt_str | model | str_output_parser

# Result
response = chain_str.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response)



Prompt engineering is a relatively new field that has gained significant attention in recent years, especially with the rise of large language models and AI chatbots. It refers to the process of designing, crafting, and optimizing text prompts or inputs that are used to interact with artificial intelligence systems, such as language models, chatbots, or other machine learning models.

The goal of prompt engineering is to elicit specific, accurate, and relevant responses from the AI system, while also ensuring that the prompts are clear, concise, and unambiguous. This involves understanding the strengths and limitations of the AI model, as well as the context and requirements of the task or application.

Prompt engineering involves a range of techniques, including:

1. **Prompt design**: Crafting well-structured and clear prompts that effectively communicate the task or question to the AI system.
2. **Prompt optimization**: Refining and iterating on prompts to improve the accuracy, rele

##### 4.1.2. With JsonOutputParser

In [370]:
# Define the object
json_output_parser = JsonOutputParser()
json_output_parser.get_format_instructions()

'Return a JSON object.'
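
A model asked for JSON frequently wraps its reply in a Markdown code fence, as the raw replies later in this section do, so the fence has to be stripped before decoding. Below is a rough standard-library sketch of that step. `extract_json` is a hypothetical helper, not a description of how `JsonOutputParser` works internally.

```python
import json
import re

# Three backticks, built at runtime so the marker does not appear literally here.
FENCE = "`" * 3

# Matches an optional "json" tag after the opening fence and captures the body.
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, re.DOTALL)


def extract_json(reply: str):
    """Decode JSON from a reply that may be wrapped in a Markdown code fence."""
    match = FENCED_BLOCK.search(reply)
    payload = match.group(1) if match else reply
    return json.loads(payload.strip())


fenced_reply = (
    FENCE
    + 'json\n{"definition": "Designing prompts", "key_aspects": ["clarity"]}\n'
    + FENCE
)
plain_reply = '{"goal": "accurate responses"}'

print(extract_json(fenced_reply))
print(extract_json(plain_reply))
```

A reply that is cut off before the closing brace would raise `json.JSONDecodeError` in this sketch.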

In [371]:
# Set up prompt templates

simple_prompt_json = PromptTemplate(
    template="You are the best AI Engineer of the world. Answer the user query \n {format_instruction} \n {query} \n",
    input_variables=["query"],
    partial_variables={"format_instruction":json_output_parser.get_format_instructions()},
)



chat_prompt_json = ChatPromptTemplate.from_messages(
    [
        ("system", "You are the best AI engineer of the world. Please response to the user as per query from user in proper JSON format."),
        ("user", "{query}")
    ]
)

With Simple prompt template

In [372]:
# Chaining
chain_json = simple_prompt_json | model

# Result
response = chain_json.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response.content)

```json
{
  "definition": "Prompt Engineering is the process of designing and optimizing text prompts to effectively interact with language models, such as chatbots, virtual assistants, or other AI systems.",
  "goal": "The primary goal of Prompt Engineering is to elicit specific, accurate, and relevant responses from the language model, while minimizing errors, ambiguities, and misunderstandings.",
  "key_aspects": [
    "Clear and concise language",
    "Well-defined context",
    "Specific and relevant keywords",
    "Avoidance of ambiguity and uncertainty",
    "Use of relevant examples or analogies"
  ],
  "benefits": [
    "Improved accuracy and relevance of responses",
    "Increased efficiency and effectiveness of language model interactions",
    "Enhanced user experience and satisfaction",
    "Better handling of complex and nuanced queries"
  ],
  "applications": [
    "Chatbots and virtual assistants",
    "Language translation and localization",
    "Text summarization and

With Chat prompt template

In [373]:
# Chaining
chain_json = chat_prompt_json | model

# Result
response = chain_json.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response.content)

```json
{
  "response": {
    "definition": "Prompt Engineering is the process of designing and optimizing text prompts to elicit specific, accurate, and relevant responses from language models, such as chatbots, virtual assistants, or other AI systems.",
    "key_aspects": [
      "Understanding the capabilities and limitations of the language model",
      "Defining clear and specific goals for the prompt",
      "Crafting well-structured and unambiguous prompts",
      "Testing and refining prompts to achieve desired outcomes"
    ],
    "applications": [
      "Improving chatbot conversations",
      "Enhancing language translation accuracy",
      "Optimizing text summarization and generation",
      "Developing more effective virtual assistants"
    ],
    "benefits": [
      "Increased accuracy and relevance of responses",
      "Improved user experience and engagement",
      "Enhanced efficiency and productivity",
      "Better decision-making and insight generation"
    ]
  }

##### 4.1.3. With XMLOutputParser

In [374]:
# Define the object
xml_output_parser = XMLOutputParser()
xml_output_parser.get_format_instructions()

'The output should be formatted as a XML file.\n1. Output should conform to the tags below.\n2. If tags are not given, make them on your own.\n3. Remember to always open and close all the tags.\n\nAs an example, for the tags ["foo", "bar", "baz"]:\n1. String "<foo>\n   <bar>\n      <baz></baz>\n   </bar>\n</foo>" is a well-formatted instance of the schema.\n2. String "<foo>\n   <bar>\n   </foo>" is a badly-formatted instance.\n3. String "<foo>\n   <tag>\n   </tag>\n</foo>" is a badly-formatted instance.\n\nHere are the output tags:\n```\nNone\n```'

In [375]:
# Set up prompt templates

simple_prompt_xml = PromptTemplate(
    template="You are the best AI Engineer of the world. Answer the user query \n {format_instruction} \n {query} \n",
    input_variables=["query"],
    partial_variables={"format_instruction":xml_output_parser.get_format_instructions()},
)



chat_prompt_xml = ChatPromptTemplate.from_messages(
    [
        ("system", "You are the best AI engineer of the world. Please response to the user as per query from user in proper XML format."),
        ("user", "{query}")
    ]
)

With Simple prompt template

In [376]:
# Chaining
chain_xml = simple_prompt_xml | model

# Result
response = chain_xml.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response.content)

<response>
  <prompt_engineering>
    <definition>Prompt engineering is the process of designing and optimizing text prompts to elicit specific, accurate, and relevant responses from language models like myself.</definition>
    <importance>Prompt engineering is crucial in natural language processing (NLP) as it helps to improve the performance and reliability of language models, enabling them to better understand the context and intent behind a given prompt.</importance>
    <techniques>
      <technique>Well-defined prompts</technique>
      <technique>Clear and concise language</technique>
      <technique>Specific keywords and phrases</technique>
      <technique>Contextual information</technique>
    </techniques>
    <applications>
      <application>Chatbots and virtual assistants</application>
      <application>Language translation and localization</application>
      <application>Text summarization and generation</application>
      <application>Sentiment analysis and opinion

With Chat prompt template

In [377]:
# Chaining
chain_xml = chat_prompt_xml | model

# Result
response = chain_xml.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response.content)

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <prompt>Introduction to Prompt Engineering</prompt>
  <description>Prompt engineering is a subfield of natural language processing (NLP) that focuses on designing and optimizing text prompts to elicit specific, accurate, and relevant responses from language models.</description>
  <key_points>
    <point>Definition: Prompt engineering involves crafting input text that guides the language model to produce desired outputs.</point>
    <point>Goals: The primary goals of prompt engineering are to improve the performance, accuracy, and reliability of language models.</point>
    <point>Techniques: Prompt engineers use various techniques, such as prompt templating, priming, and fine-tuning, to optimize prompts for specific tasks and models.</point>
    <point>Applications: Prompt engineering has numerous applications, including text classification, sentiment analysis, question answering, and text generation.</point>
  </key_points>
  <benef

**Conclusion: With ChatPromptTemplate, the response is much cleaner.**

##### 4.1.4. With CommaSeparatedListOutputParser (CSV)

In [378]:
# Define the object
csv_output_parser = CommaSeparatedListOutputParser()

In [379]:
csv_output_parser.get_format_instructions()

'Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`'

In [380]:
# Set up prompt templates

simple_prompt_csv = PromptTemplate(
    template="You are the best AI Engineer of the world. Answer the user query \n {format_instruction} \n {query} \n",
    input_variables=["query"],
    partial_variables={"format_instruction":csv_output_parser.get_format_instructions()},
)



chat_prompt_csv = ChatPromptTemplate.from_messages(
    [
        ("system", "You are the best AI engineer of the world. Please response to the user as per query from user in proper CSV format."),
        ("user", "{query}")
    ]
)

With Simple prompt template

In [381]:
# Chaining
chain_csv = simple_prompt_csv | model

# Result
response = chain_csv.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response.content)

Prompt engineering, prompt design, natural language processing, language model fine-tuning, human-computer interaction, conversational AI, dialogue systems, text generation, language understanding, machine learning, AI safety, adversarial testing, robustness evaluation, explainability methods, transparency techniques, human-centered design, user experience, cognitive architectures, decision-making models, cognitive biases, human factors, evaluation metrics, performance optimization, knowledge graph embedding, semantic search, question answering, text classification, sentiment analysis, named entity recognition, part-of-speech tagging, dependency parsing, coreference resolution, information retrieval, knowledge discovery, data mining, human-language technology, cognitive computing, artificial general intelligence, neural networks, deep learning, transformer models, attention mechanisms, language model architectures, transfer learning, few-shot learning, meta-learning, multimodal learnin

With Chat prompt template

In [382]:
# Chaining
chain_csv = chat_prompt_csv | model

# Result
response = chain_csv.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response.content)

"Category","Description","Key Points"
"Introduction","Prompt engineering is a subfield of natural language processing (NLP) that focuses on designing and optimizing text prompts to elicit specific responses from language models.","Definition, Importance, Applications"
"Definition","Prompt engineering involves crafting input text that guides the language model to generate desired outputs, taking into account the model's strengths, weaknesses, and biases.","Input Text, Language Model, Output"
"Importance","Well-designed prompts can significantly improve the performance and reliability of language models, enabling them to better understand the context and generate more accurate and relevant responses.","Performance, Reliability, Context Understanding"
"Applications","Prompt engineering has various applications, including chatbots, virtual assistants, language translation, text summarization, and content generation.","Chatbots, Virtual Assistants, Language Translation, Text Summarization, 

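**Conclusion: With the simple prompt, the reply is a flat comma-separated list of terms rather than properly structured CSV. The chat prompt returns a header row followed by data rows, so ChatPromptTemplate is the better choice here.**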

##### 4.1.5. With PydanticOutputParser

In [383]:
# Set up Pydantic class

from pydantic import BaseModel, Field

class question(BaseModel):
    heading: str = Field(description="header of the answer to the query.")
    details: str = Field(description="Detailed answer of the query.")
    

In [384]:
# Define the object
pydantic_output_parser = PydanticOutputParser(pydantic_object=question)

pydantic_output_parser.get_format_instructions()

'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"heading": {"description": "header of the answer to the query.", "title": "Heading", "type": "string"}, "details": {"description": "Detailed answer of the query.", "title": "Details", "type": "string"}}, "required": ["heading", "details"]}\n```'

In [385]:
# Set up prompt templates

simple_prompt_pydantic = PromptTemplate(
    template="You are the best AI Engineer of the world. Answer the user query \n {format_instruction} \n {query} \n",
    input_variables=["query"],
    partial_variables={"format_instruction":pydantic_output_parser.get_format_instructions()},
)



chat_prompt_pydantic = ChatPromptTemplate.from_messages(
    [
        ("system", "You are the best AI engineer of the world. Please response to the user as per {query} from user \n in proper format as given in {format_instruction}."),
        ("user", "{query}"),
    ]
).partial(format_instruction = pydantic_output_parser.get_format_instructions())

With Simple prompt template

In [386]:
# Chaining
chain_pydantic = simple_prompt_pydantic | model

# Result
response = chain_pydantic.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response.content)

```json
{
  "heading": "Introduction to Prompt Engineering",
  "details": "Prompt engineering is a subfield of natural language processing (NLP) that focuses on designing and optimizing text prompts to elicit specific, accurate, and relevant responses from language models. It involves understanding how language models process and respond to input prompts, and using that knowledge to craft high-quality prompts that achieve desired outcomes. Prompt engineering has applications in areas such as chatbots, virtual assistants, language translation, and text summarization. By carefully designing prompts, developers can improve the performance and reliability of language models, and create more effective and engaging user experiences."
}
```


With Chat prompt template

In [387]:
# Chaining
chain_pydantic = chat_prompt_pydantic | model

# Result
response = chain_pydantic.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response.content)

{
  "heading": "Introduction to Prompt Engineering",
  "details": "Prompt engineering is a subfield of natural language processing (NLP) that focuses on designing and optimizing text prompts to elicit specific, accurate, and relevant responses from language models. It involves crafting input text that guides the model to produce desired outputs, taking into account the model's strengths, weaknesses, and biases. Prompt engineering is crucial in various applications, including chatbots, virtual assistants, language translation, and text summarization. By carefully designing prompts, developers can improve the performance and reliability of language models, enabling them to better understand and respond to user queries."
}
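
Both chains above print the raw reply without passing it through the parser. What the parser would add is a validation step: decode the JSON and check it against the declared fields. A minimal sketch of that idea follows, using a standard-library dataclass as a stand-in for the Pydantic model. `Answer` and `parse_answer` are hypothetical names, and real Pydantic validation is considerably richer.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Answer:  # stand-in for the `question` model: the same two string fields
    heading: str
    details: str


def parse_answer(reply: str) -> Answer:
    """Decode a JSON reply and check that it has the expected string fields."""
    data = json.loads(reply)
    expected = {f.name for f in fields(Answer)}
    missing = expected - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for name in expected:
        if not isinstance(data[name], str):
            raise ValueError(f"field {name!r} must be a string")
    return Answer(**{name: data[name] for name in expected})


good = parse_answer('{"heading": "Intro", "details": "Prompt engineering is ..."}')
print(good.heading)

try:
    parse_answer('{"heading": "Intro"}')  # "details" is absent
except ValueError as err:
    print("rejected:", err)
```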


##### 4.1.6. With YamlOutputParser (Using Pydantic)

In [388]:
# Define the object
yaml_output_parser = YamlOutputParser(pydantic_object=question)
yaml_output_parser.get_format_instructions()

'The output should be formatted as a YAML instance that conforms to the given JSON schema below.\n\n# Examples\n## Schema\n```\n{"title": "Players", "description": "A list of players", "type": "array", "items": {"$ref": "#/definitions/Player"}, "definitions": {"Player": {"title": "Player", "type": "object", "properties": {"name": {"title": "Name", "description": "Player name", "type": "string"}, "avg": {"title": "Avg", "description": "Batting average", "type": "number"}}, "required": ["name", "avg"]}}}\n```\n## Well formatted instance\n```\n- name: John Doe\n  avg: 0.3\n- name: Jane Maxfield\n  avg: 1.4\n```\n\n## Schema\n```\n{"properties": {"habit": { "description": "A common daily habit", "type": "string" }, "sustainable_alternative": { "description": "An environmentally friendly alternative to the habit", "type": "string"}}, "required": ["habit", "sustainable_alternative"]}\n```\n## Well formatted instance\n```\nhabit: Using disposable water bottles for daily hydration.\nsustainabl

In [389]:
# Set up prompt templates

simple_prompt_yaml = PromptTemplate(
    template="You are the best AI Engineer of the world. Answer the user query \n {format_instruction} \n {query} \n",
    input_variables=["query"],
    partial_variables={"format_instruction":yaml_output_parser.get_format_instructions()},
)



chat_prompt_yaml = ChatPromptTemplate.from_messages(
    [
        ("system", "You are the best AI engineer of the world. Please response to the user as per {query} from user \n in proper format as given in {format_instruction}."),
        ("user", "{query}"),
    ]
).partial(format_instruction = yaml_output_parser.get_format_instructions())

With Simple prompt template


In [390]:
# Chaining
chain_yaml = simple_prompt_yaml | model

# Result
response = chain_yaml.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response.content)

```
heading: Introduction to Prompt Engineering
details: Prompt engineering is a subfield of natural language processing that focuses on designing and optimizing text prompts to elicit specific, accurate, and relevant responses from language models. It involves carefully crafting the input text, or prompt, to guide the model's output and achieve a desired outcome, such as generating text, answering questions, or completing tasks. Effective prompt engineering requires a deep understanding of language models, their strengths and limitations, and the nuances of language itself.
```


With Chat prompt template

In [391]:
# Chaining
chain_yaml = chat_prompt_yaml | model

# Result
response = chain_yaml.invoke({"query": "Can you tell me about 'Prompt Engineering'?"})

print(response.content)

```
heading: Introduction to Prompt Engineering
details: Prompt Engineering is a subfield of natural language processing (NLP) that focuses on designing and optimizing text prompts to elicit specific, accurate, and relevant responses from language models. It involves understanding how language models process and respond to input prompts, and using this knowledge to craft high-quality prompts that achieve desired outcomes. Prompt engineering has applications in areas such as chatbots, virtual assistants, and language translation, where well-designed prompts can significantly improve the performance and usability of these systems.
```


##### 4.1.7. With PandasDataFrameOutputParser

In [392]:
import pandas as pd
# Data
data = {
    "Fruits": ["Apple", "Mango", "Banana"],
    "Price": [30, 45, 50],
    "Quantity":[100, 200, 300]

}

df_data = pd.DataFrame(data)

In [393]:
# Define the object
pd_output_parser = PandasDataFrameOutputParser(dataframe=df_data)
pd_output_parser.get_format_instructions()

'The output should be formatted as a string as the operation, followed by a colon, followed by the column or row to be queried on, followed by optional array parameters.\n1. The column names are limited to the possible columns below.\n2. Arrays must either be a comma-separated list of numbers formatted as [1,3,5], or it must be in range of numbers formatted as [0..4].\n3. Remember that arrays are optional and not necessarily required.\n4. If the column is not in the possible columns or the operation is not a valid Pandas DataFrame operation, return why it is invalid as a sentence starting with either "Invalid column" or "Invalid operation".\n\nAs an example, for the formats:\n1. String "column:num_legs" is a well-formatted instance which gets the column num_legs, where num_legs is a possible column.\n2. String "row:1" is a well-formatted instance which gets row 1.\n3. String "column:num_legs[1,2]" is a well-formatted instance which gets the column num_legs for rows 1 and 2, where num_l

In [394]:
# Set up prompt template
simple_prompt_pd = PromptTemplate(
    template=("You are the best AI Engineer of the world. Return Exactly one operation that conforms to {format_instructions} \n" \
              "You must always return valid Dataframe format. Do not return any additional text."),
    input_variables=[],
    partial_variables={"format_instructions": pd_output_parser.get_format_instructions()},
)

chat_prompt_pd = ChatPromptTemplate(
    [
        ("system", "You are the best AI Engineer of the world. Return exactly one operation string that conforms to {format_instructions} \n" \
              "You must always return valid Dataframe format. Do not return any additional text."),
        ("user", ""),
    ]
).partial(format_instructions = pd_output_parser.get_format_instructions())

With Simple prompt template

In [395]:
# Chaining
chain_pd = simple_prompt_pd | model | pd_output_parser

# Result
try:
    response_df = chain_pd.invoke({})
    # The parser returns a plain dict (for example {'mean': ...}), not a DataFrame,
    # so calling .info() on the result would raise an AttributeError.
    print("Parsed Dataframe \n", response_df)

except Exception as e:
    print(f"An error occurred during parsing {e}")
    print("Ensure the dataframe adhere to the format correctly.")


An error occurred during parsing The maximum index 3 exceeds the maximum index of                     the Pandas DataFrame 2.
For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE 
Ensure the dataframe adhere to the format correctly.


With Chat prompt template

In [396]:
# Chaining
chain_pd = chat_prompt_pd | model | pd_output_parser

# Result
try:
    response_df = chain_pd.invoke({})
    # The parser returns a plain dict (for example {'mean': ...}), not a DataFrame,
    # so calling .info() on the result would raise an AttributeError.
    print("Parsed Dataframe \n", response_df)

except Exception as e:
    print(f"An error occurred during parsing {e}")
    print("Ensure the dataframe adhere to the format correctly.")


An error occurred during parsing The maximum index 3 exceeds the maximum index of                     the Pandas DataFrame 2.
For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE 
Ensure the dataframe adhere to the format correctly.


**Conclusion: Why did it fail? Because pd_output_parser.get_format_instructions() returns a long string that includes the mini-grammar and examples of valid requests. Even though I didn't write any examples myself, the parser supplies them. Since I only asked for "exactly one operation" without saying which one, the model picked an operation modelled on those examples, and it referenced index 3, which does not exist in a three-row DataFrame (the maximum index is 2). Let me now try it with OutputFixingParser.**
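
The failure is easier to see with a toy version of the operation grammar. The sketch below is not the library's parser. It is a simplified standard-library approximation that handles only the `operation:target[indices]` shape described in the format instructions, with the three-row table from this section stored as plain lists. `parse_operation` and `run_operation` are hypothetical names, and treating ranges as inclusive at both ends is an assumption.

```python
import re
from statistics import mean

# Same data as df_data, kept as plain lists so the sketch needs no pandas.
TABLE = {
    "Fruits": ["Apple", "Mango", "Banana"],
    "Price": [30, 45, 50],
    "Quantity": [100, 200, 300],
}
N_ROWS = 3

REQUEST = re.compile(r"^(\w+):(\w+)(?:\[(.*)\])?$")


def parse_operation(request: str):
    """Split 'operation:target[indices]' into (operation, target, indices)."""
    match = REQUEST.match(request.strip())
    if match is None:
        raise ValueError(f"badly formatted request: {request!r}")
    operation, target, raw = match.groups()
    indices = None
    if raw:
        if ".." in raw:  # range form such as [0..2], assumed inclusive
            start, end = (int(part) for part in raw.split(".."))
            indices = list(range(start, end + 1))
        else:  # list form such as [1,2]
            indices = [int(part) for part in raw.split(",")]
    return operation, target, indices


def run_operation(request: str):
    """Run a parsed request against TABLE; only 'mean' is covered here."""
    operation, target, indices = parse_operation(request)
    if indices and max(indices) > N_ROWS - 1:
        raise ValueError(
            f"maximum index {max(indices)} exceeds the last row index {N_ROWS - 1}"
        )
    if operation == "mean":
        values = TABLE[target]
        chosen = values if indices is None else [values[i] for i in indices]
        return {"mean": mean(chosen)}
    raise ValueError(f"operation {operation!r} is not covered by this sketch")


print(run_operation("mean:Price"))  # valid for a three-row table
try:
    run_operation("mean:Price[0..3]")  # index 3 does not exist
except ValueError as err:
    print("rejected:", err)
```

A request copied from an example written for a larger table passes the grammar check but fails the bounds check, which matches the kind of error shown above.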

##### 4.1.8. With OutputFixingParser

In [397]:
# Define the object
fixing_output_parser = OutputFixingParser.from_llm(parser=pd_output_parser, llm=model)
fixing_output_parser.get_format_instructions()

'The output should be formatted as a string as the operation, followed by a colon, followed by the column or row to be queried on, followed by optional array parameters.\n1. The column names are limited to the possible columns below.\n2. Arrays must either be a comma-separated list of numbers formatted as [1,3,5], or it must be in range of numbers formatted as [0..4].\n3. Remember that arrays are optional and not necessarily required.\n4. If the column is not in the possible columns or the operation is not a valid Pandas DataFrame operation, return why it is invalid as a sentence starting with either "Invalid column" or "Invalid operation".\n\nAs an example, for the formats:\n1. String "column:num_legs" is a well-formatted instance which gets the column num_legs, where num_legs is a possible column.\n2. String "row:1" is a well-formatted instance which gets row 1.\n3. String "column:num_legs[1,2]" is a well-formatted instance which gets the column num_legs for rows 1 and 2, where num_l

In [398]:
# Set up prompt template
simple_prompt_pd = PromptTemplate(
    template=("You are the best AI Engineer of the world. Return exactly ONE operation string that conforms to {format_instructions} \n" \
              "You must always return valid Dataframe format. Do not return any additional text."),
    input_variables=[],
    partial_variables={"format_instructions": fixing_output_parser.get_format_instructions()},
)

chat_prompt_pd = ChatPromptTemplate(
    [
        ("system", "You are the best AI Engineer of the world. Return exactly ONE operation string that conforms to {format_instructions} \n" \
              "You must always return valid Dataframe format. Do not return any additional text."),
        ("user", ""),
    ]
).partial(format_instructions = fixing_output_parser.get_format_instructions())

With Simple prompt template

In [399]:
# Chaining
chain_pd = simple_prompt_pd | model | fixing_output_parser

# Result
try:
    response_df = chain_pd.invoke({})
    print("Parsed Dataframe \n", response_df)

except Exception as e:
    print(f"An error occurred during parsing {e}")
    print("Ensure the dataframe adhere to the format correctly.")


Parsed Dataframe 
 {'mean': np.float64(41.666666666666664)}


With Chat prompt template

In [400]:
# Chaining
chain_pd = chat_prompt_pd | model | fixing_output_parser

# Result
try:
    response_df = chain_pd.invoke({})
    print("Parsed Dataframe \n", response_df)

except Exception as e:
    print(f"An error occurred during parsing {e}")
    print("Ensure the dataframe adhere to the format correctly.")


Parsed Dataframe 
 {'mean': np.float64(41.666666666666664)}


**Conclusion: Why did it work? OutputFixingParser acts as a repair layer. When the wrapped parser raises an error, it sends the bad output, the error message, and the format instructions back to the LLM, then parses the corrected answer. So the issue did not recur.**
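The repair-layer idea can be sketched without any LLM. This is only an illustration of the control flow, not the library's code: `parse_with_repair`, `toy_parse`, and `toy_fix` are hypothetical names, and `toy_fix` stands in for the LLM call that would normally produce the corrected string.

```python
from typing import Callable


def parse_with_repair(
    completion: str,
    parse: Callable[[str], str],
    fix: Callable[[str, str], str],
    max_retries: int = 1,
) -> str:
    """Try to parse; on failure ask `fix` for a corrected completion and retry."""
    retries = 0
    while True:
        try:
            return parse(completion)
        except ValueError as error:
            if retries == max_retries:
                raise  # give up and surface the original parsing error
            retries += 1
            completion = fix(completion, str(error))


def toy_parse(text: str) -> str:
    """Toy parser: rejects the out-of-range request seen earlier."""
    if text.endswith("[1..3]"):
        raise ValueError("The maximum index 3 exceeds the maximum index 2")
    return text


def toy_fix(bad_completion: str, error: str) -> str:
    """Stand-in for the LLM: returns a request that stays within bounds."""
    return bad_completion.replace("[1..3]", "[0..2]")


print(parse_with_repair("mean:num_legs[1..3]", toy_parse, toy_fix))
```

With `max_retries=0` the wrapper behaves like the plain parser and the error propagates, which mirrors the earlier failing run.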

##### 4.1.9. With RetryWithErrorOutputParser

In [401]:
# Import RunnableLambda
from langchain_core.runnables import RunnableLambda

# Define the object
retry_output_parser = RetryWithErrorOutputParser.from_llm(parser=pd_output_parser, llm=model)
retry_output_parser.get_format_instructions()

'The output should be formatted as a string as the operation, followed by a colon, followed by the column or row to be queried on, followed by optional array parameters.\n1. The column names are limited to the possible columns below.\n2. Arrays must either be a comma-separated list of numbers formatted as [1,3,5], or it must be in range of numbers formatted as [0..4].\n3. Remember that arrays are optional and not necessarily required.\n4. If the column is not in the possible columns or the operation is not a valid Pandas DataFrame operation, return why it is invalid as a sentence starting with either "Invalid column" or "Invalid operation".\n\nAs an example, for the formats:\n1. String "column:num_legs" is a well-formatted instance which gets the column num_legs, where num_legs is a possible column.\n2. String "row:1" is a well-formatted instance which gets row 1.\n3. String "column:num_legs[1,2]" is a well-formatted instance which gets the column num_legs for rows 1 and 2, where num_l

In [402]:
# Set up prompt template
simple_prompt_pd = PromptTemplate(
    template=("You are the best AI Engineer of the world. Return exactly ONE operation string that conforms to {format_instructions} \n" \
              "You must always return valid Dataframe format. Do not return any additional text."),
    input_variables=[],
    partial_variables={"format_instructions": pd_output_parser.get_format_instructions()},
)

chat_prompt_pd = ChatPromptTemplate(
    [
        ("system", "You are the best AI Engineer of the world. Return exactly ONE operation string that conforms to {format_instructions} \n" \
              "You must always return valid Dataframe format. Do not return any additional text."),
        ("user", ""),
    ]
).partial(format_instructions = pd_output_parser.get_format_instructions())

With Simple prompt template

In [403]:
# Chaining
chain_pd = simple_prompt_pd | model | str_output_parser | RunnableLambda(
    lambda text: retry_output_parser.parse_with_prompt(
        text, 
        simple_prompt_pd.format_prompt()
    )
)

# Result
try:
    response_df = chain_pd.invoke({})
    print("Parsed Dataframe \n", response_df)

except Exception as e:
    print(f"An error occurred during parsing {e}")
    print("Ensure the dataframe adhere to the format correctly.")


Parsed Dataframe 
 {'mean': np.float64(41.666666666666664)}


With Chat prompt template


In [404]:
# Chaining
chain_pd = chat_prompt_pd | model | str_output_parser | RunnableLambda(
    lambda text: retry_output_parser.parse_with_prompt(
        text, 
        chat_prompt_pd.format_prompt()
    )
)

# Result
try:
    response_df = chain_pd.invoke({})
    print("Parsed Dataframe \n", response_df)

except Exception as e:
    print(f"An error occurred during parsing {e}")
    print("Ensure the dataframe adhere to the format correctly.")


Parsed Dataframe 
 {'mean': np.float64(41.666666666666664)}


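**Conclusion: Both parsers call the LLM again when parsing fails, but with different context. OutputFixingParser sends the bad LLM output together with the format instructions and the error, and asks the LLM to edit it to fit the schema; it never sees the original prompt. RetryWithErrorOutputParser sends the original prompt, the bad output, and the error, and asks the LLM to try again. That is why it needs `parse_with_prompt` instead of a plain `parse`.**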
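The difference is easiest to see in what each parser would hand back to the LLM. The sketch below paraphrases the two requests; the wording of the templates is my own approximation, not the library's exact text, and both builder functions are hypothetical.

```python
def build_fix_request(format_instructions: str, completion: str, error: str) -> str:
    """Fix-style request: format instructions + bad completion + error."""
    return (
        f"Instructions:\n{format_instructions}\n"
        f"Completion:\n{completion}\n"
        f"Error:\n{error}\n"
        "Please try again:"
    )


def build_retry_request(original_prompt: str, completion: str, error: str) -> str:
    """Retry-style request: original prompt + bad completion + error."""
    return (
        f"Prompt:\n{original_prompt}\n"
        f"Completion:\n{completion}\n"
        f"Details:\n{error}\n"
        "Please try again:"
    )


instructions = "operation:column[optional indices]"
prompt = f"Return exactly ONE operation string that conforms to {instructions}"
bad = "mean:num_legs[1..3]"
error = "The maximum index 3 exceeds the maximum index 2"

print(build_fix_request(instructions, bad, error))
print("-" * 40)
print(build_retry_request(prompt, bad, error))
```

In this notebook the original prompt consists almost entirely of the format instructions, so both requests carry nearly the same information, which is consistent with both parsers returning the same result.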

##### 4.1.10. With DatetimeOutputParser

In [405]:
datetime_output_parser = DatetimeOutputParser()

datetime_output_parser.get_format_instructions()

"Write a datetime string that matches the following pattern: '%Y-%m-%dT%H:%M:%S.%fZ'.\n\nExamples: 2023-07-04T14:30:00.000000Z, 1999-12-31T23:59:59.999999Z, 2025-01-01T00:00:00.000000Z\n\nReturn ONLY this string, no other words!"

In [406]:
# Set up prompt template
simple_prompt_datetime = PromptTemplate(
    template=("You are the best AI Engineer of the world. Return response for the {query} that conforms to {format_instructions} \n" \
              "You must always return valid datetime format. Do not return any additional text."),
    input_variables=["query"],
    partial_variables={"format_instructions": datetime_output_parser.get_format_instructions()},
)

chat_prompt_datetime = ChatPromptTemplate(
    [
        ("system", "You are the best AI Engineer of the world. Return response for the {query} that conforms to {format_instructions} \n" \
              "You must always return valid datetime format. Do not return any additional text."),
        ("user", "{query}"),
    ]
).partial(format_instructions = datetime_output_parser.get_format_instructions())

With simple prompt

In [407]:
# Chaining
chain_datetime = simple_prompt_datetime | model | datetime_output_parser

# Result
response = chain_datetime.invoke({"query": "When was 'attention all you need' paper published?"})

print(response)

2017-12-21 18:45:00


With chat prompt

In [408]:
# Chaining
chain_datetime = chat_prompt_datetime | model | datetime_output_parser

# Result
response = chain_datetime.invoke({"query": "What was the date of inauguration of first IIT in India?"})

print(response)

1951-08-18 00:00:00


**Conclusion: This is not a RAG solution. The LLM answers only from its training data, so the date is not verified against any source, and it "assumes" a time of day that it cannot actually know, simply to satisfy the required format.**
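A short sketch shows why the model has to invent the missing parts. I am assuming here that the parser essentially applies `datetime.strptime` with the pattern quoted in the format instructions; `parse_reply` is a hypothetical helper, not the parser's own method.

```python
from datetime import datetime

# Pattern quoted in the format instructions above.
PATTERN = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_reply(reply: str) -> datetime:
    """Accept a reply only if it matches the full pattern."""
    return datetime.strptime(reply.strip(), PATTERN)


# One of the examples from the format instructions parses cleanly.
print(parse_reply("2023-07-04T14:30:00.000000Z"))

# A partial answer is rejected, so the model must fill in every field.
for partial in ("2017", "June 2017"):
    try:
        parse_reply(partial)
    except ValueError as error:
        print(f"rejected {partial!r}: {error}")
```

The check is purely about shape: any string that matches the pattern is accepted, whether or not the date and time are true.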

##### 4.1.11. With StructuredOutputParser

In [409]:
# Define Response schema
response_schema = [
    ResponseSchema(name="Header", description="title of the subject."),
    ResponseSchema(name="Description", description="Detailed description of the subject.")
]

In [410]:
# Define the object
structured_output_parser = StructuredOutputParser.from_response_schemas(response_schemas=response_schema)
structured_output_parser.get_format_instructions()

'The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":\n\n```json\n{\n\t"Header": string  // title of the subject.\n\t"Description": string  // Detailed description of the subject.\n}\n```'
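These instructions ask the model to wrap a JSON object in a fenced `json` snippet. As a rough sketch of what parsing such a reply involves (not the parser's actual code; `parse_fenced_json` is a hypothetical helper and the sample reply is made up for illustration):

```python
import json
import re

# Built programmatically so this example contains no literal fence of its own.
FENCE = "`" * 3


def parse_fenced_json(text: str, expected_keys: list[str]) -> dict:
    """Pull the JSON object out of a fenced snippet and check its keys."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    payload = match.group(1) if match else text  # fall back to the raw text
    data = json.loads(payload)
    missing = [key for key in expected_keys if key not in data]
    if missing:
        raise ValueError(f"Missing keys: {missing}")
    return data


# Made-up reply in the requested shape.
reply = (
    f"{FENCE}json\n"
    '{"Header": "Prompt Engineering", "Description": "Designing prompts."}\n'
    f"{FENCE}"
)
print(parse_fenced_json(reply, ["Header", "Description"]))
```

The key check is what turns the response schema into a contract: a reply that is valid JSON but lacks `Header` or `Description` is still rejected.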

In [411]:
# Set up prompt template
simple_prompt_structured = PromptTemplate(
    template=("You are the best AI Engineer of the world. Return response for the {query} that conforms to {format_instructions} \n"),
    input_variables=["query"],
    partial_variables={"format_instructions": structured_output_parser.get_format_instructions()},
)

chat_prompt_structured = ChatPromptTemplate(
    [
        ("system", "You are the best AI Engineer of the world. Return response for the {query} that conforms to {format_instructions} \n"),
        ("user", "{query}"),
    ]
).partial(format_instructions = structured_output_parser.get_format_instructions())

With simple prompt

In [412]:
# Chaining
chain_structured = simple_prompt_structured | model | structured_output_parser

# Result
response = chain_structured.invoke({"query": "Tell me about 'Prompt Engineering'"})
print(response)

{'Header': 'Prompt Engineering', 'Description': "Prompt engineering is a subfield of natural language processing (NLP) and artificial intelligence (AI) that focuses on designing and optimizing text prompts to elicit specific, accurate, and relevant responses from language models. It involves crafting input text that is clear, concise, and well-defined to achieve a particular goal or outcome. Prompt engineering is crucial in various applications, including chatbots, virtual assistants, language translation, text summarization, and content generation. The process typically involves analyzing the task requirements, understanding the language model's capabilities and limitations, and iteratively refining the prompt to improve the quality and relevance of the response. Effective prompt engineering can significantly enhance the performance and usability of language models, enabling them to better understand user intent, provide more accurate answers, and generate high-quality content."}


With Chat Prompt

In [414]:
# Chaining
chain_structured = chat_prompt_structured | model | structured_output_parser

# Result
response = chain_structured.invoke({"query": "Tell me about 'Prompt Engineering'"})
print(response)

{'Header': 'Prompt Engineering', 'Description': 'Prompt engineering is a subfield of natural language processing (NLP) and artificial intelligence (AI) that focuses on designing and optimizing text prompts to elicit specific, accurate, and relevant responses from language models. It involves crafting input text that is clear, concise, and well-defined to achieve a desired outcome, such as generating text, answering questions, or completing tasks. Prompt engineering requires a deep understanding of language models, their strengths and limitations, and the nuances of human language. The goal of prompt engineering is to create prompts that are effective, efficient, and scalable, and that can be used to improve the performance of language models in a wide range of applications, from chatbots and virtual assistants to language translation and text summarization. By carefully designing and refining prompts, developers can unlock the full potential of language models and create more accurate,