# LangChain: Models, Prompts and Output Parsers


## Outline

 * Direct API calls to OpenAI
 * API calls through LangChain:
   * Prompts
   * Models
   * Output parsers

## Get your [OpenAI API Key](https://platform.openai.com/account/api-keys)

In [1]:
#!pip install python-dotenv
#!pip install openai

In [2]:
# Import necessary libraries
import os                                      # For interacting with the operating system, e.g., reading environment variables
import openai                                  # OpenAI's Python client for accessing the OpenAI API

# Load environment variables from a .env file
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())                 # Reads the local .env file to load environment variables

# Set the OpenAI API key from environment variables
openai.api_key = os.environ['OPENAI_API_KEY']  # Retrieve the API key from the environment and set it for openai

Note: LLMs do not always produce the same results. When executing the code in your notebook, you may get slightly different answers than those in the video.

In [3]:
# account for deprecation of LLM model
import datetime
# Get the current date
current_date = datetime.datetime.now().date()

# Define the date after which the model should be set to "gpt-3.5-turbo"
target_date = datetime.date(2024, 6, 12)

# Set the model variable based on the current date
if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"
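
The date check above can also be wrapped in a small function so that the cutoff logic can be exercised with fixed dates. This is a minimal sketch: `pick_model`, `TARGET_DATE`, and the `today` parameter are hypothetical names that do not appear in the notebook; the cutoff date and model names are copied from the cell above.

```python
import datetime

# Cutoff date and model names copied from the cell above
TARGET_DATE = datetime.date(2024, 6, 12)

def pick_model(today=None):
    """Return the model name to use on the given date (hypothetical helper)."""
    if today is None:
        today = datetime.date.today()
    # Strictly after the cutoff: use the unversioned model name
    if today > TARGET_DATE:
        return "gpt-3.5-turbo"
    # On or before the cutoff: keep the dated snapshot
    return "gpt-3.5-turbo-0301"

print(pick_model(datetime.date(2024, 6, 12)))  # the cutoff date itself
print(pick_model(datetime.date(2024, 6, 13)))  # the day after the cutoff
```

Because the comparison is `>`, the cutoff date itself still selects the dated snapshot.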

## Chat API : OpenAI

Let's start with a direct API call to OpenAI.

In [4]:
# Define a function to get a response from the OpenAI API based on a given prompt
def get_completion(prompt, model=llm_model):
    # Prepare the message format for the API request, indicating the user's prompt
    messages = [{"role": "user", "content": prompt}]
    
    # Call the OpenAI API's ChatCompletion endpoint to get the model's response
    response = openai.ChatCompletion.create(
        model=model,                            # The model to use for generating a response (default is llm_model)
        messages=messages,                      # The list of messages to send to the model
        temperature=0,                          # Controls the randomness of the response (0 means more deterministic)
    )
    
    # Return the content of the response (the actual generated text from the model)
    return response.choices[0].message["content"]

In [5]:
get_completion("What is 1+1?")

'1+1 equals 2.'

In [6]:
customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse,\
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

In [7]:
style = """American English \
in a calm and respectful tone
"""

In [8]:
prompt = f"""Translate the text \
that is delimited by triple backticks 
into a style that is {style}.
text: ```{customer_email}```
"""

print(prompt)

Translate the text that is delimited by triple backticks 
into a style that is American English in a calm and respectful tone
.
text: ```
Arrr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse,the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!
```



In [9]:
response = get_completion(prompt)

In [10]:
response

"Ah, I'm really frustrated that my blender lid flew off and splattered my kitchen walls with smoothie! And to make matters worse, the warranty doesn't cover the cost of cleaning up my kitchen. I could really use your help right now, friend."

## Chat API : LangChain

Let's see how to do the same using LangChain.

In [11]:
#!pip install --upgrade langchain

### Model

In [12]:
from langchain.chat_models import ChatOpenAI

In [13]:
# Temperature controls the randomness of the generated text;
# temperature=0.0 makes the output as deterministic as possible
chat = ChatOpenAI(temperature=0.0, model=llm_model)
chat

ChatOpenAI(verbose=False, callbacks=None, callback_manager=None, client=<class 'openai.api_resources.chat_completion.ChatCompletion'>, model_name='gpt-3.5-turbo', temperature=0.0, model_kwargs={}, openai_api_key=None, openai_api_base=None, openai_organization=None, request_timeout=None, max_retries=6, streaming=False, n=1, max_tokens=None)

### Prompt template

In [14]:
template_string = """Translate the text \
that is delimited by triple backticks \
into a style that is {style}. \
text: ```{text}```
"""

In [15]:
# Import the ChatPromptTemplate class from the langchain.prompts module
from langchain.prompts import ChatPromptTemplate

# Create a ChatPromptTemplate instance by passing a template string
# The template_string should be a predefined string containing placeholders for dynamic content
prompt_template = ChatPromptTemplate.from_template(template_string)

In [16]:
prompt_template.messages[0].prompt

PromptTemplate(input_variables=['style', 'text'], output_parser=None, partial_variables={}, template='Translate the text that is delimited by triple backticks into a style that is {style}. text: ```{text}```\n', template_format='f-string', validate_template=True)

In [17]:
# Access the input variables from the first message of the prompt template
# 'messages' is a list of message objects in the prompt template, where each message can have its own structure
# 'prompt' is the specific prompt structure within the message, which holds details like input variables
# 'input_variables' retrieves the list of variables that are expected to be filled in dynamically when using the template
prompt_template.messages[0].prompt.input_variables

['style', 'text']
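
To get a feel for where a list like this can come from, the placeholder names in an f-string-style template can be discovered with the standard library alone. This is a rough sketch, not LangChain's implementation: `find_variables` is a hypothetical helper, and the template is a shortened version of the one above.

```python
from string import Formatter

# Shortened version of the template used above
template = "Translate the text into a style that is {style}. text: {text}"

def find_variables(template_string):
    """List the placeholder names in an f-string-style template (hypothetical helper)."""
    names = []
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion) tuples;
    # field_name is None for a trailing piece of literal text with no placeholder
    for _literal, field_name, _spec, _conversion in Formatter().parse(template_string):
        if field_name and field_name not in names:
            names.append(field_name)
    return names

print(find_variables(template))
```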

In [18]:
# Define a string that specifies the desired style for communication
# The string contains the language preference (American English) and the tone of communication (calm and respectful)
customer_style = """American English \
in a calm and respectful tone
"""

In [19]:
# Define a string that simulates a customer's email with a complaint
# The email is written in a pirate-like tone, expressing frustration with a product issue
# It includes multiple lines, where the backslash (\) is used for line continuation
customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse, \
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""
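
The line-continuation behaviour mentioned in the comment above can be checked directly. In this small sketch (the variable names and text are illustrative only), a backslash at the end of a line inside a triple-quoted string removes the newline that follows it, so whether the words stay separated depends on the space before the backslash.

```python
# The backslash removes the newline, joining the two lines into one
joined = """first line \
second line
"""

# Without the backslash, the newline stays in the string
not_joined = """first line
second line
"""

# With no space before the backslash, the words run together
missing_space = """worse,\
the warranty"""

print(repr(joined))
print(repr(not_joined))
print(repr(missing_space))
```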

In [20]:
# Format the messages in the prompt template by injecting the style and text variables
# The 'customer_style' specifies the tone and language style, 
# while 'customer_email' contains the actual complaint message from the customer
customer_messages = prompt_template.format_messages(
    style=customer_style,  # Pass the desired communication style
    text=customer_email    # Pass the customer's email content
)

In [21]:
# Check the type of the customer_messages variable (should be a list of formatted messages)
print(type(customer_messages))

# Check the type of the first item in the customer_messages list (should be a formatted message object)
print(type(customer_messages[0]))

<class 'list'>
<class 'langchain.schema.HumanMessage'>


In [22]:
# Print the first formatted customer message to check its contents
print(customer_messages[0])

content="Translate the text that is delimited by triple backticks into a style that is American English in a calm and respectful tone\n. text: ```\nArrr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse, the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!\n```\n" additional_kwargs={} example=False
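
Conceptually, formatting a template into messages means filling in the placeholders and wrapping the result in a message object. The following is a rough stand-in using only the standard library; it is not how LangChain is implemented, and `SimpleHumanMessage` and this `format_messages` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass
class SimpleHumanMessage:
    """Hypothetical stand-in for a chat message; not LangChain's HumanMessage."""
    content: str
    role: str = "user"

def format_messages(template_string, **variables):
    # str.format replaces each {name} placeholder with the matching keyword argument
    return [SimpleHumanMessage(content=template_string.format(**variables))]

template = "Translate the text into a style that is {style}. text: {text}"
messages = format_messages(
    template,
    style="calm and respectful",
    text="Arrr, me blender!",
)

print(type(messages), type(messages[0]))
print(messages[0].content)
```

The same template can then be reused with a different `style` and `text`, which is the main benefit of keeping the template separate from its inputs.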


In [23]:
# Call the LLM (large language model) with the formatted customer message
# The model takes the formatted message(s) and returns the text rewritten in the requested style
customer_response = chat(customer_messages)

In [24]:
# Print the content of the response from the LLM
print(customer_response.content)

I'm really frustrated that my blender lid flew off and splattered my kitchen walls with smoothie! And to make things worse, the warranty doesn't cover the cost of cleaning up my kitchen. I could really use your help right now, friend.


In [25]:
# Define the customer service reply in a casual, slightly humorous tone
# This string contains the response explaining why the warranty doesn't cover the cleaning expenses
# The backslashes (\) at the end of each line ensure the string is treated as a single line in Python
service_reply = """Hey there customer, \
the warranty does not cover \
cleaning expenses for your kitchen \
because it's your fault that \
you misused your blender \
by forgetting to put the lid on before \
starting the blender. \
Tough luck! See ya!
"""

In [26]:
# Define the style for the response, specifying that it should be in a "polite" pirate tone
# The style includes a description of the tone and language, indicating the model should reply in a pirate-like voice
service_style_pirate = """\
a polite tone \
that speaks in English Pirate\
"""

In [27]:
# Use the prompt template to format the service reply with the specified pirate style
# The 'style' argument applies the "pirate" tone, and the 'text' argument provides the service reply text
service_messages = prompt_template.format_messages(
    style=service_style_pirate,  # Set the tone to "pirate" style
    text=service_reply            # Set the customer service reply as the message content
)

In [28]:
# Print the formatted prompt to confirm that the pirate style and the service reply were substituted into the template
print(service_messages[0].content)

# Send the formatted prompt to the model and print its reply, rewritten in the polite pirate style
service_response = chat(service_messages)
print(service_response.content)

Translate the text that is delimited by triple backticks into a style that is a polite tone that speaks in English Pirate. text: ```Hey there customer, the warranty does not cover cleaning expenses for your kitchen because it's your fault that you misused your blender by forgetting to put the lid on before starting the blender. Tough luck! See ya!
```

Ahoy there, valued customer! Regrettably, the warranty be not coverin' the cost o' cleanin' yer galley due to yer own negligence. Ye see, 'twas yer own doin' when ye forgot to secure the lid afore startin' the blender. 'Tis a tough break, indeed! Fare thee well, matey!


## Output Parsers

Let's start by defining what we would like the LLM output to look like:

In [29]:
# Define a dictionary representing product information
# - 'gift': Boolean value indicating if the product was bought as a gift (False means not a gift)
# - 'delivery_days': Integer indicating how many days it took for the product to be delivered (5 days)
# - 'price_value': String description of the product's value ("pretty affordable!")
{
  "gift": False,
  "delivery_days": 5,
  "price_value": "pretty affordable!"
}

{'gift': False, 'delivery_days': 5, 'price_value': 'pretty affordable!'}

In [30]:
# Define the customer review as a multi-line string
# The review contains details about the product's features, delivery time, and price comparison
# The backslashes (\) are used for line continuation to treat this as a single string
# The review mentions the product's settings, how it arrived, and how much the reviewer likes it
customer_review = """\
This leaf blower is pretty amazing.  It has four settings:\
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversary present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

# Define the template to extract specific information from the review
# This template will be used to format the input for the model, asking for:
# - 'gift': Whether the item was a gift (True/False)
# - 'delivery_days': Number of days for delivery (if not mentioned, output -1)
# - 'price_value': Sentences related to the product's value or price (output as a list)
# The placeholder {text} will be replaced with the actual customer review text
review_template = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price,\
and output them as a comma separated Python list.

Format the output as JSON with the following keys:
gift
delivery_days
price_value

text: {text}
"""

In [31]:
# Import the necessary class to handle the prompt template
from langchain.prompts import ChatPromptTemplate

# Create a prompt template from the previously defined review_template string
prompt_template = ChatPromptTemplate.from_template(review_template)

# Print the prompt template object to check its structure
print(prompt_template)

input_variables=['text'] output_parser=None partial_variables={} messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['text'], output_parser=None, partial_variables={}, template='For the following text, extract the following information:\n\ngift: Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.\n\ndelivery_days: How many days did it take for the product to arrive? If this information is not found, output -1.\n\nprice_value: Extract any sentences about the value or price,and output them as a comma separated Python list.\n\nFormat the output as JSON with the following keys:\ngift\ndelivery_days\nprice_value\n\ntext: {text}\n', template_format='f-string', validate_template=True), additional_kwargs={})]


In [32]:
# Format the review text using the prompt template
# The 'text' placeholder in the template is replaced with the actual customer_review
messages = prompt_template.format_messages(text=customer_review)

# Initialize the ChatOpenAI model with specified parameters:
# - 'temperature=0.0' makes the model's output more deterministic (less random)
# - 'model=llm_model' specifies which language model to use
chat = ChatOpenAI(temperature=0.0, model=llm_model)

# Send the formatted messages to the model for processing and generate a response
response = chat(messages)

# Print the content of the response from the model (the extracted information)
print(response.content)

{
  "gift": true,
  "delivery_days": 2,
  "price_value": "It's slightly more expensive than the other leaf blowers out there"
}


In [33]:
# Check the type of the response.content to confirm the structure of the output
# The expected output type is a string, as it's the model's generated response in text form
type(response.content)

str

In [34]:
# Running this line raises an error
# because response.content is a string, not a dictionary,
# so it has no .get() method
response.content.get('gift')

AttributeError: 'str' object has no attribute 'get'
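
Because the reply is plain text, one standard-library option is to decode it with `json.loads`. The sketch below copies the reply text from the output shown earlier into a variable named `reply` (a name assumed here for illustration). It only works when the reply is bare, valid JSON; any surrounding prose or code fence would make `json.loads` fail.

```python
import json

# Reply text copied from the model output shown earlier
reply = """{
  "gift": true,
  "delivery_days": 2,
  "price_value": "It's slightly more expensive than the other leaf blowers out there"
}"""

parsed = json.loads(reply)  # decode the JSON string into a Python dict

print(type(parsed))
print(parsed.get("gift"))
```

Note that JSON `true` becomes Python `True` after decoding.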

### Parse the LLM output string into a Python dictionary

In [35]:
# Import necessary classes for structured output parsing
# ResponseSchema is used to define the structure of the expected output
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser

In [36]:
# Define a schema for the 'gift' field, specifying that it should be a True/False answer
gift_schema = ResponseSchema(name="gift",
                             description="Was the item purchased\
                             as a gift for someone else? \
                             Answer True if yes,\
                             False if not or unknown.")

# Define a schema for the 'delivery_days' field, specifying that it should be the number of delivery days
delivery_days_schema = ResponseSchema(name="delivery_days",
                                      description="How many days\
                                      did it take for the product\
                                      to arrive? If this \
                                      information is not found,\
                                      output -1.")

# Define a schema for the 'price_value' field, specifying that it should be a list of sentences about the price
price_value_schema = ResponseSchema(name="price_value",
                                    description="Extract any\
                                    sentences about the value or \
                                    price, and output them as a \
                                    comma separated Python list.")

# Group all the response schemas into a list for easier handling
response_schemas = [gift_schema, 
                    delivery_days_schema,
                    price_value_schema]

In [37]:
# Initialize the output parser by using the structured response schemas defined earlier
# The parser is created to handle the specific structure of the expected responses, which are described in 'response_schemas'
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

In [38]:
# Get the format instructions from the output parser
# These instructions explain how the model should structure its output according to the defined schemas
format_instructions = output_parser.get_format_instructions()

In [39]:
# Print the format instructions to verify how the output should be structured
# These instructions will help guide the model in providing the output in the desired format
print(format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "\`\`\`json" and "\`\`\`":

```json
{
	"gift": string  // Was the item purchased                             as a gift for someone else?                              Answer True if yes,                             False if not or unknown.
	"delivery_days": string  // How many days                                      did it take for the product                                      to arrive? If this                                       information is not found,                                      output -1.
	"price_value": string  // Extract any                                    sentences about the value or                                     price, and output them as a                                     comma separated Python list.
}
```
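
The long runs of spaces in the printed descriptions come from how the description strings were written: a backslash at the end of a line inside a quoted string drops the newline but keeps the next line's leading indentation. Here is a small sketch of the effect and of one alternative, adjacent string literals; the variable names and the shortened text are illustrative only.

```python
# Backslash continuation inside the quotes keeps the next line's indentation
with_backslash = "Was the item purchased\
    as a gift?"

# Adjacent string literals are concatenated with no extra whitespace
with_concatenation = (
    "Was the item purchased "
    "as a gift?"
)

print(repr(with_backslash))
print(repr(with_concatenation))

# Collapsing whitespace runs is another way to clean up an existing string
print(" ".join(with_backslash.split()) == with_concatenation)
```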


In [40]:
# Define a new review template that includes the format instructions
# The template is used to format the prompt, instructing the model on how to extract specific information
# The {text} placeholder is replaced with the actual customer review text, 
# and {format_instructions} is replaced with the format instructions generated earlier
review_template_2 = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product\
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price,\
and output them as a comma separated Python list.

text: {text}

{format_instructions}
"""

# Create a ChatPromptTemplate from the newly defined template string (review_template_2)
# This prepares the prompt that will be sent to the model, with placeholders to be replaced by actual values
prompt = ChatPromptTemplate.from_template(template=review_template_2)

# Format the message by replacing the {text} and {format_instructions} placeholders with the actual customer review 
# and the format instructions respectively
messages = prompt.format_messages(text=customer_review, 
                                format_instructions=format_instructions)

In [41]:
# Print the content of the first message in the 'messages' list
# This will display the formatted message, which includes the customer review and the format instructions
print(messages[0].content)

For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the productto arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price,and output them as a comma separated Python list.

text: This leaf blower is pretty amazing.  It has four settings:candle blower, gentle breeze, windy city, and tornado. It arrived in two days, just in time for my wife's anniversary present. I think my wife liked it so much she was speechless. So far I've been the only one using it, and I've been using it every other morning to clear the leaves on our lawn. It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features.


The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "\`\`\`json" and "

In [42]:
# Send the formatted messages to the chat model and generate a response
# The 'chat' model object processes the input messages (customer review and format instructions) and returns a model-generated response
response = chat(messages)

In [43]:
# Print the content of the response from the model
# This will show the model's output after processing the formatted messages, which should contain the extracted information
print(response.content)

```json
{
	"gift": true,
	"delivery_days": 2,
	"price_value": ["It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."]
}
```


In [44]:
# Parse the response content using the output parser to convert the text-based response into a structured dictionary
# The output_parser is responsible for extracting and organizing the relevant data from the model's response based on predefined schemas
output_dict = output_parser.parse(response.content)

In [45]:
# Display the parsed output dictionary to verify the extracted information
# This will show the structured data (like 'gift', 'delivery_days', 'price_value') extracted from the model's response
output_dict

{'gift': True,
 'delivery_days': 2,
 'price_value': ["It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."]}
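
To make the parsing step less of a black box, here is a rough standard-library sketch of the general idea: find the json-fenced block in the reply and decode its contents. This is not LangChain's actual implementation; `parse_fenced_json` and `FENCE` are hypothetical names, and the reply is rebuilt from the values shown in the output above.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built programmatically to keep this example readable

def parse_fenced_json(text):
    """Extract and decode the first json-fenced block in text (hypothetical helper)."""
    pattern = re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError("no json code block found in the reply")
    return json.loads(match.group(1))

# Values copied from the model reply shown above
payload = {
    "gift": True,
    "delivery_days": 2,
    "price_value": [
        "It's slightly more expensive than the other leaf blowers out there, "
        "but I think it's worth it for the extra features."
    ],
}
reply = FENCE + "json\n" + json.dumps(payload, indent="\t") + "\n" + FENCE

output = parse_fenced_json(reply)
print(type(output))
print(output["delivery_days"])
```

A reply without the fenced block raises `ValueError` in this sketch, which makes a malformed model response fail loudly instead of silently returning nothing.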

In [46]:
# Check the type of the parsed output dictionary to ensure it is a dictionary
# The expected output type is a dictionary, as it holds the extracted information from the model's response
type(output_dict)

dict

In [47]:
# Access the 'delivery_days' field from the parsed output dictionary to get the number of days the product took to arrive
# This will return the value associated with 'delivery_days' in the parsed output
output_dict.get('delivery_days')

2