"Models" refer to (Large) Language Models.
"Prompts" refer to the style of creating inputs to pass to the models.
"Parsers" refer to taking the output of the models and parsing it into a more structured format that can be used downstream.

When you build an application on top of an LLM, the same pattern recurs:
you repeatedly prompt a model and parse its output.
`LangChain` provides a simple set of abstractions for this pattern.

In [1]:
# !pip install openai

# OpenAI

In [2]:
!pip show openai

Name: openai
Version: 1.2.3
Summary: The official Python library for the openai API
Home-page: 
Author: 
Author-email: OpenAI <support@openai.com>
License: 
Location: C:\Users\A2644752\Sandbox\PRIVATE-RAI\env\Lib\site-packages
Requires: anyio, distro, httpx, pydantic, tqdm, typing-extensions
Required-by: instructor


OpenAI version 1.0.0 introduced an API change.
See [here](https://github.com/openai/openai-python/discussions/742) for more details.

In [3]:
import os

from openai.lib.azure import AzureOpenAI
from openai.types.chat.chat_completion import ChatCompletion

Available API versions can be found [here](https://github.com/Azure/azure-rest-api-specs/tree/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference).
- [Preview](https://github.com/Azure/azure-rest-api-specs/tree/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview)
- [Stable](https://github.com/Azure/azure-rest-api-specs/tree/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/stable)

In [4]:
api_version = "2023-12-01-preview"

In [5]:
openai = AzureOpenAI(api_version=api_version)

My Azure account requires the deployment name rather than the model name.
Here that means using `model: str = "gpt-35-turbo-16k"` instead of `model: str = "gpt-3.5-turbo"`.

`openai.ChatCompletion.create` was replaced with:
- `AzureOpenAI(...).chat.completions.create` if using AzureOpenAI
- `OpenAI(...).chat.completions.create` if using OpenAI

In [6]:
def get_completion(prompt: str, model: str = "gpt-35-turbo-16k") -> str:
    """Given a prompt and model, return the content of the response."""
    messages = [{"role": "user", "content": prompt}]
    response: ChatCompletion = openai.chat.completions.create(
        messages=messages,
        model=model,
        temperature=0,
    )
    return response.choices[0].message.content

In [7]:
get_completion("What is 1+1?")

'1+1 equals 2.'
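
The `messages` argument built inside `get_completion` is plain Python data: a list of dictionaries, each with a `role` and a `content`. As a minimal sketch (the helper name `build_messages` and its optional `system` argument are illustrative assumptions, not part of this notebook), a small function could assemble that list so a system instruction can be added without changing the API call itself:

```python
from typing import Optional


def build_messages(prompt: str, system: Optional[str] = None) -> list[dict[str, str]]:
    """Build a chat `messages` payload (hypothetical helper)."""
    messages: list[dict[str, str]] = []
    if system is not None:
        # A system message, when given, is placed before the user message.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return messages


print(build_messages("What is 1+1?"))
print(build_messages("What is 1+1?", system="Answer with a number only."))
```

The returned list could then be passed as `messages=` in place of the hard-coded single-element list.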

To motivate the LangChain abstractions for model prompts and parsers, suppose you get an email in a language other than English.

In [39]:
email = """
Arr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse, \
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

We can translate the message by asking the LLM to rewrite it in a given style.

In [40]:
style = """American English \
in a calm and respectful tone
"""

In [41]:
prompt = f"""Translate the text \
that is delimited by triple backticks
into a style that is {style}.
text: ```{email}```"""

In [42]:
print(prompt)

Translate the text that is delimited by triple backticks
into a style that is American English in a calm and respectful tone
.
text: ```
Arr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse, the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!
```


In [43]:
response = get_completion(prompt)

In [44]:
response

"I'm really frustrated that my blender lid flew off and made a mess of my kitchen walls with smoothie! And to make things even worse, the warranty doesn't cover the cost of cleaning up my kitchen. I could really use your help at this moment, my friend."

We can do this in a more convenient way with `LangChain`.

# LangChain

## Model

In [45]:
from langchain.chat_models.azure_openai import AzureChatOpenAI

In [46]:
chat = AzureChatOpenAI(model="gpt-35-turbo-16k", temperature=0, api_version=api_version)
chat

AzureChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x0000023FCA93D5D0>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x0000023FCAA8E0D0>, model_name='gpt-35-turbo-16k', temperature=0.0, openai_api_key='aac22590f900414faaf3637e450a96ae', openai_proxy='', azure_endpoint='https://a3d0dvexroai01t.openai.azure.com/', openai_api_version='2023-12-01-preview', openai_api_type='azure')

In [47]:
type(chat)

langchain_community.chat_models.azure_openai.AzureChatOpenAI

`AzureChatOpenAI` is a `LangChain` object that offers additional attributes and methods over the `AzureOpenAI` object.

## Prompt Template

We use the same prompt, but not as an `f-string`.

In [48]:
template = """Translate the text \
that is delimited by triple backticks
into a style that is {style}.
text: ```{email}```"""

In [49]:
from langchain.prompts import ChatPromptTemplate

The `ChatPromptTemplate` has `classmethod`s for constructing a prompt given a string.

In [50]:
prompt_template = ChatPromptTemplate.from_template(template=template)

In [51]:
prompt_template

ChatPromptTemplate(input_variables=['email', 'style'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['email', 'style'], template='Translate the text that is delimited by triple backticks\ninto a style that is {style}.\ntext: ```{email}```'))])

In [52]:
type(prompt_template)

langchain_core.prompts.chat.ChatPromptTemplate

In [53]:
prompt_template.messages[0].prompt

PromptTemplate(input_variables=['email', 'style'], template='Translate the text that is delimited by triple backticks\ninto a style that is {style}.\ntext: ```{email}```')

We can see that the `prompt_template` has identified two input variables, `email` and `style`.
These were surrounded by braces (`{}`) in the template string.

In [54]:
prompt_template.messages[0].input_variables

['email', 'style']
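
One way to see how placeholders in braces can be discovered is with the standard library. The sketch below uses `string.Formatter().parse`, which walks a format string and yields each field name. It illustrates the idea only and makes no claim about how `ChatPromptTemplate` is implemented; the helper name `find_input_variables` is hypothetical, and the template is a simplified version of the one above.

```python
from string import Formatter

# Simplified version of the notebook's template (backtick delimiters omitted).
template = "Translate the text into a style that is {style}.\ntext: {email}"


def find_input_variables(template: str) -> list[str]:
    """Return the sorted, de-duplicated placeholder names in a format string."""
    names = {
        field_name
        for _, field_name, _, _ in Formatter().parse(template)
        if field_name  # skips literal-only segments (None) and bare "{}" ("")
    }
    return sorted(names)


print(find_input_variables(template))
```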

We can set values to the input variables with `format_messages`.
If we forget a required `input_variable`, a `KeyError` will be raised.

In [55]:
prompt_template.format_messages()

KeyError: 'style'
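
The same failure mode can be reproduced with plain `str.format`, which also raises `KeyError` when a named placeholder is not supplied. This is a sketch of the behaviour using a simplified template, not LangChain's own code path:

```python
# Simplified version of the notebook's template (backtick delimiters omitted).
template = "Translate the text into a style that is {style}.\ntext: {email}"

missing = None
try:
    # `style` is deliberately omitted to trigger the error.
    template.format(email="Arr, I be fuming!")
except KeyError as err:
    missing = err.args[0]
    print(f"Missing input variable: {missing}")
```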

In [56]:
messages = prompt_template.format_messages(style=style, email=email)

In [57]:
messages

[HumanMessage(content="Translate the text that is delimited by triple backticks\ninto a style that is American English in a calm and respectful tone\n.\ntext: ```\nArr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse, the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!\n```")]

In [58]:
type(messages)

list

In [59]:
type(messages[0])

langchain_core.messages.human.HumanMessage

In [60]:
chat(messages=messages)

AIMessage(content="I'm really frustrated that my blender lid flew off and made a mess of my kitchen walls with smoothie! And to make things even worse, the warranty doesn't cover the cost of cleaning up my kitchen. I could really use your help at this moment, my friend.")

We can reverse-translate our response using a similar technique.

In [61]:
reply = """Hey there customer, \
the warranty does not cover \
cleaning expenses for your kitchen \
because it's your fault that \
you misused your blender \
by forgetting to put the lid on before \
starting the blender. \
Tough luck! See ya!"""

In [62]:
reply_style = """
A polite tone \
that speaks in English Pirate"""

In [63]:
reply_messages = prompt_template.format_messages(style=reply_style, email=reply)

In [64]:
print(reply_messages[0].content)

Translate the text that is delimited by triple backticks
into a style that is 
A polite tone that speaks in English Pirate.
text: ```Hey there customer, the warranty does not cover cleaning expenses for your kitchen because it's your fault that you misused your blender by forgetting to put the lid on before starting the blender. Tough luck! See ya!```


In [65]:
response = chat(reply_messages)
print(response.content)

Arr, me hearty customer, the warranty be not coverin' yer cleaning expenses fer yer galley 'cause 'tis yer fault ye misused yer blender by forgettin' to put the lid on afore startin' the blender. Tis a tough luck, matey! Fare thee well!


Prompt templates give us a higher level of abstraction than `f-strings`, which becomes more useful as prompts grow longer, more complex, and are reused with different inputs.

## Output Parsers

We can use the LLM to extract information and format it as JSON, aiming for a result like the following Python dictionary.

In [66]:
{
    "gift": False,
    "delivery_days": 5,
    "price_value": "pretty affordable!"
}

{'gift': False, 'delivery_days': 5, 'price_value': 'pretty affordable!'}

In [67]:
review = """
This leaf blower is pretty amazing. It has four settings \
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversay present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

In [68]:
review_template = """
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

Format the output as JSON with the following keys:
gift
delivery_days
price_value

text: {text}
"""

In [69]:
from langchain.prompts import ChatPromptTemplate

In [70]:
prompt_template = ChatPromptTemplate.from_template(template=review_template)

In [71]:
prompt_template.input_variables

['text']

In [72]:
messages = prompt_template.format_messages(text=review)

In [73]:
chat = AzureChatOpenAI(model="gpt-35-turbo-16k", temperature=0, api_version=api_version)
response = chat(messages=messages)
response

AIMessage(content='{\n  "gift": false,\n  "delivery_days": 2,\n  "price_value": ["It\'s slightly more expensive than the other leaf blowers out there, but I think it\'s worth it for the extra features."]\n}')

By default the LLM will return the response content as a string.
We can convert it to another Python object using an `OutputParser`.

In [74]:
# This raises an error because the response.content is a string.
response.content.get("gift")

AttributeError: 'str' object has no attribute 'get'
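
Because this particular response happens to be valid JSON, the standard library can already convert it. The sketch below hard-codes the content string shown above as `content` (an illustrative stand-in for `response.content`). It only works when the model returns bare JSON with no surrounding text, which is not guaranteed.

```python
import json

# Stand-in for response.content, copied from the output above.
content = (
    '{\n  "gift": false,\n  "delivery_days": 2,\n'
    '  "price_value": ["It\'s slightly more expensive than the other leaf blowers '
    'out there, but I think it\'s worth it for the extra features."]\n}'
)

parsed = json.loads(content)  # str -> dict
print(parsed.get("gift"))
print(parsed["delivery_days"])
```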

## Parse the LLM output string into a Python dictionary

In [75]:
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

We define what we want the response to look like using the `ResponseSchema`.
This is similar to a `pydantic.Field` object.

In [78]:
gift_schema = ResponseSchema(
    name="gift",
    description="Was the item purchased as a gift for someone else? \
    Answer True if yes, False if not or unknown.",
)

In [79]:
delivery_days_schema = ResponseSchema(
    name="delivery_days",
    description="How many days did it take for the product \
    to arrive? If this information is not found, output -1.",
)

In [80]:
price_value_schema = ResponseSchema(
    name="price_value",
    description="Extract any sentences about the value or price, \
    and output them as a comma separated Python list.",
)

In [81]:
response_schemas = [
    gift_schema,
    delivery_days_schema,
    price_value_schema,
]

We create a `StructuredOutputParser` using the `response_schemas`.

In [82]:
output_parser = StructuredOutputParser.from_response_schemas(response_schemas=response_schemas)

In [84]:
type(output_parser)

langchain.output_parsers.structured.StructuredOutputParser

The parser turns the response schemas into formatting instructions for the LLM.

In [85]:
format_instructions = output_parser.get_format_instructions()

In [86]:
print(format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"gift": string  // Was the item purchased as a gift for someone else?     Answer True if yes, False if not or unknown.
	"delivery_days": string  // How many days did it take for the product     to arrive? If this information is not found, output -1.
	"price_value": string  // Extract any sentences about the value or price,     and output them as a comma separated Python list.
}
```
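
To make the link between schemas and instructions concrete, here is a rough sketch that assembles a similar instruction string from `(name, description)` pairs. The function `build_format_instructions` is hypothetical and only approximates the output printed above; it is not the `StructuredOutputParser` implementation.

```python
FENCE = "`" * 3  # three backticks, built indirectly so the example is easy to embed


def build_format_instructions(schemas: list[tuple[str, str]]) -> str:
    """Render (name, description) pairs as a schema-like instruction block."""
    fields = "\n".join(
        f'\t"{name}": string  // {description}' for name, description in schemas
    )
    header = (
        "The output should be a markdown code snippet formatted in the "
        f'following schema, including the leading and trailing "{FENCE}json" and "{FENCE}":'
    )
    return f"{header}\n\n{FENCE}json\n{{\n{fields}\n}}\n{FENCE}"


schemas = [
    ("gift", "Was the item purchased as a gift for someone else?"),
    ("delivery_days", "How many days did it take for the product to arrive?"),
]
print(build_format_instructions(schemas))
```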


In [87]:
review_template_2 = """
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

text: {text}

{format_instructions}"""

In [88]:
prompt = ChatPromptTemplate.from_template(template=review_template_2)

In [89]:
messages = prompt.format_messages(text=review, format_instructions=format_instructions)

In [90]:
print(messages[0].content)


For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, and output them as a comma separated Python list.

text: 
This leaf blower is pretty amazing. It has four settings candle blower, gentle breeze, windy city, and tornado. It arrived in two days, just in time for my wife's anniversay present. I think my wife liked it so much she was speechless. So far I've been the only one using it, and I've been using it every other morning to clear the leaves on our lawn. It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features.


The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```

In [91]:
response = chat(messages=messages)

In [92]:
print(response.content)

```json
{
	"gift": false,
	"delivery_days": "2",
	"price_value": "It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."
}
```


With the format instructions, the LLM does a much better job of formatting the response, though the result is still a string.
The `output_parser` can `parse` the `response.content`, converting the JSON inside the backtick-delimited block into a Python dictionary.

In [93]:
output_dict = output_parser.parse(text=response.content)

In [94]:
output_dict

{'gift': False,
 'delivery_days': '2',
 'price_value': "It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."}

In [95]:
type(output_dict)

dict

In [96]:
output_dict.get("delivery_days")

'2'
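
As a rough idea of what parsing involves: locate the fenced `json` block in the model's reply, decode it, and check that the expected keys are present. The sketch below does this with the standard library; `parse_fenced_json` is a hypothetical helper, not LangChain's implementation, and `content` is a shortened stand-in for `response.content`. It also casts `delivery_days` to `int`, since the format instructions describe every field as a string and the value comes back as `'2'`.

```python
import json
import re

FENCE = "`" * 3  # three backticks

# Stand-in for response.content, shortened for illustration.
content = (
    f"{FENCE}json\n"
    '{\n\t"gift": false,\n\t"delivery_days": "2",\n'
    '\t"price_value": "It\'s slightly more expensive, but worth it."\n}\n'
    f"{FENCE}"
)


def parse_fenced_json(text: str, expected_keys: list[str]) -> dict:
    """Extract and decode the first fenced json block, checking required keys."""
    pattern = re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no fenced json block found")
    data = json.loads(match.group(1))
    missing = [key for key in expected_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


output = parse_fenced_json(content, ["gift", "delivery_days", "price_value"])
delivery_days = int(output["delivery_days"])  # convert the string "2" to the integer 2
print(output["gift"], delivery_days)
```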