# LangChain

Large Language Models (LLMs) are being used in increasingly complex applications. When we use LLMs in applications such as chatbots, there are several things to consider:
- How do we keep track of historical context?
- Can we simulate a conversation with different people / agents?
- Can we direct the conversation to different 'people' depending on what is asked? (e.g. asking a panel of experts, and picking the most relevant expert based on the question)

We need a framework that allows us to abstract away many of these complexities, and ideally make it easy to logically structure how we use these LLMs. Introducing LangChain:

LangChain is an open-source framework for developing applications powered by LLMs. The framework lets you:
- Integrate with popular AI platforms such as OpenAI (the company behind ChatGPT) and Hugging Face
- Connect language models to data sources
- Enable LLMs to interact dynamically with their environment

It is designed around modular components that can be combined and reused across many different applications.

## Course outline
- Models, Prompts and Output Parsers
    - Calling OpenAI
    - OpenAI Endpoints
    - Prompt templates
    - Using LangChain
- Handling memory
    - How do LLMs store memory?
    - ConversationBufferMemory
    - ConversationBufferWindowMemory
    - ConversationTokenBufferMemory
    - ConversationSummaryBufferMemory
    - Other memory methods
- Chains
    - What is a chain?
    - LLMChain
    - SimpleSequentialChain
    - SequentialChain
    - RouterChain
    - Other chains to explore

## Models, Prompts and Output Parsers

### Calling OpenAI

Before we look into LangChain and what it can do, we will make direct calls to OpenAI to show you what LLMs can do.

Let's look at an example to see how this works.

If you want to follow along, you will need your own OpenAI API key.

Follow this link to get your own API key:
https://platform.openai.com/account/api-keys

In [13]:
# !pip install python-dotenv
# !pip install openai

In [29]:
import os

# Put your API key here. Never commit a real key to a shared notebook or repository.
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"  # placeholder, replace with your own key

In [30]:
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]
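Hardcoding the key in a cell is convenient but easy to leak. A minimal sketch of a safer pattern is to read the key from the environment and fail with a clear message when it is missing. The helper name `require_api_key` is hypothetical and not part of the `openai` library. The `python-dotenv` package installed above can populate `os.environ` from a local `.env` file via `load_dotenv()` before this helper is called.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key stored under `name`, or raise a clear error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the notebook.")
    return key


# Usage with a stand-in mapping, so no real key is needed to try it out
print(require_api_key({"OPENAI_API_KEY": "sk-example"}))
```

In the notebook itself this would be used as `openai.api_key = require_api_key()`.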

### OpenAI Endpoints

An API (application programming interface) is a software intermediary that allows applications to talk to each other. An API endpoint is a specific location within an API that accepts requests and sends responses back.

OpenAI offers several API endpoints, covering tasks such as:
- Transcribing audio files to text
- Generating chat responses given a list of messages
- Predicting text completions
- Creating vector embeddings from a text input
- Generating images from prompts and images
- and many more...

It is fascinating how simple it is now to access all this through a simple API call.

We will demonstrate using OpenAI's ChatCompletion API. This API is designed for conversations: given a list of messages that make up a conversation, it returns a response (think of a chatbot). Let's start by keeping it simple, and just demonstrate what it looks like to call this API endpoint.
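To make the idea of an endpoint concrete, here is a rough sketch of the HTTP request that sits underneath a chat completion call. It only builds the request with the standard library and does not send it, so no key or network access is needed. The helper name `build_chat_request` is hypothetical, and the key shown is a placeholder.

```python
import json
import urllib.request

# Chat completions endpoint of the OpenAI REST API
CHAT_URL = "https://api.openai.com/v1/chat/completions"


def build_chat_request(api_key, messages, model="gpt-3.5-turbo", temperature=0):
    """Build (but do not send) the HTTP POST request for a chat completion."""
    payload = {"model": model, "messages": messages, "temperature": temperature}
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_chat_request(
    api_key="YOUR_OPENAI_API_KEY",  # placeholder, not a real key
    messages=[{"role": "user", "content": "What is the capital of Australia?"}],
)
print(request.get_method(), request.full_url)
print(json.loads(request.data))
```

The `openai` Python package wraps this kind of request, which is why the rest of this section can work with plain Python function calls.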

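Let us choose the GPT-3.5 model family behind ChatGPT as the LLM for this demo. To account for the deprecation of the pinned model snapshot, we compare the current date against a target date of June 12th 2024 and switch model names after it.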

In [31]:
# account for deprecation of LLM model
import datetime

current_date = datetime.datetime.now().date() # Get the current date
target_date = datetime.date(2024, 6, 12) # Define the date after which the model should be set to "gpt-3.5-turbo"

# Set the model variable based on the current date
if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"
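The same date check can be wrapped in a small function, which makes the switch easy to try with fixed dates instead of waiting for the calendar. This is only a sketch: `pick_model` and `CUTOFF` are illustrative names, and the model names and date are the ones used in the cell above.

```python
import datetime

CUTOFF = datetime.date(2024, 6, 12)  # same target date as in the cell above


def pick_model(today, cutoff=CUTOFF):
    """Return the pinned snapshot up to and including the cutoff, the generic name after it."""
    return "gpt-3.5-turbo" if today > cutoff else "gpt-3.5-turbo-0301"


print(pick_model(datetime.date(2024, 6, 12)))  # gpt-3.5-turbo-0301
print(pick_model(datetime.date(2024, 6, 13)))  # gpt-3.5-turbo
```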

We will now define a function that will take in a prompt, feed it to our chosen LLM, then return the response.

In [32]:
def get_completion(prompt, model=llm_model):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, 
    )
    return response.choices[0].message["content"]

In [33]:
# In these conversations, we have different roles (e.g. a user and an assistant), and also a 'system' role, which you can think of as background information that defines who the assistant is.
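As a rough illustration of how the roles fit together, the sketch below assembles a message list with an optional system message, earlier turns, and a new user turn. The helper `build_messages` is hypothetical, and the conversation content is made up for the example.

```python
def build_messages(user_prompt, system_prompt=None, history=None):
    """Assemble a chat message list: optional system message, earlier turns, new user turn."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_prompt})
    return messages


messages = build_messages(
    "And what is its population?",
    system_prompt="You are a concise geography tutor.",
    history=[
        {"role": "user", "content": "What is the capital of Australia?"},
        {"role": "assistant", "content": "The capital of Australia is Canberra."},
    ],
)
for message in messages:
    print(f"{message['role']:>9}: {message['content']}")
```

A list like this could be passed as the `messages` argument of the chat call in place of the single user message used in `get_completion`.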

In [34]:
get_completion(prompt="What is the capital of Australia?")

'The capital of Australia is Canberra.'

Here, we simply called the ChatCompletion API endpoint with a simple prompt, and got a response. A further breakdown of the function:
- messages : the list of messages we pass to the ChatCompletion API endpoint
- response : the object returned by the API endpoint, which takes the messages as input and uses the chosen GPT-3.5 model to generate a reply
- temperature : a parameter that controls how random the response is. Setting it to 0 makes the model as deterministic as possible.
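To build some intuition for temperature, here is a toy sketch of the textbook technique: dividing the model's raw scores (logits) by the temperature before turning them into probabilities with a softmax. This illustrates the general idea only, not OpenAI's actual implementation, and the logits are made-up numbers for three candidate tokens.

```python
import math


def next_token_probabilities(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature == 0:
        # Limit case: all probability mass goes to the highest-scoring token
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0, 0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in next_token_probabilities(toy_logits, t)])
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures spread it more evenly, which is why higher values give more varied responses.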

Let's now explore what else we can do with this.

### Prompt templates

Let's now customise the prompt/input, so that it is more dynamic, and like a template we can reuse.

Suppose we have a customer review of a restaurant they went to. It is written in a very rude tone. We want to be able to rewrite the review so that it is written more politely. We can design this as a prompt template which takes as inputs:
- customer review : the content of the review
- style : the style in which to rewrite the review

We wrap this in a prompt text, which takes the 'customer review' and 'style' as dynamic inputs.


In [35]:
customer_review = """
The food in this restaurant is honestly the \
worst I have ever had. The steak was so dry \
and the portion was so small. The staff \
were not helpful, took forever to come \
and didn't seem to care about providing \
a good customer experience. The meal was also \
grossly overpriced. Do not come here if you \
want good food.
"""

In [62]:
style = """English in a polite tone."""

In [63]:
prompt = f"""Translate the text \
that is delimited by triple backticks 
into a style that is {style}.
text: ```{customer_review}```
"""

print(prompt)

Translate the text that is delimited by triple backticks 
into a style that is English in a polite tone..
text: ```
The food in this restaurant is honestly the worst I have ever had. The steak was so dry and the portion was so small. The staff were not helpful, took forever to come and didn't seem to care about providing a good customer experience. The meal was also grossly overpriced. Do not come here if you want good food.
```



In [64]:
response = get_completion(prompt)

In [65]:
response

'I must say that I was disappointed with the food at this restaurant. Unfortunately, the steak I ordered was quite dry and the portion size was rather small. Additionally, the staff were not very helpful and took quite a long time to attend to our needs. It seemed as though they were not very concerned with providing a positive customer experience. Furthermore, the meal was quite expensive and did not seem to be worth the price. I would not recommend this restaurant if you are looking for good food.'

We now have a more dynamic template, where we can feed in some content (the review) and a style, and the response will vary accordingly. In the next section, we'll repeat this exercise but using LangChain.
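One way to make the reuse explicit in plain Python is to wrap the f-string in a function, so the same template can be filled with different reviews and styles. This is a sketch: `build_prompt` is a hypothetical helper, and the short review is made up for the example.

```python
FENCE = "`" * 3  # triple backticks, built this way to keep the example easy to embed


def build_prompt(text, style):
    """Fill the style-transfer prompt with a review and a target style."""
    return (
        "Translate the text that is delimited by triple backticks "
        f"into a style that is {style}\n"
        f"text: {FENCE}{text.strip()}{FENCE}\n"
    )


review = "The steak was dry and the staff took forever."
print(build_prompt(review, "English in a polite tone."))
print(build_prompt(review, "English in the voice of a cheerful pirate."))
```

Each resulting string can be passed to `get_completion` exactly like the `prompt` variable above.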

### Using LangChain

At the most basic level, we can think of a call to an LLM as being made up of 3 components.
We need:
- A prompt (i.e. input) that will be fed into an LLM
- A large language model that will read in the prompt as input and process it
- A parser to take the output from the LLM and return it in a desired way
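The three components above can be sketched in plain Python before bringing in LangChain. The model here is a stand-in that returns a canned reply, so the example runs without an API key. All function names are hypothetical and are not LangChain APIs.

```python
def render_prompt(template, **variables):
    """Prompt: fill the template placeholders with concrete values."""
    return template.format(**variables)


def fake_llm(prompt):
    """Model: a stand-in that returns a canned reply instead of calling an API."""
    return "  ANSWER: Canberra  "


def parse_answer(raw_output):
    """Parser: strip whitespace and the 'ANSWER:' prefix from the raw reply."""
    return raw_output.split("ANSWER:", 1)[-1].strip()


def run(template, llm, parser, **variables):
    """Chain the three components: prompt -> model -> parser."""
    return parser(llm(render_prompt(template, **variables)))


print(run("What is the capital of {country}?", fake_llm, parse_answer, country="Australia"))
```

Keeping the three pieces separate means any one of them can be swapped, for example a different model or a stricter parser, without touching the others.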

Let's repeat the same exercise we did with OpenAI, but using LangChain this time.

In [40]:
#!pip install --upgrade langchain

In [67]:
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI

# define prompt template
# input variables are denoted in {}
template_string = """Translate the text \
that is delimited by triple backticks \
into a style that is {style}. \
text: ```{text}```
"""
prompt_template = ChatPromptTemplate.from_template(template_string) # define prompt template
chat = ChatOpenAI(temperature=0.0, model=llm_model) # define API endpoint / LLM

# provide example prompt based off template design
writing_style = """English in a formal polite tone."""
customer_review = """
The food in this restaurant is honestly the \
worst I have ever had. The steak was so dry \
and the portion was so small. The staff \
were not helpful, took forever to come \
and didn't seem to care about providing \
a good customer experience. The meal was also \
grossly overpriced. Do not come here if you \
want good food.
"""
customer_messages = prompt_template.format_messages(
                    style=writing_style,
                    text=customer_review)

# Call the LLM to translate to the style of the customer message
customer_response = chat(customer_messages)

# display response
print(customer_response.content)

I must express my disappointment with the quality of the food served at this establishment. Regrettably, the steak I ordered was excessively dry and the portion size was inadequate. Furthermore, the staff were unhelpful, took an unreasonable amount of time to attend to our needs, and appeared indifferent to providing a satisfactory customer experience. Additionally, the cost of the meal was exorbitant. I would advise against dining here if you are seeking a pleasurable culinary experience.


This time we did not simply pass in a formatted string. Instead, we created a ChatPromptTemplate from the LangChain library. This gives us more flexibility and makes for cleaner, more modular code. For example, we can inspect the template's input variables, as below:

In [66]:
prompt_template.messages[0].prompt.input_variables

['style', 'text']
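To see how placeholder names can be recovered from a template string, here is a small sketch using the standard library's `string.Formatter`. It illustrates the idea only and is not a claim about how LangChain implements `input_variables`. The helper name `find_input_variables` is hypothetical.

```python
from string import Formatter


def find_input_variables(template):
    """Return the distinct placeholder names in a str.format-style template, in order."""
    names = []
    for _, field_name, _, _ in Formatter().parse(template):
        if field_name and field_name not in names:
            names.append(field_name)
    return names


template = "Translate the text into a style that is {style}. text: {text}"
print(find_input_variables(template))  # ['style', 'text']
```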

The benefits of this will become more obvious once we start using more complex logic. Let's see another example now with an output parser.

### Parsing LLM output with LangChain

In [69]:
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser

In [73]:
bedroom_schema = ResponseSchema(
    name="bedroom"
    ,description="How many bedrooms does this property have? Answer as a single number if known. If unsure, Answer as Unknown.")
school_schema = ResponseSchema(
    name="school"
    ,description="What schools are around the property? If this information is not found, output Unknown.")
amenity_schema = ResponseSchema(
    name="amenity"
    ,description="Extract any amenties in the property, and output them as a comma separated Python list.")

response_schemas = [
    bedroom_schema 
    ,school_schema
    ,amenity_schema]

In [75]:
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)

In [77]:
format_instructions = output_parser.get_format_instructions()
print(format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"bedroom": string  // How many bedrooms does this property have? Answer as a single number if known. If unsure, Answer as Unknown.
	"school": string  // What schools are around the property? If this information is not found, output Unknown.
	"amenity": string  // Extract any amenties in the property, and output them as a comma separated Python list.
}
```


In [78]:
# define prompt template
listing_info_format = """\
For the following text, extract the following information:

bedroom: How many bedrooms does this property have? Answer as a single number if known. If unsure, Answer as Unknown.
school: What schools are around the property? If this information is not found, output Unknown.
amenity: Extract any amenties in the property, and output them as a comma separated Python list.

text: {text}

{format_instructions}
"""
prompt = ChatPromptTemplate.from_template(template=listing_info_format)

# give example input/prompt
listing_description = """
Eva Building - Near New & Luxury Apartment with 2 Large Balconies
Stylishly appointed this near-new three-bedroom apartment is perfectly located in the building of Eva Lane Cove. showcases a bright and versatile floor plan with spacious living and beautiful riverside views. Just footsteps to Hughes Park and a short stroll to city buses, cafes', shops and the bustling village also local schools.

Features including:
* Situated in sought-after location, enjoy parkside and riverside views
* Generous 2 bedrooms plus a multi-function room
* Large 2 balconies all with East aspects
* Elegance 2-layer blackout curtains in the living area and bedrooms
* Spacious interiors with a versatile open-plan living and dining area
* Island modern kitchen with 'Millie' appliances, gas cooking and dishwasher
* Three bedrooms all with built-in, the main bedroom with ensuite
* Sparkling bathroom with floor-to-ceiling tiles
* Ducting Air conditioning
* Video intercom and internal laundry.
* Secure one car space and storage

Outgoings:
Strata levy:$1208.60 pq
Council rate: $359.00 pq
Water: $158.45 pq approx.
"""

messages = prompt.format_messages(
    text=listing_description, 
    format_instructions=format_instructions # give an output parser format defined previously
)

# feed input into model
response = chat(messages)

# parse model output into desired dictionary format
output_dict = output_parser.parse(response.content)

In [79]:
output_dict

{'bedroom': 3,
 'school': 'Unknown',
 'amenity': ['Island modern kitchen',
  'Gas cooking',
  'Dishwasher',
  'Built-in wardrobes',
  'Ensuite',
  'Ducting Air conditioning',
  'Video intercom',
  'Internal laundry',
  'Secure one car space and storage']}
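Conceptually, the parsing step pulls the JSON snippet out of the model's reply and loads it into a dictionary. The sketch below shows that idea with the standard library. It is an illustration under the assumption that the reply follows the format instructions, not LangChain's actual implementation, and the reply text is made up.

```python
import json
import re

FENCE = "`" * 3  # triple backticks, built this way to keep the example easy to embed


def parse_json_block(reply):
    """Extract the first fenced json block from a reply and load it as a dict."""
    match = re.search(FENCE + r"json\s*(.*?)" + FENCE, reply, re.DOTALL)
    if match is None:
        raise ValueError("no fenced json block found in the reply")
    return json.loads(match.group(1))


reply = (
    "Here is the extracted information:\n"
    + FENCE + "json\n"
    + '{"bedroom": "3", "school": "Unknown", "amenity": "Dishwasher, Video intercom"}\n'
    + FENCE
)
parsed = parse_json_block(reply)
print(parsed["bedroom"], parsed["amenity"].split(", "))
```

If the model ignores the format instructions, the search finds nothing and the helper raises a `ValueError`, which is the kind of failure worth handling explicitly in a real application.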

## Handling memory

### How do LLMs store memory?

### ConversationBufferMemory

### ConversationBufferWindowMemory

### ConversationTokenBufferMemory

### ConversationSummaryBufferMemory

### Other memory methods