# Introduction
- Langchain is an open source development framework for building LLM applications
- Python and Javascript (TypeScript) supported
- Focuses on composition and modularity
- Key value adds:
    - Modular Components - Models (LLMs, Embedding Models, etc.), Prompts (Templates, Parsers, etc.), Indexes (Document Loaders, Text Splitters, Vector Stores, Retrievers, etc.), Chains (Prompts + LLMs + Output Parsing), Agents (LLMs armed with tools)
    - Common ways to combine components

Get your OpenAI API Key: <https://platform.openai.com/api-keys>

---

# Models, Prompts & Parsers

Langchain provides a simple set of abstractions for repeatable operations with LLMs.
* Models: the LLMs and embedding models
* Prompts: the way inputs are constructed before being passed to the models
* Parsers: take the output of a model and parse it into a structured format

In [None]:
#!pip install python-dotenv
#!pip install openai

In [None]:
import os
import openai

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key = os.environ['OPENAI_API_KEY']

In [None]:
# account for deprecation of LLM model
import datetime
# Get the current date
current_date = datetime.datetime.now().date()

# Define the date after which the model should be set to "gpt-3.5-turbo"
target_date = datetime.date(2024, 6, 12)

# Set the model variable based on the current date
if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"

In [None]:
def get_completion(prompt, model=llm_model):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, 
    )
    return response.choices[0].message["content"]

In [None]:
get_completion("What is 1+1?")

In [None]:
customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse, \
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

style = """American English \
in a calm and respectful tone
"""

prompt = f"""Translate the text \
that is delimited by triple backticks 
into a style that is {style}.
text: ```{customer_email}```
"""

print(prompt)

In [None]:
response = get_completion(prompt)
response

Now let's try the same thing in a more convenient way using Langchain.

In [None]:
#!pip install --upgrade langchain

In [None]:
'''
LANGCHAIN ABSTRACTION OF CHATGPT API ENDPOINT
'''
from langchain.chat_models import ChatOpenAI
chat = ChatOpenAI(temperature=0.0, model=llm_model)
chat

To repeatedly reuse the template below, let's import `ChatPromptTemplate` from Langchain

In [None]:
'''
LANGCHAIN: CHAT PROMPT TEMPLATE
'''
from langchain.prompts import ChatPromptTemplate

In [None]:
from langchain.prompts import ChatPromptTemplate

### DEFINE PROMPT TEMPLATE
### Takes 2 inputs: Style, Text

# prompt template
template_string = """Translate the text \
that is delimited by triple backticks \
into a style that is {style}. \
text: ```{text}```
"""

# create and view template
prompt_template = ChatPromptTemplate.from_template(template_string)
print("Prompt: ", prompt_template.messages[0].prompt)
print("")
print("Input Variables: ", prompt_template.messages[0].prompt.input_variables)

In [None]:
# convert to calm and respectful tone in english
customer_style = """American English \
in a calm and respectful tone
"""

# customer email in pirate language
customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse, \
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

# apply inputs to the template
customer_messages = prompt_template.format_messages(
                    style=customer_style,
                    text=customer_email)
print("Type: ", type(customer_messages))
print("View template with inputs: ", customer_messages[0])

In [None]:
# call LLM
customer_response = chat(customer_messages)
print(customer_response.content)

In [None]:
# convert model output back to pirate english
service_style_pirate = """\
a polite tone \
that speaks in English Pirate\
"""

# model output
service_reply = """Hey there customer, \
the warranty does not cover \
cleaning expenses for your kitchen \
because it's your fault that \
you misused your blender \
by forgetting to put the lid on before \
starting the blender. \
Tough luck! See ya!
"""

# apply template to output
service_messages = prompt_template.format_messages(
    style=service_style_pirate,
    text=service_reply)
print(service_messages[0].content)

In [None]:
# call LLM
service_response = chat(service_messages)
print(service_response.content)

Why do we use prompt templates?
- Prompts can be long and detailed
- Reuse good prompts when you can
- Langchain provides prompts for some common applications such as summarization, q&a, connecting to APIs/DBs, etc.
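
The reuse argument above can be illustrated without any LLM call. The following is a minimal sketch in plain Python, not Langchain's implementation: `TEMPLATE` and `input_variables` are hypothetical names, and the standard library's `string.Formatter` stands in for the template class.

```python
from string import Formatter

# One template, written once and reused with different inputs.
TEMPLATE = (
    "Translate the text that is delimited by triple backticks "
    "into a style that is {style}. text: ```{text}```"
)

def input_variables(template):
    """Return the placeholder names found in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]

# The same template serves both directions of the conversation.
calm = TEMPLATE.format(style="calm American English",
                       text="Arrr, me blender broke!")
pirate = TEMPLATE.format(style="polite English Pirate",
                         text="The warranty does not cover cleaning.")

print(input_variables(TEMPLATE))
print(calm)
print(pirate)
```

Only the inputs change between the two prompts, which is the property that makes a good prompt worth storing as a template.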

<center>
  <img src="images/prompt_template.png"/>
</center>  

Langchain's prompt libraries also support **output parsing**
- Parsers: Take the output of a model and parse it into a more structured format (such as extracting specific keywords, etc.)

<center>
  <img src="images/output_parser.png"/>
</center> 

In the example below, the LLM outputs JSON and Langchain parses that output
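
Before using the library parser, it may help to see what "parsing" means here. This is a rough sketch using only the standard library, assuming the model wraps its JSON in a fenced block; `raw_reply` is made-up sample text and `parse_json_reply` is a hypothetical helper, not a Langchain function.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Made-up model reply: a string that merely *looks* like structured data.
raw_reply = f'{FENCE}json\n{{"gift": true, "delivery_days": 2}}\n{FENCE}'

def parse_json_reply(reply):
    """Extract the first fenced JSON object from a reply and decode it."""
    pattern = FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE
    match = re.search(pattern, reply, re.DOTALL)
    if match is None:
        raise ValueError("no fenced JSON object found in reply")
    return json.loads(match.group(1))

parsed = parse_json_reply(raw_reply)
print(type(raw_reply).__name__, "->", type(parsed).__name__)
print(parsed["delivery_days"] + 1)  # arithmetic works only after parsing
```

The raw reply is a `str`, so keys cannot be looked up on it directly; after parsing, the values are ordinary Python objects.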

In [None]:
'''
LANGCHAIN: OUTPUT PARSERS
'''
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser

'''
LANGCHAIN: CHAT PROMPT TEMPLATE
'''
from langchain.prompts import ChatPromptTemplate

In [None]:
{
  "gift": False,
  "delivery_days": 5,
  "price_value": "pretty affordable!"
}

In [None]:
#### LLM TO PARSE OUTPUT AS JSON

# Json template
review_template = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

Format the output as JSON with the following keys:
gift
delivery_days
price_value

text: {text}
"""

# text
customer_review = """\
This leaf blower is pretty amazing.  It has four settings: \
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversary present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

In [None]:
prompt_template = ChatPromptTemplate.from_template(review_template)
print(prompt_template)

In [None]:
messages = prompt_template.format_messages(text=customer_review)
chat = ChatOpenAI(temperature=0.0, model=llm_model)
response = chat(messages)
print(response.content)

In [None]:
type(response.content) # str

In [None]:
#### LLM TO PARSE OUTPUT AS DICTIONARY

# Dict template
review_template_2 = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

text: {text}

{format_instructions}
"""

# STEP1: Define and describe the dictionary schema
gift_schema = ResponseSchema(name="gift",
                             description="Was the item purchased\
                             as a gift for someone else? \
                             Answer True if yes,\
                             False if not or unknown.")
print("GIFT SCHEMA: ", gift_schema)
print()

delivery_days_schema = ResponseSchema(name="delivery_days",
                                      description="How many days\
                                      did it take for the product\
                                      to arrive? If this \
                                      information is not found,\
                                      output -1.")
print("DELIVERY DAYS SCHEMA: ", delivery_days_schema)
print()

price_value_schema = ResponseSchema(name="price_value",
                                    description="Extract any\
                                    sentences about the value or \
                                    price, and output them as a \
                                    comma separated Python list.")
print("PRICE VALUE SCHEMA: ", price_value_schema)
print()

response_schemas = [gift_schema, 
                    delivery_days_schema,
                    price_value_schema]

In [None]:
# STEP 2: Create Output parser
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
print(output_parser)

In [None]:
# STEP 3: This is generated by langchain to format the output
format_instructions = output_parser.get_format_instructions()
print(format_instructions)

In [None]:
# Final prompt
prompt = ChatPromptTemplate.from_template(template=review_template_2)
messages = prompt.format_messages(text=customer_review, 
                                format_instructions=format_instructions)
print(messages[0].content)

In [None]:
# Call LLM
response = chat(messages)
print(response.content)

In [None]:
# Parse Output
output_dict = output_parser.parse(response.content)
print(type(output_dict))
print()
print(output_dict)

In [None]:
output_dict.get('delivery_days')

Summary: With these tools, it should be easier to reuse prompt templates, share prompt templates, and use Langchain's built-in prompt templates.

---

## Memory
- LLMs are stateless - each transaction is independent
- Remembering previous parts of a conversation and feeding them back into the LLM lets users hold a conversation with the model
- Langchain provides several kinds of memory to store and accumulate conversations

<center>
  <img src="images/memory.png"/>
</center> 

In [None]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

import warnings
warnings.filterwarnings('ignore')

from langchain.chat_models import ChatOpenAI

In [None]:
# account for deprecation of LLM model
import datetime
# Get the current date
current_date = datetime.datetime.now().date()

# Define the date after which the model should be set to "gpt-3.5-turbo"
target_date = datetime.date(2024, 6, 12)

# Set the model variable based on the current date
if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"

In [None]:
'''
LANGCHAIN: CONVERSATION CHAIN TO TRACK MEMORY
'''
from langchain.chains import ConversationChain

#### 1. Conversation Buffer Memory
- Saves the *entire* conversation in a "Human-AI" format up to the most recent turn
- As conversations get longer, the stored history grows, so more tokens are sent to the LLM on every call and each call becomes more expensive
- This memory stores the messages and exposes them through a variable
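
To make the idea concrete, here is a toy stand-in for a conversation buffer in plain Python. It is a sketch of the concept only: `BufferMemory` is a hypothetical class, not the Langchain one, and no model is called.

```python
class BufferMemory:
    """Toy conversation buffer: stores every turn verbatim."""

    def __init__(self):
        self.turns = []  # list of (human_text, ai_text) pairs

    def save_context(self, human, ai):
        """Append one completed exchange to the history."""
        self.turns.append((human, ai))

    def buffer(self):
        """Render the whole history in 'Human/AI' form."""
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

memory = BufferMemory()
memory.save_context("Hi, my name is Andrew", "Hello Andrew!")
memory.save_context("What is 1+1?", "1+1 is 2.")

# The stateless model can only "remember" the name because the full
# history is prepended to the new question on every call.
next_prompt = memory.buffer() + "\nHuman: What is my name?\nAI:"
print(next_prompt)
```

Every saved turn is re-sent with each new question, which is why the prompt (and its token cost) keeps growing with the conversation.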

In [None]:
'''
LANGCHAIN: CONVERSATION BUFFER MEMORY
'''
from langchain.memory import ConversationBufferMemory

In [None]:
llm = ChatOpenAI(temperature=0.0, model=llm_model)
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=llm,
    memory = memory,
    verbose=True
)

In [None]:
conversation.predict(input="Hi, my name is Andrew")

In [None]:
conversation.predict(input="What is 1+1?")

In [None]:
conversation.predict(input="What is my name?")

In [None]:
print(memory.buffer)
print()
print(memory.load_memory_variables({}))

In [None]:
# manually save memory
memory.save_context({"input": "Hi"}, 
                    {"output": "What's up"})
print(memory.buffer)
print()
memory.save_context({"input": "Not much, just hanging"}, 
                    {"output": "Cool"})
print(memory.buffer)


#### 2. Conversation Buffer Window Memory
- Stores the conversation in a "Human-AI" format but uses the parameter `k` to set how many of the most recent exchanges are passed to the LLM
- `k=1` means the LLM sees only the most recent exchange and earlier exchanges are dropped. In general, only the last `k` exchanges are used
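
The windowing behaviour can be sketched with a bounded queue from the standard library. This is an illustration of the idea, assuming one "exchange" is a (human, AI) pair; it is not how the Langchain class is implemented.

```python
from collections import deque

k = 1
window = deque(maxlen=k)  # keeps at most the k most recent exchanges

window.append(("Hi", "What's up"))
window.append(("Not much, just hanging", "Cool"))  # evicts the first exchange

history = "\n".join(f"Human: {h}\nAI: {a}" for h, a in window)
print(history)
```

With `k = 1`, the first exchange is gone by the time the second is stored, so a question that depends on it (such as "What is my name?") can no longer be answered from memory.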

In [None]:
'''
LANGCHAIN: CONVERSATION BUFFER WINDOW MEMORY
'''
from langchain.memory import ConversationBufferWindowMemory

In [None]:
memory = ConversationBufferWindowMemory(k=1)
memory.save_context({"input": "Hi"},
                    {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.load_memory_variables({})

In [None]:
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
    llm=llm, 
    memory = memory, # ConversationBufferWindowMemory
    verbose=True
)
conversation.predict(input="Hi, my name is Andrew")

In [None]:
conversation.predict(input="What is 1+1?")

In [None]:
conversation.predict(input="What is my name?")

#### 3. Conversation Token Buffer Memory
- This memory is limited by the number of tokens saved, set by `max_token_limit`
- If `max_token_limit=50`, only the most recent messages that fit within roughly 50 tokens are kept and older ones are dropped
- Thus, only the most recent tokens, up to the limit, are tracked
- Different LLMs count tokens in different ways, which is why the memory is given the LLM
- This memory keeps a buffer of recent interactions and uses token length, rather than the number of interactions, to decide when to flush older interactions
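
The pruning rule can be sketched as follows. This is a simplified illustration: `count_tokens` is a crude whitespace-based proxy (real tokenizers count differently), and `trim_to_token_limit` is a hypothetical helper rather than a Langchain function.

```python
def count_tokens(text):
    """Crude token proxy: number of whitespace-separated words."""
    return len(text.split())

def trim_to_token_limit(turns, max_token_limit):
    """Drop the oldest exchanges until the remainder fits the limit."""
    kept = list(turns)
    while kept and sum(count_tokens(h) + count_tokens(a)
                       for h, a in kept) > max_token_limit:
        kept.pop(0)  # oldest exchange goes first
    return kept

turns = [
    ("AI is what?!", "Amazing!"),
    ("Backpropagation is what?", "Beautiful!"),
    ("Chatbots are what?", "Charming!"),
]

print(trim_to_token_limit(turns, max_token_limit=50))  # everything fits
print(trim_to_token_limit(turns, max_token_limit=8))   # oldest exchange dropped
```

Unlike the window memory, the cut-off here depends on how long the messages are, not on how many there are.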

In [None]:
'''
LANGCHAIN: CONVERSATION TOKEN BUFFER MEMORY
'''
# ! pip install tiktoken
from langchain.memory import ConversationTokenBufferMemory

In [None]:
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=50)
memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, 
                    {"output": "Charming!"})
memory.load_memory_variables({})

#### 4. Conversation Summary Buffer Memory
- Instead of limiting the memory to recent exchanges or tokens, the LLM writes a summary of the conversation so far, and that summary becomes the memory
- This class takes a parameter, `max_token_limit`: the lower the limit, the more of the conversation is summarized
- Until the token limit is reached, messages are kept verbatim. Once the limit is exceeded, older messages are folded into the summary and only the most recent messages stay verbatim
- Over time, this memory builds a running summary of the conversation
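
A rough sketch of the summary-plus-buffer idea is shown below. Everything here is an assumption made for illustration: `summarize` is a trivial stand-in for the LLM-written summary, tokens are approximated by word counts, and `SummaryBufferMemory` is a hypothetical class, not the Langchain one.

```python
def count_tokens(text):
    """Crude token proxy: number of whitespace-separated words."""
    return len(text.split())

def summarize(old_summary, turns):
    """Stand-in for an LLM summarizer: just lists what the human said."""
    topics = "; ".join(h for h, _ in turns)
    prefix = old_summary + " " if old_summary else ""
    return prefix + f"Earlier topics: {topics}."

class SummaryBufferMemory:
    def __init__(self, max_token_limit):
        self.max_token_limit = max_token_limit
        self.summary = ""   # condensed older history
        self.recent = []    # verbatim recent exchanges

    def _recent_tokens(self):
        return sum(count_tokens(h) + count_tokens(a) for h, a in self.recent)

    def save_context(self, human, ai):
        self.recent.append((human, ai))
        overflow = []
        # Move the oldest exchanges out of the verbatim buffer until it fits.
        while self.recent and self._recent_tokens() > self.max_token_limit:
            overflow.append(self.recent.pop(0))
        if overflow:
            self.summary = summarize(self.summary, overflow)

memory = SummaryBufferMemory(max_token_limit=6)
memory.save_context("Hello", "What's up")
memory.save_context("Not much, just hanging", "Cool")
print(memory.summary)
print(memory.recent)
```

Older exchanges are not lost outright, as they are with the window and token buffers; they survive in condensed form.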

In [None]:
'''
LANGCHAIN: CONVERSATION SUMMARY BUFFER MEMORY
'''
from langchain.memory import ConversationSummaryBufferMemory

In [None]:
# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At Noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"}, 
                    {"output": f"{schedule}"})
memory.load_memory_variables({})

In [None]:
conversation = ConversationChain(
    llm=llm, 
    memory = memory,
    verbose=True
)
conversation.predict(input="What would be a good demo to show?")

In [None]:
memory.load_memory_variables({})

#### 5. Other Types of Memory
1. Vector Data Memory - Stores text in a vector database and retrieves the most relevant blocks of text
2. Entity Memory - LLM remembers details about specific entities

You can use multiple memories at the same time, for example, using conversation memory along with entity memory to recall individuals. You can also store the conversations in a conversation database (such as key-value store or SQL db)

---

# Chains
- A chain is a building block of Langchain
- A chain combines an LLM with a prompt, and chains can be linked to carry out a sequence of operations on data

In [None]:
import warnings
warnings.filterwarnings('ignore')

In [None]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

In [None]:
# account for deprecation of LLM model
import datetime
# Get the current date
current_date = datetime.datetime.now().date()

# Define the date after which the model should be set to "gpt-3.5-turbo"
target_date = datetime.date(2024, 6, 12)

# Set the model variable based on the current date
if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"

In [None]:
#!pip install pandas

In [None]:
import pandas as pd
df = pd.read_csv('Data.csv')
df.head()

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

#### 1. LLM Chain
- This is a combination of LLM + Prompt

In [None]:
'''
LANGCHAIN: BASIC LLM CHAIN
'''
from langchain.chains import LLMChain

In [None]:
llm = ChatOpenAI(temperature=0.9, model=llm_model)
prompt = ChatPromptTemplate.from_template(
    "What is the best name to describe \
    a company that makes {product}?"
)
chain = LLMChain(llm=llm, prompt=prompt)
product = "Queen Size Sheet Set"
chain.run(product)

#### 2. Sequential Chains
- Combines chains such that the output of one chain is the input of the next chain
- 2 types of sequential chains:
    1. Simple Sequential Chain - Single input and output
    2. Sequential Chain - Multiple inputs and outputs

##### 1. Simple Chain
- Works on a single input and single output (the first chain outputs 1 value and the value is used for another chain)
- Many simple chains can be linked together 
- Each simple chain also has a single input and a single output
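
Conceptually, a simple sequential chain is function composition. The sketch below uses two hypothetical string functions in place of LLM-backed chains, so it shows only the data flow, not Langchain's implementation.

```python
from functools import reduce

def name_company(product):
    """Stand-in for chain one: product -> company name."""
    return f"{product} Co."

def describe_company(company_name):
    """Stand-in for chain two: company name -> description."""
    return f"{company_name} makes quality home goods."

def run_simple_sequence(steps, first_input):
    """Feed the single output of each step into the next step."""
    return reduce(lambda value, step: step(value), steps, first_input)

result = run_simple_sequence([name_company, describe_company],
                             "Queen Size Sheet Set")
print(result)
```

Because every step takes one value and returns one value, steps can be appended to the list without any extra wiring.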

<center>
  <img src="images/simple_sequential_chain.png"/>
</center> 

In [None]:
'''
LANGCHAIN: SPECIAL CHAIN THAT EXPECTS ONE INPUT/OUTPUT
'''
from langchain.chains import SimpleSequentialChain

In [None]:
llm = ChatOpenAI(temperature=0.9, model=llm_model)
product = "Queen Size Sheet Set"

# prompt template 1
first_prompt = ChatPromptTemplate.from_template(
    "What is the best name to describe \
    a company that makes {product}?"
)

# Chain 1
chain_one = LLMChain(llm=llm, prompt=first_prompt)

# prompt template 2
second_prompt = ChatPromptTemplate.from_template(
    "Write a 20 words description for the following \
    company:{company_name}"
)
# chain 2
chain_two = LLMChain(llm=llm, prompt=second_prompt)

# overall chain
overall_simple_chain = SimpleSequentialChain(chains=[chain_one, chain_two],
                                             verbose=True)
print(overall_simple_chain)
print()
print("Run Chain:")
overall_simple_chain.run(product)

##### 2. Sequential Chain
- Processes multiple inputs and outputs
- An output key must be specified for every chain so that later chains can refer to its result
- Any step in the chain can take in multiple input variables
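
The role of the output keys can be sketched with a shared dictionary of named values. The step functions below are hypothetical stand-ins for LLM calls and `run_sequence` is an illustrative helper, not a Langchain API.

```python
def run_sequence(steps, inputs, output_variables):
    """Run steps in order, storing each result under its output key."""
    state = dict(inputs)  # shared dictionary of named values
    for input_keys, output_key, func in steps:
        state[output_key] = func(*(state[key] for key in input_keys))
    return {key: state[key] for key in output_variables}

# Each tuple is (input keys, output key, function).
steps = [
    (["Review"], "English_Review", lambda review: f"[EN] {review}"),
    (["English_Review"], "summary", lambda english: f"summary of {english}"),
    (["Review"], "language", lambda review: "French"),
    (["summary", "language"], "followup_message",
     lambda summary, language: f"({language}) reply to {summary}"),
]

outputs = run_sequence(steps, {"Review": "Tres bon produit"},
                       ["summary", "followup_message"])
print(outputs["followup_message"])
```

The last step reads two keys produced by different earlier steps, which is only possible because each result was stored under a name.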

<center>
  <img src="images/sequnetial_chain.png"/>
</center> 

In [None]:
'''
LANGCHAIN: REGULAR SEQUENTIAL CHAIN THAT ACCEPTS MULTIPLE INPUTS/OUTPUTS
'''
from langchain.chains import SequentialChain

In [None]:
llm = ChatOpenAI(temperature=0.9, model=llm_model)

# prompt template 1: translate to english
first_prompt = ChatPromptTemplate.from_template(
    "Translate the following review to english:"
    "\n\n{Review}"
)

# chain 1: input= Review and output= English_Review
chain_one = LLMChain(llm=llm, 
                     prompt=first_prompt, 
                     output_key="English_Review")

####################################################

# chain 2: summarize review in 1 sentence
second_prompt = ChatPromptTemplate.from_template(
    "Can you summarize the following review in 1 sentence:"
    "\n\n{English_Review}"
)

# chain 2: input= English_Review and output= summary
chain_two = LLMChain(llm=llm, 
                     prompt=second_prompt, 
                     output_key="summary")

####################################################

# prompt template 3: detect the language of the original review
third_prompt = ChatPromptTemplate.from_template(
    "What language is the following review:\n\n{Review}")

# chain 3: input= Review and output= language
chain_three = LLMChain(llm=llm, 
                       prompt=third_prompt,
                       output_key="language")

####################################################

# chain 4: follow up message
fourth_prompt = ChatPromptTemplate.from_template(
    "Write a follow up response to the following "
    "summary in the specified language:"
    "\n\nSummary: {summary}\n\nLanguage: {language}")

# chain 4: input= summary, language and output= followup_message
chain_four = LLMChain(llm=llm, 
                      prompt=fourth_prompt,
                      output_key="followup_message")

In [None]:
# overall_chain: input= Review 
# and output= English_Review, summary, followup_message
overall_chain = SequentialChain(
    chains=[chain_one, chain_two, chain_three, chain_four],
    input_variables=["Review"],
    output_variables=["English_Review", "summary", "followup_message"],
    verbose=True
)

# run chain
review = '' # include a product review in french
overall_chain(review)

#### 3. Router Chain
- Decides which chain to pass an input to
- Routes an input to a sub-chain depending on the content of that input
- When there are multiple sub-chains, each specialized for a particular type of input, the router chain decides which one receives the input
- It is possible to provide output parsers at the end of the chain
- It is possible to declare a default chain that handles inputs that match none of the sub-chains

<center>
  <img src="images/router_chain.png"/>
</center> 

Example - One prompt is good for task A, another for task B, and so on. The router chain uses each prompt's name and description to pick the sub-chain, which then applies its own template
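
The routing decision can be sketched without an LLM. In this illustration a keyword rule (`fake_router`, a hypothetical function) plays the role of the router model and returns a dictionary with `destination` and `next_inputs` keys; the destination "chains" are plain functions.

```python
# Specialized sub-chains, represented here by simple functions.
DESTINATIONS = {
    "physics": lambda question: f"[physics expert] {question}",
    "math": lambda question: f"[math expert] {question}",
}

def default_chain(question):
    """Fallback used when no specialized destination fits."""
    return f"[general] {question}"

def fake_router(question):
    """Keyword-based stand-in for the LLM that picks a destination."""
    lowered = question.lower()
    if "radiation" in lowered or "gravity" in lowered:
        name = "physics"
    elif any(ch.isdigit() for ch in lowered):
        name = "math"
    else:
        name = "DEFAULT"
    return {"destination": name, "next_inputs": question}

def route(question):
    decision = fake_router(question)
    chain = DESTINATIONS.get(decision["destination"], default_chain)
    return chain(decision["next_inputs"])

print(route("What is black body radiation?"))
print(route("what is 2 + 2"))
print(route("Why does every cell in our body contain DNA?"))
```

The third question matches neither specialist, so it falls through to the default chain.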

In [None]:
'''
LANGCHAIN: CHAIN USED FOR ROUTING BETWEEN MULTIPLE PROMPT TEMPLATES
'''
from langchain.chains.router import MultiPromptChain

'''
LANGCHAIN LLMRouterChain: USES THE LANGUAGE MODEL ITSELF TO ROUTE BETWEEN SUBCHAINS
LANGCHAIN RouterOutputParser: PARSES THE LLM OUTPUT TO A DICT THAT IS USED DOWNSTREAM TO DETERMINE THE SUBCHAIN TO BE USED AND ITS INPUTS
'''
from langchain.chains.router.llm_router import LLMRouterChain, RouterOutputParser

'''
LANGCHAIN: PROMPT TEMPLATE
'''
from langchain.prompts import PromptTemplate

In [None]:
# Each prompt is good for answering a question related to a specific topic

physics_template = """You are a very smart physics professor. \
You are great at answering questions about physics in a concise \
and easy to understand manner. \
When you don't know the answer to a question you admit \
that you don't know.

Here is a question:
{input}"""


math_template = """You are a very good mathematician. \
You are great at answering math questions. \
You are so good because you are able to break down \
hard problems into their component parts, 
answer the component parts, and then put them together \
to answer the broader question.

Here is a question:
{input}"""


history_template = """You are a very good historian. \
You have an excellent knowledge of and understanding of people, \
events and contexts from a range of historical periods. \
You have the ability to think, reflect, debate, discuss and \
evaluate the past. You have a respect for historical evidence \
and the ability to make use of it to support your explanations \
and judgements.

Here is a question:
{input}"""


computerscience_template = """You are a successful computer scientist. \
You have a passion for creativity, collaboration, \
forward-thinking, confidence, strong problem-solving capabilities, \
understanding of theories and algorithms, and excellent communication \
skills. You are great at answering coding questions. \
You are so good because you know how to solve a problem by \
describing the solution in imperative steps \
that a machine can easily interpret and you know how to \
choose a solution that has a good balance between \
time complexity and space complexity. 

Here is a question:
{input}"""

In [None]:
prompt_infos = [
    {
        "name": "physics", 
        "description": "Good for answering questions about physics", 
        "prompt_template": physics_template
    },
    {
        "name": "math", 
        "description": "Good for answering math questions", 
        "prompt_template": math_template
    },
    {
        "name": "History", 
        "description": "Good for answering history questions", 
        "prompt_template": history_template
    },
    {
        "name": "computer science", 
        "description": "Good for answering computer science questions", 
        "prompt_template": computerscience_template
    }
]

In [None]:
destination_chains = {}
for p_info in prompt_infos:
    name = p_info["name"]
    prompt_template = p_info["prompt_template"]
    prompt = ChatPromptTemplate.from_template(template=prompt_template)
    chain = LLMChain(llm=llm, prompt=prompt)
    destination_chains[name] = chain  
    
destinations = [f"{p['name']}: {p['description']}" for p in prompt_infos]
destinations_str = "\n".join(destinations)
print(destinations_str)

In [None]:
# default chain
default_prompt = ChatPromptTemplate.from_template("{input}")
default_chain = LLMChain(llm=llm, prompt=default_prompt)

In [None]:
MULTI_PROMPT_ROUTER_TEMPLATE = """Given a raw text input to a \
language model select the model prompt best suited for the input. \
You will be given the names of the available prompts and a \
description of what the prompt is best suited for. \
You may also revise the original input if you think that revising \
it will ultimately lead to a better response from the language model.

<< FORMATTING >>
Return a markdown code snippet with a JSON object formatted to look like:
```json
{{{{
    "destination": string \ name of the prompt to use or "DEFAULT"
    "next_inputs": string \ a potentially modified version of the original input
}}}}
```

REMEMBER: "destination" MUST be one of the candidate prompt \
names specified below OR it can be "DEFAULT" if the input is not \
well suited for any of the candidate prompts.
REMEMBER: "next_inputs" can just be the original input \
if you don't think any modifications are needed.

<< CANDIDATE PROMPTS >>
{destinations}

<< INPUT >>
{{input}}

<< OUTPUT (remember to include the ```json)>>"""

In [None]:
router_template = MULTI_PROMPT_ROUTER_TEMPLATE.format(
    destinations=destinations_str # set destination
)
router_prompt = PromptTemplate(
    template=router_template,
    input_variables=["input"],
    output_parser=RouterOutputParser(),
)
router_chain = LLMRouterChain.from_llm(llm, router_prompt)

In [None]:
chain = MultiPromptChain(router_chain=router_chain, 
                         destination_chains=destination_chains, 
                         default_chain=default_chain, verbose=True
                        )

In [None]:
chain.run("What is black body radiation?")

In [None]:
chain.run("what is 2 + 2")

In [None]:
chain.run("Why does every cell in our body contain DNA?")

---

# Question & Answer

The most common application that people are building using LLMs is to create a system that's able to query documents.

Language models can be combined with data they haven't originally been trained on, making them more flexible and adaptable for your use case.

Below is a general framework:

In [None]:
#pip install --upgrade langchain

In [None]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

In [None]:
# account for deprecation of LLM model
import datetime
# Get the current date
current_date = datetime.datetime.now().date()

# Define the date after which the model should be set to "gpt-3.5-turbo"
target_date = datetime.date(2024, 6, 12)

# Set the model variable based on the current date
if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"

In [None]:
#pip install docarray

In [None]:
'''
LANGCHAIN: RETRIEVE OVER DOCUMENTS
'''
from langchain.chains import RetrievalQA # retrieval over documents

'''
LANGCHAIN: PERFORM Q&A OVER PROPRIETARY CSV DATA
'''
from langchain.document_loaders import CSVLoader

'''
LANGCHAIN: IMPORT OPENAI EMBEDDING CLASS
'''
from langchain.embeddings import OpenAIEmbeddings

'''
LANGCHAIN: IN-MEMORY VECTOR STORE WITH NO EXTERNAL DB REQUIRED
'''
from langchain.vectorstores import DocArrayInMemorySearch # in-memory vector store with no external vector db required

'''
LANGCHAIN: HELPS CREATE VECTOR STORE
'''
from langchain.indexes import VectorstoreIndexCreator

# Display python output
from IPython.display import display, Markdown # display info in jupyter notebooks

In [None]:
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)

In [None]:
# create in-memory vector store
embeddings = OpenAIEmbeddings()
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader]) # from_loaders takes a list of document loaders

In [None]:
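query = "Please list all your shirts with sun protection \
in a table in markdown and summarize each one."

# OpenAI completion-model wrapper (not imported in the cells above)
from langchain.llms import OpenAI

llm_replacement_model = OpenAI(temperature=0, 
                               model='gpt-3.5-turbo-instruct')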

response = index.query(query, 
                       llm = llm_replacement_model)

display(Markdown(response))

Step-by-step execution, to understand what the general framework above does under the hood:
- LLMs can only inspect a few thousand words at a time. Thus, if we have really large documents, we need embeddings and vector stores

##### Embeddings
- Embeddings create numerical representations of pieces of text. These numerical representations capture the semantic meaning of the text
- Pieces of text with similar content will have similar vectors. This lets us compare the different vectors within the vector space
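
Comparing vectors is commonly done with cosine similarity. The sketch below uses small hand-made vectors as hypothetical embeddings (real embeddings have hundreds or thousands of dimensions), so the numbers are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Made-up 3-dimensional "embeddings" for three pieces of text.
dog = [0.9, 0.1, 0.0]
puppy = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

print(cosine_similarity(dog, puppy))  # close to 1: similar content
print(cosine_similarity(dog, car))    # close to 0: unrelated content
```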

<center>
  <img src="images/embeddings.png"/>
</center> 

##### Vector Databases
- Vector databases store the vector representations (embeddings) of text
- A vector database is populated with chunks of text from incoming documents, since we often cannot pass a whole document to the language model
- We create an embedding for each of these chunks and store those in the vector database (also called creating the **index**)
- At run time, we embed the incoming query and use the index to find the pieces of text most relevant to it
- Once identified, these pieces of text are passed to the language model to get the final answer
- To do this, use retrievers: a generic interface that takes in a query and returns documents
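
The index-then-retrieve flow can be sketched in a few lines. This is a toy under stated assumptions: `embed` uses word counts instead of a learned embedding model, the chunks are made-up catalog lines, and `retrieve` is a hypothetical helper standing in for a retriever.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())

def similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

chunks = [
    "sun protection shirt with upf 50",
    "waterproof hiking boots",
    "cotton shirt for casual wear",
]

# Build the index once: every chunk is stored next to its embedding.
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(query, k=2):
    """Embed the query and return the k most similar chunks."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: similarity(query_vector, item[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

top = retrieve("shirt with sun protection", k=2)
print(top)
```

Only the retrieved chunks, not the whole catalog, would then be handed to the language model.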

<center>
  <img src="images/indexing.png"/>
</center> 

<center>
  <img src="images/vector_db.png"/>
</center> 

In [None]:
# Step by Step Execution

# load the csv document
loader = CSVLoader(file_path=file)
docs = loader.load()

# each document corresponds to 1 csv row
docs[0]

In [None]:
# create an OpenAI embeddings object and embed a sample sentence
embeddings = OpenAIEmbeddings()
embed = embeddings.embed_query("Hi my name is Harrison") #check
print(len(embed))
print(embed[:5])

In [None]:
# create embedding of the entire document and store in a vector-store (in memory)
db = DocArrayInMemorySearch.from_documents(
    docs, 
    embeddings
)

In [None]:
# find answer
query = "Please suggest a shirt with sunblocking"
docs_out = db.similarity_search(query)
print(len(docs_out)) # number of rows returned as most similar to the query
print(docs_out[0])   # the closest match

In [None]:
# now create a retriever from the vector store. 
retriever = db.as_retriever()

In [None]:
# join the page contents of the retrieved documents into a single context string
llm = ChatOpenAI(temperature = 0.0, model=llm_model)
qdocs = "".join([doc.page_content for doc in docs_out])
response = llm.call_as_llm(f"{qdocs} Question: Please list all your \
shirts with sun protection in a table in markdown and summarize each one.") 
display(Markdown(response))

In [None]:
# encapsulate all the above steps as a chain:
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm, # used for doing text generation at the end
    chain_type="stuff", # stuffs all output docs into context and makes one call to the llm
    retriever=retriever, # interface for fetching documents
    verbose=True
)

query =  "Please list all your shirts with sun protection in a table \
in markdown and summarize each one."
response = qa_stuff.run(query)
display(Markdown(response))

In [None]:
# customize vector store creation
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=embeddings,
).from_loaders([loader])

# one-liner for the above steps
response = index.query(query, llm=llm)

In [None]:
# check answer with debug functionality "ON"
import langchain

langchain.debug = True

# check same answer as above
qa_stuff.run(query)

# Turn Debug "OFF"
langchain.debug = False

##### Retrieval Methods:

1. Stuff - Puts all the retrieved documents into a single prompt and passes it to the LLM
    - Pros: Single call to the LLM + the LLM has access to all the data at once
    - Cons: Limited by the LLM's context length
<center>
  <img src="images/stuff.png"/>
</center> 

2. Map Reduce - Passes each chunk, together with the question, to the LLM in its own call, then uses one more LLM call to combine the individual responses into a final answer.
    - Pros: Can operate over any number of documents + individual documents can be processed in parallel
    - Cons: Many LLM calls + each document is treated independently, so context that spans documents may be missed
3. Refine - Iteratively loops over the documents, building on the answer from the previous documents
    - Pros: Combines information and builds up the answer over time + tends to give longer answers
    - Cons: Many LLM calls (about as many as Map Reduce), and slow because each call depends on the previous one
4. Map Re-rank - Makes one LLM call per document, asks the LLM to return a score with each answer, and selects the answer with the highest score
    - Pros: The scoring criteria can be customized
    - Cons: Many LLM calls + each document is treated independently, so context that spans documents may be missed
<center>
  <img src="images/additional_methods.png"/>
</center> 

Note:
- Most common is the Stuff method
- Use map reduce to recursively summarize pieces of documents
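The difference in call patterns between Stuff, Map Reduce and Refine can be sketched without a real model. This is a simplified illustration, not LangChain's implementation: `fake_llm` is a hypothetical stand-in that only counts how often it is called, and the prompts are placeholders.

```python
def fake_llm(prompt):
    """Hypothetical stand-in for a model call; returns a tag instead of real text."""
    fake_llm.calls += 1
    return f"<answer {fake_llm.calls}>"

fake_llm.calls = 0

def stuff(docs, question, llm):
    # One call: every document goes into a single prompt
    context = "\n\n".join(docs)
    return llm(f"{context}\n\nQuestion: {question}")

def map_reduce(docs, question, llm):
    # One call per document (these could run in parallel), then one call to combine
    partials = [llm(f"{doc}\n\nQuestion: {question}") for doc in docs]
    return llm("Combine these answers:\n" + "\n".join(partials))

def refine(docs, question, llm):
    # Sequential: each call sees the next document plus the answer so far
    answer = ""
    for doc in docs:
        answer = llm(f"Answer so far: {answer}\nNew context: {doc}\nQuestion: {question}")
    return answer

docs = ["doc A", "doc B", "doc C"]
call_counts = {}
for strategy in (stuff, map_reduce, refine):
    fake_llm.calls = 0
    strategy(docs, "Which shirts have sun protection?", fake_llm)
    call_counts[strategy.__name__] = fake_llm.calls

print(call_counts)  # {'stuff': 1, 'map_reduce': 4, 'refine': 3}
```

With three documents, Stuff makes one call, Map Reduce makes one per document plus a combining call, and Refine makes one per document in strict sequence.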

---

## Evaluating LLM Applications

When building LLM applications, how do you know whether the application is getting better or worse? Answering that requires an evaluation strategy.

Use an evaluation strategy to:
1. Determine whether the accuracy criteria have been met
2. Systematically track changes in metrics whenever the LLM, prompt, vector database or any other component is changed

A lot of LLM evaluation frameworks focus on understanding what is going in and coming out of a chain.

Also, LLMs themselves can be used for evaluating LLM outputs.
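As a rough illustration of tracking a metric across changes, the sketch below scores two versions of a pipeline against the same ground-truth examples. The examples, `pipeline_v1`, `pipeline_v2` and `exact_match_accuracy` are all hypothetical names and data made up for this sketch.

```python
def exact_match_accuracy(pipeline, examples):
    """Fraction of examples whose answer matches the ground truth (case-insensitive)."""
    hits = sum(
        pipeline(ex["query"]).strip().lower() == ex["answer"].strip().lower()
        for ex in examples
    )
    return hits / len(examples)

# Hypothetical ground-truth examples
examples = [
    {"query": "Does the pullover have side pockets?", "answer": "Yes"},
    {"query": "Is the jacket waterproof?", "answer": "No"},
]

# Two hypothetical versions of an application, standing in for a real chain
def pipeline_v1(query):
    return "Yes"  # always answers yes

def pipeline_v2(query):
    return "Yes" if "pockets" in query else "No"

scores = {
    "v1": exact_match_accuracy(pipeline_v1, examples),
    "v2": exact_match_accuracy(pipeline_v2, examples),
}
print(scores)  # {'v1': 0.5, 'v2': 1.0}
```

Exact string matching is brittle for free-text answers, since a correct answer can be phrased in many ways. That is one reason an LLM is often used as the grader instead.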

We will use the Document QA chain for evaluation.

In [None]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

In [None]:
# account for deprecation of LLM model
import datetime
# Get the current date
current_date = datetime.datetime.now().date()

# Define the date after which the model should be set to "gpt-3.5-turbo"
target_date = datetime.date(2024, 6, 12)

# Set the model variable based on the current date
if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"

In [None]:
# import libraries
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain.vectorstores import DocArrayInMemorySearch

In [None]:
# Import data
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)
data = loader.load()

In [None]:
# Index
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])

In [None]:
# Retrieval QA Chain
llm = ChatOpenAI(temperature = 0.0, model=llm_model)
qa = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=index.vectorstore.as_retriever(), 
    verbose=True,
    chain_type_kwargs = {
        "document_separator": "<<<<>>>>>"
    }
)

In [None]:
# Come up with test data points
print(data[10])
print(data[11])

### Eval Method 1 - Manually Generate Q&A (Ground truth) Pairs

Based on the test data points above, we create 2 examples. However, this does not scale well.

In [None]:
examples = [
    {
        # adjacent string literals are joined without embedding the indentation
        "query": "Do the Cozy Comfort Pullover Set "
                 "have side pockets?",
        "answer": "Yes"
    },
    {
        "query": "What collection is the Ultra-Lofty "
                 "850 Stretch Down Hooded Jacket from?",
        "answer": "The DownTek collection"
    }
]

### Eval Method 2 - Automated Q&A Pair Generation

Takes in documents and creates a Q&A pair from each document using an LLM.

In [None]:
# LangChain: generate ground-truth Q&A pairs from the entire document
from langchain.evaluation.qa import QAGenerateChain

# LangChain: evaluate answers using an LLM
from langchain.evaluation.qa import QAEvalChain

In [None]:
# The chain below takes in documents and automatically creates a Q&A pair from each one.
# Use this when you have an entire document and want to generate ground-truth Q&A pairs.
# This differs from retrieval-based answers: retrieval depends on chunking, semantic search, etc.,
# whereas the chain below takes the entire document as context.
example_gen_chain = QAGenerateChain.from_llm(ChatOpenAI(model=llm_model)) # pass in the language model

new_examples = example_gen_chain.apply_and_parse(
    [{"doc": t} for t in data[:5]] # get a dictionary with Q&A pairs
)
examples += new_examples # add to the above manually generated examples
print(examples)

In [None]:
# check how answers are generated, using debug mode
import langchain

langchain.debug = True
qa.run(examples[0]["query"])
langchain.debug = False

In [None]:
# Generate LLM-based answers for all questions in the examples.
# We then have the question, ground-truth answer and LLM answer for each example.
predictions = qa.apply(examples)

In [None]:
# Get graded outputs by LLM
eval_chain = QAEvalChain.from_llm(llm)
graded_outputs = eval_chain.evaluate(examples, predictions)

In [None]:
# Check LLM based evaluation
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + graded_outputs[i]['text'])
    print()
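To turn the per-example grades into a single number that can be tracked over time, something like the sketch below could be used. It assumes the grader's text ends in the word CORRECT or INCORRECT; the sample outputs are hypothetical, and the exact text returned may differ between LangChain versions.

```python
def grade_is_correct(grade_text):
    """True only for a CORRECT verdict. Note that 'INCORRECT' contains 'CORRECT',
    so a plain substring check would give the wrong result."""
    words = grade_text.strip().upper().split()
    return bool(words) and words[-1] == "CORRECT"

# Hypothetical grader outputs, in the same shape as graded_outputs above
sample_graded_outputs = [
    {"text": "CORRECT"},
    {"text": " INCORRECT"},
    {"text": "GRADE: CORRECT"},
]

num_correct = sum(grade_is_correct(g["text"]) for g in sample_graded_outputs)
accuracy = num_correct / len(sample_graded_outputs)
print(f"{num_correct}/{len(sample_graded_outputs)} correct ({accuracy:.0%})")
```

The same helper could be applied to the real `graded_outputs` list once its output format has been confirmed.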

## Agents

An LLM is generally thought of as a knowledge store: it learns a lot of information from the internet, so it can answer when you ask it questions. Agents extend this capability by using the LLM as a reasoning engine that decides which tools to use.

When using agents, set temperature=0 to reduce randomness, since we want the reasoning engine to be as good and precise as possible.

For the example below, we use built-in LangChain tools: llm-math and Wikipedia

In [None]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

import warnings
warnings.filterwarnings("ignore")

In [None]:
# account for deprecation of LLM model
import datetime
# Get the current date
current_date = datetime.datetime.now().date()

# Define the date after which the model should be set to "gpt-3.5-turbo"
target_date = datetime.date(2024, 6, 12)

# Set the model variable based on the current date
if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"

In [None]:
#!pip install -U wikipedia

In [None]:
# Import libraries
from langchain.agents.agent_toolkits import create_python_agent
from langchain.agents import load_tools, initialize_agent
from langchain.agents import AgentType
from langchain.tools.python.tool import PythonREPLTool
from langchain.python import PythonREPL
from langchain.chat_models import ChatOpenAI

In [None]:
# Init LLM and tools
llm = ChatOpenAI(temperature=0, model=llm_model)
tools = load_tools(["llm-math","wikipedia"], llm=llm)
# The llm-math tool is itself a chain that uses an LLM in conjunction with a calculator to do math problems
# The Wikipedia tool is an API that connects to Wikipedia, letting the agent run search queries and get back the results

In [None]:
# Init Agent
agent= initialize_agent(
    tools, 
    llm, 
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, # agent type: CHAT means the agent works with chat models; REACT is a prompting technique for getting the best reasoning performance (Thought/Action/Observation framework)
    handle_parsing_errors=True, # asks the LLM to correct any mis-formatted text it generated
    verbose = True)
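The Thought/Action/Observation loop behind a ReAct-style agent can be sketched as follows. This is a simplified illustration, not LangChain's implementation: `scripted_llm` is a hypothetical stand-in that replies from a fixed script, and `calculator` and `run_react` are names made up for this sketch.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def calculator(expression):
    """Toy tool: evaluates 'a op b' for +, -, *, / without using eval()."""
    a, op, b = re.fullmatch(r"\s*([\d.]+)\s*([-+*/])\s*([\d.]+)\s*", expression).groups()
    return str(OPS[op](float(a), float(b)))

def scripted_llm(transcript):
    """Hypothetical stand-in for the model: picks a tool first, then answers."""
    if "Observation:" not in transcript:
        return ("Thought: I need to compute 25% of 300.\n"
                "Action: calculator\n"
                "Action Input: 0.25 * 300")
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I now know the answer.\nFinal Answer: {observation}"

def run_react(question, llm, tools, max_steps=5):
    """Alternate between model replies and tool calls until a final answer appears."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += "\n" + reply
        final = re.search(r"Final Answer:\s*(.*)", reply)
        if final:
            return final.group(1).strip(), transcript
        # Parse which tool the model chose and what input it wants to send
        action = re.search(r"Action:\s*(.*)", reply).group(1).strip()
        action_input = re.search(r"Action Input:\s*(.*)", reply).group(1).strip()
        observation = tools[action](action_input)
        transcript += f"\nObservation: {observation}"
    raise RuntimeError("no final answer within max_steps")

answer, transcript = run_react("What is 25% of 300?", scripted_llm, {"calculator": calculator})
print(transcript)
```

The model never does the arithmetic itself. It only decides which tool to call and with what input, and the loop feeds the tool's result back as an observation.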

In [None]:
# Math Example
agent("What is 25% of 300?")

In [None]:
# Wiki Example
question = "Tom M. Mitchell is an American computer scientist \
and the Founders University Professor at Carnegie Mellon University (CMU). \
What book did he write?"
result = agent(question) 

##### Python - Agent : Use an LLM to write and execute code

Below is a ReAct-based coding agent that generates and runs Python code

In [None]:
agent = create_python_agent(
    llm,
    tool=PythonREPLTool(), # REPL is a way to interact with python interpreter
    verbose=True
)

In [None]:
customer_list = [["Harrison", "Chase"], 
                 ["Lang", "Chain"],
                 ["Dolly", "Too"],
                 ["Elle", "Elem"], 
                 ["Geoff","Fusion"], 
                 ["Trance","Former"],
                 ["Jen","Ayai"]]

# Give the agent a list of names and ask it to sort them using Python
import langchain

langchain.debug=True
agent.run(f"""Sort these customers by \
last name and then first name \
and print the output: {customer_list}""") 
langchain.debug=False
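To check the agent's output, the expected result can be computed directly. This is a plain-Python reference for comparison, not what the agent runs internally.

```python
customer_list = [["Harrison", "Chase"],
                 ["Lang", "Chain"],
                 ["Dolly", "Too"],
                 ["Elle", "Elem"],
                 ["Geoff", "Fusion"],
                 ["Trance", "Former"],
                 ["Jen", "Ayai"]]

# Sort by last name (index 1), breaking ties by first name (index 0)
expected = sorted(customer_list, key=lambda name: (name[1], name[0]))
print(expected)
```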

##### Custom Agent : Connect to your own sources & tools (information, api, data)

In [None]:
from langchain.agents import tool
from datetime import date

In [None]:
# the tool decorator turns the function into a tool that LangChain can use
# the agent uses the docstring to decide when to call the tool, so it should be very detailed
@tool
def time(text: str) -> str:
    """Returns today's date, use this for any \
    questions related to knowing today's date. \
    The input should always be an empty string, \
    and this function will always return today's \
    date - any date mathematics should occur \
    outside this function."""
    return str(date.today())
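To see why the docstring matters, here is a minimal sketch of what a tool decorator might do conceptually. `SimpleTool` and `simple_tool` are hypothetical names written for this illustration; they are not LangChain's `tool` decorator, which does considerably more.

```python
from datetime import date

class SimpleTool:
    """Hypothetical minimal tool record: a name, a description, and a callable."""

    def __init__(self, func):
        self.name = func.__name__
        # Collapse the docstring's whitespace into a one-line description
        self.description = " ".join((func.__doc__ or "").split())
        self.func = func

    def run(self, tool_input):
        return self.func(tool_input)

def simple_tool(func):
    """Hypothetical decorator that wraps a function as a SimpleTool."""
    return SimpleTool(func)

@simple_tool
def today(text: str) -> str:
    """Returns today's date. The input should always be an empty string."""
    return str(date.today())

# What an agent prompt might list for the model to choose from
tool_listing = f"{today.name}: {today.description}"
print(tool_listing)
print(today.run(""))
```

The name and description are the only things the model sees when choosing a tool, so a vague docstring makes it harder for the agent to pick the right one.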

In [None]:
# Build an agent to get today's date
agent= initialize_agent(
    tools + [time], 
    llm, 
    agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors=True,
    verbose = True)

In [None]:
# The agent will sometimes come to the wrong conclusion (agents are a work in progress!). If it does, try running it again.
try:
    result = agent("What's the date today?") 
except Exception as e: 
    print(f"exception on external access: {e}")