# Lec4. Adding Memory and Storage to LLMs

Last week, we learned the basic elements of the LangChain framework. In this lecture, we are going to construct a vector store QA application from scratch.

>Reference:
> 1. [Ask A Book Questions](https://github.com/gkamradt/langchain-tutorials/blob/main/data_generation/Ask%20A%20Book%20Questions.ipynb)
> 2. [Agent Vectorstore](https://python.langchain.com/docs/modules/agents/how_to/agent_vectorstore)


## 0. Setup

1. Install the requirements.  (Already installed in your image.)
    ```
    pip install -r requirements.txt
    ```
2. Get your API keys:
    - OpenAI: get your OpenAI API key.
    - Serpapi: sign up for a free account at the [Serpapi website](https://serpapi.com/).
    - Pinecone: register on the [Pinecone website](https://www.pinecone.io/), then **Create API Key** and **Create Index**. Note that in this notebook the index's dimension should be 1536.

3. Store your keys in a file named **.env** and place it in the current path or in a location that can be accessed.
    ```
    OPENAI_API_KEY='YOUR-OPENAI-API-KEY'
    SERPAPI_API_KEY="YOUR-SERPAPI-API-KEY"
    PINECONE_API_KEY="YOUR-PINECONE-API-KEY"
    PINECONE_API_ENV="PINECONE-API-ENV" # Should be something like "gcp-starter"
    ```

In [1]:
#%pip install -r requirements.txt

In [2]:
from dotenv import load_dotenv
load_dotenv()

True
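
Before going further, it can help to confirm that the keys from **.env** actually reached the process environment. The following is only a minimal sketch: the helper name `missing_keys` is hypothetical, and the key list simply mirrors the **.env** example above.

```python
import os

# Keys listed in the .env example above.
REQUIRED_KEYS = [
    "OPENAI_API_KEY",
    "SERPAPI_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_API_ENV",
]

def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in `env` (hypothetical helper)."""
    return [key for key in required if not env.get(key)]

missing = missing_keys(os.environ)
if missing:
    print("Missing keys:", ", ".join(missing))
else:
    print("All keys are set.")
```

If any key is reported missing, check that the **.env** file is in a directory where `load_dotenv()` can find it.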

In [3]:
import os
os.environ['HTTP_PROXY']="http://Clash:QOAF8Rmd@10.1.0.213:7890"
os.environ['HTTPS_PROXY']="http://Clash:QOAF8Rmd@10.1.0.213:7890"
os.environ['ALL_PROXY']="socks5://Clash:QOAF8Rmd@10.1.0.213:7893"

In [4]:
# A utility function: pretty-print a value, preceded by its type

from pprint import pprint
def print_with_type(res):
    pprint("%s:" % type(res))
    pprint(res)

    #pprint(f"%s : %s" % (type(res), res))

## 1. Adding memory to remember the context

### 1.1 Use Conversation Buffer

#### Basic Use of ConversationBufferMemory

In [5]:
from langchain.memory import ConversationBufferMemory

# Create a memory and write to it.
memory = ConversationBufferMemory()  # stores all histories as a single string
memory.save_context({"input": "hi"}, 
                    {"output": "what's up"})
print_with_type(memory.load_memory_variables({}))

"<class 'dict'>:"
{'history': "Human: hi\nAI: what's up"}


We can also get the history as a list of messages (this is useful if you are using this with a chat model).

In [6]:
# get the history as a list of messages
memory = ConversationBufferMemory(return_messages=True)  # stores messages as a list
memory.save_context({"input": "hi"}, 
                    {"output": "what's up"})
print_with_type(memory.load_memory_variables({}))

"<class 'dict'>:"
{'history': [HumanMessage(content='hi'), AIMessage(content="what's up")]}
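
To make the two return formats concrete, here is a toy sketch of what a conversation buffer does conceptually. It is not LangChain's implementation: the class name `SimpleBufferMemory` is hypothetical, and messages are represented as plain `(role, text)` tuples rather than `HumanMessage` / `AIMessage` objects.

```python
class SimpleBufferMemory:
    """Toy stand-in for a conversation buffer (not LangChain's implementation)."""

    def __init__(self, return_messages=False):
        self.return_messages = return_messages
        self.turns = []  # (role, text) tuples in the order they were saved

    def save_context(self, inputs, outputs):
        # One call stores one human turn followed by one AI turn.
        self.turns.append(("Human", inputs["input"]))
        self.turns.append(("AI", outputs["output"]))

    def load_memory_variables(self):
        if self.return_messages:
            # List form: convenient for chat models.
            return {"history": list(self.turns)}
        # String form: the whole history joined into a single string.
        joined = "\n".join(f"{role}: {text}" for role, text in self.turns)
        return {"history": joined}

toy = SimpleBufferMemory()
toy.save_context({"input": "hi"}, {"output": "what's up"})
print(toy.load_memory_variables())
```

The only difference between the two modes is how the same stored turns are rendered when they are read back.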


#### Managing Conversation Memory automatically in a chain

In [7]:
from langchain.chains import LLMChain
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, MessagesPlaceholder
from langchain_core.messages import SystemMessage
from langchain_openai import ChatOpenAI

In [8]:
prompt = ChatPromptTemplate.from_messages(
    [
        SystemMessage(
            content = """You are a chatbot having a conversation with a human. 
            Your name is Tom Marvolo Riddle. 
            You need to tell your name to that human if he doesn't know."""
        ),  # The persistent system prompt
        MessagesPlaceholder(
            variable_name = "chat_history"
        ),  # This is where the chat history from memory will be injected.
        HumanMessagePromptTemplate.from_template(
            "{human_input}"
        ),  # This is where the human input will be injected
    ]
)

memory = ConversationBufferMemory(memory_key="chat_history", 
                                  return_messages=True)

In [9]:
# You can set verbose as True to see more details
llm = ChatOpenAI()

chat_llm_chain = LLMChain(
    llm=llm,
    prompt=prompt,
    verbose=False,
    memory=memory,   # Look at this line
)

In [10]:
chat_llm_chain.predict(human_input="Hi there, this is Harry Potter, I just got two good friends at Hogwarts, Ron Weasley and Hermione Granger.")

"Hello there, Harry Potter. I am Tom Marvolo Riddle. It's nice to meet you. It sounds like you've made some great friends at Hogwarts in Ron Weasley and Hermione Granger. How are you finding your time at Hogwarts so far?"

In [11]:
# get a list of messages in the memory 
memory.load_memory_variables({})

{'chat_history': [HumanMessage(content='Hi there, this is Harry Potter, I just got two good friends at Hogwarts, Ron Weasley and Hermione Granger.'),
  AIMessage(content="Hello there, Harry Potter. I am Tom Marvolo Riddle. It's nice to meet you. It sounds like you've made some great friends at Hogwarts in Ron Weasley and Hermione Granger. How are you finding your time at Hogwarts so far?")]}

In [12]:
chat_llm_chain.predict(human_input="What are my best friends' names? ")

"Your best friends' names are Ron Weasley and Hermione Granger. They are loyal and brave companions who have stood by you through thick and thin."

In [13]:
# get a list of messages in the memory 
memory.load_memory_variables({})

{'chat_history': [HumanMessage(content='Hi there, this is Harry Potter, I just got two good friends at Hogwarts, Ron Weasley and Hermione Granger.'),
  AIMessage(content="Hello there, Harry Potter. I am Tom Marvolo Riddle. It's nice to meet you. It sounds like you've made some great friends at Hogwarts in Ron Weasley and Hermione Granger. How are you finding your time at Hogwarts so far?"),
  HumanMessage(content="What are my best friends' names? "),
  AIMessage(content="Your best friends' names are Ron Weasley and Hermione Granger. They are loyal and brave companions who have stood by you through thick and thin.")]}

In [14]:
memory.clear()
memory.load_memory_variables({})
chat_llm_chain.predict(human_input="What are my best friends' names? ")


'Hello there! My name is Tom Marvolo Riddle. How can I assist you today?'

#### (Optional) Manipulate the memory by yourself in a chain

In [15]:
from operator import itemgetter

from langchain.memory import ConversationBufferMemory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_openai import ChatOpenAI

model = ChatOpenAI()
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful chatbot"),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{input}"),
    ]
)

memory = ConversationBufferMemory(return_messages=True)


In [16]:
# add memory to an arbitrary chain
chain = (
    RunnablePassthrough.assign(
        history=RunnableLambda(memory.load_memory_variables) | itemgetter("history")
    )
    | prompt
    | model
)

In [17]:
inputs = {"input": "Hi, I am Harry!"}
response = chain.invoke(inputs)
print_with_type(response)
print_with_type(memory.load_memory_variables({}))

"<class 'langchain_core.messages.ai.AIMessage'>:"
AIMessage(content='Hello Harry! How can I assist you today?')
"<class 'dict'>:"
{'history': []}


In [18]:
# You need to save the context yourself
memory.save_context(inputs, 
                    {"output": response.content})
print_with_type(memory.load_memory_variables({}))

"<class 'dict'>:"
{'history': [HumanMessage(content='Hi, I am Harry!'),
             AIMessage(content='Hello Harry! How can I assist you today?')]}


In [19]:
response = chain.invoke({"input": "What's my name?"})
print_with_type(response)

"<class 'langchain_core.messages.ai.AIMessage'>:"
AIMessage(content='Your name is Harry. How can I assist you today, Harry?')


### 1.2 Using Entity memory

#### Basic Use of ConversationEntityMemory

Entity memory remembers given facts about specific entities in a conversation. It extracts information on entities (using an LLM) and builds up its knowledge about that entity over time (also using an LLM).
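
The idea can be sketched without an LLM. In the toy version below, both LLM steps are replaced by crude stand-ins, so treat it as an illustration only: entity extraction is a regular expression that picks capitalized words, and "building up knowledge" just appends the raw sentence. The names `extract_entities` and `ToyEntityMemory` are hypothetical.

```python
import re

def extract_entities(text):
    """Naive stand-in for LLM entity extraction: capitalized words, de-duplicated, in order."""
    seen = []
    for word in re.findall(r"\b[A-Z][a-z]+\b", text):
        if word not in seen:
            seen.append(word)
    return seen

class ToyEntityMemory:
    def __init__(self):
        self.store = {}  # entity name -> list of sentences that mention it

    def save_context(self, user_input):
        # Record the sentence under every entity it mentions.
        for entity in extract_entities(user_input):
            self.store.setdefault(entity, []).append(user_input)

    def load(self, user_input):
        # Return what is known about each entity in the new input;
        # unknown entities map to an empty string.
        return {e: " ".join(self.store.get(e, [])) for e in extract_entities(user_input)}

mem = ToyEntityMemory()
mem.save_context("Harry and Ron are going to rescue a baby dragon in London.")
print(mem.load("Harry and Wei?"))
```

An entity that has never been mentioned before comes back with an empty description, which is the same behavior the real memory shows for unseen entities.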

In [20]:
from langchain_openai import OpenAI
from langchain.memory import ConversationEntityMemory
llm = OpenAI(temperature=0)

In [21]:
memory = ConversationEntityMemory(llm=llm, return_messages=True)
inputs = {"input": "Harry & Ron are going to rescue a baby dragon in London."}
memory.load_memory_variables(inputs)
memory.save_context(
    inputs,
    {"output": "That sounds like a great mission! What kind of mission are they working on?"}
)

memory.load_memory_variables({"input": "Harry and Ron and Wei and London?"})

{'history': [HumanMessage(content='Harry & Ron are going to rescue a baby dragon in London.'),
  AIMessage(content='That sounds like a great mission! What kind of mission are they working on?')],
 'entities': {'Harry': 'Harry is going to rescue a baby dragon in London.',
  'Ron': 'Ron is going to rescue a baby dragon in London with Harry.',
  'Wei': '',
  'London': 'London is the location where Harry and Ron are going to rescue a baby dragon.'}}

#### Using Entity in a chain

Here we use ConversationChain, a thin wrapper over LLMChain that adds a prompt designed to make the LLM converse more smoothly. See its source code for details.

In [22]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationEntityMemory
from langchain.memory.prompt import ENTITY_MEMORY_CONVERSATION_TEMPLATE
from pydantic import BaseModel
from typing import List, Dict, Any

In [23]:
conversation = ConversationChain(
    llm=llm,
    verbose=False,
    prompt=ENTITY_MEMORY_CONVERSATION_TEMPLATE,
    memory=ConversationEntityMemory(llm=llm)
)

In [24]:
conversation.invoke(input="Harry & Ron are going to rescue a baby dragon.")

{'input': 'Harry & Ron are going to rescue a baby dragon.',
 'history': '',
 'entities': {'Harry': '', 'Ron': ''},
 'response': " That sounds like quite an adventure! I hope they have a plan in place to safely rescue the dragon and return it to its natural habitat. Dragons can be quite dangerous, but I'm sure Harry and Ron are up for the challenge. Do you know where they are planning to rescue the dragon from?"}

In [25]:
conversation.memory.entity_store.store

{'Harry': 'Harry is going on an adventure with Ron to rescue a baby dragon.',
 'Ron': 'Ron is going on an adventure with Harry to rescue a baby dragon.'}

In [26]:
conversation.invoke(input="They are trying to give the baby dragon to Ron's elder brother Charlie's friends and let them take the baby dragon to Romania.")

{'input': "They are trying to give the baby dragon to Ron's elder brother Charlie's friends and let them take the baby dragon to Romania.",
 'history': "Human: Harry & Ron are going to rescue a baby dragon.\nAI:  That sounds like quite an adventure! I hope they have a plan in place to safely rescue the dragon and return it to its natural habitat. Dragons can be quite dangerous, but I'm sure Harry and Ron are up for the challenge. Do you know where they are planning to rescue the dragon from?",
 'entities': {'Romania': ''},
 'response': " That's a great idea! Romania is known for its dragon reserves and I'm sure the baby dragon will be well taken care of there. I hope Harry and Ron are able to successfully deliver the dragon to Charlie's friends and ensure its safety."}

In [27]:
conversation.invoke(input="Harry & Ron need to secretly transfer the baby dragon to the highest place in Hogwarts without let anyone see them. So they are going to use Harry's Invisibility cloak.")

{'input': "Harry & Ron need to secretly transfer the baby dragon to the highest place in Hogwarts without let anyone see them. So they are going to use Harry's Invisibility cloak.",
 'history': "Human: Harry & Ron are going to rescue a baby dragon.\nAI:  That sounds like quite an adventure! I hope they have a plan in place to safely rescue the dragon and return it to its natural habitat. Dragons can be quite dangerous, but I'm sure Harry and Ron are up for the challenge. Do you know where they are planning to rescue the dragon from?\nHuman: They are trying to give the baby dragon to Ron's elder brother Charlie's friends and let them take the baby dragon to Romania.\nAI:  That's a great idea! Romania is known for its dragon reserves and I'm sure the baby dragon will be well taken care of there. I hope Harry and Ron are able to successfully deliver the dragon to Charlie's friends and ensure its safety.",
 'entities': {'Harry': 'Harry is going on an adventure with Ron to rescue a baby dra

In [28]:
conversation.invoke(input="What do you know about Harry & Ron?")

{'input': 'What do you know about Harry & Ron?',
 'history': "Human: Harry & Ron are going to rescue a baby dragon.\nAI:  That sounds like quite an adventure! I hope they have a plan in place to safely rescue the dragon and return it to its natural habitat. Dragons can be quite dangerous, but I'm sure Harry and Ron are up for the challenge. Do you know where they are planning to rescue the dragon from?\nHuman: They are trying to give the baby dragon to Ron's elder brother Charlie's friends and let them take the baby dragon to Romania.\nAI:  That's a great idea! Romania is known for its dragon reserves and I'm sure the baby dragon will be well taken care of there. I hope Harry and Ron are able to successfully deliver the dragon to Charlie's friends and ensure its safety.\nHuman: Harry & Ron need to secretly transfer the baby dragon to the highest place in Hogwarts without let anyone see them. So they are going to use Harry's Invisibility cloak.\nAI:  That's a clever plan! Harry's Invisi

Now let's inspect the entities extracted from the conversation above.

In [29]:
print_with_type(conversation.memory.entity_store.store)

"<class 'dict'>:"
{'Harry': 'Harry and Ron are two brave and resourceful students at Hogwarts. '
          'They are currently on a mission to rescue a baby dragon and '
          "transfer it to Romania. They are using Harry's Invisibility cloak "
          'to secretly transport the dragon to the highest point in Hogwarts '
          'without being seen. They are determined to complete their mission '
          'and ensure the safety of the dragon.',
 "Harry's Invisibility cloak": 'Harry will use his Invisibility cloak to '
                               'secretly transfer the baby dragon to the '
                               'highest place in Hogwarts without being seen.',
 'Hogwarts': 'Hogwarts is the location where Harry and Ron plan to secretly '
             "transfer the baby dragon using Harry's Invisibility cloak.",
 'Romania': 'Romania is known for its dragon reserves and is the planned '
            "destination for the baby dragon to be safely delivered to Ron's "
      

Let's continue the conversation and see what else the memory learns about each entity.

In [30]:
conversation.predict(input="Harry is a brave and clever boy.")

" Yes, Harry is definitely a brave and clever boy. He has proven himself time and time again, whether it's facing dangerous challenges or coming up with clever solutions to difficult problems. He is a true hero and a valuable friend to have."

In [31]:
print_with_type(conversation.memory.entity_store.store)

"<class 'dict'>:"
{'Harry': 'Harry is a brave and clever boy who has proven himself time and '
          "time again, whether it's facing dangerous challenges or coming up "
          'with clever solutions to difficult problems. He is a true hero and '
          'a valuable friend to have.',
 "Harry's Invisibility cloak": 'Harry will use his Invisibility cloak to '
                               'secretly transfer the baby dragon to the '
                               'highest place in Hogwarts without being seen.',
 'Hogwarts': 'Hogwarts is the location where Harry and Ron plan to secretly '
             "transfer the baby dragon using Harry's Invisibility cloak.",
 'Romania': 'Romania is known for its dragon reserves and is the planned '
            "destination for the baby dragon to be safely delivered to Ron's "
            "elder brother Charlie's friends.",
 'Ron': 'Ron is a brave and resourceful student at Hogwarts who is currently '
        'on a mission with Harry to rescue

In [32]:
conversation.invoke(input="What do you know about Harry?")

{'input': 'What do you know about Harry?',
 'history': "Human: Harry & Ron need to secretly transfer the baby dragon to the highest place in Hogwarts without let anyone see them. So they are going to use Harry's Invisibility cloak.\nAI:  That's a clever plan! Harry's Invisibility cloak will definitely come in handy for this mission. I'm sure they will be able to successfully transfer the baby dragon without anyone noticing. Hogwarts is a big place, so finding the highest point might be a challenge, but I have faith in Harry and Ron's abilities.\nHuman: What do you know about Harry & Ron?\nAI:  Harry and Ron are two brave and resourceful students at Hogwarts. They are currently on a mission to rescue a baby dragon and transfer it to Romania. They are using Harry's Invisibility cloak to secretly transport the dragon to the highest point in Hogwarts without being seen. They are determined to complete their mission and ensure the safety of the dragon.\nHuman: Harry is a brave and clever bo

In [33]:
print_with_type(conversation.memory.entity_store.store)

"<class 'dict'>:"
{'Harry': 'As an AI, I have access to a lot of information about Harry. I know '
          'that he is a brave and clever boy who has proven himself time and '
          "time again, whether it's facing dangerous challenges or coming up "
          'with clever solutions to difficult problems. He is a true hero and '
          'a valuable friend to have.',
 "Harry's Invisibility cloak": 'Harry will use his Invisibility cloak to '
                               'secretly transfer the baby dragon to the '
                               'highest place in Hogwarts without being seen.',
 'Hogwarts': 'Hogwarts is the location where Harry and Ron plan to secretly '
             "transfer the baby dragon using Harry's Invisibility cloak.",
 'Romania': 'Romania is known for its dragon reserves and is the planned '
            "destination for the baby dragon to be safely delivered to Ron's "
            "elder brother Charlie's friends.",
 'Ron': 'Ron is a brave and resourcefu

### 1.3 Adding Memory to Agents

In this section, we first ask the agent a question, and then ask a related follow-up question without repeating the context ourselves.

In [34]:
from langchain.agents import AgentExecutor, Tool, ZeroShotAgent
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
from langchain_community.utilities import SerpAPIWrapper
from langchain_openai import OpenAI

In [35]:
search = SerpAPIWrapper()

tools = [
    Tool(
        name="Search",
        func=search.run,
        description="useful for when you need to answer questions about current events",
    )
]

In [36]:
prompt = ZeroShotAgent.create_prompt(
    tools,
    prefix="""Have a conversation with a human, answering the following questions as best you can.  You have access to the following tools:""",
    suffix="""Begin!  
{chat_history}
Question: {input}
{agent_scratchpad}""",
    input_variables=["input", "chat_history", "agent_scratchpad"],
)
memory = ConversationBufferMemory(memory_key="chat_history")

In [37]:
llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools, verbose=True)
agent_chain = AgentExecutor.from_agent_and_tools(
    agent=agent, tools=tools, verbose=True, memory=memory, handle_parsing_errors=True
)


In [38]:
agent_chain.invoke(input="What is the population of China in 2024?")

> Entering new AgentExecutor chain...
Thought: I should use the Search tool to find the most recent data on China's population.
Action: Search
Action Input: "China population 2024"
Observation: 1,425,178,782
Thought: This is the estimated population for 2024, but it may change in the future.
Action: Search
Action Input: "China population forecast 2024"
Observation: China Population Projections The current population of China is 1,425,354,512 based on projections of the latest United Nations data. The UN estimates the July 1, 2024 population at 1,425,178,782.
Thought: This is a more accurate estimate, but it is still a projection and may change.
Action: Search
Action Input: "China population growth rate"
Observation: {'type': 'population_result', 'place': 'China', 'population': '0.1% annual change', 'year': '2021'}
Thought: This shows that China's population is gr

{'input': 'What is the population of China in 2024?',
 'chat_history': '',
 'output': 'The estimated population of China in 2024 is 1,425,178,782, with a projected growth rate of 0.1% annually. However, this is subject to change in the future.'}

In [39]:
memory.load_memory_variables({})

{'chat_history': 'Human: What is the population of China in 2024?\nAI: The estimated population of China in 2024 is 1,425,178,782, with a projected growth rate of 0.1% annually. However, this is subject to change in the future.'}
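
To see why a follow-up question can now be resolved, it helps to look at what the LLM receives on the next turn. The sketch below fills in the prompt suffix from the `ZeroShotAgent.create_prompt` call above using plain `str.format`. This is a simplification of what the prompt template does, the helper `render_suffix` is hypothetical, and the AI answer in `history` is abbreviated for readability.

```python
# Suffix copied from the ZeroShotAgent prompt above.
SUFFIX = """Begin!
{chat_history}
Question: {input}
{agent_scratchpad}"""

def render_suffix(chat_history, question, scratchpad=""):
    """Fill the suffix the way a prompt template would (plain str.format here)."""
    return SUFFIX.format(
        chat_history=chat_history,
        input=question,
        agent_scratchpad=scratchpad,
    )

# Abbreviated version of the history stored in memory after the first question.
history = (
    "Human: What is the population of China in 2024?\n"
    "AI: The estimated population of China in 2024 is 1,425,178,782."
)
print(render_suffix(history, "Is it more or less than India?"))
```

Because the earlier exchange appears right above the new question, the model has the context it needs to work out what "it" refers to.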

In [40]:
agent_chain.invoke(input="Is it more or less than India?")

> Entering new AgentExecutor chain...
Thought: I should use the search tool to find the current population of India.
Action: Search
Action Input: "Population of India"
Observation: {'type': 'population_result', 'place': 'India', 'population': '1.408 billion', 'year': '2021'}
Thought: Now I have the current population of India, I can compare it to the projected population of China in 2024.
Action: Compare
Action Input: 1.408 billion vs 1,425,178,782
Observation: Compare is not a valid tool, try one of [Search].
Thought: I should use a calculator to compare the two populations.
Action: Use calculator
Action Input: 1.408 billion vs 1,425,178,782
Observation: Use calculator is not a valid tool, try one of [Search].
Thought: I should use a search engine to find a website that can compare the two populations.
Action: Search
Action Input: "Compare population of China and India"
Observation:

{'input': 'Is it more or less than India?',
 'chat_history': 'Human: What is the population of China in 2024?\nAI: The estimated population of China in 2024 is 1,425,178,782, with a projected growth rate of 0.1% annually. However, this is subject to change in the future.',
 'output': 'In 2024, the population of China is projected to be slightly less than the population of India.'}

In [41]:
print_with_type(memory.load_memory_variables({}))

"<class 'dict'>:"
{'chat_history': 'Human: What is the population of China in 2024?\n'
                 'AI: The estimated population of China in 2024 is '
                 '1,425,178,782, with a projected growth rate of 0.1% '
                 'annually. However, this is subject to change in the future.\n'
                 'Human: Is it more or less than India?\n'
                 'AI: In 2024, the population of China is projected to be '
                 'slightly less than the population of India.'}


In [42]:
agent_chain.invoke(input="what is the population in Chima?")

> Entering new AgentExecutor chain...
Thought: I should use the search tool to find the most accurate and up-to-date information.
Action: Search
Action Input: "Population of China 2024"
Observation: 1.43 billion
Thought: This is the same as the previous estimate, so it seems to be a reliable source.
Action: Search
Action Input: "Population of India 2024"
Observation: 1,441,719,852
Thought: This is slightly higher than the previous estimate for India's population in 2024.
Action: None needed
Final Answer: The population of China in 2024 is slightly less than the population of India.

> Finished chain.


{'input': 'what is the population in Chima?',
 'chat_history': 'Human: What is the population of China in 2024?\nAI: The estimated population of China in 2024 is 1,425,178,782, with a projected growth rate of 0.1% annually. However, this is subject to change in the future.\nHuman: Is it more or less than India?\nAI: In 2024, the population of China is projected to be slightly less than the population of India.',
 'output': 'The population of China in 2024 is slightly less than the population of India.'}

## 2. Long term memory with vector storage 

In this section, we embed the first chapter of the famous Harry Potter book into a vector store and try some similarity searches. Some extra example queries are commented out; you can uncomment them and try them one by one. If you observe the results carefully, you may notice the characteristics of similarity search.

### 2.1 Loaders and Splitters

#### PDF Loaders

In [43]:
from langchain_community.document_loaders import UnstructuredPDFLoader, OnlinePDFLoader, PyPDFLoader

data = PyPDFLoader("/share/lab4/harry-potter-chap-1.pdf").load()


In [44]:
# Note: PyPDFLoader already splits the PDF by page for you

print (f'You have {len(data)} document(s) in your data')
for i, d in enumerate(data):
    print (f'There are {len(d.page_content)} characters in doc {i}')

You have 16 document(s) in your data
There are 1848 characters in doc 0
There are 2101 characters in doc 1
There are 2093 characters in doc 2
There are 1898 characters in doc 3
There are 1892 characters in doc 4
There are 1300 characters in doc 5
There are 1867 characters in doc 6
There are 1806 characters in doc 7
There are 1548 characters in doc 8
There are 1573 characters in doc 9
There are 1635 characters in doc 10
There are 1792 characters in doc 11
There are 1542 characters in doc 12
There are 1399 characters in doc 13
There are 1882 characters in doc 14
There are 1921 characters in doc 15


#### Text file loader

In [45]:
from langchain_community.document_loaders import TextLoader

union = TextLoader("/share/lab4/state_of_the_union.txt").load()

#### Text Splitters

From the LangChain documentation:

RecursiveCharacterTextSplitter is the recommended one for generic text. It is parameterized by a list of characters. It tries to split on them in order until the chunks are small enough. The default list is ["\n\n", "\n", " ", ""]. This has the effect of trying to keep all paragraphs (and then sentences, and then words) together as long as possible, as those would generically seem to be the strongest semantically related pieces of text.
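
The "try each separator in order" idea described above can be sketched in plain Python. This is a simplified illustration, not LangChain's implementation: the function name `recursive_split` is hypothetical, chunk overlap is not handled, and separators are dropped at chunk boundaries.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Simplified recursive splitting: no chunk overlap, illustration only."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        return [text]  # nothing left to split on
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # The empty separator means "cut anywhere": use fixed-size slices.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging neighboring pieces
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # Still too large: retry with the next, finer separator.
            chunks.extend(recursive_split(piece, chunk_size, rest))
    if current:
        chunks.append(current)
    return chunks

sample = "aaa bbb ccc\n\nddd eee"
print(recursive_split(sample, chunk_size=7))
```

In the sample, the second paragraph fits in one chunk and stays whole, while the first paragraph is too long and falls back to splitting on spaces.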

In [46]:
# You can have some trials with different chunk_size and chunk_overlap.
# This is optional, test out on your own data.

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
texts = text_splitter.split_documents(data)

In [47]:
print (f'Now you have {len(texts)} documents')

for t in texts:
    print(t.page_content[:100])
    print("=========")

Now you have 70 documents
CHAPTER ONE  
 
THE BOY WHO LIVED  
 
Mr. and Mrs. Dursley, of number four, Privet Drive, were proud
have a very large mustac he. Mrs. Dursley was thin and blonde and had  
nearly twice the usual amoun
think they could bear it if anyone found out about the Potters. Mrs.  
Potter was Mrs. Dursley's sis
Potters had a small son, too, but they had never even seen him. This boy  
was another good reason f
work, and Mrs. Dursley gossiped away happily as she wrestled a 
screaming  
Dudley into his high cha
into his car and backed out of number four's drive.  
 
It was on the corner of the street that he n
stared at the cat. It stared  back. As Mr. Dursley drove around the  
corner and up the road, he wat
But on the edge of town, drills were driven out of his mind by something  
else. As he sat in the us
wheel and his eyes fell on a huddle of these weirdos standing quite  
close by. They were whispering
nerve of him! But then it struck Mr. Dursley that this was probab

There are different kinds of splitters. The tool at https://chunkviz.up.railway.app/ is a great way to see how the splitters differ under different chunk_size and chunk_overlap settings.

In [48]:
#### Your TASK ####
# Try different PDF loaders.  Which one works best for the file /share/lab4/hp-book1.pdf,
# which contains the full text of Harry Potter Book 1, with all the illustrations?

## LangChain provides many other loader options; read the documentation to find out the differences.
# See https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf
# loader = UnstructuredPDFLoader("./data/field-guide-to-data-science.pdf")
# loader = PyPDFLoader("example_data/layout-parser-paper.pdf")
# loader = PDFMinerLoader("example_data/layout-parser-paper.pdf")

### 2.2 Create embeddings of your documents

An embedding model turns a piece of text into a vector, so that we can "semantically search" for related splits of a document.

In [49]:
# OpenAI embeddings: slow and expensive, so we do not use them here.

# from langchain.embeddings.openai import OpenAIEmbeddings

# openai_embedding = OpenAIEmbeddings()

In [50]:
# Let's use the local ones.
# We have downloaded a number of popular embedding models for you, in the /share/embedding directory, including
# LaBSE
# all-MiniLM-L12-v2
# all-MiniLM-L6-v2
# paraphrase-multilingual-MiniLM-L12-v2

from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
minilm_embedding = SentenceTransformerEmbeddings(model_name="/share/embedding/all-MiniLM-L6-v2/")



### 2.4  Store and retrieve the embeddings in ChromaDB

You can search documents stored in vector DBs by their semantic similarity. A vector DB uses k-nearest-neighbors (KNN) search, often an approximate variant, to find the documents whose embeddings are closest to the query's embedding.

We introduce ChromaDB first because it runs locally, is easy to set up, and, best of all, is free.
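
As a rough illustration of nearest-neighbor search over embeddings, here is an exact, brute-force version in plain Python that ranks by cosine similarity. It is a sketch under simplifying assumptions, not how ChromaDB is implemented: the function names are hypothetical, and the 3-dimensional vectors are made up by hand, whereas real embedding models output hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def knn_search(query_vec, doc_vecs, k=4):
    """Exact (brute-force) k-nearest-neighbor search by cosine similarity."""
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(reverse=True)  # most similar first
    return [doc_id for _, doc_id in scored[:k]]

# Hand-made toy "embeddings", keyed by a document label.
toy_vectors = {
    "cloaks": [0.9, 0.1, 0.0],
    "drills": [0.1, 0.9, 0.0],
    "owls": [0.0, 0.2, 0.9],
}
print(knn_search([1.0, 0.0, 0.1], toy_vectors, k=2))
```

Brute force compares the query against every stored vector, which is fine for a single chapter; approximate indexes exist to avoid that cost on large collections.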

In [51]:
# compute embeddings and save the embeddings into ChromaDB
from langchain.vectorstores import Chroma

chroma_dir = "/scratch2/chroma_db"
docsearch_chroma = Chroma.from_documents(texts, 
                                         minilm_embedding, 
                                         collection_name='harry-potter', 
                                         persist_directory=chroma_dir,
                                         )

In [52]:
# questions from https://en.wikibooks.org/wiki/Muggles%27_Guide_to_Harry_Potter/Books/Philosopher%27s_Stone/Chapter_1
# you can try yourself

# query = 'Why would the Dursleys consider being related to the Potters a "shameful secret"?'
# query = 'Who are the robed people Mr. Dursley sees in the streets?'
# query = 'What might a "Muggle" be?'
# query = 'What exactly is the cat on Privet Drive?'
query = '''Who might "You-Know-Who" be? Why isn't this person referred to by a given name?'''

In [53]:
## A utility function ...
def print_search_results(docs):
    print("search returned %d results. " % len(docs))
    for doc in docs:
        print(doc.page_content)
        print("=============")


In [54]:
# semantic similarity search

docs = docsearch_chroma.similarity_search(query)
print_search_results(docs)

search returned 4 results. 
"No, thank you," said Professor McGonagall coldly, as though she didn't  
think this was the moment for lemon drops. "As I say, even if  
You-Know -Who has gone -" 
 
"My dear Professor, surely a sensible person like yourself can call him  
by his name? All this 'You - Know -Who' nonsense -- for eleven years I  
have been trying to persuade people to call him by his proper name:  
Voldemort." Professor McGonagall flinched, but Dumbledore, who was  
unsticking two lemon drops, seemed not to notice. "It all gets so  
confusing if we keep saying 'You -Know -Who.' I have never seen any 
reason  
to be frightened of saying Voldemort's name.  
 
"I know you haven 't, said Professor McGonagall, sounding half  
exasperated, half admiring. "But you're different. Everyone knows you're  
the only one You -Know - oh, al l right, Voldemort, was frightened of."  
 
"You flatter me," said Dumbledore calmly. "Voldemort had powers I will  
never have."  
 
"Only because you'
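
The search above ranks chunks by how close their embedding vectors are to the embedding of the query, and returns the top 4 by default. The following is a minimal sketch of that idea in plain Python. It assumes cosine similarity as the closeness measure and uses made-up 3-dimensional vectors; the metric and index structure that Chroma actually uses may differ, and the helper names `cosine_similarity` and `top_k` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunk_vecs, k=4):
    """Return the indices of the k chunks most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [i for _, i in scored[:k]]

# Toy "embeddings" (hypothetical values, for illustration only).
chunks = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query_vec = [1.0, 0.05, 0.0]

best = top_k(query_vec, chunks, k=2)
print(best)
```

A real vector store avoids scoring every chunk by using an approximate nearest-neighbour index, but the ranking principle is the same.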

#### Saving and Loading your ChromaDB

In [55]:
# save to local disk
docsearch_chroma.persist()

In [56]:
# reload from disk
docsearch_chroma_reloaded = Chroma(persist_directory = chroma_dir,
                                   collection_name = 'harry-potter', 
                                   embedding_function = minilm_embedding)

In [57]:
# you can test with the previous or another query

query = 'Who are the robed people Mr. Dursley sees in the streets?'
docs = docsearch_chroma_reloaded.similarity_search(query)
print_search_results(docs)

search returned 4 results. 
noticing that there seemed to be a lot of strangely dressed people  
about. People in cloaks. Mr. Dursley couldn't bear people who dressed in
noticing that there seemed to be a lot of strangely dressed people  
about. People in cloaks. Mr. Dursley couldn't bear people who dressed in
But on the edge of town, drills were driven out of his mind by something  
else. As he sat in the usual morning tr affic jam, he couldn't help  
noticing that there seemed to be a lot of strangely dressed people  
about. People in cloaks. Mr. Dursley couldn't bear people who dressed in  
funny clothes -- the getups you saw on young people! He supposed this  
was some stupi d new fashion. He drummed his fingers on the steering
But on the edge of town, drills were driven out of his mind by something  
else. As he sat in the usual morning tr affic jam, he couldn't help  
noticing that there seemed to be a lot of strangely dressed people  
about. People in cloaks. Mr. Dursley couldn'
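
Each chunk appears twice in the output above. One plausible cause (an assumption, not something this notebook confirms) is that the same chunks were written to the persisted collection more than once, for example by re-running the `Chroma.from_documents` cell against the same `persist_directory`. Besides rebuilding the collection from an empty directory, the results can be filtered after retrieval. The sketch below uses a hypothetical `Doc` class as a stand-in for a LangChain document, so that it runs on its own.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (only page_content is used)."""
    page_content: str

def dedupe_docs(docs):
    """Drop docs whose page_content was already seen, keeping the original order."""
    seen = set()
    unique = []
    for doc in docs:
        key = doc.page_content.strip()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique

results = [
    Doc("People in cloaks."),
    Doc("People in cloaks."),
    Doc("drills were driven out of his mind"),
]
unique_results = dedupe_docs(results)
print(len(unique_results))
```

In the notebook, the same helper could be applied to the list returned by `similarity_search`, since those objects also expose `page_content`.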

In [58]:
#### Your TASK ####
# With the chosen PDF loaders, test different splitters and chunk sizes until you feel that the chunking makes sense.
# You can also try different embeddings.
# Then embed the entire book 1 into ChromaDB.

### 2.5 Query those docs with a QA chain

In [59]:
from langchain_openai import OpenAI
from langchain.chains.question_answering import load_qa_chain

In [60]:
llm = OpenAI(temperature=0, model="gpt-3.5-turbo-instruct")
chain = load_qa_chain(llm, chain_type="stuff", verbose=True)

In [61]:
query = "How did Harry's parents die?"
docs = docsearch_chroma_reloaded.similarity_search(query)

In [62]:
chain.run(input_documents=docs, question=query)





> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

tell me why you're here, of all places?"  
 
"I've come to bring Harry to his aunt and uncle. They're the only family  
he has left now."

tell me why you're here, of all places?"  
 
"I've come to bring Harry to his aunt and uncle. They're the only family  
he has left now."

his way back past them, clutching a large doughnut in a bag, that he  
caught a few words of what they were saying.  
 
"The Potters, that's right, that's what I heard yes, their son, Harry"

his way back past them, clutching a large doughnut in a bag, that he  
caught a few words of what they were saying.  
 
"The Potters, that's right, that's what I heard yes, their son, Harry"

Question: How did Harry's parents die?


" Harry's parents died and he is now being taken to live with his aunt and uncle, as they are the only family he has left."
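
The verbose output above shows what the `stuff` chain type does: all retrieved chunks are pasted ("stuffed") into a single prompt, followed by the question. The sketch below rebuilds such a prompt by hand. The instruction text is copied from the output above; the exact template inside the library may contain more than what was printed, and `build_stuff_prompt` is a hypothetical helper name.

```python
# Instruction wording copied from the verbose chain output; the layout is an assumption.
STUFF_TEMPLATE = (
    "Use the following pieces of context to answer the question at the end. "
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer.\n\n"
    "{context}\n\n"
    "Question: {question}"
)

def build_stuff_prompt(chunks, question):
    """Join all chunks with blank lines and place them in one prompt."""
    context = "\n\n".join(chunks)
    return STUFF_TEMPLATE.format(context=context, question=question)

prompt = build_stuff_prompt(
    ["chunk one", "chunk two"],
    "How did Harry's parents die?",
)
print(prompt)
```

Because every chunk goes into one prompt, this approach only works while the retrieved text fits in the model's context window. It also explains the weak answer above: the model can only use what the four retrieved chunks say.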

In [63]:
#### Your Task ####

# Rebuild the chain from the whole book ChromaDB.  Test with one of the following questions (of your choice).

#query = 'Why does Dumbledore believe the celebrations may be premature?'
#query = 'Why is Harry left with the Dursleys rather than a Wizard family?'
#query = 'Why does McGonagall seem concerned about Harry being raised by the Dursleys?'

In [64]:
#### Your Task ####

# Using the LangChain documentation, find out about the map-reduce QA chain.
# Build the chain as shown below, then use it to
#chain = load_qa_chain(llm, chain_type="map_reduce")
# answer one of the following questions of your choice.

# query = 'What happened in the Forbidden Forest during the first year of Harry Potter at Hogwarts?'
# query = 'Tell me about Harry Potter and Quidditch during the first year'
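
As a starting point for this task, here is a conceptual sketch of the map-reduce idea: ask the question against each chunk separately (map), then combine the partial answers in one final call (reduce). This is not LangChain's implementation. The prompts are invented for illustration, and `fake_llm` is a hypothetical stand-in for a real model call so that the code runs without an API key.

```python
def map_reduce_answer(chunks, question, ask_llm):
    """Answer a question over many chunks with one call per chunk plus one final call."""
    # Map step: query the model once per chunk.
    partial_answers = [
        ask_llm(f"Context:\n{chunk}\n\nQuestion: {question}") for chunk in chunks
    ]
    # Reduce step: combine the partial answers into a single final answer.
    combined = "\n".join(partial_answers)
    return ask_llm(f"Partial answers:\n{combined}\n\nQuestion: {question}")

calls = []  # records every prompt sent to the fake model

def fake_llm(prompt):
    """Hypothetical model: remembers the prompt and returns a numbered reply."""
    calls.append(prompt)
    return f"answer#{len(calls)}"

final = map_reduce_answer(["chunk 1", "chunk 2", "chunk 3"], "What happened?", fake_llm)
print(len(calls), final)
```

Compared with `stuff`, this costs one extra model call per chunk, but no single prompt has to hold the whole book.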



### 2.6 Using Pinecone, an online vector DB

There are many reasons to store your DB online in a SaaS/PaaS service. For example:
- you want to scale the queries to many concurrent users
- you want more data reliability without having to worry about DB management
- you want to share the DB but without owning any servers

If you want to store your embeddings online, try Pinecone with the code below. You must go to [Pinecone.io](https://www.pinecone.io/) and set up an account. Then generate an API key and create an "index"; both can be done from the homepage once you have logged in to Pinecone.

In [65]:
import pinecone
from langchain.vectorstores import Pinecone

# initialize pinecone, depends on two environment variables, os.environ['PINECONE_API_KEY'] and os.environ['PINECONE_API_ENV']
pinecone.Pinecone()

# You should create an index for your vector DB.
# The "dimension" setting you choose when creating the index online should be 1536 for the OpenAI embedding, or 384 for MiniLM.
index_name = "test001"
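
The index dimension must equal the length of the vectors your embedding model produces. A quick check before uploading can catch a mismatch early. The sketch below is self-contained and uses a dummy vector; in the notebook you could instead pass the result of `minilm_embedding.embed_query("test")`. The helper name `check_dimension` is hypothetical.

```python
def check_dimension(vector, index_dimension):
    """Raise a clear error if the embedding length does not match the index."""
    if len(vector) != index_dimension:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, "
            f"but the index expects {index_dimension}"
        )
    return True

# Dummy 384-dimensional vector standing in for a MiniLM embedding.
dummy_vector = [0.0] * 384

ok = check_dimension(dummy_vector, 384)
print(ok)

# The same vector does not fit an index created with dimension 1536.
try:
    check_dimension(dummy_vector, 1536)
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print(err)
```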

In [66]:
docsearch_pinecone = Pinecone.from_texts(
                                [ t.page_content for t in texts ], 
                                minilm_embedding, 
                                index_name=index_name)

In [67]:
llm = OpenAI(temperature=0, model="gpt-3.5-turbo-instruct")
chain = load_qa_chain(llm, chain_type="stuff", verbose=True)
query = "How did Harry's parents die?"
docs = docsearch_pinecone.similarity_search(query)
chain.run(input_documents=docs, question=query)

# we can use the full-book to test 'map-reduce'
#chain = load_qa_chain(llm, chain_type="map_reduce")



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

large pink beac h ball wearing different -colored bonnets -- but Dudley  
Dursley was no longer a baby, and now the photographs showed a large  
blond boy riding his first bicycle, on a carousel at the fair, playing a  
computer game with his father, being hugged and kissed b y his mother.  
The room held no sign at all that another boy lived in the house, too.  
 
Yet Harry Potter was still there, asleep at the moment, but not for

caught a few words of what they were saying.  
 
"The Potters, that's right, that's what I heard yes, their son, Harry"  
 
Mr. Dursley stopped dead. Fear flooded him. He looked back at the  
whisperers as if he wanted to say something to them, but thought better 

" Harry's parents, Lily and James Potter, were killed by the dark wizard, Lord Voldemort, who also attempted to kill their son, Harry, but was unable to do so."

In [68]:
# query with pinecone
query = 'What exactly is the cat on Privet Drive?'
docs = docsearch_pinecone.similarity_search(query)
print(docs[0].page_content[:600])

Privet Drive. It didn't so much as quiver when a car door slammed on the  
next street, nor when two owls swooped overhead. In fact, it was nearly  
midnight before the cat moved  at all.  
 
A man appeared on the corner the cat had been watching, appeared so  
suddenly and silently you'd have thought he'd just popped out of the  
ground. The cat's tail twitched and its eyes narrowed.  
 
Nothing like this man had ever been seen on Pri vet Drive. He was tall,


In [69]:
#### Your Task ####
# Modify the QA chain in Section 2.5 (Chapter 1 only) to use Pinecone instead of ChromaDB.

### 2.7 Use vector store in Agent

In this section, we are going to create a simple QA agent that decides by itself which of two vector stores to query, depending on the field of the question.

#### Preparing the tools for the agent.

We will use our Chroma-based Harry Potter vector DB, and create another one containing President Biden's State of the Union speech.

In [70]:
from langchain.document_loaders import TextLoader

documents = TextLoader('/share/lab4/state_of_the_union.txt').load()
texts = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0).split_documents(documents)
docsearch3 = Chroma.from_documents(texts, 
                                   minilm_embedding, 
                                   collection_name="state-of-union", 
                                   persist_directory="/scratch2/chroma_db")
docsearch3.persist()

To allow the agent to query these databases, we need to define two RetrievalQA chains.

In [71]:
from langchain.chains import RetrievalQA
from langchain_openai import OpenAI

llm = OpenAI(temperature=0, model="gpt-3.5-turbo-instruct")

harry_potter = RetrievalQA.from_chain_type(llm=llm, 
                                           chain_type="stuff", 
                                           retriever=docsearch_chroma_reloaded.as_retriever())
state_of_union = RetrievalQA.from_chain_type(llm=llm, 
                                             chain_type="stuff", 
                                             retriever=docsearch3.as_retriever())

In [72]:
# Now try both chains

print_with_type(harry_potter.invoke('Why does McGonagall seem concerned about Harry being raised by the Dursleys?'))
print_with_type(state_of_union.invoke("what is the GDP increase last year?"))

"<class 'dict'>:"
{'query': 'Why does McGonagall seem concerned about Harry being raised by the '
          'Dursleys?',
 'result': ' McGonagall may be concerned because she knows that the Dursleys '
           'do not like magic and may not treat Harry well because of his '
           "magical abilities. She also may be worried about Harry's safety "
           'and well-being in their care.'}
"<class 'dict'>:"
{'query': 'what is the GDP increase last year?',
 'result': ' The GDP increase last year was 5.7%.'}


In [73]:
from langchain.agents import AgentType, Tool
# `llm` is the OpenAI model created in the previous cell, so no second import is needed.

# define tools
tools = [
    Tool(
        name="State of Union QA System",
        func=state_of_union.run,
        description="useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.",
    ),
    Tool(
        name="Harry Potter QA System",
        func=harry_potter.run,
        description="useful for when you need to answer questions about Harry Potter. Input should be a fully formed question.",
    ),
]

Now we can create the agent, giving it both chains as tools.

In [74]:
from langchain.agents import initialize_agent


# Construct the agent. We will use the default agent type here.
# See documentation for a full list of options.
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)




In [75]:
agent.run(
    "What did biden say about ketanji brown jackson?"
)



> Entering new AgentExecutor chain...
 I should use the State of Union QA System to find the answer
Action: State of Union QA System
Action Input: What did biden say about ketanji brown jackson?
Observation: Biden nominated Ketanji Brown Jackson for the United States Supreme Court.
Thought: I now know the final answer
Final Answer: Biden nominated Ketanji Brown Jackson for the United States Supreme Court.

> Finished chain.


'Biden nominated Ketanji Brown Jackson for the United States Supreme Court.'

In [76]:
agent.run(
    "'Why does McGonagall seem concerned about Harry being raised by the Dursleys?'"
)



> Entering new AgentExecutor chain...
 You should always think about what to do
Action: Harry Potter QA System
Action Input: 'Why does McGonagall seem concerned about Harry being raised by the Dursleys?'
Observation: McGonagall is concerned about Harry being raised by the Dursleys because she knows that they have a negative opinion of magic and the wizarding world. She worries that Harry may not be treated well or may not be taught about his magical abilities.
Thought: You should always think about what to do
Action: Harry Potter QA System
Action Input: 'Why does McGonagall seem concerned about Harry being raised by the Dursleys?'
Observation: McGonagall is concerned about Harry being raised by the Dursleys because she knows that they have a negative opinion of magic and the wizarding world. She worries that Harry may not be treated well or may not be taught about his magical abilities.
Thought:

'McGonagall is concerned about Harry being raised by the Dursleys because she knows that they have a negative opinion of magic and the wizarding world. She worries that Harry may not be treated well or may not be taught about his magical abilities.'

We can see that the agent can "smartly" choose which QA system to use given a specific question. 
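
The traces above suggest how the choice is made: the model writes an `Action:` line naming a tool and an `Action Input:` line with the question, and the agent runs the named tool. The sketch below imitates that step with a regular expression and a dictionary of plain functions. It is an illustration under stated assumptions, not LangChain's actual output parser; `parse_action`, `dispatch`, and the two lambda "tools" are hypothetical.

```python
import re

# Matches the "Action: ..." line followed by the "Action Input: ..." line.
ACTION_PATTERN = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)", re.DOTALL)

def parse_action(llm_output):
    """Return (tool_name, tool_input), or None if no action is found."""
    match = ACTION_PATTERN.search(llm_output)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()

def dispatch(llm_output, tools):
    """Run the tool named in the model output and return its observation."""
    parsed = parse_action(llm_output)
    if parsed is None:
        return None
    name, tool_input = parsed
    return tools[name](tool_input)

# Stand-in tools; in the notebook these would be the two RetrievalQA chains.
tools_by_name = {
    "State of Union QA System": lambda q: f"[state-of-union] {q}",
    "Harry Potter QA System": lambda q: f"[harry-potter] {q}",
}

llm_output = (
    " I should use the State of Union QA System to find the answer\n"
    "Action: State of Union QA System\n"
    "Action Input: What did biden say about ketanji brown jackson?"
)
observation = dispatch(llm_output, tools_by_name)
print(observation)
```

The tool names and descriptions are the only information the model has for making this choice, which is why the `description` fields in the tool definitions matter.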

## 3. Your Task: Putting it all together: OpenAI and LangChain

In [77]:
#### Your Task ####

# This is a major task that requires some thinking and time. 

# Build a conversation system from a collection of research papers of your choice. 
# You should be able to ask specific questions about a method described in these papers, and the agent should return a brief answer (no more than 100 words).
# Save your data and ChromaDB in the /share directory so other people can use it. 
# Provide at least three query examples so the TAs can review your work. 

# You may use any tool from the past four labs or from the langchain docs, or any open source project. 

# Write a summary (a Markdown cell) at the end of the notebook summarizing what works and what does not.

