# Lec4. Adding Memory and Storage to LLMs

Last week, we learned the basic elements of the LangChain framework. In this lecture, we will build a vector-store QA application from scratch.

>Reference:
> 1. [Ask A Book Questions](https://github.com/gkamradt/langchain-tutorials/blob/main/data_generation/Ask%20A%20Book%20Questions.ipynb)
> 2. [Agent Vectorstore](https://python.langchain.com/docs/modules/agents/how_to/agent_vectorstore)


## 0. Setup

1. Install the requirements.  (Already installed in your image.)
    ```
    pip install -r requirements.txt
    ```
2. Get your API keys. You need an OpenAI API key. For a Serpapi key, sign up for a free account on the [Serpapi website](https://serpapi.com/). For a Pinecone key, first register on the [Pinecone website](https://www.pinecone.io/), then **Create API Key** and **Create Index**. Note that in this notebook the index's dimension should be 1536, which matches the output size of OpenAI's `text-embedding-ada-002` embeddings.

3. Store your keys in a file named **.env** and place it in the current path or in a location that can be accessed.
    ```
    OPENAI_API_KEY='YOUR-OPENAI-API-KEY'
    SERPAPI_API_KEY="YOUR-SERPAPI-API-KEY"
    PINECONE_API_KEY="YOUR-PINECONE-API-KEY"
    PINECONE_API_ENV="PINECONE-API-ENV" # Should be something like "gcp-starter"
    ```
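Before running the rest of the notebook, it can help to confirm that every key was actually picked up. The following is a minimal sketch using only the standard library. The helper name `missing_keys` is hypothetical (it is not part of LangChain or python-dotenv), and the key names are simply the ones listed in the `.env` example above.

```python
import os

# Key names taken from the .env example above.
REQUIRED_KEYS = [
    "OPENAI_API_KEY",
    "SERPAPI_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_API_ENV",
]


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required key names that are absent or empty in `env`.

    `env` defaults to the process environment, which is where
    load_dotenv() places the values read from the .env file.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_keys()` right after `load_dotenv()` should return an empty list if the `.env` file was found and is complete.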

In [1]:
#%pip install -r requirements.txt

In [1]:
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
import os
os.environ['HTTP_PROXY']="http://Clash:QOAF8Rmd@10.1.0.213:7890"
os.environ['HTTPS_PROXY']="http://Clash:QOAF8Rmd@10.1.0.213:7890"
os.environ['ALL_PROXY']="socks5://Clash:QOAF8Rmd@10.1.0.213:7893"

In [5]:
from langchain.chains import LLMChain
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, MessagesPlaceholder
from langchain_core.messages import SystemMessage
from langchain_openai import ChatOpenAI

In [6]:
from langchain.agents import AgentExecutor, Tool, ZeroShotAgent
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
from langchain_community.utilities import SerpAPIWrapper
from langchain_openai import OpenAI

In [7]:
from langchain.document_loaders import UnstructuredPDFLoader, OnlinePDFLoader, PyPDFLoader, PDFMinerLoader

data = PyPDFLoader("/share/lab4/hp-book1.pdf").load()



In [90]:
#### Your TASK ####
# Try different PDF loaders. Which one works best for the file /share/lab4/hp-book1.pdf,
# which contains the full text of Harry Potter Book 1, with all the illustrations?

## LangChain provides many other loaders; read the documentation to find out how they differ.
# See https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf
loader = UnstructuredPDFLoader("./data/field-guide-to-data-science.pdf")
# loader = PyPDFLoader("example_data/layout-parser-paper.pdf")
# loader = PDFMinerLoader("example_data/layout-parser-paper.pdf")
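One rough way to compare loaders is to look at how much text each one recovers. The sketch below is only an assumption about what a useful comparison might look like: `summarize_pages` is a hypothetical helper that takes a list of extracted strings (for example `[doc.page_content for doc in data]`) and reports how many entries there are, how many are empty, and their mean length. Keep in mind that loaders differ in how they split a file, so the number of entries is not always the number of pages.

```python
def summarize_pages(pages):
    """Summarize a list of extracted text strings.

    Returns the number of entries, how many are empty after stripping
    whitespace, and the mean number of characters per entry.
    """
    lengths = [len(text.strip()) for text in pages]
    empty = sum(1 for n in lengths if n == 0)
    mean_chars = sum(lengths) / len(lengths) if lengths else 0.0
    return {
        "pages": len(lengths),
        "empty_pages": empty,
        "mean_chars": mean_chars,
    }
```

A loader that returns many empty entries for an illustrated book is probably failing on image-heavy pages, although reading a few extracted pages by eye is still the most reliable check.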

We can see that the agent can "smartly" choose which QA system to use given a specific question. 

## 3. Your Task: Putting it all together: OpenAI and LangChain

In [14]:
#### Your Task ####

# This is a major task that requires some thinking and time.

# Build a conversation system over a collection of research papers of your choice (about five).
# You should be able to ask specific questions about a method in these papers, and the agent should return a brief answer (no more than 100 words).

# Save your data and ChromaDB in the /share directory so other people can use them.
# Provide at least three query examples so the TAs can review your work.
# You may use any tool from the past four labs, the LangChain docs, or any open-source project.
# Write a summary (a Markdown cell) at the end of the notebook describing what works and what does not.

# Import some papers and create embeddings for them
from langchain.document_loaders import UnstructuredPDFLoader, OnlinePDFLoader, PyPDFLoader, PDFMinerLoader
## Load one PDF. Open question: why does arXiv's file not work?
data = PyPDFLoader("/share/ch3/data.pdf").load()

In [15]:
## Create the embedding model used for ChromaDB
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
minilm_embedding = SentenceTransformerEmbeddings(model_name="/share/embedding/all-MiniLM-L12-v2/")

In [41]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20)
texts = text_splitter.split_documents(data)

print (f'Now you have {len(texts)} documents')

Now you have 135 documents
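To see what `chunk_size` and `chunk_overlap` mean, here is a simplified fixed-window splitter. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as paragraph breaks, newlines, and spaces); `split_with_overlap` is a hypothetical helper that only illustrates how consecutive chunks share their boundary characters.

```python
def split_with_overlap(text, chunk_size=500, chunk_overlap=20):
    """Cut `text` into windows of at most `chunk_size` characters.

    Each window starts `chunk_size - chunk_overlap` characters after the
    previous one, so neighbouring chunks share `chunk_overlap` characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
# prints ['abcd', 'defg', 'ghij']
```

The overlap keeps a sentence that straddles a boundary partly visible in both chunks, at the cost of storing a little duplicated text.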


In [38]:
from langchain.vectorstores import Chroma

chroma_dir = "/share/ch3/chroma_db"
docsearch_chroma = Chroma.from_documents(texts, 
                                         minilm_embedding,
                                         persist_directory=chroma_dir,
                                         )


In [39]:
## Query 1 of 3: plain similarity search
docs = docsearch_chroma.similarity_search("Who are the authors of the paper 'BARTScore: Evaluating Generated Text as Text Generation'")
print_search_results(docs)  ## The result is found once the paper's name is added to the query

search returned 4 results. 
be mitigated.
5 Implications and Future Directions
In this paper, we proposed a metric BARTS CORE that formulates evaluation of generated text as a
text generation task, and empirically demonstrated its efficacy. Without the supervision of human
judgments, BARTS CORE can effectively evaluate texts from 7 perspectives and achieve the best
performance on 16 of 22 settings against existing top-scoring metrics. We highlight potential future
directions based on what we have learned.
BARTS CORE :
Evaluating Generated Text as Text Generation
Weizhe Yuan
Carnegie Mellon University
weizhey@cs.cmu.eduGraham Neubig
Carnegie Mellon University
gneubig@cs.cmu.eduPengfei Liu∗
Carnegie Mellon University
pliu3@cs.cmu.edu
Abstract
A wide variety of NLP applications, such as machine translation, summarization,
and dialog, involve text generation. One major challenge for these applications
is how to evaluate whether such generated texts are actually fluent, accurate,
By explori
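Conceptually, `similarity_search` embeds the query and returns the chunks whose embeddings are closest to it (four by default, which matches the output above). The sketch below illustrates the idea with brute-force cosine similarity over toy vectors. `cosine_similarity` and `top_k` are hypothetical helpers, and a real vector store such as Chroma uses its own index and distance metric rather than this loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as similar to nothing.
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy 2-D "embeddings": document 1 points the same way as the query.
print(top_k([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], k=2))
# prints [1, 2]
```

This also suggests why adding the paper's title to the query helped: the title page chunk shares many words with the query, so its embedding ends up close to the query's.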

In [48]:
## Query 2 of 3: QA chain
from langchain_openai import OpenAI
from langchain.chains.question_answering import load_qa_chain

llm = OpenAI(temperature=0, model="gpt-3.5-turbo-instruct")
chain = load_qa_chain(llm, chain_type="stuff", verbose=False)  ## Other chain types: "map_reduce", "refine", "map_rerank"

In [49]:
query = 'What is the topic of the research paper BartScore: Evaluating Generated Text as Text Generation'
docs = docsearch_chroma.similarity_search(query)
chain.run(input_documents=docs, question=query)

' The topic of the research paper is evaluating generated text as text generation using a metric called BARTScore.'
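The `chain_type` argument controls how the retrieved documents reach the model. With `"stuff"`, all documents are placed into a single prompt; with `"map_reduce"`, each document is queried separately and the partial answers are then combined in one more call. The sketch below mimics the two strategies with a plain callable standing in for the LLM. The function names and prompt wording are hypothetical and are not the prompts LangChain actually uses.

```python
def stuff_answer(llm, docs, question):
    """One call: every document is stuffed into a single prompt."""
    context = "\n\n".join(docs)
    return llm(f"Context:\n{context}\n\nQuestion: {question}")


def map_reduce_answer(llm, docs, question):
    """Map: one call per document. Reduce: one call over the partial answers."""
    partial = [llm(f"Context:\n{doc}\n\nQuestion: {question}") for doc in docs]
    combined = "\n\n".join(partial)
    return llm(f"Partial answers:\n{combined}\n\nQuestion: {question}")


# Usage with a stand-in "model" that only reports the prompt length.
def fake_llm(prompt):
    return f"(answer based on {len(prompt)} characters)"


docs = ["chunk one", "chunk two", "chunk three"]
print(stuff_answer(fake_llm, docs, "What is the topic?"))
print(map_reduce_answer(fake_llm, docs, "What is the topic?"))
```

With four retrieved chunks, the "stuff" strategy costs one model call while this map-reduce sketch costs five, but each map-reduce prompt stays small, which matters when the chunks together would exceed the context window.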

In [51]:
## Query 3 of 3: a different chain type
chain = load_qa_chain(llm, chain_type="map_reduce", verbose=False)
## This query is also successful, as was the previous query "What does the paper's experimental result include?"
## A previous unsuccessful query: "what is hte paper's experimental result?"
query = "What does the paper's experimental result that evaluates the effectiveness of automated scientific reviewing include?"
docs = docsearch_chroma.similarity_search(query)
chain.run(input_documents=docs, question=query)

" The paper's experimental result includes an evaluation of the effectiveness of automated scientific reviewing using BARTS CORE, which can evaluate text from various perspectives and estimate measures of quality such as coherence and fluency. It also includes a measure of precision from reference text to system-generated text."

In [70]:
## A memory chain is needed for this: without one, it cannot give an explicit answer
from langchain.chains import LLMChain
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, MessagesPlaceholder
from langchain.schema import SystemMessage
from langchain.memory import ConversationBufferMemory

prompt = ChatPromptTemplate.from_messages(
    [
        SystemMessage(
            content = """You are an assistant that helps humans search for information in research papers.
            If you do not know the answer, say that you don't know. Do not try to make up answers."""
        ),  # The persistent system prompt
        MessagesPlaceholder(
            variable_name = "chat_history"
        ),  # This is where the memory will be stored.
        HumanMessagePromptTemplate.from_template(
            "{human_input}"
        ),  # This is where the human input will be injected
    ]
)

memory = ConversationBufferMemory(memory_key="chat_history",
                                  input_key="human_input",
                                  return_messages=True)
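The idea behind a buffer memory is simple: keep every past turn and replay the whole list into the prompt at the placeholder position on each new call. The sketch below shows that mechanism in plain Python. `BufferMemory` is a hypothetical stand-in written for illustration, not LangChain's `ConversationBufferMemory`.

```python
class BufferMemory:
    """Keep every past turn and replay all of them into each new prompt."""

    def __init__(self):
        self.messages = []  # List of (role, text) pairs in chronological order.

    def save(self, human, ai):
        """Record one completed exchange."""
        self.messages.append(("human", human))
        self.messages.append(("ai", ai))

    def render(self, system, human_input):
        """Build the prompt: system message, then history, then the new input."""
        lines = [f"system: {system}"]
        lines += [f"{role}: {text}" for role, text in self.messages]
        lines.append(f"human: {human_input}")
        return "\n".join(lines)


memory_sketch = BufferMemory()
memory_sketch.save("Which paper proposes BARTScore?", "The BARTScore paper.")
print(memory_sketch.render("You answer questions about papers.", "Who wrote it?"))
```

Because the history is replayed in full, a follow-up such as "Who wrote it?" can be resolved against the earlier turn, but the prompt also grows with every exchange.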

In [73]:

chain = load_qa_chain(llm, chain_type="stuff", verbose=False)
query = "What does the paper's experimental result include?"  ## also successful; a previous query, "what is hte paper's experimental result?", was unsuccessful
docs = docsearch_chroma.similarity_search(query)
chain.run(input_documents=docs, question=query)

" The paper's experimental result includes a fine-grained analysis and prompt analysis, as well as measures such as Kendall's Tau, BERTScore, PRISM, BLEURT, COMET, and BARTS CORE. It also includes evaluations of semantic overlap, linguistic quality, and factual correctness."

In [None]:
memory.load_memory_variables({})

In [None]:
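from langchain.prompts import PromptTemplate

## Adapted from another example: a "stuff" QA chain with conversation memory.
## The prompt needs slots for the retrieved context, the chat history, and the question.
PROMPT = PromptTemplate(
    input_variables=["chat_history", "question", "context"],
    template="""Use the following extracts from research papers to answer the question.
If you do not know the answer, say that you don't know.

{context}

{chat_history}
Human: {question}
Assistant:""",
)

memory = ConversationBufferMemory(memory_key="chat_history", input_key="question")
chain = load_qa_chain(
    OpenAI(temperature=0), chain_type="stuff", memory=memory, prompt=PROMPT
)

## Search the Chroma store built above (the copied code referred to an undefined `db`)
docs = docsearch_chroma.similarity_search(query=query)

# Build the input dictionary for the chain; {context} is filled from input_documents
chain_input = {
    "input_documents": docs,
    "question": query,
}

result = chain(chain_input, return_only_outputs=True)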

In [None]:
## Possibly wrap the QA chains as Tools for an agent?
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chains import RetrievalQA
from langchain_openai import OpenAI

llm = OpenAI(temperature=0, model="gpt-3.5-turbo-instruct")

# Build the retrieval chain before the tools that reference it.
# docsearch_chroma_reloaded and state_of_union are assumed to be defined in earlier cells.
harry_potter = RetrievalQA.from_chain_type(llm=llm,
                                           chain_type="stuff",
                                           retriever=docsearch_chroma_reloaded.as_retriever())

# define tools
tools = [
    Tool(
        name="State of Union QA System",
        func=state_of_union.run,
        description="useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.",
    ),
    Tool(
        name="Harry Potter QA System",
        func=harry_potter.run,
        description="useful for when you need to answer questions about Harry Potter. Input should be a fully formed question.",
    ),
]

agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

In [40]:
## Save the ChromaDB to local disk
docsearch_chroma.persist()  ## It is written to the persist_directory given at creation (chroma_dir, i.e. /share/ch3/chroma_db)