# Tutorial

Kai Foerster, Amin Oueslati, Steve Kerr

## Introduction
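Policy motivation: many institutions want to use a ChatGPT-style assistant that draws on their own domain knowledge. <br>
A retrieval-augmented generation (RAG) chatbot addresses this by retrieving relevant passages from a document store and adding them to the prompt before the model answers. <br>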

## Prerequisites

- Install the libraries listed below <br>
- Get a Hugging Face API token, plus access to OpenAI and to Pinecone or Chroma for data storage

In [None]:
#!pip install -qU \
#    langchain==0.0.292 \
#    openai==0.28.0 \
#    datasets==2.10.1 \
#    pinecone-client==2.2.4 \
#    tiktoken==0.5.1

## Building a chatbot (no RAG)

In [None]:
#!chainlit hello

In [1]:
import os
import chainlit as cl
from langchain import HuggingFaceHub, PromptTemplate, LLMChain

In [2]:
# Read the Hugging Face token from the environment instead of hard-coding it;
# set HF_API_TOKEN in your shell before starting the notebook.
HF_API_TOKEN = os.environ["HF_API_TOKEN"]

In [28]:
model_id = "gpt2-medium"
conv_model = HuggingFaceHub(
    huggingfacehub_api_token=os.environ["HF_API_TOKEN"],
    repo_id=model_id,
    model_kwargs={"temperature": 0.8, "max_length": 500},
)



In [4]:
template="""You are a helpful assistant that answers questions of the user.
{human_message}
"""

prompt=PromptTemplate(template=template, input_variables=["human_message"])
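
To see what the template does before it is wired into a chain, the placeholder substitution can be reproduced with plain Python string formatting. This is only a minimal sketch, not LangChain's implementation; `render_prompt` is a hypothetical helper introduced here for illustration.

```python
# Minimal sketch of prompt templating with plain str.format.
# `render_prompt` is a hypothetical helper, not part of LangChain.
template = """You are a helpful assistant that answers questions of the user.
{human_message}
"""


def render_prompt(template: str, **variables: str) -> str:
    """Fill the named placeholders in the template."""
    return template.format(**variables)


rendered = render_prompt(template, human_message="what is string theory?")
print(rendered)
```

The printed text should match the "Prompt after formatting" section that the chain logs when `verbose=True`.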

In [29]:
conv_chain = LLMChain(llm=conv_model, prompt=prompt, verbose=True)

In [6]:
#res=conv_chain.run("what is string theory?")
#print(res)
print(conv_chain.run("what is string theory?"))



> Entering new LLMChain chain...
Prompt after formatting:
You are a helpful assistant that answers questions of the user.
what is string theory?

> Finished chain.
string theory is a branch of mathematics which combines the laws of physics with the mathematical foundations of mathematics. string theory was originally proposed by the French physicist, Pierre-Simon Laplace in 1892. His theory is to explain why the Universe works in this way. The first formal proof of string theory came from the work of mathematician Roger Penrose in 1970, published in 1968. This was a mathematical proof of the existence of a special mathematical sub-string, which is the property of the string of space-time. It is possible to prove that the Universe does in fact exist. Another major theory in string theory is quantum mechanics. The theory states that the universe is made up of particles with the same properties of their classical counterparts. There may be some particle

### Appending last response to follow up question

In [7]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory, ConversationBufferMemory

In [8]:
memory = ConversationBufferMemory(memory_key="history")

In [9]:
user_message = "what is string theory?"
while user_message != "bye":
    memory.chat_memory.add_user_message(user_message)
    res = conv_chain.run(user_message)
    print("AI: ",res)
    memory.chat_memory.add_ai_message(res)
    user_message = input("Enter a message or bye to exit!")



> Entering new LLMChain chain...
Prompt after formatting:
You are a helpful assistant that answers questions of the user.
what is string theory?

> Finished chain.
AI:  string theory is a branch of mathematics which combines the laws of physics with the mathematical foundations of mathematics. string theory was originally proposed by the French physicist, Pierre-Simon Laplace in 1892. His theory is to explain why the Universe works in this way. The first formal proof of string theory came from the work of mathematician Roger Penrose in 1970, published in 1968. This was a mathematical proof of the existence of a special mathematical sub-string, which is the property of the string of space-time. It is possible to prove that the Universe does in fact exist. Another major theory in string theory is quantum mechanics. The theory states that the universe is made up of particles with the same properties of their classical counterparts. There may be some par
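
Note that the loop above stores each message in `memory`, but the chain's prompt only contains `{human_message}`, so the logged prompt shows no earlier turns. One way to let follow-up questions see the history is to prepend recent turns to the prompt. The following is a plain-Python sketch of that idea under stated assumptions: `WindowMemory` and `build_prompt` are hypothetical names, the stored AI reply is a placeholder, and this is not how LangChain's memory classes are implemented.

```python
from collections import deque


class WindowMemory:
    """Keep only the last `k` (speaker, text) turns. Hypothetical helper."""

    def __init__(self, k: int = 4):
        self.turns = deque(maxlen=k)

    def add(self, speaker: str, text: str) -> None:
        self.turns.append((speaker, text))

    def as_text(self) -> str:
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


def build_prompt(memory: WindowMemory, human_message: str) -> str:
    """Prepend the stored history to the new user message."""
    return (
        "You are a helpful assistant that answers questions of the user.\n"
        f"{memory.as_text()}\n"
        f"Human: {human_message}\n"
        "AI:"
    )


window_memory = WindowMemory(k=2)
window_memory.add("Human", "what is string theory?")
window_memory.add("AI", "(placeholder for the model's previous answer)")
prompt_with_history = build_prompt(window_memory, "who works on it?")
print(prompt_with_history)
```

A bounded window keeps the prompt from growing without limit, at the cost of forgetting older turns.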

### Hallucinations

In [10]:
print(conv_chain.run("What is so special about Llama 2?"))



> Entering new LLMChain chain...
Prompt after formatting:
You are a helpful assistant that answers questions of the user.
What is so special about Llama 2?

> Finished chain.
This software can manage your company communications in a more advanced way. So you can add users, change passwords or do any other important things.
What are you like and what do you do?
We are a software developers. We design and release our software. We do the programming, and after development we test our software and provide the feedback. You can find us on Facebook, Twitter and Google +. We are always ready to answer any questions you have.
You can contact Llama Software on Facebook, Google+ and Twitter, as well as via email.
You can also visit Llama Software in our office where we are working on another product, llama.io
What is this project about?
This is a project to make it available to you. So if you want to start using Llama today, you can.
There are two main project

### Source knowledge (manual)

In [11]:
llmchain_information = [
    "A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format.",
    "Chains is an incredibly generic concept which returns to a sequence of modular components (or other chains) combined in a particular way to accomplish a common use case.",
    "LangChain is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other sources of data, (2) Be agentic: Allow a language model to interact with its environment. As such, the LangChain framework is designed with the objective in mind to enable those types of applications."
]

source_knowledge = "\n".join(llmchain_information)

In [16]:
source_knowledge

'A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format.\nChains is an incredibly generic concept which returns to a sequence of modular components (or other chains) combined in a particular way to accomplish a common use case.\nLangChain is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other sources of data, (2) Be agentic: Allow a language model to interact with its environment. As such, the LangChain framework is designed with the objective in mind to enable those

In [12]:
template_with_context="""You are a helpful assistant that answers questions of the user.

Contexts:{source_knowledge}

{human_message}
"""

prompt2=PromptTemplate(template=template_with_context, input_variables=["human_message",  "source_knowledge"])

In [30]:
template_with_context="""You are a helpful assistant that answers questions of the user using the contexts.

Contexts:A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), 
and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. 
Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format. Chains is an incredibly generic concept which returns to a 
sequence of modular components (or other chains) combined in a particular way to accomplish a common use case. LangChain is a framework for developing applications powered 
by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other sources of data,
(2) Be agentic: Allow a language model to interact with its environment.
As such, the LangChain framework is designed with the objective in mind to enable those types of applications.

{human_message}
"""

prompt3=PromptTemplate(template=template_with_context, input_variables=["human_message"])

In [31]:
conv_chain_context = LLMChain(llm=conv_model, prompt=prompt3, verbose=True)

In [32]:
print(conv_chain_context.run("What is so special about LLMchain?"))



> Entering new LLMChain chain...
Prompt after formatting:
You are a helpful assistant that answers questions of the user using the contexts.

Contexts:A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), 
and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. 
Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format. Chains is an incredibly generic concept which returns to a 
sequence of modular components (or other chains) combined in a particular way to accomplish a common use case. LangChain is a framework for developing applications powered 
by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other so

In [15]:
print(conv_chain_context.run({

  'source_knowledge': source_knowledge,

  'human_message': "What is PromptTemplate?"

}))



> Entering new LLMChain chain...
Prompt after formatting:
You are a helpful assistant that answers questions of the user.

Contexts:A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format.
Chains is an incredibly generic concept which returns to a sequence of modular components (or other chains) combined in a particular way to accomplish a common use case.
LangChain is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other sources of data, (2) Be a

In [None]:
query = "Can you tell me about the LLMChain in LangChain?"

augmented_prompt = f"""Using the contexts below, answer the query.

Contexts:
{source_knowledge}

Query: {query}"""

In [None]:
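from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

# Assumption: OPENAI_API_KEY is set in the environment.
chat = ChatOpenAI()
# Assumption: the conversation starts with a generic system message.
messages = [SystemMessage(content="You are a helpful assistant.")]

# create a new user prompt
prompt = HumanMessage(content=augmented_prompt)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat(messages)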

In [None]:
print(res.content)

## RAG 
### Create a database to store your corpus

In [None]:
from langchain.document_loaders import DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.schema import Document
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores.chroma import Chroma
import os
import shutil

CHROMA_PATH = "chroma"
DATA_PATH = "data/books"


def main():
    generate_data_store()


def generate_data_store():
    documents = load_documents()
    chunks = split_text(documents)
    save_to_chroma(chunks)


def load_documents():
    loader = DirectoryLoader(DATA_PATH, glob="*.md")
    documents = loader.load()
    return documents


def split_text(documents: list[Document]):
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=300,
        chunk_overlap=100,
        length_function=len,
        add_start_index=True,
    )
    chunks = text_splitter.split_documents(documents)
    print(f"Split {len(documents)} documents into {len(chunks)} chunks.")

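    # Print one sample chunk for inspection (guarded in case the corpus is small).
    if len(chunks) > 10:
        document = chunks[10]
        print(document.page_content)
        print(document.metadata)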

    return chunks


def save_to_chroma(chunks: list[Document]):
    # Clear out the database first.
    if os.path.exists(CHROMA_PATH):
        shutil.rmtree(CHROMA_PATH)

    # Create a new DB from the documents.
    db = Chroma.from_documents(
        chunks, OpenAIEmbeddings(), persist_directory=CHROMA_PATH
    )
    db.persist()
    print(f"Saved {len(chunks)} chunks to {CHROMA_PATH}.")


if __name__ == "__main__":
    main()
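
The effect of `chunk_size=300` and `chunk_overlap=100` can be illustrated with a fixed-width sliding window. This is a simplified sketch: `RecursiveCharacterTextSplitter` additionally tries to split on separators such as paragraph and line breaks, which the hypothetical `sliding_chunks` function below does not do.

```python
def sliding_chunks(
    text: str, chunk_size: int = 300, chunk_overlap: int = 100
) -> list[tuple[int, str]]:
    """Return (start_index, chunk) pairs; neighbours share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append((start, text[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Synthetic 700-character text so the overlap is easy to inspect.
sample = "".join(str(i % 10) for i in range(700))
sample_chunks = sliding_chunks(sample)
for start, chunk in sample_chunks:
    print(start, len(chunk))
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, which tends to help retrieval.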

### Query data from your database based on your prompt

In [None]:
import argparse
from dataclasses import dataclass
from langchain.vectorstores.chroma import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

CHROMA_PATH = "chroma"

PROMPT_TEMPLATE = """
Answer the question based only on the following context:

{context}

---

Answer the question based on the above context: {question}
"""


def main():
    # Create CLI.
    parser = argparse.ArgumentParser()
    parser.add_argument("query_text", type=str, help="The query text.")
    args = parser.parse_args()
    query_text = args.query_text

    # Prepare the DB.
    embedding_function = OpenAIEmbeddings()
    db = Chroma(persist_directory=CHROMA_PATH, embedding_function=embedding_function)

    # Search the DB.
    results = db.similarity_search_with_relevance_scores(query_text, k=3)
    if len(results) == 0 or results[0][1] < 0.7:
        print("Unable to find matching results.")
        return

    context_text = "\n\n---\n\n".join([doc.page_content for doc, _score in results])
    prompt_template = ChatPromptTemplate.from_template(PROMPT_TEMPLATE)
    prompt = prompt_template.format(context=context_text, question=query_text)
    print(prompt)

    model = ChatOpenAI()
    response_text = model.predict(prompt)

    sources = [doc.metadata.get("source", None) for doc, _score in results]
    formatted_response = f"Response: {response_text}\nSources: {sources}"
    print(formatted_response)


if __name__ == "__main__":
    main()
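
The retrieval step above takes the `k` best-scoring chunks and gives up when the top score is below 0.7. A toy version of that logic, using cosine similarity over hand-made vectors, might look like this. The vectors, the `top_k` helper, and the use of raw cosine similarity as the relevance score are assumptions for illustration; Chroma's actual scoring depends on its configured distance metric.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k=3):
    """Return up to k (text, score) pairs sorted by descending similarity."""
    scored = [(text, cosine(query_vec, vec)) for text, vec in store]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy 2-D "embeddings"; real embeddings have many more dimensions.
store = [
    ("chunk about llamas", [1.0, 0.0]),
    ("chunk about alpacas", [0.8, 0.6]),
    ("chunk about cooking", [0.0, 1.0]),
]

results = top_k([1.0, 0.0], store, k=2)
if len(results) == 0 or results[0][1] < 0.7:
    print("Unable to find matching results.")
else:
    context_text = "\n\n---\n\n".join(text for text, _score in results)
    print(context_text)
```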

Alternative: a Pinecone vector store, adapted from the Pinecone notebook listed in the references

In [None]:
from langchain.vectorstores import Pinecone

text_field = "text"  # the metadata field that contains our text

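# initialize the vector store object
# Assumption: `index` (a Pinecone index) and `embed_model` (an embeddings
# object) were created earlier, as in the referenced Pinecone notebook.
vectorstore = Pinecone(
    index, embed_model.embed_query, text_field
)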

In [None]:
query = "What is so special about Llama 2?"

vectorstore.similarity_search(query, k=3)

In [None]:
def augment_prompt(query: str):
    # get top 3 results from knowledge base
    results = vectorstore.similarity_search(query, k=3)
    # get the text from the results
    source_knowledge = "\n".join([x.page_content for x in results])
    # feed into an augmented prompt
    augmented_prompt = f"""Using the contexts below, answer the query.

    Contexts:
    {source_knowledge}

    Query: {query}"""
    return augmented_prompt

In [None]:
print(augment_prompt(query))
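
To try the prompt-building step without a Pinecone index, the retriever can be passed in as a function. The sketch below is a variant under stated assumptions: `build_augmented_prompt` and `fake_search` are hypothetical names, and `fake_search` simply returns fixed strings in place of `vectorstore.similarity_search`. It also joins explicit lines, because an indented triple-quoted f-string carries its leading spaces into the prompt.

```python
from typing import Callable


def build_augmented_prompt(
    query: str, search: Callable[[str, int], list[str]], k: int = 3
) -> str:
    """Build the augmented prompt from the top-k retrieved texts."""
    source_knowledge = "\n".join(search(query, k))
    # Joining explicit lines avoids stray indentation inside the prompt.
    return "\n".join([
        "Using the contexts below, answer the query.",
        "",
        "Contexts:",
        source_knowledge,
        "",
        f"Query: {query}",
    ])


def fake_search(query: str, k: int) -> list[str]:
    """Hypothetical stand-in for a vector store search; returns fixed texts."""
    return ["first retrieved passage", "second retrieved passage"][:k]


augmented = build_augmented_prompt("What is so special about Llama 2?", fake_search)
print(augmented)
```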

### Pass the augmented prompt to the chat model

In [None]:
# create a new user prompt
prompt = HumanMessage(
    content=augment_prompt(query)
)
# add to messages
messages.append(prompt)

res = chat(messages)

print(res.content)

### Human evaluation of RAG model
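Ask the same question twice, first without and then with the retrieved context, and compare the answers by hand. Open question: should other evaluation methods be added here?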

In [None]:
prompt = HumanMessage(
    content="what safety measures were used in the development of llama 2?"
)

res = chat(messages + [prompt])
print(res.content)

In [None]:
prompt = HumanMessage(
    content=augment_prompt(
        "what safety measures were used in the development of llama 2?"
    )
)

res = chat(messages + [prompt])
print(res.content)

## References

https://github.com/pinecone-io/examples/blob/master/learn/generation/langchain/rag-chatbot.ipynb <br>
https://github.com/pixegami/langchain-rag-tutorial/tree/main <br>
https://www.youtube.com/watch?v=LhnCsygAvzY <br>
https://www.youtube.com/watch?v=tcqEUSNCn8I
