# Tutorial

Kai Foerster, Amin Oueslati, Steve Kerr

## Introduction
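Policy motivation: many institutions want to use a chatbot like ChatGPT, but one that answers from their own domain knowledge. <br>
A RAG (retrieval-augmented generation) chatbot does this by retrieving the passages most relevant to the user's question from a document collection and adding them to the prompt before the language model generates its answer. <br>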
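
The retrieve-then-augment idea can be sketched without any libraries. The example below is only an illustration under stated assumptions: the word-overlap score stands in for the embedding similarity used later in this tutorial, the helper names (`overlap_score`, `retrieve`, `build_prompt`) are hypothetical, and the third corpus sentence is made up as a distractor.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(query, passage):
    """Toy relevance score: number of distinct query words found in the passage."""
    passage_tokens = set(tokenize(passage))
    return sum(1 for tok in set(tokenize(query)) if tok in passage_tokens)

def retrieve(query, corpus, k=2):
    """Return the k passages with the highest toy relevance score."""
    ranked = sorted(corpus, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    """Place the retrieved passages in front of the question."""
    contexts = "\n".join(passages)
    return (
        "Using the contexts below, answer the query.\n\n"
        f"Contexts:\n{contexts}\n\n"
        f"Query: {query}"
    )

corpus = [
    "Warranties of data shall be developed and used in accordance with agency regulations.",
    "The Federal Acquisition Regulations System is established for the codification "
    "and publication of uniform policies and procedures for acquisition by all executive agencies.",
    "String theory is a topic in physics.",  # made-up distractor sentence
]
question = "What is the purpose of the Federal Acquisition Regulations?"
top = retrieve(question, corpus, k=1)
rag_prompt = build_prompt(question, top)
print(rag_prompt)
```

The resulting prompt is what would be sent to the language model in place of the bare question.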

## Setup

* Install dependencies
* Configure an API key for Hugging Face
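
The cells below read the token from the `HF_API_KEY` environment variable (the Chainlit log further down suggests it is loaded from a `.env` file). A small sketch of a lookup that fails with a readable message when the variable is missing; the helper name `get_hf_api_key` is hypothetical and not part of any library.

```python
import os

def get_hf_api_key(env=None, var_name="HF_API_KEY"):
    """Return the Hugging Face token from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"{var_name} is not set; add it to your .env file or export it in your shell."
        )
    return token

# Example with an explicit mapping so the sketch runs without a real token
print(get_hf_api_key({"HF_API_KEY": "hf_example_token"}))
```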

In [1]:
# install dependencies
!pipenv install langchain
!pipenv install sentence_transformers
!pipenv install chromadb
!pipenv install unstructured
!pipenv install chainlit

## Building a chatbot (no RAG)

In [2]:
import os
import chainlit as cl
from langchain import HuggingFaceHub, PromptTemplate, LLMChain

2023-12-10 18:27:36 - Created default config file at /Users/steve/Documents/GitHub/dl-tutorial/.chainlit/config.toml
2023-12-10 18:27:37 - Loaded .env file


In [5]:
model_id = "gpt2-medium"
conv_model = HuggingFaceHub(
    huggingfacehub_api_token=os.environ["HF_API_KEY"],
    repo_id=model_id,
    model_kwargs={"temperature": 0.8, "max_length": 500},
)



In [6]:
template="""You are a helpful assistant that answers questions of the user.
{human_message}
"""

prompt=PromptTemplate(template=template, input_variables=["human_message"])
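
To see what the template does before a chain is involved, the placeholder can be filled with plain `str.format`. This is only a sketch of the substitution step; it does not use LangChain and makes no claim about how `PromptTemplate` is implemented internally.

```python
template = """You are a helpful assistant that answers questions of the user.
{human_message}
"""

# str.format fills the {human_message} placeholder with the user's question
formatted = template.format(human_message="what is string theory?")
print(formatted)
```

The printed text should match the "Prompt after formatting" lines that the verbose chain shows below.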

In [7]:
conv_chain = LLMChain(llm=conv_model, prompt=prompt, verbose=True)

In [8]:
print(conv_chain.run("what is string theory?"))

> Entering new LLMChain chain...
Prompt after formatting:
You are a helpful assistant that answers questions of the user.
what is string theory?

> Finished chain.
string theory is a branch of physics which deals with how the mass of a particle is related to the energy it leaves behind.
so what is string theory?
strangely enough, it's also another branch of physics that deals with how particles can communicate with each other. that's one of the main things I love about it.
for example, when scientists are trying to understand how black holes work they have to consider the laws of physics, how they form, how they absorb energy in a certain way.
for string theory, all they have to do is calculate how much energy you can get out of the black hole in a certain way and they can calculate how much energy you can get out of the rest of the universe at the same time.
if you just look at the stuff that happens between the singularities (i.e. what happens at t

### Appending the last response to a follow-up question

In [9]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory, ConversationBufferMemory

In [10]:
memory = ConversationBufferMemory(memory_key="history")

In [12]:
user_message = "what is string theory?"
while user_message != "bye":
    # prepend the conversation so far so the model sees earlier turns
    history = memory.load_memory_variables({})["history"]
    res = conv_chain.run(f"{history}\n{user_message}".strip())
    print("AI: ", res)
    # store both sides of this turn for the next iteration
    memory.chat_memory.add_user_message(user_message)
    memory.chat_memory.add_ai_message(res)
    user_message = input("Enter a message or 'bye' to exit!")

> Entering new LLMChain chain...
Prompt after formatting:
You are a helpful assistant that answers questions of the user.
what is string theory?

> Finished chain.
AI:  string theory is a branch of physics which deals with how the mass of a particle is related to the energy it leaves behind.
so what is string theory?
strangely enough, it's also another branch of physics that deals with how particles can communicate with each other. that's one of the main things I love about it.
for example, when scientists are trying to understand how black holes work they have to consider the laws of physics, how they form, how they absorb energy in a certain way.
for string theory, all they have to do is calculate how much energy you can get out of the black hole in a certain way and they can calculate how much energy you can get out of the rest of the universe at the same time.
if you just look at the stuff that happens between the singularities (i.e. what happens
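
Because the full history grows with every turn, one option is to keep only the last few exchanges. The sketch below shows that idea in plain Python. `WindowMemory` and `with_history` are hypothetical helpers written for illustration; they are similar in spirit to the `ConversationBufferWindowMemory` class imported above but are not its implementation, and the AI replies are placeholders.

```python
from collections import deque

class WindowMemory:
    """Keep only the last k (user, ai) exchanges (hypothetical helper)."""

    def __init__(self, k=2):
        self.turns = deque(maxlen=k)  # older exchanges fall off automatically

    def add_exchange(self, user_message, ai_message):
        self.turns.append((user_message, ai_message))

    def render(self):
        """Format the stored exchanges as plain text for the prompt."""
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)

def with_history(memory, user_message):
    """Prefix the new message with the remembered conversation, if any."""
    history = memory.render()
    if history:
        return f"{history}\nHuman: {user_message}"
    return f"Human: {user_message}"

memory_sketch = WindowMemory(k=2)
memory_sketch.add_exchange("what is string theory?", "a branch of physics")
memory_sketch.add_exchange("who works on it?", "physicists")
memory_sketch.add_exchange("is it testable?", "that is debated")
print(with_history(memory_sketch, "why?"))
```

With `k=2` the first exchange is dropped, so the prompt stays short at the cost of forgetting older context.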

### Hallucinations

In [None]:
print(conv_chain.run("What is so special about Llama 2?"))

> Entering new LLMChain chain...
Prompt after formatting:
You are a helpful assistant that answers questions of the user.
What is so special about Llama 2?

> Finished chain.
This software can manage your company communications in a more advanced way. So you can add users, change passwords or do any other important things.
What are you like and what do you do?
We are a software developers. We design and release our software. We do the programming, and after development we test our software and provide the feedback. You can find us on Facebook, Twitter and Google +. We are always ready to answer any questions you have.
You can contact Llama Software on Facebook, Google+ and Twitter, as well as via email.
You can also visit Llama Software in our office where we are working on another product, llama.io
What is this project about?
This is a project to make it available to you. So if you want to start using Llama today, you can.
There are two main project

### Source knowledge (manual)

In [13]:
llmchain_information = [
    "A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format.",
    "Chains is an incredibly generic concept which returns to a sequence of modular components (or other chains) combined in a particular way to accomplish a common use case.",
    "LangChain is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other sources of data, (2) Be agentic: Allow a language model to interact with its environment. As such, the LangChain framework is designed with the objective in mind to enable those types of applications."
]

source_knowledge = "\n".join(llmchain_information)

In [14]:
source_knowledge

'A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format.\nChains is an incredibly generic concept which returns to a sequence of modular components (or other chains) combined in a particular way to accomplish a common use case.\nLangChain is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other sources of data, (2) Be agentic: Allow a language model to interact with its environment. As such, the LangChain framework is designed with the objective in mind to enable those

In [15]:
template_with_context="""You are a helpful assistant that answers questions of the user.

Contexts:{source_knowledge}

{human_message}
"""

prompt2=PromptTemplate(template=template_with_context, input_variables=["human_message",  "source_knowledge"])

In [16]:
template_with_context="""You are a helpful assistant that answers questions of the user using the contexts.

Contexts:A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), 
and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. 
Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format. Chains is an incredibly generic concept which returns to a 
sequence of modular components (or other chains) combined in a particular way to accomplish a common use case. LangChain is a framework for developing applications powered 
by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other sources of data,
(2) Be agentic: Allow a language model to interact with its environment.
As such, the LangChain framework is designed with the objective in mind to enable those types of applications.

{human_message}
"""

prompt3=PromptTemplate(template=template_with_context, input_variables=["human_message"])

In [17]:
conv_chain_context = LLMChain(llm=conv_model, prompt=prompt3, verbose=True)

In [18]:
print(conv_chain_context.run("What is so special about LLMchain?"))

> Entering new LLMChain chain...
Prompt after formatting:
You are a helpful assistant that answers questions of the user using the contexts.

Contexts:A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), 
and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. 
Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format. Chains is an incredibly generic concept which returns to a 
sequence of modular components (or other chains) combined in a particular way to accomplish a common use case. LangChain is a framework for developing applications powered 
by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other so

In [19]:
print(conv_chain_context.run({
    'source_knowledge': source_knowledge,
    'human_message': "What is PromptTemplate?"
}))

> Entering new LLMChain chain...
Prompt after formatting:
You are a helpful assistant that answers questions of the user using the contexts.

Contexts:A LLMChain is the most common type of chain. It consists of a PromptTemplate, a model (either an LLM or a ChatModel), 
and an optional output parser. This chain takes multiple input variables, uses the PromptTemplate to format them into a prompt. It then passes that to the model. 
Finally, it uses the OutputParser (if provided) to parse the output of the LLM into a final format. Chains is an incredibly generic concept which returns to a 
sequence of modular components (or other chains) combined in a particular way to accomplish a common use case. LangChain is a framework for developing applications powered 
by language models. We believe that the most powerful and differentiated applications will not only call out to a language model via an api, but will also: (1) Be data-aware: connect a language model to other so

In [20]:
query = "Can you tell me about the LLMChain in LangChain?"

augmented_prompt = f"""Using the contexts below, answer the query.

Contexts:
{source_knowledge}

Query: {query}"""

In [25]:
from langchain.chat_models import ChatOpenAI
from langchain.schema.messages import HumanMessage

In [26]:
# chat model (assumes OPENAI_API_KEY is set) and an empty message history
chat = ChatOpenAI()
messages = []

# create a new user prompt
prompt = HumanMessage(
    content=augmented_prompt
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat(messages)

In [None]:
print(res.content)

## RAG
### Create a database to store your corpus

In [1]:
# load dependencies
from langchain.document_loaders import DirectoryLoader
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
import os
import shutil

In [2]:
# set params
DATA_PATH = "data/html"
CHROMA_PATH = "chroma_db"
EMBED_MODEL = "all-MiniLM-L6-v2" # Chroma defaults to "sentence-transformers/all-MiniLM-L6-v2"
# alternative: "BAAI/bge-small-en-v1.5"

#### Load Documents

In [5]:
# load docs
def load_docs(directory):
    """Load every file under `directory` as a LangChain document."""
    loader = DirectoryLoader(directory)
    documents = loader.load()
    return documents

documents = load_docs(DATA_PATH)
len(documents)

3487

In [6]:
documents[0]

Document(page_content='46.708 Warranties of data.\n\nWarranties of data shall be developed and used in accordance with agency regulations.\n\nSubpart 46.7 - Warranties', metadata={'source': 'data/html/46.708.html'})
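
Each loaded document pairs the extracted text (`page_content`) with its origin (`metadata['source']`). The sketch below mimics that shape with the standard library only, to make the loading step concrete. It is not how `DirectoryLoader` works internally; `load_html_dir` is a hypothetical helper, and it keeps every text node, so script or style content would not be filtered out.

```python
import tempfile
from html.parser import HTMLParser
from pathlib import Path

class TextExtractor(HTMLParser):
    """Collect the non-empty text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())

def load_html_dir(directory):
    """Return one {'page_content', 'metadata'} dict per .html file in the directory."""
    docs = []
    for path in sorted(Path(directory).glob("*.html")):
        parser = TextExtractor()
        parser.feed(path.read_text(encoding="utf-8"))
        docs.append({
            "page_content": "\n\n".join(parser.parts),
            "metadata": {"source": str(path)},
        })
    return docs

# Demo on a temporary directory holding one small file
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "46.708.html").write_text(
        "<html><body><h1>46.708 Warranties of data.</h1>"
        "<p>Warranties of data shall be developed and used in accordance "
        "with agency regulations.</p></body></html>",
        encoding="utf-8",
    )
    demo_docs = load_html_dir(tmp)

print(demo_docs[0]["page_content"])
```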

#### Embed Documents & Upload to Vector Database

In [7]:
# define text embedding model
embedding_func = SentenceTransformerEmbeddings(model_name=EMBED_MODEL)

# See https://huggingface.co/spaces/mteb/leaderboard



In [8]:
# first, clear out current db
# if os.path.exists(CHROMA_PATH):
    # shutil.rmtree(CHROMA_PATH)

# initialize Chroma db and save locally
db = Chroma.from_documents(
    documents=documents, embedding=embedding_func, persist_directory=CHROMA_PATH
    )

db.persist()

# print message
print(f"Saved {len(documents)} chunks to {CHROMA_PATH}.")

Saved 3487 chunks to chroma_db.


#### Query Vector Database

In [14]:
# query vector db
query = "What is the purpose of the Federal Acquisition Regulations?"
matching_docs = db.similarity_search_with_relevance_scores(
    query=query, 
    k=4, # number of docs to return
    #score_threshold=.5,
    #filter=[{"":""}]
    )

matching_docs

[(Document(page_content='1.101 Purpose.\n\nThe Federal Acquisition Regulations System is established for the codification and publication of uniform policies and procedures for acquisition by all executive agencies. The Federal Acquisition Regulations System consists of the Federal Acquisition Regulation (FAR), which is the primary document, and agency acquisition regulations that implement or supplement the FAR. The FAR System does not include internal agency guidance of the type described in 1.301(a)(2).\n\nSubpart 1.1 - Purpose, Authority, Issuance', metadata={'source': 'data/html/1.101.html'}),
  0.754029959492634),
 (Document(page_content='1.000 Scope of part.\n\nThis part sets forth basic policies and general information about the Federal Acquisition Regulations System including purpose, authority, applicability, issuance, arrangement, numbering, dissemination, implementation, supplementation, maintenance, administration, and deviation. subparts\xa0 1.2,1.3, and 1.4 prescribe adm
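
The number paired with each document is a relevance score, where higher means a closer match to the query. As an illustration of how two embedding vectors can be compared, the sketch below computes cosine similarity on made-up three-dimensional vectors. The exact formula behind `similarity_search_with_relevance_scores` depends on the distance metric the collection is configured with, so this is an assumption-laden illustration rather than a description of Chroma's scoring.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return (index, score) pairs for the k most similar vectors."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy 3-dimensional "embeddings" (made up for illustration)
doc_vecs = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
query_vec = [1.0, 0.0, 0.0]
ranking = top_k(query_vec, doc_vecs, k=2)
print(ranking)
```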

### Query data from your database based on your prompt

In [None]:
import argparse
from dataclasses import dataclass
from langchain.vectorstores.chroma import Chroma
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

DATA_PATH = "data/html"
CHROMA_PATH = "chroma_db"
EMBED_MODEL = "all-MiniLM-L6-v2" 
PROMPT_TEMPLATE = """
Answer the question based only on the following context:

{context}

---

Answer the question based on the above context: {question}
"""


def main():
    # Create CLI.
    parser = argparse.ArgumentParser()
    parser.add_argument("query_text", type=str, help="The query text.")
    args = parser.parse_args()
    query_text = args.query_text

    # Prepare the DB: reopen the persisted Chroma store instead of re-embedding.
    embedding_function = SentenceTransformerEmbeddings(model_name=EMBED_MODEL)
    db = Chroma(persist_directory=CHROMA_PATH, embedding_function=embedding_function)

    # Search the DB.
    results = db.similarity_search_with_relevance_scores(query_text, k=3)
    if len(results) == 0 or results[0][1] < 0.7:
        print("Unable to find matching results.")
        return

    context_text = "\n\n---\n\n".join([doc.page_content for doc, _score in results])
    prompt_template = ChatPromptTemplate.from_template(PROMPT_TEMPLATE)
    prompt = prompt_template.format(context=context_text, question=query_text)
    print(prompt)

    model = ChatOpenAI()
    response_text = model.predict(prompt)

    sources = [doc.metadata.get("source", None) for doc, _score in results]
    formatted_response = f"Response: {response_text}\nSources: {sources}"
    print(formatted_response)


if __name__ == "__main__":
    main()
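
The script above only checks the best score against 0.7 and then uses all returned documents. A possible variation, sketched here, drops every result below the threshold before building the context. `Doc` is a stand-in dataclass rather than LangChain's `Document`, the helper names are hypothetical, and the scores are made up for the demo.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Stand-in for a retrieved document (not LangChain's Document class)."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def keep_relevant(results, threshold=0.7):
    """Drop (doc, score) pairs whose score falls below the threshold."""
    return [(doc, score) for doc, score in results if score >= threshold]

def build_context(results):
    """Join page contents with the script's separator and list the sources."""
    context = "\n\n---\n\n".join(doc.page_content for doc, _ in results)
    sources = [doc.metadata.get("source") for doc, _ in results]
    return context, sources

# Made-up scores for two documents from the corpus
results = [
    (Doc("1.101 Purpose.", {"source": "data/html/1.101.html"}), 0.75),
    (Doc("46.708 Warranties of data.", {"source": "data/html/46.708.html"}), 0.41),
]
context, sources = build_context(keep_relevant(results))
print(sources)
```

Filtering every result keeps weakly related passages out of the prompt, at the risk of returning very little context when the threshold is strict.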

### Pass the augmented prompt to the chat model

In [None]:
# create a new user prompt
prompt = HumanMessage(
    content=augment_prompt(query)
)
# add to messages
messages.append(prompt)

res = chat(messages)

print(res.content)
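
`augment_prompt` is not defined anywhere in this notebook. The sketch below shows what such a helper might look like, assuming it retrieves passages for the query and wraps them in the same "Using the contexts below" template used earlier. The `search` parameter is a hypothetical callable introduced here so the example runs without a vector database, and the query string is invented for the demo.

```python
def augment_prompt(query, search, k=3):
    """Build an augmented prompt from the k passages returned by `search`.

    `search(query, k)` is a hypothetical callable that returns a list of strings.
    """
    passages = search(query, k)
    source_knowledge = "\n".join(passages)
    return (
        "Using the contexts below, answer the query.\n\n"
        f"Contexts:\n{source_knowledge}\n\n"
        f"Query: {query}"
    )

def fake_search(query, k):
    """Stub retriever so the sketch runs without a vector database."""
    passages = ["1.101 Purpose.", "1.000 Scope of part.", "46.708 Warranties of data."]
    return passages[:k]

augmented = augment_prompt("What is the purpose of the FAR?", fake_search, k=2)
print(augmented)
```

In the notebook, `search` could wrap the Chroma store, for example `lambda q, k: [d.page_content for d in db.similarity_search(q, k=k)]`. The cells here call `augment_prompt` with a single argument, so the retriever would need to be bound in advance, for instance with `functools.partial`.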

### Human evaluation of RAG model
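The two cells below ask the same question without and with retrieved context, so the answers can be compared by hand. Other evaluation methods could be added here.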

In [None]:
prompt = HumanMessage(
    content="what safety measures were used in the development of llama 2?"
)

res = chat(messages + [prompt])
print(res.content)

In [None]:
prompt = HumanMessage(
    content=augment_prompt(
        "what safety measures were used in the development of llama 2?"
    )
)

res = chat(messages + [prompt])
print(res.content)

## References

https://github.com/pinecone-io/examples/blob/master/learn/generation/langchain/rag-chatbot.ipynb <br>
https://github.com/pixegami/langchain-rag-tutorial/tree/main <br>
https://www.youtube.com/watch?v=LhnCsygAvzY <br>
https://www.youtube.com/watch?v=tcqEUSNCn8I
