# RAG - Retrieval Augmented Generation

A RAG system retrieves relevant information from a knowledge base and uses it to generate a response that is relevant to the user's query. This notebook demonstrates the retrieval side of that pipeline: loading a document, chunking it, embedding the chunks, storing them in a vector database, and retrieving the chunks that match a query.

## Requirements

Before we start, make sure an embedding model is available for the RAG system.
For this tutorial we will use the `nomic-embed-text` model.

In your terminal, run the following command:

```bash
ollama pull nomic-embed-text
```

Verify that the model is downloaded by running the following command:

```bash
ollama list
```


First we need a document to work with. We will use a code of conduct as our document.
Options for tools to load documents include:
- `PyPDFLoader`
- `WebBaseLoader`
- `ObsidianLoader` 
- `RedditPostsLoader`
- `RecursiveUrlLoader`
- `TextLoader`

In [1]:
from langchain_community.document_loaders import WebBaseLoader

urls = [
    "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/6JDbUb_L3egv_eOkouY71A.txt"
]
loader = WebBaseLoader(urls)
documents = loader.load()
print(documents[0].page_content[:100])


USER_AGENT environment variable not set, consider setting it to identify your requests.


1.	Code of Conduct

Our Code of Conduct outlines the fundamental principles and ethical standards th


# Documents
Loaders return a list of documents. Each document is a `Document` object with the following attributes:
- `page_content`: the text of the document
- `metadata`: a dictionary with additional information about the document

The `Document` object is the most important object in the RAG pipeline. Its `page_content` is what gets chunked and embedded, and its `metadata` travels along with each chunk.

# Chunking
After loading our documents, we can chunk them into smaller documents. This is done with a text splitter class from the `langchain_text_splitters` package, where we can specify the chunk size and the chunk overlap. More can be found under `langchain_classic.text_splitter`.

Notable splitters are:
- `RecursiveCharacterTextSplitter`
- `MarkdownTextSplitter`
- `TextSplitter`
- `CharacterTextSplitter`
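
To make chunk size and chunk overlap concrete, here is a minimal pure-Python sketch of a fixed-size sliding window. The `chunk_text` helper is hypothetical and is not how the LangChain splitters work internally (they split on separators first); it only illustrates how consecutive chunks share `chunk_overlap` characters.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character windows of chunk_size that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(pieces)
```

The last three characters of each chunk reappear at the start of the next one, so a sentence that straddles a boundary is still seen whole by at least one chunk.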

In [2]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

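splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    # Raw string so that \d and \s reach the regex engine unchanged.
    # Splits where blank lines are followed by a numbered heading such as "2."
    separators=[r"\n\n+\d+\.\s*"],
    is_separator_regex=True,
    keep_separator=False,
    strip_whitespace=True,
)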

chunks = splitter.split_documents(documents=documents)
print(chunks[6].page_content)

Health and Safety Policy

Our commitment to health and safety is paramount. We prioritize the well-being of our employees, customers, and the public. We diligently comply with all relevant health and safety laws and regulations. Our objective is to maintain a workplace free from hazards, preventing accidents, injuries, and illnesses. Every individual within our organization is responsible for upholding these standards. We regularly assess and improve our safety measures, provide adequate training, and encourage open communication regarding safety concerns. Through collective dedication, we aim to ensure a safe, healthy, and secure environment for all. Your cooperation is essential in achieving this common goal.
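
To see why the chunk above starts at its heading without a number, the separator pattern can be tried on its own with the standard `re` module. This is a sketch on a made-up three-section sample (only the section titles are taken from the document), and it ignores the `chunk_size` and `chunk_overlap` handling that the real splitter applies.

```python
import re

# Same pattern as the splitter's separator: blank line(s), a number, a dot, optional whitespace.
SECTION_BREAK = r"\n\n+\d+\.\s*"

# Made-up sample text that mimics the numbered layout of the source document.
sample = (
    "1.\tCode of Conduct\n\nIntegrity comes first.\n\n"
    "2.\tRecruitment Policy\n\nWe hire fairly.\n\n"
    "3.\tHealth and Safety Policy\n\nSafety is paramount."
)

sections = [part.strip() for part in re.split(SECTION_BREAK, sample)]
for section in sections:
    print(repr(section))
```

The first section keeps its `1.` prefix because no blank line precedes it, which matches the first chunk printed earlier in this notebook.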


# Embeddings Model
Before we can use our vector store, we need an embeddings model to convert our text into vector representations. We will use the `nomic-embed-text` model from Ollama.

Other embedding models available from Ollama include:
- `qwen3-embedding`
- `embeddinggemma`
- `snowflake-arctic-embed`

OpenAI also has its own embeddings models, which can be used with an OpenAI API key.
More details can be found here: https://docs.langchain.com/oss/python/integrations/text_embedding/openai

In [3]:
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="nomic-embed-text")
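
To illustrate what a vector representation buys us, the sketch below compares made-up 3-dimensional vectors with cosine similarity. The vectors and their labels are invented for illustration (real `nomic-embed-text` vectors have many more dimensions), and cosine similarity is only one common measure; the vector store may use a different distance by default.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for real embeddings.
toy_vectors = {
    "recruitment policy": [0.9, 0.1, 0.0],
    "hiring process": [0.8, 0.2, 0.1],
    "fire safety": [0.0, 0.2, 0.9],
}

query_vector = toy_vectors["recruitment policy"]
for text, vector in toy_vectors.items():
    print(f"{text!r}: {cosine_similarity(query_vector, vector):.3f}")
```

Texts with related meaning end up pointing in similar directions, so their score is close to 1, while unrelated texts score near 0.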

# Vector Database
With our documents chunked and our embeddings model ready, we can now create a vector database. We will use the `ChromaDB` library for this.

The vector database can either persist the embeddings to disk or keep them in memory for the duration of the run. We will use the in-memory version for this example.
If you'd like a persistent vector database, set `persist_directory` to a folder path.


In [4]:
from langchain_chroma import Chroma
vectorstore = Chroma(
    collection_name="elden_ring_docs",
    embedding_function=embeddings,
    persist_directory=None,  # None keeps the store in memory; pass a folder path to persist it
)

# Adding Documents to the Knowledge Base

In [5]:
vectorstore.add_documents(chunks)

['693f0ba7-fd4f-4154-8ff3-87ce03127d58',
 '149f894a-5a66-456e-9826-05e5c475d82e',
 '095aaed4-45f9-49aa-8cb0-e0289a08439e',
 'cbcdfb88-ef0f-4424-ab7d-1e5c617acbc8',
 '87f0375f-2ad9-406a-8a12-3e018f0fbd39',
 '4d91cf8b-e571-41eb-8660-5dbb517b597e',
 '8b4cfa26-fbb1-4de4-bc72-a041e76d9a09',
 '3f82b4c8-f970-4c65-af92-93c1a02443f7',
 '5bbd11ab-3937-4fc5-8a00-e42ce12a62ce']

# Creating a Retriever
To get the chunks of text that match a query, we need to create a retriever. This can easily be done with the `as_retriever` method of the vector store object.

In [6]:
sim_retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 5})

results = sim_retriever.invoke("Recruitment Policy")

def print_results(results):
    for result in results:
        print("-"*100)
        print("page content:\n", result.page_content[:250])
        print("metadata:", result.metadata)
        
print_results(results)

----------------------------------------------------------------------------------------------------
page content:
 1.	Code of Conduct

Our Code of Conduct outlines the fundamental principles and ethical standards that guide every member of our organization. We are committed to maintaining a workplace that is built on integrity, respect, and accountability.
Integr
metadata: {'source': 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/6JDbUb_L3egv_eOkouY71A.txt'}
----------------------------------------------------------------------------------------------------
page content:
 Health and Safety Policy

Our commitment to health and safety is paramount. We prioritize the well-being of our employees, customers, and the public. We diligently comply with all relevant health and safety laws and regulations. Our objective is to m
metadata: {'source': 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/6JDbUb_L3egv_eOkouY71A.txt'}
------------------------------

# Using a BM25 Retriever
We can see that the similarity retriever by itself is not very useful in this case. This is likely because the embedding space places these chunks close together semantically, whereas our query is essentially an exact keyword match. We can use a BM25 retriever to improve the results.

BM25 is a popular retrieval model that ranks documents based on the keywords they contain. This is basically like Ctrl+F, but it also takes into account:
- the frequency of the keyword in the document (with diminishing returns, so each additional occurrence counts for less)
- the rarity of the keyword across the other documents
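
The two factors above can be sketched in a few lines of plain Python. This is one common BM25 variant, not necessarily the exact formula used by the retriever in the next cell: the `bm25_scores` helper, the whitespace tokenization, the `k1` and `b` defaults, and the sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with a simple BM25 variant."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            # Rarity: terms found in few documents get a larger weight.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Frequency: saturates as tf grows, adjusted for document length.
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


# Made-up one-line documents.
docs = [
    "recruitment policy for hiring qualified candidates",
    "health and safety policy for the workplace",
    "code of conduct and ethical standards",
]
scores = bm25_scores("recruitment policy", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

The first document wins because it contains both query words, and the rarer word `recruitment` contributes more than `policy`, which also appears in the second document.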


In [7]:
from langchain_community.retrievers import BM25Retriever  # requires the rank_bm25 package

bm25_retriever = BM25Retriever.from_documents(chunks)
bm25_retriever.k = 5  # return the top k document chunks


# Combining the BM25 Retriever and the Vector Similarity Retriever
With our BM25 retriever set up, we can now combine it with our vector similarity retriever.
The combined retriever queries both and merges their results into a single ranked list.
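
`EnsembleRetriever` is documented as fusing the individual rankings with weighted Reciprocal Rank Fusion. The sketch below assumes that scheme with a smoothing constant of 60; the `weighted_rrf` helper, the document ids, the rankings, and the 0.7/0.3 weights are all made up for illustration.

```python
from collections import defaultdict


def weighted_rrf(rankings: list[list[str]], weights: list[float], c: int = 60) -> list[str]:
    """Fuse several ranked lists: each list adds weight / (rank + c) to a document's score."""
    scores: dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (rank + c)
    return sorted(scores, key=scores.get, reverse=True)


# Made-up rankings: the keyword retriever and the vector retriever disagree.
bm25_ranking = ["recruitment", "safety", "conduct"]
vector_ranking = ["conduct", "safety", "recruitment"]

fused = weighted_rrf([bm25_ranking, vector_ranking], weights=[0.7, 0.3])
print(fused)
```

Shifting weight toward one retriever pulls its top results up in the fused list, which is the lever the `weights` argument gives us.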

In [None]:
from langchain_classic.retrievers import EnsembleRetriever

ensemble_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, sim_retriever],
    weights=[0.5, 0.5],  # weight of each retriever, in the same order as above
)

results = ensemble_retriever.invoke("Recruitment Policy")
print_results(results)


----------------------------------------------------------------------------------------------------
page content:
 Recruitment Policy

Our Recruitment Policy reflects our commitment to attracting, selecting, and onboarding the most qualified and diverse candidates to join our organization. We believe that the success of our company relies on the talents, skills, 
metadata: {'source': 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/6JDbUb_L3egv_eOkouY71A.txt'}
----------------------------------------------------------------------------------------------------
page content:
 1.	Code of Conduct

Our Code of Conduct outlines the fundamental principles and ethical standards that guide every member of our organization. We are committed to maintaining a workplace that is built on integrity, respect, and accountability.
Integr
metadata: {'source': 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/6JDbUb_L3egv_eOkouY71A.txt'}
------------------------------

# RAG Conclusion
Just like that, we have completed the retrieval pipeline of our RAG implementation. We are now ready to use it to provide contextual information to our agents.
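
As a rough sketch of that last step, retrieved chunks can be folded into a prompt before it is handed to a model. Everything here is hypothetical: the `Doc` dataclass is a stand-in that only mirrors the `page_content` and `metadata` attributes, and `build_prompt` with its wording is one possible layout rather than a LangChain API.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in with the same two attributes as a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_prompt(question: str, docs: list[Doc], max_chars: int = 500) -> str:
    """Number each retrieved chunk and place it above the user's question."""
    context = "\n\n".join(
        f"[{i}] {doc.page_content[:max_chars]}" for i, doc in enumerate(docs, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# In the notebook, `retrieved` would be the list returned by the retriever.
retrieved = [Doc("Recruitment Policy\n\nOur Recruitment Policy reflects our commitment to attracting candidates.")]
prompt = build_prompt("What does the recruitment policy commit to?", retrieved)
print(prompt)
```

Truncating each chunk with `max_chars` is a simple guard against overly long prompts; the right limit depends on the model being used.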