# RAG - Retrieval-Augmented Generation

RAG extends an LLM to a specific domain or an organization's internal knowledge base without retraining the model. It is a cost-effective way to keep LLM output relevant and accurate, because the domain knowledge lives in documents that are retrieved at query time rather than in the model weights.

## How RAG works

* Create external data
* Retrieve relevant information
* Augment the LLM prompt
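
The three steps above can be sketched without any model at all. This is a minimal illustration, not the notebook's pipeline: the `KNOWLEDGE_BASE` sentences are made up, and `embed` is a toy word-count vector standing in for a real embeddings model.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base standing in for "external data".
KNOWLEDGE_BASE = [
    "The cafeteria opens at 8 am and closes at 3 pm.",
    "Expense reports are due on the fifth day of each month.",
    "The VPN requires a hardware token issued by the IT desk.",
]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Step 1: create external data (embed every document once).
INDEX = [(doc, embed(doc)) for doc in KNOWLEDGE_BASE]


def retrieve(question: str, k: int = 1) -> list:
    """Step 2: return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def augment(question: str) -> str:
    """Step 3: place the retrieved text into the prompt sent to the LLM."""
    context = "\n".join(retrieve(question))
    return f"Context: {context}\nQuestion: {question}\nYour answer:"


print(augment("When are expense reports due?"))
```

The rest of the notebook follows the same shape, with Ollama embeddings in place of `embed` and a FAISS index in place of the `INDEX` list.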

In [2]:
import os
import urllib.request
from langchain_community.document_loaders import TextLoader, WebBaseLoader
from langchain_community.embeddings.sentence_transformer import (
    SentenceTransformerEmbeddings,
)
from langchain.schema.output_parser import StrOutputParser
from langchain.globals import set_llm_cache
from langchain.cache import SQLiteCache
from langchain_community.vectorstores import SQLiteVSS
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.chat_models import ChatOllama
from langchain_community.vectorstores import FAISS
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnablePassthrough
from IPython.display import display, Markdown, JSON

### Configure embeddings model and the vector store

In [3]:
HOST = "http://localhost:11434"
LLM_MODEL = "gemma:7b"
EMBEDDINGS_MODEL = "nomic-embed-text:latest"
llm = ChatOllama(base_url=HOST, model=LLM_MODEL, temperature=0)
embeddings_model = OllamaEmbeddings(base_url=HOST, model=EMBEDDINGS_MODEL)

### Get the book Twenty Thousand Leagues Under the Sea

In [4]:
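url = 'https://www.gutenberg.org/cache/epub/164/pg164.txt'
filename = '../data/twenty-thousand-leagues-under-the-sea.txt'
# Make sure the target directory exists before downloading.
os.makedirs(os.path.dirname(filename), exist_ok=True)
urllib.request.urlretrieve(url, filename)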

('../data/twenty-thousand-leagues-under-the-sea.txt',
 <http.client.HTTPMessage at 0x12b24af90>)

In [5]:
loader = TextLoader(filename)
documents = loader.load()

In [6]:
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
docs = text_splitter.split_documents(documents)
texts = [doc.page_content for doc in docs]

Created a chunk of size 1076, which is longer than the specified 1000
Created a chunk of size 1042, which is longer than the specified 1000
Created a chunk of size 1378, which is longer than the specified 1000
Created a chunk of size 1094, which is longer than the specified 1000
Created a chunk of size 1127, which is longer than the specified 1000
Created a chunk of size 1057, which is longer than the specified 1000
Created a chunk of size 1240, which is longer than the specified 1000
Created a chunk of size 1034, which is longer than the specified 1000
Created a chunk of size 1084, which is longer than the specified 1000
Created a chunk of size 1048, which is longer than the specified 1000
Created a chunk of size 1149, which is longer than the specified 1000
Created a chunk of size 1289, which is longer than the specified 1000
Created a chunk of size 1091, which is longer than the specified 1000
Created a chunk of size 1016, which is longer than the specified 1000
Created a chunk of s
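
The warnings above are expected with `CharacterTextSplitter`: it splits on a single separator (a blank line by default) and merges the pieces up to `chunk_size`, so a paragraph that is already longer than 1000 characters cannot be cut any further. The sketch below imitates that behaviour in plain Python. It is a simplification (no overlap handling), and `split_on_separator` is a hypothetical helper, not a LangChain function.

```python
def split_on_separator(text: str, chunk_size: int, separator: str = "\n\n") -> list:
    """Merge separator-delimited pieces into chunks of at most chunk_size characters.

    A single piece longer than chunk_size is kept whole, because this sketch
    has no finer boundary to cut at.
    """
    chunks = []
    current = ""
    for piece in text.split(separator):
        if not piece:
            continue
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size or not current:
            # Either the merged chunk still fits, or this is the first piece
            # of a new chunk and must be accepted even if it is oversized.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "short one\n\n" + "x" * 30 + "\n\nshort two\n\nshort three"
chunks = split_on_separator(sample, chunk_size=25)
for chunk in chunks:
    flag = " (longer than chunk_size)" if len(chunk) > 25 else ""
    print(f"{len(chunk)}{flag}")
```

If hard size limits matter, the already imported `RecursiveCharacterTextSplitter` is the usual alternative, since it falls back to finer separators when a piece is too long.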

### Create embedding vectors 

We embed every chunk with the embeddings model and store the resulting vectors in a FAISS index.
`FAISS.from_documents` builds the index in memory, so nothing is written to disk unless you save it explicitly.

In [29]:
vector_db = FAISS.from_documents(
    documents=docs, 
    embedding=embeddings_model
)
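
Conceptually, a flat vector index stores one vector per chunk and answers a query by comparing it against every stored vector. The sketch below shows that idea with exhaustive L2 search. It assumes the index is of the flat, exhaustive kind, which is believed to be the LangChain FAISS default but is not confirmed here. `FlatIndex` and the 2-dimensional vectors are hypothetical; real embeddings have hundreds of dimensions.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class FlatIndex:
    """Exhaustive nearest-neighbour search over stored vectors."""

    def __init__(self):
        self._entries = []  # list of (vector, payload) pairs

    def add(self, vector, payload):
        self._entries.append((list(vector), payload))

    def search(self, query, k=4):
        """Return the k (distance, payload) pairs closest to query."""
        scored = [(l2_distance(query, vec), payload) for vec, payload in self._entries]
        scored.sort(key=lambda pair: pair[0])
        return scored[:k]


index = FlatIndex()
index.add([1.0, 0.0], "submarine")
index.add([0.0, 1.0], "harpooner")
index.add([0.7, 0.7], "frigate")

result = index.search([0.9, 0.1], k=2)
for distance, payload in result:
    print(f"{payload}: {distance:.3f}")
```

Exhaustive search is exact but scales linearly with the number of chunks, which is fine for a single book.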

### Query the database

In [30]:
query = "What is the Nautilus?"
docs = vector_db.similarity_search(query)

In [31]:
for doc in docs:
    print(doc.page_content)

I cast a last look at the man-of-war, which was putting on steam, and
rejoined Ned and Conseil.

“We will fly!” I exclaimed.

“Good!” said Ned. “What is this vessel?”

“I do not know; but, whatever it is, it will be sunk before night. In
any case, it is better to perish with it, than be made accomplices in a
retaliation the justice of which we cannot judge.”

“That is my opinion too,” said Ned Land, coolly. “Let us wait for
night.”
Thus the _Abraham Lincoln_ wanted for no means of destruction; and,
what was better still, she had on board Ned Land, the prince of
harpooners.

Ned Land was a Canadian, with an uncommon quickness of hand, and who
knew no equal in his dangerous occupation. Skill, coolness, audacity,
and cunning he possessed in a superior degree, and it must be a cunning
whale or a singularly “cute” cachalot to escape the stroke of his
harpoon.

Ned Land was about forty years of age; he was a tall man (more than six
feet high), strongly built, grave and taciturn, occasionally

### Basic retrieval

In [32]:
retriever = vector_db.as_retriever()

In [33]:
# invoke() is the Runnable entry point; get_relevant_documents() is deprecated in newer LangChain releases.
docs = retriever.invoke("What is Ned's last name")

In [34]:
for doc in docs:
    print(doc.page_content)

Thus the _Abraham Lincoln_ wanted for no means of destruction; and,
what was better still, she had on board Ned Land, the prince of
harpooners.

Ned Land was a Canadian, with an uncommon quickness of hand, and who
knew no equal in his dangerous occupation. Skill, coolness, audacity,
and cunning he possessed in a superior degree, and it must be a cunning
whale or a singularly “cute” cachalot to escape the stroke of his
harpoon.

Ned Land was about forty years of age; he was a tall man (more than six
feet high), strongly built, grave and taciturn, occasionally violent,
and very passionate when contradicted. His person attracted attention,
but above all the boldness of his look, which gave a singular
expression to his face.
“And why this powerful organisation?” demanded Ned.
I cast a last look at the man-of-war, which was putting on steam, and
rejoined Ned and Conseil.

“We will fly!” I exclaimed.

“Good!” said Ned. “What is this vessel?”

“I do not know; but, whatever it is, it will be s

In [35]:
print(len(docs))

4


In [36]:
prompt_template = """
Answer the question based only on the supplied context. If you don't know the answer, say you don't know the answer.
Context: {context}
Question: {question}
Your answer:
"""
prompt = ChatPromptTemplate.from_template(prompt_template)
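
The template has two placeholders, `{context}` and `{question}`. A plain-Python sketch of the substitution, with the retrieved chunks joined into readable text first, might look like this. `format_docs` and `build_prompt` are hypothetical helpers that are not part of the notebook, and the two chunks are shortened excerpts from the book.

```python
PROMPT_TEMPLATE = """
Answer the question based only on the supplied context. If you don't know the answer, say you don't know the answer.
Context: {context}
Question: {question}
Your answer:
"""


def format_docs(chunks):
    """Join chunk texts with blank lines so the model sees plain prose."""
    return "\n\n".join(chunk.strip() for chunk in chunks)


def build_prompt(chunks, question):
    """Fill both placeholders of the template."""
    return PROMPT_TEMPLATE.format(context=format_docs(chunks), question=question)


chunks = [
    "Ned Land was a Canadian, with an uncommon quickness of hand.",
    "Ned Land was about forty years of age; he was a tall man.",
]
prompt_text = build_prompt(chunks, "What is Ned's last name")
print(prompt_text)
```

Joining the chunk texts explicitly keeps object representations and metadata out of the prompt, which can make the context easier for a small model to read.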

In [39]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
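
The `|` operator composes steps left to right, and a dict on the left fans the same input out to each of its values. The toy below imitates that behaviour in plain Python to make the data flow visible. It is a sketch of the idea, not LangChain's implementation: `Step`, `as_step`, and all the `fake_*` objects are hypothetical.

```python
class Step:
    """A callable wrapper that can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        nxt = as_step(other)
        return Step(lambda value: nxt.invoke(self.invoke(value)))

    def __ror__(self, other):
        # Called for `dict | Step`, because dict does not know how to combine with Step.
        return as_step(other) | self


def as_step(obj):
    """Coerce a Step, a dict of steps, or a plain callable into a Step."""
    if isinstance(obj, Step):
        return obj
    if isinstance(obj, dict):
        branches = {key: as_step(val) for key, val in obj.items()}
        # Every branch receives the same input; results are collected by key.
        return Step(lambda value: {key: b.invoke(value) for key, b in branches.items()})
    if callable(obj):
        return Step(obj)
    raise TypeError(f"cannot use {type(obj).__name__} as a step")


fake_retriever = Step(lambda question: ["Ned Land was a Canadian harpooner."])
passthrough = Step(lambda question: question)
fake_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_llm = Step(lambda text: "stub reply to: " + text.splitlines()[-1])

toy_chain = {"context": fake_retriever, "question": passthrough} | fake_prompt | fake_llm
print(toy_chain.invoke("Who is Ned?"))
```

Note that the retriever's raw list lands in the prompt as-is. In the real chain that list holds document objects rather than strings, so joining their text before it reaches the prompt can be worth trying.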

In [41]:
result = chain.invoke("What is the Nautilus?")
display(Markdown(result))

The text does not mention the Nautilus, therefore I cannot answer the question. The text does not describe the Nautilus.
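
The refusal is consistent with the chunks printed earlier for the same query: none of the visible text contains the word "Nautilus", and the prompt tells the model to answer only from the context. A quick way to check whether retrieval, rather than the model, is the weak link is to test the retrieved chunks for the key term. `chunks_mentioning` is a hypothetical helper, and the two strings are excerpts copied from the search output shown above.

```python
def chunks_mentioning(term, chunks):
    """Return the chunks that contain term, ignoring case."""
    needle = term.lower()
    return [chunk for chunk in chunks if needle in chunk.lower()]


retrieved = [
    "“Good!” said Ned. “What is this vessel?”",
    "Thus the _Abraham Lincoln_ wanted for no means of destruction; and, "
    "what was better still, she had on board Ned Land, the prince of harpooners.",
]

print(len(chunks_mentioning("Nautilus", retrieved)))
print(len(chunks_mentioning("Ned", retrieved)))
```

When the key term is absent from every retrieved chunk, changing the chunking, the embeddings model, or the number of retrieved chunks is more likely to help than rewording the prompt.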