# RAG - Retrieval-Augmented Generation

RAG extends an LLM to a specific domain or an organization's internal knowledge base without retraining the model. Relevant documents are retrieved at query time and passed to the model as part of the prompt, which makes RAG a cost-effective way to keep LLM output relevant and accurate.

## How RAG works

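* Create external data: split the source documents into chunks and store an embedding vector for each chunk
* Retrieve relevant information: find the chunks whose vectors are closest to the query
* Augment the LLM prompt: add the retrieved chunks to the prompt as context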
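
The three steps above can be sketched without any libraries. This toy example is only an illustration: it replaces a real embedding model with simple word-overlap scoring, the three passages are made up, and the names `tokenize`, `score`, `retrieve`, and `build_prompt` are hypothetical helpers rather than part of LangChain.

```python
import re

# Step 1: "external data", here a tiny in-memory knowledge base.
knowledge_base = [
    "The Nautilus is a submarine commanded by Captain Nemo.",
    "Professor Aronnax is a French marine biologist.",
    "Ned Land is a Canadian harpooner.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def score(query: str, passage: str) -> int:
    """Toy relevance score: number of words shared by query and passage."""
    return len(tokenize(query) & tokenize(passage))


def retrieve(query: str, k: int = 1) -> list[str]:
    """Step 2: return the k passages with the highest overlap score."""
    ranked = sorted(knowledge_base, key=lambda p: score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(query: str) -> str:
    """Step 3: augment the prompt with the retrieved passages."""
    context = "\n".join(retrieve(query))
    return f"Context: {context}\nQuestion: {query}"


print(build_prompt("What is the Nautilus?"))
```

In the rest of this notebook, the word-overlap score is replaced by embedding vectors and the in-memory list by a vector store.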

In [7]:
import os
import urllib.request
from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings.sentence_transformer import (
    SentenceTransformerEmbeddings,
)
from langchain.schema.output_parser import StrOutputParser
from langchain.globals import set_llm_cache
from langchain.cache import SQLiteCache
from langchain_community.vectorstores import SQLiteVSS
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.chat_models import ChatOllama
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnablePassthrough

### Configure embeddings model and the vector store

In [8]:
HOST = "http://192.168.178.84:11434"
MODEL = "llama2:7b"

In [9]:
embeddings_model = OllamaEmbeddings(base_url=HOST, model=MODEL)

### Get the book Twenty Thousand Leagues Under the Sea

In [3]:
url = 'https://www.gutenberg.org/cache/epub/164/pg164.txt'
filename = '../data/twenty-thousand-leagues-under-the-sea.txt'
os.makedirs(os.path.dirname(filename), exist_ok=True)  # make sure ../data exists
urllib.request.urlretrieve(url, filename)

('../data/twenty-thousand-leagues-under-the-sea.txt',
 <http.client.HTTPMessage at 0x1169b6590>)

In [5]:
loader = TextLoader(filename)
documents = loader.load()

In [10]:
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
texts = [doc.page_content for doc in docs]

Created a chunk of size 1076, which is longer than the specified 1000
Created a chunk of size 1042, which is longer than the specified 1000
Created a chunk of size 1378, which is longer than the specified 1000
Created a chunk of size 1094, which is longer than the specified 1000
Created a chunk of size 1127, which is longer than the specified 1000
Created a chunk of size 1057, which is longer than the specified 1000
Created a chunk of size 1240, which is longer than the specified 1000
Created a chunk of size 1034, which is longer than the specified 1000
Created a chunk of size 1084, which is longer than the specified 1000
Created a chunk of size 1048, which is longer than the specified 1000
Created a chunk of size 1149, which is longer than the specified 1000
Created a chunk of size 1289, which is longer than the specified 1000
Created a chunk of size 1091, which is longer than the specified 1000
Created a chunk of size 1016, which is longer than the specified 1000
[output truncated]
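
The warnings above appear because `CharacterTextSplitter` splits on a separator (a blank line by default) and merges the resulting pieces up to `chunk_size`; a single paragraph that is already longer than `chunk_size` is kept whole. The following is a simplified sketch of that behavior, not LangChain's actual implementation, and `split_on_separator` is a hypothetical name.

```python
def split_on_separator(text: str, chunk_size: int, separator: str = "\n\n") -> list[str]:
    """Greedily merge separator-delimited pieces into chunks of at most chunk_size.

    A single piece longer than chunk_size is kept whole, which is why
    oversized chunks can appear.
    """
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = piece  # may itself exceed chunk_size
    if current:
        chunks.append(current)
    return chunks


# Three paragraphs; the middle one (30 characters) is longer than chunk_size.
sample = "short one\n\n" + "x" * 30 + "\n\nshort two"
chunks = split_on_separator(sample, chunk_size=20)
print([len(c) for c in chunks])  # → [9, 30, 9]
```

If a strict upper bound on chunk length matters, one option is LangChain's `RecursiveCharacterTextSplitter`, which falls back to smaller separators when a piece is too long.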

### Create embedding vectors 

All vectors are stored with sqlite-vss in a table named `twenty_thousand_leagues_under_the_sea`.
The `db_file` parameter is the path of the file to use as the SQLite database.

In [11]:
vector_db = SQLiteVSS.from_texts(
    texts=texts,
    embedding=embeddings_model,
    table="twenty_thousand_leagues_under_the_sea",
    db_file="/tmp/vss.db",
)

KeyboardInterrupt: 

### Query the database

In [None]:
query = "What is the Nautilus?"
data = vector_db.similarity_search(query)
data[0].page_content
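
Conceptually, a similarity search embeds the query and returns the stored chunks whose vectors are closest to it. The sketch below illustrates the idea with hand-written 3-dimensional vectors and cosine similarity. Both are assumptions for illustration: real embeddings have many more dimensions, the distance metric that sqlite-vss actually uses is not stated here, and `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], stored: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k stored vectors most similar to query_vec."""
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Made-up (text, vector) pairs standing in for embedded chunks.
stored = [
    ("chunk about the submarine", [0.9, 0.1, 0.0]),
    ("chunk about the ocean", [0.5, 0.5, 0.1]),
    ("chunk about Paris", [0.0, 0.1, 0.9]),
]
query_vec = [1.0, 0.0, 0.0]

print(top_k(query_vec, stored, k=2))
```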

### Basic retrieval

In [None]:
retriever = vector_db.as_retriever(search_kwargs={"k": 3})

In [None]:
prompt_template = """
Answer the question based only on the supplied context. If you don't know the answer, say you don't know the answer.
Context: {context}
Question: {question}
Your answer:
"""
prompt = ChatPromptTemplate.from_template(prompt_template)
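
To see what the model eventually receives, the placeholders can be filled by hand. This sketch uses plain `str.format` instead of `ChatPromptTemplate`, so it only approximates the final message; the context sentence is a made-up example, and `template_text` and `filled` are names introduced here.

```python
# Same wording as the notebook's template, kept under a separate name
# so the notebook's own prompt_template is not overwritten.
template_text = """
Answer the question based only on the supplied context. If you don't know the answer, say you don't know the answer.
Context: {context}
Question: {question}
Your answer:
"""

# Fill both placeholders with illustrative values.
filled = template_text.format(
    context="The Nautilus is a submarine commanded by Captain Nemo.",
    question="What is the Nautilus?",
)
print(filled)
```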

In [None]:
llm = ChatOllama(base_url=HOST, model=MODEL, temperature=0)
set_llm_cache(SQLiteCache(database_path=".langchain.db"))

In [None]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

chain.invoke(
    "In the given context, what is the Nautilus?"
)
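
In the chain above, the retriever returns a list of document objects, and that list is placed into `{context}` as-is. A common refinement is to join only the text of the retrieved chunks first. The sketch below shows such a helper; `FakeDocument` is a stand-in for LangChain's document class that models only `page_content`, and `format_docs` is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class FakeDocument:
    """Stand-in for a LangChain document; only page_content is modelled."""
    page_content: str


def format_docs(docs: list[FakeDocument]) -> str:
    """Join the text of the retrieved chunks, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


retrieved = [FakeDocument("First chunk."), FakeDocument("Second chunk.")]
print(format_docs(retrieved))
```

Assuming the helper is defined in the notebook, it should be possible to wire it in as `{"context": retriever | format_docs, "question": RunnablePassthrough()}`, so the prompt contains plain text rather than the printed representation of a list of documents.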