# Retrieval Augmentation

**L**arge **L**anguage **M**odels (LLMs) have a data freshness problem. The most powerful LLMs in the world, like GPT-4, have no idea about recent world events.

An LLM's knowledge is frozen in time: it is a static snapshot of the world as captured in its training data.

A solution to this problem is *retrieval augmentation*. The idea behind this is that we retrieve relevant information from an external knowledge base and give that information to our LLM. In this notebook we will learn how to do that.

To begin, we must install the prerequisite libraries that we will be using in this notebook.

## Dependencies

In [None]:
# !pip install -qU langchain openai datasets transformers pypdf chromadb sentence-transformers

## Building the Knowledge Base from PDFs

#### 1. Split and Chunk Documents

Every document contains *a lot* of text. Our first task is therefore to find a good preprocessing method for splitting these documents into more "concise" chunks, which will later be embedded and stored in our vector database.

For this we use LangChain's `RecursiveCharacterTextSplitter` to split our text into chunks of a specified max length.

In [None]:
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [None]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,       # maximum chunk length, in characters
    chunk_overlap=50,     # overlap between consecutive chunks, in characters
    length_function=len,  # function used to measure the length of a chunk
)
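To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a plain sliding-window splitter. It is a simplification for illustration only: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraphs and sentences, which this sketch does not do. The helper name `split_with_overlap` and the sample text are assumptions made up for this example.

```python
def split_with_overlap(text, chunk_size=300, chunk_overlap=50):
    """Split text into fixed-size character windows that overlap (illustrative only)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical sample: 700 characters of repeating digits
sample = "".join(str(i % 10) for i in range(700))
pieces = split_with_overlap(sample, chunk_size=300, chunk_overlap=50)

for i, piece in enumerate(pieces):
    print(i, len(piece))

# The last 50 characters of one chunk are the first 50 of the next
print(pieces[0][-50:] == pieces[1][:50])
```

The overlap means a sentence that straddles a chunk boundary is still likely to appear whole in at least one chunk, at the cost of storing some text twice.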

In [None]:
# Load the PDF and split it into chunks (forward slashes also work on Windows)
filepath = "./demopdfs/generative-ai-report.pdf"  # or "./demopdfs/AI_Russell_Norvig.pdf"
loader = PyPDFLoader(filepath)
chunks = loader.load_and_split(text_splitter=text_splitter)

#### 2. Display chunks

In [None]:
for chunk in chunks[12:14]:
    print("Page_content:\n",chunk.page_content)
    print("Page_metadata:\n",chunk.metadata)
    print("----------------------------------------------------------------")

## Vector DB

#### 1. Load dependencies

In [None]:
from langchain.vectorstores import Chroma
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings

#### 2. Use an open source embedding model to embed the chunks

In [None]:
# create the open-source embedding function
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

In [None]:
# Embed a sample sentence; embed_documents expects a list of strings
embedding = embedding_function.embed_documents(["This is a test"])

print(embedding[0])
print("Dimensions of Embedding:", len(embedding[0]))  # 384 for all-MiniLM-L6-v2

#### 3. Upload to Vector DB

In [None]:
# load it into Chroma
db = Chroma.from_documents(chunks, embedding_function) #, persist_directory="./chroma_db")

In [None]:
print("Chunks in DB:",db._collection.count())

#### 4. Retrieval from Vector DB

In [None]:
# Using a retriever object

query = "what is generative ai?"

retriever = db.as_retriever()
retriever.get_relevant_documents(query)
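Conceptually, the retriever embeds the query and returns the chunks whose vectors lie closest to it. The sketch below illustrates that idea with hand-written three-dimensional toy vectors and cosine similarity. The vectors, chunk names, and helper functions are all hypothetical, and the distance metric Chroma actually uses depends on how the collection is configured, so treat this as an illustration of nearest-neighbour search rather than of Chroma's internals.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy embeddings (real ones have hundreds of dimensions)
toy_vectors = {
    "chunk_a": [0.9, 0.1, 0.0],
    "chunk_b": [0.0, 1.0, 0.0],
    "chunk_c": [0.7, 0.7, 0.0],
}
toy_query = [1.0, 0.0, 0.0]

nearest = top_k(toy_query, toy_vectors, k=2)
print(nearest)
```

Here `chunk_a` points in almost the same direction as the query, `chunk_c` is partly aligned, and `chunk_b` is orthogonal, so it is left out of the top two.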

## Setting up RAG

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQAWithSourcesChain

#### 1. Connect to LLM (OpenAI)

In [None]:
from utils import getkey

OPENAI_API_KEY = getkey(key='OPENAI')  # platform.openai.com
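The `utils` module is a local helper that is not shown in this notebook. If you do not have it, one possible stand-in is sketched below. It assumes the key is stored in an environment variable named `<KEY>_API_KEY` (for example `OPENAI_API_KEY`); that naming scheme and the function body are assumptions, not the original helper.

```python
import os


def getkey(key):
    """Hypothetical stand-in for utils.getkey: read '<KEY>_API_KEY' from the environment."""
    env_name = f"{key}_API_KEY"
    value = os.environ.get(env_name)
    if not value:
        raise KeyError(f"Environment variable {env_name} is not set")
    return value
```

With this stand-in defined in the notebook, `getkey(key='OPENAI')` returns the value of `OPENAI_API_KEY` and raises a `KeyError` if the variable is missing, so a misconfigured environment fails early rather than at the first API call.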

In [None]:
# Setup LLM connection
llm = ChatOpenAI(
    openai_api_key=OPENAI_API_KEY,
    model_name='gpt-3.5-turbo',
    # model_name = "gpt-4",
    temperature=0.0,
    max_tokens=100,
    # streaming=True,
)

#### 2. Connect Vector DB to LLM

In [None]:
# Connect VDB to LLM
qa_with_sources = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
)
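The `"stuff"` chain type simply "stuffs" every retrieved chunk into a single prompt and sends that prompt to the LLM in one call. The sketch below shows the idea in plain Python. The prompt wording, the `build_stuff_prompt` helper, and the toy documents are hypothetical and are not LangChain's actual template.

```python
def build_stuff_prompt(question, docs):
    """Concatenate every retrieved chunk into one prompt (hypothetical template)."""
    context_blocks = []
    for doc in docs:
        context_blocks.append(f"Content: {doc['text']}\nSource: {doc['source']}")
    context = "\n\n".join(context_blocks)
    return (
        "Answer the question using only the context below and cite the sources.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical retrieved chunks
toy_docs = [
    {"text": "Generative AI creates new content.", "source": "report.pdf, page 1"},
    {"text": "Adoption is growing across industries.", "source": "report.pdf, page 4"},
]

prompt = build_stuff_prompt("What is generative AI?", toy_docs)
print(prompt)
```

Because all chunks share one prompt, this approach only works while the combined text fits in the model's context window, which is one reason to keep chunks small.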

#### 3. RAG QA

In [None]:
# RAG QA
query = "What is the impact of generative AI?"
qa_with_sources(query)

---