## What is RAG?

RAG stands for Retrieval Augmented Generation.

The goal of RAG is to take information and pass it to an LLM so it can generate outputs based on that information.

* Retrieval - Find relevant information given a query, e.g. "What are the macronutrients and what do they do?" -> retrieves passages of text related to the macronutrients from a nutrition textbook.
* Augmented - Take the relevant information and add it to (augment) the input prompt for an LLM.
* Generation - Pass the augmented prompt to an LLM so it generates an output based on the retrieved information.
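
To make the three stages concrete, here is a toy sketch in plain Python. Everything in it is an assumption made for illustration: the `PASSAGES` list, the word-overlap scoring (a real system would use embeddings and vector search), and the `generate` function, which is only a stand-in and does not call an LLM.

```python
import re

# Hypothetical passages standing in for chunks of a nutrition textbook.
PASSAGES = [
    "Macronutrients are carbohydrates, proteins and fats; they provide energy.",
    "Vitamins and minerals are micronutrients needed in small amounts.",
    "Water makes up a large share of body weight.",
]


def _words(text: str) -> set[str]:
    """Lowercase the text and return its set of words (punctuation dropped)."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, passages: list[str], top_k: int = 1) -> list[str]:
    """Retrieval: rank passages by how many words they share with the query."""
    query_words = _words(query)
    ranked = sorted(
        passages,
        key=lambda passage: len(query_words & _words(passage)),
        reverse=True,
    )
    return ranked[:top_k]


def augment(query: str, context: list[str]) -> str:
    """Augmented: build a prompt that contains the retrieved passages."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return f"Context:\n{context_block}\n\nQuestion: {query}\nAnswer:"


def generate(prompt: str) -> str:
    """Generation: stand-in for a real LLM call, which this sketch does not make."""
    return f"[an LLM would answer here, given a {len(prompt)}-character prompt]"


query = "What are the macronutrients and what do they do?"
prompt = augment(query, retrieve(query, PASSAGES))
print(prompt)
print(generate(prompt))
```

The point is only the shape of the flow: retrieve, then build the prompt, then hand the prompt to the model.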

## Why RAG?

The main goal of RAG is to improve the quality of LLM outputs.

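1. Prevent hallucinations: LLMs are incredibly good at generating plausible text from a prompt, but that text may not be factual. Passing retrieved passages along with the prompt helps the model base its output on actual source material.
2. Work with custom data: RAG helps LLMs create specific responses based on specific documents.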

## What can RAG be used for?

* Project SME
* Textbook Q&A
* Analysis Tool

## Advantages of Local RAG

* Privacy
* Speed
* Cost
* No Vendor Lock-in - Not dependent on a company for our RAG to work.

In [3]:
!nvidia-smi

Sat Jun 15 16:14:15 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.52.01              Driver Version: 555.99         CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 3080 ...    On  |   00000000:01:00.0  On |                  N/A |
| N/A   49C    P8             19W /  150W |     118MiB /  16384MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

## We will be building an ML/DL expert AI.

1. Open PDF document
2. Format the text of the PDF textbook ready for an embedding model
3. Embed all of the chunks of text in the textbook, turning them into numerical representations which we can store for later.
4. Build a retrieval system that uses vector search to find relevant chunks of text based on a query.
5. Create a prompt that incorporates the retrieved pieces of text.
6. Generate an answer to a query based on the passages of the textbook with an LLM.

Steps 1-3: Document preprocessing and embedding creation.<br>
Steps 4-6: Search and answer
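
To give a feel for step 4, here is a minimal sketch of vector search using cosine similarity in plain Python. The 3-dimensional vectors are made up for illustration; real embeddings come from an embedding model and have hundreds of dimensions. `cosine_similarity` and `vector_search` are hypothetical helper names, not part of any library.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # avoid dividing by zero for an all-zero vector
    return dot / (norm_a * norm_b)


def vector_search(
    query_embedding: list[float],
    chunk_embeddings: list[list[float]],
    top_k: int = 2,
) -> list[tuple[int, float]]:
    """Return (chunk index, score) pairs for the top_k most similar chunks."""
    scores = [
        (index, cosine_similarity(query_embedding, embedding))
        for index, embedding in enumerate(chunk_embeddings)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Made-up embeddings for three chunks and one query.
chunk_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query_embedding = [0.9, 0.1, 0.0]

for index, score in vector_search(query_embedding, chunk_embeddings):
    print(f"chunk {index}: score {score:.3f}")
```

The indices returned by the search are what we would use in step 5 to look up the matching text chunks and place them in the prompt.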

## 1. Document preprocessing and embedding creation

* PDF Document or any file which contains text.
* Process text for embedding.
* Embed text chunks with embedding model.
* Save embeddings to a file for later.
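
As a rough sketch of the last three bullets, the snippet below splits text into chunks, "embeds" each chunk, and saves the result to a JSON file. Several things here are assumptions for illustration only: the regex sentence splitter, the chunk size of two sentences, the JSON file format, and above all `toy_embed`, which is a stand-in that returns two simple counts and is not semantically meaningful. A real pipeline would call an embedding model at that point.

```python
import json
import re
from pathlib import Path


def split_into_chunks(text: str, sentences_per_chunk: int = 2) -> list[str]:
    """Split text into sentences, then group them into fixed-size chunks."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def toy_embed(chunk: str) -> list[float]:
    """Stand-in for an embedding model: character count and word count."""
    return [float(len(chunk)), float(len(chunk.split()))]


def save_embeddings(chunks: list[str], path: str) -> None:
    """Store each chunk next to its embedding so both can be reloaded later."""
    records = [{"chunk": chunk, "embedding": toy_embed(chunk)} for chunk in chunks]
    Path(path).write_text(json.dumps(records), encoding="utf-8")


def load_embeddings(path: str) -> list[dict]:
    """Read the records written by save_embeddings."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

Keeping the chunk text alongside its embedding means the retrieval step can return readable passages and not just vectors.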

# Import PDF Document

In [None]:
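import os
import requests

# Get PDF document path
pdf_path = ""  # set this to the local path of the PDF

# Download the PDF only if it is not already on disk
if not os.path.exists(pdf_path):
    print("[INFO] File doesn't exist, downloading...")
    url = ""  # set this to the URL of the PDF
    filename = pdf_path
    response = requests.get(url)
    # A 200 status code means the request succeeded
    if response.status_code == 200:
        with open(filename, "wb") as file:
            file.write(response.content)
        print("[INFO] File downloaded!")
    else:
        print(f"[INFO] Failed to download the file! Status code: {response.status_code}")
else:
    print(f"[INFO] File {pdf_path} exists.")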
            

We have a PDF, let's open it!

In [3]:
import pymupdf as fitz
from tqdm.auto import tqdm

def text_formatter(text: str) -> str:
    """Performs minor formatting on text."""
    cleaned_text = text.replace("\n", " ").strip()
    return cleaned_text

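def open_and_read_pdf(pdf_path: str) -> list[dict]:
    """Opens a PDF and returns one dict per page with its page number and text."""
    doc = fitz.open(pdf_path)
    pages_and_texts = []
    for page_number, page in tqdm(enumerate(doc)):
        text = page.get_text()
        text = text_formatter(text=text)
        # The -14 offset is from the original cell; presumably it lines the index
        # up with the textbook's printed page numbers.
        # The original cell is cut off after the first key. Storing the page text
        # is an assumed minimal completion.
        pages_and_texts.append({"page_number": page_number - 14,
                                "text": text})
    return pages_and_texts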
        