🔹 What is RAG?
Normally, LLMs (like GPT) only know what they were trained on.
RAG (Retrieval-Augmented Generation) connects them to your own data (PDFs, notes, databases).
Workflow:
1) Extract text from the PDF
2) Convert the text into embeddings (numerical vectors)
3) Store the embeddings in a vector database (FAISS, Pinecone)
4) When you ask a question: search the database, retrieve the relevant text, and pass it to the LLM, which generates the answer
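
Steps 2 to 4 can be illustrated without any libraries. The sketch below is a toy, assuming hand-written 3-dimensional vectors in place of real embeddings; the names `toy_index`, `l2_distance`, and `search` are hypothetical and only mirror, at a tiny scale, what an embedding model plus a flat L2 index do.

```python
import math

# Hypothetical toy "embeddings"; a real model produces hundreds of dimensions.
toy_index = {
    "Generative AI produces text and images.": [0.9, 0.1, 0.0],
    "FAISS is a library for vector search.": [0.1, 0.9, 0.1],
    "PDF files store page layouts.": [0.0, 0.2, 0.9],
}


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(query_vector, index, k=2):
    """Return the k texts whose vectors are closest to query_vector."""
    ranked = sorted(index, key=lambda text: l2_distance(index[text], query_vector))
    return ranked[:k]


# Assumed stand-in for the embedding of "What is Generative AI?"
query_vector = [0.8, 0.2, 0.1]
top_chunks = search(query_vector, toy_index, k=2)
print(top_chunks)
```

The texts returned by `search` play the role of the retrieved context that is handed to the LLM in step 4.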

In [2]:
# Install the required libraries
!pip install pypdf2 faiss-cpu sentence-transformers transformers

# Extract text from the PDF
from PyPDF2 import PdfReader

# Load the PDF
reader = PdfReader('/content/sample.pdf')
text = ""
for page in reader.pages:
    # `or ""` guards against pages that yield no extractable text
    text += page.extract_text() or ""

print(text[:500])  # show the first 500 characters

# Create embeddings
from sentence_transformers import SentenceTransformer

# Use a pre-trained embedding model
embedder = SentenceTransformer('all-MiniLM-L6-v2')

# Split the text into chunks (naive split on ". ")
chunks = text.split(". ")
embeddings = embedder.encode(chunks)

print("Number of chunks:", len(chunks))

# Store the embeddings in FAISS (vector database)
import faiss
import numpy as np

# Build a flat (exact) L2 index; FAISS expects float32 arrays
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(np.asarray(embeddings, dtype="float32"))

# Query with a question
query = 'What is Generative AI?'
query_embedding = embedder.encode([query])

# Search for the 3 most relevant chunks (D: distances, I: chunk indices)
D, I = index.search(np.asarray(query_embedding, dtype="float32"), k=3)
print("\n🔹 Retrieved Chunks:")
for idx in I[0]:
    print(chunks[idx])

# Pass the retrieved chunks to the LLM
from transformers import pipeline

generator = pipeline('text-generation', model='gpt2')

context = " ".join(chunks[i] for i in I[0])
prompt = f"Answer the question using the context:\nContext: {context}\nQuestion: {query}\nAnswer:"

# max_new_tokens limits only the generated continuation; max_length=50 would
# also count the prompt, which is already longer than 50 tokens here
result = generator(prompt,
                   max_new_tokens=50,
                   temperature=0.7,
                   do_sample=True)
print("\nFinal Answer:", result[0]['generated_text'])

Generative artificial intelligence  (Generative AI , GenAI ,[1] or GAI) is a subfield of  artificial 
intelligence  that uses  generative models  to produce text, images, videos, or other forms of 
data.[2][3][4] These models  learn  the underlying patterns and structures of their  trai

Number of chunks: 13

🔹 Retrieved Chunks:
Generative artificial intelligence  (Generative AI , GenAI ,[1] or GAI) is a subfield of  artificial 
intelligence  that uses  generative models  to produce text, images, videos, or other forms of 
data.[2][3][4] These models  learn  the underlying patterns and structures of their  training data  and 
use them to produce new data[5][6] based on the input, which often comes in the form of natural 
language  prompts .[7][8] 
Generative AI tools have become more common since the  AI boom  in the 2020s
By the early 1970s,  Harold Cohen  was 
creating and exhibiting generative AI works created by  AARON , the computer program Cohen 
created to generate paintings.[34] The terms generative  AI planning  or generative planning were used in the 1980s and 1990s to 
refer to  AI planning  systems, especially  computer -aided process planning , used to generate 
sequences of actions to reach a specified goal.[35][36] Generative AI planning systems 
used  s

Final Answer: Anwers the question using the context:
Context: Generative artificial intelligence  (Generative AI , GenAI ,[1] or GAI) is a subfield of  artificial 
intelligence  that uses  generative models  to produce text, images, videos, or other forms of 
data.[2][3][4] These models  learn  the underlying patterns and structures of their  training data  and 
use them to produce new data[5][6] based on the input, which often comes in the form of natural 
language  prompts .[7][8] 
Generative AI tools have become more common since the  AI boom  in the 2020s By the early 1970s,  Harold Cohen  was 
creating and exhibiting generative AI works created by  AARON , the computer program Cohen 
created to generate paintings.[34] The terms generative  AI planning  or generative planning were used in the 1980s and 1990s to 
refer to  AI planning  systems, especially  computer -aided process planning , used to generate 
sequences of actions to reach a specified goal.[35][36] Generative AI plan
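
The run above reported only 13 chunks, and the retrieved chunks span several sentences, because `text.split(". ")` breaks only where a period is directly followed by a space. One common alternative is a fixed-size word window with overlap. The sketch below is a minimal version under assumed settings; `chunk_words`, `size`, and `overlap` are hypothetical names and values, not part of the original notebook.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into windows of `size` words; neighbours share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Synthetic 120-word sample, used only to show the window arithmetic
sample = " ".join(f"w{i}" for i in range(120))
chunks = chunk_words(sample, size=50, overlap=10)
print(len(chunks), [len(c.split()) for c in chunks])
```

In the notebook, `chunks = chunk_words(text)` could stand in for the `split` call; the chunk count and the retrieved text would then differ from the output recorded here.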