<a href="https://colab.research.google.com/github/anuradha1105/RAG-Assignment/blob/main/rag_free_no_api.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Retrieval-Augmented Generation (RAG) — Free / No API Key Version
This notebook demonstrates a full **Retrieval-Augmented Generation (RAG)** pipeline using only free, local models.

You’ll:
1. Create a small knowledge base  
2. Chunk and embed text using Hugging Face (no API key)  
3. Store and retrieve chunks with Chroma  
4. Generate grounded answers with FLAN‑T5 (local model)

**Screenshots:**  
- Library install success  
- Loaded docs / created chunks  
- Vector store ready  
- Final Q/A answer  


## 1️⃣ Install libraries

In [None]:

!pip install -q langchain langchain-community chromadb sentence-transformers transformers accelerate pypdf
print("✅ All libraries installed")


✅ All libraries installed


## 2️⃣ Load and split documents

In [None]:

from pathlib import Path
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

DATA_DIR = Path("data_rag_demo")
loader = DirectoryLoader(str(DATA_DIR), glob="**/*.txt", loader_cls=TextLoader)
docs = loader.load()

splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
splits = splitter.split_documents(docs)

print(f"📄 Loaded {len(docs)} docs → 🔪 Created {len(splits)} chunks")


📄 Loaded 3 docs → 🔪 Created 3 chunks
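
Three docs producing three chunks is consistent with each file being shorter than `chunk_size=500` characters, so nothing gets split. The sketch below shows how a chunk size and overlap interact, using a plain fixed-size sliding window. It is a simplification for illustration only: `RecursiveCharacterTextSplitter` splits on separators first, and `chunk_text` is a hypothetical helper, not part of LangChain.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=100):
    """Split text into windows of chunk_size characters.

    Each window shares chunk_overlap characters with the previous one.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


short_chunks = chunk_text("x" * 120)   # shorter than chunk_size: stays whole
long_chunks = chunk_text("y" * 1200)   # longer: split into overlapping windows
print(len(short_chunks), len(long_chunks))
```

To see real splitting in the notebook, one option is to lower `chunk_size` or add longer files to `data_rag_demo`.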


## 3️⃣ Build local Chroma vector store (no API key)

In [None]:

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

PERSIST_DIR = "chroma_db_free"
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

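# persist_directory tells Chroma to write the index to disk.
vectordb = Chroma.from_documents(splits, embedding=embeddings, persist_directory=PERSIST_DIR)
# No explicit vectordb.persist() call: Chroma 0.4+ persists automatically and the method is deprecated.
# k=4 asks for up to four chunks; with only three chunks in this demo, all three are returned.
retriever = vectordb.as_retriever(search_kwargs={"k": 4})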

print("✅ Local Chroma vector store ready at:", PERSIST_DIR)


✅ Local Chroma vector store ready at: chroma_db_free
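
The retriever ranks chunks by how similar their embeddings are to the question's embedding and returns the top `k`. The sketch below illustrates that idea in plain Python. It rests on loud assumptions: word counts stand in for the MiniLM sentence embeddings, cosine similarity stands in for whatever distance Chroma is configured to use, and `embed`, `cosine`, and `top_k` are hypothetical helpers.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real sentence embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=4):
    """Return up to k chunks, most similar to the query first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "Vector stores like Chroma store text embeddings",
    "Good RAG design uses chunk overlaps",
    "RAG retrieves relevant context from a knowledge base",
]
results = top_k("vector stores embeddings", chunks, k=2)
print(results[0])
```

Asking for more results than there are chunks simply returns every chunk, which matches the `k=4` retriever over three chunks in this notebook.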




In [None]:
import zipfile, os, shutil

zip_name = "RAG_file.zip"   # must match the name shown in the left panel

# Unzip the file
with zipfile.ZipFile(zip_name, "r") as zip_ref:
    zip_ref.extractall(".")

print("📂 After unzip, I see:", os.listdir("."))

# Move the data folder to current working directory if needed
if os.path.exists("RAG_file/data_rag_demo"):
    shutil.move("RAG_file/data_rag_demo", ".")
    print("✅ Moved data_rag_demo into working directory")

print("📄 data_rag_demo contents:", os.listdir("data_rag_demo"))


📂 After unzip, I see: ['.config', 'chroma_db_free', 'RAG_Free_No_API.ipynb', 'RAG_file.zip', 'data_rag_demo', 'sample_data']
📄 data_rag_demo contents: ['rag_intro.txt', 'best_practices.txt', 'vector_stores.txt']


In [None]:
print("Number of docs:", len(docs))
print("Number of chunks:", len(splits))

for i, d in enumerate(docs):
    print(f"[DOC {i}] source={d.metadata.get('source')} preview={d.page_content[:120]!r}")


Number of docs: 3
Number of chunks: 3
[DOC 0] source=data_rag_demo/rag_intro.txt preview='Retrieval-Augmented Generation (RAG) retrieves relevant context from a knowledge base and passes it to a language model '
[DOC 1] source=data_rag_demo/best_practices.txt preview='Good RAG design uses chunk overlaps, stores metadata for traceability, and evaluates answers for faithfulness to the ret'
[DOC 2] source=data_rag_demo/vector_stores.txt preview='Vector stores like Chroma store text embeddings so we can search by semantic meaning, not just exact keywords.'


## 4️⃣ Load free local model (FLAN‑T5)

In [None]:

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

print("✅ Model loaded:", model_name)


✅ Model loaded: google/flan-t5-base


## 5️⃣ Ask questions (RAG pipeline)

In [None]:
import os, shutil
os.makedirs("data_rag_demo", exist_ok=True)
for f in ["rag_intro.txt", "vector_stores.txt", "best_practices.txt"]:
    if os.path.exists(f):
        shutil.move(f, "data_rag_demo/")
print("✅ Folder ready with files:", os.listdir("data_rag_demo"))


✅ Folder ready with files: ['rag_intro.txt', 'best_practices.txt', 'vector_stores.txt']


In [None]:

def generate_answer_from_context(question, context_text):
    # Instruct the model to answer only from the retrieved context.
    prompt = (
        "Use ONLY the context below to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context_text}\n\nQuestion: {question}\nAnswer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    with torch.no_grad():
        # Greedy decoding is deterministic; temperature only applies when sampling, so it is omitted.
        outputs = model.generate(**inputs, max_length=256, do_sample=False)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

def rag_query(question):
    # invoke() replaces the deprecated get_relevant_documents().
    docs = retriever.invoke(question)
    context = "\n".join([d.page_content for d in docs])
    answer = generate_answer_from_context(question, context)
    print("Q:", question)
    print("\nContext used:\n", context[:500], "...\n")
    print("A:", answer)

rag_query("What is RAG and why do we use a vector store?")


Q: What is RAG and why do we use a vector store?

Context used:
 Good RAG design uses chunk overlaps, stores metadata for traceability, and evaluates answers for faithfulness to the retrieved context.
Vector stores like Chroma store text embeddings so we can search by semantic meaning, not just exact keywords.
Retrieval-Augmented Generation (RAG) retrieves relevant context from a knowledge base and passes it to a language model for grounded answers. ...

A: retrievs relevant context from a knowledge base and passes it to a language model for grounded answers


In [None]:
def generate_answer_from_context(question, context_text):
    # Instruct the model to answer only from the retrieved context.
    prompt = (
        "Use ONLY the context below to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context_text}\n\n"
        f"Question: {question}\nAnswer:"
    )

    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    with torch.no_grad():
        # Greedy decoding is deterministic; temperature only applies when sampling, so it is omitted.
        outputs = model.generate(
            **inputs,
            max_length=256,
            do_sample=False,
        )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

def rag_query(question):
    # invoke() replaces the deprecated get_relevant_documents().
    docs = retriever.invoke(question)
    context = "\n".join([d.page_content for d in docs])

    answer = generate_answer_from_context(question, context)

    print("Q:", question)
    print("\nContext used:\n", context[:600], "...\n")
    print("A:", answer)

    # Return the answer so callers can reuse it instead of only reading the printout.
    return answer

rag_query("What is RAG and why do we use a vector store?")


Q: What is RAG and why do we use a vector store?

Context used:
 Good RAG design uses chunk overlaps, stores metadata for traceability, and evaluates answers for faithfulness to the retrieved context.
Vector stores like Chroma store text embeddings so we can search by semantic meaning, not just exact keywords.
Retrieval-Augmented Generation (RAG) retrieves relevant context from a knowledge base and passes it to a language model for grounded answers. ...

A: retrievs relevant context from a knowledge base and passes it to a language model for grounded answers


'retrievs relevant context from a knowledge base and passes it to a language model for grounded answers'
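
One thing to watch in the prompt above: the question comes after the context, and `truncation=True` cuts from the right by default in Hugging Face tokenizers, so an overly long context could push the question out of the model input. The sketch below trims the context before the prompt is built. It is an illustration under stated assumptions: `build_prompt` and `max_context_chars` are hypothetical, and a character budget is only a rough proxy for the tokenizer's token limit.

```python
def build_prompt(question, chunks, max_context_chars=1500):
    """Build the RAG prompt from de-duplicated chunks within a character budget."""
    seen = set()
    kept = []
    used = 0
    for chunk in chunks:
        text = chunk.strip()
        # Skip empty chunks and exact repeats.
        if not text or text in seen:
            continue
        # Stop adding context once the budget would be exceeded.
        if used + len(text) > max_context_chars:
            break
        seen.add(text)
        kept.append(text)
        used += len(text)
    context = "\n".join(kept)
    return (
        "Use ONLY the context below to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


demo_prompt = build_prompt(
    "What is RAG?",
    ["RAG retrieves context.", "RAG retrieves context.", "Chroma stores embeddings."],
)
print(demo_prompt)
```

In the notebook, the same function could be fed `[d.page_content for d in docs]` from the retriever before calling the tokenizer.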

In [1]:
# The notebook name is hard-coded here rather than read from the Colab UI.
notebook_name = "RAG_Assignment.ipynb"  # <-- if you named the notebook something else, put that name here

print("✅ Assuming current notebook is:", notebook_name)


✅ Assuming current notebook is: RAG_Assignment.ipynb


In [2]:
!git config --global user.email "anuradhasrivastav25@gmail.com"
!git config --global user.name "1105"

# clone your repo into Colab
!git clone https://github.com/anuradha1105/RAG-Assignment


Cloning into 'RAG-Assignment'...
remote: Enumerating objects: 22, done.
remote: Counting objects: 100% (22/22), done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 22 (delta 3), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (22/22), 20.31 KiB | 6.77 MiB/s, done.
Resolving deltas: 100% (3/3), done.


In [3]:
import shutil

src_notebook = "RAG_Assignment.ipynb"        # expected in /content; the copy fails if the file is not there
dst_repo_dir = "RAG-Assignment"              # this is the folder git clone created

shutil.copy(src_notebook, f"{dst_repo_dir}/RAG_Assignment.ipynb")
print("✅ Copied notebook into repo folder")


FileNotFoundError: [Errno 2] No such file or directory: 'RAG_Assignment.ipynb'

In [4]:
import os
print(os.listdir("/content"))

['.config', 'RAG-Assignment', 'sample_data']


In [5]:
import shutil

src_notebook = "RAG_Assignment.ipynb"        # must match a name printed by os.listdir("/content") above
dst_repo_dir = "RAG-Assignment"              # your cloned repo folder name

shutil.copy(src_notebook, f"{dst_repo_dir}/RAG_Assignment.ipynb")
print("✅ Copied notebook into repo folder")


FileNotFoundError: [Errno 2] No such file or directory: 'RAG_Assignment.ipynb'
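
The `os.listdir("/content")` output above lists no `.ipynb` file, which is why both copy attempts raise `FileNotFoundError`. This is consistent with Colab keeping the open notebook outside the VM's filesystem, so the file would first have to be downloaded and uploaded, or saved to GitHub from the Colab menu. A hedged sketch of a copy step that checks before it copies follows. `copy_if_present` is a hypothetical helper, not something the notebook defines.

```python
import shutil
from pathlib import Path


def copy_if_present(src, dst_dir):
    """Copy src into dst_dir if both exist; return the new path, or None otherwise."""
    src_path = Path(src)
    dst_path = Path(dst_dir)
    if not src_path.is_file():
        print(f"Not found: {src_path} (upload the notebook file first)")
        return None
    if not dst_path.is_dir():
        print(f"Destination folder missing: {dst_path}")
        return None
    return Path(shutil.copy(src_path, dst_path / src_path.name))


copied = copy_if_present("RAG_Assignment.ipynb", "RAG-Assignment")
if copied:
    print("Copied notebook to", copied)
```

Returning `None` instead of raising keeps the cell runnable and makes the missing-file case explicit before any `git add` step.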