# Local RAG: Build a Document Q&A System (No Cloud Required)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ThamuMnyulwa/mkdocs_rag/blob/main/notebooks/01_local_rag_no_cloud.ipynb)


## What You'll Learn

This notebook teaches RAG fundamentals using **only local tools** - no cloud account required!

**Time**: 15-20 minutes | **Cost**: $0 | **Prerequisites**: None

### The RAG Pipeline
1. INGEST: Load and chunk documents
2. EMBED: Convert text to vectors  
3. STORE: Save in vector database
4. RETRIEVE: Find relevant chunks
5. GENERATE: Answer with context
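
Before installing anything, the five stages above can be sketched with toy stand-ins. This is only an illustration under loud assumptions: `toy_embed` uses word counts instead of a real embedding model, a plain Python list plays the vector database, and the "generate" step just builds the prompt an LLM would receive. All `toy_*` names are hypothetical and are not used by the rest of the notebook.

```python
import math
from collections import Counter

def toy_embed(text):
    """Stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# 1. INGEST: each "chunk" here is a single sentence
toy_chunks = [
    "FAISS stores vectors and searches them by distance",
    "Chunking splits long documents into overlapping pieces",
    "The language model writes an answer from the retrieved context",
]

# 2. EMBED and 3. STORE: a list of (chunk, vector) pairs acts as the database
toy_store = [(chunk, toy_embed(chunk)) for chunk in toy_chunks]

# 4. RETRIEVE: return the chunk most similar to the query
def toy_retrieve(query):
    q = toy_embed(query)
    return max(toy_store, key=lambda item: cosine(q, item[1]))[0]

# 5. GENERATE: a real system would send this prompt to an LLM
def toy_prompt(query):
    return f"Context: {toy_retrieve(query)}\nQuestion: {query}\nAnswer:"

print(toy_prompt("how does chunking split documents"))
```

The steps that follow replace each stand-in with a real component: sentence-transformers for embedding, FAISS for storage and retrieval, and FLAN-T5 for generation.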


## Step 1: Install Dependencies


In [None]:
%%capture
%pip install sentence-transformers faiss-cpu PyPDF2 python-docx transformers torch


In [None]:
import numpy as np
from sentence_transformers import SentenceTransformer
import faiss
from transformers import pipeline
import PyPDF2
print("✅ Libraries imported!")


## Step 2: Upload Documents


In [None]:
from google.colab import files
import io

uploaded = files.upload()
print(f"✅ Uploaded {len(uploaded)} files")


## Step 3: Extract Text


In [None]:
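def read_pdf(file_bytes):
    # Wrap the raw bytes in a file-like object for PyPDF2
    pdf = PyPDF2.PdfReader(io.BytesIO(file_bytes))
    # Guard against pages with no extractable text (e.g. scanned images)
    return "\n".join([p.extract_text() or "" for p in pdf.pages])

def read_text(file_bytes):
    # Assumes the upload is UTF-8 encoded
    return file_bytes.decode('utf-8')

documents = []
for filename, content in uploaded.items():
    # Match the extension case-insensitively so "REPORT.PDF" is parsed as a PDF
    text = read_pdf(content) if filename.lower().endswith('.pdf') else read_text(content)
    documents.append({'file': filename, 'text': text})
    print(f"📄 {filename}: {len(text)} chars")

print(f"\n✅ Processed {len(documents)} documents")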
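
`read_text` decodes strictly as UTF-8, so a text file saved in another encoding will raise `UnicodeDecodeError`. One possible workaround is sketched below. `read_text_lenient` is a hypothetical helper, not part of the notebook, and the Latin-1 fallback is an assumption that will produce wrong characters for files in other encodings.

```python
def read_text_lenient(file_bytes):
    """Hypothetical variant of read_text: try UTF-8 first, then fall back."""
    try:
        # utf-8-sig also strips a leading byte-order mark if one is present
        return file_bytes.decode("utf-8-sig")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this never raises,
        # but the text may be wrong if the file uses a different encoding
        return file_bytes.decode("latin-1")

print(read_text_lenient("café".encode("utf-8")))
print(read_text_lenient("café".encode("latin-1")))
```

Both calls return the same string, even though the second input is not valid UTF-8.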


## Step 4: Chunk Documents


In [None]:
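def chunk_text(text, size=512, overlap=100):
    # size and overlap are measured in characters, not tokens
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    chunks = []
    # Each chunk starts (size - overlap) characters after the previous one
    for i in range(0, len(text), size - overlap):
        chunk = text[i:i+size]
        if chunk.strip():  # skip whitespace-only chunks
            chunks.append(chunk)
    return chunks

all_chunks = []
metadata = []  # metadata[i] is the source filename of all_chunks[i]
for doc in documents:
    chunks = chunk_text(doc['text'])
    all_chunks.extend(chunks)
    metadata.extend([doc['file']] * len(chunks))
    print(f"📝 {doc['file']}: {len(chunks)} chunks")

print(f"\n✅ Total: {len(all_chunks)} chunks")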
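
To see what the overlap does, it can help to run the same sliding window on a string short enough to read. The sketch below restates the windowing logic under a hypothetical name, `sliding_chunks`, so it runs on its own. The tiny `size=10, overlap=4` values are chosen purely for illustration.

```python
def sliding_chunks(text, size, overlap):
    """Same windowing idea as chunk_text, restated so this cell runs on its own."""
    step = size - overlap  # distance between consecutive chunk starts
    return [text[i:i + size] for i in range(0, len(text), step) if text[i:i + size].strip()]

demo_chunks = sliding_chunks("abcdefghijklmnopqrstuvwxyz", size=10, overlap=4)
for c in demo_chunks:
    print(c)
# abcdefghij
# ghijklmnop
# mnopqrstuv
# stuvwxyz
# yz
```

Each chunk repeats the last four characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk. The final chunk (`yz`) is already contained in the one before it, which is a harmless quirk of stepping through the text at a fixed stride.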


## Step 5: Generate Embeddings


In [None]:
model = SentenceTransformer('all-MiniLM-L6-v2')
print("✅ Model loaded!")

embeddings = model.encode(all_chunks, show_progress_bar=True)
print(f"✅ Generated {len(embeddings)} embeddings ({embeddings.shape[1]} dims)")


## Step 6: Create FAISS Index


In [None]:
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings.astype('float32'))
print(f"✅ FAISS index: {index.ntotal} vectors")
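
`IndexFlatL2` performs an exhaustive search and reports squared Euclidean distances. As a rough mental model, the NumPy sketch below computes the same quantity by brute force. `brute_force_l2` and the toy vectors are hypothetical names used only for illustration. This is not how FAISS is implemented internally.

```python
import numpy as np

def brute_force_l2(vectors, query, top_k=3):
    """Exhaustive nearest-neighbour search using squared L2 distance."""
    diffs = vectors - query                 # broadcast: (n, d) - (d,)
    sq_dists = np.sum(diffs ** 2, axis=1)   # one squared distance per stored vector
    order = np.argsort(sq_dists)[:top_k]    # indices of the closest vectors
    return sq_dists[order], order

toy_vectors = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]], dtype="float32")
dists, ids = brute_force_l2(toy_vectors, np.array([1.0, 1.0], dtype="float32"), top_k=2)
print(ids, dists)
```

For the query `[1, 1]`, the closest stored vector is index 1 (squared distance 1), followed by index 0 (squared distance 2).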


## Step 7: Retrieval Function


In [None]:
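def retrieve(query, top_k=3):
    q_emb = model.encode([query])
    # IndexFlatL2 returns squared L2 distances: smaller means more similar
    distances, indices = index.search(q_emb.astype('float32'), top_k)
    # FAISS pads with -1 when the index holds fewer than top_k vectors; skip those
    return [(all_chunks[i], metadata[i], d) for i, d in zip(indices[0], distances[0]) if i != -1]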

# Test
results = retrieve("What is this about?")
for i, (chunk, source, dist) in enumerate(results, 1):
    print(f"{i}. {source} (dist: {dist:.2f})\n   {chunk[:100]}...\n")
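
The `dist` values printed above are squared L2 distances, which are harder to read than a similarity score. If the embeddings have unit length, the two are related by ||a - b||² = 2 - 2·cos(a, b). The sketch below checks that identity on two hand-picked unit vectors. `l2_squared_to_cosine` is a hypothetical helper, and the conversion is only valid when the vectors are normalized (passing `normalize_embeddings=True` to `model.encode` is one way to ensure that).

```python
import numpy as np

def l2_squared_to_cosine(sq_dist):
    """For unit-length vectors, ||a - b||^2 = 2 - 2 * cos(a, b)."""
    return 1.0 - sq_dist / 2.0

a = np.array([1.0, 0.0])
b = np.array([0.6, 0.8])  # both vectors have length 1
sq_dist = float(np.sum((a - b) ** 2))

# The converted distance should match the dot product of the unit vectors
print(sq_dist, l2_squared_to_cosine(sq_dist), float(a @ b))
```

Under that assumption, a squared distance of 0 corresponds to a cosine similarity of 1, and a squared distance of 2 corresponds to a cosine similarity of 0.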


## Step 8: Load LLM


In [None]:
generator = pipeline('text2text-generation', model='google/flan-t5-small', max_length=512)
print("✅ LLM loaded!")


## Step 9: RAG Q&A Function


In [None]:
def ask(question, top_k=3):
    results = retrieve(question, top_k)
    context = "\n\n".join([c for c, _, _ in results])
    
    prompt = f"""Answer based ONLY on this context:

{context}

Question: {question}
Answer:"""
    
    response = generator(prompt, max_length=200)[0]['generated_text']
    answer = response.strip()
    sources = list(dict.fromkeys(s for _, s, _ in results))  # de-duplicate, keep rank order
    
    return {'answer': answer, 'sources': sources}

print("✅ RAG system ready!")
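
Small models such as FLAN-T5 accept a limited amount of input, so raising `top_k` or the chunk size can push part of the context past what the model reads. One simple guard is to cap the context before building the prompt, as sketched below. `build_prompt` and its `max_chars` budget are hypothetical. The 1500-character default is an arbitrary illustration, not a measured limit, and counting characters is only a rough proxy for counting tokens.

```python
def build_prompt(question, chunks, max_chars=1500):
    """Hypothetical helper: add chunks in rank order until the budget is spent."""
    kept, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > max_chars:
            break  # stop at the first chunk that would exceed the budget
        kept.append(chunk)
        used += len(chunk)
    context = "\n\n".join(kept)
    return f"Answer based ONLY on this context:\n\n{context}\n\nQuestion: {question}\nAnswer:"

# Three fake chunks of 1000, 400 and 400 characters; only the first two fit
demo = build_prompt("What is RAG?", ["1" * 1000, "2" * 400, "3" * 400])
print(demo.count("1"), demo.count("2"), demo.count("3"))
```

Because chunks arrive ordered from most to least similar, dropping from the end discards the least relevant context first.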


## Step 10: Try It!


In [None]:
question = "What is the main topic?"
result = ask(question)
print(f"❓ {question}\n")
print(f"💡 {result['answer']}\n")
print(f"📚 Sources: {', '.join(result['sources'])}")


## Congratulations!

You built a complete RAG system from scratch!

**Next Steps:**
- Notebook 2: Vertex AI RAG Engine (managed)