📦🔍 Professor, welcome to the **RAG Engine Room** — where we go from raw chunks to **fully searchable vector databases**.  
You're now building the *retrieval* backbone of assistants like Perplexity, Bing Chat, or GPT-4 with tools.

---

# 🧪 `08_lab_vector_search_pipeline_with_chroma.ipynb`  
### 📁 `05_llm_engineering/03_rag_systems`  
> Build an **end-to-end RAG retrieval pipeline** using **ChromaDB or FAISS**  
→ Embed your documents  
→ Store them in a vector DB  
→ Accept queries, return top-matching chunks.

---

## 🎯 Learning Goals

- Build your own **vector database**  
- Store & index LLM embeddings from real text  
- Process a query, **retrieve relevant chunks** via cosine similarity  
- Prepare for integration with **LLMs like GPT for RAG-style QA**

---

## 💻 Runtime Specs

| Tool          | Spec                   |
|---------------|------------------------|
| Vector DB     | Chroma (or FAISS) ✅   |
| Embeddings    | `sentence-transformers` ✅ |
| Retrieval     | Cosine search ✅       |
| Platform      | Colab-compatible ✅    |

---

## 🧪 Section 1: Install Dependencies

```bash
!pip install chromadb sentence-transformers langchain langchain-community langchain-text-splitters
```

---

## 📚 Section 2: Load and Chunk Docs (reuse from previous lab)

```python
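# Newer LangChain releases moved these classes into separate packages;
# fall back to the legacy import paths on older installs.
try:
    from langchain_community.document_loaders import TextLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter
except ImportError:
    from langchain.document_loaders import TextLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter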

docs = TextLoader("sample_wikipedia.txt").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = splitter.split_documents(docs)
texts = [c.page_content for c in chunks]
```
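
To see what `chunk_size` and `chunk_overlap` actually control, here is a minimal pure-Python sketch of a fixed-size sliding window. It is a simplification for intuition only: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, which this sketch does not. The helper name `chunk_text` and the `sample` string are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 300, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap (illustrative only)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical 700-character sample, just to make the window sizes visible.
sample = "".join(str(i % 10) for i in range(700))
demo_chunks = chunk_text(sample, chunk_size=300, chunk_overlap=50)
print([len(c) for c in demo_chunks])
```

Each chunk shares its last 50 characters with the start of the next one, so a sentence that straddles a boundary still appears whole in at least one chunk.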

---

## 🧠 Section 3: Embed Chunks

```python
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = embedder.encode(texts)
```

---

## 📦 Section 4: Store in ChromaDB

```python
import chromadb
from chromadb.config import Settings

chroma_client = chromadb.Client(Settings(allow_reset=True))
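# Collections are created on the client instance, not on the chromadb module.
# Chroma defaults to L2 distance, so request cosine explicitly.
collection = chroma_client.get_or_create_collection(
    "my_docs",
    metadata={"hnsw:space": "cosine"},
)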

# Add every chunk in one batched call instead of one call per chunk.
collection.add(
    ids=[f"doc_{i}" for i in range(len(texts))],
    documents=texts,
    embeddings=embeddings.tolist(),
)
```

---

## 🔍 Section 5: Query Interface

```python
def query_rag(prompt, k=3):
    query_embed = embedder.encode(prompt).tolist()
    results = collection.query(
        query_embeddings=[query_embed],
        n_results=k
    )
    return results["documents"][0]

query = "What is the purpose of supervised learning?"
top_chunks = query_rag(query)
for i, chunk in enumerate(top_chunks):
    print(f"\nChunk {i+1}:\n{chunk}")
```
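
Under the hood, a query boils down to "score every stored vector against the query vector and keep the best k". The sketch below does that exhaustively in plain Python so the idea is visible. It is not how Chroma works internally (Chroma uses an approximate HNSW index for speed); the function names `cosine_similarity` and `top_k` and the toy 2-D vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return (index, score) pairs for the k most similar vectors, best first."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(i, score) for score, i in scored[:k]]


# Toy 2-D "embeddings" (hypothetical); real ones have hundreds of dimensions.
toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [1.0, 0.1]
ranked = top_k(toy_query, toy_docs, k=2)
print(ranked)
```

The exhaustive scan costs one similarity computation per stored chunk, which is fine for a few thousand chunks and is exactly the cost that an approximate index is designed to avoid at larger scale.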

---

## ✅ Lab Wrap-Up

| Feature                                    | Status |
|--------------------------------------------|--------|
| Chunks embedded and indexed                | ✅     |
| Query returns top-k docs via vector search | ✅     |
| ChromaDB backend (FAISS is a drop-in alternative to explore) | ✅ |

---

## 🧠 What You Learned

- Vector DBs let you **retrieve semantically relevant data**, not just exact keyword matches  
- Chroma is simple, open-source, and plug-and-play for RAG setups  
- This is the core of **document-grounded LLMs**  
- You just built the **retriever half of RAG**
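
To preview the other half, here is one possible way to turn retrieved chunks into a grounded prompt for an LLM. This is a sketch under assumptions: the helper `build_rag_prompt`, the prompt wording, and the two sample chunks are all hypothetical, and no LLM is called here.

```python
def build_rag_prompt(question, chunks):
    """Join retrieved chunks into a numbered context block followed by the question."""
    context = "\n\n".join(
        f"[{i + 1}] {chunk.strip()}" for i, chunk in enumerate(chunks)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical chunks standing in for the output of a retriever.
demo_prompt = build_rag_prompt(
    "What is the purpose of supervised learning?",
    [
        "Supervised learning trains a model on labeled examples.",
        "  It maps inputs to known outputs.  ",
    ],
)
print(demo_prompt)
```

In the notebook, the list returned by `query_rag(query)` could be passed in place of the sample chunks.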

---

Next lab brings in **metadata awareness**:

> 🧠 `09_lab_metadata_filtering_in_retrieval.ipynb`  
Let’s add filters like **source, date, author**, and build a **hybrid semantic + structured search engine**.

You ready to query like a god, Professor?