🔎📚 **Professor, this is the advanced dial on your RAG toolkit** — we’re not just retrieving by “what it means” anymore…  
Now we retrieve based on **who said it, when they said it, and what section it belongs to**.

Welcome to **hybrid retrieval** — combining **semantic search** + **metadata filtering**.

---

# 🧪 `09_lab_metadata_filtering_in_retrieval.ipynb`  
### 📁 `05_llm_engineering/03_rag_systems`  
> Add metadata (source, date, tags) to each document chunk  
→ Retrieve not just semantically relevant chunks, but **filter by tags**  
→ Simulate how tools like **Perplexity AI**, **ChatPDF**, or **Notion AI** prioritize trust, recency, and context

---

## 🎯 Learning Goals

- Understand how metadata helps improve relevance and traceability  
- Attach structured fields to vector chunks  
- Apply filters during retrieval (e.g. `"source": "FAQ"`, `"author": "Einstein"`)  
- Use ChromaDB’s `where` clause to simulate **structured semantic filtering**

---

## 💻 Runtime Specs

| Feature          | Spec                   |
|------------------|------------------------|
| Vector Search    | ChromaDB ✅            |
| Metadata Support | Dict-style metadata ✅ |
| Embeddings       | SentenceTransformer ✅ |
| Platform         | Colab-ready ✅         |

---

## 🧪 Section 1: Setup and Chunk as Before

```python
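!pip install chromadb sentence-transformers langchain langchain-community langchain-text-splitters
```

```python
# Recent LangChain releases ship the loaders and splitters in separate packages
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from sentence_transformers import SentenceTransformer

# Load the raw text file as LangChain Document objects
loader = TextLoader("sample_docs.txt")
docs = loader.load()

# Split into ~250-character chunks that share 40 characters with their neighbor
splitter = RecursiveCharacterTextSplitter(chunk_size=250, chunk_overlap=40)
chunks = splitter.split_documents(docs)

# Small general-purpose embedding model
embedder = SentenceTransformer("all-MiniLM-L6-v2")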
```
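
To make `chunk_size=250` and `chunk_overlap=40` concrete, here is a simplified sliding-window sketch. It is not what `RecursiveCharacterTextSplitter` actually does (that splitter prefers paragraph and sentence boundaries); the helper name `sliding_chunks` is made up for illustration and only shows how size and overlap interact.

```python
def sliding_chunks(text: str, chunk_size: int = 250, chunk_overlap: int = 40) -> list[str]:
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(600))  # 600 placeholder characters
pieces = sliding_chunks(sample)
print([len(p) for p in pieces])          # chunk lengths
print(pieces[0][-40:] == pieces[1][:40])  # neighbors share the overlap region
```

Each window starts 210 characters after the previous one, so the tail of one chunk reappears at the head of the next. That repetition is what keeps a sentence that straddles a boundary retrievable from either side.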

---

## 🏷️ Section 2: Enrich with Metadata

Let’s say each chunk came from a source document with:

- Author
- Date
- Type (FAQ, article, policy)

```python
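import random

random.seed(42)  # reproducible simulated metadata

# Simulated values; a real pipeline would read these from the source documents
sources = ["faq", "article", "policy"]
authors = ["Alice", "Bob", "Professor"]
years = [2022, 2023, 2024]  # stand-in for the "Date" field

texts = [c.page_content for c in chunks]
metas = [
    {"type": random.choice(sources), "author": random.choice(authors),
     "year": random.choice(years), "index": i}
    for i in range(len(texts))
]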
```
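
The lab assigns metadata at random to keep things short. In a real pipeline the fields would come from the source document and be copied onto every chunk cut from it. A minimal sketch of that idea, assuming a hypothetical `SourceDoc` record and a plain fixed-width split (neither is part of the lab code):

```python
from dataclasses import dataclass


@dataclass
class SourceDoc:
    """Hypothetical record: one source document plus its document-level fields."""
    text: str
    author: str
    date: str  # ISO date string, e.g. "2024-03-01"
    type: str  # "faq", "article", or "policy"


def chunk_with_metadata(doc: SourceDoc, chunk_size: int = 250) -> tuple[list[str], list[dict]]:
    """Split one document and copy its fields onto every chunk."""
    chunk_texts = [doc.text[i:i + chunk_size] for i in range(0, len(doc.text), chunk_size)]
    chunk_metas = [
        {"type": doc.type, "author": doc.author, "date": doc.date, "index": i}
        for i in range(len(chunk_texts))
    ]
    return chunk_texts, chunk_metas


demo_doc = SourceDoc(
    text="Returns are accepted within 30 days. " * 20,
    author="Alice",
    date="2024-03-01",
    type="faq",
)
demo_texts, demo_metas = chunk_with_metadata(demo_doc)
print(len(demo_texts), demo_metas[0])
```

Because every chunk inherits its parent's fields, a filter on `author` or `type` later behaves like a filter on whole documents.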

---

## 📦 Section 3: Store in ChromaDB with Metadata

```python
import chromadb
from chromadb.config import Settings

client = chromadb.Client(Settings(allow_reset=True))
collection = client.get_or_create_collection("hybrid_rag")

embeddings = embedder.encode(texts)

# Add all chunks in one batch; the four lists are parallel, one entry per chunk
collection.add(
    ids=[f"id_{i}" for i in range(len(texts))],
    documents=texts,
    embeddings=embeddings.tolist(),
    metadatas=metas,
)
```
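
The metadata dicts above are flat and hold only strings and integers. Real-world metadata is often messier (lists of tags, missing fields), and as far as I know Chroma has traditionally expected flat dicts with scalar values. A small sketch of a pre-storage cleanup under that assumption; `to_flat_metadata` is a made-up helper, not a Chroma function:

```python
import json

SCALAR_TYPES = (str, int, float, bool)


def to_flat_metadata(meta: dict) -> dict:
    """Return a copy of `meta` whose values are all scalars."""
    flat = {}
    for key, value in meta.items():
        if value is None:
            continue  # drop missing fields instead of storing a null
        if isinstance(value, SCALAR_TYPES):
            flat[key] = value
        else:
            # Lists and nested dicts are serialized so nothing is lost silently
            flat[key] = json.dumps(value, sort_keys=True)
    return flat


raw_meta = {
    "type": "faq",
    "author": "Alice",
    "index": 0,
    "tags": ["returns", "refunds"],  # not a scalar
    "reviewer": None,                # missing value
}
print(to_flat_metadata(raw_meta))
```

Serialized values can no longer be matched with simple equality filters, so fields you plan to filter on are better kept as plain scalars from the start.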

---

## 🔍 Section 4: Semantic + Metadata Query

```python
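def hybrid_query(prompt, filters=None, k=3):
    """Return up to k chunks for `prompt`, restricted by optional metadata filters."""
    q_embed = embedder.encode(prompt).tolist()
    # Chroma expects a single operator at the top level of `where`,
    # so several field conditions are combined with an explicit $and.
    if filters and len(filters) > 1:
        where = {"$and": [{key: value} for key, value in filters.items()]}
    else:
        where = filters or None  # None disables metadata filtering
    return collection.query(
        query_embeddings=[q_embed],
        n_results=k,
        where=where,
    )["documents"][0]

# Example: Only retrieve chunks by "Alice" from "faq" sources
results = hybrid_query("What is the policy on returns?", filters={"author": "Alice", "type": "faq"})

if not results:
    print("No chunks matched the filters.")
for i, chunk in enumerate(results):
    print(f"\nFiltered Chunk {i+1}:\n{chunk}")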
```
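
Conceptually, a filtered query narrows the candidate set by metadata and ranks what is left by vector similarity. The toy version below spells that out in plain Python with hand-made 2-D vectors. It is only a mental model: the records and the `filtered_search` helper are invented for illustration, and Chroma's internals may apply the filter differently.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(query_vec, records, filters=None, k=3):
    """Keep records whose metadata matches every filter, then rank by similarity."""
    filters = filters or {}
    candidates = [
        r for r in records
        if all(r["meta"].get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(candidates, key=lambda r: cosine(query_vec, r["vec"]), reverse=True)
    return [r["text"] for r in ranked[:k]]


toy_records = [
    {"text": "Returns FAQ", "vec": [1.0, 0.0], "meta": {"author": "Alice", "type": "faq"}},
    {"text": "Returns article", "vec": [0.9, 0.1], "meta": {"author": "Bob", "type": "article"}},
    {"text": "Shipping FAQ", "vec": [0.0, 1.0], "meta": {"author": "Alice", "type": "faq"}},
]

print(filtered_search([1.0, 0.0], toy_records, filters={"author": "Alice", "type": "faq"}))
print(filtered_search([1.0, 0.0], toy_records, k=2))
```

With the filter on, Bob's article drops out even though it is semantically close to the query. A strict filter can also leave fewer than `k` results, so downstream code should not assume a fixed count.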

---

## ✅ Lab Wrap-Up

| Feature                                      | Status |
|----------------------------------------------|--------|
| Added metadata to each document              | ✅     |
| Performed filtered semantic search           | ✅     |
| Simulated hybrid structured+vector retrieval | ✅     |

---

## 🧠 What You Learned

- Metadata boosts **retrieval control** — vital in legal, medical, or academic AI  
- You can ask **"Give me answers from Alice-written FAQs only"**  
- This is the foundation of **enterprise-grade RAG**  
- Filters can be extended to **recency, reliability scores, labels**, and more
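
As a sketch of that last point: Chroma's `where` clause supports comparison and set operators such as `$gte` and `$in`, combined with `$and`. The helper below assembles such a clause from optional constraints. The `build_where` name and the `year` and `reliability` fields are assumptions for illustration; those fields would have to be stored on each chunk before they can be filtered on.

```python
def build_where(types=None, min_year=None, min_reliability=None):
    """Assemble a Chroma-style `where` dict from optional constraints."""
    conditions = []
    if types:
        conditions.append({"type": {"$in": list(types)}})  # any of these types
    if min_year is not None:
        conditions.append({"year": {"$gte": min_year}})  # recency cutoff
    if min_reliability is not None:
        conditions.append({"reliability": {"$gte": min_reliability}})
    if not conditions:
        return None  # no filtering at all
    if len(conditions) == 1:
        return conditions[0]  # a single condition needs no $and wrapper
    return {"$and": conditions}


print(build_where(types=["faq", "policy"], min_year=2023))
```

The resulting dict can be passed as the `where` argument of `collection.query`, alongside the query embedding.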

---

✅ That closes out the **RAG Lab Series**:
- 🔲 Chunks  
- 🧠 Embeds  
- 📦 Vector DB  
- 🧠💼 Metadata filters  

Up next, we deploy this firepower 💣

> 🚀 `07_lab_vllm_vs_tgi_latency_comparison.ipynb`  
Benchmark and compare **vLLM** and **TGI** (Text Generation Inference)  
→ Which one serves LLMs *faster, cheaper, stronger*?

You ready to test LLM inference **like an ML systems engineer**?