# CS 5588 — Week 1: Hands-On Lab
## Mini-RAG Pipeline: Embeddings → Retrieval → Grounded Generation

**Goals:**
- Generate semantic embeddings using a Transformer-based encoder
- Build a vector index for fast similarity search
- Retrieve top-k relevant document chunks
- Inject retrieved context into an LLM prompt for grounded generation

**Workflow:** GitHub → Colab → Hugging Face → Vector Store (FAISS / Chroma) → LLM

---


### GenAI Systems Context (Mini-RAG)
This lab implements a **mini Retrieval-Augmented Generation (RAG)** pipeline:
- A **Transformer encoder** produces semantic embeddings
- A **vector index (FAISS)** enables fast retrieval
- The retrieved context is what a downstream **LLM** would receive in its prompt for grounded generation


## Step 1 — Environment Setup
Install required libraries. This may take ~1 minute.


In [None]:
!pip install -q transformers datasets sentence-transformers faiss-cpu

## Step 2 — Load Dataset & Model from Hugging Face Hub
We use the first 200 articles of the AG News dataset (`ag_news`) and the `all-MiniLM-L6-v2` sentence-embedding model.


In [None]:
from datasets import load_dataset
from sentence_transformers import SentenceTransformer

dataset = load_dataset("ag_news", split="train[:200]")
model = SentenceTransformer("all-MiniLM-L6-v2")

texts = list(dataset["text"])  # plain Python list of article strings
print(f"Loaded {len(texts)} documents")

## Step 3 — Create Embeddings
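`model.encode` maps each text to a fixed-length vector (384 dimensions for `all-MiniLM-L6-v2`). Texts with similar meaning land close together in this space, which is what makes retrieval before generation possible.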


In [None]:
embeddings = model.encode(texts, show_progress_bar=True)
print('Embedding shape:', embeddings.shape)
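
To see what "close together" means numerically, the sketch below compares toy vectors with cosine similarity. The three 3-dimensional vectors and their labels are made up for illustration and are not model output, and `cosine_similarity` is a hypothetical helper name rather than part of any library used in this lab.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors (hypothetical values, not real embeddings).
toy_embeddings = {
    "stocks fall": [0.9, 0.1, 0.0],
    "markets drop": [0.8, 0.2, 0.1],
    "team wins final": [0.0, 0.2, 0.9],
}

# Compare every toy vector against the first one.
query_vec = toy_embeddings["stocks fall"]
for label, vec in toy_embeddings.items():
    print(f"{label!r}: {cosine_similarity(query_vec, vec):.3f}")
```

The two finance-like vectors score much higher against each other than against the sports-like one, which is the property retrieval relies on.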

## Step 4 — Build a Vector Index (FAISS)
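The index plays the role of the retrieval layer in a RAG system. `IndexFlatL2` stores the vectors as they are and performs exact nearest-neighbor search using squared Euclidean (L2) distance.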


In [None]:
import faiss
import numpy as np

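dim = embeddings.shape[1]  # embedding dimensionality
index = faiss.IndexFlatL2(dim)  # exact search with squared L2 distance
index.add(np.asarray(embeddings, dtype="float32"))  # FAISS expects float32
print('Index size:', index.ntotal)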
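
A flat L2 index is a brute-force search: the query is compared against every stored vector, and the results are ranked by squared Euclidean distance. The following sketch reproduces that idea in plain Python on toy data so the arithmetic is visible. `squared_l2`, `brute_force_search`, and the sample vectors are hypothetical, and this is not how FAISS is implemented internally.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def brute_force_search(vectors, query, k=3):
    """Return (distances, indices) of the k vectors closest to query."""
    # Score every vector, then sort ascending by distance.
    scored = sorted((squared_l2(vec, query), i) for i, vec in enumerate(vectors))
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]

# Toy 2-dimensional "embeddings" (hypothetical values).
vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
distances, indices = brute_force_search(vectors, query=[0.9, 0.1], k=2)
print(distances, indices)
```

For a few hundred documents this exhaustive scan is cheap; approximate index types only become relevant at much larger scales.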

## Step 5 — Retrieval Function
Embed the query with the same model, then return the `k` documents whose embeddings are closest to it.


In [None]:
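def search(query, k=3):
    """Return the k documents closest to the query in embedding space."""
    q_emb = model.encode([query])  # shape (1, dim)
    distances, indices = index.search(np.asarray(q_emb, dtype="float32"), k)
    return [texts[int(i)] for i in indices[0]]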
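
`index.search` returns two arrays of shape `(n_queries, k)`: distances and row indices. A small variation of the retrieval step, sketched below, keeps the distance next to each hit so the ranking can be inspected. `pair_hits` is a hypothetical helper; it assumes the `(distances, indices)` layout just described and skips `-1` entries, which FAISS uses as a placeholder when fewer than `k` results are available.

```python
def pair_hits(distances, indices, texts):
    """Pair each retrieved text with its distance (first query only)."""
    hits = []
    for dist, idx in zip(distances[0], indices[0]):
        if idx < 0:  # -1 marks an empty result slot
            continue
        hits.append((float(dist), texts[int(idx)]))
    return hits

# Toy stand-ins for the arrays returned by index.search (hypothetical values).
toy_texts = ["doc A", "doc B", "doc C"]
toy_distances = [[0.12, 0.57, 3.4e38]]
toy_indices = [[2, 0, -1]]

for dist, text in pair_hits(toy_distances, toy_indices, toy_texts):
    print(f"{dist:.2f}  {text}")
```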

## Step 6 — Try It!


In [None]:
search("artificial intelligence in healthcare")
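
The cells above stop at retrieval, but the last goal of the lab, injecting retrieved context into an LLM prompt, can be sketched without calling a model. The `build_prompt` function, its template wording, and the sample chunks below are assumptions made for illustration; in the notebook the chunks would come from `search(query)`.

```python
def build_prompt(query, chunks):
    """Assemble a grounded prompt from a question and retrieved chunks."""
    # Number the chunks so the model (or a reader) can refer to them.
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )

# Hypothetical retrieved chunks; in the lab these would be search(query).
sample_chunks = [
    "Hospitals are piloting software that flags abnormal scans.",
    "Researchers report faster triage with automated tools.",
]
prompt = build_prompt("How is AI used in healthcare?", sample_chunks)
print(prompt)
```

The resulting string is what would be passed to a downstream LLM. Telling the model to rely only on the supplied context is one common way to keep the answer grounded in the retrieved documents.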

## Reflection
In 1–2 sentences, explain how embeddings enable retrieval before generation in GenAI systems.
