Step 1 – Set up your environment
Install the libraries this walkthrough uses: sentence-transformers (for embeddings), faiss-cpu (for vector search), and numpy (for array handling).

In [1]:
pip install sentence-transformers faiss-cpu numpy

Note: you may need to restart the kernel to use updated packages.

Step 2 – Collect some documents (your “knowledge base”)

In [2]:
documents = [
    "Employees get 20 days of paid annual leave per year...",
    "ETL jobs run daily at 1 AM using Apache Airflow...",
    "Employees can work from home up to 3 days per week...",
    "All code changes must go through pull requests and CI..."
]


Step 3 – Chunk your documents

In [4]:
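# Each sample document is a single short sentence, so it becomes one chunk as-is.
# Longer documents would normally be split into several chunks at this step.
chunks = []
for i, doc in enumerate(documents):
    chunks.append({
        "id": i,  # also the chunk's position in the list
        "text": doc.strip(),
        "metadata": {"source": f"doc_{i}"}
    })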
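The sample documents are short enough to keep whole, but longer documents usually need real splitting. The following is a minimal sketch of one common approach, overlapping word windows. The helper names `split_into_chunks` and `build_chunks`, and the `max_words`, `overlap`, and `"part"` fields, are assumptions for illustration and are not part of the original notebook.

```python
def split_into_chunks(text, max_words=50, overlap=10):
    """Split text into overlapping word windows (hypothetical helper)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    if not words:
        return []
    step = max_words - overlap  # how far each window advances
    pieces = []
    for start in range(0, len(words), step):
        pieces.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return pieces


def build_chunks(documents, max_words=50, overlap=10):
    """Produce chunk dicts in the same shape as the notebook's `chunks` list."""
    chunks = []
    for doc_id, doc in enumerate(documents):
        parts = split_into_chunks(doc.strip(), max_words, overlap)
        for part, piece in enumerate(parts):
            chunks.append({
                "id": len(chunks),  # running index across all documents
                "text": piece,
                "metadata": {"source": f"doc_{doc_id}", "part": part},
            })
    return chunks
```

The overlap keeps a sentence that straddles a window boundary retrievable from either side. The default sizes are arbitrary and would need tuning for real documents.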


Step 4 – Turn chunks into embeddings

In [5]:
from sentence_transformers import SentenceTransformer
import numpy as np

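model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose model; 384-dimensional embeddings

texts = [c["text"] for c in chunks]
# normalize_embeddings=True returns unit-length vectors, so inner product equals cosine similarity
embeddings = model.encode(texts, convert_to_numpy=True, normalize_embeddings=True)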


Step 5 – Store embeddings in a vector index (FAISS)

In [6]:
import faiss

dim = embeddings.shape[1]
index = faiss.IndexFlatIP(dim)  # IP = inner product; on unit-length vectors this equals cosine similarity
index.add(embeddings)
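To make the inner-product index less of a black box, here is a small pure-Python sketch of the same idea: normalize the vectors, score every stored vector against the query, and keep the best few. The toy vectors and the function names (`normalize`, `dot`, `cosine`, `flat_ip_search`) are made up for illustration. FAISS does the equivalent exhaustive comparison far more efficiently.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed directly from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def flat_ip_search(query, vectors, top_k):
    """Score every vector against the query; return the top_k (score, position) pairs."""
    scored = sorted(((dot(query, v), i) for i, v in enumerate(vectors)), reverse=True)
    return scored[:top_k]


# Toy 2-D data standing in for real embeddings
toy_vectors = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
toy_query = [6.0, 8.0]

unit_vectors = [normalize(v) for v in toy_vectors]
unit_query = normalize(toy_query)

top = flat_ip_search(unit_query, unit_vectors, top_k=2)
for score, position in top:
    # The inner product of unit vectors matches the cosine of the originals
    print(position, round(score, 3), round(cosine(toy_query, toy_vectors[position]), 3))
```

Because both sides are normalized, the ranking depends only on direction, not on vector length.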


Step 6 – Implement retrieval (search by question)

In [7]:
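def search(query, top_k=3):
    # Embed the query the same way as the chunks (normalized) so scores are comparable
    q_emb = model.encode([query], convert_to_numpy=True, normalize_embeddings=True)
    scores, indices = index.search(q_emb, top_k)

    results = []
    for score, idx in zip(scores[0], indices[0]):
        if idx < 0:
            continue  # FAISS pads with -1 when the index holds fewer than top_k vectors
        results.append({
            "score": float(score),
            "chunk": chunks[int(idx)]
        })
    return results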
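A nearest-neighbor search always returns its top hits, even when none of them is actually relevant to the question. One possible refinement is to drop low-scoring results before they reach the prompt. The sketch below assumes results in the same shape that `search` returns; the helper name `filter_by_score`, the `0.3` cutoff, and the sample scores are illustrative assumptions rather than recommended values.

```python
def filter_by_score(results, min_score=0.3):
    """Keep only results whose similarity score is at least min_score (hypothetical helper)."""
    return [r for r in results if r["score"] >= min_score]


# Hand-written results in the shape produced by search(); the scores are invented
sample_results = [
    {"score": 0.71, "chunk": {"id": 0, "text": "Employees get 20 days of paid annual leave per year...",
                              "metadata": {"source": "doc_0"}}},
    {"score": 0.42, "chunk": {"id": 2, "text": "Employees can work from home up to 3 days per week...",
                              "metadata": {"source": "doc_2"}}},
    {"score": 0.05, "chunk": {"id": 1, "text": "ETL jobs run daily at 1 AM using Apache Airflow...",
                              "metadata": {"source": "doc_1"}}},
]

relevant = filter_by_score(sample_results)
print([r["chunk"]["metadata"]["source"] for r in relevant])
```

A suitable cutoff depends on the embedding model and the data, so it is worth checking against a handful of real questions.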


Step 7 – Build the “RAG answer” step

In [20]:
def build_prompt(question, results):
    context = "\n\n".join([r["chunk"]["text"] for r in results])
    prompt = f"""
You are a helpful assistant. Answer the question ONLY using the context below.

CONTEXT:
{context}

QUESTION:
{question}

If the answer is not in the context, say "I don't know from the provided documents."
"""
    return prompt
