# Retrieval-augmented generation (RAG)

This notebook walks through a compact, self-contained RAG example. It shows how pairing a retriever with a generator yields answers grounded in external knowledge rather than only in what a model "remembers". We will:

1. Create a tiny in-memory knowledge base for a fictional note-taking app called **LumaNote**.
2. Implement a basic vector-based retriever using TF-IDF and cosine similarity.
3. Define a naive generator that answers questions without retrieval, simulating a model that falls back on generic, ungrounded guesses (a stand-in for hallucination).
4. Define a RAG-style generator that first retrieves relevant passages and then uses them as context.
5. Compare answers with and without retrieval for a few example questions.

To keep the notebook self-contained, we do not call any external LLM APIs. Instead, we use simple Python functions to simulate a generator that either ignores or leverages retrieved context.
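
To make step 2 concrete before handing the work to scikit-learn, here is a minimal pure-Python sketch of TF-IDF weighting and cosine similarity. The toy corpus and the helper names (`tokenize`, `build_idf`, `tfidf`, `cosine`) are hypothetical and exist only for this illustration. The sketch assumes the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` with L2 normalization, and it skips stop-word removal, so its scores will not match the notebook's index exactly.

```python
import math
import re
from collections import Counter
from typing import Dict, List

# Hypothetical toy corpus; not part of the LumaNote knowledge base.
toy_docs = [
    "sync notes across devices",
    "share notes with teammates",
    "export encrypted backups",
]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_idf(texts: List[str]) -> Dict[str, float]:
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(texts)
    df = Counter(term for text in texts for term in set(tokenize(text)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(text: str, idf: Dict[str, float]) -> Dict[str, float]:
    """Sparse, L2-normalized TF-IDF vector; terms unseen in the corpus are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Dot product of two unit-length sparse vectors, which equals their cosine."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


idf = build_idf(toy_docs)
doc_vecs = [tfidf(d, idf) for d in toy_docs]
query_vec = tfidf("sync my notes", idf)
scores = [cosine(query_vec, v) for v in doc_vecs]
best = max(range(len(toy_docs)), key=scores.__getitem__)
print(toy_docs[best], [round(s, 3) for s in scores])
```

The first toy document shares both "sync" and "notes" with the query, so it ranks highest. The second shares only the more common term "notes", and the third shares nothing and scores zero.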

In [None]:
# If you run this notebook on Colab or a fresh environment, you may need:
# !pip install scikit-learn numpy

from typing import List, Dict, Any
import re
import textwrap

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

In [None]:
# A tiny "product documentation" knowledge base for LumaNote
documents: List[Dict[str, str]] = [
    {
        "id": "guide_sync",
        "title": "Syncing notes across devices",
        "text": """LumaNote automatically synchronizes your notes across all signed-in devices.
To enable sync, sign in with the same account on each device and keep the "Cloud sync" toggle turned on in Settings → Sync.
Sync uses end-to-end encryption. LumaNote servers cannot read your note contents.
Sync runs every few seconds when you are online. Large attachments may take longer to upload."""
    },
    {
        "id": "guide_offline",
        "title": "Offline mode and local cache",
        "text": """When you lose network connectivity, LumaNote switches to offline mode.
You can continue creating and editing notes while offline. Changes are stored in a local cache.
When the connection is restored, LumaNote reconciles offline changes and uploads them to the cloud.
If the same note was edited on two devices while offline, you will be asked which version to keep."""
    },
    {
        "id": "guide_sharing",
        "title": "Sharing notes with teammates",
        "text": """You can share a note with teammates by clicking the Share button in the top-right corner.
Add teammates by email address and choose whether they can view or edit.
Shared notes show an avatar for each active collaborator. Changes appear in real time for everyone."""
    },
    {
        "id": "guide_security",
        "title": "Security, encryption, and backups",
        "text": """LumaNote uses end-to-end encryption for note contents in transit and at rest.
Backups are stored in multiple regions. You can export an encrypted backup file at any time from Settings → Backups.
If you forget your password and do not have a recovery key, LumaNote cannot recover your encrypted notes."""
    },
]

len(documents), [doc["title"] for doc in documents]

(4,
 ['Syncing notes across devices',
  'Offline mode and local cache',
  'Sharing notes with teammates',
  'Security, encryption, and backups'])

In [None]:
# Build a TF-IDF vector index for the document texts
corpus = [doc["text"] for doc in documents]
vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(corpus)


def retrieve(query: str, k: int = 3) -> List[Dict[str, Any]]:
    """Return up to k documents ranked by cosine similarity to the query.

    Documents with zero similarity (no shared vocabulary with the query) are
    skipped, so the result may be empty or shorter than k.
    """
    query_vec = vectorizer.transform([query])
    similarities = cosine_similarity(query_vec, doc_vectors)[0]
    # Indices sorted by descending similarity, truncated to the top k
    top_indices = np.argsort(similarities)[::-1][:k]

    results: List[Dict[str, Any]] = []
    for idx in top_indices:
        if similarities[idx] <= 0.0:
            continue  # no term overlap with the query
        results.append(
            {
                "id": documents[idx]["id"],
                "title": documents[idx]["title"],
                "score": float(similarities[idx]),
                "text": documents[idx]["text"],
            }
        )
    return results


# Quick smoke test of retrieval
for r in retrieve("sync notes between devices"):
    print(f"{r['title']}  (score={r['score']:.3f})")

Syncing notes across devices  (score=0.597)
Offline mode and local cache  (score=0.103)
Sharing notes with teammates  (score=0.039)


In [None]:
def naive_answer_without_retrieval(question: str) -> str:
    """A deliberately naive answer generator that does not look at any external context.

    This function simulates a model that tries to answer from vague "general knowledge"
    and may therefore hallucinate details about the LumaNote product.
    """
    # Very simple rule-based responses keyed on substrings of the lowercased question.
    q_lower = question.lower()
    if "sync" in q_lower or "synchroniz" in q_lower:
        return (
            "You can usually sync notes between devices from the app settings. "
            "Look for a generic 'Sync' or 'Cloud' option and sign in with the same account."
        )
    if "offline" in q_lower:
        return (
            "Most note apps offer some form of offline mode, but the exact behavior depends "
            "on the product version and your subscription."
        )
    if "share" in q_lower:
        return (
            "Sharing is typically done via a Share button where you can invite collaborators, "
            "but the details will vary by app."
        )
    if "encrypt" in q_lower or "security" in q_lower:
        return (
            "Many tools claim to use encryption, but you should check the security "
            "documentation of your specific product."
        )
    return (
        "I do not have specific knowledge about this product. You may need to consult its documentation."
    )


# Quick smoke test
print(naive_answer_without_retrieval("How can I sync my notes between my phone and laptop?"))

You can usually sync notes between devices from the app settings. Look for a generic 'Sync' or 'Cloud' option and sign in with the same account.


In [None]:
def answer_with_rag(question: str, k: int = 3) -> Dict[str, Any]:
    """RAG-style answer:

    1. Retrieve top-k relevant documents.
    2. Use a simple heuristic generator that selects sentences from the retrieved context.
    """
    retrieved_docs = retrieve(question, k=k)

    if not retrieved_docs:
        return {
            "answer": "I could not find information about this question in the available documents.",
            "retrieved_docs": [],
        }

    # Lowercase the question's words, keeping only those longer than three characters
    query_term_set = {w.lower() for w in re.findall(r"\w+", question) if len(w) > 3}

    # Split retrieved documents into candidate sentences
    sentences: List[str] = []
    for doc in retrieved_docs:
        for sent in re.split(r"(?<=[.!?])\s+", doc["text"].strip()):
            sent = sent.strip()
            if sent:
                sentences.append(sent)

    # Score each sentence by overlap with query terms
    scored: List[Any] = []
    for sent in sentences:
        sent_terms = set(w.lower() for w in re.findall(r"\w+", sent))
        overlap = len(sent_terms & query_term_set)
        if overlap > 0:
            scored.append((overlap, sent))

    if not scored:
        answer = "I could not find information about this question in the available documents."
    else:
        scored.sort(reverse=True, key=lambda x: x[0])
        top_sentences = [s for _, s in scored[:3]]
        answer = " ".join(top_sentences)

    return {
        "answer": answer,
        "retrieved_docs": retrieved_docs,
    }


# Quick smoke test
result = answer_with_rag("How can I sync my notes between my phone and laptop?")
print(result["answer"])

LumaNote automatically synchronizes your notes across all signed-in devices. To enable sync, sign in with the same account on each device and keep the "Cloud sync" toggle turned on in Settings → Sync. Sync uses end-to-end encryption.
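
The heuristic generator above stands in for a language model. With a real LLM, the retrieved passages would typically be formatted into a prompt instead of being searched for overlapping sentences. The following is a minimal sketch of that step under stated assumptions: `build_prompt`, the instruction wording, and `sample_passages` are hypothetical, and no model is actually called.

```python
from typing import Dict, List


def build_prompt(question: str, passages: List[Dict[str, str]]) -> str:
    """Hypothetical helper: format retrieved passages and a question into one prompt string."""
    # Number each passage so a model could cite its sources as [1], [2], ...
    context_blocks = [
        f"[{i}] {p['title']}\n{p['text']}" for i, p in enumerate(passages, start=1)
    ]
    context = "\n\n".join(context_blocks)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical passage in the same shape as the dictionaries returned by a retriever
sample_passages = [
    {
        "title": "Syncing notes across devices",
        "text": "To enable sync, sign in with the same account on each device.",
    },
]
prompt = build_prompt("How do I sync my notes?", sample_passages)
print(prompt)
```

Telling the model to rely only on the supplied context, and to say when the context is insufficient, is one common way to keep answers grounded. The exact wording here is an assumption, not a fixed recipe.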


In [None]:
def compare_answers(question: str) -> None:
    """Print the naive and retrieval-augmented answers side by side, plus the retrieved documents."""
    print("=" * 80)
    print("Question:", question)
    print("\n[Naive answer without retrieval]")
    print(naive_answer_without_retrieval(question))

    rag_result = answer_with_rag(question)
    print("\n[Retrieval-augmented answer]")
    print(rag_result["answer"])

    print("\nTop retrieved documents:")
    for doc in rag_result["retrieved_docs"]:
        snippet = textwrap.shorten(doc["text"], width=120, placeholder="…")
        print(f"- {doc['title']} (score={doc['score']:.3f}): {snippet}")


# Try a few representative questions
compare_answers("How can I sync my notes between my phone and laptop?")
compare_answers("What happens if I lose connection while editing a note?")
compare_answers("How do I share a note with my team?")
compare_answers("Does LumaNote encrypt my notes and backups?")

================================================================================
Question: How can I sync my notes between my phone and laptop?

[Naive answer without retrieval]
You can usually sync notes between devices from the app settings. Look for a generic 'Sync' or 'Cloud' option and sign in with the same account.

[Retrieval-augmented answer]
LumaNote automatically synchronizes your notes across all signed-in devices. To enable sync, sign in with the same account on each device and keep the "Cloud sync" toggle turned on in Settings → Sync. Sync uses end-to-end encryption.

Top retrieved documents:
- Syncing notes across devices (score=0.652): LumaNote automatically synchronizes your notes across all signed-in devices. To enable sync, sign in with the same…
- Sharing notes with teammates (score=0.048): You can share a note with teammates by clicking the Share button in the top-right corner. Add teammates by email…
- Security, encryption, and backups (score=0.045): LumaNote uses end-to-end encryption for note contents in transit and at rest. Backups are store