A comprehensive, math-first learning path for Retrieval-Augmented Generation.
https://nitinkc.github.io/RAG-LearningTutorial/
This tutorial teaches you how to build RAG systems from first principles, with emphasis on solving real-world problems like exact identifier matching (Order #1766 vs #1767).
Learn:
- Linear algebra and vector mathematics fundamentals
- How embeddings work and why they sometimes fail
- Similarity search algorithms and vector databases
- Hybrid search (combining semantic + keyword search)
- Complete RAG pipeline: ingestion → retrieval → generation
- Evaluation and production considerations
You need:
- Basic Python (numpy, loops, functions)
- High school math (algebra, logarithms)
- Curiosity about how things work
You do NOT need:
- Advanced linear algebra
- PhD-level statistics
- Deep learning expertise
- Prior RAG experience
First-time learner? Start here:
- index.md - Overview and learning path
- Prerequisites - Math foundations
- Embeddings - How text becomes numbers
- Similarity Search - Finding similar documents
- Retrieval Methods - Dense, sparse, and hybrid
- The Exact Match Problem - YOUR core question
- RAG Pipeline - Full system
Looking for a specific topic? Jump to:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python3 -m mkdocs serve
Then open http://localhost:8000 in your browser.
python3 -m mkdocs build
Output will be in the site/ directory.
rag-learning-tutorial/
├── docs/
│ ├── index.md ← Start here
│ ├── 00-prerequisites/
│ │ ├── linear-algebra.md
│ │ └── probability-stats.md
│ ├── 01-embeddings/
│ │ ├── what-are-embeddings.md
│ │ ├── embedding-models.md
│ │ └── vector-spaces.md
│ ├── 02-similarity-search/
│ │ ├── distance-metrics.md
│ │ ├── exact-vs-ann.md
│ │ └── vector-databases.md
│ ├── 03-retrieval/
│ │ ├── dense-retrieval.md
│ │ ├── sparse-retrieval.md
│ │ ├── hybrid-search.md ← Key solution
│ │ ├── metadata-filtering.md
│ │ └── reranking.md
│ ├── 04-exact-match/
│ │ ├── index.md ← Your problem
│ │ ├── why-semantic-fails.md
│ │ ├── hybrid-solution.md ← Complete solution
│ │ └── chunking-strategies.md
│ ├── 05-rag-pipeline/
│ │ ├── index.md
│ │ ├── ingestion.md
│ │ ├── retrieval-augmentation.md
│ │ ├── generation.md
│ │ └── evaluation.md
│ ├── css/
│ │ └── extra.css
│ ├── js/
│ │ ├── mathjax.js
│ │ └── theme-toggle.js
│ └── references.md
├── mkdocs.yml ← Site configuration
├── requirements.txt ← Python dependencies
└── README.md ← This file
You asked about exact ID matching (Order #1766 vs #1767). Here's the path:
- Problem Understanding
  - Why Semantic Search Fails
  - Mathematical explanation of embedding similarity
- Solution Architecture
  - Hybrid Search
  - How to combine semantic + keyword search
- Complete Implementation
  - Hybrid Solution
  - Step-by-step code example
- Supporting Strategies
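As a rough illustration of how semantic and keyword scores can be combined, here is a minimal sketch of weighted score fusion. The helper names (`min_max`, `hybrid_scores`), the equal weights, and the score values are assumptions made up for this example, not taken from the tutorial pages. Min-max scaling is one of several possible normalization choices.

```python
import numpy as np

def min_max(scores):
    """Scale scores to [0, 1]; return zeros if all scores are equal."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    if span == 0:
        return np.zeros_like(scores)
    return (scores - scores.min()) / span

def hybrid_scores(dense, sparse, w_dense=0.5, w_sparse=0.5):
    """Weighted sum of normalized dense and sparse scores (hypothetical helper)."""
    return w_dense * min_max(dense) + w_sparse * min_max(sparse)

docs = ["Order #1766 shipped", "Order #1767 delayed", "Refund policy"]
dense = [0.91, 0.90, 0.35]  # made-up cosine similarities: the two orders look nearly identical
sparse = [2.1, 0.4, 0.0]    # made-up BM25 scores: only the first document contains "#1766"

scores = hybrid_scores(dense, sparse)
best = docs[int(np.argmax(scores))]
print(best, scores)
```

With these illustrative numbers the dense scores alone barely separate the two orders, while the keyword score breaks the tie in favor of the document that contains the exact identifier.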
This tutorial teaches:
- Linear Algebra: Vectors, dot products, norms, cosine similarity
- Probability: IDF, term frequency, TF-IDF scoring
- Geometry: High-dimensional space behavior, distance metrics
- Information Retrieval: BM25, ranking algorithms, evaluation metrics
All with intuition + derivations + code examples.
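To make two of the listed ideas concrete, the sketch below computes cosine similarity from a dot product and norms, and a smoothed inverse document frequency. The function names, the toy vectors, and the three-document corpus are assumptions for illustration. The IDF formula shown is one common smoothed variant (the default in scikit-learn), not necessarily the one the tutorial derives.

```python
import math
import numpy as np

def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (||a|| * ||b||)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def idf(term, tokenized_docs):
    """Smoothed IDF: log((N + 1) / (df + 1)) + 1, where df counts documents containing the term."""
    n = len(tokenized_docs)
    df = sum(term in doc for doc in tokenized_docs)
    return math.log((n + 1) / (df + 1)) + 1

u = [1.0, 2.0, 2.0]
v = [2.0, 4.0, 4.0]   # same direction as u, twice the length
w = [2.0, -1.0, 0.0]  # orthogonal to u

toy_docs = [["order", "1766"], ["order", "1767"], ["refund", "policy"]]

print(cosine_similarity(u, v))   # parallel vectors
print(cosine_similarity(u, w))   # orthogonal vectors
print(idf("order", toy_docs), idf("1766", toy_docs))
```

Cosine similarity ignores vector length, so `u` and `v` count as identical in direction. The rarer token `1766` receives a higher IDF than the common token `order`, which is why keyword scoring can separate near-identical identifiers.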
Each topic includes practical Python code:
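# Embeddings
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
docs = ["Order #1766 shipped", "Order #1767 delayed"]  # illustrative corpus
doc_vectors = model.encode(docs)         # shape (2, 384)
embedding = model.encode("Order #1766")  # shape (384,)
# Similarity search
import faiss
index = faiss.IndexHNSWFlat(384, 32)     # 384 dimensions, 32 links per node
index.add(doc_vectors)                   # vectors must be added before searching
query_vector = embedding.reshape(1, -1)  # FAISS expects a 2-D float32 array
distances, indices = index.search(query_vector, k=2)
# BM25 (keyword search)
from rank_bm25 import BM25Okapi
corpus = [doc.lower().split() for doc in docs]  # BM25Okapi takes tokenized documents
bm25 = BM25Okapi(corpus)
tokens = "order #1766".split()
scores = bm25.get_scores(tokens)
# Hybrid combination
# dense_norm and sparse_norm are per-document scores scaled to a common range;
# w_dense and w_sparse are weights you choose
hybrid_score = w_dense * dense_norm + w_sparse * sparse_norm
# Vector database
from qdrant_client import QdrantClient
from qdrant_client.models import FieldCondition, Filter, MatchValue
client = QdrantClient(":memory:")  # in-process instance for local experiments
# newer qdrant-client releases provide query_points() as the replacement for search()
results = client.search(
    collection_name="orders",  # hypothetical collection; must already exist and hold vectors
    query_vector=embedding,
    query_filter=Filter(must=[FieldCondition(key="order_id", match=MatchValue(value="1766"))]),
)
For Learning: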
- MkDocs Material - Documentation engine used
- Jupyter Notebooks - Interactive exploration
- Hugging Face Spaces - Run code without setup
For Implementation:
- Qdrant - Vector database (free)
- LangChain - RAG framework
- Sentence Transformers - Embedding models
- rank-bm25 - BM25 library
For Production:
- Elasticsearch - Production search
- Pinecone - Managed vector DB
- Weaviate - Enterprise vector DB
- Quick read (focusing on your question): 4-6 hours
- Full tutorial (all sections): 20-30 hours
- Implementation (building a system): 40+ hours
No advanced math background is required. Section 0 teaches everything from scratch with intuition and derivations.
Skipping the math is not recommended. The math explains WHY things work; without it you will copy code without understanding why the Order #1766 pattern fails.
- Math-first approach: Explains WHY, not just HOW
- Problem-focused: Built around solving the exact match problem
- Production-ready: Covers real challenges (chunking, filtering, re-ranking)
- Complete: Links all concepts together
The concepts and code examples are production-ready. For large scale, use Qdrant or Pinecone instead of a local FAISS index.
Found a mistake or want to add content?
- Fork this repository
- Create a new branch (git checkout -b fix/issue)
- Make changes
- Submit a pull request
This tutorial is provided as-is for learning purposes.
- Start reading: Open docs/index.md
- Build locally: python3 -m mkdocs serve
- Experiment: Modify code examples as you learn
- Implement: Build your own RAG system