A comprehensive, production-ready tutorial for building Retrieval-Augmented Generation (RAG) systems using LangChain.
Features: 15 advanced RAG architectures | Multimodal RAG (images + text) | Fine-tuning embeddings | RAGAS evaluation | SQL & Graph support | Docker & production templates | Complete testing & CI/CD
# Clone repository
git clone https://github.com/gianlucamazza/langchain-rag-tutorial.git
cd langchain-rag-tutorial
# Setup environment
python3 -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt
# Configure API key
echo "OPENAI_API_KEY=sk-proj-your-key-here" > .env
# Start learning
jupyter notebook notebooks/00_index.ipynb
Full guide: docs/GETTING_STARTED.md
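The `.env` file written above holds a single `KEY=VALUE` line. As a rough illustration of how such a file can reach `os.environ`, here is a stdlib-only sketch. The helper name `load_env_file` is hypothetical and not part of this repository; the notebooks may load the key differently.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE pairs from a dotenv-style file into os.environ.

    Hypothetical helper for illustration only. Variables that are already
    set in the environment are left untouched. Returns the pairs found.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded
```

Calling `load_env_file()` from the repository root before creating any OpenAI client would make `OPENAI_API_KEY` available through `os.environ`.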
Master the core concepts of RAG:
- Document Loading & Splitting - Process and chunk text efficiently
- Embeddings Comparison - OpenAI vs HuggingFace benchmarks
- Simple RAG - Build your first end-to-end RAG system
Start with Fundamentals →
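To make the splitting step concrete, the sketch below chunks text into fixed-size character windows with overlap. It is a simplified stand-in, not the code used in the notebooks; LangChain ships its own splitters (for example `RecursiveCharacterTextSplitter`), and the function name and default sizes here are assumptions chosen for illustration.

```python
def split_text(text, chunk_size=200, chunk_overlap=40):
    """Split text into character chunks of chunk_size with chunk_overlap.

    Illustrative only: real splitters usually also respect sentence or
    paragraph boundaries.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(split_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

The overlap means that text cut at one chunk boundary still appears whole at the start of the next chunk, at the cost of embedding some characters twice.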
Explore 15 production-ready patterns:
| Architecture | Complexity | Use Case | Key Feature |
|---|---|---|---|
| Memory RAG | ⭐⭐ | Chatbots | Conversation history |
| Branched RAG | ⭐⭐⭐ | Research | Multi-query parallel retrieval |
| HyDe | ⭐⭐⭐ | Ambiguous queries | Hypothetical documents |
| Adaptive RAG | ⭐⭐⭐⭐ | Mixed workloads | Intelligent query routing |
| Corrective RAG | ⭐⭐⭐⭐ | High accuracy | Quality check + web fallback |
| Self-RAG | ⭐⭐⭐⭐⭐ | Self-correcting | Autonomous refinement |
| Agentic RAG | ⭐⭐⭐⭐⭐ | Complex reasoning | Multi-tool agent loops |
| Contextual RAG ✨ | ⭐⭐⭐ | Technical docs | Context-augmented chunks |
| Fusion RAG ✨ | ⭐⭐⭐ | Best ranking | Reciprocal Rank Fusion |
| SQL RAG ✨ | ⭐⭐⭐⭐ | Analytics/BI | Natural Language to SQL |
| GraphRAG ✨ | ⭐⭐⭐⭐⭐ | Knowledge graphs | Entity relationships + multi-hop |
| Multimodal RAG 🆕 | ⭐⭐⭐⭐ | Images + text | GPT-4 Vision + OCR |
| Fine-tuning Guide 🆕 | ⭐⭐⭐⭐ | Domain embeddings | Custom embedding models |
| RAGAS Evaluation | - | Quality metrics | Comprehensive RAG assessment |
Explore Advanced Patterns →
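The table names Reciprocal Rank Fusion as the key feature of Fusion RAG. A minimal sketch of that scoring rule follows: each document receives the sum of `1 / (k + rank)` over every ranked list it appears in. The function name and the default `k=60` (a commonly used constant) are assumptions here; notebook 13 may implement it differently.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)), with rank starting at 1 in
    every input list. Returns (doc_id, score) pairs, best first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by ID so output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Three hypothetical retrievers (or query variants) ranking the same corpus.
fused = reciprocal_rank_fusion([
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
])
print([doc_id for doc_id, _ in fused])
# ['b', 'a', 'c', 'd']
```

Because only ranks are used, the lists can come from retrievers whose raw similarity scores are not comparable with each other.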
Comprehensive docs organized by topic:
- Getting Started - 5-minute quick start
- Installation - Detailed setup guide
- API Reference - Shared module documentation
- Architecture - Design decisions
- Troubleshooting - Common issues & solutions
- Performance - Benchmarks & optimization
- FAQ - Frequently asked questions
- Deployment - Production deployment
- Examples - Usage patterns
- Contributing - Contribution guidelines
- Changelog - Version history
llm_rag/
├── docs/                                # Modular documentation
│   ├── GETTING_STARTED.md               # Quick start guide
│   ├── INSTALLATION.md                  # Setup instructions
│   ├── API_REFERENCE.md                 # Shared module API
│   └── ... (8 more specialized docs)
├── notebooks/
│   ├── 00_index.ipynb                   # START HERE - Navigation hub
│   ├── fundamentals/                    # Core RAG concepts (01-03)
│   │   ├── 01_setup_and_basics.ipynb
│   │   ├── 02_embeddings_comparison.ipynb
│   │   └── 03_simple_rag.ipynb
│   └── advanced_architectures/          # Advanced patterns (04-18)
│       ├── 04_rag_with_memory.ipynb
│       ├── 05_branched_rag.ipynb
│       ├── 06_hyde.ipynb
│       ├── 07_adaptive_rag.ipynb
│       ├── 08_corrective_rag.ipynb
│       ├── 09_self_rag.ipynb
│       ├── 10_agentic_rag.ipynb
│       ├── 11_comparison.ipynb          # All 12 architectures
│       ├── 12_contextual_rag.ipynb      # v1.1.0
│       ├── 13_fusion_rag.ipynb          # v1.1.0
│       ├── 14_sql_rag.ipynb             # v1.1.0
│       ├── 15_graphrag.ipynb            # v1.1.0
│       ├── 16_evaluation_ragas.ipynb    # v1.1.0 - Quality metrics
│       ├── 17_multimodal_rag.ipynb      # v1.2.0 - Images + Text
│       └── 18_finetuning_embeddings.ipynb  # v1.2.0 - Custom embeddings
├── templates/                           # Production deployment templates (NEW v1.2.0)
│   ├── fastapi/                         # REST API template
│   ├── streamlit/                       # Web UI template
│   └── lambda/                          # AWS Lambda serverless
├── tests/                               # Test suite (NEW v1.2.0)
│   ├── conftest.py                      # pytest fixtures
│   ├── test_utils.py                    # Utility tests
│   └── test_config.py                   # Config tests
├── shared/                              # Reusable utilities (1500+ lines)
│   ├── config.py                        # Configuration management
│   ├── utils.py                         # Utility functions
│   ├── loaders.py                       # Document loading
│   └── prompts.py                       # Prompt templates (30+ prompts)
├── data/                                # Vector stores, Chinook DB (gitignored)
├── Dockerfile                           # Docker support (NEW v1.2.0)
├── docker-compose.yml                   # Full stack orchestration (NEW v1.2.0)
├── Makefile                             # Development commands (NEW v1.2.0)
├── pytest.ini                           # Test configuration (NEW v1.2.0)
├── .pre-commit-config.yaml              # Pre-commit hooks (NEW v1.2.0)
├── requirements.txt                     # Production dependencies
├── requirements-dev.txt                 # Development dependencies (NEW v1.2.0)
├── .env.example                         # API key template
└── README.md                            # This file
Core Capabilities:
- ✅ 12 RAG Architectures - From simple to graph-based
- ✅ Multimodal RAG 🆕 - GPT-4 Vision + OCR for images + text
- ✅ Fine-tuning Guide 🆕 - Train custom domain-specific embeddings
- ✅ RAGAS Evaluation - Comprehensive quality metrics
- ✅ SQL & Graph Support - Structured data + relationships
- ✅ Modular Design - Reusable shared utilities (DRY)
- ✅ Vector Store Persistence - No re-embedding needed
- ✅ Comprehensive Benchmarks - Performance & cost analysis
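The "no re-embedding needed" point comes down to storing computed vectors on disk and reusing them on the next run. The sketch below illustrates that idea with a JSON file keyed by a hash of the chunk text. The class `EmbeddingCache` and the `embed_fn` callable are hypothetical; the repository persists its vector stores in its own way, which this does not reproduce.

```python
import hashlib
import json
from pathlib import Path


class EmbeddingCache:
    """Persist embeddings on disk, keyed by a SHA-256 hash of the text.

    Illustrative sketch: embed_fn is any callable mapping str -> list[float]
    (for example a wrapper around an embeddings API).
    """

    def __init__(self, path, embed_fn):
        self.path = Path(path)
        self.embed_fn = embed_fn
        self.store = {}
        if self.path.exists():
            # Reload vectors computed in earlier runs.
            self.store = json.loads(self.path.read_text(encoding="utf-8"))

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self.store:
            # Cache miss: embed once, then write the updated store to disk.
            self.store[key] = self.embed_fn(text)
            self.path.write_text(json.dumps(self.store), encoding="utf-8")
        return self.store[key]
```

A second process constructed with the same `path` would return stored vectors without calling `embed_fn` again, which is where the cost saving comes from.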
Production-Ready Infrastructure:
- ✅ Docker Support - Multi-stage builds with Redis, Prometheus, Grafana
- ✅ 3 Deployment Templates - FastAPI, Streamlit, AWS Lambda
- ✅ Testing Suite - pytest with 70%+ coverage target
- ✅ CI/CD Pipelines - GitHub Actions (testing, linting, coverage)
- ✅ Development Tools - Makefile, pre-commit hooks, linting
- ✅ Security Best Practices - Non-root Docker, API key management
Technical Stack:
- LangChain v1.0+ - Framework & LCEL
- OpenAI GPT-4o-mini + GPT-4 Vision - Fast LLM + multimodal
- FAISS - Vector similarity search
- Sentence Transformers + Accelerate - Fine-tuning embeddings
- NetworkX - Graph algorithms
- SQLAlchemy - Database abstraction
- RAGAS - RAG evaluation framework
- Tesseract OCR + Pillow - Image processing
- FastAPI + Streamlit - Production deployment
- Docker + Docker Compose - Containerization
- pytest + GitHub Actions - Testing & CI/CD
- Python 3.9+ - Modern type hints
See Architecture Details →
Choose based on your needs:
| Your Need | Architecture | Docs |
|---|---|---|
| Fast & simple | Simple RAG | 03_simple_rag.ipynb |
| Chatbot with memory | Memory RAG | 04_rag_with_memory.ipynb |
| Research tool | Fusion RAG | 13_fusion_rag.ipynb ✨ |
| Ambiguous queries | Contextual RAG | 12_contextual_rag.ipynb ✨ |
| Cost optimization | Adaptive RAG | 07_adaptive_rag.ipynb |
| High accuracy | Fusion / CRAG | 13_fusion_rag.ipynb ✨ |
| Self-correcting | Self-RAG | 09_self_rag.ipynb |
| Complex reasoning | Agentic RAG | 10_agentic_rag.ipynb |
| Analytics/BI | SQL RAG | 14_sql_rag.ipynb ✨ |
| Knowledge graphs | GraphRAG | 15_graphrag.ipynb ✨ |
| Images + text 🆕 | Multimodal RAG | 17_multimodal_rag.ipynb 🆕 |
| Custom embeddings 🆕 | Fine-tuning Guide | 18_finetuning_embeddings.ipynb 🆕 |
| Quality evaluation | RAGAS | 16_evaluation_ragas.ipynb ✨ |
Rule of thumb: Start with Simple RAG → Add Contextual RAG for quality → Use a specialized architecture for specific needs.
Performance tip: For domain-specific content with accuracy below 75%, consider fine-tuning embeddings (notebook 18) for a 15-25% improvement.
Need help choosing? See FAQ →
Ready to deploy? Choose from 3 production-ready templates:
# Quick start with Docker Compose
docker-compose up -d
# Or build custom image
docker build -t langchain-rag:latest .
docker run -p 8000:8000 --env-file .env langchain-rag:latest
Features: Multi-stage builds, Redis caching, Prometheus + Grafana monitoring, non-root security
Complete REST API with automatic documentation:
cd templates/fastapi
pip install -r requirements.txt
uvicorn app:app --reload
# Visit http://localhost:8000/docs for Swagger UI
Features: Pydantic validation, CORS, health checks, error handling, async support
Interactive web application:
cd templates/streamlit
pip install -r requirements.txt
streamlit run streamlit_app.py
Features: Real-time queries, source document display, architecture selection, metrics visualization
Deploy to AWS Lambda:
cd templates/lambda
zip -r function.zip .
# Note: create-function also requires --role, plus --runtime and --handler for a zip package
aws lambda create-function --function-name rag-api --zip-file fileb://function.zip
Features: S3 vector store integration, cold start optimization, API Gateway ready
Full guide: docs/DEPLOYMENT.md
| Architecture | Latency | Cost/Query | Accuracy | Best For |
|---|---|---|---|---|
| Simple RAG | ~2s | $0.002 | Good | General Q&A |
| Contextual RAG ✨ | ~2-3s | $0.002 | Very Good | Technical docs |
| Fusion RAG ✨ | ~5-8s | $0.006 | Excellent | Research |
| SQL RAG ✨ | ~2-5s | $0.004 | Perfect* | Analytics |
| GraphRAG ✨ | ~3-8s | $0.010+ | Excellent** | Relationships |
| Adaptive RAG | Variable | $0.003 | Very Good | Mixed workloads |
| Agentic RAG | ~30s | $0.012 | Excellent | Complex tasks |
*For structured data | **For multi-hop queries
Full benchmarks: 11_comparison.ipynb | RAGAS Evaluation
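The per-query costs in the table translate into a monthly budget by simple multiplication. The sketch below works that arithmetic through using the table's figures; the traffic level of 1,000 queries per day and the function name are assumptions for illustration, and GraphRAG is left out because its cost is given only as a lower bound ($0.010+).

```python
# Cost per query in USD, copied from the benchmark table above.
COST_PER_QUERY_USD = {
    "Simple RAG": 0.002,
    "Contextual RAG": 0.002,
    "Fusion RAG": 0.006,
    "SQL RAG": 0.004,
    "Adaptive RAG": 0.003,
    "Agentic RAG": 0.012,
}


def monthly_cost(architecture, queries_per_day, days=30):
    """Estimated monthly spend in USD, rounded to cents."""
    return round(COST_PER_QUERY_USD[architecture] * queries_per_day * days, 2)


for name in ("Simple RAG", "Fusion RAG", "Agentic RAG"):
    print(name, monthly_cost(name, queries_per_day=1000))
# Simple RAG 60.0
# Fusion RAG 180.0
# Agentic RAG 360.0
```

At this hypothetical volume, Agentic RAG costs six times as much as Simple RAG per month, so the latency and accuracy columns should be weighed against that difference.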
- Python 3.9+ (3.10+ recommended)
- OpenAI API Key (Get one here)
- ~2GB RAM (4GB+ recommended for fine-tuning)
- ~2GB disk space (dependencies + models)
- System packages (for multimodal): Tesseract OCR, Poppler (see INSTALLATION.md)
Detailed requirements →
Recommended sequence:
- Setup (10 min): GETTING_STARTED.md
- Navigation Hub (5 min): 00_index.ipynb
- Fundamentals (30-40 min): Notebooks 01-03
- Choose Your Path:
- Fast track: Simple RAG → Contextual RAG → Your use case
- Deep dive: Complete all 12 architectures
- Comparison: Jump to 11_comparison.ipynb
- Evaluation: Try 16_evaluation_ragas.ipynb
Total time:
- Fast track: ~1-2 hours
- Complete tutorial: ~5-7 hours
- With multimodal + evaluation: ~7-9 hours
- Production deployment: +1-2 hours
Contributions welcome! See CONTRIBUTING.md for guidelines.
Ways to contribute:
- Report bugs
- Suggest features
- Improve documentation
- Submit pull requests
Quick commands with Makefile:
make install # Install all dependencies
make test # Run test suite with coverage
make lint # Run code quality checks
make format # Auto-format code (black, isort)
make docker-build # Build Docker image
make docker-run # Run full stack with Docker Compose
make clean # Clean cache and build files
Testing:
# Run all tests with coverage
pytest tests/ -v --cov=shared --cov-report=html
# Run specific test file
pytest tests/test_utils.py -v
# Run with markers
pytest -m "not slow" # Skip slow tests
Pre-commit hooks:
# Install hooks (runs on every commit)
pre-commit install
# Run manually
pre-commit run --all-files
CI/CD: Automated testing and linting on every push via GitHub Actions (Python 3.9, 3.10, 3.11)
More details: docs/CONTRIBUTING.md
MIT License - see LICENSE for details.
TL;DR: Free to use commercially, modify, and distribute. Just include the license.
- Documentation: docs/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- LangChain Docs: python.langchain.com
- Check FAQ first
- Search existing issues
- Open a new issue
- Ask in Discussions
⭐ If this helps you, please star the repo!
Latest: v1.2.1 - Critical import fixes, Python 3.9 compatibility, accelerate support | Made with ❤️ using Claude Code | View Changelog