Enterprise-grade RAG system for technical documentation with advanced retrieval strategies and precise source citation
Features • Demo • Quick Start • Architecture • Documentation
DocsChat RAG is a production-ready Retrieval-Augmented Generation system designed for querying technical documentation (Python, React, FastAPI) with enterprise-grade architecture patterns, multiple retrieval strategies, and accurate source citations.
- 🎯 Hybrid Search: Combines semantic (vector) and keyword (BM25) search with RRF fusion
- 🔄 HyDE Query Transformation: Hypothetical Document Embeddings for improved retrieval
- 📊 Intelligent Reranking: Cohere API integration for relevance optimization
- 📖 Source Citation: Precise page numbers and document references
- 💬 Conversational Memory: Multi-turn conversation context
- 🏗️ Clean Architecture: SOLID principles, dependency injection, comprehensive testing
- 🐳 Production Ready: Docker containerization, CI/CD pipelines, monitoring
- Multi-Source Ingestion: Scrapes and processes Python, React, and FastAPI official documentation
- Advanced Chunking: Semantic and recursive text splitting strategies
- Hybrid Retrieval:
- Semantic search (cosine similarity)
- Keyword search (BM25)
- RRF-based fusion
- Query Enhancement: HyDE transformation for better retrieval
- Smart Reranking: Cohere cross-encoder for relevance scoring
- Citation Tracking: Page numbers and source URLs preserved
- Conversational Context: Last N turns memory management
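The RRF fusion step listed above can be sketched in a few lines. This is an illustrative stand-alone version, not the project's actual code; the function name and the conventional `k = 60` smoothing constant are assumptions:

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists with Reciprocal Rank Fusion.

    Each ranking is a list of document IDs ordered best-first. A document's
    fused score is the sum of 1 / (k + rank) over every list it appears in
    (rank is 1-based), so documents ranked well by both retrievers win.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


semantic = ["doc3", "doc1", "doc7"]  # vector-search order
keyword = ["doc1", "doc9", "doc3"]   # BM25 order
print(rrf_fuse([semantic, keyword]))  # → ['doc1', 'doc3', 'doc9', 'doc7']
```

Note that `doc1` wins despite never ranking first in either list: appearing near the top of both retrievers beats a single first place, which is exactly the behavior hybrid search wants.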
- SOLID Principles: Clean, maintainable, extensible codebase
- Type Safety: Comprehensive type hints with mypy validation
- Testing: Unit and integration tests with pytest
- Documentation: Detailed docstrings (Google style) and architecture docs
- CI/CD: GitHub Actions for linting, testing, and deployment
- Observability: Structured logging with loguru
- Containerization: Docker and docker-compose setup
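The HyDE transformation mentioned above can be reduced to one idea: embed a drafted hypothetical answer instead of the raw question, because a fake answer passage tends to sit closer to real answer chunks in embedding space. The sketch below uses injected callables so it is backend-agnostic; all names are hypothetical and the real pipeline presumably wires this through LangChain:

```python
from typing import Callable, Sequence

HYDE_PROMPT = (
    "Write a short documentation passage that would answer this question:\n"
    "{question}\n"
)


def hyde_query(
    question: str,
    generate: Callable[[str], str],          # LLM call: prompt -> text
    embed: Callable[[str], Sequence[float]],  # embedder: text -> vector
) -> Sequence[float]:
    """Return the embedding of a hypothetical answer, not of the question."""
    hypothetical_doc = generate(HYDE_PROMPT.format(question=question))
    return embed(hypothetical_doc)
```

In the real system, `generate` would be the gpt-4o-mini call and `embed` the text-embedding-3-small call; the returned vector then feeds the semantic half of the hybrid retriever.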
Live Demo: docschat-rag.streamlit.app (coming soon)
User: "How do I handle CORS in FastAPI?"
DocsChat:
To handle CORS in FastAPI, use the CORSMiddleware:
```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
app = FastAPI()
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
```
Sources:
- FastAPI Documentation: Advanced User Guide > CORS (Page 47)
- Official Docs Link
---
## 🚀 Quick Start
### Prerequisites
- Python 3.11+
- Poetry (recommended) or pip
- OpenAI API key
- Cohere API key (optional, for reranking)
### Installation
```bash
# Clone repository
git clone https://github.com/RomanRosa/docschat-rag.git
cd docschat-rag
# Install dependencies with Poetry
poetry install
# OR with pip
pip install -r requirements.txt
# Setup environment variables
cp .env.example .env
# Edit .env with your API keys
```

### Configuration

```env
# .env
OPENAI_API_KEY=your_openai_api_key_here
COHERE_API_KEY=your_cohere_api_key_here
# LLM Configuration
LLM_MODEL=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-small
TEMPERATURE=0.0
# Retrieval Configuration
TOP_K=10
RERANK_TOP_K=3
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
# Vector Store
CHROMADB_PERSIST_DIR=./data/vectorstore
```

### Usage

```bash
# 1. Ingest documentation (one-time setup)
poetry run python scripts/ingest_docs.py --sources python react fastapi
# 2. Build vector index
poetry run python scripts/rebuild_index.py
# 3. Launch Streamlit UI
poetry run streamlit run src/ui/app.py
```

Access at: http://localhost:8501
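As a sketch of how the `.env` values above can be consumed, here is a tiny stdlib-only loader with the README's defaults baked in. This is a hypothetical helper, not the project's actual code; a typed-settings library such as pydantic would be an equally natural fit:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Subset of the .env options, with the defaults listed above."""
    llm_model: str
    embedding_model: str
    temperature: float
    top_k: int
    rerank_top_k: int
    chunk_size: int
    chunk_overlap: int


def load_settings() -> Settings:
    """Build Settings from environment variables, falling back to defaults."""
    env = os.environ
    return Settings(
        llm_model=env.get("LLM_MODEL", "gpt-4o-mini"),
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-3-small"),
        temperature=float(env.get("TEMPERATURE", "0.0")),
        top_k=int(env.get("TOP_K", "10")),
        rerank_top_k=int(env.get("RERANK_TOP_K", "3")),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
    )
```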
### Docker

```bash
# Build and run with docker-compose
docker-compose up -d

# Access at http://localhost:8501
```

---

## 🏗️ Architecture

```
┌─────────────┐
│   User UI   │  Streamlit Chat Interface
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────────────┐
│        RAG Pipeline Orchestrator        │
│                                         │
│  ┌───────────────────────────────────┐  │
│  │ Query Processor                   │  │
│  │  - Validation                     │  │
│  │  - HyDE Transformation            │  │
│  └─────────────────┬─────────────────┘  │
│                    ▼                    │
│  ┌───────────────────────────────────┐  │
│  │ Hybrid Retriever                  │  │
│  │  ┌───────────┐     ┌───────────┐  │  │
│  │  │ Semantic  │     │  Keyword  │  │  │
│  │  │ (Vector)  │     │  (BM25)   │  │  │
│  │  └─────┬─────┘     └─────┬─────┘  │  │
│  │        └────────┬────────┘        │  │
│  │       ┌─────────▼─────────┐       │  │
│  │       │    RRF Fusion     │       │  │
│  │       └─────────┬─────────┘       │  │
│  └─────────────────┬─────────────────┘  │
│                    ▼                    │
│  ┌───────────────────────────────────┐  │
│  │ Cohere Reranker                   │  │
│  └─────────────────┬─────────────────┘  │
│                    ▼                    │
│  ┌───────────────────────────────────┐  │
│  │ LLM Generator (GPT-4o-mini)       │  │
│  │  - Prompt with context            │  │
│  │  - Citation injection             │  │
│  │  - Memory management              │  │
│  └───────────────────────────────────┘  │
└─────────────────────────────────────────┘
```
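Read top to bottom, the pipeline reduces to four stages chained in order. A minimal orchestration sketch, with the stages injected as callables (all names hypothetical, not the project's actual classes), might look like:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Answer:
    text: str
    sources: list[str]


def run_pipeline(
    question: str,
    *,
    transform: Callable[[str], str],                 # HyDE transformation
    retrieve: Callable[[str], list[str]],            # hybrid retrieval + RRF
    rerank: Callable[[str, list[str]], list[str]],   # reranking to top-k
    generate: Callable[[str, list[str]], Answer],    # grounded generation
) -> Answer:
    """Run one question through the four stages shown in the diagram."""
    query = transform(question)            # rewrite query for retrieval
    candidates = retrieve(query)           # fused semantic + keyword hits
    context = rerank(question, candidates) # keep only the most relevant
    return generate(question, context)     # answer with citations attached
```

Note that reranking uses the original question, not the HyDE-transformed query: the hypothetical document only helps find candidates, while relevance scoring is against what the user actually asked.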
| Module | Responsibility | Key Classes |
|---|---|---|
| `ingestion/` | Document scraping, parsing, chunking | `PythonDocsIngester`, `SemanticChunker` |
| `vectorization/` | Embeddings generation, vector storage | `OpenAIEmbedder`, `ChromaVectorStore` |
| `retrieval/` | Search strategies, reranking | `HybridRetriever`, `CohereReranker` |
| `generation/` | LLM calls, prompt templating | `OpenAIGenerator`, `ConversationalMemory` |
| `pipeline/` | End-to-end orchestration | `RAGPipeline`, `QueryProcessor` |
| `ui/` | Streamlit interface | `ChatInterface`, `SourcePanel` |
See ARCHITECTURE.md for detailed design documentation.
- Architecture Guide: System design, component diagrams, design decisions
- API Reference: Module APIs, class references
- Development Guide: Local setup, testing, contributing
- Deployment Guide: Docker, Streamlit Cloud, AWS deployment
## Testing

```bash
# Run all tests
poetry run pytest

# Run with coverage
poetry run pytest --cov=src --cov-report=html

# Run specific test module
poetry run pytest tests/unit/test_retrieval.py

# Run integration tests
poetry run pytest tests/integration/
```

Current coverage: 92% (target: 90%+)
## Development

```bash
# Install dev dependencies
poetry install --with dev

# Setup pre-commit hooks
pre-commit install

# Run linting
poetry run ruff check src/
poetry run mypy src/

# Format code
poetry run black src/
```

- Formatter: Black (line length: 100)
- Linter: Ruff (replaces flake8, isort, pylint)
- Type Checker: Mypy (strict mode)
- Docstrings: Google style
- Commits: Conventional Commits
| Category | Technology |
|---|---|
| Framework | LangChain 0.1.0 |
| LLM | OpenAI GPT-4o-mini |
| Embeddings | OpenAI text-embedding-3-small |
| Vector DB | ChromaDB |
| Reranking | Cohere API |
| UI | Streamlit |
| Testing | Pytest, pytest-cov |
| CI/CD | GitHub Actions |
| Containerization | Docker, docker-compose |
| Logging | Loguru |
- Basic ingestion pipeline
- Hybrid retrieval
- LLM generation with citations
- Streamlit UI
- Docker setup
- HyDE query transformation
- Advanced reranking
- Conversational memory
- Performance benchmarking
- Multi-tenancy support
- API endpoints (FastAPI)
- Admin dashboard
- Usage analytics
- Cost optimization
- Multi-modal support (code screenshots)
- Custom fine-tuned embeddings
- Graph RAG integration
- Real-time doc updates
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
- Fork the repository
- Create feature branch (`git checkout -b feature/amazing-feature`)
- Commit changes (`git commit -m 'feat(retrieval): add amazing feature'`)
- Push to branch (`git push origin feature/amazing-feature`)
- Open Pull Request
This project is licensed under the MIT License - see LICENSE file for details.
- LangChain for RAG framework
- ChromaDB for vector storage
- OpenAI for LLM and embeddings
- Cohere for reranking API
- Streamlit for rapid UI development
Francisco Román Peña de la Rosa
- LinkedIn: franciscopena76165796
- GitHub: @RomanRosa
- Email: roman_de_la_rosa@hotmail.com
If this project helped you, please consider giving it a ⭐!

Made with ❤️ by Roman de la Rosa