RAGU (Retrieval-Augmented Generation Universal) - A modern, local RAG application with a beautiful web interface
Quick Start • Features • Documentation • API Reference • Task Tracking
RAGU (Retrieval-Augmented Generation Universal) is a powerful, privacy-focused documentation search and query platform that enables semantic search across your documentation using local AI models. With a modern Angular web interface, you can upload documents, import from Confluence, query your knowledge base, and manage collections, all while keeping your data completely local.
- Modern Web UI - Beautiful, intuitive interface built with Angular
- Document Upload - Support for PDF, HTML, TXT, Markdown, and more
- Confluence Integration - Import pages directly from Confluence
- Semantic Search - Natural language queries with context-aware responses
- Version Management - Track and query multiple documentation versions
- Privacy First - All processing happens locally on your machine
- Fast & Efficient - Query caching and optimized retrieval
Chat Playground - Interactive chat interface for querying your documentation with natural language
Admin Dashboard - System overview and quick actions
System Monitoring - Query analytics and performance metrics
Document Upload & Import
- Upload multiple file formats (PDF, HTML, TXT, Markdown)
- Import Confluence pages via page ID or URL
- Web Scraping: Crawl and embed entire documentation sites
- Batch processing for multiple files
- Incremental updates without data loss
Intelligent Querying
- Natural language question answering
- Multi-version querying across documentation versions
- Version comparison for tracking changes
- Query history and favorites management
- Source citations with document references
Management & Monitoring
- Collection management with version tracking
- Query analytics and statistics
- Performance monitoring
- Export query history (JSON/CSV)
Configuration & Integration
- Multiple LLM provider support (Ollama, OpenAI, Anthropic, Azure, Google, OpenRouter)
- Configurable embedding providers (Ollama, OpenRouter, OpenAI, Azure, Google)
- OpenRouter support for both LLM and embeddings (OpenAI-compatible API)
- Confluence integration settings
- System settings management
- Optional API authentication
Modern UI/UX
- Clean, responsive design
- Intuitive navigation
- Real-time feedback
- Helpful tooltips and icons
- Error handling with clear messages
Pages
- Chat Playground - Interactive chat interface for querying documentation
- History - View and rerun previous queries
- Dashboard - System overview and quick actions
- Upload & Import - Document upload and Confluence import
- Collections - Manage your document collections
- Monitoring - System statistics and analytics
- Settings - Configure LLM providers, Confluence, and system settings
Before you begin, ensure you have:
- Python 3.8+ installed
- Node.js 18+ and npm (for web UI)
- Ollama installed and running
- Minimum 8GB RAM (16GB+ recommended)
- 10GB+ free disk space for models and vector database
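The Python requirement above can be verified programmatically before setup; a tiny sketch using only the standard library:

```python
# Verify the interpreter meets the Python 3.8+ prerequisite.
import sys

REQUIRED = (3, 8)  # minimum version from the prerequisites above

def python_ok(required=REQUIRED) -> bool:
    return sys.version_info[:2] >= required

if not python_ok():
    raise SystemExit("Python %d.%d or newer is required" % REQUIRED)
```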
```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Verify installation
ollama --version
```

```bash
# Download LLM for generation (~4GB)
ollama pull mistral

# Download embedding model (lightweight)
ollama pull nomic-embed-text

# Verify models
ollama list
```

```bash
# Navigate to project directory
cd ragu

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install Python dependencies
pip install --upgrade pip
pip install -r requirements.txt
```

```bash
# Navigate to web UI directory
cd web-ui

# Install Node.js dependencies
npm install

# Build the application
npm run build

# Or run in development mode
npm start
```

```bash
# Copy example environment file
cp .env.example .env

# Edit .env with your configuration (optional - defaults work for most cases)
```

Option A: Docker Compose (Recommended)

```bash
# Development mode (with hot reload)
./scripts/docker-start.sh dev

# Production mode (detached)
./scripts/docker-start.sh prod

# With containerized Ollama
./scripts/docker-start.sh dev container
```

Option B: Start Backend Only (API)

```bash
# Using helper script (automatically uses Gunicorn in production)
./scripts/start-rag-server.sh

# Or manually with Gunicorn (recommended for production)
gunicorn -c gunicorn_config.py "src.app:app"

# Or manually with Flask development server (development only)
python3 -c "from src.app import app; app.run(host='localhost', port=8080)"
```

Option C: Start with Web UI

```bash
# Terminal 1: Start backend API
./scripts/start-rag-server.sh

# Terminal 2: Start web UI (development)
cd web-ui
npm start
```

Service URLs:
- Docker: Frontend Dev at http://localhost:4200, Frontend Prod at http://localhost:80, Backend API at http://localhost:8080
- Local: API at http://localhost:8080, Web UI at http://localhost:4200 (development)

- Access the Web UI: Open http://localhost:4200 in your browser
- Upload Documents: Navigate to "Upload & Import" → "Upload Documents" tab
- Import from Confluence: Use "Confluence Import" tab (configure Confluence settings first)
- Query Documentation: Go to "Query" page and ask questions
- Manage Collections: View and manage collections in "Collections" page
```bash
curl http://localhost:8080/health
```

```bash
curl -X POST http://localhost:8080/embed \
  -F "file=@documentation.pdf" \
  -F "version=1.2.3"
```

```bash
curl -X POST http://localhost:8080/confluence/import \
  -H "Content-Type: application/json" \
  -d '{
    "page_id": "123456",
    "version": "1.2.3",
    "overwrite": false
  }'
```

```bash
curl -X POST http://localhost:8080/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How do I use the UserService class?",
    "version": "1.2.3",
    "k": 3
  }'
```

```bash
# Embed a file
python3 src/cli.py embed path/to/documentation.pdf --version 1.2.3

# Query documentation
python3 src/cli.py query "How does UserService work?" --version 1.2.3

# List collections
python3 src/cli.py list-collections

# Check system status
python3 src/cli.py status
```

```
ragu/
├── src/                      # Backend Python code
│   ├── app.py                # Flask API server
│   ├── cli.py                # Command-line interface
│   ├── embed.py              # Document embedding logic
│   ├── query.py              # Query processing
│   ├── get_vector_db.py      # Vector database management
│   ├── settings.py           # Settings management
│   ├── llm_providers.py      # LLM provider abstraction
│   ├── confluence.py         # Confluence integration
│   └── ...
├── web-ui/                   # Frontend Angular application
│   ├── src/
│   │   ├── app/
│   │   │   ├── features/     # Feature modules
│   │   │   │   ├── admin/    # Admin features (dashboard, upload, collections, etc.)
│   │   │   │   ├── query/    # Query interface
│   │   │   │   └── auth/     # Authentication
│   │   │   ├── core/         # Core services and state
│   │   │   ├── layout/       # Layout components
│   │   │   └── shared/       # Shared components
│   │   └── ...
│   └── ...
├── scripts/                  # Utility scripts
│   ├── start-rag-server.sh   # Server startup
│   ├── embed-commonmodel-docs.sh  # Maven integration
│   └── ...
├── docs/                     # Documentation
│   ├── API_REFERENCE.md      # Complete API documentation
│   └── DEVELOPER_GUIDE.md    # Developer guide
├── tests/                    # Test files
├── chroma/                   # ChromaDB persistence
├── requirements.txt          # Python dependencies
├── .env.example              # Environment configuration example
└── README.md                 # This file
```
Key configuration options in .env:
```bash
# Vector Database
CHROMA_PATH=./chroma
COLLECTION_NAME=common-model-docs

# Models
LLM_MODEL=mistral
TEXT_EMBEDDING_MODEL=nomic-embed-text

# API Server
API_PORT=8080
API_HOST=localhost
FLASK_DEBUG=False

# Authentication (Optional)
AUTH_ENABLED=false
AUTH_REQUIRED_FOR=write  # Options: 'all', 'write', 'none'
API_KEY=
API_KEY_HEADER=X-API-Key

# Session Security
SECRET_KEY=your-secret-key-here
SESSION_SECURE=false  # Set to true for HTTPS
```

The web UI connects to the backend API. Configure the API URL in web-ui/src/environments/environment.ts:

```typescript
export const environment = {
  apiUrl: 'http://localhost:8080'
};
```

Configure Confluence settings via the web UI (Settings → Confluence Integration) or via API:

```bash
curl -X POST http://localhost:8080/settings/confluence \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-domain.atlassian.net",
    "username": "your-email@example.com",
    "api_token": "your-api-token"
  }'
```

- Path Traversal Protection - File paths are sanitized and validated
- Input Validation - All API inputs are validated before processing
- Error Handling - Comprehensive error handling with appropriate HTTP status codes
- Local Processing - All data stays on your machine
- API Authentication (Optional) - API key-based authentication for production use
- Write Protection - Configurable authentication for write operations only
- Session Security - Secure session management for web UI
- Quick Start Guide - Get up and running in 5 minutes
- API Reference - Complete API endpoint documentation
- Developer Guide - Architecture and extension guide
- Changelog - Version history and changes
- `POST /embed` - Embed a single file
- `POST /embed-batch` - Embed multiple files
- `POST /confluence/import` - Import Confluence page
- `POST /query` - Query documentation
- `POST /query/multi-version` - Query across multiple versions
- `GET /collections` - List all collections
- `GET /stats` - System statistics
- `GET /history` - Query history
For complete API documentation, see API_REFERENCE.md.
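The API can also be called programmatically. A minimal client sketch for the `/query` endpoint using only the Python standard library; the payload fields (query, version, k) mirror the curl example earlier in this README, and the optional `api_key` argument applies only when authentication is enabled in `.env` (error handling is elided):

```python
# Minimal sketch of a /query client; stdlib only, no external dependencies.
import json
import urllib.request
from typing import Optional

def build_query_request(question: str, version: str, k: int = 3,
                        api_key: Optional[str] = None,
                        base_url: str = "http://localhost:8080") -> urllib.request.Request:
    payload = json.dumps({"query": question, "version": version, "k": k}).encode("utf-8")
    req = urllib.request.Request(f"{base_url}/query", data=payload, method="POST")
    req.add_header("Content-Type", "application/json")
    if api_key:
        # Header name matches the API_KEY_HEADER default (X-API-Key)
        req.add_header("X-API-Key", api_key)
    return req

# To send (backend must be running):
# with urllib.request.urlopen(build_query_request("How does UserService work?", "1.2.3")) as resp:
#     print(json.load(resp))
```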
```bash
# Run unit tests
./scripts/run-tests.sh

# Run all tests including integration tests (requires Ollama)
RUN_INTEGRATION_TESTS=1 ./scripts/run-tests.sh --integration

# Or use pytest directly
pytest tests/                 # Run all tests
pytest tests/ -m unit         # Run only unit tests
pytest tests/ -m integration  # Run only integration tests
pytest tests/ --cov=src       # With coverage report
```

The test coverage report is generated in htmlcov/index.html after running tests with coverage.
Ollama not found
- Ensure Ollama is installed and in your PATH
- Check that `ollama serve` is running
- Verify with `ollama list`

Models not available
- Run `ollama pull mistral` and `ollama pull nomic-embed-text`
- Verify with `ollama list`

Import errors
- Ensure the virtual environment is activated
- Run `pip install -r requirements.txt`
- Check Python version: `python3 --version` (requires 3.8+)

Port already in use
- Change `API_PORT` in the `.env` file
- Or stop the process using port 8080

Web UI not connecting to backend
- Verify the backend is running on the configured port
- Check CORS settings if accessing from a different origin
- Verify the API URL in the environment configuration

Confluence import fails
- Verify Confluence settings are configured correctly
- Check that `confluence-markdown-exporter` is installed: `pip install confluence-markdown-exporter==1.0.4`
- Ensure the API token has read permissions for the page
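For the connectivity issues above, a quick programmatic probe of the `/health` endpoint can rule the backend in or out; a sketch assuming the default port, stdlib only:

```python
# Probe the /health endpoint; returns False on any connection problem.
import urllib.error
import urllib.request

def backend_reachable(url: str = "http://localhost:8080/health", timeout: float = 2.0) -> bool:
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False
```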
Backend:

```bash
# Activate virtual environment
source venv/bin/activate

# Run with auto-reload
FLASK_DEBUG=True python3 -c "from src.app import app; app.run(host='localhost', port=8080, debug=True)"
```

Frontend:

```bash
cd web-ui
npm start
# Access at http://localhost:4200
```

Backend:
```bash
# No build step needed - Python runs directly.
# Use a production WSGI server (Gunicorn is included in requirements.txt):

# Using configuration file (recommended)
gunicorn -c gunicorn_config.py "src.app:app"

# Or with inline configuration
gunicorn -w 4 -b 0.0.0.0:8080 --timeout 30 "src.app:app"
```

Frontend:

```bash
cd web-ui
npm run build

# Output in web-ui/dist/
```

See LICENSE file for details.
- Based on the guide: Build Your Own RAG App
- Uses Ollama for local LLM
- Uses ChromaDB for vector storage
- Uses LangChain for RAG orchestration
- Uses Angular for the web interface
Made with ❤️ for developers who value privacy and local processing

Report Bug • Request Feature • Documentation


