Novus Enterprise RAG System

A production-grade, multi-tenant Retrieval-Augmented Generation (RAG) platform built on a microservices architecture. The system integrates:

Weaviate for scalable vector storage and similarity search.
ArangoDB for managing and querying knowledge graphs, enabling rich semantic relationships and context-aware retrieval.
Agentic orchestration for coordinating retrieval, reasoning, and response generation across services and tenants.

The architecture ensures isolation and scalability for multiple tenants, supports domain-specific customization, and is designed for high availability, observability, and extensibility in enterprise environments.

📚 Documentation

Video Demo

Watch Novus System Demo on YouTube - Complete walkthrough of the Novus Enterprise RAG System features and capabilities

Slide Deck & Presentations

Novus RAG Factory Presentation - Comprehensive slide deck covering the Novus RAG system architecture, features, and implementation details

Technical Documentation

Novus System Documentation (PDF) - Complete technical documentation for the Novus Enterprise RAG System

Additional Resources

Services Documentation - Detailed documentation for all microservices
System Scripts Guide - Complete guide for system management scripts
Design Document - Architectural design and technical specifications
Monitoring Guide - Observability and monitoring setup

🏗️ Novus Architecture Overview

(Diagram: Novus RAG architecture overview)

✨ Key Features

🔄 Microservices Architecture

Event-driven communication via Kafka
Independent scaling for each service
Fault isolation and resilience
Blue-green deployments with zero downtime

📄 Advanced PDF Processing

Streaming processing for large files (>1GB)
Semantic chunking with hierarchical structure
Table extraction and OCR support
Content classification and entity extraction

🧠 Intelligent Embeddings

Multi-model support (OpenAI + local models)
GC-PROD meta embeddings for graph-aware retrieval
Progressive distillation framework
Automatic fallback and quality validation

🗃️ Dynamic Collection Management

Version-aware routing with atomic switching
Dynamic Weaviate collections based on content type
Blue-green indexing for updates
Automatic cleanup and retention policies

🔍 Multi-Modal Retrieval

Vector search (dense embeddings)
Sparse retrieval (BM25/SPLADE)
Knowledge graph traversal
Cross-encoder reranking
LLM synthesis with provenance

📊 Production-Ready Observability

Prometheus metrics for all services, including per-service resource monitoring
Pre-built Grafana production dashboards for visualization
Distributed tracing with Jaeger and OpenTelemetry integration
Centralized logging with Loki and trace correlation
cAdvisor for container-level infrastructure metrics
Pushgateway for Kafka worker metrics
Comprehensive alerting with 50+ production-ready alert rules
Real-time monitoring of CPU, memory, disk I/O, and network I/O per service
Query stage breakdown for performance optimization (vector DB, embedding, LLM)
Health checks and automated incident detection

🚀 Quick Start

Prerequisites

Docker & Docker Compose
16GB+ RAM recommended (8GB minimum)
50GB free disk space for data and metrics storage
OpenAI API key (optional, for best performance)

1. Clone and Setup

git clone <repository-url>
cd Novus
export OPENAI_API_KEY="your-openai-api-key-here"
# If you use an enterprise OpenAI API key, also set OPENAI_API_URL
export OPENAI_API_URL="your-openai-api-url-here"

2. Create Data Directories

Create persistent storage directories for all services:

# Create all data directories for services and monitoring
mkdir -p data/{weaviate,arangodb,arangodb-apps,opensearch,redis,mongo,minio,upload-temp,prometheus,grafana,loki,pushgateway}

# Make the directories writable by the service containers (world-writable; tighten for production)
chmod -R 777 data/

Data Directory Structure:

data/weaviate - Vector database storage
data/arangodb - Knowledge graph database
data/arangodb-apps - ArangoDB applications
data/opensearch - Full-text search index
data/redis - Cache and session data
data/mongo - User data and metadata
data/minio - Object storage for PDFs
data/upload-temp - Temporary upload files
data/prometheus - Metrics time-series data
data/grafana - Dashboards and visualizations
data/loki - Log aggregation storage
data/pushgateway - Kafka worker metrics
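The brace expansion in the mkdir command above is a bash/zsh feature and does not work in every shell. As a portable alternative, the same directories could be created from Python. This is a minimal sketch: the directory names are copied from the list above, while the function name `create_data_dirs` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Directory names copied from the mkdir command in this section.
DATA_DIRS = [
    "weaviate", "arangodb", "arangodb-apps", "opensearch", "redis", "mongo",
    "minio", "upload-temp", "prometheus", "grafana", "loki", "pushgateway",
]


def create_data_dirs(root: Path) -> list:
    """Create <root>/data/<name> for every entry; safe to run repeatedly."""
    created = []
    for name in DATA_DIRS:
        path = root / "data" / name
        # parents=True creates data/ itself; exist_ok=True makes re-runs a no-op.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_data_dirs(Path.cwd())` from the repository root would mirror the mkdir step. Permissions are left at the platform default, so the chmod step may still be needed.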

3. Start the System

Quick Start

# Start with existing images
./start-system.sh

# Or rebuild and start (first time or after code changes)
./start-system.sh --build

# Clean rebuild (fresh start, removes all data)
./start-system.sh --clean --build

Start Options

--build - Force rebuild all Docker images
--clean - Clean all volumes and data (WARNING: destructive)
--no-monitoring - Skip monitoring stack (Prometheus, Grafana, etc.)
--no-chat - Skip chat UI
--dev - Development mode (verbose logging)
--help - Show help message

What Gets Started

The script starts services in dependency order:

Infrastructure (started first):

Zookeeper & Kafka (Message Queue)
Redis (Cache)
MongoDB (Auth & Metadata)
MinIO (Object Storage)
Weaviate (Vector Database)
ArangoDB (Graph Database)
OpenSearch (Search Engine)

Core Services:

API Gateway (Port 8000)
Upload Service (Port 8001)
Query Service (Port 8002)
Collection Registry (Port 8003)
PDF Processing Service (Port 8004)
Embedding Service
Weaviate Manager
ArangoDB Updater
Memory Service (Port 8005)

Monitoring (optional):

Prometheus (Port 9090)
Grafana (Port 3000)
Jaeger (Port 16686)
cAdvisor, Pushgateway, Loki, Promtail

UI (optional):

Novus Chat (Port 4200)

4. System Management

Check System Status

# Quick status overview
./system-status.sh

# Detailed health checks
./system-status.sh --health

# View resource usage
./system-status.sh --metrics

# View logs for a specific service
./system-status.sh --logs api-gateway

Stop the System

# Graceful shutdown
./stop-system.sh

# Force stop
./stop-system.sh --force

# Stop and remove all data
./stop-system.sh --clean

Scale PDF Workers

# Scale to 3 workers for faster processing
./scale-pdf-workers.sh 3

# Scale back to 1 worker
./scale-pdf-workers.sh 1

Common Operations

# Restart a specific service
docker-compose restart api-gateway

# View logs for all services
docker-compose logs -f

# View logs for specific service
docker-compose logs -f embedding-service

# Check running containers
docker-compose ps

# Open a shell in a running service container
docker-compose exec api-gateway bash

5. Access the System

Chat UI: http://localhost:4200
API Gateway: http://localhost:8000
Grafana Monitoring: http://localhost:3000 (default login: admin/admin)
Prometheus Metrics: http://localhost:9090
Jaeger Tracing: http://localhost:16686
Upload Service: http://localhost:8001

6. Upload Documents

curl -X POST "http://localhost:8001/upload" \
  -F "file=@your-document.pdf" \
  -F "tenant_id=demo" \
  -F "topic=Device" \
  -F "device=TechCorp8000"
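For scripted uploads, the same multipart request can be assembled with the Python standard library. The sketch below only builds the request body. The field names come from the curl example and the Upload API section, while the helper name `build_upload_body` is hypothetical.

```python
import uuid
from typing import Optional, Tuple


def build_upload_body(filename: str, pdf_bytes: bytes, tenant_id: str,
                      topic: Optional[str] = None, device: Optional[str] = None,
                      version: Optional[str] = None) -> Tuple[bytes, str]:
    """Encode the upload form as multipart/form-data.

    Returns (body, value for the Content-Type header).
    """
    boundary = uuid.uuid4().hex
    fields = {"tenant_id": tenant_id, "topic": topic,
              "device": device, "version": version}
    parts = []
    for name, value in fields.items():
        if value is None:
            continue  # optional fields are omitted from the form
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f'\r\n\r\n{value}\r\n'.encode("utf-8")
        )
    # The file part carries the raw PDF bytes after its own headers.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: application/pdf\r\n\r\n'.encode("utf-8")
        + pdf_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

To send it, pass the body and header to `urllib.request.Request("http://localhost:8001/upload", data=body, headers={"Content-Type": content_type}, method="POST")` and open the request with `urllib.request.urlopen`.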

🔧 Configuration

Environment Variables

# OpenAI Configuration
export OPENAI_API_KEY="sk-..."
export OPENAI_MODEL="gpt-4o-mini"
export EMBEDDER="text-embedding-3-large"

# Service URLs (auto-configured in Docker)
export WEAVIATE_URL="http://weaviate:8080"
export ARANGO_URL="http://arangodb:8529"
export KAFKA_BOOTSTRAP_SERVERS="kafka:9092"
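One way a service might read these variables is sketched below. The variable names and fallback values are taken from the example above. The `Settings` class and `load_settings` function are hypothetical, and the actual per-service `config.py` files may be organized differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: Optional[str]
    openai_model: str
    embedder: str
    weaviate_url: str
    arango_url: str
    kafka_bootstrap_servers: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read configuration from the environment, falling back to the
    example values shown in this section (assumed defaults)."""
    env = os.environ if env is None else env
    return Settings(
        # The key is optional per the prerequisites, so None is allowed.
        openai_api_key=env.get("OPENAI_API_KEY"),
        openai_model=env.get("OPENAI_MODEL", "gpt-4o-mini"),
        embedder=env.get("EMBEDDER", "text-embedding-3-large"),
        weaviate_url=env.get("WEAVIATE_URL", "http://weaviate:8080"),
        arango_url=env.get("ARANGO_URL", "http://arangodb:8529"),
        kafka_bootstrap_servers=env.get("KAFKA_BOOTSTRAP_SERVERS", "kafka:9092"),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test.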

Service Configuration

Each service has its own configuration file:

Upload Service: services/upload-service/app/config.py
PDF Processing: services/pdf-processing-service/app/config.py
Embedding Service: services/embedding-service/app/config.py
The other services follow the same pattern.

📈 Monitoring & Observability

Enhanced Observability Stack

The Novus system includes a comprehensive three-pillar observability solution:

Metrics (Prometheus) - Quantitative system measurements
Traces (Jaeger) - Request flow through distributed services
Logs (Loki) - Detailed event records with trace correlation

Grafana Dashboards

Access Grafana at http://localhost:3000 (default login: admin/admin):

Production Overview: System-wide health, request rates, error rates, and latency
Query Service Performance: Detailed query stage breakdown (vector DB, embedding, LLM)
Ingestion Pipeline: Document processing throughput and worker metrics
Resource Utilization: CPU, memory, disk I/O, and network I/O per service
Tracing Correlation: Link metrics, logs, and traces for root cause analysis

Prometheus Metrics

Access Prometheus at http://localhost:9090:

Per-service resource metrics (CPU, memory, I/O)
HTTP request metrics (rate, latency, errors)
Query stage-level metrics
Database operation metrics
Container metrics from cAdvisor
Kafka worker metrics via Pushgateway
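These metrics can also be read programmatically through the Prometheus HTTP API (`/api/v1/query`). The sketch below queries the built-in `up` metric and lists scrape targets that are currently down. The helper names are hypothetical, and the `instance` label values depend on how the scrape jobs are configured.

```python
import json
import urllib.parse
import urllib.request


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for a Prometheus instant query."""
    return (f"{base_url.rstrip('/')}/api/v1/query?"
            + urllib.parse.urlencode({"query": promql}))


def fetch(url: str, timeout: float = 5.0) -> dict:
    """GET a Prometheus API URL and decode the JSON response."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def down_targets(response: dict) -> list:
    """Return the `instance` label of every series whose sample value is 0."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error')}")
    return [
        series["metric"].get("instance", "<unknown>")
        for series in response["data"]["result"]
        # Each instant-vector sample is [timestamp, "value-as-string"].
        if float(series["value"][1]) == 0.0
    ]
```

With the stack running, `down_targets(fetch(instant_query_url("http://localhost:9090", "up")))` would return the instances Prometheus cannot scrape.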

Distributed Tracing

Access Jaeger at http://localhost:16686:

Trace complete request flows across all services
Identify performance bottlenecks
Correlate traces with metrics and logs via trace_id
Analyze query stage latencies in detail

Log Management

# View all service logs
docker-compose logs -f

# View specific service logs
docker-compose logs -f query-service

# Search logs with trace correlation in Loki
# Access via Grafana → Explore → Loki
# Query: {service_name="query-service"} |= "trace:"

# Follow processing pipeline
docker-compose logs -f pdf-processing-service embedding-service weaviate-manager

Alerting

More than 50 production-ready alert rules are configured in monitoring/alerting-rules.yml, covering:

Service health alerts (down, high error rate)
Resource alerts (CPU, memory, disk I/O)
Performance alerts (high latency, slow queries)
Pipeline alerts (backlog, processing failures)
Business metrics alerts (low throughput, high failure rate)

Monitoring Documentation

Quick Start: monitoring/README.md - Overview and setup
Observability Guide: monitoring/docs/OBSERVABILITY_GUIDE.md - Complete architecture and usage
Metrics Reference: monitoring/docs/METRICS_REFERENCE.md - Complete metrics catalog
Jaeger Tracing: monitoring/docs/JAEGER_GUIDE.md - Distributed tracing setup

Scaling Services

# Scale processing services for high load
docker-compose up -d --scale pdf-processing-service=3
docker-compose up -d --scale embedding-service=5

# Monitor resource utilization in Grafana to guide scaling decisions

🔍 API Documentation

Upload API

POST /upload
Content-Type: multipart/form-data

Parameters:
- file: PDF file
- tenant_id: Tenant identifier
- topic: Document topic (optional)
- device: Device type (optional)
- version: Version (optional)

Query API

POST /query
Content-Type: application/json

{
  "query": "How to configure BGP on TechCorp 8000?",
  "tenant_id": "demo",
  "filters": {
    "device": "TechCorp8000",
    "topic": "Configuration"
  }
}
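The request above can be built from Python as follows. This is a sketch: the README does not state which host serves `/query`, so the base URL is left as a parameter (the Query Service on port 8002 is one plausible choice), and the function name `build_query_request` is hypothetical.

```python
import json
import urllib.request
from typing import Optional


def build_query_request(base_url: str, query: str, tenant_id: str,
                        filters: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /query request with a JSON body."""
    payload = {"query": query, "tenant_id": tenant_id}
    if filters:
        payload["filters"] = filters  # e.g. {"device": ..., "topic": ...}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The returned request can be sent with `urllib.request.urlopen(request)`. The response schema is not documented here, so parsing is left to the caller.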

Collection Management

# List collections
GET /collections?tenant_id=demo

# Get collection info
GET /collections/{logical_name}

# Resolve collection
GET /resolve/{logical_name}
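To illustrate what resolving a logical name means under version-aware routing, here is a toy in-memory registry. It is not the Collection Registry's actual implementation: the class, its methods, and the `Manuals_v1`/`Manuals_v2` names are all hypothetical. It only shows the idea of switching a logical name to a new physical collection in one atomic step.

```python
import threading
from typing import Dict, Optional


class CollectionRegistry:
    """Toy registry mapping a logical name to its active physical collection."""

    def __init__(self) -> None:
        self._active: Dict[str, str] = {}
        self._lock = threading.Lock()

    def activate(self, logical_name: str, physical_name: str) -> Optional[str]:
        """Point logical_name at physical_name; return the previous target."""
        with self._lock:  # readers never observe a half-applied switch
            previous = self._active.get(logical_name)
            self._active[logical_name] = physical_name
            return previous

    def resolve(self, logical_name: str) -> str:
        """Return the physical collection currently serving logical_name."""
        with self._lock:
            if logical_name not in self._active:
                raise LookupError(f"no active collection for {logical_name!r}")
            return self._active[logical_name]


registry = CollectionRegistry()
registry.activate("manuals", "Manuals_v1")
# A new index is built alongside the old one, then swapped in.
retired = registry.activate("manuals", "Manuals_v2")
```

After the swap, `registry.resolve("manuals")` returns the new collection, and `retired` names the old one, which a cleanup job could delete later.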

🏗️ Development

Adding New Services

Create service directory in services/
Implement with FastAPI or async worker pattern
Add to docker-compose.yaml
Update monitoring configuration

Custom Embedding Models

Add model configuration to services/embedding-service/app/config.py
Implement model loader in embedding_generator.py
Update dimension mappings

Extending Knowledge Graph

Add entity extractors in arangodb-updater/
Define new node/edge types
Update graph traversal queries

🔒 Security & Compliance

Authentication

JWT-based authentication via API Gateway
Tenant isolation at all levels
Role-based access control (RBAC)

Data Privacy

Client-side embedding option for sensitive data
PII detection and redaction
Data residency controls

Audit & Compliance

Immutable audit logs for all operations
Provenance tracking for all answers
Right-to-be-forgotten support

📊 Performance Tuning

Vector Search Optimization

HNSW index tuning: ef_construction, ef_search, max_connections (named efConstruction, ef, and maxConnections in Weaviate's vector index configuration)
Batch size optimization for embeddings
Collection sharding strategies

Knowledge Graph Performance

ArangoDB SmartGraphs for large datasets
Query optimization with indexes
Caching frequent traversals

Processing Pipeline

Parallel PDF processing workers
Embedding batch size tuning
Memory management for large documents

Health Checks

# Check the health of the core HTTP services (API Gateway, Upload, Query, Collection Registry)
curl http://localhost:8000/health
curl http://localhost:8001/health
curl http://localhost:8002/health
curl http://localhost:8003/health
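The same checks can be scripted, for example to gate a deployment or a test run. The sketch below assumes each `/health` endpoint returns HTTP 200 when the service is healthy, which the README does not confirm. The dictionary keys are display labels only, and the function names are hypothetical.

```python
import http.client
import urllib.request
from typing import Callable, Dict

# URLs copied from the curl commands above; keys are display labels only.
SERVICES = {
    "api-gateway": "http://localhost:8000/health",
    "upload-service": "http://localhost:8001/health",
    "query-service": "http://localhost:8002/health",
    "collection-registry": "http://localhost:8003/health",
}


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and non-2xx responses.
        return False


def check_all(services: Dict[str, str] = SERVICES,
              probe: Callable[[str], bool] = http_ok) -> Dict[str, bool]:
    """Probe every service and return {label: healthy}."""
    return {name: probe(url) for name, url in services.items()}
```

`check_all()` returns a mapping such as `{"api-gateway": True, ...}`. The `probe` parameter lets tests substitute a fake check instead of making network calls.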

🙏 Acknowledgments

Built on top of excellent open-source projects:
- Weaviate - Vector database
- ArangoDB - Multi-model database
- LangChain - LLM framework
- FastAPI - Web framework
- Apache Kafka - Message streaming
