NEXUS

Multimodal RAG investigation platform for legal document intelligence. Ingests, analyzes, and queries 50,000+ pages of mixed-format legal documents — surfacing people, relationships, timelines, and patterns across a heterogeneous corpus.

What NEXUS Does

Ingest anything — PDF, DOCX, XLSX, PPTX, HTML, EML, MSG, RTF, CSV, TXT, images, ZIP archives, EDRM load files
Ask questions, get cited answers — 6 autonomous LangGraph agents investigate your corpus with cited, verifiable responses
Case intelligence — extract claims, parties, defined terms, and timelines from anchor documents
Investigate relationships — knowledge graph with entity resolution, email threading, communication analytics, topic clustering
Export for court — annotations, production sets, Bates numbering, redaction, EDRM-compatible output
Full auditability — SOC 2-ready audit trail for every API call and LLM invocation
Run anywhere — 4 LLM providers, 6 embedding providers, CPU or GPU, full local deployment with zero cloud dependency

Tech Stack

Component	Technology
API	FastAPI
Task Queue	Celery + RabbitMQ (Redis fallback)
Object Storage	MinIO (S3-compatible)
Metadata DB	PostgreSQL 16
Vector DB	Qdrant
Knowledge Graph	Neo4j
LLM	Claude Sonnet 4.5 (Anthropic), OpenAI, vLLM, Ollama
Embeddings	Multi-provider (OpenAI, Ollama, local, Gemini, TEI)
NER	GLiNER (zero-shot, CPU)
Doc Parsing	Docling (PDF, DOCX, XLSX, PPTX, HTML, images) + stdlib (EML, MSG, RTF, CSV, TXT)
Query Orchestration	LangGraph (agentic state graph)
Structured Output	Instructor
Frontend	React 19 + Vite + TanStack Router + shadcn/ui

Prerequisites

Python 3.12+
Node.js 20+
Docker (for infrastructure services)
uv (Python package manager)
Anthropic API key
OpenAI API key (for embeddings — not needed if using Ollama or local)
Ollama (optional — for local LLM inference)

Quick Start

git clone <repo-url> && cd NEXUS
cp .env.example .env   # edit API keys
make install            # Python + frontend deps
make dev                # starts everything in one terminal

API docs available at http://localhost:8000/docs

LLM Providers

NEXUS supports 4 LLM providers. Set LLM_PROVIDER and the corresponding env vars in .env:

Provider	`LLM_PROVIDER`	Requires
Anthropic	`anthropic`	`ANTHROPIC_API_KEY`
OpenAI	`openai`	`OPENAI_API_KEY`
vLLM	`vllm`	`VLLM_BASE_URL`
Ollama	`ollama`	`OLLAMA_BASE_URL`

vLLM and Ollama both expose OpenAI-compatible APIs, so switching is a config change — no code changes needed. See .env.local.example for a full local deployment config.

Embedding Providers

NEXUS supports 6 embedding providers. Set EMBEDDING_PROVIDER in .env:

Provider	`EMBEDDING_PROVIDER`	Default Model	Dimensions	Requires
OpenAI	`openai`	`text-embedding-3-large`	1024	`OPENAI_API_KEY`
Ollama	`ollama`	`nomic-embed-text`	768	`OLLAMA_BASE_URL` + model pulled
Local	`local`	`BAAI/bge-large-en-v1.5`	1024	— (auto-downloads)
Gemini	`gemini`	`gemini-embedding-exp-03-07`	1024	`GEMINI_API_KEY`
TEI	`tei`	Any HuggingFace model	varies	`TEI_EMBEDDING_URL`
BGE-M3	`bgem3`	`BAAI/bge-m3`	1024	— (auto-downloads, dense+sparse in one pass)

Ollama and Local providers run entirely on-device — no data leaves the machine. Set EMBEDDING_DIMENSIONS to match your model's output size. Changing dimensions requires deleting the Qdrant nexus_text collection (auto-recreated on startup).

Makefile Targets

Target	What it does
`make help`	Show all targets (default)
`make install`	Create venv, install Python + frontend deps
`make up`	Start Docker infrastructure services
`make down`	Stop Docker infrastructure services
`make dev`	Start everything: infra + API + worker + frontend (one terminal)
`make api`	Start API server with auto-reload
`make worker`	Start Celery worker with auto-reload on `.py` changes
`make frontend`	Start React frontend dev server
`make test`	Run test suite
`make migrate`	Run database migrations (Alembic)
`make logs`	Tail Docker service logs (filter with `SERVICES=redis,postgres`)

Running Tests

make test

Tests mock all external services — no running infrastructure required.

Load Testing

Locust-based load tests simulate concurrent attorney workflows against the NEXUS API.

pip install -r load_tests/requirements.txt

# Set required env vars
export NEXUS_TEST_MATTER_ID=<your-matter-uuid>
export NEXUS_TEST_EMAIL=admin@nexus.local
export NEXUS_TEST_PASSWORD=changeme

# Web UI (interactive dashboard at http://localhost:8089)
locust -f load_tests/locustfile.py --host http://localhost:8000

# Headless (10 users, 60s)
locust -f load_tests/locustfile.py --headless -u 10 -r 2 --run-time 60s

See load_tests/README.md for full configuration, task weights, and CI integration.

Project Structure

nexus/
├── app/
│   ├── main.py                 # FastAPI app factory + lifespan
│   ├── config.py               # Pydantic Settings (all config from env)
│   ├── dependencies.py         # DI: 20 cached factory functions
│   ├── analysis/               # Sentiment scoring, hot doc detection, completeness
│   ├── analytics/              # Communication matrices, network centrality, org hierarchy
│   ├── annotations/            # Document annotations (notes, highlights, tags)
│   ├── audit/                  # Audit log API endpoints
│   ├── auth/                   # JWT auth, RBAC, matter scoping, admin
│   ├── cases/                  # Case Setup Agent, context resolver, claims/parties
│   ├── common/                 # Shared: middleware, storage, LLM client, embedder, vector store
│   ├── datasets/               # Dataset/collection management, folder tree
│   ├── documents/              # Document metadata, preview, download
│   ├── edrm/                   # EDRM load file import/export
│   ├── entities/               # NER, entity resolution agent, knowledge graph
│   ├── evaluation/             # RAG evaluation datasets and runs
│   ├── exports/                # Production sets, Bates numbering, export jobs
│   ├── ingestion/              # Upload → parse → chunk → embed → extract → index
│   ├── query/                  # LangGraph agentic query pipeline + 12 tools + chat
│   └── redaction/              # PII detection, redaction engine
├── workers/
│   └── celery_app.py           # Celery config + task autodiscovery
├── migrations/                 # Alembic migrations (13 revisions)
├── frontend/                   # React 19 + Vite + TanStack Router
├── tests/                      # 512 backend tests (mirrors app/ structure)
├── scripts/                    # cloud-deploy.sh, seed_admin.py, import_dataset.py, evaluate.py
├── docs/                       # modules.md, database-schema.md, feature-flags.md, agents.md, ...
├── docker-compose.yml          # Infrastructure services (dev) + RabbitMQ
├── docker-compose.prod.yml     # Full containerized stack
├── docker-compose.cloud.yml    # Cloud overlay (Caddy + TLS)
├── docker-compose.gpu.yml      # GPU overlay (NVIDIA passthrough for Ollama + TEI)
├── docker-compose.local.yml    # Local LLM stack (vLLM/Ollama, TEI)
├── Caddyfile                   # Reverse proxy config
├── Makefile                    # Dev workflow targets
├── Procfile.dev                # Process definitions for `make dev`
└── pyproject.toml

API Endpoints

Start the server and visit http://localhost:8000/docs for the full interactive OpenAPI documentation.

Key endpoint groups:

/api/v1/auth — Authentication and user management
/api/v1/ingest — Upload and process documents (single, batch, webhook)
/api/v1/query — Query the corpus (sync and SSE streaming)
/api/v1/chats — Chat thread management
/api/v1/documents — Browse, preview, and download documents
/api/v1/entities — Entity search and knowledge graph exploration
/api/v1/cases — Case setup and context management
/api/v1/analytics — Communication matrices, network centrality
/api/v1/annotations — Document annotations (notes, highlights, tags)
/api/v1/exports — Production sets, Bates numbering, export jobs
/api/v1/edrm — EDRM load file import
/api/v1/redaction — PII detection and document redaction
/api/v1/datasets — Dataset/collection management
/api/v1/evaluation — RAG evaluation datasets and runs
/api/v1/admin — User management, audit log (admin-only)
/api/v1/health — Service health check

Feature Flags

Optional capabilities controlled via environment variables. See docs/feature-flags.md for full details.

Flag	Default	Description
`ENABLE_AGENTIC_PIPELINE`	`true`	Agentic LangGraph query pipeline (vs. v1 linear chain)
`ENABLE_CITATION_VERIFICATION`	`true`	CoVe citation verification in query synthesis
`ENABLE_EMAIL_THREADING`	`true`	Email conversation thread reconstruction
`ENABLE_AI_AUDIT_LOGGING`	`true`	Log LLM calls to ai_audit_log table
`ENABLE_SPARSE_EMBEDDINGS`	`false`	BM42 sparse vectors for hybrid retrieval
`ENABLE_RERANKER`	`false`	Cross-encoder reranking of retrieval results
`ENABLE_VISUAL_EMBEDDINGS`	`false`	ColQwen2.5 visual embedding and reranking
`ENABLE_RELATIONSHIP_EXTRACTION`	`false`	LLM-based relationship extraction
`ENABLE_NEAR_DUPLICATE_DETECTION`	`false`	MinHash near-duplicate and version detection
`ENABLE_HOT_DOC_DETECTION`	`false`	Sentiment scoring and hot doc flagging
`ENABLE_TOPIC_CLUSTERING`	`false`	BERTopic topic clustering
`ENABLE_CASE_SETUP_AGENT`	`false`	Case context pre-population
`ENABLE_COREFERENCE_RESOLUTION`	`false`	spaCy + coreferee pronoun resolution
`ENABLE_GRAPH_CENTRALITY`	`false`	Neo4j GDS centrality metrics
`ENABLE_BATCH_EMBEDDINGS`	`false`	Async batch embedding API (stub)
`ENABLE_REDACTION`	`false`	PII detection and document redaction
`ENABLE_PROMETHEUS_METRICS`	`false`	Prometheus `/metrics` endpoint + custom business metrics
`ENABLE_SSO`	`false`	OIDC/OAuth2 SSO authentication
`ENABLE_MEMO_DRAFTING`	`false`	LLM-powered legal memo generation from investigation results

Deployment

CPU (default)

docker compose -f docker-compose.yml -f docker-compose.prod.yml -f docker-compose.cloud.yml up -d

GPU (NVIDIA T4/L4/A100)

Add the GPU overlay for accelerated embeddings and reranking (~20x faster):

docker compose -f docker-compose.yml -f docker-compose.prod.yml \
  -f docker-compose.cloud.yml -f docker-compose.gpu.yml up -d

Requires NVIDIA drivers + NVIDIA Container Toolkit. The GPU overlay adds NVIDIA passthrough to Ollama and an optional TEI embedding server. See docs/CLOUD-DEPLOY.md for full GPU VM provisioning guide.

Celery Broker

By default, Celery uses Redis as its message broker. For production, set CELERY_BROKER_URL to use RabbitMQ (included in docker-compose.yml):

CELERY_BROKER_URL=amqp://nexus:nexus@rabbitmq:5672/nexus

Leave CELERY_BROKER_URL empty to fall back to Redis. See docs/celery-scaling.md for worker scaling guide.

Name		Name	Last commit message	Last commit date
Latest commit History 479 Commits
.claude		.claude
.github/workflows		.github/workflows
app		app
deploy		deploy
docs		docs
evaluation		evaluation
frontend		frontend
helm/nexus		helm/nexus
load_tests		load_tests
migrations		migrations
monitoring		monitoring
reports		reports
scripts		scripts
tests		tests
workers		workers
.env.example		.env.example
.env.gpu.example		.env.gpu.example
.env.local.example		.env.local.example
.gitignore		.gitignore
.gitleaksignore		.gitleaksignore
.pre-commit-config.yaml		.pre-commit-config.yaml
ARCHITECTURE.md		ARCHITECTURE.md
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
Caddyfile		Caddyfile
Dockerfile		Dockerfile
Makefile		Makefile
Procfile.dev		Procfile.dev
README.md		README.md
ROADMAP-UPDATE-PROMPT.md		ROADMAP-UPDATE-PROMPT.md
ROADMAP.md		ROADMAP.md
TEST-REPORT-2026-03-20.md		TEST-REPORT-2026-03-20.md
alembic.ini		alembic.ini
architecture-after-fix.png		architecture-after-fix.png
architecture-current.png		architecture-current.png
architecture-loading.png		architecture-loading.png
docker-compose.cloud.yml		docker-compose.cloud.yml
docker-compose.gpu.yml		docker-compose.gpu.yml
docker-compose.ingest.yml		docker-compose.ingest.yml
docker-compose.local.yml		docker-compose.local.yml
docker-compose.prod.yml		docker-compose.prod.yml
docker-compose.yml		docker-compose.yml
package-lock.json		package-lock.json
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NEXUS

What NEXUS Does

Tech Stack

Prerequisites

Quick Start

LLM Providers

Embedding Providers

Makefile Targets

Running Tests

Load Testing

Project Structure

API Endpoints

Feature Flags

Deployment

CPU (default)

GPU (NVIDIA T4/L4/A100)

Celery Broker

See Also

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

NEXUS

What NEXUS Does

Tech Stack

Prerequisites

Quick Start

LLM Providers

Embedding Providers

Makefile Targets

Running Tests

Load Testing

Project Structure

API Endpoints

Feature Flags

Deployment

CPU (default)

GPU (NVIDIA T4/L4/A100)

Celery Broker

See Also

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages