Local, private, provider-agnostic RAG engine.
Zero-config Docker setup. Bring your own LLM. Query your documents in seconds.
DockRAG is a self-hosted RAG (Retrieval-Augmented Generation) engine that lets you upload documents, index them as vector embeddings, and chat with them using any LLM provider — all from a single docker compose up.
- Privacy-first — Your documents and conversations never leave your machine.
- Provider-agnostic — Use Ollama (local), OpenAI, Anthropic, or Gemini. Switch anytime.
- Zero-config — API key auto-generated, database auto-initialized, models auto-detected.
- Full-stack — REST API + Web UI + CLI. Use it however you want.
- Embeddable — One API call to query your documents from any external app.
┌────────────────────────────────────────────────────┐
│ Docker Compose │
│ │
│ ┌─────────────┐ ┌───────────┐ ┌──────────────┐ │
│ │ api-core │ │ vectordb │ │ ollama │ │
│ │ (Bun+Hono) │ │ (ChromaDB)│ │ (optional) │ │
│ │ :3000 │ │ :8000 │ │ :11434 │ │
│ └──────┬───────┘ └─────┬─────┘ └──────┬───────┘ │
│ │ │ │ │
│ └────────────────┴───────────────┘ │
└────────────────────────────────────────────────────┘
| Layer | Technology |
|---|---|
| Runtime | Bun |
| Backend | Hono + TypeScript |
| Database | SQLite (WAL mode) |
| Vector Store | ChromaDB |
| Frontend | React 19 + Vite + TailwindCSS 4 + shadcn/ui |
| State | Zustand |
| LLM Providers | Ollama, OpenAI, Anthropic, Gemini |
| CLI | @clack/prompts |
| Deploy | Docker Compose |
- Docker & Docker Compose
- 4 GB RAM minimum
- 5 GB disk space (for models)
git clone https://github.com/your-org/dockrag.git
cd dockrag
# Start all services (including local Ollama)
docker compose --profile local up --build -d

# Embedding model (required)
docker compose exec ollama ollama pull nomic-embed-text
# Chat model (pick one)
docker compose exec ollama ollama pull llama3.2

Navigate to http://localhost:3000. The API key is auto-detected — you're in.
- Projects → New Project → pick a name, provider, and models
- Documents → Upload your PDFs, markdown, text, or Word files
- Chat → Ask questions and get answers grounded in your documents
That's it. No configuration files, no environment variables to set, no API keys to manage.
DockRAG supports OpenAI, Anthropic, and Gemini alongside Ollama.
- Set a `DOCKRAG_SECRET` in `.env` (required to encrypt cloud API keys):

  cp .env.example .env
  echo "DOCKRAG_SECRET=$(openssl rand -hex 32)" >> .env

- Go to Settings in the UI → enter your API key → Test → Save
- Create a new project using the cloud provider. Models are auto-detected.
Note: Anthropic does not offer an embeddings API. Use OpenAI or Ollama for embeddings alongside Anthropic for chat.
DockRAG exposes a simple REST API so you can query your documents from any external application.
curl -X POST http://localhost:3000/api/projects/{PROJECT_ID}/query \
-H "Authorization: Bearer {API_KEY}" \
-H "Content-Type: application/json" \
  -d '{"message": "What does the document say about X?", "options": {"stream": false}}'

Response:
{
"data": {
"chat_id": "uuid",
"message_id": "uuid",
"content": "According to the document...",
"sources": [
{ "filename": "report.pdf", "chunk_index": 3, "score": 0.92, "preview": "..." }
]
}
}

Pass the chat_id from the first response to maintain context:
curl -X POST http://localhost:3000/api/projects/{PROJECT_ID}/query \
-H "Authorization: Bearer {API_KEY}" \
-H "Content-Type: application/json" \
  -d '{"message": "Tell me more about that", "chat_id": "uuid-from-previous"}'

Streaming is enabled by default. Omit "options": {"stream": false} to get Server-Sent Events:
event: token
data: {"content":"According"}
event: token
data: {"content":" to the"}
event: sources
data: {"sources":[...]}
event: done
data: {"chat_id":"...","message_id":"..."}
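A client reassembles the answer from the `token` events above. Here is a minimal, self-contained TypeScript sketch of an SSE frame parser — illustrative only, not code shipped with DockRAG:

```typescript
// Minimal SSE frame parser: frames are separated by a blank line;
// each frame carries "event:" and "data:" fields, as in the stream above.

interface SSEMessage {
  event: string;
  data: string;
}

function parseSSE(raw: string): SSEMessage[] {
  const messages: SSEMessage[] = [];
  for (const frame of raw.split(/\n\n+/)) {
    let event = "message"; // SSE default event name
    const data: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).trim());
    }
    if (data.length > 0) messages.push({ event, data: data.join("\n") });
  }
  return messages;
}

// Reassemble the answer from token events:
const raw = [
  "event: token",
  'data: {"content":"According"}',
  "",
  "event: token",
  'data: {"content":" to the"}',
  "",
  "event: done",
  'data: {"chat_id":"abc","message_id":"def"}',
].join("\n");

const answer = parseSSE(raw)
  .filter((m) => m.event === "token")
  .map((m) => JSON.parse(m.data).content)
  .join("");
console.log(answer); // "According to the"
```

In a real client you would feed the parser chunks read from `response.body` rather than a fixed string.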
The web UI provides ready-to-copy integration code (cURL, JavaScript, React Hook) for each project. Click the Integrate button on any project's document page.
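As a further illustration, here is a hypothetical TypeScript helper that assembles the same non-streaming request for `fetch`. The helper name and structure are our own; only the endpoint path, headers, and body fields come from the examples above:

```typescript
// Sketch of a tiny client helper for the query endpoint shown above.
// buildQueryRequest only assembles the request; pass its result to fetch().

interface QueryRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildQueryRequest(
  baseUrl: string,
  projectId: string,
  apiKey: string,
  message: string,
  chatId?: string,
): QueryRequest {
  const body: Record<string, unknown> = {
    message,
    options: { stream: false }, // streaming is on by default
  };
  if (chatId) body.chat_id = chatId; // continue an existing conversation
  return {
    url: `${baseUrl}/api/projects/${projectId}/query`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Usage (not executed here):
// const req = buildQueryRequest("http://localhost:3000", "p1", "key", "Hi");
// const res = await fetch(req.url, req.init).then((r) => r.json());
```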
DockRAG includes a CLI that works both inside the container (direct mode) and from your host (HTTP mode).
# Inside the container
docker compose exec api-core bun run cli <command>
# Or set up HTTP mode from your host
bun run cli init

| Command | Description |
|---|---|
| `status` | System health check |
| `init` | Configure server connection |
| `key:show` | Show current API key |
| `key:regenerate` | Regenerate API key |
| `provider:set <name>` | Configure a provider |
| `provider:test <name>` | Test provider connectivity |
| `project:list` | List all projects |
| `project:create [name]` | Create project interactively |
| `ingest <project> <path>` | Ingest file or directory |
| `query <project> "question"` | Query from terminal |
| Format | Extension | Parser |
|---|---|---|
| Plain text | `.txt` | Direct |
| Markdown | `.md` | Direct |
| PDF | `.pdf` | `pdf-parse` |
| Word | `.docx` | `mammoth` |
| Variable | Default | Description |
|---|---|---|
| `DOCKRAG_PORT` | `3000` | API server port |
| `DOCKRAG_API_KEY` | Auto-generated | Master API key (set for a fixed key) |
| `DOCKRAG_SECRET` | — | Encryption key for cloud provider API keys |
| `DOCKRAG_MAX_UPLOAD_MB` | `16` | Maximum upload file size |
| `CHROMADB_URL` | `http://vectordb:8000` | ChromaDB connection URL |
| `OLLAMA_URL` | `http://ollama:11434` | Ollama connection URL |
dockrag/
├── docker-compose.yml # 3 services: api-core, vectordb, ollama
├── Dockerfile # Multi-stage build (deps → frontend → runtime)
├── .env.example # Environment template
├── package.json # Backend (Bun)
├── src/
│ ├── index.ts # Hono server entry point
│ ├── config/ # Environment validation (Zod)
│ ├── db/ # SQLite + ChromaDB clients
│ ├── llm/ # LLM provider implementations
│ │ ├── llm.interface.ts # ILLMService interface
│ │ ├── llm.factory.ts # Factory pattern
│ │ ├── ollama.service.ts
│ │ ├── openai.service.ts
│ │ ├── anthropic.service.ts
│ │ └── gemini.service.ts
│ ├── ingestion/ # Document parsing & chunking
│ ├── middleware/ # Auth, rate limiting, security
│ ├── modules/ # Feature modules (REST)
│ │ ├── providers/ # LLM provider config
│ │ ├── projects/ # Project CRUD
│ │ ├── documents/ # Document upload & status
│ │ ├── chats/ # Chat sessions
│ │ ├── query/ # RAG query pipeline
│ │ └── system/ # Health & config
│ └── cli/ # CLI tool (10 commands)
├── frontend/
│ ├── src/
│ │ ├── pages/ # Dashboard, Projects, Documents, Chat, Settings
│ │ ├── components/ # UI components + wizard
│ │ ├── stores/ # Zustand state
│ │ └── hooks/ # Custom hooks (streaming, polling)
│ └── public/ # Static assets (logo)
├── docs/ # Full documentation
└── data/ # Runtime volumes (sqlite, chroma, ollama)
Upload → Parse → Chunk → Embed → Store (ChromaDB)
↓
Query → Embed question → Search vectors → Build context → LLM → Stream response
- Parse — Extract text from PDF/DOCX/MD/TXT
- Chunk — Recursive character text splitter (configurable size/overlap)
- Embed — Generate vectors via configured embedding model
- Store — Save to project-specific ChromaDB collection
- Retrieve — Cosine similarity search, filter by distance threshold
- Generate — Stream LLM response with RAG system prompt + chat history
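The Chunk step can be sketched as a recursive character splitter: try coarse separators first (paragraphs) and fall back to finer ones (sentences, words) when a piece is still too large. A simplified TypeScript illustration — DockRAG's actual splitter is configurable and also applies overlap, which is omitted here for brevity:

```typescript
// Simplified recursive character splitter. Illustrative sketch only:
// separator list, chunk size, and the absence of overlap are simplifications.

function splitText(
  text: string,
  chunkSize = 500,
  separators = ["\n\n", "\n", ". ", " "],
): string[] {
  if (text.length <= chunkSize) return text.trim() ? [text.trim()] : [];
  const [sep, ...rest] = separators;
  if (sep === undefined) {
    // No separators left: hard-split at chunkSize.
    const out: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      out.push(text.slice(i, i + chunkSize));
    }
    return out;
  }
  const chunks: string[] = [];
  let current = "";
  for (const piece of text.split(sep)) {
    if ((current + sep + piece).length > chunkSize && current) {
      // Current accumulation is full: recurse with finer separators.
      chunks.push(...splitText(current, chunkSize, rest));
      current = piece;
    } else {
      current = current ? current + sep + piece : piece;
    }
  }
  if (current) chunks.push(...splitText(current, chunkSize, rest));
  return chunks;
}

const chunks = splitText("para one.\n\npara two.\n\npara three.", 12);
console.log(chunks); // ["para one.", "para two.", "para three."]
```

Each resulting chunk is then embedded and stored with its index, which is how the `chunk_index` in query responses is produced.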
# Backend (hot reload)
bun run dev
# Frontend (Vite dev server, proxies /api to :3000)
cd frontend && npm run dev
# Both services need ChromaDB running
docker compose up vectordb -d

Full API documentation is available in the docs/ folder.
MIT
