Ask questions about your private files and get accurate answers with sources, fully self-hosted.
Most AI tools require uploading sensitive data to external APIs.
DocuMind keeps everything local.
- 🔐 100% private (self-hosted)
- 🧠 AI-powered search across documents
- 📎 Answers with source citations
- 🏢 Built for teams, companies, and power users
- 🧑‍💻 Developers building RAG / AI systems
- 🏢 Companies with sensitive documents
- 🔐 Privacy-focused teams avoiding SaaS AI
- ☁️ Nextcloud users wanting AI search
Q: "What are the payment terms in our vendor contracts?"
→ Answer:
"Net 30 payment terms are defined in Section 4..."
Sources:
- /contracts/vendor_a.pdf (page 3)
- /contracts/vendor_b.pdf (page 2)
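The answer-plus-sources layout above can be sketched in a few lines of Python. This is only an illustration of the output shape: `Source` and `format_answer` are hypothetical names, not DocuMind's actual API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical citation record: where a retrieved chunk came from."""
    path: str  # document path as stored in the index
    page: int  # 1-based page number of the cited chunk


def format_answer(answer: str, sources: list[Source]) -> str:
    """Render an answer followed by a de-duplicated list of sources."""
    seen: set[Source] = set()
    lines = [answer, "Sources:"]
    for src in sources:
        if src in seen:  # the same page may back several retrieved chunks
            continue
        seen.add(src)
        lines.append(f"- {src.path} (page {src.page})")
    return "\n".join(lines)
```

Calling `format_answer` with the two vendor contracts from the example yields the same "Sources:" list shown above, with repeated citations of the same page collapsed into one entry.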
- 🔎 Semantic search (RAG)
- 📄 Multi-format ingestion (PDF, DOCX, MD, TXT)
- 🔐 Permission-aware retrieval (Nextcloud ACLs)
- 🔄 Auto sync + reindex pipelines
- 🤖 Local LLM support (Ollama)
- 📊 Document intelligence (entities, risks, deadlines)
- 🧩 Nextcloud-native integration
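As a rough illustration of what permission-aware retrieval means, the sketch below filters search hits by user before ranking them. It assumes each chunk carries a snapshot of the users allowed to read its document; `Chunk`, `allowed_users`, and `filter_by_permission` are hypothetical names, and the real implementation may well enforce ACLs in the database query instead.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical search hit with an ACL snapshot taken at index time."""
    doc_path: str
    text: str
    score: float                   # similarity score from the vector search
    allowed_users: frozenset[str]  # assumption: user IDs permitted to read the document


def filter_by_permission(chunks: list[Chunk], user_id: str, top_k: int = 5) -> list[Chunk]:
    """Drop chunks the user may not read, then keep the top_k best scores."""
    visible = [c for c in chunks if user_id in c.allowed_users]
    return sorted(visible, key=lambda c: c.score, reverse=True)[:top_k]
```

Filtering before ranking and truncation matters: restricted text never reaches the LLM prompt, and a user's top results are not silently displaced by documents they cannot see.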
Built on a scalable ingestion + classification pipeline
flowchart LR
NC[Nextcloud] --> API[FastAPI]
FE[React UI] --> API
API --> DB[(Postgres + pgvector)]
API --> Redis
API --> Worker[Celery]
Worker --> AI[Embedding + LLM]
Nextcloud bridge: copy the nc_ai_bridge folder into the Nextcloud extra apps folder (/var/snap/nextcloud/current/nextcloud/extra-apps/) to make the app available from an icon in Nextcloud.
Docker:
make docker-dev
Local:
make local-dev
Then open:
- 🌐 App → http://localhost:5173
- 📡 API → http://localhost:8000/docs
- ☁️ Nextcloud → http://localhost:8081
Login:
Backend credentials (set in backend/.env):
FIRST_SUPERUSER_EMAIL=admin@admin.com
FIRST_SUPERUSER_PASSWORD=12345678
Nextcloud → Sync → Parse → Chunk → Embed → Store → Query
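The Chunk → Embed → Store → Query part of this flow can be sketched with the standard library alone. Everything here is an assumption made for illustration: the function names are hypothetical, the "store" is an in-memory list rather than Postgres + pgvector, and `embed` is a toy hashed bag-of-words vector standing in for a real embedding model.

```python
from __future__ import annotations

import math
import zlib

DIM = 64  # toy vector size; real embedding models use far larger dimensions


def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: hashed word counts, L2-normalised."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode()) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def build_store(text: str) -> list[tuple[str, list[float]]]:
    """Chunk a parsed document and pair every chunk with its vector."""
    return [(c, embed(c)) for c in chunk(text)]


def query(store: list[tuple[str, list[float]]], question: str, top_k: int = 3):
    """Return the top_k (score, chunk) pairs ranked by cosine similarity."""
    q = embed(question)
    scored = [(sum(a * b for a, b in zip(q, vec)), text) for text, vec in store]
    return sorted(scored, reverse=True)[:top_k]
```

A typical call would be `query(build_store(parsed_text), "What are the payment terms?")`. The overlap between chunks is a common choice so that a sentence falling on a chunk boundary remains retrievable from at least one chunk.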
Under the hood:
- ingestion pipeline
- classification engine
- RAG retrieval + citation builder
- Nextcloud (WebDAV + OCS)
- Ollama (local LLMs)
- PostgreSQL + pgvector
make docker-dev
make docker-logs
make local-dev
make local-backend-test
Clean modular architecture:
- ingestion pipeline
- classification system
- AI layer isolation
- worker queues
- No external AI APIs required
- Runs fully local
- Respects document permissions
- Secure credential handling
- No OCR (yet)
- Limited file types
- Requires local model setup for best results
- 🔥 Build your own private ChatGPT for documents
- 🧩 Plug into your existing Nextcloud instantly
- 🏗️ Production-ready architecture (not a toy demo)
- 🧠 Extendable AI pipelines (classification, workflows, agents)
- OCR + image parsing
- Email ingestion (IMAP)
- Knowledge graph
- Multi-tenant SaaS mode
- Agent workflows
PRs welcome. Open an issue first for major changes.
MIT License — free for personal and commercial use.
DocuMind is not just another RAG demo.
It’s a foundation for building private AI knowledge systems on top of your existing infrastructure.
