Multimodal RAG with knowledge graph and contextual intelligence. Understands what your documents say, how concepts connect, and what the pages look like. Production-ready.
Most RAG systems treat documents as hierarchical text and search them by similarity, sometimes agentically. As a result, visual context is lost, entity relationships are missed, and context filtering is limited. DlightRAG combines knowledge graph understanding with dynamic multimodal retrieval to close these gaps.
From text-heavy reports to chart-filled presentations, it adapts to your documents without losing information. Answers come with inline citations grounded in the actual document content. Ship it as a ready-to-run service, integrate it into your backend, or expose it as a tool for AI agents.
- Dual multimodal RAG modes: Caption mode (parse → caption → embed) follows the pipeline-based multimodal paradigm; Unified mode (render → multimodal embed) follows the modern multimodal paradigm
- Knowledge graph + vector + visual retrieval: multi-strategy retrieval across the knowledge graph and vector similarity (via LightRAG), visual content, and dynamic metadata filters
- Multimodal ingestion: PDFs, images, and Office documents from the local filesystem, Azure Blob Storage, and other sources
- Broad LLM support: any OpenAI-compatible LLM endpoint, plus 100+ providers via LiteLLM
- Cross-workspace federation: query across embedding-compatible workspaces with well-managed result merging
- Citation and highlighting: inline citations with source, page, and highlight attribution
- Four interfaces: Web UI, REST API, Python SDK, and MCP server
Source: docs/architecture.drawio
Defaults: gpt-4.1 (chat) + text-embedding-3-large (embedding) via OpenAI. To use other providers or models, edit config.yaml (see Configuration).
If you already have the REST API running (via Docker or dlightrag-api), the Web UI is available at:
http://localhost:8100/web/
Without Docker:

    uv add dlightrag          # or: pip install dlightrag
    cp .env.example .env      # set API keys in .env
    dlightrag-api

With Docker:

    git clone https://github.com/hanlianlu/dlightrag.git && cd dlightrag
    cp .env.example .env      # set API keys in .env; edit config.yaml for models/providers
    docker compose up

The Docker setup includes PostgreSQL (pgvector + AGE), REST API (:8100), and MCP server (:8101).
Local models (Ollama, Xinference, etc.): use host.docker.internal instead of localhost in base_url settings.

Example requests:

    # Health check
    curl http://localhost:8100/health

    # Ingest a local directory
    curl -X POST http://localhost:8100/ingest \
      -H "Content-Type: application/json" \
      -d '{"source_type": "local", "path": "/app/dlightrag_storage/sources"}'

    # Retrieve contexts and sources (no LLM answer)
    curl -X POST http://localhost:8100/retrieve \
      -H "Content-Type: application/json" \
      -d '{"query": "What are the key findings?"}'

    # Generate an answer, streamed over SSE
    curl -X POST http://localhost:8100/answer \
      -H "Content-Type: application/json" \
      -d '{"query": "What are the key findings?", "stream": true}'

Python SDK:

    uv add dlightrag          # or: pip install dlightrag
    cp .env.example .env      # set API keys in .env

Then, in Python:

    import asyncio

    from dotenv import load_dotenv

    from dlightrag import RAGServiceManager, DlightragConfig

    load_dotenv()  # load .env


    async def main():
        config = DlightragConfig()
        manager = RAGServiceManager(config)

        # Ingest local documents into the "default" workspace
        await manager.aingest(workspace="default", source_type="local", path="./docs")

        # Retrieve contexts only (no LLM answer)
        result = await manager.aretrieve(query="What are the key findings?")
        print(result.contexts)

        # Generate an LLM answer
        result = await manager.aanswer(query="What are the key findings?")
        print(result.answer)


    asyncio.run(main())

The SDK requires PostgreSQL with pgvector + AGE, or the JSON fallback for development (see Configuration).
MCP server:

    uv tool install dlightrag   # or: pip install dlightrag
    cp .env.example .env        # set API keys in .env
    dlightrag-mcp

Example MCP client configuration:

    {
      "mcpServers": {
        "dlightrag": {
          "command": "uvx",
          "args": ["dlightrag-mcp", "--env-file", "/absolute/path/to/.env"]
        }
      }
    }

Tools: retrieve, answer, ingest, list_files, delete_files, list_workspaces (all with workspace isolation).
| Method | Endpoint | Description |
|---|---|---|
| POST | /ingest | Ingest from local, Azure Blob, or Snowflake |
| POST | /retrieve | Contexts + sources (no LLM answer) |
| POST | /answer | LLM answer + contexts + sources (stream: true for SSE) |
| GET | /files | List ingested documents |
| DELETE | /files | Delete documents |
| GET | /api/files/{path} | Serve/download a file (local: stream, Azure: 302 SAS redirect) |
| POST | /reset | Reset workspace(s): drop storage, clear indexes |
| GET | /workspaces | List available workspaces |
| GET | /health | Health check with storage status |
All write endpoints accept optional workspace; read endpoints accept workspaces list for cross-workspace federated search. Set DLIGHTRAG_API_AUTH_TOKEN in .env to enable bearer auth.
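As a rough sketch of how a client might combine bearer auth with a federated read, the snippet below builds (but does not send) a POST /retrieve request using only the standard library. The `query` and `workspaces` fields come from the text above; the `Authorization: Bearer <token>` header format, the helper name `build_retrieve_request`, the `finance` workspace, and the token value are assumptions for illustration. Check docs/response-schema.md for the authoritative request schema.

```python
import json
import urllib.request


def build_retrieve_request(base_url, query, workspaces=None, token=None):
    """Build (but do not send) a POST /retrieve request. Hypothetical helper."""
    payload = {"query": query}
    if workspaces:
        # Read endpoints accept a list of workspaces for federated search.
        payload["workspaces"] = list(workspaces)

    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the token is sent as a standard Bearer credential.
        headers["Authorization"] = f"Bearer {token}"

    return urllib.request.Request(
        f"{base_url}/retrieve",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# "finance" and the token value are placeholders, not real names.
req = build_retrieve_request(
    "http://localhost:8100",
    "What are the key findings?",
    workspaces=["default", "finance"],
    token="my-secret-token",
)
print(req.full_url, req.get_method())
```

Passing `req` to `urllib.request.urlopen` would send it to a running server; the token should match the value of DLIGHTRAG_API_AUTH_TOKEN.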
- Request/response schema: see docs/response-schema.md for ingestion parameters, retrieval contexts, sources, media, SSE streaming, citations, and multimodal queries.
- Retrieval & answer pipeline: see docs/retrieval_answer_mechanism.md for unified vs caption mode, visual resolution, reranking, and the Step 1+2 merge.
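When /answer is called with stream: true, the response arrives as Server-Sent Events. The following is a minimal sketch of generic SSE parsing (collect `data:` lines, dispatch on a blank line), not DlightRAG's own client code. The JSON payload with a `token` field in the sample is purely hypothetical; the real event shape is described in docs/response-schema.md.

```python
import json


def iter_sse_data(lines):
    """Yield the data payload of each complete SSE event from text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format allows one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line terminates the event.
            yield "\n".join(buffer)
            buffer = []
        # Comment lines (starting with ":") and other fields are ignored here.


# Hypothetical stream content, for illustration only.
sample = [
    'data: {"token": "Key"}\n',
    "\n",
    ": keep-alive comment\n",
    'data: {"token": " findings"}\n',
    "\n",
]

events = [json.loads(payload) for payload in iter_sse_data(sample)]
text = "".join(event["token"] for event in events)
print(text)
```

In practice the `lines` argument would be the decoded line iterator of a streaming HTTP response rather than a list.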
Configuration uses a hybrid system — structured app settings in config.yaml, secrets and deployment in .env.
Priority: constructor args > env vars > .env > config.yaml > defaults
See config.yaml for all application settings and .env.example for secrets/deployment reference.
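The priority order can be pictured as a layered lookup in which the first layer that defines a key wins. This sketch uses `collections.ChainMap` to show the effect; it is an illustration of the precedence rule only, not DlightRAG's actual settings loader, and the helper name `resolve_settings` and the sample values (for example the `reports` workspace) are made up.

```python
from collections import ChainMap


def resolve_settings(constructor_args, env_vars, dotenv_values, yaml_values, defaults):
    """Layer settings so that earlier sources override later ones."""
    # ChainMap searches its mappings left to right and returns the first hit.
    return ChainMap(constructor_args, env_vars, dotenv_values, yaml_values, defaults)


settings = resolve_settings(
    constructor_args={"workspace": "reports"},
    env_vars={"rag_mode": "unified"},
    dotenv_values={"rag_mode": "caption", "postgres_host": "db"},
    yaml_values={"workspace": "default", "postgres_host": "localhost", "parser": "docling"},
    defaults={"parser": "mineru", "rag_mode": "caption", "workspace": "default"},
)

for key in ("workspace", "rag_mode", "postgres_host", "parser"):
    print(key, "=", settings[key])
```

Here the constructor argument beats config.yaml for `workspace`, the environment variable beats .env for `rag_mode`, and config.yaml beats the built-in default for `parser`.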
Env var naming: all variables use the DLIGHTRAG_ prefix. A single underscore (_) is part of the field name (e.g. DLIGHTRAG_POSTGRES_HOST → postgres_host). A double underscore (__) means a nested object (e.g. DLIGHTRAG_CHAT__MODEL → chat.model). See .env.example for details.
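The naming rule can be expressed as a small function: strip the prefix, split on double underscores for nesting, and lowercase each part. This is a sketch of the convention as described, with a hypothetical helper name `env_to_path`; the library's own parsing may differ in edge cases.

```python
PREFIX = "DLIGHTRAG_"


def env_to_path(name):
    """Map an env var name to its dotted config path (hypothetical helper)."""
    if not name.startswith(PREFIX):
        raise ValueError(f"{name!r} does not start with {PREFIX!r}")
    rest = name[len(PREFIX):]
    # "__" separates nesting levels; a single "_" stays inside the field name.
    return ".".join(part.lower() for part in rest.split("__"))


for var in ("DLIGHTRAG_POSTGRES_HOST", "DLIGHTRAG_CHAT__MODEL", "DLIGHTRAG_RERANK__API_KEY"):
    print(var, "->", env_to_path(var))
```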
Choosing the RAG mode (rag_mode) is the first decision: it determines your ingestion pipeline, model requirements, and retrieval behavior.
| Mode | Pipeline | Best for |
|---|---|---|
| caption (default) | Document parsing → VLM captioning → text embedding → KG | Text-heavy documents, structured elements |
| unified | Page rendering → multimodal embedding → VLM entity extraction → KG | Visually rich documents (charts, diagrams, complex layouts) |
Caption mode parsers (parser in config.yaml):
| Parser | Description |
|---|---|
| mineru (default) | MinerU PDF parser: fast, good for text-heavy documents |
| docling | Docling parser: alternative structure-aware parser |
| vlm | VLM-based OCR: renders pages and uses the chat model (must be a VLM) to extract structured content; no external parser dependency |
All caption mode parsers use Docling's HybridChunker for structure-aware chunking.
Model usage by stage:
| Stage | Caption | Unified |
|---|---|---|
| Image captioning | chat model (VLM) | chat model (VLM) |
| Table / equation captioning | chat model | — |
| Entity extraction | chat model | chat model (VLM) |
| Embedding | embedding model | embedding model (multimodal) |
| Rerank (llm backend) | ingest/chat model | chat model (VLM, pointwise scoring) |
| Rerank (API backend) | cohere/jina/aliyun API | cohere/jina/aliyun API |
| Answer generation | chat model | chat model (VLM, sees text excerpts + page images) |
Important: The chat model must support vision (multimodal/VLM). It doubles as the vision model for image captioning, VLM parser, unified mode, and multimodal queries. A text-only chat model will fail on these tasks.
For unified mode, set rag_mode: unified in config.yaml and use multimodal models:

    # config.yaml
    rag_mode: unified
    chat:
      model: qwen3-vl-32b          # must support vision
    embedding:
      model: Qwen3-VL-Embedding    # must be multimodal
      dim: 4096

Limitations: Snowflake is text-only (no visual embedding). A workspace is locked to one mode after first ingestion. Page images take roughly 3-7 MB per page at 250 DPI.
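To get a feel for what the 3-7 MB per page figure means for capacity planning in unified mode, the arithmetic can be wrapped in a small helper. This is a back-of-the-envelope sketch: `page_image_storage_mb` is a hypothetical name, and the 200-page corpus is an arbitrary example, not a measured result.

```python
def page_image_storage_mb(pages, low_mb=3.0, high_mb=7.0):
    """Return a (low, high) storage estimate in MB for rendered page images."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    # Bounds default to the ~3-7 MB/page range quoted for 250 DPI rendering.
    return pages * low_mb, pages * high_mb


low, high = page_image_storage_mb(200)
print(f"200 pages: roughly {low:.0f}-{high:.0f} MB of page images")
```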
Model dispatch is two-track; choose the provider per model block in config.yaml:
| Provider | SDK | Use for |
|---|---|---|
| openai (default) | AsyncOpenAI | OpenAI, Azure OpenAI, Qwen/DashScope, MiniMax, Ollama, Xinference, OpenRouter, any OpenAI-compatible endpoint |
| litellm | LiteLLM | Anthropic, Google Gemini, and everything else via LiteLLM model prefixes |

OpenAI-compatible endpoint (Ollama example):

    # config.yaml
    chat:
      provider: openai
      model: qwen3:8b
      base_url: http://localhost:11434/v1

LiteLLM (Anthropic example):

    # config.yaml
    chat:
      provider: litellm
      model: anthropic/claude-sonnet-4-20250514

API keys go in .env:

    DLIGHTRAG_CHAT__API_KEY=sk-...
    DLIGHTRAG_EMBEDDING__API_KEY=sk-...

Storage backends are set in config.yaml:
| Setting | Default | Options |
|---|---|---|
| vector_storage | PGVectorStorage | PGVectorStorage, MilvusVectorDBStorage, NanoVectorDBStorage, ... |
| graph_storage | PGGraphStorage | PGGraphStorage, Neo4JStorage, NetworkXStorage, ... |
| kv_storage | PGKVStorage | PGKVStorage, JsonKVStorage, RedisKVStorage, ... |
| doc_status_storage | PGDocStatusStorage | PGDocStatusStorage, JsonDocStatusStorage, ... |
Note: When using PostgreSQL backends, LightRAG maps its internal namespace names to different table names (e.g. text_chunks → LIGHTRAG_DOC_CHUNKS, full_docs → LIGHTRAG_DOC_FULL). DlightRAG's unified mode adds a visual_chunks table via its own KV storage.
Each workspace has its own knowledge graph, vector store, and document index. workspace in config.yaml (default: default) is automatically bridged to backend-specific env vars — no manual setup needed.
| Backend type | Isolation mechanism |
|---|---|
| PostgreSQL (PG*) | workspace column / graph name in same database |
| Neo4j / Memgraph | Label prefix |
| Milvus / Qdrant | Collection prefix |
| MongoDB / Redis | Collection scope |
| JSON / Nano / NetworkX / Faiss | Subdirectory under working_dir/<workspace>/ |
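For the file-based backends in the last row, isolation amounts to giving each workspace its own subdirectory. The sketch below shows that idea with `pathlib`; the helper name `workspace_dir`, the name validation, and the `finance` workspace are assumptions for illustration, and the real backends may lay out files differently inside the directory.

```python
from pathlib import Path


def workspace_dir(working_dir, workspace="default"):
    """Return the per-workspace storage directory (hypothetical helper)."""
    # Assumption: reject names that would escape working_dir.
    if not workspace or workspace in {".", ".."} or "/" in workspace or "\\" in workspace:
        raise ValueError(f"invalid workspace name: {workspace!r}")
    return Path(working_dir) / workspace


print(workspace_dir("dlightrag_storage").as_posix())
print(workspace_dir("dlightrag_storage", "finance").as_posix())
```

Two workspaces never share a directory under this scheme, which is why they keep separate knowledge graphs, vector stores, and document indexes.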
Set in config.yaml under the rerank: block:
| Setting | Default | Description |
|---|---|---|
| rerank.backend | llm | llm, cohere, jina, aliyun, azure_cohere |
| rerank.model | (backend default) | Model name sent to the endpoint |
| rerank.base_url | (provider default) | Custom endpoint URL for any compatible service |
| rerank.api_key | (none) | Set in .env as DLIGHTRAG_RERANK__API_KEY |
| Backend | Default model | API key |
|---|---|---|
| llm | (uses ingest/chat model) | (reuses chat/ingest key) |
| cohere | rerank-v4.0-pro | DLIGHTRAG_RERANK__API_KEY |
| jina | jina-reranker-v3 | DLIGHTRAG_RERANK__API_KEY |
| aliyun | qwen3-rerank | DLIGHTRAG_RERANK__API_KEY |
| azure_cohere | Cohere-rerank-v4.0-pro | DLIGHTRAG_RERANK__API_KEY |
Point any backend at a local reranker (Xinference, etc.) via rerank.base_url + rerank.model in config.yaml.
Development:

    git clone https://github.com/hanlianlu/dlightrag.git && cd dlightrag
    cp .env.example .env && uv sync
    docker compose up -d             # PostgreSQL + API + MCP
    docker compose up postgres -d    # PostgreSQL only

    uv run pytest tests/unit          # unit tests (no external services)
    uv run pytest tests/integration   # integration tests (requires PostgreSQL)
    uv run ruff check src/ tests/ scripts/ --fix && uv run ruff format src/ tests/ scripts/

Apache License 2.0. See LICENSE.
Built by HanlianLyu. Contributions welcome!
