Multimodal PDF RAG with table-aware retrieval using DuckDB + Chroma.
Most RAG demos ignore tables. This project:
- extracts PDF tables and stores full rows in DuckDB
- stores text chunks + table summaries in Chroma
- performs hybrid retrieval and returns answers with citations (doc/page/table)
cp .env.example .env
docker compose up --buildThe /ask endpoint uses an OpenAI-compatible chat completion API. Switch the
backend by setting LLM_MODE.
CUDA (NIM/Triton):
LLM_MODE=cuda(default)NIM_URL(e.g.,http://<host>:8000/v1)NIM_MODELNIM_KEY(optional)
Mac (local OpenAI-compatible server):
LLM_MODE=macMAC_LLM_URL(e.g.,http://127.0.0.1:8000/v1)MAC_LLM_MODELMAC_LLM_KEY(optional)
- Text blocks and table summaries are indexed in Chroma.
- Full table rows are stored in DuckDB for exact lookup.
- Image metadata (page, size, bbox) is stored in DuckDB for now.