DocRAG-NVIDIA (MVP)

Multimodal PDF RAG with table-aware retrieval using DuckDB + Chroma.

Why this repo

Most RAG demos ignore tables. This project:

  • extracts PDF tables and stores full rows in DuckDB
  • stores text chunks + table summaries in Chroma
  • performs hybrid retrieval and returns answers with citations (doc/page/table)
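The hybrid retrieval step can be sketched in plain Python as follows. This is a minimal illustration, not the repo's implementation: the `vector_store` and `table_store` objects, the `Hit` fields, and the method names are hypothetical stand-ins for the actual Chroma query and DuckDB lookup.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Hit:
    text: str                   # chunk text or table summary
    doc: str                    # source document
    page: int                   # page number
    table: Optional[str] = None # table id when the hit is a table summary

def hybrid_retrieve(question, vector_store, table_store, k=5):
    """Combine semantic hits with exact table rows.

    vector_store.search() stands in for a Chroma similarity query;
    table_store.rows() stands in for a DuckDB lookup that fetches the
    full rows behind any table summary that was retrieved.
    """
    hits = vector_store.search(question, k=k)
    context, citations = [], []
    for h in hits:
        context.append(h.text)
        if h.table is not None:
            # Pull the exact rows behind the summary so the answer
            # can quote precise values, not just the summary.
            context.extend(table_store.rows(h.table))
        citations.append((h.doc, h.page, h.table))
    return context, citations
```

The returned citations carry the (doc, page, table) triple that the answers expose.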

Quickstart

cp .env.example .env
docker compose up --build

LLM Runtime Selection (CUDA vs Mac)

The /ask endpoint uses an OpenAI-compatible chat completion API. Switch the backend by setting LLM_MODE.

CUDA (NIM/Triton):

  • LLM_MODE=cuda (default)
  • NIM_URL (e.g., http://<host>:8000/v1)
  • NIM_MODEL
  • NIM_KEY (optional)

Mac (local OpenAI-compatible server):

  • LLM_MODE=mac
  • MAC_LLM_URL (e.g., http://127.0.0.1:8000/v1)
  • MAC_LLM_MODEL
  • MAC_LLM_KEY (optional)
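The selection logic above amounts to reading a few environment variables. A minimal sketch, assuming the variables listed in this section (the helper name is mine, not from the repo):

```python
import os

def resolve_llm_config(env=os.environ):
    """Pick the chat-completion endpoint from LLM_MODE.

    Returns (base_url, model, api_key); api_key may be None, since
    both NIM_KEY and MAC_LLM_KEY are optional.
    """
    mode = env.get("LLM_MODE", "cuda")  # cuda is the default
    if mode == "cuda":
        return env["NIM_URL"], env["NIM_MODEL"], env.get("NIM_KEY")
    if mode == "mac":
        return env["MAC_LLM_URL"], env["MAC_LLM_MODEL"], env.get("MAC_LLM_KEY")
    raise ValueError(f"unknown LLM_MODE: {mode!r}")
```

Because both backends speak the same OpenAI-compatible API, the triple can be passed straight to any OpenAI-compatible client.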

Ingest Notes

  • Text blocks and table summaries are indexed in Chroma.
  • Full table rows are stored in DuckDB for exact lookup.
  • Image metadata (page, size, bbox) is stored in DuckDB for now.
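The routing described above (text and table summaries to Chroma, full rows and image metadata to DuckDB) could be expressed as a small dispatch function. The block shape and store names below are illustrative assumptions, not the repo's actual schema:

```python
def route_block(block):
    """Decide where an extracted PDF block is persisted.

    block is a dict with a "kind" key ("text", "table", or "image");
    returns a list of (store, payload) pairs. The store names are
    placeholders for the real Chroma collection / DuckDB tables.
    """
    kind = block["kind"]
    if kind == "text":
        return [("chroma", block["content"])]
    if kind == "table":
        # The summary is embedded for retrieval; the full rows are
        # kept separately for exact lookup.
        return [("chroma", block["summary"]), ("duckdb", block["rows"])]
    if kind == "image":
        # Only metadata (page, size, bbox) is stored for now.
        return [("duckdb", {k: block[k] for k in ("page", "size", "bbox")})]
    raise ValueError(f"unknown block kind: {kind!r}")
```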
