DocRAG-NVIDIA (MVP)

Multimodal PDF RAG with table-aware retrieval using DuckDB + Chroma.

Why this repo

Most RAG demos ignore tables. This project:

  • extracts PDF tables and stores full rows in DuckDB
  • stores text chunks + table summaries in Chroma
  • performs hybrid retrieval and returns answers with citations (doc/page/table)
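The hybrid retrieval step can be sketched in plain Python as follows. This is a minimal illustration, not the repo's implementation: the `vector_store` and `table_store` objects, the `Hit` fields, and the method names are hypothetical stand-ins for the actual Chroma query and DuckDB lookup.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Hit:
    text: str                   # chunk text or table summary
    doc: str                    # source document
    page: int                   # page number
    table: Optional[str] = None # table id when the hit is a table summary

def hybrid_retrieve(question, vector_store, table_store, k=5):
    """Combine semantic hits with exact table rows.

    vector_store.search() stands in for a Chroma similarity query;
    table_store.rows() stands in for a DuckDB lookup that fetches the
    full rows behind any table summary that was retrieved.
    """
    hits = vector_store.search(question, k=k)
    context, citations = [], []
    for h in hits:
        context.append(h.text)
        if h.table is not None:
            # Pull the exact rows behind the summary so the answer
            # can quote precise values, not just the summary.
            context.extend(table_store.rows(h.table))
        citations.append((h.doc, h.page, h.table))
    return context, citations
```

The returned citations carry the (doc, page, table) triple that the answers expose.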

Quickstart

cp .env.example .env
docker compose up --build

LLM Runtime Selection (CUDA vs Mac)

The /ask endpoint uses an OpenAI-compatible chat completion API. Switch the backend by setting LLM_MODE.

CUDA (NIM/Triton):

  • LLM_MODE=cuda (default)
  • NIM_URL (e.g., http://<host>:8000/v1)
  • NIM_MODEL
  • NIM_KEY (optional)

Mac (local OpenAI-compatible server):

  • LLM_MODE=mac
  • MAC_LLM_URL (e.g., http://127.0.0.1:8000/v1)
  • MAC_LLM_MODEL
  • MAC_LLM_KEY (optional)
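The selection logic above amounts to reading a few environment variables. A minimal sketch, assuming the variables listed in this section (the helper name is mine, not from the repo):

```python
import os

def resolve_llm_config(env=os.environ):
    """Pick the chat-completion endpoint from LLM_MODE.

    Returns (base_url, model, api_key); api_key may be None, since
    both NIM_KEY and MAC_LLM_KEY are optional.
    """
    mode = env.get("LLM_MODE", "cuda")  # cuda is the default
    if mode == "cuda":
        return env["NIM_URL"], env["NIM_MODEL"], env.get("NIM_KEY")
    if mode == "mac":
        return env["MAC_LLM_URL"], env["MAC_LLM_MODEL"], env.get("MAC_LLM_KEY")
    raise ValueError(f"unknown LLM_MODE: {mode!r}")
```

Because both backends speak the same OpenAI-compatible API, the triple can be passed straight to any OpenAI-compatible client.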

Ingest Notes

  • Text blocks and table summaries are indexed in Chroma.
  • Full table rows are stored in DuckDB for exact lookup.
  • Image metadata (page, size, bbox) is stored in DuckDB for now.
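The routing described above (text and table summaries to Chroma, full rows and image metadata to DuckDB) could be expressed as a small dispatch function. The block shape and store names below are illustrative assumptions, not the repo's actual schema:

```python
def route_block(block):
    """Decide where an extracted PDF block is persisted.

    block is a dict with a "kind" key ("text", "table", or "image");
    returns a list of (store, payload) pairs. The store names are
    placeholders for the real Chroma collection / DuckDB tables.
    """
    kind = block["kind"]
    if kind == "text":
        return [("chroma", block["content"])]
    if kind == "table":
        # The summary is embedded for retrieval; the full rows are
        # kept separately for exact lookup.
        return [("chroma", block["summary"]), ("duckdb", block["rows"])]
    if kind == "image":
        # Only metadata (page, size, bbox) is stored for now.
        return [("duckdb", {k: block[k] for k in ("page", "size", "bbox")})]
    raise ValueError(f"unknown block kind: {kind!r}")
```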
