- Route: Checks the RAG context for relevance to the query and adds live web search if the context is thin
- Evaluate: Checks responses for relevance and accuracy, flags hallucinations
- Iterate: Goes through multiple evaluation and generation cycles
- Extract: Uploads DOCX/PDF documents and converts each table to CALS XML, preserving spans, column widths, and cell text
- Validate Content: Cross-checks every extracted cell value against two independent PDF parsers (pdfplumber + Camelot) and marks cells `verify="ok"` or `"unconfirmed"`
- Annotate Styles: Sends original page images to a local vision LLM (Qwen2.5-VL via LM Studio) to detect bold formatting and indent levels, capturing the visual hierarchy of the source document
- Compare Snapshots: (Planned) TEDS-based tree edit distance comparison between the original snapshot and any re-transformed output (re-exported PDF, HTML, iXBRL)
- Edit Prompts: Customize results through your own prompts
- Change Parameters: Adjust agent behavior through parameters and runtime variables
- Look and Feel: Change the agent and UI by editing the code yourself
- Free Endpoints: use free endpoints on build.nvidia.com
- Self-Hosted: Point to Ollama or NIM on your own GPUs
- Local VLM: Point to a self-hosted LM Studio instance for offline vision LLM annotation
- Easy Mode: Use the application
- Intermediate Mode: Modify the application
- Advanced Mode: Self-host GPUs for inference
You can run Agentic RAG without Workbench, but this README assumes NVIDIA AI Workbench is installed. See how to install it here.
You need an internet connection because Agentic RAG uses an NVIDIA endpoint for document embedding.
Table extraction and VLM annotation work fully offline once LM Studio is running locally; those features do not require an NVIDIA API key.
- Get NVIDIA and Tavily API keys:
- Clone this repo with AI Workbench > configure the keys when prompted.
- Click Open Chat > Go to the Document tab in the web app > Click Add to Context.
- Type in your question > Hit enter - answers come from free cloud endpoints.
- Upload a DOCX containing financial or structured tables via the Document tab.
- Switch to the Table Browser tab to inspect the extracted CALS XML, interactive HTML rendering, and per-cell verification status.
- Click Re-annotate with VLM to run Qwen2.5-VL (via a local LM Studio instance at `http://localhost:1234/v1`) and write `bold` and `indent` attributes onto each cell.
- Annotated snapshots are persisted to `data/table_catalog.json` after each table so progress survives interruption.
LM Studio requirement: launch LM Studio with the `--no-sandbox` flag and load the `Qwen2.5-VL-72B` model before clicking Re-annotate.
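The annotate-then-persist behavior described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names (`annotate_entry`, `save_catalog`, `annotate_and_persist`), the `"true"`/`"false"` encoding of `bold`, and the catalog layout (table id mapped to a CALS XML string) are all assumptions made for the example. Only the `bold` and `indent` attribute names and the `data/table_catalog.json` path come from this README.

```python
import json
import os
import tempfile
import xml.etree.ElementTree as ET


def annotate_entry(entry: ET.Element, bold: bool, indent: int) -> None:
    """Write style attributes onto a CALS <entry> element in place."""
    # Assumption: bold is stored as "true"/"false"; the real encoding may differ.
    entry.set("bold", "true" if bold else "false")
    entry.set("indent", str(indent))


def save_catalog(path: str, catalog: dict) -> None:
    """Write the catalog via a temp file + rename so a crash never leaves a partial file."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(catalog, handle, ensure_ascii=False, indent=2)
        os.replace(tmp_path, path)  # atomic on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def annotate_and_persist(tables: dict, styles: dict, path: str) -> None:
    """tables: table id -> CALS XML string.
    styles: table id -> list of (bold, indent), one per <entry> in document order
    (a stand-in for whatever the vision model returns)."""
    catalog = {}
    for table_id, xml_text in tables.items():
        root = ET.fromstring(xml_text)
        for entry, (bold, indent) in zip(root.iter("entry"), styles.get(table_id, [])):
            annotate_entry(entry, bold, indent)
        catalog[table_id] = ET.tostring(root, encoding="unicode")
        save_catalog(path, catalog)  # persist after every table, not once at the end
```

Saving inside the loop is what lets an interrupted run keep the tables it has already annotated; the temp-file rename keeps the JSON file readable even if the process dies mid-write.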
Click to Expand Easy Mode
| Steps | What can go wrong | Screen shot |
|---|---|---|
| 1. Open the Desktop App > Select local. | Probably a Docker Desktop issue (if selected on install). Fix: See troubleshooting here | |
| 2. Click Clone Project > Paste repository URL > Clone | Incorrect URL. Fix: use the correct URL. | |
| 3. Click Resolve Now > Enter NVIDIA and Tavily API keys. | You don't see the banner. Fix: go to Project Container > Variables > Configure for API keys. See docs here | |
| 4. Click Open Chat. | Very little can go wrong here. | |
| 5. Click Documents > Create Context. | Incorrect API key. Fix per Step 3 above. | |
| 6. Type question > Hit enter. | Incorrect API key. Fix per Step 3 above. | |
Use these steps when you want to work with your own documents and your own prompts.
| Steps | What can go wrong | Screen shot |
|---|---|---|
| 1. Click Documents > Clear Context. | Very little. | Vector DB reset. |
| 2. Delete the URLs > Add your own > Click Add to Context. | URLs that can't be resolved. Fix: Enter appropriate URLs | New context. |
| 3. Type question > Hit enter. | Incorrect API key. Fix per Step 3 in the table above. | Triggers the agent. |
Click to Expand Intermediate Mode
This application is a quick prototype, not a robust piece of software, so there are many opportunities to improve it.
- Fork this project to your own GitHub account, then clone it in Workbench
- Add VS Code to the project
- Create an `experiment` branch to protect main
- Open VS Code from the Desktop App and edit the application code
- Change recursion limit, number of web sites returned by Tavily, whether previous searches are saved
- Add new endpoints from build.nvidia.com
- Change the look and feel of the Gradio app or add new features
- Modify the agent
- Extend the table extraction pipeline in `code/chatui/utils/database.py`:
  - `_load_docx_direct()`: DOCX → CALS XML with spans and column widths
  - `verify_table()`: cross-checks cell values against pdfplumber + Camelot
  - `_annotate_entry_styles_with_vlm()`: VLM bold/indent detection
  - `_cals_to_fop_pdf()` / `_cals_to_interactive_html()`: table rendering
- See `agentic-rag-docs/table-validation-approach.md` for the full validation design
- Fix any bugs you find
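As orientation before extending the validation step, here is a minimal sketch of the cross-check idea: a cell counts as verified when its normalized text also appears in the output of both independent parsers. It is not the project's `verify_table()`. The function names, the normalization rules, and the "both parsers must agree" policy are assumptions for illustration, and the parser outputs are passed in as plain lists of strings rather than read from pdfplumber or Camelot.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Fold Unicode variants, whitespace, and case so parser quirks don't cause false mismatches."""
    text = unicodedata.normalize("NFKC", text)
    text = text.replace("\u2212", "-")  # treat the Unicode minus sign as a hyphen
    return re.sub(r"\s+", " ", text).strip().lower()


def verify_cells(cells, parser_a_cells, parser_b_cells):
    """Return one status per extracted cell: "ok" or "unconfirmed".

    cells: cell texts from the extracted table.
    parser_a_cells / parser_b_cells: cell texts recovered by two independent
    parsers (hypothetical inputs; in the app these would come from the PDF).
    """
    seen_a = {normalize(c) for c in parser_a_cells}
    seen_b = {normalize(c) for c in parser_b_cells}
    statuses = []
    for cell in cells:
        key = normalize(cell)
        # Assumption: a cell is "ok" only when BOTH parsers saw the same value.
        statuses.append("ok" if key in seen_a and key in seen_b else "unconfirmed")
    return statuses


if __name__ == "__main__":
    extracted = ["Total  revenue", "1,234", "42"]
    from_parser_a = ["total revenue", "1,234"]
    from_parser_b = ["Total Revenue", "1,234", "42"]
    print(verify_cells(extracted, from_parser_a, from_parser_b))
```

Requiring agreement from both parsers is the stricter choice; switching the `and` to `or` would mark a cell `ok` when either parser confirms it, at the cost of weaker guarantees.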
Click to Expand Advanced Mode
Use these details if you want to modify the application, e.g. by configuring prompts, adding your own endpoints, changing the Gradio app or whatever else occurs to you.
- Set up a Linux box with an NVIDIA GPU and Docker.
- Deploy an Ollama container or an NVIDIA NIM on that host.
- Configure the chat app to use the self-hosted endpoint.
- Install LM Studio on a machine with a compatible GPU.
- Load model `Qwen2.5-VL-72B` (or any OpenAI-compatible vision model).
- Start the local server: `./LM-Studio-*.AppImage --no-sandbox` and enable the API server at port `1234`.
- The app auto-detects the first available model via `client.models.list()`; no extra configuration required.
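The auto-detection step mentioned above can be sketched with the OpenAI Python client pointed at the local server. This is an illustration under assumptions, not the app's code: `pick_first_model` and `make_local_client` are hypothetical names, and the placeholder API key assumes the local server does not validate keys.

```python
def pick_first_model(client) -> str:
    """Return the id of the first model the server reports, or raise if none is loaded."""
    models = client.models.list()
    for model in models.data:
        return model.id
    raise RuntimeError("No model is loaded on the server; load one in LM Studio first.")


def make_local_client(base_url: str = "http://localhost:1234/v1"):
    """Build an OpenAI-compatible client for a local server (requires `pip install openai`)."""
    from openai import OpenAI  # imported lazily so the helper above has no hard dependency

    # Assumption: the local server ignores the key, so any non-empty string works.
    return OpenAI(base_url=base_url, api_key="lm-studio")


if __name__ == "__main__":
    client = make_local_client()
    print("Using model:", pick_first_model(client))
```

Picking the first listed model keeps setup to zero configuration, but if several models are loaded it may not select the vision model; in that case, filtering the returned ids by name would be a reasonable extension.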
This NVIDIA AI Workbench example project is licensed under the Apache 2.0 License.
This project may utilize additional third-party open source software projects. Review the license terms of these open source projects before use. Third party components used as part of this project are subject to their separate legal notices or terms that accompany the components. You are responsible for confirming compliance with third-party component license terms and requirements.
| ❓ Have Questions? |
|---|
| Please direct any issues, fixes, suggestions, and discussion on this project to the DevZone Members Only Forum thread here |
⬇️ Download AI Workbench | 📖 User Guide |📂 Other Projects | 🚨 User Forum




