updated per bug review by kheiss-uwzoo · Pull Request #1621 · NVIDIA/NeMo-Retriever

kheiss-uwzoo · 2026-03-13T20:08:20Z

Overview

QA review branch that brings in upstream main and applies bug-queue doc and tooling updates. The diff is ~2,300 insertions and ~1,200 deletions across 66 files, touching docs, NeMo Retriever, Helm, the test harness, and retrieval-bench.

Documentation

Extraction docs: Updates across audio, benchmarking, quickstart (guide + library mode), support matrix, content/custom metadata, FAQ, CLI reference, Python API reference, VLM embed, user-defined functions/stages, and v2 API guide.
Naming/links: Consistent use of “NeMo Retriever Library” (replacing “NVIDIA Ingest” / “nv-ingest”) and fixes for support matrix and related links (e.g. RIVA).
Helm: README and values updates; table additions for nimOperator.rerankqa and nimOperator.ocr; Nemotron rebranding.

NeMo Retriever

Markdown API: to_markdown() returns None for empty results; markdown I/O and tests adjusted (including test_io_markdown, test_html_convert, test_txt_split).
Image support: Docs for ingesting image files (batch and in-process), including extract_image_files and --input-type image.
Text chunking: .split() for token-count–based chunking (#1547).
Audio: Batched audio extraction improvements; Parakeet CTC ASR and ASR actor updates.
Build/install: Retriever installed as part of Docker build; get_hf_revision removed from code outside nemo_retriever/ (#1612).
Release: Version handling and PyPI wheel naming; NeMo Retriever LICENSE added.
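
The token-count-based chunking listed above can be pictured with a minimal sketch. It assumes whitespace-separated words as a stand-in for real tokenizer tokens. The names `split_by_token_count`, `max_tokens`, and `overlap` are hypothetical and are not the signature of the library's `.split()`.

```python
def split_by_token_count(text, max_tokens=4, overlap=1):
    """Split text into chunks of at most max_tokens whitespace tokens.

    Hypothetical helper: a real pipeline would count tokens with the
    embedding model's tokenizer rather than str.split().
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")

    tokens = text.split()
    step = max_tokens - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


chunks = split_by_token_count("a b c d e f g", max_tokens=4, overlap=1)
```

With these inputs, consecutive chunks share one token, which is the usual reason to allow an overlap: context that straddles a chunk boundary is not lost.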

Helm & Harness

Helm: RTX PRO 4500 override and obj-det warmup batch size override; reranker and OCR NIM table docs.
Harness: Wait for healthy reranker when needed for recall; retry for managed Helm port-forwards; docker-compose and Helm service manager improvements; JP20 recall config cleanup and readiness logging.
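
The "wait for healthy reranker" and "retry for port-forwards" items both follow a poll-and-retry pattern. The sketch below illustrates that pattern only and is not the harness code: `wait_until_healthy` and its parameters are hypothetical, and the health check is passed in as a plain callable so the example stays self-contained.

```python
import time


def wait_until_healthy(check, attempts=5, delay_s=1.0, sleep=time.sleep):
    """Poll check() until it returns True or the attempts run out.

    check is any zero-argument callable (hypothetical example: an HTTP GET
    against a service health endpoint). Exceptions count as "not ready".
    """
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return attempt  # number of tries it took
        except Exception:
            pass  # treat connection errors as "not ready yet"
        if attempt < attempts:
            sleep(delay_s)
    raise TimeoutError(f"service not healthy after {attempts} attempts")


# Simulated service that becomes healthy on the third poll.
responses = iter([False, False, True])
tries = wait_until_healthy(lambda: next(responses), sleep=lambda s: None)
```

Injecting `sleep` keeps the loop testable without real delays; a caller would normally leave the default.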

Retrieval-bench

Pipeline and modality handling improvements; refactors for retriever singletons (ColEmbed, HF dense, Nemotron ColEmbed VL v2, Nemotron Embed VL dense).
New BRIGHT agentic submission (bright_agentic.md, sdg.png).

CI / Release

Perform-release: Workflow and release-helm updates; reusable PyPI build/publish and release-helm workflow changes.
Misc: Redis TTL default increased to 48h for VLM captioning; in-process extract fixes for txt and reranker; source_id in LanceDB schema; rerank and release-related fixes.

greptile-apps · 2026-04-22T21:32:32Z

Greptile Summary

This documentation-only PR corrects import paths across several reference pages (replacing stale nemo_retriever.* aliases with canonical nv_ingest_client / nv_ingest_api / nv_ingest paths), fixes tooling examples in benchmarking.md, and adds Python 3.12+ and RTX PRO 4500 entries to the support matrix. Two files have incomplete renames that leave the docs in a contradictory state:

vlm-embed.md: The first line was updated to "Llama Nemotron Embed VL 1B v2" but the body paragraph, step description, and .env config block still reference the old llama-3.2-nemoretriever-1b-vlm-embed-v1 image/model name — users following the guide will configure the wrong model.
audio.md: One !!! important block was changed to "RIVA ASR NIM microservice" while the title, overview paragraph, and a second important block still say "parakeet-1-1b-ctc-en-us ASR NIM microservice", leaving readers with two conflicting component names on the same page.

Confidence Score: 4/5

Safe to merge after resolving the two incomplete renames in vlm-embed.md and audio.md that leave docs in a contradictory state.

Two P1 findings represent genuinely incorrect documentation — users following vlm-embed.md will configure the wrong container image/model name, and users reading audio.md will encounter conflicting service names on the same page. All other changes are clean correctness fixes. Score held at 4 pending those two fixes.

docs/docs/extraction/vlm-embed.md and docs/docs/extraction/audio.md require attention for the incomplete model/service renames.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/docs/extraction/vlm-embed.md | Heading updated to reference new model "Llama Nemotron Embed VL 1B v2" but body text, step descriptions, and .env config block still use old llama-3.2-nemoretriever-1b-vlm-embed-v1 identifiers — incomplete update. |
| docs/docs/extraction/audio.md | Single important-note block renamed to "RIVA ASR NIM microservice" while the rest of the page still refers to "parakeet-1-1b-ctc-en-us ASR NIM microservice" — inconsistent after partial rename. |
| docs/docs/extraction/quickstart-library-mode.md | Import paths corrected from nemo_retriever.* to the canonical nv_ingest/nv_ingest_client/nv_ingest_api packages; milvus query API updated to nvingest_retrieval with defensive entity-key access. |
| docs/docs/extraction/python-api-reference.md | Four import paths corrected from nemo_retriever.client(.interface) to nv_ingest_client.client.interface — straightforward accuracy fix. |
| docs/docs/extraction/custom-metadata.md | Import paths and nvingest_retrieval call signature updated to use keyword arguments and corrected module paths. |
| docs/docs/extraction/quickstart-guide.md | Import split for Ingestor corrected; repo URL updated from nv-ingest to NeMo-Retriever; trailing zero-width space on line 539. |
| docs/docs/extraction/support-matrix.md | Added Python 3.12+ software requirement section and RTX PRO 4500 Blackwell to supported GPU list. |
| docs/docs/extraction/prerequisites.md | Python 3.12 requirement surfaced as a top-level bullet and the UV note reworded to warn about 3.10/3.11 failures. |
| docs/docs/extraction/content-metadata.md | Anchor IDs added to NearbyObjectsSchema, ErrorMetadataSchema, and InfoMessageMetadataSchema headings to enable deep-linking. |
| docs/docs/extraction/benchmarking.md | Directory name corrected from nv-ingest-harness to harness; API_VERSION override example corrected to v2; spurious --dataset flag removed from doc-analysis example. |
| docs/docs/extraction/user-defined-functions.md | Relative path link to default_pipeline.yaml replaced with absolute GitHub URL, appropriate after repo restructuring. |
| docs/docs/extraction/faq.md | Whitespace-only fix in extract() call example. |
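
The "defensive entity-key access" noted for quickstart-library-mode.md can be illustrated roughly as follows. The hit shape used here (a dict with an optional "entity" dict holding "text") is an assumption for illustration, and `hit_text` is a hypothetical helper, not part of nv_ingest_client.

```python
def hit_text(hit, default=""):
    """Return the text of one retrieval hit without raising on missing keys.

    Assumed (hypothetical) shape: {"entity": {"text": "..."}}.
    """
    entity = hit.get("entity") or {}  # tolerate a missing or None entity
    return entity.get("text", default)


# Plain dicts stand in for retrieval results here.
hits = [{"entity": {"text": "chunk one"}}, {"entity": {}}, {}]
texts = [hit_text(h) for h in hits]
```

Using `.get()` with a fallback means a result that lacks the expected key yields an empty string instead of a `KeyError` in the middle of a quickstart example.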

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[PR: Documentation & Naming Updates] --> B[Import Path Corrections]
    A --> C[Model / Service Renames]
    A --> D[Tooling & Link Fixes]

    B --> B1["python-api-reference.md\nnemo_retriever.client → nv_ingest_client.client.interface"]
    B --> B2["quickstart-library-mode.md\nnemo_retriever.* → nv_ingest / nv_ingest_client / nv_ingest_api"]
    B --> B3["custom-metadata.md\nnv_ingest_client.util.milvus → nv_ingest_client.util.vdb.milvus"]
    B --> B4["quickstart-guide.md\nIngestor import split"]

    C --> C1["vlm-embed.md ⚠️\nHeading → Llama Nemotron Embed VL 1B v2\nbody/config still uses old name"]
    C --> C2["audio.md ⚠️\nimportant note → RIVA ASR\nrest of page still says Parakeet"]

    D --> D1["user-defined-functions.md\nRelative path → absolute GitHub URL"]
    D --> D2["benchmarking.md\nDirectory name + API_VERSION fix"]
    D --> D3["support-matrix.md\nPython 3.12+ requirement added\nRTX PRO 4500 added"]
    D --> D4["prerequisites.md\nPython 3.12 requirement clarified"]
    D --> D5["content-metadata.md\nAnchor IDs added to headings"]

Comments Outside Diff (1)

docs/docs/extraction/vlm-embed.md, lines 5-25

Stale model name after heading update

The first line of the page was updated to reference Llama Nemotron Embed VL 1B v2, but the body text (line 5), the step description (line 19 – "Llama 3.2 Multimodal model"), and the .env config block (lines 22–24) still use the old llama-3.2-nemoretriever-1b-vlm-embed-v1 identifiers. Users following this guide will configure the wrong container image and model name.
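
One way to catch a leftover identifier like this before review is a plain substring scan. The sketch below is illustrative only: `find_stale_lines` is a hypothetical helper, and the sample page, including the `EMBEDDING_MODEL` variable name, is made up rather than copied from vlm-embed.md.

```python
def find_stale_lines(text, stale):
    """Return 1-based line numbers of lines containing the stale identifier."""
    return [
        line_no
        for line_no, line in enumerate(text.splitlines(), start=1)
        if stale in line
    ]


OLD_NAME = "llama-3.2-nemoretriever-1b-vlm-embed-v1"

# Made-up page content; EMBEDDING_MODEL is a hypothetical variable name.
sample_page = "\n".join([
    "Use Llama Nemotron Embed VL 1B v2",
    "",
    "This guide deploys " + OLD_NAME + ".",
    "EMBEDDING_MODEL=nvidia/" + OLD_NAME,
])

stale_lines = find_stale_lines(sample_page, OLD_NAME)
```

An empty result for the old name, checked after the rename, would indicate the page no longer mixes the two identifiers.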

Reviews (1): Last reviewed commit: "Merge branch '26.03' into kheiss/qa-revi..."

greptile-apps · 2026-04-22T21:32:40Z

 !!! important

-    Due to limitations in available VRAM controls in the current release, the parakeet-1-1b-ctc-en-us ASR NIM microservice must run on a [dedicated additional GPU](support-matrix.md). For the full list of requirements, refer to [Support Matrix](support-matrix.md).
+    Due to limitations in available VRAM controls in the current release, the RIVA ASR NIM microservice must run on a [dedicated additional GPU](support-matrix.md). For the full list of requirements, refer to [Support Matrix](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/support-matrix/support-matrix.html).


Inconsistent microservice name after partial rename

This !!! important block was updated to read "RIVA ASR NIM microservice", but the rest of the page (title paragraph at lines 3–5, overview paragraph at lines 25–26, and the second !!! important at line 44–46) still uses "parakeet-1-1b-ctc-en-us ASR NIM microservice". RIVA and Parakeet are distinct products, so readers will encounter two different names for what appears to be the same component. Either the remaining references should be updated to RIVA or this block should revert to Parakeet.
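
A partial rename like this shows up when both competing names are counted on the same page. The following is a minimal sketch with a hypothetical helper `name_counts` and made-up sample text. It only counts case-insensitive substrings and does not parse Markdown.

```python
def name_counts(text, names):
    """Count case-insensitive occurrences of each candidate name in text."""
    lowered = text.lower()
    return {name: lowered.count(name.lower()) for name in names}


# Made-up page text that mixes the two component names.
sample_page = (
    "The parakeet-1-1b-ctc-en-us ASR NIM microservice transcribes audio.\n"
    "The RIVA ASR NIM microservice must run on a dedicated GPU.\n"
    "Start the parakeet-1-1b-ctc-en-us ASR NIM microservice first.\n"
)

counts = name_counts(sample_page, ["parakeet-1-1b-ctc-en-us", "RIVA ASR"])
names_in_use = [name for name, count in counts.items() if count > 0]
```

More than one entry in `names_in_use` means the page refers to the component under several names, which is the condition this comment flags.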

greptile-apps · 2026-04-22T21:32:41Z

 ## Specify MIG slices for NIM models

-When you deploy the pipeline with NIM models on MIG‑enabled GPUs, MIG device slices are requested and scheduled through the `values.yaml` file for the corresponding NIM microservice. For IBM Content-Aware Storage (CAS) deployments, this allows NIM pods to land only on nodes that expose the desired MIG profiles [raw.githubusercontent](https://raw.githubusercontent.com/NVIDIA/nv-ingest/main/helm/README.md).
+When you deploy NeMo Retriever Library with NIM models on MIG‑enabled GPUs, MIG device slices are requested and scheduled through the `values.yaml` file for the corresponding NIM microservice. For IBM Content-Aware Storage (CAS) deployments, this allows NeMo Retriever Library NIM pods to land only on nodes that expose the desired MIG profiles [raw.githubusercontent](https://raw.githubusercontent.com/NVIDIA/NeMo-Retriever/main/helm/README.md).


Zero-width space at end of line

Line 539 ends with a Unicode zero-width space character (U+200B) after the closing ] of the link. This invisible character can cause broken rendering in some Markdown processors and may appear as an unexpected character in raw text views. It should be removed.
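
Because U+200B is invisible in most editors, a short script is a practical way to locate and remove it. This sketch uses only the Python standard library; the helper names and the sample text are illustrative and do not come from the repository.

```python
ZWSP = "\u200b"  # Unicode ZERO WIDTH SPACE


def find_zero_width_spaces(text):
    """Return (line_number, column) pairs, both 1-based, for each U+200B."""
    return [
        (line_no, col)
        for line_no, line in enumerate(text.splitlines(), start=1)
        for col, ch in enumerate(line, start=1)
        if ch == ZWSP
    ]


def strip_zero_width_spaces(text):
    """Return text with every U+200B removed."""
    return text.replace(ZWSP, "")


# Made-up sample: the second line ends with an invisible U+200B after "]".
sample = "ok\nsee [docs]" + ZWSP + "\n"
positions = find_zero_width_spaces(sample)
cleaned = strip_zero_width_spaces(sample)
```

Running the finder over a docs tree before merging would surface characters like the one reported on line 539.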

updated per bug review

050943c

kheiss-uwzoo marked this pull request as ready for review March 13, 2026 20:09

kheiss-uwzoo requested a review from a team as a code owner March 13, 2026 20:09

kheiss-uwzoo requested review from jdye64, jperez999 and sosahi and removed request for a team March 13, 2026 20:09

Merge branch '26.03' into kheiss/qa-review3b

9e97864

kheiss-uwzoo merged commit d6106d3 into NVIDIA:26.03 Apr 22, 2026
3 of 5 checks passed

greptile-apps Bot reviewed Apr 22, 2026

kheiss-uwzoo merged 2 commits into NVIDIA:26.03 from kheiss-uwzoo:kheiss/qa-review3b
