Release 25.9.0 · NVIDIA/NeMo-Retriever

The NeMo Retriever extraction 25.09 release adds new hardware and software support, and other improvements, including the following:

Add functional support for RTX Pro 6000.
Add functional support for DGX B200.
Add support for nemoretriever-ocr-v1. For details, refer to Deploy With Docker Compose (Self-Hosted) and NV-Ingest Helm Charts.
Add support for llama-3.2-nemoretriever-1b-vlm-embed-v1.
Add support for Llama Nemotron VLM 8b NIM for image captioning. For details, refer to Extract Captions from Images.
Add support for custom vector database implementations. For details, refer to Build a Custom Vector Database Operator.
Add support for custom Lambda stages. For details, refer to Add User-defined Stages to Your NeMo Retriever Extraction Pipeline.
Expand the documentation for Library Mode.
Add new documentation: Configure Ray Logging.
Add new documentation: Use Multimodal Embedding.
Add support for integer, float, boolean, and array values in custom metadata during Milvus entity creation.
Add support for running more than one VLM at a time by using Helm. For details, refer to NV-Ingest Helm Charts.
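
As an illustration of the custom metadata item above, the following minimal sketch classifies Python values into the four value types named in this release. The helper `metadata_kind` and the sample field names are hypothetical and are not part of the NeMo Retriever or Milvus APIs; the sketch only shows how such values might be told apart on the client side before ingestion.

```python
def metadata_kind(value):
    """Classify a value as one of the four types named in the release note."""
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, (list, tuple)):
        return "array"
    # Anything else is outside the four types listed in this release note.
    return "other"


# Hypothetical metadata record; the field names are illustrative only.
example_metadata = {
    "page_count": 12,
    "confidence": 0.87,
    "reviewed": True,
    "tags": ["finance", "q3"],
}

kinds = {name: metadata_kind(value) for name, value in example_metadata.items()}
```

Checking `bool` before `int` matters here: without that ordering, `True` would be reported as an integer.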

Known Issues

The following are the known issues for this release:

A10G and L40S are not supported. For details, refer to Support Matrix.
nemoretriever-parse is not supported on RTX Pro 6000 or B200. For details, refer to Support Matrix.
The NeMo Retriever extraction pipeline does not support ingestion of batches that include individual files larger than approximately 400 MB.
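
One way to work around the file size limitation above is to screen files before building a batch. The following is a minimal sketch that uses only the Python standard library. The function `split_by_size` is hypothetical, and the exact byte threshold is an assumption: the release note says only "approximately 400MB", so the value below may need to be lowered in practice.

```python
import os

# Assumption: "approximately 400MB" is treated as 400 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 400 * 1024 * 1024


def split_by_size(paths, limit=MAX_FILE_BYTES):
    """Return (ok, too_large): paths at or under the limit, and paths over it."""
    ok, too_large = [], []
    for path in paths:
        if os.path.getsize(path) > limit:
            too_large.append(path)
        else:
            ok.append(path)
    return ok, too_large
```

Files returned in `too_large` can then be excluded from the batch, or split by some other means, before the remaining files are submitted for ingestion.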
