This release adds Agentic RAG support with plan-and-execute pipelines, streaming responses, and UI integration; changes the default vector database to Elasticsearch and the default object store to SeaweedFS; adds Red Hat OpenShift support for Helm-based deployment; and introduces new agent skills for deployment, evaluation, and performance tooling.
Highlights
This release includes the following key updates:
- Added Agentic RAG support, including the plan-and-execute pipeline, streaming responses, and RAG UI integration.
- Changed the default vector database to Elasticsearch.
- GPU accelerated support needs enterprise access and is disabled by default.
- Milvus remains available as an optional vector database backend.
- Changed the default object store to SeaweedFS from MinIO.
- Updated the default LLM to
nvidia/nemotron-3-super-120b-a12band enabled Nemotron reasoning by default in deployment configurations. - Promoted
nvidia/llama-nemotron-embed-vl-1b-v2as the default embedding model. The text embedding modelnvidia/llama-nemotron-embed-1b-v2remains available as an optional configuration. - Added VLM reranker support as an opt-in.
- Added dynamic filter expression generation for Elasticsearch.
- Published RAG performance tooling and skills to use it easily.
- Published the RAG evaluation framework and skills to use it easily.
- Updated NV-Ingest to version 26.3.0.
- Updated OCR NIM naming from
nemoretriever-ocr-v1tonemotron-ocr-v1. - Added OpenClaw plugin for agent-driven deploy/configure/eval workflows.
- Added Red Hat OpenShift and OKD support for Helm deployments.
Fixed Known Issues
The following known issues have been resolved in this release:
- Fixed default LLM sampling parameter handling for non-NVIDIA providers.