v1.5.0
🚀 Notable Features and Improvements
- Dynamo Runtime Intelligence: Automatically infer per-request latency sensitivity from agent profiles and apply runtime hints for cache control, load-aware routing, and priority-aware serving.
- Agent Performance Primitives (APP): Introduce framework-agnostic performance primitives that accelerate graph-based agent frameworks such as LangChain, CrewAI, and Agno with parallel execution, speculative branching, and node-level priority routing.
- LangSmith Native Integration: Observe end-to-end agent execution with native LangSmith tracing, run evaluation experiments, compare outcomes, and manage prompt versions across development and production workflows.
- FastMCP Workflow Publishing: Publish NeMo Agent Toolkit workflows as MCP servers using the FastMCP runtime to simplify MCP-native deployment and integration.
Migration notice: 1.5.0 includes packaging and compatibility refactors (meta-package restructure, eval/profiling package split, and import-path updates). See the Migration Guide.
🚨 Breaking Changes
- Dynamic Inference Headers with Prediction Trie Integration by @dnandakumar-nv in #1483
- improvement(packaging): Packaging Restructure for libraries by @willkill07 in #1512
- fix: Langchain agents should reuse runnable config by @willkill07 in #1604
- Refactor: Split eval/profiler into optional nvidia-nat-eval package by @AnuradhaKaruppiah in #1599
- improvement: nvext.agent_hints and nvext.cache_control clean up by @bbednarski9 in #1648
✨ New Features
- Add evaluator feedback to GA optimizer by @dnandakumar-nv in #1442
- Add similarity scores and threshold filtering to Redis semantic search by @thepatrickchin in #1434
- Add end-to-end custom metadata propagation for observability by @ericevans-nv in #1480
- Expose MCP session ID and add custom headers support for session correlation by @yczhang-nv in #1500
- Preserve workflow state across WebSocket reconnections by @ericevans-nv in #1541
- feat(ci-scripts): utility scripts for license updates and SBOM by @willkill07 in #1548
- Add example of control flow composition with router agent and sequential executor by @thepatrickchin in #1550
- Add support for publishing a NeMo Agent Toolkit workflow as a MCP server via FastMCP3 by @AnuradhaKaruppiah in #1539
- Add dataset store registration and support by @dnandakumar-nv in #1576
- Allow for prompts to be stored in/loaded from files by @pastorsj in #1567
- feat(observability): Cross-Workflow Observability by @willkill07 in #1598
- FastAPI Frontend Refactor with HTTP HITL and OAuth Support by @willkill07 in #1603
- Add support for LangSmith evaluators by @mpenn in #1592
- Add automatic latency sensitivity inference by @dnandakumar-nv in #1618
- RAG Library Mode integration by @ericevans-nv in #1440
- feat: Add HuggingFace Inference API and Embedder providers by @bledden in #1570
- Integrate LangSmith Observability with Evaluation and Optimization by @pastorsj in #1593
- Add experimental nvidia-nat-app Agent Performance Primitives subpackage by @mpenn in #1636
- Add Dynamo Example for Latency Sensitivity Assignment by @dnandakumar-nv in #1634
- Revert unintended changes from PR #1704, preserve nat-ui submodule fix by @ericevans-nv in #1710
🔧 Improvements
- Restore version 1.5 on develop after forward merge by @mnajafian-nv in #1324
- Forward-merge release/1.4 into develop (conflict resolution) by @mnajafian-nv in #1394
- Add OAuth2-Protected MCP Calculator Example by @AnuradhaKaruppiah in #1403
- Forward-merge release/1.4 into develop by @mnajafian-nv in #1453
- Merge release/1.4 into develop by @willkill07 in #1459
- Add `use_native_tool_calling` option to `ReAct` agent by @yczhang-nv in #1476
- Add `raise_on_parsing_failure` option to `ReAct` agent by @yczhang-nv in #1477
- Enable per-user workflow support in nat eval by @ericevans-nv in #1503
- feat: make tavily internet search tool configurable by @cdgamarose-nv in #1518
- Update nat-ui submodule to latest main by @ericevans-nv in #1551
- chore(pre-commit): update versions in pre-commit; add root-level `uv.lock` check by @willkill07 in #1553
- Add a new per-user MCP client tool list endpoint by @AnuradhaKaruppiah in #1561
- Add HTTP retry logic and error resilience for workflow evaluation by @ericevans-nv in #1563
- Fix workflow name regression introduced by custom OTEL span naming by @ericevans-nv in #1572
- Add Support for NVExt Annotations and Latency Sensitivity for Dynamo by @dnandakumar-nv in #1575
- chore(ci): add stale action for old Issues/PRs by @willkill07 in #1581
- Update dynamo headers to provide raw integer values by default by @dnandakumar-nv in #1583
- Resolve user ID from JWT or nat-session cookie by @AnuradhaKaruppiah in #1584
- Add support for Weave feedback comments by @thepatrickchin in #1586
- chore(llm-providers): Add env `OPENAI_BASE_URL` for `openai`; unify llm provider configs by @willkill07 in #1577
- HITL prompt timeouts and API error responses by @ericevans-nv in #1591
- Update nat-ui submodule to latest main by @ericevans-nv in #1594
- enh(sbom-licenses): refactor common code; enable multi-version diffs by @willkill07 in #1597
- Populate full connection attributes and payload for HTTP and WebSocket sessions by @ericevans-nv in #1602
- Refactor latency sensitivity to use integers instead of enums. by @dnandakumar-nv in #1601
- chore: update nat-ui submodule by @ericevans-nv in #1606
- Add cache pinning strategy for KV cache with TTL control by @dnandakumar-nv in #1609
- Add type converters for langgraph wrapper `nat serve` endpoints by @ericevans-nv in #1610
- Add max_sensitivity for latency-based prioritization by @dnandakumar-nv in #1612
- feat(agent): add token-by-token streaming to tool_calling_agent by @MylesShannon in #1595
- chore(deps): upgrade uv.lock deps prior to release; relax dependencies by @willkill07 in #1621
- Allow running `pytest` from project root by @dagardner-nv in #1622
- Fix warning messages emitted from `test_per_user_fastapi_integration.py` by @dagardner-nv in #1624
- feature: dynamo integration with nat profiler and prometheus/grafana dashboard by @bbednarski9 in #1486
- Remove `ci/release/update_toml_dep.py` script by @dagardner-nv in #1646
- CI fix: exclude CHANGELOG.md from pre-commit checks by @bbednarski9 in #1653
- Remove NASSE naming by @ericevans-nv in #1632
- chore: remove all unnecessary docker deployment guides from examples by @willkill07 in #1655
- Remove Profiler Agent Example from the Primary Toolkit Repo by @AnuradhaKaruppiah in #1656
- improvement: nvext.cache_control warning and HiCache for SgLang images by @bbednarski9 in #1658
- Fix multi_frameworks example UnboundLocalError and upgrade default LLM by @ericevans-nv in #1661
- Fixes and improvements for tests by @dagardner-nv in #1659
- improvement(logging): add file logging mode option by @willkill07 in #1651
- Refactor defense and red teaming middleware with pre/post invoke hooks by @ericevans-nv in #1671
- Add a server-side override for the A2A Agent Card URL by @AnuradhaKaruppiah in #1673
- Fix Mem0 metadata validation error and improve auto_memory_wrapper example by @ericevans-nv in #1683
- Add an optional proxy server to map model names for integration testing by @dagardner-nv in #1679
- chore(deps): bump package versions by @willkill07 in #1682
- Fix DynamicFunctionMiddleware builder patching regression by @ericevans-nv in #1691
- Fix incorrect CLI flag in auto_memory_wrapper README by @ericevans-nv in #1692
- Add E2E test for Tool Calling Responses API Agent by @dagardner-nv in #1726
- Fix Milvus connection failures in RAG integration tests by @ericevans-nv in #1724
- feat: add NIM model endpoint health check by @mnajafian-nv in #1716
- feat: add embedder inference check and Slack reporting for model health by @mnajafian-nv in #1736
- Update container tag for the `nginx-rewrite-models` service by @dagardner-nv in #1740
- Improve the text file ingest E2E test by @dagardner-nv in #1759
- Update `uv.lock` files by @dagardner-nv in #1762
- Improves FastMCP dev experience and docs by @AnuradhaKaruppiah in #1773
- Observability user experiences fixes by @dnandakumar-nv in #1760
- Updating CHANGELOG and Release Notes by @AnuradhaKaruppiah in #1792
🐛 Bug Fixes
- fix: bump NAT version to 1.5 for packages that were added under `release/1.4` by @willkill07 in #1399
- Fix MCP tool validation for nullable optional fields by @AnuradhaKaruppiah in #1507
- fix(serve): ensure a single event loop for python 3.11 by @willkill07 in #1528
- fix: flaky batching processor test by @willkill07 in #1529
- fix(ci): coverage reports should only be for `nat` code and examples by @willkill07 in #1536
- fix(ci): Fix build_wheels and slack notifications for nightlies by @willkill07 in #1537
- fix(tests): add required deps for some e2e tests; get notebook tests working by @willkill07 in #1538
- Forward-merge release/1.4 into develop by @willkill07 in #1552
- Use relative paths for symlink creation in workflow create command by @thepatrickchin in #1557
- fix(milvus): Fix vector_field config mapping and document_id type by @rmalani-nv in #1555
- Refactor span attribute serialization to use JSON strings by @dnandakumar-nv in #1574
- fix(ci): ensure packaging works in GitLab CI by @willkill07 in #1582
- Fix FastMCP example E2E tests by @AnuradhaKaruppiah in #1580
- fix(gitlab-ci): ensure gitlab artifact upload is configured correctly by @willkill07 in #1588
- fix(ci): ensure stale action has required permissions by @willkill07 in #1589
- fix(mcp): Cache enum classes to prevent validation errors by @bledden in #1564
- fix(tests): prepare for OpenAI endpoint for nightlies; fix failing tests by @willkill07 in #1596
- Refactor call index tracking for prefix predictions by @dnandakumar-nv in #1608
- Fix failures after `nvidia-nat-eval` isolation by @willkill07 in #1615
- fix(mcp-client): ensure tools are only invoked when available by @willkill07 in #1616
- fix: update openpipe-art accuracy reward logic by @aslanshi in #1623
- fix: Preserve custom dataset fields in workflow output by @bledden in #1628
- fix: Skip output directory cleanup when --skip_workflow is set by @bledden in #1627
- fix: Pass request_timeout through to OpenAI/Azure LLM clients by @bledden in #1626
- fix: Filter empty LLM responses from ReAct retry scratchpad by @bledden in #1629
- Fix auth callback trace and update test scripts by @AnuradhaKaruppiah in #1633
- fix(http-hitl-oauth): fix streaming and default configuration values by @willkill07 in #1641
- Add validation alias for nvext_max_sensitivity in DynamoLLM by @dnandakumar-nv in #1657
- Treat explicit null defaults as nullable in MCP schema translation by @AnuradhaKaruppiah in #1665
- Fix Unicode escape sequences showing in console workflow output by @yczhang-nv in #1664
- Improve ReAct tool input parsing for Python-style Action Input literals by @AnuradhaKaruppiah in #1666
- Add missing dependency for `nvidia-nat-opentelemetry` to `nvidia-nat-langchain` by @dagardner-nv in #1670
- Fix ReAct agent parsing failures with reasoning models (`<think>` tags) by @yczhang-nv in #1667
- Simplify the example questions to bypass priv levels by @AnuradhaKaruppiah in #1672
- Fix `thought` matching issue in ReAct agent with the `Llama-3.1-Nemotron-Nano-4B-v1.1` model by @dagardner-nv in #1675
- chore(deps): provide upper-bound for starlette; bump grpcio versions by @willkill07 in #1669
- Fix setting the openai base url for llama index by @dagardner-nv in #1686
- examples(mcp): make example more robust to LLM hallucination by @willkill07 in #1695
- Fix mixture of agent example from reaching `GRAPH_RECURSION_LIMIT` by @dagardner-nv in #1697
- fix: handle GraphRecursionError gracefully in tool_calling_agent by @mnajafian-nv in #1705
- Update nat-ui submodule to include conversation state fix by @ericevans-nv in #1704
- Fix alert_triage_agent empty reports in offline mode (#1699) by @mnajafian-nv in #1703
- fix(notebooks): add missing nat workflow reinstall before nat run by @mnajafian-nv in #1713
- fix(simple-web-query): harden tool description; disable thinking by @willkill07 in #1722
- Fix pydantic model validation for nvext hints by @dnandakumar-nv in #1723
- Pin nvidia-nat-ragaai to setuptools v81 by @dagardner-nv in #1730
- Fix Strands integration tests by @dagardner-nv in #1731
- fix: replace llama-3.1-405b model in email phishing analyzer by @mnajafian-nv in #1712
- Fix alert triage agent: switch to nemotron-3-nano model and improve prompts by @hsin-c in #1750
- fix(ci): sanitize sbom license response by @willkill07 in #1763
- fix: replace deprecated mistral-nemo-12b and fix reasoning agent tool discovery by @mnajafian-nv in #1781
- fix: toplevel `pyproject.toml` should specify `tool.uv.managed=true` by @willkill07 in #1783
📝 Documentation Updates
- Fix typo in documentation for uv sync command by @thepatrickchin in #1542
- docs: 1.5 migration guide for packaging by @willkill07 in #1625
- chore(docs): add GitHub Issues/PRs to Linkcheck ignorelist by @willkill07 in #1642
- Ensure that we always spell vLLM with the same casing that the proje… by @dagardner-nv in #1644
- Document the need to set the `NVIDIA_API_KEY` in the Redis example by @dagardner-nv in #1678
- Add documentation for langsmith evaluators by @pastorsj in #1643
- Organize alternate source/package install commands into tabs by @dagardner-nv in #1737
- Cleanup vale vocabulary by @dagardner-nv in #1745
- Fix out of date documentation references to `AIQContext` by @dagardner-nv in #1795
New Contributors
Full Changelog: v1.4.1...v1.5.0