Release v1.5.0 · NVIDIA/NeMo-Agent-Toolkit

🚀 Notable Features and Improvements

Dynamo Runtime Intelligence: Automatically infer per-request latency sensitivity from agent profiles and apply runtime hints for cache control, load-aware routing, and priority-aware serving.
Agent Performance Primitives (APP): Introduce framework-agnostic performance primitives that accelerate graph-based agent frameworks such as LangChain, CrewAI, and Agno with parallel execution, speculative branching, and node-level priority routing.
LangSmith Native Integration: Observe end-to-end agent execution with native LangSmith tracing, run evaluation experiments, compare outcomes, and manage prompt versions across development and production workflows.
FastMCP Workflow Publishing: Publish NeMo Agent Toolkit workflows as MCP servers using the FastMCP runtime to simplify MCP-native deployment and integration.

Migration notice: 1.5.0 includes packaging and compatibility refactors (meta-package restructure, eval/profiling package split, and import-path updates). See the Migration Guide.

🚨 Breaking Changes

Dynamic Inference Headers with Prediction Trie Integration by @dnandakumar-nv in #1483
improvement(packaging): Packaging Restructure for libraries by @willkill07 in #1512
fix: Langchain agents should reuse runnable config by @willkill07 in #1604
Refactor: Split eval/profiler into optional nvidia-nat-eval package by @AnuradhaKaruppiah in #1599
improvement: nvext.agent_hints and nvext.cache_control clean up by @bbednarski9 in #1648

✨ New Features

Add evaluator feedback to GA optimizer by @dnandakumar-nv in #1442
Add similarity scores and threshold filtering to Redis semantic search by @thepatrickchin in #1434
Add end-to-end custom metadata propagation for observability by @ericevans-nv in #1480
Expose MCP session ID and add custom headers support for session correlation by @yczhang-nv in #1500
Preserve workflow state across WebSocket reconnections by @ericevans-nv in #1541
feat(ci-scripts): utility scripts for license updates and SBOM by @willkill07 in #1548
Add example of control flow composition with router agent and sequential executor by @thepatrickchin in #1550
Add support for publishing a NeMo Agent Toolkit workflow as an MCP server via FastMCP3 by @AnuradhaKaruppiah in #1539
Add dataset store registration and support by @dnandakumar-nv in #1576
Allow for prompts to be stored in/loaded from files by @pastorsj in #1567
feat(observability): Cross-Workflow Observability by @willkill07 in #1598
FastAPI Frontend Refactor with HTTP HITL and OAuth Support by @willkill07 in #1603
Add support for LangSmith evaluators by @mpenn in #1592
Add automatic latency sensitivity inference by @dnandakumar-nv in #1618
RAG Library Mode integration by @ericevans-nv in #1440
feat: Add HuggingFace Inference API and Embedder providers by @bledden in #1570
Integrate LangSmith Observability with Evaluation and Optimization by @pastorsj in #1593
Add experimental nvidia-nat-app Agent Performance Primitives subpackage by @mpenn in #1636
Add Dynamo Example for Latency Sensitivity Assignment by @dnandakumar-nv in #1634
Revert unintended changes from PR #1704, preserve nat-ui submodule fix by @ericevans-nv in #1710

🔧 Improvements

Restore version 1.5 on develop after forward merge by @mnajafian-nv in #1324
Forward-merge release/1.4 into develop (conflict resolution) by @mnajafian-nv in #1394
Add OAuth2-Protected MCP Calculator Example by @AnuradhaKaruppiah in #1403
Forward-merge release/1.4 into develop by @mnajafian-nv in #1453
Merge release/1.4 into develop by @willkill07 in #1459
Add use_native_tool_calling option to ReAct agent by @yczhang-nv in #1476
Add raise_on_parsing_failure option to ReAct agent by @yczhang-nv in #1477
Enable per-user workflow support in nat eval by @ericevans-nv in #1503
feat: make tavily internet search tool configurable by @cdgamarose-nv in #1518
Update nat-ui submodule to latest main by @ericevans-nv in #1551
chore(pre-commit): update versions in pre-commit; add root-level uv.lock check by @willkill07 in #1553
Add a new per-user MCP client tool list endpoint by @AnuradhaKaruppiah in #1561
Add HTTP retry logic and error resilience for workflow evaluation by @ericevans-nv in #1563
Fix workflow name regression introduced by custom OTEL span naming by @ericevans-nv in #1572
Add Support for NVExt Annotations and Latency Sensitivity for Dynamo by @dnandakumar-nv in #1575
chore(ci): add stale action for old Issues/PRs by @willkill07 in #1581
Update dynamo headers to provide raw integer values by default by @dnandakumar-nv in #1583
Resolve user ID from JWT or nat-session cookie by @AnuradhaKaruppiah in #1584
Add support for Weave feedback comments by @thepatrickchin in #1586
chore(llm-providers): Add env OPENAI_BASE_URL for openai ; unify llm provider configs by @willkill07 in #1577
HITL prompt timeouts and API error responses by @ericevans-nv in #1591
Update nat-ui submodule to latest main by @ericevans-nv in #1594
enh(sbom-licenses): refactor common code; enable multi-version diffs by @willkill07 in #1597
Populate full connection attributes and payload for HTTP and WebSocket sessions by @ericevans-nv in #1602
Refactor latency sensitivity to use integers instead of enums. by @dnandakumar-nv in #1601
chore: update nat-ui submodule by @ericevans-nv in #1606
Add cache pinning strategy for KV cache with TTL control by @dnandakumar-nv in #1609
Add type converters for langgraph wrapper nat serve endpoints by @ericevans-nv in #1610
Add max_sensitivity for latency-based prioritization by @dnandakumar-nv in #1612
feat(agent): add token-by-token streaming to tool_calling_agent by @MylesShannon in #1595
chore(deps): upgrade uv.lock deps prior to release; relax dependencies by @willkill07 in #1621
Allow running pytest from project root by @dagardner-nv in #1622
Fix warning messages emitted from test_per_user_fastapi_integration.py by @dagardner-nv in #1624
feature: dynamo integration with nat profiler and prometheus/grafana dashboard by @bbednarski9 in #1486
Remove ci/release/update_toml_dep.py script by @dagardner-nv in #1646
CI fix: exclude CHANGELOG.md from pre-commit checks by @bbednarski9 in #1653
Remove NASSE naming by @ericevans-nv in #1632
chore: remove all unnecessary docker deployment guides from examples by @willkill07 in #1655
Remove Profiler Agent Example from the Primary Toolkit Repo by @AnuradhaKaruppiah in #1656
improvement: nvext.cache_control warning and HiCache for SgLang images by @bbednarski9 in #1658
Fix multi_frameworks example UnboundLocalError and upgrade default LLM by @ericevans-nv in #1661
Fixes and improvements for tests by @dagardner-nv in #1659
improvement(logging): add file logging mode option by @willkill07 in #1651
Refactor defense and red teaming middleware with pre/post invoke hooks by @ericevans-nv in #1671
Add a server-side override for the A2A Agent Card URL by @AnuradhaKaruppiah in #1673
Fix Mem0 metadata validation error and improve auto_memory_wrapper example by @ericevans-nv in #1683
Add an optional proxy server to map model names for integration testing by @dagardner-nv in #1679
chore(deps): bump package versions by @willkill07 in #1682
Fix DynamicFunctionMiddleware builder patching regression by @ericevans-nv in #1691
Fix incorrect CLI flag in auto_memory_wrapper README by @ericevans-nv in #1692
Add E2E test for Tool Calling Responses API Agent by @dagardner-nv in #1726
Fix Milvus connection failures in RAG integration tests by @ericevans-nv in #1724
feat: add NIM model endpoint health check by @mnajafian-nv in #1716
feat: add embedder inference check and Slack reporting for model health by @mnajafian-nv in #1736
Update container tag for the nginx-rewrite-models service by @dagardner-nv in #1740
Improve the text file ingest E2E test by @dagardner-nv in #1759
Update uv.lock files by @dagardner-nv in #1762
Improves FastMCP dev experience and docs by @AnuradhaKaruppiah in #1773
Observability user experiences fixes by @dnandakumar-nv in #1760
Updating CHANGELOG and Release Notes by @AnuradhaKaruppiah in #1792

🐛 Bug Fixes

fix: bump NAT version to 1.5 for packages that were added under release/1.4 by @willkill07 in #1399
Fix MCP tool validation for nullable optional fields by @AnuradhaKaruppiah in #1507
fix(serve): ensure a single event loop for python 3.11 by @willkill07 in #1528
fix: flaky batching processor test by @willkill07 in #1529
fix(ci): coverage reports should only be for nat code and examples by @willkill07 in #1536
fix(ci): Fix build_wheels and slack notifications for nightlies by @willkill07 in #1537
fix(tests): add required deps for some e2e tests; get notebook tests working by @willkill07 in #1538
Forward-merge release/1.4 into develop by @willkill07 in #1552
Use relative paths for symlink creation in workflow create command by @thepatrickchin in #1557
fix(milvus): Fix vector_field config mapping and document_id type by @rmalani-nv in #1555
Refactor span attribute serialization to use JSON strings by @dnandakumar-nv in #1574
fix(ci): ensure packaging works in GitLab CI by @willkill07 in #1582
Fix FastMCP example E2E tests by @AnuradhaKaruppiah in #1580
fix(gitlab-ci): ensure gitlab artifact upload is configured correctly by @willkill07 in #1588
fix(ci): ensure stale action has required permissions by @willkill07 in #1589
fix(mcp): Cache enum classes to prevent validation errors by @bledden in #1564
fix(tests): prepare for OpenAI endpoint for nightlies; fix failing tests by @willkill07 in #1596
Refactor call index tracking for prefix predictions by @dnandakumar-nv in #1608
Fix failures after nvidia-nat-eval isolation by @willkill07 in #1615
fix(mcp-client): ensure tools are only invoked when available by @willkill07 in #1616
fix: update openpipe-art accuracy reward logic by @aslanshi in #1623
fix: Preserve custom dataset fields in workflow output by @bledden in #1628
fix: Skip output directory cleanup when --skip_workflow is set by @bledden in #1627
fix: Pass request_timeout through to OpenAI/Azure LLM clients by @bledden in #1626
fix: Filter empty LLM responses from ReAct retry scratchpad by @bledden in #1629
Fix auth callback trace and update test scripts by @AnuradhaKaruppiah in #1633
fix(http-hitl-oauth): fix streaming and default configuration values by @willkill07 in #1641
Add validation alias for nvext_max_sensitivity in DynamoLLM by @dnandakumar-nv in #1657
Treat explicit null defaults as nullable in MCP schema translation by @AnuradhaKaruppiah in #1665
Fix Unicode escape sequences showing in console workflow output by @yczhang-nv in #1664
Improve ReAct tool input parsing for Python-style Action Input literals by @AnuradhaKaruppiah in #1666
Add missing dependency for nvidia-nat-opentelemetry to nvidia-nat-langchain by @dagardner-nv in #1670
Fix ReAct agent parsing failures with reasoning models (<think> tags) by @yczhang-nv in #1667
Simplify the example questions to bypass priv levels by @AnuradhaKaruppiah in #1672
Fix thought matching issue in ReAct agent with the Llama-3.1-Nemotron-Nano-4B-v1.1 model by @dagardner-nv in #1675
chore(deps): provide upper-bound for starlette; bump grpcio versions by @willkill07 in #1669
Fix setting the openai base url for llama index by @dagardner-nv in #1686
examples(mcp): make example more robust to LLM hallucination by @willkill07 in #1695
Fix mixture of agent example from reaching GRAPH_RECURSION_LIMIT by @dagardner-nv in #1697
fix: handle GraphRecursionError gracefully in tool_calling_agent by @mnajafian-nv in #1705
Update nat-ui submodule to include conversation state fix by @ericevans-nv in #1704
Fix alert_triage_agent empty reports in offline mode (#1699) by @mnajafian-nv in #1703
fix(notebooks): add missing nat workflow reinstall before nat run by @mnajafian-nv in #1713
fix(simple-web-query): harden tool description; disable thinking by @willkill07 in #1722
Fix pydantic model validation for nvext hints by @dnandakumar-nv in #1723
Pin nvidia-nat-ragaai to setuptools v81 by @dagardner-nv in #1730
Fix Strands integration tests by @dagardner-nv in #1731
fix: replace llama-3.1-405b model in email phishing analyzer by @mnajafian-nv in #1712
Fix alert triage agent: switch to nemotron-3-nano model and improve prompts by @hsin-c in #1750
fix(ci): sanitize sbom license response by @willkill07 in #1763
fix: replace deprecated mistral-nemo-12b and fix reasoning agent tool discovery by @mnajafian-nv in #1781
fix: toplevel pyproject.toml should specify tool.uv.managed=true by @willkill07 in #1783

📝 Documentation Updates

Fix typo in documentation for uv sync command by @thepatrickchin in #1542
docs: 1.5 migration guide for packaging by @willkill07 in #1625
chore(docs): add GitHub Issues/PRs to Linkcheck ignorelist by @willkill07 in #1642
Ensure that we always spell vLLM with the same casing that the proje… by @dagardner-nv in #1644
Document the need to set the NVIDIA_API_KEY in the Redis example by @dagardner-nv in #1678
Add documentation for langsmith evaluators by @pastorsj in #1643
Organize alternate source/package install commands into tabs by @dagardner-nv in #1737
Cleanup vale vocabulary by @dagardner-nv in #1745
Fix out of date documentation references to AIQContext by @dagardner-nv in #1795

New Contributors

@bledden made their first contribution in #1564
@pastorsj made their first contribution in #1567

Full Changelog: v1.4.1...v1.5.0
