dynamo llm integration with examples, analysis, and custom predictive routers#1242
Conversation
WalkthroughAdds a Dynamo backend and example React Benchmark Agent: new Dynamo LLM provider with prefix‑header hooks, ADK/LangChain adapters, FastAPI frontend/processor/router services, decision‑only tool stubs, self‑evaluating agent and evaluators, many configs, deployment/monitoring scripts, analysis/plotting utilities, docs, and tests. Changes
Sequence Diagram(s)mermaid mermaid Estimated code review effort🎯 5 (Critical) | ⏱️ ~120+ minutes Areas needing focused review:
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
20147d7 to
750926f
Compare
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
|
From @willkill07: Example README:
Complete ✅ |
|
From @willkill07: Needs to be a relative symlink rather than absolute Complete ✅ |
|
From @willkill07: (re configs) Complete ✅ |
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 6
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
🟠 Major comments (15)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_full_test.yml-94-118 (1)
94-118: Use function group reference instead of enumerating all tools.The configuration defines the
banking_toolsfunction group (lines 31-56), but the workflow explicitly lists all individual tools. Per PR reviewer feedback, workflow configs relying on function groups should list the function group name; all tools in that group will be added automatically.Based on PR comments, replace the explicit tool enumeration with a function group reference:
workflow: _type: react_agent llm_name: dynamo_llm - tool_names: [ - banking_tools.get_account_balance, - banking_tools.get_transaction_history, - banking_tools.transfer_funds, - banking_tools.get_loan_information, - banking_tools.get_credit_card_information, - banking_tools.get_mortgage_details, - banking_tools.get_savings_account_products, - banking_tools.schedule_appointment, - banking_tools.check_loan_application_status, - banking_tools.find_nearby_locations, - banking_tools.get_investment_products, - banking_tools.report_lost_stolen_card, - banking_tools.update_contact_information, - banking_tools.setup_automatic_bill_pay, - banking_tools.initiate_transaction_dispute, - banking_tools.get_exchange_rates, - banking_tools.calculate_loan_payment, - banking_tools.manage_account_alerts, - banking_tools.check_wire_transfer_status, - banking_tools.get_cd_products - ] + tool_names: [banking_tools]examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.yml-97-122 (1)
97-122: Use function group reference instead of enumerating all tools.The configuration defines the
banking_toolsfunction group (lines 35-60), but the workflow explicitly lists all individual tools. Per PR reviewer feedback, workflow configs relying on function groups should list the function group name; all tools in that group will be added automatically.Based on PR comments, replace the explicit tool enumeration with a function group reference:
workflow: _type: react_agent llm_name: dynamo_llm - tool_names: [ - banking_tools.get_account_balance, - banking_tools.get_transaction_history, - banking_tools.transfer_funds, - banking_tools.get_loan_information, - banking_tools.get_credit_card_information, - banking_tools.get_mortgage_details, - banking_tools.get_savings_account_products, - banking_tools.schedule_appointment, - banking_tools.check_loan_application_status, - banking_tools.find_nearby_locations, - banking_tools.get_investment_products, - banking_tools.report_lost_stolen_card, - banking_tools.update_contact_information, - banking_tools.setup_automatic_bill_pay, - banking_tools.initiate_transaction_dispute, - banking_tools.get_exchange_rates, - banking_tools.calculate_loan_payment, - banking_tools.manage_account_alerts, - banking_tools.check_wire_transfer_status, - banking_tools.get_cd_products - ] + tool_names: [banking_tools]external/dynamo/start_dynamo_disagg.sh-434-436 (1)
434-436: Remove hardcoded user-specific paths from example commands.The example NAT integration test contains hardcoded user-specific paths that won't work for other users.
Apply this diff to use placeholders:
- echo "cd /raid/bbednarski/NeMo-Agent-Toolkit" - echo "source /home/nfs/bbednarski/.venvs/nat_dynamo_eval2/bin/activate" + echo "cd /path/to/NeMo-Agent-Toolkit" + echo "source .venv/bin/activate" echo "export HF_HOME=~/.cache/huggingface"external/dynamo/start_dynamo_disagg.sh-36-36 (1)
36-36: Remove hardcoded user-specific path.The hardcoded path
/raid/bbednarski/models/...contains a user-specific directory that won't work for other users.Make this configurable or use environment variables:
# Local paths -LOCAL_MODEL_DIR="/raid/bbednarski/models/Llama-3.3-70B-Instruct" +LOCAL_MODEL_DIR="${LOCAL_MODEL_DIR:-$HOME/models/Llama-3.3-70B-Instruct}"This allows users to override via environment variable while providing a sensible default.
external/dynamo/start_dynamo_unified.sh-35-35 (1)
35-35: Remove hardcoded user-specific path.The hardcoded path
/raid/bbednarski/models/...contains a user-specific directory that won't work for other users.Make this configurable or use environment variables:
# Local paths -LOCAL_MODEL_DIR="/raid/bbednarski/models/Llama-3.3-70B-Instruct" +LOCAL_MODEL_DIR="${LOCAL_MODEL_DIR:-$HOME/models/Llama-3.3-70B-Instruct}"This allows users to override via environment variable while providing a sensible default.
examples/dynamo_integration/scripts/throughput_analysis.py-70-78 (1)
70-78: Add explicit return type hint.Per coding guidelines, all public APIs require Python 3.11+ type hints on parameters and return values.
Apply this diff:
-def calculate_tokens_per_second(csv_path: str): +def calculate_tokens_per_second(csv_path: str) -> tuple[pd.DataFrame | None, np.ndarray | None, dict | None]: """Calculate tokens/sec and inter-token latency for each LLM call from NEW_TOKEN events. Returns: tuple: (results_df, all_itls_array, aggregate_stats) - results_df: DataFrame with per-call metrics - all_itls_array: Array of all inter-token latencies - aggregate_stats: Dict with aggregate throughput metrics """examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.py-245-252 (1)
245-252: JSON string normalization may corrupt valid string values.Replacing all single quotes with double quotes (line 248) can corrupt JSON string values that legitimately contain single quotes (e.g.,
{"name": "O'Brien"}). This would incorrectly become{"name": "O"Brien"}and fail to parse.Consider a more robust approach:
if isinstance(v, str): try: - # Handle both single and double quotes in JSON strings - normalized = v.replace("'", '"') - return json.loads(normalized) + return json.loads(v) except json.JSONDecodeError: - logger.warning("Failed to parse input_params string as JSON: %s", v[:100]) - return {} + # Try replacing single quotes only if standard parsing fails + # This is a fallback for non-standard JSON-like strings + try: + normalized = v.replace("'", '"') + return json.loads(normalized) + except json.JSONDecodeError: + logger.warning("Failed to parse input_params string as JSON: %s", v[:100]) + return {}examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/register.py-1-17 (1)
1-17: Add Apache SPDX header toregister.pyThis Python file is missing the standard Apache-2.0 SPDX header required across the repo.
Add the usual header above the
flake8: noqaline, e.g.:+# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 + # flake8: noqaThe side-effect import pattern for registrations otherwise looks consistent with the rest of NAT.
external/dynamo/test_dynamo_integration.sh-1-41 (1)
1-41: Add SPDX header and fix example path / install instructionsTwo concrete issues here:
Missing Apache SPDX header
Shell scripts are required to carry the Apache-2.0 SPDX header. Add it immediately after the shebang:
#!/bin/bash
SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
Wrong example directory for NAT installation and docs
EXAMPLE_DIRis currently set to$SCRIPT_DIR(i.e.,external/dynamo), but the example package and README live underexamples/dynamo_integration. This makes the “activate/install this example” guidance misleading.You can derive the example and workflow dirs from
REPO_ROOT:-# Get script location and derive paths
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-EXAMPLE_DIR="$SCRIPT_DIR"
-REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
-CONFIG_DIR="$REPO_ROOT/examples/dynamo_integration/react_benchmark_agent/configs"
+# Get script location and derive paths
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
+EXAMPLE_DIR="$REPO_ROOT/examples/dynamo_integration"
+WORKFLOW_DIR="$EXAMPLE_DIR/react_benchmark_agent"
+CONFIG_DIR="$WORKFLOW_DIR/configs"
@@
echo "✗ NAT command not found"
echo " Please activate your NAT environment first:"
echo " cd $REPO_ROOT"
echo " source .venv/bin/activate"
echo ""
echo " Then install this example:"
- echo " cd $EXAMPLE_DIR"
- echo " cd $WORKFLOW_DIR"
echo " uv pip install -e ."
@@
- echo "For detailed setup instructions, see:"
- echo " $EXAMPLE_DIR/README.md"
- echo "For detailed setup instructions, see:"
- echo " $EXAMPLE_DIR/README.md"
After this change, the script will point users at the correct example package for `uv pip install -e .` and at the top-level Dynamo integration README for troubleshooting. Also applies to: 57-69, 174-187 </blockquote></details> <details> <summary>packages/nvidia_nat_adk/tests/test_adk_llm.py-185-191 (1)</summary><blockquote> `185-191`: **Fixture does not disable prefix headers as intended.** The docstring says "no header injection," but `DynamoModelConfig` defaults `prefix_template` to `"nat-dynamo-{uuid}"`. Without explicitly setting `prefix_template=None`, prefix headers will still be injected, causing `test_basic_creation_without_prefix` to fail. ```diff @pytest.fixture def dynamo_cfg_no_prefix(self): """Dynamo config without prefix template (no header injection).""" return DynamoModelConfig( model_name="test-model", base_url="http://localhost:8000/v1", + prefix_template=None, )examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.py-1-12 (1)
1-12: Missing Apache 2.0 license header.Per coding guidelines, all Python source files must include the SPDX Apache-2.0 header template. Add the standard license header at the top of the file.
Apply this diff:
+# SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + import logging from typing import Anyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.py-62-65 (1)
62-65: Remove unused variableeval_builder.Ruff flagged this as unused. The variable is assigned but never referenced.
- # Store reference to builder for accessing intent buffer - eval_builder = builderexternal/dynamo/start_dynamo_unified_thompson_hints.sh-36-39 (1)
36-39: Hardcoded absolute paths make the script non-portable.These paths are specific to a developer's machine and will not work for other users. Per PR comments, paths should be relative. Consider parameterizing these via environment variables or command-line arguments with sensible defaults.
Apply this diff to use environment variables with defaults:
# Local paths -LOCAL_MODEL_DIR="/raid/bbednarski/models/Llama-3.3-70B-Instruct" -LOCAL_REPO_DIR="/raid/bbednarski/NeMo-Agent-Toolkit" -CUSTOM_DYNAMO_DIR="$LOCAL_REPO_DIR/external/dynamo/generalized" +LOCAL_MODEL_DIR="${NAT_MODEL_DIR:-${HOME}/models/Llama-3.3-70B-Instruct}" +LOCAL_REPO_DIR="${NAT_REPO_DIR:-$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)}" +CUSTOM_DYNAMO_DIR="${LOCAL_REPO_DIR}/external/dynamo/generalized"examples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.py-1-31 (1)
1-31: Missing required SPDX Apache-2.0 license header.The script starts with a shebang and docstring but is missing the required license header.
Apply this diff:
#!/usr/bin/env python3 +# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + """ Unified plotting script for throughput metrics vs TSQ (Tool Selection Quality) scores.external/dynamo/start_dynamo_unified_thompson_hints.sh-1-2 (1)
1-2: Missing required SPDX Apache-2.0 license header.Per coding guidelines, all shell scripts must include the standard SPDX Apache-2.0 header at the top of the file.
Apply this diff to add the required license header:
#!/bin/bash +# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. # Dynamo SGLang FULL STACK with Thompson Sampling Router & Prefix Hints
🟡 Minor comments (23)
external/dynamo/README.md-208-208 (1)
208-208: Fix redundant "minutes seconds" phrase.The phrase "~5 minutes seconds" contains a typo; it should read "~5 minutes".
-**Startup time**: ~5 minutes seconds for 70B model +**Startup time**: ~5 minutes for 70B modelexamples/dynamo_integration/scripts/download_agent_leaderboard_v2.py-257-259 (1)
257-259: Redundant exception object inlogging.exceptioncall.Per Ruff TRY401 and coding guidelines,
logger.exception()automatically includes the exception information. Remove the redundant exception object from the format arguments.except Exception as e: - logger.exception("Failed to load domain %s: %s", domain, e) + logger.exception("Failed to load domain %s", domain) continueexamples/dynamo_integration/.env.example-2-5 (1)
2-5: Fix copy command to use the correct filenameThe example copy command should reference
.env.example, notenv.example, otherwise users will get a “No such file” error.-# cp env.example .env +# cp .env.example .envEverything else in this template looks reasonable for an example (placeholders, no real secrets).
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/config_dynamo_prefix_e2e_test.yml-13-18 (1)
13-18: Align “Key Features” comment with actual config fieldsThe bullet
enable_dynamic_prefix: trueis mentioned in comments but there is no corresponding field in thedynamo_llmconfig, while later comments sayprefix_templatealone enables dynamic prefixes. To avoid confusion, either:
- add the actual
enable_dynamic_prefixfield if it exists in the API, or- remove/adjust that bullet so it only refers to
prefix_templateand theprefix_*hints.Config structure and 2-space indentation look consistent with existing YAML style.
Based on learnings, 2-space indentation is correct for this repo.
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.py-6-9 (1)
6-9: Sort__all__entries to satisfy RuffRuff flags
__all__as unsorted; reordering keeps behavior the same and fixes lint.-from .action_completion_evaluator import action_completion_evaluator_function -from .tsq_evaluator import tsq_evaluator_function - -__all__ = ["tsq_evaluator_function", "action_completion_evaluator_function"] +from .action_completion_evaluator import action_completion_evaluator_function +from .tsq_evaluator import tsq_evaluator_function + +__all__ = ["action_completion_evaluator_function", "tsq_evaluator_function"]examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.py-4-7 (1)
4-7: Remove unusednoqadirective to satisfy RuffRuff reports RUF100 for the
# noqa: F401here. SinceF401isn’t enabled (and you’re already exportingregistervia__all__), the directive is unnecessary.-# Import register module to trigger function registrations -from . import register # noqa: F401 + # Import register module to trigger function registrations (side-effect registration) +from . import registerexamples/dynamo_integration/README.md-7-20 (1)
7-20: Fix typos, align naming, and update function-group exampleA few documentation nits worth fixing:
Typos / wording
- Line 9: duplicated word.
- Line 17: “preditve” spelling.
-... with respect to conversation length, system prompts, and tool-calling. We aim to codesign our inference servers to provide better performance in a repeatable, mock, decision-only evaluation harness. The harness uses ...
+... for conversation length, system prompts, and tool-calling. We aim to co-design our inference servers to provide better performance in a repeatable, mock, decision-only evaluation harness. The harness uses ...-Most of these examples could be tested ... However, the intended analysis analysis would require ...
+Most of these examples could be tested ... However, the intended analysis would require ...
@@
-- Dynamo Backend: Fast LLM inference with KV cache optimization (default Dynamo method) and a preditive Thompson sampling router (new implementation)
+- Dynamo Backend: Fast LLM inference with KV cache optimization (default Dynamo method) and a predictive Thompson sampling router (new implementation)2. **NAT naming consistency** Per repo guidance, in body text prefer “NVIDIA NeMo Agent toolkit” on first mention and “NeMo Agent toolkit” thereafter. You can keep the heading as-is (it already matches the heading rule), but adjust the first bold sentence accordingly: ```diff -**This set of example agents and evaluations demonstrate the capability to integrate NVIDIA NeMo Agent Toolkit (NAT) agents... +**This set of example agents and evaluations demonstrate the capability to integrate NVIDIA NeMo Agent toolkit (NAT) agents...
Function-group example
The config snippet under “Basic Evaluation (No Self-Evaluation)” still suggests listing individual tools from a function group:
- tool_names: [banking_tools.get_account_balance, ...]
For function groups, list the group name; all tools in the group are added automatically
- tool_names: [banking_tools]
This keeps the README aligned with both the style guidelines and the reviewer’s feedback about function-group usage.
Also applies to: 80-90, 185-204, 209-223
examples/dynamo_integration/scripts/run_concurrency_benchmark.sh-82-82 (1)
82-82: Inconsistent concurrency level count.Line 97 displays "Run X/8" but
CONCURRENCY_LEVELSonly contains 2 values (16, 32). Either expand the array to 8 levels or fix the progress indicator.-CONCURRENCY_LEVELS=(16 32) +# Add more levels if you want 8 runs, or fix the progress indicator below +CONCURRENCY_LEVELS=(1 2 4 8 16 32 48 64)Or fix the progress indicator:
- echo -e "${BLUE}[Run $((${#JOB_IDS[@]} + 1))/8]${NC} Running eval with max_concurrency = ${YELLOW}$CONCURRENCY${NC}" + echo -e "${BLUE}[Run $((${#JOB_IDS[@]} + 1))/${#CONCURRENCY_LEVELS[@]}]${NC} Running eval with max_concurrency = ${YELLOW}$CONCURRENCY${NC}"Also applies to: 97-97
examples/dynamo_integration/scripts/run_concurrency_benchmark.sh-176-182 (1)
176-182: Markdown report contains incorrect description.Line 178 says "1-8" concurrency levels but the script only tests 2 levels (16, 32). Update to match actual configuration.
## Summary -This benchmark evaluates Dynamo performance across different concurrency levels (1-8). +This benchmark evaluates Dynamo performance across different concurrency levels.examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/optimize_predictive_prefix_headers.yml-268-283 (1)
268-283: Weight comments don't match actual values.The inline comments describe weights as 50%/30%/20% but the actual values are 0.7/0.2/0.1 (70%/20%/10%).
# Primary: Minimize average LLM latency (most impactful) llm_latency: evaluator_name: avg_llm_latency direction: minimize - weight: 0.7 # 50% weight - primary objective + weight: 0.7 # 70% weight - primary objective # Secondary: Minimize total workflow runtime workflow_runtime: evaluator_name: avg_workflow_runtime direction: minimize - weight: 0.2 # 30% weight + weight: 0.2 # 20% weight # Tertiary: Minimize number of LLM calls (efficiency) num_calls: evaluator_name: avg_num_llm_calls direction: minimize - weight: 0.1 # 20% weight + weight: 0.1 # 10% weightexamples/dynamo_integration/ARCHITECTURE.md-1-3 (1)
1-3: Missing SPDX license header.Per coding guidelines, all source files including Markdown must include the SPDX Apache-2.0 header.
Add the license header at the top of the file:
+<!-- SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. --> +<!-- SPDX-License-Identifier: Apache-2.0 --> + # System Architectureexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md-464-467 (1)
464-467: Address or remove dead code reference.Line 466 documents
prefix_utils.pywith status "To be removed - imports from non-existent module". If this file doesn't exist or shouldn't be included, remove this entry from the documentation to avoid confusion.examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md-5-16 (1)
5-16: Fix broken internal link in Table of Contents.The TOC entry at line 16 links to
#deprecation-analysis--recommendations, but this section doesn't exist in the document. Either add the section or remove the TOC entry.Apply this diff to remove the broken link:
4. [Source Code Reference](#source-code-reference) 5. [Evaluators](#evaluators) -6. [Deprecation Analysis & Recommendations](#deprecation-analysis--recommendations)Or add the missing section before the end of the document.
examples/dynamo_integration/react_benchmark_agent/README.md-28-28 (1)
28-28: Fix spelling error: "explainations" → "explanations"-> **Note:** For a more abbreviated way to kick off experimentation, see the [Quick Start](../README.md#quick-start) section in the parent README. This document provides a more detailed explainations of the different test patterns and configurations available. +> **Note:** For a more abbreviated way to kick off experimentation, see the [Quick Start](../README.md#quick-start) section in the parent README. This document provides a more detailed explanations of the different test patterns and configurations available.examples/dynamo_integration/react_benchmark_agent/README.md-5-5 (1)
5-5: Fix spelling error: "exculsivly" → "exclusively"Static analysis flagged this typo.
-Currently this agent supports evaluation exculsivly for the [Galileo Agent Leaderboard v2](https://huggingface.co/datasets/galileo-ai/agent-leaderboard-v2). However, we plan to extend the set of evaluation tool sets and benchmarks and will update this document accordingly. +Currently this agent supports evaluation exclusively for the [Galileo Agent Leaderboard v2](https://huggingface.co/datasets/galileo-ai/agent-leaderboard-v2). However, we plan to extend the set of evaluation tool sets and benchmarks and will update this document accordingly.external/dynamo/generalized/router.py-964-964 (1)
964-964: Typo in comment.Line 964 has "EMAThe" which appears to be a typo for "EMA".
- # Latency EMAThe + # Latency EMAexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.py-145-155 (1)
145-155: Remove unused variableexpected_tsq.The variable
expected_tsqis assigned but never used in the test assertion.def test_scenario_000_score(self): """Verify the score from banking_scenario_000.""" tool_accuracy = 4 / 7 # 0.5714285714285715 param_accuracy = 0.0 - expected_tsq = 1.0 * tool_accuracy + 0.0 * param_accuracy - # = 1.0 * 0.5714... + 0.0 * 0 - # = 0.5714... + # expected: 1.0 * tool_accuracy + 0.0 * param_accuracy = 0.5714... tsq = calculate_tsq_score(tool_accuracy, param_accuracy) assert tsq == pytest.approx(4 / 7)examples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.py-227-230 (1)
227-230: Simplify and fix the None handling test.The assertion on line 230 is convoluted and will always pass due to the
or Truefallback. This doesn't actually test None handling.def test_empty_tool_name(self): """Verify empty tool names are handled.""" assert normalize_tool_name("") == "" - assert normalize_tool_name(None) == "" if normalize_tool_name(None) is not None else True + # After fixing normalize_tool_name to handle None: + assert normalize_tool_name(None) == ""external/dynamo/generalized/router.py-1-29 (1)
1-29: Missing SPDX license header and duplicate import.The file has an abbreviated license header and a duplicate
asyncioimport on line 7.Apply this diff:
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA # SPDX-License-Identifier: Apache-2.0 +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. import asyncio import argparse import uvloop -import asyncio import loggingexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py-1-3 (1)
1-3: Missing SPDX license header.Per coding guidelines, all source files must include the SPDX Apache-2.0 header template.
Apply this diff to add the required license header:
+# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 + #!/usr/bin/env python3 """ Histogram plotting script for throughput metrics distribution analysis.external/dynamo/generalized/frontend.py-367-371 (1)
367-371: Uselogging.exceptionand chain the exception properly.Per coding guidelines: use
logger.exception()to capture full stack trace, and chain exceptions withraise ... from err.except HTTPException: raise except Exception as e: - logging.error("Error in chat completions: %s", e) - raise HTTPException(status_code=500, detail=str(e)) + logging.exception("Error in chat completions") + raise HTTPException(status_code=500, detail=str(e)) from eexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_predictive_prefix_headers.yml-228-230 (1)
228-230: Fix inconsistent comment: spike_threshold value and description do not match.Line 230 has a comment stating
# Alert when concurrent functions >= 5, but the actualspike_thresholdvalue is 24. This mismatch is misleading and makes the intention unclear.Correct the comment to match the threshold value:
concurrency_spike_analysis: enable: true - spike_threshold: 24 # Alert when concurrent functions >= 5 + spike_threshold: 24 # Alert when concurrent functions >= 24examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_predictive_prefix_headers.yml-143-147 (1)
143-147: Remove or clarify inline comments showing alternative parameter values.Lines 145–147 contain inline comments with different values (e.g.,
10 # 6,MEDIUM # HIGH,MEDIUM # LOW). These are confusing and suggest uncertainty about the intended configuration. Either remove these comments or document their purpose explicitly.Apply this diff to remove misleading comments:
# Note: Setting prefix_template enables dynamic prefix headers prefix_template: "react-benchmark-{uuid}" - prefix_total_requests: 10 # 6 - prefix_osl: MEDIUM # HIGH - prefix_iat: MEDIUM # LOW + prefix_total_requests: 10 + prefix_osl: MEDIUM + prefix_iat: MEDIUM # Enable parameter optimization for this LLMCommittable suggestion skipped: line range outside the PR's diff.
willkill07
left a comment
There was a problem hiding this comment.
Adding review for nvidia_nat_adk, nvidia_nat_langchain, and src/nat/llm/dynamo_llm.py
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
external/dynamo/start_dynamo_unified_thompson_hints.sh (1)
1-5: Add required SPDX Apache-2.0 license header.Per coding guidelines, all source files including shell scripts must include the SPDX Apache-2.0 header.
Apply this diff:
#!/bin/bash +# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 # Dynamo SGLang FULL STACK with Thompson Sampling Router & Prefix Hintsexternal/dynamo/start_dynamo_unified.sh (1)
1-5: Add required SPDX Apache-2.0 license header.Per coding guidelines, all source files including shell scripts must include the SPDX Apache-2.0 header.
Apply this diff:
#!/bin/bash +# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 # Dynamo SGLang FULL STACK with Unified Worker
🧹 Nitpick comments (18)
external/dynamo/.env.example (1)
1-72: Well-organized environment template with clear structure.The
.env.examplefile is properly organized with five logical sections (GPU, Network, Model, Disaggregated, Repository), consistent DYNAMO_ variable naming, and well-documented defaults. The structure aligns with the Dynamo integration's support for both unified and disaggregated serving modes.A minor suggestion: Consider whether this template should also support standard dotenv format (without
exportkeywords) for compatibility with Python-based dotenv loaders used elsewhere in the codebase. If the startup scripts use shell sourcing, the current format is appropriate.external/dynamo/README.md (3)
3-3: Use full toolkit name on first mention.Per coding guidelines, the first reference should be "NVIDIA NeMo Agent Toolkit", with "NeMo Agent Toolkit" used subsequently.
-This guide covers setting up, running, and configuring the NVIDIA Dynamo backend for the React Benchmark Agent evaluations. +This guide covers setting up, running, and configuring the NVIDIA Dynamo backend for the NeMo Agent Toolkit's React Benchmark Agent evaluations.Or update line 3 to:
-> ⚠️ **EXPERIMENTAL**: The NeMo Agent Toolkit and Dynamo integration is experimental and under active development. APIs, configurations, and features may change without notice. +> ⚠️ **EXPERIMENTAL**: The NVIDIA NeMo Agent Toolkit and Dynamo integration is experimental and under active development. APIs, configurations, and features may change without notice.
33-150: Specify language identifiers for code blocks.Multiple code blocks are missing language specifications (MD040). Add appropriate language identifiers to improve syntax highlighting and readability.
Examples of fixes:
-``` +``` ┌──────────────────────────────────────────────────────────────────────────────┐-``` +```output +-----------------------------------------------------------------------------------------+ | NVIDIA-SMI 580.65.06 Driver Version: 580.65.06 CUDA Version: 13.0 |-``` +```json {"choices": [...],Use appropriate language identifiers:
bashorshellfor shell commands/outputjsonfor JSON examplesyamlfor YAML configurationstextoroutputfor command output (nvidia-smi, test results, etc.)- Leave empty for ASCII diagrams if no highlighting is applicable
Also applies to: 236-261, 410-445, 466-483, 513-523, 703-708, 906-922
447-447: Consider more formal wording for verb choice.Static analysis suggests using a more formal verb than "provides" in this context.
-If any tests fail, the script provides guidance on how to fix the issue. +If any tests fail, the script offers guidance on how to fix the issue.Or alternatively: "the script supplies", "the script includes", or "you'll find".
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.yml (1)
35-63: Consider using the function group reference instead of the explicit tool list to reduce verbosity.The
react_workflowlists all 19 banking tools individually (lines 38–59) instead of referencing thebanking_toolsfunction group (lines 104–130). The workflow system supports both patterns, but using the group name directly is simpler and maintains consistency with the function group design pattern. This would changetool_namesfrom a 20-line list to a single line:tool_names: [banking_tools].examples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py (13)
125-129: Avoid catching broadException; catch specific exceptions instead.Catching bare
Exceptioncan mask unexpected errors. Narrow the exception type topd.errors.ParserErrororpd.errors.EmptyDataErrorfor CSV parsing issues.try: df = pd.read_csv(csv_path) - except Exception as e: + except (pd.errors.ParserError, pd.errors.EmptyDataError, FileNotFoundError) as e: print(f" Warning: Error reading CSV {csv_path}: {e}") return None, None
144-144: Rename unused loop variableidxto_.The loop variable is not used within the loop body. Per static analysis and convention, rename it to
_.- for idx, row in example_df.iterrows(): + for _, row in example_df.iterrows():
216-227: Fix implicitOptionaltype hint.PEP 484 prohibits implicit
Optional. Use explicit union type for the parameter default.-def collect_job_data_from_dir(jobs_dir: Path, experiment_label: str = None) -> tuple[list[dict], list[dict]]: +def collect_job_data_from_dir(jobs_dir: Path, experiment_label: str | None = None) -> tuple[list[dict], list[dict]]:
352-363: Addstrict=Truetozip()for safety.Adding
strict=Trueensures the iterables have equal length and prevents silent data loss.- for label, key in zip(stat_labels, stat_keys): + for label, key in zip(stat_labels, stat_keys, strict=True):
368-371: Remove assignment to unused variable.The
text_objvariable is assigned but never used. Either remove the assignment or prefix with_if kept for debugging.- text_obj = ax.text(0.98, 0.98, table_text, transform=ax.transAxes, fontsize=8, + ax.text(0.98, 0.98, table_text, transform=ax.transAxes, fontsize=8,
404-415: Addstrict=Truetozip()for safety.Same as above - adding
strict=Trueensures the iterables have equal length.- for label, key in zip(stat_labels, stat_keys): + for label, key in zip(stat_labels, stat_keys, strict=True):
433-449: Add return type hint.Per coding guidelines, all public APIs require type hints on return values. Since this function returns
None, add the explicit type hint.def create_histogram_plots(df: pd.DataFrame, output_dir: Path, - llm_call_df: pd.DataFrame = None): + llm_call_df: pd.DataFrame | None = None) -> None:
493-493: Prefix unusedfigvariable with underscore.The
figvariable fromplt.subplots()is not used. Rename to_to indicate intentional non-use.- fig, ax = plt.subplots(figsize=(10, 7)) + _, ax = plt.subplots(figsize=(10, 7))
604-618: Add return type hint and fix implicitOptional.Per coding guidelines, add explicit return type and use
| Noneinstead of implicitOptional.def create_summary_histogram_plot(df: pd.DataFrame, output_dir: Path, - llm_call_df: pd.DataFrame = None): + llm_call_df: pd.DataFrame | None = None) -> None:
620-621: Prefix unusedfigvariable with underscore.- fig, axes = plt.subplots(2, 3, figsize=(15, 10)) + _, axes = plt.subplots(2, 3, figsize=(15, 10))
664-667: Addstrict=Truetozip()for safety.The
axesarray has 6 elements andall_metricshas 6 items, so they should match. Addingstrict=Trueensures this invariant is enforced.- for ax, (metric_col, label, data_df, num_bins) in zip(axes, all_metrics): + for ax, (metric_col, label, data_df, num_bins) in zip(axes, all_metrics, strict=True):
767-786: Add return type hint formain().Per coding guidelines, add explicit return type hint.
-def main(): +def main() -> None:
823-823: Remove extraneousfprefix from string without placeholders.This f-string has no interpolation; use a regular string instead.
- print(f"Histogram bins: TTFT/ITL/TPS=100, Tokens=50, Calls=25, Duration=25") + print("Histogram bins: TTFT/ITL/TPS=100, Tokens=50, Calls=25, Duration=25")
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
examples/dynamo_integration/react_benchmark_agent/configs(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.yml(1 hunks)examples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py(1 hunks)external/dynamo/.env.example(1 hunks)external/dynamo/README.md(1 hunks)external/dynamo/start_dynamo_disagg.sh(1 hunks)external/dynamo/start_dynamo_unified.sh(1 hunks)external/dynamo/start_dynamo_unified_thompson_hints.sh(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- examples/dynamo_integration/react_benchmark_agent/configs
- external/dynamo/start_dynamo_disagg.sh
🧰 Additional context used
📓 Path-based instructions (11)
**/*.{py,js,ts,java,cpp,c,go,rb,php,sh}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All source files must include the SPDX Apache-2.0 header template
Files:
external/dynamo/start_dynamo_unified.shexternal/dynamo/start_dynamo_unified_thompson_hints.shexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values (except for return values of
None,
in that situation no return type hint is needed).
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
external/dynamo/start_dynamo_unified.shexternal/dynamo/README.mdexternal/dynamo/start_dynamo_unified_thompson_hints.shexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
**/*.{md,rst,py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent references
Files:
external/dynamo/README.mdexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
**/*.{md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use 'NeMo Agent Toolkit' (capitalize 'T') when the name appears in headings
Files:
external/dynamo/README.md
**/*.{py,js,ts,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Indent with 4 spaces, never tabs, and ensure every file ends with a single newline
Files:
external/dynamo/README.mdexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
**/*.{py,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Version numbers are derived automatically by setuptools-scm; never hard-code them in code or docs
Files:
external/dynamo/README.mdexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
**/*.{py,toml,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use abbreviations: 'nat' for API namespace and CLI tool, 'nvidia-nat' for package name, 'NAT' for environment variable prefixes and informal comments
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
**/*.{py,env,toml,yaml,yml,json}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never commit API keys, credentials or personal data; use environment variables or .env files excluded from Git
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
examples/**/*
⚙️ CodeRabbit configuration file
examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.
- If an example contains Python code, it should be placed in a subdirectory named
src/and should
contain apyproject.tomlfile. Optionally, it might also contain scripts in ascripts/directory.- If an example contains YAML files, they should be placed in a subdirectory named
configs/. - If an example contains sample data files, they should be placed in a subdirectory nameddata/, and should
be checked into git-lfs.
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.py: Follow PEP 20 and PEP 8 for Python style guidelines
Run yapf second (PEP 8 base, column_limit = 120) for Python formatting
Use ruff check --fix (via pre-commit) for linting using configuration embedded in pyproject.toml, fix warnings unless explicitly ignored
Use snake_case for functions and variables, PascalCase for classes, UPPER_CASE for constants
All public APIs require Python 3.11+ type hints on parameters and return values
Prefer collections.abc / typing abstractions (Sequence over list) for type hints
Use typing.Annotated for units or extra metadata when useful
Treat pyright warnings (configured in pyproject.toml) as errors during development
Preserve stack traces and prevent duplicate logging when handling exceptions; use bare raise statements and logger.error() when re-raising
When catching and logging exceptions without re-raising, always use logger.exception() to capture the full stack trace information
Provide Google-style docstrings for every public module, class, function and CLI command
The first line of docstrings must be a concise description ending with a period (Vale checks this)
Surround code entities with backticks in docstrings to avoid Vale false-positives
Validate and sanitise all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work (HTTP, DB, file reads)
Cache expensive computations with functools.lru_cache or an external cache when appropriate
Leverage NumPy vectorised operations whenever beneficial and feasible
Files:
examples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
**/*.{py,js,ts,java,cpp,c,go,rb,php}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Every file must start with the standard SPDX Apache-2.0 header
Files:
examples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
🧠 Learnings (4)
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to **/*.{py,js,ts,java,cpp,c,go,rb,php} : Every file must start with the standard SPDX Apache-2.0 header
Applied to files:
external/dynamo/start_dynamo_unified.sh
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to **/*.{py,js,ts,java,cpp,c,go,rb,php,sh} : All source files must include the SPDX Apache-2.0 header template
Applied to files:
external/dynamo/start_dynamo_unified.sh
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: CI verifies SPDX headers via ci/scripts/github/checks.sh; do not bypass this check
Applied to files:
external/dynamo/start_dynamo_unified.sh
📚 Learning: 2025-12-12T20:49:44.305Z
Learnt from: zterek
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1243
File: examples/risk_and_security/retail_agent/src/nat_retail_agent/configs/red-teaming.yml:1-98
Timestamp: 2025-12-12T20:49:44.305Z
Learning: In the NVIDIA/NeMo-Agent-Toolkit repository, YAML files generally use 2-space indentation. When reviewing YAML, prefer 2-space indentation to match the existing style over a 4-space guideline until a repo-wide standardization is performed. This applies to YAML configuration files (e.g., red-teaming.yml) and, more broadly, all *.yml files in the project.
Applied to files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.yml
🪛 dotenv-linter (4.0.0)
external/dynamo/.env.example
[warning] 2-2: [QuoteCharacter] The value has quote characters (', ")
(QuoteCharacter)
[warning] 4-4: [QuoteCharacter] The value has quote characters (', ")
(QuoteCharacter)
🪛 LanguageTool
external/dynamo/README.md
[style] ~447-~447: Consider using a different verb for a more formal wording.
Context: ... the script provides guidance on how to fix the issue. --- ## Stopping Dynamo A ...
(FIX_RESOLVE)
🪛 markdownlint-cli2 (0.18.1)
external/dynamo/README.md
33-33: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
236-236: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
410-410: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
466-466: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
513-513: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
703-703: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
906-906: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.14.8)
examples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
11-11: Docstring contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF002)
95-95: Consider moving this statement to an else block
(TRY300)
127-127: Do not catch blind exception: Exception
(BLE001)
144-144: Loop control variable idx not used within loop body
Rename unused idx to _idx
(B007)
216-216: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
329-329: Docstring contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF002)
349-349: String contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF001)
352-352: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
368-368: Local variable text_obj is assigned to but never used
Remove assignment to unused variable text_obj
(F841)
401-401: String contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF001)
404-404: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
493-493: Unpacked variable fig is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
570-570: Prefer next(iter(job_stats.values())) over single element slice
Replace with next(iter(job_stats.values()))
(RUF015)
620-620: Unpacked variable fig is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
664-664: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
734-734: String contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF001)
741-741: String contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF001)
823-823: f-string without any placeholders
Remove extraneous f prefix
(F541)
🔇 Additional comments (17)
external/dynamo/.env.example (1)
2-4: Static analysis warnings are false positives in shell context.The
dotenv-lintertool flagsQuoteCharacterwarnings on lines 2-4, but these are false positives because the file uses shellexportsyntax (not standard dotenv format). Quotes around paths are appropriate and recommended in shell scripts to handle potential whitespace or special characters. No action needed.external/dynamo/README.md (2)
275-330: Approve comprehensive Dynamo setup documentation.The README effectively addresses the PR objectives by providing:
- Clear prerequisites section (hardware and software requirements)
- Three deployment modes with explicit advantages
- Detailed architecture diagrams
- Step-by-step setup instructions
- Health checks and validation tests
- Troubleshooting guidance
- Configuration reference with environment variables
The documentation is well-structured, user-friendly, and should enable practitioners to set up and validate the Dynamo integration effectively.
1-10: Add required SPDX Apache-2.0 license header.Per coding guidelines, all source files including markdown documentation must include the SPDX Apache-2.0 header.
Apply this diff:
+# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 + # Dynamo Backend Setup Guide⛔ Skipped due to learnings
Learnt from: CR Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0 File: .cursor/rules/general.mdc:0-0 Timestamp: 2025-11-24T18:56:53.109Z Learning: Applies to **/*.{md,rst,py} : Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent referencesLearnt from: CR Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0 File: .cursor/rules/general.mdc:0-0 Timestamp: 2025-11-24T18:56:53.109Z Learning: Applies to **/*.{md,rst} : Use 'NeMo Agent Toolkit' (capitalize 'T') when the name appears in headingsLearnt from: zterek Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1243 File: examples/risk_and_security/retail_agent/src/nat_retail_agent/configs/red-teaming.yml:1-98 Timestamp: 2025-12-12T20:49:51.370Z Learning: In the NVIDIA/NeMo-Agent-Toolkit repository, most YAML configuration files use 2-space indentation rather than 4-space indentation. When reviewing YAML files, prefer consistency with the existing codebase style (2-space indentation) over the stated coding guideline (4-space indentation) until a repository-wide standardization is performed.Learnt from: CR Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0 File: .cursor/rules/general.mdc:0-0 Timestamp: 2025-11-24T18:56:53.109Z Learning: Applies to **/*.{py,js,ts,java,cpp,c,go,rb,php} : Every file must start with the standard SPDX Apache-2.0 headerexternal/dynamo/start_dynamo_unified_thompson_hints.sh (5)
79-93: Approve custom component validation logic.The verification of custom components (frontend.py, router.py, processor.py) before container startup is a solid defensive practice that prevents silent failures later.
166-193: Approve pragmatic HuggingFace token handling.The fallback logic for HF_TOKEN is well-designed:
- Checks for cached local model first
- Uses dummy token if model exists locally
- Prompts user for token if needed
- Provides clear guidance on skipping
This approach balances security, user experience, and failure recovery.
264-294: Approve robust worker initialization wait function.The
wait_for_workerfunction demonstrates excellent defensive coding:
- Verifies process is still running
- Provides progress updates
- Has appropriate timeout (120s max)
- Assumes initialization after 60s for long-loading models like 70B
- Reports clear error messages on failure
322-356: Approve comprehensive startup sequence and diagnostics.The multi-stage startup (ETCD → NATS → Worker → Router → Processor → Frontend) with detailed logging and health checks demonstrates excellent orchestration:
- Each component outputs its PID for debugging
- Clear step numbering aids problem diagnosis
- Metrics CSV paths are documented
- Request flow diagram helps users understand data paths
424-444: Approve process monitoring loop.The monitoring loop that detects failures in critical processes (Frontend, Processor, Router, Worker) and exits immediately is appropriate. This prevents the container from running in a degraded state.
external/dynamo/start_dynamo_unified.sh (4)
29-32: Approve flexible port configuration.Non-default port selections for ETCD (2379) and NATS (4222) with explicit environment variable overrides prevent conflicts with other services and documentation makes this clear.
78-99: Approve ETCD peer configuration.The addition of
ETCD_PEER_PORT(line 31) and peer URL configuration (lines 80-82) enables future clustering without breaking backward compatibility.
230-260: Approve robust infrastructure verification.The in-container verification of ETCD and NATS accessibility with clear error messages ensures the container detects configuration issues early rather than failing mysteriously during worker startup.
405-459: Approve NAT integration documentation.The inclusion of NAT integration test commands (lines 405-414) in the startup output helps users transition from basic validation to end-to-end workflow evaluation.
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.yml (1)
1-261: Confirm file ends with a single newline.Per coding guidelines, all files should end with a single newline. Please verify that this file has a trailing newline after line 261.
examples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py (4)
42-44: LGTM!Simple utility function with proper type hints and docstring.
47-49: LGTM!Simple utility function with proper type hints and docstring.
52-99: LGTM!Function correctly parses TSQ scores with proper error handling and type hints.
293-321: LGTM!Function correctly aggregates data from multiple directories with proper type hints.
examples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
Show resolved
Hide resolved
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 19
🧹 Nitpick comments (4)
external/dynamo/.env.example (1)
1-1: Optional: Add brief usage instruction at the top.Consider adding a one-line comment explaining how to use this file (e.g.,
# Source this file or copy values to your shell environment). This will help users unfamiliar with.env.examplepatterns understand the intended usage quickly.Example:
+# Usage: Source this file or copy variable definitions to your shell environment +# source .env.example +# or manually export the variables below + # Required: Set your model directory pathexamples/dynamo_integration/README.md (1)
9-9: Consider rewording "with respect to" for clarity.Line 9 uses "LLM requests have predictable patterns with respect to conversation length..." — this phrasing is somewhat verbose. Consider: "Agentic LLM requests have predictable patterns in conversation length, system prompts, and tool-calling." This is more direct and reads more smoothly.
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.yml (1)
126-128: Clarify commented configuration alternatives.Lines 126–128 contain commented-out values (e.g.,
prefix_total_requests: 10 # 6,prefix_osl: MEDIUM # HIGH). If these are intentional for benchmarking/testing purposes, consider adding a brief comment explaining why alternatives are documented. If they should be removed, clean them up.examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.yml (1)
79-86: Address the commented optimizer fields and NAT type resolution bug reference.Line 79 references a "NAT type resolution bug" preventing optimizer parameters. This should either:
- Link to a tracked issue/GitHub issue number for future resolution
- Include a temporal note (e.g., "FIXME: Re-enable after NAT issue #XYZ is resolved")
- Be updated if the bug has been fixed
Consider replacing the generic "bug" reference with traceable issue tracking information.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
examples/dynamo_integration/ARCHITECTURE.md(1 hunks)examples/dynamo_integration/README.md(1 hunks)examples/dynamo_integration/react_benchmark_agent/README.md(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_full_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/optimize_rethinking_full_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.yml(1 hunks)external/dynamo/.env.example(1 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{py,toml,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use abbreviations: 'nat' for API namespace and CLI tool, 'nvidia-nat' for package name, 'NAT' for environment variable prefixes and informal comments
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/optimize_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.yml
**/*.{py,js,ts,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Indent with 4 spaces, never tabs, and ensure every file ends with a single newline
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/optimize_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/README.mdexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.ymlexamples/dynamo_integration/README.mdexamples/dynamo_integration/ARCHITECTURE.mdexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md
**/*.{py,env,toml,yaml,yml,json}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never commit API keys, credentials or personal data; use environment variables or .env files excluded from Git
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/optimize_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.yml
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values (except for return values of
None,
in that situation no return type hint is needed).
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/optimize_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/README.mdexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.ymlexamples/dynamo_integration/README.mdexamples/dynamo_integration/ARCHITECTURE.mdexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md
examples/**/*
⚙️ CodeRabbit configuration file
examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.
- If an example contains Python code, it should be placed in a subdirectory named
src/and should
contain apyproject.tomlfile. Optionally, it might also contain scripts in ascripts/directory.- If an example contains YAML files, they should be placed in a subdirectory named
configs/. - If an example contains sample data files, they should be placed in a subdirectory nameddata/, and should
be checked into git-lfs.
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/optimize_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/README.mdexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.ymlexamples/dynamo_integration/README.mdexamples/dynamo_integration/ARCHITECTURE.mdexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md
**/*.{md,rst,py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent references
Files:
examples/dynamo_integration/react_benchmark_agent/README.mdexamples/dynamo_integration/README.mdexamples/dynamo_integration/ARCHITECTURE.mdexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md
**/*.{md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use 'NeMo Agent Toolkit' (capitalize 'T') when the name appears in headings
Files:
examples/dynamo_integration/react_benchmark_agent/README.mdexamples/dynamo_integration/README.mdexamples/dynamo_integration/ARCHITECTURE.mdexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md
**/*.{py,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Version numbers are derived automatically by setuptools-scm; never hard-code them in code or docs
Files:
examples/dynamo_integration/react_benchmark_agent/README.mdexamples/dynamo_integration/README.mdexamples/dynamo_integration/ARCHITECTURE.mdexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md
🧠 Learnings (2)
📚 Learning: 2025-12-12T20:49:44.305Z
Learnt from: zterek
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1243
File: examples/risk_and_security/retail_agent/src/nat_retail_agent/configs/red-teaming.yml:1-98
Timestamp: 2025-12-12T20:49:44.305Z
Learning: In the NVIDIA/NeMo-Agent-Toolkit repository, YAML files generally use 2-space indentation. When reviewing YAML, prefer 2-space indentation to match the existing style over a 4-space guideline until a repo-wide standardization is performed. This applies to YAML configuration files (e.g., red-teaming.yml) and, more broadly, all *.yml files in the project.
Applied to files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/optimize_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.yml
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to **/*.{md,rst,py} : Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent references
Applied to files:
examples/dynamo_integration/README.md
🪛 dotenv-linter (4.0.0)
external/dynamo/.env.example
[warning] 2-2: [QuoteCharacter] The value has quote characters (', ")
(QuoteCharacter)
[warning] 4-4: [QuoteCharacter] The value has quote characters (', ")
(QuoteCharacter)
🪛 LanguageTool
examples/dynamo_integration/react_benchmark_agent/README.md
[grammar] ~5-~5: Ensure spelling is correct
Context: ...urrently this agent supports evaluation exculsivly for the [Galileo Agent Leaderboard v2](...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~28-~28: Ensure spelling is correct
Context: ... This document provides a more detailed explainations of the different test patterns and conf...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
examples/dynamo_integration/README.md
[style] ~9-~9: ‘with respect to’ might be wordy. Consider a shorter alternative.
Context: ... LLM requests have predictable patterns with respect to conversation length, system prompts, an...
(EN_WORDINESS_PREMIUM_WITH_RESPECT_TO)
[grammar] ~17-~17: Ensure spelling is correct
Context: ...imization (default Dynamo method) and a preditive Thompson sampling router (new implement...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.18.1)
examples/dynamo_integration/react_benchmark_agent/README.md
330-330: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
346-346: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
427-427: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
522-522: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
570-570: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
645-645: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
examples/dynamo_integration/README.md
103-103: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
examples/dynamo_integration/ARCHITECTURE.md
19-19: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
255-255: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
333-333: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
464-464: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
552-552: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md
16-16: Link fragments should be valid
(MD051, link-fragments)
22-22: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
119-119: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
217-217: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
369-369: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
431-431: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (20)
external/dynamo/.env.example (2)
1-4: QuoteCharacter warnings are false positives—quotes are correct in bash export context.The dotenv-linter warnings on lines 2 and 4 flag quotes in path values. However, since this file uses bash
exportstatements (not standard.envformat), the quotes are necessary and correct. They prevent word splitting and properly handle paths with spaces. These can be safely ignored.
1-68: Well-organized configuration template with clear sections and sensible defaults.The file is logically structured with:
- Required variables prominently listed
- Optional variables grouped by concern (GPU, Network, Model, Disaggregated Mode)
- Descriptive comments, default values, and example usage for each variable
This provides good guidance for users setting up the Dynamo integration.
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md (1)
465-467: Clarify the deprecation status ofprefix_utils.py.The note states this file "imports from non-existent module" and is "To be removed". Please clarify:
- Is this file currently in the codebase or already removed?
- If still present, should it be removed as part of this PR?
- If it remains, what is the migration path for users currently relying on it?
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.yml (2)
29-220: YAML indentation follows repo convention.The file uses 2-space indentation throughout, which aligns with the established pattern in this repository (per learnings from PR 1243) rather than the generic 4-space guideline. This is correct and consistent with existing configurations.
1-220: The file already ends with a single newline as required by the coding guidelines. No action needed.examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/optimize_rethinking_full_test.yml (8)
1-2: File header and license formatting look correct.Apache 2.0 header is present with current copyright year (2024-2025). Per coding guidelines, this is correctly placed.
4-35: Excellent comprehensive documentation and parameter reference.The header documentation is well-structured with clear explanations of Dynamo prefix parameters (prefix_osl, prefix_iat, prefix_total_requests) and router behavior. The ASCII table and bullet points make it easy to understand optimization targets. This is exemplary documentation.
113-165: Dynamo LLM configuration with optimizable parameters is well-structured.The dynamo_llm section properly defines:
- Model configuration (model_name, base_url, temperature, max_tokens)
- Prefix template and parameters (prefix_template, prefix_total_requests, prefix_osl, prefix_iat)
- Optimizable parameters list and search space definitions (lines 142-156)
The search_space definitions are clear and reasonable. The eval_llm secondary LLM is appropriately minimal for evaluation purposes.
167-174: Workflow configuration using self-evaluation pattern is clear.The workflow wraps react_workflow with self_evaluating_agent_with_feedback, properly configuring evaluator_llm, max_retries, and confidence threshold. The pass_feedback_to_agent flag enables iterative improvement. Configuration is sound.
232-279: Optimizer configuration with weighted metrics is well-designed.The optimizer section properly defines:
- reps_per_param_set=1 (appropriate for testing; noted in comment)
- eval_metrics with weights summing to 1.0 (0.7 + 0.2 + 0.1 = 1.0) ✓
- Numeric optimization with grid sampler (note: comment correctly identifies that 180 combinations may be excessive)
- Prompt optimization intentionally disabled (line 279) with clear rationale
The configuration is logically sound for runtime performance tuning.
1-280: File correctly ends with a single newline. The file complies with the coding guideline requirement that all YAML files must end with a single newline character.
229-229: Evaluator typeavg_tokens_per_llm_endis correctly registered.The type name is properly registered in
src/nat/eval/runtime_evaluator/register.pyasAverageTokensPerLLMEndConfig(EvaluatorBaseConfig, name="avg_tokens_per_llm_end")and is used consistently across the codebase in other example configurations.
88-88: The relative path is correct for the intended workflow. The README explicitly documents in Step 3 of the Quick Start section that the dataset must be downloaded first usingpython scripts/download_agent_leaderboard_v2.py --domains banking. No changes needed.examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_full_test.yml (4)
107-136: System prompt is well-structured and comprehensive.The custom system prompt clearly instructs the agent on tool selection evaluation, single-tool-per-turn constraint, and placeholder value handling. This is appropriately documented with CRITICAL RULES section.
1-3: Apache License header and copyright years are correct.The file properly includes the SPDX license identifier and up-to-date copyright years.
59-83: Dynamo LLM parameters are properly supported and validated.The Dynamo LLM provider implementation in
src/nat/llm/dynamo_llm.pyproperly recognizes, validates, and documents all configured parameters (prefix_template,prefix_total_requests,prefix_osl,prefix_iat). Each parameter is defined with Pydantic field validation, detailed docstring documentation, and comprehensive test coverage. Graceful error handling includes fallback behavior whenprefix_templateis None and bounds validation forprefix_total_requests(1-50 range). The configuration is correct.
33-33: No action needed. The paths in the configuration files are correct.The data files referenced in the config (
./examples/dynamo_integration/data/raw/banking/tools.jsonand./examples/dynamo_integration/data/agent_leaderboard_v2_banking.json) are meant to be executed from the project root usingnat eval, as documented in the Quick Start guide. The data directory does not yet exist because these files must first be downloaded using the download script:python scripts/download_agent_leaderboard_v2.py --domains banking. The paths are properly designed for this documented execution model and should not be changed to relative paths like../../data/....Likely an incorrect or invalid review comment.
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.yml (3)
98-108: Inconsistent workflow configuration between full and minimal test variants.File 1 (
eval_config_no_rethinking_full_test.yml) includesmax_history: 1000in the workflow section (line 103), but this file omits it. Clarify whether this omission is intentional or if it should be added for consistency. If intentional, document the reason in a comment.
142-148: Profiler configuration differs significantly from full test variant.This minimal test variant has most profiler features enabled (lines 159-183), while the full test disables them. Document the rationale for this difference if it's intentional. If the full test should also have these features enabled, consider enabling them there as well for consistency.
1-3: Apache License header and copyright years are correct.The file properly includes the SPDX license identifier and up-to-date copyright years.
...react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.yml
Outdated
Show resolved
Hide resolved
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md
Outdated
Show resolved
Hide resolved
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 16
♻️ Duplicate comments (2)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.yml (1)
211-211: Fix comment mismatch on spike_threshold.The comment states the threshold alerts when concurrent functions >= 5, but the value is set to 24. This discrepancy could mislead developers about the actual threshold behavior.
Clarify the intent and update either the comment or the value. If the threshold should alert at 5 concurrent functions, apply this diff:
- spike_threshold: 24 # Alert when concurrent functions >= 5 + spike_threshold: 5 # Alert when concurrent functions >= 5If 24 is the intended threshold, update the comment to reflect the actual behavior:
- spike_threshold: 24 # Alert when concurrent functions >= 5 + spike_threshold: 24 # Alert when concurrent functions >= 24examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.yml (1)
88-89: Ensure referenced tools/dataset JSON files exist at the configured paths.The config depends on:
tools_json_path: ./examples/dynamo_integration/data/raw/banking/tools.jsonfile_path: ./examples/dynamo_integration/data/agent_leaderboard_v2_banking.jsonIf these files are absent or moved, the banking tool group and evaluation dataset will fail to load at runtime. Please confirm they are present at these exact paths or update the config accordingly.
#!/bin/bash set -euo pipefail for f in \ "examples/dynamo_integration/data/raw/banking/tools.json" \ "examples/dynamo_integration/data/agent_leaderboard_v2_banking.json" do if [[ -f "$f" ]]; then echo "✓ Found: $f" else echo "✗ Missing: $f" fi doneAlso applies to: 182-184
🧹 Nitpick comments (20)
examples/dynamo_integration/scripts/download_agent_leaderboard_v2.py (1)
205-205: Consider the default domain selection.The default is set to only
["banking"]with a comment "for testing". For a production script, consider either using all available domains by default or documenting why banking-only is the intended default behavior.If all domains should be the default:
if domains is None: - domains = ["banking"] # Start with just one for testing - logger.info("No domains specified, using default: %s", domains) + domains = available_domains + logger.info("No domains specified, using all available domains: %s", domains)external/dynamo/start_dynamo_unified_thompson_hints.sh (1)
215-225: Quote host paths and container name indocker runfor robustness.
$LOCAL_MODEL_DIR,$CUSTOM_DYNAMO_DIR, and$CONTAINER_NAMEare currently unquoted. If users set paths with spaces, thedocker runcommand will break. Using quoted interpolations is safer:docker run -d \ --name "$CONTAINER_NAME" \ -v "${LOCAL_MODEL_DIR}:${MODEL}:ro" \ -v "${CUSTOM_DYNAMO_DIR}:/workspace/custom_dynamo:ro" \ ...external/dynamo/.env.example (1)
4-7: Consider unquoting example values to satisfy dotenv linters.
dotenv-linterflags the quoted values on Lines 5 and 7. Since these example paths contain no spaces, you can drop the quotes to quiet the warning:export DYNAMO_MODEL_DIR=/path/to/your/models/Llama-3.3-70B-Instruct export DYNAMO_REPO_DIR=/path/to/NeMo-Agent-ToolkitUsers can re-add quotes if they actually need spaces.
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.py (2)
42-53: Add an explicit return type for the public evaluator function.Per guidelines, public APIs should have return type hints.
action_completion_evaluator_functionis an async generator yieldingEvaluatorInfo; consider:from collections.abc import AsyncIterator @register_evaluator(config_type=ActionCompletionEvaluatorConfig) async def action_completion_evaluator_function( config: ActionCompletionEvaluatorConfig, builder: EvalBuilder ) -> AsyncIterator[EvaluatorInfo]: ...
72-77: Avoid bareexcept Exceptionor at least log at debug level.Catching a broad
Exceptionand silentlycontinue-ing makes trajectory parsing failures invisible and harder to debug. Either narrow the exception type (e.g., toTypeError/ValueError) or log at debug:except Exception: logger.debug("Failed to model_dump trajectory step of type %s", type(step)) continueexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.py (2)
241-257: Consider logging parse failures with stack traces if they indicate bugs.Right now
parse_string_to_dictswallowsjson.JSONDecodeErrorand logs a warning without a stack trace. If malformed JSON here is unexpected (rather than just noisy input), switching tologger.exception(without re-raising) would give better observability:except json.JSONDecodeError: logger.exception("Failed to parse input_params string as JSON: %s", v[:100]) return {}If malformed JSON is expected, leaving this as
warningis fine and the Ruff suggestion can be ignored.
275-279: Remove unused local variables to satisfy Ruff and keep the function tight.
propertiesandrequired_paramsare assigned but never used increate_tool_stub_function. You can safely drop them:tool_name = tool_schema.get("title", "unknown_tool") tool_description = tool_schema.get("description", "") # properties = tool_schema.get("properties", {}) # required_params = tool_schema.get("required", [])Similarly, in
_generate_mock_response,prop_descon Line 335 is unused and can be removed.examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.py (2)
38-52: Add a return type hint for the public function generator.To match the project’s typing guidelines, annotate the return type of
react_benchmark_agent_function, e.g.:from collections.abc import AsyncIterator @register_function(...) async def react_benchmark_agent_function( config: ReactBenchmarkAgentFunctionConfig, builder: Builder ) -> AsyncIterator[FunctionInfo]: ...
68-80: Mark the unusedqueryparameter to satisfy Ruff and intent.
_decision_only_infodoesn’t actually use itsqueryargument, which is fine for a status endpoint but triggers ARG001. Renaming to_query(and adjusting the docstring) makes the intent clear and avoids the warning:async def _decision_only_info(_query: str) -> str: ...examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.py (3)
43-47: Non-cryptographic use ofmd5is OK, but consider clarifying or switching hash.
_get_scenario_id_for_questionuseshashlib.md5purely to derive a short scenario identifier, not for security. To avoid future confusion (and S324 warnings), either:
- Switch to
sha256for a “safe by default” choice, or- Add a brief comment that this hash is non-cryptographic and used only for ID generation.
Example:
# Non-cryptographic hash for short scenario IDs question_hash = hashlib.sha256(question.encode()).hexdigest()[:12]
304-305: Drop the explicit exception argument inlogger.exceptioncalls.
logger.exceptionalready logs the current exception and its stack trace; passing%switheis redundant and slightly noisy. You can simplify to:except Exception: logger.exception("💥 Error executing wrapped agent on attempt %d", attempt + 1) attempt += 1 continueSame applies to the similar calls at Lines 514 and 678 (and to the JSON parsing branch if you decide to switch to
logger.exceptionthere).
617-685: Optional: make JSON parsing more robust and explicit.The current
_evaluate_tool_sequenceimplementation heuristically slices the first{...}block from the LLM response. That’s reasonable for examples but brittle if the model ever includes braces in prose. If you see flaky parsing, consider:
- Instructing the evaluator LLM to reply with JSON only and using
response_text.strip()directly, or- Using a small helper that scans for the first valid JSON object instead of
find/rfind.Given this is example code, the existing behavior is acceptable; keep as-is if it’s stable in your workloads.
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.py (1)
4-7: Optional: remove unusednoqaif F401 isn’t enabled.Ruff reports the
# noqa: F401as unused. If F401 isn’t part of your active rule set, you can safely drop the directive:from . import registerIf you expect F401 to be enabled later, keeping the pragma is also reasonable.
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.py (1)
6-9: Sort__all__to satisfy Ruff and keep exports predictable.Ruff is flagging
__all__as unsorted. Sorting also makes the public surface easier to scan.-from .action_completion_evaluator import action_completion_evaluator_function -from .tsq_evaluator import tsq_evaluator_function - -__all__ = ["tsq_evaluator_function", "action_completion_evaluator_function"] +from .action_completion_evaluator import action_completion_evaluator_function +from .tsq_evaluator import tsq_evaluator_function + +__all__ = ["action_completion_evaluator_function", "tsq_evaluator_function"]external/dynamo/start_dynamo_disagg.sh (1)
171-199: Quote model directory in Docker volume mount for robustness.To be safe if
DYNAMO_MODEL_DIRever contains spaces or shell metacharacters, it’s better to quote the bind‑mount paths:- -v $LOCAL_MODEL_DIR:$MODEL:ro \ + -v "$LOCAL_MODEL_DIR":"$MODEL":ro \Everything else in the startup and health‑check flow looks solid.
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.py (1)
30-47: Add a return type forbanking_tools_group_function(and optionally simplify the FileNotFound path).Per the project’s typing guidelines, this async generator should have an explicit return type:
from collections.abc import AsyncIterator @register_function_group(config_type=BankingToolsGroupConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN]) async def banking_tools_group_function( config: BankingToolsGroupConfig, builder: Builder ) -> AsyncIterator[FunctionGroup]: ...Optionally, to address the TRY003 hint and avoid duplicating error text between the log and exception, you could rely on the exception message alone:
- if not tools_path.exists(): - logger.error("tools.json not found at %s", tools_path) - raise FileNotFoundError(f"Banking tools file not found: {tools_path}") + if not tools_path.exists(): + raise FileNotFoundError(f"Banking tools file not found: {tools_path}")The overall intent‑buffer wiring and stub registration logic otherwise looks good.
Also applies to: 49-93
external/dynamo/start_dynamo_unified.sh (1)
141-165: Avoid echoing HF token back to the terminal.
read -pechoes the Hugging Face token in cleartext, which is undesirable even in local scripts.Use silent input (and
-r) instead:- read -p "Please enter your HuggingFace token (or press Enter to skip): " HF_TOKEN + read -r -s -p "Please enter your HuggingFace token (or press Enter to skip): " HF_TOKEN + echoexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.py (2)
145-155: Remove unusedexpected_tsqlocal intest_scenario_000_score.
expected_tsqis computed but never used; the test directly asserts against4/7. This triggers Ruff F841 and adds noise.You can either drop the variable or use it in the assertion, e.g.:
- expected_tsq = 4 / 7 # 0.5714285714285715 - ... - accuracy = calculate_tool_accuracy(actual, expected) - assert accuracy == pytest.approx(4/7) + expected_tsq = 4 / 7 # 0.5714285714285715 + ... + accuracy = calculate_tool_accuracy(actual, expected) + assert accuracy == pytest.approx(expected_tsq)
227-231: Tightentest_empty_tool_nameassertion and avoid redundant calls.The current assertion:
assert normalize_tool_name(None) == "" if normalize_tool_name(None) is not None else Trueis hard to read and calls
normalize_tool_name(None)twice. If the intention is to acceptNoneand normalize to the empty string, this can be simplified to:assert normalize_tool_name(None) == ""which better documents expected behavior.
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.py (1)
89-96: Remove unused# noqa: F401suppressions on import tests.The imports in
test_module_importsandtest_module_with_feedback_importsare intentionally unused, but Ruff is not enforcing F401 here, so the# noqa: F401markers show up as unused (RUF100).You can safely drop the
# noqacomments on these two import lines; the tests remain clear and lint‑clean.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (33)
examples/dynamo_integration/.env.example(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.py(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.py(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/config_dynamo_adk_e2e_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/config_dynamo_e2e_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/config_dynamo_prefix_e2e_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_full_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/optimize_rethinking_full_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.yml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.py(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.py(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.py(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.py(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/register.py(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.py(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.py(1 hunks)examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.py(1 hunks)examples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.py(1 hunks)examples/dynamo_integration/scripts/create_test_subset.py(1 hunks)examples/dynamo_integration/scripts/download_agent_leaderboard_v2.py(1 hunks)examples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py(1 hunks)examples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.py(1 hunks)examples/dynamo_integration/scripts/run_concurrency_benchmark.sh(1 hunks)examples/dynamo_integration/scripts/throughput_analysis.py(1 hunks)external/dynamo/.env.example(1 hunks)external/dynamo/monitor_dynamo.sh(1 hunks)external/dynamo/start_dynamo_disagg.sh(1 hunks)external/dynamo/start_dynamo_unified.sh(1 hunks)external/dynamo/start_dynamo_unified_thompson_hints.sh(1 hunks)external/dynamo/stop_dynamo.sh(1 hunks)external/dynamo/test_dynamo_integration.sh(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
- examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/config_dynamo_prefix_e2e_test.yml
- examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/config_dynamo_adk_e2e_test.yml
- external/dynamo/stop_dynamo.sh
- examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/config_dynamo_e2e_test.yml
- examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/register.py
- examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_full_test.yml
- examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/optimize_rethinking_full_test.yml
- external/dynamo/test_dynamo_integration.sh
- examples/dynamo_integration/scripts/create_test_subset.py
🧰 Additional context used
📓 Path-based instructions (11)
**/*.{py,js,ts,java,cpp,c,go,rb,php,sh}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All source files must include the SPDX Apache-2.0 header template
Files:
external/dynamo/monitor_dynamo.shexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.pyexternal/dynamo/start_dynamo_unified.shexternal/dynamo/start_dynamo_unified_thompson_hints.shexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.pyexamples/dynamo_integration/scripts/throughput_analysis.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.pyexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.pyexamples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.pyexternal/dynamo/start_dynamo_disagg.shexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.pyexamples/dynamo_integration/scripts/download_agent_leaderboard_v2.pyexamples/dynamo_integration/scripts/run_concurrency_benchmark.sh
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values (except for return values of
None,
in that situation no return type hint is needed).
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
external/dynamo/monitor_dynamo.shexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.pyexternal/dynamo/start_dynamo_unified.shexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexternal/dynamo/start_dynamo_unified_thompson_hints.shexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.pyexamples/dynamo_integration/scripts/throughput_analysis.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.pyexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.pyexamples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.pyexternal/dynamo/start_dynamo_disagg.shexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.pyexamples/dynamo_integration/scripts/download_agent_leaderboard_v2.pyexamples/dynamo_integration/scripts/run_concurrency_benchmark.sh
**/*.{md,rst,py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent references
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.pyexamples/dynamo_integration/scripts/throughput_analysis.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.pyexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.pyexamples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.pyexamples/dynamo_integration/scripts/download_agent_leaderboard_v2.py
**/*.{py,toml,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use abbreviations: 'nat' for API namespace and CLI tool, 'nvidia-nat' for package name, 'NAT' for environment variable prefixes and informal comments
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.pyexamples/dynamo_integration/scripts/throughput_analysis.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.pyexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.pyexamples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.pyexamples/dynamo_integration/scripts/download_agent_leaderboard_v2.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.py: Follow PEP 20 and PEP 8 for Python style guidelines
Run yapf second (PEP 8 base, column_limit = 120) for Python formatting
Use ruff check --fix (via pre-commit) for linting using configuration embedded in pyproject.toml, fix warnings unless explicitly ignored
Use snake_case for functions and variables, PascalCase for classes, UPPER_CASE for constants
All public APIs require Python 3.11+ type hints on parameters and return values
Prefer collections.abc / typing abstractions (Sequence over list) for type hints
Use typing.Annotated for units or extra metadata when useful
Treat pyright warnings (configured in pyproject.toml) as errors during development
Preserve stack traces and prevent duplicate logging when handling exceptions; use bare raise statements and logger.error() when re-raising
When catching and logging exceptions without re-raising, always use logger.exception() to capture the full stack trace information
Provide Google-style docstrings for every public module, class, function and CLI command
The first line of docstrings must be a concise description ending with a period (Vale checks this)
Surround code entities with backticks in docstrings to avoid Vale false-positives
Validate and sanitise all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work (HTTP, DB, file reads)
Cache expensive computations with functools.lru_cache or an external cache when appropriate
Leverage NumPy vectorised operations whenever beneficial and feasible
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.pyexamples/dynamo_integration/scripts/throughput_analysis.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.pyexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.pyexamples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.pyexamples/dynamo_integration/scripts/download_agent_leaderboard_v2.py
**/*.{py,js,ts,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Indent with 4 spaces, never tabs, and ensure every file ends with a single newline
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.pyexamples/dynamo_integration/scripts/throughput_analysis.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.pyexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.pyexamples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.pyexamples/dynamo_integration/scripts/download_agent_leaderboard_v2.py
**/*.{py,env,toml,yaml,yml,json}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never commit API keys, credentials or personal data; use environment variables or .env files excluded from Git
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.pyexamples/dynamo_integration/scripts/throughput_analysis.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.pyexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.pyexamples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.pyexamples/dynamo_integration/scripts/download_agent_leaderboard_v2.py
**/*.{py,js,ts,java,cpp,c,go,rb,php}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Every file must start with the standard SPDX Apache-2.0 header
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.pyexamples/dynamo_integration/scripts/throughput_analysis.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.pyexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.pyexamples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.pyexamples/dynamo_integration/scripts/download_agent_leaderboard_v2.py
**/*.{py,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Version numbers are derived automatically by setuptools-scm; never hard-code them in code or docs
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.pyexamples/dynamo_integration/scripts/throughput_analysis.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.pyexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.pyexamples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.pyexamples/dynamo_integration/scripts/download_agent_leaderboard_v2.py
examples/**/*
⚙️ CodeRabbit configuration file
examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.
- If an example contains Python code, it should be placed in a subdirectory named
src/and should
contain apyproject.tomlfile. Optionally, it might also contain scripts in ascripts/directory.- If an example contains YAML files, they should be placed in a subdirectory named
configs/. - If an example contains sample data files, they should be placed in a subdirectory nameddata/, and should
be checked into git-lfs.
Files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.pyexamples/dynamo_integration/scripts/throughput_analysis.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.pyexamples/dynamo_integration/scripts/plot_throughput_histograms_per_request.pyexamples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.pyexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.pyexamples/dynamo_integration/scripts/download_agent_leaderboard_v2.pyexamples/dynamo_integration/scripts/run_concurrency_benchmark.sh
**/test_*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/test_*.py: Use pytest with pytest-asyncio for asynchronous code testing
Test functions should be named using the test_ prefix, using snake_case
Extract frequently repeated code into pytest fixtures, which should be named using the fixture_ prefix and define the name argument in the decorator
Mock external services with pytest_httpserver or unittest.mock instead of hitting live endpoints
Mark slow tests with @pytest.mark.slow so they can be skipped in the default test suite
Mark integration tests requiring external services with @pytest.mark.integration so they can be skipped in the default test suite
Files:
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.py
🧠 Learnings (7)
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to **/*.{py,js,ts,java,cpp,c,go,rb,php,sh} : All source files must include the SPDX Apache-2.0 header template
Applied to files:
external/dynamo/monitor_dynamo.shexternal/dynamo/start_dynamo_unified.shexamples/dynamo_integration/scripts/throughput_analysis.pyexternal/dynamo/start_dynamo_disagg.sh
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to **/*.{py,js,ts,java,cpp,c,go,rb,php} : Every file must start with the standard SPDX Apache-2.0 header
Applied to files:
external/dynamo/monitor_dynamo.shexternal/dynamo/start_dynamo_unified.shexamples/dynamo_integration/scripts/throughput_analysis.pyexternal/dynamo/start_dynamo_disagg.sh
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: CI verifies SPDX headers via ci/scripts/github/checks.sh; do not bypass this check
Applied to files:
external/dynamo/monitor_dynamo.shexternal/dynamo/start_dynamo_unified.shexamples/dynamo_integration/scripts/throughput_analysis.pyexternal/dynamo/start_dynamo_disagg.sh
📚 Learning: 2025-12-12T20:49:44.305Z
Learnt from: zterek
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1243
File: examples/risk_and_security/retail_agent/src/nat_retail_agent/configs/red-teaming.yml:1-98
Timestamp: 2025-12-12T20:49:44.305Z
Learning: In the NVIDIA/NeMo-Agent-Toolkit repository, YAML files generally use 2-space indentation. When reviewing YAML, prefer 2-space indentation to match the existing style over a 4-space guideline until a repo-wide standardization is performed. This applies to YAML configuration files (e.g., red-teaming.yml) and, more broadly, all *.yml files in the project.
Applied to files:
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.ymlexamples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_rethinking_full_test.yml
📚 Learning: 2025-12-03T18:42:23.494Z
Learnt from: AnuradhaKaruppiah
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1147
File: packages/nvidia_nat_a2a/pyproject.toml:1-10
Timestamp: 2025-12-03T18:42:23.494Z
Learning: In the packages/ directory, pyproject.toml files typically do not include SPDX license headers. Out of 34 packages, only nvidia_nat_strands is an exception. This pattern differs from the requirement for SPDX headers in source code files (.py, .js, .ts, etc.).
Applied to files:
examples/dynamo_integration/scripts/throughput_analysis.py
📚 Learning: 2025-11-05T11:45:35.119Z
Learnt from: thepatrickchin
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1152
File: examples/config_inheritance/pyproject.toml:1-25
Timestamp: 2025-11-05T11:45:35.119Z
Learning: In the examples/ directory, pyproject.toml files typically do not include SPDX license headers, with only one exception (adk_demo). This is an established pattern that differs from the general guideline requiring SPDX headers in all .toml files.
Applied to files:
examples/dynamo_integration/scripts/throughput_analysis.py
📚 Learning: 2025-11-14T20:33:53.944Z
Learnt from: AnuradhaKaruppiah
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1181
File: packages/nvidia_nat_test/tests/test_test_llm.py:419-484
Timestamp: 2025-11-14T20:33:53.944Z
Learning: The NVIDIA NeMo-Agent-Toolkit project uses pytest-asyncio in strict mode (the default), which requires pytest.mark.asyncio decorator on all async test functions. All async tests in packages/nvidia_nat_test/tests/test_test_llm.py consistently follow this pattern.
Applied to files:
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.py
🧬 Code graph analysis (9)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.py (2)
src/nat/data_models/component_ref.py (1)
LLMRef(116-124)src/nat/eval/evaluator/evaluator_model.py (4)
EvalInput(46-47)EvalInputItem(23-43)EvalOutput(56-58)EvalOutputItem(50-53)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.py (2)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.py (1)
action_completion_evaluator_function(43-245)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.py (1)
tsq_evaluator_function(52-328)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.py (1)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.py (2)
ToolIntentBuffer(80-137)create_tool_stub_function(260-317)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.py (1)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.py (4)
get_global_intents(140-152)set_current_scenario_id(37-56)clear(132-137)get_intents(123-130)
examples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py (1)
examples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.py (4)
get_job_label(47-49)get_experiment_label(52-54)extract_per_request_tsq_scores(146-193)calculate_per_request_throughput_metrics(196-307)
examples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.py (1)
examples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py (6)
get_job_label(45-47)get_experiment_label(50-52)extract_per_request_tsq_scores(55-102)calculate_per_request_throughput_metrics(105-216)collect_job_data_from_dir(219-293)collect_job_data(296-324)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.py (3)
src/nat/builder/framework_enum.py (1)
LLMFrameworkEnum(19-26)src/nat/builder/function_info.py (2)
FunctionInfo(290-625)from_fn(552-625)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.py (2)
ToolIntentBuffer(80-137)get_intents(123-130)
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.py (1)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.py (2)
SelfEvaluatingAgentConfig(53-126)SelfEvaluatingAgentWithFeedbackConfig(132-201)
examples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.py (1)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.py (3)
normalize_tool_name(134-156)calculate_tool_accuracy(158-177)extract_tool_calls_from_trajectory(66-132)
🪛 dotenv-linter (4.0.0)
external/dynamo/.env.example
[warning] 5-5: [QuoteCharacter] The value has quote characters (', ")
(QuoteCharacter)
[warning] 7-7: [QuoteCharacter] The value has quote characters (', ")
(QuoteCharacter)
🪛 Ruff (0.14.8)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.py
75-76: try-except-continue detected, consider logging the exception
(S112)
75-75: Do not catch blind exception: Exception
(BLE001)
148-148: Consider moving this statement to an else block
(TRY300)
149-149: Do not catch blind exception: Exception
(BLE001)
210-210: Redundant exception object included in logging.exception call
(TRY401)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/__init__.py
9-9: __all__ is not sorted
Apply an isort-style sorting to __all__
(RUF022)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.py
64-64: Local variable eval_builder is assigned to but never used
Remove assignment to unused variable eval_builder
(F841)
88-88: Do not catch blind exception: Exception
(BLE001)
264-264: Do not catch blind exception: Exception
(BLE001)
293-293: Redundant exception object included in logging.exception call
(TRY401)
examples/dynamo_integration/scripts/throughput_analysis.py
99-99: Loop control variable idx not used within loop body
Rename unused idx to _idx
(B007)
176-176: f-string without any placeholders
Remove extraneous f prefix
(F541)
178-178: f-string without any placeholders
Remove extraneous f prefix
(F541)
208-208: f-string without any placeholders
Remove extraneous f prefix
(F541)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/banking_tools.py
91-91: Avoid specifying long messages outside the exception class
(TRY003)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/self_evaluating_agent_with_feedback.py
46-46: Probable use of insecure hash functions in hashlib: md5
(S324)
304-304: Redundant exception object included in logging.exception call
(TRY401)
514-514: Redundant exception object included in logging.exception call
(TRY401)
666-666: Consider moving this statement to an else block
(TRY300)
678-678: Redundant exception object included in logging.exception call
(TRY401)
682-682: Use explicit conversion flag
Replace with conversion flag
(RUF010)
examples/dynamo_integration/scripts/plot_throughput_histograms_per_request.py
14-14: Docstring contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF002)
98-98: Consider moving this statement to an else block
(TRY300)
130-130: Do not catch blind exception: Exception
(BLE001)
147-147: Loop control variable idx not used within loop body
Rename unused idx to _idx
(B007)
219-219: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
332-332: Docstring contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF002)
352-352: String contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF001)
355-355: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
371-371: Local variable text_obj is assigned to but never used
Remove assignment to unused variable text_obj
(F841)
404-404: String contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF001)
407-407: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
496-496: Unpacked variable fig is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
573-573: Prefer next(iter(job_stats.values())) over single element slice
Replace with next(iter(job_stats.values()))
(RUF015)
623-623: Unpacked variable fig is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
667-667: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
737-737: String contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF001)
744-744: String contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF001)
826-826: f-string without any placeholders
Remove extraneous f prefix
(F541)
examples/dynamo_integration/scripts/plot_throughput_vs_tsq_per_request.py
13-13: Docstring contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF002)
88-88: Consider iterable unpacking instead of concatenation
Replace with iterable unpacking
(RUF005)
93-93: Consider moving this statement to an else block
(TRY300)
95-95: Do not catch blind exception: Exception
(BLE001)
189-189: Consider moving this statement to an else block
(TRY300)
221-221: Do not catch blind exception: Exception
(BLE001)
238-238: Loop control variable idx not used within loop body
Rename unused idx to _idx
(B007)
310-310: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
447-447: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
499-499: Unpacked variable fig is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
557-557: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
620-620: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
674-674: Unpacked variable p_value is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
674-674: Unpacked variable std_err is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
710-710: String contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF001)
716-716: String contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF001)
777-777: String contains ambiguous σ (GREEK SMALL LETTER SIGMA). Did you mean o (LATIN SMALL LETTER O)?
(RUF001)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/__init__.py
5-5: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/react_benchmark_agent.py
68-68: Unused function argument: query
(ARG001)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.py
277-277: Local variable properties is assigned to but never used
Remove assignment to unused variable properties
(F841)
278-278: Local variable required_params is assigned to but never used
Remove assignment to unused variable required_params
(F841)
335-335: Local variable prop_desc is assigned to but never used
Remove assignment to unused variable prop_desc
(F841)
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.py
91-91: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
95-95: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
examples/dynamo_integration/react_benchmark_agent/tests/test_tsq_formula.py
150-150: Local variable expected_tsq is assigned to but never used
Remove assignment to unused variable expected_tsq
(F841)
examples/dynamo_integration/scripts/download_agent_leaderboard_v2.py
1-1: The file is executable but no shebang is present
(EXE002)
258-258: Redundant exception object included in logging.exception call
(TRY401)
🪛 Shellcheck (0.11.0)
examples/dynamo_integration/scripts/run_concurrency_benchmark.sh
[warning] 186-186: JOB_ID appears unused. Verify use (or export if used externally).
(SC2034)
🔇 Additional comments (15)
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/profile_rethinking_full_test.yml (1)
1-220: The file already ends with a single newline as required. No action needed.examples/dynamo_integration/scripts/download_agent_leaderboard_v2.py (5)
1-7: LGTM!The license header and module docstring follow the project's standards.
20-32: LGTM!The function correctly converts JSON strings to dictionaries with proper defensive copying.
35-127: LGTM!The heuristic approach for deriving expected tool calls is well-documented and handles various parameter types appropriately. The deduplication logic ensures unique tool names while preserving order.
130-186: LGTM!The transformation function properly converts scenarios to NAT format with comprehensive metadata and graceful handling of missing personas.
271-296: LGTM!The CLI setup properly handles argument parsing and sets a sensible default for HF_HOME with appropriate logging.
external/dynamo/monitor_dynamo.sh (1)
1-170: Monitoring script is clear and functionally sound.The menu flow, Docker checks, and health/inference probes are well-structured and defensive enough for this dev‑tool script. I don’t see issues that need changes.
examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.yml (8)
1-3: Apache License header present and correctly formatted.The copyright year (2025-2026) and license identifier are properly included.
4-25: Excellent documentation and prerequisites section.The header provides clear purpose, use cases, prerequisites with reproducible setup steps, usage instructions, and expected outcomes. This significantly improves discoverability and maintainability.
27-61: Functions and function_groups configuration looks sound.The
react_benchmark_agentfunction andbanking_tools_groupare properly declared with tool inclusions specified clearly. The decision_only flags and templates align with the agent architecture.
62-96: LLM configuration is well-structured with appropriate comments.Both
dynamo_llmandeval_llmare properly configured. The stop tokens (line 72), prefix template settings (lines 75-78), and commented-out optimizer fields (lines 79-86) are appropriately documented. Note that theapi_key: dummyplaceholder is acceptable for a test configuration.
98-138: Workflow and system prompt configuration is comprehensive.The ReACT agent workflow is well-configured with verbose logging, elevated limits (max_tool_calls: 20, recursion_limit: 50), and a detailed system prompt optimized for tool selection evaluation. The prompt instructions are clear and appropriate for the intended use case.
140-193: Eval and evaluator configuration is well-structured.The evaluation section includes comprehensive profiler settings (LLM metrics, token forecasting, prompt caching analysis, bottleneck analysis), and the tool_selection_quality evaluator is properly configured with TSQ-based evaluation.
194-230: Optimizer configuration provides clear parameter exploration setup.The optimizer section with grid search sampling (5 trials), metric direction (maximize TSQ), and disabled prompt optimization provides a solid foundation for parameter exploration. The configuration is well-documented with explanatory comments.
37-37: Resolve the missing data files and paths used in config.The configuration references data files that do not exist in the repository:
./examples/dynamo_integration/data/raw/banking/tools.json(NOT FOUND)./examples/dynamo_integration/data/agent_leaderboard_v2_test_subset.json(NOT FOUND)While the YAML comments document a prerequisite to generate the test subset file, the core issue is that the source data files are not available. Per the coding guidelines, sample data should be checked into git-lfs. Either ensure these files are present and properly configured in git-lfs, or provide clear documentation on how to obtain/generate them before the config can be used. The absolute-style paths (e.g.,
./examples/...) also assume execution from the repository root; consider using relative paths or documenting the required working directory.
...benchmark_agent/src/react_benchmark_agent/configs/eval_config_no_rethinking_minimal_test.yml
Outdated
Show resolved
Hide resolved
...on/react_benchmark_agent/src/react_benchmark_agent/evaluators/action_completion_evaluator.py
Show resolved
Hide resolved
...namo_integration/react_benchmark_agent/src/react_benchmark_agent/evaluators/tsq_evaluator.py
Show resolved
Hide resolved
...ples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/tool_intent_stubs.py
Outdated
Show resolved
Hide resolved
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (1)
src/nat/llm/dynamo_llm.py (1)
355-355: Add return type hint to function signature.Per coding guidelines, all public APIs require type hints on parameters and return values. This was also noted in past review comments.
Apply this diff to add the return type hint:
+from collections.abc import AsyncIterator + @register_llm_provider(config_type=DynamoModelConfig) -async def dynamo_llm(config: DynamoModelConfig, _builder: Builder): +async def dynamo_llm(config: DynamoModelConfig, _builder: Builder) -> AsyncIterator[LLMProviderInfo]: """Register the Dynamo LLM provider."""
🧹 Nitpick comments (2)
tests/nat/llm/test_dynamo_llm.py (1)
308-308: Consider usingnext(iter(...))for readability.Static analysis suggests preferring
next(iter(prefix_ids))over single element slice for better semantics when extracting a single element from a set.Apply this diff:
- assert list(prefix_ids)[0].startswith("session-") + assert next(iter(prefix_ids)).startswith("session-")src/nat/llm/dynamo_llm.py (1)
287-310: Consider adding type hint to request parameter.While this is an internal function, adding a type hint improves code clarity and helps with IDE support.
Apply this diff:
- async def on_request(request): + async def on_request(request: "httpx.Request") -> None: """Inject Dynamo prefix headers before each request."""
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.py(1 hunks)packages/nvidia_nat_adk/src/nat/plugins/adk/llm.py(2 hunks)packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py(2 hunks)src/nat/llm/dynamo_llm.py(1 hunks)tests/nat/llm/test_dynamo_llm.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (14)
**/*.{md,rst,py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent references
Files:
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.pypackages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pytests/nat/llm/test_dynamo_llm.py
**/*.{py,toml,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use abbreviations: 'nat' for API namespace and CLI tool, 'nvidia-nat' for package name, 'NAT' for environment variable prefixes and informal comments
Files:
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.pypackages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pytests/nat/llm/test_dynamo_llm.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.py: Follow PEP 20 and PEP 8 for Python style guidelines
Run yapf second (PEP 8 base, column_limit = 120) for Python formatting
Use ruff check --fix (via pre-commit) for linting using configuration embedded in pyproject.toml, fix warnings unless explicitly ignored
Use snake_case for functions and variables, PascalCase for classes, UPPER_CASE for constants
All public APIs require Python 3.11+ type hints on parameters and return values
Prefer collections.abc / typing abstractions (Sequence over list) for type hints
Use typing.Annotated for units or extra metadata when useful
Treat pyright warnings (configured in pyproject.toml) as errors during development
Preserve stack traces and prevent duplicate logging when handling exceptions; use bare raise statements and logger.error() when re-raising
When catching and logging exceptions without re-raising, always use logger.exception() to capture the full stack trace information
Provide Google-style docstrings for every public module, class, function and CLI command
The first line of docstrings must be a concise description ending with a period (Vale checks this)
Surround code entities with backticks in docstrings to avoid Vale false-positives
Validate and sanitise all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work (HTTP, DB, file reads)
Cache expensive computations with functools.lru_cache or an external cache when appropriate
Leverage NumPy vectorised operations whenever beneficial and feasible
Files:
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.pypackages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pytests/nat/llm/test_dynamo_llm.py
**/*.{py,js,ts,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Indent with 4 spaces, never tabs, and ensure every file ends with a single newline
Files:
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.pypackages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pytests/nat/llm/test_dynamo_llm.py
**/*.{py,env,toml,yaml,yml,json}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never commit API keys, credentials or personal data; use environment variables or .env files excluded from Git
Files:
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.pypackages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pytests/nat/llm/test_dynamo_llm.py
**/*.{py,js,ts,java,cpp,c,go,rb,php}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Every file must start with the standard SPDX Apache-2.0 header
Files:
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.pypackages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pytests/nat/llm/test_dynamo_llm.py
**/*.{py,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Version numbers are derived automatically by setuptools-scm; never hard-code them in code or docs
Files:
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.pypackages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pytests/nat/llm/test_dynamo_llm.py
**/*.{py,js,ts,java,cpp,c,go,rb,php,sh}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All source files must include the SPDX Apache-2.0 header template
Files:
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.pypackages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pytests/nat/llm/test_dynamo_llm.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values (except for return values of
None,
in that situation no return type hint is needed).
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.pypackages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.pyexamples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pytests/nat/llm/test_dynamo_llm.py
packages/**/*
⚙️ CodeRabbit configuration file
packages/**/*: - This directory contains optional plugin packages for the toolkit, each should contain apyproject.tomlfile. - Thepyproject.tomlfile should declare a dependency onnvidia-nator another package with a name starting
withnvidia-nat-. This dependency should be declared using~=<version>, and the version should be a two
digit version (ex:~=1.0).
- Not all packages contain Python code, if they do they should also contain their own set of tests, in a
tests/directory at the same level as thepyproject.tomlfile.
Files:
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.pypackages/nvidia_nat_adk/src/nat/plugins/adk/llm.py
src/nat/**/*
⚙️ CodeRabbit configuration file
This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.
Files:
src/nat/llm/dynamo_llm.py
**/test_*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/test_*.py: Use pytest with pytest-asyncio for asynchronous code testing
Test functions should be named using the test_ prefix, using snake_case
Extract frequently repeated code into pytest fixtures, which should be named using the fixture_ prefix and define the name argument in the decorator
Mock external services with pytest_httpserver or unittest.mock instead of hitting live endpoints
Mark slow tests with @pytest.mark.slow so they can be skipped in the default test suite
Mark integration tests requiring external services with @pytest.mark.integration so they can be skipped in the default test suite
Files:
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pytests/nat/llm/test_dynamo_llm.py
examples/**/*
⚙️ CodeRabbit configuration file
examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.
- If an example contains Python code, it should be placed in a subdirectory named
src/and should
contain apyproject.tomlfile. Optionally, it might also contain scripts in ascripts/directory.- If an example contains YAML files, they should be placed in a subdirectory named
configs/. - If an example contains sample data files, they should be placed in a subdirectory nameddata/, and should
be checked into git-lfs.
Files:
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.py
tests/**/*.py
⚙️ CodeRabbit configuration file
tests/**/*.py: - Ensure that tests are comprehensive, cover edge cases, and validate the functionality of the code. - Test functions should be named using thetest_prefix, using snake_case. - Any frequently repeated code should be extracted into pytest fixtures. - Pytest fixtures should define the name argument when applying the pytest.fixture decorator. The fixture
function being decorated should be named using thefixture_prefix, using snake_case. Example:
@pytest.fixture(name="my_fixture")
def fixture_my_fixture():
pass
Files:
tests/nat/llm/test_dynamo_llm.py
🧠 Learnings (2)
📚 Learning: 2025-09-15T21:26:29.430Z
Learnt from: saglave
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 726
File: examples/frameworks/adk_demo/src/nat_adk_demo/weather_update_tool.py:0-0
Timestamp: 2025-09-15T21:26:29.430Z
Learning: In NAT's ADK integration, the docstring of registered functions serves as the tool description sent to the LLM, not standard Python function documentation. The docstring should describe the logical tool interface (parameters the LLM will provide) rather than the NAT framework wrapper parameters like tool_config and builder.
Applied to files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.py
📚 Learning: 2025-11-14T20:33:53.944Z
Learnt from: AnuradhaKaruppiah
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1181
File: packages/nvidia_nat_test/tests/test_test_llm.py:419-484
Timestamp: 2025-11-14T20:33:53.944Z
Learning: The NVIDIA NeMo-Agent-Toolkit project uses pytest-asyncio in strict mode (the default), which requires pytest.mark.asyncio decorator on all async test functions. All async tests in packages/nvidia_nat_test/tests/test_test_llm.py consistently follow this pattern.
Applied to files:
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.pytests/nat/llm/test_dynamo_llm.py
🧬 Code graph analysis (3)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py (1)
src/nat/llm/dynamo_llm.py (4)
dynamo_llm(355-360)create_httpx_client_with_dynamo_hooks(314-344)DynamoModelConfig(153-243)get_dynamo_field_names(221-243)
src/nat/llm/dynamo_llm.py (2)
src/nat/data_models/optimizable.py (2)
OptimizableField(157-196)SearchSpace(35-154)src/nat/llm/openai_llm.py (2)
openai_llm(57-59)OpenAIModelConfig(32-53)
tests/nat/llm/test_dynamo_llm.py (2)
src/nat/llm/dynamo_llm.py (9)
dynamo_llm(355-360)create_httpx_client_with_dynamo_hooks(314-344)DynamoModelConfig(153-243)DynamoPrefixContext(71-145)_create_dynamo_request_hook(251-311)get_dynamo_field_names(221-243)clear(111-114)get(117-119)scope(123-145)src/nat/data_models/common.py (2)
static_type(157-158)static_full_type(161-162)
🪛 Ruff (0.14.8)
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.py
91-91: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
95-95: Unused noqa directive (non-enabled: F401)
Remove unused noqa directive
(RUF100)
tests/nat/llm/test_dynamo_llm.py
205-205: Avoid specifying long messages outside the exception class
(TRY003)
308-308: Prefer next(iter(prefix_ids)) over single element slice
Replace with next(iter(prefix_ids))
(RUF015)
🔇 Additional comments (11)
tests/nat/llm/test_dynamo_llm.py (3)
34-144: LGTM! Comprehensive configuration validation tests.The test coverage for
DynamoModelConfigis thorough, including defaults, custom values, boundary conditions, validation logic, inheritance, and utility methods.
151-223: LGTM! Thorough context management tests.The tests cover all aspects of
DynamoPrefixContextincluding basic operations, overwriting, context manager usage, exception safety, and nested scope behavior.
230-445: LGTM! Comprehensive hook and client creation tests.The test suite thoroughly validates:
- Header injection mechanics
- Context variable override behavior
- Prefix ID consistency across requests
- Hook isolation between clients
- Default template fallback
- Case normalization
- HTTPX client configuration
- Provider registration
packages/nvidia_nat_langchain/src/nat/plugins/langchain/llm.py (1)
200-248: LGTM! Dynamo LangChain integration follows established patterns.The implementation correctly:
- Excludes Dynamo-specific fields using
get_dynamo_field_names()(as per past review feedback)- Creates httpx client with hooks only when prefix headers are enabled
- Handles RESPONSES API consistently with
openai_langchain- Applies retry and thinking patches via
_patch_llm_based_on_config- Logs Dynamo configuration for debugging
src/nat/llm/dynamo_llm.py (3)
71-145: LGTM! Well-designed context management class.The
DynamoPrefixContextclass properly:
- Uses
ContextVarfor async-safe context management- Provides clear set/get/clear methods with proper type hints
- Implements a context manager for scoped usage with automatic cleanup
- Includes comprehensive docstrings with usage examples
- Logs debug information for troubleshooting
153-243: LGTM! Well-structured configuration model.The
DynamoModelConfigclass properly:
- Extends
OpenAIModelConfigfor compatibility- Uses
OptimizableFieldwithSearchSpacefor tunable parameters- Provides clear field descriptions with optimization hints
- Includes validation constraints for bounds checking
- Implements
get_dynamo_field_names()utility as suggested in past review feedback- Documents the purpose and default behavior (headers enabled by default)
314-344: LGTM! HTTPX client factory is well-implemented.The
create_httpx_client_with_dynamo_hooksfunction properly:
- Creates an httpx client with event hooks for header injection
- Configures timeout appropriately
- Has proper type hints and documentation
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.py (4)
1-12: LGTM! Proper licensing and documentation.The SPDX Apache-2.0 header and copyright years are correct, and the module docstring clearly describes the test coverage.
195-326: LGTM! Comprehensive decision logic and feedback generation tests.The test coverage is thorough, including boundary conditions (exact threshold matches, exhausted retries, edge cases). The helper methods appropriately mimic production logic for isolated testing.
328-442: LGTM! Well-structured integration tests with proper markers.The integration tests correctly:
- Use
@pytest.mark.slowand@pytest.mark.integrationmarkers for optional execution- Apply
@pytest.mark.asynciodecorator on all async test methods (consistent with project requirements)- Follow fixture naming conventions with the
fixture_prefix andnameargument- Use NIM backend for realistic integration testing
The
nvidia_api_keyfixture parameter (line 339) is appropriately marked withnoqa: ARG002as it's a pytest fixture dependency ensuring the API key is available.Based on learnings, the pytest-asyncio decorator usage is consistent with the NVIDIA NeMo Agent toolkit project's strict mode configuration.
23-84: LGTM! Thorough configuration validation tests.The config validation tests appropriately verify:
- File existence for both eval and profile configs
- Correct workflow type specification
- Required parameters presence
- Valid ranges for max_retries (0-10) and confidence thresholds (0.0-1.0)
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.py
Outdated
Show resolved
Hide resolved
examples/dynamo_integration/react_benchmark_agent/tests/test_self_evaluation.py
Show resolved
Hide resolved
…/1242 Merging release/1.4 commits since creating this feature branch for dynamo
|
/ok to test 0773d81 |
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
|
/ok to test 3fb6d64 |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/source/build-workflows/llms/index.md (1)
23-29: Add NVIDIA Dynamo to the supported LLM providers table.The provider table is missing an entry for Dynamo, although the section below documents it fully. This inconsistency may confuse users about which providers are available.
Add Dynamo to the provider table:
| Provider | Type | Description | |----------|------|-------------| | [NVIDIA NIM](https://build.nvidia.com) | `nim` | NVIDIA Inference Microservice (NIM) | | [OpenAI](https://openai.com) | `openai` | OpenAI API | | [AWS Bedrock](https://aws.amazon.com/bedrock/) | `aws_bedrock` | AWS Bedrock API | | [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/quickstart) | `azure_openai` | Azure OpenAI API | | [LiteLLM](https://github.com/BerriAI/litellm) | `litellm` | LiteLLM API | +| [NVIDIA Dynamo](https://github.com/ai-dynamo/dynamo) | `dynamo` | NVIDIA Dynamo inference optimization (experimental) |
♻️ Duplicate comments (1)
examples/dynamo_integration/react_benchmark_agent/README.md (1)
692-699: Add language identifier to fenced code block (MD040).The file structure example at line 692 is missing a language identifier, causing markdownlint MD040 violation. Add
textto the opening fence:-``` +```text react_benchmark_agent/outputs/benchmarks/<name>_<timestamp>/ ├── benchmark_results.csv # Machine-readable CSV ├── benchmark_report.md # Human-readable markdown ├── analysis_16.txt # Detailed analysis for concurrency=16 ├── analysis_32.txt # Detailed analysis for concurrency=32 └── ...
🧹 Nitpick comments (5)
docs/source/build-workflows/llms/index.md (2)
36-54: Consider adding a YAML configuration example for Dynamo.All documented providers include a YAML configuration example for reference (as shown in lines 36–54), but the new Dynamo section (lines 150–168) does not. Adding a minimal example would improve consistency and help users get started quickly.
Proposed example to add after line 168:
dynamo_llm: _type: dynamo model_name: meta/llama-3.1-70b-instruct prefix_template: "<prefix_id>" prefix_osl: 256
156-168: Consider adding a gated-fields note fortemperatureandtop_p.Providers with
temperatureandtop_pfields (NIM, OpenAI, Azure) include a note that these are model-gated and may not be supported by all models (see lines 73–75, 95–97, 131–133). For consistency and user clarity, consider adding a similar note to the Dynamo section.Proposed addition after line 168:
:::{note} `temperature` and `top_p` are model-gated fields and may not be supported by all models. If unsupported and explicitly set, validation will fail. See [Gated Fields](../../extend/custom-components/gated-fields.md) for details. :::examples/dynamo_integration/README.md (1)
26-26: Optional: Consider rephrasing to reduce wordiness.Line 26 uses "with respect to" which static analysis suggests could be wordy. Optional improvements:
-Agentic LLM requests have predictable patterns with respect to conversation length, system prompts, and tool-calling. +Agentic LLM requests have predictable patterns in conversation length, system prompts, and tool-calling.Or restructure as: "Agentic LLM requests show predictable patterns: conversation length, system prompts, and tool-calling."
examples/dynamo_integration/ARCHITECTURE.md (1)
17-19: Minor note: Consider using full toolkit name on first mention.While this is an architecture document and the experimental note contextualizes usage, adding the NVIDIA prefix in line 18 would align with documentation standards:
-> ⚠️ **EXPERIMENTAL**: This integration between NeMo Agent toolkit and Dynamo is experimental and under active development. APIs, configurations, and features may change without notice. +> ⚠️ **EXPERIMENTAL**: This integration between NVIDIA NeMo Agent toolkit and Dynamo is experimental and under active development. APIs, configurations, and features may change without notice.examples/dynamo_integration/react_benchmark_agent/README.md (1)
18-19: Minor: Use full toolkit name on first mention.Line 18 references "NeMo Agent toolkit" in the experimental note. Per documentation guidelines, use the full vendor-prefixed name on first mention:
-> ⚠️ **EXPERIMENTAL**: This integration between NVIDIA NeMo Agent toolkit and Dynamo is experimental and under active development. APIs, configurations, and features may change without notice. +> ⚠️ **EXPERIMENTAL**: This integration between NVIDIA NeMo Agent toolkit and Dynamo is experimental and under active development. APIs, configurations, and features may change without notice.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
docs/source/build-workflows/llms/index.md(1 hunks)examples/dynamo_integration/ARCHITECTURE.md(1 hunks)examples/dynamo_integration/README.md(1 hunks)examples/dynamo_integration/react_benchmark_agent/README.md(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md(1 hunks)external/dynamo/README.md(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md
🧰 Additional context used
📓 Path-based instructions (9)
**/*.{md,rst,py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent references
Files:
examples/dynamo_integration/ARCHITECTURE.mddocs/source/build-workflows/llms/index.mdexamples/dynamo_integration/README.mdexamples/dynamo_integration/react_benchmark_agent/README.mdexternal/dynamo/README.md
**/*.{md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use 'NeMo Agent Toolkit' (capitalize 'T') when the name appears in headings
Files:
examples/dynamo_integration/ARCHITECTURE.mddocs/source/build-workflows/llms/index.mdexamples/dynamo_integration/README.mdexamples/dynamo_integration/react_benchmark_agent/README.mdexternal/dynamo/README.md
**/*.{py,js,ts,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Indent with 4 spaces, never tabs, and ensure every file ends with a single newline
Files:
examples/dynamo_integration/ARCHITECTURE.mddocs/source/build-workflows/llms/index.mdexamples/dynamo_integration/README.mdexamples/dynamo_integration/react_benchmark_agent/README.mdexternal/dynamo/README.md
**/*.{py,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Version numbers are derived automatically by setuptools-scm; never hard-code them in code or docs
Files:
examples/dynamo_integration/ARCHITECTURE.mddocs/source/build-workflows/llms/index.mdexamples/dynamo_integration/README.mdexamples/dynamo_integration/react_benchmark_agent/README.mdexternal/dynamo/README.md
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values (except for return values of
None,
in that situation no return type hint is needed).
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
examples/dynamo_integration/ARCHITECTURE.mddocs/source/build-workflows/llms/index.mdexamples/dynamo_integration/README.mdexamples/dynamo_integration/react_benchmark_agent/README.mdexternal/dynamo/README.md
examples/**/*
⚙️ CodeRabbit configuration file
examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.
- If an example contains Python code, it should be placed in a subdirectory named
src/and should
contain apyproject.tomlfile. Optionally, it might also contain scripts in ascripts/directory.- If an example contains YAML files, they should be placed in a subdirectory named
configs/. - If an example contains sample data files, they should be placed in a subdirectory nameddata/, and should
be checked into git-lfs.
Files:
examples/dynamo_integration/ARCHITECTURE.mdexamples/dynamo_integration/README.mdexamples/dynamo_integration/react_benchmark_agent/README.md
docs/**/*.{md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never use deprecated names: Agent Intelligence toolkit, aiqtoolkit, AgentIQ, AIQ/aiq in documentation (unless intentionally referring to deprecated versions)
Files:
docs/source/build-workflows/llms/index.md
docs/source/**/*.{md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
docs/source/**/*.{md,rst}: Documentation must be clear and comprehensive, without TODOs, FIXMEs, or placeholder text
Ensure documentation is free of offensive or outdated terms
Ensure documentation is free of spelling mistakes and do not use words listed in ci/vale/styles/config/vocabularies/nat/reject.txt
Files:
docs/source/build-workflows/llms/index.md
docs/source/**/*
⚙️ CodeRabbit configuration file
docs/source/**/*: This directory contains the source code for the documentation. All documentation should be written in Markdown format. Any image files should be placed in thedocs/source/_staticdirectory.Documentation Categories
Ensure documentation is placed in the correct category:
get-started/: Introductory documentation for new users
get-started/tutorials/: Step-by-step learning guidesbuild-workflows/: Workflow creation, configuration, adding remote MCP tools or A2A agents -run-workflows/: Execution, observability, serving workflows via MCP and A2A protocols -improve-workflows/: Evaluation and optimization guides -components/: Specific component implementations (agents, tools, connectors) -extend/: Custom component development and testing (not core library contributions) -reference/: Python and REST API documentation only -resources/: Project information (licensing, FAQs)
resources/contributing/: Development environment and contribution guidesPlacement rules:
- Component implementations always belong in
components/, notbuild-workflows/2. API documentation belongs only inreference/3. Using remote MCP tools or A2A agents should be placed inbuild-workflows/4. Serving workflows via MCP/A2A should be placed inrun-workflows/
Files:
docs/source/build-workflows/llms/index.md
🧠 Learnings (6)
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to **/*.{md,rst,py} : Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent references
Applied to files:
examples/dynamo_integration/README.mdexternal/dynamo/README.md
📚 Learning: 2025-08-25T15:24:13.143Z
Learnt from: willkill07
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 710
File: .coderabbit.yaml:33-33
Timestamp: 2025-08-25T15:24:13.143Z
Learning: User feedback indicates that CodeRabbit's chat commands, particularly `coderabbitai review` for manual incremental reviews, need better documentation and discoverability within repositories, especially when auto_incremental_review is disabled.
Applied to files:
examples/dynamo_integration/README.mdexternal/dynamo/README.md
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to **/*.{md,rst} : Use 'NeMo Agent Toolkit' (capitalize 'T') when the name appears in headings
Applied to files:
external/dynamo/README.md
📚 Learning: 2025-12-12T20:49:51.370Z
Learnt from: zterek
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1243
File: examples/risk_and_security/retail_agent/src/nat_retail_agent/configs/red-teaming.yml:1-98
Timestamp: 2025-12-12T20:49:51.370Z
Learning: In the NVIDIA/NeMo-Agent-Toolkit repository, most YAML configuration files use 2-space indentation rather than 4-space indentation. When reviewing YAML files, prefer consistency with the existing codebase style (2-space indentation) over the stated coding guideline (4-space indentation) until a repository-wide standardization is performed.
Applied to files:
external/dynamo/README.md
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to docs/**/*.{md,rst} : Never use deprecated names: Agent Intelligence toolkit, aiqtoolkit, AgentIQ, AIQ/aiq in documentation (unless intentionally referring to deprecated versions)
Applied to files:
external/dynamo/README.md
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to **/*.{py,toml,yaml,yml} : Use abbreviations: 'nat' for API namespace and CLI tool, 'nvidia-nat' for package name, 'NAT' for environment variable prefixes and informal comments
Applied to files:
external/dynamo/README.md
🪛 LanguageTool
docs/source/build-workflows/llms/index.md
[grammar] ~152-~152: Use a hyphen to join words.
Context: ...rimental) Dynamo is an inference engine agnostic LLM provider designed to optimi...
(QB_NEW_EN_HYPHEN)
[grammar] ~164-~164: Ensure spelling is correct
Context: ...sation prefix IDs. Setting to null will distable use of prefix_template, prefix_total_re...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
examples/dynamo_integration/README.md
[style] ~26-~26: ‘with respect to’ might be wordy. Consider a shorter alternative.
Context: ... LLM requests have predictable patterns with respect to conversation length, system prompts, an...
(EN_WORDINESS_PREMIUM_WITH_RESPECT_TO)
external/dynamo/README.md
[style] ~465-~465: Consider using a different verb for a more formal wording.
Context: ... the script provides guidance on how to fix the issue. --- ## Stopping Dynamo A ...
(FIX_RESOLVE)
🪛 markdownlint-cli2 (0.18.1)
examples/dynamo_integration/react_benchmark_agent/README.md
692-692: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (1)
external/dynamo/README.md (1)
21-21: Use "NVIDIA NeMo Agent toolkit" on first mention.Line 21 references the toolkit by abbreviated name. Per coding guidelines, the full name with vendor prefix should be used on first mention in the file.
-> ⚠️ **EXPERIMENTAL**: This integration between NVIDIA NeMo Agent toolkit and Dynamo is experimental and under active development. APIs, configurations, and features may change without notice. +> ⚠️ **EXPERIMENTAL**: This integration between NVIDIA NeMo Agent toolkit and Dynamo is experimental and under active development. APIs, configurations, and features may change without notice.Based on coding guidelines for
**/*.{md,rst,py}.
|
/ok to test 6d4ff39 |
|
/ok to test ce72671 |
|
/ok to test 072a7f7 |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/nat/llm/dynamo_llm.py (1)
251-312: Correct header injection logic with minor type hint opportunity.The hook creation pattern is well-designed:
- Generates a default prefix ID once at hook creation for consistency across requests
- Properly checks
DynamoPrefixContextfirst for per-request overrides (useful in batch evaluation scenarios)- Injects all required Dynamo headers with correct names and value formatting
- Includes helpful debug logging at each decision point
The outer function has comprehensive type hints. The inner
on_requestfunction (line 287) could optionally benefit from explicit type hints for better IDE support and clarity, though this is not strictly required for private functions.If you'd like to add type hints to the inner function for consistency:
- async def on_request(request): + async def on_request(request: "httpx.Request") -> None: """Inject Dynamo prefix headers before each request."""
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.py(2 hunks)src/nat/llm/dynamo_llm.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (11)
**/*.{md,rst,py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent references
Files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.py
**/*.{py,toml,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use abbreviations: 'nat' for API namespace and CLI tool, 'nvidia-nat' for package name, 'NAT' for environment variable prefixes and informal comments
Files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.py: Follow PEP 20 and PEP 8 for Python style guidelines
Run yapf second (PEP 8 base, column_limit = 120) for Python formatting
Use ruff check --fix (via pre-commit) for linting using configuration embedded in pyproject.toml, fix warnings unless explicitly ignored
Use snake_case for functions and variables, PascalCase for classes, UPPER_CASE for constants
All public APIs require Python 3.11+ type hints on parameters and return values
Prefer collections.abc / typing abstractions (Sequence over list) for type hints
Use typing.Annotated for units or extra metadata when useful
Treat pyright warnings (configured in pyproject.toml) as errors during development
Preserve stack traces and prevent duplicate logging when handling exceptions; use bare raise statements and logger.error() when re-raising
When catching and logging exceptions without re-raising, always use logger.exception() to capture the full stack trace information
Provide Google-style docstrings for every public module, class, function and CLI command
The first line of docstrings must be a concise description ending with a period (Vale checks this)
Surround code entities with backticks in docstrings to avoid Vale false-positives
Validate and sanitise all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work (HTTP, DB, file reads)
Cache expensive computations with functools.lru_cache or an external cache when appropriate
Leverage NumPy vectorised operations whenever beneficial and feasible
Files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.py
**/*.{py,js,ts,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Indent with 4 spaces, never tabs, and ensure every file ends with a single newline
Files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.py
**/*.{py,env,toml,yaml,yml,json}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never commit API keys, credentials or personal data; use environment variables or .env files excluded from Git
Files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.py
**/*.{py,js,ts,java,cpp,c,go,rb,php}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Every file must start with the standard SPDX Apache-2.0 header
Files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.py
**/*.{py,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Version numbers are derived automatically by setuptools-scm; never hard-code them in code or docs
Files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.py
**/*.{py,js,ts,java,cpp,c,go,rb,php,sh}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All source files must include the SPDX Apache-2.0 header template
Files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values (except for return values of
None,
in that situation no return type hint is needed).
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.pysrc/nat/llm/dynamo_llm.py
packages/**/*
⚙️ CodeRabbit configuration file
packages/**/*: - This directory contains optional plugin packages for the toolkit, each should contain apyproject.tomlfile. - Thepyproject.tomlfile should declare a dependency onnvidia-nator another package with a name starting
withnvidia-nat-. This dependency should be declared using~=<version>, and the version should be a two
digit version (ex:~=1.0).
- Not all packages contain Python code, if they do they should also contain their own set of tests, in a
tests/directory at the same level as thepyproject.tomlfile.- When adding a new package, that new package name (as defined in the
pyproject.tomlfile) should
be added as a dependency to the nvidia-nat-all package inpackages/nvidia_nat_all/pyproject.toml
Files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.py
src/nat/**/*
⚙️ CodeRabbit configuration file
This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.
Files:
src/nat/llm/dynamo_llm.py
🧠 Learnings (2)
📚 Learning: 2025-09-15T21:26:29.430Z
Learnt from: saglave
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 726
File: examples/frameworks/adk_demo/src/nat_adk_demo/weather_update_tool.py:0-0
Timestamp: 2025-09-15T21:26:29.430Z
Learning: In NAT's ADK integration, the docstring of registered functions serves as the tool description sent to the LLM, not standard Python function documentation. The docstring should describe the logical tool interface (parameters the LLM will provide) rather than the NAT framework wrapper parameters like tool_config and builder.
Applied to files:
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.py
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Confirm that copyright years are up-to-date whenever a file is changed
Applied to files:
src/nat/llm/dynamo_llm.py
🧬 Code graph analysis (2)
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.py (3)
src/nat/llm/dynamo_llm.py (3)
dynamo_llm(356-361)DynamoModelConfig(164-243)get_dynamo_field_names(221-243)src/nat/builder/framework_enum.py (1)
LLMFrameworkEnum(19-26)src/nat/utils/responses_api.py (1)
validate_no_responses_api(21-26)
src/nat/llm/dynamo_llm.py (1)
src/nat/data_models/optimizable.py (2)
OptimizableField(157-196)SearchSpace(35-154)
🔇 Additional comments (6)
packages/nvidia_nat_adk/src/nat/plugins/adk/llm.py (1)
130-205: LGTM! Well-structured Dynamo ADK client implementation.The
dynamo_adkfunction correctly follows the established pattern for ADK client registrations in this file. The implementation properly:
- Validates API compatibility with
validate_no_responses_api- Excludes Dynamo-specific fields using
DynamoModelConfig.get_dynamo_field_names()- Constructs prefix headers matching the Dynamo spec (x-prefix-id, x-prefix-total-requests, x-prefix-osl, x-prefix-iat)
- Documents the static header limitation and recommends LangChain for dynamic prefix IDs
The docstring clearly explains the ADK-specific constraint that LiteLLM sets headers at initialization time rather than per-request, which is helpful context for users choosing between frameworks.
src/nat/llm/dynamo_llm.py (5)
1-75: Excellent module documentation and setup.The copyright headers, license information, and module-level documentation are comprehensive and correct. The docstring clearly explains:
- The module's purpose (OpenAI-compatible LLM with Dynamo prefix headers)
- How httpx event hooks enable framework-agnostic header injection
- Detailed explanations of each Dynamo prefix parameter (OSL, IAT, total_requests)
The imports are well-organized and appropriate for the functionality provided.
82-156: Well-implemented context management pattern.The
DynamoPrefixContextclass provides a clean, thread-safe way to manage prefix IDs across LLM calls usingContextVar. Key strengths:
- Thread-safe context variable storage appropriate for async environments
- Context manager (
scope) properly ensures cleanup viafinallyblock, even on exceptions- Clear docstrings with usage examples for both manual and context manager approaches
- Proper type hints on all public methods
This design enables per-evaluation-question prefix IDs while maintaining safety across concurrent operations.
164-243: Solid configuration model with optimizer integration.The
DynamoModelConfigclass is well-designed with proper field constraints and optimizer integration:
- Inherits from
OpenAIModelConfigfor OpenAI API compatibility- Uses
OptimizableFieldwith appropriateSearchSpacedefinitions for the three tunable parameters- Field-level validation constraints (e.g.,
ge=1, le=50) are appropriately broader than optimizer search spaces, allowing manual overrides outside the search rangeget_dynamo_field_names()utility method provides clean field exclusion for framework adapters- Comprehensive docstrings explain the purpose and impact of each Dynamo parameter
The default values (prefix_template enabled, MEDIUM for OSL/IAT, 10 total requests) provide sensible starting points for optimization.
315-345: Clean httpx client factory with proper configuration.The function provides a straightforward way to create a configured
httpx.AsyncClient:
- Proper type hints on all parameters and return value
- Correctly configures event hooks with the Dynamo request hook
- Sets appropriate timeout using
httpx.Timeoutwrapper- Local import of
httpxis appropriate for optional dependency handling- Clear docstring explains the framework-agnostic approach
This factory function enables easy integration with any framework that accepts custom httpx clients (OpenAI SDK, LangChain, etc.).
355-361: Provider registration correctly implemented.The
dynamo_llmfunction properly registers the Dynamo LLM provider:
- Correctly decorated with
@register_llm_providerusing theDynamoModelConfigtype- Yields
LLMProviderInfowith appropriate config and description- Description clearly communicates the provider's purpose (Dynamo-optimized with automatic prefix headers)
The registration pattern is consistent with other LLM providers in the codebase and enables automatic discovery by the NAT framework.
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
|
/ok to test 54aeb06 |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
examples/dynamo_integration/react_benchmark_agent/pyproject.toml (1)
18-18: Consider using lowercase "toolkit" for consistency.The root
pyproject.tomluses "NVIDIA NeMo Agent toolkit" with lowercase "toolkit". For consistency, consider updating the description to match this convention.-description = "Custom NeMo Agent Toolkit Workflow" +description = "Custom NeMo Agent toolkit Workflow"Alternatively, you could make it more specific to this example:
-description = "Custom NeMo Agent Toolkit Workflow" +description = "React Benchmark Agent for Dynamo integration"
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
examples/dynamo_integration/react_benchmark_agent/pyproject.toml(1 hunks)examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md(1 hunks)pyproject.toml(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/dynamo_integration/react_benchmark_agent/src/react_benchmark_agent/README.md
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{py,toml,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use abbreviations: 'nat' for API namespace and CLI tool, 'nvidia-nat' for package name, 'NAT' for environment variable prefixes and informal comments
Files:
examples/dynamo_integration/react_benchmark_agent/pyproject.tomlpyproject.toml
**/pyproject.toml
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/pyproject.toml: Dependencies should use ~= format with two-digit versions (e.g., ~=1.0)
New dependencies must be added to both pyproject.toml (alphabetically) and uv.lock via uv pip install --sync
Files:
examples/dynamo_integration/react_benchmark_agent/pyproject.tomlpyproject.toml
**/*.{py,env,toml,yaml,yml,json}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never commit API keys, credentials or personal data; use environment variables or .env files excluded from Git
Files:
examples/dynamo_integration/react_benchmark_agent/pyproject.tomlpyproject.toml
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values (except for return values of
None,
in that situation no return type hint is needed).
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
examples/dynamo_integration/react_benchmark_agent/pyproject.tomlpyproject.toml
examples/**/*
⚙️ CodeRabbit configuration file
examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.
- If an example contains Python code, it should be placed in a subdirectory named
src/and should
contain apyproject.tomlfile. Optionally, it might also contain scripts in ascripts/directory.- If an example contains YAML files, they should be placed in a subdirectory named
configs/. - If an example contains sample data files, they should be placed in a subdirectory nameddata/, and should
be checked into git-lfs.
Files:
examples/dynamo_integration/react_benchmark_agent/pyproject.toml
🧠 Learnings (7)
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to packages/*/pyproject.toml : The pyproject.toml should declare a dependency on nvidia-nat or another package with a name starting with nvidia-nat-
Applied to files:
examples/dynamo_integration/react_benchmark_agent/pyproject.tomlpyproject.toml
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Each package should contain a pyproject.toml file
Applied to files:
examples/dynamo_integration/react_benchmark_agent/pyproject.toml
📚 Learning: 2025-11-10T21:26:35.059Z
Learnt from: jiaxiangr
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 974
File: packages/nvidia_nat_all/pyproject.toml:39-39
Timestamp: 2025-11-10T21:26:35.059Z
Learning: In packages/nvidia_nat_all/pyproject.toml, workspace dependencies (nvidia-nat-* plugin packages) should NOT have version constraints because they are managed as workspace dependencies. Version constraints are only applied to the base nvidia-nat package and external dependencies, not to internal workspace packages.
Applied to files:
examples/dynamo_integration/react_benchmark_agent/pyproject.tomlpyproject.toml
📚 Learning: 2025-12-03T18:42:23.494Z
Learnt from: AnuradhaKaruppiah
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 1147
File: packages/nvidia_nat_a2a/pyproject.toml:1-10
Timestamp: 2025-12-03T18:42:23.494Z
Learning: In the packages/ directory, pyproject.toml files typically do not include SPDX license headers. Out of 34 packages, only nvidia_nat_strands is an exception. This pattern differs from the requirement for SPDX headers in source code files (.py, .js, .ts, etc.).
Applied to files:
examples/dynamo_integration/react_benchmark_agent/pyproject.toml
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to **/*.{py,toml,yaml,yml} : Use abbreviations: 'nat' for API namespace and CLI tool, 'nvidia-nat' for package name, 'NAT' for environment variable prefixes and informal comments
Applied to files:
examples/dynamo_integration/react_benchmark_agent/pyproject.tomlpyproject.toml
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Applies to **/*.{md,rst,py} : Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent references
Applied to files:
examples/dynamo_integration/react_benchmark_agent/pyproject.toml
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Python code in examples should be placed in a src/ subdirectory with a pyproject.toml file
Applied to files:
pyproject.toml
🔇 Additional comments (2)
pyproject.toml (1)
122-122: LGTM! Example registration looks correct.The new
nat_react_benchmark_agentexample is properly registered in both the examples list and workspace sources with correct alphabetical ordering and path configuration.Also applies to: 228-228
examples/dynamo_integration/react_benchmark_agent/pyproject.toml (1)
1-1: SPDX headers are not required for pyproject.toml files in the examples/ directory.The coding guidelines specify SPDX headers only for source code files:
**/*.{py,js,ts,java,cpp,c,go,rb,php,sh}. Configuration files likepyproject.tomlare explicitly excluded from this requirement. Additionally, license information for Python packages is declared in the[project]table ofpyproject.tomlas packaging metadata, not as file-level SPDX headers. This pattern is consistent throughout both theexamples/andpackages/directories in the repository.Likely an incorrect or invalid review comment.
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
|
/ok to test 0205fa3 |
|
/ok to test d33c2b7 |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/nvidia_nat_langchain/tests/test_llm_langchain.py (2)
171-208: Well-structured test fixtures with comprehensive coverage.The fixtures cover the key test scenarios:
- No prefix template (default behavior)
- With prefix template (header injection with all optimization params)
- RESPONSES API type (special branch logic)
Optional: Consider file-wide fixture naming refactor.
The coding guidelines specify that fixtures should use the
fixture_prefix and define anameargument in the decorator. For example:@pytest.fixture(name="dynamo_cfg_no_prefix") def fixture_dynamo_cfg_no_prefix(self): """Dynamo config without prefix template (no header injection).""" return DynamoModelConfig(...)However, this pattern is not used anywhere in this file (all existing fixtures use descriptive names without the
fixture_prefix). Updating this would be a file-wide refactoring effort that could be deferred to maintain consistency with the existing code.
276-313: Strong test with excellent documentation, but could be more maintainable.The test has a detailed docstring explaining the rationale for field exclusion, and the assertions are clear. However, the field names are hardcoded rather than derived from
get_dynamo_field_names().Consider making this test more maintainable by programmatically checking against the fields returned by
get_dynamo_field_names():+ # Get the Dynamo-specific fields that should be excluded + excluded_fields = dynamo_cfg_with_prefix.get_dynamo_field_names() + # These Dynamo-specific fields should NOT be passed to ChatOpenAI - assert "prefix_template" not in kwargs - assert "prefix_total_requests" not in kwargs - assert "prefix_osl" not in kwargs - assert "prefix_iat" not in kwargs - assert "request_timeout" not in kwargs + for field_name in excluded_fields: + assert field_name not in kwargs, f"Dynamo-specific field '{field_name}' leaked to ChatOpenAI kwargs"This ensures the test stays in sync if
get_dynamo_field_names()is updated in the future.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
packages/nvidia_nat_langchain/tests/test_llm_langchain.py(4 hunks)
🧰 Additional context used
📓 Path-based instructions (11)
**/*.{md,rst,py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent references
Files:
packages/nvidia_nat_langchain/tests/test_llm_langchain.py
**/*.{py,toml,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use abbreviations: 'nat' for API namespace and CLI tool, 'nvidia-nat' for package name, 'NAT' for environment variable prefixes and informal comments
Files:
packages/nvidia_nat_langchain/tests/test_llm_langchain.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.py: Follow PEP 20 and PEP 8 for Python style guidelines
Run yapf second (PEP 8 base, column_limit = 120) for Python formatting
Use ruff check --fix (via pre-commit) for linting using configuration embedded in pyproject.toml, fix warnings unless explicitly ignored
Use snake_case for functions and variables, PascalCase for classes, UPPER_CASE for constants
All public APIs require Python 3.11+ type hints on parameters and return values
Prefer collections.abc / typing abstractions (Sequence over list) for type hints
Use typing.Annotated for units or extra metadata when useful
Treat pyright warnings (configured in pyproject.toml) as errors during development
Preserve stack traces and prevent duplicate logging when handling exceptions; use bare raise statements and logger.error() when re-raising
When catching and logging exceptions without re-raising, always use logger.exception() to capture the full stack trace information
Provide Google-style docstrings for every public module, class, function and CLI command
The first line of docstrings must be a concise description ending with a period (Vale checks this)
Surround code entities with backticks in docstrings to avoid Vale false-positives
Validate and sanitise all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work (HTTP, DB, file reads)
Cache expensive computations with functools.lru_cache or an external cache when appropriate
Leverage NumPy vectorised operations whenever beneficial and feasible
Files:
packages/nvidia_nat_langchain/tests/test_llm_langchain.py
**/*.{py,js,ts,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Indent with 4 spaces, never tabs, and ensure every file ends with a single newline
Files:
packages/nvidia_nat_langchain/tests/test_llm_langchain.py
**/test_*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/test_*.py: Use pytest with pytest-asyncio for asynchronous code testing
Test functions should be named using the test_ prefix, using snake_case
Extract frequently repeated code into pytest fixtures, which should be named using the fixture_ prefix and define the name argument in the decorator
Mock external services with pytest_httpserver or unittest.mock instead of hitting live endpoints
Mark slow tests with @pytest.mark.slow so they can be skipped in the default test suite
Mark integration tests requiring external services with @pytest.mark.integration so they can be skipped in the default test suite
Files:
packages/nvidia_nat_langchain/tests/test_llm_langchain.py
**/*.{py,env,toml,yaml,yml,json}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never commit API keys, credentials or personal data; use environment variables or .env files excluded from Git
Files:
packages/nvidia_nat_langchain/tests/test_llm_langchain.py
**/*.{py,js,ts,java,cpp,c,go,rb,php}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Every file must start with the standard SPDX Apache-2.0 header
Files:
packages/nvidia_nat_langchain/tests/test_llm_langchain.py
**/*.{py,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Version numbers are derived automatically by setuptools-scm; never hard-code them in code or docs
Files:
packages/nvidia_nat_langchain/tests/test_llm_langchain.py
**/*.{py,js,ts,java,cpp,c,go,rb,php,sh}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All source files must include the SPDX Apache-2.0 header template
Files:
packages/nvidia_nat_langchain/tests/test_llm_langchain.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values (except for return values of
None,
in that situation no return type hint is needed).
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
packages/nvidia_nat_langchain/tests/test_llm_langchain.py
packages/**/*
⚙️ CodeRabbit configuration file
packages/**/*: - This directory contains optional plugin packages for the toolkit, each should contain apyproject.tomlfile. - Thepyproject.tomlfile should declare a dependency onnvidia-nator another package with a name starting
withnvidia-nat-. This dependency should be declared using~=<version>, and the version should be a two
digit version (ex:~=1.0).
- Not all packages contain Python code, if they do they should also contain their own set of tests, in a
tests/directory at the same level as thepyproject.tomlfile.- When adding a new package, that new package name (as defined in the
pyproject.tomlfile) should
be added as a dependency to the nvidia-nat-all package inpackages/nvidia_nat_all/pyproject.toml
Files:
packages/nvidia_nat_langchain/tests/test_llm_langchain.py
🧬 Code graph analysis (1)
packages/nvidia_nat_langchain/tests/test_llm_langchain.py (1)
src/nat/llm/dynamo_llm.py (2)
dynamo_llm(356-361)DynamoModelConfig(164-243)
🔇 Additional comments (3)
packages/nvidia_nat_langchain/tests/test_llm_langchain.py (3)
18-18: LGTM! Imports are appropriate.The new imports (AsyncMock, DynamoModelConfig, dynamo_langchain) are necessary for the Dynamo integration tests and follow the existing import organization pattern.
Also applies to: 28-28, 32-32
210-275: Excellent test coverage with proper async handling.The test methods effectively verify:
- No custom httpx client when prefix template is absent
- Custom httpx client creation with correct parameters when prefix template is present
- Proper async cleanup (aclose awaited)
- RESPONSES API special flags
The AsyncMock usage for the aclose method is correct and the tests properly verify cleanup occurs.
327-337: LGTM! Registry mapping correctly updated.The DynamoModelConfig mapping is properly added to the mock registry, following the same pattern as the existing NIM, OpenAI, and Bedrock mappings. The assertion correctly verifies the registration.
Signed-off-by: Bryan Bednarski <bbednarski@nvidia.com>
|
/ok to test da05743 |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/nat/llm/dynamo_llm.py (1)
305-306: Consider removing redundant .upper() calls.The
oslandiatparameters are already typed asPrefixLevel = Literal["LOW", "MEDIUM", "HIGH"], so calling.upper()is redundant. While harmless, removing it would make the code cleaner.Apply this diff if you'd like to simplify:
- request.headers["x-prefix-osl"] = osl.upper() - request.headers["x-prefix-iat"] = iat.upper() + request.headers["x-prefix-osl"] = osl + request.headers["x-prefix-iat"] = iat
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/nat/llm/dynamo_llm.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (10)
**/*.{md,rst,py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use 'NVIDIA NeMo Agent toolkit' on first use, then 'NeMo Agent toolkit' for subsequent references
Files:
src/nat/llm/dynamo_llm.py
**/*.{py,toml,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Use abbreviations: 'nat' for API namespace and CLI tool, 'nvidia-nat' for package name, 'NAT' for environment variable prefixes and informal comments
Files:
src/nat/llm/dynamo_llm.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
**/*.py: Follow PEP 20 and PEP 8 for Python style guidelines
Run yapf second (PEP 8 base, column_limit = 120) for Python formatting
Use ruff check --fix (via pre-commit) for linting using configuration embedded in pyproject.toml, fix warnings unless explicitly ignored
Use snake_case for functions and variables, PascalCase for classes, UPPER_CASE for constants
All public APIs require Python 3.11+ type hints on parameters and return values
Prefer collections.abc / typing abstractions (Sequence over list) for type hints
Use typing.Annotated for units or extra metadata when useful
Treat pyright warnings (configured in pyproject.toml) as errors during development
Preserve stack traces and prevent duplicate logging when handling exceptions; use bare raise statements and logger.error() when re-raising
When catching and logging exceptions without re-raising, always use logger.exception() to capture the full stack trace information
Provide Google-style docstrings for every public module, class, function and CLI command
The first line of docstrings must be a concise description ending with a period (Vale checks this)
Surround code entities with backticks in docstrings to avoid Vale false-positives
Validate and sanitise all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work (HTTP, DB, file reads)
Cache expensive computations with functools.lru_cache or an external cache when appropriate
Leverage NumPy vectorised operations whenever beneficial and feasible
Files:
src/nat/llm/dynamo_llm.py
**/*.{py,js,ts,yaml,yml,json,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Indent with 4 spaces, never tabs, and ensure every file ends with a single newline
Files:
src/nat/llm/dynamo_llm.py
**/*.{py,env,toml,yaml,yml,json}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Never commit API keys, credentials or personal data; use environment variables or .env files excluded from Git
Files:
src/nat/llm/dynamo_llm.py
**/*.{py,js,ts,java,cpp,c,go,rb,php}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Every file must start with the standard SPDX Apache-2.0 header
Files:
src/nat/llm/dynamo_llm.py
**/*.{py,md,rst}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Version numbers are derived automatically by setuptools-scm; never hard-code them in code or docs
Files:
src/nat/llm/dynamo_llm.py
**/*.{py,js,ts,java,cpp,c,go,rb,php,sh}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All source files must include the SPDX Apache-2.0 header template
Files:
src/nat/llm/dynamo_llm.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values (except for return values of
None,
in that situation no return type hint is needed).
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
src/nat/llm/dynamo_llm.py
src/nat/**/*
⚙️ CodeRabbit configuration file
This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.
Files:
src/nat/llm/dynamo_llm.py
🧠 Learnings (1)
📚 Learning: 2025-11-24T18:56:53.109Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2025-11-24T18:56:53.109Z
Learning: Confirm that copyright years are up-to-date whenever a file is changed
Applied to files:
src/nat/llm/dynamo_llm.py
🧬 Code graph analysis (1)
src/nat/llm/dynamo_llm.py (2)
src/nat/data_models/optimizable.py (2)
OptimizableField(157-196)SearchSpace(35-154)src/nat/llm/openai_llm.py (2)
openai_llm(57-59)OpenAIModelConfig(32-53)
🔇 Additional comments (7)
src/nat/llm/dynamo_llm.py (7)
1-14: LGTM!License header is correct with up-to-date copyright year (2025) and complete Apache 2.0 license text.
15-47: LGTM!Excellent module-level documentation. The docstring clearly explains the purpose, provides context for Dynamo prefix parameters (OSL, IAT, total_requests), and follows Google-style conventions.
49-75: LGTM!Imports are well-organized and follow best practices. Good use of
TYPE_CHECKINGto defer httpx import andcollections.abcfor abstract type hints per coding guidelines.
82-157: LGTM!Well-implemented context management using
ContextVarfor thread-safe prefix ID tracking. Thescope()context manager provides clean resource management, and all methods include proper type hints and comprehensive docstrings.
165-245: LGTM!Excellent Pydantic configuration model. Proper use of
OptimizableFieldwithSearchSpacefor hyperparameter tuning, comprehensive field descriptions, and appropriate validation constraints. Theget_dynamo_field_names()utility method is well-documented and useful for framework integrations.
317-347: LGTM with one caveat.Well-structured function that creates a properly configured httpx client. The local import, timeout configuration, and event hook setup are all correct.
Note: This function inherits the
prefix_template=Nonebehavior issue flagged in the previous comment. Ensure callers check forNonebefore invoking this function, or add logic here to handleNoneappropriately.
357-363: LGTM!Provider registration follows the framework's established pattern correctly. The async generator properly yields
LLMProviderInfowith appropriate configuration and description.
|
/merge |
… routers (NVIDIA#1242) This is an initial integration between the NVIDIA NAT and [Dynamo](https://github.com/ai-dynamo/dynamo) SDKs. The scope of this PR is around establishing key integrations and a robust evaluation harness, that will allow the NAT team to explore agentic inference acceleration using Dynamo servers. There are a few key components for review: - [dynamo components] (./external/dynamo): startup and shutdown scripts for three variates of dyanmo servers - unified server with KV aware routing, unified server with predictive KV aware routing (thompson sampling), and disaggregated serving with KV aware routing - [dynamo LLM provider] (./src/nat/llm/dynamo_llm.py): a new llm provider that enabled custom request prefix headers which can be optimized for the predictive kv aware cache router (thompson). Framework integration in place for langchain and google ADK - [dynamo integration example agents] (./examples/dynamo_integration) - example project for running nat workflows (run, eval, optimize, profile) and compare performance between difference inference endpoints. Analysis and visualization scripts included. ### Musings - tool calling for the Galileo-V2 dataset used here is not parameterized by the agent in order to help normalize the OSL. - post-processing of experiment data is not integrated with profiler yet. There are currently two sets of plotting tools with _histograms_ being the newer. - After internal discussion, we opted to build a new dynamo llm provider (dynamo_llm.py), rather than implementing this as a mixin. It keeps development isolated to the dynamo components so that we can be more agile moving forward with development. - Features in this PR are largely experimental and non-breaking to the core library - There is an additional evaluator defined in `action_completion_evaluator.py`, its not used in current evaluations but im going to leave it here for completeness and possible extension. Closes ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md). - We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license. - Any contribution which contains commits that are not Signed-Off will not be accepted. - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. ## Summary by CodeRabbit * **New Features** * Dynamo LLM provider and end-to-end Dynamo backend integration with prefix-hint support and streaming/non-streaming API. * **New Agents & Tools** * React benchmark agent, decision-only tool stubs, banking tools group, self-evaluating agent with feedback/retry loop, and intent-buffer utilities. * **New Evaluators** * Tool Selection Quality (TSQ) and Action Completion (AC) evaluators. * **Documentation** * Comprehensive Dynamo integration guides, architecture docs, and READMEs. * **Tools & Scripts** * Deployment, start/stop, monitoring, benchmarking, plotting, dataset utilities, and env templates. * **Tests** * Extensive unit and integration tests across LLMs, agents, evaluators, and analysis tools. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> Authors: - Bryan Bednarski (https://github.com/bbednarski9) Approvers: - Dhruv Nandakumar (https://github.com/dnandakumar-nv) URL: NVIDIA#1242 Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
… routers (NVIDIA#1242) This is an initial integration between the NVIDIA NAT and [Dynamo](https://github.com/ai-dynamo/dynamo) SDKs. The scope of this PR is around establishing key integrations and a robust evaluation harness, that will allow the NAT team to explore agentic inference acceleration using Dynamo servers. There are a few key components for review: - [dynamo components] (./external/dynamo): startup and shutdown scripts for three variates of dyanmo servers - unified server with KV aware routing, unified server with predictive KV aware routing (thompson sampling), and disaggregated serving with KV aware routing - [dynamo LLM provider] (./src/nat/llm/dynamo_llm.py): a new llm provider that enabled custom request prefix headers which can be optimized for the predictive kv aware cache router (thompson). Framework integration in place for langchain and google ADK - [dynamo integration example agents] (./examples/dynamo_integration) - example project for running nat workflows (run, eval, optimize, profile) and compare performance between difference inference endpoints. Analysis and visualization scripts included. ### Musings - tool calling for the Galileo-V2 dataset used here is not parameterized by the agent in order to help normalize the OSL. - post-processing of experiment data is not integrated with profiler yet. There are currently two sets of plotting tools with _histograms_ being the newer. - After internal discussion, we opted to build a new dynamo llm provider (dynamo_llm.py), rather than implementing this as a mixin. It keeps development isolated to the dynamo components so that we can be more agile moving forward with development. - Features in this PR are largely experimental and non-breaking to the core library - There is an additional evaluator defined in `action_completion_evaluator.py`, its not used in current evaluations but im going to leave it here for completeness and possible extension. Closes ## By Submitting this PR I confirm: - I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md). - We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license. - Any contribution which contains commits that are not Signed-Off will not be accepted. - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. ## Summary by CodeRabbit * **New Features** * Dynamo LLM provider and end-to-end Dynamo backend integration with prefix-hint support and streaming/non-streaming API. * **New Agents & Tools** * React benchmark agent, decision-only tool stubs, banking tools group, self-evaluating agent with feedback/retry loop, and intent-buffer utilities. * **New Evaluators** * Tool Selection Quality (TSQ) and Action Completion (AC) evaluators. * **Documentation** * Comprehensive Dynamo integration guides, architecture docs, and READMEs. * **Tools & Scripts** * Deployment, start/stop, monitoring, benchmarking, plotting, dataset utilities, and env templates. * **Tests** * Extensive unit and integration tests across LLMs, agents, evaluators, and analysis tools. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> Authors: - Bryan Bednarski (https://github.com/bbednarski9) Approvers: - Dhruv Nandakumar (https://github.com/dnandakumar-nv) URL: NVIDIA#1242 Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
|
This integration raises an interesting question around authorization In setups like this, where requests are dynamically routed and I’m less concerned with runtime control than with whether there’s a Curious if this has been discussed elsewhere in the project, or if |
Description
This is an initial integration between the NVIDIA NAT and Dynamo SDKs. The scope of this PR is around establishing key integrations and a robust evaluation harness, that will allow the NAT team to explore agentic inference acceleration using Dynamo servers. There are a few key components for review:
Musings
action_completion_evaluator.py, its not used in current evaluations but im going to leave it here for completeness and possible extension.Closes
By Submitting this PR I confirm:
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.