Add E2E test for the simple calculator HITL example by dagardner-nv · Pull Request #984 · NVIDIA/NeMo-Agent-Toolkit

dagardner-nv · 2025-10-13T16:21:00Z

Description

Fixes an issue where an extra \n was being inserted int the HITL prompt due the was the prompt was defined in the YAML.
E2E test created using subprocess, since the HITL callback function only exists within the console front-end.

By Submitting this PR I confirm:

I am familiar with the Contributing Guidelines.
We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
When the PR is ready for review, new or existing tests cover these changes.
When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

Bug Fixes
- Trimmed leading and trailing whitespace from the HITL approval prompt to ensure consistent behavior and reduce input-related issues.
Tests
- Added an asynchronous integration test for the HITL simple calculator workflow that runs the CLI end-to-end, verifies the confirmation prompt, supplies user responses via stdin, and checks final workflow outcomes for both approval and rejection paths.

Signed-off-by: David Gardner <dagardner@nvidia.com>

…e exceptions without messages Signed-off-by: David Gardner <dagardner@nvidia.com>

Signed-off-by: David Gardner <dagardner@nvidia.com>

coderabbitai · 2025-10-13T16:21:37Z

Walkthrough

Adds a prompt field validator to HITLApprovalFnConfig to strip whitespace after validation. Adds a new async, parametrized integration test that runs the NAT CLI for the simple calculator HITL example, supplies confirmation via stdin, and asserts exit code and expected outputs.

Changes

Cohort / File(s)	Summary
HITL approval config validation `examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py`	Added `@field_validator("prompt", mode="after")` with `validate_prompt` to strip leading/trailing whitespace from the `prompt` field in `HITLApprovalFnConfig`.
Simple calculator HITL E2E test `examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py`	New async, parametrized integration test invoking the NAT CLI with a predefined query; pipes `yes`/`no` via stdin; asserts exit code 0, presence of confirmation prompt in stdout, and expected workflow result in stderr.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

bug

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title uses clear imperative mood, stays well under the 72-character limit, and accurately describes the primary change of adding an end-to-end test for the simple calculator HITL example.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c191270 and 7bbce2a.

📒 Files selected for processing (1)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.
Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:
def my_function(param1: int, param2: str) -> bool:
    pass
For Python exception handling, ensure proper stack trace preservation:

When re-raising exceptions: use bare raise statements to maintain the original stack trace,
and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.

When catching and logging exceptions without re-raising: always use logger.exception()
to capture the full stack trace information.
Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

Confirm that copyright years are up-to date whenever a file is changed.

Files:

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.

If an example contains Python code, it should be placed in a subdirectory named src/ and should
contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.

If an example contains YAML files, they should be placed in a subdirectory named configs/. - If an example contains sample data files, they should be placed in a subdirectory named data/, and should
be checked into git-lfs.

Files:

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

🧬 Code graph analysis (1)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (2)

packages/nvidia_nat_test/src/nat/test/utils.py (1)

locate_example_config (56-67)

examples/HITL/simple_calculator_hitl/src/nat_simple_calculator_hitl/retry_react_agent.py (2)

retry_react_agent (55-221)

RetryReactAgentConfig (36-51)

🪛 Ruff (0.14.0)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

38-38: subprocess call: check for execution of untrusted input

(S603)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: CI Pipeline / Check

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Signed-off-by: David Gardner <dagardner@nvidia.com>

…to david-simple-calc-hitl Signed-off-by: David Gardner <dagardner@nvidia.com>

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

examples/HITL/por_to_jiratickets/README.md (1)

81-84: Fix README reference for HITL approval function
Replace the path to hitl_approval_tool.py instead of jira_tickets_tool.py in the example documentation.

🧹 Nitpick comments (5)

examples/HITL/por_to_jiratickets/README.md (1)

20-21: Brand casing consistency: “NeMo Agent Toolkit”.

Use “NeMo Agent Toolkit” (capital T) in docs to match naming guidance.

As per coding guidelines

Also applies to: 53-55

examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py (1)

41-58: Prompt trimming LGTM; consider minor robustness and typing.

Move import re to module scope.
Guard response.content.text to avoid attr/None errors.
Add return type + function docstring per guidelines.

As per coding guidelines

@@
-import re
+import re
@@
-@register_function(config_type=HITLApprovalFnConfig)
-async def hitl_approval_function(config: HITLApprovalFnConfig, builder: Builder):
+@register_function(config_type=HITLApprovalFnConfig)
+async def hitl_approval_function(config: HITLApprovalFnConfig, builder: Builder) -> "AsyncGenerator[FunctionInfo, None]":
+    """Prompt the user for approval and return True only on an explicit 'yes'."""
@@
-    prompt = f"{config.prompt.strip()} Please confirm if you would like to proceed. Respond with 'yes' or 'no'."
+    prompt = f"{config.prompt.strip()} Please confirm if you would like to proceed. Respond with 'yes' or 'no'."
@@
-        response: InteractionResponse = await user_input_manager.prompt_user_input(human_prompt_text)
-        response_str = response.content.text.lower()  # type: ignore
+        response: InteractionResponse = await user_input_manager.prompt_user_input(human_prompt_text)
+        response_text = getattr(getattr(response, "content", None), "text", "") or ""
+        response_str = response_text.lower()
@@
     yield FunctionInfo.from_fn(_arun,
                                description=("This function will be used to get the user's response to the prompt"))

Add at top of file (imports section):

+from typing import AsyncGenerator

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (3)

35-38: Fix typo in comment.

“usign” → “using”.

-# Use subprocess to run the NAT CLI rather than usign the API for two reasons:
+# Use subprocess to run the NAT CLI rather than using the API for two reasons:

38-41: Harden subprocess invocation and input.

Remove superfluous quotes around the --input value (quotes are for shell parsing).
Set shell=False explicitly, and pin encoding for deterministic behavior.

Also OK per Ruff S603: no untrusted input and shell=False.

-cmd = ["nat", "run", "--config_file", str(config_file.absolute()), "--input", '"Is 2 * 4 greater than 5?"']
-proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
+cmd = ["nat", "run", "--config_file", str(config_file.absolute()), "--input", "Is 2 * 4 greater than 5?"]
+proc = subprocess.Popen(
+    cmd,
+    stdin=subprocess.PIPE,
+    stdout=subprocess.PIPE,
+    stderr=subprocess.PIPE,
+    text=True,
+    encoding="utf-8",
+    errors="replace",
+    shell=False,
+)

44-47: Make result assertion robust (escape regex, search both streams).

“Workflow Result:” may be on stdout depending on env/log config; escape expected_result to avoid regex pitfalls.

-assert re.search(f"Workflow Result:.*{expected_result}", stderr, (re.IGNORECASE | re.MULTILINE | re.DOTALL)) is not None, \
-    f"Expected result '{expected_result}' not found in stderr: {stderr}"
+pattern = r"Workflow Result:.*" + re.escape(expected_result)
+combined = (stdout or "") + (stderr or "")
+assert re.search(pattern, combined, re.IGNORECASE | re.MULTILINE | re.DOTALL) is not None, \
+    f"Expected result '{expected_result}' not found.\nstdout:\n{stdout}\n\nstderr:\n{stderr}"

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d91fd6c and 86c5ab8.

📒 Files selected for processing (3)

examples/HITL/por_to_jiratickets/README.md (1 hunks)
examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py (1 hunks)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (5)

**/README.@(md|ipynb)

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Ensure READMEs follow the naming convention; avoid deprecated names; use “NeMo Agent Toolkit” (capital T) in headings

Files:

examples/HITL/por_to_jiratickets/README.md

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.
Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:
def my_function(param1: int, param2: str) -> bool:
    pass
For Python exception handling, ensure proper stack trace preservation:

When re-raising exceptions: use bare raise statements to maintain the original stack trace,
and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.

When catching and logging exceptions without re-raising: always use logger.exception()
to capture the full stack trace information.
Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

Confirm that copyright years are up-to date whenever a file is changed.

Files:

examples/HITL/por_to_jiratickets/README.md
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py

examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.

If an example contains Python code, it should be placed in a subdirectory named src/ and should
contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.

If an example contains YAML files, they should be placed in a subdirectory named configs/. - If an example contains sample data files, they should be placed in a subdirectory named data/, and should
be checked into git-lfs.

Files:

examples/HITL/por_to_jiratickets/README.md
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py

**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py

🧬 Code graph analysis (1)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (2)

packages/nvidia_nat_test/src/nat/test/utils.py (1)

locate_example_config (56-67)

examples/HITL/simple_calculator_hitl/src/nat_simple_calculator_hitl/retry_react_agent.py (2)

retry_react_agent (55-221)

RetryReactAgentConfig (36-51)

🪛 Ruff (0.14.0)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

39-39: subprocess call: check for execution of untrusted input

(S603)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: CI Pipeline / Check

🔇 Additional comments (1)

examples/HITL/por_to_jiratickets/README.md (1)

91-91: Whitespace trim fix is correct.

Using config.prompt.strip() prevents stray newlines in the HITL prompt.

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

…tor_hitl.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: David Gardner <96306125+dagardner-nv@users.noreply.github.com>

examples/HITL/por_to_jiratickets/README.md

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (2)
28-28: Add return type hint for completeness.

Per coding guidelines, all functions should include return type hints. Test functions implicitly return None.

Apply this diff:
-def test_hitl_workflow(response: str, expected_result: str):
+def test_hitl_workflow(response: str, expected_result: str) -> None:
As per coding guidelines.

45-46: Escape expected_result in regex pattern for robustness.

The current pattern interpolates expected_result directly without escaping, which works for the current test values but would break if a future test case includes regex metacharacters (e.g., "2+2=4" where + is a regex quantifier).

Apply this diff:
-    assert re.search(f"Workflow Result:.*{expected_result}", stderr, (re.IGNORECASE | re.MULTILINE | re.DOTALL)) is not None, \
+    assert re.search(f"Workflow Result:.*{re.escape(expected_result)}", stderr, (re.IGNORECASE | re.MULTILINE | re.DOTALL)) is not None, \
         f"Expected result '{expected_result}' not found in stderr: {stderr}"

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 86c5ab8 and cc2bb54.

📒 Files selected for processing (1)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.
Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:
def my_function(param1: int, param2: str) -> bool:
    pass
For Python exception handling, ensure proper stack trace preservation:

When re-raising exceptions: use bare raise statements to maintain the original stack trace,
and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.

When catching and logging exceptions without re-raising: always use logger.exception()
to capture the full stack trace information.
Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

Confirm that copyright years are up-to date whenever a file is changed.

Files:

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.

If an example contains Python code, it should be placed in a subdirectory named src/ and should
contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.

If an example contains YAML files, they should be placed in a subdirectory named configs/. - If an example contains sample data files, they should be placed in a subdirectory named data/, and should
be checked into git-lfs.

Files:

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

🧬 Code graph analysis (1)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (2)

packages/nvidia_nat_test/src/nat/test/utils.py (1)

locate_example_config (56-67)

examples/HITL/simple_calculator_hitl/src/nat_simple_calculator_hitl/retry_react_agent.py (2)

retry_react_agent (55-221)

RetryReactAgentConfig (36-51)

🪛 Ruff (0.14.0)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

38-38: subprocess call: check for execution of untrusted input

(S603)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

Signed-off-by: David Gardner <dagardner@nvidia.com>

…tor_hitl.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: David Gardner <96306125+dagardner-nv@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1)
37-37: Remove literal double quotes from the --input argument value.

The command construction still includes literal double quotes that will be passed as part of the input string to the NAT CLI. Since the command is passed as a list to subprocess.Popen (not using shell=True), shell quote processing doesn't occur, and the quotes become part of the actual input value. This issue was flagged in a previous review but remains unresolved.

Apply this diff:
-    cmd = ["nat", "run", "--config_file", str(config_file.absolute()), "--input", '"Is 2 * 4 greater than 5?"']
+    cmd = ["nat", "run", "--config_file", str(config_file.absolute()), "--input", "Is 2 * 4 greater than 5?"]

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cc2bb54 and c191270.

📒 Files selected for processing (2)

examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py (2 hunks)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.
Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:
def my_function(param1: int, param2: str) -> bool:
    pass
For Python exception handling, ensure proper stack trace preservation:

When re-raising exceptions: use bare raise statements to maintain the original stack trace,
and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.

When catching and logging exceptions without re-raising: always use logger.exception()
to capture the full stack trace information.
Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

Confirm that copyright years are up-to date whenever a file is changed.

Files:

examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.

If an example contains Python code, it should be placed in a subdirectory named src/ and should
contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.

If an example contains YAML files, they should be placed in a subdirectory named configs/. - If an example contains sample data files, they should be placed in a subdirectory named data/, and should
be checked into git-lfs.

Files:

examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

🧬 Code graph analysis (1)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (2)

packages/nvidia_nat_test/src/nat/test/utils.py (1)

locate_example_config (56-67)

examples/HITL/simple_calculator_hitl/src/nat_simple_calculator_hitl/retry_react_agent.py (2)

retry_react_agent (55-221)

RetryReactAgentConfig (36-51)

🪛 Ruff (0.14.0)

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

38-38: subprocess call: check for execution of untrusted input

(S603)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: CI Pipeline / Check

🔇 Additional comments (2)

examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py (1)

19-19: LGTM! Field validator correctly strips whitespace from prompt.

The addition of the field_validator for the prompt field properly addresses the issue mentioned in the PR objectives where an extra \n was being inserted in the HITL prompt. The implementation follows Pydantic 2.x best practices using mode="after" and @classmethod decorator.

Also applies to: 40-43

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1)

38-38: Subprocess security warning is a false positive.

The static analysis tool flags this line with S603 (subprocess call with untrusted input). However, all command elements are either literals or controlled paths from locate_example_config, so there's no actual security risk here. This warning can be safely ignored in this context.

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

Signed-off-by: David Gardner <dagardner@nvidia.com>

…-nv/AIQtoolkit into david-simple-calc-hitl Signed-off-by: David Gardner <dagardner@nvidia.com>

dagardner-nv · 2025-10-13T17:47:58Z

/merge

dagardner-nv added 7 commits October 9, 2025 16:50

Remove unused LLM

dc6f2d1

Signed-off-by: David Gardner <dagardner@nvidia.com>

Don't raise exceptions without an error message, remove need to handl…

74e4ba1

…e exceptions without messages Signed-off-by: David Gardner <dagardner@nvidia.com>

Log with error rather than exception when re-raising per code rabbit

7f0dbd8

Signed-off-by: David Gardner <dagardner@nvidia.com>

WIP

a3b97b6

Signed-off-by: David Gardner <dagardner@nvidia.com>

Remove extra space in prompt

766a044

Signed-off-by: David Gardner <dagardner@nvidia.com>

First pass at attempting to intercept the prompt

785d8fa

Signed-off-by: David Gardner <dagardner@nvidia.com>

Switch to using subprocess

9542b57

Signed-off-by: David Gardner <dagardner@nvidia.com>

dagardner-nv self-assigned this Oct 13, 2025

dagardner-nv added improvement Improvement to existing functionality non-breaking Non-breaking change labels Oct 13, 2025

dagardner-nv added 2 commits October 13, 2025 09:23

Remove unused logger

1941906

Signed-off-by: David Gardner <dagardner@nvidia.com>

Merge branch 'release/1.3' of github.com:NVIDIA/NeMo-Agent-Toolkit in…

86c5ab8

…to david-simple-calc-hitl Signed-off-by: David Gardner <dagardner@nvidia.com>

dagardner-nv changed the base branch from develop to release/1.3 October 13, 2025 16:26

dagardner-nv marked this pull request as ready for review October 13, 2025 16:28

dagardner-nv requested a review from a team as a code owner October 13, 2025 16:28

coderabbitai bot added the bug Something isn't working label Oct 13, 2025

coderabbitai bot reviewed Oct 13, 2025

View reviewed changes

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py Outdated Show resolved Hide resolved

Update examples/HITL/simple_calculator_hitl/tests/test_simple_calcula…

cc2bb54

…tor_hitl.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: David Gardner <96306125+dagardner-nv@users.noreply.github.com>

dagardner-nv removed the bug Something isn't working label Oct 13, 2025

willkill07 reviewed Oct 13, 2025

View reviewed changes

examples/HITL/por_to_jiratickets/README.md Outdated Show resolved Hide resolved

willkill07 approved these changes Oct 13, 2025

View reviewed changes

coderabbitai bot reviewed Oct 13, 2025

View reviewed changes

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py Outdated Show resolved Hide resolved

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py Show resolved Hide resolved

dagardner-nv and others added 2 commits October 13, 2025 10:11

Use a field validator to perform the strip

131189c

Signed-off-by: David Gardner <dagardner@nvidia.com>

Update examples/HITL/simple_calculator_hitl/tests/test_simple_calcula…

c191270

…tor_hitl.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: David Gardner <96306125+dagardner-nv@users.noreply.github.com>

coderabbitai bot added the bug Something isn't working label Oct 13, 2025

coderabbitai bot reviewed Oct 13, 2025

View reviewed changes

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py Show resolved Hide resolved

dagardner-nv added 2 commits October 13, 2025 10:26

Fix lint issue

4f0571d

Signed-off-by: David Gardner <dagardner@nvidia.com>

Merge branch 'david-simple-calc-hitl' of https://github.com/dagardner…

7bbce2a

…-nv/AIQtoolkit into david-simple-calc-hitl Signed-off-by: David Gardner <dagardner@nvidia.com>

dagardner-nv removed the bug Something isn't working label Oct 13, 2025

coderabbitai bot added the bug Something isn't working label Oct 13, 2025

dagardner-nv removed the bug Something isn't working label Oct 13, 2025

rapids-bot bot merged commit ff471c8 into NVIDIA:release/1.3 Oct 13, 2025
17 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add E2E test for the simple calculator HITL example#984

Add E2E test for the simple calculator HITL example#984
rapids-bot[bot] merged 14 commits intoNVIDIA:release/1.3from
dagardner-nv:david-simple-calc-hitl

dagardner-nv commented Oct 13, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 13, 2025 •

edited

Loading

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

coderabbitai bot left a comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

dagardner-nv commented Oct 13, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dagardner-nv commented Oct 13, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

By Submitting this PR I confirm:

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested labels

Pre-merge checks and finishing touches

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

Uh oh!

Uh oh!

dagardner-nv commented Oct 13, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

dagardner-nv commented Oct 13, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 13, 2025 •

edited

Loading