Skip to content

Add E2E test for the simple calculator HITL example#984

Merged
rapids-bot[bot] merged 14 commits intoNVIDIA:release/1.3from
dagardner-nv:david-simple-calc-hitl
Oct 13, 2025
Merged

Add E2E test for the simple calculator HITL example#984
rapids-bot[bot] merged 14 commits intoNVIDIA:release/1.3from
dagardner-nv:david-simple-calc-hitl

Conversation

@dagardner-nv
Copy link
Contributor

@dagardner-nv dagardner-nv commented Oct 13, 2025

Description

  • Fixes an issue where an extra \n was being inserted int the HITL prompt due the was the prompt was defined in the YAML.
  • E2E test created using subprocess, since the HITL callback function only exists within the console front-end.

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • Bug Fixes

    • Trimmed leading and trailing whitespace from the HITL approval prompt to ensure consistent behavior and reduce input-related issues.
  • Tests

    • Added an asynchronous integration test for the HITL simple calculator workflow that runs the CLI end-to-end, verifies the confirmation prompt, supplies user responses via stdin, and checks final workflow outcomes for both approval and rejection paths.

Signed-off-by: David Gardner <dagardner@nvidia.com>
…e exceptions without messages

Signed-off-by: David Gardner <dagardner@nvidia.com>
Signed-off-by: David Gardner <dagardner@nvidia.com>
Signed-off-by: David Gardner <dagardner@nvidia.com>
Signed-off-by: David Gardner <dagardner@nvidia.com>
Signed-off-by: David Gardner <dagardner@nvidia.com>
Signed-off-by: David Gardner <dagardner@nvidia.com>
@dagardner-nv dagardner-nv self-assigned this Oct 13, 2025
@dagardner-nv dagardner-nv added improvement Improvement to existing functionality non-breaking Non-breaking change labels Oct 13, 2025
@coderabbitai
Copy link

coderabbitai bot commented Oct 13, 2025

Walkthrough

Adds a prompt field validator to HITLApprovalFnConfig to strip whitespace after validation. Adds a new async, parametrized integration test that runs the NAT CLI for the simple calculator HITL example, supplies confirmation via stdin, and asserts exit code and expected outputs.

Changes

Cohort / File(s) Summary
HITL approval config validation
examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
Added @field_validator("prompt", mode="after") with validate_prompt to strip leading/trailing whitespace from the prompt field in HITLApprovalFnConfig.
Simple calculator HITL E2E test
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
New async, parametrized integration test invoking the NAT CLI with a predefined query; pipes yes/no via stdin; asserts exit code 0, presence of confirmation prompt in stdout, and expected workflow result in stderr.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

bug

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title uses clear imperative mood, stays well under the 72-character limit, and accurately describes the primary change of adding an end-to-end test for the simple calculator HITL example.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c191270 and 7bbce2a.

📒 Files selected for processing (1)
  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.

  • If an example contains Python code, it should be placed in a subdirectory named src/ and should
    contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.
  • If an example contains YAML files, they should be placed in a subdirectory named configs/. - If an example contains sample data files, they should be placed in a subdirectory named data/, and should
    be checked into git-lfs.

Files:

  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
🧬 Code graph analysis (1)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (2)
packages/nvidia_nat_test/src/nat/test/utils.py (1)
  • locate_example_config (56-67)
examples/HITL/simple_calculator_hitl/src/nat_simple_calculator_hitl/retry_react_agent.py (2)
  • retry_react_agent (55-221)
  • RetryReactAgentConfig (36-51)
🪛 Ruff (0.14.0)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

38-38: subprocess call: check for execution of untrusted input

(S603)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CI Pipeline / Check

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Signed-off-by: David Gardner <dagardner@nvidia.com>
…to david-simple-calc-hitl

Signed-off-by: David Gardner <dagardner@nvidia.com>
@dagardner-nv dagardner-nv changed the base branch from develop to release/1.3 October 13, 2025 16:26
@dagardner-nv dagardner-nv marked this pull request as ready for review October 13, 2025 16:28
@dagardner-nv dagardner-nv requested a review from a team as a code owner October 13, 2025 16:28
@coderabbitai coderabbitai bot added the bug Something isn't working label Oct 13, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
examples/HITL/por_to_jiratickets/README.md (1)

81-84: Fix README reference for HITL approval function
Replace the path to hitl_approval_tool.py instead of jira_tickets_tool.py in the example documentation.

🧹 Nitpick comments (5)
examples/HITL/por_to_jiratickets/README.md (1)

20-21: Brand casing consistency: “NeMo Agent Toolkit”.

Use “NeMo Agent Toolkit” (capital T) in docs to match naming guidance.

As per coding guidelines

Also applies to: 53-55

examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py (1)

41-58: Prompt trimming LGTM; consider minor robustness and typing.

  • Move import re to module scope.
  • Guard response.content.text to avoid attr/None errors.
  • Add return type + function docstring per guidelines.

As per coding guidelines

@@
-import re
+import re
@@
-@register_function(config_type=HITLApprovalFnConfig)
-async def hitl_approval_function(config: HITLApprovalFnConfig, builder: Builder):
+@register_function(config_type=HITLApprovalFnConfig)
+async def hitl_approval_function(config: HITLApprovalFnConfig, builder: Builder) -> "AsyncGenerator[FunctionInfo, None]":
+    """Prompt the user for approval and return True only on an explicit 'yes'."""
@@
-    prompt = f"{config.prompt.strip()} Please confirm if you would like to proceed. Respond with 'yes' or 'no'."
+    prompt = f"{config.prompt.strip()} Please confirm if you would like to proceed. Respond with 'yes' or 'no'."
@@
-        response: InteractionResponse = await user_input_manager.prompt_user_input(human_prompt_text)
-        response_str = response.content.text.lower()  # type: ignore
+        response: InteractionResponse = await user_input_manager.prompt_user_input(human_prompt_text)
+        response_text = getattr(getattr(response, "content", None), "text", "") or ""
+        response_str = response_text.lower()
@@
     yield FunctionInfo.from_fn(_arun,
                                description=("This function will be used to get the user's response to the prompt"))

Add at top of file (imports section):

+from typing import AsyncGenerator
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (3)

35-38: Fix typo in comment.

“usign” → “using”.

-# Use subprocess to run the NAT CLI rather than usign the API for two reasons:
+# Use subprocess to run the NAT CLI rather than using the API for two reasons:

38-41: Harden subprocess invocation and input.

  • Remove superfluous quotes around the --input value (quotes are for shell parsing).
  • Set shell=False explicitly, and pin encoding for deterministic behavior.

Also OK per Ruff S603: no untrusted input and shell=False.

-cmd = ["nat", "run", "--config_file", str(config_file.absolute()), "--input", '"Is 2 * 4 greater than 5?"']
-proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
+cmd = ["nat", "run", "--config_file", str(config_file.absolute()), "--input", "Is 2 * 4 greater than 5?"]
+proc = subprocess.Popen(
+    cmd,
+    stdin=subprocess.PIPE,
+    stdout=subprocess.PIPE,
+    stderr=subprocess.PIPE,
+    text=True,
+    encoding="utf-8",
+    errors="replace",
+    shell=False,
+)

44-47: Make result assertion robust (escape regex, search both streams).

“Workflow Result:” may be on stdout depending on env/log config; escape expected_result to avoid regex pitfalls.

-assert re.search(f"Workflow Result:.*{expected_result}", stderr, (re.IGNORECASE | re.MULTILINE | re.DOTALL)) is not None, \
-    f"Expected result '{expected_result}' not found in stderr: {stderr}"
+pattern = r"Workflow Result:.*" + re.escape(expected_result)
+combined = (stdout or "") + (stderr or "")
+assert re.search(pattern, combined, re.IGNORECASE | re.MULTILINE | re.DOTALL) is not None, \
+    f"Expected result '{expected_result}' not found.\nstdout:\n{stdout}\n\nstderr:\n{stderr}"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d91fd6c and 86c5ab8.

📒 Files selected for processing (3)
  • examples/HITL/por_to_jiratickets/README.md (1 hunks)
  • examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py (1 hunks)
  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/README.@(md|ipynb)

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Ensure READMEs follow the naming convention; avoid deprecated names; use “NeMo Agent Toolkit” (capital T) in headings

Files:

  • examples/HITL/por_to_jiratickets/README.md
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • examples/HITL/por_to_jiratickets/README.md
  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
  • examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.

  • If an example contains Python code, it should be placed in a subdirectory named src/ and should
    contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.
  • If an example contains YAML files, they should be placed in a subdirectory named configs/. - If an example contains sample data files, they should be placed in a subdirectory named data/, and should
    be checked into git-lfs.

Files:

  • examples/HITL/por_to_jiratickets/README.md
  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
  • examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
  • examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
  • examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
🧬 Code graph analysis (1)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (2)
packages/nvidia_nat_test/src/nat/test/utils.py (1)
  • locate_example_config (56-67)
examples/HITL/simple_calculator_hitl/src/nat_simple_calculator_hitl/retry_react_agent.py (2)
  • retry_react_agent (55-221)
  • RetryReactAgentConfig (36-51)
🪛 Ruff (0.14.0)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

39-39: subprocess call: check for execution of untrusted input

(S603)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CI Pipeline / Check
🔇 Additional comments (1)
examples/HITL/por_to_jiratickets/README.md (1)

91-91: Whitespace trim fix is correct.

Using config.prompt.strip() prevents stray newlines in the HITL prompt.

…tor_hitl.py

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: David Gardner <96306125+dagardner-nv@users.noreply.github.com>
@dagardner-nv dagardner-nv removed the bug Something isn't working label Oct 13, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (2)

28-28: Add return type hint for completeness.

Per coding guidelines, all functions should include return type hints. Test functions implicitly return None.

Apply this diff:

-def test_hitl_workflow(response: str, expected_result: str):
+def test_hitl_workflow(response: str, expected_result: str) -> None:

As per coding guidelines.


45-46: Escape expected_result in regex pattern for robustness.

The current pattern interpolates expected_result directly without escaping, which works for the current test values but would break if a future test case includes regex metacharacters (e.g., "2+2=4" where + is a regex quantifier).

Apply this diff:

-    assert re.search(f"Workflow Result:.*{expected_result}", stderr, (re.IGNORECASE | re.MULTILINE | re.DOTALL)) is not None, \
+    assert re.search(f"Workflow Result:.*{re.escape(expected_result)}", stderr, (re.IGNORECASE | re.MULTILINE | re.DOTALL)) is not None, \
         f"Expected result '{expected_result}' not found in stderr: {stderr}"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 86c5ab8 and cc2bb54.

📒 Files selected for processing (1)
  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.

  • If an example contains Python code, it should be placed in a subdirectory named src/ and should
    contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.
  • If an example contains YAML files, they should be placed in a subdirectory named configs/. - If an example contains sample data files, they should be placed in a subdirectory named data/, and should
    be checked into git-lfs.

Files:

  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
🧬 Code graph analysis (1)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (2)
packages/nvidia_nat_test/src/nat/test/utils.py (1)
  • locate_example_config (56-67)
examples/HITL/simple_calculator_hitl/src/nat_simple_calculator_hitl/retry_react_agent.py (2)
  • retry_react_agent (55-221)
  • RetryReactAgentConfig (36-51)
🪛 Ruff (0.14.0)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

38-38: subprocess call: check for execution of untrusted input

(S603)

dagardner-nv and others added 2 commits October 13, 2025 10:11
Signed-off-by: David Gardner <dagardner@nvidia.com>
…tor_hitl.py

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: David Gardner <96306125+dagardner-nv@users.noreply.github.com>
@coderabbitai coderabbitai bot added the bug Something isn't working label Oct 13, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1)

37-37: Remove literal double quotes from the --input argument value.

The command construction still includes literal double quotes that will be passed as part of the input string to the NAT CLI. Since the command is passed as a list to subprocess.Popen (not using shell=True), shell quote processing doesn't occur, and the quotes become part of the actual input value. This issue was flagged in a previous review but remains unresolved.

Apply this diff:

-    cmd = ["nat", "run", "--config_file", str(config_file.absolute()), "--input", '"Is 2 * 4 greater than 5?"']
+    cmd = ["nat", "run", "--config_file", str(config_file.absolute()), "--input", "Is 2 * 4 greater than 5?"]
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cc2bb54 and c191270.

📒 Files selected for processing (2)
  • examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py (2 hunks)
  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*: - This directory contains example code and usage scenarios for the toolkit, at a minimum an example should
contain a README.md or file README.ipynb.

  • If an example contains Python code, it should be placed in a subdirectory named src/ and should
    contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.
  • If an example contains YAML files, they should be placed in a subdirectory named configs/. - If an example contains sample data files, they should be placed in a subdirectory named data/, and should
    be checked into git-lfs.

Files:

  • examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py
  • examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py
🧬 Code graph analysis (1)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (2)
packages/nvidia_nat_test/src/nat/test/utils.py (1)
  • locate_example_config (56-67)
examples/HITL/simple_calculator_hitl/src/nat_simple_calculator_hitl/retry_react_agent.py (2)
  • retry_react_agent (55-221)
  • RetryReactAgentConfig (36-51)
🪛 Ruff (0.14.0)
examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py

38-38: subprocess call: check for execution of untrusted input

(S603)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CI Pipeline / Check
🔇 Additional comments (2)
examples/HITL/por_to_jiratickets/src/nat_por_to_jiratickets/hitl_approval_tool.py (1)

19-19: LGTM! Field validator correctly strips whitespace from prompt.

The addition of the field_validator for the prompt field properly addresses the issue mentioned in the PR objectives where an extra \n was being inserted in the HITL prompt. The implementation follows Pydantic 2.x best practices using mode="after" and @classmethod decorator.

Also applies to: 40-43

examples/HITL/simple_calculator_hitl/tests/test_simple_calculator_hitl.py (1)

38-38: Subprocess security warning is a false positive.

The static analysis tool flags this line with S603 (subprocess call with untrusted input). However, all command elements are either literals or controlled paths from locate_example_config, so there's no actual security risk here. This warning can be safely ignored in this context.

Signed-off-by: David Gardner <dagardner@nvidia.com>
…-nv/AIQtoolkit into david-simple-calc-hitl

Signed-off-by: David Gardner <dagardner@nvidia.com>
@dagardner-nv dagardner-nv removed the bug Something isn't working label Oct 13, 2025
@coderabbitai coderabbitai bot added the bug Something isn't working label Oct 13, 2025
@dagardner-nv dagardner-nv removed the bug Something isn't working label Oct 13, 2025
@dagardner-nv
Copy link
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit ff471c8 into NVIDIA:release/1.3 Oct 13, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improvement to existing functionality non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants