Skip to content

Fix swallowing client errors bug#841

Merged
rapids-bot[bot] merged 5 commits intoNVIDIA:developfrom
zhongxuanwang-nv:401_fixies
Sep 25, 2025
Merged

Fix swallowing client errors bug#841
rapids-bot[bot] merged 5 commits intoNVIDIA:developfrom
zhongxuanwang-nv:401_fixies

Conversation

@zhongxuanwang-nv
Copy link
Member

@zhongxuanwang-nv zhongxuanwang-nv commented Sep 23, 2025

Description

Closes AIQ-1952

Bug description from David:

The following example will exit with a status code of 0 and display the service error as if it were legitimate LLM output:

unset NVIDIA_API_KEY
nat run --config_file examples/getting_started/simple_calculator/configs/config.yml --input "Is the product of 2 * 4 greater than the current hour of the day?"

output:

--------------------------------------------------
Workflow Result:
['[401] Unauthorized\nAuthentication failed\nPlease check or regenerate your API key.']
--------------------------------------------------

This PR fixes by explicitly raising the RuntimeError instead of returning a valid ChatResponse object.

🤔 Open to discussion

I explicitly didn't set any description for RuntimeError because that error has been already output by the logger.exception, and if I set the description for the RuntimeError, the same error will be re-iterated for 3 times, and since the error is 4-line long, I think just outputting 1 time is enough.

Therefore, I also added the logic for removing the trailing : if the error message is empty.

However, one downside of it is we may accidentally omit for important exception messages. But from some shallow research I think most exception details would already be captured in the downstream logger.exception

I'm definitely open to discussion here what do you guys think if this would be the best thing to do here!

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • Bug Fixes

    • Standardized error handling across agents (ReAct, ReWOO, Tool Calling): exceptions now propagate instead of returning in-chat error strings, yielding clearer failure states.
    • Runner error reporting now formats and logs exception messages consistently while preserving failure behavior.
    • Invocation error logging omits empty messages, preventing trailing empty error text.
  • Chores

    • Console front end now explicitly propagates exceptions during concurrent input handling (no visible behavior change).

@coderabbitai
Copy link

coderabbitai bot commented Sep 23, 2025

Walkthrough

Exception handling was changed to propagate errors (raise RuntimeError) in three agent register modules; console front-end now explicitly sets asyncio.gather(return_exceptions=False); builder and runner adjusted how error messages are composed and logged. No public API signatures were changed. (43 words)

Changes

Cohort / File(s) Summary of changes
Agent error handling propagation
src/nat/agent/react_agent/register.py, src/nat/agent/rewoo_agent/register.py, src/nat/agent/tool_calling_agent/register.py
Internal _response_fn exception paths now raise RuntimeError instead of returning error strings or ChatResponse objects; tool_calling_agent also updates an outer exception path to raise RuntimeError. No exported/public signatures changed.
Console gather behavior explicitness
src/nat/front_ends/console/console_front_end_plugin.py
asyncio.gather call explicitly sets return_exceptions=False and adds a clarifying comment; observable behavior unchanged.
Builder ainvoke error message formatting
src/nat/builder/function.py
Error logging adjusted to conditionally include the exception message ("Error: ") only when non-empty, otherwise omit the message portion. Exception propagation unchanged.
Runner result logging formatting
src/nat/runtime/runner.py
Runner.result error handling now builds err_msg that conditionally prefixes ": " when exception string is non-empty (otherwise uses "."), and logs using the composed err_msg; then completes event stream, sets FAILED state, and re-raises.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant FrontEnd as Console Front-End
  participant Agent as Agent Register (_response_fn)
  participant Provider as OpenAI API Path

  User->>FrontEnd: Submit query
  FrontEnd->>Agent: Invoke _response_fn()
  Agent->>Provider: Make API call
  Provider-->>Agent: Error/Exception
  note over Agent: New behavior: raise RuntimeError
  Agent-->>FrontEnd: Exception propagates
  FrontEnd-->>User: Error surfaced by caller
Loading
sequenceDiagram
  autonumber
  participant Caller as Upstream Caller
  participant Agent as Agent Register (_response_fn)

  rect rgba(200,230,255,0.3)
  note over Agent: Old flow (before)
  Caller->>Agent: Call _response_fn()
  Agent--x Caller: Return ChatResponse or error string
  end

  rect rgba(200,255,200,0.3)
  note over Agent: New flow (after)
  Caller->>Agent: Call _response_fn()
  Agent-->>Caller: Raise RuntimeError (exception propagation)
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

Suggested labels

breaking

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title is written in imperative mood, concisely describes the primary fix for a bug regarding swallowed client errors, and is well under the recommended character limit.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@zhongxuanwang-nv
Copy link
Member Author

@coderabbitai review

@zhongxuanwang-nv zhongxuanwang-nv self-assigned this Sep 23, 2025
@coderabbitai
Copy link

coderabbitai bot commented Sep 23, 2025

@zhongxuanwang-nv I'll review the changes in this PR that fixes the client error swallowing bug.


🧠 Learnings used
Learnt from: willkill07
PR: NVIDIA/NeMo-Agent-Toolkit#710
File: .coderabbit.yaml:33-33
Timestamp: 2025-08-25T15:24:13.143Z
Learning: User feedback indicates that CodeRabbit's chat commands, particularly `coderabbitai review` for manual incremental reviews, need better documentation and discoverability within repositories, especially when auto_incremental_review is disabled.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@zhongxuanwang-nv zhongxuanwang-nv added bug Something isn't working non-breaking Non-breaking change labels Sep 23, 2025
@coderabbitai coderabbitai bot added the breaking Breaking change label Sep 23, 2025
@zhongxuanwang-nv zhongxuanwang-nv removed the breaking Breaking change label Sep 23, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (8)
src/nat/cli/commands/start.py (1)

225-236: Add explicit return type on nested coroutine

Minor typing polish for inner coroutine.

Suggested change (outside the selected range; illustrative):

async def run_plugin() -> int | None:
    ...
src/nat/front_ends/console/console_front_end_plugin.py (1)

83-85: LGTM — explicit return_exceptions=False

Change is clear and aligns with the new error-propagation strategy.

Optional: consider cancelling sibling tasks on first failure (TaskGroup in py3.11) to avoid wasted work if one query fails.

src/nat/agent/tool_calling_agent/register.py (1)

116-119: Re-raise pattern: avoid duplicate stacktraces and preserve original traceback

Since you re-raise, log without traceback and keep the original exception to preserve its stack.

Option A (preferred: keep original type and traceback):

-        except Exception as ex:
-            logger.exception("%s Tool Calling Agent failed with exception: %s", AGENT_LOG_PREFIX, ex)
-            raise RuntimeError from ex
+        except Exception:
+            logger.error("%s Tool Calling Agent failed with exception", AGENT_LOG_PREFIX)
+            raise

Option B (if keeping RuntimeError is required for API uniformity, still avoid duplicate stack):

-        except Exception as ex:
-            logger.exception("%s Tool Calling Agent failed with exception: %s", AGENT_LOG_PREFIX, ex)
-            raise RuntimeError from ex
+        except Exception as ex:
+            logger.error("%s Tool Calling Agent failed with exception", AGENT_LOG_PREFIX)
+            raise RuntimeError from ex

Additionally, the docstring still mentions returning an “error message”; update it to reflect raising on failure.

src/nat/builder/function.py (1)

162-167: Good: no duplicate trace; consider silencing TRY400

Pattern matches the guidelines (log error, then re-raise). Ruff may flag TRY400; consider an inline ignore to document intent.

Apply this minimal tweak:

-                logger.error("Error with ainvoke in function with input: \"%s\" %s", value, err_msg)
+                logger.error("Error with ainvoke in function with input: \"%s\" %s", value, err_msg)  # noqa: TRY400
src/nat/runtime/runner.py (2)

152-156: Consistent “empty message” formatting — keep; silence TRY400

Matches the new formatting approach and re-raises correctly. Add a noqa to avoid Ruff noise.

-            logger.error("Error running workflow%s", err_msg)
+            logger.error("Error running workflow%s", err_msg)  # noqa: TRY400

187-189: Normalize error logging format in result_stream()

Mirror the empty-message handling used in result() for consistency.

-        except Exception as e:
-            logger.error("Error running workflow: %s", e)
+        except Exception as e:
+            err_msg = "" if str(e).strip() == "" else f": {e}"
+            logger.error("Error running workflow%s", err_msg)  # noqa: TRY400
src/nat/agent/react_agent/register.py (1)

154-157: Re-raise pattern: log without traceback and preserve original exception

Align with the error-handling guideline to avoid duplicated stacks and keep the original traceback.

Option A (preferred):

-        except Exception as ex:
-            logger.exception("%s ReAct Agent failed with exception: %s", AGENT_LOG_PREFIX, ex)
-            raise RuntimeError from ex
+        except Exception:
+            logger.error("%s ReAct Agent failed with exception", AGENT_LOG_PREFIX)
+            raise

Option B (if RuntimeError is required, still avoid duplicate trace):

-        except Exception as ex:
-            logger.exception("%s ReAct Agent failed with exception: %s", AGENT_LOG_PREFIX, ex)
-            raise RuntimeError from ex
+        except Exception as ex:
+            logger.error("%s ReAct Agent failed with exception", AGENT_LOG_PREFIX)
+            raise RuntimeError from ex

Also update the function docstring to remove “or error message” in the return description and add a Raises section.

src/nat/agent/rewoo_agent/register.py (1)

159-162: Re-raise pattern consistency

Same recommendation as other agents to avoid duplicate stacktraces and preserve traceback.

Option A (preferred):

-        except Exception as ex:
-            logger.exception("ReWOO Agent failed with exception: %s", ex)
-            raise RuntimeError from ex
+        except Exception:
+            logger.error("ReWOO Agent failed with exception")
+            raise

Option B (keep RuntimeError, avoid duplicate stack):

-        except Exception as ex:
-            logger.exception("ReWOO Agent failed with exception: %s", ex)
-            raise RuntimeError from ex
+        except Exception as ex:
+            logger.error("ReWOO Agent failed with exception")
+            raise RuntimeError from ex

Please also update the docstring to reflect raising on errors rather than returning error messages.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11d05dc and 4eb69a1.

📒 Files selected for processing (7)
  • src/nat/agent/react_agent/register.py (1 hunks)
  • src/nat/agent/rewoo_agent/register.py (1 hunks)
  • src/nat/agent/tool_calling_agent/register.py (1 hunks)
  • src/nat/builder/function.py (1 hunks)
  • src/nat/cli/commands/start.py (1 hunks)
  • src/nat/front_ends/console/console_front_end_plugin.py (1 hunks)
  • src/nat/runtime/runner.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • src/nat/front_ends/console/console_front_end_plugin.py
  • src/nat/builder/function.py
  • src/nat/cli/commands/start.py
  • src/nat/agent/rewoo_agent/register.py
  • src/nat/agent/react_agent/register.py
  • src/nat/runtime/runner.py
  • src/nat/agent/tool_calling_agent/register.py
src/nat/**/*

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/front_ends/console/console_front_end_plugin.py
  • src/nat/builder/function.py
  • src/nat/cli/commands/start.py
  • src/nat/agent/rewoo_agent/register.py
  • src/nat/agent/react_agent/register.py
  • src/nat/runtime/runner.py
  • src/nat/agent/tool_calling_agent/register.py
🪛 Ruff (0.13.1)
src/nat/builder/function.py

166-166: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

src/nat/cli/commands/start.py

238-238: Do not catch blind exception: Exception

(BLE001)


239-239: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


240-240: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

src/nat/runtime/runner.py

156-156: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

@zhongxuanwang-nv zhongxuanwang-nv marked this pull request as ready for review September 23, 2025 22:00
@zhongxuanwang-nv zhongxuanwang-nv requested a review from a team as a code owner September 23, 2025 22:00
@coderabbitai coderabbitai bot added the breaking Breaking change label Sep 25, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4eb69a1 and bbbf8d1.

📒 Files selected for processing (3)
  • src/nat/agent/react_agent/register.py (1 hunks)
  • src/nat/agent/rewoo_agent/register.py (1 hunks)
  • src/nat/agent/tool_calling_agent/register.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • src/nat/agent/react_agent/register.py
  • src/nat/agent/rewoo_agent/register.py
  • src/nat/agent/tool_calling_agent/register.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • src/nat/agent/react_agent/register.py
  • src/nat/agent/rewoo_agent/register.py
  • src/nat/agent/tool_calling_agent/register.py
src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

  • src/nat/agent/react_agent/register.py
  • src/nat/agent/rewoo_agent/register.py
  • src/nat/agent/tool_calling_agent/register.py
src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

  • src/nat/agent/react_agent/register.py
  • src/nat/agent/rewoo_agent/register.py
  • src/nat/agent/tool_calling_agent/register.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/agent/react_agent/register.py
  • src/nat/agent/rewoo_agent/register.py
  • src/nat/agent/tool_calling_agent/register.py
{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

  • src/nat/agent/react_agent/register.py
  • src/nat/agent/rewoo_agent/register.py
  • src/nat/agent/tool_calling_agent/register.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • src/nat/agent/react_agent/register.py
  • src/nat/agent/rewoo_agent/register.py
  • src/nat/agent/tool_calling_agent/register.py
🪛 Ruff (0.13.1)
src/nat/agent/react_agent/register.py

155-155: Redundant exception object included in logging.exception call

(TRY401)


156-156: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

src/nat/agent/rewoo_agent/register.py

161-161: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

src/nat/agent/tool_calling_agent/register.py

118-118: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CI Pipeline / Check

@willkill07 willkill07 removed the breaking Breaking change label Sep 25, 2025
Signed-off-by: Daniel Wang <daniewang@nvidia.com>
@coderabbitai coderabbitai bot added the breaking Breaking change label Sep 25, 2025
@willkill07 willkill07 removed the breaking Breaking change label Sep 25, 2025
@coderabbitai coderabbitai bot added the breaking Breaking change label Sep 25, 2025
Signed-off-by: Daniel Wang <daniewang@nvidia.com>
@willkill07 willkill07 removed the breaking Breaking change label Sep 25, 2025
@coderabbitai coderabbitai bot added the breaking Breaking change label Sep 25, 2025
@zhongxuanwang-nv zhongxuanwang-nv removed the breaking Breaking change label Sep 25, 2025
@zhongxuanwang-nv
Copy link
Member Author

/merge

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/nat/builder/function.py (1)

162-163: Tighten the log formatting when the exception string is empty.

When str(e).strip() is empty we still emit "Error with ainvoke ... %s. " (note the trailing space). Optional, but we can avoid the extra placeholder altogether by branching the log call.

Apply this tweak locally:

-                err_msg = f"Error: {e}" if str(e).strip() else ""
-                logger.error("Error with ainvoke in function with input: %s. %s", value, err_msg)
+                err_msg = f"Error: {e}" if str(e).strip() else None
+                if err_msg is None:
+                    logger.error("Error with ainvoke in function with input: %s.", value)
+                else:
+                    logger.error("Error with ainvoke in function with input: %s. %s", value, err_msg)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f31ecd3 and 7d158bb.

📒 Files selected for processing (2)
  • src/nat/builder/function.py (1 hunks)
  • src/nat/runtime/runner.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • src/nat/runtime/runner.py
  • src/nat/builder/function.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • src/nat/runtime/runner.py
  • src/nat/builder/function.py
src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

  • src/nat/runtime/runner.py
  • src/nat/builder/function.py
src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

  • src/nat/runtime/runner.py
  • src/nat/builder/function.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/runtime/runner.py
  • src/nat/builder/function.py
{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

  • src/nat/runtime/runner.py
  • src/nat/builder/function.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • src/nat/runtime/runner.py
  • src/nat/builder/function.py
🪛 Ruff (0.13.1)
src/nat/runtime/runner.py

153-153: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

src/nat/builder/function.py

163-163: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🔇 Additional comments (1)
src/nat/runtime/runner.py (1)

152-153: Nice job cleaning up the log output.

The conditional suffix keeps the log readable without trailing punctuation, while still surfacing the exception string when present. 👍

@rapids-bot rapids-bot bot merged commit 65bb853 into NVIDIA:develop Sep 25, 2025
17 checks passed
rapids-bot bot pushed a commit that referenced this pull request Oct 10, 2025
* This example appears to have been broken by PR #841 which relies on the original LangGraph recursion exception being raised (or scraping it from the output text)


## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.



## Summary by CodeRabbit

- Bug Fixes
  - Preserve original error types and tracebacks for clearer diagnostics.
  - Runner now explicitly marks workflows as FAILED before propagating errors.
  - Standardized error messages improve observability during failures.
- Refactor
  - Unified error logging format and handling across agents and runtime.
- Chores
  - HITL example configuration now relies solely on the NIM LLM; OpenAI option removed.

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Will Killian (https://github.com/willkill07)

URL: #970
yczhang-nv pushed a commit to yczhang-nv/NeMo-Agent-Toolkit that referenced this pull request Oct 10, 2025
* This example appears to have been broken by PR NVIDIA#841 which relies on the original LangGraph recursion exception being raised (or scraping it from the output text)

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

## Summary by CodeRabbit

- Bug Fixes
  - Preserve original error types and tracebacks for clearer diagnostics.
  - Runner now explicitly marks workflows as FAILED before propagating errors.
  - Standardized error messages improve observability during failures.
- Refactor
  - Unified error logging format and handling across agents and runtime.
- Chores
  - HITL example configuration now relies solely on the NIM LLM; OpenAI option removed.

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Will Killian (https://github.com/willkill07)

URL: NVIDIA#970
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants