
Fix/agno flaky test release 1.4 #1383

Merged
mnajafian-nv merged 5 commits into release/1.4 from fix/agno-flaky-test-release-1.4
Jan 12, 2026

Conversation

mnajafian-nv (Contributor) commented Jan 11, 2026

Description

Fixes the flaky test_get_event_loop_called test that's failing on arm64 architectures.

The test was expecting get_running_loop() to be called exactly once, but on arm64 it gets called 3 times (vs 1 on amd64). This is due to pydantic-core's C extensions behaving differently across architectures during function introspection.

Changed the assertion to check for known values [1, 3] instead of assert_called_once(). Added inline comments explaining the architecture differences so future maintainers understand this is expected behavior.
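
A minimal sketch of the kind of check described above, assuming a `MagicMock` stands in for the patched `get_running_loop()`. The names `KNOWN_CALL_COUNTS`, `assert_known_call_count`, and `make_mock_called` are hypothetical and are not taken from the actual test file.

```python
from unittest.mock import MagicMock

# Call counts reported in this PR description: 1 on amd64 and 3 on arm64.
KNOWN_CALL_COUNTS = (1, 3)


def assert_known_call_count(mock_get_running_loop: MagicMock) -> None:
    """Fail unless the mock was called one of the known number of times."""
    count = mock_get_running_loop.call_count
    assert count in KNOWN_CALL_COUNTS, (
        f"get_running_loop() was called {count} times; expected one of {KNOWN_CALL_COUNTS}")


def make_mock_called(times: int) -> MagicMock:
    """Hypothetical helper: build a mock that has already been called `times` times."""
    mock = MagicMock(name="get_running_loop")
    for _ in range(times):
        mock()
    return mock
```

Unlike `assert_called_once()`, this passes for either reported count and still fails for any other value, such as 0 or 2.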

Fixes: CI Pipeline / Test (arm64, 3.12) failure in #1349

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • Tests
    • Made test validations architecture- and Python-version-aware to reduce flaky failures across ARM64 and x86_64 hosts.
    • Replaced a rigid single-call assertion with configurable call-count checks per environment and a safer fallback for unknown combos.
    • Improved failure messages to clearly explain mismatches and guide updates when CI behavior changes.


The test was expecting get_running_loop() to be called exactly once,
but on arm64 it gets called 3 times (vs 1 on amd64). This is due to
pydantic-core's C extensions behaving differently across architectures.

Changed the assertion to check for known values [1, 3] instead of
assert_called_once(). Added comments explaining the difference so
future maintainers know this is expected behavior.

Fixes: CI Pipeline / Test (arm64, 3.12) failure in PR #1349
Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
@mnajafian-nv mnajafian-nv requested review from a team as code owners January 11, 2026 22:22
copy-pr-bot bot commented Jan 11, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai bot commented Jan 11, 2026

Walkthrough

Updated a test to perform architecture- and Python-version-aware validation in test_get_event_loop_called: imports platform, reads platform.machine() and platform.python_version_tuple(), and checks the mocked call count against a mapping of expected counts with a fallback for unknown combos and a guidance message for CI differences.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Test Architecture Detection (packages/nvidia_nat_agno/tests/test_tool_wrapper.py) | Added platform import and replaced a fixed assert_called_once() with logic that computes the runtime architecture and Python version, then validates the mock call count against an expected-count mapping (per-arch/Python combos) with a default fallback and a detailed error message instructing updates when CI behavior differs. |
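
The lookup described in this walkthrough might look roughly like the sketch below. The table name, the fallback name, and both helper functions are assumptions made for illustration; the three table entries mirror the combinations that the commit messages in this thread report as confirmed in CI, and lists are used because the review notes say the test uses lists.

```python
import platform

# Assumed table: (machine, major, minor) -> accepted call counts.
# Entries mirror the combinations reported as confirmed in this PR thread.
EXPECTED_COUNTS: dict[tuple[str, str, str], list[int]] = {
    ("aarch64", "3", "11"): [1],
    ("x86_64", "3", "11"): [1],
    ("x86_64", "3", "13"): [1],
}

# Accepted for combinations that have not been confirmed in CI.
FALLBACK_COUNTS = [1, 3]


def accepted_counts(machine: str, major: str, minor: str) -> list[int]:
    """Return the accepted call counts for a platform, or the fallback."""
    # platform.python_version_tuple() yields strings, so keys are strings too.
    return EXPECTED_COUNTS.get((machine.lower(), major, minor), FALLBACK_COUNTS)


def current_accepted_counts() -> list[int]:
    """Look up the accepted call counts for the interpreter running this code."""
    major, minor, _ = platform.python_version_tuple()
    return accepted_counts(platform.machine(), major, minor)
```

A test would then assert that `mock.call_count in current_accepted_counts()` and include the detected platform in the failure message.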

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: fixing a flaky test in the agno package for release 1.4. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f97f95 and c499c8d.

📒 Files selected for processing (1)
  • packages/nvidia_nat_agno/tests/test_tool_wrapper.py
🧰 Additional context used
📓 Path-based instructions (7)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.py: Follow PEP 20 and PEP 8 for Python style guidelines
Run yapf with PEP 8 base and 'column_limit = 120' for code formatting
Use 'ruff check --fix' for linting with configuration from 'pyproject.toml', fix warnings unless explicitly ignored
Use snake_case for functions and variables, PascalCase for classes, UPPER_CASE for constants
All public APIs require Python 3.11+ type hints on parameters and return values
Prefer 'collections.abc' / 'typing' abstractions (e.g., 'Sequence' over 'list') for type hints
Use 'typing.Annotated' for units or extra metadata when useful
Treat 'pyright' warnings (configured in 'pyproject.toml') as errors during development
Preserve stack traces and prevent duplicate logging when handling exceptions; use bare 'raise' statements when re-raising, and use 'logger.error()' for logging (not 'logger.exception()') to avoid duplicate stack trace output
When catching and logging exceptions without re-raising, always use 'logger.exception()' (equivalent to 'logger.error(exc_info=True)') to capture full stack trace information
Pydantic models using 'SecretStr', 'SerializableSecretStr', or 'OptionalSecretStr' should use 'default=None' for optional fields and 'default_factory=lambda: SerializableSecretStr("")' for non-optional fields to avoid initialization bugs
Provide Google-style docstrings for every public module, class, function and CLI command
The first line of docstrings must be a concise description ending with a period
Surround code entities in docstrings with backticks to avoid Vale false-positives
Validate and sanitise all user input, especially in web or CLI interfaces
Prefer 'httpx' with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use 'async'/'await' for I/O-bound work (HTTP, DB, file reads)
Cache expensive computations with 'functools.lru_cache' or an external cache when appropriate
Leverage NumPy vectorised operations whenever beneficial and feasible

Files:

  • packages/nvidia_nat_agno/tests/test_tool_wrapper.py
**/*.{py,yaml,yml,json,toml}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Indent with 4 spaces (never tabs) and ensure every file ends with a single newline

Files:

  • packages/nvidia_nat_agno/tests/test_tool_wrapper.py
**/test_*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/test_*.py: Use pytest with pytest-asyncio for asynchronous code testing
Test functions should be named with 'test_' prefix using snake_case
Extract frequently repeated code in tests into pytest fixtures with the 'fixture_' prefix on the function name and a 'name' argument in the decorator
Mock external services with 'pytest_httpserver' or 'unittest.mock' instead of hitting live endpoints in tests
Mark slow tests with '@pytest.mark.slow' so they can be skipped in the default test suite
Mark integration tests requiring external services with '@pytest.mark.integration' so they can be skipped in the default test suite

Files:

  • packages/nvidia_nat_agno/tests/test_tool_wrapper.py
**/*.{py,js,ts,tsx,jsx,sh,yaml,yml,json,toml,md,mdx,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{py,js,ts,tsx,jsx,sh,yaml,yml,json,toml,md,mdx,rst}: Every file must start with the standard SPDX Apache-2.0 header
Confirm that copyright years are up-to-date whenever a file is changed
All source files must include the SPDX Apache-2.0 header template

Files:

  • packages/nvidia_nat_agno/tests/test_tool_wrapper.py
**/*.{py,md,mdx,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Version numbers are derived automatically by 'setuptools-scm'; never hard-code them in code or docs

Files:

  • packages/nvidia_nat_agno/tests/test_tool_wrapper.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards.
  • For Python code, follow PEP 20 and PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues.
  • Python methods should use type hints for all parameters and return values (except for return values of None,
    in that situation no return type hint is needed).
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions

  • Verify that documentation and comments are clear and comprehensive.
  • Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
  • Verify that the documentation doesn't contain any offensive or outdated terms.
  • Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
    words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
    spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

  • Documentation in Markdown files should not contain usage of a possessive 's with inanimate objects
    (ex: "the system's performance" should be "the performance of the system").
  • Documentation in Markdown files should not use NAT as an acronym, always spell out NeMo Agent Toolkit.
    The exception to this rule is when referring to package names or code identifiers that contain "nat", th...

Files:

  • packages/nvidia_nat_agno/tests/test_tool_wrapper.py
packages/**/*

⚙️ CodeRabbit configuration file

packages/**/*:

  • This directory contains optional plugin packages for the toolkit, each should contain a pyproject.toml file.
  • The pyproject.toml file should declare a dependency on nvidia-nat or another package with a name starting
    with nvidia-nat-. This dependency should be declared using ~=<version>, and the version should be a two
    digit version (ex: ~=1.0).

  • Not all packages contain Python code, if they do they should also contain their own set of tests, in a
    tests/ directory at the same level as the pyproject.toml file.
  • When adding a new package, that new package name (as defined in the pyproject.toml file) should
    be added as a dependency to the nvidia-nat-all package in packages/nvidia_nat_all/pyproject.toml

Files:

  • packages/nvidia_nat_agno/tests/test_tool_wrapper.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CI Pipeline / Check
🔇 Additional comments (2)
packages/nvidia_nat_agno/tests/test_tool_wrapper.py (2)

17-17: LGTM!

Standard library import correctly added for platform detection logic.


189-230: Well-documented fix for architecture-specific behavior.

The approach is sound:

  • The detailed comments explaining the pydantic-core C extension differences provide good context for future maintainers.
  • The [1, 3] fallback for unknown combinations ensures forward compatibility with new Python versions.
  • The informative assertion message guides developers on how to update the expected values when CI behavior changes.

One minor consideration: the expected_counts dictionary uses lists for membership checking. While functionally correct, sets (e.g., {1}, {1, 3}) would be slightly more idiomatic for this purpose. However, this is purely stylistic and the current implementation is clear.
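
As a small illustration of the stylistic point above, the sketch below compares a list and a set for the membership check. The constants and the `is_accepted` helper are hypothetical stand-ins for one entry of the expected-count table, not code from the test file.

```python
from collections.abc import Container

# Hypothetical stand-ins for one entry of the expected-count table.
ALLOWED_AS_LIST = [1, 3]
ALLOWED_AS_SET = frozenset({1, 3})


def is_accepted(call_count: int, allowed: Container[int]) -> bool:
    """Membership test; behaves the same for a list or a set of counts."""
    return call_count in allowed
```

Both containers give the same answer for every count; the set form mainly signals that order and duplicates carry no meaning.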


@mnajafian-nv mnajafian-nv changed the base branch from develop to release/1.4 January 11, 2026 22:23
@mnajafian-nv mnajafian-nv added the bug and non-breaking labels Jan 11, 2026
…ions

Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
codecov bot commented Jan 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.98%. Comparing base (4ec7cf8) to head (c499c8d).
⚠️ Report is 4 commits behind head on release/1.4.

Additional details and impacted files
@@             Coverage Diff              @@
##           release/1.4    #1383   +/-   ##
============================================
  Coverage        74.98%   74.98%           
============================================
  Files              552      552           
  Lines            38769    38769           
============================================
  Hits             29072    29072           
  Misses            9697     9697           
Flag Coverage Δ
unittests 74.98% <ø> (ø)

The test was failing on arm64 (aarch64) Python 3.11 because it expected
3 calls to get_running_loop() but only 1 call was made. This is due to
pydantic-core C extension differences across architectures and Python versions.

Changes:
- Use platform.machine() and platform.python_version_tuple() for detection
- Define architecture + Python version specific expected call counts
- Confirmed: aarch64 Py3.11 → 1 call (CI job 60115041673)
- Confirmed: x86_64 Py3.11/3.13 → 1 call (CI jobs 60115041674, 60115041704)
- Accept [1, 3] for unconfirmed combinations to prevent future flakiness
- Add clear documentation with CI job references for traceability

This addresses willkill07's review feedback to use platform detection
for more precise test assertions.

Fixes: CI Pipeline / Test (arm64, 3.11) failure in PR #1349
Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
packages/nvidia_nat_agno/tests/test_tool_wrapper.py (1)

189-230: Well-structured architecture-aware test logic.

The implementation correctly addresses the flaky test issue with:

  • Proper normalization of machine name to lowercase
  • Consistent use of string types for version tuple elements
  • Defensive fallback for unknown platform/version combinations
  • Actionable error message with guidance for maintainers

The CI job references in comments provide good traceability.

Minor simplification: Consider tuple unpacking for the version extraction.

♻️ Optional simplification
-        py_version_tuple = platform.python_version_tuple()
-        py_major, py_minor = py_version_tuple[0], py_version_tuple[1]
+        py_major, py_minor, _ = platform.python_version_tuple()
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 21f7f66 and 9f97f95.

📒 Files selected for processing (1)
  • packages/nvidia_nat_agno/tests/test_tool_wrapper.py
🔇 Additional comments (1)
packages/nvidia_nat_agno/tests/test_tool_wrapper.py (1)

17-17: LGTM!

The platform import is correctly positioned alphabetically within the stdlib imports section.

Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
@willkill07 willkill07 removed the request for review from a team January 12, 2026 17:20
@mnajafian-nv mnajafian-nv merged commit a52147b into release/1.4 Jan 12, 2026
17 checks passed
@mnajafian-nv mnajafian-nv deleted the fix/agno-flaky-test-release-1.4 branch January 12, 2026 17:45
mnajafian-nv added a commit that referenced this pull request Jan 14, 2026
Replace exact call count assertion with assert_called().
Call count varies by arch/Python/pydantic-core version and
has caused repeated CI failures (#1383, #1396).

Fixes: #1396
Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
mnajafian-nv added a commit that referenced this pull request Jan 14, 2026
## Description
fix: simplify event loop test to assert_called

The test was checking exact call counts of get_running_loop() which
varies by architecture, Python version, and pydantic-core version.
This caused repeated CI failures (#1383, #1396).

Simplified to assert_called() since:
- Exact call count is an implementation detail
- Other tests already cover the event loop code path
- The previous approach was not sustainable
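
A rough sketch of the simplified check, assuming a `MagicMock` stands in for the patched `get_running_loop()`; the function name `verify_loop_accessed` is hypothetical:

```python
from unittest.mock import MagicMock


def verify_loop_accessed(mock_get_running_loop: MagicMock) -> None:
    """Pass if the mock was called at least once, regardless of how many times."""
    mock_get_running_loop.assert_called()


# Usage: one call and three calls both pass, so the check no longer depends
# on architecture, Python version, or pydantic-core version.
once = MagicMock(name="get_running_loop")
once()
verify_loop_accessed(once)

thrice = MagicMock(name="get_running_loop")
for _ in range(3):
    thrice()
verify_loop_accessed(thrice)
```

`assert_called()` raises `AssertionError` only when the mock was never called, which is the one case this test still needs to catch.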

Fixes: #1396

## By Submitting this PR I confirm:
- I am familiar with the [Contributing
Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing/index.md).
- We require that all contributors "sign-off" on their commits. This
certifies that the contribution is your original work, or you have
rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will
not be accepted.
- When the PR is ready for review, new or existing tests cover these
changes.
- When the PR is ready for review, the documentation is up to date with
these changes.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Simplified event-loop verification by replacing architecture-dependent
assertions with a basic check that the running event loop is accessed.
* Renamed and clarified the related test to reflect that only access is
validated, improving cross-environment reliability and maintainability.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
Jerryguan777 pushed a commit to Jerryguan777/NeMo-Agent-Toolkit that referenced this pull request Jan 28, 2026
Jerryguan777 pushed a commit to Jerryguan777/NeMo-Agent-Toolkit that referenced this pull request Jan 28, 2026