Fix tests under `examples/`, remove all pytest `skip` markers by dagardner-nv · Pull Request #846 · NVIDIA/NeMo-Agent-Toolkit

dagardner-nv · 2025-09-24T18:23:20Z

Description

Tests now accurately depend on the API key fixtures they need
test_alert_triage_agent_workflow.py now only processes the first prompt in the dataset reducing the runtime from 4 minutes down to 40s.
Revert unintended formatting change to test_spans.csv causing parsing errors for the profiler_agent tests.
Document the potential need to run the simple web query eval example with max_concurrency=1 as a work-around for Implement retry/sleep/backoff logic when we receive 429 errors #842
Fix Python 3.13 specific issue (swe_bench evaluation example broken in Python 3.13 #845) causing the swe_bench example to fail with a type error (Thanks to @willkill07 on this one).
Fix the early-out check in src/nat/utils/type_converter.py for the case where the source and destination types are the same. This avoids useless warnings like: WARNING - Indirect type conversion used to convert <class 'str'> to <class 'str'>, which may lead to unintended conversions. Consider adding a direct converter from <class 'str'> to <class 'str'> to ensure correctness.
Add a Phoenix service instance for e2e testing.

Closes #845

By Submitting this PR I confirm:

I am familiar with the Contributing Guidelines.
We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
When the PR is ready for review, new or existing tests cover these changes.
When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

New Features
- Grouped “user report” tooling exposing all report operations via a single function group.
Refactor
- Migrated user report API to grouped access and switched object-store paths to relative semantics.
Documentation
- Added guidance to handle rate limits by setting eval.general.max_concurrency=1; markdown link checks now ignore arize.com.
Tests
- Converted several skipped tests to fixture-based integration tests, added a Docker-required fixture, introduced data fixtures, reduced timeouts, and enforced a concurrency override.
Chores
- CI updated to add a Phoenix service and adjust service ordering.

Signed-off-by: David Gardner <dagardner@nvidia.com>

…is test from 4 minutes down to 40 seconds Signed-off-by: David Gardner <dagardner@nvidia.com>

…avid-fix-example-tests Signed-off-by: David Gardner <dagardner@nvidia.com>

This reverts commit 77985f9. Signed-off-by: David Gardner <dagardner@nvidia.com>

coderabbitai · 2025-09-24T18:23:29Z

Walkthrough

Adds a Phoenix CI service, converts multiple tests from skip markers to fixture-based runs (adding Docker and NVIDIA fixtures), tightens timeouts/concurrency for evaluations, migrates user-report functions to a grouped API, and refactors core type-instance and conversion checks.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CI services: `.gitlab-ci.yml` | Reordered services in `test:python_tests`; added `arizephoenix/phoenix:latest` service with alias `phoenix`; preserved MySQL service. |
| Alert triage agent tests: `examples/advanced_agents/alert_triage_agent/tests/test_alert_triage_agent_workflow.py` | Replaced skip with integration/test decorators and fixture; read files as UTF-8; run a single JSON input entry; simplified assertions and deterministic checks. |
| Profiler agent tests: `examples/advanced_agents/profiler_agent/tests/test_profiler_agent.py` | Added `df_path` pytest fixture (`df_path_fixture`); updated `test_flow_chart_tool` and `test_token_usage_tool` signatures to accept `df_path`; removed skip placeholders. |
| Simple web query eval docs: `examples/evaluation_and_profiling/simple_web_query_eval/README.md` | Added note advising `eval.general.max_concurrency: 1` (YAML override or CLI) for rate limiting. |
| Simple web query eval tests: `examples/evaluation_and_profiling/simple_web_query_eval/tests/test_simple_web_query_eval.py` | Added `usefixtures("nvidia_api_key")`; reduced `endpoint_timeout` from 300→30; added `override=(('eval.general.max_concurrency','1'),)`. |
| SWE-bench eval tests: `examples/evaluation_and_profiling/swe_bench/tests/test_swe_bench_eval.py` | Changed import to `nat_swe_bench.config`; removed skip decorator; added `@pytest.mark.usefixtures("require_docker")`. |
| Object store user report tests (grouped API): `examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py` | Migrated tests from per-function registration to group-based `UserReportConfig` and `add_function_group`/`get_function_group`; `ObjectStoreRef(name=...)` → `ObjectStoreRef(value=...)`; object paths made relative (`"reports/..."`); access functions via group getters; removed `KeyAlreadyExistsError` imports. |
| Test plugin (Docker fixture): `packages/nvidia_nat_test/src/nat/test/plugin.py` | Added session-scoped `require_docker` fixture that attempts to construct a `DockerClient`, yields it on success, and skips or raises with a reason when Docker is unavailable. |
| Core type handling: `src/nat/builder/component_utils.py`, `src/nat/utils/type_converter.py`, `src/nat/utils/type_utils.py` | Changed union and instance checks to use `DecomposedType(...).is_instance(...)` and `get_base_type()` where appropriate; `_convert` uses `decomposed.is_instance(data)` and captures `src_type` for warnings, altering how type matches and conversion warnings are determined. |
| Markdown link checks: `ci/markdown-link-check-config.json` | Added an ignore pattern for `https://arize.com` to markdown link checks. |
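
The `override=(('eval.general.max_concurrency','1'),)` entry in the table pairs a dotted config path with a string value. A minimal sketch of how such an override could be applied to a nested mapping follows. The `apply_overrides` helper and the sample config are hypothetical illustrations, not the toolkit's implementation, which may also coerce the string value to the field's type.

```python
import copy
from typing import Any


def apply_overrides(config: dict[str, Any], overrides: tuple[tuple[str, str], ...]) -> dict[str, Any]:
    """Return a copy of `config` with each (dotted_path, value) override applied.

    Hypothetical helper for illustration only; values are stored as given (strings).
    """
    result = copy.deepcopy(config)
    for dotted_path, value in overrides:
        *parents, leaf = dotted_path.split(".")
        node = result
        for key in parents:
            # Create intermediate mappings when the path does not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = value
    return result


# Sample config invented for this sketch.
config = {"eval": {"general": {"max_concurrency": 8}}}
updated = apply_overrides(config, (("eval.general.max_concurrency", "1"),))
```

The original mapping is left untouched, so a test can apply an override without affecting other tests that share the same base config.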

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Test as Test
  participant Builder as WorkflowBuilder
  participant Group as FunctionGroup(user_report)
  participant Store as ObjectStore

  Note over Test,Builder: Register grouped user-report functions
  Test->>Builder: add_function_group("user_report", UserReportConfig(...))
  Builder-->>Test: OK

  Note over Test,Group: Retrieve and invoke operations via group
  Test->>Builder: get_function_group("user_report")
  Builder-->>Test: Group
  Test->>Group: get("user_report.put")
  Group-->>Test: put_fn
  Test->>put_fn: put("reports/abc.json", data)
  put_fn->>Store: write "reports/abc.json"
  Store-->>put_fn: success
  put_fn-->>Test: result

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title clearly and concisely describes the main changes to example tests by removing pytest skip markers and fixing test execution, is written in imperative mood, and meets the length requirement. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/nat/utils/type_converter.py (1)
221-243: isinstance with typing/generic targets can raise TypeError — use DecomposedType(...).is_instance.

isinstance(data, to_type) and isinstance(next_data, to_type) will fail for Annotated, Union, or parametrized generics (list[str]). Use the base-type-aware check for consistency with other parts.

Apply:
-        # 1) If data is already correct type
-        if isinstance(data, to_type):
+        # 1) If data is already correct type (base-type aware)
+        if DecomposedType(to_type).is_instance(data):
             return data
@@
-                        if isinstance(next_data, to_type):
+                        if DecomposedType(to_type).is_instance(next_data):
                             return next_data
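
To illustrate the failure mode described above: plain `isinstance` raises `TypeError` for parametrized generics such as `list[str]` and for `Annotated` types. The sketch below shows one way a base-type-aware check could work using only the standard `typing` helpers. The `base_type` and `is_instance_of_base` names are hypothetical, and this is not the `DecomposedType` implementation; it does not handle unions.

```python
from typing import Annotated, Any, get_args, get_origin


def base_type(tp: Any) -> Any:
    """Strip `Annotated` wrappers and generic parameters, e.g. list[str] -> list."""
    while get_origin(tp) is Annotated:
        tp = get_args(tp)[0]
    origin = get_origin(tp)
    return tp if origin is None else origin


def is_instance_of_base(value: Any, tp: Any) -> bool:
    """Check `value` against the runtime base class of `tp` (element types are not checked)."""
    return isinstance(value, base_type(tp))


def plain_isinstance_fails(value: Any, tp: Any) -> bool:
    """Return True when a direct isinstance() call raises TypeError for `tp`."""
    try:
        isinstance(value, tp)
    except TypeError:
        return True
    return False
```

Note that this kind of check only verifies the container class: `is_instance_of_base([1, 2], list[str])` is true because element types are ignored.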

🧹 Nitpick comments (14)

.gitlab-ci.yml (1)

82-82: Make MySQL service alias explicit (the mysql:9.3 tag exists)

Replace service entry:
-    - mysql:9.3
+    - name: mysql:9.3
+      alias: mysql
(Optional) In your test job, wait for MySQL before running tests:
-    - echo "Running tests"
+    - echo "Waiting for MySQL..."
+    - until (</dev/tcp/mysql/3306) >/dev/null 2>&1; do sleep 1; done
+    - echo "Running tests"
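
The shell loop above polls a TCP port until it accepts connections. The same readiness check could be sketched in Python, for example inside a test fixture. The `wait_for_port` name and its default timings are assumptions made for this illustration.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the service is accepting connections.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller might use `wait_for_port("mysql", 3306)` and skip or fail the test when it returns `False`. A successful connect only shows that the port is open, not that the service has finished initializing.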

src/nat/utils/type_converter.py (2)

92-96: Avoid constructing DecomposedType when to_type is None (early-return first).

Minor robustness/clarity: check to_type is None before creating DecomposedType(to_type).

Apply:

-        decomposed = DecomposedType(to_type)
-
-        # 1) If data is already correct type, return it
-        if to_type is None or decomposed.is_instance(data):
-            return data
+        # 1) If to_type is None or data is already correct type, return it
+        if to_type is None:
+            return data
+        decomposed = DecomposedType(to_type)
+        if decomposed.is_instance(data):
+            return data

159-165: Log full stack on conversion failure when not re-raising.

Per guidelines, prefer logger.exception() when swallowing the error to aid diagnosis.

-        except ValueError:
-            logger.warning("Type conversion failed, using original value. From %s to %s", type(data), to_type)
+        except ValueError:
+            logger.exception("Type conversion failed, using original value. From %s to %s", type(data), to_type)
             # Return original data, let downstream code handle it
             return data

packages/nvidia_nat_test/src/nat/test/plugin.py (1)

210-223: Ensure Docker is actually reachable; close the client.

Instantiate via from_env(), ping the daemon to verify availability, and close the client after session. Current DockerClient() neither validates connectivity nor closes.

-@pytest.fixture(name="require_docker", scope='session')
-def require_docker_fixture(fail_missing: bool) -> "DockerClient":
+@pytest.fixture(name="require_docker", scope='session')
+def require_docker_fixture(fail_missing: bool) -> "DockerClient":
     """
     Use for integration tests that require Docker to be running.
     """
-    try:
-        from docker.client import DockerClient
-        yield DockerClient()
-    except Exception as e:
+    try:
+        from docker import from_env
+        client = from_env()
+        # Validate connectivity
+        client.ping()
+        try:
+            yield client
+        finally:
+            # Close client at end of session
+            client.close()
+    except Exception as e:
         reason = f"Unable to connect to Docker daemon: {e}"
         if fail_missing:
             raise RuntimeError(reason) from e
         pytest.skip(reason=reason)

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py (2)

50-54: Fix fixture docstring (returns dict of functions, not a function).

Minor clarity improvement.
-async def group(builder):
-    """Pytest fixture to get a function from the builder."""
+async def group(builder):
+    """Pytest fixture to get accessible functions from the user_report group."""
1-1: Typo in filename: consider renaming to ‘test_object_store_example_user_report_tool.py’.

Improves discoverability and avoids confusion.

Would you like me to open a follow-up to rename this file and update any references?

examples/advanced_agents/alert_triage_agent/tests/test_alert_triage_agent_workflow.py (5)

41-49: Avoid brittle ../../../ jumps; resolve dataset path robustly and assert existence.

Using importlib.resources with upward traversal is fragile. Prefer Path-based resolution and check existence.

Apply:
-    with open(config_file, "r", encoding="utf-8") as file:
+    with open(config_file, "r", encoding="utf-8") as file:
         config = yaml.safe_load(file)
         input_filepath = config["eval"]["general"]["dataset"]["file_path"]

-    input_filepath_abs = importlib.resources.files(package_name).joinpath("../../../../../", input_filepath).absolute()
+    input_filepath_abs = Path(input_filepath)
+    if not input_filepath_abs.is_absolute():
+        input_filepath_abs = (Path.cwd() / input_filepath_abs).resolve()
+    assert input_filepath_abs.exists(), f"Dataset not found: {input_filepath_abs}"
51-52: Guard against empty or malformed datasets before indexing.

Add a precondition to avoid IndexError if the JSON isn’t a non-empty list.
-    input_data = input_data[0]  # Limit to first row for testing
+    assert isinstance(input_data, list) and input_data, "Dataset is empty or wrong format"
+    input_data = input_data[0]  # Limit to first row for testing
32-36: Add return type annotation to the test function.

Keep type hints consistent per repo standards.
-async def test_full_workflow():
+async def test_full_workflow() -> None:
59-59: Strengthen emptiness check to ignore whitespace-only results.
-    assert len(result) > 0, "Result is empty"
+    assert result and result.strip(), "Result is empty or whitespace"
62-62: Make label match case-insensitive to reduce flakiness.
-    assert input_data['label'] in result
+    assert input_data['label'].lower() in result.lower()
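
The dataset guard and the two assertion tweaks suggested above can be combined into a short sketch. The `first_record` and `check_result` helpers and the sample data are hypothetical; only the `label` field name comes from the test under review.

```python
import json


def first_record(raw_json: str) -> dict:
    """Parse a JSON dataset and return its first entry, failing early on bad input."""
    data = json.loads(raw_json)
    assert isinstance(data, list) and data, "Dataset is empty or wrong format"
    return data[0]


def check_result(result: str, record: dict) -> None:
    """Apply the whitespace-aware emptiness check and the case-insensitive label match."""
    assert result and result.strip(), "Result is empty or whitespace"
    assert record["label"].lower() in result.lower()


# Sample data invented for this sketch.
dataset = '[{"label": "Hardware"}, {"label": "Network"}]'
record = first_record(dataset)
check_result("Root cause category: hardware fault", record)
```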

examples/advanced_agents/profiler_agent/tests/test_profiler_agent.py (3)

69-69: Annotate test return type.

Align tests with type-hint policy.

-async def test_flow_chart_tool(df_path: Path):
+async def test_flow_chart_tool(df_path: Path) -> None:

80-80: Annotate test return type.

-async def test_token_usage_tool(df_path: Path):
+async def test_token_usage_tool(df_path: Path) -> None:

51-61: Prefer httpx and tighter exception handling for the Phoenix probe.

Per guidelines, use httpx and catch HTTP-specific errors. Also annotate fixture return type.

-    import requests
+    import httpx
     try:
-        response = requests.get("http://localhost:6006/v1/traces", timeout=5)
-        if response.status_code != 200:
+        with httpx.Client(timeout=5.0) as client:
+            response = client.get("http://localhost:6006/v1/traces")
+        if response.status_code != 200:
             raise ConnectionError(f"Unexpected status code: {response.status_code}")
-    except Exception as e:
+    except httpx.HTTPError as e:
         reason = f"Unable to connect to Phoenix server at http://localhost:6006/v1/traces: {e}"
         if fail_missing:
             raise RuntimeError(reason)
         pytest.skip(reason=reason)

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3a964c and 17bbe8e.

⛔ Files ignored due to path filters (1)

examples/advanced_agents/profiler_agent/tests/test_spans.csv is excluded by !**/*.csv

📒 Files selected for processing (11)

.gitlab-ci.yml (1 hunks)
examples/advanced_agents/alert_triage_agent/tests/test_alert_triage_agent_workflow.py (1 hunks)
examples/advanced_agents/profiler_agent/tests/test_profiler_agent.py (1 hunks)
examples/evaluation_and_profiling/simple_web_query_eval/README.md (1 hunks)
examples/evaluation_and_profiling/simple_web_query_eval/tests/test_simple_web_query_eval.py (2 hunks)
examples/evaluation_and_profiling/swe_bench/tests/test_swe_bench_eval.py (3 hunks)
examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py (2 hunks)
packages/nvidia_nat_test/src/nat/test/plugin.py (2 hunks)
src/nat/builder/component_utils.py (1 hunks)
src/nat/utils/type_converter.py (2 hunks)
src/nat/utils/type_utils.py (1 hunks)

🧰 Additional context used

📓 Path-based instructions (10)

**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

examples/advanced_agents/alert_triage_agent/tests/test_alert_triage_agent_workflow.py
src/nat/builder/component_utils.py
src/nat/utils/type_utils.py
examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py
src/nat/utils/type_converter.py
packages/nvidia_nat_test/src/nat/test/plugin.py
examples/advanced_agents/profiler_agent/tests/test_profiler_agent.py
examples/evaluation_and_profiling/swe_bench/tests/test_swe_bench_eval.py
examples/evaluation_and_profiling/simple_web_query_eval/tests/test_simple_web_query_eval.py

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

examples/advanced_agents/alert_triage_agent/tests/test_alert_triage_agent_workflow.py
src/nat/builder/component_utils.py
src/nat/utils/type_utils.py
examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py
src/nat/utils/type_converter.py
packages/nvidia_nat_test/src/nat/test/plugin.py
examples/advanced_agents/profiler_agent/tests/test_profiler_agent.py
examples/evaluation_and_profiling/swe_bench/tests/test_swe_bench_eval.py
examples/evaluation_and_profiling/simple_web_query_eval/tests/test_simple_web_query_eval.py

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards.
- For Python code, follow PEP 20 and PEP 8 for style guidelines.
- Check for security vulnerabilities and potential issues.
- Python methods should use type hints for all parameters and return values.
Example:
def my_function(param1: int, param2: str) -> bool:
    pass
- For Python exception handling, ensure proper stack trace preservation:
  - When re-raising exceptions: use bare raise statements to maintain the original stack trace, and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
  - When catching and logging exceptions without re-raising: always use logger.exception() to capture the full stack trace information.

Documentation Review Instructions
- Verify that documentation and comments are clear and comprehensive.
- Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
- Verify that the documentation doesn't contain any offensive or outdated terms.
- Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc.
- All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0, and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.

Files:

examples/advanced_agents/alert_triage_agent/tests/test_alert_triage_agent_workflow.py
src/nat/builder/component_utils.py
src/nat/utils/type_utils.py
examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py
src/nat/utils/type_converter.py
examples/evaluation_and_profiling/simple_web_query_eval/README.md
packages/nvidia_nat_test/src/nat/test/plugin.py
examples/advanced_agents/profiler_agent/tests/test_profiler_agent.py
examples/evaluation_and_profiling/swe_bench/tests/test_swe_bench_eval.py
examples/evaluation_and_profiling/simple_web_query_eval/tests/test_simple_web_query_eval.py

examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*:
- This directory contains example code and usage scenarios for the toolkit, at a minimum an example should contain a README.md or file README.ipynb.
- If an example contains Python code, it should be placed in a subdirectory named src/ and should contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.
- If an example contains YAML files, they should be placed in a subdirectory named configs/.
- If an example contains sample data files, they should be placed in a subdirectory named data/, and should be checked into git-lfs.

Files:

examples/advanced_agents/alert_triage_agent/tests/test_alert_triage_agent_workflow.py
examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py
examples/evaluation_and_profiling/simple_web_query_eval/README.md
examples/advanced_agents/profiler_agent/tests/test_profiler_agent.py
examples/evaluation_and_profiling/swe_bench/tests/test_swe_bench_eval.py
examples/evaluation_and_profiling/simple_web_query_eval/tests/test_simple_web_query_eval.py

src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

src/nat/builder/component_utils.py
src/nat/utils/type_utils.py
src/nat/utils/type_converter.py

src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

src/nat/builder/component_utils.py
src/nat/utils/type_utils.py
src/nat/utils/type_converter.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

src/nat/builder/component_utils.py
src/nat/utils/type_utils.py
src/nat/utils/type_converter.py

{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

src/nat/builder/component_utils.py
src/nat/utils/type_utils.py
src/nat/utils/type_converter.py
packages/nvidia_nat_test/src/nat/test/plugin.py

**/README.@(md|ipynb)

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Ensure READMEs follow the naming convention; avoid deprecated names; use “NeMo Agent Toolkit” (capital T) in headings

Files:

examples/evaluation_and_profiling/simple_web_query_eval/README.md

packages/*/src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Importable Python code inside packages must live under packages//src/

Files:

packages/nvidia_nat_test/src/nat/test/plugin.py

packages/**/*

⚙️ CodeRabbit configuration file

packages/**/*:
- This directory contains optional plugin packages for the toolkit, each should contain a pyproject.toml file.
- The pyproject.toml file should declare a dependency on nvidia-nat or another package with a name starting with nvidia-nat-. This dependency should be declared using ~=<version>, and the version should be a two digit version (ex: ~=1.0).
- Not all packages contain Python code, if they do they should also contain their own set of tests, in a tests/ directory at the same level as the pyproject.toml file.

Files:

packages/nvidia_nat_test/src/nat/test/plugin.py

🧬 Code graph analysis (5)

examples/advanced_agents/alert_triage_agent/tests/test_alert_triage_agent_workflow.py (1)

examples/advanced_agents/alert_triage_agent/src/nat_alert_triage_agent/register.py (1)

AlertTriageAgentWorkflowConfig (46-65)

src/nat/builder/component_utils.py (1)

src/nat/utils/type_utils.py (2)

DecomposedType (60-488)

is_instance (341-356)

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py (3)

examples/object_store/user_report/src/nat_user_report/user_report_tools.py (1)

UserReportConfig (31-43)

src/nat/builder/workflow_builder.py (4)

add_function_group (440-463)

add_function_group (1134-1135)

get_function_group (475-481)

get_function_group (1147-1153)

src/nat/builder/function.py (4)

get_accessible_functions (485-521)

ainvoke (116-117)

ainvoke (120-121)

ainvoke (124-163)

src/nat/utils/type_converter.py (1)

src/nat/utils/type_utils.py (1)

is_instance (341-356)

examples/evaluation_and_profiling/simple_web_query_eval/tests/test_simple_web_query_eval.py (1)

src/nat/utils/type_utils.py (1)

override (56-57)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: CI Pipeline / Check

🔇 Additional comments (16)

.gitlab-ci.yml (1)
83-85: Pin Phoenix image and add readiness wait
Replace arizephoenix/phoenix:latest with ${PHOENIX_IMAGE} and pin to the latest stable tag (v11.36.0):
variables:
  # Pin to a reproducible Phoenix release
-  PHOENIX_IMAGE: "arizephoenix/phoenix:<PINNED_TAG_OR_DIGEST>"
+  PHOENIX_IMAGE: "arizephoenix/phoenix:version-11.36.0"
In your test script, wait for Phoenix to be ready before running tests:
```yaml
  - echo "Waiting for Phoenix..."
  - until curl -fsS "http://phoenix:${NAT_CI_PHOENIX_PORT:-6006}/" >/dev/null; do sleep 1; done
  - echo "Running tests"
```
examples/evaluation_and_profiling/simple_web_query_eval/README.md (1)

89-90: Concurrency note addition looks good.

Clear, actionable guidance; matches the CLI usage pattern shown in tests.

src/nat/utils/type_converter.py (1)

201-212: Good: preserve original source type for indirect-conversion warnings.

Capturing src_type = type(data) and reusing it in warnings avoids spurious type re-evaluation.

src/nat/utils/type_utils.py (1)

356-356: LGTM: is_instance now respects base type.

Using get_base_type().root fixes checks for annotated/async-generator-wrapped types.

src/nat/builder/component_utils.py (1)

161-162: LGTM: union-branch check now base-type aware.

Switching to DecomposedType(arg).is_instance(value) aligns with the new type utilities and avoids false negatives for annotated types.

examples/evaluation_and_profiling/swe_bench/tests/test_swe_bench_eval.py (2)

26-26: LGTM: import path update.

Using nat_swe_bench.config is appropriate for config-only access.

82-83: LGTM: require_docker fixture usage.

Makes the Docker dependency explicit and skip/fail behavior configurable via plugin options.

packages/nvidia_nat_test/src/nat/test/plugin.py (1)

21-23: TYPE_CHECKING import is appropriate.

Keeps runtime deps optional while preserving types.

examples/evaluation_and_profiling/simple_web_query_eval/tests/test_simple_web_query_eval.py (2)

105-106: LGTM: API key requirement expressed via fixture.

Aligns with plugin-provided nvidia_api_key and removes skip markers.

127-130: No change needed: EvaluationRunConfig includes an override: tuple[tuple[str, str], ...] field.

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py (1)

33-39: LGTM: function group config and includes.

Group-based registration with include is coherent and uses new ObjectStoreRef(value=...).

examples/advanced_agents/alert_triage_agent/tests/test_alert_triage_agent_workflow.py (3)

55-56: LGTM: single-run workflow and typed result.

Using to_type=str is clear and avoids downstream type surprises.

65-65: LGTM: semantic assertion.

Checking for “root cause category” in the output is a meaningful invariant.

32-33: LGTM: integration mark and nvidia_api_key fixture usage validated.

examples/advanced_agents/profiler_agent/tests/test_profiler_agent.py (2)

68-89: Async test support confirmed: pytest-asyncio is installed and configured (asyncio_mode = "auto", pytest-asyncio==0.24.*).

63-66: Approve: fixture provides a stable path to test data.

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py

…ple_user_report_tool.py Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Will Killian <2007799+willkill07@users.noreply.github.com>

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py (1)

1-1: Fix filename typo: “objext” → “object”.

Rename the file to test_object_store_example_user_report_tool.py for clarity and professionalism.

🧹 Nitpick comments (3)

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py (3)
50-53: Rename fixture for clarity.

group actually returns a dict of functions, not the group instance. Consider renaming to user_report_functions (or similar) to reduce confusion.

28-40: Add return type hints on fixtures (pyright treats warnings as errors).

Annotate fixture return types to satisfy the project’s typing standards.
-async def builder():
+async def builder() -> WorkflowBuilder:
@@
-async def group(builder):
+async def group(builder) -> dict[str, Function]:
Add the missing import at the top of the file:
from nat.builder.function import Function
Also applies to: 50-53

60-71: Compare JSON structures, not serialized strings.

Avoid brittleness from string serialization differences.
-        assert result == json.dumps(test_report)
+        assert json.loads(result) == test_report
@@
-        assert result == json.dumps(test_report)
+        assert json.loads(result) == test_report
Also applies to: 81-85
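
A short sketch of why comparing serialized strings is brittle: two JSON strings can encode the same data yet differ in whitespace (or key order), so string equality fails while structural equality holds. The sample `test_report` contents here are invented for illustration.

```python
import json

# Sample payload invented for this sketch.
test_report = {"user_id": "abc", "items": [1, 2, 3]}

# Same data, two valid serializations that differ only in whitespace.
compact = json.dumps(test_report, separators=(",", ":"))
default = json.dumps(test_report)

strings_differ = compact != default
structures_match = json.loads(compact) == json.loads(default) == test_report
```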

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 17bbe8e and eff57c1.

📒 Files selected for processing (1)

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py (2 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards.
- For Python code, follow PEP 20 and PEP 8 for style guidelines.
- Check for security vulnerabilities and potential issues.
- Python methods should use type hints for all parameters and return values. Example:
def my_function(param1: int, param2: str) -> bool:
    pass
- For Python exception handling, ensure proper stack trace preservation:
  - When re-raising exceptions: use bare raise statements to maintain the original stack trace, and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
  - When catching and logging exceptions without re-raising: always use logger.exception() to capture the full stack trace information.

Documentation Review Instructions
- Verify that documentation and comments are clear and comprehensive.
- Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
- Verify that the documentation doesn't contain any offensive or outdated terms.
- Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc.
- All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0, and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.

Files:

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py

examples/**/*

⚙️ CodeRabbit configuration file

examples/**/*:
- This directory contains example code and usage scenarios for the toolkit, at a minimum an example should contain a README.md or file README.ipynb.
- If an example contains Python code, it should be placed in a subdirectory named src/ and should contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.
- If an example contains YAML files, they should be placed in a subdirectory named configs/.
- If an example contains sample data files, they should be placed in a subdirectory named data/, and should be checked into git-lfs.

Files:

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py

🧬 Code graph analysis (1)

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py (4)

examples/object_store/user_report/src/nat_user_report/user_report_tools.py (1)

UserReportConfig (31-43)

src/nat/builder/workflow_builder.py (4)

add_function_group (440-463)

add_function_group (1134-1135)

get_function_group (475-481)

get_function_group (1147-1153)

src/nat/builder/function.py (4)

get_accessible_functions (485-521)

ainvoke (116-117)

ainvoke (120-121)

ainvoke (124-163)

src/nat/object_store/models.py (1)

ObjectStoreItem (21-38)

🔇 Additional comments (5)

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py (5)

33-39: LGTM: Correct migration to grouped function config.

Using ObjectStoreRef(value="test_object_store"), include list, and per-op descriptions aligns with the new API.

116-127: LGTM: Idempotent put behavior matches new API (string status instead of exception).

The “already exists” response aligns with the grouped-tooling behavior change.

179-181: Consistent object keys (no leading slash) — resolved.

The earlier inconsistency was fixed; keys now use "reports/..." everywhere, including deletion checks.

Also applies to: 195-196

206-241: LGTM: Solid end-to-end CRUD workflow test.

Covers put/get/update/get/delete/get(not found) with clear assertions.

59-61: Async tests correctly configured via pytest-asyncio
pytest-asyncio==0.24.* is installed and asyncio_mode = "auto" in pyproject.toml enables async def tests to run without explicit markers.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

ci/markdown-link-check-config.json (1)
19-20: Broaden the arize.com ignore to cover subdomains and subpaths
Many links use e.g. https://www.arize.com, https://phoenix.arize.com or subpaths like /docs/phoenix. Update the pattern to match an optional subdomain:
-            "pattern": "^https://arize\\.com"
+            "pattern": "^https://([^.]+\\.)?arize\\.com"
Run the existing JSON validation and link-scan scripts to confirm coverage.
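As a quick check of the suggested pattern, the sketch below evaluates both regular expressions against a few sample URLs with Python's `re` module. This assumes the link checker applies the pattern as an unanchored-at-the-end regex search, and that Python and JavaScript regex semantics agree for these simple patterns; the URL list is illustrative. Note that `\\.` in the JSON file corresponds to `\.` in the regex itself.

```python
import re

# Patterns as they read after JSON unescaping.
OLD_PATTERN = re.compile(r"^https://arize\.com")
NEW_PATTERN = re.compile(r"^https://([^.]+\.)?arize\.com")

# Illustrative URLs: bare domain, subpath, and two subdomains.
URLS = [
    "https://arize.com",
    "https://arize.com/docs/phoenix",
    "https://www.arize.com",
    "https://phoenix.arize.com",
]


def is_ignored(pattern: re.Pattern, url: str) -> bool:
    """Return True if the ignore pattern matches the URL."""
    return pattern.search(url) is not None


old_results = {url: is_ignored(OLD_PATTERN, url) for url in URLS}
new_results = {url: is_ignored(NEW_PATTERN, url) for url in URLS}

for url in URLS:
    print(f"{url}: old={old_results[url]} new={new_results[url]}")
```

Under these assumptions the original pattern already covers subpaths of `https://arize.com` because it is only anchored at the start; the broadened pattern additionally covers a single subdomain label such as `www` or `phoenix`.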

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eff57c1 and 2c29ac6.

📒 Files selected for processing (1)

ci/markdown-link-check-config.json (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*

⚙️ CodeRabbit configuration file

Files:

ci/markdown-link-check-config.json

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: CI Pipeline / Check

🔇 Additional comments (1)

ci/markdown-link-check-config.json (1)
17-18: Fix JSON syntax: add missing comma between objects

Without a comma after the MySQL pattern object, this JSON is invalid and will break the link checker in CI.

Apply this diff:
-        }
+        },
Likely an incorrect or invalid review comment.

dagardner-nv · 2025-09-24T20:31:10Z

/merge

dagardner-nv added 21 commits September 23, 2025 12:27

Test doesn't hang, it just takes 4 minutes to run

57d7248

Signed-off-by: David Gardner <dagardner@nvidia.com>

Indicate that this test needs an nvidia api key

401c378

Signed-off-by: David Gardner <dagardner@nvidia.com>

Only run the first item in the input dataset, takes the runtime of this test from 4 minutes down to 40 seconds

8c79033

Signed-off-by: David Gardner <dagardner@nvidia.com>

Fix mis-formatting applies to the CSV causing parse errors

4e0389d

Signed-off-by: David Gardner <dagardner@nvidia.com>

Re-enable tests, create a fixture for the df_path

43bfc4d

Signed-off-by: David Gardner <dagardner@nvidia.com>

Add phoenix service to gitlab tests, alphabatize services

7da3848

Signed-off-by: David Gardner <dagardner@nvidia.com>

don't log a warning if the source and destination type are the same

c3332e5

Signed-off-by: David Gardner <dagardner@nvidia.com>

Lower the timeout, set max_concurrency to 1 to avoid being rate limited

9533159

Signed-off-by: David Gardner <dagardner@nvidia.com>

Updated test from @willkill07

6ec1723

Signed-off-by: David Gardner <dagardner@nvidia.com>

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into david-fix-example-tests

7190361

Signed-off-by: David Gardner <dagardner@nvidia.com>

Fix merge error

c83673f

Signed-off-by: David Gardner <dagardner@nvidia.com>

Return early if the types are the same

298b0e4

Signed-off-by: David Gardner <dagardner@nvidia.com>

Post-merge restore updated test from @willkill07

0ce0dc6

Signed-off-by: David Gardner <dagardner@nvidia.com>

Document setting eval.general.max_concurrency when being rate limited

51e292a

Signed-off-by: David Gardner <dagardner@nvidia.com>

Avoid calling isinstance when the type is typing.Annotated

ddc6b53

Signed-off-by: David Gardner <dagardner@nvidia.com>

Re-enable test, update import path for the config

b96214a

Signed-off-by: David Gardner <dagardner@nvidia.com>

Apply alternate fix for NVIDIA#845 from @willkill07

dd33de5

Signed-off-by: David Gardner <dagardner@nvidia.com>

lint fixes

f36b982

Signed-off-by: David Gardner <dagardner@nvidia.com>

Temp hack:

77985f9

Signed-off-by: David Gardner <dagardner@nvidia.com>

Add fixture to check if docker is available

42e2958

Signed-off-by: David Gardner <dagardner@nvidia.com>

Revert "Temp hack:"

218ecdb

This reverts commit 77985f9.

Signed-off-by: David Gardner <dagardner@nvidia.com>

dagardner-nv self-assigned this Sep 24, 2025

dagardner-nv added the bug and non-breaking labels Sep 24, 2025

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into david-fix-example-tests

17bbe8e

Signed-off-by: David Gardner <dagardner@nvidia.com>

dagardner-nv marked this pull request as ready for review September 24, 2025 18:31

dagardner-nv requested a review from a team as a code owner September 24, 2025 18:31

coderabbitai bot added the breaking label Sep 24, 2025

coderabbitai bot reviewed Sep 24, 2025

examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py (outdated)

willkill07 removed the breaking label Sep 24, 2025

willkill07 approved these changes Sep 24, 2025

Update examples/object_store/user_report/tests/test_objext_store_example_user_report_tool.py

eff57c1

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Will Killian <2007799+willkill07@users.noreply.github.com>

coderabbitai bot reviewed Sep 24, 2025

dagardner-nv added 3 commits September 24, 2025 12:49

Use the same key as the put

7395aec

Signed-off-by: David Gardner <dagardner@nvidia.com>

Merge branch 'david-fix-example-tests' of github.com:dagardner-nv/AIQtoolkit into david-fix-example-tests

61e08e4

Signed-off-by: David Gardner <dagardner@nvidia.com>

arize.com is now returning 403 errors to our link checker

2c29ac6

Signed-off-by: David Gardner <dagardner@nvidia.com>

coderabbitai bot reviewed Sep 24, 2025

rapids-bot bot merged commit 424bc38 into NVIDIA:develop Sep 24, 2025
17 checks passed

dagardner-nv deleted the david-fix-example-tests branch September 24, 2025 20:31

rapids-bot[bot] merged 26 commits into NVIDIA:develop from dagardner-nv:david-fix-example-tests
