
Fix the test_unified_api_server integration tests#804

Merged
rapids-bot[bot] merged 9 commits into NVIDIA:develop from dagardner-nv:david-fix-broken-tests
Sep 16, 2025

Conversation

@dagardner-nv
Contributor

@dagardner-nv dagardner-nv commented Sep 16, 2025

Description

  • Also includes improved error reporting

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • Chores

    • Nightly Slack test reports now include clearer summaries, human-readable test names, and a threaded detailed failure breakdown with improved formatting.
    • Slack reporting is run only during scheduled nightly jobs.
  • Tests

    • Test config now supports selecting the config file via environment variables and improves isolation by using function-scoped fixtures.

Signed-off-by: David Gardner <dagardner@nvidia.com>
This reverts commit c619133.

Signed-off-by: David Gardner <dagardner@nvidia.com>
…avid-fix-broken-tests

Signed-off-by: David Gardner <dagardner@nvidia.com>
Only install slack-sdk if we need it

Signed-off-by: David Gardner <dagardner@nvidia.com>
@dagardner-nv dagardner-nv self-assigned this Sep 16, 2025
@dagardner-nv dagardner-nv requested a review from a team as a code owner September 16, 2025 22:51
@dagardner-nv dagardner-nv added bug Something isn't working non-breaking Non-breaking change labels Sep 16, 2025
@coderabbitai

coderabbitai bot commented Sep 16, 2025

Walkthrough

Refactors Slack test reporting to build structured messages (including per-failure details), changes parse_junit to return detailed failed_tests, posts failures in a threaded Slack message when present, moves Slack SDK install into the nightly CI path, and updates a NAT server test fixture to set config via NAT_CONFIG_FILE and revert fixture to function scope.

Changes

Cohort / File(s) Summary
Slack test results reporting refactor
ci/scripts/gitlab/report_test_results.py
Adds ReportMessages NamedTuple; new get_testcase_name; parse_junit now returns a dict including failed_tests; adds build_messages to produce plain_text, blocks, and optional failure_text/failure_blocks; main flow posts a primary Slack message and, if failures exist, a threaded message with failure details.
CI nightly Slack dependency install
ci/scripts/gitlab/tests.sh
Removes unconditional top-level Slack SDK install; moves pip install of Slack SDK inside the CI_CRON_NIGHTLY == 1 branch so SDK is installed only for nightly runs before reporting.
NAT server test fixture env-based config
tests/nat/server/test_unified_api_server.py
Adds os import; fixture config now depends on restore_environ, sets NAT_CONFIG_FILE env var to the config path, and fixture scope changed from session to default function scope; server_config signature updated accordingly.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CI as CI Job
  participant J as parse_junit
  participant B as build_messages
  participant S as Slack API

  CI->>J: Parse JUnit XML
  J-->>CI: {num_tests, num_failures, num_errors, num_skipped, failed_tests}
  CI->>B: Build messages from junit_data, coverage_data
  B-->>CI: ReportMessages {plain_text, blocks, failure_text?, failure_blocks?}
  CI->>S: Post main message (blocks, link_names if failures)
  alt Failures present
    CI->>S: Post thread reply with failure_blocks or failure_text
  end
sequenceDiagram
  autonumber
  participant R as CI Runner
  participant T as ci/scripts/gitlab/tests.sh
  participant Py as report_test_results.py
  participant S as Slack API

  R->>T: Execute tests.sh
  T->>T: Run tests
  alt CI_CRON_NIGHTLY == 1
    T->>T: Install Slack SDK (pip)
    T->>Py: Generate & send report
    Py->>S: Post Slack message(s)
  else Not nightly
    T-->>R: Skip Slack install/report
  end
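The nightly gate in the second diagram can be sketched in shell roughly as follows; the echoes stand in for the real install and report commands, which live in ci/scripts/gitlab/tests.sh.

```shell
report_nightly() {
    # Only scheduled nightly pipelines install the Slack SDK and send a report;
    # regular runs skip both, saving CI time and an unconditional dependency.
    if [ "${CI_CRON_NIGHTLY:-0}" = "1" ]; then
        echo "nightly: install slack-sdk and send report"
        # pip install slack-sdk
        # python ci/scripts/gitlab/report_test_results.py ...
    else
        echo "not nightly: skipping Slack report"
    fi
}
```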

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "Fix the test_unified_api_server integration tests" directly and concisely describes the main change in the diff, which updates tests/nat/server/test_unified_api_server.py and related fixtures to restore passing integration tests. It is written in the imperative mood and is under the ~72 character guideline. The title is specific rather than vague and aligns with the PR objectives and raw_summary.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ci/scripts/gitlab/report_test_results.py (1)

38-75: Handle the JUnit root being either `<testsuites>` or `<testsuite>`, and improve the failure-message fallback.

pytest's --junit-xml can produce a bare `<testsuite>` root; root.findall('testsuite') will be empty in that case and the parser reports zeros. Also fall back to the element text when a failure/error has no message attribute.

File: ci/scripts/gitlab/report_test_results.py
Lines: 38-75

-def parse_junit(junit_file: str) -> dict[str, typing.Any]:
+def parse_junit(junit_file: str) -> dict[str, typing.Any]:
     tree = ET.parse(junit_file)
     root = tree.getroot()
@@
-    failed_tests = []
-    for testsuite in root.findall('testsuite'):
+    failed_tests: list[dict[str, str]] = []
+    # Support both <testsuite> (pytest) and <testsuites> roots
+    testsuite_nodes = [root] if root.tag == "testsuite" else root.findall("testsuite")
+    for testsuite in testsuite_nodes:
         total_tests += int(testsuite.attrib.get('tests', 0))
-        num_failures = int(testsuite.attrib.get('failures', 0))
-        num_errors = int(testsuite.attrib.get('errors', 0))
+        num_failures = int(testsuite.attrib.get('failures', 0))
+        num_errors = int(testsuite.attrib.get('errors', 0))
         total_failures += num_failures
         total_errors += num_errors
         total_skipped += int(testsuite.attrib.get('skipped', 0))
 
-        if (num_failures + num_errors) > 0:
-            for testcase in testsuite.findall('testcase'):
-                failure = testcase.find('failure')
-                error = testcase.find('error')
-
-                for failed_test_tag in (failure, error):
-                    if failed_test_tag is not None:
-                        failed_info = {
-                            "test_name": get_testcase_name(testcase),
-                            "message": failed_test_tag.attrib.get('message', '').strip()
-                        }
-                        failed_tests.append(failed_info)
+        if (num_failures + num_errors) > 0:
+            for testcase in testsuite.findall('testcase'):
+                failure = testcase.find('failure')
+                error = testcase.find('error')
+                for failed_test_tag in (failure, error):
+                    if failed_test_tag is not None:
+                        message = (failed_test_tag.attrib.get('message') or failed_test_tag.text or '').strip()
+                        failed_tests.append({
+                            "test_name": get_testcase_name(testcase),
+                            "message": message,
+                        })
@@
-        "num_skipped": total_skipped,
-        "failed_tests": failed_tests
+        "num_skipped": total_skipped,
+        "failed_tests": failed_tests,
     }
🧹 Nitpick comments (2)
ci/scripts/gitlab/report_test_results.py (2)

101-149: Blocks/threaded failure reporting looks good; consider usergroup mentions.

Slack usergroup mentions typically require <!subteam^ID|alias>. If you need reliable paging of @nat-core-devs, consider injecting a SLACK_USERGROUP_ID_NAT_CORE_DEVS env var and formatting accordingly; keep link_names=True for channel/here.


156-162: Write env-var error to stderr for CI visibility.

Minor UX: send the error to stderr to make failures stand out in logs.

Apply this diff:

-    except KeyError:
-        print('Error: Set environment variables SLACK_TOKEN and SLACK_CHANNEL to post to slack.')
+    except KeyError:
+        print('Error: Set environment variables SLACK_TOKEN and SLACK_CHANNEL to post to slack.', file=sys.stderr)
         return 1
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7276c00 and 673c235.

📒 Files selected for processing (3)
  • ci/scripts/gitlab/report_test_results.py (5 hunks)
  • ci/scripts/gitlab/tests.sh (1 hunks)
  • tests/nat/server/test_unified_api_server.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
ci/scripts/**/*.sh

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

CI shell/utility scripts must live under ci/scripts/

Files:

  • ci/scripts/gitlab/tests.sh
**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}: Every file must start with the standard SPDX Apache-2.0 header; keep copyright years up‑to‑date
All source files must include the SPDX Apache‑2.0 header; do not bypass CI header checks

Files:

  • ci/scripts/gitlab/tests.sh
  • tests/nat/server/test_unified_api_server.py
  • ci/scripts/gitlab/report_test_results.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • ci/scripts/gitlab/tests.sh
  • tests/nat/server/test_unified_api_server.py
  • ci/scripts/gitlab/report_test_results.py
tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Unit tests must live under tests/ and use configured markers (e2e, integration, etc.)

Files:

  • tests/nat/server/test_unified_api_server.py

⚙️ CodeRabbit configuration file

tests/**/*.py: - Ensure that tests are comprehensive, cover edge cases, and validate the functionality of the code. - Test functions should be named using the test_ prefix, using snake_case. - Any frequently repeated code should be extracted into pytest fixtures. - Pytest fixtures should define the name argument when applying the pytest.fixture decorator. The fixture
function being decorated should be named using the fixture_ prefix, using snake_case. Example:
@pytest.fixture(name="my_fixture")
def fixture_my_fixture():
pass

Files:

  • tests/nat/server/test_unified_api_server.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.py: Follow PEP 8/20 style; format with yapf (column_limit=120) and use 4-space indentation; end files with a single newline
Run ruff (ruff check --fix) per pyproject.toml; fix warnings unless explicitly ignored; ruff is linter-only
Use snake_case for functions/variables, PascalCase for classes, and UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: preserve stack traces and avoid duplicate logging
When re-raising exceptions, use bare raise and log with logger.error(), not logger.exception()
When catching and not re-raising, log with logger.exception() to capture stack trace
Validate and sanitize all user input; prefer httpx with SSL verification and follow OWASP Top‑10
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile/mprof; cache with functools.lru_cache or external cache; leverage NumPy vectorization when beneficial

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("<name>", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.<framework>) and call the framework's method (e.g., ainvoke, achat, call).

Files:

  • tests/nat/server/test_unified_api_server.py
  • ci/scripts/gitlab/report_test_results.py
**/tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/tests/**/*.py: Test functions must use the test_ prefix and snake_case
Extract repeated test code into pytest fixtures; fixtures should set name=... in @pytest.fixture and functions named with fixture_ prefix
Mark expensive tests with @pytest.mark.slow or @pytest.mark.integration
Use pytest with pytest-asyncio for async code; mock external services with pytest_httpserver or unittest.mock

Files:

  • tests/nat/server/test_unified_api_server.py
**/*.{py,md}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Never hard‑code version numbers in code or docs; versions are derived by setuptools‑scm

Files:

  • tests/nat/server/test_unified_api_server.py
  • ci/scripts/gitlab/report_test_results.py
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • tests/nat/server/test_unified_api_server.py
  • ci/scripts/gitlab/report_test_results.py
🪛 Ruff (0.12.2)
tests/nat/server/test_unified_api_server.py

289-289: Unused function argument: restore_environ

(ARG001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CI Pipeline / Check
🔇 Additional comments (5)
tests/nat/server/test_unified_api_server.py (2)

18-18: Import looks necessary for env-driven config.

No issues with adding os here.


288-295: Fix ruff ARG001: unused fixture arg; narrow return type.

Rename the unused fixture argument to _restore_environ to silence ARG001 and tighten the return annotation to Config (ensure Config is imported). Alternatively keep the name and add # noqa: ARG001.

-@pytest.fixture(name="config")
-def server_config(restore_environ, file_path: str = "tests/nat/server/config.yml") -> BaseModel:
+@pytest.fixture(name="config")
+def server_config(_restore_environ, file_path: str = "tests/nat/server/config.yml") -> Config:
@@
-    os.environ["NAT_CONFIG_FILE"] = file_path
+    os.environ["NAT_CONFIG_FILE"] = file_path
  • Note: ruff also reports TRY301 / TRY003 at tests/nat/server/test_unified_api_server.py:658–662 — consider extracting the raise into a helper or moving long messages into the exception class.
ci/scripts/gitlab/report_test_results.py (2)

34-36: Helper LGTM.

The testcase name helper is concise and readable.


170-182: Threaded failure details: solid approach.

Posting verbose failures in a thread avoids channel spam and keeps summaries readable.

ci/scripts/gitlab/tests.sh (1)

48-51: Good: gate Slack SDK install to nightly.

This reduces CI time and avoids unnecessary deps on regular runs.

Signed-off-by: David Gardner <dagardner@nvidia.com>
@dagardner-nv
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 16d7552 into NVIDIA:develop Sep 16, 2025
16 of 17 checks passed
@dagardner-nv dagardner-nv deleted the david-fix-broken-tests branch September 16, 2025 23:18

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
ci/scripts/gitlab/report_test_results.py (2)

30-31: Consider using Optional[list[...]] for consistency with typing conventions.

The use of | None syntax for optional types requires Python 3.10+. For broader compatibility and consistency with typical typing conventions, consider using Optional[list[...]].

-    failure_text: list[str] | None
-    failure_blocks: list[dict] | None
+    failure_text: Optional[list[str]]
+    failure_blocks: Optional[list[dict]]

Don't forget to import Optional from typing:

-import typing
+from typing import Any, NamedTuple, Optional

135-143: Remove redundant divider after the last failed test.

The current implementation adds a divider after every failed test, including the last one. This creates an unnecessary divider before the job URL link (or at the end if there's no URL).

Apply this fix to only add dividers between tests:

-        for failed_test in failed_tests:
+        for i, failed_test in enumerate(failed_tests):
             test_name = failed_test['test_name']
             message = failed_test['message']
             add_text(f"`{test_name}`\n```\n{message}\n```", failure_blocks, failure_text)
-            failure_text.append("---\n")
-            failure_blocks.append({"type": "divider"})
+            if i < len(failed_tests) - 1:
+                failure_text.append("---\n")
+                failure_blocks.append({"type": "divider"})
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 673c235 and 7650312.

📒 Files selected for processing (1)
  • ci/scripts/gitlab/report_test_results.py (5 hunks)
🔇 Additional comments (4)
ci/scripts/gitlab/report_test_results.py (4)

34-35: LGTM! Descriptive test name formatting.

Good defensive programming with the use of .get() method to handle missing attributes gracefully.


38-75: Well-structured refactoring with improved error reporting.

The function now returns detailed failure information alongside the summary counts, which enables better error reporting downstream. The logic for collecting failed tests is clear and handles both failure and error elements appropriately.


176-176: Good use of link_names for failure notifications.

Setting link_names=True when failures are present ensures that @nat-core-devs mentions in the message are properly resolved and team members get notified.


178-184: LGTM! Excellent threading strategy for failure details.

Great design decision to post detailed failure information in a thread. This prevents channel spam while maintaining visibility of the main summary message.
