Report nightly test results by dagardner-nv · Pull Request #797 · NVIDIA/NeMo-Agent-Toolkit

dagardner-nv · 2025-09-16T17:40:31Z

By Submitting this PR I confirm:

I am familiar with the Contributing Guidelines.
We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
When the PR is ready for review, new or existing tests cover these changes.
When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

New Features
- Nightly Slack notifications summarizing test results and coverage, with clear highlighting for failures/errors and a date-based summary.
Chores
- CI pipeline updated to post results on nightly runs.
- Automatic installation of the Slack SDK during CI.
- Enhanced logging around installation and reporting steps.
- Robust error handling: reporting failures are logged and cause the job to fail, ensuring visibility.

Signed-off-by: David Gardner <dagardner@nvidia.com>

coderabbitai · 2025-09-16T17:40:44Z

Walkthrough

Adds a new Python script to parse JUnit and coverage XML and post nightly test summaries to Slack. Modifies the CI test script to install Slack SDK and, on nightly runs, invoke the reporting script with JUnit and coverage paths, propagating its exit code.

Changes

| Cohort / File(s) | Summary of changes |
|---|---|
| Slack reporting script `ci/scripts/gitlab/report_test_results.py` | New CLI tool to parse JUnit and coverage XML, format a Slack message (text and blocks), and post via slack_sdk using SLACK_TOKEN and SLACK_CHANNEL. Includes helper functions: parse_junit, parse_coverage, get_error_string, text_to_block, add_text, and main with exit codes. |
| CI nightly integration `ci/scripts/gitlab/tests.sh` | Installs slack-sdk, adds nightly-only reporting step (checks CI_CRON_NIGHTLY), calls report_test_results.py with report paths, logs actions, and propagates non-zero exit from the reporter. |
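
The script itself is not reproduced on this page. As a rough sketch of the two parsing helpers named above, assuming pytest-style JUnit XML (each `testsuite` element carries `tests`, `failures`, `errors`, and `skipped` attributes) and Cobertura-style coverage XML (a `line-rate` attribute on the root element), the parsing might look like the following. The return shapes, and the choice to take XML strings rather than file paths, are assumptions made for illustration and may differ from the merged code.

```python
import xml.etree.ElementTree as ET


def parse_junit(junit_xml: str) -> dict[str, int]:
    """Sum the test counters over every <testsuite> element (assumed pytest layout)."""
    root = ET.fromstring(junit_xml)
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    # root.iter() also yields the root itself, so a bare <testsuite> root is handled.
    for suite in root.iter("testsuite"):
        for key in totals:
            totals[key] += int(suite.get(key, "0"))
    return totals


def parse_coverage(coverage_xml: str) -> float:
    """Return line coverage as a percentage from the root's line-rate attribute."""
    root = ET.fromstring(coverage_xml)
    return float(root.get("line-rate", "0")) * 100.0
```

Calling `parse_junit` on a report with one suite returns that suite's counters; with several suites the counters are added together.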

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CI as CI Pipeline
  participant Tests as PyTest
  participant Reporter as report_test_results.py
  participant Slack as Slack API

  CI->>Tests: Run tests (produce JUnit & coverage XML)
  alt Nightly run (CI_CRON_NIGHTLY=1)
    CI->>CI: pip install slack-sdk
    CI->>Reporter: Invoke with junit_file, coverage_file
    Reporter->>Reporter: Parse JUnit (totals/errors)
    Reporter->>Reporter: Parse coverage (line-rate)
    Reporter->>Slack: Post message (text + blocks)
    Slack-->>Reporter: API response
    Reporter-->>CI: Exit 0/1
  else Non-nightly
    CI->>CI: Skip Slack reporting
  end
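
The posting step in the diagram above could be sketched as follows. This is not the merged script: `build_payload` and `post_report` are hypothetical names, and the message layout is an assumption. The only library calls used are `slack_sdk.WebClient.chat_postMessage` and `slack_sdk.errors.SlackApiError`.

```python
from typing import Any


def build_payload(channel: str, summary: str, details: list[str]) -> dict[str, Any]:
    """Build keyword arguments for a Slack chat.postMessage call."""
    lines = [summary, *details]
    blocks = [{"type": "section", "text": {"type": "mrkdwn", "text": line}} for line in lines]
    # The plain-text copy serves notifications and clients that do not render blocks.
    return {"channel": channel, "text": "\n".join(lines), "blocks": blocks}


def post_report(token: str, payload: dict[str, Any]) -> int:
    """Post the payload and return a process exit code (0 on success, 1 on failure)."""
    # Imported lazily so build_payload can be used without slack-sdk installed.
    from slack_sdk import WebClient
    from slack_sdk.errors import SlackApiError

    try:
        WebClient(token=token).chat_postMessage(**payload)
    except SlackApiError as exc:
        print(f"Slack API error: {exc.response['error']}")
        return 1
    return 0
```

Returning an integer rather than raising keeps the "Exit 0/1" contract from the diagram explicit, so the calling shell script only needs to check the exit status.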

sequenceDiagram
  participant CI as CI Pipeline
  participant Reporter as report_test_results.py
  participant Env as Environment

  CI->>Reporter: Start
  Reporter->>Env: Read SLACK_TOKEN, SLACK_CHANNEL
  alt Missing env vars
    Reporter-->>CI: Print error, exit 1
  else Present
    Reporter-->>CI: Continue (post to Slack)
  end
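
A minimal sketch of the environment check shown in this diagram, assuming the reporter treats an unset or empty variable as missing. The helper name `read_slack_config` and the `env` parameter are hypothetical and exist only to make the sketch testable.

```python
import os
import sys
from collections.abc import Mapping
from typing import Optional


def read_slack_config(env: Mapping[str, str] = os.environ) -> Optional[tuple[str, str]]:
    """Return (token, channel), or None after reporting which variables are missing."""
    missing = [name for name in ("SLACK_TOKEN", "SLACK_CHANNEL") if not env.get(name)]
    if missing:
        print(f"Missing required environment variables: {', '.join(missing)}", file=sys.stderr)
        return None
    return env["SLACK_TOKEN"], env["SLACK_CHANNEL"]


def main(env: Mapping[str, str] = os.environ) -> int:
    """Return the exit code: 1 when configuration is missing, 0 otherwise."""
    config = read_slack_config(env)
    if config is None:
        return 1  # surfaces as a non-zero exit code in CI
    # Parsing the reports and posting to Slack would happen here.
    return 0
```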

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

feature request

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title "Report nightly test results" succinctly and accurately summarizes the main change (adding a Slack-reporting script and a nightly CI invocation), is written in the imperative mood, and is well under the 72-character limit. It directly matches the changes in ci/scripts/gitlab/report_test_results.py and the nightly reporting addition in ci/scripts/gitlab/tests.sh. |


Comment `@coderabbitai help` to get the list of available commands and usage tips.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (4)

ci/scripts/gitlab/report_test_results.py (3)

84-87: Fail fast on missing/invalid XML with preserved stack trace.

Parsing errors should be logged with stack trace and return non‑zero.

Apply this diff:

-    args = parser.parse_args()
-    junit_data = parse_junit(args.junit_file)
-    coverage_data = parse_coverage(args.coverage_file)
+    args = parser.parse_args()
+    try:
+        junit_data = parse_junit(args.junit_file)
+        coverage_data = parse_coverage(args.coverage_file)
+    except (FileNotFoundError, ET.ParseError):
+        logger.exception(
+            "Failed to parse reports (junit=%s, coverage=%s)",
+            args.junit_file, args.coverage_file
+        )
+        return 1

63-70: Complete type hints and annotate main().

Project guideline: type hints on all params/returns.

Apply this diff:

-def text_to_block(text: str) -> dict:
+def text_to_block(text: str) -> dict[str, Any]:
     return {"type": "section", "text": {"type": "mrkdwn", "text": text}}
 
-
-def add_text(text: str, blocks: list[dict], plain_text: list[str]) -> None:
+def add_text(text: str, blocks: list[dict[str, Any]], plain_text: list[str]) -> None:
     blocks.append(text_to_block(text))
     plain_text.append(text)
 
-
-def main():
+def main() -> int:

Also applies to: 72-73
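
The suggested annotations rely on `Any`, which has to be imported from `typing`; whether the script already imports it is not visible in the diff. A self-contained version of the two helpers with the suggested hints applied might read as follows (the docstrings are added here for illustration):

```python
from typing import Any


def text_to_block(text: str) -> dict[str, Any]:
    """Wrap markdown text in a Slack section block."""
    return {"type": "section", "text": {"type": "mrkdwn", "text": text}}


def add_text(text: str, blocks: list[dict[str, Any]], plain_text: list[str]) -> None:
    """Append the text to both the rich blocks and the plain-text fallback."""
    blocks.append(text_to_block(text))
    plain_text.append(text)
```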

96-106: Reliable mention for on-call/usergroup (optional).

Slack usergroup mentions require the subteam ID; link_names may not ping “@nat-core-devs”. Consider an env‑driven mention string like “<!subteam^SXXXX|@nat-core-devs>”.

Example tweak:

-    link_names = False
-    if (num_errors + num_failures) > 0:
-        link_names = True
-        formatted_summary_line = f"@nat-core-devs :rotating_light: {summary_line}"
+    link_names = False
+    if (num_errors + num_failures) > 0:
+        mention = os.environ.get("SLACK_ALERT_MENTION", "@nat-core-devs")
+        link_names = True
+        formatted_summary_line = f"{mention} :rotating_light: {summary_line}"
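
To show how the tweak behaves in isolation, here is a small sketch that wraps it in a function. `format_summary` and its `env` parameter are hypothetical, `SLACK_ALERT_MENTION` is the variable proposed in this comment rather than one the script is known to read, and returning the unmodified summary line on a clean run is an assumption.

```python
import os
from collections.abc import Mapping


def format_summary(summary_line: str, num_errors: int, num_failures: int,
                   env: Mapping[str, str] = os.environ) -> tuple[str, bool]:
    """Return (formatted summary, link_names flag) for the Slack message."""
    if (num_errors + num_failures) == 0:
        return summary_line, False
    # SLACK_ALERT_MENTION is the hypothetical variable from the suggestion above.
    # A usergroup ping would be configured as "<!subteam^SXXXX>" with the real subteam ID.
    mention = env.get("SLACK_ALERT_MENTION", "@nat-core-devs")
    return f"{mention} :rotating_light: {summary_line}", True
```

Keeping the mention in the environment means the subteam ID can change without a code change, at the cost of one more CI variable to maintain.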

ci/scripts/gitlab/tests.sh (1)

25-28: Install Slack SDK only on nightly runs; add fallback if uv is absent.

Reduces CI time and flakiness on regular jobs; improves portability.

Apply this diff:

-# Since this dependency is specific to only this script, we will just install it here
-rapids-logger "Installing slack-sdk"
-uv pip install "slack-sdk~=3.36"
+if [ "${CI_CRON_NIGHTLY}" == "1" ]; then
+    # Install only when needed
+    rapids-logger "Installing slack-sdk"
+    if command -v uv >/dev/null 2>&1; then
+        uv pip install "slack-sdk~=3.36"
+    else
+        python -m pip install --upgrade "slack-sdk~=3.36"
+    fi
+fi

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f631272 and e5424e2.

📒 Files selected for processing (2)

ci/scripts/gitlab/report_test_results.py (1 hunks)
ci/scripts/gitlab/tests.sh (2 hunks)

🧰 Additional context used

📓 Path-based instructions (6)

ci/scripts/**/*.sh

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

CI shell/utility scripts must live under ci/scripts/

Files:

ci/scripts/gitlab/tests.sh

**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{py,sh,md,yml,yaml,toml,ini,json,ipynb,txt,rst}: Every file must start with the standard SPDX Apache-2.0 header; keep copyright years up‑to‑date
All source files must include the SPDX Apache‑2.0 header; do not bypass CI header checks

Files:

ci/scripts/gitlab/tests.sh
ci/scripts/gitlab/report_test_results.py

**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards.
- For Python code, follow PEP 20 and PEP 8 for style guidelines.
- Check for security vulnerabilities and potential issues.
- Python methods should use type hints for all parameters and return values.
Example:
def my_function(param1: int, param2: str) -> bool:
    pass
- For Python exception handling, ensure proper stack trace preservation:
  - When re-raising exceptions: use bare raise statements to maintain the original stack trace, and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
  - When catching and logging exceptions without re-raising: always use logger.exception() to capture the full stack trace information.

Documentation Review Instructions
- Verify that documentation and comments are clear and comprehensive.
- Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
- Verify that the documentation doesn't contain any offensive or outdated terms.
- Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc.
- All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0, and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.

Files:

ci/scripts/gitlab/tests.sh
ci/scripts/gitlab/report_test_results.py
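
The two exception-handling rules in the instructions above can be illustrated with a short sketch. The function names and the file-reading scenario are hypothetical; only the logging pattern is taken from the guideline.

```python
import logging

logger = logging.getLogger(__name__)


def load_strict(path: str) -> str:
    """Re-raising: log with logger.error() and use a bare raise."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        # No traceback is logged here; whoever finally handles the error logs it once.
        logger.error("Could not read %s", path)
        raise


def load_or_default(path: str, default: str = "") -> str:
    """Not re-raising: logger.exception() records the full stack trace."""
    try:
        return load_strict(path)
    except OSError:
        logger.exception("Falling back to default for %s", path)
        return default
```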

**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.py: Follow PEP 8/20 style; format with yapf (column_limit=120) and use 4-space indentation; end files with a single newline
Run ruff (ruff check --fix) per pyproject.toml; fix warnings unless explicitly ignored; ruff is linter-only
Use snake_case for functions/variables, PascalCase for classes, and UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: preserve stack traces and avoid duplicate logging
When re-raising exceptions, use bare raise and log with logger.error(), not logger.exception()
When catching and not re-raising, log with logger.exception() to capture stack trace
Validate and sanitize all user input; prefer httpx with SSL verification and follow OWASP Top‑10
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile/mprof; cache with functools.lru_cache or external cache; leverage NumPy vectorization when beneficial

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

Files:

ci/scripts/gitlab/report_test_results.py

**/*.{py,md}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Never hard‑code version numbers in code or docs; versions are derived by setuptools‑scm

Files:

ci/scripts/gitlab/report_test_results.py

**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

ci/scripts/gitlab/report_test_results.py

🪛 Ruff (0.12.2)

ci/scripts/gitlab/report_test_results.py

27-27: Using xml to parse untrusted data is known to be vulnerable to XML attacks; use defusedxml equivalents

(S314)

50-50: Using xml to parse untrusted data is known to be vulnerable to XML attacks; use defusedxml equivalents

(S314)
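
One way to address S314 is to swap the parser import for its `defusedxml` counterpart, which exposes the same `fromstring` and `parse` functions. The sketch below is an assumption about how that could look, not a change made in this PR; the fallback import exists only so the example runs where `defusedxml` is not installed, and `read_line_rate` is a hypothetical helper.

```python
try:
    # defusedxml mirrors the ElementTree parsing functions with unsafe features disabled.
    from defusedxml.ElementTree import fromstring
except ImportError:
    # Fallback only so this sketch runs without the extra dependency installed.
    from xml.etree.ElementTree import fromstring


def read_line_rate(coverage_xml: str) -> float:
    """Return the root element's line-rate attribute as a float."""
    root = fromstring(coverage_xml)
    return float(root.get("line-rate", "0"))
```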

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: CI Pipeline / Check

🔇 Additional comments (3)

ci/scripts/gitlab/report_test_results.py (1)

1-16: Nice, clear reporting flow and accessible plaintext.

Header, a11y plaintext + blocks, and zero/non‑zero exit semantics look good.

Also applies to: 91-95, 116-123

ci/scripts/gitlab/tests.sh (2)

50-58: Confirm desired failure mode when Slack posting fails.

Current logic fails the job on reporting errors, even if tests passed. Is that intentional? If not, consider logging the failure but still exiting with PYTEST_RESULTS.

Option if you want “don’t fail build on reporting failure”:

-       if [ ${REPORT_RESULT} -ne 0 ]; then
-              rapids-logger "Failed to report test results to Slack"
-              exit ${REPORT_RESULT}
-       fi
+       if [ ${REPORT_RESULT} -ne 0 ]; then
+              rapids-logger "Failed to report test results to Slack (non-blocking)"
+       fi

24-24: CI path/layout, headers, and logging usage look good.

SPDX header, location under ci/scripts/, and rapids-logger usage are consistent.

ci/scripts/gitlab/report_test_results.py

dagardner-nv · 2025-09-16T18:36:17Z

/merge

Report nightly test results

e5424e2

Signed-off-by: David Gardner <dagardner@nvidia.com>

dagardner-nv self-assigned this Sep 16, 2025

dagardner-nv requested a review from a team as a code owner September 16, 2025 17:40

dagardner-nv added the improvement (Improvement to existing functionality) and non-breaking (Non-breaking change) labels Sep 16, 2025

coderabbitai bot added the feature request (New feature or request) label Sep 16, 2025

coderabbitai bot reviewed Sep 16, 2025

dagardner-nv removed the feature request (New feature or request) label Sep 16, 2025

willkill07 approved these changes Sep 16, 2025

rapids-bot bot merged commit 6e5a7e1 into NVIDIA:develop Sep 16, 2025
17 checks passed

dagardner-nv deleted the david-nightly-test-result-reports branch September 16, 2025 18:36
