Add firewatch-granular-analysis step registry entry#80873

Open
amp-rh wants to merge 2 commits into openshift:main from amp-rh:firewatch-granular-step

@amp-rh amp-rh commented Jun 22, 2026

Contributor

Summary

Adds a firewatch-granular-analysis step to the CI step registry that parses JUnit XML test artifacts and extracts granular metadata as Jira labels for consumption by firewatch-report-issues.

  • Extracts operator:, component:, and location: labels from failing test cases
  • Writes firewatch-additional-labels to ${SHARED_DIR} (read by firewatch-report-issues)
  • Writes firewatch-granular-data.json structured report as a CI artifact
  • Runs with best_effort: true so it cannot block existing jobs
  • Uses the firewatch base image (no new repo or container image needed)

Approach

This is a shell script with an inline Python snippet that uses xml.etree.ElementTree (stdlib, no pip deps). It replaces the approach in #80862, which required a new Go repository, container image, and CI pipeline.
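
A minimal sketch of the stdlib-only parsing described above, assuming a conventional JUnit layout (`<testcase>` elements with optional `<failure>` children). The function name `failing_testcases` and the sample XML are hypothetical and are not taken from the PR's script.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; real artifacts come from the job's JUnit output.
SAMPLE = """<testsuites>
  <testsuite name="e2e">
    <testcase classname="policy.governance" name="creates policy"/>
    <testcase classname="policy.propagator" name="propagates policy">
      <failure message="timed out">propagator_test.go:42: timed out</failure>
    </testcase>
  </testsuite>
</testsuites>"""


def failing_testcases(xml_text):
    """Return (classname, name, failure_text) for each <failure> under a testcase."""
    root = ET.fromstring(xml_text)
    results = []
    # iter() walks the whole tree, so <testsuites> and bare <testsuite> roots both work.
    for case in root.iter("testcase"):
        for failure in case.findall("failure"):
            # Combine the message attribute and the element body when present.
            text = " ".join(filter(None, [failure.get("message"), failure.text]))
            results.append((case.get("classname", ""), case.get("name", ""), text))
    return results


print(failing_testcases(SAMPLE))
```

Passing test cases contribute nothing, so an all-green run yields an empty list.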

Files

| File | Purpose |
| --- | --- |
| ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-commands.sh | JUnit XML parser and label extractor |
| ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-ref.yaml | Step registry ref definition |
| ci-operator/step-registry/firewatch/granular-analysis/OWNERS | Uses firewatch-owners (matches parent) |

Integration

Place in the post chain before firewatch-report-issues:

post:
  - ref: firewatch-granular-analysis
  - ref: firewatch-report-issues

The firewatch-report-issues step already checks for and consumes ${SHARED_DIR}/firewatch-additional-labels via the --additional-labels-file flag.

Supersedes

Supersedes #80862 (same functionality, simpler implementation).

Relates: INTEROP-9185

Summary by CodeRabbit

This PR adds a new firewatch-granular-analysis step to the OpenShift CI step registry. This step enhances the existing Firewatch integration by extracting granular metadata from failing test cases in JUnit XML artifacts.

What the step does:
The new step parses JUnit XML test result files and extracts three types of labels from failure information:

  • Operator identifiers - regex-matched patterns indicating which operator failed
  • Component identifiers - derived from test case class names
  • Location strings - go-location-like patterns from failure content

These extracted labels are written to ${SHARED_DIR}/firewatch-additional-labels for consumption by the existing firewatch-report-issues step, and also saved as firewatch-granular-data.json as a CI artifact for reporting purposes.
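
The three label types could be derived roughly as follows. This is a sketch only: the regular expressions, the cap `MAX_PER_KIND`, and the function name `extract_labels` are assumptions made for illustration, since the actual patterns used by the script are not shown in this conversation.

```python
import re

# Hypothetical patterns; the PR's real regexes may differ.
OPERATOR_RE = re.compile(r"\b([a-z0-9][a-z0-9-]*-operator)\b")
LOCATION_RE = re.compile(r"([\w./-]+\.go:\d+)")
MAX_PER_KIND = 5  # hypothetical cap on labels per kind


def extract_labels(classname, failure_text):
    """Build operator:, component:, and location: labels for one failing testcase."""
    labels = []
    operators = sorted(set(OPERATOR_RE.findall(failure_text)))[:MAX_PER_KIND]
    labels += [f"operator:{op}" for op in operators]
    if classname:
        # Normalise the classname so the label contains no spaces or dots.
        component = re.sub(r"[^a-z0-9]+", "-", classname.lower()).strip("-")
        labels.append(f"component:{component}")
    locations = sorted(set(LOCATION_RE.findall(failure_text)))[:MAX_PER_KIND]
    labels += [f"location:{loc}" for loc in locations]
    return labels


print(extract_labels(
    "policy.propagator",
    "cluster-logging-operator pod not ready at pkg/controller/reconcile.go:118",
))
```

Deduplicating with `set` and slicing after sorting keeps the output deterministic and bounded, which matters when labels end up on Jira tickets.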

Implementation details:

  • The step is implemented as a shell script with an embedded Python snippet using only the standard library
  • Runs with best_effort: true, ensuring failures won't block downstream jobs
  • Uses the existing firewatch base image, avoiding infrastructure overhead
  • Designed to be placed in the post chain immediately before firewatch-report-issues to provide enhanced issue metadata

Files added:

  • firewatch-granular-analysis-commands.sh: Contains the JUnit parser and label extraction logic
  • firewatch-granular-analysis-ref.yaml: Step registry definition with resource limits and configuration
  • OWNERS: References the firewatch-owners team for code ownership

This change enables more detailed Jira issue reporting by automatically categorizing test failures, improving the ability to route and track infrastructure issues across multiple components.

Adds a step that parses JUnit XML test artifacts and extracts granular
metadata (operator names, component names, file locations) as Jira labels.
Writes labels to ${SHARED_DIR}/firewatch-additional-labels for consumption
by firewatch-report-issues. Runs with best_effort: true.

Supersedes openshift#80862 with a simpler shell-script approach that eliminates the
need for a separate repository and container image.

Relates: INTEROP-9185

coderabbitai Bot commented Jun 22, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 45d4fbe7-72af-4520-9ce4-d44276c8e049

📥 Commits

Reviewing files that changed from the base of the PR and between 5818d3f and 8b6b0f7.

📒 Files selected for processing (1)
  • ci-operator/config/stolostron/policy-collection/stolostron-policy-collection-main__ocp4.22.yaml

Walkthrough

Adds a new CI step firewatch-granular-analysis consisting of three files: an OWNERS file assigning firewatch-owners as approvers and reviewers, a YAML step reference configuring the image, resources, and environment, and a Bash script with embedded Python that parses JUnit XML artifacts to extract failure labels and write output files to SHARED_DIR.

Changes

firewatch-granular-analysis CI step

Layer / File(s) Summary
Step definition and ownership
ci-operator/step-registry/firewatch/granular-analysis/OWNERS, ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-ref.yaml
Defines the step with firewatch:main image, 10m CPU / 100Mi memory, FIREWATCH_GRANULAR_ARTIFACT_SUBDIR env var (default ""), best_effort: true, and sets firewatch-owners as approvers and reviewers.
Command script implementation
ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-commands.sh
Bash entrypoint resolves artifact/output paths, discovers JUnit XMLs from junit/ or the top-level artifact dir, and handles the no-XML case by writing a zero-failure JSON report. Embedded Python iterates XMLs, counts <failure> elements per testcase, extracts capped sets of operators (regex), components (from classname/name), and locations (regex), writes firewatch-additional-labels only when failures exist, and always writes firewatch-granular-data.json. Post-processing Bash prints output file contents and exits with the Python exit code.
Workflow integration
ci-operator/config/stolostron/policy-collection/stolostron-policy-collection-main__ocp4.22.yaml
Adds the new step to the interop-opp-aws test workflow post-steps, positioned between the ipi-deprovision chain and firewatch-report-issues step.
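
The output behaviour summarised in the walkthrough (labels file only when failures exist, JSON report always) can be sketched as below. The one-label-per-line file format, the JSON keys, the choice of directory for the report, and the function name `write_outputs` are assumptions; this conversation does not specify them.

```python
import json
import os
import tempfile


def write_outputs(shared_dir, artifact_dir, failure_count, labels):
    """Write the labels file only on failures; always write the JSON report."""
    labels_path = os.path.join(shared_dir, "firewatch-additional-labels")
    report_path = os.path.join(artifact_dir, "firewatch-granular-data.json")
    if failure_count > 0:
        # Assumed format: one label per line.
        with open(labels_path, "w", encoding="utf-8") as fh:
            fh.write("\n".join(labels) + "\n")
    # Assumed schema: a failure count plus the extracted labels.
    with open(report_path, "w", encoding="utf-8") as fh:
        json.dump({"failures": failure_count, "labels": labels}, fh, indent=2)
    return labels_path, report_path


with tempfile.TemporaryDirectory() as shared, tempfile.TemporaryDirectory() as artifacts:
    labels_path, report_path = write_outputs(shared, artifacts, 0, [])
    # With zero failures only the report exists.
    print(os.path.exists(labels_path), os.path.exists(report_path))
```

Skipping the labels file on a clean run means a downstream consumer that checks for the file's existence sees nothing to apply.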

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

lgtm, approved, rehearsals-ack

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding a new firewatch-granular-analysis step registry entry, which is the core objective of the PR.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed PR adds CI infrastructure code (shell script, YAML configs, OWNERS) with no Ginkgo test definitions; check for Ginkgo test name stability is not applicable.
Test Structure And Quality ✅ Passed PR contains no Ginkgo test code; it adds shell/YAML CI configuration files, so the test structure check is not applicable.
Microshift Test Compatibility ✅ Passed This PR adds CI infrastructure (step registry files and a JUnit XML processor), not Ginkgo e2e tests, so the MicroShift compatibility check does not apply.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No Ginkgo e2e tests are added in this PR. Changes include only CI infrastructure files (OWNERS, bash/Python script for test artifact parsing, step registry definition, and workflow config), which a...
Topology-Aware Scheduling Compatibility ✅ Passed This PR adds CI step registry files (ref.yaml, shell script, and OWNERS) with no deployment manifests, operator code, or scheduling constraints. No pod affinity, node selectors, topology constraint...
Ote Binary Stdout Contract ✅ Passed This PR adds bash scripts and YAML configs, not Go test binaries. The OTE Binary Stdout Contract check applies only to Go test binaries communicating with openshift-tests. Not applicable.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PR does not add any Ginkgo e2e tests (It(), Describe(), Context(), When()). It adds CI infrastructure: a shell script with embedded Python that analyzes test artifacts post-execution, not test defi...
No-Weak-Crypto ✅ Passed The PR adds CI infrastructure for parsing JUnit XML test artifacts to extract metadata labels. No weak cryptographic algorithms (MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB), custom crypto implementat...
Container-Privileges ✅ Passed No container privilege escalation settings detected: no privileged, hostPID, hostNetwork, hostIPC, SYS_ADMIN, or allowPrivilegeEscalation configurations in any files.
No-Sensitive-Data-In-Logs ✅ Passed Script logs only non-sensitive metadata: operator names, component names, file paths, and counts. No passwords, tokens, API keys, PII, session IDs, or customer data are exposed in logs.

@openshift-ci openshift-ci Bot requested review from calebevans and sg-rh June 22, 2026 20:17
@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 22, 2026
Adds the firewatch-granular-analysis step before firewatch-report-issues
in the interop-opp-aws test post chain. The step extracts structured
labels from JUnit XML artifacts so firewatch-report-issues can apply
them to Jira tickets.

Relates: INTEROP-9185
@openshift-merge-bot

Contributor

@amp-rh, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: test firewatch-granular-analysis contains best_effort without timeout
Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.
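
The registry-load error above states a simple rule: a step that sets best_effort must also set a timeout. The sketch below illustrates that rule on plain dictionaries. It is not the ci-operator implementation, and the timeout value shown is a hypothetical placeholder.

```python
def validate_step(step):
    """Return validation errors for one step definition given as a plain dict."""
    errors = []
    if step.get("best_effort") and not step.get("timeout"):
        name = step.get("as", "<unnamed>")
        errors.append(f"test {name} contains best_effort without timeout")
    return errors


without_timeout = {"as": "firewatch-granular-analysis", "best_effort": True}
# Hypothetical timeout value, chosen only for illustration.
with_timeout = {**without_timeout, "timeout": "10m0s"}

print(validate_step(without_timeout))
print(validate_step(with_timeout))
```

Under this reading, adding a timeout field to the step's ref definition would be the likely fix for the rehearsal failure.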

openshift-ci Bot commented Jun 22, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: amp-rh
Once this PR has been reviewed and has the lgtm label, please assign jan-law for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 22, 2026

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-commands.sh`:
- Around line 47-57: The XML file discovery logic is too narrow and does not
match the Bash pre-check semantics. Currently, the code only checks the direct
`junit` subdirectory under artifact_dir and top-level XMLs, but it misses XML
files in nested junit directories at any depth (such as
`artifact_dir/some/path/junit/file.xml`). Refactor the xml_paths population to
recursively walk through all directories in artifact_dir using os.walk() to find
XML files under any nested `junit` directory path, ensuring that all XML files
matching the `*/junit/*.xml` pattern are discovered regardless of nesting depth.
- Around line 3-5: The firewatch-granular-analysis-commands.sh script is missing
the `errexit` option in its set command, which means command failures will not
cause the script to exit immediately. Add the `-e` flag to the existing `set -o
nounset` and `set -o pipefail` statement to make it `set -euo pipefail`,
following the standard step-registry baseline flags. This ensures the script
fails fast on any command error rather than continuing silently.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 8a677b43-f1ec-4dc8-8af3-176bcf42e8b2

📥 Commits

Reviewing files that changed from the base of the PR and between f032d5f and 5818d3f.

📒 Files selected for processing (3)
  • ci-operator/step-registry/firewatch/granular-analysis/OWNERS
  • ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-commands.sh
  • ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-ref.yaml

Comment on lines +3 to +5
set -o nounset
set -o pipefail

Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Enable errexit to avoid silent continuation on command failures.

This step omits -e, so unexpected command errors can be ignored before the explicit Python exit handling. Use the standard step-registry baseline flags.

As per coding guidelines, step-registry command scripts should use set -euo pipefail by default.

Suggested patch
-set -o nounset
-set -o pipefail
+set -euo pipefail

Source: Coding guidelines
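
The effect of errexit can be observed directly by running small bash snippets, here driven from Python via `subprocess`. This is an illustrative sketch that assumes `bash` is on the PATH; the helper `run_bash` is hypothetical. The third case is relevant because the step exits with the Python exit code, and under `set -e` that code has to be captured with an explicit guard.

```python
import subprocess


def run_bash(script):
    """Run a bash snippet and return (exit_code, stdout)."""
    proc = subprocess.run(["bash", "-c", script], capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


# Without errexit, the failing command is ignored and the script carries on.
print(run_bash("set -uo pipefail; false; echo continued"))   # (0, 'continued')

# With errexit, the script stops at the first failing command.
print(run_bash("set -euo pipefail; false; echo continued"))  # (1, '')

# Under errexit, a child's exit code must be captured with a guard such as `|| rc=$?`.
print(run_bash("set -euo pipefail; rc=0; (exit 3) || rc=$?; echo rc=$rc; exit $rc"))  # (3, 'rc=3')
```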

Comment on lines +47 to +57
xml_paths = []
junit_dir = os.path.join(artifact_dir, "junit")
if os.path.isdir(junit_dir):
    for name in os.listdir(junit_dir):
        if name.endswith(".xml"):
            xml_paths.append(os.path.join(junit_dir, name))
for name in os.listdir(artifact_dir):
    full = os.path.join(artifact_dir, name)
    if name.endswith(".xml") and os.path.isfile(full) and full not in xml_paths:
        xml_paths.append(full)

Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

XML discovery in Python is narrower than the Bash pre-check, causing missed inputs.

Bash accepts XMLs under any */junit/*, but Python only reads ${artifact_dir}/junit and top-level XMLs. If files exist only in nested junit paths, the step reports zero extracted labels despite passing the initial Bash gate.

Suggested patch (align Python discovery with Bash semantics)
-xml_paths = []
-junit_dir = os.path.join(artifact_dir, "junit")
-if os.path.isdir(junit_dir):
-    for name in os.listdir(junit_dir):
-        if name.endswith(".xml"):
-            xml_paths.append(os.path.join(junit_dir, name))
-for name in os.listdir(artifact_dir):
-    full = os.path.join(artifact_dir, name)
-    if name.endswith(".xml") and os.path.isfile(full) and full not in xml_paths:
-        xml_paths.append(full)
+xml_paths = []
+seen = set()
+for root, _, files in os.walk(artifact_dir):
+    for name in files:
+        if not name.endswith(".xml"):
+            continue
+        full = os.path.join(root, name)
+        rel = os.path.relpath(full, artifact_dir)
+        is_top_level_xml = os.sep not in rel
+        is_under_junit = f"{os.sep}junit{os.sep}" in f"{os.sep}{rel}"
+        if (is_top_level_xml or is_under_junit) and full not in seen:
+            seen.add(full)
+            xml_paths.append(full)
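
A self-contained variant of the suggested discovery logic, written as a function so it can be exercised against a temporary directory. The function name `discover_junit_xmls` and the sample file layout are hypothetical. It aims for the same semantics as the patch above: top-level XMLs plus XMLs under any directory named `junit`.

```python
import os
import tempfile


def discover_junit_xmls(artifact_dir):
    """Return top-level XMLs plus XMLs under any directory named junit, sorted."""
    found = []
    for root, _dirs, files in os.walk(artifact_dir):
        rel_root = os.path.relpath(root, artifact_dir)
        parts = [] if rel_root == os.curdir else rel_root.split(os.sep)
        # Skip files in directories that are neither top-level nor below a junit dir.
        # Subdirectories are still walked, because a junit dir may sit deeper.
        if parts and "junit" not in parts:
            continue
        found.extend(os.path.join(root, n) for n in files if n.endswith(".xml"))
    return sorted(found)


with tempfile.TemporaryDirectory() as base:
    for rel in ("top.xml", "junit/a.xml", "some/path/junit/b.xml", "logs/skip.xml"):
        path = os.path.join(base, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    # logs/skip.xml is excluded; the nested junit file is included.
    print([os.path.relpath(p, base) for p in discover_junit_xmls(base)])
```
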

@openshift-merge-bot

Contributor

@amp-rh, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: test firewatch-granular-analysis contains best_effort without timeout

openshift-ci Bot commented Jun 22, 2026

Contributor

@amp-rh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/ci-operator-config | 8b6b0f7 | link | true | /test ci-operator-config |
| ci/prow/ci-operator-registry | 8b6b0f7 | link | true | /test ci-operator-registry |
| ci/prow/step-registry-metadata | 8b6b0f7 | link | true | /test step-registry-metadata |
