Add firewatch-granular-analysis step registry entry by amp-rh · Pull Request #80873 · openshift/release

amp-rh · 2026-06-22T20:16:19Z

Summary

Adds a firewatch-granular-analysis step to the CI step registry that parses JUnit XML test artifacts and extracts granular metadata as Jira labels for consumption by firewatch-report-issues.

- Extracts operator:, component:, and location: labels from failing test cases
- Writes firewatch-additional-labels to ${SHARED_DIR} (read by firewatch-report-issues)
- Writes firewatch-granular-data.json structured report as a CI artifact
- Runs with best_effort: true so it cannot block existing jobs
- Uses the firewatch base image (no new repo or container image needed)

Approach

This is a shell script with an inline Python snippet that uses xml.etree.ElementTree (stdlib, no pip deps). It replaces the approach in #80862, which required a new Go repository, container image, and CI pipeline.
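
For readers unfamiliar with the approach, here is a minimal sketch of stdlib-only label extraction from a JUnit XML document. It is not the script from this PR: the two regexes, the `extract_labels` name, and the rule that the last dotted segment of `classname` names the component are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical patterns; the PR's real regexes are not shown in this thread.
OPERATOR_RE = re.compile(r"\b([a-z0-9-]+-operator)\b")
LOCATION_RE = re.compile(r"\b([\w./-]+\.go:\d+)\b")


def extract_labels(junit_xml: str) -> list[str]:
    """Return sorted operator:/component:/location: labels for failing test cases."""
    labels = set()
    root = ET.fromstring(junit_xml)
    for case in root.iter("testcase"):
        failure = case.find("failure")
        if failure is None:
            continue  # only failing test cases contribute labels
        text = f"{failure.get('message', '')}\n{failure.text or ''}"
        classname = case.get("classname", "")
        if classname:
            # Assumption: the last dotted segment of classname names the component.
            labels.add(f"component:{classname.rsplit('.', 1)[-1]}")
        for match in OPERATOR_RE.findall(text):
            labels.add(f"operator:{match}")
        for match in LOCATION_RE.findall(text):
            labels.add(f"location:{match}")
    return sorted(labels)


SAMPLE = """<testsuite>
  <testcase classname="e2e.storage" name="mounts volume">
    <failure message="cluster-storage-operator degraded">pkg/volume/mount.go:42: timeout</failure>
  </testcase>
  <testcase classname="e2e.network" name="passes"/>
</testsuite>"""

print(extract_labels(SAMPLE))
# ['component:storage', 'location:pkg/volume/mount.go:42', 'operator:cluster-storage-operator']
```

The passing `e2e.network` case contributes nothing, which matches the stated intent of labelling only failing test cases.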

Files

File	Purpose
`ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-commands.sh`	JUnit XML parser and label extractor
`ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-ref.yaml`	Step registry ref definition
`ci-operator/step-registry/firewatch/granular-analysis/OWNERS`	Uses `firewatch-owners` (matches parent)

Integration

Place in the post chain before firewatch-report-issues:

post:
  - ref: firewatch-granular-analysis
  - ref: firewatch-report-issues

The firewatch-report-issues step already checks for and consumes ${SHARED_DIR}/firewatch-additional-labels via the --additional-labels-file flag.

Supersedes

Supersedes #80862 (same functionality, simpler implementation).

Relates: INTEROP-9185

Summary by CodeRabbit

This PR adds a new firewatch-granular-analysis step to the OpenShift CI step registry. This step enhances the existing Firewatch integration by extracting granular metadata from failing test cases in JUnit XML artifacts.

What the step does:
The new step parses JUnit XML test result files and extracts three types of labels from failure information:

- Operator identifiers - regex-matched patterns indicating which operator failed
- Component identifiers - derived from test case class names
- Location strings - go-location-like patterns from failure content

These extracted labels are written to ${SHARED_DIR}/firewatch-additional-labels for consumption by the existing firewatch-report-issues step, and also saved as firewatch-granular-data.json as a CI artifact for reporting purposes.

Implementation details:

- The step is implemented as a shell script with an embedded Python snippet using only the standard library
- Runs with best_effort: true, ensuring failures won't block downstream jobs
- Uses the existing firewatch base image, avoiding infrastructure overhead
- Designed to be placed in the post chain immediately before firewatch-report-issues to provide enhanced issue metadata

Files added:

- firewatch-granular-analysis-commands.sh: Contains the JUnit parser and label extraction logic
- firewatch-granular-analysis-ref.yaml: Step registry definition with resource limits and configuration
- OWNERS: References the firewatch-owners team for code ownership

This change enables more detailed Jira issue reporting by automatically categorizing test failures, improving the ability to route and track infrastructure issues across multiple components.

Adds a step that parses JUnit XML test artifacts and extracts granular metadata (operator names, component names, file locations) as Jira labels. Writes labels to ${SHARED_DIR}/firewatch-additional-labels for consumption by firewatch-report-issues. Runs with best_effort: true. Supersedes openshift#80862 with a simpler shell-script approach that eliminates the need for a separate repository and container image. Relates: INTEROP-9185

coderabbitai · 2026-06-22T20:17:12Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 45d4fbe7-72af-4520-9ce4-d44276c8e049

📥 Commits

Reviewing files that changed from the base of the PR and between 5818d3f and 8b6b0f7.

📒 Files selected for processing (1)

ci-operator/config/stolostron/policy-collection/stolostron-policy-collection-main__ocp4.22.yaml

Walkthrough

Adds a new CI step firewatch-granular-analysis consisting of three files: an OWNERS file assigning firewatch-owners as approvers and reviewers, a YAML step reference configuring the image, resources, and environment, and a Bash script with embedded Python that parses JUnit XML artifacts to extract failure labels and write output files to SHARED_DIR.

Changes

firewatch-granular-analysis CI step

Layer / File(s)	Summary
Step definition and ownership `ci-operator/step-registry/firewatch/granular-analysis/OWNERS`, `ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-ref.yaml`	Defines the step with `firewatch:main` image, `10m` CPU / `100Mi` memory, `FIREWATCH_GRANULAR_ARTIFACT_SUBDIR` env var (default `""`), `best_effort: true`, and sets `firewatch-owners` as approvers and reviewers.
Command script implementation `ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-commands.sh`	Bash entrypoint resolves artifact/output paths, discovers JUnit XMLs from `junit/` or the top-level artifact dir, and handles the no-XML case by writing a zero-failure JSON report. Embedded Python iterates XMLs, counts `<failure>` elements per testcase, extracts capped sets of operators (regex), components (from `classname`/`name`), and locations (regex), writes `firewatch-additional-labels` only when failures exist, and always writes `firewatch-granular-data.json`. Post-processing Bash prints output file contents and exits with the Python exit code.
Workflow integration `ci-operator/config/stolostron/policy-collection/stolostron-policy-collection-main__ocp4.22.yaml`	Adds the new step to the `interop-opp-aws` test workflow post-steps, positioned between the `ipi-deprovision` chain and `firewatch-report-issues` step.
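
The output behaviour described in the table above (labels file only when failures exist, JSON report always) could look roughly like the following sketch. The `write_outputs` helper, the `MAX_LABELS` cap, the JSON schema, and the one-label-per-line file format are assumptions; the thread does not show the actual values or formats.

```python
import json
import os
import tempfile

MAX_LABELS = 20  # hypothetical cap; the PR's actual limit is not stated here


def write_outputs(shared_dir, artifact_dir, failure_count, labels):
    """Write the labels file only on failures; always write the JSON report."""
    capped = sorted(labels)[:MAX_LABELS]
    report_path = os.path.join(artifact_dir, "firewatch-granular-data.json")
    with open(report_path, "w", encoding="utf-8") as fh:
        # Assumed report schema, for illustration only.
        json.dump({"failures": failure_count, "labels": capped}, fh, indent=2)
    labels_path = os.path.join(shared_dir, "firewatch-additional-labels")
    if failure_count > 0 and capped:
        with open(labels_path, "w", encoding="utf-8") as fh:
            # Assumed format: one label per line.
            fh.write("\n".join(capped) + "\n")
    return report_path, labels_path


with tempfile.TemporaryDirectory() as tmp:
    clean = write_outputs(tmp, tmp, 0, set())
    no_labels_file = not os.path.exists(clean[1])
    failed = write_outputs(tmp, tmp, 2, {"operator:example-operator", "component:storage"})
    has_labels_file = os.path.exists(failed[1])
    print(no_labels_file, has_labels_file)  # True True
```

Skipping the labels file on a clean run means firewatch-report-issues sees no extra labels unless something actually failed.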

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

lgtm, approved, rehearsals-ack

🚥 Pre-merge checks | ✅ 15

✅ Passed checks (15 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and concisely summarizes the main change: adding a new firewatch-granular-analysis step registry entry, which is the core objective of the PR.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names	✅ Passed	PR adds CI infrastructure code (shell script, YAML configs, OWNERS) with no Ginkgo test definitions; check for Ginkgo test name stability is not applicable.
Test Structure And Quality	✅ Passed	PR contains no Ginkgo test code; it adds shell/YAML CI configuration files, so the test structure check is not applicable.
Microshift Test Compatibility	✅ Passed	This PR adds CI infrastructure (step registry files and a JUnit XML processor), not Ginkgo e2e tests, so the MicroShift compatibility check does not apply.
Single Node Openshift (Sno) Test Compatibility	✅ Passed	No Ginkgo e2e tests are added in this PR. Changes include only CI infrastructure files (OWNERS, bash/Python script for test artifact parsing, step registry definition, and workflow config), which a...
Topology-Aware Scheduling Compatibility	✅ Passed	This PR adds CI step registry files (ref.yaml, shell script, and OWNERS) with no deployment manifests, operator code, or scheduling constraints. No pod affinity, node selectors, topology constraint...
Ote Binary Stdout Contract	✅ Passed	This PR adds bash scripts and YAML configs, not Go test binaries. The OTE Binary Stdout Contract check applies only to Go test binaries communicating with openshift-tests. Not applicable.
Ipv6 And Disconnected Network Test Compatibility	✅ Passed	PR does not add any Ginkgo e2e tests (It(), Describe(), Context(), When()). It adds CI infrastructure: a shell script with embedded Python that analyzes test artifacts post-execution, not test defi...
No-Weak-Crypto	✅ Passed	The PR adds CI infrastructure for parsing JUnit XML test artifacts to extract metadata labels. No weak cryptographic algorithms (MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB), custom crypto implementat...
Container-Privileges	✅ Passed	No container privilege escalation settings detected: no privileged, hostPID, hostNetwork, hostIPC, SYS_ADMIN, or allowPrivilegeEscalation configurations in any files.
No-Sensitive-Data-In-Logs	✅ Passed	Script logs only non-sensitive metadata: operator names, component names, file paths, and counts. No passwords, tokens, API keys, PII, session IDs, or customer data are exposed in logs.

Adds the firewatch-granular-analysis step before firewatch-report-issues in the interop-opp-aws test post chain. The step extracts structured labels from JUnit XML artifacts so firewatch-report-issues can apply them to Jira tickets. Relates: INTEROP-9185

openshift-merge-bot · 2026-06-22T20:19:24Z

@amp-rh, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: test firewatch-granular-analysis contains best_effort without timeout
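
The rule behind this error can be approximated with a small check. This is an illustrative sketch only, not the ci-operator implementation: `validate_step` is a hypothetical helper, the step is modelled as a plain dict, and the `10m0s` timeout is a made-up value. Only the `best_effort` and `timeout` field names and the message text come from the error above.

```python
def validate_step(name: str, ref: dict) -> list[str]:
    """Return validation errors for a step reference given as a plain dict."""
    errors = []
    if ref.get("best_effort") and not ref.get("timeout"):
        errors.append(f"test {name} contains best_effort without timeout")
    return errors


broken = {"as": "firewatch-granular-analysis", "best_effort": True}
fixed = {**broken, "timeout": "10m0s"}  # hypothetical timeout value

print(validate_step("firewatch-granular-analysis", broken))
# ['test firewatch-granular-analysis contains best_effort without timeout']
print(validate_step("firewatch-granular-analysis", fixed))
# []
```

Under that reading, setting a `timeout` in firewatch-granular-analysis-ref.yaml alongside `best_effort: true` would be the likely fix.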

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

openshift-ci · 2026-06-22T20:19:46Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: amp-rh
Once this PR has been reviewed and has the lgtm label, please assign jan-law for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

ci-operator/config/stolostron/policy-collection/OWNERS
~~ci-operator/step-registry/firewatch/OWNERS~~ [amp-rh]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-commands.sh`:
- Around line 47-57: The XML file discovery logic is too narrow and does not
match the Bash pre-check semantics. Currently, the code only checks the direct
`junit` subdirectory under artifact_dir and top-level XMLs, but it misses XML
files in nested junit directories at any depth (such as
`artifact_dir/some/path/junit/file.xml`). Refactor the xml_paths population to
recursively walk through all directories in artifact_dir using os.walk() to find
XML files under any nested `junit` directory path, ensuring that all XML files
matching the `*/junit/*.xml` pattern are discovered regardless of nesting depth.
- Around line 3-5: The firewatch-granular-analysis-commands.sh script is missing
the `errexit` option in its set command, which means command failures will not
cause the script to exit immediately. Add the `-e` flag to the existing `set -o
nounset` and `set -o pipefail` statement to make it `set -euo pipefail`,
following the standard step-registry baseline flags. This ensures the script
fails fast on any command error rather than continuing silently.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 8a677b43-f1ec-4dc8-8af3-176bcf42e8b2

📥 Commits

Reviewing files that changed from the base of the PR and between f032d5f and 5818d3f.

📒 Files selected for processing (3)

ci-operator/step-registry/firewatch/granular-analysis/OWNERS
ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-commands.sh
ci-operator/step-registry/firewatch/granular-analysis/firewatch-granular-analysis-ref.yaml

coderabbitai · 2026-06-22T20:21:14Z

+set -o nounset
+set -o pipefail
+


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Enable errexit to avoid silent continuation on command failures.

This step omits -e, so unexpected command errors can be ignored before the explicit Python exit handling. Use the standard step-registry baseline flags.

As per coding guidelines, step-registry command scripts should use set -euo pipefail by default.

Suggested patch

-set -o nounset
-set -o pipefail
+set -euo pipefail

Source: Coding guidelines

coderabbitai · 2026-06-22T20:21:14Z

+xml_paths = []
+junit_dir = os.path.join(artifact_dir, "junit")
+if os.path.isdir(junit_dir):
+    for name in os.listdir(junit_dir):
+        if name.endswith(".xml"):
+            xml_paths.append(os.path.join(junit_dir, name))
+for name in os.listdir(artifact_dir):
+    full = os.path.join(artifact_dir, name)
+    if name.endswith(".xml") and os.path.isfile(full) and full not in xml_paths:
+        xml_paths.append(full)
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

XML discovery in Python is narrower than the Bash pre-check, causing missed inputs.

Bash accepts XMLs under any */junit/*, but Python only reads ${artifact_dir}/junit and top-level XMLs. If files exist only in nested junit paths, the step reports zero extracted labels despite passing the initial Bash gate.

Suggested patch (align Python discovery with Bash semantics)

-xml_paths = []
-junit_dir = os.path.join(artifact_dir, "junit")
-if os.path.isdir(junit_dir):
-    for name in os.listdir(junit_dir):
-        if name.endswith(".xml"):
-            xml_paths.append(os.path.join(junit_dir, name))
-for name in os.listdir(artifact_dir):
-    full = os.path.join(artifact_dir, name)
-    if name.endswith(".xml") and os.path.isfile(full) and full not in xml_paths:
-        xml_paths.append(full)
+xml_paths = []
+seen = set()
+for root, _, files in os.walk(artifact_dir):
+    for name in files:
+        if not name.endswith(".xml"):
+            continue
+        full = os.path.join(root, name)
+        rel = os.path.relpath(full, artifact_dir)
+        is_top_level_xml = os.sep not in rel
+        is_under_junit = f"{os.sep}junit{os.sep}" in f"{os.sep}{rel}"
+        if (is_top_level_xml or is_under_junit) and full not in seen:
+            seen.add(full)
+            xml_paths.append(full)
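
To try the suggested discovery logic in isolation, it can be wrapped in a function and run against a throwaway directory tree. This is a sketch under assumptions: the `discover_junit_xmls` name and the sample paths are hypothetical, and the `seen` set is dropped because `os.walk` yields each file once when symlinks are not followed (its default).

```python
import os
import tempfile


def discover_junit_xmls(artifact_dir: str) -> list[str]:
    """Collect top-level *.xml files and any *.xml under a junit/ directory."""
    xml_paths = []
    for root, _, files in os.walk(artifact_dir):
        for name in sorted(files):
            if not name.endswith(".xml"):
                continue
            full = os.path.join(root, name)
            rel = os.path.relpath(full, artifact_dir)
            is_top_level_xml = os.sep not in rel
            is_under_junit = f"{os.sep}junit{os.sep}" in f"{os.sep}{rel}"
            if is_top_level_xml or is_under_junit:
                xml_paths.append(full)
    return xml_paths


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical artifact layout covering each case from the review comment.
    for rel in ("top.xml", "junit/a.xml", "some/path/junit/b.xml", "other/c.xml"):
        path = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    found = sorted(
        os.path.relpath(p, tmp).replace(os.sep, "/") for p in discover_junit_xmls(tmp)
    )
    print(found)
    # ['junit/a.xml', 'some/path/junit/b.xml', 'top.xml']
```

The nested `some/path/junit/b.xml` is picked up, while `other/c.xml` is skipped because it is neither top-level nor under a `junit` directory.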

openshift-merge-bot · 2026-06-22T20:21:32Z

@amp-rh, pj-rehearse: unable to determine affected jobs. This could be due to a branch that needs to be rebased. ERROR:

could not determine changed registry steps: could not load step registry: test firewatch-granular-analysis contains best_effort without timeout

openshift-ci · 2026-06-22T20:26:42Z

@amp-rh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/ci-operator-config	`8b6b0f7`	link	true	`/test ci-operator-config`
ci/prow/ci-operator-registry	`8b6b0f7`	link	true	`/test ci-operator-registry`
ci/prow/step-registry-metadata	`8b6b0f7`	link	true	`/test step-registry-metadata`

openshift-ci Bot requested review from calebevans and sg-rh June 22, 2026 20:17

openshift-ci Bot added the approved label Jun 22, 2026

openshift-ci Bot removed the approved label Jun 22, 2026

coderabbitai Bot reviewed Jun 22, 2026

amp-rh wants to merge 2 commits into openshift:main from amp-rh:firewatch-granular-step
