HIVE-3148: Adding HCMbundle tests using OTE#79862

Open
miyadav wants to merge 1 commit into openshift:main from miyadav:otesdhrosatests

Conversation

miyadav (Member) commented May 29, 2026

/hold
Once the tests look stable, we can retire the CI job that uses OTP and use this one for the HCM bundle tests.

Summary by CodeRabbit

This PR establishes new CI infrastructure for testing Hive using the OpenShift Tests Extension (OTE) framework on ROSA (Red Hat OpenShift on AWS) deployments.

What changed:
Two additions to the Hive periodic CI configuration (openshift-hive-master__periodic.yaml):

  1. New hive-tests image build: Created a container image that builds and packages the OTE test extension binary. This image is constructed by:

    • Compiling Hive's OTE test tools (make -C test/ote build)
    • Installing required runtime dependencies (gzip, jq)
    • Packaging the compiled test extension as /usr/bin/openshift-tests-extension
  2. New e2e-ote-sd-rosa periodic job: A weekly scheduled test job (runs Saturdays at 6 AM) that:

    • Deploys Hive to an AWS QE cluster
    • Executes the OTE-based ROSA test suite against the deployed Hive instance
    • Converts OTE JSON test results into JUnit XML for CI integration
    • Reports outcomes to the #team-hive-alert Slack channel
    • Runs with an 8-hour timeout and includes proper AWS credential handling

Purpose: As noted in the PR comments, this adds a new CI validation path for HCM (Hive Cluster Manager) bundle functionality. Once these OTE-based tests prove stable, the team plans to retire the existing OTP-based CI job in favor of this new test pipeline.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 29, 2026
@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 29, 2026
openshift-ci-robot (Contributor) commented May 29, 2026

@miyadav: This pull request references HIVE-3148 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set.

Details

In response to this:

/hold
Once the tests look stable, we can retire the CI job that uses OTP and use this one for the HCM bundle tests.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci Bot (Contributor) commented May 29, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: miyadav
Once this PR has been reviewed and has the lgtm label, please assign dlom for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot requested review from dlom and suhanime May 29, 2026 09:08
miyadav (Member, Author) commented May 29, 2026

/pj-rehearse periodic-ci-openshift-hive-master-periodic-e2e-ote-sd-rosa

openshift-merge-bot (Contributor) commented

@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

openshift-merge-bot (Contributor) commented

[REHEARSALNOTIFIER]
@miyadav: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| pull-ci-openshift-hive-master-periodic-images | openshift/hive | presubmit | Ci-operator config changed |
| periodic-ci-openshift-hive-master-periodic-e2e-vsphere-weekly | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-hive-master-periodic-e2e-weekly | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-hive-master-periodic-aws-ipi-f7-longduration-hive-sd-rosa | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-hive-master-periodic-e2e-azure-weekly | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-hive-master-periodic-e2e-ote-sd-rosa | N/A | periodic | Periodic changed |
| periodic-ci-openshift-hive-master-periodic-e2e-gcp-weekly | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-hive-master-periodic-e2e-openstack-weekly | N/A | periodic | Ci-operator config changed |
| periodic-ci-openshift-hive-master-periodic-e2e-pool-weekly | N/A | periodic | Ci-operator config changed |

Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

coderabbitai Bot (Contributor) commented May 29, 2026

Walkthrough

This PR adds periodic CI testing for Hive with ROSA-specific OTE (OpenShift Tests Extension) validation. It introduces a hive-tests container image stage that builds and packages the OTE test extension binary, and a new weekly scheduled job that deploys Hive, runs the OTE test suite against AWS infrastructure, and converts the test results into a JUnit XML report.

Changes

OTE Hive Periodic Testing

| Layer / File(s) | Summary |
| --- | --- |
| Test image build for OTE extension: ci-operator/config/openshift/hive/openshift-hive-master__periodic.yaml | Adds hive-tests image that builds the OTE test extension, installs gzip and jq utilities, installs the compiled binary to /usr/bin/openshift-tests-extension, and sets the working directory to /tmp. |
| Periodic OTE test job for ROSA: ci-operator/config/openshift/hive/openshift-hive-master__periodic.yaml | Adds e2e-ote-sd-rosa weekly job that deploys Hive, executes openshift-tests-extension run-suite with mounted AWS and pull-secret credentials, post-processes JSON results into junit_results.xml using jq, and exits non-zero when test failures are detected. |
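The post-processing step summarized above evaluates the same pass/fail rule in two separate jq invocations. As a rough sketch only, the counts could be produced by a single call. The helper name `count_results` is hypothetical, and the input shape (a JSON array of objects with `result` and `output` fields) is inferred from the jq filter quoted later in this review, not from OTE documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the PR: print "<passed> <failed>" for a
# result file. Assumes the file holds a JSON array of objects with "result"
# and "output" fields (shape inferred from the PR's jq filter).
count_results() {
  jq -r '
    # Same pass rule as the PR: explicit "passed" result, or the success
    # marker appearing in the captured output.
    def pass:
      .result == "passed" or
      ((.output // "") | contains("SUCCESS! -- 1 Passed"));
    (. // []) |
    ([.[] | select(pass)] | length) as $passed |
    "\($passed) \(length - $passed)"
  ' "$1"
}
```

In bash, a caller could read both numbers with `read -r passed failed < <(count_results "$RESULT_JSON")` and then decide the exit status from `failed`.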

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

rehearsals-ack


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| No-Sensitive-Data-In-Logs | ❌ Error | Lines 372-373 export AWS credentials as environment variables passed to openshift-tests-extension, risking exposure in logs. | Use set +x before credential export, pass via stdin/files instead, or configure credential redaction in the test tool. |
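The first resolution listed for this check (`set +x` before the credential export) could look roughly like the sketch below. It is an illustration under assumptions: the function name `export_creds_quietly` and the file names `aws_access_key_id` and `aws_secret_access_key` are hypothetical, since lines 372-373 of the config are not shown in this conversation, and it is not known whether the step actually runs with xtrace enabled.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the PR: export credentials read from
# mounted files without echoing their values when the caller uses `set -x`.
# The file names below are assumptions for illustration.
export_creds_quietly() {
  local cred_dir="$1"
  local had_xtrace=0
  # Remember whether xtrace was on, then switch it off for the sensitive part.
  case "$-" in *x*) had_xtrace=1 ;; esac
  set +x
  AWS_ACCESS_KEY_ID="$(cat "${cred_dir}/aws_access_key_id")"
  AWS_SECRET_ACCESS_KEY="$(cat "${cred_dir}/aws_secret_access_key")"
  export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
  # Restore tracing only if the caller had it enabled.
  if [ "$had_xtrace" -eq 1 ]; then set -x; fi
  return 0
}
```

This only keeps the values out of the shell trace. It does not stop the test binary itself from logging its environment, which is why the check also suggests files or redaction.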
✅ Passed checks (14 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly relates to the main change: adding a new HCMbundle test job using OTE framework to the CI configuration. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR modifies only CI configuration YAML files, not Ginkgo test code. No test title definitions (It(), Describe(), etc.) present in changes. |
| Test Structure And Quality | ✅ Passed | PR modifies CI configuration YAML files only, not Ginkgo test code. Check criteria for reviewing Ginkgo test structure does not apply to infrastructure configuration. |
| Microshift Test Compatibility | ✅ Passed | PR modifies only CI/YAML configuration in openshift/release; no new Ginkgo e2e tests are added. The custom check applies only to new test code, not CI configuration files. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PR adds CI configuration only (YAML files) with no new Ginkgo e2e test code; check for SNO test compatibility does not apply. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | This PR modifies only CI-operator configuration for testing, not deployment manifests or operator code. No topology-aware scheduling constraints are introduced. |
| Ote Binary Stdout Contract | ✅ Passed | PR only modifies release repository configuration and utilities (YAML, shell scripts, Go test tools), not OTE binary source code from hive repository. Check not applicable to CI configuration changes. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR adds CI configuration for running OTE tests, not new Ginkgo e2e test code. Custom check applies only to new Ginkgo tests (It(), Describe(), etc.); this PR does not add any. |
| No-Weak-Crypto | ✅ Passed | PR only modifies a YAML CI/CD configuration file with no cryptographic operations, weak crypto algorithms, custom crypto implementations, or insecure secret comparisons present. |
| Container-Privileges | ✅ Passed | No privileged container configurations found in the CI-operator config file. File contains only Dockerfiles and test job definitions with no security context specifications. |

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ci-operator/config/openshift/hive/openshift-hive-master__periodic.yaml`:
- Around line 378-407: The step currently forces
/usr/bin/openshift-tests-extension run-suite to succeed with "|| true" and only
fails when extension_test_result_*.json exists, so if run-suite crashes before
writing JSON the job will incorrectly pass; change the logic in the run-suite
block to capture run-suite's exit code (e.g., RC=$?), remove the unconditional
"|| true" or immediately check RC, and if RC is non-zero and no RESULT_JSON was
produced (RESULT_JSON empty or not a file in ARTIFACT_DIR) then exit 1; keep the
existing JSON-to-junit conversion and FAIL_COUNT logic but add an explicit check
after running run-suite that fails the job when run-suite failed or when no
RESULT_JSON was created.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: cac7ada3-f79c-4bdb-b8ed-0d05804f51e6

📥 Commits

Reviewing files that changed from the base of the PR and between 322149b and 599a2d1.

⛔ Files ignored due to path filters (1)
  • ci-operator/jobs/openshift/hive/openshift-hive-master-periodics.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (1)
  • ci-operator/config/openshift/hive/openshift-hive-master__periodic.yaml

Comment on lines +378 to +407
/usr/bin/openshift-tests-extension run-suite -c 1 openshift/hive -j ${ARTIFACT_DIR}/junit_results.xml || true
RESULT_JSON=$(ls ${ARTIFACT_DIR}/extension_test_result_*.json 2>/dev/null | head -1)
if [[ -n "$RESULT_JSON" && -f "$RESULT_JSON" ]]; then
  jq -r '
    (. // []) |
    def pass:
      .result == "passed" or
      ((.output // "") | contains("SUCCESS! -- 1 Passed"));
    "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
    "<testsuite tests=\"\(length)\" failures=\"\([.[] | select(pass | not)] | length)\">",
    (.[] |
      if pass then
        "<testcase name=\"\(.name | @html)\"/>"
      else
        "<testcase name=\"\(.name | @html)\"><failure><![CDATA[\((.error // "")[0:500])]]></failure></testcase>"
      end
    ),
    "</testsuite>"
  ' "$RESULT_JSON" > "${ARTIFACT_DIR}/junit_results.xml"
  FAIL_COUNT=$(jq '
    (. // []) |
    def pass:
      .result == "passed" or
      ((.output // "") | contains("SUCCESS! -- 1 Passed"));
    [.[] | select(pass | not)] | length
  ' "$RESULT_JSON")
  TOTAL=$(jq '(. // []) | length' "$RESULT_JSON")
  echo "Results: $((TOTAL - FAIL_COUNT)) passed, ${FAIL_COUNT} failed"
  [[ $FAIL_COUNT -gt 0 ]] && exit 1
fi

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail the job when test execution fails before producing result JSON.

run-suite is forced to succeed with || true, and the script only asserts failures when a JSON file exists. If the runner fails early and writes no extension_test_result_*.json, this step passes incorrectly.

Proposed fix
-        /usr/bin/openshift-tests-extension run-suite -c 1 openshift/hive -j ${ARTIFACT_DIR}/junit_results.xml || true
+        set +e
+        /usr/bin/openshift-tests-extension run-suite -c 1 openshift/hive -j ${ARTIFACT_DIR}/junit_results.xml
+        RUN_SUITE_RC=$?
+        set -e
         RESULT_JSON=$(ls ${ARTIFACT_DIR}/extension_test_result_*.json 2>/dev/null | head -1)
         if [[ -n "$RESULT_JSON" && -f "$RESULT_JSON" ]]; then
           jq -r '
             (. // []) |
             def pass:
               .result == "passed" or
               ((.output // "") | contains("SUCCESS! -- 1 Passed"));
@@
           TOTAL=$(jq '(. // []) | length' "$RESULT_JSON")
           echo "Results: $((TOTAL - FAIL_COUNT)) passed, ${FAIL_COUNT} failed"
           [[ $FAIL_COUNT -gt 0 ]] && exit 1
+        elif [[ $RUN_SUITE_RC -ne 0 ]]; then
+          echo "openshift-tests-extension failed and produced no result JSON"
+          exit $RUN_SUITE_RC
         fi
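The AI-agent prompt in this review asks for a failure "when run-suite failed or when no RESULT_JSON was created", whereas the diff above fails only when run-suite returned non-zero and no JSON exists. A stricter variant, sketched here as an assumption and not as the reviewer's or author's method, treats a missing result file as a failure regardless of the exit code. The helper name `suite_exit_status` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the PR: decide the step outcome from
# run-suite's exit code and the presence of a result file.
suite_exit_status() {
  local run_rc="$1"
  local artifact_dir="$2"
  local candidate
  local result_json=""
  # Glob instead of parsing `ls`; take the first match that is a regular file.
  for candidate in "${artifact_dir}"/extension_test_result_*.json; do
    if [ -f "$candidate" ]; then result_json="$candidate"; break; fi
  done
  if [ -n "$result_json" ]; then
    # Results exist: print the path so the JSON-based failure count can decide.
    echo "$result_json"
    return 0
  fi
  echo "no result JSON produced (run-suite exit code: ${run_rc})" >&2
  # Stricter than the proposed diff: a missing result file always fails.
  if [ "$run_rc" -ne 0 ]; then return "$run_rc"; fi
  return 1
}
```

Whether a clean exit without any result file should fail the job is a policy choice for the PR author. The sketch only makes the stricter reading of the prompt concrete.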
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
set +e
/usr/bin/openshift-tests-extension run-suite -c 1 openshift/hive -j ${ARTIFACT_DIR}/junit_results.xml
RUN_SUITE_RC=$?
set -e
RESULT_JSON=$(ls ${ARTIFACT_DIR}/extension_test_result_*.json 2>/dev/null | head -1)
if [[ -n "$RESULT_JSON" && -f "$RESULT_JSON" ]]; then
  jq -r '
    (. // []) |
    def pass:
      .result == "passed" or
      ((.output // "") | contains("SUCCESS! -- 1 Passed"));
    "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
    "<testsuite tests=\"\(length)\" failures=\"\([.[] | select(pass | not)] | length)\">",
    (.[] |
      if pass then
        "<testcase name=\"\(.name | @html)\"/>"
      else
        "<testcase name=\"\(.name | @html)\"><failure><![CDATA[\((.error // "")[0:500])]]></failure></testcase>"
      end
    ),
    "</testsuite>"
  ' "$RESULT_JSON" > "${ARTIFACT_DIR}/junit_results.xml"
  FAIL_COUNT=$(jq '
    (. // []) |
    def pass:
      .result == "passed" or
      ((.output // "") | contains("SUCCESS! -- 1 Passed"));
    [.[] | select(pass | not)] | length
  ' "$RESULT_JSON")
  TOTAL=$(jq '(. // []) | length' "$RESULT_JSON")
  echo "Results: $((TOTAL - FAIL_COUNT)) passed, ${FAIL_COUNT} failed"
  [[ $FAIL_COUNT -gt 0 ]] && exit 1
elif [[ $RUN_SUITE_RC -ne 0 ]]; then
  echo "openshift-tests-extension failed and produced no result JSON"
  exit $RUN_SUITE_RC
fi
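One detail in both the original and the suggested snippet may be worth checking. When `FAIL_COUNT` is 0, the `&&` list is the last command executed inside the `if`, so the `if` statement itself finishes with status 1. If nothing follows `fi` in the step (the excerpt does not show this), the script would then exit non-zero with zero failures. The sketch below isolates the behavior with two hypothetical functions and uses the POSIX `[` test in place of `[[`.

```shell
#!/usr/bin/env bash
# Pattern from the step: the `&&` list is the last command inside `if`.
# Each function body runs in a subshell so `exit` does not end the caller.
finish_with_and_list() (
  fail_count="$1"
  if true; then
    [ "$fail_count" -gt 0 ] && exit 1
  fi
)

# Alternative: an explicit `if` leaves status 0 when there are no failures.
finish_with_if() (
  fail_count="$1"
  if true; then
    if [ "$fail_count" -gt 0 ]; then exit 1; fi
  fi
)
```

With a count of 0, `finish_with_and_list` returns 1 while `finish_with_if` returns 0. With a positive count both return 1.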

openshift-ci Bot (Contributor) commented May 29, 2026

@miyadav: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/periodic-ci-openshift-hive-master-periodic-e2e-ote-sd-rosa | 599a2d1 | link | unknown | /pj-rehearse periodic-ci-openshift-hive-master-periodic-e2e-ote-sd-rosa |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
