ipi-install-install-aws: Fix ADD_INGRESS_RECORDS_MANUALLY=yes timeout #77955

Merged: openshift-merge-bot[bot] merged 1 commit into openshift:main from openshift-cloud-team:aws-manual-ingress-timeout on May 7, 2026

mdbooth (Contributor) commented Apr 17, 2026

When ADD_INGRESS_RECORDS_MANUALLY=yes we explicitly remove zone creation
from the manifests. This prevents the ingress operator from coming up,
which means that 'create cluster' always times out.

Ideally we would create the ingress records in advance, but because the
LB is created during installation we have no way of knowing what they
should be. There is also currently no mechanism to precreate a
loadbalancer and pass it to the installer. To avoid the timeout we need
a workaround.

With this change, instead of running after create cluster and assuming
that router-default has been created, we kick off a process that polls
for its creation. We create the DNS records immediately, during cluster
create, which means we don't have to wait for the timeout. This saves
20-30 minutes per run in testing.

Previous flow:

  • create cluster -> times out after 40 mins due to missing DNS records
  • create DNS records manually, unblocks ingress operator and installation progresses again
  • wait-for-install-complete eventually completes successfully

New flow:

  • concurrently:
    • create cluster creates the cluster as before
    • create_ingress_dns_record waits for the default ingress router to exist and creates DNS records for it as soon as it is observed
  • create cluster succeeds within the initial timeout.
  • wait-for-install-complete completes immediately.
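
The new flow can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `fake_installer`, `fake_dns_watcher`, `run_install` and the marker file are hypothetical stand-ins for 'create cluster', create_ingress_dns_record and the appearance of the router-default service.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for 'create cluster': the router "appears" part-way through.
fake_installer() {
  local marker="$1"
  sleep 0.2
  touch "${marker}"   # simulates router-default being created
  sleep 0.2
}

# Hypothetical stand-in for the DNS watcher: polls until the router exists.
fake_dns_watcher() {
  local marker="$1" tries=0
  until [ -e "${marker}" ]; do
    tries=$((tries + 1))
    if [ "${tries}" -ge 50 ]; then
      echo "DNS watcher: ERROR - timed out waiting for router" >&2
      return 1
    fi
    sleep 0.1
  done
  echo "DNS watcher: records created"
}

# Start the watcher in the background, run the installer, then collect both results.
run_install() {
  local marker="$1" watcher_pid install_rc=0 watcher_rc=0
  fake_dns_watcher "${marker}" &
  watcher_pid=$!
  fake_installer "${marker}" || install_rc=$?
  wait "${watcher_pid}" || watcher_rc=$?
  # Fail the step if either the installer or the watcher failed.
  if [ "${install_rc}" -ne 0 ] || [ "${watcher_rc}" -ne 0 ]; then
    return 1
  fi
}
```

Waiting on the watcher's PID explicitly, rather than on all jobs, is what lets the caller see a watcher failure instead of silently ignoring it.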

Summary by CodeRabbit

  • Refactor
    • DNS record creation now runs concurrently with cluster installation, speeding setup.
    • Route53 record type selection varies by cluster type (A/ALIAS vs CNAME) with explicit handling for unsupported types.
    • Installer exit handling and background-job cleanup have been reworked to reliably record final install status and terminate watchers.
    • Cluster credentials are now applied only after install completion for safer sequencing.
  • Bug Fixes
    • Installation fails if the concurrent DNS watcher exits with an error, preventing silent failures.

coderabbitai Bot (Contributor) commented Apr 17, 2026

Walkthrough

prepare_next_steps() now accepts an exit-code argument and writes it to install-status.txt. Synchronous router LB polling and wait_router_lb_provision() were removed. A background DNS watcher (create_ingress_dns_record()) and CFN template writer (write_ingress_cfn_template()) were added to create Route53 records; trap/cleanup logic was reworked to terminate background jobs and call prepare_next_steps(exit_code).

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| DNS Record Management & Installer Control Flow (ci-operator/step-registry/ipi/install/install/aws/ipi-install-install-aws-commands.sh) | Removed wait_router_lb_provision(). Added write_ingress_cfn_template(cfn_template) (creates Route53 CFN template per CLUSTER_TYPE) and create_ingress_dns_record() (background watcher that waits for kubeconfig, router LB hostname, resolves hosted zone, writes CFN, creates ${CLUSTER_NAME}-apps-dns stack and waits for completion). Modified installer flow to start the DNS watcher in background when ADD_INGRESS_RECORDS_MANUALLY=yes and to wait for its exit later. Moved KUBECONFIG/AWS creds handling into the watcher and only export KUBECONFIG at the install-complete phase. |
| Exit/Trap Handling & Next-step Persistence (ci-operator/step-registry/ipi/install/install/aws/ipi-install-install-aws-commands.sh) | Replaced previous TERM/EXIT trap logic with a cleanup() function registered for EXIT and TERM that captures the exit code, kills background jobs, calls prepare_next_steps(exit_code), removes the EXIT trap, and exits. Updated prepare_next_steps() signature to accept exit_code_arg and persist that value to install-status.txt. |
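
A minimal sketch of the trap pattern summarized above, assuming bash. The function bodies are illustrative only, and `STATUS_DIR` is a hypothetical variable naming the directory where install-status.txt is written; the real script may differ.

```shell
#!/usr/bin/env bash

# Persist the final exit code for later steps.
# STATUS_DIR is a hypothetical location, defaulting to the current directory.
prepare_next_steps() {
  local exit_code_arg="$1"
  echo "${exit_code_arg}" > "${STATUS_DIR:-.}/install-status.txt"
}

cleanup() {
  # Capture the exit code first, before any other command overwrites $?.
  local exit_code=$?
  local children
  children="$(jobs -p)"
  if [ -n "${children}" ]; then
    # Intentionally unquoted so that multiple PIDs are split into separate arguments.
    kill ${children} 2>/dev/null || true
    wait || true
  fi
  prepare_next_steps "${exit_code}"
  # Remove the EXIT trap so that the exit below does not run cleanup a second time.
  trap - EXIT
  exit "${exit_code}"
}
```

Registering the handler is then a single line, `trap cleanup EXIT TERM`, placed before any background job is started.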

Sequence Diagram(s)

sequenceDiagram
    participant Installer as Installer Process
    participant DNS as DNS Watcher
    participant K8s as Kubernetes API
    participant LB as Router Service
    participant Route53 as Route53
    participant CFN as CloudFormation

    Installer->>Installer: Launch installer (create cluster)
    Installer->>DNS: Start create_ingress_dns_record() (background)
    par Installer continues / DNS watcher runs
        Installer->>Installer: Installer proceeds (install completes later)
        DNS->>K8s: Poll for kubeconfig until available
        DNS->>LB: Query router-default service for LB hostname
        DNS->>Route53: Resolve private hosted zone ID
        DNS->>DNS: write_ingress_cfn_template(cfn_template)
        DNS->>CFN: Create `${CLUSTER_NAME}-apps-dns` CloudFormation stack
        CFN-->>DNS: Stack creation completes
    end
    Installer->>DNS: Wait for DNS watcher PID to exit
    DNS-->>Installer: Exit status
    Installer->>Installer: Call prepare_next_steps(exit_code) and finish

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 9 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (9 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly addresses the main change: fixing the ADD_INGRESS_RECORDS_MANUALLY=yes timeout by implementing a background DNS record creation process instead of waiting for the installer to complete. |
| Stable And Deterministic Test Names | ✅ Passed | Repository contains only bash shell scripts with no Ginkgo test files present. |
| Test Structure And Quality | ✅ Passed | The PR modifies a bash shell script for AWS infrastructure provisioning with no Ginkgo test code present, making test quality requirements inapplicable. |
| Microshift Test Compatibility | ✅ Passed | PR modifies only bash shell script for AWS CI infrastructure; no Ginkgo e2e tests added. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | This PR modifies only a bash script for AWS IPI cluster installation and does not add any new Ginkgo e2e tests, making the SNO compatibility check not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | Script automates AWS cluster setup and DNS record creation without deploying manifests, operators, or scheduling constraints that could break on non-standard topologies. |
| Ote Binary Stdout Contract | ✅ Passed | The custom check for OTE Binary Stdout Contract is not applicable to this PR, which only modifies a bash shell script used for AWS cluster installation, not any OTE binaries or Go test code. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | Pull request modifies only a Bash shell script for AWS cluster installation, containing no Ginkgo e2e test constructs. Custom check for IPv6 and disconnected network compatibility does not apply. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


openshift-ci Bot requested review from sosiouxme and xueqzhan on April 17, 2026 10:57

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@ci-operator/step-registry/ipi/install/install/aws/ipi-install-install-aws-commands.sh`:
- Around line 115-199: The if/elif currently checks CLUSTER_TYPE only for exact
values "aws" and "aws-arm64", which excludes other AWS partitions like
"aws-eusc"; update the conditional logic around the CloudFormation template
generation (the if that writes to "${cfn_template}" and the corresponding elif
for "aws-usgov") to use a pattern match for any AWS partition (e.g., test
CLUSTER_TYPE starts with "aws") for the general AWS template path and keep the
explicit "aws-usgov" branch for GovCloud; apply the same change to the other
manual-ingress CFN-template branch in the file that duplicates this logic so all
AWS partitions (including aws-eusc) are supported when
ADD_INGRESS_RECORDS_MANUALLY is used.
- Around line 358-359: The TERM trap currently calls prepare_next_steps which
causes cleanup to run twice and can overwrite the original exit code; change the
TERM trap to only reap and kill background children (keep the jobs -p; kill ...;
wait logic) and remove prepare_next_steps from the TERM handler so that
prepare_next_steps is invoked only from the EXIT trap; look for the trap lines
and update the TERM trap to not call prepare_next_steps while leaving the
existing EXIT trap that calls prepare_next_steps intact.
- Around line 241-248: The current lookup for the private Route53 hosted zone
can return the wrong zone; update the logic around private_route53_hostzone_name
and private_route53_hostzone_id to perform an exact name match and require
PrivateZone=true: call aws route53 list-hosted-zones (or
list-hosted-zones-by-name but compare HostedZones[].Name ==
"${private_route53_hostzone_name}." with the trailing dot) and filter
HostedZones[] where .PrivateZone == true, then extract the .Id (strip the
/hostedzone/ prefix) into private_route53_hostzone_id and keep the existing
error return if empty; update any jq/awk pipeline around
private_route53_hostzone_id to reflect this exact-match+PrivateZone filtering.
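
One way to express the pattern match on CLUSTER_TYPE, together with an explicit failure for anything unexpected, is a `case` statement. This is only a sketch: `cfn_template_kind` and the labels it prints are hypothetical, and whether every `aws-*` partition can really share the general template is an assumption that would need checking against the script.

```shell
#!/usr/bin/env bash

# Hypothetical helper: decide which kind of CFN template applies to a CLUSTER_TYPE.
cfn_template_kind() {
  local cluster_type="${1:-}"
  case "${cluster_type}" in
    aws-usgov)
      # GovCloud keeps its own explicit branch; it must come before the aws-* pattern.
      echo "govcloud"
      ;;
    aws | aws-*)
      # Assumption: all other AWS partitions (aws-arm64, aws-eusc, ...) share one template.
      echo "general"
      ;;
    *)
      echo "ERROR - unsupported CLUSTER_TYPE: ${cluster_type}" >&2
      return 1
      ;;
  esac
}
```

Because `case` takes the first matching pattern, listing `aws-usgov` first keeps the GovCloud branch explicit while the catch-all fails fast.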

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: a6f271fb-88f1-49bf-af5e-0906e03bd151

📥 Commits

Reviewing files that changed from the base of the PR and between 0368f89 and 5730608.

📒 Files selected for processing (1)
  • ci-operator/step-registry/ipi/install/install/aws/ipi-install-install-aws-commands.sh

mdbooth force-pushed the aws-manual-ingress-timeout branch from 5730608 to a64fda5 on April 17, 2026 11:33

coderabbitai Bot (Contributor) left a comment

🧹 Nitpick comments (2)
ci-operator/step-registry/ipi/install/install/aws/ipi-install-install-aws-commands.sh (2)

261-285: Consider adding an else clause for defensive coding.

Currently, if CLUSTER_TYPE is not one of aws, aws-arm64, or aws-usgov, the stack creation is silently skipped (though write_ingress_cfn_template would have already failed earlier at line 254). Adding an explicit else clause would make the code more maintainable and catch any future logic drift.

Suggested addition
     elif [ "${CLUSTER_TYPE}" == "aws-usgov" ]; then
         aws --region "${REGION}" cloudformation create-stack --stack-name "${APPS_DNS_STACK_NAME}" \
             ...
             --capabilities CAPABILITY_NAMED_IAM || return 1
+    else
+        echo "DNS watcher: ERROR - unsupported CLUSTER_TYPE for stack creation: ${CLUSTER_TYPE}"
+        return 1
     fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/ipi/install/install/aws/ipi-install-install-aws-commands.sh`
around lines 261 - 285, The code path handling CLUSTER_TYPE covers
"aws"/"aws-arm64" and "aws-usgov" but silently skips stack creation for any
other value; add an explicit else branch after the existing if/elif to fail fast
and log/return an error. Update the conditional block that sets
router_lb_hostzone_id and calls aws cloudformation (referencing CLUSTER_TYPE,
router_lb_hostzone_id, APPS_DNS_STACK_NAME and write_ingress_cfn_template) to
include a final else that prints a clear error (via echo or process logger)
mentioning the unexpected CLUSTER_TYPE and returns a non-zero status (e.g.,
return 1) so the caller immediately sees the misconfiguration.

359-375: Good cleanup implementation with proper exit code handling.

The trap flow correctly:

  1. Captures exit code before any commands modify $?
  2. Reaps background children
  3. Passes the original exit code to prepare_next_steps
  4. Removes the EXIT trap to prevent double invocation

One minor robustness note: consider adding || true to wait in case some children have already exited, though this is not strictly necessary since we're in cleanup.

Optional: more defensive error suppression
   if test -n "${CHILDREN}"; then
-    kill ${CHILDREN} 2>/dev/null
-    wait
+    kill ${CHILDREN} 2>/dev/null || true
+    wait || true
   fi

Note: The static analysis hint SC2329 about cleanup never being invoked is a false positive — it's invoked via the trap at line 375. The SC2086 hint about quoting ${CHILDREN} is intentionally unquoted to allow word splitting of multiple PIDs.
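
For context, here is a small sketch of how `wait` reports status, based on the documented shell behavior: `wait <pid>` returns that job's exit status, while a bare `wait` with no arguments returns zero once all jobs have finished. The helper names are hypothetical. If that holds for the shell in use, the `|| true` guard on a bare `wait` is harmless rather than strictly required, but it becomes important as soon as a PID is passed.

```shell
#!/usr/bin/env bash

# Prints the status reported by 'wait <pid>' for a job that exits with code $1.
status_of_wait_pid() {
  local rc=0 pid
  ( exit "$1" ) &
  pid=$!
  wait "${pid}" || rc=$?
  echo "${rc}"
}

# Prints the status reported by a bare 'wait' after a job exits with code $1.
status_of_bare_wait() {
  local rc=0
  ( exit "$1" ) &
  wait || rc=$?
  echo "${rc}"
}
```

Both helpers capture the status with `|| rc=$?`, so they are safe to call from a script running under errexit.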

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@ci-operator/step-registry/ipi/install/install/aws/ipi-install-install-aws-commands.sh`
around lines 359 - 375, The cleanup function should tolerate "wait" returning
non-zero if children already exited; update the wait call in function cleanup
(where CHILDREN is populated and kill is invoked) to suppress errors (e.g.,
change the standalone wait to a tolerant form like wait || true) so cleanup
still proceeds to call prepare_next_steps with the captured exit_code and remove
the EXIT trap; keep the rest of the logic in cleanup and the trap cleanup EXIT
TERM unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 2ce4721c-01c1-42b9-a697-8184651d6420

📥 Commits

Reviewing files that changed from the base of the PR and between 5730608 and a64fda5.

📒 Files selected for processing (1)
  • ci-operator/step-registry/ipi/install/install/aws/ipi-install-install-aws-commands.sh

mdbooth force-pushed the aws-manual-ingress-timeout branch from a64fda5 to ff58de6 on April 17, 2026 11:46

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@ci-operator/step-registry/ipi/install/install/aws/ipi-install-install-aws-commands.sh`:
- Around line 369-373: In cleanup(), the 'wait' after killing background
CHILDREN can return non-zero and, under errexit, abort the function before
prepare_next_steps runs; change the pattern to tolerate non-zero job exits
(e.g., replace 'wait' with 'wait || true' or capture and ignore the wait status)
so that cleanup() always proceeds to call prepare_next_steps and persist
artifacts/status; locate the block that uses CHILDREN=$(jobs -p) and adjust the
wait handling in the cleanup() function accordingly.
- Around line 250-256: The DNS watcher references unset variables (match_count,
zone_fqdn, hz_json) which will break under set -o nounset; update the
hosted-zone selection block (the code around the private_route53_hostzone_id
assignment and the conditional using match_count) to first validate/initialize
these variables: test that hz_json is non-empty before running jq, ensure
zone_fqdn is defined (or fail with a clear error log and return 1), and treat
match_count safely (use default 0 or compute it only after hz_json is verified)
so the echo/return logic and the jq + awk invocation do not run on unset values;
add explicit error logging and return codes when any required input is missing.
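
A sketch of a hosted-zone selection that validates its inputs before use and requires an exact name match on a private zone. It assumes the zone list arrives on stdin as tab-separated `Id`, `Name`, `PrivateZone` lines, for example from `aws route53 list-hosted-zones --query 'HostedZones[].[Id,Name,Config.PrivateZone]' --output text`. `pick_private_zone_id` is a hypothetical helper, and the variable names only echo those mentioned in the comment above, not the script itself.

```shell
#!/usr/bin/env bash

# Reads "Id<TAB>Name<TAB>PrivateZone" lines on stdin and prints the bare ID of the
# single private zone whose name exactly matches $1. Safe under 'set -o nounset'.
pick_private_zone_id() {
  local zone_name="${1:-}" zone_fqdn match_count=0 match_id="" tab id name private
  if [ -z "${zone_name}" ]; then
    echo "DNS watcher: ERROR - no hosted zone name given" >&2
    return 1
  fi
  zone_fqdn="${zone_name%.}."   # Route53 zone names carry a trailing dot
  tab="$(printf '\t')"
  while IFS="${tab}" read -r id name private; do
    case "${private}" in
      True | true)
        if [ "${name}" = "${zone_fqdn}" ]; then
          match_count=$((match_count + 1))
          match_id="${id#/hostedzone/}"   # strip the /hostedzone/ prefix
        fi
        ;;
    esac
  done
  if [ "${match_count}" -ne 1 ]; then
    echo "DNS watcher: ERROR - expected 1 private zone ${zone_fqdn}, found ${match_count}" >&2
    return 1
  fi
  echo "${match_id}"
}
```

Every variable is initialized before it is read, so an empty zone list or a missing argument produces a clear error and a non-zero return rather than an unbound-variable failure.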

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 0a6c2d49-d459-4baf-af42-db49b6fc6077

📥 Commits

Reviewing files that changed from the base of the PR and between a64fda5 and ff58de6.

📒 Files selected for processing (1)
  • ci-operator/step-registry/ipi/install/install/aws/ipi-install-install-aws-commands.sh

mdbooth (Contributor Author) commented Apr 17, 2026

The first two jobs should run without ADD_INGRESS_RECORDS_MANUALLY; the last two should set it.

/pj-rehearse cucushift-installer-rehearse-aws-ipi-edge-zone-rootvolume cucushift-installer-rehearse-aws-ipi-edge-zone-byo-subnet-ovn-day2 cucushift-installer-rehearse-aws-ipi-disconnected-private-cco-manual-security-token-service-private-s3-with-ep-sts-ec2-elb cucushift-installer-rehearse-aws-ipi-disconnected-private-edge-zone

openshift-merge-bot (Contributor) commented

@mdbooth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

mdbooth (Contributor Author) commented Apr 17, 2026

/test ci-operator-config ci-operator-registry owners step-registry-metadata step-registry-shellcheck

mdbooth (Contributor Author) commented Apr 17, 2026

/test step-registry-metadata

openshift-merge-bot (Contributor) commented

@mdbooth: job(s): cucushift-installer-rehearse-aws-ipi-edge-zone-rootvolume, cucushift-installer-rehearse-aws-ipi-edge-zone-byo-subnet-ovn-day2, cucushift-installer-rehearse-aws-ipi-disconnected-private-cco-manual-security-token-service-private-s3-with-ep-sts-ec2-elb, cucushift-installer-rehearse-aws-ipi-disconnected-private-edge-zone either don't exist or were not found to be affected, and cannot be rehearsed

mdbooth (Contributor Author) commented Apr 17, 2026

/pj-rehearse

openshift-merge-bot (Contributor) commented

@mdbooth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

mdbooth (Contributor Author) commented Apr 17, 2026

/pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-aws-ipi-disc-priv-sts-fips-dualstack-ipv6-mini-perm-tp-f7 periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-aws-ipi-localzone-byo-subnet-ovn-day2-f7

openshift-merge-bot (Contributor) commented

@mdbooth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

openshift-ci Bot (Contributor) commented Apr 18, 2026

@mdbooth: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/periodic-ci-quay-quay-tests-master-omr-ocp414-quay-omr-tests-omr-ocp414-disconnected | e17256e | link | unknown | /pj-rehearse periodic-ci-quay-quay-tests-master-omr-ocp414-quay-omr-tests-omr-ocp414-disconnected |
| ci/rehearse/periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-aws-ipi-localzone-byo-subnet-ovn-day2-f7 | e17256e | link | unknown | /pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-aws-ipi-localzone-byo-subnet-ovn-day2-f7 |
| ci/rehearse/periodic-ci-openshift-openshift-tests-private-release-4.19-amd64-nightly-4.19-upgrade-from-stable-4.18-aws-ipi-disc-priv-f28 | e17256e | link | unknown | /pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.19-amd64-nightly-4.19-upgrade-from-stable-4.18-aws-ipi-disc-priv-f28 |
| ci/rehearse/periodic-ci-openshift-openshift-tests-private-release-4.14-amd64-nightly-aws-ipi-disc-priv-localzone-fips-f60-destructive | e17256e | link | unknown | /pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.14-amd64-nightly-aws-ipi-disc-priv-localzone-fips-f60-destructive |
| ci/rehearse/periodic-ci-openshift-openshift-tests-private-release-4.16-amd64-nightly-aws-ipi-disc-priv-sts-ep-fips-f28-destructive | e17256e | link | unknown | /pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.16-amd64-nightly-aws-ipi-disc-priv-sts-ep-fips-f28-destructive |
| ci/rehearse/periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-aws-ipi-wavelength-fips-mini-perm-f14 | e17256e | link | unknown | /pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-aws-ipi-wavelength-fips-mini-perm-f14 |
| ci/rehearse/periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-aws-ipi-disc-priv-sts-fips-dualstack-ipv6-mini-perm-tp-f7 | e17256e | link | unknown | /pj-rehearse periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-aws-ipi-disc-priv-sts-fips-dualstack-ipv6-mini-perm-tp-f7 |

When ADD_INGRESS_RECORDS_MANUALLY=yes we explicitly remove zone creation
from the manifests. This prevents the ingress operator from coming up,
which means that 'create cluster' always times out. We were previously
creating the DNS records after cluster create timed out and then running
'wait-for install-complete', so it still succeeded eventually.

Ideally we would create the ingress records in advance, but because the
LB is created during installation we have no way of knowing what they
should be. There is also currently no mechanism to precreate a
loadbalancer and pass it to the installer. To avoid the timeout we need
a workaround.

With this change, instead of running after create cluster and assuming
that router-default has been created, we kick off a process that polls
for its creation. We create the DNS records immediately, during cluster
create, which means we don't have to wait for the timeout. This saves
20-30 minutes per run in testing.

mdbooth force-pushed the aws-manual-ingress-timeout branch from e17256e to 10ad89a on April 30, 2026 12:31

openshift-merge-bot (Contributor) commented

[REHEARSALNOTIFIER]
@mdbooth: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| pull-ci-openshift-cluster-capi-operator-main-e2e-aws-capi-disconnected-techpreview | openshift/cluster-capi-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-capi-operator-release-5.1-e2e-aws-capi-disconnected-techpreview | openshift/cluster-capi-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-capi-operator-release-5.0-e2e-aws-capi-disconnected-techpreview | openshift/cluster-capi-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-capi-operator-release-4.23-e2e-aws-capi-disconnected-techpreview | openshift/cluster-capi-operator | presubmit | Registry content changed |
| pull-ci-openshift-priv-cluster-capi-operator-main-e2e-aws-capi-disconnected-techpreview | openshift-priv/cluster-capi-operator | presubmit | Registry content changed |
| pull-ci-openshift-priv-cluster-capi-operator-release-5.1-e2e-aws-capi-disconnected-techpreview | openshift-priv/cluster-capi-operator | presubmit | Registry content changed |
| pull-ci-openshift-priv-cluster-capi-operator-release-5.0-e2e-aws-capi-disconnected-techpreview | openshift-priv/cluster-capi-operator | presubmit | Registry content changed |
| pull-ci-openshift-priv-cluster-capi-operator-release-4.23-e2e-aws-capi-disconnected-techpreview | openshift-priv/cluster-capi-operator | presubmit | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.20-amd64-nightly-aws-c2s-ipi-disc-priv-fips-f28-tp-longduration-cloud | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.16-amd64-nightly-aws-ipi-disc-priv-localzone-fips-f14-nokubeadmin | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.19-arm64-nightly-4.19-upgrade-from-stable-4.19-aws-ipi-disc-priv-sts-ep-mini-perm-f60 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.21-arm64-nightly-4.21-upgrade-from-stable-4.20-aws-ipi-localzone-byo-subnet-role-sts-kms-mini-perm-f28 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.15-amd64-nightly-aws-ipi-disc-priv-sts-ep-fips-f60-destructive | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-aws-ipi-localzone-fips-mini-perm-f28-destructive | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.15-amd64-nightly-aws-ipi-disc-priv-sts-ep-fips-f60 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.19-amd64-nightly-aws-ipi-disc-priv-sts-ep-fips-f14 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.13-arm64-nightly-aws-ipi-disc-priv-sts-efs-f60-destructive | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.21-arm64-stable-aws-ipi-disc-priv-sts-ep-f28 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.16-arm64-nightly-aws-ipi-disc-priv-sts-ep-f28 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-5.0-multi-nightly-aws-ipi-disc-priv-amd-mixarch-f28-destructive | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.20-amd64-nightly-aws-ipi-localzone-fips-mini-perm-f14 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.20-amd64-nightly-aws-ipi-wavelength-byo-subnet-ovn-mtu-day2-f14 | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.15-arm64-nightly-4.15-upgrade-from-stable-4.14-aws-ipi-disc-priv-sts-ep-f60 | N/A | periodic | Registry content changed |
| periodic-ci-quay-quay-tests-master-omr-ocp420-unreleased-quay-omr-tests-omr-ocp420-disconnected-unreleased | N/A | periodic | Registry content changed |
| periodic-ci-openshift-openshift-tests-private-release-4.12-amd64-nightly-aws-ipi-disc-priv-sts-ep-fips-f60-destructive | N/A | periodic | Registry content changed |

A total of 854 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.

A full list of affected jobs can be found here

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

mdbooth (Contributor Author) commented May 1, 2026

periodic-ci-openshift-openshift-tests-private-release-4.19-amd64-nightly-4.19-upgrade-from-stable-4.18-aws-ipi-disc-priv-f28 uses ADD_INGRESS_RECORDS_MANUALLY=yes. It exercises the new code path. The installation succeeded. The job fails later for unrelated reasons. This is verified.

periodic-ci-openshift-openshift-tests-private-release-4.22-amd64-nightly-aws-ipi-localzone-byo-subnet-ovn-day2-f7 does not set ADD_INGRESS_RECORDS_MANUALLY=yes. It exercises the default codepath. The installation succeeded. The job fails later for unrelated reasons. This is verified.

/pj-rehearse ack

openshift-merge-bot (Contributor) commented

@mdbooth: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

openshift-merge-bot Bot added the rehearsals-ack label (Signifies that rehearsal jobs have been acknowledged) on May 1, 2026

mdbooth (Contributor Author) commented May 1, 2026

/assign xueqzhan

Are you able to approve this? This looks big, but it's almost entirely code motion.

petr-muller (Member) commented

/approve

openshift-ci Bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) on May 6, 2026

damdo (Member) left a comment

/lgtm

thanks!

openshift-ci Bot added the lgtm label (Indicates that a PR is ready to be merged.) on May 7, 2026

openshift-ci Bot (Contributor) commented May 7, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: damdo, mdbooth, petr-muller


openshift-merge-bot Bot merged commit 039b946 into openshift:main on May 7, 2026
10 checks passed