
GCP-413: add image registry v2 e2e tests for hosted clusters #8412

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:main from cblecker:worktree-dapper-munching-fiddle
May 8, 2026

Conversation

@cblecker
Member

@cblecker cblecker commented May 4, 2026

Summary

  • Adds a platform-generic image registry e2e test suite to the v2 framework, validating ClusterOperator health, installer-cloud-credentials secret, storage configuration, and cluster-image-registry-operator deployment readiness
  • Adds GCP-specific sub-context validating GCS bucket name and WIF credentials in the hosted-cluster-side installer-cloud-credentials secret
  • Ports the disabled-capability test from v1 (EnsureImageRegistryCapabilityDisabled) as ImageRegistryCapabilityDisabledTest
  • Registers imageregistryv1 in support/api/scheme.go so the hosted cluster client can query imageregistryv1.Config objects

Test plan

  • Confirm go vet -tags e2ev2 ./test/e2e/v2/... passes
  • Confirm make lint passes
  • Run against a GCP hosted cluster with ImageRegistry capability enabled and verify all specs pass
  • Run against a hosted cluster with ImageRegistry capability disabled and verify the disabled-path specs pass and enabled-path specs skip

Summary by CodeRabbit

  • New Features

    • Added support for the OpenShift ImageRegistry API group to enable ImageRegistry-related resources in hosted clusters.
  • Tests

    • Added end-to-end tests covering ImageRegistry enabled/disabled states, operator and deployment readiness, credential validation, and platform-specific storage backend checks.

@openshift-merge-bot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci
Contributor

openshift-ci Bot commented May 4, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 4, 2026
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 4, 2026
@openshift-ci-robot

openshift-ci-robot commented May 4, 2026

@cblecker: This pull request references GCP-413 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Summary

  • Adds a platform-generic image registry e2e test suite to the v2 framework, validating ClusterOperator health, installer-cloud-credentials secret, storage configuration, and cluster-image-registry-operator deployment readiness
  • Adds GCP-specific sub-context validating GCS bucket name and WIF credentials in both the management-side (image-registry-creds) and hosted-cluster-side (installer-cloud-credentials) secrets
  • Ports the disabled-capability test from v1 (EnsureImageRegistryCapabilityDisabled) as ImageRegistryCapabilityDisabledTest
  • Registers imageregistryv1 in support/api/scheme.go so the hosted cluster client can query imageregistryv1.Config objects

Test plan

  • Confirm go build -tags e2ev2 ./test/e2e/v2/... passes
  • Confirm go vet -tags e2ev2 ./test/e2e/v2/... passes
  • Confirm make lint passes
  • Run against a GCP hosted cluster with ImageRegistry capability enabled and verify all specs pass
  • Run against a hosted cluster with ImageRegistry capability disabled and verify the disabled-path specs pass and enabled-path specs skip

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added the do-not-merge/needs-area, area/control-plane-operator (Indicates the PR includes changes for the control plane operator - in an OCP release), area/hypershift-operator (Indicates the PR includes changes for the hypershift operator and API - outside an OCP release), and area/testing (Indicates the PR includes changes for e2e testing) labels and removed the do-not-merge/needs-area label May 4, 2026
@openshift-ci
Contributor

openshift-ci Bot commented May 4, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cblecker

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 4, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 4, 2026

📝 Walkthrough


Registers the OpenShift imageregistry v1 API group in the runtime scheme by importing imageregistryv1 and calling AddToScheme. Adds an e2e test suite for HostedCluster ImageRegistry with ordered scenarios for enabled and disabled capability states. Enabled tests wait for the image-registry ClusterOperator conditions, verify installer-cloud-credentials secret, assert the imageregistry Config has storage backends, and check the cluster-image-registry-operator Deployment; on GCP they also validate GCS bucket config and WIF service-account credentials. Disabled tests assert operator/namespace absence and empty default ImagePullSecrets.
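To make the GCP-specific credential check more concrete, here is a minimal standalone sketch of what validating the `service_account.json` payload could look like. It assumes the payload is a Workload Identity Federation credential config of type `external_account` with a non-empty `service_account_impersonation_url`. The struct and function names are hypothetical, the sample URL is a placeholder, and this is not the code in hosted_cluster_image_registry_test.go, which may assert additional fields.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// externalAccountCreds models only the two fields this sketch inspects.
// The name is hypothetical; it is not a type from the PR.
type externalAccountCreds struct {
	Type                           string `json:"type"`
	ServiceAccountImpersonationURL string `json:"service_account_impersonation_url"`
}

// validateWIFCredentials returns an error unless raw looks like an
// external_account credential config with an impersonation URL.
func validateWIFCredentials(raw []byte) error {
	var creds externalAccountCreds
	if err := json.Unmarshal(raw, &creds); err != nil {
		return fmt.Errorf("parsing service_account.json: %w", err)
	}
	if creds.Type != "external_account" {
		return fmt.Errorf("credential type is %q, want %q", creds.Type, "external_account")
	}
	if strings.TrimSpace(creds.ServiceAccountImpersonationURL) == "" {
		return errors.New("service_account_impersonation_url is empty")
	}
	return nil
}

func main() {
	// Placeholder payload; a real secret would carry more fields.
	sample := []byte(`{"type":"external_account","service_account_impersonation_url":"https://example.invalid/impersonate"}`)
	if err := validateWIFCredentials(sample); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println("credentials look like WIF external_account config")
}
```

Checking the `type` field first distinguishes federated credentials from a long-lived service account key, which is the property the GCP sub-context appears to care about.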

Sequence Diagram(s)

sequenceDiagram
    participant Test as ImageRegistry Test
    participant HC as HostedCluster API
    participant CO as ClusterOperator(image-registry)
    participant Secret as Secret(installer-cloud-credentials)
    participant Config as imageregistry Config
    participant Deploy as Deployment(cluster-image-registry-operator)
    participant GCS as GCS (GCP only)

    Test->>HC: Read HostedCluster capabilities
    alt Capability enabled
        Test->>CO: Wait for Available && not Degraded
        Test->>Secret: Check exists and has data
        Test->>Config: Read imageregistry Config -> verify >=1 storage backend
        Test->>Deploy: Check Deployment has >=1 ready replica
        alt Platform is GCP
            Test->>Config: Verify storage uses GCS with bucket
            Test->>Secret: Read service_account.json
            Test->>GCS: Validate external_account creds and impersonation URL
        end
    else Capability disabled
        Test->>CO: Expect ClusterOperator not found
        Test->>HC: Expect openshift-image-registry namespace not found
        Test->>HC: Verify default ServiceAccount ImagePullSecrets empty in existing and new namespaces
    end
🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Microshift Test Compatibility | ⚠️ Warning | The new e2e tests use OpenShift-specific APIs unavailable on MicroShift, lacking MicroShift compatibility protection mechanisms. | Add [apigroup:config.openshift.io] and [apigroup:imageregistry.operator.openshift.io] tags or use IsMicroShiftCluster() checks to skip tests on MicroShift. |

✅ Passed checks (11 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically summarizes the main change: adding image registry v2 e2e tests for hosted clusters, which matches the primary focus of the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All Ginkgo test names are stable and deterministic with static descriptive strings. Dynamic values are properly placed in test bodies, not titles. |
| Test Structure And Quality | ✅ Passed | Tests demonstrate excellent quality across all five requirements with single responsibility, proper setup/cleanup using DeferCleanup, appropriate timeouts with WithTimeout and WithPolling, comprehensive assertion messages for all expectations, and consistent patterns with existing codebase tests. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | The new image registry e2e tests verify operator health and configuration readiness without requiring multi-node topology. Tests check for at least one ready replica and ClusterOperator availability, both compatible with SNO's single-node control plane and worker design. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | The pull request does not introduce deployment manifests, operator code, or controller implementations that establish scheduling constraints. |
| Ote Binary Stdout Contract | ✅ Passed | PR changes add Kubernetes scheme registration and Ginkgo test cases that do not produce process-level stdout writes. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | New e2e tests use only cluster-internal Kubernetes APIs with no hardcoded IPv4 addresses, external connectivity requirements, or IPv6-specific parsing logic, ensuring full IPv6-only and disconnected environment compatibility. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@cblecker cblecker marked this pull request as ready for review May 4, 2026 20:41
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 4, 2026
@openshift-ci openshift-ci Bot requested review from csrwng and enxebre May 4, 2026 20:41
@cblecker
Member Author

cblecker commented May 4, 2026

/test e2e-v2-aws e2e-v2-gke

@codecov

codecov Bot commented May 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 37.23%. Comparing base (68106f0) to head (baaf52e).
⚠️ Report is 67 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8412   +/-   ##
=======================================
  Coverage   37.22%   37.23%           
=======================================
  Files         750      752    +2     
  Lines       91789    91830   +41     
=======================================
+ Hits        34172    34196   +24     
- Misses      54978    54993   +15     
- Partials     2639     2641    +2     
| Files with missing lines | Coverage Δ |
|---|---|
| support/api/scheme.go | 89.74% <100.00%> (+0.08%) ⬆️ |

... and 6 files with indirect coverage changes

| Flag | Coverage Δ |
|---|---|
| cmd-support | 32.07% <100.00%> (+<0.01%) ⬆️ |
| cpo-hostedcontrolplane | 36.50% <ø> (+0.05%) ⬆️ |
| cpo-other | 37.73% <ø> (ø) |
| hypershift-operator | 47.85% <ø> (ø) |
| other | 27.77% <ø> (ø) |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
test/e2e/v2/tests/hosted_cluster_image_registry_test.go (1)

126-133: ⚡ Quick win

Prefer eventual readiness check for operator deployment.

A single-read ReadyReplicas >= 1 assertion can be timing-sensitive in e2e runs. Polling reduces transient failures.

Suggested refactor
 		It("should have a ready cluster-image-registry-operator deployment", func() {
-			deployment := &appsv1.Deployment{}
-			Expect(tc.MgmtClient.Get(tc.Context, crclient.ObjectKey{
-				Namespace: tc.ControlPlaneNamespace,
-				Name:      "cluster-image-registry-operator",
-			}, deployment)).To(Succeed())
-			Expect(deployment.Status.ReadyReplicas).To(BeNumerically(">=", 1),
-				"cluster-image-registry-operator deployment should have at least one ready replica")
+			Eventually(func(g Gomega) {
+				deployment := &appsv1.Deployment{}
+				g.Expect(tc.MgmtClient.Get(tc.Context, crclient.ObjectKey{
+					Namespace: tc.ControlPlaneNamespace,
+					Name:      "cluster-image-registry-operator",
+				}, deployment)).To(Succeed())
+				g.Expect(deployment.Status.ReadyReplicas).To(BeNumerically(">=", 1),
+					"cluster-image-registry-operator deployment should have at least one ready replica")
+			}).WithTimeout(5 * time.Minute).WithPolling(15 * time.Second).Should(Succeed())
 		})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/e2e/v2/tests/hosted_cluster_image_registry_test.go` around lines 126 -
133, The test currently asserts deployment.Status.ReadyReplicas >= 1 once which
is flaky; replace the single check with a polling assertion that retries until
the operator is ready. Use Gomega's Eventually to repeatedly call
tc.MgmtClient.Get for the deployment (the existing deployment variable and
tc.Context/ tc.MgmtClient.Get) and check deployment.Status.ReadyReplicas >= 1
within a reasonable timeout and interval, ensuring the Get errors are handled
inside the polled function so transient API delays are retried until ready.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/e2e/v2/tests/hosted_cluster_image_registry_test.go`:
- Around line 92-95: The test currently only checks degraded.Status when the
Degraded condition exists, allowing a missing Degraded condition to pass; update
the assertion to first require the Degraded condition to exist (e.g., use
g.Expect(degraded).NotTo(BeNil()) or equivalent) and then assert that
degraded.Status is not configv1.ConditionTrue using the existing
g.Expect(...).NotTo(Equal(...)) check so the test fails if the Degraded
condition is absent or set to True.

---

Nitpick comments:
In `@test/e2e/v2/tests/hosted_cluster_image_registry_test.go`:
- Around line 126-133: The test currently asserts
deployment.Status.ReadyReplicas >= 1 once which is flaky; replace the single
check with a polling assertion that retries until the operator is ready. Use
Gomega's Eventually to repeatedly call tc.MgmtClient.Get for the deployment (the
existing deployment variable and tc.Context/ tc.MgmtClient.Get) and check
deployment.Status.ReadyReplicas >= 1 within a reasonable timeout and interval,
ensuring the Get errors are handled inside the polled function so transient API
delays are retried until ready.
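
The inline comment on lines 92-95 above comes down to one ordering rule: require the Degraded condition to exist, then require that it is not True. A minimal standalone sketch of that rule follows. The `condition` type and helper names are hypothetical stand-ins for the real ClusterOperator condition types and are used only so the example runs without cluster dependencies.

```go
package main

import (
	"errors"
	"fmt"
)

// condition is a simplified stand-in for a ClusterOperator status condition.
type condition struct {
	Type   string
	Status string // "True", "False", or "Unknown"
}

// findCondition returns a pointer to the first condition of the given type,
// or nil when no such condition is present.
func findCondition(conds []condition, condType string) *condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// checkNotDegraded fails when the Degraded condition is absent or True,
// so a missing condition can no longer pass silently.
func checkNotDegraded(conds []condition) error {
	degraded := findCondition(conds, "Degraded")
	if degraded == nil {
		return errors.New("condition Degraded is missing")
	}
	if degraded.Status == "True" {
		return errors.New("condition Degraded is True")
	}
	return nil
}

func main() {
	healthy := []condition{{Type: "Available", Status: "True"}, {Type: "Degraded", Status: "False"}}
	noDegraded := []condition{{Type: "Available", Status: "True"}}
	fmt.Println(checkNotDegraded(healthy))
	fmt.Println(checkNotDegraded(noDegraded))
}
```

Guarding only on `degraded != nil` before comparing the status would let the second input pass, which is the gap the review comment describes.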

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: f4ba3d63-4255-4380-95ef-310ce280cbce

📥 Commits

Reviewing files that changed from the base of the PR and between ad88854 and 503ad9e.

📒 Files selected for processing (2)
  • support/api/scheme.go
  • test/e2e/v2/tests/hosted_cluster_image_registry_test.go

Comment thread test/e2e/v2/tests/hosted_cluster_image_registry_test.go
@cblecker cblecker force-pushed the worktree-dapper-munching-fiddle branch from 503ad9e to 7d20309 Compare May 4, 2026 20:51
@cblecker
Member Author

cblecker commented May 4, 2026

/test e2e-v2-aws e2e-v2-gke

Add platform-generic image registry e2e tests to the v2 framework,
validating ClusterOperator health, installer-cloud-credentials,
storage configuration, and operator deployment readiness. Includes
GCP-specific sub-context for WIF credentials and GCS bucket
validation, and ports the disabled-capability test from v1.
Register imageregistryv1 scheme in support/api to enable querying
imageregistryv1.Config objects via the hosted cluster client.

Assisted-by: Claude:claude-sonnet-4-6[1m]
@cblecker cblecker force-pushed the worktree-dapper-munching-fiddle branch from f86c7ca to baaf52e Compare May 4, 2026 21:06
@cblecker
Member Author

cblecker commented May 4, 2026

/test e2e-v2-aws e2e-v2-gke

@cblecker
Member Author

cblecker commented May 7, 2026

/test e2e-v2-gke

@cblecker
Member Author

cblecker commented May 7, 2026

/verified by e2e

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label May 7, 2026
@openshift-ci-robot

@cblecker: This PR has been marked as verified by e2e.

Details

In response to this:

/verified by e2e


@openshift-ci-robot

openshift-ci-robot commented May 7, 2026

@cblecker: This pull request references GCP-413 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.


@jimdaga

jimdaga commented May 7, 2026

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 7, 2026
@openshift-merge-bot
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks-4-22
/test e2e-aws-4-22
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws

@openshift-merge-bot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 9638d44 and 2 for PR HEAD baaf52e in total

@cblecker
Member Author

cblecker commented May 7, 2026

/retest-required

@openshift-merge-bot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 37f46b9 and 1 for PR HEAD baaf52e in total

@openshift-merge-bot
Contributor

/retest-required

Remaining retests: 0 against base HEAD acc539c and 0 for PR HEAD baaf52e in total

@hypershift-jira-solve-ci

AI Test Failure Analysis

Job: pull-ci-openshift-hypershift-main-e2e-aws | Build: 2052705615389659136 | Cost: $2.1371037500000005 | Failed step: hypershift-aws-run-e2e-nested

View full analysis report


Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6

@cblecker
Member Author

cblecker commented May 8, 2026

/retest-required

@cblecker
Member Author

cblecker commented May 8, 2026

/retest

1 similar comment
@cblecker
Member Author

cblecker commented May 8, 2026

/retest

@hypershift-jira-solve-ci

AI Test Failure Analysis

Job: pull-ci-openshift-hypershift-main-e2e-aws | Build: 2052741517704957952 | Cost: $2.838862749999999 | Failed step: hypershift-aws-run-e2e-nested

View full analysis report


Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6

@hypershift-jira-solve-ci

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

TestCreateCluster, TestCreateClusterPrivate, TestCreateClusterPrivateWithRouteKAS:
  api error RequestLimitExceeded: Request limit exceeded. Account 820196288204 has been
  throttled on ec2:CreateVpcEndpoint because it exceeded its request rate limit.

TestKarpenter/Teardown:
  Failed to wait for infra resources in guest cluster to be deleted: context deadline exceeded
  Failed to clean up 9 remaining resources for guest cluster

Summary

All 5 test failures are unrelated to PR #8412 (which only adds test/e2e/v2/tests/hosted_cluster_image_registry_test.go and modifies support/api/scheme.go). Three failures (TestCreateCluster, TestCreateClusterPrivate, TestCreateClusterPrivateWithRouteKAS) are caused by AWS EC2 API throttling — the shared CI AWS account (820196288204) hit the ec2:CreateVpcEndpoint rate limit, returning HTTP 503 after 11 retry attempts. The remaining two failures (TestKarpenter, TestKarpenter/Teardown) are a teardown-only issue: all Karpenter functional subtests passed, but the teardown phase timed out waiting for 9 leftover AWS resources (8 EBS volumes + 1 NLB) to be deleted. This is a known CI infrastructure flake pattern, not a product regression.

Root Cause

Failure Group 1 — AWS API Throttling (3 tests):
The AWS account 820196288204 used by the HyperShift CI pool hit the EC2 CreateVpcEndpoint API rate limit. This is an infrastructure-level issue caused by too many concurrent CI jobs making EC2 API calls against the same account. All three cluster-creation tests (TestCreateCluster, TestCreateClusterPrivate, TestCreateClusterPrivateWithRouteKAS) attempted to create VPC S3 endpoints simultaneously and received HTTP 503 RequestLimitExceeded responses after exhausting 11 retry attempts. The tests never reached the point of creating hosted clusters — they failed during infrastructure provisioning before any HyperShift-specific code was exercised.

Failure Group 2 — Karpenter Teardown Timeout (2 tests):
TestKarpenter/Teardown failed because 9 AWS resources (8 EBS volumes across multiple Karpenter nodepools and 1 NLB for the ingress router) were not deleted within the teardown deadline. The parent TestKarpenter test is marked as failed only because its child Teardown subtest failed — all actual functional Karpenter subtests passed:

  • TestKarpenter/ValidateHostedCluster — PASSED
  • TestKarpenter/Main (including all parallel provisioning tests) — PASSED
  • TestKarpenter/EnsureHostedCluster — PASSED

The teardown timeout is likely related to the same AWS API throttling affecting the account — resource deletion API calls were also being rate-limited, causing the context deadline to expire before all 9 resources could be cleaned up.

PR #8412 Relevance: The PR changes only support/api/scheme.go (adding API types to the scheme) and adds a new test file test/e2e/v2/tests/hosted_cluster_image_registry_test.go. These changes do not touch any code paths involved in cluster creation, VPC endpoint provisioning, Karpenter, or teardown logic. The new test file is under test/e2e/v2/ and is not executed by the e2e-aws job.

Recommendations
  1. Retest the PR — These failures are infrastructure flakes unrelated to the PR changes. A /retest should resolve them assuming the AWS account is no longer throttled.
  2. No code changes needed — The PR's changes to support/api/scheme.go and the new image registry test file are completely orthogonal to the failing tests.
  3. For the HyperShift CI team — Consider implementing exponential backoff with jitter for CreateVpcEndpoint calls, or staggering test parallelism to reduce concurrent AWS API pressure on the shared account.

Evidence

| Evidence | Detail |
|---|---|
| Failed tests | TestCreateCluster, TestCreateClusterPrivate, TestCreateClusterPrivateWithRouteKAS, TestKarpenter/Teardown, TestKarpenter |
| AWS throttling error | RequestLimitExceeded: Request limit exceeded. Account 820196288204 has been throttled on ec2:CreateVpcEndpoint (HTTP 503) |
| Retry exhaustion | exceeded maximum number of attempts, 11 across all 3 cluster creation tests |
| Karpenter functional tests | All PASSED (ValidateHostedCluster, Main, EnsureHostedCluster) |
| Karpenter teardown failure | context deadline exceeded: 9 resources not deleted (8 EBS volumes, 1 NLB) |
| PR #8412 files changed | support/api/scheme.go, test/e2e/v2/tests/hosted_cluster_image_registry_test.go |
| PR relevance to failures | None: PR adds image registry v2 e2e tests; failures are in cluster creation and Karpenter teardown |
| Tests passed | 383 of 412 (93%) |
| Prow job URL | View in Prow |

@cblecker
Member Author

cblecker commented May 8, 2026

/override ci/prow/e2e-aws
This PR doesn't touch production code or the v1 e2e suite. v2 suites are passing.

@openshift-ci
Contributor

openshift-ci Bot commented May 8, 2026

@cblecker: Overrode contexts on behalf of cblecker: ci/prow/e2e-aws

Details

In response to this:

/override ci/prow/e2e-aws
This PR doesn't touch production code or the v1 e2e suite. v2 suites are passing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci
Contributor

openshift-ci Bot commented May 8, 2026

@cblecker: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot Bot merged commit 1138dfe into openshift:main May 8, 2026
43 checks passed
@cblecker cblecker deleted the worktree-dapper-munching-fiddle branch May 8, 2026 16:14

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release area/testing Indicates the PR includes changes for e2e testing jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria


3 participants