
CNTRLPLANE-3160: Drop AutoNodeKarpenter feature gate and promote EC2NodeClass to v1 #8166

Draft
enxebre wants to merge 6 commits into openshift:main from enxebre:drop-karpenter-feature-gate

Conversation

@enxebre
Member

@enxebre enxebre commented Apr 6, 2026

What this PR does / why we need it:

Promotes the AutoNode/Karpenter feature to stable/GA by:

  1. Dropping the AutoNodeKarpenter feature gate — the autoNode field on HostedCluster and HostedControlPlane is no longer gated behind TechPreviewNoUpgrade
  2. Promoting OpenshiftEC2NodeClass API from v1beta1 to v1 — signals API stability
  3. Adding envtest validation test suites for both HostedCluster autoNode CEL rules and OpenshiftEC2NodeClass v1 CEL rules
  4. Removing the TECH_PREVIEW_NO_UPGRADE skip from the karpenter e2e test

No runtime/controller code changes — all controllers use IsKarpenterEnabled() which checks spec values, not the feature gate.

Which issue(s) this PR fixes:

Fixes https://issues.redhat.com/browse/CNTRLPLANE-3160

Special notes for your reviewer:

  • The karpenter envtest tests are self-contained under karpenter-operator/controllers/karpenter/assets/ with their own zz_generated.crd-manifests/ and tests/ directories. The only integration point is adding a second path to LoadTestSuiteSpecs in test/envtest/suite_test.go.
  • The hypershift install path (cmd/install/install.go) is unaffected — the karpenter CRD is not placed in the hypershift-operator install assets.
  • The v1beta1 → v1 migration has no struct changes; only the API version label changes. The karpenter-operator controls the CRD lifecycle, so there is no multi-version concern.

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

🤖 Generated with Claude Code via /jira:solve [CNTRLPLANE-3160](https://redhat.atlassian.net/browse/CNTRLPLANE-3160)

@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Apr 6, 2026
@openshift-ci
Contributor

openshift-ci bot commented Apr 6, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot

openshift-ci-robot commented Apr 6, 2026

@enxebre: This pull request references CNTRLPLANE-3160 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) Apr 6, 2026
@coderabbitai
Contributor

coderabbitai bot commented Apr 6, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 330d7abc-8cb3-4624-91ba-a231f84f1359

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Contributor

openshift-ci bot commented Apr 6, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enxebre

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved, area/api, area/cli, area/control-plane-operator, area/documentation, area/hypershift-operator, area/karpenter-operator, area/platform/aws, and area/testing labels and removed the do-not-merge/needs-area label Apr 6, 2026
@enxebre
Member Author

enxebre commented Apr 6, 2026

/label tide/merge-method-squash

@openshift-ci openshift-ci bot added the tide/merge-method-squash label (denotes a PR that should be squashed by tide when it merges) Apr 6, 2026
@enxebre enxebre force-pushed the drop-karpenter-feature-gate branch from 9b013f2 to ca02ee4 Compare April 6, 2026 15:37
@codecov

codecov bot commented Apr 6, 2026

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 33.89%. Comparing base (fc03240) to head (0a43efe).
⚠️ Report is 60 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| cmd/cluster/aws/create.go | 0.00% | 3 Missing ⚠️ |
| support/karpenter/karpenter.go | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8166      +/-   ##
==========================================
+ Coverage   32.17%   33.89%   +1.71%     
==========================================
  Files         766      768       +2     
  Lines       91968    93161    +1193     
==========================================
+ Hits        29592    31575    +1983     
+ Misses      59843    58927     -916     
- Partials     2533     2659     +126     

| Files with missing lines | Coverage Δ |
|---|---|
| cmd/cluster/core/dump.go | 4.27% <ø> (ø) |
| ...lers/hostedcontrolplane/v2/karpenter/deployment.go | 100.00% <ø> (ø) |
| ...trollers/hostedcluster/hostedcluster_controller.go | 43.27% <ø> (+0.01%) ⬆️ |
| ...ator/controllers/karpenter/karpenter_controller.go | 19.86% <ø> (ø) |
| .../karpenterignition/karpenterignition_controller.go | 62.65% <ø> (ø) |
| .../controllers/nodeclass/ec2_nodeclass_controller.go | 50.90% <ø> (ø) |
| support/api/scheme.go | 89.65% <ø> (ø) |
| support/karpenter/karpenter.go | 68.00% <0.00%> (+5.03%) ⬆️ |
| cmd/cluster/aws/create.go | 41.59% <0.00%> (ø) |

... and 30 files with indirect coverage changes


@enxebre
Member Author

enxebre commented Apr 6, 2026

/test e2e-aws

@cwbotbot

cwbotbot commented Apr 6, 2026

Test Results

e2e-aws

@enxebre
Member Author

enxebre commented Apr 6, 2026

/test e2e-aws

1 similar comment
@enxebre
Member Author

enxebre commented Apr 7, 2026

/test e2e-aws

@enxebre
Member Author

enxebre commented Apr 7, 2026

/assign @jkyros @JoelSpeed

Contributor

@jparrill jparrill left a comment


Dropped some comments, mostly to learn. 🙏

- // +openshift:enable:FeatureGate=AutoNodeKarpenter
  // +optional
- AutoNode *AutoNode `json:"autoNode,omitempty"`
+ AutoNode AutoNode `json:"autoNode,omitzero"`
Contributor

Just to learn more: is this change expected when promoting a feature gate, or is it due to the new API linter?

Member Author

It is expected that any API follows best practices, especially if it's GA. The linter enforces those best practices.
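
As a rough illustration of the pointer-to-value change discussed in this thread, the sketch below compares how encoding/json serializes an unset optional field under each tag. The pointerSpec, valueOmitEmptySpec, valueSpec, and awsConfig types are hypothetical stand-ins, not the HyperShift API types, and the omitzero tag assumes Go 1.24 or newer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// awsConfig is a hypothetical stand-in for an optional nested config struct.
type awsConfig struct {
	RoleARN string `json:"roleARN,omitempty"`
}

// pointerSpec mirrors the old shape: an optional pointer with omitempty.
type pointerSpec struct {
	AWS *awsConfig `json:"aws,omitempty"`
}

// valueOmitEmptySpec shows that omitempty never omits a struct value.
type valueOmitEmptySpec struct {
	AWS awsConfig `json:"aws,omitempty"`
}

// valueSpec mirrors the new shape: a value type with omitzero (Go 1.24+).
type valueSpec struct {
	AWS awsConfig `json:"aws,omitzero"`
}

// encode marshals v to a JSON string and panics on error to keep the sketch short.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(pointerSpec{}))        // nil pointer is omitted
	fmt.Println(encode(valueOmitEmptySpec{})) // empty struct is still emitted
	fmt.Println(encode(valueSpec{}))          // zero-valued struct is omitted
}
```

With omitzero, an unset value-typed field round-trips the same way an unset pointer did, which is presumably why the linter accepts the value type here.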

  // +optional
  // +unionMember
- AWS *KarpenterAWSConfig `json:"aws,omitempty"`
+ AWS KarpenterAWSConfig `json:"aws,omitzero"`
Contributor

Same comment as above

Comment thread api/.golangci.yml
text: 'arrayofstruct: OpenStackPlatformSpec.Subnets is an array of structs, but the struct has no required fields. At least one field should be marked as required to prevent ambiguous YAML configurations'
- linters:
- kubeapilinter
path: karpenter/v1beta1/karpenter_types.go
Contributor

Should we add v1 to this path?

Member Author

@enxebre enxebre Apr 7, 2026

The exception is not actually needed; the struct already has // +kubebuilder:validation:MinProperties=1

Member Author

This triggered an offline discussion between me and Joel. This would still hit the linter on a new field or a field change. "No specific field is required, but at least one must be set" is not an ideal choice when the struct belongs to a slice. The reasoning is that with all fields optional, a user could accidentally write two separate items thinking they're writing one (or the other way around). I think we'll accept that here.

Contributor

To put the slice thing into an example

- foo: some
- bar: thing

vs

- foo: some
  bar: thing

This is a very subtle difference that you might glance over, but it has a very different meaning to the API.

This has caught people out in security contexts before, with pretty nasty results.
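
To make the one-item-versus-two-items point concrete, here is a minimal sketch that decodes both shapes into a slice of a struct whose fields are all optional. The selector type is hypothetical, and JSON is used as a stand-in for the YAML in this comment so the example needs only the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// selector is a hypothetical list item in which every field is optional.
type selector struct {
	Foo string `json:"foo,omitempty"`
	Bar string `json:"bar,omitempty"`
}

// parse decodes a JSON array into a slice of selectors.
func parse(doc string) []selector {
	var out []selector
	if err := json.Unmarshal([]byte(doc), &out); err != nil {
		panic(err)
	}
	return out
}

func main() {
	// Equivalent to the YAML with two dashes: two independent items.
	two := parse(`[{"foo":"some"},{"bar":"thing"}]`)
	// Equivalent to the YAML with one dash: a single item with both fields.
	one := parse(`[{"foo":"some","bar":"thing"}]`)

	fmt.Println(len(two), len(one))
}
```

Both documents decode without error, so nothing in the schema can flag the mistake unless at least one field is required on each item.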

@@ -0,0 +1,261 @@
apiVersion: apiextensions.k8s.io/v1
Contributor

Are there any docs on how to add more test cases to this suite/framework?

Member Author

Comment thread api/karpenter/v1/doc.go
  // +groupName=karpenter.hypershift.openshift.io
  // +k8s:openapi-gen=true
- package v1beta1
+ package v1
Contributor

By doing this as a rename, rather than creating a duplicate, anyone currently vendoring HyperShift cannot safely have an intermediate step where they update their HyperShift dependency, are able to compile, and then move over to the v1 version of this API.

We would normally expect to create a v1 dir alongside the v1alpha1 (or beta in this case)

A secondary benefit of that is then we get a full diff of the API being promoted and can give it a more thorough API review


Comment thread test/e2e/karpenter_test.go Outdated
- if hostedCluster.Spec.AutoNode == nil ||
- hostedCluster.Spec.AutoNode.Provisioner.Karpenter == nil ||
- hostedCluster.Spec.AutoNode.Provisioner.Karpenter.AWS == nil {
+ if hostedCluster.Spec.AutoNode.Provisioner.Karpenter.AWS == (hyperv1.KarpenterAWSConfig{}) {
Contributor

Comparing to an empty struct doesn't seem right. Should we check hostedCluster.Spec.AutoNode.Provisioner.Karpenter.Platform == AWS instead?

Contributor

Agree, use a discriminator field if it exists
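
A small sketch of why the discriminator check is the more robust of the two. The types below are hypothetical simplifications of the AutoNode union (the names platformType, karpenterConfig, and awsConfig are assumptions, not the real API), and the example assumes a state where the platform is selected but no member field is populated.

```go
package main

import "fmt"

// platformType and the structs below are hypothetical stand-ins for the
// HyperShift AutoNode types; field names are assumptions for illustration.
type platformType string

const platformAWS platformType = "AWS"

type awsConfig struct {
	RoleARN string
}

type karpenterConfig struct {
	Platform platformType // union discriminator
	AWS      awsConfig    // union member, now a value type
}

// configuredByStruct reports AWS as configured only when the member struct
// differs from its zero value.
func configuredByStruct(k karpenterConfig) bool {
	return k.AWS != (awsConfig{})
}

// configuredByDiscriminator reports AWS as configured when the discriminator
// selects it, regardless of which member fields are populated.
func configuredByDiscriminator(k karpenterConfig) bool {
	return k.Platform == platformAWS
}

func main() {
	// The platform is selected, but no member field has been set.
	k := karpenterConfig{Platform: platformAWS}
	fmt.Println("struct comparison:", configuredByStruct(k))
	fmt.Println("discriminator:    ", configuredByDiscriminator(k))
}
```

The struct comparison also stops compiling if the member type ever gains a slice or map field, since such structs are not comparable with ==; the discriminator check has no such coupling.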

enxebre added 5 commits April 8, 2026 10:17
Remove the AutoNodeKarpenter feature gate to promote the autoNode field
to stable/GA. The autoNode field on HostedCluster and HostedControlPlane
spec/status is no longer gated behind TechPreviewNoUpgrade.

Updates CLI and test code to use value types instead of pointers for the
now-ungated AutoNode, KarpenterConfig, and KarpenterAWSConfig structs.
Removes the TECH_PREVIEW_NO_UPGRADE skip from the karpenter e2e test.

No runtime code changes are needed because all controllers use
IsKarpenterEnabled() which checks spec values, not the feature gate.

Ref: CNTRLPLANE-3160

Replace api/karpenter/v1beta1/ with api/karpenter/v1/ to signal API
stability as part of the AutoNode GA promotion. The types, structs, and
validation markers are unchanged - only the API version is bumped.

All source files importing api/karpenter/v1beta1 are updated to
api/karpenter/v1. The CRD will now have v1 as the served and storage
version. Since the karpenter-operator controls the CRD lifecycle (not
the hypershift install), there is no multi-version serving concern.

Also fixes a godoc comment on the Conditions field and removes the
now-stale v1beta1 lint suppression from .golangci.yml.

Ref: CNTRLPLANE-3160

Add two new envtest test suites:

- stable.hostedclusters.karpenter.testsuite.yaml: 6 tests covering
  HostedCluster autoNode CEL validation (provisioner union, platform
  union, enum validation, roleARN validation)

- stable.openshiftec2nodeclasses.testsuite.yaml: 18 tests covering
  OpenshiftEC2NodeClass v1 CEL validation (subnet/SG selectors, tags,
  block device mappings, version, capacity reservations, metadata)

The karpenter EC2NodeClass tests are self-contained under
karpenter-operator/controllers/karpenter/assets/ with their own
zz_generated.crd-manifests/ directory. The Makefile karpenter-api
target copies the CRD there, and suite_test.go loads both asset dirs.

Ref: CNTRLPLANE-3160

Regenerate all auto-generated artifacts after dropping the
AutoNodeKarpenter feature gate and promoting OpenshiftEC2NodeClass
to v1:

- CRD manifests: autoNode fields merged into AAA_ungated.yaml,
  AutoNodeKarpenter.yaml deleted
- Featuregate payload manifests updated
- Client code regenerated for karpenter v1
- Vendor directory synced
- API docs regenerated

Ref: CNTRLPLANE-3160

Address PR review feedback: check the platform discriminator field
rather than comparing against an empty struct to determine if
Karpenter AWS is configured.

Ref: CNTRLPLANE-3160
@enxebre enxebre force-pushed the drop-karpenter-feature-gate branch from df67072 to dc1cc61 Compare April 8, 2026 08:23
@enxebre
Member Author

enxebre commented Apr 8, 2026

/test e2e-aws

@openshift openshift deleted a comment from hypershift-jira-solve-ci bot Apr 9, 2026
@enxebre
Copy link
Copy Markdown
Member Author

enxebre commented Apr 9, 2026

/test e2e-aws

Add karpenter-operator/controllers/karpenter/assets/tests/** to the
path filters so that changes to the karpenter envtest suites trigger
the envtest workflows. Also fixes the path for the crds/ directory
rename.

Ref: CNTRLPLANE-3160
@openshift-ci
Contributor

openshift-ci bot commented Apr 11, 2026

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci openshift-ci bot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Apr 11, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 17, 2026

@enxebre: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/verify-workflows | 0a43efe | link | true | /test verify-workflows |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@hypershift-jira-solve-ci

hypershift-jira-solve-ci bot commented Apr 17, 2026

I have all the evidence needed. Here is the analysis:

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

Automatic merge failed; fix conflicts and then commit the result.
# Error: exit status 1
# Final SHA: 
# Total runtime: 0s

Summary

The job failed during the Prow git clone phase — before any CI step could execute. When Prow attempted to merge PR #8166 (SHA 0a43efecad) onto the current main branch (SHA 1180bcaf5), git encountered 5 merge conflicts across 5 files. The PR branch needs to be rebased on top of current main to resolve these conflicts. This is not a product bug or CI infrastructure issue — it is a branch-out-of-date condition.

Root Cause

The PR branch drop-karpenter-feature-gate is out of date with the main branch. Since the PR was created, other changes have been merged into main that modify the same files this PR touches. Specifically:

  1. modify/delete conflicts (2 files): The PR deletes the AutoNodeKarpenter.yaml featuregated CRD manifests for both hostedclusters and hostedcontrolplanes, but main has since modified those same files (likely via regeneration of CRD manifests from other PRs). Git cannot decide whether the deletion or the modification should win.

    • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/AutoNodeKarpenter.yaml
    • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/AutoNodeKarpenter.yaml
  2. content conflicts (3 files): Both the PR branch and main have made overlapping changes to:

    • .github/workflows/envtest-kube.yaml — envtest workflow file
    • .github/workflows/envtest-ocp.yaml — envtest workflow file
    • test/e2e/karpenter_test.go — Karpenter e2e test file

The base commit on main is 1180bcaf5 ("Merge pull request #8208 from enxebre/worktree-fix-capacity-reservation-limit"), which itself is from the same author (enxebre), suggesting rapid iteration on the Karpenter feature that has created divergence between the branches.

Recommendations
  1. Rebase the PR branch onto current main: Run git rebase main on the drop-karpenter-feature-gate branch to incorporate the latest changes and resolve all 5 conflicts.

  2. For the modify/delete conflicts on AutoNodeKarpenter.yaml files: Since this PR's intent is to drop the AutoNodeKarpenter feature gate entirely, the correct resolution is to accept the deletion (the PR's version) and discard the modifications from main.

  3. For the content conflicts in .github/workflows/envtest-*.yaml and test/e2e/karpenter_test.go: Manually resolve by incorporating both sets of changes where appropriate, re-running code generation (make generate) after the rebase to ensure the zz_generated files are consistent.

  4. Re-trigger the job after pushing the rebased branch: /test verify-workflows

Evidence

| Evidence | Detail |
|---|---|
| Failure phase | Git clone/merge (pre-CI, no test steps executed) |
| Job duration | 32 seconds total (clone-only failure) |
| Exit code | exit status 1 from git merge --no-ff |
| Conflict count | 5 files |
| Conflict type 1 | CONFLICT (modify/delete): hostedclusters.../AutoNodeKarpenter.yaml deleted in PR, modified in HEAD |
| Conflict type 2 | CONFLICT (modify/delete): hostedcontrolplanes.../AutoNodeKarpenter.yaml deleted in PR, modified in HEAD |
| Conflict type 3 | CONFLICT (content): .github/workflows/envtest-kube.yaml |
| Conflict type 4 | CONFLICT (content): .github/workflows/envtest-ocp.yaml |
| Conflict type 5 | CONFLICT (content): test/e2e/karpenter_test.go |
| Base SHA | 1180bcaf5 (Merge PR #8208 from enxebre/worktree-fix-capacity-reservation-limit) |
| PR SHA | 0a43efecad (CNTRLPLANE-3160: Drop AutoNodeKarpenter feature gate) |
| Auto-merged (no conflict) | 21 files including hostedcluster_types.go, karpenter_types.go, CRD manifests, controllers |
