Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HOSTEDCP-1041] Defaulting webhook for self managed HCP #2864

Merged

Conversation

davidvossel
Copy link
Contributor

@davidvossel davidvossel commented Jul 31, 2023

Previously, the hostedCluster.Spec.Release.Image was required and the hcp or hypershift cli defaulted that value client side. This posed several issues though.

  1. The defaulting did not take into consideration what the max and min version the deployed hypershift operator supports, which will eventually lead to the client defaulting unsupported versions.
  2. This logic was all client side, which meant that it couldn't be used in a gitops flow or reused by the console web UI.

The new logic performs defaulting the release image on the backend using an optional mutating webhook. The logic takes into account the latest/min supported versions and picks the most recent version that falls within those constraints.

A webhook was chosen over performing defaults in the controller's reconcile loop due to the following reasons.

  • The Webhook requires no api changes (such as making the release image optional)
  • Error discovery occurs immediately at creation time (http Post response) rather than after admission via condition

@openshift-ci openshift-ci bot requested review from enxebre and hasueki July 31, 2023 21:12
@openshift-ci openshift-ci bot added the area/cli Indicates the PR includes changes for CLI label Jul 31, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 31, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: davidvossel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed do-not-merge/needs-area labels Jul 31, 2023
@sjenning
Copy link
Contributor

sjenning commented Aug 1, 2023

Needs a make update

@@ -1659,7 +1659,7 @@ type ManagedEtcdStorageSpec struct {
//
// +optional
// +immutable
RestoreSnapshotURL []string `json:"restoreSnapshotURL"`
Copy link
Contributor

@sjenning sjenning Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? Seems unrelated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i know... I thought so too. but it actually matters in a kind of dumb way (that I only half understand)

The issue is with how the decoding/encoding of the object works in the webhook. If we don't add omitempty here, then the decode/encode logic results in the mutation webhook trying to patch RestoreSnapshotURL to Null, then we get an API error during the validation phase about trying to set a []string to a Null value... or something along those lines.

@@ -440,7 +432,11 @@ func CreateCluster(ctx context.Context, opts *CreateOptions, platformSpecificApp
func defaultNetworkType(ctx context.Context, opts *CreateOptions, releaseProvider releaseinfo.Provider, readFile func(string) ([]byte, error)) error {
if opts.NetworkType != "" {
return nil
} else if opts.ReleaseImage == "" {
Copy link
Contributor

@sjenning sjenning Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change is causing the unit test failure of TestDefaultNetworkType in github.com/openshift/hypershift/cmd/cluster/core

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yep, fixed

return "", err
}
return version.PullSpec, nil

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra line


if np.Spec.Release.Image != "" {
return nil
} else if np.Spec.ClusterName == "" {
Copy link
Member

@enxebre enxebre Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the defaulter throwing this validation error? the .ClusterName is already required and immutable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two reasons

  1. I don't think ClusterName required to be a non empty string. I believe it is only required to be immutable. this is the validation rule for cluster name +kubebuilder:validation:XValidation:rule="self == oldSelf", message="ClusterName is immutable"

  2. The mutation webhook runs before the validation, which is why we can default the empty release image even though it is required to not be empty via this validation rule +kubebuilder:validation:Pattern=^(\w+\S+)$

So, i'm kind of in a weird spot with the cluster name here.

Copy link
Member

@enxebre enxebre Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think ClusterName required to be a non empty string

Yeh it is, every API field is required unless marked as optional via kubebuilder marker or a pointer with omitempty tag. So I think we can drop this check completely unless you prove me wrong, in which case that's a bug and we should fix the field to be a required non empty string at the API level and address separately from the release defaulting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right though this ensures presence but it would accept an explicitly set zero value for the string, and as you mentioned if we enforced != "" validation at the crd schema it would run after this, so I'm fine either way.

}

func LookupLatestSupportedRelease(ctx context.Context) (string, error) {
prefix := "https://multi.ocp.releases.ci.openshift.org/api/v1/releasestream/4-stable-multi/latest"
Copy link
Member

@enxebre enxebre Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we make a decision of defaulting to multiarch? This has an impact particularly for disconnected environments https://issues.redhat.com/browse/MIXEDARCH-292 / https://issues.redhat.com/browse/MIXEDARCH-255

At minimum I think we should document and communicate this clearly in docs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cli defaulted to multi, so i brought that over to the webhook.

defaultReleaseStream = "4-stable-multi"

If someone is using disconnected, I doubt they'd be using the release image defaulting.

PullSpec string `json:"pullSpec"`
DownloadURL string `json:"downloadURL"`
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you mind document this func? e.g // LookupLatestSupportedRelease picks the latest multi-arch image supported by this Hypershift Operator...

@enxebre
Copy link
Member

enxebre commented Aug 1, 2023

it'd be nice to exercise this in the kubevirt job

@davidvossel
Copy link
Contributor Author

it'd be nice to exercise this in the kubevirt job

yep, I think it will take a few steps for us to get to that point though. We'll need to enable the webhook during install (which involves messing with openshift/release ci code), then we can add a e2e test case that ensures the HC/NP defaults the release image during creation when kubevirt platform is in use.


err := defaulter.client.Get(ctx, client.ObjectKeyFromObject(hc), hc)
if err != nil {
return fmt.Errorf("Error retrieving HostedCluster named [%s], %v", np.Spec.ClusterName, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cap letter here makes verify fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, thanks, fixed

Signed-off-by: David Vossel <davidvossel@gmail.com>
@netlify
Copy link

netlify bot commented Aug 2, 2023

Deploy Preview for hypershift-docs ready!

Name Link
🔨 Latest commit d3e9893
🔍 Latest deploy log https://app.netlify.com/sites/hypershift-docs/deploys/64ca95818b03030008dc81d8
😎 Deploy Preview https://deploy-preview-2864--hypershift-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@davidvossel
Copy link
Contributor Author

/retest-required

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 4, 2023

@davidvossel: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-ibmcloud-iks d3e9893 link false /test e2e-ibmcloud-iks

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@sjenning
Copy link
Contributor

sjenning commented Aug 7, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 7, 2023
@sjenning
Copy link
Contributor

sjenning commented Aug 7, 2023

/retest-required

@davidvossel
Copy link
Contributor Author

/override "Pipelines as Code CI"

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 8, 2023

@davidvossel: Overrode contexts on behalf of davidvossel: Pipelines as Code CI

In response to this:

/override "Pipelines as Code CI"

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD b802382 and 2 for PR HEAD d3e9893 in total

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cli Indicates the PR includes changes for CLI area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants