Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add API for update without registry #1548

Conversation

jhernand
Copy link
Contributor

@jhernand jhernand commented Aug 4, 2023

This patch adds to the ClusterVersion type the fields required for updating without a registry server.

Related: https://issues.redhat.com/browse/RFE-4482
Related: openshift/enhancements#1432

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 4, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 4, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 4, 2023

Hello @jhernand! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci openshift-ci bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 4, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Aug 4, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jhernand
Once this PR has been reviewed and has the lgtm label, please assign jwforres for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

// inmmediately after the required images are pinned and pre-loaded.
//
// +optional
Hold bool `json:"hold"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't allow booleans in our APIs, https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md#do-not-use-boolean-fields, typically we use enumerated values instead.

This is a difficult API though. A hold option isn't something we normally represent in APIs. I would suggest an enum which determines the type of update, whether it's immediate or ... (on schedule?), how would you describe the current trigger?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am replacing this with a Schedule structure that contains a Type enum with possible values Immediate and Manual.

// +optional
Hold bool `json:"hold"`

// bundle describes updates that will be performed using an update bundle.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when this isn't present? What happens if this is specified and hold is not true currently? Does this field interact with any other field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is orthogonal to Hold (now Schedule). It doesn't interact with any other field. I am renaming this to ImageSource and trying to clarify it.

Comment on lines 737 to 739
// If explicitly set to true then the desiredUpdate.version and desiredUpdate.image fields
// will be ignored. Instead the cluster will loop for the update bundle and will use to
// determine the version, pin and pre-loaded the images and appy the update.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't going to be obvious to an end user, as there's no documentation or validation currently on this, we should improve this.

Ideally the update type would be in a discriminated union, rather than implicit like this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am replacing this with a UpdateImageSource discrimiated union that will have Registry and Bundle variants. Registry will be the default and correspond to the image source that we use today: image registries. Bundle will correspond to the mechanism that uses a bundle.

Comment on lines 744 to 748
// monitor indicates if the cluster should monitor device plug events to automatically
// detect when an update bundle is available. The default is false.
//
// +optional
Monitor bool `json:"monitor"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like another variation of the trigger, I think this is looking more and more like an enum

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am replacing this with a DetectionMechanism discriminated union (inside the Bundle image source). It will have Manual and Automatic variants.

// optional, and by default the cluster will look for the bundle in all the nodes.
//
// +optional
Node string `json:"node"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im curious, is this bundle field expected to be persistent? What happens when the node named isn't present? What happens if the node gets removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should be persisted, at least till the actual update has been started.

If the value of Node doesn't match any existing node then it should be flagged as an error.

If the node specified in Node is removed then the bundle will never be detected.

//
// The format is be identifier of the digest algorithm followed by a colon and the digest
// value. Currently the only supported digest algorithm is sha256, so the value shold
// be something like sha256:1ef...03a.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need to be validation with a kubebuilder validation and regex.
Do you expect additional types to be supported in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the validation. I don't expect any additional types to be supported in the future, but can't rule it out.

// Keys are the names of the nodes.
//
// +optional
Nodes map[string]NodeBundleStatus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an equivalent for this information already in the cluster? Eg via the MCO already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not aware of any other place containing this information.

File string `json:"file"`

// node contains the name of the node where the update bundle is available. This is
// optional, and by default the cluster will look for the bundle in all the nodes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if multiple bundles are found?
Is there any way that this resolution determines that bundles are incompatible? Eg we block upgrades on certain platforms for certain versions, how is this handled with this API?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If multiple bundles are found the CVO will compare the metadata (which includes the version and architecture, also the list of images) and will abort the upgrade if they are different.

Comment on lines 798 to 808
// extracted indicates if the upgrade bundle has already been extracted to a temporary
// directory in the node.
//
// +required
Extracted bool `json:"extracted"`

// loaded indicates if the images of the upgrade bundle have been copied to the containers
// storage directory of the node.
//
// +required
Loaded bool `json:"loaded"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend conditions instead, they allow you add additional context to explain why or why not the action has occurred

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. The latest version of the enhancement and the associated API pull request use discriminated unions and conditions.

// +required
Loaded bool `json:"loaded"`

// metadata is the content of the metadata.json file that is part of the bundle.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How big is the metadata.json file typically? What kind of information does it store?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It stores the architecture, version and list of image references of the upgrade. For a typical upgrade it takes approx 22 KiB.

This patch adds to the `ClusterVersion` type the fields required for
updating without a registry server.

Related: https://issues.redhat.com/browse/RFE-4482
Related: openshift/enhancements#1432
Signed-off-by: Juan Hernandez <juan.hernandez@redhat.com>
@jhernand jhernand force-pushed the upgrade_without_registry_enhancement branch from d33b001 to a2218a4 Compare September 1, 2023 12:08
@jhernand
Copy link
Contributor Author

jhernand commented Sep 1, 2023

@JoelSpeed I tried to address your comments and pushed a new version of the patch. Let me know what you think.

// downloaded from their corresponding registry servers.
//
// +optional
ImageSource UpdateImageSource `json:"imageSource,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove Schedule UpdateSchedule json:"schedule,omitempty"` from this PR as it is not part of the parent enhancement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is part of the parent enhancement. It used to be a hold boolean flag to indicate if the upgrade has to be performed immediately after the images are pre-loaded or if it should wait for a later time. That functionality was suggested by @williamcaban during the review of the draft of the enhancement.

Copy link
Contributor

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the current state of the enhancement, is there rough agreement on the approach? I think we can probably hold off doing too much of an in-depth API review until there is rough consensus on the approach of this going into CVO

Comment on lines -514 to -515
// +kubebuilder:validation:XValidation:rule="has(self.architecture) && has(self.image) ? (self.architecture == '' || self.image == '') : true",message="cannot set both Architecture and Image"
// +kubebuilder:validation:XValidation:rule="has(self.architecture) && self.architecture != '' ? self.version != '' : true",message="Version must be set if Architecture is set"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is not intended as part of your PR, please revert

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 15, 2024
@openshift-merge-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link
Contributor

openshift-ci bot commented Jan 15, 2024

@jhernand: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-serial a2218a4 link true /test e2e-aws-serial
ci/prow/e2e-aws-serial-techpreview a2218a4 link true /test e2e-aws-serial-techpreview
ci/prow/e2e-upgrade a2218a4 link true /test e2e-upgrade
ci/prow/e2e-aws-ovn-techpreview a2218a4 link true /test e2e-aws-ovn-techpreview
ci/prow/e2e-aws-ovn a2218a4 link true /test e2e-aws-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jhernand
Copy link
Contributor Author

Replaced by #1713.

@jhernand jhernand closed this Jan 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants