-
Notifications
You must be signed in to change notification settings - Fork 299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement basic control plane upgrade orchestration #81
Implement basic control plane upgrade orchestration #81
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ironcladlou The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
0fa8622
to
d8dc52f
Compare
@@ -96,6 +96,9 @@ type HostedControlPlaneStatus struct { | |||
// +kubebuilder:validation:Optional | |||
Version string `json:"version,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need both version
and ReleaseImage
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is from our original API design... my understanding is users of HostedCluster want to know the semver for the release, so it has to bubble up from here as it's resolved at runtime. It's not used for reconciliation and is just additional metadata associated with the release image (again, just my understanding). cc @csrwng @sjenning @derekwaynecarr
hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
Outdated
Show resolved
Hide resolved
hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go
Show resolved
Hide resolved
2a621fc
to
a9151c5
Compare
/test e2e-aws |
a9151c5
to
0b25c9e
Compare
@sjenning @enxebre @csrwng @derekwaynecarr this is ready for review, I'm down to just polishing at this point. Hopefully the test cases (both unit and e2e) are useful for evaluating whether the behavior is correctly defined. |
/test e2e-aws |
0b25c9e
to
3ca65a7
Compare
/test e2e-aws |
3ca65a7
to
3c97777
Compare
/test e2e-aws |
This commit introduces basic support for control plane upgrades according to the HostedCluster versioning API. Control plane upgrades are initiated by updating the HostedCluster `.spec.release.image` field, which is propagated to the HostedControlPlane resource following some rules which are articulated in the test cases but not yet codified in the API. For example, rollouts are serialized. See the test cases for more details. To make this work, the control plane operator has been updated to remove gating logic which caused it to stop reconciling in reaction to fields like a "ready" flag (which are reminiscent of state machine behavior). Now the HostedControlPlane will always be reconciled idempotently. The HostedControlPlane controller itself isn't architected with granular reconciliation in mind and so there are still some spurious updates and reconcile attempts when a no-op would have been preferred, but things should be basically correct. In addition, the HostedControlPlane controller is not very accurate with regards to its status reporting, as it does no deep per-component analysis to gauge control plane health. This weakness of reporting is reflected to some extent now in HostedCluster (garbage in, garbage out). This commit expands status for HostedCluster with a basic Availability condition (see the new tests for details). Most of the status reporting happening now is very minimal and has lots of room for improvement to provide better decision making context for other logic. The goal of this commit is to establish the bare minimum parts to make everything work together and demonstrate a functional system for iteration.
3c97777
to
134fc35
Compare
/test e2e-aws |
1 similar comment
/test e2e-aws |
/lgtm need to make sure we want to stick with this as the API, but this works for me |
This commit introduces basic support for control plane upgrades according to the
HostedCluster versioning API.
Control plane upgrades are initiated by updating the HostedCluster
.spec.release.image
field, which is propagated to the HostedControlPlaneresource following some rules which are articulated in the test cases but not
yet codified in the API. For example, rollouts are serialized. See the test
cases for more details.
To make this work, the control plane operator has been updated to remove gating
logic which caused it to stop reconciling in reaction to fields like a "ready"
flag (which are reminiscent of state machine behavior). Now the
HostedControlPlane will always be reconciled idempotently.
The HostedControlPlane controller itself isn't architected with granular
reconciliation in mind and so there are still some spurious updates and
reconcile attempts when a no-op would have been preferred, but things should be
basically correct.
In addition, the HostedControlPlane controller is not very accurate with regards
to its status reporting, as it does no deep per-component analysis to gauge
control plane health. This weakness of reporting is reflected to some extent now
in HostedCluster (garbage in, garbage out).
This commit expands status for HostedCluster with a basic Availability condition
(see the new tests for details).
Most of the status reporting happening now is very minimal and has lots of room
for improvement to provide better decision making context for other logic. The
goal of this commit is to establish the bare minimum parts to make everything
work together and demonstrate a functional system for iteration.