STOR-962: Update storage operators for HyperShift #1202
Conversation
Force-pushed from fff2831 to 6f255f9
/priority important-soon
Force-pushed from d9c65d3 to 2aa2c3e
Updated with CSI driver operator + added missing chapters. Commits:

* Add alternative with storage operators watching HostedCluster. Add risk about insufficient design.
* Review updates
* Review updates 2
* Review updates 3
* Review updates after storage team meeting
* Review updates 4
* Review updates 5
* Review updates 6
#### Failure Modes

Same as in standalone OCP: the status of all storage operators is reported to ClusterOperator CRs in the guest cluster.
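For illustration, the ClusterOperator object written in the guest cluster looks roughly like the sketch below; the condition set shown is the standard trio and is illustrative, not exhaustive, and the exact messages vary per operator:

```yaml
# Minimal sketch of a ClusterOperator status reported in the guest cluster.
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  name: storage
status:
  conditions:
  - type: Available
    status: "True"
  - type: Progressing
    status: "False"
  - type: Degraded
    status: "False"
```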
One new failure mode that we haven't really discussed yet: what happens if the management cluster loses connection or access to the guest cluster?
This isn't specific to storage, but it's a new failure mode with hypershift in general. Maybe that's documented elsewhere (or should be).
The components currently are:

* cluster-csi-snapshot-controller-operator and its operand, csi-snapshot-controller.
nit: can we always follow the same dash convention for naming? "cluster-csi-snapshot-controller-operator" vs "cluster-storage operator" vs "AWS EBS CSI driver"?
The AWS EBS CSI driver is actually a collection of many things (its Deployment has 6 sidecars); it's not a monolith like our operators, which each live in one repo / Deployment.
Thank you @jsafrane
Awesome stuff
@jsafrane can we maybe put this enhancement here: https://github.com/openshift/enhancements/tree/master/enhancements/hypershift
| Manifest | Applied by | Cluster | Notes |
| --- | --- | --- | --- |
| manifests/05_operand_rbac.yaml | CVO | G | A few RBAC objects for the operand. We can run the operand with kubeadmin for now. |
| manifests/05_operator_rbac.yaml | CVO | G | Operator RBAC, basically a cluster-admin binding. No special RBAC is needed in HyperShift. |
| manifests/05_user_rbac.yaml | CVO | G | Aggregated RBAC roles to add to storage-admin and basic-user to manage snapshots. |
| manifests/07_deployment.yaml | CPO | MC | The operator's Deployment. |
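As a rough illustration of the last row, the Deployment that the CPO renders into the management cluster might look like the sketch below. The namespace, image, secret name, and mount path are assumptions for illustration only, not the actual HyperShift layout:

```yaml
# Hypothetical sketch: the operator Deployment as the CPO could render it
# into the hosted control plane namespace on the management cluster (MC).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: csi-snapshot-controller-operator
  namespace: clusters-example            # hosted control plane namespace (illustrative)
spec:
  replicas: 1
  selector:
    matchLabels:
      app: csi-snapshot-controller-operator
  template:
    metadata:
      labels:
        app: csi-snapshot-controller-operator
    spec:
      containers:
      - name: operator
        image: quay.io/example/csi-snapshot-controller-operator:latest  # placeholder image
        volumeMounts:
        - name: guest-kubeconfig
          mountPath: /etc/guest-kubeconfig      # path is an assumption
      volumes:
      - name: guest-kubeconfig
        secret:
          secretName: guest-admin-kubeconfig    # illustrative secret name
```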
How are we going to keep these manifests synchronized between the snapshot operator (which is managed by CVO in other environments) and the CPO in a HyperShift environment?
We have to do something similar for MicroShift, and there we have automation to extract the manifests from the release payload and add them to our repo to be customized. We're working toward a model where the operators that own the assets can be run in a one-time mode that produces the same content, so the customization can be known and tested "upstream" of MicroShift. https://github.com/openshift/microshift/blob/main/scripts/rebase.sh#L277
IMO this calls for taking a step back and redesigning how standalone OCP (CVO), HyperShift, and MicroShift manifests and code are kept consistent. Notice there is already OLM with its YAML manifests in the HyperShift repo, synced ad hoc (?) from OCP.
That OLM directory looks something like what we're doing in MicroShift.
I guess we will sync the yamls manually, at least in the beginning.
* Complexity of cluster-storage-operator, cluster-csi-snapshot-controller-operator, and aws-ebs-csi-driver-operator will grow significantly; they need to handle two kubeconfigs / kube clients where one was needed before. We can expect regressions.
  * We will mitigate this risk by reworking the unit tests to cover primarily the HyperShift case. Standalone OCP is a variation of HyperShift where both kube clients are actually the same and both lead to the standalone OCP cluster.
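The "standalone is a degenerate HyperShift" idea above can be sketched in a few lines. This is a minimal illustration with stand-in types, not the real client-go wiring: the operator holds one client per cluster, and the standalone case simply passes the same client twice, so both code paths exercise the same logic.

```go
package main

import "fmt"

// Client is a stand-in for a Kubernetes client; a real operator would hold
// client-go clientsets built from two separate kubeconfigs.
type Client struct{ cluster string }

func (c Client) Apply(obj string) string {
	return fmt.Sprintf("[%s] apply %s", c.cluster, obj)
}

// Operator keeps separate clients for the management and guest clusters.
type Operator struct {
	mgmt  Client // hosted control plane namespace in the management cluster
	guest Client // workload (guest) cluster
}

// Sync reconciles management-side and guest-side objects with their
// respective clients (object names here are illustrative).
func (o Operator) Sync() []string {
	return []string{
		o.mgmt.Apply("Deployment/csi-snapshot-controller"),
		o.guest.Apply("ClusterOperator/csi-snapshot-controller"),
	}
}

func main() {
	// HyperShift: two distinct clusters, two distinct clients.
	hypershift := Operator{mgmt: Client{"mgmt"}, guest: Client{"guest"}}
	fmt.Println(hypershift.Sync())

	// Standalone OCP: both clients point at the same cluster,
	// so the same Sync() code path serves both deployments.
	same := Client{"standalone"}
	standalone := Operator{mgmt: same, guest: same}
	fmt.Println(standalone.Sync())
}
```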
This is actually a big risk. Storage is complicated already and this design adds yet another dimension to our support / maintenance matrix. We have 9 CSI driver operators, each of them is a bit different (because the clouds are different!), and keeping track of what operator watches / syncs which object in mgmt/guest cluster is crazy. This is error prone.
Is this the first time we've encountered this sort of need for HyperShift? How did we deal with it in other situations?
I think storage is the first one to encounter this issue. Other control-plane components are started by code hardcoded in the control-plane-operator, and IMO they don't try to reuse code from their standalone OCP operators.
(I did not research OLM deployment thoroughly).
@csrwng, is this comment correct?
No, the cluster network operator uses two kubeconfigs as well in order to work in a HyperShift environment.
in the guest cluster, a Cluster Instance Admin can DOS the whole CSI driver / CSI snapshots by editing these CRs in an endless loop. The operators will redeploy the operands with each CR change, and the new operands will need to wait for leader election to expire while the old operands are already terminated.
* We mitigate this issue a bit by overwriting user changes to the operator CRs with the Hosted Cluster Config Operator that
We could use a validating webhook to prevent this. But in general, there are lots of ways for a cluster admin to break their cluster, if they explicitly attempt to.
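The validating-webhook idea could be sketched roughly as below. This is a simplified admission check, not a real webhook server; the privileged service account name and the function shape are assumptions for illustration:

```go
package main

import "fmt"

// Assumption for illustration: only this identity (the management-side
// reconciler) may change the spec of the protected operator CRs.
const managedBy = "system:serviceaccount:kube-system:hosted-cluster-config-operator"

// validateUpdate is a hypothetical, simplified admission check: spec changes
// by anyone other than the management side are rejected; status-only updates
// (old and new spec equal) are always allowed.
func validateUpdate(username, oldSpec, newSpec string) error {
	if oldSpec == newSpec {
		return nil // no spec change, always allowed
	}
	if username == managedBy {
		return nil // the management side may reconcile the spec
	}
	return fmt.Errorf("spec is managed by the control plane; change by %q rejected", username)
}

func main() {
	// A guest-cluster admin trying to edit the spec: rejected (non-nil error).
	fmt.Println(validateUpdate("system:admin", "replicas: 1", "replicas: 0"))
	// The management-side reconciler making the same change: allowed.
	fmt.Println(validateUpdate(managedBy, "replicas: 1", "replicas: 0"))
}
```

As noted above, this only raises the bar: a cluster admin determined to break their own cluster has many other ways to do so.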
of them when fixing a bug.
* Esp. the CSI driver operators are complex; each of them is a bit different and we have ~9 of them (and the number is growing). We would need to reimplement them in the control-plane-operator, and the implementation would easily get out of sync with the standalone OCP one.
Could the code to do this be shared between the 2 operators, or would it have to be very different implementations?
IMO the implementations would be quite different, because the inputs of the operators would be different. In standalone OCP, a CSI driver operator needs to watch Infrastructure, the cluster-wide + cloud CA bundles (ConfigMaps), Proxy, and the ClusterCSIDriver CR, and act on their changes. In HyperShift, the input would be only the HostedControlPlane CR.
They could share some templates of the CSI driver objects, but that's not much.
If there's not a lot of code to share and the inputs and outputs are so different, would it make sense to build a special-purpose thing to meet the new requirements? It might prove simpler to maintain than weaving if/then checks throughout the existing code to deal with both cases in one place.
be a good place to talk about core concepts and how they relate.
-->

#### csi-snapshot-controller
If you combined the diagrams in this section, it would be clearer that the cluster-csi-snapshot-controller-operator is reconciling resources in both clusters at the same time (Deployments in the management cluster and a bunch of other things in the guest cluster).
Could we run 2 copies of that operator separately, with command-line flags to enable/disable certain reconciliation steps, so that each binary only had to reconcile resources in 1 cluster (and therefore only deal with one kube client)?
How would these two operators agree on ClusterOperator.Status? IMO it would be more complicated / confusing than running 1 operator with 2 kubeconfigs.
Why would they need to agree on status? Why not report status separately?
Do you mean two OperatorStatus CRs for csi-snapshot-controller? Yes, we could have them, but I don't see the benefit. One operator instance would have just one kubeconfig, but the source code would still have code paths for different kubeconfigs.
#### Support Procedures

TODO: link to HyperShift support procedures; we don't plan anything special for storage.
@derekwaynecarr @csrwng Do you have any "HyperShift Support Procedures" that I can link from here?
Nothing we have documented yet, but this proposal can be updated when we do have them in place.
Force-pushed from 36074f7 to 961c6fa
/lgtm
/lgtm
lgtm, @jsafrane can you please squash commits before merging?
squashed
/assign @derekwaynecarr
re-
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: derekwaynecarr. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@jsafrane: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Describe deployment for storage components in HyperShift control plane + necessary changes in existing components.