Hypershift changes for STS support #1427
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
@@ -169,7 +169,16 @@ CCO should consider adding the above, three-part test as a utility function expo

**HyperShift changes**: Include Cloud Credential Operator with token-aware mode (see above). Allows for processing of
CredentialsRequest objects added by operators.

The CCO is expected to run on the control plane NS, not on the customer's worker nodes, and not be visible to the customer. However if CredentialsRequests are created on the worker nodes, the CCO will process them and yield Secrets as described elsewhere in this document.
The CCO is expected to run on the control plane NS, not on the customer's worker nodes, and not be visible to the customer. We propose that multiple
instances of CCO be installed on the control plane each consuming the
this phrasing is confusing. You're installing multiple instances on the management cluster, but it's still just one instance per hosted control plane (i.e. one per hosted cluster)
How about:
instances of CCO be installed on the control plane each consuming the
The CCO is expected to run in the management cluster, not on the customer's worker nodes, and not be visible to the customer. We propose that multiple
instances of CCO be installed on the management cluster, each consuming the kubeconfig of the hosted control plane it is intended to be watching.
you might want to mention that it will run in the hosted control plane's NS on the management cluster, but otherwise yes, better.
CCO will be modified so that it can consume a given kubeconfig when specified.
Because the intention is to make this feature available with HyperShift GA it
will need to be backported all the way to 4.12 (contingent on CCO changes
moving out of the feature gate).
not sure this backport is necessary (SD has not explicitly required it). At a minimum it's contingent on us backporting the rest of the STS-enablement work (e.g. console) to 4.12.
if we didn't backport it, it would just mean that you only get this behavior on 4.14+ guest clusters and not 4.12/4.13 guest clusters.
it also might mean the hypershift operator would need to understand the version difference, depending on whether it needs to do anything special to provide the hosted cluster's kubeconfig to the CCO.
The kubeconfig of the worker node is mounted at a specified path on the CCO pod.
CCO can consume that kubeconfig when it starts the go client.
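Under the approach sketched above, the mount might look like the following Deployment fragment. This is an illustrative assumption only: the secret name, mount path, and flag name are placeholders, not settled decisions from this PR.

```yaml
# Hypothetical fragment of the CCO Deployment running in the hosted control
# plane's namespace on the management cluster. Names and paths are made up.
spec:
  containers:
  - name: cloud-credential-operator
    args:
    - --kubeconfig=/etc/hosted-kubeconfig/kubeconfig  # point the go client at the hosted cluster
    volumeMounts:
    - name: hosted-kubeconfig
      mountPath: /etc/hosted-kubeconfig
      readOnly: true
  volumes:
  - name: hosted-kubeconfig
    secret:
      secretName: hosted-cluster-kubeconfig  # assumed to be provided by the HyperShift operator
```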
### Risks and Mitigations
need to add some risks+mitigations in which we look at the resource consumption of the CCO, determine if it's acceptable to the hypershift/rosa team, and if not, potentially pursue how we can reduce the footprint.
are you thinking comparing the utilization of "CCO can watch multiple clusters" vs "multiple CCOs each watching one cluster" ? or more broadly even comparing options beyond the scope of this EP like the pod identity webhook?
no, i wasn't trying to go that far, just calling it out as:
- we need to measure it
- hypershift/rosa team needs some threshold defined that is acceptable/unacceptable
- if we are exceeding that threshold, work will be required. That work may be as simple as just going into the CCO code and finding optimizations we can make to reduce its footprint. (possibly optimizations we can make that are specific to manual/STS mode). If we find ourselves having to do more sophisticated things like a shared CCO across clusters, or switching back to using webhooks, that's going to significantly derail things. So i guess you can also mention those options, but i do not expect us to be going down those paths. (Shared CCO probably isn't even an option, from a tenancy/isolation requirements perspective)
The CCO on build farms is taking up enormous amounts of memory footprint. It looks like the CCO keeps the following in cache:
- secrets
- configmaps
- namespaces
This is cluster-wide, always. We should be able to make the following optimizations:
- watch the two configmaps that need to be watched - the legacy cco config one and the one for leader-election (should be doable in c-r cache lib)
- do not cache namespaces, we just want to watch for creation to start filling in credentials (not sure if possible)
- label secrets created by the CCO, restrict all secret LIST+WATCH to that label selector (doable in c-r)
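To make the intended effect of the last two optimizations concrete, here is a small self-contained sketch (plain Python, deliberately not the controller-runtime API) of the difference between caching every secret cluster-wide and caching a label-filtered, metadata-only view. The label key `cco.openshift.io/managed` is a made-up placeholder, not the label the CCO actually uses.

```python
# Toy model of an informer cache: full objects vs. a narrowed view.
# The label key "cco.openshift.io/managed" is a hypothetical placeholder.

secrets = [
    {"name": "pull-secret", "labels": {}, "data": "x" * 4096},
    {"name": "cco-creds-1", "labels": {"cco.openshift.io/managed": "true"}, "data": "y" * 256},
    {"name": "user-tls", "labels": {}, "data": "z" * 8192},
]

def full_cache(objs):
    """Cluster-wide cache: every secret, payload and all."""
    return list(objs)

def narrowed_cache(objs, label_key):
    """Label-filtered, metadata-only cache: keep only labeled secrets
    and drop the payload, as a metadata-only watch would."""
    return [
        {"name": o["name"], "labels": o["labels"]}
        for o in objs
        if label_key in o["labels"]
    ]

full = full_cache(secrets)
narrow = narrowed_cache(secrets, "cco.openshift.io/managed")
print(len(full), len(narrow))  # 3 1
print(sum(len(o.get("data", "")) for o in narrow))  # 0: no payload cached
```

In controller-runtime terms, the filtering corresponds to a label selector on the Secret LIST+WATCH and the payload drop corresponds to a metadata-only watch; the toy just shows why both shrink the cache.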
openshift/cloud-credential-operator#544 handles the configmap situation
Handling secrets will require a couple releases for labeling existing data and ratcheting validation to kick in - if we backport this, could be fast. LMK what we want.
Namespaces, I think we will need to just use a metadata-only watch if we want that functionality, but namespace objects are small so I'm not sure how much of a difference it will make
@stevekuznetsov Appreciate the proposed partial fix.
Since you have ready data can you better help us understand the current scale of the problem? How many secrets, configmaps, and namespaces does a build farm cluster have and what's the resource consumption of CCO currently?
openshift/cloud-credential-operator#545 adds a controller to start labelling things
@sdodson two answers to your question:
- Regardless of what's going on in the build farms, I think it's irresponsible and wasteful to have this component holding the entire state of the clusters' ConfigMaps, Secrets and Namespaces in memory at all times - especially the former two objects are used to hold large volumes of data and there's no need for holding them, it's just a missed opportunity for optimization. Supportability of a HCP solution that places memory burden on the service cluster which scales linearly with customer dataset size is poor.
- Build farms are huge, as you know, so on e.g. build02 the CCO is holding on to 600MiB :)
Yup, agreed. Just trying to understand broadly what things look like today.
openshift/cloud-credential-operator#546 adds a metadata-only watch for namespaces
@gallettilance: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
#### HyperShift Stories
CCO without the pod identity webhook is enabled on the HyperShift management
openshift/cloud-credential-operator#547 disables starting the pod identity webhook controller when the infrastructure ControlPlaneTopology is External.
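As a rough illustration of the gate that #547 describes (the actual check lives in the CCO's Go code; this is a hedged sketch, with topology strings mirroring the Infrastructure API's ControlPlaneTopology values):

```python
# Sketch of the gating decision described above: skip the pod identity
# webhook controller when the control plane is externally hosted.
def should_start_pod_identity_webhook(control_plane_topology: str) -> bool:
    # "External" is the ControlPlaneTopology value for hosted control planes.
    return control_plane_topology != "External"

print(should_start_pod_identity_webhook("HighlyAvailable"))  # True
print(should_start_pod_identity_webhook("External"))         # False
```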
### Risks and Mitigations

Enabling CCO on HyperShift incurs additional infrastructure cost. The exact
It'd be good to have concrete data for:
- Compute resources impact
- RBAC reqs management and guest cluster side
- Networking requirements
> Compute resources impact

We've done some aggressive pruning and the final numbers are low. The CCO container and the rbac-proxy together are something like 80MiB RAM and 20m vCPU. I'm going to work on the RAM a bit more after this, I think we should be able to get that down to 20MiB or so.

> RBAC reqs management and guest cluster side

Not sure exactly what you mean by this but the CCO running in the HyperShift management plane will simply use the ServiceAccount credentials as it does in standalone.

> Networking requirements

CCO simply needs to connect to the API server for the tenant.
Inactive enhancement proposals go stale after 28d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle stale
/remove-lifecycle stale
Inactive enhancement proposals go stale after 28d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle stale
Stale enhancement proposals rot after 7d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle rotten
Rotten enhancement proposals close after 7d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Reopen the proposal by commenting /close
@openshift-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
(automated message) This pull request is closed with lifecycle/rotten. The associated Jira ticket, OCPBU-4, has status "In Progress". Should the PR be reopened, updated, and merged? If not, removing the lifecycle/rotten label will tell this bot to ignore it in the future.