-
Notifications
You must be signed in to change notification settings - Fork 66
✨ Promote Webhook FeatureGates to GA (OPRUN-4098) #2267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
✨ Promote Webhook FeatureGates to GA (OPRUN-4098) #2267
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2267 +/- ##
==========================================
- Coverage 72.93% 70.13% -2.81%
==========================================
Files 88 88
Lines 8743 8748 +5
==========================================
- Hits 6377 6135 -242
- Misses 1951 2197 +246
- Partials 415 416 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an example of why multiple files should be used in the experimenta-e2e directory. You should just be able to move the file without significant changes. In this case, we had to create a new file, and delete content from another. :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm in favor of removing the experimental-e2e entirely and introspecting the cluster to see which features are enabled and run/skip accordingly - or even informing it about which environment its running in: standard/experimental. That removes all of this complexity, all e2es live together where they can share packages and libraries easily under the e2e directory, we could even change the flow of existing e2es to keep duplication low, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are moving the tests from experimental_e2e_test.go
to e2e_suite_test.go
.
At the moment, not all tests live in a single file.
It might not be fully achievable in every case if we decide to enforce that as a rule, since doing so could require duplicating some code or setups.
However, anyway, IHMO refactoring the e2e tests related to the feature gate is out of scope for this PR. We can discuss that, but I think we need to be in a second moment.
Here, we’re simply promoting the feature and moving its specific tests into the main e2e
suite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Common packages and libraries and functions could still be shared by different directories, those libraries would need to live elsewhere. So, that's a false argument.
The idea was that each feature would have a single file devoted to its tests, and that it should have been a simple file move/package rename to move the tests from experimental to the e2e. I don't see how that's complex...
Adding the code for introspection would be the complexity, IMHO. But, it would work downstream in experimental-e2e better.
must be enabled to make use of it. See the instructions below on how to enable the feature-gate. | ||
Webhook support is enabled by default. The controller uses the `WebhookProviderCertManager` | ||
feature gate unless you override it. To switch to the OpenShift Service CA provider, | ||
start the controller with `--feature-gates=WebhookProviderCertManager=false`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incomplete, because you also would need to enable the WebhookProviderOpenshiftServiceCA
feature-gate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WebhookProviderOpenshiftServiceCA is enabled by default with this PR.
Users if wish need disable, see: internal/operator-controller/features/features.go
So, wdyt? Do you have you any suggestion here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually... that's a problem. Both of them can not be enabled by default. Only one of them should be enabled at a time, and those are in different deployment scenarios (cert-manager vs Openshift).
The helm config does not provide a way to disable feature gates, only enable them.
So either:
a) We need to keep them disabled and add the set of feature flags to values.yaml as default on, which also means they would need to be kept in the experimental.yaml file as well.
b) We need to keep them disabled and add the feature-flags with a check for .Values.options.openshift.enabled
to add the appropriate enabled=True feature-flag.
c) Inverse of (b) keep flags enabled, and then disabled=True based on `.Values.options.openshift.enabled'
Personally, I dislike negative logic, and would prefer that (b) is used, so it's explicitly clear what's enabled, rather than "disabling" the one we don't want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would introducing another flag to the command like --certificate-provider (that can be one of openshift-serviceca, or cert-manager (default: cert-manager)) help here? Then we can have them both on by default. Then downstream Deployment manifest uses --certificate-provider=openshift-serviceca
and upstream Deployment manifest uses --certificate-provider=cert-manager
(or nothing since its the default). Upstream users can then choose which to use to use openshift-serviceca if they want to by altering the Deployment command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really liked @perdasilva’s idea 💡
The main concern with not keeping it enabled by default is the user experience — having too many flags to enable could quickly become cumbersome as this list grows.
We should also think about the long term. At some point, we’ll likely remove the feature flag entirely and make the code part of the main source.
Maybe it’s worth discussing how we want to handle that transition and what our long-term approach should be? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a commit with @perdasilva's idea see the latest one.
@tmshort @perdasilva
docs/draft/tutorials/explore-available-content-metas-endpoint.md
Outdated
Show resolved
Hide resolved
/hold |
I think my preference would be to leave the feature-gates off by default, and put just the one we want into the BUT, the strangeness comes from the fact that we have cluster-olm-operator getting feature-gates from Openshift. I have a downstream change that would "remove" all upstream gates from BUT, another strangeness occurs because helm does not merge/combine lists; they are overwritten. I need to think about this some more. |
/hold Until we decide how to do |
That we have two feature-gates that can't be enabled simultaneously is causing us this headache (this should not be an issue with 99%of other features). Although the CertManager one takes priority: operator-controller/cmd/operator-controller/main.go Lines 517 to 524 in 1d685b1
Which I guess is a reasonable default. But it means we have to explicitly turn off CertManager to get OpenshiftServiceCA, and we don't really have that capability downstream. Due to the downstream mapping of feature-gates, we only handle "enabled" flags (i.e. we can't turn things off), and we couldn't turn things off without a way in the helm chart to do so (i.e. additional upstream effort). And disabling features just sounds wrong to me. I really think we need to enable features. Add in the fact that helm doesn't merge lists makes this more challenging. I'm envisioning a number of possible solutions, none of which I really like, so I'm wondering if this is something to be discussed in a meeting (office hours)? |
Options?
1a. Add a new "standardFeatures" list that parallels the existing helm feature lists so that the value doesn't have to be duplicated in both
|
Co-authored-by: Todd Short <tmshort@users.noreply.github.com>
681c83d
to
efaa1b9
Compare
…eCA with flag certificate-provider
562998c
to
61ead05
Compare
flags.StringVar(&cfg.cachePath, "cache-path", "/var/cache", "The local directory path used for filesystem based caching") | ||
flags.StringVar(&cfg.systemNamespace, "system-namespace", "", "Configures the namespace that gets used to deploy system resources.") | ||
flags.StringVar(&cfg.globalPullSecret, "global-pull-secret", "", "The <namespace>/<name> of the global pull secret that is going to be used to pull bundle images.") | ||
flags.StringVar(&cfg.certificateProvider, "certificate-provider", certificateProviderCertManager, "The certificate provider to use for webhook support. Options: 'cert-manager' (default) or 'openshift-serviceca'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't think we can have this enabled to cert-manager by default, because we have the option to run without cert-manager.
I do like the idea, however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: Change for: --webhook-certificate-provider
{{- if .Values.options.openshift.enabled }} | ||
- --certificate-provider=openshift-serviceca | ||
{{- end }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{{- if .Values.options.openshift.enabled }} | |
- --certificate-provider=openshift-serviceca | |
{{- end }} | |
{{- if .Values.options.certManager.enabled }} | |
- --certificate-provider=cert-manager | |
{{- else if .Values.options.openshift.enabled }} | |
- --certificate-provider=openshift-serviceca | |
{{- end }} |
flags.StringVar(&cfg.cachePath, "cache-path", "/var/cache", "The local directory path used for filesystem based caching") | ||
flags.StringVar(&cfg.systemNamespace, "system-namespace", "", "Configures the namespace that gets used to deploy system resources.") | ||
flags.StringVar(&cfg.globalPullSecret, "global-pull-secret", "", "The <namespace>/<name> of the global pull secret that is going to be used to pull bundle images.") | ||
flags.StringVar(&cfg.certificateProvider, "certificate-provider", certificateProviderCertManager, "The certificate provider to use for webhook support. Options: 'cert-manager' (default) or 'openshift-serviceca'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flags.StringVar(&cfg.certificateProvider, "certificate-provider", certificateProviderCertManager, "The certificate provider to use for webhook support. Options: 'cert-manager' (default) or 'openshift-serviceca'.") | |
flags.StringVar(&cfg.certificateProvider, "", certificateProviderCertManager, "The certificate provider to use for webhook support. Options: 'cert-manager' or 'openshift-serviceca', disabled by default.") |
switch strings.ToLower(cfg.certificateProvider) { | ||
case "", certificateProviderCertManager: | ||
return certproviders.CertManagerCertificateProvider{}, nil | ||
case certificateProviderOpenshiftCA: | ||
return certproviders.OpenshiftServiceCaCertificateProvider{}, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
switch strings.ToLower(cfg.certificateProvider) { | |
case "", certificateProviderCertManager: | |
return certproviders.CertManagerCertificateProvider{}, nil | |
case certificateProviderOpenshiftCA: | |
return certproviders.OpenshiftServiceCaCertificateProvider{}, nil | |
} | |
switch strings.ToLower(cfg.certificateProvider) { | |
case "": | |
return nil, nil | |
certificateProviderCertManager: | |
return certproviders.CertManagerCertificateProvider{}, nil | |
case certificateProviderOpenshiftCA: | |
return certproviders.OpenshiftServiceCaCertificateProvider{}, nil | |
} |
There is no default certificate provider in OLMv1. We provide explicit options to turn on and use cert-manager and/or Openshift serviceca, and I think we need to be able to reflect that. Some fairly small changes would be needed to make the |
// WebhookProviderCertManager enables support for installing | ||
// registry+v1 cluster extensions that include validating, | ||
// mutating, and/or conversion webhooks with CertManager | ||
// as the certificate provider. | ||
WebhookProviderCertManager: { | ||
Default: false, | ||
PreRelease: featuregate.Alpha, | ||
LockToDefault: false, | ||
}, | ||
|
||
// WebhookProviderCertManager enables support for installing | ||
// registry+v1 cluster extensions that include validating, | ||
// mutating, and/or conversion webhooks with Openshift Service CA | ||
// as the certificate provider. | ||
WebhookProviderOpenshiftServiceCA: { | ||
Default: false, | ||
PreRelease: featuregate.Alpha, | ||
LockToDefault: false, | ||
}, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My fear is that this will break downstream... we need to remove downstream feature gate references before we can delete this.
Specifically, the techpreview/devpreview builds will fail. So, this should remain as false, and should not be looked at. As long as we can accept the feature-gate value, we're good. Otherwise, operator-controller will exit early complaining of an unknown feature gate.
WebhookProviderCertManager featuregate.Feature = "WebhookProviderCertManager" | ||
WebhookProviderOpenshiftServiceCA featuregate.Feature = "WebhookProviderOpenshiftServiceCA" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, we can't really delete feature-gates until it's no longer referenced downstream.
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
No description provided.