Skip to content

Conversation

camilamacedo86
Copy link
Contributor

No description provided.

@camilamacedo86 camilamacedo86 requested a review from a team as a code owner October 14, 2025 11:01
@openshift-ci openshift-ci bot requested review from pedjak and trgeiger October 14, 2025 11:01
Copy link

openshift-ci bot commented Oct 14, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign perdasilva for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link

netlify bot commented Oct 14, 2025

Deploy Preview for olmv1 ready!

Name Link
🔨 Latest commit 61ead05
🔍 Latest deploy log https://app.netlify.com/projects/olmv1/deploys/68ef50989360ba0008c9c780
😎 Deploy Preview https://deploy-preview-2267--olmv1.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Copy link

codecov bot commented Oct 14, 2025

Codecov Report

❌ Patch coverage is 46.15385% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.13%. Comparing base (9eac616) to head (61ead05).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
cmd/operator-controller/main.go 46.15% 6 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2267      +/-   ##
==========================================
- Coverage   72.93%   70.13%   -2.81%     
==========================================
  Files          88       88              
  Lines        8743     8748       +5     
==========================================
- Hits         6377     6135     -242     
- Misses       1951     2197     +246     
- Partials      415      416       +1     
Flag Coverage Δ
e2e 45.19% <46.15%> (+5.81%) ⬆️
experimental-e2e 14.19% <0.00%> (-32.54%) ⬇️
unit 58.08% <0.00%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@camilamacedo86 camilamacedo86 changed the title ✨ Promote Webhook FeatureGates to GA ✨ Promote Webhook FeatureGates to GA (OPRUN-4098) Oct 14, 2025
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of why multiple files should be used in the experimenta-e2e directory. You should just be able to move the file without significant changes. In this case, we had to create a new file, and delete content from another. :(

Copy link
Contributor

@perdasilva perdasilva Oct 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of removing the experimental-e2e entirely and introspecting the cluster to see which features are enabled and run/skip accordingly - or even informing it about which environment its running in: standard/experimental. That removes all of this complexity, all e2es live together where they can share packages and libraries easily under the e2e directory, we could even change the flow of existing e2es to keep duplication low, etc.

Copy link
Contributor Author

@camilamacedo86 camilamacedo86 Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are moving the tests from experimental_e2e_test.go to e2e_suite_test.go.

At the moment, not all tests live in a single file.
It might not be fully achievable in every case if we decide to enforce that as a rule, since doing so could require duplicating some code or setups.

However, anyway, IHMO refactoring the e2e tests related to the feature gate is out of scope for this PR. We can discuss that, but I think we need to be in a second moment.
Here, we’re simply promoting the feature and moving its specific tests into the main e2e suite.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Common packages and libraries and functions could still be shared by different directories, those libraries would need to live elsewhere. So, that's a false argument.
The idea was that each feature would have a single file devoted to its tests, and that it should have been a simple file move/package rename to move the tests from experimental to the e2e. I don't see how that's complex...

Adding the code for introspection would be the complexity, IMHO. But, it would work downstream in experimental-e2e better.

must be enabled to make use of it. See the instructions below on how to enable the feature-gate.
Webhook support is enabled by default. The controller uses the `WebhookProviderCertManager`
feature gate unless you override it. To switch to the OpenShift Service CA provider,
start the controller with `--feature-gates=WebhookProviderCertManager=false`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incomplete, because you also would need to enable the WebhookProviderOpenshiftServiceCA feature-gate.

Copy link
Contributor Author

@camilamacedo86 camilamacedo86 Oct 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WebhookProviderOpenshiftServiceCA is enabled by default with this PR.
Users if wish need disable, see: internal/operator-controller/features/features.go

So, wdyt? Do you have you any suggestion here?

Copy link
Contributor

@tmshort tmshort Oct 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually... that's a problem. Both of them can not be enabled by default. Only one of them should be enabled at a time, and those are in different deployment scenarios (cert-manager vs Openshift).

The helm config does not provide a way to disable feature gates, only enable them.
So either:
a) We need to keep them disabled and add the set of feature flags to values.yaml as default on, which also means they would need to be kept in the experimental.yaml file as well.
b) We need to keep them disabled and add the feature-flags with a check for .Values.options.openshift.enabled to add the appropriate enabled=True feature-flag.
c) Inverse of (b) keep flags enabled, and then disabled=True based on `.Values.options.openshift.enabled'

Personally, I dislike negative logic, and would prefer that (b) is used, so it's explicitly clear what's enabled, rather than "disabling" the one we don't want.

Copy link
Contributor

@perdasilva perdasilva Oct 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would introducing another flag to the command like --certificate-provider (that can be one of openshift-serviceca, or cert-manager (default: cert-manager)) help here? Then we can have them both on by default. Then downstream Deployment manifest uses --certificate-provider=openshift-serviceca and upstream Deployment manifest uses --certificate-provider=cert-manager (or nothing since its the default). Upstream users can then choose which to use to use openshift-serviceca if they want to by altering the Deployment command.

Copy link
Contributor Author

@camilamacedo86 camilamacedo86 Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really liked @perdasilva’s idea 💡

The main concern with not keeping it enabled by default is the user experience — having too many flags to enable could quickly become cumbersome as this list grows.

We should also think about the long term. At some point, we’ll likely remove the feature flag entirely and make the code part of the main source.
Maybe it’s worth discussing how we want to handle that transition and what our long-term approach should be? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit with @perdasilva's idea see the latest one.
@tmshort @perdasilva

@tmshort
Copy link
Contributor

tmshort commented Oct 14, 2025

/hold
We can't have both feature gates enabled simultaneously, as one is meant for upstream (WebhookProviderCertManager), and the other is meant for downstream (WebhookProviderOpenshiftServiceCA).

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 14, 2025
@tmshort
Copy link
Contributor

tmshort commented Oct 14, 2025

I think my preference would be to leave the feature-gates off by default, and put just the one we want into the helm/olmv1/values.yaml file, but this would mean also leaving them in the experimental.yaml file as well, which is not desirable (i.e. have it in two places). The alternative is to use an if/.Values.openshift.enabled/else sequence to separate the two flags.

BUT, the strangeness comes from the fact that we have cluster-olm-operator getting feature-gates from Openshift. I have a downstream change that would "remove" all upstream gates from values.yaml, and that might be needed before this is merged downstream.

BUT, another strangeness occurs because helm does not merge/combine lists; they are overwritten.

I need to think about this some more.

@camilamacedo86
Copy link
Contributor Author

/hold

Until we decide how to do

@tmshort
Copy link
Contributor

tmshort commented Oct 14, 2025

That we have two feature-gates that can't be enabled simultaneously is causing us this headache (this should not be an issue with 99%of other features).

Although the CertManager one takes priority:

func getCertificateProvider() render.CertificateProvider {
if features.OperatorControllerFeatureGate.Enabled(features.WebhookProviderCertManager) {
return certproviders.CertManagerCertificateProvider{}
} else if features.OperatorControllerFeatureGate.Enabled(features.WebhookProviderOpenshiftServiceCA) {
return certproviders.OpenshiftServiceCaCertificateProvider{}
}
return nil
}

Which I guess is a reasonable default. But it means we have to explicitly turn off CertManager to get OpenshiftServiceCA, and we don't really have that capability downstream.

Due to the downstream mapping of feature-gates, we only handle "enabled" flags (i.e. we can't turn things off), and we couldn't turn things off without a way in the helm chart to do so (i.e. additional upstream effort).

And disabling features just sounds wrong to me. I really think we need to enable features.

Add in the fact that helm doesn't merge lists makes this more challenging.

I'm envisioning a number of possible solutions, none of which I really like, so I'm wondering if this is something to be discussed in a meeting (office hours)?

@tmshort
Copy link
Contributor

tmshort commented Oct 14, 2025

Options?

  1. Keep these feature-gates off, explicitly turn on CertManager upstream through a new standard.yaml file, which means that experimental.yaml would also need to include it. Downstream, the OpenshiftServiceCA value would be enabled via the Openshift feature-gates.

1a. Add a new "standardFeatures" list that parallels the existing helm feature lists so that the value doesn't have to be duplicated in both standard.yaml and experimental.yaml. (this is a bunch of effort to avoid duplication of a value in a list).

  1. Change the feature-gates to be on, but then add upstream and downstream and cluster-olm-operator mechanisms to be able to disable the feature gates. This would be a lot of additional work.

  2. Add a new --webhook-provider option to operator-controller to select between CertManager and OpenshiftServiceCA (the default would be no provider). This effectively replaces the feature-gates with a tri-state, and would be enabled in the chart via .Values.openshift.enabled or .Values.certManager.enabled settings.

Co-authored-by: Todd Short <tmshort@users.noreply.github.com>
flags.StringVar(&cfg.cachePath, "cache-path", "/var/cache", "The local directory path used for filesystem based caching")
flags.StringVar(&cfg.systemNamespace, "system-namespace", "", "Configures the namespace that gets used to deploy system resources.")
flags.StringVar(&cfg.globalPullSecret, "global-pull-secret", "", "The <namespace>/<name> of the global pull secret that is going to be used to pull bundle images.")
flags.StringVar(&cfg.certificateProvider, "certificate-provider", certificateProviderCertManager, "The certificate provider to use for webhook support. Options: 'cert-manager' (default) or 'openshift-serviceca'.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't think we can have this enabled to cert-manager by default, because we have the option to run without cert-manager.
I do like the idea, however.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Change for: --webhook-certificate-provider

Comment on lines +47 to +49
{{- if .Values.options.openshift.enabled }}
- --certificate-provider=openshift-serviceca
{{- end }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{{- if .Values.options.openshift.enabled }}
- --certificate-provider=openshift-serviceca
{{- end }}
{{- if .Values.options.certManager.enabled }}
- --certificate-provider=cert-manager
{{- else if .Values.options.openshift.enabled }}
- --certificate-provider=openshift-serviceca
{{- end }}

flags.StringVar(&cfg.cachePath, "cache-path", "/var/cache", "The local directory path used for filesystem based caching")
flags.StringVar(&cfg.systemNamespace, "system-namespace", "", "Configures the namespace that gets used to deploy system resources.")
flags.StringVar(&cfg.globalPullSecret, "global-pull-secret", "", "The <namespace>/<name> of the global pull secret that is going to be used to pull bundle images.")
flags.StringVar(&cfg.certificateProvider, "certificate-provider", certificateProviderCertManager, "The certificate provider to use for webhook support. Options: 'cert-manager' (default) or 'openshift-serviceca'.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
flags.StringVar(&cfg.certificateProvider, "certificate-provider", certificateProviderCertManager, "The certificate provider to use for webhook support. Options: 'cert-manager' (default) or 'openshift-serviceca'.")
flags.StringVar(&cfg.certificateProvider, "", certificateProviderCertManager, "The certificate provider to use for webhook support. Options: 'cert-manager' or 'openshift-serviceca', disabled by default.")

Comment on lines +526 to 531
switch strings.ToLower(cfg.certificateProvider) {
case "", certificateProviderCertManager:
return certproviders.CertManagerCertificateProvider{}, nil
case certificateProviderOpenshiftCA:
return certproviders.OpenshiftServiceCaCertificateProvider{}, nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
switch strings.ToLower(cfg.certificateProvider) {
case "", certificateProviderCertManager:
return certproviders.CertManagerCertificateProvider{}, nil
case certificateProviderOpenshiftCA:
return certproviders.OpenshiftServiceCaCertificateProvider{}, nil
}
switch strings.ToLower(cfg.certificateProvider) {
case "":
return nil, nil
certificateProviderCertManager:
return certproviders.CertManagerCertificateProvider{}, nil
case certificateProviderOpenshiftCA:
return certproviders.OpenshiftServiceCaCertificateProvider{}, nil
}

@tmshort
Copy link
Contributor

tmshort commented Oct 15, 2025

There is no default certificate provider in OLMv1. We provide explicit options to turn on and use cert-manager and/or Openshift serviceca, and I think we need to be able to reflect that.

Some fairly small changes would be needed to make the --certificate-provider a true tri-state.

Comment on lines -49 to -68
// WebhookProviderCertManager enables support for installing
// registry+v1 cluster extensions that include validating,
// mutating, and/or conversion webhooks with CertManager
// as the certificate provider.
WebhookProviderCertManager: {
Default: false,
PreRelease: featuregate.Alpha,
LockToDefault: false,
},

// WebhookProviderCertManager enables support for installing
// registry+v1 cluster extensions that include validating,
// mutating, and/or conversion webhooks with Openshift Service CA
// as the certificate provider.
WebhookProviderOpenshiftServiceCA: {
Default: false,
PreRelease: featuregate.Alpha,
LockToDefault: false,
},

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My fear is that this will break downstream... we need to remove downstream feature gate references before we can delete this.
Specifically, the techpreview/devpreview builds will fail. So, this should remain as false, and should not be looked at. As long as we can accept the feature-gate value, we're good. Otherwise, operator-controller will exit early complaining of an unknown feature gate.

Comment on lines -17 to -18
WebhookProviderCertManager featuregate.Feature = "WebhookProviderCertManager"
WebhookProviderOpenshiftServiceCA featuregate.Feature = "WebhookProviderOpenshiftServiceCA"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, we can't really delete feature-gates until it's no longer referenced downstream.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 16, 2025
@openshift-merge-robot
Copy link

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants