
Fix Helm Chart remove Capabilities.APIVersions for Kustomize to parse file #7829

Merged: 1 commit merged into kubernetes:main on Feb 1, 2022

Conversation

@stoupance (Contributor) commented Oct 20, 2021

What this PR does / why we need it:

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation only

Which issue/s this PR fixes

This PR fixes the issue initially described here. The provided Helm chart could not be customized by Kustomize to build the YAML section for the serviceMonitor, because this Helm template file has the following test as its first line.

{{- if and ( .Capabilities.APIVersions.Has "monitoring.coreos.com/v1" ) .Values.controller.metrics.enabled .Values.controller.metrics.serviceMonitor.enabled -}}

So Kustomize with Helm (kubectl kustomize . --enable-helm=true) builds the YAML from this template locally and cannot pass the first test ( .Capabilities.APIVersions.Has "monitoring.coreos.com/v1" ) for obvious reasons. The test fails, and the final YAML section for the serviceMonitor is therefore not output as expected.

This test is irrelevant and is not present in other template files, for example the one that builds the prometheusRule for Prometheus.

This issue is not present with vanilla Helm and only appears when Kustomize is used on top of Helm.
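
For illustration, a minimal kustomization.yaml that reproduces the problem could look like the sketch below (chart version, release name, and values are illustrative, not taken from this PR):

    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    helmCharts:
      - name: ingress-nginx
        repo: https://kubernetes.github.io/ingress-nginx
        version: 4.0.6          # illustrative chart version
        releaseName: ingress    # illustrative release name
        valuesInline:
          controller:
            metrics:
              enabled: true
              serviceMonitor:
                enabled: true

Running kubectl kustomize . --enable-helm=true against this renders the chart without access to a live cluster, so the CRD-provided API group monitoring.coreos.com/v1 is absent from the default capability set, the test evaluates to false, and the ServiceMonitor manifest is silently dropped.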

How Has This Been Tested?

  1. I forked this repo
  2. I made the change to the controller-servicemonitor.yaml file and pushed it to my forked repo
  3. I changed my code to pull files from my forked repo as a Helm repo
  4. I ran the kustomize command provided previously (kubectl kustomize . --enable-helm=true)
  5. Results: it outputs the YAML section for serviceMonitor as expected

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I've read the CONTRIBUTION guide
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 20, 2021
@k8s-ci-robot (Contributor)

Welcome @stoupance!

It looks like this is your first PR to kubernetes/ingress-nginx 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/ingress-nginx has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot (Contributor)

Hi @stoupance. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Oct 20, 2021
@k8s-ci-robot k8s-ci-robot added the area/helm Issues or PRs related to helm charts label Oct 20, 2021
@stoupance (Contributor, Author)

/assign @ChiefAlexander

@ChiefAlexander (Contributor)

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Oct 20, 2021
@ChiefAlexander (Contributor)

@rikatz for the release process of the helm chart, do we need to have PRs bump the chart version?

This looks good to me otherwise.

@cpanato (Member) left a comment

/lgtm
/hold for @rikatz

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 21, 2021
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 21, 2021
@cpanato (Member) commented Oct 21, 2021

@ChiefAlexander @rikatz I think we decide when we have enough things to release, or when there is a critical fix/feature.

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 21, 2021
@ChiefAlexander (Contributor)

I disagree. I would rather we release fixes as they happen to get them out as quickly as possible. There should be no harm in bumping a bugfix version and turning that around. Even minor fixes.

@cpanato (Member) commented Oct 21, 2021

I disagree. I would rather we release fixes as they happen to get them out as quickly as possible. There should be no harm in bumping a bugfix version and turning that around. Even minor fixes.

I agree with you @ChiefAlexander; I was just explaining how it was before. I'm not sure if we want to continue doing that.

@tao12345666333 (Member) left a comment

This check was added here to prevent an incorrect configuration from causing exceptions.
And since this is a Helm chart, I don't think it is reasonable to delete this condition.

/hold

@ChiefAlexander (Contributor)

This check was added here to prevent an incorrect configuration from causing exceptions. And since this is a Helm chart, I don't think it is reasonable to delete this condition.

/hold

It is not used in the other CRDs for this. Given that the option .Values.controller.metrics.serviceMonitor.enabled is tied directly to the CRD in this case, this seems totally fine.

@cpanato (Member) commented Oct 24, 2021

I agree with @ChiefAlexander, or I'm missing something that I cannot see/understand.

@rikatz (Contributor) commented Oct 24, 2021

  • You folks are the Helm experts, not me. You can approve and release as you wish :P
  • I think it's fine to release the Helm chart untied; what we actually need is something that defines when to bump or not. If we bump too often because of typos or new features for a really small subset of users, users may miss the opportunity to update, thinking all the releases are small ones :)

@tao12345666333 (Member)

The problem I am considering here is the following situation:

➜  ingress-nginx git:(stoupance/main) ✗ helm install ingress -n 7829-ingress --set controller.metrics.enabled=true --set controller.metrics.serviceMonitor.enabled=true charts/ingress-nginx
Error: unable to build kubernetes objects from release manifest: unable to recognize "": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"

If we assume that the user knows what they are doing, then this condition can be removed. (Feel free to remove the hold and merge it. )

@longwuyuan (Contributor) commented Oct 25, 2021 via email

@stoupance (Contributor, Author) commented Oct 25, 2021

The problem I am considering here is the following situation:

➜  ingress-nginx git:(stoupance/main) ✗ helm install ingress -n 7829-ingress --set controller.metrics.enabled=true --set controller.metrics.serviceMonitor.enabled=true charts/ingress-nginx
Error: unable to build kubernetes objects from release manifest: unable to recognize "": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"

If we assume that the user knows what they are doing, then this condition can be removed. (Feel free to remove the hold and merge it. )

@tao12345666333 Well, this error is thrown by your cluster because, I assume, you don't have the proper (Prometheus) CRDs installed? Which is, in fact, an expected and "normal" error. It goes the same way with any operator: if it's not installed, then you cannot use its CRDs. Nothing can protect a user from this; this is the default behaviour.
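
(As an illustrative aside, a quick way to check whether the ServiceMonitor CRD is actually present in a cluster is:

    kubectl get crd servicemonitors.monitoring.coreos.com

If that reports NotFound, a live helm install of a chart that renders a ServiceMonitor will fail with exactly the error quoted above.)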

There are actually three conditions in the template file:

  1. .Capabilities.APIVersions.Has "monitoring.coreos.com/v1"
  2. .Values.controller.metrics.enabled
  3. .Values.controller.metrics.serviceMonitor.enabled

If you want to keep condition 1, I can understand that, but then it must also be added to this other template file, because both use the same CRDs (keep in mind that doing this would then prevent any usage of this Helm chart with Kustomize, which may be the case for only a few users, but it is my case 😄).

Indeed, condition 1 cannot be present in one file and absent from another template file when both are tied to the same CRDs. This is not logical. And this condition is actually causing an issue when the chart is used through Kustomize.

That being said, the third condition (.Values.controller.metrics.serviceMonitor.enabled) also ensures the proper CRDs are deployed into the cluster and prevents wrong situations for users who don't know what they are doing (it is because of this that you got the error output preventing you from deploying the serviceMonitor).


Personally, servicemonitor is just too complicated. Unless the docs can be updated to teach and support service monitors, this may not end up being an improvement. Thanks, Long

@longwuyuan I don't understand your point. The serviceMonitor is already present in the Helm template file, and the issue is not related to the serviceMonitor itself. Nor is it related to Prometheus/serviceMonitor documentation, nor is it an improvement. It is a bug fix about the conditions governing how the serviceMonitor is deployed, to avoid any error.

In this case, when using Kustomize to customize the Helm chart, the serviceMonitor is not deployed because the condition .Capabilities.APIVersions.Has "monitoring.coreos.com/v1" cannot be verified by Kustomize.
As said by @ChiefAlexander, the option .Values.controller.metrics.serviceMonitor.enabled is already tied directly to these same CRDs, so basically this Helm template file contains the same check twice, written differently, and the first one causes an issue in this particular case of using Kustomize. See my answer just above for more details.

So it's not an improvement but a bug fix, since some users (me, for example) cannot use this Helm chart at all because of this double check, one half of which is totally irrelevant.
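
For reference, removing condition 1 presumably leaves the template's first line as just the two values checks (a sketch; the actual diff is not shown in this thread):

    {{- if and .Values.controller.metrics.enabled .Values.controller.metrics.serviceMonitor.enabled -}}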

@tao12345666333 (Member)

@tao12345666333 Well, this error is thrown by your cluster because, I assume, you don't have the proper (Prometheus) CRDs installed? Which is, in fact, an expected and "normal" error. It goes the same way with any operator: if it's not installed, then you cannot use its CRDs. Nothing can protect a user from this; this is the default behaviour.

Sure. The situation I mentioned above is one where the user has not installed the relevant CRD and the option is turned on by mistake.

So adding this condition is equivalent to a stricter restriction.

The strict restriction has not caused any bugs; on the contrary, it reduces the probability of users seeing errors and brings a better user experience.

I think the reason here is that kustomize does not fully support all the capabilities of Helm (and this is not the native usage of Helm). ref: kubernetes-sigs/kustomize#3815 (comment)

The safest, most performant, and completely unlimited way to use kustomize + helm to generate config is

  1. manually inflate the helm chart via helm template --foo --bar ... (with whatever risky options one wants),
  2. commit the resulting plain old k8s YAML to a repo.
  3. use kustomize to operate on that repo, treating it as a base, adding some last mile config changes.

From time to time, capture the latest chart, and repeat steps 1 and 2. It's dangerous to set up a production stack that relies on kustomize to freshly download a helm chart with each build.
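
A sketch of that inflate-and-commit workflow (chart version, paths, and the patch file are illustrative; --api-versions is vanilla Helm's way to satisfy Capabilities checks for CRD-provided API groups during templating):

    # 1. Inflate the chart once, pinning the version.
    helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
    helm template ingress ingress-nginx/ingress-nginx --version 4.0.6 \
      --api-versions monitoring.coreos.com/v1 \
      --set controller.metrics.enabled=true \
      --set controller.metrics.serviceMonitor.enabled=true \
      > base/ingress-nginx.yaml

    # 2. Commit base/ingress-nginx.yaml as plain Kubernetes YAML.

    # 3. kustomization.yaml: treat the rendered output as a base and add
    #    last-mile config changes on top.
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    resources:
      - base/ingress-nginx.yaml
    patches:
      - path: last-mile-patch.yaml   # hypothetical local patch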

@rikatz (Contributor) commented Jan 9, 2022

How is this going?

@rikatz rikatz removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 16, 2022
@iamNoah1 (Contributor) left a comment

@longwuyuan @tao12345666333 friendly ping to follow up here :)

@iamNoah1 (Contributor) left a comment

/kind bug
/triage accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority labels Jan 25, 2022
@longwuyuan (Contributor)

@stoupance, any chance you can create an issue and link it here?
Such a complicated discussion requires at least an issue describing the current problem this PR wants to solve, in such a way that all normal users of the ingress-nginx controller can understand and track it.

@afirth (Member) left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 1, 2022
@rikatz (Contributor) commented Feb 1, 2022

/lgtm
/approve
We discussed this in the ingress-nginx dev meeting and it makes sense.

Let's move forward with this

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 1, 2022
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: afirth, cpanato, rikatz, stoupance

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 0f46433 into kubernetes:main Feb 1, 2022
@tao12345666333 tao12345666333 mentioned this pull request Feb 27, 2022
Bazze added a commit to Bazze/ingress-nginx that referenced this pull request Aug 3, 2022
As fixed in pull request kubernetes#7829 for the ServiceMonitor resource, this is also needed for the PrometheusRule. When upgrading the ingress-nginx chart in our environment (via Pulumi) from a really old version to the latest (4.2.0) we noticed it wanted to delete the PrometheusRule resource. This PR should fix that.
rchshld pushed a commit to joomcode/ingress-nginx that referenced this pull request May 19, 2023
Bazze added a commit to Bazze/ingress-nginx that referenced this pull request Jun 5, 2023 (same commit message as above)
k8s-ci-robot pushed a commit that referenced this pull request Feb 27, 2024 (same commit message as above)