Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce expected tags.monitoring=false behavior #1927

Merged
merged 20 commits into from
Aug 22, 2023

Conversation

danielhoherd
Copy link
Member

Description

tags.monitoring=false was not disabling all components when it was expected to do so. This was caused by two different implementations of the conditions used in the tags. Components whose conditions were set to true would always be on, and components whose conditions were not used elsewhere would work as expected. This is because helm values override the tags behavior, so if a condition is set to true in values.yaml, disabling the tag has lower precedence and is ignored.

The fix I have in this branch may not be the right fix. It does what is expected, but I am not sure if it is the best way to solve the problem. It may be acceptable for our use case.

While working on debugging this I also made the following changes:

  • Delete a bunch of cruft related to the Prometheus Operator ServiceMonitor custom resource. AFAIK these have never been supported. The are certainly not supported now, and were breaking some broad tests that I added.
  • Re-order some data in Chart.yaml, improve comments, fix conditions so tags work
  • Make DEBUG=1 show additional test data and not delete temp helm values files
  • Rename some templates so they have consistent, singluar names that match their k8s kind. EG: secrets.yaml -> secret.yaml
  • Add a few more enable_all_features.yaml flags
  • Add some missing k8s schema files that cover templates that had not previously been tested

Related Issues

https://github.com/astronomer/issues/issues/5714

Testing

I have done a lot of testing on this, and most of the changes are additional tests. Some template files have been renamed though, which may cause undesirable behaviors in operation, such as a deletion of a thing followed by the addition of a thing, which could cause things to shuffle around or potentially break. This is an undesirable behavior, so we should test this and consider the stability of existing deployments compared with the value of this additional test coverage.

Merging

These changes fix a bug, so they should be merged everywhere.

@danielhoherd danielhoherd requested review from a team as code owners July 15, 2023 19:50
@pgvishnuram
Copy link
Contributor

@danielhoherd since we have enabled nodeExporter as true in https://github.com/astronomer/astronomer/blob/master/values.yaml#L75

@danielhoherd
Copy link
Member Author

danielhoherd commented Jul 20, 2023

@danielhoherd since we have enabled nodeExporter as true in https://github.com/astronomer/astronomer/blob/master/values.yaml#L75

Yeah, that's the source of the problem. Anything in values.yaml will override the condition in Chart.yaml. Because we set global.nodeExporterEnabled, the dependencies[].condition in Chart.yaml can never be false, which means tags.monitoring=false could never work the way it was intended. Changing the condition value in Chart.yaml to something that is not explicitly set in values.yaml or anywhere else in the chart allows both tags.monitoring to work and global.nodeExporterEnabled to work as expected, which is that tags.monitoring takes precedence.

@danielhoherd
Copy link
Member Author

danielhoherd commented Jul 20, 2023

Actually, TIL that we can completely remove the conditions, which avoids the conflict between conditions and values. This works for our use case and the way we have designed our chart.

@danielhoherd
Copy link
Member Author

danielhoherd commented Jul 20, 2023

After digging a bit, it looks like our Astronomer chart is not using tags and conditions best practices, and this is not the right time to rip that out and fix it. After my last comment I was hoping to simplify the conditions, but I am not going to do so. I think this PR is good to go. CC @pgvishnuram

https://helm.sh/docs/chart_best_practices/dependencies/#conditions-and-tags

Comment on lines +24 to +49
- name: alertmanager
condition: global.alertmanagerEnabled
tags:
- monitoring
- name: prometheus
condition: global.prometheusEnabled
- name: grafana
condition: global.grafanaEnabled
tags:
- monitoring
- name: prometheus-postgres-exporter
condition: global.prometheusPostgresExporterEnabled
- name: kube-state
condition: global.kubeStateEnabled
tags:
- monitoring
- name: alertmanager
condition: global.alertmanagerEnabled
- name: prometheus-blackbox-exporter
condition: global.blackboxExporter.Enabled
tags:
- monitoring
- name: kube-state
condition: global.kubeStateEnabled
- name: prometheus-node-exporter
condition: global.nodeExporter.Enabled
tags:
- monitoring
- name: prometheus-node-exporter
condition: global.nodeExporterEnabled
- name: prometheus-postgres-exporter
condition: global.prometheusPostgresExporter.Enabled
tags:
- monitoring
- name: prometheus-blackbox-exporter
condition: global.blackboxExporterEnabled
- name: prometheus
condition: global.prometheusEnabled
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the alphabetical order improvement, easier to find a dependency

@danielhoherd danielhoherd merged commit b6e796e into master Aug 22, 2023
7 of 9 checks passed
@danielhoherd danielhoherd deleted the 5714-monitoring-tag branch August 22, 2023 03:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants