operator: Allow disabling Grafana deployment #1241

Merged
merged 4 commits into from Jul 5, 2021

@bison bison commented Jun 23, 2021

This exposes a config option in the ConfigMap that allows disabling the Grafana deployment.

  • I added CHANGELOG entry for this change.
  • No user facing changes, so no entry in CHANGELOG was needed.

Summary of the changes:

  • Added Enabled field to GrafanaConfig struct.
  • The Grafana task is split into create() and destroy() methods.
  • The htpasswd related bits (volume and container argument) are removed from jsonnet.
  • The operator injects these at runtime if Grafana is enabled.
  • The shared config task doesn't inject URLs if they are nil.
  • Add some basic unit and e2e tests.
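A minimal sketch of how the first two items could fit together, assuming a pointer-typed `Enabled` field so that "unset" can default to enabled. `IsEnabled`, `GrafanaTask`, and `Run` are hypothetical names used for illustration and are not taken from the PR diff.

```go
package main

import "fmt"

// GrafanaConfig sketches the new option. A *bool distinguishes
// "not set" (nil) from an explicit false.
type GrafanaConfig struct {
	Enabled *bool
}

// IsEnabled is a hypothetical helper: Grafana stays on unless it is
// explicitly disabled. It is safe to call on a nil receiver.
func (g *GrafanaConfig) IsEnabled() bool {
	if g == nil || g.Enabled == nil {
		return true
	}
	return *g.Enabled
}

// GrafanaTask is a stand-in for the operator task. The log field only
// records which path was taken.
type GrafanaTask struct {
	cfg *GrafanaConfig
	log []string
}

// Run dispatches to create() or destroy() depending on the config.
func (t *GrafanaTask) Run() error {
	if t.cfg.IsEnabled() {
		return t.create()
	}
	return t.destroy()
}

func (t *GrafanaTask) create() error {
	t.log = append(t.log, "create")
	return nil
}

func (t *GrafanaTask) destroy() error {
	t.log = append(t.log, "destroy")
	return nil
}

func main() {
	off := false
	// Missing config, empty config, and explicitly disabled config.
	for _, cfg := range []*GrafanaConfig{nil, {}, {Enabled: &off}} {
		task := &GrafanaTask{cfg: cfg}
		if err := task.Run(); err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(task.log)
	}
}
```

Only the explicitly disabled config takes the `destroy()` path; the other two keep the default behaviour.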

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 23, 2021
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 23, 2021
@bison bison force-pushed the disable-grafana branch 2 times, most recently from 4ec99aa to b9ede49 Compare June 23, 2021 12:03
@bison bison changed the title WIP: operator: Allow disabling Grafana deployment operator: Allow disabling Grafana deployment Jun 23, 2021
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 23, 2021
@bison bison changed the title operator: Allow disabling Grafana deployment WIP: operator: Allow disabling Grafana deployment Jun 23, 2021
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 23, 2021
@bison bison force-pushed the disable-grafana branch 3 times, most recently from 612e227 to f747f70 Compare June 29, 2021 11:51
@bison bison changed the title WIP: operator: Allow disabling Grafana deployment operator: Allow disabling Grafana deployment Jun 29, 2021
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 29, 2021
@@ -186,3 +186,27 @@ func TestEtcdDefaultsToDisabled(t *testing.T) {
		t.Error("an empty etcd configuration should have etcd disabled")
	}
}

func TestGrafanaDefaultsToEnabled(t *testing.T) {
Contributor

It would be nice to convert this into a table driven test to make sure each test case is run independently of each other. Here is one such example: https://github.com/openshift/cluster-monitoring-operator/blob/master/pkg/manifests/manifests_test.go#L60

Contributor Author

ACK. I also prefer that. Honestly, I just copied the extremely similar test above this and changed it to test Grafana. I can try to find time to make it table driven.

Contributor Author

Done.
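For reference, a table-driven version of the defaulting test could look roughly like the sketch below. The `Config` and `GrafanaConfig` types and the `GrafanaEnabled` helper are simplified stand-ins, not the operator's real types. In a real repository the test function would live in a `_test.go` file; `main` is only there so the snippet also runs on its own.

```go
package main

import (
	"fmt"
	"testing"
)

// Simplified, hypothetical stand-ins for the operator's config types.
type GrafanaConfig struct{ Enabled *bool }

type Config struct{ Grafana *GrafanaConfig }

// GrafanaEnabled defaults to true when nothing is configured.
func (c *Config) GrafanaEnabled() bool {
	if c == nil || c.Grafana == nil || c.Grafana.Enabled == nil {
		return true
	}
	return *c.Grafana.Enabled
}

func boolPtr(b bool) *bool { return &b }

// Each row is an independent case: a name, an input, and the expected result.
var grafanaCases = []struct {
	name string
	cfg  *Config
	want bool
}{
	{"empty config", &Config{}, true},
	{"empty grafana section", &Config{Grafana: &GrafanaConfig{}}, true},
	{"explicitly enabled", &Config{Grafana: &GrafanaConfig{Enabled: boolPtr(true)}}, true},
	{"explicitly disabled", &Config{Grafana: &GrafanaConfig{Enabled: boolPtr(false)}}, false},
}

// TestGrafanaEnabled runs every row as its own subtest, so one failing
// case does not hide the others.
func TestGrafanaEnabled(t *testing.T) {
	for _, tc := range grafanaCases {
		tc := tc // capture the loop variable for the closure
		t.Run(tc.name, func(t *testing.T) {
			if got := tc.cfg.GrafanaEnabled(); got != tc.want {
				t.Errorf("GrafanaEnabled() = %v, want %v", got, tc.want)
			}
		})
	}
}

func main() {
	// Walk the same table outside of `go test`, for demonstration only.
	for _, tc := range grafanaCases {
		fmt.Printf("%s: got=%v want=%v\n", tc.name, tc.cfg.GrafanaEnabled(), tc.want)
	}
}
```

Adding a new scenario then means adding one row to the table rather than copying a whole test function.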

bison commented Jun 29, 2021

Quick note: There was some offline discussion about the fact that we generally try to keep the jsonnet / assets as close as possible to the default deployment. In this case that would require either needlessly enabling the static password auth when Grafana is disabled, or inverting the logic in manifests.go to remove things when Grafana is disabled instead of adding them when it is enabled. That is tricky in this case because these things are often buried in nested arrays. IMHO with the comments in the jsonnet, the original approach of adding the things at runtime is better.
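To illustrate the "add at runtime" approach described above, here is a rough sketch using simplified stand-ins for the Kubernetes pod types. The volume name, secret name, mount path, and the `injectHtpasswd` function are all assumptions made for the example; the real code operates on the `k8s.io/api` structs and may differ.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes pod types.
type VolumeMount struct{ Name, MountPath string }

type Container struct {
	Name         string
	Args         []string
	VolumeMounts []VolumeMount
}

type Volume struct{ Name, SecretName string }

type PodSpec struct {
	Containers []Container
	Volumes    []Volume
}

// All names and paths below are illustrative assumptions.
const (
	proxyContainer = "prometheus-proxy"
	htpasswdVolume = "secret-htpasswd"
	htpasswdDir    = "/etc/proxy/htpasswd"
)

// injectHtpasswd adds the htpasswd volume, volume mount, and proxy argument
// only when Grafana is enabled, so the base manifest does not carry them.
func injectHtpasswd(spec *PodSpec, grafanaEnabled bool) {
	if !grafanaEnabled {
		return
	}
	spec.Volumes = append(spec.Volumes, Volume{Name: htpasswdVolume, SecretName: "htpasswd"})

	// Range by index so the container stored in the slice is modified,
	// not a copy of it.
	for i := range spec.Containers {
		c := &spec.Containers[i]
		if c.Name != proxyContainer {
			continue
		}
		c.Args = append(c.Args, "-htpasswd-file="+htpasswdDir+"/auth")
		c.VolumeMounts = append(c.VolumeMounts, VolumeMount{Name: htpasswdVolume, MountPath: htpasswdDir})
	}
}

func main() {
	spec := &PodSpec{Containers: []Container{{Name: "prometheus"}, {Name: proxyContainer}}}
	injectHtpasswd(spec, true)
	fmt.Printf("%+v\n", spec.Containers[1])
	fmt.Printf("%+v\n", spec.Volumes)
}
```

The inverse approach would have to search the nested container, argument, and volume slices for entries to remove, which is the awkward part mentioned above.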

@bison bison force-pushed the disable-grafana branch 2 times, most recently from 469bd81 to 605592e Compare June 29, 2021 15:17
@simonpasquier simonpasquier left a comment

Just 2 minor comments but nothing blocking. Very nice job!

})

// Add the htpasswd arg and volume mount to the proxy container.
for i := range p.Spec.Containers {
Contributor

It might be clearer to move it to the range loop above, where we already tweak the prometheus-proxy container?

Contributor Author

Happy to do that. I don't have a strong opinion. I guess I could actually move all of this there to keep it all together. Unless it's too weird to have the volume bit there as well.

Contributor Author

Done.

})

// Add the htpasswd arg and volume mount to the proxy container.
for i := range d.Spec.Template.Spec.Containers {
Contributor

same remark here

Contributor Author

Done here as well.

assertOperatorCondition(t, configv1.OperatorAvailable, configv1.ConditionTrue)

// Push a default configuration that re-enables Grafana.
validCM.Data["config.yaml"] = "enableUserWorkload: true"
Contributor

How does enabling user workload monitoring also enable grafana?

Contributor Author

It doesn't. That was just setting the config back to what it was originally, i.e. removing the previously added Grafana bit. It was definitely confusing though. The new test is different.

@@ -92,6 +94,50 @@ func TestClusterMonitoringOperatorConfiguration(t *testing.T) {
t.Log("asserting that CMO goes back healthy after the configuration is fixed")
assertOperatorCondition(t, configv1.OperatorDegraded, configv1.ConditionFalse)
assertOperatorCondition(t, configv1.OperatorAvailable, configv1.ConditionTrue)

// Push a configuration that disables Grafana.
Contributor

I would extract this addition into a new independent test

Contributor Author

Done.

Comment on lines +1157 to +1180
data := map[string]string{}

// Configmap keys need to include "public" to indicate that they are public values.
// See https://bugzilla.redhat.com/show_bug.cgi?id=1807100.
if promHost != nil {
	data["prometheusPublicURL"] = promHost.String()
}

if amHost != nil {
	data["alertmanagerPublicURL"] = amHost.String()
}

if grafanaHost != nil {
	data["grafanaPublicURL"] = grafanaHost.String()
}

if thanosHost != nil {
	data["thanosPublicURL"] = thanosHost.String()
}

Contributor

Note: This will be removed in #1223

@fpetkovski

/lgtm but would be nice to get another review

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 30, 2021
@simonpasquier simonpasquier left a comment

/lgtm
I've checked offline with @sichvoge who's fine with the field as exposed in the configmap.

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 2, 2021
@fpetkovski

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 2, 2021
bison added 4 commits July 5, 2021 12:06
This adds an option to the CMO ConfigMap that allows disabling the
Grafana deployment.  The actual removal of deployed Grafana resources
is not yet implemented.

This implements the destroy method for the Grafana task.  It deletes
all the resources created by the create method in the reverse order.
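The reverse-order teardown described in the second commit message can be sketched as follows. The resource list is illustrative only (the real task manages more objects), and `create`/`destroy` here take a callback instead of talking to a cluster so the example stays self-contained.

```go
package main

import "fmt"

type resource struct{ kind, name string }

// Illustrative list only, in the order in which it would be created.
var grafanaResources = []resource{
	{"ServiceAccount", "grafana"},
	{"ConfigMap", "grafana-dashboards"},
	{"Service", "grafana"},
	{"Deployment", "grafana"},
}

// create applies every resource in declaration order.
func create(apply func(resource) error) error {
	for _, r := range grafanaResources {
		if err := apply(r); err != nil {
			return fmt.Errorf("creating %s %q: %w", r.kind, r.name, err)
		}
	}
	return nil
}

// destroy walks the same list backwards, so dependents are removed
// before the objects they depend on.
func destroy(del func(resource) error) error {
	for i := len(grafanaResources) - 1; i >= 0; i-- {
		r := grafanaResources[i]
		if err := del(r); err != nil {
			return fmt.Errorf("deleting %s %q: %w", r.kind, r.name, err)
		}
	}
	return nil
}

func main() {
	logStep := func(verb string) func(resource) error {
		return func(r resource) error {
			fmt.Println(verb, r.kind, r.name)
			return nil
		}
	}
	if err := create(logStep("create")); err != nil {
		fmt.Println(err)
	}
	if err := destroy(logStep("delete")); err != nil {
		fmt.Println(err)
	}
}
```

Driving both directions from a single list keeps the two methods from drifting apart when a resource is added later.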
@openshift-ci openshift-ci bot removed lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 5, 2021
bison commented Jul 5, 2021

@simonpasquier @fpetkovski: I had to rebase this. Can you have another look when you have a chance?

@simonpasquier simonpasquier left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 5, 2021
openshift-ci bot commented Jul 5, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bison, fpetkovski, simonpasquier

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [bison,fpetkovski,simonpasquier]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-ci bot commented Jul 5, 2021

@bison: The following test failed, say /retest to rerun all failed tests:

Test name                    Commit   Details  Rerun command
ci/prow/e2e-aws-single-node  91de7fb  link     /test e2e-aws-single-node

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.