Prometheus operator helm chart fun and games #824

Closed
gitfool opened this issue Oct 1, 2019 · 4 comments
Labels
kind/bug Some behavior is incorrect or out of spec resolution/fixed This issue was fixed

Comments

@gitfool
gitfool commented Oct 1, 2019

I've been struggling to install the prometheus-operator to an EKS cluster via the helm chart.

I wanted everything to be tucked into a "monitoring" namespace, and I eventually worked out what I needed to disable, which makes sense in hindsight given the EKS control plane, and now have the following:

const monitoringNamespace = new k8s.core.v1.Namespace("monitoring", { metadata: { name: "monitoring" } }, { provider: provider });

function fixGrafanaTest(obj: any) {
    if (obj.metadata.name === "po-grafana-test") {
        if (obj.kind === "Pod") {
            obj.metadata.annotations = {
                "pulumi.com/skipAwait": "true"
            };
        }
    }
}

function setMonitoringNamespace(obj: any) {
    if (obj.metadata.namespace === undefined) {
        obj.metadata.namespace = "monitoring";
    }
}

const prometheusOperatorChart = new k8s.helm.v2.Chart("po", {
    repo: "stable",
    chart: "prometheus-operator",
    version: "6.11.0",
    namespace: "monitoring",
    transformations: [ fixGrafanaTest, setMonitoringNamespace ],
    values: {
        kubeControllerManager: { enabled: false },
        kubeEtcd: { enabled: false },
        kubeScheduler: { enabled: false },
        kubeTargetVersionOverride: k8sVersion,
        prometheusOperator: { createCustomResource: false }
    }
}, {
    dependsOn: monitoringNamespace,
    provider: provider
});

I'd like to highlight the following issues:

  • If the prometheusOperatorChart referenced monitoringNamespace using namespace: monitoringNamespace.metadata.name, then I'd get the warning "[Can't preview] all chart values must be known ahead of time to generate an accurate preview", so I'm using the same constant value instead. Maybe this could be improved? (See pulumi#3245, "Propagate inputs to outputs during preview".)
  • The Grafana test uses a helm hook on test-success and always fails its Grafana health check. I initially thought this was due to the timing of helm hooks not being supported when using helm template, as Pulumi does behind the scenes. But then I'd expect subsequent runs to succeed, since Grafana is already running by then, and it still always fails, so it's probably something more subtle with the test itself.
    • Subsequently there is the problem that Pulumi waits until it times out. Handily, I could add the "pulumi.com/skipAwait" annotation, which works around my impatience, but it seems to me that Pulumi could be smarter here: the pod specifies that it should never restart, so Pulumi should stop waiting after the first failure.
    • So now it fails fast, but I still need a workaround to avoid the failing test, such as another transformation to neuter it or, preferably, remove it altogether along with all of its supporting resource detritus. I can see how to neuter the pod by modifying its image etc., but not so much the supporting resources.
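
One caveat on the skipAwait workaround: fixGrafanaTest replaces the pod's whole annotations object, so any annotations the chart already set on that pod are lost. A minimal sketch of a variant that merges instead is below. The function name skipAwaitGrafanaTest is hypothetical; only the resource name and the "pulumi.com/skipAwait" annotation come from the snippet above.

```typescript
// Hypothetical variant of fixGrafanaTest: adds the skipAwait annotation
// while preserving any annotations already present on the pod.
function skipAwaitGrafanaTest(obj: any): void {
    if (!obj || !obj.metadata) {
        return; // nothing to annotate
    }
    if (obj.kind === "Pod" && obj.metadata.name === "po-grafana-test") {
        obj.metadata.annotations = Object.assign(
            {},
            obj.metadata.annotations || {},
            { "pulumi.com/skipAwait": "true" }
        );
    }
}
```

It would slot into the transformations array in place of fixGrafanaTest. Whether preserving the existing annotations matters depends on what the rendered chart puts there, which this thread does not show.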
@gitfool
Author

gitfool commented Oct 1, 2019

@hausdorff mentioned in Slack a possible hack to remove resources in a transformation:

if you set apiVersion: "v1" and kind: "List" I think it should work

I'll give that a go in the meantime, but it would obviously be better to have a first-class way to remove resources via a transformation!

@pgavlin
Member

pgavlin commented Oct 1, 2019

I'll give that a go in the meantime, but it would obviously be better to have a first-class way to remove resources via a transformation!

FWIW, that capability is tracked via #486

@gitfool
Author

gitfool commented Oct 1, 2019

I used kubectl to run the equivalent Grafana test and it worked:

kubectl exec -it -n monitoring -c grafana po-grafana-56fd4bc598-lb94k bash
curl -s -o /dev/null -I -w '%{http_code}' http://po-grafana/api/health
200

... so I'm still not sure why it's always failing.

Meanwhile, I'm very pleased to say the hack to remove resources in a transformation works for me:

function removeGrafanaTest(obj: any) {
    if (obj.metadata.name === "po-grafana-test") {
        obj.apiVersion = "v1";
        obj.kind = "List";
        obj.items = [];
    }
}
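
If more than one resource needs dropping, the same empty-List hack could be wrapped in a small factory. This is only a sketch: removeByName and the Transformation type alias are hypothetical names, not part of the Pulumi API, and it relies on the same behaviour described above (an object rewritten to an empty v1 List yields no resources).

```typescript
// Hypothetical alias for the shape of a chart transformation used in this thread.
type Transformation = (obj: any) => void;

// Returns a transformation that turns every object whose metadata.name is in
// `names` into an empty v1 List, using the same hack as removeGrafanaTest.
function removeByName(...names: string[]): Transformation {
    return (obj: any): void => {
        if (!obj || !obj.metadata) {
            return; // objects without metadata are left untouched
        }
        if (names.indexOf(obj.metadata.name) !== -1) {
            obj.apiVersion = "v1";
            obj.kind = "List";
            obj.items = [];
        }
    };
}
```

With this, the chart's transformations array could read [ removeByName("po-grafana-test"), setMonitoringNamespace ]. Any names beyond "po-grafana-test" would need to be checked against the rendered chart.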

@lblackstone
Member

Looks like this was fixed in #486
