
can't install prometheus-operator chart #285

Closed
geekflyer opened this issue Nov 19, 2018 · 2 comments

@geekflyer

chart: stable/prometheus-operator
version: 0.1.22
values: {} (default)

Hi, it seems impossible to install the prometheus-operator chart. I'm getting multiple errors that won't go away even after several attempts:

code:

import { core, helm } from '@pulumi/kubernetes';
import { k8sProvider } from '../cluster';

const appName = 'prometheus';

const namespaceName = appName;

const namespace = new core.v1.Namespace(
  namespaceName,
  {
    metadata: { name: namespaceName }
  },
  { provider: k8sProvider }
);

new helm.v2.Chart(
  appName,
  {
    repo: 'stable',
    chart: appName + '-operator',
    namespace: namespaceName,
    version: ' 0.1.22 ',
    values: {}
  },
  { dependsOn: namespace, providers: { kubernetes: k8sProvider } }
);

error:

  kubernetes:core:Service (kube-system/prometheus-prometheus-oper-kube-etcd):
    error: Plan apply failed: 2 errors occurred:

    * Resource operation was cancelled for 'prometheus-prometheus-oper-kube-etcd'
    * Service does not target any Pods. Application Pods may failed to become alive, or field '.spec.selector' may not match labels on any Pods

  kubernetes:core:Service (kube-system/prometheus-prometheus-oper-kube-scheduler):
    error: Plan apply failed: 2 errors occurred:

    * Resource operation was cancelled for 'prometheus-prometheus-oper-kube-scheduler'
    * Service does not target any Pods. Application Pods may failed to become alive, or field '.spec.selector' may not match labels on any Pods

  kubernetes:core:Service (prometheus-prometheus-node-exporter):
    error: Plan apply failed: 2 errors occurred:

    * Resource operation was cancelled for 'prometheus-prometheus-node-exporter'
    * Service does not target any Pods. Application Pods may failed to become alive, or field '.spec.selector' may not match labels on any Pods

  kubernetes:monitoring.coreos.com:Alertmanager (prometheus-prometheus-oper-alertmanager):
    error: Plan apply failed: unable to fetch resource description for monitoring.coreos.com/v1: the server could not find the requested resource

  kubernetes:core:Service (prometheus-prometheus-oper-alertmanager):
    error: Plan apply failed: 2 errors occurred:

    * Resource operation was cancelled for 'prometheus-prometheus-oper-alertmanager'
    * Service does not target any Pods. Application Pods may failed to become alive, or field '.spec.selector' may not match labels on any Pods

  kubernetes:core:Service (kube-system/prometheus-prometheus-oper-kube-controller-manager):
    error: Plan apply failed: 2 errors occurred:

    * Resource operation was cancelled for 'prometheus-prometheus-oper-kube-controller-manager'
    * Service does not target any Pods. Application Pods may failed to become alive, or field '.spec.selector' may not match labels on any Pods

  kubernetes:core:Service (kube-system/prometheus-prometheus-oper-coredns):
    error: Plan apply failed: 2 errors occurred:

    * Resource operation was cancelled for 'prometheus-prometheus-oper-coredns'
    * Service does not target any Pods. Application Pods may failed to become alive, or field '.spec.selector' may not match labels on any Pods

  kubernetes:apps:Deployment (prometheus-prometheus-oper-operator):
    error: Plan apply failed: 3 errors occurred:

    * Resource operation was cancelled for 'prometheus-prometheus-oper-operator'
    * Minimum number of live Pods was not attained
    * 1 Pods failed to run because: [CrashLoopBackOff] Back-off 40s restarting failed container=prometheus-operator pod=prometheus-prometheus-oper-operator-6878755977-zbwqw_default(cf3964d8-ebda-11e8-90b8-42010a8a012d)

It seems that the chart attempts to create multiple Services but no matching Deployments / DaemonSets.
In some cases I also got an error that "a resource does not specify a metadata.name", which I believe is probably related to the missing Deployments.

@lukehoban added this to the 0.19 milestone Nov 19, 2018
@hausdorff (Contributor) commented Nov 24, 2018

Summary: Two of the problems here are very likely Pulumi issues; one is already fixed, and the other will be fixed soon. If possible, it would be great if you could run against PR #294 and see whether that resolves them.

The rest of the issues are either unclear (i.e., I don't have the logs) or likely working as expected. I've provided some code that should fix those, too.

More detailed discussion below.


Pulumi issues

  • ConfigMapList, apparently, is allowed to have no name at all, since the API server knows to flatten it out and instantiate only the ConfigMaps inside (see the sketch after this list). I've started Don't require names for built-in Kubernetes list types #294 to try to fix this, but I'm not yet confident it's the right approach, because those semantics do not appear to be captured in the OpenAPI spec.
  • unable to fetch resource description for monitoring.coreos.com/v1: the server could not find the requested resource. This should have been fixed by Fixes in how the provider handles CRDs and CRs #271. What version of @pulumi/kubernetes is in your package.json?
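
For context, here is a minimal sketch (hypothetical manifest shapes and names, not actual chart output) of the flattening semantics: the ConfigMapList wrapper itself is never persisted as an API object, so it carries no metadata.name, while each ConfigMap under items does:

const configMapList = {
    apiVersion: "v1",
    kind: "ConfigMapList",
    // No metadata.name on the wrapper -- the API server flattens the list
    // and creates only the ConfigMaps listed under `items`.
    items: [
        {
            apiVersion: "v1",
            kind: "ConfigMap",
            metadata: { name: "example-dashboard-a" }, // each item does need a name
            data: { "a.json": "{}" }
        },
        {
            apiVersion: "v1",
            kind: "ConfigMap",
            metadata: { name: "example-dashboard-b" },
            data: { "b.json": "{}" }
        }
    ]
};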

Requires more info

  • kubernetes:apps:Deployment (prometheus-prometheus-oper-operator)

Looks like the Pod is crashing. Can you run kubectl logs on the operator Pod? The operator needs to know about AlertManager, so I'm guessing that's the error you'll find in the logs. If so, that should be resolved by #271 as well.

Likely working as expected

For the errors related to the following:

  • kubernetes:core:Service (kube-system/prometheus-prometheus-oper-kube-etcd)
  • kubernetes:core:Service (kube-system/prometheus-prometheus-oper-kube-scheduler)
  • kubernetes:core:Service (prometheus-prometheus-node-exporter)
  • kubernetes:core:Service (kube-system/prometheus-prometheus-oper-kube-controller-manager)
  • kubernetes:core:Service (kube-system/prometheus-prometheus-oper-coredns)

These Services target Pods which some cloud providers do not actually expose -- GKE, I believe, is among them. Our normal strategy is to upstream fixes to Helm Charts that have bugs, but in this case I believe the Chart's behavior is intended and its defaults are reasonable.

If you are on one of those cloud providers, you should be able to resolve these by changing your Chart definition to something like this (tested on Kubernetes v1.9.7, which is what I had lying around at the time):

import * as k8s from "@pulumi/kubernetes";

new k8s.helm.v2.Chart(
    appName,
    {
        repo: "stable",
        chart: appName + "-operator",
        namespace: namespaceName,
        version: " 0.1.22 ",
        values: {
            kubeEtcd: { enabled: false },
            kubeScheduler: { enabled: false },
            kubeControllerManager: { enabled: false },
            coreDns: { enabled: false },
            // I needed this because GKE started k8s without `PodSecurityPolicy`, somehow?
            global: { rbac: { pspEnabled: false } }
        }
    },
    { dependsOn: namespace }
);
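
Note that this snippet drops the explicit provider from your original code for brevity; if you are targeting the cluster through a non-default provider such as k8sProvider, you would keep providers: { kubernetes: k8sProvider } in the resource options as well.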

@hausdorff (Contributor)

Ok, after talking to @geekflyer, I think this is likely solved -- I'll close for now. If you run into more issues, please feel free to re-open.
