b0c1abc Aug 20, 2018
@brancz @gytisgreitai @azzaka

Note: Starting with v0.12.0, Prometheus Operator requires use of Kubernetes v1.7.x and up.

FAQ / Troubleshooting

RBAC on Google Container Engine (GKE)

When you try to create a ClusterRole (kube-state-metrics, prometheus, prometheus-operator, etc.) on a GKE Kubernetes cluster running version 1.6, you will probably run into permission errors:

Error from server (Forbidden): error when creating "manifests/prometheus-operator/prometheus-operator-cluster-role.yaml": "prometheus-operator" is forbidden: attempt to grant extra privileges:

This is due to the way Container Engine checks permissions. From Google Container Engine docs:

Because of the way Container Engine checks permissions when you create a Role or ClusterRole, you must first create a RoleBinding that grants you all of the permissions included in the role you want to create. An example workaround is to create a RoleBinding that gives your Google identity a cluster-admin role before attempting to create additional Role or ClusterRole permissions. This is a known issue in the Beta release of Role-Based Access Control in Kubernetes and Container Engine version 1.6.

To overcome this, you must grant your current Google identity the cluster-admin role:

# get current google identity
$ gcloud info | grep Account
Account: []

# grant cluster-admin to your current identity (--user takes the account reported by gcloud)
$ kubectl create clusterrolebinding myname-cluster-admin-binding --clusterrole=cluster-admin --user="$(gcloud config get-value account)"
Clusterrolebinding "myname-cluster-admin-binding" created
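To confirm that the binding took effect before retrying the ClusterRole manifests, a small check along the following lines may help. It is only a sketch and not part of the original instructions: the function name can_create_clusterroles is hypothetical, and it relies on kubectl auth can-i printing yes or no for the current identity.

```shell
# Hypothetical helper: succeeds when the current identity may create ClusterRoles.
can_create_clusterroles() {
  # `kubectl auth can-i` prints "yes" or "no" on stdout; warnings go to stderr.
  answer="$(kubectl auth can-i create clusterroles 2>/dev/null)"
  [ "$answer" = "yes" ]
}
```

Running `can_create_clusterroles && echo "ok to create ClusterRoles"` after creating the binding should print the message. If it stays silent, the binding probably does not match the identity that kubectl is using.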

Troubleshooting ServiceMonitor changes

When creating, deleting, or modifying ServiceMonitor objects, it is sometimes not obvious which piece is not working properly. This section gives a step-by-step guide to troubleshooting such actions on a ServiceMonitor object.

Has my ServiceMonitor been picked up by Prometheus?

ServiceMonitor objects are selected by the serviceMonitorSelector of a Prometheus object. The name of a ServiceMonitor is encoded in the Prometheus configuration, so you can simply grep for it there. The configuration generated by the Prometheus Operator is stored in a Kubernetes Secret that is named after the Prometheus object with the prefix prometheus- and is located in the same namespace as the Prometheus object. For example, for a Prometheus object called k8s, you can find out whether the ServiceMonitor named my-service-monitor has been picked up with:

kubectl -n monitoring get secret prometheus-k8s -ojson | jq -r '.data["prometheus.yaml"]' | base64 -d | grep "my-service-monitor"
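For scripted checks, the decode-and-grep tail of that pipeline can be wrapped in a function whose exit status tells you whether the ServiceMonitor was found. The following is a minimal sketch: config_mentions_monitor is a hypothetical name, and it assumes the base64-encoded prometheus.yaml arrives on stdin.

```shell
# Hypothetical helper: reads the base64-encoded prometheus.yaml on stdin and
# succeeds if the ServiceMonitor name given as $1 appears in the decoded config.
config_mentions_monitor() {
  # -F treats the name as a fixed string rather than a regular expression.
  base64 --decode | grep -F -- "$1" >/dev/null
}
```

It would be used as `kubectl -n monitoring get secret prometheus-k8s -ojson | jq -r '.data["prometheus.yaml"]' | config_mentions_monitor my-service-monitor && echo "picked up"`. A non-zero exit status suggests that the serviceMonitorSelector does not match the ServiceMonitor.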

Prometheus kubelet metrics server returned HTTP status 403 Forbidden

Prometheus is installed and everything looks good, but the Targets are all showing as down. All permissions seem to be in order, yet no joy. Prometheus is pulling metrics from all namespaces except kube-system, even though it has access to all namespaces, including kube-system.

Did you check the webhooks?

The issue was resolved by amending the address that the webhooks use. The commands below update the webhooks so that connections to all clusterIPs in all namespaces are allowed, rather than connections to a single address only.

Update the kubelet service to include webhook and restart:

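# Assumption: default kubeadm drop-in location; adjust if your kubelet unit file lives elsewhere
KUBEADM_SYSTEMD_CONF=/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
sed -e "/cadvisor-port=0/d" -i "$KUBEADM_SYSTEMD_CONF"
# Add the token webhook flag only if it is not already present
if ! grep -q "authentication-token-webhook=true" "$KUBEADM_SYSTEMD_CONF"; then
  sed -e "s/--authorization-mode=Webhook/--authentication-token-webhook=true --authorization-mode=Webhook/" -i "$KUBEADM_SYSTEMD_CONF"
fi
systemctl daemon-reload
systemctl restart kubelet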
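Because these edits change a live systemd drop-in, it can be worth previewing the result first. The sketch below applies the same two substitutions but writes the edited file to stdout instead of modifying it in place. The function name preview_kubelet_conf is hypothetical and not part of the original procedure.

```shell
# Hypothetical helper: prints the kubelet drop-in ($1) as it would look after
# the edits, without touching the file itself.
preview_kubelet_conf() {
  if grep -q "authentication-token-webhook=true" "$1"; then
    # Flag already present: only drop the cadvisor-port=0 line.
    sed -e "/cadvisor-port=0/d" "$1"
  else
    # Drop the cadvisor-port=0 line and add the token webhook flag.
    sed -e "/cadvisor-port=0/d" \
        -e "s/--authorization-mode=Webhook/--authentication-token-webhook=true --authorization-mode=Webhook/" "$1"
  fi
}
```

For example, `preview_kubelet_conf "$KUBEADM_SYSTEMD_CONF" | diff "$KUBEADM_SYSTEMD_CONF" -` shows exactly which lines would change. Running the preview on its own output produces no further changes, so the edit is safe to repeat.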

Modify the kube-controller-manager and kube-scheduler manifests to allow for reading data:

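# Substitute the current and the desired bind address for <old-address> and <new-address>
sed -e "s/- --address=<old-address>/- --address=<new-address>/" -i /etc/kubernetes/manifests/kube-controller-manager.yaml
sed -e "s/- --address=<old-address>/- --address=<new-address>/" -i /etc/kubernetes/manifests/kube-scheduler.yaml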
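To see which address each component is currently configured with, before and after the edit, a helper like the following can print the --address flag from each manifest. It is only a sketch, and the function name show_address_flag is hypothetical.

```shell
# Hypothetical helper: prints the --address flag found in each manifest given.
show_address_flag() {
  for manifest in "$@"; do
    # -o prints only the matching part, i.e. the flag together with its value.
    flag="$(grep -o -e '--address=[^[:space:]]*' "$manifest")"
    printf '%s: %s\n' "$manifest" "${flag:-no --address flag found}"
  done
}
```

It can be called as `show_address_flag /etc/kubernetes/manifests/kube-controller-manager.yaml /etc/kubernetes/manifests/kube-scheduler.yaml`.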