[kube-prometheus-stack] Helm install on a new kubernetes cluster fails #3187
I see that the admission-patch job is failing because it expects a secret to be available.
@jjayabal23 the secret is created by this job: https://github.com/prometheus-community/helm-charts/blob/kube-prometheus-stack-38.0.3/charts/kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/job-createSecret.yaml. Does this fail as well?
@GMartinez-Sisti I have not checked, but this will obviously fail as well, since it depends on the missing secret (Line 44 in 1658a4e).
From what I understand, job-createSecret.yaml will create the secret. My best guess is that the job that creates the secret failed. Can you check that job's result?
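One way to check the result of that hook job is sketched below. The job name and namespace are assumptions based on the chart's default naming for a release called "kube-prometheus-stack"; adjust them to your own release and namespace:

```shell
# Assumed names: release "kube-prometheus-stack" in namespace "monitoring".
# List the hook jobs the chart created during install/upgrade.
kubectl get jobs -n monitoring

# Inspect the create-secret job and its pod logs
# (the job name "<release>-admission-create" is an assumption).
kubectl describe job kube-prometheus-stack-admission-create -n monitoring
kubectl logs -n monitoring -l job-name=kube-prometheus-stack-admission-create
```

Note that hook jobs may be deleted after they complete, depending on the chart's hook deletion policy, so the job may only be visible while the install is in progress or after it fails.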
@GMartinez-Sisti @jjayabal23 But the secret is created correctly: any help is welcome. Kubernetes version: 1.24.6
I can't seem to reproduce this issue :/ Using Kind on K8s 1.24.6 → kind create cluster --name issues-3187 --image "kindest/node:v1.24.6"
Creating cluster "issues-3187" ...
✓ Ensuring node image (kindest/node:v1.24.6) 🖼
✓ Preparing nodes 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
Set kubectl context to "kind-issues-3187"
You can now use your cluster with:
kubectl cluster-info --context kind-issues-3187
Have a nice day! 👋
→ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.1", GitCommit:"4c9411232e10168d7b050c49a1b59f6df9d7ea4b", GitTreeState:"clean", BuildDate:"2023-04-14T13:14:41Z", GoVersion:"go1.20.3", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.6", GitCommit:"b39bf148cd654599a52e867485c02c4f9d28b312", GitTreeState:"clean", BuildDate:"2022-09-22T05:53:51Z", GoVersion:"go1.18.6", Compiler:"gc", Platform:"linux/arm64"}
WARNING: version difference between client (1.27) and server (1.24) exceeds the supported minor version skew of +/-1
→ kubectl create ns kube-prometheus-stack
namespace/kube-prometheus-stack created
→ helm upgrade -i kube-prometheus-stack . -n kube-prometheus-stack --set prometheusOperator.admissionWebhooks.enabled=true
Release "kube-prometheus-stack" does not exist. Installing it now.
NAME: kube-prometheus-stack
LAST DEPLOYED: Thu May 25 14:27:11 2023
NAMESPACE: kube-prometheus-stack
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace kube-prometheus-stack get pods -l "release=kube-prometheus-stack"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
→ kubectl --namespace kube-prometheus-stack get pods
NAME READY STATUS RESTARTS AGE
alertmanager-kube-prometheus-stack-alertmanager-0 2/2 Running 0 62s
kube-prometheus-stack-grafana-785c74cd7-cqrp7 3/3 Running 0 74s
kube-prometheus-stack-kube-state-metrics-7d9c6c9956-fkfq8 1/1 Running 0 74s
kube-prometheus-stack-operator-7bfb77d695-xf2k2 1/1 Running 0 74s
kube-prometheus-stack-prometheus-node-exporter-czxdj 1/1 Running 0 74s
prometheus-kube-prometheus-stack-prometheus-0 2/2 Running 0 62s
I'm using the latest helm chart from main (currently …). @jjayabal23 Can you try with this version? @Kilz78 What version are you using?
ping @jjayabal23 @Kilz78
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
Describe the bug
Installing kube-prometheus-stack helm chart on a fresh kubernetes cluster 1.24.x (AKS) fails with the following error,
helm.go:84: [debug] pre-upgrade hooks failed: timed out waiting for the condition
In this case, I have set
prometheusOperator.admissionWebhooks.enabled=true
and the helm chart tries to create a serviceAccount and fails. The following is the service account the pre-install hook is trying to create: https://github.com/prometheus-community/helm-charts/blob/kube-prometheus-stack-38.0.3/charts/kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/serviceaccount.yaml
As per the Kubernetes documentation, https://kubernetes.io/docs/concepts/configuration/secret/#service-account-token-secrets, the token secret for a service account is not created automatically. Hence the helm chart waits for the service account token Secret object to be created and times out.
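For reference, a minimal sketch of manually creating a service-account token Secret, following the Kubernetes docs linked above. The namespace, Secret name, and service-account name below are assumptions for illustration, not names taken from the chart:

```shell
# Manually create a token Secret bound to an existing service account.
# The kubernetes.io/service-account.name annotation must match the
# service account's name (all names here are assumed examples).
kubectl apply -n monitoring -f - <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: kube-prometheus-stack-admission-token
  annotations:
    kubernetes.io/service-account.name: kube-prometheus-stack-admission
type: kubernetes.io/service-account-token
EOF
```

On clusters at 1.24+, the control plane populates this Secret with a token for the named service account; nothing creates such a Secret automatically when the service account itself is created.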
This is never an issue with the existing cluster where a secret object is already in place.
What's your helm version?
v3.11.0
What's your kubectl version?
v1.24.6
Which chart?
kube-prometheus-stack
What's the chart version?
38.0.3
What happened?
Helm chart installation timed out waiting for the Secret object to be created for the service account.
What you expected to happen?
In any Kubernetes cluster (fresh or existing), the appropriate service account token Secret object should be created if it does not exist, and associated with the service account.
How to reproduce it?
Install kube-prometheus-stack helm chart with default values in a fresh kubernetes cluster.
Enter the changed values of values.yaml?
Used only default values.yaml
Enter the command that you execute and failing/misfunctioning.
helm upgrade --install kube-prometheus-stack prometheus-community/kube-prometheus-stack --version 38.0.3 -n monitoring --debug
Anything else we need to know?
The secret should only be created if it does not already exist in the cluster.
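A quick way to check whether the admission secret already exists before installing is sketched below. The secret name and namespace are assumptions based on the default naming for a release called "kube-prometheus-stack"; adjust to your setup:

```shell
# Assumed: release "kube-prometheus-stack" in namespace "monitoring";
# the webhook secret is assumed to be named "<release>-admission".
if kubectl get secret kube-prometheus-stack-admission -n monitoring >/dev/null 2>&1; then
  echo "admission secret already exists"
else
  echo "admission secret missing; the create-secret hook job should create it"
fi
```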