Facing error err="parsing YAML file /etc/prometheus/config_out/prometheus.env.yaml: empty duration string" #5197
Comments
Have you upgraded the prometheus-operator CRDs to the same v0.61.1 version? |
We removed some of the default values and validations from the operator code in v0.60.1 since they are already covered in OpenAPI. So, as Simon mentioned, you need to update the CRDs to 0.61.0 to make it work. |
Seeing the same issue after updating the CRDs and prometheus-operator. EDIT: Fixed after updating the CRDs and prometheus-operator. I had to restart the operator after applying the CRDs, then restart Prometheus to get it working. |
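The order of operations in the comment above (apply CRDs, restart the operator, restart Prometheus) could be sketched roughly as follows. This is only a sketch: `print_recovery_steps` is a hypothetical helper, and the CRD manifest location, namespace, Deployment name, and StatefulSet name are all placeholders you would need to supply for your own install. It only prints the commands so they can be reviewed before running anything.

```shell
#!/usr/bin/env bash
# print_recovery_steps: hypothetical helper that prints (does not run) the
# three steps described in the comment above. All arguments are placeholders:
#   $1 = path or URL of CRD manifests matching the operator version
#   $2 = namespace of the monitoring stack
#   $3 = name of the operator Deployment
#   $4 = name of the Prometheus StatefulSet
print_recovery_steps() {
  local crds="$1" ns="$2" operator="$3" prom="$4"
  # 1. Apply the CRDs that match the operator version.
  echo "kubectl apply --server-side -f ${crds}"
  # 2. Restart the operator so it picks up the new CRD defaults.
  echo "kubectl -n ${ns} rollout restart deployment/${operator}"
  # 3. Restart Prometheus so the config is regenerated and reloaded.
  echo "kubectl -n ${ns} rollout restart statefulset/${prom}"
}
```

Review the printed commands and run them by hand; whether `--server-side` is appropriate depends on how the CRDs were originally installed.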
I was still having this problem with the kube-prometheus-stack, so I wanted to share how I debugged and fixed it. I had gotten the stack running in one cluster, but not another, so I compared the two. I output each secret's data content with $ echo '<SECRET_CONTENT>' | base64 -d | gunzip > /tmp/config-<CLUSTER_NAME>.yaml and then performed a diff.
Sure enough, the evaluation interval and scrape intervals were not being set on lines 2 and 3! To fix, I set them explicitly, redeployed, and bounced the prometheus pod.
prometheus:
  prometheusSpec:
    scrapeInterval: 30s
    evaluationInterval: 30s |
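The decode-and-diff approach from this comment can be wrapped in two small shell helpers. This is a minimal sketch, not the commenter's exact procedure: `decode_config` and `diff_configs` are hypothetical names, and the secret name, key, namespace, and context names in the usage comments are assumptions to replace with your own.

```shell
#!/usr/bin/env bash
# decode_config: hypothetical helper that reads the base64-encoded, gzipped
# secret value from stdin and writes the plain YAML to stdout.
decode_config() {
  base64 -d | gunzip
}

# diff_configs: compare two already-decoded config files. Prints a unified
# diff and returns non-zero when the files differ.
diff_configs() {
  diff -u "$1" "$2"
}

# Usage (secret name, key, namespace and contexts are assumptions):
#   kubectl --context <CLUSTER_A> -n <NAMESPACE> get secret <SECRET_NAME> \
#     -o jsonpath='{.data.prometheus\.yaml\.gz}' | decode_config > /tmp/config-a.yaml
#   kubectl --context <CLUSTER_B> -n <NAMESPACE> get secret <SECRET_NAME> \
#     -o jsonpath='{.data.prometheus\.yaml\.gz}' | decode_config > /tmp/config-b.yaml
#   diff_configs /tmp/config-a.yaml /tmp/config-b.yaml
```

Differences in the `global` block near the top of the file are the ones to look at first.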
@billiford it's very likely that you have a difference between the version of the CRDs and the operator version (e.g. operator version > CRD version). |
I deployed all the CRDs to both clusters that I found here. It would be nice to know which CRD specifically is the root of this problem and why it is causing these intervals to not be set. |
It is the Prometheus CRD. You need to check that |
I think the bug is that scrapeInterval and evaluationInterval are defined in the values.yaml as empty strings (line 2624: scrapeInterval: ""). These either need to be commented out so the defaults get inserted, or set to the default value "30s". I edited my kube-prometheus-stack\values.yaml so that line 2624 reads scrapeInterval: "30s" and then installed via helm: helm install promstack ./kube-prometheus-stack --namespace monitoring --create-namespace -f kube-prometheus-stack/values.yaml |
As others stated, the CRDs were incompatible and I was also getting this error message. Following the CRD upgrade solved the problem for me. I was installing chart version 45.X.X, so the following CRDs were applicable:
After installing them, the problem went away. See more on the documentation: https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#upgrading-an-existing-release-to-a-new-major-version |
The issue was resolved after CRD update. Thanks everyone for the help :-) |
https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/values.yaml#L2815 it's still in the default values yaml |
I had the same issue and the problem was the
|
What happened?
We upgraded to Prometheus Operator version 0.61.1, and after the upgrade we found that the Prometheus pods are failing with the error below:
We load the configurations from a secret, and config-reloader parses them and puts them into the /etc/prometheus/config_out/prometheus.env.yaml file.
We found in the output file that global.scrape_interval is not being parsed and is written as an empty value in prometheus.env.yaml, which makes Prometheus keep crashing. Below is the snippet of the prometheus.env.yaml file
If we downgrade operator to previous version i.e. 0.60.1, then everything works fine.
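To confirm this symptom on a decoded or copied-out config file, a check along these lines may help. It is a sketch under assumptions: `find_empty_intervals` is a hypothetical helper, and it only looks for `scrape_interval` / `evaluation_interval` keys whose value is missing or an empty quoted string, which is the condition that the "empty duration string" error points to.

```shell
#!/usr/bin/env bash
# find_empty_intervals: hypothetical helper that prints, with line numbers,
# every scrape_interval / evaluation_interval key in the given file whose
# value is missing or an empty quoted string. Like grep, it returns non-zero
# when nothing matches, i.e. when no empty interval was found.
find_empty_intervals() {
  grep -nE "^[[:space:]]*(scrape_interval|evaluation_interval):[[:space:]]*(\"\"|'')?[[:space:]]*$" "$1"
}

# Usage (the path is a placeholder for a local copy of the generated config):
#   find_empty_intervals /tmp/prometheus.env.yaml
```

Any line it prints is a duration Prometheus would refuse to parse.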
How to reproduce it (as minimally and precisely as possible):
Environment
Any. We checked on OCP, AWS and Azure
Prometheus Operator version: v0.61.1
Kubernetes version information: 1.23.5
Prometheus Operator Logs: