
Prometheus Error loading config couldn't load configuration (--config.file=/etc/config/prometheus.yml): parsing YAML #4129

Closed
iamdeadman opened this Issue May 1, 2018 · 2 comments


iamdeadman commented May 1, 2018

Hi,

I am trying to deploy a Prometheus monitoring stack in a Kubernetes sandbox on a Google Compute Engine VM.

I have Kubernetes, Helm, and Tiller up and running, and I followed this blog post to install the Prometheus server - tutorial

As it suggested, I changed my CPU and memory limits and requests, and tried setting extraArgs to both of these:

  extraArgs:
    storage.tsdb.retention: 744h

and

  extraArgs:
    storage.local.retention: 744h
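For what it's worth, Prometheus 2.x only understands `storage.tsdb.retention`; the `storage.local.*` flags belong to the 1.x storage engine and were removed in 2.0. In the stable/prometheus chart the server's flags are, as far as I understand its structure, nested under the `server` key, so a sketch of the intended values.yaml fragment (chart layout assumed, not verified against every chart version) would be:

```yaml
# Sketch: values.yaml fragment for the stable/prometheus chart.
# The server.extraArgs map is rendered as --<key>=<value> flags.
server:
  extraArgs:
    storage.tsdb.retention: 744h
```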

I then thought the issue might be caused by the incorrect CPU and memory limits I had set.

I initially tried these values yesterday:

resources:
  limits:
    cpu: 4Gi
    memory: 4Gi
  requests:
    cpu: 4Gi
    memory: 4Gi

and today I went ahead and tried these:

resources:
  limits:
    cpu: 2
    memory: 4Gi
  requests:
    cpu: 2
    memory: 4Gi

Now the Prometheus server pod is crashing on startup.

Here is a gist of the values.yaml I am using to install the chart:

gist values.yaml

And here is the command I am using to deploy the chart:

sudo helm install -f values_backup.yaml stable/prometheus

Here are the logs from the container, which suggest a configuration mistake:

urtutors_dev@bitnami-kubernetessandbox-dm-8e00:~/prometheus$ kubectl logs -f kneeling-marsupial-prometheus-server-5f94f9595f-4cxb9 prometheus-server
level=info ts=2018-05-01T13:50:26.466206807Z caller=main.go:225 msg="Starting Prometheus" version="(version=2.1.0, branch=HEAD, revision=85f23d82a045d103ea7f3c89a91fba4a93e6367a)"
level=info ts=2018-05-01T13:50:26.466276087Z caller=main.go:226 build_context="(go=go1.9.2, user=root@6e784304d3ff, date=20180119-12:01:23)"
level=info ts=2018-05-01T13:50:26.466299226Z caller=main.go:227 host_details="(Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 kneeling-marsupial-prometheus-server-5f94f9595f-4cxb9 (none))"
level=info ts=2018-05-01T13:50:26.466316603Z caller=main.go:228 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2018-05-01T13:50:26.470127889Z caller=web.go:383 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2018-05-01T13:50:26.470120709Z caller=main.go:499 msg="Starting TSDB ..."
level=info ts=2018-05-01T13:50:26.480127293Z caller=main.go:509 msg="TSDB started"
level=info ts=2018-05-01T13:50:26.480199354Z caller=main.go:585 msg="Loading configuration file" filename=/etc/config/prometheus.yml
level=info ts=2018-05-01T13:50:26.480803755Z caller=main.go:386 msg="Stopping scrape discovery manager..."
level=info ts=2018-05-01T13:50:26.4808297Z caller=main.go:400 msg="Stopping notify discovery manager..."
level=info ts=2018-05-01T13:50:26.480840617Z caller=main.go:424 msg="Stopping scrape manager..."
level=info ts=2018-05-01T13:50:26.480921603Z caller=main.go:382 msg="Scrape discovery manager stopped"
level=info ts=2018-05-01T13:50:26.480969882Z caller=manager.go:460 component="rule manager" msg="Stopping rule manager..."
level=info ts=2018-05-01T13:50:26.48099561Z caller=manager.go:466 component="rule manager" msg="Rule manager stopped"
level=info ts=2018-05-01T13:50:26.481006337Z caller=notifier.go:493 component=notifier msg="Stopping notification manager..."
level=info ts=2018-05-01T13:50:26.481007667Z caller=manager.go:59 component="scrape manager" msg="Starting scrape manager..."
level=info ts=2018-05-01T13:50:26.481019699Z caller=main.go:570 msg="Notifier manager stopped"
level=info ts=2018-05-01T13:50:26.481027179Z caller=main.go:418 msg="Scrape manager stopped"
level=info ts=2018-05-01T13:50:26.481032803Z caller=main.go:396 msg="Notify discovery manager stopped"
level=error ts=2018-05-01T13:50:26.481112947Z caller=main.go:579 err="Error loading config couldn't load configuration (--config.file=/etc/config/prometheus.yml): parsing YAML file /etc/config/prometheus.yml: yaml: line 160: mapping values are not allowed in this context"
level=info ts=2018-05-01T13:50:26.481159397Z caller=main.go:581 msg="See you next time!"
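The key line is the parse error at line 160 of the rendered prometheus.yml. In my understanding, "mapping values are not allowed in this context" usually means a `: ` appears where the parser does not expect a new key, for example two keys collapsed onto one line, or an unquoted scalar that itself contains `: `. A hypothetical illustration (not taken from my actual config):

```yaml
# Broken: two keys collapsed onto one line -- the second ": " triggers
# "mapping values are not allowed in this context":
#   - source_labels: [__name__] regex: up
# Fixed: each key on its own line:
- source_labels: [__name__]
  regex: up
```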

And here is the output of kubectl describe pod <prometheus-server>:

Name:           kneeling-marsupial-prometheus-server-5f94f9595f-4cxb9
Namespace:      default
Node:           bitnami-kubernetessandbox-dm-8e00/10.128.0.2
Start Time:     Tue, 01 May 2018 13:49:58 +0000
Labels:         app=prometheus
                component=server
                pod-template-hash=1950951519
                release=kneeling-marsupial
Annotations:    <none>
Status:         Running
IP:             10.32.0.73
Controlled By:  ReplicaSet/kneeling-marsupial-prometheus-server-5f94f9595f
Init Containers:
  init-chown-data:
    Container ID:  docker://fa1426a69f5506815abad677cd929947336f95cc082582c9849e78a1b4fa8625
    Image:         busybox:latest
    Image ID:      docker-pullable://busybox@sha256:58ac43b2cc92c687a32c8be6278e50a063579655fe3090125dcb2af0ff9e1a64
    Port:          <none>
    Host Port:     <none>
    Command:
      chown
      -R
      65534:65534
      /data
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 01 May 2018 13:50:01 +0000
      Finished:     Tue, 01 May 2018 13:50:01 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /data from storage-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kneeling-marsupial-prometheus-server-token-jkb48 (ro)
Containers:
  prometheus-server-configmap-reload:
    Container ID:  docker://f59c7ea8563093d4db5509ceda4c7b591098da42b9159832a6bd328979ee97fe
    Image:         jimmidyson/configmap-reload:v0.1
    Image ID:      docker-pullable://jimmidyson/configmap-reload@sha256:2d40c2eaa6f435b2511d0cfc5f6c0a681eeb2eaa455a5d5ac25f88ce5139986e
    Port:          <none>
    Host Port:     <none>
    Args:
      --volume-dir=/etc/config
      --webhook-url=http://localhost:9090/-/reload
    State:          Running
      Started:      Tue, 01 May 2018 13:50:03 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/config from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kneeling-marsupial-prometheus-server-token-jkb48 (ro)
  prometheus-server:
    Container ID:  docker://7d553aef080dc20a3e3a1a8aa6cbcfce839e5acc1991326f4beb9bd1adf948ae
    Image:         prom/prometheus:v2.1.0
    Image ID:      docker-pullable://prom/prometheus@sha256:7b987901dbc44d17a88e7bda42dbbbb743c161e3152662959acd9f35aeefb9a3
    Port:          9090/TCP
    Host Port:     0/TCP
    Args:
      --config.file=/etc/config/prometheus.yml
      --storage.tsdb.path=/data
      --web.console.libraries=/etc/prometheus/console_libraries
      --web.console.templates=/etc/prometheus/consoles
      --web.enable-lifecycle
      --storage.tsdb.retention=744h
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 01 May 2018 14:00:52 +0000
      Finished:     Tue, 01 May 2018 14:00:52 +0000
    Ready:          False
    Restart Count:  7
    Limits:
      cpu:     2
      memory:  4Gi
    Requests:
      cpu:        2
      memory:     4Gi
    Liveness:     http-get http://:9090/-/healthy delay=30s timeout=30s period=10s #success=1 #failure=3
    Readiness:    http-get http://:9090/-/ready delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /data from storage-volume (rw)
      /etc/config from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kneeling-marsupial-prometheus-server-token-jkb48 (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kneeling-marsupial-prometheus-server
    Optional:  false
  storage-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  kneeling-marsupial-prometheus-server
    ReadOnly:   false
  kneeling-marsupial-prometheus-server-token-jkb48:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kneeling-marsupial-prometheus-server-token-jkb48
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                 Age                From                                        Message
  ----     ------                 ----               ----                                        -------
  Normal   Scheduled              11m                default-scheduler                           Successfully assigned kneeling-marsupial-prometheus-server-5f94f9595f-4cxb9 to bitnami-kubernetessandbox-dm-8e00
  Normal   SuccessfulMountVolume  11m                kubelet, bitnami-kubernetessandbox-dm-8e00  MountVolume.SetUp succeeded for volume "config-volume"
  Normal   SuccessfulMountVolume  11m                kubelet, bitnami-kubernetessandbox-dm-8e00  MountVolume.SetUp succeeded for volume "kneeling-marsupial-prometheus-server-token-jkb48"
  Normal   SuccessfulMountVolume  11m                kubelet, bitnami-kubernetessandbox-dm-8e00  MountVolume.SetUp succeeded for volume "local-pv-3d955afd"
  Normal   Started                11m                kubelet, bitnami-kubernetessandbox-dm-8e00  Started container
  Normal   Pulled                 11m                kubelet, bitnami-kubernetessandbox-dm-8e00  Container image "busybox:latest" already present on machine
  Normal   Created                11m                kubelet, bitnami-kubernetessandbox-dm-8e00  Created container
  Normal   Pulling                11m                kubelet, bitnami-kubernetessandbox-dm-8e00  pulling image "prom/prometheus:v2.1.0"
  Normal   Pulled                 11m                kubelet, bitnami-kubernetessandbox-dm-8e00  Container image "jimmidyson/configmap-reload:v0.1" already present on machine
  Normal   Started                11m                kubelet, bitnami-kubernetessandbox-dm-8e00  Started container
  Normal   Created                11m                kubelet, bitnami-kubernetessandbox-dm-8e00  Created container
  Normal   Pulled                 11m                kubelet, bitnami-kubernetessandbox-dm-8e00  Successfully pulled image "prom/prometheus:v2.1.0"
  Normal   Created                11m (x3 over 11m)  kubelet, bitnami-kubernetessandbox-dm-8e00  Created container
  Normal   Started                11m (x3 over 11m)  kubelet, bitnami-kubernetessandbox-dm-8e00  Started container
  Normal   Pulled                 10m (x3 over 11m)  kubelet, bitnami-kubernetessandbox-dm-8e00  Container image "prom/prometheus:v2.1.0" already present on machine
  Warning  BackOff                1m (x49 over 11m)  kubelet, bitnami-kubernetessandbox-dm-8e00  Back-off restarting failed container

I don't think I am modifying any other unnecessary values from the original values file:

values.yaml original

Still, if there is a configuration error in my gist, can someone point out how to resolve it, or how to debug this issue further?
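Since the error references line 160 of the rendered /etc/config/prometheus.yml (not of values.yaml), one way to narrow this down is to dump the rendered ConfigMap and inspect that line directly. A sketch, substituting your own release name for the example one, and assuming kubectl and promtool are available locally:

```shell
# Dump the rendered config from the ConfigMap (release name is an example)
kubectl get configmap kneeling-marsupial-prometheus-server \
  -o jsonpath='{.data.prometheus\.yml}' > rendered-prometheus.yml

# Show the lines around the one the parser complains about
sed -n '155,165p' rendered-prometheus.yml

# Validate the whole file with promtool (ships with Prometheus)
promtool check config rendered-prometheus.yml
```

Note the backslash in `prometheus\.yml`: kubectl's JSONPath needs the dot in the key escaped so it is not treated as a path separator.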

brian-brazil commented May 1, 2018

It makes more sense to ask questions like this on the prometheus-users mailing list rather than in a GitHub issue. On the mailing list, more people are available to potentially respond to your question, and the whole community can benefit from the answers provided.

lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked the conversation and limited it to collaborators on Mar 22, 2019
