Prometheus doesn't respect cgroup memory limit #5384

Closed
Zero-2 opened this Issue Mar 19, 2019 · 2 comments

Zero-2 commented Mar 19, 2019

Bug Report

What did you do?

What did you expect to see?
Prometheus tunes its memory usage to the limits defined in the deployment.

What did you see instead? Under which circumstances?
Prometheus uses more memory than the limit allows and is killed by the cgroup OOM killer.

Environment
Kubernetes cluster on Azure.

  • System information:

Linux 4.15.0-1036-azure x86_64

  • Prometheus version:

prometheus, version 2.7.0 (branch: HEAD, revision: 410ee9e)
build user: root@3bc81b516055
build date: 20190128-10:09:51
go version: go1.11.5

  • Prometheus configuration file:
global:
  evaluation_interval: 30s
  scrape_interval: 30s
  external_labels:
    prometheus: monitoring/prometheus
    prometheus_replica: prometheus-prometheus-0
rule_files:
- /etc/prometheus/rules/prometheus-prometheus-rulefiles-0/*.yaml
scrape_configs:
- job_name: monitoring/burrow-prometheus/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kafka
  scrape_interval: 30s
  scrape_timeout: 30s
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    regex: kafka-burrow
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    regex: kafka
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: burrow-exporter
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: burrow-exporter
- job_name: monitoring/kafka-prometheus/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kafka
  scrape_interval: 30s
  scrape_timeout: 30s
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    regex: kafka
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_metrics
    regex: kafka-prometheus
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_release
    regex: kafka
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: kafka-exporter
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: kafka-exporter
- job_name: monitoring/prova-prometheus/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - prova
  scrape_interval: 30s
  scrape_timeout: 30s
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_metrics
    regex: prometheus
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - target_label: endpoint
    replacement: http
- job_name: monitoring/prometheus/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  scrape_interval: 30s
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    regex: prometheus
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_chart
    regex: prometheus-0.0.52
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_prometheus
    regex: prometheus
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_app
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http
- job_name: monitoring/prometheus-alertmanager/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  scrape_interval: 30s
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_alertmanager
    regex: prometheus
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    regex: alertmanager
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_chart
    regex: alertmanager-0.1.8
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_app
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http
- job_name: monitoring/prometheus-exporter-kube-dns/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  scrape_interval: 15s
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    regex: exporter-kube-dns
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_component
    regex: kube-dns
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-metrics-dnsmasq
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_component
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http-metrics-dnsmasq
- job_name: monitoring/prometheus-exporter-kube-dns/1
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  scrape_interval: 15s
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    regex: exporter-kube-dns
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_component
    regex: kube-dns
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http-metrics-skydns
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_component
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http-metrics-skydns
- job_name: monitoring/prometheus-exporter-kube-state/0
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  scrape_interval: 15s
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    regex: exporter-kube-state
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_component
    regex: kube-state
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: kube-state-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_component
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: kube-state-metrics
- job_name: monitoring/prometheus-exporter-kubelets/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  scrape_interval: 15s
  scheme: https
  tls_config:
    insecure_skip_verify: true
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_k8s_app
    regex: kubelet
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_component
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: https-metrics
- job_name: monitoring/prometheus-exporter-kubelets/1
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - kube-system
  scrape_interval: 30s
  metrics_path: /metrics/cadvisor
  scheme: https
  tls_config:
    insecure_skip_verify: true
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_k8s_app
    regex: kubelet
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: https-metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_component
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: https-metrics
- job_name: monitoring/prometheus-exporter-node/0
  honor_labels: false
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  scrape_interval: 15s
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_app
    regex: exporter-node
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_component
    regex: node-exporter
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: metrics
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_component
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: metrics
- job_name: monitoring/prometheus-operator/0
  honor_labels: true
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  scrape_interval: 30s
  relabel_configs:
  - action: keep
    source_labels:
    - __meta_kubernetes_service_label_operated_prometheus
    regex: "true"
  - action: keep
    source_labels:
    - __meta_kubernetes_endpoint_port_name
    regex: http
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Node;(.*)
    replacement: ${1}
    target_label: node
  - source_labels:
    - __meta_kubernetes_endpoint_address_target_kind
    - __meta_kubernetes_endpoint_address_target_name
    separator: ;
    regex: Pod;(.*)
    replacement: ${1}
    target_label: pod
  - source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: service
  - source_labels:
    - __meta_kubernetes_pod_name
    target_label: pod
  - source_labels:
    - __meta_kubernetes_service_name
    target_label: job
    replacement: ${1}
  - source_labels:
    - __meta_kubernetes_service_label_prometheus_operator
    target_label: job
    regex: (.+)
    replacement: ${1}
  - target_label: endpoint
    replacement: http
alerting:
  alert_relabel_configs:
  - action: labeldrop
    regex: prometheus_replica
  alertmanagers:
  - path_prefix: /
    scheme: http
    kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
        - monitoring
    relabel_configs:
    - action: keep
      source_labels:
      - __meta_kubernetes_service_name
      regex: prometheus-alertmanager
    - action: keep
      source_labels:
      - __meta_kubernetes_endpoint_port_name
      regex: http
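
For readers skimming the configuration above: almost every job repeats the same pair of `Node;(.*)` / `Pod;(.*)` rules. A minimal Python sketch of how such a rule behaves, assuming the documented default `replace` action (source label values joined with the separator, regex matched against the whole joined string, label left untouched when there is no match). The helper name `relabel_replace` and the pod name `example-pod-0` are hypothetical and only serve the illustration; this is not Prometheus code.

```python
import re


def relabel_replace(labels, source_labels, separator, regex, target_label):
    """Sketch of a `replace` rule whose replacement is `${1}`."""
    # Join the source label values; a missing label counts as an empty string.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    # The regex must match the entire joined value, not just a prefix.
    match = re.fullmatch(regex, joined)
    if match is None:
        # No match: the label set is returned unchanged.
        return dict(labels)
    out = dict(labels)
    out[target_label] = match.group(1)
    return out


SOURCES = [
    "__meta_kubernetes_endpoint_address_target_kind",
    "__meta_kubernetes_endpoint_address_target_name",
]

# Hypothetical endpoint that is backed by a pod.
pod_target = {SOURCES[0]: "Pod", SOURCES[1]: "example-pod-0"}

after_node_rule = relabel_replace(pod_target, SOURCES, ";", r"Node;(.*)", "node")
after_pod_rule = relabel_replace(after_node_rule, SOURCES, ";", r"Pod;(.*)", "pod")
print(after_pod_rule.get("node"), after_pod_rule.get("pod"))
```

Under these assumptions the `Node;(.*)` rule is a no-op for a pod-backed endpoint, and only the `pod` label is set.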

  • Logs:
level=info ts=2019-03-19T09:20:00.66608894Z caller=main.go:302 msg="Starting Prometheus" version="(version=2.7.0, branch=HEAD, revision=410ee9e04acb8f59f400858752ca82b4ef88035e)"
level=info ts=2019-03-19T09:20:00.666204145Z caller=main.go:303 build_context="(go=go1.11.5, user=root@3bc81b516055, date=20190128-10:09:51)"
level=info ts=2019-03-19T09:20:00.666246347Z caller=main.go:304 host_details="(Linux 4.15.0-1036-azure #38~16.04.1-Ubuntu SMP Fri Dec 7 03:21:52 UTC 2018 x86_64 prometheus-prometheus-0 (none))"
level=info ts=2019-03-19T09:20:00.66632425Z caller=main.go:305 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2019-03-19T09:20:00.666351651Z caller=main.go:306 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2019-03-19T09:20:00.667682008Z caller=main.go:620 msg="Starting TSDB ..."
level=info ts=2019-03-19T09:20:00.667881117Z caller=web.go:416 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2019-03-19T09:20:00.677831342Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552716000000 maxt=1552737600000 ulid=01D63GA13S5VGP9014GRKS85CQ
level=info ts=2019-03-19T09:20:00.689720751Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552737600000 maxt=1552759200000 ulid=01D644X6YKP9G25DWBKFYM4P2A
level=info ts=2019-03-19T09:20:00.699372964Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552759200000 maxt=1552780800000 ulid=01D64SGCNW1QB1FHMRX9CG84S4
level=info ts=2019-03-19T09:20:00.712271617Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552780800000 maxt=1552802400000 ulid=01D65E3NF1JK08WZSTBF13P9BP
level=info ts=2019-03-19T09:20:00.720908986Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552802400000 maxt=1552824000000 ulid=01D662PTRPS9BJ8W27SB4TCDD1
level=info ts=2019-03-19T09:20:00.726902843Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552824000000 maxt=1552845600000 ulid=01D66QA0D2S7MW4B868W7CBSCQ
level=info ts=2019-03-19T09:20:00.729522555Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552845600000 maxt=1552867200000 ulid=01D67BX3M1DJHBJQ4QMXN12RTG
level=info ts=2019-03-19T09:20:00.734639074Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552867200000 maxt=1552888800000 ulid=01D680G95MWREHH99556ZNSDMV
level=info ts=2019-03-19T09:20:00.76726117Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552888800000 maxt=1552910400000 ulid=01D68N3F9BKT8J8JGB1C9FRP54
level=info ts=2019-03-19T09:20:00.77590314Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552910400000 maxt=1552932000000 ulid=01D699PRYFC7PCWCDQC62T5AAX
level=info ts=2019-03-19T09:20:00.784958628Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552932000000 maxt=1552953600000 ulid=01D69Y9WS7K92SQ23T86Y31PHV
level=info ts=2019-03-19T09:20:00.816262467Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552975200000 maxt=1552982400000 ulid=01D6AJVJZJ0GHR2YX9GWEXPJNN
level=info ts=2019-03-19T09:20:00.821824306Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552953600000 maxt=1552975200000 ulid=01D6AJX3ANJJSCKDQMS2EQFMPX
level=warn ts=2019-03-19T09:20:00.82544196Z caller=wal.go:116 component=tsdb msg="last page of the wal is torn, filling it with zeros" segment=/prometheus/wal/00003983
level=info ts=2019-03-19T09:21:10.973661276Z caller=main.go:635 msg="TSDB started"
level=info ts=2019-03-19T09:21:10.97375118Z caller=main.go:695 msg="Loading configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
level=info ts=2019-03-19T09:21:10.980716278Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-19T09:21:10.982182841Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-19T09:21:10.982818768Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-19T09:21:10.983457995Z caller=kubernetes.go:201 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-19T09:21:10.984218428Z caller=kubernetes.go:201 component="discovery manager notify" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2019-03-19T09:21:10.993419622Z caller=main.go:722 msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
level=info ts=2019-03-19T09:21:10.993450423Z caller=main.go:589 msg="Server is ready to receive web requests."
level=warn ts=2019-03-19T09:38:00.576520933Z caller=klog.go:86 component=k8s_client_runtime func=Warningf msg="/app/discovery/kubernetes/kubernetes.go:300: watch of *v1.Endpoints ended with: too old resource version: 30312778 (30312789)"
level=warn ts=2019-03-19T09:38:04.474959221Z caller=klog.go:86 component=k8s_client_runtime func=Warningf msg="/app/discovery/kubernetes/kubernetes.go:300: watch of *v1.Endpoints ended with: too old resource version: 30312778 (30312796)"
level=warn ts=2019-03-19T09:41:53.445150002Z caller=klog.go:86 component=k8s_client_runtime func=Warningf msg="/app/discovery/kubernetes/kubernetes.go:300: watch of *v1.Endpoints ended with: too old resource version: 30312770 (30313257)"
level=warn ts=2019-03-19T09:55:05.644769729Z caller=klog.go:86 component=k8s_client_runtime func=Warningf msg="/app/discovery/kubernetes/kubernetes.go:300: watch of *v1.Endpoints ended with: too old resource version: 30314804 (30314822)"
level=warn ts=2019-03-19T09:55:10.57127359Z caller=klog.go:86 component=k8s_client_runtime func=Warningf msg="/app/discovery/kubernetes/kubernetes.go:300: watch of *v1.Endpoints ended with: too old resource version: 30314813 (30314833)"
level=warn ts=2019-03-19T09:57:42.361647015Z caller=klog.go:86 component=k8s_client_runtime func=Warningf msg="/app/discovery/kubernetes/kubernetes.go:300: watch of *v1.Endpoints ended with: too old resource version: 30314226 (30315122)"
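
The `mint`/`maxt` values in the "found healthy block" lines above are millisecond Unix timestamps. A small Python sketch for turning them into a start time and a block length; the two log lines are copied from the output above, and the helper name `block_range` is made up for this example.

```python
import re
from datetime import datetime, timezone

# Two of the "found healthy block" lines from the log above.
LOG_LINES = [
    'level=info ts=2019-03-19T09:20:00.677831342Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552716000000 maxt=1552737600000 ulid=01D63GA13S5VGP9014GRKS85CQ',
    'level=info ts=2019-03-19T09:20:00.816262467Z caller=repair.go:48 component=tsdb msg="found healthy block" mint=1552975200000 maxt=1552982400000 ulid=01D6AJVJZJ0GHR2YX9GWEXPJNN',
]

BLOCK_RE = re.compile(r"mint=(\d+) maxt=(\d+) ulid=(\w+)")


def block_range(line):
    """Return (ulid, UTC start time, length in hours) for one log line."""
    m = BLOCK_RE.search(line)
    if m is None:
        return None
    mint_ms, maxt_ms = int(m.group(1)), int(m.group(2))
    start = datetime.fromtimestamp(mint_ms / 1000, tz=timezone.utc)
    hours = (maxt_ms - mint_ms) / 3_600_000
    return m.group(3), start.isoformat(), hours


for line in LOG_LINES:
    print(block_range(line))
```

For these two lines the sketch gives a 6-hour block starting 2019-03-16 06:00 UTC and a 2-hour block starting 2019-03-19 06:00 UTC.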
  • dmesg log

[Mar19 10:02] prometheus invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=-998
[  +0.000003] prometheus cpuset=af69816b468120af4af818ca9e56e97e4d4ee733ce0eeb340af4f0bcbabef35a mems_allowed=0
[  +0.000005] CPU: 2 PID: 11069 Comm: prometheus Not tainted 4.15.0-1036-azure #38~16.04.1-Ubuntu
[  +0.000001] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007  06/02/2017
[  +0.000001] Call Trace:
[  +0.000009]  dump_stack+0x63/0x82
[  +0.000004]  dump_header+0x77/0x285
[  +0.000002]  oom_kill_process+0x22e/0x450
[  +0.000005]  out_of_memory+0x11d/0x4c0
[  +0.000004]  mem_cgroup_out_of_memory+0x4b/0x80
[  +0.000002]  mem_cgroup_oom_synchronize+0x32a/0x350
[  +0.000002]  ? mem_cgroup_css_reset+0xe0/0xe0
[  +0.000002]  pagefault_out_of_memory+0x36/0x7b
[  +0.000003]  mm_fault_error+0x8f/0x190
[  +0.000003]  ? handle_mm_fault+0xcc/0x1c0
[  +0.000002]  __do_page_fault+0x4cd/0x500
[  +0.000002]  do_page_fault+0x2e/0xf0
[  +0.000002]  ? page_fault+0x2f/0x50
[  +0.000004]  page_fault+0x45/0x50
[  +0.000002] RIP: 0033:0x45ce53
[  +0.000001] RSP: 002b:000000c000097ed8 EFLAGS: 00010202
[  +0.000001] RAX: 0000000000000000 RBX: 00000000000d6e00 RCX: 0000000000000000
[  +0.000001] RDX: 000000000013ae00 RSI: 00007f1a66963e00 RDI: 00007f1a6688d000
[  +0.000001] RBP: 000000c000097f50 R08: 0000000003852000 R09: 00000000004eb800
[  +0.000001] R10: 00000000000013ad R11: 0000000000001c28 R12: 0000000000000000
[  +0.000001] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  +0.000012] Task in /kubepods/pod4b260c26-44e0-11e9-9cc9-325ddc75cea6/af69816b468120af4af818ca9e56e97e4d4ee733ce0eeb340af4f0bcbabef35a killed as a result of limit of /kubepods/pod4b260c26-44e0-11e9-9cc9-325ddc75cea6/af69816b468120af
[  +0.000004] memory: usage 4607968kB, limit 4608000kB, failcnt 934557
[  +0.000001] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
[  +0.000001] kmem: usage 11428kB, limit 9007199254740988kB, failcnt 0
[  +0.000000] Memory cgroup stats for /kubepods/pod4b260c26-44e0-11e9-9cc9-325ddc75cea6/af69816b468120af4af818ca9e56e97e4d4ee733ce0eeb340af4f0bcbabef35a: cache:208KB rss:4596332KB rss_huge:1675264KB shmem:0KB mapped_file:8KB dirty:0KB
[  +0.000017] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[  +0.000090] [10976]  1000 10976  2540056  1148928  9936896        0          -998 prometheus
[  +0.000004] Memory cgroup out of memory: Kill process 10976 (prometheus) score 1 or sacrifice child
[  +0.007966] Killed process 10976 (prometheus) total-vm:10160224kB, anon-rss:4595712kB, file-rss:0kB, shmem-rss:0kB
[  +0.170178] oom_reaper: reaped process 10976 (prometheus), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
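
As a cross-check of the numbers in the dmesg output above: the per-process table reports `total_vm` and `rss` in pages, while the "Killed process" line reports kB. A short Python sketch of the conversion, assuming the usual 4 kB page size on x86_64 (an assumption; the page size is not printed in the log).

```python
PAGE_KB = 4  # assumption: 4 kB pages, the common default on x86_64


def pages_to_kb(pages):
    """Convert a page count from the OOM process table to kB."""
    return pages * PAGE_KB


# Values from the process-table row for pid 10976.
total_vm_kb = pages_to_kb(2540056)
rss_kb = pages_to_kb(1148928)

# Values from the "memory: usage ..., limit ..." line.
headroom_kb = 4608000 - 4607968

print(total_vm_kb, rss_kb, headroom_kb)
```

With that page size, both converted values agree with `total-vm:10160224kB` and `anon-rss:4595712kB` in the "Killed process" line, and the cgroup had 32 kB of headroom left when the kill happened.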

  • Limits
    resources:
      limits:
        cpu: "2"
        memory: 4500Mi
      requests:
        cpu: "2"
        memory: 4500Mi
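
To confirm that the limit in the dmesg output is the one configured here, a short Python sketch converting the Kubernetes quantity to the kB unit used by the kernel log (which counts 1024-byte units). The parser handles only the `Mi` and `Gi` suffixes and is a hypothetical helper for this example, not the Kubernetes quantity parser.

```python
# Binary suffixes handled by this sketch, expressed in units of 1024 bytes.
SUFFIX_TO_KIB = {"Mi": 1024, "Gi": 1024 * 1024}


def quantity_to_kib(quantity):
    """Convert a quantity such as '4500Mi' to units of 1024 bytes."""
    for suffix, factor in SUFFIX_TO_KIB.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported quantity: {quantity!r}")


limit_kib = quantity_to_kib("4500Mi")
print(limit_kib)
```

The result, 4608000, is the same figure as `limit 4608000kB` in the dmesg output, so the kill was triggered by exactly this container limit.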
Member

brian-brazil commented Mar 19, 2019

It makes more sense to ask questions like this on the prometheus-users mailing list rather than in a GitHub issue. On the mailing list, more people are available to potentially respond to your question, and the whole community can benefit from the answers provided.

Author

Zero-2 commented Mar 19, 2019

Ok, thanks :-)
