Getting very high CPU usage results. #2872

Closed
SainathM opened this Issue Jun 23, 2017 · 2 comments

SainathM commented Jun 23, 2017

What did you do?
I originally set up node-exporter and cAdvisor individually on each node, with Prometheus as the database and Grafana to graph the data. This worked fine. Next, I tried to recreate the same design with a Docker Compose file so that I could launch all the containers as services in my Swarm. For some reason, my CPU usage now shows as -64,000% and worse.
What did you expect to see?
I expected to see around 20% CPU usage, as I did earlier when I ran node-exporter and cAdvisor as containers on each node.
What did you see instead? Under which circumstances?
CPU usage readings of roughly -60,000% to -80,000%.
Environment
Docker

  • System information:

    insert output of uname -srm here

  • Prometheus version:

    latest

  • Alertmanager version:

    insert output of alertmanager -version here (if relevant to the issue)

  • Prometheus configuration file:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'codelab-monitor'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node_exporter_manager'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['10.xxx.xx.xx:9100']

  - job_name: 'cadvisor_manager'


    static_configs:
      - targets: ['10.xxx.xx.xx:8080']


  - job_name: 'node_exporter_worker_one'
  
    static_configs:
      - targets: ['10.xxx.xx.xx:9100']

  - job_name: 'cadvisor_worker_one'
 
    static_configs:
      - targets: ['10.xxx.xx.xx:8080']

  - job_name: 'node_exporter_worker_two'
     
    static_configs:
      - targets: ['10.xxx.xx.xx:9100']

  - job_name: 'cadvisor_worker_two'

    static_configs:
      - targets: ['10.xxx.xx.xx:8080']

  - job_name: 'node_exporter_worker_three'
     
    static_configs:
      - targets: ['10.xxx.xx.xx:9100']

  - job_name: 'cadvisor_worker_three'

    static_configs:
      - targets: ['10.xxx.xx.xx:8080']  
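
As a side note, instead of maintaining one static job per node, a Prometheus instance attached to the same overlay network can discover Swarm services through Docker's built-in DNS. A minimal sketch (not part of the original report), assuming it replaces the per-node jobs under scrape_configs and that the stack is deployed without a name prefix (with docker stack deploy the names usually become tasks.<stack>_node-exporter and tasks.<stack>_cadvisor):

  - job_name: 'node_exporter_swarm'
    dns_sd_configs:
      # Swarm DNS returns one A record per running task of the service.
      - names: ['tasks.node-exporter']
        type: 'A'
        port: 9100

  - job_name: 'cadvisor_swarm'
    dns_sd_configs:
      - names: ['tasks.cadvisor']
        type: 'A'
        port: 8080

Scraping the task IPs directly also keeps each sample tied to the node it actually came from, since it bypasses the ingress routing mesh.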

Here is my Docker Compose file:

version: "3"

networks:
  monitoring:

services:
  cadvisor:
    image: google/cadvisor:${CADVISOR_VERSION:-v0.25.0}
    networks:
      - monitoring
    ports:
     - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /:/rootfs
      - /var/run:/var/run
      - /sys:/sys
      - /var/lib/docker/:/var/lib/docker
    deploy:
      mode: global
      resources:
        limits:
          cpus: '0.10'
          memory: 128M
        reservations:
          cpus: '0.10'
          memory: 64M

  node-exporter:
    image: basi/node-exporter:${NODE_EXPORTER_VERSION:-v1.13.0}
    networks:
      - monitoring
    ports:
     - "9100:9100"
    volumes:
      - /proc:/host/proc
      - /sys:/host/sys
      - /:/rootfs
      - /etc/hostname:/etc/host_hostname
    environment:
      HOST_HOSTNAME: /etc/host_hostname
    command: -collector.procfs "/host/proc" -collector.sysfs /host/sys -collector.textfile.directory /etc/node-exporter/ -collectors.enabled 'conntrack,diskstats,entropy,filefd,filesystem,loadavg,mdadm,meminfo,netdev,netstat,stat,textfile,time,vmstat,ipvs' -collector.filesystem.ignored-mount-points "^/(sys|proc|dev|host|etc)($$|/)"
    deploy:
      mode: global
      resources:
        limits:
          cpus: '0.10'
          memory: 32M
        reservations:
          cpus: '0.10'
          memory: 16M

  docker-exporter:
    image: basi/socat:${DOCKER_EXPORTER_VERSION:-v0.1.0}
    networks:
      - monitoring
    deploy:
      mode: global
      resources:
        limits:
          cpus: '0.05'
          memory: 6M
        reservations:
          cpus: '0.05'
          memory: 4M

  alertmanager:
    image: basi/alertmanager:${ALERTMANAGER_VERSION:-v0.1.0}
    networks:
      - monitoring
      # - logging
    ports:
     - "9093:9093"
    environment:
      SLACK_API: ${SLACK_API:-YOURTOKENGOESHERE}
      LOGSTASH_URL: http://logstash:8080/
    command: -config.file=/etc/alertmanager/config.yml
    deploy:
      mode: replicated
      replicas: 1
      resources:
        limits:
          cpus: '0.01'
          memory: 32M
        reservations:
          cpus: '0.01'
          memory: 16M

  prometheus:
    image: basi/prometheus-swarm:${PROMETHEUS_SWARM_VERSION:-v0.4.3}
    ports:
      - "9090:9090"
    networks:
      - monitoring
    volumes:
      - /docker/prometheus/:/etc/prometheus
    command: -config.file=/etc/prometheus/prometheus.yml -storage.local.path=/prometheus -web.console.libraries=/etc/prometheus/console_libraries -web.console.templates=/etc/prometheus/consoles -alertmanager.url=http://alertmanager:9093
    deploy:
      mode: replicated
      replicas: 1
      resources:
        limits:
          cpus: '0.50'
          memory: 1024M
        reservations:
          cpus: '0.50'
          memory: 128M

  grafana:
    image: basi/grafana:${GRAFANA_VERSION:-v4.1.1}
    ports:
      - "3000:3000"
    networks:
      - monitoring
    environment:
      GF_SECURITY_ADMIN_PASSWORD: ${GF_PASSWORD:-admin}
      PROMETHEUS_ENDPOINT: http://prometheus:9090
      ELASTICSEARCH_ENDPOINT: ${ES_ADDRESS:-http://elasticsearch:9200}
      ELASTICSEARCH_USER: ${ES_USERNAME}
      ELASTICSEARCH_PASSWORD: ${ES_PASSWORD}
    deploy:
      mode: replicated
      replicas: 1
      resources:
        limits:
          cpus: '0.50'
          memory: 64M
        reservations:
          cpus: '0.50'
          memory: 32M
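
One thing worth double-checking in this stack file: ports: entries on a Swarm service are published through the ingress routing mesh, so a scrape of <node-ip>:9100 or <node-ip>:8080 can be answered by a task on a different node from one scrape to the next. When samples from different hosts end up in the same series, rate()/irate() treat the apparent counter decreases as resets and return huge values, which is one plausible way for a "100 minus idle" style dashboard expression to land around -60,000%. A sketch (not the original file) of host-mode publishing, which requires Compose file format 3.2+ and pins each published port to the local task, shown for node-exporter only:

version: "3.2"

services:
  node-exporter:
    # ...image, volumes, command and deploy sections unchanged...
    ports:
      - target: 9100      # container port
        published: 9100   # port exposed on each node
        protocol: tcp
        mode: host        # bypass the ingress routing mesh

The same applies to the cadvisor ports. Alternatively, if Prometheus scrapes the tasks over the monitoring overlay network (for example via the DNS-based discovery sketched above), the published ports are not needed at all.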

  • Alertmanager configuration file:

insert configuration here (if relevant to the issue)


  • Logs:

insert Prometheus and Alertmanager logs relevant to the issue here

brian-brazil commented Jul 14, 2017

lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 23, 2019
