This repository has been archived by the owner on May 6, 2021. It is now read-only.

Mountpoint set to "/", swap file returns NaN and upgrade of containers ? #133

Open
iangregsondev opened this issue Sep 28, 2020 · 1 comment


@iangregsondev

Hi,

I am getting some odd results in the available disk space panel. The query behind it is this:

sum((node_filesystem_free_bytes{mountpoint="/rootfs"} / node_filesystem_size_bytes{mountpoint="/rootfs"}) * on(instance) group_left(node_name) node_meta{node_id=~".+"} * 100) / count(node_meta * on(instance) group_left(node_name) node_meta{node_id=~".+"})

But on my system there is no /rootfs mountpoint. I checked node_filesystem_size_bytes, which the query uses, and it outputs:

node_filesystem_size_bytes{device="/dev/mapper/ubuntu--vg-ubuntu--lv",fstype="ext4",instance="10.0.8.3:9100",job="node-exporter",mountpoint="/"}
node_filesystem_size_bytes{device="/dev/mapper/ubuntu--vg-ubuntu--lv",fstype="ext4",instance="10.0.8.16:9100",job="node-exporter",mountpoint="/"}

As you can see, the mountpoint is "/", not "/rootfs".

This is what I have set in my Docker configuration for node-exporter (nothing changed as far as variables are concerned, although I did upgrade the version; see below):

    environment:
      - NODE_ID={{.Node.ID}}
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
      - /etc/hostname:/etc/nodename
    command:
      - '--path.sysfs=/host/sys'
      - '--path.procfs=/host/proc'
      - '--collector.textfile.directory=/etc/node-exporter/'
      - '--collector.filesystem.ignored-mount-points=^/(sys|proc|dev|host|etc)($$|/)'
      - '--no-collector.ipvs'
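
One way to test whether the label mismatch alone explains the disk panel is to run the same expression with the mountpoint the exporter actually reports. This is only a sketch: it assumes the series shown above (mountpoint="/") are the ones the panel should use, and that node_meta still carries one series per instance.

```promql
# Same panel query as above, but matching mountpoint="/" instead of "/rootfs".
sum(
  (node_filesystem_free_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"})
  * on(instance) group_left(node_name) node_meta{node_id=~".+"}
  * 100
)
/
count(node_meta * on(instance) group_left(node_name) node_meta{node_id=~".+"})
```

If this returns a sensible percentage, the dashboard query rather than the exporter is what needs adjusting. Newer node_exporter releases also provide a `--path.rootfs` flag for the bind-mounted host root; whether setting it to /rootfs changes the reported label in this setup is untested here.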

Another problem is the "used swap memory" panel, which returns NaN. This is the query:

sum(((node_memory_SwapTotal_bytes - node_memory_SwapFree_bytes) / node_memory_SwapTotal_bytes) * on(instance) group_left(node_name) node_meta{node_id=~".+"} * 100) / count(node_meta * on(instance) group_left(node_name) node_meta{node_id=~".+"})
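
A possible cause, not confirmed for these nodes, is that no swap is configured: node_memory_SwapTotal_bytes is then 0, the inner division becomes 0/0, and PromQL evaluates that to NaN, which carries through the sum. The sketch below first checks for such nodes, then shows a variant of the panel query that drops them before dividing.

```promql
# Run each expression separately.

# 1) Nodes that report zero swap. For these, (total - free) / total is 0/0 = NaN.
node_memory_SwapTotal_bytes == 0

# 2) Panel query with the divisor filtered to nodes that actually have swap.
#    The "> 0" comparison removes zero-swap series, so no NaN enters the sum.
sum(
  ((node_memory_SwapTotal_bytes - node_memory_SwapFree_bytes) / (node_memory_SwapTotal_bytes > 0))
  * on(instance) group_left(node_name) node_meta{node_id=~".+"}
  * 100
)
/
count(node_meta * on(instance) group_left(node_name) node_meta{node_id=~".+"})
```

With this variant the count in the denominator still includes every node, so nodes without swap pull the average down as if they used 0%. If every node has zero swap, the expression returns no data instead of NaN.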

Can anybody help me debug it?

I need to be upfront: I upgraded the containers because I felt they were outdated, and I also swapped unsee for karma. (karma is a fork by the original developer and unsee is deprecated. It is technically the same, just nicer and written in React.)

I will leave the compose file and Dockerfiles here. I would be interested to know whether anybody else has tried this and run into issues :-) All I really did was edit the Dockerfiles to use the ":latest" tag and update docker-compose to build the images.

docker compose

version: "3.3"

networks:
  internal:
    external: false
  traefik:
    external: true

configs:
  dockerd_config:
    file: ./dockerd-exporter/Caddyfile
  node_rules:
    file: ./prometheus/rules/swarm_node.rules.yml
  task_rules:
    file: ./prometheus/rules/swarm_task.rules.yml

services:
  dockerd-exporter:
    image: stefanprodan/caddy
    networks:
      - internal
    environment:
      - DOCKER_GWBRIDGE_IP=172.18.0.1
    configs:
      - source: dockerd_config
        target: /etc/caddy/Caddyfile
    deploy:
      mode: global
      resources:
        limits:
          memory: 128M
        reservations:
          memory: 64M

  cadvisor:
    image: google/cadvisor
    networks:
      - internal
    command: -logtostderr -docker_only
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /:/rootfs:ro
      - /var/run:/var/run
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    deploy:
      mode: global
      resources:
        limits:
          memory: 128M
        reservations:
          memory: 64M

  grafana:
    image: iangregsondev/swarmprom-grafana:latest
    build:
      context: ./grafana
      dockerfile: Dockerfile
    networks:
      - default
      - internal
      - traefik
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
      - GF_USERS_ALLOW_SIGN_UP=false
      #- GF_SERVER_ROOT_URL=${GF_SERVER_ROOT_URL:-localhost}
      #- GF_SMTP_ENABLED=${GF_SMTP_ENABLED:-false}
      #- GF_SMTP_FROM_ADDRESS=${GF_SMTP_FROM_ADDRESS:-grafana@test.com}
      #- GF_SMTP_FROM_NAME=${GF_SMTP_FROM_NAME:-Grafana}
      #- GF_SMTP_HOST=${GF_SMTP_HOST:-smtp:25}
      #- GF_SMTP_USER=${GF_SMTP_USER}
      #- GF_SMTP_PASSWORD=${GF_SMTP_PASSWORD}
    volumes:
      - /mnt/grafana:/var/lib/grafana
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
      resources:
        limits:
          memory: 128M
        reservations:
          memory: 64M
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik
        - traefik.http.routers.grafana.entrypoints=https
        - traefik.http.routers.grafana.tls.certresolver=le
        - traefik.http.services.grafana.loadbalancer.server.port=3000
        - traefik.http.routers.grafana.rule=Host(`grafana.somedomain.dev`)
        - traefik.http.middlewares.grafana-ipwhitelist.ipwhitelist.sourcerange=192.168.1.0/24        
        - traefik.http.routers.grafana.middlewares=grafana-ipwhitelist@docker  

  alertmanager:
    image: iangregsondev/swarmprom-alertmanager:latest
    build:
      context: ./alertmanager
      dockerfile: Dockerfile
    networks:
      - default
      - internal
      - traefik
    environment:
      - SLACK_URL=${SLACK_URL:-https://hooks.slack.com/services/TOKEN}
      - SLACK_CHANNEL=${SLACK_CHANNEL:-general}
      - SLACK_USER=${SLACK_USER:-alertmanager}
    command:
      - '--config.file=/etc/alertmanager/alertmanager.yml'
      - '--storage.path=/alertmanager'
    volumes:
      - /mnt/alertmanager:/alertmanager
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
      resources:
        limits:
          memory: 128M
        reservations:
          memory: 64M
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik
        - traefik.http.routers.alertmanager.entrypoints=https
        - traefik.http.routers.alertmanager.tls.certresolver=le
        - traefik.http.services.alertmanager.loadbalancer.server.port=9093
        - traefik.http.routers.alertmanager.rule=Host(`alertmanager.somedomain.dev`)
        - traefik.http.middlewares.alertmanager-ipwhitelist.ipwhitelist.sourcerange=192.168.1.0/24        
        - traefik.http.routers.alertmanager.middlewares=alertmanager-ipwhitelist@docker  

  karma:
    image: lmierzwa/karma:latest
    networks:
      - default
      - internal
      - traefik
    environment:
      - "ALERTMANAGER_URI=http://alertmanager:9093"
    deploy:
      mode: replicated
      replicas: 1
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik
        - traefik.http.routers.karma.entrypoints=https
        - traefik.http.routers.karma.tls.certresolver=le
        - traefik.http.services.karma.loadbalancer.server.port=8080
        - traefik.http.routers.karma.rule=Host(`karma.somedomain.dev`)
        - traefik.http.middlewares.karma-ipwhitelist.ipwhitelist.sourcerange=192.168.1.0/24        
        - traefik.http.routers.karma.middlewares=karma-ipwhitelist@docker  

  node-exporter:
    image: iangregsondev/swarmprom-node-exporter:latest
    build:
      context: ./node-exporter
      dockerfile: Dockerfile     
    networks:
      - internal
    environment:
      - NODE_ID={{.Node.ID}}
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
      - /etc/hostname:/etc/nodename
    command:
      - '--path.sysfs=/host/sys'
      - '--path.procfs=/host/proc'
      - '--collector.textfile.directory=/etc/node-exporter/'
      - '--collector.filesystem.ignored-mount-points=^/(sys|proc|dev|host|etc)($$|/)'
      - '--no-collector.ipvs'
    deploy:
      mode: global
      resources:
        limits:
          memory: 128M
        reservations:
          memory: 64M

  prometheus:
    image: iangregsondev/swarmprom-prometheus:latest
    build:
      context: ./prometheus
      dockerfile: Dockerfile    
    networks:
      - default
      - internal
      - traefik
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention=24h'
    volumes:
      - /mnt/prometheus:/prometheus
    configs:
      - source: node_rules
        target: /etc/prometheus/swarm_node.rules.yml
      - source: task_rules
        target: /etc/prometheus/swarm_task.rules.yml
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == manager
      resources:
        limits:
          memory: 2048M
        reservations:
          memory: 128M
      labels:
        - traefik.enable=true
        - traefik.docker.network=traefik
        - traefik.http.routers.prometheus.entrypoints=https
        - traefik.http.routers.prometheus.tls.certresolver=le
        - traefik.http.services.prometheus.loadbalancer.server.port=9090
        - traefik.http.routers.prometheus.rule=Host(`prometheus.somedomain.dev`)
        - traefik.http.middlewares.prometheus-ipwhitelist.ipwhitelist.sourcerange=192.168.1.0/24        
        - traefik.http.routers.prometheus.middlewares=prometheus-ipwhitelist@docker  
        

and

FROM prom/alertmanager:latest

COPY conf /etc/alertmanager/

ENTRYPOINT  [ "/etc/alertmanager/docker-entrypoint.sh" ]
CMD        [ "--config.file=/etc/alertmanager/alertmanager.yml", \
             "--storage.path=/alertmanager" ]

and

FROM grafana/grafana:latest
# https://hub.docker.com/r/grafana/grafana/tags/

COPY datasources /etc/grafana/provisioning/datasources/
COPY swarmprom_dashboards.yml /etc/grafana/provisioning/dashboards/
COPY dashboards /etc/grafana/dashboards/

ENV GF_SECURITY_ADMIN_PASSWORD=admin \
    GF_SECURITY_ADMIN_USER=admin \
    GF_PATHS_PROVISIONING=/etc/grafana/provisioning/

and

FROM prom/node-exporter:latest

ENV NODE_ID=none

USER root

COPY conf /etc/node-exporter/

ENTRYPOINT  [ "/etc/node-exporter/docker-entrypoint.sh" ]
CMD [ "/bin/node_exporter" ]

and

FROM prom/prometheus:latest
# https://hub.docker.com/r/prom/prometheus/tags/

ENV WEAVE_TOKEN=none

COPY conf /etc/prometheus/

ENTRYPOINT [ "/etc/prometheus/docker-entrypoint.sh" ]
CMD        [ "--config.file=/etc/prometheus/prometheus.yml", \
             "--storage.tsdb.path=/prometheus" ]

@iangregsondev
Author

It seems everything else is working great in the dashboards.

Only the disk space and memory panels are affected.

