📊 Prometheus and Grafana 101

Hands-on lab: Prometheus and Grafana

Slides here

0 - Introduction

Full setup (with workshop solutions)

Locally with Docker

Locally without Docker

Download Prometheus and official exporters:

Download Grafana:

1 - Metrics types

Take a look at the Prometheus metric types (counter, gauge, histogram, summary) =>
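To make the difference between the two most common types concrete, here is a minimal Python sketch. The Counter and Gauge classes are toy stand-ins written for this lab, not the API of any Prometheus client library, and the metric names are hypothetical. Histograms and summaries are left out for brevity.

```python
# Toy models of two Prometheus metric types (illustrative only).


class Counter:
    """A value that only goes up, e.g. total requests served."""

    def __init__(self) -> None:
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """A value that can go up and down, e.g. free memory."""

    def __init__(self) -> None:
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = value


requests_total = Counter()
requests_total.inc()
requests_total.inc(2)

free_memory_bytes = Gauge()
free_memory_bytes.set(4_000_000_000)
free_memory_bytes.set(3_500_000_000)  # gauges may decrease

print(requests_total.value)     # 3.0
print(free_memory_bytes.value)  # 3500000000
```

A counter is what you later feed to rate() or increase(); a gauge is usually graphed as-is.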

2 - Start Prometheus

# Starts Prometheus
docker-compose up -d prometheus

# Starts system metrics exporter
docker-compose up -d node-exporter

3 - Let's grab some system metrics (memory, CPU, disk...)

Update the prometheus.yml config file to scrape node-exporter metrics every 10 seconds. 🚀

💡 Solution
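# /etc/prometheus/prometheus.yml

global:
  # Default interval for every job
  scrape_interval: 30s

scrape_configs:
  - job_name: 'node-exporter'
    # Overrides the global interval for this job only
    scrape_interval: 10s
    static_configs:
      - targets: ['node-exporter:9100']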

Then reload Prometheus with docker-compose exec prometheus kill -HUP 1 and see what happens here: http://localhost:9090/targets.

4 - Execute your first PromQL query

PromQL documentation:

4.0 - Memory usage

Go to http://localhost:9090/graph and write a query displaying a graph of free memory on your OS.

Metric name is node_memory_MemFree_bytes.

💡 Solution

Query: node_memory_MemFree_bytes{}

4.1 - Human readable

Same metric, but in gigabytes?

💡 Solution

Query: node_memory_MemFree_bytes{} / 1024 / 1024 / 1024

4.2 - Relative to total memory

Now express memory usage (total minus free) as a percentage of total memory.

Tip: node-exporter metrics are prefixed with node_.

💡 Solution

Query: (node_memory_MemTotal_bytes{} - node_memory_MemFree_bytes{}) / node_memory_MemTotal_bytes{} * 100
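The arithmetic behind the queries in 4.1 and 4.2 can be checked by hand. Below is a small Python sketch of the same two computations; the function names and the sample values are made up for illustration and do not come from a real host.

```python
GIB = 1024 ** 3  # bytes per GiB, same factor as "/ 1024 / 1024 / 1024"


def bytes_to_gib(n_bytes: float) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


def used_memory_percent(total_bytes: float, free_bytes: float) -> float:
    """Mirror of (MemTotal - MemFree) / MemTotal * 100."""
    return (total_bytes - free_bytes) / total_bytes * 100


# Hypothetical sample values.
mem_total = 8 * GIB
mem_free = 2 * GIB

print(bytes_to_gib(mem_free))                    # 2.0
print(used_memory_percent(mem_total, mem_free))  # 75.0
```

In PromQL the same operators apply to every sample of the series, so the result is a new time series rather than a single number.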

5 - Setup Grafana

Uncomment grafana in docker-compose.yml and launch it:

docker-compose up -d grafana

Open http://localhost:3000 (user: grep / pass: demo).

Add a new datasource to Grafana.

6 - Hand-made dashboard

Add a new dashboard to Grafana.

6.0 - Simple graph

Create a graph showing current memory usage.

💡 Solution

Query: (node_memory_MemTotal_bytes{} - node_memory_MemFree_bytes{}) / node_memory_MemTotal_bytes{} * 100

6.1 - Some formatting

Grafana should display the graph in %, such as:

💡 Solution

6.2 - CPU load

In the same dashboard, add a new graph for CPU load (1min, 5min, 15min).

Tip: you will need a new metric prefixed with node_.

💡 Solution

6.3 - Disk usage

In the same dashboard, add a new graph for sda disk usage (kB written per second).

You will need the rate() PromQL function:

💡 Solution

Query: rate(node_disk_written_bytes_total{device="sda"}[30s])
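rate() answers the question "how fast is this counter growing per second over the window?". The Python sketch below is a simplified model of that idea, not Prometheus' actual implementation: simple_rate and the sample values are hypothetical, and real Prometheus additionally extrapolates to the window boundaries, so its results can differ slightly.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase over the window, tolerant of counter resets."""
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero: count the new value.
        increase += (curr - prev) if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Hypothetical bytes-written counter scraped every 10 seconds.
steady = [(0, 1000), (10, 3000), (20, 5000), (30, 7000)]
with_reset = [(0, 1000), (10, 3000), (20, 500), (30, 2500)]

print(simple_rate(steady))      # 200.0 bytes per second
print(simple_rate(with_reset))  # 150.0 bytes per second
```

The window needs at least two samples, which is why a [30s] range works with a 10 second scrape interval but would return nothing with a 30 second one.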

7 - Dashboards from community

Let's import a dashboard from the Grafana website.

These dashboards are only compatible with the Prometheus data source and node-exporter.

8 - Monitor services: nginx, postgresql...

8.1 - Export Nginx and PostgreSQL metrics

Uncomment postgres, postgresql-exporter and nginx-exporter services in docker-compose.yml, and launch containers.

docker-compose up -d nginx-exporter
docker-compose up -d postgres postgresql-exporter

Update Prometheus configuration to scrape Nginx and PostgreSQL exporters.

💡 Solution


# Add under scrape_configs in prometheus.yml
- job_name: 'postgresql-exporter'
  static_configs:
    - targets: ['postgresql-exporter:9187']

- job_name: 'nginx-exporter'
  static_configs:
    - targets: ['nginx-exporter:9101']

Then reload Prometheus with docker-compose exec prometheus kill -HUP 1

Check that everything is working here: http://localhost:9090/targets

Take a look at the /metrics routes of the exporters: http://localhost:9187/metrics and http://localhost:9101/metrics
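What the exporters return on /metrics is plain text, one sample per line, with # HELP and # TYPE comments. To get a feel for the format, here is a deliberately naive Python parser. It is a sketch under assumptions: the payload below is a hypothetical example, and the parser ignores escaping, commas inside label values and optional timestamps.

```python
from typing import Dict, List, Tuple

Sample = Tuple[str, Dict[str, str], float]  # (metric name, labels, value)


def parse_metrics(text: str) -> List[Sample]:
    """Parse a simplified subset of the Prometheus text exposition format."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        name, _, raw_labels = series.partition("{")
        labels: Dict[str, str] = {}
        for pair in raw_labels.rstrip("}").split(","):
            if pair:
                key, _, val = pair.partition("=")
                labels[key] = val.strip('"')
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical payload, shaped like what an exporter serves.
PAYLOAD = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
nginx_http_requests_total{status="200"} 12
"""

for sample in parse_metrics(PAYLOAD):
    print(sample)
```

Each parsed tuple corresponds to one time series that Prometheus stores after a scrape, with the job and instance labels added on top.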

8.2 - Generate some metrics

Send a few dozen requests to Nginx on localhost:8080 (200, 404...) and fill the PostgreSQL database:

# 2xx

# 4xx
# inserts data into pg

8.3 - Import PG dashboards to Grafana

Go on and find a dashboard for PostgreSQL, compatible with Prometheus and wrouesnel/postgres_exporter.

💡 Solution

These dashboards look nice:

8.4 - Create Nginx dashboards

Display 2 graphs:

  • number of 2xx http requests per second

  • number of 4xx http requests per second

Tip: you should use sum by(<label>) (<metric>) and irate(<metric>[<range>]) (cf. PromQL doc).

💡 Solution

Query graph 1: sum by (status) (irate(nginx_http_requests_total{status=~"2.."}[1m]))

Legend graph 1: Status: {{ status }}

Query graph 2: sum by (status) (irate(nginx_http_requests_total{status=~"4.."}[1m]))

Legend graph 2: Status: {{ status }}
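The two queries combine three ideas: a regex label matcher, irate() on each matching series, and sum by (status) to merge series that share a status. The Python sketch below models that pipeline in a simplified way. All names and numbers are hypothetical (including the two Nginx instances), and simple_irate only approximates what Prometheus does.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[float, float]     # (timestamp in seconds, counter value)
SeriesKey = Tuple[str, str]      # (instance, status)


def simple_irate(samples: List[Sample]) -> float:
    """Per-second rate computed from the last two samples only."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    delta = (v1 - v0) if v1 >= v0 else v1  # handle a counter reset
    return delta / (t1 - t0)


def sum_irate_by_status(
    series: Dict[SeriesKey, List[Sample]], status_pattern: str
) -> Dict[str, float]:
    """Model of: sum by (status) (irate(metric{status=~pattern}[range]))."""
    totals: Dict[str, float] = defaultdict(float)
    for (_instance, status), samples in series.items():
        # PromQL regex matchers are fully anchored, hence fullmatch.
        if re.fullmatch(status_pattern, status):
            totals[status] += simple_irate(samples)
    return dict(totals)


# Hypothetical request counters from two Nginx instances.
SERIES = {
    ("nginx-a", "200"): [(0, 100), (10, 150), (20, 250)],
    ("nginx-b", "200"): [(0, 10), (10, 20), (20, 40)],
    ("nginx-a", "204"): [(0, 0), (10, 5), (20, 10)],
    ("nginx-a", "404"): [(0, 0), (10, 10), (20, 40)],
}

print(sum_irate_by_status(SERIES, "2.."))  # {'200': 12.0, '204': 0.5}
print(sum_irate_by_status(SERIES, "4.."))  # {'404': 3.0}
```

Because irate() only looks at the last two samples, it reacts quickly to bursts, which suits a "requests per second right now" graph.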

9 - Export some business metrics

Let's display in real time:

  • number of users
  • number of posts per user

9.0 - Export data

Grab custom metrics with postgresql-exporter by adding queries to custom-queries.yml:

  • Metric user_count of type counter => SELECT COUNT(*) FROM users;
  • Metric post_per_user_count of type gauge with id and email in labels => SELECT u.id, u.email, COUNT(*) FROM posts p JOIN users u ON u.id = p.user_id GROUP BY u.id, u.email;

Example and syntax here.

http://localhost:9187/metrics should output:


# HELP user_count_count Number of users
# TYPE user_count_count counter
user_count_count 2

# HELP post_per_user_count_count Number of posts per user
# TYPE post_per_user_count_count gauge
post_per_user_count_count{email="",id="e1c10ca1-60c8-405c-a9f3-3ff41456ca9f"} 1
post_per_user_count_count{email="",id="fde08ee6-5fb9-4c4f-9b40-dc2ad69bb855"} 2

💡 Solution

Append to custom-queries.yml:

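# The top-level key is the metric prefix; each column listed under
# "metrics" becomes either a value (<prefix>_<column>) or a label.
user_count:
  query: "SELECT COUNT(*) FROM users;"
  metrics:
    - count:
        usage: "COUNTER"
        description: "Number of users"

post_per_user_count:
  query: "SELECT u.id, u.email, COUNT(*) FROM posts p JOIN users u ON u.id = p.user_id GROUP BY u.id, u.email;"
  metrics:
    - count:
        usage: "GAUGE"
        description: "Number of posts per user"
    - id:
        usage: "LABEL"
        description: "User id"
    - email:
        usage: "LABEL"
        description: "User email"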
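Judging from the expected /metrics output above, each row returned by a custom query becomes one sample: the value column is appended to the query name, and the remaining columns become labels. The Python sketch below imitates that mapping. It is an illustration only, not the exporter's code; render_samples is a made-up helper and the user ids and e-mail addresses are placeholders.

```python
from typing import Dict, List


def render_samples(
    namespace: str,
    value_column: str,
    help_text: str,
    metric_type: str,
    rows: List[Dict[str, object]],
) -> str:
    """Turn SQL-like rows into Prometheus text exposition lines."""
    name = f"{namespace}_{value_column}"
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for row in rows:
        # Every column except the value column is rendered as a label.
        labels = ",".join(
            f'{col}="{val}"'
            for col, val in sorted(row.items())
            if col != value_column
        )
        lines.append(f"{name}{{{labels}}} {row[value_column]}")
    return "\n".join(lines)


# Placeholder rows, shaped like the result of the posts-per-user query.
ROWS = [
    {"id": "user-1", "email": "alice@example.com", "count": 1},
    {"id": "user-2", "email": "bob@example.com", "count": 2},
]

print(
    render_samples(
        "post_per_user_count", "count", "Number of posts per user", "gauge", ROWS
    )
)
```

This also explains the doubled suffix in names such as post_per_user_count_count: the first part is the key in custom-queries.yml, the second is the SQL column name.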

9.1 - Graph time!

With the user_count_count{} and post_per_user_count_count{id,email} metrics exposed on /metrics, build the following graphs:

Simple graph of user signups (rate(<metric>)):


Heatmap of signups (increase(<metric>)):

docker-compose exec grafana grafana-cli plugins install petrslavotinek-carpetplot-panel
docker-compose restart grafana

Table of top 10 users per post count (topk(), sum by(<label>) (<metric>)):

💡 Solution

Query 1: rate(user_count_count{}[1m])

Query 2: increase(user_count_count{}[$__interval]) > 0

Query 3: topk(10, sum by (id, email) (post_per_user_count_count{}) > 0)
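Query 3 first drops users without posts (> 0) and then keeps the k largest values. A rough Python equivalent, with a hypothetical topk helper and placeholder users, looks like this:

```python
from typing import Dict, List, Tuple

UserKey = Tuple[str, str]  # (id, email)


def topk(k: int, values: Dict[UserKey, float]) -> List[Tuple[UserKey, float]]:
    """Keep strictly positive values, then return the k largest."""
    positive = {key: value for key, value in values.items() if value > 0}
    ranked = sorted(positive.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Placeholder post counts per user.
POSTS = {
    ("user-1", "alice@example.com"): 1,
    ("user-2", "bob@example.com"): 5,
    ("user-3", "carol@example.com"): 0,
    ("user-4", "dave@example.com"): 3,
}

for (user_id, email), count in topk(2, POSTS):
    print(user_id, email, count)
# user-2 bob@example.com 5
# user-4 dave@example.com 3
```

In Grafana, a table panel fed with an instant query gives one row per user, which is usually what you want for a top 10.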

9.2 - Expose /metrics from a micro-service

You can play with this sample in Node.js: microservice-demo/

Don't forget to update the Prometheus configuration in prometheus.yml!
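The sample in this repository is written in Node.js. For readers who want to see how little is needed to expose /metrics, here is a stand-alone sketch in Python using only the standard library. It is not the repository's implementation: the metric name demo_http_requests_total, the helper names and the port are assumptions chosen for the example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metric name; pick one that matches your service.
METRIC = "demo_http_requests_total"

_lock = threading.Lock()
_state = {"requests": 0}


def record_request() -> None:
    """Increment the request counter (thread-safe)."""
    with _lock:
        _state["requests"] += 1


def render_metrics() -> str:
    """Return the counter in the Prometheus text exposition format."""
    with _lock:
        count = _state["requests"]
    return (
        f"# HELP {METRIC} Total HTTP requests handled.\n"
        f"# TYPE {METRIC} counter\n"
        f"{METRIC} {count}\n"
    )


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path == "/metrics":
            body = render_metrics().encode("utf-8")
            content_type = "text/plain; version=0.0.4; charset=utf-8"
        else:
            record_request()  # count every non-metrics request
            body = b"hello\n"
            content_type = "text/plain; charset=utf-8"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, serving the app and /metrics on the given port."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling serve() starts the service; a new job in prometheus.yml pointing at that host and port is then enough for Prometheus to scrape it.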

42 - More

  • Monitor a Redis server, a RabbitMQ cluster, MySQL...
  • Increase data retention (15d by default).
  • Set up alerting with Alertmanager and basic rules
  • Set up Prometheus service discovery (consul, etc, dns...) to import configuration automatically
  • Limits: multitenancy - partitioning/sharding - scaling - cron tasks