tiroq/ember

🔥 Ember

A quiet system for revealing hidden heat and load.

Ember is a standalone Linux host observability stack designed to diagnose fan noise, thermal spikes, and system bottlenecks using host and container metrics. Built for Mini PCs and homelab environments where understanding thermal behavior and resource utilization is critical.

Ember Logo

Features

  • Thermal Monitoring: Track CPU temperatures, thermal zones, and fan speeds via hwmon
  • CPU Analysis: Per-core utilization, iowait detection, load averages
  • Memory & Swap: Real-time memory breakdown and swap usage tracking
  • Disk I/O: Throughput, IOPS, and filesystem usage monitoring
  • Network: Bandwidth utilization, errors, and packet drops
  • Container Insights: Top containers by CPU/memory, restart detection
  • Process Visibility: Per-process resource consumption (optional)

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         Linux Host                               │
├─────────────────────────────────────────────────────────────────┤
│  ┌──────────────┐  ┌──────────────┐  ┌────────────────────────┐ │
│  │ node_exporter│  │   cAdvisor   │  │   process-exporter     │ │
│  │   :9100      │  │    :8080     │  │        :9256           │ │
│  └──────┬───────┘  └──────┬───────┘  └───────────┬────────────┘ │
│         │                 │                      │              │
│         └────────────┬────┴──────────────────────┘              │
│                      ▼                                          │
│              ┌───────────────┐                                  │
│              │  Prometheus   │──── 15 day retention             │
│              │    :9090      │                                  │
│              └───────┬───────┘                                  │
│                      │                                          │
│                      ▼                                          │
│              ┌───────────────┐                                  │
│              │    Grafana    │──── Auto-provisioned dashboards  │
│              │    :3000      │                                  │
│              └───────────────┘                                  │
└─────────────────────────────────────────────────────────────────┘

Prerequisites

  • Docker >= 20.10
  • Docker Compose >= 2.0 (or docker-compose v1.29+)
  • Linux host with access to /proc, /sys, and /dev
  • lm-sensors (recommended for temperature metrics)

Verify Docker Installation

docker --version
docker compose version

Quick Start

1. Clone or Create the Project

cd /path/to/ember

2. Configure Environment

Copy the example environment file and optionally generate a new secure password:

# Option A: use the provided .env (it ships with a pre-generated secure password)
# Option B: create your own from the template (note: > replaces the entire file):
cp .env.example .env
echo "GF_SECURITY_ADMIN_PASSWORD=$(openssl rand -base64 24)" > .env
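
Before starting the stack, it can help to confirm the password variable actually made it into .env. A minimal guard function (a sketch; only the GF_SECURITY_ADMIN_PASSWORD variable name comes from the steps above, the messages are illustrative):

```shell
# check_env: verify that GF_SECURITY_ADMIN_PASSWORD is set and non-empty
# in the given env file (defaults to .env in the current directory).
check_env() {
  local f="${1:-.env}"
  if grep -Eq '^GF_SECURITY_ADMIN_PASSWORD=.+' "$f" 2>/dev/null; then
    echo "ok: admin password is set"
  else
    echo "error: GF_SECURITY_ADMIN_PASSWORD missing or empty in $f" >&2
    return 1
  fi
}
```

Run check_env before docker compose up -d; if the variable is missing, Grafana falls back to its default admin password.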

3. Start the Stack

docker compose up -d

4. Verify All Services Are Running

docker compose ps

Expected output shows all 5 services as "healthy" or "running":

  • ember-prometheus
  • ember-grafana
  • ember-node-exporter
  • ember-cadvisor
  • ember-process-exporter

Accessing the Interfaces

Service      URL                     Default Credentials
Grafana      http://localhost:3000   admin / (see .env)
Prometheus   http://localhost:9090   -

Note: All services bind to 127.0.0.1 (localhost only) by default for security.

External Application Integration

Ember can scrape metrics from external applications running in Docker containers via an external Docker network. This is configured using override files that are gitignored, keeping the main Ember configuration clean.

Prerequisites

  • External application stack must be running
  • External Docker network must exist (e.g., <app>_default, <app>_network)

Setup

  1. Verify external network exists:

    docker network ls | grep <network-name>
  2. Configure the override files (gitignored):

    • docker-compose.override.yml - Adds external network connectivity
    • prometheus/prometheus.local.yml - Adds scrape jobs for external services
  3. Update network name in docker-compose.override.yml:

    networks:
      <external-network-name>:
        external: true
  4. Start Ember (override auto-applied):

    docker compose up -d
  5. Start WITHOUT external integration (ignore override):

    docker compose -f docker-compose.yml up -d
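
The scrape side of the integration lives in prometheus/prometheus.local.yml. A sketch of what a job there can look like, assuming a hypothetical external service myapp exposing /metrics on port 8000 (the service name, port, and how the local file is merged into the main config are assumptions to adapt):

```yaml
# prometheus/prometheus.local.yml (gitignored) — example scrape job.
# "myapp" and port 8000 are placeholders for your external service.
scrape_configs:
  - job_name: 'myapp'
    static_configs:
      - targets: ['myapp:8000']
```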

Verify Targets

  1. Open http://localhost:9090/targets
  2. Look for your custom job targets
  3. UP = service is running and metrics are being scraped
  4. DOWN = service is not running (expected if external app is stopped)

Test Connectivity from Prometheus Container

# Replace <service-name> and <port> with actual values
docker exec ember-prometheus wget -qO- http://<service-name>:<port>/metrics | head -20

Troubleshooting

If connectivity fails, verify:

  1. External stack is running: docker ps | grep <app-name>
  2. External network exists: docker network ls | grep <network-name>
  3. Ember is connected to the network: docker network inspect <external-network-name>

Adding Alert Rules

Alert rules let Prometheus fire alerts when their conditions hold for the configured duration. Firing alerts are displayed in the Prometheus UI; without Alertmanager, no notifications are sent.

  1. Copy the example rules file:

    cp prometheus/rules/example.rules.yml prometheus/rules/my-app.rules.yml
  2. Edit and uncomment rules in prometheus/rules/my-app.rules.yml

  3. Validate rules:

    docker run --rm --entrypoint promtool \
      -v "$(pwd)/prometheus:/etc/prometheus:ro" \
      prom/prometheus:v2.51.0 check config /etc/prometheus/prometheus.yml
  4. Reload Prometheus (no restart needed):

    curl -X POST http://localhost:9090/-/reload
  5. View alerts: http://localhost:9090/alerts

Note: Rules files (*.rules.yml) are gitignored except example.rules.yml. This allows environment-specific alert configurations.
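
A rules file must wrap its rules in a groups: block to pass promtool validation; the snippets in the next section are bare rules. A minimal skeleton (group name, file name, and annotation text are illustrative):

```yaml
# prometheus/rules/my-app.rules.yml
groups:
  - name: my-app
    rules:
      - alert: ServiceDown
        expr: up{job="my-service"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "my-service target has been down for 1 minute"
```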

Common Alert Patterns

# Service down
- alert: ServiceDown
  expr: up{job="my-service"} == 0
  for: 1m
  labels:
    severity: critical

# High latency (histogram)
- alert: HighLatency
  expr: histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])) > 1
  for: 5m
  labels:
    severity: warning

# Message queue backlog (NATS JetStream example)
- alert: ConsumerBacklogHigh
  expr: jetstream_consumer_num_pending > 1000
  for: 5m
  labels:
    severity: warning

Verification Steps

Check Prometheus Targets

  1. Open http://localhost:9090/targets
  2. Verify all targets show UP status:
    • node-exporter (1/1 up)
    • cadvisor (1/1 up)
    • process-exporter (1/1 up)
    • prometheus (1/1 up)

Check Grafana Dashboards

  1. Open http://localhost:3000
  2. Login with admin and the password from your .env file
  3. Navigate to Dashboards in the left sidebar
  4. Verify two dashboards are present:
    • Host Health + Thermals
    • Containers Overview

Verify Metrics Collection

In Prometheus (http://localhost:9090/graph), try these queries:

# CPU temperature (requires lm-sensors)
node_hwmon_temp_celsius

# CPU usage
rate(node_cpu_seconds_total{mode="user"}[1m])

# Container memory
container_memory_usage_bytes{name!=""}

# Process CPU
namedprocess_namegroup_cpu_seconds_total

Enabling Temperature Metrics

Temperature metrics require lm-sensors to be installed and configured on the host.

Install lm-sensors

Debian/Ubuntu:

sudo apt update
sudo apt install lm-sensors

Fedora/RHEL:

sudo dnf install lm_sensors

Arch Linux:

sudo pacman -S lm_sensors

Detect Sensors

Run the sensor detection wizard:

sudo sensors-detect

  • Answer YES to probe for the various sensor chips
  • Answer YES to add the detected modules to /etc/modules when prompted
  • Reboot, or load the modules without rebooting:

sudo systemctl restart systemd-modules-load.service

Verify Sensors

sensors

Expected output shows temperature readings:

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +45.0°C  (high = +100.0°C, crit = +100.0°C)
Core 0:        +43.0°C  (high = +100.0°C, crit = +100.0°C)
Core 1:        +44.0°C  (high = +100.0°C, crit = +100.0°C)

Check hwmon Files

Verify sensor data is exposed in sysfs:

ls /sys/class/hwmon/
cat /sys/class/hwmon/hwmon*/temp*_input
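
The raw temp*_input files report millidegrees Celsius. A small helper that walks the hwmon tree and prints readable per-chip readings (a sketch; the root directory is an argument so it can be pointed at any tree, and the output format is illustrative):

```shell
# print_temps: walk a hwmon tree and print each chip's temperature
# readings in °C (hwmon exposes millidegrees Celsius).
print_temps() {
  local root="${1:-/sys/class/hwmon}"
  local hw t name milli
  for hw in "$root"/hwmon*; do
    [ -d "$hw" ] || continue
    # Each hwmon directory has a "name" file identifying the chip
    name=$(cat "$hw/name" 2>/dev/null || echo unknown)
    for t in "$hw"/temp*_input; do
      [ -f "$t" ] || continue
      milli=$(cat "$t")
      printf '%s %s: %d.%d°C\n' "$name" "$(basename "$t")" \
        $((milli / 1000)) $(((milli % 1000) / 100))
    done
  done
}

print_temps "$@"
```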

Restart node_exporter

After configuring lm-sensors, restart the stack to pick up new sensors:

docker compose restart node-exporter

If Temperature Metrics Are Missing

  1. Check if hwmon is exposed:

    ls -la /sys/class/hwmon/
  2. Verify node_exporter can read hwmon:

    curl -s http://localhost:9100/metrics | grep hwmon
  3. Check for thermal_zone metrics (alternative):

    curl -s http://localhost:9100/metrics | grep thermal_zone
  4. Ensure kernel modules are loaded:

    lsmod | grep -E 'coretemp|k10temp|nct|it87'
  5. Common issues:

    • Some Mini PCs don't expose fan RPM via hwmon
    • Virtual machines typically don't have hwmon sensors
    • BIOS/UEFI settings may disable sensor reporting

Commands Reference

Start Stack

docker compose up -d

Stop Stack

docker compose down

View Logs

# All services
docker compose logs -f

# Specific service
docker compose logs -f prometheus
docker compose logs -f grafana
docker compose logs -f node-exporter

Restart Services

docker compose restart

Update Images

docker compose pull
docker compose up -d

Check Resource Usage

docker stats

Remove Everything (including data)

docker compose down -v

Data Persistence

Data is stored in Docker named volumes:

Volume            Purpose
prometheus_data   Prometheus TSDB (15-day retention)
grafana_data      Grafana config & state

To backup volumes:

docker run --rm -v ember_prometheus_data:/data -v $(pwd):/backup alpine tar czf /backup/prometheus-backup.tar.gz /data
docker run --rm -v ember_grafana_data:/data -v $(pwd):/backup alpine tar czf /backup/grafana-backup.tar.gz /data

Security Notes

Localhost Binding

All services bind to 127.0.0.1 by default, making them accessible only from the local machine. This is intentional for security.

To expose externally (not recommended without additional security):

Edit docker-compose.yml and change port bindings:

ports:
  - "0.0.0.0:3000:3000"  # Exposes to all interfaces

Secrets Management

  • Never commit .env to version control (it's in .gitignore)
  • The .env.example file shows the required format without real secrets
  • Generate strong passwords: openssl rand -base64 24

Container Privileges

  • node-exporter: Runs with host PID namespace for accurate process metrics
  • cadvisor: Runs privileged to access container metrics
  • process-exporter: Runs privileged to read /proc

These are required for accurate metrics collection.

Troubleshooting

host.docker.internal Not Resolving

On Linux, host.docker.internal requires explicit configuration. The docker-compose.yml includes:

extra_hosts:
  - "host.docker.internal:host-gateway"

If you still have issues:

  1. Verify Docker version >= 20.10
  2. Check the host gateway IP:
    docker run --rm alpine ip route | grep default
  3. Manually specify the host IP in prometheus.yml if needed

Prometheus Can't Scrape Targets

  1. Check target status: http://localhost:9090/targets
  2. Verify containers are running: docker compose ps
  3. Check container logs: docker compose logs <service>
  4. Test connectivity from Prometheus container:
    docker compose exec prometheus wget -qO- http://node-exporter:9100/metrics | head

Grafana Dashboard Shows "No Data"

  1. Verify Prometheus datasource: Grafana → Connections → Data sources → Prometheus
  2. Check if Prometheus has data: http://localhost:9090/graph
  3. Ensure time range is appropriate (default: last 1 hour)
  4. Check for metric name changes between versions

cAdvisor Memory Issues

cAdvisor can be memory-intensive. To limit:

cadvisor:
  deploy:
    resources:
      limits:
        memory: 512M

High Disk Usage

Prometheus stores 15 days of data. To reduce:

  1. Edit docker-compose.yml:
    command:
      - '--storage.tsdb.retention.time=7d'
  2. Restart Prometheus:
    docker compose restart prometheus
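
To pick a retention value, the usual capacity rule of thumb is disk ≈ retention × ingested samples per second × bytes per sample (roughly 1-2 bytes per sample after compression). A back-of-the-envelope helper (the series count and scrape interval below are assumptions; substitute your own):

```shell
# Estimate Prometheus TSDB disk usage:
#   bytes ≈ retention_days*86400 * (active_series / scrape_interval_s) * bytes_per_sample
# bytes_per_sample=2 is a conservative post-compression figure.
estimate_tsdb_bytes() {
  local days=$1 series=$2 interval=$3 bytes_per_sample=2
  awk -v d="$days" -v s="$series" -v i="$interval" -v b="$bytes_per_sample" \
    'BEGIN { printf "%.0f MiB\n", d * 86400 * (s / i) * b / (1024 * 1024) }'
}

# e.g. 15-day retention, ~1000 active series scraped every 5s:
estimate_tsdb_bytes 15 1000 5   # → 494 MiB
```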

process-exporter High Cardinality

If you have many unique processes, edit process-exporter/process-exporter.yml to group more aggressively or exclude noisy processes.
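
process-exporter only tracks processes that match a process_names entry, so the simplest way to cut cardinality is to group by executable name and list only the commands you care about. A sketch (the comm values are examples, not Ember defaults):

```yaml
# process-exporter/process-exporter.yml
process_names:
  # Track only these executables, grouped by name; everything else is ignored
  - name: "{{.Comm}}"
    comm:
      - dockerd
      - prometheus
      - grafana
```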

Customization

Changing Scrape Intervals

Edit prometheus/prometheus.yml:

scrape_configs:
  - job_name: 'node-exporter'
    scrape_interval: 10s  # Change from 5s

Adding Custom Dashboards

  1. Create JSON dashboard file in grafana/dashboards/
  2. Dashboards are auto-loaded within 30 seconds
  3. Or restart Grafana: docker compose restart grafana

Modifying Retention

Edit retention in docker-compose.yml:

prometheus:
  command:
    - '--storage.tsdb.retention.time=30d'  # 30 days instead of 15

Stack Versions

Component          Version
Prometheus         v2.51.0
Grafana            10.4.1
node_exporter      v1.7.0
cAdvisor           v0.47.2
process-exporter   0.8.2

License

This project is provided as-is for personal and educational use.

