Docker homelab orchestration: systemd lifecycle, Graylog/Prometheus/Grafana observability, validation scripts.

scscodes/Container-Controller

ct-controller

Standardized Docker container management for a homelab environment. Containers run as systemd services, with unified observability (logs + metrics) and automated lifecycle management.

Design Principles

  1. Systemd-native lifecycle: Each stack is a systemd service, enabling boot ordering, dependency management, and standard systemctl commands
  2. Centralized observability: All logs flow to Graylog; all metrics flow to Prometheus; Grafana provides unified dashboards
  3. Opt-in automation: Watchtower updates only labeled containers on a controlled schedule
  4. Explicit resource limits: Every container declares memory/CPU caps to prevent runaway usage
  5. Health-first orchestration: Services use healthchecks to gate dependent startups
  6. Programmatic validation: Python-based tooling for structured parsing, validation, and reporting

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                              Host                                       │
│                                                                         │
│  ┌──────────────────────────────────────────────────────────────────┐   │
│  │                    Application Containers                        │   │
│  │  graylog   pihole   unifi   homeassistant   openclaw             │   │
│  └──────┬──────────────────────────┬────────────────────────────────┘   │
│         │ stdout/stderr            │ metrics                            │
│         ▼                          ▼                                    │
│  ┌────────────┐             ┌───────────┐                               │
│  │ Fluent Bit │             │  cAdvisor │                               │
│  └─────┬──────┘             └─────┬─────┘                               │
│        │ GELF                     │ scrape                              │
│        ▼                          ▼                                     │
│  ┌──────────┐              ┌────────────┐                               │
│  │ Graylog  │              │ Prometheus │                               │
│  └─────┬────┘              └──────┬─────┘                               │
│        │                          │                                     │
│        └──────────────┬───────────┘                                     │
│                       ▼                                                 │
│                ┌───────────┐                                            │
│                │  Grafana  │  ← dashboards + alerting                   │
│                └───────────┘                                            │
│                                                                         │
│  Lifecycle: Systemd (boot) + Watchtower (image updates)                 │
│  Validation: Python scripts → JSON reports                              │
└─────────────────────────────────────────────────────────────────────────┘

Stacks

| Stack | Purpose | Ports |
|---|---|---|
| graylog | Log aggregation (MongoDB + OpenSearch + Graylog) | 9000, 514, 1514, 12201 |
| monitoring | Metrics pipeline (Prometheus + cAdvisor + Pushgateway + Grafana) | 3000, 9090, 9091 |
| fluentbit | Log shipper: tails Docker logs, ships to Graylog | - |
| watchtower | Automated container image updates | - |
| homeassistant | Home automation platform | 8123 (host) |
| pihole | DNS sinkhole and ad blocker | 53, 8053 (host) |
| unifi | UniFi network controller | 8443, 8080, 3478 |
| openclaw | AI agent gateway | 18789 |

Directory Layout

/opt/docker/                    # Production deployment path
├── <stack>/
│   ├── docker-compose.yml      # Stack definition
│   ├── .env                    # Secrets (not in git)
│   ├── .env.example            # Template for .env
│   ├── README.md               # Stack-specific docs
│   └── data/                   # Persistent volumes
└── scripts/                    # Python validation & management tools
    ├── validate.py             # Stack validation
    ├── audit.py                # Full infrastructure audit
    ├── healthcheck.py          # Container health checks
    ├── backup.py               # Backup management
    ├── setup.py                # Prerequisites and installation
    ├── host.py                 # Host system information
    ├── lib/                    # Core library modules
    └── templates/              # Systemd & cron templates

This repository mirrors the structure at /opt/docker/ on the target host.

Standards

Compose files are validated against the project standards. A missing healthcheck, restart policy, or resource limit is reported as an ERROR; a missing container_name or Watchtower label is reported as a WARNING. The full list of checks and severities is in docs/STANDARDS.md. Run ./scripts/validate.py to check stacks.
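
To make the severity split concrete, here is a minimal sketch of how such checks could be applied to one parsed Compose service. It is not the repository's validate.py: the function names and rule identifiers are hypothetical, and it assumes limits are declared either under deploy.resources.limits or with mem_limit/cpus, and that the opt-in label is Watchtower's standard com.centurylinklabs.watchtower.enable.

```python
# Hypothetical sketch; the real validate.py may check more, or differently.
WATCHTOWER_LABEL = "com.centurylinklabs.watchtower.enable"


def has_limits(service: dict) -> bool:
    """True if the service declares both a memory and a CPU cap."""
    limits = service.get("deploy", {}).get("resources", {}).get("limits", {})
    memory = limits.get("memory") or service.get("mem_limit")
    cpus = limits.get("cpus") or service.get("cpus")
    return bool(memory and cpus)


def has_watchtower_label(service: dict) -> bool:
    """Compose allows labels as a mapping or as a list of KEY=VALUE strings."""
    labels = service.get("labels") or {}
    if isinstance(labels, dict):
        return WATCHTOWER_LABEL in labels
    return any(str(item).split("=", 1)[0] == WATCHTOWER_LABEL for item in labels)


def validate_service(name: str, service: dict) -> list:
    """Return one finding per missing standard for a single Compose service."""
    findings = []

    def report(severity: str, rule: str) -> None:
        findings.append({"service": name, "severity": severity, "rule": rule})

    if "healthcheck" not in service:
        report("ERROR", "healthcheck")
    if "restart" not in service:
        report("ERROR", "restart")
    if not has_limits(service):
        report("ERROR", "resource_limits")
    if "container_name" not in service:
        report("WARNING", "container_name")
    if not has_watchtower_label(service):
        report("WARNING", "watchtower_label")
    return findings


if __name__ == "__main__":
    import json

    example = {"image": "nginx:1.27", "restart": "unless-stopped"}
    print(json.dumps(validate_service("web", example), indent=2))
```

Returning findings as plain dictionaries keeps the result easy to serialize, in line with the JSON-first output the scripts describe.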

Quick Reference

Validation & Audit

# Validate all stacks (JSON output)
./scripts/validate.py

# Human-readable output
./scripts/validate.py --human

# Validate specific stack
./scripts/validate.py graylog

# Full infrastructure audit
./scripts/audit.py --summary

# Port conflict check
./scripts/audit.py --ports

# Image version audit
./scripts/audit.py --images --human
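
As an illustration of what a port conflict check involves, the sketch below looks for ports claimed by more than one stack, using the port numbers from the Stacks table. It is an assumption about the general idea, not the logic of audit.py: it ignores protocol (TCP vs UDP) and bind address, and the function name is made up.

```python
from collections import defaultdict

# Published ports taken from the Stacks table.
STACK_PORTS = {
    "graylog": [9000, 514, 1514, 12201],
    "monitoring": [3000, 9090, 9091],
    "homeassistant": [8123],
    "pihole": [53, 8053],
    "unifi": [8443, 8080, 3478],
    "openclaw": [18789],
}


def find_port_conflicts(stack_ports: dict) -> dict:
    """Map each port claimed by more than one stack to the stacks claiming it."""
    owners = defaultdict(list)
    for stack, ports in stack_ports.items():
        for port in set(ports):  # a stack listing a port twice is not a conflict
            owners[port].append(stack)
    return {port: sorted(stacks) for port, stacks in owners.items() if len(stacks) > 1}


if __name__ == "__main__":
    print(find_port_conflicts(STACK_PORTS))
    # "demo" is a made-up stack used only to trigger a conflict on 9000.
    print(find_port_conflicts({**STACK_PORTS, "demo": [9000]}))
```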

Service Management

# Start/stop/restart
sudo systemctl start docker-compose@<stack>
sudo systemctl stop docker-compose@<stack>
sudo systemctl restart docker-compose@<stack>

# Enable at boot
sudo systemctl enable docker-compose@<stack>

# View logs
sudo journalctl -u docker-compose@<stack> -f

Container Health

# Check all containers (JSON)
./scripts/healthcheck.py

# Human-readable with failures only
./scripts/healthcheck.py --human --quiet

# With metrics push
./scripts/healthcheck.py --push-metrics --send-log
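
For readers curious how container health can be derived, the following sketch classifies entries from `docker inspect` JSON output and keeps only the failures, similar in spirit to a quiet mode. It is not healthcheck.py itself; the status names "down" and "no-healthcheck" are invented for this example. It relies on the fact that `State.Health` only appears for containers that define a healthcheck.

```python
import json


def health_of(container: dict) -> str:
    """Classify one `docker inspect` entry."""
    state = container.get("State", {})
    if state.get("Status") != "running":
        return "down"
    health = state.get("Health")
    if health is None:
        return "no-healthcheck"
    return health.get("Status", "unknown")  # healthy | unhealthy | starting


def failures(inspect_output: str) -> list:
    """Return only the containers that are not reported as healthy."""
    report = []
    for container in json.loads(inspect_output):
        status = health_of(container)
        if status != "healthy":
            # docker inspect prefixes container names with "/"
            name = container.get("Name", "").lstrip("/")
            report.append({"name": name, "status": status})
    return report


if __name__ == "__main__":
    # Hand-written sample in the shape of `docker inspect <container>...` output.
    sample = json.dumps([
        {"Name": "/graylog", "State": {"Status": "running", "Health": {"Status": "healthy"}}},
        {"Name": "/pihole", "State": {"Status": "running", "Health": {"Status": "unhealthy"}}},
        {"Name": "/unifi", "State": {"Status": "exited"}},
    ])
    print(json.dumps(failures(sample), indent=2))
```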

Backup

# Backup configurations
./scripts/backup.py

# Backup with data
./scripts/backup.py --data

# List existing backups
./scripts/backup.py --list --human

Host Information

# Full system report (JSON)
./scripts/host.py

# Human-readable
./scripts/host.py --human

# Specific sections
./scripts/host.py --hardware    # CPU, memory, disk
./scripts/host.py --docker      # Docker daemon info
./scripts/host.py --services    # Systemd compose services

Setup & Prerequisites

# Check prerequisites
./scripts/setup.py

# Install/fix issues
./scripts/setup.py --install

Observability

Reference: docs/OBSERVABILITY.md.

What Where
Logs Graylog UI (:9000) or docker logs <container>
Metrics Grafana (:3000) or Prometheus (:9090)
Script metrics Pushgateway (:9091)
Container stats docker stats

External Ingress

Send logs and metrics from scripts, external apps, or ad-hoc debugging sessions.

Logs → Graylog

# Using Python library
from scripts.lib.observability import log_info
log_info("Operation completed", facility="myapp", duration_ms=150)

# Direct curl to GELF HTTP
curl -X POST -H "Content-Type: application/json" \
  -d '{"version":"1.1","host":"myhost","short_message":"Hello"}' \
  http://localhost:12201/gelf
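
If neither the project library nor curl is available, the same GELF HTTP call can be made with the Python standard library alone. The sketch below is an assumption about what such a helper might look like, not the implementation of log_info: build_gelf and send_gelf are hypothetical names. It follows the GELF 1.1 convention that custom fields carry a leading underscore.

```python
import json
import socket
import time
import urllib.request

GELF_URL = "http://localhost:12201/gelf"  # same endpoint as the curl example


def build_gelf(short_message: str, level: int = 6, **fields) -> dict:
    """Build a GELF 1.1 payload; custom fields get the required "_" prefix."""
    payload = {
        "version": "1.1",
        "host": socket.gethostname(),
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity, 6 = informational
    }
    for key, value in fields.items():
        if key == "id":
            raise ValueError("GELF does not allow the additional field '_id'")
        payload[f"_{key}"] = value
    return payload


def send_gelf(payload: dict, url: str = GELF_URL, timeout: float = 3.0) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    message = build_gelf("Operation completed", facility="myapp", duration_ms=150)
    print(json.dumps(message, indent=2))
    # send_gelf(message)  # needs a Graylog GELF HTTP input listening on :12201
```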

Metrics → Pushgateway

# Using Python library
from scripts.lib.observability import metric_gauge
metric_gauge("myapp_items", 42, labels={"env": "prod"})

# Direct curl
echo 'myapp_items 42' | curl --data-binary @- http://localhost:9091/metrics/job/myapp
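
The curl push can also be reproduced with the standard library. The sketch below formats one sample in the Prometheus text format and posts it to the Pushgateway job endpoint. It is an illustration only and not how metric_gauge is implemented: format_sample and push are hypothetical names.

```python
import urllib.request
from typing import Optional
from urllib.parse import quote

PUSHGATEWAY = "http://localhost:9091"  # same endpoint as the curl example


def format_sample(name: str, value: float, labels: Optional[dict] = None) -> str:
    """Render one sample in the Prometheus text format, newline-terminated."""
    label_text = ""
    if labels:
        pairs = []
        for key, raw in sorted(labels.items()):
            # Escape backslash, double quote and newline inside label values.
            escaped = str(raw).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            pairs.append(f'{key}="{escaped}"')
        label_text = "{" + ",".join(pairs) + "}"
    return f"{name}{label_text} {value}\n"


def push(job: str, body: str, base_url: str = PUSHGATEWAY, timeout: float = 3.0) -> int:
    """POST the samples to the job's group and return the HTTP status code."""
    request = urllib.request.Request(
        f"{base_url}/metrics/job/{quote(job, safe='')}",
        data=body.encode("utf-8"),
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    body = format_sample("myapp_items", 42, {"env": "prod"})
    print(body, end="")
    # push("myapp", body)  # needs a Pushgateway listening on :9091
```

The trailing newline matters: the Pushgateway rejects a body whose last line is not newline-terminated, which is why the curl example pipes from echo.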

Initial Setup

# 1. Check prerequisites
./scripts/setup.py

# 2. Install/configure (run fixes)
./scripts/setup.py --install

# 3. Create shared network
docker network create monitoring_net

# 4. Deploy stacks in dependency order
sudo systemctl enable --now docker-compose@graylog
sudo systemctl enable --now docker-compose@fluentbit
sudo systemctl enable --now docker-compose@monitoring
sudo systemctl enable --now docker-compose@watchtower
# ... then application stacks

# 5. Validate
./scripts/validate.py --human

See each stack's README for specific setup instructions.
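
The "dependency order" in step 4 can be derived mechanically once dependencies are written down. The sketch below uses the standard library's graphlib for that. The dependency map is an assumption made for illustration (only "fluentbit ships to Graylog" comes from this README), and myapp is a placeholder application stack.

```python
from graphlib import TopologicalSorter

# Assumed dependencies, for illustration only: each stack maps to the stacks
# that should already be running before it starts.
DEPENDS_ON = {
    "graylog": [],
    "fluentbit": ["graylog"],              # ships logs to Graylog
    "monitoring": [],
    "watchtower": [],
    "myapp": ["fluentbit", "monitoring"],  # hypothetical application stack
}


def start_order(depends_on: dict) -> list:
    """Return the stacks so that every stack comes after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def enable_commands(order: list) -> list:
    """Render the systemctl command for each stack, in start order."""
    return [f"sudo systemctl enable --now docker-compose@{stack}" for stack in order]


if __name__ == "__main__":
    for command in enable_commands(start_order(DEPENDS_ON)):
        print(command)
```

Stacks with no dependency between them may come out in any relative order; only the before/after constraints are guaranteed.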

Security

  • .env files contain secrets; never commit them (see .gitignore)
  • .env permissions should be 600
  • Containers needing Docker socket (/var/run/docker.sock) are explicitly documented
  • Resource limits prevent denial-of-service from runaway containers
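
The 600 rule for .env files is easy to check programmatically. The following is a small sketch under the assumption that each stack keeps its .env directly inside its own directory, as in the directory layout. The function name is hypothetical and this is not part of the project's scripts.

```python
import stat
from pathlib import Path


def env_mode_problems(root: Path) -> list:
    """List .env files one level below root whose mode is not exactly 600."""
    problems = []
    for env_file in sorted(root.glob("*/.env")):
        mode = stat.S_IMODE(env_file.stat().st_mode)
        if mode != 0o600:
            problems.append(f"{env_file}: mode {mode:03o}, expected 600")
    return problems


if __name__ == "__main__":
    root = Path("/opt/docker")
    if root.is_dir():
        for line in env_mode_problems(root):
            print(line)
```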

User and Group Access

The project uses a dedicated service account for file ownership and a group-based access model for operators.

| User/Group | Purpose |
|---|---|
| docker-services | Service account that owns /opt/docker. System user (no login shell). |
| docker | Docker daemon group. Required to run docker commands. |

Operator access: Add your user to both groups to manage the project without sudo:

# Add user to required groups
sudo usermod -aG docker-services $USER
sudo usermod -aG docker $USER

# Apply to the current shell (or log out and back in so both groups take effect)
newgrp docker-services

Directory permissions: /opt/docker must have group write and setgid:

| Permission | Purpose |
|---|---|
| g+w | Group members can create/modify files |
| g+s (setgid) | New files inherit docker-services group |

Fix permissions if needed:

sudo chmod -R g+w /opt/docker
sudo find /opt/docker -type d -exec chmod g+s {} \;

Or use setup.py:

sudo ./scripts/setup.py --fix

Verify access:

# Should show docker-services and docker in groups
id $USER

# Should be able to create files without sudo
touch /opt/docker/test && rm /opt/docker/test
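
The group-write and setgid requirements can also be audited without changing anything. The sketch below walks a tree and reports directories that lack either bit. It is an illustration with hypothetical function names, not what setup.py does, and it only inspects directories, not files.

```python
import os
import stat


def missing_dir_bits(mode: int) -> list:
    """Return which of g+w / g+s are absent from a directory's st_mode."""
    missing = []
    if not mode & stat.S_IWGRP:
        missing.append("g+w")
    if not mode & stat.S_ISGID:
        missing.append("g+s")
    return missing


def audit_tree(root: str) -> dict:
    """Walk root and report directories lacking group write or setgid."""
    report = {}
    for dirpath, _dirnames, _filenames in os.walk(root):
        missing = missing_dir_bits(os.stat(dirpath).st_mode)
        if missing:
            report[dirpath] = missing
    return report


if __name__ == "__main__":
    for path, bits in audit_tree("/opt/docker").items():
        print(f"{path}: missing {', '.join(bits)}")
```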

Adding a New Stack

  1. Create the stack directory with the required files:

    mkdir -p myapp
    touch myapp/docker-compose.yml myapp/.env.example myapp/README.md

  2. Edit docker-compose.yml to meet the required standards (healthcheck, restart, limits, labels)

  3. Create .env from .env.example

  4. Create data directories:

    sudo mkdir -p /opt/docker/myapp/data
    sudo chown -R docker-services:docker-services /opt/docker/myapp

  5. Enable and start:

    sudo systemctl enable --now docker-compose@myapp

  6. Validate:

    ./scripts/validate.py myapp

Cron Jobs

Scheduled maintenance is installed via ./scripts/setup.py --install. Full schedule and commands: docs/CRON.md.

Related Documentation

Evergreen reference: docs/*.md (one doc per topic).
