
🖴 mipro98/infra 🖴

An Ansible playbook for a docker-based homelab with automatic maintenance and monitoring.


Flexible docker-compose templates for various services like Nextcloud, Vaultwarden, Gitea, Prometheus+Grafana,...

All services are set up ready-to-use with a Traefik reverse proxy and automatic TLS certificates.

Unattended server maintenance with auto updates, S.M.A.R.T. monitoring, BTRFS snapshots and email notifications.


Features

This playbook includes tasks for:

  • Setting up the host system (packages, user, mounts,...).
  • Deploying all docker services according to their docker-compose templates.
  • Setting up fully automatic maintenance tasks like:
    • BTRFS snapshots and backups to a different drive using the btrbk utility
    • BTRFS scrubbing
    • S.M.A.R.T. monitoring with email notifications
    • automatic system and container updates.

Most configuration is done through variables defined in group_vars or in vault.yml. Almost all configuration files and docker-compose.yml files are generated from templates according to the variables you set.

This repo contains ready-to-use container setups for services such as Nextcloud, Vaultwarden, Gitea and Prometheus + Grafana. All of these services are set up to be exposed through Traefik on different domains simply by providing the domain names in vault.yml.

Quick start

  1. Clone the repo.
  2. Run ./git-init.sh
  3. (optionally: run source unlock-bw.sh)
  4. Deploy (see below)

Deploy everything:

make deploy
  • Deploy only the host OS setup: make system
  • Deploy only docker-compose changes: make containers
  • Deploy only maintenance changes: make maintenance
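
These Makefile targets presumably wrap plain ansible-playbook invocations along the following lines (a sketch only; the playbook file name and the tag names are assumptions, not copied from the Makefile):

    ansible-playbook main.yml --vault-password-file ./vault-pass.sh                    # make deploy
    ansible-playbook main.yml --tags system --vault-password-file ./vault-pass.sh      # make system
    ansible-playbook main.yml --tags containers --vault-password-file ./vault-pass.sh  # make containers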

Ansible-Vault

This repo uses ansible-vault to encrypt secret variables used by Ansible (passwords, domain names, etc.). This affects the file vars/vault.yml; a template can be found in vars/vault.yml.template. The password for the encryption is obtained via bitwarden-cli in the script vault-pass.sh, so you can encrypt and decrypt by entering just your Bitwarden master password:

make encrypt
make decrypt
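
Under the hood, these presumably expand to standard ansible-vault calls, roughly like this sketch (assuming vault-pass.sh is passed as the vault password script):

    ansible-vault encrypt vars/vault.yml --vault-password-file ./vault-pass.sh
    ansible-vault decrypt vars/vault.yml --vault-password-file ./vault-pass.sh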

Always remember to encrypt the vault before committing! Also, be sure to run the script git-init.sh after cloning to install a pre-commit hook which prevents committing the unencrypted vault!
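
The exact hook contents are defined by git-init.sh; conceptually it only needs to verify the ansible-vault header before allowing a commit, roughly like this sketch:

    #!/usr/bin/env bash
    # Pre-commit sketch: refuse to commit vars/vault.yml if it is not encrypted.
    if git diff --cached --name-only | grep -qx 'vars/vault.yml'; then
        if ! head -n 1 vars/vault.yml | grep -q '^\$ANSIBLE_VAULT;'; then
            echo "vars/vault.yml is not encrypted -- run 'make encrypt' first." >&2
            exit 1
        fi
    fi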


Roles

The repo contains three Ansible roles:

  1. system
  2. containers
  3. maintenance

system role

  • installs all packages defined in extra_packages
  • sets correct permissions for docker
  • sets fstab mount points for a "NAS" and a "BACKUP" drive
  • sets up smartd with a sane monitoring configuration and email alerts
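
Once this role has run, the monitoring can be checked with standard smartmontools commands (a hedged example; the device name is just a placeholder):

    systemctl status smartd     # smartd should be enabled and running
    smartctl -H /dev/sda        # quick health verdict for a single drive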

Note: SSH configuration is skipped for now.

containers role

  • Creates the folder structure for docker-compose files according to the enabled services (see the illustrative layout after this list)
  • Deploys the docker-compose.yml templates for the services into their respective folders
  • Applies some Nextcloud-specific configuration if Nextcloud is enabled
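
The resulting layout on the host looks roughly like this (illustrative only, assembled from paths mentioned elsewhere in this README; actual contents depend on the enabled services):

    ~/docker/
    ├── docker-compose.yml        # "master" compose file including all enabled services
    ├── nextcloud/
    │   ├── docker-compose.yml
    │   └── Dockerfile            # custom image with ffmpeg, imagemagick, ghostscript
    └── <service>/
        └── docker-compose.yml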

maintenance role

At the heart of the maintenance role is the script mpserver-maintenance.sh, which is deployed by Ansible. Additionally:

  • A systemd timer is set up which triggers the above script every night at 2 A.M.
  • The btrbk.conf file is deployed
  • An uptime-monitor.sh script is deployed which continuously checks whether the server is online. When it is offline, it optionally executes an SSH command (I use it to restart my router). After a successful reconnect, an email is sent reporting the downtime.
  • A systemd service is installed to start the uptime-monitor.sh script on boot.
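
After deployment, the timer and the monitor can be inspected with the usual systemd tooling (the unit names below are assumptions derived from the script names):

    systemctl list-timers                      # the maintenance timer should show its next run at 02:00
    systemctl status uptime-monitor.service    # assumed unit name for uptime-monitor.sh
    journalctl -u uptime-monitor.service -e    # recent output of the monitor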

Refer to section Automatic Maintenance for more details.


Automatic Maintenance

The custom bash script mpserver-maintenance.sh gets triggered every night at 2 A.M. by systemd. Based on the time (or command line flags), it decides what actions to take:

  • monthly tasks (when the script is triggered on the last Sunday of a month)
  • weekly tasks (when the script is triggered on a Sunday other than the last Sunday of a month)
  • daily tasks (when triggered on any other day between 0 A.M. and 6 A.M.)

The script can be triggered manually for daily, monthly or weekly tasks. Refer to sudo ./mpserver-maintenance.sh --help.

Usage: ./mpserver-maintenance.sh [<options>] [<command>]

Options:
	--dry-run	Don't execute anything, only show what would be executed
	--no-email	Don't send an email with the log
	--help		Show this help message and exit.

Commands:
	daily		Run daily maintenance tasks
	weekly		Run weekly maintenance tasks
	monthly	Run monthly maintenance tasks

	If no command is passed, the script will determine the correct schedule and run the appropriate tasks.

The tasks are defined in the bash functions weekly, monthly, daily. Each trigger writes a detailed log into ~/maintenance/logs which also gets emailed for monthly and weekly triggers.
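
The schedule detection boils down to a few date checks; a minimal sketch of the logic described above (not the actual script code) could look like this:

    # Decide which of the bash functions daily/weekly/monthly to call (sketch).
    dow=$(date +%u)    # day of week, 7 = Sunday
    dom=$(date +%d)    # day of month
    last=$(date -d "$(date +%Y-%m-01) +1 month -1 day" +%d)   # last day of this month

    if [ "$dow" -eq 7 ] && [ $((10#$dom + 7)) -gt $((10#$last)) ]; then
        monthly        # last Sunday of the month
    elif [ "$dow" -eq 7 ]; then
        weekly
    else
        daily          # the real script additionally checks for the 0-6 A.M. window
    fi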

Daily tasks:

  1. Snapshot the subvolume "Daten" through btrbk

Weekly tasks:

  1. back up ~/dockerdata to a backup location via rsync
  2. snapshot and back up all configured subvolumes according to btrbk.conf
  3. update the host system
  4. send an email with the detailed log.

Monthly tasks:

  1. back up ~/dockerdata to a backup location via rsync
  2. snapshot and back up all configured subvolumes according to btrbk.conf
  3. update all containers (docker-compose pull for all services)
  4. prune old docker images and volumes
  5. btrfs-scrub both the NAS and the BACKUP drive
  6. update the host system
  7. send an email with the detailed log.
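
For orientation, the weekly/monthly steps roughly correspond to commands like the following (a hedged sketch; mount points, paths and options are assumptions, not copied from the script):

    rsync -a --delete ~/dockerdata/ /mnt/backup/dockerdata/             # assumed backup path
    btrbk run                                                           # snapshot + backup per btrbk.conf
    docker-compose -f ~/docker/docker-compose.yml pull                  # monthly: pull images for all services
    docker-compose -f ~/docker/docker-compose.yml up -d
    docker image prune -af && docker volume prune -f                    # monthly: prune old images and volumes
    btrfs scrub start -B /mnt/nas && btrfs scrub start -B /mnt/backup   # monthly: scrub both drives
    pacman -Syu --noconfirm                                             # update the host system (Arch-based host assumed)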

Nextcloud stack

The Nextcloud stack is carefully fine-tuned for performance and simplicity using:

  • A custom-built Nextcloud image with ffmpeg, imagemagick and ghostscript for media previews.
  • MariaDB for a performant database
  • Redis for fast memory caching and Transactional File Locking (I personally use APCu for caching and Redis only for File Locking)
  • A dedicated go-vod container with access to /dev/dri for hardware acceleration in Nextcloud Memories.
  • A dedicated Nextcloud cron container with a mounted crontab file so you can easily add cronjobs without touching the host system's cron or systemd.
  • A dedicated Collabora container for performant Nextcloud Office integration.

Notes:

  • The nextcloud container will not auto-update unless you run docker-compose build --pull nextcloud inside ~/docker/nextcloud because the container is built using ~/docker/nextcloud/Dockerfile in order to have ffmpeg support. Since Nextcloud updates often require manual intervention and can easily be discovered through the admin panel, this isn't that much of an issue.
  • I use Nextcloud Memories alongside Preview Generator and Recognize without issues and with hardware transcoding using the separate go-vod instance. To make it work, adjust the following in the Memories admin GUI:
    • enable "Images", "HEIC" and "Videos" under "File Support"
    • define /usr/bin/ffmpeg / /usr/bin/ffprobe as ffmpeg / ffprobe path
    • enable "external transcoder" and just write go-vod:47788 under "Connection address" (The Traefik DNS will resolve go-vod correctly within the docker proxy network).
    • tick "Enable acceleration with VA-API" to on. (ignore the warning "VA-API device not found")
    • (note that I added the cronjob for preview generator in the mounted crontab-www-data file.)
  • To enter the Nextcloud container (e.g. to run occ commands), run:
    docker exec -itu www-data nextcloud bash
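
    One-off occ commands can also be run without opening an interactive shell, for example (a hedged example; it assumes the container's working directory is the Nextcloud root, as in the official image):

    docker exec -u www-data nextcloud php occ status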

Notes & Tips

  • for ulogger:
    • The container is set up to use SQLite
    • Before the first run: either use a named volume [1] or start the container once with a bind mount at a path other than /data and copy the container's /data contents onto the host to bootstrap the database.
  • The makefile supports shorthands for often used Ansible commands. For example, when only modifying mpserver-maintenance.sh, just run make script and it will only replace the script on the server.
  • The ~/docker/docker-compose.yml serves as a "master" docker-compose.yml file that includes all activated services. You can use it to execute actions on all containers at once with a single docker-compose <up|down|...> command (see the example after this list).
  • Make sure to also re-deploy the maintenance script when adding/removing docker services in group_vars, since the script changes according to the enabled services. [2]
  • Enable debug_ports_open: true to open ports on the host, bypassing the reverse proxy. This can be used for debugging, e.g. Traefik metrics and node_exporter are usually not accessible outside of traefik-net.
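
For example, pulling new images and restarting every enabled service through the master file (as referenced in the list above):

    cd ~/docker
    docker-compose pull      # pulls images for all included services
    docker-compose up -d     # recreates any container whose image or config changed
    docker-compose ps        # overview of all running services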

TODOs

  • Maintenance: Run all scrubbing tasks in parallel.
  • Maintenance: Send an email when scrubbing starts and another one when scrubbing has ended, containing possible errors.
  • Maintenance: Better log for transferred files during snapshots & backups with btrbk and rsync.
  • Maintenance: Email formatting with code blocks so that >> doesn't get interpreted as a quote by some email clients.
  • Add overall downtime monitoring with an alert when the server comes back online (and possibly when it is down, if monitoring is done from another host).
  • Improve Grafana dashboards.
  • Maintenance: Auto Reboot when Kernel has updated.
  • Add an alerting system for suspicious events like access from a specific country or a DDoS attempt.
  • Maintenance: Better system update / pacman / paru logging (do not log whole stdout).
  • Docker-Compose: make local compose invocations work interchangeably with master-compose invocations (since the com.docker.compose.project and com.docker.compose.project.working_dir labels don't match) (fixed using )

Credits

A large portion of the repository is inspired by Alex Kretzschmar's infra repo as well as Wolfgang (notthebee)'s infra repo. Also, both the BTRFS maintenance and the system setup were largely inspired by zilexa's Homeserver Guide.

Thanks also go to Jeff Geerling's security role, from which I took almost all of my (not yet activated) SSH configuration.

Footnotes

  1. https://stackoverflow.com/q/65176940

  2. Not needed anymore since we can use the new include mechanism to generate one "master" docker-compose.yml.