
My Backup Strategy


Summary

This article lists some ways I'm planning to back up the data on my CloudPi. You can use this information as you see fit. If you want to do something different, that's okay. It's your data and you're responsible for it. Do what's best for you.

Do not rely on information in here. This is a work in progress. Nothing has been tested.

My idea for backing up data is to use an older Pi as a secondary device. This is not a high-availability configuration. Rather than running Docker apps and everything else, this secondary Pi will just keep copies of the files I don't want to lose on another external hard drive.

Understanding the Backup Procedures

For my backup strategy, I'm using the sqlite3 command-line tool to dump any SQLite databases to flat files. I'm then using rsync to make copies of files. I'm using Ansible to take care of running the commands.

This approach has good points and bad points.

The Good

  • The sqlite3 and rsync tools are already installed.
  • Ansible playbooks make it easy for me to configure and run.
  • The backup device does not need to be very powerful.

The Bad

  • The files on the secondary get overwritten each time the backup is run. If a file got corrupted last week, there's no way for me to recover it.
  • To ensure files are not changing while the backup is running, the applications need to be stopped.
  • If something is misconfigured or the backup job doesn't run, I won't get notified, so I need to check it periodically.

Configuring a Second Pi

TODO

Planning the Individual Application Backups

Each application is a little different, so there's no single backup strategy that fits them all. But there are some common steps, and a generic sketch of the pattern follows the list.

  1. If the application makes frequent updates to files, it should be stopped first.
  2. If there is a SQL database, it should be dumped to a flat file.
  3. Any configuration and user data should be copied to the backup device.
  4. The application can be started again.
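
Concretely, the generic pattern looks roughly like this as shell commands. Here, "myapp" and its database file are placeholders rather than anything from my actual setup; the application sections below use the real names and paths.

# stop the application so its files stop changing
docker stop myapp
# dump its SQLite database (if it has one) to a flat file
sqlite3 /opt/docker/myapp/myapp.db .dump >/opt/docker/myapp/myapp.db.dump
# copy configuration and data to the backup Pi
sudo rsync -avz /opt/docker/myapp root@backup-pi:/media/backup/primary-pi/opt/docker
# start the application again
docker start myapp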

Nextcloud

Nextcloud holds copies of files on my laptop that I don't want to lose or that I want to make available across all my devices. My primary concern is to have copies of those files, and they are stored in /srv/cloud. Nextcloud also keeps a database with metadata and configuration information, and there is more configuration stored in files. That is of secondary importance to me.

If something happens to the primary Pi and Nextcloud and all its data go missing, my recovery procedure would probably look like this:

  1. Redeploy the Nextcloud container using the file-sharing.yml file.
  2. Reconfigure Nextcloud using the web interface.
  3. Re-sync the files from my laptop.

Because I'm using LDAP user accounts for authentication, the information in the Nextcloud database is not that critical to me. But the Nextcloud documentation does give instructions on how to save and restore the database contents, so I will build that into the process... just in case.
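
For what it's worth, restoring one of those dumps would roughly mean feeding the SQL back into sqlite3 against a fresh database file. This is only an untested sketch, using the owncloud.db and dump file names that appear in the commands further down:

# move the damaged database aside, then rebuild it from the dump
mv /srv/cloud/owncloud.db /srv/cloud/owncloud.db.broken
sqlite3 /srv/cloud/owncloud.db </srv/cloud/owncloud.db.dump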

Reference Procedure

The official documentation on Nextcloud backup is at:

https://docs.nextcloud.com/server/20/admin_manual/maintenance/backup.html

Applied Procedure

Condensing the steps outlined in the official documentation, I've got five basic tasks. Substituting in the correct paths for my Pi's setup, they look like this:

  1. Put Nextcloud in maintenance mode, using Nextcloud's occ command-line tool from inside the container.
  2. Dump the contents of the SQLite database in /srv/cloud to a flat file.
  3. Make a copy of the apps, configuration, and themes in /opt/docker/nextcloud.
  4. Make a copy of the database and user files stored in /srv/cloud.
  5. Take Nextcloud out of maintenance mode.

Here's what the commands look like for those steps:

docker exec -u www-data nextcloud php occ maintenance:mode --on
sqlite3 /srv/cloud/owncloud.db .dump >/srv/cloud/owncloud.db_$(date -I).dump
sudo rsync -avz /opt/docker/nextcloud root@backup-pi:/media/backup/primary-pi/opt/docker
sudo rsync -avz /srv/cloud root@backup-pi:/media/backup/primary-pi/srv
docker exec -u www-data nextcloud php occ maintenance:mode --off 

The date -I part of the sqlite dump creates a file with a date code in its name. Each day the command is run, it creates a new file. This can be a nuisance after a while without proper pruning. So in my case, I'm simply running sqlite3 owncloud.db .dump >owncloud.db.dump and overwriting the backup file each time.
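
If I did want to keep the dated dumps, one simple way to prune old ones would be a find command with a retention window. This is just an untested sketch, and the 30-day window is an arbitrary number I picked:

# delete date-stamped dumps older than 30 days
find /srv/cloud -name 'owncloud.db_*.dump' -mtime +30 -delete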

Home Assistant

Home Assistant provides a central dashboard of information gathered from other sources. As such, the only data it really holds is for configuration and historical trends gathered from attached devices. Personally, I don't care about the trend data so much that I would go through an even moderately complex recovery process. The configuration, however, would take some effort to recreate, so a backup copy is important.

Reference Procedure

I found no official documentation on how to back up Home Assistant running in a Docker container. Everything references the Supervisor, which is a feature of the Home Assistant Raspberry Pi appliance.

It appears most of the configuration is stored in .yaml files. Looking at the database, most of the tables seem to be events and statistics, so I doubt there is any configuration in the db. But it's SQLite, and dumping the database is not difficult.

Applied Procedure

This process is nearly identical to the way I'm backing up Nextcloud.

  1. Stop the Home Assistant container using a docker command.
  2. Dump the contents of the SQLite database to a flat file with sqlite3.
  3. Back up the data in /opt/docker/homeassistant.
  4. Start the Home Assistant container.

Here's what the command-line procedure would look like:

docker stop homeassistant
sqlite3 /opt/docker/homeassistant/home-assistant_v2.db .dump >/opt/docker/homeassistant/home-assistant_v2.db.dump
sudo rsync -avz /opt/docker/homeassistant root@backup-pi:/media/backup/primary-pi/opt/docker
docker start homeassistant

Mosquitto

Mosquitto is an MQTT message broker. Data is mostly passing through and not sticking around. Therefore, backing it up is mostly a matter of making sure the configuration and any username/password credentials are saved.

Reference Procedure

I found no information for backing up Eclipse Mosquitto. Realistically, the recovery process is probably just reinstalling. There are two directories of interest in /opt/docker/mosquitto.

  • The config directory holds the configuration file and password database. But both were created by the Ansible playbook I used to install Mosquitto, so nothing is really lost unless I've customized these files.
  • The data directory probably holds the values for any persistent topics in MQTT. It's not a regular database file, so it can't be dumped like a SQLite3 db.

Applied Procedure

If the data is important enough to keep (rather than just reinstalling Mosquitto and starting over), the process would be as follows:

  1. Stop the container with Docker.
  2. Back up the files in /opt/docker/mosquitto/config and /opt/docker/mosquitto/data.
  3. Start the container.

Here's what the commands look like:
docker stop mosquitto
sudo rsync -avz /opt/docker/mosquitto root@backup-pi:/media/backup/primary-pi/opt/docker
docker start mosquitto

NodeRED

Reference Procedure

There are plenty of sites with instructions for manual export and import of flows from the NodeRED application interface, but I found no official information about automated backups at the file-system level. There does not appear to be a database, only some .json files. (You can think of this as another example of a non-SQL database.)

Applied Procedure

I decided to apply the same stop-container, copy-data, start-container method used with the other applications.

  1. Stop the NodeRED container.
  2. Back up the data in /opt/docker/nodered.
  3. Start the NodeRED container.

Here are the commands to do it:

docker stop nodered
sudo rsync -avz /opt/docker/nodered root@backup-pi:/media/backup/primary-pi/opt/docker
docker start nodered

Here's an Ansible playbook that can be run on the secondary host:

backup/node-red.yml

Samba

Reference Procedure

TODO: The media and public shares are read-only and don't need the container to be stopped. The shared share is read-write, so the container may need to be stopped before copying it.

Applied Procedure

  1. rsync /srv/public and /srv/media.
  2. Stop the container.
  3. rsync /srv/shared.
  4. Start the container.

Here's what the commands look like:
sudo rsync -avz /srv/media root@backup-pi:/media/backup/primary-pi/srv
sudo rsync -avz /srv/public root@backup-pi:/media/backup/primary-pi/srv
docker stop samba
sudo rsync -avz /srv/shared root@backup-pi:/media/backup/primary-pi/srv
docker start samba

Here is a playbook to do it:

backup/samba.yml

Gitea

TODO: Stop, rsync, start
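
Pending the real procedure, here's a sketch of what that would presumably look like, following the same pattern as the other containers and assuming Gitea's data is bind mounted at /opt/docker/gitea (a path I haven't verified):

docker stop gitea
# if Gitea is using a SQLite database, it could be dumped here as well, like the others
sudo rsync -avz /opt/docker/gitea root@backup-pi:/media/backup/primary-pi/opt/docker
docker start gitea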

Nginx

TODO: mostly read-only. just rsync it
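
A sketch of what that might look like, assuming the Nginx configuration is bind mounted at /opt/docker/nginx (another path I haven't confirmed):

sudo rsync -avz /opt/docker/nginx root@backup-pi:/media/backup/primary-pi/opt/docker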

backup/nginx.yml

ESPHome

ESPHome keeps copies of the YAML configuration files used to generate firmware in its /config directory. That seems to be the only persistent data.

Reference Procedure

The FAQ only mentions that YAML files should be backed up, but not how they should be backed up.

Applied Procedure

All the YAML files are stored in the ESPHome container's /config directory, which is bind mounted to /opt/docker/esphome/config on the Pi. Copying this directory should be sufficient for backup. Stopping the container first will ensure no one is making changes to files as they're being backed up.

The steps below should look very familiar.

  1. Stop the container with Docker.
  2. Back up the files in /opt/docker/esphome/config and /opt/docker/esphome/data.
  3. Start the container.

Here's what the commands look like:
docker stop esphome
sudo rsync -avz /opt/docker/esphome root@backup-pi:/media/backup/primary-pi/opt/docker
docker start esphome

Here's an Ansible playbook that can be run from the secondary host:

backup/esphome.yml