Set timeline.service as Type=oneshot to allow serial ordering #392

Open
wants to merge 1 commit into
base: master

@xenithorb xenithorb commented Feb 6, 2018

systemd starts units in parallel unless the unit files specify otherwise. A Type=simple service counts as started as soon as its process has been spawned, so any unit that uses WantedBy=snapper-timeline.service and is ordered After= it will run in parallel with snapper instead of waiting until snapper is finished. By changing simple to oneshot, such units will actually run after snapper-timeline.service has finished. When snapper-timeline.service is installed by itself this has no effect, and the unit continues to operate as before for anyone who doesn't have any dependent services.
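
As a rough sketch of what the change amounts to, the [Service] section of the unit would look something like the fragment below. The ExecStart path is an assumption (the logs only show that a process named systemd-helper runs the timeline) and may differ between distributions.

```ini
# timeline.service (sketch only, not the full unit file)
# Assumption: the helper lives at /usr/lib/snapper/systemd-helper.
[Service]
# Before this PR: Type=simple, where systemd treats the unit as started
# as soon as the process has been spawned.
# After this PR: Type=oneshot, where the unit only counts as started
# once the process has exited.
Type=oneshot
ExecStart=/usr/lib/snapper/systemd-helper --timeline
```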

Example Before:

Feb 06 05:00:01 localhost systemd[1]: Started Timeline of Snapper Snapshots.
Feb 06 05:00:01 localhost systemd[1]: Started Borg backup post snapper snapshots.
Feb 06 05:00:01 localhost systemd-helper[15627]: running timeline for 'home'.
Feb 06 05:00:01 localhost systemd-helper[15627]: running timeline for 'root'.
Feb 06 05:00:01 localhost systemd-helper[15627]: running timeline for 'usr'.
Feb 06 05:00:02 localhost snapper2borg.sh[15628]: borg create -x -s -C auto,zstd /mnt/external1/backups/home::home-snapshot84 /tmp/borg/fedora-home
Feb 06 05:00:02 localhost systemd-helper[15627]: running timeline for 'var'.
Feb 06 05:00:02 localhost snapper2borg.sh[15628]: borg create -x -s -C auto,zstd /mnt/external1/backups/root::root-snapshot84 /tmp/borg/fedora-root
Feb 06 05:00:02 localhost snapper2borg.sh[15628]: borg create -x -s -C auto,zstd /mnt/external1/backups/usr::usr-snapshot83 /tmp/borg/fedora-usr
Feb 06 05:00:02 localhost snapper2borg.sh[15628]: borg create -x -s -C auto,zstd /mnt/external1/backups/var::var-snapshot83 /tmp/borg/fedora-var

Note how the two units' outputs are interleaved because they are started in parallel. Also note the actual issue: my script picks up two older-numbered snapshots (usr-snapshot83 and var-snapshot83) because of a race condition caused by the missing ordering.

Example After:

Feb 06 05:52:26 localhost systemd[1]: Starting Timeline of Snapper Snapshots...
Feb 06 05:52:26 localhost systemd-helper[28935]: running timeline for 'home'.
Feb 06 05:52:26 localhost systemd-helper[28935]: running timeline for 'root'.
Feb 06 05:52:26 localhost systemd-helper[28935]: running timeline for 'usr'.
Feb 06 05:52:27 localhost systemd-helper[28935]: running timeline for 'var'.
Feb 06 05:52:27 localhost systemd[1]: Started Timeline of Snapper Snapshots.
Feb 06 05:52:27 localhost systemd[1]: Starting Borg backup post snapper snapshots.
Feb 06 05:52:32 localhost snapper2borg.sh[29007]: borg create -x -s -C auto,zstd /mnt/external1/backups/home::home-snapshot99 /tmp/borg/fedora-home
Feb 06 05:52:32 localhost snapper2borg.sh[29007]: borg create -x -s -C auto,zstd /mnt/external1/backups/root::root-snapshot99 /tmp/borg/fedora-root
Feb 06 05:52:33 localhost snapper2borg.sh[29007]: borg create -x -s -C auto,zstd /mnt/external1/backups/usr::usr-snapshot99 /tmp/borg/fedora-usr
Feb 06 05:52:33 localhost snapper2borg.sh[29007]: borg create -x -s -C auto,zstd /mnt/external1/backups/var::var-snapshot99 /tmp/borg/fedora-var
Feb 06 05:52:33 localhost systemd[1]: Started Borg backup post snapper snapshots.

The unit that is running after:

# /etc/systemd/system/snapper2borg.service
[Unit]
Description=Borg backup post snapper snapshots
After=snapper-timeline.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/snapper2borg.sh

[Install]
WantedBy=snapper-timeline.service

xenithorb added a commit to xenithorb/snapper2borg that referenced this pull request Feb 6, 2018
 - Infinity start time is default
 - Helps with serial execution
 - Also see openSUSE/snapper#392

In order for this to run properly after `snapper-timeline.service`, it
too must be a `Type=oneshot` service to serially execute this service
afterwards. Otherwise we run into a race condition where this service is
not able to read the newly created lvm snapshots yet.
@tblancher

@aschnell Are you planning on merging this PR anytime soon? On Arch Linux I'm using an override to set snapper-timeline.service to Type=oneshot and RemainAfterExit=true. Since this runs systemd-helper --timeline, which exits once it's finished, there's really no need for this to be Type=simple.

This also opens up the possibility of having a different cadence than hourly for creating snapshots, by modifying snapper-timeline.timer (or timeline.timer in this repository). Some of my backups reading the latest read-only snapshot take more than an hour to run. I'm running my backup every four hours, so I don't want snapper creating a new snapshot every hour since at least three of those four will be unused, and potentially claiming space in Btrfs unnecessarily.

This all assumes that snapperd isn't actually running the schedule; the manual for snapperd contains no implementation details. It just looks like it's the DBus daemon for snapper.

@blubberdiblub

@xenithorb Note that Wants= and Requires= (and by extension WantedBy= and RequiredBy=) only introduce a dependency between systemd units. They do not impose an ordering on the execution of those units.

If you want to enforce a specific ordering between two units, you have to additionally specify After= or Before= respectively, otherwise they could be executed in any order (depending on the implementation, the phase of the moon or whether your goldfish had a bad day).

The relevant section of the systemd manpages is in systemd.unit(5) under the Wants= heading:

Note that requirement dependencies do not influence the order in which services are started or stopped. This has to be configured independently with the After= or Before= options. If unit foo.service pulls in unit bar.service as configured with Wants= and no ordering is configured with After= or Before=, then both units will be started simultaneously and without any delay between them if foo.service is activated.
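
To make the distinction concrete, here is a minimal sketch using the manpage's foo.service and bar.service. Both are hypothetical units, and the ExecStart command is only a placeholder.

```ini
# foo.service (hypothetical unit, named after the manpage example)
[Unit]
Description=Example unit that pulls in bar.service
# Requirement only: starting foo.service also queues bar.service.
Wants=bar.service
# Ordering only: foo.service is held back until bar.service has
# finished starting up.
After=bar.service

[Service]
Type=oneshot
# Placeholder command for illustration.
ExecStart=/bin/true
```

What "finished starting up" means depends on the Type= of bar.service: for Type=simple it is as soon as the process has been spawned, for Type=oneshot it is once the process has exited. That is why the ordering in this PR only works as intended once snapper-timeline.service is a oneshot.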

@blubberdiblub

@tblancher You can make the changes yourself on your system. systemctl edit snapper-timeline.service will open an editor where you add the following in the space between the comments systemd set aside for you:

[Service]
Type=oneshot
RemainAfterExit=true

Systemd will add an override.conf file to /etc/systemd/system/... for you, so it won't affect the original unit file installed from the package and your adjustments will happily survive updates.

The same works for the timer: systemctl edit snapper-timeline.timer lets you adjust when and how often the timeline snapshots are taken.
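
For the four-hour cadence mentioned above, a timer override might look like the sketch below. It assumes the packaged timer defines its schedule with OnCalendar=, which is worth checking with systemctl cat snapper-timeline.timer first.

```ini
# Override entered via: systemctl edit snapper-timeline.timer
# Assumption: the packaged timer sets its schedule with OnCalendar=.
[Timer]
# An empty assignment resets the schedule inherited from the packaged unit.
OnCalendar=
# Trigger every four hours, on the hour (00:00, 04:00, 08:00, ...).
OnCalendar=*-*-* 00/4:00:00
```

Running systemd-analyze calendar '*-*-* 00/4:00:00' shows how systemd parses the expression and when it would next elapse.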
