Request: ability to run pre/post hooks on a push job #394

Open
davefullard opened this issue Oct 12, 2020 · 14 comments
Comments

@davefullard

I have a remote machine configured as a 'sink' for an encrypted dataset. I'd like this dataset to be unmounted and locked while the data is at rest. At the moment, I have a cron job that remotely unlocks and mounts the dataset before waking up the push job. What I would like is the ability to run pre and post commands before and after the job runs. I see something similar in issue #74 for snapshots.
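
A rough sketch of the cron-driven workaround described above, for readers who want to reproduce it until hooks exist. The host, dataset and job names are placeholders, and it assumes the dataset's key can be loaded without an interactive prompt. `zfs load-key`, `zfs mount`, `zfs unmount`, `zfs unload-key` and `zrepl signal wakeup` are existing commands; the function names are made up for illustration.

```sh
#!/bin/sh
# Sketch only: host, dataset and job names are placeholders, and the key is
# assumed to be loadable non-interactively on the sink.

# Print the command that unlocks and mounts DATASET ($1) on the sink.
unlock_cmd() {
    printf "zfs load-key '%s' && zfs mount '%s'\n" "$1" "$1"
}

# Print the command that unmounts DATASET ($1) and unloads its key again.
lock_cmd() {
    printf "zfs unmount '%s' && zfs unload-key '%s'\n" "$1" "$1"
}

# pre_push HOST DATASET JOB: unlock the sink, then wake up the push job.
pre_push() {
    ssh "$1" "$(unlock_cmd "$2")" && zrepl signal wakeup "$3"
}

# post_push HOST DATASET: lock the sink again.
post_push() {
    ssh "$1" "$(lock_cmd "$2")"
}
```

Note that `zrepl signal wakeup` only triggers the job and does not wait for it to finish, so with this approach `post_push` has to be scheduled separately. That gap is what a real post hook would close.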

@problame
Member

This feature request is reasonable and should be fairly easy for a beginner to implement. A few questions:

What metadata will you need in the hook script?
What should happen if the hook times out or exits with non-zero status code?

Also: I suppose your use case is sending from a pool that doesn't support OpenZFS encryption into one that does?

@problame
Member

(To get an impression of the metadata that's available on the post-edge, run zrepl status --raw after a job completed its invocation).

@davefullard
Author

davefullard commented Oct 12, 2020

> What metadata will you need in the hook script?

For my particular use case, I don't need any metadata at this time, but I could see some of it being useful in the future. Maybe providing items such as the state and errors of the job would allow a script to have better error handling.

> What should happen if the hook times out or exits with non-zero status code?

In my scenario, I would expect the job to fail as the encrypted dataset on the remote side would not be reachable or the dataset is not mounted.

> Also: I suppose your use case is sending from a pool that doesn't support OpenZFS encryption into one that does?

Correct.

> (To get an impression of the metadata that's available on the post-edge, run zrepl status --raw after a job completed its invocation).

{"_control":{"internal":null,"type":"internal"},"offsite":{"push":{"Replication":{"StartAt":"2020-10-12T09:13:33.853594254-04:00","FinishAt":"0001-01-01T00:00:00Z","WaitReconnectSince":"0001-01-01T00:00:00Z","WaitReconnectUntil":"0001-01-01T00:00:00Z","WaitReconnectError":null,"Attempts":[{"State":"done","StartAt":"2020-10-12T09:13:33.85391539-04:00","FinishAt":"2020-10-12T09:13:34.505158342-04:00","PlanError":null,"Filesystems":[{"Info":{"Name":"storage/home"},"State":"done","PlanError":null,"StepError":null,"CurrentStep":0,"Steps":[]},{"Info":{"Name":"storage/home/dave"},"State":"done","PlanError":null,"StepError":null,"CurrentStep":0,"Steps":[]},{"Info":{"Name":"storage/syncthing"},"State":"done","PlanError":null,"StepError":null,"CurrentStep":0,"Steps":[]}]}]},"PruningSender":{"State":"Done","Error":"","Pending":[],"Completed":[{"Filesystem":"storage/home","SnapshotList":[{"Name":"zfs-auto-snap_daily-2020-10-12-00h07","Replicated":false,"Date":"2020-10-12T00:07:00-04:00"}],"DestroyList":[],"SkipReason":"","LastError":""},{"Filesystem":"storage/home/dave","SnapshotList":[{"Name":"zfs-auto-snap_daily-2020-10-12-00h07","Replicated":false,"Date":"2020-10-12T00:07:00-04:00"}],"DestroyList":[],"SkipReason":"","LastError":""},{"Filesystem":"storage/syncthing","SnapshotList":[{"Name":"zfs-auto-snap_daily-2020-10-12-00h07","Replicated":false,"Date":"2020-10-12T00:07:00-04:00"}],"DestroyList":[],"SkipReason":"","LastError":""}]},"PruningReceiver":{"State":"Done","Error":"","Pending":[],"Completed":[{"Filesystem":"storage","SnapshotList":null,"DestroyList":null,"SkipReason":"filesystem is placeholder","LastError":""},{"Filesystem":"storage/home","SnapshotList":[{"Name":"zfs-auto-snap_weekly-2020-10-11-00h14","Replicated":true,"Date":"2020-10-11T00:14:00-04:00"},{"Name":"zfs-auto-snap_daily-2020-10-11-23h14","Replicated":true,"Date":"2020-10-11T23:14:21-04:00"},{"Name":"zfs-auto-snap_daily-2020-10-12-00h07","Replicated":false,"Date":"2020-10-12T00:07:00-04:00"}],"DestroyList":[],"SkipReason":"","LastError":""},{"Filesystem":"storage/home/dave","SnapshotList":[{"Name":"zfs-auto-snap_weekly-2020-10-11-00h14","Replicated":true,"Date":"2020-10-11T00:14:00-04:00"},{"Name":"zfs-auto-snap_daily-2020-10-11-23h14","Replicated":true,"Date":"2020-10-11T23:14:21-04:00"},{"Name":"zfs-auto-snap_daily-2020-10-12-00h07","Replicated":false,"Date":"2020-10-12T00:07:00-04:00"}],"DestroyList":[],"SkipReason":"","LastError":""},{"Filesystem":"storage/syncthing","SnapshotList":[{"Name":"zfs-auto-snap_weekly-2020-10-11-00h14","Replicated":true,"Date":"2020-10-11T00:14:00-04:00"},{"Name":"zfs-auto-snap_daily-2020-10-11-23h14","Replicated":true,"Date":"2020-10-11T23:14:21-04:00"},{"Name":"zfs-auto-snap_daily-2020-10-12-00h07","Replicated":false,"Date":"2020-10-12T00:07:00-04:00"}],"DestroyList":[],"SkipReason":"","LastError":""}]},"Snapshotting":null},"type":"push"}}

@MarcusWichelmann

As @problame pointed out in #395, the post hook feature requested in this issue could also be used to send emails after a job has finished executing.

zrepl status --raw seems to already provide all the metadata needed for implementing such a hook script. But IMO some environment variables would be helpful so the script can at least easily tell, if the job succeeded or failed.
Additionally, I'd keep the syntax and behaviour of this hook as close to the snapshot hooks as possible, so that the same timeout and err_is_fatal options and environment variables are available, as they are just as useful here.
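
To make those semantics concrete, here is a minimal shell sketch of a job wrapped in pre/post hooks with a timeout and an err_is_fatal-style switch. It is not zrepl code: the function names and the `JOB_EXIT_CODE` variable are invented for illustration, and it relies on the `timeout` utility from GNU coreutils.

```sh
#!/bin/sh
# Illustration only: zrepl does not ship this wrapper, and JOB_EXIT_CODE is a
# made-up variable name. Requires `timeout` from GNU coreutils.

# run_hook TIMEOUT_SECONDS ERR_IS_FATAL COMMAND [ARGS...]
# Runs COMMAND under a time limit. Returns 0 if the hook succeeded, or if it
# failed while ERR_IS_FATAL is not "true". Otherwise returns the hook's exit
# status (124 when `timeout` had to stop it).
run_hook() {
    hook_limit="$1"
    hook_fatal="$2"
    shift 2
    hook_rc=0
    timeout "$hook_limit" "$@" || hook_rc=$?
    if [ "$hook_rc" -ne 0 ]; then
        echo "hook failed with status $hook_rc: $*" >&2
        if [ "$hook_fatal" = "true" ]; then
            return "$hook_rc"
        fi
    fi
    return 0
}

# run_job_with_hooks PRE JOB POST
# Each argument is one command string, executed via `sh -c`.
run_job_with_hooks() {
    job_rc=0
    # A failing pre hook is fatal: the job is never started.
    run_hook 30 true sh -c "$1" || return $?
    sh -c "$2" || job_rc=$?
    # The post hook always runs and sees the job result in JOB_EXIT_CODE.
    run_hook 30 false env "JOB_EXIT_CODE=$job_rc" sh -c "$3"
    return "$job_rc"
}
```

Making the pre hook fatal matches the expectation stated earlier in the thread that the job should fail when the remote dataset cannot be prepared; the post hook is non-fatal here so that a notification failure does not mask the job's own result.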

@problame Is there a way to make zrepl status write the human readable form of the current status to stdout only once so it could be added to the email? If not I'll create an issue for that.

@problame
Member

> Is there a way to make zrepl status write the human readable form of the current status to stdout only once so it could be added to the email? If not I'll create an issue for that.

That'd be addressed by #297 (rewrite of the terminal status UI).
The rewrite refactors the printing logic so that it can write both to stdout and to a tcell buffer.
Would you like to pick up that PR? It's a great starter project and Go is really easy to pick up!

@problame
Member

> zrepl status --raw seems to already provide all the metadata needed for implementing such a hook script. But IMO some environment variables would be helpful so the script can at least easily tell, if the job succeeded or failed.

We'd likely provide the output of zrepl status --raw as an env variable or as an env variable that contains a path to a tmp file that contains that information (so that it is a fixed parameter to the hook script). But yeah, I can see how some status variables could be useful.

@MarcusWichelmann

> Would you like to pick up that PR? It's a great starter project and Go is really easy to pick up!

You mean if I could work on implementing this feature request for job hooks?
I'm not sure if this is feasible with zero previous go experience. I'm a C# guy. 😄 But I'll take a look.

@problame
Member

> You mean if I could work on implementing this feature request for job hooks?
> I'm not sure if this is feasible with zero previous go experience. I'm a C# guy. 😄 But I'll take a look.

That PR implements a better terminal UI and coincidentally brings the ability to render the current zrepl status (without --raw) to stdout. So that's not exactly what you were requesting in this PR.

But the hooks code is pretty easy to follow as well, and there is a lot less code to worry about. So yes, I'd recommend taking a look!

@problame added this to To do in Replication on Feb 21, 2021

@callumgare

callumgare commented Apr 15, 2022

I'd also find this very helpful, so I'm adding my use case here: on success, I'd like to run a script that pings https://healthchecks.io so that I get alerted if backups don't run for some reason. For that use case the script wouldn't need any particular info; it would only need to run on success (or, if it were to run on both success and failure, it would need an env var to easily tell the difference).
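
A hedged sketch of what such a post hook script could look like if the hook received the job result in an environment variable. `JOB_EXIT_CODE` is a hypothetical name, not something zrepl provides today; the `/fail` suffix follows the healthchecks.io ping URL scheme.

```sh
#!/bin/sh
# Hypothetical post hook: JOB_EXIT_CODE is an assumed variable, not an
# existing zrepl feature.

# ping_url_for_status STATUS BASE_URL
# Prints BASE_URL for a successful job (STATUS 0), BASE_URL/fail otherwise.
ping_url_for_status() {
    if [ "$1" -eq 0 ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$2/fail"
    fi
}

# report_backup STATUS BASE_URL
# Sends the ping, using the same curl options as a typical healthchecks setup.
report_backup() {
    curl -fsS -m 10 --retry 5 -o /dev/null "$(ping_url_for_status "$1" "$2")"
}
```

The hook would then end with something like `report_backup "${JOB_EXIT_CODE:-1}" "https://hc-ping.com/uuid-of-your-check"`, treating a missing variable as a failure.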


Update for anyone else in this situation: I've written a little script that checks whether a backup has been received recently, and set up systemd to run it periodically:

/usr/local/sbin/check-backups-occurred.sh

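#!/bin/bash
# Usage: check-backups-occurred.sh DATASET "TIME AGO" SUCCESS_URL FAILURE_URL
dataset="$1"
timeToCheckForSnapshotsAfterHumanReadable="$2"
endpointToPingOnSuccess="$3"
endpointToPingOnFailure="$4"

# Convert the human-readable cutoff (e.g. "11 minutes ago") to a Unix timestamp.
timeToCheckForSnapshotsAfter=$(date --date "$timeToCheckForSnapshotsAfterHumanReadable" +'%s')
# Creation time (Unix timestamp) of the newest snapshot under the dataset.
timeOfLastSnapshot=$(sudo zfs list -Hp -t snapshot -r -o creation -s creation "$dataset" | tail -1)

# An empty result (no snapshots found) counts as a failure.
if [ "${timeOfLastSnapshot:-0}" -gt "$timeToCheckForSnapshotsAfter" ]; then
        /usr/bin/curl -fsS -m 10 --retry 5 -o /dev/null "$endpointToPingOnSuccess"
else
        /usr/bin/curl -fsS -m 10 --retry 5 -o /dev/null "$endpointToPingOnFailure"
fi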

/etc/systemd/system/check-zrepl-backups-occurred.service

[Unit]
Description=Reports whether a recent backup has successfully occurred

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/check-backups-occurred.sh "your/dataset/to/check" "11 minutes ago" "https://hc-ping.com/uuid-of-your-check" "https://hc-ping.com/uuid-of-your-check/fail"

/etc/systemd/system/check-zrepl-backups-occurred.timer

[Unit]
Description=Periodically check that zrepl backups have occurred

[Timer]
OnBootSec=10min
OnUnitActiveSec=10min
Persistent=true

[Install]
WantedBy=timers.target

@problame
Member

(FYI I have an unfinished branch that refactors the snapshotting code to support cron syntax. Right now, hooks are buried in the snapshotting code. So, if anyone starts hacking on this, reach out to me first to avoid a nasty merge conflict)

@Phlogi

Phlogi commented Jul 8, 2022

I'd use this functionality to attach / import a ZFS backup drive that is usually offline except during backups. The pre hook would import the drive, and the post hook would export it again and put the drive to sleep.
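
For illustration, the two hooks for this use case might look roughly like the sketch below. The pool name and device path are placeholders and the function names are invented; `zpool import`, `zpool export` and `hdparm -y` (which puts a drive into standby) are existing commands. Setting `DRY_RUN=1` prints the commands instead of running them.

```sh
#!/bin/sh
# Sketch only: pool name and device path are placeholders.

# run COMMAND [ARGS...]: execute the command, or just print it if DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# pre_backup POOL: import the backup pool before replication starts.
pre_backup() {
    run zpool import "$1"
}

# post_backup POOL DEVICE: export the pool, then put the drive into standby.
post_backup() {
    run zpool export "$1" && run hdparm -y "$2"
}
```

The drive is only spun down if the export succeeded, so a pool that is still busy is not put to sleep underneath ZFS.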

@problame
Member

problame commented Jul 8, 2022

I'll try to get the cron syntax stuff cleared away asap, so that people can start hacking on this if they want.

@problame
Member

problame commented Jul 9, 2022

The issue that tracks the cron snapshotting work is #554

@problame
Member

The aforementioned snapper refactor & cron support has been released in zrepl 0.6. So, if anyone wants to work on this feature (=generic pre-post hooks for snapshotting & replication) I'm happy to review PRs.
