Create zfssnap.md #3

Merged
merged 2 commits into from Jul 18, 2019
Create zfssnap.md
How to set up ZFS snapshots on Project Trident with zfsnap
jdrch committed Jul 16, 2019
commit 18430388cecd47547a8336449c0d2775bf3f63c3
@@ -0,0 +1,169 @@
**How to set up regular recurring, recursive, incremental, online filesystem backups using `zfsnap`**

*What this guide will get you:*

* Snapshots from which you can restore all practical filesystems (some do not make sense to back up from a UNIX perspective) to historical points in time, with equal time gaps between those points
* Automatic snapshot pruning (deletion when they reach a certain age, set by the user)
* Snapshots that contain only changes since the previous snapshot
* Snapshots that are taken instantaneously, while the computer and filesystems are both online and mounted
* Snapshots on the same physical block device as the source data

*What this guide will NOT get you:*

* Device (entire PC, including bootloader, etc.) backup. For that, try [Bacula](https://www.bacula.org/) or [Amanda](http://www.amanda.org/). No claim about those tools is made or implied by this statement
* Block/physical storage device redundancy. If the physical storage device(s) your pool is on stops functioning, the snapshots are lost along with the source data, because both live on the same device(s)

**STEP 0: Get the right mindset**

Be prepared to NOT understand things at first, but also be patient. Eventually you'll get the hang of what you're doing.

Also, because snapshots are just backups, and `zfsnap` operates on those exclusively, there is minimal risk to your data during setup. The mistake you're most likely to make is setting `zfsnap` to run too often, which might slow down the machine until the problem is discovered and rectified.

Advanced users may find this guide a bit like spoonfeeding, but it's deliberately written in this manner so that people who are unfamiliar with basic programming concepts, and who do better with GUIs than CLIs, can understand how `crontab` and the commands it calls work. Too many *nix guides assume a lot of prior knowledge for which no easily accessible corresponding guide or documentation exists, leaving users struggling to understand what they need to do to get their desired effects.

There are many links and citations to allow users to do as much (or as little) background reading as they want or need.

**STEP 1: Read the zfsnap documentation**

Read:

1. The [introductory writeup](https://www.zfsnap.org/docs.html) at the project website
2. The [`zfsnap` man page](https://www.zfsnap.org/zfsnap_manpage.html)

As with most *nix documentation, 1.2 may be pretty dense reading, but it's the official, canonical description of the functionality this guide uses, so patiently go through it.

**STEP 2: Read FreeBSD's `crontab` man page**

While `zfsnap` creates, names, and automatically prunes backups, it uses `cron` to invoke it automatically. As such, you'll have to be familiar with `cron` syntax to create the zfsnap run schedule you want. `cron` syntax varies based on OS implementation, and Project Trident (ultimately) [uses FreeBSD's implementation thereof](https://t.me/ProjectTrident/33871).

Read:

1. The [`crontab` man page](https://www.freebsd.org/cgi/man.cgi?crontab(5))
2. [*Configuring* `cron`](https://www.freebsd.org/doc/handbook/configtuning-cron.html) in the FreeBSD Handbook

The same dense reading warning applies. *nix was written for research and business applications and expected to be operated by people with 4 year degrees in operating such systems, so it's OK if it takes a while to understand. Don't feel stupid.

**STEP 3: Download and install `zfsnap`**

1. Open the Lumina start menu
2. Search for AppCafe
3. In the window that opens, search for `zfsnap`. You may see multiple versions listed; you can read about their differences [here](https://github.com/zfsnap/zfsnap). This guide will use the latest version (2.x at this writing)
4. "Install" (this term is used loosely as `zfsnap` is a script that makes use of built-in ZFS functions/programs, not an application itself) the latest `zfsnap` version

**STEP 4: Determine the backups you want**

Although `zfsnap` doesn't use this terminology, snapshots can be divided into "families" based on everything that follows `zfsnap snapshot` in a given command. The significance of this is that each family requires only **ONE** `crontab` entry for creation. It's simultaneously simplifying *and* confusing. Put another way, a *single* `crontab` entry can generate *multiple* snapshots of the same type (family).

1. Determine the [zpools](https://wiki.ubuntu.com/ZFS/ZPool) (ZFS storage pools, yes that's an Ubuntu documentation link, but the definition of zpool is universal across ZFS implementations) in your system by using [`zpool list`](https://docs.oracle.com/cd/E19253-01/819-5461/gamml/index.html)
2. Select the zpool(s) whose filesystems you want to back up using the [`zfsnap snapshot`](https://www.zfsnap.org/zfsnap_manpage.html#snapshot) command
3. Build the required `zfsnap snapshot` command accordingly

As an example, all snapshots created from `zfsnap snapshot -rv -a 6w zpool` are in the same family, and so require only 1 `crontab` line. The aforesaid command translates to:

* Invoke zfsnap (in a `crontab` entry this is written as the absolute path to the `zfsnap` command, `/sbin/zfsnap`) to ...
* Create recursive snapshots (`-rv` combines `-r`, recursive, with `-v`, verbose) with ...
* A minimum retention period (represented by `-a`, also called [TTL](https://en.wikipedia.org/wiki/Time_to_live) but "minimum retention period" is easier to understand as will be shown later) of ...
* 6 weeks (represented by `6w`) of ...
* All filesystems on the zpool named `zpool` (represented by `zpool`)

Or, put into one sentence: Create a recursive snapshot of all filesystems on zpool named `zpool` with a minimum retention period of (read "that should be deleted after") 6 weeks. It may be easier to understand the syntax if you read the translation first before reading the command.

Test each of your `zfsnap snapshot` commands using the `-n` (dry-run) and `-v` (verbose) flags to make sure the command does what you think it does, e.g. `zfsnap snapshot -n -v -rv -a 6w zpool`.

The point of this step in the exercise is for the user to determine which pool they want to back up and how long they want to keep each backup of that pool.
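To get a feel for what a TTL such as `6w` means in absolute terms, here is a minimal sketch in POSIX shell. The helper `ttl_to_seconds` is hypothetical and not part of `zfsnap`; it assumes a single-unit TTL and handles only the `w`, `d`, and `h` suffixes, whereas the TTL syntax in the man page is richer.

```shell
# Hypothetical helper (not part of zfsnap): convert a single-unit TTL such as
# "6w", "10d" or "2h" into seconds. Only the w, d and h suffixes are handled.
ttl_to_seconds() {
    ttl="$1"
    number="${ttl%?}"          # everything except the last character
    unit="${ttl#"$number"}"    # the last character
    case "$unit" in
        w) echo $((number * 7 * 24 * 3600)) ;;
        d) echo $((number * 24 * 3600)) ;;
        h) echo $((number * 3600)) ;;
        *) echo "unsupported TTL unit: $unit" >&2; return 1 ;;
    esac
}

ttl_to_seconds 6w   # prints 3628800 (6 weeks in seconds)
```

Seeing the TTL as a plain number of seconds makes it easier to compare with the `@ns` schedules discussed in Step 5.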

**STEP 5: Determine the `crontab` schedule for your backups**

1. Think about how often you want to create backups
2. The frequency from 5.1 is set entirely by `crontab`, so *write down* a `crontab` schedule that matches it, based on the `crontab` syntax in 2.1 above

A couple details about 5.2 above:

* `crontab` fields support whole numbers only, e.g. `0 2 * * *` will work, `0 2.2 * * *` will not work
* /*n*, where *n* is a whole number, e.g. `0/5`, does not work (the way you might think) for the minutes field. Without going into details, just avoid it
* The `@ns` syntax, where *n* is a whole number, e.g. `@1000s`, is much easier to understand than the individual fields. In addition, it ensures that each successive invocation happens *n* seconds *after the previous one has completed*, which ensures tasks in the same family never collide (read: attempt to start a new instance before the previous instance has completed. This is generally not an issue for snapshot creation because it's instantaneous, but may be an issue for snapshot deletion). The main drawback is that it's more difficult to set jobs based on absolute calendar date and time

An example of a `crontab` schedule is `0 14 * * *` (tab separated), which translates to:

* Every time the system clock time value is 0 minutes, 14 hours (represented by `0 14`, evaluates to 14:00/2:00 PM) ...
* Regardless of the day of the month, the month, or the day of the week (what the three `*` fields stand for, respectively)

Or, put into one sentence: Every day at 14:00/2 PM.
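The field-by-field reading above can be sketched in POSIX shell. The function `describe_schedule` is a hypothetical helper written for this guide, not a `cron` tool; it only labels the five schedule fields and does not validate them.

```shell
# Hypothetical helper: label the five fields of a crontab schedule.
# Fields may be separated by spaces or tabs.
describe_schedule() {
    printf '%s\n' "$1" | {
        # read splits on whitespace and does not expand the asterisks
        read -r minute hour mday month wday
        printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
            "$minute" "$hour" "$mday" "$month" "$wday"
    }
}

describe_schedule '0 14 * * *'
# prints: minute=0 hour=14 day-of-month=* month=* day-of-week=*
```

Always quote the schedule when passing it around in a shell, otherwise the shell expands each `*` into a list of file names.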

**STEP 6: Match desired backups with their corresponding `crontab` schedules to create single, complete `crontab` entries for each backup**

For example, putting the examples in Steps 5 and 4 together - in that sequence - into a sample `crontab` entry gives:

`0 14 * * * /sbin/zfsnap snapshot -rv -a 6w zpool`

Which translates to:

* Every time the system clock time value is 0 minutes, 14 hours (represented by `0 14`, evaluates to 14:00/2:00 PM) ...
* Regardless of the day of the month, the month, or the day of the week (what the three `*` fields stand for, respectively) ...
* Invoke zfsnap (represented by the absolute path to the `zfsnap` command, `/sbin/zfsnap`) to ...
* Create recursive snapshots (`-rv` combines `-r`, recursive, with `-v`, verbose) with ...
* A minimum retention period (represented by `-a`, also called [TTL](https://en.wikipedia.org/wiki/Time_to_live) but "minimum retention period" is easier to understand as will be shown later) of ...
* 6 weeks (represented by `6w`) of ...
* All filesystems on the zpool named `zpool` (represented by `zpool`)

Or, put into one sentence: Create a recursive snapshot of all filesystems on zpool named `zpool` every day at 14:00/2 PM with a minimum retention period of (read "that should be deleted after") 6 weeks.

You can have as many creation entries (snapshot families) as you want.

You may have to read that over multiple times to completely understand it. That's fine, be patient with yourself.

Clearly, from the above, a snapshot will be created each day at 14:00/2 PM. After 10 days, for example, there will be 10 snapshots created from that single line. This is what is meant by the snapshot "family" concept introduced in Step 4.
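As a rough illustration of how a family grows, the sketch below estimates how many snapshots one daily entry leaves behind. The function `expected_snapshots` is hypothetical, and the estimate is an approximation: it assumes exactly one snapshot per day and that expired snapshots are pruned promptly, so the real count can be slightly higher between `zfsnap destroy` runs.

```shell
# Hypothetical estimate: snapshots in one family after a number of days,
# given one snapshot per day and a TTL expressed in days.
expected_snapshots() {
    days_elapsed="$1"
    ttl_days="$2"
    if [ "$days_elapsed" -lt "$ttl_days" ]; then
        echo "$days_elapsed"    # nothing has expired yet
    else
        echo "$ttl_days"        # older snapshots have been pruned
    fi
}

expected_snapshots 10 42    # prints 10 (10 days in, TTL of 6 weeks = 42 days)
expected_snapshots 100 42   # prints 42 (the family has levelled off)
```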

**STEP 7: Decide when you want to delete (`zfsnap destroy`/prune) snapshots whose age is greater than their specified minimum retention time (read: old snapshots)**

There are a lot of options for this, but to keep things simple this guide will cover `destroy`ing *all* old snapshots at once.

A snapshot's minimum retention time (TTL) and the cadence of `zfsnap destroy` are related *only* in the sense that the latter will delete snapshots older than their minimum retention time *when* a `zfsnap destroy` matching said snapshot (to be covered later) is invoked. In other words, a snapshot will survive beyond its minimum retention time until the next `zfsnap destroy` invocation that matches it.

As an example, consider a snapshot with a minimum retention time of 2 hours, taken at midnight (00:00). A matching `zfsnap destroy` invocation 1 hour after that snapshot was taken, at 01:00 on the same day, will leave that snapshot intact. However, a matching `zfsnap destroy` invocation 3 hours after that snapshot was taken, at 03:00 on the same day, will delete that snapshot.

Of note is the fact that the snapshot was retained for 3 hours despite its TTL being 2 hours. That is why, to this point, this guide uses the term "minimum retention time" instead of TTL: it better and more plainly describes the meaning of that parameter. "TTL" implies that whatever it refers to "dies" when the TTL value is reached, while "minimum retention time" conveys that whatever it refers to will live for *at least* the minimum retention time. Minimum retention time was used up to this point to avoid confusion, but the guide will now use the official term, TTL, to align with the documentation.
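The midnight example can be worked through in a few lines of POSIX shell. The function `is_expired` is a hypothetical illustration of the rule as this guide describes it (delete when age is greater than TTL); it is not `zfsnap`'s actual code, and how `zfsnap` treats a snapshot whose age exactly equals its TTL is not covered here.

```shell
# Hypothetical illustration: would a matching destroy run delete the snapshot?
# All three arguments are hours since midnight, to keep the arithmetic simple.
is_expired() {
    taken_at="$1"
    ttl="$2"
    destroy_at="$3"
    age=$((destroy_at - taken_at))
    if [ "$age" -gt "$ttl" ]; then
        echo "delete"
    else
        echo "keep"
    fi
}

is_expired 0 2 1   # prints keep   (destroy at 01:00, snapshot is 1 hour old)
is_expired 0 2 3   # prints delete (destroy at 03:00, snapshot is 3 hours old)
```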

The steps, therefore, are:

1. Decide how often/when old snapshots should be deleted. As the documentation states, while snapshot creation is (mostly) instantaneous, deletion takes longer and can be taxing to the machine. This is exacerbated by the fact the guide covers deleting all old snapshots at once, for the sake of simplicity. As such, it is highly recommended that `zfsnap destroy` be scheduled with sufficient time between invocations, that it be regular but not be *too* frequent, AND that the `@ns` syntax is used to prevent task collision
2. Build the `crontab` schedule using the same syntax as Step 5
3. Append `zfsnap destroy -rv zpool` to the schedule from 7.2 above

An example of a combined command for the above is:

`@100000s /sbin/zfsnap destroy -rv zpool`

Which translates to:

* 100000s after the completion of the previous invocation of the following command (represented by `@100000s`) ...
* Invoke zfsnap (represented by the absolute path to the `zfsnap` command, `/sbin/zfsnap`) to ...
* Recursively destroy snapshots (`-rv` combines `-r`, recursive, with `-v`, verbose) whose age is greater than their TTL value on ...
* All filesystems on the zpool named `zpool` (represented by `zpool`)

Or, put into a single sentence: Destroy all old snapshots in all filesystems on zpool 100000 seconds after the last such destruction completed.
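Large second counts are hard to judge at a glance, so here is a small POSIX shell sketch that breaks an interval down into days, hours, minutes, and seconds. The function `describe_interval` is a hypothetical helper, not part of `cron` or `zfsnap`.

```shell
# Hypothetical helper: express a number of seconds as days/hours/minutes/seconds.
describe_interval() {
    total="$1"
    days=$((total / 86400))
    hours=$((total % 86400 / 3600))
    minutes=$((total % 3600 / 60))
    seconds=$((total % 60))
    printf '%dd %dh %dm %ds\n' "$days" "$hours" "$minutes" "$seconds"
}

describe_interval 100000   # prints 1d 3h 46m 40s
```

In other words, the example entry runs a little more often than once every 28 hours, counted from the end of the previous run.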

Test each of your `zfsnap destroy` commands using the `-n` (dry-run) and `-v` (verbose) flags to make sure the command does what you think it does, e.g. `zfsnap destroy -n -v -rv zpool`.

You can have as many deletions as you want, especially if you prefer to prune (delete members of) only certain snapshot families at a time. The full syntax for `zfsnap destroy`, which enables that, is [here](https://www.zfsnap.org/zfsnap_manpage.html#destroy). It is omitted for the sake of simplicity.

**STEP 8: Put each combined (schedule + `zfsnap` command) command into `/etc/crontab`**

* Open `/etc/crontab` in Lumina Text Editor via `sudo lte /etc/crontab`. Unlike on (some) Linux distros, the file can be edited directly in a GUI application on Project Trident without needing to go through a wrapper command such as `visudo`.
* Put each combined `zfsnap snapshot` and `zfsnap destroy` command into the `crontab` file, with each line preceded by a comment describing what the line does. This is helpful for troubleshooting; in a crisis the last thing you want to be doing is trying to divine exactly what each line is doing. See below for what that should look like
* Save the `crontab` file and exit Lumina Text Editor

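The additional `crontab` entries should look like this. Because `/etc/crontab` is the system-wide crontab, FreeBSD expects a user field (here `root`) between the schedule and the command; without it, `cron` would treat `/sbin/zfsnap` as a user name:

`# Create a recursive snapshot of zpool daily at 2 PM with a TTL of 6 weeks`
`0 14 * * * root /sbin/zfsnap snapshot -rv -a 6w zpool`

`# Destroy all old snapshots on zpool 100000 seconds after last destruction completed`
`@100000s root /sbin/zfsnap destroy -rv zpool`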
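If you want to confirm that every `zfsnap` line has its descriptive comment, a check along these lines may help. It is a sketch only: `uncommented_entries` is a hypothetical helper, the sample file is made up, and the sample lines include the user field (`root`) that the system-wide `/etc/crontab` format expects.

```shell
# Hypothetical check (not a cron or zfsnap feature): list every zfsnap entry
# in a crontab-style file that is not directly preceded by a comment line.
uncommented_entries() {
    awk '
        /zfsnap/ && !/^#/ && prev !~ /^#/ { print NR ": " $0 }
        { prev = $0 }
    ' "$1"
}

# Made-up sample file: the first entry has a comment, the second does not.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Create a recursive snapshot of zpool daily at 2 PM with a TTL of 6 weeks
0 14 * * * root /sbin/zfsnap snapshot -rv -a 6w zpool

@100000s root /sbin/zfsnap destroy -rv zpool
EOF

uncommented_entries "$sample"
# prints: 4: @100000s root /sbin/zfsnap destroy -rv zpool
rm -f "$sample"
```

To check the real file, run `uncommented_entries /etc/crontab`; no output means every entry is documented.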

**(Later on) STEP 9: Verify that your snapshots are being created as you want**

1. Depending on the schedule you set, wait some time for enough snapshots to have been taken and deleted
2. List all snapshots using [`zfs list -t snapshot`](https://docs.oracle.com/cd/E19253-01/819-5461/gbiqe/index.html). The output should match your `crontab` entries
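When there are many snapshots, counting them per filesystem makes the comparison with your `crontab` entries easier. The sketch below is a hypothetical helper that reads snapshot names from standard input; the sample names are made up, and their `date--TTL` shape is an assumption about how `zfsnap` names snapshots rather than something this guide has shown.

```shell
# Hypothetical helper: count snapshots per filesystem from a list of
# "filesystem@snapshot" names on standard input, one name per line.
count_snapshots() {
    awk -F'@' 'NF == 2 { n[$1]++ } END { for (fs in n) print fs, n[fs] }' | sort
}

# On a real system (assumes the zfs command is available):
#   zfs list -H -t snapshot -o name | count_snapshots

# Made-up sample input for illustration:
printf '%s\n' \
    'zpool/home@2019-07-16_14.00.00--6w' \
    'zpool/home@2019-07-17_14.00.00--6w' \
    'zpool/var@2019-07-16_14.00.00--6w' | count_snapshots
# prints:
# zpool/home 2
# zpool/var 1
```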

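If you made a syntax error in `crontab` resulting in too many snapshots being taken, correct the problematic `crontab` entry first. This will stop the excessive snapshot creation. Then, run a `zfsnap destroy` on the affected pools to clear out the unwanted snapshots and start from scratch. Note that, as described in Step 7, `zfsnap destroy` as used in this guide only deletes snapshots older than their TTL; the [`destroy` section of the man page](https://www.zfsnap.org/zfsnap_manpage.html#destroy) covers its other options.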