-
-
Notifications
You must be signed in to change notification settings - Fork 13.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sanoid: add package, NixOS module and test #72060
Conversation
This comment has been minimized.
This comment has been minimized.
44ca23b
to
6a09b51
Compare
@GrahamcOfBorg test sanoid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I might try this out, currently using znapzend, but it's a bit problematic occasionally
}; | ||
|
||
extraConfig = mkOption { | ||
description = "Extra configuration for this template/dataset"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I'm describing in NixOS/rfcs#42, introducing such extraConfig
options has problems, so I'd rather have this PR use the approach described there. This means:
- Changing this to
settings = mkOption { type = types.attrsOf (types.nullOr (types.oneOf [ int bool str ])); ... }
- Setting all other options in the
config
section of the (sub)module likesettings.hourly = mkDefault cfg.hourly;
instead - Generating the config file only from the
cfg.settings
values. You might be able to uselib.generators.toINI
for this
If you want to I could implement this for you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented this; let me know if it looks good to you.
datasetOptions = commonOptions // { | ||
useTemplate = mkOption { | ||
description = "Names of the templates to use for this dataset"; | ||
type = types.nullOr (types.listOf (types.enum (attrNames cfg.templates))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the manual you should override this options description to note that only values in templates
can be used, because while building docs templates
is just gonna be empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this also one of those situations where changing templates
will cause unintended manual rebuilds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah probably
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to cause manual rebuilds as far as I can tell, but I overrode the type description anyway so that it is more useful than list of one of s
.
''; | ||
}; | ||
|
||
defaultArguments = mkOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about removing this option and useDefaultArguments
and instead putting the default arguments as default = [ ... ]
into extraArguments
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name was not a good description of what that the option does. I changed it to commonArgs
(it seems like Args
is more common than Arguments
in nixpkgs). I created that option because I wanted a convenient shortcut to apply arguments to all (or most, since you can disable it for a specific command with useCommonArgs = false
) of the commands, in addition to those configured in extraArgs
. For example, I use it to add --no-privilege-elevation
and --no-sync-snap
in my configuration.
systemd.services.sanoid = { | ||
description = "Sanoid snapshot service"; | ||
serviceConfig.ExecStart = "${pkgs.sanoid}/bin/sanoid --cron --configdir=${configDir} ${cfg.extraArgs}"; | ||
after = [ "zfs.target" ]; | ||
startAt = cfg.interval; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sanoid should run as its own user, not root. Also consider passing --verbose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need environment.TZ = "UTC";
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a user
option, but left it as root
by default because AFAIK sanoid cannot run as another user unless ZFS privilege delegation is configured.
I'd rather not pass --verbose
unconditionally in case someone doesn't want it for some reason. I added it to the example for extraArgs
though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you need something like /run/booted-system/sw/bin/zfs allow sanoid snapshot,destroy,mount,send,receive ${lib.escapeShellArg pool}
to allow sanoid to act on that pool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this module should automatically make changes like that to users' pools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm willing to implement this, but I'm not sure what approach to take. zfs allow
needs to run as root, but PermissionsStartOnly
is deprecated and NixOS doesn't support the command prefixes that replace it, so I don't see a way to setup privilege delegation in preStart
. The only other way I can see is to create another service that gets started before sanoid.service
, but that seems hacky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT the prefixes can be used, just not with pretty abstractions like script
/preStart
/preStop
/etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out that sanoid cannot currently run as non-root. It hardcodes the path to the cache file as /var/cache/sanoidsnapshots.txt
, which cannot be created as an unprivileged user. This could be worked around with tmpfiles or another ExecStartPre
command. The real problem is that it puts lock files directly in /var/run
. In this case, sanoid needs to create and delete these files itself, so I don't see a way to work around this.
I am working on patches to fix these issues, which I will submit upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The --run-dir
and --cache-dir
options have been accepted upstream. There is no new release yet, so I added them as patches.
6a09b51
to
f6652b2
Compare
c57daa3
to
1c45bf9
Compare
7d9a60c
to
f5d9349
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking pretty good! I have some final comments, but I'm happy to merge once those are addressed.
aaaadce
to
b6a9ae3
Compare
ee4dce2
to
4e40ca7
Compare
ExecStartPre = map (pool: lib.escapeShellArgs [ | ||
"+/run/booted-system/sw/bin/zfs" "allow" | ||
cfg.user "snapshot,destroy" pool | ||
]) pools; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! What's the status on being able to use DynamicUser
now? I see you're including patches for --cache-dir
and --run-dir
, yet those options aren't passed anywhere. Also I see zfs allow
here but not in syncoid
. The ideal situation would be to get rid of all user
and group
options and switch to DynamicUser
. Combining this with zfs allow
also has the benefit of not needing a user with permanent permissions, since the service user will be only temporary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--cache-dir
and --run-dir
are not passed because they default to the directories that systemd uses, although I could pass them to be explicit.
I did some testing with DynamicUser
and it seems to work. I remove the privilege delegations in ExecStopPost
, which should result in them being correctly cleaned up even if the service fails. My only concern is that if the service is interrupted (ie. by power failure, kernel panic) then the permissions won't be cleaned up. If systemd were to reuse this UID for another DynamicUser
service, then that service would gain unintended ZFS permissions, which seems like a (admittedly rather hard to exploit) security vulnerability.
It should be possible to perform automatic privilege delegation for syncoid as well, although it is somewhat more complicated to extract the pool name and determine whether a target is local or remote.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't see the removal of privileges, is that change missing perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't publish the DynamicUser
change. I have added it now in a separate commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, sanoid is looking great now. Does zfs allow
also work for syncoid? If so there wouldn't be a need for a user
option (unless you kept it because of a certain use case?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You aren't concerned about the cleanup issue I mentioned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry, forgot about that. I guess we can delegate this as potential future work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand your comment correctly, you are suggesting we forgo DynamicUser
for now (that would be my choice). I am planning to implement automatic privilege delegation for syncoid (without DynamicUser
), I've just been busy recently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somehow, despite the discussion here, the final version of the PR that got merged uses DynamicUser
. While working on some other issues, I discovered that at some point sanoid failed to clean up its permissions, as show below:
---- Permissions on root ---------------------------------------------
Local+Descendent permissions:
user syncoid destroy,hold,send,snapshot
user destroy,mount,snapshot
Besides the possible security issue, it also appears to be currently impossible to remove these permissions until openzfs/zfs#10280 makes it into a release.
4e40ca7
to
3fafcb5
Compare
I'm looking to switch to sanoid/syncoid on my NAS. So, I've tested the package on my system - it works correctly. I didn't yet have a chance to try the module. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I'll do some final testing, will merge after that, just before the 20.03 feature freeze :)
ec789f4
to
7684537
Compare
I took the liberty to improve the options a bit:
With these changes this looks good to me, are they also fine with you @lopsided98? |
That looks good to me. |
Merging then! 🚀 |
See #79759 |
Motivation for this change
Adds sanoid, a ZFS snapshotting and synchronization tool.
I have been using a version of this package and module locally for over a year, and just cleaned it up before submitting it upstream. I also just added a test that takes snapshots with
sanoid
and then usessyncoid
to transfer them to another machine.Things done
sandbox
innix.conf
on non-NixOS)nix-shell -p nix-review --run "nix-review wip"
./result/bin/
)nix path-info -S
before and after)Notify maintainers
cc @deliciouslytyped @Nekroze