-
-
Notifications
You must be signed in to change notification settings - Fork 13.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
consul: rewrite nixos module #107798
consul: rewrite nixos module #107798
Conversation
This PR has some backwards incompatible changes to the module's configuration. I am happy to adjust that if necessary. |
Great. Maybe you can submit a PR to actually include that |
Is it explains somewhere how that works, e.g. will it still be overridable in NixOS? |
I'm not sure. I don't see it in the manual... It is fairly straightforward, though. If
The idea is to get as much upstream as possible:
As a simple example we want |
I agree with you here. Is an upstream unit a blocker to get this PR merged? This service hasn't seen any maintenance in NixOS for a while, so I do think there's benefit to getting this in before an upstream unit is merged. |
Most certainly not, but if you circle back around to that later you get full bonus points ✨ |
Thanks for the PR! I will make a bit longer comment on the after work, if I can fit it in today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consul: rewrite nixos module
This commit message will need some significant expansion:
I know multiple production setups where Consul is a critical infrastructure part that will go down if something in the Consul NixOS module breaks (including my own), so it would be great if every significant change to the module could be clearly outlined in the commit message.
I will try to point at some of them in the in-code comments with Significant enough for changelog entry
.
The user-visible changes also need to go into a NixOS release notes entry (the equivalent of nixos/doc/manual/release-notes/rl-2103.xml
).
I am also wondering if it makes sense to keep the old module for 1 release, in addition to the new module proposed here (e.g. keep the old one as services.consul_deprecated
or so), so that people who use this in production servers have an easier migration path. I don't know if there is something similar in nixpkgs for service modules yet, I'm just quite sure that this will be a risky upgrade for my org at least. (So I'm not requesting this as a change here yet but would like to know what the PR author and other nixpkgs/consul users think.)
+ concatMapStrings (file: " -config-file=${file}") settingsFiles; | ||
ExecReload = "${pkgs.coreutils}/bin/kill --signal HUP $MAINPID"; | ||
ExecStop = optionalString cfg.leaveOnStop "${cfg.package}/bin/consul leave"; | ||
DynamicUser = cfg.dropPrivileges; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that the description
of dropPrivileges
should now be slightly adjusted to reflect this.
}; | ||
|
||
config = mkIf cfg.enable ( | ||
mkMerge [{ | ||
|
||
users.users.consul = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Significant enough for changelog entry
|
||
systemd.services.consul = { | ||
wantedBy = [ "multi-user.target" ]; | ||
after = [ "network.target" ] ++ systemdDevices; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this service was previously discovering host IP addresses in an unnecessarily complex way, when Consul supports using
go-sockaddr
templates in configuration
It's unclear to me whether go-sockaddr
does the equivalent of the removed ++ systemdDevices
, does anyone know more?
bindsTo = systemdDevices; | ||
restartTriggers = [ config.environment.etc."consul.json".source ] | ||
++ mapAttrsToList (_: d: d.source) | ||
(filterAttrs (n: _: hasPrefix "consul.d/" n) config.environment.etc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Significant enough for changelog entry
(the no-longer-using of consul.d
)
In particular, this changes how users of the module need to write modular configs.
For example, a common way how we would define additional checks so far was to create more consul.d
JSON files, e.g. like so:
{
systemd.services.consul.restartTriggers = [
config.environment.etc."consul.d/nginx-service.json".source
];
environment.etc."consul.d/nginx-service.json" = {
text = builtins.toJSON {
service = {
name = "nginx";
checks = [
{
id = "nginx-status-page";
name = "nginx status page";
http = "http://localhost:80/nginx_status";
interval = "1s";
timeout = "1s";
}
];
};
};
};
}
I think we need to explain how this is to be translated to the new setup.
I guess there are 2 ways to do it:
cfg.settingsFiles
- translating them to
checks = [ ]
as described in https://www.consul.io/docs/discovery/checks#multiple-check-definitions and putting them intocfg.settings
.
Does one of these have more benefits / should we recommend one over the other?
For the second approach, does type = format.type;
take care to concatenate lists if set from multiple different NixOS modules?
@@ -40,15 +29,6 @@ in | |||
''; | |||
}; | |||
|
|||
|
|||
webUi = mkOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of simply removing the config.
options, I think it would be better to make use of them emit an explicit error message, that explains what to use instead (can be removed in the subsequent NixOS release).
@@ -40,15 +29,6 @@ in | |||
''; | |||
}; | |||
|
|||
|
|||
webUi = mkOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe nixos/tests/consul.nix
needs to be adjusted to still evaluate given this change.
The NixOS tests fail, which is misleading: The Github action https://github.com/NixOS/nixpkgs/pull/107798/checks?check_run_id=1617881367 says
So it says success when in fact it wasn't successful. What can we do against that in general? From https://logs.nix.ci/?key=nixos/nixpkgs.107798&attempt_id=2ae55234-6480-4039-a2ae-1214c7ec5326 the actual failure output is:
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/can-we-make-ofborg-show-broken-tests-as-red/10717/1 |
@nh2 Thanks for your review! I think this PR is probably easier to review and get merged if I split it into a number of smaller ones:
|
In light of that I'll close this one out and start on the split. |
Since NixOS is so flexible I don't think there is any value in that. You can just import the module from |
@cpcloud Not sure if it's easier/faster, as it'll require more roundtrips, and it won't allow people to see the entire plan at once as a single commit or multiple commits in a single PR would, but I'll try to review any approach you pick.
Do you want to do the release notes at the end or also add them step by step (just so I know whether I should check for them in the individual PRs)? |
@nh2 exactly 👍 |
Motivation for this change
This is a major simplification of the
consul
service exposed in nixos.The primary motivation was to match the systemd unit settings provided by Hashicorp here.
Additionally, this service was previously discovering host IP addresses in an unnecessarily complex way, when Consul supports using
go-sockaddr
templates in configuration.Things done
sandbox
innix.conf
on non-NixOS linux)nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
./result/bin/
)nix path-info -S
before and after)