Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace simp-le with lego and support DNS-01 challenge #77578

Merged
merged 10 commits into from Feb 10, 2020
Merged

Conversation

@m1cr0man
Copy link
Contributor

@m1cr0man m1cr0man commented Jan 12, 2020

Motivation for this change

Continuation of #63613

Closes #34941, #35168

I have the following things still to do:

  • Add a deprecation notice for security.acme.certs.<name>.plugins (I don't know how, need help)
  • Similarly add some equivalent of mkChangedOptionModule to convert extraDomains to a list since custom webroots are no longer possible.

It is mostly backwards compatible, with a few caveats.

  • extraDomains can no longer have different webroots to the main webroot for the cert.
  • An email address is now mandatory for CA registration. I added a more globally scoped security.acme.email to make transitioning easier, and an assertion.
  • Accepting terms of service was a topic discussed in the other PR. For now, I have added
    security.acme.acceptTerms and defaulted it to true to avoid breaking user's configs. There may be reason to change this default depending on what reviewers think.

The following other changes were required:

  • Deprecate security.acme.certs.<name>.plugins, as this was specific to simp-le
  • Rename security.acme.validMin to validMinDays, to avoid confusion and errors. Lego requires the TTL to be specified in days. I have added a mkChangedOptionModule to cover this.
  • Add options to cover DNS-01 challenge (dnsProvider, credentialsFile, dnsPropagationCheck)
  • A shared state directory is now used (/var/lib/acme/.lego) to avoid account creation rate limits and share credentials between certs

I have tested converting my own simp-le generated certs to lego and it worked fine. I also have lego in production generating 28 HTTP validated certs and a DNS-01 cert with a bind DNS server (also on a NixOS box 馃槃 ). This is a non-breaking change if you haven't used custom webroots or plugins.

Also I have tried to keep the diff as small as possible but that other PR was so old it took me a full day to manually rebase on top of all the more recent changes to the ACME module. I'd be tempted to do some other cleanup to the module but there's good git blame as it is now.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
m1cr0man added 2 commits Oct 18, 2019
Lego allows users to use the DNS-01 challenge to validate their
certificates. It is mostly backwards compatible, with a few
caveats.

 - extraDomains can no longer have different webroots to the
   main webroot for the cert.
 - An email address is now mandatory for account creation

The following other changes were required:
 - Deprecate security.acme.certs.<name>.plugins, as this was
   specific to simp-le
 - Rename security.acme.validMin to validMinDays, to avoid
   confusion and errors. Lego requires the TTL to be specified in
   days
 - Add options to cover DNS challenge (dnsProvider,
   credentialsFile, dnsPropagationCheck)
 - A shared state directory is now used (/var/lib/acme/.lego)
   to avoid account creation rate limits and share credentials
   between certs
++ concatMap (p: [ "-f" p ]) data.plugins
++ concatLists (mapAttrsToList (name: root: [ "-d" (if root == null then name else "${name}:${root}")]) data.extraDomains)
email = if data.email == null then cfg.email else data.email;
globalOpts = [ "-d" data.domain "--email" email "--path" "." ]

This comment has been minimized.

@m1cr0man

m1cr0man Jan 12, 2020
Author Contributor

A quick note on --path .. Somewhere between lego 3.0.2 and lego 3.2.0, the default path changed from ./.lego to ~/.lego. I had to add this statement so that lego always runs in the service's WorkingDirectory.

@m1cr0man
Copy link
Contributor Author

@m1cr0man m1cr0man commented Jan 12, 2020

Ping @aanderse

Finally, the PR is here.

@aanderse
Copy link
Contributor

@aanderse aanderse commented Jan 12, 2020

Thanks @m1cr0man! This is great work. I hope to review and test this over the next few weeks, switching some of my test (and hopefully production) systems over to this.

@arianvp
Copy link
Member

@arianvp arianvp commented Jan 13, 2020

I will have a ook at this Tuesday. Thanks for the work

# Only try loading the credentialsFile if the dns challenge is enabled
EnvironmentFile = if data.dnsProvider != null then data.credentialsFile else null;
ExecStart = pkgs.writeScript "acme-start" ''
#!${pkgs.runtimeShell} -e

This comment has been minimized.

@arianvp

arianvp Jan 13, 2020
Member

Is it not possible to do what we want in one run of the binary?

This comment has been minimized.

@Mic92

Mic92 Jan 13, 2020
Contributor

Does not look like it does: go-acme/lego#216

This comment has been minimized.

@m1cr0man

m1cr0man Jan 13, 2020
Author Contributor

In order to enforce the validMinDays you have to run renew instead of run, which seems like an oversight in lego honestly. Run creates the account credentials if they don't exist but forces a new cert to be generated. I was tempted to add explicit logic to check for the cert's credentials files but I feel that's too solution specific. This conditional logic is the same as in the other PR.

Copy link
Contributor

@aanderse aanderse left a comment

A couple notes as I work through this...

Some issues with webroot at the moment. The tmpfiles rules reference webroot but for DNS certs this won't be defined.

nixos/modules/security/acme.nix Outdated Show resolved Hide resolved
@@ -173,6 +189,15 @@ in
'';
};

acceptTerms = mkOption {

This comment has been minimized.

@aanderse

aanderse Jan 13, 2020
Contributor

What do other people think of this? Personally I think it is overkill... but am fairly indifferent I guess 馃し鈥嶁檪锔

This comment has been minimized.

@mweinelt

mweinelt Jan 19, 2020
Member

If Let's Encrypt wants users to accept their terms I don't think implicitly accepting it for endusers should be the way to go. This knob is the exact place where we can point users to the terms of services of Lets Encrypt.

This doesn't make much sense though when overriding the API endpoints (-s|--server) because then we can't know where the correct ToS are.

This comment has been minimized.

@arianvp

arianvp Jan 27, 2020
Member

I don't have a really strong feeling about this. We were implicitly accepting it before no? This seems to make the behaviour slightly better than before

@m1cr0man
Copy link
Contributor Author

@m1cr0man m1cr0man commented Jan 15, 2020

If anyone can help me on my two todo's mentioned in the description that would be great. I tried adding a regular mkRemovedOptionModule under certOpts, with a relative attribute path. In order to change extraDomains I tried mkChangedOptionModule but I doubt this works since the attribute path has not changed, and the warning it would produce wouldn't explain the change to a list.

   cfg = config.security.acme;
 
   certOpts = { name, ... }: {
+    imports = [
+      (mkRemovedOptionModule [ "plugins" ] ''
+        This option has been removed since simp-le was replaced with LEGo.
+        You may be able to achieve the desired functionality with postRun
+      '')
+      (mkChangedOptionModule [ "extraDomains" ] [ "extraDomains" ] (config: let
+        domains = config.security.acme."${name}".extraDomains;
+      in if isList domains then domains else attrNames domains))
+    ];
+

These are the errors I got:

The option `security.acme.certs.mydomain.warnings' defined in `/root/nixpkgs/nixos/modules/security/acme.nix' does not exist.
# Removing the mkChangedOptionModule....
The option `security.acme.certs.mydomain.assertions' defined in `/root/nixpkgs/nixos/modules/security/acme.nix' does not exist.
@aanderse
Copy link
Contributor

@aanderse aanderse commented Jan 15, 2020

If anyone can help me on my two todo's mentioned in the description that would be great. I tried adding a regular mkRemovedOptionModule under certOpts, with a relative attribute path. In order to change extraDomains I tried mkChangedOptionModule but I doubt this works since the attribute path has not changed, and the warning it would produce wouldn't explain the change to a list.

@m1cr0man this is what @Infinisil had to say on the matter: https://logs.nix.samueldr.com/nixos-dev/2020-01-15#1579087078-1579096064

@Mic92
Copy link
Contributor

@Mic92 Mic92 commented Jan 16, 2020

It also seems that simp-le still uses acmev1 which will be deprecated by letsencrypt in June.

@arianvp
Copy link
Member

@arianvp arianvp commented Jan 16, 2020

@Mic92 nope don't worry; we already addressed this #71953

@aanderse
Copy link
Contributor

@aanderse aanderse commented Jan 16, 2020

I'm very happy with the new DNS challenge functionality. After a few minor changes I mentioned it worked flawlessly. Next step is to test migration of existing HTTP challenge based certificates.

Great work @m1cr0man!

@arianvp
Copy link
Member

@arianvp arianvp commented Jan 16, 2020

@aanderse how do you envision that migration path? for example; would keeping around account information and private keys from simp_le around be a requirement for you?

@m1cr0man
Copy link
Contributor Author

@m1cr0man m1cr0man commented Jan 16, 2020

@arianvp Those are already being kept around. I'm not explicitly deleting the JSON files in /var/lib/acme/${name} so you can revert back pretty painlessly.

@m1cr0man
Copy link
Contributor Author

@m1cr0man m1cr0man commented Feb 3, 2020

Sorry for the triple-comment, but I just want to point out that I never figured out how to add the assertions to the submodule. The plugins attribute will raise an assertion anyway since it's not defined anymore. extraModules is currently unchanged but setting a custom webroot is now impossible so it makes more sense to convert it to a list. I'm happy to merge without these and just the tests for now, and work on them later.

@nixos-discourse
Copy link

@nixos-discourse nixos-discourse commented Feb 6, 2020

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixos-20-03-feature-freeze/5655/27

@m1cr0man
Copy link
Contributor Author

@m1cr0man m1cr0man commented Feb 9, 2020

Ok it was super difficult to figure out, but I got the DNS-01 challenge test working. However in doing this I've somehow broken the test for b.example.com and I'm not sure how. Would someone else be able to take a look at it(@arianvp)?

@m1cr0man
Copy link
Contributor Author

@m1cr0man m1cr0man commented Feb 9, 2020

Fixed it - http-01 validation was running before nginx was started. :) This is ready for review again

nixos/tests/acme.nix Outdated Show resolved Hide resolved
user = config.services.nginx.user;
group = config.services.nginx.group;
Comment on lines +119 to +120

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

Why is this necessary? Can't we use the per-vhost enableACME option in the nginx module, and just reconfigure the dns provider in the acme module?

This comment has been minimized.

@m1cr0man

m1cr0man Feb 9, 2020
Author Contributor

I'm testing wildcard certs. You can't specify a wildcard cert through enableACME, and that option is tested elsewhere. Personally I use the cert for more than just nginx so I want to ensure it works hard way.

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

I'm not sure - I'd expect serverAliases to work with that - if not, I'd consider it a bug.

I just find this test hard to understand. - If we can't use the nginx-provided ACME functionalities, probably it'd be more understandable to just not use nginx in the test, only configure the acme module, and check test certificate files appear.

This comment has been minimized.

@m1cr0man

m1cr0man Feb 9, 2020
Author Contributor

I don't think it's sufficient to simply check does the file exist to truly test wildcard certs. It should work when used with any service (in this case, nginx) serving under that domain. Using nginx here keeps things consistent with the other tests. You're right I could've jused used serverAliases and added *.example.com there and it would've worked fine, but I'm testing the acme module, not nginx.

This comment has been minimized.

@flokli

flokli Feb 10, 2020
Contributor

I meant using nginx, but not the acme-specific support of it makes the test much more verbose. Maybe a nginx test withserverAliases, and a simple one which just looks at the certificate via openssl x509 -in full.pem -text -noout | grep DNS?

user = config.services.nginx.user;
group = config.services.nginx.group;
};
systemd.targets."acme-finished-example.com" = {};

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

Why do we do that? This looks like you're workarounding something鈥

This comment has been minimized.

@m1cr0man

m1cr0man Feb 9, 2020
Author Contributor

There's a comment on the earlier ones. It's so that it can be waited upon in the tests.

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

Please copy that comment over - I got drowned in the rest.

systemd.services."acme-example.com" = {
wants = [ "acme-finished-example.com.target" ];
before = [ "acme-finished-example.com.target" "nginx.service" ];
wantedBy = [ "nginx.service" ];
};
Comment on lines +123 to +127

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

Same here - I wouldn't expect to have to specify that.

This comment has been minimized.

@m1cr0man

m1cr0man Feb 9, 2020
Author Contributor

nginx was starting after the certs consistently when I was testing, causing the HTTP-01 validation to fail. Roll it back and run nix-build acme.nix and you'll see what I mean

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

That's only for the wildcard certificate, right?

This comment has been minimized.

@m1cr0man

m1cr0man Feb 9, 2020
Author Contributor

No it was for the HTTP-01 challenge certs, aka the ones using enableACME

wantedBy = [ "nginx.service" ];
};
services.nginx.virtualHosts."c.example.com" = {
forceSSL = true;

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

why can't we use enableACME here?

This comment has been minimized.

@m1cr0man

m1cr0man Feb 9, 2020
Author Contributor

I'm testing wildcard certs. I don't want a cert for c.example.com

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

see above.

sslCertificate = config.security.acme.certs."example.com".directory + "/cert.pem";
sslTrustedCertificate = config.security.acme.certs."example.com".directory + "/full.pem";
sslCertificateKey = config.security.acme.certs."example.com".directory + "/key.pem";
Comment on lines +130 to +132

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

This is then probably not necessary anymore, too.

Merge remote-tracking branch 'remotes/upstream/master'
user = config.services.nginx.user;
group = config.services.nginx.group;
};
systemd.targets."acme-finished-example.com" = {};

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

Please copy that comment over - I got drowned in the rest.

user = config.services.nginx.user;
group = config.services.nginx.group;

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

I'm not sure - I'd expect serverAliases to work with that - if not, I'd consider it a bug.

I just find this test hard to understand. - If we can't use the nginx-provided ACME functionalities, probably it'd be more understandable to just not use nginx in the test, only configure the acme module, and check test certificate files appear.

systemd.services."acme-example.com" = {
wants = [ "acme-finished-example.com.target" ];
before = [ "acme-finished-example.com.target" "nginx.service" ];
wantedBy = [ "nginx.service" ];
};

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

That's only for the wildcard certificate, right?

systemd.targets."acme-finished-b.example.com" = {};
systemd.services."acme-b.example.com" = {
wants = [ "acme-finished-b.example.com.target" ];
before = [ "acme-finished-b.example.com.target" ];
after = [ "nginx.service" ];
};
services.nginx.virtualHosts."b.example.com" = {
enableACME = true;

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

we should name these "subdomain certs" a bit more self-explaining, like "webroot-nginx" maybe?

wantedBy = [ "nginx.service" ];
};
services.nginx.virtualHosts."c.example.com" = {
forceSSL = true;

This comment has been minimized.

@flokli

flokli Feb 9, 2020
Contributor

see above.

@disassembler
Copy link
Member

@disassembler disassembler commented Feb 10, 2020

@m1cr0man are you going to be able to get this addressed before feature freeze tomorrow?

@flokli
Copy link
Contributor

@flokli flokli commented Feb 10, 2020

@m1cr0man
Copy link
Contributor Author

@m1cr0man m1cr0man commented Feb 10, 2020

@flokli in general I found these acme tests to be a bit too complex. I would love to rewrite the whole lot of them and incorporate some Apache based tests too, but I can't do that today. I feel that given that there is at least a working test for DNS-01 and the numerous production instances of this code now running that it is stable for 20.03

@arianvp
Copy link
Member

@arianvp arianvp commented Feb 10, 2020

@flokli
Copy link
Contributor

@flokli flokli commented Feb 10, 2020

Okay, let's merge this, and improve the tests later on.

@flokli flokli merged commit 4e0fea3 into NixOS:master Feb 10, 2020
13 checks passed
13 checks passed
Evaluation Performance Report Evaluator Performance Report
Details
grahamcofborg-eval ^.^!
Details
grahamcofborg-eval-check-maintainers matching changed paths to changed attrs...
Details
grahamcofborg-eval-check-meta config.nix: checkMeta = true
Details
grahamcofborg-eval-darwin nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A darwin-tested
Details
grahamcofborg-eval-nixos nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./nixos/release-combined.nix -A tested
Details
grahamcofborg-eval-nixos-manual nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./nixos/release.nix -A manual
Details
grahamcofborg-eval-nixos-options nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./nixos/release.nix -A options
Details
grahamcofborg-eval-nixpkgs-manual nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A manual
Details
grahamcofborg-eval-nixpkgs-tarball nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A tarball
Details
grahamcofborg-eval-nixpkgs-unstable-jobset nix-instantiate --arg nixpkgs { outPath=./.; revCount=999999; shortRev="ofborg"; } ./pkgs/top-level/release.nix -A unstable
Details
grahamcofborg-eval-package-list nix-env -qa --json --file .
Details
grahamcofborg-eval-package-list-no-aliases nix-env -qa --json --file . --arg config { allowAliases = false; }
Details
@m1cr0man m1cr0man mentioned this pull request Feb 23, 2020
7 of 10 tasks complete
@emilazy emilazy mentioned this pull request Feb 23, 2020
5 of 10 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can鈥檛 perform that action at this time.