Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] nixos/acme: switch to lego and support DNS challenge #63613

Closed
wants to merge 1 commit into from

Conversation

@petabyteboy
Copy link
Member

@petabyteboy petabyteboy commented Jun 21, 2019

Motivation for this change

We use Let's Encrypt with DNS-based ACME challenges at work, so I needed this. I have seen multiple other people request DNS challenge support (#34941, #35168).

  • I have not tested if HTTP challenge still works with this at all
  • Todo: Is anything else needed to support wildcard certificates?
  • Todo: Documentation on how to use DNS challenge

The following changes would be breaking:

  • As far as I could see, lego requires an email address to create an account
  • I could not see any possibility to specify different webroots when requesting one certificate for multiple domains
  • The plugin system that was used with simp_le is not supported. Some common files are generated in the post run script.
  • The remaining validity before renewal is now specified in days instead of minutes
Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nix-review --run "nix-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Assured whether relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@petabyteboy petabyteboy force-pushed the petabyteboy:feature/acme-lego branch from ffbf5f6 to 5ea566d Jun 21, 2019
@petabyteboy petabyteboy changed the title nixos/acme: switch to lego and support DNS challenge (WIP) [WIP] nixos/acme: switch to lego and support DNS challenge Jun 21, 2019
@petabyteboy
Copy link
Member Author

@petabyteboy petabyteboy commented Jun 21, 2019

@aanderse
Copy link
Contributor

@aanderse aanderse commented Jun 23, 2019

@aanderse
Copy link
Contributor

@aanderse aanderse commented Jun 23, 2019

@petabyteboy Could you show some examples of different dns providers? I'm specifically interested in an example for https://go-acme.github.io/lego/dns/httpreq/ if you don't mind.

Thanks for taking this on. As you mentioned, a much requested feature.

@petabyteboy
Copy link
Member Author

@petabyteboy petabyteboy commented Jun 23, 2019

My configuration for an Amazon Route53 domain looks like this:

  services.nginx.enable = true;
  services.nginx.virtualHosts."ci.nyantec.com" = {
    enableACME = true;
    forceSSL = true;
  };

  systemd.services.nginx.after = [ "acme-ci.nyantec.com.service" ];

  security.acme.certs = {
    "ci.nyantec.com" = {
      email = "mil@nyantec.com";
      dnsProvider = "route53";
      credentialsFile = "/var/src/secrets/route53-api-keys";
    };
  };

/var/src/secrets/route53-api-keys contains something like this:

AWS_HOSTED_ZONE_ID="XYZ"
AWS_ACCESS_KEY_ID="XYZ"
AWS_SECRET_ACCESS_KEY="XYZ"

For the httpreq provider it might look like this:

  services.nginx.enable = true;
  services.nginx.virtualHosts."example.org" = {
    enableACME = true;
    forceSSL = true;
  };

  systemd.services.nginx.after = [ "acme-example.org.service" ];

  security.acme.certs = {
    "ci.nyantec.com" = {
      email = "you@example.org";
      dnsProvider = "httpreq";
      credentialsFile = "/var/src/secrets/lego-httpreq-credentials";
    };
  };

with lego-httpreq-credentials containing:

HTTPREQ_ENDPOINT="http://example.org:9090"
@aanderse
Copy link
Contributor

@aanderse aanderse commented Jun 28, 2019

Thanks for the information @petabyteboy. Great work as usual. I hope to test this over the next week or so.

Copy link
Contributor

@aanderse aanderse left a comment

This seems to be pretty incompatible with existing setups. How are we going to migrate users of the existing module over to this new one, preferably without causing them any pain?

@petabyteboy
Copy link
Member Author

@petabyteboy petabyteboy commented Jun 28, 2019

I have listed some of the breaking changes in the initial post. Some of them might be fixable, but this needs work. If there is no way to make it compatible, I would suggest using lego only for installations with a recent stateVersion.

@arianvp
Copy link
Member

@arianvp arianvp commented Jun 30, 2019

@petabyteboy you mind if I cherry-pick this into https://github.com/NixOS/nixpkgs/pull/60219/commits ?

I'm pretty heavily rewriting the letsencrypt module, and having two diverging branches might cause trouble . Might be better to integrate this into my PR so that we won't have issues merging later on.

Keeping both lego and simp_le around sounds a bit too complicated to my taste. But breaking existing installations (especially ones usig the plugins option) is less than ideal

Another option:
Create a new nixos modules services.lego and keep the old one around. This way, we don't break old workflows... Though i'm not sure if that's ideal

@arianvp
Copy link
Member

@arianvp arianvp commented Jun 30, 2019

Also, perhaps we want to be able to specify the provider right from the nginx/lighttpd module?

@eyJhb
Copy link
Member

@eyJhb eyJhb commented Sep 14, 2019

@petabyteboy are you still working on this, or how is it? :)
Also, I might have some comments on how stuff is handled, e.g. the ToS shouldn't be accepted default, the user needs to active take action and agree (I would assume).

I might test this out, and see how it works!

@aanderse
Copy link
Contributor

@aanderse aanderse commented Sep 14, 2019

By enabling the services the user has agreed. I guess it would be worth mentioning that in the documentation... 🤷‍♂️

@petabyteboy
Copy link
Member Author

@petabyteboy petabyteboy commented Sep 14, 2019

I'm not actively working on it currently, but I haven't abandoned it.

@eyJhb
Copy link
Member

@eyJhb eyJhb commented Sep 14, 2019

By enabling the services the user has agreed. I guess it would be worth mentioning that in the documentation...

Does that work from a legal standpoint?

Also, I have made two improvements so far

  1. Don't just use /tmp/success, uses $cpath/renewed, plus it will delete it in the preScript
  2. Bug where it would make acme-challenge dirs, if we used a dnsProvider.

Don't know the best way to apply them, or add them to the PR.. :)

--- acme.nix.bak	2019-09-14 13:11:59.084248039 +0200
+++ acme.nix	2019-09-14 13:16:50.411259892 +0200
@@ -213,7 +213,7 @@
                 lpath = "${cfg.directory}/${cert}";
                 rights = if data.allowKeysForGroup then "750" else "700";
                 renewHook = pkgs.writeScript "lego-renew-hook" ''
-                  touch /tmp/success
+                  touch ${cpath}/renewed
                 '';
                 globalOpts = optionals (!cfg.production) ["--server" "https://acme-staging-v02.api.letsencrypt.org/directory"]
                           ++ concatLists (mapAttrsToList (name: root: [ "--domains" name ]) data.extraDomains)
@@ -243,10 +243,13 @@
                     fi
                     chmod ${rights} '${cpath}'
                     chown -R '${data.user}:${data.group}' '${cpath}'
-                    ${optionalString (data.dnsProvider != null) ''
+                    ${optionalString (data.dnsProvider == null) ''
                       mkdir -p '${data.webroot}/.well-known/acme-challenge'
                       chown -R '${data.user}:${data.group}' '${data.webroot}/.well-known/acme-challenge'
                     ''}
+                    if [ -e ${cpath}/renewed ]; then
+                      rm ${cpath}/renewed
+                    fi
                   '';
                   script = ''
                     cd '${cpath}'
@@ -262,7 +265,7 @@
                   postStop = ''
                     cd '${cpath}'
 
-                    if [ -e /tmp/success ]; then
+                    if [ -e ${cpath}/renewed ]; then
                       cp .lego/certificates/${data.domain}.crt cert.pem
                       cp .lego/certificates/${data.domain}.issuer.crt chain.pem
                       cp .lego/certificates/${data.domain}.key key.pem

EDIT: Wrong patch

@eyJhb
Copy link
Member

@eyJhb eyJhb commented Sep 14, 2019

--key-type might be worth exposing as well :)

@m1cr0man
Copy link
Contributor

@m1cr0man m1cr0man commented Nov 2, 2019

Started using this in my own production to generate wildcard certificates. It works but there was a few things I had to modify:

  • Incorporated the path change for renewed by @eyJhb
  • Added security.legoCerts to wrap the lego-based cert generation separately to the acme-simple generation and avoid forking nixpkgs.
  • Added an extraFlags field to certs to specify any extra flags you want to lego. I needed this to add --dns.disable-cp because I'm using rfc2136 + local bind and my network is weird.
  • A wildcard domain needs to be passed as an extraDomains and not just domain, because there's some usages of domain that can't handle it in a few places.
  • Since wildcard certs are created with the name _.$domain.$filetype I had to add a * in some paths in the postStop hook. I don't see this affecting previous functionality.
@@ -207,11 +207,11 @@ in {
                     cd '${cpath}'
 
                     if [ -e ${cpath}/renewed ]; then
-                      cp .lego/certificates/${data.domain}.crt cert.pem
-                      cp .lego/certificates/${data.domain}.issuer.crt chain.pem
-                      cp .lego/certificates/${data.domain}.key key.pem
-                      cat .lego/certificates/${data.domain}.crt .lego/certificates/${data.domain}.issuer.crt > fullchain.pem
-                      cat .lego/certificates/${data.domain}.key .lego/certificates/${data.domain}.crt .lego/certificates/${data.domain}.issuer.crt > full.pem
+                      cp .lego/certificates/*${data.domain}.crt cert.pem
+                      cp .lego/certificates/*${data.domain}.issuer.crt chain.pem
+                      cp .lego/certificates/*${data.domain}.key key.pem
+                      cat .lego/certificates/*${data.domain}.crt .lego/certificates/*${data.domain}.issuer.crt > fullchain.pem
+                      cat .lego/certificates/*${data.domain}.key .lego/certificates/*${data.domain}.crt .lego/certificates/*${data.domain}.issuer.crt > full.pem
                       chmod ${rights} "${cpath}/"{key,fullchain,full,chain,cert}.pem
                       chown '${data.user}:${data.group}' "${cpath}/"{key,fullchain,full,chain,cert}.pem
@wizeman
Copy link
Member

@wizeman wizeman commented Nov 4, 2019

@m1cr0man Are your improvements publicly available?

@m1cr0man
Copy link
Contributor

@m1cr0man m1cr0man commented Nov 4, 2019

@wizeman, I have it in a self hosted git but I created a gist out of it with a usage example here. If there's more interest I think it will be worth me talking to @petabyteboy about just contributing to or replacing this PR.

@aanderse aanderse added this to the 20.03 milestone Nov 4, 2019
@wizeman
Copy link
Member

@wizeman wizeman commented Nov 7, 2019

@m1cr0man Your module worked for me (with one caveat, see below). The extraFlags parameter was very useful because I needed to generate a key of a different type.

The only problem I experienced was that the acme module in NixOS specifies validMin to be 30 * 24 * 3600 by default. This is interpreted as a number of seconds in NixOS's acme module (which was really confusing -- both me and @petabyteboy interpreted the option as being a number of minutes at first, due to its name), but in your module this is interpreted as a number of days (just like in this PR). The end result is that a new certificate was being generated every week, because the default Let's Encrypt expiration of 90 days was always less than NixOS's default 30 * 24 * 3600.

We should really think about how to integrate this module into NixOS, as the DNS challenge is an extremely useful feature.

I subscribe to @arianvp's opinion that it's not ideal to maintain two modules that almost do the same thing, but I also don't like breaking existing installations.
So perhaps having two modules (temporarily, at least?) is the way forward? Perhaps we can even deprecate the old module to give people time to migrate their configuration (if at all possible).

If we're deprecating things, I'd also vote to change the validMin option to some other name that doesn't deceptively imply that the unit is "minutes".

@eyJhb
Copy link
Member

@eyJhb eyJhb commented Nov 7, 2019

@wizeman making it a new module and then marking the original as deprecated seems like a reasonable thing to do, but I would suggest only doing it, if we cannot make the current module backwards compatible (which should be possible, right?).

Also, is @petabyteboy still working on this? Would be lovely to see some of these changes in the PR.

@petabyteboy
Copy link
Member Author

@petabyteboy petabyteboy commented Nov 7, 2019

@wizeman making it a new module and then marking the original as deprecated seems like a reasonable thing to do, but I would suggest only doing it, if we cannot make the current module backwards compatible (which should be possible, right?).

No, it will not be possible to make it fully backwards-compatible, because lego does not provide some options, for example using a seperate webdir per domain with the HTTP challenge and multi-domain certs.
However it would be possible to cover most of the use cases of the current acme module.

@eyJhb
Copy link
Member

@eyJhb eyJhb commented Nov 7, 2019

@wizeman making it a new module and then marking the original as deprecated seems like a reasonable thing to do, but I would suggest only doing it, if we cannot make the current module backwards compatible (which should be possible, right?).

No, it will not be possible to make it fully backwards-compatible, because lego does not provide some options, for example using a seperate webdir per domain with the HTTP challenge and multi-domain certs.

For some reason I just remember both of these things being part of lego as is, but I might remember wrong.

But can't this be fixed using the following ways?

  1. Run a command pr. domain which requires a separate web dir, instead of running a single command.
  2. Post commands which combines the certs into a single certificate?
globalOpts = optionals (!cfg.production) ["--server" "https://acme-staging-v02.api.letsencrypt.org/directory"]
++ concatLists (mapAttrsToList (name: root: [ "--domains" name ]) data.extraDomains)
++ [ "--domains" data.domain "--email" data.email "--accept-tos" ]
++ (if data.dnsProvider != null then [ "--dns" data.dnsProvider ] else [ "--http.webroot" data.webroot ]);

This comment has been minimized.

@paumr

paumr Nov 8, 2019
Contributor

pretty sure it should be:
else [ "--http" "--http.webroot" data.webroot ]

};
path = with pkgs; [ simp_le systemd ];
path = with pkgs; [ lego systemd ];
preStart = ''
mkdir -p '${cfg.directory}'

This comment has been minimized.

@paumr

paumr Nov 8, 2019
Contributor

should this be: mkdir -p '${cfg.cpath}' ?

@m1cr0man
Copy link
Contributor

@m1cr0man m1cr0man commented Nov 8, 2019

@m1cr0man Your module worked for me (with one caveat, see below). The extraFlags parameter was very useful because I needed to generate a key of a different type.

We should really think about how to integrate this module into NixOS, as the DNS challenge is an extremely useful feature.

I subscribe to @arianvp's opinion that it's not ideal to maintain two modules that almost do the same thing, but I also don't like breaking existing installations.

Thanks for the feedback @wizeman , I must fix that validMin in my version.

I think it would be possible to make this backwards compatible. With regards to the custom webdirs - symlinks could be used to achieve this, and I don't see what would be wrong with doing so in the preStart hook for each domain's certs service.

@aanderse
Copy link
Contributor

@aanderse aanderse commented Nov 8, 2019

Let's take extra care not to complicate the module code too much. Also keep in mind that the existing module runs as a non root user now, so it would be great to continue that way.

@nixos-discourse
Copy link

@nixos-discourse nixos-discourse commented Nov 14, 2019

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/what-are-your-goals-for-20-03/4773/3

@aanderse
Copy link
Contributor

@aanderse aanderse commented Dec 3, 2019

I'll be so sad if this functionality isn't added to 20.03 😭

I believe the correct path forward is to replace security.acme.certs instead of creating a competing service. I think we should try our hardest to make this module as compatible as possible with the current service options, though not to the detriment of the new service.

If anyone disagrees can they please speak up.

@petabyteboy if there is agreement (or at least no disagreement) with what I have said are you motivated to continue on with this PR as described?

@m1cr0man
Copy link
Contributor

@m1cr0man m1cr0man commented Dec 3, 2019

FWIW, I am still up for taking this on to see if I can make it compatible with the existing setup. I am using in this prod for DNS and HTTP verified certs, but I haven't had time to clean it up to submit here.

@aanderse
Copy link
Contributor

@aanderse aanderse commented Dec 3, 2019

@m1cr0man that would be fantastic. Would you be willing to push your branch in its current state just so we can take a look?

What do you think your timeline might be like for cleaning it up? Realistic to have in place well before 20.03?

@m1cr0man
Copy link
Contributor

@m1cr0man m1cr0man commented Dec 3, 2019

Hi @aanderse all my current work is here and so will any updates I am doing. I'm currently writing a bit of Nix to map a list of vhosts to some HTTP certs. After I've succeeded with that I will immediately try standardising my setup so I can PR it back into nixpkgs and remove it from my repo. This is actually quite high priority for me and I'd be aiming to get it done before the end of the year, although maintaining this infra is a part time job for me at the same time ;)

@aanderse
Copy link
Contributor

@aanderse aanderse commented Dec 3, 2019

@m1cr0man fantastic news! Thank you very much for your efforts.

@petabyteboy
Copy link
Member Author

@petabyteboy petabyteboy commented Dec 3, 2019

If someone else is working on this, feel free to open a PR. I have tried to rebase this multiple times but never with an acceptable result, and since the backwards-compatibility requirements are higher than expected I will probably not make it work in time.

@m1cr0man
Copy link
Contributor

@m1cr0man m1cr0man commented Dec 18, 2019

Over the weekend I deployed a new web host which now serves 28 different domains. 1 of them is using wildcard LE certs, the other 27 are using HTTP-01 challenge. I had to make a few changes to the Lego based cert service to achieve this:

  • Shared .lego state directory for all certs (to avoid accounts rate limit)
  • Add a config option to specify where the state directory is
  • Add --http flag to lego when using http based validation
  • Avoid looking up DNS specific settings (credentials file) when DNS-01 is not being used

This summary is just a TL;DR, I will open a PR very soon to merge this in to nixpkgs. I'm eager to do so since this is in production now. I have 2 things to do:

  • Merge with the standard security.acme.certs (I created security.acme.legoCerts so I could override things)
  • Test a migration of a cert from old acme to lego to see what happens and where the incompatibilities are.
@aanderse
Copy link
Contributor

@aanderse aanderse commented Dec 18, 2019

Fantastic news! Thanks for the update @m1cr0man 🎉

@leo60228
Copy link
Contributor

@leo60228 leo60228 commented Dec 19, 2019

I just enabled forceACME on a new domain and got an error from LE saying to switch to ACMEv2.

@aanderse
Copy link
Contributor

@aanderse aanderse commented Jan 8, 2020

@m1cr0man happy new year! Any news on this?

@m1cr0man
Copy link
Contributor

@m1cr0man m1cr0man commented Jan 8, 2020

@aanderse All I can really say is yes, but I was set back by the death of my main NixOS server over the weekend. I'll be testing the lego based certs ontop of a machine currently running the old cert setup this weekend and I'll likely have the PR in this weekend too.

@m1cr0man m1cr0man mentioned this pull request Jan 12, 2020
3 of 10 tasks complete
@petabyteboy
Copy link
Member Author

@petabyteboy petabyteboy commented Jan 13, 2020

Replaced by #77578

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

10 participants
You can’t perform that action at this time.