ACME renewals fail due to DNS being unavailable during switch #85794
Comments
My gut feeling this is because of me using |
I don't think dbus.service should reload? |
Relatedly it tried to restart
Which is useful because it didn't disconnect me whilst remotely applying; but also sounds like a bug in our service file? |
ah no the problem is that it also stopped and started Usually you'd solve this ordering problem by ordering units
What do you think @worldofpeace ? |
I am rather confused. AFAIK
|
As per discussion, this isn't a regression, but so far we don't ship the dbus-activated units that systemd ships, causing resolved to not auto-activate. We'll create a follow-up issue to enable it on unstable. |
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/make-acme-renew-systemd-service-depend-on-dns-nss-lookup/7412/6 |
This seems to be fixed on 20.09. On 20.03, when using
and I couldn't I found that according to https://logs.nix.samueldr.com/nixos/2020-02-23#3099629; @infinisil had a very similar problem with a server that ran its own DNS server. For me the upgrade to 20.09 on the same machine fixes it. But I'm not sure which concrete nixpkgs change fixed it. |
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/make-acme-renew-systemd-service-depend-on-dns-nss-lookup/7412/7 |
I'm reopening this issue as it wasn't entirely solved by #99901. I've been looking into it further and the only real solution to verify that DNS is working during startup would be to push it into a |
Switching to a socket-activated DNS helps but does not eliminate errors in the acme units entirely.
which makes me wonder whether this unit should retry itself. Perhaps not, because of Let's Encrypt's rate limits. It runs on a systemd timer anyway. So perhaps the solution is to avoid running the acme client during switch whenever possible. One way to do that would be to duplicate the service unit: run the otherwise unmodified unit on a timer only, and change the new unit to run during activation only and short-circuit when a certificate is already present. |
#114752 runs the expiration check offline, so we usually don't need network during switch. |
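To illustrate the idea of an offline expiration check, here is a minimal sketch. It is not the code from #114752; the function name `needs_renewal` and the 30-day threshold are assumptions made up for this example. The point is only that the decision to start the ACME client can be made from the certificate's expiry time alone, without DNS or any other network access.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical threshold chosen for illustration only.
RENEW_BEFORE = timedelta(days=30)


def needs_renewal(
    not_after: Optional[datetime],
    now: datetime,
    renew_before: timedelta = RENEW_BEFORE,
) -> bool:
    """Decide, without any network access, whether the ACME client must run.

    not_after is the expiry time read from the certificate on disk,
    or None if no certificate exists yet.
    """
    if not_after is None:
        # No certificate yet: a network run cannot be avoided.
        return True
    # Renew once we are inside the renewal window before expiry.
    return now >= not_after - renew_before


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # A certificate valid for another 60 days needs no network run.
    print(needs_renewal(now + timedelta(days=60), now))
```

With a check like this, a switch that happens while DNS is briefly down only fails when a certificate is actually missing or inside its renewal window.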
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/configure-acme-to-retry-challenge-multiple-times/12118/4 |
I marked this as stale due to inactivity. |
Closes #129838. It is possible for the CA to revoke a cert that has not yet expired. We must run lego to validate this before expiration, but we must still ignore failures on unexpired certs to retain compatibility with #85794. Also changed the domainHash logic such that a renewal will only be attempted if the domains are unchanged, and a full run is done otherwise. Resolves #147540, but will be partially reverted when go-acme/lego#1532 is resolved and available.
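The two rules in this commit message can be sketched as follows. This is an illustrative model only, not the actual NixOS module logic; the names `Action`, `choose_action` and `unit_should_fail` are hypothetical, and treating a missing stored hash as "domains changed" is an assumption.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    FULL_RUN = "full run"
    RENEW = "renew"


def choose_action(stored_domain_hash: Optional[str], current_domain_hash: str) -> Action:
    """Attempt a renewal only if the domain set is unchanged.

    Assumption: a missing stored hash (first run) is treated like a
    changed domain set and triggers a full run.
    """
    if stored_domain_hash is None or stored_domain_hash != current_domain_hash:
        return Action.FULL_RUN
    return Action.RENEW


def unit_should_fail(client_succeeded: bool, cert_still_valid: bool) -> bool:
    """Ignore client failures while the existing certificate is unexpired."""
    if client_succeeded:
        return False
    # A failure only matters when there is no usable certificate left.
    return not cert_still_valid


if __name__ == "__main__":
    print(choose_action("abc", "abc"))
    print(unit_should_fail(client_succeeded=False, cert_still_valid=True))
```

Under this model the client still runs before expiry, so a revocation can be noticed, but a transient DNS failure during a switch does not fail the unit as long as the certificate on disk is still valid.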
Should have been fixed by #147784. |
Describe the bug
I nixos-rebuild switch'd on my server with a bunch of ACME certs. It failed due to DNS not being available.
To Reproduce
Expected behavior
ACME Certs renew as expected
Notify maintainers
Metadata
Please run
nix-shell -p nix-info --run "nix-info -m"
and paste the result.
Maintainer information: