NixOS 24.05 creates a new letsencrypt account and breaks CAA #316608
Comments
And you will, of course, very rapidly hit the ultra-frustrating letsencrypt rate limits again. They really ought to revise those policies as it makes relatively minor bugs like this a much bigger pain in the ass than they should be.
cc @NixOS/acme
Did you update from 23.11?
Yes.
Suspect 4494fca, because the two different directories I have now match the SHA256 of
I honestly have no idea why this happened. The module hasn't had any significant changes for the past few years and nothing stands out in the recent history to me.
Wow @stephank good catch. Yes sounds like that is the issue
So what's the plan going forward? IMHO we should fix this whitespace and backport it to stable, assuming there are more people who haven't yet upgraded to the latest release than who have. Also, there might be a chance that people who already upgraded still have their old accounts available, no?
Apparently this bug was brought up during review: #270221 (comment) :(
Is this a solid assumption? With a maintenance window of just one month, there is quite some pressure to update NixOS.
The issue is having to update the CAA records in the DNS; being able to still use the old account does not really matter.
Maybe we can keep the current method, but if the correct accounts directory doesn't exist, check if one exists under the old hash and move it?
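The "move it if only the old one exists" idea can be sketched as below. This is only an illustration in Python, not the module's actual implementation (which would live in Nix and shell); the function name `migrate_accounts_dir`, the directory names, and the return values are all hypothetical.

```python
import tempfile
from pathlib import Path


def migrate_accounts_dir(accounts_root: Path, old_hash: str, new_hash: str) -> str:
    """Move accounts_root/old_hash to accounts_root/new_hash if only the old one exists.

    Returns "kept", "migrated", or "nothing" to describe what happened.
    All names here are hypothetical and only illustrate the proposal.
    """
    old_dir = accounts_root / old_hash
    new_dir = accounts_root / new_hash
    if new_dir.exists():
        # An account already lives under the current hash (e.g. the machine
        # was upgraded earlier): leave everything alone.
        return "kept"
    if old_dir.is_dir():
        # Only the pre-upgrade account exists: reuse it under the new name.
        old_dir.rename(new_dir)
        return "migrated"
    # Fresh install: nothing to migrate, a new account gets registered as usual.
    return "nothing"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "oldhash").mkdir()
        (root / "oldhash" / "account.json").write_text("{}")
        print(migrate_accounts_dir(root, "oldhash", "newhash"))
        print(migrate_accounts_dir(root, "oldhash", "newhash"))
```

Because the move only happens when the new directory is absent, machines that already registered a new account keep it, so the migration cannot flip accounts back for people who have already upgraded.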
What's really bad is that @NixOS/acme was never pinged on the issue and the release notes were never updated. This is currently a tire fire for everyone switching to it: they either run into rate limits or have to update their CAA records (if they notice at all).
We could create a patch that migrates the directory like @stephank proposes. Suggestion in the NixOS ACME/LetsEncrypt channel was similar. Something along the lines of:
In any case we should still add a note to the release notes, as this migration doesn't help people who have already upgraded.
I've updated probably 25 VMs, all with certs, and didn't run into any problems; I had already forgotten that this was a problem 4 months ago. I assume this is only a problem if you have the account ID in the CAA record or if you have way too many HTTP challenge certs with different accounts.
We cannot just do that, as it would cause the same breakage in the opposite direction for anyone who has been on unstable for the last 4 months and already migrated, or who recently installed 24.05.
Since #106857 landed, the rate-limit issue is probably less of a problem in practice. I think we should really only care about breaking people's CAA setup. In that case I think we can get away with just an entry in the release notes; state convergence can be seen as a nice-to-have rather than a must-have.
At the Flying Circus, there's a high likelihood we're going to be affected by this nonetheless, despite the head-of-line registration blocker setup you mentioned, @arianvp. We have several hundred machines that'll request a new account when rolling out 24.05.
https://letsencrypt.org/docs/rate-limits/
When having to fix this downstream, I'd probably adopt a state convergence activation script approach.
I'll work on a patch to re-allow null tonight. Will post here for review in a bit!
…p old accounts directory The accounts directory is based on the hash of the settings. NixOS#270221 changed the default of security.acme.defaults.server from null to the default letsencrypt URL; however, as an unwanted side effect, this means the accounts directory changes and the ACME module will create a new account. This can cause issues for people using CAA records that pin the account ID or people who have datacenter-scale NixOS deployments. We allow setting this option to null again for people who want to keep the old account and migrate at their own leisure. Fixes NixOS#316608 Co-authored-by: Arian van Putten <arian.vanputten@gmail.com>
…p old accounts directory The accounts directory is based on the hash of the settings. #270221 changed the default of security.acme.defaults.server from null to the default letsencrypt URL; however, as an unwanted side effect, this means the accounts directory changes and the ACME module will create a new account. This can cause issues for people using CAA records that pin the account ID or people who have datacenter-scale NixOS deployments. We allow setting this option to null again for people who want to keep the old account and migrate at their own leisure. Fixes #316608 Co-authored-by: Arian van Putten <arian.vanputten@gmail.com> (cherry picked from commit d1f07e6)
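To illustrate why changing a default alone is enough to produce a different accounts directory, here is a rough Python sketch. It assumes, purely for illustration, that the directory name is a truncated SHA-256 over a string built from the server, key type, and email; the exact string layout, the truncation length, and the helper name `account_dir_name` are assumptions and not taken from the module.

```python
import hashlib
from typing import Optional

# Let's Encrypt production directory URL.
LETSENCRYPT_URL = "https://acme-v02.api.letsencrypt.org/directory"


def account_dir_name(server: Optional[str], key_type: str, email: str) -> str:
    """Derive a directory name from account settings (hypothetical layout)."""
    # Assumption: a null server renders as an empty string, so the settings
    # string starts with a space in that case.
    settings = f"{server or ''} {key_type} {email}"
    # Assumption: the name is the first 20 hex characters of the digest.
    return hashlib.sha256(settings.encode("utf-8")).hexdigest()[:20]


if __name__ == "__main__":
    before = account_dir_name(None, "ec256", "admin@example.com")
    after = account_dir_name(LETSENCRYPT_URL, "ec256", "admin@example.com")
    print(before)
    print(after)
    print("same directory:", before == after)
```

Both calls target the same CA, but the hashed strings differ, so the client sees no existing account under the new name and registers a fresh one. Allowing null again restores the old input string and therefore the old directory.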
After updating to 24.05, it appears that the ACME client (with the default infrastructure: security.acme/services.nginx.xxx.enableACME etc.) registers a new account for obtaining letsencrypt certificates. If you had CAA with account binding enabled on your domains, the certificate renewals will now fail.
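For readers unfamiliar with account binding: RFC 8657 lets a CAA `issue` record carry an `accounturi` parameter, so the CA only issues for that one ACME account. The Python sketch below shows the shape of such a record value and why a new account no longer matches it. The helper names and the account URIs are made up for illustration; the parsing is deliberately simplified and is not a full CAA parser.

```python
def caa_issue_value(ca_domain: str, account_uri: str) -> str:
    """Build the value of a CAA 'issue' record pinned to one ACME account."""
    return f'0 issue "{ca_domain}; accounturi={account_uri}"'


def caa_allows_account(record_value: str, account_uri: str) -> bool:
    """Simplified check: does the record's accounturi equal the given URI?"""
    # Take the quoted part, then look for an accounturi=... parameter.
    quoted = record_value.split('"')[1]
    params = [part.strip() for part in quoted.split(";")[1:]]
    for param in params:
        name, _, value = param.partition("=")
        if name.strip() == "accounturi":
            return value.strip() == account_uri
    # No accounturi parameter means the record is not bound to an account.
    return True


if __name__ == "__main__":
    # Hypothetical account URIs; real ones come from the ACME client's state.
    old_account = "https://acme-v02.api.letsencrypt.org/acme/acct/1111"
    new_account = "https://acme-v02.api.letsencrypt.org/acme/acct/2222"
    record = caa_issue_value("letsencrypt.org", old_account)
    print(record)
    print(caa_allows_account(record, old_account))
    print(caa_allows_account(record, new_account))
```

A record pinned to the pre-upgrade account rejects the freshly registered one, which is why renewals fail until the `accounturi` in DNS is updated or the old account is restored.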
If you are affected by the problem, here is how you can figure out the new account URI in order to update your CAA entries:
This should probably be added to the release notes.