Compliance with Let's Encrypt Integration Guide #71

Closed
9 of 11 tasks
jcgruenhage opened this issue Jan 1, 2023 · 9 comments


@jcgruenhage
Contributor

jcgruenhage commented Jan 1, 2023

Let's Encrypt has an integration guide for hosting providers and for client software meant to be used with Let's Encrypt. It lists a few things they want those users to implement, and most clients fail to implement them in one way or another.

The integration guide is available over at https://letsencrypt.org/docs/integration-guide/. I'll try to summarize the requirements below, and whether acmed is implementing it in a compliant way, if that requirement is applicable at all. I'll not add a checkmark iff there is a recommendation that is definitely applicable to acmed and which acmed doesn't apply properly.

  • One Account or Many?: Let's Encrypt recommends using a single account. While it's required for acmed to support multiple accounts, because different providers could require different key algorithms for example, the documentation could be improved here to recommend sticking to a single account. Even worse than not recommending a single account though, the README even encourages using multiple accounts (at least when using multiple endpoints). Edit/Clarification: the recommendation from LE is just about larger hosting providers, and the recommendation from the README is just about different endpoints, so my initial assessment here was off.
  • Multi-domain (SAN) Certificates: This is just some text about what drawbacks it might have to have multiple certificates vs one certificate with multiple domains, not really applicable to acmed
  • Storing and Reusing Certificates and Keys: acmed provides hooks for storing files and reusing them properly, so this is definitely okay. What Let's Encrypt says here is to not generate certificates in volatile environments such as short lived containers, which is not really applicable to acmed.
  • Picking a Challenge Type: As acmed supports all current challenge types, this is in good shape right now.
  • Central Validation Servers: Not really applicable to acmed, as this depends more on the deployment than the acme client.
  • Implement OCSP Stapling: Definitely not applicable, this is the responsibility of the TLS server.
  • Firewall Configuration: Not applicable
  • Supported Key Algorithms: Except for 3072 bit RSA, all algorithms supported by Let's Encrypt are supported by acmed as well.
  • HTTPS by default: Not applicable, depends on the service being secured with TLS, not the acme client.
  • When to Renew: Let's Encrypt recommends renewing certificates when a third of their total lifetime remains. With the current 90-day certificates, this means 30 days before expiration. What acmed does right now is renew 3 weeks before expiration, so 21 days. This is not the only way things could be improved here though: especially if you're running very big deployments with thousands of certificates, you can reduce the cost for Let's Encrypt by spacing out the renewals randomly. My suggestion would be to allow setting a range instead of a concrete value, and to pick a random time between the start and the end of that range for the renewal. For example, with a range from 35 days to 30 days until expiry, it might choose to renew at 33.78 days before expiry. With the current structure of how acmed handles this though, it's not that easy to do.
  • Retrying failures: What Let's Encrypt states here is that renewal failures should not be treated as fatal errors, but should be retried gracefully with an exponential backoff. acmed seems to be pretty far off here, although I'm not quite sure. As far as I can tell, acmed will check each certificate for whether it needs renewal once per hour (at least that's the default) and not do any sort of exponential backoff. Some specific errors are also treated as recoverable, which are then retried 20 times with a 1s delay in between. Whether retrying recoverable errors in such quick succession is in line with what Let's Encrypt wants aside, the more important thing would be to implement some form of exponential backoff. Their suggested schedule is retrying after 1 minute, then 10 minutes, then 100 minutes, and each subsequent retry after 1 day. I think it'd be in the spirit of acmed to have this as something that's configurable per endpoint, defaulting to the schedule suggested by Let's Encrypt.
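
As a rough illustration of the retry schedule described in the last point (not acmed's actual code, which is written in Rust), the delay could be looked up from the number of consecutive failures. The names `retry_delay`, `DEFAULT_SCHEDULE` and `DEFAULT_CAP` are made up for this sketch, and passing a different schedule per endpoint is only an assumption about how such a setting might be exposed:

```python
from datetime import timedelta

# Schedule mentioned above: 1 minute, 10 minutes, 100 minutes, then daily.
DEFAULT_SCHEDULE = (
    timedelta(minutes=1),
    timedelta(minutes=10),
    timedelta(minutes=100),
)
DEFAULT_CAP = timedelta(days=1)


def retry_delay(failures, schedule=DEFAULT_SCHEDULE, cap=DEFAULT_CAP):
    """Return the wait before the next attempt.

    `failures` is the number of consecutive failed attempts so far (>= 1).
    Once the schedule is exhausted, every further retry waits `cap`.
    """
    if failures < 1:
        raise ValueError("failures must be at least 1")
    if failures <= len(schedule):
        return schedule[failures - 1]
    return cap
```

A hypothetical per-endpoint override would then only need to pass its own `schedule` and `cap`, while endpoints without one fall back to the defaults.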

Summarizing: acmed is already doing a lot of things right here, with the documentation on accounts being one thing that could be improved and scheduling of renewals and retrying failures being the other thing.
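
The renewal range idea from the "When to Renew" point could look roughly like this. It is a sketch in Python for brevity, not acmed code; `pick_renewal_time`, `needs_renewal`, `earliest_days` and `latest_days` are hypothetical names, and the 35/30 day defaults are only the example values used above.

```python
import random
from datetime import datetime, timedelta, timezone


def pick_renewal_time(not_after, earliest_days=35.0, latest_days=30.0, rng=random):
    """Pick a random renewal instant inside a window before expiry.

    The renewal lands somewhere between `earliest_days` and `latest_days`
    before `not_after` (the certificate's expiry time).
    """
    if earliest_days < latest_days:
        raise ValueError("earliest_days must be >= latest_days")
    # Uniformly distributed lead time, e.g. 33.78 days for a 35..30 window.
    lead_days = rng.uniform(latest_days, earliest_days)
    return not_after - timedelta(days=lead_days)


def needs_renewal(now, renewal_time):
    """True once the chosen renewal instant has been reached."""
    return now >= renewal_time


if __name__ == "__main__":
    expiry = datetime(2023, 4, 1, tzinfo=timezone.utc)
    renew_at = pick_renewal_time(expiry)
    print("renew at", renew_at.isoformat())
```

The random point would presumably need to be chosen once per certificate and remembered (or derived deterministically from the certificate), since re-rolling it on every periodic check would tend to skew renewals towards the start of the range.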

@breard-r
Owner

breard-r commented Jan 1, 2023

Thank you for taking the time to review how ACMEd works and compare it against Let's Encrypt's integration guide.

I agree with you about the "when to renew" and "retrying failures" parts, there is room for improvement here. However, I planned to do so after rewriting a large part of ACMEd. The current thread-based model has serious drawbacks and I am planning to completely rewrite it using async, which is much better suited to this job. I currently have little time for that, but I might have more in a few months, so it might take some time.

On the other side, I disagree with you about the "one account or many?" for two reasons:

  1. the integration guide itself states the benefits of having multiple accounts and does not discourage it for most users, since its recommendation on having a single account is explicitly aimed at "larger hosting providers", which currently are not ACMEd's intended audience (service restart is required for each configuration change, including adding/removing certificates, which makes ACMEd impractical when you regularly add and remove certificates);
  2. recommending a different account for each endpoint is not in contradiction with using a single account for Let's Encrypt (unless you really want people to use the same account for the production environment and the staging one, but I don't think that's what LE meant).

That said, the README part about using a different account for each endpoint will be dropped with the async rewrite and I guess it wouldn't hurt to replace it with something like "please check your CA policy on this point".

@jcgruenhage
Contributor Author

I am planning to completely rewrite it using async

That sounds like a good plan. Going through the codebase, I thought quite a few times that an async approach would make a few things quite a bit easier, especially around retrying and renewal timing.

I currently have little time for that, but I might have more in a few months, so it might take some time.

Would you be interested in contributions for that async rewrite? As it's going to affect nearly the whole codebase, I can absolutely understand if not, but I'd be interested to at least take a look at how much work it would be.

its recommendation on having a single account is explicitly aimed at "larger hosting providers"

Fair enough, I'm evaluating acmed from the perspective of such a larger hosting provider, which is why I was missing some nuance in that statement.

which currently are not ACMEd's intended audience (service restart is required for each configuration change, including adding/removing certificates, which makes ACMEd impractical when you regularly add and remove certificates)

As someone who is deploying thousands of certificates with ACME: a service restart is not a blocker here, as we also deal with thousands of servers at the same time. Each acme client only ever deals with one certificate anyway in this scenario.

recommending a different account for each endpoint is not in contradiction with using a single account for Let's Encrypt

ACK

That said, the README part about using a different account for each endpoint will be dropped with the async rewrite and I guess it wouldn't hurt to replace it with something like "please check your CA policy on this point".

Yep, that sounds good.

@jcgruenhage
Contributor Author

Hey @breard-r, coming back to this, I've split off the two problems regarding scheduling that you agreed on and have clarified the bit about multiple accounts, so I think this issue here can be closed, as there's dedicated issues for #74 and #75 now.

The interest here is coming from @famedly. We (that's my colleague @lukaslihotzki and me) are currently looking for a new ACME client, and are willing to invest some time into helping acmed to get into shape and also help with maintenance down the line. The issues #74 and #75 we've mentioned above are one area where we see potential for improvement, and the others are #72 and #73. Would you be interested in cooperating on getting those changes in here? We'd like to avoid forking and having sole maintainership of something internally, and instead want to contribute back to the community here.

Specifically, what we'd like to do is work on getting acmed async, implement #74 and #75 based on that, and revamp config parsing (which is currently quite complicated with loads and loads of macros), which also relates to #73. Some code cleanup and using more external libraries instead of reinventing wheels would be another part here, which specifically would be #72.

I do understand that you can't allocate large amounts of time to this project though, so if this is all a bit overwhelming and you can't (or don't want to) include our changes right now, we can also fork acmed, if you'd prefer that. Even then, we can look at upstreaming our changes later on (again, we'd prefer contributing over forking), although that will certainly become harder if the feedback loop is slow compared to working together directly.

Last but not least: I'm closing this issue, as all boxes in the description are either checked or split out into separate issues.

@breard-r
Owner

Hey @breard-r, coming back to this, I've split off the two problems regarding scheduling that you agreed on and have clarified the bit about multiple accounts, so I think this issue here can be closed, as there's dedicated issues for #74 and #75 now.

That's a good idea, thank you!

The interest here is coming from @famedly. We (that's my colleague @lukaslihotzki and me) are currently looking for a new ACME client, and are willing to invest some time into helping acmed to get into shape and also help with maintenance down the line. The issues #74 and #75 we've mentioned above are one area where we see potential for improvement, and the others are #72 and #73. Would you be interested in cooperating on getting those changes in here? We'd like to avoid forking and having sole maintainership of something internally, and instead want to contribute back to the community here.

Specifically, what we'd like to do is work on getting acmed async, implement #74 and #75 based on that, and revamp config parsing (which is currently quite complicated with loads and loads of macros), which also relates to #73. Some code cleanup and using more external libraries instead of reinventing wheels would be another part here, which specifically would be #72.

I do understand that you can't allocate large amounts of time to this project though, so if this is all a bit overwhelming and you can't (or don't want to) include our changes right now, we can also fork acmed, if you'd prefer that. Even then, we can look at upstreaming our changes later on (again, we'd prefer contributing over forking), although that will certainly become harder if the feedback loop is slow compared to working together directly.

Thank you very much, I really appreciate your help and interest 🙂

It's been a while since I made some real changes and improvements to ACMEd, I mostly idled while conceptualizing what I wanted to do. Lack of time and motivation mostly explain this state.
Seeing that people are interested in ACMEd helps me regain some motivation, and I secured one week of paid leave in February. I guess everything is set so I can get back on track with this project and finally implement all those things I've been thinking about.

As you guessed, I would like to start this big async rewrite myself at first, since I have been thinking about how to do it for quite a while now and I have a good idea of what I would like to do.
That said, there are some parts that are less affected by this change and therefore may be improved while this rewrite is still going on, as well as some architectural changes for which I do not have fixed ideas. Therefore, I'm going to open issues for those points so we can all discuss them. Since I mostly developed ACMEd alone I'm still a little bit conservative on some aspects, but I'll work on that so we can have a more open development.

@jcgruenhage
Contributor Author

I just realized I haven't replied back at the end of January, even though I wanted to... I've seen that the async rewrite has happened, congrats!

With regards to us contributing, it probably makes sense to look into #77 before we start on #75, so that the backoff stuff can be persisted across restarts. #74 should be possible to handle without persisting state though, so maybe that's something we can start on earlier (and honestly, it's the more important one for us anyway). All of that of course is assuming that in general contributions in these three issues are welcome if we stick to the guidelines laid out in the contribution guidelines?
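
To make the "persisted across restarts" part concrete, here is a minimal sketch of what saving and restoring retry state could look like. It is purely illustrative and says nothing about what #77 will actually contain: the JSON layout, the file path handling, and the names `save_retry_state` and `load_retry_state` are all assumptions, and it is written in Python for brevity rather than in acmed's Rust.

```python
import json
from datetime import datetime, timezone


def save_retry_state(path, failures, last_failure):
    """Write the failure count and the time of the last failure to a JSON file."""
    state = {"failures": failures, "last_failure": last_failure.isoformat()}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(state, fh)


def load_retry_state(path):
    """Return (failures, last_failure), or (0, None) if nothing was saved yet."""
    try:
        with open(path, encoding="utf-8") as fh:
            state = json.load(fh)
    except FileNotFoundError:
        return 0, None
    return state["failures"], datetime.fromisoformat(state["last_failure"])


if __name__ == "__main__":
    save_retry_state("retry.json", 2, datetime.now(timezone.utc))
    print(load_retry_state("retry.json"))
```

After a restart, the client could read this state back and resume the backoff where it left off instead of starting again from the first retry interval.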

Last but not least: What do you think about a chatroom for less forum and more ad-hoc development communication/coordination? I've had great experiences using Matrix for that kind of stuff.

@breard-r
Owner

I've seen that the async rewrite has happened, congrats!

Thanks! I must point out that I haven't switched the http part to async yet, however I'm working on it and it doesn't block other work like #74.

With regards to us contributing, it probably makes sense to look into #77 before we start on #75, so that the backoff stuff can be persisted across restarts.

I fully agree with that. Furthermore, I should also implement async for the http part before any work on #75 is done.

#74 should be possible to handle without persisting state though, so maybe that's something we can start on earlier (and honestly, it's the more important one for us anyway). All of that of course is assuming that in general contributions in these three issues are welcome if we stick to the guidelines laid out in the contribution guidelines?

Yes of course.

@breard-r
Owner

Last but not least: What do you think about a chatroom for less forum and more ad-hoc development communication/coordination? I've had great experiences using Matrix for that kind of stuff.

I'm ok with it. Currently I mostly use IRC and XMPP, but I have a good impression of Matrix. I'll have a look at how I can easily manage my identity across all those protocols. If I can easily set up a Matrix server with bridges to IRC and XMPP, I may switch to it.

@jcgruenhage
Contributor Author

Going with libera.chat and the built-in matrix bridge might be an option? That would reduce the dependencies on self hosting matrix and bridges. https://matrix.to/#/#acmed:libera.chat or #acmed via IRC would be an easy option.

@breard-r
Owner

breard-r commented Mar 6, 2023

Well, I finally joined Matrix. You'll find me at @rodolphe:matrix.what.tf.
