Retry ACME challenge more than once #162
http.Client doesn't retry when the connection attempt fails. On k8s this can cause errors like:
- dial tcp: i/o timeout
- dial tcp 10.106.221.133:80: connect: connection refused
Codecov Report
@@            Coverage Diff             @@
##           master     #162      +/-   ##
==========================================
+ Coverage   78.98%   78.99%   +0.01%
==========================================
  Files          58       58
  Lines        6224     6228       +4
==========================================
+ Hits         4916     4920       +4
  Misses       1063     1063
  Partials      245      245
@dopey: can you take a look at this?
Yep. I'd like to get to the bottom of the database issue first since it seems these are related. According to @jkralik the size of the database has stopped endlessly expanding since they made this change, so I definitely want to pull it in. My only qualm is that we'd be pulling in an external dependency. I haven't had any time to check whether that's definitely necessary, or whether this behavior is something we can replicate using the standard Go clients.
This patch doesn't fix the endlessly expanding DB; it just tries the challenge more times. It fixes #149. I think that issue merely triggers the bug that makes the DB grow endlessly.
@jkralik I reviewed the ACME spec. Section 8.2 discusses some mandatory requirements and considerations for implementing challenge validation retries in both the server and the client. Most notably:
The spec mandates that we tie the retry state into our challenge resource so that clients can discern what is happening during the challenge/validation process. This also helps us keep track of retries in the code in any event (as opposed to "fire and forget").

Additionally, while some amount of server retry is allowable to handle exactly your type of scenario (propagation delay for newly provisioned infrastructure), based on my interpretation of the language in that section, the responsibility really falls on the client to keep bugging the server until the challenge can be validated. In other words, make sure your client only requests the challenge once it knows the infrastructure is ready, and make sure your client re-requests the challenge until it is happy with the outcome.

Thank you for the patch! Since you've likely got this up and running and it works for you, feel free to keep on doing what you're doing. In terms of pulling this into the official CA, we'd like to hold off until we can fully address the requirements laid out in section 8.2 of the spec. If you're interested in revising your patch, we're more than happy to work with you to get things in shape. Just say so and we can reopen, or you can propose a new patch.
Fixes #149