pd: 🙏 propagate errors from ACME worker #4652
First attempt on this diff to join preview with an intentionally incorrect auto-https value failed to exit:
Desired behavior is for pd to exit. Trying again with the following patch, pd did indeed exit when encountering the error:
patch
This is a huge improvement over prior behavior! Will do a bit more testing before calling it final.
force-pushed from d386312 to 1b8f5d6
when auto-https is enabled, we spawn a task in the background to handle https certificate resolution, via `rustls_acme::AcmeState`.

if that task encounters errors, they should be propagated up to the daemon, so that `pd` does not rapidly retry lookups, potentially hitting rate-limits and causing service interruptions.

this changes the `pd` entrypoint, binding the [`JoinHandle`] to a variable and polling upon that future in the `tokio::select` block that represents the core steady-state event loop of the daemon.

we also update the acme_worker loop to log-and-bail on error, ensuring that pd exits when the error hits the select loop in pd main.

Co-Authored-By: Conor Schaefer <conor@penumbralabs.xyz>
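
For readers skimming the thread, here is a rough std-only analogue of the pattern described in the commit message. It is not the code from this diff: `pd` uses a tokio `JoinHandle` inside `tokio::select`, while this sketch swaps in a plain thread and a polling loop so it compiles without external crates. The names `acme_worker` (as written here) and `run_daemon`, along with the event and request types, are hypothetical stand-ins.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread::JoinHandle;
use std::time::Duration;

/// Hypothetical stand-in for the ACME worker: handles events until one
/// fails, then logs the error and returns it instead of retrying.
fn acme_worker(events: Vec<Result<&'static str, String>>) -> Result<(), String> {
    for event in events {
        match event {
            Ok(msg) => println!("acme event: {msg}"),
            Err(err) => {
                // log-and-bail: no retry loop, the caller decides what to do
                eprintln!("acme error, bailing: {err}");
                return Err(err);
            }
        }
    }
    Ok(())
}

/// Hypothetical daemon loop: serves requests, but checks the bound worker
/// handle on every pass and propagates the worker's error if it has one.
/// Returns the number of requests served once the request channel closes.
fn run_daemon(
    worker: JoinHandle<Result<(), String>>,
    requests: mpsc::Receiver<u32>,
) -> Result<u32, String> {
    let mut worker = Some(worker);
    let mut served = 0;
    loop {
        // Analogue of polling the JoinHandle in the select block.
        if worker.as_ref().is_some_and(|h| h.is_finished()) {
            let handle = worker.take().expect("handle checked above");
            // First `?` surfaces a panic, second `?` surfaces the worker's error.
            handle
                .join()
                .map_err(|_| "acme worker panicked".to_string())??;
        }
        match requests.recv_timeout(Duration::from_millis(10)) {
            Ok(_request) => served += 1,
            Err(RecvTimeoutError::Timeout) => continue,
            Err(RecvTimeoutError::Disconnected) => return Ok(served),
        }
    }
}
```

The key point is that the handle is kept in a variable and inspected by the main loop, so a worker error ends the daemon rather than being dropped silently. Spawning the worker with `std::thread::spawn(|| acme_worker(events))` and passing the handle to `run_daemon` is the assumed usage.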
force-pushed from 1b8f5d6 to 21cefbd
Thanks for sprucing this up, much more confident about recommending this behavior as a first-run experience!
when auto-https is enabled, we spawn a task in the background to handle https certificate resolution, via `rustls_acme::AcmeState`.

if that task encounters errors, they should be propagated up to the daemon, so that `pd` does not rapidly retry lookups, potentially hitting rate-limits and causing service interruptions.

this changes the `pd` entrypoint, binding the [`JoinHandle`] to a variable and polling upon that future in the `tokio::select` block that represents the core steady-state event loop of the daemon.

checklist before requesting a review
If this code contains consensus-breaking changes, I have added the "consensus-breaking" label. Otherwise, I declare my belief that there are not consensus-breaking changes, for the following reason: