New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tls: Add healthcheck to shutdown if certificate is expired #4792
tls: Add healthcheck to shutdown if certificate is expired #4792
Conversation
e35f020
to
7e72bb1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @murtyjones, this is going to be a really helpful one for a lot of people.
I don't think the IsOutdated
function is the right check for this, because it just checks ip addresses and dns names are in sync with our existing config - confusingly named, I also thought it would check expiry!
Would also be nice to have some default values for this check, and it needs to be validated in lncfg
.
server.go
Outdated
if err != nil { | ||
return err | ||
} | ||
outdated, err := cert.IsOutdated( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsOutdated
checks the ip addresses + dns names on the cert, rather than its expiry. This can be replaced with a check on cert.NotAfter
to detect expiry.
I think we can leave IsOutdated
out of this check entirely because we already have the tlsautorefresh
flag which will sync the cert with our config at startup if enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! I didn't even think to look at the source method 😅
e2bf1a3
to
9a2b65c
Compare
@carlaKC Thanks for reviewing! Incorporated your feedback. Admittedly I'm not sure what the workflow to test this change looks like, so I've just compiled it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there, just need to bump the default interval and we are gg!
I'm not sure what the workflow to test this change looks like
I've tested locally by deleting my existing cert and updating the expiry time to 5 minutes, then run lnd and check that it shuts down once it's expired.
Goods are as advertised:
2020-12-01 09:13:48.992 [INF] HLCK: Health check: tls, call: 1 failed with: TLS certificate is expired, backing off for: 1m0s
2020-12-01 09:14:48.993 [CRT] SRVR: Health check: tls failed after 2 calls
2020-12-01 09:14:48.993 [INF] SRVR: Sending request for shutdown
server.go
Outdated
// If the current time is passed the certificate's | ||
// expiry time, then it is considered expired | ||
if time.Now().After(parsedCert.NotAfter) { | ||
return fmt.Errorf("TLS certificate is expired") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitty-nit: include the cert expiry in the error (nice to have, non-blocking)
config.go
Outdated
// is not expired. Although this check is off by default (not all setups | ||
// require it), we still set the other default values so that the health | ||
// check can be easily enabled with sane defaults. | ||
defaultTLSInterval = time.Second * 30 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As is, interval needs to be at least 1 minute, so defaultTLSInterval
fails validation as is - bump to 1 min?
The 1 min minimum is arbitrarily picked, so can be changed if we see a pressing reason to do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very useful feature, LGTM (barring @carlaKC's comments) 🎉
if err != nil { | ||
return err | ||
} | ||
// If the current time is passed the certificate's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add newlines before comments.
9a2b65c
to
83787ee
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great stuff, thanks for the quick turnaround @murtyjones 🙌
In certain container set ups, it's useful to optionally have lnd just shutdown if it detects that its certs are expired, as assuming there's a hypervisor to restart the container/pod, then upon restart, lnd will have fully up to date certs.
83787ee
to
bad4e9a
Compare
|
||
; The amount of time we should wait between certificate expiration health checks. | ||
; This value must be >= 1m. | ||
; healthcheck.tls.interval=1m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added one last change to make this >=1m (was 30s
)
…gnetwork#4792) In certain container set ups, it's useful to optionally have lnd just shutdown if it detects that its certs are expired, as assuming there's a hypervisor to restart the container/pod, then upon restart, lnd will have fully up to date certs.
…gnetwork#4792) In certain container set ups, it's useful to optionally have lnd just shutdown if it detects that its certs are expired, as assuming there's a hypervisor to restart the container/pod, then upon restart, lnd will have fully up to date certs.
…gnetwork#4792) In certain container set ups, it's useful to optionally have lnd just shutdown if it detects that its certs are expired, as assuming there's a hypervisor to restart the container/pod, then upon restart, lnd will have fully up to date certs.
…gnetwork#4792) In certain container set ups, it's useful to optionally have lnd just shutdown if it detects that its certs are expired, as assuming there's a hypervisor to restart the container/pod, then upon restart, lnd will have fully up to date certs.
…gnetwork#4792) In certain container set ups, it's useful to optionally have lnd just shutdown if it detects that its certs are expired, as assuming there's a hypervisor to restart the container/pod, then upon restart, lnd will have fully up to date certs.
…gnetwork#4792) In certain container set ups, it's useful to optionally have lnd just shutdown if it detects that its certs are expired, as assuming there's a hypervisor to restart the container/pod, then upon restart, lnd will have fully up to date certs.
…gnetwork#4792) In certain container set ups, it's useful to optionally have lnd just shutdown if it detects that its certs are expired, as assuming there's a hypervisor to restart the container/pod, then upon restart, lnd will have fully up to date certs.
Adds a configurable healthcheck that shuts down if an expired certificate is found.
Reference issue