server: cannot recover from expired certificate #14126

Closed
knz opened this issue Mar 13, 2017 · 6 comments
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption.

@knz

knz commented Mar 13, 2017

Found together with @dianasaur323: the registration ("reg") server is still running, but the CA cert, node cert, and client certs have all expired.

There are two issues from there:

  • clients can't connect any more.
  • if the server went down, it could not come back up afterwards.

We also tried to generate a new client cert with a later expiration date, but since the CA cert has also expired, this is no longer possible.
And also we do not provide a way to renew the CA cert (from the same key) so this cannot be fixed currently.

The way forward from here to keep the ability to connect clients without having to restart the server is to define a new command that can extend the CA cert's expiration date.

(Another mechanism is needed for the node certs, but this is kinda already tracked in #6263 and #1675)

@bdarnell

> if the server went down, it could not come back up afterwards.

Even worse, if a connection failed a heartbeat, that connection could not be reestablished (unless we've enabled session resumption, but I don't think we have).

> And also we do not provide a way to renew the CA cert (from the same key) so this cannot be fixed currently.

Now that we're already down, we can simply regenerate all certificates (CA and nodes and clients) and restart everything.

> The way forward from here to keep the ability to connect clients without having to restart the server is to define a new command that can extend the CA cert's expiration date.

The proper cryptographic way would be to A) support multiple CA certs at once (this may already work) and B) allow CA certs to be reloaded without restarting the process (see #14012). Then, once you've installed the new CA certs on all nodes, you can start rolling out certs issued by the new CA to all nodes and clients. For servers, you'd probably reload the new certs by the same mechanism. For clients, I have no idea how well client libraries support key rotation, and this may well require client restarts.
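The two pieces described above (trusting several CAs at once, and swapping the trusted set without a restart) can be sketched in Go roughly as follows. This is an illustration built on the standard `crypto/tls` and `crypto/x509` packages, not how CockroachDB implements it; `issue`, `setTrustedCAs`, `verify`, and `serverConfig` are hypothetical names, and the certificates are throwaway ones generated in memory.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"sync/atomic"
	"time"
)

// issue creates a certificate for name. With a nil parent it is a self-signed
// CA; otherwise it is a leaf signed by parent/parentKey. Hypothetical helper.
func issue(name string, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(time.Now().UnixNano()),
		Subject:      pkix.Name{CommonName: name},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
	}
	if parent == nil {
		tmpl.IsCA = true
		tmpl.BasicConstraintsValid = true
		tmpl.KeyUsage |= x509.KeyUsageCertSign
		parent, parentKey = tmpl, key
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &key.PublicKey, parentKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// trustedCAs holds the current CA pool; storing a new pool takes effect for
// later verifications without restarting the process.
var trustedCAs atomic.Pointer[x509.CertPool]

// setTrustedCAs replaces the trusted set with exactly the given CA certs.
func setTrustedCAs(cas ...*x509.Certificate) {
	pool := x509.NewCertPool()
	for _, ca := range cas {
		pool.AddCert(ca)
	}
	trustedCAs.Store(pool)
}

// verify checks leaf against whichever pool is currently installed.
func verify(leaf *x509.Certificate) error {
	_, err := leaf.Verify(x509.VerifyOptions{
		Roots:     trustedCAs.Load(),
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	return err
}

// serverConfig builds a TLS config that looks up the client-CA pool on every
// handshake, so a reloaded pool applies to new connections.
func serverConfig(cert tls.Certificate) *tls.Config {
	return &tls.Config{
		GetConfigForClient: func(*tls.ClientHelloInfo) (*tls.Config, error) {
			return &tls.Config{
				Certificates: []tls.Certificate{cert},
				ClientAuth:   tls.RequireAndVerifyClientCert,
				ClientCAs:    trustedCAs.Load(),
			}, nil
		},
	}
}

func main() {
	oldCA, oldKey, _ := issue("old CA", nil, nil)
	newCA, newKey, _ := issue("new CA", nil, nil)
	oldLeaf, _, _ := issue("node", oldCA, oldKey)
	newLeaf, _, _ := issue("node", newCA, newKey)

	setTrustedCAs(oldCA, newCA) // rotation window: both CAs trusted
	fmt.Println("both trusted:", verify(oldLeaf) == nil, verify(newLeaf) == nil)

	setTrustedCAs(newCA) // rotation finished: old CA dropped
	fmt.Println("old dropped: ", verify(oldLeaf) == nil, verify(newLeaf) == nil)
}
```

The point of the two-CA window is that nodes and clients can be moved to certs from the new CA one at a time while certs from the old CA are still accepted. Established connections are not re-verified in this sketch; only new handshakes see a reloaded pool.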

@mberhault

oops. those were good for a year. that happened fast.
for now, there's not much to do other than re-generate all certs/keys and restart everything (nodes and registration tasks). We do need to implement cert and key rotation soon though.

@mberhault

regenerated all certs and keys for the registration cluster and restarted all nodes/clients.

@bdarnell

More things to do:

  • Monitoring and alerts for the time remaining before expiration
  • Consider increasing the default lifetime of certificates created by cockroach cert (as a stopgap until we have better key rotation in place)

@a-robinson

Yeah, if we don't increase the default lifetime and/or provide good warnings, this could very easily happen to folks out in the wild.

@knz added the C-bug and S-3-ux-surprise labels Mar 14, 2017
@knz added this to the 1.0 milestone Mar 14, 2017
@mberhault

moved to #1674, including "more things to do"
