feature: change encryption keys #199

Merged
merged 65 commits into from
Apr 18, 2014

ryanuber
Member

For a week or so I've been chipping away at this feature little by little. It uses the new memberlist keyring feature to allow Serf to handle encryption key changes in a running cluster. I feel like it's feature-complete, and ready to start gathering some suggestions on the code itself.

There is more conversation around the feature in general in #194.

The bottom line on what it does can be seen with a few command examples. All of the operations are idempotent, so you can just keep running them without negative consequences if they fail.

Install a new key:

$ serf key -install E4SoXyZSJkcg37n42U2i2Q==
==> Installing key on all members...
==> Successfully installed key!

Change the key used to encrypt messages:

$ serf key -use E4SoXyZSJkcg37n42U2i2Q==
==> Changing primary key on all members...
==> Successfully changed primary key!

List keys in use on the cluster:

$ serf key -list
==> Asking all members for installed keys...
==> Keys gathered, listing cluster keys...

E4SoXyZSJkcg37n42U2i2Q==
9kHC3w5eswavy1iU56YG8g==

Remove a key:

$ serf key -remove 9kHC3w5eswavy1iU56YG8g==
==> Removing key on all members...
==> Successfully removed key!

Error conditions during key operations:

$ serf key -remove E4SoXyZSJkcg37n42U2i2Q==
==> Removing key on all members...
failed:  db9    Removing the primary key is not allowed
failed:  web15  Removing the primary key is not allowed
failed:  db8    Removing the primary key is not allowed
failed:  web21  Removing the primary key is not allowed
failed:  lb31   Removing the primary key is not allowed
failed:  lb3    Removing the primary key is not allowed

Error removing key: 6/102 nodes reported failure
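The per-node behavior behind these commands can be sketched with a toy keyring. This is only an illustration of the semantics described above (idempotent install, use, and remove, with removal of the primary key refused); the `Keyring` type and its methods are hypothetical and are not the memberlist or Serf implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// Keyring is a hypothetical stand-in for one node's key store.
// By convention in this sketch, keys[0] is the primary key.
type Keyring struct {
	keys []string
}

// NewKeyring creates a keyring whose only key is the primary.
func NewKeyring(primary string) *Keyring {
	return &Keyring{keys: []string{primary}}
}

// index returns the position of key, or -1 if it is not installed.
func (k *Keyring) index(key string) int {
	for i, existing := range k.keys {
		if existing == key {
			return i
		}
	}
	return -1
}

// Install adds a key; installing an existing key is a no-op.
func (k *Keyring) Install(key string) {
	if k.index(key) == -1 {
		k.keys = append(k.keys, key)
	}
}

// Use makes an installed key the primary; re-selecting the primary is a no-op.
func (k *Keyring) Use(key string) error {
	i := k.index(key)
	if i == -1 {
		return errors.New("requested key is not in the keyring")
	}
	k.keys[0], k.keys[i] = k.keys[i], k.keys[0]
	return nil
}

// Remove deletes a key. Removing an absent key is a no-op;
// removing the primary is refused (message copied from the output above).
func (k *Keyring) Remove(key string) error {
	i := k.index(key)
	if i == 0 {
		return errors.New("Removing the primary key is not allowed")
	}
	if i > 0 {
		k.keys = append(k.keys[:i], k.keys[i+1:]...)
	}
	return nil
}

func main() {
	k := NewKeyring("9kHC3w5eswavy1iU56YG8g==")
	k.Install("E4SoXyZSJkcg37n42U2i2Q==")
	k.Install("E4SoXyZSJkcg37n42U2i2Q==") // repeating the install is harmless
	if err := k.Use("E4SoXyZSJkcg37n42U2i2Q=="); err != nil {
		fmt.Println("use failed:", err)
	}
	fmt.Println(k.Remove("E4SoXyZSJkcg37n42U2i2Q==")) // refused: this is now the primary
	fmt.Println(k.Remove("9kHC3w5eswavy1iU56YG8g==")) // the old key can be removed
	fmt.Println(k.keys)
}
```

Because each method leaves the keyring unchanged when its goal is already met, running the same command again after a partial failure only affects the nodes that missed it.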

I know this pull request is pretty massive, so if splitting it into smaller chunks is preferred, I can try doing that. Just let me know!

@armon
Member

armon commented Apr 11, 2014

@ryanuber Wow! This is a huge PR! Awesome work! I will try to comb through this over the weekend.

With the -list mode, does it indicate if some nodes are missing certain keys? For example, how does the operator verify that all nodes have a given key before trying to -use that key?

@ryanuber
Member Author

@armon the -list command just gives you a list of all unique keys in use on the cluster. It doesn't tell you if a key is in use on some nodes but not others. You could get that confidence using the -install command though, since it will only return 0 if every node replies with success.

My thinking was that if the commands are idempotent, you would just do a -install until you succeeded. Similarly, you would want to do a -use until it succeeded before doing a -remove of the old key. So essentially, -install could also be looked at as a "verify" of sorts, "verify this key exists on all nodes".
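The retry-until-success flow described here could be sketched roughly as follows. `KeyOp`, `retry`, and `Rotate` are hypothetical names, and each `KeyOp` stands in for one cluster-wide operation (such as shelling out to `serf key -install`) that returns nil only when every node reports success. The sketch relies on the operations being idempotent, as stated above.

```go
package main

import (
	"errors"
	"fmt"
)

// errPartial simulates "some nodes reported failure" for the demo below.
var errPartial = errors.New("simulated partial failure")

// KeyOp is a hypothetical stand-in for one cluster-wide key operation.
// It returns nil only if every node succeeded.
type KeyOp func(key string) error

// retry runs op until it succeeds or the attempts run out.
// Repeating op is safe only because the operations are idempotent.
func retry(op KeyOp, key string, attempts int) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(key); err == nil {
			return nil
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

// Rotate installs newKey everywhere, switches to it, and only then
// removes oldKey. A failure at any step stops the rotation.
func Rotate(install, use, remove KeyOp, oldKey, newKey string, attempts int) error {
	if err := retry(install, newKey, attempts); err != nil {
		return fmt.Errorf("install: %w", err)
	}
	if err := retry(use, newKey, attempts); err != nil {
		return fmt.Errorf("use: %w", err)
	}
	if err := retry(remove, oldKey, attempts); err != nil {
		return fmt.Errorf("remove: %w", err)
	}
	return nil
}

func main() {
	calls := 0
	flakyInstall := func(key string) error {
		calls++
		if calls < 3 {
			return errPartial // the first two attempts miss some nodes
		}
		return nil
	}
	ok := func(key string) error { return nil }

	err := Rotate(flakyInstall, ok, ok,
		"9kHC3w5eswavy1iU56YG8g==", "E4SoXyZSJkcg37n42U2i2Q==", 5)
	fmt.Println("rotation error:", err, "install attempts:", calls)
}
```

The ordering matters: the old key is never removed unless the switch to the new primary succeeded on every node.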

@armon
Member

armon commented Apr 11, 2014

@ryanuber Got it. Would it be possible to add a very simple indicator to -list, something like [58/60], that shows how many nodes have that key? This way an operator can quickly see if they need to run -install again.

@ryanuber
Member Author

@armon definitely! I'll take a stab at that over the weekend.
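One way the requested indicator might be computed is sketched below. The response shape (a map from node name to that node's installed keys) and the function names `KeyCounts` and `FormatKeyList` are assumptions for illustration, not the code in this pull request.

```go
package main

import (
	"fmt"
	"sort"
)

// KeyCounts tallies how many nodes reported each key.
// responses maps a node name to the keys that node has installed
// (a hypothetical shape chosen for this sketch).
func KeyCounts(responses map[string][]string) map[string]int {
	counts := make(map[string]int)
	for _, keys := range responses {
		seen := make(map[string]bool) // guard against a node listing a key twice
		for _, key := range keys {
			if !seen[key] {
				seen[key] = true
				counts[key]++
			}
		}
	}
	return counts
}

// FormatKeyList renders one "key [have/total]" line per key,
// sorted by key so the output is stable between runs.
func FormatKeyList(counts map[string]int, totalNodes int) []string {
	keys := make([]string, 0, len(counts))
	for key := range counts {
		keys = append(keys, key)
	}
	sort.Strings(keys)

	lines := make([]string, 0, len(keys))
	for _, key := range keys {
		lines = append(lines, fmt.Sprintf("%s [%d/%d]", key, counts[key], totalNodes))
	}
	return lines
}

func main() {
	responses := map[string][]string{
		"db8":   {"E4SoXyZSJkcg37n42U2i2Q==", "9kHC3w5eswavy1iU56YG8g=="},
		"web15": {"E4SoXyZSJkcg37n42U2i2Q==", "9kHC3w5eswavy1iU56YG8g=="},
		"lb3":   {"9kHC3w5eswavy1iU56YG8g=="}, // this node missed the install
	}
	for _, line := range FormatKeyList(KeyCounts(responses), len(responses)) {
		fmt.Println(line)
	}
}
```

A key whose count is below the node total signals that -install should be run again before switching to it with -use.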

goto SEND
}

s.logger.Printf("[DEBUG] serf: Received install-key query")
Member

I'd make this INFO level

@armon
Member

armon commented Apr 14, 2014

@ryanuber I've made a few notes, but overall this looks super solid! Very excited to have this in for the next release. Thanks for the hard work!

@ryanuber
Member Author

@armon Thanks for the review! I made some adjustments based on your recommendations and added a few response comments. Let me know what you think and we can adjust further from here.

keyring file helps persist changes to the encryption keyring, allowing the
agent to start and rejoin the cluster successfully later on, even if key
rotations had been initiated by other members in the cluster. NOTE: this
option is not compatible with the `-encrypt` option.
Member

We should probably add a section below that documents the format of this file, so people know how to set up the keys, that the first key is the primary, etc.
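As an illustration of what such a section might describe, here is a minimal sketch of loading a keyring file. The file format is not stated in this thread; the sketch ASSUMES a JSON array of base64-encoded keys, and takes only the "first key is the primary" rule from the comment above. `ParseKeyring` is a hypothetical helper.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// ParseKeyring decodes keyring data, ASSUMING it is a JSON array of
// base64-encoded keys. It returns the primary key (the first entry)
// and the full list of decoded keys.
func ParseKeyring(data []byte) ([]byte, [][]byte, error) {
	var encoded []string
	if err := json.Unmarshal(data, &encoded); err != nil {
		return nil, nil, fmt.Errorf("decoding keyring: %w", err)
	}
	if len(encoded) == 0 {
		return nil, nil, errors.New("keyring contains no keys")
	}

	keys := make([][]byte, 0, len(encoded))
	for _, e := range encoded {
		raw, err := base64.StdEncoding.DecodeString(e)
		if err != nil {
			return nil, nil, fmt.Errorf("decoding key: %w", err)
		}
		keys = append(keys, raw)
	}
	return keys[0], keys, nil
}

func main() {
	// Example contents under the assumed format; the first key is the primary.
	data := []byte(`["E4SoXyZSJkcg37n42U2i2Q==", "9kHC3w5eswavy1iU56YG8g=="]`)

	primary, keys, err := ParseKeyring(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("loaded %d keys, primary is %d bytes\n", len(keys), len(primary))
}
```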

@armon
Member

armon commented Apr 16, 2014

@ryanuber This is looking really good. I think it's very close. I think we are also missing some tests for the RPC layer, but I can always add those after a merge if you want.

ryanuber added 25 commits April 15, 2014 23:11
@ryanuber
Member Author

@armon I took a swing at the RPC tests and added some command tests. If there is anything you think we are missing just let me know!

@armon
Member

armon commented Apr 18, 2014

@ryanuber Looks awesome! Thanks so much for all the hard work!

armon added a commit that referenced this pull request Apr 18, 2014
feature: change encryption keys
@armon armon merged commit 069de11 into hashicorp:master Apr 18, 2014
2 participants