Support external servers #243

Merged: 11 commits merged into master from acl-init-external-servers on Apr 22, 2020
Conversation

@ishustava (Contributor) commented Apr 8, 2020

This PR proposes the following changes to support external servers:

  1. Add a -bootstrap-token-file flag to support "already-bootstrapped" external servers. This also works for folks who want to do ACL bootstrapping themselves on the external servers and just provide a bootstrap token, so that the rest of the tokens and policies for consul-k8s are still set up. Some use cases are described in consul-helm#413 ("bootstrapACL works only with clusters inside the same k8s cluster").
  2. The -server-address flag now supports cloud auto-join strings. This is similar to how the get-consul-client-ca command behaves; cloud auto-join support was added here for consistency.
  3. Add -inject-auth-method-host and -inject-auth-method-ca-cert to allow configuring custom auth method parameters (see the sketch after this list). This is needed because during the login workflow, Consul servers talk to the Kubernetes API to verify the JWT. When the Consul servers are external to the Kubernetes cluster, we can't make assumptions about where the Kubernetes API server is running, so we have to ask the user for this information.
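To make item 3 concrete, here is a minimal sketch of creating a kubernetes-type auth method with an explicit API host and CA cert through Consul's Go API client. This is not the PR's actual code; the method name and all values are placeholders.

```go
package main

import (
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Placeholder values that would come from the new flags.
	host := "https://kube-api.example.com:443"   // -inject-auth-method-host
	caCert := "-----BEGIN CERTIFICATE-----\n..." // -inject-auth-method-ca-cert
	saJWT := "<service-account-jwt>"

	// The kubernetes auth method type takes the API host, CA cert, and a
	// service account JWT in its Config map.
	method := &api.ACLAuthMethod{
		Name: "consul-k8s-auth-method", // placeholder name
		Type: "kubernetes",
		Config: map[string]interface{}{
			"Host":              host,
			"CACert":            caCert,
			"ServiceAccountJWT": saJWT,
		},
	}

	if _, _, err := client.ACL().AuthMethodCreate(method, nil); err != nil {
		log.Fatal(err)
	}
}
```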

@ishustava added the area/acls (Related to ACLs) and theme/external-servers (Related to Consul servers running outside the cluster) labels on Apr 8, 2020
* Require -server-address to be provided instead of discovering the
  server IPs from Kubernetes pods. This allows us to eventually run
  this command against external servers or servers deployed on Kube
  in the same way. On Kubernetes, instead of discovering Pod IPs,
  we can use the servers' StatefulSet DNS names (see the sketch below).
* [Breaking change] Remove the -expected-replicas, -release-name, and
  -server-label-selector flags because we no longer need them.
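For illustration only, here is the shape of those StatefulSet DNS names. The release name, namespace, replica count, and service naming convention are assumptions, not taken from this PR:

```go
package main

import "fmt"

func main() {
	// Placeholder Helm release name, namespace, and replica count.
	release, namespace, replicas := "my-release", "default", 3

	// Each StatefulSet pod gets a stable DNS name of the form
	// <pod>.<headless-service>.<namespace>.svc, so server addresses can
	// be constructed without looking up Pod IPs.
	for i := 0; i < replicas; i++ {
		fmt.Printf("%s-consul-server-%d.%s-consul-server.%s.svc\n",
			release, i, release, namespace)
	}
}
```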
@ishustava changed the base branch from acl-init-refactor to master on April 9, 2020 23:31
@ishustava force-pushed the acl-init-external-servers branch 2 times, most recently from 8e1fa78 to 9871c80, on April 9, 2020 23:41
* Add -bootstrap-token-file to provide your own bootstrap token.
  If provided, server-acl-init will skip ACL bootstrapping of the
  servers and will not update server policies or set tokens.
* The -server-address flag can now also be a cloud auto-join
  string. This enables us to re-use the same string you would
  use for retry-join (see the sketch below).
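For reference, a cloud auto-join string can be resolved with the hashicorp/go-discover library, which is what Consul itself uses for retry-join. A minimal sketch, with a made-up AWS provider config:

```go
package main

import (
	"fmt"
	"log"
	"os"

	discover "github.com/hashicorp/go-discover"
)

func main() {
	// Example cloud auto-join string; the tag key/value are made up.
	cfg := "provider=aws region=us-west-2 tag_key=consul-server tag_value=true"

	d := discover.Discover{Providers: discover.Providers}
	l := log.New(os.Stderr, "", log.LstdFlags)

	// Addrs resolves the auto-join config string to a list of IP addresses.
	addrs, err := d.Addrs(cfg, l)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("discovered Consul server addresses:", addrs)
}
```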
@ishustava marked this pull request as ready for review on April 9, 2020 23:51
@ishustava requested a review from adilyse on April 9, 2020 23:51
@adilyse (Contributor) left a comment:
Love the organization you added to the flags!

Now that the godiscover code has been pulled out into a separate helper file, it would be good to have some tests that target this code. There are some in Consul's retry_join code that could probably be repurposed.

@@ -186,6 +211,21 @@ func (c *Command) Run(args []string) int {
aclReplicationToken = strings.TrimSpace(string(tokenBytes))
}

var providedBootstrapToken string
if c.flagBootstrapTokenFile != "" {
@adilyse (Contributor) commented:
Can you tell me a bit about why you chose to read this in from a file rather than getting the secret directly from kubernetes? The acl bootstrapping job already has a k8s client and permissions to create and get secrets, so adding the extra hop through a file seems like extra work that's not necessary.

@ishustava (Contributor, Author) replied:
It seemed simpler to me to have only one flag for the file rather than two flags (one for the secret name and one for the secret key). I don't think reading it from a file adds an extra hop; it just replaces one API call with another. Plus, if reading from the API, we'd still have to do similar error checks: 1) fail if the secret doesn't exist; and 2) fail if the value of the secret key is empty (a sketch of this follows below). So it likely wouldn't reduce the number of lines either.

Philosophically, it also seems appropriate for this job to only read/write secrets that it creates/manages. So this implementation follows the existing pattern of reading the CA cert and ACL replication token from a file.

We should probably also look into restricting the role rules for this job in the Helm chart to only read the list of secrets we expect. It would probably be cumbersome, but better from a security perspective.
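A minimal sketch of the file-based approach with both of those error checks. The flag corresponds to -bootstrap-token-file, but the helper name and mount path here are illustrative:

```go
package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"strings"
)

// readBootstrapToken fails if the file can't be read (the "secret doesn't
// exist" case) and if the contents are empty (the "empty secret key" case).
func readBootstrapToken(path string) (string, error) {
	tokenBytes, err := ioutil.ReadFile(path)
	if err != nil {
		return "", fmt.Errorf("unable to read bootstrap token from file %q: %s", path, err)
	}
	token := strings.TrimSpace(string(tokenBytes))
	if token == "" {
		return "", fmt.Errorf("bootstrap token file %q is empty", path)
	}
	return token, nil
}

func main() {
	// Illustrative mount path for the token secret.
	token, err := readBootstrapToken("/consul/acl/tokens/bootstrap-token")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("bootstrap token length:", len(token))
}
```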

@adilyse (Contributor) replied:
My first instinct with anything running in a container that needs files as input is that it's often awkward to provide them. That's probably not an issue here because it's managed in the Helm chart and it's straightforward to mount the secret as a file.

This process does create and manage the bootstrapping secret, just not in this particular configuration. Given that, one could argue that it's confusing for the job to sometimes access the secret directly and other times not.

@ishustava (Contributor, Author) replied:
Yeah, that's true. It makes it inconsistent now, whereas previously it was only using the API for Kube secrets. There is a pattern, though: secrets that this job doesn't create or manage, we provide via files (e.g. the Consul CA cert, ACL replication token, bootstrap token), and secrets that this job creates and updates, it manages via the API.

I also like that it decouples us from Kube secrets a little bit.

ishustava and others added 4 commits April 15, 2020 16:51
Co-Authored-By: Rebecca Zanzig <rzazzl@hotmail.com>
We don't need to read the service from Kube since
we can just use Kube DNS directly
These flags are added so that the auth method can be configured with
custom credentials for the Kubernetes API. This is useful when Consul
servers are running externally, because in that case we can't assume that
the Kubernetes API location and CA cert we have access to inside the cluster
are routable from, or have correct SANs for, the Consul servers running
outside the cluster.
@ishustava (Contributor, Author) commented:

@adilyse this is now ready for re-review. Summary of changes from your last review:

  1. Add -inject-auth-method-host and -inject-auth-method-ca-cert to allow configuring custom auth method parameters. This is needed because during the login workflow, Consul servers talk to the Kubernetes API to verify the JWT. When the Consul servers are external to the Kubernetes cluster, we can't make assumptions about where the Kubernetes API server is running, so we have to ask the user for this information.
  2. Previously, we got the Kubernetes API server's cluster IP by fetching the Service information from the API. We don't need to do that and can just use DNS instead (see the sketch below). This also reduces the permissions this job needs.
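A sketch of what item 2 implies: the in-cluster default can be built from the well-known Kubernetes API service DNS name instead of a Service lookup. The function and flag handling here are illustrative, not the PR's actual code:

```go
package main

import "fmt"

// defaultKubeAPIHost is the well-known in-cluster DNS name of the
// Kubernetes API server, reachable from any pod without an API lookup
// or any extra RBAC permission.
const defaultKubeAPIHost = "https://kubernetes.default.svc"

// authMethodHost picks the host to set on the auth method: the value of
// -inject-auth-method-host when given, otherwise the in-cluster default.
func authMethodHost(flagHost string) string {
	if flagHost != "" {
		return flagHost
	}
	return defaultKubeAPIHost
}

func main() {
	fmt.Println(authMethodHost(""))                             // in-cluster default
	fmt.Println(authMethodHost("https://kube-api.example.com")) // external servers
}
```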

@adilyse (Contributor) left a comment:

As long as the JWT from the service account will work even if the CA is different, this should be good to go. I have a couple of comments about tests, but I won't hold up the approval on that 🎉.

Would we ever expect someone to run servers in k8s, then update their Helm values to point to external servers? Does it make sense to add tests for this scenario? Related: they've used the built-in auth method, then switch to providing one (or vice versa).


ishustava and others added 4 commits April 21, 2020 10:43
Co-Authored-By: Rebecca Zanzig <rzazzl@hotmail.com>
Co-Authored-By: Rebecca Zanzig <rzazzl@hotmail.com>
Co-Authored-By: Rebecca Zanzig <rzazzl@hotmail.com>
We think that using a custom CA that would be different
from the one included in the service account is pretty uncommon.
@ishustava (Contributor, Author) replied:

If someone wanted to migrate their servers from k8s to outside of k8s, they'd probably want to preserve the data. In that case, they would need to back up their data and migrate it over to the external servers. All the token information should stay the same, so no changes are needed on the k8s side except for the auth method. For the auth method, they would need to delete it from Consul, provide the auth method host via Helm config, and run a helm upgrade to let the ACL bootstrapping job re-create it (see the sketch below).
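For the auth method step, the deletion could be done through the Consul API before the upgrade. A sketch using the Go client; the auth method name is a placeholder (in practice it's derived from the Helm release name):

```go
package main

import (
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Placeholder name; the real one depends on the release.
	const authMethodName = "release-name-consul-k8s-auth-method"

	// Delete the old auth method so the ACL bootstrapping job re-creates it
	// with the new -inject-auth-method-host on the next helm upgrade.
	if _, err := client.ACL().AuthMethodDelete(authMethodName, nil); err != nil {
		log.Fatal(err)
	}
}
```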

I don't think we can test this in an effective way in this repo, since it falls under the "already bootstrapped" test case, and the auth method falls under our regular auth method test case since we will be creating it from scratch.

Maybe we could eventually add it to the integration tests for Helm if this proves to be a scenario we think is important to support.

@ishustava merged commit e68ed3f into master on Apr 22, 2020
@ishustava deleted the acl-init-external-servers branch on April 22, 2020 22:09