Skip to content
This repository has been archived by the owner on Aug 25, 2021. It is now read-only.

ACLs: Support external servers #420

Merged
merged 7 commits into from
Apr 23, 2020

Conversation

ishustava
Copy link
Contributor

@ishustava ishustava commented Apr 8, 2020

Problem

Currently, we support ACL bootstrapping only if the servers are also deployed on Kubernetes. If you're running servers elsewhere, we currently recommend bootstrapping ACLs yourself and setting up the correct policies and tokens for the k8s components. You could then set them as values, e.g. syncCatalog.aclSyncToken or connectInject.overrideAuthMethodName and connectInject.aclInjectToken.

However, figuring out the right policies is hard, and creating them yourself is still a manual process. Even though servers are running externally, our knowledge about the policies and tokens you need doesn't change since we still consume them from the Helm chart. The workaround right now is to create a Kubernetes secret with the bootstrap token named something that the Helm chart expects. This will trick the job into thinking that the bootstrapping was already done and it will continue to create tokens and policies for the rest of the components.

See #413 for a bit more details.

Proposal

This PR proposes to use the new externalServers configuration to enable the server-acl-init job to talk to external servers.

It also provides two ways of "initializing" ACLs for external servers:

  1. You can run bootstrap ACLs yourself via the acl/bootstrap API and then provide the bootstrap token by setting global.acls.bootstrapToken secret. The Helm config for this use case could look like this:
    global:
      enabled: false
      acls:
        manageSystemACLs: true
        bootstrapToken:
          secretName: consul-bootstrap-acl-token
          secretKey: token
    externalServers:
      enabled: true
    client:
      enabled: true
      join: [<retry_join address(es)>]
    connectInject:
      enabled: true
      overrideAuthMethodHost: <external address of the kube api server>
    ...
  2. You can let the server-acl-init job to call the bootstrapping API. It will behave in the same way as when the servers are running on k8s and will save the bootstrap token as a Kubernetes secret. In this case, the Helm config will look like this:
    global:
      enabled: false
      acls:
        manageSystemACLs: true
    externalServers:
      enabled: true
    client:
      enabled: true
      join: [<retry_join address(es)>]
    connectInject:
      enabled: true
      overrideAuthMethodHost: <external address of the kube api server>
    ...

Testing

This PR can be tested on AKS with the Helm configs above and the ishustava/consul-k8s-dev:04-16-2020-ae132bc image.

Requires #401 to be merged first.

@ishustava ishustava added area/acls Related to ACLs theme/external-servers Related to Consul servers running outside the cluster labels Apr 8, 2020
* server-acl-init-job sets server addresses
  if 'externalServers.enabled' is true
* server-acl-init and server-acl-init-cleanup jobs
  and resources now run either when
  servers are enabled or when externalServers are enabled
* Add new acls.bootstrapToken value for providing your own
  bootstrap token.
@ishustava ishustava marked this pull request as ready for review April 10, 2020 00:06
@ishustava ishustava requested a review from adilyse April 10, 2020 00:06
@pratiklotia
Copy link

Have a question @ishustava

Great suggestion for the new feature. My question is, after adding this capability, how can we customize which client node gets what type of policy (read/write) and which service gets what policy. Looks like the additional configuration in the helm chart here, will ask for a 'standard' policy to be set on the consul server? For example, using this - all client nodes (in the form of a k8s node) will get the same policy, correct? Or is there room to customize policy per client node?

@ishustava
Copy link
Contributor Author

Hey @pratiklotia, thanks for your feedback. Correct, at the moment, all client pods are given the same token even if servers are deployed on k8s too.

This is a good point though and is something that has come up before. I think we should look into giving clients a unique token and making their policies more restrictive, but I think this is out-of-scope for this feature. I've filed #425 to track this separately.

@ishustava ishustava changed the base branch from master to acls-external-servers April 16, 2020 18:03
Copy link
Contributor

@adilyse adilyse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the addition of the $serverEnabled variable in the templates! It makes the logic very clear.

Warnings?

Should we warn folks using external servers if their client.join and externalServers.https.address values don't match?

Should we warn folks if externalServers.enabled=true and servers are enabled?

External server options

I'm having a hard time parsing some of the external server options. Specifically when to use them, whether they're optional and why I might need to set them.

  • https: Is this entire block optional since we fall back to the client.join setup?
  • https:address: It would be helpful to mention that this can't include a port.
  • https.port: Would there ever be a case where I've set my client.join property, but the port is pulled from here? Could that cause problems?
  • tlsServerName and useSystemRoots: Are these only useful settings when using a custom CA?

Should externalServers.https.address be an array? It mirrors client.join which is an array.

Tests

Any logic that references both server and externalServers needs to do an exclusion test by setting server.enabled to false since servers are enabled by default in the Helm chart.

templates/server-acl-init-job.yaml Show resolved Hide resolved
templates/server-acl-init-job.yaml Outdated Show resolved Hide resolved
values.yaml Show resolved Hide resolved
values.yaml Outdated Show resolved Hide resolved
values.yaml Show resolved Hide resolved
test/unit/server-acl-init-job.bats Show resolved Hide resolved
test/unit/server-acl-init-job.bats Show resolved Hide resolved
test/unit/server-acl-init-job.bats Show resolved Hide resolved
test/unit/server-acl-init-job.bats Show resolved Hide resolved
test/unit/server-acl-init-job.bats Outdated Show resolved Hide resolved
* Remove authMethodConfig.caCert property because it's likely
  pretty uncommon.
* Fail if both server and externalServers are enabled
* Add missing tests
@ishustava
Copy link
Contributor Author

ishustava commented Apr 21, 2020

@adilyse Thanks so much for the review and organizing your thoughts so clearly! I've updated the PR with your suggestions.

The main change that came out of that was the configuration change. I think allowing configuring a custom CA cert for the auth method will be pretty uncommon, and so I'd rather punt on adding that configuration. As a result, the authMethodConfig became just one property overrideAuthMethodHost to match overrideAuthMethodName.

With regards to the placement of this config property, I chose the connectInject section because it seemed more specific and because it already had other acl-related properties. In general, this property applies when all three are true: ACLs are enabled, external servers are enabled, and connect inject is enabled. I'm not sure how to choose the best section out of these three, and so I went with connectInject because that seemed the most consistent with other existing values, like aclBindingRuleSelector. I'm definitely open to suggestions.


Below are my responses to your other questions:

Should we warn folks using external servers if their client.join and externalServers.https.address values don't match?

No, because the purpose of this property is to be able to overwrite the client.join address used by default.

Should we warn folks if externalServers.enabled=true and servers are enabled?

Yep, I think that's a good idea. I'll add a failure to the templates and tests. I've also added a warning to the comment for the externalServers.enabled property.

https: Is this entire block optional since we fall back to the client.join setup?

Yes.

https:address: It would be helpful to mention that this can't include a port.

Updated the comment.
The comment already lists all the values that this property can take, i.e. DNS name, IP or cloud auto-join string. This implies that it should not include the port and I feel like mentioning it would be unnecessary.

https.port: Would there ever be a case where I've set my client.join property, but the port is pulled from here? Could that cause problems?

I'm not sure what you mean; maybe you have an example in mind? Your client.join property doesn't include the port, so you'd need to provide it somehow, otherwise, there's no easy way for us to know which HTTP(s) port to use.

tlsServerName and useSystemRoots: Are these only useful settings when using a custom CA?

tlsServerName is useful when the address you're connecting to is not included in your server certificate. Then you can set tlsServerName to something that is included in the certificate so that hostname verification part of the TLS handshake won't fail.
useSystemRoots is useful when your CA is not a custom CA but a known CA that is typically included in your OS system roots, for example, Let's Encrypt CA. In this case, we don't need to provide a CA for TLS connections.

Should externalServers.https.address be an array? It mirrors client.join which is an array.

We don't need this to be an array because we only need one server to talk to, doesn't matter which one.

Any logic that references both server and externalServers needs to do an exclusion test by setting server.enabled to false since servers are enabled by default in the Helm chart.

Good call! I've added missing tests.

Copy link
Contributor

@adilyse adilyse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the updates look really good! The tests in particular are 💯 🎉

The one major concern I still have with this is the client.join and its interaction with externalServers.https.port. We learned yesterday that the retry-join flag can take a port value. If someone has this set in their client.join and hasn't updated the externalServers.https.port since they assume it's taken care of by client.join, we'll overwrite that port with the default value of 443.

Given the complexity of splitting off the port, does it make sense to ask people to instead explicitly set the server address?

test/unit/server-acl-init-job.bats Outdated Show resolved Hide resolved
values.yaml Outdated Show resolved Hide resolved
# This address must to be reachable from the Consul servers.
# Please see https://www.consul.io/docs/acl/auth-methods/kubernetes.html.
# Requires consul-k8s >= 0.14.0.
overrideAuthMethodHost: ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this makes more sense to be under the externalServers section. You can set it when not using external servers, but it's unlikely that very many people will need. On the other hand, it could be quite common for folks running external servers to need to make use of this value. It could be easy to miss it though, since it's not grouped with those settings.

Comment on lines +121 to 123
{{- if not .Values.externalServers.enabled }}
-server-port=8501 \
{{- end }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might make sense to be set near the -server-address flag.

@ishustava
Copy link
Contributor Author

If somebody has provided a client.join with a port, then I think we can make the server-acl-init command ignore the port if it's provided to the server-address flag.

I'm not super concerned with port being overwritten because client.join will contain the serf port, while we need an HTTP port, so you would need to provide it anyway.

ishustava and others added 2 commits April 22, 2020 13:51
Co-Authored-By: Rebecca Zanzig <rzazzl@hotmail.com>
Co-Authored-By: Rebecca Zanzig <rzazzl@hotmail.com>
@ishustava ishustava merged commit 554f936 into acls-external-servers Apr 23, 2020
@ishustava ishustava deleted the acl-init-external-servers branch April 23, 2020 02:47
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area/acls Related to ACLs theme/external-servers Related to Consul servers running outside the cluster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bootstrapACL works only with clusters inside the same k8s cluster.
3 participants