Client Certificate Authentication doesn't work with ACM Certificate #9756

Closed
rifelpet opened this issue Aug 14, 2020 · 7 comments

Comments

@rifelpet
Member

1. What kops version are you running? The command kops version will display
this information.

Version 1.18.0 (git-698bf974d8)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

1.18.6

3. What cloud provider are you using?

AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

kops export kubecfg; kops -v 3 rolling-update cluster

5. What happened after the commands executed?

I0810 20:12:33.576456      45 factory.go:68] state store s3://foo
I0810 20:12:33.576612      45 s3context.go:331] product_uuid is "ec2d00d9-b726-73fa-f3b8-4f91daa9352e", assuming running on EC2
I0810 20:12:34.342631      45 s3context.go:164] got region from metadata: "us-east-1"
I0810 20:12:34.445237      45 s3context.go:210] found bucket in region "us-east-1"
Unable to reach the kubernetes API.
Use --cloudonly to do a rolling-update without confirming progress with the k8s API
error listing nodes in cluster: Unauthorized

6. What did you expect to happen?

rolling update to succeed

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  name: foo.k8s.local
spec:
  api:
    loadBalancer:
      type: Internal
      idleTimeoutSeconds: 3600
      sslCertificate: arn:aws:acm:us-east-1:0000000000:certificate/fdbd523b-94e5-48a5-bc72-3fcbb2b45c2d
...

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else we need to know?

When an ACM certificate is provided for the API ELB, the listener is switched from TCP to TLS, so TLS is terminated at the load balancer and client certificate authentication breaks. Before Kops 1.18, users whose kubeconfig files were generated with kops export kubecfg were implicitly falling back to basic authentication. With basic auth deprecated and slated for removal, we'll need to provide a way for client cert auth to work when ACM certificates are used.
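To see whether a given kubeconfig has been relying on the basic-auth fallback, one option is to check which credential types its user entry carries. The snippet below is a minimal sketch, not part of kops: the helper name auth_methods and the sample entry are made up for illustration, and it assumes the user entry has already been loaded into a plain dict using the standard kubeconfig field names.

```python
def auth_methods(user: dict) -> list:
    """Return the credential types present in one kubeconfig `user` mapping.

    Hypothetical helper for illustration; it only inspects standard
    kubeconfig field names and performs no network calls.
    """
    methods = []
    # Client certificates can be embedded (-data) or referenced by file path.
    if user.get("client-certificate-data") or user.get("client-certificate"):
        methods.append("client-cert")
    if user.get("token"):
        methods.append("token")
    # Basic auth needs both a username and a password.
    if user.get("username") and user.get("password"):
        methods.append("basic")
    return methods


# Made-up entry of the kind described above: a client certificate is
# present, but basic-auth credentials sit alongside it.
legacy_user = {
    "client-certificate-data": "<base64 cert>",
    "client-key-data": "<base64 key>",
    "username": "admin",
    "password": "<redacted>",
}
print(auth_methods(legacy_user))  # ['client-cert', 'basic']
```

If "basic" is the only method that still works behind a TLS listener, that kubeconfig will start returning Unauthorized once basic auth is disabled.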

Ideas:

  • Kops creates an api.internal.$clustername A record that points to the master internal IPs. We could use this domain name in the kubeconfig file and have the client bypass the API ELB entirely. This requires:

    • Security Group rules on the master instances need to allow 443 access from the same sources that the API ELB allows.
    • Clients need access to the internal IPs (VPN connection for private topologies)
    • A strategy (cli flag?) for having kops use the internal domain name in the kubeconfig files it generates

    PR #9732 ("Add --internal flag for export kubecfg that targets the internal dns name") implements this idea, minus the required security group changes.

  • Create a second (TCP) listener on the API ELB without the certificate, pointing to the same master ports

    • This will also need a strategy for having kops use the second ELB port in the kubeconfig files it generates
  • Establish an SSH tunnel to the masters, bypassing the API ELB
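The first idea only helps if the client can actually open a connection to the masters on port 443. A rough way to check that precondition is sketched below. The helper names internal_api_host and can_reach are hypothetical; the only detail taken from the thread is the api.internal.$clustername naming.

```python
import socket


def internal_api_host(cluster_name: str) -> str:
    """Build the internal API DNS name described in the first idea."""
    return f"api.internal.{cluster_name}"


def can_reach(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only tests network reachability (DNS resolution, routing, security
    groups); it does not perform a TLS handshake or authenticate.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, can_reach(internal_api_host("my.example.com")) would be expected to return False from outside the VPC unless a VPN and matching security group rules are in place.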

@johngmyers
Member

The second listener on the API ELB seems straightforward. If kops export kubecfg is run with --admin and sslCertificate is specified, use the alternate port in the cluster's server URL.
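The selection rule proposed here could look roughly like the sketch below. It is an illustration only, not kops code: the function name api_server_url is made up, and port 8443 is a placeholder, since the thread does not say which port the second listener would use.

```python
# Placeholder port for the second (plain TCP) listener; the real value is
# not specified in this thread.
ALT_TCP_PORT = 8443


def api_server_url(host, ssl_certificate, admin, alt_port=ALT_TCP_PORT):
    """Pick the server URL to write into the kubeconfig.

    Client certificate (admin) credentials only work over the TCP listener,
    so they are pointed at the alternate port when an ACM certificate is
    configured. Everything else keeps the default HTTPS port.
    """
    if admin and ssl_certificate:
        return f"https://{host}:{alt_port}"
    return f"https://{host}"


# Made-up hostname and ARN, for illustration only.
acm_arn = "arn:aws:acm:us-east-1:0000000000:certificate/example"
print(api_server_url("api.my.example.com", acm_arn, admin=True))
# https://api.my.example.com:8443
print(api_server_url("api.my.example.com", None, admin=True))
# https://api.my.example.com
```

Keeping the default port for non-admin exports means users who authenticate some other way still go through the ACM-backed TLS listener.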

@hakman
Member

hakman commented Aug 14, 2020

I agree with @johngmyers. A second TCP listener on the API ELB seems like the best solution, added only when sslCertificate is set.

@rifelpet
Member Author

A second TCP listener would work with setups that don't provide direct access to the instances, such as private topologies without a VPN.

I had a concern that adding or removing the second listener when the sslCertificate field is set or unset might invalidate existing kubeconfig files, but adding or removing the field on its own is already enough to invalidate them, since the CA would need to be added or removed anyway. Also needing to update the server port is trivial.

I guess the remaining source of confusion could be users upgrading their Kops version and seeing an additional listener being added, but we can make this change prominent in the release notes.

Would there be concerns with using a nonstandard port to send TLS traffic? I'm thinking of corporate proxy situations, but it seems like they would be using their own CA rather than relying on an ACM certificate. Very locked-down firewalls might make upgrading more troublesome for users, but I suppose that would happen regardless of which solution we choose here.

I'll open a PR for the second listener approach shortly.

@sepulworld

Is this new behavior in kops 1.18? We are running into the same issue. We use an AWS ELB with an ACM cert applied in front of the master instances. It's been working great with kops until now. Are there any other possible workarounds for this issue?

@rifelpet
Member Author

What changed in Kops 1.18 is that basic auth is now disabled by default. It can be re-enabled in Kubernetes 1.18 by following these docs, but setting the API field value to false instead.
This will only work in Kubernetes 1.18, so we will need an alternative for 1.19, when basic auth is removed entirely.

@rifelpet
Member Author

This should be fixed in v1.19.0-beta.1 by migrating to an NLB. See https://github.com/kubernetes/kops/blob/master/permalinks/acm_nlb.md for more info.

/close

@k8s-ci-robot
Contributor

@rifelpet: Closing this issue.

