
Connection to load balancer HTTPS port from within cluster does not terminate TLS #8

Open
jcassee opened this issue Mar 3, 2019 · 37 comments

Comments

@jcassee

jcassee commented Mar 3, 2019

When a pod within the cluster connects to a load balancer HTTPS port that is configured to perform TLS termination (i.e. has a certificate configured), TLS is not terminated and the connection is forwarded to the pod HTTP port as-is. This causes traffic from within the cluster to fail.

The Service definition:

kind: Service
apiVersion: v1
metadata:
  name: traefik
  annotations:
    service.beta.kubernetes.io/do-loadbalancer-protocol: http
    service.beta.kubernetes.io/do-loadbalancer-tls-ports: "443"
    service.beta.kubernetes.io/do-loadbalancer-certificate-id: XXX
spec:
  type: LoadBalancer
  selector:
    app: traefik
  ports:
    - name: http
      port: 80
    - name: https
      port: 443
      targetPort: 80

(See also the https-with-cert-nginx.yml example.)

Connection flow:
External -> LB proto HTTPS port 443 -> Service proto HTTP port 443 -> Pod proto HTTP port 80
Internal -> LB proto HTTPS port 443 -> Service proto HTTPS port 443 -> Pod proto HTTPS port 80
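
For reference, the behavior can be observed from inside the cluster with a throwaway curl pod (a sketch, not part of the original report; the image and LB_EXTERNAL_IP are placeholders):

# Run a one-off pod and connect to the load balancer's public IP over HTTPS.
# If the request is short-circuited to the pod's plain-HTTP port, the TLS
# handshake fails with an error such as "wrong version number".
kubectl run tls-check --rm -it --restart=Never --image=curlimages/curl --command -- \
  curl -kv https://LB_EXTERNAL_IP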

(For DigitalOcean engineers, I posted debugging information in support ticket 3402891.)

@jcassee jcassee changed the title Access to load balancer HTTPS port from within cluster does not terminate TLS Connection to load balancer HTTPS port from within cluster does not terminate TLS Mar 3, 2019
@timoreimann
Contributor

Hey @jcassee, thanks for the report. I can confirm and reproduce the behavior you're describing. Let me look into it and get back to you as soon as I know more.

@timoreimann
Contributor

@jcassee here's a question while we are still investigating the issue: is there a particular reason you are not using the Service's cluster IP / internal DNS name from a pod running inside the cluster?

@jcassee
Author

jcassee commented Mar 8, 2019

@timoreimann Sure, the thing is we use HAL API resources, and links are absolute URLs that are accessed by applications both within and outside of the cluster.

@timoreimann
Contributor

timoreimann commented Mar 8, 2019

@jcassee thanks for clarifying. As a workaround, I wonder if you could use DNS names that point to the external IP and cluster IP, respectively, depending on whether they are resolved in-cluster or out-of-cluster?

@jcassee
Author

jcassee commented Mar 8, 2019

@timoreimann Well I could try, but the pods use HTTP and the URLs are HTTPS. TLS termination is handled by the load balancer.

@timoreimann
Contributor

Ah true, you'd have to change the protocol as well.

I see how this can be bothersome. I opened an internal ticket to investigate, will keep you posted.

@michiels

michiels commented Mar 9, 2019

I was googling for my own problem and came across this issue and I think my problem is related.

I created a self-signed SSL certificate and uploaded that to DigitalOcean certificates. Then I set up a LoadBalancer service and deployment exactly like in the example at: https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/master/docs/controllers/services/examples/https-with-cert-nginx.yml

HTTP requests are forwarded fine to the nginx backend. However, TLS requests seem to be forwarded as-is, recognisable by the bunch of hex characters that I see coming into the backend nginx access logs.

So it looks like SSL is not terminated by the load balancer. I used the IP address of my load balancer as CN for the self-signed certificate.

Maybe to give some context: I want to use my Kubernetes cluster as a backend pool that lives behind other network elements we already have in our current infrastructure. However, since DO LBs do not live on / have a private network IP, I want to make sure that traffic from the edge router is sent encrypted to the LB and Kubernetes cluster.

@michiels

michiels commented Mar 9, 2019

Some additional info: when listing the load balancers via doctl compute load-balancers list, the certificate_id attribute appears to be empty, whereas it is filled with an ID in the README of your examples:

67c06198-88f5-4af0-a736-faa4ab8c012c    188.166.134.192    ad076a476424a11e99153eebbb20ed67    active    2019-03-09T09:07:53Z    round_robin    ams3             135662233,135662234    false    type:none,cookie_name:,cookie_ttl_seconds:0    protocol:tcp,port:31708,path:,check_interval_seconds:3,response_timeout_seconds:5,healthy_threshold:5,unhealthy_threshold:3    entry_protocol:tcp,entry_port:80,target_protocol:tcp,target_port:31708,certificate_id:,tls_passthrough:false entry_protocol:tcp,entry_port:443,target_protocol:tcp,target_port:31380,**certificate_id**:,tls_passthrough:false

Let me know if this is a separate issue; then I'll open one and move my comments there so as not to disrupt the original issue by @jcassee :)

@timoreimann
Contributor

@michiels this might be a different issue. Could you post your Service object in YAML format to be certain? I know you said it resembled the example, but it'd be good to double-check. Thanks.

@timoreimann
Contributor

@michiels you can also check if the events from CCM show anything suspicious via kubectl get events.

@erkie

erkie commented Mar 14, 2019

We also ran into this problem in production. An internal API call was routed to the same domain and failed with errors like CONNECT_CR_SRVR_HELLO:wrong version number. It turns out the iptables rules set up by kube-proxy (or something like it) were hijacking requests to the load balancer's port 443 from within the cluster and sending them to port 80.

@timoreimann
Contributor

@erkie thanks for sharing your feedback. We have received a few other reports by now and are currently looking into the issue. Will let you know as soon as we've got something.

@timoreimann
Contributor

We have confirmed now that Kubernetes purposefully routes requests for external LBs towards the associated pods directly, thereby bypassing the LB and leading to the issues described here by some people. There is an upstream issue about the matter, and we have started to engage in discussions in order to determine whether a solution built into the Kubernetes core might be feasible at some point.

Any newly created upstream solution would certainly need a few release cycles to become available. We have been thinking about quick workarounds feasible today, but it seems difficult to find one. :/ For now, let's see where the upstream discussion leads.
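
For anyone who wants to see the mechanism on a node, the rules that kube-proxy programs for the Service can be inspected as follows (a sketch, assuming kube-proxy runs in iptables mode; LB_EXTERNAL_IP is a placeholder):

# On a worker node: dump the NAT rules and look for the LB's external IP.
# Its presence in the KUBE-SERVICES chain is what causes in-cluster traffic
# to be redirected straight to the pods instead of leaving via the LB.
sudo iptables-save -t nat | grep LB_EXTERNAL_IP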

@jcassee
Author

jcassee commented Mar 25, 2019

My current workaround is to make the load balancer service HTTPS-only, then manually add a dummy port 80 HTTP rule and enable HTTP->HTTPS redirection. The pod(s) behind the service need(s) to support HTTPS, of course. This requires a manual step, but it is currently the only set-up that works.

@timoreimann
Contributor

timoreimann commented Apr 10, 2019

Update: we got in touch with SIG Networking (meeting recording). The plan forward is to put together a PR that addresses the issue. @jcodybaker will be working on that front.

Will keep you posted as we make progress.

@jcassee
Author

jcassee commented Jun 19, 2019

Note that recently my workaround stopped working because the load balancer will now connect to the node using HTTP instead of HTTPS, even though the protocol is HTTPS and the service port is 443.

@timoreimann Is kubernetes/kubernetes#77523 the fix for this issue?

@timoreimann
Contributor

@jcassee

Note that recently my workaround stopped working because the load balancer will now connect to the node using HTTP instead of HTTPS, even though the protocol is HTTPS and the service port is 443.

Hmm strange, the protocol annotations should still work. If you have an example Service manifest to look at, we could investigate.

Is kubernetes/kubernetes#77523 the fix for this issue?

Unfortunately not -- see also my coworker's comment on the PR.

@jcassee
Author

jcassee commented Jun 19, 2019

@timoreimann Sure, this is the manifest:

kind: Service
apiVersion: v1
metadata:
  name: traefik
  namespace: traefik
  labels:
    app: traefik
    component: traefik
  annotations:
    service.beta.kubernetes.io/do-loadbalancer-protocol: https
    service.beta.kubernetes.io/do-loadbalancer-tls-ports: "443"
    service.beta.kubernetes.io/do-loadbalancer-algorithm: least_connections
    service.beta.kubernetes.io/do-loadbalancer-certificate-id: XXX
    service.beta.kubernetes.io/do-loadbalancer-healthcheck-protocol: tcp
spec:
  type: LoadBalancer
  selector:
    app: traefik
    component: traefik
  ports:
    - name: https
      port: 443

The change in behavior started when the cluster nodes were recycled after the recent critical update. (At that time, the node names started to contain the cluster name.)

The same manifest is used without problems on a different cluster that has not yet been updated (because of this issue).

Let me know if I can do anything else to help debug.

@jcassee
Author

jcassee commented Jun 28, 2019

@timoreimann The problem I mentioned above has not occurred in the last week. It may be fixed...?

@timoreimann
Contributor

@jcassee are you saying that your workaround started working again, or that the general routing problem this issue describes has been fixed?

@jcassee
Author

jcassee commented Jun 28, 2019

@timoreimann Sorry, I meant that my workaround seems to be working and stable again.

@timoreimann
Contributor

timoreimann commented Jun 29, 2019

@jcassee off the top of my head, I can't think of a recent change we made that would have been relevant to your workaround. Figuring it out for sure depends on which CCM / DOKS image versions your cluster was on across the timeline of when your workaround was doing fine, when it started to fail, and when it started working again.

Something to keep in mind is that manual LB changes (i.e., modifying the LB directly on the DO cloud control panel / the DO API vs. making changes to the Service object exclusively) will eventually be reconciled by CCM but it can involve a big delay: CCM only reconciles when it detects a delta between the current and the future state on the local Kubernetes end (i.e., on the Service object). So it would take another local change or a CCM restart (as happening during a cluster upgrade) for the LB customization to be reverted. I know a few customers have run into this and got surprised (and it's something we need to address, at least by better documentation).
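
For example, any local change to the Service creates such a delta and prompts a reconciliation; a sketch (the annotation key below is arbitrary, purely to touch the object):

kubectl annotate service traefik example.com/force-sync="$(date +%s)" --overwrite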

Given that it's working for you now, I'm inclined to skip any further investigations unless you see the problem resurfacing. Please ping if that's the case, I'm happy to help.

@marufbd

marufbd commented Jul 2, 2019

@timoreimann I am having the same problem on a k8s 1.13 cluster in DigitalOcean.
I deployed the same setup on AWS using their 1.13 cluster, terminating SSL on the load balancer with a certificate, and it does not have this problem: I can successfully curl https://lb-external-ip from within a pod.

For DigitalOcean

kubectl get services --namespace ingress-nginx

NAME            TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE
ingress-nginx   LoadBalancer   10.245.55.185   MY-LB-IP   80:31978/TCP,443:31285/TCP   125d

All commands from a pod within cluster:

nslookup MY-LB-IP

Server:    10.245.0.10
Address 1: 10.245.0.10 kube-dns.kube-system.svc.cluster.local

Name:      MY-LB-IP
Address 1: MY-LB-IP ingress-nginx.ingress-nginx.svc.cluster.local

curl -k -v https://MY-LB-IP

* Rebuilt URL to: https://MY-LB-IP/
*   Trying MY-LB-IP...
* TCP_NODELAY set
* Connected to MY-LB-IP (MY-LB-IP) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
* Curl_http_done: called premature == 1
* stopped the pause stream!
* Closing connection 0
curl: (35) error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol

For AWS

kubectl get services --namespace ingress

NAME                            TYPE           CLUSTER-IP       EXTERNAL-IP                                                                  PORT(S)                      AGE
nginx-ingress-controller        LoadBalancer   10.100.246.139   ae19475499c7311e99a610658fb1046d-1415231990.eu-central-1.elb.amazonaws.com   80:30922/TCP,443:31923/TCP   3h4m

Commands from a pod:
nslookup ae19475499c7311e99a610658fb1046d-1415231990.eu-central-1.elb.amazonaws.com

Server:    10.100.0.10
Address 1: 10.100.0.10 kube-dns.kube-system.svc.cluster.local

Name:      ae19475499c7311e99a610658fb1046d-1415231990.eu-central-1.elb.amazonaws.com
Address 1: 52.58.147.72 ec2-52-58-147-72.eu-central-1.compute.amazonaws.com
Address 2: 52.28.12.129 ec2-52-28-12-129.eu-central-1.compute.amazonaws.com
Address 3: 18.194.22.67 ec2-18-194-22-67.eu-central-1.compute.amazonaws.com

curl -k -v https://ae19475499c7311e99a610658fb1046d-1415231990.eu-central-1.elb.amazonaws.com

* Rebuilt URL to: http://ae19475499c7311e99a610658fb1046d-1415231990.eu-central-1.elb.amazonaws.com/
*   Trying 52.28.12.129...
* TCP_NODELAY set
* Connected to ae19475499c7311e99a610658fb1046d-1415231990.eu-central-1.elb.amazonaws.com (52.28.12.129) port 80 (#0)
> GET / HTTP/1.1
> Host: ae19475499c7311e99a610658fb1046d-1415231990.eu-central-1.elb.amazonaws.com
> User-Agent: curl/7.61.1
> Accept: */*
> 
< HTTP/1.1 404 Not Found
< Content-Type: text/plain; charset=utf-8
< Date: Tue, 02 Jul 2019 07:06:40 GMT
< Server: nginx/1.15.8
< Content-Length: 21
< Connection: keep-alive
< 
* Connection #0 to host ae19475499c7311e99a610658fb1046d-1415231990.eu-central-1.elb.amazonaws.com left intact

The only difference I can see is that the nslookup output for DigitalOcean contains a DNS entry resolving the LB IP to the in-cluster Service, whereas no such entry exists for AWS.

@timoreimann
Contributor

@marufbd thanks for sharing your test results.

The reason this works on AWS is that the ingress hostname is not subject to the same bypassing mechanism. We also looked into leveraging ingress hostnames for DigitalOcean load balancers. Unfortunately, this isn't easily feasible for certain reasons.

I think the best way forward is still to try to submit an upstream fix. We were running short on bandwidth over the last couple of weeks but hope to be able to tackle the matter in the foreseeable future.

@timoreimann timoreimann transferred this issue from digitalocean/digitalocean-cloud-controller-manager Jul 11, 2019
@timoreimann
Contributor

I have transferred this issue into our new, generic DOKS feature/bug tracking repository.

@timoreimann
Contributor

While the underlying issue is yet to be fixed, CCM v0.1.17 supports a workaround: users may specify a custom hostname and point a corresponding DNS record to the external IP address of the LB. A more detailed guide is available in the CCM documentation.

Per our release notes, the feature requires at least one of Kubernetes 1.15.2-do.0, 1.14.5-do.0, or 1.13.9-do.0.
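
A minimal sketch of what this looks like on a Service (the hostname, certificate ID, and selector below are placeholders rather than a tested configuration; the linked CCM guide remains the authoritative example):

kind: Service
apiVersion: v1
metadata:
  name: my-service
  annotations:
    service.beta.kubernetes.io/do-loadbalancer-protocol: http
    service.beta.kubernetes.io/do-loadbalancer-tls-ports: "443"
    service.beta.kubernetes.io/do-loadbalancer-certificate-id: XXX
    # Publishes a hostname (instead of the raw LB IP) in the Service status,
    # so in-cluster clients resolving that name reach the LB rather than
    # being short-circuited to the pods.
    service.beta.kubernetes.io/do-loadbalancer-hostname: "lb.example.com"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - name: http
      port: 80
    - name: https
      port: 443
      targetPort: 80

A DNS A record for lb.example.com must then point at the LB's external IP.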

@vasili439

While the underlying issue is yet to be fixed, CCM v0.1.17 supports a workaround: users may specify a custom hostname and point a corresponding DNS record to the external IP address of the LB. A more detailed guide is available in the CCM documentation.

Per our release notes, the feature requires at least one of Kubernetes 1.15.2-do.0, 1.14.5-do.0, or 1.13.9-do.0.

Hi Timo, so every time I need to obtain or renew a certificate, do I need to manually add
service.beta.kubernetes.io/do-loadbalancer-hostname: "hello.example.com"
to the LB Service manifest?

@dottodot

@timoreimann What if ingress-nginx is being used with multiple domains, as in the manifest below? I'm assuming that workaround is not going to work then.

kind: Service
apiVersion: v1
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
spec:
  externalTrafficPolicy: Local
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https

@timoreimann
Contributor

@vasili439 the new annotation works independently of certificates. You'd still use service.beta.kubernetes.io/do-loadbalancer-certificate-id to set or update your certificate.

@timoreimann
Contributor

@dottodot I'm not super familiar with the Nginx controller. To my understanding though, it shouldn't affect your scenario: essentially, the new hostname annotation just sets the hostname part of the Ingress status field. Nothing should stop you from setting up further hostnames/DNS names (in addition to the one for the hostname from the annotation) and having those point to the load balancer IP as well.

You could also skip the hostname annotation entirely, set up extra DNS names, and point your clients to those. Returning the hostname within the Ingress status is supposed to ease consumption of the field, but that's not a requirement.
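
A quick way to check what gets published to clients consuming the status field (a sketch; substitute your own Service name and namespace):

# Prints the hostname set via the annotation; empty if only an IP is published.
kubectl -n ingress-nginx get service ingress-nginx \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'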

@timoreimann
Contributor

The community has started an effort in the form of a KEP to disable the bypassing: kubernetes/enhancements#1392

@spenceclark

@timoreimann I've been trying to follow this issue via the many open issues on the Kubernetes GitHub, and it doesn't sound like there is a fix yet. Is that correct?

I've been looking at your workaround (https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/master/docs/controllers/services/annotations.md#servicebetakubernetesiodo-loadbalancer-hostname) but I'm not sure how to implement it.

We have many DNS A records (DNS managed by DO) pointing to a single load balancer (managed by DO) and then have Ingress definitions in Kubernetes to direct requests to the correct services based on hostname.

I can't see how the service.beta.kubernetes.io/do-loadbalancer-hostname setting works with the different A records, since you can only specify one hostname there.

@timoreimann
Contributor

Hey @spenceclark

You're correct that the issue isn't resolved yet. The best we have today is the workaround you've been looking at.

For multiple DNS records, the suggestion is to use CNAMEs that all point at the hostname. We have a bit more on that in the docs at https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/master/docs/controllers/services/examples/README.md#accessing-pods-over-a-managed-load-balancer-from-inside-the-cluster.
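
For example (a sketch with hypothetical names; lb.example.com stands in for the hostname set via the do-loadbalancer-hostname annotation):

# A record for the hostname referenced by the annotation.
doctl compute domain records create example.com \
  --record-type A --record-name lb --record-data LB_EXTERNAL_IP

# Additional hostnames as CNAMEs that all point at that hostname.
doctl compute domain records create example.com \
  --record-type CNAME --record-name www --record-data lb.example.com.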

Hope this helps. If not, please let me know.

@spenceclark

Thanks for your reply. So I would create a new A record (for example lb-internal.xxxx.net), add that to the LB config using service.beta.kubernetes.io/do-loadbalancer-hostname, and then create individual CNAME records for each host I need to access internally? Would I need to add each CNAME to the SSL certificate as well? (We're using Let's Encrypt via DO also.)

@spenceclark

Ah ok, I was overcomplicating it in my head. I already had all the required DNS records for each host and the SSL and Ingress config.

All I needed to do was add a new A record (lb-internal.xxxxx.net) and update the LB Service definition to use it, and that has fixed it: the workaround is working as described.

@xaviablaza

I did the same thing as @spenceclark and it worked for me!

@liarco

liarco commented Aug 20, 2020

@timoreimann I'm currently testing the proposed workaround and, while it seems to work as expected, there is one thing that I don't understand:

The guide says:

To make the load-balancer accessible through multiple hostnames, register additional CNAMEs that all point to the hostname. SSL certificates could then be associated with one or more of these hostnames.

Why should I configure additional domains using CNAMEs to the same hostname instead of A records to the LB external IP?
My understanding of the issue is that the LB is not bypassed if it returns a hostname instead of a plain IP (since iptables/ipvs don't support rules based on hostnames), so just setting a CNAME and the do-loadbalancer-hostname annotation should be enough to fix the problem. Am I wrong?

I'm worried about this because I run multiple websites (each with its own domain) and having to point each domain to the common hostname would have 2 issues:

  • users connecting to www.clientdomain.com would have to resolve loadbalancer.mycompany.com before reaching the loadbalancer
  • domains like clientdomain.com cannot be anything but A records, so they wouldn't work (e.g. cert-manager's self-validation for Let's Encrypt certificates would fail)

My current configuration seems to work with features like PROXY protocol, http-to-https redirect, TLS passthrough and cert-manager:

  • loadbalancer.mycompany.com -> LB external IP
  • clientdomain.com -> LB external IP
  • www.clientdomain.com -> LB external IP
  • do-loadbalancer-hostname -> loadbalancer.mycompany.com

What am I missing? I suspect the "over complication" by @spenceclark was due to the same thing...

Thank you for your time
