This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

TLS handshake errors caused by Controller ELB running tcp healthchecks to 443 #295

Closed
iamsaso opened this issue Feb 2, 2017 · 21 comments

iamsaso (Contributor) commented Feb 2, 2017

We are seeing a lot of "TLS handshake error" messages on kube-apiserver and authproxy.

[Screenshot: kube-apiserver log output showing repeated TLS handshake errors]

Anyone have any ideas why this is happening?

redbaron (Contributor) commented Feb 2, 2017

Check from the client side; there should be a more informative error there.

iamsaso (Author) commented Feb 2, 2017

The client side works and I don't see any errors.

redbaron (Contributor) commented Feb 2, 2017

There must be. These IPs belong to pods, right? Something there failed to establish a connection.

iamsaso (Author) commented Feb 2, 2017

Huh, OK, I think I can pin it to a service and an ELB health check. Thanks!

@iamsaso iamsaso closed this as completed Feb 2, 2017
@iamsaso iamsaso reopened this Feb 2, 2017
iamsaso (Author) commented Feb 2, 2017

[Screenshots: ELB configuration showing a TCP health check against the SSL listener on port 443]

This happens because we are doing a TCP health check against an SSL listener.

@mumoshu mumoshu changed the title TLS handshake error TLS handshake errors from ELB running tcp healthchecks to 443 Feb 5, 2017
@mumoshu mumoshu changed the title TLS handshake errors from ELB running tcp healthchecks to 443 TLS handshake errors caused by Controller ELB running tcp healthchecks to 443 Feb 16, 2017
whereisaaron (Contributor) commented

The apiserver requires a client certificate the ELB doesn't have, so the ELB can only do a TCP connect health check, and that connect/disconnect triggers those errors in the logs. I get them too. It would be nice if there were some way to suppress them, or to not log a zero-data connect/disconnect at all.

mumoshu (Contributor) commented Mar 22, 2017

Thanks for the info @whereisaaron!
How about allowing unauthenticated HTTP access to the apiserver from the ELB via a security group, so that we can use HTTP health checks instead of TCP checks?

whereisaaron (Contributor) commented

Enabling any unauthenticated/anonymous access to the API makes me a bit uneasy, @mumoshu!

It is possible for failed client-cert requests to fall back to anonymous, but I don't see a way to restrict those anonymous requests to just the ELB, so everyone with access to the API (from either side of the ELB) could make anonymous requests. These would map to the system:unauthenticated RBAC group, so access could be restricted by RBAC. But if you didn't have RBAC enabled, you'd have the whole cluster open for unauthenticated admin access.

So with --anonymous-auth=true combined with RBAC authorization, you could enable successful HTTPS health checks from the ELB. That would also avoid these false-positive log entries.
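This combination boils down to two apiserver flags. A minimal sketch, assuming typical certificate paths (the paths and the exact flag set are illustrative, not the project's actual configuration):

```shell
# --anonymous-auth=true: requests without credentials are admitted as the
# system:unauthenticated group instead of being rejected at authentication,
# which lets an ELB HTTPS health check reach /healthz without a client cert.
# --authorization-mode=RBAC: those anonymous requests are then denied
# everything that RBAC does not explicitly grant.
# Certificate paths below are illustrative.
kube-apiserver \
  --anonymous-auth=true \
  --authorization-mode=RBAC \
  --tls-cert-file=/etc/kubernetes/ssl/apiserver.pem \
  --tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
```

Without RBAC (or another restrictive authorizer) in the mode list, the first flag alone would be dangerous, as noted above.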

I personally think these log entries inappropriately conflate an empty TCP connect with a real TLS negotiation error. If the client doesn't send a single byte, I'm not sure you can claim it was trying to negotiate anything. It's like someone visiting your website's login page without attempting to log in, and counting that as a failed login: a specious stance. Fixing that would require a patch to the authproxy, though.

mumoshu (Contributor) commented Mar 23, 2017

Thanks @whereisaaron!
How about --insecure-port then? I guess we can restrict access to the insecure port of kube-apiserver with a security group so that only the ELBs are allowed to reach it.

mumoshu (Contributor) commented Mar 23, 2017

Ah, but doing so would let attackers reach the insecure port via compromised pods on controller nodes?

jeremyd (Contributor) commented Apr 4, 2017

I think we can close this. This is just how TCP health checks work, and there's no easy way to 'fix' it without sacrificing stability... cc @Saso #bugcleaning

redbaron (Contributor) commented Apr 6, 2017

Do we have any other ports opened by the apiserver that could be checked by the ELB?

cknowles (Contributor) commented Apr 6, 2017

/etc/kubernetes/manifests/kube-apiserver.yaml says:

```yaml
- containerPort: 443
  hostPort: 443
  name: https
- containerPort: 8080
  hostPort: 8080
  name: local
```

8080 is used for the liveness probe too (/healthz).

danielfm (Contributor) commented Apr 6, 2017

AFAIK 8080 is the insecure port (bound to 127.0.0.1 by default), so I don't think using this port would work.

mumoshu (Contributor) commented Apr 28, 2017

This seems to have been fixed by SSL healthchecks. See #604
@Sasso @danielfm @c-knowles @redbaron @whereisaaron Could you confirm? Thanks!
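For reference, a classic ELB health check of type SSL can be set with the AWS CLI. A sketch only: the load balancer name and thresholds below are placeholders, not kube-aws's actual values, and some health-check settings may differ per cluster:

```shell
# Target=SSL:443 makes the ELB complete a TLS handshake rather than a bare
# TCP connect/disconnect, so the apiserver stops logging handshake EOFs.
# Load balancer name and thresholds are hypothetical.
aws elb configure-health-check \
  --load-balancer-name my-k8s-api-elb \
  --health-check Target=SSL:443,Interval=10,Timeout=5,UnhealthyThreshold=3,HealthyThreshold=2
```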

@mumoshu mumoshu closed this as completed Apr 28, 2017
@mumoshu mumoshu added this to the v0.9.6-rc.6 milestone Apr 28, 2017
cknowles (Contributor) commented

Anyone seeing this again with a more recent version of kube-aws like 0.10.0?

whereisaaron (Contributor) commented

Hi @c-knowles, I think this fix was specific to the 'classic' ELB. Have you (like me) switched to the ELBv2 option that kube-aws offers now? That currently configures a TCP health check, which would have the same problem.

ELBv2 load balancers do support HTTP(S) health checks, even on TCP load balancers. So a similar fix may be possible for the ELBv2 configuration.
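If so, the equivalent fix for an ELBv2 target group would be something like the following sketch. The target group ARN is a placeholder, and note that depending on the target group type, some health-check settings can only be set at creation time:

```shell
# Switch the target group's health check from TCP to HTTPS against the
# apiserver's /healthz endpoint. ARN below is a placeholder.
aws elbv2 modify-target-group \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/k8s-api/0123456789abcdef \
  --health-check-protocol HTTPS \
  --health-check-path /healthz
```

Whether an HTTPS check against /healthz succeeds also depends on the apiserver accepting the request without a client certificate, per the anonymous-auth discussion above.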

cknowles (Contributor) commented

@whereisaaron nope, we haven't swapped unless kube-aws changed the default. I'll investigate further; for now, all the info I have is that this seems to be recurring and I'm not entirely sure why. My current guess is that it's the health checks we have inside the cloud-init scripts/systemd units, and the nodes were unhealthy at the time (unrelated health issues).

sonnysideup commented

I'm using kube-aws v0.10.2, a classic ELB for a single API endpoint and I'm seeing the TLS errors:

```
I0720 16:09:21.904678       1 logs.go:41] http: TLS handshake error from 10.30.12.238:34902: EOF
I0720 16:09:31.904127       1 logs.go:41] http: TLS handshake error from 10.30.12.238:34940: EOF
I0720 16:09:41.904106       1 logs.go:41] http: TLS handshake error from 10.30.12.238:34970: EOF
I0720 16:09:51.904188       1 logs.go:41] http: TLS handshake error from 10.30.12.238:35010: EOF
```

Maybe this is a regression?

g00nix commented Jul 23, 2019

When you dominate a market, you don't really care about small details like this. So what if the LB can only send TCP health checks? Just change the source code of your apps so they don't throw EOF errors when TCP health checks come in.

EASY!
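Sarcasm aside: until the logging changes upstream, the noise can at least be filtered when reading logs. A minimal grep sketch (the first two sample lines echo the excerpt above; in practice you would pipe `kubectl logs` or journald output through the same filter):

```shell
# The first two lines are health-check noise; the third is a made-up
# "real" entry we want to keep.
printf '%s\n' \
  'I0720 16:09:21.904678       1 logs.go:41] http: TLS handshake error from 10.30.12.238:34902: EOF' \
  'I0720 16:09:31.904127       1 logs.go:41] http: TLS handshake error from 10.30.12.238:34940: EOF' \
  'I0720 16:09:32.000000       1 trace.go:76] Trace: "List /api/v1/pods"' \
  | grep -v 'TLS handshake error'
# -> prints only the Trace line
```

This hides the symptom rather than fixing it, so it only makes sense once you have confirmed the errors really come from the load balancer's health checks.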

10 participants