ELB SSL handshake broken in us-east-2 #6181

justinsb · 2018-12-07T16:53:02Z

At least in us-east-2, with --topology=private, ELB reports master instances as being out of service when the health check is configured for SSL. Manually changing to TCP fixes things immediately.

apiserver is logging "http: TLS handshake error from 172.20.1.222:38426: EOF" during the failed health checks. It logs the same message with the TCP health checks as well, though!

Reported here: #6172 (comment)

I was able to reproduce in us-east-2

justinsb · 2018-12-07T16:53:56Z

Likely related to 2accc73

cc @johanneswuerbach

Although honestly we should do a better health check anyway....

justinsb · 2018-12-07T16:54:42Z

(by better health check I mean one that actually checks the health of apiserver & etcd, not just whether or not apiserver is listening - this predates 2accc73)
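
As a rough illustration of the kind of deeper check meant here, a probe could issue a real request against the apiserver's /healthz endpoint instead of only confirming that the port accepts connections. The sketch below is an assumption, not what kops or ELB does in this thread: the function name `apiserver_healthy`, the example URL, and the "200 with body ok" success criterion are illustrative, and a real cluster may also need a CA bundle or client credentials.

```python
import ssl
import urllib.error
import urllib.request
from typing import Optional


def apiserver_healthy(url: str, timeout: float = 5.0,
                      context: Optional[ssl.SSLContext] = None) -> bool:
    """Return True only if GET <url> answers 200 with the body 'ok'.

    Hypothetical helper: the success criterion is an assumption for
    illustration, not taken from kops or the ELB configuration.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return resp.status == 200 and resp.read().strip() == b"ok"
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, TLS failure, timeout, non-2xx status, or a
        # malformed URL all count as "unhealthy" for this sketch.
        return False
```

A caller might use it as `apiserver_healthy("https://203.0.113.10/healthz", context=ctx)`, where the address and `ctx` (an `ssl.SSLContext` trusting the cluster CA) are placeholders. Unlike a plain TCP or SSL ping, this fails when the process is listening but not serving requests.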

johanneswuerbach · 2018-12-07T19:38:24Z

Strange, we run all our clusters in us-east-1 and eu-west-1 with API ELBs configured with Ping Target | SSL:443 and never experienced any issues.

johanneswuerbach · 2018-12-07T19:44:37Z

Maybe it's caused by --topology=private, with SSL checks doing more back-and-forth than plain TCP checks?

We had similar issues in the past with misconfigured security groups or route tables on one side of the connection: the connection could be opened, but it then stalled because the responses never arrived.

markine · 2018-12-08T00:11:56Z

The us-east-1 vs us-east-2 contrast can be seen by using:

kops create cluster NAME --cloud aws --networking kopeio-vxlan --master-zones ZONES --zones ZONES --ssh-public-key KEY --master-size SIZE --master-volume-size SIZE --node-size SIZE --node-volume-size SIZE --node-count COUNT --output yaml --topology private --bastion --image AMI

and changing just the zones and the AMI (have to pick the AMI for the right zone). us-east-1 works, us-east-2 does not.

kops 1.10.0 (and kubernetes 1.10.11)

justinsb · 2018-12-08T00:26:27Z

Yes, what's weird is that I think sometimes us-east-2 works. I thought it was a kubernetes 1.10 vs 1.11 thing, but now I'm leaning towards it being random.

I've captured some packets, and it looks like the ELB health check is behaving more like a TCP health check - opening the connection and then closing it again. No TLS handshake being initiated.

The SSL code has been in kops for a few releases now, so I am leaning towards it being a problem with ELB in us-east-2. But nothing on the AWS status and nobody else reporting it, which is odd.
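
To make the packet-capture observation concrete: a client that really performs an SSL check sends a TLS ClientHello as its first bytes, while a TCP-style check connects and closes without sending anything (which the server side sees as EOF during the handshake). The helper below is a minimal sketch for telling the two apart from the first bytes of a captured connection; the name `classify_probe` and its return labels are made up for illustration.

```python
def classify_probe(first_bytes: bytes) -> str:
    """Classify what a client sent first on a TLS port.

    Hypothetical helper for reading packet captures; the labels it
    returns are illustrative, not AWS or kops terminology.
    """
    if not first_bytes:
        # Peer opened the connection and closed it without sending data.
        return "tcp-style"
    # TLS record header: byte 0 is the content type (0x16 = handshake),
    # byte 1 is the major version (0x03), bytes 3-4 are the record length,
    # and byte 5 is the handshake message type (0x01 = ClientHello).
    if (len(first_bytes) >= 6
            and first_bytes[0] == 0x16
            and first_bytes[1] == 0x03
            and first_bytes[5] == 0x01):
        return "tls-client-hello"
    return "other"
```

With this, an empty payload maps to "tcp-style", which matches the behaviour described above for the us-east-2 health checks.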

alexander-semenets · 2018-12-12T06:36:23Z

We see the same issue in our infrastructure. We have clusters provisioned from the same codebase in us-east-1 and us-east-2. In us-east-2 the SSL health checks fail, while us-east-1 works fine. We are investigating with AWS Support now.

alexander-semenets · 2018-12-13T07:09:55Z

I've just received confirmation from the AWS Support team that there was an issue with a CLB update in the us-east-2 region. The problem has been fixed, and CLB works fine again in my environment.

fejta-bot · 2019-03-13T07:24:43Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot · 2019-04-12T07:56:54Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot · 2019-05-12T08:41:36Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

k8s-ci-robot · 2019-05-12T08:41:43Z

@fejta-bot: Closing this issue.


justinsb mentioned this issue Dec 7, 2018

Unable to connect to the server: EOF - Kops rolling update of 1.10.11 #6172

Closed

Pharb mentioned this issue Dec 26, 2018

Kubectl version error #6268

Closed

k8s-ci-robot added the lifecycle/stale label Mar 13, 2019

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Apr 12, 2019

k8s-ci-robot closed this as completed May 12, 2019
