az aks and kubectl issue: Unable to connect to the server: net/http: TLS handshake timeout #164
Comments
Running into the same issue on Mac OS. The
I've tried upgrading the cluster, but that also throws an error and puts the cluster into a failed state:
|
~ $ kubectl get nodes |
Still the same:
|
Is your cluster up and functional or it is in a failed state? |
It is in a failed state. But I can't tell if that's from the failed upgrade, or if it was failing earlier. The previously set up services/pods seem to work still. |
@jchauncey Central US |
can you try upgrading it again? https://kubernetes.io/docs/tasks/administer-cluster/kubeadm-upgrade-1-8/#recovering-from-a-bad-state If kubeadm upgrade somehow fails and fails to roll back, due to an unexpected shutdown during execution for instance, you may run kubeadm upgrade again as it is idempotent and should eventually make sure the actual state is the desired state you are declaring. You can use kubeadm upgrade to change a running cluster with x.x.x --> x.x.x with --force, which can be used to recover from a bad state. |
How does |
Sorry, Can you try upgrading it again with az aks upgrade? az aks upgrade --name myAKSCluster --resource-group myResourceGroup --kubernetes-version 1.8.7 |
I've tried the upgrade several times with the same result:
|
Is your subscription good? do you have enough credits? https://stackoverflow.com/questions/48443320/upgrade-failed-azure-aks-from-1-8-1-to-1-8-6 |
My subscription is good with plenty of credits. I was able to start another AKS cluster. Output from your linked stackoverflow:
|
can you try this on powershell and see if there are any logs. |
Unfortunately not:
|
I found this on one of the Kubernetes nodes, /var/log/syslog:
|
what version of az are you using? Can you update it to latest and try? |
Version 2.0.26 installed via homebrew |
throwing in some thoughts: |
|
Same issue... Created a cluster yesterday in West Europe. Can't connect. |
Now I'm not able to create a new cluster at all in CentralUS. The deployment fails without creating the second resource group (or any AKS nodes). |
Having the same issue in Central US :( |
After a week of contacting Azure support, they could not tell what was wrong and recommended deleting and recreating the AKS cluster 👎 |
@amanohar upgraded and new deployments seem to not fail any more. However, upgraded and newly deployed clusters are not functioning. Old upgraded cluster, still shows TLS handshake timeout: Newly created cluster info, shows new error |
I had the same issue, among others. Again, this is a preview and MS is fixing bugs. Kubernetes looks to have been designed, in the beginning, to have only one master node. To have an HA master you have to do some extra work, and this #164 may well be a bug in that process. It almost smells like there is an active-standby master and the DNS (or whatever MS may use) does not follow the cluster state, so kubectl tries to connect to the standby node and times out.
A timeout tells me that a TCP port is listening but API processing is asleep; otherwise we would get an immediate TCP reset.
I deployed Kubernetes 1.9 IaaS-style with 2n-1 masters. It works well with Azure, AWS and on-prem load balancers. The LB is the only component that does not come with the Kubernetes package, and ha-proxy or nginx (or any cloud-provided LB) will work to give you an HA master.
To me, today, AKS is not production ready yet, so we decided to give MS some time to stabilize their AKS offering and try again after GA.
Jouni
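The timeout-versus-reset reasoning above can be checked from any client machine. The following is a minimal sketch, not an official diagnostic: `probe_api_server` and its result labels are hypothetical names made up for illustration, and the host and port are whatever your kubeconfig's server entry points at. It separates "nothing is listening" from "the port accepts connections but nobody answers the TLS handshake".

```python
import socket
import ssl


def probe_api_server(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Report how far a connection to host:port gets.

    Returns "tcp_refused", "tcp_timeout", "tcp_error", "tls_timeout",
    "tls_error" or "tls_ok". The labels are illustrative only.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "tcp_refused"   # immediate reset: no listener on the port
    except socket.timeout:
        return "tcp_timeout"   # packets dropped, e.g. by a firewall
    except OSError:
        return "tcp_error"     # DNS failure, unreachable network, ...

    # Verification is switched off on purpose: the probe only asks whether
    # the handshake completes, not whether the certificate is trusted.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with context.wrap_socket(sock, server_hostname=host):
            return "tls_ok"
    except socket.timeout:
        return "tls_timeout"   # TCP is up, but nothing answers the handshake
    except (ssl.SSLError, OSError):
        return "tls_error"
    finally:
        sock.close()
```

A `tls_timeout` result matches the situation described in this thread: the endpoint accepts the TCP connection, but the handshake never completes within the time limit.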
|
@matkam thanks for providing the details. A follow up question to failure on:
Were you able to connect to the cluster after create and it stopped working after scale? Or was it not working after create as well? |
@amanohar It was not working after a create |
@matkam - I have resolved the tls handshake timeout error on your old cluster with resource group microserviceResourceGroup. I am still looking into the new cluster. |
My issue went away after about one day in West Europe. Is this still related to capacity? |
Thanks for looking into it @shrutir25 but I'm still seeing the same error
As for the new cluster, it looks like the provisioning scripts are either exiting early, or not provisioning everything they need to. I'm seeing missing routes on the routing table. |
Same issue in East US for me, too. Worked a couple days ago, doesn't work now. Hosted services appear to still be up but kubectl commands (get nodes, proxy, etc.) are failing as described above. In case it's useful: |
My East US AKS instance seems to be healthy again. I didn't do anything so I'm guessing it was just some transient issue with the master |
@matkam Did you ever figure out this? I have a cluster that has been working for a month, and I can run |
@derekperkins I have not figured it out. Azure support could only recommend starting a new AKS cluster (which I am now able to do). |
@matkam thanks for the quick response. I tried doing that, but I'm currently running on 1.9.1, but I can only create a 1.8.x cluster, and my app requires 1.9. Hopefully they make that available soon. :( |
Is there a solution or workaround? I started getting this yesterday, and I am getting it even in newly created clusters. |
Submit a support ticket so our on-call engineers can take a look. This error can be caused by different problems, so we need to investigate what's happening before providing a mitigation. |
The same thing today. |
I am seeing the same thing in East US. Our cluster has been running fine for some time and is suddenly showing "Unable to connect to the server: net/http: TLS handshake timeout" today. |
|
If you are seeing issues please submit a support ticket so our oncall engineers can take a look |
We encounter exactly the same issue (Unable to connect to the server: net/http: TLS handshake timeout) in West Europe. Our subscription does not allow us to open support tickets. Do we have any other option? We already connected via SSH to a cluster node. The managed control/master node seems to be down/broken. |
Re-deployed AKS which was in eastus and deployed in centralus and all works again. |
Azure Support fixed the "Unable to connect to the server: net/http: TLS handshake timeout" error for our cluster :-) |
We have the same problem with an AKS cluster in West Europe today. It worked in the morning, but it has now been returning the "unable to connect" message for several hours. |
I am facing the same issue with my aks cluster in East US. It is intermittent. |
@chetanku @snekcz @hholle @dazdaz @Jagdeep1 @dharmeshkakadia @Nirmalyasen @kkorsakov @matkam @mjrousos @danielcoman I am starting to collect info on this issue in a question over on StackOverflow: If you guys could answer a couple of these (here or above on SOF)
Feel free to respond here or (as mentioned) hit StackOverflow and I will collect the responses. |
Just for info here: for me, it is a combination of problems with the corporate firewall, the Azure service, and the Go HTTP library's support for self-signed certificates (which kubectl depends on). "-v=9" will print more debugging info, e.g.: kubectl -v=9 get nodes |
Closing due to inactivity. Feel free to re-open if still an issue. |
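For anyone who wants to capture that verbose output for a support ticket, here is a small sketch. It assumes `kubectl` is on the PATH; the helper names `kubectl_debug_command` and `run_kubectl_debug` are hypothetical, and the only kubectl feature relied on is the `-v=9` flag mentioned above.

```python
import subprocess


def kubectl_debug_command(args, verbosity=9):
    """Build a kubectl command line with the -v=<level> logging flag."""
    return ["kubectl", f"-v={verbosity}", *args]


def run_kubectl_debug(args, timeout=60.0):
    """Run kubectl verbosely and return (exit_code, stderr_text).

    The verbose log lines are written to stderr, so that is what is kept.
    exit_code is None when kubectl could not be run to completion.
    """
    cmd = kubectl_debug_command(args)
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        return None, "kubectl not found on PATH"
    except subprocess.TimeoutExpired:
        return None, f"kubectl did not finish within {timeout} seconds"
    return done.returncode, done.stderr
```

Calling `run_kubectl_debug(["get", "nodes"])` runs the same command as in the comment above and returns the log text, which may contain tokens or other secrets and should be reviewed before sharing.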
I was able to get this to work after running az aks get-credentials --resource-group rsgnamehere --name clusternamehere - this was my error though... Unable to connect to the server: dial tcp [::1]:8080: connectex: No connection could be made because the target machine actively refused it Really just putting this here in case someone else runs into it. |
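The `[::1]:8080` address in that error is what kubectl of this era appears to fall back to when it finds no kubeconfig, which would explain why fetching credentials fixed it. The sketch below checks for that condition before blaming the cluster. It assumes the usual lookup order (the `KUBECONFIG` environment variable, then `~/.kube/config`); the function names are hypothetical.

```python
import os
from pathlib import Path


def kubeconfig_candidates(env=None):
    """List the kubeconfig files kubectl is assumed to look for."""
    env = os.environ if env is None else env
    # KUBECONFIG may hold several paths joined by the OS path separator.
    raw = env.get("KUBECONFIG", "")
    paths = [Path(p) for p in raw.split(os.pathsep) if p]
    if not paths:
        paths = [Path.home() / ".kube" / "config"]
    return paths


def missing_kubeconfig(env=None):
    """Return True when none of the candidate kubeconfig files exists."""
    return not any(p.is_file() for p in kubeconfig_candidates(env))
```

If `missing_kubeconfig()` returns `True`, running the `az aks get-credentials` command from the comment above is the first thing to try. A TLS handshake timeout, by contrast, means kubectl did find a server address, so this check will not explain it.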
I am trying to connect to my existing Kubernetes cluster from Windows 2016 Server.
1st issue:
On windows 2016 server, I run az aks get-credentials with resource group and cluster name, I see an issue with the case.
If I specify my cluster name as --name=testscluster and run it,
It somehow changes the case to testScluster.
2nd Issue:
Then after I run kubectl get nodes I get Unable to connect to the server: net/http: TLS handshake timeout . I manually changed the case and tried but still the same issue.
When I try to do the same from Ubuntu server it works completely fine.
Question: Does aks support windows ? or is it for only Linux?
I am on latest az version of 2.0.26.
Any help will be great!!