
Cannot connect to newly deployed Kubernetes cluster #4

Closed
jimmystridh opened this issue Jan 22, 2017 · 11 comments

Comments

@jimmystridh

Problem: after deploying a fresh cluster, it's not possible to connect by following the instructions in the guide at https://docs.microsoft.com/en-us/azure/container-service/container-service-kubernetes-walkthrough

To reproduce:

  • Deploy a new ACS through the Portal
  • az acs kubernetes get-credentials --resource-group=my-rg --name=my-container-service
  • kubectl get pods

Expected result:

  • List pods

Actual result:

  • Unable to connect to the server: dial tcp 40.xx.xx.xx:443: i/o timeout

I have tried deploying a new cluster several times with the same result.
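
For reference, a minimal way to narrow down the failure (a sketch; the resource group/name match the repro above, and nc is just an illustrative TCP reachability check):

az acs kubernetes get-credentials --resource-group=my-rg --name=my-container-service
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'   # which endpoint kubectl targets
nc -vz -w 5 40.xx.xx.xx 443                                                # raw reachability to that endpoint
kubectl get pods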

@jimmystridh
Author

SSHing to the master and running kubectl get pods gives the same result:

user@k8s-master-ABCD1234-0:~$ kubectl get pods
Unable to connect to the server: dial tcp 40.xx.xx.xx:443: i/o timeout

@jimmystridh jimmystridh changed the title Cannot connect to newly deployed cluster Cannot connect to newly deployed Kubernetes cluster Jan 23, 2017
@dbalaouras

Sounds related to #3. If so, the workaround suggested in that post worked for me.

@colemickens
Contributor

It's very likely that your service principal credentials are incorrect. Every single time we've had this reported it's because the SP creds were wrong, or the SP didn't have the correct permissions.

Try the troubleshooting steps here: https://github.com/Azure/acs-engine/blob/master/docs/kubernetes.md#misconfigured-service-principal
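
If it helps, a quick way to sanity-check the SP outside the cluster (a sketch; <appId>, <password> and <tenantId> stand for the values the cluster was deployed with):

az login --service-principal -u <appId> -p <password> --tenant <tenantId>   # do the creds even log in?
az ad sp show --id <appId>                                                  # does the SP exist?
az role assignment list --assignee <appId>                                  # does it have a role on the cluster's resources?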

@anhowe
Contributor

anhowe commented Jan 24, 2017

Closing; please confirm this is not SP related, and re-open if the issue is reproducible.

@anhowe anhowe closed this as completed Jan 24, 2017
@jimmystridh
Author

This was indeed because of incorrect service principal details. I assumed the dialog was asking for a role to be created, which is why I entered details for a non-existent account :/ Creating an SP separately and using its credentials worked fine. Thanks for your help!
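
For anyone landing here later, creating the SP up front looks roughly like this (a sketch; <subscription-id> is a placeholder, and you may want to scope the role to a resource group instead):

az ad sp create-for-rbac --role="Contributor" --scopes="/subscriptions/<subscription-id>"

The appId and password in the output are what the deployment dialog expects as the service principal client ID and secret.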

@vankhoa011

Hello,
I got the same problem when I created an Azure Container Service cluster via the CLI.

@colemickens
Contributor

Please see the guidance in my earlier reply: #4 (comment)

squillace pushed a commit to squillace/ACS that referenced this issue Jun 28, 2017
@ComeMaes

Hi,
I'm running into a similar problem on a cluster that was deployed months ago (and has been operating fine until now).
kubectl can't connect to the server, either from my local machine or from the k8s master node. I looked into the service principal situation and it seems to be in order, both from the command provided in the "Misconfigured Service Principal" steps referenced in comment #4 and from inspection of the portal (the SP is listed under my ACS cluster's IAM), but I'm not sure how to further assess the validity of this configuration.

az acs get-credentials completes OK but does not solve the problem either.
Is my problem related to this issue?

Any pointers?

@JackQuincy
Contributor

@ComeMaes Is this a single-master or a multi-master cluster? If your service principal is still good, I'd SSH onto the/a master and check on the state of the API server. To do this, run

sudo docker ps

You should see a list of Docker containers, hopefully including one like this:

2de14ed7ad2a gcrio.azureedge.net/google_containers/hyperkube-amd64@sha256:8ff9b34b24330af95b423a883dbf61260277992317236308962a935f265d3ef3 "/hyperkube apiserver" 5 weeks ago Up 5 weeks k8s_kube-apiserver_kube-apiserver-k8s-master-26621649-0_kube-system_0f866c27e477fcf3ed13c4a22154537f_0

If not, that should be your issue. To fix it I'd do a systemctl restart kubelet.
If you have multiple masters I'd do this on each of them. Another possible fix would be to restart the master VMs, which would be simpler.
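
Putting those steps together, roughly (a sketch, run on the/each master over SSH):

sudo docker ps | grep apiserver    # is the kube-apiserver container running?
sudo systemctl restart kubelet     # if not, restart the kubelet so it brings the static pods back
sudo docker ps | grep apiserver    # confirm the apiserver container reappears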

@ComeMaes

ComeMaes commented Oct 17, 2017

Thanks @JackQuincy
This is a single-master cluster. It looks like the API server is indeed not up because its container is failing on startup. The api-server container logs show a problem with etcd:

controller.go:128] Unable to perform initial IP allocation check: unable to refresh the service IP block: client: etcd cluster is unavailable or misconfigured

etcd is actually mentioned several times in error messages (ListAndWatch) before that line in the same logs (I can post the full trace if helpful).
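
For completeness, this is roughly how I pulled those logs on the master (a sketch):

sudo docker ps -a | grep apiserver   # find the (possibly exited) kube-apiserver container
sudo docker logs <container-id>      # shows the etcd "unavailable or misconfigured" errors above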

@ComeMaes

Seems like a simple restart of etcd did the trick. Thanks for the help!
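
In case it helps others, the check/restart was roughly (a sketch, assuming etcd runs as a systemd service on the master):

sudo systemctl status etcd                       # is etcd up?
sudo journalctl -u etcd --no-pager | tail -n 50  # recent etcd errors
sudo systemctl restart etcd
etcdctl cluster-health                           # should now report the cluster as healthy
sudo systemctl restart kubelet                   # let the api-server container come back up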
