Error forwarding ports: error upgrading connection #1455

Closed
Simon-Ince opened this Issue Oct 25, 2016 · 16 comments

Simon-Ince commented Oct 25, 2016

I have Helm installed locally and Tiller on my cluster. Everything looks healthy, but running helm install stable/mysql gives me:

Error: Error forwarding ports: error upgrading connection: dial tcp: lookup kube-4gb-lon1-02 on 8.8.8.8:53: no such host

technosophos (Collaborator) commented Oct 25, 2016

Can you tell us more about your Kubernetes cluster (version, installation method) and which version of Helm you're running? Thanks.

Simon-Ince commented Oct 26, 2016

@technosophos I created 5 Ubuntu servers on DigitalOcean and used Kubeadm:

kubeadm version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.0-alpha.0.1534+cf7301f16c0363-dirty", GitCommit:"cf7301f16c036363c4fdcb5d4d0c867720214598", GitTreeState:"dirty", BuildDate:"2016-09-27T18:10:39Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

to install on Ubuntu:

Distributor ID: Ubuntu
Description:    Ubuntu 16.04.1 LTS
Release:    16.04
Codename:   xenial

Then on my local machine using OSX:

System Version: OS X 10.10.5 (14F1909)
Kernel Version: Darwin 14.5.0

I installed Kubectl and connected to my cluster:

Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.4+3b417cc", GitCommit:"3b417cc4ccd1b8f38ff9ec96bb50a81ca0ea9d56", GitTreeState:"not a git tree", BuildDate:"2016-10-21T22:33:18Z", GoVersion:"go1.7.3", Compiler:"gc", Platform:"darwin/amd64"}

Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.0", GitCommit:"a16c0a7f71a6f93c7e0f222d961f4675cd97a46b", GitTreeState:"clean", BuildDate:"2016-09-26T18:10:32Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

Then I installed Helm using brew cask install helm. But when I try helm install stable/mysql I get:

Error: Error forwarding ports: error upgrading connection: dial tcp: lookup kube-4gb-lon1-02 on 8.8.8.8:53: no such host

I think I've gone a fairly safe route: I used the latest stable release of Ubuntu and installed the recommended release of kubeadm, which installed the latest stable release of kubectl. I then connected to the cluster from my local machine and used brew to install Helm, so my setup should be pretty typical. I'm able to use kubectl on my local machine fine, and I can use manifests to get things set up; the only thing I can't get working is Helm.

technosophos (Collaborator) commented Oct 27, 2016

Interesting that it is failing a DNS lookup. Is SkyDNS running in your Kube cluster?

kubectl get po -n kube-system
NAME                                READY     STATUS    RESTARTS   AGE
k8s-etcd-127.0.0.1                  1/1       Running   0          18h
k8s-master-127.0.0.1                4/4       Running   1          18h
k8s-proxy-127.0.0.1                 1/1       Running   0          18h
kube-addon-manager-127.0.0.1        2/2       Running   0          18h
kube-dns-v20-jqqts                  3/3       Running   0          18h
kubernetes-dashboard-v1.4.0-5ldke   1/1       Running   0          18h

adamreese (Member) commented Oct 27, 2016

Can you start a port-forward using kubectl?

kubectl -n kube-system port-forward tiller-deploy-xxxxxx 44134

@technosophos Looks like DNS is there:

kube-system   etcd-kube-4gb-lon1-01                      1/1       Running   0          3d
kube-system   kube-apiserver-kube-4gb-lon1-01            1/1       Running   0          3d
kube-system   kube-controller-manager-kube-4gb-lon1-01   1/1       Running   0          3d
kube-system   kube-discovery-982812725-28b96             1/1       Running   0          3d
kube-system   kube-dns-2247936740-7lcli                  3/3       Running   0          3d
kube-system   kube-proxy-amd64-2j5gn                     1/1       Running   0          3d
kube-system   kube-proxy-amd64-9a0ob                     1/1       Running   0          3d
kube-system   kube-proxy-amd64-m6ypu                     1/1       Running   0          3d
kube-system   kube-proxy-amd64-n702h                     1/1       Running   0          3d
kube-system   kube-proxy-amd64-yjk4a                     1/1       Running   0          3d
kube-system   kube-scheduler-kube-4gb-lon1-01            1/1       Running   0          3d
kube-system   kubernetes-dashboard-1655269645-nyptv      1/1       Running   0          3d
kube-system   tiller-deploy-2434200834-8t7yt             1/1       Running   0          2d
kube-system   weave-net-09tbe                            2/2       Running   0          3d
kube-system   weave-net-f7qsx                            2/2       Running   0          3d
kube-system   weave-net-ihvuq                            2/2       Running   0          3d
kube-system   weave-net-phlva                            2/2       Running   0          3d
kube-system   weave-net-yx4wv                            2/2       Running   0          3d

@adamreese Looks like that gives back a similar error.
kubectl -n kube-system port-forward tiller-deploy-2434200834-8t7yt 44134
results in:
error: error upgrading connection: dial tcp: lookup kube-4gb-lon1-02 on 8.8.8.8:53: no such host
I've tried on both my local machine and while ssh'ed into the master node.
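
A minimal sketch for narrowing this down: pull the hostname out of the error message and check whether it resolves on the machine doing the lookup (presumably the master, where the API server runs). The function names are hypothetical, and `getent` is assumed to be available, which is true on most Linux distributions but not on macOS.

```shell
#!/bin/sh
# Extract the hostname that failed to resolve from an error line such as:
#   error upgrading connection: dial tcp: lookup kube-4gb-lon1-02 on 8.8.8.8:53: no such host
# Prints nothing if the line does not have that shape.
failed_lookup_host() {
  printf '%s\n' "$1" | sed -n 's/.*lookup \([^ ]*\) on [^ ]*: no such host.*/\1/p'
}

# Succeed if the given name resolves through the local resolver
# (which consults /etc/hosts as well as DNS on a typical Linux setup).
# Assumes getent exists on this machine.
resolves() {
  getent hosts "$1" >/dev/null 2>&1
}
```

For example, `resolves "$(failed_lookup_host "$err")" || echo "node name does not resolve here"`, where `$err` holds the error text.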

Just tried setting up a cluster again from scratch and still having the same issue.

When using kubeadm to set up the cluster, step "(3/4) Installing a pod network" (http://kubernetes.io/docs/getting-started-guides/kubeadm/) requires you to pick an addon. I chose Weave Net. Is Helm not compatible with this?

The options are:

  • Weave Net
  • Calico
  • Canal
  • Romana

Should I choose another one?

technosophos (Collaborator) commented Oct 28, 2016

I'm running Weave on a single-node Kubeadm-based install. I can open a tunnel with kubectl port-forward from the master node (which is the only node). I wonder if this has to do with networking across multiple nodes?

Asking about this in the #kubernetes-users Slack channel or on StackOverflow might yield some answers, since this is actually a Kubernetes configuration issue rather than something specific to Helm. I'll do some more hunting and update here if I find anything.

mgoodness (Member) commented Oct 28, 2016

@simon-lush Based on the names of your apiserver, scheduler, and controller-manager pods, it seems like kubectl/helm should be trying to look up kube-4gb-lon1-01. But your error messages show lookups for kube-4gb-lon1-02. Dueling entries in your kubeconfig?
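
One way to test this hypothesis is to compare the API server address that kubectl is configured to use with the node on which the Tiller pod is scheduled. The sketch below uses hypothetical helper names and assumes kubectl is on the PATH (the KUBECTL variable is an assumption added here so the binary can be overridden).

```shell
#!/bin/sh
# Path or name of the kubectl binary; override if it lives elsewhere.
KUBECTL=${KUBECTL:-kubectl}

# Print the API server URL of the current kubeconfig context.
current_api_server() {
  "$KUBECTL" config view --minify -o jsonpath='{.clusters[0].cluster.server}'
}

# Print the name of the node a pod is scheduled on.
# Usage: pod_node NAMESPACE POD_NAME
pod_node() {
  "$KUBECTL" -n "$1" get pod "$2" -o jsonpath='{.spec.nodeName}'
}
```

If `pod_node kube-system tiller-deploy-2434200834-8t7yt` prints kube-4gb-lon1-02, the failing lookup is probably for the node hosting Tiller rather than for a second kubeconfig entry.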

I found help on the Slack group and now have Helm working.

@awh explained that

What you're experiencing is a known issue with k8s where for some operations it expects to be able to resolve your node names in the global DNS

They suggested the following workaround:

  1. Add entries to /etc/hosts on the master mapping your hostnames to their public IPs
  2. Install dnsmasq on the master (e.g. apt install -y dnsmasq)
  3. Kill the k8s api server container on master (kubelet will recreate it)
  4. Then systemctl restart docker (or reboot the master) for it to pick up the /etc/resolv.conf changes
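
Step 1 can be sketched as a small helper that prints the /etc/hosts lines still missing for a set of nodes. The function name is hypothetical, and the IP addresses in the usage note are placeholders from a documentation range, not the real addresses of this cluster.

```shell
#!/bin/sh
# Print /etc/hosts-style lines ("IP name") for each NAME=IP pair whose name
# is not already listed in the given hosts file.
# Usage: missing_hosts_entries HOSTS_FILE NAME=IP [NAME=IP ...]
missing_hosts_entries() {
  hosts_file=$1
  shift
  for pair in "$@"; do
    name=${pair%%=*}   # text before the first "="
    ip=${pair#*=}      # text after the first "="
    # Look for the name as an exact hostname field on a non-comment line.
    if ! awk -v n="$name" '
          !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
          END { exit !found }' "$hosts_file" 2>/dev/null; then
      printf '%s %s\n' "$ip" "$name"
    fi
  done
}
```

For example, `missing_hosts_entries /etc/hosts kube-4gb-lon1-02=203.0.113.12` prints the line to add if that node is not yet mapped. Review the output before appending it to /etc/hosts on the master.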

technosophos (Collaborator) commented Oct 31, 2016

I'm going to leave this open until I get the above put into the install FAQ. I suspect this is an issue that may crop up again.

Thanks for the help @mgoodness and the follow-up @simon-lush .

technosophos added a commit to technosophos/k8s-helm that referenced this issue Nov 1, 2016

docs(install_faq): document dnsmasq fix for kubeadm
The solution to #1455 is to configure dnsmasq on each of your nodes.
This adds brief documentation on how to do so.

Closes #1455

technosophos closed this in #1492 Nov 2, 2016

Thanks @technosophos for your help in IRC today. The helm and deis communities are really helpful and responsive.
Just wanted to note, in case this helps others: I ran into this K8s global DNS bug/issue today and did a quick hack of just adding my kube cluster's IPs to the /etc/hosts file on my master/etcd node. I was then able to proceed with installing and configuring deis workflow, and helm was happy, etc. (at least for now) ;-)

# without the /etc/hosts hack entry for my cluster on the etcd node (it's defaulting to the second nameserver in my /etc/resolv.conf; the first is our internal forwarder and the second is my ISP's)
helm list
Error: Get https://deis-kube1.dev.foo.com/api/v1/namespaces/kube-system/pods?labelSelector=app%3Dhelm%2Cname%3Dtiller: dial tcp: lookup deis-kube1.dev.foo.com on 75.75.75.75:53: no such host

# and clearly records exist in our internal DNS
host deis-kube1.dev.foo.com
deis-kube1.dev.foo.com is an alias for deis-kube-elbapise-161cju3twc1uw-123456789.us-west-1.elb.amazonaws.com.
deis-kube-elbapise-161cju3twc1uw-123456789.us-west-1.elb.amazonaws.com has address 54.x.xxx.xxx
deis-kube-elbapise-161cju3twc1uw-123456789.us-west-1.elb.amazonaws.com has address 52.x.xxx.xxx

# after adding the cluster IPs noted above to /etc/hosts on the kube cluster's etcd nodes
helm repo add deis https://charts.deis.com/workflow
"deis" has been added to your repositories

helm install deis/workflow --namespace deis
Fetched deis/workflow to workflow-v2.8.0.tgz
NAME: newbie-stingeray
LAST DEPLOYED: Thu Dec  1 12:44:01 2016
NAMESPACE: deis
STATUS: DEPLOYED

 helm list
NAME            	REVISION	UPDATED                 	STATUS  	CHART
newbie-stingeray	1       	Thu Dec  1 12:44:01 2016	DEPLOYED	workflow-v2.8.0

We may run into issues later without having completed many of the suggested work-around steps above, but for now we're moving forward. Hope this helps others.

battlemidget referenced this issue in conjure-up/conjure-up Dec 19, 2016: Consider adding `conjure-up deis` #520 (Closed)

evfurman commented Apr 19, 2017

Seeing this error after installing helm/tiller on the Tectonic stack. I can search helm and use kubectl just fine, but it fails when installing a new helm chart from the default repo. Any idea when a fix will be merged?

[root@host ~]$ helm install stable/selenium
Error: forwarding ports: error upgrading connection: error dialing backend: dial tcp: lookup ip-10-148-222-140 on 10.148.213.33:53: no such host

Just wanted to note (for me) that as of the previous release of helm, v2.3.0, and with Kubernetes clusters deployed with kube-aws v0.9.6-rc.2, I have not encountered this problem again. After getting this email, I checked and noted that an updated helm version was released 15 hours ago: helm latest release (I'm upgrading now...). Not sure about Tectonic; I tried it a few months ago, but it was not in a working state for us without any H/A capability, etc. Curious what version of helm you're running?

@evfurman, just a thought (not sure about Tectonic, but for others with Kubernetes on kube-aws, etc.): if your kube cluster is behind a load balancer (an AWS ELB, etc.), your redeploy process may not manage any existing DNS records. So if you had a previous cluster with an ELB and a CNAME for my-cluster.foo, and you redeploy a new cluster with the same name, the DNS record will still point to the old ELB and you will see that error. In this case, take a look at the new cluster's ELB and its FQDN, and update any existing DNS CNAMEs to resolve to it.
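
To check for a stale record, one could compare the CNAME target that `host` reports with the hostname of the new cluster's ELB. This is a sketch with hypothetical function names; the parsing assumes `host` output in the format shown earlier in this thread.

```shell
#!/bin/sh
# Read `host` output on stdin and print the CNAME target, if any,
# without its trailing dot. Lines without an alias are ignored.
cname_target() {
  sed -n 's/^.* is an alias for \(.*\)\.$/\1/p'
}

# Succeed if NAME's CNAME target equals the expected ELB hostname.
# The comparison is an exact, case-sensitive string match.
# Usage: cname_matches NAME EXPECTED_ELB_HOSTNAME
cname_matches() {
  [ "$(host "$1" 2>/dev/null | cname_target)" = "$2" ]
}
```

If `cname_matches my-cluster.foo <new ELB hostname>` fails, the record likely still points at the old load balancer.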

@technosophos Any idea how to fix this when using helm inside a pod in the k8s cluster?

I'm using gitlab-runners to build projects inside pods. kubectl works (it's pointing to https://kubernetes.default), but helm doesn't.

$ docker run --rm ${CONTAINER_TEST_IMAGE} helm ls
Error: Get http://localhost:8080/api/v1/namespaces/kube-system/pods?labelSelector=app%3Dhelm%2Cname%3Dtiller: dial tcp [::1]:8080: getsockopt: connection refused
$ docker run --rm ${CONTAINER_TEST_IMAGE} kubectl cluster-info
Kubernetes master is running at http://localhost:8080