helm install/upgrade/reset hangs on kube 1.8 #3133

Closed
gsemet opened this issue Nov 13, 2017 · 16 comments

Comments

@gsemet

gsemet commented Nov 13, 2017

Hello.
We just upgraded to Kubernetes 1.8, and now I cannot run any helm command: it hangs every time, and the Tiller log does not show anything.
Here is my information:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.3", GitCommit:"f0efb3cb883751c5ffdbe6d515f3cb4fbe7b7acd", GitTreeState:"clean", BuildDate:"2017-11-08T18:39:33Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.2+coreos.0", GitCommit:"4c0769e81ab01f47eec6f34d7f1bb80873ae5c2b", GitTreeState:"clean", BuildDate:"2017-10-25T16:24:46Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
$ helm version 
Client: &version.Version{SemVer:"v2.7.0", GitCommit:"08c1144f5eb3e3b636d9775617287cc26e53dba4", GitTreeState:"clean"}
(it hangs here)

Tiller upgrade works:

$ helm init --upgrade
$HELM_HOME has been configured at /home/renault/.helm.

Tiller (the Helm server-side component) has been upgraded to the current version.
Happy Helming!

Even helm reset hangs:

$ helm reset --debug --force
[debug] Created tunnel using local port: '46474'

[debug] SERVER: "localhost:46474"
(hang)

The tiller-deploy pod does not show anything in its log after startup:

[main] 2017/11/13 09:37:29 Starting Tiller v2.7.0 (tls=false)
[main] 2017/11/13 09:37:29 GRPC listening on :44134
[main] 2017/11/13 09:37:29 Probes listening on :44135
[main] 2017/11/13 09:37:29 Storage driver is ConfigMap
[main] 2017/11/13 09:37:29 Max history per release is 0

I tried restarting my PC and manually destroying all Tiller resources on the cluster. Reinstalling Tiller (helm init) works, but after that, nothing.
For example, if I try to install any application:

$ helm install /path/to/chart \
                 --version=v0.1.0 \
                 --name=appname \
                 --namespace=sandboxname \
                 --debug \
                 -f config.yaml
[debug] Created tunnel using local port: '36848'

[debug] SERVER: "localhost:36848"

[debug] Original chart version: "v0.1.0"
[debug] CHART PATH:  /path/to/chart
(hang here)

Do you have any idea how to debug this?
Thanks

@gsemet
Author

gsemet commented Nov 14, 2017

Anyone?

@bacongobbler
Member

Hi @stibbons, sorry we didn't get back to you sooner. I'm just getting back from vacation. Let me take a look at this after I get some ☕️ and 🍞.

@bacongobbler
Member

Can you try using a combination of kubectl -n kube-system port-forward my-tiller-pod 44134:44134 and HELM_HOST=127.0.0.1:44134 helm list? I'm curious to see why the connection is hanging. It sounds like it could be an internal networking issue within the Kubernetes cluster, but I'm not entirely sure what the root cause would be.
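
A minimal sketch of that combination as a reusable shell helper. It assumes Tiller runs in kube-system with the labels that helm init applies by default (app=helm, name=tiller) and listens on port 44134, as in the Tiller log earlier in the thread. The function names tiller_pod and helm_via_port_forward are made up for illustration.

```shell
# Sketch only: reach Tiller through a manual port-forward instead of
# helm's built-in tunnel. Labels and namespace below are assumptions.

tiller_pod() {
    # Print the name of the first Tiller pod in the given namespace.
    kubectl -n "${1:-kube-system}" get pods -l app=helm,name=tiller \
        -o jsonpath='{.items[0].metadata.name}'
}

helm_via_port_forward() {
    # Usage: helm_via_port_forward <helm arguments...>
    pod=$(tiller_pod kube-system) || return 1

    # Forward local port 44134 to Tiller's gRPC port in the background.
    kubectl -n kube-system port-forward "$pod" 44134:44134 &
    pf_pid=$!
    sleep 1   # crude wait for the forward to come up

    # Point the helm client at the forwarded port.
    HELM_HOST=127.0.0.1:44134 helm "$@"
    status=$?

    # Stop the background port-forward again.
    kill "$pf_pid" 2>/dev/null
    return "$status"
}
```

For example, helm_via_port_forward list runs helm list against the forwarded port. If it hangs in the same way, the problem is likely in the port-forward path rather than in helm's own tunnel setup.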

@gsemet
Author

gsemet commented Nov 14, 2017

helm list is also stuck. It looks like we have several problems.

With a direct connection to the cluster (so without forwarding), helm list shows the following error for the service account we use for Tiller:

Error: configmaps is forbidden: User "system:serviceaccount:<namespace>:default" cannot list configmaps in the namespace "<namespace>" 

Using the right service account does not change anything: helm commands are still stuck, as if Tiller were not listening on localhost.
Everything else works on our Kubernetes server.
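
One way to check the "cannot list configmaps" error outside of helm is to ask the API server whether a service account has that permission. The sketch below uses kubectl auth can-i with impersonation. The function name sa_can_list_configmaps is hypothetical, and the call assumes your own user is allowed to impersonate service accounts.

```shell
# Sketch only: ask the API server whether a service account may list
# configmaps in its namespace (Tiller stores releases in ConfigMaps).

sa_can_list_configmaps() {
    # $1: namespace, $2: service account name
    kubectl auth can-i list configmaps \
        --namespace "$1" \
        --as "system:serviceaccount:$1:$2"
}
```

Calling sa_can_list_configmaps kube-system tiller-sa should print yes or no for the tiller-sa account mentioned in this thread.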

@bacongobbler
Member

Sorry, I'm not sure I quite understand here. Can you explain how you set up tiller to talk to the right service account, and how you set up that service account? Were you following the documentation as described in #3094 or some other documentation?

@gsemet
Author

gsemet commented Nov 14, 2017

I need to dig deeper into our RBAC configuration. On kube 1.6, we set up RBAC and Tiller worked fine using an account named tiller-sa with the right permissions, though that may have been set up manually by someone. After updating to 1.8 and redeploying helm completely, helm complained with this configmaps error, but I did not see it at first because of the network/forwarding issue.
I now deploy helm with helm init --service-account tiller-sa, and helm list no longer complains from within the cluster.

Now, from my machine, where it used to work, I cannot send any command to helm. The port forward does nothing (even helm list with the port forwarding you recommended does nothing).

TL;DR: the service account was not what was blocking helm commands; something else is. Do you have any idea how to debug what is going on between the helm container and the proxy? kubectl proxy works, so I am able to connect to the kube API server. Only the connection to the helm container has failed since we migrated to kube 1.8 (note that I am not aware of all the changes our IT team made to the kube config).

@gsemet
Author

gsemet commented Nov 17, 2017

It looks like my issue comes from the fact that the Tiller service is of type ClusterIP, and we cannot reach these IP addresses from outside the cluster. Is it possible to use an ingress for Tiller, so that it is exposed outside our cluster like any other pod? How can I tell the helm commands (list, install, ...) to use this endpoint instead of something else?

@bacongobbler
Member

You can set the HELM_HOST environment variable on your client to point helm directly to a tiller instance. Is that what you're looking for?

Now, from my machine, where it used to work, I cannot send any command to helm. The port forward does nothing (even helm list with the port forwarding you recommended does nothing).

What was the error you're seeing in this situation?

@gsemet
Author

gsemet commented Nov 17, 2017

Can I run Tiller on my machine instead of inside the cluster? I suppose it would use the kube API, like kubectl does from my machine? helm init --client-only does not install anything locally.

I cannot use an endpoint created by an ingress in HELM_HOST, can I?

@bacongobbler
Member

Ingress controllers only proxy HTTP connections. Tiller communicates over gRPC, so exposing Tiller through an ingress route won't work.

It's not recommended to expose Tiller to the outside world anyway unless you enable TLS auth for Tiller, as anyone with direct access to Tiller will be able to manage, install and uninstall cluster resources. It wouldn't be a good day for your sysadmins when they realize their production clusters are running random bitcoin miner agents. ;)

It would be better if I had an error message to better advise on how to fix your RBAC setup.
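
For the TLS auth mentioned above, Helm 2's helm init accepts TLS flags when installing Tiller. The wrapper below is only a sketch: the function name init_tiller_with_tls is hypothetical, and the three PEM files are certificates you would have to generate yourself beforehand.

```shell
# Sketch only: install Tiller with TLS enabled and client certificate
# verification turned on. Certificate paths are supplied by the caller.

init_tiller_with_tls() {
    # $1: CA certificate, $2: Tiller certificate, $3: Tiller private key
    helm init \
        --tiller-tls \
        --tiller-tls-verify \
        --tiller-tls-cert "$2" \
        --tiller-tls-key "$3" \
        --tls-ca-cert "$1"
}
```

After an install like this, client commands need the --tls flag (for example helm ls --tls) together with a client certificate, so a plain unauthenticated connection to Tiller is rejected.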

@gsemet
Author

gsemet commented Nov 17, 2017

Thanks for your help. I am not sure it comes from RBAC specifically, but I definitely cannot ping any internal IP address (10.214.*) from my PC in our lab office. Every pod exposed through ClusterIP is simply not accessible, and I can't find a way to make it work with port-forward. I simply have no error message :(

@mklebrasseur

We also have this problem, but for us it just randomly stopped working.
Same issue: no errors, it just hangs.

@gsemet
Author

gsemet commented Nov 29, 2017

I am closing this issue. Helm works great, but since the Tiller service is ClusterIP, we cannot access it from our lab network. What is weird is that it worked previously on kube 1.6.
I now proxy the execution through a pod running inside the cluster, and it works great.
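
The thread does not show how this in-cluster proxy was set up. One way it could look, as a sketch: run the helm client inside a helper pod through kubectl exec and point it at the Tiller service by its in-cluster DNS name. The helper pod, its namespace, and the function name helm_in_cluster are hypothetical, and tiller-deploy.kube-system assumes Tiller was installed by helm init with its defaults.

```shell
# Sketch only: run helm inside a helper pod that already has the helm
# client installed, talking to Tiller over the cluster-internal service.

helm_in_cluster() {
    # $1: namespace of the helper pod, $2: helper pod name, rest: helm args
    ns=$1
    pod=$2
    shift 2
    kubectl -n "$ns" exec "$pod" -- \
        env HELM_HOST=tiller-deploy.kube-system:44134 helm "$@"
}
```

For example, helm_in_cluster tools helm-client list would run helm list from a pod named helm-client in a namespace named tools.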

@gsemet gsemet closed this as completed Nov 29, 2017
@baracoder

This issue started happening for me out of the blue, too.

With Kubernetes 1.6.4 and helm 2.5.1, I had already deployed a lot using helm on that cluster. If I run it with debug, I get the following until it hangs. There are no logs on the Tiller side.

DEBUG=50 helm version --debug
[debug] Created tunnel using local port: '35865'

[debug] SERVER: "localhost:35865"

Client: &version.Version{SemVer:"v2.5.1", GitCommit:"7cf31e8d9a026287041bae077b09165be247ae66", GitTreeState:"clean"}
2017/12/11 21:57:29 (0xc4207e02c0) (0xc4207c9f40) Create stream
2017/12/11 21:57:29 (0xc4207e02c0) (0xc4207c9f40) Stream added, broadcasting: 1
2017/12/11 21:57:29 (0xc4207e02c0) Reply frame received for 1
2017/12/11 21:57:29 (0xc4207c9f40) (1) Writing data frame
2017/12/11 21:57:29 (0xc4207e02c0) (0xc4208fe0a0) Create stream
2017/12/11 21:57:29 (0xc4207e02c0) (0xc4208fe0a0) Stream added, broadcasting: 3
2017/12/11 21:57:29 (0xc4207e02c0) Reply frame received for 3
2017/12/11 21:57:29 (0xc4208fe0a0) (3) Writing data frame

With two GKE clusters it still works. I am confused.

@baracoder

Finally found the issue; it is not related to helm.
For some reason our CoreOS machines no longer had a localhost entry in the /etc/hosts file. The kubelet uses socat TCP:localhost:port for port-forwarding, so with localhost not resolving, it just hangs there.
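
A quick way to check a node for this condition, as a sketch: scan the hosts file for a non-comment line that maps a name of exactly localhost. The function name hosts_has_localhost is made up for illustration.

```shell
# Sketch only: succeed if the hosts file maps the name "localhost".

hosts_has_localhost() {
    # $1: path to a hosts file (defaults to /etc/hosts)
    # Strip comments, then look for "localhost" among the host names
    # (every field after the address in the first column).
    awk '
        { sub(/#.*/, ""); for (i = 2; i <= NF; i++) if ($i == "localhost") found = 1 }
        END { exit (found ? 0 : 1) }
    ' "${1:-/etc/hosts}"
}
```

Run on each node as hosts_has_localhost || echo "localhost missing from /etc/hosts". On systems that ship getent, getent hosts localhost gives a similar answer through the regular resolver.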

@vhosakot

vhosakot commented Jan 9, 2018

Per #2224 (comment), the following commands resolved the error for me:

kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
