
'helm install' gets stuck in error loop #981

Closed
maratoid opened this issue Jul 22, 2016 · 3 comments

@maratoid

I have an Ansible playbook that:

  1. runs 'helm init'
  2. runs 'helm repo add'
  3. runs a few 'helm install' commands

From time to time, the helm install commands get stuck in an error loop of some sort:

E0722 21:46:21.216925       1 portforward.go:327] an error occurred forwarding 43394 -> 44134: error forwarding port 44134 to pod tiller-rc-sf2a4_default, uid : pod not found ("tiller-rc-sf2a4_default")
2016/07/22 21:46:21 transport: http2Client.notifyError got notified that the client transport was broken EOF.
E0722 21:46:21.223846       1 portforward.go:327] an error occurred forwarding 43394 -> 44134: error forwarding port 44134 to pod tiller-rc-sf2a4_default, uid : pod not found ("tiller-rc-sf2a4_default")
2016/07/22 21:46:21 transport: http2Client.notifyError got notified that the client transport was broken EOF.
E0722 21:46:21.233468       1 portforward.go:327] an error occurred forwarding 43394 -> 44134: error forwarding port 44134 to pod tiller-rc-sf2a4_default, uid : pod not found ("tiller-rc-sf2a4_default")
2016/07/22 21:46:21 transport: http2Client.notifyError got notified that the client transport was broken EOF.
E0722 21:46:51.234500       1 portforward.go:267] error creating error stream for port 43394 -> 44134: Timeout occured
2016/07/22 21:46:51 transport: http2Client.notifyError got notified that the client transport was broken read tcp 127.0.0.1:42940->127.0.0.1:43394: read: connection reset by peer.
E0722 21:47:21.238377       1 portforward.go:289] error creating forwarding stream for port 43394 -> 44134: Timeout occured
2016/07/22 21:47:21 transport: http2Client.notifyError got notified that the client transport was broken read tcp 127.0.0.1:42941->127.0.0.1:43394: read: connection reset by peer.
E0722 21:47:51.245922       1 portforward.go:289] error creating forwarding stream for port 43394 -> 44134: Timeout occured
2016/07/22 21:47:51 transport: http2Client.notifyError got notified that the client transport was broken read tcp 127.0.0.1:42942->127.0.0.1:43394: read: connection reset by peer.
E0722 21:48:21.247027       1 portforward.go:267] error creating error stream for port 43394 -> 44134: Timeout occured
2016/07/22 21:48:21 transport: http2Client.notifyError got notified that the client transport was broken read tcp 127.0.0.1:42943->127.0.0.1:43394: read: connection reset by peer.
E0722 21:48:51.255121       1 portforward.go:289] error creating forwarding stream for port 43394 -> 44134: Timeout occured
2016/07/22 21:48:51 transport: http2Client.notifyError got notified that the client transport was broken read tcp 127.0.0.1:42944->127.0.0.1:43394: read: connection reset by peer.

Checking 'kubectl get pods' shows that the tiller-rc-sf2a4 pod is actually present. Could this be a race of some sort between 'helm init' fully initializing the Tiller RC and 'helm install'?
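
For reference, a minimal sketch of the kind of readiness wait that could sit between 'helm init' and the first 'helm install'. It is written against the current client-go API; the "app=helm,name=tiller" label selector, the kube-system namespace, and the kubeconfig path are assumptions for illustration, not something this issue confirms.

// Minimal sketch: poll until a Tiller pod reports Ready before running the
// first 'helm install'. Label selector, namespace, and kubeconfig path are
// assumptions for illustration.
package main

import (
    "context"
    "fmt"
    "log"
    "time"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/clientcmd"
)

// waitForTiller polls until at least one Tiller pod reports the Ready
// condition, or the timeout expires.
func waitForTiller(client kubernetes.Interface, namespace string, timeout time.Duration) error {
    deadline := time.Now().Add(timeout)
    for time.Now().Before(deadline) {
        pods, err := client.CoreV1().Pods(namespace).List(context.TODO(), metav1.ListOptions{
            LabelSelector: "app=helm,name=tiller", // assumed Tiller labels
        })
        if err == nil {
            for _, p := range pods.Items {
                for _, c := range p.Status.Conditions {
                    if c.Type == corev1.PodReady && c.Status == corev1.ConditionTrue {
                        return nil // safe to port-forward now
                    }
                }
            }
        }
        time.Sleep(2 * time.Second)
    }
    return fmt.Errorf("no ready Tiller pod in %q after %s", namespace, timeout)
}

func main() {
    config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig") // placeholder path
    if err != nil {
        log.Fatal(err)
    }
    client, err := kubernetes.NewForConfig(config)
    if err != nil {
        log.Fatal(err)
    }
    if err := waitForTiller(client, "kube-system", 2*time.Minute); err != nil {
        log.Fatal(err)
    }
    fmt.Println("tiller is ready")
}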

@technosophos
Member

Can you tell us about the environment that Tiller is running in? Local, remote, GKE, etc? Oh, and the Kubernetes version?

I've had that happen once or twice due to k8s API server issues (running k8s 1.3 inside of VirtualBox using scripts/local-cluster.sh). But my guess is that you may have hit a different bug.

@adamreese
Member

This could be due to Helm not checking whether the pod is ready before connecting. I'll check on it.
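
A minimal sketch of what that check could look like, assuming client-go types; isPodReady is a hypothetical helper, not existing Helm code.

package tunnel

import corev1 "k8s.io/api/core/v1"

// isPodReady reports whether a pod is Running and has the Ready condition
// set to True. Hypothetical helper: the client-side tunnel code could
// refuse to port-forward to Tiller until this returns true.
func isPodReady(pod *corev1.Pod) bool {
    if pod.Status.Phase != corev1.PodRunning {
        return false
    }
    for _, cond := range pod.Status.Conditions {
        if cond.Type == corev1.PodReady && cond.Status == corev1.ConditionTrue {
            return true
        }
    }
    return false
}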

@maratoid
Author

FYI, this happened again. We have a cluster with some sort of networking problem: pods seem to be able to talk to each other but cannot reach outside the flannel network. Trying to install a release on this cluster results in Helm getting stuck in a loop:

KUBECONFIG=/Users/marat/.kraken/maratoidTNG/admin.kubeconfig HELM_HOME=/Users/marat/.kraken/maratoidTNG/.helm helm install atlas/kubedns-0.1.0 --name kubedns --values /Users/marat/.kraken/maratoidTNG/atlas-kubedns.helmvalues
Fetched atlas/kubedns-0.1.0 to /Users/marat/dev/kraken/kubedns-0.1.0.tgz
2016/09/20 15:49:32 transport: http2Client.notifyError got notified that the client transport was broken EOF.
E0920 15:49:33.217943   61537 portforward.go:327] an error occurred forwarding 61453 -> 44134: error forwarding port 44134 to pod tiller-deploy-1979772362-1kqd9_kube-system, uid : exit status 1: 2016/09/20 22:49:33 socat[21989] E connect(5, AF=2 127.0.0.1:44134, 16): Connection refused
2016/09/20 15:49:33 transport: http2Client.notifyError got notified that the client transport was broken EOF.
E0920 15:49:33.439489   61537 portforward.go:327] an error occurred forwarding 61453 -> 44134: error forwarding port 44134 to pod tiller-deploy-1979772362-1kqd9_kube-system, uid : exit status 1: 2016/09/20 22:49:33 socat[21990] E connect(5, AF=2 127.0.0.1:44134, 16): Connection refused
2016/09/20 15:49:33 transport: http2Client.notifyError got notified that the client transport was broken EOF.
E0920 15:49:33.695460   61537 portforward.go:327] an error occurred forwarding 61453 -> 44134: error forwarding port 44134 to pod tiller-deploy-1979772362-1kqd9_kube-system, uid : exit status 1: 2016/09/20 22:49:33 socat[21991] E connect(5, AF=2 127.0.0.1:44134, 16): Connection refused
2016/09/20 15:49:33 transport: http2Client.notifyError got notified that the client transport was broken EOF.
E0920 15:49:33.922762   61537 portforward.go:327] an error occurred forwarding 61453 -> 44134: error forwarding port 44134 to pod tiller-deploy-1979772362-1kqd9_kube-system, uid : exit status 1: 2016/09/20 22:49:33 socat[22003] E connect(5, AF=2 127.0.0.1:44134, 16): Connection refused
2016/09/20 15:49:33 transport: http2Client.notifyError got notified that the client transport was broken EOF.

and the Tiller pod crashing:

kubectl --kubeconfig=/Users/marat/.kraken/maratoidTNG/admin.kubeconfig logs tiller-deploy-1979772362-1kqd9 --namespace=kube-system --follow
Tiller is running on :44134
Tiller probes server is running on :44135
Storage driver is ConfigMap
Cannot initialize Kubernetes connection: Get https://10.32.0.1:443/api: dial tcp 10.32.0.1:443: connect: network is unreachable
2016-09-20 22:52:33.535956 I | Getting release "kubedns" from storage
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x6d2d69]

goroutine 41 [running]:
panic(0x1473dc0, 0xc42000a030)
        /usr/local/Cellar/go/1.7/libexec/src/runtime/panic.go:500 +0x1a1
k8s.io/helm/vendor/k8s.io/kubernetes/pkg/client/unversioned.(*ConfigMaps).Get(0xc420074b00, 0xc420461bd9, 0x7, 0x30, 0x1ec4fe0, 0x27)
        /Users/adamreese/p/go/src/k8s.io/helm/vendor/k8s.io/kubernetes/pkg/client/unversioned/configmap.go:58 +0x79
k8s.io/helm/pkg/storage/driver.(*ConfigMaps).Get(0xc4204611d0, 0xc420461bd9, 0x7, 0x1, 0x1, 0x0)
        /Users/adamreese/p/go/src/k8s.io/helm/pkg/storage/driver/cfgmaps.go:69 +0x62
k8s.io/helm/pkg/storage.(*Storage).Get(0xc4204611e0, 0xc420461bd9, 0x7, 0x493dd, 0x1fed4460200cb850, 0xcf73b4b1)
        /Users/adamreese/p/go/src/k8s.io/helm/pkg/storage/storage.go:36 +0xdd
main.(*releaseServer).uniqName(0xc420020030, 0xc420461bd9, 0x7, 0xc420075c00, 0x1, 0xc4203e7400, 0x0, 0x7f17f671b000)
        /Users/adamreese/p/go/src/k8s.io/helm/cmd/tiller/release_server.go:337 +0x376
main.(*releaseServer).prepareRelease(0xc420020030, 0xc420457400, 0xc42042d440, 0xc4200761c0, 0xc4200cba98)
        /Users/adamreese/p/go/src/k8s.io/helm/cmd/tiller/release_server.go:398 +0x6e
main.(*releaseServer).InstallRelease(0xc420020030, 0x7f17f66cfd08, 0xc4203d4810, 0xc420457400, 0xc42003bb18, 0x417cc8, 0x40)
        /Users/adamreese/p/go/src/k8s.io/helm/cmd/tiller/release_server.go:379 +0x3c
k8s.io/helm/pkg/proto/hapi/services._ReleaseService_InstallRelease_Handler(0x1592440, 0xc420020030, 0x7f17f66cfd08, 0xc4203d4810, 0xc420288040, 0x0, 0x0, 0xc4203e7400, 0x4)
        /Users/adamreese/p/go/src/k8s.io/helm/pkg/proto/hapi/services/tiller.pb.go:586 +0xdd
k8s.io/helm/vendor/google.golang.org/grpc.(*Server).processUnaryRPC(0xc42044a000, 0x1e971c0, 0xc42042d440, 0xc4200761c0, 0xc42033f140, 0x1eb8108, 0xc4203d46c0, 0x0, 0x0)
        /Users/adamreese/p/go/src/k8s.io/helm/vendor/google.golang.org/grpc/server.go:497 +0xa0b
k8s.io/helm/vendor/google.golang.org/grpc.(*Server).handleStream(0xc42044a000, 0x1e971c0, 0xc42042d440, 0xc4200761c0, 0xc4203d46c0)
        /Users/adamreese/p/go/src/k8s.io/helm/vendor/google.golang.org/grpc/server.go:646 +0x6ad
k8s.io/helm/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc4204618a0, 0xc42044a000, 0x1e971c0, 0xc42042d440, 0xc4200761c0)
        /Users/adamreese/p/go/src/k8s.io/helm/vendor/google.golang.org/grpc/server.go:323 +0xab
created by k8s.io/helm/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
        /Users/adamreese/p/go/src/k8s.io/helm/vendor/google.golang.org/grpc/server.go:324 +0xa3

Now, of course, this is not a working cluster, but I figured Tiller shouldn't just panic, and Helm should still fail gracefully.
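
For illustration, a minimal sketch of the kind of startup guard that would avoid the panic: surface the connection error and exit (or return a gRPC error) instead of carrying a nil client into the ConfigMap storage driver, which is where the nil pointer dereference above ends up. The names are illustrative, not the actual Tiller code.

package main

import (
    "fmt"
    "log"
    "os"

    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
)

// newKubeClient returns an error if the in-cluster connection cannot be
// established, instead of letting the server continue with a nil client.
// Illustrative sketch, not the actual Tiller startup code.
func newKubeClient() (kubernetes.Interface, error) {
    config, err := rest.InClusterConfig()
    if err != nil {
        return nil, fmt.Errorf("cannot load in-cluster config: %v", err)
    }
    client, err := kubernetes.NewForConfig(config)
    if err != nil {
        return nil, fmt.Errorf("cannot initialize Kubernetes connection: %v", err)
    }
    return client, nil
}

func main() {
    client, err := newKubeClient()
    if err != nil {
        // Fail fast instead of serving install requests with a broken backend.
        log.Printf("tiller: %v", err)
        os.Exit(1)
    }
    _ = client // hand off to the ConfigMap storage driver and the gRPC server
}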
