Proxy exits immediately after started #174

Closed
mumoshu opened this Issue Jun 12, 2017 · 14 comments


mumoshu commented Jun 12, 2017

What were you trying to do?

I just tried to run the simplest command I could come up with, --run sleep 10, to verify that telepresence works on my Mac with the K8S cluster running on minikube.

What did you expect to happen?

The command sleeps for 10 seconds and exits cleanly.

What happened instead?

The command exited almost immediately after I saw the Starting proxy... message, producing error output starting with Error in atexit._run_exitfuncs:.

$ kubectl run --image datawire/telepresence-k8s:0.50 telepresence
$ telepresence -m inject-tcp --deployment telepresence --run sleep 10
Starting proxy...
Proxy to Kubernetes exited. This is typically due to a lost connection.
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.6.1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['umount', '/var/folders/j9/g1yzwnln6nb4w53bhr6kkwsr0000gn/T/tmp1g728juz']' returned non-zero exit status 1.

telepresence wrote a telepresence.log containing the following:

Context: minikube, namespace: default, kubectl_command: kubectl
Running: (['kubectl', '--context', 'minikube', 'cluster-info'],)... ran!
Running 'ssh -V'...Running: (['which', 'torsocks'],)... ran!
Running: (['which', 'sshfs'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'delete', '--ignore-not-found', 'all', '--selector=telepresence=f1c577e6-751a-4077-9368-6eb99cc01409'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'run', '--restart=Always', '--limits=memory=256Mi', '--requests=memory=64Mi', 'tp', '--image=datawire/telepresence-k8s:0.50', '--labels=telepresence=f1c577e6-751a-4077-9368-6eb99cc01409', '--port=8080', '--expose'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'deployment', '-o', 'json', '--export', '--selector=telepresence=f1c577e6-751a-4077-9368-6eb99cc01409'],)... ran!
Expected metadata for pods: {'creationTimestamp': None, 'labels': {'telepresence': 'f1c577e6-751a-4077-9368-6eb99cc01409'}}
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'pod', '-o', 'json', '--export'],)... ran!
Checking {'run': 'telepresence', 'pod-template-hash': '450099238'} (phase Running)...
Labels don't match.
Checking {'telepresence': 'f1c577e6-751a-4077-9368-6eb99cc01409', 'pod-template-hash': '1449629046'} (phase Pending)...
Looks like we've found our pod!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'pod', 'tp-1449629046-nv6t9', '-o', 'json'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'pod', 'tp-1449629046-nv6t9', '-o', 'json'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'pod', 'tp-1449629046-nv6t9', '-o', 'json'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'pod', 'tp-1449629046-nv6t9', '-o', 'json'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'port-forward', 'tp-1449629046-nv6t9', '58491:8022'],)...Running: (['ssh', '-q', '-p', '58491', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', 'telepresence@localhost', '/bin/true'],)... Forwarding from 127.0.0.1:58491 -> 8022
Forwarding from [::1]:58491 -> 8022
Running: (['ssh', '-q', '-p', '58491', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', 'telepresence@localhost', '/bin/true'],)... Handling connection for 58491
 ran!
Running: (['ssh', '-q', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', '-oServerAliveInterval=1', '-oServerAliveCountMax=10', '-N', '-p', '58491', 'telepresence@localhost', '-R', '*:8080:127.0.0.1:8080'],)...Running: (['ssh', '-q', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', '-oServerAliveInterval=1', '-oServerAliveCountMax=10', '-N', '-p', '58491', 'telepresence@localhost', '-L', '127.0.0.1:58499:127.0.0.1:9050'],)...Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'exec', 'tp-1449629046-nv6t9', '--container', 'tp', 'env'],)... ran!
Running: (['sshfs', '-p', '58491', '-o', 'StrictHostKeyChecking=no', '-o', 'UserKnownHostsFile=/dev/null', 'telepresence@localhost:/', '/var/folders/j9/g1yzwnln6nb4w53bhr6kkwsr0000gn/T/tmpq2ve1y08'],)... Handling connection for 58491
 ran!
A subprocess died, killing all processes...
Shutting down local shell...
Killing local shell...
Running: (['umount', '/var/folders/j9/g1yzwnln6nb4w53bhr6kkwsr0000gn/T/tmpq2ve1y08'],)...umount: /var/folders/j9/g1yzwnln6nb4w53bhr6kkwsr0000gn/T/tmpq2ve1y08: not currently mounted
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'delete', '--ignore-not-found', 'all', '--selector=telepresence=f1c577e6-751a-4077-9368-6eb99cc01409'],)... ran!

drslump commented Jun 13, 2017

To get more details about what the issue could be, you can try adding the following at /usr/local/bin/telepresence:1004:

runner.write("A subprocess died, killing all processes...\n")
runner.write(str(p.args) + "\n")
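
For context, here is a minimal sketch of the kind of watchdog loop that line lands in — the surrounding structure and names are assumptions for illustration, not the actual telepresence source:

for p in processes:                       # each p is a subprocess.Popen
    if p.poll() is not None:              # this child has exited
        runner.write("A subprocess died, killing all processes...\n")
        runner.write(str(p.args) + "\n")  # the suggested extra line: show which command died
        kill_remaining(processes)         # hypothetical cleanup helper
        break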

mumoshu commented Jun 13, 2017

@drslump Thanks for the reply.

After injecting the snippet, what I see now in telepresence.log is the following:

Context: minikube, namespace: default, kubectl_command: kubectl
Running: (['kubectl', '--context', 'minikube', 'cluster-info'],)... ran!
Running 'ssh -V'...Running: (['which', 'torsocks'],)... ran!
Running: (['which', 'sshfs'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'deployment', '-o', 'json', '--export', 'telepresence'],)... ran!
Expected metadata for pods: {'creationTimestamp': None, 'labels': {'run': 'telepresence'}}
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'pod', '-o', 'json', '--export'],)... ran!
Checking {'pod-template-hash': '2371676037', 'run': 'nginx'} (phase Running)...
Labels don't match.
Checking {'openvpn': 'minikube.dev', 'pod-template-hash': '1131237064'} (phase Running)...
Labels don't match.
Checking {'pod-template-hash': '450099238', 'run': 'telepresence'} (phase Running)...
Looks like we've found our pod!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'pod', 'telepresence-450099238-sqn1k', '-o', 'json'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'port-forward', 'telepresence-450099238-sqn1k', '49366:8022'],)...Running: (['ssh', '-q', '-p', '49366', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', 'telepresence@localhost', '/bin/true'],)... Forwarding from 127.0.0.1:49366 -> 8022
Forwarding from [::1]:49366 -> 8022
Running: (['ssh', '-q', '-p', '49366', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', 'telepresence@localhost', '/bin/true'],)... Handling connection for 49366
 ran!
Running: (['ssh', '-q', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', '-oServerAliveInterval=1', '-oServerAliveCountMax=10', '-N', '-p', '49366', 'telepresence@localhost', '-L', '127.0.0.1:49374:127.0.0.1:9050'],)...Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'exec', 'telepresence-450099238-sqn1k', '--container', 'telepresence', 'env'],)... ran!
Running: (['sshfs', '-p', '49366', '-o', 'StrictHostKeyChecking=no', '-o', 'UserKnownHostsFile=/dev/null', 'telepresence@localhost:/', '/var/folders/j9/g1yzwnln6nb4w53bhr6kkwsr0000gn/T/tmpjfity7t3'],)...  ran!
A subprocess died, killing all processes...
['ssh', '-q', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', '-oServerAliveInterval=1', '-oServerAliveCountMax=10', '-N', '-p', '49366', 'telepresence@localhost', '-L', '127.0.0.1:49374:127.0.0.1:9050']
Connection to localhost closed by remote host.
Shutting down local shell...
Killing local shell...
Running: (['umount', '/var/folders/j9/g1yzwnln6nb4w53bhr6kkwsr0000gn/T/tmpjfity7t3'],)...umount: /var/folders/j9/g1yzwnln6nb4w53bhr6kkwsr0000gn/T/tmpjfity7t3: not currently mounted

mumoshu commented Jun 13, 2017

Running the logged ssh command in my shell results in the following:

$ ssh -q -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=1 -oServerAliveCountMax=10 -N -p 49366 telepresence@localhost -L 127.0.0.1:49374:127.0.0.1:9050
$ echo $?
255

mumoshu commented Jun 13, 2017

Removing the -q flag reveals that the ssh connection is getting refused:

$ ssh -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=1 -oServerAliveCountMax=10 -N -p 49366 telepresence@localhost -L 127.0.0.1:49374:127.0.0.1:9050
ssh: connect to host localhost port 49366: Connection refused

mumoshu commented Jun 13, 2017

Manually running the kubectl port-forward command shown in the log and then running the ssh command seems to work:

kubectl port-forward telepresence-450099238-sqn1k 49366:8022
Forwarding from 127.0.0.1:49366 -> 8022
Forwarding from [::1]:49366 -> 8022
Handling connection for 49366
$ ssh -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=1 -oServerAliveCountMax=10 -N -p 49366 telepresence@localhost -L 127.0.0.1:49374:127.0.0.1:9050
Warning: Permanently added '[localhost]:49366' (ECDSA) to the list of known hosts.

mumoshu commented Jun 13, 2017

This mixed-up log message seems to indicate a possible race condition here.

Running: (['ssh', '-q', '-p', '49366', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', 'telepresence@localhost', '/bin/true'],)... Handling connection for 49366

AFAIK, Handling connection for 49366 comes from kubectl port-forward, and it should precede the ssh.


drslump commented Jun 13, 2017

Actually, the ssh -q -p 49366... is telepresence probing the kubectl port-forward to detect when it's ready.

Maybe enabling verbose output on ssh will reveal a hint about the issue: at telepresence:614, change the -q to -vvv.
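
For anyone following along, a minimal sketch of what that readiness probe amounts to — the function name and timeout are assumptions; the ssh arguments are taken from the log above:

import subprocess
import time

def wait_until_ssh_ready(port, timeout=30):
    # Keep re-running `ssh ... /bin/true` against the forwarded port until
    # it succeeds, i.e. until kubectl port-forward is actually accepting
    # connections, before starting the real tunnels.
    probe = ["ssh", "-q", "-p", str(port),
             "-oStrictHostKeyChecking=no",
             "-oUserKnownHostsFile=/dev/null",
             "telepresence@localhost", "/bin/true"]
    deadline = time.time() + timeout
    while time.time() < deadline:
        if subprocess.run(probe).returncode == 0:
            return True
        time.sleep(0.25)
    return False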


mumoshu commented Jun 13, 2017

@drslump Thanks 👍
The log with -vvv follows:

Context: minikube, namespace: default, kubectl_command: kubectl
Running: (['kubectl', '--context', 'minikube', 'cluster-info'],)... ran!
Running 'ssh -V'...Running: (['which', 'torsocks'],)... ran!
Running: (['which', 'sshfs'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'delete', '--ignore-not-found', 'all', '--selector=telepresence=3be7e2db-6f98-495c-9c3b-45fc175dcac7'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'run', '--restart=Always', '--limits=memory=256Mi', '--requests=memory=64Mi', 'example2', '--image=datawire/telepresence-k8s:0.50', '--labels=telepresence=3be7e2db-6f98-495c-9c3b-45fc175dcac7'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'deployment', '-o', 'json', '--export', '--selector=telepresence=3be7e2db-6f98-495c-9c3b-45fc175dcac7'],)... ran!
Expected metadata for pods: {'creationTimestamp': None, 'labels': {'telepresence': '3be7e2db-6f98-495c-9c3b-45fc175dcac7'}}
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'pod', '-o', 'json', '--export'],)... ran!
Checking {'pod-template-hash': '1802794833', 'telepresence': 'e66a86f4-e4c7-4bd4-9f8c-8bd99c38c902'} (phase Running)...
Labels don't match.
Checking {'pod-template-hash': '4126177186', 'telepresence': '3be7e2db-6f98-495c-9c3b-45fc175dcac7'} (phase Pending)...
Looks like we've found our pod!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'pod', 'example2-4126177186-8twzt', '-o', 'json'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'get', 'pod', 'example2-4126177186-8twzt', '-o', 'json'],)... ran!
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'port-forward', 'example2-4126177186-8twzt', '58515:8022'],)...Forwarding from 127.0.0.1:58515 -> 8022
Forwarding from [::1]:58515 -> 8022
Running: (['ssh', '-q', '-p', '58515', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', 'telepresence@localhost', '/bin/true'],)... Handling connection for 58515
 ran!
Running: (['ssh', '-q', '-vvv', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', '-oServerAliveInterval=1', '-oServerAliveCountMax=10', '-N', '-p', '58515', 'telepresence@localhost', '-L', '127.0.0.1:58529:127.0.0.1:9050'],)...Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'exec', 'example2-4126177186-8twzt', '--container', 'example2', 'env'],)...OpenSSH_6.9p1, LibreSSL 2.1.8
debug1: Reading configuration data /Users/cw_kuoka/.ssh/config
debug1: /Users/cw_kuoka/.ssh/config line 8: Applying options for *
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 21: Applying options for *
debug1: auto-mux: Trying existing master
debug2: fd 3 setting O_NONBLOCK
debug2: mux_client_hello_exchange: master version 4
debug3: mux_client_forwards: request forwardings: 1 local, 0 remote
debug1: Requesting forwarding of local forward 127.0.0.1:58529 -> 127.0.0.1:9050
debug3: mux_client_request_session: entering
debug3: mux_client_request_alive: entering
debug3: mux_client_request_alive: done pid = 68733
debug3: mux_client_request_session: session request sent
debug1: mux_client_request_session: master session id: 3
debug3: mux_client_read_packet: read header failed: Broken pipe
debug2: Received exit status from master 0
 ran!
Running: (['sshfs', '-p', '58515', '-o', 'StrictHostKeyChecking=no', '-o', 'UserKnownHostsFile=/dev/null', 'telepresence@localhost:/', '/var/folders/j9/g1yzwnln6nb4w53bhr6kkwsr0000gn/T/tmp_or_6e_a'],)...  ran!
A subprocess died, killing all processes...
['ssh', '-q', '-vvv', '-oStrictHostKeyChecking=no', '-oUserKnownHostsFile=/dev/null', '-oServerAliveInterval=1', '-oServerAliveCountMax=10', '-N', '-p', '58515', 'telepresence@localhost', '-L', '127.0.0.1:58529:127.0.0.1:9050']
Connection to localhost closed by remote host.
Shutting down local shell...
Killing local shell...
Running: (['umount', '/var/folders/j9/g1yzwnln6nb4w53bhr6kkwsr0000gn/T/tmp_or_6e_a'],)...umount: /var/folders/j9/g1yzwnln6nb4w53bhr6kkwsr0000gn/T/tmp_or_6e_a: not currently mounted
Running: (['kubectl', '--context', 'minikube', '--namespace', 'default', 'delete', '--ignore-not-found', 'all', '--selector=telepresence=3be7e2db-6f98-495c-9c3b-45fc175dcac7'],)... ran!

The Applying options for * part, which comes from my .ssh/config, seems to be related to the issue.


mumoshu commented Jun 13, 2017

In .ssh/config, I had been enabling ControlMaster for every host:

Host *
  ControlMaster auto
  ControlPath ~/.ssh/ctl/%C
  ControlPersist 5

Commenting out all these lines fixed the issue.
I'm not sure how it affects telepresence, though.
Would it make sense to document this limitation(?) in the docs?
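
For what it's worth, instead of commenting out the whole block, a narrower workaround might be to disable multiplexing only for localhost, which is where telepresence's port-forwarded tunnels connect (untested sketch; the Host localhost stanza is my assumption, the Host * block is copied from above):

Host localhost
  ControlMaster no

Host *
  ControlMaster auto
  ControlPath ~/.ssh/ctl/%C
  ControlPersist 5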


drslump commented Jun 13, 2017

🎉 Great! Good troubleshooting, @mumoshu!

I'll send a PR implementing something like:

  • launch ssh with -F /dev/null to ignore any local configuration settings (see the sketch after this list)
  • expose a --verbose flag on telepresence that runs the subcommands in verbose mode to help troubleshoot future issues
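
For illustration, with the first change the tunnel command from the log above would look something like this (only the -F /dev/null part is new; everything else is copied verbatim from the log):

ssh -F /dev/null -q -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -oServerAliveInterval=1 -oServerAliveCountMax=10 -N -p 49366 telepresence@localhost -L 127.0.0.1:49374:127.0.0.1:9050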

@mumoshu mumoshu changed the title from Proxy exists immediately after started to Proxy exits immediately after started Jun 13, 2017


mumoshu commented Jun 13, 2017

@drslump Thank you very much for your support 👍


itamarst commented Jun 13, 2017

@mumoshu, sorry you had this problem!

@drslump and @mumoshu, thank you so much for debugging this!

I'll take a look at the PR this morning.

itamarst added a commit that referenced this issue Jun 13, 2017


itamarst commented Jun 13, 2017

I just released version 0.51; packages should be available in about an hour - thanks for the bug report, debugging and patch.

@mumoshu @drslump if you email me your mailing addresses and t-shirt size we'd be happy to send you each a Telepresence t-shirt (itamar@datawire.io).


drslump commented Jun 13, 2017

That's a really great present, @itamarst! 🙇
