container method doesn't notice/handle the network container not starting #842

Closed
unludo opened this Issue Nov 23, 2018 · 5 comments

unludo commented Nov 23, 2018

What were you trying to do?

Trying to connect to a GKE cluster.

What did you expect to happen?

A Telepresence session to be established.

What happened instead?

Telepresence crashed with the error below.

Automatically included information

Command line: ['/usr/bin/telepresence', '--also-proxy', '10.132.0.0/24', '--docker-run', '-i', '-t', '--env', 'REDIS_HOST=redis-master.redis-system', '-p', '8081:8081', 'rediscommander/redis-commander:latest']
Version: 0.94
Python version: 3.5.2 (default, Nov 12 2018, 13:43:14) [GCC 5.4.0 20160609]
kubectl version: Client Version: v1.10.7 // Server Version: v1.11.2-gke.18
oc version: (error: [Errno 2] No such file or directory: 'oc')
OS: Linux llepenn-OptiPlex-5050 4.15.0-39-generic #42~16.04.1-Ubuntu SMP Wed Oct 24 17:09:54 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Traceback:

Traceback (most recent call last):
  File "/usr/bin/telepresence/telepresence/cli.py", line 130, in crash_reporting
    yield
  File "/usr/bin/telepresence/telepresence/main.py", line 85, in main
    runner, remote_info, env, socks_port, ssh, mount_dir
  File "/usr/bin/telepresence/telepresence/outbound/setup.py", line 89, in launch
    args.also_proxy, env, ssh, mount_dir
  File "/usr/bin/telepresence/telepresence/outbound/container.py", line 146, in run_docker_command
    "Waiting for network container timed out. File a bug, please!"
RuntimeError: Waiting for network container timed out. File a bug, please!

Logs:

tainer: telepresence-1542984575-7140777-11694.
 587.5 TEL | [191] exit 125 in 0.20 secs.
 588.5 TEL | [192] Running: docker run --network=container:telepresence-1542984575-7140777-11694 --rm datawire/telepresence-local:0.94 wait
 588.5 TEL | [193] Running: sudo -n echo -n
 588.7 192 | docker: Error response from daemon: No such container: telepresence-1542984575-7140777-11694.
 588.7 TEL | [192] exit 125 in 0.19 secs.
 589.7 TEL | [194] Running: docker run --network=container:telepresence-1542984575-7140777-11694 --rm datawire/telepresence-local:0.94 wait
 589.9 194 | docker: Error response from daemon: No such container: telepresence-1542984575-7140777-11694.
 589.9 TEL | [194] exit 125 in 0.23 secs.
 590.9 TEL | [195] Running: docker run --network=container:telepresence-1542984575-7140777-11694 --rm datawire/telepresence-local:0.94 wait
 591.1 195 | docker: Error response from daemon: No such container: telepresence-1542984575-7140777-11694.
 591.1 TEL | [195] exit 125 in 0.22 secs.

Contributor

ark3 commented Nov 23, 2018

Sorry for the crash! Can you please pass along the entire log file as a GitHub Gist? Thank you.

unludo commented Nov 28, 2018

unludo changed the title from "telepresence can connect to GKE 1.11" to "telepresence can't connect to GKE 1.11" Nov 28, 2018

Contributor

ark3 commented Nov 28, 2018

Thank you for the trace. It looks like Docker is unable to start the network container because -p 8081:8081 asks it to publish port 8081 and that port is already in use. I see two problems:

  1. Tel fails to notice that the network container didn't start.

      17.6 TEL | [49] Launching Network container: docker run -p=8081:8081 --rm --privileged --name=telepresence-1543422212-9006648-25591 datawire/telepresence-local:0.94 proxy '{"cidrs": ["10.56.4.0/24", "10.56.0.0/24", "10.56.5.0/24", "10.56.7.0/24", "10.56.2.0/24", "10.0.0.0/20", "10.56.9.0/24", "10.56.8.0/24", "10.56.3.0/24", "10.56.6.0/24", "10.56.1.0/24"], "port": 36987, "expose_ports": []}'
      17.9  49 | docker: Error response from daemon: driver failed programming external connectivity on endpoint telepresence-1543422212-9006648-25591 (43de07c8be4d22d0224cc4abe6503ef80e5bc2757f9d2946176733c4918b08c0): Error starting userland proxy: listen tcp 0.0.0.0:8081: bind: address already in use.
      17.9 TEL | [49] exit 125
    
  2. Tel also fails to process the error messages from Docker.

      17.6 TEL | [50] Running: docker run --network=container:telepresence-1543422212-9006648-25591 --rm datawire/telepresence-local:0.94 wait
      18.0  50 | docker: Error response from daemon: cannot join network of a non running container: telepresence-1543422212-9006648-25591.
      18.0 TEL | [50] exit 125 in 0.40 secs.
    (repeated many times...)
    

Can you try again, making sure that port 8081 is available? That will help clarify whether that's the problem.

Regardless, we still need to fix the Telepresence bugs above.
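
As a rough illustration of the first problem, here is a minimal sketch of a wait loop that stops polling as soon as the launcher process has exited. It is not Telepresence's actual code: `wait_for_network_container`, `launcher`, and `probe_cmd` are hypothetical names, and the sketch assumes the launcher is the foreground `docker run` process opened with `stderr=subprocess.PIPE` and `universal_newlines=True`.

```python
import subprocess
import time
from typing import List


def wait_for_network_container(
    launcher: subprocess.Popen,
    probe_cmd: List[str],
    timeout: float = 30.0,
    interval: float = 1.0,
) -> None:
    """Block until probe_cmd succeeds; fail fast if the launcher has exited.

    Hypothetical helper: `launcher` is assumed to be the foreground
    `docker run` process for the network container, and `probe_cmd` any
    command that exits 0 once that container is reachable.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # If the launcher has already exited, the container will never
        # appear, so stop polling and report what the launcher printed.
        if launcher.poll() is not None:
            stderr = launcher.stderr.read() if launcher.stderr else ""
            raise RuntimeError(
                "Network container failed to start (exit code {}): {}".format(
                    launcher.returncode, stderr.strip()
                )
            )
        # Otherwise run the probe once and return as soon as it succeeds.
        probe = subprocess.run(probe_cmd, stderr=subprocess.DEVNULL)
        if probe.returncode == 0:
            return
        time.sleep(interval)
    raise RuntimeError("Waiting for network container timed out.")
```

With a check like this, the `exit 125` from the launcher would end the wait after one polling interval and carry Docker's own message, instead of retrying until the timeout.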

@ark3 ark3 added this to To do in Tel Tracker via automation Nov 28, 2018

ark3 changed the title from "telepresence can't connect to GKE 1.11" to "container method doesn't notice/handle the network container not starting" Nov 28, 2018

unludo commented Nov 29, 2018

You are right, the port was in use and freeing it fixes the issue. I should have looked at the logs more carefully. Anyway, the improvements you describe would be useful.
Thanks! :)
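
For anyone who hits the same timeout, one quick way to confirm that a port is free before passing it to `-p` is to try binding to it. This is only a sketch using the standard `socket` module. `port_is_free` is a hypothetical helper name, and the check is conservative: it can report a port as busy while old connections are still lingering.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # Binding fails with OSError (EADDRINUSE) when another
            # process already holds the port on this address.
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

Calling `port_is_free(8081)` before running Telepresence would have flagged the conflict seen here. The result can still change between the check and the `docker run`, so it is a hint rather than a guarantee.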

Contributor

ark3 commented Nov 29, 2018

Telepresence should give you a clear error message so you don't have to look at the logs. Thank you for filing the issue.
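
As a sketch of what such a message could look like, the Docker daemon output quoted earlier in this thread can be matched and turned into a one-line explanation. `explain_docker_failure` is a hypothetical helper, not part of Telepresence, and the pattern is based only on the single error line shown above, so other Docker versions may word it differently.

```python
import re
from typing import Optional

# Pattern based on the one daemon message quoted in this thread:
#   "... listen tcp 0.0.0.0:8081: bind: address already in use."
_PORT_IN_USE = re.compile(r"listen tcp \S*?:(\d+): bind: address already in use")


def explain_docker_failure(stderr: str) -> Optional[str]:
    """Return a short explanation for a known Docker error, else None."""
    match = _PORT_IN_USE.search(stderr)
    if match:
        return (
            "Docker could not publish port {} because it is already in use "
            "on this machine. Free the port or choose another one."
        ).format(match.group(1))
    # Unknown errors are left for the caller to show verbatim.
    return None
```

Returning `None` for unrecognized output keeps the raw Docker message available as a fallback rather than hiding it behind a guess.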
