docker-compose over SSH returns ChannelException(2, 'Connect failed') #7542

Closed
hazcod opened this issue Jun 17, 2020 · 16 comments

Comments

@hazcod

hazcod commented Jun 17, 2020

Description of the issue

Building over SSH via docker-compose works, but docker-compose up always fails with ChannelException(2, 'Connect failed').

Context information (for bug reports)

Output of docker-compose version

docker-compose version 1.25.5, build 8a1c60f6
(also confirmed with version 1.26.0)

Output of docker version

Client: Docker Engine - Community
 Version:           19.03.8
 API version:       1.40
 Go version:        go1.12.17
 Git commit:        afacb8b
 Built:             Wed Mar 11 01:21:11 2020
 OS/Arch:           darwin/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.8
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.17
  Git commit:       afacb8b
  Built:            Wed Mar 11 01:29:16 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Steps to reproduce the issue

  1. set docker context with docker context create xxx --description "xxx" --docker "host=ssh://root@$SSH_HOST" --default-stack-orchestrator swarm
  2. docker context use xxx
  3. docker-compose build and docker-compose up

Observed result

Expected result

Stacktrace / full error message

2020-06-17T15:54:24.3292035Z Step 6/11 : COPY run.sh $APP_DIR/
2020-06-17T15:54:24.3293496Z  ---> Using cache
2020-06-17T15:54:24.3296097Z  ---> e2a57e955e9a
2020-06-17T15:54:24.3299692Z Step 7/11 : RUN $APP_DIR/post-install.sh
2020-06-17T15:54:24.3300660Z  ---> Using cache
2020-06-17T15:54:24.3301578Z  ---> 323d4c7ead3e
2020-06-17T15:54:24.3303848Z Step 8/11 : EXPOSE 8080
2020-06-17T15:54:24.3304499Z  ---> Using cache
2020-06-17T15:54:24.3305583Z  ---> 84d07026c232
2020-06-17T15:54:24.3307580Z Step 9/11 : USER $APP_USER
2020-06-17T15:54:24.3308205Z  ---> Using cache
2020-06-17T15:54:24.3310057Z  ---> 698b453faa79
2020-06-17T15:54:24.3312178Z Step 10/11 : HEALTHCHECK --interval=5s --timeout=3s --retries=3 CMD wget --tries=1 -O - --quiet http://localhost:8080/_status/ || exit 1
2020-06-17T15:54:24.3313128Z  ---> Using cache
2020-06-17T15:54:24.3315406Z  ---> a9688b412bca
2020-06-17T15:54:24.3317664Z Step 11/11 : CMD $APP_DIR/run.sh
2020-06-17T15:54:24.3318490Z  ---> Using cache
2020-06-17T15:54:24.3319843Z  ---> 154ce4a7592b
2020-06-17T15:54:24.3320282Z 
2020-06-17T15:54:24.3321263Z Successfully built 154ce4a7592b
2020-06-17T15:54:24.3388087Z Successfully tagged download:prod
2020-06-17T15:54:24.3395955Z Building web
2020-06-17T15:54:26.7155414Z Step 1/12 : FROM baseimage:prod
2020-06-17T15:54:26.7158395Z  ---> 12dfecf4398f
2020-06-17T15:54:26.7159189Z Step 2/12 : ENV ASSET_DIR "$APP_DIR/asset"
2020-06-17T15:54:26.7185080Z  ---> Using cache
2020-06-17T15:54:26.7186068Z  ---> a8c25ab21e3d
2020-06-17T15:54:26.7186798Z Step 3/12 : RUN mkdir $ASSET_DIR/
2020-06-17T15:54:26.7187584Z  ---> Using cache
2020-06-17T15:54:26.7188418Z  ---> 662d287b7d57
2020-06-17T15:54:26.7189653Z Step 4/12 : RUN apk add --no-cache nginx 	&& rm -r /etc/nginx/ /var/tmp/nginx /var/log/nginx /var/lib/nginx/
2020-06-17T15:54:26.7193147Z  ---> Using cache
2020-06-17T15:54:26.7194154Z  ---> 033105961aee
2020-06-17T15:54:26.7194886Z Step 5/12 : COPY conf/ $CONF_DIR/
2020-06-17T15:54:26.7205388Z  ---> Using cache
2020-06-17T15:54:26.7206406Z  ---> 1d113c928e68
2020-06-17T15:54:26.7207509Z Step 6/12 : COPY run.sh $APP_DIR/
2020-06-17T15:54:26.7209950Z  ---> Using cache
2020-06-17T15:54:26.7210838Z  ---> 759545f0b148
2020-06-17T15:54:26.7211521Z Step 7/12 : ADD asset/ $ASSET_DIR/
2020-06-17T15:54:26.7324418Z  ---> Using cache
2020-06-17T15:54:26.7327302Z  ---> c45ee937aa65
2020-06-17T15:54:26.7331421Z Step 8/12 : RUN $APP_DIR/post-install.sh
2020-06-17T15:54:26.7334521Z  ---> Using cache
2020-06-17T15:54:26.7337586Z  ---> d9c75eded3e2
2020-06-17T15:54:26.7341203Z Step 9/12 : EXPOSE 8080 8443
2020-06-17T15:54:26.7348029Z  ---> Using cache
2020-06-17T15:54:26.7353156Z  ---> 461a9b85c635
2020-06-17T15:54:26.7358012Z Step 10/12 : USER $APP_USER
2020-06-17T15:54:26.7361133Z  ---> Using cache
2020-06-17T15:54:26.7364199Z  ---> 8d32caac9dd8
2020-06-17T15:54:26.7368323Z Step 11/12 : HEALTHCHECK --interval=5s --timeout=3s --retries=3 CMD wget --quiet --tries=1 --spider --no-check-certificate https://localhost:8443/_status/ || exit 1
2020-06-17T15:54:26.7371604Z  ---> Using cache
2020-06-17T15:54:26.7374678Z  ---> 6a11199270cb
2020-06-17T15:54:26.7378306Z Step 12/12 : CMD $APP_DIR/run.sh
2020-06-17T15:54:26.7381296Z  ---> Using cache
2020-06-17T15:54:26.7384302Z  ---> 2b4da9291b88
2020-06-17T15:54:26.7387004Z 
2020-06-17T15:54:26.7424699Z Successfully built 2b4da9291b88
2020-06-17T15:54:26.7480313Z Successfully tagged web:prod
2020-06-17T15:54:26.7937558Z + docker-compose -p venclave -f compose/network.yml -f compose/db.yml -f compose/php.yml -f compose/web.yml -f compose/download.yml -f compose/mq.yml -f compose/slave.yml -f compose/processor.yml -f compose/stages/prod/prod.yml up -d
2020-06-17T15:54:27.2793496Z Some services (db, download, mq, php, processor, slave, web) use the 'deploy' key, which will be ignored. Compose does not support 'deploy' configuration - use `docker stack deploy` to deploy to a swarm.
2020-06-17T15:54:43.4743772Z Starting download ... 
2020-06-17T15:54:43.4773389Z db is up-to-date
2020-06-17T15:54:43.4779050Z mq is up-to-date
2020-06-17T15:54:43.4804868Z Starting slave    ... 
2020-06-17T15:54:43.6501746Z Secsh channel 41 open FAILED: open failed: Connect failed
2020-06-17T15:54:43.8208622Z 
2020-06-17T15:54:43.8211047Z ERROR: for slave  ChannelException(2, 'Connect failed')
2020-06-17T15:54:43.8211491Z Creating php      ... 
2020-06-17T15:54:45.0130159Z Starting download ... done
2020-06-17T15:54:48.9542806Z Creating php      ... done
2020-06-17T15:54:49.3842855Z Creating web      ... 
2020-06-17T15:54:55.5722054Z Creating web      ... done
2020-06-17T15:54:55.5731642Z [2744] Failed to execute script docker-compose
2020-06-17T15:54:55.5732404Z 
2020-06-17T15:54:55.5733972Z ERROR: for slave  ChannelException(2, 'Connect failed')
2020-06-17T15:54:55.5735415Z Traceback (most recent call last):
2020-06-17T15:54:55.5736234Z   File "bin/docker-compose", line 6, in <module>
2020-06-17T15:54:55.5737464Z   File "compose/cli/main.py", line 72, in main
2020-06-17T15:54:55.5738150Z   File "compose/cli/main.py", line 128, in perform_command
2020-06-17T15:54:55.5738750Z   File "compose/cli/main.py", line 1078, in up
2020-06-17T15:54:55.5739339Z   File "compose/cli/main.py", line 1074, in up
2020-06-17T15:54:55.5739922Z   File "compose/project.py", line 576, in up
2020-06-17T15:54:55.5740507Z   File "compose/parallel.py", line 112, in parallel_execute
2020-06-17T15:54:55.5741091Z   File "compose/parallel.py", line 210, in producer
2020-06-17T15:54:55.5741685Z   File "compose/project.py", line 562, in do
2020-06-17T15:54:55.5742272Z   File "compose/service.py", line 569, in execute_convergence_plan
2020-06-17T15:54:55.5742911Z   File "compose/service.py", line 511, in _execute_convergence_start
2020-06-17T15:54:55.5743528Z   File "compose/parallel.py", line 112, in parallel_execute
2020-06-17T15:54:55.5744134Z   File "compose/parallel.py", line 210, in producer
2020-06-17T15:54:55.5744723Z   File "compose/service.py", line 509, in <lambda>
2020-06-17T15:54:55.5745949Z   File "compose/service.py", line 621, in start_container_if_stopped
2020-06-17T15:54:55.5746636Z   File "compose/service.py", line 626, in start_container
2020-06-17T15:54:55.5747247Z   File "compose/container.py", line 241, in start
2020-06-17T15:54:55.5748345Z   File "site-packages/docker/utils/decorators.py", line 19, in wrapped
2020-06-17T15:54:55.5749402Z   File "site-packages/docker/api/container.py", line 1094, in start
2020-06-17T15:54:55.5750452Z   File "site-packages/docker/utils/decorators.py", line 46, in inner
2020-06-17T15:54:55.5751327Z   File "site-packages/docker/api/client.py", line 226, in _post
2020-06-17T15:54:55.5752494Z   File "site-packages/requests/sessions.py", line 581, in post
2020-06-17T15:54:55.5753299Z   File "site-packages/requests/sessions.py", line 533, in request
2020-06-17T15:54:55.5754121Z   File "site-packages/requests/sessions.py", line 646, in send
2020-06-17T15:54:55.5754935Z   File "site-packages/requests/adapters.py", line 449, in send
2020-06-17T15:54:55.5755794Z   File "site-packages/urllib3/connectionpool.py", line 677, in urlopen
2020-06-17T15:54:55.5756644Z   File "site-packages/urllib3/connectionpool.py", line 392, in _make_request
2020-06-17T15:54:55.5757579Z   File "http/client.py", line 1252, in request
2020-06-17T15:54:55.5758209Z   File "http/client.py", line 1298, in _send_request
2020-06-17T15:54:55.5758811Z   File "http/client.py", line 1247, in endheaders
2020-06-17T15:54:55.5759378Z   File "http/client.py", line 1026, in _send_output
2020-06-17T15:54:55.5759958Z   File "http/client.py", line 966, in send
2020-06-17T15:54:55.5760795Z   File "site-packages/docker/transport/sshconn.py", line 32, in connect
2020-06-17T15:54:55.5761863Z   File "site-packages/paramiko/transport.py", line 879, in open_session
2020-06-17T15:54:55.5762775Z   File "site-packages/paramiko/transport.py", line 1017, in open_channel
2020-06-17T15:54:55.5763694Z paramiko.ssh_exception.ChannelException: ChannelException(2, 'Connect failed')
2020-06-17T15:54:56.4111034Z + exit 1
2020-06-17T15:54:56.4117740Z ##[error]Process completed with exit code 1.
2020-06-17T15:54:56.4187762Z Post job cleanup.
2020-06-17T15:54:56.5088657Z [command]/usr/bin/git version
2020-06-17T15:54:56.5159219Z git version 2.27.0
2020-06-17T15:54:56.5195091Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand
2020-06-17T15:54:56.5226595Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :
2020-06-17T15:54:56.5469079Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader
2020-06-17T15:54:56.5506862Z http.https://github.com/.extraheader
2020-06-17T15:54:56.5507766Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader
2020-06-17T15:54:56.5544637Z [command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :
2020-06-17T15:54:56.5871180Z Cleaning up orphan processes

Additional information

GitHub Actions -> debian

@hazcod

hazcod commented Jun 18, 2020

Verbose output attached.
logs_93.zip

@hazcod

hazcod commented Jun 18, 2020

Possibly related to docker/docker-py#2289

@hazcod

hazcod commented Jun 18, 2020

Temporary workaround:

        ssh -i .ssh/id_rsa -nNT -L "$(pwd)"/docker.sock:/var/run/docker.sock root@$SSH_HOST &
        export DOCKER_HOST="unix://$(pwd)/docker.sock"

Since this works, the issue appears to be specific to the Python SSH library (paramiko, per the traceback)?
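
One rough edge of the workaround above is that `ssh ... &` returns before the forwarded socket exists, so a docker command run straight afterwards can race the tunnel. A minimal sketch of a wait helper follows; `wait_for_socket` and its timeout argument are hypothetical names for illustration, not part of Docker or OpenSSH.

```shell
#!/bin/sh
# wait_for_socket PATH [TIMEOUT_SECONDS]
# Polls once per second until PATH exists and is a Unix socket.
# Returns 0 when the socket appears, 1 when the timeout expires.
wait_for_socket() {
  sock=$1
  remaining=${2:-10}
  while [ "$remaining" -gt 0 ]; do
    [ -S "$sock" ] && return 0
    sleep 1
    remaining=$((remaining - 1))
  done
  [ -S "$sock" ]
}

# Example use (commented out so that sourcing this file has no side effects):
#   ssh -i .ssh/id_rsa -nNT -L "$(pwd)"/docker.sock:/var/run/docker.sock "root@$SSH_HOST" &
#   tunnel_pid=$!
#   trap 'kill "$tunnel_pid" 2>/dev/null; rm -f "$(pwd)/docker.sock"' EXIT
#   wait_for_socket "$(pwd)/docker.sock" 15 || exit 1
#   export DOCKER_HOST="unix://$(pwd)/docker.sock"
```

The trap in the usage comment is there so that the background ssh process and the socket file do not outlive the job.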

@espoirMur

I had the same issue, and Google brought me here.

@guidorice

guidorice commented Jun 26, 2020

I also ran into lots of SSH channel exceptions when using docker-compose with an ssh:// host. The Unix domain socket workaround by @hazcod was working great for me, but then I ran into a problem with one of my docker-compose volume configs:

invalid mount config for type "bind": bind source path does not exist: (local path...)

Curious whether anyone else has run into that, and if so, what did you do?

@Clindbergh

        ssh -i .ssh/id_rsa -nNT -L "$(pwd)"/docker.sock:/var/run/docker.sock root@$SSH_HOST &
        export DOCKER_HOST="unix://$(pwd)/docker.sock"

If you tried this in Git Bash on Windows (where it doesn't seem to help) and are now getting "protocol not available" when running a docker command, run unset DOCKER_HOST and switch contexts.

Using @espoirMur's solution worked for me.

@MartinJPaterson

Changing the MaxSessions parameter in /etc/ssh/sshd_config to 30 made this work for me.
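
For context, OpenSSH's MaxSessions defaults to 10 sessions per network connection, and the traceback above shows a new session being opened on the SSH transport for each request, which may be why raising the limit helps. Before changing anything, the effective value can be read from `sshd -T`. A small sketch, where `effective_max_sessions` is a hypothetical helper name:

```shell
#!/bin/sh
# Reads `sshd -T` output ("keyword value" lines) on stdin
# and prints the effective MaxSessions value, if present.
effective_max_sessions() {
  awk 'tolower($1) == "maxsessions" { print $2; exit }'
}

# Example use on the Docker host (needs root; commented out to avoid side effects):
#   sudo sshd -T | effective_max_sessions
```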

@pyarun

pyarun commented Dec 1, 2020

Changing MaxSessions to 30 or 50 does not work for me. I am on a Mac!

@viceice

viceice commented Dec 11, 2020

Running this on the docker host solved it (ubuntu focal) 🎉

echo 'MaxSessions 50' | sudo tee /etc/ssh/sshd_config.d/docker.conf
sudo systemctl reload sshd.service
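
A slightly more defensive variant of the commands above validates the configuration before reloading, so that a typo in the drop-in file is caught first. This is only a sketch: `write_max_sessions` is a hypothetical helper, and it assumes the main sshd_config includes /etc/ssh/sshd_config.d/*.conf, as the commands above imply for Ubuntu focal.

```shell
#!/bin/sh
# write_max_sessions VALUE FILE
# Writes a one-line sshd drop-in that sets MaxSessions to VALUE.
# Rejects empty, non-numeric, or zero-leading values.
write_max_sessions() {
  case $1 in
    ''|*[!0-9]*|0*) echo "MaxSessions must be a positive integer" >&2; return 1 ;;
  esac
  printf 'MaxSessions %s\n' "$1" > "$2"
}

# Example use on the Docker host (commented out to avoid side effects):
#   write_max_sessions 50 /tmp/docker.conf
#   sudo install -m 0644 /tmp/docker.conf /etc/ssh/sshd_config.d/docker.conf
#   sudo sshd -t && sudo systemctl reload sshd.service   # reload only if the config parses
```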

@sigi

sigi commented Apr 1, 2021

I am using a variation on what @hazcod did (#7542 (comment)):

ssh -o 'StrictHostKeyChecking no' -fNT -L /root/docker.sock:/var/run/docker.sock $DEPLOY_TARGET
(and something to the effect of export DOCKER_HOST="unix:///root/docker.sock", of course).

Context: We are in a CI environment that wants to access the Docker daemon on the machine where the deliverable is deployed.

Explanation: this forwards the Unix Domain Socket of the Docker daemon on the target host into your CI environment; if your deploy user is restricted, you will have to make sure that port forwarding is allowed for that user.

Keys are already configured via ssh-agent in my script, and the -f option is a little cleaner than what @hazcod used (-n + &) to daemonize (background) the SSH process. Disabling host key checking is not clean, but it works for us right now (use host certificates instead).

In my opinion this is a very good solution, even better than using the native ssh:// method.

Still, this bug is a real show stopper and should be fixed in Docker Compose.

Configuring MaxSessions on the SSH server to work around this is ugly if you ask me. I have also tried using a ControlMaster, but this was unreliable (it worked, and then it did not).
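
One way to make the forwarded-socket approach above easier to tear down in CI is to give the backgrounded SSH process a control socket, so that it can be stopped explicitly at the end of the job. The sketch below relies on OpenSSH's -M, -S and -O exit options and on the ExitOnForwardFailure and StreamLocalBindUnlink settings; the function names and socket paths are hypothetical, and the control socket here only manages the tunnel's lifetime rather than multiplexing Docker traffic.

```shell
#!/bin/sh
# tunnel_up TARGET LOCAL_SOCK CONTROL_SOCK
# Forwards the remote Docker socket to LOCAL_SOCK and backgrounds ssh (-f).
# -M/-S create a control socket that tunnel_down uses to stop the process.
tunnel_up() {
  ssh -fNT -M -S "$3" \
    -o ExitOnForwardFailure=yes \
    -o StreamLocalBindUnlink=yes \
    -L "$2:/var/run/docker.sock" "$1" || return 1
  export DOCKER_HOST="unix://$2"
}

# tunnel_down TARGET CONTROL_SOCK
# Asks the backgrounded ssh master to exit and clears DOCKER_HOST.
tunnel_down() {
  ssh -S "$2" -O exit "$1"
  unset DOCKER_HOST
}

# Example use (commented out to avoid side effects):
#   tunnel_up "$DEPLOY_TARGET" /root/docker.sock /root/docker-ctl.sock
#   docker-compose up -d
#   tunnel_down "$DEPLOY_TARGET" /root/docker-ctl.sock
```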

@stale

stale bot commented Nov 9, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Nov 9, 2021
@stale

stale bot commented Apr 17, 2022

This issue has been automatically closed because it had no recent activity during the stale period.

@stale stale bot closed this as completed Apr 17, 2022
@viceice

viceice commented Apr 17, 2022

pretty bad to have this closed 😢

@KarlJorgensen

FYI: Have a look at the log on the ssh server.

If it says something like:

Dec 04 13:20:08 hostname sshd[29481]: error: no more sessions

Then increasing the number of permitted SSH sessions is the way to go, and @viceice's comment above is spot on: #7542 (comment)
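
To check for that message quickly, the sshd log can be filtered for the exact error string. A small sketch; `count_session_rejections` is a hypothetical helper, and the journal unit name in the usage comment is an assumption that varies by distribution.

```shell
#!/bin/sh
# Reads sshd log lines on stdin and prints how many of them
# report that the per-connection session limit was exhausted.
count_session_rejections() {
  # grep -c exits non-zero when the count is 0, so mask the status.
  grep -c 'error: no more sessions' || true
}

# Example use on the SSH server (commented out; the unit may be named ssh or sshd):
#   sudo journalctl -u ssh --since today | count_session_rejections
```

A non-zero count while docker-compose is running suggests the MaxSessions limit is what produces the ChannelException on the client side.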

@BigBoatCap

BigBoatCap commented Apr 21, 2023

Hi folks,

Among other possible causes, it could be low bandwidth between the deployment server and the image repository (as in our case).
The runner error was:

paramiko.ssh_exception.ChannelException: ChannelException(2, 'Connect failed')
Creating webapp
Secsh channel 17 open FAILED: open failed: Connect failed
ERROR: for ****  ChannelException(2, 'Connect failed')
Secsh channel 19 open FAILED: open failed: Connect failed

ERROR: for webapp  ChannelException(2, 'Connect failed')
Traceback (most recent call last):
  File "/usr/bin/docker-compose", line 11, in <module>
    sys.exit(main())
  File "/usr/lib/python2.7/site-packages/compose/cli/main.py", line 72, in main
    command()
  File "/usr/lib/python2.7/site-packages/compose/cli/main.py", line 128, in perform_command
    handler(command, command_options)
  File "/usr/lib/python2.7/site-packages/compose/cli/main.py", line 1077, in up
    to_attach = up(False)
  File "/usr/lib/python2.7/site-packages/compose/cli/main.py", line 1073, in up
    cli=native_builder,
  File "/usr/lib/python2.7/site-packages/compose/project.py", line 576, in up
    get_deps,
  File "/usr/lib/python2.7/site-packages/compose/parallel.py", line 112, in parallel_execute
    raise error_to_reraise
paramiko.ssh_exception.ChannelException: ChannelException(2, 'Connect failed')
Cleaning up project directory and file based variables
00:01
ERROR: Job failed: command terminated with exit code 1

This was confirmed with iperf:

@app02:~# iperf -c <registry> -p 10250 -e -i 1
------------------------------------------------------------
Client connecting to <registry>, TCP port 10250 with pid 838180
Write buffer size:  128 KByte
TCP window size:  196 KByte (default)
------------------------------------------------------------
[  3] local <local_IP> port 57422 connected with <registry> port 10250 (ct=1.73 ms)
[ ID] Interval            Transfer    Bandwidth       Write/Err  Rtry     Cwnd/RTT        NetPwr
[  3] 0.0000-1.0000 sec  1.59 MBytes  13.3 Mbits/sec  13/0          0       32K/3846 us  432.97
[  3] 1.0000-2.0000 sec   768 KBytes  6.29 Mbits/sec  6/0         45        1K/31322 us  25.11
[  3] 2.0000-3.0000 sec   759 KBytes  6.22 Mbits/sec  6/0        114       57K/5482 us  141.81
[  3] 3.0000-4.0000 sec   700 KBytes  5.73 Mbits/sec  6/0         65       69K/28096 us  25.51
[  3] 4.0000-5.0000 sec   700 KBytes  5.73 Mbits/sec  6/0         67       70K/19762 us  36.27
[  3] 5.0000-6.0000 sec   512 KBytes  4.19 Mbits/sec  4/0         69       72K/25312 us  20.71
[  3] 6.0000-7.0000 sec   824 KBytes  6.75 Mbits/sec  7/0          0       91K/39903 us  21.15
[  3] 7.0000-8.0000 sec   891 KBytes  7.30 Mbits/sec  8/0         70        1K/46515 us  19.61
[  3] 8.0000-9.0000 sec   256 KBytes  2.10 Mbits/sec  2/0        142       73K/14102 us  18.59
[  3] 9.0000-10.0000 sec  1017 KBytes  8.33 Mbits/sec  8/0          0       89K/38017 us  27.38
[  3] 0.0000-10.2094 sec  7.86 MBytes  6.46 Mbits/sec  66/0        572       -1K/25235 us  32.01

@atazangene

Running this on the docker host solved it (ubuntu focal) 🎉

echo 'MaxSessions 50' | sudo tee /etc/ssh/sshd_config.d/docker.conf
sudo systemctl reload sshd.service

You saved my life!
