incorrect ports in ssh connection to docker container #654

jspaaks · 2019-07-25T13:31:52Z

I had some trouble with creating an ssh connection to a docker container while preparing materials for a xenon-cli v3 based tutorial, which in turn was based on the earlier xenon-cli v2 tutorial. Here are some details of my setup:

$ cat /etc/os-release 
NAME="Ubuntu"
VERSION="18.04.2 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.2 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
$ docker --version
Docker version 18.09.3, build 774a1f4
$ java -version
openjdk version "11.0.3" 2019-04-16
OpenJDK Runtime Environment (build 11.0.3+7-Ubuntu-1ubuntu218.04.1)
OpenJDK 64-Bit Server VM (build 11.0.3+7-Ubuntu-1ubuntu218.04.1, mixed mode, sharing)
$ echo $JAVA_HOME
/usr/lib/jvm/java-11-openjdk-amd64
$ mkdir -p ~/.local/bin/xenon
$ cd ~/.local/bin/xenon
$ wget https://github.com/xenon-middleware/xenon-cli/releases/download/v3.0.0/xenon-cli-shadow-3.0.0.tar
$ tar -xvf xenon-cli-shadow-3.0.0.tar
$ echo '' >> ~/.bashrc
$ echo '#Add xenon cli to the PATH:' >> ~/.bashrc
$ echo 'PATH=$PATH:~/.local/bin/xenon/xenon-cli-shadow-3.0.0/bin' >> ~/.bashrc
$ source ~/.bashrc

I started the docker slurm container with

$ docker run --detach --publish 10022:22 --hostname slurm17 nlesc/xenon-slurm:17

as per the tutorial text. I tried to connect to it with:

$ ssh -p 10022 xenon@localhost
$ exit

which worked as normal.

But then (a slight deviation from the tutorial text, probably needed because of the xenon 2 -> 3 upgrade; the location now includes the scheme part):

$ xenon scheduler slurm --location ssh://localhost:10022 --username xenon --password javagat queues
ssh adaptor: Connection setup to localhost:10022 failed!

I think I've tracked down the place where things go wrong:

xenon/src/main/java/nl/esciencecenter/xenon/adaptors/shared/ssh/SSHUtil.java

Line 386 in a577b12

session = client.connect(username, host, port).verify(timeout).getSession();

Even though we give it port 10022, somehow inside of the .connect() it uses 22 and then it fails (at least at some point I got an error message from inside .connect() that said something like connection id = xenon@localhost:22).

Anyway, restarting the docker container with

$ docker run --detach --publish 22:22 --hostname slurm17 nlesc/xenon-slurm:17

and then

$ xenon scheduler slurm --location ssh://localhost:22 --username xenon --password javagat queues

gives the expected response:

Available queues: mypartition, otherpartition
Default queue: mypartition

I'd be interested to hear if this is indeed a bug. Could be I'm just doing something wrong.

Sidenote: I expect we are missing a session.close() or something, because it takes 1 minute or so to return focus to the user after the answer is printed in the terminal.


jmaassen · 2019-07-30T10:40:24Z

I cannot reproduce the issue with the connection setup. I do see the delay in returning focus, though.

sverhoeven · 2019-07-30T10:44:36Z

I also can't reproduce the port mismatch, neither inside a VirtualBox VM nor on bare-metal Linux.

jmaassen · 2019-07-30T10:44:55Z

After some debugging, it turns out the delay in exiting the application is caused by the mina layer used by SSHD.

The SSHD implementation can use different communication layers: standard java sockets, MINA, or Netty. To select an implementation you can use different dependencies:

sshd-core
sshd-mina
sshd-netty

We were using sshd-mina in Xenon 3.0.0. Unfortunately, the MINA library creates a thread pool internally which does not seem to shut down immediately when the application tries to exit.

Switching to sshd-core solves the problem.

jmaassen · 2019-07-30T10:50:44Z

For reference sshd-netty seems to have a similar problem. The hanging threads are named differently, but also prevent the JVM from shutting down.

It almost seems like we are supposed to explicitly shut something down, and we forgot to do so?

jmaassen · 2019-07-30T11:16:30Z

Turns out we were missing a SshClient.stop() in SSHConnection.close(). The different ssh sessions created by the client were all shut down properly, but the client itself wasn't. For the core implementation this is not a problem, but both the mina and netty implementations have an internal thread pool per client and therefore need to be properly shut down.
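
To illustrate why an unstopped client can delay JVM exit, here is a minimal, self-contained sketch. It does not use the Apache MINA SSHD API: PooledClient is a hypothetical stand-in for a client that owns an internal thread pool. It assumes the pool is a JDK cached thread pool, whose idle non-daemon workers linger for 60 seconds. That would be consistent with the delay of about a minute reported above, but whether the MINA layer uses such a pool is an assumption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a client with an internal thread pool.
// This is NOT the Apache MINA SSHD API, only an illustration.
class PooledClient implements AutoCloseable {
    // Worker threads are non-daemon by default; idle workers of a
    // cached pool are kept around for 60 seconds before they exit.
    private final ExecutorService pool = Executors.newCachedThreadPool();

    // Run a small task on the pool, the way a session would do I/O.
    String query() throws Exception {
        return pool.submit(() -> "done").get();
    }

    // True once the pool has been shut down and all workers have finished.
    boolean isStopped() {
        return pool.isTerminated();
    }

    // Plays the role of the missing stop() call: without it, the idle
    // worker keeps the JVM alive after main() has returned.
    @Override
    public void close() throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        // try-with-resources calls close(), so the JVM can exit promptly.
        try (PooledClient client = new PooledClient()) {
            System.out.println(client.query());
        }
    }
}
```

Replacing the try-with-resources block with a plain `new PooledClient().query()` should make the program print its result and then wait until the idle worker times out before the JVM exits.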

sverhoeven · 2019-07-30T11:36:31Z

On https://github.com/apache/mina-sshd/blob/master/docs/dependencies.md the sshd-mina and sshd-netty dependencies are explained.

Looks like sshd-mina is the legacy socket implementation. Maybe we should switch to the default or netty implementation?

jspaaks · 2019-08-01T15:08:13Z

Did some more digging into the connection error.
I tried sidestepping my normal .ssh settings:

mv ~/.ssh ~/.ssh-sidelined
mkdir ~/.ssh
chmod 700 ~/.ssh

Then

xenon scheduler slurm --location ssh://localhost:10022 --username xenon --password javagat queues

works as normal.

Next, I copied files from ~/.ssh-sidelined to ~/.ssh to see where things break

- copied a bunch of *.pem files, all good
- copied id_rsa and id_rsa.pub, works
- copied authorized_keys, still good
- even known_hosts with 74 entries, still good
- but then copying config breaks xenon scheduler slurm --location ssh://localhost:10022 --username xenon --password javagat queues
  - Digging further into config, I disabled all lines with #, and it works again
  - I have a Port 22 somewhere in there. I really don't know why it is there or what it does, but by toggling that line on and off with # I was able to make the command fail or pass, respectively. Not sure what to do about it, but at least it works now.
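
For anyone hitting the same symptom, a quick way to see which Port directives are active in an OpenSSH-style config is a small scan like the sketch below. It is only a diagnostic aid: the class and method names are made up for illustration, it handles Host blocks but not Match blocks or Include files, and it says nothing about how Xenon or the underlying SSH library actually resolves the config.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the Port directives in an ssh config text.
class SshConfigPorts {
    // Returns one "host-pattern -> port" entry per active Port directive.
    static List<String> findPortDirectives(String configText) {
        List<String> hits = new ArrayList<>();
        String currentHost = "(global)"; // directives before any Host line
        for (String raw : configText.split("\\R")) {
            String line = raw.trim();
            // Skip blank lines and lines disabled with '#'.
            if (line.isEmpty() || line.startsWith("#")) continue;
            // Keyword and argument are separated by whitespace or '='.
            String[] parts = line.split("[\\s=]+", 2);
            if (parts.length < 2) continue;
            String keyword = parts[0].toLowerCase(); // keywords are case-insensitive
            if (keyword.equals("host")) {
                currentHost = parts[1].trim();
            } else if (keyword.equals("port")) {
                hits.add(currentHost + " -> " + parts[1].trim());
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        Path config = Paths.get(System.getProperty("user.home"), ".ssh", "config");
        if (!Files.exists(config)) {
            System.out.println("No config file at " + config);
            return;
        }
        String text = new String(Files.readAllBytes(config), StandardCharsets.UTF_8);
        for (String hit : findPortDirectives(text)) {
            System.out.println(hit);
        }
    }
}
```

An entry for a pattern that matches localhost (for example `*`) would be the first thing to check when a port passed on the command line appears to be ignored.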

jspaaks assigned jmaassen Jul 25, 2019

jspaaks added the Bug label Jul 25, 2019
