SSH tunnel timeout on Docker #6114

Closed
Tracked by #14929
lhriley opened this issue Oct 6, 2017 · 31 comments
Labels
Operation/Docker Priority:P2 Average run of the mill bug Type:Bug Product defects

Comments

@lhriley

lhriley commented Oct 6, 2017

Metabase fails to establish an SSH tunnel properly, and times out when attempting to connect to the remote database.

  • Your browser and the version: Chrome Version 61.0.3163.100 (Official Build) (64-bit)
  • Your operating system: Linux Mint 18.2
  • Your databases: MySQL, Postgres
  • Metabase version: 0.26.1
  • Metabase hosting environment: Docker
  • Metabase internal database: H2 (default), MySQL

Steps to reproduce:

  1. As an admin, navigate to the Admin Panel in Metabase.
  2. Create a new database connection.
  3. Select database type of your choice, fill out the database connection information.
  4. Enable Use an SSH-tunnel for database connections.
  5. Fill out the SSH tunnel connection information.
  6. Save.
  7. Wait for timeout error to appear.

Sample Log:

Oct 06 15:55:03 INFO metabase.util.ssh :: creating ssh tunnel metabase@redacted.host.address:22 -L 34517:redacted.host.address:3306
Oct 06 15:55:07 ERROR metabase.driver :: Failed to connect to database: Timed out after 5000 milliseconds.

Notes:

I am able to exec into the running docker container and establish an SSH tunnel using the example generated by metabase and displayed in the logs, however I am required to remove the port from the SSH target.

Broken Example:

# ssh metabase@redacted.host.address:22 -L 34517:redacted.host.address:3306
ssh: Could not resolve hostname redacted.host.address:22: Name does not resolve

Working Example:

# ssh metabase@redacted.host.address -L 34517:redacted.host.address:3306
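
The logged line is not a literal OpenSSH command: the `ssh` client takes the port through `-p`, so `user@host:22` is read as one hostname. A minimal sketch of a helper that rewrites the logged form into something that can be pasted into a shell. `tunnel_log_to_ssh` is a hypothetical name, and this only describes the OpenSSH client, not how Metabase opens the tunnel internally.

```shell
# Hypothetical helper: rewrite "user@host:port -L ..." (the form Metabase logs)
# into an OpenSSH command line, where the port is passed with -p.
# It only prints the command; nothing is executed.
tunnel_log_to_ssh() {
  target=$1
  shift
  case $target in
    *:*) userhost=${target%:*}; port=${target##*:} ;;  # split at the last colon
    *)   userhost=$target;      port=22 ;;             # no port given: assume 22
  esac
  printf 'ssh -p %s %s %s\n' "$port" "$userhost" "$*"
}

tunnel_log_to_ssh metabase@redacted.host.address:22 -L 34517:redacted.host.address:3306
```

The sketch splits at the last colon, so it would need more care for IPv6 literals.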

I attempted to look through the code to find out where these connection strings are built, but I wasn't able to grok much from it.

@zejji

zejji commented Oct 6, 2017

I am having the same issue.

I am attempting to connect to a database (localhost port 3306) on a remote server accessed via SSH on port 222. It is definitely not a firewall issue as I have no issues connecting to the database manually through the Docker container as follows:

  • docker exec -it metabase bash
  • apk update
  • apk add openssh
  • ssh username@hostname -p 222
  • mysql -u mysqlusername -p

The metabase error messages I receive are as follows:

MM-DD HH:MM:SS INFO util.ssh :: creating ssh tunnel username@hostname:222 -L 37937:localhost:3306
MM-DD HH:MM:SS ERROR metabase.driver :: Failed to connect to database: Timed out after 5000 milliseconds.
MM-DD HH:MM:SS INFO util.ssh :: creating ssh tunnel username@hostname:222 -L 42391:localhost:3306
MM-DD HH:MM:SS ERROR metabase.driver :: Failed to connect to database: Timed out after 5000 milliseconds.

The metabase version I am using is version v0.26.1 built on 2017-09-26.

@zejji

zejji commented Nov 3, 2017

Any thoughts from anyone who is able to understand the Clojure source code?

This is a pretty major issue. It basically means that it is currently impossible to host Metabase on a different server from the database server (at least when using Docker), so it would be really helpful to have a fix.

@salsakran salsakran added Operation/ Type:Bug Product defects Priority:P2 Average run of the mill bug labels Dec 4, 2017
@pfeiffer
Contributor

Also seeing this

@wihodges

wihodges commented Apr 5, 2018

Not sure if this helps, but I have a similar issue with an SSH tunnel for a MySQL database. On the MySQL host machine I can see that an SSH connection is accepted from the Metabase Docker IP address. But as stated above, the connection times out in Metabase.

@evangreally

evangreally commented May 9, 2018

Hi, I am getting the same issue with MySQL. Is there any workaround currently?

@pfeiffer
Contributor

pfeiffer commented May 9, 2018

A workaround is to use Docker and set up an SSH tunnel in your Dockerfile that tunnels traffic to a remote host on a local port, and then connect to it in Metabase without Metabase having knowledge of the SSH tunnelling.
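
One way to open such a tunnel by hand, as a sketch: run the OpenSSH client with `-N` (no remote command) and `-L` (local forward), then point Metabase at the forwarded port as if it were an ordinary database. The helper name, port 5433 and `user@ssh.example.com` are placeholders, not values from this thread. The function only prints the command so it can be inspected first.

```shell
# Hypothetical helper: print an ssh command that forwards a local port to a
# database reachable from the SSH server. Nothing is executed here.
# Arguments: local_port db_host db_port user@ssh_host [ssh_port]
manual_tunnel_cmd() {
  local_port=$1 db_host=$2 db_port=$3 ssh_target=$4 ssh_port=${5:-22}
  # -N: do not run a remote command, just forward.
  # ExitOnForwardFailure: exit instead of continuing if the port cannot be bound.
  # ServerAliveInterval: send keepalives so an idle tunnel is less likely to drop.
  printf 'ssh -N -p %s -o ExitOnForwardFailure=yes -o ServerAliveInterval=60 -L %s:%s:%s %s\n' \
    "$ssh_port" "$local_port" "$db_host" "$db_port" "$ssh_target"
}

manual_tunnel_cmd 5433 localhost 5432 user@ssh.example.com 222
```

In Metabase the database host and port would then be wherever the tunnel listens (port 5433 in this sketch), with the built-in SSH tunnel option left off.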

@ghost

ghost commented Jul 9, 2018

@pfeiffer Any help on how to do this?

@dont-panic-42

@pfeiffer, or anyone else, can you provide details?

I set up an SSH tunnel on my host machine forwarding local port 5433 to my remote Postgres installation, and from the host command line I can successfully connect:

psql -h 127.0.0.1 -p 5433 

That connects me to the remote Postgres server, just as I want.

No matter what I try though, I cannot get the Metabase Docker container to connect through that tunnel. On the Metabase "Add Database" page I've tried using the container IP, the host IP, localhost, 127.0.0.1. All fail with:

Failed to connect to database: org.postgresql.util.PSQLException: Connection to IP-redacted:5433 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.

I tried recreating the Docker container and exposing the port I'm forwarding (5433):

docker create -p 3000:3000 -p 5433:5433 ... 

But that fails too, though with a new error:

org.postgresql.util.PSQLException: The connection attempt failed.

Anyone know how to get this working?

@ghost

ghost commented Jul 20, 2018

Hi dont-panic-42

I did get it working eventually.

The port assignment (-p 5433:5433) only works in one direction: from host to container.
There is no analogous way to assign ports in the other direction, unfortunately.

I believe there are many ways to solve this problem. I've tried two so far, successfully:

  1. Use host networking for the Docker container.
    If you use the flag --network="host", you connect the container to the host machine's network. This means you can leave out the -p flags altogether.
    Very easy if it fits your situation.
    Added benefit: networking is faster (recommended for running web servers in Docker containers, for example).

  2. Use the --add-host flag, with something like --add-host="docker-host:172.20.10.2", entering the host machine's internal IP (on Mac you can find this in Network Utility).
    I assume this will easily break if the host IP changes. But it did the trick for a test run for me.

  3. Use network plugins or Docker containers that do the routing for you (no experience with this one).
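
Options 1 and 2 could look roughly like this on the command line. This is a sketch: the container name and the `docker-host` alias are illustrative, and the functions print the `docker` command instead of running it. Host networking behaves differently on Docker Desktop than on a Linux host, so check the Docker documentation for your platform.

```shell
# Sketch only: both functions print a docker command instead of running it.

# Option 1: share the host's network stack, so no -p flags are needed.
run_metabase_host_network() {
  echo "docker run -d --network host --name metabase metabase/metabase"
}

# Option 2: give the host machine a name inside the container.
# $1 is the host's internal IP (172.20.10.2 in the comment above).
run_metabase_add_host() {
  echo "docker run -d -p 3000:3000 --add-host docker-host:$1 --name metabase metabase/metabase"
}

run_metabase_host_network
run_metabase_add_host 172.20.10.2
```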

@dont-panic-42

Thank you very much for taking the time to reply @EZS-JD! I did, finally, get it working with host networking.

I'd prefer to use your option 2 but unfortunately it doesn't work for me. I tried adding a hostname for the host IP ("docker-host", for example), which I can then see in /etc/hosts in the container, and I can ping it from inside the container, etc. But using "docker-host" in the host field when adding a Metabase DB fails with the same error that I saw for all my previous IP address attempts: Connection to docker-host:5433 refused.

In case you had maybe meant the container's IP (I am clutching at straws!), I tried that too, but it makes no difference.

I am curious if you know why using a hostname, rather than IP, would make a difference? I actually noticed that the Metabase documentation says the same thing, so I'm not doubting it, just wondering why:

Keep in mind that Metabase will be connecting from within your docker container, so make sure that either you’re using a fully qualified hostname or that you’ve set a proper entry in your container’s /etc/hosts file.

Very frustrating to have spent so long on this and still not know why it is failing or even what the problem is.

In any case using --network host works!! Thanks again for your time, you've made my day :-)
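
One plausible explanation for the "refused" errors, offered as an assumption rather than something confirmed in this thread: OpenSSH binds a `-L` forward to the loopback interface unless a bind address is given (or `GatewayPorts` is enabled), so a tunnel started on the host listens on 127.0.0.1 only and is unreachable from a container on the default bridge network. With host networking the container shares the host's loopback, which would fit what was observed. A sketch of building the `-L` argument with and without a bind address; `forward_spec` is a hypothetical helper.

```shell
# Hypothetical helper: build the argument for ssh -L.
# Arguments: local_port remote_host remote_port [bind_address]
forward_spec() {
  if [ -n "${4:-}" ]; then
    # Explicit bind address: listen on that interface instead of loopback.
    printf '%s:%s:%s:%s\n' "$4" "$1" "$2" "$3"
  else
    # No bind address: OpenSSH listens on loopback only by default.
    printf '%s:%s:%s\n' "$1" "$2" "$3"
  fi
}

forward_spec 5433 localhost 5432          # prints 5433:localhost:5432
forward_spec 5433 localhost 5432 0.0.0.0  # prints 0.0.0.0:5433:localhost:5432
```

Binding to 0.0.0.0 exposes the forwarded port to everything that can reach the host, so a narrower bind address or a firewall rule is usually worth considering.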

@ghost

ghost commented Aug 3, 2018

@jornh
Contributor

jornh commented Aug 9, 2018

Is this essentially the same as (a duplicate of) #6851?

I notice you @lhriley as OP (and several other posters) on this specify Docker as your environment. So that may or may not complicate the matter even more.

Note #6851 got fixed yesterday and as far as I can see it went straight into v0.30 - just announced: https://metabase.com/blog/

So, everyone still interested in this issue, please test and see if v0.30 fixes this for you guys as well and report back here please. 😊

@lhriley
Author

lhriley commented Aug 9, 2018

@jornh unfortunately I do not currently have the resources available to verify this fix. Can someone else in this thread validate that v0.30 is enough to get this functional?

Thanks!

As an aside, for whatever reason I can't actually load the issue #6851 . Any other github page seems to work, but not that one. 🤷‍♂️

@ghost

ghost commented Aug 10, 2018

I might get around to testing this today. So I just pull a new version of the metabase docker image? Is it up to date on Docker Hub?

@tlrobinson
Contributor

@EZS-JD Yes, metabase/metabase:latest on Dockerhub is v0.30.

Please read this before upgrading https://metabase.com/blog/before-upgrading-to-0.30/index.html

@ghost

ghost commented Aug 10, 2018

I'm trying a fresh install just to test out the remote db connection via ssh-tunneling.

I'm not getting it to work yet (Server error encountered), but I may be doing something wrong.

Setup:

  • metabase running on another machine inside my office network in a docker container.
  • trying to access a mysql DB on a remote server (outside office network) via SSH-tunnel.

Can someone help me with:

  • how to debug? Which logs to read?
  • What needs to go into the Field "Host" (second field)?
  • What needs to go into the field "SSH Tunnel host" (first field after "Use an SSH-tunnel for database connections")?

@tlrobinson
Contributor

"SSH tunnel host" would be the remote server where the SSH server is running
"Host" would be the hostname/IP address of the database itself, from the point of view of the SSH server (i.e. if they're on the same server it could just be localhost)

@ghost

ghost commented Aug 10, 2018

not working for me. Which logs should I be consulting?

@jornh
Contributor

jornh commented Aug 10, 2018

@EZS-JD starting point for logs is at least the Metabase log under the top-right gear icon (or Docker console). I fear it might not tell a lot though. I saw @tlrobinson recently posted in the Metabase gitter chat (linked from GitHub Readme) how you could bump log level for Metabase with Docker. Then bump it for the SSH lib metabase uses.

You may also start off and try comparing it to just a freshly spun up .jar version locally - so you know for a fact that it’s just the Metabase in Docker and Docker networking setup part you are fighting.

@ghost

ghost commented Aug 10, 2018

Found the ssh-tunnel command in the metabase logs. Here are some examples from my attempts:

(I'm hiding the real username and remote host IP address. They are correct)

creating ssh tunnel <remoteusername>@<remoteIPaddress>:22 -L 59410:localhost:3306
creating ssh tunnel <remoteusername>@<remoteIPaddress>:22 -L 58771:localhost:3306
creating ssh tunnel <remoteusername>@<remoteIPaddress>:22 -L 49462:localhost:3306

Any idea what this means? I would expect it to be "-L 3306:localhost:3306"
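
Read as an OpenSSH-style `-L local_port:host:port`, the first number is the port opened on the Metabase side and the last is the database port on the far side. The first number changes on every attempt in the lines above, which suggests a free local port is picked each time instead of reusing 3306. That should be harmless, since only Metabase itself connects to it. A small sketch that splits such a spec into its parts; `describe_forward` is a hypothetical name.

```shell
# Hypothetical helper: explain the three parts of a "-L" forward spec.
describe_forward() {
  spec=$1
  local_port=${spec%%:*}   # everything before the first colon
  rest=${spec#*:}          # host:port that the SSH server connects to
  dest_host=${rest%:*}
  dest_port=${rest##*:}
  printf 'local port %s -> %s port %s (as seen from the SSH server)\n' \
    "$local_port" "$dest_host" "$dest_port"
}

describe_forward 59410:localhost:3306
```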

@ghost

ghost commented Aug 10, 2018

Is there some way I could manually test the ssh connection from within the docker container? (ping to remote server works)

I would expect ssh to be installed on the metabase docker image, but command "ssh" doesn't work and it doesn't exist under /usr/bin. Is metabase using a different ssh implementation?

@jornh
Contributor

jornh commented Aug 10, 2018

Is metabase using a different ssh implementation?

Yes, according to the Clojure project file, it's http://www.jcraft.com/jsch/ - a pure Java SSH implementation (pretty decent, saw they build Boeing 878's with it):
https://github.com/metabase/metabase/blob/v0.30.0/project.clj#L56

Any idea what this means? I would expect it to be "-L 3306:localhost:3306"

I have never really worked with SSH tunnels, but again, how does it compare to running metabase.jar directly on your Docker host machine?

Also, depending on your Java-fu (or StackOverflow-fu ;) you might be able to find a bit of JSch example CLI code, docker cp <code>.class CONTAINER:DESTPATH it into the container, and rule out Metabase/Clojure altogether.

@ghost

ghost commented Aug 10, 2018

Tried running metabase.jar on Mac. ssh-tunnel works flawlessly.

So apparently it's a docker thing :-( Let me know if you have any ideas what might be the problem or how to fix it (apart from "--network=host" which I'd like to avoid).

@jeff303
Contributor

jeff303 commented Feb 4, 2021

Is anyone able to reproduce this problem still? I just tried, and was unable to (database sync and querying worked fine). Here is my setup:

  • Docker version of Metabase run via: docker run -d -p 3000:3000 --name metabase metabase/metabase (happens to be version 0.37.8)
  • Postgres was also run via Docker, mapping out to port 5432 on the host machine. In other words: docker run --rm -d -p 5432:5432 --network <someDockerNetwork> -e POSTGRES_USER=some_user -e POSTGRES_DB=metabase -e POSTGRES_PASSWORD=some_password --name postgres-12 postgres:12
  • Created a table and inserted some data in the metabase DB created in Postgres, above
  • Added the database via Admin in Metabase, with ssh tunnel configured*
  • Explored the table and was successfully able to query the rows

In this particular case, the ssh server was actually the one bundled with my OS X operating system, so with password authentication, I'm just providing my local user credentials. And for the SSH tunnel host, I'm providing the local network address of my host machine (the mac) since we're coming from inside Docker (I presume everyone else here has some "real" SSH server they're working with). The Host (i.e. database host) is given as localhost since, from the point of view of the ssh server, it is the same host (i.e. my host machine, the Mac, where Postgres is running via Docker, with its 5432 port mapped to the host machine port 5432).

Happy to help troubleshoot further if someone gets back to me, now 2+ years later.

@flamber
Contributor

flamber commented Feb 5, 2021

@jeff303 Try something other than Postgres, since it might not close idle connections (8679), and then it depends on your SSH server config - some are set up with keepalive. Been seeing this issue many times in the forum and hosted.

@jeff303
Contributor

jeff303 commented Feb 5, 2021

Well, #14563 should fix most of that in a generic way, if I'm understanding correctly (the ssh tunnel session/connection will be marked invalid/closed, and reopened, regardless of DB type, Docker or bare metal, etc.). What I'm trying to narrow down on now is whether there is something particular to the Docker container deployment that needs to be looked at. As far as I understand things, the fact that Metabase is in a Docker container shouldn't preclude ssh port forwarding from working, unless the server is configured to only allow this from certain hosts (and it can't identify the Docker container as such a blessed host), or if somehow the hosts file within the container doesn't have the ssh landing host in its list, etc.

I know there's a lot that depends on configuration on the SSH server side, which we obviously have no control over. So I think the best we can do is establish and try to maintain the connection as robustly as possible.

@flamber
Contributor

flamber commented Feb 5, 2021

SSH will only close an idle tunnel, so if your database connections never go idle (following our c3p0 connection handling), then it won't time out. Yes, reconnect will likely fix this partially, but it shouldn't close the tunnel, since it usually takes a while to open a tunnel, which would mean the user experience when viewing a dashboard in the morning would be slower than one minute later.


The workaround is to set up the tunnel manually instead of using the built-in tunnel in Metabase:
https://www.metabase.com/docs/latest/administration-guide/01-managing-databases.html#what-if-the-built-in-ssh-tunnels-dont-fit-my-needs

@scream314

The built-in tunnel (with PostgreSQL) was regularly breaking for us (even on 0.38.1) and it was not reconnecting. After starting Metabase we started to get "cannot connect to localhost:" errors in 1-2 hours. After restarting Metabase it was working for a few hours, then stopped working again.

We just ended up using dedicated SSH tunnel containers as sidecars, no connection issues since.
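
The thread does not say how these sidecars were built. One common pattern is a small container whose only job is to run an ssh client in a loop so the tunnel comes back after a drop. A minimal sketch of such a loop; `keep_running` is a hypothetical helper, and the ssh command in the usage comment uses placeholder values.

```shell
# Hypothetical helper: rerun a command whenever it fails, up to max_attempts.
# Usage: keep_running max_attempts delay_seconds command [args...]
keep_running() {
  max_attempts=$1 delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    "$@" && return 0    # the command exited cleanly: stop retrying
    echo "attempt $attempt exited with an error" >&2
    attempt=$((attempt + 1))
    # Wait before the next try, but not after the final one.
    [ "$attempt" -le "$max_attempts" ] && sleep "$delay"
  done
  return 1
}

# Example (placeholder host and ports, not run here):
# keep_running 100 5 ssh -N -o ServerAliveInterval=60 -L 5433:localhost:5432 user@ssh.example.com
```

Metabase would then connect to the sidecar's forwarded port as a plain database connection, with no knowledge of the tunnel.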

@skrrriiieeet

Having this same issue. Just set up a fresh Metabase v0.38.2 in Docker. When connecting to our Postgres through SSH, it keeps the connection for a couple of minutes, then drops it.

If I go into the Database menu in Metabase and "save" it will connect again for a few minutes then drop with the "cannot connect to localhost." error message.

@jeff303
Contributor

jeff303 commented Mar 25, 2021

@scream314, @jakobskytte, both of these issues should be fixed by generic ssh tunnel reconnection, from #14563, coming in the 39 release.

@flamber
Contributor

flamber commented Nov 7, 2021

Closing, since connection timeout has been extended in 0.41.1 #18354 and SSH tunnel reconnect was fixed in 0.39.0 #10081

If you are still experiencing this problem, then post "Diagnostic Info" from Admin > Troubleshooting, and the full stacktrace from Admin > Troubleshooting > Logs. And any other details that would be helpful to reproduce the problem.

@flamber flamber closed this as completed Nov 7, 2021