dremio+flight connection issue #39

Open
snth opened this issue Apr 4, 2023 · 4 comments


snth commented Apr 4, 2023

Hi,

I'm struggling to connect to dremio running in the community edition dremio/dremio-oss docker container.

I've seen many similar issues like #11 and #23 but none of those solutions seem to be working for me.

I first encountered this when trying to connect from Superset, but I can reproduce it with a simple Python script.

$ python --version
Python 3.8.10
$ pip install sqlalchemy_dremio
Requirement already satisfied: sqlalchemy_dremio in /home/tobias/.local/lib/python3.8/site-packages (3.0.3)
Requirement already satisfied: sqlalchemy>=1.3.24 in /home/tobias/.local/lib/python3.8/site-packages (from sqlalchemy_dremio) (2.0.8)
Requirement already satisfied: pyarrow>=5.0.0 in /home/tobias/.local/lib/python3.8/site-packages (from sqlalchemy_dremio) (11.0.0)
Requirement already satisfied: typing-extensions>=4.2.0 in /home/tobias/.local/lib/python3.8/site-packages (from sqlalchemy>=1.3.24->sqlalchemy_dremio) (4.5.0)
Requirement already satisfied: greenlet!=0.4.17; platform_machine == "aarch64" or (platform_machine == "ppc64le" or (platform_machine == "x86_64" or (platform_machine == "amd64" or (platform_machine == "AMD64" or (platform_machine == "win32" or platform_machine == "WIN32"))))) in /usr/lib/python3/dist-packages (from sqlalchemy>=1.3.24->sqlalchemy_dremio) (0.4.15)
Requirement already satisfied: numpy>=1.16.6 in /home/tobias/.local/lib/python3.8/site-packages (from pyarrow>=5.0.0->sqlalchemy_dremio) (1.24.2)
>>> import sqlalchemy as sa
>>> engine = sa.create_engine('dremio+flight://dremio:dremio123@localhost:31010/dremio?UseEncryption=false&disableCertificateVerification=true', echo=True)
>>> con = engine.connect()
...
 File "pyarrow/_flight.pyx", line 1398, in pyarrow._flight.FlightClient.authenticate_basic_token
  File "pyarrow/_flight.pyx", line 71, in pyarrow._flight.check_flight_status
pyarrow._flight.FlightUnavailableError: Flight returned unavailable error, with message: failed to connect to all addresses; last error: UNAVAILABLE: ipv4:127.0.0.1:31010: Socket closed
>>>

The relevant section of my dremio.conf looks as follows:

services: {
  coordinator.enabled: true,
  coordinator.master.enabled: true,
  executor.enabled: true,
  flight.enabled: true,
  #flight.auth.mode: "legacy.arrow.flight.auth"
  flight.auth.mode: "arrow.flight.auth2"
  #flight.use_session_service: true,
  #flight.ssl.enabled: true,
  #flight.ssl.auto-certificate.enabled: true
}

snth commented Apr 5, 2023

After sleeping on it, I managed to make some progress on this.

It occurred to me that the people reporting this issue sometimes use port 31010 and sometimes 32010. I'm able to connect to my Dremio instance from DBeaver using the JDBC driver on port 31010. The Superset Dremio page mentions that the default port for Arrow Flight is 32010 (this is quite subtle and I missed it at first), but the Dremio DockerHub page only lists port 31010, so 32010 isn't exposed by the documented command. I therefore also exposed port 32010 by changing the docker command to

docker run -p 9047:9047 -p 31010:31010 -p 32010:32010 -p 45678:45678 dremio/dremio-oss

and now I'm able to connect and authenticate from Python:

>>> import sqlalchemy as sa
>>> engine = sa.create_engine('dremio+flight://dremio:dremio123@localhost:32010/dremio?UseEncryption=false&disableCertificateVerification=true', echo=True)
>>> con = engine.connect()
>>>
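
For anyone hitting the same "Socket closed" error, a quick way to see which ports are actually published is a plain TCP check before involving SQLAlchemy at all. This is only a sketch: the helper names `port_is_open` and `check_ports` are made up for illustration, and a successful TCP connect only shows that something is listening, not that it speaks Arrow Flight (31010 accepts connections here but is not the Flight port).

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


def check_ports(host, ports):
    """Map each port to whether a TCP connection could be opened."""
    return {port: port_is_open(host, port) for port in ports}


# Example (ports taken from the docker command above):
# check_ports("localhost", [9047, 31010, 32010, 45678])
```

If 32010 is reported as closed, the container was most likely started without `-p 32010:32010`.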

The connection from Superset is still not working. I'm guessing it might need a different combination of encryption settings. I will experiment more with this and in particular try the suggestion from #23.


snth commented Apr 5, 2023

Using the settings from #23 I now get an SSL certificate error:

dremio.conf:

services: {
  coordinator.enabled: true,
  coordinator.master.enabled: true,
  executor.enabled: true,
  flight.enabled: true,
  flight.use_session_service: true,
  flight.ssl.enabled: true,
  flight.ssl.auto-certificate.enabled: true
}
Python 3.8.10 (default, Mar 13 2023, 10:26:41)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlalchemy as sa
>>> engine = sa.create_engine('dremio+flight://dremio:dremio123@localhost:32010/dremio?UseEncryption=true&disableCertificateVerification=false', echo=True)
<stdin>:1: SADeprecationWarning: The dbapi() classmethod on dialect classes has been renamed to import_dbapi().  Implement an import_dbapi() classmethod directly on class <class 'sqlalchemy_dremio.flight.DremioDialect_flight'> to remove this warning; the old .dbapi() classmethod may be maintained for backwards compatibility.
>>> con = engine.connect()
E0405 09:35:25.276792401    4334 ssl_transport_security.cc:1501] Handshake failed with fatal error SSL_ERROR_SSL: error:0A000086:SSL routines::certificate verify failed.
E0405 09:35:25.286732554    4334 ssl_transport_security.cc:1501] Handshake failed with fatal error SSL_ERROR_SSL: error:0A000086:SSL routines::certificate verify failed.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3251, in connect
    return self._connection_cls(self)
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 145, in __init__
    self._dbapi_connection = engine.raw_connection()
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 3275, in raw_connection
    return self.pool.connect()
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 420, in connect
    return _ConnectionFairy._checkout(self, self._fairy)
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 1271, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 719, in checkout
    rec = pool._do_get()
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 402, in _do_get
    c = self._create_connection()
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 396, in _create_connection
    return _ConnectionRecord(self)
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 681, in __init__
    self.__connect()
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 906, in __connect
    pool.logger.debug("Error on connect(): %s", e)
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 147, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 901, in __connect
    self.dbapi_connection = connection = pool._invoke_creator(self)
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/engine/create.py", line 636, in connect
    return dialect.connect(*cargs, **cparams)
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy_dremio/flight.py", line 196, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy_dremio/db.py", line 20, in connect
    return Connection(c)
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy_dremio/db.py", line 86, in __init__
    bearer_token = client.authenticate_basic_token(properties['UID'], properties['PWD'])
  File "pyarrow/_flight.pyx", line 1398, in pyarrow._flight.FlightClient.authenticate_basic_token
  File "pyarrow/_flight.pyx", line 71, in pyarrow._flight.check_flight_status
pyarrow._flight.FlightUnavailableError: Flight returned unavailable error, with message: failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:32010: Ssl handshake failed: SSL_ERROR_SSL: error:0A000086:SSL routines::certificate verify failed
>>> engine = sa.create_engine('dremio+flight://dremio:dremio123@localhost:32010/dremio?UseEncryption=true&disableCertificateVerification=false', echo=True)
E0405 09:35:35.108920946    4346 ssl_transport_security.cc:1501] Handshake failed with fatal error SSL_ERROR_SSL: error:0A000086:SSL routines::certificate verify failed.
E0405 09:35:45.108883819    4346 ssl_transport_security.cc:1501] Handshake failed with fatal error SSL_ERROR_SSL: error:0A000086:SSL routines::certificate verify failed.

I'm serving this off localhost for the moment, so perhaps it's no surprise that certificate verification fails. I did try it with SSL enabled and certificate verification disabled (UseEncryption=true&disableCertificateVerification=true); that worked from Python but still didn't work from Superset. What do I need to do on the Superset side to get this working?
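
To make it easier to try the different encryption combinations without retyping the URI, here is a small sketch that builds the connection string from its parts. The function name `dremio_flight_url` and its parameter names are hypothetical; only the URI shape and the two query options (`UseEncryption`, `disableCertificateVerification`) come from the snippets in this thread.

```python
from urllib.parse import quote, urlencode


def dremio_flight_url(user, password, host="localhost", port=32010,
                      database="dremio", use_encryption=True,
                      disable_cert_verification=True):
    """Build a dremio+flight SQLAlchemy URI like the ones used in this issue."""
    # Booleans are rendered as lowercase "true"/"false", matching the URIs above.
    query = urlencode({
        "UseEncryption": str(use_encryption).lower(),
        "disableCertificateVerification": str(disable_cert_verification).lower(),
    })
    # Percent-encode credentials so characters such as "@" or ":" stay valid.
    return (f"dremio+flight://{quote(user, safe='')}:{quote(password, safe='')}"
            f"@{host}:{port}/{database}?{query}")


print(dremio_flight_url("dremio", "dremio123"))
```

The resulting string can be passed to `sa.create_engine(...)`. The host is a parameter so that an address other than localhost can be tried; whether the same string works in Superset is exactly the open question here.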


snth commented Apr 5, 2023

So my full setup is as follows:

dremio.conf

services: {
  coordinator.enabled: true,
  coordinator.master.enabled: true,
  executor.enabled: true,
  flight.enabled: true,
  flight.auth.mode: "arrow.flight.auth2"
  flight.use_session_service: true,
  flight.ssl.enabled: true
}

Dockerfile

FROM dremio/dremio-oss

COPY dremio.conf /opt/dremio/conf/dremio.conf

docker command:

docker build -t dremio-flight . && docker run --rm -p 9047:9047 -p 31010:31010 -p 32010:32010 -p 45678:45678 dremio-flight

python

Python 3.8.10 (default, Mar 13 2023, 10:26:41)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlalchemy as sa
>>> engine = sa.create_engine('dremio+flight://dremio:dremio123@localhost:32010/dremio?UseEncryption=true&disableCertificateVerification=true', echo=True)
<stdin>:1: SADeprecationWarning: The dbapi() classmethod on dialect classes has been renamed to import_dbapi().  Implement an import_dbapi() classmethod directly on class <class 'sqlalchemy_dremio.flight.DremioDialect_flight'> to remove this warning; the old .dbapi() classmethod may be maintained for backwards compatibility.
>>> con = engine.connect()
>>> con.execute(sa.text('SELECT * FROM "lakefs.staging.main".table1'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1414, in execute
    return meth(
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 486, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1630, in _execute_clauseelement
    compiled_sql, extracted_params, cache_hit = elem._compile_w_cache(
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 631, in _compile_w_cache
    if compiled_cache is not None and dialect._supports_statement_cache:
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 1138, in __get__
    obj.__dict__[self.__name__] = result = self.fget(obj)
  File "/home/tobias/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 424, in _supports_statement_cache
    "warning." % (self.name, self.driver),
AttributeError: 'DremioDialect_flight' object has no attribute 'driver'
>>>

The SELECT * FROM "lakefs.staging.main".table1 query works from within Dremio as well as from DBeaver using the JDBC driver, so to me this points to a problem with the sqlalchemy_dremio driver rather than with the query or the server.

$ pip freeze
...
SQLAlchemy==2.0.8
sqlalchemy-dremio==3.0.3
...
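
Reading the traceback, the AttributeError is raised while SQLAlchemy 2.0 formats a warning about a missing `supports_statement_cache` attribute and looks up `self.driver` on the dialect. A possible stopgap, sketched below, is to set both attributes on the dialect class before creating the engine. This is an untested assumption: `patch_dialect` is a hypothetical helper, the driver name "flight" is a guess, the example uses a stand-in class so that it runs without the package installed, and other SQLAlchemy 2.0 incompatibilities may remain after this one is silenced.

```python
def patch_dialect(dialect_cls, driver="flight"):
    """Add the attributes that SQLAlchemy 2.0 reads in _supports_statement_cache."""
    # 'driver' is used in the warning message that triggered the AttributeError.
    if not hasattr(dialect_cls, "driver"):
        dialect_cls.driver = driver
    # SQLAlchemy checks the class's own __dict__, so set it directly on the class.
    # False opts out of statement caching without triggering the warning.
    if "supports_statement_cache" not in dialect_cls.__dict__:
        dialect_cls.supports_statement_cache = False
    return dialect_cls


class StandInDialect:
    """Hypothetical stand-in for sqlalchemy_dremio.flight.DremioDialect_flight."""
    name = "dremio"


patch_dialect(StandInDialect)
print(StandInDialect.driver, StandInDialect.supports_statement_cache)
```

With the real package, the idea would be to import `DremioDialect_flight` from `sqlalchemy_dremio.flight` (the class named in the deprecation warning above) and pass it to `patch_dialect` before calling `sa.create_engine`.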

@lenoyjacob

Refer to #37 (comment)
