bug(impala): cannot connect to Impala backend after 8.0 update due to attempt to create a temporary database on connect #8466

Closed
1 task done
contang0 opened this issue Feb 27, 2024 · 2 comments · Fixed by #8489
Labels
bug Incorrect behavior inside of ibis impala The Apache Impala backend

Comments

@contang0

contang0 commented Feb 27, 2024

What happened?

Hello, back in January I reported this bug in a comment on a closed pull request, where it probably went unnoticed. I should have opened a separate issue back then, sorry! The problem is that Ibis now tries to create a temporary database (__ibis_tmp) on connect. That fails for me, and likely for most other corporate users, because my account does not have CREATE privileges.

What version of ibis are you using?

8.0

What backend(s) are you using, if any?

Impala

Relevant log output

The log is in my original comment, linked here; the error is still the same:

https://github.com/ibis-project/ibis/pull/7537#issuecomment-1876533010


---------------------------------------------------------------------------
HiveServer2Error                          Traceback (most recent call last)
c:\Users\username\Desktop\Projects\projname\src\pairing.py in line 1
----> 281 prod_mmsrdl = connect_ibis(schema_name="schema_name", env="prod")

File ~\Desktop\Projects\projname\src\utils.py:35, in connect_ibis(schema_name, env)
---> 35 return ibis.impala.connect(
     36     host=hosts[env],
     37     database=schema_name,
     38     port=443,
     39     use_ssl=True,
     40     auth_mechanism="PLAIN",
     41     user=os.getlogin(),
     42     password=pw,
     43     use_http_transport=True,
     44     http_path="cliservice",
     45 )

File c:\Mambaforge\envs\env_name\Lib\site-packages\ibis\__init__.py:107, in __getattr__.<locals>.connect(*args, **kwargs)
    106 def connect(*args, **kwargs):
--> 107     return backend.connect(*args, **kwargs)

File c:\Mambaforge\envs\env_name\Lib\site-packages\ibis\backends\base\__init__.py:847, in BaseBackend.connect(self, *args, **kwargs)
    824 """Connect to the database.
    825 
    826 Parameters
   (...)
    844     An instance of the backend
    845 """
    846 new_backend = self.__class__(*args, **kwargs)
--> 847 new_backend.reconnect()
    848 return new_backend

File c:\Mambaforge\envs\env_name\Lib\site-packages\ibis\backends\base\__init__.py:862, in BaseBackend.reconnect(self)
    860 def reconnect(self) -> None:
    861     """Reconnect to the database already configured with connect."""
--> 862     self.do_connect(*self._con_args, **self._con_kwargs)

File c:\Mambaforge\envs\env_name\Lib\site-packages\ibis\backends\impala\__init__.py:212, in Backend.do_connect(self, host, port, database, timeout, use_ssl, ca_cert, user, password, auth_mechanism, kerberos_service_name, pool_size, **params)
    210 self.con = con
    211 self.options = {}
--> 212 self._ensure_temp_db_exists()

File c:\Mambaforge\envs\env_name\Lib\site-packages\ibis\backends\impala\__init__.py:668, in Backend._ensure_temp_db_exists(self)
    666 name, path = options.impala.temp_db, options.impala.temp_path
    667 if name not in self.list_databases():
--> 668     self.create_database(name, path=path, force=True)

File c:\Mambaforge\envs\env_name\Lib\site-packages\ibis\backends\impala\__init__.py:307, in Backend.create_database(self, name, path, force)
    295 """Create a new Impala database.
    296 
    297 Parameters
   (...)
    304     Forcibly create the database
    305 """
    306 statement = CreateDatabase(name, path=path, can_exist=force)
--> 307 self._safe_exec_sql(statement)

File c:\Mambaforge\envs\env_name\Lib\site-packages\ibis\backends\impala\__init__.py:276, in Backend._safe_exec_sql(self, *args, **kwargs)
    275 def _safe_exec_sql(self, *args, **kwargs):
--> 276     with self._safe_raw_sql(*args, **kwargs):
    277         pass

File c:\Mambaforge\envs\env_name\Lib\contextlib.py:137, in _GeneratorContextManager.__enter__(self)
    135 del self.args, self.kwds, self.func
    136 try:
--> 137     return next(self.gen)
    138 except StopIteration:
    139     raise RuntimeError("generator didn't yield") from None

File c:\Mambaforge\envs\env_name\Lib\site-packages\ibis\backends\impala\__init__.py:272, in Backend._safe_raw_sql(self, query)
    270 if not isinstance(query, str):
    271     query = query.compile()
--> 272 with contextlib.closing(self.raw_sql(query)) as cur:
    273     yield cur

File c:\Mambaforge\envs\env_name\Lib\site-packages\ibis\backends\impala\__init__.py:252, in Backend.raw_sql(self, query)
    249     cursor._wait_to_finish()
    251     util.log(query)
--> 252     cursor.execute_async(query)
    254     cursor._wait_to_finish()
    255 except (Exception, KeyboardInterrupt):

File c:\Mambaforge\envs\env_name\Lib\site-packages\impala\hiveserver2.py:388, in HiveServer2Cursor.execute_async(self, operation, parameters, configuration)
    383     op = self.session.execute(self._last_operation_string,
    384                               configuration,
    385                               run_async=True)
    386     self._last_operation = op
--> 388 self._execute_async(op)

File c:\Mambaforge\envs\env_name\Lib\site-packages\impala\hiveserver2.py:407, in HiveServer2Cursor._execute_async(self, operation_fn)
    405 self._reset_state()
    406 self._debug_log_state()
--> 407 operation_fn()
    408 self._last_operation_active = True
    409 self._debug_log_state()

File c:\Mambaforge\envs\env_name\Lib\site-packages\impala\hiveserver2.py:383, in HiveServer2Cursor.execute_async.<locals>.op()
    380 else:
    381     self._last_operation_string = operation
--> 383 op = self.session.execute(self._last_operation_string,
    384                           configuration,
    385                           run_async=True)
    386 self._last_operation = op

File c:\Mambaforge\envs\env_name\Lib\site-packages\impala\hiveserver2.py:1227, in HS2Session.execute(self, statement, configuration, run_async)
   1219 req = TExecuteStatementReq(sessionHandle=self.handle,
   1220                            statement=statement,
   1221                            confOverlay=configuration,
   1222                            runAsync=run_async)
   1223 # Do not try to retry http requests.
   1224 # Read queries should be idempotent but most dml queries are not. Also retrying
   1225 # query execution from client could be expensive and so likely makes sense to do
   1226 # it if server is also aware of the retries.
-> 1227 return self._operation('ExecuteStatement', req, False)

File c:\Mambaforge\envs\env_name\Lib\site-packages\impala\hiveserver2.py:1148, in ThriftRPC._operation(self, kind, request, retry_on_http_error)
   1147 def _operation(self, kind, request, retry_on_http_error=False):
-> 1148     resp = self._rpc(kind, request, retry_on_http_error)
   1149     return self._get_operation(resp.operationHandle)

File c:\Mambaforge\envs\env_name\Lib\site-packages\impala\hiveserver2.py:1085, in ThriftRPC._rpc(self, func_name, request, retry_on_http_error)
   1083 response = self._execute(func_name, request, retry_on_http_error)
   1084 self._log_response(func_name, response)
-> 1085 err_if_rpc_not_ok(response)
   1086 return response

File c:\Mambaforge\envs\env_name\Lib\site-packages\impala\hiveserver2.py:781, in err_if_rpc_not_ok(resp)
    777 def err_if_rpc_not_ok(resp):
    778     if (resp.status.statusCode != TStatusCode.SUCCESS_STATUS and
    779             resp.status.statusCode != TStatusCode.SUCCESS_WITH_INFO_STATUS and
    780             resp.status.statusCode != TStatusCode.STILL_EXECUTING_STATUS):
--> 781         raise HiveServer2Error(resp.status.errorMessage)

HiveServer2Error: AuthorizationException: User 'username' does not have privileges to execute 'CREATE' on: __ibis_tmp
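
The last ibis frames of the traceback show the connect-time logic: `_ensure_temp_db_exists` looks up `options.impala.temp_db` in `list_databases()` and issues a CREATE only when the name is missing. Below is a minimal, self-contained sketch of that control flow. `FakeImpala`, its `can_create` flag, and `try_connect` are hypothetical stand-ins for illustration (not ibis classes), and `PermissionError` stands in for the `HiveServer2Error` above.

```python
class FakeImpala:
    """Hypothetical stand-in for the Impala backend; not part of ibis."""

    def __init__(self, databases, can_create, temp_db="__ibis_tmp"):
        self.databases = set(databases)
        self.can_create = can_create
        self.temp_db = temp_db

    def list_databases(self):
        return sorted(self.databases)

    def create_database(self, name):
        # Models the server rejecting CREATE for an unprivileged user.
        if not self.can_create:
            raise PermissionError(
                f"User does not have privileges to execute 'CREATE' on: {name}"
            )
        self.databases.add(name)

    def ensure_temp_db_exists(self):
        # Same shape as the check shown in the traceback (ibis 8.0):
        # create the temporary database only if it is not listed.
        if self.temp_db not in self.list_databases():
            self.create_database(self.temp_db)

    def connect(self):
        self.ensure_temp_db_exists()
        return self


def try_connect(backend):
    """Return 'ok' or the error message, so both outcomes are easy to compare."""
    try:
        backend.connect()
    except PermissionError as exc:
        return str(exc)
    return "ok"
```

Under these assumptions, a user without CREATE rights fails on connect unless the temporary database already exists. That suggests, untested against a real cluster, that having an administrator pre-create the database, or pointing the `options.impala.temp_db` setting read in the traceback at an existing database, might sidestep the CREATE on 8.0.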

@contang0 contang0 added the bug Incorrect behavior inside of ibis label Feb 27, 2024
@cpcloud cpcloud changed the title bug: cannot connect to Imapla backend after 8.0 update bug(impala): cannot connect to Impala backend after 8.0 update due to attempt to create a temporary database on connect Feb 28, 2024
@cpcloud
Member

cpcloud commented Feb 28, 2024

Thanks for reporting (again)!

A quick glance at the code suggests the temporary database may not be needed at all. I'll poke at it a bit.

@cpcloud cpcloud added the impala The Apache Impala backend label Feb 28, 2024
gforsyth pushed a commit that referenced this issue Feb 28, 2024
…may prevent connection success (#8489)

Remove the creation of temporary resources in the Impala backend, which
are no longer used after the HDFS removal (these temp resources were
used as scratch space for things like dataframe csvs). Fixes #8466.
@chris-park
Contributor

chris-park commented Jun 26, 2024

@cpcloud - I'm getting a similar error with both ibis-framework-10.0.0.dev120 and ibis-framework-9.1.0. The call shown in the traceback below runs fine on version 7.2.0.

Here's the error:

---------------------------------------------------------------------------
HiveServer2Error                          Traceback (most recent call last)
Cell In[4], line 1
----> 1 con.sql("SELECT COUNT(*) FROM dbschema.table_name")

File c:\Users\username\venvs\.venv\Lib\site-packages\ibis\backends\sql\__init__.py:200, in SQLBackend.sql(self, query, schema, dialect)
    198 query = self._transpile_sql(query, dialect=dialect)
    199 if schema is None:
--> 200     schema = self._get_schema_using_query(query)
    201 return ops.SQLQueryResult(query, ibis.schema(schema), self).to_expr()

File c:\Users\username\venvs\.venv\Lib\site-packages\ibis\backends\impala\__init__.py:401, in Backend._get_schema_using_query(self, query)
    396 create_sql = sge.Create(
    397     kind="VIEW", this=ident, exists=True, expression=query, dialect=self.dialect
    398 )
    399 drop_sql = sge.Drop(kind="VIEW", this=ident, exists=True)
--> 401 with self._safe_raw_sql(create_sql):
    402     pass
    404 try:

File C:\Program Files\Python311\Lib\contextlib.py:137, in _GeneratorContextManager.__enter__(self)
    135 del self.args, self.kwds, self.func
    136 try:
--> 137     return next(self.gen)
    138 except StopIteration:
    139     raise RuntimeError("generator didn't yield") from None

File c:\Users\username\venvs\.venv\Lib\site-packages\ibis\backends\impala\__init__.py:273, in Backend._safe_raw_sql(self, query)
    270         query = query.compile()
    272 assert isinstance(query, str), type(query)
--> 273 with contextlib.closing(self.raw_sql(query)) as cur:
    274     yield cur

File c:\Users\username\venvs\.venv\Lib\site-packages\ibis\backends\impala\__init__.py:248, in Backend.raw_sql(self, query)
    245     cursor._wait_to_finish()
    247     util.log(query)
--> 248     cursor.execute_async(query)
    250     cursor._wait_to_finish()
    251 except (Exception, KeyboardInterrupt):

File c:\Users\username\venvs\.venv\Lib\site-packages\impala\hiveserver2.py:388, in HiveServer2Cursor.execute_async(self, operation, parameters, configuration)
    383     op = self.session.execute(self._last_operation_string,
    384                               configuration,
    385                               run_async=True)
    386     self._last_operation = op
--> 388 self._execute_async(op)

File c:\Users\username\venvs\.venv\Lib\site-packages\impala\hiveserver2.py:407, in HiveServer2Cursor._execute_async(self, operation_fn)
    405 self._reset_state()
    406 self._debug_log_state()
--> 407 operation_fn()
    408 self._last_operation_active = True
    409 self._debug_log_state()

File c:\Users\username\venvs\.venv\Lib\site-packages\impala\hiveserver2.py:383, in HiveServer2Cursor.execute_async.<locals>.op()
    380 else:
    381     self._last_operation_string = operation
--> 383 op = self.session.execute(self._last_operation_string,
    384                           configuration,
    385                           run_async=True)
    386 self._last_operation = op

File c:\Users\username\venvs\.venv\Lib\site-packages\impala\hiveserver2.py:1227, in HS2Session.execute(self, statement, configuration, run_async)
   1219 req = TExecuteStatementReq(sessionHandle=self.handle,
   1220                            statement=statement,
   1221                            confOverlay=configuration,
   1222                            runAsync=run_async)
   1223 # Do not try to retry http requests.
   1224 # Read queries should be idempotent but most dml queries are not. Also retrying
   1225 # query execution from client could be expensive and so likely makes sense to do
   1226 # it if server is also aware of the retries.
-> 1227 return self._operation('ExecuteStatement', req, False)

File c:\Users\username\venvs\.venv\Lib\site-packages\impala\hiveserver2.py:1148, in ThriftRPC._operation(self, kind, request, retry_on_http_error)
   1147 def _operation(self, kind, request, retry_on_http_error=False):
-> 1148     resp = self._rpc(kind, request, retry_on_http_error)
   1149     return self._get_operation(resp.operationHandle)

File c:\Users\username\venvs\.venv\Lib\site-packages\impala\hiveserver2.py:1085, in ThriftRPC._rpc(self, func_name, request, retry_on_http_error)
   1083 response = self._execute(func_name, request, retry_on_http_error)
   1084 self._log_response(func_name, response)
-> 1085 err_if_rpc_not_ok(response)
   1086 return response

File c:\Users\username\venvs\.venv\Lib\site-packages\impala\hiveserver2.py:781, in err_if_rpc_not_ok(resp)
    777 def err_if_rpc_not_ok(resp):
    778     if (resp.status.statusCode != TStatusCode.SUCCESS_STATUS and
    779             resp.status.statusCode != TStatusCode.SUCCESS_WITH_INFO_STATUS and
    780             resp.status.statusCode != TStatusCode.STILL_EXECUTING_STATUS):
--> 781         raise HiveServer2Error(resp.status.errorMessage)

HiveServer2Error: AuthorizationException: User 'username' does not have privileges to execute 'CREATE' on: dbschema

Please let me know if it would be easier for you if I raised a new issue. Thanks for your help.
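
For context, the traceback shows that `SQLBackend.sql` only calls `_get_schema_using_query` when `schema is None`, and that the Impala implementation of that method builds a CREATE VIEW statement, which is where the authorization error is raised. The sketch below models just that branch. The function names, the `can_create` flag, and the placeholder schema are hypothetical, and whether passing `schema=` avoids every CREATE on a real cluster is an assumption that has not been verified here.

```python
def infer_schema_with_view(query, can_create):
    """Hypothetical model of schema inference that needs CREATE VIEW."""
    if not can_create:
        raise PermissionError("does not have privileges to execute 'CREATE'")
    # A real backend would create a view, describe it, and drop it again;
    # this placeholder result only keeps the sketch self-contained.
    return {"placeholder": "int64"}


def sql(query, schema=None, can_create=False):
    """Mirror the branch in the traceback: infer only when schema is None."""
    if schema is None:
        schema = infer_schema_with_view(query, can_create)
    return query, dict(schema)


def needs_create(query, schema=None):
    """Report whether the call would hit the CREATE path for a read-only user."""
    try:
        sql(query, schema=schema, can_create=False)
    except PermissionError:
        return True
    return False
```

In this model, supplying the schema up front (for example `{"n": "int64"}` for a query that aliases its count as `n`) skips the inference step entirely, so a read-only user never reaches the CREATE VIEW.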
