(pyathena.error.OperationalError) com.facebook.presto.hive.DataCatalogException: Namespace test_pyathena_xy12gnw2tu not found #18

Closed
laughingman7743 opened this issue Jul 30, 2017 · 1 comment

Comments

@laughingman7743
Owner

__________________ TestSQLAlchemyAthena.test_get_table_names ___________________

self = <sqlalchemy.engine.base.Connection object at 0x7ff729120898>
dialect = <pyathena.sqlalchemy_athena.AthenaDialect object at 0x7ff72d35ec50>
constructor = <bound method type._init_statement of <class 'sqlalchemy.engine.default.DefaultExecutionContext'>>
statement = '\n                SELECT\n                  table_schema,\n                  table_name,\n                  column_na...       ordinal_position,\n                  comment\n                FROM information_schema.columns\n                '
parameters = {}
args = ('\n                SELECT\n                  table_schema,\n                  table_name,\n                  column_n...  ordinal_position,\n                  comment\n                FROM information_schema.columns\n                ', [])
conn = <sqlalchemy.pool._ConnectionFairy object at 0x7ff72bbca7f0>
context = <sqlalchemy.engine.default.DefaultExecutionContext object at 0x7ff72915ab38>

    def _execute_context(self, dialect, constructor,
                         statement, parameters,
                         *args):
        """Create an :class:`.ExecutionContext` and execute, returning
            a :class:`.ResultProxy`."""
    
        try:
            try:
                conn = self.__connection
            except AttributeError:
                conn = self._revalidate_connection()
    
            context = constructor(dialect, self, conn, *args)
        except BaseException as e:
            self._handle_dbapi_exception(
                e,
                util.text_type(statement), parameters,
                None, None)
    
        if context.compiled:
            context.pre_exec()
    
        cursor, statement, parameters = context.cursor, \
            context.statement, \
            context.parameters
    
        if not context.executemany:
            parameters = parameters[0]
    
        if self._has_events or self.engine._has_events:
            for fn in self.dispatch.before_cursor_execute:
                statement, parameters = \
                    fn(self, cursor, statement, parameters,
                       context, context.executemany)
    
        if self._echo:
            self.engine.logger.info(statement)
            self.engine.logger.info(
                "%r",
                sql_util._repr_params(parameters, batches=10)
            )
    
        evt_handled = False
        try:
            if context.executemany:
                if self.dialect._has_events:
                    for fn in self.dialect.dispatch.do_executemany:
                        if fn(cursor, statement, parameters, context):
                            evt_handled = True
                            break
                if not evt_handled:
                    self.dialect.do_executemany(
                        cursor,
                        statement,
                        parameters,
                        context)
            elif not parameters and context.no_parameters:
                if self.dialect._has_events:
                    for fn in self.dialect.dispatch.do_execute_no_params:
                        if fn(cursor, statement, context):
                            evt_handled = True
                            break
                if not evt_handled:
                    self.dialect.do_execute_no_params(
                        cursor,
                        statement,
                        context)
            else:
                if self.dialect._has_events:
                    for fn in self.dialect.dispatch.do_execute:
                        if fn(cursor, statement, parameters, context):
                            evt_handled = True
                            break
                if not evt_handled:
                    self.dialect.do_execute(
                        cursor,
                        statement,
                        parameters,
>                       context)

.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/base.py:1182: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pyathena.sqlalchemy_athena.AthenaDialect object at 0x7ff72d35ec50>
cursor = <pyathena.cursor.Cursor object at 0x7ff72915af98>
statement = '\n                SELECT\n                  table_schema,\n                  table_name,\n                  column_na...       ordinal_position,\n                  comment\n                FROM information_schema.columns\n                '
parameters = {}
context = <sqlalchemy.engine.default.DefaultExecutionContext object at 0x7ff72915ab38>

    def do_execute(self, cursor, statement, parameters, context=None):
>       cursor.execute(statement, parameters)

.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/default.py:470: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<pyathena.cursor.Cursor object at 0x7ff72915af98>, '\n                SELECT\n                  table_schema,\n      ...  ordinal_position,\n                  comment\n                FROM information_schema.columns\n                ', {})
kwargs = {}

    @functools.wraps(wrapped)
    def _wrapper(*args, **kwargs):
        with _lock:
>           return wrapped(*args, **kwargs)

pyathena/util.py:29: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pyathena.cursor.Cursor object at 0x7ff72915af98>
operation = '\n                SELECT\n                  table_schema,\n                  table_name,\n                  column_na...       ordinal_position,\n                  comment\n                FROM information_schema.columns\n                '
parameters = {}

    @synchronized
    def execute(self, operation, parameters=None):
        query = self._formatter.format(operation, parameters)
        _logger.debug(query)
    
        request = self._build_query_execution_request(query)
        try:
            self._reset_state()
            response = retry_api_call(self._connection.start_query_execution,
                                      exceptions=self.retry_exceptions,
                                      attempt=self.retry_attempt,
                                      multiplier=self.retry_multiplier,
                                      max_delay=self.retry_max_deply,
                                      exp_base=self.retry_exponential_base,
                                      logger=_logger,
                                      **request)
        except Exception as e:
            _logger.exception('Failed to execute query.')
            raise_from(DatabaseError(*e.args), e)
        else:
            self._query_id = response.get('QueryExecutionId', None)
>           self._poll()

pyathena/cursor.py:233: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pyathena.cursor.Cursor object at 0x7ff72915af98>

    def _poll(self):
        if not self._query_id:
            raise ProgrammingError('QueryExecutionId is none or empty.')
        while True:
            try:
                request = {'QueryExecutionId': self._query_id}
                response = retry_api_call(self._connection.get_query_execution,
                                          exceptions=self.retry_exceptions,
                                          attempt=self.retry_attempt,
                                          multiplier=self.retry_multiplier,
                                          max_delay=self.retry_max_deply,
                                          exp_base=self.retry_exponential_base,
                                          logger=_logger,
                                          **request)
            except Exception as e:
                _logger.exception('Failed to poll query result.')
                raise_from(OperationalError(*e.args), e)
            else:
                query_execution = response.get('QueryExecution', None)
                if not query_execution:
                    raise DataError('KeyError `QueryExecution`')
                status = query_execution.get('Status', None)
                if not status:
                    raise DataError('KeyError `Status`')
    
                state = status.get('State', None)
                if state == 'SUCCEEDED':
                    self._completion_date_time = status.get('CompletionDateTime', None)
                    self._submission_date_time = status.get('SubmissionDateTime', None)
    
                    statistics = query_execution.get('Statistics', {})
                    self._data_scanned_in_bytes = statistics.get(
                        'DataScannedInBytes', None)
                    self._execution_time_in_millis = statistics.get(
                        'EngineExecutionTimeInMillis', None)
    
                    result_conf = query_execution.get('ResultConfiguration', {})
                    self._output_location = result_conf.get('OutputLocation', None)
                    break
                elif state == 'FAILED':
>                   raise OperationalError(status.get('StateChangeReason', None))
E                   pyathena.error.OperationalError: com.facebook.presto.hive.DataCatalogException: Namespace test_pyathena_xy12gnw2tu not found. Please check your query.

pyathena/cursor.py:192: OperationalError

The above exception was the direct cause of the following exception:

self = <tests.test_sqlalchemy_athena.TestSQLAlchemyAthena testMethod=test_get_table_names>
engine = Engine(awsathena+rest://athena.[secure].amazonaws.com:443/test_pyathena_6m87iesn50?s3_staging_dir=[secure])
connection = <sqlalchemy.engine.base.Connection object at 0x7ff72d35eeb8>

    @with_engine
    def test_get_table_names(self, engine, connection):
        meta = MetaData()
>       meta.reflect(bind=engine)

tests/test_sqlalchemy_athena.py:101: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.tox/py34/lib/python3.4/site-packages/sqlalchemy/sql/schema.py:3909: in reflect
    Table(name, self, **reflect_opts)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/sql/schema.py:439: in __new__
    metadata._remove_table(name, schema)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/util/langhelpers.py:66: in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/util/compat.py:187: in reraise
    raise value
.tox/py34/lib/python3.4/site-packages/sqlalchemy/sql/schema.py:434: in __new__
    table._init(name, metadata, *args, **kw)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/sql/schema.py:514: in _init
    include_columns, _extend_on=_extend_on)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/sql/schema.py:527: in _autoload
    _extend_on=_extend_on
.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/base.py:1534: in run_callable
    return callable_(self, *args, **kwargs)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/default.py:372: in reflecttable
    table, include_columns, exclude_columns, **opts)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/reflection.py:598: in reflecttable
    table_name, schema, **table.dialect_kwargs):
.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/reflection.py:369: in get_columns
    **kw)
<string>:2: in get_columns
    ???
.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/reflection.py:54: in cache
    ret = fn(self, con, *args, **kw)
pyathena/sqlalchemy_athena.py:145: in get_columns
    } for row in connection.execute(query).fetchall()
.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/base.py:939: in execute
    return self._execute_text(object, multiparams, params)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/base.py:1097: in _execute_text
    statement, parameters
.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/base.py:1189: in _execute_context
    context)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/base.py:1402: in _handle_dbapi_exception
    exc_info
.tox/py34/lib/python3.4/site-packages/sqlalchemy/util/compat.py:203: in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/util/compat.py:186: in reraise
    raise value.with_traceback(tb)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/base.py:1182: in _execute_context
    context)
.tox/py34/lib/python3.4/site-packages/sqlalchemy/engine/default.py:470: in do_execute
    cursor.execute(statement, parameters)
pyathena/util.py:29: in _wrapper
    return wrapped(*args, **kwargs)
pyathena/cursor.py:233: in execute
    self._poll()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pyathena.cursor.Cursor object at 0x7ff72915af98>

    def _poll(self):
        if not self._query_id:
            raise ProgrammingError('QueryExecutionId is none or empty.')
        while True:
            try:
                request = {'QueryExecutionId': self._query_id}
                response = retry_api_call(self._connection.get_query_execution,
                                          exceptions=self.retry_exceptions,
                                          attempt=self.retry_attempt,
                                          multiplier=self.retry_multiplier,
                                          max_delay=self.retry_max_deply,
                                          exp_base=self.retry_exponential_base,
                                          logger=_logger,
                                          **request)
            except Exception as e:
                _logger.exception('Failed to poll query result.')
                raise_from(OperationalError(*e.args), e)
            else:
                query_execution = response.get('QueryExecution', None)
                if not query_execution:
                    raise DataError('KeyError `QueryExecution`')
                status = query_execution.get('Status', None)
                if not status:
                    raise DataError('KeyError `Status`')
    
                state = status.get('State', None)
                if state == 'SUCCEEDED':
                    self._completion_date_time = status.get('CompletionDateTime', None)
                    self._submission_date_time = status.get('SubmissionDateTime', None)
    
                    statistics = query_execution.get('Statistics', {})
                    self._data_scanned_in_bytes = statistics.get(
                        'DataScannedInBytes', None)
                    self._execution_time_in_millis = statistics.get(
                        'EngineExecutionTimeInMillis', None)
    
                    result_conf = query_execution.get('ResultConfiguration', {})
                    self._output_location = result_conf.get('OutputLocation', None)
                    break
                elif state == 'FAILED':
>                   raise OperationalError(status.get('StateChangeReason', None))
E                   sqlalchemy.exc.OperationalError: (pyathena.error.OperationalError) com.facebook.presto.hive.DataCatalogException: Namespace test_pyathena_xy12gnw2tu not found. Please check your query. [SQL: '\n                SELECT\n                  table_schema,\n                  table_name,\n                  column_name,\n                  data_type,\n                  is_nullable,\n                  column_default,\n                  ordinal_position,\n                  comment\n                FROM information_schema.columns\n                ']
@laughingman7743
Owner Author

Related issues:
laughingman7743/PyAthenaJDBC#15

TravisCI's "limit concurrent jobs" setting is currently 1, so the tests are very slow. 😫 I want to be able to run the tests in parallel.
I also want queries through SQLAlchemy to execute safely even when the DataCatalog changes while they run, as happened with the missing namespace in the traceback above.

laughingman7743 added a commit that referenced this issue Aug 12, 2017
…et_columns_method

Add retry processing of error when schema name or table name does not match (fix #18)
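
For readers following along, here is a minimal sketch of the kind of retry the commit message describes. It is not the code from the referenced commit. The names `retry_on_catalog_race` and `is_catalog_race`, the stand-in `OperationalError` class, and the choice to match on the error message text are all assumptions made for illustration. The idea is to retry the `information_schema.columns` query only when the failure looks like a namespace disappearing mid-query, and to re-raise anything else immediately.

```python
import time


class OperationalError(Exception):
    """Stand-in for pyathena.error.OperationalError so this sketch runs alone."""


def is_catalog_race(exc):
    # Assumption: detect the failure by the message text seen in the traceback.
    msg = str(exc)
    return "DataCatalogException" in msg and "not found" in msg


def retry_on_catalog_race(fn, attempts=3, delay=0.0):
    """Call fn(); retry only when the error looks like a catalog race."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except OperationalError as e:
            # Give up on the last attempt or on any unrelated error.
            if attempt == attempts or not is_catalog_race(e):
                raise
            time.sleep(delay * attempt)  # simple linear backoff


# Hypothetical usage: a fake get_columns that fails twice, then succeeds.
calls = {"n": 0}


def flaky_get_columns():
    calls["n"] += 1
    if calls["n"] < 3:
        raise OperationalError(
            "com.facebook.presto.hive.DataCatalogException: "
            "Namespace test_pyathena_xy12gnw2tu not found. "
            "Please check your query."
        )
    return [{"name": "col_int", "type": "integer"}]


columns = retry_on_catalog_race(flaky_get_columns)
```

Matching on the message keeps unrelated `OperationalError`s (bad SQL, permission problems) from being retried, at the cost of depending on the exact wording Athena returns.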