This repository was archived by the owner on May 17, 2024. It is now read-only.

WHERE clause isn't being applied in _refine_coltypes (Errors on Bigquery table with Partition filter set to required) #221

@tripathiabhay09

Hi Team,

I am trying to run count_and_checksum and table_diff on a BigQuery table that has the Partition filter flag set to Required (see the screenshot for the flag detail). The run fails with the following error:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/operators/python.py", line 171, in execute
    return_value = self.execute_callable()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/operators/python.py", line 189, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)

  File "/home/airflow/.local/lib/python3.8/site-packages/data_diff/diff_tables.py", line 151, in with_schema
    return self._with_raw_schema(self.database.query_table_schema(self.table_path))
  File "/home/airflow/.local/lib/python3.8/site-packages/data_diff/diff_tables.py", line 143, in _with_raw_schema
    schema = self.database._process_table_schema(self.table_path, raw_schema, self._relevant_columns)
  File "/home/airflow/.local/lib/python3.8/site-packages/data_diff/databases/base.py", line 195, in _process_table_schema
    self._refine_coltypes(path, col_dict)
  File "/home/airflow/.local/lib/python3.8/site-packages/data_diff/databases/base.py", line 208, in _refine_coltypes
    samples_by_row = self.query(Select(fields, TableName(table_path), limit=16), list)
  File "/home/airflow/.local/lib/python3.8/site-packages/data_diff/databases/base.py", line 101, in query
    res = self._query(sql_code)
  File "/home/airflow/.local/lib/python3.8/site-packages/data_diff/databases/bigquery.py", line 57, in _query
    raise ConnectError(msg % (sql_code, e))
data_diff.databases.base.ConnectError: Exception when trying to execute SQL code:
    SELECT TRIM() FROM  LIMIT 16

Got error: 400 Cannot query over table<tablename> without a filter over column(s) <column name> that can be used for partition elimination

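I'm not sure whether this is a bug or I am doing something wrong. I am creating the table segment with a where_clause that filters on the partition column, but judging by the traceback, the sampling query in _refine_coltypes (the SELECT ... LIMIT 16 above) is issued without that filter.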
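
To illustrate the behaviour I would expect, here is a minimal sketch of a sampling query that carries the filter along. This is not data-diff's actual code: `build_sample_query`, the table name, the column name, and the filter value are all hypothetical and only mirror the shape of the failing query.

```python
from typing import Optional, Sequence


def build_sample_query(
    table: str,
    columns: Sequence[str],
    where: Optional[str] = None,
    limit: int = 16,
) -> str:
    """Build a sampling SELECT that keeps the caller's filter (hypothetical helper)."""
    # Mirror the TRIM(...) projection seen in the failing query.
    fields = ", ".join(f"TRIM({col})" for col in columns)
    sql = f"SELECT {fields} FROM {table}"
    if where:
        # Without this predicate, BigQuery rejects the query on tables
        # that require a partition filter.
        sql += f" WHERE {where}"
    return f"{sql} LIMIT {limit}"


# Hypothetical table, column, and partition filter, for illustration only.
print(build_sample_query("my_dataset.my_table", ["id"], where="event_date = '2022-08-01'"))
```

With the where clause included, the sampling query would satisfy the partition filter requirement in the same way the main diff queries do.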
Thank you so much for the help.
[Screenshot: Partition filter flag detail]

Labels: bug
