This repository was archived by the owner on May 17, 2024. It is now read-only.
diff_tables gets an error when boolean column(s) are included in the extra_columns argument #280
Closed
Description
First things first, let me express my admiration for the data-diff project and its contributors 👏 🙏
Coming to the issue:
When I include a boolean column in the extra_columns argument of diff_tables, I get the error below (on Redshift):
psycopg2.errors.CannotCoerce: cannot cast type boolean to character varying
- The example code I used:
diff_tables_list = diff_tables(table1, table2, key_column="id", extra_columns=("title", "is_active"))
for different_row in diff_tables_list:
    ...
(Assume the is_active column is boolean.)
- The run output and the error I am getting (including the stack trace):
Traceback (most recent call last):
File "data_diff_test.py", line 151, in <module>
ids = get_table_diff(table)
File "data_diff_test.py", line 135, in get_table_diff
for different_row in diff_tables_list:
File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/diff_tables.py", line 167, in diff_tables
raise error
File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/diff_tables.py", line 143, in diff_tables
yield from ti
File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/thread_utils.py", line 68, in __iter__
raise self._exception
File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/thread_utils.py", line 56, in _worker
res = fn(*args, **kwargs)
File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/diff_tables.py", line 294, in _diff_tables
(count1, checksum1), (count2, checksum2) = self._threaded_call("count_and_checksum", [table1, table2])
File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/diff_tables.py", line 320, in _threaded_call
return list(self._thread_map(methodcaller(func), iterable))
File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
yield fs.pop().result()
File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/_base.py", line 437, in result
return self.__get_result()
File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/table_segment.py", line 198, in count_and_checksum
count, checksum = self.database.query(
File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/databases/base.py", line 104, in query
res = self._query(sql_code)
File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/databases/base.py", line 312, in _query
return r.result()
File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/_base.py", line 444, in result
return self.__get_result()
File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/databases/base.py", line 318, in _query_in_worker
return _query_conn(self.thread_local.conn, sql_code)
File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/databases/base.py", line 69, in _query_conn
c.execute(sql_code)
psycopg2.errors.CannotCoerce: cannot cast type boolean to character varying
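The traceback shows the failure surfacing in count_and_checksum, when the generated SQL is executed and Redshift rejects a direct boolean-to-varchar cast. As a rough sketch of one possible way around that kind of cast (not data-diff's actual fix, and not something I have wired into the library), a boolean column could be rendered as text with a CASE expression instead. The helper name boolean_to_text_sql is hypothetical, and the sketch assumes the column name is a trusted identifier:

```python
def boolean_to_text_sql(column: str) -> str:
    """Hypothetical helper (not part of data-diff).

    Builds a SQL expression that renders a boolean column as text
    without a direct boolean -> varchar cast. A NULL value stays NULL,
    because the CASE expression has no ELSE branch.
    Assumes `column` is a trusted identifier; no quoting is applied.
    """
    return f"CASE WHEN {column} THEN 'true' WHEN NOT {column} THEN 'false' END"


# Example: the expression that could stand in for a cast of is_active.
expr = boolean_to_text_sql("is_active")
print(expr)
```

Whether and where such an expression would fit into the checksum query data-diff generates is an open question for the maintainers.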
Environment
MacBook Pro 2020 with M1 Max chip
macOS 12.6 Monterey
Python 3.8.12
data_diff version 0.2.8