This repository was archived by the owner on May 17, 2024. It is now read-only.

diff_tables gets an error when boolean column(s) are included in the extra_columns argument #280

@akocukcu

Description

First things first, let me express my admiration for the data-diff project and its contributors 👏 🙏
Coming to the issue:
When I include a boolean column in the extra_columns argument of diff_tables, I get the error below (on Redshift):

psycopg2.errors.CannotCoerce: cannot cast type boolean to character varying

  • The example code I used:
diff_tables_list = diff_tables(table1, table2, key_column="id", extra_columns=("title", "is_active"))
for different_row in diff_tables_list:
    ...

(assume the is_active column is boolean.)

  • The run output and the error I am getting (including the stack trace):
Traceback (most recent call last):
  File "data_diff_test.py", line 151, in <module>
    ids = get_table_diff(table)
  File "data_diff_test.py", line 135, in get_table_diff
    for different_row in diff_tables_list:
  File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/diff_tables.py", line 167, in diff_tables
    raise error
  File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/diff_tables.py", line 143, in diff_tables
    yield from ti
  File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/thread_utils.py", line 68, in __iter__
    raise self._exception
  File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/thread_utils.py", line 56, in _worker
    res = fn(*args, **kwargs)
  File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/diff_tables.py", line 294, in _diff_tables
    (count1, checksum1), (count2, checksum2) = self._threaded_call("count_and_checksum", [table1, table2])
  File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/diff_tables.py", line 320, in _threaded_call
    return list(self._thread_map(methodcaller(func), iterable))
  File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
    yield fs.pop().result()
  File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/table_segment.py", line 198, in count_and_checksum
    count, checksum = self.database.query(
  File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/databases/base.py", line 104, in query
    res = self._query(sql_code)
  File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/databases/base.py", line 312, in _query
    return r.result()
  File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/_base.py", line 444, in result
    return self.__get_result()
  File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/Users/apo.kocukcu/.pyenv/versions/3.8.12/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/databases/base.py", line 318, in _query_in_worker
    return _query_conn(self.thread_local.conn, sql_code)
  File "/Users/apo.kocukcu/.pyenv/versions/databricks-sql-test/lib/python3.8/site-packages/data_diff/databases/base.py", line 69, in _query_conn
    c.execute(sql_code)
psycopg2.errors.CannotCoerce: cannot cast type boolean to character varying
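
Judging by the last frames, the query generated for count_and_checksum seems to cast each compared column to text, and Redshift rejects that cast for a boolean column. Below is a minimal sketch of one possible way to render a boolean as text in SQL without a direct cast. The helper name to_text_sql and the type-name strings are hypothetical and only for illustration; this is not data_diff's actual code or API.

```python
def to_text_sql(column: str, column_type: str) -> str:
    """Return a SQL expression that renders `column` as text.

    Hypothetical helper for illustration; not part of data_diff.
    """
    if column_type.lower() in ("boolean", "bool"):
        # Avoid a direct boolean -> varchar cast, which Redshift rejects.
        # A NULL value matches neither branch, so the result stays NULL.
        return (
            f"CASE WHEN {column} THEN 'true' "
            f"WHEN NOT {column} THEN 'false' END"
        )
    # Assumption: other column types can be cast to text directly.
    return f"CAST({column} AS VARCHAR)"


# Assumed column types, matching the example in this issue.
columns = {"title": "varchar", "is_active": "boolean"}
for name, type_name in columns.items():
    print(to_text_sql(name, type_name))
```

The CASE expression is plain SQL, so it should not depend on which casts a given database allows for booleans. Whether 'true'/'false' or some other text form is the right normalization would have to be the same for both tables being compared.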

Environment
MacBook Pro 2020 with M1 Max chip
macOS 12.6 Monterey
Python 3.8.12
data_diff version 0.2.8
