-
Notifications
You must be signed in to change notification settings - Fork 294
Warning during using hashdiff for VARCHAR type values in POSTGRES. #849
Description
Despite using a supported data type, while using hashdiff, I receive the warning '[PostgreSQL] Column 'email' of type 'Text(_notes=[])' has no compatibility handling.' I couldn't find information about this in the documentation, only that VARCHAR is supported for PostgreSQL.
documentation
data-diff "postgresql://postgres:Password1@localhost:5432/postgres" test2 "postgresql://postgres:Password1@localhost:5432/postgres" test -k id -c email -v -a hashdiff -d
01:20:09 INFO [PostgreSQL] Starting a threadpool, size=1. base.py:1196
DEBUG Running SQL (PostgreSQL): base.py:959
SET TIME ZONE 'UTC'
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT column_name, data_type, datetime_precision, numeric_precision, numeric_scale FROM
information_schema.columns WHERE table_name = 'test2' AND table_schema = 'public'
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT column_name, data_type, datetime_precision, numeric_precision, numeric_scale FROM
information_schema.columns WHERE table_name = 'test' AND table_schema = 'public'
INFO [PostgreSQL] Schema = {'id': ('id', 'integer', None, 32, 0), 'first_name': ('first_name', schema.py:12
'character varying', None, None, None), 'second_name': ('second_name', 'character varying',
None, None, None), 'last_name': ('last_name', 'character varying', None, None, None), 'emailx':
('emailx', 'character varying', None, None, None), 'email': ('email', 'character varying', None,
None, None), 'nulle1': ('nulle1', 'character varying', None, None, None), 'nulle2': ('nulle2',
'character varying', None, None, None)}
INFO [PostgreSQL] Schema = {'id': ('id', 'integer', None, 32, 0), 'first_name': ('first_name', schema.py:12
'character varying', None, None, None), 'second_name': ('second_name', 'character varying',
None, None, None), 'last_name': ('last_name', 'character varying', None, None, None), 'emailx':
('emailx', 'character varying', None, None, None), 'email': ('email', 'character varying', None,
None, None), 'nulle1': ('nulle1', 'character varying', None, None, None), 'nulle2': ('nulle2',
'character varying', None, None, None)}
DEBUG Available mutual columns: {'email', 'id', 'first_name', 'last_name', 'nulle1', 'emailx', __main__.py:489
'nulle2', 'second_name'}
INFO Diffing schemas... __main__.py:81
INFO Diffing using columns: key=('id',) update=None extra=('email',). __main__.py:518
INFO Using algorithm 'hashdiff'. __main__.py:519
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT * FROM (SELECT TRIM("email") FROM "test2") AS LIMITED_SELECT LIMIT 64
INFO [PostgreSQL] Schema = {'id': Integer(_notes=[], precision=0, python_type=<class 'int'>), schema.py:12
'email': Text(_notes=[])}
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT * FROM (SELECT TRIM("email") FROM "test") AS LIMITED_SELECT LIMIT 64
INFO [PostgreSQL] Schema = {'id': Integer(_notes=[], precision=0, python_type=<class 'int'>), schema.py:12
'email': Text(_notes=[])}
WARNING [PostgreSQL] Column 'email' of type 'Text(_notes=[])' has no compatibility handling. hashdiff_tables.py:133
If encoding/formatting differs between databases, it may result in false positives.
WARNING [PostgreSQL] Column 'email' of type 'Text(_notes=[])' has no compatibility handling. hashdiff_tables.py:133
If encoding/formatting differs between databases, it may result in false positives.
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT min("id")::varchar, max("id")::varchar FROM "test2"
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT min("id")::varchar, max("id")::varchar FROM "test"
INFO Diffing segments at key-range: (1)..(3). size: table1 <= 2, table2 <= 2 diff_tables.py:294
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT "id"::varchar, "email"::varchar FROM "test2" WHERE ("id" >= 1) AND ("id" < 3)
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT "id"::varchar, "email"::varchar FROM "test" WHERE ("id" >= 1) AND ("id" < 3)
INFO Diff found 0 different rows. hashdiff_tables.py:217
01:20:10 INFO Duration: 1.38 seconds. __main__.py:553
Same result i receive for columns with empty strings and null values:
data-diff "postgresql://postgres:Password1@localhost:5432/postgres" test2 "postgresql://postgres:Password1@localhost:5432/postgres" test -k id -c nulle2 -v -a hashdiff -d
I'm running data-diff version 0.9.17.
My testing tabels in PostgresSQL:
create table test (
id serial primary key,
first_name VARCHAR(50) NOT NULL,
second_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50)NOT NULL,
emailx VARCHAR(50),
email VARCHAR(50),
Nulle1 VARCHAR(20),
Nulle2 VARCHAR(20)
);
insert into test (id, first_name, second_name, last_name, emailx, email, Nulle1, Nulle2)
values
(1, 'Tisha', 'Annalise', 'Casserly', 'acasserly0@i2i.jp*', 'acasserly0@i2i.jp', null, ''),
(2, 'Joelynn', 'Bette', 'Samuels', 'bsamuels1@networksolutions.com*', 'bsamuels1@networksolutions.com', ' ', null);
insert into test2 (id, first_name, second_name, last_name, emailx, email, Nulle1, Nulle2)
values
(1, 'Tisha', 'Annalise', 'Casserly', 'acasserly0@i2i.jp*', 'acasserly0@i2i.jp', '', ''),
(2, 'Joelynn', 'Bette', 'Samuels', 'bsamuels1@networksolutions.com*', 'bsamuels1@networksolutions.com', null, NULL)