Skip to content
This repository was archived by the owner on May 17, 2024. It is now read-only.
This repository was archived by the owner on May 17, 2024. It is now read-only.

Warning during using hashdiff for VARCHAR type values in POSTGRES. #849

@EnCeT

Description

@EnCeT

Despite using a supported data type, while using hashdiff, I receive the warning '[PostgreSQL] Column 'email' of type 'Text(_notes=[])' has no compatibility handling.' I couldn't find information about this in the documentation, only that VARCHAR is supported for PostgreSQL.
documentation

data-diff "postgresql://postgres:Password1@localhost:5432/postgres" test2 "postgresql://postgres:Password1@localhost:5432/postgres" test -k id -c email -v -a hashdiff -d

01:20:09 INFO     [PostgreSQL] Starting a threadpool, size=1.                                                      base.py:1196
         DEBUG    Running SQL (PostgreSQL):                                                                         base.py:959
                  SET TIME ZONE 'UTC'
         DEBUG    Running SQL (PostgreSQL):                                                                         base.py:959
                  SELECT column_name, data_type, datetime_precision, numeric_precision, numeric_scale FROM
                  information_schema.columns WHERE table_name = 'test2' AND table_schema = 'public'
         DEBUG    Running SQL (PostgreSQL):                                                                         base.py:959
                  SELECT column_name, data_type, datetime_precision, numeric_precision, numeric_scale FROM
                  information_schema.columns WHERE table_name = 'test' AND table_schema = 'public'
         INFO     [PostgreSQL] Schema = {'id': ('id', 'integer', None, 32, 0), 'first_name': ('first_name',        schema.py:12
                  'character varying', None, None, None), 'second_name': ('second_name', 'character varying',
                  None, None, None), 'last_name': ('last_name', 'character varying', None, None, None), 'emailx':
                  ('emailx', 'character varying', None, None, None), 'email': ('email', 'character varying', None,
                  None, None), 'nulle1': ('nulle1', 'character varying', None, None, None), 'nulle2': ('nulle2',
                  'character varying', None, None, None)}
         INFO     [PostgreSQL] Schema = {'id': ('id', 'integer', None, 32, 0), 'first_name': ('first_name',        schema.py:12
                  'character varying', None, None, None), 'second_name': ('second_name', 'character varying',
                  None, None, None), 'last_name': ('last_name', 'character varying', None, None, None), 'emailx':
                  ('emailx', 'character varying', None, None, None), 'email': ('email', 'character varying', None,
                  None, None), 'nulle1': ('nulle1', 'character varying', None, None, None), 'nulle2': ('nulle2',
                  'character varying', None, None, None)}
         DEBUG    Available mutual columns: {'email', 'id', 'first_name', 'last_name', 'nulle1', 'emailx',      __main__.py:489
                  'nulle2', 'second_name'}
         INFO     Diffing schemas...                                                                             __main__.py:81
         INFO     Diffing using columns: key=('id',) update=None extra=('email',).                              __main__.py:518
         INFO     Using algorithm 'hashdiff'.                                                                   __main__.py:519
         DEBUG    Running SQL (PostgreSQL):                                                                         base.py:959
                  SELECT * FROM (SELECT TRIM("email") FROM "test2") AS LIMITED_SELECT LIMIT 64
         INFO     [PostgreSQL] Schema = {'id': Integer(_notes=[], precision=0, python_type=<class 'int'>),         schema.py:12
                  'email': Text(_notes=[])}
         DEBUG    Running SQL (PostgreSQL):                                                                         base.py:959
                  SELECT * FROM (SELECT TRIM("email") FROM "test") AS LIMITED_SELECT LIMIT 64
         INFO     [PostgreSQL] Schema = {'id': Integer(_notes=[], precision=0, python_type=<class 'int'>),         schema.py:12
                  'email': Text(_notes=[])}
         WARNING  [PostgreSQL] Column 'email' of type 'Text(_notes=[])' has no compatibility handling.   hashdiff_tables.py:133
                  If encoding/formatting differs between databases, it may result in false positives.
         WARNING  [PostgreSQL] Column 'email' of type 'Text(_notes=[])' has no compatibility handling.   hashdiff_tables.py:133
                  If encoding/formatting differs between databases, it may result in false positives.
         DEBUG    Running SQL (PostgreSQL):                                                                         base.py:959
                  SELECT min("id")::varchar, max("id")::varchar FROM "test2"
         DEBUG    Running SQL (PostgreSQL):                                                                         base.py:959
                  SELECT min("id")::varchar, max("id")::varchar FROM "test"
         INFO     Diffing segments at key-range: (1)..(3). size: table1 <= 2, table2 <= 2                    diff_tables.py:294
         DEBUG    Running SQL (PostgreSQL):                                                                         base.py:959
                  SELECT "id"::varchar, "email"::varchar FROM "test2" WHERE ("id" >= 1) AND ("id" < 3)
         DEBUG    Running SQL (PostgreSQL):                                                                         base.py:959
                  SELECT "id"::varchar, "email"::varchar FROM "test" WHERE ("id" >= 1) AND ("id" < 3)
         INFO     Diff found 0 different rows.                                                           hashdiff_tables.py:217
01:20:10 INFO     Duration: 1.38 seconds.                                                                       __main__.py:553

Same result i receive for columns with empty strings and null values:
data-diff "postgresql://postgres:Password1@localhost:5432/postgres" test2 "postgresql://postgres:Password1@localhost:5432/postgres" test -k id -c nulle2 -v -a hashdiff -d
I'm running data-diff version 0.9.17.

My testing tabels in PostgresSQL:

create table test (
	id serial primary key,
	first_name VARCHAR(50) NOT NULL,
	second_name VARCHAR(50) NOT NULL,
	last_name VARCHAR(50)NOT NULL,
	emailx VARCHAR(50),
    email VARCHAR(50),
	Nulle1 VARCHAR(20),
	Nulle2 VARCHAR(20)
);
insert into test (id, first_name, second_name, last_name, emailx, email, Nulle1, Nulle2) 
values
(1,	'Tisha',	'Annalise',	'Casserly',	'acasserly0@i2i.jp*',	'acasserly0@i2i.jp',	null, ''),
(2,	'Joelynn',	'Bette',	'Samuels',	'bsamuels1@networksolutions.com*',	'bsamuels1@networksolutions.com',	' ',	null);

insert into test2 (id, first_name, second_name, last_name, emailx, email, Nulle1, Nulle2) 
values 
(1,	'Tisha',	'Annalise',	'Casserly',	'acasserly0@i2i.jp*',	'acasserly0@i2i.jp',	'',	''),
(2,	'Joelynn',	'Bette',	'Samuels',	'bsamuels1@networksolutions.com*',	'bsamuels1@networksolutions.com',	null,	NULL)

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingstaleIssues/PRs that have gone staletriage

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions