forked from datafold/data-diff
Closed
Labels
P0-critical, bug, triage
Description
Problem
_connect.py:298-303 catches the NotImplementedError raised by set_timezone_to_utc(), logs it only at DEBUG level, and proceeds normally. BigQuery and ClickHouse both raise here.
This is silent data corruption. When comparing timestamps across databases where one session is in a non-UTC timezone, the bisection algorithm produces wrong ranges and phantom diffs — or worse, masks real diffs.
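To make the failure mode concrete, the sketch below shows how one timestamp literal maps to two different instants depending on the session timezone. The UTC-5 offset and the sample values are illustrative assumptions, not taken from data-diff.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical session timezones: one pinned to UTC, one left at UTC-5.
utc = timezone.utc
utc_minus_5 = timezone(timedelta(hours=-5))

# The same timestamp literal, as each session would interpret it.
literal = datetime(2024, 1, 15, 12, 0, 0)
in_utc_session = literal.replace(tzinfo=utc)
in_local_session = literal.replace(tzinfo=utc_minus_5)

# The two sessions disagree about the instant by the full offset.
skew = in_local_session - in_utc_session
print(skew)  # 5:00:00

# A row written at 14:00 UTC lies outside a segment that ends at the literal
# in the UTC session, but inside the same segment in the UTC-5 session.
row = datetime(2024, 1, 15, 14, 0, 0, tzinfo=utc)
print(row < in_utc_session)    # False
print(row < in_local_session)  # True
```

A row that changes sides of a segment boundary like this is counted in different segments on each database, which is how phantom or masked diffs can arise.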
Scope
- BigQuery (bigquery.py:154): Implement using SET @@time_zone = 'UTC' session variable
- ClickHouse (clickhouse.py:98): Implement using SET session_timezone = 'UTC'
- _connect.py:298-303: Elevate the caught NotImplementedError log from DEBUG to WARNING. Consider adding a --require-utc flag that makes this an error.
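The connection-factory change could look roughly like the following. This is a minimal sketch, not the actual code in _connect.py: ensure_utc_session, TimezoneNotEnforcedError, the require_utc parameter, the logger name, and the two stand-in classes are all hypothetical names introduced for illustration. Only set_timezone_to_utc() and the NotImplementedError behaviour come from the issue.

```python
import logging

logger = logging.getLogger("data_diff_sketch")  # hypothetical logger name


class TimezoneNotEnforcedError(RuntimeError):
    """Hypothetical error raised when UTC is required but cannot be set."""


def ensure_utc_session(db, require_utc=False):
    """Try to pin the session to UTC; return True on success.

    When the driver does not implement the hook, either raise (if
    require_utc is set) or log at WARNING and return False.
    """
    try:
        db.set_timezone_to_utc()
    except NotImplementedError:
        if require_utc:
            raise TimezoneNotEnforcedError(
                f"{type(db).__name__} cannot set its session timezone to UTC"
            ) from None
        logger.warning(
            "%s cannot set its session timezone to UTC; "
            "timestamp comparisons may be unreliable",
            type(db).__name__,
        )
        return False
    return True


class NoTimezoneSupport:
    """Stand-in for a driver that has not implemented the hook."""

    def set_timezone_to_utc(self):
        raise NotImplementedError


class UtcCapable:
    """Stand-in for a driver that can pin its session to UTC."""

    def __init__(self):
        self.session_timezone = None

    def set_timezone_to_utc(self):
        self.session_timezone = "UTC"
```

Returning a boolean keeps the default path non-fatal, while a caller that maps --require-utc onto require_utc=True gets a hard failure instead of a log line.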
Key Files
- data_diff/databases/_connect.py:296-304
- data_diff/databases/bigquery.py:154
- data_diff/databases/clickhouse.py:98
Acceptance Criteria
- BigQuery set_timezone_to_utc() sets session to UTC
- ClickHouse set_timezone_to_utc() sets session to UTC
- Connection factory logs WARNING (not DEBUG) when timezone cannot be set
- Tests cover timezone normalization for both databases
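One possible shape for the per-database tests is sketched below. It assumes each driver issues its statement through a query() method, which is a guess about the driver interface. RecordingDatabase, BigQuerySketch, and ClickHouseSketch are hypothetical stand-ins, not the real data-diff classes. The SQL strings are the ones proposed in the Scope section. Whether those statements persist for later queries depends on how each driver manages sessions, which this sketch does not model.

```python
import unittest


class RecordingDatabase:
    """Hypothetical base class that records SQL instead of executing it."""

    def __init__(self):
        self.statements = []

    def query(self, sql):
        self.statements.append(sql)


class BigQuerySketch(RecordingDatabase):
    def set_timezone_to_utc(self):
        # Statement proposed in the issue for BigQuery.
        self.query("SET @@time_zone = 'UTC'")


class ClickHouseSketch(RecordingDatabase):
    def set_timezone_to_utc(self):
        # Statement proposed in the issue for ClickHouse.
        self.query("SET session_timezone = 'UTC'")


class TimezoneNormalizationTest(unittest.TestCase):
    def test_bigquery_sets_utc(self):
        db = BigQuerySketch()
        db.set_timezone_to_utc()
        self.assertEqual(db.statements, ["SET @@time_zone = 'UTC'"])

    def test_clickhouse_sets_utc(self):
        db = ClickHouseSketch()
        db.set_timezone_to_utc()
        self.assertEqual(db.statements, ["SET session_timezone = 'UTC'"])
```

These checks only confirm that the expected statement is issued. An integration test against a live session would still be needed to confirm the timezone actually takes effect.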