This repository was archived by the owner on May 17, 2024. It is now read-only.

--dbt is stuck with --json flag #874

@harikaduyu

Describe the bug
I'm trying to get JSON output from a --dbt run that uses a state file. It works fine without the --json flag, but when I add --json, the run gets stuck and the process never finishes.
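A minimal sketch for confirming the hang without blocking a terminal: run the command from this report under a timeout and collect whatever JSON records were printed before it stalled. The helper name `run_with_timeout` and the 300-second limit are hypothetical, and the sketch assumes the JSON records arrive on stdout, one object per line, as in the output pasted in this report.

```python
import json
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (finished, records).

    finished is False if the process was still running after timeout_s
    seconds (subprocess.run kills it in that case).
    records holds the JSON objects parsed from stdout lines starting with '{'.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        finished, stdout = True, proc.stdout
    except subprocess.TimeoutExpired as exc:
        finished = False
        stdout = exc.stdout or ""
        # Partial output on timeout may be bytes even when text=True.
        if isinstance(stdout, bytes):
            stdout = stdout.decode(errors="replace")

    records = []
    for line in stdout.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip log lines mixed into the stream
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            pass  # ignore lines that only look like JSON
    return finished, records


# Example call with the flags from this report (the timeout is an assumption):
# cmd = ["data-diff", "--dbt", "--state", "prod-run-artifacts/manifest.json", "--json", "-d"]
# finished, records = run_with_timeout(cmd, timeout_s=300)
# print("finished:", finished, "records:", len(records))
```

If `finished` comes back `False` while `records` already contains the "failed" entries for models without primary keys, that would suggest the stall happens after those records are emitted, which matches the log in this report.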

Make sure to include the following (minus sensitive information):

  • The command or code you used

data-diff --dbt --state prod-run-artifacts/manifest.json --json -d

  • The run output + error you're getting (including the stack trace).
Running with data-diff=0.11.1
15:44:30 INFO     Parsing file dbt_project.yml                                                                                                                                                                                                                                                                                                                                                                                               dbt_parser.py:287
         INFO     Parsing file /dbt_project/target/manifest.json                                                                                                                                                                                                                                                                                                                                                                             dbt_parser.py:280
         INFO     Parsing file prod-run-artifacts/manifest.json                                                                                                                                                                                                                                                                                                                                                                              dbt_parser.py:280
         INFO     Parsing file target/run_results.json                                                                                                                                                                                                                                                                                                                                                                                       dbt_parser.py:253
         INFO     config: prod_database=None prod_schema=None prod_custom_schema=None datasource_id=None                                                                                                                                                                                                                                                                                                                                     dbt_parser.py:159
         INFO     Parsing file /dbt_project/profiles.yml                                                                                                                                                                                                                                                                                                                                                                                     dbt_parser.py:294
         DEBUG    Found no PKs                                                                                                                                                                                                                                                                                                                                                                                                               dbt_parser.py:465
{"status": "failed", "model": "model.dbt.bi_dagster_asset", "dataset1": ["data-prod", "prod_observability", "bi_dagster_asset"], "dataset2": ["data-prod", "dbt_pr_test_ci_observability", "bi_dagster_asset"], "error": "No primary key found. Add uniqueness tests, meta, or tags.", "version": "1.0.0"}
         DEBUG    Found PKs via Uniqueness tests [fct_tbl_info]: {'col_id'}                                                                                                                                                                                                                                                                                                                                                                  dbt_parser.py:459
         DEBUG    Found PKs via Uniqueness tests [int_table]: {'col_id'}                                                                                                                                                                                                                                                                                                                                                                     dbt_parser.py:459
         DEBUG    Found no PKs                                                                                                                                                                                                                                                                                                                                                                                                               dbt_parser.py:465
{"status": "failed", "model": "model.dbt.dim_latest_email_table", "dataset1": ["data-prod", "prod_schema", "dim_latest_email_table"], "dataset2": ["data-prod", "dbt_pr_test_ci_schema", "dim_latest_email_table"], "error": "No primary key found. Add uniqueness tests, meta, or tags.", "version": "1.0.0"}
         DEBUG    Database 'BigQuery(default_schema='dev', _interactive=False, is_closed=False, _dialect=Dialect(_prevent_overflow_when_concat=False), project='data-dev', dataset='dev', _client=<google.cloud.bigquery.client.Client object at 0x10xxxx>)' does not allow setting timezone. We recommend making sure it's set to 'UTC'.                                                                                                     _connect.py:300
         DEBUG    Database 'BigQuery(default_schema='dev', _interactive=False, is_closed=False, _dialect=Dialect(_prevent_overflow_when_concat=False), project='data-dev', dataset='dev', _client=<google.cloud.bigquery.client.Client object at 0x12xxxx>)' does not allow setting timezone. We recommend making sure it's set to 'UTC'.                                                                                                     _connect.py:300
         DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                                                                                         base.py:980
                  SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`prod_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'fct_tbl_info' AND table_schema = 'prod_schema'                                                                                                                                                                                                              
         DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table')                                                                                                                                                                                                                                                                                                                                                            base.py:980
                  SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`prod_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'int_table' AND table_schema = 'prod_schema'                                                                                                                                                                                                       
15:44:32 DEBUG    Running SQL (BigQuery): ('data-prod', 'dbt_pr_test_ci_schema', 'int_table')                                                                                                                                                                                                                                                                                                                                                  base.py:980
                  SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`dbt_pr_test_ci_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'int_table' AND table_schema = 'dbt_pr_test_ci_schema'                                                                                                                                                                                   
         DEBUG    Running SQL (BigQuery): ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                                                                               base.py:980
                  SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`dbt_pr_test_ci_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'fct_tbl_info' AND table_schema = 'dbt_pr_test_ci_schema'                                                                                                                                                                                          
15:44:33 DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table')                                                                                                                                                                                                                                                                                                                                                            base.py:980
                  SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`prod_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'int_table' AND table_schema = 'prod_schema'                                                                                                                                                                                                       
         DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                                                                                         base.py:980
                  SELECT column_name, data_type, 6 as datetime_precision, 38 as numeric_precision, 9 as numeric_scale FROM `data-prod`.`prod_schema`.INFORMATION_SCHEMA.COLUMNS WHERE table_name = 'fct_tbl_info' AND table_schema = 'prod_schema'                                                                                                                                                                                                              
         DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table')                                                                                                                                                                                                                                                                                                                                                             base.py:980
                  SELECT * FROM (SELECT TRIM(`sf_id`), TRIM(`col_name`), TRIM(`col_type`), TRIM(`col_mtd`), TRIM(`col_pl`) FROM `data-prod`.`prod_schema`.`int_table`) AS LIMITED_SELECT LIMIT 64                                                                                                                                                                                                                    
15:44:34 DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                                                                                          base.py:980
                  SELECT * FROM (SELECT TRIM(`unit`) FROM `data-prod`.`prod_schema`.`fct_tbl_info`) AS LIMITED_SELECT LIMIT 64                                                                                                                                                                                                                                                                                                                               
 [... output cut here because the text gets too long ...]
         DEBUG    Done collecting stats for table #2: ('data-prod', 'dbt_pr_test_ci_schema', 'int_table')                                                                                                                                                                                                                                                                                                                                       joindiff_tables.py:306
         DEBUG    Testing for null keys: ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table')                                                                                                                                                                                                                                                                                                       joindiff_tables.py:252
         DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table')                                                                                                                                                                                                                                                                                                                                                             base.py:980
                  SELECT `col_id` FROM `data-prod`.`prod_schema`.`int_table` WHERE (`col_id` IS NULL)                                                                                                                                                                                                                                                                                                                                              
         DEBUG    Done collecting stats for table #2: ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                                                                   joindiff_tables.py:306
         DEBUG    Testing for null keys: ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                                joindiff_tables.py:252
         DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                                                                                         base.py:980
                  SELECT `col_id` FROM `data-prod`.`prod_schema`.`fct_tbl_info` WHERE (`col_id` IS NULL)                                                                                                                                                                                                                                                                                                                                                       
15:44:38 DEBUG    Running SQL (BigQuery): ('data-prod', 'dbt_pr_test_ci_schema', 'int_table')                                                                                                                                                                                                                                                                                                                                                   base.py:980
                  SELECT `col_id` FROM `data-prod`.`dbt_pr_test_ci_schema`.`int_table` WHERE (`col_id` IS NULL)                                                                                                                                                                                                                                                                                                                                    
         DEBUG    Running SQL (BigQuery): ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                                                                               base.py:980
                  SELECT `col_id` FROM `data-prod`.`dbt_pr_test_ci_schema`.`fct_tbl_info` WHERE (`col_id` IS NULL)                                                                                                                                                                                                                                                                                                                                             
15:44:39 DEBUG    Counting exclusive rows: ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table')                                                                                                                                                                                                                                                                                                    joindiff_tables.py:372
         DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table')                                                                                                                                                                                                                                                                                                     base.py:980
                  SELECT count(*) FROM (SELECT * FROM (SELECT (`tmp2`.`col_id` IS NULL) AS `is_exclusive_a`, (`tmp1`.`col_id` IS NULL) AS `is_exclusive_b`, CASE WHEN `tmp1`.`col_id` is distinct from `tmp2`.`col_id` THEN 1 ELSE 0 END AS `is_diff_col_id`, CASE WHEN `tmp1`.`col_value` is distinct from `tmp2`.`col_value` THEN 1 ELSE 0 END AS `is_diff_col_value`, CASE WHEN `tmp1`.`org_id` is distinct from `tmp2`.`org_id` THEN 1 ELSE 0               
                  END AS `is_diff_org_id`, CASE WHEN `tmp1`.`col_combined_value` is distinct from `tmp2`.`col_combined_value` THEN 1 ELSE 0 END AS `is_diff_col_combined_value`, CASE WHEN `tmp1`.`col_mtd` is distinct from `tmp2`.`col_mtd` THEN 1 ELSE 0 END AS `is_diff_col_mtd`, CASE WHEN `tmp1`.`sf_id` is distinct from `tmp2`.`sf_id` THEN 1 ELSE 0 END AS `is_diff_sf_id`, CASE WHEN                                   
                  `tmp1`.`col_ch_prob` is distinct from `tmp2`.`col_ch_prob` THEN 1 ELSE 0 END AS `is_diff_col_ch_prob`, CASE WHEN `tmp1`.`col_name` is distinct from `tmp2`.`col_name` THEN 1 ELSE 0 END AS `is_diff_col_name`, CASE WHEN `tmp1`.`col_pl` is distinct from `tmp2`.`col_pl` THEN 1 ELSE 0 END AS `is_diff_col_pl`, CASE WHEN `tmp1`.`col_type` is distinct from `tmp2`.`col_type` THEN 1 ELSE 0 END AS                     
                  `is_diff_col_type`, cast(`tmp1`.`col_id` as string) AS `col_id_a`, cast(`tmp2`.`col_id` as string) AS `col_id_b`, format('%.11f', `tmp1`.`col_value`) AS `col_value_a`, format('%.11f', `tmp2`.`col_value`) AS `col_value_b`, cast(`tmp1`.`org_id` as string) AS `org_id_a`, cast(`tmp2`.`org_id` as string) AS `org_id_b`, format('%.11f', `tmp1`.`col_combined_value`) AS `col_combined_value_a`,             
                  format('%.11f', `tmp2`.`col_combined_value`) AS `col_combined_value_b`, cast(`tmp1`.`col_mtd` as string) AS `col_mtd_a`, cast(`tmp2`.`col_mtd` as string) AS `col_mtd_b`, cast(`tmp1`.`sf_id` as string) AS `sf_id_a`, cast(`tmp2`.`sf_id` as string) AS `sf_id_b`, format('%.11f', `tmp1`.`col_ch_prob`) AS `col_ch_prob_a`, format('%.11f', `tmp2`.`col_ch_prob`) AS             
                  `col_ch_prob_b`, cast(`tmp1`.`col_name` as string) AS `col_name_a`, cast(`tmp2`.`col_name` as string) AS `col_name_b`, cast(`tmp1`.`col_pl` as string) AS `col_pl_a`, cast(`tmp2`.`col_pl` as string) AS `col_pl_b`, cast(`tmp1`.`col_type` as string) AS `col_type_a`, cast(`tmp2`.`col_type` as string) AS `col_type_b` FROM                                                                                   
                  `data-prod`.`prod_schema`.`int_table` `tmp1` FULL OUTER JOIN `data-prod`.`dbt_pr_test_ci_schema`.`int_table` `tmp2` ON (`tmp1`.`col_id` = `tmp2`.`col_id`)) tmp3 WHERE ((`is_diff_col_id` = 1) OR (`is_diff_col_value` = 1) OR (`is_diff_org_id` = 1) OR (`is_diff_col_combined_value` = 1) OR (`is_diff_col_mtd` = 1) OR                
                  (`is_diff_sf_id` = 1) OR (`is_diff_col_ch_prob` = 1) OR (`is_diff_col_name` = 1) OR (`is_diff_col_pl` = 1) OR (`is_diff_col_type` = 1)) AND (`is_exclusive_a` OR `is_exclusive_b`)) tmp4                                                                                                                                                                                                                                                                       
         DEBUG    Counting exclusive rows: ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                              joindiff_tables.py:372
         DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                               base.py:980
                  SELECT count(*) FROM (SELECT * FROM (SELECT (`tmp2`.`col_id` IS NULL) AS `is_exclusive_a`, (`tmp1`.`col_id` IS NULL) AS `is_exclusive_b`, CASE WHEN `tmp1`.`col_id` is distinct from `tmp2`.`col_id` THEN 1 ELSE 0 END AS `is_diff_col_id`, CASE WHEN `tmp1`.`org_id` is distinct from `tmp2`.`org_id` THEN 1 ELSE 0 END AS `is_diff_org_id`, CASE WHEN `tmp1`.`is_error` is distinct from `tmp2`.`is_error` THEN 1 ELSE 0 END AS                     
                  `is_diff_is_error`, CASE WHEN `tmp1`.`is_inc_error` is distinct from `tmp2`.`is_inc_error` THEN 1 ELSE 0 END AS `is_diff_is_inc_error`, CASE WHEN `tmp1`.`is_non_inc_error` is distinct from `tmp2`.`is_non_inc_error` THEN 1 ELSE 0 END AS `is_diff_is_non_inc_error`, CASE WHEN `tmp1`.`created_at` is distinct from `tmp2`.`created_at` THEN 1 ELSE 0 END AS `is_diff_created_at`, CASE WHEN `tmp1`.`is_x` is distinct from                       
                  `tmp2`.`is_x` THEN 1 ELSE 0 END AS `is_diff_is_x`, CASE WHEN `tmp1`.`is_inc` is distinct from `tmp2`.`is_inc` THEN 1 ELSE 0 END AS `is_diff_is_inc`, CASE WHEN `tmp1`.`is_x_error` is distinct from `tmp2`.`is_x_error` THEN 1 ELSE 0 END AS `is_diff_is_x_error`, CASE WHEN `tmp1`.`unit` is distinct from `tmp2`.`unit` THEN 1 ELSE 0 END AS `is_diff_unit`, CASE WHEN                
                  `tmp1`.`is_missing_t` is distinct from `tmp2`.`is_missing_t` THEN 1 ELSE 0 END AS `is_diff_is_missing_t`, cast(`tmp1`.`col_id` as string) AS `col_id_a`, cast(`tmp2`.`col_id` as string) AS `col_id_b`, cast(`tmp1`.`org_id` as string) AS `org_id_a`, cast(`tmp2`.`org_id` as string) AS `org_id_b`, cast(cast(`tmp1`.`is_error` as int) as string) AS `is_error_a`, cast(cast(`tmp2`.`is_error` as              
                  int) as string) AS `is_error_b`, cast(cast(`tmp1`.`is_inc_error` as int) as string) AS `is_inc_error_a`, cast(cast(`tmp2`.`is_inc_error` as int) as string) AS `is_inc_error_b`, cast(cast(`tmp1`.`is_non_inc_error` as int) as string) AS `is_non_inc_error_a`, cast(cast(`tmp2`.`is_non_inc_error` as int) as string) AS `is_non_inc_error_b`, FORMAT_TIMESTAMP('%F %H:%M:%E6S', `tmp1`.`created_at`) AS `created_at_a`,                              
                  FORMAT_TIMESTAMP('%F %H:%M:%E6S', `tmp2`.`created_at`) AS `created_at_b`, cast(cast(`tmp1`.`is_x` as int) as string) AS `is_x_a`, cast(cast(`tmp2`.`is_x` as int) as string) AS `is_x_b`, cast(cast(`tmp1`.`is_inc` as int) as string) AS `is_inc_a`, cast(cast(`tmp2`.`is_inc` as int) as string) AS `is_inc_b`, cast(cast(`tmp1`.`is_x_error` as int) as string) AS                                      
                  `is_x_error_a`, cast(cast(`tmp2`.`is_x_error` as int) as string) AS `is_x_error_b`, cast(`tmp1`.`unit` as string) AS `unit_a`, cast(`tmp2`.`unit` as string) AS `unit_b`, cast(cast(`tmp1`.`is_missing_t` as int) as string) AS `is_missing_t_a`, cast(cast(`tmp2`.`is_missing_t` as int) as string) AS `is_missing_t_b` FROM                                                    
                  `data-prod`.`prod_schema`.`fct_tbl_info` `tmp1` FULL OUTER JOIN `data-prod`.`dbt_pr_test_ci_schema`.`fct_tbl_info` `tmp2` ON (`tmp1`.`col_id` = `tmp2`.`col_id`)) tmp3 WHERE ((`is_diff_col_id` = 1) OR (`is_diff_org_id` = 1) OR (`is_diff_is_error` = 1) OR (`is_diff_is_inc_error` = 1) OR (`is_diff_is_non_inc_error` = 1) OR (`is_diff_created_at` = 1)              
                  OR (`is_diff_is_x` = 1) OR (`is_diff_is_inc` = 1) OR (`is_diff_is_x_error` = 1) OR (`is_diff_unit` = 1) OR (`is_diff_is_missing_t` = 1)) AND (`is_exclusive_a` OR `is_exclusive_b`)) tmp4                                                                                                                                                                                                                                                      
15:44:40 DEBUG    Counting differences per column: ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table')                                                                                                                                                                                                                                                                                           joindiff_tables.py:346
         DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table')                                                                                                                                                                                                                                                                                                    base.py:980
                  SELECT sum(`is_diff_col_id`), sum(`is_diff_col_value`), sum(`is_diff_org_id`), sum(`is_diff_col_combined_value`), sum(`is_diff_col_mtd`), sum(`is_diff_sf_id`), sum(`is_diff_col_ch_prob`), sum(`is_diff_col_name`), sum(`is_diff_col_pl`), sum(`is_diff_col_type`) FROM (SELECT (`tmp2`.`col_id` IS NULL) AS `is_exclusive_a`, (`tmp1`.`col_id` IS NULL) AS `is_exclusive_b`, CASE WHEN                              
                  `tmp1`.`col_id` is distinct from `tmp2`.`col_id` THEN 1 ELSE 0 END AS `is_diff_col_id`, CASE WHEN `tmp1`.`col_value` is distinct from `tmp2`.`col_value` THEN 1 ELSE 0 END AS `is_diff_col_value`, CASE WHEN `tmp1`.`org_id` is distinct from `tmp2`.`org_id` THEN 1 ELSE 0 END AS `is_diff_org_id`, CASE WHEN `tmp1`.`col_combined_value` is distinct from `tmp2`.`col_combined_value` THEN 1 ELSE 0 END AS                         
                  `is_diff_col_combined_value`, CASE WHEN `tmp1`.`col_mtd` is distinct from `tmp2`.`col_mtd` THEN 1 ELSE 0 END AS `is_diff_col_mtd`, CASE WHEN `tmp1`.`sf_id` is distinct from `tmp2`.`sf_id` THEN 1 ELSE 0 END AS `is_diff_sf_id`, CASE WHEN `tmp1`.`col_ch_prob` is distinct from `tmp2`.`col_ch_prob` THEN 1 ELSE 0 END AS `is_diff_col_ch_prob`, CASE WHEN `tmp1`.`col_name` is distinct             
                  from `tmp2`.`col_name` THEN 1 ELSE 0 END AS `is_diff_col_name`, CASE WHEN `tmp1`.`col_pl` is distinct from `tmp2`.`col_pl` THEN 1 ELSE 0 END AS `is_diff_col_pl`, CASE WHEN `tmp1`.`col_type` is distinct from `tmp2`.`col_type` THEN 1 ELSE 0 END AS `is_diff_col_type`, cast(`tmp1`.`col_id` as string) AS `col_id_a`, cast(`tmp2`.`col_id` as string) AS `col_id_b`, format('%.11f', `tmp1`.`col_value`) AS            
                  `col_value_a`, format('%.11f', `tmp2`.`col_value`) AS `col_value_b`, cast(`tmp1`.`org_id` as string) AS `org_id_a`, cast(`tmp2`.`org_id` as string) AS `org_id_b`, format('%.11f', `tmp1`.`col_combined_value`) AS `col_combined_value_a`, format('%.11f', `tmp2`.`col_combined_value`) AS `col_combined_value_b`, cast(`tmp1`.`col_mtd` as string) AS `col_mtd_a`,                                     
                  cast(`tmp2`.`col_mtd` as string) AS `col_mtd_b`, cast(`tmp1`.`sf_id` as string) AS `sf_id_a`, cast(`tmp2`.`sf_id` as string) AS `sf_id_b`, format('%.11f', `tmp1`.`col_ch_prob`) AS `col_ch_prob_a`, format('%.11f', `tmp2`.`col_ch_prob`) AS `col_ch_prob_b`, cast(`tmp1`.`col_name` as string) AS `col_name_a`, cast(`tmp2`.`col_name` as string) AS `col_name_b`,                        
                  cast(`tmp1`.`col_pl` as string) AS `col_pl_a`, cast(`tmp2`.`col_pl` as string) AS `col_pl_b`, cast(`tmp1`.`col_type` as string) AS `col_type_a`, cast(`tmp2`.`col_type` as string) AS `col_type_b` FROM `data-prod`.`prod_schema`.`int_table` `tmp1` FULL OUTER JOIN                                                                                                             
                  `data-prod`.`dbt_pr_test_ci_schema`.`int_table` `tmp2` ON (`tmp1`.`col_id` = `tmp2`.`col_id`)) tmp3 WHERE ((`is_diff_col_id` = 1) OR (`is_diff_col_value` = 1) OR (`is_diff_org_id` = 1) OR (`is_diff_col_combined_value` = 1) OR (`is_diff_col_mtd` = 1) OR (`is_diff_sf_id` = 1) OR (`is_diff_col_ch_prob` = 1) OR (`is_diff_col_name` = 1) OR                             
                  (`is_diff_col_pl` = 1) OR (`is_diff_col_type` = 1))                                                                                                                                                                                                                                                                                                                                                                                                                               
         DEBUG    Counting differences per column: ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                   joindiff_tables.py:346
         DEBUG    Running SQL (BigQuery): ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                            base.py:980
                  SELECT sum(`is_diff_col_id`), sum(`is_diff_org_id`), sum(`is_diff_is_error`), sum(`is_diff_is_inc_error`), sum(`is_diff_is_non_inc_error`), sum(`is_diff_created_at`), sum(`is_diff_is_x`), sum(`is_diff_is_inc`), sum(`is_diff_is_x_error`), sum(`is_diff_unit`), sum(`is_diff_is_missing_t`) FROM (SELECT (`tmp2`.`col_id` IS NULL) AS `is_exclusive_a`, (`tmp1`.`col_id` IS NULL) AS                            
                  `is_exclusive_b`, CASE WHEN `tmp1`.`col_id` is distinct from `tmp2`.`col_id` THEN 1 ELSE 0 END AS `is_diff_col_id`, CASE WHEN `tmp1`.`org_id` is distinct from `tmp2`.`org_id` THEN 1 ELSE 0 END AS `is_diff_org_id`, CASE WHEN `tmp1`.`is_error` is distinct from `tmp2`.`is_error` THEN 1 ELSE 0 END AS `is_diff_is_error`, CASE WHEN `tmp1`.`is_inc_error` is distinct from `tmp2`.`is_inc_error` THEN 1 ELSE 0 END AS                         
                  `is_diff_is_inc_error`, CASE WHEN `tmp1`.`is_non_inc_error` is distinct from `tmp2`.`is_non_inc_error` THEN 1 ELSE 0 END AS `is_diff_is_non_inc_error`, CASE WHEN `tmp1`.`created_at` is distinct from `tmp2`.`created_at` THEN 1 ELSE 0 END AS `is_diff_created_at`, CASE WHEN `tmp1`.`is_x` is distinct from `tmp2`.`is_x` THEN 1 ELSE 0 END AS `is_diff_is_x`, CASE WHEN `tmp1`.`is_inc` is distinct from                    
                  `tmp2`.`is_inc` THEN 1 ELSE 0 END AS `is_diff_is_inc`, CASE WHEN `tmp1`.`is_x_error` is distinct from `tmp2`.`is_x_error` THEN 1 ELSE 0 END AS `is_diff_is_x_error`, CASE WHEN `tmp1`.`unit` is distinct from `tmp2`.`unit` THEN 1 ELSE 0 END AS `is_diff_unit`, CASE WHEN `tmp1`.`is_missing_t` is distinct from `tmp2`.`is_missing_t` THEN 1 ELSE 0 END AS                                       
                  `is_diff_is_missing_t`, cast(`tmp1`.`col_id` as string) AS `col_id_a`, cast(`tmp2`.`col_id` as string) AS `col_id_b`, cast(`tmp1`.`org_id` as string) AS `org_id_a`, cast(`tmp2`.`org_id` as string) AS `org_id_b`, cast(cast(`tmp1`.`is_error` as int) as string) AS `is_error_a`, cast(cast(`tmp2`.`is_error` as int) as string) AS `is_error_b`, cast(cast(`tmp1`.`is_inc_error` as int) as string) AS                        
                  `is_inc_error_a`, cast(cast(`tmp2`.`is_inc_error` as int) as string) AS `is_inc_error_b`, cast(cast(`tmp1`.`is_non_inc_error` as int) as string) AS `is_non_inc_error_a`, cast(cast(`tmp2`.`is_non_inc_error` as int) as string) AS `is_non_inc_error_b`, FORMAT_TIMESTAMP('%F %H:%M:%E6S', `tmp1`.`created_at`) AS `created_at_a`, FORMAT_TIMESTAMP('%F %H:%M:%E6S', `tmp2`.`created_at`) AS `created_at_b`,                                                
                  cast(cast(`tmp1`.`is_x` as int) as string) AS `is_x_a`, cast(cast(`tmp2`.`is_x` as int) as string) AS `is_x_b`, cast(cast(`tmp1`.`is_inc` as int) as string) AS `is_inc_a`, cast(cast(`tmp2`.`is_inc` as int) as string) AS `is_inc_b`, cast(cast(`tmp1`.`is_x_error` as int) as string) AS `is_x_error_a`, cast(cast(`tmp2`.`is_x_error` as int) as string) AS                  
                  `is_x_error_b`, cast(`tmp1`.`unit` as string) AS `unit_a`, cast(`tmp2`.`unit` as string) AS `unit_b`, cast(cast(`tmp1`.`is_missing_t` as int) as string) AS `is_missing_t_a`, cast(cast(`tmp2`.`is_missing_t` as int) as string) AS `is_missing_t_b` FROM `data-prod`.`prod_schema`.`fct_tbl_info` `tmp1` FULL OUTER JOIN                                     
                  `data-prod`.`dbt_pr_test_ci_schema`.`fct_tbl_info` `tmp2` ON (`tmp1`.`col_id` = `tmp2`.`col_id`)) tmp3 WHERE ((`is_diff_col_id` = 1) OR (`is_diff_org_id` = 1) OR (`is_diff_is_error` = 1) OR (`is_diff_is_inc_error` = 1) OR (`is_diff_is_non_inc_error` = 1) OR (`is_diff_created_at` = 1) OR (`is_diff_is_x` = 1) OR (`is_diff_is_inc` = 1) OR (`is_diff_is_x_error` = 1)            
                  OR (`is_diff_unit` = 1) OR (`is_diff_is_missing_t` = 1))                                                                                                                                                                                                                                                                                                                                                                                                                      
15:44:41 INFO     Diffing complete: ('data-prod', 'prod_schema', 'int_table') <> ('data-prod', 'dbt_pr_test_ci_schema', 'int_table')                                                                                                                                                                                                                                                                                                      joindiff_tables.py:165
{"status": "success", "result": "identical", "model": "model.dbt.int_table", "dataset1": ["data-prod", "prod_schema", "int_table"], "dataset2": ["data-prod", "dbt_pr_test_ci_schema", "int_table"], "rows": {"exclusive": {"dataset1": [], "dataset2": []}, "diff": []}, "summary": {"rows": {"total": {"dataset1": 27346, "dataset2": 27346}, 
"exclusive": {"dataset1": 0, "dataset2": 0}, "updated": 0, "unchanged": 27346}, "stats": {"diffCounts": {"col_value": 0, "org_id": 0, "col_combined_value": 0, "col_mtd": 0, "sf_id": 0, "col_ch_prob": 0, "col_name": 0, "col_pl": 0, "col_type": 0}}}, "columns": {"dataset1": [{"name": "col_id", "type": "INT64", "kind": "integer"}, {"name": "org_id", "type": "INT64", "kind": "integer"}, {"name": "sf_id", "type": 
"STRING", "kind": "unsupported"}, {"name": "col_name", "type": "STRING", "kind": "unsupported"}, {"name": "col_type", "type": "STRING", "kind": "unsupported"}, {"name": "col_value", "type": "FLOAT64", "kind": "float"}, {"name": "col_combined_value", "type": "FLOAT64", "kind": "float"}, {"name": "col_mtd", "type": "STRING", "kind": "unsupported"}, {"name": "col_pl", "type": "STRING", "kind": "unsupported"}, {"name": "col_ch_prob", "type": "FLOAT64", 
"kind": "float"}], "dataset2": [{"name": "col_id", "type": "INT64", "kind": "integer"}, {"name": "org_id", "type": "INT64", "kind": "integer"}, {"name": "sf_id", "type": "STRING", "kind": "unsupported"}, {"name": "col_name", "type": "STRING", "kind": "unsupported"}, {"name": "col_type", "type": "STRING", "kind": "unsupported"}, {"name": "col_value", "type": "FLOAT64", "kind": "float"}, {"name": "col_combined_value", "type": "FLOAT64", "kind": "float"}, 
{"name": "col_mtd", "type": "STRING", "kind": "unsupported"}, {"name": "col_pl", "type": "STRING", "kind": "unsupported"}, {"name": "col_ch_prob", "type": "FLOAT64", "kind": "float"}], "primaryKey": ["col_id"], "exclusive": {"dataset1": [], "dataset2": []}, "typeChanged": []}, "version": "1.1.0"}
15:45:22 INFO     Diffing complete: ('data-prod', 'prod_schema', 'fct_tbl_info') <> ('data-prod', 'dbt_pr_test_ci_schema', 'fct_tbl_info')                                                                                                                                                                                                                                                                                               joindiff_tables.py:165
⠼ 
In Progress fct_tbl_info
In Progress int_table
Diffing complete: ('data-prod', 'prod_schema', 'fct_tbl

The last line is also truncated; it is cut off before the table name is fully printed.
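To check which models emitted a JSON result before the process hung, the JSON objects can be separated from the surrounding log lines in a captured copy of the output. The following is only a minimal sketch, not part of data-diff: `extract_json_objects` is a hypothetical helper, and it assumes any line wrapping inside a JSON object falls between tokens, as it does in the output above.

```python
import json


def extract_json_objects(text):
    """Yield each JSON object found in mixed log output (hypothetical helper)."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # Look for the next candidate start of a JSON object.
        start = text.find("{", pos)
        if start == -1:
            return
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns the value plus the index just past its end.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not valid JSON here (e.g. a truncated object); move on.
            pos = start + 1
            continue
        yield obj
        pos = end
```

For example, after redirecting the output of the command above to a file, `[o["model"] for o in extract_json_objects(captured)]` would list the models that reported a result, where `captured` is the file's contents. In the log above, only `model.dbt.int_table` printed a JSON result; `fct_tbl_info` logged "Diffing complete" but no JSON followed.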

Describe the environment

I'm using macOS 14.3.1

Labels: bug, triage
