Fixes #21953, #23338, #27380: upgrade collate-sqllineage to >=2.1.1 with regression tests #27413
Pull request overview
Upgrades the ingestion SQL lineage dependency (collate-sqllineage) to a newer minimum version and expands the unit test suite to validate newly-fixed lineage parsing behaviors across dialects/parsers.
Changes:
- Bump `collate-sqllineage` minimum version from `>=2.0.2` to `>=2.1.1`.
- Unskip/adjust existing lineage tests where 2.1.1 fixes parser behavior (notably CTE column lineage and ClickHouse CTAS+CTEs).
- Add new regression tests covering ClickHouse CTAS patterns, BigQuery CLONE with digit-starting identifiers, and additional Snowflake COPY INTO stage/table patterns.
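To make the new version floor concrete, here is a minimal sketch of what `>=2.1.1` means for plain numeric releases. The helper names `parse_version` and `meets_minimum` are hypothetical and are not part of this PR or of pip; real specifier handling (pre-releases, local versions) is more involved and is normally left to the packaging tooling.

```python
def parse_version(version: str) -> tuple:
    """Turn a plain dotted release such as '2.1.1' into (2, 1, 1).

    Assumption: only numeric, dot-separated releases are handled here.
    """
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed: str, minimum: str = "2.1.1") -> bool:
    """Return True when `installed` satisfies the `>=minimum` floor."""
    return parse_version(installed) >= parse_version(minimum)


# The old floor no longer qualifies; the new one and later releases do.
print(meets_minimum("2.0.2"))
print(meets_minimum("2.1.1"))
print(meets_minimum("2.10.0"))
```

Comparing integer tuples rather than strings avoids the classic mistake of ordering `2.10.0` before `2.1.1`.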
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `ingestion/setup.py` | Updates base dependency to require `collate-sqllineage>=2.1.1`. |
| `ingestion/tests/unit/lineage/test_sql_lineage.py` | Removes the skip on the CTE column-lineage test now that the parser behavior is fixed. |
| `ingestion/tests/unit/lineage/queries/test_specific_dialect_queries.py` | Cleans up skip annotations and adds multiple new cross-parser regression tests for dialect-specific lineage. |
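The idea behind a cross-parser regression test with per-parser opt-outs (such as `test_sqlfluff=False`) can be sketched roughly as follows. This is an illustration only: `check_lineage_across_parsers` and the stand-in parsers are hypothetical, and the repository's actual test helpers may be shaped differently.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical parser signature: SQL text -> (source tables, target tables).
Parser = Callable[[str], Tuple[Set[str], Set[str]]]


def check_lineage_across_parsers(
    sql: str,
    expected_sources: Set[str],
    expected_targets: Set[str],
    parsers: Dict[str, Parser],
    **enabled: bool,
) -> List[str]:
    """Assert lineage for each enabled parser and return the names checked.

    A parser is skipped only when its `test_<name>` flag is passed as False;
    parsers without an explicit flag are treated as enabled.
    """
    checked = []
    for name, parse in parsers.items():
        if not enabled.get(f"test_{name}", True):
            continue  # opt out of this one parser, not the whole test
        sources, targets = parse(sql)
        assert sources == expected_sources, f"{name}: sources {sources}"
        assert targets == expected_targets, f"{name}: targets {targets}"
        checked.append(name)
    return checked


# Stand-in parsers so the sketch runs without collate-sqllineage installed.
def _correct(sql: str) -> Tuple[Set[str], Set[str]]:
    return ({"src"}, {"dst"})


def _misses_source(sql: str) -> Tuple[Set[str], Set[str]]:
    return (set(), {"dst"})  # simulates a parser that drops the source table


PARSERS = {"sqlglot": _correct, "sqlparse": _correct, "sqlfluff": _misses_source}

checked = check_lineage_across_parsers(
    "INSERT INTO dst SELECT * FROM src",
    {"src"},
    {"dst"},
    PARSERS,
    test_sqlfluff=False,
)
print(checked)
```

Disabling a single parser keeps the other parsers under test, which is the advantage of a targeted flag over a blanket skip.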
🟡 Playwright Results: all passed (17 flaky)
✅ 3664 passed · ❌ 0 failed · 🟡 17 flaky · ⏭️ 89 skipped
🟡 17 flaky test(s) (passed on retry)
How to debug locally:

    # Download playwright-test-results-<shard> artifact and unzip
    npx playwright show-trace path/to/trace.zip # view trace
Code Review ✅ Approved
Upgrades collate-sqllineage to version 2.1.1 to resolve multiple reported issues and includes new regression tests. No issues found.
Failed to cherry-pick changes to the 1.12.6 branch.



Describe your changes:
Fixes #21953
Fixes #23338
Fixes #27380

Upgrades `collate-sqllineage` minimum version from `>=2.0.2` to `>=2.1.1` (release) and validates the release with an expanded unit test suite.

Parser fixes unlocked by 2.1.1:
- `test_populate_column_lineage_map_ctes` was previously skipped because SqlGlot failed to propagate column lineage through CTEs. Skip removed.
- `test_clickhouse_create_table_with_ctes`.

New regression tests (18 → 28):
- `test_ctas_union_all_inside_cte_column_lineage`: CTAS where a CTE wraps a UNION ALL; column lineage flows from both union branches to the write target
- `test_clickhouse_ctas_engine_union_all_not_in`: ClickHouse CTAS with `ENGINE =`, CTEs, UNION ALL, and NOT IN subquery (dbt + Clickhouse incorrect process column lineage for the final model #21953)
- `test_bigquery_clone_table_with_digit_starting_name`: BigQuery `CLONE` where the source identifier starts with a digit (GCP BQ Data Lineage doesn't work if tables cloned between different dataset. #23338)
- `test_snowflake_copy_into_table_from_stage`: COPY INTO table from `@stage`
- `test_snowflake_copy_into_stage_from_table`: COPY INTO `@stage` from table
- `test_snowflake_copy_into_stage_from_select`: COPY INTO `@stage` from SELECT subquery
- `test_snowflake_copy_into_fully_qualified_stage`: COPY INTO with fully-qualified stage path
- `test_snowflake_copy_into_table_with_column_list_from_stage_subquery`: COPY INTO with explicit column list and `$1:field` positional syntax (Snowflake Stage → Table lineage not generated for certain COPY INTO patterns #27380)
- `test_snowflake_copy_into_stage_subpath_with_external_file_format`: stage subpath with external named FILE_FORMAT (Snowflake Stage → Table lineage not generated for certain COPY INTO patterns #27380)
- `test_snowflake_copy_into_stage_subpath_date_partitioned`: date-partitioned stage subpath (`/YYYY/MM/DD/file.csv`) stripped to stage root (Snowflake Stage → Table lineage not generated for certain COPY INTO patterns #27380)

Skip cleanup:
- `test_complex_postgres_view`: replaced broad `@pytest.mark.skip` with targeted `test_sqlfluff=False`; SqlGlot and SqlParse extract correct column lineage, only SqlFluff is intermittently flaky on deeply nested UNION ALL (~5% of runs)
- `test_postgres_copy_with_jsonb_to_target`: removed `test_sqlglot=False` and `test_sqlparse=False`; all 3 parsers now handle `COPY FROM` correctly

Type of change:
Checklist:
Fixes <issue-number>: <short explanation>
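
The date-partitioned COPY INTO case in the description expects a stage subpath such as `/YYYY/MM/DD/file.csv` to be reduced to the stage root before lineage is recorded. The sketch below illustrates that expected normalization only; `stage_root` is a hypothetical helper written for this example and is not how collate-sqllineage implements it. It assumes an unquoted stage reference whose name contains no `/`.

```python
def stage_root(stage_path: str) -> str:
    """Return the stage reference with any file subpath removed.

    Assumption: the stage name itself is unquoted and contains no '/',
    so everything from the first '/' onward is a path inside the stage.
    """
    if not stage_path.startswith("@"):
        raise ValueError(f"not a stage reference: {stage_path!r}")
    root, _, _subpath = stage_path.partition("/")
    return root


# Date-partitioned subpath collapses to the stage itself.
print(stage_root("@my_stage/2024/01/15/file.csv"))
# A fully-qualified stage keeps its database and schema qualifiers.
print(stage_root("@db.schema.my_stage/exports/data.csv"))
# A bare stage reference is returned unchanged.
print(stage_root("@my_stage"))
```

Collapsing to the stage root means every file loaded from the same stage maps to one lineage node instead of one node per dated path.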