-
Notifications
You must be signed in to change notification settings - Fork 3k
Closed as not planned
Closed as not planned
Copy link
Labels
Description
Hello,
We have noticed that transforming expressions on timestamp column in MERGE INTO queries don't get pushed down to table scan filter.
Query (published column is timestamp)
MERGE INTO iceberg.books.books_changeset_base_schema T USING books_changeset_base_schema595503 U ON
date(T.published) IN (date '1937-01-01', date '1925-01-01', date '1851-01-01') AND T.title = U.title
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
Physical Plan
21/06/29 12:52:08 INFO BaseTableScan: Scanning table iceberg.books.books_changeset_base_schema snapshot 6883139844713745910 created at 2021-06-29 12:52:06.246 with filter true
21/06/29 12:52:10 INFO SparkWrite: Committing overwrite of 1 data files with 3 new data files, scanSnapshotId: 6883139844713745910, conflictDetectionFilter: true to table iceberg.books.books_changeset_base_schema
But the same query works when the published column is date
MERGE INTO iceberg.books.books_changeset_base_schema T USING books_changeset_base_schema595503 U ON
date(T.published) IN (date '1937-01-01', date '1925-01-01', date '1851-01-01') AND T.title = U.title
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
Physical Plan
21/06/29 12:52:08 INFO BaseTableScan: Scanning table iceberg.books.books_changeset_base_schema snapshot 4766385065641761074 created at 2021-06-29 13:12:06.246 with filter ref(name="published") in (-12053, -16436, -43464)
21/06/29 12:52:10 INFO SparkWrite: Committing overwrite of 1 data files with 3 new data files, scanSnapshotId: 4766385065641761074, conflictDetectionFilter: ref(name="published") in (-12053, -16436, -43464) to table iceberg.books.books_changeset_base_schema
Is this expected behaviour from Iceberg?
Reactions are currently unavailable