Skip to content

Does MERGE INTO operations support hidden partition on timestamp columns? #2765

@cduongvn

Description

@cduongvn

Hello,

We have noticed that transforming expressions on timestamp column in MERGE INTO queries don't get pushed down to table scan filter.

Query (published column is timestamp)

MERGE INTO iceberg.books.books_changeset_base_schema T USING books_changeset_base_schema595503 U ON
date(T.published) IN (date '1937-01-01', date '1925-01-01', date '1851-01-01') AND T.title = U.title
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *

Physical Plan

21/06/29 12:52:08 INFO BaseTableScan: Scanning table iceberg.books.books_changeset_base_schema snapshot 6883139844713745910 created at 2021-06-29 12:52:06.246 with filter true
21/06/29 12:52:10 INFO SparkWrite: Committing overwrite of 1 data files with 3 new data files, scanSnapshotId: 6883139844713745910, conflictDetectionFilter: true to table iceberg.books.books_changeset_base_schema

But the same query works when the published column is date

MERGE INTO iceberg.books.books_changeset_base_schema T USING books_changeset_base_schema595503 U ON
date(T.published) IN (date '1937-01-01', date '1925-01-01', date '1851-01-01') AND T.title = U.title
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *

Physical Plan

21/06/29 12:52:08 INFO BaseTableScan: Scanning table iceberg.books.books_changeset_base_schema snapshot 4766385065641761074 created at 2021-06-29 13:12:06.246 with filter ref(name="published") in (-12053, -16436, -43464)
21/06/29 12:52:10 INFO SparkWrite: Committing overwrite of 1 data files with 3 new data files, scanSnapshotId: 4766385065641761074, conflictDetectionFilter: ref(name="published") in (-12053, -16436, -43464) to table iceberg.books.books_changeset_base_schema

Is this expected behaviour from Iceberg?

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions