Support MERGE INTO #20746

@wirybeaver

Is your feature request related to a problem or challenge?

The goal is to support MERGE INTO SQL statements in DataFusion so that downstream table providers (specifically Iceberg via iceberg-rust) can implement merge logic. The iceberg-rust repo (feature/merge-into branch) already has a merge_into function on its DataFusion table provider that expects DataFusion to parse MERGE INTO SQL and invoke a merge_into hook on TableProvider.

Example SQL:

MERGE INTO target_table t
USING source_table s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.value = s.value
WHEN NOT MATCHED THEN INSERT (id, value) VALUES (s.id, s.value)
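
To make the requested hook concrete, here is a minimal, dependency-free sketch of an opt-in merge hook with a "not supported" default. Everything in it is an assumption for illustration: `MergeTarget` is a stand-in trait (not DataFusion's `TableProvider`), and the `merge_into` signature, the string-typed arguments, and the returned row count are hypothetical. A real hook would presumably be async and return an execution plan.

```rust
// Stand-in trait for illustration only; this is NOT DataFusion's `TableProvider`.
trait MergeTarget {
    fn name(&self) -> &str;

    /// Hypothetical hook. Providers that cannot merge inherit this default,
    /// so existing providers keep compiling and fail with a clear error.
    fn merge_into(&self, _on: &str, _when_clauses: &[&str]) -> Result<u64, String> {
        Err(format!("MERGE INTO is not supported by table '{}'", self.name()))
    }
}

/// A provider that does not override the hook.
struct ReadOnlyTable;

impl MergeTarget for ReadOnlyTable {
    fn name(&self) -> &str {
        "read_only"
    }
}

/// A provider that opts in by overriding the hook.
struct MergeableTable;

impl MergeTarget for MergeableTable {
    fn name(&self) -> &str {
        "mergeable"
    }

    fn merge_into(&self, _on: &str, when_clauses: &[&str]) -> Result<u64, String> {
        // A real provider would join source and target on `_on` and apply each
        // clause. This sketch only reports how many clauses it received.
        Ok(when_clauses.len() as u64)
    }
}
```

With a default implementation, only providers that can actually merge (for example an Iceberg-backed one) need to override the method.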

Describe the solution you'd like

The implementation follows the same pattern as the UPDATE/DELETE DML hooks (PR #19142): reuse the existing DmlStatement logical plan node and add a new WriteOp::MergeInto(MergeIntoOp) variant. The merge-specific data (ON condition, WHEN clauses) is carried in MergeIntoOp. DmlStatement.input holds the source plan (the USING clause), and DmlStatement.target holds the target table.
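
A rough sketch of how the example statement could map onto that shape follows. It is self-contained and uses simplified stand-ins rather than DataFusion's real types: `Expr` is a plain `String`, `DmlStatement` keeps only the three fields named above, and `WriteOp` lists just a few variants. The layout of `MergeIntoOp`, `MergeClause`, and `MergeAction` is hypothetical and may differ from the PoC branch.

```rust
/// Stand-in for DataFusion's `Expr`; plain strings keep the sketch dependency-free.
type Expr = String;

/// Hypothetical action attached to a WHEN clause.
#[derive(Debug, Clone, PartialEq)]
#[allow(dead_code)]
enum MergeAction {
    Update { assignments: Vec<(String, Expr)> },
    Insert { columns: Vec<String>, values: Vec<Expr> },
    Delete,
}

/// Hypothetical WHEN clause: `matched` is true for WHEN MATCHED,
/// false for WHEN NOT MATCHED.
#[derive(Debug, Clone, PartialEq)]
struct MergeClause {
    matched: bool,
    action: MergeAction,
}

/// Merge-specific data: the ON condition and the WHEN clauses.
#[derive(Debug, Clone, PartialEq)]
struct MergeIntoOp {
    on: Expr,
    clauses: Vec<MergeClause>,
}

/// Simplified stand-in for `WriteOp`, extended with the proposed variant.
#[derive(Debug, Clone, PartialEq)]
#[allow(dead_code)]
enum WriteOp {
    Delete,
    Update,
    MergeInto(MergeIntoOp),
}

/// Simplified stand-in for `DmlStatement`: target table, operation, source plan.
#[derive(Debug, Clone, PartialEq)]
struct DmlStatement {
    target: String,
    op: WriteOp,
    input: String,
}

/// Builds the plan node for the example MERGE statement in this issue.
fn example_merge() -> DmlStatement {
    let update = MergeClause {
        matched: true,
        action: MergeAction::Update {
            assignments: vec![("value".to_string(), "s.value".to_string())],
        },
    };
    let insert = MergeClause {
        matched: false,
        action: MergeAction::Insert {
            columns: vec!["id".to_string(), "value".to_string()],
            values: vec!["s.id".to_string(), "s.value".to_string()],
        },
    };
    DmlStatement {
        target: "target_table".to_string(),
        op: WriteOp::MergeInto(MergeIntoOp {
            on: "t.id = s.id".to_string(),
            clauses: vec![update, insert],
        }),
        input: "source_table".to_string(),
    }
}
```

Keeping the ON condition and WHEN clauses inside the `WriteOp` variant means the plan node itself needs no new fields, which matches the reuse described above.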

sqlparser v0.61.0 already parses MERGE statements into Statement::Merge, so no parser work is needed.

The following tasks are already implemented in the PoC branch as 3 commits. I plan to raise the PRs one after another, since the fork repo doesn't support stacked PRs.
