Avoid using Projections in generic Merge Into DMLs #15635

@hudi-bot

Description

Currently, MergeIntoHoodieTableCommand relies entirely on the semantics implemented by ExpressionPayload to insert and update records.

This is necessary in general, because MERGE INTO (MIT) semantics let users perform sophisticated, fine-grained updates (for example, partial updates). It is not necessary, however, in the most generic case:

 
{code:sql}
MERGE INTO target
USING ... source
ON target.id = source.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
{code}
This is essentially a SQL way of expressing an upsert: records that match an existing key in the table are updated, and all other records are inserted.
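
To make the equivalence concrete, here is a minimal sketch of what the statement above does, using plain Python lists of dictionaries keyed by `id`. The function name `upsert` and the record layout are illustrative assumptions, not Hudi APIs.

```python
def upsert(target, source, key="id"):
    """Return a new table where source rows replace matching target rows
    and non-matching source rows are appended."""
    merged = {row[key]: row for row in target}
    for row in source:
        # Matched key: the whole row is overwritten (UPDATE SET *).
        # Unmatched key: the row is added (INSERT *).
        merged[row[key]] = row
    return list(merged.values())


target = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
source = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
result = upsert(target, source)
```

Here `id` 2 is overwritten by the source row and `id` 3 is added, while `id` 1 is left untouched. No per-column expression needs to be evaluated.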

In this case there is no need to use ExpressionPayload at all: the statement can be handled by the normal Hudi upsert flow, which avoids the ExpressionPayload overhead entirely.
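
One way the dispatch could look is sketched below in Python. This is only an illustration of the idea, not the actual MergeIntoHoodieTableCommand logic: the `MergeClause` type, the `STAR` marker, and the functions `is_plain_upsert` and `choose_write_path` are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional

STAR = "*"  # hypothetical marker meaning "all columns" (UPDATE SET * / INSERT *)


@dataclass(frozen=True)
class MergeClause:
    """Hypothetical, simplified model of one WHEN ... THEN ... clause."""
    matched: bool                    # True for WHEN MATCHED, False for WHEN NOT MATCHED
    action: str                      # "update", "insert" or "delete"
    assignments: str = STAR          # STAR, or a description of explicit assignments
    condition: Optional[str] = None  # extra predicate, e.g. "source.ts > target.ts"


def is_plain_upsert(clauses):
    """True only for exactly one unconditional UPDATE SET * on match
    plus one unconditional INSERT * on no match."""
    if len(clauses) != 2:
        return False
    if any(c.condition is not None or c.assignments != STAR for c in clauses):
        return False
    shapes = {(c.matched, c.action) for c in clauses}
    return shapes == {(True, "update"), (False, "insert")}


def choose_write_path(clauses):
    """Pick the plain upsert flow when possible, else fall back to ExpressionPayload."""
    return "upsert" if is_plain_upsert(clauses) else "expression_payload"


generic = [MergeClause(True, "update"), MergeClause(False, "insert")]
partial = [MergeClause(True, "update", assignments="price = source.price"),
           MergeClause(False, "insert")]
conditional = [MergeClause(True, "update", condition="source.ts > target.ts"),
               MergeClause(False, "insert")]
```

With these inputs, `choose_write_path(generic)` selects the plain upsert flow, while the partial and conditional variants still fall back to ExpressionPayload. A real check would presumably also need to confirm that the ON condition is an equality on the record key and that the source columns line up with the target schema; both are omitted here.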
