Currently, MergeIntoHoodieTableCommand relies entirely on the semantics implemented by ExpressionPayload to insert/update records.
In general this is necessary, since MERGE INTO (MIT) semantics let users perform sophisticated, fine-grained updates (for example, partial updates). It is not necessary, however, in the most generic case:
{code:java}
MERGE INTO target
USING ... source
ON target.id = source.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *{code}
This is essentially a SQL way of expressing an upsert: records that match existing rows in the table are updated, and all others are inserted.
In this case there is no need to use ExpressionPayload at all. The normal Hudi upsert flow can handle it, avoiding all of the ExpressionPayload overhead.
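The check for this generic case could look roughly like the sketch below. It is illustrative only: `Clause` and `is_plain_upsert` are hypothetical names and do not correspond to anything in MergeIntoHoodieTableCommand, and the sketch assumes the ON condition is an equality on the record key, as in the example above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Clause:
    """Hypothetical, simplified view of one WHEN ... THEN ... clause."""
    action: str                      # "update", "insert", or "delete"
    condition: Optional[str] = None  # extra predicate, e.g. "source.ts > target.ts"
    star: bool = False               # True for UPDATE SET * / INSERT *


def is_plain_upsert(matched: Sequence[Clause], not_matched: Sequence[Clause]) -> bool:
    """Return True when the statement is the generic upsert shape:
    exactly one unconditional UPDATE SET * and one unconditional INSERT *."""
    if len(matched) != 1 or len(not_matched) != 1:
        return False
    update, insert = matched[0], not_matched[0]
    return (
        update.action == "update" and update.star and update.condition is None
        and insert.action == "insert" and insert.star and insert.condition is None
    )
```

Any statement for which this returns False (conditional clauses, explicit column assignments, deletes) would keep going through ExpressionPayload.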