[HUDI-1884] MergeInto Support Partial Update For COW #3154
Codecov Report
@@ Coverage Diff @@
## master #3154 +/- ##
============================================
- Coverage 47.82% 47.81% -0.01%
+ Complexity 5568 5566 -2
============================================
Files 936 936
Lines 41629 41655 +26
Branches 4188 4195 +7
============================================
+ Hits 19907 19916 +9
- Misses 19954 19968 +14
- Partials 1768 1771 +3
// e.g. If the update action missing 'id' attribute, we fill a "id = target.id" to the update action.
val newAssignments = removeMetaFields(target.output)
.map(attr => {
// TODO support partial update for MOR.
For partial update, do we need another payload like the one in #2666?
No, we can support partial update for COW tables with the ExpressionPayload added in the SQL support.
@umehrot2 @zhedoubushishi @rmpifer can one of you please review this?
63cf961 to a9cc94f
@umehrot2 @xiarixiaoyao can you take a look at this PR?
@@ -102,7 +103,7 @@ case class HoodieResolveReferences(sparkSession: SparkSession) extends Rule[Logi

def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperatorsUp {
// Resolve merge into
- case MergeIntoTable(target, source, mergeCondition, matchedActions, notMatchedActions)
+ case mergeInto@MergeIntoTable(target, source, mergeCondition, matchedActions, notMatchedActions)
Maybe better to keep the style used in Spark: case mergeInto @ MergeIntoTable(target, source, mergeCondition, matchedActions, notMatchedActions)
Yes, will fix this formatting.
s" for MOR table. Currently we cannot support partial update for MOR," +
s" please complete all the target fields just like '...update set id = s0.id, name = s0.name ....'")
}
if (preCombineField.isDefined && preCombineField.get.equalsIgnoreCase(attr.name)
Why should we specify a value for the preCombineField in the merge-into update action? Maybe we can delete this check.
If the table has defined a preCombineField, we must know how the preCombineField in the target table maps to a field in the source. So the update action must specify a value for the preCombineField; otherwise we cannot read the preCombineField value from the source.
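The rule described in this reply can be sketched roughly as follows. This is a Spark-free illustration, not the code in this PR: the object name `PreCombineCheckSketch`, the helper `missingPreCombine`, and the error wording are all hypothetical, and column names are assumed to be compared case-insensitively, as the `equalsIgnoreCase` call in the diff suggests.

```scala
// Hypothetical sketch: names and message text are illustrative, not Hudi's API.
object PreCombineCheckSketch {

  /** Returns an error message when the table defines a preCombineField
    * but the update action assigns no value to it; otherwise None. */
  def missingPreCombine(
      preCombineField: Option[String],
      assignedColumns: Seq[String]): Option[String] =
    preCombineField
      // Keep the field only if no assignment targets it (case-insensitive match).
      .filterNot(f => assignedColumns.exists(_.equalsIgnoreCase(f)))
      .map(f => s"Missing assignment for preCombineField '$f' in the update action")

  def main(args: Array[String]): Unit = {
    // 'ts' is the preCombineField, but the update action only sets 'price'.
    println(missingPreCombine(Some("ts"), Seq("price")))
    // The update action sets 'ts', so there is nothing to report.
    println(missingPreCombine(Some("ts"), Seq("price", "ts")))
  }
}
```

A table without a preCombineField (`None`) passes the check regardless of which columns are assigned.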
LGTM
f551185 to f5f0fd3
Hi @n3nash, can you take a look at the test case?
f5f0fd3 to 5c55583
What is the purpose of the pull request
MergeInto Support Partial Update For COW.
Brief change log
We fill in the fields missing from the update action with assignments from the target, like this:
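As a rough, Spark-free illustration of this completion step (not the PR's actual code), the sketch below models an update action as a list of column assignments and fills every omitted target column with `col = target.col`, following the `id = target.id` example in the code comment. The `Assignment` case class, `PartialUpdateSketch` object, column names, and the case-insensitive matching are assumptions made for the example.

```scala
// Hypothetical model of an assignment in an UPDATE action: column = expression.
final case class Assignment(column: String, valueExpr: String)

object PartialUpdateSketch {

  /** For each target column, keep the user's assignment if one exists;
    * otherwise fall back to "col = target.col" so the old value is preserved. */
  def completeAssignments(
      targetColumns: Seq[String],
      updateAssignments: Seq[Assignment]): Seq[Assignment] = {
    // Index the user's assignments by lower-cased column name (assumed matching rule).
    val specified = updateAssignments.map(a => a.column.toLowerCase -> a).toMap
    targetColumns.map { col =>
      specified.getOrElse(col.toLowerCase, Assignment(col, s"target.$col"))
    }
  }

  def main(args: Array[String]): Unit = {
    // The update action only sets 'price' and 'ts'; 'id' and 'name' are filled in.
    val completed = completeAssignments(
      Seq("id", "name", "price", "ts"),
      Seq(Assignment("price", "s0.price"), Assignment("ts", "s0.ts")))
    completed.foreach(a => println(s"${a.column} = ${a.valueExpr}"))
  }
}
```

The result follows the order of the target columns, so a partial `update set price = s0.price, ts = s0.ts` becomes a full assignment list with `id = target.id` and `name = target.name` added.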
Verify this pull request
(Please pick either of the following options)
This pull request is a trivial rework / code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.