New Feature:New MergeOperator for upsert to ignore null fields by default #30

Closed
moresun opened this issue May 26, 2022 · 0 comments
Labels
enhancement New feature or request

moresun commented May 26, 2022

Consider a table with columns (A, B, C, D), where column A is the primary key and all fields are of type string.
Spark reads a batch of JSON data from Kafka and converts it to a DataFrame for upsert. Because some fields may be missing from the Kafka messages (e.g. {A:A1,B:B4,C:C4,D:D4}{A:A2,C:C5}{A:A3,B:B5,D:D5}), the DataFrame may contain "null" strings or null values in some cells, which can lead to unexpected results in the resulting table.
Since LakeSoul has a MergeOperator feature, we could create a new merge operator that handles this case by ignoring null values.
image
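
To make the requested behavior concrete, here is a minimal plain-Python sketch of "upsert that ignores null fields". It does not use the LakeSoul or Spark APIs. The function names `is_missing` and `upsert_ignore_null` and the existing table rows (B1, C1, D1 and so on) are hypothetical and chosen only for illustration; the incoming batch is the one from the example above.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, Optional[str]]


def is_missing(value: Optional[str]) -> bool:
    """Treat both a real null (None) and the literal string "null" as missing."""
    return value is None or value == "null"


def upsert_ignore_null(
    table: Dict[str, Row],
    batch: List[Row],
    key: str = "A",
    columns: Sequence[str] = ("A", "B", "C", "D"),
) -> Dict[str, Row]:
    """Return a new table where each batch row is upserted by primary key.

    Missing cells in an incoming row leave the stored value untouched
    instead of overwriting it.
    """
    # Copy the rows so the caller's table is not modified.
    result = {pk: dict(row) for pk, row in table.items()}
    for row in batch:
        pk = row[key]
        # Start from the stored row, or from an all-null row for a new key.
        current = result.setdefault(pk, {c: None for c in columns})
        for col in columns:
            value = row.get(col)
            if not is_missing(value):
                current[col] = value
    return result


# Hypothetical existing table contents (not given in the issue text).
existing = {
    "A1": {"A": "A1", "B": "B1", "C": "C1", "D": "D1"},
    "A2": {"A": "A2", "B": "B2", "C": "C2", "D": "D2"},
    "A3": {"A": "A3", "B": "B3", "C": "C3", "D": "D3"},
}

# The batch from the example: A2 lacks B and D, A3 lacks C.
batch = [
    {"A": "A1", "B": "B4", "C": "C4", "D": "D4"},
    {"A": "A2", "C": "C5"},
    {"A": "A3", "B": "B5", "D": "D5"},
]

merged = upsert_ignore_null(existing, batch)
```

Under these assumptions, row A2 keeps its stored B and D values and only C is updated, and row A3 keeps its stored C. A plain overwrite would instead replace those cells with null.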

moresun added the enhancement (New feature or request) label on May 26, 2022
dmetasoul01 changed the title from "New Feature:New MergeOprator for upsert in some scenarios" to "New Feature:New MergeOperator for upsert to ignore null fields by default" on May 26, 2022
moresun closed this as completed on Jul 8, 2022