[SUPPORT] Upsert overwriting ordering field with invalid value #5469
Labels: priority:critical, spark, writer-core
Describe the problem you faced
I'm writing an application that upserts records from a table. After an upsert operation, the ordering column of every record that exists in the base table but not in the incoming data is overwritten with an invalid value.
E.g.
The base table has a record with id = 1 and createddate = 2022-04-01.
The incoming data has a record with id = 2 and createddate = 2022-04-02.
After the upsert operation, the createddate of the record with id = 1 is changed to 1970-xx-xx, while the record with id = 2 remains intact.
To Reproduce
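For reference, the result expected from the example above can be modeled in a few lines of plain Python. This is only a sketch of the intended semantics, not Hudi code: the function name `expected_upsert` is hypothetical, and it assumes that an incoming row simply replaces a base row with the same key, which may differ from what a given Hudi payload class does. The point it illustrates is that base rows with no matching incoming key should come through unchanged.

```python
from datetime import date


def expected_upsert(base, incoming, key="id"):
    """Hypothetical model of the expected upsert result (not Hudi's implementation).

    Assumption: an incoming row replaces a base row that has the same key.
    Base rows whose key is absent from the incoming data are left untouched.
    """
    merged = {row[key]: dict(row) for row in base}
    for row in incoming:
        merged[row[key]] = dict(row)
    return sorted(merged.values(), key=lambda r: r[key])


# The two records from the example in this issue.
base = [{"id": 1, "createddate": date(2022, 4, 1)}]
incoming = [{"id": 2, "createddate": date(2022, 4, 2)}]

result = expected_upsert(base, incoming)
for row in result:
    print(row["id"], row["createddate"].isoformat())
```

In this model the record with id = 1 keeps createddate = 2022-04-01, which is the behavior the upsert is expected to have.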
Example full file content
After upsert operation
Note: The number of records affected by this bug is not deterministic. Each execution affects a different number of records.
1st execution
2nd execution
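One way to compare executions is to count the records whose ordering field has fallen back to a 1970 date. The helper below is a minimal sketch in plain Python. The name `affected_records` and the sample rows are hypothetical and are not data from the runs above. It assumes each row can be indexed by column name and that createddate is a `datetime.date`.

```python
from datetime import date


def affected_records(rows, ordering="createddate"):
    """Return the rows whose ordering field was reset to a 1970 date."""
    return [row for row in rows if row[ordering].year == 1970]


# Hypothetical sample rows, for illustration only.
rows = [
    {"id": 1, "createddate": date(1970, 1, 1)},
    {"id": 2, "createddate": date(2022, 4, 2)},
]

bad = affected_records(rows)
print(len(bad), [row["id"] for row in bad])
```

Running the same check on the table contents after each execution would show how the set of affected ids changes between runs.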
Environment Description
Hudi version : 0.10.0
Spark version : 3.1.2
Storage (HDFS/S3/GCS..) : Local
Running on Docker? (yes/no) : Yes