[SUPPORT] CoW: Hudi Upsert not working when there is a timestamp field in the composite key #10303
Comments
Screenshot of the data showing the different behavior between bulk_insert and upsert on the same set of records. There should only ever be one record with curr_id = 1 at any point in time. The first upsert broke that, but the subsequent update fixed it, since the timestamp value as INT was taken into consideration for the second update.
@srinikandi I see a fix (#4201) was tried but then reverted due to another issue. Will look into it. Thanks for raising this again.
@srinikandi Sorry for the delay on this. I was able to reproduce the issue with Hudi versions 0.12.1 and 0.14.1. We have introduced the config "hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled"; you can set it to true.
Reproducible code which works when we set the config -
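A minimal sketch of what the write options could look like with the fix applied. The table name, key fields, and S3 path are hypothetical placeholders; the option keys are standard Hudi Spark datasource options, with the last one being the config named above:

```python
# Hudi write options for an upsert whose composite record key includes a
# timestamp column. Table/field names below are hypothetical placeholders.
hudi_options = {
    "hoodie.table.name": "customer_scd2",                         # hypothetical
    "hoodie.datasource.write.recordkey.field": "curr_id,eff_ts",  # composite key with a timestamp
    "hoodie.datasource.write.precombine.field": "eff_ts",
    "hoodie.datasource.write.keygenerator.class":
        "org.apache.hudi.keygen.ComplexKeyGenerator",
    "hoodie.datasource.write.operation": "upsert",
    # The fix discussed above: keep the logical timestamp representation
    # consistent in the record key across insert and upsert operations.
    "hoodie.datasource.write.keygenerator.consistent.logical.timestamp.enabled": "true",
}

# In a Spark job (with the Hudi bundle jar on the classpath) the write would
# then look something like:
# df.write.format("hudi").options(**hudi_options).mode("append").save("s3://bucket/path")
```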
Let me know in case you need any more help on this. Thanks.
@srinikandi Closing out this issue. Please reopen in case you still face this issue after setting the config.
Hi, we have been facing this issue with Hudi upserts when a timestamp field is part of the composite primary key.
A bulk insert on the table works fine and stores the timestamp in proper timestamp format. But when the same table has an upsert operation (Type 2 SCD), the timestamp value in the newly inserted row gets converted into epoch form inside the _hoodie_record_key, even though the actual attribute in the table still holds the data in proper timestamp format. This breaks the Type 2 SCD we are trying to achieve, because all subsequent updates are treated as new records.
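To illustrate why the SCD2 merge breaks (this is plain Python, not Hudi code, and the field names are hypothetical): if bulk_insert and upsert render the same timestamp differently inside the record key, the two keys never match, so the update lands as a brand-new record instead of replacing the existing one.

```python
from datetime import datetime, timezone

ts = datetime(2023, 1, 5, 10, 30, 0, tzinfo=timezone.utc)

# Key as written by bulk_insert (timestamp kept in readable form):
key_bulk_insert = f"curr_id:1,eff_ts:{ts.strftime('%Y-%m-%d %H:%M:%S')}"

# Key as written by upsert (timestamp collapsed to epoch microseconds):
key_upsert = f"curr_id:1,eff_ts:{int(ts.timestamp()) * 1_000_000}"

# Same logical row, two different record keys -> duplicate instead of update.
print(key_bulk_insert == key_upsert)  # False
```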
Steps to reproduce the behavior:
We are using Glue with Hudi 0.12.1
Hudi version : 0.12.1
Spark version : 3.3
Hive version :
Hadoop version :
Storage (HDFS/S3/GCS..) : S3
Running on Docker? (yes/no) : No
Additional context
There was an issue opened about 2 years back; no resolution was mentioned and the ticket was closed.
#3313