Tips before filing an issue
- Have you gone through our FAQs? Yes
- Join the mailing list to engage in conversations and get faster support at dev-subscribe@hudi.apache.org.
- If you have triaged this as a bug, then file an issue directly.
Describe the problem you faced
Writing to the same table several times in BULK_INSERT mode leads to the following error on a later write: Duplicate fileId 00000000-8651-4ae5-8f9e-4424fed2d181 from bucket 0 of partition found during the BucketStreamWriteFunction index bootstrap.
Configuration:
write.operation=BULK_INSERT
index.type=BUCKET
hoodie.index.bucket.engine=SIMPLE
To Reproduce
Steps to reproduce the behavior:
1. One program writes to table a in BULK_INSERT mode.
2. Another program writes to the same table in BULK_INSERT mode again. The data written by the two runs does not overlap.
3. Writing incremental data in upsert mode then fails with: Duplicate fileId 00000000-8651-4ae5-8f9e-4424fed2d181 from bucket 0 of partition found during the BucketStreamWriteFunction index bootstrap.
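The failure in step 3 can be illustrated with a minimal Python sketch. It is not Hudi's code. It assumes that, with the SIMPLE bucket engine, the leading 8 digits of a fileId encode the bucket number (consistent with "00000000-..." being reported as bucket 0), and that the index bootstrap expects at most one file group per bucket in a partition. The helper names and the second fileId are hypothetical, made up for illustration only.

```python
# Illustrative sketch only, not Hudi source code.
# Assumption: the first 8 characters of a fileId are the zero-padded bucket number.

FIRST_FILE_ID = "00000000-8651-4ae5-8f9e-4424fed2d181"   # fileId from the error message
SECOND_FILE_ID = "00000000-aaaa-bbbb-cccc-000000000000"  # hypothetical fileId from a second BULK_INSERT


def bucket_id_from_file_id(file_id: str) -> int:
    """Parse the bucket number from the fileId prefix (assumed layout)."""
    return int(file_id[:8])


def bootstrap_bucket_index(file_ids):
    """Build a bucket -> fileId map, rejecting a second file group for the same bucket."""
    index = {}
    for file_id in file_ids:
        bucket = bucket_id_from_file_id(file_id)
        if bucket in index:
            raise RuntimeError(
                f"Duplicate fileId {file_id} from bucket {bucket} found during index bootstrap"
            )
        index[bucket] = file_id
    return index


if __name__ == "__main__":
    # One BULK_INSERT: a single file group for bucket 0, so bootstrap succeeds.
    print(bootstrap_bucket_index([FIRST_FILE_ID]))

    # Two BULK_INSERTs that each created their own file group for bucket 0:
    # the bootstrap sees two fileIds with the same bucket prefix and fails.
    try:
        bootstrap_bucket_index([FIRST_FILE_ID, SECOND_FILE_ID])
    except RuntimeError as err:
        print(err)
```

If this assumption matches what happens in the table, the partition directory should contain two file groups whose names both start with 00000000 after step 2.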
Expected behavior
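Writing to the same table multiple times with BULK_INSERT should work, and a later upsert should not fail. How should BULK_INSERT be used to write to the same table multiple times?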
Environment Description
Additional context
Stacktrace