[HUDI-6692] Don't default to bulk insert on nonpkless table if recordkey is omitted #9444
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieWriterUtils.scala
I'm confused. If we already know it is a table with a pk, can we just use the field from the table config as the record key by default? We should not treat it as a pkless table.
@danny0405 we default to bulk insert for pkless tables, so if the user forgot the recordkey field on a write, it would do a bulk insert.
Do we generate the |
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala
@rahil-c: do you happen to know why TestFSUtilsWithRetryWrapperEnable is failing with Java 17 specifically? If you check GitHub Actions, the Java 17 module is failing, and in the logs I see TestFSUtilsWithRetryWrapperEnable failing and repeated 204 times.
Hi @nsivabalan, I checked the code and it looks like expected behavior. The non-Java 17 test also fails and repeats 204 times. If we want it to repeat less, then maybe we can look into reducing the
…key is omitted (#9444) - If a write to a table with a pk was missing the recordkey field in options it could default to bulk insert because it was using the pre-merging properties. Now it uses the post merging properties for the recordkey field. --------- Co-authored-by: Jonathan Vexler <=>
Change Logs
If a write to a table with a pk omitted the recordkey field in its options, it could default to bulk insert because the check used the pre-merge properties. The check now uses the post-merge properties for the recordkey field.
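The difference between checking the pre-merge and post-merge properties can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: `mergeWithTableConfig` and `defaultOperation` are hypothetical helpers, the options are modeled as plain `Map[String, String]` values, and the config key strings and the `upsert` fallback are assumptions made for illustration only.

```scala
object RecordKeyDefaultSketch {
  // Config key names are assumptions for this sketch, treated as plain strings.
  val WriteRecordKey = "hoodie.datasource.write.recordkey.field"
  val TableRecordKey = "hoodie.table.recordkey.fields"

  // Hypothetical merge step: if the writer options omit the record key,
  // fill it in from the table config of the existing table.
  def mergeWithTableConfig(writeOpts: Map[String, String],
                           tableConfig: Map[String, String]): Map[String, String] =
    tableConfig.get(TableRecordKey) match {
      case Some(key) if !writeOpts.contains(WriteRecordKey) =>
        writeOpts + (WriteRecordKey -> key)
      case _ => writeOpts
    }

  // Hypothetical default: a write with no record key is treated as pkless
  // and falls back to bulk insert; otherwise it is assumed to upsert.
  def defaultOperation(opts: Map[String, String]): String =
    if (opts.get(WriteRecordKey).exists(_.nonEmpty)) "upsert" else "bulk_insert"

  def main(args: Array[String]): Unit = {
    val tableConfig = Map(TableRecordKey -> "id")     // existing table with a pk
    val writeOpts   = Map("hoodie.table.name" -> "t") // user omitted the record key

    // Deciding on the pre-merge options misclassifies the table as pkless.
    println(defaultOperation(writeOpts))
    // Deciding on the post-merge options picks up the key from the table config.
    println(defaultOperation(mergeWithTableConfig(writeOpts, tableConfig)))
  }
}
```

In this sketch the only change between the buggy and fixed behavior is which map is passed to `defaultOperation`, which mirrors the description above: the decision has to be made after the table config has been merged in.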
Impact
Prevents a write to a table with a pk from unexpectedly defaulting to bulk insert when the recordkey field is omitted.
Risk level (write none, low, medium or high below)
low
Documentation Update
N/A