[MINOR] hoodie.datasource.write.row.writer.enable should set to be true.#8200
renshangtao wants to merge 1 commit into apache:master
Yes. When I tested clustering, I found that sorting only guarantees order within each file of a partition; there is still no order between the files. While tracing the code, the only thing I found was that the default used for this configuration is inconsistent with the documentation.
Oh, I got it: the default value in the config is true. But I don't think it will lead to differences in the sorting results.
You can test it: if the value is false, it will create an RDDCustomColumnsSortPartitioner, whose class description is "
  clusteringPlan.getInputGroups().stream()
      .map(inputGroup -> {
-       if (getWriteConfig().getBooleanOrDefault("hoodie.datasource.write.row.writer.enable", false)) {
+       if (getWriteConfig().getBooleanOrDefault("hoodie.datasource.write.row.writer.enable", true)) {
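To make the mismatch under discussion concrete, here is a minimal sketch of how a hard-coded fallback at a call site can diverge from a documented default. The class `RowWriterDefaultDemo` and its map-backed `getBooleanOrDefault` helper are hypothetical stand-ins written for illustration; they are not Hudi's `HoodieWriteConfig` implementation, and only the key name comes from the diff above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a write config; NOT Hudi's HoodieWriteConfig.
class RowWriterDefaultDemo {
    static final String ROW_WRITER_KEY = "hoodie.datasource.write.row.writer.enable";

    // Returns the parsed value when the key is set, otherwise the
    // fallback supplied by the caller (assumed semantics for this sketch).
    static boolean getBooleanOrDefault(Map<String, String> props, String key, boolean fallback) {
        String raw = props.get(key);
        return raw == null ? fallback : Boolean.parseBoolean(raw);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();

        // Key unset: the result is decided entirely by the call-site fallback,
        // so two call sites with different fallbacks disagree.
        System.out.println(getBooleanOrDefault(props, ROW_WRITER_KEY, false));
        System.out.println(getBooleanOrDefault(props, ROW_WRITER_KEY, true));

        // Key set explicitly: the fallback no longer matters.
        props.put(ROW_WRITER_KEY, "true");
        System.out.println(getBooleanOrDefault(props, ROW_WRITER_KEY, false));
    }
}
```

In this sketch, a user who sets the key explicitly gets the same behavior at every call site, while a user who relies on the default sees whatever each call site hard-codes.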
cc @nsivabalan.
Good catch. It looks like we cannot use the ConfigProperty directly due to a circular dependency. Can you comb the codebase to see if there are similar cases?
I guess this was intentional. Unless the user explicitly enables this config, we don't want to enable it for clustering.
Let's also consider issues like #8259 before we make it the default.
@renshangtao: Can you create a Jira ticket and add it to the PR description so that PR validation can succeed?
Force-pushed from e0317dc to 6758894.
@renshangtao Both RDDCustomColumnsSortPartitioner and RowCustomColumnsSortPartitioner should sort globally. If you observe a sorting issue, then it is a different bug that needs to be fixed. Flipping this default value is unrelated to the sorting issue.
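The difference between the behavior reported earlier in the thread (order only within each file) and the global sort expected here can be illustrated with a small sketch. `SortScopeDemo` and its methods are hypothetical names, and plain integer lists stand in for file contents; this does not reproduce either Hudi partitioner, it only shows the two ordering guarantees being contrasted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; each inner list stands in for one file's sort keys.
class SortScopeDemo {
    // Sorts each file independently: ordered inside a file, not across files.
    static List<List<Integer>> sortWithinFiles(List<List<Integer>> files) {
        List<List<Integer>> out = new ArrayList<>();
        for (List<Integer> file : files) {
            List<Integer> copy = new ArrayList<>(file);
            Collections.sort(copy);
            out.add(copy);
        }
        return out;
    }

    // Sorts all records together, then cuts the sorted run into files of
    // at most fileSize records, so order also holds across file boundaries.
    static List<List<Integer>> sortGlobally(List<List<Integer>> files, int fileSize) {
        List<Integer> all = new ArrayList<>();
        for (List<Integer> file : files) {
            all.addAll(file);
        }
        Collections.sort(all);
        List<List<Integer>> out = new ArrayList<>();
        for (int i = 0; i < all.size(); i += fileSize) {
            out.add(new ArrayList<>(all.subList(i, Math.min(i + fileSize, all.size()))));
        }
        return out;
    }

    // True when reading the files in sequence yields a non-decreasing run.
    static boolean isOrderedAcrossFiles(List<List<Integer>> files) {
        Integer prev = null;
        for (List<Integer> file : files) {
            for (Integer value : file) {
                if (prev != null && value < prev) {
                    return false;
                }
                prev = value;
            }
        }
        return true;
    }
}
```

A check like `isOrderedAcrossFiles` over the min/max keys of the clustered output files is one way to tell which of the two guarantees a given run actually delivered.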
Row-writing for clustering was introduced recently, and we want to keep it defaulted to false for longer so it can fully stabilize, so I will close this for now.
Change Logs
Fix the mismatch between the default value of hoodie.datasource.write.row.writer.enable used in the code and the documented default.
Impact
If users are unaware of this configuration when sorting within a partition, they will not get the desired result.
Risk level
low
Documentation Update
No documentation update needed.
Contributor's checklist