[HUDI-7526] Fix constructors for bulkinsert sort partitioners to ensure we could use it as user defined partitioners #10942
/**
 * Constructor to create as UserDefinedBulkInsertPartitioner class via reflection
 * @param config HoodieWriteConfig
There was a problem hiding this comment.
Can you give an example how this partitioner got instantiated and customized?
There was a problem hiding this comment.
This partitioner is instantiated when the user sets the write config property hoodie.bulkinsert.user.defined.partitioner.class=org.apache.hudi.execution.bulkinsert.JavaGlobalSortPartitioner.
The constructor is then called via reflection in the DataSourceUtils methods createUserDefinedBulkInsertPartitioner(HoodieWriteConfig config) and createUserDefinedBulkInsertPartitionerWithRows(HoodieWriteConfig config).
There is nothing to customize in JavaGlobalSortPartitioner itself, but RowSpatialCurveSortPartitioner, for example, does use the provided writeConfig for customization.
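To illustrate the mechanism, here is a minimal, self-contained sketch of config-driven instantiation via reflection. It does not use Hudi: `WriteConfig`, `Partitioner`, `GlobalSortPartitioner`, and `create` are hypothetical stand-ins, and the real DataSourceUtils methods may differ in error handling and return type. Only the property key is taken from the discussion above.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

class ReflectionPartitionerSketch {
    // Property key quoted in this thread.
    public static final String PARTITIONER_CLASS_KEY =
        "hoodie.bulkinsert.user.defined.partitioner.class";

    // Hypothetical stand-in for HoodieWriteConfig.
    public static class WriteConfig {
        public final Properties props = new Properties();
    }

    // Hypothetical stand-in for the partitioner interface.
    public interface Partitioner {
        String name();
    }

    // Stand-in partitioner: accepts the config only so that the
    // reflective lookup succeeds, then ignores it.
    public static class GlobalSortPartitioner implements Partitioner {
        public GlobalSortPartitioner(WriteConfig config) {
            // nothing to configure
        }

        @Override
        public String name() {
            return "global-sort";
        }
    }

    // Looks up the class named in the config and calls its
    // single-argument (WriteConfig) constructor via reflection.
    // Returns null when the property is not set (an assumption of this sketch).
    public static Partitioner create(WriteConfig config) {
        String className = config.props.getProperty(PARTITIONER_CLASS_KEY);
        if (className == null || className.isEmpty()) {
            return null;
        }
        try {
            Constructor<?> ctor = Class.forName(className).getConstructor(WriteConfig.class);
            return (Partitioner) ctor.newInstance(config);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        WriteConfig config = new WriteConfig();
        config.props.setProperty(PARTITIONER_CLASS_KEY, GlobalSortPartitioner.class.getName());
        System.out.println(create(config).name()); // prints "global-sort"
    }
}
```

The point of the sketch is that the caller only knows the class name and the constructor shape, so every class named in the property has to expose that one-argument constructor, whether or not it reads the config.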
There was a problem hiding this comment.
I am confused: the "customized" HoodieWriteConfig does not really play a role here and is simply ignored?
There was a problem hiding this comment.
Yes, in this case HoodieWriteConfig is ignored simply because this partitioner is not configurable at all, but that does not mean it should not be usable as a UserDefinedBulkInsertPartitioner.
So I think the purpose of this task is not to make all BulkInsertPartitioners customizable with HoodieWriteConfig, but only to make them instantiable via reflection using the existing common approach for UserDefinedBulkInsertPartitioner (a constructor with HoodieWriteConfig as its only parameter).
@nsivabalan am I right?
There was a problem hiding this comment.
> Yes, in this case HoodieWriteConfig is ignored simply because this partitioner is not configurable at all, but that does not mean it should not be usable as a UserDefinedBulkInsertPartitioner.
That does not make sense to me.
There was a problem hiding this comment.
Before this fix:
if a user wanted to use JavaGlobalSortPartitioner and set hoodie.bulkinsert.user.defined.partitioner.class=org.apache.hudi.execution.bulkinsert.JavaGlobalSortPartitioner, it did not work, because the partitioner could not be instantiated via reflection (it had no constructor with a writeConfig parameter).
We added this constructor so that JavaGlobalSortPartitioner can be used as a user-defined partitioner just by setting its class name in the writeConfig.
I don't know how to explain it more clearly. Let's wait for the author's reply.
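The failure mode described here can be reproduced without Hudi. In the sketch below, `WriteConfig`, `BeforeFix`, `AfterFix`, and `hasConfigConstructor` are hypothetical names; the sketch only assumes that the lookup asks for a public constructor taking the config as its single parameter, as described in this thread.

```java
class ConstructorLookupSketch {
    // Hypothetical stand-in for HoodieWriteConfig.
    public static class WriteConfig {}

    // Shape of a partitioner before the fix: default constructor only.
    public static class BeforeFix {
        public BeforeFix() {}
    }

    // Shape after the fix: an extra constructor that accepts the config.
    public static class AfterFix {
        public AfterFix() {}

        public AfterFix(WriteConfig config) {
            // config accepted but not used
        }
    }

    // True if the class exposes a public (WriteConfig) constructor,
    // which is what a reflective single-argument lookup requires.
    public static boolean hasConfigConstructor(Class<?> clazz) {
        try {
            clazz.getConstructor(WriteConfig.class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasConfigConstructor(BeforeFix.class)); // prints "false"
        System.out.println(hasConfigConstructor(AfterFix.class));  // prints "true"
    }
}
```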
There was a problem hiding this comment.
> We added this constructor so that JavaGlobalSortPartitioner can be used as a user-defined partitioner just by setting its class name in the writeConfig.
But does it work correctly with the write config being ignored?
There was a problem hiding this comment.
I think it does, yes. Before the fix this partitioner had only a default constructor, which was enough to instantiate and run it, as it has no references to any writeConfig params.
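A small sketch of why ignoring the config is harmless for a partitioner that reads nothing from it: both constructors produce an object with identical behavior. `WriteConfig`, `GlobalSort`, and `repartition` over a list of strings are hypothetical simplifications, not the actual JavaGlobalSortPartitioner code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class IgnoredConfigSketch {
    // Hypothetical stand-in for HoodieWriteConfig.
    public static class WriteConfig {}

    // Simplified sort partitioner whose behavior depends on no config value.
    public static class GlobalSort {
        public GlobalSort() {}

        // Exists only so the class can be created reflectively;
        // delegates to the default constructor and drops the config.
        public GlobalSort(WriteConfig config) {
            this();
        }

        // Returns a sorted copy of the keys, leaving the input untouched.
        public List<String> repartition(List<String> keys) {
            List<String> sorted = new ArrayList<>(keys);
            Collections.sort(sorted);
            return sorted;
        }
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("c", "a", "b");
        List<String> viaDefault = new GlobalSort().repartition(keys);
        List<String> viaConfig = new GlobalSort(new WriteConfig()).repartition(keys);
        System.out.println(viaConfig);                    // prints "[a, b, c]"
        System.out.println(viaDefault.equals(viaConfig)); // prints "true"
    }
}
```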
There was a problem hiding this comment.
@danny0405 can you please assign this PR to @nsivabalan to summon him to this discussion?
@nsivabalan Hi! Sorry to bother you, but you are the reporter of this task. Could you please review my PR?
ea11f68 to 55fb13f
Change Logs
Our constructor for a user-defined sort partitioner takes in the write config, while some of the partitioners used by the out-of-the-box sort modes do not account for it.
Let's fix the sort partitioners to ensure any of them can be used as a user-defined partitioner.
For example, NoneSortMode does not have a constructor that takes in the write config.
Impact
none
Risk level (write none, low medium or high below)
none
Documentation Update
none