[HUDI-6551] A new slashed month partition value extractor#9184
ban1989ban wants to merge 4 commits into apache:master
 * in the format 'yyyy/mm'.
 */
public class SlashEncodedYearMonthPartitionValueExtractor implements PartitionValueExtractor {
How about we name it SlashEncodedMonthPartitionValueExtractor?
 * PartitionValueExtractor interface to support extracting partition values from paths
 * in the format 'yyyy/mm'.
 */
public class SlashEncodedMonthPartitionValueExtractor implements PartitionValueExtractor {
The functionality of this extractor is covered by SinglePartPartitionValueExtractor. Is SinglePartPartitionValueExtractor not enough in your use case?
Maybe not. The user actually wants the yyyy/mm date format and ignores the datetime.
I understand that. SinglePartPartitionValueExtractor serves the same purpose by transforming yyyy/mm to yyyy-mm.
The user wants yyyy/mm instead.
Not following, the logic and test show that yyyy/mm is transformed to yyyy-mm during extraction.
@banank1989 could you clarify?
Closing this PR now. @banank1989 feel free to reopen it if you need additional functionality.
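For readers following the thread, the behaviour under discussion (a 'yyyy/mm' partition path turned into a single 'yyyy-mm' value) can be sketched roughly as below. This is a minimal illustration and not the code from this PR: the nested `PartitionValueExtractor` interface is a local stand-in so the example compiles on its own, the class names `SlashMonthExtractorSketch` and `SlashEncodedMonthSketch` are hypothetical, and the validation and formatting choices are assumptions.

```java
import java.util.Collections;
import java.util.List;

public class SlashMonthExtractorSketch {

  // Local stand-in for Hudi's PartitionValueExtractor interface (assumed shape),
  // declared here only so the sketch is self-contained.
  interface PartitionValueExtractor {
    List<String> extractPartitionValuesInPath(String partitionPath);
  }

  // Hypothetical extractor: maps a 'yyyy/mm' partition path to one 'yyyy-mm' value.
  static class SlashEncodedMonthSketch implements PartitionValueExtractor {
    @Override
    public List<String> extractPartitionValuesInPath(String partitionPath) {
      String[] parts = partitionPath.split("/");
      if (parts.length != 2) {
        throw new IllegalArgumentException(
            "Partition path " + partitionPath + " is not in the form yyyy/mm");
      }
      // Parse both parts so that non-numeric input fails early.
      int year = Integer.parseInt(parts[0]);
      int month = Integer.parseInt(parts[1]);
      // Join year and month with a dash, zero-padding the month.
      return Collections.singletonList(String.format("%04d-%02d", year, month));
    }
  }

  public static void main(String[] args) {
    PartitionValueExtractor extractor = new SlashEncodedMonthSketch();
    // Prints [2023-07]
    System.out.println(extractor.extractPartitionValuesInPath("2023/07"));
  }
}
```

The sketch returns a single-element list because the whole 'yyyy/mm' path is treated as one partition value, which is the point the reviewers raise when comparing it with the existing extractor.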
Change Logs
Adds support for a month-wise partition value extractor for Hudi Hive sync.
Impact
With this change, users can run Hudi Hive sync on tables that need a month-wise partitioner by passing --partition-value-extractor org.apache.hudi.hive.SlashEncodedMonthPartitionValueExtractor
Risk level (write none, low, medium or high below)
None
Documentation Update
To use the month-wise partitioner with the Hive sync tool for Hudi tables, use the following command:
$HUDI_HOME/hudi-sync/hudi-hive-sync/run_sync_tool.sh --jdbc-url jdbc:hive2://localhost:10000 --partitioned-by partitionid --base-path "hdfs://NameNodeIp:port/<path to table>" --user hive --pass hive --database default --table <table_name> --partition-value-extractor org.apache.hudi.hive.SlashEncodedMonthPartitionValueExtractor
Contributor's checklist