[HUDI-6823] initializing writeTimer before using when flink checkpoin…#9631
LXin96 wants to merge 9 commits into apache:master
@hudi-bot run azure
1. This commit allows users to cleanly disable metadata using write configs. 2. Handling of valid instants when reading from the MDT is now more robust. Any special instant time (one that has an additional suffix compared to the DT's commit time) is treated as valid. With MDT partition initialization in particular, the suffix is dynamic, so we cannot find an exact match; instead we go by the total instant time length and treat all special instant times as valid. In the LogRecordReader, we first ignore any uncommitted instants, and then, if the instant is completed in the MDT timeline, we check it against the instantRange. So it should be fine to return true for any special instant time.
The core changes are: 1. Use the HoodieRecord data type for general processing; 2. Support HoodieSparkRecord when extracting field values from records. Co-authored-by: Lin Liu <linliu@Lins-MacBook-Pro.local>
metaClient.validateTableProperties(config.getProps());

initTimer(operationType, table);
Spark sets up the timer before starting the write tasks, whereas with this change Flink would start it just before committing the metadata. Is that what you want?
Thanks for your reply. I did not actually change the logic of initTable; I only extracted the timer setup into a separate method named initTimer. I want to use initTimer to initialize the writeTimer used in org.apache.hudi.client.BaseHoodieWriteClient#emitCommitMetrics. For emitCommitMetrics to do anything, the writeTimer object must be non-null; otherwise the method is skipped. When I added logging, I found that writeTimer in org.apache.hudi.client.BaseHoodieWriteClient#emitCommitMetrics was null, so I put the initTimer call after [HoodieTable table = createTable(config, hadoopConf);].
The bulk_writer metrics are fine: we can collect and report them. However, when a Flink checkpoint tries to commit, it calls org.apache.hudi.client.BaseHoodieWriteClient#commitStats to report the other metrics, and at that point the writeTimer in BaseHoodieWriteClient is null when org.apache.hudi.client.BaseHoodieWriteClient#emitCommitMetrics is called.
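The failure mode described above can be illustrated with a minimal sketch. The class and field names here (WriteClientSketch, writeTimerStartNanos, metricsUpdates) are hypothetical stand-ins, not Hudi's actual implementation; the sketch only assumes, as the comments state, that the metrics update is skipped when no timer was started.

```java
// Hypothetical stand-in for a write client; not Hudi's real class.
class WriteClientSketch {
    // Start time of the current write; null until a timer has been started.
    private Long writeTimerStartNanos;
    // Counts how many times commit metrics were actually updated.
    private int metricsUpdates;

    void startWriteTimer() {
        writeTimerStartNanos = System.nanoTime();
    }

    // Returns true if metrics were updated, false if the call was skipped.
    boolean emitCommitMetrics() {
        if (writeTimerStartNanos == null) {
            return false; // no timer, so the metrics update is silently skipped
        }
        long durationNanos = System.nanoTime() - writeTimerStartNanos;
        metricsUpdates++;
        writeTimerStartNanos = null; // one timer per commit
        System.out.println("commit took " + durationNanos + " ns");
        return true;
    }

    int getMetricsUpdates() {
        return metricsUpdates;
    }
}

class WriteClientSketchDemo {
    public static void main(String[] args) {
        WriteClientSketch client = new WriteClientSketch();
        System.out.println(client.emitCommitMetrics()); // timer never started
        client.startWriteTimer();
        System.out.println(client.emitCommitMetrics()); // timer was started
    }
}
```

In this sketch the first call does nothing because the timer was never started, which matches the behavior reported for the Flink checkpoint commit path.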
I think we should initialize the timer in StreamWriteOperatorCoordinator#startInstant(), and we can add a new method:
public void setWriteTimer(String commitType)
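A rough sketch of that suggestion follows. Only the signature setWriteTimer(String commitType) and the idea of calling it from startInstant() come from the comment above; the classes TimedClientSketch and CoordinatorSketch, the accepted commit type strings, and the counter used as an instant id are assumptions made for illustration.

```java
// Hypothetical client: only the timer bookkeeping is modeled here.
class TimedClientSketch {
    private String timedCommitType; // commit type the running timer belongs to
    private Long timerStartNanos;   // null until setWriteTimer is called

    // Signature from the review suggestion; the body is an assumption.
    public void setWriteTimer(String commitType) {
        if (!"commit".equals(commitType) && !"deltacommit".equals(commitType)) {
            throw new IllegalArgumentException("Unsupported commit type: " + commitType);
        }
        timedCommitType = commitType;
        timerStartNanos = System.nanoTime();
    }

    public boolean hasWriteTimer() {
        return timerStartNanos != null;
    }

    public String getTimedCommitType() {
        return timedCommitType;
    }
}

// Hypothetical coordinator: starts the timer whenever a new instant begins.
class CoordinatorSketch {
    private final TimedClientSketch client;
    private final String commitType;
    private long nextInstant = 1; // placeholder for a real instant time

    CoordinatorSketch(TimedClientSketch client, String commitType) {
        this.client = client;
        this.commitType = commitType;
    }

    String startInstant() {
        // Start timing before any write for this instant happens, so the
        // timer is non-null by the time the checkpoint commit emits metrics.
        client.setWriteTimer(commitType);
        return String.valueOf(nextInstant++);
    }
}
```

Starting the timer at the beginning of the instant keeps the Flink path closer to the Spark behavior mentioned earlier in the thread, where the timer is set before the write tasks run.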
OK, I get you. I will try that.
I opened another PR, #9637, with the fix described above, so this one can probably be closed. Thanks very much.
…to hudi_initialize_writeTime
initial writeTimer in StreamWriteOperatorCoordinator
Change Logs
1. When a checkpoint commits, the writeTimer in org.apache.hudi.client.BaseHoodieWriteClient#emitCommitMetrics is null because it was never initialized, so the call to org.apache.hudi.metrics.HoodieMetrics#updateCommitMetrics is skipped and the metrics are not updated.
2. To make sure writeTimer is initialized, extract a method named initTimer from initTable.
Impact
private void initTimer(WriteOperationType operationType, HoodieTable table) {
  switch (operationType) {
    case INSERT:
    case INSERT_PREPPED:
    case UPSERT:
    case UPSERT_PREPPED:
    case BULK_INSERT:
    case BULK_INSERT_PREPPED:
    case INSERT_OVERWRITE:
    case INSERT_OVERWRITE_TABLE:
      setWriteTimer(table);
      break;
    case CLUSTER:
    case COMPACT:
    case LOG_COMPACT:
      tableServiceClient.setTableServiceTimer(operationType);
      break;
    default:
  }
}
This method is extracted from org.apache.hudi.client.BaseHoodieWriteClient#initTable and initializes writeTimer as before.
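To make the dispatch easier to check in isolation, here is a self-contained sketch of the same grouping as a pure function. OpSketch is a hypothetical, reduced stand-in for WriteOperationType (with OTHER as a placeholder for any operation not listed), and TimerKind is invented for illustration; neither is part of Hudi.

```java
// Hypothetical, reduced stand-in for WriteOperationType.
enum OpSketch { INSERT, UPSERT, BULK_INSERT, INSERT_OVERWRITE, CLUSTER, COMPACT, LOG_COMPACT, OTHER }

// Invented for this sketch: which timer the operation would start.
enum TimerKind { WRITE, TABLE_SERVICE, NONE }

class InitTimerSketch {
    // Mirrors the grouping in initTimer: write operations start the write
    // timer, table services start the table service timer, the rest start none.
    static TimerKind timerFor(OpSketch op) {
        switch (op) {
            case INSERT:
            case UPSERT:
            case BULK_INSERT:
            case INSERT_OVERWRITE:
                return TimerKind.WRITE;
            case CLUSTER:
            case COMPACT:
            case LOG_COMPACT:
                return TimerKind.TABLE_SERVICE;
            default:
                return TimerKind.NONE;
        }
    }
}
```

Returning a value instead of mutating client state is a choice made only so the grouping can be asserted on directly.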
Risk level (write none, low medium or high below)
medium