[HUDI-867]: fixed IllegalArgumentException from graphite metrics in deltaStreamer continuous mode#1647
pratyakshsharma wants to merge 1 commit into apache:master
Codecov Report
@@ Coverage Diff @@
## master #1647 +/- ##
============================================
+ Coverage 16.60% 18.37% +1.76%
- Complexity 800 855 +55
============================================
Files 344 344
Lines 15172 15165 -7
Branches 1512 1512
============================================
+ Hits 2520 2786 +266
+ Misses 12320 12026 -294
- Partials 332 353 +21
try {
  long start = System.currentTimeMillis();
  Option<String> scheduledCompactionInstant = deltaSync.syncOnce();
  HoodieMetrics.setTableName(cfg.metricsTableName + "_" + iteration);
Sorry, I don't quite get why we need to set the table name here. The next line will in turn rely on the argument passed for the table name, so I don't really understand why we need the static fix (i.e. setTableName). From the diff, I see that cfg.tableName is passed into DeltaSync.syncOnce(tblName) and HoodieDeltaStreamerMetrics(HoodieWriteConfig tableName). Can you help me understand the case where the static set method for the table name is required?
The IllegalArgumentException happens because the metric names are generated in the same way from tableName in each run. So we need some way of differentiating metric names for every run, and the easiest way to do that is to alter the table name to something like "tableName_iteration". We need to make this change in two places: HoodieDeltaStreamerMetrics and HoodieMetrics.
The table name passed to the syncOnce() method takes care of HoodieDeltaStreamerMetrics only. For all the other metrics, we need to reset the table name in the HoodieMetrics class.
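The failure mode and the workaround described in this comment can be reproduced in miniature. The sketch below is only an illustration under stated assumptions: `TinyRegistry` is a hypothetical stand-in, not a Hudi or Dropwizard class, and it is assumed to reject a second registration under an existing name with an IllegalArgumentException. The metric name format and the `commitTime` metric are likewise made up for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustration only: none of these names come from the Hudi code base.
final class DuplicateMetricNameSketch {

  /** Hypothetical registry that, by assumption, rejects duplicate metric names. */
  static final class TinyRegistry {
    private final Map<String, Supplier<Long>> gauges = new HashMap<>();

    void register(String name, Supplier<Long> gauge) {
      // putIfAbsent returns the existing gauge if the name is already taken.
      if (gauges.putIfAbsent(name, gauge) != null) {
        throw new IllegalArgumentException("A metric named " + name + " already exists");
      }
    }

    int size() {
      return gauges.size();
    }
  }

  /** Builds a metric name from the table name (the format is an assumption). */
  static String metricName(String tableName, String metric) {
    return tableName + "." + metric;
  }

  /** Registers under the same name every iteration; returns false once a registration is rejected. */
  static boolean runWithFixedName(TinyRegistry registry, String tableName, int iterations) {
    try {
      for (int i = 0; i < iterations; i++) {
        final long value = i;
        registry.register(metricName(tableName, "commitTime"), () -> value);
      }
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }

  /** Appends "_" + iteration to the table name so that every registration gets a fresh name. */
  static void runWithIterationSuffix(TinyRegistry registry, String tableName, int iterations) {
    for (int i = 0; i < iterations; i++) {
      final long value = i;
      registry.register(metricName(tableName + "_" + i, "commitTime"), () -> value);
    }
  }

  public static void main(String[] args) {
    TinyRegistry fixed = new TinyRegistry();
    System.out.println("fixed name succeeded: " + runWithFixedName(fixed, "my_table", 3));

    TinyRegistry suffixed = new TinyRegistry();
    runWithIterationSuffix(suffixed, "my_table", 3);
    System.out.println("distinct metric names: " + suffixed.size());
  }
}
```

With the suffix, the number of distinct metric names grows by one per iteration, so in continuous mode the set of names is unbounded in principle.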
@pratyakshsharma Thanks for the fix. I have a small question: when using Spark Streaming to write to Hudi (which seems similar to the continuous mode of DeltaStreamer), will the exception happen again?
@leesf Yes. This PR only fixes this for the continuous mode of HoodieDeltaStreamer. If you can point me to the relevant Spark datasource code from which this can be executed, I can try fixing it there as well.
@pratyakshsharma Just a raw idea: how about adding a static variable (iterator) in HoodieMetrics and doing ++iterator after updateCommitMetrics? And back to the PR: if we only fix HoodieDeltaStreamer, I think we could simply pass the iteration_time to the syncOnce method to create a new metric name instead of adding a static tableName to HoodieMetrics, which I think is a little bit weird.
I will let @leesf review this patch.
@pratyakshsharma IIUC, this will in theory introduce an infinite number of metrics being sent to the monitoring system when it's set to continuous mode? Normally for a Hudi table, we'd like to monitor all kinds of metrics named like Haven't thought about the solution yet. Just trying to raise the concern.
Thanks @xushiyan for your thoughts. Yes, the current solution will send metrics like
@leesf There is some markdown formatting issue with what you typed... not sure what is suggested. Just to clarify from a user's perspective: say a user runs a delta streamer and denotes the job with the prefix
@leesf This is also problematic as
If I read https://stackoverflow.com/a/55753138 correctly, normally you register a gauge only at startup (or on the first metric write) and then just update its value in every loop. Currently DeltaStreamer tries to register the gauge in every loop. It would be necessary to close the gauge at shutdown of the DeltaStreamer, so that if the DeltaStreamer gets restarted the metric can be registered again.
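The register-once pattern suggested here can be sketched as follows. This is a minimal sketch, not the change that was eventually made in Hudi: `GaugeHolder` and its methods are hypothetical names, and the backing map stands in for a real metrics registry. Each gauge is created on the first write for its name and only has its value updated afterwards, so the set of metric names stays fixed no matter how many iterations run.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustration only: GaugeHolder is a hypothetical class, not part of Hudi.
final class RegisterOnceGaugeSketch {

  static final class GaugeHolder {
    private final Map<String, AtomicLong> gauges = new ConcurrentHashMap<>();

    /** Creates the gauge on the first call for a name; later calls only update its value. */
    void set(String name, long value) {
      // With a real metrics library, the one-time registration would happen
      // where the AtomicLong is created (an assumption, not shown here).
      gauges.computeIfAbsent(name, n -> new AtomicLong()).set(value);
    }

    /** Returns the latest value written for the gauge. */
    long get(String name) {
      AtomicLong gauge = gauges.get(name);
      if (gauge == null) {
        throw new IllegalStateException("No gauge named " + name);
      }
      return gauge.get();
    }

    int size() {
      return gauges.size();
    }

    /** Drops all gauges at shutdown so that a restart can register them again. */
    void close() {
      gauges.clear();
    }
  }

  public static void main(String[] args) {
    GaugeHolder holder = new GaugeHolder();
    // Simulate three sync iterations that all report the same metric.
    for (long iteration = 1; iteration <= 3; iteration++) {
      holder.set("my_table.commitTime", iteration * 100);
    }
    System.out.println("gauges: " + holder.size());
    System.out.println("latest value: " + holder.get("my_table.commitTime"));
    holder.close();
  }
}
```

Compared with renaming the table on every iteration, this keeps one metric name per table, which is what a dashboard or alert keyed on a fixed name would expect.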
@pratyakshsharma I will add a PR to support updating metrics this weekend.
I guess this issue is already solved. Can we close it now, @xushiyan?
@pratyakshsharma Yes, I think so. Thanks for taking this on!
@pratyakshsharma @xushiyan should this PR be closed?
yup closing. |
What is the purpose of the pull request
Added a way to create metric names with an updated table name in every iteration so that the IllegalArgumentException no longer occurs.