(WIP) Task delay metrics endpoints on TaskDriver #1056

Closed
NealSun96 wants to merge 5 commits into apache:task_pool from NealSun96:nealsun/task-thread-pool-related-metrics

Conversation

Contributor

@NealSun96 NealSun96 commented Jun 3, 2020

Issues

  • My PR addresses the following Helix issues and references them in the PR description:

Fixes #1055

Description

  • Here are some details about my PR, including screenshots of any UI changes:

We are adding 3 endpoints to TaskDriver that return the values of SubmissionToProcessDelay, SubmissionToScheduleDelay, and ControllerInducedDelay. The purpose of these endpoints is to offer an option for users who cannot rely directly on the controller-emitted metrics to see the metric values. To make the metrics accessible from TaskDriver, the metric values are written to ZooKeeper every time a new metric value is reported. The value synced to ZooKeeper is the mean. Note: since the values are synced at reporting time, if there is no reporting, these values are not updated. The endpoint values are only fresh while Task Framework resources are actively running.
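The sync-on-report flow described above can be sketched roughly as follows. This is a minimal, hypothetical illustration: a plain in-memory map stands in for the ZooKeeper-backed store, and none of the class or method names below are actual Helix APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the sync-on-report flow. A ConcurrentHashMap stands
// in for the ZNode store; job type maps to a set of metric-name -> mean entries.
public class DelayMetricSyncSketch {
  private final Map<String, Map<String, Long>> store = new ConcurrentHashMap<>();

  // Called every time a new data point is reported: write the latest mean.
  public void reportAndSync(String jobType, String metricName, long meanValue) {
    store.computeIfAbsent(jobType, k -> new ConcurrentHashMap<>())
         .put(metricName, meanValue);
  }

  // What a TaskDriver-style endpoint would read back; null (stale) if the
  // metric was never reported, matching the freshness caveat above.
  public Long getDelay(String jobType, String metricName) {
    Map<String, Long> byMetric = store.get(jobType);
    return byMetric == null ? null : byMetric.get(metricName);
  }

  public static void main(String[] args) {
    DelayMetricSyncSketch sketch = new DelayMetricSyncSketch();
    sketch.reportAndSync("backupJob", "SubmissionToProcessDelay", 120L);
    System.out.println(sketch.getDelay("backupJob", "SubmissionToProcessDelay")); // 120
  }
}
```

The key property this illustrates is that reads never touch the reporting path: the endpoint only sees whatever mean was last synced.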

Tests

  • The following tests are written for this issue:

TestJobMonitor, testGetSubmissionToProcessDelay, testGetSubmissionToProcessDelayIllegalArgument, testGetSubmissionToScheduleDelay, testGetSubmissionToScheduleDelayIllegalArgument, testControllerInducedDelay, testControllerInducedDelayIllegalArgument

  • The following is the result of the "mvn test" command on the appropriate module:
[ERROR] Failures: 
[ERROR]   TestEnableCompression.testEnableCompressionResource:117 expected:<true> but was:<false>
[ERROR]   TestStateTransitionTimeoutWithResource.testStateTransitionTimeOut:171 expected:<true> but was:<false>
[INFO] 
[ERROR] Tests run: 1178, Failures: 2, Errors: 0, Skipped: 1
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  01:16 h
[INFO] Finished at: 2020-06-02T19:20:23-07:00
[INFO] ------------------------------------------------------------------------

Rerun:

[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 38.517 s - in TestSuite
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  43.765 s
[INFO] Finished at: 2020-06-03T09:14:54-07:00
[INFO] ------------------------------------------------------------------------

Commits

  • My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Code Quality

  • My diff has been formatted using helix-style.xml
    (helix-style-intellij.xml if IntelliJ IDE is used)

@jiajunwang
Contributor

I have 2 questions:

  1. What is the expected ZK throughput? It sounds like a very scary idea. Note that ZK write traffic is the current performance bottleneck.
  2. If we do this at all, I suggest doing it as an additional metric reporting channel outside the monitor classes, or even outside the Helix library. MBean is a generic way to report metrics; exposing that data to any other metric collection system, or to ZK, is not related to the MBeans themselves. Adding this option and the extra logic into the monitors just increases the unnecessary complexity of those classes.

// first level, and its children nodes will represent each job type. Since the metrics are
// aggregated by job types, the metric values will be recorded in each ZNode belonging to a job
// type as simple fields.
private static final String SYNC_ZK_ROOT_PATH = "/JOB_MONITOR_METRICS";
Contributor

Why do we need a ZNode to persist metrics data?

Contributor

Is there a reason we don't use MBean for such metrics?

Contributor Author

Right now the metrics are emitted by the controller. We want to expose the metrics to the client side, so ZNodes are used as a medium to store the metrics for access.

MBean is still used; the ZNodes are for "syncing" or "copying" the metrics to the client side.

Contributor

It's not necessary to do it this way:

  1. These metrics can be computed in the controller by looking at the stats of the ZNode property.
  2. Even if you would like to have them on the client side, you should report them via MBean, since all the data can be derived from config/context; then ingraph can fetch the MBean to report it. It is not a good idea for metric reporting to introduce new ZNode create/read/write traffic.

Contributor

I'm not trying to say no to this idea before I fully understand the context here. But,

  1. Who is this "we"?
  2. Do we have a design that illustrates the motivation and potential risks? I would like to take a look and see if there are any other ways.

Contributor

I don't think it is a good idea to use ZK as a syncing store for metrics. ZK is not designed for this purpose, and it is such a core part of the infra. Syncing metrics to ZK definitely increases read/write traffic to ZK, which may impact other critical services. If we have a better design that doesn't use ZK as a syncing store, exposing metrics to the client is fine.

Could you explain a case of "users that cannot rely directly on the controller emitted metrics to see the metric values"? I would like to see if there is an alternative solution that doesn't use ZK.

Contributor Author

NealSun96 commented Jun 4, 2020

  1. Unfortunately I can't answer this question with confidence. As noted in the design doc, @narendly believes this is not a major performance concern; without much context on the performance bottleneck myself, I trust @narendly's assessment.
    In terms of throughput, the syncing happens every time a data point is added to the metric.
    Lastly, to achieve the goal of "providing metrics on the client side", using ZK is the only method I can think of. Are there any other good ways to achieve this goal?

  2. There isn't any means to access the gauges outside JobMonitor, which is why I placed the logic inside JobMonitor. I don't think adding support for JobMonitor to optionally sync values to ZK is "complexity"; it's just an option.
    That said, I'm happy to move the logic to an outer class if you insist. I will need JobMonitor to provide access to the gauges.

@jiajunwang
Contributor

  1. Then the problem is how you can convince us : ) The current explanation is not satisfying, to be frank. Please schedule a meeting so we can talk offline.
  2. Please check how we are using MBeanServer in our code base. That is the way. "Complexity" means:
    • JobMonitor becomes a larger file, with 2 reporting mechanisms inside one class.
    • MBean is the generic means for a Java process to report its status. The emitting part should be built on top of that. If you mix the emitting part with the MBean, it definitely makes the class harder to understand.
    • If you want to add this type of support to the other monitors, you will need to do this again. However, if you do it outside as a service, you just need to register your MBean there; I guess no more than 10 lines of change, given that the ZK-based metric emitting service is designed well.
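The "register the MBean and let an external emitter poll it" pattern suggested above can be sketched with the standard `javax.management` API. This is a hedged illustration only: the MBean name, attribute, and value here are made up, not Helix's actual monitors.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Sketch of the suggested split: the monitor only registers a standard MBean;
// a separate "ZK metric emitting service" polls the MBeanServer for values.
public class MBeanPollSketch {
  // Standard MBean convention: interface name = implementation name + "MBean".
  public interface DelayGaugeMBean {
    long getSubmissionToProcessDelay();
  }

  public static class DelayGauge implements DelayGaugeMBean {
    // Fixed value for illustration; a real monitor would return the live mean.
    public long getSubmissionToProcessDelay() { return 42L; }
  }

  public static void main(String[] args) throws Exception {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    ObjectName name = new ObjectName("example:type=DelayGauge"); // illustrative domain
    server.registerMBean(new DelayGauge(), name);

    // The external emitting service reads the attribute generically, with no
    // changes inside the monitor class itself.
    Object value = server.getAttribute(name, "SubmissionToProcessDelay");
    System.out.println(value); // 42
  }
}
```

This shows why the reviewer estimates only a small change per monitor: each monitor just registers its MBean, and all the ZK (or other) emitting logic lives in one place.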

// first level, and its children nodes will represent each job type. Since the metrics are
// aggregated by job types, the metric values will be recorded in each ZNode belonging to a job
// type as simple fields.
private static final String SYNC_ZK_ROOT_PATH = "/JOB_MONITOR_METRICS";
Contributor

Let's review whether creating this root path would be appropriate. Can you find a better place to persist the data?

* Sync the current SubmissionToProcessDelay mean to ZooKeeper
* @param baseDataAccessor
*/
public void syncSubmissionToProcessDelayToZk(BaseDataAccessor<ZNRecord> baseDataAccessor) {
Contributor

@narendly narendly commented Jun 4, 2020

As discussed offline, there could be potentially better designs to achieve this.

  • This is too specific and not easy to maintain or scale - suppose you want to add more metrics that need to be persisted; then you'd have to implement an additional method for each metric.
  • Can you look into inversion of control? Perhaps you could create a component with an appropriate set of interface methods that allows persisting metric values to a metadata store in a more generic fashion. For example, you could have something like:
<I> MetricStorage
<Class> ZkMetricStorage (takes a Zk connection to initialize)

Also, this MetricStorage implementation could take in a dynamic list of metric names to be persisted, etc. You could get creative with this. No need to feel like you have to cover all use cases, but designing a component in a way that is extendable and generic enough will save you time and help with maintainability.
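The inversion-of-control idea above might look roughly like the following. This is a sketch under assumptions: `MetricStorage` and its method names are taken from the comment's pseudocode, and an in-memory map stands in for the suggested ZK-backed implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a generic metric-persistence component. Adding a new metric means
// adding a name, not a new method, addressing the maintainability concern above.
public class MetricStorageSketch {
  public interface MetricStorage {
    void persist(String jobType, String metricName, long value);
    Long read(String jobType, String metricName);
  }

  // In-memory stand-in; a hypothetical ZkMetricStorage would instead take a
  // ZK connection and write to a ZNode path derived from jobType/metricName.
  public static class InMemoryMetricStorage implements MetricStorage {
    private final Map<String, Long> data = new HashMap<>();

    public void persist(String jobType, String metricName, long value) {
      data.put(jobType + "/" + metricName, value);
    }

    public Long read(String jobType, String metricName) {
      return data.get(jobType + "/" + metricName);
    }
  }

  public static void main(String[] args) {
    MetricStorage storage = new InMemoryMetricStorage();
    storage.persist("backupJob", "SubmissionToScheduleDelay", 300L);
    System.out.println(storage.read("backupJob", "SubmissionToScheduleDelay")); // 300
  }
}
```

Callers depend only on the interface, so swapping the backing store (ZK, in-memory, anything else) requires no changes to the reporting code.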

protected static void reportSubmissionToProcessDelay(BaseControllerDataProvider dataProvider,
    final ClusterStatusMonitor clusterStatusMonitor, final WorkflowConfig workflowConfig,
-   final JobConfig jobConfig, final long currentTimestamp) {
+   final JobConfig jobConfig, final long currentTimestamp, final HelixManager helixManager) {
Contributor

It's strongly recommended that we check whether the given ZkConnection is valid. What if helixManager is null or not connected?

JobMonitorMetricZnodeField field, HistogramDynamicMetric metric) {
String zkPath = buildZkPathForJobMonitorMetric(_jobType);

if (!baseDataAccessor.update(zkPath, currentData -> {
Contributor

Is there a reason we're doing an update instead of set?
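The distinction behind this question can be illustrated abstractly. In this sketch an `AtomicReference` stands in for a ZNode: `update` applies a read-modify-write to whatever is currently stored (so sibling fields and concurrent writes survive), while `set` blindly overwrites. The names mirror the accessor methods in the snippet above, but the implementation is not Helix's.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Sketch of update-vs-set semantics against a single stored value.
public class UpdateVsSetSketch {
  private final AtomicReference<String> znode = new AtomicReference<>("");

  // update: transform the current data atomically (read-modify-write).
  public void update(UnaryOperator<String> updater) {
    znode.updateAndGet(updater);
  }

  // set: replace whatever is there, discarding existing content.
  public void set(String value) {
    znode.set(value);
  }

  public String get() {
    return znode.get();
  }

  public static void main(String[] args) {
    UpdateVsSetSketch sketch = new UpdateVsSetSketch();
    sketch.set("mean=100");
    sketch.update(current -> current + ";count=5"); // keeps existing fields
    System.out.println(sketch.get()); // mean=100;count=5
  }
}
```

If each metric writer only owns its own field of a shared record, an update (merge) is the safe choice; a set would clobber fields written by others.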

@NealSun96 NealSun96 changed the title Task delay metrics endpoints on TaskDriver (WIP) Task delay metrics endpoints on TaskDriver Jun 5, 2020
@NealSun96 NealSun96 force-pushed the nealsun/task-thread-pool-related-metrics branch from 4ab7271 to 8027951 Compare June 14, 2020 04:58
@NealSun96
Contributor Author

After discussion, this work item will not be done.

@NealSun96 NealSun96 closed this Jul 1, 2020
5 participants