Mitigate controller leadership switch latency - improve leader -> standby by i3wangyi · Pull Request #662 · apache/helix

i3wangyi · 2019-12-18T23:04:22Z

Issues

My PR addresses the following Helix issues and references them in the PR description:

Description

Here are some details about my PR, including screenshots of any UI changes:
The change skips the getChildren() request when a listener is removed, and uses the zkClient's unsubscribeAll instead of unsubscribing one path at a time.
It also includes some logging optimizations.
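
A minimal sketch of the difference between the two removal styles, under stated assumptions: FakeZkClient, UnsubscribeDemo, the remoteReads counter, and the /cluster/CURRENTSTATES path are all hypothetical stand-ins for illustration, not the Helix or ZkClient API. It only shows that per-path removal needs a children lookup first, while a bulk unsubscribeAll touches local bookkeeping only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a ZooKeeper client (not the real ZkClient). */
class FakeZkClient {
  private final Map<String, List<String>> childrenByPath = new HashMap<>();
  private final Map<String, List<Object>> listenersByPath = new HashMap<>();
  int remoteReads = 0; // counts simulated round trips to the server

  void addChildren(String parent, List<String> children) {
    childrenByPath.put(parent, new ArrayList<>(children));
  }

  void subscribe(String path, Object listener) {
    listenersByPath.computeIfAbsent(path, p -> new ArrayList<>()).add(listener);
  }

  /** Simulated remote call: each invocation stands for one network round trip. */
  List<String> getChildren(String parent) {
    remoteReads++;
    return childrenByPath.getOrDefault(parent, new ArrayList<String>());
  }

  /** Removes one listener from one path; local bookkeeping only. */
  void unsubscribe(String path, Object listener) {
    List<Object> listeners = listenersByPath.get(path);
    if (listeners != null) {
      listeners.remove(listener);
      if (listeners.isEmpty()) {
        listenersByPath.remove(path);
      }
    }
  }

  /** Drops every listener in one local step, without reading any path. */
  void unsubscribeAll() {
    listenersByPath.clear();
  }

  int listenerCount() {
    return listenersByPath.values().stream().mapToInt(List::size).sum();
  }
}

class UnsubscribeDemo {
  static final String PARENT = "/cluster/CURRENTSTATES"; // hypothetical path

  /** Per-path style: list the children first, then unsubscribe path by path. */
  static void removePerPath(FakeZkClient zk, String parent, Object listener) {
    for (String child : zk.getChildren(parent)) {
      zk.unsubscribe(parent + "/" + child, listener);
    }
    zk.unsubscribe(parent, listener);
  }

  /** Builds a client with one listener on the parent and on each of three children. */
  static FakeZkClient newClientWithListeners(Object listener) {
    FakeZkClient zk = new FakeZkClient();
    List<String> children = Arrays.asList("a", "b", "c");
    zk.addChildren(PARENT, children);
    zk.subscribe(PARENT, listener);
    for (String child : children) {
      zk.subscribe(PARENT + "/" + child, listener);
    }
    return zk;
  }

  public static void main(String[] args) {
    Object listener = new Object();

    FakeZkClient perPath = newClientWithListeners(listener);
    removePerPath(perPath, PARENT, listener);

    FakeZkClient bulk = newClientWithListeners(listener);
    bulk.unsubscribeAll();

    System.out.println("per-path remote reads: " + perPath.remoteReads);
    System.out.println("bulk remote reads: " + bulk.remoteReads);
  }
}
```

In this toy model both styles end with zero listeners, but only the per-path style issues a getChildren() read. Nothing here measures real latency.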

Tests

The following tests are written for this issue:

(List the names of added unit/integration tests)
testControllerConnectThenSessionExpire in TestControllerLeadershipChange

The following is the result of the "mvn test" command on the appropriate module:

(Copy & paste the result of "mvn test")
[ERROR] Tests run: 890, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 3,909.929 s <<< FAILURE! - in TestSuite
[ERROR] testDeleteStoppingStuckWorkflowForcefully(org.apache.helix.integration.task.TestForceDeleteWorkflow) Time elapsed: 60.645 s <<< FAILURE!
java.lang.AssertionError: expected: but was:
at org.apache.helix.integration.task.TestForceDeleteWorkflow.testDeleteStoppingStuckWorkflowForcefully(TestForceDeleteWorkflow.java:279)

[ERROR] testForceDeleteJobFromJobQueue(org.apache.helix.integration.task.TestDeleteJobFromJobQueue) Time elapsed: 0.498 s <<< FAILURE!
org.apache.helix.HelixException: Failed to delete job: testForceDeleteJobFromJobQueue_job2 from queue: testForceDeleteJobFromJobQueue
at org.apache.helix.integration.task.TestDeleteJobFromJobQueue.testForceDeleteJobFromJobQueue(TestDeleteJobFromJobQueue.java:75)

[ERROR] testStateTransitionTimeoutByClusterLevel(org.apache.helix.integration.paticipant.TestStateTransitionTimeoutWithResource) Time elapsed: 38.119 s <<< FAILURE!
java.lang.AssertionError: expected: but was:
at org.apache.helix.integration.paticipant.TestStateTransitionTimeoutWithResource.testStateTransitionTimeoutByClusterLevel(TestStateTransitionTimeoutWithResource.java:196)

[INFO]
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestStateTransitionTimeoutWithResource.testStateTransitionTimeoutByClusterLevel:196 expected: but was:
[ERROR] TestDeleteJobFromJobQueue.testForceDeleteJobFromJobQueue:75 » Helix Failed to ...
[ERROR] TestForceDeleteWorkflow.testDeleteStoppingStuckWorkflowForcefully:279 expected: but was:
[INFO]
[ERROR] Tests run: 890, Failures: 3, Errors: 0, Skipped: 0

The failed tests pass when run individually in the IDE.

Commits

My commits all reference appropriate Apache Helix GitHub issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters (not including Jira issue reference)
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not "adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

Documentation

In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Code Quality

My diff has been formatted using helix-style.xml

i3wangyi · 2019-12-19T01:05:24Z

Current Leader -> Standby diagram:

i3wangyi · 2019-12-19T01:05:37Z

After applying the changes

junkaixue · 2020-01-09T23:04:27Z

Logically, this sounds good to me. But I think this is not the major contributor in LeaderSwitch when the controller is trying to acquire leadership, right?

How much does it reduce the overhead?

i3wangyi · 2020-01-09T23:33:02Z

No, the optimization is only for leader -> standby; one data point shows a 10x latency improvement (from 1700 ms to 170 ms). For the other part of the story, standby -> leader, there are different options, and I will explore them later in separate PRs.

junkaixue

Please add a test for it. I saw you link to #681. Have you tested your change with it?

jiajunwang · 2020-01-19T19:09:48Z

helix-core/src/main/java/org/apache/helix/manager/zk/CallbackHandler.java

    if (_eventTypes.contains(EventType.NodeDataChanged)
        || _eventTypes.contains(EventType.NodeCreated)
        || _eventTypes.contains(EventType.NodeDeleted)) {
-      logger.info("Subscribing data change listener to path: " + path);


Why remove these logs? While debugging, how would we know which _eventTypes list this listener is configured with? Since there are still concerns and bug reports regarding callbacks, we had better keep the detailed logs.

These logs are actually redundant. The private subscribeForChanges() eventually calls other private methods to perform the actual listener subscription:

private void subscribeChildChange(String path, NotificationContext.Type callbackType) {
  if (callbackType == NotificationContext.Type.INIT
      || callbackType == NotificationContext.Type.CALLBACK) {
    if (logger.isDebugEnabled()) {
      logger.debug(_manager.getInstanceName() + " subscribes child-change. path: " + path
          + ", listener: " + _listener);
    }
    _zkClient.subscribeChildChanges(path, this);
  } else if (callbackType == NotificationContext.Type.FINALIZE) {
    logger.info(_manager.getInstanceName() + " unsubscribe child-change. path: " + path
        + ", listener: " + _listener);
    _zkClient.unsubscribeChildChanges(path, this);
  }
}

private void subscribeDataChange(String path, NotificationContext.Type callbackType) {
  if (callbackType == NotificationContext.Type.INIT
      || callbackType == NotificationContext.Type.CALLBACK) {
    if (logger.isDebugEnabled()) {
      logger.debug(_manager.getInstanceName() + " subscribe data-change. path: " + path
          + ", listener: " + _listener);
    }
    _zkClient.subscribeDataChanges(path, this);
  } else if (callbackType == NotificationContext.Type.FINALIZE) {
    logger.info(_manager.getInstanceName() + " unsubscribe data-change. path: " + path
        + ", listener: " + _listener);
    _zkClient.unsubscribeDataChanges(path, this);
  }
}

I think keeping the verbose log in that last step is sufficient, and as you know, helix.log is already flooded with listener logs.

jiajunwang · 2020-01-19T19:24:52Z

helix-core/src/main/java/org/apache/helix/manager/zk/ZKHelixManager.java

      // TODO reset user defined handlers only
      // TODO Fix the issue that when connection disconnected, reset handlers will be blocked. -- JJ
-      // This is because reset logic contains ZK operations.
-      resetHandlers(true);


So we are not sending FINALIZE events on the reset anymore? I think this is problematic.

Yes, that is by design when using unsubscribeAll, because the current callback handler's FINALIZE handling is quite expensive: it reads all child nodes under currentstates and then removes the listeners on them one by one.
I can add tests to show that, even without the FINALIZE event, HelixManager cleans up all information on disconnect(). A single callback handler can still use the FINALIZE event to unregister itself.

The thing is that our customers also depend on the FINALIZE event to do their cleanup. I believe that with this change disconnect() will clean up all the listeners, but it breaks that contract.

@jiajunwang I understand your concern. With the current Helix structure and library dependencies, there is often no easy way to satisfy everything while still keeping the code clean and short, IMO, even though we broadly agree on the overall design here. Let's take it offline.

I will try to resolve the remaining concerns as best I can and propose a new draft code structure.

Sure, feel free to call a short meeting with more people. My point is that we should never trade off correctness or functionality for performance.
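
To make the FINALIZE concern in this thread concrete, here is a minimal sketch, assuming a hypothetical listener that opens a resource on INIT and releases it only on FINALIZE. SketchNotifyType, SketchListener, ResourceHoldingListener, and SketchHandlerRegistry are illustrative names, not Helix classes; the sketch only contrasts a reset that notifies listeners with a bulk removal that does not.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical notification types, named after the INIT/CALLBACK/FINALIZE types in the thread. */
enum SketchNotifyType { INIT, CALLBACK, FINALIZE }

/** Hypothetical listener interface; not the Helix listener API. */
interface SketchListener {
  void onChange(SketchNotifyType type);
}

/** A user listener that holds a resource and releases it only on FINALIZE. */
class ResourceHoldingListener implements SketchListener {
  boolean resourceOpen = false;

  @Override
  public void onChange(SketchNotifyType type) {
    if (type == SketchNotifyType.INIT) {
      resourceOpen = true;
    } else if (type == SketchNotifyType.FINALIZE) {
      resourceOpen = false;
    }
  }
}

/** Hypothetical registry standing in for the manager's set of handlers. */
class SketchHandlerRegistry {
  private final List<SketchListener> listeners = new ArrayList<>();

  void register(SketchListener listener) {
    listeners.add(listener);
    listener.onChange(SketchNotifyType.INIT);
  }

  /** Reset that notifies each listener with FINALIZE before dropping it. */
  void resetWithFinalize() {
    for (SketchListener listener : listeners) {
      listener.onChange(SketchNotifyType.FINALIZE);
    }
    listeners.clear();
  }

  /** Bulk removal that drops all listeners without notifying them. */
  void dropAllSilently() {
    listeners.clear();
  }

  int size() {
    return listeners.size();
  }
}

class FinalizeContractDemo {
  public static void main(String[] args) {
    ResourceHoldingListener notified = new ResourceHoldingListener();
    SketchHandlerRegistry first = new SketchHandlerRegistry();
    first.register(notified);
    first.resetWithFinalize();

    ResourceHoldingListener skipped = new ResourceHoldingListener();
    SketchHandlerRegistry second = new SketchHandlerRegistry();
    second.register(skipped);
    second.dropAllSilently();

    System.out.println("after FINALIZE, resource open: " + notified.resourceOpen);
    System.out.println("after silent drop, resource open: " + skipped.resourceOpen);
  }
}
```

Both registries end up empty, which matches the claim that disconnect cleans up all listeners; the difference is that the silently dropped listener never gets the chance to release its resource.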

jiajunwang · 2020-02-05T23:45:46Z

helix-core/src/main/java/org/apache/helix/manager/zk/CallbackHandler.java

-    logger.info("Subscribing changes listener to path: " + path + ", type: " + callbackType
-        + ", listener: " + _listener);
+    logger.info(
+        "START:INVOKE subscribing changes listener on path: {}, callbackType: {}, listener: {}, isWatchChild: {}",


START:INVOKE/END:INVOKE are for the callbacks. Please don't use them here; it will confuse whoever debugs this later.

jiajunwang · 2020-02-05T23:49:53Z

helix-core/src/main/java/org/apache/helix/manager/zk/CallbackHandler.java

+   * *Caution*: currently it's only used during disconnecting from ZK and the
+   * listeners unsubscription will be taken care by zkClient directly
+   */
+  void closeBatchCallbackProcessor() {


Besides my concern about not sending FINALIZE, this method is a subset of void reset(boolean isShutdown); I think you can avoid the duplicate code.

Of course, please address my concern about not having the FINALIZE first. Then we can revisit this one.
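
One way to read the "avoid duplicate code" suggestion, as a rough sketch only: if closeBatchCallbackProcessor() really is a subset of reset(boolean isShutdown), reset could call it instead of repeating the logic. SketchCallbackHandler, the processorOpen flag, and the recorded step names are hypothetical, and the body of reset here is an assumption for illustration, not the actual CallbackHandler implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical handler; only the two method names come from the review thread. */
class SketchCallbackHandler {
  final List<String> steps = new ArrayList<>(); // records what ran, for illustration
  private boolean processorOpen = true;

  /** Shared step: shut the batch processor down once. */
  void closeBatchCallbackProcessor() {
    if (processorOpen) {
      processorOpen = false;
      steps.add("close-processor");
    }
  }

  /** Assumed shape of a full reset: reuse the shared step, then send FINALIZE. */
  void reset(boolean isShutdown) {
    if (isShutdown) {
      closeBatchCallbackProcessor(); // reuse instead of duplicating the shutdown logic
    }
    steps.add("send-finalize");
  }

  boolean isProcessorOpen() {
    return processorOpen;
  }
}

class ResetReuseDemo {
  public static void main(String[] args) {
    SketchCallbackHandler full = new SketchCallbackHandler();
    full.reset(true);
    System.out.println("reset(true) steps: " + full.steps);

    SketchCallbackHandler closeOnly = new SketchCallbackHandler();
    closeOnly.closeBatchCallbackProcessor();
    System.out.println("close-only steps: " + closeOnly.steps);
  }
}
```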

i3wangyi force-pushed the memory branch from 5da7007 to 48dd876 on December 19, 2019 00:50

i3wangyi changed the title from "Reset all listeners only in-memory operation" to "Mitigate controller leadership switch latency - improve leader -> standby" on Dec 19, 2019

i3wangyi force-pushed the memory branch from a192113 to 32d9d74 on December 20, 2019 02:06

i3wangyi force-pushed the memory branch from 9dca62e to bc65349 on January 14, 2020 22:41

i3wangyi requested review from jiajunwang, junkaixue and lei-xia, and removed the request for lei-xia, on January 15, 2020 02:41

i3wangyi mentioned this pull request on Jan 15, 2020 in "Integration test for controller connect and disconnect" #681 (merged)

junkaixue reviewed Jan 19, 2020

jiajunwang reviewed Jan 19, 2020

i3wangyi added 2 commits January 31, 2020 17:09

Use unsubscribeAll to remove all listeners (5146eb3)

- boost performance on leader -> standby latency (disconnect() method performs much faster)
- one sample result shows leader -> standby latency reduces from 1432ms to 176ms

Update the comments - logging (6069b50)

TODO: add tests when session expiry

i3wangyi force-pushed the memory branch from bc65349 to 6069b50 on February 1, 2020 01:41

Add the test case for session expiry (d5c36bc)

jiajunwang reviewed Feb 5, 2020

i3wangyi closed this Feb 21, 2020
