Mitigate controller leadership switch latency - improve leader -> standby #662

Closed
i3wangyi wants to merge 3 commits into apache:master from i3wangyi:memory

Conversation

@i3wangyi
Contributor

@i3wangyi i3wangyi commented Dec 18, 2019

Issues

  • My PR addresses the following Helix issues and references them in the PR description:

#661

Description

  • Here are some details about my PR, including screenshots of any UI changes:

  • The code change skips the getChildren() request on a listener's removal; the zkClient's unsubscribeAll is used instead of unsubscribing one path at a time.

  • It also includes some log optimization work.
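To illustrate why this helps, here is a minimal, self-contained sketch (all class and method names are hypothetical, not Helix's actual API) contrasting per-path unsubscription, which pays a simulated ZK round trip per path, with a purely in-memory unsubscribeAll:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical registry mimicking a ZK client's listener bookkeeping.
public class ListenerRegistry {
  private final Map<String, Set<Object>> childListeners = new HashMap<>();
  public int zkRoundTrips = 0; // counts simulated getChildren() calls

  public void subscribe(String path, Object listener) {
    childListeners.computeIfAbsent(path, k -> new HashSet<>()).add(listener);
  }

  // Old style: each removal triggers a read against ZK to refresh watches.
  public void unsubscribeOne(String path, Object listener) {
    Set<Object> set = childListeners.get(path);
    if (set != null) {
      set.remove(listener);
      zkRoundTrips++; // simulated getChildren() round trip
    }
  }

  // New style: purely in-memory cleanup, no ZK reads at all.
  public void unsubscribeAll() {
    childListeners.clear();
  }

  public static void main(String[] args) {
    ListenerRegistry r = new ListenerRegistry();
    for (int i = 0; i < 100; i++) r.subscribe("/cluster/node" + i, new Object());
    for (int i = 0; i < 100; i++) r.unsubscribeOne("/cluster/node" + i, null);
    System.out.println("per-path round trips: " + r.zkRoundTrips);

    ListenerRegistry r2 = new ListenerRegistry();
    for (int i = 0; i < 100; i++) r2.subscribe("/cluster/node" + i, new Object());
    r2.unsubscribeAll();
    System.out.println("unsubscribeAll round trips: " + r2.zkRoundTrips);
  }
}
```

With many listeners registered, the per-path cost scales linearly with subscription count, while the bulk in-memory clear is constant with respect to ZK traffic.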

Tests

  • The following tests are written for this issue:

(List the names of added unit/integration tests)
testControllerConnectThenSessionExpire in TestControllerLeadershipChange

  • The following is the result of the "mvn test" command on the appropriate module:

(Copy & paste the result of "mvn test")
[ERROR] Tests run: 890, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 3,909.929 s <<< FAILURE! - in TestSuite
[ERROR] testDeleteStoppingStuckWorkflowForcefully(org.apache.helix.integration.task.TestForceDeleteWorkflow) Time elapsed: 60.645 s <<< FAILURE!
java.lang.AssertionError: expected: but was:
at org.apache.helix.integration.task.TestForceDeleteWorkflow.testDeleteStoppingStuckWorkflowForcefully(TestForceDeleteWorkflow.java:279)

[ERROR] testForceDeleteJobFromJobQueue(org.apache.helix.integration.task.TestDeleteJobFromJobQueue) Time elapsed: 0.498 s <<< FAILURE!
org.apache.helix.HelixException: Failed to delete job: testForceDeleteJobFromJobQueue_job2 from queue: testForceDeleteJobFromJobQueue
at org.apache.helix.integration.task.TestDeleteJobFromJobQueue.testForceDeleteJobFromJobQueue(TestDeleteJobFromJobQueue.java:75)

[ERROR] testStateTransitionTimeoutByClusterLevel(org.apache.helix.integration.paticipant.TestStateTransitionTimeoutWithResource) Time elapsed: 38.119 s <<< FAILURE!
java.lang.AssertionError: expected: but was:
at org.apache.helix.integration.paticipant.TestStateTransitionTimeoutWithResource.testStateTransitionTimeoutByClusterLevel(TestStateTransitionTimeoutWithResource.java:196)

[INFO]
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestStateTransitionTimeoutWithResource.testStateTransitionTimeoutByClusterLevel:196 expected: but was:
[ERROR] TestDeleteJobFromJobQueue.testForceDeleteJobFromJobQueue:75 » Helix Failed to ...
[ERROR] TestForceDeleteWorkflow.testDeleteStoppingStuckWorkflowForcefully:279 expected: but was:
[INFO]
[ERROR] Tests run: 890, Failures: 3, Errors: 0, Skipped: 0

The failed tests pass when run individually in the IDE.

Commits

  • My commits all reference appropriate Apache Helix GitHub issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Code Quality

  • My diff has been formatted using helix-style.xml

@i3wangyi
Contributor Author

Current Leader -> Standby diagram:
(image: current existing implementation)

@i3wangyi
Contributor Author

i3wangyi commented Dec 19, 2019

After applying the changes:
(image: after changes)

@i3wangyi i3wangyi changed the title Reset all listeners only in-memory operation Mitigate controller leadership switch latency - improve leader -> standby Dec 19, 2019
@junkaixue
Contributor

Logically, it sounds good to me. But I think this is not the major contributor to LeaderSwitch latency when the controller is trying to acquire leadership, right?

How much does it reduce the overhead?

@i3wangyi
Contributor Author

i3wangyi commented Jan 9, 2020

Logically, it sounds good to me. But I think this is not the major contributor to LeaderSwitch latency when the controller is trying to acquire leadership, right?

How much does it reduce the overhead?

No, the optimization is only for leader -> standby; one data point shows a 10x latency improvement (from 1700ms to 170ms). For optimizing the other part of the story, standby -> leader, there are different options; I will explore them later in separate PRs.

Contributor

@junkaixue junkaixue left a comment


Please add a test for it. I saw you link to #681. Have you tested your change with it?

if (_eventTypes.contains(EventType.NodeDataChanged)
|| _eventTypes.contains(EventType.NodeCreated)
|| _eventTypes.contains(EventType.NodeDeleted)) {
logger.info("Subscribing data change listener to path: " + path);
Contributor


Why remove these logs? While debugging, how do we know what kind of _eventTypes list this listener is configured with? Since there are still some concerns and bug reports regarding callbacks, we'd better keep the detailed logs.

Contributor Author


These logs are actually redundant. The private subscribeForChanges() eventually calls these other private methods:

  private void subscribeChildChange(String path, NotificationContext.Type callbackType) {
    if (callbackType == NotificationContext.Type.INIT
        || callbackType == NotificationContext.Type.CALLBACK) {
      if (logger.isDebugEnabled()) {
        logger.debug(_manager.getInstanceName() + " subscribes child-change. path: " + path
            + ", listener: " + _listener);
      }
      _zkClient.subscribeChildChanges(path, this);
    } else if (callbackType == NotificationContext.Type.FINALIZE) {
      logger.info(_manager.getInstanceName() + " unsubscribe child-change. path: " + path
          + ", listener: " + _listener);

      _zkClient.unsubscribeChildChanges(path, this);
    }
  }

  private void subscribeDataChange(String path, NotificationContext.Type callbackType) {
    if (callbackType == NotificationContext.Type.INIT
        || callbackType == NotificationContext.Type.CALLBACK) {
      if (logger.isDebugEnabled()) {
        logger.debug(_manager.getInstanceName() + " subscribe data-change. path: " + path
            + ", listener: " + _listener);
      }
      _zkClient.subscribeDataChanges(path, this);
    } else if (callbackType == NotificationContext.Type.FINALIZE) {
      logger.info(_manager.getInstanceName() + " unsubscribe data-change. path: " + path
          + ", listener: " + _listener);

      _zkClient.unsubscribeDataChanges(path, this);
    }
  }

to perform the actual listener subscription. I think keeping the verbose log in that last step is sufficient, and as you know, helix.log is already flooded with listener logs.

// TODO reset user defined handlers only
// TODO Fix the issue that when connection disconnected, reset handlers will be blocked. -- JJ
// This is because reset logic contains ZK operations.
resetHandlers(true);
Contributor


So we are not sending FINALIZE events on the reset anymore? I think this is problematic.

Contributor Author


Yeah, that's the design of using unsubscribeAll. The current callback handler's FINALIZE path is pretty expensive (it reads all children nodes under currentstates and then removes them one by one).
I can run tests to prove that without the FINALIZE event, HelixManager still cleans up all information on disconnect(). And a single callback handler can still use the FINALIZE event to un-register itself.
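The trade-off under discussion can be sketched as follows (names are illustrative, not Helix's actual API): per-handler FINALIZE pays a ZK-side cleanup cost per handler, while a bulk in-memory teardown skips those calls entirely, which is exactly why it is faster but also why user FINALIZE hooks never fire:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two teardown paths being compared.
public class TeardownSketch {
  interface Handler {
    void finalizeCallback();
  }

  static final List<Handler> handlers = new ArrayList<>();
  static int expensiveZkReads = 0;

  // Per-handler FINALIZE: each handler performs its own ZK cleanup
  // (e.g. reading children under currentstates, removing one by one).
  static void resetWithFinalize() {
    for (Handler h : handlers) h.finalizeCallback();
    handlers.clear();
  }

  // Bulk in-memory teardown: skip FINALIZE and drop every subscription at
  // once; ZK-side ephemeral state disappears when the session closes.
  static void resetInMemory() {
    handlers.clear();
  }

  public static void main(String[] args) {
    for (int i = 0; i < 3; i++) {
      handlers.add(() -> expensiveZkReads++); // simulate a costly cleanup
    }
    resetWithFinalize();
    System.out.println("ZK reads with FINALIZE: " + expensiveZkReads);

    for (int i = 0; i < 3; i++) {
      handlers.add(() -> expensiveZkReads++);
    }
    resetInMemory();
    System.out.println("ZK reads without FINALIZE: " + expensiveZkReads);
  }
}
```

The counter stays unchanged on the second path: no per-handler cleanup runs, which is the source of both the latency win and the reviewer's compatibility concern.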

Contributor


The thing is that our customers also depend on the FINALIZE event to do their cleanup. I believe that with this change disconnect will clean up all the listeners, but you break the contract.

Contributor Author


@jiajunwang I understand your concern. With the current Helix structure and library dependencies, there is often no easy way to satisfy everything while still keeping the code clean and short, IMO, even if we agree on the overall design. Let's take it offline.

I will try to resolve the remaining concerns as much as I can and propose a new draft code structure.

Contributor


Sure, feel free to call a short meeting with more people. My point is that we should never trade off correctness or functionality for performance.

- boost performance on leader -> standby latency (disconnect() method performs much faster)
- one sample result shows leader -> standby latency reduces from 1432ms to 176ms
TODO: add tests when session expiry
logger.info("Subscribing changes listener to path: " + path + ", type: " + callbackType
+ ", listener: " + _listener);
logger.info(
"START:INVOKE subscribing changes listener on path: {}, callbackType: {}, listener: {}, isWatchChild: {}",
Contributor


START:INVOKE/END:INVOKE are for the callbacks. Please don't use them here; it will be confusing to whoever debugs this later.

* *Caution*: currently it's only used during disconnecting from ZK and the
* listeners unsubscription will be taken care by zkClient directly
*/
void closeBatchCallbackProcessor() {
Contributor


Besides my concern about not sending FINALIZE, this method is a subset of void reset(boolean isShutdown); I think you can avoid the duplicated code.

Of course, please address my concern about not having FINALIZE first. Then we can revisit this one.
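One possible shape for the deduplication the reviewer suggests is a single reset(boolean isShutdown) whose shutdown branch covers what closeBatchCallbackProcessor does today. This is only a sketch with hypothetical names, not the actual Helix CallbackHandler code:

```java
// Illustrative sketch: folding the shutdown-only teardown into one
// reset(boolean) so the two paths share the common steps.
public class HandlerSketch {
  private boolean batchProcessorOpen = true;
  public final StringBuilder log = new StringBuilder();

  void reset(boolean isShutdown) {
    // Common step: close the batch callback processor exactly once.
    if (batchProcessorOpen) {
      batchProcessorOpen = false;
      log.append("closed-batch-processor;");
    }
    if (isShutdown) {
      // Disconnect path: the client drops all listeners in bulk,
      // so no per-path work is needed here.
      log.append("bulk-unsubscribe;");
    } else {
      // Ordinary reset: unsubscribe this handler's own path and
      // fire its FINALIZE callback as before.
      log.append("unsubscribe-path;finalize;");
    }
  }

  public static void main(String[] args) {
    HandlerSketch shutdown = new HandlerSketch();
    shutdown.reset(true);
    System.out.println(shutdown.log);

    HandlerSketch ordinary = new HandlerSketch();
    ordinary.reset(false);
    System.out.println(ordinary.log);
  }
}
```

The flag keeps both paths through one method, so the shared logic is written once and the FINALIZE question is isolated to a single branch.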

@i3wangyi i3wangyi closed this Feb 21, 2020
