Add ExcessiveTopStateResolver to gracefully fix the double-masters situation. by jiajunwang · Pull Request #1037 · apache/helix

jiajunwang · 2020-05-28T23:10:22Z

Issues

My PR addresses the following Helix issues and references them in the PR description:

Description

Here are some details about my PR, including screenshots of any UI changes:

Although the rebalancer will fix the additional master eventually, the default operations are arbitrary and it may cause an older master to survive. This may cause serious application logic issues since many applications require the master to have the latest data.
With this state resolver, the rebalancer will change the default behavior to reset all the master replicas so as to ensure the remaining one is the youngest one. Then the double-masters situation is gracefully resolved.

Tests

The following tests are written for this issue:

TestAbnormalStatesResolver.testExcessiveTopStateResolver()

The following is the result of the "mvn test" command on the appropriate module:

N/A, the newly added logic is only used in the new test case. The other tests won't touch the new logic.
Will run the whole test before merging the branch to master.

Commits

My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters (not including Jira issue reference)
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not "adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"

Documentation (Optional)

In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Code Quality

My diff has been formatted using helix-style.xml
(helix-style-intellij.xml if IntelliJ IDE is used)

…tuation. Although the rebalancer will fix the additional master eventually, the default operations are arbitrary and it may cause an older master to survive. This may cause serious application logic issues since many applications require the master to have the latest data. With this state resolver, the rebalancer will change the default behavior to reset all the master replicas so as to ensure the remaining one is the youngest one. Then the double-masters situation is gracefully resolved.

...x-core/src/test/java/org/apache/helix/integration/rebalancer/TestAbnormalStatesResolver.java

...c/main/java/org/apache/helix/controller/rebalancer/constraint/ExcessiveTopStateResolver.java

narendly · 2020-05-29T17:35:47Z

...c/main/java/org/apache/helix/controller/rebalancer/constraint/ExcessiveTopStateResolver.java

+  }
+
+  @Override
+  public Map<String, String> computeRecoveryAssignment(final CurrentStateOutput currentStateOutput,


computeCorrectedAssignment is clearer?

Not really, this is not the corrected mapping, this is the next step of fixing.

narendly · 2020-05-29T17:37:33Z

...c/main/java/org/apache/helix/controller/rebalancer/constraint/ExcessiveTopStateResolver.java

+    if (currentStateOutput.getCurrentStateMap(resourceName, partition).values().stream()
+        .filter(state -> state.equals(stateModelDef.getTopState())).count() > 1) {
+      return false;


I wonder if this would be too much of an overhead for each pipeline run?

Do you think it would be better to try to come up with a way to cache currentState mappings and compare diffs (going from O(n) -> O(1) check by storing results across pipelines).

For heavy users, this O(n) computation might become a significant bottleneck if done every pipeline.

Could you please clarify why comparing diff will bring the complexity from O(n) to O(1)?

@jiajunwang In CurrentStateOutput, could we add a top state counter map so we could cache the top state counter, like below? Then we could avoid that stream filter computation? Tradeoff is we need a bit more memory for the cache. But most of them are just references.

public void setCurrentState(String resourceName, Partition partition, String instanceName, String state) { (...... current code ......) // Counter number of top state replicas for a single top state model. if (state.equals(stateModelDef.getTopState())) { Map<String, Integer> counterMap = _topStateCounter.computeIfAbsent(resourceName, k -> new HashMap<>()) .computeIfAbsent(partition, k -> new HashMap<>()); counterMap.put(state, counterMap.getOrDefault(state, 0) + 1); } }

Not sure if we need to optimize this. Maybe you could test it. It seems for this part, the time complexity is down from O(n) to O(1), but I am not sure what the actual time saving is, considering the whole pipeline. If the whole pipeline complexity is O(N^2), with this optimization, it is O(N), that may help. If the whole pipeline is O(2 * N), with this optimization, still O(N).

I see. In that case, we should add this to the cache instead of CurrentStateOutput. The cache is "protected" by the selective update, so it will help to reduce some calculations.

That is a valid idea. But that requires more changes. For this specific usage, changing the fundamental cache class seems to be not worthy.
Moreover, if the resolver is not enabled, then we don't do the calculation at all.

Let me add a TODO there, if we have more usage of this count, then we shall do it.

...c/main/java/org/apache/helix/controller/rebalancer/constraint/ExcessiveTopStateResolver.java

junkaixue

lgtm

...c/main/java/org/apache/helix/controller/rebalancer/constraint/ExcessiveTopStateResolver.java

huizhilu · 2020-06-04T07:48:31Z

...c/main/java/org/apache/helix/controller/rebalancer/constraint/ExcessiveTopStateResolver.java

+    if (currentStateOutput.getCurrentStateMap(resourceName, partition).values().stream()
+        .filter(state -> state.equals(stateModelDef.getTopState())).count() > 1) {
+      return false;


@jiajunwang In CurrentStateOutput, could we add a top state counter map so we could cache the top state counter, like below? Then we could avoid that stream filter computation? Tradeoff is we need a bit more memory for the cache. But most of them are just references.

public void setCurrentState(String resourceName, Partition partition, String instanceName, String state) { (...... current code ......) // Counter number of top state replicas for a single top state model. if (state.equals(stateModelDef.getTopState())) { Map<String, Integer> counterMap = _topStateCounter.computeIfAbsent(resourceName, k -> new HashMap<>()) .computeIfAbsent(partition, k -> new HashMap<>()); counterMap.put(state, counterMap.getOrDefault(state, 0) + 1); } }

Not sure if we need to optimize this. Maybe you could test it. It seems for this part, the time complexity is down from O(n) to O(1), but I am not sure what the actual time saving is, considering the whole pipeline. If the whole pipeline complexity is O(N^2), with this optimization, it is O(N), that may help. If the whole pipeline is O(2 * N), with this optimization, still O(N).

jiajunwang · 2020-06-04T18:50:13Z

This PR is ready to be merged, approved by @dasahcc

…tuation. (#1037) Although the rebalancer will fix the additional master eventually, the default operations are arbitrary and it may cause an older master to survive. This may cause serious application logic issues since many applications require the master to have the latest data. With this state resolver, the rebalancer will change the default behavior to reset all the master replicas so as to ensure the remaining one is the youngest one. Then the double-masters situation is gracefully resolved.

…tuation. (apache#1037) Although the rebalancer will fix the additional master eventually, the default operations are arbitrary and it may cause an older master to survive. This may cause serious application logic issues since many applications require the master to have the latest data. With this state resolver, the rebalancer will change the default behavior to reset all the master replicas so as to ensure the remaining one is the youngest one. Then the double-masters situation is gracefully resolved.