ISPN-14782 Unable to reenable rebalance after cluster scale up #10834
Image pushed for Jenkins build #1:
Replayed CI.
Image pushed for Jenkins build #2:
Image pushed for Jenkins build #3:
Image pushed for Jenkins build #4:
Added a commit for https://issues.redhat.com/browse/ISPN-14793. This changes how the nodes handle the
LGTM, just two minor points.
I've backported to speed things up while you're on PTO: #10849
I removed the `System.out` comment, but I haven't changed the `AggregateCompletionStage` usage.
@@ -295,6 +309,7 @@ public CompletionStage<ManagerStatusResponse> handleStatusRequest(int viewId) {
// As long as we have an older view, we can still process topologies from the old coordinator
return withView(viewId, getGlobalTimeout(), MILLISECONDS).thenApply(ignored -> {
Map<String, CacheStatusResponse> caches = new HashMap<>();
AggregateCompletionStage<Void> joins = CompletionStages.aggregateCompletionStage();
Why do we need an `AggregateCompletionStage` here? AFAICT we only depend on a single `CompletionStage`.
Sorry, this was leftover from the first solution. Updating to remove it.
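For readers following the thread, the point about aggregation can be illustrated with a minimal sketch that uses only the JDK's `CompletableFuture`, not Infinispan's `AggregateCompletionStage`. The class and method names (`SingleStageSketch`, `waitForAll`, `waitForOne`) are hypothetical and not part of this PR.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Hypothetical illustration, JDK only; not code from the PR.
final class SingleStageSketch {

    // Aggregation pays off when several independent stages must all complete.
    static CompletionStage<Void> waitForAll(List<? extends CompletionStage<?>> stages) {
        CompletableFuture<?>[] futures = new CompletableFuture<?>[stages.size()];
        for (int i = 0; i < stages.size(); i++) {
            futures[i] = stages.get(i).toCompletableFuture();
        }
        return CompletableFuture.allOf(futures);
    }

    // With a single dependency, chaining on the stage itself is enough.
    static CompletionStage<Void> waitForOne(CompletionStage<?> stage) {
        return stage.thenAccept(ignored -> { });
    }
}
```

With only one stage to wait on, `waitForOne` expresses the same dependency without the extra aggregate object, which is presumably the simplification the review comment is asking for.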
@@ -535,6 +564,7 @@ private CompletionStage<Void> doHandleStableTopologyUpdate(String cacheName, Cac
CacheTopology stableTopology = cacheStatus.getStableTopology();
if (stableTopology == null || stableTopology.getTopologyId() < newStableTopology.getTopologyId()) {
log.tracef("Updating stable topology for cache %s: %s", cacheName, newStableTopology);
//System.out.printf("[%s] Updating stable topology for cache %s: %s\n", transport.getAddress(), cacheName, newStableTopology);
Can be removed
* After receiving a CacheStatusRequest from the coordinator, the nodes will send a join request for the caches which need to be recovered from the persistent state.
Updated with suggestions.
Image pushed for Jenkins build #5:
https://issues.redhat.com/browse/ISPN-14782
https://issues.redhat.com/browse/ISPN-14793
We do not broadcast the topology as stable if it wasn't yet restored.
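The rule above, combined with the "only accept a newer topology id" check visible in the `doHandleStableTopologyUpdate` hunk, could be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `StableTopologySketch`, `Topology`, the `restored` flag, and `offerStable` are hypothetical names, and the list stands in for an actual broadcast.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; Topology is not Infinispan's CacheTopology.
final class StableTopologySketch {

    static final class Topology {
        final int id;
        final boolean restored; // assumed flag: state already restored

        Topology(int id, boolean restored) {
            this.id = id;
            this.restored = restored;
        }
    }

    private Topology stable;
    // Records which topology ids would have been broadcast as stable.
    final List<Integer> broadcastIds = new ArrayList<>();

    // Returns true only if the candidate was accepted and "broadcast".
    boolean offerStable(Topology candidate) {
        if (!candidate.restored) {
            return false; // not yet restored: never announce it as stable
        }
        if (stable != null && stable.id >= candidate.id) {
            return false; // ignore topologies that are not newer
        }
        stable = candidate;
        broadcastIds.add(candidate.id);
        return true;
    }
}
```

In this sketch an unrestored topology is rejected before the id comparison, so it can never replace the current stable topology or reach other nodes.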