
Conversation

@kgusakov (Contributor) commented Feb 15, 2023

@kgusakov marked this pull request as ready for review February 16, 2023 22:54
- On every new PD leader election, it must check the direct value (not the locally cached one) of the `zoneId.assignment.pending` keys, send a RebalanceRequest to the needed PrimaryReplicas, and then listen for updates from the last revision.
- On every PrimaryReplica reelection by the PD, it must send a new RebalanceRequest to the PrimaryReplica if the pending key is not empty.
- On every leader reelection inside the replication group (for leader-oriented protocols), the leader sends a leaderElected event to the PrimaryReplica, which forces the PrimaryReplica to send the RebalanceRequest to the replication group leader again.
- On every new PD leader election, it must check the direct values (not the locally cached ones) of the `zoneId.assignment.pending`/`zoneId.assignment.cancel` keys (the cancel key always wins, if it exists), send a `RebalanceRequest`/`CancelRebalanceRequest` to the needed PrimaryReplicas, and then listen for updates from the last revision of this key.
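A minimal sketch of this recovery step, assuming hypothetical types throughout (`MetaStorage`, `Entry`, `PrimaryReplicaService` and its methods are illustrative placeholders, not the actual Ignite 3 API): a newly elected PD leader reads the pending/cancel keys directly, lets the cancel key win when both exist, sends the corresponding request, and watches the key from the revision it just observed.

```java
/**
 * Sketch of the routine a newly elected placement driver (PD) leader
 * could run per distribution zone. All type and method names here are
 * hypothetical and only illustrate the steps listed above.
 */
class PdLeaderRecovery {
    /** Hypothetical metastorage view with direct (non-cached) reads. */
    interface MetaStorage {
        Entry getDirect(String key);
        void watch(String key, long fromRevision, Runnable onUpdate);
    }

    record Entry(byte[] value, long revision) {
        boolean empty() { return value == null; }
    }

    /** Hypothetical messaging facade towards primary replicas. */
    interface PrimaryReplicaService {
        void sendRebalanceRequest(String zoneId, byte[] assignments);
        void sendCancelRebalanceRequest(String zoneId, byte[] assignments);
    }

    private final MetaStorage metaStorage;
    private final PrimaryReplicaService replicas;

    PdLeaderRecovery(MetaStorage metaStorage, PrimaryReplicaService replicas) {
        this.metaStorage = metaStorage;
        this.replicas = replicas;
    }

    /** Runs once on PD leader election, for each distribution zone. */
    void onElectedAsPdLeader(String zoneId) {
        String pendingKey = zoneId + ".assignment.pending";
        String cancelKey = zoneId + ".assignment.cancel";

        // Direct reads: the locally cached values may be stale.
        Entry pending = metaStorage.getDirect(pendingKey);
        Entry cancel = metaStorage.getDirect(cancelKey);

        if (!cancel.empty()) {
            // The cancel key always wins if it exists.
            replicas.sendCancelRebalanceRequest(zoneId, cancel.value());
            metaStorage.watch(cancelKey, cancel.revision() + 1,
                    () -> onElectedAsPdLeader(zoneId));
        } else if (!pending.empty()) {
            replicas.sendRebalanceRequest(zoneId, pending.value());
            metaStorage.watch(pendingKey, pending.revision() + 1,
                    () -> onElectedAsPdLeader(zoneId));
        }
        // Both keys empty: nothing to recover for this zone.
    }
}
```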
Contributor
Do these requests contain a revision? Only the old and new topology are mentioned, as far as I can see.

When the PrimaryReplica sends `CancelRebalanceRequest(oldTopology, newTopology)` to the ReplicationGroup, the following cases are possible (see the sketch after this list):
- The replication group has an ongoing rebalance oldTopology->newTopology. It must be cancelled, and a cleanup of the replication group's configuration state back to oldTopology must be executed.
- The replication group has no ongoing rebalance and currentTopology==oldTopology. There is nothing to cancel; return a success response.
- The replication group has no ongoing rebalance and currentTopology==newTopology. The cancel request can't be executed; return a response saying so. The recipient of this response (the placement driver) must log this fact and perform the same routine as for a usual rebalanceDone.
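A sketch of how the replication group side could dispatch over these three cases, again with hypothetical types (`GroupState`, `CancelResponse`, and the status names are illustrative, not the real Ignite 3 API):

```java
import java.util.List;

/** Illustrative handler for CancelRebalanceRequest(oldTopology, newTopology). */
class CancelRebalanceHandler {
    enum Status { CANCELLED, NOTHING_TO_CANCEL, ALREADY_FINISHED }

    record CancelResponse(Status status) {}

    /** Hypothetical view of the group's configuration state. */
    interface GroupState {
        boolean hasOngoingRebalance();
        List<String> currentTopology();
        void abortRebalanceAndResetTo(List<String> topology);
    }

    CancelResponse handle(GroupState group, List<String> oldTopology, List<String> newTopology) {
        if (group.hasOngoingRebalance()) {
            // Case 1: rebalance oldTopology -> newTopology is in flight;
            // abort it and roll the configuration back to oldTopology.
            group.abortRebalanceAndResetTo(oldTopology);
            return new CancelResponse(Status.CANCELLED);
        }

        if (group.currentTopology().equals(newTopology)) {
            // Case 3: the rebalance already finished; the placement driver
            // logs this and runs the usual rebalanceDone routine.
            return new CancelResponse(Status.ALREADY_FINISHED);
        }

        // Case 2: currentTopology == oldTopology, nothing to cancel.
        return new CancelResponse(Status.NOTHING_TO_CANCEL);
    }
}
```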
Contributor

What if, after sending the CancelRebalanceRequest, the placement driver finds out that some of the replication groups have finished the rebalance (currentTopology==newTopology) and some have not?

Contributor Author

How is that possible, if any rebalance touches only one distribution zone and, therefore, only one replication group?

Contributor

Seems I missed that point, my fault. Question withdrawn.

Co-authored-by: Denis Chudov <moonglloom@gmail.com>
@asfgit asfgit closed this in c276a33 Feb 22, 2023
lowka pushed a commit to gridgain/apache-ignite-3 that referenced this pull request Mar 18, 2023
…pache#1676

Signed-off-by: Slava Koptilin <slava.koptilin@gmail.com>
lowka pushed a commit to gridgain/apache-ignite-3 that referenced this pull request Apr 19, 2023
…pache#1676

Signed-off-by: Slava Koptilin <slava.koptilin@gmail.com>
