
[Cluster inconsistency] Master sometimes enslaves itself to another master if all slots are moved away #3083

irfanurrehman opened this issue Feb 12, 2016 · 3 comments

@irfanurrehman

Referring to and extending this defect: #3043

Once the above happens (or in any other scenario where a cluster has a master without slaves):

If slots are moved back and forth, from this master to some other, in such a way that in each repetition the last slot is also moved away from this master (meaning all the slots this master has are moved), this master sometimes enslaves itself to the node its last slot was moved to.

I know it's a corner case, and ideally no admin would want to do this, but the reasoning is that this can also happen the very first time all the slots are moved away from this master. And IMHO there might be scenarios where a cluster needs to have a master that is not serving any slots, and this might not be the original config (i.e. the slots were moved away at some later point in the life of the cluster, after the original config).

I tried to dig a little and have a bit of analysis (which might not be completely right though):

1 - Currently there is no differentiation between the slot updates that float around the cluster because of an actual SETSLOT command and those caused by failure/failover scenarios.
2 - On SETSLOT (of the last slot being moved away), redis-trib issues a SETSLOT on all the master nodes, each of which will broadcast a cluster update with its own config epoch.
3 - If the actual SETSLOT happens on the node from which the last slot is being moved, in such a way that its config epoch is the highest, then the cluster updates from the other nodes are discarded.
4 - If the cluster update message arrives from the target node (or some other master) with a config epoch higher than this master currently has, this master makes itself a slave because of the code below:

In clusterUpdateSlotsConfigWith:

    /* If at least one slot was reassigned from a node to another node
     * with a greater configEpoch, it is possible that:
     * 1) We are a master left without slots. This means that we were
     *    failed over and we should turn into a replica of the new
     *    master.
     * 2) We are a slave and our master is left without slots. We need
     *    to replicate to the new slots owner. */
    if (newmaster && curmaster->numslots == 0) {
        redisLog(REDIS_WARNING,
            "Configuration change detected. Reconfiguring myself "
            "as a replica of %.40s", sender->name);
        clusterSetMaster(sender);
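
One direction I can imagine for a fix (just a sketch of the idea, not a tested patch; sender_slots and migrated_our_slots are made-up names for counters that would have to be accumulated during the slot scan in clusterUpdateSlotsConfigWith) is to demote ourselves only when the sender claims exactly the slots we lost, which is the failover pattern, and not when the sender already owned other slots, which is the reshard pattern:

    /* Hypothetical counters, filled in while scanning the 16384 slots:
     *   sender_slots       - number of slots the sender claims in this update
     *   migrated_our_slots - number of our slots that moved to the sender
     * Demote ourselves only if the sender took over exactly the slots we
     * lost (as a promoted replica would); a reshard target that already
     * serves other slots would then no longer trigger the demotion. */
    if (newmaster && curmaster->numslots == 0 &&
        sender_slots == migrated_our_slots)
    {
        redisLog(REDIS_WARNING,
            "Configuration change detected. Reconfiguring myself "
            "as a replica of %.40s", sender->name);
        clusterSetMaster(sender);
    }

Note that this heuristic alone would not cover the node-replacement case: a brand-new, empty master that receives all of our slots also ends up claiming exactly the slots we lost.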

Now the points are:
1 - Do we really need to support this scenario? I'd say maybe not necessarily; if users and cluster admins are aware of the side effects, they can certainly avoid this happening.
2 - Is it really needed to differentiate between the cluster updates triggered by a SETSLOT called by an admin/tool and the cluster updates caused by cluster failures/failovers? I'd say maybe, because we might uncover similar scenarios later; I am not aware of any right now though. :)
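
To make point 2 concrete, here is a rough sketch of what differentiating the two kinds of updates could look like. This is entirely hypothetical: neither the reason field nor the constants below exist in the real cluster bus protocol; they only illustrate the idea of tagging slot-config updates with the reason they were generated:

    /* Hypothetical one-byte "reason" attached to slot-config updates
     * (nothing below exists in cluster.h; illustration only). */
    #define CLUSTERMSG_UPDATE_RESHARD  0  /* caused by CLUSTER SETSLOT */
    #define CLUSTERMSG_UPDATE_FAILOVER 1  /* caused by a replica promotion */

    /* ... then, inside clusterUpdateSlotsConfigWith(), the demotion
     * could be restricted to genuine failovers: */
    if (newmaster && curmaster->numslots == 0 &&
        update_reason == CLUSTERMSG_UPDATE_FAILOVER)
    {
        clusterSetMaster(sender);
    }

A master emptied by an admin reshard would then keep its role, at the cost of a change to the cluster bus protocol.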

@irfanurrehman (Author)

Thinking further, take this scenario:

  • The cluster admin wants to replace a particular server/node with a new one, so he adds a new master and moves all the slots to it. We wouldn't want the current master to enslave itself to the new one, would we? It wouldn't cause too many problems though, because the old master has already stopped serving all its slots anyway. :-)

@antirez (Contributor)

antirez commented Feb 12, 2016

Thanks, my notes ASAP; this will be fixed before RC4 of 3.2.

antirez added this to the Redis >= 3.2 milestone Feb 12, 2016
@irfanurrehman (Author)

Thanks a lot!!
