Who is the new intermediate master? #46
The new intermediate master is {successorHost}. BTW, you can also get this value from an environment variable.
@shlomi-noach: That was my first thought also. But in my test the host reported in {successorHost} was not the new intermediate master. So it promoted the master as the successor. Do I miss something here?
Oh OK. So here's the complication (though, first of all, in your case it should not have turned out that way). An intermediate master failover may result in multiple successors. Perhaps the promoted replica cannot take over all of its siblings, and some will move elsewhere. There can be up to three different successors at the same time: an uncle, one of the orphaned siblings (as in your case), and the grandfather (the master, in your case). The logic goes through all three options and apparently settled on the master as the successor, even though everything was satisfied by the sibling.
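A rough sketch of that three-way choice, written as a preference order. The ordering shown here is only my illustration of the idea, not Orchestrator's actual code:

```shell
#!/bin/sh
# Sketch: pick a successor for a dead intermediate master from up to three
# candidates. An empty string means "this candidate is not available".
# The preference order (sibling, then uncle, then grandfather) is assumed
# for illustration; the real logic weighs more factors.
pick_successor() {
    sibling="$1"     # an orphaned sibling promoted among the replicas
    uncle="$2"       # a sibling of the failed intermediate master
    grandfather="$3" # the master above the failed intermediate master

    for candidate in "$sibling" "$uncle" "$grandfather"; do
        if [ -n "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```

With such an ordering, `pick_successor "" "uncle1" "master1"` would report the uncle, and only fall through to the grandfather (the master) when nothing closer is available.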
Thanks @shlomi-noach. I am going to run some tests again to make sure I did everything right, but I already ran it many times and the result was the same.
I'm now looking into this.
I'm actually unable to get the same results. When I fail over an intermediate master X, and when all of its replicas are successfully relocated to its sibling Y, I always get Y as {successorHost}.
@theTibi I see. Now here's the thing: in your case it can be argued who the successor is.
There are actually two servers which have a role in the recovery process. Which of them is the successor? In your particular use case you'd like to get the new intermediate master, because in your setup the intermediate master has a special importance, being writable. However, for someone else it might make more sense to know "who ultimately took charge". This can be argued, and I'm open to hearing a strong argument for one over the other.
@shlomi-noach Both of them have pros and cons. But in my opinion, if Orchestrator handles intermediate masters and intermediate master failovers, the placeholders should identify the new intermediate master. For example, in this case the intermediate master is writable and can take writes; after a failover, the application has to know where to write. But maybe we should not have to choose between the two meanings. Maybe a new placeholder would be the best solution, so everybody can decide which value to use.
An interesting scenario is that of a split. Say the new intermediate master could only take over a few of the boxes, and the rest were salvaged by the master. Then the problem is even stronger. Let me look at the code and see what can make sense to change.
Addressed by #61
Gonna run a few experiments to see that #61 doesn't return the wrong thing, and then I'm happy.
@shlomi-noach Thanks, I am also going to test this soon. I will let you know how it goes.
@theTibi if you're able to compile and test, that would be awesome.
@shlomi-noach Yes, going to do it today.
@shlomi-noach: So I did some tests but I did not get what I expected. Here are two examples. In the first case I would like to see the new intermediate master reported, but as we can see it is not.
Another test: again I was expecting the new intermediate master to be reported.
So I can see some changes in the promotion logic, but I think it is still missing the point: Orchestrator is promoting a new intermediate master which might not have all the data.
@theTibi I'm surprised and confused that you find the above logic to be incorrect, and I think it all comes down to the "have all the data" issue. To recap your example, you have:
you kill the intermediate master, the orphaned replicas are relocated,
and Orchestrator announces the successor. I find this to be perfect behavior. You have writable intermediate masters. What if you'd have different kinds of writable intermediate masters? Imagine:
How would you react then? Your setup is what it is, but I just wanted to illustrate why I don't see this as the general case. In your expectation there are hidden assumptions. I hope I managed to clarify the complexity of your setup and of the specific failover expectation you have for it. I'm happy to continue the discussion and perhaps discuss simple and feasible solutions to all these questions.
@shlomi-noach Thank you for your answer.
So
Another example:
If If
Same as in the first case. Another example:
If
So in my opinion, if there are intermediate masters and they have slaves, when an intermediate master fails one of its own slaves should be promoted in its place. Just like when you have one master and many slaves: if the master dies, one slave will be promoted and the other slaves relocate under the new master. But of course it has some limitations. So I understand Orchestrator can not handle all the different topologies and does not know which server has which data, etc., but I think it would already help a lot. Opinions?
You must be aware that this is not always possible. Consider:
However, can you ensure that both candidate replicas are eligible for promotion?
@shlomi-noach I understand. Orchestrator is not a magic wand which solves all the problems. Of course, if you would like to use topologies like this, you should know the requirements and limitations; for example, you have to make sure the candidate replicas meet them.
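The "requirements and limitations" here largely come down to any candidate replica applying the same set of writes as the failed intermediate master. A minimal sketch of one such pre-check, comparing the replication filters of two candidates. The comparison approach is my illustration, not Orchestrator's actual promotion check, and in practice the filter values would be read per host, e.g. from `SHOW SLAVE STATUS` (the `Replicate_Do_DB` field); here they are passed as arguments so the check itself is self-contained:

```shell
#!/bin/sh
# Sketch: decide whether two candidate replicas are interchangeable
# promotion targets based on their replication filter settings.
filters_match() {
    # $1 = filters of candidate A, $2 = filters of candidate B
    if [ "$1" = "$2" ]; then
        echo "candidates are interchangeable"
        return 0
    else
        echo "candidates diverge: '$1' vs '$2'"
        return 1
    fi
}
```

If the filters diverge, neither candidate can safely absorb the other's siblings, which is exactly the situation where a generic promotion goes wrong.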
OK, I'm getting a clearer picture of how this would be implemented in code. |
@shlomi-noach In your previous post you mentioned this.
@theTibi I'm moving away from hostname regexes and onto something more formal.
I'm in the middle of a split and looking at this now, partly to ensure an intermediate master with filters won't ever get promoted and that normal slaves won't get put under these filtered intermediate masters. I think, but need to check, that the current logic prevents this. As Shlomi says, things can get quite hairy, and my topology is already 6 layers deep. So the trick here seems to be to make orchestrator aware of these barriers/borders and to ensure that failover never moves boxes outside of their zones.
Hi, I was just wondering: are there any new thoughts on this topic?
Hi @theTibi, I haven't made progress on this, unfortunately; I was not focusing on this issue.
@theTibi I'm looking into this now.
So the most basic question is: we make the promotion, a new intermediate master takes over as we would expect, etc. How do we connect the new intermediate master to the rest of the topology? The basic assumption is that the intermediate master took writes, hence it was potentially taking writes at the time of failure, hence its binary logs, and those of its replicas, are different from those of the rest of the topology.
Time to bring up this old ticket again. I think with GTID this won't fail. Let's say we have the following replica set:
Rep3 dies:
If we are using GTID, I think this reorganisation is safe.
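With GTID auto-positioning, reattaching a promoted server (or its orphaned replicas) does not depend on matching binary log file/position coordinates, which is what makes the divergent-binlogs concern above much less severe. A hedged sketch of the statements one would issue on an orphaned replica to repoint it; the hostname is illustrative, and the SQL is printed rather than executed so the shape is easy to review:

```shell
#!/bin/sh
# Sketch: emit the GTID-based repointing statements for an orphaned replica.
# NEW_MASTER is the promoted intermediate master (illustrative name only).
NEW_MASTER="rep4.example.com"

repoint_sql() {
    cat <<EOF
STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST='$1',
  MASTER_AUTO_POSITION=1;
START SLAVE;
EOF
}

repoint_sql "$NEW_MASTER"
```

With MASTER_AUTO_POSITION=1 the replica negotiates its starting point from its own executed GTID set, so the new master's binlog coordinates never need to be computed.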
Hi,
If I have a topology like this (just an example):
Rep2 is an intermediate master. If rep2 dies, Orchestrator processes a DeadIntermediateMaster failover and reorganises the topology like (just an example):
rep1 --> rep4 --> rep3
So rep4 is going to be an intermediate master now. But based on the PostFailoverProcesses placeholders I can not decide who the new intermediate master is. It has the following placeholders:
{failureType}, {failureDescription}, {failedHost}, {failureCluster}, {failureClusterAlias}, {failureClusterDomain}, {failedPort}, {successorHost}, {successorPort}, {successorAlias}, {countSlaves}, {slaveHosts}, {isDowntimed}, {isSuccessful}, {lostSlaves}
I am trying to call an external script when an intermediate master dies, but the script has to know who the new intermediate master is after the failover.
Are there any solutions/ideas for this?
Thanks.
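For reference, a PostFailoverProcesses hook receives these placeholders already expanded in its command line. A minimal sketch of such a hook; the script path, the argument order, and the choice of placeholders are assumptions for illustration, and as the discussion above shows, for a DeadIntermediateMaster the reported successor may not be the server you expect:

```shell
#!/bin/sh
# Sketch of a PostFailoverProcesses hook body, written as a function so the
# logic is easy to exercise. Orchestrator config would reference it with the
# placeholders it should expand, e.g. (illustrative):
#   "PostFailoverProcesses": [
#     "/usr/local/bin/on-failover.sh {failureType} {failedHost} {successorHost}"
#   ]
# The real script would end with: on_failover "$@"
on_failover() {
    failure_type="$1"
    failed_host="$2"
    successor_host="$3"

    case "$failure_type" in
        DeadIntermediateMaster)
            echo "intermediate master $failed_host failed; successor reported as $successor_host"
            ;;
        *)
            echo "$failure_type on $failed_host; successor $successor_host"
            ;;
    esac
}
```

Calling `on_failover DeadIntermediateMaster rep2 rep4` prints `intermediate master rep2 failed; successor reported as rep4`.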