Make MYSQL_INNODB_NUM_MEMBERS work with offline members #8
Conversation
checking the number of members in the cluster. This has the advantage that, once fully formed, it never shrinks even if a member goes offline, so starting the router container when one mysql replica is down doesn't fail. Such a scenario can frequently occur when the router is deployed alongside the client application.

Signed-off-by: Gianluca Borello <g.borello@gmail.com>
Hi, thank you for submitting this pull request. In order to consider your code we need you to sign the Oracle Contribution Agreement (OCA). Please review the details and follow the instructions at http://www.oracle.com/technetwork/community/oca-486395.html
Hi, I am already a contributor in other Oracle repositories on GitHub, and my name is listed on the page at http://www.oracle.com/technetwork/community/oca-486395.html, along with my GitHub username:
As your FAQ says (http://www.oracle.com/technetwork/oca-faq-405384.pdf):
So I think I'm good?
Hi, I have the same problem and Gianluca's solution works for me. Thank you
@gianlucaborello I'm not 100% familiar with the procedure, but generally you need to follow the instructions given on the page linked by the OCA bot, and your merge proposal will be added to an internal queue for review
@gianlucaborello: Hi,
Thank you for your help. Yes, my name is already listed in the OCA list, and I have successfully contributed to other Oracle projects on GitHub using this username. As for the other request, here is the explicit agreement of the OCA: I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it. Let me know if I should also send it via email. Thanks
@gianlucaborello Sorry for the late reply. For handling MySQL contributions, in addition to the signed OCA you need to have a user on http://bugs.mysql.com; please create one by clicking the 'register' link at the top right of the screen. Thanks
Done. Username is Thanks
Hi, thank you for your contribution. Please confirm this code is submitted under the terms of the OCA (Oracle's Contribution Agreement) you have previously signed by cutting and pasting the following text as a comment:
I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it. |
Thanks
Hi, thank you for your contribution. Your code has been assigned to an internal queue. Please follow
@gianlucaborello What if we removed the loop altogether and you used "restart: on-failure"? This way we remove the static dependency, and your setup will still work because the container will be restarted until the initial bootstrap server is working. Right? A somewhat larger problem is that a similar version of this dependency still exists for the initial bootstrap server -- if it goes away and Router is restarted, it will fail. But that's maybe part of a larger internal discussion.
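Sketched in docker-compose terms (service and image names here are illustrative, not an actual compose file from the repo):

```yaml
# Sketch: rely on the restart policy instead of the wait loop. The router
# container simply retries bootstrap until the bootstrap server accepts
# connections. Service names and environment values are hypothetical.
version: "3"
services:
  mysql-router:
    image: mysql/mysql-router
    restart: on-failure
    environment:
      MYSQL_HOST: mysql1
      MYSQL_PORT: "3306"
```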
@neumayer thanks for your reply. What you are suggesting can be seen as an improvement over the current behavior, which I'd definitely welcome, but if I'm not mistaken it is still subject to a race condition, unless the user explicitly makes sure that all the Router instances are started after the cluster has been fully formed at least once, which in my experience just places additional burden on the user. I, for one, maintain a few production applications and usually let all components start at the same time on Kubernetes and naturally figure out the proper dependency among themselves, relying for example on the restart policies.

In particular, if you remove the loop, this is the race condition I see:
Notice that Router doesn't exit; it keeps staying in this loop, waiting.

I'm not particularly attached to the solution I'm proposing, but on paper it seems to work relatively well, in the sense that I can't think of a scenario where Router would actually become unavailable, assuming the underlying cluster is still operating even with some replicas down (modulo the initial bootstrap server problem that you mention, but that's inevitable at the moment). What's the disadvantage that you see?

The only scenario it doesn't cover is a user changing the number of cluster members at runtime, but that to me seems a different use case that should perhaps be handled natively inside Router, the same way it's done in other distributed systems (e.g. in Cassandra, once a member of the cluster is discovered via a bootstrap node, it will keep being considered even if the original bootstrap node goes down, thus eliminating the need to know all the cluster members at bootstrap time, as Router instead seems to want). Hope this all makes sense.
First of all, in case I haven't said it: thanks for the good writeup and your input, we really appreciate it.

Now a bit of explanation on how I see things: the entrypoint script was always meant as a workaround until the functionality the router image should provide becomes a bit clearer. Currently it is intended to let users use the bootstrap mode against a cluster. And the "wait for a number of servers to be online" part was mainly a workaround for the case when one starts multiple servers and router locally at the same time: the servers will simply not have finished the initialisation process in time, so router (or rather bootstrap mode) fails. I always had a strange feeling about it, and now it clearly doesn't work with scaling of the cluster (well, both intended and unintended scaling). I really think all these issues are due to current limitations in bootstrap mode (or due to how we use bootstrap mode in the image). So in the medium run I see two "proper" solutions:

A lot of my motivation comes from keeping the docker image as thin a layer as possible, really only handling configuration for the application. This is to avoid duplicating logic (like the waiting for the servers is, in a way).
I hear there are internal efforts to make bootstrap mode more graceful/dynamic. There is also a somewhat larger discussion on how router is best used in different docker scenarios (plain/docker-compose/container schedulers). I also see that your immediate problem is not solved by any of this :-)

For the time being I can propose a compromise: leave the loop the way you describe it (after all, it doesn't change semantics and shouldn't impact existing deployments), but make it optional. I.e. if the variable is not set there is no loop, just restart.
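A rough sketch of what I mean in the entrypoint (hypothetical; not the actual script):

```shell
# Sketch of the proposed compromise: if MYSQL_INNODB_NUM_MEMBERS is unset,
# skip the wait loop entirely and let the container restart policy handle a
# not-yet-ready bootstrap server. Messages and structure are illustrative.
if [ -z "${MYSQL_INNODB_NUM_MEMBERS:-}" ]; then
  echo "MYSQL_INNODB_NUM_MEMBERS not set, skipping member wait loop"
else
  echo "waiting for ${MYSQL_INNODB_NUM_MEMBERS} members to be discoverable"
  # ... the existing polling loop would run here ...
fi
```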
Thanks for the context @neumayer. I'll definitely be interested in hearing about new developments in bootstrap mode, as I'd love to see a more resilient workflow. I am fine with your compromise. If you make the variable optional, it could be worth adding a note saying that, in that mode, users should try to launch Router only after the cluster has been fully formed the first time; otherwise people might end up spending hours troubleshooting like me, only to eventually figure out the internal detail that Router never uses the cluster members discovered at runtime if the ones discovered during the bootstrap go down, leading to loss of availability. Thanks
Great. I mentioned it in the readme. I merged both changes; they will be part of the next release.
Hello,
I am facing some issues dealing with this image in production, in particular around the semantics of MYSQL_INNODB_NUM_MEMBERS. Specifically, the container fails to start until exactly MYSQL_INNODB_NUM_MEMBERS members are part of the cluster and online. In my case, that number is set to 3, and the setup is a standard single-primary InnoDB Cluster.

This creates a problem because, if one mysql replica goes down, the cluster is still healthy since it still has a quorum (though it won't tolerate further failures, which might not be an imminent problem). But even if it's healthy, starting new Router containers will fail, because they will be stuck waiting for the missing replica to come up. This severely limits the availability of Router, especially when Router itself gets deployed alongside the client application (as recommended in the official documentation), which might be subject to a very frequent deployment cycle, even when a mysql replica is down.
Looking a bit deeper at the implementation of the container, the variable MYSQL_INNODB_NUM_MEMBERS is used in the docker entrypoint as a startup gate: we don't proceed with the initialization until MYSQL_INNODB_NUM_MEMBERS members are up and online according to performance_schema.replication_group_members.

My first attempt to solve this was to relax this query to the following:
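Something like this (a sketch; the exact query may have differed slightly):

```sql
-- Sketch of the relaxed check: count every listed member regardless of
-- MEMBER_STATE, instead of only the ONLINE ones. The literal 3 stands in
-- for MYSQL_INNODB_NUM_MEMBERS.
SELECT COUNT(*) = 3
  FROM performance_schema.replication_group_members;
```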
This doesn't work, because replication_group_members entries are removed if a member of the cluster goes down, so if I temporarily lose one replica, the table shows just two entries. In other words, if I bring down the mysql3 container, I go from three listed members to two.
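Illustratively (a sketch of the before/after output, with hypothetical host names):

```sql
-- Before stopping mysql3: all three members are listed
SELECT MEMBER_HOST, MEMBER_STATE
  FROM performance_schema.replication_group_members;
-- mysql1 | ONLINE
-- mysql2 | ONLINE
-- mysql3 | ONLINE

-- After stopping mysql3: its row is removed entirely, not marked offline
-- mysql1 | ONLINE
-- mysql2 | ONLINE
```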
So that's not a viable solution.

Another attempt to relax the entrypoint constraint was to wait until at least (and not exactly) N members are online, so I could just pass MYSQL_INNODB_NUM_MEMBERS=1, changing the underlying query to an at-least comparison. Upon first inspection, this worked, because Router will dynamically discover the other cluster members at runtime via the bootstrap server and properly route the traffic through the right master, so it doesn't matter if we end up with a configuration file containing a single bootstrap server:
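Something along these lines (a sketch of the generated configuration; exact section and option names may differ from a real bootstrap run):

```ini
# Sketch of a mysqlrouter.conf generated after bootstrapping against a
# single server; Router discovers the remaining members at runtime.
# Cluster and user names are hypothetical.
[metadata_cache:mycluster]
bootstrap_server_addresses=mysql://mysql1:3306
user=mysql_router_user
metadata_cluster=mycluster
```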
However, it seems that all the cluster state is always discovered only via the bootstrap nodes. This means that if the original node discovered during the bootstrap goes down, the entire thing goes down: Router stops serving all requests, even if we have a valid cluster formed by mysql2 and mysql3.

So this option is not viable either (or rather, it would be viable only if we introduced an explicit dependency between Router and the mysql instances, making sure that Router is started after the cluster is fully formed for the first time, which seems annoying).
The third option that I found was to not rely on the replication_group_members table, but instead on the exact same table that Router internally uses during the bootstrap process to discover the members of the cluster (https://github.com/mysql/mysql-router/blob/8.0/src/router/src/config_generator.cc#L1328). This table has the advantage that all the members are always listed, even if one is down.

So, if we use this table with MYSQL_INNODB_NUM_MEMBERS=3, we can effectively make sure that Router, just like before, doesn't come up until all the bootstrap server addresses can be properly discovered by the bootstrap process, but we also become more tolerant to a replica going temporarily down, since Router can be safely started from scratch in that scenario. So the query that I'm using, adapted from the C++ code in the link, counts the instances registered in the cluster metadata.

What do you think? I'm not too familiar with Router, so I would be interested to know if I missed some corner case.
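For reference, the adapted query might look roughly like this (a sketch: table and column names follow the standard mysql_innodb_cluster_metadata schema, and the exact query submitted in the PR may differ):

```sql
-- Sketch: count instances registered in the InnoDB Cluster metadata.
-- Unlike performance_schema.replication_group_members, these rows
-- persist while a member is temporarily offline. The literal 3 stands
-- in for MYSQL_INNODB_NUM_MEMBERS.
SELECT COUNT(*) = 3
  FROM mysql_innodb_cluster_metadata.instances AS I
  JOIN mysql_innodb_cluster_metadata.replicasets AS R
    ON I.replicaset_id = R.replicaset_id
  JOIN mysql_innodb_cluster_metadata.clusters AS C
    ON R.cluster_id = C.cluster_id;
```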