keeper: "our keeper requested role is not available" #146
Comments
@sgotti any thoughts on the above?
More background after further testing: given my setup of 1 sentinel, 1 proxy and 3 keepers, I can start the sentinel first. Each of the three keeper nodes currently starts after a random sleep of 1-60 seconds, and I cannot control the order in which these three nodes start. I delay the start of the proxy by 5 minutes, so I can be sure it starts last. It appears that starting more than one keeper is itself the hangup: I cannot start >1 keeper node and have the cluster come up consistently. If I HAVE to start the three keepers individually, I may be able to do so, but I was hoping to avoid that sort of approach.
@wchrisjohnson this is the intended behavior. When the master sentinel has to initialize the cluster for the first time and more than one keeper is registered, the sentinel cannot make any assumption about which keeper to choose as master. If you're starting from an empty db (so choosing any keeper as master is ok), you can just set an initial cluster config (exec the sentinels with …). This option is not the default, since sane configs should not make any assumptions.
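For readers following along, the decision described in the comment above can be sketched roughly as follows. This is only an illustration of the reasoning, not stolon's actual code: the function name `choose_initial_master` and the flag `allow_any_keeper` are hypothetical, and the single-keeper case is an assumption inferred from the comment.

```python
from typing import Optional, Sequence


def choose_initial_master(
    keepers: Sequence[str], allow_any_keeper: bool = False
) -> Optional[str]:
    """Return the keeper to initialize as master, or None if no safe choice exists.

    `allow_any_keeper` is a hypothetical stand-in for the opt-in setting
    mentioned in the thread; it is not a real stolon option name.
    """
    if not keepers:
        # Nothing has registered yet, so there is nothing to choose from.
        return None
    if len(keepers) == 1:
        # Assumption: with a single registered keeper the choice is unambiguous.
        return keepers[0]
    if allow_any_keeper:
        # The operator has stated that any keeper is acceptable (empty db),
        # so pick one deterministically.
        return sorted(keepers)[0]
    # Several keepers and no opt-in: refuse to guess.
    return None
```

With three registered keepers and no opt-in, the sketch returns `None`, which matches the "requested role is not available" symptom: no keeper is ever told to become master.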
@sgotti sounds good - will try that. It would be useful to pass that cluster config via an env var vs having to read a file on the filesystem, FWIW...
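A minimal sketch of what that request might look like, assuming the config is JSON. The variable name `INITIAL_CLUSTER_CONFIG`, the function `load_cluster_config`, and the `example_option` key in the usage note are all hypothetical; nothing here reflects an existing stolon feature.

```python
import json
import os
from typing import Any, Dict, Mapping, Optional


def load_cluster_config(
    env: Mapping[str, str] = os.environ, path: Optional[str] = None
) -> Dict[str, Any]:
    """Load an initial cluster config, preferring an env var over a file.

    INITIAL_CLUSTER_CONFIG is a hypothetical variable name used only
    for illustration.
    """
    raw = env.get("INITIAL_CLUSTER_CONFIG")
    if raw:
        # The env var holds the JSON document itself, not a path to it.
        return json.loads(raw)
    if path:
        # Fall back to reading the config from the filesystem.
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    # No config supplied either way: use defaults.
    return {}
```

In a container setup like the one described, this would let the config be set next to the other environment variables, for example `INITIAL_CLUSTER_CONFIG='{"example_option": true}'`, without mounting a file.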
This appears to be working - thanks @sgotti!
It appears that when you start up a stolon cluster that requires two or more keepers, there is a distinct possibility that the cluster will not come up because of a deadlock when electing a leader/master.
I've been trying to start a stolon cluster with 1 proxy, 1 sentinel, and 2 or 3 keepers. The underlying env is docker related. All of the containers are started at one time. All configuration of the containers is done via environment variables - not stolonctl.
The following is a very typical snapshot of the env (logs):
KEEPER #1
KEEPER #2
KEEPER #3
SENTINEL
PROXY
As you can see Keeper#1 and Keeper#2 BOTH try to acquire the leader/master role for keepers at the very same HH:MM:SS. It appears that this causes some sort of deadlock.
Possible solutions?
I'd be willing to offer a PR for one of the above if we could come to an agreement on the preferred approach.