Persistent cluster membership #61
I want to separate the behavior of Serf from the fragility of the current model. Currently, sending a SIGINT to Serf is logically the same as doing a graceful leave. In terms of fragility, if the node does a graceful leave, it would be very strange in my view for it to automatically rejoin.
I think we're pretty much on the same page. :) I agree that a node doing a graceful leave and then rejoining the cluster would be strange; that should never happen. I like the current failure handling/rejoining behaviour. It could be sped up, but you're right, I wouldn't implement the local state storage just for that. Not worth it. If we already have local state, however, we may as well use it.

The real benefit of storing cluster state locally is avoiding the possibility that the entire cluster state will be lost. That could be caused by a big power outage, or perhaps more likely, something like a Puppet misconfiguration that stops every serf process. A bad package upgrade, a typo in a config file, a mistake with a remote mass execution tool like Salt, etc. Regardless of how it happens or how likely it is, I think the effect of losing the entire cluster state is quite serious, requiring the operator to manually go and rejoin each node.

If this happens just once, the irritated operator will be tempted to configure the list of all nodes in the config file, which I think is completely the wrong way to solve the problem. I think we should avoid encouraging the wrong behaviour like that, so if it is possible to make serf just magically handle this, it should.

Playing devil's advocate, are there cases in which persistent membership in a production environment would be undesirable?
I see what you are saying. I agree, it would be nice to have a safeguard against total cluster loss. What we could do is add a new `-snapshot` option.
Additionally, I think we can add a way to configure this. I think keeping SIGINT as graceful and SIGTERM as non-graceful as the defaults is sane, but it may not be expected.
Sounds good! If it were up to me (I don't know any golang yet...) I'd avoid exposing this to the user at all and make it automatic and unconfigurable. In the absence of any use cases where persistence is undesirable, that is. I can't think of one, but hopefully others will chime in with something I haven't thought of!

I'm having some trouble coming up with a reason why the user (other than a serf developer) would want to adjust signal handling behaviour. I also don't think it's useful/sane to do a graceful cluster leave on receiving any signal; I think it's a relatively rare/special case and should never be accidentally triggered. I think it ought to happen when executing `serf leave`.

In general I have a preference for opinionated software over configurable, and I feel like that could be applied here. If we don't really need those config options, we could make persistence automatic and mostly unconfigurable. (Of course, the user can just go and delete the persistence file, but I think that's pretty reasonable; I imagine it would be an extreme and rare action!)

Anyway - sounds like we're agreed on the general ideas here, and thanks for being open to it. Some more points of view from other users who want to run serf in production would be appreciated. :)
The signal handling for leave is important for integration into tools like systemd, upstart, etc. It also makes it really easy, if you are running Serf as a sub-process, to just send it a signal to kill vs leave. SIGINT can only really be sent accidentally if you are running Serf manually on a CLI and the user does a control-C, which is not a sane production setup.

In terms of configuring the snapshot/bootstrap, it's hard to predict user environments and needs. For example, internally we can rely on our service discovery mechanism to do the bootstrap, so we can avoid the node-local state. Also, having the user explicitly provide a path to the file avoids any issues with permissions or assumptions we might make as the serf developers. I agree strongly that the software should be opinionated, but having the configuration options is great when you really need to change something to work within your environment. In the default case, as with most serf settings, everything can just be left to the defaults.

Anyways, I will split this into a few sub-tickets so it can be tracked individually. Thanks for the feedback.
I've just caught up on everything. I agree with everything said here. I'll break it down by point, each of which can be its own ticket.
@mitchellh Shouldn't it be `--snapshot`?
@thedrow The way Go's standard command-line parsing library works, it is just one dash. I think this is some oddity from Plan 9. I've gotten used to it.
@mitchellh But most people would find it strange, no?
Probably the wrong issue for discussing this, but I agree that single-dash arg prefixing is weird for those used to more GNU-style arg parsing. If possible, I'd say the double dash for long arg format is preferable, but in the context of this issue it's not relevant. I don't think the implementation language ought to "leak" like this into the operational side. New issue for GNU-style arg parsing instead of Go's standard one?
@derpston +1
@derpston I have two projects with this single-dash non-GNU format and I haven't had any real complaints. If you use double dashes, then the help will be shown and you'll see. I really enjoy using the standard library's arg parsing and would rather not move away from that.
No complaint here, just a mild preference for the GNU style for consistency reasons. Not a problem.
Closing this ticket, as I've replaced it with a few sub-tickets.
Just stating my own preferences here; hope this doesn't reopen the bug. Currently I rely on the SIGINT graceful leave behavior. I have serf deployed on machines that boot from Intel Z-130 2GB industrial USB units containing Ubuntu-derived custom LiveISOs with TORAM=Yes in the boot parameters. They come and go in varying states, and when they shut down, they're gone, and so is all of their configuration.

If that physical machine comes back (via a separate IPMI management VLAN), it's welcomed as a completely new node to the cluster, as its role may have changed due to a load requirement that triggered the IPMI machine start-up sequence. If we need more databases, it will be a database for a while until the load subsides. It may even have a new IP address due to DHCP assignment. In practice, our DHCPD tends to assign the same addresses over and over by hashing the MAC address somehow. But at least two of our machines have a collision and fight over 10.0.10.116 when they're both asked to join the database subnet.

Recently, due to the Debian decision to standardize on systemd, we've started moving from an Ubuntu base over to a Debian+systemd infrastructure that actually lets us sanely shut down an entire role and all of its processes, and then perform a REST call to see if we're more useful immediately adopting a new role and rapidly pivoting to a completely new configuration without rebooting, or choosing to save power and just shut down.

Serf currently does its job and cares very little about who gets what role, when, or why. That said, I'm currently taking a look at the result of #86 to see if dumping that to the USB stick on shutdown is better than our current method of joining whatever a REST call's results on bootup say to join. Either way, just pointing out that serf can get used in a lot of very weird ways.
@kamilion Glad to hear it is working for your use case. We tried to make sure that the snapshots did not change the existing behavior if you choose not to use them. The snapshot feature is mostly great to guard against agents failing and allowing them to auto re-join (on a bug, power loss, etc.). But if you are running the OS in memory only, the snapshot is likely useless since there is no real "durable" state.
I was surprised by the behaviour of the agent when giving it a SIGINT - it gracefully leaves the cluster by notifying other nodes before it shuts down. When restarting the agent, it is no longer a member of the cluster. It doesn't attempt to rejoin, and the other nodes don't attempt to contact it to tell it to rejoin.
When killing the agent with a SIGTERM or SIGKILL, it makes no attempt to leave the cluster gracefully. Other agents eventually notice that it disappeared, and make regular attempts to contact it. When the agent comes back up, other agents will tell it to rejoin the cluster, and it does.
I found this behaviour surprising because I expected cluster persistence to be less fragile. If I issue a `serf join foo.example.com`, I expect that this action won't be undone unless I later issue a `serf leave` on that agent, or a `force-leave` on another agent.

So, I propose a `serf leave` command, to complement `force-leave`, that performs an orderly leaving of the local agent from the cluster. This would have some benefits:
This is basically the same ring/cluster persistence model used by Riak, I believe. I've used Riak in production for over a year and I've grown to love the resilience. Nodes go up, nodes go down, and the operator never has to do a thing to maintain cluster state.
In terms of implementation, we have a few options:
Of these options, I think (1) is poor because it requires the operator to stay on top of cluster membership and write it to every agent's config file. This seems like error-prone busywork to me.
Option (3) feels fragile to me - it seems like a hack. I feel like cluster membership persistence should be a first class feature, so I would be in favour of option (2) and having serf do this by default.
It was suggested that this could be implemented as a plugin for use with the eventual plugin system. This feels like just a slightly tidier version of option (3), so I'm not keen on it.
While I'm obviously not in a position to dictate project goals, I feel like serf (and every other piece of potentially production software) should strive for:
I think Riak's cluster membership model is perfect in this regard, and I think it should be a model for distributed system membership.
Opinions, anyone? :)
Thanks for reading!