Ceilometer workload partitioning with tooz & redis #447
@@ -0,0 +1,14 @@
class quickstack::pacemaker::redis(
  $bind_host = '127.0.0.1',
How will using 127.0.0.1 work in an HA setup? If we are not load balancing redis (similar to memcached), it seems like we would want to bind on either 0.0.0.0 or the map_params("local_bind_addr") value; otherwise each node would only get the cache from its local instance rather than from wherever the right copy lives.
Aside from the couple of comments I had, this looks pretty reasonable, thanks! If we want to try to make this into the next release, we need an RFE in bugzilla set against openstack-foreman-installer and version 6; then we have something to set QE on for testing once this gets in.
Thanks for the detailed feedback Jay, responding in one consolidated comment.
That's just the default bind_host, which is always overridden in practice (IIUC). I just followed what seemed to be the existing practice whereby the default is generally set to 127.0.0.1 in the quickstack puppet classes, but I'd be happy to change to 0.0.0.0 if that makes more sense.
Redis is not quite as limited as memcached (where IIUC, there is no replication/synchronization at all between the memcached servers). But yes, it would not make sense to LB over multiple Redis instances, as only one is deemed the master (the others are effectively passive slaves).
The problem here is that tooz (the library ceilometer uses for coordination, which in turn talks to redis) does not, IIUC, allow an address list to be specified via the URL.
We need to pick one redis for now, so this is it. Once openstack-puppet-modules is updated to accept the latest puppet-redis patches for redis-sentinel (pull requests [1], [2]) and we have a new version of python-tooz rebuilt, this will change as follows:
The sentinel will then mediate slave->master promotion and failover when the old master goes away. However, what I'm missing is how, in quickstack terms, to approach ensuring a different action is taken on one controller as opposed to another. i.e. can you point me to an existing pattern where different parameters are passed to a particular puppet class on a subset of the controllers? (in this case $slaveof being set for the ::redis class for all but one of the controllers).
See https://bugzilla.redhat.com/1180158
Cheers,
[1] redhat-openstack/openstack-puppet-modules#209
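For context, pointing ceilometer's coordination layer at a single redis could look something like the following. This is a hypothetical sketch, not taken from the patch itself: the address is a placeholder, and it assumes the ceilometer_config resource type from the puppet-ceilometer module is available.

```puppet
# Hypothetical sketch: point ceilometer's tooz coordination backend
# at a single redis instance (192.0.2.10 is a placeholder address).
ceilometer_config { 'coordination/backend_url':
  value => 'redis://192.0.2.10:6379',
}
```

With only one redis identifiable in the URL, failover has to happen behind that single address, which is why the sentinel discussion below matters.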
On 08/01/15 06:21 -0800, Eoghan Glynn wrote:
Oh, and one general request on the commit message. Would you kindly update it on the next revision to be of the form:
It makes it easier for me to generate the rpm changelog. Thanks!
Thanks again for the further feedback, comments inline ...
A-ha, cool, got it now, I'll change that.
We have to explicitly tell the ::redis puppet class whether the service run on the current host is going to be the initial master (which is the default) or a slave (by setting the $slave_of parameter to the ::redis class). If that initial master fails, the redis-sentinel decides which slave is promoted in its place; if the old master is then restarted, the sentinels tell it to revert to being a slave, regardless of its master-like redis.conf. So it's somewhat less dynamic than the mongodb replicaset scenario, in the sense that the initial master is set in config, but things become more dynamic thereafter. It may seem odd that mastership and slavedom are initially statically defined in config yet dynamically managed thereafter by the sentinels; IIUC this is because master->slave replication was developed first, with the sentinel mechanism added later, layered over the pre-existing replication feature.
We're still trying to get the sentinel-related puppet-redis changes into o-p-m with a view to getting them into OSP 6 GA. Testing on the o-p-m pull requests showed up a problem that cdent fixed earlier today, so we're awaiting feedback on the updated PRs. (Note that the current state of this pull request does not require the sentinel changes, as I avoided depending on those puppet-redis changes prior to their acceptance into o-p-m.)
Cool, those would be great to hear.
Sure thing, will do! Cheers,
@jguiditta: just thinking about a possible approach to driving slightly different behaviours on each of multiple controller nodes (in the context of setting the $slaveof param on all but one of the instantiations of the ::redis class, as discussed in my previous comment). How about comparing the current $ipaddress_eth0 fact against the first element of the map_params("lb_backend_server_addrs") array? If lb_backend_server_addrs is ordered the same way on each controller node, I'm thinking this comparison should only be equal on a single controller node ... am I on the right track there?
On 09/01/15 09:04 -0800, Eoghan Glynn wrote:
The other approach I was thinking could work would be to use the [...] It seems like: Node 1 dies, 2/3 decide that 2 is the new master. OK, [...]
And that could return the ip or nil/blank string? I think something [...] Hope that is at least somewhat helpful. -j
Hi Jason, Thanks again for taking the time to consider this carefully, this discussion is very helpful :) TBH I think I've missed the point about the relevance of the galera stick-table config, as I thought that simply controls the size etc. of the data structure used by HAProxy to implement stickiness. However I like the other approach you've suggested around checking for an address match, having expressed similar (though less developed) thoughts in my previous comment. In fact it seems, IIUC, that we already use a similar approach in the quickstack::pacemaker::rabbitmq class in order to ensure the first node in the rabbitmq cluster starts up first[1]. So I'm wondering if that might be sufficient in this case also, i.e. whether we'd even need to worry about matching the cluster_control_ip to a particular lb_backend_server_addr? I.e. would simply designating one of those lb_backend_server_addrs as special (say the first or the last) suffice to drive the different behaviour that we need on a single controller node? ... along the lines of the following pattern:
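The actual snippet isn't preserved above, but a hypothetical Puppet sketch of such a pattern might look like the following. Variable and parameter names ($lb_backend_server_addrs, slaveof) are assumed from this discussion, not verified against the real quickstack or puppet-redis classes:

```puppet
# Hypothetical: designate the first load-balancer backend address as
# the initial redis master; every other controller starts as a slave.
$redis_master_ip = $lb_backend_server_addrs[0]

if $::ipaddress_eth0 == $redis_master_ip {
  # Defaults make this node the initial master.
  class { '::redis': }
} else {
  # All other controllers start replicating from the initial master;
  # the sentinels manage promotion/failover from then on.
  class { '::redis':
    slaveof => "${redis_master_ip} 6379",
  }
}
```

This only yields a single master if lb_backend_server_addrs is ordered identically on every controller node, which is the assumption being probed in the comments above.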
BTW you're correct in thinking that the mastership assignment in redis config is effectively disregarded when a previously-failed master is restarted; in this case the sentinels handle telling the old master that it's no longer top-dog. You're also correct in observing that the complete death of the node hosting the "contact" sentinel would be problematic. Unfortunately tooz does not currently allow us to identify multiple sentinels in the backend URL, so we'll need to either quickly add that to tooz or put the sentinels all behind a VIP (assuming that's feasible?) Cheers,
@jguiditta: I inadvertently caused this pull-request to be closed, and couldn't re-open it. So I've created a fresh pull-request[1] with the latest version of the patch, illustrating the discussion in the previous comments about setting the $slaveof status of the redis on all but one of the controllers. Can we continue the discussion on the new pull-request? [1] #449
On 12/01/15 05:28 -0800, Eoghan Glynn wrote:
The quickstack::pacemaker::ceilometer puppet class now allows
redis to be specified as the backend for tooz, to be used for
workload partitioning in the ceilometer central agent and
alarm evaluator.
We do not create a pacemaker resource for redis, as this will
instead be self-monitored via the redis-sentinel service (via
a subsequent patch, once the support for sentinel in tooz is
packaged in python-tooz).