Question - Does cookbook support rolling restarts? #315

Closed
spuder opened this issue Apr 28, 2015 · 4 comments

@spuder
Contributor

commented Apr 28, 2015

When a config file changes, the cookbook's template resource notifies the elasticsearch service to restart:

template "elasticsearch.yml" do
  path   "#{node.elasticsearch[:path][:conf]}/elasticsearch.yml"
  source node.elasticsearch[:templates][:elasticsearch_yml]
  owner  node.elasticsearch[:user] and group node.elasticsearch[:user] and mode 0755

  notifies :restart, 'service[elasticsearch]' unless node.elasticsearch[:skip_restart]
end

Unless I'm mistaken, couldn't that take the entire cluster down? Elasticsearch restarts can take a while because of shard recovery and rebalancing. If other nodes in the cluster start their converge before shards are fully replicated, several nodes could be down at the same time and some shards could become unavailable.

I'm new to Chef, so maybe there is something that I'm not seeing.

http://www.elastic.co/guide/en/elasticsearch/guide/master/_rolling_restarts.html

For reference, here is how the kafka cookbook handles rolling restarts.

https://github.com/mthssdrbrg/kafka-cookbook

@martinb3

Contributor

commented Apr 29, 2015

From what I'm reading, the kafka cookbook simply provides hooks for downstream users to provide an implementation for a condition to inhibit a restart (such as requiring a minimum number of nodes to be available). It doesn't actually have the rolling restart functionality built in.
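
As an illustration of what such a downstream condition might look like for Elasticsearch, here is a minimal sketch in plain Ruby. It is not taken from either cookbook: `safe_to_restart?` and `min_nodes` are hypothetical names, and the hash is assumed to be the parsed JSON body of `GET /_cluster/health`.

```ruby
# Hypothetical helper, not part of the elasticsearch or kafka cookbooks.
# `health` is assumed to be the parsed response of GET /_cluster/health.
# A restart is allowed only when the cluster is green, enough nodes are
# present, and no shards are currently moving or initializing.
def safe_to_restart?(health, min_nodes:)
  health['status'] == 'green' &&
    health['number_of_nodes'].to_i >= min_nodes &&
    health['relocating_shards'].to_i.zero? &&
    health['initializing_shards'].to_i.zero?
end
```

A wrapper cookbook could evaluate a check like this in a guard before letting the restart notification fire; how that is wired in depends on the cookbook version.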

@spuder

Contributor Author

commented Apr 29, 2015

Thanks for responding.

So in practice, what will happen if I change a parameter that modifies elasticsearch.yml, and then all nodes receive the notification to restart the service at once?

I'm upgrading this weekend. I'm thinking that I will disable shard allocation:

PUT /_cluster/settings
{
    "transient" : {
        "cluster.routing.allocation.enable" : "none"
    }
}

Then I would stop chef-client on all nodes and converge one node at a time, letting each node restart and its shards recover before moving on to the next. Is that what you do?
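
For reference, the per-node sequence described above could be scripted roughly as follows. This is only a sketch under stated assumptions: `rolling_restart_node` and `set_allocation` are hypothetical names, `client` is any object that responds to `put(path, body)` and `get(path)` against the cluster's HTTP API, and `restart` is a callable that restarts Elasticsearch on one node and returns once the node has rejoined.

```ruby
require 'json'

# Hypothetical sketch, not part of the cookbook.
# client  - assumed to respond to put(path, body_string) and
#           get(path), which returns the response body as a string
# restart - callable that restarts Elasticsearch on a single node and
#           blocks until that node has rejoined the cluster
# Returns true once the cluster reports green, false if it never does
# within max_polls checks.
def rolling_restart_node(client, restart:, poll_interval: 5, max_polls: 120)
  set_allocation(client, 'none') # stop shards from being reassigned
  restart.call
  set_allocation(client, 'all')  # let shards recover onto the node
  max_polls.times do
    status = JSON.parse(client.get('/_cluster/health'))['status']
    return true if status == 'green'
    sleep poll_interval
  end
  false
end

# Sends the same transient setting as the PUT /_cluster/settings call above.
def set_allocation(client, value)
  body = { 'transient' => { 'cluster.routing.allocation.enable' => value } }
  client.put('/_cluster/settings', JSON.generate(body))
end
```

Running this for one node at a time, and moving on only when it returns true, keeps at most one node out of the cluster at any point.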

@martinb3

Contributor

commented Apr 29, 2015

There's an option to skip restarts. I set that, and control it myself.

@spuder

Contributor Author

commented Apr 29, 2015

Thanks.

I see two parameters. For posterity, it looks like the first one is the one you are referring to. The second one prevents ES from restarting when a new node joins the cluster.

default.elasticsearch[:skip_restart] = false

node.set['elasticsearch']['skip_restart'] = true

I'm going to set that first value.

@spuder spuder closed this Apr 29, 2015
