knife cluster kick shouldn't rely on already running chef-client #109

Closed
dhruvbansal opened this Issue Feb 28, 2012 · 2 comments

Contributor

dhruvbansal commented Feb 28, 2012

Currently, running knife cluster kick will cause an existing chef-client process on a node to start executing. I propose that we have knife cluster kick instead launch a new chef-client process, just like running sudo chef-client manually on the node would do.

This would fix several problems associated with long-running chef-client processes:

  1. Cookbooks sometimes install gems during the compile phase that they need during the execution phase. If gem 'foo' is being installed at version '2.1.0' but the long-running chef-client has already loaded version '2.0.0' from a previous run, then we're in a bad state.

  2. Database and service handles created in one run may be reused in a later run instead of being refreshed. The burden of resetting stale handles falls on the cookbook's author, and many authors will not deal with it, instead using whatever handle is already available.

In both cases, insisting on utilizing a long-running chef-client process introduces complexity downstream for cookbook authors while a "fresh" run would immediately solve both.

What would we lose by making knife cluster kick launch a fresh chef-client as opposed to kicking an existing one?

pcn commented Mar 21, 2012

If you know that you're not going to have a running chef-client, then you don't lose anything. However, if you've got a running chef-client, you can have badness like two simultaneous chef runs that, e.g., both download a file at the same time. One finishes and tries to do something with the downloaded file, but the second one has started its own download, so the first one blows up, and the second one doesn't know to clean up anything left behind, etc. In addition, without the --once flag you're going to end up with multiple chef-clients hanging around and running constantly.

For my own case, our policy is to not keep a chef-client running. I've added a "--once" option to kick which runs "chef-client --once". I will get around to turning that into a pull request.

Contributor

temujin9 commented Apr 12, 2012

This appears to be fixed in the most recent version: knife cluster kick checks for a running chef-client to HUP, and runs chef-client directly if it doesn't find one. Closing this.
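For readers following along, the check-then-run behavior described above could look roughly like the sketch below. This is not the actual knife cluster kick code, just an illustration under assumptions: the function names (parse_pids, plan_kick, kick) are hypothetical, the use of pgrep to find the process is a guess, and the exact process name chef-client shows up under may differ per install. The --once flag is the one mentioned earlier in this thread.

```python
import os
import signal
import subprocess


def parse_pids(pgrep_output):
    """Turn whitespace-separated pgrep output into a list of integer PIDs."""
    return [int(token) for token in pgrep_output.split()]


def plan_kick(pids, once=True):
    """Decide what a kick should do without touching the system.

    Returns ("signal", pids) when a chef-client is already running,
    otherwise ("run", argv) with the command for a fresh run.
    """
    if pids:
        return ("signal", list(pids))
    argv = ["sudo", "chef-client"]
    if once:
        # --once keeps the fresh process from lingering after the run
        argv.append("--once")
    return ("run", argv)


def kick(once=True):
    """Hypothetical driver: HUP a running chef-client, else start one."""
    # pgrep exits non-zero when nothing matches, so do not use check=True.
    # Assumption: the process is findable by the pattern "chef-client".
    found = subprocess.run(
        ["pgrep", "-f", "chef-client"], capture_output=True, text=True
    )
    action, target = plan_kick(parse_pids(found.stdout), once=once)
    if action == "signal":
        for pid in target:
            os.kill(pid, signal.SIGHUP)
    else:
        subprocess.run(target, check=True)
    return action
```

Keeping the decision in plan_kick separate from the side effects makes it easy to see that only one of the two paths is ever taken, which is the point pcn raised about avoiding two simultaneous runs.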

temujin9 closed this Apr 12, 2012
