Multi Node / Cluster Support #873

Closed
bradenwright opened this Issue Nov 5, 2015 · 16 comments

6 participants
@bradenwright

So I know this has been brought up numerous times. At this point I want to come up with a better solution, and I feel I'm close. I've been working on a kitchen driver for LXD and really want to be able to test mongo replica sets/sharding; I'm tired of my workaround. I'm willing to write something, but basically I have 2 thoughts right now:

  1. Run chef-zero on my local machines (on ip that's accessible to the node) and then supply a chef_server_url (or something like that) so multiple nodes can use the same chef-zero instance.

It would be easy to start a chef-zero server in my driver, etc. But when chef-zero runs it doesn't honor the chef_server_url, although the setting is written to /tmp/kitchen/client.rb

From this link #549 I really thought this approach would work, but so far no luck.

  2. Ability to copy files from /tmp/kitchen/ after converge completes. Basically I would hope there would be a hook or some other way to identify when converge has completed, b/c I'd leave everything working as is and just fire off commands in my kitchen driver once converge has completed.

Basically the idea would be to have a "local_chef_zero_path" (e.g. copy all applicable dirs in /tmp/kitchen) or a "nodes_save_path" / "data_bag_save_path" (e.g. so I could just copy nodes or data_bags).

The idea came from #437 and the fact that my current solution is to do this manually or with a script. I run kitchen converge on node1, then copy /tmp/kitchen/nodes/node1.json from the newly created node to the directory that nodes_path in my .kitchen.yml points to. Then I repeat for node2.

I'm open to ideas, but those are my best 2. Unfortunately I've gotten stuck on both:

  1. Can't figure out how to use chef_server_url or run like chef-client
  2. Can't figure out if there is a way to determine in a kitchen driver when converge has completed and then run code

Thanks for any help/suggestions. Any other methods/ideas are welcome too; I would just like to take out the manual steps.

willejs Nov 11, 2015

This was discussed at the Chef Community Summit London, and Chef employees and community members committed to putting this into test-kitchen.

bradenwright Nov 13, 2015

@willejs thanks for the info. It's good to know that it's on the road map; for a while there seemed to be some back and forth on it.

At this point I basically have a workaround: a Rakefile that creates the node, then copies that node's .json from /tmp/kitchen/nodes on the newly created node to the local directory that nodes_path: in kitchen.yml points to. It also has destroy commands that delete the file from nodes_path after kitchen destroys the node. It works, but it's kind of a pain.

The more I think about the situation, the more I realize I really want option 1: the ability to run a chef-client provisioner (against a chef-server, which I believe would also allow for a chef-zero instance). I know vagrant has chef-solo, chef-zero & chef-client provisioners. But unless I'm wrong, the options in test-kitchen (for chef) right now are chef-solo or chef-zero. I haven't been able to find a chef-client provisioner option.

But I think I remember seeing write-ups on vagrant that point towards a chef-zero instance running wherever, using the chef-client provisioner for vagrant.

This would also allow testing against a chef-server before deploying, meaning we run a non-production chef-server, and after testing locally I could run my tests against my chef server. That would help me ensure that recipes will work when they are run on my production chef server.
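
The copy-and-clean-up workaround described above could be sketched roughly as follows. This is only an illustration, not the actual Rakefile: the helper names and the staging directory are hypothetical, and it assumes the node's .json has already been pulled from /tmp/kitchen/nodes on the instance to a local staging directory by whatever means the driver offers.

```ruby
require "fileutils"
require "json"

# Hypothetical helper: copy one node's JSON from a local staging directory
# (assumed to hold a copy of /tmp/kitchen/nodes from the instance) into the
# local directory that nodes_path points to.
def save_node_file(staging_dir, nodes_path, node_name)
  source = File.join(staging_dir, "#{node_name}.json")
  JSON.parse(File.read(source)) # fail early if the file is missing or not valid JSON
  FileUtils.mkdir_p(nodes_path)
  target = File.join(nodes_path, "#{node_name}.json")
  FileUtils.cp(source, target)
  target
end

# Hypothetical helper: remove the saved node file once the instance is destroyed.
def remove_node_file(nodes_path, node_name)
  FileUtils.rm_f(File.join(nodes_path, "#{node_name}.json"))
end
```

In a Rakefile, the first helper would be called after each kitchen converge and the second after each kitchen destroy, so that later nodes can find earlier ones through nodes_path.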

willejs Nov 13, 2015

@bradenwright I want to see multiple nodes defined in a suite, with different attributes perhaps. But ultimately, I want to be able to integration test clusters in a single suite in test kitchen. I need this mostly for distributed datastores like etcd and cassandra, and also middleware like rabbit etc.

Your option 1, which I understand as a single chef zero server for multiple nodes in a single suite? To me this doesn't make much sense. Each node can load chef zero and hold the state from your nodes, roles, environments and data bags paths defined in flat files, as they should be... Writing back to a centralised server and sharing those attributes takes more than one chef run, which is a controversial metadata and service discovery method for datastores using chef. I think I might be missing something you're saying though?

bradenwright Nov 13, 2015

@willejs just to be clear, since my first comment may not say this exactly (and may be confusing): it became clearer after thinking about it a little bit more.

For option 1, what I want is a chef_client provisioner. This would allow 2 scenarios for multi-vm testing (unless I'm mistaken):

A) A person could point nodes towards a real chef server (like this: https://docs.vagrantup.com/v2/provisioning/chef_client.html)
B) A person could run chef-zero (somewhere accessible by the other nodes)

I'm probably missing a limitation of chef-zero that you are bringing up, maybe something about the fact that the files don't get written until the end of the chef run, but otherwise I don't see why this would take more than 1 run.

Some examples, since they are sometimes clearer:
A) Pointing at a real chef server. I could use a recipe to spin up a full chef server instance locally (or in the same place as the nodes for cloud usage, etc.) and then run my tests against the chef server. This setup shouldn't have any limitations; anything you can do on chef-server you should be able to test.

B) Running chef-zero on my laptop (or on a node, or wherever is network accessible by the nodes), then using the chef_client provisioner to point multiple nodes to a single chef-zero instance. At least I think it would work (it works with knife, and I think I did this with the vagrant chef_client provisioner when I messed with it, but that was a while back).

I just think having a chef_client provisioner would offer the most flexibility. Then a driver (or something in test kitchen) could be written to spin up and destroy the chef-zero instance, whether locally (which wouldn't hold up for cloud deploys, etc.) or on the first node that's created, etc.
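
To make scenario B concrete, here is a minimal sketch of what pointing several nodes at one shared endpoint might look like. It only renders client.rb text; it is not a test-kitchen provisioner. The helper name, the node names and the address are all assumptions for illustration (8889 is used because it is chef-zero's default port), and a real setup would also need chef-zero bound to an interface the nodes can reach.

```ruby
# Hypothetical helper: build the text of a client.rb that points a node
# at a shared Chef server or chef-zero endpoint.
def render_client_rb(node_name:, server_url:)
  <<~CONFIG
    node_name "#{node_name}"
    chef_server_url "#{server_url}"
  CONFIG
end

# Assumed address: a chef-zero instance listening where every node can reach it.
shared_url = "http://192.168.33.1:8889"

# Every node gets the same chef_server_url, so they share one set of
# nodes, roles, environments and data bags.
configs = %w[node1 node2].map do |name|
  [name, render_client_rb(node_name: name, server_url: shared_url)]
end.to_h

configs.each { |name, text| puts "# #{name}\n#{text}" }
```

The point of the sketch is only that the per-node difference is node_name; everything that makes search work across nodes lives behind the single shared URL.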

@cheeseplus cheeseplus changed the title from Multi Node Testing... close but need help. to Multi Node / Cluster Support Jan 13, 2016

cheeseplus Jan 13, 2016

Contributor

Renaming this so we can focus everyone on ONE thread for this stuff.

ianmiell Jan 15, 2017

This is interesting - I'd always assumed kitchen supported multi-node Vagrant until I ended up here.

I wrote a framework for myself to test multi-node vagrant builds, and am using it here, e.g. to test moving an etcd framework within an OpenShift cluster:

cf:

https://medium.com/@zwischenzugs/migrating-an-openshift-etcd-cluster-a7e43e861d61#.jun9e5qgh

and also eg:

https://medium.com/@zwischenzugs/a-complete-openshift-cluster-on-vagrant-step-by-step-7465e9816d98#.ml6rv32sn

The framework is:

https://ianmiell.github.io/shutit/

Frankly I'd prefer to use kitchen as it's better supported, but until then I guess I'm stuck with my own until someone improves kitchen. Is that likely to happen?

cheeseplus Jan 17, 2017

Contributor

In order to support multi-node we have to do it for all provisioners and all drivers (so more than Vagrant) which means building in primitives that test-kitchen lacks. This has long been a goal but test-kitchen has, like many OSS projects, varying levels of maintainers and corporate sponsorship which has left us prioritizing the issues/bugs rather than new features, especially ones that will require significant re-architecture and major version bumping.

There is an RFC for this, but I don't believe anyone, myself included, has started any work toward it outside of planning. To be clear, we all really want the feature, but it's non-trivial and requires some deep thought about how it interacts with several sub-systems, plugins, etc.

https://github.com/chef/chef-rfc/blob/master/rfc084-test-kitchen-multi.md

ianmiell Feb 6, 2017

Anyone interested in an interim solution please contact me: @ianmiell on twitter.

I have written a framework that regression tests multi-node VM clusters using Vagrant and ShutIt under the hood. I am using it (for example) to test a chef provisioning cookbook for OpenShift against various combinations of cookbook versions, base OSes, cluster configurations etc.

agh Feb 26, 2017

@ianmiell If you could write this up on a blog, or here, it sounds like that would be useful. 💖

ianmiell Feb 27, 2017

Thanks - I will, and will re-post back here.

ianmiell Feb 27, 2017

In the meantime, this is something I wrote a while ago about it in a slightly different context, which got picked up by hackernoon.com:

https://hackernoon.com/1-minute-multi-node-vm-setup-413dfc836fc9

ianmiell Mar 18, 2017

I've written up this method of creating multi-node vagrant clusters for testing purposes here:

https://zwischenzugs.wordpress.com/2017/03/18/clustered-vm-testing-how-to/

and would be grateful for any feedback.

richerve Jul 27, 2017

There's a provisioner plugin that manages the case of running searches on machines created in different suites: https://github.com/mwrock/kitchen-nodes. As far as I know it uses the nodes_path feature, populating the json files "on-demand" with information taken from the running VMs. This is not as nice as having supersuites as described in the RFC, but it manages the main case for searches.

It's only compatible with Chef because it is actually an extension of the chef-zero provisioner.

bradenwright Jul 28, 2017

@richerve not that the plugin couldn't be expanded or another one created, but it didn't work for my use case b/c it only sets a select few attributes like ipaddress and fqdn, so if you had set tags or attributes in recipes, those are not searchable or usable. That made it work only for limited use cases.
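
A small sketch of that limitation, under stated assumptions: the node documents below are made-up hashes (their field layout, the tag and the mongodb attribute are illustrative, not the plugin's actual output), and tagged? is a naive stand-in for a tag search, not Chef's search implementation.

```ruby
# Assumed shape of a minimal node file: only connection details are recorded.
minimal_node = {
  "id" => "node1",
  "automatic" => { "ipaddress" => "10.0.0.11", "fqdn" => "node1.example.test" }
}

# What the converged node might actually hold (hypothetical tag and attribute
# set from recipes during the chef run).
converged_node = minimal_node.merge(
  "normal" => { "tags" => ["mongo-primary"], "mongodb" => { "replset" => "rs0" } }
)

# Naive stand-in for a tag search over a node document.
def tagged?(node, tag)
  node.fetch("normal", {}).fetch("tags", []).include?(tag)
end

puts tagged?(minimal_node, "mongo-primary")   # false
puts tagged?(converged_node, "mongo-primary") # true
```

A node file that carries only ipaddress and fqdn can answer "where is node1?" but not "which node is the primary?", which is the kind of lookup replica set and sharding tests depend on.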

cheeseplus Feb 2, 2018

Contributor

This is formally written up as an RFC so closing the issue here such that there is one canonical document for this feature going forward:

https://github.com/chef/chef-rfc/blob/master/rfc084-test-kitchen-multi.md

@cheeseplus cheeseplus closed this Feb 2, 2018
