Clusterfuck¶ ↑

A Subversive Distributed-Systems Tool¶ ↑

Clusterfuck is a tool for automating the process of SSH-ing into remote machines and kickstarting a large number of jobs. It’s probably best explained by an example, so here’s what I use it for:

As part of my research I need to compute the distance between each pair of objects in a set of about 70,000 items. Computing the distance between each pair takes a few seconds; running the entire job on a single machine generally takes over a day. However, as a member of the University I have a ssh login that works on quite a few machines, so I found myself breaking the job up into smaller, quicker chunks and running each chunk on a different machine. Clusterfuck was born out of my frustration with that method – “surely,” I said to myself, “this can be automated.”

If you have a lot of jobs to run and access to multiple machines on which to run them, Clusterfuck is for you!

Usage¶ ↑

To use Clusterfuck you’ll first need to create a configuration file (a “clusterfile”). An example clusterfile might look something like this:

Clusterfuck::Task.new do |task|
  task.hosts = %w{clark asimov}
  task.jobs = (0..3).map { |x| Clusterfuck::Job.new("host{x}","sleep 0.5 && hostname") }
  task.temp = "fragments"
  task.username = "SSHUSERNAME"
  task.password = "SSHPASSWORD"
  task.debug = true
end

This creates a new clusterfuck task and distributes the jobs across two hosts, clark and asimov. The jobs to be run in this case are pretty trivial; we basically ssh into each machine, sleep for a little bit, then get the hostname. Whatever each job prints to stdout is saved in task.temp (under the current working directory); running this clusterfile will create 4 files in ./fragments/: host0., host1., host2., and host3. (where [hostname] is the name of the machine on which the job was run). task.username and task.password are the SSH credentials used to log into the maching – currently, Clusterfuck can only use one global set of credentials. There’s no technical reason for this, other than the fact that I don’t really need to use machine-specific logins, so it’ll probably appear in future releases. task.verbose turns on verbose output (messages to stdout each time a job is started, skipped, or canceled).

Once you have a clusterfile you can kick off your jobs by running the command clusterfuck in the same directory.

Note on Patches/Pull Requests¶ ↑

Fork the project.
Add something cool or fix a nefarious bug. Documentation wins extra love.
Add tests for it. I’d really like this, but since I haven’t written any tests myself yet I can’t really blame you if you skip it…
Commit, but do not mess with rakefile, version, or history. (if you want to have your own version that’s ok – but bump the version in a separate commit that I can ignore when I pull)
Send me a pull request.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
bin		bin
example		example
lib		lib
test		test
.document		.document
.gitignore		.gitignore
LICENSE		LICENSE
README.rdoc		README.rdoc
Rakefile		Rakefile
VERSION		VERSION

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Clusterfuck¶ ↑

A Subversive Distributed-Systems Tool¶ ↑

Usage¶ ↑

Note on Patches/Pull Requests¶ ↑

Copyright¶ ↑

About

Releases

Packages

Languages

License

doches/clusterfuck

Folders and files

Latest commit

History

Repository files navigation

Clusterfuck¶ ↑

A Subversive Distributed-Systems Tool¶ ↑

Usage¶ ↑

Note on Patches/Pull Requests¶ ↑

Copyright¶ ↑

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages