
Proxy requests to all Host methods via a Cluster #5

Merged 1 commit into mipearson:master on Jun 2, 2013

Conversation

@richo (Contributor) commented May 27, 2013:

This exposes all the methods on Host via the ClusterCommandRunner.

It's on the edge of being too metaprogramming-heavy, and it loses the arity checks, but on the upside it didn't involve much copy-pasting.
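
For context, the kind of delegation being described might look roughly like this. This is a sketch rather than the actual diff (the constructor and @hosts instance variable are assumptions), but it shows why generated splat methods give up Ruby's arity checking:

require 'gofer'

class ClusterCommandRunner
  def initialize(hosts)
    @hosts = hosts
  end

  # Generate a forwarding method for every public method defined on Gofer::Host.
  # Each one takes *args, so argument arity is no longer checked at the call site.
  Gofer::Host.public_instance_methods(false).each do |meth|
    define_method(meth) do |*args, &block|
      @hosts.map { |host| host.send(meth, *args, &block) }
    end
  end
end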

@mipearson (Owner):

Hmm. I might toy with this a bit. I'm thinking that what should happen is that instead of being given a connection proxy, you actually get given an instance of Gofer::Host.

So then you can do things like:

cluster = Gofer::Cluster.new
# ...
cluster.run do |h|
  # each instance of this block runs in its own thread
  h.upload "pie ingredients"
  h.run "make pie"
  h.download "pie.txt", "pie.#{h.hostname}.txt"
end

I think the way you've implemented it (i.e. each call to run within the block starts threads, waits, then returns) works better like this:

cluster = Gofer::Cluster.new
# ...
cluster.upload "pie ingredients"
cluster.run "make pie"
puts "PIE MADE"
cluster.download "pie.txt", "pie.txt" # NoMethodError
cluster.each { |h| h.download "pie.txt", "pie.#{h.hostname}.txt" }

What do you think? What's your use case? Can you share any example code?

@mipearson (Owner):

ooh, and then!

cluster.once { |h| h.run "rake db:migrate" }

Of course, more idiomatic is

h = cluster.shuffle.first
h.run "..."

@richo (Contributor, Author) commented May 27, 2013:

That makes sense, although we pool similar commands into small, almost transactional segments for brevity and to work around the lack of connection caching (which is on the horizon for the next few days):

def checkout
  APPS.run do |c|
    c.run [
      "(cd #{o(:deploy_to)} && git checkout -fq #{DEPLOY_ASSETS.slugs[DEPLOY_ASSETS.head]})",
      "(cd #{o(:deploy_to)} && git submodule update)",
      "(cd #{o(:deploy_to)} && . ~/.bash_profile && bundle install)"
    ].join(" && ")
  end
end

The idea is to get as much done in one connection as possible, while still bailing out usefully if anything goes sour. (Ignore the kludges; this is mid-refactor.)

If we were using more of the Gofer-specific stuff, I agree your API makes more sense. Especially if you can:

cluster = Gofer::Cluster.new
# ...
cluster.run do |h|
  # each instance of this block runs in its own thread
  h.upload "pie ingredients"
  h.run "make pie"
  h.sync # Make sure that all servers have made pie before downloading anything
  h.download "pie.txt", "pie.#{h.hostname}.txt"
end

To synchronise in the middle of your run. While that would be neat, it would still be amazingly awkward to support cap "task"-like functionality.

@richo (Contributor, Author) commented May 27, 2013:

Also, with your cluster idea, where do you specify concurrency? In an opts hash to any of those methods?

@mipearson (Owner):

Re your checkout method: Gofer doesn't "cache" the connection; it keeps the connection open until the Gofer::Host gets garbage collected. Each command (e.g. run) starts a new SSH channel within that already-open (and authorized) connection.

If you're seeing a delay, it might be the latency from sending each command one by one, not from a connection establishment & authorization sequence. Give it a try:

require 'benchmark'

def host
  Gofer::Host.new(...)
end

# First line builds a new Gofer::Host (and SSH connection) per command;
# second line reuses one host object, and so one open connection.
puts "new object each request:  " + Benchmark.measure { 50.times { host.run 'echo' } }.to_s
puts "same object each request: " + Benchmark.measure { h = host; 50.times { h.run 'echo' } }.to_s

If the second isn't faster than the first, that's a bug.

You're right, however, in that it's not a shell, and each command needs to be prefixed with the cd. Ours is messier: it brings in a ~/environment.sh file that includes RVM, RAILS_ENV, db connection URLs, etc.
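
Concretely, the prefixing looks something like this (a hypothetical one-liner, not our actual deploy code; the paths are made up):

host.run "(. ~/environment.sh && cd /srv/app && rake deploy)"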

@mipearson (Owner):

I'm not sure about sync: it looks like extra code where simply closing the block and making a new one would do.
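
For example, rather than a sync call, two consecutive blocks (reusing the pie example from above):

cluster.each do |h|
  h.upload "pie ingredients"
  h.run "make pie"
end
# the first block only returns once every host has finished, so it already acts as the barrier
cluster.each do |h|
  h.download "pie.txt", "pie.#{h.hostname}.txt"
end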

Re concurrency: it'd be specified like so:

cluster.each(max_concurrency: 5) do |c|
  # c.run ...
end

If there's a demand, I'm also thinking about something like:

responses = cluster.run("ls", max_concurrency: 5) 

puts responses.all.stdout
puts responses['hostname'].stdout

@mipearson (Owner):

Okay, more thoughts (apologies for spam).

cluster.each (previously cluster.run) is better accomplished with a gem like parallel (https://github.com/grosser/parallel). Implementing it ourselves doesn't offer anything that isn't already done better elsewhere.
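
For instance, per-host block execution is already covered there. A rough sketch, assuming cluster.hosts returns the underlying Gofer::Host objects as in the example below:

require 'parallel'

# Run the per-host block across the cluster with a pool of 5 threads,
# instead of Gofer growing its own threading layer.
Parallel.each(cluster.hosts, in_threads: 5) do |h|
  h.run "make pie"
  h.download "pie.txt", "pie.#{h.hostname}.txt"
end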

So I'm thinking:

# immediate_abort_on_exception: the default behaviour on exception is to wait for the other 'runs'
# to finish before raising the original exception. A new Gofer::HostErrors class would encapsulate
# the multiple error responses.
cluster = Gofer::Cluster.new(:max_concurrency => 5, :immediate_abort_on_exception => true)
host = cluster.hosts.shuffle.first
host.run "rake db:migrate"
cluster.run "rake deploy"
cluster.write "/tmp/restart", ""

Would this still be useful to you?

mipearson added a commit that referenced this pull request on Jun 2, 2013: "Proxy requests to all Host methods via a Cluster"
@mipearson merged commit 31f1fc2 into mipearson:master on Jun 2, 2013.