elasticwulf-service -> ec2cluster

commit 3a4738946d9df078822874aae3651c3e39a1282f 1 parent 16c9657
@datawrangling authored
18 README.textile
@@ -1,13 +1,13 @@
-h1. ElasticWulf-Service
+h1. ec2cluster
-Elasticwulf-Service is a Rails web console, including a REST API, that launches temporary Beowulf clusters on Amazon EC2 for parallel processing. You upload input data and code to Amazon S3, then submit a job request including how many nodes you want in your cluster. Elasticwulf will spin up & configure a private beowulf cluster, process the data in parallel across the nodes, upload the output results to an Amazon S3 bucket, and terminate the cluster when the job completes (termination is optional). ElasticWulf-Service is like Amazon Elastic MapReduce, except it is uses MPI and REST instead of Hadoop and SOAP. The source code is also free for use in both personal and commercial projects, released under the BSD license.
+ec2cluster is a Rails web console, including a REST API, that launches temporary Beowulf clusters on Amazon EC2 for parallel processing. You upload input data and code to Amazon S3, then submit a job request including how many nodes you want in your cluster. ec2cluster will spin up & configure a private Beowulf cluster, process the data in parallel across the nodes, upload the output results to an Amazon S3 bucket, and terminate the cluster when the job completes (termination is optional). ec2cluster is like Amazon Elastic MapReduce, except it uses MPI and REST instead of Hadoop and SOAP. The source code is also free for use in both personal and commercial projects, released under the BSD license.
h3. Features
* feature 1
* feature 2
-h2. Running MPI jobs on EC2 with Elasticwulf
+h2. Running MPI jobs on EC2 with ec2cluster
overview goes here, workflow, s3 inputs, commands.
@@ -36,7 +36,7 @@ h4. Sample Ruby REST API client example
TODO: flesh this out more...
-The full code is at "http://github.com/datawrangling/elasticwulf-client-demos":http://github.com/datawrangling/elasticwulf-client-demos/tree/master
+The full code is at "http://github.com/datawrangling/ec2cluster-client-demos":http://github.com/datawrangling/ec2cluster-client-demos/tree/master
Fill in your AWS info and server details in config.yml:
@@ -52,7 +52,7 @@ keypair: your-keypair
</code></pre>
-Use ActiveResource to communicate with the Elasticwulf REST API with Ruby
+Use ActiveResource to communicate with the ec2cluster REST API with Ruby
<pre><code>
class Job < ActiveResource::Base
@@ -156,7 +156,7 @@ todo
h2. Why use MPI? Why not Hadoop?
-If you can solve your problem with Hadoop, go for it. If you are short on time and MPI code exists that solves your problem, then you might want to try Elasticwulf. MPI has been around for a while and there are lots of existing libraries for a number of domains. That said, debugging MPI jobs and dealing with node failure can be a hassle. Reuse or reimplement, your choice.
+If you can solve your problem with Hadoop, go for it. If you are short on time and MPI code exists that solves your problem, then you might want to try ec2cluster. MPI has been around for a while and there are lots of existing libraries for a number of domains. That said, debugging MPI jobs and dealing with node failure can be a hassle. Reuse or reimplement, your choice.
h2. Dependencies
@@ -179,7 +179,7 @@ Launch an instance of the latest ec2onrails ami and note the returned instance a
$ ec2-run-instances ami-5394733a -k gsg-keypair
$ ec2-describe-instances
</pre>
-Create the needed configuration files from the provided examples and edit them, filling in your instance address information, keypairs, and other configuration information as indicated in the comments of each file. See the ec2onrails documentation or source code for more details on each setting. If you want to make changes to the elasticwulf code, be sure to replace the base github repository in deploy.rb and config.yml with your own github location.
+Create the needed configuration files from the provided examples and edit them, filling in your instance address information, keypairs, and other configuration information as indicated in the comments of each file. See the ec2onrails documentation or source code for more details on each setting. If you want to make changes to the ec2cluster code, be sure to replace the base GitHub repository in deploy.rb and config.yml with your own GitHub location.
<pre>
$ cp config/deploy.rb.example config/deploy.rb
$ cp config/s3.yml.example config/s3.yml
@@ -205,7 +205,7 @@ Deploy the app to your launched EC2 instance with Capistrano (this wil take seve
</pre>
Use the admin login information you set in config.yml to access the dashboard from a web browser or as web service at the url of the instance you provided in deploy.rb: https://ec2-12-xx-xx-xx.z-1.compute-1.amazonaws.com . You can also ssh into your running EC2 instance as usual with your keypairs to debug any issues. See the ec2onrails forums for more help debugging deployment issues.
-To redeploy the app after making changes to the base elasticwulf code (this will also restart the delayed_job services which launch and terminate EC2 clusters):
+To redeploy the app after making changes to the base ec2cluster code (this will also restart the delayed_job services which launch and terminate EC2 clusters):
<pre>
$ cap deploy
</pre>
@@ -251,7 +251,7 @@ Launch the rails app itself
Launch a background delayed_job worker in a separate terminal window
<pre>
$ rake jobs:work
- (in /Users/pskomoroch/rails_projects/elasticwulf-service)
+ (in /Users/pskomoroch/rails_projects/ec2cluster)
*** Starting job worker host:72-63-103-214.pools.spcsdns.net pid:12221
background cluster launch initiated...
1 jobs processed at 0.0498 j/s, 0 failed ...
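
The README's ActiveResource snippet above stops at the class declaration. As a minimal job-submission sketch, assuming the API exposes a standard /jobs resource, with placeholder credentials and mostly assumed attribute names (only input_files and commands are confirmed by the form in app/views/jobs/new.html.erb further down this commit):

<pre><code>
# Sketch only: site, credentials, and most attribute names are assumptions.
require 'rubygems'
require 'activeresource'

class Job < ActiveResource::Base
  self.site     = "https://ec2-12-xx-xx-xx.z-1.compute-1.amazonaws.com"  # your deployed console
  self.user     = "admin"    # admin_user from config.yml
  self.password = "secret"   # admin_password from config.yml
end

# Hypothetical attribute values; adjust to the Job model's actual columns.
job = Job.new(
  :name                => "kmeans demo",
  :number_of_instances => 4,
  :input_files         => "ec2cluster/samples/kmeans/input/color100.txt ec2cluster/samples/kmeans/code/run_kmeans.sh",
  :commands            => "bash run_kmeans.sh"
)

if job.save
  puts "Submitted job #{job.id}"
else
  puts "Submission failed: #{job.errors.full_messages.join(', ')}"
end
</code></pre>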
8 app/models/job.rb
@@ -199,7 +199,7 @@ def launch_cluster
APP_CONFIG['aws_secret_access_key'])
puts "Creating master security group"
- @ec2.create_security_group(self.master_security_group,'Elasticwulf-Master-Node')
+ @ec2.create_security_group(self.master_security_group,'ec2cluster-Master-Node')
self.set_progress_message("launching master node")
template = "/../views/jobs/bootstrap.sh.erb"
@@ -212,7 +212,7 @@ def launch_cluster
if self.number_of_instances > 1
puts "Launching worker nodes"
self.set_progress_message("launching worker nodes")
- @ec2.create_security_group(self.worker_security_group,'Elasticwulf-Worker-Node')
+ @ec2.create_security_group(self.worker_security_group,'ec2cluster-Worker-Node')
@workernodes = boot_nodes(self.number_of_instances, self.worker_ami_id,
self.worker_security_group, bootscript)
end
@@ -368,8 +368,8 @@ def set_rest_url
def set_security_groups
timeval = Time.now.strftime('%m%d%y-%I%M%p')
- update_attribute(:master_security_group, "#{id}-elasticwulf-master-"+timeval)
- update_attribute(:worker_security_group, "#{id}-elasticwulf-worker-"+timeval)
+ update_attribute(:master_security_group, "#{id}-ec2cluster-master-"+timeval)
+ update_attribute(:worker_security_group, "#{id}-ec2cluster-worker-"+timeval)
self.save
end
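
For context on the create_security_group calls in launch_cluster above, a rough standalone sketch of the same pattern, assuming @ec2 is a RightAws::Ec2 handle (the diff only shows the constructor's trailing arguments) and using a placeholder job id:

<pre><code>
# Sketch only: credentials and the job id are placeholders.
require 'rubygems'
require 'right_aws'

ec2 = RightAws::Ec2.new('YOUR_AWS_ACCESS_KEY_ID', 'YOUR_AWS_SECRET_ACCESS_KEY')

# Group names follow the "#{id}-ec2cluster-master-MMDDYY-HHMMAM" convention
# set in Job#set_security_groups above.
timeval = Time.now.strftime('%m%d%y-%I%M%p')
master_group = "42-ec2cluster-master-#{timeval}"   # 42 stands in for job.id
worker_group = "42-ec2cluster-worker-#{timeval}"

ec2.create_security_group(master_group, 'ec2cluster-Master-Node')
ec2.create_security_group(worker_group, 'ec2cluster-Worker-Node')
</code></pre>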
16 app/views/jobs/bootstrap.sh.erb
@@ -2,11 +2,11 @@
apt-get -y update
apt-get -y upgrade
apt-get -y install git-core
-groupadd elasticwulf
-useradd -d /mnt/elasticwulf -m -g elasticwulf elasticwulf
-ln -s /mnt/elasticwulf /home/elasticwulf
-chmod 775 -R /home/elasticwulf/
-chown -R elasticwulf:elasticwulf /home/elasticwulf
+groupadd ec2cluster
+useradd -d /mnt/ec2cluster -m -g ec2cluster ec2cluster
+ln -s /mnt/ec2cluster /home/ec2cluster
+chmod 775 -R /home/ec2cluster/
+chown -R ec2cluster:ec2cluster /home/ec2cluster
repository=<%= APP_CONFIG['repository'] %>
aws_access_key_id=<%= APP_CONFIG['aws_access_key_id'] %>
aws_secret_access_key=<%= APP_CONFIG['aws_secret_access_key'] %>
@@ -15,6 +15,6 @@ admin_password=<%= APP_CONFIG['admin_password'] %>
rest_url=<%= self.mpi_service_rest_url %>
job_id=<%= self.id %>
user_packages="<%= self.user_packages %>"
-cd /home/elasticwulf
-su - elasticwulf -c "git clone $repository"
-bash /home/elasticwulf/elasticwulf-service/lib/bootscripts/ubuntu_installs.sh $aws_access_key_id $aws_secret_access_key $admin_user $admin_password $rest_url $job_id "$user_packages"
+cd /home/ec2cluster
+su - ec2cluster -c "git clone $repository"
+bash /home/ec2cluster/ec2cluster/lib/bootscripts/ubuntu_installs.sh $aws_access_key_id $aws_secret_access_key $admin_user $admin_password $rest_url $job_id "$user_packages"
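
Job#launch_cluster (above) points at this template via template = "/../views/jobs/bootstrap.sh.erb", but the rendering call itself is outside the diff. A hedged sketch of how the ERB tags could be expanded into a boot script, using a stand-in job object:

<pre><code>
# Sketch only: the real rendering happens inside the Job model; the class name,
# config path, and attribute values here are placeholders.
require 'erb'
require 'yaml'

APP_CONFIG = YAML.load_file('config/config.yml')['development']

class FakeJob
  attr_accessor :id, :user_packages

  def mpi_service_rest_url
    "https://ec2-12-xx-xx-xx.z-1.compute-1.amazonaws.com/"  # placeholder REST URL
  end

  # Render the template with this object as self, so <%= self.id %> etc. resolve.
  def render_bootscript(template_path)
    ERB.new(File.read(template_path)).result(binding)
  end
end

job = FakeJob.new
job.id = 42
job.user_packages = "gfortran libblas-dev"

puts job.render_bootscript("app/views/jobs/bootstrap.sh.erb")
</code></pre>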
2  app/views/jobs/index.html.erb
@@ -1,4 +1,4 @@
-<h1>Elasticwulf Jobs</h1>
+<h1>ec2cluster Jobs</h1>
<%= link_to "New Job", new_job_path %><BR><BR>
<div id="jobs_div">
<%= render :partial => "jobs_list", :locals => { :jobs => @jobs } %>
2  app/views/jobs/new.html.erb
@@ -18,7 +18,7 @@
</p>
<p>
<%= f.label :input_files %> <i>(mybucket/folder/mpi.bin mybucket/folder/script.sh)</i><br />
- <%= f.text_area :input_files, :cols => 60, :rows => 2, :value =>"elasticwulf/samples/kmeans/input/color100.txt elasticwulf/samples/kmeans/code/Simple_Kmeans.zip elasticwulf/samples/kmeans/code/run_kmeans.sh" %>
+ <%= f.text_area :input_files, :cols => 60, :rows => 2, :value =>"ec2cluster/samples/kmeans/input/color100.txt ec2cluster/samples/kmeans/code/Simple_Kmeans.zip ec2cluster/samples/kmeans/code/run_kmeans.sh" %>
</p>
<p>
<%= f.label :commands %> <i>(bash script.sh)</i><br />
6 config/config.yml.example
@@ -1,6 +1,6 @@
development:
rails_application_port: 3000
- repository: "git://github.com/datawrangling/elasticwulf-service.git"
+ repository: "git://github.com/datawrangling/ec2cluster.git"
aws_access_key_id: 1233KNMKNDSVNDKNVDVD
aws_secret_access_key: DJNVio/vvJDNNCDMVMLlmLMLMLMLMVDlcmldcmdl
web_security_group: 'default'
@@ -17,7 +17,7 @@ development:
test:
rails_application_port: 3000
- repository: "git://github.com/datawrangling/elasticwulf-service.git"
+ repository: "git://github.com/datawrangling/ec2cluster.git"
aws_access_key_id: 1233KNMKNDSVNDKNVDVD
aws_secret_access_key: DJNVio/vvJDNNCDMVMLlmLMLMLMLMVDlcmldcmdl
web_security_group: 'default'
@@ -34,7 +34,7 @@ test:
production:
rails_application_port: 3000
- repository: "git://github.com/datawrangling/elasticwulf-service.git"
+ repository: "git://github.com/datawrangling/ec2cluster.git"
aws_access_key_id: 1233KNMKNDSVNDKNVDVD
aws_secret_access_key: DJNVio/vvJDNNCDMVMLlmLMLMLMLMVDlcmldcmdl
web_security_group: 'default'
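
The keys above are read elsewhere in the app through APP_CONFIG (see app/models/job.rb and bootstrap.sh.erb). The loader itself is not part of this commit, so the following initializer, keyed by Rails environment, is an assumption about how it is likely wired up:

<pre><code>
# Assumed initializer, not shown in this commit.
require 'yaml'

APP_CONFIG = YAML.load_file("#{RAILS_ROOT}/config/config.yml")[RAILS_ENV]

# e.g. APP_CONFIG['repository']        => "git://github.com/datawrangling/ec2cluster.git"
#      APP_CONFIG['aws_access_key_id'] => used by RightAws::Ec2 in Job#launch_cluster
</code></pre>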
4 config/deploy.rb.example
@@ -1,10 +1,10 @@
# This is a sample Capistrano config file for EC2 on Rails.
# It should be edited and customized.
-set :application, "elasticwulf-service"
+set :application, "ec2cluster"
set :scm, :git
-set :repository, "git://github.com/datawrangling/elasticwulf-service.git"
+set :repository, "git://github.com/datawrangling/ec2cluster.git"
set :branch, "master"
default_run_options[:shell] = false
40 lib/bootscripts/ubuntu_installs.sh
@@ -1,6 +1,6 @@
#!/bin/bash
# ubuntu MPI cluster installs
-# this script is kicked off as root within /home/elasticwulf on all nodes
+# this script is kicked off as root within /home/ec2cluster on all nodes
# TODO: check if any installs need to be modified for 64 bit vs. 32 bit amis
# information can be obtained by curl of instance metadata.
@@ -15,7 +15,7 @@ rest_url=$5
job_id=$6
user_packages="$7"
-cat <<EOF >> /home/elasticwulf/cluster_config.yml
+cat <<EOF >> /home/ec2cluster/cluster_config.yml
aws_access_key_id: $aws_access_key_id
aws_secret_access_key: $aws_secret_access_key
admin_user: $admin_user
@@ -25,10 +25,10 @@ job_id: $job_id
user_packages: $user_packages
EOF
-chown elasticwulf:elasticwulf /home/elasticwulf/cluster_config.yml
+chown ec2cluster:ec2cluster /home/ec2cluster/cluster_config.yml
addgroup admin
-adduser elasticwulf admin
+adduser ec2cluster admin
echo '' >> /etc/sudoers
echo '# Members of the admin group may gain root ' >> /etc/sudoers
echo '%admin ALL=NOPASSWD:ALL' >> /etc/sudoers
@@ -95,7 +95,7 @@ INSTANCE_ID=`wget -q -O - http://169.254.169.254/latest/meta-data/instance-id`
NODE_ID=`curl -u $admin_user:$admin_password -k ${rest_url}jobs/${job_id}/search?query=${INSTANCE_ID}`
# configure NFS on master node and set up keys
-# master security group has the format: 8-elasticwulf-master-052609-0823PM
+# master security group has the format: 8-ec2cluster-master-052609-0823PM
SECURITY_GROUPS=`wget -q -O - http://169.254.169.254/latest/meta-data/security-groups`
# Job state is "waiting_for_nodes"
@@ -106,18 +106,18 @@ if [[ "$SECURITY_GROUPS" =~ "master" ]]
then
echo "Node is master, installing nfs server"
sudo apt-get -y install nfs-kernel-server
- echo '/mnt/elasticwulf *(rw,sync)' >> /etc/exports
+ echo '/mnt/ec2cluster *(rw,sync)' >> /etc/exports
/etc/init.d/nfs-kernel-server restart
- ############ ON MASTER NODE AS ELASTICWULF USER ############
- #As the home directory of elasticwulf in all nodes is the same (/home/elasticwulf) ,
+ ############ ON MASTER NODE AS ec2cluster USER ############
+ #As the home directory of ec2cluster in all nodes is the same (/home/ec2cluster) ,
#there is no need to run these commands on all nodes.
- #First we generate DSA key for elasticwulf (leaves passphrase empty):
- su - elasticwulf -c "ssh-keygen -b 1024 -N '' -f ~/.ssh/id_dsa -t dsa -q"
+ #First we generate DSA key for ec2cluster (leaves passphrase empty):
+ su - ec2cluster -c "ssh-keygen -b 1024 -N '' -f ~/.ssh/id_dsa -t dsa -q"
#Next we add this key to authorized keys on master node:
- su - elasticwulf -c "cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys"
- su - elasticwulf -c "chmod 700 ~/.ssh"
- su - elasticwulf -c "chmod 600 ~/.ssh/*"
+ su - ec2cluster -c "cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys"
+ su - ec2cluster -c "chmod 700 ~/.ssh"
+ su - ec2cluster -c "chmod 600 ~/.ssh/*"
else
echo 'node is a worker, skipping NFS export step'
fi
@@ -145,7 +145,7 @@ else
fi
### Set up hosts file on each node. hostsfile will only be ready after all child nodes start booting.
-chmod go-w /mnt/elasticwulf
+chmod go-w /mnt/ec2cluster
curl -u $admin_user:$admin_password -k ${rest_url}jobs/${job_id}/hosts >> /etc/hosts
sed -i -e 's/# StrictHostKeyChecking ask/StrictHostKeyChecking no/g' /etc/ssh/ssh_config
/etc/init.d/ssh restart
@@ -157,7 +157,7 @@ if [[ "$SECURITY_GROUPS" =~ "master" ]]
then
echo "node is the master node, skipping NFS mount, waiting for worker nodes to mount home dir"
# fetch openmpi_hostfile from jobs url
- su - elasticwulf -c "curl -u $admin_user:$admin_password -k ${rest_url}jobs/${job_id}/openmpi_hostfile > openmpi_hostfile"
+ su - ec2cluster -c "curl -u $admin_user:$admin_password -k ${rest_url}jobs/${job_id}/openmpi_hostfile > openmpi_hostfile"
WORKER_NODES=`cat openmpi_hostfile | wc -l | cut --delimiter=' ' -f 1`
MOUNTED_NODES=`grep 'authenticated mount request' /var/log/syslog | wc -l`
@@ -170,20 +170,20 @@ then
echo "All workers have mounted NFS home directory, cluster is ready for MPI jobs"
# Quick test of local openmpi
- su - elasticwulf -c "mpicc /home/elasticwulf/elasticwulf-service/lib/examples/hello.c -o /home/elasticwulf/hello"
- su - elasticwulf -c "mpirun -np 2 /home/elasticwulf/hello > local_mpi_smoketest.txt"
+ su - ec2cluster -c "mpicc /home/ec2cluster/ec2cluster/lib/examples/hello.c -o /home/ec2cluster/hello"
+ su - ec2cluster -c "mpirun -np 2 /home/ec2cluster/hello > local_mpi_smoketest.txt"
# Get total number of cpus in cluster from REST action
CPU_COUNT=`curl -u $admin_user:$admin_password -k ${rest_url}jobs/${job_id}/cpucount`
# Quick smoke test of multinode openmpi run,
- su - elasticwulf -c "mpirun -np $CPU_COUNT --hostfile /home/elasticwulf/openmpi_hostfile /home/elasticwulf/hello > cluster_mpi_smoketest.txt"
+ su - ec2cluster -c "mpirun -np $CPU_COUNT --hostfile /home/ec2cluster/openmpi_hostfile /home/ec2cluster/hello > cluster_mpi_smoketest.txt"
# kick off ruby command_runner.rb script (only on master node)
- su - elasticwulf -c "ruby /home/elasticwulf/elasticwulf-service/lib/command_runner.rb $CPU_COUNT"
+ su - ec2cluster -c "ruby /home/ec2cluster/ec2cluster/lib/command_runner.rb $CPU_COUNT"
else
echo "Node is worker, mounting master NFS"
apt-get -y install portmap nfs-common
- mount ${MASTER_HOSTNAME}:/mnt/elasticwulf /mnt/elasticwulf
+ mount ${MASTER_HOSTNAME}:/mnt/ec2cluster /mnt/ec2cluster
# Send REST PUT to node url, signaling that NFS is ready on node..
curl -H "Content-Type: application/json" -H "Accept: application/json" -X PUT -d "{"node": {"nfs_mounted":"true"}}" -u $admin_user:$admin_password -k ${rest_url}jobs/${job_id}/nodes/${NODE_ID}
fi
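
The last curl call in this script PUTs a JSON body to the node resource to report that the worker has mounted NFS. For readers more comfortable in Ruby, an equivalent sketch (not code from the repo; host, credentials, and ids are placeholders):

<pre><code>
# Sketch of the worker's final REST call, mirroring the curl PUT above.
require 'net/https'
require 'uri'

rest_url = "https://ec2-12-xx-xx-xx.z-1.compute-1.amazonaws.com/"
job_id   = 8
node_id  = 3

uri  = URI.parse("#{rest_url}jobs/#{job_id}/nodes/#{node_id}")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl     = true
http.verify_mode = OpenSSL::SSL::VERIFY_NONE   # mirrors curl's -k flag

request = Net::HTTP::Put.new(uri.path)
request.basic_auth('admin', 'secret')          # admin_user / admin_password
request['Content-Type'] = 'application/json'
request['Accept']       = 'application/json'
request.body = '{"node": {"nfs_mounted": "true"}}'

response = http.request(request)
puts "#{response.code} #{response.message}"
</code></pre>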
8 lib/command_runner.rb
@@ -2,9 +2,9 @@
# command_runner.rb NUMBER_OF_CPUS
-# This script is only run on the master node of the cluster as the "elasticwulf" user from within the NFS home directory "/home/elasticwulf/". It fetches input & code from s3, runs the job command, and uploads outputs to S3. The job command it runs will typically be a bash script containing MPI commands run across the entire cluster using the input data fetched from S3 which is available to all nodes via NFS.
+# This script is only run on the master node of the cluster as the "ec2cluster" user from within the NFS home directory "/home/ec2cluster/". It fetches input & code from s3, runs the job command, and uploads outputs to S3. The job command it runs will typically be a bash script containing MPI commands run across the entire cluster using the input data fetched from S3 which is available to all nodes via NFS.
-# Convention is for the supplied command to write all files to the working directory or a path relative to /home/elasticwulf/
+# Convention is for the supplied command to write all files to the working directory or a path relative to /home/ec2cluster/
require 'rubygems'
require 'activeresource'
@@ -14,7 +14,7 @@
CPU_COUNT=ARGV[0]
ENV['CPU_COUNT'] = CPU_COUNT
-CLUSTER_CONFIG = YAML.load_file("/home/elasticwulf/cluster_config.yml")
+CLUSTER_CONFIG = YAML.load_file("/home/ec2cluster/cluster_config.yml")
puts "job id: " + CLUSTER_CONFIG['job_id'].to_s
@s3handle = RightAws::S3Interface.new(CLUSTER_CONFIG['aws_access_key_id'],
@@ -54,7 +54,7 @@ def upload_s3file(output_path, localfile, s3handle)
s3handle.put(bucket, s3key, File.open(localfile))
end
-# Create an ActiveResource connection to the Elasticwulf REST web service
+# Create an ActiveResource connection to the ec2cluster REST web service
class Job < ActiveResource::Base
self.site = CLUSTER_CONFIG['rest_url']
self.user = CLUSTER_CONFIG['admin_user']
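
command_runner.rb's upload helper (upload_s3file) appears above; the matching download direction does not. A sketch of fetching one input file from S3 with RightAws::S3Interface, with an assumed helper name and placeholder credentials:

<pre><code>
# Sketch only: helper name and credentials are placeholders; only put() is shown in the diff.
require 'rubygems'
require 'right_aws'

def download_s3file(input_path, localdir, s3handle)
  bucket, s3key = input_path.split('/', 2)                 # "mybucket/folder/file" -> bucket + key
  localfile = File.join(localdir, File.basename(s3key))
  File.open(localfile, 'wb') do |f|
    s3handle.get(bucket, s3key) { |chunk| f.write(chunk) } # stream the object to disk
  end
  localfile
end

s3 = RightAws::S3Interface.new('YOUR_AWS_ACCESS_KEY_ID', 'YOUR_AWS_SECRET_ACCESS_KEY')
download_s3file("ec2cluster/samples/kmeans/input/color100.txt", ".", s3)
</code></pre>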