Cloud Deployment

ihelmke edited this page Sep 9, 2011 · 5 revisions

Deploying the Project to the Cloud

1. Modify the pom in the pipeline folder to build a zip file containing all of the dependencies required to run the project. You can do this by adding the following lines to the dependency-package XML::

       <activation>
         <activeByDefault>true</activeByDefault>
       </activation>

2. Build everything from the root directory as described in the "running the framework" directions. Unless this is your first time running the project, this simply amounts to running mvn in the root directory.
3. Build the fsrip utility. Directions for this can also be found in the "running the framework" instructions.
4. Switch to the deploy/ directory, which contains a skeleton of the files that need to be copied to the cloud. Add the zip file of library jars to the root of this directory, and add the main pipeline jar to the deploy/pipeline/ directory.
5. Add fsrip to the deploy/fsrip directory, and add its dependencies to the deploy/fsrip/deps directory.
6. Use Whirr to start the cluster. Whirr 0.6+ is recommended. The project includes a startup script that uses one master node and one worker node with HBase and Cloudera Hadoop.
7. Run the pushtocluster.sh script with the addresses of the two machines to push to as arguments. This copies the deploy directory to those machines and extracts the lib zip file on them, allowing the framework to run properly. It also restarts Hadoop on those machines.
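
The staging, cluster start, and push steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the artifact paths (`LIB_ZIP`, `PIPELINE_JAR`, `FSRIP_BIN`, `FSRIP_DEPS_DIR`) and the Whirr properties file name are placeholders that do not come from the project, the sketch calls `whirr launch-cluster` directly instead of the project's own startup script (whose name is not given here), and it assumes `pushtocluster.sh` is in the current directory. It presumes the Maven and fsrip builds have already been done. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the staging and deployment steps; run from the project root.
# DRY_RUN=1 (the default) prints each command; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Placeholder locations (assumptions): point these at what your build produced.
LIB_ZIP="${LIB_ZIP:-pipeline/target/dependencies.zip}"
PIPELINE_JAR="${PIPELINE_JAR:-pipeline/target/pipeline.jar}"
FSRIP_BIN="${FSRIP_BIN:-fsrip/build/fsrip}"
FSRIP_DEPS_DIR="${FSRIP_DEPS_DIR:-fsrip/build/deps}"
WHIRR_CONFIG="${WHIRR_CONFIG:-cluster.properties}"

# Copy the build artifacts into the deploy/ skeleton.
stage_files() {
    run cp "$LIB_ZIP" deploy/
    run cp "$PIPELINE_JAR" deploy/pipeline/
    run cp "$FSRIP_BIN" deploy/fsrip/
    run cp -r "$FSRIP_DEPS_DIR/." deploy/fsrip/deps/
}

# Start the cluster described by the Whirr properties file.
start_cluster() {
    run whirr launch-cluster --config "$WHIRR_CONFIG"
}

# Push the deploy directory to both machines.
# $1 = master address, $2 = worker address
push_to_cluster() {
    if [ "$#" -ne 2 ]; then
        echo "usage: push_to_cluster MASTER_ADDR WORKER_ADDR" >&2
        return 1
    fi
    run ./pushtocluster.sh "$1" "$2"
}
```

After sourcing the file, calling `stage_files`, `start_cluster`, and then `push_to_cluster <master-address> <worker-address>` in that order mirrors the steps above. Reviewing the dry-run output first is a cheap way to check the placeholder paths before setting `DRY_RUN=0`.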