Tool for provisioning Hadoop clusters on Google Compute Engine (GCE).
- Install gevent
sudo pip install gevent
- Follow the instructions to install and configure gsutil at: https://cloud.google.com/storage/docs/gsutil_install
- Follow the instructions to install and configure gcutil at: https://cloud.google.com/compute/docs/gcutil
- Copy cluster_config_sample to cluster_config
cp cluster_config_sample cluster_config
- Edit the settings in the custom settings section of cluster_config
- Set up a Hadoop cluster
python zdutil.py -c cluster_config -a setup
- Tear down the cluster
python zdutil.py -c cluster_config -a teardown
- Set up a Hadoop cluster and run your own bash scripts (paths separated by commas)
python zdutil.py -c cluster_config -a setup -s <path_to_script1>,<path_to_script2>
- Set up a Hadoop cluster, run your own bash scripts on the namenode, and tear down the cluster afterwards
python script_runner/script_runner.py -c cluster_config -z zdutil.py -s <path_to_script1>,<path_to_script2>
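The setup, run, and teardown sequence can also be driven from your own Python code. The following is a minimal sketch, not the script_runner implementation: it only assembles the zdutil.py command lines shown above (the -c, -a, and -s flags) and wraps them in try/finally so that teardown is attempted even when setup or a script fails. The helper names build_zdutil_command and run_with_teardown are hypothetical and not part of zdutil.

```python
import subprocess


def build_zdutil_command(config, action, scripts=None, zdutil="zdutil.py"):
    """Build the argument list for one zdutil.py invocation (hypothetical helper)."""
    cmd = ["python", zdutil, "-c", config, "-a", action]
    if scripts:
        # -s takes a single comma-separated list of script paths.
        cmd += ["-s", ",".join(scripts)]
    return cmd


def run_with_teardown(config, scripts, runner=subprocess.check_call):
    """Run setup with the given scripts, then always attempt teardown."""
    try:
        runner(build_zdutil_command(config, "setup", scripts))
    finally:
        # Executed even if setup raises, so the cluster is not left running.
        runner(build_zdutil_command(config, "teardown"))
```

For example, run_with_teardown("cluster_config", ["<path_to_script1>", "<path_to_script2>"]) issues the setup command with -s and then the teardown command. The runner parameter is there so the sequence can be checked without creating a cluster.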
Read more about zdutil at http://engineering.zulily.com/2014/12/03/google-compute-engine-hadoop-clusters-with-zdutil/