Spark GCE Script Helps you deploy Spark cluster on Google Cloud.
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
LICENSE
README.md
banner
spark_gce.py

README.md

Spark GCE

Spark GCE is like Spark Ec2 but for those who run their cluster on Google Cloud.

  • Make Sure you have installed and authenticated gcutils where you are running this script.
  • Helps you launch a spark cluster in the Google Cloud
  • Attaches 500GB empty disk to all nodes in the cluster
  • Installs and Configures everything Automatically
  • Starts the Shark server Automatically

Spark GCE is a python script which will help you launch a spark cluster in the google cloud like the way spark_ec2 script does for AWS.

Usage

spark_gce.py project-name number-of-slaves slave-type master-type identity-file zone cluster-name

  • project-id: Project ID of the project where you are going to launch your spark cluster.

  • number-of-slave: Number of slaves that you want to launch.

  • slave-type: Instance type for the slave machines.

  • master-type: Instance type for the master node.

  • identity-file: Identity file to authenticate with your GCE instances, Usually resides at ~/.ssh/google_compute_engine once you authenticate using gcutils.

  • zone: Specify the zone where you are going to launch the cluster.

  • cluster-name: Name the cluster that you are going to launch.

spark_gce.py project-name cluster-name destroy

  • project-id: Project id of the project where the spark cluster is at.
  • cluster-name: Name of the cluster that you are going to destroy.

Installation

git clone https://github.com/sigmoidanalytics/spark_gce.git
cd spark_gce
python spark_gce.py

Need Help?