Vagrant template to provision a standalone Spark cluster with lean defaults. This is a great way to set up a Spark cluster on your laptop that can easily be deleted later with no changes to your machine.
See the `Vagrantfile` for details and to make changes.
- Spark running in standalone cluster mode. Tested with Spark 2.1.x and 2.2.x.
- One Ubuntu 16.04 head node machine and `N` worker (slave) machines.
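A minimal sketch of the kind of topology such a `Vagrantfile` defines (the `N_WORKERS` variable and the `hn0`/`wn0` machine names match the rest of this README; the box name, memory size, and IP scheme here are illustrative assumptions — the repository's actual `Vagrantfile` is authoritative):

```ruby
# Hypothetical Vagrantfile sketch: one head node plus N_WORKERS workers.
# Box, memory, and IPs are assumptions for illustration only.
N_WORKERS = 2  # change this to set the number of worker nodes

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/xenial64"  # Ubuntu 16.04

  # Head node: hn0
  config.vm.define "hn0" do |head|
    head.vm.hostname = "hn0"
    head.vm.network "private_network", ip: "192.168.50.10"
    head.vm.provider "virtualbox" do |vb|
      vb.memory = 2048
    end
  end

  # Worker nodes: wn0 .. wn(N_WORKERS-1)
  (0...N_WORKERS).each do |i|
    config.vm.define "wn#{i}" do |worker|
      worker.vm.hostname = "wn#{i}"
      worker.vm.network "private_network", ip: "192.168.50.#{20 + i}"
      worker.vm.provider "virtualbox" do |vb|
        vb.memory = 2048
      end
    end
  end
end
```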
To spin up your own local Spark cluster, clone this repository first.
Next, download a pre-built Spark package and save it in this directory as `spark.tgz`.
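For example, a pre-built package can be fetched from the Apache archive (the exact version and Hadoop build below are assumptions chosen to match the 2.2.1 jar used later in this README; pick whichever tested 2.1.x/2.2.x package you prefer):

```shell
# Download a pre-built Spark package and save it as spark.tgz.
# Version 2.2.1 / Hadoop 2.7 is used here as an example.
curl -L -o spark.tgz \
  https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz
```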
Next, open up the `Vagrantfile` in a text editor. You'll want to change the `N_WORKERS` variable near the top of the file.
Vagrant will spin up one "head node" and `N` worker nodes in a Spark standalone cluster.
Feel free to make other changes, e.g. RAM and CPU for each of the machines.
When you're ready, just run `vagrant up` in the directory the `Vagrantfile` is in. Wait a few minutes and your Spark cluster will be ready.
SSH in using `vagrant ssh hn0` or `vagrant ssh wn0`.
You'll also be able to see the Spark WebUI at
Shut down the cluster with `vagrant halt` and delete it with `vagrant destroy`. You can always run `vagrant up` again to turn it back on or build a brand new cluster.
To run the `SparkPi` example on the cluster, run the following commands:

```
vagrant ssh hn0
spark-submit --class org.apache.spark.examples.SparkPi ~/spark/examples/jars/spark-examples_2.11-2.2.1.jar 1000
```
See the LICENSE.txt file.