Improve the description about Cluster Launch Script in docs/spark-standalone.md
sarutak committed Sep 25, 2014
1 parent 7858225 commit eff7394
Showing 1 changed file with 6 additions and 6 deletions: docs/spark-standalone.md
@@ -62,12 +62,12 @@ Finally, the following configuration options can be passed to the master and worker:

# Cluster Launch Scripts

-To launch a Spark standalone cluster with the launch scripts, you need to create a file called `conf/slaves` in your Spark directory,
-which should contain the hostnames of all the machines where you would like to start Spark workers, one per line. If `conf/slaves`
-does not exist, the launch scripts use a list which contains single hostname `localhost`. This can be used for testing.
-The master machine must be able to access each of the slave machines via `ssh`. By default, `ssh` is executed in the background for parallel execution for each slave machine.
-If you would like to use password authentication instead of password-less(using a private key) for `ssh`, `ssh` does not work well in the background.
-To avoid this, you can set a environment variable `SPARK_SSH_FOREGROUND` to something like `yes` or `y` to execute `ssh` in the foreground.
+To launch a Spark standalone cluster with the launch scripts, you should create a file called conf/slaves in your Spark directory,
+which must contain the hostnames of all the machines where you intend to start Spark workers, one per line.
+If conf/slaves does not exist, the launch scripts defaults to a single machine (localhost), which is useful for testing.
+Note, the master machine accesses each of the worker machines via ssh. By default, ssh is run in parallel and requires password-less (using a private key) access to be setup.
+If you do not have a password-less setup, you can set the environment variable SPARK_SSH_FOREGROUND and serially provide a password for each worker.


Once you've set up this file, you can launch or stop your cluster with the following shell scripts, based on Hadoop's deploy scripts, and available in `SPARK_HOME/bin`:

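For illustration, here is a minimal sketch of the `conf/slaves` file the revised passage describes; the hostnames are placeholders:

```
# conf/slaves — one Spark worker hostname per line
worker1.example.com
worker2.example.com
```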

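Once the file is in place, the cluster can be started and stopped with the deploy scripts the context line above refers to. A hedged sketch, assuming the `sbin/` script layout of the Spark distribution and using `yes` as an example value for the variable:

```sh
# Start a master on this machine and a worker on each host in conf/slaves
./sbin/start-all.sh

# Without password-less ssh, run ssh serially in the foreground so it
# can prompt for each worker's password
SPARK_SSH_FOREGROUND=yes ./sbin/start-slaves.sh

# Stop the master and all the workers
./sbin/stop-all.sh
```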