
[SPARK-3584] sbin/slaves doesn't work when we use password authentication for SSH #2444

Closed
wants to merge 8 commits
1 change: 1 addition & 0 deletions .gitignore
@@ -23,6 +23,7 @@ conf/*.cmd
conf/*.properties
conf/*.conf
conf/*.xml
+conf/slaves
Contributor

For this one I think we intentionally have a simple slaves file in the repo so that people can start a local cluster by default.

Member Author

So, this file will not be edited? Users should specify another slave list file via the SPARK_SLAVES variable, right?

Contributor

Okay, this is fine actually, given that we preserve the "default" behavior (of starting at localhost) due to your edits below.

docs/_site
docs/api
target/
1 change: 1 addition & 0 deletions .rat-excludes
@@ -19,6 +19,7 @@ log4j.properties
log4j.properties.template
metrics.properties.template
slaves
+slaves.template
spark-env.sh
spark-env.cmd
spark-env.sh.template
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/spark-standalone.md
@@ -62,7 +62,7 @@ Finally, the following configuration options can be passed to the master and worker:

# Cluster Launch Scripts

-To launch a Spark standalone cluster with the launch scripts, you need to create a file called `conf/slaves` in your Spark directory, which should contain the hostnames of all the machines where you would like to start Spark workers, one per line. The master machine must be able to access each of the slave machines via password-less `ssh` (using a private key). For testing, you can just put `localhost` in this file.
+To launch a Spark standalone cluster with the launch scripts, you need to create a file called `conf/slaves` in your Spark directory, which should contain the hostnames of all the machines where you would like to start Spark workers, one per line. If `conf/slaves` does not exist, the launch scripts fall back to a list containing the single hostname `localhost`, which is useful for testing. The master machine must be able to access each of the slave machines via `ssh`. By default, `ssh` is executed in the background so that the slave machines are reached in parallel. If you would like to use password authentication for `ssh` instead of password-less authentication (using a private key), `ssh` does not work well in the background. To avoid this, you can set the environment variable `SPARK_SSH_FOREGROUND` to `yes` or `y` to execute `ssh` in the foreground.

Contributor

what about -

To launch a Spark standalone cluster with the launch scripts, you should create a file called conf/slaves in your Spark directory, which must contain the hostnames of all the machines where you intend to start Spark workers, one per line. If conf/slaves does not exist, the launch scripts default to a single machine (localhost), which is useful for testing. Note, the master machine accesses each of the worker machines via ssh. By default, ssh is run in parallel and requires password-less (using a private key) access to be set up. If you do not have a password-less setup, you can set the environment variable SPARK_SSH_FOREGROUND and serially provide a password for each worker.

Member Author

@mattf Thank you for reviewing. It makes sense.

Once you've set up this file, you can launch or stop your cluster with the following shell scripts, based on Hadoop's deploy scripts, and available in `SPARK_HOME/sbin`:
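
As a rough illustration, a minimal setup might look like the following (the worker hostnames are hypothetical; `sbin/start-slaves.sh` is one of the launch scripts referenced above):

$ cat conf/slaves
worker1.example.com
worker2.example.com

# With password authentication, forcing ssh into the foreground makes each
# password prompt appear in turn instead of being swallowed by a background job:
$ SPARK_SSH_FOREGROUND=yes sbin/start-slaves.sh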

31 changes: 22 additions & 9 deletions sbin/slaves.sh
@@ -44,7 +44,9 @@ sbin="`cd "$sbin"; pwd`"
# If the slaves file is specified in the command line,
# then it takes precedence over the definition in
# spark-env.sh. Save it here.
-HOSTLIST="$SPARK_SLAVES"
+if [ -f "$SPARK_SLAVES" ]; then
+  HOSTLIST=`cat "$SPARK_SLAVES"`
+fi

# Check if --config is passed as an argument. It is an optional parameter.
# Exit if the argument is not a directory.
@@ -67,23 +69,34 @@ fi

if [ "$HOSTLIST" = "" ]; then
if [ "$SPARK_SLAVES" = "" ]; then
export HOSTLIST="${SPARK_CONF_DIR}/slaves"
if [ -f "${SPARK_CONF_DIR}/slaves" ]; then
HOSTLIST=`cat "${SPARK_CONF_DIR}/slaves"`
else
HOSTLIST=localhost
Contributor

We should change the docs in spark-standalone.md to explain two new features:

  1. You can set SSH_FOREGROUND if you cannot use passwordless SSH (currently, the docs say passwordless SSH is required).
  2. If there is no slaves file in existence, it will launch a single slave at localhost by default.

Contributor

yes, i was moving too quickly this morning. definitely need something to allow for background ssh.

Member Author

O.K., I'll add an SSH_FOREGROUND variable and add a description.
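
A quick hypothetical check of the localhost fallback (assuming sshd is running locally): with no conf/slaves file present, slaves.sh should run the given command on localhost only, with output prefixed per the sed expression further down:

$ sbin/slaves.sh echo hello
localhost: hello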

+    fi
  else
-    export HOSTLIST="${SPARK_SLAVES}"
+    HOSTLIST=`cat "${SPARK_SLAVES}"`
Contributor

why cat here and echo later?

Member Author

This is to use HOSTLIST as the list of hosts itself, not as a file name.
That makes it possible to fall back to `localhost` as a default host list entry.

Contributor

thanks for pointing that out. i didn't read closely enough.
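
A small sketch of the distinction under discussion (the file path is hypothetical): cat reads a file by name, whereas HOSTLIST now holds the hostnames themselves, so the loop re-emits its contents with echo:

HOSTLIST=`cat conf/slaves`      # read the file once; HOSTLIST now holds hostnames
for slave in `echo "$HOSTLIST" | sed "s/#.*$//;/^$/d"`; do  # strip comments and blanks
  echo "$slave"
done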

  fi
fi



# By default disable strict host key checking
if [ "$SPARK_SSH_OPTS" = "" ]; then
  SPARK_SSH_OPTS="-o StrictHostKeyChecking=no"
fi

-for slave in `cat "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do
-  ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \
-    2>&1 | sed "s/^/$slave: /" &
-  if [ "$SPARK_SLAVE_SLEEP" != "" ]; then
-    sleep $SPARK_SLAVE_SLEEP
-  fi
+for slave in `echo "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do
+  if [ "${SPARK_SSH_FOREGROUND}" = "y" ] || [ "${SPARK_SSH_FOREGROUND}" = "yes" ]; then
Contributor

Typically for these types of options we just check whether it's defined or not. For example elsewhere we do:

if [ -n "$SPARK_PRINT_LAUNCH_COMMAND" ]; then

Can you make it consistent with that?
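
A sketch of the suggested style; this is the reviewer's proposal rather than the code committed in this revision (`-n` tests that the variable is set to a non-empty value):

if [ -n "${SPARK_SSH_FOREGROUND}" ]; then
  ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \
    2>&1 | sed "s/^/$slave: /"
else
  ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \
    2>&1 | sed "s/^/$slave: /" &
fi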

+    ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \
+      2>&1 | sed "s/^/$slave: /"
+  else
+    ssh $SPARK_SSH_OPTS "$slave" $"${@// /\\ }" \
+      2>&1 | sed "s/^/$slave: /" &
+  fi
+  if [ "$SPARK_SLAVE_SLEEP" != "" ]; then
+    sleep $SPARK_SLAVE_SLEEP
+  fi
done

wait