[SPARK-3584] sbin/slaves doesn't work when we use password authentication for SSH #2444
- Modified sbin/slaves to choose localhost as a default host list
- Renamed conf/slaves to conf/slaves.template
- Added entries about slaves and slaves.template to .rat-excludes
- Added entries about slaves to .gitignore
QA tests have started for PR 2444 at commit
|
QA tests have finished for PR 2444 at commit
|
else
  export HOSTLIST="${SPARK_SLAVES}"
  HOSTLIST=`cat "${SPARK_SLAVES}"`
why cat here and echo later?
This is so that HOSTLIST holds the list of hosts itself, not the path to a file.
That way localhost can be used as the default host list entry.
thanks for pointing that out. i didn't read closely enough.
+1 lgtm
if [ -f "${SPARK_CONF_DIR}/slaves" ]; then
  HOSTLIST=`cat "${SPARK_CONF_DIR}/slaves"`
else
  HOSTLIST=localhost
We should change the docs in `spark-standalone.md` to explain two new features:
- You can set SSH_FOREGROUND if you cannot use passwordless SSH (currently, it says this is required).
- If there is no `slaves` file in existence, it will launch a single slave at `localhost` by default.
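The fallback described above can be sketched roughly as follows. This is only an illustration that assumes the behavior shown in the diff; `resolve_hostlist` and `clean_hostlist` are hypothetical helper names and are not part of `sbin/slaves`.

```shell
# Hypothetical helper (not part of sbin/slaves): prints the host list that
# would be used for a given conf directory, falling back to localhost.
resolve_hostlist() {
  conf_dir="$1"
  if [ -f "${conf_dir}/slaves" ]; then
    cat "${conf_dir}/slaves"
  else
    echo "localhost"
  fi
}

# Hypothetical helper: strips comments and blank lines, using the same sed
# expression that appears in the diff.
clean_hostlist() {
  sed "s/#.*$//;/^$/d"
}
```

With no `slaves` file in the given directory, `resolve_hostlist` prints `localhost`; otherwise it prints the file contents, which can then be piped through `clean_hostlist`.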
yes, i was moving too quickly this morning. definitely need something to allow for background ssh.
OK, I'll add an SSH_FOREGROUND variable and a description of it.
Made some comments. We need to guard this with a config parameter because otherwise it will regress behavior on large clusters where serial vs parallel ssh makes a big difference.
QA tests have started for PR 2444 at commit
|
QA tests have finished for PR 2444 at commit
|
  sleep $SPARK_SLAVE_SLEEP
fi
for slave in `echo "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do
  if [ "${SPARK_SSH_FOREGROUND}" = "y" ] || [ "${SPARK_SSH_FOREGROUND}" = "yes" ]; then
Typically for these types of options we just check whether it's defined or not. For example elsewhere we do:
if [ -n "$SPARK_PRINT_LAUNCH_COMMAND" ]; then
Can you make it consistent with that?
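For illustration, the loop with the suggested `-n` check might look something like the sketch below. It is not the merged code: `run_on_host` is a hypothetical stand-in for the real `ssh` invocation so the example runs anywhere, and `launch_all` is a hypothetical wrapper name.

```shell
# Hypothetical stand-in for the ssh call; it only reports the target host.
run_on_host() {
  echo "started on $1"
}

# Hypothetical wrapper: takes the host list as a single string argument.
launch_all() {
  hostlist="$1"
  for slave in $(echo "$hostlist" | sed "s/#.*$//;/^$/d"); do
    if [ -n "${SPARK_SSH_FOREGROUND}" ]; then
      # Any non-empty value selects foreground mode: one host at a time,
      # so an interactive password prompt can be answered.
      run_on_host "$slave"
    else
      # Default: start every host in the background, in parallel.
      run_on_host "$slave" &
    fi
  done
  # Wait for any background jobs before returning.
  wait
}
```

Checking only whether the variable is non-empty means any value such as `1`, `y` or `yes` enables foreground mode, which matches how `SPARK_PRINT_LAUNCH_COMMAND` is tested.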
This looks good, just had a minor comment, then I think it's ready to merge.
Thanks @pwendell, I've modified what you mentioned.
QA tests have started for PR 2444 at commit
|
QA tests have finished for PR 2444 at commit
|
Test PASSed.
does not exist, the launch scripts use a list which contains single hostname `localhost`. This can be used for testing.
The master machine must be able to access each of the slave machines via `ssh`. By default, `ssh` is executed in the background for parallel execution for each slave machine.
If you would like to use password authentication instead of password-less(using a private key) for `ssh`, `ssh` does not work well in the background.
To avoid this, you can set a environment variable `SPARK_SSH_FOREGROUND` to something like `yes` or `y` to execute `ssh` in the foreground.
what about:
To launch a Spark standalone cluster with the launch scripts, you should create a file called `conf/slaves` in your Spark directory, which must contain the hostnames of all the machines where you intend to start Spark workers, one per line. If `conf/slaves` does not exist, the launch scripts default to a single machine (`localhost`), which is useful for testing. Note, the master machine accesses each of the worker machines via `ssh`. By default, `ssh` is run in parallel and requires password-less (using a private key) access to be set up. If you do not have a password-less setup, you can set the environment variable `SPARK_SSH_FOREGROUND` and serially provide a password for each worker.
@mattf Thank you for reviewing. It makes sense.
QA tests have started for PR 2444 at commit
|
QA tests have finished for PR 2444 at commit
|
Test PASSed.
Hi @pwendell,
I also noticed that
Actually, nevermind: this was fixed in #2549.