Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-3584] sbin/slaves doesn't work when we use password authentication for SSH #2444

Closed
wants to merge 8 commits into from

Conversation

sarutak
Copy link
Member

@sarutak sarutak commented Sep 18, 2014

No description provided.

Modified sbin/slaves to choose localhost as a default host list

Renamed conf/slaves to conf/slaves.template

Added entries about slaves and slaves.template to .rat-excludes

Added entries about slaves to .gitignore
@SparkQA
Copy link

SparkQA commented Sep 18, 2014

QA tests have started for PR 2444 at commit 297e75d.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 18, 2014

QA tests have finished for PR 2444 at commit 297e75d.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

else
export HOSTLIST="${SPARK_SLAVES}"
HOSTLIST=`cat "${SPARK_SLAVES}"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why cat here and echo later?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to use HOSTLIST as List of Host, not file.
It's to use localhost as a default host list entry.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for pointing that out. i didn't read closely enough.

@mattf
Copy link
Contributor

mattf commented Sep 19, 2014

+1 lgtm

if [ -f "${SPARK_CONF_DIR}/slaves" ]; then
HOSTLIST=`cat "${SPARK_CONF_DIR}/slaves"`
else
HOSTLIST=localhost
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should change the docs in spark-standalone.md to explain two new features:

  1. You can set SSH_FOREGROUND if you cannot use paswordless SSH (currently, it says this is required).
  2. If there is no slaves file in existence, it will launch a single slave at localhost by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i was moving too quickly this morning. definitely need something to allow for background ssh.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

O.K, I'll add SSH_FOREGROUND variable and add description.

@pwendell
Copy link
Contributor

Made some comments. We need to guard this with a config parameter because otherwise it will regress behavior on large clusters where serial vs parallel ssh makes a big difference.

@SparkQA
Copy link

SparkQA commented Sep 22, 2014

QA tests have started for PR 2444 at commit 1bba8a9.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 22, 2014

QA tests have started for PR 2444 at commit 7120a0c.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 22, 2014

QA tests have started for PR 2444 at commit e570431.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 22, 2014

QA tests have finished for PR 2444 at commit 1bba8a9.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 22, 2014

QA tests have finished for PR 2444 at commit 7120a0c.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 22, 2014

QA tests have finished for PR 2444 at commit e570431.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

sleep $SPARK_SLAVE_SLEEP
fi
for slave in `echo "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do
if [ "${SPARK_SSH_FOREGROUND}" = "y" ] || [ "${SPARK_SSH_FOREGROUND}" = "yes" ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically for these types of options we just check whether it's defined or not. For example elsewhere we do:

if [ -n "$SPARK_PRINT_LAUNCH_COMMAND" ]; then

Can you make it consistent with that?

@pwendell
Copy link
Contributor

This looks good, just had a minor comment, then I think it's ready to merge.

@sarutak
Copy link
Member Author

sarutak commented Sep 25, 2014

Thanks @pwendell , I've modified what you mentioned.

@SparkQA
Copy link

SparkQA commented Sep 25, 2014

QA tests have started for PR 2444 at commit 7858225.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 25, 2014

QA tests have finished for PR 2444 at commit 7858225.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20799/

does not exist, the launch scripts use a list which contains single hostname `localhost`. This can be used for testing.
The master machine must be able to access each of the slave machines via `ssh`. By default, `ssh` is executed in the background for parallel execution for each slave machine.
If you would like to use password authentication instead of password-less(using a private key) for `ssh`, `ssh` does not work well in the background.
To avoid this, you can set a environment variable `SPARK_SSH_FOREGROUND` to something like `yes` or `y` to execute `ssh` in the foreground.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about -

To launch a Spark standalone cluster with the launch scripts, you should create a file called conf/slaves in your Spark directory, which must contain the hostnames of all the machines where you intend to start Spark workers, one per line. If conf/slaves does not exist, the launch scripts defaults to a single machine (localhost), which is useful for testing. Note, the master machine accesses each of the worker machines via ssh. By default, ssh is run in parallel and requires password-less (using a private key) access to be setup. If you do not have a password-less setup, you can set the environment variable SPARK_SSH_FOREGROUND and serially provide a password for each worker.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattf Thank you for reviewing. It makes sense.

@SparkQA
Copy link

SparkQA commented Sep 25, 2014

QA tests have started for PR 2444 at commit eff7394.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 25, 2014

QA tests have finished for PR 2444 at commit eff7394.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20805/

@pwendell
Copy link
Contributor

Thanks @sarutak and @mattf for the review. I'll pull this in.

@asfgit asfgit closed this in 0dc868e Sep 25, 2014
@jameszhouyi
Copy link

Hi @pwendell ,
After this commit, for spark-perf will complain 'not found slaves' when run ./bin/run... so have to modify from slaves.template to slaves manually ?

@JoshRosen
Copy link
Contributor

I also noticed that make-distribution.sh fails when trying to copy the now-nonexistent conf/slaves file. I'm going to push a hotfix commit to fix that.

@JoshRosen
Copy link
Contributor

Actually, nevermind: this was fixed in #2549.

@sarutak sarutak deleted the slaves-scripts-modification branch April 11, 2015 05:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants