[SPARK-4948] Use pssh. #85

Closed
wants to merge 10 commits into from

nchammas

  • Replace some bash-isms with calls to pssh to neatly parallelize cluster operations.
  • Remove unnecessary code to pre-approve SSH keys.
  • Decrease questionably high sleep times.
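
As a rough illustration of the first two bullets, the sketch below builds a pssh command line that runs one command on every host listed in a file, which is the kind of call that can replace a bash loop over ssh. It is not taken from this PR's diff: the helper name `build_pssh_command`, the hosts file path, and the example command are hypothetical, and passing `StrictHostKeyChecking=no` is only one assumed way to avoid pre-approving host keys. The flags themselves (`-h`, `-l`, `-t`, `-O`, `-i`) are standard pssh options.

```python
import shlex


def build_pssh_command(hosts_file, remote_command, user="root", timeout_s=60):
    """Return an argv list that runs remote_command on every host in hosts_file.

    Hypothetical helper for illustration; not part of spark-ec2.
    """
    return [
        "pssh",
        "-h", hosts_file,                   # file with one host per line
        "-l", user,                         # remote login user
        "-t", str(timeout_s),               # per-host timeout in seconds
        "-O", "StrictHostKeyChecking=no",   # assumption: skip host-key prompts
        "-i",                               # print each host's output inline
        remote_command,
    ]


if __name__ == "__main__":
    # Both the hosts file path and the command are made-up examples.
    argv = build_pssh_command("/root/spark-ec2/slaves", "mkdir -p /mnt/spark")
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The list can be handed to `subprocess.run` on a machine where pssh is installed. On some distributions the binary is named `parallel-ssh` rather than `pssh`.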

@nchammas mentioned this pull request Dec 24, 2014
@nchammas changed the title from "[SPARK-4325] Use pssh." to "[SPARK-4948] Use pssh." Dec 24, 2014
@nchammas
Author

cc @shivaram

@shivaram

LGTM. Thanks for testing this through. I'd like to do one last test by launching a cluster using this branch before merging. I'll do it by tomorrow.

@JoshRosen
Member

So that we don't accidentally break things for users that are running a released Spark 1.2.0 version, do you think that we should merge this into a new v5 branch and bump the Spark Master's EC2 script to use that branch?

@nchammas
Author

Though this should be a seamless change, yes, it would be safer to merge this into a new branch.

Should we take this opportunity to start naming the branches after major Spark releases? So instead of v5, we cut a branch called v1.3. (Dunno if branch proliferation would be a problem in the long term...)

I think it will make this stuff simpler to manage going forward.

@JoshRosen
Member

That naming convention sounds fine to me; I just bumped the v3 to v4 last time because I was hotfixing a breakage.

@nchammas
Author

What dost the @shivaram thinketh?

@shivaram

If we are going to create new branches, let's give them the same names as the Spark versions going forward. I wasn't very convinced the last time around that we needed a branch per version, though I now see that it is just better to be more careful and ensure we don't break anything for released versions.

I'd be fine with starting a v1.3 branch with this PR.

@pwendell -- any thoughts on this?

P.S. Given that we are changing conventions, it would be good to document somewhere how v1, v2, v3, v4 map to Spark versions.

@nchammas
Author

Taking a quick look, these are the mappings I see.

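| Spark version | AMI list branch | spark-ec2 folder branch |
|---------------|-----------------|-------------------------|
| 0.5.0 - 0.7.2 | N/A             | master                  |
| 0.8.0 - 0.8.1 | v2              | v2                      |
| 0.9.0 - 0.9.2 | v2              | v2                      |
| 1.0.0 - 1.0.2 | v2              | v3                      |
| 1.1.0 - 1.1.1 | v2              | v3                      |
| 1.2.0         | v4              | v4                      |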

The reason for the two branch columns is that until this commit, we specified the branch in two separate locations: once to get the AMI list, and once to get the script files to install Hadoop, Ganglia, etc.

Up to version 0.7.2, AMI IDs were downloaded from S3 and spark-ec2 files were downloaded from master.
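
For anyone who wants the mapping in machine-readable form, here is a minimal sketch of a lookup over the table above. The data comes straight from the table; the names `LEGACY_BRANCHES`, `parse_version`, and `legacy_branches` are hypothetical, and `None` stands in for the "N/A" AMI list entry.

```python
# Each row: (first version, last version, AMI list branch, spark-ec2 folder branch).
# Values are copied from the table above; None represents "N/A".
LEGACY_BRANCHES = [
    ((0, 5, 0), (0, 7, 2), None, "master"),
    ((0, 8, 0), (0, 8, 1), "v2", "v2"),
    ((0, 9, 0), (0, 9, 2), "v2", "v2"),
    ((1, 0, 0), (1, 0, 2), "v2", "v3"),
    ((1, 1, 0), (1, 1, 1), "v2", "v3"),
    ((1, 2, 0), (1, 2, 0), "v4", "v4"),
]


def parse_version(version):
    """Turn a string such as '1.0.2' into a comparable tuple (1, 0, 2)."""
    return tuple(int(part) for part in version.split("."))


def legacy_branches(spark_version):
    """Return (ami_list_branch, spark_ec2_branch) for a released Spark version."""
    version = parse_version(spark_version)
    for first, last, ami_branch, scripts_branch in LEGACY_BRANCHES:
        if first <= version <= last:
            return ami_branch, scripts_branch
    raise KeyError("no known mapping for Spark " + spark_version)


if __name__ == "__main__":
    print(legacy_branches("1.1.0"))
```

Versions outside the listed ranges raise `KeyError` rather than guessing a branch, since the table only covers releases up to 1.2.0.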

@shivaram

Thanks @nchammas, this is great! I'd suggest putting this in the Spark wiki, so we have this information around for later.

Also, while the two columns make sense, it would be good to note below the table that the AMI list did not change between v2 and v3 (hence the mapping is consistent). I don't think we will backport AMI changes to v3, so this should remain true going forward.

@shivaram

One last minor nit: the spark-ec2 repo was only used beginning in 0.7 [1]; before that we just had an AMI that was pre-baked with the scripts. So anything before 0.7 isn't really supported using this repo.

[1] This commit introduced the spark-ec2 repo: apache/spark@d012cfa

@nchammas
Author

> I'd suggest putting this in the Spark wiki, so we have this information around for later.

I don't have write access to the wiki, I don't think.

@shivaram

I just gave the wiki user id nchammas permission to add pages. Let me know if it doesn't work.

@nchammas
Author

@shivaram Done.

@shivaram

Wiki looks good. I'll create a new branch and we can move the PR to that branch. BTW, any suggestions on naming? I was going to use branch-1.3 instead of v1.3 to make a clear distinction. It'll also mirror the Spark branch naming scheme.
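
A small sketch of how a launcher script could derive the spark-ec2 branch from a Spark version under the proposed naming. The helper name `spark_ec2_branch` is hypothetical, and the assumption that every release from 1.3 onward uses `branch-<major>.<minor>` is taken from this discussion rather than from any existing code.

```python
def spark_ec2_branch(spark_version):
    """Map a Spark version such as '1.3.0' to a branch name such as 'branch-1.3'.

    Hypothetical helper; assumes the 'branch-<major>.<minor>' scheme proposed here.
    """
    major, minor = spark_version.split(".")[:2]
    return "branch-{}.{}".format(major, minor)


if __name__ == "__main__":
    print(spark_ec2_branch("1.3.0"))
```

Releases up to 1.2.0 still use the older v2/v3/v4 branches, so a real script would need to handle those separately.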

@nchammas
Author

@pwendell hasn't had a chance to chime in yet, but branch-1.3 sounds good to me.

@pwendell

Yeah that branch naming sounds good to me. It makes more sense.

@shivaram

@nchammas I just pushed a new branch to mesos/spark-ec2 named branch-1.3 (https://github.com/mesos/spark-ec2/tree/branch-1.3). Can you move this PR to that branch?

@nchammas
Author

Done: #86

@nchammas closed this Dec 25, 2014
@nchammas deleted the use-pssh branch December 25, 2014 20:11
@nchammas
Author

@shivaram Do you also need to make branch-1.3 the default branch in this repo?

@shivaram

Yeah, I don't have permissions to do that though. @pwendell @JoshRosen can one of you make branch-1.3 the default branch for mesos/spark-ec2?

@JoshRosen
Member

I don't have permission either; we need someone with repo admin access to do it (I think @pwendell can).

asfgit pushed a commit to apache/spark that referenced this pull request Dec 25, 2014
Going forward, we'll use matching branch names across the mesos/spark-ec2 and apache/spark repositories, per [the discussion here](mesos/spark-ec2#85 (comment)).

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #3804 from nchammas/patch-2 and squashes the following commits:

cd2c0d4 [Nicholas Chammas] [EC2] Update mesos/spark-ec2 branch to branch-1.3
@nchammas
Author

Pinging @pwendell about making branch-1.3 the default branch per the discussion above.

@pwendell

Done, thanks guys.
