Added CORS Support and EC2 Support #290

Merged
velvia merged 8 commits into spark-jobserver:master from David-Durst/rebasePR on Nov 3, 2015

David-Durst commented Oct 18, 2015

This commit mainly adds support for two features. First, it enables CORS support
so that requests can be made across domain names. Second, it adds a script
for deploying the job-server to EC2. The commit also includes an example to
run on the EC2 cluster. Finally, there are improvements in the bin/ scripts
to better support paths with spaces in them. Previously, such paths
caused issues with the bin/server_deploy.sh script.
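
The paths-with-spaces problem mentioned above usually comes down to unquoted expansions. As a minimal sketch (not taken from the PR's bin/ scripts; `script_dir` and the demo directory layout are hypothetical names invented for illustration), quoting every expansion keeps a path such as "job server/bin" as a single word:

```shell
# Hypothetical helper: print the directory that contains the given script.
# Every expansion is quoted so a path with spaces is never split into words.
script_dir() {
  (cd "$(dirname "$1")" && pwd)
}

# Demo: create a throwaway directory whose name contains a space.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/job server/bin"
touch "$demo_root/job server/bin/deploy.sh"

resolved="$(script_dir "$demo_root/job server/bin/deploy.sh")"
echo "$resolved"

# Clean up the demo directory.
rm -rf "$demo_root"
```

Without the quotes around `$(dirname "$1")`, the shell would split the path at the space and `cd` would receive more than one argument.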

velvia commented on bin/ec2_deploy.sh in dd25ea1 Oct 19, 2015

Might be good to make the exact version (such as hadoop2.6) an env var so users can override it.
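
One way to follow this suggestion is shell default expansion. The sketch below is only an illustration: the variable names `SPARK_VERSION` and `HADOOP_VERSION`, the function name, and the default values are assumptions, not necessarily what bin/ec2_deploy.sh uses.

```shell
# Hypothetical helper: build the Spark package name, letting the caller
# override either part through the environment. Defaults are illustrative.
spark_package_name() {
  echo "spark-${SPARK_VERSION:-1.5.1}-bin-${HADOOP_VERSION:-hadoop2.6}"
}

# With nothing overridden, the illustrative defaults are used.
spark_package_name
```

A user could then run something like `HADOOP_VERSION=hadoop2.4 bin/ec2_deploy.sh` to pick a different build without editing the script.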

velvia commented on .gitignore in dd25ea1 Oct 19, 2015

This seems local to your installation and not necessary...

David-Durst replied Oct 20, 2015

It is necessary. My script downloads the file as part of launching on EC2. I have made the pattern more generic, though, so that it matches any version.

velvia commented on bin/ec2_example.sh.template in dd25ea1 Oct 19, 2015

Should get the version string from the root version.sbt file rather than hard-coding it.
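
A rough sketch of what reading the version from the file could look like. It assumes version.sbt holds a single line of the form `version in ThisBuild := "..."`; the function name `read_sbt_version` and the sample version string are made up for the demo.

```shell
# Hypothetical helper: print the first quoted value that follows ":=" in an
# sbt version file.
read_sbt_version() {
  sed -n 's/.*:= *"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Demo against a throwaway file; the version below is an invented sample.
demo_file="$(mktemp)"
echo 'version in ThisBuild := "0.0.1-SAMPLE"' > "$demo_file"

JOBSERVER_VERSION="$(read_sbt_version "$demo_file")"
echo "$JOBSERVER_VERSION"

rm -f "$demo_file"
```

In a template script, the value could then replace the hard-coded version wherever the job-server jar name is built.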

velvia commented Oct 19, 2015

@David-Durst thanks! I'm really looking forward to trying out the K-Means example. Only took a brief look through it, and will be offline for about 2 days.

Fixed first set of issues.
First, I moved the Spark version into an environment variable. Next, I made
the .gitignore file more general. Finally, I fixed some extra spaces I added
in dependencies.

David-Durst commented Oct 20, 2015

I made the changes. Let me know what you think.

David-Durst commented Oct 20, 2015

Unfortunately, I can't fix the spark-ec2 issues as it is part of the Spark distribution.

velvia commented Oct 20, 2015

@David-Durst are you saying that the spark EC2 scripts download the distro into the current dir? Sorry not sure exactly what you're referring to.

Rest of it looks fine to me. I'd like to try the K-Means example tonight.

David-Durst commented Oct 21, 2015

There are two parts.

  • First, regarding the .gitignore issue: spark-jobserver's bin/ec2_deploy.sh downloads the entire Spark distro. I did this because the documentation for Spark's spark-ec2 script assumes that the script is run from the ec2 folder inside a complete Spark release, so separating it from the rest of Spark may lead to undefined behavior. bin/ec2_deploy.sh then runs spark-ec2 to deploy to the EC2 cluster. I put the Spark .tgz and the unpacked Spark directory in spark-jobserver's .gitignore so that they aren't added to the spark-jobserver repo after the user runs bin/ec2_deploy.sh.
  • Second, I have found the spark-ec2 script to be unreliable. It sometimes hangs with the error message I describe in 0159d8b. Unfortunately, spark-ec2 is the best option at the moment for spinning up and shutting down a cluster from the command line. If spark-ec2 hangs and causes my ec2_deploy script to hang, I recommend killing the cluster and restarting the ec2_deploy process. See 0159d8b for complete instructions. It is easy to do and currently the best option.
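
The kill-and-restart advice in the second point is described as a manual step. Purely as an illustration of how it might be automated, the sketch below wraps a launch command in a time limit and runs a cleanup step before retrying. It assumes the `timeout` utility from GNU coreutils is installed; `launch_with_retry` and its arguments are hypothetical and are not part of the PR or of spark-ec2.

```shell
# Hypothetical helper: run a launch command under a time limit; if it fails
# or times out, call a cleanup command and try again.
# usage: launch_with_retry <attempts> <seconds> <cleanup-cmd> <launch-cmd...>
# Note: timeout can only run external commands, not shell functions.
launch_with_retry() {
  attempts="$1"
  limit="$2"
  cleanup="$3"
  shift 3
  i=1
  while [ "$i" -le "$attempts" ]; do
    if timeout "$limit" "$@"; then
      return 0
    fi
    echo "attempt $i failed or timed out; cleaning up" >&2
    "$cleanup"
    i=$((i + 1))
  done
  return 1
}
```

In practice the cleanup command would be whatever tears the cluster down (the instructions referenced in 0159d8b), and the launch command would be the deploy script itself.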

David-Durst commented Oct 22, 2015

It turns out that I was wrong on the first point. I have fixed my code so that you no longer need to download the entire Spark repo to your local machine.
Ready to merge it into master?

velvia commented Oct 25, 2015

Sweet thanks. Sorry very busy as I prepare to fly to Amsterdam to talk
about Spark Job Server! Will look at your PR once I settle down.


velvia commented Nov 3, 2015

@David-Durst probably won't get time to try out the K-Means example, but going to merge as this looks good. Many thanks!

velvia added a commit that referenced this pull request Nov 3, 2015

Merge pull request #290 from David-Durst/rebasePR
Added CORS Support and EC2 Support

velvia merged commit 00e1204 into spark-jobserver:master on Nov 3, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed