Added CORS Support and EC2 Support #290

Merged
merged 8 commits into from Nov 3, 2015

Fixed first set of issues.

First, I moved the Spark version into an environment variable. Next, I made
the .gitignore file more general. Finally, I fixed some extra spaces I added
in dependencies.
David-Durst committed Oct 20, 2015
commit 24690e12668f77c39cc692b86d92656442b762ff
@@ -19,8 +19,8 @@ metastore_db/
 bin/ec2_example.sh
 # ignore spark binaries
-spark-1.5.0-bin-hadoop2.6.tgz
-spark-1.5.0-bin-hadoop2.6/
+spark-*-bin-hadoop*.tgz
+spark-*-bin-hadoop*/
 # don't ignore the ec2 config and sh files
 !job-server/config/ec2.sh
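Generalizing the patterns means any downloaded Spark distribution stays untracked, not just the 1.5.0/Hadoop 2.6 build. As a quick sanity check (the 1.6.2/2.4 filename is purely a hypothetical example, not something this PR downloads), `git check-ignore` can show which rule matches:

```sh
# Hypothetical verification that the generalized globs cover other Spark builds.
touch spark-1.6.2-bin-hadoop2.4.tgz
git check-ignore -v spark-1.6.2-bin-hadoop2.4.tgz
# should report the spark-*-bin-hadoop*.tgz rule from .gitignore
rm spark-1.6.2-bin-hadoop2.4.tgz
```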
@@ -5,13 +5,14 @@ bin=`cd "$bin"; pwd`
 . "$bin"/../config/user-ec2-settings.sh
 #get spark binaries if they haven't been downloaded and extracted yet
-if [ ! -d "$bin"/../spark-1.5.0-bin-hadoop2.6 ]; then
-wget -P "$bin"/.. http://apache.arvixe.com/spark/spark-1.5.0/spark-1.5.0-bin-hadoop2.6.tgz
-tar -xvzf "$bin"/../spark-1.5.0-bin-hadoop2.6.tgz -C "$bin"/..
+SPARK_DIR=spark-$SPARK_VERSION-bin-hadoop$HADOOP_VERSION
+if [ ! -d "$bin"/../$SPARK_DIR ]; then
+wget -P "$bin"/.. http://apache.arvixe.com/spark/spark-$SPARK_VERSION/$SPARK_DIR.tgz
+tar -xvzf "$bin"/../$SPARK_DIR.tgz -C "$bin"/..
 fi
 #run spark-ec2 to start ec2 cluster
-EC2DEPLOY="$bin"/../spark-1.5.0-bin-hadoop2.6/ec2/spark-ec2
+EC2DEPLOY="$bin"/../$SPARK_DIR/ec2/spark-ec2
 "$EC2DEPLOY" --copy-aws-credentials --key-pair=$KEY_PAIR --hadoop-major-version=yarn --identity-file=$SSH_KEY --region=us-east-1 --zone=us-east-1a --instance-type=$INSTANCE_TYPE --slaves $NUM_SLAVES launch $CLUSTER_NAME
 #There is only 1 deploy host. However, the variable is plural as that is how Spark Job Server named it.
 #To minimize changes, I left the variable name alone.
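With the version factored out of the script, the Spark directory and download URL are now derived from SPARK_VERSION and HADOOP_VERSION, which come from the sourced settings file. A minimal sketch of that expansion, using the default values from the settings template shown further down:

```sh
# Sketch of the substitution the deploy script performs; values are the
# defaults from user-ec2-settings.sh.template (SPARK_VERSION=1.5.0, HADOOP_VERSION=2.6).
SPARK_VERSION=1.5.0
HADOOP_VERSION=2.6
SPARK_DIR=spark-$SPARK_VERSION-bin-hadoop$HADOOP_VERSION
echo "$SPARK_DIR"
# spark-1.5.0-bin-hadoop2.6
echo "http://apache.arvixe.com/spark/spark-$SPARK_VERSION/$SPARK_DIR.tgz"
# http://apache.arvixe.com/spark/spark-1.5.0/spark-1.5.0-bin-hadoop2.6.tgz
```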
@@ -9,7 +9,8 @@ if [ -n "$SSH_KEY" ] ; then
 ssh_key_to_use="-i $SSH_KEY"
 fi
-wget -O- --post-file "$bin"/../job-server-extras/target/scala-2.10/job-server-extras_2.10-0.5.3-SNAPSHOT.jar "$DEPLOY_HOSTS/jars/km"
+VERSION=$(sed -E 's/version in ThisBuild := "(.*)"/\1/' version.sbt)
+wget -O- --post-file "$bin"/../job-server-extras/target/scala-2.10/job-server-extras_$VERSION.jar "$DEPLOY_HOSTS/jars/km"
 scp -rp -o StrictHostKeyChecking=no $ssh_key_to_use "$bin"/../job-server-extras/src/main/KMeansExample/* ${APP_USER}@"${DEPLOY_HOSTS%:*}:/var/www/html/"
 echo "The example is running at ${DEPLOY_HOSTS%:*}:5080"
@@ -6,11 +6,13 @@
 * export AWS_ACCESS_KEY_ID=accesskeyId
 * export AWS_SECRET_ACCESS_KEY=secretAccessKey
 3. Copy job-server/config/user-ec2-settings.sh.template to job-server/config/user-ec2-settings.sh and configure it. In particular, set KEY_PAIR to the name of your EC2 key pair and SSH_KEY to the location of the pair's private key.
-* I recommend using an ssh key that does not require entering a password on every use. Otherwise, you will need to enter the password many times
+* I recommend using an ssh key that does not require entering a password on every use. Otherwise, you will need to enter the password many times.
 4. Run bin/ec2_deploy.sh to start the EC2 cluster. Go to the url printed at the end of the script to view the Spark Job Server frontend. Change the port from 8090 to 8080 to view the Spark Standalone Cluster frontend.
 5. Run bin/ec2_example.sh to setup the example. Go to the url printed at the end of the script to view the example.
 4. Run bin/ec2_destroy.sh to shutdown the EC2 cluster.
+Note: To change the version of Spark on the cluster, set the SPARK_VERSION variable in both config/ec2.sh and config/user-ec2-settings.sh.template.
 ## Using The Example
 1. Start a Spark Context by pressing the "Start Context" button.
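Condensed, the documented workflow is a one-time settings copy followed by three scripts. A sketch of the session those steps describe (commands and ports are the ones named above; nothing beyond them is assumed):

```sh
# Steps 2-3: credentials and cluster settings (one-time setup).
export AWS_ACCESS_KEY_ID=accesskeyId
export AWS_SECRET_ACCESS_KEY=secretAccessKey
cp job-server/config/user-ec2-settings.sh.template job-server/config/user-ec2-settings.sh
# edit KEY_PAIR and SSH_KEY (and optionally SPARK_VERSION) in the copied file

bin/ec2_deploy.sh    # start the cluster; Job Server UI on port 8090, Spark UI on 8080
bin/ec2_example.sh   # set up the KMeans example; served on port 5080
bin/ec2_destroy.sh   # shut the cluster down when finished
```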
@@ -5,3 +5,5 @@ KEY_PAIR=EC2_KEY_PAIR
 CLUSTER_NAME=large19Slaves
 INSTANCE_TYPE=r3.large
 NUM_SLAVES=19
+SPARK_VERSION=1.5.0
+HADOOP_VERSION=2.6
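Since both new variables live in the user-editable settings file, pointing the deployment at a different Spark release should only require changing these two values, provided a matching prebuilt tarball exists on the mirror. A hedged example; 1.5.1 is an assumed value, not one this PR uses:

```sh
# Assumed edit to job-server/config/user-ec2-settings.sh (not part of this PR).
SPARK_VERSION=1.5.1
HADOOP_VERSION=2.6
# ec2_deploy.sh would then fetch spark-1.5.1-bin-hadoop2.6.tgz instead of the 1.5.0 build.
```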
@@ -32,13 +32,12 @@ object Dependencies {
 lazy val sparkDeps = Seq(
 "org.apache.spark" %% "spark-core" % sparkVersion % "provided" excludeAll(excludeNettyIo, excludeQQ),
 // Force netty version. This avoids some Spark netty dependency problem.
-"io.netty" % "netty-all" % "4.0.23.Final",
-"org.scala-lang" % "scala-library" % "2.10.3"
+"io.netty" % "netty-all" % "4.0.23.Final"
 )
 lazy val scalaLib = if (scala.util.Properties.versionString.split(" ")(1).startsWith("2.10"))
-Seq("org.scala-lang" % "scala-library" % "2.10.3")
-else Seq()
+Seq("org.scala-lang" % "scala-library" % "2.10.3")
+else Seq()
 lazy val sparkExtraDeps = Seq(
 "org.apache.spark" %% "spark-mllib" % sparkVersion % "provided" excludeAll(excludeNettyIo, excludeQQ),
@@ -47,7 +46,6 @@ object Dependencies {
 "org.apache.spark" %% "spark-hive" % sparkVersion % "provided" excludeAll(excludeNettyIo, excludeQQ, excludeScalaTest)
 ) ++ scalaLib
 lazy val slickDeps = Seq(
 "com.typesafe.slick" %% "slick" % "2.1.0",
 "com.h2database" % "h2" % "1.3.170",
@@ -66,7 +64,7 @@ object Dependencies {
 )
 lazy val securityDeps = Seq(
-"org.apache.shiro" % "shiro-core" % "1.2.4"
+"org.apache.shiro" % "shiro-core" % "1.2.4"
 )
 lazy val serverDeps = apiDeps ++ yodaDeps