Examples/spark #1267
# Spark-pi example
The aim of this example is to show how to run a Spark application in cluster mode on MANTL.

## Prerequisites
* Install Spark through the mantl-api:
```bash
curl -k -i -L -X POST -d "{\"name\": \"spark\"}" https://admin:password@control-node/api/1/install
```
>**Note**: in the previous command, don't forget to substitute *password* and *control-node* with the actual password
and the actual control node IP (or domain name) of your MANTL cluster.

* Install the GlusterFS addon: http://docs.mantl.io/en/latest/components/glusterfs.html

* SSH into one of your nodes, and create a configuration file `/mnt/container-volumes/spark-conf/spark-defaults.conf` that
looks something like this:
```
spark.mesos.principal=mantl-api
spark.mesos.secret=your_mantl_cluster_mesos_secret
spark.mesos.executor.docker.volumes=/mnt/container-volumes/spark-conf:/opt/spark/dist/conf:ro
spark.mesos.executor.docker.image=mesosphere/spark:1.6.0
```
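One way to create that file is with a heredoc. The sketch below writes it to the current directory for illustration; on the node itself you would write to `/mnt/container-volumes/spark-conf/spark-defaults.conf` instead, and substitute your real secret.

```shell
# Illustration only: writes spark-defaults.conf locally. On a MANTL node the
# target path would be /mnt/container-volumes/spark-conf/spark-defaults.conf.
cat > spark-defaults.conf <<'EOF'
spark.mesos.principal=mantl-api
spark.mesos.secret=your_mantl_cluster_mesos_secret
spark.mesos.executor.docker.volumes=/mnt/container-volumes/spark-conf:/opt/spark/dist/conf:ro
spark.mesos.executor.docker.image=mesosphere/spark:1.6.0
EOF
# Sanity check: all four spark.mesos.* properties should be present.
grep -c '^spark\.mesos' spark-defaults.conf
```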
You can figure out the mesos secret from the *security.yml* file that you created using the *security-setup* script:
```bash
cat security.yml | grep mantl_api_secret
```
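The grep above prints the whole `key: value` line. A small sketch of pulling out just the value and turning it into the `spark.mesos.secret` line for the configuration file (the *security.yml* content below is made up for illustration):

```shell
# Illustrative stand-in for the security.yml produced by security-setup:
cat > security.yml <<'EOF'
mantl_api_secret: s3cr3tvalue
EOF

# Take the second whitespace-separated field, i.e. the secret itself:
SECRET=$(grep mantl_api_secret security.yml | awk '{print $2}')
echo "spark.mesos.secret=$SECRET"
```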

## Schedule the job using Chronos
To schedule Spark-pi using Chronos, please run:
```bash
./schedule.sh
```
>**Note**: in the *spark-pi.json* file, the spark-pi job is scheduled for New Year's Eve of 2030, so you might
want to force its run from the Chronos API. Alternatively, you can change this date in *spark-pi.json* before running `./schedule.sh`.
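Forcing a run goes through Chronos's `PUT /scheduler/job/<job name>` endpoint. A sketch, with the actual request left commented out and *password*/*control-node* as placeholders, just as elsewhere in this README:

```shell
# Build the "run now" URL for the spark-pi job (placeholder credentials):
CHRONOS_URL="https://admin:password@control-node/chronos/scheduler/job/spark-pi"
echo "PUT $CHRONOS_URL"
# Uncomment to actually trigger the run against your cluster:
# curl -k -i -L -X PUT "$CHRONOS_URL"
```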

---
**schedule.sh**:
```bash
#!/bin/bash

echo "Please insert your MANTL control node IP address (or domain name)"
read -r MANTL_CONT
echo "Please insert your MANTL admin password:"
read -sr MANTL_PASS
CHRONOS="admin:$MANTL_PASS@$MANTL_CONT/chronos"

curl -k -i -L -X POST -H "Content-type: application/json" "https://$CHRONOS/scheduler/iso8601" -d@"spark-pi.json"
echo # prints a newline
```

---
**spark-pi.json**:
```json
{
  "schedule" : "R0/2030-01-01T12:00:00Z/PT1H",
  "cpus": "0.5",
  "mem": "512",
  "epsilon" : "PT30M",
  "name" : "spark-pi",
  "container": {
    "type": "DOCKER",
    "image": "mcapuccini/spark:1.6.0",
    "volumes": [
      {
        "hostPath": "/mnt/container-volumes/spark-conf",
        "containerPath": "/opt/spark/dist/conf",
        "mode": "RO"
      }
    ]
  },
  "command" : "MASTER_PORT=$(dig +short spark.service.consul SRV | awk '{print $3}' | sort | head -1) && bin/spark-submit --class org.apache.spark.examples.SparkPi --master mesos://spark.service.consul:$MASTER_PORT --deploy-mode cluster http://central.maven.org/maven2/org/apache/spark/spark-examples_2.10/1.1.1/spark-examples_2.10-1.1.1.jar 1000",
  "owner" : "user@example.com"
}
```
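The `command` field above discovers the Spark dispatcher's port with a Consul SRV lookup before submitting the job. A sketch of what that pipeline does, run against made-up sample output (a `dig +short ... SRV` answer line has the form `priority weight port target`):

```shell
# Fake answer lines standing in for `dig +short spark.service.consul SRV`;
# the ports and hostnames here are invented for illustration:
SRV_ANSWER='0 0 31002 node2.node.consul.
0 0 31001 node1.node.consul.'

# Third field is the port; pick the first after sorting, as the job does:
MASTER_PORT=$(echo "$SRV_ANSWER" | awk '{print $3}' | sort | head -1)
echo "mesos://spark.service.consul:$MASTER_PORT"
```

Note that plain `sort` compares lexicographically, which works while all ports have the same number of digits; `sort -n` would be the numerically safe choice.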