3. Spark tools

Simon Renauld edited this page Nov 22, 2021 · 7 revisions

3.1. Submitting Production Applications

The spark-submit script in Spark’s bin directory launches applications on a cluster: it sends your application code to the cluster and starts it there. Once submitted, the application runs until it completes its task or fails with an error.

https://spark.apache.org/docs/latest/submitting-applications.html

./bin/spark-submit \
--master local \
./examples/src/main/python/pi.py 10

Commonly used options:

  • --class: The entry point for your application (e.g. org.apache.spark.examples.SparkPi). This applies to Java and Scala applications; for a Python application you pass the .py file itself, as in the command above.
  • --master: The master URL for the cluster (e.g. spark://23.195.26.187:7077)
  • --deploy-mode: Whether to deploy your driver on the worker nodes (cluster) or locally as an external client (client). The default is client.
  • --conf: Arbitrary Spark configuration property in key=value format. For values that contain spaces, wrap the whole “key=value” pair in quotes.
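
As a rough illustration of how these options combine, the sketch below assembles a spark-submit command line in Python. The helper build_spark_submit and its parameter names are hypothetical and not part of Spark; the script path and the ./bin/spark-submit location are taken from the command above, and the spark.executor.extraJavaOptions value is only an example of a setting that contains a space.

```python
import shlex


def build_spark_submit(app, app_args=(), master="local", deploy_mode=None,
                       main_class=None, conf=None,
                       spark_submit="./bin/spark-submit"):
    """Assemble a spark-submit argument list.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    cmd = [spark_submit, "--master", master]
    if deploy_mode is not None:
        # "client" or "cluster"; spark-submit defaults to client when omitted
        cmd += ["--deploy-mode", deploy_mode]
    if main_class is not None:
        # Only meaningful for Java/Scala applications
        cmd += ["--class", main_class]
    for key, value in (conf or {}).items():
        # Each property is passed as its own --conf key=value argument
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    cmd += [str(arg) for arg in app_args]
    return cmd


cmd = build_spark_submit(
    "./examples/src/main/python/pi.py",
    app_args=[10],
    master="local",
    conf={
        # Example value containing a space, so it needs quoting in a shell
        "spark.executor.extraJavaOptions":
            "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps",
    },
)

# shlex.join quotes the --conf pair because its value contains a space
print(shlex.join(cmd))
```

Keeping the command as a list means each key=value pair stays a single argument; quoting only matters when the command is rendered as one shell string, which is what shlex.join does here. The list could also be handed to subprocess.run(cmd, check=True) from the Spark installation directory, assuming Spark is installed there.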

3.2. Submitting Production Applications
