Spark PBSPro Support

This project adds support for the PBS Professional HPC resource manager to Apache Spark.

Usage

To run Spark on a PBS cluster, pass --master pbs when launching the shell or submitting an application:

# Start the Spark shell (client mode only)
./bin/spark-shell --master pbs

# Submit a Spark application in client mode
./bin/spark-submit --master pbs --deploy-mode client --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/target/scala-2.12/jars/spark-examples_2.12-3.1.0-SNAPSHOT.jar 100

# Submit a Spark application in cluster mode
./bin/spark-submit --master pbs --deploy-mode cluster --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/target/scala-2.12/jars/spark-examples_2.12-3.1.0-SNAPSHOT.jar 100

Alternatively, add the line spark.master pbs to conf/spark-defaults.conf so that you do not have to pass --master pbs on every submit.
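
One way to add that line without creating duplicates on repeated runs is sketched below. The helper name set_spark_conf is hypothetical (it is not part of Spark or the connector), and the sketch assumes entries use the whitespace-separated "key value" form shown in this README.

```shell
# Hypothetical helper: set KEY to VALUE in a Spark properties file,
# replacing any earlier entry for the same key.
# Assumes entries are written as "key value" (whitespace-separated).
set_spark_conf() {
  local file="$1" key="$2" value="$3" tmp
  mkdir -p "$(dirname "$file")"
  touch "$file"
  tmp="$(mktemp)"
  # Keep every line whose first field is not the key (comments included).
  awk -v k="$key" '$1 != k' "$file" > "$tmp"
  # Append the new entry.
  printf '%s %s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the file keeps its original permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

From the Spark root directory this could be called as set_spark_conf conf/spark-defaults.conf spark.master pbs.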

To start the Spark UI for the PBS cluster:

bin/spark-class org.apache.spark.deploy.pbs.ui.PbsClusterUI

Installation

The connector expects PBS Pro to be installed at /opt/pbs.
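
A quick pre-flight check can confirm that location before building. This is a minimal sketch: check_pbs_install is a hypothetical helper, and it assumes the usual PBS Pro layout with the qsub and qstat client commands under bin/. Whether the connector needs exactly these commands is an assumption.

```shell
# Hypothetical helper: verify that a PBS Pro installation exists at the
# given root (default /opt/pbs) and exposes the basic client commands.
# Assumption: client commands live in <root>/bin.
check_pbs_install() {
  local root="${1:-/opt/pbs}" tool
  if [ ! -d "$root" ]; then
    echo "PBS Pro not found at $root" >&2
    return 1
  fi
  for tool in qsub qstat; do
    if [ ! -x "$root/bin/$tool" ]; then
      echo "missing executable $root/bin/$tool" >&2
      return 1
    fi
  done
  echo "PBS Pro installation looks usable at $root"
}
```

Calling check_pbs_install with no argument checks /opt/pbs; a non-zero return status indicates that something is missing.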

Clone the Spark repository and change into the spark directory:

git clone https://github.com/apache/spark.git
cd spark

From the Spark project root, run the following commands:

# Clone the connector into resource-managers/pbs
git clone https://github.com/PBSPro/spark-pbspro-connector resource-managers/pbs

# Apply the connector's patches to Spark (run from the Spark root directory)
git am resource-managers/pbs/*.patch

# Build Spark with the pbs profile, skipping tests
build/mvn -DskipTests -Ppbs package

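Add the executor home to your configuration. This is the directory where Spark is installed on the PBS MoM (execution) nodes; replace the quoted placeholder below with that path:

# In conf/spark-defaults.conf, add the line: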
spark.pbs.executor.home "SPARK INSTALLATION DIRECTORY PATH IN PBS MOMS"
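
To confirm the setting is in place before submitting a job, the value can be read back from the file. The sketch below is illustrative only: get_spark_conf is a hypothetical helper, it assumes whitespace-separated "key value" entries, and if a key appears more than once it reports the last entry.

```shell
# Hypothetical helper: print the value of KEY from a Spark properties file.
# Returns non-zero if the file or the key is missing.
# Assumes entries are written as "key value" (whitespace-separated).
get_spark_conf() {
  local file="$1" key="$2"
  [ -f "$file" ] || return 1
  awk -v k="$key" '
    $1 == k {
      line = $0
      # Strip leading whitespace, the key, and the separator after it.
      sub(/^[ \t]*[^ \t]+[ \t]*/, "", line)
      value = line
      found = 1
    }
    END {
      if (!found) exit 1
      print value
    }
  ' "$file"
}
```

For example, get_spark_conf conf/spark-defaults.conf spark.pbs.executor.home prints the configured path, or returns a non-zero status if the line has not been added yet.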