    __  __            __        _____                  __
   / / / /_  ______ _/ /__     / ___/____  ____ ______/ /__
  / /_/ / / / / __ `/ //_/     \__ \/ __ \/ __ `/ ___/ //_/
 / __  / /_/ / /_/ / ,<       ___/ / /_/ / /_/ / /  / ,<
/_/ /_/\__, /\__,_/_/|_|     /____/ .___/\__,_/_/  /_/|_|
      /____/                     /_/
Sample scripts to run Spark jobs on Hyak
- Add the following lines to your .bashrc file:
export SPARK_HOME=/sw/contrib/spark
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
- Get a build node:
srun -p build -c 1 --pty /bin/bash
- Within the build node, load anaconda:
module load anaconda2_4.3.1
- Install the required package:
pip install py4j --user
- Create a new conda environment, e.g.:
conda create -n deeplearning keras tensorflow
- Activate the new environment and install the required packages:
source activate deeplearning
pip install py4j
pip install nltk
python -m nltk.downloader punkt
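- (Optional) Quick sanity check that py4j, nltk, and the punkt data are importable from the active environment; the sample sentence is just illustrative:
python -c "import py4j, nltk; from nltk.tokenize import word_tokenize; print(word_tokenize('Environment looks good.'))"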
- Set the $JAVA_HOME variable in your ~/.bashrc file to point to /sw/contrib/java/jdk1.8.0_111:
export JAVA_HOME=/sw/contrib/java/jdk1.8.0_111
- Download Spark into your home directory and unpack:
wget http://d3kbcqa49mib13.cloudfront.net/spark-2.0.1-bin-hadoop2.7.tgz
tar -xzvf spark-2.0.1-bin-hadoop2.7.tgz
- Get compute nodes (an interactive job with 2 nodes, 16 cores each, for 2 hours):
qsub -I -l nodes=2:ppn=16,walltime=2:00:00
- Find node ids:
cat $PBS_NODEFILE | sort | uniq
- Start the Spark master on the current node from within your Spark directory:
cd spark-2.0.1-bin-hadoop2.7
./sbin/start-master.sh
- Find the master node URL, replacing username with your NetID and n0*** with your master node number (the URL has the form spark://n0***:7077):
cat logs/spark-*username*-org.apache.spark.deploy.master.Master-1-n0***.out | grep -o "spark://.*"
- For each of the other assigned nodes, ssh to that node and start a worker process (replace MASTER_NODE_URL with the URL found in the previous step):
ssh n0***
cd spark-2.0.1-bin-hadoop2.7
./sbin/start-slave.sh MASTER_NODE_URL
- Return to the master node and start an interactive session attached to the cluster:
ssh n0***
cd spark-2.0.1-bin-hadoop2.7
./bin/spark-shell --master MASTER_NODE_URL
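- For an interactive Python session against the same cluster, PySpark can be started from the same directory (this uses the PYTHONPATH and py4j setup from above):
./bin/pyspark --master MASTER_NODE_URL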
Note: You can confirm that all nodes have registered as workers by checking the master's web UI on port 8080 (from the master node):
curl localhost:8080
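Finally, a minimal sketch of a job script that fits this setup: it counts word frequencies with PySpark and the nltk punkt tokenizer installed above. The names sample_wordcount.py and input.txt and the MASTER_NODE_URL placeholder are illustrative only, and the sketch assumes your home directory (with the conda environment and nltk data) is visible from all worker nodes.
# sample_wordcount.py -- illustrative sketch; replace MASTER_NODE_URL and input.txt
from pyspark import SparkConf, SparkContext
from nltk.tokenize import word_tokenize

conf = SparkConf().setAppName("sample-wordcount").setMaster("MASTER_NODE_URL")
sc = SparkContext(conf=conf)

counts = (sc.textFile("input.txt")        # any plain-text input file
            .flatMap(word_tokenize)       # split lines into tokens with nltk punkt
            .map(lambda w: (w.lower(), 1))
            .reduceByKey(lambda a, b: a + b))

# print the ten most frequent tokens
for word, n in counts.takeOrdered(10, key=lambda kv: -kv[1]):
    print("%s\t%d" % (word, n))

sc.stop()
Run it from the master node with the environment active, e.g. ./bin/spark-submit sample_wordcount.py; spark-submit picks up the master URL from the script's SparkConf.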