RIKEN-RCCS/spark-k

Using Apache Spark on K

This package provides an installation document and utility scripts for running Apache Spark on the K computer. It also includes simple examples.

  • The installation document is in docs. Spark (1.6.2) is installed in "/opt/aics/spark/spark-1.6.2-bin-sparkk" on K (on the compute-nodes).

  • The utility scripts are in scripts. They are installed in "/opt/aics/spark/scripts" on K (on the compute-nodes).

  • The examples are in examples. They are provided as job scripts for the Fujitsu job manager.

Basics

The scripts are thin wrappers around Spark's own scripts; they help start and stop the master and worker processes. Spark runs in its standalone mode, but the master and workers are launched with MPI (Message Passing Interface) via mpiexec.
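The idea of launching a standalone cluster through MPI can be sketched as follows. This is only an illustration and not the code in scripts: it assumes that rank 0 becomes the master and every other rank becomes a worker, and the function name `command_for_rank`, the host name, and the default port 7077 are hypothetical choices made for this example. The class names and the `--host`/`--port` options are those of Spark's standalone master and worker.

```python
import os

# Install path taken from this README; adjust for other systems.
SPARK_HOME = "/opt/aics/spark/spark-1.6.2-bin-sparkk"


def command_for_rank(rank, master_host, spark_home=SPARK_HOME, port=7077):
    """Return the command line a process of the given MPI rank would run.

    Assumption for this sketch: rank 0 hosts the standalone master and
    all other ranks run workers that connect to it.
    """
    spark_class = os.path.join(spark_home, "bin", "spark-class")
    if rank == 0:
        # Standalone master listening on master_host:port.
        return [spark_class, "org.apache.spark.deploy.master.Master",
                "--host", master_host, "--port", str(port)]
    # Standalone worker registering with the master URL.
    return [spark_class, "org.apache.spark.deploy.worker.Worker",
            "spark://{}:{}".format(master_host, port)]


if __name__ == "__main__":
    # Show what four ranks would execute; "node0" is a placeholder host.
    for r in range(4):
        print(r, " ".join(command_for_rank(r, "node0")))
```

In a real job, each process started by mpiexec would determine its own rank (how the rank is exposed depends on the MPI implementation) and then execute the corresponding command.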

The scripts assume that Spark is installed in "/opt/aics/spark" and R in "/opt/aics/R" (on the compute-nodes).

Examples

First, try the sample job scripts (for the Fujitsu job manager) in examples.

Building Spark

For the build and installation procedures, see docs.

More Information

  • For background information on using K that is not specific to Spark, see Basics in docs.