Using MsPASS with Singularity (on HPC)
On machines that have Singularity set up, use the following command to build the image as mspass.simg in the current directory:
singularity build mspass.simg docker://wangyinz/mspass
Before starting the MongoDB server, make sure a dedicated directory exists for the database files. Here we assume it is ./data. The command to start the MongoDB server for localhost only is:
singularity exec mspass.simg mongod --dbpath ./data --logpath ./log --fork
- The --dbpath and --logpath options of mongod specify where to keep the database files and logs.
- The --fork option lets the MongoDB server process run in the background.
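Since the server runs in the background, it can be useful (for example in a job script) to confirm that it is accepting connections before launching a client. The following is a minimal Python sketch and not part of MsPASS: the helper name wait_for_port is hypothetical, and it assumes mongod is listening on its default port, 27017.

```python
import socket
import time


def wait_for_port(host="localhost", port=27017, timeout=10.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: 27017 is MongoDB's default port; adjust it if
    mongod was started with a different --port.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling wait_for_port() with no arguments probes localhost. Passing a host name such as node-1 probes a server on another node, provided that server was started so that it accepts remote connections.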
Then, launch the client locally with:
singularity exec mspass.simg mongo
To stop the MongoDB server, type the following commands in the mongo shell:
use admin
db.shutdownServer()
First, request an interactive session with more than one node. Below we assume the hostnames (output of the hostname command) of the two requested nodes are node-1 and node-2. Please change the names according to your system setup.
Assuming we want the MongoDB server to run on node-1, start the server as follows so that a remote client can connect:
singularity exec mspass.simg mongod --dbpath ./data --logpath ./log --fork --bind_ip_all
- The --bind_ip_all option binds the MongoDB server to all IPv4 addresses, so it can be accessed from another node.
To launch the client from node-2, simply ssh node-2 to get to that node and then run:
singularity exec mspass.simg mongo --host node-1
It will connect to the MongoDB server running on node-1.
To stop the MongoDB server, type the following commands in the mongo shell on node-1:
use admin
db.shutdownServer()
Assume the two nodes requested in an interactive session are node-1 and node-2. To launch the Spark master and the MongoDB server on node-1, use the following command on node-1:
singularity run mspass.simg &
This requires a data directory to already exist in the current directory. It also creates the log files of the Spark master and MongoDB in the current directory. The & lets the servers run in the background.
To launch a Spark worker on node-2, first ssh node-2, and then run:
singularity exec mspass.simg bash -c 'export SPARK_MASTER=node-1; \
export SPARK_LOG_DIR=path_to_current_dir; \
export SPARK_WORKER_DIR=path_to_current_dir; \
$SPARK_HOME/sbin/start-slave.sh spark://$SPARK_MASTER:$SPARK_MASTER_PORT'
In this version, you need to specify three environment variables: SPARK_MASTER, SPARK_LOG_DIR, and SPARK_WORKER_DIR.
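When workers are started on several nodes from a script, assembling the command programmatically avoids quoting mistakes. The sketch below is one possible way to do that in Python. The function name worker_command is hypothetical, and resolving the working directory to an absolute path is an assumption made here, not a documented requirement.

```python
import os
import shlex


def worker_command(master_host, work_dir, image="mspass.simg"):
    """Build an argv list that starts a Spark worker inside the container.

    Hypothetical helper mirroring the command shown above; SPARK_HOME and
    SPARK_MASTER_PORT are left for the container's own shell to expand.
    """
    # Assumption: use an absolute path so logs land in a predictable place.
    work_dir = os.path.abspath(work_dir)
    inner = "; ".join([
        f"export SPARK_MASTER={shlex.quote(master_host)}",
        f"export SPARK_LOG_DIR={shlex.quote(work_dir)}",
        f"export SPARK_WORKER_DIR={shlex.quote(work_dir)}",
        "$SPARK_HOME/sbin/start-slave.sh spark://$SPARK_MASTER:$SPARK_MASTER_PORT",
    ])
    return ["singularity", "exec", image, "bash", "-c", inner]
```

The returned list can be passed directly to subprocess.run on the worker node, for example subprocess.run(worker_command("node-1", "."), check=True).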
To test the setup with the Pi calculation example, use the following command on either node-1 or node-2:
singularity exec mspass.simg /usr/local/spark/bin/run-example --master spark://node-1:7077 SparkPi 10
Each run creates a directory named app-X-X, which contains files such as stderr.
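For readers unfamiliar with the SparkPi example: it estimates pi by Monte Carlo sampling, and the trailing 10 is the number of slices (partitions) the sampling is divided into. The following plain-Python sketch illustrates the same idea on a single process, without Spark. The function name estimate_pi and the default of 100,000 samples per slice are assumptions chosen for illustration.

```python
import random


def estimate_pi(slices=10, samples_per_slice=100_000, seed=0):
    """Monte Carlo estimate of pi from random points in the square [-1, 1]^2.

    Illustrative stand-in for what SparkPi distributes across workers;
    here all slices are simply processed in one loop.
    """
    rng = random.Random(seed)
    n = slices * samples_per_slice
    inside = 0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        # Count points that fall inside the unit circle.
        if x * x + y * y <= 1.0:
            inside += 1
    # Circle area / square area = pi / 4.
    return 4.0 * inside / n
```

On the cluster the same counting is split across the Spark workers, so a result close to 3.14 in the driver output indicates that the master and workers are communicating.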
The MongoDB server can be accessed in the same way as described above.
To launch the Python shell with pyspark, use:
singularity exec mspass.simg pyspark \
--conf "spark.mongodb.input.uri=mongodb://node-1/test.myCollection?readPreference=primaryPreferred" \
--conf "spark.mongodb.output.uri=mongodb://node-1/test.myCollection" \
--conf "spark.master=spark://node-1:7077" \
--packages org.mongodb.spark:mongo-spark-connector_2.11:2.4.1
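The connection strings above follow the pattern mongodb://host/database.collection, so they need to change whenever the node name, database, or collection changes. A small Python sketch of how the --conf arguments could be generated is given below. The function names mongo_uri and pyspark_conf_args are hypothetical, and the defaults simply repeat the values used in the command above.

```python
def mongo_uri(host, database, collection, read_preference=None):
    """Build a MongoDB URI of the form mongodb://host/database.collection."""
    uri = f"mongodb://{host}/{database}.{collection}"
    if read_preference:
        uri += f"?readPreference={read_preference}"
    return uri


def pyspark_conf_args(mongo_host, spark_master_host,
                      database="test", collection="myCollection",
                      spark_port=7077):
    """Return the --conf arguments for pyspark as a flat argv-style list.

    Hypothetical helper; 7077 matches the master port used in this guide.
    """
    read_uri = mongo_uri(mongo_host, database, collection, "primaryPreferred")
    write_uri = mongo_uri(mongo_host, database, collection)
    return [
        "--conf", f"spark.mongodb.input.uri={read_uri}",
        "--conf", f"spark.mongodb.output.uri={write_uri}",
        "--conf", f"spark.master=spark://{spark_master_host}:{spark_port}",
    ]
```

Once the shell is running, the 2.4.x connector should allow the configured collection to be loaded as a DataFrame with spark.read.format("mongo").load(). Treat this as a starting point and check it against the connector documentation for the version in use.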