- A resource set is a group of resources (GPU, CPU, RAM) within a node.
- A resource set can’t span sockets or nodes.
- All resource sets within a `jsrun` must be the same.
- On CPU, the granularity is the entire core, not the hyperthread level.
- `-n`: number of resource sets
- `-c`: number of physical cores per resource set
- `-g`: number of GPUs per resource set
- `-a`: number of tasks per resource set
- `-r`: number of resource sets per node
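The flags combine arithmetically: total tasks are resource sets times tasks per resource set, and the number of nodes used is resource sets divided by resource sets per node, rounded up. The sketch below is a dry run that only prints the command line and the implied layout. `jsrun_plan` is a hypothetical helper name, not part of jsrun, and it never launches anything.

```shell
#!/bin/bash
# Dry-run helper (hypothetical name): prints a jsrun command line and the
# layout it implies. It does not call jsrun.
jsrun_plan() {
    local nrs=$1 cores=$2 gpus=$3 tasks=$4 rs_per_node=$5
    # Total tasks = resource sets x tasks per resource set.
    local total_tasks=$(( nrs * tasks ))
    # Nodes used = ceil(resource sets / resource sets per node).
    local nodes=$(( (nrs + rs_per_node - 1) / rs_per_node ))
    echo "jsrun -n${nrs} -c${cores} -g${gpus} -a${tasks} -r${rs_per_node}"
    echo "tasks=${total_tasks} nodes=${nodes}"
}

# 4 resource sets, 1 core, 0 GPUs, 1 task each, 2 resource sets per node.
jsrun_plan 4 1 0 1 2
```

With these arguments the helper reports 4 tasks spread over 2 nodes.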
`hello_jsrun` can really be your friend: it reports the node and hardware thread each rank lands on. The following examples are from a Summit interactive shell with 2 nodes allocated.
export OMP_NUM_THREADS=1
jsrun -n4 -c1 -a1 ./hello_jsrun | sort
MPI Rank 000 of 004 on HWThread 000 of Node h30n16, OMP_threadID 0 of 1
MPI Rank 001 of 004 on HWThread 004 of Node h30n16, OMP_threadID 0 of 1
MPI Rank 002 of 004 on HWThread 009 of Node h30n16, OMP_threadID 0 of 1
MPI Rank 003 of 004 on HWThread 012 of Node h30n16, OMP_threadID 0 of 1
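All ranks are allocated on the first node, which is probably not what we want. Adding `-r2` places two resource sets on each node, so the ranks spread across both nodes: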
jsrun -n4 -c1 -a1 -r2 ./hello_jsrun | sort
MPI Rank 000 of 004 on HWThread 001 of Node h30n16, OMP_threadID 0 of 1
MPI Rank 001 of 004 on HWThread 004 of Node h30n16, OMP_threadID 0 of 1
MPI Rank 002 of 004 on HWThread 001 of Node h30n17, OMP_threadID 0 of 1
MPI Rank 003 of 004 on HWThread 005 of Node h30n17, OMP_threadID 0 of 1
---------- MPI Ranks: 4, OpenMP Threads: 1, GPUs per Resource Set: 0 ----------
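For larger runs it is easier to count ranks per node than to read every line. The following is a minimal sketch that assumes the `hello_jsrun` line format shown above. `ranks_per_node` is a hypothetical helper name, and the sample input is copied from the `-r2` run.

```shell
#!/bin/bash
# ranks_per_node (hypothetical helper): reads hello_jsrun-style lines on
# stdin and prints "<node> <rank count>" for each node, sorted by node name.
ranks_per_node() {
    awk '/^MPI Rank [0-9]+ of/ {
             # The node name is the field right after the word "Node".
             for (i = 1; i < NF; i++) {
                 if ($i == "Node") {
                     node = $(i + 1)
                     sub(/,$/, "", node)   # strip the trailing comma
                     count[node]++
                 }
             }
         }
         END { for (n in count) print n, count[n] }' | sort
}

# Sample input copied from the -r2 run above.
ranks_per_node <<'EOF'
MPI Rank 000 of 004 on HWThread 001 of Node h30n16, OMP_threadID 0 of 1
MPI Rank 001 of 004 on HWThread 004 of Node h30n16, OMP_threadID 0 of 1
MPI Rank 002 of 004 on HWThread 001 of Node h30n17, OMP_threadID 0 of 1
MPI Rank 003 of 004 on HWThread 005 of Node h30n17, OMP_threadID 0 of 1
EOF
```

In a real session the same function could be fed directly, for example `jsrun -n4 -c1 -a1 -r2 ./hello_jsrun | ranks_per_node`.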
#!/bin/bash
#BSUB -P stf008 # project ID
#BSUB -J name_test # name of the job
#BSUB -o nvme_test.o%J # output file
#BSUB -W 2 # wallclock limit, [hours:]minutes (2 = 2 minutes)
#BSUB -nnodes 2 # num. of nodes requested
#BSUB -alloc_flags NVME # if you plan to use NVMe