NERSC GPU Hackathon 2021
For the hackathon we will be using (in order):
- Perlmutter Phase 1
- Deep Bayou
- Ascent
Steve Brandt and Roland Haas worked a bit to see how to compile on these systems and came up with two methods. One, "spack", uses pre-compiled libraries provided by a spack installation that Steve created; the other, "ExternalLibraries", uses the tarballs in ExternalLibraries to compile the code.
These instructions assume that you are familiar with Cactus and Simfactory. You may want to consult Getting Started if you are not.
These instructions are currently a work in progress. Please report any issues that you find via the shared GRHydroX team chat (or a ticket here). Please do not send these as direct messages to Steve or Roland, as they may get lost.
To avoid confusion: this is a copy of Getting Started, but using the GRHydroX thorn list.
Since the source code repo is private, you have to work a bit to get it. The simplest approach is to check it out on your laptop and then use simfactory/bin/sim sync to copy it to Deep Bayou.
For this to work you will need a public/private key pair set up with GitHub.
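A quick way to check the key setup before cloning is sketched below. The helper name has_ssh_pubkey is hypothetical, and the sketch assumes your keys live in the default ~/.ssh directory with the usual id_*.pub naming; adjust it if your setup differs.

```shell
# Hypothetical helper: succeed if at least one public key file exists
# in the given directory (default: ~/.ssh). Assumes id_*.pub naming.
has_ssh_pubkey() {
    dir="${1:-$HOME/.ssh}"
    for key in "$dir"/id_*.pub; do
        if [ -f "$key" ]; then
            return 0
        fi
    done
    return 1
}

if has_ssh_pubkey; then
    echo "found a public key; test GitHub access with: ssh -T git@github.com"
else
    echo "no public key found; create one with: ssh-keygen -t ed25519"
fi
```

The key file only proves that a key exists locally; the public half still has to be registered in your GitHub account for the clone to succeed.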
First clone the source code repo to get the thornlist:
git clone git@github.com:shankar-1729/GRHydroX
Then use this to check out the remainder of the code:
curl -kLO https://raw.githubusercontent.com/gridaphobe/CRL/ET_2021_05/GetComponents
chmod a+x GetComponents
./GetComponents --root Cactus --parallel GRHydroX/thornlists/grhydrox.th
Then, still on your laptop, set up simfactory:
cd Cactus
./simfactory/bin/sim setup-silent
and add a section to simfactory/etc/defs.local.ini to override some settings:
[db1.hpc.lsu.edu]
user = <YOUR-DB1-USER-NAME>
allocation = hpc_et_test1
sourcebasedir = /ddnA/project/sbrandt/carpetx/@USER@
# Use the GPU
#optionlist = /project/sbrandt/cfgs/spack-cuda.cfg
# Use the CPU only
#optionlist = /project/sbrandt/cfgs/spack.cfg
# Use for hackathon
optionlist = /work/eschnett/Cactus/simfactory/mdb/optionlists/db-gpu.cfg
runscript = /work/eschnett/Cactus/simfactory/mdb/runscripts/db-gpu.run
envsetup = <<EOF
export SPACK_ROOT=/project/sbrandt/spack
source $SPACK_ROOT/share/spack/setup-env.sh
eval `spack --config-scope /home/sbrandt/.spack load --sh mpich`
eval `spack --config-scope /home/sbrandt/.spack load --sh yaml-cpp`
EOF
# ExternalLibraries/CUDA is used only with branch rhaas/cuda, not with master
# enabled-thorns = <<EOF
# ExternalLibraries/CUDA
# ExternalLibraries/RePrimAnd
#EOF
where you should replace only <YOUR-DB1-USER-NAME> with your user name on db1.
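Since a forgotten placeholder is easy to miss, a small check like the following can be run before syncing. This is only a sketch: check_placeholders is a hypothetical helper, and it assumes placeholders follow the <YOUR-...> pattern used on this page.

```shell
# Hypothetical helper: print any <YOUR-...> placeholders left in the
# given file (with line numbers). Returns 0 only if none remain.
check_placeholders() {
    if grep -n '<YOUR-[A-Z0-9-]*>' "$1"; then
        echo "replace the placeholders listed above in $1" >&2
        return 1
    fi
    return 0
}
```

From the Cactus directory it would be called as check_placeholders simfactory/etc/defs.local.ini.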
After that copy the code to db1:
./simfactory/bin/sim sync db1.hpc.lsu.edu
and log in to db1:
./simfactory/bin/sim login db1.hpc.lsu.edu
Then set up some symbolic links to provide directories the run script expects:
ln -s /ddnA/work/eschnett/Cactus/view-cuda ./
Finally compile the thornlist (on db1):
./simfactory/bin/sim build -j8 --thornlist thornlists/grhydrox.th
and run Erik's test parfile:
./simfactory/bin/sim submit test-gpu-01 --parfile=arrangements/GRHydroX/GRHydroX/par/test.par --procs=24 --num-threads=24
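Once the job is submitted, its state and output can be inspected from the same Cactus directory. A minimal sketch, assuming simfactory's list-simulations and show-output subcommands are available in your checkout; the check_simulation wrapper and the SIM variable are hypothetical names used only for illustration.

```shell
# Path to simfactory's driver script, relative to the Cactus directory.
SIM="${SIM:-./simfactory/bin/sim}"

# Hypothetical helper: list the known simulations, then print the
# captured output of the one named in the first argument.
check_simulation() {
    "$SIM" list-simulations && "$SIM" show-output "$1"
}
```

For the test run above this would be called as check_simulation test-gpu-01.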
TODO: describe this properly
This uses a plain Cactus build. First log in to Perlmutter Phase 1:
ssh $USER@perlmutter-p1.nersc.gov
where $USER is your username on NERSC. Then check out the version of the code you would like, e.g.:
./GetComponents https://bitbucket.org/eschnett/cactusamrex/raw/master/azure-pipelines/carpetx-cuda.th
and set up the modules required:
module load gcc/9.3.0
export SPACK_ROOT=/global/u2/s/sbrandt/spack
# this next line may fail with "please set spack prefix" due to missing executable permission on $SPACK_ROOT/bin/spack
source $SPACK_ROOT/share/spack/setup-env.sh
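If sourcing setup-env.sh fails, the permission problem mentioned in the comment can be checked directly. The following is a sketch only; spack_is_executable is a hypothetical helper name, and the default path is the one used on this page.

```shell
# Hypothetical helper: check that the spack launcher under the given
# root exists and is executable by the current user.
spack_is_executable() {
    [ -x "$1/bin/spack" ]
}

SPACK_ROOT="${SPACK_ROOT:-/global/u2/s/sbrandt/spack}"
if spack_is_executable "$SPACK_ROOT"; then
    echo "spack launcher is executable"
else
    echo "cannot execute $SPACK_ROOT/bin/spack; its permissions are:"
    ls -l "$SPACK_ROOT/bin/spack" 2>&1 || true
fi
```

Because the installation belongs to another user, a missing permission has to be fixed by its owner rather than by you.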
Then compile Cactus:
cd Cactus
make sim-config options=/global/u2/s/sbrandt/spack-cuda.cfg THORNLIST=thornlists/carpetx-cuda.th
make -j8 sim
This mostly follows Getting Started using the same type of thorns.
Since the source code repo is private, you have to work a bit to get it. The simplest approach is to check it out on your laptop and then use simfactory/bin/sim sync to copy it to Perlmutter:
curl -kLO https://raw.githubusercontent.com/gridaphobe/CRL/ET_2021_05/GetComponents
chmod a+x GetComponents
git clone git@github.com:shankar-1729/GRHydroX
./GetComponents --root Cactus --parallel GRHydroX/thornlists/grhydrox.th
Switch simfactory to the rhaas/perlmutter branch, and switch cactusamrex and GRHydroX to rhaas/cuda:
cd Cactus
cd repos/simfactory2
git checkout rhaas/perlmutter
cd ../cactusamrex
git checkout rhaas/cuda
cd ../GRHydroX
git checkout rhaas/cuda
cd ../../
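Before building, it can help to confirm that each repository really is on the intended branch. A minimal sketch, assuming git is installed; show_branches is a hypothetical helper, and a repository with a detached HEAD is reported as "?".

```shell
# Hypothetical helper: print "<repo> <branch>" for each repository
# directory given on the command line.
show_branches() {
    for repo in "$@"; do
        branch=$(git -C "$repo" symbolic-ref --short HEAD 2>/dev/null || echo '?')
        printf '%s %s\n' "$repo" "$branch"
    done
}
```

From the Cactus directory it would be called as show_branches repos/simfactory2 repos/cactusamrex repos/GRHydroX.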
Then, still on your laptop, set up simfactory:
./simfactory/bin/sim setup-silent
and add a section to simfactory/etc/defs.local.ini to override some settings:
[perlmutter-p1]
user = <YOUR-PERLMUTTER-USER-NAME>
allocation = ntrain9
After that, copy the code to Perlmutter:
./simfactory/bin/sim sync perlmutter-p1
and log in to Perlmutter:
./simfactory/bin/sim login perlmutter-p1
Set up user name etc. for perlmutter:
echo perlmutter-p1 >~/.hostname
Compile:
./simfactory/bin/sim build --thornlist thornlists/grhydrox.th
Run a test:
./simfactory/bin/sim submit --cores 128 --walltime 0:5:0 --parfile repos/cactusamrex/WaveToyGPU/par/planewave-gpu.par planewave-gpu-01
TODO: add this
These instructions were tested on Summit since I (Roland) do not yet have access to Ascent, which is the open, training system using Summit hardware.
Log in:
ssh $USER@summit.olcf.ornl.gov
where $USER is your username on OLCF. Then change to scratch (I got build failures when I tried to build in $HOME, which uses NFS):
cd /gpfs/alpine/*/scratch/$USER
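The wildcard in the path stands for your project ID. If you belong to more than one project the pattern can match several directories, in which case cd may not take you where you expect. A sketch for listing the candidates first; list_scratch_dirs is a hypothetical helper name.

```shell
# Hypothetical helper: print every existing directory that matches
# <base>/*/scratch/<user>, one per line (default base: /gpfs/alpine).
list_scratch_dirs() {
    base="${1:-/gpfs/alpine}"
    user="${USER:-$(id -un)}"
    for d in "$base"/*/scratch/"$user"; do
        if [ -d "$d" ]; then
            echo "$d"
        fi
    done
    return 0
}

list_scratch_dirs
```

If more than one directory is printed, cd into the one belonging to the allocation you intend to use.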
Then get the code:
module load subversion
curl -kLO https://raw.githubusercontent.com/gridaphobe/CRL/ET_2021_05/GetComponents
chmod a+x GetComponents
./GetComponents --root Cactus --parallel https://bitbucket.org/eschnett/cactusamrex/raw/d659fbcb2593ab4816040c3b0b16203bc3c22ffe/manifest/carpetx.th
Switch simfactory to the rhaas/summit-gpu branch, and switch cactusamrex and GRHydroX to rhaas/cuda:
cd Cactus
cd repos/simfactory2
git checkout rhaas/summit-gpu
cd ../cactusamrex
git checkout rhaas/cuda
cd ../GRHydroX
git checkout rhaas/cuda
cd ../../
Set up user name etc. for Summit:
./simfactory/bin/sim setup-silent
Make Summit's machine description use the updated files by editing simfactory/etc/defs.local.ini and adding a section for summit:
[summit]
# you may have to adjust the allocation
allocation = GEN170
enabled-thorns = <<EOF
ExternalLibraries/CUDA
CarpetX/WaveToyGPU
EOF
optionlist = summit-gpu.cfg
# load some additional modules
envsetup = <<EOF
[ -n "$LMOD_CMD" ] || source /etc/profile
module unload boost || true
module unload spectrum-mpi || true
module unload gcc || true
module unload xl || true
module unload cuda || true
module load gcc/9.3.0 &&
module load spectrum-mpi/10.4.0.3-20210112 &&
module load cuda/11.3.1 &&
module load gsl/2.5 &&
module load hdf5/1.10.7 &&
module load boost/1.77.0 &&
module load openblas/0.3.17 &&
module load fftw/3.3.8 &&
module load curl/7.79.0 &&
module load hwloc/2.5.0 &&
module load hypre/2.22.0-cpu &&
module load papi/6.0.0.1 &&
module load perl/5.30.1 &&
module load petsc/3.15.4-no_cuda &&
module load zlib/1.2.11 &&
module load cmake/3.21.3
EOF
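The envsetup block above combines two shell idioms: "|| true" after each unload, and "&&" between the loads. The sketch below illustrates how they behave using stand-in functions; fake_unload and fake_load are hypothetical and only mimic the exit status of the real module commands, which are not needed to run it.

```shell
# Stand-ins for "module unload" and "module load" (hypothetical; the
# real commands are provided by the module system on the cluster).
fake_unload() { return 1; }                  # pretend nothing was loaded
fake_load()   { [ "$1" != "missing/0.0" ]; } # fail for one made-up module

# "|| true" swallows the failure, so an unload of a module that is not
# loaded does not abort the setup.
fake_unload boost || true

# "&&" stops at the first failing load, so the chain as a whole fails
# and the later loads are skipped.
if fake_load gcc/9.3.0 && fake_load missing/0.0 && fake_load cuda/11.3.1; then
    echo "environment ready"
else
    echo "a module failed to load"
fi
```

In other words, unloads are best effort, while a single unavailable module version makes the whole load chain report failure.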
Compile:
./simfactory/bin/sim build --thornlist thornlists/carpetx.th