NERSC GPU Hackathon 2021

rhaas80 edited this page Dec 9, 2021 · 22 revisions

For the hackathon we will be using (in order):

  • Perlmutter Phase 1
  • Deep Bayou
  • Ascent

Steve Brandt and Roland Haas worked a bit to see how to compile on these systems and came up with two methods. The first, "spack", uses pre-compiled libraries provided by a spack installation that Steve created. The other, "ExternalLibraries", uses the tarballs in ExternalLibraries to compile the code.

These instructions assume that you are familiar with Cactus and Simfactory. You may want to consult Getting Started if you are not.

These instructions are currently a work in progress. Please report any issues that you find via the shared GRHydroX team chat (or a ticket here). Please do not use direct messages to Steve or Roland for these, as they may get lost.

Deep Bayou

To avoid confusion, this is a copy of Getting Started, but using the GRHydroX thorn list.

Since the source code repo is private, you have to work a bit to get it. The simplest approach is to check it out on your laptop and then use simfactory/bin/sim sync to copy it to Deep Bayou.

For this to work you will have to have a public / private key set up with GitHub.
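If you are not sure whether a key is set up, a quick local check along the following lines may help. This is only a sketch: find_ssh_pubkeys is a hypothetical helper name (not part of any tool mentioned here), and it assumes your keys live in the default ~/.ssh directory.

```shell
# Hypothetical helper: list SSH public keys in a directory (default: ~/.ssh).
# Returns 0 if at least one *.pub file exists, 1 otherwise.
find_ssh_pubkeys() {
    dir="${1:-$HOME/.ssh}"
    found=1
    for key in "$dir"/*.pub; do
        # An unmatched glob stays literal, so check the file really exists.
        if [ -f "$key" ]; then
            echo "$key"
            found=0
        fi
    done
    return "$found"
}

# Usage (not run automatically):
#   find_ssh_pubkeys || echo "no public key found; create one with ssh-keygen"
#   ssh -T git@github.com   # GitHub answers with a greeting if the key is registered
```

Finding a local key does not prove that it has been added to your GitHub account; the ssh -T call in the usage comment is what tests that.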

First clone the source code repo to get the thornlist:

git clone git@github.com:shankar-1729/GRHydroX

Then use this to check out the remainder of the code:

curl -kLO https://raw.githubusercontent.com/gridaphobe/CRL/ET_2021_05/GetComponents
chmod a+x GetComponents

./GetComponents --root Cactus --parallel GRHydroX/thornlists/grhydrox.th

Then, still on your laptop, set up simfactory:

cd Cactus
./simfactory/bin/sim setup-silent

and add a section to simfactory/etc/defs.local.ini to override some settings:

[db1.hpc.lsu.edu]
user = <YOUR-DB1-USER-NAME>

allocation =  hpc_et_test1
sourcebasedir = /ddnA/project/sbrandt/carpetx/@USER@

# Use the GPU
#optionlist = /project/sbrandt/cfgs/spack-cuda.cfg
# Use the CPU only
#optionlist = /project/sbrandt/cfgs/spack.cfg
# Use for hackathon
optionlist = /work/eschnett/Cactus/simfactory/mdb/optionlists/db-gpu.cfg

runscript = /work/eschnett/Cactus/simfactory/mdb/runscripts/db-gpu.run

envsetup = <<EOF
export SPACK_ROOT=/project/sbrandt/spack
source $SPACK_ROOT/share/spack/setup-env.sh
eval `spack --config-scope /home/sbrandt/.spack load --sh mpich`
eval `spack --config-scope /home/sbrandt/.spack load --sh yaml-cpp`
EOF
# ExternalLibraries/CUDA is used only with branch rhaas/cuda, not with master
# enabled-thorns = <<EOF
# ExternalLibraries/CUDA
# ExternalLibraries/RePrimAnd
#EOF

where you should replace only <YOUR-DB1-USER-NAME> by your user name on db1.
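Since a forgotten placeholder is easy to miss, a small check like the one below could be run before syncing. It is a sketch only: check_placeholders is a hypothetical helper, not part of Simfactory, and all it does is look for leftover <YOUR-...> text in the file you give it.

```shell
# Hypothetical helper: report "<YOUR-...>" placeholders left in an ini file.
# Returns 0 when the file is clean, 1 when a placeholder remains,
# and 2 when the file does not exist.
check_placeholders() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 2
    fi
    # grep -n prints the offending lines together with their line numbers.
    if grep -n '<YOUR-[A-Z0-9-]*>' "$file"; then
        echo "placeholder still present in $file" >&2
        return 1
    fi
    return 0
}

# Usage (from the Cactus directory, not run automatically):
#   check_placeholders simfactory/etc/defs.local.ini
```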

After that copy the code to db1:

./simfactory/bin/sim sync db1.hpc.lsu.edu

and log in to db1:

./simfactory/bin/sim login db1.hpc.lsu.edu

Then set up a symbolic link to provide a directory the run script expects:

ln -s /ddnA/work/eschnett/Cactus/view-cuda ./
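Because ln -s succeeds even when the target is missing or unreadable, it may be worth confirming that the link resolves before building. The following is a minimal sketch; check_link is a hypothetical helper name, and it only tests that the path is a symbolic link whose target exists.

```shell
# Hypothetical helper: confirm that a path is a symbolic link that resolves.
# Prints "link -> target" on success, returns 1 otherwise.
check_link() {
    link="$1"
    if [ ! -L "$link" ]; then
        echo "$link is not a symbolic link" >&2
        return 1
    fi
    # -e follows the link, so it fails for a dangling link.
    if [ ! -e "$link" ]; then
        echo "$link is dangling (target missing or unreadable)" >&2
        return 1
    fi
    echo "$link -> $(readlink "$link")"
}

# Usage (in the directory where the link was created, not run automatically):
#   check_link view-cuda
```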

Finally compile the thornlist (on db1):

./simfactory/bin/sim build -j8 --thornlist thornlists/grhydrox.th

and run Erik's test parfile:

./simfactory/bin/sim submit test-gpu-01 --parfile=arrangements/GRHydroX/GRHydroX/par/test.par --procs=24 --num-threads=24

Perlmutter Phase 1

Spack

TODO: describe this properly

This uses a plain Cactus build. Log in to Perlmutter Phase 1:

ssh $USER@perlmutter-p1.nersc.gov

where $USER is your username on NERSC. Then check out the version of the code you would like, e.g.:

./GetComponents https://bitbucket.org/eschnett/cactusamrex/raw/master/azure-pipelines/carpetx-cuda.th

and set up the modules required:

module load gcc/9.3.0
export SPACK_ROOT=/global/u2/s/sbrandt/spack
# this next line may fail with "please set spack prefix" due to missing executable permission on $SPACK_ROOT/bin/spack
source $SPACK_ROOT/share/spack/setup-env.sh
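The permission problem mentioned in the comment above can be checked for before sourcing the setup script. This is a sketch under assumptions: spack_usable is a hypothetical helper name, and it only tests for the two files referred to above; it does not fix the permissions, which only the owner of the installation can do.

```shell
# Hypothetical helper: check that a spack installation looks usable.
# Takes the spack root as its argument (default: $SPACK_ROOT).
spack_usable() {
    root="${1:-${SPACK_ROOT:-}}"
    if [ ! -f "$root/share/spack/setup-env.sh" ]; then
        echo "setup-env.sh not found under $root" >&2
        return 1
    fi
    # -x tests whether the current user may execute the file.
    if [ ! -x "$root/bin/spack" ]; then
        echo "$root/bin/spack is not executable for this user" >&2
        return 1
    fi
    return 0
}

# Usage (not run automatically):
#   spack_usable "$SPACK_ROOT" && source "$SPACK_ROOT/share/spack/setup-env.sh"
```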

Then compile Cactus:

cd Cactus
make sim-config options=/global/u2/s/sbrandt/spack-cuda.cfg THORNLIST=thornlists/carpetx-cuda.th
make -j8 sim

ExternalLibraries

This mostly follows Getting Started using the same type of thorns.

Since the source code repo is private, you have to work a bit to get it. The simplest approach is to check it out on your laptop and then use simfactory/bin/sim sync to copy it to Perlmutter.

So:

curl -kLO https://raw.githubusercontent.com/gridaphobe/CRL/ET_2021_05/GetComponents
chmod a+x GetComponents

git clone git@github.com:shankar-1729/GRHydroX
./GetComponents --root Cactus --parallel GRHydroX/thornlists/grhydrox.th

Switch simfactory to the rhaas/perlmutter branch, and switch cactusamrex and GRHydroX to rhaas/cuda:

cd Cactus
cd repos/simfactory2
git checkout rhaas/perlmutter
cd ../cactusamrex
git checkout rhaas/cuda
cd ../GRHydroX
git checkout rhaas/cuda
cd ../../

Then, still on your laptop, set up simfactory:

./simfactory/bin/sim setup-silent

and add a section to simfactory/etc/defs.local.ini to override some settings:

[perlmutter-p1]
user = <YOUR-PERLMUTTER-USER-NAME>
allocation = ntrain9

After that copy the code to Perlmutter:

./simfactory/bin/sim sync perlmutter-p1

and log in to Perlmutter:

./simfactory/bin/sim login perlmutter-p1

Set up user name etc. for perlmutter:

echo perlmutter-p1 >~/.hostname

Compile:

./simfactory/bin/sim build --thornlist thornlists/grhydrox.th

Run a test:

./simfactory/bin/sim submit --cores 128 --walltime 0:5:0 --parfile repos/cactusamrex/WaveToyGPU/par/planewave-gpu.par planewave-gpu-01

Ascent / Summit

Spack

TODO: add this

ExternalLibraries

These instructions were tested on Summit since I (Roland) do not yet have access to Ascent, which is the open, training system using Summit hardware.

Log in:

ssh $USER@summit.olcf.ornl.gov

where $USER is your username on OLCF. Then change to scratch (I got build failures when I tried to build in $HOME, which uses NFS):

cd /gpfs/alpine/*/scratch/$USER
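The wildcard in that path stands for your project directory. If you belong to more than one project, it can match several directories, and the cd may then not do what you expect. A sketch for listing the candidates first is shown below; list_scratch_dirs is a hypothetical helper name, and the directory layout is assumed to be the one used in the command above.

```shell
# Hypothetical helper: print every directory matching <base>/*/scratch/<user>,
# one per line, so that you can pick the one you want explicitly.
list_scratch_dirs() {
    base="$1"
    user="$2"
    for dir in "$base"/*/scratch/"$user"; do
        # Skip the literal pattern that remains when nothing matches.
        [ -d "$dir" ] && echo "$dir"
    done
    return 0
}

# Usage (not run automatically):
#   list_scratch_dirs /gpfs/alpine "$USER"
#   cd "$(list_scratch_dirs /gpfs/alpine "$USER" | head -n 1)"
```

Taking the first match with head is just one possible choice; if several projects are listed, pick the one whose allocation you intend to use.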

Then get the code:

module load subversion
curl -kLO https://raw.githubusercontent.com/gridaphobe/CRL/ET_2021_05/GetComponents
chmod a+x GetComponents
./GetComponents --root Cactus --parallel https://bitbucket.org/eschnett/cactusamrex/raw/d659fbcb2593ab4816040c3b0b16203bc3c22ffe/manifest/carpetx.th

Switch simfactory to the rhaas/summit-gpu branch, and switch cactusamrex and GRHydroX to rhaas/cuda:

cd Cactus
cd repos/simfactory2
git checkout rhaas/summit-gpu
cd ../cactusamrex
git checkout rhaas/cuda
cd ../GRHydroX
git checkout rhaas/cuda
cd ../../

Set up user name etc. for Summit:

./simfactory/bin/sim setup-silent

Change Summit's machine description to use the updated files by editing simfactory/etc/defs.local.ini and adding a section for summit:

[summit]
# you may have to adjust the allocation
allocation = GEN170
enabled-thorns = <<EOF
ExternalLibraries/CUDA
CarpetX/WaveToyGPU
EOF
optionlist = summit-gpu.cfg
# load some additional modules
envsetup = <<EOF
    [ -n "$LMOD_CMD" ] || source /etc/profile
    module unload boost || true
    module unload spectrum-mpi || true
    module unload gcc || true
    module unload xl || true
    module unload cuda || true
    module load gcc/9.3.0 &&
    module load spectrum-mpi/10.4.0.3-20210112 &&
    module load cuda/11.3.1 &&
    module load gsl/2.5 &&
    module load hdf5/1.10.7 &&
    module load boost/1.77.0 &&
    module load openblas/0.3.17 &&
    module load fftw/3.3.8 &&
    module load curl/7.79.0 &&
    module load hwloc/2.5.0 &&
    module load hypre/2.22.0-cpu &&
    module load papi/6.0.0.1 &&
    module load perl/5.30.1 &&
    module load petsc/3.15.4-no_cuda &&
    module load zlib/1.2.11 &&
    module load cmake/3.21.3
EOF

Compile:

./simfactory/bin/sim build --thornlist thornlists/carpetx.th