The Taurus cluster is located at ZIH (TU Dresden).
The cluster has multiple partitions; this section describes how to use the AMD Rome CPUs + NVIDIA A100 partition.
If you are new to this system, please see the following resources:
- ZIH user guide
- Batch system: Slurm
- Jupyter service: Missing?
- Production directories:

  - ``$PSCRATCH``: per-user production directory, purged every 30 days (<TBD>TB)
  - ``/global/cscratch1/sd/m3239``: shared production directory for users in the project ``m3239``, purged every 30 days (50TB)
  - ``/global/cfs/cdirs/m3239/``: community file system for users in the project ``m3239`` (100TB)
Use the following command to download the WarpX source code:
git clone https://github.com/ECP-WarpX/WarpX.git $HOME/src/warpx
We use the following modules and environments on the system (``$HOME/taurus_warpx.profile``).

.. literalinclude:: ../../../../Tools/machines/taurus-zih/taurus_warpx.profile.example
   :language: bash
   :caption: You can copy this file from ``Tools/machines/taurus-zih/taurus_warpx.profile.example``.

We recommend storing the above lines in a file, such as ``$HOME/taurus_warpx.profile``, and loading it into your shell after each login:
source $HOME/taurus_warpx.profile
Then, ``cd`` into the directory ``$HOME/src/warpx`` and use the following commands to compile:
cd $HOME/src/warpx
rm -rf build
cmake -S . -B build -DWarpX_DIMS=3 -DWarpX_COMPUTE=CUDA
cmake --build build -j 16
The general :ref:`cmake compile-time options <building-cmake>` apply as usual.
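After the build finishes, it can be useful to confirm that an executable was actually produced. The following is a minimal sketch, assuming that the executables are placed in ``build/bin`` and that their file names start with ``warpx``; the helper name ``list_warpx_executables`` is hypothetical and not part of WarpX.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list WarpX executables found in a build's bin/ directory.
# Assumption: executable names start with "warpx".
list_warpx_executables() {
    local bin_dir=$1
    local exe
    for exe in "$bin_dir"/warpx*; do
        # The glob stays literal when nothing matches, so test each entry.
        if [ -f "$exe" ] && [ -x "$exe" ]; then
            basename "$exe"
        fi
    done
}

# Prints nothing if the build directory does not exist or holds no executable.
list_warpx_executables "$HOME/src/warpx/build/bin"
```

An empty result suggests that the build did not complete; rerunning ``cmake --build build -j 16`` shows the error again.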
The alpha partition has 34 nodes, each with 8 x NVIDIA A100-SXM4 Tensor Core GPUs and 2 x AMD EPYC 7352 CPUs (24 cores each) @ 2.3 GHz (multithreading disabled).
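The hardware figures above translate directly into job sizes. The sketch below works through the arithmetic, assuming one MPI rank per GPU and an even split of CPU cores among the GPUs of a node; the helper names ``ranks_for_nodes`` and ``cores_per_rank`` are hypothetical.

```shell
#!/usr/bin/env bash
# Hardware figures taken from the partition description.
GPUS_PER_NODE=8
CORES_PER_NODE=$((2 * 24))   # 2 CPUs x 24 cores each

# Hypothetical helper: total MPI ranks for a given node count,
# assuming one MPI rank per GPU.
ranks_for_nodes() {
    local nodes=$1
    echo $((nodes * GPUS_PER_NODE))
}

# Hypothetical helper: CPU cores available to each rank when the
# cores of a node are shared evenly among its GPUs.
cores_per_rank() {
    echo $((CORES_PER_NODE / GPUS_PER_NODE))
}

ranks_for_nodes 2    # prints 16
cores_per_rank       # prints 6
```

With all 34 nodes, the same arithmetic gives 272 GPUs and therefore at most 272 ranks under this assumption.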
The batch script below can be used to run a WarpX simulation on multiple nodes (change ``-N`` accordingly).
Replace descriptions between chevrons ``<>`` with relevant values; for instance, ``<input file>`` could be ``plasma_mirror_inputs``.
Note that we run one MPI rank per GPU.

.. literalinclude:: ../../../../Tools/machines/taurus-zih/taurus.sbatch
   :language: bash
   :caption: You can copy this file from ``Tools/machines/taurus-zih/taurus.sbatch``.

To run a simulation, copy the lines above to a file ``taurus.sbatch`` and run
sbatch taurus.sbatch
to submit the job.
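Once a job is submitted, its ID is needed to monitor or cancel it. The sketch below is one possible way to capture that ID, assuming ``sbatch`` prints its default ``Submitted batch job <id>`` message; the helper name ``job_id_from_sbatch_output`` is hypothetical, while ``squeue`` and ``scancel`` are standard Slurm commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the job ID (the last word) from sbatch's
# default "Submitted batch job <id>" message.
job_id_from_sbatch_output() {
    local message=$1
    echo "${message##* }"
}

# Only attempt a real submission where Slurm and the batch script exist.
if command -v sbatch >/dev/null 2>&1 && [ -f taurus.sbatch ]; then
    jobid=$(job_id_from_sbatch_output "$(sbatch taurus.sbatch)")
    if [ -n "$jobid" ]; then
        squeue -j "$jobid"      # show the job's state in the queue
        # scancel "$jobid"      # uncomment to cancel the job
    fi
fi
```

Listing all of your own jobs with ``squeue -u $USER`` is a simpler alternative when the ID was not captured.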