Shifter Integration with SLURM
Shifter is distributed with a SPANK (https://github.com/SchedMD/slurm/blob/master/slurm/spank.h) plugin for SLURM. This plugin requires features from SLURM 15.08 to function properly, and relies on some features from 16.05 (which can be backported to 15.08).
This feature enables large-scale Shifter applications to run within HPC environments while also simplifying the user interface to Shifter. The SLURM integration adds several options to salloc that are used to pre-construct the user-defined environment. When an image is specified, it is set up on every compute node in the allocation ahead of time. In addition, an ssh daemon can be started within the container, and a hostsfile can be placed within the image, allowing an application to directly access the other nodes without any difficulty.
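As a usage sketch, assuming the plugin is installed and that the image-selection option it adds is named --image (the flag name, image, and node count here are illustrative):

```shell
# Request a 2-node allocation with a user-defined image
# prepared on every compute node ahead of time.
salloc -N 2 --image=docker:ubuntu:latest

# Inside the allocation, run a command within the image on each node.
srun shifter /bin/hostname
```

Because the image is staged at allocation time rather than at the first srun, the application start itself incurs no additional image-setup cost.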
Building the SPANK Plugin
This part is easy: just pass
--with-slurm=</path/to/your/slurm/installation> as an option to configure for udiRoot. The plugin will be compiled and installed into the udiRoot distribution under lib/shifterudiroot/.
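A minimal build sketch, assuming a SLURM installation under /opt/slurm and an install prefix of /opt/shifter/udiRoot (both paths are hypothetical; substitute your site's locations):

```shell
# Configure udiRoot with the SPANK plugin enabled against the site SLURM.
./configure --prefix=/opt/shifter/udiRoot --with-slurm=/opt/slurm
make
make install
```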
Configuring the SPANK Plugin
Add a new line to your SLURM
plugstack.conf configuration file:
required /path/to/shifter/udiRoot/lib/shifterudiroot/shifter_slurm.so shifter_config=/path/to/udiRoot.conf <other options>
Additional SLURM Configurations
When using shifter integration, you should set in slurm.conf:

PrologFlags=alloc,contain
The "alloc" flag indicates that the slurm prologs should be run on all nodes in a job allocation just before the job starts. The default is to run them at the first srun, rather than at allocation time. For performance reasons it is better to do the initial shifter setup prior to job start, and all at once. Additionally, if the calculation will use the sshd provided by shifter, "alloc" is needed to make sure the sshd is running on every node.
The "contain" flag runs a separate slurmstepd on every node that shepherds the "extern" step. The extern step can be used to bring processes started within the shifter sshd under SLURM's control. If you specify the "extern_setup" option to shifter_slurm.so in plugstack.conf, you can provide a script that is run once per node after all job setup is complete. This is useful if any final site-specific setup must happen between the prologs and job start (e.g., Cray DataWarp mounts).
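For example, a plugstack.conf entry using extern_setup might look like the following (the script path is hypothetical; the script runs once per node after job setup completes):

```
required /path/to/shifter/udiRoot/lib/shifterudiroot/shifter_slurm.so shifter_config=/path/to/udiRoot.conf extern_setup=/etc/shifter/extern_setup.sh
```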