Written by Shawn O'Neil, adapted to SLURM by Zhian Kamvar. Original implementation: https://github.com/oneilsh/SGE_Array
Note: This software is intended for use on the HCC infrastructure, and may not work well on other SLURM installations. It comes with no warranty or guarantee of effectiveness.
So you have a bunch of commands that you want to run via SLURM, maybe in a text file called commands.txt:
runAssembly sample_117.fasta -o sample_117.fasta.out
runAssembly sample_162.fasta -o sample_162.fasta.out
runAssembly sample_169.fasta -o sample_169.fasta.out
runAssembly sample_30.fasta -o sample_30.fasta.out
runAssembly sample_34.fasta -o sample_34.fasta.out
runAssembly sample_38.fasta -o sample_38.fasta.out
runAssembly sample_47.fasta -o sample_47.fasta.out
runAssembly sample_58.fasta -o sample_58.fasta.out
runAssembly sample_96.fasta -o sample_96.fasta.out
Or maybe you generated your commands from the list of file names with some clever usage of awk or sed (I generated the above with ls -1 *.fasta | awk '{print "runAssembly " $1 " -o " $1 ".out"}' > commands.txt). From there you could just make the file an executable shell script, or even pipe it right into bash with cat commands.txt | bash. But what if you want to run these via SLURM?
Probably, you should run them as an array job (because using a loop to submit them as individual jobs is NOT GOOD, ok?), but this means dealing with the clunky $SLURM_ARRAY_TASK_ID syntax, which only gives you a numeric index. SLURM_Array to the rescue: it takes a list of commands (either as a file, or on stdin) and turns them into an array job. Boom.
cat commands.txt | SLURM_Array
# or
SLURM_Array -c commands.txt
# what about?
ls -1 *.fasta | awk '{print "runAssembly " $1 " -o " $1 ".out"}' | SLURM_Array
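For comparison, doing this by hand means writing the array-job submit script yourself, something along these lines (a rough sketch, not the exact script SLURM_Array generates), and then submitting it with sbatch:

#!/bin/bash
#SBATCH --array=1-9   # one task per line of commands.txt
#SBATCH --mem=4G
#SBATCH --time=04:00:00

# Grab the line of commands.txt matching this task's index and run it.
CMD=$(sed -n "${SLURM_ARRAY_TASK_ID}p" commands.txt)
eval "$CMD"

SLURM_Array writes and submits that sort of boilerplate for you, and keeps the logs organized.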
By default, each command is run requesting 4 gigs of RAM, and will be killed if it exceeds that. Each command will also be killed if it attempts to create a file larger than 500G (to preserve fileservers). The maximum number of commands that can run simultaneously across any number of machines is limited to 1000 by default (to preserve network resources; this can be changed for IO-light commands).
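If those defaults don't suit you, the corresponding flags (all documented in the help output below) can be set on the command line; for example, something like this would request 16 gigs of RAM per command, allow files up to 1000G, and cap the number of simultaneously running commands at 20:

SLURM_Array -c commands.txt -m 16gb -f 1000G -b 20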
The default log directory (where the sbatch submit script is created, and where the stdout and stderr of each command are written) is jYEAR-MON-DAY_HOUR-MIN-SEC_cmd_etal, as in j2015-01-07_16-35-57_runAssembly_etal (and the "cmd" part is even autogenerated from the first word of the first command), so that you can easily organize your log information by time, and move/remove things in an orderly fashion (you know, like deleting all of today's work with rm -rf j2015-01-07*).
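If you'd rather name the job and its log directory yourself, the -r/--rundir flag (see the help below) does that, for instance:

SLURM_Array -c commands.txt -r todays_assemblies

(Per the help, note that an existing directory of the same name will be overwritten.)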
Most of this can be changed; here's the help output:
usage: SLURM_Array.py [-h] [-c COMMANDSFILE] [-q QUEUE] [-m MEMORY] [-t TIME]
[-l MODULE [MODULE ...]] [-M MAIL] [--mailtype MAILTYPE]
[-f FILELIMIT] [-b CONCURRENCY] [-x MAXCOMMANDS]
[--duration DURATION] [-P PROCESSORS] [-r RUNDIR]
[-w WD] [-H] [--hold] [--hold_jids HOLD_JID_LIST]
[--hold_names HOLD_NAME_LIST] [-v] [-d]
[--showchangelog]
Runs a list of commands specified on stdin as a SLURM array job. Example
usage: `cat commands.txt | SLURM_Array` or `SLURM_Array -c commands.txt`
optional arguments:
-h, --help show this help message and exit
-c COMMANDSFILE, --commandsfile COMMANDSFILE
The file to read commands from. Default: -, meaning
standard input.
-q QUEUE, --queue QUEUE
The queue(s) to send the commands to. Default: all
queues you have access to.
-m MEMORY, --memory MEMORY
Amount of free RAM to request for each command, and
the maximum that each can use without being killed.
Default: 4gb
-t TIME, --time TIME The maximum amount of time for the job to run in
d-hh:mm:ss. Default: 04:00:00
-l MODULE [MODULE ...], --module MODULE [MODULE ...]
List of modules to load after preamble. Eg: R/3.3
python/3.6
-M MAIL, --mail MAIL Email address to send notifications to. Default: None
--mailtype MAILTYPE Type of email notification to be sent if -M is
specified. Options: BEGIN, END, FAIL, ALL. Default:
ALL
-f FILELIMIT, --filelimit FILELIMIT
The largest file a command can create without being
killed. (Preserves fileservers.) Default: 500G
-b CONCURRENCY, --concurrency CONCURRENCY
Maximum number of commands that can be run
simultaneously across any number of machines.
(Preserves network resources.) Default: 1000
-x MAXCOMMANDS, --maxcommands MAXCOMMANDS
Maximum number of commands that can be submitted with
one submission script. If the number of commands
exceeds this number, they will be batched in separate
array jobs. Default: 900
--duration DURATION Duration expected for each of maxcommands to run in
d-hh:mm:ss. This will be multiplied by the number of
batches needed to run.
-P PROCESSORS, --processors PROCESSORS
Number of processors to reserve for each command.
Default: 1
-r RUNDIR, --rundir RUNDIR
Job name and the directory to create or OVERWRITE to
store log information and standard output of the
commands. Default: 'jYEAR-MON-DAY_HOUR-MIN-
SEC_<cmd>_etal' where <cmd> is the first word of the
first command.
-w WD, --working-directory WD
Working directory to set. Defaults to nothing.
-H Hold the execution for these commands until you
release them via scontrol release <JOB-ID>
--hold Hold the execution for these commands until all
previous jobs arrays run from this directory have
finished. Uses the list of jobs as logged to
.slurm_array_jobnums.
--hold_jids HOLD_JID_LIST
Hold the execution for these commands until these
specific job IDs have finished (e.g. '--hold_jid
151235' or '--hold_jid 151235,151239' )
--hold_names HOLD_NAME_LIST
Hold the execution for these commands until these
specific job names have finished (comma-sep list);
accepts regular expressions. (e.g. 'SLURM_Array -c
commands.txt -r this_job_name --hold_names
previous_job_name,other_jobs_.+'). Uses job
information as logged to .slurm_array_jobnums.
-v, --version show program's version number and exit
-d, --debug Create the directory and script, but do not submit
--showchangelog Show the changelog for this program.
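Those hold options make it easy to chain array jobs together. For instance, a two-step pipeline where the second batch of commands waits on the first might look something like this (a sketch; the commands files, job names, and module version here are made up for illustration):

SLURM_Array -c assemble_commands.txt -r assembly_step -l python/3.6
SLURM_Array -c annotate_commands.txt -r annotation_step --hold_names assembly_step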
It can also be used to run non-array jobs (though it will run them as an array of one command anyway). The cool thing is that you no longer have to wrap the command in quotes (unless you're doing funky things like using environment variables right on the command line, or you want to use | or >, which you'll have to escape). This means shell autocompletion will work!
echo runAssembly input.fasta -o assembly_output | SLURM_Array
If your command needs funky shell stuff, you'll have to make sure echo can print it properly, by escaping or by falling back to using quotes:
echo runAssembly input.fasta -o assembly_output \> log.txt | SLURM_Array
echo 'runAssembly input.fasta -o assembly_output > log.txt' | SLURM_Array
I'd like to have the thing read a config file called .SLURM_Array in $HOME so that the defaults for the options (--path, --queue, --memory, etc.) can be adjusted to minimize typing in day-to-day usage.
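A purely hypothetical sketch of what such a file might contain (this feature doesn't exist yet), assuming one default option per line:

--queue batch
--memory 8gb
--concurrency 100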