
Running Scripts

Neal W Morton edited this page Apr 10, 2019 · 31 revisions

Scripts in FAT are designed to do the minimum amount of processing; for example, prep_bold_run.sh only processes a single run. This allows you to run processing in whatever way makes the most sense for you, and makes it possible to run some operations dramatically faster by running different runs or subjects in parallel.

Utilities for Submitting Jobs to the Cluster

  • launch - submit a command or a set of commands to run in parallel
  • ezlaunch - run launch, saving out everything (run script, launch script, job output) in a standard way; also automatically sets the job name to make sure it is unique
  • slaunch - automatically generate a set of commands for different subjects, and submit them
  • rlaunch - automatically generate a set of commands for a set of subject/run combinations, and submit them

If you have a script that you want to run for a set of subjects, you can generally submit a job to run it on all subjects in parallel using a single line of code at the terminal, using slaunch or rlaunch. To illustrate how this works, we'll first go through how you can manually create a script to run multiple subjects and runs, and then show how to use rlaunch.

Creating and Submitting a Commands File

Using launch, it's possible to run multiple commands in parallel. First, you must create a file that contains each of the commands, with one line per command. If you create this using a text editor, note that every line must end with a "newline" character (i.e., hit enter after entering the last line); otherwise, the last command will not be recognized.
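One way to guarantee that trailing newline is to build the file from a script with printf rather than a text editor. This is a hypothetical standalone sketch (the file name and commands are made up for illustration):

```shell
# Build a commands file with one command per line.
# printf '%s\n' guarantees every line, including the last, ends with a newline.
jobfile=$(mktemp)
: > "$jobfile"                      # start from an empty file
for run in study_1 study_2; do
    printf 'echo processing %s\n' "$run" >> "$jobfile"
done
```

Because every line is written with an explicit `\n`, the last command is always terminated and will be recognized.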

Then you can use launch to run all of those commands in parallel on the cluster. This example script submits a job that runs prep_bold_run.sh for three subjects, each with six functional runs.

#!/bin/bash
mkdir -p $WORK/launchscripts
jobfile=$WORK/launchscripts/myjob.sh
if [ -f "$jobfile" ]; then
    rm "$jobfile"
fi

for subject in bender_02 bender_04 bender_05; do
    for run in study_1 study_2 study_3 study_4 study_5 study_6; do
        echo "prep_bold_run.sh $WORK/bender/$subject/BOLD/$run" >> "$jobfile"
    done
done
chmod +x "$jobfile"
launch -s "$jobfile" -N 3 -n 18 -a 4 -r 01:00:00

Save the above to a script, say prep_bold_all_runs.sh, in a directory (we'll call it $scriptdir) and run chmod +x $scriptdir/prep_bold_all_runs.sh to make it an executable script. Then, if you haven't already, run export PATH=$PATH:$scriptdir to add the directory to your path. Finally, run it by just typing prep_bold_all_runs.sh. This should submit a job to the cluster, which you can check by entering squeue -u $USER.

Run launch -h to see a full list of options for launch. For example, -m email@whatever.com will send an email when the job finishes.

Organizing job information

By default, launch will save output to your current directory and will automatically delete the job submission script after it has run. ezlaunch runs jobs the same way, but saves everything out in a standard format, with separate files for the job commands, the job submission script, and the job output, all placed within a BATCHDIR directory that you define beforehand. For example:

export BATCHDIR=$WORK/bender/batch/launchscripts # can place this command in your .bashrc file
ezlaunch -s "$jobfile" -N 3 -n 18 -a 4 -r 01:00:00

After the job runs, you'll have the following files:

$BATCHDIR/Job1.sh # script with commands to run
$BATCHDIR/Job1.slurm # script with job submission options (e.g. max time, number of nodes)
$BATCHDIR/Job1.out # output from the script

By default, the base job name will be "Job" and the number at the end will be incremented automatically. Use "-J myjobname" to change the base name to "myjobname".
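The incrementing behavior can be pictured with a short sketch. This is not ezlaunch's actual code, just one way to find the next free number for a base name:

```shell
# Find the first JobN whose script file doesn't exist yet.
dir=$(mktemp -d)
touch "$dir/Job1.sh" "$dir/Job2.sh"    # pretend two jobs were submitted before
base=Job
i=1
while [ -e "$dir/${base}${i}.sh" ]; do
    i=$((i + 1))
done
jobname="${base}${i}"
echo "$jobname"
```

With Job1 and Job2 already present, the next job would be named Job3, so earlier job files are never overwritten.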

Processing multiple runs with rlaunch

You can do the same thing as the above script more quickly using rlaunch. rlaunch is similar to ezlaunch in that it automatically organizes job information for you. First, define two variables with the list of subjects and the list of runs you want to process, with colons separating different subjects/runs:

SUBJIDS=bender_02:bender_04:bender_05
STUDYRUNS=study_1:study_2:study_3:study_4:study_5:study_6

These commands can be put in your $HOME/.bashrc file, so that they will be defined every time you login to Lonestar without you having to type them again. Next, use the -t option with rlaunch to test out the command you want to run for each subject/run combination:

rlaunch -t "prep_bold_run.sh $WORK/bender/{s}/BOLD/{r}" $SUBJIDS $STUDYRUNS

This should display all the commands to be run, without actually submitting anything:

prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_02/BOLD/study_1
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_02/BOLD/study_2
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_02/BOLD/study_3
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_02/BOLD/study_4
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_02/BOLD/study_5
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_02/BOLD/study_6
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_04/BOLD/study_1
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_04/BOLD/study_2
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_04/BOLD/study_3
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_04/BOLD/study_4
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_04/BOLD/study_5
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_04/BOLD/study_6
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_05/BOLD/study_1
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_05/BOLD/study_2
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_05/BOLD/study_3
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_05/BOLD/study_4
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_05/BOLD/study_5
prep_bold_run.sh /work/03206/mortonne/lonestar/bender/bender_05/BOLD/study_6

Every place in the command string with {s} will be replaced by the subject code, and every {r} will be replaced by the run code. Every combination of them will be run.
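As a sketch of what this expansion amounts to (not rlaunch's actual implementation), you can split the colon-separated lists into bash arrays and substitute the placeholders yourself:

```shell
# Expand a {s}/{r} template into one command per subject/run combination.
SUBJIDS=bender_02:bender_04
STUDYRUNS=study_1:study_2:study_3
template='prep_bold_run.sh $WORK/bender/{s}/BOLD/{r}'
IFS=':' read -r -a subjects <<< "$SUBJIDS"   # IFS changes only for this read
IFS=':' read -r -a runs <<< "$STUDYRUNS"
cmds=()
for s in "${subjects[@]}"; do
    for r in "${runs[@]}"; do
        cmd=${template//'{s}'/$s}            # replace every {s}
        cmds+=("${cmd//'{r}'/$r}")           # replace every {r}
    done
done
echo "${#cmds[@]} commands generated"
```

Two subjects crossed with three runs yields six commands, mirroring how the full lists above produce 18.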

To actually submit those commands to be run in parallel, remove the -t (test) option, and add some settings to specify how the job should be launched:

rlaunch "prep_bold_run.sh $WORK/bender/{s}/BOLD/{r}" $SUBJIDS $STUDYRUNS -N 3 -n 18 -a 4 -r 01:00:00

This will run all commands, spread over 3 nodes, with 6 processes on each node. Each process will use 4 threads for running ANTS functions. This way, all 24 cores on each of the three nodes will be used, and all commands will run in parallel. The job will time out after 1 hour. You can always increase this limit, though jobs with longer max times may take longer to make it through the job queue and start running. Depending on how long your runs are and what their resolution is, you may need to tweak your settings to avoid running out of memory on any of the nodes.
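The arithmetic behind those numbers can be checked directly (assuming, as in the text, that -N is nodes, -n is total processes, and -a is threads per process):

```shell
# How -N 3 -n 18 -a 4 fills the nodes.
nodes=3
total_tasks=18
threads_per_task=4
tasks_per_node=$((total_tasks / nodes))                # 18 / 3
cores_per_node=$((tasks_per_node * threads_per_task))  # tasks x threads
echo "$tasks_per_node tasks/node, $cores_per_node cores/node"
```

If your commands need more memory per process, reducing -n (fewer processes per node) leaves cores idle but gives each process a larger share of the node's memory.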

As with ezlaunch, the commands, job settings, and job output will by default be saved in the current directory as JobX.sh, JobX.slurm, and JobX.out, respectively. The "X" will be an automatically assigned serial number, so the 100th job you submit will be called Job100.sh. You can change the job name using the -J flag.

You can set a variable called BATCHDIR to control where things are saved out:

export BATCHDIR=$WORK/batch/launchscripts

You can put this in your $HOME/.bashrc file so BATCHDIR will always be set when you log in. Then all of your job information will always be saved to the same directory.

Checking for File Dependencies

Using rlaunch, it's also possible to check for a file dependency for each subject/run combination, and only run the command if that exists. For example, to run prep_bold_run.sh, you must have a file called bold.nii.gz in each run directory. To check for that, use the -f flag (here, we use the -t flag to just print the commands that would be run):

rlaunch -t -f $WORK/bender/{s}/BOLD/{r}/bold.nii.gz "prep_bold_run.sh $WORK/bender/{s}/BOLD/{r}" $SUBJIDS $STUDYRUNS
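The effect of the -f check can be sketched as follows (a made-up directory layout, not rlaunch's internals): commands are only generated for runs whose input file exists.

```shell
# Keep only commands whose required input file is present.
base=$(mktemp -d)
mkdir -p "$base/study_1" "$base/study_2"
touch "$base/study_1/bold.nii.gz"      # only study_1 has its input file
cmds=()
for run in study_1 study_2; do
    if [ -f "$base/$run/bold.nii.gz" ]; then
        cmds+=("prep_bold_run.sh $base/$run")
    fi
done
echo "${#cmds[@]} command(s) kept"
```

Here only the study_1 command survives, so subjects or runs missing raw data are silently skipped instead of producing failed jobs.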

Re-running Commands that Failed

Clusters can be a bit unpredictable, and sometimes some commands will run while others fail. This might be due to running out of memory partway through, or running past the max time. Ideally, we want to re-run only the commands that failed, without redoing the commands that ran okay the first time. We can often do that by checking for an output file, and only running the command if that file does not exist. You need a file that is generated as the last step of the job; set that with the -n option:

rlaunch -t -n $WORK/bender/{s}/BOLD/{r}/QA/QA_report.pdf "prep_bold_run.sh $WORK/bender/{s}/BOLD/{r}" $SUBJIDS $STUDYRUNS

This checks for the QA_report.pdf file that is created as the last step of prep_bold_run.sh, and only runs the command if it does not exist.
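The skip decision boils down to a simple existence test. A minimal sketch, with a stand-in path instead of the real QA report location:

```shell
# Run a command only while its final output file is missing.
outfile=$(mktemp -d)/QA_report.pdf     # stands in for the QA report path
if [ -f "$outfile" ]; then decision=skip; else decision=run; fi
first=$decision                        # output missing, so the command runs
touch "$outfile"                       # simulate a successful completed run
if [ -f "$outfile" ]; then decision=skip; else decision=run; fi
echo "first pass: $first, after success: $decision"
```

Because the check targets a file created at the very end of the pipeline, a job that died partway through leaves no output file and will be picked up again on the next submission.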

Running Subjects in Parallel

If you only need to run one command per subject, you can use slaunch. It works exactly the same as rlaunch, with a few differences:

  • You only need to specify a list of subjects, without a list of runs.
  • You can use just "{}" in the command string to indicate where the subject ID goes, instead of "{s}" as in rlaunch.
  • If "{r}" appears anywhere in the command string, it will not be replaced like it is in rlaunch.

For example, to run FreeSurfer on a list of subjects specified in SUBJIDS:

slaunch "run_freesurfer.sh {} 12" $SUBJIDS -N 3 -n 3 -r 08:00:00
