# Intermediate Slurm on Milton
### Delivered by WEHI Research Computing Platform

<pre>Edward Yang    Michael Milton    Julie Iskander</pre><br />

<img src="static/1200px-Slurm_logo.svg.png" alt="Slurm" width="100"/><img src="static/milton.png" alt="Milton Mascot" width="100"/><img src="static/WEHI_RGB_logo.png" alt="WEHI logo" width="200"/>

## [OPTIONAL SLIDE] Self Introductions!

### Me
* Civil Engineer by training
* HPC by interest
* Mostly code in Fortran, occasionally Python
* I use HPC to simulate fluid and granular flows

## How about you?

## Background
* We already ran an "intro to Slurm" workshop (recording on RCP website)
* More "advanced" features of Slurm were highly requested
* Both ITS and researchers would benefit
    * ITS will have fewer issues to address
    * researchers can accelerate their research <- High Priority for everyone!

### Target Audience
* You've submitted quite a few jobs via `sbatch`
* You're familiar with resource requests, such as:
    * using `--ntasks` and `--cpus-per-task`
    * using `--mem` and/or `--mem-per-cpu`
* You're wondering whether your jobs are utilizing resources efficiently
* You're wondering how to make your life easier when using `sbatch`

## Background cont.

My goal for the session is to teach you how to:
* get the status of the cluster
* get information about your jobs
* make use of `sbatch` scripting features
* run embarrassingly parallel tasks
* **bonus topic:** submit jobs from R, or submit Python or R scripts without a wrapper script

## Agenda for Today

12:00 - 12:30&emsp;<span style="color:blue">Introduction & Housekeeping</span>

12:30 - 13:00&emsp;Lunch

13:00 - 13:30&emsp;Understanding your jobs and the job queue

13:30 - 14:00&emsp;Profiling your jobs

14:00 - 14:30&emsp;Slurm scripting features

14:30 - 15:00&emsp;Embarrassingly parallel examples

15:00 - 15:30&emsp;R batchtools

15:30 - 16:00&emsp;

## Introduction and Housekeeping

### Format
Slides + live coding

Live coding will be on Milton, so make sure you're connected to WEHI's VPN or staff network, or use RAP:<br />
rap.wehi.edu.au

Please follow along to reinforce learning!

Questions:
* Put your hand up whenever you have a question or have an issue running things
* Questions in the chat are welcome and will be addressed by helpers

Material is available here:
/link

Feel free to download the notebook and follow along.

## Introduction and Housekeeping Cont.
### Expected understanding of
* <span style="font-size:1.2em;">Concept of "resources"</span><br />
`CPUs`<br />
`RAM/memory`<br />
`Nodes`<br /><br />
* <span style="font-size:1.2em;">Job submission commands</span><br />
`srun    # executes a command/script/binary across tasks`<br />
`salloc  # allocates resources to be used (interactively and/or via srun)`<br />
`sbatch  # submits a script for later execution on requested resources`<br /><br />
* <span style="font-size:1.2em;">resource request options</span><br />
`--ntasks=             # "tasks" recognised by srun`<br />
`--nodes=              # no. of nodes`<br />
`--ntasks-per-node=    # tasks per node`<br />
`--cpus-per-task=      # cpus per task`<br />
`--mem=                # memory required for entire job`<br />
`--mem-per-cpu=        # memory required for each CPU`<br />
`--gres=               # "general resource" (i.e. GPUs)`<br />
`--time=               # requested wall time`<br />

# LUNCH

## Understanding Your Jobs and the Job Queue

_knowledge is power_

### Building on the basics: `squeue`

`squeue` shows every job in the queue; passing `-u <username>` shows only `<username>`'s jobs.

```
$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           8501639 gpuq_inte sys/dash birkinsh  R      42:18      1 gpu-a10-n01
           8501778 interacti sys/dash whitfiel  R       1:28      1 sml-n01
           8498244 interacti       MM   yuchen  R   20:02:01      1 med-n02
           8498615 interacti sys/dash baldoni.  R   17:53:25      1 sml-n01
           8501737   regular bionix-m   bedo.j  R      19:18      1 sml-n06
           8501738   regular bionix-m   bedo.j  R      19:18      1 sml-n19
           8501739   regular bionix-m   bedo.j  R      19:18      1 sml-n14
           8501755   regular        R baldoni.  R      10:34      1 med-n03
           ...
```

Getting a bit more: `squeue --long` makes things more legible and adds the `TIME_LIMIT` column.
```
$ squeue --long
Tue Oct 18 09:35:35 2022
             JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
           8501639 gpuq_inte sys/dash birkinsh  RUNNING      48:35   8:00:00      1 gpu-a10-n01
           8501788 interacti sys/dash  kershaw  RUNNING       2:08   5:00:00      1 med-n03
           8501778 interacti sys/dash whitfiel  RUNNING       7:45   5:00:00      1 sml-n01
           8498244 interacti       MM   yuchen  RUNNING   20:08:18 1-00:00:00      1 med-n02
           8498615 interacti sys/dash baldoni.  RUNNING   17:59:42 1-00:00:00      1 sml-n01
           8255255      long       MQ   dite.t  PENDING       0:00 12-12:00:00      1 (QOSMaxJobsPerUserLimit)
           8255253      long       MQ   dite.t  RUNNING 9-09:13:41 12-12:00:00      1 sml-n21
...
```

```
squeue
```

But what if we want _even more_ information?

We have to make use of the formatting options!

```
$ squeue --Format <column1>,<column2>,...
```

Or set the same column list once through the `SQUEUE_FORMAT2` environment variable. Some useful columns:
```
# Resource related
NumCPUs
NumNodes
minmemory
tres-alloc

# time related
starttime
submittime
pendingtime
timelimit
timeleft
timeused

# Job Metadata
JobId
name
partition
priority
reasonlist
workdir
state
```

```
squeue --Format=jobid,name,partition,priority,reasonlist,state,tres-alloc:60,submittime:25,starttime:25,timelimit,timeleft
```
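Typing a long `--Format` string every time gets tedious. One option, sketched below, is to store it in the `SQUEUE_FORMAT2` environment variable, which `squeue` reads as its default for `--Format`. The particular columns and widths are only an illustration, and the guard around `squeue` is an addition so the snippet also runs on a machine without Slurm.

```shell
# Store a preferred column layout once (for example in ~/.bashrc).
# Field names are the same as for --Format; ":N" sets the column width.
export SQUEUE_FORMAT2="jobid:10,name:20,partition:12,state:10,timeleft:12,reasonlist:30"

# Any later squeue call picks the layout up automatically.
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER"
else
    echo "squeue not found: run this on Milton"
fi
```

Passing `--Format` on the command line still overrides the variable, so one-off queries are unaffected.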

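`sacct` reports accounting information for your jobs, including ones that have already finished. `-S` and `-E` set the start and end of the time window:
```
sacct -S <start time> -E <end time> -o jobid,jobname,alloctres,elapsed,end,start,exitcode,ncpus,nodelist,nnodes,submit
```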
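`sacct`'s `-S` and `-E` options each need a time value. A minimal sketch of one way to fill them in, assuming GNU `date` (standard on Linux) and a seven-day window chosen purely for illustration:

```shell
# Report on your own jobs from the last 7 days.
# sacct accepts dates as YYYY-MM-DD and the keyword "now".
start=$(date -d '7 days ago' +%F)

fields=jobid,jobname,alloctres,elapsed,start,end,exitcode
if command -v sacct >/dev/null 2>&1; then
    sacct -S "$start" -E now -o "$fields"
else
    # Fallback so the snippet also runs on a machine without Slurm
    echo "would run: sacct -S $start -E now -o $fields"
fi
```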

`sinfo` shows the status of the cluster
```
$ sinfo
PARTITION        AVAIL  TIMELIMIT  NODES  STATE NODELIST
interactive         up 1-00:00:00      5    mix med-n[02-03],sml-n[01-03]
interactive         up 1-00:00:00      1   idle med-n01
regular*            up 2-00:00:00     30    mix med-n[02-03,07,09,11-14,17],sml-n[02-05,07-20,22-24]
regular*            up 2-00:00:00      4  alloc med-n[15-16],sml-n[06,21]
regular*            up 2-00:00:00     21   idle lrg-n[02-04],med-n[04-06,08,10,18-30]
long                up 14-00:00:0     30    mix med-n[02-03,07,09,11-14,17],sml-n[02-05,07-20,22-24]
long                up 14-00:00:0      4  alloc med-n[15-16],sml-n[06,21]
long                up 14-00:00:0     18   idle med-n[04-06,08,10,18-30]
bigmem              up 2-00:00:00      2    mix med-n[02-03]
bigmem              up 2-00:00:00      3   idle lrg-n[01-02],med-n04
gpuq                up 2-00:00:00     12   idle gpu-a30-n[01-07],gpu-p100-n[01-05]
gpuq_interactive    up   12:00:00      1    mix gpu-a10-n01
gpuq_large          up 2-00:00:00      3   idle gpu-a100-n[01-03]
```

```
sview
```

```
sinfo -NO nodehost,partition,statecompact,available,cpusstate,allocmem,memory,gres,gresused
scontrol show job <job id>
```

## Basic Job Profiling

* `ssh <node>`
* `nvidia-smi`
* `htop` (and useful features)
* `seff`
* `dcgmstats`


## Sbatch Scripting Features

We're going to start with a simple R script submitted by a wrapper `sbatch` script:

```r
## matmul.rscript

print("starting the matmul R script!")
nrows = 1e3
print(paste0("elem: ", nrows, "*", nrows, " = ", nrows*nrows))

# generating matrices
M <- matrix(rnorm(nrows*nrows),nrow=nrows)
N <- matrix(rnorm(nrows*nrows),nrow=nrows)

# start matmul
start.time <- Sys.time()
invisible(M %*% N)
end.time <- Sys.time()

# Getting final time and writing to stdout
elapsed.time <- difftime(time1=end.time, time2=start.time, units="secs")
print(elapsed.time)
```

```bash
#!/bin/bash
# Example sbatch script running Rscript
# Does a matmul
# rev0
#SBATCH --mem=8G
#SBATCH --cpus-per-task=2
#SBATCH --time=1-
#SBATCH --nodes=1
#SBATCH --ntasks=1

# loading module for R
module load R/openBLAS/4.2.1

Rscript matmul.rscript
```

## Sbatch scripting features cont.

`sbatch` options extend beyond just specifying resources. There are ways you can utilise `sbatch` features to make your life easier or aid in automation:
* redirecting where your output and error messages go
* changing directory
* notifying you when jobs have started or ended
* submitting Python and R scripts
* making use of Slurm's environment variables
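As a small taste of the last item, here is a sketch of reading Slurm's environment variables inside a job script. `SLURM_JOB_ID` and `SLURM_CPUS_PER_TASK` are set by Slurm inside a job (the latter only when `--cpus-per-task` is requested). The fallback values after `:-` are an assumption added here so the snippet also runs outside a job.

```shell
#!/bin/bash
#SBATCH --cpus-per-task=2

# Inside a job Slurm sets these; outside a job the fallbacks apply.
job_id=${SLURM_JOB_ID:-not-in-a-job}
ncpus=${SLURM_CPUS_PER_TASK:-1}

echo "job id: $job_id"
echo "CPUs available to this task: $ncpus"

# Pass the allocation on to threaded libraries such as OpenBLAS,
# so they don't start more threads than the CPUs requested.
export OMP_NUM_THREADS=$ncpus
```

Reading the CPU count from the environment means the thread count follows the `#SBATCH` request automatically, instead of being hard-coded in two places.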

## Sbatch scripting features cont.

### a short aside on `stdout` and `stderr`
Linux has two main "channels" to send output messages to. One is "stdout" (standard out), and the other is "stderr" (standard error).

If you have ever used the `|`, `>`, or `>>` shell scripting features, then you've _redirected_ `stdout` somewhere else, e.g., to another command, a file, or the void (`/dev/null`).

```bash
$ ls dir-that-doesnt-exist
ls: cannot access dir-that-doesnt-exist: No such file or directory # this is a stderr output
```

```bash
$ ls ~
bin cache Desktop Downloads ... # this is a stdout output!
```
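The two channels can be sent to different places. A minimal sketch using the same `ls` idea as above: `>` captures `stdout`, `2>` captures `stderr`, and `2>&1` merges them. The file names and the temporary directory are chosen only for illustration.

```shell
# Work in a throwaway directory so nothing is overwritten.
tmp=$(mktemp -d)

# ls succeeds for "$tmp" and fails for the missing directory,
# so it writes one message to each channel.
# "|| true" ignores ls's non-zero exit status.
ls -d "$tmp" "$tmp/dir-that-doesnt-exist" > "$tmp/listing.out" 2> "$tmp/listing.err" || true

echo "--- stdout ---"; cat "$tmp/listing.out"
echo "--- stderr ---"; cat "$tmp/listing.err"

# "2>&1" sends stderr to wherever stdout is going: both end up in one file.
ls -d "$tmp" "$tmp/dir-that-doesnt-exist" > "$tmp/combined.out" 2>&1 || true
```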

### redirecting output and stderr
`sbatch` will automatically redirect `stdout` and `stderr` to a single file called `slurm-<jobid>.out`. But this may not be useful. Maybe you want to separate any "output" from "errors". This can be done with `--output` and `--error`.

```bash
#!/bin/bash
# Example sbatch script running Rscript
# Does a matmul
# rev1-stderrstdout
#SBATCH --mem=8G
#SBATCH --cpus-per-task=2
#SBATCH --time=1-
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --output=results/Rmatmul-%j.out # the %j is interpreted as the job id!
#SBATCH --error=logs-debug/Rmatmul-%j.err

# loading module for R
module load R/openBLAS/4.2.1

Rscript matmul.rscript
```
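One practical catch with the script above: Slurm does not create missing directories for `--output` or `--error`, so the job can fail at start-up if `results/` or `logs-debug/` don't exist yet. A minimal sketch of a submit step, assuming the script is saved as `matmul.sbatch` (a hypothetical file name):

```shell
# Create the folders named in --output and --error before submitting.
# -p makes this safe to repeat: existing folders are left untouched.
mkdir -p results logs-debug

# matmul.sbatch is a hypothetical name for the script above.
if command -v sbatch >/dev/null 2>&1 && [ -f matmul.sbatch ]; then
    sbatch matmul.sbatch
else
    echo "sbatch or matmul.sbatch not found: nothing submitted"
fi
```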

### changing directory
The `sbatch` option `--chdir` can make life a bit easier for many reasons:
* your script may be "far" from your data, e.g. in a separate scripts folder
* you may be processing data in different places

A typical approach would be to modify the `--output` and `--error` locations and add a `cd` to the script.

`--chdir` changes the working directory, _which includes where relative `stderr` and `stdout` paths go_.

This can be more concise and avoids having to change multiple paths.

```bash
#!/bin/bash
# Example sbatch script running Rscript
# Does a matmul
# rev2-chdir
#SBATCH --mem=8G
#SBATCH --cpus-per-task=2
#SBATCH --time=1-
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --output=Rmatmul-%j.out
#SBATCH --error=Rmatmul-%j.err
#SBATCH --chdir=/vast/scratch/yang.e/slurm-demo/test1

# loading module for R
module load R/openBLAS/4.2.1

Rscript matmul.rscript
```
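Options given on the `sbatch` command line override the matching `#SBATCH` lines in the script, so `--chdir` can also be set at submit time. Below is a sketch of submitting one script against several data folders. `BASE`, the folder names, and `matmul.sbatch` are hypothetical, and the loop only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Submit the same script once per data folder by overriding --chdir.
# BASE, the folder names and matmul.sbatch are hypothetical.
BASE=/vast/scratch/$USER/slurm-demo
DRY_RUN=${DRY_RUN:-1}   # set DRY_RUN=0 to really submit

submitted=0
for dir in test1 test2 test3; do
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: sbatch --chdir=$BASE/$dir matmul.sbatch"
    else
        sbatch --chdir="$BASE/$dir" matmul.sbatch
    fi
    submitted=$((submitted + 1))
done
echo "$submitted submissions prepared"
```

Because the working directory changes, relative paths inside the script, such as `matmul.rscript`, are resolved against the new directory, so each data folder needs its own copy or the script needs an absolute path.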