# SLURM: Advanced Batch Scripting

SLURM is a powerful workload manager and, in addition to having a cool name, is used by more than 70% of the systems on the [TOP500](https://top500.org/project/) list of the most powerful supercomputers.

### Example: Accounts, Partitions, and QOS

In [None]:
# IDEA:
#.  start with a batch-script example attempting to access 
#. a restricted partition; have the student modify the 
#. script so that it requests resources somewhere else, then run.
# 
#. NOTE: should request access to a GPU.
#. 

### Example: Job Steps

In [None]:
# IDEA:
#.  A single-task (`--ntasks 1`) batch-script which uses `srun` to 
#. launch job-steps, passing different `--output` and
#. `--job-name` values with each call.
#. 

### Example: Parallel Tasks (Serial)

In [None]:
# IDEA:
#.   batch-script that requests one node, 3 tasks, and 1 cpu per task.
#. moreover, it should launch more than one job-step, passing
#. different values of `-n` & `--job-name` to each.
#. 
#. NOTE: might use PushShift API  https://pushshift.io/api-parameters/
#.    or their archive of reddit comments https://files.pushshift.io/reddit/comments/
#.    to search for, or download different things
#. 

### Example: Parallel Tasks (Threaded)

In [None]:
# IDEA:
#.    batch-script requests 2 nodes, 1 task-per-node, with 
#.  2 cpus-per-task. Makes one call to `srun`. Calls 
#.  a multi-threaded python script (which will be provided).
#
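
As a placeholder until the script for this example is provided, here is a minimal sketch of what a thread-pool Python script launched by `srun` could look like. It sizes its pool from `SLURM_CPUS_PER_TASK`, which Slurm sets when `--cpus-per-task` is requested. The function names (`worker_count`, `digest`, `digest_all`) and the hashing workload are illustrative assumptions, not part of the original example.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def worker_count(default=1):
    """Threads to start: SLURM_CPUS_PER_TASK if set to a positive integer."""
    value = os.environ.get("SLURM_CPUS_PER_TASK", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    return default  # running outside Slurm, or --cpus-per-task not given


def digest(block):
    """Stand-in workload (hypothetical): SHA-256 of one block of bytes."""
    return hashlib.sha256(block).hexdigest()


def digest_all(blocks, workers=None):
    """Hash every block using a pool of `workers` threads, keeping input order."""
    workers = workers or worker_count()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, blocks))


if __name__ == "__main__":
    # eight 1 MB blocks of made-up data
    blocks = [bytes([i]) * 1_000_000 for i in range(8)]
    for index, value in enumerate(digest_all(blocks)):
        print(index, value[:16], flush=True)
```

With 2 nodes and 1 task per node, a single `srun` call would start one copy of this script on each node, and each copy would use 2 threads.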

### Example: Job Arrays

In [None]:
# VAGUE THOUGHT: Something using Julia.

### Example: Multiple Stages 


#### Stage 1: `example-four-stage1.sh`

In [None]:
%%file "example-four-stage1.sh"
#!/bin/bash
### Job Parameters:
#SBATCH --job-name "Stage 1"             # job name
#SBATCH --output   "logs/stage1.%j.log"  # output file pattern

### Script To Execute:
# create virtual environment
python3 -m venv "env"

# activate environment
source "env/bin/activate"

# install packages into environment
python3 -m pip install numpy

# With python environment setup,
#    ...  we request "stage 2" be scheduled
sbatch "example-four-stage2.sh"

#### Stage 2: `example-four-stage2.sh`

In [None]:
%%file "example-four-stage2.sh"
#!/bin/bash
### Job Parameters:
# Request Job-Array
#SBATCH --array 1-20%10 # the array has 20 sub-jobs (labeled 1 to 20) 
                        # .. with at most 10 sub-jobs running at 
                        # .. any given point.
# Generic Info for "sub-job"
#SBATCH --job-name "Stage 2"               # sub-job name
#SBATCH --output   "logs/stage2.%A.%a.log" # output file pattern
                                           # .. %A -> array-job-id
                                           # .. %a -> array-task-id 
# Resources To Request For Each "sub-job"
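#SBATCH --ntasks 1          # one task per sub-job
#SBATCH --cpus-per-task 1   # one CPU for that task
#SBATCH --mem-per-cpu 500   # memory per CPU, in MB (the default unit)
#SBATCH --time 00:45:00     # wall-clock limit (hh:mm:ss)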

### Script To Execute:
source "env/bin/activate"         # activate environment from "stage 1"
python3 "example-four.py"         # execute python script
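
To make the `--array` specification and the `%A`/`%a` output pattern concrete, the following sketch expands a specification into task IDs and fills in a log-file name. It handles only ranges, comma lists, `:step`, and the `%limit` suffix. The helper names (`expand_array_spec`, `log_name`) and the job ID `12345` are hypothetical, and Slurm does this expansion itself; the code is only for illustration.

```python
def expand_array_spec(spec):
    """Return (task_ids, max_running) for a spec such as "1-20%10"."""
    limit = None
    if "%" in spec:
        # the part after '%' caps how many sub-jobs run at the same time
        spec, limit_text = spec.split("%", 1)
        limit = int(limit_text)
    ids = []
    for part in spec.split(","):
        step = 1
        if ":" in part:
            part, step_text = part.split(":", 1)
            step = int(step_text)
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
        else:
            first = last = int(part)
        ids.extend(range(first, last + 1, step))
    return ids, limit


def log_name(pattern, array_job_id, task_id):
    """Substitute %A (array job ID) and %a (array task ID) in a pattern."""
    return pattern.replace("%A", str(array_job_id)).replace("%a", str(task_id))


if __name__ == "__main__":
    ids, limit = expand_array_spec("1-20%10")
    print(len(ids), "sub-jobs, at most", limit, "running at once")
    # 12345 is a made-up array job ID
    print(log_name("logs/stage2.%A.%a.log", 12345, ids[0]))
```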

#### Python Script: `example-four.py`

In [None]:
%%file "example-four.py"
"""
 File: `example-four.py`
 Synopsis:
    Estimate and print the condition number of a single random
    perturbation of the nxn identity matrix (where n is fixed).
"""
from numpy import eye
from numpy.random import uniform
from numpy.linalg import cond

n = 1000
eps = 0.01

I = eye(n) # nxn identity matrix
X = uniform(size=[n,n]) # nxn matrix with entries drawn uniformly from [0, 1)

# generate `Z`, a random perturbation of `I`
Z = I + eps*uniform()*X 

# estimate condition number of `Z`
c = cond(Z)

# print results
print(f'{c}', flush=True)
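
Every sub-job of the array runs this same script, so the runs differ only through the random draws. If reproducible runs are wanted, one option is to seed each sub-job from `SLURM_ARRAY_TASK_ID`, which Slurm sets inside array jobs. The sketch below shows the idea with the standard-library `random` module so that it runs without NumPy. The function names are hypothetical, and the script above does not do this.

```python
import os
import random


def array_task_id(default=0):
    """Array task ID of this sub-job, or `default` outside an array job."""
    value = os.environ.get("SLURM_ARRAY_TASK_ID", "")
    return int(value) if value.isdigit() else default


def perturbation_scale(task_id, eps=0.01):
    """Scalar factor eps*u with u uniform on [0, 1), seeded by the task ID."""
    rng = random.Random(task_id)  # same task ID -> same draw on every run
    return eps * rng.random()


if __name__ == "__main__":
    task = array_task_id()
    print(task, perturbation_scale(task), flush=True)
```

The NumPy counterpart would be to create a generator with `numpy.random.default_rng(task_id)` and draw from that generator instead of the module-level `uniform`.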

#### Submitting & Results

To see what's happening, switch to a terminal and run the following script.

In [None]:
%%file "watch-queue.sh"
#!/bin/bash

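# make sure the directory named in the `--output` patterns exists
# (Slurm does not create it)
mkdir -p "logs"

# submit seed job (stage1)
sbatch "example-four-stage1.sh"

# watch queue (job names must match the `--job-name` values above)
ii=0 
while [ $ii -lt 1000 ]; do
    squeue --user "$USER" --name "Stage 1","Stage 2"
    sleep 0.7 && clear
    ii=$((ii+1))
done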
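
Once the array has finished, each `logs/stage2.<array-job-id>.<task-id>.log` file should hold the single number printed by `example-four.py`. A minimal sketch for collecting those numbers follows. The function names and the choice of summary statistics are assumptions, and any line that does not parse as a number is skipped.

```python
import glob
import os
import statistics


def read_condition_numbers(log_dir="logs", pattern="stage2.*.log"):
    """Collect every numeric line from the stage 2 log files."""
    values = []
    for path in sorted(glob.glob(os.path.join(log_dir, pattern))):
        with open(path) as handle:
            for line in handle:
                try:
                    values.append(float(line.strip()))
                except ValueError:
                    continue  # blank line or non-numeric output
    return values


def summarize(values):
    """Count, minimum, median, and maximum of a non-empty list of numbers."""
    return {
        "count": len(values),
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
    }


if __name__ == "__main__":
    found = read_condition_numbers()
    if found:
        print(summarize(found))
    else:
        print("no stage 2 results found yet")
```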

### Example 5: Heterogeneous Job

In [None]:
# IDEA:
#.  request a two-part job, one portion is the server
#. the other portion is a client. Start both of them
#. using one `srun` call. 
#.    SEE EXAMPLES HERE: https://slurm.schedmd.com/heterogeneous_jobs.html 
#. 