First, install dask-jobqueue:
```shell
conda install dask-jobqueue -c conda-forge
```
More information can be found in [the official docs](https://jobqueue.dask.org/en/latest/install.html).

In [1]:
from dask_jobqueue import SLURMCluster
cluster = SLURMCluster(
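    cores=8,        # total cores per SLURM job
    processes=4,    # Dask worker processes per job (8 / 4 = 2 threads each)
    memory='8g',    # total memory per job
    project="notchpeak-shared-short",  # SLURM account (-A); newer dask-jobqueue releases call this `account`
    queue="notchpeak-shared-short",    # SLURM partition (-p)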
)

This defines a Dask cluster object whose jobs each run 4 worker processes with 2 cores (threads) per process. Each job is submitted as a single SLURM task with 8 cores per task. The call above only defines the job; the cell below submits one such job, which uses all 8 cores. Note that notchpeak-shared-short allows at most 2 running jobs per user, so if this notebook is itself running in notchpeak-shared-short via Open OnDemand, we can start only one more job for the Dask workers.
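The mapping from the cluster arguments to per-process resources can be checked with a little arithmetic. This is a minimal sketch that assumes cores and memory are split evenly across the worker processes, which is consistent with the `--nthreads 2` and `--memory-limit 2.00GB` flags in the generated job script; the variable names are illustrative and not part of the dask-jobqueue API.

```python
# Values copied from the SLURMCluster(...) call above.
cores = 8        # total cores requested per SLURM job
processes = 4    # Dask worker processes per job
memory_gb = 8    # total memory per job, in GB

# Assumption: resources are divided evenly among the worker processes.
threads_per_process = cores // processes
memory_per_process_gb = memory_gb / processes

print(f"{processes} processes x {threads_per_process} threads, "
      f"{memory_per_process_gb:.2f} GB each")
```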

In [2]:
from distributed import Client
from dask import delayed

cluster.scale(1)          # submit one SLURM job running the Dask workers
client = Client(cluster)  # connect this notebook to the cluster's scheduler

We can inspect the job script that Dask generates, which helps in working out how SLURM tasks and CPUs map to Dask workers.

In [3]:
print(cluster.job_script())

#!/usr/bin/env bash

#SBATCH -J dask-worker
#SBATCH -p notchpeak-shared-short
#SBATCH -A notchpeak-shared-short
#SBATCH -n 1
#SBATCH --cpus-per-task=8
#SBATCH --mem=8G
#SBATCH -t 00:30:00

/uufs/chpc.utah.edu/common/home/u0101881/software/pkg/miniconda3/bin/python -m distributed.cli.dask_worker tcp://10.242.75.81:39872 --nthreads 2 --nprocs 4 --memory-limit 2.00GB --name name --nanny --death-timeout 60
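To look only at the SLURM resource request, the `#SBATCH` lines can be pulled out of the script text. The sketch below uses a hypothetical helper, `sbatch_directives`, written with the standard `re` module; it is not part of dask-jobqueue, and the example string simply repeats the header printed above.

```python
import re

def sbatch_directives(script):
    """Return the option text of every '#SBATCH' line, in order of appearance."""
    return re.findall(r"^#SBATCH\s+(.+?)\s*$", script, flags=re.MULTILINE)

# Header copied from the job script printed above.
example_script = """#!/usr/bin/env bash

#SBATCH -J dask-worker
#SBATCH -p notchpeak-shared-short
#SBATCH -A notchpeak-shared-short
#SBATCH -n 1
#SBATCH --cpus-per-task=8
#SBATCH --mem=8G
#SBATCH -t 00:30:00
"""

directives = sbatch_directives(example_script)
for directive in directives:
    print(directive)
```

With a live cluster, the same helper could be called as `sbatch_directives(cluster.job_script())`.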



In [4]:
client

Client: Scheduler: tcp://10.242.75.81:39872, Dashboard: http://10.242.75.81:8787/status
Cluster: Workers: 4, Cores: 8, Memory: 8.00 GB


Now we'll run the same embarrassingly parallel example, but using the SLURM job that Dask started. Note that the code is the same as in the local Dask run.

In [5]:
import time
import random

def costly_simulation(list_param):
    time.sleep(random.random())
    return sum(list_param)

In [6]:
import pandas as pd
import numpy as np

input_params = pd.DataFrame(np.random.random(size=(500, 4)),
                            columns=['param_a', 'param_b', 'param_c', 'param_d'])
input_params.head()

|   | param_a | param_b | param_c | param_d |
|---|---------|---------|---------|---------|
| 0 | 0.132246 | 0.250139 | 0.681363 | 0.568625 |
| 1 | 0.799677 | 0.811921 | 0.89177 | 0.85986 |
| 2 | 0.671534 | 0.989434 | 0.926673 | 0.404946 |
| 3 | 0.575825 | 0.621992 | 0.673732 | 0.778058 |
| 4 | 0.092248 | 0.5525 | 0.980881 | 0.629264 |


In [7]:
import dask
lazy_results = []

for parameters in input_params.values:
    lazy_result = dask.delayed(costly_simulation)(parameters)
    lazy_results.append(lazy_result)

futures = dask.persist(*lazy_results)  # trigger computation in the background

In [8]:
%time results = dask.compute(*futures)
results[:5]

CPU times: user 1.45 s, sys: 184 ms, total: 1.64 s
Wall time: 30.8 s


(1.6323734600887714,
 3.363227419861675,
 2.9925871955685546,
 2.6496080057618863,
 2.2548927646351955)
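As a rough sanity check on the wall time reported above, we can estimate how long the 500 tasks should take. This back-of-the-envelope sketch assumes each task sleeps 0.5 s on average (the mean of `random.random()`), that all 8 worker threads stay busy, and that scheduling overhead is negligible. Because `dask.persist` already started the work in the background before `%time` ran, the measured time can come out somewhat lower than the estimate.

```python
n_tasks = 500        # rows in input_params
mean_sleep_s = 0.5   # expected value of random.random(), uniform on [0, 1)
total_threads = 8    # 4 worker processes x 2 threads each

# Assumption: perfect load balancing and no scheduling overhead.
serial_estimate_s = n_tasks * mean_sleep_s
parallel_estimate_s = serial_estimate_s / total_threads

print(f"serial: ~{serial_estimate_s:.0f} s, "
      f"parallel on {total_threads} threads: ~{parallel_estimate_s:.2f} s")
```

The estimate of about 31 s is close to the measured 30.8 s, which suggests the single SLURM job kept all 8 threads busy.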

When we are done using Dask, we close the cluster, which cancels the SLURM job running the Dask workers.

In [9]:
cluster.close()