# Dask jobqueue example for JUWELS at JSC
This notebook covers how to
* add the JUWELS-specific Dask jobqueue configuration
* get an overview of the available JUWELS compute node resources
* specify the batch queue and project budget name
* open, scale, and close a default jobqueue cluster
* run an example calculation on larger-than-memory data

In [1]:
import dask, dask_jobqueue, os
import dask.distributed as dask_distributed

## Load jobqueue configuration defaults

In [2]:
additional_config = dask.config.collect(paths=['.']) # look up further Dask configurations in local directory
dask.config.update(dask.config.config, additional_config, priority='new');

In [3]:
dask.config.get('jobqueue.juwels-jobqueue-config')

{'cores': 96,
 'memory': '86864MiB',
 'processes': 1,
 'local-directory': None,
 'death-timeout': 60,
 'extra': [],
 'interface': 'ib0',
 'shebang': '#!/usr/bin/env bash',
 'walltime': '00:15:00',
 'log-directory': 'dask-jobqueue-logs',
 'name': 'dask-worker',
 'env-extra': ['export DASK_DISTRIBUTED__WORKER__MEMORY__SPILL=False',
  'export DASK_DISTRIBUTED__WORKER__MEMORY__TARGET=False',
  'export DASK_DISTRIBUTED__WORKER__MEMORY__PAUSE=0.8',
  'export DASK_DISTRIBUTED__WORKER__MEMORY__TERMINATE=0.9'],
 'project': None,
 'queue': None,
 'job-cpu': None,
 'job-mem': None,
 'job-extra': []}

## Inspect cluster utilization

The following command lists the idle nodes in each batch partition. This is only a rough guide: the scheduler reserves compute nodes for jobs that are about to start but still reports these reserved nodes as "idle", so the numbers below do not necessarily reflect how many compute nodes are readily usable.

In [4]:
%%bash
sinfo -t idle --format="%9P %.5a %.5D %.5t"

PARTITION AVAIL NODES STATE
batch*       up   279  idle
devel        up    19  idle
mem192       up    90  idle
esm          up     6  idle
large      down   279  idle
gpus         up     0   n/a
develgpus    up     8  idle
booster      up   345  idle
develboos    up     7  idle
largeboos    up   345  idle
maintclus    up   312  idle
maintboos    up   352  idle
maint        up   664  idle
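
To narrow down the list by hand or in a script, the `sinfo` table can be filtered for partitions that are up and report enough idle nodes. The following is a minimal sketch, assuming the four-column format requested above; the helper name `usable_partitions` and the abridged sample text are illustrative and not part of the original notebook.

```python
# Abridged sample copied from the `sinfo` output above.
SINFO_OUTPUT = """\
PARTITION AVAIL NODES STATE
batch*       up   279  idle
devel        up    19  idle
mem192       up    90  idle
esm          up     6  idle
large      down   279  idle
gpus         up     0   n/a
"""


def usable_partitions(sinfo_text, min_nodes):
    """Return {partition: idle_nodes} for partitions that are up and
    report at least `min_nodes` idle nodes (hypothetical helper)."""
    result = {}
    for line in sinfo_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) != 4:
            continue  # ignore blank or malformed lines
        partition, avail, nodes, state = fields
        if avail == "up" and state == "idle" and int(nodes) >= min_nodes:
            # A trailing '*' marks the default partition; drop it from the name.
            result[partition.rstrip("*")] = int(nodes)
    return result


candidates = usable_partitions(SINFO_OUTPUT, min_nodes=3)
print(candidates)
```

As noted above, "idle" nodes may already be reserved, so this filter only yields candidates worth testing further.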


A more accurate picture of which batch partition to choose for the Dask jobqueue cluster can be obtained by submitting `--test-only` requests and comparing the estimated start times. If the reported start time is within a few seconds of the current time, a Dask cluster as specified in the `test_jobqueue_cluster` variable can be expected to start up almost immediately. Note that on JUWELS `--nodes 3` corresponds to the number of Dask jobqueue workers that would be requested by `cluster.scale(jobs=3)`. Also note that the scheduler on JUWELS has a backfilling mechanism, so Dask jobqueue workers might start up within a few minutes even if the `--test-only` requests report a later start time.

In [5]:
%%bash

test_jobqueue_cluster='--nodes 3 --time 00:15:00 --account esmtst --cpus-per-task=96 --mem=85G'

date && scontrol show partition | grep PartitionName | cut -f2 -d"=" | \
xargs -I {} bash -c "sbatch --test-only ${test_jobqueue_cluster} --partition {} --wrap 'echo $HOST'"
printf "" # prevent display of xargs non-zero exit codes...

Thu Jan  7 17:10:37 CET 2021


sbatch: Job 3256240 to start at 2021-01-07T17:57:38 using 288 processors on nodes jwc03n[130-132] in partition batch
sbatch: Job 3256241 to start at 2021-01-07T17:10:38 using 288 processors on nodes jwc00n[016-018] in partition devel
sbatch: Job 3256242 to start at 2021-01-07T17:10:38 using 288 processors on nodes jwc08n[075-077] in partition mem192
sbatch: Job 3256243 to start at 2021-01-07T17:10:38 using 288 processors on nodes jwc00n[000,003,006] in partition esm
sbatch: Job 3256244 to start at 2021-01-07T17:10:39 using 288 processors on nodes jwc03n[130-132] in partition large
allocation failure: Invalid generic resource (gres) specification
allocation failure: Invalid generic resource (gres) specification
allocation failure: Invalid generic resource (gres) specification
allocation failure: Invalid generic resource (gres) specification
allocation failure: Invalid generic resource (gres) specification
allocation failure: Invalid account or account/partition combination specified
all
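
The start estimates are easier to compare as waiting times. The following is a minimal sketch that parses the `sbatch --test-only` lines shown above with a regular expression and ranks the partitions by estimated wait. The helper name `estimated_waits` and the regular expression are illustrative assumptions based only on the output format seen here; lines without a start estimate (the allocation failures) are skipped.

```python
import re
from datetime import datetime

# Lines copied from the `sbatch --test-only` output above.
SBATCH_OUTPUT = """\
sbatch: Job 3256240 to start at 2021-01-07T17:57:38 using 288 processors on nodes jwc03n[130-132] in partition batch
sbatch: Job 3256241 to start at 2021-01-07T17:10:38 using 288 processors on nodes jwc00n[016-018] in partition devel
sbatch: Job 3256242 to start at 2021-01-07T17:10:38 using 288 processors on nodes jwc08n[075-077] in partition mem192
sbatch: Job 3256243 to start at 2021-01-07T17:10:38 using 288 processors on nodes jwc00n[000,003,006] in partition esm
sbatch: Job 3256244 to start at 2021-01-07T17:10:39 using 288 processors on nodes jwc03n[130-132] in partition large
allocation failure: Invalid generic resource (gres) specification
"""

# Captures the ISO start time and the partition name of one estimate line.
START_LINE = re.compile(r"to start at (\S+) using .* in partition (\S+)")


def estimated_waits(sbatch_text, now):
    """Map partition name -> estimated wait in seconds (hypothetical helper)."""
    waits = {}
    for line in sbatch_text.splitlines():
        match = START_LINE.search(line)
        if match:
            start = datetime.fromisoformat(match.group(1))
            waits[match.group(2)] = (start - now).total_seconds()
    return waits


# Time printed by `date` in the cell above (time zone dropped for simplicity).
now = datetime(2021, 1, 7, 17, 10, 37)
waits = estimated_waits(SBATCH_OUTPUT, now)
for partition, seconds in sorted(waits.items(), key=lambda item: item[1]):
    print(f"{partition:8s} {seconds:6.0f} s")
```

For the sample above, `devel`, `mem192`, and `esm` are estimated to start within a second or two, whereas `batch` would wait about 47 minutes, subject to the backfilling caveat mentioned earlier.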

## Set up jobqueue cluster ...

In [6]:
jobqueue_cluster = dask_jobqueue.SLURMCluster(
    config_name='juwels-jobqueue-config',
    project='esmtst', # specify budget name associated with your project
    queue='devel', # choose batch partition by available resources
)

In [7]:
print(jobqueue_cluster.job_script())

#!/usr/bin/env bash

#SBATCH -J dask-worker
#SBATCH -e dask_jobqueue_logs/dask-worker-%J.err
#SBATCH -o dask_jobqueue_logs/dask-worker-%J.out
#SBATCH -p devel
#SBATCH -A esmtst
#SBATCH -n 1
#SBATCH --cpus-per-task=96
#SBATCH --mem=85G
#SBATCH -t 00:15:00
export DASK_DISTRIBUTED__WORKER__MEMORY__SPILL=False
export DASK_DISTRIBUTED__WORKER__MEMORY__TARGET=False
export DASK_DISTRIBUTED__WORKER__MEMORY__PAUSE=0.8
export DASK_DISTRIBUTED__WORKER__MEMORY__TERMINATE=0.9
/p/project/cesmtst/hoeflich1/miniconda3/bin/python -m distributed.cli.dask_worker tcp://10.13.0.158:35955 --nthreads 96 --memory-limit 91.08GB --name dummy-name --nanny --death-timeout 60 --interface ib0 --protocol tcp://
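
The memory figures in the job script can be traced back to the `'memory': '86864MiB'` entry of the configuration. The following sketch redoes that arithmetic with plain Python. It assumes that the `--mem` value is the configured memory rounded up to whole GiB and that the worker's `--memory-limit` is the same amount expressed in decimal gigabytes, which is consistent with the numbers printed above but is not taken from the dask-jobqueue source. The pause and terminate fractions are the ones exported in the job script.

```python
import math

MIB = 2**20  # bytes per mebibyte
GIB = 2**30  # bytes per gibibyte

memory_mib = 86864        # 'memory': '86864MiB' from the configuration
pause_fraction = 0.8      # DASK_DISTRIBUTED__WORKER__MEMORY__PAUSE
terminate_fraction = 0.9  # DASK_DISTRIBUTED__WORKER__MEMORY__TERMINATE

memory_bytes = memory_mib * MIB

# Assumption: the batch request is rounded up to whole GiB.
slurm_mem_gib = math.ceil(memory_bytes / GIB)

# The worker limit is shown in decimal gigabytes (1 GB = 1e9 bytes).
memory_limit_gb = memory_bytes / 1e9

print(f"#SBATCH --mem={slurm_mem_gib}G")
print(f"--memory-limit {memory_limit_gb:.2f}GB")
print(f"worker pauses at     {pause_fraction * memory_limit_gb:.2f} GB")
print(f"worker terminates at {terminate_fraction * memory_limit_gb:.2f} GB")
```

With spilling and the target threshold disabled in the job script, the pause and terminate fractions are the only memory thresholds that remain in effect on each worker.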



## ... and the client process

In [8]:
client = dask_distributed.Client(jobqueue_cluster)

## Start jobqueue workers

In [9]:
jobqueue_cluster.scale(jobs=3)

In [13]:
!squeue -u hoeflich1

             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           3256247     devel dask-wor hoeflich  R       0:11      1 jwc00n004
           3256248     devel dask-wor hoeflich  R       0:11      1 jwc00n005
           3256246     devel dask-wor hoeflich  R       0:15      1 jwc00n002


In [15]:
client

Client
  Scheduler: tcp://10.13.0.158:35955
  Dashboard: http://10.13.0.158:8787/status
Cluster
  Workers: 3
  Cores: 288
  Memory: 273.24 GB


## Do a calculation on larger-than-memory data

In [16]:
import dask.array as da

In [17]:
fake_data = da.random.uniform(0, 1, size=(365, 10_000, 10_000), chunks=(365, 500, 500)) # problem-specific chunking; integer sizes avoid float shapes
fake_data

         Array                  Chunk
Bytes    292.00 GB              730.00 MB
Shape    (365, 10000, 10000)    (365, 500, 500)
Count    400 Tasks              400 Chunks
Type     float64                numpy.ndarray
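
The sizes reported for the array follow directly from its shape, chunking, and data type. The sketch below reproduces them with plain Python and compares the total against the cluster memory reported by the client above (273.24 GB). The variable names are illustrative only.

```python
import math

shape = (365, 10_000, 10_000)
chunks = (365, 500, 500)
itemsize = 8  # bytes per float64 element

array_bytes = math.prod(shape) * itemsize   # whole array
chunk_bytes = math.prod(chunks) * itemsize  # one chunk
n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))

# Reducing over axis 0 leaves a single (10000, 10000) float64 array.
result_bytes = shape[1] * shape[2] * itemsize

# Total worker memory as reported by the client for 3 workers.
cluster_memory_bytes = 273.24e9

print(f"array:  {array_bytes / 1e9:.2f} GB in {n_chunks} chunks")
print(f"chunk:  {chunk_bytes / 1e6:.2f} MB")
print(f"result: {result_bytes / 1e6:.2f} MB")
print("larger than cluster memory:", array_bytes > cluster_memory_bytes)
```

The full array would not fit into the combined memory of the three workers, but each chunk is small, and a reduction over the first axis only needs to keep the much smaller per-chunk results.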


In [18]:
import time

In [19]:
start_time = time.time()
fake_data.mean(axis=0).compute()
elapsed = time.time() - start_time
print('elapsed time ',elapsed,' in seconds')

elapsed time  5.86717677116394  in seconds


## Close jobqueue cluster and client process

In [20]:
!squeue -u hoeflich1

             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           3256247     devel dask-wor hoeflich  R       0:42      1 jwc00n004
           3256248     devel dask-wor hoeflich  R       0:42      1 jwc00n005
           3256246     devel dask-wor hoeflich  R       0:46      1 jwc00n002


In [21]:
jobqueue_cluster.close()
client.close()

In [23]:
!squeue -u hoeflich1

             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)


## Conda environment

In [24]:
!conda list --explicit

# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
@EXPLICIT
https://repo.anaconda.com/pkgs/main/linux-64/_libgcc_mutex-0.1-main.conda
https://conda.anaconda.org/conda-forge/linux-64/ca-certificates-2020.12.5-ha878542_0.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/ld_impl_linux-64-2.33.1-h53a641e_7.conda
https://conda.anaconda.org/conda-forge/linux-64/libgfortran4-7.5.0-hae1eefd_17.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/libstdcxx-ng-9.1.0-hdf63c60_0.conda
https://conda.anaconda.org/conda-forge/linux-64/pandoc-2.11.3.2-h7f98852_0.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/libgcc-ng-9.1.0-hdf63c60_0.conda
https://conda.anaconda.org/conda-forge/linux-64/libgfortran-ng-7.5.0-hae1eefd_17.tar.bz2
https://conda.anaconda.org/conda-forge/linux-64/jpeg-9d-h36c2ea0_0.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/libffi-3.3-he6710b0_2.conda
https://conda.anaconda.org/conda-forge/linux-