# Dask on HPC with the PBS Pro batch queue system

To use [pangeo](http://pangeo.io), you need to become familiar with the key Python packages of its ecosystem.
[Dask](http://dask.org) is one of them; it allows you to distribute your computations.
We focus here on distribution over an HPC cluster equipped with PBS Pro.
Distributing the work requires spawning a Dask distributed cluster, which is done with [Dask-Jobqueue](https://dask-jobqueue.readthedocs.io/en/latest/).

Other methods of HPC deployment are possible; more resources about them can be found on [pangeo.io](http://pangeo.io/setup_guides/hpc.html).

***For MPI users, this resembles the combination of writing code that calls `mpi_init` and submitting an `mpirun` script to a cluster (without knowing in advance the context in which your code will run).***

---

## 1. Set up the Python environment

In [1]:
import os, sys
import dask
import xarray as xr

---
## 2. Set up a dask distributed cluster  
This configuration is based on hal (the CNES HPC cluster), which uses PBS Pro as its batch scheduler.   

In [2]:
from dask_jobqueue import PBSCluster
cluster = PBSCluster(cores=6, memory='30 gb', walltime='1:00:00')


- The call above requests 6 cores, 30 GB of memory and a walltime of 1 hour for each Dask worker job. If you need smaller chunks of resources, you can change these values, for example to 1 core and 5 GB of memory per job, as below:


`cluster = PBSCluster(cores=1,memory='5 gb', walltime='1:00:00')`

Note: these chunks should be chosen so that they fit the cluster's PBS configuration. This example uses a walltime of 1 hour because that is the maximum time limit of the short job queue.


**If you request short and small chunks, your jobs generally fit into gaps of unused resources on the HPC machine and may thus start faster.
But this also fragments the used resources into small chunks, which makes it difficult for other users to run big jobs.
It is therefore important to discuss your setup with your HPC cluster managers.**
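
To reason about how a given chunk size fits onto a node, a minimal sketch follows. The node size used here (24 cores, 120 GB) is a hypothetical value chosen for illustration, not the actual configuration of hal, and `jobs_per_node` is a helper invented for this example.

```python
def jobs_per_node(node_cores, node_mem_gb, job_cores, job_mem_gb):
    """Return how many identical worker jobs fit on one node.

    The limit is whichever resource (cores or memory) runs out first.
    """
    by_cores = node_cores // job_cores
    by_memory = int(node_mem_gb // job_mem_gb)
    return min(by_cores, by_memory)


# Hypothetical node: 24 cores and 120 GB of memory (NOT the real hal specification).
NODE_CORES, NODE_MEM_GB = 24, 120

# The two chunk sizes discussed above: (cores, memory in GB) per worker job.
for job_cores, job_mem_gb in [(6, 30), (1, 5)]:
    n = jobs_per_node(NODE_CORES, NODE_MEM_GB, job_cores, job_mem_gb)
    print(f"{job_cores} cores / {job_mem_gb} GB per job -> {n} jobs per node")
```

If the two limits differ a lot, part of the node stays unusable for your jobs, which is one reason to check the real node sizes with your HPC cluster managers.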

---

## 3. Spawn Dask workers


In [3]:
w = cluster.scale(10)

**Warning: after execution of the cell above, we are effectively starting to 'occupy' resources on the HPC cluster**

**Hence, do not forget to kill your 'dask-cluster' using the command 'qdel' in a shell, or the command 'cluster.close()' in the notebook**
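
One way to avoid forgetting the cleanup is to wrap the work in a `try`/`finally`, so that `close()` runs even when the computation fails. The sketch below assumes only that the cluster object has `scale(n)` and `close()` methods, as used in this notebook. The `scaled` helper and the `FakeCluster` stand-in are hypothetical names introduced here so that the example runs without a batch system.

```python
from contextlib import contextmanager


@contextmanager
def scaled(cluster, n_workers):
    """Scale `cluster` to `n_workers` and always close it on exit.

    `cluster` is any object with `scale(n)` and `close()` methods.
    """
    cluster.scale(n_workers)
    try:
        yield cluster
    finally:
        # Runs even if the body raises, so the batch jobs are released.
        cluster.close()


class FakeCluster:
    """Stand-in used only to demonstrate the helper without PBS."""

    def __init__(self):
        self.n_workers = 0
        self.closed = False

    def scale(self, n):
        self.n_workers = n

    def close(self):
        self.closed = True


demo = FakeCluster()
try:
    with scaled(demo, 10):
        raise RuntimeError("simulated failure during the computation")
except RuntimeError:
    pass

print(demo.closed)  # True: the cluster was closed despite the error
```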

---

## 4. Check status of batch jobs

You can do this in a shell with the following command, which shows whether your PBS jobs are running and, if so, on which nodes.
You can then connect to these nodes with ssh and check the status of the Dask workers with commands such as `top` or `ps`.

`qstat -u your-login -n -1`

In [4]:
!qstat -u odakat -n -1


admin01: 
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
4398868.admin01 odakat   qdev     jupyterhub  72639   1  16   61gb 12:00 R 05:20 node558/0*16
4410568.admin01 odakat   qt1h     dask-worke   9146   1   6   28gb 01:00 R 00:00 node089/3*6
4410569.admin01 odakat   qt1h     dask-worke  22172   1   6   28gb 01:00 R 00:00 node098/1*6
4410570.admin01 odakat   qt1h     dask-worke  22174   1   6   28gb 01:00 R 00:00 node098/2*6
4410571.admin01 odakat   qt1h     dask-worke  23137   1   6   28gb 01:00 R 00:00 node099/1*6
4410572.admin01 odakat   qt1h     dask-worke  23152   1   6   28gb 01:00 R 00:00 node099/2*6
4410573.admin01 odakat   qt1h     dask-worke   4493   1   6   28gb 01:00 R 00:00 node104/1*6
4410574.admin01 odakat   qt1h     dask-worke  17044   1   6   28gb 01:00 R 00:00 node107/1*6
441
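
If you prefer to inspect this listing from Python, the sketch below parses text in the column layout shown above into a list of records. It assumes exactly that 12-column layout, with the node in the last column. `parse_qstat` is a hypothetical helper, and the sample text is an excerpt of the output above, including its truncated last line.

```python
def parse_qstat(text):
    """Parse `qstat -u <login> -n -1` output into a list of dicts.

    Lines that do not have the 12 expected columns or do not start with a
    numeric job id (headers, separators, truncated lines) are skipped.
    """
    jobs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 12 or not fields[0][0].isdigit():
            continue
        jobs.append({
            "job_id": fields[0],
            "name": fields[3],
            "state": fields[9],
            "node": fields[11].split("/")[0],  # "node089/3*6" -> "node089"
        })
    return jobs


sample = """\
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
4398868.admin01 odakat   qdev     jupyterhub  72639   1  16   61gb 12:00 R 05:20 node558/0*16
4410568.admin01 odakat   qt1h     dask-worke   9146   1   6   28gb 01:00 R 00:00 node089/3*6
4410569.admin01 odakat   qt1h     dask-worke  22172   1   6   28gb 01:00 R 00:00 node098/1*6
441
"""

# Keep only the Dask worker jobs (qstat truncates the job name to "dask-worke").
workers = [j for j in parse_qstat(sample) if j["name"].startswith("dask-work")]
for j in workers:
    print(j["job_id"], j["state"], j["node"])
```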

The following command can be used to check the actual PBS script that is submitted for each worker job.

In [5]:
print(cluster.job_script())

#!/bin/bash

#!/usr/bin/env bash
#PBS -N dask-worker
#PBS -l select=1:ncpus=6:mem=28GB
#PBS -l walltime=1:00:00
JOB_ID=${PBS_JOBID%.*}



/home/mp/odakat/miniconda3/envs/equinox/bin/python -m distributed.cli.dask_worker tcp://10.120.43.58:57642 --nthreads 6 --memory-limit 30.00GB --name dask-worker--${JOB_ID}-- --death-timeout 60
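
Note that the script requests `mem=28GB` from PBS while the worker is started with `--memory-limit 30.00GB`. A plausible explanation, stated here as an assumption rather than something the output confirms, is that `'30 gb'` is read as 30 × 10^9 bytes and then expressed for PBS in units of 2^30 bytes, rounded up. The arithmetic is consistent with that reading:

```python
import math

requested_bytes = 30 * 10**9     # memory='30 gb', read as decimal gigabytes (assumption)
gib = requested_bytes / 2**30    # the same amount in units of 2**30 bytes
pbs_mem = math.ceil(gib)         # rounded up to a whole number

print(f"{gib:.2f}")  # 27.94
print(pbs_mem)       # 28
```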



The following commands instantiate the Dask client and let you check its status, among other things.
The ***Dashboard*** link enables you to monitor your Dask cluster.
- The 'Workers' tab of the Dask dashboard shows graphically how each worker uses memory and CPU.  
- The 'System' tab shows the resource usage of the Jupyter notebook process, which hosts the Dask scheduler.  

Other tabs are also useful for understanding how the parallel processes are working. Note that the dashboard uses the CPU and memory of your Jupyter notebook node: if you display overly complex graphical views, the notebook itself may become slower.


See the [distributed doc](http://distributed.dask.org/en/latest/web.html) for a video walkthrough of the dashboard.

In [6]:
from dask.distributed import Client
client=Client(cluster)
cluster

Client: Scheduler: tcp://10.120.43.58:57642, Dashboard: http://10.120.43.58:8787/status  
Cluster: Workers: 10, Cores: 60, Memory: 300.00 GB


In [None]:
client

---

## 5. Scale the number of workers up or down

In [7]:
cluster.scale_up(11)

In [8]:
!qstat -u odakat


admin01: 
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
4398868.admin01 odakat   qdev     jupyterhub  72639   1  16   61gb 12:00 R 05:20
4410568.admin01 odakat   qt1h     dask-worke   9146   1   6   28gb 01:00 R 00:00
4410569.admin01 odakat   qt1h     dask-worke  22172   1   6   28gb 01:00 R 00:00
4410570.admin01 odakat   qt1h     dask-worke  22174   1   6   28gb 01:00 R 00:00
4410571.admin01 odakat   qt1h     dask-worke  23137   1   6   28gb 01:00 R 00:00
4410572.admin01 odakat   qt1h     dask-worke  23152   1   6   28gb 01:00 R 00:00
4410573.admin01 odakat   qt1h     dask-worke   4493   1   6   28gb 01:00 R 00:00
4410574.admin01 odakat   qt1h     dask-worke  17044   1   6   28gb 01:00 R 00:00
4410575.admin01 odakat   qt1h     dask-worke  17579   1   6   28gb 01:00 R 00:00
4410576.admin01 oda

In [9]:
!qstat -u odakat |grep dask-work |wc -l

11


As you can see, the number of Dask workers has increased: you now have 11 of them. You could also use
`cluster.scale(11)` instead of `cluster.scale_up(11)`.

You can also try to decrease the number of workers.  

In [10]:
cluster.scale(5)

In [11]:
!qstat -u odakat 


admin01: 
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
4398868.admin01 odakat   qdev     jupyterhub  72639   1  16   61gb 12:00 R 05:20
4410573.admin01 odakat   qt1h     dask-worke   4493   1   6   28gb 01:00 R 00:00
4410574.admin01 odakat   qt1h     dask-worke  17044   1   6   28gb 01:00 R 00:00
4410575.admin01 odakat   qt1h     dask-worke  17579   1   6   28gb 01:00 R 00:00
4410576.admin01 odakat   qt1h     dask-worke   2706   1   6   28gb 01:00 R 00:00
4410577.admin01 odakat   qt1h     dask-worke  17619   1   6   28gb 01:00 R 00:00


Suppose your Dask workers have been killed for whatever reason. The cell below simulates this by deleting all the worker jobs with `qdel`.

In [12]:
!qstat -u odakat |grep dask-work |awk '{print "qdel " $1 }' >./del-daskworker 
!chmod +x ./del-daskworker
!./del-daskworker
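
The shell pipeline above can also be expressed in Python, which makes it easier to review the commands before running them. This is a minimal sketch: `qdel_commands` is a hypothetical helper, it only builds the command strings and does not execute anything, and it matches on the job name column where `grep` matches anywhere in the line. The sample text is an excerpt of an earlier `qstat` listing.

```python
def qdel_commands(qstat_text, name_prefix="dask-work"):
    """Build one `qdel <job id>` command per job whose name starts with `name_prefix`."""
    commands = []
    for line in qstat_text.splitlines():
        fields = line.split()
        # Job lines have at least 4 columns: job id, user, queue, job name.
        if len(fields) >= 4 and fields[3].startswith(name_prefix):
            commands.append(f"qdel {fields[0]}")
    return commands


sample = """\
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
4398868.admin01 odakat   qdev     jupyterhub  72639   1  16   61gb 12:00 R 05:20
4410573.admin01 odakat   qt1h     dask-worke   4493   1   6   28gb 01:00 R 00:00
4410574.admin01 odakat   qt1h     dask-worke  17044   1   6   28gb 01:00 R 00:00
"""

commands = qdel_commands(sample)
for cmd in commands:
    print(cmd)
```

The `jupyterhub` job is left out because its name does not start with the prefix, which matters here since deleting it would kill the notebook session itself.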

In [13]:
client

Client: Scheduler: tcp://10.120.43.58:57642, Dashboard: http://10.120.43.58:8787/status  
Cluster: Workers: 0, Cores: 0, Memory: 0 B


You can re-scale your cluster and get back to your parallel computation. 

In [14]:
cluster.scale(8)

In [15]:
!qstat -u odakat


admin01: 
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
4398868.admin01 odakat   qdev     jupyterhub  72639   1  16   61gb 12:00 R 05:21
4410601.admin01 odakat   qt1h     dask-worke   9444   1   6   28gb 01:00 R 00:00
4410602.admin01 odakat   qt1h     dask-worke  22503   1   6   28gb 01:00 R 00:00
4410603.admin01 odakat   qt1h     dask-worke  22505   1   6   28gb 01:00 R 00:00
4410604.admin01 odakat   qt1h     dask-worke  23454   1   6   28gb 01:00 R 00:00
4410605.admin01 odakat   qt1h     dask-worke  23455   1   6   28gb 01:00 R 00:00
4410606.admin01 odakat   qt1h     dask-worke   4765   1   6   28gb 01:00 R 00:00
4410607.admin01 odakat   qt1h     dask-worke  17752   1   6   28gb 01:00 R 00:00
4410608.admin01 odakat   qt1h     dask-worke  18295   1   6   28gb 01:00 R 00:00


---
## 6. Once you are done computing, do not forget to stop your Dask workers with the following command


In [16]:
cluster.close()