These are instructions for getting dask working in an adaptive manner on the NIH HPC.

# Setup

Create an environment that has dask. If you haven't set up conda to create new environments on the cluster, follow the instructions in the NIH HPC docs or [here](https://github.com/leej3/notebooks_on_hpc/blob/master/notebooks_on_hpc.ipynb).

Use conda to create the appropriate environment:

```bash
module load Anaconda
conda create -y -c conda-forge -n dask python=3 dask
source activate dask
```

For now, the required changes are not yet part of the dask-jobqueue codebase. They should be soon, at which point the package will be installable directly from pip. Until then, install it from the following commit:

```bash
pip install -e \
git+https://github.com/guillaumeeb/dask-jobqueue.git@f0bb957d6b350561f3a0998d62a313250df8625b#egg=dask_jobqueue
```

Connect to the HPC and request an interactive node. From there:

```bash
source activate dask
python
```

# A quick demo

A simple demonstration of setting up a scalable cluster can be run by executing the following commands:

```python
from dask_jobqueue import SLURMCluster

# Set up a cluster object, which will also manage the scheduler.
# It's important to constrain the workers to nodes that default to the
# same networking: we can't mix InfiniBand nodes with 10g ethernet nodes.
# The --constraint argument to sbatch lets us select the ethernet nodes.
cluster = SLURMCluster(
    queue='quick',
    memory="12g",
    processes=1,
    threads=4,
    job_extra=['--constraint=10g'],
)


# start two workers
cluster.start_workers(2)


# Create a Client object to use the cluster we set up
from dask.distributed import Client
c = Client(cluster)

# Let's use the cluster we set up, via dask delayed
from dask import delayed
import time

def inc(x):
    import time
    time.sleep(5)
    return x + 1

def dec(x):
    import time
    time.sleep(3)
    return x - 1

def add(x, y):
    import time
    time.sleep(7)
    return x + y

x = delayed(inc)(1)
y = delayed(dec)(2)
total = delayed(add)(x, y)

# The following command executes the task graph that
# total represents. This is non-blocking. We can continue
# our python session. Whenever we query the fut object
# we will be informed of its status.
fut = c.compute(total)

# This is a blocking call and will return the results.
# We could run this immediately or wait until fut shows
# that the computation is finished.
c.gather(fut)
```
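The difference between the non-blocking submit and the blocking gather can be tried without a cluster. The following is a minimal sketch that uses the standard library's `concurrent.futures` as a stand-in for the dask client. It is an analogy, not dask's API, and the sleep times are scaled down by a factor of ten so it finishes quickly. Because `inc` and `dec` do not depend on each other, they can run at the same time on two workers, so the whole graph should take roughly max(0.5, 0.3) + 0.7 = 1.2 seconds rather than the 1.5 seconds a serial run would need (ignoring scheduling overhead).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def inc(x):
    time.sleep(0.5)
    return x + 1


def dec(x):
    time.sleep(0.3)
    return x - 1


def add(x, y):
    time.sleep(0.7)
    return x + y


start = time.perf_counter()

# Two threads play the role of the two workers started above.
with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns immediately with a future, much like c.compute().
    fx = pool.submit(inc, 1)
    fy = pool.submit(dec, 2)
    submitted = time.perf_counter() - start

    # done() reports status without blocking.
    print("inc finished yet?", fx.done())

    # result() blocks until the value is available, much like c.gather().
    fut = pool.submit(add, fx.result(), fy.result())
    result = fut.result()

elapsed = time.perf_counter() - start
print(f"result={result}, submit took {submitted:.3f}s, total {elapsed:.2f}s")
```

On the real cluster the same reasoning applies to the original sleep times: with two workers the demo should take about 12 seconds rather than 15.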

If at some point we need more compute, we can let the pool of workers scale with the workload:

```python
# adapt() is a method of the cluster object, not of the client
cluster.adapt()
```

# More? 

Dask tutorials can be found [here](https://github.com/dask/dask-tutorial).