# Dask Workflow - Local

Demo notebooks 02a and 02b will show you the recommended way to work with Dask on a cluster.

Requirements:
 - Cluster has been added and configured, as shown in notebook 01.
 - `idact` is installed on the cluster using pip.

## Initial setup

Add `idact` to the Python path and point it at the demo SSH key location.

These steps are not usually required. They are here to keep the demo separate from your live configuration, so the key and config paths are substituted:

In [1]:
import sys
import os
import bitmath
import logging
import subprocess
from pprint import pprint

# appending path is not necessary if idact was installed using pip
def append_idact_path():
    idact_path = os.path.realpath(os.path.join(os.getcwd(), '../'))
    sys.path.append(idact_path)
append_idact_path()

# comment out the line below to use the default key location: ~/.ssh
os.environ['IDACT_KEY_LOCATION'] = os.path.join(os.getcwd(), '../.notebook-ssh')
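
Joining with `'../'` keeps the `..` segment in the resulting string, which is why it appears verbatim in the key path printed by the next cell. A minimal sketch, assuming you prefer a normalized path, wraps the join in `os.path.normpath`. The helper name `demo_key_dir` is introduced here for illustration only and is not part of `idact`:

```python
import os

# Hypothetical helper; only standard-library os.path calls are used.
def demo_key_dir(base_dir):
    """Return the '.notebook-ssh' directory next to base_dir, without '..' segments."""
    return os.path.normpath(os.path.join(base_dir, '..', '.notebook-ssh'))

key_dir = demo_key_dir(os.getcwd())
print(key_dir)
```

The result could then be assigned to `os.environ['IDACT_KEY_LOCATION']` in place of the unnormalized path.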

## Load cluster

In [2]:
from idact import *

load_environment('.idact-env')  # load_environment() would use the default path: ~/.idact.conf
cluster = show_cluster("pro")  # replace with your cluster name if necessary
cluster

Cluster(pro.cyfronet.pl, 22, plggarstka, auth=AuthMethod.PUBLIC_KEY, key='E:\\shared\\uni\\eng-project\\notebooks\\../.notebook-ssh\\id_rsa_du', install_key=False, disable_sshd=False)

Make sure it's properly configured. Replace the following with correct values for your cluster, or skip if already configured:

In [3]:
set_log_level(logging.INFO)
#set_log_level(logging.DEBUG)
cluster.config.setup_actions.jupyter = ['module load plgrid/tools/python-intel/3.6.2']
cluster.config.setup_actions.dask = ['module load plgrid/tools/python-intel/3.6.2']
cluster.config.scratch = '$SCRATCH'

save_environment('.idact-env')

Get access node:

In [4]:
node = cluster.get_access_node()
node

Node(pro.cyfronet.pl:22, None)

Make sure authentication is set up correctly:

In [5]:
node.connect()

In [6]:
node.run('whoami')

'plggarstka'

In [7]:
node.run('hostname')

'login01.pro.cyfronet.pl'

## Allocate nodes

You will need nodes to deploy Jupyter Notebook and Dask on:

In [8]:
nodes = cluster.allocate_nodes(nodes=2,
                               cores=2,
                               memory_per_node=bitmath.GiB(10),
                               walltime=Walltime(minutes=20),
                               native_args={
                                   '--partition': 'plgrid-testing',
                                   '--account': 'intdata'
                               })

2018-11-02 22:34:08 INFO: Creating the ssh directory.


In [9]:
nodes

Nodes([Node(NotAllocated),Node(NotAllocated)], SlurmAllocation(job_id=13969005))

Wait until the nodes are allocated:

In [10]:
nodes.wait()
nodes

Nodes([Node(p0653:51136, 2018-11-02 21:54:16.293426+00:00),Node(p0657:56621, 2018-11-02 21:54:16.293426+00:00)], SlurmAllocation(job_id=13969005))

## Deploy notebook

You will work from a remote Jupyter Notebook deployed on the cluster:

In [11]:
nb = nodes[0].deploy_notebook()
nb

JupyterDeployment(8080 -> Node(p0653:51136, 2018-11-02 21:54:16.293426+00:00)

Open the remote notebook in a new tab:

In [12]:
nb.open_in_browser()

## Push nodes

To deploy Dask on the notebook, you will need access to the nodes you allocated earlier.

Push the nodes to the cluster. You will be able to use them straight away by calling `load_environment`.

In [13]:
cluster.push_deployment(nodes)

2018-11-02 22:34:39 INFO: Pushing deployment: Nodes([Node(p0653:51136, 2018-11-02 21:54:16.293426+00:00),Node(p0657:56621, 2018-11-02 21:54:16.293426+00:00)], SlurmAllocation(job_id=13969005))


## Push environment

If this is your first time working on the cluster from a remote notebook,
you may want to push the current environment. Alternatively, call `add_cluster` and perform the other configuration steps on the remote notebook, as demonstrated in demo notebook 01.

In [14]:
push_environment(cluster)

2018-11-02 22:34:56 INFO: Pushing the environment to cluster.
2018-11-02 22:34:58 ERROR: Failure: Getting file from node pro.cyfronet.pl: /net/people/plggarstka/.idact.conf
2018-11-02 22:34:58 ERROR: Failure: Deserializing the environment from cluster.
2018-11-02 22:34:58 INFO: Remote environment is missing, current environment will be copied to cluster.


## Copy next notebook to the cluster

Drag and drop `02b-DaskWorkflow-Remote.ipynb` to the notebook you previously opened in a new tab. Open it.

## Follow the instructions in notebook 02b

Follow the instructions there until you are referred back to this notebook.

## View Dask diagnostics

Dask deployment synchronization is currently not implemented.

To view the Dask scheduler dashboard, you will need to open the tunnel manually.
Copy the port number from the client description displayed on the remote notebook.

In [15]:
port = 48170  # copy port from other notebook
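
If the client description shows a full address rather than a bare number, the port can be extracted instead of retyped. This is a minimal sketch using the standard library. The address format `http://localhost:48170/status` is an assumption about what your client prints, and `port_from_address` is a hypothetical helper:

```python
from urllib.parse import urlsplit

def port_from_address(address):
    """Extract the TCP port from an address such as 'http://localhost:48170/status'."""
    port = urlsplit(address).port
    if port is None:
        raise ValueError("No port found in address: {!r}".format(address))
    return port

# Example address; the exact format shown by your client is an assumption.
port = port_from_address("http://localhost:48170/status")
print(port)
```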

In [16]:
tunnel = nodes[0].tunnel(there=port)

2018-11-02 22:37:49,951| ERROR   | Could not establish connection from ('127.0.0.1', 53777) to remote side of the tunnel
2018-11-02 22:37:49,959| ERROR   | Exception: Error reading SSH protocol banner
2018-11-02 22:37:50,049| ERROR   | Traceback (most recent call last):
2018-11-02 22:37:50,050| ERROR   |   File "E:\Anaconda3\envs\idact-dev\lib\site-packages\paramiko\transport.py", line 2044, in _check_banner
2018-11-02 22:37:50,050| ERROR   |     buf = self.packetizer.readline(timeout)
2018-11-02 22:37:50,051| ERROR   |   File "E:\Anaconda3\envs\idact-dev\lib\site-packages\paramiko\packet.py", line 353, in readline
2018-11-02 22:37:50,052| ERROR   |     buf += self._read_timeout(timeout)
2018-11-02 22:37:50,052| ERROR   |   File "E:\Anaconda3\envs\idact-dev\lib\site-packages\paramiko\packet.py", line 542, in _read_timeout
2018-11-02 22:37:50,053| ERROR   |     raise EOFError()
2018-11-02 22:37:50,054| ERROR   | EOFError
2018-11-02 22:37:50,054| ERROR   | 
2018-11-02 22:37:50,055| ERROR



Open the dashboard:

In [17]:
import webbrowser
webbrowser.open("http://localhost:{here}/status".format(here=tunnel.here));
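
The tunnel log above contains connection errors, so the browser tab may come up empty. One way to check the dashboard before opening it is a plain HTTP request against the same URL. The sketch below uses only the standard library, and `dashboard_reachable` is a hypothetical helper, not part of `idact`:

```python
import http.client
import urllib.error
import urllib.request

def dashboard_reachable(url, timeout=3.0):
    """Return True if an HTTP GET on url succeeds with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False
```

It could be called with the same URL that is passed to `webbrowser.open`, for example `dashboard_reachable("http://localhost:{here}/status".format(here=tunnel.here))`.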

In [18]:
tunnel.close()

## Monitor node resources

While working with Dask, it may be useful to monitor resource usage on nodes:

In [19]:
nodes[0].resources.memory_total

GiB(10.0)

In [20]:
nodes[0].resources.cpu_cores

2

In [21]:
nodes[0].resources.memory_usage

GiB(0.46537017822265625)

In [22]:
nodes[1].resources.memory_usage

GiB(0.22327804565429688)

In [23]:
nodes[0].resources.cpu_usage

9.0

In [24]:
nodes[1].resources.cpu_usage

3.0
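
The per-node queries above can be gathered into one table-like snapshot. This is a sketch that reads only the `resources.memory_usage` and `resources.cpu_usage` attributes shown above. It assumes the `nodes` collection can be iterated; if it cannot, pass `[nodes[0], nodes[1]]` instead. `resource_snapshot` is a hypothetical helper, and the stand-in objects exist only so the example runs without a cluster:

```python
from types import SimpleNamespace

def resource_snapshot(nodes):
    """Collect memory and CPU usage for each node into a list of dicts."""
    return [{'index': index,
             'memory_usage': node.resources.memory_usage,
             'cpu_usage': node.resources.cpu_usage}
            for index, node in enumerate(nodes)]

# Stand-in objects with illustrative values, not real measurements.
fake_nodes = [
    SimpleNamespace(resources=SimpleNamespace(memory_usage=0.47, cpu_usage=9.0)),
    SimpleNamespace(resources=SimpleNamespace(memory_usage=0.22, cpu_usage=3.0)),
]
for row in resource_snapshot(fake_nodes):
    print(row)
```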

## Cancel Dask deployment

Cancel the Dask deployment in the remote notebook, or just close the remote notebook.

## Cancel other deployments

If the nodes are still running, make sure you cancel their allocation to save CPU time.

In [25]:
nodes.running()

True

In [26]:
nb.cancel()

2018-11-02 22:39:37 INFO: Cancelling Jupyter deployment.


In [27]:
nodes.cancel()

2018-11-02 22:39:43 INFO: Cancelling job 13969005.


In [28]:
nodes.running()

False

In [29]:
node.run('squeue')

'JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)'
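
Only the header line is returned, which means no jobs are left in the queue. To check this programmatically rather than by eye, the output string can be parsed. This is a minimal sketch that assumes the default `squeue` layout with one header line followed by one job per line. `job_ids_from_squeue` is a hypothetical helper:

```python
def job_ids_from_squeue(output):
    """Return the JOBID column from squeue text output, skipping the header line."""
    lines = [line for line in output.splitlines() if line.strip()]
    return [line.split()[0] for line in lines[1:]]

header = 'JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)'
print(job_ids_from_squeue(header))  # prints []
```

An empty list corresponds to the header-only output shown above.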