# 06a. Working on a cluster - local part

## Overview

In this notebook and its remote counterpart `06b`, you will learn how to:

 - Synchronize deployments between the local machine and a notebook running on the cluster.
 - Perform simple computations using the Dask deployment on a deployed notebook.
 - Clear synchronized deployments.

## Import idact

It's recommended that *idact* is installed with *pip*.  
Alternatively, make sure the dependencies are installed with `pip install -r requirements.txt`, and add *idact* to the Python path, for example:  
`import sys`  
`sys.path.append('<YOUR_IDACT_PATH>')`
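
If you are not sure whether *idact* is already installed, the path can be extended only when needed. This is a minimal sketch using only the standard library; `IDACT_PATH` is a hypothetical variable name that holds the same placeholder as above and must be replaced with your own path.

```python
import importlib.util
import sys

# Hypothetical variable: replace the placeholder with the directory
# that contains the idact package.
IDACT_PATH = '<YOUR_IDACT_PATH>'

# Extend sys.path only if idact is not importable yet (for example,
# because it was not installed with pip) and the path is not already there.
if importlib.util.find_spec('idact') is None and IDACT_PATH not in sys.path:
    sys.path.append(IDACT_PATH)
```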

We will use a wildcard import for convenience:

In [None]:
from idact import *
import bitmath

## Load the cluster

Let's load the environment and the cluster. Make sure to use your cluster name.

In [None]:
load_environment()
cluster = show_cluster("test")
cluster

In [None]:
access_node = cluster.get_access_node()
access_node.connect()

## Allocate nodes, deploy Jupyter and Dask

We will be working with Dask on a Jupyter Notebook deployed on the cluster. Make sure to adjust `--account`, as in the previous notebooks.

In [None]:
nodes = cluster.allocate_nodes(nodes=3,
                               cores=2,
                               memory_per_node=bitmath.GiB(10),
                               walltime=Walltime(minutes=20),
                               native_args={
                                   # Replace with your own account name.
                                   '--account': 'intdata'
                               })
nodes

In [None]:
nodes.wait()
nodes

Deploy a notebook:

In [None]:
nb = nodes[0].deploy_notebook()
nb

Deploy Dask on all three nodes:

In [None]:
dd = deploy_dask(nodes)
dd

## Synchronize the deployments

It may be useful to access the allocated nodes, or any of the other deployments above, from another notebook.

In particular, we need to access the Dask deployment from the notebook that was deployed on the cluster, in order to perform computations.

Synchronizing a deployment involves *pushing* it first, and then *pulling* it on another notebook.

Let's push the allocation first:

In [None]:
cluster.push_deployment(nodes)

Then Jupyter:

In [None]:
cluster.push_deployment(nb)

And finally Dask:

In [None]:
cluster.push_deployment(dd)

We will pull the deployments on the remote notebook in a moment.
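
Conceptually, pushing and pulling can be pictured as writing deployment descriptions to a location that both notebooks can reach, and reading them back from the other side. The sketch below is only a toy model of that idea, not how *idact* implements it: the `SharedStore` class, the JSON file, and the record fields are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


class SharedStore:
    """Toy stand-in for a location both notebooks can read (hypothetical)."""

    def __init__(self, path):
        self.path = Path(path)

    def pull(self):
        # Return every description pushed so far, oldest first.
        if not self.path.exists():
            return []
        return json.loads(self.path.read_text())

    def push(self, record):
        # Append one deployment description to the shared file.
        records = self.pull()
        records.append(record)
        self.path.write_text(json.dumps(records))


with tempfile.TemporaryDirectory() as tmp:
    # "Local" notebook: push in the same order as in the cells above.
    local = SharedStore(Path(tmp) / "deployments.json")
    local.push({"kind": "nodes", "count": 3})
    local.push({"kind": "jupyter", "node": 0})
    local.push({"kind": "dask", "nodes": 3})

    # "Remote" notebook: a separate object that points at the same file.
    remote = SharedStore(local.path)
    pulled = remote.pull()
    kinds = [record["kind"] for record in pulled]
```

The point of the sketch is that the remote side never receives the Python objects themselves, only enough information to reconstruct them.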

## Copy notebook `06b` to the cluster

Open the deployed notebook in your browser by running the cell below, then drag and drop `06b-Working_on_a_cluster_-_remote_part.ipynb` into it, and open it there.

In [None]:
nb.open_in_browser()

## Follow the instructions in notebook `06b`

Follow the instructions until you are referred back to this notebook.

## Examine Dask Dashboards

You can always check how your computations are progressing on the Dask dashboards:

In [None]:
client = dd.get_client()
client

In [None]:
dd.diagnostics.open_all()

In [None]:
client.close()

## Clear synchronized deployments

Deployments are cleared automatically once they expire or are cancelled. They can also be cleared manually by running:

In [None]:
cluster.clear_pushed_deployments()
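
The automatic clean-up described above can be pictured as a filter over the pushed records. This is a toy model under stated assumptions, not *idact*'s implementation: the `drop_stale` function, the record fields, and the fixed timestamp are all hypothetical.

```python
from datetime import datetime, timedelta


def drop_stale(records, now):
    """Keep records that are neither cancelled nor past their expiry time."""
    return [r for r in records
            if not r["cancelled"] and r["expires"] > now]


# Arbitrary reference time, chosen only to make the example deterministic.
now = datetime(2024, 1, 1, 12, 0)

records = [
    # Still valid: expires in the future and was not cancelled.
    {"name": "nodes", "expires": now + timedelta(minutes=20), "cancelled": False},
    # Expired one minute ago.
    {"name": "jupyter", "expires": now - timedelta(minutes=1), "cancelled": False},
    # Not expired, but cancelled by the user.
    {"name": "dask", "expires": now + timedelta(minutes=20), "cancelled": True},
]

remaining = [r["name"] for r in drop_stale(records, now)]
```

Clearing manually corresponds to discarding every record, whether or not it is still valid.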

## Cancel Dask and Jupyter deployments (optional)

In [None]:
nb.cancel()

In [None]:
dd.cancel()

## Cancel the allocation

It's important to cancel an allocation if you're done with it early, in order to minimize the CPU time you are charged for.
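
As a rough illustration, consider the allocation requested earlier: 3 nodes for 20 minutes. The sketch below assumes that `cores=2` means two cores per node and that usage is accounted in core-hours for as long as the allocation is held; actual accounting depends on your cluster's policy. The `core_hours` helper and the 5-minute figure are hypothetical.

```python
def core_hours(nodes, cores_per_node, minutes):
    """Core-hours consumed by an allocation held for the given time."""
    return nodes * cores_per_node * minutes / 60


# Holding the allocation for the full requested walltime.
full = core_hours(nodes=3, cores_per_node=2, minutes=20)

# Cancelling after 5 minutes instead (illustrative figure).
early = core_hours(nodes=3, cores_per_node=2, minutes=5)

saved = full - early
print(f"full: {full}, early: {early}, saved: {saved}")
```

Under these assumptions, cancelling after 5 minutes uses a quarter of the core-hours that the full walltime would.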

In [None]:
# Check whether the allocation is still running before cancelling it.
nodes.running()

In [None]:
nodes.cancel()

In [None]:
# Check again: the allocation should no longer be running after cancellation.
nodes.running()

## Next notebook

In the next notebook, we will take a look at how to adjust deployment timeouts.