# 05a. Configuring *idact* on a cluster - local part

## Overview

In this notebook and its remote counterpart `05b`, you will learn how to:

 - Synchronize the environment between *idact* and the cluster.
 - Initialize *idact* config on the cluster from a deployed notebook.

## Import idact

Add `idact` to the path if it is not installed, for instance when this notebook is executed from a cloned repo.

In [1]:
import bitmath
import sys

sys.path.append('../')

We will use a wildcard import for convenience:

In [2]:
from idact import *

## Load the cluster

Let's load the environment and the cluster. Make sure to use your cluster name.

In [3]:
load_environment()
cluster = show_cluster("hpc")
cluster

Cluster(pro.cyfronet.pl, 22, plggarstka, auth=AuthMethod.PUBLIC_KEY, key='C:\\Users\\Maciej/.ssh\\id_rsa_6p', install_key=False, disable_sshd=False)

In [4]:
access_node = cluster.get_access_node()
access_node.connect()

## Synchronize the environment

Synchronizing the environment keeps your local configuration consistent with the one on the cluster, and the remote copy also serves as a backup.

It works like a slightly smarter file copy: the two environments are merged rather than one blindly replacing the other.

Pushing the environment merges the local environment into the remote environment. Most config fields are overwritten, but machine-specific ones like `key` are left unchanged.
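
The merge rule described above can be illustrated with plain dictionaries. This is only a sketch of the idea, not *idact*'s actual implementation: the function `merge_cluster_config`, the set `MACHINE_SPECIFIC_FIELDS`, and all hostnames and paths are hypothetical, and the text names only `key` as a machine-specific field.

```python
# Illustrative only: this is NOT how idact stores or merges its config.
# The text names `key` as machine-specific; any other members are unknown.
MACHINE_SPECIFIC_FIELDS = {"key"}


def merge_cluster_config(source, target):
    """Merge `source` into `target`, keeping machine-specific fields of `target`."""
    merged = dict(target)
    for field, value in source.items():
        if field in MACHINE_SPECIFIC_FIELDS and field in target:
            # The target machine keeps its own value, e.g. its own key path.
            continue
        merged[field] = value
    return merged


# Hypothetical configs for the same cluster, seen from two machines.
local = {"host": "login.example.com", "port": 22,
         "key": "/home/local_user/.ssh/id_rsa"}
remote = {"host": "old.example.com", "port": 2222,
          "key": "/home/cluster_user/.ssh/id_rsa"}

# A push merges local into remote: `host` and `port` are overwritten,
# while the remote `key` survives.
pushed = merge_cluster_config(source=local, target=remote)
print(pushed)

# With no remote config yet, the local one is copied as is.
first_push = merge_cluster_config(source=local, target={})
print(first_push == local)
```

In this sketch, a pull would be the same call with the roles of `source` and `target` swapped.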

In [5]:
push_environment(cluster)

2018-11-24 15:07:31 INFO: Pushing the environment to cluster.
2018-11-24 15:07:33 ERROR: Failure: Getting file from node pro.cyfronet.pl: /net/people/plggarstka/.idact.conf
2018-11-24 15:07:33 ERROR: Failure: Deserializing the environment from cluster.
2018-11-24 15:07:33 INFO: Remote environment is missing, current environment will be copied to cluster.


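The two `ERROR` lines above are expected on the first push: the remote config file does not exist yet, so the current environment is copied to the cluster as is.

The reverse action is pulling the environment. It merges the remote environment into the local environment.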

In [6]:
pull_environment(cluster)

2018-11-24 15:07:35 INFO: Pulling the environment from cluster.


After a pull, the environment still needs to be saved to keep the changes:

In [7]:
save_environment()

## Install *idact* on the cluster

Our goal is to work with *idact* from a notebook deployed on the cluster.

We have already pushed our configuration, so the setup time will be minimal.

On the cluster, make sure `idact` is installed in the Python 3.5+ distribution you already use for Jupyter and Dask:

```
python -m pip install idact
```
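
To confirm that the right interpreter can see the package, a quick check along these lines may help. It is a minimal sketch using only the standard library; the helper name `idact_ready` is made up for this example, and it should be run with the same Python that serves Jupyter and Dask on the cluster.

```python
import importlib.util
import sys


def idact_ready(min_version=(3, 5)):
    """Return True if this interpreter is new enough and can find idact."""
    version_ok = sys.version_info >= min_version
    # find_spec returns None when a top-level package cannot be found.
    installed = importlib.util.find_spec("idact") is not None
    return version_ok and installed


# Show which interpreter is being checked, then the result.
print(sys.executable)
print("idact ready:", idact_ready())
```

If the result is `False`, the `pip` command above was probably run with a different Python than the one printed here.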

## Initialize *idact* in a deployed notebook

### Deploy a notebook

We need to deploy a notebook on a node, so let's allocate one. Make sure to adjust `--account`, as in the previous notebooks.

In [8]:
nodes = cluster.allocate_nodes(nodes=1,
                               cores=2,
                               memory_per_node=bitmath.GiB(10),
                               walltime=Walltime(minutes=20),
                               native_args={
                                   '--account': 'intdata'
                               })
nodes

2018-11-24 15:07:45 INFO: Creating the ssh directory.


Nodes([Node(NotAllocated)], SlurmAllocation(job_id=14335308))

In [9]:
nodes.wait()
nodes

Nodes([Node(p0218:56458, 2018-11-24 14:27:54.448650+00:00)], SlurmAllocation(job_id=14335308))

In [10]:
nb = nodes[0].deploy_notebook()
nb

JupyterDeployment(8080 -> Node(p0218:56458, 2018-11-24 14:27:54.448650+00:00)

In [11]:
nb.open_in_browser()

## Copy notebook `05b` to the cluster

Drag and drop `05b-Configuring_idact_on_a_cluster_-_remote_part.ipynb` to the deployed notebook, and open it there.

## Follow the instructions in notebook `05b`

Follow the instructions until you are referred back to this notebook.

## Cancel the Jupyter deployment (optional)

In [12]:
nb.cancel()

2018-11-24 15:09:11 INFO: Cancelling Jupyter deployment.


## Cancel the allocation

It's important to cancel an allocation if you're done with it early, in order to minimize the CPU time you are charged for.

In [13]:
nodes.running()

True

In [14]:
nodes.cancel()

2018-11-24 15:09:20 INFO: Cancelling job 14335308.


In [15]:
nodes.running()

False
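
One way to make sure the cancellation is not forgotten, even if a cell fails halfway, is to wrap the work in a context manager. The sketch below is not part of *idact*: `cancelled_on_exit` is a hypothetical helper, and `FakeAllocation` is a stand-in that only mimics the `running()` and `cancel()` calls used above, so that the example runs without a cluster.

```python
from contextlib import contextmanager


@contextmanager
def cancelled_on_exit(allocation):
    """Yield `allocation` and always call its `cancel()` afterwards."""
    try:
        yield allocation
    finally:
        # Runs on normal exit and on exceptions alike.
        allocation.cancel()


class FakeAllocation:
    """Stand-in for a real allocation, so the sketch runs offline."""

    def __init__(self):
        self._running = True

    def running(self):
        return self._running

    def cancel(self):
        self._running = False


fake = FakeAllocation()
with cancelled_on_exit(fake) as allocation:
    # Work with the allocation here.
    was_running = allocation.running()

print(was_running, fake.running())
```

With a real allocation, the object returned by `cluster.allocate_nodes(...)` would take the place of `fake`, assuming it is acceptable for the job to end as soon as the `with` block does.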

## Next notebook

In the next notebook, we will deploy Jupyter and Dask, then access these deployments and perform simple computations from a notebook on the cluster.