# 02. Allocating nodes

## Overview

In this notebook, you will learn how to:

 - Allocate nodes on a cluster.
 - Examine allocated nodes.
 - Open tunnels to nodes.
 - Cancel node allocations.

## Import idact

We recommend installing *idact* with *pip*. Alternatively, install the dependencies with `pip install -r requirements.txt` and add *idact* to the Python path, for example:

In [1]:
import sys
sys.path.append('../')

We will use a wildcard import for convenience:

In [2]:
from idact import *
import bitmath

## Load the cluster

We will use the cluster we added in the previous notebook.

Let's load the environment first:

In [3]:
load_environment()

Now let's show the cluster. If you added it under a different name, use that name here.

In [4]:
cluster = show_cluster("hpc")
cluster

Cluster(pro.cyfronet.pl, 22, plggarstka, auth=AuthMethod.PUBLIC_KEY, key='C:\\Users\\Maciej/.ssh\\id_rsa_6p', install_key=False, disable_sshd=False)

Let's access the head node to make sure everything is configured correctly:

In [5]:
access_node = cluster.get_access_node()
access_node.connect()

If there are any issues, please review the previous notebook and make sure that the cluster config is properly set up.

## Allocate nodes

For demonstration purposes, we will allocate two nodes, each with two cores and 10 GiB of memory, with a walltime of 10 minutes.

Currently, only [Slurm Workload Manager](https://slurm.schedmd.com/) is supported as the job scheduler.
Both nodes will be allocated as a single Slurm job.

You will likely need to change the native Slurm argument `--account` to specify the account that will be charged for the resources used; see the [sbatch documentation](https://slurm.schedmd.com/sbatch.html).

In [6]:
nodes = cluster.allocate_nodes(nodes=2,
                               cores=2,
                               memory_per_node=bitmath.GiB(10),
                               walltime=Walltime(minutes=10),
                               native_args={
                                   '--account': 'intdata'
                               })
nodes

2018-11-29 01:19:01 INFO: Installing key in '.ssh/authorized_keys.idact' for access to compute nodes.
2018-11-29 01:19:01 INFO: Creating the ssh directory.


Nodes([Node(NotAllocated),Node(NotAllocated)], SlurmAllocation(job_id=14373726))

Walltime is a helper class that can be used to specify the job duration down to seconds:

In [7]:
Walltime(days=1, hours=6, minutes=30, seconds=30)

1-06:30:30
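
The output above matches the `days-hours:minutes:seconds` form that `sbatch --time` accepts. As a rough illustration of how such a string can be built, here is a minimal sketch using only the standard library. The function name `format_slurm_walltime` is hypothetical and this is not how *idact* implements `Walltime`:

```python
from datetime import timedelta


def format_slurm_walltime(duration: timedelta) -> str:
    """Format a duration as D-HH:MM:SS (hypothetical helper, not part of idact)."""
    total_seconds = int(duration.total_seconds())
    days, remainder = divmod(total_seconds, 24 * 3600)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "{}-{:02d}:{:02d}:{:02d}".format(days, hours, minutes, seconds)


print(format_slurm_walltime(timedelta(days=1, hours=6, minutes=30, seconds=30)))
# prints 1-06:30:30
```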

Memory per node was given in `GiB` (gibibytes), a unit from the `bitmath` library. You can use
related units too, e.g.:

In [8]:
bitmath.MiB(500)

MiB(500.0)

In [9]:
bitmath.TiB(0.1)

TiB(0.1)

For more, see [bitmath on PyPI](https://pypi.org/project/bitmath/).
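
If the relationship between these binary units is unfamiliar, the sketch below spells it out with plain arithmetic: each step (MiB, GiB, TiB) is a factor of 1024. The helper names are hypothetical and the code deliberately avoids `bitmath`, so it only illustrates the unit definitions, not the library's API:

```python
# Binary (IEC) prefixes step by factors of 1024.
BYTES_PER_MIB = 1024 ** 2
BYTES_PER_GIB = 1024 ** 3
BYTES_PER_TIB = 1024 ** 4


def gib_to_mib(gib: float) -> float:
    """Convert gibibytes to mebibytes (hypothetical helper)."""
    return gib * BYTES_PER_GIB / BYTES_PER_MIB


def tib_to_gib(tib: float) -> float:
    """Convert tebibytes to gibibytes (hypothetical helper)."""
    return tib * BYTES_PER_TIB / BYTES_PER_GIB


print(gib_to_mib(10))   # 10240.0
print(tib_to_gib(0.1))  # 102.4
```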

## Wait for allocation

We can see that the returned nodes are not allocated yet:

In [10]:
nodes

Nodes([Node(NotAllocated),Node(NotAllocated)], SlurmAllocation(job_id=14373726))

We need to wait for the resources to be allocated. It shouldn't take long, provided the cluster is not too busy.

The `wait` method will wait until the job is allocated or there is a keyboard interrupt.
Alternatively, you can specify the number of seconds to wait in the `timeout` parameter; see [the documentation](https://garstka.github.io/idact/develop/html/api/idact.html#idact.Nodes.wait).

In [11]:
nodes.wait()
nodes

2018-11-29 01:19:20 INFO: Still pending or configuring...


Nodes([Node(p0458:35686, 2018-11-29 00:29:12.405379+00:00),Node(p0459:48947, 2018-11-29 00:29:12.405379+00:00)], SlurmAllocation(job_id=14373726))
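
Waiting with an optional timeout is a common polling pattern. The sketch below shows the general idea with a hypothetical `wait_until` helper and a stand-in readiness check. It is not *idact*'s implementation, and the choice to return `False` on timeout is an assumption made for this example only:

```python
import time
from typing import Callable, Optional


def wait_until(is_ready: Callable[[], bool],
               timeout: Optional[float] = None,
               poll_interval: float = 1.0) -> bool:
    """Poll is_ready() until it returns True or the timeout (seconds) expires.

    Returns True if the condition was met, False on timeout.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not is_ready():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


# Stand-in for a job that becomes ready on the third status check.
checks = iter([False, False, True])
print(wait_until(lambda: next(checks), timeout=5, poll_interval=0.01))
# prints True
```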

Nodes should now be allocated. We can check that at any time by calling:

In [12]:
nodes.running()

True

We can run commands on each node now:

In [13]:
nodes[0].run('whoami')

'plggarstka'

In [14]:
nodes[0].run('hostname')

'p0458'

In [15]:
nodes[1].run('whoami')

'plggarstka'

In [16]:
nodes[1].run('hostname')

'p0459'

## Examine allocated nodes

Let's take a look at one of the allocated nodes.

### Connection details

We can get a node's hostname and SSH port:

In [17]:
nodes[0].host

'p0458'

In [18]:
nodes[0].port

35686

The SSH port differs each time, because *idact* deploys its own SSH daemon that accepts connections based on the contents of the file
`~/.ssh/authorized_keys.idact`.


### Resources

We can examine a node's resources and their usage:

In [19]:
nodes[0].resources.memory_total

GiB(10.0)

In [20]:
nodes[0].resources.memory_usage

GiB(0.022563934326171875)

In [21]:
nodes[0].resources.cpu_cores

2

In [22]:
nodes[0].resources.cpu_usage

0.0

For more information, see the [documentation of NodeResourceStatus](https://garstka.github.io/idact/develop/html/api/idact.html#idact.NodeResourceStatus).

## Tunnel

You can open a tunnel to any port on the node, e.g.:

In [23]:
tunnel = nodes[0].tunnel(here=9000, there=10000)

In [24]:
tunnel

MultiHopTunnel(9000:10000)

In [25]:
tunnel.close()
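
Because a tunnel is released by calling `close()`, one way to avoid leaving it open after an error is the standard library's `contextlib.closing`, which calls `close()` when the `with` block exits. The sketch below uses a hypothetical `FakeTunnel` class as a stand-in so that it runs without a cluster; whether this suits your workflow with real tunnels is for you to judge:

```python
from contextlib import closing


class FakeTunnel:
    """Hypothetical stand-in for an object with a close() method."""

    def __init__(self, here: int, there: int):
        self.here = here
        self.there = there
        self.closed = False

    def close(self) -> None:
        self.closed = True


demo_tunnel = FakeTunnel(here=9000, there=10000)
with closing(demo_tunnel) as open_tunnel:
    # Work with the tunnel here; close() runs even if this block raises.
    print("forwarding local port", open_tunnel.here)

print(demo_tunnel.closed)  # True
```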

In particular, you can try to SSH into the node itself through a tunnel, as long as you use the cluster key.

Let's try to do that.

In [26]:
target_node = nodes[1]

In [27]:
tunnel2 = target_node.tunnel(here=target_node.port, there=target_node.port)
tunnel2

MultiHopTunnel(48947:48947)

The tunnel should now be open. If you have an SSH client installed, you can copy the command printed below and try to run it in a terminal.

In [28]:
print("ssh -p {port} -i {key} {user}@localhost".format(
    port=tunnel2.here,
    key=cluster.config.key,
    user=cluster.config.user))

ssh -p 48947 -i C:\Users\Maciej/.ssh\id_rsa_6p plggarstka@localhost
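
If the key path contains spaces, a command printed this way will not work when pasted into a shell as-is. The sketch below quotes each argument with `shlex.quote` from the standard library. Note the assumptions: `build_ssh_command` is a hypothetical helper, the values are made up, and `shlex` targets POSIX shells, so quoting rules for the Windows command prompt differ:

```python
import shlex


def build_ssh_command(port: int, key_path: str, user: str,
                      host: str = "localhost") -> str:
    """Build an ssh command line, quoting each argument for a POSIX shell."""
    args = ["ssh", "-p", str(port), "-i", key_path, "{}@{}".format(user, host)]
    return " ".join(shlex.quote(arg) for arg in args)


# Hypothetical values; the key path deliberately contains a space.
print(build_ssh_command(48947, "/home/user/my keys/id_rsa", "plguser"))
# prints: ssh -p 48947 -i '/home/user/my keys/id_rsa' plguser@localhost
```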


After you're done, you can close the tunnel:

In [29]:
tunnel2.close()

Here is a quicker way to do what we just did manually:

In [30]:
tunnel3 = target_node.tunnel_ssh()
tunnel3

ssh -i "C:\Users\Maciej/.ssh\id_rsa_6p" -p 48947 plggarstka@localhost

In [31]:
tunnel3.close()

## Cancel the allocation

It's important to cancel an allocation if you're done with it early, in order to minimize the CPU time you are charged for.

In [32]:
nodes.running()

True

In [33]:
nodes.cancel()

2018-11-29 01:20:07 INFO: Cancelling job 14373726.


In [34]:
nodes.running()

False
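
In a script, as opposed to an interactive notebook, it is easy to forget the cancellation when the work fails partway through. A `try`/`finally` block is one way to guard against that. The sketch below uses a hypothetical `FakeAllocation` stand-in that only mimics the `running()` and `cancel()` calls shown above, so it runs without a cluster:

```python
class FakeAllocation:
    """Hypothetical stand-in exposing running() and cancel()."""

    def __init__(self):
        self._running = True

    def running(self) -> bool:
        return self._running

    def cancel(self) -> None:
        self._running = False


def run_with_allocation(allocation, work) -> None:
    """Run work(allocation), then cancel the allocation, even on error."""
    try:
        work(allocation)
    finally:
        if allocation.running():
            allocation.cancel()


def failing_work(allocation):
    raise RuntimeError("simulated failure during the computation")


allocation = FakeAllocation()
try:
    run_with_allocation(allocation, failing_work)
except RuntimeError as error:
    print("work failed:", error)

print(allocation.running())  # False
```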

## Next notebook

In the next notebook, we will deploy a Jupyter Notebook instance on an allocated compute node, and access it from the local computer.