# 02. Allocating nodes

## Overview

In this notebook, you will learn how to:

 - Allocate nodes on a cluster.
 - Examine allocated nodes.
 - Open tunnels to nodes.
 - Cancel node allocations.

## Import idact

It's recommended that *idact* is installed with *pip*.  
Alternatively, install the dependencies with `pip install -r requirements.txt` and add *idact* to the Python path, for example:  
`import sys`  
`sys.path.append('<YOUR_IDACT_PATH>')`

We will use a wildcard import for convenience:

In [None]:
from idact import *
import bitmath

## Load the cluster

We will use the cluster we added in the previous notebook.

Let's load the environment first:

In [None]:
load_environment()

Now let's show the cluster. Make sure to use the right name if you've changed it.

In [None]:
cluster = show_cluster("test")
cluster

Let's connect to the access (head) node to make sure everything is configured correctly:

In [None]:
access_node = cluster.get_access_node()
access_node.connect()

If there are any issues, please review the previous notebook and make sure that the cluster config is properly set up.

## Allocate nodes

For demonstration purposes, we will allocate two nodes, each with two cores and 10 GiB of memory, with a walltime of 10 minutes.

Currently, only [Slurm Workload Manager](https://slurm.schedmd.com/) is supported as the job scheduler.
Both nodes will be allocated as a single Slurm job.

You will likely need to change the native Slurm argument `--account` to specify the account that will be charged for the resources used; see the [sbatch documentation](https://slurm.schedmd.com/sbatch.html).
To find out more about the cluster used for the development of idact, see [Prometheus cluster](https://garstka.github.io/idact/develop/html/prometheus.html).

In [None]:
nodes = cluster.allocate_nodes(nodes=2,
                               cores=2,
                               memory_per_node=bitmath.GiB(10),
                               walltime=Walltime(minutes=10),
                               native_args={
                                   '--account': 'intdata'
                               })
nodes

`Walltime` is a helper class that can be used to specify the job duration down to seconds:

In [None]:
Walltime(days=1, hours=6, minutes=30, seconds=30)
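
For intuition, Slurm itself accepts time limits in the form `days-hours:minutes:seconds`, among others. The sketch below shows how a duration given in days, hours, minutes and seconds could be normalized into that form. The helper `format_slurm_time` is hypothetical and is not part of *idact*; how `Walltime` is converted internally is not shown in this notebook.

```python
def format_slurm_time(days=0, hours=0, minutes=0, seconds=0):
    """Format a duration as a Slurm-style `days-hours:minutes:seconds` string.

    Hypothetical helper for illustration only; not part of idact.
    """
    total = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    if total <= 0:
        raise ValueError("Duration must be positive.")
    # Normalize overflowing fields, e.g. 90 minutes becomes 1 hour 30 minutes.
    d, rem = divmod(total, 86400)
    h, rem = divmod(rem, 3600)
    m, s = divmod(rem, 60)
    return "{}-{:02d}:{:02d}:{:02d}".format(d, h, m, s)


print(format_slurm_time(days=1, hours=6, minutes=30, seconds=30))  # 1-06:30:30
print(format_slurm_time(minutes=10))  # 0-00:10:00
```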

Memory per node was given in `GiB` (gibibytes), a unit from the `bitmath` library. You can use
related units too, e.g.:

In [None]:
bitmath.MiB(500)

In [None]:
bitmath.TiB(0.1)

For more, see [bitmath on PyPI](https://pypi.org/project/bitmath/).
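
To make the binary prefixes concrete, here is a small sketch in plain Python, without `bitmath`, showing how the quantities used above relate to each other. The names `UNITS`, `to_bytes` and `to_mib` are made up for this example.

```python
# Binary (IEC) prefixes are powers of 1024, not 1000.
UNITS = {"KiB": 1024, "MiB": 1024 ** 2, "GiB": 1024 ** 3, "TiB": 1024 ** 4}


def to_bytes(value, unit):
    """Convert a quantity in the given binary unit to whole bytes."""
    return int(value * UNITS[unit])


def to_mib(value, unit):
    """Convert a quantity in the given binary unit to mebibytes."""
    return value * UNITS[unit] / UNITS["MiB"]


print(to_bytes(10, "GiB"))  # 10737418240
print(to_mib(10, "GiB"))    # 10240.0
print(to_mib(0.1, "TiB"))   # roughly 104857.6
```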

## Wait for allocation

We can see that the returned nodes are not allocated yet:

In [None]:
nodes

We need to wait for the resources to be allocated. It shouldn't take long, provided the cluster is not too busy.

The `wait` method blocks until the job is allocated or until there is a keyboard interrupt.
Alternatively, you can limit the waiting time by passing the number of seconds in the `timeout` parameter; see [the documentation](https://garstka.github.io/idact/develop/html/api/idact.html#idact.Nodes.wait).

In [None]:
nodes.wait()
nodes
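
The general idea of waiting with a timeout can be sketched as a polling loop. This is not how *idact* implements `wait`; `wait_until` is a hypothetical helper, and the real method may behave differently on timeout (for example, by raising an exception instead of returning `False`).

```python
import time


def wait_until(predicate, timeout=None, interval=1.0):
    """Poll `predicate` until it returns True or `timeout` seconds pass.

    Returns True if the predicate succeeded, False on timeout.
    With timeout=None, polls until success or a keyboard interrupt.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Possible use with the nodes from this notebook (requires a live allocation):
# allocated = wait_until(nodes.running, timeout=60, interval=5)
```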

Nodes should now be allocated. We can check that at any time by calling:

In [None]:
nodes.running()

We can run commands on each node now:

In [None]:
nodes[0].run('whoami')

In [None]:
nodes[0].run('hostname')

In [None]:
nodes[1].run('whoami')

In [None]:
nodes[1].run('hostname')

## Examine allocated nodes

Let's take a look at one of the allocated nodes.

### Connection details

We can get a node's hostname and SSH port:

In [None]:
nodes[0].host

In [None]:
nodes[0].port

The SSH port differs each time, because *idact* deploys its own SSH daemon for each allocation. The daemon accepts connections based on the contents of the file
`~/.ssh/authorized_keys.idact`.


### Resources

We can examine a node's resources and their usage:

In [None]:
nodes[0].resources.memory_total

In [None]:
nodes[0].resources.memory_usage

In [None]:
nodes[0].resources.cpu_cores

In [None]:
nodes[0].resources.cpu_usage

For more information, see the [documentation of NodeResourceStatus](https://garstka.github.io/idact/develop/html/api/idact.html#idact.NodeResourceStatus).

## Tunnel

You can open a tunnel to any port on the node, e.g.:

In [None]:
tunnel = nodes[0].tunnel(here=9000, there=10000)

In [None]:
tunnel

In [None]:
tunnel.close()

In particular, you can try to SSH into the node itself through a tunnel, as long as you use the cluster key.

Let's try to do that.

In [None]:
target_node = nodes[1]

In [None]:
tunnel2 = target_node.tunnel(here=target_node.port, there=target_node.port)
tunnel2

The tunnel should now be open. If you have an SSH client installed, you can copy the command printed below and try to run it in a terminal.

In [None]:
print("ssh -p {port} -i {key} {user}@localhost".format(
    port=tunnel2.here,
    key=cluster.config.key,
    user=cluster.config.user))
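
Before running the SSH command, you may want to confirm that the local end of the tunnel accepts connections. The following is a minimal sketch using only the standard library; `is_port_open` is a hypothetical helper, not an *idact* function.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Possible use while the tunnel is open:
# is_port_open("localhost", tunnel2.here)
```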

After you're done, you can close the tunnel:

In [None]:
tunnel2.close()

Here is a quicker way to do what we just did manually:

In [None]:
tunnel3 = target_node.tunnel_ssh()
tunnel3

In [None]:
tunnel3.close()

## Cancel the allocation

It's important to cancel an allocation if you're done with it early, in order to minimize the CPU time you are charged for.

In [None]:
nodes.running()

In [None]:
nodes.cancel()

In [None]:
nodes.running()
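
To avoid forgetting the cancellation, for example when an error or a keyboard interrupt cuts your work short, you could wrap the allocation in a context manager. This is a sketch under the assumption that the allocated object exposes the `wait()` and `cancel()` methods used above; the `allocation` helper itself is hypothetical and not provided by *idact*.

```python
from contextlib import contextmanager


@contextmanager
def allocation(allocate):
    """Allocate nodes, wait for them, and always cancel on exit.

    `allocate` is any zero-argument callable returning an object
    with `wait()` and `cancel()` methods.
    """
    nodes = allocate()
    try:
        nodes.wait()
        yield nodes
    finally:
        # Runs on normal exit, on exceptions, and on keyboard interrupts.
        nodes.cancel()


# Possible use with the cluster from this notebook:
# with allocation(lambda: cluster.allocate_nodes(
#         nodes=2, cores=2, memory_per_node=bitmath.GiB(10),
#         walltime=Walltime(minutes=10),
#         native_args={'--account': 'intdata'})) as nodes:
#     nodes[0].run('hostname')
```

Because `wait()` is called inside the `try` block, interrupting the wait also cancels the pending job.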

## Next notebook

In the next notebook, we will deploy a Jupyter Notebook instance on an allocated compute node, and access it from the local computer.