# 01. Connecting to a cluster

## Introduction

This is the first in a series of notebooks that will show you how to use *idact* step by step.

## Overview

In this notebook, you will learn how to:

 - Add and configure a cluster.
 - Connect to the cluster.
 - Save and load your environment.
 - Modify and remove cluster config.

## Import idact

We recommend installing *idact* with *pip*. Alternatively, install the dependencies with `pip install -r requirements.txt` and add *idact* to the Python path, for example:

In [1]:
import sys
sys.path.append('../')

We will use a wildcard import for convenience:

In [2]:
from idact import *
import os

## Add a cluster

In order to connect to a cluster, you will need to configure its parameters, like hostname, user, and authentication method.

### Authentication method

The recommended authentication method is a public/private key pair.

In [3]:
auth = AuthMethod.PUBLIC_KEY

If you don't have an SSH key to connect to the cluster, you can specify `KeyType.RSA` to generate one.
The generated key will have a unique name, so you don't need to worry about overwriting any existing keys.

In [4]:
key = KeyType.RSA  # Generate a new RSA key (Default location: ~/.ssh)

If you already have a key you want to use, uncomment the following line and provide its absolute path:

In [5]:
# key = os.path.expanduser('~/.ssh/id_rsa')
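
Before passing an existing key, it can help to confirm that the path actually points to a file. The following is a minimal sketch using only the standard library; `resolve_key_path` is a hypothetical helper written for this notebook, not part of *idact*:

```python
import os


def resolve_key_path(path):
    """Expand '~' and return an absolute path, or raise if no file exists there."""
    full_path = os.path.abspath(os.path.expanduser(path))
    if not os.path.isfile(full_path):
        raise FileNotFoundError("No private key found at: " + full_path)
    return full_path
```

For example, `key = resolve_key_path('~/.ssh/id_rsa')` could stand in for the commented-out line above, so that a mistyped path fails early rather than at connection time.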

You can install the key manually, or let *idact* do it for you.

If you leave `install_key` below as `True`, you will be asked for a password later. If you change it to `False`, *idact* will assume the key is already installed.

In [6]:
install_key = True

### SSH connection info

You will need:

 - The name of the user to log in as.
 - The hostname of the login (head) node.
 - The SSH port.

Replace the following values with your own:

In [7]:
user = "plggarstka"
host = "pro.cyfronet.pl"
port = 22
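
A quick sanity check of these values can catch typos before any connection is attempted. This is only a sketch: `check_connection_info` is a hypothetical helper, not an *idact* function, and it checks for obviously malformed input rather than verifying that the host is reachable.

```python
def check_connection_info(user, host, port):
    """Raise ValueError if any SSH connection parameter is obviously malformed."""
    if not user or any(ch.isspace() for ch in user):
        raise ValueError("user must be non-empty and contain no whitespace")
    if not host or any(ch.isspace() for ch in host):
        raise ValueError("host must be non-empty and contain no whitespace")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError("port must be an integer between 1 and 65535")


# Passes silently for the example values used in this notebook.
check_connection_info("plggarstka", "pro.cyfronet.pl", 22)
```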

### Cluster name

Each cluster is identified by a unique name, e.g.:

In [8]:
name = "hpc"

### Finally, add the cluster

Pass the parameters defined above to `add_cluster`:

In [9]:
cluster = add_cluster(name=name,
                      user=user,
                      host=host,
                      port=port,
                      auth=auth,
                      key=key,
                      install_key=install_key)
cluster

2018-11-23 20:31:21 INFO: Generating public-private key pair.


Cluster(pro.cyfronet.pl, 22, plggarstka, auth=AuthMethod.PUBLIC_KEY, key='C:\\Users\\Maciej/.ssh\\id_rsa_6p', install_key=True, disable_sshd=False)

If you'd like to learn more about `add_cluster`, or any *idact* member, please refer to the [API documentation](https://garstka.github.io/idact/develop/html/api/idact.html).

## Connect to the cluster

We will now test the connection to the cluster's head node.

In [10]:
node = cluster.get_access_node()
node

Node(pro.cyfronet.pl:22, None)

If you set `install_key=True`, then on your first action, you may be asked for a password to install the key. Let's do this right now by connecting explicitly:

In [11]:
node.connect()

2018-11-23 20:31:30 INFO: Installing key using password authentication.


Password for plggarstka@pro.cyfronet.pl:22:  


You should now be able to run simple commands on the head node:

In [12]:
node.run("whoami")

'plggarstka'

In [13]:
node.run("hostname")

'login01.pro.cyfronet.pl'

If you have trouble connecting or performing any other action using *idact*,
the exception message may not provide enough information to troubleshoot the problem.

In that case, you may want to examine the debug log, which is saved per-session as `idact.log` in the current working directory.
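
To look at the most recent log entries without leaving the notebook, something like the following could be used. It is a minimal sketch that assumes the log is a plain text file readable as UTF-8; `tail_log` is a hypothetical helper, not part of *idact*.

```python
from collections import deque
from pathlib import Path


def tail_log(path="idact.log", lines=20):
    """Return the last `lines` lines of the log file as a list of strings."""
    log_path = Path(path)
    if not log_path.is_file():
        return []  # No log yet, e.g. before the first idact call in this directory.
    with log_path.open(encoding="utf-8", errors="replace") as log_file:
        # deque with maxlen keeps only the final `lines` entries while reading.
        return [line.rstrip("\n") for line in deque(log_file, maxlen=lines)]


for entry in tail_log():
    print(entry)
```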

## Save the environment

Now that we've added the cluster and made sure that public key authentication works, it would be a shame to have to do that again every time we need to access the cluster.

To keep our changes, let's save the environment.

In [14]:
save_environment()

The default config file location is `~/.idact.conf`.
You can change it by providing the `path` parameter; see [save_environment](https://garstka.github.io/idact/develop/html/api/idact.html#idact.save_environment).
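
If you already have a config file at that location, you may want to keep a copy before saving over it. The sketch below uses only the standard library and assumes the file lives at the default path given above; `backup_config` is a hypothetical helper, not an *idact* function.

```python
import os
import shutil


def backup_config(path="~/.idact.conf"):
    """Copy the config file to '<path>.bak'; return the backup path, or None if absent."""
    full_path = os.path.expanduser(path)
    if not os.path.isfile(full_path):
        return None  # Nothing to back up yet.
    backup_path = full_path + ".bak"
    shutil.copy2(full_path, backup_path)  # copy2 also preserves file timestamps
    return backup_path
```

Calling `backup_config()` before `save_environment()` would leave the previous config available as `~/.idact.conf.bak`.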

## Load the environment

After saving the environment, we can load it by calling:

In [15]:
load_environment()

`load_environment` replaces the current environment used by top-level functions such as `add_cluster`. We still have access to any leftover objects from the previous environment, though.

To access the recently added cluster using the new environment, we need to execute:

In [16]:
cluster = show_cluster(name)
cluster

Cluster(pro.cyfronet.pl, 22, plggarstka, auth=AuthMethod.PUBLIC_KEY, key='C:\\Users\\Maciej/.ssh\\id_rsa_6p', install_key=False, disable_sshd=False)

You can also view all clusters in the environment by calling:

In [17]:
show_clusters()

{'hpc': Cluster(pro.cyfronet.pl, 22, plggarstka, auth=AuthMethod.PUBLIC_KEY, key='C:\\Users\\Maciej/.ssh\\id_rsa_6p', install_key=False, disable_sshd=False)}

Let's access the node again, this time using the new cluster object:

In [18]:
node = cluster.get_access_node()
node

Node(pro.cyfronet.pl:22, None)

We should be able to connect without needing a password, using the key instead:

In [19]:
node.connect()

In [20]:
node.run("whoami")

'plggarstka'

In [21]:
node.run("hostname")

'login01.pro.cyfronet.pl'

## Examine and modify the cluster

You can always access the cluster config like this:

In [22]:
cluster.config

(pro.cyfronet.pl, 22, plggarstka, auth=AuthMethod.PUBLIC_KEY, key='C:\\Users\\Maciej/.ssh\\id_rsa_6p', install_key=False, disable_sshd=False)

In [23]:
cluster.config.user

'plggarstka'

To view all configurable cluster parameters, visit the [API documentation for ClusterConfig](https://garstka.github.io/idact/develop/html/api/idact.html#idact.ClusterConfig).
We will modify a few of them in other tutorials.

You can easily modify any parameter we passed to `add_cluster`, for example:

In [24]:
correct_user = cluster.config.user
cluster.config.user = "another_user"
cluster.config.user

'another_user'

In [25]:
cluster.config.user = correct_user
cluster.config.user

'plggarstka'

If you want a config change to be saved, make sure to call `save_environment` after making it.
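
The change-then-restore pattern from the two cells above can be wrapped in a context manager so that the original value is restored even if an error occurs in between. This is a sketch, not part of *idact*: `temporary_attr` is a hypothetical helper, and it assumes the config object supports plain attribute reads and writes, as `cluster.config.user` does above. A `SimpleNamespace` stands in for the real config so the example runs on its own.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_attr(obj, name, value):
    """Set obj.<name> to value inside the with-block, then restore the original."""
    original = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        setattr(obj, name, original)


# Stand-in object for illustration only; with idact you would pass cluster.config.
config = SimpleNamespace(user="plggarstka")
with temporary_attr(config, "user", "another_user"):
    print(config.user)  # another_user
print(config.user)  # plggarstka
```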

## Remove a cluster

If you ever need to start over, just use `remove_cluster`. Let's try it out on a fake cluster:

In [26]:
add_cluster(name="fake",
            user="fakeuser",
            host="fakehost",
            port=2222)

2018-11-23 20:32:17 INFO: No auth method specified, defaulting to password-based.


Cluster(fakehost, 2222, fakeuser, auth=AuthMethod.ASK, key=None, install_key=True, disable_sshd=False)

In [27]:
show_clusters()

{'hpc': Cluster(pro.cyfronet.pl, 22, plggarstka, auth=AuthMethod.PUBLIC_KEY, key='C:\\Users\\Maciej/.ssh\\id_rsa_6p', install_key=False, disable_sshd=False),
 'fake': Cluster(fakehost, 2222, fakeuser, auth=AuthMethod.ASK, key=None, install_key=True, disable_sshd=False)}

In [28]:
remove_cluster("fake")

In [29]:
show_clusters()

{'hpc': Cluster(pro.cyfronet.pl, 22, plggarstka, auth=AuthMethod.PUBLIC_KEY, key='C:\\Users\\Maciej/.ssh\\id_rsa_6p', install_key=False, disable_sshd=False)}

## Next notebook

In the next notebook, you will find out how to allocate compute nodes on the cluster.