# How to use a cluster for your Python code

28th January 2021, DYNOSOB group seminar

### I) Setting up: cloning the git repository onto your local machine

```shell
$ git clone https://github.com/Saptarshi07/cluster-tutorial
```

### II) Python basics: Command line inputs

#### a) No command line input:

*python file:*

no_commandline.py

```python
import sys  # unused here; kept for comparison with commandline.py

u = 3
v = 2
# ...
x = u - v + u*v

print(x)
```

*terminal*

```shell
$ python no_commandline.py
7
```

#### b) Command line input

*python file*:

commandline.py

```python
import sys

# sys.argv[0] is the script name; the arguments start at index 1
u = int(sys.argv[1])
v = int(sys.argv[2])
# ...
x = u - v + u*v

print(x)
```

*terminal*

```shell
$ python commandline.py 3 2
7
```
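
The values in `sys.argv` are plain strings, and a missing argument raises an `IndexError`, so it can help to validate the inputs. The following is a minimal sketch using the standard-library `argparse` module. The function names `compute` and `parse_args` are illustrative and not part of the repository.

```python
import argparse


def compute(u, v):
    """Same arithmetic as in commandline.py."""
    return u - v + u * v


def parse_args(argv=None):
    """Parse two integers from a list of strings (or from sys.argv if None)."""
    parser = argparse.ArgumentParser(description="Compute u - v + u*v")
    parser.add_argument("u", type=int, help="first integer")
    parser.add_argument("v", type=int, help="second integer")
    return parser.parse_args(argv)


# An explicit list is passed here for demonstration;
# in a script, parse_args() with no argument reads the command line.
args = parse_args(["3", "2"])
print(compute(args.u, args.v))
```

With `argparse`, running the script without arguments or with non-integer values prints a usage message instead of a traceback.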



## What is a cluster? What is SLURM?

For our purposes:

1) A cluster is a collection of interconnected computing **nodes** at our disposal  
2) Each node is a collection of CPUs, just like your computer  
3) SLURM (Simple Linux Utility for Resource Management) is the software that schedules jobs and allocates resources across the nodes, acting as the "glue" that joins them.



Important links:  



http://evoltheo001/wiki/index.php/Ada (information about ada)  
http://ada1/doc/slurm/tutorials.html (SLURM tutorials)  
http://ada1/ganglia/ (Monitoring the ada cluster -- not too important)  


### III) Part 1: Tunneling into ADA (the MPI Evolbio cluster):

In the command line type:

```shell
$ ssh username@ada1.evolbio.mpg.de
```

You will be prompted for your ada password (given to you by Derk). You can avoid entering the password every time by setting up key-based login on your own machine as follows:

1) Generate an SSH key pair:

```shell
$ ssh-keygen -t rsa -b 4096 -C "[username]@evolbio.mpg.de"
```

Press ENTER at every prompt to accept the defaults (including the passphrase prompts, which leaves the passphrase empty).

2) Create a file named `config` inside the `.ssh` directory in your home directory (`~/.ssh/config`)

3) Paste the following in your config file: 

```shell
Host ada
    HostName ada1.evolbio.mpg.de
    User [username_given_by_derk]
```
4) Run the following in your terminal. When prompted for a password, enter the one given to you by Derk:

```shell
$ ssh-copy-id ada
```

All you need to do now is type, from anywhere on your machine:

```shell
$ ssh ada 
```

You should now be logged into ada. You can tell by the prompt:

```shell
[username@ada1 ~]
```

### Part 1.5: Some useful SLURM commands

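```shell
# list all jobs currently in the queue
$ squeue
```

```shell
# list only the jobs of one user
$ squeue -u [username]
```

```shell
# show fair-share information
$ sshare -all
```

```shell
# show CPU counts by state (allocated/idle/other/total)
$ sinfo -O cpusstate
```

```shell
# show the details of one job
$ scontrol show job [job id]
```

```shell
# show accounting records of a user's jobs since a start date
$ sacct -S [start-date] -u [username]
```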

### Part 2: Transferring files from local to cluster and cluster to local:

### a) from local to cluster

In the command line type (from anywhere in your local machine):

```shell
$ rsync -av [path_of_transferable] username@ada1.evolbio.mpg.de:/home/username/[any further path]
```

### b) from cluster to local


In the command line type (from anywhere in your local machine):

```shell
$ rsync -av username@ada1.evolbio.mpg.de:/home/username/[path to folder or file] [path in your local computer]
```

If you run rsync from the local directory you want to transfer into, use `.` as the destination:

```shell
$ rsync -av username@ada1.evolbio.mpg.de:/home/username/[path to folder or file] .
```

### Part 3: Running a single python file on the cluster (no batch file)

```shell
$ python simple_python.py simple_text.py
```

### Part 4: Running a single python file on the cluster (with batch file) - first job submission

```shell
$ sbatch my_first_batch.sh
```

Check the details of the job you just submitted:

```shell
$ scontrol show job [jobid]
```

Then edit your batch file `my_first_batch.sh` to add the options `--output=output.out` and `--error=error.err` (as `#SBATCH` lines), so that the job's standard output and standard error are written to those files.

### Part 5: Running the array job

```shell
$ sbatch --array=1-3 my_array.sh
```
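
Within an array job, SLURM sets the environment variable `SLURM_ARRAY_TASK_ID` to the index of the current task (1, 2 or 3 for the command above). A Python script can read it to pick its own inputs. The sketch below assumes a hypothetical parameter list `PARAMS` and hypothetical helper names; what `my_array.sh` actually runs may differ.

```python
import os

# Hypothetical inputs, one (u, v) pair per array task
PARAMS = [(3, 2), (4, 1), (5, 5)]


def task_index(env=os.environ, default=1):
    """Return the array task index set by SLURM, or `default` outside SLURM."""
    return int(env.get("SLURM_ARRAY_TASK_ID", default))


def params_for_task(task_id):
    """Map the 1-based task index to a parameter pair."""
    return PARAMS[task_id - 1]


# Each array task runs this same script but sees a different task index
u, v = params_for_task(task_index())
print(u - v + u * v)
```

The fallback to a default index lets the same script run on your local machine, where the variable is not set.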

### Part 6: A problem with numbers:

```shell
$ cd numbers
```

`sigmoid.py` calculates the sigmoid function:

$$ y = \frac{K}{1 + e^{-\beta(x - x_0)}}$$

at the point $x$ for the parameters $K, \beta, x_0$ given in the text file passed as a command-line argument.

```shell
$ python sigmoid.py test_point.txt
```
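
As a rough illustration of the formula, here is a minimal sketch of such a calculation. It assumes a hypothetical file layout of four whitespace-separated numbers in the order `x K beta x0`; the actual format of `test_point.txt` and the implementation in the repository may differ.

```python
import math


def sigmoid(x, K, beta, x0):
    """Evaluate y = K / (1 + exp(-beta * (x - x0)))."""
    return K / (1.0 + math.exp(-beta * (x - x0)))


def read_point(path):
    """Read x, K, beta, x0 from a text file (assumed layout: four numbers)."""
    with open(path) as handle:
        x, K, beta, x0 = (float(token) for token in handle.read().split())
    return x, K, beta, x0
```

At the midpoint $x = x_0$ the exponent is zero, so the function returns $K/2$, which is a quick sanity check for any parameter file.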

### Running the sigmoid as an array job on the cluster:

```shell 
$ sbatch --array=1-2 sigmoid_array.py
```