# Configuration
First, we specify the Slurm server and login information:

In [1]:
import slurmqueen

nots = slurmqueen.SlurmServer('nots.rice.edu', 'jmd11', 'C:/Users/Jeffrey/.ssh/id_rsa')

`nots.rice.edu` is the Slurm server to connect to. `jmd11` is the account to use on the server. `C:/Users/Jeffrey/.ssh/id_rsa` is an SSH private key used to connect with the server; the corresponding public key should be added to the Slurm server (see [here](https://adamdehaven.com/blog/how-to-generate-an-ssh-key-and-add-your-public-key-to-the-server-for-authentication/) for details on generation and use of SSH keys).

Next, we choose a partition on the Slurm cluster (e.g. `scavenge`), a local directory on the current machine (e.g. `C:/Work/Projects/SlurmQueen/example`) and a remote directory on the Slurm cluster (e.g. `/scratch/jmd11/experiments/slurmqueen`). The remote directory should be in the scratch filesystem of the Slurm cluster if possible. Both directories are created when a job is started if they do not already exist.

In [2]:
config = slurmqueen.ExperimentConfig(
    server=nots,
    partition='scavenge',
    local_directory='C:/Work/Projects/SlurmQueen/example',
    remote_directory='/scratch/jmd11/experiments/slurmqueen')

# Defining an experiment

When running many experiments on the same tool, it is convenient to define a subclass of ```SlurmExperiment``` to hold any additional server setup. The positional arguments to SlurmExperiment are, in order:
1. A path (relative to ```local_directory``` and ```remote_directory```) where the input and output files for this experiment should be stored.
2. The command used to run the tool. In this case, ```example_tool.py``` is run through Python.
3. The list of tasks to execute for this experiment; see below.
4. A list of file dependencies for this tool. Note that each can also be a Unix glob to capture multiple files. In this case, our example tool requires a single file to run: ```example_tool.py```.

Our example tool requires Python 3 to be installed on the cluster. We satisfy this dependency by loading the ```Anaconda3/5.0.0``` module on the cluster before running the tool. The string passed to `setup_commands` is copied directly to the [script used to eventually submit the Slurm job](https://github.com/Kasekopf/SlurmQueen/blob/master/example/experiments/slurm_test_1/_run.sh), after pregenerated SBATCH arguments but before any tasks are executed. Custom arguments to `sbatch` can also be provided here, e.g. `#SBATCH --mem=0` to allow full use of the node memory.

By default, everything the tool writes to stdout is stored in a ```.out``` file and automatically parsed so that it can be queried with SQL, while everything written to stderr is stored in a ```.log``` file and not parsed. This behavior can be adjusted with the optional arguments ```output_argument``` (defaults to ```'>>'```, indicating stdout) and ```log_argument``` (defaults to ```'2>'```, indicating stderr) when initializing the ```SlurmExperiment```.

In [3]:
class SlurmTest(slurmqueen.SlurmExperiment):
    def __init__(self, experiment_id, changing_args):
        # Store all files for this experiment under experiments/<experiment_id>.
        slurmqueen.SlurmExperiment.__init__(self, 'experiments/' + experiment_id,
                                            'python3 example_tool.py',
                                            changing_args,
                                            dependencies=['example_tool.py'],
                                            setup_commands="module load Anaconda3/5.0.0")
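
The source of `example_tool.py` is not shown in this walkthrough. As a rough, hypothetical sketch, a tool with the same command-line interface (one positional string plus `--a` and `--b`) might look like the following. The behavior is inferred from the results table later in this document; the function name `run_tool` and the `Key: Value` output line format are assumptions, not SlurmQueen's documented format.

```python
import argparse


def run_tool(argv):
    """Hypothetical stand-in for example_tool.py; returns the lines it would print."""
    parser = argparse.ArgumentParser()
    parser.add_argument('text')                      # positional argument, e.g. 'AB'
    parser.add_argument('--a', type=int, default=0)
    parser.add_argument('--b', type=int, default=0)
    args = parser.parse_args(argv)
    # Assumed output format: one "Key: Value" pair per line on stdout.
    return ['Repeated Text: ' + args.text * 3,
            'Sum: %d' % (args.a + args.b)]


# Demo with the arguments of one task (text 'AB', a=0, b=1).
print('\n'.join(run_tool(['AB', '--a=0', '--b=1'])))
```

Anything such a tool prints to stderr would end up in the ```.log``` file rather than in the parsed results.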

We can then define a single experiment on ```example_tool.py``` by providing a name for this experiment and a list of tasks. Each task is defined by a dictionary of key/value pairs. Each pair is passed to the tool as a ```--key=value``` argument, with a few exceptions:
* The key ```''```, if it exists, indicates a list of positional arguments (which are given in the provided order).
* Keys that contain ```<``` or ```>``` are treated as shell redirections. For example, the pair ```'<': 'path/to/file'``` opens ```path/to/file``` as stdin.
* Keys that begin with ```'|'``` are not passed to each task. Such keys can be used when processing results.

In [4]:
slurm_test_1 = SlurmTest('slurm_test_1',
                         [{'': [chr(a+65) + chr(b+65)],
                           'a': a, 'b': b,
                           '|desc': '%d + %d' % (a, b)
                          } for a in range(3) for b in range(3)])
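
To make the argument rules above concrete, here is a minimal sketch of how one task dictionary could be rendered as a command line. The helper `build_command` is hypothetical and is not part of SlurmQueen; the ordering of arguments and the use of `shlex.quote` are assumptions made for illustration.

```python
import shlex


def build_command(command, task):
    """Illustrative only: render one task dictionary as a shell command line."""
    parts = [command]
    redirections = []
    for key, value in task.items():
        if key == '':
            # Positional arguments, kept in the provided order.
            parts.extend(shlex.quote(str(v)) for v in value)
        elif key.startswith('|'):
            # Keys beginning with '|' are not passed to the tool.
            continue
        elif '<' in key or '>' in key:
            # Shell redirection, e.g. {'<': 'path/to/file'} opens the file as stdin.
            redirections.append('%s %s' % (key, shlex.quote(str(value))))
        else:
            parts.append('--%s=%s' % (key, shlex.quote(str(value))))
    return ' '.join(parts + redirections)


task = {'': ['AB'], 'a': 0, 'b': 1, '|desc': '0 + 1'}
print(build_command('python3 example_tool.py', task))
# python3 example_tool.py AB --a=0 --b=1
```

Note that the `|desc` entry does not appear in the command; it is only carried along for use when processing results.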

## Running an experiment

Once we have defined an experiment and a configuration, we can run the experiment on the provided cluster with a single command. In this case, we run the 9 tasks using 2 Slurm workers (on two separate nodes in the cluster). Each worker is given a timeout of 5 minutes. The command below performs the following steps:
1. Creates a set of 9 ```.in``` files in ```experiments/slurm_test_1/```, each defining a single task.
2. Copies all ```.in``` files and all dependencies provided to the ```SlurmExperiment``` to the Slurm cluster (in ```remote_directory```).
3. Generates an appropriate Slurm script to run all tasks on the provided number of workers (distributed in round-robin).
4. Submits the Slurm job, returning the job id.
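
As an illustration of the round-robin distribution in step 3, the sketch below assigns 9 task ids to 2 workers. The function `round_robin` is hypothetical, and the exact assignment SlurmQueen produces may differ; this only shows the general idea.

```python
def round_robin(num_tasks, num_workers):
    """Assign task ids 0..num_tasks-1 to workers in round-robin order."""
    return [list(range(worker, num_tasks, num_workers))
            for worker in range(num_workers)]


for worker, tasks in enumerate(round_robin(9, 2)):
    print('worker %d: tasks %s' % (worker, tasks))
# worker 0: tasks [0, 2, 4, 6, 8]
# worker 1: tasks [1, 3, 5, 7]
```

With this scheme the workers receive nearly equal numbers of tasks, so the per-worker timeout must cover the worker with the most tasks.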

In [5]:
slurm_test_1.slurm_instance(config).run(2, '5:00')

Connected to nots.rice.edu
Created 9 local files
Compressed local files
Copied files to remote server
Attempting to submit job
Submitted batch job 1053370



`run` also accepts an optional argument not shown here, `cpus_per_worker`, which sets the number of CPUs to request on the Slurm node allocated to each worker (default `1`).

Once an experiment has finished running, we use a single additional command to download all results back to the local machine (and clean up the files on the cluster).

In [6]:
slurm_test_1.slurm_instance(config).complete()

Experiment complete. Compressing and copying results.
Deleting files from remote server: /scratch/jmd11/experiments/slurmqueen/experiments/slurm_test_1/


We can then use an SQL interface to query the results. All inputs and results appear in a table named `data`. The columns are all the named argument keys (`a`, `b`, and `|desc`), all the output keys (`Repeated Text` and `Sum`), and a column automatically generated by SlurmQueen (`file`, indicating the task id).

In [7]:
slurm_test_1.slurm_instance(config).query('SELECT * FROM data')

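| | Repeated Text | Sum | a | b | file | \|desc |
|---|---|---|---|---|---|---|
| 0 | AAAAAA | 0 | 0 | 0 | 0 | 0 + 0 |
| 1 | ABABAB | 1 | 0 | 1 | 1 | 0 + 1 |
| 2 | ACACAC | 2 | 0 | 2 | 2 | 0 + 2 |
| 3 | BABABA | 1 | 1 | 0 | 3 | 1 + 0 |
| 4 | BBBBBB | 2 | 1 | 1 | 4 | 1 + 1 |
| 5 | BCBCBC | 3 | 1 | 2 | 5 | 1 + 2 |
| 6 | CACACA | 2 | 2 | 0 | 6 | 2 + 0 |
| 7 | CBCBCB | 3 | 2 | 1 | 7 | 2 + 1 |
| 8 | CCCCCC | 4 | 2 | 2 | 8 | 2 + 2 |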
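
Column names such as `Repeated Text` and `|desc` contain characters that usually need quoting in SQL. The standalone sketch below rebuilds the `data` table from the rows shown above in an in-memory SQLite database and runs a filtered query. It uses Python's standard `sqlite3` module directly rather than SlurmQueen, and it assumes an SQLite-compatible dialect in which double quotes delimit identifiers; whether `query` behaves identically is not confirmed here.

```python
import sqlite3

# (Repeated Text, Sum, a, b, file, |desc), copied from the results above.
rows = [
    ('AAAAAA', 0, 0, 0, 0, '0 + 0'), ('ABABAB', 1, 0, 1, 1, '0 + 1'),
    ('ACACAC', 2, 0, 2, 2, '0 + 2'), ('BABABA', 1, 1, 0, 3, '1 + 0'),
    ('BBBBBB', 2, 1, 1, 4, '1 + 1'), ('BCBCBC', 3, 1, 2, 5, '1 + 2'),
    ('CACACA', 2, 2, 0, 6, '2 + 0'), ('CBCBCB', 3, 2, 1, 7, '2 + 1'),
    ('CCCCCC', 4, 2, 2, 8, '2 + 2'),
]

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE data ("Repeated Text" TEXT, "Sum" INTEGER, '
             'a INTEGER, b INTEGER, file INTEGER, "|desc" TEXT)')
conn.executemany('INSERT INTO data VALUES (?, ?, ?, ?, ?, ?)', rows)

# Double quotes are needed around names containing spaces or '|'.
query = 'SELECT "|desc", "Repeated Text" FROM data WHERE "Sum" >= 3 ORDER BY file'
result = conn.execute(query).fetchall()
print(result)
# [('1 + 2', 'BCBCBC'), ('2 + 1', 'CBCBCB'), ('2 + 2', 'CCCCCC')]
```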


## (Optional) Generating/Analyzing an experiment without a Slurm cluster

To aid in reproducibility, an experiment can be generated and analyzed even without access to a Slurm cluster. In particular, the ```Experiment``` class (in ```experiment.py```) is sufficient to generate all ```*.in``` files and to provide an SQL interface to the results.

We can define an experiment here purely as the command and set of arguments.

In [8]:
experiment_1 = slurmqueen.Experiment('python3 example_tool.py',
                                     [{'': [chr(a+65) + chr(b+65)],
                                       'a': a, 'b': b,
                                       '|desc': '%d + %d' % (a, b)
                                      } for a in range(3) for b in range(3)])

We then only need to provide a local directory in order to generate all ```*.in``` files.

In [9]:
experiment_1.instance('experiments/slurm_test_1').setup()

Each ```*.in``` file is a complete bash script that runs a single task and produces the corresponding ```.out``` file. Once all ```*.out``` files have been computed separately or provided in advance, the SQL interface can still be used to query the results.

In [10]:
experiment_1.instance('experiments/slurm_test_1').query('SELECT * FROM data WHERE a=1')

| | Repeated Text | Sum | a | b | file | \|desc |
|---|---|---|---|---|---|---|
| 0 | BABABA | 1 | 1 | 0 | 3 | 1 + 0 |
| 1 | BBBBBB | 2 | 1 | 1 | 4 | 1 + 1 |
| 2 | BCBCBC | 3 | 1 | 2 | 5 | 1 + 2 |
