# HPC intro

## atools

There are many situations in which you want to run an application for (potentially many) different input parameters.  These parameters can be command-line options for your application, names of input files, and so on.

Of course, you could submit a job for each of the instances of your problem, but that would result in many jobs.  Moreover, quite some bookkeeping would be required if some instances fail, while others succeed.  You typically don't have a convenient way to get an overview of which instances failed, and hence have to be redone.

Alternatively, you could process all these instances in a single job by looping over all the parameters.  This could result in prohibitively long run times and, more importantly, you would not be exploiting a supercomputer's main feature: executing work in parallel.

atools has been designed to make it easy for you to run many instances of a problem in parallel, and it takes care of the bookkeeping for you as well.  An instance of the problem that you want to compute is called a *work item* in the context of atools.

### Job script

The first step is to make a few modifications to your job script.  As an example, take the script from the [tutorial on jobs](020_jobs.ipynb) that simply calculates and displays the product of two numbers.

```bash
#!/usr/bin/env bash
#SBATCH --account=lp_multiscale_physics
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1g
#SBATCH --time=00:05:00

# actual computation, a bit boring
for i in $(seq 1 10)
do
    for j in $(seq 1 10)
    do
        echo $(( $i * $j ))
    done
done
```

In this job script, you do all computations sequentially, but to speed things up, you would like to do them in parallel as independent jobs.  So you can rewrite the job script such that it only does a single multiplication.

```bash
#!/usr/bin/env bash
#SBATCH --account=lp_multiscale_physics
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1g
#SBATCH --time=00:05:00

# actual computation, a bit boring
echo $(( $i * $j ))
```

The job script has been adapted to compute a single work item.

This is where atools comes in.  You can make a few more modifications to this job script to use it.  The values of `i` and `j` will be read from a comma-separated values (CSV) file.

The first line of this file lists the names of the variables; each subsequent line lists the values for one work item.  For this example, the file looks as follows.

```
i,j
1,1
1,2
1,3
...
10,8
10,9
10,10
```

You don't have to type all that: a data file `data.csv` that you can copy is available in the `021_artefacts` directory.
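For your own parameter sweeps, a file in this format can also be generated with a short loop.  The following is a minimal sketch; the file name `my_data.csv` is only an illustrative choice, picked so that the provided `data.csv` is not overwritten.

```bash
#!/usr/bin/env bash
# Sketch: generate the 100 work items of the multiplication example.
# The output file name is hypothetical, adapt it to your own setup.
csv_file="my_data.csv"

# first line: the names of the variables
echo "i,j" > "$csv_file"

# every following line: the values for one work item
for i in $(seq 1 10)
do
    for j in $(seq 1 10)
    do
        echo "$i,$j" >> "$csv_file"
    done
done
```

The resulting file has one header line followed by one line per work item, in the same order as the nested loops of the original sequential job script.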

As it stands, the job script that performs a single multiplication would fail, since the variables `i` and `j` are not defined in it.  atools can define them for each work item, but for that purpose you have to make a few more modifications to the job script.

1. Load the `atools` module.
2. Log the start of the work item.
3. Make sure that the variables used in the script are initialized.
4. Log the end of the work item.

```bash
#!/usr/bin/env bash
#SBATCH --account=lp_multiscale_physics
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=1g
#SBATCH --time=00:05:00

# make sure the module system starts from a clean slate and load the atools module
module purge
module load atools

# log the start of the work item
alog  --state start

# initialize the variables
source <(aenv --data data.csv)

# actual computation, a bit boring
echo $(( $i * $j ))

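# log the end of the work item; $? is the exit status of the computation above,
# so no other command should be placed between the computation and this line
alog  --state end  --exit $?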
```

Now your job script is fully adapted to use atools features.  It is available in the `021_artefacts` directory as `jobscript_parallel.slurm`.   Don't forget to change the credit account name to the one you have access to.
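To get a feeling for what the `source <(aenv --data data.csv)` line accomplishes, the idea can be mimicked in plain Bash.  The sketch below is an illustration only and is not how atools is implemented: `emit_vars` is a hypothetical helper, `demo.csv` is a stand-in data file, and it assumes that the work item is selected through the `SLURM_ARRAY_TASK_ID` environment variable that Slurm sets in array jobs.

```bash
#!/usr/bin/env bash
# Illustration only: mimic the idea behind `source <(aenv --data data.csv)`.
# emit_vars is a hypothetical helper, not part of atools.

# Print name='value' lines for the work item with the given ID.
# Work item 1 is the first data line, i.e., line 2 of the CSV file.
emit_vars() {
    local csv_file="$1" task_id="$2"
    local -a names values
    local k
    # variable names come from the header line
    IFS=',' read -r -a names < "$csv_file"
    # values come from the line that belongs to this work item
    IFS=',' read -r -a values < <(sed -n "$(( task_id + 1 ))p" "$csv_file")
    for k in "${!names[@]}"
    do
        printf "%s='%s'\n" "${names[$k]}" "${values[$k]}"
    done
}

# small stand-in data file with three work items
printf 'i,j\n1,1\n1,2\n1,3\n' > demo.csv

# inside an array job Slurm sets SLURM_ARRAY_TASK_ID; fall back to 2 for this demo
source <(emit_vars demo.csv "${SLURM_ARRAY_TASK_ID:-2}")

# actual computation, now with i and j defined
echo $(( i * j ))
```

Sourcing the generated assignments is what makes `i` and `j` available to the rest of the script, which is why the initialization has to come before the actual computation.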

### Job submission

You can submit an atools job almost the same way as an ordinary job, except that you need to specify the `--array` option for `sbatch`.  If you know the number of work items, 100 in the `data.csv` file you are using, you can simply use `--array=1-100`.  Otherwise, atools can help you determine it easily.

First, load the atools module.

```bash
module load atools
```

Next, submit the job as follows.

```bash
sbatch --array=$(arange --data data.csv) jobscript_parallel.slurm
```
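If you want to double-check the number of work items by hand, counting the data lines of the CSV file is enough.  The following is a minimal sketch that assumes exactly one header line and no blank lines; `count_items` and `demo.csv` are hypothetical names used for illustration, and the sketch does not reproduce what `arange` does internally.

```bash
#!/usr/bin/env bash
# Sketch: count the work items in a CSV file,
# assuming one header line and no blank lines.
count_items() {
    local csv_file="$1"
    # count all lines, then subtract the header line
    echo $(( $(grep -c '' "$csv_file") - 1 ))
}

# small stand-in data file with three work items
printf 'i,j\n1,1\n1,2\n1,3\n' > demo.csv

n=$(count_items demo.csv)
echo "--array=1-${n}"
```

For the `data.csv` file used in this tutorial, the same count should give the 100 work items mentioned above.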