# Running SLURM jobs from a notebook

One use case for Jupyter notebooks in an HPC environment is managing SLURM jobs and monitoring or visualizing results from jobs that are still running.

In this lesson, we will have a look at **SLURM magics** to manage jobs and **interactive analysis** of running jobs.

## Contents

- [SLURM magics](#SLURM-magics)
    - [Using SLURM magics](#Using-SLURM-magics)
- [Submitting and analyzing jobs](#Submitting-a-job-and-analyzing-results-on-the-fly)
    - [GROMACS as an example](#GROMACS-as-an-example)
    - [<font color="red"> Exercise 2.1</font>](#exercise21)

## SLURM magics

- Developed at [NERSC](http://www.nersc.gov/) (sources [here](https://github.com/NERSC/slurm-magic))
- Implements Jupyter magic commands for interacting with the SLURM workload manager
- Commands are spawned via `subprocess` and their output is captured in the notebook
- Arguments accepted by a SLURM command are also accepted by the corresponding magic command
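
To illustrate the idea of spawning a command via `subprocess` and capturing its output, here is a minimal sketch. It is not the slurm-magic source code: `run_command` is a hypothetical helper, and the Python interpreter stands in for a SLURM command so that the example runs on any machine.

```python
import subprocess
import sys


def run_command(argv):
    """Run a command and return its standard output as text.

    Hypothetical helper for illustration only; not part of slurm-magic.
    """
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode bytes to str
        check=True,               # raise CalledProcessError on non-zero exit
    )
    return result.stdout


# The Python interpreter is a stand-in for something like ["squeue", "-u", "username"]
output = run_command([sys.executable, "-c", "print('hello from a subprocess')"])
print(output)
```

On a cluster, the argument list could be replaced by a real SLURM command, assuming it is available on the `PATH` of the notebook server.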

> *"I’ll never have to leave a notebook again, that’s like the ultimate dream"*  
> (Anonymous SLURM-magic user)

### Using SLURM magics

The Python package ``slurm-magic`` is available in the ``prace`` environment:

```bash
$ module load anaconda/py36/5.0.1
$ source activate prace
$ jupyter-notebook
```

In the notebook, we then need to load the IPython extension: 

In [None]:
%load_ext slurm_magic

We can check the newly added magics provided by ``slurm-magic``:

In [None]:
%lsmagic

and try them out:

In [None]:
%squeue

In [None]:
%sinfo

We can also bring up a help menu by using a question mark:

In [None]:
%squeue?

## Submitting a job and analyzing results on the fly

### GROMACS as an example

[GROMACS](http://www.gromacs.org/) is a molecular dynamics simulation package designed for simulations of biological macromolecules (proteins, lipids, nucleic acids, etc.).

In this exercise we use [lysozyme in water](http://www.mdtutorials.com/gmx/lysozyme/index.html) as a model system to demonstrate how to use a Jupyter notebook to submit jobs and analyze results.

First, go to the ``gromacs_job`` folder

In [None]:
%cd gromacs_job

and check that the input files (.mdp, .top and .gro) are in the folder

In [None]:
%ls

Then, use the ``%%sbatch`` cell magic to submit a GROMACS job

In [None]:
%%sbatch
#!/bin/bash -l
#SBATCH -A 20XX-YY-ZZ
#SBATCH -N 1
#SBATCH -t 00:05:00
#SBATCH -J gromacs
module load gromacs/2018.3
gmx_seq grompp -f npt.mdp -c start.gro -p topol.top
gmx_seq mdrun -s topol.tpr -deffnm npt

Monitor your job with the ``%squeue`` line magic

In [None]:
%squeue -u username

As the simulation progresses, the output files are updated continuously, so you can start analyzing them and monitoring the progress of the simulation before the job has finished.

GROMACS has many utility programs for extracting information from its binary output files. To run them, we need to load the GROMACS module and call the `gmx_seq energy` program once for each property we wish to analyze. We use the ``%%bash`` cell magic to let Jupyter run these commands as a shell script:

In [None]:
%%bash

module load gromacs/2018.3
echo "Temperature" | gmx_seq energy -f npt.edr -o temperature.xvg
echo "Density" | gmx_seq energy -f npt.edr -o density.xvg
echo "Pressure" | gmx_seq energy -f npt.edr -o pressure.xvg

Let's define a function to extract data from the processed GROMACS xvg files:

In [None]:
def get_prop(prop):
    """Read the GROMACS file ``<prop>.xvg`` (e.g. "temperature", "pressure"
    or "density") and return lists of time and property values."""

    x = []
    y = []

    # The context manager closes the file even if parsing fails
    with open("%s.xvg" % prop, 'r') as f_prop:
        for line in f_prop:
            # Skip blank lines, comments (#) and plot directives (@)
            if not line.strip() or line[0] in ('#', '@'):
                continue
            content = line.split()
            x.append(float(content[0]))
            y.append(float(content[1]))

    return x, y
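
To make the file layout concrete, the sketch below writes a tiny hand-made file in the same format (header lines starting with `#` or `@`, followed by whitespace-separated columns) and parses it with the same logic. The file name `sample.xvg`, the numbers, and the helper `read_xvg` are invented for illustration and do not come from a real simulation.

```python
import os
import tempfile

# Hand-made sample in the xvg layout; the numbers are invented
SAMPLE = """\
# This file was created by hand for illustration
@    title "Density"
@    xaxis  label "Time (ps)"
0.000000  1003.5
1.000000  1010.2
2.000000  1015.8
"""


def read_xvg(path):
    """Return (x, y) lists from the first two columns of an xvg file."""
    x, y = [], []
    with open(path) as f_prop:
        for line in f_prop:
            # Skip blank lines, comments (#) and plot directives (@)
            if not line.strip() or line[0] in ('#', '@'):
                continue
            content = line.split()
            x.append(float(content[0]))
            y.append(float(content[1]))
    return x, y


# Write the sample to a temporary directory and parse it
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.xvg")
    with open(path, "w") as f_out:
        f_out.write(SAMPLE)
    time, dens = read_xvg(path)

print(time)
print(dens)
```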

Now import matplotlib and specify how plots should be displayed in the notebook:

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

and examine the evolution of density with respect to simulation time

In [None]:
time,dens = get_prop("density")
plt.plot(time,dens)

Note that the default units are kg/m<sup>3</sup> for density and ps for simulation time. You may improve the plot by adding ``xlabel``, ``ylabel``, etc.

In [None]:
plt.xlabel('Simulation time [ps]')
plt.ylabel('Density [kg/m$^3$]')
plt.plot(time,dens)

We can also examine the evolution of pressure with respect to time

In [None]:
time,pres = get_prop("pressure")
plt.plot(time,pres)
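
The instantaneous pressure in a molecular dynamics run typically fluctuates strongly, so a running average can make the trend easier to see. Below is a minimal sketch, assuming a simple trailing window; the helper `running_average`, the window size, and the sample numbers are illustrative choices rather than part of the lesson material.

```python
def running_average(values, window):
    """Average each value with up to `window - 1` preceding values."""
    averaged = []
    total = 0.0
    for i, value in enumerate(values):
        total += value
        # Drop the value that has just left the trailing window
        if i >= window:
            total -= values[i - window]
        averaged.append(total / min(i + 1, window))
    return averaged


# Invented sample data with large swings
sample = [10.0, -20.0, 30.0, -40.0, 50.0]
smoothed = running_average(sample, 3)
print(smoothed)
```

In the notebook, something like `plt.plot(time, running_average(pres, 10))` could overlay the smoothed curve on the raw data, where 10 is an arbitrary window length.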

Also look at the correlation between density and pressure

In [None]:
plt.plot(dens,pres[:len(dens)],'b+')
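
The scatter plot gives a visual impression of the relationship; a single number such as the Pearson correlation coefficient can complement it. The following is a minimal sketch in plain Python. The helper `pearson` is hypothetical and the sample data are invented; it truncates both lists to their common length, mirroring the slicing used in the plot.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two sequences.

    Both sequences are truncated to their common length. The result is
    undefined (division by zero) if either sequence is constant.
    """
    n = min(len(xs), len(ys))
    xs, ys = xs[:n], ys[:n]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented data: perfectly correlated and perfectly anti-correlated
r_up = pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
r_down = pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print(r_up, r_down)
```

With the lists from the cells above, `pearson(dens, pres)` would give the coefficient for the simulation data.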

<a id='exercise21'></a>

### <font color="red"> Exercise 2.1</font>

In this exercise, you will compile the hello-world MPI code, submit a batch job and have a look at the output. 

Try to do all the steps below from within this notebook:

1. Start by creating a new directory called `hello-world` under the `jupyter-notebook` directory (you may need `%cd ..` first), and `cd` into it.
2. Copy-paste the hello-world MPI code in C from [the HPC-Intro lesson](https://pdc-support.github.io/hpc-intro/08-compiling/#mpi-parallelized-code) into a code cell (**don't execute it yet**).
3. Add the `%%writefile hello_mpi.c` cell magic command at the top of the cell, and execute it.
4. Check that you have indeed created the file in the right directory (`%pwd` and `%ls` are your friends).
5. Compile the code using `mpicc -o hello_mpi hello_mpi.c`. Check that the executable has been created.
6. Write a new batch script in a cell (or copy-paste the cell from the [GROMACS section above](#GROMACS-as-an-example) using `c` and `v`). It should:
    - request 1 node for 5 minutes using the edu18.prace allocation
    - load the `gcc/7.2.0` and `openmpi/3.0-gcc-7.2` modules 
    - execute your executable using `mpirun -n 24 ./hello_mpi > hello.out`
7. Submit the job using the `%%sbatch` magic, monitor the job using `%squeue -u <username>`, and inspect the output file.