# Starting a Sequence of Simulations using RickFlow

The RickFlow workflow provides an interface for running simulations in OpenMM using CHARMM input files.

This example (`rickflow/examples/start_simulation`) contains the following files:
    
- hxdwat.crd: The CHARMM coordinate file of a water-hexadecane slab system.
- hxdwat.psf: The CHARMM psf file (connectivity) of the system.
- top_all36_lipid.rtf and par_all36_lipid.prm: CHARMM topology and parameter files.


- **dyn.py**: The Python script that runs the simulation.
- **sdyn.sh**: The Slurm submit script.

The first group of files defines the simulation system and will differ from system to system.
The second group of files runs the simulations on a GPU node of a cluster.

To start the simulations, call

```
rflow submit sdyn.sh
```


Let's take a look at the files.



## *dyn.py*  - The Simulation Script

This is an executable Python script that defines the simulation. It starts with a shebang, which enables execution via `./dyn.py`:

In [None]:
#! /usr/bin/env python

Next come the imports (note that OpenMM and the rflow package have to be installed).

In [None]:
import os
import numpy as np
import simtk.unit as u
from simtk.openmm.app import CharmmPsfFile, CharmmCrdFile, CharmmParameterSet, PME
from simtk.openmm.app import HBonds, Simulation, DCDReporter, StateDataReporter
from simtk.openmm import LangevinIntegrator, MonteCarloAnisotropicBarostat
from rflow.integrators import NoseHooverChainVelocityVerletIntegrator
from rflow import RickFlow

We create a `RickFlow` instance:

In [None]:
workflow = RickFlow(
    toppar=["top_all36_lipid.rtf", "par_all36_lipid.prm"],
    psf="hxdwat.psf",
    crd="hxdwat.crd",
    box_dimensions=[50,50,53.1975], # box lengths in Angstrom
    gpu_id=0,
    nonbonded_method=PME,
    tmp_output_dir=os.path.join("/lscratch", os.environ['SLURM_JOB_ID'])
)

This command does the following:
1. It **creates an OpenMM system** (accessible as `workflow.system`) based on the files provided. 
2. It also makes sure that **CUDA** is enabled (if not, the constructor will throw an error). To skip the CUDA check, use `gpu_id=None`.
3. It sets up the simulation on GPU 0 (in OpenMM, it is usually most efficient to run each job on a single GPU). Because the sdyn.sh script below requests only one GPU, that GPU always has the ID `0`, so `gpu_id=0` is usually the right choice.
4. It **recenters** the input coordinates so that the center of mass of all non-water atoms is at the center of the box. This is useful because OpenMM places the origin (0,0,0) at a corner of the box, while CHARMM simulations usually have the box centered around the origin.
5. It **removes isotropic long-range corrections** from all nonbonded forces in the system. These corrections are applied by default, but most CHARMM parameters were optimized without long-range corrections.
6. It sets up a particular **directory structure**: subdirectories res, trj, out for restart, trajectory, and output files, respectively. 
7. It creates a file `next.seqno`, which stores the id of the next sequence to be simulated (each sequence spans 1 ns). After a sequence is finished, the sdyn.sh script submits the next one. To stop the simulation after a given sequence, you can create a file `last.seqno`, which contains the number of the final sequence.
8. If `tmp_output_dir` is specified, the output files will be written to a temporary directory (usually a local scratch directory on the compute node) and copied over to the working directory afterwards.
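
To make the recentering in step 4 concrete, here is a minimal sketch in plain Python. It is not RickFlow's actual implementation: the function name `recenter`, its arguments, and the toy coordinates are assumptions made for illustration, and periodic wrapping of atoms that end up outside the box is deliberately left out.

```python
def recenter(positions, masses, selected, box_lengths):
    """Shift all positions so that the center of mass of the selected atoms
    lands at the box center (half the box length along each axis).

    Hypothetical helper for illustration; not part of the rflow package.
    """
    total_mass = sum(masses[i] for i in selected)
    # Mass-weighted average position of the selected atoms
    com = [
        sum(masses[i] * positions[i][k] for i in selected) / total_mass
        for k in range(3)
    ]
    # The same translation is applied to every atom, selected or not
    shift = [box_lengths[k] / 2.0 - com[k] for k in range(3)]
    return [[p[k] + shift[k] for k in range(3)] for p in positions]


# Toy system in the box used above (lengths in Angstrom):
# atoms 0 and 1 stand in for the non-water atoms, atom 2 for a water atom.
box = [50.0, 50.0, 53.1975]
positions = [[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 5.0, 5.0]]
masses = [12.0, 12.0, 16.0]
non_water = [0, 1]

centered = recenter(positions, masses, non_water, box)
for atom in centered:
    print(atom)
```

In this toy case the two non-water atoms start out centered on the origin (the CHARMM convention), so every atom is simply translated by half a box length along each axis.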

### Define Integrator and Barostat

In [None]:
#
#  ========= Define Integrator and Barostat ==========
#
temperature = 310.0 * u.kelvin
# integrator (The Nose-Hoover integrator in OpenMM currently does not use the right
# number of degrees of freedom. The implementation in nosehoover.py provides a short-term fix,
# which requires the system to be passed to the constructor.)
integrator = NoseHooverChainVelocityVerletIntegrator(
        workflow.system, temperature, 50.0 / u.picosecond, 1.0 * u.femtosecond,
        chain_length=3, num_mts=3, num_yoshidasuzuki=3
)
# integrator = LangevinIntegrator(temperature, 1.0 / u.picosecond, 1.0 * u.femtosecond)
barostat = MonteCarloAnisotropicBarostat(
    u.Quantity(value=np.array([0.0, 0.0, 1.0]), unit=u.atmosphere), temperature,
    False, False, True
)

workflow.prepareSimulation(integrator, barostat)

The `prepareSimulation` call creates a simulation object (`workflow.simulation`), reads the restart files, and writes out the system as a PDB file (for postprocessing in VMD).

Note that the `NoseHooverChainVelocityVerletIntegrator` is considerably slower than the `LangevinIntegrator`.

### Run the simulation

In [None]:
if __name__ == "__main__":
    workflow.run()

## *sdyn.sh* - The Batch Script

In [None]:
#!/bin/bash
#SBATCH --time=1:00:00 --ntasks=1 --nodes=1 -p gpu
#SBATCH --ntasks-per-node=1 --cpus-per-task=2 --gres=lscratch:250,gpu:p100:1

cd $SLURM_SUBMIT_DIR
sleep 10

# run simulation and resubmit script
./dyn.py && rflow submit sdyn.sh && sleep 60


Note that we do not want to use the node exclusively,
especially on nodes with multiple GPUs (on lobos: k40 nodes have 2 GPUs, pascal nodes have 4 GPUs).
OpenMM does all the work on the GPU and uses only one GPU per simulation.
By requesting only one GPU per job, we leave the remaining GPUs available to other jobs.
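
The batch script resubmits itself after every sequence, and the chain is stopped via the `last.seqno` file described earlier. The sketch below shows one way such a check could look. It is an illustration only: the helper names `read_int` and `should_run_next` are hypothetical, and the source does not specify which component of RickFlow performs this check or how.

```python
import os
import tempfile


def read_int(path):
    """Read a single integer from a text file."""
    with open(path) as f:
        return int(f.read().strip())


def should_run_next(workdir="."):
    """Return True if the sequence stored in next.seqno should still be run.

    Hypothetical helper: next.seqno holds the id of the next sequence;
    the optional last.seqno holds the id of the final sequence to run.
    """
    next_seq = read_int(os.path.join(workdir, "next.seqno"))
    last_path = os.path.join(workdir, "last.seqno")
    if not os.path.exists(last_path):
        return True  # no stop file: keep going
    return next_seq <= read_int(last_path)


# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as workdir:
    with open(os.path.join(workdir, "next.seqno"), "w") as f:
        f.write("3\n")
    without_stop_file = should_run_next(workdir)

    with open(os.path.join(workdir, "last.seqno"), "w") as f:
        f.write("3\n")
    at_final_sequence = should_run_next(workdir)

    with open(os.path.join(workdir, "next.seqno"), "w") as f:
        f.write("4\n")
    past_final_sequence = should_run_next(workdir)

print(without_stop_file, at_final_sequence, past_final_sequence)
```

Under these assumptions, writing `3` into `last.seqno` lets sequence 3 run and stops the chain before sequence 4.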

