# CCPBioSim Training Course: Longbow

**Author**: James Gebbie-Rayet - [@jimboid](https://github.com/jimboid)

In this part of the course you are going to learn about the three basic simulation modes that Longbow offers. These three simulation modes (single simulation, replicates, multiples) can be combined to build up very complex simulation schemes in real-world research.

Here you will be introduced to where each type of simulation is appropriate to use in your own work. Note that the examples provided here use very small systems so that the training fits into the time allocated. In real-world work you will need to configure Longbow to request more resources; this is covered in the Longbow documentation.

## Running Your First Simulation

Submitting your first simulation is very simple. We are going to start by submitting a single simulation. Single simulations are useful in your own research when you are running benchmarks or test simulations in preparation for a more detailed study.

We have prepared a GROMACS simulation for you, so the first thing you need to do is to change into the directory where the input file is stored. So in your terminal tab type:

```
cd ~/longbow-workshop/data/single-simulation
```

Now if you list the directory contents, you will see that there is just one file. This is the GROMACS binary input file, also known as the tpr file.

```
ls
```

Normally starting a GROMACS simulation would look something like this (Don't try to run this):

```
gmx mdrun -deffnm example
```

And to run it on a supercomputer you would normally have to prepare a job submission script for each simulation that would look something like this (Don't try to run this):

```
#!/bin/bash --login
#PBS -N example
#PBS -A my-account
#PBS -l select=1:ncpus=24:mpiprocs=24
#PBS -l walltime=00:20:00

export PBS_O_WORKDIR=$(readlink -f $PBS_O_WORKDIR)
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=1

module load gromacs

aprun -n 24 -N 24 mdrun_mpi -deffnm example
```

As you can probably tell, if you had to do this many times you would spend a lot of time scripting, which is both time-consuming and error-prone.

Longbow makes submitting these simulations to a supercomputer easy. Once you have set Longbow up (as you did in the previous steps), you simply write your command line like this:

```
longbow --verbose mdrun_mpi -deffnm example
```

Note: we are using the "mdrun_mpi" executable instead of "gmx mdrun" because some HPC systems compile GROMACS this way.

You should see that Longbow is now performing various tests, sending your files, generating the submit script and finally submitting your simulation. Once the simulation is complete, Longbow will tell you and exit. Once this has happened, list the directory again:

```
ls
```

You will see many more files in here. These are the results of the simulation! You will also see a file called "submit.pbs". This is the job submission file that Longbow generated to submit your simulation.

Examples of running molecular dynamics simulations with Longbow using codes other than GROMACS can be found in the "quick start" section of our documentation.

## Running Many Similar Simulations

Probably one of the most common patterns in real-world research is running a large number of simulations that all use the same MD program. The only difference between the simulations is the input file, which might hold a slightly different starting configuration, a different protein or ligand, etc. From a computing perspective, these simulations are identical to run from the command line; they just use different files.

Longbow can automate this for you using the "replicate" simulation type, so that you only have to launch your simulations with one command and all of the files for all of the jobs are transferred, the simulations are launched and the results are delivered back.

If you now use your terminal tab to change into the following directory:

```
cd ~/longbow-workshop/data/multi-similar
```

and then list the contents:

```
ls
```

You will see a bunch of directories labelled in the form repX. Now have a look inside each of these directories:

```
ls rep1
ls rep2
ls rep3
```

You will see that there is a GROMACS input file inside each directory. We are going to pretend that these are each their own distinct simulations with different starting conditions, but we have organised the files so that they have the same name and sit in different directories.

Launching these manually would require 3 separate launch commands on the command-line, and to launch on a supercomputer you would have to make 3 submit scripts and launch them.
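For comparison, the manual approach could be sketched as a shell loop like the one below. This is only an illustration: the function name `manual_launch_commands` is made up for this example, and the `echo` keeps it a dry run that prints the three commands rather than running them.

```shell
# Hypothetical dry run: print the command you would have to launch by hand
# inside each replicate directory (rep1, rep2, rep3).
manual_launch_commands() {
    for i in 1 2 3; do
        # echo prints the command instead of executing it
        echo "(cd rep$i && mdrun_mpi -deffnm example)"
    done
}

manual_launch_commands
```

On a supercomputer each of these would additionally need its own submit script, which is the repetition that the replicate simulation type removes.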

However, with Longbow, you can simply run the replicate simulation type and it will detect your simulations and fully automate the submission for you. So if you type:

```
longbow --verbose --replicates 3 mdrun_mpi -deffnm example
```

You will see Longbow running through its usual tests, then transferring the files and finally submitting your jobs. Once Longbow has finished you will notice the usual goodbye message. You can then have a look around the directories.

```
ls
```

You should see that the job submit script is created at the same level as the repX directories. Listing the contents of the repX directories, you will see that the results from the simulations now appear.

```
ls rep1
ls rep2
ls rep3
```

That's it, you have now just run three simulations in one go. There is nothing now to stop you from doing this for 10, 50, 1000 or more!

You do not have to use the repX naming scheme; this is simply the Longbow default. You could add the following to your hosts.conf:

```
replicate-naming = simulation
```

and then your directories can be called simulation1, simulation2, and so on.
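Laying out a larger set of replicate directories by hand gets tedious, so a small shell helper along the following lines may be useful. This is a minimal sketch and not part of Longbow: the function name `make_replicates` is hypothetical, and it assumes that every replicate starts from a copy of the same input file, which will not be the case if your replicates have different starting configurations.

```shell
# Hypothetical helper (not part of Longbow): create numbered replicate
# directories and copy one input file into each of them.
#   $1 = input file, $2 = number of replicates, $3 = directory prefix (default "rep")
make_replicates() {
    input="$1"
    count="$2"
    prefix="${3:-rep}"
    i=1
    while [ "$i" -le "$count" ]; do
        mkdir -p "${prefix}${i}"        # e.g. rep1, rep2, ...
        cp "$input" "${prefix}${i}/"    # same file name in every directory
        i=$((i + 1))
    done
}

# Example usage (run where example.tpr lives):
#   make_replicates example.tpr 10
#   make_replicates example.tpr 10 simulation
```

The optional third argument is there so that the directory prefix can be kept in step with whatever naming scheme is set in hosts.conf.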

## Running Many Different Simulations

Running many simulations that have different requirements is just part of being a simulation scientist in the biosimulation field! 

Longbow has a simulation type called "multi-jobs". This job type uses configuration files to independently control the parameters for each simulation. So for example, you could have 10 simulations running on two different HPC machines, 6 on one and 4 on the other, and launch them all with one command! You could have simulations using different codes, different core counts, different wall clock times, etc. You can even have replicate jobs like the one above inside this job type.

The first thing to do is to change into the directory for this example:

```
cd ~/longbow-workshop/data/multi-different
```

Now if you list the directory contents:

```
ls
```

You will see that there are directories named "amber", "gromacs", "lammps" and "namd". We are going to work on a simulation that uses different codes for each of its simulations.

For this example, rather than giving you all of the files needed to get going, we are going to make the missing one ourselves: the job.conf. In Longbow terminology the job.conf is a job configuration file, and its format is the same as the hosts.conf you made earlier. Instead of storing details about machines, it stores details about jobs. You only really need one if you are doing multi-jobs like this, or if you want to override hosts.conf parameters on a per-job basis (this is covered in the docs).

The first step then is to create yourself a job.conf:

```
nano job.conf
```

Now we have four simulations to run, each with its own configuration, so the job.conf will contain four blocks when you have finished. We will work through the first one together.

Paste the following into your job.conf file:

```
[amber]
resource = Archer
cores = 24
executable = pmemd.MPI
executableargs = -i example.in -c example.min -p example.top -o example.out
```

So to break this down for you:

[amber] - this is the name of the simulation, and this has to match the directory name containing the input files!

resource - this is the name of the machine you want to submit the simulation to.

cores - this is the number of cores you wish the job to use (these are small simulations so 24 is plenty!).

executable - this is the executable name of the program you wish to use (pmemd.MPI for amber, mdrun_mpi for gromacs).

executableargs - these are the arguments that you would normally give to the executable on the command line.

So we have added the block for the amber simulation. Now add in the remaining simulations, separating each block with a blank line in the job.conf so it is nice and clear. Below are examples of how to run the various codes, in case you haven't seen them used before:

AMBER: pmemd.MPI -i example.in -c example.min -p example.top -o example.out

GROMACS: mdrun_mpi -deffnm example

LAMMPS: lmp_xc30 -i example.in -sf opt

NAMD: namd2 +ppn 23 +pemap 1-23 +commap 0 example.in > example.out

Now use the above to add the remaining three simulations to the job.conf. You should only need to change three of the fields for each block to achieve this. Ask if you need help!
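Because each block name has to match the directory holding its input files, a quick check before submitting can catch typos. The helper below is a hypothetical sketch and not a Longbow feature: the name `check_jobconf` is made up here, and it assumes block names contain no spaces.

```shell
# Hypothetical helper (not part of Longbow): check that every [block] name
# in a job configuration file has a directory of the same name.
check_jobconf() {
    conf="$1"
    status=0
    # Pull the text between [ and ] from each block header line.
    for name in $(sed -n 's/^\[\([^]]*\)\].*/\1/p' "$conf"); do
        if [ -d "$name" ]; then
            echo "ok: $name"
        else
            echo "missing directory: $name"
            status=1
        fi
    done
    return "$status"
}

# Example usage, from the directory that holds job.conf:
#   check_jobconf job.conf
```

The function returns a non-zero status if any block has no matching directory, so it can be chained in front of the submit command if you wish.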

Once you are confident that you have this done, you can submit your simulation by doing:

```
longbow --verbose --job job.conf
```

If everything goes well you should see Longbow going about its testing, transferring files and submitting your jobs. Once you see the goodbye message, all jobs will have run. You can now list the contents of each directory:

```
ls amber
ls gromacs
ls lammps
ls namd
```

And you should see the results files for each one. That's it, you've just run 4 of the most popular MD engines on the UK national supercomputer in one go.

You can build up quite complex simulation configurations with this simulation type, and there are many parameters that can be tuned on a per-job basis. The Longbow documentation contains a detailed treatment of how to use each one and what they do.