Author: Lester Hedges<br>
Email:&nbsp;&nbsp; lester.hedges@bristol.ac.uk

Adapted by:
Antonia Mey<br>
Email:&nbsp;&nbsp; antonia.mey@ed.ac.uk

___Jupyter Recap___:
* Press Shift+Enter to execute a cell and move to the cell below.
* Press Ctrl+Enter to execute a cell and remain in that cell.
* Run a shell command on the underlying operating system by prefixing the command with an exclamation mark, !
* Remember that the flow is in the order that you execute cells, which is not necessarily linear in the notebook. Keep track of the numbers in brackets to the left of the cell!


# Molecular dynamics

## Introduction

In this section, we will learn how to use BioSimSpace to configure and run some basic molecular dynamics simulations.


## Protocols

One of the key goals of BioSimSpace was to start a conversation regarding _best practice_ within the biomolecular simulation community and to facilitate the codifying of shareable, reusable, and extensible simulation protocols.

The [BioSimSpace.Protocol](https://biosimspace.org/api/index_Protocol.html) package defines protocols for a range of common molecular dynamics simulations. We can query the package to see what protocols are available:

In [None]:
from get_tutorial import download
download()

In [None]:
import BioSimSpace as BSS
BSS.Protocol.protocols()

Since we require protocols to be _interoperable_, the classes listed above are simple objects that allow you to configure a _limited_ set of options that are handled by _all_ of the molecular dynamics engines that we support. This might seem quite restrictive, but we will see later how it is possible to fully customise a simulation for a particular molecular dynamics engine.
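
To make the idea of an engine-agnostic protocol concrete, here is a minimal sketch that does not use BioSimSpace. The `Minimisation` dataclass and the `to_amber` and `to_gromacs` helpers are hypothetical names invented for this illustration. The option names they emit (`imin`, `maxcyc`, `integrator`, `nsteps`) are real AMBER and GROMACS settings, but the configuration that BioSimSpace actually writes will differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Minimisation:
    """Hypothetical engine-agnostic protocol with one widely supported option."""

    steps: int = 10000


def to_amber(protocol: Minimisation) -> List[str]:
    # AMBER control options: imin=1 selects minimisation, maxcyc caps the cycles.
    return ["imin=1", f"maxcyc={protocol.steps}"]


def to_gromacs(protocol: Minimisation) -> List[str]:
    # GROMACS .mdp options: steepest-descent integrator with a step limit.
    return ["integrator = steep", f"nsteps = {protocol.steps}"]


protocol = Minimisation(steps=1000)
amber_config = to_amber(protocol)
gromacs_config = to_gromacs(protocol)
print(amber_config)
print(gromacs_config)
```

Keeping the protocol object small is what makes this translation possible: each option has an equivalent setting in every supported engine.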

Each protocol comes with some default options. To see what those are we can instantiate an object using the default constructor. For example, let's explore the [Equilibration](https://biosimspace.org/api/generated/BioSimSpace.Protocol.Equilibration.html#BioSimSpace.Protocol.Equilibration) protocol.

In [None]:
protocol = BSS.Protocol.Equilibration()
print(protocol)

## Processes

Once you have created a molecular system and chosen a protocol, then it is time to create a simulation _process_. The [BioSimSpace.Process](https://biosimspace.org/api/index_Process.html) package provides functionality for configuring and running processes with several common molecular dynamics engines.

Let's query the package to see what engines are available:

In [None]:
BSS.Process.engines()

Before creating a process we need a molecular system to simulate, so let's load one from the tutorial input files:

In [None]:
system = BSS.IO.readMolecules("inputs/ala*")

As a simple example, let us use a short minimisation protocol:

In [None]:
protocol = BSS.Protocol.Minimisation(steps=1000)

We'll now create a process to apply the `Protocol` to the `System` using the AMBER molecular dynamics engine:

In [None]:
process = BSS.Process.Amber(system, protocol)

A lot of complexity is hidden in this line. BioSimSpace has automatically found an AMBER executable on the underlying operating system, has automatically written AMBER format molecular input files, generated an AMBER configuration file for the minimisation protocol, and configured any command-line arguments that are required.

By default, processes are run inside of a temporary working directory hidden away from the user. To see where this is, run:

In [None]:
process.workDir()

N.B. If you want to use a different temporary directory, e.g. one with a faster disk, then simply set the `TMPDIR` environment variable. Alternatively, you can pass the `work_dir` argument to the `Process` constructor to explicitly specify the path. This can be useful when you want named directories, or want to examine the intermediate files from the `Process` for debugging purposes.
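
As a rough illustration of how `TMPDIR` redirects temporary directories, the sketch below uses only Python's standard `tempfile` module. It assumes, without confirmation from the BioSimSpace source, that temporary working directories are created in a similar way. The `fast_disk` directory is a stand-in created on the fly so that the example runs anywhere.

```python
import os
import tempfile

# Stand-in for a directory on a faster disk (hypothetical path, created here).
fast_disk = tempfile.mkdtemp(prefix="fast_disk_")

# Point TMPDIR at it and clear tempfile's cached default so it is read again.
os.environ["TMPDIR"] = fast_disk
tempfile.tempdir = None

# New temporary directories are now created inside the chosen location.
work_dir = tempfile.mkdtemp(prefix="process_")
inside_fast_disk = os.path.dirname(os.path.realpath(work_dir)) == os.path.realpath(fast_disk)
print(work_dir)
print(inside_fast_disk)
```

In practice you would normally export `TMPDIR` in the shell before launching the notebook server, so that it is already set when the first process is created.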

To see what executable was found, run:

In [None]:
process.exe()

To see the list of autogenerated input files:

In [None]:
process.inputFiles()

If you like, we could zip up the input files to use on another occasion. When working on a notebook server it's possible to return a file link so that we can download them:

In [None]:
process.getInput(file_link=True)

We can also query the list of configuration file options:

In [None]:
process.getConfig()

And also get the command-line argument string for the process:

In [None]:
process.getArgString()

N.B. You might want to add additional configuration details to your `Process` wrappers, e.g. to ensure that a specific executable is used.

Now that we have a process, let's go ahead and start it:

In [None]:
process.start()

BioSimSpace has now launched a minimisation process in the background! In an interactive session you can carry on working and periodically check in on the process to see how it's doing.

To check whether the process is running:

In [None]:
process.isRunning()

We can see how many minutes it has been running for:

In [None]:
process.runTime()

Since this is a short minimisation it will likely finish pretty quickly. Let's print the final energy of the system and return the minimised molecular configuration.

In [None]:
print(process.getTotalEnergy(block=True))
minimised = process.getSystem()

When working interactively, any time we query a running process we get back the _latest_ information that has been written to disk. This means that we can get an update on how things are progressing, then immediately carry on with what we were doing in our notebook. By passing `block=True`, as we do when we call `getTotalEnergy` above, we request that the process finishes running before returning a result. This means we get the _final_ energy, and the minimised system that is returned afterwards represents the _final_ snapshot that was saved.
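
The difference between a non-blocking and a blocking query can be sketched with Python's standard `subprocess` module. This is an analogy rather than BioSimSpace's implementation: a short-lived child process stands in for the MD engine, `poll()` plays the role of a query that returns immediately, and `wait()` plays the role of a call with `block=True`.

```python
import subprocess
import sys

# A short-lived child process stands in for the MD engine.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.5)"])

# Non-blocking query: poll() returns None while the child is still running.
status_now = child.poll()
print("still running" if status_now is None else f"finished with {status_now}")

# Blocking query: wait() returns only once the child has exited.
exit_code = child.wait()
print(exit_code)
```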

Let's now re-run the simulation, instead using GROMACS as the MD engine.

In [None]:
process = BSS.Process.Gromacs(system, protocol)

When the process is instantiated, BioSimSpace takes the system that was read from AMBER format files and converts it to GROMACS format ready for simulation. Let's take a look at the list of input files that were autogenerated for us:

In [None]:
process.inputFiles()

Let's start the process running and, once again, wait for it to finish before getting the minimised system.

In [None]:
process.start()
minimised = process.getSystem(block=True)

## Interactive molecular dynamics

The example in the previous section was finished almost as soon as it began. Let's run a more complicated equilibration protocol so that we can learn more about how to monitor processes interactively using BioSimSpace.

In [None]:
protocol = BSS.Protocol.Equilibration(
    runtime=1 * BSS.Units.Time.picosecond,
    temperature_start=0 * BSS.Units.Temperature.kelvin,
    temperature_end=300 * BSS.Units.Temperature.kelvin,
    restraint="backbone",
)

This protocol will equilibrate the system for 1 picosecond, heating it from 0 to 300 kelvin while restraining the backbone atoms of the molecule. Note that some of the parameters have units, e.g. the temperatures are in kelvin. BioSimSpace has a built-in type system for handling variables with units. The `BSS.Units` package provides a convenient way of declaring these; for example, `10*BSS.Units.Temperature.kelvin` creates an object of type `BSS.Types.Temperature` with a magnitude of 10 and a unit of kelvin. This allows the user to pass parameters in whatever unit they like; BioSimSpace will convert them internally to the correct unit for the chosen MD engine.
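
To show how a unit-aware type of this kind might work, here is a toy sketch in plain Python. The `Temperature` class below is hypothetical and far simpler than `BSS.Types.Temperature`: it stores the value in kelvin and its conversion methods return plain floats, which is an assumption made for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Temperature:
    """Toy unit-aware temperature (hypothetical), stored internally in kelvin."""

    value_in_kelvin: float

    def __rmul__(self, magnitude: float) -> "Temperature":
        # Enables `300 * kelvin`, mirroring the notation used in the protocol.
        return Temperature(magnitude * self.value_in_kelvin)

    def kelvin(self) -> float:
        return self.value_in_kelvin

    def celsius(self) -> float:
        # Convert on demand; the stored value never changes.
        return self.value_in_kelvin - 273.15


# Unit object, playing the role of BSS.Units.Temperature.kelvin.
kelvin = Temperature(1.0)

target = 300 * kelvin
print(target.kelvin(), target.celsius())
```

Because the magnitude and the unit travel together, code that consumes a `Temperature` can convert it to whatever unit it needs without guessing what the caller meant.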

Once again, we now need a `Process` in order to run our simulation. Execute the cell below to initialise an AMBER process and start it immediately. Note that we pass in the minimised system from the last example, along with our new protocol.

In [None]:
process = BSS.Process.Amber(minimised, protocol).start()

We can monitor the time, temperature, and energy as the process runs. If you run this cell multiple times using Ctrl+Enter, you'll see the temperature slowly increasing.

In [None]:
print(process.getTime(), process.getTemperature(), process.getTotalEnergy())

Since all of the values returned above are typed we can easily convert them to other units:

In [None]:
print(
    process.getTime().nanoseconds(),
    process.getTemperature().celsius(),
    process.getTotalEnergy().kj_per_mol(),
)

It's possible to query many other thermodynamic records. What's available depends on the type of protocol and the MD package that is used to run it. To get more information, use Python's built-in help on the process object, e.g. `help(process)`.

N.B. Certain functionality is specific to the process in question, i.e. `BSS.Process.Amber` will have different options to `BSS.Process.Gromacs`, but, for the purposes of interoperability, there is a core set of functionality that is consistent across all `Process` classes, e.g. all classes implement a `getSystem` method.

### Plotting time series data

As well as querying the most recent records we can also get a time series of results by passing the `time_series` keyword argument to any of the data record getter methods, e.g.

```python
# Get a time series of pressure records.
pressure = process.getPressure(time_series=True)
```

The `BSS.Notebook` package provides several useful tools that are available when working inside a Jupyter notebook. One of these is the `plot` function, which allows us to create simple x/y plots of time-series data.

Let's grab the same record data as above and use it to make some graphs of the data.

In [None]:
# Generate a plot of time vs temperature.
plot1 = BSS.Notebook.plot(
    process.getTime(time_series=True), process.getTemperature(time_series=True)
)

# Generate a plot of time vs energy.
plot2 = BSS.Notebook.plot(
    process.getTime(time_series=True), process.getTotalEnergy(time_series=True)
)

(Note that, by default, the axis labels are automatically generated from the types and units of the x and y data that are passed to the function.)

Re-run the cell using Ctrl+Enter to see the graphs update as the simulation progresses. (Occasionally, you might see a warning that the x and y data sets are mismatched in length. This happens when the data is extracted before all records have been written to disk.)
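
If you post-process the records yourself, one simple way to cope with a length mismatch is to truncate both series to their common length. The sketch below is plain Python; `align_series` is a hypothetical helper and the numbers are made up purely for illustration.

```python
def align_series(x, y):
    """Trim two sequences to their common length so they can be paired safely."""
    n = min(len(x), len(y))
    return list(x[:n]), list(y[:n])


# Made-up records: the temperature series is one entry short.
time_ps = [0.0, 0.2, 0.4, 0.6]
temperature_k = [0.0, 61.2, 118.9]

x, y = align_series(time_ps, temperature_k)
print(x)
print(y)
```

Dropping the unmatched trailing record loses nothing permanently, since it will appear in both series the next time the data is queried.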

Being able to query a process in real time is an incredibly useful tool. This could enable us to check for convergence, or spot errors in the simulation. If you ever need to kill a running process (perhaps it was configured incorrectly), run:

```python
process.kill()
```
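
As one crude example of the kind of convergence check mentioned above, the sketch below compares the means of the last two windows of a time series. The `has_plateaued` helper, its default window and tolerance, and the sample values are all assumptions chosen for illustration; a real analysis would need a more careful criterion.

```python
from statistics import mean


def has_plateaued(values, window=5, tolerance=1.0):
    """Return True when the means of the last two windows differ by less than tolerance."""
    if len(values) < 2 * window:
        return False  # not enough data to compare two full windows
    recent = mean(values[-window:])
    previous = mean(values[-2 * window:-window])
    return abs(recent - previous) < tolerance


# Made-up temperature records: a steady ramp, then the same ramp followed by a flat stretch.
heating = [float(t) for t in range(0, 300, 30)]
settled = heating + [300.0] * 10

print(has_plateaued(heating))
print(has_plateaued(settled))
```

The input here is a list of plain floats, so records that carry units would first need converting to their magnitudes.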

### Visualising the molecular system

Another useful tool that is available when working inside of a notebook is the `View` class that can be used to visualise the molecular system while a process is running. To create a `View` object we must attach it to a process (or a molecular system), e.g.:

In [None]:
view = BSS.Notebook.View(process)

We can now visualise the system:

In [None]:
view.system()

(If you see an empty view, try re-executing the cell.)

To only view a specific molecule:

In [None]:
view.molecule(0)

To view a list of molecules:

In [None]:
view.molecules([0, 5, 10])

If a particular view was of interest it can be reloaded as follows:

In [None]:
# Reload the original view.
view.reload(0)

To save a specific view as a PDB file:

In [None]:
view.savePDB("my_view.pdb", index=0)