# Simple Simulation of Alanine Dipeptide
Authors: Benjamin D. Madej & Ross Walker<br>
Original Link: https://ambermd.org/tutorials/basic/tutorial0/index.php<br>
Adapted by: Jeremy Leung<br>
Email:&nbsp;&nbsp; jml230@pitt.edu


## Introduction

This tutorial introduces molecular dynamics (MD) simulations with Amber. It targets AMBER 24 and is aimed at new users who want to learn how to run MD simulations. The notebook assumes that you are working within a virtual environment on the H2P Cluster at Pitt.

AMBER stands for Assisted Model Building with Energy Refinement. It refers not only to the molecular dynamics programs, but also to a set of force fields that describe the potential energy function and parameters of the interactions of biomolecules.

In order to run a Molecular Dynamics simulation in Amber, each molecule's interactions are described by a molecular force field. The force field has specific parameters defined for each molecule.

<code>sander</code> is the basic MD engine of Amber. <code>pmemd</code> is the high-performance implementation of the MD engine and contains a subset of the features of <code>sander</code>. <code>pmemd</code> can also be run with acceleration from graphics processing units (GPUs) through <code>pmemd.cuda</code>, or in parallel over MPI through <code>pmemd.MPI</code>.

In order to run an MD simulation with sander or pmemd, three key files are needed:

    parm7 - The file that describes the parameter and topology of the molecules in the system
    rst7 - The file that describes the initial molecular coordinates of the system
    mdin - The file that describes the settings for the Amber MD engine
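
As a rough illustration of how the three files fit together, the sketch below writes a small minimization `mdin` and assembles the corresponding `sander` command line. The file names (`01_Min.in`, `01_Min.out`, and so on) and the specific `&cntrl` values are assumptions chosen for illustration; they are not settings prescribed by this notebook.

```python
import tempfile
from pathlib import Path

# Hypothetical minimization settings (illustrative values only).
MIN_MDIN = """Minimize
 &cntrl
  imin=1,
  ntx=1,
  irest=0,
  maxcyc=2000,
  ncyc=1000,
  ntpr=100,
  ntwx=0,
  cut=8.0,
 /
"""


def write_mdin(path, text=MIN_MDIN):
    """Write the MD engine settings to an mdin file and return its path."""
    path = Path(path)
    path.write_text(text)
    return path


def sander_command(mdin, prefix, parm7="parm7", rst7="rst7", engine="sander"):
    """Build the argument list that ties mdin, parm7, and rst7 together."""
    return [
        engine, "-O",
        "-i", str(mdin),            # settings for the MD engine
        "-o", f"{prefix}.out",      # human-readable output log
        "-p", parm7,                # parameters and topology
        "-c", rst7,                 # initial coordinates
        "-r", f"{prefix}.ncrst",    # restart file written at the end
        "-inf", f"{prefix}.mdinfo", # progress summary
    ]


# Demonstrate in a scratch directory so nothing is left behind.
with tempfile.TemporaryDirectory() as scratch:
    mdin_path = write_mdin(Path(scratch) / "01_Min.in")
    print(" ".join(sander_command(mdin_path.name, "01_Min")))
```

The command is only printed here, not executed; swapping `engine` for `pmemd` or `pmemd.cuda` would keep the same arguments.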


## Learning Objectives

- Navigate the command line interface through the terminal and tleap to prepare topology and coordinate files
- Understand the basics of force fields and load the ff19SB force field to work with alanine dipeptide
- Set up an explicit water simulation in tleap
- Perform basic equilibration
- Perform production run simulations at a given temperature
- Visualize the results of production MD in <code>nglview</code>
- Use <code>matplotlib</code> to plot MD thermodynamic data such as temperature, density, and energy
- Use <code>mdtraj</code> to calculate the root-mean-square deviation (RMSD) of the trajectory relative to the initial structure



## System Requirements
- AmberTools24 is necessary to run this simulation.
    - Note that it is already installed as a module on H2P
- numpy, matplotlib, ipympl, mdtraj, and nglview are used in this Jupyter notebook.
    - nglview, matplotlib, ipympl, and numpy are only needed for visualization and are optional.
    - These are already installed as part of the virtual environment.

In [None]:
!module load amber/24

/bin/bash: module: command not found


In [None]:
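%%bash
# Write the tleap input script; the redirection must sit on the same line as the heredoc opener.
cat << EOF > build_diala.in
source leaprc.protein.ff19SB
diala = sequence { ACE ALA NME }
source leaprc.water.opc
solvateoct diala OPCBOX 10.0
saveamberparm diala parm7 rst7
quit
EOF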
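
One way to run the script non-interactively is to pass it to `tleap` with the `-f` flag. The sketch below wraps that call with Python's `subprocess` module; the helper names `tleap_command` and `run_tleap` are hypothetical, and the code assumes `tleap` is on your `PATH` once AmberTools is loaded.

```python
import shutil
import subprocess


def tleap_command(script="build_diala.in", executable="tleap"):
    """Return the argument list for running tleap on an input script."""
    return [executable, "-f", script]


def run_tleap(script="build_diala.in", executable="tleap"):
    """Run tleap on the script; return None if the executable is not available."""
    if shutil.which(executable) is None:
        print(f"{executable} not found on PATH; load AmberTools first.")
        return None
    # capture_output keeps the LEaP log available as result.stdout
    return subprocess.run(
        tleap_command(script, executable),
        capture_output=True,
        text=True,
        check=False,
    )


print(" ".join(tleap_command()))
```

Calling `run_tleap()` in a cell should then produce `parm7` and `rst7` in the working directory, provided the script ends with `quit` so that tleap exits.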



-I: Adding /Users/RainbowIslands/miniconda3/envs/drug-design2025/dat/leap/prep to search path.
-I: Adding /Users/RainbowIslands/miniconda3/envs/drug-design2025/dat/leap/lib to search path.
-I: Adding /Users/RainbowIslands/miniconda3/envs/drug-design2025/dat/leap/parm to search path.
-I: Adding /Users/RainbowIslands/miniconda3/envs/drug-design2025/dat/leap/cmd to search path.

Welcome to LEaP!
(no leaprc in search path)
> 

## Running the Simulation

The files to initialize and run the WESTPA simulation are included with this jupyter notebook for demonstration purposes. **It is <u>not</u> recommended that you run your WE simulations within a Jupyter Notebook.** The simulation will take a while to complete, so feel free to stop it at any stage. Sample completed files for analysis are provided in the `for_analysis/` directory and `w_crawl/` folder.

In [None]:
import os
import shutil
# Clean up from previous/ failed simulations.
for i in ['west.h5', 'seg_logs', 'traj_segs','istates','get_pcoord.log']:
    try:
        os.remove(i)
    except OSError:
        try:
            shutil.rmtree(i)
        except OSError:
            pass
        
for i in ['seg_logs','traj_segs','istates']:
    os.mkdir(i)

In [None]:
import westpa
import numpy
from westpa.cli.core import w_init
from argparse import Namespace

# Initializing the System:
# Set some parameters that WESTPA needs to set simulation state.
args = Namespace(rcfile='west.cfg',
                 verbosity='verbose', ## change to 'debug' if you want a more detailed view of what's happening
                                      ## or 'verbose' for some information
                                      ## or 'quiet' for no information at all. 
                 work_manager='threads')

# Update westpa.rc with these
westpa.rc.process_args(args)

# Initialize the simulation using the tstate and bstate files
w_init.initialize(tstates=None, bstates=None, 
                  tstate_file='tstate.file', bstate_file='bstates/bstates.txt', 
                  segs_per_state=5, shotgun=False)

In [None]:
import westpa
import numpy
from westpa.cli.core import w_run
import westpa.work_managers as work_managers
from argparse import Namespace

# Running the Simulation.
# Set some parameters that WESTPA needs to set simulation state.
args = Namespace(rcfile = 'west.cfg',
                 verbosity = "verbose",
                 work_manager = 'threads')

# Update westpa.rc with these
westpa.rc.process_args(args)
# Prepare work manager
work_managers.environment.process_wm_args(args)

# Launch the simulation
w_run.run_simulation()

## Monitoring the WE Simulation

The following sample code runs <code>w_pdist</code> and <code>plothist</code> to generate a probability distribution.

In [None]:
import tarfile

# Untar the files for analysis
for file in ['./for_analysis/01.tar.gz']:
    with tarfile.open(file) as tar_f:
        tar_f.extractall('./for_analysis')

In [None]:
# Generate the pdist.h5
print('running w_pdist...')
!{'w_pdist -W ./for_analysis/01/west.h5'}
print('Done!')

In [None]:
# Generate the evolution plot (evol.pdf)
print('running plothist...')
!{'plothist evolution -o evol.pdf pdist.h5'}
print('Done!')

In [None]:
# View the evolution of the probability distribution
from IPython.display import IFrame, display
filepath = "evol.pdf"
IFrame(filepath, width=700, height=400)

In [None]:
# Generate the average probability distribution plot (avg.pdf)
print('running plothist...')
!{'plothist average -o avg.pdf pdist.h5 0'}
print('Done!')

In [None]:
# View the average probability distribution
from IPython.display import IFrame, display
filepath = "avg.pdf"
IFrame(filepath, width=700, height=400)
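
Conceptually, the average distribution is built from per-iteration histograms of the progress coordinate in which each segment contributes its statistical weight rather than a count of one. The sketch below illustrates that idea in plain Python with made-up values and bin edges; it is not how <code>w_pdist</code> or <code>plothist</code> are implemented.

```python
def weighted_histogram(values, weights, edges):
    """Sum the weight of every value falling into each bin defined by edges."""
    counts = [0.0] * (len(edges) - 1)
    for value, weight in zip(values, weights):
        for b in range(len(counts)):
            is_last = b == len(counts) - 1
            # Bins are half-open, except the last one, which includes its upper edge.
            if edges[b] <= value < edges[b + 1] or (is_last and value == edges[-1]):
                counts[b] += weight
                break
    return counts


def average_histograms(histograms):
    """Average a list of equally binned histograms, bin by bin."""
    n = len(histograms)
    return [sum(column) / n for column in zip(*histograms)]


# Hypothetical data: two iterations, three segments each, weights summing to 1.
edges = [0.0, 4.0, 8.0, 16.0]
iteration_1 = weighted_histogram([2.0, 5.0, 12.0], [0.5, 0.25, 0.25], edges)
iteration_2 = weighted_histogram([1.0, 3.0, 9.0], [0.25, 0.25, 0.5], edges)

average = average_histograms([iteration_1, iteration_2])
print(average)
```

Because the weights in each iteration sum to one, the averaged histogram also sums to one and can be read as a probability distribution.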

## Analyzing the WE Simulation

### Visualization of the System

We will now take a look at what one of the basis states looks like. The water box is omitted for visibility. Na<sup>+</sup> is red and Cl<sup>-</sup> is blue in the representation.

In [None]:
import mdtraj
import nglview
system = mdtraj.load('bstates/01/basis.xml', top='common_files/bstate.pdb')
Na = list(range(0,1)) # Na+ is the first atom
Cl = list(range(1,2)) # Cl- is the second atom
both = Na + Cl # Na+ + Cl-
system = system.atom_slice(both)
view = nglview.show_mdtraj(system)
view.representations = [
    {"type": "ball+stick", "params": {
        "sele": ".Na+", "color": "red"
    }},
    {"type": "ball+stick", "params": {
        "sele": ".Cl-", "color": "blue"
    }}
]
view.background = 'white'
view

### Visualization of Coverage of the Start States

The following cells plot the multiple start states in a way that allows us to examine their spatial coverage. The first cell visualizes the structures one by one after alignment to Na<sup>+</sup>. The second cell plots all the centers of mass at once. In some frames, Cl<sup>-</sup> might not be visible unless you move the camera.

Note that many of these lines are not necessary for Na<sup>+</sup>/Cl<sup>-</sup>, since each ion is spherically symmetric and its center of mass is equal to its coordinate. The extra code is provided so that the approach can be generalized to larger systems.
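
To make the center-of-mass remark concrete, here is a minimal sketch of the mass-weighted average that a center-of-mass calculation performs. The coordinates are made up and the masses are standard atomic masses; neither is read from the tutorial files.

```python
def center_of_mass(coords, masses):
    """Return the mass-weighted average position of a set of atoms."""
    total_mass = sum(masses)
    return tuple(
        sum(m * xyz[axis] for m, xyz in zip(masses, coords)) / total_mass
        for axis in range(3)
    )


# Hypothetical positions (any consistent length unit).
na_xyz = (1.0, 2.0, 3.0)
cl_xyz = (4.0, 2.0, 3.0)
NA_MASS, CL_MASS = 22.99, 35.45  # atomic masses in amu

# A single atom: the center of mass coincides with the atom's coordinate.
print(center_of_mass([na_xyz], [NA_MASS]))

# Two atoms: the center of mass is pulled toward the heavier Cl-.
print(center_of_mass([na_xyz, cl_xyz], [NA_MASS, CL_MASS]))
```

For a single ion the function simply returns that ion's position, which is why the center-of-mass step only matters for multi-atom selections.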

In [None]:
import numpy
import mdtraj
import nglview
lst = numpy.loadtxt('bstates/bstates.txt', usecols=2, dtype=str) # Reading basis state names
tpg = 'common_files/bstate.pdb' # Topology File for basis state (Shared between all bstates)
lst = [x + "/basis.xml" for x in lst] # Change path to point to file name

# Reading reference and setup
# There might be some warnings about unconverged rotation matrices because of the system's rotational symmetry

com = [] # list containing all CoM
a = mdtraj.load('bstates/01/basis.xml', top=tpg) # Load the first
a_slice = a.atom_slice([0]) # Just Na+
com.append(numpy.squeeze(mdtraj.compute_center_of_mass(a_slice))) # Save CoM of Na+
a_slice = a.atom_slice([0,1]) # Both Na+/Cl-

# Loading and superposing, storing Center of Mass (CoM) to list for heatmap
c = a_slice
for i in lst:
    b = mdtraj.load('bstates/' + i, top=tpg)
    b = b.atom_slice([0,1])
    b.superpose(a_slice, atom_indices=[0])
    c = mdtraj.join([c,b], check_topology=False)
    # Just saving the CoM of Cl-, since Na+ is superimposed
    com.append(numpy.squeeze(mdtraj.compute_center_of_mass(b.atom_slice([1]))))
com = numpy.asarray(com)

# Now displaying it, note that Cl- is not visible in some frames unless you rotate the camera
view2 = nglview.show_mdtraj(c)
view2.representations = [
    {"type": "ball+stick", "params": {
        "sele": ".Na+", "color": "red"
    }},
    {"type": "ball+stick", "params": {
        "sele": ".Cl-", "color": "blue"
    }}
]
view2.center('.Na+')
view2.control.zoom(-1.75)
view2

In [None]:
# Looking at the coverage of the start states, assuming you ran the previous cell
# Comment out the next line or install ipympl if you have trouble viewing.
%matplotlib widget
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older matplotlib versions

fig = plt.figure(figsize=(7,7))
ax = fig.add_subplot(111, projection='3d')

# com[0] is the reference Na+; the remaining rows are the superposed Cl- positions
img1 = ax.scatter(com[0,0], com[0,1], com[0,2], s=20, marker='s', color='Red', label='Na+')
img2 = ax.scatter(com[1:,0], com[1:,1], com[1:,2], s=50, marker='.', color='Blue', label='Cl-')

# Labels and Titles
ax.set_title("Basis State Structures Coverage")
ax.set_xlabel('x-axis')
ax.set_ylabel('y-axis')
ax.set_zlabel('z-axis')
ax.legend()

plt.show()

Due to the rotational symmetry, it might be better to look at the Na<sup>+</sup>/Cl<sup>-</sup> atom-to-atom distances instead, which are already precalculated in each pcoord.init file.
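
The reason a distance is the more natural coordinate here is that it does not change when the ion pair is rotated. The sketch below shows this with made-up positions and classifies each distance against the 2.6 Å line drawn in the plot; treating "bound" as "at or below the cutoff" is an assumption made for illustration.

```python
import math

BOUND_CUTOFF = 2.6  # Angstrom; matches the dashed line in the plot


def na_cl_distance(na_xyz, cl_xyz):
    """Euclidean distance between the two ion positions."""
    return math.dist(na_xyz, cl_xyz)


def is_bound(distance, cutoff=BOUND_CUTOFF):
    """Assumed definition: the pair is bound at or below the cutoff distance."""
    return distance <= cutoff


# Hypothetical positions: the same separation pointing in two different directions.
na = (0.0, 0.0, 0.0)
cl_in_plane = (3.0, 4.0, 0.0)
cl_along_z = (0.0, 0.0, 5.0)

for cl in (cl_in_plane, cl_along_z):
    d = na_cl_distance(na, cl)
    print(cl, d, is_bound(d))
```

Both orientations give the same distance, so a one-dimensional plot of distances captures the coverage without the redundancy of the 3D scatter.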

In [None]:
import numpy
import matplotlib
import matplotlib.pyplot as plt
lst = numpy.loadtxt('bstates/bstates.txt', usecols=2, dtype=str) # Reading basis state names
lst2 = ["bstates/" + x + "/pcoord.init" for x in lst] # Change path to point to file name

values = numpy.asarray([numpy.loadtxt(x) for x in lst2])

fig2 = matplotlib.pyplot.figure(figsize=(7,7))
ax2 = fig2.subplots()

img3 = ax2.scatter(0, 0, color='Red', label='Na+')
img4 = ax2.scatter(values[:], numpy.zeros(values.shape[0]), color='Blue', label='Cl-')
img5 = ax2.axvline(x=2.6, ymin=0, ymax=1, label='Bound State', linestyle='--')

# Labels and Titles
ax2.set_title("Basis States Structures Coverage (Atom-to-Atom Distances)")
ax2.set_xlabel(r'Na+ to Cl- distance ($\mathrm{\AA}$)')  # raw string avoids invalid escape sequences
ax2.set_ylabel('y-axis')
ax2.legend()
plt.xlim(-1,23)

plt.show()

### Calculating Rates

In [None]:
import tarfile
from os import symlink

# Untar the files for analysis.
for file in ['./for_analysis/01.tar.gz']:
    with tarfile.open(file) as tar_f:
        tar_f.extractall('./for_analysis')
try:
    symlink('./for_analysis/01/west.h5', './west.h5')
except FileExistsError:
    print('A different west.h5 exists in the basic_nacl/. Either delete it or rename it to continue.')

In [None]:
from westpa.cli.tools import w_ipa
import westpa

w = w_ipa.WIPI()
w.main()
w.interface = 'matplotlib'

## Managing Your Simulations

### Combining Multiple Simulation Runs

The following sample code runs <code>w_multi_west</code> to concatenate two runs.

In [None]:
import tarfile

# Untar the files
for file in ['./for_analysis/01.tar.gz','./for_analysis/02.tar.gz']:
    with tarfile.open(file) as tar_f:
        tar_f.extractall('./for_analysis')

In [None]:
# Run w_multi_west in the commandline
!{'w_multi_west -m ./for_analysis/ -n 2'}
print('Done!')

In [None]:
# Check to see if the multi.h5 file exists
from os.path import exists
exists('multi.h5')

### Using <code>w_crawl</code> to calculate post-simulation auxiliary data

The following sample code runs <code>w_crawl</code> to calculate additional observables post-simulation.

In [None]:
import os
os.chdir('w_crawl')
# Run w_crawl in the commandline
!{'./run_w_crawl.sh'}
os.chdir('../')
print('Done!')

In [None]:
# Check to see if the crawl.h5 file exists
from os.path import exists
exists('./w_crawl/crawl.h5')

## Cleaning Up

In [None]:
# Run the following bash script to revert your tutorial folder to pristine condition.
!{'./1.clean.sh'}