# Molecular Dynamics Simulation of Alanine Dipeptide

In this notebook, we will be using the latest AMBER and CHARMM force fields. The OpenMM documentation contains [a complete overview of included force fields](http://docs.openmm.org/latest/userguide/application.html#force-fields). The corresponding data files can be viewed on the [GitHub repository](https://github.com/openmm/openmm/tree/master/wrappers/python/simtk/openmm/app/data). These files are already included in every OpenMM installation, so you don't need to download them.

Just to avoid getting lost in the force field zoo, here are the key references to various AMBER force fields and some comments on the history of their development:

- 1997 `amber96.xml`: a slight improvement of amber94 with ab initio calculations, not very significant. [REF](https://doi.org/10.1007/978-94-017-1120-3_2)

- 2006 `amber99sb.xml`: a refinement of amber ff94 side-chain and mostly backbone (SB) torsional parameters, to obtain better balance between stabilities of different secondary structure elements, in the literature referred to as AMBER ff99SB. [REF](https://dx.doi.org/10.1002%2Fprot.21123)

- 2003 `amber03.xml`: improvement of AMBER ff99 by refitting of torsional parameters to ab initio data, referred to as AMBER ff03. [REF](https://doi.org/10.1002/jcc.10349)

- 2010 `amber99sbildn.xml`: refitting of AMBER ff99sb to improve side-chain conformations to match NMR data, in the literature referred to as AMBER ff99SB-ILDN. [REF](https://dx.doi.org/10.1002%2Fprot.22711)

- 2010 `amber99sbnmr.xml`: another (not as popular) refitting of AMBER ff99SB parameters to NMR data. [REF](https://doi.org/10.1002/anie.201001898)

- ???? `amber10.xml`: origin not clear (yet), it is certainly an intermediate step in the development of AMBER ff14SB.

- 2015 `amber14-all.xml` or `amber14/protein.xml`: improved side-chain and backbone parameters starting from AMBER ff99SB, in the literature referred to as AMBER ff14SB. [REF](https://doi.org/10.1021/acs.jctc.5b00255)

The most recent (and recommended) AMBER force field is **AMBER ff14SB**. If for some reason, you need to use an older model, **AMBER ff99SB-ILDN** or **AMBER ff99SB** could be useful. The remaining ones are rarely used anymore.

Another popular family of biomolecular force fields is the CHARMM family. The [CHARMM36](https://doi.org/10.1021/ct300400x) force field was published in 2012 and its development followed steps similar to those of AMBER. In 1992, the CHARMM22 force field was released. In 2004, so-called [CMAP corrections](https://doi.org/10.1021/ja036959e) were released, based on ab initio reference data, to improve the accuracy of the backbone conformations. In CHARMM36 this was taken one (big) step further, with more accurate ab initio data and follow-up refinements against NMR data.
The files `charmm36.xml` and `charmm36/*.xml` were only recently included in OpenMM and will not be used in this tutorial. Instead, files generated with [CHARMM GUI](http://charmm-gui.org/) will be used.


## 1. Gas phase (AMBER)

Unlike the first notebook, all imports and initialization are put in the first cell.

In [None]:
%matplotlib widget
from sys import stdout
from simtk.openmm.app import *
from simtk.openmm import *
from simtk.unit import *
import numpy as np
import nglview
import mdtraj
import pandas
import matplotlib.pyplot as plt

We will run test simulations on a tiny "protein": alanine dipeptide. Data files for this notebook were taken from the [OpenMM test systems](https://github.com/openmm/openmm/tree/master/wrappers/python/tests/systems). The files `alanine-dipeptide-implicit.*` were renamed as follows:

* `alanine-dipeptide.pdb` (All-atom PDB file.)
* `alanine-dipeptide.inpcrd` (AMBER input coordinates)
* `alanine-dipeptide.prmtop` (AMBER parameters and topology)

The simulation code below closely follows the water example from the previous notebook with a few minor differences:

- The topology and the initial positions are now taken from a PDB file.

- A Langevin integrator is used, with different settings and more MD steps.

- The AMBER 2014 force field is used.

- X-H bonds are constrained in length, where X can be any atom. This allows us to take time steps of 2 femtoseconds.

This is not a realistic simulation because the dipeptide is simulated in gas phase (no solvation).

In [None]:
pdb = PDBFile('alanine-dipeptide.pdb')
print(pdb.topology)
forcefield = ForceField('amber14-all.xml')
system = forcefield.createSystem(pdb.topology, nonbondedCutoff=3*nanometer, constraints=HBonds)
integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 2*femtoseconds)
simulation = Simulation(pdb.topology, system, integrator)
simulation.context.setPositions(pdb.positions)
simulation.minimizeEnergy()


**<span style="color:#A03;font-size:14pt">
&#x270B; HANDS-ON! &#x1F528;
</span>**

> Modify the PDB file to understand which pieces of information are essential to apply the force-field definition. Try making changes to:
>
> 1. the atom names, 
> 2. the name of a residue, 
> 3. the order of the atoms (within one residue or mixing residues) and 
> 4. the presence of atoms.
>
> Also note that the PDB file contains no bonds. These are somehow reconstructed when loading the PDB file. Try displacing one atom over a large distance. Does this affect the bond detection?

In the next cell, an MD simulation is carried out and the trajectory is written to a DCD file, a compact binary file format for trajectory data. Besides saving disk space, DCD files also load much faster than PDB trajectory files.

In [None]:
simulation.reporters = []
simulation.reporters.append(DCDReporter('traj1.dcd', 100))
simulation.reporters.append(StateDataReporter(stdout, 1000, step=True,
        temperature=True, elapsedTime=True))
simulation.reporters.append(StateDataReporter("scalars1.csv", 100, time=True,
    potentialEnergy=True, totalEnergy=True, temperature=True))
simulation.step(100000)
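
As a quick sanity check on the settings in the cell above, the simulated time and the number of stored frames can be worked out by hand. The sketch below uses plain Python (no OpenMM) and assumes that the reporter writes one frame at every multiple of its interval; the variable names are only illustrative.

```python
# Settings copied from the simulation cell above.
n_steps = 100_000      # argument passed to simulation.step
timestep_fs = 2.0      # integrator time step in femtoseconds
dcd_interval = 100     # DCDReporter writes a frame every 100 steps

# 1 picosecond = 1000 femtoseconds.
total_time_ps = n_steps * timestep_fs / 1000.0
frame_spacing_ps = dcd_interval * timestep_fs / 1000.0
n_frames = n_steps // dcd_interval

print(f"simulated time: {total_time_ps} ps")
print(f"frames in traj1.dcd: {n_frames} (one every {frame_spacing_ps} ps)")
```

The same arithmetic applies to `scalars1.csv`, which is written with the same interval of 100 steps.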

In the next code cell, the potential energy as a function of time is plotted. It reveals a very short equilibration phase. Because the energy was first minimized, the potential energy starts low and increases quickly due to the motion of the atoms.

Normally, the equilibration phase should be discarded prior to further analysis. We will only do this in the example with explicit solvent in this notebook.

In [None]:
df1 = pandas.read_csv("scalars1.csv")
df1.plot(kind='line', x='#"Time (ps)"', y='Potential Energy (kJ/mole)')
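
The unusual column name `#"Time (ps)"` comes from the header line of the CSV file, which starts with a `#` character. The following sketch parses two sample lines with the standard `csv` module to show the effect. The header layout is an assumption inferred from the column names used in the cell above, and the numerical values are made up.

```python
import csv
import io

# Two illustrative lines in the assumed layout of scalars1.csv.
# The numbers are invented for this sketch.
sample = (
    '#"Time (ps)","Potential Energy (kJ/mole)",'
    '"Total Energy (kJ/mole)","Temperature (K)"\n'
    '0.2,-80.1,-20.3,250.7\n'
)

reader = csv.reader(io.StringIO(sample))
header = next(reader)
first_row = [float(value) for value in next(reader)]

# The first field does not start with a quote, so the '#' and the
# quotes are kept as part of the column name.
print(header[0])
# The second field is a regular quoted field: the quotes are stripped.
print(header[1])
```

Because the leading `#` is glued to the first quoted name, only the first column keeps its quotes in the column name.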

The visualization below shows a few changes in conformation of the dipeptide.

In [None]:
traj1 = mdtraj.load('traj1.dcd', top='alanine-dipeptide.pdb')
traj1.superpose(traj1, 0)
nglview.show_mdtraj(traj1)

The following cell defines a function to draw a "Ramachandran" plot with Matplotlib. As opposed to a conventional Ramachandran plot, the data points represent different conformations (time steps) of the same angles.

In [None]:
def plot_ramachandran(traj, phi_atoms=None, psi_atoms=None):
    """Generate a basic Ramachandran plot for a given trajectory.
    
    Parameters
    ----------
    traj
        An MDTraj trajectory object.
    phi_atoms
        A list of atom names (in order) to identify the phi angle.
        The defaults in MDTraj do not work for termini in CHARMM
        topologies, which can be fixed with this argument.
    psi_atoms
        A list of atom names (in order) to identify the psi angle.
        The defaults in MDTraj do not work for termini in CHARMM
        topologies, which can be fixed with this argument.

    """
    from matplotlib.gridspec import GridSpec
    if phi_atoms is None:
        phis = mdtraj.compute_phi(traj)[1].ravel()
    else:
        phis = mdtraj.compute_dihedrals(traj, mdtraj.geometry.dihedral._atom_sequence(traj.topology, phi_atoms)[1])
    if psi_atoms is None:
        psis = mdtraj.compute_psi(traj)[1].ravel()
    else:
        psis = mdtraj.compute_dihedrals(traj, mdtraj.geometry.dihedral._atom_sequence(traj.topology, psi_atoms)[1])
    fig = plt.figure()
    gs = GridSpec(nrows=2, ncols=3, wspace=0.1)
    # Ramachandran plot
    ax1 = fig.add_subplot(gs[:2, :2])
    ax1.plot(phis*180/np.pi, psis*180/np.pi, 'k+')
    ax1.set_aspect('equal', adjustable='box')
    ax1.axvline(0)
    ax1.axhline(0)
    ax1.set_xlim(-180, 180)
    ax1.set_ylim(-180, 180)
    ax1.set_xticks(np.linspace(-180, 180, 5))
    ax1.set_yticks(np.linspace(-180, 180, 5))
    ax1.set_xlabel("Phi [deg]")
    ax1.set_ylabel("Psi [deg]")
    # Phi(t) plot
    ax2 = fig.add_subplot(gs[0, 2])
    ax2.plot(np.arange(len(phis)), phis*180/np.pi, 'k+')
    ax2.axhline(0)
    ax2.set_ylim(-180, 180)
    ax2.set_yticks(np.linspace(-180, 180, 5))
    ax2.set_xlabel("Step")
    ax2.set_ylabel("Phi [deg]")
    # Psi(t) plot
    ax3 = fig.add_subplot(gs[1, 2])
    ax3.plot(np.arange(len(psis)), psis*180/np.pi, 'k+')
    ax3.axhline(0)
    ax3.set_ylim(-180, 180)
    ax3.set_yticks(np.linspace(-180, 180, 5))
    ax3.set_xlabel("Step")
    ax3.set_ylabel("Psi [deg]")
    plt.show()

# Function call to make the plot    
plot_ramachandran(traj1)
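
The phi and psi values in the plot are dihedral angles, each defined by four consecutive backbone atoms. To make this concrete, here is a minimal sketch in plain Python of how such an angle can be computed from four points. It is not the MDTraj implementation, and the sign convention (atan2 form, with 0 degrees for the cis arrangement) is an assumption of this sketch.

```python
import math

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points."""
    def sub(a, b):
        return [a[i] - b[i] for i in range(3)]

    def dot(a, b):
        return sum(a[i] * b[i] for i in range(3))

    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]

    # Three consecutive bond vectors.
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    # Normals of the two planes that share the central bond b2.
    n1, n2 = cross(b1, b2), cross(b2, b3)
    x = dot(n1, n2)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Four points with the central bond along the z axis.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))   # quarter turn
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))  # trans
```

Because the angle is periodic, values near -180 and +180 degrees describe nearly the same conformation, which is worth keeping in mind when reading the edges of the plot.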

## 2. Implicit solvent model (AMBER with GBSA-OBC)

OpenMM supports various parameterizations of the generalized Born surface area (GBSA) model, e.g. that by Onufriev, Bashford and Case (OBC), see https://doi.org/10.1002/prot.20033. For a limited number of force fields, atomic GBSA-OBC parameters were generated according to a recipe in the [TINKER](https://dasher.wustl.edu/tinker/) program:

- `amber96_obc.xml` for `amber96.xml`
- `amber03_obc.xml` for `amber03.xml`
- `amber10_obc.xml` for `amber10.xml`
- `amber99_obc.xml` for `amber99sb.xml`, `amber99sbildn.xml` or `amber99sbnmr.xml`

These can be used by changing the force field definition, e.g. to `ForceField('amber99sbildn.xml', 'amber99_obc.xml')`. Keep in mind that these extra atomic parameters for GBSA-OBC deviate from those in the [AMBER](http://ambermd.org/) program.
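
The pairing listed above can be captured in a small lookup table, which avoids combining a protein force field with the wrong GBSA-OBC file by accident. This is only a convenience sketch: the names `OBC_FILES` and `forcefield_files` are hypothetical helpers, not part of OpenMM.

```python
# Protein force-field file -> matching GBSA-OBC parameter file,
# copied from the list above.
OBC_FILES = {
    "amber96.xml": "amber96_obc.xml",
    "amber03.xml": "amber03_obc.xml",
    "amber10.xml": "amber10_obc.xml",
    "amber99sb.xml": "amber99_obc.xml",
    "amber99sbildn.xml": "amber99_obc.xml",
    "amber99sbnmr.xml": "amber99_obc.xml",
}

def forcefield_files(protein_xml):
    """Return the two file names to pass to ForceField(...)."""
    try:
        return protein_xml, OBC_FILES[protein_xml]
    except KeyError:
        raise ValueError(
            f"no GBSA-OBC parameter file listed for {protein_xml}"
        ) from None

print(forcefield_files("amber99sbildn.xml"))
```

With this helper, `ForceField(*forcefield_files('amber99sbildn.xml'))` would be equivalent to the call quoted above.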

One can also use implicit solvent models that match exactly those implemented in the AMBER program, and this is also the only way to use an implicit solvent model with AMBER ff14SB. To make this work, we can no longer simply start from a PDB file. Instead we have to load the structure, the topology and the force field parameters from so-called INPCRD and PRMTOP files. These files can be constructed with [AmberTools](http://ambermd.org/AmberTools.php) or [CHARMM-GUI](http://www.charmm-gui.org), starting from a PDB file and a force field definition. For this example, the files were prepared previously.

In [None]:
prmtop = AmberPrmtopFile('alanine-dipeptide.prmtop')
inpcrd = AmberInpcrdFile('alanine-dipeptide.inpcrd')
system = prmtop.createSystem(implicitSolvent=OBC2, nonbondedCutoff=3*nanometer, constraints=HBonds)
integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 0.002*picoseconds)
simulation = Simulation(prmtop.topology, system, integrator)
simulation.context.setPositions(inpcrd.positions)
simulation.minimizeEnergy()
simulation.reporters.append(DCDReporter('traj2.dcd', 100))
simulation.reporters.append(StateDataReporter(stdout, 1000, step=True,
        temperature=True, elapsedTime=True))
simulation.reporters.append(StateDataReporter("scalars2.csv", 100, time=True,
    potentialEnergy=True, totalEnergy=True, temperature=True))
simulation.step(100000)

With the implicit solvent model, the computational cost increases by approximately 50%. (This increase might vary with system size.)

The potential energy as a function of time shows a short equilibration phase, comparable to the previous example:

In [None]:
df2 = pandas.read_csv("scalars2.csv")
df2.plot(kind='line', x='#"Time (ps)"', y='Potential Energy (kJ/mole)')

We can now compare the structure of the backbone with and without implicit solvent. Also compare with figures obtained by other students.

In [None]:
traj2 = mdtraj.load('traj2.dcd', top='alanine-dipeptide.prmtop')
plot_ramachandran(traj2)

## 3. Explicit solvent model (AMBER)

Any implicit solvent model will always be a serious approximation of explicit water molecules surrounding a solute. Before we can run such a simulation, we need to construct a topology and initial geometry of alanine dipeptide with a large number of water molecules. This can be done in several ways and for simplicity, we will first use the built-in tools from OpenMM. For this example, we can again start from a PDB file, no need for INPCRD or PRMTOP files.

One is in principle free to combine any biomolecule with any water force field, but not all combinations may have been carefully tested. Always check the original papers in which the biomolecular force field was published to select matching pairs. To avoid confusion, the latest AMBER and CHARMM force fields in OpenMM are bundled with matching water force fields. Here are a few sensible combinations:

* `amber14-all.xml` can be paired with any TIP or SPC model in the `amber14` directory, see https://github.com/openmm/openmm/tree/master/wrappers/python/simtk/openmm/app/data/amber14

* `charmm36.xml` should be combined with water force fields in the `charmm36` directory, see https://github.com/openmm/openmm/tree/master/wrappers/python/simtk/openmm/app/data/charmm36. Note that `charmm36/water.xml` is a slightly modified form of TIP3P.

* The parameters in `amber99sbildn.xml` [were tested](https://dx.doi.org/10.1002%2Fprot.22711) with TIP3P (`tip3p.xml`) and TIP4P-Ew (`tip4pew.xml`), see https://github.com/openmm/openmm/tree/master/wrappers/python/simtk/openmm/app/data.

The files from the GitHub repository should normally never be downloaded; the links are only provided for reference. These files are included in any OpenMM installation.

In [None]:
pdb = PDBFile('alanine-dipeptide.pdb')
modeller = Modeller(pdb.topology, pdb.positions)
forcefield = ForceField('amber14-all.xml', 'amber14/tip3pfb.xml')
modeller.addSolvent(forcefield, model='tip3p', padding=1*nanometer)
print(modeller.topology)
# Write a PDB file to provide a topology of the solvated
# system to MDTraj below.
with open('init3.pdb', 'w') as outfile:
    PDBFile.writeFile(modeller.topology, modeller.positions, outfile)

# The modeller builds a periodic box with the solute and solvent molecules.
# PME is the method to compute long-range electrostatic interactions in
# periodic systems.
system = forcefield.createSystem(modeller.topology, nonbondedMethod=PME, constraints=HBonds)
integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 2*femtoseconds)
simulation = Simulation(modeller.topology, system, integrator)
simulation.context.setPositions(modeller.positions)
simulation.minimizeEnergy()
simulation.reporters.append(DCDReporter('traj3.dcd', 100))
simulation.reporters.append(StateDataReporter(stdout, 1000, step=True,
        temperature=True, elapsedTime=True))
simulation.reporters.append(StateDataReporter("scalars3.csv", 100, time=True,
    potentialEnergy=True, totalEnergy=True, temperature=True))
simulation.step(100000)

This computation was considerably slower, approximately a factor of 10 more expensive.

Before going further, we plot the potential energy as a function of time, to estimate the length of the equilibration phase. There was barely any equilibration in the previous two runs, mainly because the few degrees of freedom were all relatively fast motions. More complex structures tend to also exhibit slower motions. For example, in our final run with explicit water molecules, the solvent needs more time to equilibrate.

In [None]:
df3 = pandas.read_csv("scalars3.csv")
df3.plot(kind='line', x='#"Time (ps)"', y='Potential Energy (kJ/mole)')

The figure above shows that approximately the first 15 picoseconds are required for the equilibration. Results for these steps need to be removed before performing any analysis. A single MD step takes 2 femtoseconds and a frame is written to the DCD file only every 100 steps, which means that the first 75 frames of the trajectory should be removed. For the visualization, we still look at all frames.
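
The number of frames to discard follows directly from the numbers quoted above. A minimal sketch of that arithmetic, using integer femtoseconds to avoid rounding issues (the variable names are only illustrative):

```python
equilibration_fs = 15_000   # roughly 15 ps, read off the energy plot
timestep_fs = 2             # one MD step
report_interval = 100       # steps between two stored frames

# Time covered by one stored frame: 100 steps * 2 fs = 200 fs = 0.2 ps.
frame_spacing_fs = timestep_fs * report_interval
n_discard = equilibration_fs // frame_spacing_fs

print(f"discard the first {n_discard} frames")
```

If the equilibration time read from your own plot differs, only `equilibration_fs` needs to change.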

In [None]:
traj3 = mdtraj.load('traj3.dcd', top='init3.pdb')
view = nglview.show_mdtraj(traj3)
view.clear_representations()
view.add_licorice()
view.add_unitcell()
view

A few remarks on the visualization:

- The absolute position of the dipeptide relative to the box is not crucial. Due to the periodic boundary conditions, all molecules interact with an infinitely large environment, of which the visualization only shows a small fragment.

- The dipeptide slowly diffuses through the liquid, which is the expected behavior. Water molecules make larger jumps because they are smaller and lighter.
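
To illustrate what periodic boundary conditions mean for coordinates, the sketch below shows two common operations along a single axis: wrapping a coordinate back into the primary cell and taking the shortest (minimum-image) displacement between two particles. It assumes a rectangular box and is not how OpenMM implements this internally.

```python
def wrap(x, box_length):
    """Map a coordinate into the primary cell [0, box_length)."""
    return x % box_length

def minimum_image(dx, box_length):
    """Shortest displacement along one axis, considering periodic images."""
    return dx - box_length * round(dx / box_length)

box = 3.0  # box length in nanometers, an arbitrary value for this sketch

# A particle that drifted out of the box reappears on the other side.
print(wrap(3.2, box))
print(wrap(-0.1, box))

# Two particles 2.9 nm apart inside the box are only 0.1 nm apart
# through the periodic boundary.
print(minimum_image(2.9, box))
```

This is why a molecule that appears to jump across the box in the visualization has not actually made a large move.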

In the next cell, the equilibration phase is discarded and the water molecules are removed, before making the Ramachandran plot.

In [None]:
traj3.restrict_atoms(traj3.topology.select("protein"))
plot_ramachandran(traj3[75:])

The Ramachandran plot is strongly influenced by the choice of solvent model. Already for a simple dipeptide, the limitations of an implicit solvent model are clear. The reason is that the amide groups participate in the hydrogen-bonding network of the solvent, involving specific and directional interactions, which are absent in implicit solvent models.

Another difference with the first two runs is the slower change of the backbone angles. There are fewer changes between conformations and switching also takes longer. This is due to the friction with and the inertia of the surrounding water molecules. Hence, a single time step with explicit solvent is more costly and one has to perform longer simulations because water slows down conformational changes. For this reason, implicit solvent models are still popular, despite the fact that they are very approximate.

## 4. Explicit solvent model (CHARMM without CMAP)

**<span style="color:#A03;font-size:14pt">
&#x270B; HANDS-ON! &#x1F528;
</span>**

> CHARMM-compatible PDB and PSF files are not included in this example. Follow the instructions below to create these files and test them with the code cells provided below.


Before using [CHARMM-GUI](http://charmm-gui.org), we first need to create a simple PDB file of a single alanine residue, by stripping this part out of a larger PDB file. Put the following in a file `ala.pdb`:

```
ATOM      7  N   ALA X   2       8.270  24.640   0.690  1.00  0.00            
ATOM      9  CA  ALA X   2       7.480  23.690  -0.190  1.00  0.00            
ATOM     11  CB  ALA X   2       8.470  23.160  -1.270  1.00  0.00            
ATOM     15  C   ALA X   2       6.730  22.590   0.490  1.00  0.00            
ATOM     16  O   ALA X   2       7.340  21.880   1.280  1.00  0.00            
TER
END
```

This was used as PDB input file for the [CHARMM-GUI Solution Builder](http://charmm-gui.org/?doc=input/solution), using the following options:

- Start page: select ala.pdb for upload and set PDB format to PDB.
- Page 1 (PDB Info): keep default settings.
- Page 2 (PDB Info): apply terminal patching with ACE and CT3.
- Page 3 (CHARMM PDB): disable "include ions", for consistency with the previous example.
- Page 4 (Solvator): keep default settings.
- Page 5 (PBC Setup): keep default settings.
- Page 6 (Input generator): click on the red `download.tgz` button and save it as `charmm-.

Unpack the TGZ file and make sure all its contents end up in a subdirectory `charmm-gui-nocmap` next to the python notebook. The code cells below will load various files from the `charmm-gui-nocmap` directory to run a simulation with the CHARMM36 force field. These files include geometry and topology of the solvated dipeptide and CHARMM36 force field parameters.

In [None]:
def readBox(fnbox):
    """Read the box size from a step2.1_waterbox.prm file."""
    with open(fnbox) as f:
        for line in f:
            segments = line.split('=')
            if segments[0].strip() == "SET A":
                a = float(segments[1])
            if segments[0].strip() == "SET B":
                b = float(segments[1])
            if segments[0].strip() == "SET C":
                c = float(segments[1])
    return a*angstroms, b*angstroms, c*angstroms

pdb = PDBFile('charmm-gui-nocmap/step3_pbcsetup.pdb')
psf = CharmmPsfFile('charmm-gui-nocmap/step3_pbcsetup.psf')
psf.setBox(*readBox('charmm-gui-nocmap/step2.1_waterbox.prm'))
params = CharmmParameterSet(
    "charmm-gui-nocmap/toppar/par_all36m_prot.prm",
    "charmm-gui-nocmap/toppar/top_all36_prot.rtf",
    "charmm-gui-nocmap/toppar/toppar_water_ions.str",
)

system = psf.createSystem(params, nonbondedMethod=PME, constraints=HBonds)
integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 2*femtoseconds)
simulation = Simulation(psf.topology, system, integrator)
simulation.context.setPositions(pdb.positions)
simulation.minimizeEnergy()
simulation.reporters.append(DCDReporter('traj4.dcd', 100))
simulation.reporters.append(StateDataReporter(stdout, 100, step=True,
        temperature=True, elapsedTime=True))
simulation.reporters.append(StateDataReporter("scalars4.csv", 100, time=True,
    potentialEnergy=True, totalEnergy=True, temperature=True))
simulation.step(100000)
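
The box lengths are taken from lines of the form `SET A = ...` in the `step2.1_waterbox.prm` file. The sketch below parses a few such lines from an in-memory sample, which makes it easy to check the parsing without the CHARMM-GUI output at hand. The sample content is hypothetical and only mimics the pattern that `readBox` looks for; `parse_box` is an illustrative, whitespace-tolerant variant, not a replacement for the function above.

```python
def parse_box(lines):
    """Collect the A, B and C box lengths from 'SET X = value' lines."""
    box = {}
    for line in lines:
        key, sep, value = line.partition("=")
        words = key.split()
        is_box_line = (
            sep == "="
            and len(words) == 2
            and words[0] == "SET"
            and words[1] in ("A", "B", "C")
        )
        if is_box_line:
            box[words[1]] = float(value)
    return box["A"], box["B"], box["C"]

# Hypothetical file content, for illustration only.
sample_lines = [
    " SET A = 30.0",
    " SET B = 31.0",
    " SET C = 32.0",
    " SET ALPHA = 90.0",
]
print(parse_box(sample_lines))
```

Using three different lengths in the sample makes it obvious when one of the box vectors is assigned from the wrong line.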

In [None]:
df4 = pandas.read_csv("scalars4.csv")
df4.plot(kind='line', x='#"Time (ps)"', y='Potential Energy (kJ/mole)')

In [None]:
traj4 = mdtraj.load('traj4.dcd', top='charmm-gui-nocmap/step3_pbcsetup.psf')
view = nglview.show_mdtraj(traj4)
view.clear_representations()
view.add_licorice()
view.add_unitcell()
view

In [None]:
traj4.restrict_atoms(traj4.topology.select("protein"))
# non-standard atom names are needed due to quirky CHARMM atom names.
plot_ramachandran(traj4, ['CY', 'N', 'CA', 'C'], ['N', 'CA', 'C', 'NT'])

## 4. Explicit solvent model (CHARMM with CMAP)

The failure of CHARMM-GUI to add the CMAP terms in the previous example is a consequence of how the CHARMM36 residue topologies are stored in the file `top_all36_prot.rtf`. To circumvent this issue, one has to use one large residue for the entire dipeptide with label `ALAD`. This is not as straightforward and requires us to jump through a few hoops.

**<span style="color:#A03;font-size:14pt">
&#x270B; HANDS-ON! &#x1F528;
</span>**

> Prepare new CHARMM compatible PDB and PSF files with CHARMM-GUI, by following the steps below.

- Create a file `alad.pdb`, which is almost the same as the given `alanine-dipeptide.pdb` file. The only differences are (i) all atoms are part of the same `ALAD` residue and (ii) a residue index is added. (This must be present to make CHARMM-GUI work.) You should have the following:

```
ATOM      1 HH31 ALADA   2       2.000   1.000  -0.000  1.00  0.00
ATOM      2  CH3 ALADA   2       2.000   2.090   0.000  1.00  0.00
ATOM      3 HH32 ALADA   2       1.486   2.454   0.890  1.00  0.00
ATOM      4 HH33 ALADA   2       1.486   2.454  -0.890  1.00  0.00
ATOM      5  C   ALADA   2       3.427   2.641  -0.000  1.00  0.00
ATOM      6  O   ALADA   2       4.391   1.877  -0.000  1.00  0.00
ATOM      7  N   ALADA   2       3.555   3.970  -0.000  1.00  0.00
ATOM      8  H   ALADA   2       2.733   4.556  -0.000  1.00  0.00
ATOM      9  CA  ALADA   2       4.853   4.614  -0.000  1.00  0.00
ATOM     10  HA  ALADA   2       5.408   4.316   0.890  1.00  0.00
ATOM     11  CB  ALADA   2       5.661   4.221  -1.232  1.00  0.00
ATOM     12  HB1 ALADA   2       5.123   4.521  -2.131  1.00  0.00
ATOM     13  HB2 ALADA   2       6.630   4.719  -1.206  1.00  0.00
ATOM     14  HB3 ALADA   2       5.809   3.141  -1.241  1.00  0.00
ATOM     15  C   ALADA   2       4.713   6.129   0.000  1.00  0.00
ATOM     16  O   ALADA   2       3.601   6.653   0.000  1.00  0.00
ATOM     17  N   ALADA   2       5.846   6.835   0.000  1.00  0.00
ATOM     18  H   ALADA   2       6.737   6.359  -0.000  1.00  0.00
ATOM     19  CH3 ALADA   2       5.846   8.284   0.000  1.00  0.00
ATOM     20 HH31 ALADA   2       4.819   8.648   0.000  1.00  0.00
ATOM     21 HH32 ALADA   2       6.360   8.648   0.890  1.00  0.00
ATOM     22 HH33 ALADA   2       6.360   8.648  -0.890  1.00  0.00
TER   
END   
```
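
PDB files use fixed columns, so the position of each field matters when editing them by hand. The following sketch slices one line of the file above into its fields. The column ranges follow the usual PDB layout, with the assumption that a four-letter residue name such as `ALAD` extends into column 21, directly before the chain identifier. `parse_atom_line` is a hypothetical helper for illustration, not an OpenMM function.

```python
def parse_atom_line(line):
    """Split a fixed-column PDB ATOM record into a few named fields."""
    return {
        "serial": int(line[6:11]),        # columns 7-11
        "name": line[12:16].strip(),      # columns 13-16
        "resname": line[17:21].strip(),   # columns 18-21 (4-letter name)
        "chain": line[21],                # column 22
        "resseq": int(line[22:26]),       # columns 23-26
        "xyz": (
            float(line[30:38]),           # columns 31-38
            float(line[38:46]),           # columns 39-46
            float(line[46:54]),           # columns 47-54
        ),
    }

# One record copied from alad.pdb above.
line = "ATOM      9  CA  ALADA   2       4.853   4.614  -0.000  1.00  0.00"
record = parse_atom_line(line)
print(record)
```

Shifting any field by a single character changes how the line is interpreted, which is a common source of errors when preparing input for tools like CHARMM-GUI.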

Start a new session in the [CHARMM-GUI Solution Builder](http://charmm-gui.org/?doc=input/solution), using the following options:

- Start page: select alad.pdb for upload and set PDB format to PDB.
- Page 1 (PDB Info): check the "hetero" box because CHARMM-GUI can only recognize the atoms as hetero-atoms.
- Page 2 (PDB Info): under "Reading Hetero Chain Residues", select "Rename to ALAD". Leave other settings as they are.
- Page 3 (CHARMM PDB): disable "include ions", for consistency with the previous example.
- Page 4 (Solvator): keep default settings.
- Page 5 (PBC Setup): keep default settings.
- Page 6 (Input generator): click on the red `download.tgz` button and save it as `charmm-gui-cmap.tgz`.

Unpack the TGZ file and make sure all its contents end up in a subdirectory `charmm-gui-cmap` next to the python notebook. With that, the following code should work.

In [None]:
pdb = PDBFile('charmm-gui-cmap/step3_pbcsetup.pdb')
psf = CharmmPsfFile('charmm-gui-cmap/step3_pbcsetup.psf')
psf.setBox(*readBox('charmm-gui-cmap/step2.1_waterbox.prm'))
params = CharmmParameterSet(
    "charmm-gui-cmap/toppar/par_all36m_prot.prm",
    "charmm-gui-cmap/toppar/top_all36_prot.rtf",
    "charmm-gui-cmap/toppar/toppar_water_ions.str",
)

system = psf.createSystem(params, nonbondedMethod=PME, constraints=HBonds)
integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 2*femtoseconds)
simulation = Simulation(psf.topology, system, integrator)
simulation.context.setPositions(pdb.positions)
simulation.minimizeEnergy()
simulation.reporters.append(DCDReporter('traj5.dcd', 100))
simulation.reporters.append(StateDataReporter(stdout, 100, step=True,
        temperature=True, elapsedTime=True))
simulation.reporters.append(StateDataReporter("scalars5.csv", 100, time=True,
    potentialEnergy=True, totalEnergy=True, temperature=True))
simulation.step(100000)

In [None]:
df5 = pandas.read_csv("scalars5.csv")
df5.plot(kind='line', x='#"Time (ps)"', y='Potential Energy (kJ/mole)')

In [None]:
traj5 = mdtraj.load('traj5.dcd', top='charmm-gui-cmap/step3_pbcsetup.psf')
view = nglview.show_mdtraj(traj5)
view.clear_representations()
view.add_licorice()
view.add_unitcell()
view

In [None]:
traj5.restrict_atoms(traj5.topology.select("protein"))
# CHARMM has even quirkier atom names for ALAD. 
plot_ramachandran(traj5, ['CLP', 'NL', 'CA', 'CRP'], ['NL', 'CA', 'CRP', 'NR'])

## 5. Explicit solvent model (AMBER from CHARMM-GUI)

You can skip this section. While this is supposed to work, it doesn't. (Some of the required files returned by CHARMM-GUI are empty. Others contain garbage.)

To create the input files, follow exactly the same procedure as in example 4, but change the force field from `CHARMM36m` to `AMBER`. Unpack the resulting files in a directory `charmm-gui-amber`, with which the following code cell should work.

In [None]:
prmtop = AmberPrmtopFile('charmm-gui-amber/amber/TODO.prmtop')
inpcrd = AmberInpcrdFile('charmm-gui-amber/amber/TODO.inpcrd')
system = prmtop.createSystem(nonbondedMethod=PME, constraints=HBonds)
integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 0.002*picoseconds)
simulation = Simulation(prmtop.topology, system, integrator)
simulation.context.setPositions(inpcrd.positions)
simulation.minimizeEnergy()
simulation.reporters.append(DCDReporter('traj6.dcd', 100))
simulation.reporters.append(StateDataReporter(stdout, 1000, step=True,
        temperature=True, elapsedTime=True))
simulation.reporters.append(StateDataReporter("scalars6.csv", 100, time=True,
    potentialEnergy=True, totalEnergy=True, temperature=True))
simulation.step(100000)