In [None]:
"""Alkane Simulation and Visualization"""

__authors__ = ["Olaseni Sode","Paul Nerenberg"]
__email__   = ["osode@calstatela.edu","pnerenb@calstatela.edu"]
__date__      = "2024-9-25"

# Alkane Simulation and Analysis

In this exercise, you will learn how to set up and run molecular dynamics (MD) simulations using OpenMM for two simple hydrocarbon systems: ethane and halogenated butane. You will also visualize and analyze the simulation results. This notebook will guide you through key steps in MD simulations, including system setup, energy minimization, equilibration, and production runs.


In [None]:
#from simtk.openmm import app
#import simtk.openmm as mm
#from simtk import unit
from openmm import app
from openmm.app import Modeller
from openmm.app import PDBFile
import openmm as mm

from openmm import unit
from sys import stdout
import time

## 1. Gas Phase Simulation of Ethane

In this section, you will set up, run, and analyze a molecular dynamics (MD) simulation for ethane in the gas phase. Ethane is a simple hydrocarbon, making it a good starting point for learning how to perform and interpret MD simulations. The tasks in this section will guide you through the key stages of a simulation workflow.

#### A. **Create and Load the Ethane Molecule and Force Field**
   - First, you will create the molecular structure of ethane from a provided PDB file. The PDB file contains the atomic positions and bonding information necessary for the simulation.
   - You will use the `PDBFile` function from OpenMM to read the ethane structure into the simulation environment.
   - Next, you will create and assign a force field to ethane. A force field defines the potential energy function that governs the interactions between atoms (bonds, angles, torsions, and non-bonded interactions). In this case, you will use the **General Amber Force Field 2 (GAFF2)**, which is appropriate for hydrocarbons like ethane.
   - You will load the force field into OpenMM, which will automatically assign parameters to the ethane atoms.

#### B. **Energy Minimization and Equilibration**
   - Before running the simulation, you need to perform energy minimization. This process reduces any high-energy contacts that might make the system unstable during dynamics.
   - Energy minimization is essential for finding a stable initial configuration by relaxing the molecule to its nearest local energy minimum.
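   - You will run a minimization algorithm and track how the potential energy of the system decreases as the molecule relaxes.
   - You will then run a short equilibration simulation to bring the molecule up to the desired temperature before collecting data.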

#### C. **Molecular Dynamics (MD) Simulation**
   - After minimization and equilibration, you will set up and run the molecular dynamics simulation.
   - *Integrator Setup*: You'll use a **Langevin integrator** to simulate the system under constant temperature conditions (298.15 K). The integrator will help maintain the system temperature by simulating collisions with a heat bath.
   - *Run the Simulation*: You will run the MD simulation for a specific number of time steps, allowing the ethane molecule to evolve over time. The simulation will generate a trajectory that shows how the atoms in the molecule move throughout the simulation.
   
#### D. **Trajectory Analysis**
   - Once the simulation is complete, you will analyze the trajectory of the ethane molecule to understand its motion.
   - Using `mdtraj` and `nglview`, you will visualize how the ethane molecule evolves over time and compute structural properties, such as bond lengths.
   - For example, you will calculate the **C-C bond distance** between the two carbon atoms over the course of the simulation.

#### E. **Visualization of Results**
   - Finally, you will visualize the results using matplotlib or another plotting library. This will allow you to observe how the energy and bond distances change over time during the simulation.
   - You will generate a plot showing the **potential energy** of the ethane molecule as a function of simulation time, as well as the **C-C bond distance** variation.

By the end of this section, you will have a deeper understanding of how molecular dynamics simulations are performed and how to interpret the data generated from these simulations. This foundation will prepare you for more complex systems in future exercises.


#### Create PDB and Force Field

In the following cells, you will create the coordinates and force field needed for your simulation. The coordinates are provided in PDB format, while the force field is defined in XML format. 

Pay close attention to the parameters in the force field:
- Equilibrium bond distances, angles, and dihedrals
- Force constants for bonds, angles, and torsions
- Atomic charges
- Non-bonded interaction values

These parameters define how the atoms in your system interact during the simulation.
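
To get a feel for what a force constant means, here is a minimal sketch that evaluates the harmonic bond energy, $E = \tfrac{1}{2}k(r - r_0)^2$, for the C-H bond. It assumes the OpenMM XML conventions (lengths in nm, bond force constants in kJ/mol/nm²) and copies `r0` and `k` from the `c3`-`hc` entry of the ethane force field file used in this notebook. The function name `harmonic_bond_energy` is made up for this illustration and is not part of OpenMM.

```python
# Sketch: harmonic bond energy in OpenMM's unit system (nm, kJ/mol).
# harmonic_bond_energy is a hypothetical helper, not an OpenMM function.

def harmonic_bond_energy(r_nm, r0_nm, k):
    """E = 1/2 * k * (r - r0)^2, with k in kJ/mol/nm^2 and lengths in nm."""
    return 0.5 * k * (r_nm - r0_nm) ** 2

r0_ch = 0.10970      # equilibrium c3-hc bond length (nm), from ethane.gaff2.xml
k_ch = 314568.76     # c3-hc force constant (kJ/mol/nm^2), from ethane.gaff2.xml

# Energy cost of compressing or stretching the bond by 0.005 nm (0.05 Angstrom)
for r in (0.1047, 0.1097, 0.1147):
    e = harmonic_bond_energy(r, r0_ch, k_ch)
    print(f"r = {r:.4f} nm -> E = {e:.2f} kJ/mol")
```

A displacement of only 0.05 Å already costs a few kJ/mol, which is comparable to the thermal energy at room temperature. This is why bond lengths fluctuate only slightly during a simulation.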

In [None]:
# Define the content of the ethane.pdb file
pdb_content = """\
ATOM      1  C1  ETH     1      -3.553   2.382   0.000  1.00  0.00           C  
ATOM      2  H11 ETH     1      -3.940   1.922   0.912  1.00  0.00           H  
ATOM      3  H12 ETH     1      -3.941   1.831  -0.859  1.00  0.00           H  
ATOM      4  H13 ETH     1      -3.919   3.410  -0.053  1.00  0.00           H  
ATOM      5  C2  ETH     1      -2.016   2.361   0.000  1.00  0.00           C  
ATOM      6  H21 ETH     1      -1.649   1.333   0.053  1.00  0.00           H  
ATOM      7  H22 ETH     1      -1.627   2.912   0.859  1.00  0.00           H  
ATOM      8  H23 ETH     1      -1.629   2.821  -0.912  1.00  0.00           H  
CONECT    1    2    
CONECT    1    3  
CONECT    1    4  
CONECT    1    5  
CONECT    5    6  
CONECT    5    7
CONECT    5    8
"""

# Create and write the content to ethane.pdb file
with open("ethane.pdb", "w") as f:
    f.write(pdb_content)

print("ethane.pdb file created successfully.")

In [None]:
# Define the content of the ethane.gaff2.xml file
xml_content = """\
<ForceField>
     <AtomTypes>
      <Type name="0" class="c3" element="C" mass="12.01078"/>
      <Type name="1" class="hc" element="H" mass="1.007947"/>
     </AtomTypes>
     <Residues>
      <Residue name="ETH">
       <Atom name="C1" type="0"/>
       <Atom name="H11" type="1"/>
       <Atom name="H12" type="1"/>
       <Atom name="H13" type="1"/>
       <Atom name="C2" type="0"/>
       <Atom name="H21" type="1"/>
       <Atom name="H22" type="1"/>
       <Atom name="H23" type="1"/>
       <Bond atomName1="C1" atomName2="H11"/>
       <Bond atomName1="C1" atomName2="H12"/>
       <Bond atomName1="C1" atomName2="H13"/>
       <Bond atomName1="C1" atomName2="C2"/>
       <Bond atomName1="C2" atomName2="H21"/>
       <Bond atomName1="C2" atomName2="H22"/>
       <Bond atomName1="C2" atomName2="H23"/>
      </Residue>
     </Residues>
     <HarmonicBondForce>
      <Bond class1="c3" class2="c3" length="0.15380" k="194572.74"/>
      <Bond class1="c3" class2="hc" length="0.10970" k="314568.76"/>
     </HarmonicBondForce>
     <HarmonicAngleForce>
      <Angle class1="c3" class2="c3" class3="hc" angle="1.91637152" k="391.756288"/>
      <Angle class1="hc" class2="c3" class3="hc" angle="1.87762521" k="326.01728"/>
     </HarmonicAngleForce>
     <PeriodicTorsionForce>
      <Proper class1="hc" class2="c3" class3="c3" class4="hc" periodicity1="3" phase1="0.0" k1="0.50208"/>
     </PeriodicTorsionForce>
     <NonbondedForce coulomb14scale="0.833333" lj14scale="0.5">
      <Atom type="0" charge="-0.094100" sigma="0.3397710" epsilon="0.4510352"/>
      <Atom type="1" charge="0.031700" sigma="0.2600177" epsilon="0.0870272"/>
     </NonbondedForce>
</ForceField>
"""

# Create and write the content to ethane.gaff2.xml file
with open("ethane.gaff2.xml", "w") as f:
    f.write(xml_content)

print("ethane.gaff2.xml file created successfully.")


#### Load PDB and Force Field

Next, we will load the PDB file and the force field into OpenMM. By loading both, you’ll provide OpenMM with all the information it needs to compute the forces and energies during the simulation.


In [None]:
# read in a starting structure for ethane and the
# corresponding force field file
pdb = app.PDBFile('ethane.pdb')
forcefield = app.ForceField('ethane.gaff2.xml')

In [None]:
import mdtraj as md
import nglview as ngl

eth = md.load('ethane.pdb')
visualize = ngl.show_mdtraj(eth)
visualize

#### Setup simulation

The following cell will set up your simulation: it builds the system from the topology and force field, applies the integrator, and sets the temperature and simulation time step.

In [None]:
# setup system by taking topology from pdb file;
system = forcefield.createSystem(pdb.topology, nonbondedMethod=app.NoCutoff, 
                                 constraints=app.HBonds)
# run gas phase simulation 
# using a Langevin thermostat (integrator)
# at 298.15 K  
# with coupling constant of 5.0 ps^-1
# with 2 fs time step (using SHAKE)
integrator = mm.LangevinIntegrator(298.15*unit.kelvin, 5.0/unit.picoseconds, 
                                   2.0*unit.femtoseconds)
integrator.setConstraintTolerance(1e-5)

platform = mm.Platform.getPlatformByName('Reference')

# A simulation ties together various objects used for running a simulation
simulation = app.Simulation(pdb.topology, system, integrator, platform)
simulation.context.setPositions(pdb.positions)

#### Energy Minimization

Energy minimization reduces the potential energy of the system before beginning dynamics. This eliminates "bad" (i.e., overly close) contacts and generally leads to more stable simulation behavior.
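
The idea behind minimization can be illustrated with a toy example: repeatedly move a coordinate "downhill" along the negative gradient of the energy. The sketch below does this for a single harmonic C-C bond using the `c3`-`c3` parameters from the ethane force field file. It is an illustration only. OpenMM's `minimizeEnergy` uses the L-BFGS algorithm on all coordinates at once, and the starting bond length and step size here are arbitrary choices made for this example.

```python
# Toy steepest-descent minimization of one harmonic C-C bond.
# Illustration only: this is not the algorithm that OpenMM runs.
r0 = 0.15380        # equilibrium c3-c3 bond length (nm), from ethane.gaff2.xml
k = 194572.74       # c3-c3 force constant (kJ/mol/nm^2), from ethane.gaff2.xml

def energy(r):
    """Harmonic bond energy in kJ/mol."""
    return 0.5 * k * (r - r0) ** 2

def gradient(r):
    """Derivative of the energy with respect to r (kJ/mol/nm)."""
    return k * (r - r0)

r = 0.1600          # hypothetical stretched starting length (nm)
step = 0.5 / k      # step size chosen so each move halves the displacement
for iteration in range(10):
    print(f"iter {iteration}: r = {r:.5f} nm, E = {energy(r):.4f} kJ/mol")
    r -= step * gradient(r)   # move against the gradient
```

The printed energy drops at every iteration and the bond length approaches `r0`, which mirrors the before/after energies you will print for the real molecule.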

In [None]:
print('Minimizing...')

st = simulation.context.getState(getPositions=True,getEnergy=True)
print("Potential energy before minimization is %s" % st.getPotentialEnergy())

simulation.minimizeEnergy(maxIterations=100)

st = simulation.context.getState(getPositions=True,getEnergy=True)
print("Potential energy after minimization is %s" % st.getPotentialEnergy())

#### Equilibration

We will run a (very) short equilibration simulation to bring the molecule up to our desired temperature.  If this were a periodic system, we would also aim to bring the density/volume to equilibrium at the desired pressure.

In [None]:
print('Equilibrating...')

simulation.reporters.append(app.StateDataReporter(stdout, 100, step=True, 
    potentialEnergy=True, temperature=True, separator='\t'))
simulation.context.setVelocitiesToTemperature(150.0*unit.kelvin)
simulation.step(2500)

#### Production

Now we run a long MD simulation with parameters that are identical to the equilibration phase (other than simulation length, of course!).  We will also save a trajectory file (i.e., coordinates vs. time) of this simulation that we can analyze afterward using MDTraj (or other trajectory analysis tools).
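
The step counts used in the production cell follow from simple arithmetic: simulated time equals the number of steps multiplied by the time step. The sketch below shows that bookkeeping for the 20 ns ethane run with a 2 fs time step. The helper names `n_steps` and `n_frames` are hypothetical and exist only for this illustration.

```python
# Sketch: convert a target simulation length into a number of MD steps,
# and count how many trajectory frames a reporter interval will produce.
# n_steps and n_frames are hypothetical helpers, not OpenMM functions.

def n_steps(duration_ps, timestep_fs=2.0):
    """Number of integration steps needed to cover duration_ps."""
    return round(duration_ps * 1000.0 / timestep_fs)   # 1 ps = 1000 fs

def n_frames(total_steps, report_interval):
    """Number of frames written when saving every report_interval steps."""
    return total_steps // report_interval

steps = n_steps(20_000)          # 20 ns = 20,000 ps
frames = n_frames(steps, 100)    # one frame every 100 steps (0.2 ps)
print(steps, frames)
```

For this run the trajectory will therefore hold 100,000 frames, which is the number of samples available for the histograms later in the analysis.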

In [None]:
print('Running Production...')

tinit=time.time()
simulation.reporters.clear()
# output basic simulation information below every 250000 steps/500 ps
simulation.reporters.append(app.StateDataReporter(stdout, 250000, 
    step=True, time=True, potentialEnergy=True, temperature=True, 
    speed=True, separator='\t'))
# write out a trajectory (i.e., coordinates vs. time) to a DCD
# file every 100 steps/0.2 ps
simulation.reporters.append(app.DCDReporter('ethane_sim.dcd', 100))

# run the simulation for 1.0x10^7 steps/20 ns
simulation.step(10000000)
tfinal=time.time()
print('Done!')
print('Time required for simulation:', tfinal-tinit, 'seconds')

#### Trajectory Analysis

Now that you have simulated your system, you can begin to analyze the trajectory and visualize the simulation. To begin, you should load the necessary libraries. 

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import mdtraj as md


%matplotlib inline

#### Load Trajectory and Characterize Bonds

Next, you should load the trajectory that you specified earlier. Remember it ends with `.dcd`. You will use the `load` command from the `mdtraj` library. You'll also need to specify the topology from the original PDB file (`.pdb`) that contains our molecule.

If you convert the trajectory's topology to a dataframe, it will be easier to view the atoms (and bonds). Afterward, you can show the first 20 lines of the atoms dataframe with the command `atoms.head(20)`.

The structure of a Topology object is similar to that of a PDB file. It consists of a set of 'Chains'. Each 'Chain' contains a set of 'Residues', and each 'Residue' contains a set of 'Atoms'. In addition, the Topology stores a list of which atom pairs are bonded to each other.

**Useful links**: 
1. https://mdtraj.org/1.9.4/api/generated/mdtraj.Topology.html
2. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html


In [None]:
traj = md.load('ethane_sim.dcd', top='ethane.pdb')
atoms, bonds = traj.topology.to_dataframe()
atoms.head(20)

In [None]:
visualize = ngl.show_mdtraj(traj)
visualize

In [None]:
bonds

#### Compute carbon-carbon bond distance
You can now compute the central carbon-carbon bond distance by specifying which atoms in the above 'Atoms' dataframe correspond to the two carbon atoms. Add these two indices to the `bond_indices` array.

You can then compute the bond distance in every frame of the trajectory with `md.compute_distances`, which returns distances in nanometers.

The `plt.hist` function will make a histogram plot with a specified number of bins from a list of bond distances. Make sure to title the plot and correctly label the axes.
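
As a sanity check on the numbers you are about to plot, the sketch below computes the C-C distance by hand for the starting structure, using the C1 and C2 coordinates copied from `ethane.pdb`. PDB files store coordinates in Ångströms, while MDTraj works in nanometers, so the result is converted. This is only a single-frame illustration of the distance calculation, not a replacement for the trajectory analysis.

```python
import math

# Coordinates of C1 and C2 copied from ethane.pdb (PDB files use Angstroms)
c1 = (-3.553, 2.382, 0.000)
c2 = (-2.016, 2.361, 0.000)

d_angstrom = math.dist(c1, c2)   # Euclidean distance between the two atoms
d_nm = d_angstrom / 10.0         # MDTraj reports distances in nanometers
print(f"C-C distance: {d_angstrom:.3f} Angstrom = {d_nm:.4f} nm")
```

The histogram from the simulation should be centered near this value. Keep the nanometer units in mind when you label the x axis.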

In [None]:
bond_indices = [0, 4]
bonds = md.compute_distances(traj, [bond_indices])

bondcounts, binedges, otherstuff = plt.hist(bonds, bins=200) # create a histogram with 200 bins
plt.title(___________) # Fill in: Plot title
plt.xlabel(___________) # Fill in: x label. note the units.
plt.ylabel(___________) # Fill in: y label
plt.show()

#### Generate potential of mean force for C-C bond

You can recast the histogram counts above into a potential of mean force (pmf). Effectively, this converts the count probabilities into relative free energies. The relevant equation from statistical thermodynamics is the following:

$$ { W(x) = -k_{\rm B}T \ln p(x) } $$
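
To see how the equation turns counts into energies, consider the small sketch below. The five counts are hypothetical numbers invented for this illustration, not simulation data. Because $p(x)$ is proportional to the bin count $n(x)$, using raw counts only shifts $W(x)$ by a constant, and that constant disappears once the minimum is subtracted.

```python
import math

kB = 8.31446 / 1000   # molar gas constant in kJ/(mol*K)
T = 298.15            # temperature in K

counts = [50, 400, 800, 400, 50]   # hypothetical histogram counts

# W(x) = -kT ln n(x) + C, then shift so the most populated bin sits at zero
pmf = [-kB * T * math.log(n) for n in counts]
lowest = min(pmf)
pmf = [w - lowest for w in pmf]

for n, w in zip(counts, pmf):
    print(f"{n:4d} counts -> {w:.2f} kJ/mol")
```

A bin with half as many counts as the most populated bin lies $k_{\rm B}T \ln 2$ (about 1.7 kJ/mol at this temperature) above the minimum, so rarely visited bins correspond to high free energies.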


In [None]:
kB = 8.31446/1000 # Boltzmann constant per mole (gas constant) in kJ/(mol*K)
Temp = 298.15 # simulation temperature in K
bondcounts[bondcounts==0] = 0.1 # get rid of any bins with 0 counts/infinite energy
pmf = -kB*Temp*np.log(bondcounts) # W(x) = -kT*ln[p(x)] = -kT*ln[n(x)] + C
pmf = pmf - np.min(pmf) # subtract off minimum value so that energies start from 0

bincenters = (binedges[1:] + binedges[:-1])/2 # compute centers of histogram bins

plt.plot(bincenters, pmf)
plt.title(___________) # Fill in: Plot title
plt.xlabel(___________) # Fill in: x label. note the units.
plt.ylabel(___________) # Fill in: y label
plt.show()

#### Compute H-C-C-H torsion angle 

For the H-C-C-H dihedral angle, it is important to correctly define the atoms associated with the torsion. In the case of ethane, you can use the list of atoms above to find the correct atom indices and place them in the `phi_indices` array. These indices are H(1)-C(0)-C(4)-H(5). Remember that numbering starts at 0 in Python.
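
To make the torsion definition concrete, the sketch below computes the dihedral angle by hand for the starting structure, using the coordinates of atoms 1, 0, 4, and 5 copied from `ethane.pdb`. The helper functions are hypothetical and written only for this illustration. The sign of the angle depends on the convention used, so only its magnitude should be compared with other tools. Note that the result here is converted to degrees, whereas `md.compute_dihedrals` returns radians.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four points."""
    b0, b1, b2 = sub(p0, p1), sub(p2, p1), sub(p3, p2)
    norm = math.sqrt(dot(b1, b1))
    b1 = tuple(x / norm for x in b1)                   # unit vector along the central bond
    v = sub(b0, tuple(dot(b0, b1) * x for x in b1))    # b0 with its component along b1 removed
    w = sub(b2, tuple(dot(b2, b1) * x for x in b1))    # b2 with its component along b1 removed
    return math.atan2(dot(cross(b1, v), w), dot(v, w))

# Coordinates (Angstroms) copied from ethane.pdb
h11 = (-3.940, 1.922, 0.912)   # atom index 1
c1 = (-3.553, 2.382, 0.000)    # atom index 0
c2 = (-2.016, 2.361, 0.000)    # atom index 4
h21 = (-1.649, 1.333, 0.053)   # atom index 5

phi_deg = math.degrees(dihedral(h11, c1, c2, h21))
print(f"H-C-C-H dihedral in the starting structure: {phi_deg:.1f} degrees")
```

The starting structure is staggered, so the magnitude of this angle is close to 60 degrees.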

In [None]:
phi_indices = [1, 0, 4, 5] # atoms to define the torsion angle
phi = md.compute_dihedrals(traj, [phi_indices])

In [None]:
phicounts, binedges, otherstuff = plt.hist(phi, bins=120) # create a histogram with 120 bins
plt.title(___________) # Fill in: Plot title
plt.xlabel(___________) # Fill in: x label. note the units.
plt.ylabel(___________) # Fill in: y label
plt.show()

#### Generate potential of mean force for H-C-C-H torsion angle

In [None]:
kB = 8.31446/1000 # Boltzmann constant per mole (gas constant) in kJ/(mol*K)
Temp = 298.15 # simulation temperature in K
phicounts[phicounts==0] = 0.1 # get rid of any bins with 0 counts/infinite energy
pmf = -kB*Temp*np.log(phicounts) # W(x) = -kT*ln[p(x)] = -kT*ln[n(x)] + C
pmf = pmf - np.min(pmf) # subtract off minimum value so that energies start from 0

bincenters = (binedges[1:] + binedges[:-1])/2 # compute centers of histogram bins

plt.plot(bincenters, pmf)
plt.title(___________) # Fill in: Plot title
plt.xlabel(___________) # Fill in: x label. note the units.
plt.ylabel(___________) # Fill in: y label
plt.show()

**Question 1:**
**<span style="color:red">How does the simulated torsional angle compare to the potential energy surface scan we did for HW 1? Describe any discrepancies you see and try to explain them.</span>**

**Answer:**


**Question 2:**
**<span style="color:red">In the force field you loaded, what are the equilibrium bond distances for the C-H and C-C bonds? How do these values compare with typical experimental bond lengths?</span>**

**Answer:**


## 2. Gas Phase Simulation of Halogenated Butane

In this section, we will simulate the gas-phase behavior of a halogenated butane molecule using molecular dynamics. We will begin by loading the structure of halogenated butane, setting up the necessary force field parameters, and running a simulation to observe its behavior in the gas phase. By analyzing the simulation, we will investigate how the presence of the halogen atom affects the geometry, energy, and overall dynamics of the butane molecule compared to the ethane molecule.


#### Load PDB and force field

In [None]:
# Define the content of the bromobutane.pdb file
bromobutane_pdb_content = """\
ATOM      1 C1   UNK     1       3.903   0.469   0.000                       C 0
ATOM      2 C2   UNK     1       2.531   1.139   0.000                       C 0
ATOM      3 C3   UNK     1       1.381   0.128   0.000                       C 0
ATOM      4 C4   UNK     1       0.029   0.820   0.000                       C 0
ATOM      5 H41  UNK     1      -0.123   1.430   0.886                       H 0
ATOM      6 H42  UNK     1      -0.123   1.430  -0.886                       H 0
ATOM      7 Br   UNK     1      -1.452  -0.469   0.000                      Br 0
ATOM      8 H31  UNK     1       1.454  -0.519  -0.877                       H 0
ATOM      9 H32  UNK     1       1.454  -0.519   0.877                       H 0
ATOM     10 H21  UNK     1       2.444   1.790   0.876                       H 0
ATOM     11 H22  UNK     1       2.444   1.790  -0.876                       H 0
ATOM     12 H11  UNK     1       4.034  -0.162   0.883                       H 0
ATOM     13 H12  UNK     1       4.034  -0.162  -0.883                       H 0
ATOM     14 H13  UNK     1       4.702   1.214   0.000                       H 0
CONECT    1    2
CONECT    2    3
CONECT    3    4
CONECT    4    5
CONECT    4    6
CONECT    4    7
CONECT    3    8
CONECT    3    9
CONECT    2   10
CONECT    2   11
CONECT    1   12
CONECT    1   13
CONECT    1   14
"""

# Create and write the content to bromobutane.pdb file
with open("bromobutane.pdb", "w") as f:
    f.write(bromobutane_pdb_content)

print("bromobutane.pdb file created successfully.")


In [None]:
# Define the content of the bromobutane.gaff2.xml file
bromobutane_xml_content = """\
<ForceField>
     <AtomTypes>
      <Type name="0" class="c3" element="C" mass="12.01078"/>
      <Type name="1" class="c3" element="C" mass="12.01078"/>
      <Type name="2" class="hc" element="H" mass="1.007947"/>
      <Type name="3" class="hc" element="H" mass="1.007947"/>
      <Type name="4" class="br"  element="Br" mass="79.904"/>
     </AtomTypes>
     <Residues>
      <Residue name="UNK">
       <Atom name="C1" type="0"/>
       <Atom name="C2" type="1"/>
       <Atom name="C3" type="1"/>
       <Atom name="C4" type="0"/>
       <Atom name="H41" type="2"/>
       <Atom name="H42" type="2"/>
       <Atom name="Br" type="4"/>
       <Atom name="H31" type="2"/>
       <Atom name="H32" type="3"/>
       <Atom name="H21" type="3"/>
       <Atom name="H22" type="3"/>
       <Atom name="H11" type="3"/>
       <Atom name="H12" type="2"/>
       <Atom name="H13" type="2"/>
       <Bond atomName1="C1" atomName2="C2"/>
       <Bond atomName1="C2" atomName2="C3"/>
       <Bond atomName1="C3" atomName2="C4"/>
       <Bond atomName1="C4" atomName2="H41"/>
       <Bond atomName1="C4" atomName2="H42"/>
       <Bond atomName1="C4" atomName2="Br"/>
       <Bond atomName1="C3" atomName2="H31"/>
       <Bond atomName1="C3" atomName2="H32"/>
       <Bond atomName1="C2" atomName2="H21"/>
       <Bond atomName1="C2" atomName2="H22"/>
       <Bond atomName1="C1" atomName2="H11"/>
       <Bond atomName1="C1" atomName2="H12"/>
       <Bond atomName1="C1" atomName2="H13"/>
      </Residue>
     </Residues>
     <HarmonicBondForce>
      <Bond class1="c3" class2="c3" length="0.15380" k="194572.736"/>
      <Bond class1="c3" class2="hc" length="0.10970" k="314568.56"/>
      <Bond class1="c3" class2="br"  length="0.1787" k="189953.60"/>
     </HarmonicBondForce>
     <HarmonicAngleForce>
      <Angle class1="c3" class2="c3" class3="c3" angle="1.94621665" k="542.982784"/>
      <Angle class1="c3" class2="c3" class3="hc" angle="1.91637152" k="391.756288"/>
      <Angle class1="hc" class2="c3" class3="hc" angle="1.87762521" k="326.01728"/>
      <Angle class1="c3" class2="c3" class3="br" angle="1.92003671" k="534.71520"/>
      <Angle class1="br" class2="c3" class3="hc" angle="1.85877565" k="352.2928"/>
     </HarmonicAngleForce>
     <PeriodicTorsionForce>
      <Proper class1="hc" class2="c3" class3="c3" class4="hc" periodicity1="3" phase1="0.0" k1="0.50208"/>
      <Proper class1="c3" class2="c3" class3="c3" class4="hc" periodicity1="3" phase1="0.0" k1="0.33472"/>
      <Proper class1="c3" class2="c3" class3="c3" class4="c3" 
       periodicity1="1" phase1="0.0" k1="0.4602"
       periodicity2="2" phase2="3.141593" k2="1.2134" 
       periodicity3="3" phase3="0.0" k3="0.5439"/>
     <Proper class1="hc" class2="c3" class3="c3" class4="br" 
       periodicity1="3" phase1="0.0" k1="0.87864" 
       periodicity2="1" phase2="0.0" k2="0.33472"/>
     </PeriodicTorsionForce>
     <NonbondedForce coulomb14scale="0.833333" lj14scale="0.5">
      <Atom type="0" charge="-0.05652" sigma="0.3397710" epsilon="0.4510352"/>
      <Atom type="1" charge="-0.05652" sigma="0.3397710" epsilon="0.4510352"/>
      <Atom type="2" charge="0.045631" sigma="0.2600177" epsilon="0.0870272"/>
      <Atom type="3" charge="0.045631" sigma="0.2600177" epsilon="0.0870272"/>
      <Atom type="4" charge="-0.15699" sigma="0.36125943" epsilon="1.6451488"/>
     </NonbondedForce>
</ForceField>
"""

# Create and write the content to bromobutane.gaff2.xml file
with open("bromobutane.gaff2.xml", "w") as f:
    f.write(bromobutane_xml_content)

print("bromobutane.gaff2.xml file created successfully.")


In [None]:
# read in a starting structure for bromobutane and the
# corresponding force field file
pdb = app.PDBFile('bromobutane.pdb')
forcefield = app.ForceField('bromobutane.gaff2.xml')

In [None]:
import mdtraj as md
import nglview as ngl

but = md.load('bromobutane.pdb')
visualize = ngl.show_mdtraj(but)
visualize

#### Setup simulation

In [None]:
# setup system by taking topology from pdb file;
system = forcefield.createSystem(pdb.topology, nonbondedMethod=app.NoCutoff, 
                                 constraints=app.HBonds)
# run gas phase simulation 
# using a Langevin thermostat (integrator)
# at 298.15 K  
# with coupling constant of 5.0 ps^-1
# with 2 fs time step (using SHAKE)
integrator = mm.LangevinIntegrator(298.15*unit.kelvin, 5.0/unit.picoseconds, 
                                   2.0*unit.femtoseconds)
integrator.setConstraintTolerance(1e-5)

platform = mm.Platform.getPlatformByName('Reference')

# A simulation ties together various objects used for running a simulation
simulation = app.Simulation(pdb.topology, system, integrator, platform)
simulation.context.setPositions(pdb.positions)

#### Energy Minimization

In [None]:
print('Minimizing...')

st = simulation.context.getState(getPositions=True,getEnergy=True)
print("Potential energy before minimization is %s" % st.getPotentialEnergy())

simulation.minimizeEnergy(maxIterations=100)

st = simulation.context.getState(getPositions=True,getEnergy=True)
print("Potential energy after minimization is %s" % st.getPotentialEnergy())

#### Equilibration

In [None]:
print('Equilibrating...')

simulation.reporters.append(app.StateDataReporter(stdout, 100, step=True, 
    potentialEnergy=True, temperature=True, separator='\t'))
simulation.context.setVelocitiesToTemperature(150.0*unit.kelvin)
simulation.step(10000)

#### Production

In [None]:
print('Running Production...')

tinit=time.time()
simulation.reporters.clear()
# output basic simulation information below every 250000 steps/500 ps
simulation.reporters.append(app.StateDataReporter(stdout, 250000, 
    step=True, time=True, potentialEnergy=True, temperature=True, 
    speed=True, separator='\t'))
# write out a trajectory (i.e., coordinates vs. time) to a DCD
# file every 100 steps/0.2 ps
simulation.reporters.append(app.DCDReporter('bromobutane.dcd', 100))

# run the simulation for 5.0x10^6 steps/10 ns
simulation.step(5000000)
tfinal=time.time()
print('Done!')
print('Time required for simulation:', tfinal-tinit, 'seconds')

#### Trajectory Analysis

#### Load Trajectory and Characterize Bonds

In [None]:
traj2 = md.load('bromobutane.dcd', top='bromobutane.pdb')
atoms, bonds = traj2.topology.to_dataframe()
atoms

In [None]:
visualize = ngl.show_mdtraj(traj2)
visualize

In [None]:
bonds

#### Compute carbon-halogen bond distance

In [None]:
bond_indices = [3,6]
bonds = md.compute_distances(traj2, [bond_indices])

bondcounts, binedges, otherstuff = plt.hist(bonds, bins=200) # create a histogram with 200 bins
plt.title(___________) # Fill in: Plot title
plt.xlabel(___________) # Fill in: x label. note the units.
plt.ylabel(___________) # Fill in: y label
plt.show()

#### Generate potential of mean force for carbon-halogen bond

In [None]:
kB = 8.31446/1000 # Boltzmann constant per mole (gas constant) in kJ/(mol*K)
Temp = 298.15 # simulation temperature in K
bondcounts[bondcounts==0] = 0.1 # get rid of any bins with 0 counts/infinite energy
pmf = -kB*Temp*np.log(bondcounts) # W(x) = -kT*ln[p(x)] = -kT*ln[n(x)] + C
pmf = pmf - np.min(pmf) # subtract off minimum value so that energies start from 0

bincenters = (binedges[1:] + binedges[:-1])/2 # compute centers of histogram bins

plt.plot(bincenters, pmf)
plt.title(___________) # Fill in: Plot title
plt.xlabel(___________) # Fill in: x label. note the units.
plt.ylabel(___________) # Fill in: y label
plt.show()

#### Compute C-C-C-C torsion angle

In [None]:
phi_indices = [0,1,2,3]
phi = md.compute_dihedrals(traj2, [phi_indices])

In [None]:
phicounts, binedges, otherstuff = plt.hist(phi, bins=120) # create a histogram with 120 bins
plt.title(___________) # Fill in: Plot title
plt.xlabel(___________) # Fill in: x label. note the units.
plt.ylabel(___________) # Fill in: y label
plt.show()

#### Generate potential of mean force for C-C-C-C torsion angle

In [None]:
kB = 8.31446/1000 # Boltzmann constant per mole (gas constant) in kJ/(mol*K)
Temp = 298.15 # simulation temperature in K
phicounts[phicounts==0] = 0.1 # get rid of any bins with 0 counts/infinite energy
pmf = -kB*Temp*np.log(phicounts) # W(x) = -kT*ln[p(x)] = -kT*ln[n(x)] + C
pmf = pmf - np.min(pmf) # subtract off minimum value so that energies start from 0

bincenters = (binedges[1:] + binedges[:-1])/2 # compute centers of histogram bins

plt.plot(bincenters, pmf)
plt.title(___________) # Fill in: Plot title
plt.xlabel(___________) # Fill in: x label. note the units.
plt.ylabel(___________) # Fill in: y label
plt.show()

**Question 3:**
**<span style="color:red">In all of the PMF plots the tops of the graphs are typically very jagged. Why do you think this is the case?</span>**

**Answer:**



## 3. Condensed Phase Simulation of Halogenated Butane

In this section, we will simulate the behavior of halogenated butane in the condensed phase using molecular dynamics. We will begin by loading the structure of halogenated butane, setting up the force field parameters, and running a simulation to observe its behavior in a solvent environment. The goal is to analyze how the presence of solvent affects the geometry and overall dynamics of the bromobutane molecule. 


In [None]:
# read in a starting structure and the
# corresponding force field file
pdb = app.PDBFile('bromobutane.pdb')
modeller = Modeller(pdb.topology, pdb.positions)

forcefield = app.ForceField('bromobutane.gaff2.xml', 'amber14/tip3pfb.xml')
modeller.addSolvent(forcefield, model='tip3p', padding=750*unit.picometer)

with open('water-box.pdb', 'w') as outfile:
    PDBFile.writeFile(modeller.topology, modeller.positions, outfile)

In [None]:
# set up the system by taking the solvated topology from the Modeller
system = forcefield.createSystem(modeller.topology, nonbondedMethod=app.PME, 
                                 nonbondedCutoff=500*unit.picometer,
                                 constraints=app.HBonds)
# run condensed phase simulation 
# using a Langevin thermostat (integrator)
# at 298.15 K  
temperature = 298.15*unit.kelvin
pressure = 1*unit.bar

# with coupling constant of 5.0 ps^-1
# with 2 fs time step (using SHAKE)
integrator = mm.LangevinIntegrator(temperature, 
                                   5.0/unit.picoseconds, 
                                   2.0*unit.femtoseconds)

integrator.setConstraintTolerance(1e-5)
platform = mm.Platform.getPlatformByName('Reference')
system.addForce(mm.MonteCarloBarostat(pressure, temperature))

# A simulation ties together various objects used for running a simulation
simulation = app.Simulation(modeller.topology, system, integrator, platform)
simulation.context.setPositions(modeller.positions)

#### Energy minimization

In [None]:
print('Minimizing...')

st = simulation.context.getState(getPositions=True,getEnergy=True)
print("Potential energy before minimization is %s" % st.getPotentialEnergy())

simulation.minimizeEnergy(maxIterations=100)

st = simulation.context.getState(getPositions=True,getEnergy=True)
print("Potential energy after minimization is %s" % st.getPotentialEnergy())

In [None]:
import mdtraj as md
import nglview as ngl

trajj = md.load('water-box.pdb')
visualize = ngl.show_mdtraj(trajj)
visualize.add_licorice('HOH', linewidth=1.)
visualize

#### Equilibration

In [None]:
print('Equilibrating...')

simulation.reporters.append(app.StateDataReporter(stdout, 100, step=True, 
    potentialEnergy=True, temperature=True, volume=True, separator='\t'))
simulation.context.setVelocitiesToTemperature(50.0*unit.kelvin)
simulation.step(2500)

#### Production
Now set up a production run of 400 picoseconds by changing the number of steps for the simulation. Note that the time step is 2 femtoseconds.

In [None]:
print('Running Production...')

tinit=time.time()
simulation.reporters.clear()

# output basic simulation information below every 2500 steps / 5 ps
simulation.reporters.append(app.StateDataReporter(stdout, 2500, 
    step=True, time=True, potentialEnergy=True, temperature=True, 
    volume=True, speed=True, separator='\t'))

# write out a trajectory (i.e., coordinates vs. time) to a DCD
# file every 10 steps/0.02 ps
simulation.reporters.append(app.DCDReporter('box.dcd', 10))

# run the simulation for 400 ps
simulation.step(___________) # Fill in: number of simulation steps for 400 ps

tfinal=time.time()
print('Done!')
print('Time required for simulation:', tfinal-tinit, 'seconds')

#### Trajectory Analysis

In [None]:
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

#### Load Trajectory and Characterize Bonds

If you convert the trajectory's topology to a dataframe, it will be easier to view the atoms (and bonds). However, since your molecule is now solvated, there will be a large number of atoms present if you simply type `atoms`. Instead, request the first 20 lines of the atoms dataframe with the command `atoms.head(20)`.

In [None]:
traj3 = md.load('box.dcd', top='water-box.pdb')
atoms, bonds = traj3.topology.to_dataframe()
atoms.head(20)


In [None]:
traj3.center_coordinates()
traj3.image_molecules()
visualize = ngl.show_mdtraj(traj3)
visualize.add_licorice('HOH', linewidth=1.)
visualize

#### Compute C-C-C-C torsion angle


#### Generate potential of mean force for C-C-C-C torsion angle


**Question 4:**
**<span style="color:red">After performing a geometry optimization in the condensed phase, compare the final potential energy to that obtained in the gas-phase simulation. What factors contribute to the energy differences?</span>**

**Answer:**

**Question 5:**
**<span style="color:red">Observe the molecular dynamics over time. How does the presence of a solvent influence the motion of the halogenated butane molecule compared to the gas phase? Consider both rotational and vibrational motion.</span>**

**Answer:**


**Question 6:**
**<span style="color:red">If you were to replace the halogen atom with a heavier or lighter halogen (e.g., replacing bromine with fluorine), how would you expect the dynamics of the molecule to change? What impact would this have on the bond lengths, angles, and overall stability of the molecule?</span>**

**Answer:**