# Probing the surface of the Villin headpiece protein using an ammonium ion

This is an example of evaluating a pseudotrajectory created with molgri using this package.

In [1]:
import sys
sys.path.append('../')  # workaround: make the package in the parent directory importable
import numpy as np
from tqdm import tqdm
import nglview as nv
import MDAnalysis as mda
from MDAnalysis.coordinates.memory import MemoryReader

import modelledtrajectory as mtra




First we define the files needed for the calculations. The molgri pseudotrajectory has already been generated and only needs to be loaded. For the probe we load an SDF file so that suitable parameters can be assigned using the General Amber Force Field (GAFF).



In [2]:
fVillin = 'villin.pdb'
fNH4 = 'Structure_NH4.sdf'
fPT1 = 'villin_NH4_o_ico_128_b_zero_1_t_4145557111.gro'
fPT2 = 'villin_NH4_o_ico_128_b_zero_1_t_4145557111.xtc'

mt = mtra.ModelledTrajectory(fVillin, fNH4, fPT1, fPT2, 
                             forces=['amber/protein.ff14SB.xml', 'implicit/gbn2.xml'], 
                             nonbondedMethod='CutoffPeriodic', nonbondedCutoff = 2)


  alpha = np.rad2deg(np.arccos(np.dot(y, z) / (ly * lz)))
  beta = np.rad2deg(np.arccos(np.dot(x, z) / (lx * lz)))
  gamma = np.rad2deg(np.arccos(np.dot(x, y) / (lx * ly)))


The other parameters specify the force field parameter files that OpenMM should use to set up the simulation box, and the settings for the nonbonded interactions, which are also passed to the OpenMM system. We can now evaluate properties of the simulation box at each frame of the pseudotrajectory. For example, to obtain the potential energy at each frame we can loop over the entire trajectory:

In [3]:
potentials = np.zeros(len(mt))
for i in tqdm(mt.frames()):
    pot = mt.getPotentialEnergy()
    potentials[i] = pot

100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 512/512 [00:33<00:00, 15.45it/s]


Here the mt.frames() function returns an iterator that loops over the entire trajectory (the tqdm function is only used to provide a progress bar). In each iteration the simulation box coordinates are set to those of the current frame of the trajectory. mt.getPotentialEnergy() then returns the current potential energy of the system. With this iterator it is possible to perform several steps and/or take several readings per frame, without looping over the entire trajectory once for each of them.
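The iterator pattern described above can be sketched with a toy class. `ToyTrajectory` and its methods are hypothetical stand-ins and not the implementation used by this package; the sketch only mimics the behaviour where each iteration loads one frame and yields its index, so that several readings can be collected in a single pass.

```python
import numpy as np


class ToyTrajectory:
    """Hypothetical stand-in for a trajectory wrapper (not the real class)."""

    def __init__(self, coordinates):
        # coordinates: array of shape (n_frames, n_atoms, 3)
        self._coordinates = np.asarray(coordinates, dtype=float)
        self._current = None

    def __len__(self):
        return len(self._coordinates)

    def frames(self):
        """Load each frame in turn and yield its index."""
        for index in range(len(self)):
            self._current = self._coordinates[index]
            yield index

    def toy_energy(self):
        """Made-up 'energy' of the current frame: sum of squared coordinates."""
        return float(np.sum(self._current ** 2))

    def centroid(self):
        """Mean position of all atoms in the current frame."""
        return self._current.mean(axis=0)


rng = np.random.default_rng(0)
toy = ToyTrajectory(rng.normal(size=(5, 4, 3)))

energies = np.zeros(len(toy))
centroids = np.zeros((len(toy), 3))
for i in toy.frames():
    # Two different readings are taken in the same pass over the frames.
    energies[i] = toy.toy_energy()
    centroids[i] = toy.centroid()
```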

Let's say we want to take some measurements after a short minimization. Since we don't want to move too far from the initial grid point in the pseudotrajectory, we would like to constrain the positions of the protein backbone and the ammonium probe. We can start by selecting the IDs of the atoms that need to be constrained using the MDAnalysis selection language:

In [4]:
probe = mt.selectIDs('type N and not protein')
backbone = mt.selectIDs('backbone')

This selects the nitrogen in the probe and all the atoms that are considered backbone in the protein. In order to constrain these we can call the following function:

In [5]:
mt.constrainAtoms(probe + backbone)

This sets the masses of the selected atoms to 0 in the OpenMM simulation, which fixes their positions. Note that setting these masses affects all frames of the trajectory, not just the current one. Since performing a minimization for each frame of the trajectory can take a while, we will instead select a number of the highest-energy frames from the earlier list (these may need some minimization because the probe and protein collide) and see whether the minimization affects those frames.

In [6]:
numFrames = 10
highPotentialFrames = np.argsort(potentials)[-1:-numFrames-1:-1]
print(highPotentialFrames)

[145 504 177  64 505 496 161 412 332  76]
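The slice `[-1:-numFrames-1:-1]` walks backwards through the ascending `np.argsort` result, so the frame indices come out ordered from highest to lowest energy. A small sketch with made-up numbers (the `toy_potentials` values and variable names are illustrative only) shows that it matches the perhaps more readable "reverse, then take the first N" form:

```python
import numpy as np

# Made-up energies for five frames; frame 2 is the highest, frame 1 the lowest.
toy_potentials = np.array([3.0, -1.0, 10.0, 7.0, 0.5])
num_frames = 3

ascending = np.argsort(toy_potentials)          # indices from lowest to highest energy
top_desc = ascending[-1:-num_frames - 1:-1]     # last N entries, read backwards
alternative = ascending[::-1][:num_frames]      # reverse first, then take N

print(top_desc)  # [2 3 0]
```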


The mt.frames() function can be called with a list of integers as an argument. The resulting iterator will then loop only over the frames specified in the argument, in the order in which they are given, instead of over all frames. The next loop performs the minimization for these frames, and also stores the coordinates before and after minimization for comparison.

In [7]:
minimizedPotentials = np.zeros(numFrames)
for i in tqdm(mt.frames(select=highPotentialFrames)):
    mt.pickStateForSlice('normal')
    mt.minimizeEnergy(maxIterations = 100)
    minimizedPotentials[i] = mt.getPotentialEnergy()
    mt.pickStateForSlice('minimized')
    
preSlice = mt.getSlice('normal')
postSlice = mt.getSlice('minimized')

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:50<00:00,  5.05s/it]


mt.minimizeEnergy() passes its arguments directly to the OpenMM function of the same name. After minimization we simply extract the current potential energy of the box and store it in our array. Again, this is only done for a few frames to save time. If we wanted this for all frames in the trajectory, we could have added these steps to the first loop, where we evaluated the potential energy.
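As a rough sketch of that combined loop, the helper below records the energy of every frame before and after minimization in a single pass. It relies only on the methods already used in the cells above (`frames()`, `getPotentialEnergy()`, `minimizeEnergy()` and `len(mt)`); the function name `scan_with_minimization` and its return layout are illustrative choices, not part of the package.

```python
import numpy as np


def scan_with_minimization(mt, max_iterations=100):
    """Return the potential energy of every frame before and after minimization.

    `mt` is assumed to behave like the ModelledTrajectory object used above:
    iterating over mt.frames() loads each frame and yields its index.
    """
    before = np.zeros(len(mt))
    after = np.zeros(len(mt))
    for i in mt.frames():
        before[i] = mt.getPotentialEnergy()              # energy at the grid point
        mt.minimizeEnergy(maxIterations=max_iterations)  # short, capped minimization
        after[i] = mt.getPotentialEnergy()               # energy after relaxation
    return before, after
```

With the trajectory loaded as above, this would be called as `potentials, minimizedPotentials = scan_with_minimization(mt)`. Expect the run time to be much longer than for the energy-only loop, since the ten-frame minimization above already took about five seconds per frame.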

Now let's first look at the differences in energy before and after minimization:

In [8]:
for i in range(numFrames):
    print('frame: {:=4} | pre: {:>-10.3e} | post: {:>-10.3e} | change: {:>-10.3e}'.format(highPotentialFrames[i], 
                                                         potentials[highPotentialFrames[i]], minimizedPotentials[i], 
                                                         minimizedPotentials[i]-potentials[highPotentialFrames[i]]))

frame:  145 | pre:  1.981e+17 | post: -5.945e+03 | change: -1.981e+17
frame:  504 | pre:  8.344e+12 | post: -5.466e+03 | change: -8.344e+12
frame:  177 | pre:  6.123e+12 | post:  4.629e+06 | change: -6.123e+12
frame:   64 | pre:  5.220e+12 | post: -6.481e+03 | change: -5.220e+12
frame:  505 | pre:  1.670e+12 | post:  4.377e+11 | change: -1.232e+12
frame:  496 | pre:  1.646e+12 | post: -5.285e+03 | change: -1.646e+12
frame:  161 | pre:  9.185e+11 | post: -6.548e+03 | change: -9.185e+11
frame:  412 | pre:  2.160e+11 | post:  2.160e+11 | change: -2.167e+05
frame:  332 | pre:  1.362e+11 | post: -3.818e+03 | change: -1.362e+11
frame:   76 | pre:  1.006e+11 | post:  6.842e+06 | change: -1.006e+11
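Most of these frames relax to a negative energy on the order of -10^3, but a few are still positive after 100 iterations, which suggests that the clash was not resolved within the iteration limit. One quick way to flag such frames is sketched below, using the values printed above and an assumed cutoff of zero (the cutoff and the variable names are illustrative choices). In the notebook itself, `highPotentialFrames` and `minimizedPotentials` could be used directly instead of the hard-coded arrays.

```python
import numpy as np

# Frame indices and post-minimization energies copied from the table above.
frames = np.array([145, 504, 177, 64, 505, 496, 161, 412, 332, 76])
post = np.array([-5.945e+03, -5.466e+03, 4.629e+06, -6.481e+03, 4.377e+11,
                 -5.285e+03, -6.548e+03, 2.160e+11, -3.818e+03, 6.842e+06])

threshold = 0.0  # assumed cutoff: anything still positive counts as unresolved
unresolved = frames[post > threshold]
print(unresolved)
```

For the ten frames above this selects frames 177, 505, 412 and 76.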


And as a visual comparison, the before and after for each of these frames:

In [21]:
pre = mda.Merge(mt._tu.atoms)
pre.load_new(preSlice[:]*10, format=MemoryReader, order='fac')
view1 = nv.show_mdanalysis(pre)
view1.add_licorice('sidechain or .CA')
view1.add_spacefill('not protein')
view1


(10, 1107, 3)


NGLWidget(max_frame=9)

In [22]:
post = mda.Merge(mt._tu.atoms)
post.load_new(postSlice[:]*10, format=MemoryReader, order='fac')
view2 = nv.show_mdanalysis(post)
view2.add_licorice('sidechain or .CA')
view2.add_spacefill('not protein')
view2

NGLWidget(max_frame=9)
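The factor of 10 in the two cells above presumably converts the coordinates from nanometres (the default length unit in OpenMM) to ångström (the length unit MDAnalysis works in), and `order='fac'` tells `MemoryReader` that the array axes are (frame, atom, coordinate). A small NumPy sketch with a dummy array illustrates both points; the array contents are made up, and only the shape mirrors the `(10, 1107, 3)` slice printed above.

```python
import numpy as np

NM_TO_ANGSTROM = 10.0  # 1 nm = 10 angstrom

# Dummy stand-in for a stored slice: 10 frames, 1107 atoms, xyz in nm.
slice_nm = np.full((10, 1107, 3), 0.15)

slice_angstrom = slice_nm * NM_TO_ANGSTROM

# In 'fac' order the axes are (frame, atom, coordinate).
n_frames, n_atoms, n_dims = slice_angstrom.shape
first_atom_first_frame = slice_angstrom[0, 0]  # xyz of atom 0 in frame 0
```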

Here we can see the differences before and after minimization of the existing trajectory. In actual use it might be more interesting to take the time to minimize all of the frames and then see which ones have the lowest energy.