# 1. Imports, type in the protein to be displayed

### After running through every cell of this Jupyter Notebook, you will have started the creation of a short protein trajectory! We can begin by running 1nmf and 1a57. [Click here](https://www.rcsb.org/search?request=%7B%22query%22%3A%7B%22type%22%3A%22group%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22group%22%2C%22nodes%22%3A%5B%7B%22type%22%3A%22terminal%22%2C%22service%22%3A%22text%22%2C%22parameters%22%3A%7B%22attribute%22%3A%22rcsb_polymer_entity_container_identifiers.chem_comp_monomers%22%2C%22negation%22%3Afalse%2C%22operator%22%3A%22exact_match%22%2C%22value%22%3A%22TRP%22%7D%7D%5D%2C%22logical_operator%22%3A%22and%22%2C%22label%22%3A%22text%22%7D%5D%2C%22logical_operator%22%3A%22and%22%7D%2C%22return_type%22%3A%22entry%22%2C%22request_options%22%3A%7B%22paginate%22%3A%7B%22start%22%3A0%2C%22rows%22%3A25%7D%2C%22results_content_type%22%3A%5B%22experimental%22%5D%2C%22sort%22%3A%5B%7B%22sort_by%22%3A%22rcsb_entry_info.deposited_polymer_entity_instance_count%22%2C%22direction%22%3A%22asc%22%7D%5D%2C%22scoring_strategy%22%3A%22combined%22%7D%2C%22request_info%22%3A%7B%22query_id%22%3A%2297685f4d68712014d3397c6405025b8c%22%7D%7D) for a link to the protein database.

In [4]:
from simtk.openmm.app import *
from simtk.openmm import *
from simtk.unit import *
import sys  # needed for the sys.stdout redirect below
from sys import stdout
from time import time

# Type in the name of the protein you're working with inside the quotation marks (e.g. 1yiy, 3hfb).
# Note: 1nmf is already in this directory, but you will have to add 1a57.
the_protein = "type here"

#sends printout to separate log file
log = open(the_protein + ".log", "a")
sys.stdout = log
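
The redirect above sends every later print() to the log file, so nothing more shows up in the notebook. As a minimal sketch of an alternative, assuming you only want to redirect the output of one block of code, Python's contextlib.redirect_stdout restores normal printing when the block ends. The file name demo.log and the message text are hypothetical and only serve the illustration.

```python
import contextlib
import os
import tempfile

# Hypothetical log path used only for this illustration.
log_path = os.path.join(tempfile.mkdtemp(), "demo.log")

# Inside the "with" block, print() output goes to the log file.
with open(log_path, "a") as log_file, contextlib.redirect_stdout(log_file):
    print("this line goes to the log file")

# Outside the block, print() writes to the notebook again.
with open(log_path) as log_file:
    logged_text = log_file.read()
print("Logged:", logged_text.strip())
```

With this pattern you do not have to reset sys.stdout by hand, and the log file is closed as soon as the block finishes.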

### Note that running cells will have an asterisk (*) on the left-hand side, and completed cells will show a number giving the order in which they finished running.

# Load a structure and set up the simulation parameters and data structures

In [6]:
protein = PDBFile(the_protein + '.pdb')
forcefield = ForceField('amber14/protein.ff14SB.xml', 'amber14/tip3p.xml')
# Create a Modeller object. This Modeller object will be used in place of the
# actual PDB file.
protein_model = Modeller(protein.topology, protein.positions)

# Add a tip3p waterbox solvent. The call is made only once, inside the try
# block, so that a failure is logged instead of stopping the notebook.
try:
    protein_model.addSolvent(forcefield, model = 'tip3p', padding = 1.5*nanometers)
except Exception as e:
    print(e)
    display("Exception logged. Run PDBFixer before proceeding.")

In [7]:
# Create and set up the system/environment as well as the integrator
# to be used in the simulation. The integrator advances the system in
# steps of 1 femtosecond (1 timestep = 1 femtosecond) with a friction
# coefficient of 1/picosecond, and keeps the temperature at around 296 Kelvin.

system = forcefield.createSystem(protein_model.topology, nonbondedMethod = PME, 
                                 nonbondedCutoff = 1.0*nanometers, constraints = HBonds)
integrator = LangevinIntegrator(296*kelvin, 1.0/picosecond, 1.0*femtosecond)

# Specify the platform/processor that will be used to run the simulation.
# Since the average person will be doing these simulations on their own computer
# with no special modifications, 'CPU' is the platform used here.
platform = Platform.getPlatformByName('CPU')

# Initialize the simulation with the required topology, system, integrator, and
# platform, as well as set the positions of the molecules in the simulation.
simulation = Simulation(protein_model.topology, system, integrator, platform)
simulation.context.setPositions(protein_model.positions)

# Minimization: provides an initial low energy structure

### This script generates a log file that stores the output of every print() statement. Note what is printed at each stage of the run (minimization, equilibration, production)! If you click the JupyterHub logo in the top left corner, you can navigate the directory (folders), and the cells you selected to run will keep running.

In [8]:

# Get the initial state of the system and print out the values for
# potential and kinetic energy
st = simulation.context.getState(getPositions=True, getEnergy=True, enforcePeriodicBox=True)
print("Potential energy before minimization is %s" % st.getPotentialEnergy())
print("Kinetic energy before minimization is %s" % st.getKineticEnergy())

# Minimize the energy for at most 100 iterations (the minimizer can
# stop earlier if the potential energy has converged before it reaches
# 100 iterations, although you won't be able to tell), while also taking
# note of how long it takes for the minimization to finish.
print('Minimizing...')
tinit = time()
simulation.minimizeEnergy(maxIterations=100)
tfinal = time()

# Get the new state of the system and print out the values for
# potential and kinetic energy.
st = simulation.context.getState(getPositions=True,getEnergy=True, enforcePeriodicBox=True)
print("Potential energy after minimization is %s" % st.getPotentialEnergy())
print("Kinetic energy after minimization is %s" % st.getKineticEnergy())

# Print out the length of time the minimization took to complete.
print("Done Minimization! Time required: ", tfinal-tinit, "seconds")
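
To make the idea of an iteration-limited minimization concrete, here is a toy sketch in plain Python. It is not what OpenMM does internally (OpenMM's minimizer uses the L-BFGS algorithm on the full force field). The one-dimensional harmonic energy, the step size, and the function names below are all assumptions chosen for illustration.

```python
def potential_energy(x):
    """Toy harmonic energy with its minimum at x = 1.0 (arbitrary units)."""
    return 0.5 * (x - 1.0) ** 2


def gradient(x):
    """Derivative of the toy energy with respect to x."""
    return x - 1.0


def minimize(x, max_iterations=100, step_size=0.1, tolerance=1e-6):
    """Plain gradient descent that stops early once the gradient is tiny."""
    for _ in range(max_iterations):
        g = gradient(x)
        if abs(g) < tolerance:
            break  # converged before using up all iterations
        x -= step_size * g  # move downhill along the gradient
    return x


x_start = 5.0
x_min = minimize(x_start)
print("Energy before minimization:", potential_energy(x_start))
print("Energy after minimization: ", potential_energy(x_min))
```

As in the cell above, the potential energy after minimization should be lower than before, even when the iteration limit is reached before full convergence.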



# Equilibration: sets initial velocities and brings system up to desired temperature

In [None]:
# Assign each atom a random initial velocity drawn from the distribution
# for 296 Kelvin. The Langevin integrator then keeps the temperature
# fluctuating around 296 Kelvin throughout the rest of the
# equilibration and final simulation.
simulation.context.setVelocitiesToTemperature(296*kelvin)

# Equilibrate the system for 10,000 timesteps while also taking note of 
# how long it takes for the equilibration to finish.
print('Equilibrating...')
tinit = time()
simulation.step(10000)   #number is how many timesteps of equilibration to do
tfinal = time()

# Print out the length of time the equilibration took to complete.
print("Done equilibrating! Time required:", tfinal - tinit, "seconds")
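
Setting velocities "to a temperature" means drawing each velocity component at random from a Maxwell-Boltzmann (Gaussian) distribution whose width depends on the temperature and the atom's mass. The sketch below illustrates that idea in plain Python. It is an illustration under stated assumptions, not OpenMM's implementation: the 10,000 particles of mass 16 amu are hypothetical, and constraints (which reduce the number of degrees of freedom in the real system) are ignored.

```python
import math
import random

BOLTZMANN = 1.380649e-23   # Boltzmann constant in J/K
AMU = 1.66053906660e-27    # atomic mass unit in kg


def random_velocities(masses_amu, temperature, rng):
    """Draw x, y, z velocity components (m/s) for each particle."""
    velocities = []
    for mass in masses_amu:
        # Standard deviation of each component: sqrt(kB * T / m)
        sigma = math.sqrt(BOLTZMANN * temperature / (mass * AMU))
        velocities.append([rng.gauss(0.0, sigma) for _ in range(3)])
    return velocities


def instantaneous_temperature(masses_amu, velocities):
    """Temperature from kinetic energy: T = 2 * KE / (N_dof * kB)."""
    kinetic = sum(
        0.5 * mass * AMU * sum(v * v for v in velocity)
        for mass, velocity in zip(masses_amu, velocities)
    )
    degrees_of_freedom = 3 * len(masses_amu)  # no constraints in this toy
    return 2.0 * kinetic / (degrees_of_freedom * BOLTZMANN)


rng = random.Random(0)        # fixed seed so the sketch is repeatable
masses = [16.0] * 10000       # hypothetical particles, mass in amu
velocities = random_velocities(masses, 296.0, rng)
temperature = instantaneous_temperature(masses, velocities)
print("Instantaneous temperature: %.1f K" % temperature)
```

The measured temperature lands close to, but not exactly at, 296 K. The same kind of fluctuation shows up in the temperature column that the simulation reports.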

# Production: creates trajectory at desired temperature, starting from end of equilibration

In [None]:
# Set up the simulation to save the coordinates needed for visualization
# to a PDB file, and to report the timestep, the kinetic and potential
# energies, and the temperature of the system to the standard output,
# every 100 timesteps.
Nsteps=10000
print_every_Nsteps=100
simulation.reporters.append(PDBReporter(the_protein + '_trajectory.pdb', print_every_Nsteps))
simulation.reporters.append(StateDataReporter(stdout, print_every_Nsteps, step=True, kineticEnergy=True, 
    potentialEnergy=True, temperature=True, separator='\t'))

# Start the simulation, while also taking note of how long it takes for the
# simulation to finish. When the values are shown in the standard output,
# it will look like the simulation starts at around the 10,000th timestep.
# That's because the step counter reflects the current state of the system:
# we already ran 10,000 timesteps in the equilibration step, so the
# production run continues counting from there.
tinit = time()
print('Running Production on ' + the_protein + '...')
simulation.step(Nsteps)
tfinal = time()
print('Done!')

# Print out the length of time the simulation took to complete.
print('Done production! Time required:', tfinal-tinit, 'seconds')
print("Number of steps:", Nsteps)
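
The numbers used in this notebook translate into simulated time and trajectory length as sketched below. The values are copied from the cells above. The variable names are made up for this illustration, and the first reported step assumes the reporter fires whenever the step counter is a multiple of the reporting interval.

```python
timestep_fs = 1.0            # integrator timestep in femtoseconds
equilibration_steps = 10000  # steps run during equilibration
production_steps = 10000     # Nsteps in the production cell
report_interval = 100        # print_every_Nsteps in the production cell

# Simulated time of the production run (1 picosecond = 1000 femtoseconds)
production_time_ps = production_steps * timestep_fs / 1000.0

# Number of frames written to the trajectory PDB file
n_frames = production_steps // report_interval

# The step counter carries over from equilibration
first_reported_step = equilibration_steps + report_interval
last_reported_step = equilibration_steps + production_steps

print("Production covers", production_time_ps, "ps")    # 10.0 ps
print("Frames saved:", n_frames)                        # 100
print("First reported step:", first_reported_step)      # 10100
print("Last reported step:", last_reported_step)        # 20000
```

So this short run covers 10 picoseconds of simulated time, and loading the trajectory later should show 100 frames.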

# Trajectory analysis using the mdtraj package

In [None]:
import matplotlib.pyplot as plt
import mdtraj as md
import numpy as np

In [None]:
traj = md.load(the_protein + '_trajectory.pdb')
print(traj)

In [None]:
print('How many atoms?    %s' % traj.n_atoms)   #prints out number of atoms in simulation
print('How many residues? %s' % traj.n_residues)   #prints out number of residues in simulation
n_waters = sum(1 for res in traj.topology.residues if res.is_water)   #counts the residues that are water molecules
print('How many water molecules? %s' % n_waters)
print('Second residue: %s' % traj.topology.residue(1))   #prints out the residue label for the number you put in
atom = traj.topology.atom(2)      #picks out a specific atom (based on the number) for further interrogation
print('''Hi! I am atom number %s (counting from 0), and my name is %s. 
I am a %s atom with %s bonds. 
I am part of a(n) %s residue.''' % ( atom.index, atom.name, atom.element.name, atom.n_bonds, atom.residue.name))


### If the asterisk above resolves to a number, the production should be done! Be sure to store the log file in OneNote, Box, etc. [Here](https://www.rcsb.org/structure/1A57) is the link to 1a57. Click Download Files, then download the structure in PDB format. Then, as we navigated through the JupyterHub directories before, be sure to upload the .pdb file to the same folder this program is in.