# Folding Lab

## Building the pose

In this lab session, you will fold a 30-residue protein with basic movers. Start by initializing PyRosetta as usual.

In [1]:
#1
# YOUR CODE HERE
from pyrosetta import *
init()

PyRosetta-4 2020 [Rosetta PyRosetta4.conda.mac.python37.Release 2020.02+release.22ef835b4a2647af94fcd6421a85720f07eddf12 2020-01-05T17:31:56] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
[0mcore.init: {0} [0mChecking for fconfig files in pwd and ./rosetta/flags
[0mcore.init: {0} [0mRosetta version: PyRosetta4.conda.mac.python37.Release r242 2020.02+release.22ef835b4a2 22ef835b4a2647af94fcd6421a85720f07eddf12 http://www.pyrosetta.org 2020-01-05T17:31:56
[0mcore.init: {0} [0mcommand: PyRosetta -ex1 -ex2aro -database /Users/paul/anaconda3/envs/pyrosetta_env/lib/python3.7/site-packages/pyrosetta/database
[0mbasic.random.init_random_generator: {0} [0m'RNG device' seed mode, using '/dev/urandom', seed=1531610209 seed_offset=0 real_seed=1531610209 thread_index=0
[0mbasic.random.init_random_generator: {0} [0mRandomGenerator:init: Normal mode, seed=1531610209 RG_type=mt19937


We would like to visualize folding as it happens. Before starting the folding protocol, instantiate a PyMOL mover. Make sure you retain history so that you can view the entire folding process. Start a PyMOL session on the side and make sure it says `PyMOL <---> PyRosetta link started!` on its command line.

In [2]:
#2
# YOUR CODE HERE
#pmm = pyrosetta.PyMOLMover()

Initialize a pose from sequence using the FASTA file given in this folder, `1qgm.fasta`. Use the PyMOL mover to view the pose. You should see a long thread-like structure. Check the backbone dihedrals of any residue (except the first and last). What are the values of the phi and psi dihedrals?

In [3]:
#3
# YOUR CODE HERE
#VIHCDAATICPDGTTCSLSPYGVWYCSPFS
seq = pyrosetta.rosetta.core.sequence.read_fasta_file_return_str('1qgm.fasta')
pose = pose_from_sequence(seq)  # build the pose from the sequence read from the FASTA file
print(pose)

core.chemical.GlobalResidueTypeSet: {0} Finished initializing fa_standard residue type set.  Created 980 residue types
core.chemical.GlobalResidueTypeSet: {0} Total time to initialize 1.03042 seconds.
PDB file name: VIHCDAAT
Total residues: 30
Sequence: VIHCDAATICPDGTTCSLSPYGVWYCSPFS
Fold tree:
FOLD_TREE  EDGE 1 30 -1 


In [4]:
pmm = pyrosetta.PyMOLMover()
pmm.keep_history(True)  # retain every frame so the whole folding run can be replayed
pmm.apply(pose)

Pick any residue in the pose (except for an Ala or Gly) and print out information about it using `pose.residue(resno)`. How many atoms do you see? You might also want to go to PyMOL and take a look at this residue in stick representation.

In [5]:
#4
# YOUR CODE HERE
resno = 4
res = pose.residue(resno)
print(res)
print("Total number of res =", pose.total_residue())
# 11 atoms on full-atom res

Residue 4: CYS (CYS, C):
Base: CYS
 Properties: POLYMER PROTEIN CANONICAL_AA SC_ORBITALS METALBINDING SIDECHAIN_THIOL ALPHA_AA L_AA
 Variant types:
 Main-chain atoms:  N    CA   C  
 Backbone atoms:    N    CA   C    O    H    HA 
 Side-chain atoms:  CB   SG  1HB  2HB   HG 
Atom Coordinates:
   N  : 9.5345, 5.35895, 5.86727e-16
   CA : 10.1896, 6.66148, -2.44224e-17
   C  : 11.7059, 6.51548, 9.3584e-16
   O  : 12.2319, 5.40254, 1.59757e-15
   CB : 9.76138, 7.47898, 1.21891
   SG : 8.58811, 6.62992, 2.30256
   H  : 10.1015, 4.5231, 1.31674e-15
   HA : 9.89167, 7.19849, -0.900599
  1HB : 10.6399, 7.73826, 1.81014
  2HB : 9.30234, 8.41073, 0.88903
   HG : 8.47609, 7.61688, 3.18607
Mirrored relative to coordinates in ResidueType: FALSE

Total number of res = 30


## Using centroid mode

We will work entirely in centroid mode for the rest of this lab, so instantiate a `SwitchResidueTypeSetMover` and apply it to the pose. Apply the PyMOL mover again.

In [6]:
#5
# YOUR CODE HERE
switchMover = SwitchResidueTypeSetMover('centroid')
switchMover.apply(pose)

core.chemical.GlobalResidueTypeSet: {0} Finished initializing centroid residue type set.  Created 62 residue types
core.chemical.GlobalResidueTypeSet: {0} Total time to initialize 0.042272 seconds.


In [7]:
unfolded_pose = pose.clone()

In [8]:
help(SwitchResidueTypeSetMover)

Help on class SwitchResidueTypeSetMover in module pyrosetta.rosetta.protocols.simple_moves:

class SwitchResidueTypeSetMover(pyrosetta.rosetta.protocols.moves.Mover)
 |  A mover that switches a pose between residue type sets (e.g. centroid and fullatom)
 |  
 |  examples:
 |      switch = protocols::simple_moves::SwitchResidueTypeSetMover("centroid")
 |  See also:
 |      Pose
 |      Residue
 |      ResidueType
 |      ResidueTypeSet
 |  
 |  Method resolution order:
 |      SwitchResidueTypeSetMover
 |      pyrosetta.rosetta.protocols.moves.Mover
 |      pybind11_builtins.pybind11_object
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(...)
 |      __init__(*args, **kwargs)
 |      Overloaded function.
 |      
 |      1. __init__(self: pyrosetta.rosetta.protocols.simple_moves.SwitchResidueTypeSetMover) -> None
 |      
 |      2. __init__(self: pyrosetta.rosetta.protocols.simple_moves.SwitchResidueTypeSetMover, : str) -> None
 |      
 |      3. __init__(self

Print out information about the same residue using `pose.residue(resno)`. How many atoms do you see now? Is there a change in the stick representation of this residue?

In [9]:
#6
# YOUR CODE HERE
print(pose.residue(resno))
# 7 atoms on centroid res 4

Residue 4: CYS (CYS, C):
Base: CYS
 Properties: POLYMER PROTEIN CANONICAL_AA SIDECHAIN_THIOL ALPHA_AA L_AA
 Variant types:
 Main-chain atoms:  N    CA   C  
 Backbone atoms:    N    CA   C    O    H  
 Side-chain atoms:  CB   CEN
Atom Coordinates:
   N  : 9.5345, 5.35895, 5.86727e-16
   CA : 10.1896, 6.66148, -2.44224e-17
   C  : 11.7059, 6.51548, 9.3584e-16
   O  : 12.2319, 5.40254, 1.59757e-15
   CB : 9.70832, 7.28151, 1.312
   CEN: 9.0524, 8.0536, 1.47384
   H  : 10.1015, 4.5231, 1.31674e-15
Mirrored relative to coordinates in ResidueType: FALSE



Since we are working with a centroid residue type set, `ref2015` can no longer be used as a score function. Nonetheless, let's give it a try. Initialize `ref2015` using `get_fa_scorefxn()` and try to score the pose with it. What do you see? Comment this line out (that is, add `#` in front of it) to proceed.

In [10]:
#7
# YOUR CODE HERE
sxfn2015 = get_fa_scorefxn()
#print(sxfn2015(pose))

#Error
#Illegal attempt to score with non-identical atom set between pose and etable
#	pose   atom_type_set: 'centroid'
#	etable atom_type_set: 'fa_standard'

core.scoring.ScoreFunctionFactory: {0} SCOREFUNCTION: ref2015
core.scoring.etable: {0} Starting energy table calculation
core.scoring.etable: {0} smooth_etable: changing atr/rep split to bottom of energy well
core.scoring.etable: {0} smooth_etable: spline smoothing lj etables (maxdis = 6)
core.scoring.etable: {0} smooth_etable: spline smoothing solvation etables (max_dis = 6)
core.scoring.etable: {0} Finished calculating energy tables.
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv
basic.io.database: {0} Database file op

An appropriate score function for the centroid mode is `score3`. You can initialize it using `create_score_function("score3")`. Score the pose with this score function.

In [11]:
#8
# YOUR CODE HERE
sxfn = create_score_function('score3')
print('Centroid score =', sxfn(pose))

[0mbasic.io.database: {0} [0mDatabase file opened: scoring/score_functions/EnvPairPotential/env_log.txt
[0mbasic.io.database: {0} [0mDatabase file opened: scoring/score_functions/EnvPairPotential/cbeta_den.txt
[0mbasic.io.database: {0} [0mDatabase file opened: scoring/score_functions/EnvPairPotential/pair_log.txt
[0mbasic.io.database: {0} [0mDatabase file opened: scoring/score_functions/EnvPairPotential/cenpack_log.txt
[0mbasic.io.database: {0} [0mDatabase file opened: scoring/score_functions/SecondaryStructurePotential/phi.theta.36.HS.resmooth
[0mbasic.io.database: {0} [0mDatabase file opened: scoring/score_functions/SecondaryStructurePotential/phi.theta.36.SS.resmooth
Centroid score = 156.57643598651632


## Comparing to native

Since the final solution is known to us, let's read in the native file, `1qgm.pdb`, and store it in a new pose. For a fair comparison, let's also apply the switch residue type set mover to it. Score this native pose using `score3`. Why do you think the score is what it is? To visualize this pose, use the PyMOL mover again.

In [12]:
pose_native = pose_from_file('1qgm.pdb')

core.import_pose.import_pose: {0} File '1qgm.pdb' automatically determined to be of type PDB
core.conformation.Conformation: {0} Found disulfide between residues 4 16
core.conformation.Conformation: {0} current variant for 4 CYS
core.conformation.Conformation: {0} current variant for 16 CYS
core.conformation.Conformation: {0} current variant for 4 CYD
core.conformation.Conformation: {0} current variant for 16 CYD
core.conformation.Conformation: {0} Found disulfide between residues 10 26
core.conformation.Conformation: {0} current variant for 10 CYS
core.conformation.Conformation: {0} current variant for 26 CYS
core.conformation.Conformation: {0} current variant for 10 CYD
core.conformation.Conformation: {0} current variant for 26 CYD


In [13]:
# Convert the native pose to centroid mode
switchMover.apply(pose_native)

In [14]:
# Score native pose
#sxfn = create_score_function('score3')
print('Centroid score =', sxfn(pose_native))

Centroid score = 63.8669820863305


In [15]:
# PyMOL mover
pmm.apply(pose_native)

How far is the pose from the native pose? One classic metric is the C-alpha RMSD. In Rosetta, it is calculated using the `CA_rmsd()` function located in `pyrosetta.rosetta.core.scoring`. What is the starting value?

In [16]:
#9
# YOUR CODE HERE
from pyrosetta.rosetta.core.scoring import CA_rmsd

print(CA_rmsd(pose, pose_native))

27.841325759887695


Now that we know how to calculate this RMSD, we can use it to track if we are getting closer to the native structure.
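
For intuition about what this number measures, here is a minimal pure-Python sketch of a root-mean-square deviation over matched coordinates. It is an illustration only: `rmsd_no_superposition` is a hypothetical helper and the coordinates are made up. Unlike `CA_rmsd()`, it skips the superposition step, so it only matches a superimposed RMSD when the two structures are already optimally aligned.

```python
import math

def rmsd_no_superposition(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points, without alignment."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy "C-alpha traces": the second is the first shifted by (3, 4, 0),
# so every atom is displaced by exactly 5 units.
trace_a = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
trace_b = [(x + 3.0, y + 4.0, z) for x, y, z in trace_a]

print(rmsd_no_superposition(trace_a, trace_a))  # 0.0
print(rmsd_no_superposition(trace_a, trace_b))  # 5.0
```

A pure translation like this one would give an RMSD of zero after superposition, which is why the superimposed value is the more meaningful measure of shape difference.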

## Building a basic folding protocol

So finally we can start to fold the protein. The only mobile segment is the backbone. To specify this, make a movemap and set backbone motion to true for all residues, i.e. `set_bb(True)`.

In [17]:
mm = MoveMap()
mm.set_bb(True)

Let us try to use some simple movers that we have learnt today to fold the protein. Start by initializing a small mover and a shear mover (located in `pyrosetta.rosetta.protocols.simple_moves`). Set `n_moves = 10` and `kT = 1.0`.

In [18]:
#10
# YOUR CODE HERE
from pyrosetta.rosetta.protocols.simple_moves import SmallMover, ShearMover
n_moves = 10
kT = 1.
# Initialize backbone movers, restricted to the residues allowed by the movemap `mm`
small_mover = SmallMover(mm, kT, n_moves)
shear_mover = ShearMover(mm, kT, n_moves)

To apply them one after the other, use a sequence mover (located in `pyrosetta.rosetta.protocols.moves`).

The SequenceMover is a mover that holds other movers and applies them "in sequence." I.e. if you have three movers, `m1`, `m2`, and `m3`, and you add them to a SequenceMover in that order, then when you call the SequenceMover's `apply` method, it will first invoke `m1`'s `apply`, then `m2`'s `apply`, and finally `m3`'s `apply`.
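
The ordering described above can be sketched in plain Python. `ToyMover` and `ToySequenceMover` are hypothetical stand-ins written for this illustration (a plain list plays the role of the pose); they are not the Rosetta classes.

```python
class ToyMover:
    """Hypothetical stand-in for a Rosetta Mover: records its name on the 'pose'."""
    def __init__(self, name):
        self.name = name

    def apply(self, pose):
        pose.append(self.name)


class ToySequenceMover:
    """Holds movers and applies them in the order they were added."""
    def __init__(self):
        self.movers = []

    def add_mover(self, mover):
        self.movers.append(mover)

    def apply(self, pose):
        for mover in self.movers:
            mover.apply(pose)


toy_sequence = ToySequenceMover()
for name in ("m1", "m2", "m3"):
    toy_sequence.add_mover(ToyMover(name))

toy_pose = []                 # a plain list stands in for a Pose
toy_sequence.apply(toy_pose)
print(toy_pose)               # ['m1', 'm2', 'm3']
```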

Add the above two movers to your sequence mover using its `add_mover()` function.

In [19]:
#11
# YOUR CODE HERE
sequence_mover = pyrosetta.rosetta.protocols.moves.SequenceMover()

sequence_mover.add_mover(small_mover)
sequence_mover.add_mover(shear_mover)

To perform a Monte Carlo simulation, we first need a `MonteCarlo` object. Use the constructor for MonteCarlo that takes as arguments a pose, a score function, and a kT "temperature" as its parameters. Use a temperature of 1.

You're probably wondering: what score function and what pose should you use? The MonteCarlo object will use the Pose that you give it as its starting point. It should be the centroid pose that you created from the FASTA sequence (not the native pose you read in from the PDB file!). We will want the centroid score function to go along with the centroid pose.

The way we'll use the MonteCarlo object is that we'll perturb the Pose `p`, and then after the perturbation, ask the MonteCarlo object to accept or reject the perturbation.

```
   for _ in range(100):
       perturber.apply(p)
       mc.boltzmann(p) # accept or reject the change to p
```
Internally, the MonteCarlo object keeps a copy of the pose in its most-recently-accepted conformation so that if the Metropolis criterion calls for the rejection of the perturbation, then the MonteCarlo object can copy the prior conformation back into `p`. I.e., the copying resets `p`'s conformation to how it was before the perturbation took place.

The MonteCarlo object is keeping a second copy of the pose that represents the best-ever-seen conformation. We will talk about that copy later.
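
The bookkeeping described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not Rosetta's implementation: `ToyMonteCarlo` and `toy_score` are hypothetical, a list of numbers stands in for the pose, and acceptance uses the standard Metropolis criterion (always accept a score decrease, accept an increase of `delta` with probability `exp(-delta / kT)`).

```python
import math
import random

class ToyMonteCarlo:
    """Hypothetical, simplified Metropolis bookkeeping (not Rosetta's code)."""
    def __init__(self, pose, score_fn, kT, rng=None):
        self.score_fn = score_fn
        self.kT = kT
        self.rng = rng or random.Random()
        self.reset(pose)

    def reset(self, pose):
        # Forget history: the given pose becomes both last-accepted and best.
        self.last_accepted = list(pose)
        self.last_score = self.score_fn(pose)
        self.lowest = list(pose)
        self.lowest_score = self.last_score

    def boltzmann(self, pose):
        """Accept or reject `pose` in place; return True if it was accepted."""
        new_score = self.score_fn(pose)
        delta = new_score - self.last_score
        accept = delta <= 0 or self.rng.random() < math.exp(-delta / self.kT)
        if accept:
            self.last_accepted = list(pose)
            self.last_score = new_score
            if new_score < self.lowest_score:
                self.lowest = list(pose)
                self.lowest_score = new_score
        else:
            pose[:] = self.last_accepted  # roll the caller's pose back
        return accept

    def recover_low(self, pose):
        pose[:] = self.lowest  # copy the best-ever-seen state into the pose


def toy_score(pose):
    return sum(x * x for x in pose)

rng = random.Random(0)
p = [5.0, -3.0]
toy_mc = ToyMonteCarlo(p, toy_score, kT=1.0, rng=rng)
for _ in range(200):
    i = rng.randrange(len(p))
    p[i] += rng.uniform(-0.5, 0.5)  # the "perturber"
    toy_mc.boltzmann(p)             # accept or reject the change to p
print(toy_score(p), toy_mc.lowest_score)
```

Because uphill moves are sometimes accepted, the score of `p` at the end of the loop can be higher than `toy_mc.lowest_score`, which is the reason for keeping the second copy.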

Construct your MonteCarlo object below.

In [20]:
#12
# YOUR CODE HERE
kT = 1.0  # temperature of 1, as specified above
mc = MonteCarlo(pose,sxfn,kT)

As we are nesting movers, it will be convenient to use a `TrialMover` to invoke some other Mover's `apply` function, and then invoke the MonteCarlo's `boltzmann` function. That is, the TrialMover's `apply` function is basically:

```
class TrialMover(Mover):
   def __init__(self, mover: Mover, mc: MonteCarlo):
      self.mover = mover
      self.mc = mc
   def apply(self, p: Pose):
      self.mover.apply(p)   # perturb the pose
      self.mc.boltzmann(p)  # accept or reject the perturbation
```

(although it is actually implemented in C++).

Why create a class to call a function? Why not just call the function? Muse on this.
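
One possible answer, sketched in plain Python rather than PyRosetta: because a trial mover exposes the same `apply` interface as every other mover, it can be nested inside other movers. The classes below (`CountingMC`, `Nudge`, `ToyTrialMover`, `ToyRepeatMover`) are hypothetical stand-ins for illustration, not the Rosetta implementations.

```python
class CountingMC:
    """Hypothetical stub: counts boltzmann() calls instead of scoring."""
    def __init__(self):
        self.calls = 0

    def boltzmann(self, pose):
        self.calls += 1


class Nudge:
    """Toy mover: adds 1 to the first element of a list 'pose'."""
    def apply(self, pose):
        pose[0] += 1


class ToyTrialMover:
    """Perturb with the wrapped mover, then ask the MC object to accept or reject."""
    def __init__(self, mover, mc):
        self.mover = mover
        self.mc = mc

    def apply(self, pose):
        self.mover.apply(pose)
        self.mc.boltzmann(pose)


class ToyRepeatMover:
    """Applies any object that has an apply() method n times."""
    def __init__(self, mover, n):
        self.mover = mover
        self.n = n

    def apply(self, pose):
        for _ in range(self.n):
            self.mover.apply(pose)


counting_mc = CountingMC()
trial = ToyTrialMover(Nudge(), counting_mc)
repeated_trials = ToyRepeatMover(trial, 5)  # a trial mover nested in another mover

toy_pose = [0]
repeated_trials.apply(toy_pose)
print(toy_pose, counting_mc.calls)  # [5] 5
```

`ToyRepeatMover` needs no knowledge of Monte Carlo at all; it only relies on `apply`, which is what wrapping the function call in a class buys.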

Initialize a `TrialMover` with the sequence mover (the one containing the small and shear movers) and the Monte Carlo object as input.

In [21]:
#13
# YOUR CODE HERE
# Init trial_mover obj
trial_mover = pyrosetta.rosetta.protocols.moves.TrialMover(sequence_mover, mc)

Before making any changes, let's keep a copy of the starting pose handy, because we'll need it later. To do so use `pose.clone()`.

In [22]:
#14
pose_start = pose.clone()

## Applying the folding protocol

OK!

We now have all of the elements we need in order to put together and run our small- and shear-mover-based-folding protocol. Let's assemble them.

What we will do is invoke the TrialMover's `apply` method. The TrialMover will invoke the SequenceMover's `apply` method and then invoke the MonteCarlo object's `boltzmann` method. The SequenceMover will invoke the small and shear movers in sequence.

Just applying the TrialMover once, however, will not be sufficient to fold the protein! Apply this mover 100 times to do so (with a for loop). Inside this loop, also store the score of the pose and the C-alpha RMSD in two `numpy` arrays (look back to Session 1) after every apply of the trial mover. Also use the PyMOL mover to update the pose in PyMOL so that you can create a movie of your protocol.

We are going to have you run this cell more than once; to make sure results of previous runs don't bleed through, add a call to MonteCarlo's `reset` function, handing it the conformation of the original pose. Also, copy the `pose_start` pose into the working `pose`.

In [23]:
import numpy as np

In [24]:
#15
# YOUR CODE HERE
movemap = MoveMap()
movemap.set_bb(True)

n_iteration = 100
pose_score = np.zeros((n_iteration,), dtype='float')
CA_rmsd_array = np.zeros((n_iteration,), dtype='float')

work_pose = Pose(pose_start)
mc.reset(work_pose)

# Loop
for i in range(n_iteration):
    trial_mover.apply(work_pose)
    pose_score[i] = sxfn(work_pose)
    CA_rmsd_array[i] = CA_rmsd(work_pose, pose_native)
    pmm.apply(work_pose)  # send each frame to PyMOL to build the movie
    print('score:', pose_score[i], 'rmsd:', CA_rmsd_array[i])

core.scoring.ramachandran: {0} shapovalov_lib::shap_rama_smooth_level of 4( aka highest_smooth ) got activated.
basic.io.database: {0} Database file opened: scoring/score_functions/rama/shapovalov/kappa25/all.ramaProb
basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/avg_L_rama.dat
core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/avg_L_rama.dat.
basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/sym_all_rama.dat
core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/sym_all_rama.dat.
basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/sym_G_rama.dat
core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/sym_G_rama.dat.
basic.io.database: {0} Database file opened: scoring/score_funct

Now, using a loop, print out the scores and the RMSDs after every iteration. How close do you get to the native using this folding algorithm? Did you notice your RMSD decrease stalling after a while? Rerun the cell above a couple of times (but take care that the results of your previous execution don't bleed through). Do you ever go below a certain number? If you "play" the states you have in PyMOL, do your structures look like they are becoming very protein-like?

In [25]:
#16
# YOUR CODE HERE
pmm.apply(work_pose)

## Increasing move size

> If you have less than 30 minutes left, skip this section and proceed directly to the fragment-insertion section.

One possible reason for this stagnation is that the maximum angle deviation of small movers is very small: 0, 5, and 6 degrees for helices, sheets, and loops, respectively. This approach will likely take forever to fold this protein. What if we introduce larger changes? Let's try larger changes! Use `RandomTorsionMover` (also found in `pyrosetta.rosetta.protocols.simple_moves`) to make larger torsion changes. `RandomTorsionMover` can be initialized with three parameters: `movemap`, `max_angle`, and `num_moves`. Set `max_angle` to 20 (you can try a few values) and `n_moves` to 10.
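
To get a feel for what a larger `max_angle` means, here is a toy sketch in plain Python. It assumes each move picks one torsion at random and nudges it by a uniformly drawn amount in `[-max_angle, +max_angle]`; whether `RandomTorsionMover` samples in exactly this way is an assumption, and `wrap_degrees` and `perturb_torsions` are hypothetical helpers.

```python
import random

def wrap_degrees(angle):
    """Map any angle in degrees onto the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def perturb_torsions(torsions, max_angle, num_moves, rng):
    """Return a copy in which `num_moves` randomly chosen torsions are nudged."""
    new = list(torsions)
    for _ in range(num_moves):
        i = rng.randrange(len(new))
        new[i] = wrap_degrees(new[i] + rng.uniform(-max_angle, max_angle))
    return new

rng = random.Random(42)
extended = [180.0] * 60  # toy chain: 30 residues, one phi and one psi each

small = perturb_torsions(extended, 6.0, 10, rng)
large = perturb_torsions(extended, 20.0, 10, rng)

def largest_change(before, after):
    return max(abs(wrap_degrees(b - a)) for a, b in zip(before, after))

print(largest_change(extended, small), largest_change(extended, large))
```

Larger perturbations explore conformational space faster, but each one is also more likely to be rejected by the Metropolis criterion, so trying a few values of `max_angle` is worthwhile.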

In [26]:
#17
# YOUR CODE HERE
movemap = MoveMap()
movemap.set_bb(True)

max_angle = 20
n_moves = 10
random_torsion_mover = pyrosetta.rosetta.protocols.simple_moves.RandomTorsionMover(movemap, max_angle, n_moves)

Add this mover to your sequence mover (which still contains the small- and shear- movers).

In [27]:
#18
# YOUR CODE HERE
sequence_mover.add_mover(random_torsion_mover)

trial_mover = pyrosetta.rosetta.protocols.moves.TrialMover(sequence_mover, mc)

Now, copy-paste the loop and score printing from above to run the trial mover. Before you run it, however, you might want to clear out your PyMOL session by typing `delete all` in the PyMOL terminal.

In [28]:
#19
# YOUR CODE HERE (COPIED FROM 15)
n_iteration = 100
pose_score = np.zeros((n_iteration,), dtype='float')
CA_rmsd_array = np.zeros((n_iteration,), dtype='float')

work_pose = Pose(unfolded_pose)
mc.reset(work_pose)

# Loop
for i in range(n_iteration):
    trial_mover.apply(work_pose)
    pose_score[i] = sxfn(work_pose)
    CA_rmsd_array[i] = CA_rmsd(work_pose, pose_native) 
    
print(CA_rmsd_array)

[27.68357277 27.51980591 27.47325516 27.39761925 27.18251419 26.643116
 26.67430305 26.69020653 26.52354431 25.77178001 25.33015251 25.60272026
 26.00553131 25.41796875 25.43510628 25.00482368 24.80314445 24.72152901
 24.72152901 24.72152901 25.24711418 25.02639961 25.40847397 25.4579277
 25.38396645 24.84935379 23.90630913 23.856287   22.67801285 22.45601654
 22.45601654 20.98574257 21.39917183 21.79724693 21.9245491  21.20833397
 21.23514366 21.88498497 20.85892677 21.03007126 21.42713165 20.76374435
 20.90421677 21.02792931 20.60367966 20.60367966 20.75211334 20.75211334
 20.78311539 21.28763199 21.55399895 20.38925171 20.09889221 20.55014229
 20.36945534 20.1437912  20.203825   20.203825   19.80945206 19.80945206
 20.00008583 20.37793922 20.96925545 20.96925545 19.91837502 19.91837502
 19.48837662 18.74079704 18.68639565 18.72855568 19.09550285 18.74419022
 18.85618973 18.85618973 18.85618973 18.85618973 18.78295708 18.24315262
 18.29317284 17.82403374 17.25905991 17.43937492 17.70

In [29]:
pmm.apply(work_pose)

Do larger moves help?
--> I don't know; not really, in my runs.

## Fragment insertion

As a last part to this exercise, let's try fragment insertion as the first step in the sequence mover. Code to create a fragment set with 3-mer fragments is already written in the cell below. Make another fragment set of length 9 and read in the 9-mer file supplied, `1qgm.9mers`.
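
Conceptually, a fragment insertion copies a short run of backbone torsions for consecutive residues into the pose in one step. The plain-Python sketch below illustrates the idea only: `insert_fragment`, `random_fragment_move`, the toy library, and the torsion values are all hypothetical, positions are 0-based (unlike Rosetta's 1-based residue numbering), and real fragment files carry more information than a (phi, psi) pair.

```python
import random

def insert_fragment(torsions, fragment, start):
    """Return a copy of per-residue (phi, psi) tuples with `fragment` written at `start`."""
    if start < 0 or start + len(fragment) > len(torsions):
        raise ValueError("fragment does not fit at this position")
    new = list(torsions)
    new[start:start + len(fragment)] = fragment
    return new

def random_fragment_move(torsions, fragment_library, rng):
    """Pick a random window, then a random fragment stored for that window."""
    start = rng.choice(sorted(fragment_library))
    fragment = rng.choice(fragment_library[start])
    return insert_fragment(torsions, fragment, start)

# Illustrative torsion values only (roughly helix-like and strand-like).
helix3 = [(-60.0, -45.0)] * 3
strand3 = [(-120.0, 130.0)] * 3
library = {0: [helix3, strand3], 4: [helix3], 8: [strand3]}  # window start -> fragments

rng = random.Random(1)
chain = [(180.0, 180.0)] * 12  # toy extended chain
moved = random_fragment_move(chain, library, rng)
changed = [i for i, (old, new) in enumerate(zip(chain, moved)) if old != new]
print(changed)  # indices of the three residues that received fragment torsions
```

Compared with small and shear moves, a single insertion changes several residues at once to a locally plausible conformation, which is why 9-mers are useful early and 3-mers later for finer adjustments.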

In [30]:
#20
from pyrosetta.rosetta.core.fragment import *
fragset3 = ConstantLengthFragSet(3)
fragset3.read_fragment_file("1qgm.3mers")

# YOUR CODE HERE
fragset9 = ConstantLengthFragSet(9)
fragset9.read_fragment_file("1qgm.9mers")

core.fragments.ConstantLengthFragSet: {0} finished reading top 200 3mer fragments from file 1qgm.3mers
core.fragments.ConstantLengthFragSet: {0} finished reading top 200 9mer fragments from file 1qgm.9mers


Construct a 3-mer and a 9-mer fragment set mover. Code for the 3-mer fragment mover is already written in the cell below. Similarly, make a 9-mer fragment mover.

In [31]:
#21
from pyrosetta.rosetta.protocols.simple_moves import ClassicFragmentMover
mover_3mer = ClassicFragmentMover(fragset3, mm)

# YOUR CODE HERE
mover_9mer = ClassicFragmentMover(fragset9, mm)

Put the 9-mer mover into a RepeatMover (from `pyrosetta.rosetta.protocols.moves`) which repeats 1 time and the 3-mer mover into a separate RepeatMover to repeat 1 time.

In [32]:
#22
# YOUR CODE HERE
from pyrosetta.rosetta.protocols.moves import RepeatMover
repeat_mover_9mer = RepeatMover(mover_9mer, 1)
repeat_mover_3mer = RepeatMover(mover_3mer, 1)

Just like with the `RandomTorsionMover` above, reassign pose to the `pose_start` conformation and `reset` your MonteCarlo object. Also, clear out your PyMOL session by typing `delete all` in the PyMOL terminal.

In [33]:
#23
# YOUR CODE HERE
work_pose = Pose(unfolded_pose)
mc.reset(work_pose)

In [34]:
print(work_pose)

PDB file name: VIHCDAAT
Total residues: 30
Sequence: VIHCDAATICPDGTTCSLSPYGVWYCSPFS
Fold tree:
FOLD_TREE  EDGE 1 30 -1 


Using the 9-mer `RepeatMover` and your Monte Carlo object, create a new `TrialMover` and apply it to the pose 100 times. Use `PyMOLMover` to visualize the changes. Print out the score and the RMSD after the 9-mer insertions. Does fragment insertion help?

Repeat the same steps for the 3-mer `RepeatMover` (picking up from where the 9-mer insertion left off). Is the RMSD lower now than when using the small- and shear-movers?

In [35]:
#24
# YOUR CODE HERE
trial_mover_9 = pyrosetta.rosetta.protocols.moves.TrialMover(repeat_mover_9mer, mc)
trial_mover_3 = pyrosetta.rosetta.protocols.moves.TrialMover(repeat_mover_3mer, mc)

In [36]:
n_iteration = 100
CA_rmsd_array = np.zeros((n_iteration,), dtype='float')
pose_score = np.zeros((n_iteration,), dtype='float')

# Loop
for i in range(n_iteration):
    trial_mover_9.apply(work_pose)
    pose_score[i] = sxfn(work_pose)
    CA_rmsd_array[i] = CA_rmsd(work_pose, pose_native)
    print('9mer score', pose_score[i], 'rmsd', CA_rmsd_array[i])


for i in range(n_iteration):
    trial_mover_3.apply(work_pose)
    pose_score[i] = sxfn(work_pose)
    CA_rmsd_array[i] = CA_rmsd(work_pose, pose_native)
    
    print('3mer score', pose_score[i], 'rmsd', CA_rmsd_array[i])

9mer score 153.3361590319783 rmsd 25.25547981262207
9mer score 121.03221984312606 rmsd 18.48982048034668
9mer score 121.26053925488073 rmsd 17.870283126831055
9mer score 109.70785867580234 rmsd 16.325658798217773
9mer score 109.70785867580234 rmsd 16.325658798217773
9mer score 102.93226257940212 rmsd 12.183347702026367
9mer score 97.30412746082597 rmsd 11.967729568481445
9mer score 87.90091162643367 rmsd 10.718454360961914
9mer score 87.90091162643367 rmsd 10.718454360961914
9mer score 87.90091162643367 rmsd 10.718454360961914
9mer score 87.90091162643367 rmsd 10.718454360961914
9mer score 87.90091162643367 rmsd 10.718454360961914
9mer score 87.90091162643367 rmsd 10.718454360961914
9mer score 79.71628097826758 rmsd 10.306827545166016
9mer score 79.71628097826758 rmsd 10.306827545166016
9mer score 49.232756963274234 rmsd 5.581231117248535
9mer score 49.232756963274234 rmsd 5.581231117248535
9mer score 49.232756963274234 rmsd 5.581231117248535
9mer score 49.232756963274234 rmsd 5.581231

Let's try to refine the structure that comes out of the 9- and 3-mer fragment insertion stage above. Construct a new sequence mover with just the small mover and the shear movers (i.e. no RandomTorsionMover).

In [37]:
#25
# YOUR CODE HERE
sequence_mover_2 = pyrosetta.rosetta.protocols.moves.SequenceMover()

sequence_mover_2.add_mover(small_mover)
sequence_mover_2.add_mover(shear_mover)

Construct a trial mover with the new sequence mover and the `MonteCarlo` object.

In [38]:
#26
# YOUR CODE HERE
trial_mover_2 = pyrosetta.rosetta.protocols.moves.TrialMover(sequence_mover_2, mc)

Now, copy-paste the trial-mover loop and score printing from above to run this refinement stage of the protocol. This time, however, do not reset the MonteCarlo object or reassign your working pose to the `pose_start`.

In [39]:
#27
# YOUR CODE HERE
mc.set_temperature(1.0)
n_iteration = 100
CA_rmsd_array = np.zeros((n_iteration,), dtype='float')
pose_score = np.zeros((n_iteration,), dtype='float')

# Loop
for i in range(n_iteration):
    trial_mover_2.apply(work_pose)
    pose_score[i] = sxfn(work_pose)
    CA_rmsd_array[i] = CA_rmsd(work_pose, pose_native)
    print('score', pose_score[i], 'rmsd', CA_rmsd_array[i])

score 44.72576385688422 rmsd 4.576930046081543
score 44.58585665861669 rmsd 4.561782360076904
score 44.58585665861669 rmsd 4.561782360076904
score 42.95287626312666 rmsd 4.553153038024902
score 42.95287626312666 rmsd 4.553153038024902
score 42.18257942598374 rmsd 4.492093086242676
score 42.18257942598375 rmsd 4.492093086242676
score 42.18257942598375 rmsd 4.492093086242676
score 43.30152587941009 rmsd 4.531303405761719
score 43.17243438822205 rmsd 4.484631538391113
score 43.17243438822205 rmsd 4.484631538391113
score 43.17243438822205 rmsd 4.484631538391113
score 42.29408783527037 rmsd 4.519283771514893
score 42.294087835270375 rmsd 4.519283771514893
score 42.294087835270375 rmsd 4.519283771514893
score 42.294087835270375 rmsd 4.519283771514893
score 42.294087835270375 rmsd 4.519283771514893
score 38.43692600188429 rmsd 4.505205154418945
score 37.51565105159728 rmsd 4.454853057861328
score 37.51565105159728 rmsd 4.454853057861328
score 37.51565105159728 rmsd 4.454853057861328
score 37.

## Recovering the best structure and minimizing

We've run a Monte Carlo simulation, but this doesn't guarantee that the final accepted pose was the best scoring pose that we came across during the simulation. Use MonteCarlo's `recover_low()` function to get the lowest scoring pose. Print the score of this Pose. Does it have a lower score than the score printed for the last step of your fragment insertion code above? How does its score compare to that of the native? What is its CA RMSD?
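
The difference between the last accepted pose and the lowest-scoring pose is already visible in the refinement output above. As a small illustration, the list below holds the first seven distinct scores printed by that loop, rounded to three decimals (no new data); `accepted_scores` is just a name chosen here.

```python
# Scores after successive accepted moves, rounded from the refinement output above.
accepted_scores = [44.726, 44.586, 42.953, 42.183, 43.302, 43.172, 42.294]

last = accepted_scores[-1]
lowest = min(accepted_scores)
lowest_step = accepted_scores.index(lowest)

print(last, lowest, lowest_step)  # 42.294 42.183 3
```

Had the run stopped at this point, the working pose would score 42.294 even though a pose scoring 42.183 had been visited earlier. Recovering that earlier pose is what `recover_low()` is for.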

In [44]:
#28
# YOUR CODE HERE
#mc.recover_low(work_pose)
best_pose = Pose()
mc.recover_low(best_pose)
print("best pose score:", sxfn(best_pose))
print("best pose RMSD:", CA_rmsd(pose_native,best_pose))

best pose score: 33.604221784639265
best pose RMSD: 4.418140411376953


In this lab, we have only used Monte Carlo and not minimization. As a last step, let's use the `MinMover` (located in `pyrosetta.rosetta.protocols.minimization_packing`) to minimize the pose. Code for the `MinMover` is already written in the cell below.
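
To contrast minimization with Monte Carlo sampling, here is a toy sketch of a deterministic downhill search. It deliberately swaps in plain fixed-step gradient descent on a made-up one-dimensional score; it is not the algorithm `MinMover` uses, and `toy_score`, `toy_gradient`, and `gradient_descent` are hypothetical names.

```python
def toy_score(x):
    """Made-up score with a single minimum at x = 2."""
    return (x - 2.0) ** 2 + 1.0

def toy_gradient(x):
    """Derivative of toy_score with respect to x."""
    return 2.0 * (x - 2.0)

def gradient_descent(x, step=0.1, tolerance=1e-6, max_iter=1000):
    """Follow the negative gradient until the slope is nearly flat."""
    scores = [toy_score(x)]
    for _ in range(max_iter):
        g = toy_gradient(x)
        if abs(g) < tolerance:
            break
        x -= step * g
        scores.append(toy_score(x))
    return x, scores

x_min, scores = gradient_descent(10.0)
print(round(x_min, 4), round(scores[0], 4), round(scores[-1], 4))
```

In this toy, every step lowers the score and there is no randomness, so the search ends in the nearest minimum. Monte Carlo moves, by contrast, can accept uphill steps and hop between minima.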

In [45]:
from pyrosetta.rosetta.protocols.minimization_packing import *
min_mover = MinMover()
min_mover.set_movemap(mm)

sxfn = create_score_function('score3') # centroid score function
min_mover.score_function(sxfn) 

Apply the `MinMover` to your recovered pose. Send the changes to PyMOL using the PyMOL mover. Print out the score and the RMSD after minimization. Is the score lower? What about the RMSD?

In [47]:
#29
# YOUR CODE HERE
min_mover.apply(best_pose)
print('Score =', sxfn(best_pose))
print('RMSD =', CA_rmsd(best_pose, pose_native))

Score = 34.67902440093992
RMSD = 4.405145645141602
