# Folding Lab

## Building the pose

In this lab session, you will fold a 30-residue protein with basic movers. Start by initializing PyRosetta as usual.

In [1]:
#1
### BEGIN SOLUTION
from pyrosetta import *
init()
### END SOLUTION

PyRosetta-4 2020 [Rosetta PyRosetta4.conda.mac.python37.Release 2020.02+release.22ef835b4a2647af94fcd6421a85720f07eddf12 2020-01-05T17:31:56] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRosetta Team.
core.init: {0} Checking for fconfig files in pwd and ./rosetta/flags
core.init: {0} Rosetta version: PyRosetta4.conda.mac.python37.Release r242 2020.02+release.22ef835b4a2 22ef835b4a2647af94fcd6421a85720f07eddf12 http://www.pyrosetta.org 2020-01-05T17:31:56
core.init: {0} command: PyRosetta -ex1 -ex2aro -database /Users/paul/anaconda3/envs/pyrosetta_env/lib/python3.7/site-packages/pyrosetta/database
basic.random.init_random_generator: {0} 'RNG device' seed mode, using '/dev/urandom', seed=554548529 seed_offset=0 real_seed=554548529 thread_index=0
basic.random.init_random_generator: {0} RandomGenerator:init: Normal mode, seed=554548529 RG_type=mt19937


We would like to visualize folding as it happens. Before starting the folding protocol, instantiate a PyMOL mover. Make sure you retain history so you can view the entire folding process. Start a PyMOL session on the side and make sure it says `PyMOL <---> PyRosetta link started!` on its command line.

In [2]:
#2
### BEGIN SOLUTION
pmm = PyMOLMover()
pmm.keep_history(True) 
### END SOLUTION

Initialize a pose from sequence using the FASTA file given in this folder, `1qgm.fasta`. Use the PyMOL mover to view the pose. You should see a long thread-like structure. Check the backbone dihedrals of any residue (except the first and last). What are the values of the phi and psi dihedrals?

In [3]:
#3
### BEGIN SOLUTION
start_pose = pose_from_sequence("VIHCDAATICPDGTTCSLSPYGVWYCSPFS")
### END SOLUTION

core.chemical.GlobalResidueTypeSet: {0} Finished initializing fa_standard residue type set.  Created 980 residue types
core.chemical.GlobalResidueTypeSet: {0} Total time to initialize 1.07817 seconds.


Pick any residue in the pose (except for an Ala or Gly) and print out information about it using `pose.residue(resno)`. How many atoms do you see? You might also want to go to PyMOL and take a look at this residue in stick representation.

In [4]:
#4
### BEGIN SOLUTION
print(start_pose.residue(2))
pmm.apply(start_pose)
### END SOLUTION

Residue 2: ILE (ILE, I):
Base: ILE
 Properties: POLYMER PROTEIN CANONICAL_AA HYDROPHOBIC ALIPHATIC METALBINDING ALPHA_AA L_AA
 Variant types:
 Main-chain atoms:  N    CA   C  
 Backbone atoms:    N    CA   C    O    H    HA 
 Side-chain atoms:  CB   CG1  CG2  CD1  HB  1HG1 2HG1 1HG2 2HG2 3HG2 1HD1 2HD1 3HD1
Atom Coordinates:
   N  : 3.33248, 1.53597, 1.45999e-16
   CA : 3.98759, 2.83851, 1.11973e-17
   C  : 5.50383, 2.69251, 3.98699e-16
   O  : 6.02992, 1.57957, -5.72105e-07
   CB : 3.55155, 3.67262, 1.21844
   CG1: 2.55509, 2.8866, 2.07455
   CG2: 2.94643, 4.99377, 0.769523
   CD1: 2.24058, 1.50953, 1.5361
   H  : 3.89946, 0.700128, 2.30928e-07
   HA : 3.69688, 3.37162, -0.904386
   HB : 4.41682, 3.87565, 1.84868
  1HG1: 2.95036, 2.77644, 3.0838
  2HG1: 1.62083, 3.44373, 2.14949
  1HG2: 2.64312, 5.57044, 1.64304
  2HG2: 3.68508, 5.55758, 0.201387
  3HG2: 2.07605, 4.80085, 0.142106
  1HD1: 1.52766, 1.01475, 2.19615
  2HD1: 1.80983, 1.59849, 0.538159
  3HD1: 3.15567, 0.921235, 1.48554
M

## Using centroid mode

Throughout this lab, we will work entirely in centroid mode. Instantiate a `SwitchResidueTypeSetMover` and apply it to the pose. Apply the PyMOL mover again.

In [5]:
#5
### BEGIN SOLUTION
pose = Pose(start_pose)
switch = SwitchResidueTypeSetMover("centroid")
switch.apply(pose)
pmm.apply(pose)
### END SOLUTION

core.chemical.GlobalResidueTypeSet: {0} Finished initializing centroid residue type set.  Created 62 residue types
core.chemical.GlobalResidueTypeSet: {0} Total time to initialize 0.040094 seconds.


Print out information about the same residue using `pose.residue(resno)`. How many atoms do you see now? Is there a change in the stick representation of this residue?

In [6]:
#6
### BEGIN SOLUTION
print(pose.residue(2))
### END SOLUTION

Residue 2: ILE (ILE, I):
Base: ILE
 Properties: POLYMER PROTEIN CANONICAL_AA HYDROPHOBIC ALIPHATIC ALPHA_AA L_AA
 Variant types:
 Main-chain atoms:  N    CA   C  
 Backbone atoms:    N    CA   C    O    H  
 Side-chain atoms:  CB   CEN
Atom Coordinates:
   N  : 3.33248, 1.53597, 1.45999e-16
   CA : 3.98759, 2.83851, 1.11973e-17
   C  : 5.50383, 2.69251, 3.98699e-16
   O  : 6.02992, 1.57957, -5.72105e-07
   CB : 3.56262, 3.68472, 1.214
   CEN: 2.81569, 4.3785, 1.60064
   H  : 3.89946, 0.700128, 2.30928e-07
Mirrored relative to coordinates in ResidueType: FALSE



Since we are working with a centroid residue type set, `ref2015` can no longer be used as a score function. Nonetheless, let's give it a try. Initialize `ref2015` using `get_fa_scorefxn()` and try to score the pose with it. What do you see? Comment this line out (that is, add `#` in front of it) to proceed.

In [7]:
#7
### BEGIN SOLUTION
sfxn_fa = get_score_function()
#sfxn_fa(pose)
### END SOLUTION

core.scoring.ScoreFunctionFactory: {0} SCOREFUNCTION: ref2015
core.scoring.etable: {0} Starting energy table calculation
core.scoring.etable: {0} smooth_etable: changing atr/rep split to bottom of energy well
core.scoring.etable: {0} smooth_etable: spline smoothing lj etables (maxdis = 6)
core.scoring.etable: {0} smooth_etable: spline smoothing solvation etables (max_dis = 6)
core.scoring.etable: {0} Finished calculating energy tables.
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBPoly1D.csv
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBFadeIntervals.csv
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/HBEval.csv
basic.io.database: {0} Database file opened: scoring/score_functions/hbonds/ref2015_params/DonStrength.csv
basic.io.database: {0} Database file op

An appropriate score function for the centroid mode is `score3`. You can initialize it using `create_score_function("score3")`. Score the pose with this score function.

In [8]:
#8
### BEGIN SOLUTION
sfxn_cen = create_score_function("score3")
sfxn_cen(pose)
### END SOLUTION

basic.io.database: {0} Database file opened: scoring/score_functions/EnvPairPotential/env_log.txt
basic.io.database: {0} Database file opened: scoring/score_functions/EnvPairPotential/cbeta_den.txt
basic.io.database: {0} Database file opened: scoring/score_functions/EnvPairPotential/pair_log.txt
basic.io.database: {0} Database file opened: scoring/score_functions/EnvPairPotential/cenpack_log.txt
basic.io.database: {0} Database file opened: scoring/score_functions/SecondaryStructurePotential/phi.theta.36.HS.resmooth
basic.io.database: {0} Database file opened: scoring/score_functions/SecondaryStructurePotential/phi.theta.36.SS.resmooth


156.57643598651632

## Comparing to native

Since the final solution is known to us, let's read in the native file, `1qgm.pdb`, and store it in a new pose. For a fair comparison, let's also apply the switch residue type set mover to it. Score this native pose using `score3`. Why do you think the score is what it is? To visualize this pose, use the PyMOL mover again.

In [9]:
native_pose = pose_from_file("1qgm.pdb")
native_cen = Pose(native_pose)
switch.apply(native_cen)
sfxn_cen(native_cen)

core.import_pose.import_pose: {0} File '1qgm.pdb' automatically determined to be of type PDB
core.conformation.Conformation: {0} Found disulfide between residues 4 16
core.conformation.Conformation: {0} current variant for 4 CYS
core.conformation.Conformation: {0} current variant for 16 CYS
core.conformation.Conformation: {0} current variant for 4 CYD
core.conformation.Conformation: {0} current variant for 16 CYD
core.conformation.Conformation: {0} Found disulfide between residues 10 26
core.conformation.Conformation: {0} current variant for 10 CYS
core.conformation.Conformation: {0} current variant for 26 CYS
core.conformation.Conformation: {0} current variant for 10 CYD
core.conformation.Conformation: {0} current variant for 26 CYD


63.8669820863305

How far is the pose from the native pose? One classic metric is the C-alpha RMSD. In Rosetta, it is calculated using the `CA_rmsd()` function located in `pyrosetta.rosetta.core.scoring`. What is the starting value?

In [10]:
#9
### BEGIN SOLUTION
from pyrosetta.rosetta.core.scoring import CA_rmsd
ca_rmsd = CA_rmsd(native_cen, pose)  # use the name imported on the line above
print("ca rmsd:", ca_rmsd)
### END SOLUTION

ca rmsd: 27.841325759887695


Now that we know how to calculate this RMSD, we can use it to track if we are getting closer to the native structure.
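
To make the metric concrete, here is a minimal NumPy sketch of an RMSD computed after optimal rigid superposition (the Kabsch algorithm). It is an illustration only and is not Rosetta's implementation; the function name `superimposed_rmsd` and the toy coordinates are hypothetical, and the function assumes two arrays of matching C-alpha coordinates in the same order.

```python
import numpy as np

def superimposed_rmsd(coords_a, coords_b):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    # Remove translation by centering both sets on their centroids.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    # Kabsch: the SVD of the covariance matrix yields the optimal rotation.
    u, _, vt = np.linalg.svd(a.T @ b)
    # Flip one axis if needed so the result is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    diff = a @ rotation - b
    return float(np.sqrt((diff ** 2).sum() / len(a)))

# Toy data (not from 1qgm): four points, then the same points rotated and shifted.
pts = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0], [3.0, 1.5, 1.0]])
theta = np.pi / 3
rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta), np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
moved = pts @ rz.T + np.array([5.0, -2.0, 3.0])
stretched = pts * np.array([2.0, 1.0, 1.0])

print("rigidly moved copy:", superimposed_rmsd(pts, moved))
print("stretched copy:", superimposed_rmsd(pts, stretched))
```

A rigidly moved copy should give an RMSD of essentially zero, while a deformed copy gives a positive value. This is why RMSD is a useful measure of how far a conformation is from the native, independent of where the two structures sit in space.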

## Building a basic folding protocol

So finally we can start to fold the protein. The only mobile segment is the backbone. To specify this, make a movemap and set backbone motion to true for all residues, i.e. `set_bb(True)`

In [11]:
mm = MoveMap()
mm.set_bb(True)

Let us try to use some simple movers that we have learnt today to fold the protein. Start by initializing a small mover and a shear mover (located in `pyrosetta.rosetta.protocols.simple_moves`). Set `n_moves = 10` and `kT = 1.0`.

In [12]:
#10
### BEGIN SOLUTION
small = rosetta.protocols.simple_moves.SmallMover()
small.nmoves(10)
small.temperature(1.0)

shear = rosetta.protocols.simple_moves.ShearMover()
shear.nmoves(10)
shear.temperature(1.0)
### END SOLUTION

To apply them one after the other, use a sequence mover (located in `pyrosetta.rosetta.protocols.moves`).

The SequenceMover is a mover that holds other movers and applies them "in sequence." I.e. if you have three movers, `m1`, `m2`, and `m3`, and you add them to a SequenceMover in that order, then when you call the SequenceMover's `apply` method, it will first invoke `m1`'s `apply`, then `m2`'s `apply`, and finally `m3`'s `apply`.
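
The behaviour described above can be sketched in plain Python. This is a toy stand-in, not the PyRosetta class: `ToySequenceMover` and `AppendMover` are hypothetical names, and the "pose" is just a list that records which mover touched it.

```python
class ToySequenceMover:
    """Toy stand-in for a sequence mover: applies its movers in the order added."""

    def __init__(self):
        self.movers = []

    def add_mover(self, mover):
        self.movers.append(mover)

    def apply(self, pose):
        for mover in self.movers:
            mover.apply(pose)


class AppendMover:
    """Hypothetical mover that records its name on a list-based 'pose'."""

    def __init__(self, name):
        self.name = name

    def apply(self, pose):
        pose.append(self.name)


toy_seq = ToySequenceMover()
for name in ("m1", "m2", "m3"):
    toy_seq.add_mover(AppendMover(name))

toy_pose = []
toy_seq.apply(toy_pose)
print(toy_pose)  # ['m1', 'm2', 'm3']
```

Because the container exposes the same `apply` interface as the movers it holds, it can itself be handed to anything that expects a mover.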

Add the above two movers to your sequence mover using its `add_mover()` function.

In [13]:
#11
### BEGIN SOLUTION

seq = pyrosetta.rosetta.protocols.moves.SequenceMover()
seq.add_mover(small)
seq.add_mover(shear)

### END SOLUTION

To perform a Monte Carlo simulation, we first need a `MonteCarlo` object. Use the constructor for MonteCarlo that takes as arguments a pose, a score function, and a kT "temperature" as its parameters. Use a temperature of 1.

You're probably wondering: what score function and what pose should you use? The MonteCarlo object will use the Pose that you give it as its starting point. It should be the centroid pose that you created from the FASTA sequence. (It should not be the native pose you read in from the PDB file!) We will want the centroid score function to go along with the centroid pose.

The way we'll use the MonteCarlo object is that we'll perturb the Pose `p`, and then after the perturbation, ask the MonteCarlo object to accept or reject the perturbation.

```
   for _ in range(100):
       perturber.apply(p)
       mc.boltzmann(p) # accept or reject the change to p
```
Internally, the MonteCarlo object keeps a copy of the pose in its most-recently-accepted conformation so that if the Metropolis criterion calls for the rejection of the perturbation, then the MonteCarlo object can copy the prior conformation back into `p`. I.e., the copying resets `p`'s conformation to how it was before the perturbation took place.

The MonteCarlo object is keeping a second copy of the pose that represents the best-ever-seen conformation. We will talk about that copy later.
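
The bookkeeping described above can be sketched on a single number instead of a Pose. This is a toy model under stated assumptions, not the PyRosetta `MonteCarlo` class: `ToyMonteCarlo` and `toy_score` are hypothetical, and the toy `boltzmann` returns the state to continue from rather than modifying a pose in place.

```python
import math
import random


class ToyMonteCarlo:
    """Toy Metropolis bookkeeping on a single number instead of a Pose."""

    def __init__(self, state, score_fn, kT):
        self.score_fn = score_fn
        self.kT = kT
        self.last_accepted = state  # most recently accepted state
        self.lowest = state         # best-ever-seen state

    def boltzmann(self, trial_state, rng=random):
        """Return the trial state if accepted, otherwise the last accepted state."""
        delta = self.score_fn(trial_state) - self.score_fn(self.last_accepted)
        # Downhill moves are always accepted; uphill ones with probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / self.kT):
            self.last_accepted = trial_state
            if self.score_fn(trial_state) < self.score_fn(self.lowest):
                self.lowest = trial_state
        return self.last_accepted


def toy_score(x):
    """Hypothetical one-dimensional 'energy' with its minimum at x = 3."""
    return (x - 3.0) ** 2


rng = random.Random(0)
toy_mc = ToyMonteCarlo(state=10.0, score_fn=toy_score, kT=1.0)
x = 10.0
for _ in range(200):
    x = toy_mc.boltzmann(x + rng.uniform(-1.0, 1.0), rng)

print("last accepted:", x, "lowest seen:", toy_mc.lowest)
```

Note that the last accepted state and the lowest-scoring state are tracked separately. A higher kT makes `exp(-delta / kT)` larger, so more uphill moves are accepted.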

Construct your MonteCarlo object below.

In [14]:
#12
### BEGIN SOLUTION

mc = MonteCarlo(pose, sfxn_cen, 1.0)  # kT = 1, as the instructions above specify

### END SOLUTION

As we are nesting movers, it will be convenient to use a `TrialMover` to invoke some other Mover's `apply` function and then invoke the MonteCarlo object's `boltzmann` function. That is, the TrialMover's `apply` function is basically:

```
class TrialMover(Mover):
   def __init__(self, mover: Mover, mc: MonteCarlo):
      self.mover = mover
      self.mc = mc
   def apply(self, p: Pose):
      self.mover.apply(p)    # perturb the pose
      self.mc.boltzmann(p)   # accept or reject the perturbation
```

(although it is actually implemented in C++)

Why create a class to call a function? Why not just call the function? Muse on this.

Initialize a `TrialMover` with the sequence mover (the one containing the small and shear movers) and the Monte Carlo object as input.

In [15]:
#13
### BEGIN SOLUTION

trial = TrialMover(seq, mc)

### END SOLUTION

Before making any changes, let's keep a copy of the starting pose handy, because we'll need it later. To do so use `pose.clone()`.

In [16]:
#14
# pose_start = ...
### BEGIN SOLUTION

pose_start = pose.clone()

### END SOLUTION

## Applying the folding protocol

OK!

We now have all of the elements we need in order to put together and run our small- and shear-mover-based-folding protocol. Let's assemble them.

What we will do is invoke the TrialMover's `apply` method. The TrialMover will invoke the SequenceMover's `apply` method and then invoke the MonteCarlo object's `boltzmann` method. The SequenceMover will invoke the small and shear movers in sequence.

Just applying the TrialMover once, however, will not be sufficient to fold the protein! Apply this mover 100 times to do so (with a for loop). Inside this loop, also store the score of the pose and the C-alpha RMSD in two `numpy` arrays (look back to Session 1) after every apply of the trial mover. Also use the PyMOL mover to update the pose in PyMOL so that you can create a movie of your protocol.

We are going to have you run this cell more than once; to make sure results of previous runs don't bleed through, add a call to MonteCarlo's `reset` function, handing it the conformation of the original pose. Also, copy the `pose_start` pose into the working `pose`.

In [17]:
#15
### BEGIN SOLUTION

import numpy
n_iterations = 100
score = numpy.zeros((n_iterations,),dtype=float)
ca_rmsd = numpy.zeros((n_iterations,), dtype=float)

work_pose = Pose(pose_start)  # start every run from the saved starting conformation
mc.reset(work_pose)

for i in range(n_iterations):
    trial.apply(work_pose)
    sc = sfxn_cen(work_pose)
    score[i] = sc
    rmsd = CA_rmsd(native_cen,work_pose)
    ca_rmsd[i] = rmsd
    pmm.apply(work_pose)  # send the current frame to PyMOL to build the movie
    print(ca_rmsd[i], score[i])
### END SOLUTION

core.scoring.ramachandran: {0} shapovalov_lib::shap_rama_smooth_level of 4( aka highest_smooth ) got activated.
basic.io.database: {0} Database file opened: scoring/score_functions/rama/shapovalov/kappa25/all.ramaProb
basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/avg_L_rama.dat
core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/avg_L_rama.dat.
basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/sym_all_rama.dat
core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/sym_all_rama.dat.
basic.io.database: {0} Database file opened: scoring/score_functions/rama/flat/sym_G_rama.dat
core.scoring.ramachandran: {0} Reading custom Ramachandran table from scoring/score_functions/rama/flat/sym_G_rama.dat.
basic.io.database: {0} Database file opened: scoring/score_funct

Now, using a loop, print out the scores and the RMSDs after every iteration. How close do you get to the native using this folding algorithm? Did you notice the decrease in RMSD stalling after a while? Rerun the cell above a couple of times (taking care that the results of your previous execution don't bleed through). Do you ever go below a certain number? If you "play" the states you have in PyMOL, do your structures look like they are becoming very protein-like?

In [18]:
#16
### BEGIN SOLUTION

for i in range(n_iterations):
    print("sc:", score[i], "rmsd", ca_rmsd[i])

### END SOLUTION

sc: 156.32892280333309 rmsd 27.815967559814453
sc: 156.33145739851804 rmsd 27.821609497070312
sc: 156.29321151268505 rmsd 27.782455444335938
sc: 156.50862903761862 rmsd 27.802080154418945
sc: 156.42353852154233 rmsd 27.753732681274414
sc: 156.24469036225395 rmsd 27.703907012939453
sc: 156.0729130352496 rmsd 27.658042907714844
sc: 156.0742537407675 rmsd 27.66314697265625
sc: 156.1513721178805 rmsd 27.616924285888672
sc: 156.20253281494044 rmsd 27.639799118041992
sc: 156.0179010582191 rmsd 27.62373161315918
sc: 156.10741127395818 rmsd 27.662065505981445
sc: 156.19955442899996 rmsd 27.687942504882812
sc: 156.1648930963818 rmsd 27.688819885253906
sc: 156.03127319394608 rmsd 27.685834884643555
sc: 155.8549577593573 rmsd 27.632946014404297
sc: 155.75580492356178 rmsd 27.599992752075195
sc: 155.66347603905584 rmsd 27.565946578979492
sc: 155.7732430149008 rmsd 27.60748863220215
sc: 155.8827146379476 rmsd 27.589500427246094
sc: 155.7323412400924 rmsd 27.505813598632812
sc: 155.5634007559667 rms

## Increasing move size

> If you have less than 30 minutes left, skip this section and proceed directly to the fragment-insertion section.

One possible reason for this stagnation is that the maximum angle deviation of small movers is very small: 0, 5, and 6 degrees for helices, sheets, and loops, respectively. This approach will likely take forever to fold this protein. What if we introduce larger changes? Let's try it! Use `RandomTorsionMover` (also found in `pyrosetta.rosetta.protocols.simple_moves`) to make larger torsion changes. `RandomTorsionMover` can be initialized with three parameters: `movemap`, `max_angle`, and `num_moves`. Set `max_angle` to 20 (you can try a few values) and `num_moves` to 10.

In [19]:
#17
### BEGIN SOLUTION

rtor = pyrosetta.rosetta.protocols.simple_moves.RandomTorsionMover(mm, 20, 10)

### END SOLUTION

Add this mover to your sequence mover (which still contains the small and shear movers).

In [20]:
#18
### BEGIN SOLUTION
seq.add_mover(rtor)
### END SOLUTION

Now, copy-paste the loop and score printing from above to run the trial mover. Before you run it, however, you might want to clear out your PyMOL session by typing `delete all` in the PyMOL terminal.

In [21]:
#19
### BEGIN SOLUTION

import numpy
n_iterations = 100
score = numpy.zeros((n_iterations,),dtype=float)
ca_rmsd = numpy.zeros((n_iterations,), dtype=float)

work_pose = Pose(pose_start)  # start every run from the saved starting conformation
mc.reset(work_pose)

for i in range(n_iterations):
    trial.apply(work_pose)
    sc = sfxn_cen(work_pose)
    score[i] = sc
    rmsd = CA_rmsd(native_cen,work_pose)
    ca_rmsd[i] = rmsd
    pmm.apply(work_pose)

for i in range(n_iterations):
    print("sc:", score[i], "rmsd", ca_rmsd[i])
    
### END SOLUTION

sc: 155.83710891435132 rmsd 27.37666130065918
sc: 155.97121633723128 rmsd 27.443342208862305
sc: 155.70910304246863 rmsd 27.318523406982422
sc: 155.5425327118656 rmsd 27.226299285888672
sc: 154.40197573151906 rmsd 26.840906143188477
sc: 154.40197573151906 rmsd 26.840906143188477
sc: 154.70860714288682 rmsd 26.961524963378906
sc: 155.42932182433054 rmsd 27.441179275512695
sc: 155.42932182433054 rmsd 27.441179275512695
sc: 155.6404599083312 rmsd 27.54583740234375
sc: 155.21323769459855 rmsd 27.32180404663086
sc: 155.21323769459855 rmsd 27.32180404663086
sc: 155.00432151509227 rmsd 27.234228134155273
sc: 154.68867523310522 rmsd 27.047393798828125
sc: 154.45299374264823 rmsd 26.897010803222656
sc: 155.16288818216793 rmsd 27.17877769470215
sc: 154.6343395473952 rmsd 26.972036361694336
sc: 154.23970671780543 rmsd 26.88701057434082
sc: 154.15147903155741 rmsd 26.97064971923828
sc: 154.34541357439872 rmsd 27.04338836669922
sc: 153.89983081894607 rmsd 26.820337295532227
sc: 153.89983081894607 r

Do larger moves help? 
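
One hedged way to compare runs is to summarize the recorded arrays rather than reading the printout line by line. The sketch below uses a hypothetical helper, `summarize_trajectory`, and the first few values printed above (rounded to two decimals). It treats a score identical to the previous step's as a rejected trial, since a rejected move restores the previous conformation.

```python
import numpy as np


def summarize_trajectory(scores, rmsds):
    """Summarize per-iteration scores and RMSDs recorded during a run."""
    scores = np.asarray(scores, dtype=float)
    rmsds = np.asarray(rmsds, dtype=float)
    # A repeated score means the pose did not change, i.e. the trial was rejected.
    rejected = int(np.sum(scores[1:] == scores[:-1]))
    return {
        "lowest_score": float(scores.min()),
        "lowest_rmsd": float(rmsds.min()),
        "step_of_lowest_rmsd": int(rmsds.argmin()),
        "rejected_steps": rejected,
    }


# First nine iterations of the run above, rounded.
example_scores = [155.84, 155.97, 155.71, 155.54, 154.40, 154.40, 154.71, 155.43, 155.43]
example_rmsds = [27.38, 27.44, 27.32, 27.23, 26.84, 26.84, 26.96, 27.44, 27.44]

summary = summarize_trajectory(example_scores, example_rmsds)
print(summary)
```

In this excerpt two steps repeat the previous score, so two trials were rejected. Running the same summary on the full `score` and `ca_rmsd` arrays from each protocol gives a quick side-by-side comparison.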

## Fragment insertion

As a last part to this exercise, let's try fragment insertion as the first step in the sequence mover. Code to create a fragment set with 3-mer fragments is already written in the cell below. Make another fragment set of length 9 and read in the 9-mer file supplied, `1qgm.9mers`.

In [22]:
#20
from pyrosetta.rosetta.core.fragment import *
fragset3 = ConstantLengthFragSet(3)
fragset3.read_fragment_file("1qgm.3mers")

### BEGIN SOLUTION
fragset9 = ConstantLengthFragSet(9)
fragset9.read_fragment_file("1qgm.9mers")
### END SOLUTION

core.fragments.ConstantLengthFragSet: {0} finished reading top 200 3mer fragments from file 1qgm.3mers
core.fragments.ConstantLengthFragSet: {0} finished reading top 200 9mer fragments from file 1qgm.9mers


Construct a 3-mer and a 9-mer fragment set mover. Code for the 3-mer fragment mover is already written in the cell below. Similarly, make a 9-mer fragment mover.

In [23]:
#21
from pyrosetta.rosetta.protocols.simple_moves import ClassicFragmentMover
mover_3mer = ClassicFragmentMover(fragset3, mm)

### BEGIN SOLUTION
mover_9mer = ClassicFragmentMover(fragset9, mm)
### END SOLUTION
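
Conceptually, a fragment insertion overwrites the backbone torsions of a short window of residues with torsions taken from a fragment. The toy sketch below shows that idea on a list of `(phi, psi)` pairs. It is not how `ClassicFragmentMover` is implemented; `insert_fragment` is a hypothetical helper, and the helical angles are illustrative values rather than entries from `1qgm.3mers`.

```python
def insert_fragment(torsions, fragment, start):
    """Return a copy of torsions with the window at `start` replaced by `fragment`."""
    if start < 0 or start + len(fragment) > len(torsions):
        raise ValueError("fragment does not fit at this position")
    result = list(torsions)
    result[start:start + len(fragment)] = fragment
    return result


# Toy extended chain of 8 residues, every (phi, psi) pair set to (180, 180).
chain = [(180.0, 180.0)] * 8
# Hypothetical helix-like 3-mer fragment.
helix_3mer = [(-60.0, -45.0)] * 3

folded = insert_fragment(chain, helix_3mer, start=2)
print(folded)
```

A single insertion changes several residues at once, which is why fragment moves can reshape the chain much faster than small per-torsion perturbations. A 9-mer changes a longer window than a 3-mer.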

Put the 9-mer mover into a RepeatMover (from `pyrosetta.rosetta.protocols.moves`) which repeats 1 time and the 3-mer mover into a separate RepeatMover to repeat 1 time.

In [24]:
#22
### BEGIN SOLUTION

from pyrosetta.rosetta.protocols.moves import RepeatMover

repeat_9mer_frags = RepeatMover(mover_9mer, 1)
repeat_3mer_frags = RepeatMover(mover_3mer, 1)


### END SOLUTION

Just like with the `RandomTorsionMover` above, reassign pose to the `pose_start` conformation and `reset` your MonteCarlo object. Also, clear out your PyMOL session by typing `delete all` in the PyMOL terminal.

In [25]:
#23
### BEGIN SOLUTION
work_pose.assign(pose_start)
mc.reset(work_pose)
### END SOLUTION

Using the 9-mer `RepeatMover` and your Monte Carlo object, create a new `TrialMover` and apply it to the pose 100 times. Use `PyMOLMover` to visualize the changes. Print out the score and the RMSD after the 9-mer insertions. Does fragment insertion help?

Repeat the same steps for the 3-mer `RepeatMover` (picking up from where the 9-mer insertion left off). Is the RMSD lower now than when using the small- and shear-movers?

In [26]:
#24
### BEGIN SOLUTION

import numpy
n_iterations = 100
score = numpy.zeros((n_iterations,),dtype=float)
ca_rmsd = numpy.zeros((n_iterations,), dtype=float)

work_pose = Pose(pose_start)
mc.reset(work_pose)
mc.set_temperature(2.0) # I'm not asking the students to do this; I'm just testing stuff

trial9 = TrialMover(repeat_9mer_frags, mc)
trial3 = TrialMover(repeat_3mer_frags, mc)

for i in range(n_iterations):
    trial9.apply(work_pose)
    sc = sfxn_cen(work_pose)
    score[i] = sc
    rmsd = CA_rmsd(native_cen,work_pose)
    ca_rmsd[i] = rmsd
    pmm.apply(work_pose)

for i in range(n_iterations):
    print("9mer insertion -- sc:", score[i], "rmsd", ca_rmsd[i])

for i in range(n_iterations):
    trial3.apply(work_pose)
    sc = sfxn_cen(work_pose)
    score[i] = sc
    rmsd = CA_rmsd(native_cen,work_pose)
    ca_rmsd[i] = rmsd
    pmm.apply(work_pose)

for i in range(n_iterations):
    print("3mer insertion -- sc:", score[i], "rmsd", ca_rmsd[i])


    
### END SOLUTION

9mer insertion -- sc: 134.67511687539587 rmsd 21.317359924316406
9mer insertion -- sc: 97.40257983020396 rmsd 9.296319961547852
9mer insertion -- sc: 97.40257983020396 rmsd 9.296319961547852
9mer insertion -- sc: 90.4629502268659 rmsd 9.579957962036133
9mer insertion -- sc: 90.4629502268659 rmsd 9.579957962036133
9mer insertion -- sc: 90.4629502268659 rmsd 9.579957962036133
9mer insertion -- sc: 90.4629502268659 rmsd 9.579957962036133
9mer insertion -- sc: 87.7814996143404 rmsd 8.737770080566406
9mer insertion -- sc: 87.7814996143404 rmsd 8.737770080566406
9mer insertion -- sc: 87.7814996143404 rmsd 8.737770080566406
9mer insertion -- sc: 87.7814996143404 rmsd 8.737770080566406
9mer insertion -- sc: 81.38453459426358 rmsd 8.150049209594727
9mer insertion -- sc: 81.38453459426358 rmsd 8.150049209594727
9mer insertion -- sc: 81.38453459426358 rmsd 8.150049209594727
9mer insertion -- sc: 81.38453459426358 rmsd 8.150049209594727
9mer insertion -- sc: 81.38453459426358 rmsd 8.15004920959472

Let's try to refine the structure that comes out of the 9- and 3-mer fragment insertion stage above. Construct a new sequence mover with just the small and shear movers (i.e. no `RandomTorsionMover`).

In [27]:
#25
### BEGIN SOLUTION

seq_ss = pyrosetta.rosetta.protocols.moves.SequenceMover()
seq_ss.add_mover(small)
seq_ss.add_mover(shear)

### END SOLUTION

Construct a trial mover with the new sequence mover and the `MonteCarlo` object.

In [28]:
#26
### BEGIN SOLUTION

trial_ss = TrialMover(seq_ss, mc)

### END SOLUTION

Now, copy-paste the trial-mover loop and score printing from above to run this refinement stage of the protocol. This time, however, do not reset the MonteCarlo object or reassign your working pose to `pose_start`.

In [29]:
#27
### BEGIN SOLUTION
mc.set_temperature(1.0)
for i in range(n_iterations):
    trial_ss.apply(work_pose)
    sc = sfxn_cen(work_pose)
    score[i] = sc
    rmsd = CA_rmsd(native_cen,work_pose)
    ca_rmsd[i] = rmsd
    pmm.apply(work_pose)

for i in range(n_iterations):
    print("small/shear refinement -- sc:", score[i], "rmsd", ca_rmsd[i])


### END SOLUTION

small/shear refinement -- sc: 37.728442335618055 rmsd 4.81954288482666
small/shear refinement -- sc: 37.728442335618055 rmsd 4.81954288482666
small/shear refinement -- sc: 37.728442335618055 rmsd 4.81954288482666
small/shear refinement -- sc: 37.728442335618055 rmsd 4.81954288482666
small/shear refinement -- sc: 37.728442335618055 rmsd 4.81954288482666
small/shear refinement -- sc: 37.728442335618055 rmsd 4.81954288482666
small/shear refinement -- sc: 37.728442335618055 rmsd 4.81954288482666
small/shear refinement -- sc: 37.728442335618055 rmsd 4.81954288482666
small/shear refinement -- sc: 37.728442335618055 rmsd 4.81954288482666
small/shear refinement -- sc: 37.19974132333735 rmsd 4.839699745178223
small/shear refinement -- sc: 37.19974132333735 rmsd 4.839699745178223
small/shear refinement -- sc: 37.19974132333735 rmsd 4.839699745178223
small/shear refinement -- sc: 37.19974132333735 rmsd 4.839699745178223
small/shear refinement -- sc: 37.19974132333735 rmsd 4.839699745178223
small/

## Recovering the best structure and minimizing

We've run a Monte Carlo simulation, but this doesn't guarantee that the final accepted pose was the best scoring pose that we came across during the simulation. Use MonteCarlo's `recover_low()` function to get the lowest scoring pose. Print the score of this Pose. Does it have a lower score than the score printed for the last step of your fragment insertion code above? How does its score compare to that of the native? What is its CA RMSD?

In [30]:
#28
### BEGIN SOLUTION
best_pose = Pose()
mc.recover_low(best_pose)
print("best pose score:", sfxn_cen(best_pose))
print("best pose RMSD:", CA_rmsd(native_cen,best_pose))
### END SOLUTION

best pose score: 33.492880008158416
best pose RMSD: 5.02931022644043


In this lab, we have only used Monte Carlo and not minimization. As a last step, let's use the `MinMover` (located in `pyrosetta.rosetta.protocols.minimization_packing`) to minimize the pose. Code for the `MinMover` is already written in the cell below.

In [31]:
from pyrosetta.rosetta.protocols.minimization_packing import *
min_mover = MinMover()
min_mover.set_movemap(mm)
min_mover.score_function(sfxn_cen)

Apply the `MinMover` to your recovered pose. Send the changes to PyMOL using the PyMOL mover. Print out the score and the RMSD after minimization. Is the score lower? What about the RMSD?

In [32]:
#29
### BEGIN SOLUTION
min_mover.apply(best_pose)
print("after minimization:", sfxn_cen(best_pose), "rmsd:", CA_rmsd(native_cen, best_pose))
### END SOLUTION

after minimization: 33.4034475648057 rmsd: 5.031933307647705
