Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [None]:
NAME = "Mohini Khedekar"
COLLABORATORS = ""

---

# Mover Lab
In this lab, you will learn to use Movers to manipulate poses. 

In [1]:
import sys
if 'google.colab' in sys.modules:
  !pip install pyrosettacolabsetup
  import pyrosettacolabsetup
  pyrosettacolabsetup.mount_pyrosetta_install()
import pyrosetta
pyrosetta.init()


Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting pyrosettacolabsetup
  Downloading pyrosettacolabsetup-1.0.6-py3-none-any.whl (4.7 kB)
Installing collected packages: pyrosettacolabsetup
Successfully installed pyrosettacolabsetup-1.0.6
Mounted at /content/google_drive
Looking for compatible PyRosetta wheel file at google-drive/PyRosetta/colab.bin/wheels...
Found compatible wheel: /content/google_drive/MyDrive/PyRosetta/colab.bin/wheels//content/google_drive/MyDrive/PyRosetta/colab.bin/wheels/pyrosetta-2023.19+release.d7aa7f94e8b-cp310-cp310-linux_x86_64.whl
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


PyRosetta-4 2023 [Rosetta PyRosetta4.MinSizeRel.python310.ubuntu 2023.19+release.d7aa7f94e8be5e9d5110d37f167c2a7afd30c530 2023-05-08T16:22:16] retrieved from: http://www.pyrosetta.org
(C) Copyright Rosetta Commons Member Institutions. Created in JHU by Sergey Lyskov and PyRo

In [None]:
# cd into the right directory here:

In [7]:
%cd /content/google_drive/MyDrive/codeschool2023

/content/google_drive/MyDrive/codeschool2023


Let's load the structure 2WPT, which is a complex of protein Im2 and colicin E9 DNase. Researchers have introduced various mutations to the interface to study the changes of binding free energy.

In [8]:
pose = pyrosetta.rosetta.core.import_pose.pose_from_file("/content/google_drive/MyDrive/codeschool2023/2wpt.pdb")

core.import_pose.import_pose: File '/content/google_drive/MyDrive/codeschool2023/2wpt.pdb' automatically determined to be of type PDB


Visualize this wild-type version of the complex in PyMOL. Color the structure by chain. How many proteins are there?

In [None]:
# Type your answers here and take a screenshot of your PyMOL session to submit.
# Screenshot uploaded to GitHub.
# There are two chains that are not connected to each other, so the complex contains two proteins.

## Backbone movers
Let's try to modify the protein backbone. The simplest way to sample backbone conformations is to introduce random perturbations. The SmallMover makes small, independent random perturbations to the phi and psi torsion angles of random residues. It uses the rama (Ramachandran) score to ensure that only favorable backbone torsion angles are selected. Let's initialize a SmallMover and let it introduce 10 random perturbations.
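
As a rough illustration of what "small independent random perturbations" means, the sketch below perturbs the phi and psi values of randomly chosen residues stored in a plain Python list. It is a toy model and not Rosetta's implementation: the function `toy_small_move`, the `max_angle` parameter, and the starting angles are assumptions made for this example, and the rama-based check described above is left out.

```python
import random

def toy_small_move(torsions, nmoves, max_angle=5.0, rng=None):
    """Return a copy of `torsions` after `nmoves` small random phi/psi perturbations.

    `torsions` is a list of (phi, psi) tuples in degrees. Each move picks one
    residue at random and shifts its phi and psi independently by at most
    `max_angle` degrees. No Ramachandran check is applied in this toy version.
    """
    rng = rng or random.Random()
    perturbed = list(torsions)
    for _ in range(nmoves):
        i = rng.randrange(len(perturbed))
        phi, psi = perturbed[i]
        perturbed[i] = (
            phi + rng.uniform(-max_angle, max_angle),
            psi + rng.uniform(-max_angle, max_angle),
        )
    return perturbed

# Hypothetical starting backbone: 20 residues, all at (-60, -45) degrees.
start = [(-60.0, -45.0)] * 20
moved = toy_small_move(start, nmoves=10, rng=random.Random(0))

# Residues whose angles differ from the starting values.
changed = [i for i, (a, b) in enumerate(zip(start, moved)) if a != b]
print(f"{len(changed)} residues changed:", changed)
```

Because each move picks a residue independently, the same residue can be hit more than once, so 10 moves change at most 10 residues.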

In [9]:
# small_mover is an instance of the SmallMover class
small_mover = pyrosetta.rosetta.protocols.simple_moves.SmallMover()

# set the number of perturbations to make
small_mover.nmoves(10)

small_mover.apply(pose)

In [12]:
# now dump this new pose to a PDB file and once again visualize in PyMOL (take a screenshot to submit)
# name it 2wpt_small.pdb

pose.dump_pdb("2wpt_small.pdb")

True

In PyMOL, compare the structures before and after perturbation. Do you find anything weird? Yes, the C-terminus changes much more than the N-terminus. This is called the lever-arm effect in backbone sampling: a change at one residue propagates to all of its downstream residues. Because of the lever-arm effect, backbone perturbations are not local, and bad contacts can easily be introduced.

The ShearMover reduces the lever-arm effect. Instead of sampling backbone torsions independently, it changes the torsions of two consecutive residues together so that the downstream lever-arm effect is reduced. Let's import a fresh pose, initialize a ShearMover, and let it introduce 100 perturbations.
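
The lever-arm effect, and the idea of compensating for it, can be sketched with a planar chain of unit-length bonds. This is only a 2D analogy under assumed names (`chain_coords`, `displacements`) and made-up parameters; real backbones are three-dimensional and the ShearMover works on phi/psi torsions, so treat this as intuition rather than a description of Rosetta's algorithm.

```python
import math

def chain_coords(turn_angles, bond_length=1.0):
    """Build a planar chain; each entry is the turn (radians) applied before the next bond."""
    x, y, heading = 0.0, 0.0, 0.0
    coords = [(x, y)]
    for turn in turn_angles:
        heading += turn
        x += bond_length * math.cos(heading)
        y += bond_length * math.sin(heading)
        coords.append((x, y))
    return coords

def displacements(reference, perturbed):
    """Distance each point moved between two chains of equal length."""
    return [math.dist(p, q) for p, q in zip(reference, perturbed)]

n_bonds = 50
base = [0.0] * n_bonds            # a straight chain
delta = math.radians(10)

# Single perturbation: everything downstream of bond 10 swings like a lever.
single = list(base)
single[10] += delta

# Compensated perturbation: the next turn undoes the change in direction.
shear = list(base)
shear[10] += delta
shear[11] -= delta

reference = chain_coords(base)
d_single = displacements(reference, chain_coords(single))
d_shear = displacements(reference, chain_coords(shear))

print("single: displacement at point 11 and at the chain end:",
      round(d_single[11], 3), round(d_single[-1], 3))
print("shear:  displacement at point 11 and at the chain end:",
      round(d_shear[11], 3), round(d_shear[-1], 3))
```

With a single perturbation the displacement grows with distance from the perturbed bond, while the compensated version leaves a constant, small offset downstream. That offset is why the lever-arm effect is reduced but not eliminated.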

In [14]:
pose = pyrosetta.rosetta.core.import_pose.pose_from_file('/content/google_drive/MyDrive/codeschool2023/2wpt.pdb')
shear_mover = pyrosetta.rosetta.protocols.simple_moves.ShearMover()
shear_mover.nmoves(100)
shear_mover.apply(pose)

# now dump this new pose to a PDB file and once again visualize in PyMOL (take a screenshot to submit)
# name it 2wpt_shear.pdb
pose.dump_pdb("2wpt_shear.pdb")



core.import_pose.import_pose: File '/content/google_drive/MyDrive/codeschool2023/2wpt.pdb' automatically determined to be of type PDB


True

Now you should see that the lever-arm effect is reduced, but not completely gone. 

"Backrub" is one method to achieve truly local sampling. The trade-off is that backbone bond angles change slightly. Initialize a BackrubMover and apply it 100 times.

In [16]:
pose = pyrosetta.rosetta.core.import_pose.pose_from_file('/content/google_drive/MyDrive/codeschool2023/2wpt.pdb')
br_mover = pyrosetta.rosetta.protocols.backrub.BackrubMover()
for i in range(100):
    br_mover.apply(pose)

# now dump this new pose to a PDB file and once again visualize in PyMOL (take a screenshot to submit)
# name it 2wpt_backrub.pdb

pose.dump_pdb("2wpt_backrub.pdb")

core.import_pose.import_pose: File '/content/google_drive/MyDrive/codeschool2023/2wpt.pdb' automatically determined to be of type PDB
core.mm.MMBondAngleLibrary: MM bond angle sets added fully assigned: 604; wildcard: 0 and 1 virtual parameter.
basic.io.database: Database file opened: sampling/branch_angle/branch_angle_1.txt
basic.io.database: Database file opened: sampling/branch_angle/branch_angle_2.txt
protocols.backrub.BackrubMover: Segment lengths: 3-34 atoms
protocols.backrub.BackrubMover: Main chain pivot atoms: CA
protocols.backrub.BackrubMover: Adding backrub segments for residues 1-200
protocols.backrub.BackrubMover: Total Segments Added: 1778


True

Now you can see that the perturbations are evenly distributed throughout the structure.
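
The local character of a backrub-style move can be illustrated with a rigid rotation of a short segment about the axis through two pivot atoms. The sketch below is a geometric toy, not the BackrubMover itself: the helper names, the zigzag coordinates, and the 20-degree angle are assumptions chosen for this example.

```python
import math

def rotate_about_axis(point, a, b, angle):
    """Rotate `point` by `angle` radians about the line through `a` and `b` (Rodrigues' formula)."""
    axis = [bi - ai for ai, bi in zip(a, b)]
    length = math.sqrt(sum(c * c for c in axis))
    k = [c / length for c in axis]                    # unit vector along the axis
    v = [pi - ai for pi, ai in zip(point, a)]         # point relative to pivot a
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return tuple(r + ai for r, ai in zip(rotated, a))

def backrub_like_move(coords, start, end, angle):
    """Rotate the atoms strictly between the pivots `start` and `end`; all others stay put."""
    moved = list(coords)
    for i in range(start + 1, end):
        moved[i] = rotate_about_axis(coords[i], coords[start], coords[end], angle)
    return moved

# Hypothetical zigzag trace of 10 "CA atoms" (arbitrary units).
trace = [(float(i), float(i % 2), 0.0) for i in range(10)]
moved = backrub_like_move(trace, start=2, end=5, angle=math.radians(20))

shifts = [math.dist(p, q) for p, q in zip(trace, moved)]
print([round(s, 3) for s in shifts])
```

Only the atoms between the two pivots move, and distances within the rotated segment and to the pivots are preserved. What changes is the geometry at the pivots, which corresponds to the slight bond-angle changes mentioned above.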

## Mutate residues
Protein designers constantly explore conformation and sequence spaces of proteins. You already learned methods to sample the backbone conformation space, now it's time to consider introducing mutations.

A previous study showed that the N34V and R38T mutations on chain A change the binding free energy by -2.60 kcal/mol. Let's introduce these two mutations into our structure. Again, import a fresh pose.

In [17]:
pose = pyrosetta.rosetta.core.import_pose.pose_from_file('/content/google_drive/MyDrive/codeschool2023/2wpt.pdb')

core.import_pose.import_pose: File '/content/google_drive/MyDrive/codeschool2023/2wpt.pdb' automatically determined to be of type PDB


In Rosetta, residues in a pose are numbered from 1 to N, where N is the total number of residues. This indexing differs from the numbering in a PDB file. For example, the first lysine in our structure has Rosetta index 1, but its PDB index is A4. To introduce mutations, we first need to find the Rosetta indices of the residues of interest. As we have done before, we will turn to the PDBInfo object attached to the pose.

In [18]:
print(pose.pdb_info().pdb2pose('A', 34))
print(pose.pdb_info().pdb2pose('A', 38))

31
35
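
The offset between the two numbering schemes can be reproduced with a plain dictionary. This is a sketch of the idea only, not how PDBInfo is implemented. The helper `build_pdb_to_pose` and the residue ranges are hypothetical: only the fact that chain A starts at A4 comes from the text, and the example assumes no gaps or insertion codes.

```python
def build_pdb_to_pose(residues):
    """Map (chain, PDB residue number) -> 1-based pose index, in file order."""
    return {key: index for index, key in enumerate(residues, start=1)}

# Hypothetical residue list: chain A numbered 4..86, then chain B numbered 1..50.
# Only the start of chain A (A4) is taken from the text; the rest is made up.
residues = [("A", n) for n in range(4, 87)] + [("B", n) for n in range(1, 51)]
pdb_to_pose = build_pdb_to_pose(residues)

print(pdb_to_pose[("A", 4)])    # 1
print(pdb_to_pose[("A", 34)])   # 31
print(pdb_to_pose[("A", 38)])   # 35
```

In a real structure, missing residues shift the offset along the chain, which is why it is safer to ask the PDBInfo object than to subtract a constant.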


Use the MutateResidue mover to introduce mutations N34V R38T.
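
A mutation label such as N34V encodes the wild-type residue, the PDB residue number, and the new residue. The small helper below splits a label into those parts and converts the one-letter codes into three-letter residue names. The function `parse_mutation` is a hypothetical convenience for this lab and is not part of PyRosetta.

```python
import re

# Standard one-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}

def parse_mutation(label):
    """Split a label like 'N34V' into (wild-type name, PDB number, mutant name)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"Unrecognized mutation label: {label!r}")
    wild_type, number, mutant = match.groups()
    if wild_type not in THREE_LETTER or mutant not in THREE_LETTER:
        raise ValueError(f"Unknown amino acid code in {label!r}")
    return THREE_LETTER[wild_type], int(number), THREE_LETTER[mutant]

for label in ("N34V", "R38T"):
    print(label, "->", parse_mutation(label))
```

The residue number returned here is still a PDB number, so it has to be converted to a pose index before it is used as a mutation target.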

In [19]:
mutater = pyrosetta.rosetta.protocols.simple_moves.MutateResidue()

mutater.set_target(31)
mutater.set_res_name('VAL')
mutater.apply(pose)

mutater.set_target(35)
mutater.set_res_name('THR')
mutater.apply(pose)

# now dump this new pose to a PDB file and once again visualize in PyMOL (take a screenshot to submit)
# name it 2wpt_mutate.pdb
pose.dump_pdb("2wpt_mutate.pdb")



True

Now you should be able to see these mutations in PyMOL. You have now learned movers that can help you explore the backbone and sequence spaces. You may have realized that the side-chain conformations, which are very important, are not sampled. Side-chain sampling will be covered in later labs.

## Exercises
1. Use the functions you learned in the previous lecture to score the poses before and after mutation. What is the change in the score? Does it match the experimentally measured -2.60 kcal/mol? Which score terms change significantly? Which 10 residues' scores change the most? Do their changes make sense?

If you have `emap1` and `emap2`, you can calculate the difference as follows:
```
diff_emap = EnergyMap(emap1)
temp_emap = diff_emap # create a reference to the same object
temp_emap -= emap2
print(temp_emap) # temp_emap is now None. This is the "half broken" part
print(diff_emap) # diff_emap has been modified
```

2. Redo the mutagenesis and ddG calculation on backbone-perturbed structures. How much do the results change? Why?

3. Generate a backbone ensemble of 20 structures with your favorite backbone sampling method. Redo the mutagenesis and ddG calculation on each structure and take the mean/median/minimum score. How much do the results change? Why?

4. The above ddG analysis is very crude and inaccurate. What improvements should be introduced to make it better?
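
For the ensemble exercise, the bookkeeping of per-structure score differences can be done with Python's `statistics` module. The sketch below assumes you already have one wild-type score and one mutant score per ensemble member. The function `summarize_ddg` is a hypothetical helper, and the numbers are placeholders rather than real Rosetta results.

```python
import statistics

def summarize_ddg(wt_scores, mut_scores):
    """Return the mean, median, and minimum of (mutant - wild type) over an ensemble."""
    if len(wt_scores) != len(mut_scores):
        raise ValueError("Need one wild-type and one mutant score per structure")
    ddgs = [mut - wt for wt, mut in zip(wt_scores, mut_scores)]
    return {
        "mean": statistics.mean(ddgs),
        "median": statistics.median(ddgs),
        "min": min(ddgs),
    }

# Placeholder scores for a four-member ensemble (not real data).
wt_scores = [-300.0, -298.5, -301.2, -299.0]
mut_scores = [-302.0, -297.5, -304.2, -300.0]

summary = summarize_ddg(wt_scores, mut_scores)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Comparing the three summaries gives a quick sense of how sensitive the estimate is to outlier structures in the ensemble.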

In [22]:


diff_emap = EnergyMap(emap1)
temp_emap = diff_emap # create a reference to the same object
temp_emap -= emap2
print(temp_emap) # temp_emap is now None. This is the "half broken" part
print(diff_emap) # diff_emap has been modified

NameError: ignored
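
The cell above fails with a NameError because `EnergyMap`, `emap1`, and `emap2` were never defined in this session. The behavior described in its comments can still be reproduced without PyRosetta. The toy class below assumes that the in-place subtraction modifies the object but returns nothing, which is what the comments suggest. `ToyEnergyMap` and the score values are made up for illustration and say nothing definitive about the actual bindings.

```python
class ToyEnergyMap:
    """Minimal stand-in for an energy map: a dict of score term -> value."""

    def __init__(self, terms):
        self.terms = dict(terms)

    def __isub__(self, other):
        for name, value in other.terms.items():
            self.terms[name] = self.terms.get(name, 0.0) - value
        # Deliberately no `return self`: Python then rebinds the
        # left-hand name to None, mimicking the "half broken" behavior.

# Hypothetical score terms and values.
emap1 = ToyEnergyMap({"fa_atr": -10.0, "fa_rep": 2.0})
emap2 = ToyEnergyMap({"fa_atr": -12.5, "fa_rep": 3.0})

diff_emap = ToyEnergyMap(emap1.terms)  # copy, so emap1 stays untouched
temp_emap = diff_emap                  # second name for the same object
temp_emap -= emap2                     # modifies the object, rebinds temp_emap

print(temp_emap)        # None
print(diff_emap.terms)  # {'fa_atr': 2.5, 'fa_rep': -1.0}
```

Keeping a second reference (`diff_emap`) is what makes the result recoverable after the in-place operation rebinds `temp_emap`.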

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

In [None]:
# YOUR CODE HERE
raise NotImplementedError()