# Setup Notebook for PyRosetta


---


In [0]:
# Notebook setup
import sys
if 'google.colab' in sys.modules:
    !pip install pyrosettacolabsetup
    import pyrosettacolabsetup
    pyrosettacolabsetup.setup()
    print("Notebook is set for PyRosetta use in Colab. Have fun!")

Collecting pyrosettacolabsetup
  Downloading https://files.pythonhosted.org/packages/31/11/10a9140931c88352b51a91d2e6be8b929b6560b3683e5517d88c271ceb37/pyrosettacolabsetup-0.1.tar.gz
Building wheels for collected packages: pyrosettacolabsetup
  Building wheel for pyrosettacolabsetup (setup.py) ... done
  Created wheel for pyrosettacolabsetup: filename=pyrosettacolabsetup-0.1-cp36-none-any.whl size=1695 sha256=f6890d5dfe23d942ee840321e83e2402cf9b17fcb4b86020fa47ee584eb00fa3
  Stored in directory: /root/.cache/pip/wheels/3a/2d/68/2a5b479b424b3df2b96d725177e1f42c9b85c446965d566c6c
Successfully built pyrosettacolabsetup
Installing collected packages: pyrosettacolabsetup
Successfully installed pyrosettacolabsetup-0.1


Import and initialize PyRosetta.

In [0]:
# Import the package name explicitly so that pyrosetta.init() resolves
import pyrosetta
from pyrosetta import *
pyrosetta.init()

# Working with PDB Files

---


Change the current directory in Colaboratory.

In [0]:
!pwd
%cd "/content/google_drive/My Drive/PyRosetta/PyRosetta.notebooks-master/notebooks"
# !ls

The Pose class includes various types of information that describe a structure. Some of the core components include the Energies, PDBInfo, and Conformation. See the Rosetta3 paper to learn more: https://www.sciencedirect.com/science/article/pii/B9780123812704000196


In [0]:
pose = pyrosetta.pose_from_pdb("inputs/5tj3.pdb")


As an example, let's use our pose to look at the sequence of 5TJ3.

In [0]:
# print out the sequence of the pose
pose.sequence()

Sometimes PDB files do not conform to the format standard and need to be cleaned before PyRosetta can load them. One way to make sure a file loads is to keep only its ATOM lines. Alternatively, you can use the `cleanATOM` function in `pyrosetta.toolbox` to achieve the same result:



In [0]:
from pyrosetta.toolbox import cleanATOM
cleanATOM("inputs/5tj3.pdb")

This function creates a cleaned `5tj3.clean.pdb` file for you. Let's load this into PyRosetta as well:

In [0]:
pose_clean = pose_from_pdb("inputs/5tj3.clean.pdb")
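
The other approach mentioned earlier, keeping only the ATOM lines by hand, can be sketched in plain Python. This is a minimal illustration and not necessarily what `cleanATOM` does internally; the helper name `keep_atom_lines` and the abbreviated sample records are hypothetical.

```python
def keep_atom_lines(pdb_text):
    """Return only the lines of a PDB-format string that start with 'ATOM'."""
    kept = [line for line in pdb_text.splitlines() if line.startswith("ATOM")]
    return "\n".join(kept) + ("\n" if kept else "")


# Hypothetical, abbreviated records used only to exercise the filter.
sample = (
    "HEADER    EXAMPLE\n"
    "ATOM      1  N   ASN A  24      10.000  10.000  10.000  1.00 20.00           N\n"
    "HETATM    2 ZN    ZN A 301      12.000  12.000  12.000  1.00 20.00          ZN\n"
    "END\n"
)
atom_only = keep_atom_lines(sample)
print(atom_only)
```

To apply the same filter to a file, read its contents into a string, pass it to the helper, and write the result to a new file before loading it with `pose_from_pdb`.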

In our case, we could load the PDB file for 5tj3 without cleaning it. In fact, we lost some residues when cleaning the PDB file with `cleanATOM`. How does the sequence of `pose_clean` differ from the sequence of `pose`?

In [0]:
# print out the sequence of the pose_clean
pose_clean.sequence()

With the `annotated_sequence` method below, we can start to see the differences in more detail. Note that non-canonical amino acids and HETATM residues are now spelled out more explicitly.

In [0]:
pose.annotated_sequence()

In [0]:
pose_clean.annotated_sequence()
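
In an annotated sequence, the explicitly spelled-out residues appear as a one-letter code followed by a name in square brackets. A small regular expression can pull those entries out for easier comparison. This is a sketch: the helper name `bracketed_residues` and the example string are hypothetical, and real output from `annotated_sequence()` will differ.

```python
import re


def bracketed_residues(annotated):
    """Return (one_letter_code, annotation) pairs for residues written as X[...]."""
    return re.findall(r"([A-Za-z])\[([^\]]+)\]", annotated)


# Hypothetical annotated sequence, used only to show the pattern.
example = "N[ASN:NtermProteinFull]SEKLZ[ZN]"
for letter, annotation in bracketed_residues(example):
    print(letter, annotation)
```

Running the helper on both `pose.annotated_sequence()` and `pose_clean.annotated_sequence()` gives two short lists that are easier to compare than the full strings.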

## Exercise 1: Inspecting pose sequences
Inspect the sequences to find the difference(s) between the `pose_clean.sequence()` and `pose.sequence()`. Were residues removed? Which ones? 

Write a program to automatically find the differences between these two sequences.
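
One possible starting point for this exercise is Python's standard `difflib` module, which reports the blocks where two strings differ. The sketch below is only one way to approach it; the helper name `sequence_differences` and the two short sequences are hypothetical stand-ins for `pose.sequence()` and `pose_clean.sequence()`.

```python
from difflib import SequenceMatcher


def sequence_differences(seq_a, seq_b):
    """List (operation, start_in_a, piece_of_a, piece_of_b) for each non-matching block."""
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    return [
        (tag, i1, seq_a[i1:i2], seq_b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical sequences standing in for pose.sequence() and pose_clean.sequence().
full = "NSEKLAKZZ"
cleaned = "NSEKLAK"
for tag, start, removed, added in sequence_differences(full, cleaned):
    print(tag, start, removed, added)
```

Note that the reported start positions are 0-based string indices, whereas pose residue numbering starts at 1.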

# Working With Pose Residues

---

We can use methods in Pose to count residues and pick out residues from the pose. Remember that Pose is a Python class: to access the methods it implements, you need an instance of the class (here `pose` or `pose_clean`), followed by a dot and the method name.


In [0]:
print(pose.total_residue()) 
print(pose_clean.total_residue())

# Did you find all the missing residues in Exercise 1?

## Exercise 2: Residue Information

Store the Residue information for residue 20 of the pose by using the `pose.residue(20)` function. What amino acid is residue 20? Hint: Use the `name()` function.

# Residue Objects

---

Use the pose's `.residue()` method to get the 24th residue of the protein pose. What is the 24th residue in the PDB file (open the PDB file to check)? Are they the same residue?

In [0]:
# store the 24th residue in the pose into a variable (as with residue 20 in Exercise 2)
residue24 = pose.residue(24)

In [0]:
# what other methods are attached to that Residue object? (type "residue24." and hit Tab to see a list of commands)

We can immediately see that the numbering PyRosetta internally uses for pose residues is different from the PDB file. The information corresponding to the PDB file can be accessed through the `pose.pdb_info()` object.

In [0]:

print(pose.pdb_info().chain(24))
print(pose.pdb_info().number(24))
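
To see how pose numbering lines up with PDB numbering for more than one residue, the same `chain()` and `number()` calls can be wrapped in a loop. This is a minimal sketch: the helper name `numbering_table` is hypothetical, and it assumes only the `total_residue()`, `pdb_info()`, `chain()`, and `number()` methods used elsewhere in this notebook.

```python
def numbering_table(pose, first=1, last=None):
    """Return (pose_index, pdb_chain, pdb_number) tuples for pose residues first..last."""
    info = pose.pdb_info()
    if last is None:
        last = pose.total_residue()
    # Pose numbering starts at 1, so the range is inclusive of `last`.
    return [(i, info.chain(i), info.number(i)) for i in range(first, last + 1)]


# With the pose loaded earlier in this notebook:
# for row in numbering_table(pose, 1, 5):
#     print(row)
```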

By using the pdb2pose method in `pdb_info()`, we can turn PDB numbering (which requires a chain ID and a residue number) into Pose numbering.

In [0]:
# PDB numbering to Pose numbering
print(pose.pdb_info().pdb2pose('A', 24))

Use the `pose2pdb` method in `pdb_info()` to find the PDB chain and residue number that correspond to pose residue number 24. The cell below shows the call for pose residue 1; change the argument to 24 to answer the question.

In [0]:
# Pose numbering to PDB numbering
print(pose.pdb_info().pose2pdb(1))

Now we can see how to examine the identity of a residue by PDB chain and residue number.

Once we have a residue, the Residue class offers various methods that can be useful for analysis. We can get instances of the Residue class from Pose. For instance, we can do the following:

In [0]:
res_24 = pose.residue(24)
print(res_24.name())
print(res_24.is_charged())
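
The same Residue methods can be applied across the whole pose. As a sketch, the loop below collects every residue for which `is_charged()` is true; the helper name `charged_residues` is hypothetical, and it assumes only the `total_residue()`, `residue()`, `name()`, and `is_charged()` methods shown above.

```python
def charged_residues(pose):
    """Return (pose_index, residue_name) for every residue whose is_charged() is true."""
    hits = []
    for i in range(1, pose.total_residue() + 1):  # pose numbering starts at 1
        res = pose.residue(i)
        if res.is_charged():
            hits.append((i, res.name()))
    return hits


# With the pose loaded earlier in this notebook:
# print(charged_residues(pose))
```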