# Workshop 3: Working with the Rosetta Energy Function
In this module, we will explore the PyRosetta score function interface. You will learn to inspect the energies of a biomolecule at the whole-protein, per-residue, and per-atom levels. Finally, you will practice applying these energies to answer biological questions involving proteins. For these exercises, we will use the protein Ras (PDB 6q21). Make sure the PDB file "6Q21_A.pdb" is in your current directory (if you have an Internet connection, you can download the structure from the RCSB), then load it into a pose called `ras` with the `pyrosetta.pose_from_pdb` method.

In [22]:
import pyrosetta
pyrosetta.init()

ras = pyrosetta.pose_from_pdb("6Q21_A.pdb") #d

[0mcore.init: [0mChecking for fconfig files in pwd and ./rosetta/flags
[0mcore.init: [0mRosetta version: PyRosetta4.Release.python36.mac r208 2019.04+release.fd666910a5e fd666910a5edac957383b32b3b4c9d10020f34c1 http://www.pyrosetta.org 2019-01-22T15:55:37
[0mcore.init: [0mcommand: PyRosetta -ex1 -ex2aro -database /Users/kathyle/Computational Protein Prediction and Design/PyRosetta4.Release.python36.mac.release-208/pyrosetta/database
[0mcore.init: [0m'RNG device' seed mode, using '/dev/urandom', seed=1423126916 seed_offset=0 real_seed=1423126916
[0mcore.init.random: [0mRandomGenerator:init: Normal mode, seed=1423126916 RG_type=mt19937
[0mcore.import_pose.import_pose: [0mFile '6Q21_A.pdb' automatically determined to be of type PDB


## Score Function Basics
To score a protein, you will begin by defining a score function. The function `get_fa_scorefxn()` in the `pyrosetta.teaching` namespace will return the default all-atom energy function. Currently, the default is the `ref2015` energy function.

Create a PyRosetta `ScoreFunction` using:
```
sfxn = get_fa_scorefxn()
```

In [23]:
from pyrosetta.teaching import *

sfxn = get_fa_scorefxn() #d

core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015


You can see the terms, weights, and energy method options by printing the score function:

```
print(sfxn)
```

In [24]:
print(sfxn) #d

ScoreFunction::show():
weights: (fa_atr 1) (fa_rep 0.55) (fa_sol 1) (fa_intra_rep 0.005) (fa_intra_sol_xover4 1) (lk_ball_wtd 1) (fa_elec 1) (pro_close 1.25) (hbond_sr_bb 1) (hbond_lr_bb 1) (hbond_bb_sc 1) (hbond_sc 1) (dslf_fa13 1.25) (omega 0.4) (fa_dun 0.7) (p_aa_pp 0.6) (yhh_planarity 0.625) (ref 1) (rama_prepro 0.45)
energy_method_options: EnergyMethodOptions::show: aa_composition_setup_files: 
EnergyMethodOptions::show: mhc_epitope_setup_files: 
EnergyMethodOptions::show: netcharge_setup_files: 
EnergyMethodOptions::show: aspartimide_penalty_value: 25
EnergyMethodOptions::show: etable_type: FA_STANDARD_DEFAULT
analytic_etable_evaluation: 1
EnergyMethodOptions::show: method_weights: ref 1.32468 3.25479 -2.14574 -2.72453 1.21829 0.79816 -0.30065 2.30374 -0.71458 1.66147 1.65735 -1.34026 -1.64321 -1.45095 -0.09474 -0.28969 1.15175 2.64269 2.26099 0.58223
EnergyMethodOptions::show: method_weights: free_res
EnergyMethodOptions::show: unfolded_energies_type: UNFOLDED_SCORE12
EnergyMeth

**Practice:** List the terms in the energy function and their relative weights

**Hint:** look at the top line that starts with 'weights'
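
As a convenience for the practice above, the `weights:` line can be turned into a dictionary with a few lines of plain Python. This is a minimal sketch that parses the printed text copied from the output above; the variable names are illustrative, and it does not use the PyRosetta API.

```python
import re

# The "weights:" line copied from the print(sfxn) output above.
weights_line = (
    "weights: (fa_atr 1) (fa_rep 0.55) (fa_sol 1) (fa_intra_rep 0.005) "
    "(fa_intra_sol_xover4 1) (lk_ball_wtd 1) (fa_elec 1) (pro_close 1.25) "
    "(hbond_sr_bb 1) (hbond_lr_bb 1) (hbond_bb_sc 1) (hbond_sc 1) "
    "(dslf_fa13 1.25) (omega 0.4) (fa_dun 0.7) (p_aa_pp 0.6) "
    "(yhh_planarity 0.625) (ref 1) (rama_prepro 0.45)"
)

# Each "(name value)" pair becomes one dictionary entry.
pairs = re.findall(r"\((\w+) ([\d.]+)\)", weights_line)
weights = {name: float(value) for name, value in pairs}

# List the terms from the largest weight to the smallest.
for name, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{name:22s}{weight:6.3f}")
```

Keeping the weights in a dictionary makes it easy to reuse them when combining raw energies by hand later in this workshop.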

You can also create a custom energy function that includes select terms. Here, we will make an example energy function with only the van der Waals attractive and repulsive terms, both with weights of 1. 

To do this, we use the `set_weight()` method. Make a new `ScoreFunction` and set the weights accordingly. This is how we set the full-atom attractive (`fa_atr`) and the full-atom repulsive (`fa_rep`) terms.

```
sfxn2 = ScoreFunction()
sfxn2.set_weight(fa_atr, 1.0)
sfxn2.set_weight(fa_rep, 1.0)
```

In [25]:
sfxn2 = ScoreFunction() #d
sfxn2.set_weight(fa_atr, 1.0) #d
sfxn2.set_weight(fa_rep, 1.0) #d

Let's compare the score of `ras` using the full-atom `ScoreFunction` versus the `ScoreFunction` we made above using only the attractive and repulsive terms.

First, print the total energy of `ras` using `print(sfxn(ras))`.
Then, print only the attractive and repulsive energy of `ras` using `print(sfxn2(ras))`.

In [26]:
# print total energy of ras
print(sfxn(ras)) #d

# print the attractive and repulsive energy of ras
print(sfxn2(ras)) #d

1215.729069796814
154.59159174026854


Using the full-atom `ScoreFunction` `sfxn`, break the energy of `ras` down into its individual pieces with the `sfxn.show(ras)` method. Which are the three most dominant contributions, and what are their values? Is this what you would have expected? Why? Note which terms are positive and which are negative.

In [27]:
# use the sfxn.show() method
sfxn.show(ras) #d

[0mcore.scoring: [0m
------------------------------------------------------------
 Scores                       Weight   Raw Score Wghtd.Score
------------------------------------------------------------
 fa_atr                       1.000   -1039.246   -1039.246
 fa_rep                       0.550    1193.837     656.611
 fa_sol                       1.000     682.582     682.582
 fa_intra_rep                 0.005     700.419       3.502
 fa_intra_sol_xover4          1.000      46.564      46.564
 lk_ball_wtd                  1.000     -14.597     -14.597
 fa_elec                      1.000    -195.387    -195.387
 pro_close                    1.250      97.210     121.513
 hbond_sr_bb                  1.000     -41.656     -41.656
 hbond_lr_bb                  1.000     -28.352     -28.352
 hbond_bb_sc                  1.000     -13.111     -13.111
 hbond_sc                     1.000      -7.771      -7.771
 dslf_fa13                    1.250       0.000       0.000
 omega        

In [28]:
# Your response here: what are the three most dominant contributions?
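
The table above lists a raw score and a weight for each term, and the weighted score is their product. As a rough check, the sketch below recombines the two van der Waals rows by hand, once with the `ref2015` weights and once with the unit weights used for `sfxn2`. The numbers are copied from the rounded table output, so small differences from the values PyRosetta prints are expected; `weighted_sum` is an illustrative helper, not a PyRosetta function.

```python
# Raw (unweighted) scores copied from the sfxn.show(ras) table above.
raw_scores = {"fa_atr": -1039.246, "fa_rep": 1193.837}

# ref2015 weights (from print(sfxn)) and the custom weights set on sfxn2.
ref2015_weights = {"fa_atr": 1.0, "fa_rep": 0.55}
sfxn2_weights = {"fa_atr": 1.0, "fa_rep": 1.0}

def weighted_sum(raw, weights):
    """Sum weight * raw score over the terms present in `weights`."""
    return sum(weights[term] * raw[term] for term in weights)

# van der Waals part of sfxn(ras): compare with the Wghtd.Score column.
print(round(weighted_sum(raw_scores, ref2015_weights), 3))

# Both weights set to 1: compare with the value printed by sfxn2(ras).
print(round(weighted_sum(raw_scores, sfxn2_weights), 3))
```

The second number should land close to the `sfxn2(ras)` value printed earlier, which shows that a custom score function is simply a different set of weights applied to the same raw terms.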

Unweighted, individual component energies of each residue in a structure are stored in the `Pose` object and can be accessed by the `energies()` method. For example, to break down the energy into each residue's contribution, we use: 
```
print(ras.energies().show(<n>))
```
Where `<n>` is the residue number.

What are the total van der Waals, solvation, and hydrogen-bonding contributions of residue 24? (Note that the _backbone_ hydrogen-bonding terms for each residue are not available from the `Energies` object.)

In [29]:
print(ras.energies().show(24)) #d

[0mcore.scoring.Energies: [0mE               fa_atr        fa_rep        fa_sol  fa_intra_repfa_intra_sol_x   lk_ball_wtd       fa_elec     pro_close   hbond_sr_bb   hbond_lr_bb   hbond_bb_sc      hbond_sc     dslf_fa13         omega        fa_dun       p_aa_pp yhh_planarity           ref   rama_prepro
[0mcore.scoring.Energies: [0mE(i)  24         -7.40         19.03          2.94          8.76          0.09         -0.11         -0.56          0.00          0.00          0.00          0.00          0.00          0.00          0.12          2.68          0.06          0.00          2.30          3.58
None


In [None]:
# your response here
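
Because the per-residue values are unweighted, they have to be multiplied by the score function weights before they can be compared with the weighted totals. The following sketch does this by hand for residue 24, using the numbers printed above and the `ref2015` weights from `print(sfxn)`. It is plain Python, and it assumes the column order shown in the output; the backbone hydrogen-bonding columns are zero here because those terms are not stored per residue.

```python
# Score terms in the column order of the ras.energies().show(24) output.
terms = ["fa_atr", "fa_rep", "fa_sol", "fa_intra_rep", "fa_intra_sol_xover4",
         "lk_ball_wtd", "fa_elec", "pro_close", "hbond_sr_bb", "hbond_lr_bb",
         "hbond_bb_sc", "hbond_sc", "dslf_fa13", "omega", "fa_dun", "p_aa_pp",
         "yhh_planarity", "ref", "rama_prepro"]

# Unweighted energies of residue 24, copied from the output above.
res24_raw = [-7.40, 19.03, 2.94, 8.76, 0.09, -0.11, -0.56, 0.00, 0.00, 0.00,
             0.00, 0.00, 0.00, 0.12, 2.68, 0.06, 0.00, 2.30, 3.58]

# ref2015 weights in the same order, copied from print(sfxn).
ref2015 = [1, 0.55, 1, 0.005, 1, 1, 1, 1.25, 1, 1,
           1, 1, 1.25, 0.4, 0.7, 0.6, 0.625, 1, 0.45]

# Weight each term, then add them up.
weighted = {t: w * e for t, w, e in zip(terms, ref2015, res24_raw)}
total = sum(weighted.values())

for term, value in weighted.items():
    if value != 0.0:
        print(f"{term:22s}{value:8.3f}")
print(f"weighted total for residue 24: {total:.3f}")
```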

The van der Waals, solvation, and electrostatic terms are atom-atom pairwise energies calculated from a pre-tabulated lookup table, dependent upon the distance between the two atoms and their types. You can access this lookup table, called the `etable`, directly to check these energy calculations on an atom-by-atom basis. Use the `etable_atom_pair_energies` function, which returns a tuple of the attractive, repulsive, solvation, and electrostatic energies.

(Note that the `etable_atom_pair_energies()` function takes `Residue` objects and atom indices, not the `AtomID` objects we saw in Workshop #2.)

**Practice:** What are the attractive, repulsive, solvation, and electrostatic components between the nitrogen of residue 24 and the oxygen of residue 20?


```
res24 = ras.residue(24)
res20 = ras.residue(20)
res24_atomN = res24.atom_index("N")
res20_atomO = res20.atom_index("O")
pyrosetta.etable_atom_pair_energies(res24, res24_atomN, res20, res20_atomO, sfxn)
```

In [35]:
res24 = ras.residue(24) #d
res20 = ras.residue(20) #d
res24_atomN = res24.atom_index("N") #d
res20_atomO = res20.atom_index("O") #d
pyrosetta.etable_atom_pair_energies(res24, res24_atomN, res20, res20_atomO, sfxn) #d


(-0.1505855046001568, 0.0, 0.5903452111877215, 2.173111777247698)
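
To make the idea of a distance-dependent lookup table concrete, here is a toy sketch in plain Python. It tabulates a 12-6 Lennard-Jones energy on a regular distance grid, splits it into attractive and repulsive parts at the energy minimum, and interpolates between grid points. The `sigma` and `eps` values are hypothetical, the function names are made up for illustration, and this is not Rosetta's actual `etable`, which also covers solvation and applies additional smoothing.

```python
def lj_split(d, sigma, eps):
    """12-6 Lennard-Jones energy split into (attractive, repulsive) parts.

    sigma is the distance of the energy minimum and eps the well depth;
    both are hypothetical inputs, not Rosetta's atom-type parameters.
    """
    e = eps * ((sigma / d) ** 12 - 2.0 * (sigma / d) ** 6)
    if d <= sigma:
        # Inside the minimum: constant attraction, the rest counts as repulsion.
        return -eps, e + eps
    # Beyond the minimum: purely attractive.
    return e, 0.0

def build_table(sigma, eps, d_min=1.0, d_max=6.0, step=0.05):
    """Pre-tabulate (atr, rep) pairs on a regular distance grid."""
    n = int(round((d_max - d_min) / step)) + 1
    return [lj_split(d_min + i * step, sigma, eps) for i in range(n)]

def lookup(table, d, d_min=1.0, step=0.05):
    """Linearly interpolate the table at d (valid for d_min <= d <= d_max)."""
    x = (d - d_min) / step
    i = min(int(x), len(table) - 2)
    f = x - i
    (a0, r0), (a1, r1) = table[i], table[i + 1]
    return a0 + f * (a1 - a0), r0 + f * (r1 - r0)

table = build_table(sigma=3.5, eps=0.15)
print(lookup(table, 4.0))        # value interpolated from the table
print(lj_split(4.0, 3.5, 0.15))  # analytic value for comparison
```

In this toy model, a pair of atoms separated by more than `sigma` has a negative attractive energy and a repulsive energy of exactly zero, which is the same pattern seen in the first two numbers of the output above.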

## Analysis of Hydrogen Bonds
The hydrogen-bonding score component requires identification of acceptor hybridization states and calculation of geometric parameters including distance, acceptor bond angle, proton bond angle, and a torsion angle. The hydrogen-bonding energies are stored in an `HBondSet` object. You can access the list of hydrogen bonds by creating an `HBondSet` object and filling the set from the pose (after making sure that the pose has had its `Energies` object updated based on neighboring residues within the pose), and then using the `HBondSet.show()` command.

The steps above have been combined in PyRosetta into a method called `get_hbonds()` that is attached to the `Pose` object, so that we can simply type:

```
sfxn(ras)
hbond_set = ras.get_hbonds()
hbond_set.show(ras)
```

In [46]:
sfxn(ras)
hbond_set = ras.get_hbonds()
hbond_set.show(ras)
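
Two of the geometric parameters mentioned at the start of this section, the proton-acceptor distance and the angle at the proton, can be computed from atomic coordinates with basic vector arithmetic. The sketch below uses hypothetical coordinates and illustrative function names; it shows the geometry only and does not reproduce how Rosetta converts these parameters into a hydrogen-bonding energy.

```python
import math

def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.dist(p, q)

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    ba = [x - y for x, y in zip(a, b)]
    bc = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(ba, bc))
    cos_t = dot / (math.hypot(*ba) * math.hypot(*bc))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

# Hypothetical coordinates (in angstroms) for a donor N, its proton H,
# and an acceptor O.
donor_n = (0.0, 0.0, 0.0)
proton_h = (1.0, 0.0, 0.0)
acceptor_o = (2.9, 0.4, 0.0)

print(f"H...O distance: {distance(proton_h, acceptor_o):.2f}")
print(f"N-H...O angle:  {angle_deg(donor_n, proton_h, acceptor_o):.1f}")
```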

The hydrogen bonds for an individual residue can be looked up from the set using its residue number as follows: 

    hbond_set.residue_hbonds(24)
    
**Practice:** How many hydrogen bonds does residue 24 make? 

In [47]:
print(hbond_set.residue_hbonds(24))

vector1_std_shared_ptr_const_core_scoring_hbonds_HBond_t[0x1356aa8e8]


## Practice
Analyze the energy between residues Y102 and Q408 in cetuximab (PDB code 1YY9, use the `pyrosetta.toolbox.pose_from_rcsb` function to download it and load it into a new `Pose` object) by following the steps below. 

A. Internally, a `Pose` object has a list of residues, numbered starting from 1. To find the residue numbers of Y102 of chain D and Q408 of chain A, use the residue chain identifier and the PDB residue number to convert to the pose numbering using the `pdb2pose()` method:

```
pose = pyrosetta.toolbox.pose_from_rcsb("1YY9")
res102 = pose.pdb_info().pdb2pose("D", 102)
res408 = pose.pdb_info().pdb2pose("A", 408)
```
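
The difference between PDB numbering and pose numbering can be illustrated with a toy example in plain Python. The residue list below is hypothetical, and `toy_pdb2pose` is an illustrative stand-in rather than the real `pdb2pose()` method; returning 0 for a residue that is not present is an assumption made for this sketch.

```python
# Hypothetical toy structure: (chain, PDB residue number) in file order.
pdb_residues = [("A", 5), ("A", 6), ("A", 7), ("D", 1), ("D", 2), ("D", 3)]

# Pose numbering is sequential and starts from 1, regardless of the chain
# identifiers or the residue numbers used in the PDB file.
pdb_to_pose = {key: i for i, key in enumerate(pdb_residues, start=1)}
pose_to_pdb = {i: key for key, i in pdb_to_pose.items()}

def toy_pdb2pose(chain, number):
    """Return the pose index for a (chain, PDB number) pair, or 0 if absent."""
    return pdb_to_pose.get((chain, number), 0)

print(toy_pdb2pose("D", 2))  # pose index of chain D, PDB residue 2
print(pose_to_pdb[1])        # PDB identity of pose residue 1
```

In a multi-chain structure the pose index of a residue therefore depends on how many residues precede it in the file, which is why the conversion step is needed before calling `pose.residue()`.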

In [None]:
# get the pose numbers for Y102 (chain D) and Q408 (chain A)

In [48]:
pose = pyrosetta.toolbox.pose_from_rcsb("1YY9") #d
res102 = pose.pdb_info().pdb2pose("D", 102) #d
res408 = pose.pdb_info().pdb2pose("A", 408) #d

core.import_pose.import_pose: File '1YY9.clean.pdb' automatically determined to be of type PDB


core.conformation.Conformation: Found disulfide between residues 6 33
core.conformation.Conformation: current variant for 6 CYS
core.conformation.Conformation: current variant for 33 CYS
core.conformation.Conformation: current variant for 6 CYD
core.conformation.Conformation: current variant for 33 CYD
core.conformation.Conformation: Found disulfide between residues 132 162
core.conformation.Conformation: current variant for 132 CYS
core.conformation.Conformation: current variant for 162 CYS
core.conformation.Conformation: current variant for 132 CYD
core.conformation.Conformation: current variant for 162 CYD
core.conformation.Conformation: Found disulfide between residues 165 174
core.conformation.Conformation: current variant for 165 CYS
core.conformation.Conformation: current variant for 174 CYS
core.conformation.Conformation: current variant for 165 CYD
core.conformat

core.conformation.Conformation: current variant for 312 CYD
core.conformation.Conformation: current variant for 337 CYD
core.conformation.Conformation: Found disulfide between residues 445 474
core.conformation.Conformation: current variant for 445 CYS
core.conformation.Conformation: current variant for 474 CYS
core.conformation.Conformation: current variant for 445 CYD
core.conformation.Conformation: current variant for 474 CYD
core.conformation.Conformation: Found disulfide between residues 481 490
core.conformation.Conformation: current variant for 481 CYS
core.conformation.Conformation: current variant for 490 CYS
core.conformation.Conformation: current variant for 481 CYD
core.conformation.Conformation: current variant for 490 CYD
core.conformation.Conformation: Found disulfide between residues 485 498
core.conformation.Conformation: current variant for 485 CYS
core.

core.pack.pack_rotamers: built 313 rotamers at 23 positions.
core.pack.interaction_graph.interaction_graph_factory: Instantiating DensePDInteractionGraph


B. Score the pose and determine the van der Waals energies and solvation energy between these two residues. Use the following commands to isolate contributions from a particular pair of residues. Note that `eval_ci_2b()` takes the two residue objects of interest, not their residue numbers, so use `pose.residue(res_num)` to access the objects:

```
emap = EMapVector()
sfxn.eval_ci_2b(pose.residue(res102), pose.residue(res408), pose, emap)
print(emap[fa_atr])
print(emap[fa_rep])
print(emap[fa_sol])
```

In [50]:
emap = EMapVector() #d
sfxn.eval_ci_2b(pose.residue(res102), pose.residue(res408), pose, emap) #d
print(emap[fa_atr]) #d
print(emap[fa_rep]) #d
print(emap[fa_sol]) #d

-1.2098840439684349
0.10835353848860363
1.5729435146961963


## Energies and the PyMOL Mover
The `PyMOLMover` class contains a method for sending score function information to PyMOL,
which will then color the structure based on relative residue energies.

Open up PyMOL. Instantiate a `PyMOLMover` object and use `pymol_mover.send_energy(ras)` to send the coloring command to PyMOL.

```
pymol_mover = PyMOLMover()
pymol_mover.apply(ras)
print(sfxn(ras))
pymol_mover.send_energy(ras)
```

In [53]:
pymol_mover = PyMOLMover() #d
pymol_mover.apply(ras) #d
sfxn(ras) #d
pymol_mover.send_energy(ras) #d

![SegmentLocal](./Workshop3/PyMOL-send_energy.gif "send_energy")
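
Coloring by relative energy amounts to rescaling each residue's energy onto a color gradient. The sketch below shows one way this could be done in plain Python. The blue-to-red gradient, the function name, and the example energies are all assumptions made for illustration; the color scale actually used by `PyMOLMover` is not described here and may differ.

```python
def energy_to_rgb(energy, e_min, e_max):
    """Map an energy to an (r, g, b) tuple: lowest is blue, highest is red."""
    if e_max == e_min:
        t = 0.5  # all energies equal: use the middle of the gradient
    else:
        t = (energy - e_min) / (e_max - e_min)
    t = max(0.0, min(1.0, t))  # clamp values outside the range
    return (t, 0.0, 1.0 - t)

# Hypothetical per-residue total energies keyed by residue number.
residue_energies = {1: 3.2, 2: -1.5, 3: 11.3}
e_low = min(residue_energies.values())
e_high = max(residue_energies.values())

for resnum, energy in sorted(residue_energies.items()):
    print(resnum, energy_to_rgb(energy, e_low, e_high))
```

Because the scale is relative to the lowest and highest energies in the structure, the same residue can take a different color when a different energy term is sent.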

What color is residue Proline34? What color is residue Alanine66? Which residue has lower energy?

In [None]:
# your response here

`pymol_mover.send_energy(ras, fa_atr)` will have PyMOL color only by the attractive van der Waals energy component. What color is residue 34 if colored by solvation energy, `fa_sol`?

In [None]:
# send specific energies to pymol

In [58]:
pymol_mover.send_energy(ras, fa_atr) #d

You can have PyMOL label each Cα with the value of its residue’s specified energy. Note that `label_energy()` expects the energy type as a string (for example `"fa_rep"`); passing the `ScoreType` value `fa_rep` raises a `TypeError`:
```
pymol_mover.label_energy(ras, "fa_rep")
```

In [64]:
pymol_mover.label_energy(ras, "fa_rep") #d

Finally, if you have scored the `pose` first, you can have PyMOL display all of the calculated hydrogen bonds for the structure:

```
pymol_mover.send_hbonds(ras)
```

In [65]:
pymol_mover.send_hbonds(ras) #d

## References
This Jupyter notebook is an adapted version of "Workshop #3: Scoring" in the PyRosetta workbook: https://graylab.jhu.edu/pyrosetta/downloads/documentation/pyrosetta4_online_format/PyRosetta4_Workshop3_Scoring.pdf