<!--NOTEBOOK_HEADER-->
*This notebook contains material from [PyRosetta](https://RosettaCommons.github.io/PyRosetta.notebooks);
content is available [on Github](https://github.com/RosettaCommons/PyRosetta.notebooks.git).*

# Working with the Scoring Function
Keywords: score function, ScoreFunction(), get_score_function(), set_weight(), show(), etable_atom_pair_energies(), Atom objects, get_hbonds(), nhbonds(), residue_hbonds()

## Init PyRosetta

In [None]:
!pip install pyrosettacolabsetup
import pyrosettacolabsetup; pyrosettacolabsetup.install_pyrosetta()
import pyrosetta; pyrosetta.init()
from pyrosetta import *

#init()
#import os
#notebook_path = os.path.abspath("clase2-score.ipynb")


Collecting pyrosettacolabsetup
  Downloading pyrosettacolabsetup-1.0.9-py3-none-any.whl.metadata (294 bytes)
Downloading pyrosettacolabsetup-1.0.9-py3-none-any.whl (4.9 kB)
Installing collected packages: pyrosettacolabsetup
Successfully installed pyrosettacolabsetup-1.0.9
Mounted at /content/google_drive

Note that USE OF PyRosetta FOR COMMERCIAL PURPOSES REQUIRE PURCHASE OF A LICENSE.
See https://github.com/RosettaCommons/rosetta/blob/main/LICENSE.md or email license@uw.edu for details.

Looking for compatible PyRosetta wheel file at google-drive/PyRosetta/colab.bin//wheels...
Found compatible wheel: /content/google_drive/MyDrive/PyRosetta/colab.bin/wheels//content/google_drive/MyDrive/PyRosetta/colab.bin/wheels/pyrosetta-2024.19+release.a34b73c40f-cp310-cp310-linux_x86_64.whl


┌──────────────────────────────────────────────────────────────────────────────┐
│                                 PyRosetta-4                                  │
│              Created in JHU by Sergey Lyskov 

## Load pdb files

In [None]:
pdb_file = "/content/google_drive/MyDrive/BIP_24-25/clase_2/5tj3.pdb"
clean_pdb_file = "/content/google_drive/MyDrive/BIP_24-25/clase_2/5tj3.clean.pdb"
pose = pose_from_pdb(pdb_file)
pose_clean = pose_from_pdb(clean_pdb_file)

core.chemical.GlobalResidueTypeSet: Finished initializing fa_standard residue type set.  Created 985 residue types
core.chemical.GlobalResidueTypeSet: Total time to initialize 1.04347 seconds.
core.import_pose.import_pose: File '/content/google_drive/MyDrive/BIP_24-25/clase_2/5tj3.pdb' automatically determined to be of type PDB
core.pack.pack_missing_sidechains: packing residue number 233 because of missing atom number 6 atom name  CG
core.pack.pack_missing_sidechains: packing residue number 350 because of missing atom number 6 atom name  CG
core.pack.pack_missing_sidechains: packing residue number 353 because of missing atom number 6 atom name  CG
core.pack.pack_missing_sidechains: packing residue number 354 because of missing atom number 6 atom name  CG
core.pack.pack_missing_sidechains: packing residue number 382 because of missing atom number 6 atom name  CG
core.pack.pack_missing_sidechains: packing residue number 454 because of missing atom number 6 atom name  CG
core.pack.task: 

## Rosetta Energy Score Functions


A basic function of Rosetta is calculating the energy, or score, of a biomolecule. Scoring lets you inspect energies at the whole-protein, per-residue, and per-atom levels.  

Rosetta has a standard energy function for all-atom calculations as well as several scoring functions for low-resolution protein representations. See https://www.ncbi.nlm.nih.gov/pubmed/28430426 for a review on the all-atom score functions.

You can also tailor an energy function by including scoring terms of your choice with custom weights.

To score a protein, begin by defining a score function with the `get_score_function(is_fullatom: bool)` function in the `pyrosetta.teaching` namespace. Passing `True` returns the default `ref2015` all-atom energy function, while `False` returns the default centroid score function.

Create a PyRosetta score function

In [None]:
from pyrosetta.teaching import *

sfxn = get_score_function(True)

core.scoring.ScoreFunctionFactory: SCOREFUNCTION: ref2015


You can see the terms, weights, and energy method options by printing the score function:

```
print(sfxn)
```

In [None]:
print(sfxn)

ScoreFunction::show():
weights: (fa_atr 1) (fa_rep 0.55) (fa_sol 1) (fa_intra_rep 0.005) (fa_intra_sol_xover4 1) (lk_ball_wtd 1) (fa_elec 1) (pro_close 1.25) (hbond_sr_bb 1) (hbond_lr_bb 1) (hbond_bb_sc 1) (hbond_sc 1) (dslf_fa13 1.25) (omega 0.4) (fa_dun 0.7) (p_aa_pp 0.6) (yhh_planarity 0.625) (ref 1) (rama_prepro 0.45)
energy_method_options: EnergyMethodOptions::show: aa_composition_setup_files: 
EnergyMethodOptions::show: mhc_epitope_setup_files: 
EnergyMethodOptions::show: netcharge_setup_files: 
EnergyMethodOptions::show: aspartimide_penalty_value: 25
EnergyMethodOptions::show: etable_type: FA_STANDARD_DEFAULT
analytic_etable_evaluation: 1
EnergyMethodOptions::show: method_weights: ref 1.32468 3.25479 -2.14574 -2.72453 1.21829 0.79816 -0.30065 2.30374 -0.71458 1.66147 1.65735 -1.34026 -1.64321 -1.45095 -0.09474 -0.28969 1.15175 2.64269 2.26099 0.58223
EnergyMethodOptions::show: method_weights: free_res
EnergyMethodOptions::show: unfolded_energies_type: UNFOLDED_SCORE12
EnergyMeth

**Exercise. List the terms in the energy function and their relative weights**

**Hint:** look at the top line that starts with 'weights'
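As a rough aid for reading that line, the sketch below parses the `weights:` text into a dictionary using only the standard library. The string is copied from the `print(sfxn)` output above; the helper name `parse_weights` is hypothetical and not part of PyRosetta.

```python
import re

# Weights line copied from the print(sfxn) output above.
weights_line = (
    "weights: (fa_atr 1) (fa_rep 0.55) (fa_sol 1) (fa_intra_rep 0.005) "
    "(fa_intra_sol_xover4 1) (lk_ball_wtd 1) (fa_elec 1) (pro_close 1.25) "
    "(hbond_sr_bb 1) (hbond_lr_bb 1) (hbond_bb_sc 1) (hbond_sc 1) "
    "(dslf_fa13 1.25) (omega 0.4) (fa_dun 0.7) (p_aa_pp 0.6) "
    "(yhh_planarity 0.625) (ref 1) (rama_prepro 0.45)"
)


def parse_weights(line):
    """Return {term_name: weight} from a 'weights: (name value) ...' line.

    Hypothetical helper for illustration; not a PyRosetta function.
    """
    pairs = re.findall(r"\((\w+) ([-+.\deE]+)\)", line)
    return {name: float(value) for name, value in pairs}


weights = parse_weights(weights_line)

# Print one term per line, in the order they appear in the score function.
for term, weight in weights.items():
    print(f"{term:22s}{weight:6.3f}")
```

With a live score function, a single weight can also be queried directly, for example `sfxn.get_weight(fa_atr)`.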

### Custom energy functions

You can also create a custom energy function that includes only selected terms. Typically, creating a whole new score function is unnecessary because the default one works well in most cases. However, tweaking it by reassigning weights or adding energy terms can be useful.

Here, we will make an example energy function with only the van der Waals attractive and repulsive terms, both with weights of 1. To do this, make a new `ScoreFunction` and set the weights with its `set_weight()` method. This is how we set the full-atom attractive (`fa_atr`) and full-atom repulsive (`fa_rep`) terms:

```
sfxn2 = ScoreFunction()
sfxn2.set_weight(fa_atr, 1.0)
sfxn2.set_weight(fa_rep, 1.0)
```

In [None]:
sfxn2 = ScoreFunction()
sfxn2.set_weight(fa_atr, 1.0)
sfxn2.set_weight(fa_rep, 1.0)

Let's compare the score of `pose_clean` using the full-atom `ScoreFunction` versus the `ScoreFunction` we made above using only the attractive and repulsive terms.

**Exercise. Print the total energy of `pose_clean` using the `sfxn` score function. Then, print the attractive and repulsive energy only, using the custom `sfxn2` score function.**



In [None]:
print(sfxn(pose_clean))

-382.7103539984651


In [None]:
print(sfxn2(pose_clean))

-2372.3763228948746


### Energy Breakdown
Using the full-atom `ScoreFunction` `sfxn`, break the energy of `pose_clean` down into its individual pieces with the `sfxn.show()` method.

In [None]:
sfxn.show(pose_clean)

core.scoring.ScoreFunction: 
------------------------------------------------------------
 Scores                       Weight   Raw Score Wghtd.Score
------------------------------------------------------------
 fa_atr                       1.000   -3221.945   -3221.945
 fa_rep                       0.550     849.569     467.263
 fa_sol                       1.000    1981.070    1981.070
 fa_intra_rep                 0.005    1174.487       5.872
 fa_intra_sol_xover4          1.000     104.071     104.071
 lk_ball_wtd                  1.000     -78.366     -78.366
 fa_elec                      1.000    -679.205    -679.205
 pro_close                    1.250      33.228      41.535
 hbond_sr_bb                  1.000    -151.593    -151.593
 hbond_lr_bb                  1.000     -99.405     -99.405
 hbond_bb_sc                  1.000     -90.923     -90.923
 hbond_sc                     1.000     -33.289     -33.289
 dslf_fa13                    1.250       0.000       0.000
 omega  

**Exercise. Which are the three most dominant contributions, and what are their values? Is this what you would have expected? Why?** Note which terms are positive and which are negative.
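The columns of the table are related by simple arithmetic: each weighted score is the weight multiplied by the raw score. The sketch below checks this for a few rows copied from the `sfxn.show(pose_clean)` output above. It then checks that the `sfxn2` total printed earlier matches the sum of the raw `fa_atr` and `fa_rep` scores, since both weights were set to 1.0. It uses plain Python and hand-copied numbers, not a PyRosetta call.

```python
# (term, weight, raw score, weighted score) rows copied from the
# sfxn.show(pose_clean) table above.
rows = [
    ("fa_atr", 1.000, -3221.945, -3221.945),
    ("fa_rep", 0.550, 849.569, 467.263),
    ("fa_sol", 1.000, 1981.070, 1981.070),
    ("fa_intra_rep", 0.005, 1174.487, 5.872),
    ("pro_close", 1.250, 33.228, 41.535),
]

# Weighted score = weight x raw score, up to the rounding used in the table.
for term, weight, raw, reported in rows:
    computed = weight * raw
    print(f"{term:14s}{weight:6.3f} x {raw:10.3f} = {computed:10.3f}  (table: {reported:10.3f})")

# sfxn2 weights fa_atr and fa_rep by 1.0 each, so its total should be
# the sum of the two raw scores.
raw_scores = {term: raw for term, _, raw, _ in rows}
sfxn2_total = 1.0 * raw_scores["fa_atr"] + 1.0 * raw_scores["fa_rep"]
sfxn2_reported = -2372.3763228948746  # value printed by print(sfxn2(pose_clean)) above
print(f"sfxn2 total from raw scores: {sfxn2_total:.3f}  (reported: {sfxn2_reported:.3f})")
```

The large difference between the `sfxn` and `sfxn2` totals therefore comes from two sources: the lower `fa_rep` weight in `ref2015` (0.55 instead of 1.0) and all the other terms that `sfxn2` leaves out.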

Unweighted, individual component energies of each residue in a structure are stored in the `Pose` object and can be accessed by the `energies()` method.

Note: The _backbone_ hydrogen-bonding terms for each residue are not available from the `Energies` object. You can get them by using EnergyMethodOptions. See http://www.pyrosetta.org/documentation#TOC-Hydrogen-Bonds-and-Hydrogen-Bond-Scoring.

**Exercise.** What are the total van der Waals, solvation, and hydrogen-bonding contributions of residue 24?

In [None]:
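# prompt: I need to break down the individual component energies of a residue in the pose object using the energies() method

# Note: show() writes the breakdown itself and returns None, so the print()
# wrapper only adds the trailing "None" seen in the output.
#print(pose_clean.energies().residue_total_energy(24))
print(pose_clean.energies().show(24))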


core.scoring.Energies: E               fa_atr        fa_rep        fa_sol  fa_intra_repfa_intra_sol_x   lk_ball_wtd       fa_elec     pro_close   hbond_sr_bb   hbond_lr_bb   hbond_bb_sc      hbond_sc     dslf_fa13         omega        fa_dun       p_aa_pp yhh_planarity           ref   rama_prepro
core.scoring.Energies: E(i)  24         -6.52          0.76          6.43          4.40          0.45         -0.25         -2.56          0.00          0.00          0.00         -0.66          0.00          0.00         -0.10          4.54         -0.21          0.00         -0.09         -0.25
None
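One way to approach the exercise is to group the per-term values printed above into physical categories and sum them. The sketch below uses numbers copied by hand from the `show(24)` output. The grouping of terms into categories is an assumption made for illustration (for example, solvation could also include `fa_intra_sol_xover4` and `lk_ball_wtd`).

```python
# Per-term energies of residue 24, copied from the energies().show(24) output above.
res24_energies = {
    "fa_atr": -6.52,
    "fa_rep": 0.76,
    "fa_sol": 6.43,
    "hbond_sr_bb": 0.00,
    "hbond_lr_bb": 0.00,
    "hbond_bb_sc": -0.66,
    "hbond_sc": 0.00,
}

# Assumed grouping of score terms into categories (one reasonable choice,
# not the only one).
groups = {
    "van der Waals": ["fa_atr", "fa_rep"],
    "solvation": ["fa_sol"],
    "hydrogen bonding": ["hbond_sr_bb", "hbond_lr_bb", "hbond_bb_sc", "hbond_sc"],
}

# Sum the terms in each category.
totals = {
    name: sum(res24_energies[term] for term in terms)
    for name, terms in groups.items()
}

for name, total in totals.items():
    print(f"{name:18s}{total:7.2f}")
```

The two backbone-backbone hydrogen-bond terms read 0.00 here, which is consistent with the note above that they are not available per residue from the `Energies` object.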


In [None]:
# ChatGPT prompt:
#I need to breakdown the individual component energies of a residue in the pose object using the energies() method in pyrosetta.
#I have already imported pyrosetta and pyrosetta.teaching. I have also loaded a pdb file and created a pose (pose_clean). Can you use one-line code to get the breakdown of the energy terms of residue 24?


from pyrosetta.rosetta.core.scoring import ScoreType
residue_index = 24
energies = pose_clean.energies().residue_total_energies(residue_index)
score_types = ScoreType.end_of_score_type_enumeration
#energy_breakdown = {ScoreType(i).name: energies[ScoreType(i)] for i in range(1, score_types + 1)}
energy_breakdown = {ScoreType(i).name: energies[ScoreType(i)] for i in range(1, int(score_types) + 1)} # suggested by Gemini, the line above results in error


# Convert to DataFrame and display
import pandas as pd
energy_df = pd.DataFrame(list(energy_breakdown.items()), columns=['Energy Term', 'Value'])
print(energy_df)

                         Energy Term     Value
0                             fa_atr -6.519673
1                             fa_rep  0.762649
2                             fa_sol  6.431979
3                       fa_intra_atr -2.151932
4                       fa_intra_rep  4.401573
..                               ...       ...
463  membrane_span_term_z_constraint  0.000000
464               aromatic_restraint  0.000000
465                  rna_coarse_dist  0.000000
466                      total_score  0.148293
467                 dummy_score_type  0.000000

[468 rows x 2 columns]


The van der Waals, solvation, and electrostatic terms are atom-atom pairwise energies calculated from a pre-tabulated lookup table, and they depend on the distance between the two atoms and on their types. You can access this lookup table, called the `etable`, directly to check these energy calculations on an atom-by-atom basis. Use the `pyrosetta.etable_atom_pair_energies` function, which returns the attractive, repulsive, solvation, and electrostatic scores for a pair of atoms.

(Note that `pyrosetta.etable_atom_pair_energies()` takes `Residue` objects and integer atom indices, not the `AtomID` objects we saw earlier.

You can access a residue with `pose.residue()` and look up the index of one of its atoms by name with `residue.atom_index("")`. For instance, to get the index of the C-alpha atom of residue 15 of the pose `pose`, you can use:

```
res15 = pose.residue(15)
res15_atomCA = res15.atom_index("CA")
```
For more info, look at the [documentation](https://graylab.jhu.edu/PyRosetta.documentation/pyrosetta.toolbox.atom_pair_energy.html?highlight=etable_atom_pair_energies#pyrosetta.toolbox.atom_pair_energy.etable_atom_pair_energies).)
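To make the idea of a pre-tabulated pairwise energy concrete, here is a toy sketch in plain Python. It tabulates a generic 12-6 Lennard-Jones curve on a distance grid and then looks energies up by bin. This is not Rosetta's actual functional form or table layout. The parameters `r_min`, `eps`, `STEP`, `R_START`, and `CUTOFF` are illustrative assumptions, and a real etable also depends on the atom types of the pair.

```python
def lj_12_6(r, r_min=4.0, eps=0.1):
    """Generic 12-6 Lennard-Jones energy with a minimum of -eps at r = r_min.

    Illustrative only; Rosetta's fa_atr/fa_rep use a modified form.
    """
    x = (r_min / r) ** 6
    return eps * (x * x - 2.0 * x)


STEP = 0.05     # bin width in angstroms (assumed)
R_START = 1.0   # first tabulated distance (assumed)
CUTOFF = 6.0    # beyond this distance the energy is treated as zero (assumed)

# Pre-tabulate the energy once on a regular distance grid.
n_bins = int(round((CUTOFF - R_START) / STEP)) + 1
table = [lj_12_6(R_START + i * STEP) for i in range(n_bins)]


def lookup(r):
    """Return the tabulated energy for the bin nearest to distance r."""
    if r >= CUTOFF:
        return 0.0
    index = max(0, int(round((r - R_START) / STEP)))
    return table[index]


for distance in (3.5, 4.0, 4.02, 5.0, 7.0):
    print(f"r = {distance:4.2f}  table: {lookup(distance):8.4f}  exact: {lj_12_6(distance):8.4f}")
```

The lookup trades a small binning error for speed, which is the usual motivation for tabulating pairwise energies.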


In [None]:
help(pyrosetta.etable_atom_pair_energies)

Help on function etable_atom_pair_energies in module pyrosetta.toolbox.atom_pair_energy:

etable_atom_pair_energies(res1, atom_index_1, res2, atom_index_2, sfxn)
    Compute the energy of two atoms and return the LJ, solvation and electrostatic
    terms.
    
    Args:
        res1 (pyrosetta.rosetta.core.conformation.Residue): the residue that contains the
            first atom of interest.
        atom_index_1 (int): index of the desired atom in residue 1
        res2 (pyrosetta.rosetta.core.conformation.Residue): the residue that contains the
            second atom of interest.
        atom_index_2 (int): index of the desired atom in residue 2
    
    Returns:
        tuple: values of the lj_atr, lj_rep, fa_solv, and fa_elec potentials.
    
    Usage: lj_atr, lj_rep, solv=etable_atom_pair_energies(res1, atom_index_1, res2, atom_index_2, sfxn)
        Description: given a pair of atoms (specified using a pair of residue objects and
        atom indices) and scorefunction, use th

**Exercise. What are the attractive, repulsive, solvation, and electrostatic components between the nitrogen of residue 24 and the oxygen of residue 20?**

In [None]:
res20 = pose.residue(20)
res20_atomO = res20.atom_index("O")   # index of the oxygen atom of residue 20
res24 = pose.residue(24)
res24_atomN = res24.atom_index("N")   # index of the nitrogen atom of residue 24
etable_atom_pair_energies(res20, res20_atomO, res24, res24_atomN, sfxn)

(-0.15176425575543143,
 0.0366482557141293,
 0.7275895358463873,
 2.765511964725755)
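The returned tuple is unlabeled, so it can help to attach names to the four values. The sketch below does this with a `namedtuple`. The field names follow the "Returns" line of the help text above, the numbers are copied from the cell output, and the class name `AtomPairEnergies` is a hypothetical label rather than a PyRosetta type.

```python
from collections import namedtuple

# Field order follows the help text: lj_atr, lj_rep, fa_solv, fa_elec.
AtomPairEnergies = namedtuple(
    "AtomPairEnergies", ["lj_atr", "lj_rep", "fa_solv", "fa_elec"]
)

# Values copied from the output above (O of residue 20, N of residue 24).
pair_energies = AtomPairEnergies(
    -0.15176425575543143,
    0.0366482557141293,
    0.7275895358463873,
    2.765511964725755,
)

# Print each component with its name.
for name, value in pair_energies._asdict().items():
    print(f"{name:8s}{value:9.4f}")
```

In a live session, the same wrapper could be applied directly to the function's return value, for example `AtomPairEnergies(*etable_atom_pair_energies(res20, res20_atomO, res24, res24_atomN, sfxn))`, assuming the four-value return shown above. The atom index there is the value of `res20.atom_index("O")`, whatever name it is stored under.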

ChatGPT prompt:
I have selected these two atoms from my structure:
```
res20 = pose.residue(20)
res20_atomCA = res20.atom_index("O")
res24 = pose.residue(24)
res24_atomN = res24.atom_index("N")
```
The van der Waals, solvation, and electrostatic terms are atom-atom pairwise energies calculated from a pre-tabulated lookup table, dependent upon the distance between the two atoms and their types. You can access this lookup table, called the etable directly to check these energy calculations on an atom-by-atom basis. I need to use the `pyrosetta.etable_atom_pair_energies` function which returns scores for attractive, repulsive, solvation and electrostatic potentials, to find out what are the attractive, repulsive, solvation, and electrostatic components between the nitrogen of residue 24 and the oxygen of residue 20

In [None]:
from pyrosetta import *

# Initialize PyRosetta if you haven't already
# init()  # Uncomment if PyRosetta isn't initialized

# Step 1: Select the two residues and look up the atom indices
res20 = pose_clean.residue(20)
res24 = pose_clean.residue(24)

res20_atomO = res20.atom_index("O")  # Oxygen of residue 20
res24_atomN = res24.atom_index("N")  # Nitrogen of residue 24

# Step 2: Get the score function
# get_fa_scorefxn() returns the default full-atom score function, which should be
# ref2015 unless PyRosetta was initialized with other score-function options.
scorefxn = get_fa_scorefxn()

# Step 3: Calculate the atom-atom energy components
# Per the help() text above, the signature is
# (res1, atom_index_1, res2, atom_index_2, sfxn): the fifth argument is the
# score function itself, not a pose or an EnergyMethodOptions object.
attractive, repulsive, solvation, electrostatic = pyrosetta.etable_atom_pair_energies(
    res20, res20_atomO, res24, res24_atomN, scorefxn
)

# Step 4: Display the results
print(f"Attractive energy: {attractive}")
print(f"Repulsive energy: {repulsive}")
print(f"Solvation energy: {solvation}")
print(f"Electrostatic energy: {electrostatic}")