# FRETpredict tutorial (pp11)

In [1]:
import MDAnalysis
import numpy as np
import os
import pandas as pd
import pickle
import yaml
from FRETpredict import FRETpredict
import warnings

warnings.filterwarnings('ignore')



### Quick biological background

Polyproline 11 (pp11) has been described as behaving like a rigid rod, and was used as a "spectroscopic ruler" in the seminal paper by Stryer and Haugland. The pp11 system is a classical example of the importance of comparing molecular models with FRET data.

![title](pp11_structure.png)

We will perform FRET Efficiency calculations by placing the rotamer libraries on the two terminal residues.<br>
<br>First, let's have a look at the available rotamer libraries and how they are named.

### Rotamer library

In [2]:
with open('../../FRETpredict/lib/libraries.yml') as f:
    # safe_load avoids the missing-Loader error raised by yaml.load in recent PyYAML versions
    libraries = yaml.safe_load(f)
libraries

{'ATTO 390 C2R': {'author': 'D Montepietra, G Tesei, JM Martins, MBA Kunze, RB Best, K Lindorff-Larsen',
  'citation': 'TBD',
  'filename': 'T39_C2R_cutoff30',
  'licence': 'GPLv3',
  'mu': ['C1', 'C10 and resname T39'],
  'negative': [],
  'positive': [],
  'r': ['C7 and resname T39']},
 'ATTO 390 L1R': {'author': 'D Montepietra, G Tesei, JM Martins, MBA Kunze, RB Best, K Lindorff-Larsen',
  'citation': 'TBD',
  'filename': 'T39_L1R_cutoff30',
  'licence': 'GPLv3',
  'mu': ['C1', 'C10 and resname T39'],
  'negative': [],
  'positive': [],
  'r': ['C7 and resname T39']},
 'ATTO 425 C2R': {'author': 'D Montepietra, G Tesei, JM Martins, MBA Kunze, RB Best, K Lindorff-Larsen',
  'citation': 'TBD',
  'filename': 'T42_C2R_cutoff30',
  'licence': 'GPLv3',
  'mu': ['C1', 'C10 and resname T42'],
  'negative': [],
  'positive': [],
  'r': ['C7 and resname T42']},
 'ATTO 425 L1R': {'author': 'D Montepietra, G Tesei, JM Martins, MBA Kunze, RB Best, K Lindorff-Larsen',
  'citation': 'TBD',
  'file

Every Rotamer Library name is composed of three parts: the manufacturer (AlexaFluor, ATTO, Lumiprobe), the peak wavelength (e.g. 488, 550, 647), and the linker that connects the dye to the protein (C1R, C2R, C3R, L1R, L2R, B1R).<br>
<br>To learn more about rotamer libraries, see __[`Tutorial_generate_new_rotamer_libraries`](https://github.com/Monte95/FRETpredict/blob/50ad48c7f2df4fc0aaca52158eb349cc82e4e2c1/tests/tutorials/Tutorial_generate_new_rotamer_libraries.ipynb)__.
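
The naming convention described above can be illustrated with a small helper that splits a library name into its parts. This is only a sketch: `split_library_name` is a hypothetical function, not part of FRETpredict, and it assumes the parts are separated by single spaces, with any extra tokens (such as `cutoff10`) following the linker.

```python
def split_library_name(name):
    """Split a rotamer library name into its components.

    Returns (manufacturer, wavelength, linker, extra), where `extra` is a
    list of any trailing tokens such as 'cutoff10'.
    Hypothetical helper for illustration; not part of FRETpredict.
    """
    parts = name.split()
    if len(parts) < 3:
        raise ValueError(f"expected at least three parts, got: {name!r}")
    manufacturer, wavelength, linker = parts[:3]
    return manufacturer, int(wavelength), linker, parts[3:]


print(split_library_name('ATTO 390 C2R'))
print(split_library_name('AlexaFluor 488 C1R cutoff10'))
```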

### FRET efficiency calculation

Now, we will select the parameters for the FRET Efficiency calculation:
- The `residues` on which to place the rotamer libraries and their protein `chains`. For this tutorial we use the first and the last residues of the pp11 chain.
- The rotamer libraries we will use: AlexaFluor dyes 488 and 594, with C1R linkers. `donor` and `acceptor` are used for the R0 calculations, while `libname_1` and `libname_2` are used for the FRET Efficiency calculation. `r0lib` is the path to the dye files (by default set to `lib/R0/`).
- The Universe object for the `protein` structure.
- `electrostatic` calculations will be enabled, and the `temperature` will be set to 298 K.

In [3]:
# Experimental FRET efficiency value to compare our results
Ex = 0.88

# Create MDAnalysis.Universe object for the protein 
u = MDAnalysis.Universe('../test_systems/pp11/pp11.pdb', '../test_systems/pp11/pp11.xtc')

Let's create an instance of the FRETpredict class.

In [4]:
FRET = FRETpredict(protein=u, residues=[0, 12], temperature=298, 
                   chains=['A', 'A'], electrostatic=True,
                   donor='AlexaFluor 488', acceptor='AlexaFluor 594', 
                   libname_1='AlexaFluor 488 C1R cutoff10',
                   libname_2='AlexaFluor 594 C1R cutoff10',
                   r0lib='../../FRETpredict/lib/R0/',
                   output_prefix='test/E_pp11_10')

Run FRET efficiency calculations.

In [5]:
FRET.run()


Frame 1/316
Frame 2/316
...
Frame 72/316



Save the FRETpredict object to a `pickle` file for future use.

In [6]:
with open('test/FRET_pp11_obj.pkl', 'wb') as file:
    
    pickle.dump(FRET, file)
    
    print('Object successfully saved to "FRET_pp11_obj.pkl"')

Object successfully saved to "FRET_pp11_obj.pkl"


Create DataFrame of the data (experimental and predicted) and show the results.

In [7]:
results = []

data = pd.read_pickle('test/E_pp11_10-data-0-12.pkl')

E_avg = np.mean([float(data.loc['Estatic', 'Average']), 
                 float(data.loc['Edynamic1', 'Average']),
                 float(data.loc['Edynamic2', 'Average'])])

results.append({'res': '1-13',
                'chromophore': 'AlexaFluor 488 C1R - AlexaFluor 594 C1R ',
                'cutoff' : 10,
                'k2': float(data.loc['k2', 'Average']),
                'Ex': Ex, 
                'Es': float(data.loc['Estatic', 'Average']),
                'Ed': float(data.loc['Edynamic1', 'Average']),
                'Ed2': float(data.loc['Edynamic2', 'Average']), 
                'E_avg': E_avg})
    
# Save data
np.save('test/results_pp11.npy', results)

# Display results
results_pp11_df = pd.DataFrame(results)
results_pp11_df

| | res | chromophore | cutoff | k2 | Ex | Es | Ed | Ed2 | E_avg |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1-13 | AlexaFluor 488 C1R - AlexaFluor 594 C1R | 10 | 0.684587 | 0.88 | 0.731976 | 0.876376 | 0.993194 | 0.867182 |
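
Because `results` is a list of dictionaries, `np.save` stores it as a pickled object array, and reading it back requires `allow_pickle=True`. The sketch below shows the round trip on a temporary file; the file name `results_demo.npy` and the reduced dictionary are illustrative only, with values copied from the table above.

```python
import os
import tempfile

import numpy as np

# Reduced stand-in for the `results` list built above (illustrative values)
results = [{'res': '1-13', 'Ex': 0.88, 'E_avg': 0.867182}]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'results_demo.npy')
    np.save(path, results)  # saved as an object array of dicts
    # np.load refuses pickled object arrays unless allow_pickle=True
    loaded = np.load(path, allow_pickle=True).tolist()

# Deviation of the averaged prediction from the experimental value
print(loaded[0]['E_avg'] - loaded[0]['Ex'])
```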


## R0 calculation

The method used to obtain R0 for the FRET Efficiency calculation can be selected.<br>
<br>By default (as in the previous example), R0 is calculated for the provided dye pair (`donor` and `acceptor`) for every rotamer conformation, taking the relative orientation into account.<br>
<br>However, it is also possible to use a fixed R0 value. The only change is in the parameters passed to FRETpredict.<br>
<br>Let's see how it's done.

In [13]:
FRET_fixedR0 = FRETpredict(protein=u, residues=[0, 12], temperature=298, 
                           chains=['A', 'A'], 
                           fixed_R0=True, r0=5.68, electrostatic=True,
                           libname_1='AlexaFluor 488 C1R cutoff10',
                           libname_2='AlexaFluor 594 C1R cutoff10', 
                           output_prefix='test/E_pp11_10_fixedR0')

We set `fixed_R0=True` and passed the selected R0 value to the `r0` parameter. __[`lib/R0/R0_pairs.csv`](https://github.com/Monte95/FRETpredict/blob/ce327c60cf5f86a33251b2e0c60cf1f668bc46e2/FRETpredict/lib/R0/R0_pairs.csv)__ reports the R0 values for many dye pairs.<br>
<br>If the R0 value for a dye pair is not present in the file, online services such as __[FPBase](https://www.fpbase.org/fret/)__ can be used to obtain it.
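
To give a feel for what R0 controls, here is a minimal sketch of the standard Förster relation $E = 1 / (1 + (r/R_0)^6)$ and of the textbook scaling $R_0^6 \propto \kappa^2$. It is not FRETpredict's internal implementation: the function names are hypothetical, and the assumption that a tabulated R0 corresponds to $\kappa^2 = 2/3$ is the usual convention rather than something stated in this tutorial.

```python
def fret_efficiency(r, r0):
    """Foerster efficiency E = 1 / (1 + (r / R0)^6) for a single distance r."""
    return 1.0 / (1.0 + (r / r0) ** 6)


def r0_for_kappa2(r0_iso, kappa2):
    """Rescale an isotropic R0 (assumed kappa^2 = 2/3) to another kappa^2.

    Uses the proportionality R0^6 ~ kappa^2.
    """
    return r0_iso * (kappa2 / (2.0 / 3.0)) ** (1.0 / 6.0)


r0 = 5.68  # fixed R0 value used above; r must be in the same length unit
for r in (4.0, 5.68, 7.0):
    print(f"r = {r:.2f}, E = {fret_efficiency(r, r0):.3f}")

# R0 grows slowly with the orientation factor because of the 1/6 power
print(r0_for_kappa2(r0, 1.0))
```

At `r = r0` the efficiency is exactly 0.5, which is the defining property of the Förster radius.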

Now let's run FRET Efficiency calculations.

In [14]:
FRET_fixedR0.run()


Frame 1/316
Frame 2/316
...
Frame 72/316



Save the FRETpredict object to a `pickle` file for future use.

In [15]:
with open('test/FRET_pp11_fixedR0_obj.pkl', 'wb') as file:
    
    # Save the fixed-R0 object (not the earlier `FRET` instance)
    pickle.dump(FRET_fixedR0, file)
    
    print('Object successfully saved to "FRET_pp11_fixedR0_obj.pkl"')

Object successfully saved to "FRET_pp11_fixedR0_obj.pkl"


Create DataFrame of the data (experimental and predicted) and show the results.

In [20]:
results_fixedR0 = []

data = pd.read_pickle('test/E_pp11_10_fixedR0-data-0-12.pkl')

E_avg = np.mean([float(data.loc['Estatic', 'Average']), 
                 float(data.loc['Edynamic1', 'Average']),
                 float(data.loc['Edynamic2', 'Average'])])

results_fixedR0.append({'res': '1-13',
                        'chromophore': 'Fixed R0 AlexaFluor 488 C1R - AlexaFluor 594 C1R ',
                        'cutoff' : 10,
                        'k2': float(data.loc['k2', 'Average']),
                        'Ex': Ex, 
                        'Es': float(data.loc['Estatic', 'Average']),
                        'Ed': float(data.loc['Edynamic1', 'Average']),
                        'Ed2': float(data.loc['Edynamic2', 'Average']), 
                        'E_avg': E_avg})
    
# Save data
np.save('test/results_fixedR0.npy', results_fixedR0)

# Display results and compare with previous case
results_fixedR0_df = pd.DataFrame(results_fixedR0)
pd.concat([results_pp11_df, results_fixedR0_df], ignore_index=True)

| | res | chromophore | cutoff | k2 | Ex | Es | Ed | Ed2 | E_avg |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1-13 | AlexaFluor 488 C1R - AlexaFluor 594 C1R | 10 | 0.684587 | 0.88 | 0.731976 | 0.876376 | 0.993194 | 0.867182 |
| 1 | 1-13 | Fixed R0 AlexaFluor 488 C1R - AlexaFluor 594 C1R | 10 | 0.684587 | 0.88 | 0.729104 | 0.874128 | 0.992961 | 0.865398 |


## Reweighting

As you have probably noticed, at the end of its calculations `FRET.run()` printed the message `Effective fraction of frames contributing to average: 0.9728538649554426`.

This value indicates how useful it would be to reweight the frames of the trajectory. Each frame $s$ is assigned a weight $w_s$ obtained by multiplying the Boltzmann partition functions $Z_{s1}$ and $Z_{s2}$ of the donor and acceptor for that frame.<br>

$w_s = \frac{Z_s}{\sum_s Z_s}$, with $Z_s = Z_{s1} \cdot Z_{s2}$. Weights are normalized such that $\sum_s w_s = 1$.

In this way, frames with many dye-protein steric clashes have a low weight.

The effective fraction of frames contributing to the average is thus computed as $\phi_{eff} = \exp(S)$ with $S = - \sum_s w_s \cdot \ln(N w_s)$, where $N$ is the number of frames, so that uniform weights give $\phi_{eff} = 1$.

FRETpredict calculations are run by default with all $w_s = 1 /$num_frames.
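
The weighting scheme can be sketched in a few lines of NumPy. This is an illustration, not FRETpredict's own code: the function names are hypothetical, the partition function values are made up, and the effective fraction is normalized by the number of frames $N$ (an assumption of this sketch) so that it lies between $1/N$ and 1.

```python
import numpy as np


def frame_weights(z_donor, z_acceptor):
    """Normalized per-frame weights w_s = Z_s / sum(Z_s), with Z_s = Z_s1 * Z_s2."""
    z = np.asarray(z_donor, dtype=float) * np.asarray(z_acceptor, dtype=float)
    return z / z.sum()


def effective_fraction(weights):
    """exp(-sum_s w_s * ln(N * w_s)): 1 for uniform weights, 1/N if one frame dominates."""
    w = np.asarray(weights, dtype=float)
    n = w.size
    w = w[w > 0]  # terms with w_s = 0 contribute nothing to the sum
    return float(np.exp(-np.sum(w * np.log(n * w))))


# Made-up partition functions for four frames; the third has many clashes
z1 = [1.0, 0.9, 0.1, 1.1]
z2 = [1.0, 1.0, 0.2, 0.9]
w = frame_weights(z1, z2)
print(w, effective_fraction(w))
```

A frame with small partition functions (many steric clashes) receives a small weight, which lowers the effective fraction below 1.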

Based on the $\phi_{eff}$ value, one can decide whether to reweight the trajectory frames using $w_s$. FRETpredict implements an easy-to-use reweighting approach.

In [8]:
# REWEIGHTING
FRET.reweight(reweight_prefix='test/E_pp11_10_reweighted')

Effective fraction of frames contributing to average: 0.9728538649554426


Alternatively, to reweight a previously analyzed trajectory, load the saved FRETpredict object from its pickle file and call `reweight` on it.

In [9]:
# Load FRETpredict object from pickle file
with open('test/FRET_pp11_obj.pkl', 'rb') as file:
    
    FRET_file = pickle.load(file)
    
    print(f'Object successfully loaded from "FRET_pp11_obj.pkl"')

Object successfully loaded from "FRET_pp11_obj.pkl"


In [10]:
# REWEIGHTING
FRET_file.reweight(reweight_prefix='test/E_pp11_10_reweighted')

Effective fraction of frames contributing to average: 0.9728538649554426


Create the DataFrame of the data (experimental and predicted) and show the results.

In [25]:
results_ws = []

data = pd.read_pickle('test/E_pp11_10_reweighted-data-0-12.pkl')
    
E_avg = np.mean([float(data.loc['Estatic', 'Average']), 
                 float(data.loc['Edynamic1', 'Average']),
                 float(data.loc['Edynamic2', 'Average'])])

results_ws.append({'res': '1-13',
                   'chromophore': 'AlexaFluor 488 C1R - AlexaFluor 594 C1R ',
                   'cutoff' : 10,
                   'k2': float(data.loc['k2', 'Average']),
                   'Ex': Ex,
                   'Es': float(data.loc['Estatic', 'Average']),
                   'Ed': float(data.loc['Edynamic1', 'Average']),
                   'Ed2': float(data.loc['Edynamic2', 'Average']), 
                   'E_avg': E_avg,})

# Save data
np.save('test/results_pp11_ws.npy', results_ws)

# Display results
results_pp11_ws_df = pd.DataFrame(results_ws)
pd.concat([results_pp11_df, results_pp11_ws_df], ignore_index=True)

| | res | chromophore | cutoff | k2 | Ex | Es | Ed | Ed2 | E_avg |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1-13 | AlexaFluor 488 C1R - AlexaFluor 594 C1R | 10 | 0.684587 | 0.88 | 0.731976 | 0.876376 | 0.993194 | 0.867182 |
| 1 | 1-13 | AlexaFluor 488 C1R - AlexaFluor 594 C1R | 10 | 0.690553 | 0.88 | 0.736478 | 0.880143 | 0.993216 | 0.869946 |


In this case, only a small change is expected from reweighting because pp11 is essentially rigid and the structure should already be "adapted" to the dyes. This is reflected in the high percentage of frames contributing to the average (~97%).

Reweighting is more useful when this percentage is lower, as is usually the case for Intrinsically Disordered Proteins (see the FRETpredict paper https://doi.org/10.1101/2023.01.27.525885).