# The Most Beautiful Disulfide Bond in the World
Eric G. Suchanek, PhD 2/24/24

In this notebook I illustrate some of the features of `proteusPy` by analyzing the lowest-energy disulfide bond in the RCSB protein structure database. If you are not familiar with `proteusPy`, you can find the API at: https://suchanek.githubio.com/proteusPy.html


In [None]:
#
import pandas as pd
import pyvista as pv
from pyvista import set_plot_theme

from proteusPy.Disulfide import Disulfide, Disulfide_Energy_Function
from proteusPy.DisulfideList import DisulfideList
from proteusPy.DisulfideLoader import Load_PDB_SS

# pyvista setup for notebooks
pv.set_jupyter_backend('trame')

set_plot_theme('dark')
LIGHT = True

### Load the RCSB Disulfide Database
We load the database and get its properties as follows:

In [None]:
PDB_SS = Load_PDB_SS(verbose=True)
PDB_SS.describe()

We see from the statistics above that disulfide 2q7q_75D_140D has the lowest energy, so let's extract it from the database and display it. 

A few notes about the display window. You might need to click into the window to refresh it. Click and drag to rotate the structures; use the mouse wheel to zoom. The window titles display several parameters about the disulfide bonds, including their approximate torsional energy, their Cα-Cα distance, and the *torsion length*.

The latter parameter is, formally, the Euclidean length of the five sidechain dihedral angles when treated as a five-dimensional vector. This sounds all mathy and complicated, but in essence it gives a measure of how 'long' that five-dimensional vector is. The package uses it to compare individual structures and gauge their structural similarity.
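
As a minimal sketch of the torsion length described above, the snippet below treats the five dihedral angles as a plain vector and takes its Euclidean norm. The angles are the ones this notebook reports for 2q7q_75D_140D. The helper name `torsion_length` is illustrative and is not part of the `proteusPy` API, and it assumes the angles are given in degrees.

```python
import math

def torsion_length(dihedrals):
    """Euclidean norm of the dihedral angles (hypothetical helper, angles in degrees)."""
    return math.sqrt(sum(angle ** 2 for angle in dihedrals))

# chi1..chi5 for 2q7q_75D_140D, as reported in this notebook
best = [-59.36, -59.28, -83.66, -59.82, -59.91]
length = torsion_length(best)
print(f"Torsion length: {length:.2f} degrees")
```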

In [None]:
ssmin, _ = PDB_SS.SSList.minmax_energy
ssmin_energy = ssmin.energy

best_ss = PDB_SS['2q7q_75D_140D']
best_dihedrals = best_ss.dihedrals
best_ss.pprint()
best_ss.display(style='sb', light=LIGHT)

And that, gentle reader, is it. *The most beautiful disulfide bond in the world*! Look at it. This is the lowest energy structure in the entire database, with sidechain dihedral angles (Χ1-Χ5) of -59.36°, -59.28°, -83.66°, -59.82°, -59.91° and an estimated energy of 0.49 kcal/mol. How does this compare to an analytical (modelled) minimum? We can use the ``minimize`` function from ``scipy.optimize`` to check. We know from chemistry that a reasonable guess for a low energy conformation would have the dihedral angles (Χ1-Χ5) -60.00°, -60.00°, -90.00°, -60.00°, -60.00°, which gives 0.60 kcal/mol. Let's run this through scipy and compute a minimum energy conformation:


In [17]:
from scipy.optimize import minimize
import numpy as np

    
initial_guess = [-60.0, -60.0, -90.0, -60.0, -60.0] # initial guess for chi1, chi2, chi3, chi4, chi5
result = minimize(Disulfide_Energy_Function, initial_guess, method="Nelder-Mead")
minimum_energy = result.fun
min_conf = result.x
print(f'Minimum Energy: {minimum_energy:.3f} for conformation: {[f"{x:.3f}" for x in min_conf]}')


Minimum Energy: 0.489 for conformation: ['-60.000', '-60.000', '-83.048', '-60.000', '-60.000']
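
To see where that minimum comes from without installing the package, here is a rough sketch. The functional form of `approx_energy` below is an assumption: it is a simple torsional potential chosen because it reproduces the two energies quoted in this notebook (0.60 kcal/mol at the initial guess and 0.489 kcal/mol at the minimum), and the real `Disulfide_Energy_Function` may differ. A brute-force scan over Χ3 stands in for Nelder-Mead, with the other four angles held at -60°.

```python
import math

def approx_energy(chi1, chi2, chi3, chi4, chi5):
    """Assumed torsional energy (kcal/mol) for angles in degrees; not the package's code."""
    def c(deg):
        return math.cos(math.radians(deg))

    energy = 2.0 * (c(3 * chi1) + c(3 * chi5))
    energy += c(3 * chi2) + c(3 * chi4)
    energy += 3.5 * c(2 * chi3) + 0.6 * c(3 * chi3) + 10.1
    return energy

# Energy at the chemically motivated initial guess
guess_energy = approx_energy(-60.0, -60.0, -90.0, -60.0, -60.0)

# Scan chi3 from -120 to -60 degrees in 0.001 degree steps
candidates = [-120.0 + 0.001 * i for i in range(60001)]
best_chi3 = min(candidates, key=lambda x: approx_energy(-60.0, -60.0, x, -60.0, -60.0))
best_energy = approx_energy(-60.0, -60.0, best_chi3, -60.0, -60.0)

print(f"Energy at initial guess: {guess_energy:.3f} kcal/mol")
print(f"Scan minimum: chi3 = {best_chi3:.3f}, energy = {best_energy:.3f} kcal/mol")
```

Under this assumed form, only Χ3 moves away from the initial guess, because the cos(3Χ) terms for the other four angles are already at their minimum at -60°.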


So the computed minimum energy is an estimated *0.489* kcal/mol. The difference from the actual conformation is:

In [18]:
diff = minimum_energy - ssmin_energy
print(f'Modeled - actual energy difference is: {diff} kcal/mol')

Modeled - actual energy difference is: -0.002797862530647066 kcal/mol


The difference is negative, so the modelled minimum is slightly lower, but the real structure sits within 0.003 kcal/mol of it! What an amazing disulfide bond! Let's build a model of the computed minimum energy conformation. We do that by creating an empty disulfide and then calling its `build_model` method.

In [14]:
modelled_min = Disulfide('model')
modelled_min.build_model(min_conf[0], min_conf[1], min_conf[2], min_conf[3], min_conf[4])

Now make a ``DisulfideList`` and put the real structure and the modelled structure into it.

In [15]:
minmax = DisulfideList([modelled_min, ssmin], 'minmax')

Finally, display them in a common reference frame:

In [16]:
minmax.display_overlay()


The two structures overlap with an overall RMS error of 2.14 Å. Not bad considering!
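
For readers unfamiliar with the measure, an RMS deviation between two sets of matched atoms is conventionally the square root of the mean squared distance between corresponding atoms. The sketch below shows that textbook calculation on made-up coordinates. The function name `rms_deviation` is hypothetical, and whether `display_overlay` computes its reported value in exactly this way is an assumption.

```python
import math

def rms_deviation(coords_a, coords_b):
    """Root-mean-square distance between matched 3D points (hypothetical helper)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    squared = [
        sum((a - b) ** 2 for a, b in zip(point_a, point_b))
        for point_a, point_b in zip(coords_a, coords_b)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Toy coordinates in angstroms (not real atom positions):
# the second set is the first shifted by 1 angstrom along z.
set_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 2.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 2.0, 1.0)]
rmsd = rms_deviation(set_a, set_b)
print(f"RMS deviation: {rmsd:.2f} angstroms")
```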

## References
* *Application of Artificial Intelligence in Protein Design* - Doctoral Dissertation, EG Suchanek, 1987, Johns Hopkins Medical School
* https://doi.org/10.1021/bi00368a023
* https://doi.org/10.1021/bi00368a024
* https://doi.org/10.1016/0092-8674(92)90140-8
* http://dx.doi.org/10.2174/092986708783330566
* https://doi.org/10.1021/bi0603064
* https://doi.org/10.1021/bi9826658
* https://pubmed.ncbi.nlm.nih.gov/22782563/