# The Most Beautiful Disulfide Bond in the World
Eric G. Suchanek, PhD 5/4/24

In this notebook I illustrate some of the features of `proteusPy` by analyzing the lowest-energy disulfide bond in the RCSB protein structure database. If you are not familiar with `proteusPy`, you can find the API documentation at: https://suchanek.githubio.com/proteusPy.html


In [2]:
import pyvista as pv
import proteusPy
from proteusPy import Disulfide, DisulfideList, Load_PDB_SS

# pyvista setup for notebooks
pv.set_jupyter_backend("trame")
pv.set_plot_theme("default")

LIGHT = True

proteusPy.__version__

'0.96.2dev'

### Load the RCSB Disulfide Database
We load the database and get its properties as follows:

In [3]:
PDB_SS = Load_PDB_SS(verbose=True)
PDB_SS.describe()

--> DisulfideLoader: Downloading Disulfide Database from Drive...


Downloading...
From (original): https://drive.google.com/uc?id=1igF-sppLPaNsBaUS7nkb13vtOGZZmsFp
From (redirected): https://drive.google.com/uc?id=1igF-sppLPaNsBaUS7nkb13vtOGZZmsFp&confirm=t&uuid=44ed5136-401c-4747-9161-ddde5085dc68
To: /Users/egs/miniforge3/envs/proteusPy/lib/python3.11/site-packages/proteusPy/data/PDB_SS_ALL_LOADER.pkl
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 477M/477M [00:26<00:00, 17.8MB/s]


--> DisulfideLoader: Downloading Disulfide Subset Database from Drive...


Downloading...
From: https://drive.google.com/uc?id=1puy9pxrClFks0KN9q5PPV_ONKvL-hg33
To: /Users/egs/miniforge3/envs/proteusPy/lib/python3.11/site-packages/proteusPy/data/PDB_SS_SUBSET_LOADER.pkl
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 13.6M/13.6M [00:00<00:00, 13.9MB/s]


-> load_PDB_SS(): Reading /Users/egs/miniforge3/envs/proteusPy/lib/python3.11/site-packages/proteusPy/data/PDB_SS_ALL_LOADER.pkl... 
-> load_PDB_SS(): Done reading /Users/egs/miniforge3/envs/proteusPy/lib/python3.11/site-packages/proteusPy/data/PDB_SS_ALL_LOADER.pkl... 
PDB IDs present:                    35881
Disulfides loaded:                  168984
Average structure resolution:       2.55 Å
Lowest Energy Disulfide:            2q7q_75D_140D
Highest Energy Disulfide:           6vxk_801B_806B
Cα distance cutoff:                 8.00 Å
Total RAM Used:                     42.71 GB.


We see from the statistics above that disulfide 2q7q_75D_140D has the lowest energy, so let's extract it from the database and display it. 

A few notes about the display window. You might need to click into the window to refresh it. Click and drag to rotate the structures; use the mouse wheel to zoom. The window titles display several parameters of the disulfide bonds, including their approximate torsional energy, their Cα-Cα distance, and the *torsion length*. 

The latter parameter is, formally, the Euclidean length of the five sidechain dihedral angles (Χ1-Χ5) when treated as a five-dimensional vector. This sounds all mathy and complicated, but in essence it measures how 'long' that five-dimensional vector is. The package uses it to compare individual structures and gauge their structural similarity.
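To make the torsion length concrete, here is a minimal sketch that computes it by hand from the five dihedral angles reported for 2q7q_75D_140D in this notebook. The helper name `torsion_length` is hypothetical and is not part of the `proteusPy` API; it simply applies the Euclidean norm described above.

```python
import math

# Chi1..Chi5 in degrees, as printed for 2q7q_75D_140D in this notebook.
dihedrals = [-59.36, -59.28, -83.66, -59.82, -59.91]


def torsion_length(angles_deg):
    """Euclidean norm of a dihedral-angle vector, in degrees (hypothetical helper)."""
    return math.sqrt(sum(a * a for a in angles_deg))


length = torsion_length(dihedrals)
print(f"Torsion length: {length:.2f} deg")
```

With these inputs the result rounds to the 145.62 degrees that the package reports for this disulfide.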

In [4]:
ssmin, ssmax = PDB_SS.SSList.minmax_energy
ssmin_energy = ssmin.energy

best_ss = PDB_SS["2q7q_75D_140D"]
best_dihedrals = best_ss.dihedrals
best_ss.pprint()
best_ss.display(style="sb", light=LIGHT)

<Disulfide 2q7q_75D_140D, Source: 2q7q, Resolution: 1.6 Å 
Χ1-Χ5: -59.36°, -59.28°, -83.66°, -59.82° -59.91°, -25.17°, 0.49 kcal/mol 
Cα Distance: 5.50 Å 
Torsion length: 145.62 deg>


[Interactive 3D view of disulfide 2q7q_75D_140D (pyvista widget)]

And that, gentle reader, is it. *The most beautiful disulfide bond in the world*! Look at it. This is the lowest-energy structure in the entire database. Its sidechain dihedral angles are Χ1-Χ5: -59.36°, -59.28°, -83.66°, -59.82°, -59.91°, and its estimated energy is 0.49 kcal/mol. 

How does this compare to an analytical (modeled) minimum? We can use the `minimize` function from `scipy.optimize` to check. We know from chemistry that a reasonable guess for a low-energy conformation has the dihedral angles Χ1-Χ5: -60.00°, -60.00°, -90.00°, -60.00°, -60.00°, which gives an energy of 0.60 kcal/mol. Let's use this as the starting point for scipy and compute a minimum energy conformation:


In [5]:
from scipy.optimize import minimize
from proteusPy.Disulfide import Disulfide_Energy_Function

initial_guess = [
    -60.0,
    -60.0,
    -90.0,
    -60.0,
    -60.0,
]  # initial guess for chi1, chi2, chi3, chi4, chi5
result = minimize(Disulfide_Energy_Function, initial_guess, method="Nelder-Mead")
minimum_energy = result.fun
minimum_conformation = result.x
print(
    f'Modeled minimum energy: {minimum_energy:.3f} kcal/mol for conformation: {[f"{x:.3f}" for x in minimum_conformation]}'
)

Modeled minimum energy: 0.489 kcal/mol for conformation: ['-60.000', '-60.000', '-83.048', '-60.000', '-60.000']
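For readers curious about what is being minimized, the sketch below writes out a torsional energy function by hand. The functional form and coefficients are an assumption on my part rather than a copy of `Disulfide_Energy_Function`; they were chosen because they reproduce the 0.60 kcal/mol starting energy and the 0.489 kcal/mol minimum quoted in this notebook. The name `torsional_energy` is hypothetical.

```python
import math


def torsional_energy(chis_deg):
    """Approximate disulfide torsional energy in kcal/mol.

    Takes chi1..chi5 in degrees. The form below is an assumed model,
    not taken verbatim from the proteusPy source.
    """
    x1, x2, x3, x4, x5 = (math.radians(c) for c in chis_deg)
    energy = 2.0 * (math.cos(3.0 * x1) + math.cos(3.0 * x5))
    energy += math.cos(3.0 * x2) + math.cos(3.0 * x4)
    energy += 3.5 * math.cos(2.0 * x3) + 0.6 * math.cos(3.0 * x3) + 10.1
    return energy


guess = [-60.0, -60.0, -90.0, -60.0, -60.0]
modeled = [-60.0, -60.0, -83.048, -60.0, -60.0]

print(f"Energy at initial guess:   {torsional_energy(guess):.3f} kcal/mol")
print(f"Energy at modeled minimum: {torsional_energy(modeled):.3f} kcal/mol")

# Coarse scan of chi3 (0.01 degree steps) with the other angles held at -60,
# to see where the chi3 term bottoms out between -120 and -60 degrees.
best_chi3 = min(
    (c / 100.0 for c in range(-12000, -6000)),
    key=lambda c: torsional_energy([-60.0, -60.0, c, -60.0, -60.0]),
)
print(f"Best chi3 on the scan: {best_chi3:.2f} deg")
```

Under this assumed form, the chi1, chi2, chi4 and chi5 terms are already at their minima at -60°, so the only angle the optimizer needs to move is chi3, which shifts from -90° to roughly -83°.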


So the modeled minimum energy is *0.489* kcal/mol. Its difference from the energy of the actual conformation is:

In [6]:
diff = minimum_energy - ssmin_energy
print(f"Modeled - actual energy difference is: {diff} kcal/mol")

Modeled - actual energy difference is: -0.002797862530647066 kcal/mol


Given this very small difference (under 0.003 kcal/mol), we can safely say that the lowest-energy disulfide in the database sits essentially at the modeled energy minimum as well. What an amazing disulfide bond! Let's build a model for the predicted lowest-energy conformation and compare it to the actual one found in the database. We do that by creating an empty disulfide, assigning it the modeled dihedrals, and then calling the `Disulfide.build_yourself` function.

In [7]:
modelled_min = Disulfide("model")
modelled_min.dihedrals = minimum_conformation
modelled_min.build_yourself()

Now make a `DisulfideList` and put the modeled structure and the real structure into it.

In [8]:
minmax = DisulfideList([modelled_min, ssmin], "minmax")

Finally, display them in a common reference frame:

In [9]:
minmax.display_overlay()

[Interactive 3D overlay of the modeled and actual disulfides (pyvista widget)]

The two structures overlap with an overall RMS error of 2.14 Å. Not bad, considering the modeled structure is built using idealized bond lengths and angles!
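The RMS figure above is a Cartesian comparison. As a complement, the two conformations can also be compared in dihedral space using the values printed earlier in this notebook. The sketch below is a plain-Python illustration: `angle_diff` and the distance calculation are hypothetical helpers, and whether `proteusPy` computes its own torsion distance in exactly this way is an assumption.

```python
import math

# Chi1..Chi5 in degrees, taken from the outputs shown in this notebook.
actual = [-59.36, -59.28, -83.66, -59.82, -59.91]      # 2q7q_75D_140D
modeled = [-60.000, -60.000, -83.048, -60.000, -60.000]  # scipy minimum


def angle_diff(a, b):
    """Signed difference a - b, wrapped into [-180, 180) degrees."""
    return (a - b + 180.0) % 360.0 - 180.0


# Per-angle deviations and their Euclidean length in 5-D dihedral space.
deltas = [angle_diff(a, m) for a, m in zip(actual, modeled)]
torsion_distance = math.sqrt(sum(d * d for d in deltas))

for i, d in enumerate(deltas, start=1):
    print(f"chi{i}: {d:+.3f} deg")
print(f"Torsion distance: {torsion_distance:.2f} deg")
```

Every individual angle differs by less than one degree, and the overall distance in dihedral space is a little over one degree.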

## References
* https://doi.org/10.1021/bi00368a023
