# Requisites

- Environment with dependencies in `conda_env.yaml`
- External dependencies to be downloaded and placed in `training_data/utils/external`:
    - predict_ddG.py script from PyRosetta (https://github.com/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/additional_scripts/predict_ddG.py)
    - DSSP software executable (https://github.com/PDB-REDO/dssp/releases/download/v4.4.0/mkdssp-4.4.0-linux-x64)

In [1]:
from predict import get_cif, view_pdb, predict, view_pockets, view_pockets_pathways, Site





┌──────────────────────────────────────────────────────────────────────────────┐
│                                 PyRosetta-4                                  │
│              Created in JHU by Sergey Lyskov and PyRosetta Team              │
│              (C) Copyright Rosetta Commons Member Institutions               │
│                                                                              │
│ NOTE: USE OF PyRosetta FOR COMMERCIAL PURPOSES REQUIRE PURCHASE OF A LICENSE │
│         See LICENSE.PyRosetta.md or email license@uw.edu for details         │
└──────────────────────────────────────────────────────────────────────────────┘
PyRosetta-4 2025 [Rosetta PyRosetta4.conda.ubuntu.cxx11thread.serialization.Ubuntu.python311.Release 2025.24+release.8e1e5e54f047b0833dcf760a5cd5d3ce94d63938 2025-06-06T09:20:57] retrieved from: http://www.pyrosetta.org
core.init: Checking for fconfig files in pwd and ./rosetta/flags
core.init: Rosetta version: PyRosetta4.conda.ubuntu.cxx11thread.ser

<br>

**Files are written to the `predict` folder by default; pass a different `path=` argument to the functions below to change the output location.**

<br>

# Predict

## Protein structure

In [2]:
pdb_id = "6t4k"

In [3]:
pdb = get_cif(
    pdb_id
)

view_pdb(pdb)

PDBeMolstar(bg_color='#F7F7F7', custom_data={'data': "data_6T4K\n#\n_entry.id 6T4K\n#\n_citation.abstract ?\n_…

## Predict

In [4]:
predictions = predict(
    pdb_id,
    protein_chains=["A"],
    email="youremail@yourinstitution.com"  # replace with your own email address
)

  0%|          | 0/9 [00:00<?, ?it/s]

In [5]:
predictions

| Pocket | Allosteric score |
|---|---|
| pocket2 | 0.901122 |
| pocket1 | 0.648326 |
| pocket5 | 0.02459 |
| pocket7 | 0.001177 |
| pocket6 | 0.000759 |
| pocket4 | 0.000559 |
| pocket3 | 0.000344 |
| pocket9 | 0.00019 |
| pocket8 | 0.000141 |
| pocket10 | 7.4e-05 |
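
The ranked scores above can be turned into the `{"pocketn": {"color": ""}}` dictionary that the viewing functions take. The following is a minimal sketch in plain Python: the scores are copied by hand from the output above, and the helper name `pockets_to_view`, the 0.5 cutoff, and the color list are illustrative assumptions, not part of the `predict` module or a recommended threshold.

```python
# Scores copied from the prediction output above (pocket name -> allosteric score).
scores = {
    "pocket2": 0.901122,
    "pocket1": 0.648326,
    "pocket5": 0.02459,
    "pocket7": 0.001177,
    "pocket6": 0.000759,
}


def pockets_to_view(scores, threshold=0.5, colors=("green", "blue", "orange", "purple")):
    """Build a {"pocketN": {"color": ...}} dict for pockets scoring >= threshold.

    `threshold` and `colors` are hypothetical defaults chosen for illustration.
    """
    # Rank pockets from highest to lowest score.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    selected = [name for name, score in ranked if score >= threshold]
    # Cycle through the color list if more pockets than colors are selected.
    return {name: {"color": colors[i % len(colors)]} for i, name in enumerate(selected)}


pockets = pockets_to_view(scores)
print(pockets)
```

With these inputs, only `pocket2` and `pocket1` pass the cutoff, so the result can be passed as the `pockets=` argument in the same form used in this notebook.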


## View

In [6]:
view_pockets(
    pdb_id,
    pockets={"pocket2": {"color": "green"}, "pocket1": {"color": "blue"}}, # {"pocketn": {"color": ""}}
)

PDBeMolstar(bg_color='#F7F7F7', color_data={'data': [{'struct_asym_id': 'A', 'representation': 'cartoon', 'rep…

### Optional: view allosteric network pathways

**View the allosteric pathways originating from the predicted pocket towards spatially distant residues in the structure.**

The allosteric network can be built with either [correlationplus](https://github.com/tekpinar/correlationplus) (dynamical cross-correlation from elastic network models) or ProDy's [Perturbation Response Scanning](http://www.bahargroup.org/prody/prs/). The shortest paths between residues of the source pocket and spatially distant residues (those farther away than a given distance threshold) are calculated with Dijkstra's algorithm as implemented in [NetworkX](https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.shortest_paths.weighted.multi_source_dijkstra.html).
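
To make the path search concrete, here is a small standard-library sketch of multi-source Dijkstra on a toy residue graph. It is not the code used by `view_pockets_pathways`; it only illustrates what a call such as `networkx.multi_source_dijkstra(G, sources)` computes. The node names (`P1`, `P2` for pocket residues, `T` for a distant target) and the edge weights are invented, and how the tool derives edge weights from the correlation or PRS matrix is not shown here.

```python
import heapq


def build_graph(edges):
    """Turn (u, v, weight) tuples into an undirected adjacency dict."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def multi_source_dijkstra(graph, sources):
    """Shortest distance and path from the nearest source to each reachable node.

    Weights must be non-negative.
    """
    dist = {s: 0.0 for s in sources}
    paths = {s: [s] for s in sources}
    heap = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry, a shorter route was already found
        for neighbor, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                paths[neighbor] = paths[node] + [neighbor]
                heapq.heappush(heap, (candidate, neighbor))
    return dist, paths


# Toy network: P1 and P2 stand for source-pocket residues, T for a distant residue.
edges = [
    ("P1", "R1", 1.0),
    ("P2", "R1", 0.5),
    ("R1", "R2", 1.0),
    ("P1", "R3", 2.0),
    ("R2", "R3", 0.2),
    ("R2", "T", 1.5),
    ("R3", "T", 3.0),
]
graph = build_graph(edges)
dist, paths = multi_source_dijkstra(graph, sources={"P1", "P2"})
print(paths["T"], dist["T"])
```

Starting from all pocket residues at once means each target is reached from whichever source residue is closest in the network, which is why different printed pathways can begin at different pocket residues.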

In [7]:
view_pockets_pathways(
    pdb_id, 
    pathways="correlationplus", # or 'prs'
    source_pocket="pocket2", # Source pocket to calculate paths to/from
    pathway_dist_threshold=20, # Minimum distance between residues to calculate paths from
    n_top_pathways=10, # Number of pathways to view in the structure
    pockets={"pocket2": {"color": "green"}}, # {"pocketn": {"color": ""}} # Same dictionary of pockets as above to visualize pockets alongside paths
)

2025-07-09 14:13:21,385 - .prody - DEBUG - 1974 atoms and 1 coordinate set(s) were parsed in 0.04s.
2025-07-09 14:13:21,486 - .prody - DEBUG - Hessian was built in 0.09s.
2025-07-09 14:13:24,955 - .prody - DEBUG - 100 modes were calculated in 3.47s.


Pathway #1 (blue): A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:TYR:369, A:ASN:370, A:ALA:371
Pathway #2 (cyan): A:MET:358, A:VAL:361, A:VAL:363, A:LEU:407, A:LEU:410
Pathway #3 (green): A:MET:358, A:VAL:361, A:ARG:364, A:CYS:366, A:SER:408, A:HIS:411
Pathway #4 (yellow): A:MET:358, A:VAL:360, A:SER:413, A:ILE:417, A:THR:421
Pathway #5 (red): A:LYS:354, A:ALA:449
Pathway #6 (purple): A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:TYR:369, A:ASN:370, A:ASN:373
Pathway #7 (grey): A:MET:358, A:VAL:360, A:LEU:419, A:THR:421, A:PHE:450, A:HIS:451
Pathway #8 (light blue): A:LEU:293, A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:ALA:368
Pathway #9 (light cyan): A:LYS:354, A:LEU:448, A:ALA:449
Pathway #10 (mint green): A:ARG:296, A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:TYR:369, A:ASN:370


PDBeMolstar(bg_color='#F7F7F7', color_data={'data': [{'struct_asym_id': 'A', 'representation': 'cartoon', 'rep…

### Optional: view a target site

Create a `Site` object by passing either a modulator molecule (the site is then defined around it) or a list of residues. Then visualize it in the structure to assess its overlap with the predicted pockets (site residues are colored green in the protein cartoon).

<br>

#### With a modulator molecule

In [8]:
# Desired modulator is label_asym_id 'C'
pdb.residues.query("label_asym_id == 'C'")

Unnamed: 0,label_comp_id,label_asym_id,label_entity_id,label_seq_id,pdbx_PDB_ins_code,auth_seq_id,auth_comp_id,auth_asym_id,pdbx_PDB_model_num,pdbx_label_index,pdbx_sifts_xref_db_name,pdbx_sifts_xref_db_acc,pdbx_sifts_xref_db_num,pdbx_sifts_xref_db_res
2061,4F1,C,3,.,?,602,4F1,A,1,602,?,?,?,?


In [9]:
site = Site(
    pdb, 
    modulator_residues=pdb.residues.query("label_asym_id == 'C'")
)
site

<predict.Site at 0x769e171d3a50>

In [10]:
view_pockets_pathways(
    pdb_id, 
    pathways="correlationplus",
    source_pocket="pocket2", 
    pockets={"pocket2": {"color": "green"}}, # {"pocketn": {"color": ""}} 
    site_residues=site.residues,
    modulator_residues=site.modulator_residues,
)

2025-07-09 14:07:48,779 - .prody - DEBUG - 1974 atoms and 1 coordinate set(s) were parsed in 0.04s.


Pathway #1 (blue): A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:TYR:369, A:ASN:370, A:ALA:371
Pathway #2 (cyan): A:MET:358, A:VAL:361, A:VAL:363, A:LEU:407, A:LEU:410
Pathway #3 (green): A:MET:358, A:VAL:361, A:ARG:364, A:CYS:366, A:SER:408, A:HIS:411
Pathway #4 (yellow): A:MET:358, A:VAL:360, A:SER:413, A:ILE:417, A:THR:421
Pathway #5 (red): A:LYS:354, A:ALA:449
Pathway #6 (purple): A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:TYR:369, A:ASN:370, A:ASN:373
Pathway #7 (grey): A:MET:358, A:VAL:360, A:LEU:419, A:THR:421, A:PHE:450, A:HIS:451
Pathway #8 (light blue): A:LEU:293, A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:ALA:368
Pathway #9 (light cyan): A:LYS:354, A:LEU:448, A:ALA:449
Pathway #10 (mint green): A:ARG:296, A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:TYR:369, A:ASN:370


PDBeMolstar(bg_color='#F7F7F7', color_data={'data': [{'struct_asym_id': 'A', 'representation': 'cartoon', 'rep…

#### With a list of residues

In [11]:
# List of residue numbers of site
resnums = site.residues.label_seq_id.to_list()
resnums

['73',
 '77',
 '78',
 '80',
 '81',
 '84',
 '85',
 '88',
 '109',
 '110',
 '111',
 '112',
 '113',
 '114',
 '232',
 '235',
 '236',
 '238',
 '239',
 '240',
 '241',
 '243',
 '248',
 '250',
 '251',
 '252',
 '253',
 '254',
 '255',
 '257',
 '258',
 '261',
 '262']

In [12]:
# Site can be defined with a list of residues instead of a modulator
res_site = Site(
    pdb=pdb,
    residues=[{"label_asym_id": "A", "label_seq_id": seqnum} for seqnum in resnums]
)
res_site.residues

Unnamed: 0,label_comp_id,label_asym_id,label_entity_id,label_seq_id,pdbx_PDB_ins_code,auth_seq_id,auth_comp_id,auth_asym_id,pdbx_PDB_model_num,pdbx_label_index,pdbx_sifts_xref_db_name,pdbx_sifts_xref_db_acc,pdbx_sifts_xref_db_num,pdbx_sifts_xref_db_res
0,TRP,A,1,73,?,317,TRP,A,1,73,UNP,P51449,317,W
1,ALA,A,1,77,?,321,ALA,A,1,77,UNP,P51449,321,A
2,HIS,A,1,78,?,322,HIS,A,1,78,UNP,P51449,322,H
3,LEU,A,1,80,?,324,LEU,A,1,80,UNP,P51449,324,L
4,THR,A,1,81,?,325,THR,A,1,81,UNP,P51449,325,T
5,ILE,A,1,84,?,328,ILE,A,1,84,UNP,P51449,328,I
6,GLN,A,1,85,?,329,GLN,A,1,85,UNP,P51449,329,Q
7,VAL,A,1,88,?,332,VAL,A,1,88,UNP,P51449,332,V
8,LEU,A,1,109,?,353,LEU,A,1,109,UNP,P51449,353,L
9,LYS,A,1,110,?,354,LYS,A,1,110,UNP,P51449,354,K
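
Besides the visual check, the overlap between a site and the printed pathways can be checked programmatically. The sketch below parses pathway lines in the format printed by `view_pockets_pathways` and reports which pathway residues belong to the site. It assumes that the residue numbers in the pathway output correspond to `auth_seq_id` (consistent with `LYS` 354 in the table above) and uses only the ten site residues shown; the function name `parse_pathway` is hypothetical.

```python
# auth_seq_id values of the ten site residues listed in the table above.
site_auth_ids = {317, 321, 322, 324, 325, 328, 329, 332, 353, 354}

# A few pathway lines copied from the output of view_pockets_pathways.
pathway_lines = [
    "Pathway #1 (blue): A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:TYR:369, A:ASN:370, A:ALA:371",
    "Pathway #5 (red): A:LYS:354, A:ALA:449",
    "Pathway #9 (light cyan): A:LYS:354, A:LEU:448, A:ALA:449",
]


def parse_pathway(line):
    """Split 'Pathway #n (color): A:RES:num, ...' into a label and residue tuples."""
    label, residues = line.split(": ", 1)
    parsed = []
    for token in residues.split(", "):
        chain, name, number = token.split(":")
        parsed.append((chain, name, int(number)))
    return label, parsed


# For each pathway, keep the chain A residues that are also part of the site.
overlap = {}
for line in pathway_lines:
    label, residues = parse_pathway(line)
    overlap[label] = [res for res in residues if res[0] == "A" and res[2] in site_auth_ids]

for label, shared in overlap.items():
    print(label, shared)
```

A pathway with a non-empty list passes through at least one residue of the site.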


In [13]:
view_pockets_pathways(
    pdb_id, 
    pathways="correlationplus",
    source_pocket="pocket2", 
    pockets={"pocket2": {"color": "green"}}, # {"pocketn": {"color": ""}} 
    site_residues=res_site.residues,  # site defined from the residue list above
)

2025-07-09 14:07:49,256 - .prody - DEBUG - 1974 atoms and 1 coordinate set(s) were parsed in 0.03s.


Pathway #1 (blue): A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:TYR:369, A:ASN:370, A:ALA:371
Pathway #2 (cyan): A:MET:358, A:VAL:361, A:VAL:363, A:LEU:407, A:LEU:410
Pathway #3 (green): A:MET:358, A:VAL:361, A:ARG:364, A:CYS:366, A:SER:408, A:HIS:411
Pathway #4 (yellow): A:MET:358, A:VAL:360, A:SER:413, A:ILE:417, A:THR:421
Pathway #5 (red): A:LYS:354, A:ALA:449
Pathway #6 (purple): A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:TYR:369, A:ASN:370, A:ASN:373
Pathway #7 (grey): A:MET:358, A:VAL:360, A:LEU:419, A:THR:421, A:PHE:450, A:HIS:451
Pathway #8 (light blue): A:LEU:293, A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:ALA:368
Pathway #9 (light cyan): A:LYS:354, A:LEU:448, A:ALA:449
Pathway #10 (mint green): A:ARG:296, A:MET:358, A:VAL:361, A:ARG:364, A:MET:365, A:TYR:369, A:ASN:370


PDBeMolstar(bg_color='#F7F7F7', color_data={'data': [{'struct_asym_id': 'A', 'representation': 'cartoon', 'rep…