# P2Rank — Protein-ligand binding site prediction
This notebook walks through using P2Rank to predict ligand binding sites on a protein structure.

**P2Rank implementation paper**:
Krivak R, Hoksza D. P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure. Journal of Cheminformatics. 2018 Aug.


# 1) Setup

## 1.0) Imports

In [2]:
import os
from pathlib import Path

import rush
from pdbtools import pdb_fetch

## 1.1) Configuration

Let's set some global variables that define our project.

In [3]:
DESCRIPTION = "p2rank-notebook"
TAGS = ["rush-py", "p2rank", "notebook"]

## 1.2) Build your client

In [4]:
# |hide
WORK_DIR = Path.home() / "qdx" / "p2rank-rush-py-demo"
if WORK_DIR.exists():
    client = rush.Provider(workspace=WORK_DIR)
    await client.nuke(remote=False)
os.makedirs(WORK_DIR, exist_ok=True)
YOUR_TOKEN = os.getenv("RUSH_TOKEN")

In [5]:
os.environ["RUSH_TOKEN"] = YOUR_TOKEN
client = rush.build_blocking_provider_with_functions(
    batch_tags=TAGS, workspace=WORK_DIR
)
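Assigning `None` to `os.environ` raises a `TypeError`, so the cell above fails with an unhelpful message when `RUSH_TOKEN` is not set. A minimal sketch of a guard that fails early instead (the helper name `require_env` is hypothetical, not part of `rush`):

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        # Covers both a missing variable and an empty string.
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

With this in place, `YOUR_TOKEN = require_env("RUSH_TOKEN")` would stop the notebook before the client is built if the token is missing.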

# 2) Preparation

## 2.0) Fetch example PDB
Note that P2Rank requires a PDB file of a protein to find potential pockets.

In [6]:
PDB_ID = "1GIH"
FILE_NAME = f"{PDB_ID}.pdb"
FILE_PATH = WORK_DIR / FILE_NAME

In [7]:
# Avoid the name `complex`, which shadows the Python built-in type
structure_lines = list(pdb_fetch.fetch_structure(PDB_ID))

with open(FILE_PATH, "w") as f:
    for line in structure_lines:
        f.write(line)
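Before submitting the file, it can help to confirm that the download actually contains atom records. The sketch below counts PDB record types using only the standard library; the helper name `summarise_pdb` is hypothetical, and it assumes the standard PDB layout in which the record name occupies the first six columns.

```python
from collections import Counter
from pathlib import Path


def summarise_pdb(path: Path) -> Counter:
    """Count PDB record types (ATOM, HETATM, ...) found in the file at `path`."""
    counts = Counter()
    with open(path) as handle:
        for line in handle:
            # The record name sits in columns 1-6 of each PDB line.
            record = line[:6].strip()
            if record:
                counts[record] += 1
    return counts
```

For example, `summarise_pdb(FILE_PATH)["ATOM"]` should be greater than zero for a usable protein structure.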

# 3) P2Rank

## 3.1) Arguments

In [8]:
help(client.p2rank_pdb)

Help on function p2rank_pdb in module rush.provider:

p2rank_pdb(*args: *tuple[RushObject[bytes], bool], target: 'Target | None' = None, resources: 'Resources | None' = None, tags: 'list[str] | None' = None, restore: 'bool | None' = None) -> tuple[Record, RushObject[bytes]]
    Run P2Rank on an input PDB file to obtain ligand-binding site predictions

    Please see:
    Krivak R, Hoksza D. P2Rank: machine learning based tool for rapid and accurate prediction of ligand binding sites from protein structure.
    Journal of Cheminformatics. 2018 Aug.


    Module version:
    `github:talo/tengu-p2rank/285c5a6ac161266504e71fcad0ec4155b955408c#p2rank_pdb_tengu`

    QDX Type Description:

        input_pdb_file: Object {size: u32, format: ObjectFormat[json | bin]?, path: @$Bytes};
        use_alphafold_config: bool
        ->
        out: P2RankResults {
            pockets: [Pocket {
                surf_atom_ids: [i32],
                name: string,
                center_x: f64,
        

The help output above shows that `p2rank_pdb` takes two positional arguments: the path to our PDB file and the `use_alphafold_config` flag.
We set `use_alphafold_config` to `False` because we are using a crystal structure from RCSB rather than an AlphaFold model.

## 3.2) Run P2Rank
Finally, we run P2Rank to predict possible binding sites for our protein.

In [10]:

p2rank_results, visualisation = client.p2rank_pdb(
    FILE_PATH, False
)

## 3.3) Get P2Rank results
Here, we fetch the P2Rank results. As the help documentation above shows, they contain a list of potential binding sites (pockets).

In [11]:
p2rank_results.get()

2024-06-14 03:44:09,071 - rush - INFO - Argument 5dd19ede-30d6-4694-a665-912387d2f38f is now ModuleInstanceStatus.DISPATCHED
2024-06-14 03:44:11,307 - rush - INFO - Argument 5dd19ede-30d6-4694-a665-912387d2f38f is now ModuleInstanceStatus.RUNNING
2024-06-14 03:44:21,149 - rush - INFO - Argument 5dd19ede-30d6-4694-a665-912387d2f38f is now ModuleInstanceStatus.AWAITING_UPLOAD


{'pockets': [{'name': 'pocket1',
   'rank': 1,
   'score': 12.04,
   'center_x': 5.24,
   'center_y': 9.9235,
   'center_z': 27.7115,
   'sas_points': 70,
   'surf_atoms': 48,
   'probability': 0.636,
   'residue_ids': ['A_10',
    'A_11',
    'A_12',
    'A_131',
    'A_132',
    'A_134',
    'A_144',
    'A_145',
    'A_146',
    'A_18',
    'A_31',
    'A_33',
    'A_64',
    'A_80',
    'A_81',
    'A_83',
    'A_84',
    'A_85',
    'A_86'],
   'surf_atom_ids': [82,
    83,
    84,
    85,
    86,
    87,
    88,
    89,
    95,
    140,
    141,
    242,
    257,
    258,
    259,
    442,
    575,
    576,
    577,
    578,
    579,
    580,
    581,
    585,
    605,
    612,
    613,
    620,
    621,
    622,
    629,
    633,
    634,
    636,
    992,
    993,
    994,
    1000,
    1003,
    1021,
    1022,
    1093,
    1094,
    1095,
    1098,
    1099,
    1100,
    1102]},
  {'name': 'pocket2',
   'rank': 2,
   'score': 3.36,
   'center_x': 4.8154,
   'center_y': 39.7
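The returned dictionary can be post-processed with plain Python. The sketch below filters pockets by predicted probability and sorts residue IDs numerically (the raw output lists them lexicographically, so `A_131` comes before `A_18`). The helper names are hypothetical, the 0.5 threshold is an arbitrary choice, and `parse_residue_id` assumes IDs of the form `<chain>_<integer>` as in the output above; residues with insertion codes would need extra handling. The sample data is illustrative only: the first entry is abridged from `pocket1`, and the second is a made-up placeholder.

```python
def parse_residue_id(residue_id: str) -> tuple[str, int]:
    """Split an ID such as 'A_131' into (chain, residue number)."""
    chain, _, number = residue_id.partition("_")
    return chain, int(number)


def confident_pockets(results: dict, min_probability: float = 0.5) -> list[dict]:
    """Return pockets at or above `min_probability`, ordered by rank."""
    pockets = [p for p in results["pockets"] if p["probability"] >= min_probability]
    return sorted(pockets, key=lambda p: p["rank"])


# Illustrative data only (not real output beyond the abridged pocket1 fields).
example_results = {
    "pockets": [
        {"name": "pocket1", "rank": 1, "score": 12.04, "probability": 0.636,
         "residue_ids": ["A_10", "A_131", "A_18"]},
        {"name": "hypothetical_pocket", "rank": 2, "score": 1.0, "probability": 0.1,
         "residue_ids": ["A_5"]},
    ]
}

for pocket in confident_pockets(example_results):
    residues = sorted(parse_residue_id(r) for r in pocket["residue_ids"])
    print(pocket["name"], pocket["probability"], residues)
```

In the notebook, the same helpers could be applied to the dictionary returned by `p2rank_results.get()`.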

## 3.4) Get PyMOL visualisation files
The Rush P2Rank module also produces a compressed archive of PyMOL visualisation files, in case you want to dive deeper into the pockets.


In [12]:
visualisation.download("visualisation.tar.gz", overwrite=True)

PosixPath('/home/dylan/qdx/p2rank-rush-py-demo/objects/visualisation.tar.gz')
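To inspect the visualisation files, the downloaded archive needs to be unpacked. Here is a minimal sketch using the standard-library `tarfile` module; it assumes the file is a gzip-compressed tar archive, as the `.tar.gz` name suggests, and the helper name `extract_archive` is hypothetical. The contents of the archive are not documented here, so the function simply returns whatever member names it finds.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path: Path, dest_dir: Path) -> list[str]:
    """Extract a .tar.gz archive into `dest_dir` and return its member names."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters that block unsafe paths.
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    return names
```

For example, `extract_archive(WORK_DIR / "objects" / "visualisation.tar.gz", WORK_DIR / "visualisation")` would unpack the file at the path shown in the output above.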