# Explore the data structure of a dynophore

A __dynophore__ (defined in the `Dynophore` class) is a collection of so-called __superfeatures__ (defined in the `SuperFeature` class). A superfeature is a pharmacophore feature on the ligand side (defined by a feature type, e.g. HBA, and one or more ligand atom numbers/serials) that occurs at least once during an MD simulation. A superfeature can have one or more interaction partners on the macromolecule side. These interaction partners are called __environmental partners__ (defined in the `EnvPartner` class).

In this notebook, we will explore the `Dynophore`, `SuperFeature`, and `EnvPartner` classes.
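To make the nesting concrete before loading real data, here is a minimal sketch of the hierarchy using plain dataclasses. The `*Sketch` classes are simplified stand-ins invented for illustration; they are not the `dynophores` classes and carry only a few of the attributes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EnvPartnerSketch:
    """Hypothetical stand-in for an environmental partner (not the library class)."""

    id: str
    occurrences: List[int]  # 0/1 per frame


@dataclass
class SuperFeatureSketch:
    """Hypothetical stand-in for a superfeature (not the library class)."""

    id: str
    feature_type: str
    atom_numbers: List[int]
    envpartners: List[EnvPartnerSketch] = field(default_factory=list)


@dataclass
class DynophoreSketch:
    """Hypothetical stand-in for a dynophore (not the library class)."""

    id: str
    superfeatures: List[SuperFeatureSketch] = field(default_factory=list)


# Toy objects; the occurrences are made up for illustration
partner = EnvPartnerSketch(id="LYS_20_A[316]", occurrences=[0, 1, 1])
superfeature = SuperFeatureSketch(
    id="HBA[4618]", feature_type="HBA", atom_numbers=[4618], envpartners=[partner]
)
toy_dynophore = DynophoreSketch(id="toy_dynophore", superfeatures=[superfeature])

# Walk the hierarchy: dynophore -> superfeatures -> environmental partners
for sf in toy_dynophore.superfeatures:
    for ep in sf.envpartners:
        print(toy_dynophore.id, sf.id, ep.id)
```

The same two nested loops work on a real `Dynophore` object, since it exposes `superfeatures` and each superfeature exposes `envpartners`.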

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from pathlib import Path
import logging

from dynophores import Dynophore



In [3]:
logger = logging.getLogger("dynophores")
logger.setLevel(logging.DEBUG)

## Set path to `DynophoreApp` output data folder

In [4]:
DATA = Path("../../dynophores/tests/data/out")
#DATA = Path("/home/dominique/Desktop/1us_20kFrames_MD_M2R_QNB/M2_QNB_dynophore/")
DATA

PosixPath('../../dynophores/tests/data/out')

## Load data as `Dynophore` object

In [5]:
dynophore = Dynophore.from_dir(DATA)

## List object attributes

### `Dynophore` object attributes

In [6]:
dynophore.__dict__

{'id': 'dynophore_1KE7',
 'superfeatures': [<dynophores.core.superfeature.SuperFeature at 0x7f356cb4b9a0>,
  <dynophores.core.superfeature.SuperFeature at 0x7f356cb4bfa0>,
  <dynophores.core.superfeature.SuperFeature at 0x7f3510f83520>,
  <dynophores.core.superfeature.SuperFeature at 0x7f350fb95040>,
  <dynophores.core.superfeature.SuperFeature at 0x7f350fb95250>,
  <dynophores.core.superfeature.SuperFeature at 0x7f350fb953d0>,
  <dynophores.core.superfeature.SuperFeature at 0x7f350fb95520>,
  <dynophores.core.superfeature.SuperFeature at 0x7f350fb955e0>,
  <dynophores.core.superfeature.SuperFeature at 0x7f350fb95700>,
  <dynophores.core.superfeature.SuperFeature at 0x7f350fb958b0>]}

In [7]:
dynophore.__dict__.keys()
# NBVAL_CHECK_OUTPUT

dict_keys(['id', 'superfeatures'])

A `Dynophore` object contains:

- `id`: Dynophore identifier (name) 
- `superfeatures`: List of superfeatures (`SuperFeature` objects)

In [8]:
print(f"Number of superfeatures: {len(dynophore.superfeatures)}")
# NBVAL_CHECK_OUTPUT

Number of superfeatures: 10


Let's take a look at one example `SuperFeature` object.

### `SuperFeature` object attributes

In [9]:
dynophore.superfeatures[0].__dict__

{'id': 'HBA[4618]',
 'feature_type': 'HBA',
 'atom_numbers': [4618],
 'occurrences': array([0, 0, 0, ..., 0, 0, 0]),
 'envpartners': [<dynophores.core.envpartner.EnvPartner at 0x7f356cb4b2b0>],
 'cloud': <dynophores.core.chemicalfeaturecloud3d.ChemicalFeatureCloud3D at 0x7f356cb4b730>}

In [10]:
dynophore.superfeatures[0].__dict__.keys()
# NBVAL_CHECK_OUTPUT

dict_keys(['id', 'feature_type', 'atom_numbers', 'occurrences', 'envpartners', 'cloud'])

A `SuperFeature` object contains:

- `id`: Superfeature identifier (nomenclature: `<feature_type><list of atom numbers>`)
- `feature_type`: Feature type (e.g. HBA, HBD, H, AR, ...)
- `atom_numbers`: Number(s) of ligand atom(s) that are involved in the feature
- `occurrences`: Superfeature occurrences during an MD simulation (0/1 for absent/present)
- `envpartners`: List of environmental partners on the macromolecule side that are involved in the superfeature (either at the same time or not)
- `cloud`: Chemical feature cloud in 3D (coordinates of each occurring feature during an MD simulation that belongs to the superfeature)
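The identifier nomenclature can be illustrated with a small parser. This is a sketch only: `parse_superfeature_id` is a hypothetical helper, not part of `dynophores`, and it assumes that feature types consist of letters and that atom numbers are comma-separated integers, as in the identifiers shown in this notebook.

```python
import re


def parse_superfeature_id(superfeature_id):
    """Split an identifier such as 'HBA[4618]' into feature type and atom numbers.

    Hypothetical helper for illustration; the pattern is an assumption based
    on the identifiers shown in this notebook.
    """
    match = re.fullmatch(r"([A-Za-z]+)\[(\d+(?:,\d+)*)\]", superfeature_id)
    if match is None:
        raise ValueError(f"Unexpected superfeature identifier: {superfeature_id!r}")
    feature_type, atoms = match.groups()
    return feature_type, [int(atom) for atom in atoms.split(",")]


print(parse_superfeature_id("HBA[4618]"))
print(parse_superfeature_id("AR[4605,4607,4603,4606,4604]"))
```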

In [11]:
n_envpartners = sum(len(superfeature.envpartners) for superfeature in dynophore.superfeatures)
print(f"Number of environmental partners: {n_envpartners}")
# NBVAL_CHECK_OUTPUT

Number of environmental partners: 28


Let's take a look at one example `EnvPartner` object.

### `EnvPartner` object attributes

In [12]:
dynophore.superfeatures[0].envpartners[0].__dict__

{'id': 'LYS_20_A[316]',
 'residue_name': 'LYS',
 'residue_number': '20',
 'chain': 'A',
 'atom_numbers': [316],
 'occurrences': array([0, 0, 0, ..., 0, 0, 0]),
 'distances': array([11.0768795 , 11.0768795 , 10.6541481 , ...,  5.33626938,
         4.15633821,  6.20708275])}

In [13]:
dynophore.superfeatures[0].envpartners[0].__dict__.keys()
# NBVAL_CHECK_OUTPUT

dict_keys(['id', 'residue_name', 'residue_number', 'chain', 'atom_numbers', 'occurrences', 'distances'])

An `EnvPartner` object contains:

- `id`: environmental partner identifier (nomenclature: `<residue name>_<residue number>_<chain><list of atom numbers>`)
- `residue_name`: residue name
- `residue_number`: residue number
- `chain`: chain ID
- `atom_numbers`: number(s) of residue atom(s) that are involved in the feature
- `occurrences`: interaction occurrences during an MD simulation (0/1 for absent/present) between ligand and residue atoms
- `distances`: interaction distances between the involved atoms on the ligand and macromolecule side during an MD simulation
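Because `occurrences` and `distances` both hold one value per frame, they can be combined frame by frame. The sketch below uses made-up toy values (not taken from the test data) to select the distances of those frames in which the interaction is present.

```python
from statistics import mean

# Toy per-frame data (assumed values for illustration only)
occurrences = [0, 0, 1, 1, 0, 1]  # 0/1 for absent/present
distances = [11.1, 10.7, 3.9, 4.2, 6.2, 4.0]  # one distance per frame

# Keep only the distances of frames in which the interaction occurs
present_distances = [d for o, d in zip(occurrences, distances) if o == 1]
mean_present = mean(present_distances)

print(f"Frames with interaction: {len(present_distances)}")
print(f"Mean distance while present: {mean_present:.2f}")
```

With a real `EnvPartner` object, the same selection could be written on its `occurrences` and `distances` arrays.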

### `ChemicalFeatureCloud3D` object attributes

In [14]:
dynophore.superfeatures[0].cloud.__dict__

{'id': 'HBA[4618]',
 'center': array([-18.507637 ,  -8.405735 ,   1.5362723]),
 'points': array([[-18.598375 ,  -8.370245 ,   2.017743 ],
        [-18.416897 ,  -8.441224 ,   1.0548013]])}

In [15]:
dynophore.superfeatures[0].cloud.__dict__.keys()
# NBVAL_CHECK_OUTPUT

dict_keys(['id', 'center', 'points'])

A `ChemicalFeatureCloud3D` object contains:

- `id`: Cloud identifier that equals the superfeature identifier (`SuperFeature.id`)
- `center`: The coordinates of the geometric center of all points in the point cloud
- `points`: The coordinates of all points in the point cloud
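As a quick check of the "geometric center" description, the sketch below averages the two points shown for `HBA[4618]` above along each axis. It is written in plain Python and is not the library's own implementation.

```python
# Points of the HBA[4618] cloud, copied from the output above
points = [
    [-18.598375, -8.370245, 2.017743],
    [-18.416897, -8.441224, 1.0548013],
]

# Geometric center: mean of the x, y, and z coordinates over all points
center = [sum(axis) / len(points) for axis in zip(*points)]
print(center)
```

Up to floating-point rounding, the result agrees with the `center` attribute printed above.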

## Dynophore basics

### Dynophore identifier

In [16]:
print(f"Dynophore name: {dynophore.id}")

Dynophore name: dynophore_1KE7


### Number of frames

In [17]:
print(f"Number of MD simulation frames: {dynophore.n_frames}")

Number of MD simulation frames: 1002


### Number of superfeatures

In [18]:
print(f"Number of superfeatures: {dynophore.n_superfeatures}")

Number of superfeatures: 10


## Superfeatures monitoring (over trajectory)

### Superfeature occurrences

In [19]:
dynophore.superfeatures_occurrences

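| | AR[4605,4607,4603,4606,4604] | AR[4622,4615,4623,4613,4614,4621] | HBA[4596] | HBA[4606] | HBA[4618] | HBA[4619] | HBD[4598] | HBD[4612] | H[4599,4602,4601,4608,4609,4600] | H[4615,4623,4622,4613,4621,4614] |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 |
| 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 997 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 998 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 999 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 1000 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |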


## Environmental partners monitoring (over trajectory)

### Interaction occurrences for example superfeature

In [20]:
sorted(dynophore.envpartners_occurrences.keys())

['AR[4605,4607,4603,4606,4604]',
 'AR[4622,4615,4623,4613,4614,4621]',
 'HBA[4596]',
 'HBA[4606]',
 'HBA[4618]',
 'HBA[4619]',
 'HBD[4598]',
 'HBD[4612]',
 'H[4599,4602,4601,4608,4609,4600]',
 'H[4615,4623,4622,4613,4621,4614]']

In [21]:
dynophore.envpartners_occurrences["H[4599,4602,4601,4608,4609,4600]"]

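| | LEU_134_A[2109,2110,2111] | ALA_144_A[2263,2266] | VAL_18_A[275,276,277] | ILE_10_A[169,171] | ALA_31_A[488,491] | ILE_10_A[169,171,172] |
|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| 2 | 1 | 1 | 1 | 0 | 0 | 0 |
| 3 | 1 | 1 | 1 | 0 | 0 | 0 |
| 4 | 1 | 0 | 1 | 0 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... |
| 997 | 1 | 1 | 1 | 1 | 1 | 1 |
| 998 | 1 | 1 | 1 | 1 | 0 | 1 |
| 999 | 1 | 1 | 1 | 1 | 0 | 1 |
| 1000 | 1 | 1 | 1 | 1 | 0 | 1 |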


### Interaction distances for example superfeature

In [22]:
dynophore.envpartners_distances["H[4599,4602,4601,4608,4609,4600]"]

| | LEU_134_A[2109,2110,2111] | ALA_144_A[2263,2266] | VAL_18_A[275,276,277] | ILE_10_A[169,171] | ALA_31_A[488,491] | ILE_10_A[169,171,172] |
|---|---|---|---|---|---|---|
| 0 | 4.395278 | 4.857178 | 4.442045 | 6.213273 | 6.622738 | 6.767594 |
| 1 | 4.395278 | 4.857178 | 4.442045 | 6.213273 | 6.622738 | 6.767594 |
| 2 | 4.533679 | 5.140143 | 4.465024 | 7.277190 | 7.753983 | 7.847371 |
| 3 | 4.494329 | 5.602469 | 4.556768 | 7.636541 | 7.028589 | 8.174439 |
| 4 | 5.321434 | 6.398542 | 4.443635 | 6.667186 | 7.115447 | 7.359649 |
| ... | ... | ... | ... | ... | ... | ... |
| 997 | 4.748538 | 5.496365 | 5.268448 | 3.910376 | 6.064285 | 3.856377 |
| 998 | 4.347239 | 5.363754 | 4.816099 | 3.969030 | 6.436686 | 4.078391 |
| 999 | 4.321141 | 5.666852 | 5.082184 | 4.138692 | 6.934736 | 4.411178 |
| 1000 | 4.491572 | 5.434472 | 4.916178 | 4.150712 | 6.710442 | 4.188749 |


## Superfeatures vs. environmental partners

### Occurrence count

In [23]:
dynophore.count

| | HBA[4618] | AR[4605,4607,4603,4606,4604] | HBD[4598] | HBA[4606] | AR[4622,4615,4623,4613,4614,4621] | HBD[4612] | HBA[4619] | HBA[4596] | H[4615,4623,4622,4613,4621,4614] | H[4599,4602,4601,4608,4609,4600] |
|---|---|---|---|---|---|---|---|---|---|---|
| ALA_144_A[2263,2266] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 992 |
| ALA_31_A[488,491] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 216 |
| ASP_86_A[1313] | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 |
| ASP_86_A[1319] | 0 | 0 | 0 | 0 | 0 | 18 | 0 | 0 | 0 | 0 |
| ASP_86_A[1320] | 0 | 0 | 0 | 0 | 0 | 20 | 0 | 0 | 0 | 0 |
| GLN_131_A[2057] | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| GLN_131_A[2061] | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 |
| GLN_131_A[2062] | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 |
| GLU_81_A[1228] | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| HIS_84_A[1284,1285,1286,1287,1288] | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |

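One plausible reading of the count table is that each cell is the number of frames in which an environmental partner interacts with a superfeature. The sketch below rebuilds such a table from toy 0/1 occurrences. The identifiers and values are invented for illustration, and summing occurrences per pair is an assumption about how the counts arise, not a statement about the library's implementation.

```python
# Toy 0/1 occurrences per frame, keyed by superfeature, then by environmental partner
# (identifiers and values are made up)
toy_envpartners_occurrences = {
    "HBD[1]": {"GLU_81_A[10]": [0, 1, 1, 0], "ASP_86_A[20]": [0, 0, 1, 0]},
    "H[2,3]": {"ALA_31_A[30,31]": [1, 1, 1, 0]},
}

# Assumed rule: count = number of frames in which the pair interacts
count = {}
for superfeature_id, partners in toy_envpartners_occurrences.items():
    for envpartner_id, occurrences in partners.items():
        count.setdefault(envpartner_id, {})[superfeature_id] = sum(occurrences)

# Print one row per environmental partner; pairs that never interact are 0
superfeature_ids = list(toy_envpartners_occurrences)
for envpartner_id in sorted(count):
    row = [count[envpartner_id].get(sf_id, 0) for sf_id in superfeature_ids]
    print(envpartner_id, row)
```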

### Occurrence frequency

In [24]:
dynophore.frequency

| | HBA[4618] | AR[4605,4607,4603,4606,4604] | HBD[4598] | HBA[4606] | AR[4622,4615,4623,4613,4614,4621] | HBD[4612] | HBA[4619] | HBA[4596] | H[4615,4623,4622,4613,4621,4614] | H[4599,4602,4601,4608,4609,4600] |
|---|---|---|---|---|---|---|---|---|---|---|
| ALA_144_A[2263,2266] | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 99.0 |
| ALA_31_A[488,491] | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 21.56 |
| ASP_86_A[1313] | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2 | 0.0 | 0.0 | 0.0 |
| ASP_86_A[1319] | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.8 | 0.0 | 0.0 | 0.0 | 0.0 |
| ASP_86_A[1320] | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| GLN_131_A[2057] | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 |
| GLN_131_A[2061] | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8 | 0.0 | 0.0 | 0.0 | 0.0 |
| GLN_131_A[2062] | 0.0 | 0.0 | 0.0 | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| GLU_81_A[1228] | 0.0 | 0.0 | 0.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| HIS_84_A[1284,1285,1286,1287,1288] | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |

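The frequencies shown here are consistent with the counts expressed as a percentage of the 1002 frames, rounded to two decimals. The sketch below reproduces a few of the values under that assumption; the exact formula used inside `dynophores` is not shown in this notebook.

```python
n_frames = 1002  # number of MD simulation frames reported above

# A few counts copied from the occurrence count table
counts = {
    "ALA_144_A[2263,2266]": 992,
    "ALA_31_A[488,491]": 216,
    "ASP_86_A[1319]": 18,
    "GLN_131_A[2057]": 1,
}

# Assumed formula: percentage of frames, rounded to two decimals
frequency = {
    envpartner_id: round(n / n_frames * 100, 2) for envpartner_id, n in counts.items()
}

for envpartner_id, value in frequency.items():
    print(envpartner_id, value)
```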

## Superfeature clouds

In [30]:
dynophore.clouds

{'HBA[4618]': array([[-18.598375 ,  -8.370245 ,   2.017743 ],
        [-18.416897 ,  -8.441224 ,   1.0548013]]),
 'AR[4605,4607,4603,4606,4604]': array([[-8.794968 , -5.4824977,  5.0140915]]),
 'HBD[4598]': array([[-13.884943  ,   0.46002614,   1.966152  ],
        [-14.304212  ,   0.44538647,   1.7417221 ],
        [-14.0754    ,   0.6494397 ,   1.9893616 ],
        [-14.756247  ,   0.73604614,   1.4773699 ],
        [-13.843211  ,   0.8686193 ,   1.7220099 ],
        [-15.522755  ,   0.12329083,   0.09974381],
        [-15.237535  ,   0.9301981 ,   0.42284793],
        [-13.050346  ,  -2.8064024 ,   0.44377777],
        [-13.258515  ,  -2.9587338 ,   0.19511254],
        [-14.778919  ,  -1.2979854 ,   1.5950569 ]]),
 'HBA[4606]': array([[-7.4588404, -4.9241004,  1.5566669],
        [-6.4174824, -5.0383735,  2.984039 ],
        [-7.6771197, -5.920971 ,  2.6544497],
        [-7.5147486, -4.349894 ,  2.4435065],
        [-7.962984 , -5.0353174,  2.2140415],
        [-7.2271028, -5.09816