# Explore the data structure of a dynophore

A __dynophore__ (defined in the `Dynophore` class) is a collection of so-called __superfeatures__ (defined in the `SuperFeature` class). A superfeature is a pharmacophore feature on the ligand side (defined by a feature type, e.g. HBA, and one or more ligand atom numbers/serials) that occurs at least once during an MD simulation. A superfeature can have one or more interaction partners on the macromolecule side. These interaction partners are called __environmental partners__ (defined in the `EnvPartner` class).

In this notebook, we will explore the `Dynophore`, `SuperFeature`, and `EnvPartner` classes.
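To make the nesting of the three levels concrete, here is a minimal sketch using plain dataclasses. The names `DynophoreSketch`, `SuperFeatureSketch`, and `EnvPartnerSketch` are hypothetical stand-ins and not the classes shipped in `dynophores`; only the attribute names and example values are taken from the outputs shown in this notebook, and the per-frame arrays (`occurrences`, `distances`) are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EnvPartnerSketch:
    """Hypothetical stand-in for an environmental partner (macromolecule side)."""

    id: str
    residue_name: str
    residue_number: str
    chain: str
    atom_numbers: List[int]


@dataclass
class SuperFeatureSketch:
    """Hypothetical stand-in for a superfeature (ligand side)."""

    id: str
    feature_type: str
    atom_numbers: List[int]
    envpartners: List[EnvPartnerSketch] = field(default_factory=list)


@dataclass
class DynophoreSketch:
    """Hypothetical stand-in for a dynophore: a collection of superfeatures."""

    id: str
    superfeatures: List[SuperFeatureSketch] = field(default_factory=list)


# Build one branch of the hierarchy from values that appear in this notebook
partner = EnvPartnerSketch("LYS_20_A[316]", "LYS", "20", "A", [316])
superfeature = SuperFeatureSketch("HBA[4618]", "HBA", [4618], [partner])
sketch = DynophoreSketch("1KE7-1", [superfeature])

# Walk the hierarchy: dynophore -> superfeatures -> environmental partners
n_partners = sum(len(sf.envpartners) for sf in sketch.superfeatures)
print(n_partners)
```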

In [None]:
%load_ext autoreload
%autoreload 2

In [2]:
from pathlib import Path
import logging

from dynophores import Dynophore



In [3]:
logger = logging.getLogger("dynophores")
logger.setLevel(logging.DEBUG)

## Set path to `DynophoreApp` output data folder

In [4]:
DATA = Path("../../dynophores/tests/data/out")

## Load data as `Dynophore` object

In [5]:
dynophore = Dynophore.from_dir(DATA)

Read files from ../../dynophores/tests/data/1KE7-1/DynophoreApp/dynophore.json.


## List object attributes

### `Dynophore` object attributes

In [6]:
dynophore.__dict__

{'id': '1KE7-1',
 'superfeatures': [<dynophores.core.superfeature.SuperFeature at 0x7fce7449cd60>,
  <dynophores.core.superfeature.SuperFeature at 0x7fce164a9460>,
  <dynophores.core.superfeature.SuperFeature at 0x7fce164a9c10>,
  <dynophores.core.superfeature.SuperFeature at 0x7fce154ec730>,
  <dynophores.core.superfeature.SuperFeature at 0x7fce154ec4c0>,
  <dynophores.core.superfeature.SuperFeature at 0x7fce154ec820>,
  <dynophores.core.superfeature.SuperFeature at 0x7fce154ec970>,
  <dynophores.core.superfeature.SuperFeature at 0x7fce154eca00>,
  <dynophores.core.superfeature.SuperFeature at 0x7fce154ecb20>,
  <dynophores.core.superfeature.SuperFeature at 0x7fce154ecca0>]}

A `Dynophore` object contains:

- `id`: dynophore identifier (name) 
- `superfeatures`: list of superfeatures (`SuperFeature` objects)

In [7]:
print(f"Number of superfeatures: {len(dynophore.superfeatures)}")
# NBVAL_CHECK_OUTPUT

Number of superfeatures: 10


Let's take a look at one example `SuperFeature` object.

### `SuperFeature` object attributes

In [8]:
dynophore.superfeatures[0].__dict__

{'id': 'HBA[4618]',
 'feature_type': 'HBA',
 'atom_numbers': [4618],
 'occurrences': array([0, 0, 0, ..., 0, 0, 0]),
 'envpartners': [<dynophores.core.envpartner.EnvPartner at 0x7fce168deb80>]}

A `SuperFeature` object contains:

- `id`: superfeature identifier (nomenclature: `<feature_type><list of atom numbers>`)
- `feature_type`: feature type (e.g. HBA, HBD, H, AR, ...)
- `atom_numbers`: number(s) of ligand atom(s) that are involved in the feature
- `occurrences`: superfeature occurrences during an MD simulation (0/1 for absent/present)
- `envpartners`: list of environmental partners on the macromolecule side that are involved in the superfeature (either at the same time or not)
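Because `occurrences` holds one 0/1 entry per frame, sums and index lookups are enough to summarize it. The sketch below uses a made-up eight-frame list rather than the real array, and the variable names are illustrative only.

```python
# Made-up 0/1 occurrences for eight frames (not taken from the 1KE7-1 data)
occurrences = [0, 1, 1, 0, 1, 0, 0, 1]

# Number of frames in which the superfeature is present
n_present = sum(occurrences)

# Indices of the frames in which the superfeature is present
frames_present = [frame for frame, value in enumerate(occurrences) if value == 1]

print(n_present)
print(frames_present)
```

The same two expressions should also work on a one-dimensional NumPy array of 0/1 values, such as the one shown in the output above.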

In [9]:
n_envpartners = sum([len(superfeature.envpartners) for superfeature in dynophore.superfeatures])
print(f"Number of environmental partners: {n_envpartners}")
# NBVAL_CHECK_OUTPUT

Number of environmental partners: 28


Let's take a look at one example `EnvPartner` object.

### `EnvPartner` object attributes

In [10]:
dynophore.superfeatures[0].envpartners[0].__dict__

{'id': 'LYS_20_A[316]',
 'residue_name': 'LYS',
 'residue_number': '20',
 'chain': 'A',
 'atom_numbers': [316],
 'occurrences': array([0, 0, 0, ..., 0, 0, 0]),
 'distances': array([11.0768795 , 11.0768795 , 10.6541481 , ...,  5.33626938,
         4.15633821,  6.20708275])}

An `EnvPartner` object contains:

- `id`: environmental partner identifier (nomenclature: `<residue name>_<residue number>_<chain><list of atom numbers>`)
- `residue_name`: residue name
- `residue_number`: residue number
- `chain`: chain ID
- `atom_numbers`: number(s) of residue atom(s) that are involved in the feature
- `occurrences`: interaction occurrences during an MD simulation (0/1 for absent/present) between ligand and residue atoms
- `distances`: interaction distances between the involved atoms on the ligand and macromolecule side during an MD simulation
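Identifiers such as `LYS_20_A[316]` in the output above pack the residue name, residue number, chain, and atom numbers into one string. As a sketch of how such a string could be split back into its parts, the helper below uses a regular expression; `parse_envpartner_id` is a hypothetical function written for this example and is not part of the `dynophores` API. It assumes that residue name, residue number, and chain contain no underscores.

```python
import re

# Pattern for identifiers like "LYS_20_A[316]" (assumed layout, see lead-in)
ENVPARTNER_ID_PATTERN = re.compile(
    r"^(?P<residue_name>[^_]+)_(?P<residue_number>[^_]+)_(?P<chain>[^_\[]+)"
    r"\[(?P<atom_numbers>\d+(?:,\d+)*)\]$"
)


def parse_envpartner_id(envpartner_id):
    """Hypothetical helper: split an environmental partner ID into its parts."""
    match = ENVPARTNER_ID_PATTERN.match(envpartner_id)
    if match is None:
        raise ValueError(f"Unexpected environmental partner ID: {envpartner_id}")
    parts = match.groupdict()
    # Residue number stays a string, as in the EnvPartner output above
    parts["atom_numbers"] = [int(n) for n in parts["atom_numbers"].split(",")]
    return parts


parsed = parse_envpartner_id("LYS_20_A[316]")
print(parsed)
```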

## Dynophore basics

### Dynophore identifier

In [11]:
print(f"Dynophore name: {dynophore.id}")

Dynophore name: 1KE7-1


### Number of frames

In [12]:
print(f"Number of MD simulation frames: {dynophore.n_frames}")

Number of MD simulation frames: 1002


### Number of superfeatures

In [13]:
print(f"Number of superfeatures: {dynophore.n_superfeatures}")

Number of superfeatures: 10


## Superfeatures monitoring (over trajectory)

### Superfeature occurrences

In [14]:
dynophore.superfeatures_occurrences

frame,"AR[4605,4607,4603,4606,4604]","AR[4622,4615,4623,4613,4614,4621]",HBA[4596],HBA[4606],HBA[4618],HBA[4619],HBD[4598],HBD[4612],"H[4599,4602,4601,4608,4609,4600]","H[4615,4623,4622,4613,4621,4614]"
0,0,0,1,0,0,1,0,0,1,1
1,0,0,1,0,0,1,0,0,1,1
2,0,0,0,0,0,0,0,0,1,1
3,0,0,0,0,0,0,0,0,1,1
4,0,0,0,0,0,0,0,1,1,1
...,...,...,...,...,...,...,...,...,...,...
997,0,0,1,0,0,0,0,0,1,1
998,0,0,1,0,0,0,0,0,1,1
999,0,0,1,0,0,0,0,0,1,1
1000,0,0,1,0,0,0,0,0,1,1


## Environmental partners monitoring (over trajectory)

### Interaction occurrences for example superfeature

In [15]:
sorted(dynophore.envpartners_occurrences.keys())

['AR[4605,4607,4603,4606,4604]',
 'AR[4622,4615,4623,4613,4614,4621]',
 'HBA[4596]',
 'HBA[4606]',
 'HBA[4618]',
 'HBA[4619]',
 'HBD[4598]',
 'HBD[4612]',
 'H[4599,4602,4601,4608,4609,4600]',
 'H[4615,4623,4622,4613,4621,4614]']

In [16]:
dynophore.envpartners_occurrences["H[4599,4602,4601,4608,4609,4600]"]

frame,"ALA_144_A[2263,2266]","LEU_134_A[2109,2110,2111]","ALA_31_A[488,491]","ILE_10_A[169,171]","VAL_18_A[275,276,277]","ILE_10_A[169,171,172]"
0,1,1,0,0,1,0
1,1,1,0,0,1,0
2,1,1,0,0,1,0
3,1,1,0,0,1,0
4,0,1,0,0,1,0
...,...,...,...,...,...,...
997,1,1,1,1,1,1
998,1,1,0,1,1,1
999,1,1,0,1,1,1
1000,1,1,0,1,1,1


### Interaction distances for example superfeature

In [17]:
dynophore.envpartners_distances["H[4599,4602,4601,4608,4609,4600]"]

frame,"ALA_144_A[2263,2266]","LEU_134_A[2109,2110,2111]","ALA_31_A[488,491]","ILE_10_A[169,171]","VAL_18_A[275,276,277]","ILE_10_A[169,171,172]"
0,4.857178,4.395278,6.622738,6.213273,4.442045,6.767594
1,4.857178,4.395278,6.622738,6.213273,4.442045,6.767594
2,5.140143,4.533679,7.753983,7.277190,4.465024,7.847371
3,5.602469,4.494329,7.028589,7.636541,4.556768,8.174439
4,6.398542,5.321434,7.115447,6.667186,4.443635,7.359649
...,...,...,...,...,...,...
997,5.496365,4.748538,6.064285,3.910376,5.268448,3.856377
998,5.363754,4.347239,6.436686,3.969030,4.816099,4.078391
999,5.666852,4.321141,6.934736,4.138692,5.082184,4.411178
1000,5.434472,4.491572,6.710442,4.150712,4.916178,4.188749
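The occurrence and distance tables share the same frame index and the same columns, so they can be read side by side, for example to average a partner's distance over only those frames in which the interaction is present. The sketch below does this with plain lists for `ALA_144_A[2263,2266]`, using the first five frames displayed in the two tables above; the variable names are illustrative only.

```python
# First five frames for ALA_144_A[2263,2266], copied from the tables above
occurrences = [1, 1, 1, 1, 0]
distances = [4.857178, 4.857178, 5.140143, 5.602469, 6.398542]

# Keep only the distances of frames in which the interaction is present
present_distances = [d for occ, d in zip(occurrences, distances) if occ == 1]

# Mean distance over the frames with an interaction
mean_present_distance = sum(present_distances) / len(present_distances)
print(round(mean_present_distance, 6))
```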


## Superfeatures vs. environmental partners

### Occurrence count

In [18]:
dynophore.count

envpartner,HBA[4618],"AR[4605,4607,4603,4606,4604]",HBD[4598],HBA[4606],"AR[4622,4615,4623,4613,4614,4621]",HBD[4612],HBA[4596],HBA[4619],"H[4599,4602,4601,4608,4609,4600]","H[4615,4623,4622,4613,4621,4614]"
"ALA_144_A[2263,2266]",0,0,0,0,0,0,0,0,992,0
"ALA_31_A[488,491]",0,0,0,0,0,0,0,0,216,0
ASP_86_A[1313],0,0,0,0,0,0,0,2,0,0
ASP_86_A[1319],0,0,0,0,0,18,0,0,0,0
ASP_86_A[1320],0,0,0,0,0,20,0,0,0,0
GLN_131_A[2057],0,0,0,0,0,1,0,0,0,0
GLN_131_A[2061],0,0,0,0,0,8,0,0,0,0
GLN_131_A[2062],0,0,0,2,0,0,0,0,0,0
GLU_81_A[1228],0,0,8,0,0,0,0,0,0,0
"HIS_84_A[1284,1285,1286,1287,1288]",0,0,0,0,1,0,0,0,0,0


### Occurrence frequency

In [19]:
dynophore.frequency

envpartner,HBA[4618],"AR[4605,4607,4603,4606,4604]",HBD[4598],HBA[4606],"AR[4622,4615,4623,4613,4614,4621]",HBD[4612],HBA[4596],HBA[4619],"H[4599,4602,4601,4608,4609,4600]","H[4615,4623,4622,4613,4621,4614]"
"ALA_144_A[2263,2266]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,99.0,0.0
"ALA_31_A[488,491]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,21.56,0.0
ASP_86_A[1313],0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0
ASP_86_A[1319],0.0,0.0,0.0,0.0,0.0,1.8,0.0,0.0,0.0,0.0
ASP_86_A[1320],0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,0.0
GLN_131_A[2057],0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0
GLN_131_A[2061],0.0,0.0,0.0,0.0,0.0,0.8,0.0,0.0,0.0,0.0
GLN_131_A[2062],0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0
GLU_81_A[1228],0.0,0.0,0.8,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"HIS_84_A[1284,1285,1286,1287,1288]",0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0
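The frequency values shown above are consistent with the occurrence count divided by the number of frames (1002), expressed as a percentage and rounded to two decimals. This is an inference from the displayed numbers, not a statement about how `dynophores` computes the table internally. A minimal sketch with three counts copied from the count table:

```python
# Number of MD simulation frames, as reported by dynophore.n_frames above
n_frames = 1002

# Occurrence counts copied from the count table above
counts = {
    "ALA_144_A[2263,2266]": 992,
    "ALA_31_A[488,491]": 216,
    "ASP_86_A[1320]": 20,
}

# Assumed relation: frequency in percent = count / number of frames * 100
frequencies = {
    partner: round(count / n_frames * 100, 2) for partner, count in counts.items()
}
print(frequencies)
```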
