```
This script can be used for any purpose without limitation subject to the
conditions at http://www.ccdc.cam.ac.uk/Community/Pages/Licences/v2.aspx

This permission notice and the following statement of attribution must be
included in all copies or substantial portions of this script.

2022-06-01: Made available by the Cambridge Crystallographic Data Centre.

```

# Molecular interaction preferences and the Interaction API

[IsoStar](https://www.ccdc.cam.ac.uk/solutions/csd-core/components/isostar/) is a knowledge base of intermolecular interaction preferences derived from the CSD and PDB. These data can also be accessed _via_ the [Interaction API](https://downloads.ccdc.cam.ac.uk/documentation/API/descriptive_docs/interaction.html).

In [1]:
from pathlib import Path
import sys
sys.path.append('../..')
from ccdc_notebook_utilities import create_logger

import os
from time import time

import warnings

In [2]:
import pandas as pd

In [3]:
with warnings.catch_warnings():
    warnings.filterwarnings(action='ignore', category=DeprecationWarning)  # Ignore current 'distutils Version classes are deprecated' warning
    
    import plotly.express as px

In [4]:
from IPython.display import HTML

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'

In [5]:
import ccdc
from ccdc.interaction import InteractionLibrary
from ccdc.diagram import DiagramGenerator
from ccdc.io import EntryReader

### Initialization

In [6]:
logger = create_logger()

Set up a CCDC Diagram Generator...

In [7]:
diagram_generator = DiagramGenerator()

diagram_generator.settings.return_type = 'SVG'
diagram_generator.settings.explicit_polar_hydrogens = False
diagram_generator.settings.shrink_symbols = False

Utility to help with display in JupyterLab...

In [8]:
show_df = lambda df: df.style.set_properties(**{'text-align': 'left'})

Initialise the IsoStar central and contact group libraries...

In [9]:
central_lib = InteractionLibrary.CentralGroupLibrary()
contact_lib = InteractionLibrary.ContactGroupLibrary()

### Inspect the interaction group libraries

Available central groups...

In [10]:
len(central_lib.groups)

In [11]:
print('\n'.join(x.name for x in central_lib.groups[:10]))  # First ten names

Available contact groups...

In [12]:
len(contact_lib.groups)

In [13]:
print('\n'.join(x.name for x in contact_lib.groups[:10]))   # First ten names

We can visualise the substructure query used to define any group...

In [14]:
group = central_lib.group_by_name('aromatic-aromatic ester')

HTML(diagram_generator.image(group.substructure_query))

In [15]:
group = contact_lib.group_by_name('sulfoxide/sulfone O')

HTML(diagram_generator.image(group.substructure_query))

### Perform an Interaction Analysis

We will perform an interaction analysis for aliphatic ketones...

In [16]:
central_group = central_lib.group_by_name('aliphatic-aliphatic ketone')

HTML(diagram_generator.image(central_group.substructure_query))

The contact groups for which data is available for this central group...

In [17]:
contact_groups = [x for x in central_group.contact_groups() if x]

len(contact_groups)

We can show the data for these groups as a dataframe...

In [18]:
def make_row(contact_group):
    
    data = central_group.interaction_data(contact_group)
        
    return [contact_group.name, data.ncontacts, *data.relative_density]
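The row-building step can also be written as one function that takes the central group as an argument and skips empty entries itself. The sketch below is only illustrative: `build_rows` is a hypothetical helper, it assumes `relative_density` unpacks into a (value, standard deviation) pair as the column names in this notebook suggest, and the stand-in objects carry placeholder values rather than IsoStar data.

```python
from types import SimpleNamespace


def build_rows(central_group, contact_groups):
    """Collect [name, ncontacts, density, std_dev] rows, skipping empty entries."""
    rows = []
    for contact_group in contact_groups:
        if not contact_group:  # same filter as `if x` in the list comprehension
            continue
        data = central_group.interaction_data(contact_group)
        density, std_dev = data.relative_density  # assumed to be a 2-tuple
        rows.append([contact_group.name, data.ncontacts, density, std_dev])
    return rows


# Stand-in objects with placeholder values, used only to exercise the function.
fake_data = SimpleNamespace(ncontacts=10, relative_density=(1.5, 0.2))
fake_central = SimpleNamespace(interaction_data=lambda contact_group: fake_data)
fake_contacts = [SimpleNamespace(name='group A'), None]

rows = build_rows(fake_central, fake_contacts)
print(rows)
```

Passing the central group explicitly avoids relying on a notebook-level global, which makes the helper easier to reuse for other central groups.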

In [19]:
def depiction(name):  # Depiction of a contact-group substructure query
    
    return diagram_generator.image(contact_lib.group_by_name(name).substructure_query)

In [20]:
contacts_df = (
    pd.DataFrame(
        data=[make_row(x) for x in contact_groups],
        columns=['Contact Group', 'No. of Contacts', 'Relative Density', 'Std. Dev.']
    )
    .sort_values('Relative Density', ascending=False)
    .assign(depiction = lambda df: df['Contact Group'].apply(depiction))
)

contacts_df.shape

In [21]:
show_df(contacts_df.head(5))
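Sorting on relative density alone can push sparsely populated contact groups to the top of the table. One possible refinement, sketched here in plain Python, is to drop rows below a minimum number of contacts before ranking. `ContactRow`, `rank_contacts`, and the threshold are hypothetical names chosen for illustration, and the values are placeholders, not IsoStar results.

```python
from typing import NamedTuple


class ContactRow(NamedTuple):
    name: str
    ncontacts: int
    relative_density: float
    std_dev: float


def rank_contacts(rows, min_contacts=0):
    """Keep rows with at least `min_contacts` contacts, highest density first."""
    kept = [row for row in rows if row.ncontacts >= min_contacts]
    return sorted(kept, key=lambda row: row.relative_density, reverse=True)


# Placeholder values for illustration only; they are not IsoStar data.
example_rows = [
    ContactRow('group A', 1200, 1.8, 0.1),
    ContactRow('group B', 15, 4.2, 1.5),
    ContactRow('group C', 640, 2.6, 0.2),
]

ranked = rank_contacts(example_rows, min_contacts=100)
print([row.name for row in ranked])
```

With a dataframe, the same idea would amount to a boolean filter on the 'No. of Contacts' column ahead of `sort_values`.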

### Inspect which central and contact groups are present in a molecule

Retrieve an example molecule from the CSD (or load one from file)...

In [22]:
refcode = 'AABHTZ'

with EntryReader('CSD') as reader:
    
    molecule = reader.molecule(refcode)

In [23]:
HTML(diagram_generator.image(molecule))

Local utility to display a molecule with a substructure highlighted...

In [24]:
depict_group = lambda mol, group: diagram_generator.image(mol, highlight_atoms=group.match_atoms())

#### Central Groups

Identify the central groups in the molecule of interest...

In [25]:
central_group_hits = central_lib.search_molecule(molecule)

len(central_group_hits)

In [26]:
central_groups_df = pd.DataFrame([(group.name, depict_group(molecule, group)) for group in central_group_hits], columns=['Group', 'Depiction'])

show_df(central_groups_df.head(3))

#### Contact Groups

Identify the contact groups in the molecule of interest...

In [27]:
contact_group_hits = contact_lib.search_molecule(molecule)

len(contact_group_hits)

In [28]:
contact_groups_df = pd.DataFrame([(group.name, depict_group(molecule, group)) for group in contact_group_hits], columns=['Group', 'Depiction'])

show_df(contact_groups_df.head(3))
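If the same group matches at several sites in the molecule, its name may appear in more than one hit (an assumption about how hits are reported). A quick tally of names gives a compact summary. The sketch below uses only the standard library; `count_group_names` is a hypothetical helper and the names are placeholders, not real search results.

```python
from collections import Counter


def count_group_names(names):
    """Return (name, count) pairs, most frequent first."""
    return Counter(names).most_common()


# Placeholder names; in the notebook these would come from
# something like [group.name for group in contact_group_hits].
hit_names = ['group X', 'group Y', 'group X']

summary = count_group_names(hit_names)
print(summary)
```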