# Testing possible atom mappings in perses
### Module to test how a hybrid system will be generated for a system

    * This does _not_ check whether hydrogens are mapped. That is handled by the `SmallMoleculeSetProposalEngine` class, because a system must be generated before the hydrogen restraints can be checked
    * AtomMapper also does not check that the minimal number of core atoms are retained by a map. If

In [None]:
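from openff.toolkit.topology import Molecule
from perses.rjmc.atom_mapping import AtomMapper
# render_atom_mapping is used in the cells below; this import location is an
# assumption and may need adjusting for your perses version
from perses.utils.smallmolecules import render_atom_mapping
from openeye import oechem
import itertools
from IPython.display import display, Image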

In [None]:
# download the ligand files of interest from the openmmforcefields repo
import os
os.system('wget https://raw.githubusercontent.com/openmm/openmmforcefields/master/openmmforcefields/data/perses_jacs_systems/bace/Bace_ligands_shifted.sdf')
os.system('wget https://raw.githubusercontent.com/openmm/openmmforcefields/master/openmmforcefields/data/perses_jacs_systems/jnk1/Jnk1_ligands.sdf')

#### Starting by just using the three default options in perses

In [None]:
molecules_to_create = {
    'benzene':'c1ccccc1',
    'toluene':'c1ccccc1C',
    'nitrobenzene':'C1=CC=C(C=C1)[N+](=O)[O-]',
    'cyclohexane':'C1CCCCC1',
    }

molecules = list()
for title, smiles in molecules_to_create.items():
    molecule = Molecule.from_smiles(smiles)
    molecule.name = title  # an openff Molecule stores its label in `name`
    molecules.append(molecule)

# loop over the three default mapping strengths
for map_strength in ['default', 'weak', 'strong']:
    print(f'Doing {map_strength} mapping')
    atom_mapper = AtomMapper(map_strength=map_strength)
    for molA, molB in itertools.combinations(molecules, 2):
        nameA, nameB = molA.name, molB.name
        atom_mapping = atom_mapper.get_best_mapping(molA, molB)

        if atom_mapping is None:
            print(f'Cannot map {nameA} to {nameB} with {map_strength} map strength')
        elif len(atom_mapping.old_to_new_atom_map) < 3:
            print(f'Mapping of {nameA} to {nameB} with {map_strength} map strength has only {len(atom_mapping.old_to_new_atom_map)} mapped atoms')
        else:
            print(f'{nameA} ---> {nameB} \u2713')
            filename = f'{nameA}to{nameB}_{map_strength}.png'
            # assumes the returned AtomMapping object provides render_image();
            # substitute your own rendering helper if it does not
            atom_mapping.render_image(filename)
            display(Image(filename=filename))
    print()
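
The loop above reports mappings with fewer than three mapped atoms instead of rendering them. Since `AtomMapper` does not enforce a minimum core size itself, a check like this has to live in user code. Below is a minimal sketch of such a check on a plain dictionary; `has_enough_core_atoms` and `min_core_atoms` are hypothetical names used for illustration and are not part of perses.

```python
def has_enough_core_atoms(old_to_new_atom_map, min_core_atoms=3):
    """Return True if a mapping exists and retains at least `min_core_atoms` atoms.

    `old_to_new_atom_map` is a plain dict of atom indices (hypothetical input,
    standing in for the dictionary held by a real mapping object).
    """
    if old_to_new_atom_map is None:
        return False
    return len(old_to_new_atom_map) >= min_core_atoms


# Toy mappings (old atom index -> new atom index), invented for illustration
ring_mapping = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
tiny_mapping = {0: 0, 1: 1}

for label, candidate in [('ring', ring_mapping), ('tiny', tiny_mapping), ('missing', None)]:
    print(label, has_enough_core_atoms(candidate))
```

Keeping the threshold as a parameter makes it easy to tighten the requirement for larger scaffolds without touching the mapping loop.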

#### Now let's try a few different mapping schemes using OpenEye's `OEExprOpts`

https://docs.eyesopen.com/toolkits/python/oechemtk/OEChemConstants/OEExprOpts.html

In [None]:
# this just checks atoms are in rings of the same size, with no check for aromaticity
atom_expr = oechem.OEExprOpts_IntType
bond_expr = oechem.OEExprOpts_RingMember 

molecules = Molecule.from_file('Bace_ligands_shifted.sdf')
list_of_mols = [molecule.to_openeye() for molecule in molecules]

# key each OEMol by the ligand name read from the SDF file
names_and_oemols = {molecule.name: oemol for molecule, oemol in zip(molecules, list_of_mols)}

for nameA,nameB in itertools.combinations(names_and_oemols,2):
    molA = names_and_oemols[nameA]
    molB = names_and_oemols[nameB]    
    mapping = AtomMapper._get_mol_atom_map(molA,molB,atom_expr=atom_expr,bond_expr=bond_expr)
    
    print(f'{nameA} ---> {nameB} '+u'\u2713')
    render_atom_mapping(f'{nameA}to{nameB}.png', molA, molB, mapping)
    i = Image(filename=f'{nameA}to{nameB}.png')
    display(i)
    print()
    

Now let's look at the different `map_strategy` options, which prioritise different mappings.

In [None]:
# this just checks atoms are in rings of the same size, with no check for aromaticity
atom_expr = oechem.OEExprOpts_IntType
bond_expr = oechem.OEExprOpts_RingMember 

from openff.toolkit.topology import Molecule
molecules = Molecule.from_file('Jnk1_ligands.sdf')
list_of_mols = [ molecule.to_openeye() for molecule in molecules ]

molA = list_of_mols[0]
molB = list_of_mols[1]
nameA, nameB = molA.GetTitle(), molB.GetTitle()

for map_strategy in ['core', 'geometry']:
    mapping = AtomMapper._get_mol_atom_map(molA, molB, atom_expr=atom_expr, bond_expr=bond_expr, map_strategy=map_strategy)

    print(f'{map_strategy} \u2713')
    # include the strategy in the filename so the two images do not overwrite each other
    filename = f'{nameA}to{nameB}_{map_strategy}.png'
    render_atom_mapping(filename, molA, molB, mapping)
    display(Image(filename=filename))
    print()
    

Here, the `core` strategy tries to get as many atoms into the core as possible, and so maps the methyl and the ether groups onto each other. In reality, these groups point in opposite directions in the binding site and most likely won't freely interconvert in a tightly constrained active site.

Using `geometry` as the strategy fixes this and gets them the right way round, even though there are more unique atoms to grow in or disappear during the simulation.
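
To make the difference between the two strategies concrete, here is a minimal pure-Python sketch on toy data. It assumes a geometry score defined as the summed distance between mapped atom pairs; the scoring that perses actually uses may differ. The helper names `geometry_score` and `pick_mapping`, and all coordinates, are hypothetical.

```python
import math


def geometry_score(mapping, coords_a, coords_b):
    """Sum of distances between each mapped atom pair (lower means geometrically closer)."""
    return sum(math.dist(coords_a[i], coords_b[j]) for i, j in mapping.items())


def pick_mapping(candidates, coords_a, coords_b, strategy):
    """Choose one candidate mapping according to a simplified 'core' or 'geometry' rule."""
    if strategy == 'core':
        # prefer the mapping that keeps the most atoms in the core
        return max(candidates, key=len)
    if strategy == 'geometry':
        # prefer the mapping whose mapped atoms sit closest together
        return min(candidates, key=lambda m: geometry_score(m, coords_a, coords_b))
    raise ValueError(f'unknown strategy: {strategy}')


# Toy coordinates: atom 3 points in opposite directions in the two molecules
coords_a = {0: (0.0, 0.0, 0.0), 1: (1.0, 0.0, 0.0), 2: (2.0, 0.0, 0.0), 3: (3.0, 0.0, 0.0)}
coords_b = {0: (0.0, 0.0, 0.0), 1: (1.0, 0.0, 0.0), 2: (2.0, 0.0, 0.0), 3: (-3.0, 0.0, 0.0)}

larger = {0: 0, 1: 1, 2: 2, 3: 3}   # maps the opposing substituents onto each other
smaller = {0: 0, 1: 1, 2: 2}        # leaves the opposing substituents unmapped
candidates = [larger, smaller]

core_choice = pick_mapping(candidates, coords_a, coords_b, 'core')
geometry_choice = pick_mapping(candidates, coords_a, coords_b, 'geometry')
print(len(core_choice), len(geometry_choice))
```

In this toy case the `core` rule keeps all four atoms even though one pair is far apart, while the `geometry` rule drops that pair. Note that a plain sum of distances favours smaller mappings by construction, so a real scoring function would likely need to balance the two criteria.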

Finally, let's look at all the ways that we can map, ordered from best (i.e. geometrically closest) to worst.

In [None]:
# this just checks atoms are in rings of the same size, with no check for aromaticity
atom_expr = oechem.OEExprOpts_IntType
bond_expr = oechem.OEExprOpts_RingMember 

from openff.toolkit.topology import Molecule
molecules = Molecule.from_file('Jnk1_ligands.sdf')
list_of_mols = [ molecule.to_openeye() for molecule in molecules ]

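molA = list_of_mols[0]
molB = list_of_mols[1]
nameA, nameB = molA.GetTitle(), molB.GetTitle()

mappings = AtomMapper._get_mol_atom_map(molA, molB, atom_expr=atom_expr, bond_expr=bond_expr, map_strategy='return-all')

# lowest geometry score (closest to the input coordinates) comes first
for index, (score, mapping) in enumerate(sorted(mappings.items())):
    print(f'geometry score {score:.2f}')
    # number the files so each mapping keeps its own image
    filename = f'{nameA}to{nameB}_{index}.png'
    render_atom_mapping(filename, molA, molB, mapping)
    display(Image(filename=filename))
    print()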

Here we see the geometrically best mapping first, which has fewer core atoms, followed by the one with the benzene ring flipped. The flipped mapping places more atoms in the core, but it is further from the input coordinates.

After these two, the mapper has found symmetry in the scaffold and has mapped the two aromatic rings the 'wrong way around'. These mappings are geometrically very far from the right binding mode and have the most atoms outside the core, but in cases where the binding mode of B is very unclear, they might be an interesting experiment to run.
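
When all mappings are available, one possible way to trade core size against geometry is to take the largest mapping whose score stays close to the best one. The sketch below assumes a dictionary of score to mapping, like the one iterated over above, but uses invented toy values. The helper `largest_mapping_within_tolerance` and its `tolerance` parameter are hypothetical and not part of perses.

```python
def largest_mapping_within_tolerance(scored_mappings, tolerance):
    """Return (score, mapping) for the largest mapping whose score is within
    `tolerance` of the best (lowest) score; ties go to the lower score."""
    best_score = min(scored_mappings)
    eligible = {
        score: mapping
        for score, mapping in scored_mappings.items()
        if score - best_score <= tolerance
    }
    return max(eligible.items(), key=lambda item: (len(item[1]), -item[0]))


# Toy data: geometry score -> atom mapping (values invented for illustration)
scored_mappings = {
    0.5: {0: 0, 1: 1, 2: 2},                 # geometrically best, smallest core
    1.2: {0: 0, 1: 1, 2: 2, 3: 3},           # slightly worse geometry, larger core
    9.0: {0: 3, 1: 2, 2: 1, 3: 0, 4: 4},     # flipped scaffold, far from the input pose
}

strict_score, strict_mapping = largest_mapping_within_tolerance(scored_mappings, tolerance=0.1)
relaxed_score, relaxed_mapping = largest_mapping_within_tolerance(scored_mappings, tolerance=1.0)
print(strict_score, relaxed_score)
```

A small tolerance reproduces the purely geometric choice, while a larger one admits the bigger core; the flipped mapping is only selected if the tolerance is made wide enough to include it deliberately.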