# Generating 3D structures using ligand-based methods
----
<font size="3">
    
So far we have worked with 2D representations of molecules (SMILES strings). To better understand our candidates and select the most promising ones, we need to generate 3D conformations for them.
<br><br>
There are many approaches to generating 3D conformations from SMILES (e.g. constrained docking algorithms). In this notebook we compute them using a ligand-based alignment and scoring routine. We assume that elaborations of the initial hit do not introduce large changes to the binding mode, and that the interactions seen in the protein-ligand complex of the initial hit are conserved.
    
By default, 100 conformations of the new molecule are generated with the RDKit implementation of the ETKDG algorithm (Riniker & Landrum, 2015). Each conformation is then aligned to the reference hit molecule using Open3DAlign (O3A) (Tosco et al., 2011), also as implemented in RDKit. To select the aligned conformation that best fits the reference molecule, each one is scored with SuCOS (Leung et al., 2019), a combined shape and colour (chemical feature) overlay metric.
    
An implementation of this pipeline is given in the next cell. The main function is ``gen_conf_from_vector(input_mol_block, elaborated_smiles)``, where ``input_mol_block`` is the mol block of the reference structure and ``elaborated_smiles`` is the SMILES string of the molecule whose structure you want to predict.

</font>
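
The first two steps described above (ETKDG conformer generation and O3A alignment) can be tried in isolation before reading the full implementation. The following is a minimal sketch under stated assumptions: the two SMILES are hypothetical toy molecules, the reference is itself embedded rather than read from a hit mol block, only 5 conformers are generated, and conformers are ranked by the O3A score purely for illustration (the pipeline in this notebook ranks by SuCOS instead).

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdMolAlign

# Hypothetical toy molecules, not taken from the notebook's data.
ref_smiles = "c1ccccc1O"      # stands in for the reference hit
probe_smiles = "c1ccccc1OC"   # stands in for an elaborated molecule

# Build a 3D reference; in the notebook this comes from the hit's mol block.
ref = Chem.AddHs(Chem.MolFromSmiles(ref_smiles))
AllChem.EmbedMolecule(ref, randomSeed=42)

# Generate a few ETKDG conformers for the probe (the notebook uses 100).
probe = Chem.AddHs(Chem.MolFromSmiles(probe_smiles))
params = AllChem.ETKDG()
params.randomSeed = 42  # fixed seed so repeated runs give the same conformers
conf_ids = list(AllChem.EmbedMultipleConfs(probe, numConfs=5, params=params))

# Align each conformer onto the reference with Open3DAlign and keep its score.
o3a_scores = {}
for cid in conf_ids:
    o3a = rdMolAlign.GetO3A(probe, ref, prbCid=cid)
    o3a.Align()                    # transforms the conformer in place
    o3a_scores[cid] = o3a.Score()

# Illustration only: pick the conformer with the highest O3A score.
best_cid = max(o3a_scores, key=o3a_scores.get)
print(f"{len(conf_ids)} conformers aligned; highest O3A score for conformer {best_cid}")
```

Fixing the random seed is a choice made here for reproducibility; the notebook's own cell uses the default ETKDG parameters.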

----
Refs:

- Riniker, S. & Landrum, G. A. (2015). Journal of Chemical Information and Modeling. 55, 2562–2574.
- Tosco, P., Balle, T. & Shiri, F. (2011). Journal of Computer-Aided Molecular Design. 25, 777–783.
- Leung, S., Bodkin, M., von Delft, F., Brennan, P. & Morris, G. (2019). ChemRxiv. https://doi.org/10.26434/chemrxiv.8100203.v1.

In [5]:
import os
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem, rdShapeHelpers
from rdkit.Chem.FeatMaps import FeatMaps
from rdkit import RDConfig

########################################################################################
#
#  SuCOS Implementation
#
########################################################################################

fdef = AllChem.BuildFeatureFactory(os.path.join(RDConfig.RDDataDir, 'BaseFeatures.fdef'))

fmParams = {}
for k in fdef.GetFeatureFamilies():
    fparams = FeatMaps.FeatMapParams()
    fmParams[k] = fparams

keep = ('Donor', 'Acceptor', 'NegIonizable', 'PosIonizable', 'ZnBinder',
        'Aromatic', 'Hydrophobe', 'LumpedHydrophobe')


def get_FeatureMapScore(small_m, large_m, score_mode=FeatMaps.FeatMapScoreMode.Best):
    """Score the features of large_m against a feature map built from small_m."""
    featLists = []
    for m in [small_m, large_m]:
        rawFeats = fdef.GetFeaturesForMol(m)
        # filter that list down to only include the ones we're interested in
        featLists.append([f for f in rawFeats if f.GetFamily() in keep])
    fms = [FeatMaps.FeatMap(feats=x, weights=[1] * len(x), params=fmParams) for x in featLists]
    fms[0].scoreMode = score_mode
    # normalise by the smaller feature count; no features means no overlap
    n_feats = min(fms[0].GetNumFeatures(), len(featLists[1]))
    if n_feats == 0:
        return 0.0
    fm_score = fms[0].ScoreFeats(featLists[1]) / n_feats
    return fm_score


def score(reflig, prb_mols, ids, score_mode=FeatMaps.FeatMapScoreMode.All, p=False):
    """Return {str(position in ids): SuCOS score} for the conformers of prb_mols."""
    # addCoords=True places the added hydrogens at sensible 3D positions
    ref = Chem.AddHs(reflig, addCoords=True)

    results_sucos = {}
    results_tani = {}

    for idx, cid in enumerate(ids):
        # Extract conformer cid as a stand-alone molecule
        prb = Chem.MolFromMolBlock(Chem.MolToMolBlock(prb_mols, confId=cid))
        prb = Chem.AddHs(prb, addCoords=True)

        # Feature ("colour") overlap, clipped to [0, 1]
        fm_score = get_FeatureMapScore(ref, prb, score_mode)
        fm_score = np.clip(fm_score, 0, 1)

        # Fraction of the reference volume that protrudes from the probe
        protrude_dist = rdShapeHelpers.ShapeProtrudeDist(ref, prb,
                                                         allowReordering=False)
        protrude_dist = np.clip(protrude_dist, 0, 1)

        # SuCOS: equal-weight mean of feature overlap and shape overlap
        SuCOS_score = 0.5 * fm_score + 0.5 * (1 - protrude_dist)
        tanimoto_dist = rdShapeHelpers.ShapeTanimotoDist(ref, prb)

        results_sucos[str(idx)] = SuCOS_score
        results_tani[str(idx)] = tanimoto_dist

        if p:
            print("********************************")
            print("index: " + str(idx))
            print("SuCOS score:\t%f" % SuCOS_score)
            print("Tanimoto dist:\t%f" % tanimoto_dist)
            print("********************************")

    return results_sucos


########################################################################################
#
#  Alignment & scoring implementation
#
########################################################################################

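def get_best_align(hit_mblock, elab_smiles):
    hit_mol = Chem.MolFromMolBlock(hit_mblock)
    elab_mol = Chem.MolFromSmiles(elab_smiles)
    if hit_mol is None or elab_mol is None:
        raise ValueError("Could not parse the reference mol block or the SMILES")

    # Embed with explicit hydrogens (recommended for ETKDG), then strip them
    # again so that the probe matches the reference, which is read without them
    elab_mol = Chem.AddHs(elab_mol)
    ids = list(AllChem.EmbedMultipleConfs(elab_mol, numConfs=100, params=AllChem.ETKDG()))
    if not ids:
        raise ValueError("No conformers could be generated for " + elab_smiles)
    elab_mol = Chem.RemoveHs(elab_mol)

    # Align every conformer onto the reference hit with Open3DAlign
    for cid in ids:
        o3d = Chem.rdMolAlign.GetO3A(prbMol=elab_mol, refMol=hit_mol, prbCid=cid)
        o3d.Align()

    # Score the aligned conformers and keep the one with the highest SuCOS score
    results_sucos = score(hit_mol, elab_mol, ids)
    scores = list(results_sucos.values())
    best_cid = ids[scores.index(max(scores))]

    return Chem.MolToMolBlock(elab_mol, confId=best_cid)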


def gen_conf_from_vector(input_mol_block, elaborated_smiles):
    # Get the mol
    m = get_best_align(input_mol_block, elaborated_smiles)
    return m
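
To make the scoring step concrete, the arithmetic used in ``score`` can be reproduced without RDKit. This is a small sketch, assuming made-up feature map scores and protrusion distances for three conformers; the helper names ``clip01`` and ``sucos_from_parts`` are hypothetical and exist only for this illustration.

```python
def clip01(x):
    """Clamp a value to the interval [0, 1], as np.clip(x, 0, 1) does."""
    return max(0.0, min(1.0, x))


def sucos_from_parts(feature_map_score, protrude_dist):
    """Equal-weight mean of feature overlap and shape overlap (1 - protrusion)."""
    fm = clip01(feature_map_score)
    shape_overlap = 1.0 - clip01(protrude_dist)
    return 0.5 * fm + 0.5 * shape_overlap


# Hypothetical (feature map score, protrusion distance) pairs per conformer id.
parts = {
    0: (0.80, 0.30),
    1: (0.60, 0.20),
    2: (1.20, 0.40),  # a feature score above 1 is clipped to 1 before averaging
}

sucos = {cid: sucos_from_parts(fm, pd) for cid, (fm, pd) in parts.items()}

# The conformer that would be kept is the one with the highest SuCOS score.
best_cid = max(sucos, key=sucos.get)
for cid, value in sucos.items():
    print(f"conformer {cid}: SuCOS = {value:.2f}")
print("best conformer:", best_cid)
```

With these invented numbers the third conformer wins: its clipped feature score of 1.0 outweighs its larger protrusion.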

## Exercise
----
1. Use the ``gen_conf_from_vector`` function to generate a 3D conformation for each of the molecules you obtained in the Fragment Network exercise (2nd).
2. Save these to a new SDF file.
3. Generate a new SDF file that combines the results from this exercise with those from the merging exercise.
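
As a hint for steps 2 and 3: an SDF file is a sequence of mol blocks, each followed by a line containing ``$$$$``, so the mol blocks returned by ``gen_conf_from_vector`` can be written out as plain text, and two SDF files can be combined by concatenation. The sketch below assumes exactly that; the helper names, file names and placeholder strings are hypothetical, and the placeholders are not valid mol blocks. RDKit's ``Chem.SDWriter`` is an alternative if you prefer to work with molecule objects.

```python
import os
import tempfile


def write_sdf(mol_blocks, path):
    """Write mol blocks to path, ending each record with the SDF delimiter."""
    with open(path, "w") as handle:
        for block in mol_blocks:
            handle.write(block if block.endswith("\n") else block + "\n")
            handle.write("$$$$\n")


def combine_sdf(paths, out_path):
    """Concatenate several SDF files into one."""
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                out.write(handle.read())


# Placeholder text standing in for real mol blocks (hypothetical content).
elaborated = ["elab_1\n(placeholder)\n\nM  END\n", "elab_2\n(placeholder)\n\nM  END\n"]
merged = ["merge_1\n(placeholder)\n\nM  END\n"]

with tempfile.TemporaryDirectory() as tmp:
    elab_path = os.path.join(tmp, "elaborations.sdf")
    merge_path = os.path.join(tmp, "merges.sdf")
    combined_path = os.path.join(tmp, "combined.sdf")

    write_sdf(elaborated, elab_path)
    write_sdf(merged, merge_path)
    combine_sdf([elab_path, merge_path], combined_path)

    with open(combined_path) as handle:
        n_records = handle.read().count("$$$$")

print("records in combined file:", n_records)
```

In your own solution, replace the placeholder lists with the mol blocks you generate and write to paths of your choice instead of a temporary directory.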

In [2]:
# write your code here