# 6 - Functionalizing Ligands 

Beyond building ligands directly from SMILES, we can also add functional groups to them!

In this tutorial we will look at the ligand functionalization routines in Architector, covering:

**(A)** Viewing default functional groups present in Architector by name!

**(B)** Identifying functionalization sites on ligands.

**(C)** Adding both single and multiple functionalizations to a single ligand.

In [1]:
# First, import useful packages again
from architector import (build_complex, # Build routine
                         view_structures, # Visualization
                         smiles2Atoms, # Smiles to ASE Atoms
                         get_obmol_smiles, # Conversion of SMILES to OBmol for editing ligand SMILES
                         get_smiles_obmol, # Convert OBmol to SMILES string
                         convert_obmol_ase) # Conversion of OBmol molecule to ASE atoms for visualization
from architector.io_ptable import functional_groups_dict # Default functional groups dictionary
import copy

# For (A), some functional groups are included by default. Otherwise, they can be input as SMILES strings!

Here, I will visualize all functional groups present by default in Architector. We will write a quick function for placing the functional groups onto a benzene ring for clarity!

In [2]:
def view_functional_group(name, base_group='c1ccccc1', base_inds=[5]):
    """
    Visualize a functional group attached to a base group.
    
    Inputs:
    name : str
        Name of the functional group to put on the base_group.
    base_group : str, optional
        SMILES string of the organic group to functionalize, by default benzene.
    base_inds : list(int), optional
        Indices of the base_group atoms to add the functional group to, by default [5].
    """
    fgs = [{'functional_group':name,'smiles_inds':base_inds}] # Construct functional group dictionary list.
    OBmol = get_obmol_smiles(base_group, functionalizations=fgs) # Perform functionalization in Open Babel
    new_smiles = get_smiles_obmol(OBmol) # Get the new smiles for this OBmol molecule
    ase_atoms = convert_obmol_ase(OBmol) # Convert the molecule to ASE atoms for viewing
    # Print out the functional group name and edited SMILES string
    print('Name: {}\t Functionalized_Smiles: {}'.format(name,new_smiles))  
    view_structures(ase_atoms) # View the structures.

Now, we iterate through the default functional group dictionary, visualizing what each group looks like on benzene!

In [4]:
for key,val in functional_groups_dict.items():
    view_functional_group(key)

Name: methyl	 Functionalized_Smiles: c1ccccc1C


Name: ethyl	 Functionalized_Smiles: c1ccccc1CC


Name: phenyl	 Functionalized_Smiles: c1ccccc1C1=CC=CC=C1


Name: bromo	 Functionalized_Smiles: c1ccccc1Br


Name: iodo	 Functionalized_Smiles: c1ccccc1I


Name: chloro	 Functionalized_Smiles: c1ccccc1Cl


Name: amino	 Functionalized_Smiles: c1ccccc1N


Name: hydroxyl	 Functionalized_Smiles: c1ccccc1O


Name: thiol	 Functionalized_Smiles: c1ccccc1S


Name: carbonyl	 Functionalized_Smiles: c1ccccc1C#[O+]


Name: cyano	 Functionalized_Smiles: c1ccccc1C#N


Name: fluoro	 Functionalized_Smiles: c1ccccc1F


Name: trichloro	 Functionalized_Smiles: c1ccccc1C(Cl)(Cl)Cl


Name: trifluro	 Functionalized_Smiles: c1ccccc1C(F)(F)F


Name: tribromo	 Functionalized_Smiles: c1ccccc1C(Br)(Br)Br


Name: ether	 Functionalized_Smiles: c1ccccc1OC


Name: carboxyl	 Functionalized_Smiles: c1ccccc1C(=O)O


Name: carboxylate	 Functionalized_Smiles: c1ccccc1C(=O)[O-]


Name: ester	 Functionalized_Smiles: c1ccccc1C(=O)OC


Name: ketone	 Functionalized_Smiles: c1ccccc1C(=O)C


Name: aldehyde	 Functionalized_Smiles: c1ccccc1C=O


Name: amide	 Functionalized_Smiles: c1ccccc1C(=O)N(C)C


Name: cyanimide	 Functionalized_Smiles: c1ccccc1N(C)N


Name: phosphonate	 Functionalized_Smiles: c1ccccc1[P+]([O-])(O)O


Name: 2-hydroxypyradine	 Functionalized_Smiles: c1ccccc1C1=CC=CC(=N1)O


Name: 2-methylbenzoic_acid	 Functionalized_Smiles: c1ccccc1C1=C(C=CC(=C1)C)C(=O)O


# For (B), We can also functionalize ligands during complex construction! But we need to know where to functionalize them.

For this example, we will functionalize bipyridine (bipy) bound to Fe in a couple of different ways to highlight this functionality.

To start, we need the base bipy SMILES and coordinating atoms:

In [5]:
# From online:
bipy_smiles = 'n1ccccc1-c2ccccn2'
metal = 'Fe' # Initialize metal

Now, we can view the structures. 

### Instead of just looking for the coordinating atoms, we can also identify functionalization sites:

In [6]:
bipy_atoms = smiles2Atoms(bipy_smiles)
view_structures(bipy_atoms,labelinds=True,w=500,h=500) 

As before, the coordination sites are the Nitrogens (blue) with indices 0 and 11, and bipy will be a "bi_cis" ligand.

### However, if we want to put two functional groups at both of the positions "para" to the Nitrogens, these sites are the Carbons with indices 3 and 8!

In [7]:
bipy_coordList = [0,11]
bipy_ligType = 'bi_cis'
para_smiles_inds = [3,8]
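
The site labels can also be double-checked without the viewer. Below is a minimal sketch, assuming the heavy-atom indices follow the SMILES order of `n1ccccc1-c2ccccn2` (as the labeled view suggests). The bond list `BIPY_BONDS` and the helpers `ring_distances` and `positions_relative_to` are hand-written for illustration and are not part of Architector. Each ring atom is classified by its bond distance from the ring nitrogen: 1 bond is ortho, 2 is meta, 3 is para.

```python
from collections import deque

# Heavy-atom bonds of bipy, assuming indices follow the SMILES order
# (N0, C1..C5 in the first ring, C6..C10 and N11 in the second ring).
BIPY_BONDS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),      # first ring
              (5, 6),                                               # inter-ring bond
              (6, 7), (7, 8), (8, 9), (9, 10), (10, 11), (11, 6)]  # second ring
RINGS = [set(range(0, 6)), set(range(6, 12))]


def ring_distances(start, ring, bonds):
    """Breadth-first search from `start`, restricted to atoms inside `ring`."""
    neighbors = {atom: set() for atom in ring}
    for a, b in bonds:
        if a in ring and b in ring:
            neighbors[a].add(b)
            neighbors[b].add(a)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for nxt in neighbors[atom]:
            if nxt not in dist:
                dist[nxt] = dist[atom] + 1
                queue.append(nxt)
    return dist


def positions_relative_to(start, bonds=BIPY_BONDS, rings=RINGS):
    """Label ring atoms as ortho (1 bond), meta (2) or para (3) to `start`."""
    ring = next(r for r in rings if start in r)
    labels = {1: 'ortho', 2: 'meta', 3: 'para'}
    out = {'ortho': [], 'meta': [], 'para': []}
    for atom, d in sorted(ring_distances(start, ring, bonds).items()):
        if d in labels:
            out[labels[d]].append(atom)
    return out


for n_index in (0, 11):
    print(n_index, positions_relative_to(n_index))
```

For both nitrogens the single para atom matches the indices chosen above (3 for N0, 8 for N11). Note that the ortho lists include the bridging carbons (5 and 6), which already carry the inter-ring bond.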

## For (C), Now we can make a functional group list of dictionaries for the bipy during complex construction! 

Here, let's just use the chloro functionalization from above for simplicity!

In [8]:
para_functional_groups = [{'functional_group':'chloro','smiles_inds':para_smiles_inds}]
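
Before handing such a list to a build, it can help to sanity-check it. The sketch below is a hypothetical helper (`check_functionalizations` is not an Architector function), and the rules it applies are our own conventions rather than anything Architector is known to enforce: each entry needs the two keys used in this tutorial, indices must fall inside the heavy-atom range, and a site should not be reused or coincide with a coordinating atom.

```python
def check_functionalizations(functionalizations, coord_list, n_heavy_atoms):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    seen = set()
    for i, fg in enumerate(functionalizations):
        missing = {'functional_group', 'smiles_inds'} - set(fg)
        if missing:
            problems.append('entry {}: missing keys {}'.format(i, sorted(missing)))
            continue
        for ind in fg['smiles_inds']:
            if not 0 <= ind < n_heavy_atoms:
                problems.append('entry {}: index {} out of range'.format(i, ind))
            if ind in coord_list:
                problems.append('entry {}: index {} is a coordinating atom'.format(i, ind))
            if ind in seen:
                problems.append('entry {}: index {} used more than once'.format(i, ind))
            seen.add(ind)
    return problems


# bipy has 12 heavy atoms (10 C + 2 N) and coordinates through atoms 0 and 11.
good = [{'functional_group': 'chloro', 'smiles_inds': [3, 8]}]
bad = [{'functional_group': 'chloro', 'smiles_inds': [0, 3]},
       {'functional_group': 'bromo', 'smiles_inds': [3]}]

print(check_functionalizations(good, [0, 11], 12))
for problem in check_functionalizations(bad, [0, 11], 12):
    print(problem)
```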

Now we have enough to do a functionalized complex construction with Fe!

In [9]:
# We now have what we need to make an Fe-Bipy complex with functionalizations
lig_dict = {'smiles':bipy_smiles,
            'coordList':bipy_coordList,
            'ligType':bipy_ligType,
            'functionalizations':para_functional_groups}

inputDict = {'core':{
    'metal':'Fe',
    'coreType':'octahedral'  # Just making octahedral complexes for simplicity
        },
    'ligands':[lig_dict], # Add in the ligands dictionary
    'parameters':{
        'assemble_method':'GFN-FF', # Switch to GFN-FF for faster assembly
        'relax':False, # Turn off relaxation for non-optimized structures
        'fill_ligand':0
                 }
    }

In [10]:
out = build_complex(inputDict)

DETERMINING SYMMETRIES.
Total valid symmetries for core octahedral:  1
GENERATING CONFORMATIONS for n1ccc(cc1c1cc(ccn1)Cl)Cl
CONFORMERS GENERATED for n1ccc(cc1c1cc(ccn1)Cl)Cl
ASSEMBLING COMPLEX
LIGAND: n1ccc(cc1c1cc(ccn1)Cl)Cl
FINDING CORRECT CONFORMER

          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :17  EN :  3.16
          Z :26  EN :  1.83
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :17  EN :  3.16
          Z :26  EN :  1.83
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          compu

          rings ...
          # BATM   960
          # H in HB   18
          doing iterative Hueckel for 3 subsystem(s) ...

  atom   neighbors  erfCN metchar sp-hybrid imet pi  qest     coordinates
    1  Fe      6    4.24   0.05         0    2    0   0.355    0.000000    0.000000    0.000000
    2  N       3    2.77   0.00         2    0    1  -0.168    0.105592   -0.000000    3.716162
    3  C       3    2.84   0.00         2    0    1   0.013   -1.795637    0.000017    5.416037
    4  C       3    2.84   0.00         2    0    1  -0.022   -1.420089    0.000011    7.988828
    5  C       3    2.83   0.00         2    0    1   0.031    1.056603    0.000015    8.828529
    6  C       3    2.84   0.00         2    0    1  -0.021    3.062220    0.000001    7.123088
    7  C       3    2.87   0.00         2    0    1   0.039    2.545176    0.000001    4.541760
    8  C       3    2.87   0.00         2    0    1   0.039    4.541765   -0.000010    2.545160
    9  C       3    2.84   0.00 

Complex class generated:  True
ComplexSanity:  True


In [11]:
view_structures(out,w=500,h=500)

Looks like a chloro-functionalized bipy at the para-positions!

### Beyond this, we can create more than one type of functionalization

Here, let's put chloros at the para-positions and bromos at the meta-positions relative to the coordinating atoms.

The meta positions used here correspond to Carbon indices 2 and 9 above. Each is two bonds from its ring Nitrogen (N0-C1-C2 and N11-C10-C9), next to the para Carbons 3 and 8.

In [12]:
multi_fgs = [{'functional_group':'chloro','smiles_inds':[8,3]}, # Chloro at para positions
             {'functional_group':'bromo','smiles_inds':[2,9]}] # Bromo at meta positions

new_inputDict = copy.deepcopy(inputDict) # Copy inputDict

new_inputDict['ligands'][0]['functionalizations'] = multi_fgs # Change bipy functionalization!
del new_inputDict['parameters']['fill_ligand'] # Remove fill ligand -> Default to H2O
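
The `copy.deepcopy` call matters because the input dictionary is nested. The sketch below uses a trimmed stand-in dictionary (not the full `inputDict`) to show the difference: edits to a deep copy leave the original alone, while a shallow copy still shares the inner `'parameters'` dictionary.

```python
import copy

# Stand-in for the tutorial's inputDict, trimmed to the nested parts that matter here.
base = {
    'ligands': [{'smiles': 'n1ccccc1-c2ccccn2',
                 'functionalizations': [{'functional_group': 'chloro',
                                         'smiles_inds': [3, 8]}]}],
    'parameters': {'relax': False, 'fill_ligand': 0},
}

# Deep copy: nested dicts and lists are duplicated, so edits stay local.
deep = copy.deepcopy(base)
del deep['parameters']['fill_ligand']
deep['ligands'][0]['functionalizations'] = []
intact_after_deep = ('fill_ligand' in base['parameters']
                     and len(base['ligands'][0]['functionalizations']) == 1)

# Shallow copy: only the outer dict is new; 'parameters' is the same object.
shallow = copy.copy(base)
del shallow['parameters']['fill_ligand']
intact_after_shallow = 'fill_ligand' in base['parameters']

print(intact_after_deep)     # True
print(intact_after_shallow)  # False
```

With a shallow copy, deleting `fill_ligand` would silently change the original input as well, so earlier builds could no longer be reproduced from it.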

In [13]:
out1 = build_complex(new_inputDict)

Total valid symmetries for core octahedral:  1
GENERATING CONFORMATIONS for n1cc(c(cc1c1cc(c(cn1)Br)Cl)Cl)Br
CONFORMERS GENERATED for n1cc(c(cc1c1cc(c(cn1)Br)Cl)Cl)Br
GENERATING CONFORMATIONS for O
CONFORMERS GENERATED for O
ASSEMBLING COMPLEX
LIGAND: n1cc(c(cc1c1cc(c(cn1)Br)Cl)Cl)Br
FINDING CORRECT CONFORMER

          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :17  EN :  3.16
          Z :26  EN :  1.83
          Z :35  EN :  2.96
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic i

Complex class generated:  True
ComplexSanity:  True


Now, we have a water co-coordinated bi-functionalized Fe-Bipy complex!

In [14]:
view_structures(out1,w=500,h=500)
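
Besides looking at the structure, a quick check is to count the halogens in the functionalized SMILES strings printed in the build logs above. This is a rough sketch using a regular expression (`count_halogens` is our own helper, not part of Architector). It is adequate for these strings, but it is not a SMILES parser and would miscount inputs containing bracket atoms such as `[Fe]`.

```python
import re
from collections import Counter


def count_halogens(smiles):
    """Count F, Cl, Br and I symbols in a SMILES string (two-letter symbols first)."""
    return Counter(re.findall(r'Cl|Br|F|I', smiles))


# SMILES strings copied from the "GENERATING CONFORMATIONS for ..." log lines.
para_only = 'n1ccc(cc1c1cc(ccn1)Cl)Cl'
multi = 'n1cc(c(cc1c1cc(c(cn1)Br)Cl)Cl)Br'

print(dict(count_halogens(para_only)))
print(dict(count_halogens(multi)))
```

The first ligand carries two Cl atoms and the second carries two Cl and two Br atoms, one per requested index in `smiles_inds`.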

Finally, we can make structures with both functionalized and un-functionalized ligands simultaneously.

Also, functional groups can be defined by SMILES strings. Here we introduce a four-carbon chain whose terminal carbon carries three chlorines (`CCCC(Cl)(Cl)Cl`).

In [15]:
functional_groups = [{'functional_group':'CCCC(Cl)(Cl)Cl','smiles_inds':[8,3]}]

lig_dict = {'smiles':bipy_smiles,'coordList':bipy_coordList,'ligType':bipy_ligType, # Functionalized bipy
            'functionalizations':functional_groups}
lig_dict2 = {'smiles':bipy_smiles,'coordList':bipy_coordList,'ligType':bipy_ligType} # Unfunctionalized bipy

last_inputDict = {
    'core':{
        'metal':'Fe',
        'coreType':'octahedral'
    },
    'ligands':[lig_dict,lig_dict2,lig_dict2],
    'parameters':{
        'assemble_method':'GFN-FF',
        'relax':False
    }
}
last_inputDict

{'core': {'metal': 'Fe', 'coreType': 'octahedral'},
 'ligands': [{'smiles': 'n1ccccc1-c2ccccn2',
   'coordList': [0, 11],
   'ligType': 'bi_cis',
   'functionalizations': [{'functional_group': 'CCCC(Cl)(Cl)Cl',
     'smiles_inds': [8, 3]}]},
  {'smiles': 'n1ccccc1-c2ccccn2', 'coordList': [0, 11], 'ligType': 'bi_cis'},
  {'smiles': 'n1ccccc1-c2ccccn2', 'coordList': [0, 11], 'ligType': 'bi_cis'}],
 'parameters': {'assemble_method': 'GFN-FF', 'relax': False}}
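
Three bidentate bipy ligands exactly fill the six sites of an octahedral core, which is why no fill ligand is needed here. A minimal sketch of that bookkeeping is shown below. `total_coordination_sites` is a hypothetical helper, and it assumes `coordList` is a flat list of atom indices, as in this tutorial.

```python
def total_coordination_sites(ligands):
    """Sum the number of coordinating atoms over all ligand dictionaries."""
    return sum(len(lig['coordList']) for lig in ligands)


bipy = {'smiles': 'n1ccccc1-c2ccccn2', 'coordList': [0, 11], 'ligType': 'bi_cis'}
OCTAHEDRAL_SITES = 6  # an octahedral core has six coordination sites

one_bipy = total_coordination_sites([bipy])
three_bipy = total_coordination_sites([bipy, bipy, bipy])

print(OCTAHEDRAL_SITES - one_bipy)    # open sites left with a single bipy
print(OCTAHEDRAL_SITES - three_bipy)  # open sites left with three bipy ligands
```

With a single bipy, four sites remain open, which matches the earlier build where the default water fill ligand completed the complex.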

In [16]:
out = build_complex(last_inputDict)

DETERMINING SYMMETRIES.
Total valid symmetries for core octahedral:  1
GENERATING CONFORMATIONS for n1ccccc1-c2ccccn2
CONFORMERS GENERATED for n1ccccc1-c2ccccn2
GENERATING CONFORMATIONS for n1ccc(cc1c1cc(ccn1)CCCC(Cl)(Cl)Cl)CCCC(Cl)(Cl)Cl
CONFORMERS GENERATED for n1ccc(cc1c1cc(ccn1)CCCC(Cl)(Cl)Cl)CCCC(Cl)(Cl)Cl
ASSEMBLING COMPLEX
LIGAND: n1ccccc1-c2ccccn2
FINDING CORRECT CONFORMER

          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :26  EN :  1.83
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topo

          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          rings ...
          # BATM   1236
          # H in HB   22
          doing iterative Hueckel for 3 subsystem(s) ...

  atom   neighbors  erfCN metchar sp-hybrid imet pi  qest     coordinates
    1  Fe      6    4.24   0.05         0    2    0   0.357    0.000000    0.000000    0.000000
    2  N       3    2.77   0.00         2    0    1  -0.173    0.105437    0.000000    3.716091
    3  C       3    2.84   0.00         2    0    1   0.009   -1.780953    0.000010    5.433214
    4  C       3    2.80   0.00         2

          # BATM   1236
          # H in HB   22
          doing iterative Hueckel for 3 subsystem(s) ...

  atom   neighbors  erfCN metchar sp-hybrid imet pi  qest     coordinates
    1  Fe      6    4.24   0.05         0    2    0   0.357    0.000000    0.000000    0.000000
    2  N       3    2.77   0.00         2    0    1  -0.173    0.105437    0.000000    3.716091
    3  C       3    2.84   0.00         2    0    1   0.009   -1.780953    0.000010    5.433214
    4  C       3    2.80   0.00         2    0    1  -0.026   -1.392127    0.000013    8.012495
    5  C       3    2.82   0.00         2    0    1  -0.028    1.081373   -0.000014    8.874432
    6  C       3    2.81   0.00         2    0    1  -0.024    3.064750   -0.000011    7.132966
    7  C       3    2.87   0.00         2    0    1   0.034    2.544778   -0.000016    4.541293
    8  C       3    2.87   0.00         2    0    1   0.034    4.541295   -0.000018    2.544785
    9  C       3    2.81   0.00         2    0    1

Complex class generated:  True
ComplexSanity:  True


In [17]:
view_structures(out,w=500,h=500)

Looks great! Now we have covered functionalizations as well.

# Conclusions

In this tutorial we covered:

**(A)** Viewing default functional groups present in Architector by name!

**(B)** Identifying functionalization sites on ligands.

**(C)** Adding both single and multiple functionalizations to a single ligand.