# 2 - Specifying New/More Complex Ligands

So far (from 1-Introduction/Overview.ipynb) we know how to specify basic inputs and understand some of the basic outputs from Architector.

What about new or unknown systems, including more complex ligands?

Architector includes tools for handling these cases manually, along with some SMILES utilities!

In this tutorial we will learn:

**(A)** How to manually identify coordination sites of new ligands for generation in Architector.

**(B)** How to automatically and manually identify ligand types (geometries!).

**(C)** How to use internal commands to simplify inputs for more complex coordination environments!

## For (A), we need a challenge. Let's try a [La(Terpyridine)<sub>3</sub>]<sup>3+</sup> complex.

But what is the SMILES for Terpyridine (Terpy, for short), and how is it coordinated to a metal center?

The SMILES can be found on [Wikipedia](https://en.wikipedia.org/wiki/Terpyridine), giving: "c1ccnc(c1)c2cccc(n2)c3ccccn3"

However, what are the coordinating atoms?

Here, we turn to useful routines included in Architector:

In [1]:
import architector
from architector import (build_complex, # Build routine
                         view_structures, # Visualization
                         smiles2Atoms) # Smiles utility to ASE atoms

We will also initialize the metal and ligand smiles for La/Terpy:

In [2]:
terpy_smiles = 'c1ccnc(c1)c2cccc(n2)c3ccccn3'
metal = 'La'

Next, the smiles2Atoms utility converts our terpy SMILES to [ASE atoms](https://wiki.fysik.dtu.dk/ase/ase/atoms.html) for visualization purposes.

In [3]:
terpy_atoms = smiles2Atoms(terpy_smiles)

### Next, we visualize with labelled indices for identification of ligand-metal coordinating atoms (CAs)

We already know the view_structures command, but there are a couple of additional parameters that can be useful for this:

**(i)** The labelinds=True option adds overlays with the exact indices of the atoms as used by Architector

**(ii)** The size of the visualization can be shifted using w (width) and h (height) commands (default is 200x200)

With these two additions we can visualize the ligand structure for identification of CAs:

In [4]:
view_structures(terpy_atoms,labelinds=True,w=500,h=500) 

### Visually, we can identify that the CAs will be the nitrogen atoms (Blue atoms) at indices 3,11, and 17.

We can now save these indices for building the complexes!

In [5]:
terpy_coordList = [3,11,17]
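
The indices read off the viewer can also be cross-checked in plain Python. The sketch below assumes that the indices shown by `view_structures` follow the order of the heavy atoms in the SMILES string, which is consistent with the values 3, 11, and 17 identified above. `heavy_atom_symbols` and `indices_of` are hypothetical helpers (not part of Architector), and the character-by-character parsing only works for simple SMILES such as this one.

```python
# Minimal sketch: recover heavy-atom indices from a *simple* SMILES string.
# Assumption: every atom is written as a single letter (no brackets, charges,
# or two-letter elements such as Cl or Br), which holds for terpy.

def heavy_atom_symbols(simple_smiles):
    """Return element symbols in the order the atoms appear in the SMILES."""
    return [ch.upper() for ch in simple_smiles if ch.isalpha()]


def indices_of(symbols, element):
    """Return the zero-based indices of all atoms matching `element`."""
    return [i for i, symbol in enumerate(symbols) if symbol == element]


terpy_smiles = 'c1ccnc(c1)c2cccc(n2)c3ccccn3'
symbols = heavy_atom_symbols(terpy_smiles)
candidate_CAs = indices_of(symbols, 'N')
print(candidate_CAs)  # [3, 11, 17]
```

For general SMILES, a cheminformatics toolkit should do the parsing; on an ASE atoms object, `get_chemical_symbols()` returns the symbol list directly. Not every nitrogen in a ligand is necessarily a coordinating atom, so the visual check remains useful.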

## Now, for (B), identifying ligand types, we have 2 different methods:

**(i)** Automatically

**(ii)** Manually

For **(i)**, all we need to do is input ligand dictionaries without a specified ligType! So we functionally already have enough information to generate the [La(Terpyridine)<sub>3</sub>]<sup>3+</sup> complex!

In [6]:
terpy_ligand_dict = {'smiles':terpy_smiles,
                    'coordList':terpy_coordList}

And the full input dictionary (including 3 terpy ligands!):

In [8]:
inputDict = {'core':{'metal':metal,'coreCN':9},
            'ligands':[terpy_ligand_dict]*3,
            'parameters':{'assemble_method':'GFN-FF', # Switch to GFN-FF for faster assembly,
                          # but still using GFN2-xTB for the final relaxation. Will have more printout.
                          'n_conformers':2, # Test 2 different conformers
                          'return_only_1':True # Return just one
                         }}
inputDict # Print out full input Dictionary

{'core': {'metal': 'La', 'coreCN': 9},
 'ligands': [{'smiles': 'c1ccnc(c1)c2cccc(n2)c3ccccn3',
   'coordList': [3, 11, 17]},
  {'smiles': 'c1ccnc(c1)c2cccc(n2)c3ccccn3', 'coordList': [3, 11, 17]},
  {'smiles': 'c1ccnc(c1)c2cccc(n2)c3ccccn3', 'coordList': [3, 11, 17]}],
 'parameters': {'assemble_method': 'GFN-FF',
  'n_conformers': 2,
  'return_only_1': True}}
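
Before launching a build that may take minutes, it can be worth confirming that the ligands account for every coordination site. The sketch below is a hypothetical pre-check (not an Architector routine) and assumes each `coordList` is a flat list of atom indices, as in this tutorial. It uses its own variable names so that it does not overwrite `inputDict`.

```python
# Minimal sketch: compare the summed ligand denticity with the requested coreCN.
check_input = {
    'core': {'metal': 'La', 'coreCN': 9},
    'ligands': [{'smiles': 'c1ccnc(c1)c2cccc(n2)c3ccccn3',
                 'coordList': [3, 11, 17]}] * 3,
}


def total_denticity(ligands):
    """Sum the number of coordinating atoms over all ligand dictionaries."""
    return sum(len(lig['coordList']) for lig in ligands)


n_sites = total_denticity(check_input['ligands'])
core_cn = check_input['core']['coreCN']
print(n_sites, core_cn)  # 9 9
```

Three tridentate terpy ligands supply 3 x 3 = 9 coordinating atoms, matching the coordination number of 9 requested for La.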

Looks good! Now we build the complex using Architector - Note that this might take a couple of minutes:

In [9]:
out = build_complex(inputDict) # Might take a couple minutes

ligType not specified for c1ccnc(c1)c2cccc(n2)c3ccccn3 - testing ligand placement to determine ligType!
Assigning lig c1ccnc(c1)c2cccc(n2)c3ccccn3 to ligType tri_mer!
DETERMINING SYMMETRIES.
Total valid symmetries for core am_c3_9H2O_c0:  2
GENERATING CONFORMATIONS for c1ccnc(c1)c2cccc(n2)c3ccccn3
CONFORMERS GENERATED for c1ccnc(c1)c2cccc(n2)c3ccccn3
ASSEMBLING COMPLEX
LIGAND: c1ccnc(c1)c2cccc(n2)c3ccccn3
FINDING CORRECT CONFORMER

          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ---------------


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matr

          rings ...
          # BATM   1842
          # H in HB   33
          doing iterative Hueckel for 3 subsystem(s) ...

  atom   neighbors  erfCN metchar sp-hybrid imet pi  qest     coordinates
    1  La      9    4.41   0.00         0    2    0   0.476   -0.000000    0.000000    0.000000
    2  C       3    2.82   0.00         2    0    1  -0.034    6.651001    0.702598   -7.429873
    3  C       3    2.80   0.00         2    0    1  -0.026    5.290969   -1.515465   -7.142075
    4  C       3    2.97   0.00         2    0    1   0.006    3.572575   -1.636480   -5.186490
    5  N       3    2.79   0.00         2    0    1  -0.166    3.151210    0.276514   -3.553625
    6  C       3    3.02   0.00         2    0    1   0.030    4.471973    2.478692   -3.804465
    7  C       3    2.81   0.00         2    0    1  -0.024    6.243149    2.707115   -5.758600
    8  C       3    3.02   0.00         2    0    1   0.032    3.945557    4.571041   -1.941277
    9  C       3    2.81   0.00


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matr

Complex class generated:  True
OPTIMIZING MOLECULE
                Step[ FC]     Time          Energy          fmax
*Force-consistent energies used in optimization.
BFGSLineSearch:    0[  0] 22:27:03    -3795.033846*       1.3411
xtb could not evaluate input
Failed final relaxation.
ComplexSanity:  False
DETERMINING SYMMETRIES.
Cannot map this ligand combination to core capped_square_antiprismatic - Not generating.
Total valid symmetries for core capped_square_antiprismatic:  0
No coordination environment avaiable for this ligand combination
DETERMINING SYMMETRIES.
Total valid symmetries for core cn9_YICLED:  9
GENERATING CONFORMATIONS for c1ccnc(c1)c2cccc(n2)c3ccccn3
CONFORMERS GENERATED for c1ccnc(c1)c2cccc(n2)c3ccccn3
ASSEMBLING COMPLEX
LIGAND: c1ccnc(c1)c2cccc(n2)c3ccccn3
FINDING CORRECT CONFORMER

          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  E


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matr


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matr

          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          rings ...
          # BATM   1901
          # H in HB   32
          doing iterative Hueckel for 4 subsystem(s) ...

  atom   neighbors  erfCN metchar sp-hybrid imet pi  qest     coordinates
    1  La      9    4.41   0.00         0    2    0   0.478    0.000000    0.000000    0.000000
    2  C       3    2.82   0.00         2    0    1  -0.034    7.351328    5.979199   -3.184312
    3  C       3    2.80   0.00         2    0    1  -0.026    4.864668    6.711596   -3.548263
    4  C       3    2.97   0.00         2    0    1   0.006    2.994020    5.109742   -2.695914
    5  N       3    2.79   0.00         2    0    1  -0.165    3.456228    2.886009   -1.536279
    6  C       3    3.01   0.00

Complex class generated:  True
OPTIMIZING MOLECULE
                Step[ FC]     Time          Energy          fmax
*Force-consistent energies used in optimization.
BFGSLineSearch:    0[  0] 22:27:16    -3792.201287*       3.9170
BFGSLineSearch:    1[  2] 22:27:16    -3793.402426*       3.3971
BFGSLineSearch:    2[  4] 22:27:17    -3794.371412*       2.2204
BFGSLineSearch:    3[  6] 22:27:17    -3794.752585*       1.0865
BFGSLineSearch:    4[  8] 22:27:18    -3795.059304*       1.4642
BFGSLineSearch:    5[ 10] 22:27:18    -3795.352680*       1.3483
BFGSLineSearch:    6[ 12] 22:27:18    -3795.650542*       1.0886
BFGSLineSearch:    7[ 14] 22:27:19    -3795.864766*       0.9674
BFGSLineSearch:    8[ 16] 22:27:19    -3796.074096*       1.1359
BFGSLineSearch:    9[ 18] 22:27:19    -3796.278377*       1.0957
BFGSLineSearch:   10[ 20] 22:27:20    -3796.503123*       1.2880
BFGSLineSearch:   11[ 22] 22:27:20    -3796.667943*       1.0312
BFGSLineSearch:   12[ 24] 22:27:20    -3796.814213*    

And we can again visualize the structures:

In [10]:
view_structures(out)

### Should look great!

However, this took a bit of time.

What was the ligand type assigned automatically? It is in the output text of the build_complex cell, and it should be "tri_mer". This is short for [tridentate meridional](https://www.coursehero.com/study-guides/introchem/isomers-in-coordination-compounds/), which we likely could have identified manually!
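
To make the meridional/facial distinction concrete, the sketch below computes donor-metal-donor angles for two idealized tridentate arrangements on an octahedron. This is only an illustration of the geometry: the site coordinates, the helper names, and the 135 degree threshold are all assumptions, and this is not the procedure Architector uses to assign a ligType.

```python
import math
from itertools import combinations


def angle_deg(u, v):
    """Angle in degrees between two vectors from the metal (at the origin)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # Guard against rounding error
    return math.degrees(math.acos(cosine))


def donor_angles(donor_positions):
    """All pairwise donor-metal-donor angles, sorted ascending."""
    return sorted(angle_deg(u, v) for u, v in combinations(donor_positions, 2))


def fac_or_mer(donor_positions, threshold=135.0):
    """Label 'mer' if any two donors are roughly trans, else 'fac'.

    The threshold is an arbitrary illustrative cutoff.
    """
    return 'mer' if max(donor_angles(donor_positions)) > threshold else 'fac'


# Idealized octahedral sites: mer spans one plane, fac occupies one face.
mer_sites = [(1, 0, 0), (0, 1, 0), (-1, 0, 0)]
fac_sites = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(fac_or_mer(mer_sites), fac_or_mer(fac_sites))  # mer fac
```

In the meridional case two donors sit opposite each other across the metal, which is what a rigid, planar ligand like terpy favors; in the facial case all three donor-metal-donor angles are near 90 degrees.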

To identify the ligand type manually (method **(ii)**), the documentation includes a tool for visualizing all ligand types, which we replicate here for tridentates:

In [11]:
import pandas as pd # Pandas is used to read in the reference data
import numpy as np # Numpy is used for selecting from the database
import architector # Architector is used for importing the filepath to the reference data

In [12]:
import os # Used to build the path in an OS-independent way

# Pull out the datapath for the ligand reference structures:
ref_data_path = os.path.join(os.path.dirname(architector.__file__), 'data', 'angle_stats_datasource.csv')
ref_data_path

'/Users/mgt16/software/architector/architector/data/angle_stats_datasource.csv'

The utility needs a defined denticity. Since our ligand has 3 CAs, it is tridentate!

In [13]:
denticity = 3 

### Now, we can read in and visualize the data

In [14]:
# Read in reference data for examples.
ligdf = pd.read_csv(ref_data_path)
# Show the reference data!
print('Showing examples of each ligand label!')
print('Note that "m" indicates the metal in each - some will not show if M-L bonds are longer than cutoff radii.')
print('####################################################################################')
ligtypes = ligdf.geotype_label.value_counts().index.values
cns = [ligdf[ligdf.geotype_label == x].cn.values[0] for x in ligtypes]
order = np.argsort(cns)
for i in order:
    if cns[i] == denticity: # Only Pick out Tri Dentates
        print("Ligand label - 'ligType':", "'" + ligtypes[i] + "'")
        print('Ligand denticity: ', int(cns[i]))
        # Sample 4 structures matching these labels:
        tdf = ligdf[ligdf.geotype_label == ligtypes[i]].sample(4,random_state=42) 
        # Visualize the structures:
        view_structures(tdf.xyz_structure,labels=['m']*4)
        print('####################################################################################')

Showing examples of each ligand label!
Note that "m" indicates the metal in each - some will not show if M-L bonds are longer than cutoff radii.
####################################################################################
Ligand label - 'ligType': 'tri_fac'
Ligand denticity:  3


####################################################################################
Ligand label - 'ligType': 'tri_mer_bent'
Ligand denticity:  3


####################################################################################
Ligand label - 'ligType': 'tri_mer'
Ligand denticity:  3


####################################################################################


## Here, we can manually see that "tri_mer" or "tri_mer_bent" are possible labels for terpy!

Now we can add this information to the terpy ligands dictionary manually to accelerate generation:

In [15]:
import copy

terpy_lig_dict_copy = copy.deepcopy(terpy_ligand_dict) # Copy terpy ligand dict

terpy_lig_dict_copy['ligType'] = 'tri_mer' # Add ligType manually!

And copy the inputDict to update it with the manual label:

In [16]:
new_inputDict = copy.deepcopy(inputDict) # Copy inputDict

new_inputDict['ligands'] = [terpy_lig_dict_copy]*3 # Update ligands field with new terpy_dict
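
One Python detail about the `[terpy_lig_dict_copy]*3` idiom: list multiplication repeats a reference to the same dictionary rather than creating three dictionaries. That is harmless when all three ligands are meant to be identical, as here, but it matters if one entry is edited later. A minimal sketch in plain Python (the variable names are illustrative):

```python
import copy

ligand = {'smiles': 'c1ccnc(c1)c2cccc(n2)c3ccccn3',
          'coordList': [3, 11, 17],
          'ligType': 'tri_mer'}

shared = [ligand] * 3                                    # Three references to ONE dict
independent = [copy.deepcopy(ligand) for _ in range(3)]  # Three separate dicts

shared[0]['ligType'] = 'tri_fac'       # Changes every entry of `shared`
independent[0]['ligType'] = 'tri_fac'  # Changes only the first entry

print([lig['ligType'] for lig in shared])       # ['tri_fac', 'tri_fac', 'tri_fac']
print([lig['ligType'] for lig in independent])  # ['tri_fac', 'tri_mer', 'tri_mer']
```

If the ligands ever need to differ (for example, testing one ligand with a different ligType), building the list from independent copies avoids the surprise.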

Finally, rebuild the complex. Note that this will still likely be a bit slow, since lanthanides tend to take longer with xTB.

In [17]:
newout = build_complex(new_inputDict) # Still might take a couple minutes

DETERMINING SYMMETRIES.
Total valid symmetries for core am_c3_9H2O_c0:  2
GENERATING CONFORMATIONS for c1ccnc(c1)c2cccc(n2)c3ccccn3
CONFORMERS GENERATED for c1ccnc(c1)c2cccc(n2)c3ccccn3
ASSEMBLING COMPLEX
LIGAND: c1ccnc(c1)c2cccc(n2)c3ccccn3
FINDING CORRECT CONFORMER

          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warsha


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matr


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matr


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matr

          # BATM   1842
          # H in HB   33
          doing iterative Hueckel for 3 subsystem(s) ...

  atom   neighbors  erfCN metchar sp-hybrid imet pi  qest     coordinates
    1  La      9    4.41   0.00         0    2    0   0.476    0.000000    0.000000    0.000000
    2  C       3    2.82   0.00         2    0    1  -0.034    6.651001    0.702598   -7.429873
    3  C       3    2.80   0.00         2    0    1  -0.026    5.290969   -1.515465   -7.142075
    4  C       3    2.97   0.00         2    0    1   0.006    3.572575   -1.636480   -5.186490
    5  N       3    2.79   0.00         2    0    1  -0.166    3.151210    0.276514   -3.553625
    6  C       3    3.02   0.00         2    0    1   0.030    4.471973    2.478692   -3.804465
    7  C       3    2.81   0.00         2    0    1  -0.024    6.243149    2.707115   -5.758600
    8  C       3    3.02   0.00         2    0    1   0.032    3.945557    4.571041   -1.941277
    9  C       3    2.81   0.00         2    0    1

Complex class generated:  True
OPTIMIZING MOLECULE
                Step[ FC]     Time          Energy          fmax
*Force-consistent energies used in optimization.
BFGSLineSearch:    0[  0] 22:28:49    -3795.033846*       1.3411
xtb could not evaluate input
Failed final relaxation.
ComplexSanity:  False
DETERMINING SYMMETRIES.
Cannot map this ligand combination to core capped_square_antiprismatic - Not generating.
Total valid symmetries for core capped_square_antiprismatic:  0
No coordination environment avaiable for this ligand combination
DETERMINING SYMMETRIES.
Total valid symmetries for core cn9_YICLED:  9
GENERATING CONFORMATIONS for c1ccnc(c1)c2cccc(n2)c3ccccn3
CONFORMERS GENERATED for c1ccnc(c1)c2cccc(n2)c3ccccn3
ASSEMBLING COMPLEX
LIGAND: c1ccnc(c1)c2cccc(n2)c3ccccn3
FINDING CORRECT CONFORMER

          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  E

   10  C       3    2.81   0.00         2    0    1  -0.031    8.997065   -3.647070    1.838588
   11  C       3    2.81   0.00         2    0    1  -0.027    6.855541   -4.991530    2.558408
   12  C       3    3.02   0.00         2    0    1   0.031    4.463366   -4.007183    2.061904
   13  N       3    2.80   0.00         2    0    1  -0.183    4.263056   -1.728091    0.871138
   14  C       3    3.01   0.00         2    0    1   0.029    2.055574   -5.351728    2.783884
   15  C       3    2.81   0.00         2    0    1  -0.026    2.032474   -7.697191    4.011121
   16  C       3    2.82   0.00         2    0    1  -0.034   -0.259149   -8.853921    4.633983
   17  C       3    2.80   0.00         2    0    1  -0.029   -2.511691   -7.665192    4.029497
   18  C       3    2.97   0.00         2    0    1   0.005   -2.377473   -5.359506    2.822220
   19  N       3    2.79   0.00         2    0    1  -0.179   -0.185452   -4.211770    2.204834
   20  H       1    0.97   0.00         


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matr


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matr

Complex class generated:  True
OPTIMIZING MOLECULE
                Step[ FC]     Time          Energy          fmax
*Force-consistent energies used in optimization.
BFGSLineSearch:    0[  0] 22:29:03    -3792.201287*       3.9170
BFGSLineSearch:    1[  2] 22:29:03    -3793.402426*       3.3971
BFGSLineSearch:    2[  4] 22:29:03    -3794.371412*       2.2204
BFGSLineSearch:    3[  6] 22:29:04    -3794.752585*       1.0865
BFGSLineSearch:    4[  8] 22:29:04    -3795.059304*       1.4642
BFGSLineSearch:    5[ 10] 22:29:04    -3795.352680*       1.3483
BFGSLineSearch:    6[ 12] 22:29:05    -3795.650542*       1.0886
BFGSLineSearch:    7[ 14] 22:29:05    -3795.864766*       0.9674
BFGSLineSearch:    8[ 16] 22:29:05    -3796.074096*       1.1359
BFGSLineSearch:    9[ 18] 22:29:06    -3796.278377*       1.0957
BFGSLineSearch:   10[ 20] 22:29:06    -3796.503123*       1.2880
BFGSLineSearch:   11[ 22] 22:29:06    -3796.667943*       1.0312
BFGSLineSearch:   12[ 24] 22:29:07    -3796.814213*    

Visualization should reveal the same (or near-identical) output structure:

In [18]:
view_structures(newout)

## For (C), we can reduce the necessity of manually specifying that 3 terpy ligands are filling the coordination environment

This is done with a simple parameter addition:

In [19]:
new_inputDict # print the dictionary for reference

{'core': {'metal': 'La', 'coreCN': 9, 'smiles': '[La]'},
 'ligands': [{'smiles': 'c1ccnc(c1)c2cccc(n2)c3ccccn3',
   'coordList': [3, 11, 17],
   'ligType': 'tri_mer'},
  {'smiles': 'c1ccnc(c1)c2cccc(n2)c3ccccn3',
   'coordList': [3, 11, 17],
   'ligType': 'tri_mer'},
  {'smiles': 'c1ccnc(c1)c2cccc(n2)c3ccccn3',
   'coordList': [3, 11, 17],
   'ligType': 'tri_mer'}],
 'parameters': {'assemble_method': 'GFN-FF',
  'n_conformers': 2,
  'return_only_1': True,
  'is_actinide': False,
  'original_metal': 'La'}}

We update the ligands definition to hold only a single copy of terpy_lig_dict_copy, and add the parameter 'fill_ligand' to indicate that the ligand used to fill the coordination sphere should be the first ligand (index 0), which is terpy!

In [20]:
new_inputDict['ligands'] = [terpy_lig_dict_copy]
new_inputDict['parameters']['fill_ligand'] = 0
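
The arithmetic implied by 'fill_ligand' can be written out explicitly. The sketch below uses a hypothetical helper, `expand_fill_ligand`, to produce the explicit ligand list that the shorter input stands for; it assumes flat `coordList` entries and that the remaining sites divide evenly by the fill ligand's denticity. It illustrates the idea only and may differ from how Architector handles the parameter internally.

```python
import copy


def expand_fill_ligand(input_dict):
    """Hypothetical helper: write out the ligand list implied by 'fill_ligand'."""
    ligands = [copy.deepcopy(lig) for lig in input_dict['ligands']]
    core_cn = input_dict['core']['coreCN']
    fill = ligands[input_dict['parameters']['fill_ligand']]
    occupied = sum(len(lig['coordList']) for lig in ligands)  # Sites already used
    denticity = len(fill['coordList'])                        # Sites per fill ligand
    n_extra = (core_cn - occupied) // denticity               # Extra copies needed
    return ligands + [copy.deepcopy(fill) for _ in range(n_extra)]


sketch_input = {
    'core': {'metal': 'La', 'coreCN': 9},
    'ligands': [{'smiles': 'c1ccnc(c1)c2cccc(n2)c3ccccn3',
                 'coordList': [3, 11, 17],
                 'ligType': 'tri_mer'}],
    'parameters': {'fill_ligand': 0},
}

explicit_ligands = expand_fill_ligand(sketch_input)
print(len(explicit_ligands))  # 3
```

With one tridentate terpy already listed, 9 - 3 = 6 sites remain, so two more copies are needed, giving the same three-ligand list specified by hand earlier.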

We can also request the complexes to not be relaxed to save additional time with the parameter 'relax' set to False. This will result in slightly less accurate geometries, so be a bit more careful here:

In [21]:
new_inputDict['parameters']['relax'] = False
new_inputDict

{'core': {'metal': 'La', 'coreCN': 9, 'smiles': '[La]'},
 'ligands': [{'smiles': 'c1ccnc(c1)c2cccc(n2)c3ccccn3',
   'coordList': [3, 11, 17],
   'ligType': 'tri_mer'}],
 'parameters': {'assemble_method': 'GFN-FF',
  'n_conformers': 2,
  'return_only_1': True,
  'is_actinide': False,
  'original_metal': 'La',
  'fill_ligand': 0,
  'relax': False}}

Looks good, and definitely simpler than the initial version of the inputDict that we created! Now onto building (again)!

In [22]:
newout1 = build_complex(new_inputDict) # Still might take a couple minutes

DETERMINING SYMMETRIES.
Total valid symmetries for core am_c3_9H2O_c0:  2
GENERATING CONFORMATIONS for c1ccnc(c1)c2cccc(n2)c3ccccn3
CONFORMERS GENERATED for c1ccnc(c1)c2cccc(n2)c3ccccn3
ASSEMBLING COMPLEX
LIGAND: c1ccnc(c1)c2cccc(n2)c3ccccn3
FINDING CORRECT CONFORMER

          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warsha


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matr


          CN  :   150.00000
          rep :   500.00000
          disp:  2500.00000
          HB1 :   250.00000
          HB2 :   450.00000

          Pauling EN used:
          Z : 1  EN :  2.20
          Z : 6  EN :  2.55
          Z : 7  EN :  3.04
          Z :57  EN :  1.10
          electric field strengths (au): 0.000

           ------------------------------------------------- 
          |           Force Field Initialization            |
           ------------------------------------------------- 

          distances ...
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matrix with Floyd-Warshall algo ...
          making topology EEQ charges ...
          #fragments for EEQ constrain: 1
          ----------------------------------------
          generating topology and atomic info file ...
          pair mat ...
          computing topology distances matr

Initial Sanity:  True
Complex sanity after adding ligands:  True
Complex class generated:  True
ComplexSanity:  True


In [23]:
view_structures(newout1)

# Conclusions!

In this tutorial we learned:

**(A)** How to manually identify coordination sites of new ligands for generation in Architector.

**(B)** How to automatically and manually identify ligand types (geometries!).

**(C)** How to use internal commands to simplify inputs for more complex coordination environments!