# 8 - 2D Prepopulation Construction:

### In this tutorial we will learn how to prepopulate constructions in 2D for down-selecting 3D generation.
### This will involve 3 key takeaways:

**(A)** How to generate in 2D.

**(B)** How to translate from 2D to 3D in Architector.

**(C)** How to perform 2D to 3D generation in an end-to-end workflow.

In [1]:
# First, imports:
from architector import (view_structures,
                         build_complex,
                         build_complex_2D) # 2D construction routine!
import pandas as pd # library for handling tables (think Excel charts!)
import copy # Standard library for making independent (deep) copies of dictionaries

# Now, let's come up with a toy problem.

### (A) Here, let's prepopulate a set of 2D structures for all of the lanthanides with coordination number 5-10 surrounded by waters!

It will be much easier to do this in 2D first, then pick the ones we want to generate in 3D.

In [2]:
# First we will build a container input dictionary
inputDict = {'core':{'metal':'La','coreCN':5}, # Start with a La core and a coordination number of 5
             'ligands':['water'],
             'parameters':{'fill_ligand':0} # Fill out the coordination environment with water!
            }

Next, we will use this example input to generate in 2D.

In [3]:
# This should take just a fraction of a second!
out = build_complex_2D(inputDict)

Let's see what's in this 2D output dictionary:

In [4]:
out

{'mol2string': '@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electrons: 0 XTB_Unpaired_Electrons: 0 .mol2\n    16    15     1     0     0\nSMALL\nNoCharges\n****\nGenerated from Architector\n\n@<TRIPOS>ATOM\n     1 La1       0.0000    0.0000    0.0000   La        1 RES1   0.0000\n     2 O1        0.0000    0.0000    0.0000   O.3       1 RES1   0.0000\n     3 H1        0.0000    0.0000    0.0000   H         1 RES1   0.0000\n     4 H2        0.0000    0.0000    0.0000   H         1 RES1   0.0000\n     5 O2        0.0000    0.0000    0.0000   O.3       1 RES1   0.0000\n     6 H3        0.0000    0.0000    0.0000   H         1 RES1   0.0000\n     7 H4        0.0000    0.0000    0.0000   H         1 RES1   0.0000\n     8 O3        0.0000    0.0000    0.0000   O.3       1 RES1   0.0000\n     9 H5        0.0000    0.0000    0.0000   H         1 RES1   0.0000\n    10 H6        0.0000    0.0000    0.0000   H         1 RES1   0.0000\n    11 O4        0.0000    0.0000    0.0000   O.3       1 RES1   0.00

It looks like the output holds just a mol2string and a dictionary that repeats the input.
Let's look at the mol2string:

In [5]:
print(out['mol2string'])

@<TRIPOS>MOLECULE
Charge: 3 Unpaired_Electrons: 0 XTB_Unpaired_Electrons: 0 .mol2
    16    15     1     0     0
SMALL
NoCharges
****
Generated from Architector

@<TRIPOS>ATOM
     1 La1       0.0000    0.0000    0.0000   La        1 RES1   0.0000
     2 O1        0.0000    0.0000    0.0000   O.3       1 RES1   0.0000
     3 H1        0.0000    0.0000    0.0000   H         1 RES1   0.0000
     4 H2        0.0000    0.0000    0.0000   H         1 RES1   0.0000
     5 O2        0.0000    0.0000    0.0000   O.3       1 RES1   0.0000
     6 H3        0.0000    0.0000    0.0000   H         1 RES1   0.0000
     7 H4        0.0000    0.0000    0.0000   H         1 RES1   0.0000
     8 O3        0.0000    0.0000    0.0000   O.3       1 RES1   0.0000
     9 H5        0.0000    0.0000    0.0000   H         1 RES1   0.0000
    10 H6        0.0000    0.0000    0.0000   H         1 RES1   0.0000
    11 O4        0.0000    0.0000    0.0000   O.3       1 RES1   0.0000
    12 H7        0.0000    0.000

Notice that the structure contains the correct bonds, along with the charge and unpaired electrons of the system in the header, but every X-Y-Z coordinate is zero (no 3D information!).
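
To make this check explicit, here is a minimal sketch in plain Python that reads the header of a mol2string and tests whether any atom has a nonzero coordinate. The helper names (`parse_header`, `atom_coordinates`, `has_3d_coordinates`) and the small `EXAMPLE_2D` string are hypothetical and written by hand for illustration; they are not part of Architector, and the parsing assumes the header layout shown in the printed output above.

```python
import re

# Hypothetical, hand-written mol2 fragment used only for illustration.
# It mimics the layout printed above but is not real Architector output.
EXAMPLE_2D = """@<TRIPOS>MOLECULE
Charge: 0 Unpaired_Electrons: 0 XTB_Unpaired_Electrons: 0 .mol2
     3     2     1     0     0
SMALL
NoCharges
****
Hand-written example

@<TRIPOS>ATOM
     1 O1        0.0000    0.0000    0.0000   O.3       1 RES1   0.0000
     2 H1        0.0000    0.0000    0.0000   H         1 RES1   0.0000
     3 H2        0.0000    0.0000    0.0000   H         1 RES1   0.0000
@<TRIPOS>BOND
     1     1     2    1
     2     1     3    1
"""


def parse_header(mol2string):
    """Read charge, unpaired electrons, and atom/bond counts from the header."""
    lines = [line.rstrip() for line in mol2string.splitlines()]
    start = lines.index("@<TRIPOS>MOLECULE")
    # The line after the section tag carries "Name: value" pairs.
    fields = dict(re.findall(r"(\w+):\s*(-?\d+)", lines[start + 1]))
    # The next line starts with the number of atoms and the number of bonds.
    counts = lines[start + 2].split()
    return {
        "charge": int(fields["Charge"]),
        "unpaired_electrons": int(fields["Unpaired_Electrons"]),
        "n_atoms": int(counts[0]),
        "n_bonds": int(counts[1]),
    }


def atom_coordinates(mol2string):
    """Collect (x, y, z) tuples from the @<TRIPOS>ATOM section."""
    coords = []
    in_atoms = False
    for line in mol2string.splitlines():
        if line.startswith("@<TRIPOS>"):
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            continue
        if in_atoms and line.strip():
            parts = line.split()
            coords.append(tuple(float(value) for value in parts[2:5]))
    return coords


def has_3d_coordinates(mol2string, tol=1e-8):
    """Return True if any atom coordinate differs from zero."""
    return any(abs(value) > tol
               for xyz in atom_coordinates(mol2string)
               for value in xyz)


print(parse_header(EXAMPLE_2D))
print(has_3d_coordinates(EXAMPLE_2D))
```

For the 2D output above, every coordinate is zero, so a check like this would report that there is no 3D information yet.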

Let's change that.

### (B) Now, let's translate this 2D mol2string into 3D.

To do this, let's prepopulate a dictionary. Note that all you need is the mol2string from 2D;
Architector will handle the translation internally!

In [6]:
translate_dict = {'mol2string':out['mol2string'], 
                  'parameters':{}}

Onto 3D generation!

In [7]:
out_3D = build_complex(translate_dict)

                 Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGSLineSearch:    0 12:12:51     -682.074295*       0.5232
LBFGSLineSearch:    1 12:12:51     -682.095064*       0.1299
LBFGSLineSearch:    2 12:12:51     -682.103289*       0.2343
LBFGSLineSearch:    3 12:12:51     -682.116177*       0.1722
LBFGSLineSearch:    4 12:12:51     -682.127881*       0.3127
LBFGSLineSearch:    5 12:12:51     -682.149931*       0.1687
LBFGSLineSearch:    6 12:12:51     -682.151763*       0.0259
                 Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGSLineSearch:    0 12:12:52     -682.040671*       0.8060
LBFGSLineSearch:    1 12:12:52     -682.056033*       0.2233
LBFGSLineSearch:    2 12:12:52     -682.065290*       0.3681
LBFGSLineSearch:    3 12:12:52     -682.077420*       0.1485
LBFGSLineSearch:    4 12:12:52     -682.095662*       0.4209
LBFGSLineSearch:    5 12:12:52     -682.100613*   

The out_3D dictionary should behave just like other Architector output dictionaries, and we now have 3D mol2strings to visualize!

In [8]:
view_structures(out_3D)

Looks great!

### (C) Now, we can prepopulate all the structures we want to generate in 2D.

Here, we will just use two nested for loops. This block is a bit larger because it generates everything in one go, but it should take just a couple of seconds.

In [9]:
import architector.io_ptable as io_ptable # Import the periodic table from Architector

metals = [] # Get empty lists ready for these parameters!
coordination_numbers = []
mol2strings = []

for metal in io_ptable.lanthanides: # Iterate over the lanthanide elements
    for cn in range(5,11): # Iterate over all desired coordinations
        metals.append(metal) # Save the metal
        coordination_numbers.append(cn) # Save the cn
        inpDict = copy.deepcopy(inputDict) # Copy from our previous 2D dictionary
        inpDict['core']['metal'] = metal # Shift the metal
        inpDict['core']['coreCN'] = cn # Shift the CN
        out_2D = build_complex_2D(inpDict) # Build in 2D
        mol2strings.append(out_2D['mol2string']) # Save the mol2string
        
df = pd.DataFrame({'metal':metals,'cn':coordination_numbers,'mol2string_2D':mol2strings}) # Create a dataframe
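
The nested loops above can also be written as a single pass over every (metal, coordination number) pair. The sketch below only prepares the input dictionaries and does not call Architector, so it runs on its own. The `make_inputs` helper and the hand-typed `LANTHANIDES` list are assumptions made for illustration; the list is assumed to match `io_ptable.lanthanides`, which is consistent with the 90 structures generated here (15 metals times 6 coordination numbers).

```python
import copy
from itertools import product

# Lanthanide symbols typed out by hand so this sketch runs without Architector
# (assumed to match io_ptable.lanthanides).
LANTHANIDES = ["La", "Ce", "Pr", "Nd", "Pm", "Sm", "Eu", "Gd",
               "Tb", "Dy", "Ho", "Er", "Tm", "Yb", "Lu"]

# Same layout as the inputDict defined earlier in this tutorial.
template = {"core": {"metal": "La", "coreCN": 5},
            "ligands": ["water"],
            "parameters": {"fill_ligand": 0}}


def make_inputs(template, metals, coordination_numbers):
    """Build one independent input dictionary per (metal, CN) pair."""
    inputs = []
    for metal, cn in product(metals, coordination_numbers):
        # deepcopy is needed because 'core' is a nested dictionary:
        # a shallow copy would share it with the template.
        inp = copy.deepcopy(template)
        inp["core"]["metal"] = metal
        inp["core"]["coreCN"] = cn
        inputs.append(inp)
    return inputs


inputs = make_inputs(template, LANTHANIDES, range(5, 11))
print(len(inputs))       # number of (metal, CN) combinations
print(inputs[-1]["core"])  # last combination in the sweep
print(template["core"])    # the template itself is left untouched
```

Each dictionary in `inputs` could then be passed to `build_complex_2D` exactly as in the loop above. Using `copy.deepcopy` rather than `dict(template)` matters here, because editing `inp['core']` on a shallow copy would silently change the template as well.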

Now we can look at the full dataset we just generated:

In [10]:
df

Unnamed: 0,metal,cn,mol2string_2D
0,La,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
1,La,6,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
2,La,7,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
3,La,8,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
4,La,9,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
...,...,...,...
85,Lu,6,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
86,Lu,7,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
87,Lu,8,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
88,Lu,9,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...


90 structures is quite a lot for a tutorial. Instead of doing all 90 structural generations, let's just do all of the structures with coordination number 5!

In [11]:
gen_df = df[df.cn == 5].reset_index(drop=True) # Filter to only coordination 5
gen_df

Unnamed: 0,metal,cn,mol2string_2D
0,La,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
1,Ce,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
2,Pr,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
3,Nd,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
4,Pm,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
5,Sm,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
6,Eu,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
7,Gd,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
8,Tb,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
9,Dy,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...


Let's go! This will take a couple of minutes.

In [12]:
lowest_energy_conformers = []
for i,row in gen_df.iterrows():
    trans_dict = copy.deepcopy(translate_dict) # Start from the translation dictionary from (B)
    trans_dict['mol2string'] = row['mol2string_2D'] # Swap in this row's 2D mol2string
    out_3Ds = build_complex(trans_dict) # Build in 3D
    key = list(out_3Ds.keys())[0] # Take the first conformer in the output dictionary
    lowest_energy_conformers.append(out_3Ds[key]['mol2string']) # Save its 3D mol2string
gen_df['mol2string_3D'] = lowest_energy_conformers # Save the output strings

                 Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGSLineSearch:    0 12:12:56     -682.066211*       0.5219
LBFGSLineSearch:    1 12:12:56     -682.087656*       0.1418
LBFGSLineSearch:    2 12:12:56     -682.096781*       0.2685
LBFGSLineSearch:    3 12:12:56     -682.111065*       0.1755
LBFGSLineSearch:    4 12:12:56     -682.123308*       0.3216
LBFGSLineSearch:    5 12:12:56     -682.143998*       0.1778
LBFGSLineSearch:    6 12:12:56     -682.146791*       0.0482
                 Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGSLineSearch:    0 12:12:57     -682.061811*       0.8520
LBFGSLineSearch:    1 12:12:57     -682.074759*       0.2057
LBFGSLineSearch:    2 12:12:57     -682.081922*       0.2474
LBFGSLineSearch:    3 12:12:57     -682.093627*       0.1503
LBFGSLineSearch:    4 12:12:57     -682.100060*       0.1885
LBFGSLineSearch:    5 12:12:57     -682.115534*   

LBFGSLineSearch:    3 12:13:14     -679.492741*       0.2278
LBFGSLineSearch:    4 12:13:14     -679.516872*       0.7052
LBFGSLineSearch:    5 12:13:14     -679.547089*       0.0884
                 Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGSLineSearch:    0 12:13:17     -678.933178*       0.6155
LBFGSLineSearch:    1 12:13:17     -678.958769*       0.2823
LBFGSLineSearch:    2 12:13:17     -678.990882*       0.6286
LBFGSLineSearch:    3 12:13:17     -679.114977*       0.4186
LBFGSLineSearch:    4 12:13:17     -679.137819*       0.4215
LBFGSLineSearch:    5 12:13:17     -679.171676*       0.2382
LBFGSLineSearch:    6 12:13:17     -679.173630*       0.0769
                 Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGSLineSearch:    0 12:13:17     -679.238321*       1.2730
LBFGSLineSearch:    1 12:13:17     -679.257818*       0.2618
LBFGSLineSearch:    2 12:13:17     -679.279309*   

LBFGSLineSearch:    1 12:13:36     -678.432200*       0.3991
LBFGSLineSearch:    2 12:13:36     -678.476122*       0.4816
LBFGSLineSearch:    3 12:13:36     -678.497881*       0.3283
LBFGSLineSearch:    4 12:13:36     -678.596001*       0.2330
LBFGSLineSearch:    5 12:13:36     -678.598634*       0.0988
                 Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGSLineSearch:    0 12:13:39     -677.798733*       0.7483
LBFGSLineSearch:    1 12:13:39     -677.831789*       0.3140
LBFGSLineSearch:    2 12:13:40     -677.882793*       0.7844
LBFGSLineSearch:    3 12:13:40     -678.105478*       0.2451
LBFGSLineSearch:    4 12:13:40     -678.138498*       0.2583
LBFGSLineSearch:    5 12:13:40     -678.149849*       0.1509
LBFGSLineSearch:    6 12:13:40     -678.156931*       0.0695
                 Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGSLineSearch:    0 12:13:40     -678.226985*   

LBFGSLineSearch:    3 12:14:00     -677.561922*       0.4365
LBFGSLineSearch:    4 12:14:00     -677.579069*       0.2933
LBFGSLineSearch:    5 12:14:00     -677.604766*       0.8356
LBFGSLineSearch:    6 12:14:00     -677.665002*       0.3807
LBFGSLineSearch:    7 12:14:00     -677.674770*       0.4178
LBFGSLineSearch:    8 12:14:00     -677.696852*       0.1635
LBFGSLineSearch:    9 12:14:00     -677.698407*       0.0710
                 Step     Time          Energy         fmax
*Force-consistent energies used in optimization.
LBFGSLineSearch:    0 12:14:01     -677.527423*       0.3471
LBFGSLineSearch:    1 12:14:01     -677.549084*       0.4230
LBFGSLineSearch:    2 12:14:01     -677.604409*       0.4413
LBFGSLineSearch:    3 12:14:01     -677.629983*       0.3666
LBFGSLineSearch:    4 12:14:01     -677.720822*       0.5206
LBFGSLineSearch:    5 12:14:01     -677.740867*       0.1097
LBFGSLineSearch:    6 12:14:01     -677.742079*       0.0735
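
For larger batches, a single failed build would stop a loop like the one above partway through. A minimal sketch of a more defensive version is shown here. The `translate_batch` helper is hypothetical, and `fake_build` is a stand-in for `build_complex` that exists only so the sketch runs without Architector. The only assumption about the real output is the one already used above: the builder returns a dictionary whose entries each hold a `'mol2string'`.

```python
import copy


def translate_batch(mol2strings_2d, build_fn, base_parameters=None):
    """Translate each 2D mol2string with build_fn, recording None on failure."""
    results = []
    for mol2string in mol2strings_2d:
        request = {"mol2string": mol2string,
                   "parameters": copy.deepcopy(base_parameters or {})}
        try:
            out = build_fn(request)
            first_key = next(iter(out))  # fails if the output dictionary is empty
            results.append(out[first_key]["mol2string"])
        except Exception as err:
            # Keep going so one bad structure does not lose the whole batch.
            print(f"Build failed: {err!r}")
            results.append(None)
    return results


def fake_build(request):
    """Hypothetical stand-in for build_complex (illustration only)."""
    if "fail" in request["mol2string"]:
        raise ValueError("could not build")
    return {"conformer_0": {"mol2string": request["mol2string"] + " (3D)"}}


results = translate_batch(["a", "fail", "b"], fake_build)
print(results)
```

With Architector installed, `build_complex` could be passed as `build_fn` and `gen_df['mol2string_2D']` as the list of inputs. Because the result list keeps one entry per input, it can still be assigned as a new column, with `None` marking the structures that need another look.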


Let's look at our generated dataframe:

In [13]:
gen_df

Unnamed: 0,metal,cn,mol2string_2D,mol2string_3D
0,La,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
1,Ce,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
2,Pr,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
3,Nd,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
4,Pm,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
5,Sm,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
6,Eu,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
7,Gd,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
8,Tb,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...
9,Dy,5,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...,@<TRIPOS>MOLECULE\nCharge: 3 Unpaired_Electron...


And, we can directly visualize the generated 3D structures:

In [14]:
view_structures(gen_df.mol2string_3D,labels=gen_df.metal.values)

### Looks pretty cool - xTB is picking up some trends across the lanthanides.

# Conclusions

### In this tutorial we learned how to build in 2D and translate to 3D. Specifically, we learned:

**(A)** How to generate in 2D.

**(B)** How to translate from 2D to 3D in Architector.

**(C)** How to perform 2D to 3D generation in an end-to-end workflow.