# Example 3: Build a Combinatorial Library

This notebook briefly introduces the `molli combine` workflow, which substitutes the attachment points on molecules in a combinatorial fashion.

## Hardware Specification for Rerun

Desktop workstation with 2x AMD EPYC 7702 64-core processors (128 physical and 256 logical cores in total) and 1024 GB of DDR4 memory, running Ubuntu 22.04 LTS.

In [1]:
import molli as ml
ml.visual.configure()

## Overview

We start with the file `phosphorus_core.mol2`, in which we define three pseudoatoms attached to the P atom, labeled `AP1`, `AP2` and `AP3`. These are the "attachment points": pseudoatoms that define the direction of the substitution. They will be replaced with the groups found in another file, `substituents.cdxml`.
We will generate two libraries: achiral and chiral phosphines.

In [2]:
mol = ml.load("phosphorus_core.mol2")
for a in mol.atoms:
    print(a)
mol

Atom(element=P, isotope=None, label='P', formal_charge=0, formal_spin=0)
Atom(element=Unknown, isotope=None, label='AP1', formal_charge=0, formal_spin=0)
Atom(element=Unknown, isotope=None, label='AP2', formal_charge=0, formal_spin=0)
Atom(element=Unknown, isotope=None, label='AP3', formal_charge=0, formal_spin=0)


Molecule(name='P', formula='P1 Unknown3')

## Import prerequisite molecules as molli collections

In [3]:
!molli compile phosphorus_core.mol2 -o P.mlib --overwrite

Matched 1 files for importing.
Importing molecules: 100%|██████████████████████| 1/1 [00:00<00:00, 2411.91it/s]


In [4]:
!molli parse substituents.cdxml -o substituents.mlib --hadd --overwrite

Parsing substituents.cdxml: 100%|██████████████| 21/21 [00:00<00:00, 497.99it/s]


## Achiral phosphine ligands

The objective here is to create a library of phosphines in which the P atom is *not stereogenic*, which means that at least two substituents need to be identical.
We are going to achieve this by using molli's command line interface.

### Create a disubstituted phosphine library with `--mode same`
This mode places the same substituent at both attachment points (`AP1` and `AP2`), so the two groups are always identical.

In [5]:
!molli combine --help

usage: molli combine [-h] -s <substituents.mlib>
                     [-m {same,permutns,combns,combns_repl}]
                     [-a ATTACHMENT_POINTS] [-n 1] [-b 1] -o <combined.mlib>
                     [-sep SEPARATOR] [--hadd]
                     [--obopt [ff maxiter tol disp ...]] [--overwrite]
                     cores

Combines two lists of molecules together

positional arguments:
  cores                 Base library file to combine wth substituents

options:
  -h, --help            show this help message and exit
  -s <substituents.mlib>, --substituents <substituents.mlib>
                        Substituents to add at each attachment of a core file
  -m {same,permutns,combns,combns_repl}, --mode {same,permutns,combns,combns_repl}
                        Method for combining substituents
  -a ATTACHMENT_POINTS, --attachment_points ATTACHMENT_POINTS
                        Label used to find attachment points
  -n 1, --nprocs 1      Number of processes to be used in parall

In [6]:
!molli combine P.mlib -s substituents.mlib -a AP1 -a AP2 -m same -o R2P.mlib --hadd --overwrite

Will create a library of size 21
100%|█████████████████████████████████████████████| 1/1 [00:00<00:00, 18.76it/s]


Finally, we attach the remaining substituent at `AP3` to complete the achiral phosphine library.

In [7]:
!molli combine R2P.mlib -a AP3 -s substituents.mlib --hadd --obopt UFF 1000 1e-4 0.02 --overwrite -n1 -b8 -o R3P_achiral.mlib

Will create a library of size 441
100%|███████████████████████████████████████████| 63/63 [02:50<00:00,  2.70s/it]
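The two library sizes reported so far (21 and 441) can be reproduced with a short counting sketch. It does not call molli. It assumes the 21 substituents are simply numbered 1 to 21 and models each library entry as a tuple of substituent numbers; the variable names are illustrative only.

```python
from itertools import product

n_substituents = 21  # entries parsed from substituents.cdxml
substituents = range(1, n_substituents + 1)  # assumed numbering, for illustration

# Step 1 (`-m same`): AP1 and AP2 always receive the same substituent,
# so each substituent yields exactly one R2P intermediate.
r2p = [(s, s) for s in substituents]

# Step 2: every R2P intermediate is paired with every substituent at AP3.
r3p_achiral = [(a, b, c) for (a, b), c in product(r2p, substituents)]

print(len(r2p))          # 21
print(len(r3p_achiral))  # 441
```

Because the first two positions always match, every entry has at least two identical substituents, which is the condition for a non-stereogenic P atom stated above.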


To speed up the calculation, some molli workflows (eventually, all of them) implement parallelization. Rerunning the same command with `-n8` instead of `-n1` shows the acceleration.

In [8]:
!molli combine R2P.mlib -a AP3 -s substituents.mlib --hadd --obopt UFF 1000 1e-4 0.02 --overwrite -n8 -b8 -o R3P_achiral.mlib 

Will create a library of size 441
100%|███████████████████████████████████████████| 63/63 [00:25<00:00,  2.49it/s]
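The acceleration can be quantified from the two progress bars (02:50 with `-n1`, 00:25 with `-n8`). The snippet below is a small arithmetic sketch; `to_seconds` is a helper written for this example, and the timings are specific to the hardware listed at the top of the notebook.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'MM:SS' elapsed-time string from the progress bar to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

serial = to_seconds("02:50")    # elapsed time of the -n1 run
parallel = to_seconds("00:25")  # elapsed time of the -n8 run
n_procs = 8

speedup = serial / parallel
efficiency = speedup / n_procs
print(f"speedup: {speedup:.1f}x, parallel efficiency: {efficiency:.0%}")
# speedup: 6.8x, parallel efficiency: 85%
```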


## Chiral phosphine ligands

The objective here is to create a library of phosphines in which the P atom *is stereogenic*, which means that all substituents need to be different.
We are going to achieve this by using molli's command line interface. 

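The key here is the `-m combns` parameter of the `molli combine` workflow. Judging by the library size reported below, this means that each unordered combination of three distinct substituents is used exactly once: 21 substituents give C(21, 3) = 1330 phosphines.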


In [9]:
!molli combine P.mlib -s substituents.mlib -a AP1 -a AP2 -a AP3 -m combns -o R3P_chiral.mlib --hadd --obopt UFF 1000 1e-4 0.02 --overwrite -n1 -b8

Will create a library of size 1330
100%|█████████████████████████████████████████| 167/167 [08:35<00:00,  3.09s/it]
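The reported size of 1330 is the number of ways to pick three distinct substituents out of 21 when order does not matter. The sketch below checks this with the standard library. The counts for the other modes rest on the assumption that `permutns` and `combns_repl` correspond to `itertools.permutations` and `itertools.combinations_with_replacement`; the mode names suggest this, but this notebook does not confirm it.

```python
from itertools import combinations
from math import comb, perm

n, k = 21, 3  # 21 substituents, 3 attachment points

# Unordered triples of distinct substituents, numbered 1..21 for illustration.
triples = list(combinations(range(1, n + 1), k))
print(len(triples), comb(n, k))  # 1330 1330

# Every triple has three different members, so the P atom is stereogenic.
print(all(len(set(t)) == k for t in triples))  # True

# ASSUMPTION: sizes the other modes would give if they map onto itertools.
print(perm(n, k))          # 7980, ordered triples of distinct substituents
print(comb(n + k - 1, k))  # 1771, unordered triples with repeats allowed
```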


Jupyter also supports special ("magic") commands. The `%mlib_view` command visualizes a molecule from a `MoleculeLibrary` directly, without needing to run a full command to open the library. The syntax is as follows:

`%mlib_view <LIB_PATH> <KEY>`

In [10]:
%mlib_view R3P_chiral.mlib P_15_17_14 

In [11]:
!molli ls R3P_chiral.mlib

    0  P_16_6_9    
    1  P_9_13_12   
    2  P_9_5_14    
    3  P_3_20_12   
    4  P_16_5_7    
    5  P_4_2_10    
    6  P_21_10_12  
    7  P_3_17_20   
    8  P_17_8_18   
    9  P_4_17_9    
   10  P_1_18_19   
   11  P_16_13_18  
   12  P_11_1_13   
   13  P_3_17_13   
   14  P_16_20_18  
   15  P_3_17_5    
   16  P_6_17_18   
   17  P_4_17_12   
   18  P_3_2_14    
   19  P_21_20_18  
   20  P_15_21_9   
   21  P_8_5_19    
   22  P_3_6_12    
   23  P_15_20_5   
   24  P_3_4_19    
   25  P_16_15_18  
   26  P_3_21_19   
   27  P_21_1_8    
   28  P_21_18_2   
   29  P_6_13_18   
   30  P_8_13_2    
   31  P_3_19_10   
   32  P_21_2_10   
   33  P_21_5_12   
   34  P_3_9_2     
   35  P_5_18_19   
   36  P_3_15_6    
   37  P_21_5_13   
   38  P_17_8_7    
   39  P_11_20_14  
   40  P_16_18_19  
   41  P_19_10_7   
   42  P_11_13_2   
   43  P_13_2_12   
   44  P_11_5_2    
   45  P_4_11_20   
   46  P_16_5_14   
   47  P_21_20_9   
   48  P_6_5_12    
   49  P_4_9_2     
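The keys listed above appear to consist of the core name followed by the substituent names, joined by a separator (the `-sep SEPARATOR` option in the help output). Assuming that layout and `_` as the separator, a key can be split back into its parts with plain string handling. `parse_key` is a hypothetical helper written for this example, not part of molli.

```python
def parse_key(key: str, sep: str = "_") -> tuple[str, tuple[str, ...]]:
    """Split a library key into (core name, substituent names).

    ASSUMPTION: keys are '<core><sep><sub1><sep><sub2>...' and the core
    name itself does not contain the separator.
    """
    core, *subs = key.split(sep)
    return core, tuple(subs)

# A few keys copied from the `molli ls` output above.
keys = ["P_16_6_9", "P_9_13_12", "P_3_20_12"]
for key in keys:
    core, subs = parse_key(key)
    all_distinct = len(set(subs)) == len(subs)
    print(core, subs, all_distinct)
```

For the chiral library, the three substituent names in each key should all differ, which is what the last printed column checks.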
