# `molli map` script

Applies a function defined in an arbitrary Python file to every item in a library, parallelized with multiprocessing.
This is a newer approach to running embarrassingly parallel loops: it requires far less code and produces more legible workflows.
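
As a rough illustration of the pattern that `molli map` packages up, the sketch below applies a function to every item of a collection, in batches, across worker processes, using only the standard library. It is not molli's implementation: `batched`, `square`, and `parallel_map` are hypothetical names, and `nprocs` and `batchsize` only mirror the roles of the `-n` and `-b` options.

```python
from itertools import islice
from multiprocessing import Pool


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def square(x):
    """Hypothetical stand-in for a per-item function such as `main`."""
    return x * x


def _apply_to_batch(args):
    """Run `func` on each item of one batch inside a single worker."""
    func, batch = args
    return [func(item) for item in batch]


def parallel_map(func, items, nprocs=1, batchsize=1):
    """Apply `func` to every item and return results in input order."""
    jobs = [(func, batch) for batch in batched(items, batchsize)]
    if nprocs == 1:
        # Serial fallback: no worker processes are started.
        chunks = [_apply_to_batch(job) for job in jobs]
    else:
        # Each worker receives whole batches; `func` must be picklable,
        # i.e. defined at module level rather than as a lambda.
        with Pool(nprocs) as pool:
            chunks = pool.map(_apply_to_batch, jobs)
    return [result for chunk in chunks for result in chunk]
```

With `nprocs` greater than 1, the call should be made under an `if __name__ == "__main__":` guard, as multiprocessing requires on platforms that spawn workers. `pool.map` returns batches in submission order, so the results line up with the inputs.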


In [3]:
!molli map --help

usage: molli combine [-h] -t <lib> [-n 1] [-b 1] [-o <combined.mlib>]
                     [--overwrite]
                     script

Read a molli library and perform some basic inspections

positional arguments:
  script                This is a python file that defines a molli_main
                        function

options:
  -h, --help            show this help message and exit
  -t <lib>, --target <lib>
                        Target library that the function is going to be
                        applied to
  -n 1, --nprocs 1      Number of processes to be used in parallel
  -b 1, --batchsize 1   Number of molecules to be processed at a time on a
                        single core
  -o <combined.mlib>, --output <combined.mlib>
                        Output library
  --overwrite           Overwrite the target files if they exist (default is
                        false)


The key input is the `script` positional argument. This file can be stored anywhere on the file system; it does not need to be in the molli script folder, which makes scripts easier to reuse.

An example of this file would be the following script designed for conformer generation using `RDKit`:

File: `../examples-scripts/004_rdkit_confs.py`

```python
import molli as ml
from molli.external.openbabel import obabel_optimize
from rdkit import Chem
from rdkit.Chem import rdDepictor, rdDistGeom
from rdkit import RDLogger

IN_CTYPE = ml.MoleculeLibrary
OUT_CTYPE = ml.ConformerLibrary

N_CONFS = 20

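def main(mol: ml.Molecule) -> ml.ConformerEnsemble:
    # Convert the molli molecule to an RDKit molecule via a mol2 block,
    # keeping explicit hydrogens.
    rdmol = Chem.MolFromMol2Block(
        ml.dumps(mol, "mol2"),
        removeHs=False,
    )

    if rdmol is None:
        # Raise (rather than return) the error so the failure is not
        # mistaken for a valid result.
        raise RuntimeError(f"Cannot create an rdkit mol from {mol}")

    # Embed up to N_CONFS conformers, pruning those within 0.3 Å RMSD.
    rdDistGeom.EmbedMolecule(rdmol)
    rdDistGeom.EmbedMultipleConfs(rdmol, N_CONFS, pruneRmsThresh=0.3)

    ens = ml.ConformerEnsemble(mol, n_conformers=rdmol.GetNumConformers())

    # Copy the coordinates of each RDKit conformer into the ensemble.
    for i, conf in enumerate(rdmol.GetConformers()):
        ens.coords[i] = conf.GetPositions()

    return ens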

```

A few parts of this script are important:

1. `IN_CTYPE` and `OUT_CTYPE` indicate which library types are appropriate for the input and output files, respectively.
2. The `main` function should take one argument of a type compatible with `IN_CTYPE` and return one object compatible with `OUT_CTYPE`.
3. All other dependencies must be installed in the same pip/conda environment as the current molli version.
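
The three requirements above can be checked before launching a long run. The following is a minimal sketch of such a pre-flight check using only the standard library; `missing_names` and `REQUIRED_NAMES` are hypothetical and not part of molli, and the way `molli map` itself loads a script may differ.

```python
import importlib.util
from pathlib import Path

# Names that the list above says a map script should define.
REQUIRED_NAMES = ("IN_CTYPE", "OUT_CTYPE", "main")


def missing_names(script_path):
    """Import a script by path and list the required names it lacks."""
    path = Path(script_path)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    # Executes the script's top-level code, so a missing dependency
    # surfaces here as an ImportError.
    spec.loader.exec_module(module)
    missing = [name for name in REQUIRED_NAMES if not hasattr(module, name)]
    if "main" not in missing and not callable(module.main):
        missing.append("main (not callable)")
    return missing
```

For example, `missing_names("../examples-scripts/004_rdkit_confs.py")` would return an empty list for the script shown above, provided its imports resolve in the current environment.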

In [5]:
!molli map ../examples-scripts/004_rdkit_confs.py -t ../molli/files/cinchonidine.mlib -n 4 -o ../misc/output.clib


  0%|          | 0/88 [00:00<?, ?it/s]
  1%|          | 1/88 [00:05<07:19,  5.05s/it]
  2%|▏         | 2/88 [00:05<03:24,  2.38s/it]
  6%|▌         | 5/88 [00:08<01:52,  1.36s/it]
  9%|▉         | 8/88 [00:08<00:56,  1.42it/s]
 10%|█         | 9/88 [00:11<01:26,  1.09s/it]
 11%|█▏        | 10/88 [00:13<01:35,  1.23s/it]
 15%|█▍        | 13/88 [00:14<01:01,  1.22it/s]
 16%|█▌        | 14/88 [00:14<00:56,  1.31it/s]
 17%|█▋        | 15/88 [00:15<00:59,  1.23it/s]
 18%|█▊        | 16/88 [00:15<00:47,  1.52it/s]
 19%|█▉        | 17/88 [00:16<00:44,  1.59it/s]
 20%|██        | 18/88 [00:17<00:57,  1.22it/s]
 22%|██▏       | 19/88 [00:20<01:27,  1.27s/it]
 23%|██▎       | 20/88 [00:20<01:17,  1.13s/it]
 26%|██▌       | 23/88 [00:24<01:17,  1.19s/it]
 31%|███       | 27/88 [00:26<00:50,  1.20it/s]
 33%|███▎      | 29/88 [00:27<00:44,  1.33it/s]
 34%|███▍      | 30/88 [00:28<00:48,  1.21it/s]
 35%|███▌      | 31/88 [00:30<00:53,  1.06it/s]
 38%|███▊      | 33/88 [00:31<00:47,  1.15it/s]
 39%|

Now we can inspect the output file:

In [6]:
import molli as ml
ml.visual.configure()

In [7]:
!molli ls ../misc/output.clib

3_13_c_cf0   
11_1_c_cf0   
6_1_c_cf0    
3_1_c_cf0    
1_4_c_cf0    
4_3_c_cf0    
6_6_c_cf0    
9_3_c_cf0    
5_5_c_cf0    
2_3_c_cf0    
5_3_c_cf0    
7_4_c_cf0    
9_7_c_cf0    
7_7_c_cf0    
1_13_c_cf0   
7_6_c_cf0    
8_7_c_cf0    
1_6_c_cf0    
3_7_c_cf0    
10_4_c_cf0   
1_7_c_cf0    
2_5_c_cf0    
11_4_c_cf0   
10_1_c_cf0   
4_7_c_cf0    
8_3_c_cf0    
8_12_c_cf0   
6_12_c_cf0   
2_4_c_cf0    
6_13_c_cf0   
2_1_c_cf0    
3_5_c_cf0    
3_12_c_cf0   
7_12_c_cf0   
5_13_c_cf0   
10_3_c_cf0   
10_5_c_cf0   
11_5_c_cf0   
2_13_c_cf0   
5_7_c_cf0    
3_3_c_cf0    
10_12_c_cf0  
9_4_c_cf0    
5_6_c_cf0    
9_12_c_cf0   
9_6_c_cf0    
7_3_c_cf0    
9_5_c_cf0    
10_6_c_cf0   
10_13_c_cf0  
5_4_c_cf0    
3_4_c_cf0    
6_4_c_cf0    
2_6_c_cf0    
7_13_c_cf0   
6_3_c_cf0    
8_4_c_cf0    
11_13_c_cf0  
2_12_c_cf0   
4_12_c_cf0   
9_13_c_cf0   
4_4_c_cf0    
11_7_c_cf0   
4_1_c_cf0    
7_1_c_cf0    
8_5_c_cf0    
4_5_c_cf0    
4_6_c_cf0    
1_1_c_cf0    
10_7_c_cf0   
7_5_c_cf0    
9_1_c_

In [8]:
%clib_view ../misc/output.clib 3_13_c_cf0