# Welcome to `MOLLI 1.0`

## About this tutorial
This tutorial illustrates a few fundamental principles of the new `molli` package. Because the old and new styles of `molli` differ substantially, this introduction should be useful to both experienced users and newcomers.

## Basic structure of the `molli` package

### Subpackages

In [20]:
# This is meant to be as iconic as `import numpy as np` :)
import molli as ml

## Command line

`molli` provides a number of standalone scripts for standard procedures, such as parsing a .CDXML file or compiling a collection.

In [21]:
# This is a shell command
!molli --HELP

usage: molli [-C <file.yml>] [-L <file.log>] [-v] [-H] [-V]
             {list,align,combine,compile,gbca,grid,info,ls,parse,recollect,run,show,stats,test}

MOLLI package is an API that intends to create a concise and easy-to-use
syntax that encompasses the needs of cheminformatics (especially so, but not
limited to the workflows developed and used in the Denmark laboratory.

positional arguments:
  {list,align,combine,compile,gbca,grid,info,ls,parse,recollect,run,show,stats,test}
                        This is main command that invokes a specific
                        standalone routine in MOLLI. To get full explanation
                        of available commands, run `molli list`

options:
  -C <file.yml>, --CONFIG <file.yml>
                        Sets the file from which molli configuration will be
                        read from
  -L <file.log>, --LOG <file.log>
                        Sets the file that will contain the output of molli
                        routines.
  

In [22]:
# This is a shell command
!molli list

molli combine
molli compile
molli gbca
molli grid
molli info
molli ls
molli parse
molli recollect
molli show
molli stats
molli test

# Basic objects

`molli` provides classes that construct and represent arbitrary chemical entities. They can be built entirely programmatically or loaded from a saved file.

## `Molecule`

Molecules can be instantiated in several ways. Here are three ways to load a mol2 file: from a string, from a file stream, or from a file path.

```python
# This function imports a mol2 file from a string
mol = ml.Molecule.loads_mol2(mol2_string)

# or, similarly, from a file stream
mol = ml.Molecule.load_mol2(file_io)

# or file path
mol = ml.Molecule.load_mol2(file_path)
```
### Here is an example of this in action:

In [23]:
# Example file path bundled with molli
fpath = ml.files.benzene_mol2
print("Path to a test file", fpath)

# Load a Molecule object from the file path
mol = ml.Molecule.load_mol2(fpath)
print(f'This is the Molecule: {mol}')
print("Here is the molecule as an XYZ File")
print(mol.dumps_xyz())

Path to a test file /home/blakeo2/new_molli/molli_dev/molli/molli/files/benzene.mol2
This is the Molecule: Molecule(name='benzene', formula='C6 H6')
Here is the molecule as an XYZ File
12
benzene
C        -2.424200     1.134800    -0.000000
C        -3.698300     0.567200    -0.000000
C        -3.843800    -0.820000    -0.000000
C        -2.715200    -1.639600    -0.000000
C        -1.441100    -1.072000    -0.000000
C        -1.295600     0.315200    -0.000000
H        -2.310800     2.215600    -0.000000
H        -4.577600     1.205800    -0.000000
H        -4.836500    -1.262200    -0.000000
H        -2.828600    -2.720400    -0.000000
H        -0.561800    -1.710600     0.000000
H        -0.302900     0.757400     0.000000



### `molli` is natively built to read in objects from three distinct formats:

`SYBYL_MOL2`

`XYZ` (this will not automatically perceive bonds/connectivity!)

`CDXML` (this will not automatically perceive hydrogens!)
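
To make the `XYZ` caveat concrete, the sketch below parses the benzene XYZ text printed earlier using only the Python standard library. It is not molli's parser; `parse_xyz` is a hypothetical helper name introduced for illustration. The point is that the format stores only an atom count, a comment line, and one element symbol with three coordinates per atom, so there is nothing in the file from which bonds could be read directly.

```python
# Illustrative stdlib-only sketch; `parse_xyz` is a hypothetical helper, not part of molli.
BENZENE_XYZ = """\
12
benzene
C        -2.424200     1.134800    -0.000000
C        -3.698300     0.567200    -0.000000
C        -3.843800    -0.820000    -0.000000
C        -2.715200    -1.639600    -0.000000
C        -1.441100    -1.072000    -0.000000
C        -1.295600     0.315200    -0.000000
H        -2.310800     2.215600    -0.000000
H        -4.577600     1.205800    -0.000000
H        -4.836500    -1.262200    -0.000000
H        -2.828600    -2.720400    -0.000000
H        -0.561800    -1.710600     0.000000
H        -0.302900     0.757400     0.000000
"""


def parse_xyz(text):
    """Return (comment, atoms), where atoms is a list of (element, (x, y, z))."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])      # line 1: number of atoms
    comment = lines[1].strip()   # line 2: free-form comment, here the name
    atoms = []
    for line in lines[2 : 2 + n_atoms]:
        element, x, y, z = line.split()
        atoms.append((element, (float(x), float(y), float(z))))
    if len(atoms) != n_atoms:
        raise ValueError("atom count does not match the header line")
    return comment, atoms


name, atoms = parse_xyz(BENZENE_XYZ)
print(name, len(atoms))
# Only elements and positions are recovered; the file carries no bond records.
print(sorted({element for element, _ in atoms}))
```

Any connectivity for such a structure has to come from another source, for example a format that stores bonds explicitly, such as mol2.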


### Other formats available via the Open Babel interface

Open Babel is an essential tool in cheminformatics: it represents many file formats with a single molecular structure, `OBMol`, which can easily be converted between formats. We have designed an interface between `molli` and Open Babel that allows `molli` to import from almost any known chemical file format.

Open Babel is an optional dependency, however, and must be installed separately to use this functionality (https://github.com/openbabel/openbabel).
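
Because the dependency is optional, it can be useful to check for it before requesting the Open Babel parser. The sketch below is one way to do that with the standard library. It assumes the Python bindings are importable under the module name `openbabel` (as in Open Babel 3.x); how `molli` itself detects the dependency is not shown in this tutorial and may differ.

```python
import importlib.util

# Assumption: the Open Babel Python bindings are importable as `openbabel`.
# find_spec returns None when no such top-level module is installed.
has_openbabel = importlib.util.find_spec("openbabel") is not None

if has_openbabel:
    print("Open Babel bindings found: parser='openbabel' can be requested.")
else:
    print("Open Babel bindings not found: only the native formats are available.")
```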

### An example with the `mol` format is shown below

In [24]:
file_path = ml.files.dendrobine_molv3

# Load the MOL file into molli using the Open Babel parser
mol = ml.load(file_path, fmt='mol', parser='openbabel', otype="molecule", name='dendrobine')
mol

Molecule(name='dendrobine', formula='C16 H25 N1 O2')

## `ConformerEnsemble`

`ConformerEnsemble` is a fundamental `molli` class. It can be thought of as a single shared set of atoms and bonds together with a collection of coordinate sets, one per conformer. Ensembles are loaded in much the same way as `Molecule` objects:

```python
# This function imports a mol2 file from a string
ens = ml.ConformerEnsemble.loads_mol2(mol2_string)

# or, similarly, from a file stream
ens = ml.ConformerEnsemble.load_mol2(file_io)

# or file path
ens = ml.ConformerEnsemble.load_mol2(file_path)
```
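
The idea of one shared set of atoms and bonds with many coordinate sets can be sketched as a small data structure. The class below is a hypothetical toy (`ToyEnsemble` and all of its members are made up for illustration); it is not how `molli` implements `ConformerEnsemble`, and the coordinates are invented. Its `dumps_xyz` method only mimics the idea of writing one XYZ block per conformer.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEnsemble:
    """Hypothetical stand-in for the idea behind an ensemble (not molli code)."""

    name: str
    elements: list   # element symbols, shared by every conformer
    bonds: list      # (i, j) atom index pairs, shared by every conformer
    conformers: list = field(default_factory=list)  # one coordinate list per conformer

    def add_conformer(self, coords):
        # Every conformer must provide exactly one (x, y, z) triple per atom.
        if len(coords) != len(self.elements):
            raise ValueError("a conformer needs one coordinate triple per atom")
        self.conformers.append([tuple(float(v) for v in xyz) for xyz in coords])

    @property
    def n_conformers(self):
        return len(self.conformers)

    def dumps_xyz(self):
        """Concatenate one XYZ block (count, name, atom rows) per conformer."""
        blocks = []
        for coords in self.conformers:
            rows = [
                f"{el:<2s} {x:14.6f} {y:14.6f} {z:14.6f}"
                for el, (x, y, z) in zip(self.elements, coords)
            ]
            blocks.append("\n".join([str(len(rows)), self.name, *rows]))
        return "\n".join(blocks) + "\n"


# A two-atom toy system with made-up coordinates
toy = ToyEnsemble("h2", ["H", "H"], [(0, 1)])
toy.add_conformer([(0.0, 0.0, 0.0), (0.0, 0.0, 0.74)])
toy.add_conformer([(0.0, 0.0, 0.0), (0.0, 0.0, 0.80)])
print(toy.n_conformers)
print(toy.dumps_xyz())
```

Storing the atoms and bonds once keeps every conformer consistent by construction: only the coordinates can differ between members of the ensemble.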

### Here is an example of this in action

In [34]:
file_path = ml.files.pentane_confs_mol2

ens = ml.load(file_path, fmt='mol2', otype='ensemble')
print(f'Here is the ensemble: {ens}')
print('Here are the XYZ coordinates of the full ensemble')
print(ens.dumps_xyz())

Here is the ensemble: ConformerEnsemble(name='pentane', formula='C5 H12', n_conformers=7)
Here are the XYZ coordinates of the full ensemble
17
pentane
C        -2.804500     3.996400    -1.412800
C        -2.748400     3.317400    -0.053600
H        -3.684000     4.644600    -1.476700
H        -2.867800     3.257600    -2.218100
H        -1.915400     4.612600    -1.580600
C        -1.528800     2.404000     0.066300
H        -3.665500     2.735900     0.095900
H        -2.718500     4.083500     0.729900
C        -0.228600     3.184600    -0.124600
H        -1.592000     1.606100    -0.683800
H        -1.526500     1.921300     1.051200
C        -0.089200     4.294900     0.904600
H        -0.200100     3.620600    -1.130000
H         0.628700     2.506600    -0.040400
H        -0.915000     5.009100     0.825500
H         0.847200     4.839900     0.749600
H        -0.081700     3.888900     1.921200
17
pentane
C        -2.729800     4.412900     1.000500
C        -2.748400     3.317

### In addition, conformer ensembles can be instantiated from a list of `Molecule` objects

In [33]:
file_path = ml.files.pentane_confs_mol2
mols = ml.load_all(file_path, otype='molecule')
print(mols)
ens = ml.ConformerEnsemble(mols)
print(ens.n_conformers)
print(f'Here is the ensemble: {ens}')


[Molecule(name='pentane', formula='C5 H12'), Molecule(name='pentane', formula='C5 H12'), Molecule(name='pentane', formula='C5 H12'), Molecule(name='pentane', formula='C5 H12'), Molecule(name='pentane', formula='C5 H12'), Molecule(name='pentane', formula='C5 H12'), Molecule(name='pentane', formula='C5 H12')]
7
Here is the ensemble: ConformerEnsemble(name='pentane', formula='C5 H12', n_conformers=7)


In [5]:
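# Assumption: `m1` is not defined in the cells shown here; the output below
# matches the first pentane conformer loaded above, so that one is used.
m1 = mols[0]

# Select atoms 1, 3 and 5 of the parent molecule
sub = ml.Substructure(m1, (1, 3, 5))
print(sub.dumps_xyz())

# Shifting the substructure coordinates also moves those atoms in the parent
sub.coords += 50.0
print(m1.dumps_xyz())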

3
Substructure(parent=Molecule(name='pentane', formula='C5 H12'), atoms=[1, 3, 5]) [produced with molli]
C        -2.748400     3.317400    -0.053600
H        -2.867800     3.257600    -2.218100
C        -1.528800     2.404000     0.066300

17
pentane [produced with molli]
C        -2.804500     3.996400    -1.412800
C        47.251600    53.317400    49.946400
H        -3.684000     4.644600    -1.476700
H        47.132200    53.257600    47.781900
H        -1.915400     4.612600    -1.580600
C        48.471200    52.404000    50.066300
H        -3.665500     2.735900     0.095900
H        -2.718500     4.083500     0.729900
C        -0.228600     3.184600    -0.124600
H        -1.592000     1.606100    -0.683800
H        -1.526500     1.921300     1.051200
C        -0.089200     4.294900     0.904600
H        -0.200100     3.620600    -1.130000
H         0.628700     2.506600    -0.040400
H        -0.915000     5.009100     0.825500
H         0.847200     4.839900     0.749600
H     