# Parameterisation
To parameterise a small molecule for simulation, import the right `Parameteriser`. Here we showcase the `SolutionParameteriser`: given the SMILES string of the molecule to be simulated, it returns a parameterised system containing one copy of that molecule solvated in water. We use benzene as the example.


In [1]:
benzene_smiles = "c1ccccc1"

Depending on the toolkit at hand, parameterisation can be done either with `via_rdkit()`, which uses the open-source RDKit, or with `via_openeye()`, which uses the commercial OpenEye toolkit.

The parameterised system is stored as a ParmEd object.
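
Which of the two routes is usable depends on whether the `openeye` package is installed. The following is a minimal sketch of how that check could be automated with the standard library; the helper name `pick_parameterisation_route` is hypothetical and is not part of this package.

```python
import importlib.util


def pick_parameterisation_route(module_name="openeye"):
    """Return the name of the class method to call, based on what is installed.

    Hypothetical helper for illustration; not part of the Parameteriser module.
    """
    # find_spec returns None when a top-level module cannot be located
    if importlib.util.find_spec(module_name) is not None:
        return "via_openeye"
    return "via_rdkit"


route = pick_parameterisation_route()
print(route)
```

The returned name could then be resolved on the class, for example with `getattr(Parameteriser.SolutionParameteriser, route)(benzene_smiles)`.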


In [5]:
import Parameteriser
benzene_smiles = "c1ccccc1"
#RDKit parameterisation
rdk_pmd = Parameteriser.SolutionParameteriser.via_rdkit(benzene_smiles)
rdk_pmd

0.1 SMIRNOFF spec file does not contain 'potential' attribute for 'Bonds' tag. The SMIRNOFF spec converter is assuming it has a value of 'harmonic'
0.1 SMIRNOFF spec file does not contain 'potential' attribute for 'Angles' tag. The SMIRNOFF spec converter is assuming it has a value of 'harmonic'
0.1 SMIRNOFF spec file does not contain 'potential' attribute for 'ProperTorsions' tag. The SMIRNOFF spec converter is assuming it has a value of 'charmm'
0.1 SMIRNOFF spec file does not contain 'potential' attribute for 'vdW' tag. The SMIRNOFF spec converter is assuming it has a value of 'Lennard-Jones-12-6'
0.1 SMIRNOFF spec did not allow the 'Electrostatics' tag. Adding it in 0.2 spec conversion, and assuming the following values:
	method: PME
	scale12: 0.0
	scale13: 0.0
	scale15: 1.0
	cutoff: 9.0
	cutoff_unit: angstrom
0.1 SMIRNOFF spec file does not contain 'method' attribute for 'NonBondedMethod/vdW'' tag. The SMIRNOFF spec converter is assuming it has a value of 'cutoff'
0.1 SMIRNOFF spe

<Structure 1494 atoms; 495 residues; 1000 bonds; PBC (orthogonal); parameterized>

In [6]:
#OpenEye alternative
oe_pmd = Parameteriser.SolutionParameteriser.via_openeye(benzene_smiles)
oe_pmd

ModuleNotFoundError: No module named 'openeye'

With RDKit, partial charges for the small molecule are assigned by default with antechamber, yielding AM1-BCC charges.

We have also developed a machine-learned alternative for partial charge assignment, [mlddec](github.com/rinikierlab/mlddec). Once this package is installed, the system can be charged with:

In [7]:
Parameteriser.SolutionParameteriser.load_ddec_models()
Parameteriser.SolutionParameteriser.via_rdkit(benzene_smiles)

Loading models...


  0%|          | 0/10 [00:00<?, ?it/s]

ModuleNotFoundError: No module named 'sklearn'

Once all the systems to be simulated have been prepared with `Parameteriser`, the ddec models should be unloaded, as they occupy a considerable amount of memory.

In [None]:
Parameteriser.SolutionParameteriser.unload_ddec_models()
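
One way to make sure the models are always unloaded, even if parameterisation fails midway, is to wrap the load and unload calls in a context manager. This is only a sketch: `loaded_models` is a hypothetical helper, and `FakeParameteriser` is a stand-in class used here so the example runs without the package installed. In practice one would pass `Parameteriser.SolutionParameteriser.load_ddec_models` and `Parameteriser.SolutionParameteriser.unload_ddec_models` instead.

```python
from contextlib import contextmanager


@contextmanager
def loaded_models(load, unload):
    """Call load() on entry and unload() on exit, even if the body raises."""
    load()
    try:
        yield
    finally:
        unload()


# Stand-in for the real parameteriser class (assumption, for illustration only)
class FakeParameteriser:
    models_loaded = False

    @classmethod
    def load_ddec_models(cls):
        cls.models_loaded = True

    @classmethod
    def unload_ddec_models(cls):
        cls.models_loaded = False


with loaded_models(FakeParameteriser.load_ddec_models,
                   FakeParameteriser.unload_ddec_models):
    # Parameterise all systems here while the models are in memory
    inside = FakeParameteriser.models_loaded

after = FakeParameteriser.models_loaded
```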

The parameterised systems (ParmEd objects) can be stored to disk and reloaded into memory using pickle:

In [8]:
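import pickle

# Store the parameterised system on disk
with open("./benzene.pickle", "wb") as handle:
    pickle.dump(rdk_pmd, handle)

# Load the pickled object back into memory
with open("./benzene.pickle", "rb") as handle:
    reloaded_pmd = pickle.load(handle)
reloaded_pmd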

<Structure 1494 atoms; 495 residues; 1000 bonds; PBC (orthogonal); parameterized>

# Visualisation (Optional)
You can inspect the parameterised ParmEd system inside a Jupyter notebook (JupyterLab does not seem to work) using [nglview](https://github.com/arose/nglview).

In [10]:
import nglview as nv
view = nv.show_parmed(rdk_pmd)
view.add_licorice()
view

NGLWidget()

# Simulation
To simulate, import the right simulator and call its `via_openmm()` class method, which, as the name implies, runs MD with OpenMM under the hood.

We plan to add a Python interface to GROMACS, which will enable `via_gromacs()` in the future.

The default simulation length is 5 ns, with a trajectory frame stored every 10 ps (one frame every 5000 steps, 500 frames in total). The simulation will take some time to run.

In [12]:
from Simulator import SolutionSimulator
SolutionSimulator.via_openmm(rdk_pmd, file_name = "benzene", file_path = "./", 
                             platform = "CUDA", num_steps = 5000 * 500)

'/localhome/cschiebroek/MDFPs/mdfptools/examples/benzene.h5'
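
The numbers quoted for the default run can be cross-checked with a little arithmetic. The sketch below uses only the values stated in the text (5000 steps per frame, 10 ps between frames, 500 frames); the 2 fs time step is what those values imply, not a setting read from the simulator.

```python
steps_per_frame = 5000        # one frame is stored every 5000 steps
frame_interval_fs = 10_000    # 10 ps between stored frames, in femtoseconds
n_frames = 500

# Time step implied by the frame interval and the steps per frame
timestep_fs = frame_interval_fs / steps_per_frame

# Total number of integration steps, as passed to num_steps
num_steps = steps_per_frame * n_frames

# Total simulated time in nanoseconds (1 ns = 1_000_000 fs)
total_ns = num_steps * timestep_fs / 1_000_000

print(timestep_fs, num_steps, total_ns)
```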

# Obtain Molecular Dynamics Fingerprint (MDFP)
Once the simulation has finished, one can extract the relevant properties using the right `Composer` (`SolutionComposer` here). 

In [13]:
#First load in the simulated trajectory
import mdtraj as md
traj = md.load("./benzene.h5")



In [18]:
from Composer import SolutionComposer
mdfp = SolutionComposer.run(traj, rdk_pmd)

The object returned by a Composer is an `MDFP` object, which holds more information than the feature vector alone.
To use it in subsequent machine learning tasks, call the `get_mdfp()` method to obtain the feature vector (i.e. just the values, not the keys).

In [19]:
mdfp.get_mdfp()

[6,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 0,
 8.450071288488036,
 0.06826284991108202,
 8.445589841856616,
 14.612581508541222,
 0.36506963120552305,
 14.556281747113932,
 -17.937042693121423,
 9.534483380689064,
 -18.129155664813904,
 -20.717649206787815,
 5.1167609700951715,
 -21.409579146549156,
 23.06265279702926,
 0.3725312154571262,
 23.003904696659248,
 -38.654691899909245,
 9.578588988605237,
 -38.818633907118866,
 0.15101096959520477,
 7.867008473163575e-05,
 0.15102645991149116,
 2.435979,
 0.008113981,
 2.435612]
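
For machine learning, the vectors from several molecules are typically collected into one feature matrix with one row per molecule. The following is a minimal pure-Python sketch of that step; `stack_fingerprints` is a hypothetical helper, and the short vectors are placeholder values rather than real fingerprints.

```python
def stack_fingerprints(vectors):
    """Collect per-molecule feature vectors into a list of rows.

    Raises ValueError if the vectors do not all have the same length,
    since a feature matrix needs the same columns for every molecule.
    """
    rows = [list(v) for v in vectors]
    if rows and any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("all fingerprints must have the same length")
    return rows


# Placeholder vectors standing in for the output of mdfp.get_mdfp()
fingerprints = [
    [6, 0, 8.45, 0.07],
    [7, 1, 9.10, 0.05],
]
feature_matrix = stack_fingerprints(fingerprints)
print(len(feature_matrix), len(feature_matrix[0]))
```

The resulting list of rows can be handed to most machine learning libraries directly or converted to an array first.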