<img src='./images/Rectangle.svg'/>


**<u>Mo</u>**lecular **<u>S</u>**imulation **<u>De</u>**sign **<u>F</u>**ramework, or MoSDeF, is a robust, Python-based, open-source software framework designed to facilitate the initialization, atom-typing, and screening of soft matter systems using molecular dynamics simulation. The project was initially developed at Vanderbilt University, in collaboration with software engineers from the Institute for Software Integrated Systems (ISIS). It later expanded into a multi-university collaboration, with Vanderbilt University as the lead institution.

The MoSDeF software suite comprises several libraries, namely `mbuild`, `foyer`, and `gmso`, each targeting a different component of system initialization. Specifically, the `mbuild` library can be used to systematically construct molecular systems, either atomistic or coarse-grained, while `foyer` can be used to atom-type the created system, that is, to assign types and parameters to all of its particles, bonds, angles, and dihedrals. The `gmso` library, which is still under development, will be the main data structure used to store all the information about the system, including its details (particles and their positions) and the parameters of all atom types, bond types, angle types, and dihedral types.

By creating tools that let users easily assemble complicated systems, and by automating "trivial" but tedious steps (such as writing out simulation-engine-specific molecular files), MoSDeF allows users to focus on system design and to build more systems, and more complicated ones. This increases the capability of molecular research, especially screening studies, in which researchers iterate over a wide variable space (and therefore over many unique systems).

## mBuild

`mBuild`, a Python library within the MoSDeF software suite, is a general-purpose tool for constructing molecular systems in a scriptable manner. Other existing tools in the community tend to focus on a specific type of molecular system (bilayers, monolayers) and rely on GUIs, which may hamper reproducibility and cannot be used in fully automated workflows. In contrast, `mbuild` allows users to hierarchically construct complex systems from smaller, interchangeable pieces that can be connected through generative, or procedural, modeling.

<img src='images/pmpc.png'/>

### Core Data Structure

#### Compound 

The `Compound` class is the core structure of `mbuild`. `mbuild.Compound` is a general-purpose container that can describe anything within a molecular system: a particle, a coarse-grained bead, a collection of atoms, a collection of `Compounds`, or an operation (e.g. a `Compound` that includes a routine to perform polymerization). `Compounds` can be connected through different mechanisms, either generatively, by directly adding a `Bond` between the two `Compounds` of interest, or procedurally, by adding a `Port` to each of the two `Compounds` that need to be joined. The latter method employs more sophisticated vector transformations to maintain the appropriate orientation and distance of the final bond.
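The hierarchical-container idea can be illustrated with a minimal, standard-library sketch. `ToyCompound` and its methods are hypothetical names invented for this illustration; this is not mbuild's implementation or API, only a small composite structure in the same spirit.

```python
class ToyCompound:
    """Illustrative hierarchical container (NOT mbuild.Compound)."""

    def __init__(self, name):
        self.name = name
        self.children = []   # nested ToyCompound instances
        self.bonds = []      # pairs of leaf particles bonded at this level

    def add(self, child):
        self.children.append(child)
        return child

    def add_bond(self, a, b):
        self.bonds.append((a, b))

    def particles(self):
        """Yield leaf particles depth-first; a childless compound is a particle."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.particles()

    def all_bonds(self):
        """Yield bonds stored at this level and at every nested level."""
        yield from self.bonds
        for child in self.children:
            yield from child.all_bonds()


def make_ch2():
    """Build a small reusable piece: one carbon bonded to two hydrogens."""
    ch2 = ToyCompound("CH2")
    carbon = ch2.add(ToyCompound("C"))
    for _ in range(2):
        hydrogen = ch2.add(ToyCompound("H"))
        ch2.add_bond(carbon, hydrogen)
    return ch2


# Compose two copies of the small piece into a larger compound
# and join them with a bond between their carbons.
chain = ToyCompound("chain")
left = chain.add(make_ch2())
right = chain.add(make_ch2())
chain.add_bond(left.children[0], right.children[0])

n_particles = len(list(chain.particles()))
n_bonds = len(list(chain.all_bonds()))
```

The point of the sketch is that the same container type serves as particle, building block, and full system, so larger structures are assembled by nesting smaller ones.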

#### BondGraph

`BondGraph` is the main data structure used to store and manipulate bonding information between `Compounds`. The class is designed to mimic the API, and inherit part of the functionality, of `NetworkX`'s `Graph` data structure.
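As a rough illustration of what a graph-based bond store offers, the sketch below keeps bonds in an adjacency dictionary and exposes a few graph-style queries. `ToyBondGraph` and its method names are hypothetical and chosen for this example; they are not taken from mbuild or NetworkX.

```python
from collections import defaultdict


class ToyBondGraph:
    """Illustrative undirected bond graph (NOT mbuild's BondGraph)."""

    def __init__(self):
        self._adj = defaultdict(set)

    def add_edge(self, a, b):
        self._adj[a].add(b)
        self._adj[b].add(a)

    def neighbors(self, node):
        # .get avoids creating an empty entry for unknown nodes
        return set(self._adj.get(node, ()))

    def edges(self):
        """Return each bond once, as a sorted tuple."""
        return {tuple(sorted((a, b))) for a in self._adj for b in self._adj[a]}

    def connected_components(self):
        """Group nodes into sets of mutually reachable particles (molecules)."""
        seen, components = set(), []
        for start in list(self._adj):
            if start in seen:
                continue
            stack, component = [start], set()
            while stack:
                node = stack.pop()
                if node in component:
                    continue
                component.add(node)
                stack.extend(self._adj[node] - component)
            seen |= component
            components.append(component)
        return components


graph = ToyBondGraph()
graph.add_edge("C1", "H1")
graph.add_edge("C1", "H2")
graph.add_edge("O1", "H3")

components = graph.connected_components()
```

Storing bonds as a graph makes questions such as "which particles belong to the same molecule" a standard connected-components query.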


#### Port

`Port` is a special structure comprising a set of four ghost `Particles` that are used to connect `Compounds`. When a `Port` is used to connect two `Compounds`, underlying routines in the package automatically perform vector translations to keep each `Compound`'s orientation consistent with the defined structure.
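The translation step can be sketched in a few lines of plain Python. This is a deliberately simplified illustration that treats each port as a single anchor point and only translates; the function names are hypothetical, and the actual library works with the ghost particles described above rather than with this shortcut.

```python
def translate(points, shift):
    """Return a copy of `points` with `shift` added to every coordinate."""
    return [tuple(p + s for p, s in zip(point, shift)) for point in points]


def attach(moving_points, moving_port, fixed_port):
    """Translate a fragment so its port anchor lands on the fixed port anchor."""
    shift = tuple(f - m for f, m in zip(fixed_port, moving_port))
    return translate(moving_points, shift)


# A two-atom fragment floating somewhere in space, with its port anchor nearby.
fragment = [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)]
fragment_port = (4.5, 5.0, 5.0)

# The port on the compound we want to attach to.
target_port = (1.0, 0.0, 0.0)

moved = attach(fragment, fragment_port, target_port)
```

After the call, the fragment keeps its internal geometry but sits where the target port expects it, which is the behavior the automatic routines provide without manual coordinate bookkeeping.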


#### Important features

- Recipe: Uses the Python entry point mechanism to let users create personal templates, which can then be used to create different variations of a system.

- Loading from various file types (including SMILES strings): Allows users to load, convert, and save different file formats.

- Visualization: The library uses two visualization backends (py3dmol and nglview) to display structures in the Jupyter notebook environment. This allows users to easily inspect the structure being constructed.
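The entry point mechanism behind recipes can be sketched with the standard library alone. The helper name `discover_plugins` is hypothetical, and the group name `mbuild.plugins` is an assumption about how recipes are registered; check the mbuild documentation for the exact group.

```python
from importlib import metadata


def discover_plugins(group):
    """Map entry point names to their 'module:object' targets for one group."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions expose a selectable collection
        selected = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


# Assumed group name; the result is empty when no recipe package is installed.
recipes = discover_plugins("mbuild.plugins")
```

Because installed packages advertise their entry points through packaging metadata, a recipe becomes discoverable as soon as it is pip-installed, with no change to the core library.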

#### Example

```python
import mbuild as mb

# Create structure from lowest to highest
carbon = mb.Compound(name='C')
```

```python
# mbuild recipe demonstration
# Only need to clone the recipe repository and pip install it;
# the recipe will then be available to (detected by) mbuild.
```

## Foyer 

The Foyer library is a tool for applying force field information to a molecular system (i.e. atomtyping). Foyer provides a standardized approach to defining chemical context (atomtyping rules) along with the associated force field parameters. The atomtyping rules are encoded not in the source code but in an XML file that is an extension of the OpenMM force field file format. The `foyer` software itself is used to interpret and apply the rules, so it is not limited to a single force field type. This distinguishes `foyer` from other currently available atomtyping tools, which usually support only a specific force field. Automating the application of a force field to a molecular system improves not only the efficiency of the process but also its accuracy and reproducibility compared with assigning atom types by hand.

Force field usage rules are encoded using a combination of a SMARTS-based annotation scheme (discussed further in the following section) and `overrides` statements that define rule precedence.

### XML format

The XML format used by `foyer` is an extension of the OpenMM XML file format, with the addition of the `smarts` definition and of `overrides`, which specify rule precedence.

<img src='images/xml_examples.png'/>
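To make the format concrete, the sketch below parses a tiny force-field-style XML fragment with the standard library and collects each atomtype's SMARTS definition and overrides. The atomtype names are made up for illustration, and the attribute names `def` and `overrides` reflect my understanding of the Foyer format rather than anything stated in this document.

```python
import xml.etree.ElementTree as ET

# Made-up atomtypes; only the overall layout is meant to resemble a real file.
FORCEFIELD_XML = """
<ForceField>
  <AtomTypes>
    <Type name="toy_generic_c" class="CT" element="C" mass="12.011"
          def="[C;X4]" desc="any sp3 carbon"/>
    <Type name="toy_ch3" class="CT" element="C" mass="12.011"
          def="[C;X4](C)(H)(H)H" overrides="toy_generic_c"
          desc="alkane CH3 carbon"/>
  </AtomTypes>
</ForceField>
"""


def read_atomtype_rules(xml_text):
    """Return {atomtype name: {'smarts': str, 'overrides': [names]}}."""
    root = ET.fromstring(xml_text.strip())
    rules = {}
    for type_el in root.iter("Type"):
        overrides = type_el.get("overrides", "")
        rules[type_el.get("name")] = {
            "smarts": type_el.get("def"),
            # overrides is a comma-separated list of atomtype names
            "overrides": [n.strip() for n in overrides.split(",") if n.strip()],
        }
    return rules


rules = read_atomtype_rules(FORCEFIELD_XML)
```

Keeping the rules in a data file like this, rather than in code, is what lets the same software apply different force fields.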

#### SMARTS string (and parsing logic)

What is a SMARTS string:

How a SMARTS string is parsed and matched:
<img src='images/SMARTS_matching.png'/>
Schematic of the workflow to apply SMARTS patterns to chemical topologies. The SMARTS strings used to define atomtypes are read into a `SMARTSGraph` class which inherits from NetworkX's core data structure. Using the `find_matches` method, a `SMARTSGraph` instance can search for subgraph isomorphisms of itself within a provided chemical topology and will yield all atoms that match the first token in the original SMARTS string.
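The subgraph search described in the caption can be illustrated with a small backtracking matcher written in plain Python. This is a toy under simplifying assumptions: the pattern is given directly as element labels and bonds instead of being parsed from a SMARTS string, only element labels are compared, and `toy_find_matches` is a hypothetical name, not Foyer's `find_matches`.

```python
def _adjacency(n_atoms, bonds):
    adj = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def toy_find_matches(pattern_elems, pattern_bonds, topo_elems, topo_bonds):
    """Return topology atom indices that can play the role of pattern atom 0."""
    p_adj = _adjacency(len(pattern_elems), pattern_bonds)
    t_adj = _adjacency(len(topo_elems), topo_bonds)

    def extend(mapping):
        """Try to place the remaining pattern atoms, in index order."""
        if len(mapping) == len(pattern_elems):
            return True
        p = len(mapping)
        for t in range(len(topo_elems)):
            if t in mapping.values() or topo_elems[t] != pattern_elems[p]:
                continue
            # Each already-placed pattern neighbor must also be bonded in the topology
            if all(mapping[q] in t_adj[t] for q in p_adj[p] if q in mapping):
                if extend({**mapping, p: t}):
                    return True
        return False

    return [t for t in range(len(topo_elems))
            if topo_elems[t] == pattern_elems[0] and extend({0: t})]


# Pattern: a carbon (atom 0) bonded to three hydrogens and one other carbon.
methyl_elems = ["C", "H", "H", "H", "C"]
methyl_bonds = [(0, 1), (0, 2), (0, 3), (0, 4)]

# Topology: propane, with carbons 0-1-2 and hydrogens 3..10.
propane_elems = ["C", "C", "C"] + ["H"] * 8
propane_bonds = [(0, 1), (1, 2),
                 (0, 3), (0, 4), (0, 5),
                 (1, 6), (1, 7),
                 (2, 8), (2, 9), (2, 10)]

methyl_carbons = toy_find_matches(methyl_elems, methyl_bonds,
                                  propane_elems, propane_bonds)
```

Only the two terminal carbons of propane satisfy the pattern, because the middle carbon carries just two hydrogens; the atoms returned are those matching the first atom of the pattern.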

#### overrides

The `overrides` key determines rule precedence, which is used to resolve conflicts when more than one atomtype (with different levels of specificity) matches an atom. `Foyer` uses the `overrides` key to determine the atomtype with the highest priority (the one that is not overridden) and applies it to the particle.
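The conflict resolution can be sketched as a small set operation. The function and atomtype names below are hypothetical, and the logic is a simplified reading of the rule described above, not Foyer's actual code.

```python
def resolve_atomtype(matched, overrides):
    """Pick the one matched atomtype that no other matched atomtype overrides.

    matched   -- names of all atomtypes whose rule matched the atom
    overrides -- {atomtype name: [names of atomtypes it overrides]}
    """
    overridden = set()
    for name in matched:
        overridden.update(overrides.get(name, ()))
    winners = sorted(set(matched) - overridden)
    if len(winners) != 1:
        # Zero or several survivors means the rule set is ambiguous for this atom
        raise ValueError(f"cannot resolve atomtype, candidates: {winners}")
    return winners[0]


# The specific CH3 rule overrides the generic sp3-carbon rule.
toy_overrides = {"toy_ch3": ["toy_generic_c"]}

chosen = resolve_atomtype({"toy_generic_c", "toy_ch3"}, toy_overrides)
```

Raising an error when more than one candidate survives makes incomplete precedence rules visible instead of silently picking an arbitrary atomtype.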

### Forcefield plugins

Like `mbuild`, `foyer` employs a plugin mechanism, allowing users to create and use extension libraries in addition to the core `foyer` library. This is possible because `foyer`'s logic works with any force field, as long as it can be expressed in the XML file format described above.

```python
# Parametrization of small molecule
```

## General Molecular Simulation Object (Under development)

### Intro and Relevance

### Main Data Structures

### XML

- What is new and why it matters

### Road map

```python
# Some working example
```

## TRUE Simulation

## Example Workflow