# MoSDeF - A Molecular Simulation Design Framework

## Overview

MoSDeF consists of two core Python packages, [mBuild](https://github.com/mosdef-hub/mbuild) and [Foyer](https://github.com/mosdef-hub/foyer), which, when combined with tools for workflow and analysis management (such as [Signac and Signac-Flow](https://signac.io)), provide the means to perform complex molecular simulations in a **reproducible** manner. Reproducibility is achieved by making every stage of the simulation (system initialization, simulation execution, and analysis) scriptable, so that other researchers can run the same scripts and obtain the same results. MoSDeF is also designed so that systems can be generated programmatically, which facilitates screening of large structural/chemical parameter spaces.

In this overview, we will be focusing specifically on the tools we've developed to address the issue of **_system initialization_**, including the creation of a molecular model and the application of a force field (atom-typing and parameter assignment). The two tools contained within MoSDeF to address system initialization are:

  - [**mBuild**](https://github.com/mosdef-hub/mbuild): A hierarchical, component-based molecule builder
  
  - [**Foyer**](https://github.com/mosdef-hub/foyer): A package for atom-typing as well as applying and disseminating forcefields

This overview is designed to introduce you to these tools in a general manner; however, more in-depth tutorials are also available from within the [mosdef_tutorials repository](https://github.com/mosdef-hub/mosdef_tutorials).

---

**Pre-requisites**

We designed this tutorial for users who have some knowledge of Python and object-oriented programming (OOP). However, we encourage all users to work through the notebook, including those new to Python and OOP, to get a sense of the general concepts behind our tools. The syntax can be picked up later.


------

### Primer on using Jupyter notebooks and Binder

[Jupyter notebooks](https://jupyter-notebook.readthedocs.io/en/stable/) provide an interactive environment for "developing, documenting, and executing code". Several languages are supported; here we will be using Python.

Jupyter notebooks feature two primary types of cells:
1. Markdown cells, like this cell, which contain explanatory text
2. Code cells, that can be executed by either clicking on the "run cell" icon or by hitting SHIFT + ENTER.

Cells do not have to be executed in order (however the cells in this tutorial are designed to be executed _sequentially_), and the order in which cells have been executed is recorded by the bracketed number to the left of the _code_ cell (e.g. [ 1 ]). When a cell is executed you will first see an asterisk (i.e. [ * ]) which means that the cell is still running. When the asterisk is replaced by a number this means the execution has completed.

Markdown cells will _not_ have numbers to the left of their cell. They are text based and are not executable code; executing a Markdown cell simply renders its contents as HTML. More information can be found [here](https://www.markdownguide.org/getting-started).

---

[Binder](https://mybinder.readthedocs.io/en/latest/) provides the ability to deploy Jupyter notebooks in the cloud, such that users do not need to set up their own computing environment to execute the notebook cells.
* We will not be using Binder during this session, but all of our notebooks are hosted on Binder as well.
* Binder is a free service that is community supported, and can be slow to access with multiple users trying to access the same notebook at once.
---


### mBuild Units

mBuild uses a fixed set of units throughout the package.
This provides a controlled environment that limits possible input/output (I/O) errors when reading structures in or saving them out for various simulation engines.

**Length**
* nanometers [nm]

**Angles**
* Radians for all `Compound` operations
* Degrees when building `Lattices`
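
Structure files and other tools often report lengths in ångströms and angles in degrees, so values may need converting before they are passed to mBuild. The following is a minimal sketch using only the Python standard library; the helper name `angstrom_to_nm` is hypothetical and not part of mBuild.

```python
import math

def angstrom_to_nm(value):
    """Convert a length from angstroms to nanometers (1 nm = 10 angstroms)."""
    return value / 10.0

# A C-H distance of about 1.1 angstroms, expressed in nm
ch_bond_nm = angstrom_to_nm(1.1)

# A 90 degree angle expressed in radians, the unit used for Compound operations
right_angle_rad = math.radians(90.0)

print(ch_bond_nm, right_angle_rad)
```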

### Importing mBuild

To begin, we need to import the `mbuild` package, here using the alias `mb`. This will give us access to all of the data structures and functions within mBuild.

In [1]:
import mbuild as mb

### The `Compound` class

The base class of mBuild is the `Compound` class, which defines the primary building block used for constructing molecules. **Molecules are constructed hierarchically**; however, each level of the hierarchy inherits from the `Compound` class. This means that `Compounds` may contain other `Compounds`, and that the same methods and attributes are present for molecule components at any level of the hierarchy. mBuild `Compounds` feature [a variety of useful methods and attributes](http://mosdef-hub.github.io/mbuild/data_structures.html) to facilitate system construction.
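
The idea that every level of the hierarchy shares one interface can be sketched in plain Python. The `Node` class below is a hypothetical toy, not mBuild's implementation; it only illustrates how a container and its leaves can expose the same methods.

```python
class Node:
    """Toy composite: a node is either a leaf (particle) or holds child nodes."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = list(children) if children else []

    def add(self, child):
        self.children.append(child)

    def particles(self):
        """Yield every leaf below this node, however deep the hierarchy is."""
        if not self.children:
            yield self
        for child in self.children:
            yield from child.particles()


# Two CH2 units, each holding three leaves, grouped under one parent
ch2_a = Node('CH2', [Node('C'), Node('H'), Node('H')])
ch2_b = Node('CH2', [Node('C'), Node('H'), Node('H')])
chain = Node('chain')
chain.add(ch2_a)
chain.add(ch2_b)

leaf_names = [leaf.name for leaf in chain.particles()]
print(leaf_names)
```

The same `particles()` call works on a single leaf, on one CH2 unit, and on the whole chain, which is the property the `Compound` hierarchy relies on.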

<img src="./utils/hierarchical_design_image.png" alt="Drawing" style="width: 700px;"/>

Compounds can be created by generating and connecting particles one-by-one; however, it is typically more practical to load coordinates and bonds from a structure file (such as PDB, MOL2, etc.) or to import `Compound` class definitions that other users have already defined.

We'll explore both approaches here for constructing a linear alkane chain.

---

## Creating an Alkane from CH2 building blocks

In our first approach for creating an alkane, we will set up routines for connecting CH2 building blocks (and we will cap the ends of our chain with hydrogens).

### Loading from a PDB structure file

First, we'll load a CH2 moiety into an mBuild `Compound` by reading from a PDB structure file (created using [Avogadro](https://avogadro.cc/)). This will create an mBuild `Compound` containing three atoms (C, H, H), as well as two C-H bonds. The `visualize` method allows us to view our `Compound` directly within the notebook. The outputs shown in this notebook are `py3Dmol` views; [`nglview`](https://github.com/arose/nglview) is also available as a visualization backend.

In [2]:
ch2 = mb.load('utils/ch2.pdb')
ch2.visualize()

<py3Dmol.view at 0x7fa7f416bcd0>

Note that formats such as PDB include bonding information. Formats without bonding information can also be loaded, with the bonds then specified manually. Additionally, one can explicitly define atom locations and bonds; for example, see [mBuild Tutorial 01: Basic Functionality](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_01_Basic_Functionality.ipynb).

### Examining the `Compound` data structure

Now that we have created a `Compound`, we can examine its contents. For example, simply evaluating the `Compound`
within the notebook will provide us with a summary of the contents.

In [3]:
# simply evaluate the compound to display a summary of the number of particles and bonds
ch2

<Compound 3 particles, non-periodic, 2 bonds, id: 140359331270864>

We can examine the coordinates in multiple ways as shown below:

In [4]:
# view the coordinates of the atoms in the compound
ch2.xyz

array([[ 0.  ,  0.  ,  0.  ],
       [-0.11,  0.  ,  0.  ],
       [ 0.11,  0.  ,  0.  ]])

In [5]:
# use the list function to iterate over the atoms and their positions in the compound
list(ch2)

[<C pos=([0. 0. 0.]), 0 bonds, id: 140359236650560>,
 <H pos=([-0.11  0.    0.  ]), 0 bonds, id: 140359236649936>,
 <H pos=([0.11 0.   0.  ]), 0 bonds, id: 140359236537552>]

To view the bonds, we can call the `bonds` method of the `Compound` and wrap the result in `list`:

In [6]:
# list the pairs of atoms that are bonded; each pair appears between parentheses, i.e., (atom1, atom2)
list(ch2.bonds())

[(<H pos=([-0.11  0.    0.  ]), 0 bonds, id: 140359236649936>,
  <C pos=([0. 0. 0.]), 0 bonds, id: 140359236650560>),
 (<H pos=([0.11 0.   0.  ]), 0 bonds, id: 140359236537552>,
  <C pos=([0. 0. 0.]), 0 bonds, id: 140359236650560>)]

In [7]:
# we can also format the output of bonds to simply list the pairs of bonded atoms by name alone
for pair in ch2.bonds():
    print('{}-{}'.format(pair[0].name, pair[1].name))

# equivalent shorthand output using list comprehension
['{}-{}'.format(pair[0].name, pair[1].name) for pair in ch2.bonds()]

H-C
H-C


['H-C', 'H-C']

## Formatting the output and exploring `Compound`: more examples

Let's take a look at all the underlying attributes and data structures that a `Compound` inherently contains.

To do this, we can use the built-in Python function: `dir()`. When an object is passed to `dir()`, it will
attempt to list all of the available attributes that object contains.

In [8]:
# call dir on ch2
dir(ch2)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_box',
 '_charge',
 '_check_if_contains_rigid_bodies',
 '_clone',
 '_clone_bonds',
 '_contains_only_ports',
 '_contains_rigid',
 '_element',
 '_energy_minimize_openbabel',
 '_energy_minimize_openmm',
 '_increment_rigid_ids',
 '_kick',
 '_n_particles',
 '_particles',
 '_periodicity',
 '_pos',
 '_remove',
 '_remove_references',
 '_reorder_rigid_ids',
 '_rigid_id',
 '_update_port_locations',
 '_visualize_nglview',
 '_visualize_py3dmol',
 'add',
 'add_bond',
 'all_ports',
 'ancestors',
 'available_ports',
 'bond_graph',
 'bonds',
 'box',
 'center',
 'charge',
 'children',
 'contains_rigid',
 'element',
 'ener

Information overload! Attributes prefixed with a single underscore (`_attribute`) are considered _private_ by convention. This means
that you should not read or change an object's state through its private attributes.
Attributes without a leading underscore are public and can be accessed like so: `ch2.xyz`

Names with double underscores as both prefix and suffix are special methods and attributes used by Python itself. You can access and
even override them, but in most cases you will _not_ need to. For more information check here: [https://docs.python.org/3.5/reference/datamodel.html#objects-values-and-types](https://docs.python.org/3.5/reference/datamodel.html#objects-values-and-types)
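
These naming conventions apply to any Python object. As a small sketch, the hypothetical `Example` class below has one public and one private attribute, and the names reported by `dir()` are sorted into the three groups described above.

```python
class Example:
    """Hypothetical class with one public and one private attribute."""

    def __init__(self):
        self.value = 1        # public: part of the intended interface
        self._cache = None    # private by convention: internal detail


def classify(name):
    """Label an attribute name as 'dunder', 'private', or 'public'."""
    if name.startswith('__') and name.endswith('__'):
        return 'dunder'
    if name.startswith('_'):
        return 'private'
    return 'public'


obj = Example()
groups = {'dunder': [], 'private': [], 'public': []}
for name in dir(obj):
    groups[classify(name)].append(name)

print(groups['public'])
print(groups['private'])
```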

Let's use a short Python loop to strip out all names that begin with a single or double underscore (the latter are often called _dunder_ names):

In [9]:
# look through all the available attributes of the Compound
# by calling the object's __dir__ method
# dir(ch2) returns the same names, sorted alphabetically
for x in ch2.__dir__():
    if x.startswith('_'):
        continue  # skip private and dunder attributes
    print(x)

name
parent
children
labels
referrers
bond_graph
port_particle
particles
successors
n_particles
ancestors
root
particles_by_name
particles_by_element
charge
rigid_id
contains_rigid
max_rigid_id
rigid_particles
label_rigid_bodies
unlabel_rigid_bodies
add
remove
referenced_ports
all_ports
available_ports
bonds
n_bonds
add_bond
generate_bonds
remove_bond
pos
periodicity
box
element
xyz
xyz_with_ports
center
mins
maxs
get_boundingbox
min_periodic_distance
particles_in_range
visualize
update_coordinates
energy_minimization
energy_minimize
save
translate
translate_to
rotate
spin
from_trajectory
to_trajectory
from_parmed
to_parmed
to_networkx
to_pybel
from_pybel
to_intermol
get_smiles


## Example
Print out the **private** (single underscore at beginning) attributes of `ch2`

In [10]:
for x in ch2.__dir__():
    # keep names with a single leading underscore, skip dunder names
    if x.startswith('_') and not x.startswith('__'):
        print(x)

_pos
_rigid_id
_contains_rigid
_check_if_contains_rigid_bodies
_element
_box
_periodicity
_charge
_particles
_n_particles
_contains_only_ports
_increment_rigid_ids
_reorder_rigid_ids
_remove
_remove_references
_visualize_py3dmol
_visualize_nglview
_update_port_locations
_kick
_energy_minimize_openmm
_energy_minimize_openbabel
_clone
_clone_bonds


### Using the Python `help` function

Python has a built-in function that provides documentation
on the fly. As long as a function or object contains a documentation
string (_docstring_), `help()` will return that documentation
interactively. If you are familiar with the Unix command line, this is similar to `man`.
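
A docstring is an ordinary string literal placed at the top of a function, class, or module. The short sketch below uses a hypothetical function to show where `help()` gets its text; `inspect.getdoc` from the standard library returns the same docstring as a plain string.

```python
import inspect

def bond_label(first, second):
    """Return a label such as 'H-C' for a pair of atom names."""
    return '{}-{}'.format(first, second)

# help(bond_label) would print this text along with the call signature
doc = inspect.getdoc(bond_label)
print(doc)
```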

Let's check out the help information for `help` itself and then for the `Compound` class.

In [11]:
help(help)

Help on _Helper in module _sitebuiltins object:

class _Helper(builtins.object)
 |  Define the builtin 'help'.
 |  
 |  This is a wrapper around pydoc.help that provides a helpful message
 |  when 'help' is typed at the Python interactive prompt.
 |  
 |  Calling help() at the Python prompt starts an interactive help session.
 |  Calling help(thing) prints help for the python object 'thing'.
 |  
 |  Methods defined here:
 |  
 |  __call__(self, *args, **kwds)
 |      Call self as a function.
 |  
 |  __repr__(self)
 |      Return repr(self).
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



In [12]:
help(mb.Compound)

Help on class Compound in module mbuild.compound:

class Compound(builtins.object)
 |  Compound(subcompounds=None, name=None, pos=None, charge=0.0, periodicity=None, box=None, element=None, port_particle=False)
 |  
 |  A building block in the mBuild hierarchy.
 |  
 |  Compound is the superclass of all composite building blocks in the mBuild
 |  hierarchy. That is, all composite building blocks must inherit from
 |  compound, either directly or indirectly. The design of Compound follows the
 |  Composite design pattern::
 |  
 |      @book{DesignPatterns,
 |          author = "Gamma, Erich and Helm, Richard and Johnson, Ralph and
 |          Vlissides, John M.",
 |          title = "Design Patterns",
 |          subtitle = "Elements of Reusable Object-Oriented Software",
 |          year = "1995",
 |          publisher = "Addison-Wesley",
 |          note = "p. 395",
 |          ISBN = "0-201-63361-2",
 |      }
 |  
 |  with Compound being the composite, and Particle playing the role

### Exercise 1
* `Save` `ch2` out to an XYZ file with a unique name and read it back in as a new `Compound`

* List the coordinates and the bonds of this compound, compare to the original `ch2` compound

In [13]:
# How many bonds do you expect to find?
ch2.save('ch2.xyz', overwrite=True)
new_ch2 = mb.load('./ch2.xyz')
print(list(new_ch2.bonds()))
list(new_ch2.particles())
type(list(new_ch2.particles_by_name('C'))[0])
hydrogens = list(new_ch2.particles_by_name('H'))
print(hydrogens[1])

[]
<H pos=([0.11 0.   0.  ]), 0 bonds, id: 140359333318464>
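
The empty bond list is expected: an XYZ file records only an atom count, a comment line, and one element symbol with three coordinates per atom, so there is nowhere to store connectivity. The following is a minimal sketch of the format in plain Python. The helper names are hypothetical, this is not how mBuild reads or writes files, and the coordinates are given in ångströms, the usual convention for XYZ files.

```python
def write_xyz(atoms, comment=''):
    """Serialize (element, x, y, z) tuples into XYZ-formatted text."""
    lines = [str(len(atoms)), comment]
    for element, x, y, z in atoms:
        lines.append('{} {:.4f} {:.4f} {:.4f}'.format(element, x, y, z))
    return '\n'.join(lines) + '\n'


def read_xyz(text):
    """Parse XYZ text back into (element, x, y, z) tuples."""
    lines = text.splitlines()
    n_atoms = int(lines[0])
    atoms = []
    # Atom records start on the third line, after the count and the comment
    for line in lines[2:2 + n_atoms]:
        element, x, y, z = line.split()
        atoms.append((element, float(x), float(y), float(z)))
    return atoms


ch2_atoms = [('C', 0.0, 0.0, 0.0), ('H', -1.1, 0.0, 0.0), ('H', 1.1, 0.0, 0.0)]
text = write_xyz(ch2_atoms, comment='CH2 fragment')
round_trip = read_xyz(text)
print(text)
```

Elements and positions survive the round trip, but any bonds have to be added again after loading, for example with `add_bond`.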


In [14]:
new_ch2.add_bond?

### Adding bonds to a `Compound`

Bonds can be added explicitly using the `Compound.add_bond` method.

Cons:
* Not flexible
* Tedious

In [15]:
class CH2(mb.Compound):
    def __init__(self):
        super(CH2, self).__init__()
        
        carbon = mb.Particle(pos=[0.0, 0.0, 0.0], name='C')
        hydrogen0 = mb.Particle(pos=[0.1, 0.0, 0.0], name='H')
        hydrogen1 = mb.Particle(pos=[-0.1, 0.0, 0.0], name='H')
        self.add([carbon, hydrogen0, hydrogen1])
        self.add_bond((carbon, hydrogen0))
        self.add_bond((carbon, hydrogen1))

In [16]:
test_ch2 = CH2()
print(test_ch2)
test_ch2.visualize()

<CH2 3 particles, non-periodic, 2 bonds, id: 140359331597376>


<py3Dmol.view at 0x7fa7f430c7f0>

In [17]:
print(list(test_ch2.particles()))

[<C pos=([0. 0. 0.]), 0 bonds, id: 140359331595696>, <H pos=([0.1 0.  0. ]), 0 bonds, id: 140359333317408>, <H pos=([-0.1  0.   0. ]), 0 bonds, id: 140359333316976>]


### Exercise 2
Using the above `CH2` class as inspiration, create a `CH4` `Compound` class.

_Note: Do not worry about placing the hydrogens in chemically realistic locations_

In [18]:
class CH4(mb.Compound):
    def __init__(self):
        super(CH4, self).__init__()
        # clear this out for the users
        carbon = mb.Particle(pos=[0.0, 0.0, 0.0], name='C')
        hydrogen0 = mb.Particle(pos=[0.1, 0.0, 0.0], name='H')
        hydrogen1 = mb.Particle(pos=[-0.1, 0.0, 0.0], name='H')
        hydrogen2 = mb.Particle(pos=[0.0, 0.1, 0.0], name='H')
        hydrogen3 = mb.Particle(pos=[0.0, -0.1, 0.0], name='H')
        self.add([carbon, hydrogen0, hydrogen1, hydrogen2, hydrogen3])
        self.add_bond((carbon, hydrogen0))
        self.add_bond((carbon, hydrogen1))
        self.add_bond((carbon, hydrogen2))
        self.add_bond((carbon, hydrogen3))

In [19]:
ch4 = CH4()
ch4.visualize()

<py3Dmol.view at 0x7fa7f43b40a0>

In [53]:
ch4.energy_minimize()
ch4.visualize()


<py3Dmol.view at 0x7fa7d8ed6a30>

## Loading in Compounds

We can create compounds in multiple ways, some of which have been shown above.

One of the easiest ways is to load a compound from a file with `mb.load`:

In [54]:
ch2 = mb.load('utils/ch2.pdb')
ch2.visualize()

<py3Dmol.view at 0x7fa7d8e8eb80>

Another option is to explicitly create `Particles` and `add` them to a `Compound`:

In [22]:
ch2 = mb.Compound()
carbon = mb.Particle(pos=[0.0, 0.0, 0.0], name='C')
hydrogen0 = mb.Particle(pos=[0.1, 0.0, 0.0], name='H')
hydrogen1 = mb.Particle(pos=[-0.1, 0.0, 0.0], name='H')
ch2.add([carbon, hydrogen0, hydrogen1])
ch2.add_bond((carbon, hydrogen0))
ch2.add_bond((carbon, hydrogen1))
ch2.visualize()

<py3Dmol.view at 0x7fa7f43a6310>

In [23]:
import mbuild as mb
from mbuild.lib.atoms import H
from mbuild.recipes import recipes
from mbuild.lib.surfaces import AmorphousSilicaSurface

class CH2(mb.Compound):
    def __init__(self):
        super(CH2, self).__init__()
        carbon = mb.Particle(pos=[0.0, 0.0, 0.0], name='C')
        hydrogen0 = mb.Particle(pos=[0.1, 0.0, 0.0], name='H')
        hydrogen1 = mb.Particle(pos=[-0.1, 0.0, 0.0], name='H')
        self.add([carbon, hydrogen0, hydrogen1])
        self.add_bond((carbon, hydrogen0))
        self.add_bond((carbon, hydrogen1))
        self.add(mb.Port(anchor=self[0], orientation=[0, 1, 0],
                         separation=0.07), label='up')
        self.add(mb.Port(anchor=self[0], orientation=[0, -1, 0],
                         separation=0.07), label='down')

polymer = recipes.Polymer(monomers=[CH2()])
polymer.build(n=10)
polymer.remove(polymer[-1])
polymer.add(polymer.all_ports()[0], 'down', containment=False)
locations = mb.pattern.Grid2DPattern(n=10, m=10)
functionalized_surface = recipes.Monolayer(AmorphousSilicaSurface(),
                                           polymer, backfill=H(), pattern=locations,
                                           tile_x=2, tile_y=1,)

No fractions provided. Assuming a single chain type.
Adding 100 of chain <Polymer 31 particles, non-periodic, 30 bonds, id: 140359333962272>


In [24]:
from mbuild.lib.surfaces import AmorphousSilicaSurface
AmorphousSilicaSurface().save('surf.pdb', overwrite=True)

Use `load` within your classes! Always remember to add to `self`.

In [25]:
class CH2(mb.Compound):
    def __init__(self):
        super(CH2, self).__init__()
        ch2 = mb.load('utils/ch2.pdb')
        self.add(ch2)

ch2 = CH2()
ch2.visualize()

<py3Dmol.view at 0x7fa7d8ea90a0>

Same code as above, without creating a temporary compound to add. `load` can add directly to a compound as well!

In [26]:
class CH2(mb.Compound):
    def __init__(self):
        super(CH2, self).__init__()
        mb.load('utils/ch2.pdb', compound=self)

ch2 = CH2()
ch2.visualize()

<py3Dmol.view at 0x7fa7d8e8e4f0>

Use `load` to return the compound, but add it using `self.add` since `self` is a `Compound`.

In [27]:
class CH2(mb.Compound):
    def __init__(self):
        super(CH2, self).__init__()
        self.add(mb.load('utils/ch2.pdb'))

ch2 = CH2()
ch2.visualize()

<py3Dmol.view at 0x7fa7d8ca28e0>

### Adding `Ports` to the `Compound` classes

In order to connect `Compounds` we need to define locations where bonds can be formed (we can also add bonds manually through the `add_bond` method, which was covered above). mBuild handles this by allowing users to define `Ports` on particles, which essentially act as dangling bonds.

However, if one had to re-write the commands for loading a CH2 molecule and adding `Ports` each time they wanted to create a molecule that included a CH2 unit, the process would be quite cumbersome. Instead, we can create a reusable class that defines our CH2 `Compound`. This approach allows one to encapsulate the routines for creating a molecular moiety into an object that can be instantiated and in a manner that can easily be shared with others.

Below is a class definition for a CH2 moiety that uses the _same_ command we used above to load coordinates and bonds from a PDB structure file and features a few lines that add `Ports` to the carbon atom.

For additional information, see [mBuild Tutorial 02: Reusing Components](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_02_Reusing_Components.ipynb).

In [28]:
class CH2(mb.Compound):
    def __init__(self):
        super(CH2, self).__init__()
        mb.load('utils/ch2.pdb', compound=self)
        # Note: carbon is just a reference to the carbon particle in the compound self, NOT a copy
        carbon = list(self.particles_by_name('C'))[0]
        self.add(mb.Port(anchor=carbon, orientation=[0, 1, 0], separation=0.075), 'up')
        self.add(mb.Port(anchor=carbon, orientation=[0, -1, 0], separation=0.075), 'down')

We can also view a summary of the ports associated with a `Compound`:

In [29]:
ch2 = CH2()
ch2.all_ports()

[<Port, anchor: 'C', labels: ['up'], id: 140358875410192>,
 <Port, anchor: 'C', labels: ['down'], id: 140358875536112>]

If we instantiate this class and visualize we should see the same result we obtained earlier. We can pass the `show_ports=True` argument to `visualize` to see the `Ports` we've added to the carbon atom.

In [30]:
ch2 = CH2()
ch2.visualize(show_ports=True)

<py3Dmol.view at 0x7fa7d8ebadc0>

### Connecting two `CH2` compounds with `Ports`

In [31]:
first_ch2 = CH2()
second_ch2 = CH2()
second_ch2.translate([1, 1, 1])
parent = mb.Compound()
parent.add((first_ch2, second_ch2))
parent.visualize(show_ports=True)

<py3Dmol.view at 0x7fa7d8ee7a00>

To connect two ports, use the `force_overlap` function and specify which `Compound` to move, the `Port` to move from, and the `Port` to move to.

This _physically_ moves and orients compounds based on the location and orientation defined in the `Port` class.

In [32]:
mb.force_overlap(move_this=first_ch2,
                 from_positions=first_ch2['up'],
                 to_positions=second_ch2['down'])

parent.visualize(show_ports=True)

<py3Dmol.view at 0x7fa7d8ece310>
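
Part of what `force_overlap` does is a rigid translation that brings one port onto another. The sketch below shows only that translation step in plain Python. The function name and the port coordinates are hypothetical, and the sketch is a simplification that leaves out the reorientation mBuild also performs.

```python
def translate_onto(coords, from_position, to_position):
    """Shift every point so that from_position lands on to_position."""
    shift = [t - f for f, t in zip(from_position, to_position)]
    return [[c + s for c, s in zip(point, shift)] for point in coords]


# Coordinates of a fragment and an illustrative location for its 'up' port
fragment = [[0.0, 0.0, 0.0], [-0.11, 0.0, 0.0], [0.11, 0.0, 0.0]]
up_port = [0.0, 0.075, 0.0]

# Illustrative location of the 'down' port on a second fragment near (1, 1, 1)
down_port = [1.0, 0.925, 1.0]

moved = translate_onto(fragment, up_port, down_port)
print(moved[0])
```

Every atom receives the same shift, so bond lengths and angles inside the moved fragment are unchanged.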

Connecting compounds can be quite confusing if they are not wrapped in a `Compound` class.

In [33]:
class Ethane(mb.Compound):
    def __init__(self):
        super(Ethane, self).__init__()
        first_ch2 = CH2()
        second_ch2 = CH2()
        second_ch2.translate([1, 1, 1])

        self.add((first_ch2, second_ch2))
        
        mb.force_overlap(move_this=first_ch2,
                         from_positions=first_ch2['up'],
                         to_positions=second_ch2['down'])

Ethane().visualize(show_ports=True)

<py3Dmol.view at 0x7fa7d8ee7e50>

### More complex classes. Connecting CH2 moieties into an alkane

Now we'll create a more complex class that defines the instructions for connecting CH2 moieties into a linear alkane chain.

The code below instantiates CH2 moieties inside of a "for" loop, where the number of iterations is dependent on the desired length of the chain. The length of the chain can be toggled through the `chain_length` argument provided to the class constructor. We also import a hydrogen `Compound` from mBuild's `atoms` library to cap the ends of our chain.

This is shown pictorially below.
<img src="./utils/figure_connecting.png" alt="Drawing" style="width: 700px;"/>

**Note:** For this general overview, we do not intend for users (particularly those new to Python and object-oriented programming) to get too bogged down in the syntax. Instead, the emphasis should be that with mBuild we can encapsulate a series of routines (a "recipe") into a class, and that these routines can be defined in a manner that gives the class structural/chemical flexibility.

For additional examples, see tutorials: 
- [mBuild Tutorial 03: Connecting Components with Ports](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_03_Connecting_Components_with_Ports.ipynb) 
- [mBuild Tutorial 04: Constructing Larger Compounds](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_04_Constructing_Larger_Compounds.ipynb)
- [mBuild Tutorial 05: Creating Flexible Classes](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_05_Creating_Flexible_Classes.ipynb)

In [34]:
from mbuild.lib.atoms import H

class Alkane(mb.Compound):
    def __init__(self, chain_length=1):
        super(Alkane, self).__init__()
        hydrogen = H()
        last_monomer = CH2()
        # top capping CH2 -> CH3
        mb.force_overlap(move_this=hydrogen,
                         from_positions=hydrogen['up'],
                         to_positions=last_monomer['up'])
        # Add to our `self` compound
        self.add(hydrogen)
        self.add(last_monomer)
        # add the remaining chain_length - 1 CH2 units to the Alkane
        for _ in range(chain_length-1):
            current_monomer = CH2()
            mb.force_overlap(move_this=current_monomer,
                             from_positions=current_monomer['up'],
                             to_positions=last_monomer['down'])
            self.add(current_monomer)
            last_monomer = current_monomer
        # bottom cap
        hydrogen = H()
        mb.force_overlap(move_this=hydrogen,
                         from_positions=hydrogen['up'],
                         to_positions=last_monomer['down'])
        self.add(hydrogen)

Because we've defined our class to take `chain_length` as an argument, we can toggle the chemistry of our system (in this case the number of carbons in a linear alkane) by changing the value we provide for this argument upon instantiation.

For example, let's create a butane molecule.

In [35]:
butane = Alkane(chain_length=4)
butane.visualize()

<py3Dmol.view at 0x7fa7d8f1f790>
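
A quick way to sanity-check a generated chain is to compare its particle count with the composition expected for a linear alkane, C_nH_(2n+2). The helper below is a hypothetical sketch that is independent of mBuild.

```python
def alkane_composition(chain_length):
    """Return (n_carbon, n_hydrogen) for a linear alkane with chain_length carbons."""
    if chain_length < 1:
        raise ValueError('chain_length must be at least 1')
    # Each CH2 unit contributes 2 hydrogens; the two capping hydrogens add 1 each
    return chain_length, 2 * chain_length + 2


n_carbon, n_hydrogen = alkane_composition(4)
print('butane: C{}H{} ({} particles)'.format(n_carbon, n_hydrogen, n_carbon + n_hydrogen))
```

For the `Alkane` class above, `Alkane(chain_length=4)` would therefore be expected to contain 14 particles: four CH2 units plus two capping hydrogens.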

The geometry of this molecule is not entirely realistic: all backbone atoms form 180° angles, and all hydrogen atoms lie in the same plane. This can be addressed by placing Particles and Ports in more realistic locations, either manually or by using energy-minimized inputs.

Alternatively, a `Compound` can be constructed and then energy minimized, either through a simulation engine or using the `energy_minimize` method in mBuild, which uses the [Open Babel toolkit](http://openbabel.org/dev-api/). See [mBuild Tutorial 07: Energy Minimization](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_07_Energy_Minimization.ipynb) for more information about using and controlling this method.

In [36]:
butane.energy_minimize()
butane.visualize()

<py3Dmol.view at 0x7fa7d8f32cd0>

Now let's change the value of `chain_length` to create a decane.

In [37]:
decane = Alkane(chain_length=10)
decane.energy_minimize()
decane.visualize()

<py3Dmol.view at 0x7fa7d8f8f460>

## Altering `Compounds`. Creating an alcohol.


mBuild contains routines for the addition and removal of particles. Here, we'll explore this functionality by turning an alkane into the corresponding alcohol.
Note that we could do this either by changing the class itself or simply by removing a terminal hydrogen and adding a hydroxyl group in its place.

First, we'll define a class for a hydroxyl group featuring a single `Port` on the oxygen to represent the dangling bond.

We'll also go ahead and instantiate this class and visualize the resulting `Compound`.

### Exercise 3:
Add a port to the oxygen atom, labeled `down`

In [38]:
class OH(mb.Compound):
    def __init__(self):
        super(OH, self).__init__()
        self.add(mb.Particle(name='O', pos=[0.0, 0.0, 0.0]), label='O')
        self.add(mb.Particle(name='H', pos=[0.0, 0.1, 0.0]), label='H')
        self.add_bond((self['O'], self['H']))
        # add the port to the oxygen atom along the [0,-1, 0] direction
        self.add(mb.Port(anchor=self['O'], orientation=[0, -1, 0], separation=0.075), label='down')
        
hydroxyl = OH()
hydroxyl.visualize(show_ports=True)

<py3Dmol.view at 0x7fa7d8ed68b0>

### Exercise 4:
Adapt the previous `Alkane` class into an `Alcohol` class that accepts the capping group as an `up_cap` argument.

#### Part 2:
If nothing is passed as `up_cap`, make the default behavior to add a hydroxyl group.

A skeleton has been laid out below.

In [39]:
from mbuild.lib.atoms import H

class Alcohol(mb.Compound):
    def __init__(self, chain_length=1, up_cap=None):
        super(Alcohol, self).__init__()

        # check if nothing was passed as up_cap
        if up_cap is None:
            # input default behavior here
            up_cap = OH()
        # something was passed as up_cap,
        else:
            pass

        last_monomer = CH2()
        
        # Add to our `self` compound
        self.add(up_cap)
        self.add(last_monomer)
        
        # top capping CH2
        mb.force_overlap(move_this=up_cap,
                         from_positions=up_cap['down'],
                         to_positions=last_monomer['up'])

        # add the remaining chain_length - 1 CH2 units to the chain
        for _ in range(chain_length-1):
            current_monomer = CH2()
            mb.force_overlap(move_this=current_monomer,
                             from_positions=current_monomer['up'],
                             to_positions=last_monomer['down'])
            self.add(current_monomer)
            last_monomer = current_monomer
        # bottom cap
        hydrogen = H()
        mb.force_overlap(move_this=hydrogen,
                         from_positions=hydrogen['up'],
                         to_positions=last_monomer['down'])
        self.add(hydrogen)

In [40]:
butanol = Alcohol(chain_length=4, up_cap=None)
butanol.visualize()

<py3Dmol.view at 0x7fa7d8f16340>
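
Using `None` as the default for `up_cap` and creating the `OH()` inside `__init__` means that each `Alcohol` gets its own capping group. The plain-Python sketch below shows why this idiom is used; `Cap` and `Chain` are hypothetical stand-ins, not mBuild objects.

```python
class Cap:
    """Hypothetical stand-in for a capping group such as OH."""


class Chain:
    def __init__(self, cap=None):
        # A default written as `cap=Cap()` would be evaluated once, when the
        # function is defined, and then shared by every Chain created without
        # an explicit cap. Creating it here gives each instance a fresh object.
        if cap is None:
            cap = Cap()
        self.cap = cap


first = Chain()
second = Chain()
custom = Chain(cap=Cap())

print(first.cap is second.cap)
```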

### Using `mb.recipes.Polymer`

Build an alkane using `Polymer`.

Note the absence of terminal groups.


In [41]:
alkane_block = mb.recipes.Polymer(monomers=[CH2()])
alkane_block.build(n=3, add_hydrogens=False)
alkane_block.visualize(show_ports=True)

<py3Dmol.view at 0x7fa7d8fcb910>

### Create a semifluorinated alkane with 2 distinct blocks

In [42]:
class CF2(mb.Compound):
    def __init__(self):
        super(CF2, self).__init__()
        
        mb.load('./utils/cf2.pdb', compound=self)
        carbon = list(self.particles_by_name('C'))[0]
        up_port = mb.Port(anchor=carbon, orientation=[0, 0, 1], separation=0.075)
        down_port = mb.Port(anchor=carbon, orientation=[0, 0, -1], separation=0.075)
        self.add(up_port, label='up')
        self.add(down_port, label='down')

In [43]:
pfa_block = mb.recipes.Polymer(monomers=[CF2()])
pfa_block.build(3, add_hydrogens=False)
alkane_block = mb.recipes.Polymer(monomers=[CH2()])
alkane_block.build(3, add_hydrogens=False)
semifluorinated_hexane = mb.recipes.Polymer(monomers=[alkane_block, pfa_block])
semifluorinated_hexane.build(1, sequence='ABAB')
semifluorinated_hexane.visualize(show_ports=True)

<py3Dmol.view at 0x7fa7d918d8b0>
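
In the cell above, each letter of the `sequence` string stands for one entry of `monomers`. The toy function below sketches that mapping in plain Python. It is hypothetical and not mBuild's implementation, and it assumes that letters are matched to monomers in sorted order and that `n` repeats the whole sequence.

```python
def expand_sequence(monomers, sequence, n=1):
    """Return the ordered list of building blocks for n repeats of sequence."""
    # Assumption for this sketch: letters are matched to monomers in sorted order
    letters = sorted(set(sequence))
    if len(letters) != len(monomers):
        raise ValueError('need one monomer per unique letter in sequence')
    lookup = dict(zip(letters, monomers))
    return [lookup[letter] for letter in sequence * n]


blocks = expand_sequence(['(CH2)3', '(CF2)3'], 'ABAB', n=1)
print(blocks)
```

Under these assumptions, `'ABAB'` with `n=1` yields four alternating blocks.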

### Exercise 5:
Create an alcohol using the `Polymer` Class.

#### Part 2:
Create an `Alcohol` class where the `chain_length` can be an argument.

In [44]:
from mbuild.lib.atoms import H

class Alcohol(mb.Compound):
    def __init__(self, chain_length=1, up_cap=None):
        super(Alcohol, self).__init__()

        # default to a hydroxyl cap if nothing was passed as up_cap
        if up_cap is None:
            up_cap = OH()
        
        # Add to our `self` compound
        self.add(up_cap)
        
        # create the backbone using polymer
        internal_chain = mb.recipes.Polymer(monomers=[CH2()])
        internal_chain.build(n=chain_length, add_hydrogens=False)

        
        # attach the top cap (OH by default) to the 'up' port of the chain
        mb.force_overlap(move_this=up_cap,
                         from_positions=up_cap['down'],
                         to_positions=internal_chain['up'])

        self.add(internal_chain)

        # bottom cap
        hydrogen = H()
        mb.force_overlap(move_this=hydrogen,
                         from_positions=hydrogen['up'],
                         to_positions=internal_chain['down'])
        self.add(hydrogen)

In [45]:
octanol = Alcohol(chain_length=8)
octanol.visualize(show_ports=True)

<py3Dmol.view at 0x7fa7d9329a30>

## Copying `Compounds`

Plain assignment only creates a second reference to the same `Compound`. To create an independent copy, use the `mb.clone` function.

In [46]:
ch2_1 = CH2()

# This does NOT make a copy of ch2_1, but a reference to ch2_1
ch2_2 = ch2_1


print(id(ch2_1), id(ch2_2))

ch2_2.rotate(around=[0,0,1], theta=1.5)

# Both names refer to the same object, so both show the rotation
print(ch2_2.xyz)
print(ch2_1.xyz)

140358878593952 140358878593952
[[ 0.          0.          0.        ]
 [-0.00778109 -0.10972445  0.        ]
 [ 0.00778109  0.10972445  0.        ]]
[[ 0.          0.          0.        ]
 [-0.00778109 -0.10972445  0.        ]
 [ 0.00778109  0.10972445  0.        ]]


In [47]:
ch2_1 = CH2()

# This DOES make a unique copy of ch2_1
ch2_2 = mb.clone(ch2_1)


print(id(ch2_1), id(ch2_2))

ch2_2.rotate(around=[0,0,1], theta=1.5)

# Only the clone (ch2_2) was rotated; ch2_1 is unchanged
print(ch2_2.xyz)
print(ch2_1.xyz)

140358880278752 140358880278704
[[ 0.          0.          0.        ]
 [-0.00778109 -0.10972445  0.        ]
 [ 0.00778109  0.10972445  0.        ]]
[[ 0.    0.    0.  ]
 [-0.11  0.    0.  ]
 [ 0.11  0.    0.  ]]
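
The same distinction between a reference and a copy exists for ordinary Python objects. The sketch below uses a hypothetical `Point` class and the standard library's `copy.deepcopy` as a stand-in. It illustrates the aliasing behavior only and makes no claim about how `mb.clone` is implemented.

```python
import copy


class Point:
    """Hypothetical stand-in for an object with mutable coordinates."""

    def __init__(self, xyz):
        self.xyz = list(xyz)

    def translate(self, by):
        # Shift the stored coordinates by the given vector
        self.xyz = [a + b for a, b in zip(self.xyz, by)]


original = Point([0.0, 0.0, 0.0])
alias = original                      # second name for the same object
duplicate = copy.deepcopy(original)   # independent object

alias.translate([1.0, 0.0, 0.0])      # also changes `original`
duplicate.translate([0.0, 2.0, 0.0])  # leaves `original` untouched

print(alias is original, duplicate is original)
print(original.xyz, duplicate.xyz)
```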


## Setting up bulk systems


Typically, we want to simulate more than a single molecule. Fortunately, mBuild offers several routines to help create more complex systems.

For example, mBuild provides users with an interface to [PACKMOL](http://m3g.iqm.unicamp.br/packmol/home.shtml) to set up bulk systems through the `fill_box` function. Here we'll use `fill_box` to place ten octanol molecules into a 3 nm x 3 nm x 3 nm box. We can provide a seed for PACKMOL's random number generator to ensure the configuration is reproducible.

For additional information, see [mBuild Tutorial 06: Setting Up Bulk Systems](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_06_Setting_Up_Bulk_Systems.ipynb).


In [48]:
octanol = Alcohol(chain_length=8)

octanol_box = mb.fill_box(octanol, n_compounds=10, box=[3, 3, 3], seed=2)
octanol_box.visualize()

<py3Dmol.view at 0x7fa7d9a8aeb0>
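
Ten molecules in a 27 nm³ box is a very dilute system. As a rough sketch, the number of molecules needed for a liquid-like density can be estimated from $N = \rho V N_A / M$. The density (about 0.824 g/cm³ for 1-octanol near room temperature) and molar mass (130.23 g/mol) used below are approximate values assumed for illustration, and the helper names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)
NM3_PER_CM3 = 1e21        # 1 cm^3 = 1e21 nm^3


def box_volume_cm3(box_nm):
    """Volume of an orthorhombic box given in nm, converted to cm^3."""
    return box_nm[0] * box_nm[1] * box_nm[2] / NM3_PER_CM3


def molecules_for_density(density_g_cm3, box_nm, molar_mass_g_mol):
    """Estimate how many molecules fill the box at a given mass density."""
    mass_g = density_g_cm3 * box_volume_cm3(box_nm)
    return mass_g / molar_mass_g_mol * AVOGADRO


def density_of_box(n_molecules, box_nm, molar_mass_g_mol):
    """Mass density (g/cm^3) of n_molecules placed in the box."""
    mass_g = n_molecules * molar_mass_g_mol / AVOGADRO
    return mass_g / box_volume_cm3(box_nm)


OCTANOL_MOLAR_MASS = 130.23  # g/mol, approximate (assumed value)
OCTANOL_DENSITY = 0.824      # g/cm^3, approximate (assumed value)
box = [3, 3, 3]              # nm, as in the fill_box call above

n_liquid = molecules_for_density(OCTANOL_DENSITY, box, OCTANOL_MOLAR_MASS)
rho_ten = density_of_box(10, box, OCTANOL_MOLAR_MASS)
print(round(n_liquid))    # about 103 molecules for a liquid-like density
print(round(rho_ten, 3))  # about 0.08 g/cm^3 for ten molecules
```

`fill_box` may also accept a target density in place of a molecule count; check the mBuild documentation for the argument name and expected units in your installed version.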

## Surface functionalization

mBuild also provides routines for functionalizing surfaces. A few surfaces are available within mBuild's `surfaces` library; we hope to offer a more comprehensive `surfaces` plugin in the future.

Here, we'll load a surface of $\beta$-cristobalite silica.

**Note:** Harmless warning messages are currently generated by one of the packages mBuild depends on. To reduce clutter, the next cell filters them out; the `warnings.filterwarnings` call is not part of the surface-building workflow and can be safely ignored.

For additional information, see [mBuild Tutorial 09: Surface Functionalization](https://github.com/mosdef-hub/mbuild_tutorials/blob/master/mBuild_09_Surface_Functionalization.ipynb).

In [49]:
import warnings
warnings.filterwarnings(action="ignore", category=FutureWarning)

from mbuild.lib.surfaces import Betacristobalite
surface = Betacristobalite()
surface.visualize(show_ports=True)


<py3Dmol.view at 0x7fa7da65b190>

We can use the `TiledCompound` class to expand our surface if we want. Here, the surface is replicated twice along *x*.

In [50]:
tiled_surface = mb.recipes.TiledCompound(surface, n_tiles=(2, 1, 1))
tiled_surface.visualize(show_ports=True)


<py3Dmol.view at 0x7fa7dc2b9f40>
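
Conceptually, tiling replicates the contents of a periodic cell by shifting copies along the box vectors. The following pure-Python sketch shows this for an orthorhombic cell. The function name `tile_points`, the cell dimensions, and the sample coordinates are hypothetical, and this is not how `TiledCompound` is implemented; a real implementation also has to keep track of bonds and ports.

```python
from itertools import product


def tile_points(points, box, n_tiles):
    """Replicate (x, y, z) points n_tiles times along each box axis."""
    tiled = []
    for i, j, k in product(range(n_tiles[0]), range(n_tiles[1]), range(n_tiles[2])):
        # Translation applied to this copy of the cell
        shift = (i * box[0], j * box[1], k * box[2])
        for point in points:
            tiled.append(tuple(p + s for p, s in zip(point, shift)))
    return tiled


cell = [(0.0, 0.0, 0.0), (0.25, 0.25, 0.0)]  # hypothetical coordinates in nm
tiled = tile_points(cell, box=(0.5, 0.5, 1.0), n_tiles=(2, 1, 1))
print(len(tiled))  # twice as many points as the original cell
print(tiled)
```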

We will now remove the terminal hydrogen from our octanol molecule, leaving a dangling bond (an open `Port`) that we can use to attach copies to the surface.

In [51]:
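# Remove the terminal hydrogen from the chain
octanol.remove(octanol['H'])
# Label the first remaining port 'down'; containment=False adds only the label
octanol.add(octanol.all_ports()[0], 'down', containment=False)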
octanol.visualize(show_ports=True)

<py3Dmol.view at 0x7fa7db7144c0>

We can use mBuild's `Pattern` class to define the arrangement of molecules on the surface. Here, we will create a `Random2DPattern` of 10 points in *xy* space. The `apply_to_compound` method attaches copies of a molecule to a surface at the locations designated by the `Pattern`. We can also provide a backfill `Compound` (in this case hydrogen) to fill vacant sites.

In [52]:
pattern = mb.Random2DPattern(10)
hydrogen = H()
chains, backfills = pattern.apply_to_compound(guest=octanol, host=tiled_surface, backfill=hydrogen)
functionalized_surface = mb.Compound(subcompounds=[tiled_surface, chains, backfills])
functionalized_surface.visualize()

<py3Dmol.view at 0x7fa7df75bc70>
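
The idea behind a random two-dimensional pattern with backfill can be sketched with the standard library alone. The sketch assumes that each pattern point claims the nearest free attachment site and that every unclaimed site receives the backfill. All names (`random_2d_points`, `assign_sites`), the seed, and the grid of sites are hypothetical; this illustrates the concept and is not mBuild's `Random2DPattern` or `apply_to_compound` code.

```python
import random


def random_2d_points(n, lx, ly, seed=None):
    """Draw n points uniformly in an lx-by-ly rectangle at z = 0."""
    rng = random.Random(seed)
    return [(rng.random() * lx, rng.random() * ly, 0.0) for _ in range(n)]


def assign_sites(points, sites):
    """Pair each point with its nearest unused site (distance in xy only).

    Returns the indices of the claimed sites and of the leftover sites.
    """
    free = list(range(len(sites)))
    chosen = []
    for px, py, _ in points:
        nearest = min(
            free,
            key=lambda i: (sites[i][0] - px) ** 2 + (sites[i][1] - py) ** 2,
        )
        free.remove(nearest)
        chosen.append(nearest)
    return chosen, free


# Hypothetical 5 x 4 grid of attachment sites, spaced 1 nm apart
sites = [(x + 0.5, y + 0.5, 0.0) for x in range(5) for y in range(4)]
points = random_2d_points(10, lx=5.0, ly=4.0, seed=12345)
guest_sites, backfill_sites = assign_sites(points, sites)

print(len(guest_sites), len(backfill_sites))
```

Passing the same seed reproduces the same set of points, which keeps the resulting arrangement reproducible.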

This concludes the mBuild overview. For more in-depth tutorials on mBuild and Foyer, refer to the [mosdef_tutorials repository](https://github.com/mosdef-hub/mosdef_tutorials) or use our [Binder link](https://mybinder.org/v2/gh/mosdef-hub/mosdef_tutorials/master).