## Advanced mBuild routines

Now that we've explored the basics of how to create and connect mBuild `Compounds`, we'll look at some more advanced functionality that facilitates the construction of more relevant molecular systems.

### Importing mBuild

Again, we'll start by importing mBuild.

In [None]:
%matplotlib notebook
import mbuild as mb


### Creating polymers

In the previous tutorial we finished up by creating a class for constructing a linear alkane chain. One could imagine that the approach we took to create this class (i.e. successively adding CH2 units) could be generalized to support the creation of any linear polymer. In fact, mBuild contains a class that does just this, `mbuild.lib.recipes.Polymer`.

Here, we'll explore how `Polymer` works by creating a PEG (polyethylene glycol) molecule. We first need to define classes for our CH2 and oxygen monomer units, along with a hydrogen atom that will cap the chain ends.

In [None]:
class CH2(mb.Compound):
    def __init__(self):
        super(CH2, self).__init__()
        
        mb.load('ch2.pdb', compound=self)
        carbon = list(self.particles_by_name('C'))[0]
        up_port = mb.Port(anchor=carbon, orientation=[0, 0, 1], separation=0.075)
        down_port = mb.Port(anchor=carbon, orientation=[0, 0, -1], separation=0.075)
        self.add(up_port, label='up')
        self.add(down_port, label='down')

class O(mb.Compound):
    def __init__(self):
        super(O, self).__init__()
        
        self.add(mb.Compound(name='O'))
        up_port = mb.Port(anchor=self[0], orientation=[0, 0, 1], separation=0.075)
        self.add(up_port, 'up')
        down_port = mb.Port(anchor=self[0], orientation=[0, 0, -1], separation=0.075)
        self.add(down_port, 'down')
        
class H(mb.Compound):
    def __init__(self):
        super(H, self).__init__()
        self.add(mb.Compound(name="H"))
        up_port = mb.Port(anchor=self[0], orientation=[0, 0, 1], separation=0.07)
        self.add(up_port, 'up')


We'll now feed instances of these two monomers to the `monomers` argument of `Polymer`, and pass hydrogen atoms as `end_groups` to cap the chain. The polymer is then assembled by calling `build`, which takes a few additional arguments. One of these is the `sequence`, a string in which each unique character stands for one monomer type and each occurrence of that character adds one copy of the monomer. Here, `AAB` means that we want two `CH2`'s for each `O`. The `n` argument specifies the number of times the sequence should be replicated. The `port_labels` argument tells mBuild the names of the two `Ports` to connect when stitching together the polymer; the monomer classes above label theirs `up` and `down`.

In [None]:
peg4 = mb.lib.recipes.Polymer(monomers=(CH2(), O()), end_groups=(H(), H()))
peg4.build(sequence='AAB', n=4)
peg4.visualize(backend="nglview")
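
To make the roles of `sequence` and `n` concrete, here is a small pure-Python sketch that expands a sequence string into the order in which monomers are laid down. It does not call mBuild: `expand_sequence` and its `monomer_names` argument are hypothetical names used only for illustration, and it assumes that letters map to monomers in the order they were supplied (`A` to the first, `B` to the second).

```python
def expand_sequence(sequence, n, monomer_names):
    """Return the monomer order implied by repeating `sequence` n times.

    Illustrative helper only; this is not part of mBuild.
    """
    # Assumption: letters are assigned to monomers alphabetically, so 'A' is
    # the first monomer supplied, 'B' the second, and so on.
    letters = sorted(set(sequence))
    if len(letters) != len(monomer_names):
        raise ValueError("need exactly one monomer per unique letter")
    lookup = dict(zip(letters, monomer_names))
    # Repeating the string n times mirrors the role of the `n` argument.
    return [lookup[letter] for letter in sequence * n]


chain = expand_sequence("AAB", 4, ["CH2", "O"])
print(len(chain), chain[:6])
```

With `sequence='AAB'` and `n=4`, the backbone contains twelve units: eight `CH2` groups and four oxygens, before any end groups are attached.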

#### Exercise
Change the copolymer repeat unit by changing the `sequence` argument. For example, try `'AAABBBB'`.

In [None]:
polymer = mb.lib.recipes.Polymer(monomers=(CH2(), O()), end_groups=(H(), H()))
polymer.build(sequence='AAABBBB', n=3)
polymer.visualize(backend="nglview")

### Energy minimization

By this point you have likely noticed that the geometries of some of the molecules we've created may not look entirely realistic (e.g. all backbone atoms featuring 180 degree angles in our PEG molecule). You can solve this issue by placing `Particles` and `Ports` in more realistic locations, either manually or by using energy minimized inputs. 

Alternatively, you can construct a `Compound` and then energy minimize, either through a simulation engine or using the `energy_minimize` function in mBuild (which uses the [Open Babel](http://openbabel.org/dev-api/) toolkit) to yield more realistic geometries for your prototypes.

**Note:** In many cases it is easier to build a system in an unrealistic configuration first and energy minimize it afterwards than to place every `Particle` realistically by hand.

In [None]:
from mbuild.lib.recipes import Alkane
hexane = Alkane(6)
hexane.visualize(backend="nglview")

In [None]:
hexane.energy_minimize()

In [None]:
hexane.visualize(backend="nglview")

We can use Python's built-in `help` function to view the docstring of any function or object. Here we'll use this to view the docstring for the `energy_minimize` method.

In [None]:
help(mb.Compound.energy_minimize)

### Packing boxes

A common routine used for setting up systems is the packing of boxes with some molecule prototype. mBuild features several routines designed around the [PackMol](http://www.ime.unicamp.br/~martinez/packmol/home.shtml) utility to support this functionality. Here we'll use the `fill_box` routine to create a box filled with hexane molecules.

To use the `fill_box` routine, we first need to define the dimensions of the box itself. mBuild features a basic `Box` class for defining orthogonal simulation boxes. Here we'll define a box with dimensions of 3 nm x 3 nm x 3 nm.

In [None]:
box = mb.Box(lengths=[3, 3, 3])
box

We'll now use the `fill_box` routine to place five hexane molecules into our box.

In [None]:
box = mb.fill_box(hexane, n_compounds=5, box=box)
box.visualize()
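
When choosing `n_compounds`, it can help to check what mass density a given count corresponds to. The sketch below is plain arithmetic and does not call mBuild. The helper names are hypothetical, and the liquid hexane density of about 0.655 g/cm^3 near room temperature is an approximate reference value supplied here as an assumption.

```python
AVOGADRO = 6.02214076e23     # molecules per mole
HEXANE_MOLAR_MASS = 86.178   # g/mol for C6H14 (6 * 12.011 + 14 * 1.008)
NM3_TO_CM3 = 1e-21           # one cubic nanometer expressed in cm^3


def box_volume_cm3(box_lengths_nm):
    """Volume of an orthogonal box, given its three edge lengths in nm."""
    lx, ly, lz = box_lengths_nm
    return lx * ly * lz * NM3_TO_CM3


def mass_density(n_molecules, molar_mass, box_lengths_nm):
    """Mass density in g/cm^3 for n_molecules placed in the box."""
    mass_g = n_molecules * molar_mass / AVOGADRO
    return mass_g / box_volume_cm3(box_lengths_nm)


def molecules_for_density(target_density, molar_mass, box_lengths_nm):
    """Number of molecules (rounded) needed to reach target_density in g/cm^3."""
    mass_g = target_density * box_volume_cm3(box_lengths_nm)
    return round(mass_g * AVOGADRO / molar_mass)


dilute = mass_density(5, HEXANE_MOLAR_MASS, (3, 3, 3))
# Assumed reference value: liquid hexane is roughly 0.655 g/cm^3.
n_liquid = molecules_for_density(0.655, HEXANE_MOLAR_MASS, (3, 3, 3))
print(f"{dilute:.4f} g/cm^3 with 5 molecules; ~{n_liquid} needed for a liquid")
```

Five molecules in a 27 nm^3 box is far more dilute than liquid hexane, which is convenient for a quick visualization but not representative of a condensed phase.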

### Patterning

It can often be useful to specify the exact locations where molecules should be placed in a system. mBuild's `Pattern` class allows one to generate a set of points in Cartesian space in commonly desired arrangements, such as random and grid-like patterns in both 2D and 3D, as well as uniform points on the surface of a sphere.

We'll explore the `Pattern` class here by placing hydrogen atoms in several different arrangements.

In [None]:
my_compound = mb.Compound()
grid3d = mb.Grid3DPattern(4, 4, 3)
grid3d.scale(0.5)
for position in grid3d:
    particle = mb.Compound(name='H', pos=position)
    my_compound.add(particle)
my_compound.visualize(backend="nglview")
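
To see what a grid pattern amounts to numerically, the following pure-Python sketch generates an evenly spaced set of points in the unit cube and then scales them. It is an illustration rather than mBuild's implementation: `grid_3d` and `scale` are hypothetical helpers, and the placement of point (i, j, k) at (i/n, j/m, k/l) is an assumption about the layout.

```python
from itertools import product


def grid_3d(n, m, l):
    """Return n * m * l evenly spaced points inside the unit cube."""
    # Assumed layout: point (i, j, k) sits at (i/n, j/m, k/l).
    return [(i / n, j / m, k / l)
            for i, j, k in product(range(n), range(m), range(l))]


def scale(points, factor):
    """Multiply every coordinate by `factor` (uniform scaling)."""
    return [tuple(factor * c for c in p) for p in points]


points = scale(grid_3d(4, 4, 3), 0.5)
print(len(points), points[:3])
```

A 4 x 4 x 3 grid yields 48 points, and scaling by 0.5 shrinks the unit cube so that every coordinate falls between 0 and 0.5 nm.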

In [None]:
my_compound = mb.Compound()
sphere_pattern = mb.SpherePattern(50)
sphere_pattern.scale(0.2)
for position in sphere_pattern:
    particle = mb.Compound(name='H', pos=position)
    my_compound.add(particle)
my_compound.visualize(backend="nglview")
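
One common way to spread points roughly evenly over a sphere is a golden-angle spiral. The sketch below shows that technique in plain Python. It is offered as an illustration of the idea, not as a description of how `SpherePattern` is implemented, and `sphere_points` is a hypothetical helper name.

```python
import math


def sphere_points(n, radius=1.0):
    """Spread n points roughly evenly over a sphere with a golden-angle spiral."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # z steps evenly from just below +1 to just above -1.
        z = 1.0 - (2.0 * i + 1.0) / n
        ring_radius = math.sqrt(1.0 - z * z)
        # Each successive point is rotated by the golden angle about z.
        theta = golden_angle * i
        points.append((radius * ring_radius * math.cos(theta),
                       radius * ring_radius * math.sin(theta),
                       radius * z))
    return points


points = sphere_points(50, radius=0.2)
print(len(points))
```

Every point lies at the requested distance from the origin, so passing `radius=0.2` plays the same role as calling `scale(0.2)` on a unit-sphere pattern.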

We can also use the `Pattern.apply` method to automatically place copies of a `Compound` at locations specified by a `Pattern`.

In [None]:
sphere_pattern = mb.SpherePattern(50)
sphere_pattern.scale(0.2)
# Create the prototype at the origin; `apply` places each copy at a pattern point.
particle = mb.Compound(name='H')
particles = sphere_pattern.apply(particle)
my_compound = mb.Compound(subcompounds=particles)
my_compound.visualize(backend="nglview")

### Surface functionalization

mBuild also features several functions to aid in the functionalization of surfaces. For example, the `Pattern.apply_to_compound` method allows one to connect copies of a 'guest' `Compound` to `Ports` located on a 'host' `Compound`. We'll explore how this can be useful for surface functionalization by considering a crystalline silica surface (featuring many `Ports`) as our host and a polymer chain as our guest.

First we'll import our crystalline silica surface from mBuild's `surfaces` library.

In [None]:
from mbuild.lib.surfaces import Betacristobalite
surface = Betacristobalite()
surface.visualize(backend="nglview")

Now, we'll create prototypes for two polymer chains of different lengths, specify a random pattern of 30 points in 2D space, and use `apply_to_compound` to attach copies of the longer polymer to the surface, backfilling unused `Ports` with the shorter polymer. In the mBuild nomenclature, `guests` are the `Compound` copies that have been added to the surface and `backfills` are an optional second `Compound` type that can be used to fill any leftover `Ports` in the host `Compound` after all points in the `Pattern` have been satisfied.

In [None]:
surface = Betacristobalite()
peg4 = mb.lib.recipes.Polymer(monomers=(CH2(), O()))
peg4.build(n=4, sequence='AAB', add_hydrogens=False)
peg1 = mb.lib.recipes.Polymer(monomers=(CH2(), O()))
peg1.build(n=1, sequence='AAB', add_hydrogens=False)
pattern = mb.Random2DPattern(30, seed=1)
guests, backfills = pattern.apply_to_compound(guest=peg4, host=surface, backfill=peg1, backfill_port_name='down')
functionalized_surface = mb.Compound(subcompounds=[surface, guests, backfills])
functionalized_surface.visualize(backend="nglview")

As we've seen, the `Pattern.apply_to_compound` method is a useful way to approach surface functionalization with mBuild. However, this can be done even more easily with `mbuild.lib.recipes.Monolayer`, which wraps the above steps into a single class. Multi-component monolayers can be generated by simply passing a list of `Compounds` to the `chains` argument along with the `fractions` of each component.

In [None]:
from mbuild.lib.surfaces import Betacristobalite
from mbuild.lib.recipes import Polymer, Monolayer

surface = Betacristobalite()
peg4 = Polymer(monomers=(CH2(), O()))
peg4.build(n=4, sequence='AAB', add_hydrogens=False)
peg2 = Polymer(monomers=(CH2(), O()))
peg2.build(n=1, sequence='AAB', add_hydrogens=False)
c18 = Polymer(monomers=[CH2()])
c18.build(n=18, sequence='A', add_hydrogens=False)
monolayer = Monolayer(surface=surface, chains=(peg4, peg2, c18), fractions=(0.5, 0.4, 0.1))
monolayer.visualize(backend="nglview")
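
To illustrate how a set of attachment sites might be divided among several chain types according to `fractions`, here is a pure-Python sketch. It is one plausible scheme written for illustration, not necessarily how `Monolayer` does it internally. `split_by_fractions` is a hypothetical helper, and letting the last component absorb the remainder is an assumption.

```python
import random


def split_by_fractions(points, fractions, seed=None):
    """Partition `points` into one randomly chosen group per fraction."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    rng = random.Random(seed)
    remaining = list(points)
    rng.shuffle(remaining)  # random assignment of sites to components
    total = len(remaining)
    groups = []
    for fraction in fractions[:-1]:
        count = int(round(total * fraction))
        groups.append(remaining[:count])
        remaining = remaining[count:]
    # Assumption: the last component takes whatever sites are left over.
    groups.append(remaining)
    return groups


sites = list(range(100))
groups = split_by_fractions(sites, (0.5, 0.4, 0.1), seed=1)
print([len(g) for g in groups])
```

Giving the remainder to the last component guarantees that every site is used exactly once, even when the fractions do not divide the number of sites evenly.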

These are only a subset of the routines available in mBuild to construct molecular systems, and more routines continue to be added. As a reminder, additional information on mBuild can be found both at our [website](http://mbuild.mosdef.org/) and our [GitHub page](https://github.com/mosdef-hub/mbuild). We encourage you to submit "issues" on our GitHub if there is any additional functionality you would like to see implemented to support creation of systems relevant to your work or, if you are feeling more emboldened, to submit a pull request with routines you have written for mBuild.

### One last demo: Saving molecular topologies

The last demonstration we will examine is how data files can be written from mBuild `Compounds` so that they can be used to run molecular simulations. mBuild utilizes the ParmEd package to support saving mBuild `Compounds` to a variety of common data formats (e.g. PDB, MOL2, GRO) and also features (more limited) writers of its own to save to LAMMPS and HOOMD (both XML and GSD) data formats.

In [None]:
from mbuild.lib.recipes import Alkane
hexane = Alkane(6)
hexane_box = mb.fill_box(hexane, n_compounds=5, box=mb.Box(lengths=[3, 3, 3]))
hexane_box.save('hexanes.gro')
! head hexanes.gro

The next tool in the MoSDeF toolkit we will cover is the Foyer package, which is used to automatically perform atomtyping and forcefield application. This is necessary to actually be able to run simulations using `Compounds` built using mBuild.

Our final code block will show how Foyer can be utilized from within mBuild's `save` function to automatically apply a user-specified forcefield (in this case the OPLS all-atom forcefield).

In [None]:
hexane_box.save('hexanes.top', forcefield_name='oplsaa')

In [None]:
! cat hexanes.top