# Polymer Partitioning in a Two-Fluid System
---
This exercise demonstrates a simple workflow that uses the full set of MoSDeF tools to write out simulation files for use in GROMACS. Its main purpose is to show how to work with GMSO topologies instead of a ParmEd/OpenMM backend. Routing the workflow through a GMSO topology gives users access to the additional functionality and testing that GMSO supports. The overall process stays much the same, but small syntax differences unlock features such as **forcefield matching** to individual molecules in the system and writing **custom potential forms**.

#### Note: this workflow uses features that are still under development in GMSO.
As such, development versions from specific branches must be installed to access this functionality.
1. Install mBuild
```bash
git clone https://github.com/mosdef-hub/mbuild.git
cd mbuild
pip install -e ./
cd ..
```
2. Install GMSO
```bash
git clone https://github.com/daico007/gmso.git
cd gmso
git fetch origin flatten_mbuild_convert
git checkout flatten_mbuild_convert
pip install ./
cd ..
```
3. Install forcefield-utilities
```bash
conda install forcefield-utilities -c conda-forge
```
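
Before moving on, it can help to confirm that the three packages are visible to the Python environment running this notebook. The snippet below is a minimal sketch, not part of the original workflow: `check_installed` is a hypothetical helper, and it only checks that each module can be located, not that the development branch is the one installed.

```python
from importlib.util import find_spec

# Top-level module names that the later cells import.
REQUIRED = ("mbuild", "gmso", "forcefield_utilities")

def check_installed(modules=REQUIRED):
    """Return {module_name: True/False} without importing the packages."""
    return {name: find_spec(name) is not None for name in modules}

status = check_installed()
for name, found in status.items():
    print(f"{name:22s} {'found' if found else 'MISSING'}")
```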
___


### Exercise Stages:
1. Import libraries
2. Custom mBuild recipes
3. Build partitioning box
4. Parameterize with multiple forcefields
5. Write out and run GROMACS simulations
6. Evaluate Results
---

# 1. Import Libraries
---

In [1]:
# Import Libraries
import numpy as np
import mbuild as mb
import forcefield_utilities as ffutils
import gmso



# 2. Custom mBuild Recipes
---

In [2]:
from scipy.constants import N_A
def Packing_Number(compound, vol):
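    """
    Identify the number of compounds to place into a box.

    NOTES
    -----
    The compound must have the attribute `compound.dens` (density, taken
    here to be in g/cm^3), which is used to calculate the number of
    compounds that fit into the volume `vol` (in nm^3).
    """
    # g/cm^3 * nm^3 * (1e-21 cm^3/nm^3) / (g/mol) * (1/mol) -> number of molecules
    n_compounds = compound.dens * vol / compound.mass * N_A * 1e-21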
    return int(n_compounds)
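
The unit bookkeeping in `Packing_Number` can be checked by hand without mBuild. The sketch below repeats the same arithmetic with plain numbers. It assumes densities in g/cm^3, volumes in nm^3, and molar masses in g/mol; the molar masses (18.015 for water, 70.135 for cyclopentane) are standard values supplied here for illustration rather than read from the compounds, and `packing_number` is a hypothetical stand-in for the function above.

```python
N_A = 6.02214076e23   # Avogadro constant, 1/mol
CM3_PER_NM3 = 1e-21   # 1 nm^3 expressed in cm^3

def packing_number(density_g_cm3, molar_mass_g_mol, volume_nm3):
    """Number of molecules that fill `volume_nm3` at the given density."""
    mass_g = density_g_cm3 * volume_nm3 * CM3_PER_NM3  # total mass in the volume
    moles = mass_g / molar_mass_g_mol
    return int(moles * N_A)                            # truncate, as in the recipe

# Half of a cubic box with 5 nm sides, as used later in this exercise.
half_volume = 5.0**3 / 2
n_water = packing_number(0.99, 18.015, half_volume)
n_cyclopentane = packing_number(0.63, 70.135, half_volume)
print(n_water, n_cyclopentane)
```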

In [3]:
import operator
import functools
def Partitioned_Box(solute, solvent1, solvent2, boxl, frac_interface=0):
    """
    Solubilize the `solute` into `solvent1` and fill other half of the 
    box, made as a cubic box with sidelengths `boxl` with `solvent2`.
    """
    # Pack the mBuild box with the solute and the two solvents
    full_box = mb.Box([boxl, boxl, boxl])
    half_box = mb.Box([boxl/2, boxl, boxl])
    vol=functools.reduce(operator.mul, full_box.lengths, 1)/2
    solute.translate(np.array([boxl+boxl*frac_interface, boxl, boxl])/2)

    filled_box1 = mb.packing.solvate(
        solvent=solvent1, 
        solute=solute, 
        box=half_box,  
        n_solvent=Packing_Number(solvent1, vol),
        edge=0.01
    )
    filled_box1.name = "sol1"
    filled_box2 = mb.packing.fill_box(
        compound=solvent2,
        box=half_box,  
        n_compounds=Packing_Number(solvent2, vol),
        edge=0.01
    )
    filled_box2.translate([boxl/2,0,0])
    filled_box2.name = "sol2"
    partitioned_box = mb.Compound()
    partitioned_box.add(filled_box1)
    partitioned_box.add(filled_box2)
    partitioned_box.box = full_box
    return partitioned_box
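
The `frac_interface` argument only enters through the translation applied to the solute. The sketch below isolates that expression to show where the solute is moved for a few values. `solute_translation` is a hypothetical helper that mirrors the line in `Partitioned_Box`; how `mb.packing.solvate` then positions the solute inside the half box is not covered here.

```python
import numpy as np

def solute_translation(boxl, frac_interface=0.0):
    """Translation vector applied to the solute, copied from Partitioned_Box."""
    return np.array([boxl + boxl * frac_interface, boxl, boxl]) / 2

boxl = 5.0
at_interface = solute_translation(boxl)            # x = boxl / 2
in_first_half = solute_translation(boxl, -0.5)     # x = boxl / 4
in_second_half = solute_translation(boxl, 0.5)     # x = 3 * boxl / 4
print(at_interface, in_first_half, in_second_half)
```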

# 3. Build Box of Molecules
---

In [4]:
# Build Polymer
#monomer = mb.load("CCC(=O)O", smiles=True) # Acrylic Acid monomer
monomer = mb.load("CCCO", smiles=True)
monomer.name = "monomer"
polymer= mb.lib.recipes.Polymer()
# polymer.add_monomer(monomer, indices=(6,9)) PAA only
polymer.add_monomer(monomer, indices=(5,10))
polymer.build(n=2)
polymer.name = "polymer"
#polymer.energy_minimize()  # visualize with and without energy minimization
polymer.visualize()


In [5]:
# Build Solvent 1
water = mb.load("O", smiles=True)
water.name = 'water'
water.dens = 0.99

# Build Solvent 2
cyclopentane = mb.load("C1CCCC1", smiles=True)
cyclopentane.name = "cyclopentane"
cyclopentane.dens = 0.63

# Build Partitioned Box
boxl = 5 # Use a cubic boxlength of 5 nm
solute_position = 0 # 0 places the polymer at the center of the box
partitioned_box = Partitioned_Box(polymer, water, cyclopentane, boxl, solute_position)
#partitioned_box = partitioned_box.group_by_molecules()
partitioned_box.visualize()


# 4. Parameterize with Multiple Forcefields
---

## Load forcefields

In [6]:
import forcefield_utilities as ffutils
ffloader = ffutils.FoyerFFs() # ffloader is a reusable object for loading forcefields.
# To use these with GMSO, convert each Foyer forcefield to a GMSO forcefield.
polymer_ff = ffloader.load("gmso_files/alcohols.xml").to_gmso_ff()
pentane_ff = ffloader.load("gmso_files/alkanes.xml").to_gmso_ff()
water_ff = ffloader.load("gmso_files/tip3p.xml").to_gmso_ff() # Three different Foyer XMLs stored locally



## Convert Topology to GMSO

## Apply forcefields using isomorphs

In [7]:
from gmso.external import from_mbuild
import time
start = time.time()
topology_gmso = from_mbuild(partitioned_box) # Create GMSO topology
print("Time to convert mbuild structure: ", time.time()-start)
start = time.time()
topology_gmso.identify_connections() # Identify angles and dihedrals (this may be slow for large systems)
print("Time to id connections: ", time.time()-start)
from gmso.parameterization import apply
import warnings
warnings.simplefilter("ignore", UserWarning)
ff_dicts = {
    "water": water_ff,
    "cyclopentane": pentane_ff,
    "polymer": polymer_ff
} # The keys are the names of the molecules that were put into the box; they can be found
# by looking at topology_gmso.subtops
start = time.time()
apply(topology_gmso, ff_dicts, identify_connected_components=True,
                  use_molecule_info=False) # apply forcefield to relevant subtops
print("Time to apply forcefields: ", time.time()-start)
print(len(topology_gmso.atom_types))
assert topology_gmso.is_fully_typed

Time to convert mbuild structure:  17.996187925338745
Time to id connections:  144.241393327713
Time to apply forcefields:  8.157645225524902
11296
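
Because the forcefields are matched by molecule name, a mismatch between the names in the box and the keys of `ff_dicts` is an easy mistake to make. The sketch below is a simple pre-check in plain Python. `unmatched_molecules` is a hypothetical helper, and the strings stand in for the real GMSO forcefield objects; how `apply` itself reacts to a missing key is not shown in this notebook.

```python
def unmatched_molecules(molecule_names, ff_dicts):
    """Return the molecule names that have no forcefield entry, sorted."""
    return sorted(set(molecule_names) - set(ff_dicts))

# Stand-in values: strings replace the GMSO forcefield objects loaded above.
ff_names = {"water": "tip3p", "cyclopentane": "alkanes", "polymer": "alcohols"}

names_in_box = ["polymer", "water", "water", "cyclopentane"]
print(unmatched_molecules(names_in_box, ff_names))               # nothing missing
print(unmatched_molecules(names_in_box + ["hexane"], ff_names))  # hexane has no entry
```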


In [8]:
from gmso.core.views import PotentialFilters
print(len(topology_gmso.atom_types(PotentialFilters.REPEAT_DUPLICATES)))

11296


# 5. Write out GROMACS Simulation
---

%mkdir gmso_files/gmso_sim
topology_gmso.save("gmso_files/gmso_sim/init.top")
topology_gmso.save("gmso_files/gmso_sim/init.gro")

In [9]:
%mkdir gmso_files/gmso_sim
from gmso.external.convert_parmed import to_parmed
system = to_parmed(topology_gmso)
system.save("gmso_files/gmso_sim/init.top", overwrite=True)
system.save("gmso_files/gmso_sim/init.gro", overwrite=True)

mkdir: gmso_files/gmso_sim: File exists


In [10]:
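def get_number_of_molecules(top):
    """Count the molecules of each type (keyed by molecule name) in a GMSO topology."""
    # Track unique (name, number) labels so every molecule is counted exactly once,
    # even when the molecule numbering restarts for each molecule name.
    seen = set()
    moleculesDict = {}
    for site in top.sites:
        label = (site.molecule.name, site.molecule.number)
        if label not in seen:
            seen.add(label)
            moleculesDict[label[0]] = moleculesDict.get(label[0], 0) + 1
    return moleculesDict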
total_molecules = sum(get_number_of_molecules(topology_gmso).values())
total_molecules

2406
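
The counting logic can be exercised without building a topology. The sketch below uses small stand-in objects and assumes only that each site carries a `molecule` label with `name` and `number` fields, as the function above does. `Molecule`, `Site`, and `count_molecules` are hypothetical names for illustration, not GMSO classes.

```python
from collections import Counter, namedtuple

# Hypothetical stand-ins for the site labels used in the cell above.
Molecule = namedtuple("Molecule", ["name", "number"])
Site = namedtuple("Site", ["molecule"])

def count_molecules(sites):
    """Count molecules per name from the unique (name, number) labels."""
    unique = {(site.molecule.name, site.molecule.number) for site in sites}
    return Counter(name for name, _ in unique)

# One 4-site polymer, three 3-site waters, two 15-site cyclopentanes.
sites = (
    [Site(Molecule("polymer", 0)) for _ in range(4)]
    + [Site(Molecule("water", i)) for i in range(3) for _ in range(3)]
    + [Site(Molecule("cyclopentane", i)) for i in range(2) for _ in range(15)]
)
counts = count_molecules(sites)
print(dict(counts), sum(counts.values()))
```

Collecting the labels into a set makes the result independent of the order in which sites are visited.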

In [11]:
em_mdp = """
integrator          = steep
nsteps              = 500000
emstep              = 0.002
emtol               = 10
dt                  = 0.002

nstxout             = 10000
nstvout             = 10000
nstenergy           = 1000
nstlog              = 1000

cutoff-scheme       = Verlet
ns_type             = grid
nstlist             = 10

vdwtype         = Cut-off
vdw-modifier    = None
rvdw            = 1.4

coulombtype             = Cut-off
coulomb-modifier        = None
rcoulomb                = 1.4

gen_vel             = yes
gen-temp            = 372.0
gen-seed            = 4

tcoupl              = no

pcoupl              = no

pbc                 = xyz

DispCorr            = EnerPres

constraint-algorithm = LINCS
constraints         = all-bonds
"""

nvt_mdp = """
integrator          = md
nsteps              = 1000000
dt                  = 0.001

comm-mode           = Linear

nstxout             = 10000
nstvout             = 10000
nstenergy           = 1000
nstlog              = 1000

cutoff-scheme       = Verlet
ns_type             = grid
nstlist             = 10
pbc                 = xyz

vdwtype         = Cut-off
vdw-modifier    = None
rvdw            = 1.4

coulombtype             = Cut-off
coulomb-modifier        = None
rcoulomb                = 1.4

tcoupl              = nose-hoover
tc-grps             = System
tau_t               = 1
ref_t               = 372.0

pcoupl              = no

DispCorr            = EnerPres

constraint-algorithm = LINCS
constraints         = all-bonds
"""

npt_mdp = """
integrator          = md
nsteps              = 1000000
dt                  = 0.001

comm-mode           = Linear

nstxout             = 1000
nstvout             = 1000
nstenergy           = 1000
nstlog              = 1000

cutoff-scheme       = Verlet
ns_type             = grid
nstlist             = 10
pbc                 = xyz 

vdwtype         = Cut-off
vdw-modifier    = None
rvdw            = 1.4 

coulombtype             = Cut-off
coulomb-modifier        = None
rcoulomb                = 1.4 

gen_vel             = no

tcoupl              = nose-hoover
tc-grps             = System
tau_t               = 1 
ref_t               = 372.0

pcoupl                   = parrinello-rahman
pcoupltype               = isotropic
nstpcouple               = -1
tau-p                    = 10.0
compressibility          = 4.5e-5
ref-p                    = 14.02

DispCorr            = EnerPres

constraint-algorithm = LINCS
constraints         = all-bonds
"""

with open("gmso_files/gmso_sim/em.mdp", "w") as f:
    f.write(em_mdp)

with open("gmso_files/gmso_sim/nvt.mdp", "w") as f:
    f.write(nvt_mdp)

with open("gmso_files/gmso_sim/npt.mdp", "w") as f:
    f.write(npt_mdp)
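
The three `.mdp` strings above share most of their options, so one alternative is to keep the settings in dictionaries and render them to text. The sketch below shows that idea for a subset of the NVT options taken from the cell above. `render_mdp` is a hypothetical helper, not part of MoSDeF or GROMACS.

```python
def render_mdp(options):
    """Render a dict of mdp options as aligned 'key = value' lines."""
    width = max(len(key) for key in options)
    lines = [f"{key:<{width}} = {value}" for key, value in options.items()]
    return "\n".join(lines) + "\n"

# Subset of the NVT settings used above.
nvt_core = {
    "integrator": "md",
    "nsteps": 1000000,
    "dt": 0.001,
    "tcoupl": "nose-hoover",
    "tc-grps": "System",
    "tau_t": 1,
    "ref_t": 372.0,
}
mdp_text = render_mdp(nvt_core)
print(mdp_text)
```

With this layout, the NPT settings could be built as `{**nvt_core, "pcoupl": "parrinello-rahman"}` plus the remaining pressure-coupling options, which keeps the shared values in one place.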


In [None]:
%cd gmso_files/gmso_sim
!gmx grompp -f em.mdp -o em.tpr -c init.gro -p init.top --maxwarn 1
!gmx mdrun -v -deffnm em -s em.tpr -cpi em.cpt

!gmx grompp -f nvt.mdp -o nvt.tpr -c em.gro -p init.top --maxwarn 1
!gmx mdrun -v -deffnm nvt -s nvt.tpr -cpi nvt.cpt

!gmx grompp -f npt.mdp -o npt.tpr -c nvt.gro -p init.top --maxwarn 1
!gmx mdrun -v -deffnm npt -s npt.tpr -cpi npt.cpt
%cd ../..

                 :-) GROMACS - gmx grompp, 2021.3-bioconda (-:

Wrote pdb files with previous and current coordinates
Step=   26, Dmax= 3.3e-02 nm, Epot=  1.67940e+04 Fmax= 1.53619e+03, atom= 1310
Step=   27, Dmax= 4.0e-02 nm, Epot=  1.50286e+04 Fmax= 3.23862e+03, atom= 1310
Step=   28, Dmax= 4.8e-02 nm, Epot=  6.13089e+03 Fmax= 2.33348e+03, atom= 1310

step 29: One or more water molecules can not be settled.
Check for bad contacts and/or reduce the timestep if appropriate.
Wrote pdb files with previous and current coordinates
Step=   44, Dmax= 1.7e-06 nm, Epot=  6.13856e+03 Fmax= 2.33341e+03, atom= 1310
Energy minimization has stopped, but the forces have not converged to the
requested precision Fmax < 10 (which may not be possible for your system). It
stopped because the algorithm tried to make a new step whose size was too
small, or there was no change in the energy since last step. Either way, we
regard the minimization as converged to within the available machine
precision, given your starting configuration and EM parameters.

Double precision

step 999900, remaining wall clock time:     0 s
Writing final coordinates.
step 1000000, remaining wall clock time:     0 s          

               Core t (s)   Wall t (s)        (%)
       Time:    62063.629     7757.955      800.0
                         2h09:17
                 (ns/day)    (hour/ns)
Performance:       11.137        2.155

GROMACS reminds you: "Right Now My Job is Eating These Doughnuts" (Bodycount)
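
The three `grompp`/`mdrun` pairs in the cell above follow one pattern: each stage starts from the final coordinates of the previous one. The sketch below builds the same command lines in Python so they could be passed to `subprocess.run` with a `cwd` argument instead of relying on `%cd`. `gmx_stage_commands` is a hypothetical helper; it reuses the flags from the cell above (with the single-dash `-maxwarn` spelling) and only constructs the commands, it does not run GROMACS.

```python
def gmx_stage_commands(stages, start="init.gro", top="init.top", maxwarn=1):
    """Build grompp/mdrun argument lists, chaining each stage to the previous one."""
    commands = []
    previous = start
    for stage in stages:
        commands.append([
            "gmx", "grompp", "-f", f"{stage}.mdp", "-o", f"{stage}.tpr",
            "-c", previous, "-p", top, "-maxwarn", str(maxwarn),
        ])
        commands.append([
            "gmx", "mdrun", "-v", "-deffnm", stage,
            "-s", f"{stage}.tpr", "-cpi", f"{stage}.cpt",
        ])
        previous = f"{stage}.gro"  # the next stage starts from this output
    return commands

commands = gmx_stage_commands(["em", "nvt", "npt"])
for cmd in commands:
    print(" ".join(cmd))
    # To execute (assumes gmx is on PATH):
    # subprocess.run(cmd, cwd="gmso_files/gmso_sim", check=True)
```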

# 6. Data Visualization

In [None]:
"""If you used GROMACS"""
import numpy as np
import matplotlib.pyplot as plt

from panedr import edr_to_df

data = edr_to_df("gmso_files/gmso_sim/npt.edr")

plt.rcParams['font.family'] = "DIN Alternate"
font = {'family' : 'DIN Alternate',
        'weight' : 'normal',
        'size'   : 12}

fig, ax = plt.subplots(1, 1)

ax.spines["bottom"].set_linewidth(3)
ax.spines["left"].set_linewidth(3)
ax.spines["right"].set_linewidth(3)
ax.spines["top"].set_linewidth(3)

ax.title.set_text('Control plot')
ax.set_xlabel("Energy frame (written every 1000 MD steps)")
ax.set_ylabel('Density $(kg / m{^3})$')
ax.yaxis.tick_left()
ax.yaxis.set_label_position('left')
ax.axhline(y=541, color='r', linestyle='-', label='~TraPPE-UA Density')

# Plot the density of each saved energy frame against its frame index
density = data["Density"].to_numpy()
frames = np.arange(len(density))

ax.plot(frames, density, "-", color='lightgray', label='Density')
ax.legend(loc="best")
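
Beyond plotting, a single density value with an uncertainty estimate is often what gets reported. The sketch below discards an initial equilibration fraction and estimates the standard error from block averages. `production_average`, the 20% discard, and the five blocks are illustrative assumptions, and the array is synthetic data generated for the example, not output from this simulation.

```python
import numpy as np

def production_average(values, discard_fraction=0.2, n_blocks=5):
    """Mean and block-average standard error after dropping early frames."""
    values = np.asarray(values, dtype=float)
    start = int(len(values) * discard_fraction)
    production = values[start:]
    # Split the production data into blocks and use the scatter of the
    # block means to estimate the error of the overall mean.
    block_means = np.array([b.mean() for b in np.array_split(production, n_blocks)])
    std_err = block_means.std(ddof=1) / np.sqrt(n_blocks)
    return production.mean(), std_err

# Synthetic trace: exponential relaxation toward 800 plus noise.
rng = np.random.default_rng(0)
synthetic = 800.0 + 50.0 * np.exp(-np.arange(1000) / 50.0) + rng.normal(0, 2, 1000)
mean, err = production_average(synthetic)
print(f"{mean:.1f} +/- {err:.1f}")
```

For the real run, the same function could be applied to `data["Density"]`.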