<a href="https://colab.research.google.com/github/mosdef-hub/CECAM-MoSDeF-Workshop/blob/main/polymer_workflow/hoomd-organics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Organic Polymers with HOOMD**
---
## Summary
Users are encouraged to build functionality around MoSDeF tools by wrapping and extending the core classes to suit their needs. `hoomd-organics` is a lightweight library that provides another example of extending MoSDeF tools, reviews concepts from the Slitpore and Biomolecule notebooks, and demonstrates a rudimentary coarse-graining workflow.

## Learning Objectives
This notebook provides interactive examples that will assist learners in using MoSDeF tools to:
1. Initialize complex macromolecules for molecular simulation.
2. Run HOOMD-blue simulations with these molecules.
3. Use and inspect forcefields.
4. Generate coarse-grained representations and run simplified models.

## Tutorial Contents
0. Set up the notebook environment on Google Colab
1. A concise polymer example with `hoomd-organics`
2. Defining molecules
3. Defining and inspecting systems
4. Specifying your own systems
5. Coarse-graining

# __0. Orientation, Installation, & Setup__
---

## Software stack setup
After you run the cell below, the kernel will restart. This is necessary for the conda dependencies, so wait for the restart to finish before running the second cell.

## Interface notes
There are two types of output in these Colab notebooks that can be a little tricky:

1. If the output is very long, for example from the mamba command in the second cell, scrolling past the output can feel onerous. In this case, scrolling up and down in the narrow grey area between the sidebar menu and the cells can help you navigate.

2. If the output is a visualization of a molecule or simulation configuration, scrolling up or down will zoom in or out if the cursor is over the visualization. In these cases, take some care to scroll outside of the visualization.

OK, so if you haven't already, run the next cell by pressing Shift+Return while the cell is active (click on it), or by pressing the Play button that appears when you mouse over it.

In [None]:
!pip install -q condacolab
!git clone --single-branch --branch cecam https://github.com/cmelab/hoomd-organics
import condacolab
condacolab.install()

It will take about 2-3 minutes to install the Python dependencies after the kernel restarts. Once the kernel has restarted, you can run the following cell right away. This cell and the previous one only need to be run once each; running either one a second time can cause confusion.

In [None]:
#!pip install --upgrade ipykernel #breaks things?
import os
os.chdir("hoomd-organics")
!mamba env update -n base -f environment-cpu.yml
!python -m pip install -e .
import warnings
warnings.filterwarnings('ignore')


#  __1. HOOMD simulations from start to finish with MoSDeF Tools__
---
## Overview:
We'll see how to run simulations of poly(phenylene sulfide) (PPS) molecules using `hoomd-organics`, a package built on MoSDeF tools for initializing and performing common MD simulations of organic molecules. It uses the [`HOOMD-blue`](https://hoomd-blue.readthedocs.io/en/v4.1.0/) simulation engine.



## 1.1 Initialization, parameterization, and simulation

First, let's see everything in one block:

With just a couple of imports and a few lines of code, we are able to initialize 30 8-mers of PPS, randomly pack them into a volume, perform a simulation in the NVT ensemble, and peek at the final configuration.

Depending on your Colab node utilization, this may take anywhere from 1 to 7 minutes.

In [None]:
from hoomd_organics.library import PPS, OPLS_AA_PPS
from hoomd_organics import Pack, Simulation
from cmeutils.visualize import FresnelGSD

molecules = PPS(num_mols=30, lengths=8)
system = Pack(molecules=molecules, force_field=OPLS_AA_PPS(), density=0.8, r_cut=2.5, auto_scale=True, scale_charges=True, packing_expand_factor=5)
pps_ff = system.hoomd_forcefield
sim = Simulation(initial_state=system.hoomd_snapshot, forcefield=pps_ff, gsd_write_freq=100, log_write_freq=100, gsd_file_name="pps.gsd")
sim.run_update_volume(n_steps=1000, period=1, kT=1, tau_kt=1, final_box_lengths=system.target_box)
sim.run_NVT(n_steps=1000, kT=1.0, tau_kt=0.01)
viz = FresnelGSD("pps.gsd")
viz.frame = -1 # python convention for last element
viz.view()


In the above example, much of the functionality is provided by two key imports: `PPS` and `OPLS_AA_PPS`. `PPS()` uses `mBuild` tools to initialize PPS chemistries specifically, and `OPLS_AA_PPS` is a subclass of `foyer.Forcefield` that provides the subset of parameters from OPLS_AA needed by PPS specifically.

In the next sections we'll explore each of these components in some more depth.
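
Before moving on, it helps to know how many frames `pps.gsd` holds, since the next cell picks frames by index. The sketch below is plain Python and does not touch the trajectory file. It assumes one frame is written every `gsd_write_freq` steps of each run, plus one frame for the initial configuration. The helper name `expected_frames` is ours and is not part of `hoomd-organics`.

```python
def expected_frames(run_steps, gsd_write_freq, initial_frame=True):
    """Estimate how many frames a trajectory holds.

    Assumes one frame per `gsd_write_freq` steps in each run, plus an
    optional frame for the initial configuration.
    """
    frames = 1 if initial_frame else 0
    for n_steps in run_steps:
        frames += n_steps // gsd_write_freq
    return frames


# Two runs of 1000 steps each (volume update, then NVT), written every 100 steps.
n_frames = expected_frames(run_steps=[1000, 1000], gsd_write_freq=100)
print(f"{n_frames} frames, valid indices 0 to {n_frames - 1}")
```

If you change `n_steps` or `gsd_write_freq` in the simulation above, rerun this estimate to see which frame indices should exist.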

In [None]:
#Can play around visualizing other frames here
viz.frame = 2 #Here the frames run from 0-20: 1 initial configuration, 10 frames from the volume shrink, and 10 frames from the NVT run.
viz.view()

# __2. Defining Molecules__
---
What other ways can we initialize molecules in simulation volumes?

Above, we used the `PPS` class, a subclass of the `hoomd-organics` `Molecule` class. This class includes all the necessary information about the PPS molecule, including the monomer structure and how the monomers bond to create a chain. All we needed to specify in the `PPS()` constructor was the polymer length and the number of polymer chains to create.

You can also define your own molecule(s):
- Using the SMILES string of the molecule
- Using a molecule file (accepted formats: `.mol2` and `.sdf`)
- Using a [`mbuild`](https://mbuild.mosdef.org/en/stable/) compound or a [`gmso`](https://gmso.mosdef.org/en/stable/) topology
- By defining your own subclass of `Molecule`, such as [PPS](https://github.com/cmelab/hoomd-organics/blob/e709be850cc2e818f817243bc82e5414465d0e6b/hoomd_organics/library/polymers.py#L35).

## __Exercise 2.1__

Use the template code below to initialize some copies of a molecule using SMILES strings. Put your blue sticky note up on your laptop when you've been able to explore a bit. Put up your pink sticky note or file an issue on [GitHub](https://github.com/cmelab/hoomd-organics/issues) if you run into any problems!

### <font color="red"><b>Exercise 2.1 Hint </b></font>
<details>
<summary>Click here for help.</summary>
Replace "YOUR_SMILES_HERE" with a valid SMILES string such as "c1cc(C(O)=O)ccc1" to initialize `num_mols` instances of that molecule.
</details>
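
If you want a quick sanity check on a SMILES string before handing it to `Molecule`, a rough sketch like the one below can catch an unreplaced placeholder or unbalanced parentheses. It is not a SMILES parser and is not part of `hoomd-organics`: the helper `quick_smiles_check` is hypothetical, it only recognizes the common organic-subset atoms plus bracket atoms, and it ignores ring-closure and valence rules entirely.

```python
import re

# Organic-subset atoms plus bracket atoms such as [NH4+].
# Two-letter halogens come first so "Cl" is not read as "C" followed by "l".
ATOM_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")
# Characters allowed between atoms: branches, bonds, ring-closure digits.
NON_ATOM_CHARS = set("()=#-:/\\.%0123456789")


def quick_smiles_check(smiles):
    """Return the number of atom tokens, or raise ValueError on obvious problems."""
    leftover = ATOM_PATTERN.sub("", smiles)
    unexpected = set(leftover) - NON_ATOM_CHARS
    if unexpected:
        raise ValueError(f"Unexpected characters {sorted(unexpected)} in {smiles!r}")
    depth = 0
    for char in leftover:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError(f"Unmatched ')' in {smiles!r}")
    if depth != 0:
        raise ValueError(f"Unmatched '(' in {smiles!r}")
    return len(ATOM_PATTERN.findall(smiles))


# Hydrogens are implicit in these strings, so this counts heavy atoms.
print(quick_smiles_check("c1cc(C(O)=O)ccc1"))  # benzoic acid: 7 carbons + 2 oxygens
```

Calling `quick_smiles_check("YOUR_SMILES_HERE")` raises a `ValueError`, which is a handy reminder to fill in the template.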


In [None]:
# example of loading molecule(s) using the SMILES string and visualizing it
from hoomd_organics import Molecule
benzoic_acid_mol = Molecule(num_mols=20, smiles="YOUR_SMILES_HERE")
benzoic_acid_mol.molecules[0].visualize()

## 2.2 Initializing molecules from `mol2` or `sdf` files


In [None]:
# example of loading a molecule using a mol2 file
# If you have another mol2 or sdf file accessible over the web, you can wget it as below:
#!wget https://raw.githubusercontent.com/cmelab/hoomd-organics/main/hoomd_organics/assets/molecule_files/IPH.mol2
phenol_mol = Molecule(num_mols=20, file="hoomd_organics/assets/molecule_files/IPH.mol2")
phenol_mol.molecules[0].visualize()

## 2.3 Initializing molecules from an [`mbuild`](https://mbuild.mosdef.org/en/stable/) compound or a [`gmso`](https://gmso.mosdef.org/en/stable/) topology

In [None]:
# example of loading a molecule from an mbuild compound or a gmso topology
import mbuild as mb
mb_compound = mb.load("c1ccccc1", smiles=True)  # let's double-check benzene
gmso_top = mb_compound.to_gmso()
# Either object can be passed as `compound`; the second call overwrites the first.
benzene_mol = Molecule(num_mols=20, compound=mb_compound)
benzene_mol = Molecule(num_mols=20, compound=gmso_top)

# __3. Defining and inspecting systems__

How did we use the molecules created above to initialize a simulation volume that was then used to run an MD simulation?

The `Pack` class, a subclass of the `System` class, is used to pack a box of PPS molecules at a given density. The `System` class provides code to create the simulation volume and fill it with molecules, applies the force-field (if provided) to the system, and creates the initial state of the system in the form of a `gsd` snapshot.

If the force-field is provided, `Pack` also gets the list of forces that defines the bonded and non-bonded interactions between the particles.
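
To make the "given a density" step concrete, here is a back-of-the-envelope sketch of how a target density fixes the size of a cubic box. It is plain Python and only an estimate: we assume the density is in g/cm^3 and the box is cubic, and the function `cubic_box_length_nm` and the hydrogen-capped chain mass are our own illustration, not the calculation `hoomd-organics` performs internally.

```python
AMU_TO_GRAMS = 1.66053906660e-24  # mass of 1 amu in grams
CM3_TO_NM3 = 1e21                 # 1 cm^3 expressed in nm^3


def cubic_box_length_nm(total_mass_amu, density_g_cm3):
    """Edge length (nm) of a cubic box holding `total_mass_amu` at the given density."""
    volume_cm3 = total_mass_amu * AMU_TO_GRAMS / density_g_cm3
    return (volume_cm3 * CM3_TO_NM3) ** (1 / 3)


# Rough mass of one PPS 8-mer: eight C6H4S repeat units plus two capping hydrogens
# (the capping is an assumption made for this estimate).
monomer_mass = 6 * 12.011 + 4 * 1.008 + 32.06
chain_mass = 8 * monomer_mass + 2 * 1.008

box_length = cubic_box_length_nm(30 * chain_mass, density_g_cm3=0.8)
print(f"Estimated box edge: {box_length:.2f} nm")
```

An estimate like this gives you a number to compare against `system.target_box` once a system has been built.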

In the example from section 1.1, we passed the `molecules` object to `Pack` to fill a box at `density=0.8`.
Let's visualize this initial configuration:

In [None]:
system.system.visualize()

## 3.1 Inspecting forces

Let's see what values of sigma and epsilon were used to parameterize the Lennard-Jones potential in our simulation.
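
As a reminder of where these two parameters enter, the standard 12-6 Lennard-Jones energy is U(r) = 4ε[(σ/r)^12 − (σ/r)^6]. The sketch below evaluates it in plain Python. The function name `lj_energy` is ours, not part of `hoomd-organics` or HOOMD-blue, and the plain truncation at `r_cut` (no shifting or smoothing) is a simplifying assumption.

```python
def lj_energy(r, epsilon, sigma, r_cut=None):
    """12-6 Lennard-Jones pair energy; zero at or beyond r_cut when a cutoff is given."""
    if r_cut is not None and r >= r_cut:
        return 0.0
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# The energy crosses zero at r = sigma and reaches its minimum, -epsilon,
# at r = 2**(1/6) * sigma.
r_min = 2 ** (1 / 6)
print(lj_energy(1.0, epsilon=1.0, sigma=1.0))
print(lj_energy(r_min, epsilon=1.0, sigma=1.0))
```

With real sigma and epsilon values in hand, you can plug them into this function to see how deep and how wide each pair interaction is.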

To look up sigma and epsilon, we use the forcefield information we stored in `pps_ff` earlier, which came from `system.hoomd_forcefield`.

By accessing `pps_ff` we can see which forcefield components are stored in which elements of the list:

In [None]:
pps_ff

## __Exercise 3.1__
We can then view the parameters (`params`) of the LJ pair forces as a dictionary:

### <font color="red"><b>Exercise 3.1 Hint </b></font>
<details>
<summary>Click here for help.</summary>

We'll replace "INDEX" with the index of the forcefield element we wish to inspect. From the previous cell we can see that LJ pair forces are stored in `pps_ff[3]`, so we'll pass the parameters `pps_ff[3].params` to the `dict()` function to summarize them.
</details>

In [None]:
dict(pps_ff[INDEX].params)

# __4. Specifying your own systems__
## 4.1  Defining forcefields
The `hoomd-organics` package has a list of pre-defined force-fields that can be used to initialize the system. If you have the `xml` file of a forcefield, you can use the `FF_from_file` class from `hoomd_organics.library` to create a force-field object.
You can also define your own forcefield by creating a subclass of the `foyer.Forcefield` class.


In [None]:
from hoomd_organics.library import FF_from_file
benzene_ff = FF_from_file(xml_file="hoomd_organics/assets/forcefields/benzene_opls.xml")

Check out `hoomd_organics/library/forcefields.py` for more examples of defining a forcefield for specific molecules using a subclass of `foyer.Forcefield`.

## 4.2 Creating initial configurations


The `hoomd-organics` package has two built-in methods of filling the box, both based on the `System` class: `Pack` and `Lattice`. We used `Pack` above, which leverages an `mBuild` interface to Packmol. `Lattice` is demonstrated below:

In [None]:
# example of defining a system using the Lattice method

from hoomd_organics import Lattice
from hoomd_organics.library import OPLS_AA

benzene_mol = Molecule(num_mols=32, smiles="c1ccccc1")

lattice = Lattice(
    molecules=[benzene_mol],
    force_field=OPLS_AA(),
    density=1.0,
    r_cut=2.5,
    x=1,
    y=1,
    n=4,
    auto_scale=True,
)
lattice.system.visualize()

## 4.3 Systems with multiple molecule types

You can also define your own method of filling the box by creating a subclass of the `System` class. For example, one method of filling a box with two types of molecule is creating alternate layers of each molecule type.
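
To illustrate the alternating-layer idea, the sketch below works out which molecule type goes where, using plain Python only. It does not use the `System` API. The helper `alternating_layer_sites` is hypothetical, and a real subclass would still need to place actual molecules at these sites and build the simulation volume.

```python
def alternating_layer_sites(type_names, n_layers, sites_per_side, spacing):
    """Return (type_name, (x, y, z)) pairs, with one square grid of sites per z level.

    Layers cycle through `type_names`, so two names give an A/B/A/B stacking.
    """
    sites = []
    for layer in range(n_layers):
        type_name = type_names[layer % len(type_names)]
        z = layer * spacing
        for i in range(sites_per_side):
            for j in range(sites_per_side):
                sites.append((type_name, (i * spacing, j * spacing, z)))
    return sites


sites = alternating_layer_sites(
    ["dimethylether", "PPS"], n_layers=4, sites_per_side=2, spacing=1.0
)
for type_name, position in sites[:5]:
    print(type_name, position)
```

Here each layer holds a 2 x 2 grid of sites, and the molecule type switches every time `z` increases by one spacing.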

The `System` class can take a list of different molecule types along with a list of forcefields for them. If all molecule types use the same forcefield, then you only need to pass the forcefield once.

In [None]:
#!wget https://github.com/cmelab/hoomd-organics/raw/main/hoomd_organics/assets/forcefields/dimethylether_opls.xml
from hoomd_organics.library import OPLS_AA_DIMETHYLETHER
dimethylether_mol = Molecule(num_mols=20, smiles="COC")
pps_mol = PPS(num_mols=10, lengths=4)
multi_type_system = Pack(
    molecules=[dimethylether_mol, pps_mol], #specify numbers of molecules in constructors above
    density=0.8,
    r_cut=2.5,
    force_field=[OPLS_AA_DIMETHYLETHER(), OPLS_AA_PPS()],
    auto_scale=True,
)
multi_type_system.system.visualize()

# __5. Coarse-graining__
---
In the following example, we'll demonstrate how to generate a coarse-grained representation of a molecule, apply it to a simulation volume, define a forcefield for that coarse representation, and run a HOOMD simulation with it.
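
Conceptually, coarse-graining replaces each group of atoms with a single bead. The sketch below shows that idea on toy data in plain Python, placing each bead at the center of mass of its atoms. Both the center-of-mass convention and the helper names are our assumptions for illustration; the `coarse_grain` method used below may position beads differently.

```python
def center_of_mass(positions, masses):
    """Mass-weighted average position of a group of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * p[axis] for p, m in zip(positions, masses)) / total
        for axis in range(3)
    )


def map_to_beads(positions, masses, bead_atom_indices):
    """Collapse atoms into beads: one center of mass per group of atom indices."""
    beads = []
    for indices in bead_atom_indices:
        group_positions = [positions[i] for i in indices]
        group_masses = [masses[i] for i in indices]
        beads.append(center_of_mass(group_positions, group_masses))
    return beads


# Toy data: four atoms on a line (three carbon-like, one sulfur-like), two beads.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
masses = [12.0, 12.0, 12.0, 32.0]
beads = map_to_beads(positions, masses, bead_atom_indices=[[0, 1], [2, 3]])
print(beads)
```

The second bead sits closer to the heavier atom, which is the main difference from simply averaging the atom positions.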

In [None]:
from hoomd_organics.base import Pack, Simulation
from hoomd_organics.library import PPS, BeadSpring
pps_mol = PPS(num_mols=300, lengths=6)
pps_mol.molecules[0].visualize()

In [None]:
pps_mol.coarse_grain(beads={"A": "c1ccc(S)cc1"})
pps_mol.molecules[0].visualize()

In [None]:
cg_system = Pack(molecules=pps_mol, density=0.5, r_cut=2.5, auto_scale=False)
cg_system.system.visualize()

In [None]:


ff = BeadSpring(
    r_cut=2.5,
    beads={"A": dict(epsilon=1.0, sigma=1.0),},
    bonds={"A-A": dict(r0=1.1, k=300),},
    angles={"A-A-A": dict(t0=2.0, k=200)},
    dihedrals={"A-A-A-A": dict(phi0=0.0, k=100, d=-1, n=1)},
)
cg_sim = Simulation(initial_state=cg_system.hoomd_snapshot, forcefield=ff.hoomd_forcefield, gsd_write_freq=100, log_write_freq=100, gsd_file_name="cg.gsd")
cg_sim.run_update_volume(n_steps=1000, period=1, kT=1, tau_kt=1, final_box_lengths=cg_system.target_box)
print(cg_system.hoomd_snapshot.particles.types)
cg_sim.run_NVT(n_steps=1000, kT=1.2, tau_kt=1)
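
To get a feel for the bonded parameters above, the sketch below evaluates a harmonic energy, U = k/2 (x − x0)^2, for a slightly stretched bond and a slightly opened angle. We are assuming that `r0`/`k` and `t0`/`k` in `BeadSpring` are harmonic parameters of this form, which matches HOOMD-blue's harmonic bond and angle potentials; the function name `harmonic_energy` is ours.

```python
def harmonic_energy(value, equilibrium, k):
    """U = k/2 * (value - equilibrium)**2, for a bond length or an angle in radians."""
    return 0.5 * k * (value - equilibrium) ** 2


# Parameters taken from the bead-spring definition above.
bond_energy = harmonic_energy(1.2, equilibrium=1.1, k=300)   # bond stretched by 0.1
angle_energy = harmonic_energy(2.1, equilibrium=2.0, k=200)  # angle opened by 0.1 rad
print(f"bond: {bond_energy:.2f}, angle: {angle_energy:.2f}")
```

Comparing these energies with `kT` (1.2 in the NVT run above) gives a rough sense of how stiff the bonds and angles are at the simulated temperature.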


In [None]:
cg_viz = FresnelGSD("cg.gsd")
cg_viz.frame = 1
cg_viz.view()