# 0. Set up environment on Google Colab

In [None]:
!pip install -q condacolab
import condacolab
condacolab.install_miniforge()

In [None]:
import condacolab
condacolab.check()

!conda install mamba

!mamba install anaconda-client -n base
!git clone https://github.com/mosdef-hub/CECAM-MoSDeF-Workshop
!mamba env update -n base -f CECAM-MoSDeF-Workshop/environment.yml
!pip install -e CECAM-MoSDeF-Workshop

# Carbon Slitpore Workflow
- Striolo, A.; Chialvo, A. A.; Cummings, P. T.; Gubbins, K. E. Water Adsorption in Carbon-Slit Nanopores. Langmuir, 2003, 19 (20), 8583–8591.
    - "Porous carbon materials are used for separation, purification, and catalysis purposes. While the adsorption and phase behavior of nonpolar fluids in carbon pores has been studied extensively, our understanding regarding adsorption of water in carbonaceous materials is still rudimentary. Nevertheless, the structure and the thermodynamic properties of water confined in hydrophobic regions are of importance in many scientific disciplines such as chemistry, geology, nanotechnology, and biology. Water adsorption in hydrophobic materials is typically characterized by negligible adsorption at low relative pressures, sudden and complete pore filling by a capillary-condensation mechanism, and large adsorption/desorption hysteresis loops."


## Simulation Workflow
- The above study was recreated in 2020 in a work by Cummings et al. using open-source molecular modeling software, with a focus on the Molecular Simulation Design Framework (MoSDeF).
    - Peter Cummings, Clare McCabe, Christopher Iacovella, et al. Open-Source Molecular Modeling Software in Chemical Engineering Focusing on the Molecular Simulation Design Framework. Authorea. November 30, 2020.

## __1. Construct System with mBuild__
- The chemical system can be constructed with mBuild, the hierarchical molecular constructor of the MoSDeF software suite. The library offers several ways to load or create a chemical system, e.g., loading from common file formats such as .xyz, .mol2, and .pdb, building from a SMILES string, using internal recipes, or using user-constructed recipes.
- Below, we demonstrate two methods of creating a molecule: using a SMILES string to create a water molecule, and using a user recipe to build a carbon slitpore.

In [1]:
import warnings 
warnings.filterwarnings("ignore")
# Import Libraries
import mbuild as mb
import gmso

from cecam_mosdef.slitpore_workflow.porebuilder import GraphenePore, GrapheneSurface



In [None]:
# load molecules from their daylight SMILES string 
# https://www.daylight.com/dayhtml/doc/theory/theory.smiles.html
water = mb.load("O", smiles=True) 
# Box lengths are in nm; a target density can be given instead of n_compounds
water_box = mb.fill_box(water, box=[5, 5, 5], n_compounds=100)


"""Visualization utilities"""
water.print_hierarchy()

water.visualize() # visualize molecule atoms and bonds
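
The `fill_box` call above fixes the number of molecules rather than the density, so it is worth checking what density that choice implies. The sketch below is plain arithmetic, independent of mBuild. It assumes an orthorhombic box with lengths in nm and a water molar mass of 18.015 g/mol; the helper names `box_density` and `molecules_for_density` are illustrative and not part of any MoSDeF package.

```python
AVOGADRO = 6.02214076e23   # 1/mol
WATER_MOLAR_MASS = 18.015  # g/mol (assumed value for H2O)


def box_density(n_molecules, molar_mass, box_lengths_nm):
    """Mass density in kg/m^3 of n_molecules in an orthorhombic box (lengths in nm)."""
    volume_m3 = 1.0
    for length in box_lengths_nm:
        volume_m3 *= length * 1e-9          # nm -> m
    mass_kg = n_molecules * molar_mass / AVOGADRO * 1e-3  # g -> kg
    return mass_kg / volume_m3


def molecules_for_density(target_density, molar_mass, box_lengths_nm):
    """Number of molecules needed to reach target_density (kg/m^3) in the box."""
    density_per_molecule = box_density(1, molar_mass, box_lengths_nm)
    return round(target_density / density_per_molecule)


density = box_density(100, WATER_MOLAR_MASS, [5, 5, 5])
n_liquid = molecules_for_density(1000.0, WATER_MOLAR_MASS, [5, 5, 5])
print(f"100 waters in a 5 nm cube: {density:.1f} kg/m^3")
print(f"Molecules needed for 1000 kg/m^3: {n_liquid}")
```

With 100 molecules the box is far more dilute than liquid water (roughly 24 kg/m^3), which is fine for a quick demonstration but worth keeping in mind when building a bulk liquid reservoir.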

In [None]:
# Load structure from recipes
graphene = GraphenePore(pore_length=4,
                        pore_depth=4,
                        n_sheets=2,
                        pore_width=1.2,
                        slit_pore_dim=1)
# Try changing the n_sheets to form more layers
"""Visualization utility""" 
graphene.print_hierarchy()

graphene.visualize()

### Exercise 1 - Creating your own systems (ET: 10 mins)
1. Create and visualize a graphene pore using the recipe above. Feel free to change some of the parameters and see their effect on the final system.
2. Create and visualize a solvent of your choice with mBuild using a SMILES string.
    - Tip: Googling the molecule name + "SMILES" usually returns the input you need.
    - Bonus: Load a molecule from a pdb file (a few are available at ...)

In [12]:
# Start your exercise here

## __2. Loading Forcefield__

- In the MoSDeF ecosystem, a forcefield is stored in XML format, which records the version, combining rule, atom types, connection types, and associated DOIs. Each atom type also includes a `def`, which stores its SMARTS definition, and a `doi`, which points to the paper the parameters are sourced from.
- Currently, MoSDeF tools support two XML formats: an extended version of the OpenMM XML format, and a newly developed format that carries additional information we believe is beneficial for performing TRUE (transparent, reproducible, usable by others, and extensible) research.
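
To make the role of the `def` and `doi` fields concrete, here is a minimal sketch that reads them with the Python standard library. The XML fragment is a hypothetical, heavily simplified example loosely modeled on an OpenMM-style layout. It is not the contents of `carbon.xml`, the DOI is a placeholder, and real MoSDeF files contain more elements and attributes than shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified forcefield fragment (not a real MoSDeF file)
EXAMPLE_XML = """
<ForceField name="example" version="0.0.1" combining_rule="lorentz">
  <AtomTypes>
    <Type name="C" element="C" mass="12.011" def="[C;X3]" doi="10.0000/placeholder"/>
  </AtomTypes>
</ForceField>
"""


def summarize_forcefield(xml_text):
    """Collect top-level metadata and per-atom-type SMARTS/DOI entries."""
    root = ET.fromstring(xml_text.strip())
    atom_types = [
        {"name": t.get("name"), "def": t.get("def"), "doi": t.get("doi")}
        for t in root.iter("Type")
    ]
    return {
        "name": root.get("name"),
        "version": root.get("version"),
        "combining_rule": root.get("combining_rule"),
        "atom_types": atom_types,
    }


summary = summarize_forcefield(EXAMPLE_XML)
for atom_type in summary["atom_types"]:
    print(atom_type["name"], atom_type["def"], atom_type["doi"])
```

In practice you would let `gmso.ForceField` do the parsing, as in the cells below; the point here is only that the SMARTS definition and its source reference travel together with each atom type.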

In [4]:
import forcefield_utilities as ffutils

In [5]:
carbon_forcefield = gmso.ForceField("../forcefields/carbon.xml")
carbon_forcefield

<ForceField Graphene,
 1 AtomTypes,
 0 BondTypes,
 0 AngleTypes,
 0 DihedralTypes,
 0 ImproperType,
 0 PairPotentialType,
 id: 6104351552>

In [None]:
"""Basic attributes of each atom type"""
for name, atype in carbon_forcefield.atom_types.items():
    print(atype)
    print("SMARTS definition:", atype.definition)
    print("Potential expression")
    display(atype.expression)
    print(atype.parameters)

In [None]:
spce_forcefield = gmso.ForceField("../forcefields/spce.xml")
spce_forcefield

In [None]:
"""Basic attributes of each connection type"""
for name, btype in spce_forcefield.bond_types.items():
    print(btype)
    print("Potential expression")
    display(btype.expression)
    print(btype.parameters)

### Exercise 2 - Load a force field and inspect some of its attributes (ET: 10 mins)
1. Load the "OPLS" forcefield at "path" to an object named `oplsaa`
2. Inspect the forcefield
    - Try calling `oplsaa.__dict__` to see all the attributes that a force field has
    - What are the combining rule and scaling factors of this forcefield?
3. Inspect some attributes of an atomtype 
    - Inspect the potential expression 
    - Notable attributes

In [11]:
# Start your exercise here

## __3. Parameterization__
- MoSDeF's backend data structure supports automatic atom typing and parameterization, i.e., mapping the atom types and connection types stored in a loaded forcefield onto a GMSO structure.
- This is done internally using Foyer, which performs graph matching between the molecular bond graph (of the GMSO Topology object) and the SMARTS string of each atom type. The algorithm for this process is outlined here [FOYER PAPER LINK].
- The parameterization step creates a typed Topology, which can then be saved to various file formats and read by the corresponding simulation codes.
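
The idea of matching each atom's bonding environment against a set of rules can be illustrated with a toy example. This is not Foyer's algorithm: Foyer matches full SMARTS patterns against the bond graph, whereas the sketch below only compares an atom's element and the elements of its direct neighbors. The rule table and the type names `OW` and `HW` are hypothetical.

```python
from collections import Counter

# Bond graph of one water molecule: atom index -> (element, bonded atom indices)
WATER = {0: ("O", [1, 2]), 1: ("H", [0]), 2: ("H", [0])}

# Hypothetical rules: (atom type name, element, required neighbor elements)
RULES = [
    ("OW", "O", {"H": 2}),
    ("HW", "H", {"O": 1}),
]


def assign_atom_types(graph, rules):
    """Assign the first rule whose element and neighbor counts match each atom."""
    typed = {}
    for index, (element, neighbors) in graph.items():
        environment = Counter(graph[n][0] for n in neighbors)
        for type_name, rule_element, rule_environment in rules:
            if element == rule_element and environment == Counter(rule_environment):
                typed[index] = type_name
                break
        else:
            raise ValueError(f"No rule matches atom {index} ({element})")
    return typed


print(assign_atom_types(WATER, RULES))
```

An atom with no matching rule raises an error here, which mirrors the practical situation where a forcefield simply does not cover a given chemistry.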

In [None]:
from gmso.parameterization import apply

graphene_top = graphene.to_gmso()
water_top = water_box.to_gmso()
graphene_ptop = apply(graphene_top, carbon_forcefield)
water_ptop = apply(water_top, spce_forcefield)

In [None]:
# Iterable attributes
# graphene_top.sites
# graphene_top.bonds
# graphene_top.angles
# graphene_top.dihedrals
# graphene_top.impropers

display(graphene_ptop.sites[0].atom_type.expression)
print(f"{graphene_ptop.sites[0].atom_type.parameters}")

In [None]:
"""Utility to output system as Dataframe"""
water_ptop.to_dataframe(site_attrs=["atom_type.parameters"])

In [None]:
"""Utility to output system as Dataframe"""
graphene_ptop.to_dataframe(site_attrs=["atom_type.parameters"])
# TODO: only print unique parameters here

### Exercise 3 - Parameterize your solvent (ET: 10 mins)
1. Use the OPLS forcefield to parameterize the molecule you created in the exercise above (this may or may not succeed, depending on how exotic your molecule is)
    - Summarize all the atom types in a dataframe
2. Open the docstring for Topology.to_dataframe
    - See how you can modify the output of the dataframe to get the information you need.
    

In [9]:
### Start your exercise here

## __4. Saving out to Cassandra files__
- The GMSO data structure provides direct support for multiple simulation engines, including GROMACS, LAMMPS, HOOMD-blue, GOMC, and Cassandra. This includes the ability to save the typed Topology directly to molecular input files that the corresponding engines can read.
- In this example, we write the typed Topology out in Cassandra's molecular connectivity file format (`.mcf`).

In [None]:
# Saving out file and inspect the output

## __5. Set up Cassandra input file and run simulation (Optional)__
- We collaborate with other groups to create automated input file generators.
- Here we use mosdef_cassandra; mosdef_gomc plays a similar role for GOMC.

In [None]:
import mosdef_cassandra as mc
import unyt as u


In [None]:
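# This cell is pending a rewrite once the new mcf PR is available.
# `empty_pore`, `typed_pore`, `typed_water`, and `pore_width` are placeholders
# that must be defined before this cell can run.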

import mosdef_cassandra as mc
import unyt as u

# set variables
n_steps = 1000
temperature = 300 * u.K
mu = -54.0 * u.kJ / u.mol

# Create box and species list
box_list = [empty_pore]
species_list = [typed_pore, typed_water]

# Specify mols at start of the simulation
mols_in_boxes = [[1, 0]]

# Create MC system
system = mc.System(box_list, species_list, mols_in_boxes=mols_in_boxes)
moves = mc.MoveSet("gcmc", species_list)

# Set move probabilities
moves.prob_translate = 0.25
moves.prob_rotate = 0.25
moves.prob_insert = 0.25
moves.prob_regrow = 0.0

# Specify the restricted insertion
restricted_type = [[None, "slitpore"]]
restricted_value = [[None, 0.5 * pore_width ]]
moves.add_restricted_insertions(
    species_list, restricted_type, restricted_value
)

# Set thermodynamic properties
thermo_props = [
    "energy_total",
    "energy_intervdw",
    "energy_interq",
    "nmols",
]

default_args = {
    "run_name" : "gcmc",
    "cutoff_style": "cut",
    "charge_style": "ewald",
    "rcut_min": 0.5 * u.angstrom,
    "vdw_cutoff": 9.0 * u.angstrom,
    "charge_cutoff": 9.0 * u.angstrom,
    "properties": thermo_props,
    "angle_style": ["harmonic", "fixed"],
    "coord_freq": 100000,
    "prop_freq": 1000,
}

# Run-specific overrides go here; they take precedence over default_args
custom_args = {}
custom_args = {**default_args, **custom_args}

mc.run(
    system=system,
    moveset=moves,
    run_type="equilibration",
    run_length=n_steps,
    temperature=temperature,
    chemical_potentials=["none", mu],
    **custom_args,
)
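
The cell above combines `default_args` with `custom_args` by dictionary unpacking before passing the result to `mc.run`. As a small illustration of that Python idiom, using made-up values rather than recommended simulation settings: when the same key appears in both dictionaries, the one unpacked last wins, and neither input dictionary is modified.

```python
# Illustrative values only; the keys mirror two of the arguments used above
defaults = {"cutoff_style": "cut", "prop_freq": 1000}
overrides = {"prop_freq": 10}

# Later unpacking takes precedence, so overrides replace matching defaults
merged = {**defaults, **overrides}

print(merged)
print(defaults)  # unchanged by the merge
```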