# Force Field Tutorial 01: Intermediate

## Inspecting the XML files

In [14]:
cat utils/foyer_spce.xml

<ForceField version="0.0.1" name="SPC/E Water" combining_rule="geometric">
 <AtomTypes>
   <Type name="opls_116" class="OW" element="O" mass="15.99940" def="[O;X2](H)(H)" desc="SPC/E Oxygen" doi="10.1021/j100308a038"/>
   <Type name="opls_117" class="HW" element="H" mass="1.00800" def="[H;X1][O;X2](H)" desc="SPC/E Hydrogen" doi="10.1021/j100308a038"/>
 </AtomTypes>
 <HarmonicBondForce>
   <Bond class1="OW" class2="HW" length="0.100" k="345000.0"/>
 </HarmonicBondForce>
 <HarmonicAngleForce>
   <Angle class1="HW" class2="OW" class3="HW" angle="1.91061193" k="383.0"/>
 </HarmonicAngleForce>
 <NonbondedForce coulomb14scale="0.5" lj14scale="0.5">
  <Atom type="opls_116" charge="-0.8476" sigma="0.316557" epsilon="0.650194"/>
  <Atom type="opls_117" charge="0.4238" sigma="0.0" epsilon="0.0"/>
 </NonbondedForce>
</ForceField>
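
The same inspection can be done programmatically. The following is a minimal sketch that uses only Python's standard `xml.etree.ElementTree`. The XML is a trimmed copy of the Foyer file above, embedded as a string so the snippet runs on its own. The variable names are illustrative, and this is not how Foyer itself loads a force field.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Foyer SPC/E file shown above (bonded terms omitted).
FOYER_SPCE = """
<ForceField version="0.0.1" name="SPC/E Water" combining_rule="geometric">
 <AtomTypes>
   <Type name="opls_116" class="OW" element="O" mass="15.99940" def="[O;X2](H)(H)"/>
   <Type name="opls_117" class="HW" element="H" mass="1.00800" def="[H;X1][O;X2](H)"/>
 </AtomTypes>
 <NonbondedForce coulomb14scale="0.5" lj14scale="0.5">
  <Atom type="opls_116" charge="-0.8476" sigma="0.316557" epsilon="0.650194"/>
  <Atom type="opls_117" charge="0.4238" sigma="0.0" epsilon="0.0"/>
 </NonbondedForce>
</ForceField>
"""

root = ET.fromstring(FOYER_SPCE.strip())

# Map each atom-type name to its SMARTS definition.
smarts_by_type = {t.get("name"): t.get("def") for t in root.iter("Type")}

# Map each atom-type name to its partial charge.
charges = {a.get("type"): float(a.get("charge")) for a in root.iter("Atom")}

# One oxygen plus two hydrogens should give a neutral water molecule.
net_charge = charges["opls_116"] + 2 * charges["opls_117"]

for name, smarts in smarts_by_type.items():
    print(name, smarts, charges[name])
print("combining rule:", root.get("combining_rule"))
```

The neutrality check is a quick sanity test that is worth running on any hand-edited force field file.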


In [15]:
cat utils/gmso_spce.xml

<?xml version='1.0' encoding='UTF-8'?>
<ForceField name="SPC/E Water" version="0.0.1">
  <FFMetaData electrostatics14Scale="0.5" nonBonded14Scale="0.5" combiningRule="geometric">
    <Units energy="kJ" distance="nm" mass="amu" charge="elementary_charge"/>
  </FFMetaData>
  <AtomTypes expression="4*epsilon*(-sigma**6/r**6 + sigma**12/r**12)">
    <ParametersUnitDef parameter="epsilon" unit="kJ/mol"/>
    <ParametersUnitDef parameter="sigma" unit="nm"/>
    <AtomType name="opls_116" mass="15.9994" charge="-0.8476" atomclass="OW" doi="10.1021/j100308a038" definition="[O;X2](H)(H)" description="SPC/E Oxygen">
      <Parameters>
        <Parameter name="epsilon" value="0.650194"/>
        <Parameter name="sigma" value="0.316557"/>
      </Parameters>
    </AtomType>
    <AtomType name="opls_117" mass="1.008" charge="0.4238" atomclass="HW" doi="10.1021/j100308a038" definition="[H;X1][O;X2](H)" description="SPC/E Hydrogen">
      <Parameters>
        <Parameter name="epsilon" value="0.0"/>
  

## SMARTS String

### Defining SMARTS
Focusing first on atom type `opls_140`, the SMARTS string, `def="H[C;X4]"`, states that this atom-type applies when:
- The element is hydrogen, i.e., `H`
- That hydrogen is bonded to a carbon that has 4 neighbors, i.e., `[C;X4]`

Similarly, for atom type `opls_138`, the SMARTS string, `def="[C;X4](H)(H)(H)H"`, states that this atom-type applies when:
- The element is carbon, with 4 neighbors, i.e., `[C;X4]`
- All 4 of those neighbors are hydrogens, i.e., `(H)(H)(H)H`

For atom type `opls_136`, the SMARTS string, `def="[C;X4](H)H"`, states that this atom-type applies when:
- The element is carbon, with 4 neighbors, i.e., `[C;X4]`
- At least 2 of those neighbors are hydrogens, i.e., `(H)H`

For atom type `opls_135`, the SMARTS string, `def="[C;X4](H)(H)H"`, states that this atom-type applies when:
- The element is carbon, with 4 neighbors, i.e., `[C;X4]`
- At least 3 of those neighbors are hydrogens, i.e., `(H)(H)H`
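
To make the reading of these strings concrete, here is a toy sketch that splits the four definitions above into "atom being typed" and "required neighbors". It is a regular expression that handles only the simple patterns in this section (a bare element or a bracketed element with an `X` degree). It is not a SMARTS parser and is not the parser Foyer uses. The function name `describe` is hypothetical.

```python
import re

# Matches a bracketed atom with a degree, e.g. [C;X4], or a bare element, e.g. H.
TOKEN = re.compile(r"\[([A-Z][a-z]?);X(\d+)\]|([A-Z][a-z]?)")


def describe(smarts):
    """Split a simple SMARTS string into the typed atom and its listed neighbors.

    Assumes every atom after the first is bonded directly to the first atom,
    which holds for the four definitions discussed here but not in general.
    """
    atoms = []
    for match in TOKEN.finditer(smarts):
        if match.group(1):
            atoms.append((match.group(1), int(match.group(2))))
        else:
            atoms.append((match.group(3), None))
    (element, degree), neighbors = atoms[0], atoms[1:]
    return {"element": element, "degree": degree, "neighbors": neighbors}


for smarts in ["H[C;X4]", "[C;X4](H)(H)(H)H", "[C;X4](H)H", "[C;X4](H)(H)H"]:
    print(smarts, "->", describe(smarts))
```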

### Atomtyping using SMARTS
Let us now consider using these rules to atom-type the carbon in methane (i.e., CH4).

- `opls_138` evaluates to `True`, as it is defined for a carbon with 4 hydrogen neighbors.
- `opls_135` and `opls_136` would also evaluate to `True`. For `opls_135`, the definition only states that at least 3 of the neighbors are hydrogen and makes no claim about the identity of the 4th neighbor; similarly, for `opls_136`, the definition only states that 2 neighbors must be hydrogen and does not specify the identity of the other 2.

This is an important feature of Foyer: it evaluates every rule in a force field file rather than stopping at the first rule that evaluates to `True`, so rules can be defined in any order. Furthermore, Foyer iterates over the rules, which allows recursive definitions, i.e., SMARTS strings that refer to specific atom-types.
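
The methane case can be mimicked with a toy rule evaluator. This is only a sketch of the idea that every rule is checked: each carbon rule is reduced to "4 neighbors, at least N hydrogens", which is an assumption that fits the three definitions above. The names `MIN_HYDROGENS` and `matching_types` are hypothetical, and Foyer's real matching works on the molecular graph rather than on a neighbor list.

```python
# Minimum number of hydrogen neighbors each carbon rule requires.
# All three rules share the [C;X4] requirement (exactly 4 neighbors).
MIN_HYDROGENS = {"opls_135": 3, "opls_136": 2, "opls_138": 4}


def matching_types(neighbors):
    """Return every carbon atom-type whose rule is satisfied by these neighbors."""
    if len(neighbors) != 4:
        return []
    n_hydrogens = neighbors.count("H")
    # Check all rules; do not stop at the first one that matches.
    return sorted(
        name for name, needed in MIN_HYDROGENS.items() if n_hydrogens >= needed
    )


methane_carbon = ["H", "H", "H", "H"]
ethane_carbon = ["H", "H", "H", "C"]

print(matching_types(methane_carbon))  # ['opls_135', 'opls_136', 'opls_138']
print(matching_types(ethane_carbon))   # ['opls_135', 'opls_136']
```

More than one rule matches in both cases, so something beyond the rules themselves has to decide which atom-type is assigned.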

### Overrides
Because several rules can evaluate to `True` for the same atom, we need a way to decide which atom-type is actually assigned. We will discuss two ways to address this. One approach is to employ `overrides`. `overrides` provide a means to dictate rule precedence (i.e., which rules are more specific than others). In the force field file above, `opls_138` has defined: `overrides="opls_135,opls_136"`. That is, if `opls_138` evaluates to `True`, then it takes precedence over `opls_135` and `opls_136`, even if they evaluate to `True`.

Similarly, if `opls_135` evaluates to `True`, it `overrides="opls_136"`, thus taking precedence.

`overrides` are especially useful if the chemical contexts of two different atom-types are effectively the same. E.g., the terminal methyl group in an alkane has the same first-neighbor environment as the methyl group in toluene; however, the parameters (and thus the atom-types) are different. Thus `overrides` can be used to state that if the more specific toluene rule evaluates to `True`, it should take precedence over the more general alkane rule (as shown below):

<img src="utils/ch3-toluene.png" alt="Drawing" style="width: 700px;"/>
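
The precedence logic can be sketched in a few lines. This toy version assumes the override table described in this section (`opls_138` overrides `opls_135` and `opls_136`, and `opls_135` overrides `opls_136`) and simply discards any matched type that another matched type overrides. The names `OVERRIDES` and `resolve` are hypothetical, and this is not Foyer's internal implementation.

```python
# Which atom-types each rule overrides, as described in the text.
OVERRIDES = {
    "opls_138": {"opls_135", "opls_136"},
    "opls_135": {"opls_136"},
}


def resolve(matched):
    """Drop every matched atom-type that is overridden by another matched type."""
    matched = set(matched)
    overridden = set()
    for name in matched:
        overridden |= OVERRIDES.get(name, set())
    return matched - overridden


# Methane carbon: all three rules match, only opls_138 survives.
print(resolve({"opls_135", "opls_136", "opls_138"}))  # {'opls_138'}
# Methyl carbon: opls_135 and opls_136 match, opls_135 wins.
print(resolve({"opls_135", "opls_136"}))              # {'opls_135'}
```

An override only takes effect when the overriding rule itself matched, which is why `opls_136` is still assigned to a CH2 carbon.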

### Better SMARTS definitions
In many cases, `overrides` can be avoided by simply providing more specific SMARTS strings. For example, the rules for `opls_135`, `opls_136`, and `opls_138` above can be made more specific by stating the identity of the neighbors other than hydrogen, which eliminates the need for `overrides`, as shown below.

In [17]:
! sed -n 2,18p utils/OPLSaa_alkanes2.xml

    <AtomTypes>
        <Type name="opls_135" def="[C;X4](H)(H)(H)C"
              class="CT" element="C" mass="12.01100" desc="alkane CH3"
              doi="10.1021/ja9621760"/>

        <Type name="opls_136" def="[C;X4](H)(H)(C)C"
              class="CT" element="C" mass="12.01100" desc="alkane CH2"
              doi="10.1021/ja9621760"/>

        <Type name="opls_138" def="[C;X4](H)(H)(H)H"
              class="CT" element="C" mass="12.01100" desc="alkane CH4"
              doi="10.1021/ja9621760"/>

        <Type name="opls_140" def="H[C;X4]"
              class="HC" element="H" mass="1.00800" desc="alkane H"
              doi="10.1021/ja9621760"/>
    </AtomTypes>
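
A toy check shows why these definitions no longer need `overrides`. Because each carbon rule now lists all 4 neighbors of an `X4` carbon, a rule is modeled here as an exact multiset of neighbor elements. That is an assumption that holds for these three definitions only. The names `REQUIRED_NEIGHBORS` and `matching_types` are hypothetical.

```python
from collections import Counter

# Exact neighbor composition demanded by each rule in OPLSaa_alkanes2.xml.
REQUIRED_NEIGHBORS = {
    "opls_135": Counter({"H": 3, "C": 1}),  # [C;X4](H)(H)(H)C
    "opls_136": Counter({"H": 2, "C": 2}),  # [C;X4](H)(H)(C)C
    "opls_138": Counter({"H": 4}),          # [C;X4](H)(H)(H)H
}


def matching_types(neighbors):
    """Return the atom-types whose neighbor composition matches exactly."""
    found = Counter(neighbors)
    return [name for name, needed in REQUIRED_NEIGHBORS.items() if found == needed]


print(matching_types(["H", "H", "H", "H"]))  # ['opls_138']
print(matching_types(["H", "H", "H", "C"]))  # ['opls_135']
print(matching_types(["H", "H", "C", "C"]))  # ['opls_136']
```

Each carbon environment now matches exactly one rule, so no precedence information is required.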


Note that SMARTS can be used to define more than just the immediate local pattern. Recall the definition for `opls_140`, `def='H[C;X4]'`. This definition states not only that the element is hydrogen and is bonded to a carbon, but also that the carbon it is bonded to has 4 total bonds. If necessary, the identity of those bonded neighbors could be specified.

Additionally, we note that since this force field only has a single hydrogen atom-type, the definition could actually be made less specific, i.e., `def='H'`, and still produce the correct output.  

Thus, it is important to keep in mind that there are multiple valid SMARTS definitions that can be provided for a given atom-type; the specificity of the SMARTS definition and whether or not `overrides` are necessary will depend on the chemical context of the parameters themselves, the number of atom-types included in a forcefield, the intended usage of the forcefield, and the personal preferences of the individual(s) defining the forcefield.   

## Other Metadata

In [6]:
import mbuild as mb
import gmso
from gmso.parameterization import apply




In [7]:
ethane = mb.load("CC", smiles=True)
ethane_top = ethane.to_gmso()
ethane_top.identify_connections()
print(ethane_top)

<Topology Topology, 8 sites, id: 5541541776>


In [8]:
# Default settings
# scaling_factors is a 2-row array: row 0 holds the LJ scales, row 1 the electrostatics scales
print("Combining rule:", ethane_top.combining_rule)
print("LJ scaling factor:", ethane_top.scaling_factors[0])
print("Electrostatics scaling factor:", ethane_top.scaling_factors[1])

Combining rule: lorentz
LJ scaling factor: [0.  0.  0.5]
Electrostatics scaling factor: [0.  0.  0.5]


In [9]:
oplsaa = gmso.ForceField("oplsaa")
print("Combining Rule:", oplsaa.combining_rule)
print("Scaling Factors:", oplsaa.scaling_factors)

Combining Rule: geometric
Scaling Factors: {'electrostatics14Scale': 0.5, 'nonBonded14Scale': 0.5}


In [10]:
ethane_top = apply(ethane_top, oplsaa)
ethane_top

<Topology Topology, 8 sites, id: 5541541776>




In [13]:
# After typing
print("Combining rule:", ethane_top.combining_rule)
print("LJ scaling factor:", ethane_top.scaling_factors[0])
print("Electrostatics scaling factor:", ethane_top.scaling_factors[1])

Combining rule: geometric
LJ scaling factor: [0.  0.  0.5]
Electrostatics scaling factor: [0.  0.  0.5]
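
The change in combining rule from `lorentz` to `geometric` affects how cross-interaction Lennard-Jones parameters are built from per-atom values. The sketch below assumes that `lorentz` denotes the usual Lorentz-Berthelot rule (arithmetic mean for sigma, geometric mean for epsilon) and that `geometric` takes the geometric mean of both. The function names and the input numbers are illustrative only and are not taken from the OPLS-AA file.

```python
import math


def geometric(sigma_i, sigma_j, eps_i, eps_j):
    """Geometric mean for both sigma and epsilon."""
    return math.sqrt(sigma_i * sigma_j), math.sqrt(eps_i * eps_j)


def lorentz(sigma_i, sigma_j, eps_i, eps_j):
    """Lorentz-Berthelot: arithmetic mean for sigma, geometric mean for epsilon."""
    return 0.5 * (sigma_i + sigma_j), math.sqrt(eps_i * eps_j)


# Illustrative per-atom parameters (sigma in nm, epsilon in kJ/mol).
sigma_a, eps_a = 0.35, 0.30
sigma_b, eps_b = 0.25, 0.10

sigma_geo, eps_geo = geometric(sigma_a, sigma_b, eps_a, eps_b)
sigma_lor, eps_lor = lorentz(sigma_a, sigma_b, eps_a, eps_b)

print("geometric sigma_ij:", sigma_geo)
print("lorentz   sigma_ij:", sigma_lor)
```

For unequal sigmas the arithmetic mean is always larger than the geometric mean, so the two rules give different cross interactions even with identical per-atom parameters. This is why the topology's combining rule should agree with the force field's.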
