# GMSO Basics

### _Data structures for the storage of chemical systems and output of syntactically correct datafiles._ 



<p align="center">
    <img src="../graphics/mosdef_graphic_gmso.png" alt="GMSO within the MoSDeF Ecosystem" width="500" height="500"/>
</p>

This tutorial is for people who would like to use GMSO to write out basic simulation structure files.

The final piece of the MoSDeF software stack is `GMSO`, which stands for the General Molecular Simulation Object. 
* The library provides data structures for **storage of chemical systems** 
    * initialized by mBuild and atom typed with Foyer. 
    * converted from other file or data structures
* The package is designed to cover shortcomings observed in other similar chemical structure data objects
     * general support for exotic potentials
* Functionalities to support [TRUE](https://www.tandfonline.com/doi/full/10.1080/00268976.2020.1742938) (Transparent, Reproducible, Usable-by-others, and Extensible) simulations research. 


**_Note_**: At the moment, GMSO is still under development, but we have defined a set of goals and some target features that should be available when the package matures.

_______

## GMSO Features
The following features are accessible:
1. __Supporting a variety of models__ in the molecular simulation/computational
  chemistry community:</br>
  No assumptions are made about an interaction site
  representing an atom or bead, instead these can be atomistic,
  united-atom/coarse-grained, polarizable, and other models! Units are 
  explicitly defined and convertible using the `Unyt` package.</br></br>

1. __Greater flexibility for exotic potentials__: 
</br>The [AtomType](https://github.com/mosdef-hub/gmso/blob/main/gmso/core/atom_type.py) class (and the analogous
  [classes for intramolecular interactions](https://github.com/mosdef-hub/gmso/tree/main/gmso/core)) use [`SymPy`](https://www.sympy.org) to store any
  potential that can be represented by a mathematical expression. </br></br>

1. __Adaptable for new engines__: 
</br>By not being designed for
  compatibility with any particular molecular simulation engine or ecosystem,
  it becomes tractable for developers in the community to add glue for
  engines that are not currently supported. </br></br>

1. __Compatibility with existing community tools__: 
</br>No single molecular simulation
  tool will ever be a silver bullet, so ``GMSO`` includes functions to convert
  between various file formats and libraries. These can be used in their own right to convert between objects in-memory and also to support conversion to file formats not natively supported at any given time. Currently supported conversions include:
    * [`ParmEd`](./gmso/external/convert_parmed.py)
    * [`OpenMM`](./gmso/external/convert_openmm.py)
    * [`mBuild`](./gmso/external/convert_mbuild.py)
    * more in the future! </br></br>

1. __Native support for reading and writing many common file formats__: We natively have support for:
    * [`XYZ`](./gmso/formats/xyz.py)
    * [`GRO`](./gmso/formats/gro.py)
    * [`TOP`](./gmso/formats/top.py)
    * [`LAMMPSDATA`](./gmso/formats/lammpsdata.py)
    * indirect support, through other libraries, for many more!
    


## GMSO ForceField XML
____
A way to compare the utility of using `GMSO` over other structure data classes is to look at the information stored in the XML files of GMSO compared to something like the Foyer XML.

This format makes explicit everything that determines the outcome of applying the forcefield. Nothing is left to assumption:
* The units for all parameters
* The exact mathematical expression for the potentials
* The names used for the expression, valid parameters, and atomtypes that make up the connections

All of these are both machine and human readable. This allows users to store any mathematical description of a chemical topology, and the explicitly defined units and expressions can be used to convert between the styles expected **across engines**.

### **The utility in GMSO is this _explicit_ and _generalized_ storage that provides handling to a broad scope of parameterized chemical descriptions.**

```xml
<?xml version='1.0' encoding='UTF-8'?>
<ForceField name="TIP3P" version="0.0.1"> 
  <!-- Defines units as metadata-->
  <FFMetaData>
    <Units energy="kJ/mol" mass="amu" charge="elementary_charge" distance="nm"/>
  </FFMetaData>
  <!-- Potentials can be grouped together by expression and can have optional names -->
  <AtomTypes expression="4*epsilon * ((sigma/r)**12 - (sigma/r)**6)">
     <!--   Units for parameters are defined in this tag    -->
    <ParametersUnitDef parameter="epsilon" unit="kJ/mol"/>
    <ParametersUnitDef parameter="sigma" unit="nm"/>
    <AtomType name="opls_111" element="O" charge="-0.834" mass="16" definition="O" description="water O">
      <Parameters>
        <Parameter name="epsilon" value="0.636386"/>
        <Parameter name="sigma" value="0.315061"/>
      </Parameters>
    </AtomType>
    <AtomType name="opls_112" element="H" charge="0.417" mass="1.011" definition="H">
      <Parameters>
        <Parameter name="epsilon" value="0.0"/>
        <Parameter name="sigma" value="1.0"/>
      </Parameters>
    </AtomType>
  </AtomTypes>
  <BondTypes expression="0.5 * k * (r-r_eq)**2">
    <ParametersUnitDef parameter="k" unit="kJ/mol/nm**2"/>
    <ParametersUnitDef parameter="r_eq" unit="nm"/>
    <BondType name="BondType-Harmonic-1" type1="opls_111" type2="opls_112">
      <Parameters>
        <Parameter name="k" value="502416.0"/>
        <Parameter name="r_eq" value="0.09572"/>
      </Parameters>
    </BondType>
  </BondTypes>
  <AngleTypes expression="0.5 * k * (theta - theta_eq)**2">
    <ParametersUnitDef parameter="k" unit="kJ/(mol*radian**2)"/>
    <ParametersUnitDef parameter="theta_eq" unit="radian"/>
    <AngleType name="AngleType-Harmonic-1" type1="opls_112" type2="opls_111" type3="opls_112">
      <Parameters>
        <Parameter name="k" value="682.02"/>
        <Parameter name="theta_eq" value="1.824218134"/>
      </Parameters>
    </AngleType>
  </AngleTypes>
</ForceField>
```
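Because units and expressions are stated explicitly, a file in this style can be inspected with nothing more than the Python standard library. The sketch below is not how GMSO itself loads forcefields. It runs `xml.etree.ElementTree` over an abbreviated copy of the TIP3P file above, and the helper name `read_group` is hypothetical, introduced only for illustration.

```python
import math
import xml.etree.ElementTree as ET

# Abbreviated copy of the TIP3P forcefield shown above.
FF_XML = """<ForceField name="TIP3P" version="0.0.1">
  <FFMetaData>
    <Units energy="kJ/mol" mass="amu" charge="elementary_charge" distance="nm"/>
  </FFMetaData>
  <AtomTypes expression="4*epsilon * ((sigma/r)**12 - (sigma/r)**6)">
    <ParametersUnitDef parameter="epsilon" unit="kJ/mol"/>
    <ParametersUnitDef parameter="sigma" unit="nm"/>
    <AtomType name="opls_111" element="O" charge="-0.834" mass="16">
      <Parameters>
        <Parameter name="epsilon" value="0.636386"/>
        <Parameter name="sigma" value="0.315061"/>
      </Parameters>
    </AtomType>
  </AtomTypes>
  <AngleTypes expression="0.5 * k * (theta - theta_eq)**2">
    <ParametersUnitDef parameter="k" unit="kJ/(mol*radian**2)"/>
    <ParametersUnitDef parameter="theta_eq" unit="radian"/>
    <AngleType name="AngleType-Harmonic-1" type1="opls_112" type2="opls_111" type3="opls_112">
      <Parameters>
        <Parameter name="k" value="682.02"/>
        <Parameter name="theta_eq" value="1.824218134"/>
      </Parameters>
    </AngleType>
  </AngleTypes>
</ForceField>"""


def read_group(root, group_tag):
    """Return (expression, {type name: {parameter: (value, unit)}}) for one group.

    Hypothetical helper for illustration; not part of the GMSO API.
    """
    group = root.find(group_tag)
    units = {
        unit_def.get("parameter"): unit_def.get("unit")
        for unit_def in group.findall("ParametersUnitDef")
    }
    entries = {}
    for entry in group:
        if entry.tag == "ParametersUnitDef":
            continue  # unit definitions were collected above
        entries[entry.get("name")] = {
            param.get("name"): (float(param.get("value")), units[param.get("name")])
            for param in entry.iter("Parameter")
        }
    return group.get("expression"), entries


root = ET.fromstring(FF_XML)
default_units = dict(root.find("FFMetaData/Units").attrib)
lj_expression, atom_types = read_group(root, "AtomTypes")
angle_expression, angle_types = read_group(root, "AngleTypes")
theta_eq, theta_unit = angle_types["AngleType-Harmonic-1"]["theta_eq"]

print(default_units)
print(lj_expression)
print(atom_types["opls_111"])
print(f"theta_eq = {math.degrees(theta_eq):.2f} degrees (stored in {theta_unit})")
```

Every value comes back paired with its unit, so a downstream writer never has to guess whether an angle is stored in degrees or radians.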


## GMSO `Topology` Structure
_____
`GMSO`'s goal is to provide a flexible backend framework to store topological information of a chemical system in a reproducible fashion. This is where the parameterized information from the ForceField gets codified to the chemical system of interest.
[**Topology**](https://github.com/mosdef-hub/gmso/blob/main/gmso/core/topology.py) in this case is defined as the information needed to initialize a molecular simulation.
Depending on the type of simulation performed, this ranges from:
* particle positions
* particle connectivity
* box information
* forcefield data
    - potential energy functional forms
    - potential energy parameters
* other optional data
    - particle mass
    - elemental data
    - etc.

`gmso.core.Topology` is the central data class that stores this information. Its hierarchy can be visualized below.
<p align="center">
    <img src="../graphics/GMSO-structure.png?raw=true" alt="GMSO within the MoSDeF Ecosystem" width="1000" height="1000"/>
</p>

## Interconverting Topology using: gmso.formats, gmso.external and gmso.lib
----
Modules `gmso.formats` and `gmso.external` define file writers to different simulation engines and converters to/from external packages respectively.

Module structure for `gmso.formats`:

```
gmso/formats/
├── gro.py
├── gsd.py
├── lammpsdata.py
├── mcf.py
├── top.py
└── xyz.py
```

Currently, we support the following file readers/writers:

|Extension | Engine | Typed or Un-typed? | Internal reader | Internal writer | 
|:---:|:------:|:------------------:|:---------------:|:---------------:|
|.mol2| many       | Un-typed | Available  |-| 
|.xyz | many       | Un-typed | Available | Available |
|.gro | GROMACS    | Un-typed | Available | Available |
|.top | GROMACS    | Typed    | - | Available |
|.gsd | HOOMD-Blue | Typed | - | Available |
|.mcf | Cassandra  | Typed | - | Available |
|.data | LAMMPS | Typed | Available |Available | 
|.json| GMSO | Typed | Available | Available |


The module structure for `gmso.external` is as follows:

```
gmso/external/
├── convert_mbuild.py
├── convert_openmm.py
├── convert_parmed.py
├── convert_networkx.py
└── convert_foyer_xml.py
```

We support conversion to and from a [parmed](https://github.com/ParmEd/ParmEd) `Structure` and an [mbuild](https://mbuild.mosdef.org) `Compound`. Conversion involving [OpenMM](http://openmm.org/) is currently one-way: a gmso `Topology` can be converted to an OpenMM `Topology`, but not the reverse.

In [None]:
# Convert a topology to a Parmed Structure
from gmso.external.convert_parmed import to_parmed
from gmso import Topology
import parmed as pmd

topology = Topology.load("gmso_files/350-waters.gro")

structure = to_parmed(topology, refer_type=False)
assert isinstance(structure, pmd.Structure)
assert len(structure.atoms) == topology.n_sites  # one ParmEd atom per GMSO site
print(structure)

## Forcefield Loading
_____
The `ForceField` class in `GMSO` is defined in `gmso/core/forcefield.py`. Its XML format is an extension of the Foyer XML format and contains the parameters and expressions relevant to the topology. A forcefield XML file can be converted from the original Foyer/OpenMM format or written directly in the more flexible GMSO style.

In [None]:
# converting from a Foyer XML
import os
from foyer.forcefields.forcefields import get_ff_path
from gmso.external.convert_foyer_xml import from_foyer_xml
from gmso  import ForceField

fp = os.path.join(get_ff_path()[0], "xml/oplsaa.xml") #path to prepackaged ff
from_foyer_xml(fp, "gmso_files/oplsaa_gmso.xml" , overwrite=True)
gmso_oplsaa = ForceField("gmso_files/oplsaa_gmso.xml" , strict=False)
list(gmso_oplsaa.__dict__.keys()) #print out attributes

The **forcefield-utilities** package, available on **Conda Forge**, has been developed to improve loading for Foyer forcefields to GMSO. This process can be notably slow for larger forcefields, and the forcefield-utilities package was developed separately to load many different styles of XML quickly. The following example shows how to use this package.

In [None]:
# converting using Forcefield utilities
try:
    import forcefield_utilities as ffutils
except ImportError as err:
    raise ImportError(
        "The package forcefield-utilities is not installed. Please install via\n"
        "    $ conda install -c conda-forge forcefield-utilities"
    ) from err

ffloader = ffutils.FoyerFFs()  # instance to load forcefields
oplsaa = ffloader.load("oplsaa")  # load from the Foyer XMLs or search a local path
gmso_oplsaa = oplsaa.to_gmso_ff()
print(f"The OPLSAA forcefield has {len(gmso_oplsaa.atom_types)} atom types.")

Just like the `Foyer Forcefield` class, the `GMSO ForceField` class can be loaded from and saved to the XML file format for manual parsing and modifications, if need be. This lets users add parameters, validate metadata, and reload for continued use.

In [None]:
# Loading a Native GMSO ForceField
from gmso import ForceField 
from gmso.tests.utils import get_path # library of testable gmso xmls

tip3p_ff = ForceField(get_path('tip3p.xml'))
print(tip3p_ff.atom_types)

In [None]:
# Saving a Native GMSO ForceField
# Create an additional atomtype
from gmso import AtomType
atype = AtomType(name="oplsaa_add", expression=None)

# Load and modify forcefield
from gmso import ForceField 
from gmso.tests.utils import get_path
tip3p_ff = ForceField(get_path('tip3p.xml'))
tip3p_ff.atom_types[atype.name] = atype
tip3p_ff.xml("gmso_files/tip3p_modified.xml") 


## Parameterization (work in progress)
----
Recent efforts have looked to use GMSO not only as a writer for parameterized structures, but also as a place to _parameterize topologies_. This can be done using the **`apply`** utility in `gmso/parameterization/parameterize.py`. For more complex uses of this utility, such as applying forcefields by molecule, see the **ExampleGMSOWorkflow.ipynb** notebook.

In [None]:
import warnings
warnings.simplefilter("ignore", UserWarning)

import mbuild as mb #Build the mBuild compound
ethane = mb.load("CC", smiles=True) #create ethane
ethane.name = "ethane"
mb_compound = mb.fill_box(ethane, n_compounds = 5, density=0.6) # fill a small box

import forcefield_utilities as ffutils #
ffloader = ffutils.FoyerFFs() #load gmso forcefield
ethane_ff = ffloader.load("gmso_files/alkanes.xml").to_gmso_ff()

from gmso.external import from_mbuild
from gmso.parameterization import apply #main utility for applying a forcefield
topology_gmso = from_mbuild(mb_compound) # Create GMSO topology
topology_gmso.identify_connections() # Identify angles and dihedrals
print(f"The topology has {len(topology_gmso.dihedral_types)} parameterized dihedrals")
apply(topology_gmso, ethane_ff, identify_connected_components=True, match_ff_by="molecule",
                  use_molecule_info=False) # apply forcefield to relevant subtops
print(f"The topology has {len(topology_gmso.dihedral_types)} parameterized dihedrals")
help(apply)

## Visualizing Parameterized Topologies
----
Along with the ability to write out to human readable files to validate parameters, it can be useful to see a connected molecule representation of the topology with the applied parameters. This can be especially useful for understanding which parameters are missing. This uses **ipywidgets** and **matplotlib** to interactively select specific sites or connections.

In [None]:
from gmso import Topology
from gmso.formats.networkx import interactive_networkx_atomtypes, interactive_networkx_bonds #or angles, or dihedrals

topology = Topology.load("gmso_files/ethanes.json")
interactive_networkx_atomtypes(topology)

In [None]:
interactive_networkx_bonds(topology)

# GMSO Advanced Supplements
This tutorial is for contributors to GMSO or those who want to customize GMSO functionality. </br> </br>

The following sections explain how GMSO operates internally, using pydantic's `BaseModel` to maximize code reusability. They should help you write custom functions that use the GMSO `Topology` to probe applied potentials, understand unit handling, and write to custom file formats that may not be natively available.

## Module gmso.abc
----------
This module provides the abstract base classes for all other core data structures used in gmso. Our abstract base classes inherit from [pydantic](https://pydantic-docs.helpmanual.io/)'s `BaseModel` class which provides type hints as well as runtime data validation together with out-of-the-box serialization. The module structure is as follows:
```
gmso/abc 
├── gmso_base.py 
├── abstract_site.py 
├── abstract_potential.py 
└── abstract_connection.py 
```


1. [`gmso_base.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/abc/gmso_base.py): Defines the class `GMSOBase`, the base class for all our other classes. It tweaks pydantic's `BaseModel` class to provide `id`-based hashing and injects numpydoc-style docstrings from the fields of the class.
---
2. [`abstract_site.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/abc/abstract_site.py): Defines the `Site` class which provides a basic topology site with following attributes: (a.) name (b.) position (c.) label
---
3. [`abstract_potential.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/abc/abstract_potential.py): Defines the abstract `AbstractPotential` class which is the base class for our `ParametricPotentials` as well as `PotentialTemplates`.
---
4. [`abstract_connection.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/abc/abstract_connection.py): Defines the abstract `Connection` class which is the base class for our `Bond`, `Angle`, `Dihedral` and `Improper` classes.

### Extensibility 
The abstract base classes allow for consistent and systematic expansion of the library. With `gmso.abc`, new data classes can be implemented as needs arise, with common attributes, methods, and type checking built in. 

### Example: Implementing a Bead

The `Bead` class can now be implemented as a subclass of the abstract `Site` class. We can use the existing attributes from the super class like `name`, `position` etc... and define new attributes and methods for `Bead`. The goal is the consolidation of as many universal characteristics of a generic topology site into a base class (`Site`) and tweak its down-stream usage according to the needs of a particular site (like an `Atom` or a `Bead`). Usage of `Site` to create a `Bead` class is shown below:

In [None]:
import warnings
warnings.simplefilter('ignore')
import unyt as u
from pydantic import Field, ValidationError

from gmso.abc.abstract_site import Site


class Bead(Site):
    __base_doc__ = "Basic Bead class inheriting from the Site Class"
    mass_: u.unyt_quantity = Field(
        default=1.0*u.amu,
        description='Mass of the bead'
    )
        
    charge_: u.unyt_quantity = Field(
        default=0.0*u.elementary_charge,
        description='Charge of the bead'
    )
    
    class Config:
        fields = {
            'mass_': 'mass',
            'charge_': 'charge'
        }
        alias_to_fields = {
            'mass': 'mass_',
            'charge': 'charge_'
        }
    
my_bead = Bead()
my_bead.name  # When you inherit, the attribute(field) `name` is injected as the class name(Bead in this case)

# We use pydantic for validation as well, for example if you assign a string to charge by accident :)
try:
    my_bead.charge = 'Some weird charge string'
except ValidationError as e:
    print(e)

In [None]:
# Documentation is injected automatically as well
%pdoc Bead

## Core Classes
----
In `GMSO` we define the following core classes, which inherit from different abstract classes. All of our core classes make use of these [pydantic](https://pydantic-docs.helpmanual.io/) derived subclasses, such as `abstract_site`, `abstract_connection` or `abstract_potential`. The module `gmso.core` is structured and subclassed as follows:

```
gmso/core/                                       subclass    
├── angle.py                 --->                gmso/abc/abstract_connection.Connection
├── angle_type.py            --->                gmso/core/parametric_potential.ParametricPotential 
├── atom.py                  --->                gmso/abc/abstract_site.Site 
├── atom_type.py             --->                gmso/core/parametric_potential.ParametricPotential 
├── bond.py                  --->                gmso/abc/abstract_connection.Connection
├── bond_type.py             --->                gmso/core/parametric_potential.ParametricPotential 
├── box.py                   --->                Python Object
├── dihedral.py              --->                gmso/abc/abstract_connection.Connection
├── dihedral_type.py         --->                gmso/core/parametric_potential.ParametricPotential 
├── element.py               --->                gmso/abc/gmso_base.GMSOBase
├── forcefield.py            --->                Python Object
├── improper.py              --->                gmso/abc/abstract_connection.Connection
├── improper_type.py         --->                gmso/core/parametric_potential.ParametricPotential 
├── parametric_potential.py  --->                gmso/abc/abstract_potential.AbstractPotential
└── topology.py              --->                Python Object
```

##### This hierarchy allows common utilities to be subclassed and populated with default values, such as:
1. The `ParametricPotential` class that stores `SymPy` expressions and parameters.
2. The `Connection` class that stores sites that make up the connection and proper parametric connection type.
3. The `Site` class which gives access to the position, mass, and element/name information important for defining the features of that position in space.
4. Python objects that are at the bottom of the hierarchy 


### Creating core class instances
#### _Sites_
1. [`atom.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/core/atom.py): Defines the class `gmso.core.atom.Atom` which inherits from `gmso.abc.abstract_site.Site` to define an `Atom`.

You can use the `json` package to easily define the format of .json encoding and decoding, making it simple to modify the default [json writer](https://github.com/mosdef-hub/gmso/blob/main/gmso/formats/json.py) for additional topology information. This example shows creating atom sites and using a custom `UnytJsonEncoder` to write out this info to a readable format.

In [None]:
# Use json package to define class attributes to write object to json readable format
import json
import unyt as u
class UnytJsonEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, u.unyt_array):
            return {
                    'array': obj.ravel().tolist(),
                    'unit': str(obj.units)    
            }
        
        return json.JSONEncoder.default(self, obj)

In [None]:
from gmso.core.atom import Atom  #import the Atom class
atom1 = Atom(name='atom1', charge=2.0*u.elementary_charge) #initialize an atom with charge and unyt units
atom2 = Atom(name='atom2', charge=1.0*u.elementary_charge) #initialize an atom with charge and unyt units

# Dumping the model as json is easy; print() keeps the indentation readable
print(json.dumps(atom1.dict(by_alias=True, exclude_unset=True), cls=UnytJsonEncoder, indent=2))
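Reading such a file back is the mirror image of the encoder: `json.loads` accepts an `object_hook` that can rebuild unit-tagged values. The sketch below illustrates the round trip using only the standard library. `Quantity`, `QuantityEncoder`, and `quantity_hook` are hypothetical stand-ins written for this example (a real workflow would rebuild `unyt` quantities instead), and this is not necessarily how GMSO's own json reader is implemented.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Hypothetical stand-in for a unyt quantity: a value plus a unit string."""
    value: float
    unit: str


class QuantityEncoder(json.JSONEncoder):
    """Same pattern as UnytJsonEncoder above, applied to the stand-in class."""

    def default(self, obj):
        if isinstance(obj, Quantity):
            return {"array": [obj.value], "unit": obj.unit}
        return super().default(obj)


def quantity_hook(mapping):
    """Turn {'array': [x], 'unit': '...'} dictionaries back into Quantity objects."""
    if set(mapping) == {"array", "unit"} and len(mapping["array"]) == 1:
        return Quantity(mapping["array"][0], mapping["unit"])
    return mapping


atom = {"name": "atom1", "charge": Quantity(2.0, "elementary_charge")}
text = json.dumps(atom, cls=QuantityEncoder, indent=2)
restored = json.loads(text, object_hook=quantity_hook)

print(text)
print(restored)
```

Keeping the unit string next to the number in the file is what makes the round trip lossless.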

#### _Connections_
1. [`bond.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/core/bond.py): Defines the class `gmso.core.bond.Bond` which inherits from `gmso.abc.abstract_connection.Connection` to define a 2-partner connection between `Atoms`.
---
2. [`angle.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/core/angle.py): Defines the class `gmso.core.angle.Angle` which inherits from `gmso.abc.abstract_connection.Connection` to define a 3-partner connection between `Atoms`.
---
3. [`dihedral.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/core/dihedral.py): Defines the class `gmso.core.dihedral.Dihedral` which inherits from `gmso.abc.abstract_connection.Connection` to define a 4-partner connection (dihedral, i-j-k-l order) between `Atoms`.
---
4. [`improper.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/core/improper.py): Defines the class `gmso.core.improper.Improper` which inherits from `gmso.abc.abstract_connection.Connection` to define a 4-partner connection (improper, central atom first) between `Atoms`.

In [None]:
from gmso import Bond #or Angle, Dihedral, Improper
bond = Bond(connection_members=[atom1, atom2]) # will raise an error if you pass too many sites
bond.connection_members

#### _Potentials_
1. [`parametric_potential.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/core/parametric_potential.py): Defines the class `gmso.core.parametric_potential.ParametricPotential` which inherits from `gmso.abc.abstract_potential.AbstractPotential` to define a `ParametricPotential` class which stores the functional form of a Potential as a `SymPy` expression and parameters of the potential expression as `unyt_quantities`.
---
2. [`atom_type.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/core/atom_type.py): Defines the class `gmso.core.atom_type.AtomType` which inherits from `gmso.core.parametric_potential.ParametricPotential` to describe properties for an `AtomType`.   
---
3. [`bond_type.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/core/bond_type.py): Defines the class `gmso.core.bond_type.BondType` which inherits from `gmso.core.parametric_potential.ParametricPotential` to describe properties for a `BondType`.
---
4. [`angle_type.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/core/angle_type.py): Defines the class `gmso.core.angle_type.AngleType` which inherits from `gmso.core.parametric_potential.ParametricPotential` to describe properties for an `AngleType`.
---
5. [`dihedral_type.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/core/dihedral_type.py) and [`improper_type.py`](https://github.com/mosdef-hub/gmso/blob/3ff3829cb4bc492b41e5e520d26d35c09c5338a4/gmso/core/improper_type.py): Define the classes `gmso.core.dihedral_type.DihedralType` and `gmso.core.improper_type.ImproperType`, which inherit from `gmso.core.parametric_potential.ParametricPotential` to describe properties for a `DihedralType` and an `ImproperType` respectively.

In [None]:
from gmso.core.parametric_potential import ParametricPotential

# Handle potential expression using separate Expression class
new_potential = ParametricPotential(
            name='mypotential',
            expression='a*x+b',
            parameters={
                'a': 1.0*u.g,
                'b': 1.0*u.m
            },
            independent_variables={'x'}
)

try:
    new_potential.independent_variables = 'y'
except ValueError as e:
    print(e)

### Topology Building
Below is an example of manually building a methane topology. These methods are mostly leveraged in the loaders when converting data from a given format to a `gmso.Topology`. We can also see the utility that lets **angles, dihedrals, and impropers be identified** from the bonds in the system.

In [None]:
from gmso import Topology, Atom, Bond
methane_top = Topology(name="methane") #initial topology instance
c = Atom(name='c')
h1 = Atom(name='h1')
h2 = Atom(name='h2')
h3 = Atom(name='h3')
h4 = Atom(name='h4')
ch1 = Bond(connection_members=[c,h1])
ch2 = Bond(connection_members=[c,h2])
ch3 = Bond(connection_members=[c,h3])
ch4 = Bond(connection_members=[c,h4])
methane_top.add_site(c, update_types=False)
methane_top.add_site(h1, update_types=False)
methane_top.add_site(h2, update_types=False)
methane_top.add_site(h3, update_types=False)
methane_top.add_site(h4, update_types=False)
methane_top.add_connection(ch1, update_types=False)
methane_top.add_connection(ch2, update_types=False)
methane_top.add_connection(ch3, update_types=False)
methane_top.add_connection(ch4, update_types=False)

# You can now infer angles, dihedrals and impropers from bonds
print(f"Topology {methane_top.name} has {methane_top.n_angles} angles")   # No angles now
methane_top.identify_connections() # Uses networkx to identify connections using subgraph isomorphism
print(f"Topology {methane_top.name} now has {methane_top.n_angles} angles") 
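To build intuition for what `identify_connections()` has to find, the sketch below enumerates angles and dihedrals from a plain bond list. GMSO itself uses networkx subgraph matching, as noted in the cell above. This stand-in uses simple neighbor enumeration instead, and the function names `build_neighbors`, `find_angles`, and `find_dihedrals` are hypothetical.

```python
from itertools import combinations


def build_neighbors(bonds):
    """Map every site to the set of sites it is bonded to."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def find_angles(bonds):
    """Return (i, j, k) triples where the central site j is bonded to both i and k."""
    neighbors = build_neighbors(bonds)
    angles = []
    for center, partners in neighbors.items():
        for i, k in combinations(sorted(partners), 2):
            angles.append((i, center, k))
    return angles


def find_dihedrals(bonds):
    """Return (i, j, k, l) tuples for every path i-j-k-l along three bonds."""
    neighbors = build_neighbors(bonds)
    dihedrals = []
    for j, k in bonds:  # each bond is a candidate central bond
        for i in sorted(neighbors[j] - {k}):
            for l in sorted(neighbors[k] - {j}):
                if i != l:  # skip three-membered rings
                    dihedrals.append((i, j, k, l))
    return dihedrals


methane_bonds = [("c", "h1"), ("c", "h2"), ("c", "h3"), ("c", "h4")]
ethane_bonds = [
    ("c1", "c2"),
    ("c1", "h1"), ("c1", "h2"), ("c1", "h3"),
    ("c2", "h4"), ("c2", "h5"), ("c2", "h6"),
]

print(len(find_angles(methane_bonds)), len(find_dihedrals(methane_bonds)))
print(len(find_angles(ethane_bonds)), len(find_dihedrals(ethane_bonds)))
```

Methane has a single central atom with four partners, so it has angles but no dihedrals. Ethane adds a C-C bond that every H-C-C-H dihedral passes through.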

## Handling Potential Parameters and Units in GMSO
----

Potential expressions and parameters represent the functional form of any bonded or nonbonded potential in GMSO. At its core, any Potential class (inheriting from `gmso.abc.abstract_potential.AbstractPotential`) is a container for two entities:

1. The expression for the Potential
2. The parameters (i.e. non-free variables) in the Potential expression and their values

We delegate the handling of potential expression, the variables and their values in GMSO to a utility class called `PotentialExpression`, which keeps track of the changes to the independent variables and expression of a potential expression.

In [None]:
from gmso.utils.expression import PotentialExpression

expression = PotentialExpression(
    expression='x*b+c',
    independent_variables={'c'}
)

print(expression.is_parametric)
try:
    expression.independent_variables = 'd' # Will throw an error
except ValueError as e:
    print(e)
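The check that fails above can be pictured as a set comparison between the symbols in the expression and the requested independent variables. GMSO relies on SymPy for this. The sketch below swaps in Python's `ast` module so that it runs with the standard library alone, and both function names are hypothetical. Note that this simplification would also count function names such as `sin` as symbols.

```python
import ast


def free_symbols(expression):
    """Collect every bare name that appears in an arithmetic expression string."""
    tree = ast.parse(expression, mode="eval")
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


def check_independent_variables(expression, independent_variables):
    """Return the remaining symbols (the parameters), or raise if a variable is absent."""
    symbols = free_symbols(expression)
    missing = set(independent_variables) - symbols
    if missing:
        raise ValueError(
            f"Symbol(s) {sorted(missing)} not found in expression '{expression}'"
        )
    return symbols - set(independent_variables)


# 'c' appears in the expression, so 'x' and 'b' are left over as parameters.
parameters = check_independent_variables("x*b+c", {"c"})
print(sorted(parameters))

# 'd' does not appear in the expression, so the check fails.
try:
    check_independent_variables("x*b+c", {"d"})
except ValueError as error:
    print(error)
```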

We can use the `PotentialTemplateLibrary` to parametrize any potential we want. All the parameters are maintained as `unyt_array`s, which makes them easier to work with in any unit system. An example of creating a potential from the `LennardJonesPotential` template is shown below:

In [None]:
import unyt as u
from gmso.core.parametric_potential import ParametricPotential
from gmso.lib.potential_templates import PotentialTemplateLibrary
pt_lib = PotentialTemplateLibrary()
pt_lib.get_available_template_names()

In [None]:
# Look at the expression and variables for the SymPy expression.
display(pt_lib['LennardJonesPotential'].expression, pt_lib['LennardJonesPotential'].independent_variables)

In [None]:
# Create a parameterized potential which can be tagged to a topology
lj_parametrized = ParametricPotential.from_template(
    potential_template=pt_lib['LennardJonesPotential'],
    parameters={
        'sigma' : 1.0 * u.nm,
        'epsilon': 1.0 * u.kJ / u.mol
    }
)

display( # Easily check and validate the values for your variables
    lj_parametrized.expression, 
    pt_lib['LennardJonesPotential'].independent_variables, 
    lj_parametrized.parameters
)
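As a quick numerical check of the Lennard-Jones form, the sketch below evaluates `4*epsilon*((sigma/r)**12 - (sigma/r)**6)`, the expression used in the TIP3P forcefield XML earlier in this tutorial, with the same `sigma` and `epsilon` as the cell above. It uses plain floats, so the units are carried only in comments, and the function name `lennard_jones` is introduced here for illustration.

```python
import math


def lennard_jones(r, sigma, epsilon):
    """Evaluate 4*epsilon*((sigma/r)**12 - (sigma/r)**6) for plain floats."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


sigma = 1.0    # nm
epsilon = 1.0  # kJ/mol

# The potential crosses zero at r = sigma ...
u_at_sigma = lennard_jones(sigma, sigma, epsilon)

# ... and reaches its minimum of -epsilon at r = 2**(1/6) * sigma.
r_min = 2 ** (1 / 6) * sigma
u_at_min = lennard_jones(r_min, sigma, epsilon)

print(f"U(sigma) = {u_at_sigma:.3f} kJ/mol")
print(f"U(r_min) = {u_at_min:.3f} kJ/mol at r_min = {r_min:.4f} nm")
```

These two landmarks give `sigma` and `epsilon` their usual physical reading: `sigma` is the separation at which the potential changes sign, and `epsilon` is the well depth.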

## Viewing unique types in Topology
--------
To facilitate writing to files, it is essential to filter quickly through the typed objects in a topology. However, there can be many different definitions of "unique", so GMSO uses `gmso.core.views.TopologyPotentialView` to filter through all typed objects in the system.

In [None]:
# List of premade ways to filter through potential lists
from gmso.core.views import PotentialFilters
PotentialFilters.all() #These can easily be added to for further filtering

In [None]:
from gmso import Topology

topology = Topology.load("gmso_files/ethanes.json")
msg = lambda x,y: print(f"There are {x} unique types using method {y}.")
msg(len(topology.atom_types),1) # Total number of atomtypes
msg(len(topology.atom_types(filter_by=PotentialFilters.UNIQUE_NAME_CLASS)),2) # Sites with different names
msg(len(topology.atom_types(filter_by=PotentialFilters.UNIQUE_EXPRESSION)),3) # Sites with different expressions
msg(len(topology.bond_types(filter_by=PotentialFilters.UNIQUE_PARAMETERS)),4) # Bonds with different parameters
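The idea behind these filters is that "unique" depends on which key is compared. The sketch below shows that idea with plain dictionaries. The helper `unique_by` and the four illustrative site types are assumptions made for this example, and GMSO's actual `PotentialFilters` may be implemented differently.

```python
def unique_by(potentials, key):
    """Keep the first potential seen for each distinct value of key(potential)."""
    seen = {}
    for potential in potentials:
        seen.setdefault(key(potential), potential)
    return list(seen.values())


LJ = "4*epsilon*((sigma/r)**12 - (sigma/r)**6)"

# One illustrative (not authoritative) type per site: two carbons, two hydrogens.
site_types = [
    {"name": "opls_135", "expression": LJ, "parameters": (("epsilon", 0.276), ("sigma", 0.35))},
    {"name": "opls_135", "expression": LJ, "parameters": (("epsilon", 0.276), ("sigma", 0.35))},
    {"name": "opls_140", "expression": LJ, "parameters": (("epsilon", 0.126), ("sigma", 0.25))},
    {"name": "opls_140", "expression": LJ, "parameters": (("epsilon", 0.126), ("sigma", 0.25))},
]

by_name = unique_by(site_types, key=lambda t: t["name"])
by_expression = unique_by(site_types, key=lambda t: t["expression"])
by_parameters = unique_by(site_types, key=lambda t: t["parameters"])

print(len(site_types), len(by_name), len(by_expression), len(by_parameters))
```

A writer for an engine that needs one line per atom type would filter by name or parameters, while a check that every site shares one functional form would filter by expression.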

## Coming to GMSO
----
`GMSO` is still being improved. Most notably, the writers in `gmso.formats` still need to be filled out to support the many complex ways those files can be written. Leveraging these building blocks will make that process possible. Planned improvements for GMSO are as follows:

1. Add support for new engines and extend features to write out different formats in existing engines. This includes complex potential forms including table styles, unit conversions, and support for converting between compatible potential forms.
</br> </br>
2. Loading pre-parameterized topologies, and mapping those parameters to extended forms of that topology. For example, this may look something like creating a long protein object from already parameterized residues.
</br></br>
3. The ability to modify ForceField files with additional parameters, and keep track of these changes through automated hashing that can be passed to utilities to check for the list of differences between two forcefields. This will be notably useful for large forcefields, such as the OPLS-AA found in Foyer.
</br></br>
4. The ability to generate SMARTS identifiers for local chemical environments to facilitate the adding of these parameters.