# GMSO - General Molecular Simulation Object

## Atomtyping

[GMSO](https://github.com/mosdef-hub/gmso) provides an engine-agnostic Python object that stores everything needed to write molecular simulation input files.


This notebook covers advanced principles for interfacing with GMSO. The tutorials increase in complexity, as denoted by the `beginners`, `intermediates`, and `experts` tags on the notebooks. For a full list of the examples discussed in these tutorials, please see [the GMSO Tutorials README](README.md).

After completing this notebook, move on to [this notebook](07_for_experts.ipynb) to learn how to convert entire topologies.

This notebook contains examples of:
* Optimizing atomtyping
* Handling failed atomtyping
* Filtering for unique_types
* Sorting a topology


In [None]:
# imports
import copy
import time
import numpy as np
import warnings
warnings.simplefilter('ignore')

import gmso
from gmso.parameterization import apply

## Optimizing atomtyping

In [None]:
def timed_typing(n_repeats, kwargs):
    """Time `apply` on the ethane box `n_repeats` times and print mean +- std."""
    times = []
    for _ in range(n_repeats):
        # reload the topology and force field so every repeat starts untyped
        top = gmso.Topology.load("source_files/ethane-box.json")
        ff = gmso.ForceField("oplsaa")
        # time only the parameterization step, not the loading
        start = time.perf_counter()
        apply(top, {"Compound": ff}, identify_connections=True, **kwargs)
        stop = time.perf_counter()
        times.append(stop - start)
    print(f"{np.mean(times):.2f} +- {np.std(times):.2f} s")

In [None]:
# base atomtyping

n_repeats = 1
timed_typing(n_repeats, kwargs={})

In [None]:
# atomtype by isomorphic graphs

n_repeats = 1
timed_typing(n_repeats, kwargs={"speedup_by_molgraph":True})

In [None]:
# atomtype by molecule tagged to sites

n_repeats = 1
timed_typing(n_repeats, kwargs={"speedup_by_moltag":True})
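The two speedup flags above rely on the same broad idea: a box of many identical molecules only needs one of them typed, and the result can be reused for the rest. The following is a minimal pure-Python sketch of that idea, not GMSO's implementation. Every name in it (`expensive_typing`, `type_all`, the tuple key) is hypothetical, and the tuple of element symbols merely stands in for a molecule graph or a molecule tag.

```python
# Toy illustration only: none of these names come from GMSO.

def expensive_typing(molecule):
    """Stand-in for a costly atomtyping pass over one molecule."""
    expensive_typing.calls += 1
    return [f"{element}_type" for element in molecule]


expensive_typing.calls = 0


def type_all(molecules, reuse=False):
    """Type every molecule, optionally reusing results for identical ones."""
    cache = {}
    typed = []
    for molecule in molecules:
        key = tuple(molecule)  # hypothetical stand-in for a graph or tag match
        if reuse and key in cache:
            typed.append(cache[key])
            continue
        result = expensive_typing(molecule)
        cache[key] = result
        typed.append(result)
    return typed


ethane = ["C", "C", "H", "H", "H", "H", "H", "H"]
box = [list(ethane) for _ in range(100)]

# type each molecule separately
expensive_typing.calls = 0
naive = type_all(box, reuse=False)
naive_calls = expensive_typing.calls

# type one representative and reuse its result
expensive_typing.calls = 0
reused = type_all(box, reuse=True)
reused_calls = expensive_typing.calls

print(naive_calls, reused_calls)  # 100 1
```

Both runs produce the same types; only the number of costly typing passes changes, which is why the gain grows with the number of repeated molecules in the box.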

## Filtering for unique types
It is often important to parse through a topology and count the number of different types being used. However, uniqueness can be defined in a number of different ways. The most commonly used definitions are provided as `PotentialFilters`, but any function can be passed to the `filter_by` argument of the topology types to isolate whatever attributes are to be considered "unique".

In [None]:
from gmso.core.views import PotentialFilters
print(PotentialFilters.all())
top = gmso.Topology.load("source_files/ethane-typed.json")

# filter atomtypes
print(len(list(top.atom_types(filter_by=PotentialFilters.UNIQUE_SORTED_NAMES))))

In [None]:
# filter angle types by unique parameters
unique_angle_types = list(top.angle_types(filter_by=PotentialFilters.UNIQUE_PARAMETERS))
print(unique_angle_types)
print(len(unique_angle_types))
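To make the notion of "uniqueness depends on the key" concrete, here is a small pure-Python sketch. It does not use GMSO: `AngleType`, `unique_by`, and the parameter values are hypothetical and purely illustrative. The assumption is that a filter boils down to a function returning a hashable identifier, with potentials sharing an identifier counted once.

```python
from collections import namedtuple

# Hypothetical stand-in for a typed potential; not a GMSO class.
AngleType = namedtuple("AngleType", ["member_types", "k", "theta_eq"])


def unique_by(potentials, key):
    """Yield the first potential seen for each distinct key value."""
    seen = set()
    for potential in potentials:
        identifier = key(potential)
        if identifier not in seen:
            seen.add(identifier)
            yield potential


# illustrative values only
angle_types = [
    AngleType(("H", "C", "H"), 276.1, 107.8),
    AngleType(("H", "C", "C"), 313.8, 110.7),
    AngleType(("C", "C", "H"), 313.8, 110.7),  # mirror image of the entry above
]

# unique by the member types exactly as written
by_names = list(unique_by(angle_types, key=lambda a: a.member_types))
# unique by parameter values only
by_parameters = list(unique_by(angle_types, key=lambda a: (a.k, a.theta_eq)))
# unique by member types, treating a sequence and its reverse as the same
by_mirrored_names = list(
    unique_by(angle_types, key=lambda a: min(a.member_types, a.member_types[::-1]))
)

print(len(by_names), len(by_parameters), len(by_mirrored_names))  # 3 2 2
```

The same three entries yield three or two "unique" types depending on the key, which is why it is worth choosing the filter deliberately before counting types.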

## Sorting a topology
It is generally useful to write topologies in a consistent order, so the potentials sometimes need to be sorted to make them easy to find. However, this sorting is typically done in the writers. Please look at the `gmso/formats` module to check how the potentials are sorted for any individual engine.

In [None]:
# sort atomtypes by name
pfilter = PotentialFilters.UNIQUE_SORTED_NAMES
sorted_atom_types = sorted(top.atom_types(filter_by=pfilter), key=lambda x: x.name)  # sorted() already returns a list
print(sorted_atom_types)

In [None]:
# sort bondtypes
pfilter = PotentialFilters.UNIQUE_SORTED_NAMES
bond_types = list(top.bond_types(filter_by=pfilter))  # make a list of all unique bond types
bond_types.sort(key=lambda x: sorted(x.member_types))  # order by each bond type's alphabetized member types
print(bond_types)
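The effect of sorting on `sorted(x.member_types)` rather than on the member types directly can be seen with plain tuples. This is a sketch with hypothetical atom type names, not output from the ethane topology: sorting on the alphabetized members places a bond and its mirror image next to each other, whereas a plain sort separates them.

```python
# Hypothetical member-type pairs; the names are for illustration only.
member_types = [
    ("OH", "CT"),
    ("HC", "CT"),
    ("CT", "CT"),
    ("CT", "HC"),
    ("CT", "OH"),
]

# plain lexicographic sort: mirrored pairs end up far apart
plain = sorted(member_types)

# sort on the alphabetized members: mirrored pairs become neighbors
# (list.sort and sorted are stable, so ties keep their original order)
keyed = sorted(member_types, key=lambda pair: sorted(pair))

print(plain)
print(keyed)
```

Because the key ignores the direction in which a pair is written, `("HC", "CT")` and `("CT", "HC")` compare as equal and stay adjacent in the result.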