# Introduction to the Trimer Dataset

Before getting into any machine learning analysis,
this notebook introduces the dataset we are working with.
It is a collection of configurations from a molecular dynamics simulation,
showing the melting of a crystal completely surrounded by liquid.
To make the simulations simpler to visualise,
they are conducted in two dimensions.
The molecule I am modelling is composed of
three discs in a rigid configuration,
for which we have found three different crystal structures (or polymorphs),
each with very similar energies.
These crystal structures are denoted p2, p2gg, and pg,
named for the symmetry of the unit cell.

The goal of the machine learning
is to distinguish these crystal structures,
and additionally to develop an approach which can be applied to
any collection of local orderings.
As part of approaching this from a general sense,
the only information about

## Setting up Environment

The `trimer` module within the `src` directory
contains a number of helper functions to
simplify the interaction within the notebook.
To import this module, I add the `src` directory
to the collection of paths searched by Python,
then use the standard import statement.

For the visualisation of the figures
I am using the [bokeh](https://bokeh.pydata.org/en/latest/) library,
which allows for interaction with the resulting figures,
giving both a coarse- and fine-grained view of the configuration.
To display the figures in a Jupyter notebook
we need to call the `output_notebook()` function to configure everything correctly.

In [6]:
# Import project source files
import sys
sys.path.append("../src")
import trimer
import figures

# Set bokeh to output configurations to the notebook
from bokeh.io import output_notebook, show
output_notebook()
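
Appending a relative path works as long as the notebook is run from its own directory. As a minimal sketch of a slightly more defensive variant, the helper below resolves the directory to an absolute path and avoids adding it twice when the cell is re-run. The name `add_to_search_path` is my own for illustration and is not part of the `trimer` module.

```python
import sys
from pathlib import Path


def add_to_search_path(directory):
    """Append a directory to sys.path once, as an absolute path.

    Hypothetical helper for illustration; not part of the project source.
    """
    # Resolve relative to the current working directory of the notebook
    resolved = str(Path(directory).resolve())
    # Re-running the cell should not add the same entry again
    if resolved not in sys.path:
        sys.path.append(resolved)
    return resolved


# Same directory as used in the cell above
src_path = add_to_search_path("../src")
```

The returned path can be printed to check which directory Python will actually search.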

## Visualising the Configurations

The figure below shows the configurations we are using,
where each molecule is coloured according to its orientation.
Colouring by orientation makes
the difference between the three crystal structures noticeable,
with the p2 (left) having two layers facing in opposite directions,
the p2gg (centre) having four layers, and
the pg (right) having two layers with slightly different orientations.
The highly orientationally ordered crystal structures
are visually distinguishable from the surrounding liquid state.
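
To give a feel for why colouring by orientation separates the layers so clearly, here is a minimal sketch of one way to map an orientation angle onto a colour wheel. This is an assumption for illustration only: the mapping used inside `trimer.plot_snapshots` is not shown in this notebook, and the function name `orientation_to_hex` is hypothetical.

```python
import colorsys
import math


def orientation_to_hex(angle):
    """Map an orientation angle in radians to a hex colour on the hue wheel.

    Hypothetical mapping for illustration; not the project's implementation.
    """
    # Wrap the angle into [0, 2*pi) and scale it to a hue in [0, 1)
    hue = (angle % (2 * math.pi)) / (2 * math.pi)
    # Full saturation and brightness so only the hue carries information
    red, green, blue = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return "#{:02x}{:02x}{:02x}".format(
        round(red * 255), round(green * 255), round(blue * 255)
    )


# Two molecules facing in opposite directions
print(orientation_to_hex(0.0), orientation_to_hex(math.pi))
```

With a cyclic mapping like this, molecules facing in opposite directions receive hues on opposite sides of the wheel, so alternating layers show up as alternating bands of colour.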

In [7]:
# Read one snapshot for each of the three crystal structures
crystals = ["p2", "p2gg", "pg"]
snaps = []
for crystal in crystals:
    snaps.append(
        trimer.read_file(
            index=2,
            temperature=0.46,
            pressure=1.00,
            crystal=crystal,
        )
    )
# Plot the three configurations side by side
show(trimer.plot_snapshots(snaps))

While it is relatively simple to
visually distinguish these states as they are presented,
it is much harder to design an algorithm to detect them.
The rest of these notebooks are an investigation of
the most appropriate methods for the detection
of these different local structural orderings.

In [8]:
show(figures.plot_labelled_config(snaps[0]))

In [9]:
snaps = trimer.read_all_files(
    "../data/simulation/dataset/output/",
    index=1,
    pattern="dump-*.gsd",
)
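
The `pattern` argument above looks like a shell-style glob. As a rough sketch of what such a pattern selects, the snippet below lists matching files with `pathlib`. I am assuming glob semantics here; the internals of `trimer.read_all_files` are not shown in this notebook, and `find_trajectories` is a hypothetical name.

```python
from pathlib import Path


def find_trajectories(directory, pattern="dump-*.gsd"):
    """Return the files in a directory whose names match a glob pattern.

    Hypothetical helper for illustration; it only lists files and does not
    read them.
    """
    # Sorting gives a reproducible order, as glob makes no ordering promise
    return sorted(Path(directory).glob(pattern))


# Returns an empty list if the directory does not exist
files = find_trajectories("../data/simulation/dataset/output/")
```

Checking the length of `files` before reading is a quick way to confirm that the pattern and directory match the simulation output.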

In [12]:
show(trimer.plot_frame(snaps[0][1]))