## dev cells

In [None]:
from ase.visualize.plot import plot_atoms

from autoadsorbate import Fragment, Surface

## autoadsorbate

Initial structures for heterogeneous catalysis are traditionally generated by hand. This package aims to offer an automated alternative.

To effectively simulate reactive behavior at surfaces, it is crucial to establish clear definitions within our framework. The following definitions are essential in order to accurately characterize the structures of interest:

- __Fragment__: 
    - <font color='red'>Molecules</font> - species that exist in their corresponding geometries __even when isolated from the surface__.
    - <font color='red'>Reactive species</font> - species that exist in their corresponding geometries __only when attached to the surface__.
- __Surface__:
    - The definition of the surface is simple - <font color='red'>every atom of the slab that can be in contact with an intermediate is considered a surface atom</font>. The surface is a collection of such atoms.
    - Every atom of the surface is a "top" site.
    - When two "top" sites are spatially close to each other, they form a "bridge" site.
    - When three "top" sites are spatially close to each other, they form a "3-fold" site.
    - etc.
- __Active Site__:
    - A collection of one or more sites that can facilitate a chemical transformation is called an active site.
    - A "top" site can be an active site only for Eley-Rideal transformations.
    - All other transformations require that at least one intermediate binds through at least two sites. All involved sites compose an active site.
- __Intermediate__:
    - Intermediates are fragments bound to an active site.
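
The site definitions above can be illustrated with a small, purely geometric sketch. This is not how autoadsorbate detects sites internally; it only assumes that "close" means "within a distance cutoff" and it ignores periodic boundary conditions. The function name `enumerate_sites` and the `cutoff` parameter are hypothetical.

```python
from itertools import combinations
from math import dist


def enumerate_sites(surface_positions, cutoff):
    """Group surface atoms into top, bridge and 3-fold sites (illustrative only).

    Assumption: two atoms are "close" when their distance is <= cutoff.
    Periodic images are not considered.
    """
    n = len(surface_positions)

    def close(i, j):
        return dist(surface_positions[i], surface_positions[j]) <= cutoff

    return {
        # every surface atom is a "top" site
        "top": [(i,) for i in range(n)],
        # every close pair of atoms is a "bridge" site
        "bridge": [pair for pair in combinations(range(n), 2) if close(*pair)],
        # every triple of mutually close atoms is a "3-fold" site
        "3-fold": [
            triple
            for triple in combinations(range(n), 3)
            if all(close(i, j) for i, j in combinations(triple, 2))
        ],
    }


# Three atoms forming a near-equilateral triangle (about 2.5 Å sides) and one distant atom.
positions = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (1.25, 2.165, 0.0), (10.0, 0.0, 0.0)]
sites = enumerate_sites(positions, cutoff=2.6)
print({kind: len(found) for kind, found in sites.items()})
```

The distant fourth atom is still a "top" site, but it takes part in no "bridge" or "3-fold" site.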

<!-- ### basic imports -->

The idea is to keep the package as light as possible, so it is built on ase and rdkit, along with some basic Python packages (pandas, numpy, etc.).

In [None]:
# from autoadsorbate.autoadsorbate import Surface, Fragment
# from ase.io import read, write
# from ase.visualize import view
# from ase.visualize.plot import plot_atoms
# import matplotlib.pyplot as plt

### Fragment

#### Molecules

In [None]:
f = Fragment(smile="COC", to_initialize=5)

In [None]:
from autoadsorbate import docs_plot_conformers

conformer_trajectory = f.conformers
fig = docs_plot_conformers(conformer_trajectory)

Notice that the orientation of the fragment is arbitrary. We could simply paste these structures on a surface of some material, but it would be difficult to quantify the quality of the initial random guesses and hence how many structures we need to sample. We would then have to run dynamic simulations to probe for local minima and check which minima are the most stable.

In the case of DME (dimethyl ether), we can use our knowledge of chemistry to simplify the problem. Since the O atom bridging the two methyl groups has two lone electron pairs, we can use a simple trick: replace one of the lone pairs with a marker atom (let's use Cl).


Notice that we had to make two adjustments to the SMILES string:
- To replace the lone pair with a marker, we must "trick" the valence of the O atom and reshuffle the SMILES string so that the marker comes first (for easy bookkeeping):
    - ```COC``` original
    - ```CO(Cl)C``` add Cl in place of the O lone pair (this is an invalid SMILES)
    - ```C[O+](Cl)C``` trick to make the valence work
    - ```Cl[O+](C)C``` rearrange so that the SMILES string starts with the marker (for easy bookkeeping)

This can also be done with a function:

In [None]:
from autoadsorbate import get_marked_smiles

marked_smile = get_marked_smiles(["COC"])[0]
marked_smile

In [None]:
f = Fragment(smile="Cl[O+](C)(C)", to_initialize=5)
len(f.conformers)

In [None]:
from autoadsorbate import docs_plot_conformers

conformer_trajectory = f.conformers
fig = docs_plot_conformers(conformer_trajectory)

Now we can use the marker atom to orient our molecule:

In [None]:
from autoadsorbate import docs_plot_sites

oriented_conformer_trajectory = [f.get_conformer(i) for i, _ in enumerate(f.conformers)]
fig = docs_plot_conformers(oriented_conformer_trajectory)
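
To make the idea of marker-based orientation concrete, here is a minimal numpy sketch. It is not the implementation behind `get_conformer`. It assumes that atom 0 is the marker, atom 1 is the binding atom, and that "oriented" means the marker sits at the origin with the marker-to-binding-atom vector pointing along +z. The helper names `rotation_between` and `orient_by_marker` are hypothetical.

```python
import numpy as np


def rotation_between(a, b):
    """Return the 3x3 matrix that rotates the direction of a onto the direction of b."""
    a = np.asarray(a, dtype=float) / np.linalg.norm(a)
    b = np.asarray(b, dtype=float) / np.linalg.norm(b)
    v = np.cross(a, b)
    c = float(np.dot(a, b))
    if np.isclose(c, 1.0):
        # already aligned
        return np.eye(3)
    if np.isclose(c, -1.0):
        # antiparallel: rotate by 180 degrees about any axis perpendicular to a
        helper = np.array([1.0, 0.0, 0.0]) if abs(a[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        axis = np.cross(a, helper)
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    # Rodrigues' formula written with the cross-product matrix of v
    vx = np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + vx @ vx / (1.0 + c)


def orient_by_marker(positions, marker=0, anchor=1):
    """Move the marker to the origin and point the marker->anchor vector along +z."""
    pos = np.asarray(positions, dtype=float)
    shifted = pos - pos[marker]
    rot = rotation_between(shifted[anchor], [0.0, 0.0, 1.0])
    # positions are row vectors, so apply the rotation as a right-multiplication
    return shifted @ rot.T


# Toy coordinates: marker, binding atom, one further atom.
toy = [[1.0, 1.0, 1.0], [2.0, 1.0, 1.0], [3.0, 2.0, 1.0]]
oriented = orient_by_marker(toy)
print(np.round(oriented, 3))
```

Because the transformation is a rigid translation plus rotation, all interatomic distances within the fragment are preserved.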

We can also easily remove the marker:

In [None]:
clean_conformer_trajectory = [atoms[1:] for atoms in oriented_conformer_trajectory]
fig = docs_plot_conformers(clean_conformer_trajectory)

#### Reactive species 

##### Methoxy

In [None]:
f = Fragment(smile="ClOC", to_initialize=5)
oriented_conformer_trajectory = [f.get_conformer(i) for i, _ in enumerate(f.conformers)]
fig = docs_plot_conformers(oriented_conformer_trajectory)

##### Methyl

In [None]:
f = Fragment(smile="ClC", to_initialize=5)
oriented_conformer_trajectory = [f.get_conformer(i) for i, _ in enumerate(f.conformers)]
fig = docs_plot_conformers(oriented_conformer_trajectory)

##### Fragments with more than one binding mode (e.g. 1,2-PDO)

Bound through a single site:

In [None]:
f = Fragment(smile="Cl[OH+]CC(O)C", to_initialize=5)
oriented_conformer_trajectory = [f.get_conformer(i) for i, _ in enumerate(f.conformers)]
fig = docs_plot_conformers(oriented_conformer_trajectory)

Coordinated through both hydroxyl groups:

In [None]:
f = Fragment(smile="S1S[OH+]CC([OH+]1)C", to_initialize=5)
oriented_conformer_trajectory = [f.get_conformer(i) for i, _ in enumerate(f.conformers)]
fig = docs_plot_conformers(oriented_conformer_trajectory)
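
The examples in this section follow a visible pattern: a leading `Cl` marks a fragment bound through one site, and a leading `S1S` ring marks a fragment bound through two sites. The helper below is only a bookkeeping sketch based on that pattern; `count_binding_sites` is a hypothetical name and not part of autoadsorbate, and it only inspects the start of the string.

```python
def count_binding_sites(surrogate_smiles):
    """Guess the number of binding sites from the leading marker of a surrogate SMILES.

    Assumption (taken from the examples in this notebook): the marker always
    comes first, "Cl" for a single site and "S1S" for two sites.
    """
    if surrogate_smiles.startswith("S1S"):
        return 2
    if surrogate_smiles.startswith("Cl"):
        return 1
    # no marker at the start: treat the string as an unmarked molecule
    return 0


for smi in ["COC", "Cl[OH+]CC(O)C", "S1S[OH+]CC([OH+]1)C"]:
    print(smi, count_binding_sites(smi))
```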

### Surface

First we need a slab (an arrangement of atoms that contains the boundary between the material in question and another phase, i.e. gas, fluid, or another material). We can read one we prepared earlier (```ase.io.read('path_to_file')```), or we can use ase to construct a new slab:

In [None]:
from ase.build import fcc111

slab = fcc111("Cu", (4, 4, 4), periodic=True, vacuum=10)

Now we can initialize the Surface object, which associates the constructed slab (ase.Atoms) with the additional information required for placing Fragments.
We can view which atoms are in the surface:

In [None]:
s = Surface(slab)
plot_atoms(s.view_surface(return_atoms=True))

We have access to all the sites info as a pandas dataframe:

In [None]:
s.site_df.head()

or in dict form:

In [None]:
s.site_dict.keys()

Each site is also available as an ase.Atoms object, with useful information stored in ase.Atoms.info:

In [None]:
site_atoms = s.view_site(0, return_atoms=True)
site_atoms.info

In [None]:
fig = docs_plot_sites(s)

We can keep only the symmetry-unique sites like this:

In [None]:
s.sym_reduce()
s.site_df

In [None]:
plot_atoms(s.view_surface(return_atoms=True))

## Making surrogate SMILES automatically

In [None]:
from autoadsorbate import _example_config

_example_config

In [None]:
from autoadsorbate import construct_smiles

config = {
    "backbone_info": {"C": 0, "O": 0, "N": 2},
    "allow_intramolec_rings": True,
    "ring_marker": 2,
    "side_chain": ["(", ")"],
    "brackets": ["[", "]", "H+]", "H2+]", "H3+]"],
    "make_labeled": True,
}

smiles = construct_smiles(config)

In [None]:
smiles

In [None]:
from autoadsorbate import Fragment

# Build one oriented conformer per surrogate SMILES; skip strings that fail.
trj = []
for smi in smiles:  # `smi` avoids overwriting the Surface object `s` defined earlier
    try:
        f = Fragment(smi, to_initialize=1)
        a = f.get_conformer(0)
        trj.append(a)
    except Exception:
        pass

# Sort the structures by chemical formula.
trj.sort(key=lambda atoms: atoms.get_chemical_formula())
len(trj)

In [None]:
from autoadsorbate import get_drop_snapped

xtrj = get_drop_snapped(trj, d_cut=1.5)
len(xtrj)

In [None]:
import matplotlib.pyplot as plt
from ase import Atoms
from ase.visualize.plot import plot_atoms

fig, axs = plt.subplots(3, 11, figsize=[10, 5], dpi=100)

for i, ax in enumerate(axs.flatten()):
    try:
        platoms = xtrj[i].copy()
    except IndexError:
        # Fewer structures than panels: fill the remaining axes with a dummy atom.
        platoms = Atoms("X", positions=[[0, 0, 0]])

    for atom in platoms:
        if atom.symbol in ["Cl", "S"]:
            atom.symbol = "Ga"
    plot_atoms(platoms, rotation=("-90x,0y,0z"), ax=ax)
    ax.set_axis_off()
    ax.set_xlim(-1, 5)
    ax.set_ylim(-0.5, 5.5)

fig.set_layout_engine(layout="tight")

## Fully automatic - populate Surface with Fragment

In [None]:
from ase.build import fcc211

from autoadsorbate import Fragment, Surface

slab = fcc211(symbol="Cu", size=(6, 3, 3), vacuum=10)
s = Surface(slab, touch_sphere_size=2.7)
s.sym_reduce()

fragments = [
    Fragment("S1S[OH+]CC(N)[OH+]1", to_initialize=20),
    Fragment("Cl[OH+]CC(=O)[OH+]", to_initialize=5),
]

out_trj = []
for fragment in fragments:
    out_trj += s.get_populated_sites(
        fragment,
        site_index="all",
        sample_rotation=True,
        mode="heuristic",
        conformers_per_site_cap=5,
        overlap_thr=1.6,
        verbose=True,
    )
    print("out_trj ", len(out_trj))
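
The `overlap_thr` argument suggests that placements with atoms too close together are rejected. The sketch below shows one plausible form of such a check. It assumes that the threshold is a minimum allowed adsorbate-to-slab distance in Å, which the text above does not state, and it ignores periodic images. The function `has_overlap` is hypothetical and not part of autoadsorbate.

```python
import numpy as np


def has_overlap(adsorbate_positions, slab_positions, overlap_thr=1.6):
    """Return True if any adsorbate atom is closer than overlap_thr to any slab atom.

    Assumption: overlap_thr is a plain distance cutoff; periodic boundary
    conditions are not taken into account.
    """
    ads = np.asarray(adsorbate_positions, dtype=float)
    slab = np.asarray(slab_positions, dtype=float)
    # pairwise distance matrix of shape (n_adsorbate, n_slab)
    distances = np.linalg.norm(ads[:, None, :] - slab[None, :, :], axis=-1)
    return bool((distances < overlap_thr).any())


slab_xyz = [[0.0, 0.0, 0.0], [2.5, 0.0, 0.0]]
well_placed = [[0.0, 0.0, 2.0], [0.0, 0.0, 3.1]]  # closest contact is 2.0 Å
too_close = [[0.0, 0.0, 1.0]]  # 1.0 Å above a slab atom

print(has_overlap(well_placed, slab_xyz))
print(has_overlap(too_close, slab_xyz))
```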