This notebook computes a descriptor for a combination of two "center" atoms.

# Create a test system

In [None]:
import ase
import numpy as np

atoms = ase.Atoms("SSNO", positions=[[0, 0, 0], [0, 0, 0.1], [0, 0, 1], [0, 0, 2]])
frames = [atoms]

# Common Hyperparameters

In [None]:
r_cut = 4
n_max = 12
l_max = 6
sigma = 0.3

In [None]:
# These lists describe a single frame for now. For multiple frames we can assume
# a nested list of lists, with one inner list per frame.

list_S = [1, 2]  # list of all indices we label as "start" atom
list_M = [2, 3]  # list of all indices we label as "middle" atom
list_E = [3, 1]  # list of all indices we label as "end" atom

assert len(list_S) == len(list_M)
assert len(list_S) == len(list_E)
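
The three lists are read position by position: entry `k` of each list together defines one start/middle/end triple. The sketch below makes that pairing explicit. The `symbols` list is hard-coded here to mirror the `"SSNO"` test system so the snippet runs without `ase`; `triples` and `triple_symbols` are illustrative names, not part of the notebook's code.

```python
# Index lists copied from the cell above.
list_S = [1, 2]  # "start" atoms
list_M = [2, 3]  # "middle" atoms
list_E = [3, 1]  # "end" atoms

# Chemical symbols of the "SSNO" test system, hard-coded for illustration.
symbols = ["S", "S", "N", "O"]

# Entry k of each list forms one (start, middle, end) triple.
triples = list(zip(list_S, list_M, list_E))

# Translate atom indices into chemical symbols for readability.
triple_symbols = [tuple(symbols[i] for i in triple) for triple in triples]

for idx, sym in zip(triples, triple_symbols):
    print(idx, sym)
```

For several frames, the same pairing would apply to each inner list separately, under the nested-list assumption stated in the comment above.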

# `dscribe` descriptor

For reference, we calculate a SOAP descriptor using the `dscribe` library.

In [None]:
from dscribe.descriptors import SOAP

soaper = SOAP(
    r_cut=r_cut,
    n_max=n_max,
    l_max=l_max,
    sigma=sigma,
    sparse=False,
    species=["S", "O", "N"],
)

As `centers` we use our chosen "start" atoms.

In [None]:
soap_water = soaper.create(frames[0], centers=list_S)
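
As a sanity check on the size of the reference descriptor, the number of power spectrum invariants can be counted by hand. This is a sketch that assumes an uncompressed power spectrum in which symmetric (species, radial) channel pairs are stored only once. Under that assumption the result should agree with `soaper.get_number_of_features()`, but this has not been verified here.

```python
# Hyperparameters copied from the cells above.
n_species = 3  # "S", "O", "N"
n_max = 12
l_max = 6

# Each (species, radial) combination is one density channel.
n_channels = n_species * n_max

# Unordered pairs of channels, including a channel paired with itself.
n_channel_pairs = n_channels * (n_channels + 1) // 2

# One invariant per channel pair and per angular momentum l = 0..l_max.
n_features = n_channel_pairs * (l_max + 1)

print(n_channels, n_channel_pairs, n_features)
```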

# Pair descriptor

Now we compute the pair descriptor using the "start" and "end" atoms as centers.

The code for the descriptor calculations is extracted from

https://github.com/curiosity54/mlelec

and uses [rascaline](https://luthaf.fr/rascaline/latest/index.html) and
[metatensor](https://lab-cosmo.github.io/metatensor/latest/index.html) as backend
libraries. See their explanations and how-to guides to learn more about the syntax
used below.

We start by importing the code from the [utils](utils) folder.

In [None]:
from utils.acdc import pair_features

In [None]:
hypers = {
    "cutoff": r_cut,
    "max_radial": n_max,
    "max_angular": l_max,
    "atomic_gaussian_width": sigma,
    "center_atom_weight": 1,
    "radial_basis": {"Gto": {}},
    "cutoff_function": {"ShiftedCosine": {"width": 0.1}},
}

We can specify a larger cutoff, as below, to find pairs that are much further apart than
the cutoff used for describing local densities, as in SOAP.

In [None]:
hypers_pair = hypers.copy()
hypers_pair["cutoff"] = 10
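
One detail worth keeping in mind: `dict.copy()` is a shallow copy. Reassigning a top-level key such as `"cutoff"` is safe, but nested dictionaries such as `"cutoff_function"` are shared between the two objects. The sketch below uses a trimmed-down stand-in for `hypers` to show the difference and how `copy.deepcopy` avoids it; the variable names `shallow` and `deep` are illustrative only.

```python
import copy

# Trimmed-down stand-in for the `hypers` dictionary defined above.
hypers = {
    "cutoff": 4,
    "cutoff_function": {"ShiftedCosine": {"width": 0.1}},
}

# Shallow copy: rebinding a top-level key leaves the original untouched.
shallow = hypers.copy()
shallow["cutoff"] = 10
print(hypers["cutoff"])

# The nested dictionary is the very same object in both.
shares_nested = shallow["cutoff_function"] is hypers["cutoff_function"]
print(shares_nested)

# Deep copy: nested dictionaries can be edited independently.
deep = copy.deepcopy(hypers)
deep["cutoff_function"]["ShiftedCosine"]["width"] = 0.5
print(hypers["cutoff_function"]["ShiftedCosine"]["width"])
```

Since only `"cutoff"` is changed in this notebook, the shallow copy is sufficient here; a deep copy would only be needed if nested entries were to differ between the two sets of hyperparameters.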

The pair feature combines a local feature $\rho_i^{\otimes \nu}$ (SOAP, for example,
corresponds to $\nu=2$) with a pair term $g_{ij}$, giving the expression
$\rho_i^{\otimes \nu} \otimes g_{ij}$. We usually use $\nu=1$ so that the feature
resulting from the tensor product has SOAP-like behavior. One can also create
a pair feature of the form $\rho_i^{\otimes \nu} \otimes g_{ij} \otimes \rho_j^{\otimes
\nu}$ (for $\nu=1$, this is similar in dimensions to the bispectrum).

Below we define `both_centers`, which controls whether we compute the pair feature as
$\rho_i^{\otimes \nu} \otimes g_{ij}$ (when `False`) or $\rho_i^{\otimes \nu} \otimes
g_{ij} \otimes \rho_j^{\otimes \nu}$ (when `True`). The latter is more informative, as it
includes local environment information on both atoms, but it is also more costly to compute.
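
To get a feel for the cost difference, here is a toy sketch of how the feature dimension grows with each tensor product. It uses plain outer products of random vectors and ignores any angular-momentum coupling, so it is not what `pair_features` computes internally. The sizes `n_rho` and `n_g` are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

n_rho = 4  # placeholder size of a nu=1 density feature
n_g = 3    # placeholder size of the pair term g_ij

rho_i = rng.normal(size=n_rho)
rho_j = rng.normal(size=n_rho)
g_ij = rng.normal(size=n_g)

# One-center pair feature: rho_i (x) g_ij
one_center = np.einsum("a,b->ab", rho_i, g_ij).ravel()

# Two-center pair feature: rho_i (x) g_ij (x) rho_j
two_center = np.einsum("a,b,c->abc", rho_i, g_ij, rho_j).ravel()

print(one_center.size, two_center.size)
```

In this toy setting the two-center feature is `n_rho` times larger than the one-center one, which illustrates why `both_centers=True` is more expensive.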

In [None]:
both_centers = False

If `all_pairs` is `True`, the cutoff is reset so that the resulting environment
captures all pairs in the system.

In [None]:
all_pairs = False

In [None]:
frames

We now compute the pair descriptor. Note that if the parameter `hypers_pair` is not given
explicitly, the values from `hypers` are used instead.

In [None]:
center_indices = list_S
neighbor_indices = list_E

rhoij_SE = pair_features(
    frames=frames,
    hypers=hypers,
    hypers_pair=hypers_pair,
    center_indices=center_indices,
    neighbor_indices=neighbor_indices,
    cg=None,
    order_nu=1,
    both_centers=both_centers,
    lcut=0,
)

We can also calculate the pair features for the start and middle atoms.

In [None]:
center_indices = list_S
neighbor_indices = list_M

rhoij_SM = pair_features(
    frames=frames,
    hypers=hypers,
    hypers_pair=hypers_pair,
    center_indices=center_indices,
    neighbor_indices=neighbor_indices,
    cg=None,
    order_nu=1,
    both_centers=both_centers,
    lcut=0,
)

`order_nu` specifies what kind of local densities to combine to create pair features.

Here we set `lcut=0` so that the resulting features are always scalar (that is, indexed by
`spherical_harmonics=0`). **CAUTION: you might want to change this value when computing
features with `both_centers=True` or when using these features to learn non-scalar
properties. A reasonable value is $\sim 3$ or $4$.**

Now we can use these features straight away in a linear or kernel model.

In [None]:
rhoij_SE[0].values

In [None]:
rhoij_SE[1].values
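
As a rough illustration of the linear-model route, here is a minimal ridge regression in plain NumPy. The feature matrix `X` is synthetic random data standing in for a block's `.values`, assuming those values can be reshaped to `(n_samples, n_features)`. The targets `y`, the regularization strength `alpha`, and all variable names are placeholders for illustration, not results from this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a 2D feature matrix (n_samples, n_features).
X = rng.normal(size=(20, 5))

# Synthetic targets generated from a known linear relation.
w_true = rng.normal(size=5)
y = X @ w_true

# Ridge regression via the regularized normal equations.
alpha = 1e-8  # placeholder regularization strength
w = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

y_pred = X @ w
print(np.abs(y_pred - y).max())
```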

We could also stack the SE and SM values for the same S atom, if needed.
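
A minimal sketch of such stacking, using small placeholder arrays instead of the actual feature values. It assumes that both arrays are 2D and that row `k` refers to the same start atom in each, which should be checked against the sample labels before doing this with real features.

```python
import numpy as np

# Placeholder feature arrays: one row per start atom, columns are features.
features_SE = np.arange(6, dtype=float).reshape(2, 3)  # start-end pairs
features_SM = np.arange(8, dtype=float).reshape(2, 4)  # start-middle pairs

# Rows must line up: row k of both arrays belongs to the same start atom.
assert features_SE.shape[0] == features_SM.shape[0]

# Concatenate along the feature axis.
features_stacked = np.hstack([features_SE, features_SM])

print(features_stacked.shape)
```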