# Many-Body Tensor Representation (MBTR)

MBTR is a global descriptor for a molecule or unit cell. It eliminates rotational, translational, and permutational variance by forming tensors from combinations of one to four elements, called K1, K2, K3, and K4. Each combination has an associated Gaussian-smeared histogram, which is exponentially weighted for K2 and above.

For MBTR, we use the [describe package](https://github.com/SINGROUP/describe), developed by [Surfaces and Interfaces at the Nanoscale, Aalto](http://physics.aalto.fi/en/groups/sin/).

## The Tensor
The tensor comprises combinations of different numbers of elements: K1 covers single elements, K2 covers pairs of elements, and so on. Each K term is a different expression of the molecule/unit cell.

### K1
K1 represents the Gaussian-smeared histogram of the **counts** of each element type; unlike the higher-order terms, it is not weighted. In essence, it is a matrix of size MxN, where M is the number of elements and N is the number of bins.

![K1](./images/k1.png)
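As a rough illustration of the K1 idea, the sketch below places one Gaussian per atom on an atomic-number axis and sums them per element. It uses only the standard library and is not the describe package's implementation; the function name `gaussian_smeared_counts`, the endpoint-inclusive grid, and the unit-area normalisation of each Gaussian are assumptions made for this example.

```python
import math

def gaussian_smeared_counts(atomic_numbers, element, grid_min, grid_max, n_bins, sigma):
    """Return (axis, values): one unit-area Gaussian per atom of `element`,
    centred on its atomic number (hypothetical helper, not describe's API)."""
    step = (grid_max - grid_min) / (n_bins - 1)
    axis = [grid_min + i * step for i in range(n_bins)]
    count = sum(1 for z in atomic_numbers if z == element)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    values = [count * norm * math.exp(-0.5 * ((x - element) / sigma) ** 2) for x in axis]
    return axis, values

# Conventional NaCl cell: four Na (Z=11) and four Cl (Z=17) atoms.
cell_numbers = [11, 17, 11, 17, 11, 17, 11, 17]
k1_axis, k1_na = gaussian_smeared_counts(cell_numbers, 11, 10.0, 18.0, 200, 0.1)
_, k1_cl = gaussian_smeared_counts(cell_numbers, 17, 10.0, 18.0, 200, 0.1)

# Stacking the per-element rows gives the M x N layout described above (M=2, N=200).
k1 = [k1_na, k1_cl]
k1_step = k1_axis[1] - k1_axis[0]
print("Area under the Na row: {:.3f}".format(sum(k1_na) * k1_step))
```

With this normalisation the area under each row approximates the number of atoms of that element, which is one way to read "counts" from the smeared curve.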

### K2
K2 represents the Gaussian-smeared, exponentially weighted histogram of the **inverse distances** between pairs of elements. This is a tensor of size MxMxN, where M is the number of elements and N is the number of bins.

![K2](./images/k2.png)
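The quantity that K2 histograms can be sketched for a small, non-periodic fragment. The helper `pair_inverse_distances` and the four-atom fragment below are hypothetical and chosen only for illustration; periodic images, smearing, and weighting are left out.

```python
import math
from itertools import combinations

def pair_inverse_distances(symbols, positions, pair):
    """Inverse distances 1/d for every atom pair whose element types match
    `pair`, ignoring order (hypothetical helper, not describe's API)."""
    wanted = sorted(pair)
    result = []
    for i, j in combinations(range(len(symbols)), 2):
        if sorted((symbols[i], symbols[j])) == wanted:
            result.append(1.0 / math.dist(positions[i], positions[j]))
    return result

# One square of alternating Na and Cl atoms cut from the NaCl lattice (a in angstrom).
a = 5.6402
frag_symbols = ["Na", "Cl", "Na", "Cl"]
frag_positions = [(0.0, 0.0, 0.0), (a / 2, 0.0, 0.0), (a / 2, a / 2, 0.0), (0.0, a / 2, 0.0)]

inv_na_cl = pair_inverse_distances(frag_symbols, frag_positions, ("Na", "Cl"))
inv_na_na = pair_inverse_distances(frag_symbols, frag_positions, ("Na", "Na"))
print("Na-Cl pairs:", len(inv_na_cl), "Na-Na pairs:", len(inv_na_na))
```

In this fragment the four Na-Cl pairs sit at the nearest-neighbour distance a/2, so they all share the inverse distance 2/a, while the single Na-Na pair lies across the diagonal.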

### K3
K3 represents the Gaussian-smeared, exponentially weighted histogram of the **angles between triplets** of elements. This is a tensor of size MxMxMxN, where M is the number of elements and N is the number of bins.

![K3](./images/k3.png)
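For K3, the geometric quantity is an angle defined by three atoms; the example later in this tutorial plots it as a cosine between -1 and 1. A minimal sketch of that cosine follows. The function name `cos_angle` is hypothetical, and treating the middle atom as the vertex is an assumption rather than a statement about the describe package's index convention.

```python
import math

def cos_angle(p_i, p_j, p_k):
    """Cosine of the angle at p_j formed by the atoms p_i - p_j - p_k."""
    v1 = [x - y for x, y in zip(p_i, p_j)]
    v2 = [x - y for x, y in zip(p_k, p_j)]
    dot = sum(x * y for x, y in zip(v1, v2))
    return dot / (math.hypot(*v1) * math.hypot(*v2))

half = 5.6402 / 2  # Na-Cl nearest-neighbour distance in angstrom

# Cl-Na-Cl with the two Cl atoms on perpendicular axes: a right angle.
cos_right = cos_angle((half, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, half, 0.0))

# Cl-Na-Cl with the two Cl atoms on opposite sides of Na: a straight line.
cos_straight = cos_angle((-half, 0.0, 0.0), (0.0, 0.0, 0.0), (half, 0.0, 0.0))
print(cos_right, cos_straight)
```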

*Note: the describe package implements MBTR only up to K3.*

### K4 
K4 represents the Gaussian-smeared, exponentially weighted histogram of the **dihedral angles** between quadruplets of elements. This is a tensor of size MxMxMxMxN, where M is the number of elements and N is the number of bins.

![K4](./images/k4.png)
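Since K4 is not available in the describe package, the following standard-library sketch only illustrates how a dihedral angle can be computed for four points. The helper names are hypothetical, and the sign convention of the returned angle is one of several in common use.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) between the planes (p0, p1, p2) and (p1, p2, p3)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)      # normals of the two planes
    length = math.sqrt(_dot(b2, b2))
    b2_hat = tuple(x / length for x in b2)
    m1 = _cross(n1, b2_hat)                      # completes an orthogonal frame with n1 and b2
    return math.atan2(_dot(m1, n2), _dot(n1, n2))

# Four points sharing the central bond p1-p2 along z.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
perpendicular = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, abs(trans), abs(perpendicular))
```

The cis arrangement gives an angle of zero, the trans arrangement an angle of magnitude pi, and the perpendicular arrangement an angle of magnitude pi/2.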

## Weighting

All the tensors except K1 are weighted. This ensures that contributions from nearby elements are higher than those from elements farther away. The describe package implements exponential weighting.

For more info about MBTR see:
[Huo, Haoyan, and Matthias Rupp. *arXiv preprint* **arXiv:1704.06439 (2017)**](https://arxiv.org/pdf/1704.06439.pdf)  
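To make the exponential weighting described above concrete, the sketch below evaluates the same form of weight function that the example later in this tutorial passes to the descriptor, with the same decay factor (0.5) and threshold (1e-3). It assumes that the function argument is a distance-like quantity and that contributions whose weight falls below the threshold are discarded; both points are assumptions about the package's behaviour, not confirmed details.

```python
import math

decay_factor = 0.5   # same value as in the tutorial's example
threshold = 1e-3     # same value as in the tutorial's example

def weight(x):
    """Exponential weight: close to 1 for small x, decaying for large x."""
    return math.exp(-decay_factor * x)

# Largest x that still passes the threshold: exp(-decay_factor * x) = threshold.
cutoff = -math.log(threshold) / decay_factor

for x in (0.0, 2.8201, 5.6402, cutoff):
    print("x = {:7.4f}  weight = {:.5f}".format(x, weight(x)))
```

Under these assumptions, a smaller decay factor or a smaller threshold pushes the cutoff outwards, so more distant contributions enter the histogram.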

## Example

We are going to see MBTR in action for a simple NaCl system.

In [None]:
# --- INITIAL DEFINITIONS ---
from describe.descriptors import MBTR
from describe.core import System
from describe.data.element_data import numbers_to_symbols
import numpy as np
from ase.visualize import view
import matplotlib.pyplot as mpl

### Atom description

We'll describe NaCl with an `ase.Atoms`-like object. The describe package provides a wrapper around `ase.Atoms`, called `System`, which adds extra functions. So we'll create a `System` instance, using syntax similar to that of `ase.Atoms`.

In [None]:
# Define the system under study: NaCl in a conventional cell.
NaCl_conv = System(
    cell=[
        [5.6402, 0.0, 0.0],
        [0.0, 5.6402, 0.0],
        [0.0, 0.0, 5.6402]
    ],
    scaled_positions=[
        [0.0, 0.5, 0.0],
        [0.0, 0.5, 0.5],
        [0.0, 0.0, 0.5],
        [0.0, 0.0, 0.0],
        [0.5, 0.5, 0.5],
        [0.5, 0.5, 0.0],
        [0.5, 0.0, 0.0],
        [0.5, 0.0, 0.5]
    ],
    symbols=["Na", "Cl", "Na", "Cl", "Na", "Cl", "Na", "Cl"],
)
view(NaCl_conv, viewer='x3d')

### Setting MBTR hyper-parameters

Next we set up the hyper-parameters:
1. `atomic_numbers`: the atomic numbers to include in the MBTR. Fixing this list makes it possible to compare two structures even when one of them lacks some elements.
2. `k`: list/set of K terms to be computed.
3. `grid`: dictionary with one entry each for K1, K2, and K3, containing
    - `min`, `max`: the minimum and maximum values of the axis for each distribution
    - `sigma`: the width (standard deviation) of the Gaussian smearing
    - `n`: the number of bins.
4. `weighting`: dictionary of the weighting functions to be used.
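One practical check on the grid settings is to compare `sigma` with the spacing between grid points: if the Gaussian is narrower than one step, peaks can fall between bins. The sketch below does this arithmetic for the grid values used in this tutorial's example. It assumes an evenly spaced grid that includes both end points, which may differ from how the package builds its axes, and the helper name `bins_per_sigma` is hypothetical.

```python
# Grid settings copied from the tutorial's example.
grid = {
    "k1": {"min": 10, "max": 18, "sigma": 0.1, "n": 200},
    "k2": {"min": 0, "max": 0.7, "sigma": 0.01, "n": 200},
    "k3": {"min": -1.0, "max": 1.0, "sigma": 0.05, "n": 200},
}

def bins_per_sigma(settings):
    """Number of grid steps covered by one sigma (assumes n evenly spaced points)."""
    step = (settings["max"] - settings["min"]) / (settings["n"] - 1)
    return settings["sigma"] / step

resolution = {name: bins_per_sigma(settings) for name, settings in grid.items()}
for name, value in resolution.items():
    print("{}: sigma spans about {:.1f} grid steps".format(name, value))
```

All three values come out above one, so under the stated assumption each smeared peak is sampled by several grid points.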

**Note: The describe package implements MBTR only up to K3.**


In [None]:
# Create the MBTR descriptor for the system
decay_factor = 0.5
mbtr = MBTR(
    atomic_numbers=[11, 17], # Na and Cl
    k=[1, 2, 3],
    periodic=True,
    grid={
        "k1": {
            "min": 10,
            "max": 18,
            "sigma": 0.1,
            "n": 200,
        },
        "k2": {
            "min": 0,
            "max": 0.7,
            "sigma": 0.01,
            "n": 200,
        },
        "k3": {
            "min": -1.0,
            "max": 1.0,
            "sigma": 0.05,
            "n": 200,
        }
    },
    weighting={
        "k2": {
            "function": lambda x: np.exp(-decay_factor*x),
            "threshold": 1e-3
        },
        "k3": {
            "function": lambda x: np.exp(-decay_factor*x),
            "threshold": 1e-3
        },
    },
    flatten=False)
print("Number of features: {}".format(mbtr.get_number_of_features()))

### Calculate MBTR

We call the `create` function of the `MBTR` object on our `System` (`ase.Atoms`) object.

In [None]:
#Create Descriptor
desc = mbtr.create(NaCl_conv)

### Plotting 

We will now plot all the tensors, in the same plot, for K1, K2, and K3

In [None]:
#plot K1
x1 = mbtr._axis_k1
mpl.plot(x1, desc[0][0, :], label="Na", color="blue")
mpl.plot(x1, desc[0][1, :], label="Cl", color="orange")
mpl.ylabel(r"$\phi$ (arbitrary units)", size=20)  # raw string avoids the invalid "\p" escape
mpl.xlabel("Atomic number", size=20)
mpl.title("The element count in NaCl crystal.", size=20)
mpl.legend()
mpl.show()

In [None]:
# Plot K2
x2 = mbtr._axis_k2
mpl.plot(x2, desc[1][0, 1, :], label="NaCl, ClNa", color="blue")
mpl.plot(x2, desc[1][1, 1, :], label="ClCl", color="orange")
mpl.plot(x2, desc[1][0, 0, :], label="NaNa", color="green")
mpl.ylabel(r"$\phi$ (arbitrary units)", size=20)  # raw string avoids the invalid "\p" escape
mpl.xlabel("Inverse distance (1/angstrom)", size=20)
mpl.title("The exponentially weighted inverse distance distribution in NaCl crystal.", size=20)
mpl.legend()
mpl.show()

In [None]:
# Plot K3
x3 = mbtr._axis_k3
imap = mbtr.index_to_atomic_number
smap = {}
for index, number in imap.items():
    smap[index] = numbers_to_symbols(number)
# Build the labels from the symbol map; index 0 is Na and index 1 is Cl here.
mpl.plot(x3, desc[2][0, 0, 0, :], label="{0}{0}{0}, {1}{1}{1}".format(smap[0], smap[1]), color="blue")
mpl.plot(x3, desc[2][0, 0, 1, :], label="{0}{0}{1}, {0}{1}{1}".format(smap[0], smap[1]), color="orange")
mpl.plot(x3, desc[2][1, 0, 1, :], label="{0}{1}{0}, {1}{0}{1}".format(smap[0], smap[1]), color="green")
mpl.ylabel(r"$\phi$ (arbitrary units)", size=20)  # raw string avoids the invalid "\p" escape
mpl.xlabel("cos(angle)", size=20)
mpl.title("The exponentially weighted angle distribution in NaCl crystal.", size=20)
mpl.legend()
mpl.show()

## Remark

The MBTR is a fingerprint of the entire system. Thus, it can be used to:
1. Compare the similarity of two chemical systems by taking the norm of the difference between their MBTR values.
2. Machine-learn properties of the whole system, such as energies and dipole moments.
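The first use can be sketched with plain Python lists standing in for flattened descriptors. The function name `descriptor_distance` and the toy vectors are hypothetical; in practice the inputs would be the flattened MBTR outputs of two systems computed with identical hyper-parameters.

```python
import math

def descriptor_distance(desc_a, desc_b):
    """Euclidean norm of the difference between two flattened descriptors."""
    if len(desc_a) != len(desc_b):
        raise ValueError("Descriptors must have the same number of features.")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(desc_a, desc_b)))

# Toy feature vectors, for illustration only.
system_a = [0.0, 1.0, 2.0, 0.5]
system_b = [0.0, 1.0, 2.0, 0.5]
system_c = [3.0, 1.0, 6.0, 0.5]

same = descriptor_distance(system_a, system_b)
different = descriptor_distance(system_a, system_c)
print(same, different)
```

A distance of zero means the two descriptors are identical; larger values indicate less similar systems, with the scale depending on the chosen grid and weighting.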

## Exercise

Verify that the MBTR is translationally and rotationally invariant.
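As a starting hint, the sketch below shows the geometric fact behind the invariance: rotating and translating a set of positions leaves all pairwise distances unchanged. It uses only the standard library; the helper names and the small fragment are hypothetical. For the exercise itself, apply the same kind of transformation to the positions of a system and compare the resulting MBTR outputs.

```python
import math
from itertools import combinations

def rotate_z_and_translate(positions, angle, shift):
    """Rotate positions about the z axis by `angle` (radians), then add `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in positions]

def sorted_pair_distances(positions):
    """All pairwise distances, sorted so that atom ordering does not matter."""
    return sorted(math.dist(p, q) for p, q in combinations(positions, 2))

half = 5.6402 / 2
original = [(0.0, 0.0, 0.0), (half, 0.0, 0.0), (half, half, 0.0), (0.0, half, half)]
moved = rotate_z_and_translate(original, angle=0.7, shift=(1.5, -2.0, 0.3))

before = sorted_pair_distances(original)
after = sorted_pair_distances(moved)
unchanged = all(math.isclose(d0, d1, abs_tol=1e-9) for d0, d1 in zip(before, after))
print("Pairwise distances unchanged:", unchanged)
```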