# Introduction to neural network force fields (NNFFs)


Edited by N. Hu

Medford Research Group at Georgia Tech

08/25/2021

This IPython notebook on training NNFFs is based on the BDQM-VIP lecture materials and the sample scripts provided by AMPtorch (`amptorch/example/`). It covers two approaches: conventional Symmetry Functions as the fingerprinting scheme with the Behler-Parrinello atomistic neural network structure (2nd Generation NN), and Gaussian Multi-Pole fingerprints with the SingleNN neural network structure.

The following paper introduces the basics and the formulation of Gaussian Multi-Pole (GMP) descriptors:
<https://arxiv.org/abs/2102.02390>


# Table of contents
1. [Introduction](#introduction)
    1. [The Descriptors/Fingerprinting Scheme](#fp_scheme)
    2. [Atomistic Neural Network Structures](#nn_structure)
2. [Installation](#installation)


## Introduction <a name="introduction"></a>

There are different ways to approximate the potential energy surface of a chemical system of interest. Electronic structure calculations are computationally expensive, and classical force field methods are relatively inaccurate. Machine learning force fields trained on electronic structure data offer a viable alternative that avoids the high computational cost while retaining high accuracy. Neural networks, as universal approximators, have proven effective at approximating the potential energy surface. Behler and Parrinello introduced the 2nd Generation Neural Network Force Fields, which are widely applied in chemistry, chemical engineering, materials science, and related fields [1].

According to the Born-Oppenheimer approximation, the ground-state potential energy of a system is determined by its nuclear coordinates, provided that there are no external fields and the charges are constant. The potential energy can therefore be treated as a function of the nuclear coordinates of the system:

$$E_{\text{system}}(\mathbf{R})$$

where $E_{\text{system}}$ is the potential energy of the chemical system, and $\mathbf{R}$ is the $(N, 3)$ matrix that represents the nuclear coordinates of $N$ atoms in the chemical system. 

One main assumption of atomistic neural networks is that the system energy can be partitioned into atomic energies:

$$E_{\text{system}} = \sum_{i=1}^{N} E_{i}$$

where $i$ indexes the atoms in the system.

Atomistic NNFFs have two main design components:
1. The Descriptors/Fingerprinting Scheme
2. The Atomistic Neural Network Structure

The trained NNFFs in AMPtorch can then be used to predict the energies and forces of atomic structures.

Here we will briefly introduce Symmetry Functions (SFs) and Gaussian Multi-Poles (GMPs) [2] as fingerprinting schemes. For atomistic neural network structures, the discussion will mainly cover 2nd Generation Neural Networks and SingleNN.

### The Descriptors/Fingerprinting Scheme <a name="fp_scheme"></a>

The system energy is invariant when all atoms are translated or rotated together, whereas raw Cartesian coordinates change under these operations. This makes Cartesian coordinates unsuitable as direct input descriptors for a regression model. To resolve this issue, Behler and Parrinello introduced Symmetry Functions (SFs), which have several key properties that make them compatible with atomistic neural network potentials:

1. rotational and translational invariance
2. invariance to permutation of atoms
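The invariance requirements above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the water geometry, the rotation angle, and the helper names `pairwise_distances` and `rotation_about_z` are illustrative choices, not part of AMPtorch. It shows that raw coordinates change under a rigid motion while the sorted interatomic distances, a very simple invariant descriptor, do not.

```python
import numpy as np

def pairwise_distances(positions):
    """Return the sorted interatomic distances for an (N, 3) coordinate array."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(len(positions), k=1)  # each pair counted once
    return np.sort(dist[upper])

def rotation_about_z(angle):
    """Rotation matrix for a rotation by `angle` (radians) about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Approximate water geometry in Angstrom (illustrative values only)
water = np.array([
    [0.000, 0.000, 0.000],   # O
    [0.757, 0.586, 0.000],   # H
    [-0.757, 0.586, 0.000],  # H
])

# Rotate and translate the whole molecule, then reorder the atoms
moved = water @ rotation_about_z(0.7).T + np.array([1.0, -2.0, 3.0])
permuted = moved[[2, 0, 1]]

coords_changed = not np.allclose(water, moved)
distances_unchanged = np.allclose(
    pairwise_distances(water), pairwise_distances(permuted)
)
print(coords_changed, distances_unchanged)
```

Sorted distances are only a toy descriptor; SFs and GMPs achieve the same invariances while keeping a fixed-length, per-atom feature vector.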

SFs have two components: 1) $G^2$ functions that describe 2-body radial interactions within a cutoff radius; 2) $G^4$ functions that describe 3-body angular interactions within a cutoff radius. The mathematical expressions are given in Equations (9) and (11) of [1].
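As a rough illustration of the radial part, the sketch below evaluates a $G^2$-type function in its commonly used form, $G^2_i = \sum_j e^{-\eta (R_{ij} - R_s)^2} f_c(R_{ij})$, with a cosine cutoff $f_c(R) = \tfrac{1}{2}\left[\cos(\pi R / R_c) + 1\right]$ for $R \le R_c$ and zero otherwise. The cosine cutoff is one common choice and is assumed here; the function names and parameter values are hypothetical, and this is not the AMPtorch implementation.

```python
import math

def cosine_cutoff(r, r_cut):
    """Smoothly decay from 1 at r = 0 to 0 at r = r_cut; zero beyond the cutoff."""
    if r > r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)

def g2(center, neighbors, eta, r_shift, r_cut):
    """Radial symmetry function value for one central atom.

    center    : (x, y, z) of the central atom
    neighbors : iterable of (x, y, z) for all other atoms
    eta       : Gaussian width parameter
    r_shift   : Gaussian center (R_s)
    r_cut     : cutoff radius (R_c)
    """
    total = 0.0
    for position in neighbors:
        r = math.dist(center, position)
        total += math.exp(-eta * (r - r_shift) ** 2) * cosine_cutoff(r, r_cut)
    return total

# Oxygen-centered value for an approximate water geometry (illustrative numbers)
oxygen = (0.0, 0.0, 0.0)
hydrogens = [(0.757, 0.586, 0.0), (-0.757, 0.586, 0.0)]
print(g2(oxygen, hydrogens, eta=1.0, r_shift=0.0, r_cut=3.0))
```

Because the value depends only on interatomic distances and sums over neighbors, it is unchanged by rotation, translation, and reordering of the neighbor list.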

Because the formulation of SFs does not take element types into account, the interactions among different elements are separated into different input columns. As a result, the number of feature dimensions grows undesirably with the number of elements present. GMPs are another fingerprinting scheme, whose input dimension remains constant regardless of the number of chemical elements.
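To make the scaling concrete, here is a small counting sketch. It assumes a common convention in which every radial parameter set is replicated once per neighbor element and every angular parameter set once per unordered pair of neighbor elements; the function name and the parameter counts are hypothetical, and a particular code may organize its columns differently.

```python
def symmetry_function_dimension(n_elements, n_radial, n_angular):
    """Count SF input columns per atom under the per-element-column convention."""
    radial_columns = n_radial * n_elements
    # Unordered neighbor-element pairs, including same-element pairs
    element_pairs = n_elements * (n_elements + 1) // 2
    angular_columns = n_angular * element_pairs
    return radial_columns + angular_columns

# Same hyperparameters, increasing chemical complexity
for n_elements in (1, 2, 3, 5):
    print(n_elements, symmetry_function_dimension(n_elements, n_radial=4, n_angular=4))
```

Under this assumption the radial part grows linearly and the angular part quadratically with the number of elements, which is the growth that GMPs are designed to avoid.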

GMPs use Gaussians to probe the radial coordinates, and Maxwell-Cartesian spherical harmonics (MCSH) multipole expansions to probe the angular coordinates:

$$\mu_{i, abc} = \langle \text{probe}, \hat{\rho} \rangle = \langle \text{angular probe} \times \text{radial probe}, \hat{\rho} \rangle$$

$$\mu_{i, abc} = \left\langle S_{abc} \times G_{i}, \sum_{j \in \text{atoms}} \sum_{k \in \text{Gaussians}} G_{\text{dens},j,k} \right\rangle$$

See [2] for the full formulation.
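For the lowest angular order, where the angular probe is simply $S = 1$, the inner product above reduces to a sum of overlaps between Gaussians, which has a closed form: $\int e^{-a|\mathbf{r}-\mathbf{A}|^2} e^{-b|\mathbf{r}-\mathbf{B}|^2}\, d^3r = \left(\frac{\pi}{a+b}\right)^{3/2} e^{-\frac{ab}{a+b}|\mathbf{A}-\mathbf{B}|^2}$. The sketch below uses this identity under several assumptions: the radial probe is an unnormalized Gaussian $e^{-r^2 / 2\sigma^2}$, no cutoff is applied, and the per-element density coefficients and exponents are made-up placeholders rather than fitted values. It is not the AMPtorch implementation, and normalization conventions in [2] may differ.

```python
import numpy as np

def gaussian_overlap(a, b, center_a, center_b):
    """Closed-form 3D integral of exp(-a|r-A|^2) * exp(-b|r-B|^2)."""
    delta = np.asarray(center_a, dtype=float) - np.asarray(center_b, dtype=float)
    d2 = float(np.sum(delta ** 2))
    return (np.pi / (a + b)) ** 1.5 * np.exp(-a * b / (a + b) * d2)

def gmp_order0(i, positions, symbols, sigma, density):
    """Zeroth-order (S = 1) feature for atom i: <G_i, rho_hat>.

    density maps an element symbol to a list of (coefficient, exponent)
    pairs describing that element's Gaussian density (hypothetical values).
    """
    a = 1.0 / (2.0 * sigma ** 2)  # exponent of the radial probe on atom i
    total = 0.0
    for position_j, symbol_j in zip(positions, symbols):  # sum over all atoms
        for coefficient, b in density[symbol_j]:          # sum over Gaussians
            total += coefficient * gaussian_overlap(a, b, positions[i], position_j)
    return total

# Placeholder densities; real ones would come from fitted atomic densities
density = {"O": [(8.0, 2.0), (1.5, 0.5)], "H": [(1.0, 1.0)]}
symbols = ["O", "H", "H"]
water = np.array([
    [0.000, 0.000, 0.000],
    [0.757, 0.586, 0.000],
    [-0.757, 0.586, 0.000],
])

# One feature per probe width, independent of how many elements are present
features = [gmp_order0(0, water, symbols, sigma, density) for sigma in (0.5, 1.0, 2.0)]
print(features)
```

Element identity enters only through the density being probed, so adding more elements changes the feature values but not the length of the feature vector.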

The descriptors are mathematical expressions designed to describe the local chemical environment surrounding a central atom:

$$E_{\text{system}} = \sum_{i=1}^{N} E_{i}(\vec{G_i})$$

where $\vec{G_i}$ is the descriptor for atom $i$. 



### Atomistic Neural Network Structures <a name="nn_structure"></a>

Based on the per-atom energy partition assumption, atomistic neural networks are built to predict the atomic energy given local atomic information from the descriptors. 

The 2nd Generation NNFF introduced by Behler and Parrinello is a collection of atomic neural networks, one for every element. SingleNN [3] uses a single atomic neural network whose weights and biases are shared across elements. For a simple system such as a water molecule, the 2nd Generation NNFF schematically has two atomic neural networks, one for H atoms and one for O atoms:

<img src="figures/2ndNNFF.png">

SingleNN looks like: 

<img src="figures/singleNN.png">
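The difference between the two structures can be sketched in a few lines. The example below uses plain NumPy with tiny, randomly initialized networks and random fingerprints; the helper names, layer sizes, and inputs are hypothetical and untrained. The shared-network variant follows the simplified description above, and the actual SingleNN architecture in [3] and in AMPtorch may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in, n_hidden):
    """Random parameters for a one-hidden-layer atomic network (untrained)."""
    return {
        "W1": rng.normal(size=(n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(size=(n_hidden, 1)), "b2": np.zeros(1),
    }

def atomic_energy(params, fingerprint):
    """Map one atom's fingerprint vector to a scalar atomic energy."""
    hidden = np.tanh(fingerprint @ params["W1"] + params["b1"])
    return float((hidden @ params["W2"] + params["b2"])[0])

def energy_second_generation(nets, symbols, fingerprints):
    """Sum atomic energies, choosing the network by each atom's element."""
    return sum(atomic_energy(nets[s], g) for s, g in zip(symbols, fingerprints))

def energy_shared(net, fingerprints):
    """Sum atomic energies using one network for every atom."""
    return sum(atomic_energy(net, g) for g in fingerprints)

# Water: three atoms, each with a 4-dimensional placeholder fingerprint
symbols = ["O", "H", "H"]
fingerprints = rng.normal(size=(3, 4))

nets = {"O": init_mlp(4, 5), "H": init_mlp(4, 5)}  # one network per element
shared = init_mlp(4, 5)                            # one network for all atoms

print(energy_second_generation(nets, symbols, fingerprints))
print(energy_shared(shared, fingerprints))
```

In both cases the system energy is a sum over atoms, so reordering the atoms (together with their element labels) leaves the predicted energy unchanged.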



## Preparation: Install AMPtorch with conda <a name="installation"></a>

Please follow the instructions in the GitHub repo to install AMPtorch and its dependencies: <https://github.com/ulissigroup/amptorch/tree/MCSH_paper1_lmdb>



## References

[1] Behler, J. (2015). Constructing high-dimensional neural network potentials: A tutorial review. International Journal of Quantum Chemistry, 115(16), 1032–1050. https://doi.org/10.1002/qua.24890

[2] Lei, X., & Medford, A. J. (2021). A Universal Framework for Featurization of Atomistic Systems. http://arxiv.org/abs/2102.02390

[3] Liu, M., & Kitchin, J. R. (2020). SingleNN: Modified Behler-Parrinello Neural Network with Shared Weights for Atomistic Simulations with Transferability. Journal of Physical Chemistry C, 124(32), 17811–17818. https://doi.org/10.1021/acs.jpcc.0c04225