# Advanced examples of DMFF 1.0.0
If you have worked through our new tutorial notebook https://nb.bohrium.dp.tech/detail/6366839940, you should already have a basic understanding of DMFF version 1.0.0. This example notebook is an advanced supplement to that tutorial. It introduces new modules in DMFF such as Qeq, ML Force, and the OpenMM plugin.

## Environment Setup

Retrieve DMFF from GitHub and switch to the desired branch, then proceed with the installation.

In [1]:
! rm -rf DMFF
! rm -rf /opt/mamba/lib/python3.10/site-packages/dmff*
! git clone https://github.com/deepmodeling/DMFF.git
! git config --global --add safe.directory `pwd`/DMFF
! cd DMFF && git checkout wangxy/v1.0.0-devel && pip install .

Cloning into 'DMFF'...
remote: Enumerating objects: 4430, done.
remote: Counting objects: 100% (4430/4430), done.
remote: Compressing objects: 100% (1458/1458), done.
remote: Total 4430 (delta 2950), reused 4316 (delta 2887), pack-reused 0
Receiving objects: 100% (4430/4430), 22.09 MiB | 4.75 MiB/s, done.
Resolving deltas: 100% (2950/2950), done.
Updating files: 100% (273/273), done.
Updating files: 100% (317/317), done.
Branch 'wangxy/v1.0.0-devel' set up to track remote branch 'wangxy/v1.0.0-devel' from 'origin'.
Switched to a new branch 'wangxy/v1.0.0-devel'
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Processing /data/DMFF
  Preparing metadata (setup.py) ... done
Collecting networkx>=3.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d5/f0/8fbc882ca80cf077f1b246c0e3c3465f7f415439bdea6b899f6b19f61f70/networkx-3.2.1-py3-none-any.whl (1.6 MB)

Install the required libraries; this step is time-consuming, so please be patient.

In [2]:
! mamba install openmm=7.7.0 rdkit -c conda-forge -y
! pip install parmed mdtraj pymbar networkx


                  __    __    __    __
                 /  \  /  \  /  \  /  \
                /    \/    \/    \/    \
███████████████/  /██/  /██/  /██/  /████████████████████████
              /  / \   / \   / \   / \  \____
             /  /   \_/   \_/   \_/   \    o \__,
            / _/                       \_____/  `
            |/
        ███╗   ███╗ █████╗ ███╗   ███╗██████╗  █████╗
        ████╗ ████║██╔══██╗████╗ ████║██╔══██╗██╔══██╗
        ██╔████╔██║███████║██╔████╔██║██████╔╝███████║
        ██║╚██╔╝██║██╔══██║██║╚██╔╝██║██╔══██╗██╔══██║
        ██║ ╚═╝ ██║██║  ██║██║ ╚═╝ ██║██████╔╝██║  ██║
        ╚═╝     ╚═╝╚═╝  ╚═╝╚═╝     ╚═╝╚═════╝ ╚═╝  ╚═╝

        mamba (0.27.0) supported by @QuantStack

        GitHub:  https://github.com/mamba-org/mamba
        Twitter: https://twitter.com/QuantStack

█████████████████████████████████████████████████████████████


Looking for: ['openmm=7.7.0', 'rdkit']

## 1. ADMPQeqForce

ADMPQeqForce supports Coulombic energy calculations for both the constant-potential model and the constant-charge model. The net charges on all atoms are first equilibrated under the specified constraints, and the charge-dependent energies are then evaluated.
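To make the idea of equilibrating charges under a constraint concrete, here is a minimal sketch of a toy charge-equilibration problem solved with a single Lagrange multiplier. It is not DMFF's implementation: it ignores the Coulomb coupling between atoms, and the function name `equilibrate_charges` and the `chi` and `hardness` values are hypothetical and chosen only for illustration.

```python
def equilibrate_charges(chi, hardness, total_charge=0.0):
    """Minimize sum(chi_i*q_i + 0.5*J_i*q_i**2) subject to sum(q_i) = total_charge.

    Setting the derivative of the Lagrangian to zero gives
    chi_i + J_i*q_i = lam for every atom, so q_i = (lam - chi_i) / J_i.
    The multiplier lam then follows from the total-charge constraint.
    """
    inv_j = [1.0 / j for j in hardness]
    lam = (total_charge + sum(c * w for c, w in zip(chi, inv_j))) / sum(inv_j)
    charges = [(lam - c) * w for c, w in zip(chi, inv_j)]
    return charges, lam


# Hypothetical electronegativities and hardnesses for a three-atom fragment
chi = [4.5, 7.2, 4.5]
hardness = [13.0, 10.0, 13.0]

charges, lam = equilibrate_charges(chi, hardness, total_charge=0.0)
print("charges:", charges)
print("sum of charges:", sum(charges))
print("Lagrange multiplier:", lam)
```

At the solution every atom has the same value of `chi_i + J_i*q_i`, which equals the multiplier. This is the sense in which the charges are "equilibrated".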

You can directly run the test:

In [3]:
run DMFF/tests/test_admp/test_qeq.py

The following sections walk through the same calculation in more detail.

### Import the necessary libraries

In [1]:
import openmm.app as app
import openmm.unit as unit
from dmff.api import Hamiltonian
from dmff.api import DMFFTopology
from dmff.api.xmlio import XMLIO
from dmff import NeighborList
import jax
from jax import value_and_grad
import jax.numpy as jnp
import numpy as np
import time
import pickle
import sys

### Load your force field

In [5]:
xml = XMLIO()
xml.loadXML("DMFF/tests/data/qeq2.xml")

# get residues
res = xml.parseResidues()

For information about the force field file, please refer to the user guide, which contains detailed explanations.

### Initialize the charge and type of each atom and aux

In [6]:
charges, types = [], []
for i in range(len(res)):
    charges += [a["charge"] for a in res[i]["particles"]]
    types += [a["type"] for a in res[i]["particles"]]
charges = np.zeros((len(charges),))

# initialize aux
aux = {
    "q": jnp.array(charges),
    "lagmt": jnp.array([1.0, 1.0])
    #"lagmt": jnp.array([1.0])
}

### Load the topological information and supplement it

In [7]:
# Load topology
pdb = app.PDBFile("DMFF/tests/data/qeq2.pdb")
dmfftop = DMFFTopology(from_top=pdb.topology)
pos = pdb.getPositions(asNumpy=True).value_in_unit(unit.nanometer)
pos = jnp.array(pos)
box = dmfftop.getPeriodicBoxVectors()

# Assign atom charges and types in the topology
atoms = [a for a in dmfftop.atoms()]
for na, a in enumerate(atoms):
    a.meta["charge"] = charges[na]
    a.meta["type"] = types[na]

### Preparation for potential function

In [8]:
# create Hamiltonian
hamilt = Hamiltonian("DMFF/tests/data/qeq2.xml")

# create neighborlist & pairs
nblist = NeighborList(box, 0.6, dmfftop.buildCovMat())
pairs = nblist.allocate(pos)         

# initialize const_list
const_list, map_atomtype = [], []
for i in dmfftop.residues():
    temp = []
    for j in i.atoms():
        temp.append(int(j.id)-1)
    const_list.append(np.array(temp))

# create map_atomtype
for i in dmfftop.atoms():
    map_atomtype.append(int(i.meta["type"])-1)    #temp set

# assign const_val
n_template = len(const_list)
const_val = jnp.zeros(n_template)
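The constraint bookkeeping in the cell above can be sketched without DMFF. This assumes that each entry of `const_list` holds the zero-based atom indices of one residue and that the matching entry of `const_val` is the target net charge of that group. The atom ids and charges below are hypothetical.

```python
# Hypothetical 1-based atom ids for two residues, stored as strings
residue_atom_ids = [["1", "2", "3"], ["4", "5", "6"]]

# Convert to zero-based integer indices, one list per residue
const_list = [[int(atom_id) - 1 for atom_id in ids] for ids in residue_atom_ids]

# One target net charge per residue (all neutral here)
const_val = [0.0] * len(const_list)

# Hypothetical equilibrated charges, used only to check the constraints
charges = [-0.8, 0.4, 0.4, -0.6, 0.3, 0.3]
group_totals = [sum(charges[i] for i in group) for group in const_list]

for group, total, target in zip(const_list, group_totals, const_val):
    print(f"atoms {group}: net charge {total:+.3f}, target {target:+.3f}")
```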

### Create potential function and Calculate the energy

In [9]:
pot = hamilt.createPotential(dmfftop, nonbondedCutoff=0.6*unit.nanometer, nonbondedMethod=app.PME,
                            ethresh=1e-3, neutral=True, slab=False, constQ=True,
                            const_list=const_list, const_vals=const_val,
                            has_aux=True)

#return energy
efunc = pot.getPotentialFunc()
energy, aux = efunc(pos, box, pairs, hamilt.paramset.parameters, aux)
print("energy: %f kj/mol" %energy)
print(aux)

energy: 4817.286675 kj/mol
{'q': DeviceArray([-2.99605719e-04, -3.40972179e-04, -4.91927203e-04,
             -7.57415141e-04, -9.72305199e-04, -9.04476306e-04,
             -6.19403852e-04, -3.18511669e-04, -4.14033308e-04,
             -5.08883423e-04, -5.64320831e-04, -5.92336096e-04,
             -4.59863367e-04, -5.31103425e-04, -3.93029108e-04,
             -4.28646761e-04, -2.86954295e-04, -3.30793706e-04,
             -3.17696799e-04, -3.50221376e-04, -2.77981466e-04,
             -3.21483470e-04, -2.96629949e-04, -3.72879359e-04,
             -3.51565779e-04, -4.66407859e-04, -5.10171801e-04,
              3.04087838e-04, -3.70321220e-03,  3.20495493e-04,
             -9.18942861e-04, -3.34922291e-04, -4.97683690e-04,
             -3.47713379e-04, -3.55104058e-04, -2.89328710e-04,
             -3.24275491e-04, -2.80456324e-04, -3.02740626e-04,
             -2.75075995e-04, -2.91228744e-04, -2.70938814e-04,
             -4.46974131e-04, -2.97128120e-04, -8.78905378e-04,
       

## 2. Machine Learning Force

## 2.1 SGNN
Navigate to the working directory

In [2]:
import os
os.chdir(os.path.join("DMFF","examples", "sgnn"))

SGNN assumes that the remaining bonding energy can be written as a sum over different local fragments of the molecule. These fragments are defined as "subgraphs" (labeled $g$):

$$
E_{\mathrm{sGNN}}=\sum_{g} E_{g}
$$

Each subgraph defines the local environment of a central bond, and $E_g$ represents the intramolecular energy attributed to that bond. This leads to a rigorously localized representation of the molecule, which ensures the extensibility of the resulting model.
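As a rough illustration of this decomposition, the sketch below builds one subgraph per bond for a small linear chain and sums a per-subgraph energy. It is not the DMFF implementation. The `depth` parameter, the helper names, and `toy_subgraph_energy` (a placeholder standing in for the neural network) are assumptions made for this example.

```python
from collections import deque


def build_adjacency(n_atoms, bonds):
    """Return a dict mapping each atom to the set of atoms bonded to it."""
    adjacency = {i: set() for i in range(n_atoms)}
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return adjacency


def subgraph_atoms(adjacency, bond, depth):
    """Atoms within `depth` bonds of either atom of the central bond."""
    distance = {bond[0]: 0, bond[1]: 0}
    queue = deque(bond)
    while queue:
        atom = queue.popleft()
        if distance[atom] == depth:
            continue
        for neighbor in adjacency[atom]:
            if neighbor not in distance:
                distance[neighbor] = distance[atom] + 1
                queue.append(neighbor)
    return sorted(distance)


def toy_subgraph_energy(atoms):
    """Placeholder for the per-subgraph model; here just 0.1 per atom."""
    return 0.1 * len(atoms)


# Linear chain of five atoms: 0-1-2-3-4
bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]
adjacency = build_adjacency(5, bonds)

# One subgraph per central bond
subgraphs = {bond: subgraph_atoms(adjacency, bond, depth=1) for bond in bonds}
total_energy = sum(toy_subgraph_energy(atoms) for atoms in subgraphs.values())

for bond, atoms in subgraphs.items():
    print(f"central bond {bond}: subgraph atoms {atoms}")
print("total energy:", total_energy)
```

Because each term depends only on a local neighborhood, adding atoms far from a bond leaves that bond's contribution unchanged.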

### Create an SGNN potential function

For information about the force field file, please refer to the user guide, which contains detailed explanations. To create an SGNN potential, do the following:

In [3]:
H = Hamiltonian('peg.xml')
app.Topology.loadBondDefinitions("residues.xml")
pdb = app.PDBFile("peg4.pdb")
rc = 0.6
# generator stores all force field parameters
pots = H.createPotential(pdb.topology, nonbondedCutoff=rc*unit.nanometer, ethresh=5e-4)

### Preparation for energy calculation

In [4]:
# construct inputs
positions = jnp.array(pdb.positions._value)
a, b, c = pdb.topology.getPeriodicBoxVectors()
box = jnp.array([a._value, b._value, c._value])
# neighbor list
nbl = NeighborList(box, rc, pots.meta['cov_map']) 
nbl.allocate(positions)

DeviceArray([[ 0,  1,  1],
             [ 0,  2,  1],
             [ 0,  3,  1],
             [ 0,  4,  2],
             [ 0,  5,  3],
             [ 0,  6,  3],
             [ 0,  7,  3],
             [ 0,  8,  3],
             [ 0,  9,  4],
             [ 0, 10,  4],
             [ 0, 11,  2],
             [ 0, 12,  1],
             [ 0, 13,  2],
             [ 0, 14,  2],
             [ 0, 19,  4],
             [ 0, 20,  5],
             [ 0, 21,  5],
             [ 1,  2,  2],
             [ 1,  3,  2],
             [ 1,  4,  3],
             [ 1,  5,  4],
             [ 1,  6,  4],
             [ 1,  7,  4],
             [ 1,  8,  4],
             [ 1,  9,  5],
             [ 1, 10,  5],
             [ 1, 11,  3],
             [ 1, 12,  2],
             [ 1, 13,  3],
             [ 1, 14,  3],
             [ 1, 19,  5],
             [ 1, 20,  6],
             [ 1, 21,  6],
             [ 2,  3,  2],
             [ 2,  4,  3],
             [ 2,  5,  4],
             [ 2,  6,  4],
 

You can retrieve the parameters with:

In [5]:
paramset = H.getParameters()

### Load the data and convert units

In [6]:
with open('test_backend/set_test_lowT.pickle', 'rb') as ifile:
    data = pickle.load(ifile)

# inputs must be in nm; dividing by 10 assumes the stored positions are in angstrom
pos = jnp.array(data['positions'][0:20]) / 10
box = jnp.eye(3) * 5

### Calculate the energy

In [7]:
efunc = jax.jit(pots.getPotentialFunc())
efunc_vmap = jax.vmap(jax.jit(pots.getPotentialFunc()), in_axes=(0, None, None, None), out_axes=0)
print(efunc(pos[0], box, nbl.pairs, paramset))
print(efunc_vmap(pos, box, nbl.pairs, paramset))

-21.588284621154514
[-21.58828462 -39.79334159  10.03889335 -48.22451239 -32.90970162
 -49.68568287 -47.58035178 -51.73860617 -37.39235277 -35.01933271
 -46.06621902 -31.69327601  -6.86739655  -5.13698524 -27.4031207
 -44.65301991 -52.00357797   3.1734038  -72.79081259 -28.27007722]


## 2.2 EANN
Navigate to the working directory

In [8]:
current_directory = os.getcwd()
parent_directory = os.path.dirname(current_directory)
os.chdir(parent_directory)
os.chdir(os.path.join("eann"))

The EANN framework grew out of the embedded atom method (EAM). The physically inspired embedded atom neural network (EANN) representation is conceptually and numerically simple, and also efficient and accurate. Assuming that an impurity experiences a locally uniform electron density, its embedding energy can be approximated as a function of the scalar local electron density at the impurity site plus an electrostatic interaction. Treating all atoms in the system as impurities embedded in the electron gas created by the other atoms, the EAM framework writes the total energy of an $N$-atom system as the sum over all individual impurity energies:

$$
E=\sum_{i=1}^{N} E_{i}=\sum_{i=1}^{N}\left[F_{i}\left(\rho_{i}\right)+\frac{1}{2} \sum_{j \neq i} \phi_{i j}\left(r_{i j}\right)\right]
$$

where $F_i$ is the embedding function, $\rho_i$ is the embedded electron density at the position of atom $i$, given by the superposition of the densities of the surrounding atoms, and $\phi_{ij}$ is the short-range repulsive potential between atoms $i$ and $j$, which depends on their distance $r_{ij}$. Because the exact forms of these functions are generally unknown, they are often taken from electron gas computations or fitted to experimental properties with semiempirical expressions. Given these intrinsic approximations, EAM, even in its modified versions, has limited accuracy and is mainly suitable for metallic systems.
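The EAM sum can be written out directly. The sketch below evaluates it for a two-atom system with toy functional forms. The exponential density, the square-root embedding function, and the exponential pair repulsion are assumptions chosen only to make the example runnable. They are not fitted to any material.

```python
import math


def density_fn(r):
    """Toy density contribution of a neighbor at distance r."""
    return math.exp(-r)


def embed_fn(rho):
    """Toy embedding function F(rho)."""
    return -math.sqrt(rho)


def pair_fn(r):
    """Toy short-range repulsion phi(r)."""
    return math.exp(-2.0 * r)


def eam_energy(positions):
    """E = sum_i [ F(rho_i) + 0.5 * sum_{j != i} phi(r_ij) ]."""
    total = 0.0
    for i, r_i in enumerate(positions):
        rho_i = 0.0
        pair_i = 0.0
        for j, r_j in enumerate(positions):
            if i == j:
                continue
            r_ij = math.dist(r_i, r_j)
            rho_i += density_fn(r_ij)  # superposition of neighbor densities
            pair_i += pair_fn(r_ij)
        total += embed_fn(rho_i) + 0.5 * pair_i
    return total


dimer = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
energy = eam_energy(dimer)
print("toy EAM energy of the dimer:", energy)
```

The factor of 0.5 compensates for each pair being visited twice, once from each atom.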

To go beyond the EAM, we need to improve both the expression for the embedded density and the function $F$. To this end, EANN starts from the commonly used Gaussian-type orbitals (GTOs) centered at each atom,

$$
\phi_{l_{x} l_{y} l_{z}}^{\alpha, r_{s}}=x^{l_{x}} y^{l_{y}} z^{l_{z}} \exp \left(-\alpha\left|r-r_{s}\right|^{2}\right)
$$

where each atom is taken as the origin, $\mathbf{r}=(x,y,z)$ is the coordinate vector of an electron, $r$ is the norm of that vector, $\alpha$ and $r_s$ are parameters that determine the radial distribution of the atomic orbitals, and $l_x+l_y+l_z=L$ specifies the orbital angular momentum $L$; for example, $L$ = 0, 1, and 2 correspond to the s, p, and d orbitals, respectively. In this representation, the embedded density of atom $i$ can be taken as the square of the linear combination of atomic orbitals from neighboring atoms, in a similar spirit to Hartree-Fock (HF) and density functional theory (DFT). This generates a scalar $\rho_i$ value for the embedding atom $i$, as used in the EAM. A scalar density alone has been shown to offer insufficient representability for the total energy, which can be improved by including the gradients of the density.
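The density construction can be sketched in a few lines of plain Python. Two things here are assumptions rather than statements from the text above: the multinomial weight $L!/(l_x!\,l_y!\,l_z!)$ applied to each squared term (which is what makes the result rotationally invariant), and the omission of any cutoff function. The coefficients and neighbor positions are hypothetical.

```python
import math
from itertools import product


def gto(dx, dy, dz, lx, ly, lz, alpha, r_s):
    """Gaussian-type orbital evaluated at the displacement (dx, dy, dz)."""
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    return dx**lx * dy**ly * dz**lz * math.exp(-alpha * (r - r_s) ** 2)


def embedded_density(neighbors, coeffs, L, alpha, r_s):
    """Weighted sum over lx+ly+lz=L of the squared linear combination of GTOs."""
    rho = 0.0
    for lx, ly, lz in product(range(L + 1), repeat=3):
        if lx + ly + lz != L:
            continue
        # Multinomial weight (assumed; see the lead-in)
        weight = math.factorial(L) / (
            math.factorial(lx) * math.factorial(ly) * math.factorial(lz)
        )
        amplitude = sum(
            c * gto(dx, dy, dz, lx, ly, lz, alpha, r_s)
            for c, (dx, dy, dz) in zip(coeffs, neighbors)
        )
        rho += weight * amplitude**2
    return rho


# Hypothetical neighbor displacements (nm) relative to the central atom
neighbors = [(0.10, 0.00, 0.05), (-0.08, 0.12, 0.00), (0.00, -0.10, 0.10)]
coeffs = [1.0, 0.5, 0.8]

for L in (0, 1, 2):
    print(f"L={L}: rho = {embedded_density(neighbors, coeffs, L, alpha=2.0, r_s=0.1):.6e}")
```

Rotating all neighbors around the central atom leaves each value unchanged, which is the property a descriptor of the local environment needs.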

The code follows the same steps as for SGNN:

In [9]:
H = Hamiltonian('peg.xml')
app.Topology.loadBondDefinitions("residues.xml")
pdb = app.PDBFile("peg4.pdb")
rc = 0.4
# generator stores all force field parameters
pots = H.createPotential(pdb.topology, nonbondedCutoff=rc*unit.nanometer, ethresh=5e-4)

# construct inputs
positions = jnp.array(pdb.positions._value)
a, b, c = pdb.topology.getPeriodicBoxVectors()
box = jnp.array([a._value, b._value, c._value])
# neighbor list
nbl = NeighborList(box, rc, pots.meta['cov_map']) 
nbl.allocate(positions)


paramset = H.getParameters()
# params = paramset.parameters
paramset.parameters

efunc = jax.jit(pots.getPotentialFunc())
print(efunc(positions, box, nbl.pairs, paramset))

-0.09797672247941436


## 3. OpenMM Plugin for DMFF

This is a plugin for [OpenMM](http://openmm.org) that uses a trained JAX model from [DMFF](https://github.com/deepmodeling/DMFF) as an independent Force class for dynamics.
To use it, you need to save your DMFF model with the script in `DMFF/backend/save_dmff2tf.py`.

Install Python, OpenMM, and cudatoolkit.
```shell

mkdir omm_dmff_working_dir && cd omm_dmff_working_dir
conda create -n dmff_omm -c conda-forge python=3.9 openmm cudatoolkit=11.6
conda activate dmff_omm
```
### Download `libtensorflow_cc` and install `cppflow` package
Install the precompiled libtensorflow_cc library from the deepmodeling channel.
```shell

conda install -c deepmodeling libtensorflow_cc=2.9.1=cuda112h02da4e0_0
```
Download the TensorFlow source archive. Copy the `c` directory from the source code into the installed header files of the TensorFlow library, since the `cppflow` package needs it.
```shell

wget https://github.com/tensorflow/tensorflow/archive/refs/tags/v2.9.1.tar.gz
tar -xvf v2.9.1.tar.gz
cp -r tensorflow-2.9.1/tensorflow/c ${CONDA_PREFIX}/include/tensorflow/
```
Download `cppflow` and copy its headers into the include path of the environment.
```shell

git clone https://github.com/serizba/cppflow.git
cd cppflow
git apply DMFF/backend/openmm_dmff_plugin/tests/cppflow_empty_constructor.patch
mkdir ${CONDA_PREFIX}/include/cppflow
cp -r include/cppflow ${CONDA_PREFIX}/include/
```
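Before compiling the plugin, it can be useful to confirm that the header directories copied above are where the build expects them. The following is a minimal sketch that uses only the Python standard library. The helper name `missing_header_dirs` is hypothetical, and the two paths are taken from the `cp` commands above.

```python
import os
from pathlib import Path


def missing_header_dirs(prefix):
    """Return the expected header directories that do not exist under prefix."""
    expected = [
        Path(prefix) / "include" / "tensorflow" / "c",
        Path(prefix) / "include" / "cppflow",
    ]
    return [str(path) for path in expected if not path.is_dir()]


prefix = os.environ.get("CONDA_PREFIX")
if prefix is None:
    print("CONDA_PREFIX is not set; activate the conda environment first")
else:
    missing = missing_header_dirs(prefix)
    if missing:
        print("missing header directories:", missing)
    else:
        print("all expected header directories were found")
```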

### Install the OpenMM DMFF plugin from source 

Compile the plugin from source with the following steps.
1. Set up environment variables.
   ```shell
   export OPENMM_INSTALLED_DIR=$CONDA_PREFIX
   export CPPFLOW_INSTALLED_DIR=$CONDA_PREFIX
   export LIBTENSORFLOW_INSTALLED_DIR=$CONDA_PREFIX
   cd DMFF/backend/openmm_dmff_plugin/
   mkdir build && cd build
   ```

2. Run `cmake` with the required parameters, then build and install the plugin.
   ```shell
   cmake .. -DOPENMM_DIR=${OPENMM_INSTALLED_DIR} -DCPPFLOW_DIR=${CPPFLOW_INSTALLED_DIR} -DTENSORFLOW_DIR=${LIBTENSORFLOW_INSTALLED_DIR}
   make && make install
   make PythonInstall
   ```
   
3. Test the plugin through the Python interface, on the Reference platform and on the CUDA platform.
   ```shell
   python -m OpenMMDMFFPlugin.tests.test_dmff_plugin_nve -n 100
   python -m OpenMMDMFFPlugin.tests.test_dmff_plugin_nvt -n 100 --platform CUDA
   ```

Check that the plugin can now be imported:

In [None]:
from OpenMMDMFFPlugin import DMFFModel
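If you prefer a check that does not raise an exception when the plugin is absent, the standard library can look the module up without importing it. This is a minimal sketch, and the helper name `plugin_available` is hypothetical.

```python
import importlib.util


def plugin_available(module_name="OpenMMDMFFPlugin"):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


if plugin_available():
    print("OpenMMDMFFPlugin is importable")
else:
    print("OpenMMDMFFPlugin was not found; re-run the build and `make PythonInstall`")
```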

Here is an example of how to use the plugin. You can also run the same command in your shell by dropping the leading `!`:

In [None]:
! python -m OpenMMDMFFPlugin.tests.test_dmff_plugin_nve -n 100 --pdb ../examples/water_fullpol/water_dimer.pdb --model ./openmm_dmff_plugin/python/OpenMMDMFFPlugin/data/admp_water_dimer_aux --has_aux True