# MACE in Practice II

In this tutorial, you will learn how to fit and test a `MACE` model, a message-passing neural network that is a highly accurate and efficient MLIP (Machine Learnt Interatomic Potential). The training and testing techniques we show here, however, are broadly applicable to all MLIPs. You can independently learn about MACE by studying the [original method paper](https://proceedings.neurips.cc/paper_files/paper/2022/file/4a36c3c51af11ed9f34615b81edb5bbc-Paper-Conference.pdf). MACE was developed by unifying the Atomic Cluster Expansion (ACE) approach with Neural Equivariant Interatomic Potentials (NequIP). The mathematical formalism that unifies these methods is explained in the [accompanying paper](https://doi.org/10.48550/arXiv.2205.06643). Another [useful reference](https://doi.org/10.48550/arXiv.2305.14247) showcases the method's performance on published benchmark datasets. The [code implementation](https://github.com/ACEsuit/mace) is publicly available, and the documentation can be found [here](https://mace-docs.readthedocs.io/en/latest/).

## Learning Objectives for today:

1. **Iterative Training: improving stability and accuracy**
2. **Error estimation: committee models**
3. **Active learning: unsupervised iterative training**
4. **Foundational models: out-of-the-box MLIPs**
5. **Fine-tuning on new data and labels**

## The Molecular Liquid Condensed Phase

The MLIP was trained on clusters. Can it simulate the liquid molecular environment?

In [None]:
init_conf = read('data/ECEMC.xyz','4') #read a liquid config with periodic boundary conditions
init_conf.center()

simpleMD(init_conf, temp=500, calc=mace_calc, fname='liquid_md.xyz', s=10, T=2000)

The XTB calculator is non-periodic, so this dynamics would not be possible without an MLIP! Check for yourself by replacing the calculator with xtb. The system is much larger than in the previous example (12 molecules instead of just one); check how GAP scales with size by replacing the calculator with gap.


Transferability from clusters to the condensed phase environment is still an open research question. If this works, it implies that we might be able to train on highly accurate Quantum Chemistry data for molecular clusters and make predictions (density, diffusivity) for the condensed phase! This is new Science!

# Active Learning with MACE

This is a short tutorial on how to use active learning with MACE.

Active learning is an iterative fitting process that aims to provide the model with the most useful training data for improving its performance. For any active learning scheme, two things are essential:

- An efficient data generation source, in our case molecular dynamics.
- A score function that rates how useful a given configuration would be as training data.

The score function in the case of interatomic potentials is usually correlated with the uncertainty of the model's prediction for a given configuration. For neural network potentials like MACE, a straightforward measure of uncertainty is the variance of the output over an ensemble of models.
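As a minimal sketch of this idea, the snippet below computes the committee mean and variance from per-model energies using plain NumPy. The function name `committee_stats` and the numbers are hypothetical, and the choice of the population variance (`ddof=0`) is an assumption rather than a statement about what MACE does internally.

```python
import numpy as np

def committee_stats(energies):
    """Return (mean, variance) over the model axis of committee predictions.

    energies: array of shape (n_models,) or (n_models, n_frames).
    """
    energies = np.asarray(energies, dtype=float)
    mean = energies.mean(axis=0)
    var = energies.var(axis=0)  # population variance (ddof=0), an assumption
    return mean, var

# Hypothetical energies (eV) from three models for two configurations
preds = np.array([
    [-10.00, -9.0],
    [-10.02, -9.6],
    [-9.98, -8.4],
])
mean, var = committee_stats(preds)
print("committee mean:", mean)
print("committee variance:", var)
```

In this toy data the models agree closely on the first configuration and disagree on the second, so the second receives the larger variance and would be the more interesting candidate for labelling.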


To obtain a committee from MACE, we need to train several models and add some randomness to the optimization process. We can achieve this by changing the `--seed`.

Let us train three deliberately small MACE models, so that they break for demonstration purposes. We will use a different seed for each of the three independent models. This will allow us to create a `committee` of independent predictors. As we change hyperparameters, we will also change the `--name` of the model to make sure each one is saved separately.

In [None]:
#prepare much smaller training sets, one independent set of 50 configs per model
from ase.io import read, write
db = read('data/solvent_xtb.xyz', ':')
#db[:3] is shared by all three sets (presumably the isolated-atom references used by --E0s="isolated")
write('data/solvent_mace_small1_train.xyz', db[:3]+db[3:53])
write('data/solvent_mace_small2_train.xyz', db[:3]+db[53:103])
write('data/solvent_mace_small3_train.xyz', db[:3]+db[103:153])

#train the first model
!python3 ./mace/scripts/run_train.py \
    --name="model_small1" \
    --train_file="data/solvent_mace_small1_train.xyz" \
    --valid_fraction=0.05 \
    --E0s="isolated" \
    --energy_key="energy" \
    --forces_key="forces" \
    --model="MACE" \
    --num_interactions=2 \
    --max_ell=2 \
    --hidden_irreps="16x0e" \
    --num_cutoff_basis=5 \
    --correlation=2 \
    --r_max=3.0 \
    --batch_size=5 \
    --valid_batch_size=5 \
    --eval_interval=1 \
    --max_num_epochs=50 \
    --start_swa=30 \
    --swa_energy_weight=1000 \
    --ema \
    --ema_decay=0.99 \
    --amsgrad \
    --error_table="PerAtomRMSE" \
    --default_dtype="float32" \
    --swa \
    --device=cuda \
    --seed=345

#train the second model
!python3 ./mace/scripts/run_train.py \
    --name="model_small2" \
    --train_file="data/solvent_mace_small2_train.xyz" \
    --valid_fraction=0.05 \
    --E0s="isolated" \
    --energy_key="energy" \
    --forces_key="forces" \
    --model="MACE" \
    --num_interactions=2 \
    --max_ell=2 \
    --hidden_irreps="16x0e" \
    --num_cutoff_basis=5 \
    --correlation=2 \
    --r_max=3.0 \
    --batch_size=5 \
    --valid_batch_size=5 \
    --eval_interval=1 \
    --max_num_epochs=50 \
    --start_swa=30 \
    --swa_energy_weight=1000 \
    --ema \
    --ema_decay=0.99 \
    --amsgrad \
    --error_table="PerAtomRMSE" \
    --default_dtype="float32" \
    --swa \
    --device=cuda \
    --seed=567

#train the third model
!python3 ./mace/scripts/run_train.py \
    --name="model_small3" \
    --train_file="data/solvent_mace_small3_train.xyz" \
    --valid_fraction=0.05 \
    --E0s="isolated" \
    --energy_key="energy" \
    --forces_key="forces" \
    --model="MACE" \
    --num_interactions=2 \
    --max_ell=2 \
    --hidden_irreps="16x0e" \
    --num_cutoff_basis=5 \
    --correlation=2 \
    --r_max=3.0 \
    --batch_size=5 \
    --valid_batch_size=5 \
    --eval_interval=1 \
    --max_num_epochs=50 \
    --start_swa=30 \
    --swa_energy_weight=1000 \
    --ema \
    --ema_decay=0.99 \
    --amsgrad \
    --error_table="PerAtomRMSE" \
    --default_dtype="float32" \
    --swa \
    --device=cuda \
    --seed=731
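
The three commands above differ only in `--name`, `--train_file` and `--seed`. As a hedged sketch, they could be generated from one shared list of options. The helper `train_command` and the constant `COMMON_ARGS` are hypothetical names introduced here, and the flags are copied from the commands above. The snippet only prints the commands and does not launch any training.

```python
import shlex

# Options shared by every committee member, copied from the commands above
COMMON_ARGS = [
    "--valid_fraction=0.05", "--E0s=isolated",
    "--energy_key=energy", "--forces_key=forces",
    "--model=MACE", "--num_interactions=2", "--max_ell=2",
    "--hidden_irreps=16x0e", "--num_cutoff_basis=5", "--correlation=2",
    "--r_max=3.0", "--batch_size=5", "--valid_batch_size=5",
    "--eval_interval=1", "--max_num_epochs=50", "--start_swa=30",
    "--swa_energy_weight=1000", "--ema", "--ema_decay=0.99", "--amsgrad",
    "--error_table=PerAtomRMSE", "--default_dtype=float32",
    "--swa", "--device=cuda",
]

def train_command(index, seed):
    """Build the argument list for committee member `index` (hypothetical helper)."""
    return [
        "python3", "./mace/scripts/run_train.py",
        f"--name=model_small{index}",
        f"--train_file=data/solvent_mace_small{index}_train.xyz",
        *COMMON_ARGS,
        f"--seed={seed}",
    ]

for index, seed in enumerate([345, 567, 731], start=1):
    # Print only; pass the list to subprocess.run(..., check=True) to launch
    print(shlex.join(train_command(index, seed)))
```

Keeping the shared options in one place makes it harder for the committee members to drift apart in anything other than data and seed.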

Now we can run dynamics with a committee of models and look at the variance in the energy prediction. Because XTB is cheap enough, we can compare that variance with the true error. Do they correlate?

In [None]:
from aseMolec import extAtoms as ea
from ase import units
from ase.md.langevin import Langevin
from ase.md.velocitydistribution import Stationary, ZeroRotation, MaxwellBoltzmannDistribution
from ase.io import read, write

import random
import numpy as np
import time
import pylab as pl
from IPython import display

from xtb.ase.calculator import XTB
from mace.calculators import MACECalculator

model_paths = ['model_small1_swa.model','model_small2_swa.model', 'model_small3_swa.model']
xtb_calc = XTB(method="GFN2-xTB")
mace_calc = MACECalculator(model_paths=model_paths, device='cpu', default_dtype="float32")

init_conf = ea.sel_by_info_val(read('data/solvent_molecs.xyz',':'), 'Nmols', 1)[0].copy()
init_conf.calc = mace_calc  #set_calculator() is deprecated in recent ASE versions

#initialize the temperature
np.random.seed(701)  #ASE draws velocities from numpy's global RNG by default, so seed numpy rather than `random`
MaxwellBoltzmannDistribution(init_conf, temperature_K=500)
Stationary(init_conf)
ZeroRotation(init_conf)

dyn = Langevin(init_conf, 1*units.fs, temperature_K=1200, friction=0.1)

%matplotlib inline

time_fs = []
temperature = []
energies_1 = []
energies_2 = []
energies_3 = []
variances = []
xtb_energies = []
true_errors = []

! rm -rfv committee_md.xyz
fig, ax = pl.subplots(3, 1, figsize=(8,8), sharex='all', gridspec_kw={'hspace': 0, 'wspace': 0})


def write_frame():
        at = dyn.atoms.copy()
        at.calc = xtb_calc
        xtb_energy = at.get_potential_energy()

        dyn.atoms.write('committee_md.xyz', append=True)
        time_fs.append(dyn.get_time()/units.fs)
        temperature.append(dyn.atoms.get_temperature())
        energies_1.append(dyn.atoms.calc.results["energies"][0]/len(dyn.atoms))
        energies_2.append(dyn.atoms.calc.results["energies"][1]/len(dyn.atoms))
        energies_3.append(dyn.atoms.calc.results["energies"][2]/len(dyn.atoms))
        variances.append(dyn.atoms.calc.results["energy_var"]/len(dyn.atoms))
        xtb_energies.append(xtb_energy/len(dyn.atoms))
        true_errors.append(np.var([dyn.atoms.calc.results["energy"],xtb_energy])/len(dyn.atoms))

        # subplot the variance of the energy as a function of the steps and the temperature as two subplots
        ax[0].plot(np.array(time_fs), np.array(variances), color="y")
        ax[0].plot(np.array(time_fs), np.array(true_errors), color="black")
        ax[0].set_ylabel(r'$\Delta$ E (eV$^2$/atom)')
        ax[0].legend(['Estimated Error', 'True Error'], loc='lower left')

        # plot the temperature of the system as subplots
        ax[1].plot(np.array(time_fs), temperature, color="r", label='Temperature')
        ax[1].set_ylabel("T (K)")

        ax[2].plot(np.array(time_fs), energies_1, color="g")
        ax[2].plot(np.array(time_fs), energies_2, color="y")
        ax[2].plot(np.array(time_fs), energies_3, color="olive")
        ax[2].plot(np.array(time_fs), xtb_energies, color="black")
        ax[2].set_ylabel("E (eV/atom)")
        ax[2].set_xlabel('Time (fs)')
        ax[2].legend(['E mace1', 'E mace2', 'E mace3', 'E xtb'], loc='lower left')

        display.clear_output(wait=True)
        display.display(fig)
        time.sleep(0.01)

dyn.attach(write_frame, interval=10)
dyn.run(2000)
print("MD finished!")
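
To put a number on how well the two curves track each other, one option is a Pearson correlation on a logarithmic scale, since both quantities span several orders of magnitude. The sketch below assumes lists like `variances` and `true_errors` from the cell above. The function name `error_correlation` and the demo values are hypothetical stand-ins, not results from this run.

```python
import numpy as np

def error_correlation(estimated, true):
    """Pearson correlation between log10(estimated) and log10(true) errors."""
    est = np.asarray(estimated, dtype=float)
    tru = np.asarray(true, dtype=float)
    mask = (est > 0) & (tru > 0)  # the logarithm needs strictly positive values
    return np.corrcoef(np.log10(est[mask]), np.log10(tru[mask]))[0, 1]

# Hypothetical values standing in for the `variances` and `true_errors` lists
variances_demo = [1e-6, 4e-6, 1e-5, 3e-4, 2e-2]
true_errors_demo = [2e-6, 3e-6, 5e-5, 1e-3, 9e-2]

r = error_correlation(variances_demo, true_errors_demo)
print(f"log-log Pearson r = {r:.2f}")
```

After the MD run, calling `error_correlation(variances, true_errors)` would give the corresponding value for the actual trajectory.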

As expected, the dynamics has failed. In this case our reference PES is XTB, which is cheap to evaluate, so we can easily check the error on the fly. In practice we will be training MLIPs on expensive reference methods, where computing the true error on the fly is impractical. Notice that when the dynamics `explodes`, the `true error` diverges, but crucially the `estimated error` also diverges.
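
Because the estimated error rises together with the true error, it can serve as the score function for choosing which frames to send to the reference method. The following is a minimal sketch of one possible selection rule, assuming a user-chosen threshold. The function `select_uncertain_frames`, the threshold and the spacing parameter `min_gap` are hypothetical and not part of MACE.

```python
def select_uncertain_frames(scores, threshold, min_gap=1):
    """Return indices of frames whose score exceeds `threshold`.

    Selected frames are kept at least `min_gap` indices apart so that
    nearly identical consecutive frames are not all selected.
    """
    selected = []
    for i, score in enumerate(scores):
        far_enough = not selected or i - selected[-1] >= min_gap
        if score > threshold and far_enough:
            selected.append(i)
    return selected

# Hypothetical per-frame committee variances (eV^2/atom)
scores = [1e-6, 2e-6, 5e-4, 8e-4, 3e-6, 2e-2]
picked = select_uncertain_frames(scores, threshold=1e-4, min_gap=2)
print(picked)  # -> [2, 5]
```

The selected frames would then be labelled with the reference method and added to the training set before refitting the committee.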