# A Short MACE Tutorial
## Ilyes Batatia 

Hot link to Google Colab version: https://colab.research.google.com/drive/1D6EtMUjQPey_GkuxUAbPgld6_9ibIa-V?authuser=0&pli=1#scrollTo=X2XNYxlFHEKR

## Introduction

This is a short tutorial for MACE, a highly accurate and efficient ML interatomic potential.
Please read the associated [paper](https://arxiv.org/pdf/2206.07697.pdf).
The reference implementation is available [here](https://github.com/ACEsuit/mace).

## Installation

In [None]:
# Install dependencies
!pip install e3nn==0.4.4 opt_einsum ase torch_ema prettytable

# Clone MACE
!git clone --depth 1 https://github.com/ACEsuit/mace.git

Collecting e3nn==0.4.4
  Downloading e3nn-0.4.4-py3-none-any.whl (387 kB)
Collecting ase
  Downloading ase-3.22.1-py3-none-any.whl (2.2 MB)
Collecting torch_ema
  Downloading torch_ema-0.3-py3-none-any.whl (5.5 kB)
Collecting opt-einsum-fx>=0.1.4 (from e3nn==0.4.4)
  Downloading opt_einsum_fx-0.1.4-py3-none-any.whl (13 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.8.0->e3nn==0.4.4)
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.8.0->e3nn==0.4.4)
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-m

In [None]:
!pip install mace/

Processing ./mace
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting matscipy (from mace-torch==0.3.4)
  Downloading matscipy-1.0.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (438 kB)
Collecting looseversion (from matscipy->mace-torch==0.3.4)
  Downloading looseversion-1.3.0-py2.py3-none-any.whl (8.2 kB)
Building wheels for collected packages: mace-torch
  Building wheel for mace-torch (pyproject.toml) ... done
  Created wheel for mace-torch: filename=mace_torch-0.3.4-py3-none-any.whl size=87188 sha256=09450eae8a778c0bdccf4f218e2f4af17123a0bc35d10bcb3de28b0fa4fa7ef9
  Stored in directory: /tmp/pip-ephem-wheel-cache-up82ga0y/wheels/df/5f/32/ef59561725170a81c728fd01c75e56a9ee83bad6da485fc6a5
Successfully built m

**Note:** Make sure to enable GPU: Runtime --> Change runtime type to GPU
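
The training and MD cells in this tutorial pass `cpu` as the device. The following is a minimal sketch for checking from Python whether the runtime actually exposes a GPU. `pick_device` is a hypothetical helper, not part of MACE; it falls back to `cpu` when PyTorch is not installed or no CUDA device is visible.

```python
def pick_device() -> str:
    """Return "cuda" if PyTorch sees a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # pulled in as a dependency of e3nn
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

The resulting string could then be used in place of `cpu` in the `--device` flag and in the `device=` argument of `MACECalculator`, assuming the installed MACE version accepts `cuda` there.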

## Loading Data
The data files used to train a MACE model have to be in `extxyz` format.
In this tutorial, we use the 3BPA dataset, whose training set consists of 500 configurations sampled at 300 K, with energies and forces computed with DFT.
The energies are in eV and the forces in eV/Å.

In [None]:
!git clone https://github.com/davkovacs/BOTNet-datasets.git

Cloning into 'BOTNet-datasets'...
remote: Enumerating objects: 57, done.
remote: Counting objects: 100% (57/57), done.
remote: Compressing objects: 100% (50/50), done.
remote: Total 57 (delta 13), reused 37 (delta 7), pack-reused 0
Receiving objects: 100% (57/57), 28.73 MiB | 20.03 MiB/s, done.
Resolving deltas: 100% (13/13), done.


In [None]:
!ls BOTNet-datasets/dataset_3BPA

iso_atoms.xyz  test_1200K.xyz  test_600K.xyz  train_300K.xyz
README.md      test_300K.xyz   test_dih.xyz   train_mixedT.xyz
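
As a quick sanity check on the downloaded files, the number of configurations in an `extxyz` file can be counted without any extra dependencies. This is a sketch that assumes the plain layout of the format (one line with the atom count, one comment line, then one line per atom); `count_frames` is a hypothetical helper, and `ase.io.read(path, index=':')` would give the same information as a list of `Atoms` objects.

```python
from pathlib import Path


def count_frames(lines):
    """Count configurations in an iterable of (ext)xyz lines."""
    it = iter(lines)
    n_frames = 0
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between frames
        n_atoms = int(header)
        next(it, "")  # skip the comment / key=value line
        for _ in range(n_atoms):
            next(it, "")  # skip the atom lines
        n_frames += 1
    return n_frames


train_path = Path("BOTNet-datasets/dataset_3BPA/train_300K.xyz")
if train_path.exists():  # only runs once the dataset has been cloned
    with open(train_path) as handle:
        print(train_path.name, count_frames(handle))
```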


## Training

To train a MACE model, specify the training file with the `--train_file` flag. The validation set can either be given as a separate file with the `--valid_file` flag, or taken as a fraction of the training set with the `--valid_fraction` flag. A test set that is only evaluated at the end of training can be provided with the `--test_file` flag. To compute the RMSE separately for different parts of the training set, set the `config_type` key in the `info` dict of each configuration.
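
To make the effect of a validation fraction concrete, here is a minimal sketch of a seeded random split. It illustrates the idea only and is not MACE's actual implementation; `train_valid_split` is a hypothetical helper.

```python
import random


def train_valid_split(n_configs, valid_fraction, seed):
    """Shuffle indices 0..n_configs-1 and hold out a fraction for validation."""
    indices = list(range(n_configs))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible
    n_valid = int(n_configs * valid_fraction)
    return indices[n_valid:], indices[:n_valid]


# Values taken from this tutorial: 500 configurations, 5% held out, seed 123
train_idx, valid_idx = train_valid_split(500, valid_fraction=0.05, seed=123)
print(len(train_idx), len(valid_idx))
```

With 500 training configurations and a fraction of 0.05, such a split holds out 25 configurations and trains on the remaining 475.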

When parsing the data files, the energies are read using the key `energy` and the forces using the key `forces`. To change that, specify the `--energy_key` and `--forces_key` flags.
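
To find out which keys a given file uses, one can inspect the comment line (the second line) of a frame. The sketch below parses such a line with the standard library; `frame_keys` is a hypothetical helper and the example line is made up for illustration, not copied from the 3BPA files.

```python
import shlex


def frame_keys(comment_line):
    """Return (per-frame keys, per-atom column names) of an extxyz comment line."""
    info = {}
    for token in shlex.split(comment_line):
        key, sep, value = token.partition("=")
        if sep:  # ignore bare tokens without '='
            info[key] = value
    # Properties has the form name:type:ncols:name:type:ncols:...
    fields = info.pop("Properties", "").split(":")
    columns = fields[0::3] if len(fields) >= 3 else []
    return sorted(info), columns


# Hypothetical comment line, for illustration only
example = 'Properties=species:S:1:pos:R:3:forces:R:3 energy=-1.5 config_type=demo pbc="F F F"'
print(frame_keys(example))
```

Per-frame keys such as the energy label are candidates for `--energy_key`, and per-atom columns such as the force label are candidates for `--forces_key`.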

For illustration, we create a very small model with 32 invariant messages, specified by `hidden_irreps='32x0e'`.
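
The `hidden_irreps` string follows the e3nn notation `<multiplicity>x<l><parity>`, so `32x0e` reads as 32 channels with angular momentum l = 0 and even parity, i.e. 32 invariant scalars. The sketch below parses this notation by hand to show how the feature dimension follows from it; `parse_irreps` and `irreps_dim` are hypothetical helpers, and in practice `e3nn.o3.Irreps` does this parsing.

```python
def parse_irreps(irreps):
    """Parse a string like "32x0e + 16x1o" into (multiplicity, l, parity) tuples."""
    parsed = []
    for term in irreps.split("+"):
        mul, _, ir = term.strip().partition("x")
        parsed.append((int(mul), int(ir[:-1]), ir[-1]))
    return parsed


def irreps_dim(irreps):
    """Total feature dimension: each irrep of degree l has 2l + 1 components."""
    return sum(mul * (2 * l + 1) for mul, l, _ in parse_irreps(irreps))


print(parse_irreps("32x0e"), irreps_dim("32x0e"))
print(irreps_dim("16x0e + 16x1o"))  # scalars plus vectors, for comparison
```

Adding equivariant channels (l > 0) increases the feature dimension by 2l + 1 per channel, which is one reason the purely invariant setting used here keeps the model small.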

In [None]:
!python3 ./mace/scripts/run_train.py \
  --name="MACE_model" \
  --train_file="BOTNet-datasets/dataset_3BPA/train_300K.xyz" \
  --valid_fraction=0.05 \
  --test_file="BOTNet-datasets/dataset_3BPA/test_300K.xyz" \
  --E0s='{1:-13.663181292231226, 6:-1029.2809654211628, 7:-1484.1187695035828, 8:-2042.0330099956639}' \
  --model="ScaleShiftMACE" \
  --hidden_irreps='32x0e' \
  --r_max=4.0 \
  --batch_size=20 \
  --max_num_epochs=100 \
  --ema \
  --ema_decay=0.99 \
  --amsgrad \
  --default_dtype="float32" \
  --device=cpu \
  --seed=123 \
  --swa

2024-04-02 10:00:25.893 INFO: MACE version: 0.3.4
2024-04-02 10:00:25.895 INFO: Configuration: Namespace(name='MACE_model', seed=123, log_dir='logs', model_dir='.', checkpoints_dir='checkpoints', results_dir='results', downloads_dir='downloads', device='cpu', default_dtype='float32', log_level='INFO', error_table='PerAtomRMSE', model='ScaleShiftMACE', r_max=4.0, radial_type='bessel', num_radial_basis=8, num_cutoff_basis=5, interaction='RealAgnosticResidualInteractionBlock', interaction_first='RealAgnosticResidualInteractionBlock', max_ell=3, correlation=3, num_interactions=2, MLP_irreps='16x0e', radial_MLP='[64, 64, 64]', hidden_irreps='32x0e', num_channels=None, max_L=None, gate='silu', scaling='rms_forces_scaling', avg_num_neighbors=1, compute_avg_num_neighbors=True, compute_stress=False, compute_forces=True, train_file='BOTNet-datasets/dataset_3BPA/train_300K.xyz', valid_file=None, valid_fraction=0.05, test_file='BOTNet-datasets/dataset_3BPA/test_300K.xyz', E0s='{1:-13.6631812922312

It is also possible to use `--model=MACE`, which gives the correct limit for isolated atoms. This is recommended for tasks that study bond-breaking events.
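
One way to read "the correct limit for isolated atoms": when all atoms are separated by more than the cutoff, the predicted energy should reduce to the sum of the per-element reference energies passed via `--E0s`. The sketch below computes that baseline, assuming the E0s are isolated-atom energies in eV keyed by atomic number. `isolated_atom_baseline` is a hypothetical helper, and the water composition is purely illustrative and not part of the dataset.

```python
# Reference energies (eV) keyed by atomic number, copied from the --E0s flag
E0S = {
    1: -13.663181292231226,
    6: -1029.2809654211628,
    7: -1484.1187695035828,
    8: -2042.0330099956639,
}


def isolated_atom_baseline(atomic_numbers, e0s=E0S):
    """Sum of isolated-atom energies for a list of atomic numbers."""
    return sum(e0s[z] for z in atomic_numbers)


water = [8, 1, 1]  # illustrative composition: one O and two H atoms
print(isolated_atom_baseline(water))
```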

## Run

The trained model is readily usable with ASE; for illustration, we run a short MD simulation. Colab hardware is not very fast, so we only run a small number of timesteps.

In [None]:
from ase import units
from ase.io import read
from ase.md.langevin import Langevin

from mace.calculators import MACECalculator

# Load the trained model as an ASE calculator
calculator = MACECalculator(model_paths='/content/checkpoints/MACE_model_run-123.model', device='cpu')

# Start from the first configuration of the 300 K test set
init_conf = read('BOTNet-datasets/dataset_3BPA/test_300K.xyz', '0')
init_conf.calc = calculator

# Langevin dynamics with a 0.5 fs timestep and a 310 K target temperature
dyn = Langevin(init_conf, 0.5*units.fs, temperature_K=310, friction=5e-3)

def write_frame():
    # Append the current frame to the trajectory file
    dyn.atoms.write('md_3bpa.xyz', append=True)

dyn.attach(write_frame, interval=50)  # save a frame every 50 steps
dyn.run(100)
print("MD finished!")
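
Because the frames are written with `append=True`, rerunning the MD cell adds new frames to an existing `md_3bpa.xyz` instead of starting a fresh trajectory. A small sketch for clearing the file before a new run; `reset_trajectory` is a hypothetical helper, not part of ASE or MACE.

```python
from pathlib import Path


def reset_trajectory(path):
    """Delete an existing trajectory file; return True if one was removed."""
    target = Path(path)
    if target.exists():
        target.unlink()
        return True
    return False


# Call this before dyn.run(...) to start from an empty trajectory file
reset_trajectory("md_3bpa.xyz")
```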