# Fine-tune the pretrained CHGNet for better accuracy


In [None]:
from __future__ import annotations

# install CHGNet (only needed on Google Colab or if you didn't install CHGNet yet)
!git clone --depth 1 https://github.com/CederGroupHub/chgnet
!pip install ./chgnet

In [None]:
import numpy as np
from pymatgen.core import Structure

from chgnet.model import CHGNet

chgnet = CHGNet.load()

CHGNet initialized with 400,438 parameters


## 1. Prepare Training Data


In [None]:
try:
    from chgnet import ROOT

    lmo = Structure.from_file(f"{ROOT}/examples/o-LiMnO2_unit.cif")
except Exception:
    from urllib.request import urlopen

    url = "https://github.com/CederGroupHub/chgnet/raw/main/examples/o-LiMnO2_unit.cif"
    cif = urlopen(url).read().decode("utf-8")
    lmo = Structure.from_str(cif, fmt="cif")

We create a dummy fine-tuning dataset from CHGNet predictions with some random noise added. To fine-tune on a specific chemical system or on AIMD data, replace the block below with your own structures and labels.


In [None]:
structures, energies_per_atom, forces, stresses, magmoms = [], [], [], [], []

for _i in range(100):
    structure = lmo.copy()
    # stretch the cell by a small amount
    structure.apply_strain(np.random.uniform(-0.1, 0.1, size=(3)))
    # perturb all atom positions by a small amount
    structure.perturb(0.1)

    pred = chgnet.predict_structure(structure)

    structures.append(structure)
    energies_per_atom.append(pred["e"] + np.random.uniform(-0.1, 0.1, size=1))
    forces.append(pred["f"] + np.random.uniform(-0.01, 0.01, size=pred["f"].shape))
    stresses.append(
        pred["s"] * -10 + np.random.uniform(-0.05, 0.05, size=pred["s"].shape)
    )
    magmoms.append(pred["m"] + np.random.uniform(-0.03, 0.03, size=pred["m"].shape))

Note that the stress output from CHGNet is in units of GPa. The factor of -10 converts it to kbar, the raw VASP unit. We do this because the StructureData dataset class takes VASP units by default.
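The conversion above can be sketched in isolation. This is a minimal illustration, assuming 1 GPa = 10 kbar and that the raw VASP stress has the opposite sign to the CHGNet output (which is what the factor of -10 in the cell above implies). The helper name `gpa_to_vasp_kbar` and the example tensor are hypothetical, not part of CHGNet.

```python
GPA_TO_KBAR = 10.0  # 1 GPa = 10 kbar


def gpa_to_vasp_kbar(stress_gpa):
    """Convert a 3x3 stress tensor from GPa to kbar and flip its sign.

    Hypothetical helper: mirrors the `pred["s"] * -10` step used above.
    """
    return [[-GPA_TO_KBAR * value for value in row] for row in stress_gpa]


# Made-up stress tensor in GPa, for illustration only
stress_gpa = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.5],
    [0.0, 0.5, -3.0],
]
stress_kbar = gpa_to_vasp_kbar(stress_gpa)
print(stress_kbar)
```

If your labels already come from raw VASP output in kbar, no conversion should be needed before passing them to StructureData.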


## 2. Define DataSet


In [None]:
from chgnet.data.dataset import StructureData, get_train_val_test_loader

In [None]:
dataset = StructureData(
    structures=structures,
    energies=energies_per_atom,
    forces=forces,
    stresses=stresses,  # can be None
    magmoms=magmoms,  # can be None
)
train_loader, val_loader, test_loader = get_train_val_test_loader(
    dataset, batch_size=8, train_ratio=0.9, val_ratio=0.05
)

Here the batch_size is set to 8 to fit in a small amount of GPU memory. If more than 10 GB of memory is available, we highly recommend increasing the batch_size for better speed.
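To get a feel for what these settings mean for the 100 dummy structures, the split sizes and the number of training batches can be estimated by hand. This is a rough sketch: the helper `split_sizes` is hypothetical, and it assumes that the sizes are obtained by truncating `n_total * ratio`, that the test set receives whatever is left, and that the last partial batch is kept. The actual loader may round differently.

```python
import math


def split_sizes(n_total, train_ratio, val_ratio):
    """Estimate train/val/test sizes (assumed rounding, see lead-in)."""
    n_train = int(n_total * train_ratio)
    n_val = int(n_total * val_ratio)
    n_test = n_total - n_train - n_val  # remainder goes to the test set
    return n_train, n_val, n_test


n_train, n_val, n_test = split_sizes(100, train_ratio=0.9, val_ratio=0.05)
batch_size = 8
batches_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, n_test)  # 90 5 5
print(batches_per_epoch)       # 12
```

With only 5 validation and 5 test structures, the reported validation and test errors will be noisy. For real fine-tuning data a larger dataset or larger validation fraction is worth considering.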

If you have a very large number of structures (which is typical for AIMD data), it is inefficient, and often impossible due to memory limits, to load them all into a Python list at once. In this case we highly recommend pre-converting all structures into graphs and saving them with examples/make_graphs.py. You can then train CHGNet by loading the graphs from disk instead of memory, using the GraphData class defined in data/dataset.py.


## 3. Define model and trainer


In [None]:
from chgnet.trainer import Trainer

# Load pretrained CHGNet
chgnet = CHGNet.load()

It's optional to freeze the weights inside some layers. This is a common technique to retain the learned knowledge during fine-tuning in large pretrained neural networks.


In [None]:
# Optionally fix the weights of some layers
for layer in [
    chgnet.atom_embedding,
    chgnet.bond_embedding,
    chgnet.angle_embedding,
    chgnet.bond_basis_expansion,
    chgnet.angle_basis_expansion,
    chgnet.atom_conv_layers,
    chgnet.bond_conv_layers,
    chgnet.angle_layers,
]:
    for param in layer.parameters():
        param.requires_grad = False
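
A quick way to check that freezing had the intended effect is to count trainable versus total parameters. The sketch below uses a small stand-in PyTorch model rather than CHGNet, so it runs without downloading any weights. The helper `count_parameters` and the toy model are illustrative assumptions, not part of the CHGNet API.

```python
import torch.nn as nn


def count_parameters(model):
    """Return (trainable, total) parameter counts of a torch module."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total


# Toy stand-in model: 4 -> 8 -> 1 multilayer perceptron
toy_model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

# Freeze the first linear layer, in the same way as the loop above
for param in toy_model[0].parameters():
    param.requires_grad = False

trainable, total = count_parameters(toy_model)
print(f"{trainable} of {total} parameters are trainable")
```

Since CHGNet is a PyTorch module, the same helper should work as `count_parameters(chgnet)` after the freezing loop. Only parameters with `requires_grad=True` receive gradient updates during training.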

In [None]:
# Define Trainer
trainer = Trainer(
    model=chgnet,
    targets="efsm",
    optimizer="Adam",
    scheduler="CosLR",
    criterion="MSE",
    epochs=5,
    learning_rate=1e-2,
    use_device="cpu",
    print_freq=6,
)

## 4. Start training


In [None]:
trainer.train(train_loader, val_loader, test_loader)

After training, the trained model is saved in a directory named after today's date. It can also be accessed directly from the trainer:


In [None]:
model = trainer.model
best_model = trainer.best_model

## Extras 1: GGA / GGA+U compatibility


### Q: Why and when do you care about this?

**When**: If you want to fine-tune the pretrained CHGNet with your own GGA+U VASP calculations and keep your VASP energies compatible with the pretrained dataset. If your dataset is so large that the pretrained knowledge does not matter to you, you can ignore this.

**Why**: CHGNet is trained on both GGA and GGA+U calculations from the Materials Project. Methods have been developed to reconcile GGA and GGA+U energies, which makes the energies universally applicable for cross-chemistry comparison and phase-diagram construction. Please refer to:

https://journals.aps.org/prb/abstract/10.1103/PhysRevB.84.045115

Below we show an example to apply the compatibility.


In [None]:
# Imagine this is the VASP raw energy
chgnet = CHGNet.load()
VASP_raw_energy = chgnet.predict_structure(lmo)["e"] * len(lmo)
print(f"The raw total energy from VASP of LMO is: {VASP_raw_energy}")

You can look up the energy correction applied to each element in:

https://github.com/materialsproject/pymatgen/blob/v2023.2.28/pymatgen/entries/MP2020Compatibility.yaml

Here LiMnO2 receives two corrections: the correction for Mn in transition-metal oxides and the oxide correction for O.


In [None]:
num_Mn = lmo.composition.as_dict()["Mn3+"]
Mn_correction_in_TMO = -1.668
num_O = lmo.composition.as_dict()["O2-"]
Oxide_correction = -0.687

corrected_energy = (
    VASP_raw_energy + num_Mn * Mn_correction_in_TMO + num_O * Oxide_correction
)
print(
    f"The corrected total energy of LMO after MP2020Compatibility = {corrected_energy}"
)

Now use this corrected energy as the label when fine-tuning CHGNet and you're good to go!
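The same bookkeeping can be written as a small reusable helper. This is a minimal sketch, not the pymatgen compatibility scheme itself: `apply_corrections` is a hypothetical function, the atom counts and raw energy are made up, and the helper applies every listed correction blindly without checking whether the compound actually qualifies for it (for example, whether Mn sits in an oxide). Only the two correction values are taken from the cell above. Because the dataset in section 2 is built from energies per atom, the sketch also divides by the number of atoms.

```python
def apply_corrections(raw_total_energy, element_counts, corrections):
    """Add per-atom energy corrections (eV/atom) to a raw total energy (eV)."""
    shift = sum(
        count * corrections.get(element, 0.0)
        for element, count in element_counts.items()
    )
    return raw_total_energy + shift


# Correction values used in the cell above (eV/atom)
corrections = {"Mn": -1.668, "O": -0.687}

# Hypothetical cell contents and raw energy, for illustration only
element_counts = {"Li": 2, "Mn": 2, "O": 4}
raw_total_energy = -50.0

corrected_total = apply_corrections(raw_total_energy, element_counts, corrections)
corrected_per_atom = corrected_total / sum(element_counts.values())
print(corrected_total, corrected_per_atom)
```

For production data, letting pymatgen's compatibility tooling decide which corrections apply is safer than hard-coding them.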


## Extras 2: AtomRef


If you want to fine-tune CHGNet to DFT labels that are even less compatible with the Materials Project, such as the r2SCAN functional or other DFT codes like Gaussian or QE, more care is needed to retain as much of the information learned during pretraining as possible.

For example, formation energy can be a well-compatible property across different functionals. In CHGNet, we use an AtomRef operation, which is a formation-energy-like calculation of the per-element contribution to the total energy.

When fine-tuning to other functionals that might have large discrepancies in elemental energies, we recommend refitting the AtomRef, so that fine-tuning of the graph layers can focus on the energy contribution from atom-atom interactions instead of meaningless atom reference energies.

Below is an example of refitting the AtomRef layer:


In [None]:
print("The pretrained Atom_Ref (per atom reference energy):")
for param in chgnet.composition_model.parameters():
    print(param)

In [None]:
# A list of structures / graphs
structures = [
    lmo,
    Structure(
        species=["Li", "Mn", "Mn", "O", "O", "O"],
        lattice=np.random.rand(3, 3),
        coords=np.random.rand(6, 3),
    ),
    Structure(
        species=["Li", "Li", "Mn", "O", "O", "O"],
        lattice=np.random.rand(3, 3),
        coords=np.random.rand(6, 3),
    ),
    Structure(
        species=["Li", "Mn", "Mn", "O", "O", "O", "O"],
        lattice=np.random.rand(3, 3),
        coords=np.random.rand(7, 3),
    ),
]

# A list of energy_per_atom values (random values here)
energies_per_atom = [5.5, 6, 4.8, 5.6]

In [None]:
from chgnet.model.composition_model import AtomRef

print("We initialize another identical AtomRef layer")
new_AtomRef = AtomRef(is_intensive=True)
new_AtomRef.initialize_from_MPtrj()
for param in new_AtomRef.parameters():
    print(param[:, :3])

In [None]:
new_AtomRef.fit(structures, energies_per_atom)
print("After refitting, the AtomRef looks like:")
for i in new_AtomRef.parameters():
    print(i)
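
Conceptually, refitting per-element reference energies is a linear regression: the energy per atom is modeled as the composition-weighted sum of one reference energy per element. The sketch below reproduces that idea with plain NumPy for the four compositions used above. It is an illustration under that assumption, not a copy of what `AtomRef.fit` does internally; the helper `fraction_matrix` is hypothetical and the energies are the same random placeholder values as above.

```python
import numpy as np

elements = ["Li", "Mn", "O"]

# Atom counts of the four example structures above
compositions = [
    {"Li": 1, "Mn": 1, "O": 2},
    {"Li": 1, "Mn": 2, "O": 3},
    {"Li": 2, "Mn": 1, "O": 3},
    {"Li": 1, "Mn": 2, "O": 4},
]
energies_per_atom = np.array([5.5, 6.0, 4.8, 5.6])  # placeholder values


def fraction_matrix(compositions, elements):
    """Build an (n_structures, n_elements) matrix of atomic fractions."""
    rows = []
    for comp in compositions:
        n_atoms = sum(comp.values())
        rows.append([comp.get(el, 0) / n_atoms for el in elements])
    return np.array(rows)


features = fraction_matrix(compositions, elements)

# Least-squares solution of features @ reference_energies ~ energies_per_atom
reference_energies, *_ = np.linalg.lstsq(features, energies_per_atom, rcond=None)
predicted = features @ reference_energies

for el, e_ref in zip(elements, reference_energies):
    print(f"{el}: {e_ref:.3f} eV/atom")
print("residuals:", energies_per_atom - predicted)
```

With only four nearly collinear compositions the fitted values are poorly constrained, which is why a real refit should use a dataset that spans the chemistry of interest.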