
EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations

Yi-Lun Liao, Brandon Wood, Abhishek Das*, Tess Smidt*

[arXiv:2306.12059]

NOTE: Please refer to the official EquiformerV2 codebase for installation instructions and for up-to-date code that reproduces numbers in the paper.

The version of EquiformerV2 code within this OCP repository is meant to make it easier to use EquiformerV2 as part of the OCP toolkit and to ease future development.

OC20 checkpoints and configs

We provide model weights for EquiformerV2 (83M) trained on the S2EF-2M dataset for 30 epochs, and for EquiformerV2 (31M) and EquiformerV2 (153M) trained on S2EF-All+MD.

| Model | Training split | Download | S2EF val force MAE (meV / Å) | S2EF val energy MAE (meV) | Test results |
|---|---|---|---|---|---|
| EquiformerV2 (83M) | 2M | checkpoint \| config | 19.4 | 278 | - |
| EquiformerV2 (31M) | All+MD | checkpoint \| config | 16.3 | 232 | S2EF \| IS2RE \| IS2RS |
| EquiformerV2 (153M) | All+MD | checkpoint \| config | 15.0 | 227 | S2EF \| IS2RE \| IS2RS |

OC22 checkpoints and configs

| Model | Download | S2EF-Total val force MAE (meV / Å) | S2EF-Total val energy MAE (meV) | Test results |
|---|---|---|---|---|
| EquiformerV2 ($\lambda_E$=4, $\lambda_F$=100, 121M) | checkpoint \| config | 26.9 | 547 | S2EF-Total |

OC22 energy prediction

For the energy targets, instead of using the total DFT energies directly, we reference them using per-element linear fit reference energies, followed by normalizing the referenced energy distribution.

That is, during training, the target is $E = \frac{E_{DFT} - E_{ref} - E_{mean}}{E_{std}}$, and during testing/inference, the total DFT energy prediction $\hat{E}_{DFT}$ is given as $\hat{E} \times E_{std} + E_{ref} + E_{mean}$, where
  • $E_{DFT}$ = raw DFT energy,
  • $E_{ref}$ = reference energy (per-element reference energies available here for OC22),
  • $E_{mean}$ = normalizer mean, computed after subtracting per-element references (= 0 for OC22),
  • $E_{std}$ = normalizer standard deviation, computed after subtracting per-element references (= 25.12 for OC22),
  • $\hat{E}$ = predicted energy,
  • $\hat{E}_{DFT}$ = predicted total DFT energy.

We can also write this as $\hat{E}_{DFT} = E_{std} \times (\hat{E} + \frac{E_{ref}}{E_{std}}) + E_{mean}$, which makes it a little easier to handle in the current version of the code.

$\frac{E_{ref}}{E_{std}}$ comes packaged as part of the checkpoint above and can be used during inference via the `use_energy_lin_ref` flag in the config.

During training / finetuning, the OC22 dataloader handles the energy referencing, so set `use_energy_lin_ref=False`.
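The referencing and normalization above can be sketched in a few lines of Python. This is an illustrative sketch, not the repository's actual code: `per_element_ref` is a hypothetical lookup of per-element linear-fit reference energies (the real values ship with the OC22 checkpoint/dataset), and the constants use the OC22 statistics quoted above ($E_{mean} = 0$, $E_{std} = 25.12$).

```python
# Sketch of OC22 energy referencing (hypothetical helper names).
E_MEAN = 0.0    # normalizer mean for OC22, per the text above
E_STD = 25.12   # normalizer std for OC22, per the text above

def reference_energy(atomic_numbers, per_element_ref):
    # E_ref for a structure: sum of per-element linear-fit references.
    return sum(per_element_ref[z] for z in atomic_numbers)

def to_training_target(e_dft, e_ref):
    # Training target: E = (E_DFT - E_ref - E_mean) / E_std
    return (e_dft - e_ref - E_MEAN) / E_STD

def to_total_energy(e_pred, e_ref):
    # Inference: E_DFT_hat = E_hat * E_std + E_ref + E_mean
    return e_pred * E_STD + e_ref + E_MEAN
```

By construction, `to_total_energy(to_training_target(e, r), r)` recovers the raw DFT energy `e`, which is exactly the round trip the two formulas above describe.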

Running EquiformerV2

  • If you haven't trained OCP models before and are specifically interested in EquiformerV2, the training / validation scripts provided in the official EquiformerV2 codebase might be an easier way to get started.
  • We provide a slightly modified trainer and LR scheduler. The differences from the parent forces trainer are the following:
    • Support for cosine LR scheduler.
    • When using the LR scheduler, it first converts epochs into the number of optimizer steps and then passes that to the scheduler, so that everything in the config can be specified in terms of epochs.
  • To run training (similar workflow as other OCP models):
    python main.py \
      --config-yml configs/s2ef/2M/equiformer_v2/equiformer_v2_N@12_L@6_M@2.yml \
      --mode train
  • To run validation with a pretrained model checkpoint:
    python main.py \
      --config-yml configs/s2ef/2M/equiformer_v2/equiformer_v2_N@12_L@6_M@2.yml \
      --checkpoint path/to/checkpoint.pt \
      --mode validate
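The epoch-to-step conversion for a cosine schedule can be sketched as below. This is a hypothetical illustration of the idea, not the trainer's exact scheduler: the function names, the linear-warmup handling, and the `warmup_factor` / `lr_min_factor` defaults are assumptions.

```python
import math

def cosine_lr_lambda(step, warmup_epochs, max_epochs, steps_per_epoch,
                     warmup_factor=0.2, lr_min_factor=0.01):
    """Return an LR multiplier for `step`, with epoch-based config values
    converted to steps (hypothetical sketch of a warmup + cosine schedule)."""
    warmup_steps = warmup_epochs * steps_per_epoch
    max_steps = max_epochs * steps_per_epoch
    if step < warmup_steps:
        # Linear warmup from warmup_factor up to 1.
        alpha = step / max(warmup_steps, 1)
        return warmup_factor * (1 - alpha) + alpha
    # Cosine decay from 1 down to lr_min_factor.
    progress = (step - warmup_steps) / max(max_steps - warmup_steps, 1)
    return lr_min_factor + 0.5 * (1 - lr_min_factor) * (1 + math.cos(math.pi * progress))
```

A function like this could be handed to `torch.optim.lr_scheduler.LambdaLR`; the point is that the config only ever mentions epochs, and the conversion to steps happens once inside the schedule.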

Citing

If you use EquiformerV2 in your work, please consider citing:

@article{equiformer_v2,
  title={{EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations}},
  author={Yi-Lun Liao and Brandon Wood and Abhishek Das and Tess Smidt},
  journal={arXiv preprint arXiv:2306.12059},
  year={2023},
}