diff --git a/lambench/metrics/results/README.md b/lambench/metrics/results/README.md
index 0187a844..07413708 100644
--- a/lambench/metrics/results/README.md
+++ b/lambench/metrics/results/README.md
@@ -1,6 +1,6 @@
 # Overview
 
-Large atomic models (LAM), also known as machine learning interatomic potentials (MLIPs), are considered foundation models that predict atomic interactions across diverse systems using data-driven approaches. LAMBench is a benchmark designed to evaluate the performance of such models. It provides a comprehensive suite of tests and metrics to help developers and researchers understand the accuracy and generalizability of their machine learning models.
+Large atomistic models (LAM), also known as machine learning interatomic potentials (MLIPs), are considered foundation models that predict atomic interactions across diverse systems using data-driven approaches. **LAMBench** is a benchmark designed to evaluate the performance of such models. It provides a comprehensive suite of tests and metrics to help developers and researchers understand the accuracy and generalizability of their machine learning models.
 
 ## Our mission includes
 
@@ -26,7 +26,7 @@ Figure 1: Generalizability on force field prediction tasks, 1 - $\bar{M}^m_{FF}$
 
 Figure 2: Accuracy-Efficiency Trade-off, $\bar{M}^m_{FF}$ vs $M_E^m$.
 
-# LAMBench Metrics Calculations
+# LAMBench Metrics Calculation
 
 ## Generalizability
 
@@ -81,12 +81,16 @@ In contrast, an ideal model that perfectly matches Density Functional Theory (DF
 
 ### Domain Specific Property Calculation
 
-For the domain-specific property tasks, we adopt the MAE as the error metric.
-In the Inorganic Materials domain, the MDR phonon benchmark predicts maximum phonon
-frequency, entropy, free energy, and heat capacity at constant volume, with each prediction type assigned a weight of 0.25.
-In the Molecules domain, the TorsionNet500 benchmark predicts the torsion profile energy, torsion barrier height, and the number of molecules for which the model's prediction of the torsional barrier height has an error exceeding 1 kcal/mol.
-Each prediction type in this domain is assigned a weight of $\frac{1}{3}$.
-The resulting score is denoted as $\bar M^{m}_{PC}$.
+For the domain-specific property calculation tasks, we adopt the MAE as the primary error metric.
+
+In the Inorganic Materials domain, the MDR phonon benchmark predicts the maximum phonon frequency, entropy, free energy, and heat capacity at constant volume, while the elasticity benchmark evaluates the shear and bulk moduli. Each prediction type
+is assigned an equal weight of $\frac{1}{6}$.
+
+In the Molecules domain, the TorsionNet500 benchmark evaluates the torsion profile energy, torsional barrier height, and the number of molecules for which the predicted torsional barrier height error exceeds 1 kcal/mol. The Wiggle150 benchmark assesses the relative conformer energy profile. Each prediction type in this domain is assigned a weight of 0.25.
+
+In the Catalysis domain, the OC20NEB-OOD benchmark evaluates the energy barrier, reaction energy change (delta energy), and the percentage of reactions with predicted energy barrier errors exceeding 0.1 eV for three reaction types: transfer, dissociation, and desorption. Each prediction type in this domain is assigned a weight of 0.2.
+
+The resulting error metric after averaging over all domains is denoted as $\bar M^{m}_{PC}$.
 
 ## Applicability
 
@@ -122,4 +126,4 @@ The final instability metric is computed as the average over all nine structures
 
 $$M^m_{\mathrm{IS}} = \frac{1}{9}\sum_{i=1}^{9} M^m_{\mathrm{IS},i}$$
 
-This result is bounded within the range $[0, +\infty]$, where a lower value signifies greater stability.
+This result is bounded within the range $[0, +\infty)$, where a lower value signifies greater stability.
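
Note on the README hunk above: the per-domain weighting it describes can be sketched in a few lines of Python. This is only an illustrative sketch, not LAMBench's implementation. It assumes each per-property error has already been reduced to a comparable, dimensionless value, that a domain score is the weighted sum of its prediction types, and that $\bar M^{m}_{PC}$ is the plain mean of the domain scores. The names `DOMAIN_WEIGHTS`, `domain_score`, `property_metric`, and all dictionary keys are hypothetical. The Catalysis domain is left out because the README text does not enumerate the five prediction types implied by its weight of 0.2.

```python
from statistics import fmean

# Weights as stated in the README text; the keys are illustrative labels only.
DOMAIN_WEIGHTS = {
    "inorganic_materials": {
        "max_phonon_frequency": 1 / 6,
        "entropy": 1 / 6,
        "free_energy": 1 / 6,
        "heat_capacity": 1 / 6,
        "shear_modulus": 1 / 6,
        "bulk_modulus": 1 / 6,
    },
    "molecules": {
        "torsion_profile_energy": 0.25,
        "torsional_barrier_height": 0.25,
        "barrier_errors_over_1_kcal_mol": 0.25,
        "wiggle150_conformer_energy": 0.25,
    },
}


def domain_score(errors, weights):
    """Weighted sum of per-property errors within one domain."""
    if set(errors) != set(weights):
        raise ValueError("errors and weights must cover the same prediction types")
    return sum(weights[name] * errors[name] for name in weights)


def property_metric(errors_by_domain, weights_by_domain):
    """Unweighted mean of the domain scores (assumed form of the PC metric)."""
    return fmean(
        domain_score(errors_by_domain[domain], weights_by_domain[domain])
        for domain in errors_by_domain
    )
```

With every Inorganic Materials error set to 0.3 and every Molecules error set to 0.1, the two domain scores are 0.3 and 0.1, so `property_metric` returns their mean, 0.2 up to floating-point rounding.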
diff --git a/pyproject.toml b/pyproject.toml
index 32f655af..5c305340 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -33,7 +33,7 @@ classifiers = [
 ]
 
 [project.optional-dependencies]
-deepmd = ["deepmd-kit @ git+https://github.com/iProzd/deepmd-kit.git@P0131_dpa3_optim_angle#egg=deepmd-kit", "torch>=2.1"]
+deepmd = ["deepmd-kit==3.1.0", "torch>=2.1"]
 fairchem = ["fairchem-core", "torch_geometric", "pyg_lib", "torch_scatter", "torch_sparse", "torch_cluster", "torch_spline_conv"] # See README
 mace = ["mace-torch"]
 mattersim = ["mattersim >= 1.1.1"]