# MOFSimBench: Benchmarking Universal Machine Learning Interatomic Potentials for Metal-Organic Frameworks
This repository contains the code and data for the paper "MOFSimBench: Evaluating Universal Machine Learning Interatomic Potentials for Metal-Organic Framework Molecular Modeling". The project benchmarks the performance of various universal machine learning interatomic potentials (uMLIPs) for simulating metal-organic frameworks (MOFs) across a range of properties, including structural optimization, simulation stability, bulk modulus, and heat capacity.

The results of the runs completed for this paper, which can be used to reproduce the figures, are available at: https://dx.doi.org/10.6084/m9.figshare.30234010
- 🛠️ Installation
- ⚙️ Setting up your calculator
- 📁 Project Structure
- 🚀 Running the benchmark
- 📊 Analyzing the results
- 🤝 Contributing
## 🛠️ Installation

The recommended way to run the benchmark is via Conda environments and SLURM. The SLURM scripts expect environments to be named `mb_your-model`.
- Create a Conda environment:
```bash
conda create -n mb_your-model python=3.11  # or another Python version
conda activate mb_your-model
```
- Clone the repository:
```bash
git clone https://github.com/AI4ChemS/mof-umlip-benchmark
cd mof-umlip-benchmark
```
- Install core dependencies:
```bash
pip install .
```
- Install the DFTD3 package (if needed for D3 corrections):
```bash
pip install torch-dftd
```
- Make sure to install an ASE version that contains the `MTKNPT` driver: our NpT tests rely on this driver, which is not currently available in a PyPI release of ASE. You can install it from the ASE git repository:
```bash
pip install git+https://gitlab.com/ase/ase.git
```
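To verify that the driver is present after installation, a quick import check like the following can help. Note that the class name and import path below are assumptions based on the ASE development tree; adjust them to match your checkout:

```python
# Quick availability check for the MTKNPT driver. The class name and import
# path are assumptions based on the ASE development tree; adjust if your
# checkout organizes the NPT drivers differently.
try:
    from ase.md.nose_hoover_chain import IsotropicMTKNPT  # noqa: F401
    print("MTKNPT driver available")
except ImportError:
    print("MTKNPT driver missing: install ase from git (see above)")
```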
## ⚙️ Setting up your calculator

Set up your calculator in the `mof_benchmark/setup/calculator.yaml` and `mof_benchmark/setup/calculator.py` files.

Models typically require inference-time D3 dispersion corrections; make sure to enable them for your model in the YAML file. A typical entry could look like this:
```yaml
orb_v3:
  model_name: orb-v3-conservative-inf-omat
  with_d3: true
  model_kwargs:
    precision: float32-highest
```

Note: The calculator name is used to identify the Conda environment. A calculator named `your-model[_suffix]` maps to the environment `mb_your-model`; the optional suffix is ignored when identifying the environment.
For the example above, the model name is `orb` and the suffix is `_v3`, so the scripts expect the corresponding Conda environment to be named `mb_orb`.
Connect your model to the benchmark in the `calculator.py` file. Several architectures from the study are already implemented.
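For orientation, the general pattern looks roughly like the sketch below. The actual dispatch logic in `calculator.py` may differ; `get_calculator` and the MACE branch shown here are illustrative assumptions, not the repository's exact API:

```python
# Illustrative sketch only: the real calculator.py may use a different
# dispatch mechanism. The idea is to map an entry from calculator.yaml
# (name, model_name, model_kwargs) to an ASE Calculator instance.
def get_calculator(name: str, model_name: str, **model_kwargs):
    if name.startswith("mace"):
        # MACE ships an ASE interface in the mace-torch package.
        from mace.calculators import mace_mp
        return mace_mp(model=model_name, **model_kwargs)
    raise ValueError(f"No calculator wired up for '{name}'")
```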
To test that the model works, run:
```bash
python mof_benchmark/setup/test_calculator.py your-model
```
It should output energies, forces, and stresses, run a short optimization, and perform a quick speed test.
## 📁 Project Structure

The repository is organized as follows:

- `mof_benchmark/`: Contains the core Python package.
  - `analysis/`: Scripts and Streamlit pages for analyzing and visualizing results.
  - `experiments/`: Scripts and configurations for running tasks.
    - `scripts/`: Python scripts for the different experiments (optimization, stability, heat capacity, bulk modulus).
    - `structures/`: MOF structure definitions.
  - `setup/`: Configuration files for calculators (e.g., `calculator.yaml`, `calculator.py`).
## 🚀 Running the benchmark

The benchmark is optimized to run on distributed systems managed with SLURM and can be launched with a single command. On other systems, each task can also be run directly via the respective Python scripts.
Sample files are available for each task; they can be found under `mof_benchmark/experiments/scripts` in the respective task folders. The SLURM submission scripts are named `submit.sh`. Adapt them to the required settings on your HPC.
For stability MDs, each structure is submitted in a separate job due to the extended runtime. In this case, the submit script relies on SLURM arrays to distribute the jobs.
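For orientation, a minimal array-style submit script could look like the following sketch; the repository's actual `submit.sh` files will differ in paths and resource requests:

```bash
#!/bin/bash
#SBATCH --job-name=stability
#SBATCH --array=0-99            # one array task per MOF structure
#SBATCH --time=24:00:00         # placeholder; set to your HPC's limits
#SBATCH --gres=gpu:1            # placeholder resource request

# Activate the environment matching the calculator name (mb_<model>).
source activate mb_your-model   # or `conda activate`, depending on your setup

# Forward the SLURM array index as the structure index.
python stability.py --calculator your-model \
    --settings stability_prod_mtk.yaml \
    --index "${SLURM_ARRAY_TASK_ID}"
```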
With correctly configured `submit.sh` files in the `bulk_modulus`, `heat_capacity`, `optimization`, and `stability` directories, all jobs can easily be submitted via the `run_all.sh` script in `mof_benchmark/experiments/scripts`:
```bash
./run_all.sh your-calculator
```
All tasks can also be run from Python directly (the SLURM scripts simply call these scripts):
E.g., for the optimization task, run:
```bash
python optimization.py --calculator your-model --settings optimization.yaml
```
For the stability tasks, run:
```bash
python stability.py --calculator your-model --settings stability_prod_mtk.yaml --index 0
```
The index (0-99) specifies the structure to run.
As an additional test, we compare the energy predictions of uMLIPs to QMOF DFT references. To run this task, download the QMOF database and place the `qmof_database` folder in `mof_benchmark/experiments/structures`. Make sure to unzip `relaxed_structures.zip`.
To perform the task and its analysis, the GoldDAC `test.xyz` file must be placed in the `mof_benchmark/analysis/interaction_energy` directory. The structures must also be extracted into the `mof_benchmark/experiments/structures/golddac` directory; a Python notebook is provided to extract the structures from the `test.xyz` file.
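The provided notebook is authoritative, but a rough sketch of the extraction could look like this (assuming `test.xyz` is a multi-frame extended XYZ file; the output file naming here is a placeholder):

```python
# Rough sketch: split the multi-frame GoldDAC test.xyz into individual
# structure files. The notebook shipped with the repo is authoritative;
# the output format and frame naming below are placeholders.
from pathlib import Path
from ase.io import read, write

frames = read("mof_benchmark/analysis/interaction_energy/test.xyz", index=":")
out_dir = Path("mof_benchmark/experiments/structures/golddac")
out_dir.mkdir(parents=True, exist_ok=True)
for i, atoms in enumerate(frames):
    write(out_dir / f"golddac_{i:04d}.cif", atoms)
```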
## 📊 Analyzing the results

To compute the results from the experiments, run the analysis scripts in `mof_benchmark/analysis`. You can quickly run everything using:
```bash
./run_analysis.sh
```
Results can then be plotted using the `plots.ipynb` notebook in `mof_benchmark/analysis/plot`.
Additionally, a Streamlit app is available to explore the results in depth.
Run it using:
```bash
cd mof_benchmark/analysis
streamlit run Overview.py
```

## 🤝 Contributing

The benchmark can be extended with new tasks and models due to its modular design.
To create a new task, we refer contributors to the interaction energy task in `mof_benchmark/experiments/scripts/interaction_energy` for a simple example.
Task classes inherit from `TaskRunner`, which handles three aspects:
- Preparing calculator and structures based on the provided settings.
- Calling the task for each structure.
- Creating and cleaning up a temporary working directory, reducing filesystem load on distributed systems.
To perform a task, the `run_task` method has to be implemented; storing results needs to be handled in this method as well.
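As a purely illustrative sketch (the `TaskRunner` import path, hook signature, and result storage shown here are assumptions, not the repository's actual API; see the interaction energy task for the real pattern):

```python
# Illustrative only: import path, signature, and storage are assumptions.
import json

from mof_benchmark.experiments.scripts.task_runner import TaskRunner


class SinglePointTask(TaskRunner):
    """Toy task: record the potential energy of each prepared structure."""

    def run_task(self, atoms):
        # TaskRunner has already attached the calculator to `atoms`.
        energy = atoms.get_potential_energy()
        # run_task is also responsible for storing its own results.
        with open("single_point_results.json", "a") as f:
            json.dump({"name": atoms.info.get("name", "unknown"),
                       "energy_eV": energy}, f)
            f.write("\n")
```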
Structures can be defined using file paths or loaded from the structure shortcuts defined in `mof_benchmark/experiments/structures/structures.yaml`.
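For illustration, a shortcut entry could look like this (a hypothetical schema; check `structures.yaml` for the actual format):

```yaml
# Hypothetical example; the actual schema in structures.yaml may differ.
my_test_set:
  - structures/cifs/MOF-5.cif
  - structures/cifs/HKUST-1.cif
```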
For more technical details about the benchmark, please refer to our paper. If you use this benchmark in your research, please cite:
```bibtex
@article{krass2025mofsimbench,
  title={MOFSimBench: evaluating universal machine learning interatomic potentials in metal-organic framework molecular modeling},
  author={Kra{\ss}, Hendrik and Huang, Ju and Moosavi, Seyed Mohamad},
  journal={npj Computational Materials},
  year={2025},
  publisher={Nature Publishing Group}
}
```