ML for MOF Property Prediction

Getting Started

The code is written in Python 3.8. We recommend using Anaconda to build a virtual environment. Details of how to download Anaconda can be found here:

https://www.anaconda.com/

After cloning the repository, create a virtual environment by running the following commands in a terminal or the Anaconda Prompt.

conda create --name ML_MOFs --file requirements.txt
conda activate ML_MOFs

The kaleido package is required to save graphs. Install this using pip.

pip install -U kaleido
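
Once the environment is active, a quick check can confirm that the interpreter and the kaleido package match what is described above. The following is a minimal sketch rather than part of the repository: the function name check_environment is hypothetical, and it only verifies the Python version and that the listed packages can be found.

```python
import importlib.util
import sys


def check_environment(required_packages=("kaleido",)):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper for illustration, not a script shipped with the repository.
    """
    problems = []
    # The README states that the code is written in Python 3.8.
    if sys.version_info[:2] != (3, 8):
        problems.append(
            "Python %d.%d detected; the code was written for 3.8" % sys.version_info[:2]
        )
    for name in required_packages:
        # find_spec returns None when a top-level package cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append("Package '%s' is not installed" % name)
    return problems
```

Calling check_environment() inside the ML_MOFs environment and printing the returned list should show nothing if the setup steps succeeded.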

Datasets

All the data used in this study can be found in ML_MOFs/Data/

MOF_data.csv - Target and descriptor values for the dataset used for initial 10-fold cross validation
MOF_data_test.csv - Target and descriptor values for the unseen test set
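
To inspect either file before running the scripts, the column names and rows can be read with the standard library. This is a sketch under assumptions: the helper load_table is hypothetical, and it assumes only that the files are comma-separated with a header row; the actual column names are not listed in this README.

```python
import csv


def load_table(path):
    """Read a CSV file and return (column_names, rows); rows is a list of dicts.

    Hypothetical helper; all values are returned as strings, as csv does not
    convert types.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        return list(reader.fieldnames or []), rows


# Example usage (paths taken from the list above):
# columns, rows = load_table("ML_MOFs/Data/MOF_data.csv")
# print(len(rows), "rows;", len(columns), "columns")
```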

Dataset Analysis

To analyse the dataset, run ML_MOFs/Analysis/data_analysis.py.

Basic statistics and pairwise descriptor correlations are saved to ML_MOFs/Results/Analysis_results/.

Histograms of target and descriptor ranges are saved to ML_MOFs/Graphs/Analysis_graphs/.
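
For readers unfamiliar with pairwise descriptor comparison, the sketch below computes a correlation coefficient for every pair of descriptor columns. It assumes the Pearson coefficient; this README does not state which measure data_analysis.py uses, and the function names and the descriptor names in the usage note are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlations(columns):
    """Map each unordered pair of column names to its Pearson coefficient.

    `columns` is a dict of descriptor name -> list of floats.
    """
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(sorted(columns), 2)
    }
```

For example, pairwise_correlations({"density": [...], "void_fraction": [...]}) returns a dictionary with the single key ("density", "void_fraction").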

Machine Learning

For 10-fold cross validation on the full dataset, run ML_MOFs/ML/ML_main.py. Predictions are saved to ML_MOFs/Results/ML_results/Classification and ML_MOFs/Results/ML_results/Regression.

Further analysis of the models produced can be generated by running ML_MOFs/ML/classification_analysis.py and ML_MOFs/ML/regression_analysis.py.
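
As background on the 10-fold cross-validation mentioned above, the sketch below shows how a dataset can be partitioned so that every sample appears in exactly one test fold. It is a generic illustration, not the splitting code in ML_main.py: the function name, the shuffling, and the seed are all assumptions.

```python
import random


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) pairs for shuffled k-fold cross-validation.

    Hypothetical helper; fold sizes differ by at most one sample.
    """
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    indices = list(range(n_samples))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    # Striding assigns every index to exactly one fold.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Each model is then trained on the train indices and evaluated on the held-out test indices, so that across the ten iterations every sample receives exactly one out-of-fold prediction.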

Additional Figures

Additional figures are generated by running ML_MOFs/figures.py and are saved in ML_MOFs/Graphs/Figures.

Running the model for your own training/test sets

Run ML_MOFs/ML/test_ML.py, changing lines 84 and 85 to the locations of your training and test sets respectively.

Note: you will need to calculate the descriptors as detailed in our publication prior to machine learning, using our datasets as a template for column names.
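
Because the column names must match the provided datasets, it can help to compare the header of your own file against one of the template files before running test_ML.py. The following is a sketch only: both helper names are hypothetical, and it assumes the files are comma-separated with column names in the first row.

```python
import csv


def read_header(path):
    """Return the first row of a CSV file as a list of column names."""
    with open(path, newline="") as handle:
        # next() with a default avoids StopIteration on an empty file.
        return next(csv.reader(handle), [])


def compare_columns(template_path, candidate_path):
    """Return (missing, extra) column names of the candidate relative to the template."""
    template = read_header(template_path)
    candidate = read_header(candidate_path)
    missing = [name for name in template if name not in candidate]
    extra = [name for name in candidate if name not in template]
    return missing, extra


# Example usage (the second path is a placeholder for your own file):
# missing, extra = compare_columns("ML_MOFs/Data/MOF_data.csv", "my_training_set.csv")
```

Two empty lists indicate that the candidate file uses the same column names as the template, although the check does not compare column order or validate the values themselves.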

RASPA Input Files

Sample RASPA input files can be found in ML_MOFs/RASPA_Input_Files/

Data Curation Protocols

These can be found in ML_MOFs/Curation/. This location also contains CIF files of the structures that passed curation.
