inorg-synth-graph

Inorganic Reaction Representation Learning and Product Prediction.

Implementation of "Predicting the Outcomes of Material Syntheses with Deep Learning" [ArXiv].

Dependencies

See the requirements.txt file.

The raw dataset used for this work can be downloaded using the following commands (Linux):

mkdir -p data/datasets
wget -cO - https://ndownloader.figshare.com/files/17412674  > data/datasets/solid-state_dataset_2019-06-27.json

More recent versions of the dataset are released by the original authors here
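
After downloading, a quick sanity check can confirm that the file is complete and parses as JSON. The following is a minimal sketch: the helper name `summarize_json` is hypothetical, and it makes no assumption about the dataset's internal schema, reporting only the top-level type and size.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and report its top-level type and size.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)

    summary = {
        "type": type(data).__name__,
        # Only containers have a meaningful length.
        "length": len(data) if isinstance(data, (list, dict)) else None,
    }
    if isinstance(data, dict):
        # Show up to ten top-level keys to get a feel for the layout.
        summary["keys"] = sorted(data)[:10]
    return summary


if __name__ == "__main__":
    # Path taken from the download command above.
    dataset = Path("data/datasets/solid-state_dataset_2019-06-27.json")
    if dataset.exists():
        print(summarize_json(dataset))
    else:
        print(f"{dataset} not found; run the download command first.")
```

A `json.JSONDecodeError` here usually points to an interrupted download, in which case re-running the wget command is the first thing to try.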

The element embeddings used in this work are found here: Unsupervised word embeddings capture latent knowledge from materials science literature

Preprocess

preprocess.py is used to generate the dataframes and supporting files from the raw data. The number of elements and precursors can be adjusted using optional arguments.

Using the default seed (0) gives the dataset splits used in the paper.
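
To illustrate why fixing the seed makes a split reproducible, here is a minimal sketch of a seeded train/test split. It is not the logic in preprocess.py: the function name `seeded_split` and the 20% test fraction are assumptions chosen for the example.

```python
import random


def seeded_split(items, test_fraction=0.2, seed=0):
    """Split items into (train, test) deterministically for a given seed.

    Illustrative only; the test fraction is an assumed value, not the
    one used by preprocess.py.
    """
    rng = random.Random(seed)  # private generator, leaves global state alone
    indices = list(range(len(items)))
    rng.shuffle(indices)

    n_test = int(round(len(items) * test_fraction))
    test_idx = set(indices[:n_test])

    # Preserve the original ordering within each subset.
    train = [x for i, x in enumerate(items) if i not in test_idx]
    test = [x for i, x in enumerate(items) if i in test_idx]
    return train, test


if __name__ == "__main__":
    reactions = [f"reaction_{i}" for i in range(10)]
    first = seeded_split(reactions, seed=0)
    second = seeded_split(reactions, seed=0)
    print(first == second)  # same seed, same split
```

The same seed always yields the same partition, which is what allows results to be compared against those reported in the paper.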

Training and Testing

train_action_rnn.py is used for training the action sequence autoencoder.

train_reaction_graph.py is used for training the reaction graph model without action sequences.

train_reaction_graph_with_actions.py is used for training the reaction graph model with action sequences.

train_baseline.py is used for training the baseline magpie model.

train_stoich.py is used for training the stoichiometry prediction model.

Model dimensions and hyperparameters can be set using argparse flags; running a script with --help should list them.

Example Usage

Preprocessing:

python preprocess.py --dataset data/datasets/solid-state_dataset_2019-06-27.json \
    --max-prec 10 --min-prec 2 \
    --ps _10_precs --seed 0
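
Judging by the file names used in the training commands in this README, the `--ps` suffix appears to be appended to the names of the files that preprocessing writes (for example `data/train_10_precs.pkl`). The sketch below builds those paths from a suffix and reports any that are missing before training starts. The naming pattern is inferred from the example commands rather than from the preprocess.py source, and the helper names are hypothetical.

```python
from pathlib import Path

# File-name patterns inferred from the example commands in this README
# (an assumption, not read from preprocess.py).
ARTIFACT_TEMPLATES = {
    "train": "train{suffix}.pkl",
    "test": "test{suffix}.pkl",
    "action_dict": "action_dict{suffix}.json",
    "elem_dict": "elem_dict{suffix}.json",
    "magpie_embed": "magpie_embed{suffix}.json",
}


def artifact_paths(suffix, data_dir="data"):
    """Return the expected preprocessing outputs for a given --ps suffix."""
    base = Path(data_dir)
    return {
        name: base / template.format(suffix=suffix)
        for name, template in ARTIFACT_TEMPLATES.items()
    }


def missing_artifacts(suffix, data_dir="data"):
    """List the expected files that do not exist yet."""
    paths = artifact_paths(suffix, data_dir)
    return [p for p in paths.values() if not p.exists()]


if __name__ == "__main__":
    for path in missing_artifacts("_10_precs"):
        print(f"missing: {path}")
```

Checking for the files up front gives a clearer error than a failed training run when preprocessing was skipped or run with a different suffix.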

Training Action Autoencoder:

python train_action_rnn.py --train-path data/train_10_precs.pkl \
    --test-path data/test_10_precs.pkl \
    --action-path data/action_dict_10_precs.json

Note: use the --split-prec-amts flag to split out the data so that it can be used with the baseline model.

Training product element prediction model (with actions):

python train_reaction_graph_with_actions.py --train-path data/train_10_precs.pkl \
    --test-path data/test_10_precs.pkl \
    --fea-path data/magpie_embed_10_precs.json \
    --action-path data/action_dict_10_precs.json \
    --elem-path data/elem_dict_10_precs.json \
    --action-rnn models/checkpoint_rnn_f-0_s-0_t-1.pth.tar \
    --train-rnn --mask --amounts \
    --ensemble 5

Getting reaction embeddings for the full dataset (for training the stoichiometry prediction model):

python train_reaction_graph_with_actions.py --train-path data/train_10_precs.pkl \
    --test-path data/test_10_precs.pkl \
    --fea-path data/magpie_embed_10_precs.json \
    --action-path data/action_dict_10_precs.json \
    --elem-path data/elem_dict_10_precs.json \
    --action-rnn models/checkpoint_rnn_f-0_s-0_t-1.pth.tar \
    --train-rnn --mask --amounts \
    --ensemble 5 \
    --get-reaction-emb

Training the stoichiometry prediction model:

python train_stoich.py --train-path data/train_f-1_emb_reaction_graph_actions.pkl \
    --test-path data/test_f-1_emb_reaction_graph_actions.pkl \
    --elem-path data/elem_dict_10_precs.json \
    --elem-fea-path data/embeddings/matscholar-embedding.json \
    --use-correct-targets \
    --ensemble 5

For end-to-end testing, first use the --evaluate flag on the trained product prediction model to obtain the element predictions, then use the --evaluate flag on the trained stoichiometry prediction model, removing the --use-correct-targets flag shown in the example above.

Cite

Please cite the following if you have found our work helpful:

@article{doi:10.1021/acs.chemmater.0c03885,
author = {Malik, Shreshth A. and Goodall, Rhys E. A. and Lee, Alpha A.},
title = {Predicting the Outcomes of Material Syntheses with Deep Learning},
journal = {Chemistry of Materials},
volume = {33},
number = {2},
pages = {616-624},
year = {2021},
doi = {10.1021/acs.chemmater.0c03885},
URL = {https://doi.org/10.1021/acs.chemmater.0c03885},
eprint = {https://doi.org/10.1021/acs.chemmater.0c03885}
}

Disclaimer

This is research code shared without support or guarantee of quality. However, please let me know if anything is wrong or could be improved, and I will try to fix it.
