This repository has been archived by the owner on Apr 27, 2023. It is now read-only.

Add dropout on predict #8

Merged: 2 commits into materialsvirtuallab:master on Jun 20, 2019

Conversation

@WardLT (Contributor) commented Jun 19, 2019

Adds an option to keep dropout active during predictions.

Useful as a method for assessing model uncertainty (https://arxiv.org/abs/1506.02142)

@chc273 (Contributor) commented Jun 20, 2019

Nice addition! @WardLT

@chc273 merged commit 7e1ea00 into materialsvirtuallab:master on Jun 20, 2019
@sgbaird commented Jan 31, 2022

@WardLT how does one use this to get UQ for model predictions?

@WardLT deleted the add_dropout branch on January 31, 2022, 23:16
@WardLT (Contributor, Author) commented Jan 31, 2022

The idea is to keep performing dropout during inference by setting dropout_on_predict=True, which means a different set of weights is used each time and the predictions will differ from call to call. The paper linked above suggests that this distribution of predictions can be used to measure model confidence (e.g., the standard deviation can serve as a proxy for the model's confidence, and percentiles of the prediction distribution can be used to set confidence intervals).
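
For the statistics side, here's a minimal sketch using plain NumPy; the `predictions` array is a stand-in for the outputs of running one entry through the model many times with dropout left on (the simulated values below are just placeholders):

```python
import numpy as np

# Stand-in for repeated predictions of a single entry with dropout left on;
# in practice these values come from calling the model many times on the same input.
rng = np.random.default_rng(0)
predictions = rng.normal(loc=-3.2, scale=0.05, size=100)

point_estimate = predictions.mean()    # single prediction (average)
uncertainty = predictions.std()        # proxy for model confidence
lower, upper = np.percentile(predictions, [2.5, 97.5])  # ~95% confidence interval

print(f"prediction: {point_estimate:.3f} +/- {uncertainty:.3f}")
print(f"95% interval: [{lower:.3f}, {upper:.3f}]")
```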

Just realized now I forgot to add information about this to the documentation. Sorry about that.

@sgbaird commented Feb 1, 2022

Gotcha, so train multiple times and take the standard deviation of the multiple predictions, for example, if I'm understanding correctly. Thanks!

@WardLT (Contributor, Author) commented Feb 1, 2022 via email

@sgbaird commented Feb 1, 2022

@WardLT, ah, OK. Maybe what I'm still missing is how to go from the changes in this PR to standard deviations on the model predictions.

@WardLT (Contributor, Author) commented Feb 1, 2022

Sure, I'll step through the procedure in more detail to see what I'm mistakenly skipping over (pardon me if I'm covering things you already know); a minimal code sketch follows the list.

  1. Create a MEGNet model with dropout layers and dropout_on_predict=True, then train it. Nothing special here beyond setting the option. Keras trains the model using dropout, which means a different subset of weights is zeroed out on each batch.
  2. Perform multiple predictions on a single entry using the trained model. Here's the special part. Normally, Keras turns off the dropout layers during prediction so that all of the weights in the network are used. In our case, because we forced dropout_on_predict=True, Keras continues to perform dropout and zeroes a different subset of weights each time we make a prediction.
    1. Important note: dropout_on_predict=True means Keras uses the same dropout mask for every entry within a batch. So putting identical entries in the same batch will produce identical predictions (because Keras zeroes the same weights for the whole batch).
  3. Compute the mean and standard deviation of the predictions. Each inference on a single entry yields a subtly different prediction, and there is evidence that these variations are a good measure of the model's uncertainty. A common approach is to report the mean of the predictions as the single prediction and their standard deviation as a single confidence metric.
    1. If you're a friend of Bayesian methods, the variation of the predictions is thought to represent a posterior distribution of model outputs conditioned on the training data. In non-Bayesian speak, this means you can compute many other statistics from the distribution (e.g., the fraction of predictions that place a compound on the convex hull is the probability the material will be on the hull).
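
Here's a minimal sketch of those steps, assuming the dropout and dropout_on_predict options added in this PR and the usual MEGNetModel.train / predict_structure interface; structures, targets, and new_structure are hypothetical stand-ins, and any other constructor arguments (graph converter, featurizer settings) depend on your MEGNet version:

```python
import numpy as np
from megnet.models import MEGNetModel

# 1. Build a model with dropout layers that stay active at predict time.
#    (Remaining architecture/featurizer arguments depend on your MEGNet version.)
model = MEGNetModel(
    nfeat_edge=100,
    nfeat_global=2,
    dropout=0.5,              # include dropout layers
    dropout_on_predict=True,  # keep dropout on during inference (this PR)
)

# Train as usual; dropout behaves normally during training.
# `structures` (list of pymatgen structures) and `targets` are hypothetical data.
model.train(structures, targets, epochs=100)

# 2. Predict the same entry many times; with dropout left on, each call
#    zeroes out a different random subset of weights.
samples = np.array([model.predict_structure(new_structure).ravel()[0]
                    for _ in range(50)])

# 3. Summarize the distribution of predictions.
print("mean prediction:", samples.mean())
print("uncertainty (std):", samples.std())
```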

Does this clear things up?

@sgbaird commented Feb 1, 2022

Perfect! I think that cleared things up. Fit once, predict multiple times, and treat each prediction as a sample from the posterior distribution. Thank you!

@sgbaird commented Feb 2, 2022

@WardLT I've been looking into this a bit more and wonder if you could shed some additional light. This approach seems to work well when the distribution of the new test data is similar to the training data, but it gives overconfident uncertainties when the distributions are distinct (the latter being the usual case in materials discovery campaigns). In other words, there seems to be some agreement in the literature (see below) that bootstrapped ensembles and Monte Carlo dropout methods are often overconfident for out-of-domain predictions. What are your thoughts on the strengths and weaknesses of the dropout approach in a materials science context? Does the dropout uncertainty that you described fall into this category, or is there something distinct?

A key failing of ensemble metrics is that with sufficient model damping (e.g., by L2 regularization), variance over models can approach zero for compounds very distant from training data, leading to over-confidence in model predictions.
Another approach to obtain model-derived variances in dropout-regularized neural networks is Monte Carlo dropout (mc-dropout) (Fig. 1). In mc-dropout, a single trained model is run repeatedly with varied dropout masks, randomly eliminating nodes from the model (ESI Text S1). The variance over these predictions provides an effective credible interval with the modest cost of running the model multiple times rather than the added cost of model re-training. In transition metal complex discovery, we found that dropout-generated credible intervals provided a good estimate of errors on a set aside test partition but were over-confident when applied to more diverse transition metal complexes. Consistent with the ensemble and mc-dropout estimates, uncertainty in ANNs can be interpreted by taking a Bayesian view of weight uncertainty where a prior is assumed over the distribution of weights of the ANN and then updated upon observing data, giving a distribution over possible models. However, if the distribution of the new test data is distinct from training data, as is expected in chemical discovery, this viewpoint on model uncertainty may be incomplete.

Janet, J. P.; Duan, C.; Yang, T.; Nandy, A.; Kulik, H. J. A Quantitative Uncertainty Metric Controls Error in Neural Network-Driven Chemical Discovery. Chem. Sci. 2019, 10 (34), 7913–7922. https://doi.org/10.1039/C9SC02298H.

and

For example, when the OCHEM data set served as the test set, the RMSE remained nearly constant despite successive removal of predictions ranked lowest by STD but decreased with removal of predictions ranked lowest by SDC (Figure 1B). That is, STD failed to identify predictions with large errors, whereas SDC successfully identified and removed these predictions. This contrast was more pronounced in the RMSEs of the Bradley test set: whereas successive removal of predictions ranked lowest by SDC considerably reduced the RMSEs of the remaining predictions, removal of predictions ranked lowest by STD gradually increased the RMSE of the remaining predictions. In other words, only SDC was successful in removing predictions with large errors.

Liu, R.; Wallqvist, A. Molecular Similarity-Based Domain Applicability Metric Efficiently Identifies Out-of-Domain Compounds. J. Chem. Inf. Model. 2019, 59 (1), 181–189. https://doi.org/10.1021/acs.jcim.8b00597.

Note: sum of distance-weighted contributions == SDC, standard deviation == STD

Btw, I think this PR is a good contribution. Just getting curious in the context of https://github.com/ml-evs/modnet-matbench/issues/18 and a few other projects and want to get out of my echo chamber.

Also relevant to MEGNet uncertainty quantification is the unlockNN repository mentioned in #338. Thanks also for the patience with me opening discussion back up on a PR from several years ago.
