Deep learning results are not reproducible #15514
Unanswered
hasithjp asked this question in Technical Notes
Replies: 0 comments
Motivation
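H2O's Deep Learning uses a technique called HOGWILD!, a lock-free form of parallel stochastic gradient descent in which multiple threads update the shared model weights without synchronization. This greatly increases the speed of training, but because the order of the updates varies from run to run, the results are not reproducible by default.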
Solution
To obtain reproducible results, you must set reproducible = TRUE and seed = 1 (the value 1 is only an example; any seed works as long as you use the same one each time). Forcing reproducibility slows down training because it only works on a single thread. By default, H2O clusters are started with as many threads as there are cores (e.g. 8 is typical on a laptop). The R example below demonstrates how to produce reproducible deep learning models:
Now we will fit two models and show that the training AUC is the same both times (i.e., reproducible).
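The original code for this note did not survive, so the following is a minimal sketch rather than the author's example. It assumes the h2o R package is installed and a local cluster can be started. The synthetic data frame, the column names x1, x2 and y, the helper fit_dl, and the hidden and epochs settings are all illustrative assumptions; only reproducible = TRUE and a fixed seed come from the note itself.

```r
library(h2o)

# Start a local cluster on all available cores (the default behaviour
# described above). Per the note, reproducible mode still trains on one thread.
h2o.init(nthreads = -1)

# Hypothetical synthetic binary-classification data, for illustration only.
set.seed(42)
n <- 1000
df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
df$y <- factor(ifelse(df$x1 + df$x2 + rnorm(n, sd = 0.5) > 0, "yes", "no"))
train <- as.h2o(df)

# Hypothetical helper: fits one model with reproducibility forced on.
fit_dl <- function(frame) {
  h2o.deeplearning(
    x = c("x1", "x2"),
    y = "y",
    training_frame = frame,
    hidden = c(10, 10),   # small network, chosen only to keep the run short
    epochs = 5,
    reproducible = TRUE,  # single-threaded, deterministic training
    seed = 1              # any value works if it is the same on every run
  )
}

# Fit the same model twice and compare the training AUC.
m1 <- fit_dl(train)
m2 <- fit_dl(train)
auc1 <- h2o.auc(m1, train = TRUE)
auc2 <- h2o.auc(m2, train = TRUE)
print(c(auc1, auc2))
print(isTRUE(all.equal(auc1, auc2)))
```

If the settings behave as the note describes, the two AUC values should match. Setting reproducible = FALSE in fit_dl and rerunning is a quick way to see the run-to-run variation on a multi-core cluster.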
JIRA Issue Migration Info
Jira Issue: TN-3
Assignee: Erin Ledel
Reporter: Erin Ledel
State: Resolved