Results not reproducible when running AllenNLPExecutor multiple times with transformers
#2716
Labels
bug
Expected behavior
If we run AllenNLPExecutor multiple times in a single process, we should get exactly the same results as when we run AllenNLPExecutor once in separate processes. However, with transformer models we get different results in the 2nd trial and afterwards.

I found this is caused by the allennlp.common.cached_transformers module, which constructs the model only in the first trial (which consumes some random numbers) and reuses the cached model in later trials (which consumes no random numbers), leading to inconsistent results between a single run and multiple runs.

I have reported the same issue in himkt/allennlp-optuna#45, and we think it is better to fix it here.
Environment
Error messages, stack traces, or logs
Steps to reproduce
1. Run test.py --lrs 3e-5 5e-5 (see below) with 2 trials. Results for trial_1 (lr=5e-5) are listed below.
2. Run test.py --lrs 5e-5, with only one trial (lr=5e-5). Results are listed below; note that the accuracy and loss differ from those in the first run.
Reproducible examples (optional)
test.py
config.jsonnet
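A possible workaround sketch, assuming the installed allennlp version keeps its transformer caches in the private module-level dicts _model_cache and _tokenizer_cache (these are internals and may differ across versions): clearing them at the start of each trial forces every trial to construct the model from scratch, so each trial consumes random numbers the same way.

```python
from allennlp.common import cached_transformers

def clear_transformer_caches():
    # Assumption: these private dict caches exist in the installed
    # allennlp version; clearing them makes the next trial rebuild
    # (and re-initialize) the transformer instead of reusing it.
    cached_transformers._model_cache.clear()
    cached_transformers._tokenizer_cache.clear()
```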
Additional context (optional)