Improving reproducibility by additional settings in set_random_seeds
#333
Conversation
Codecov Report
@@ Coverage Diff @@
## master #333 +/- ##
==========================================
+ Coverage 81.75% 81.78% +0.03%
==========================================
Files 51 51
Lines 3485 3491 +6
==========================================
+ Hits 2849 2855 +6
Misses 636 636
Force-pushed from 2e1e5f4 to bcffb04
I am quite against this. I think by default I wouldn't want the slowdowns coming from deterministic mode. set_random_seeds atm should give you results that are very similar, but not exactly the same, which in my view is enough for many cases of scientific reproducibility. I would be fine with adding a deterministic flag, but with default False... Then one could set this to True if one wants exact reproducibility (keeping in mind that it may also only be exactly reproducible on the same machine).
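For reference, the strict-determinism variant being debated here would look roughly like the configuration below. This is a sketch following the PyTorch reproducibility notes, not braindecode's actual implementation, and `set_fully_deterministic` is a hypothetical name:

```python
import os
import random

import numpy as np
import torch


def set_fully_deterministic(seed):
    # Hypothetical helper: seed all RNGs, then trade speed for determinism.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)
    # The two settings discussed in this PR; both can slow computations down.
    torch.backends.cudnn.benchmark = False
    torch.use_deterministic_algorithms(True)
    # Required by some CUDA >= 10.2 operations in deterministic mode,
    # per the PyTorch randomness notes.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
```

The speed cost comes from cuDNN no longer auto-tuning kernels (benchmark=False) and from PyTorch refusing or replacing nondeterministic ops (use_deterministic_algorithms), which is why defaulting this to off is a reasonable position.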
I think that we do not see those differences in our CI because we don't use CUDA, and maybe our datasets are not big enough? (I am not sure regarding the dataset size.) I can show you what happens with reproducibility when setting random seeds without
One more thing, regarding
@robintibor what do you think about this behavior?
I feel like cudnn.benchmark is something where, if you are setting it, you should know why you set it and that it will result in some nondeterminism. So maybe what we could do is, inside set_random_seeds, if cudnn benchmark is set to True, give out a warning that results may not be so well reproducible. The function could also have a flag to suppress that warning. Could be like an argument
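The warning idea could look something like this. A minimal sketch with the standard library: `warn_if_benchmark` is a hypothetical helper name, and in practice set_random_seeds would read torch.backends.cudnn.benchmark itself instead of taking a bool:

```python
import warnings


def warn_if_benchmark(benchmark_enabled, suppress_warning=False):
    # Hypothetical helper: warn when cuDNN benchmark mode undermines
    # exact reproducibility, unless the caller opts out.
    if benchmark_enabled and not suppress_warning:
        warnings.warn(
            "torch.backends.cudnn.benchmark is enabled; results may not be "
            "exactly reproducible. Set it to False (slower) for stricter "
            "reproducibility, or pass suppress_warning=True to silence this.",
            UserWarning,
        )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_if_benchmark(True)                         # warns
    warn_if_benchmark(True, suppress_warning=True)  # silent
    warn_if_benchmark(False)                        # silent

print(len(caught))  # 1
```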
Great, maybe add a small note to whats_new?
@robintibor thanks for taking a look at this. It's a good idea to have this warning, and I think it's enough to warn users about the lack of reproducibility that may happen in some cases and give them the solution. Sure, I'll add a line in whats_new; just waiting for the doc to render, I want to check that everything is OK.
Force-pushed from a3b05b0 to af48988
@robintibor I think it's ready, let me know if we need something more here :)
One more thing @robintibor: should we remove the line about reproducing results in all the examples? It may be misleading if we set (see braindecode/examples/plot_sleep_staging.py, lines 222 to 225 at fa6042b)
We could enhance it to:

# Set random seed to be able to roughly reproduce results
# Note that with cudnn benchmark set to True, GPU indeterminism
# may still make results substantially different between runs
set_random_seeds(seed=random_state, cuda=cuda)
Force-pushed from af48988 to b1d9bdd
@robintibor done :)
Great that we have a version that works for both of us! Thanks for the work!
sorry to be a bit late to the party
@sliwy do you think it's relevant? if so can you open a new PR to fix this?
🙏
Last days I faced issues with reproducibility when using set_random_seeds.

First, set_random_seeds does not ensure reproducibility because we are not setting use_deterministic_algorithms() and we are not setting torch.backends.cudnn.benchmark = False. Those settings can slow down computations, which is a side effect of reproducibility in PyTorch (see https://pytorch.org/docs/stable/notes/randomness.html). Still, I feel all people who set random seeds want reproducibility, and not just the same weight initialization and batch sampling order.

However, some operations may not be possible with deterministic behavior and CUDA >= 10.2, so I guess we might want to provide a parameter controlling whether to set it to be deterministic or not (defaulting to deterministic), what do you think? For example, it fails in our tests because we use set_random_seeds for tests_acceptance, so use_deterministic_algorithms affects all tests.

I also added a note about the PYTHONHASHSEED setting, which in some cases may be needed to ensure reproducibility. I spent a long time looking for a solution to this problem, so I think it may be good to keep it as a note in the docstring. I had to set PYTHONHASHSEED before running scripts, which is different from the usually suggested os.environ['PYTHONHASHSEED'].

This caused me a lot of trouble last week, because I believed that set_random_seeds would make my code reproducible 😂
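The PYTHONHASHSEED caveat above can be demonstrated with the standard library alone: Python salts str hashes per interpreter process unless the variable is fixed in the environment before the interpreter starts, which is why setting it from inside an already-running script is too late. The string 'braindecode' below is an arbitrary example:

```python
import os
import subprocess
import sys

# hash() of str objects is randomized per process unless PYTHONHASHSEED
# is set *before* the interpreter starts.
cmd = [sys.executable, "-c", "print(hash('braindecode'))"]


def run(env_extra):
    # Launch a fresh interpreter with extra environment variables applied.
    env = {**os.environ, **env_extra}
    return subprocess.run(
        cmd, capture_output=True, text=True, env=env
    ).stdout.strip()


fixed_a = run({"PYTHONHASHSEED": "0"})
fixed_b = run({"PYTHONHASHSEED": "0"})
print(fixed_a == fixed_b)  # True: same seed, same hash across processes
```

Assigning os.environ['PYTHONHASHSEED'] inside a script only affects child processes spawned afterwards, never the current interpreter's already-initialized hash salt, which matches the behavior described above.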