
freeze torch rng for scoring functions; fix tests for skorch 0.8 #155

Merged (2 commits, Aug 19, 2020)

Conversation

robintibor (Contributor):

No description provided.

robintibor (Contributor, Author):

So, I found:

  • Calling __iter__() on a PyTorch DataLoader changes PyTorch's random generator state (even with shuffle=False etc.).
  • Therefore, scoring functions that create a DataLoader can influence later training, e.g. by influencing which data ends up in which training batch.
  • Removing the 'accuracy' metric scoring actually made results (loss) consistent between skorch 0.7 and 0.8.
  • Likewise, just removing the creation of the EpochScoring for the valid set already made results (loss) consistent.
  • Print statements revealed that the training batches change when this valid EpochScoring is added.
  • This strongly suggests that something in the random generator state changes; it is still not 100% clear what differs between skorch 0.7 and 0.8.

Still, overall I feel it is likely enough that the changes in metrics are due to random generator calls, so I ensured that our scoring classes restore the previous random generator state on exit and updated the numbers.
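A minimal sketch of that save/restore pattern, illustrated with Python's stdlib `random` module so it runs standalone; the actual torch-side fix would use `torch.random.get_rng_state()` / `torch.random.set_rng_state()` instead, and the name `preserve_rng_state` here is illustrative, not the code in this PR:

```python
import random
from contextlib import contextmanager

@contextmanager
def preserve_rng_state():
    """Snapshot the RNG state on entry and restore it on exit, so whatever
    runs inside the block cannot influence later random draws."""
    state = random.getstate()      # torch analogue: torch.random.get_rng_state()
    try:
        yield
    finally:
        random.setstate(state)     # torch analogue: torch.random.set_rng_state(state)

# A "scoring pass" that consumes randomness (like DataLoader.__iter__())
# no longer shifts the draws that follow it:
random.seed(0)
with preserve_rng_state():
    _ = [random.random() for _ in range(100)]
after_scoring = random.random()

random.seed(0)
after_nothing = random.random()

assert after_scoring == after_nothing
```

With this wrapper around the scoring code, adding or removing a metric leaves the training-time RNG sequence untouched, which is exactly the consistency property the bullet points above were testing for.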

agramfort (Collaborator) left a comment:

If you update the supported version of dependencies, it should be reported in the readme.txt that specifies the dependencies. See https://github.com/scikit-learn/scikit-learn/#dependencies for an example.

# y pred should be same as self.y_preds_
# robintibor@gmail.com: Unclear if this also leads to any
Collaborator:

You have git blame for this. I would not clutter the code with email addresses.

robintibor (Contributor, Author):

Ok, fair enough; removed.

1.065806
1.784507,
1.421611,
1.057717
Collaborator:

Why do you still need to change this? Do we test on CI using skorch 0.7?

'valid_loss': 1.0293445587158203,
'valid_loss_best': True,
'valid_trial_accuracy': 0.25,
'valid_trial_accuracy_best': False}
Collaborator:

You basically updated all the tests, @robintibor, so we cannot easily assess that your PR is a bug fix.

robintibor (Contributor, Author):

The reason is that I changed the behavior to enforce that adding metrics will not change the random state. Before, the results would actually change depending on whether you had an extra metric added or none, because this would affect the random state during training.
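The effect described here can be shown with a toy stand-in for the DataLoader side effect (stdlib `random` in place of the torch RNG; `fake_scoring_pass` is a hypothetical name, not code from this PR):

```python
import random

def fake_scoring_pass():
    # Stand-in for creating and iterating a DataLoader in a scoring
    # function: it consumes RNG draws as a side effect, just as
    # DataLoader.__iter__() advances torch's random generator state.
    for _ in range(10):
        random.random()

# Without RNG protection, the extra metric shifts subsequent "training" draws:
random.seed(42)
fake_scoring_pass()
batch_with_metric = random.random()

random.seed(42)
batch_without_metric = random.random()

assert batch_with_metric != batch_without_metric  # results silently diverge
```

Under the new behavior, the scoring pass runs with the RNG state saved on entry and restored on exit, so both runs would draw identical training batches and the old test values no longer apply.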
