ENH: raise an error when MLP diverges#29773
Conversation
51f9830 to
5a37950
Compare
| self.validation_scores_.append(self._score(X_val, y_val)) | ||
| try: | ||
| val_score = self._score(X_val, y_val) | ||
| except ValueError: |
There was a problem hiding this comment.
Since we could catch many ValueError, here I assume that we have something in the exception that tell us that this is a np.nan or something like this, isn't it?
There was a problem hiding this comment.
good idea, i'll match on that
| try: | ||
| val_score = self._score(X_val, y_val) | ||
| except ValueError: | ||
| val_score = np.inf |
There was a problem hiding this comment.
We will need to add a non-regression test to check that the code works as expected.
|
In addition, we will need an entry in the changelog: Please add an entry to the change log at |
69d14e0 to
4c2bfc3
Compare
|
ok, I set the random state to enable reproducible results :) it should be good now, the test passes on my laptop |
|
I see that we will not be able to accomplished this PR for 1.5.2. Otherwise everything looks good (I slightly change the entry for the changelog). |
glemaitre
left a comment
There was a problem hiding this comment.
Thanks @MarcBresson
Let's have a second review.
|
@adrinjalali do you want to have a look after debugging in the PR. |
| try: | ||
| val_score = self._score(X_val, y_val) | ||
| except ValueError as e: | ||
| if str(e) == "Input contains NaN.": | ||
| val_score = np.inf | ||
| else: | ||
| raise e |
There was a problem hiding this comment.
This is quite brittle. It depends on the internals and the actual text raised by accuracy_score and r2_score (which are both not ideal lol)
I rather have a check here explicitly for nan, and if we have nan, do the logic, instead of relying on the scorer to raise that exception.
There was a problem hiding this comment.
X_val and y_val don't have any NaN. But because model weights are really large numbers, the computation of the score generates NaN values, which, when validation runs, raises the error.
The exception is raised deep in the code (check #29504 for the full traceback) so that's why I ended up just adding the try: except: statement.
I can change the logic behind the _score function to account for that
There was a problem hiding this comment.
ok, I did manage to do it better :) thank you for pushing me
| Non-regression test for: | ||
| https://github.com/scikit-learn/scikit-learn/issues/29504 | ||
| """ | ||
| mlp = MLPRegressor( |
There was a problem hiding this comment.
We need to test both the regressor and the classifier.
There was a problem hiding this comment.
Because a classifier predicts 0s and 1s, the accuracy score will never have NaNs
There was a problem hiding this comment.
I'm confused cause the code seems to touch both classifier and regressor, but according to your comment here classifier is fine as is?
| :mod:`sklearn.neural_network` | ||
| ............................. | ||
|
|
||
| - |Fix| :class:`neural_network.MLPClassifier` and :class:`neural_network.MLPRegressor` |
There was a problem hiding this comment.
So here this doesn't apply to MLPClassifier?
There was a problem hiding this comment.
Ups, I'm the culprit here. I think that I change the changelog from the base class to the public classes. So I added MLPClassifier but then it might never have been an issue.
| """Private score method without input validation""" | ||
| # Input validation would remove feature names, so we disable it | ||
| return accuracy_score(y, self._predict(X, check_input=False)) | ||
| return super()._score_with_function(X, y, score_function=accuracy_score) |
There was a problem hiding this comment.
so does this apply to Classifier or not?
There was a problem hiding this comment.
it does not apply to classifier, but I did a small refactor so that
# Input validation would remove feature names, so we disable it
return accuracy_score(y, self._predict(X, check_input=False))is only written once :)
| Non-regression test for: | ||
| https://github.com/scikit-learn/scikit-learn/issues/29504 | ||
| """ | ||
| mlp = MLPRegressor( |
There was a problem hiding this comment.
I'm confused cause the code seems to touch both classifier and regressor, but according to your comment here classifier is fine as is?
|
@adrinjalali sorry for all the confusion, as much as I love atomic commit, aa7e7a8 was both about implementing your suggestion, and doing a small refactor so that |
…29810) Co-authored-by: Lock file bot <noreply@github.com>
Co-authored-by: Mr. Snrub <45150804+s-banach@users.noreply.github.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Shruti Nath <shrutinath@Shrutis-Laptop.local>
…istant when the model was strictly diverging with early_stopping
|
@MarcBresson we don't care about commits inside PRs much at all. Better not to force push so that we can see the history. At the end we squash and merge anyway. For the conflicts, you might want to use the version of the lockfiles in main. |
|
noted! The last change was just about changing np.NaN to np.nan :) |
…loat("nan") != float("nan")`
… possible and return nan in these cases
Reference Issues/PRs
Fixes #29504
What does this implement/fix? Explain your changes.
When MLP weights overflowed with
early_stopping=True, the scorer function crashed because it could not compute a validation score. Now, when it overflows, the validation score is replaced by inf, and the model follows its course.Any other comments?
Because it diverges, early_stopping will not actually stop the model before the number of epochs reached
n_max_epochs. One condition for early stopping to trigger is that the new score must be lower or equal to the best score. Since the model diverges, the new score is always greater than the best score (which generally happens on the first epoch)