[MRG] 32-bit support for MLP #17759
Conversation
Hi @d3b0unce,
@harishB97, thank you for pointing it out. That call was indeed redundant. Passing the tuple directly to
Thanks @d3b0unce! I think there is 1 occurrence missing:
activations.append(np.empty((X.shape[0],
Also, could you please add a check to test that coef, intercept (and predictions, only in the case of MLPRegressor) are of dtype float32 when X is float32?
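For context, the occurrence pointed out above concerns the pre-allocated activation buffers; here is a standalone sketch of a dtype-aware allocation (layer sizes, variable names, and the surrounding loop are assumed for illustration, not copied from the scikit-learn source):

```python
import numpy as np

# Hypothetical illustration: np.empty defaults to float64, so the buffers
# for the hidden/output activations need an explicit dtype to stay float32.
X = np.ones((10, 4), dtype=np.float32)
layer_units = [4, 5, 3]  # assumed layer sizes for this sketch

activations = [X]
for n_fan_out in layer_units[1:]:
    activations.append(np.empty((X.shape[0], n_fan_out), dtype=X.dtype))

assert all(a.dtype == np.float32 for a in activations)
```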
-                                multi_output=True)
+                                multi_output=True,
+                                dtype=(np.float64, np.float32))
+        self._dtype = X.dtype
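The dtype tuple passed to input validation is what keeps float32 inputs intact: with scikit-learn's `check_array` semantics, an input whose dtype is already in the accepted list is left unchanged, while anything else is converted to the first entry (float64). A standalone illustration using `check_array` directly (not the MLP code path itself):

```python
import numpy as np
from sklearn.utils import check_array

X32 = np.ones((3, 2), dtype=np.float32)
X_int = np.ones((3, 2), dtype=np.int64)

# float32 is in the accepted tuple, so it is preserved ...
assert check_array(X32, dtype=(np.float64, np.float32)).dtype == np.float32
# ... while other dtypes are converted to the first entry, float64.
assert check_array(X_int, dtype=(np.float64, np.float32)).dtype == np.float64
```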
So right now `_validate_input` is indeed only called on fit, but generally I think it would be safer to make it not store anything on self, in case someone uses it for predict in the future. Using `X.dtype` later should be enough?
That makes sense. I had assigned it to self as I needed dtype in `_init_coef`, where X is not in scope. Now I am passing dtype as a parameter to `_init_coef`.
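A simplified sketch of that change, passing the dtype down explicitly instead of reading it from self (the helper name, initialization bound, and signature are illustrative only, not the actual `_init_coef` implementation):

```python
import numpy as np

def init_coef(fan_in, fan_out, dtype, rng):
    # Glorot-style uniform initialization, cast to the requested dtype so no
    # float64 intermediates leak into the network parameters.
    bound = np.sqrt(6.0 / (fan_in + fan_out))
    coef = rng.uniform(-bound, bound, (fan_in, fan_out)).astype(dtype, copy=False)
    intercept = rng.uniform(-bound, bound, fan_out).astype(dtype, copy=False)
    return coef, intercept

coef, intercept = init_coef(4, 3, np.float32, np.random.RandomState(0))
assert coef.dtype == np.float32 and intercept.dtype == np.float32
```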
Thank you @rth for the suggestions! I added tests to check the dtypes of parameters.
Thank you for the PR @d3b0unce !
@@ -985,7 +993,7 @@ def _validate_input(self, X, y, incremental):
                     " `self.classes_` has %s. 'y' has %s." %
                     (self.classes_, classes))

-        y = self._label_binarizer.transform(y)
+        y = self._label_binarizer.transform(y).astype(np.bool)
What was the reason behind this change?
I added this because the call to `_label_binarizer.transform` returns int64, which was upcasting other network parameters to float64 during `_backprop`.
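A quick standalone NumPy illustration of that upcasting behaviour (not the scikit-learn code itself):

```python
import numpy as np

y_int = np.array([[0, 1], [1, 0]], dtype=np.int64)  # what transform() returns
y_bool = y_int.astype(bool)                         # after the astype cast
act = np.ones((2, 2), dtype=np.float32)             # a float32 activation

print((act - y_int).dtype)   # float64: int64 forces an upcast
print((act - y_bool).dtype)  # float32: bool preserves the working dtype
```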
Let's include this explanation as a comment in the code. (This downcast to bool is to prevent upcasting when working with float32 data)
    assert_allclose(pred_64, pred_32, rtol=1e-04)


def test_mlp_param_dtypes():
This can be parametrized:

@pytest.mark.parametrize('dtype', [np.float32, np.float64])
def test_mlp_param_dtypes(dtype):
    # Checks input dtype is used for attributes and prediction
    X, y = X_digits[:300].astype(dtype), y_digits[:300]
    mlp = MLPRegressor(alpha=1e-5,
                       hidden_layer_sizes=(5, 3),
                       random_state=1, max_iter=50)
    mlp.fit(X, y)
    pred = mlp.predict(X)
    assert pred.dtype == dtype
    ...
Done!
                           hidden_layer_sizes=(5, 3),
                           random_state=1, max_iter=100)
    mlp_64.fit(X_digits[:300], y_digits[:300])
    pred_64 = mlp_64.predict(X_digits[300:])
We should check for `predict_proba` as well?
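A minimal sketch of what such a check could look like (random data and hyperparameters are assumed; the assertion is only expected to hold with this PR's changes in place):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.RandomState(0)
X = rng.rand(60, 4).astype(np.float32)
y = rng.randint(0, 3, size=60)

clf = MLPClassifier(hidden_layer_sizes=(5,), max_iter=50, random_state=0)
clf.fit(X, y)

# Probabilities should come back in the input dtype as well.
assert clf.predict_proba(X).dtype == np.float32
```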
Yes, added a test
Thank you @thomasjpfan for the suggestions! I've committed the changes.
One last comment, otherwise LGTM, thanks:
    # Checks if input dtype is used for network parameters
    # and predictions
    X, y = X_digits.astype(dtype), y_digits
    mlp = MLPRegressor(alpha=1e-5,
Could you also parametrize this test function with Estimator in [MLPClassifier, MLPRegressor]? Fitting on y should work in both cases since it is int, and then only do the last assert in the case of MLPRegressor (a sketch follows below).
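A sketch of the doubly parametrized test; the digits data is loaded directly here so the snippet is self-contained, whereas the real test module reuses its own X_digits / y_digits fixtures:

```python
import numpy as np
import pytest
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier, MLPRegressor

X_digits, y_digits = load_digits(return_X_y=True)

@pytest.mark.parametrize('dtype', [np.float32, np.float64])
@pytest.mark.parametrize('Estimator', [MLPClassifier, MLPRegressor])
def test_mlp_param_dtypes(Estimator, dtype):
    # Checks that the input dtype is used for network parameters and predictions
    X, y = X_digits[:300].astype(dtype), y_digits[:300]
    mlp = Estimator(alpha=1e-5, hidden_layer_sizes=(5, 3),
                    random_state=1, max_iter=50)
    mlp.fit(X, y)
    pred = mlp.predict(X_digits[300:].astype(dtype))

    assert all(c.dtype == dtype for c in mlp.coefs_)
    assert all(i.dtype == dtype for i in mlp.intercepts_)
    if Estimator == MLPRegressor:
        # The classifier predicts class labels, so the prediction dtype
        # check only applies to the regressor.
        assert pred.dtype == dtype
```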
Done!
Also please add an entry to the change log at
Thanks @d3b0unce !
One small comment, otherwise LGTM!
Thank you @d3b0unce !
Let's include this explanation as a comment in the code. (This downcast to bool is to prevent upcasting when working with float32 data)
Thanks @thomasjpfan! Added the comment.
Reference Issues/PRs
Fixes #17700
What does this implement/fix? Explain your changes.
Includes float32 support for MLPClassifier and MLPRegressor.
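For reference, a small usage sketch of the behaviour this enables (hyperparameters are arbitrary; the dtype results in the comments are what the change is expected to produce):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X = X.astype(np.float32)  # keep the whole model in 32-bit

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=100, random_state=0)
clf.fit(X, y)

print({c.dtype for c in clf.coefs_})   # {dtype('float32')}
print(clf.predict_proba(X[:5]).dtype)  # float32
```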