[MRG] FIX allow non-finite target values in TransformedTargetRegressor #11349
CI failing. Let me know if you need help with it.
sklearn/compose/_target.py
Outdated
@@ -162,7 +162,7 @@ def fit(self, X, y, sample_weight=None):
         -------
         self : object
         """
-        y = check_array(y, accept_sparse=False, force_all_finite=True,
+        y = check_array(y, accept_sparse=False, force_all_finite='allow-nan',
I'd consider going further to switch off all finiteness validation. Are there cases where it would be risky to pass an inf through unchecked?
I guess it would be safe to leave both the inf and NaN checks to the actual transformer. I'll give it a go; let's see if it passes the tests.
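For readers following the thread, the three settings under discussion (True, 'allow-nan', False) can be sketched as below. This is a minimal standard-library illustration, not scikit-learn's implementation: `check_finite` and `accepted` are hypothetical helpers invented here, and only the parameter name `force_all_finite` and its three values come from the diff.

```python
import math


def check_finite(values, force_all_finite=True):
    """Hypothetical stand-in for the finiteness part of input validation.

    force_all_finite=True        -> reject NaN and inf
    force_all_finite='allow-nan' -> reject inf only
    force_all_finite=False       -> no finiteness check at all
    """
    if force_all_finite is False:
        return list(values)
    for v in values:
        if math.isinf(v):
            raise ValueError("input contains infinity")
        if math.isnan(v) and force_all_finite != "allow-nan":
            raise ValueError("input contains NaN")
    return list(values)


def accepted(values, mode):
    """Return True if `values` passes the check under the given mode."""
    try:
        check_finite(values, force_all_finite=mode)
        return True
    except ValueError:
        return False


with_nan = [1.0, float("nan"), 3.0]
with_inf = [1.0, float("inf"), 3.0]

results = {
    ("nan", True): accepted(with_nan, True),
    ("nan", "allow-nan"): accepted(with_nan, "allow-nan"),
    ("nan", False): accepted(with_nan, False),
    ("inf", True): accepted(with_inf, True),
    ("inf", "allow-nan"): accepted(with_inf, "allow-nan"),
    ("inf", False): accepted(with_inf, False),
}
```

Under these assumptions, only the False setting lets both NaN and inf through, which is what leaving the checks to the transformer amounts to.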
)

estimator.fit(X, y)
estimator.predict(X)
You should check that your output contains NaN where it should.
estimator.predict(X) is actually unnecessary. The test only needs to assert that estimator.fit(X, y) doesn't raise. In this case, predict is expected to return no NaN. Should I make that explicit or remove predict?
sklearn/compose/tests/test_target.py
Outdated
X, y = datasets.load_linnerud(return_X_y=True)

# put some NaN in y
remove this comment
done
It's worth checking predict also. Ensuring the output does not contain NaN doesn't hurt, but I don't think it is necessary.
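The shape of such a test can be sketched with toy stand-ins. Everything in this block is an assumption made for illustration: `MeanImputer`, `MeanRegressor` and `TargetWrapperSketch` are hypothetical classes, not scikit-learn code and not the test added in this PR. The point is only that fit must not raise when y contains NaN and the transformer can cope with it, and that predict then returns no NaN.

```python
import math


class MeanImputer:
    """Hypothetical transformer that tolerates NaN by replacing it with the mean."""

    def fit(self, y):
        finite = [v for v in y if not math.isnan(v)]
        self.mean_ = sum(finite) / len(finite)
        return self

    def transform(self, y):
        return [self.mean_ if math.isnan(v) else v for v in y]

    def inverse_transform(self, y):
        return list(y)


class MeanRegressor:
    """Toy regressor that always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class TargetWrapperSketch:
    """Toy wrapper that does no finiteness check on y and defers to the transformer."""

    def __init__(self, regressor, transformer):
        self.regressor = regressor
        self.transformer = transformer

    def fit(self, X, y):
        y_trans = self.transformer.fit(y).transform(y)
        self.regressor.fit(X, y_trans)
        return self

    def predict(self, X):
        return self.transformer.inverse_transform(self.regressor.predict(X))


X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, float("nan"), 3.0, 5.0]

model = TargetWrapperSketch(MeanRegressor(), MeanImputer())
model.fit(X, y)  # must not raise although y contains NaN
predictions = model.predict(X)
assert not any(math.isnan(p) for p in predictions)
```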
Travis is reporting flake8 errors.
This LGTM, thanks!
I'd be interested if you have other comments/critiques of TransformedTargetRegressor design before we release it.
@jnothman Thanks for the help! This will resolve my issue for now. My next step is to evaluate it with
@glemaitre please review changes?
@vahidbas
@@ -162,7 +162,7 @@ def fit(self, X, y, sample_weight=None):
         -------
         self : object
         """
-        y = check_array(y, accept_sparse=False, force_all_finite=True,
+        y = check_array(y, accept_sparse=False, force_all_finite=False,
Shouldn't it be 'allow-nan' instead?
@glemaitre all checks are turned off as suggested by @jnothman here
OK fine with me then.
I am also wondering if we should avoid the
scikit-learn/sklearn/compose/_target.py, lines 132 to 134 in af842d3
As mentioned by @shreyasramachandran, if you pass a
@jnothman Which behaviour do you think is the best by default?
I'm now a bit confused about how this is working with that validate=True...?
check_array(X, accept_sparse=self.accept_sparse), so it does not let NaN pass.
So if validate=True doesn't pass the NaN, why does it help for us to change force_all_finite here?
It's because we're only handling the case here where a transformer, rather than a function, is provided. Yes, I think we should handle both cases.
So this currently works for the case of passing in a transformer, but not when passing in func and inverse_func?
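The difference between the two paths can be illustrated with a toy function-based transformer. This is a hedged sketch: `FuncTransformerSketch` and `transform_ok` are hypothetical names, and the `validate` flag only mimics the behaviour described in this thread (a validation step that rejects NaN), not scikit-learn's actual code.

```python
import math


class FuncTransformerSketch:
    """Hypothetical function-based transformer with an optional validation step."""

    def __init__(self, func, validate=True):
        self.func = func
        self.validate = validate

    def transform(self, y):
        # With validate=True, non-finite values are rejected before func is called.
        if self.validate and any(math.isnan(v) or math.isinf(v) for v in y):
            raise ValueError("input contains NaN or infinity")
        return [self.func(v) for v in y]


y = [1.0, float("nan"), 4.0]


def transform_ok(validate):
    """Return True if transforming y succeeds with the given validate flag."""
    try:
        FuncTransformerSketch(math.sqrt, validate=validate).transform(y)
        return True
    except ValueError:
        return False


strict_ok = transform_ok(True)    # validation rejects the NaN
lenient_ok = transform_ok(False)  # the NaN is handed to func unchanged
```

So relaxing the check in the wrapper's fit is not enough on its own for the func and inverse_func case: as long as the function-based transformer validates its input, the NaN is still rejected one level down.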
Reference Issues/PRs
Fixes #11339
Simply changed force_all_finite to False.
Update: turn off all finiteness checks.
What does this implement/fix? Explain your changes.
Allow target values to have missing values in TransformedTargetRegressor.