Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] FIX allow non-finite target values in TransformedTargetRegressor #11349

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

vahidbas
Copy link

@vahidbas vahidbas commented Jun 24, 2018

Reference Issues/PRs

Fixes #11339
simply changed force_all_finite to 'False'

Update:
turn off all finiteness checks

What does this implement/fix? Explain your changes.

Allow target values to have missing values in TransformedTargetRegressor

@jnothman
Copy link
Member

CI failing. Let me know if you need help with it

@@ -162,7 +162,7 @@ def fit(self, X, y, sample_weight=None):
-------
self : object
"""
y = check_array(y, accept_sparse=False, force_all_finite=True,
y = check_array(y, accept_sparse=False, force_all_finite='allow-nan',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider going further to switch off all finiteness validation. Are there cases where it would be risky to pass an inf through unchecked?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it would be safe to leave both INF and NAN checks to the actual transformer. I'll give it a go, let's check if it passes the tests.

)

estimator.fit(X, y)
estimator.predict(X)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should check that your output contains NaN where it should.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

estimator.predict(X) is actually unnecessary. The test only to assert that estimator.fit(X, y) doesn't raise. In this case, predict is expected to return no NaN. should I make it explicit or remove predict?


X, y = datasets.load_linnerud(return_X_y=True)

# put some NaN in y
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this comment

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@jnothman
Copy link
Member

It's worth checking predict also. Ensuring the output does not contain NaN doesn't hurt, but I don't think it is necessary

@jnothman
Copy link
Member

jnothman commented Jun 25, 2018 via email

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, thanks!

I'd be interested if you have other comments/critiques of TransformedTargetRegressor design before we release it.

@glemaitre glemaitre changed the title [MRG] FIX allow NaNs for the target values in TransformedTargetRegressor #11339 [MRG] FIX allow NaNs for the target values in TransformedTargetRegressor Jun 26, 2018
@glemaitre glemaitre changed the title [MRG] FIX allow NaNs for the target values in TransformedTargetRegressor [MRG] FIX allow non-finite target values in TransformedTargetRegressor Jun 26, 2018
@vahidbas
Copy link
Author

@jnothman Thanks for help! This will resolve my issue for now. My next step is to evaluate it with model_selection.* tools for joint hyperparameter tuning, I'll post any new finding.

@jnothman
Copy link
Member

@glemaitre please review changes?

@jnothman
Copy link
Member

@vahidbas
Please add an entry to the change log at doc/whats_new/v0.21.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:

@@ -162,7 +162,7 @@ def fit(self, X, y, sample_weight=None):
-------
self : object
"""
y = check_array(y, accept_sparse=False, force_all_finite=True,
y = check_array(y, accept_sparse=False, force_all_finite=False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not be 'allow-nan' instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glemaitre all checks are turned off as suggested by @jnothman here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK fine with me then.

@glemaitre
Copy link
Member

I am also wondering if we should avoid the validate=True there:

self.transformer_ = FunctionTransformer(
func=self.func, inverse_func=self.inverse_func, validate=True,
check_inverse=self.check_inverse)

As mentioned by @shreyasramachandran, if you pass a func and inverse_func, you will create such a transformer which will not accept the NaN by default.

@jnothman Which behaviour do you think is the best by default?

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now a bit confused about how this is working with that validate=True...?

@glemaitre
Copy link
Member

I'm now a bit confused about how this is working with that validate=True...?

validate=True performs

check_array(X, accept_sparse=self.accept_sparse)

So it does not let pass the Nan

@jnothman
Copy link
Member

jnothman commented Feb 11, 2019 via email

@jnothman
Copy link
Member

It's because we're only handling the case here where a transformer, rather than a function, is provided. Yes, I think we should handle both cases

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this currently works for the case of passing in a transformer, but not when passing in func and inverse_func?

Base automatically changed from master to main January 22, 2021 10:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow NaNs for the target values in TransformedTargetRegressor
5 participants