[MRG+2] TransformedTargetRegressor #9041

Merged (83 commits) on Dec 13, 2017. Changes shown from 25 commits.

Commits (83):
21a2ff3 implement target transformer (amueller, Jun 5, 2017)
7b1f7e8 make example use log and ex (amueller, Jun 5, 2017)
b306fa5 some docstrings (amueller, Jun 5, 2017)
7d9badf EHN/TST advance TTR (glemaitre, Jun 6, 2017)
97da7a3 FIX call fit of the transformer at validation time (glemaitre, Jun 6, 2017)
61a543a EHN/TST ravel y when needed (glemaitre, Jun 7, 2017)
de8dbb4 FIX address comment Andy (glemaitre, Jun 7, 2017)
254fac2 EHN add LinearRegression (glemaitre, Jun 7, 2017)
53c7c81 Merge remote-tracking branch 'origin/master' into targettransformer (glemaitre, Jun 7, 2017)
693de84 EHN move to target file (glemaitre, Jun 7, 2017)
3dafc8f FIX/EXA fix example in the docstring (glemaitre, Jun 7, 2017)
27f1c43 Merge remote-tracking branch 'origin/master' into targettransformer (glemaitre, Jun 8, 2017)
73bbcaf ENH address comments (glemaitre, Jun 8, 2017)
e6a4e7d Merge remote-tracking branch 'origin/master' into targettransformer (glemaitre, Jun 8, 2017)
503a985 DOC narrative doc for ttr (glemaitre, Jun 8, 2017)
63dbe9a DOC update whats new and docstring (glemaitre, Jun 9, 2017)
9feafda Update whats new (glemaitre, Jun 9, 2017)
32a85a6 Remove useless changes (glemaitre, Jun 9, 2017)
af51cf8 Update whats new (glemaitre, Jun 9, 2017)
dcae366 address comments (glemaitre, Jun 9, 2017)
d8310ad Merge branch 'targettransformer' of github.com:glemaitre/scikit-learn… (glemaitre, Jun 9, 2017)
4c3ab11 DOC change to bostong dataset (glemaitre, Jun 9, 2017)
49ea3c4 Remove score (glemaitre, Jun 9, 2017)
f1a7289 add the estimator to multioutput (glemaitre, Jun 9, 2017)
ffe6892 Rename the class (glemaitre, Jun 9, 2017)
2a868ee gael comments (glemaitre, Jun 9, 2017)
18c66c6 revert example (glemaitre, Jun 9, 2017)
85a8865 FIX docstring and commont test (glemaitre, Jun 10, 2017)
7a10796 FIX solve issue circular import (glemaitre, Jun 10, 2017)
086fba0 FIX circular import (glemaitre, Jun 10, 2017)
44ea999 Merge remote-tracking branch 'origin/master' into targettransformer (glemaitre, Jun 10, 2017)
6c4734e DOC/FIX/TST test 1d/2d y support transformer and vlad comments (glemaitre, Jun 10, 2017)
3ecde9f TST apply change of manoj (glemaitre, Jun 14, 2017)
5e7d6c9 TST apply change of manoj (glemaitre, Jun 14, 2017)
db4bf57 TST factorize single- multi-output test (glemaitre, Jun 14, 2017)
0fe1622 FIX ensure at least 1d array (glemaitre, Jun 14, 2017)
8b94056 TST add test for support of sample weight (glemaitre, Jun 14, 2017)
01d94e2 EXA add example (glemaitre, Jun 14, 2017)
a0b84c4 DOC fix (glemaitre, Jun 14, 2017)
437dfaa DOC revert author (glemaitre, Jun 14, 2017)
36968ba FIX minor fix in example and doc (glemaitre, Jun 14, 2017)
d253fcd DOC fixes (glemaitre, Jun 15, 2017)
451dfd3 FIX remove useless import (glemaitre, Jun 15, 2017)
19a6f94 Remove absolute tolerance (glemaitre, Jul 26, 2017)
9e07197 Merge branch 'master' into targettransformer (glemaitre, Jul 26, 2017)
0ddfee0 TST single to multi and regressor checking (glemaitre, Aug 1, 2017)
8392cc5 pass sample_weight directly (glemaitre, Aug 2, 2017)
a0bf0b0 PEP8 (glemaitre, Aug 2, 2017)
18bcec0 use is_regressor instead of tag (glemaitre, Aug 4, 2017)
f3e151f Merge branch 'master' into targettransformer (glemaitre, Aug 17, 2017)
85cc14c TST split tests (glemaitre, Aug 18, 2017)
075bf92 Merge remote-tracking branch 'glemaitre/targettransformer' into targe… (glemaitre, Aug 18, 2017)
51583c2 TST fix multi to single test (glemaitre, Aug 21, 2017)
9853552 solve the issue if the function are not 2d (glemaitre, Aug 21, 2017)
a1998fa DOC update docstring (glemaitre, Aug 21, 2017)
7c8c0ca DOC fix docstring (glemaitre, Aug 22, 2017)
97330b0 Merge remote-tracking branch 'origin/master' into targettransformer (glemaitre, Sep 3, 2017)
7f13b9a TST check compatibility 1D 2D fuction even if supposidely not supported (glemaitre, Sep 3, 2017)
ae973f8 TST relax equality in prediction (glemaitre, Sep 3, 2017)
129373d TST remove single to multi case (glemaitre, Sep 3, 2017)
9064f24 Address olivier and johel comments (glemaitre, Sep 4, 2017)
3d80728 not enforcing regressor (glemaitre, Sep 4, 2017)
5f9db73 Renamed to TransformedTargetRegressor (glemaitre, Sep 4, 2017)
58c5506 DOC reformat plot titles (ogrisel, Sep 5, 2017)
d0f83fa change naming functions (glemaitre, Sep 7, 2017)
4e61395 DOC fixing title (glemaitre, Sep 7, 2017)
687703b Merge remote-tracking branch 'origin/master' into targettransformer (glemaitre, Oct 25, 2017)
500a77c DOC fix merge git mess (glemaitre, Oct 25, 2017)
35cb75d TST/EHN only squeeze when ndim == 1 (glemaitre, Oct 30, 2017)
9a939f3 TST forgot to call fit (glemaitre, Oct 30, 2017)
00e6d78 Merge remote-tracking branch 'origin/master' into targettransformer (glemaitre, Oct 31, 2017)
04dc4a7 FIX pass check_inverse to FunctionTransformer (glemaitre, Oct 31, 2017)
f757c10 DOC remove blank lines (glemaitre, Nov 6, 2017)
214fde6 Add comments and lift constraint upon X (glemaitre, Nov 10, 2017)
68c5b7e avoid type conversion since this is in check_array (glemaitre, Nov 10, 2017)
9976ace Merge branch 'master' into targettransformer (glemaitre, Nov 28, 2017)
3c99cde TST check that y is always converted to array before transformer call (glemaitre, Nov 28, 2017)
0b364f6 reverse right of plot_ols (glemaitre, Nov 28, 2017)
64f5d52 address joels comments (glemaitre, Nov 29, 2017)
5929f81 Merge branch 'master' into targettransformer (glemaitre, Dec 13, 2017)
d637038 MAINT rename module name (glemaitre, Dec 13, 2017)
790c86a DOC fix indent (glemaitre, Dec 13, 2017)
bbee2be FIX add the new module (glemaitre, Dec 13, 2017)
1 change: 1 addition & 0 deletions doc/modules/classes.rst
@@ -1200,6 +1200,7 @@ See the :ref:`metrics` section of the user guide for further details.
preprocessing.PolynomialFeatures
preprocessing.RobustScaler
preprocessing.StandardScaler
preprocessing.TransformTargetRegressor

.. autosummary::
:toctree: generated/
67 changes: 66 additions & 1 deletion doc/modules/preprocessing_targets.rst
@@ -1,4 +1,3 @@

.. currentmodule:: sklearn.preprocessing

.. _preprocessing_targets:
@@ -7,6 +6,72 @@
Transforming the prediction target (``y``)
==========================================

Transforming target in regression
---------------------------------

:class:`TransformTargetRegressor` transforms the target before fitting a
regression model and inverting back the prediction to the original space. It
Review comment (vene, Jun 10, 2017): suggestion:

    transforms the target y before fitting a regression model. The predictions are mapped back to the original space via an inverse transform. It takes...

takes as an argument the regressor that will be used for prediction, and the
transformer that will be applied to the target variable::

>>> import numpy as np
>>> from sklearn.datasets import load_boston
>>> from sklearn import preprocessing
>>> from sklearn.linear_model import LinearRegression
>>> from sklearn.model_selection import train_test_split
>>> boston = load_boston()
>>> X = boston.data
>>> y = boston.target
>>> transformer = preprocessing.StandardScaler()
>>> regressor = LinearRegression()
>>> regr = preprocessing.TransformTargetRegressor(regressor=regressor,
... transformer=transformer)
Review comment (Member): indentation

>>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
>>> regr.fit(X_train, y_train) # doctest: +ELLIPSIS
TransformTargetRegressor(...)
>>> print('R2 score:', regr.score(X_test, y_test)) # doctest : +ELLIPSIS
R2 score: 0.63...

Review comment (Member): Maybe as a comparison you could add the score of the linear model on the original target variable. I get the following:

    >>> raw_target_regr = LinearRegression().fit(X_train, y_train)
    >>> print('R2 score: {0:.2f}'.format(raw_target_regr.score(X_test, y_test)))
    R2 score: 0.63

The transformer can also be replaced by a function and an inverse function. We
Review comment (Member): Rephrase:

    "For simple transformations, instead of a Transformer object, a pair of functions can be passed, defining the transformation and its inverse mapping::"

because it's not clear to me what it means "the transformer can be replaced". Plus, it can't always: not when it's stateful.

can define the following two functions::

>>> from __future__ import division
>>> def func(x):
... return np.log(x)
>>> def inverse_func(x):
... return np.exp(x)

Subsequently, the object is created as::

>>> regr = preprocessing.TransformTargetRegressor(regressor=regressor,
... func=func,
Review comment (Member): indentation

... inverse_func=inverse_func)
>>> regr.fit(X_train, y_train) # doctest: +ELLIPSIS
TransformTargetRegressor(...)
>>> print('R2 score:', regr.score(X_test, y_test)) # doctest: +ELLIPSIS
R2 score: 0.64...

By default, the provided function are checked at each fit to be the inverse of
Review comment (Member): *functions

each other. However, it is possible to bypass this checking by setting
``check_inverse`` to ``False``::

>>> def inverse_func(x):
... return x
>>> regr = preprocessing.TransformTargetRegressor(regressor=regressor,
... func=func,
... inverse_func=inverse_func,
... check_inverse=False)
Review comment (Member): Unrelated, but I wonder if we should have this check_inverse argument in FunctionTransformer in the future as well. If we do, then the default behaviour is different.

Reply (Member, author): I would not be against it.

Reply (Member): I like the idea of moving check_inverse to FunctionTransformer.

>>> regr.fit(X_train, y_train) # doctest: +ELLIPSIS
TransformTargetRegressor(...)
>>> print('R2 score:', regr.score(X_test, y_test)) # doctest: +ELLIPSIS
R2 score: -4.50...
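Picking up the review thread above about moving check_inverse into
``FunctionTransformer``: a minimal sketch of how that could look on the user
side (treating ``check_inverse`` as a ``FunctionTransformer`` argument is an
assumption at this point in the review; the flag is only wired through later
in the PR)::

    >>> from sklearn.preprocessing import FunctionTransformer
    >>> ft = FunctionTransformer(func=np.log, inverse_func=np.exp,
    ...                          check_inverse=True)  # assumed flag
    >>> ft = ft.fit(np.abs(X_train) + 1)  # fit would verify the round-trip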

.. note::

The transformation can be triggered by setting either ``transformer`` or the
functions ``func`` and ``inverse_func``. However, setting both options
Review comment (Member): the pair of functions --- clearer to me

will raise an error.
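
To make the note concrete, a minimal illustration (not part of the original
documentation; the error message is quoted from the implementation shown
further down)::

    >>> regr = preprocessing.TransformTargetRegressor(
    ...     transformer=preprocessing.StandardScaler(),
    ...     func=func, inverse_func=inverse_func)
    >>> regr.fit(X_train, y_train)
    Traceback (most recent call last):
        ...
    ValueError: Both 'transformer' and functions 'func'/'inverse_func' cannot be set at the same time.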

Label binarization
------------------

9 changes: 9 additions & 0 deletions doc/whats_new.rst
@@ -31,6 +31,15 @@ Changelog
New features
............

- Added the :class:`sklearn.preprocessing.TransformTargetRegressor` wraps
a regressor and applies a transformation to the target before fitting,
finally transforming the regressor's predictions back to the original
Review comment (Member): possibly we can reuse the exact phrasing from my comment 5 min ago

space. :issue:`9041` by `Andreas Müller`_ and
Review comment (Member): Guillaume is missing. EDIT: actually the entry is duplicated but in slightly different version

- Added the :class:`sklearn.preprocessing.TransformedTargetRegressor` which
is a meta-estimator to regress on a modified ``y`` for example, to perform
regression in log-space. :issue:`9041` by `Andreas Müller`_ and
:user:`Guillaume Lemaitre <glemaitre>`.

- Validation that input data contains no NaN or inf can now be suppressed
using :func:`config_context`, at your own risk. This will save on runtime,
and may be particularly useful for prediction time. :issue:`7548` by
3 changes: 3 additions & 0 deletions sklearn/preprocessing/__init__.py
@@ -28,6 +28,8 @@
from .label import LabelEncoder
from .label import MultiLabelBinarizer

from .target import TransformTargetRegressor

Review comment (Member): unhelpful whitespace

from .imputation import Imputer


@@ -45,6 +47,7 @@
'OneHotEncoder',
'RobustScaler',
'StandardScaler',
'TransformTargetRegressor',
'add_dummy_feature',
'PolynomialFeatures',
'binarize',
203 changes: 203 additions & 0 deletions sklearn/preprocessing/target.py
@@ -0,0 +1,203 @@
# Authors: Andreas Mueller < andreas.mueller@columbia.edu>
Review comment (Member): extra space before andreas

Reply (Member): I do like me some extra space

Reply (Member): here's some for you
# Guillaume Lemaitre <guillaume.lemaitre@inria.fr>
# License: BSD 3 clause

import numpy as np

from ..base import BaseEstimator, RegressorMixin, clone
from ..linear_model import LinearRegression
from ..utils.fixes import signature
from ..utils.validation import check_is_fitted, check_array
from ._function_transformer import FunctionTransformer

__all__ = ['TransformTargetRegressor']


class TransformTargetRegressor(BaseEstimator, RegressorMixin):
"""Meta-estimator to regress on a transformed target.

Useful for applying a non-linear transformation in regression
problems. This transformation can be given as a Transformer such as the
QuantileTransformer or as a function and its inverse such as ``np.log`` and
Review comment (Member): perhaps just call these log and exp, leaving out the np

``np.exp``.

The computation during ``fit`` is::

Review comment (Member): can we remove this blank line?

regressor.fit(X, func(y))

or::

Review comment (Member): and this etc?

regressor.fit(X, transformer.transform(y))

The computation during ``predict`` is::

inverse_func(regressor.predict(X))

or::

transformer.inverse_transform(regressor.predict(X))

Parameters
----------
regressor : object, (default=LinearRegression())
Regressor object such as derived from ``RegressorMixin``. This
regressor will be cloned during fitting.
Review comment (Member): "This regressor will automatically be cloned each time prior to fitting."


transformer : object, (default=None)
Estimator object such as derived from ``TransformerMixin``. Cannot be
set at the same time as ``func`` and ``inverse_func``. If ``None`` and
``func`` and ``inverse_func`` are ``None`` as well, the transformer
will be an identity transformer. The transformer will be cloned during
fitting.

func : function, optional
Function to apply to ``y`` before passing to ``fit``. Cannot be set at
the same time than ``transformer``. If ``None`` and ``transformer`` is
``None`` as well, the function used will be the identity function.

inverse_func : function, optional
Function to apply to the prediction of the regressor. Cannot be set at
the same time than ``transformer``. If ``None`` and ``transformer`` as
Review comment (Member): at the same time as

(and the same for func)

Review comment (Member): and transformer is None as well

well, the function used will be the identity function. The inverse
function is used to return to the same space of the original training
labels during prediction.

check_inverse : bool, (default=True)
Review comment (Member): , ( is awkward. either drop the () or drop the ,

Whether to check that ``transform`` followed by ``inverse_transform``
or ``func`` followed by ``inverse_func`` leads to the original data.
Review comment (Member): nitpick: data -> targets.


Attributes
----------
regressor_ : object
Fitted regressor.

transformer_ : object
Used transformer in ``fit`` and ``predict``.
Review comment (Member): "Transformer used"


y_ndim_ : int
Number of targets.

Examples
--------
>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from sklearn.preprocessing import TransformTargetRegressor
>>> tt = TransformTargetRegressor(regressor=LinearRegression(),
... func=np.log, inverse_func=np.exp)
>>> X = np.arange(4).reshape(-1, 1)
>>> y = np.exp(2 * X).ravel()
>>> tt.fit(X, y)
... #doctest: +NORMALIZE_WHITESPACE
TransformTargetRegressor(check_inverse=True,
Review comment (Member): Is it useful to show this?

func=<ufunc 'log'>,
inverse_func=<ufunc 'exp'>,
regressor=LinearRegression(copy_X=True,
fit_intercept=True,
n_jobs=1,
normalize=False),
transformer=None)
>>> tt.score(X, y)
1.0
>>> tt.regressor_.coef_
array([ 2.])

"""
def __init__(self, regressor=None, transformer=None,
func=None, inverse_func=None, check_inverse=True):
self.regressor = regressor
self.transformer = transformer
self.func = func
self.inverse_func = inverse_func
self.check_inverse = check_inverse

def _fit_transformer(self, y, sample_weight):
if (self.transformer is not None and
(self.func is not None or self.inverse_func is not None)):
raise ValueError("Both 'transformer' and functions 'func'/"
Review comment (Member): drop "Both" or move it to become "cannot both be set" (without "at the same time").

"'inverse_func' cannot be set at the same time.")
elif self.transformer is not None:
self.transformer_ = clone(self.transformer)
else:
self.transformer_ = FunctionTransformer(
func=self.func, inverse_func=self.inverse_func, validate=False)
fit_parameters = signature(self.transformer_.fit).parameters
Review comment (Member): We should really have a helper for that but whatever ...

Reply (Member): good point! does this break if the transformer's fit comes from a @if_delegate_has_method ??

Reply (Member): should not, I think....

Reply (Member): Can a fit ever come from an "@if_delegate_has_method"? I would understand for a predict or a transform, but a fit?

Reply (Member): good point, makes no sense to have it

Reply (Member): I think you are looking for utils.validation.has_fit_parameter? (https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/utils/validation.py#L603)

Reply (Member): I think this is the wrong strategy. We should pass sample_weight iff it is not None, or we should never pass in weights until we have a prop routing API. Passing on the basis of the signature is brittle (it is silent when the weights are not passed even if the user passed in weights explicitly), and would make the code a mess if more properties were supported. Unless I've missed some motivation for this.

Reply (Member, author): In fact, I could not find a transformer with sample_weight. I replicated the code that we have in the bagging estimator:

    support_sample_weight = has_fit_parameter(self.regressor_, 'sample_weight')
    if support_sample_weight:
        if sample_weight is None:
            current_sample_weight = np.ones((y.shape[0],))
        else:
            current_sample_weight = sample_weight

In case transformers should handle sample_weight, it would be a use case for sample props, since a different sample_weight could be passed to the regressor and to the transformer.
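A short sketch of the alternative suggested above (forward sample_weight only when the caller actually supplied it, instead of inspecting the fit signature); this illustrates the suggestion and is not the code under review:

    # Hypothetical dispatch for _fit_transformer: pass weights iff given.
    # A transformer whose fit() does not accept sample_weight would then
    # raise a TypeError, making the mismatch loud instead of silently
    # dropping the user's weights.
    if sample_weight is not None:
        self.transformer_.fit(y, sample_weight=sample_weight)
    else:
        self.transformer_.fit(y)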

if "sample_weight" in fit_parameters:
self.transformer_.fit(y, sample_weight=sample_weight)
else:
self.transformer_.fit(y)
if self.check_inverse:
Review comment (Member): does it make sense to move this check inside FunctionTransformer? (at some point, maybe not here)

Reply (Member): So this will remain here regardless of its being present in FunctionTransformer? And will remain an error rather than a warning?

Reply (Member, author): I changed it in the FunctionTransformer but I forgot to change it here.

n_subsample = min(10, y.shape[0])
subsample_idx = np.random.choice(range(y.shape[0]),
Review comment (Member): Unprotected np.random: we need to use a random_state

size=n_subsample, replace=False)
if not np.allclose(
Review comment (Member): what's the default tolerance? maybe we want to lower that a little bit?

y[subsample_idx],
self.transformer_.inverse_transform(
self.transformer_.transform(y[subsample_idx])),
atol=1e-4):
Review comment (Member): this is surprisingly high tolerance to me

Reply (Member, author): 1e-7 seems better to you?

raise ValueError("The provided functions or transformer are"
" not strictly inverse of each other. If"
" you are sure you want to proceed regardless"
", set 'check_inverse=False'")
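
A minimal sketch of the seeding fix requested above, using sklearn.utils.check_random_state (this assumes a random_state parameter on the estimator; the merged resolution is not shown here):

    from sklearn.utils import check_random_state

    # Hypothetical: draw the round-trip subsample from a seeded RNG
    # instead of the unprotected global np.random state.
    rng = check_random_state(self.random_state)  # assumed parameter
    n_subsample = min(10, y.shape[0])
    subsample_idx = rng.choice(y.shape[0], size=n_subsample, replace=False)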

def fit(self, X, y, sample_weight=None):
"""Fit the model according to the given training data.

Parameters
----------
X : {array-like, sparse matrix}, shape (n_samples, n_features)
Training vector, where n_samples is the number of samples and
n_features is the number of features.

y : array-like, shape (n_samples,)
Target values.

sample_weight : array-like, shape (n_samples,) optional
Array of weights that are assigned to individual samples.
If not provided, then each sample is given unit weight.

Returns
-------
self : object
Returns self.
"""
y = check_array(y, ensure_2d=False)
self.y_ndim_ = y.ndim
Review comment (Member): Need to check_array before this.

Should probably document this attribute.

if y.ndim == 1 and self.func is None:
y_2d = y.reshape(-1, 1)
Review comment (Member): I suspect we don't want to do this when func and inverse_func are provided?

Reply (Member, author): I come back on this point. I am not sure this is a great idea, since it changes the behaviour depending on whether you pass a function or a transformer. We could still make the transform 2d and build the FunctionTransformer with validate=True, and the behaviour would always be the same.

I don't see a case in which the user would define a function that works on a 1D array but would fail on a 2D array.

else:
y_2d = y
self._fit_transformer(y_2d, sample_weight)
if self.regressor is None:
self.regressor_ = LinearRegression()
else:
self.regressor_ = clone(self.regressor)
if sample_weight is not None:
self.regressor_.fit(X, self.transformer_.fit_transform(y_2d),
sample_weight=sample_weight)
else:
self.regressor_.fit(X, self.transformer_.fit_transform(y_2d))
return self

def predict(self, X):
"""Predict using the base regressor, applying inverse.

The regressor is used to predict and the ``inverse_func`` or
``inverse_transform`` is applied before returning the prediction.

Parameters
----------
X : {array-like, sparse matrix}, shape = (n_samples, n_features)
Samples.

Returns
-------
y_hat : array, shape = (n_samples,)
Predicted values.

"""
check_is_fitted(self, "regressor_")
pred = self.transformer_.inverse_transform(self.regressor_.predict(X))
if self.y_ndim_ == 1 and self.func is None:
Review comment (Member): why the second condition?

Reply (Member, author): To avoid the useless reshaping for the func and inverse_func case. See Joel's comment.

return pred.ravel()
Review comment (Member): I prefer ".squeeze" rather than ".ravel": ravel is too flexible and can hide errors

else:
return pred
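
To make the squeeze-versus-ravel point concrete, a small standalone example (illustrative only, not from the PR):

    >>> import numpy as np
    >>> pred = np.zeros((5, 2))            # unexpected multi-output shape
    >>> pred.ravel().shape                 # silently flattens, hiding a bug
    (10,)
    >>> pred.squeeze().shape               # only drops size-1 axes
    (5, 2)
    >>> np.zeros((5, 1)).squeeze().shape   # the intended 2d-to-1d case
    (5,)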