
OneVsRestClassifier's fit method does not accept kwargs #10882

Open
klamann opened this issue Mar 28, 2018 · 14 comments

Comments

@klamann

klamann commented Mar 28, 2018

Description

The fit method of OneVsRestClassifier and other meta-estimators (e.g. OneVsOneClassifier) only accepts X and y as parameters. Underlying estimators like SGDClassifier accept additional optional keyword arguments in fit that are essential for some tasks (e.g. the sample_weight parameter is the only way to add weights to training samples in multi-label classification problems).

Steps/Code to Reproduce

Here's how I solve a multi-label classification task with a linear SVM

from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import SGDClassifier

X = [[0.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
y = [[0], [1], [0, 1]]
y_mlb = MultiLabelBinarizer().fit_transform(y)
sample_weight = [1.0, 0.5, 0.8]
clf = OneVsRestClassifier(SGDClassifier(loss="hinge"))
clf.fit(X, y_mlb)   # unable to pass `sample_weight`

See also this related question on Stack Overflow.

Expected Results

For regular (single-label) classification tasks, I can pass the sample_weight kwarg directly to SGDClassifier.fit, but with OneVsRestClassifier this is not possible.
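
For example, the following works, reusing X and sample_weight from the snippet above (single-label targets made up for illustration):

# single-label case: sample_weight is accepted directly by SGDClassifier.fit
clf = SGDClassifier(loss="hinge")
clf.fit(X, [0, 1, 1], sample_weight=sample_weight)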

Actual Results

I cannot add weights to my training samples when OneVsRestClassifier (or a similar meta-estimator) is used.

Feature Request

Please let the fit method of OneVsRestClassifier (and similar meta-estimators) accept arbitrary kwargs and pass them on to the fit method of the wrapped estimator. The same may be useful for partial_fit and score, though I'm not sure about that.
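
Roughly, the call I'd like to be able to make would look like this (hypothetical, not currently supported):

# desired behavior: extra kwargs are forwarded to the wrapped estimator's fit
clf = OneVsRestClassifier(SGDClassifier(loss="hinge"))
clf.fit(X, y_mlb, sample_weight=sample_weight)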

@klamann
Author

klamann commented Mar 28, 2018

Here's a workaround for OneVsRestClassifier. Things that changed:

  • replaced sklearn.multiclass._fit_binary with a slightly adjusted implementation that accepts **kwargs and simply forwards them to estimator.fit(X, y, **kwargs)
  • added **kwargs as parameter to OneVsRestClassifier.fit
  • replaced the call to _fit_binary with the custom implementation and added **kwargs as parameter
import warnings

import numpy as np
from sklearn import clone
from sklearn.externals.joblib import Parallel, delayed
from sklearn.multiclass import OneVsRestClassifier, _ConstantPredictor
from sklearn.preprocessing import LabelBinarizer


# alternative to sklearn.multiclass._fit_binary
def _fit_binary(estimator, X, y, classes=None, **kwargs):
    unique_y = np.unique(y)
    if len(unique_y) == 1:
        if classes is not None:
            if y[0] == -1:
                c = 0
            else:
                c = y[0]
            warnings.warn("Label %s is present in all training examples." % str(classes[c]))
        estimator = _ConstantPredictor().fit(X, unique_y)
    else:
        estimator = clone(estimator)
        estimator.fit(X, y, **kwargs)
    return estimator


class OneVsRestClassifierPatched(OneVsRestClassifier):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def fit(self, X, y, **kwargs):
        # binarize the labels, then fit one binary classifier per label,
        # forwarding any extra kwargs (e.g. sample_weight) to each fit call
        self.label_binarizer_ = LabelBinarizer(sparse_output=True)
        Y = self.label_binarizer_.fit_transform(y)
        Y = Y.tocsc()
        self.classes_ = self.label_binarizer_.classes_
        columns = (col.toarray().ravel() for col in Y.T)
        self.estimators_ = Parallel(n_jobs=self.n_jobs)(delayed(_fit_binary)(
            self.estimator, X, column, classes=[
                "not %s" % self.label_binarizer_.classes_[i],
                self.label_binarizer_.classes_[i]], **kwargs)
            for i, column in enumerate(columns))
        return self

Code based on https://stackoverflow.com/a/49535681/599739
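
Usage then looks like this (sketch, reusing the data and SGDClassifier import from the description):

# the extra kwargs now reach SGDClassifier.fit for every binary sub-problem
clf = OneVsRestClassifierPatched(SGDClassifier(loss="hinge"))
clf.fit(X, y_mlb, sample_weight=sample_weight)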

@jnothman
Member

You can use SGDClassifier (and all classifiers in scikit-learn) for multiclass. It has inbuilt OvR. But I agree that OneVsOneClassifier.fit should support kwargs.
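
For example (sketch, made-up data):

# SGDClassifier handles a multiclass (single-label) y directly through its
# built-in one-vs-rest strategy, and sample_weight goes straight to fit
from sklearn.linear_model import SGDClassifier

X = [[0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [0, 1, 2, 1]
sample_weight = [1.0, 0.5, 0.8, 1.0]

SGDClassifier(loss="hinge").fit(X, y, sample_weight=sample_weight)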

@jnothman jnothman added Easy, Enhancement, help wanted labels Mar 28, 2018
@vivekk0903
Contributor

@jnothman Multi-class isn't the issue here. SGDClassifier doesn't support a multi-label indicator matrix for y in fit() or partial_fit().
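
For example (sketch, made-up data):

# passing a multi-label indicator matrix directly to SGDClassifier fails
import numpy as np
from sklearn.linear_model import SGDClassifier

X = [[0.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
Y = np.array([[1, 0], [0, 1], [1, 1]])   # one column per label

SGDClassifier(loss="hinge").fit(X, Y)    # raises ValueError (y must be a 1d array of labels)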

@jnothman
Member

jnothman commented Mar 29, 2018 via email

@klamann
Author

klamann commented Mar 29, 2018

This doesn't look too hard, actually. I'm considering contributing a patch for this. Here's what I would do:

  • add **kwargs to the parameter list of _fit_binary and _partial_fit_binary in sklearn.multiclass and forward them to estimator.fit / estimator.partial_fit
  • add **kwargs to the parameter list of fit and partial_fit of OneVsOneClassifier, OneVsRestClassifier and OutputCodeClassifier and forward them to the respective calls to _fit_binary and _partial_fit_binary
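
For the partial_fit path, the change would look roughly like this (a sketch based on the current _partial_fit_binary with **kwargs added, not an actual patch):

import numpy as np

# sketch: sklearn.multiclass._partial_fit_binary with a **kwargs pass-through
def _partial_fit_binary(estimator, X, y, **kwargs):
    estimator.partial_fit(X, y, np.array((0, 1)), **kwargs)
    return estimator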

We could also adjust other classes that extend the MetaEstimatorMixin following the same pattern, if that is desired. This would concern ClassifierChain, MultiOutputEstimator, RFE, RFECV and RANSACRegressor. By the way, this has already been implemented in SelectFromModel and BaseSearchCV.

@klamann
Author

klamann commented Mar 31, 2018

I've implemented most of the changes that I outlined in my last comment, but while I was writing unit tests I ran into an issue that I'm not sure how to solve.

It's about parameter validation. Many meta-estimators check the fit signature of the estimator they wrap with something like has_fit_parameter(self.estimator, 'sample_weight'), and raise a ValueError when sample weights have been provided but the estimator's fit method doesn't have that parameter name.
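
For reference, that check boils down to signature inspection (sketch):

from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.utils.validation import has_fit_parameter

has_fit_parameter(SGDClassifier(), "sample_weight")                       # True
has_fit_parameter(OneVsRestClassifier(SGDClassifier()), "sample_weight")  # False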

I've added support for sample weights to pretty much all meta estimators, but not by explicitly adding a sample_weight parameter. Instead, I'm using a more generic **fit_params catch-all kwarg that is forwarded to the underlying estimator, like the one used in SelectFromModel.fit, which has the advantage that we can pass arbitrary parameters to fit.

However, this kind of validation doesn't work when we use kwargs instead of explicit parameter names. So now we could:

  • additionally check everywhere if fit_params is available when sample_weight is not
  • explicitly add sample_weight to all meta estimator's fit methods
  • do something different?

Any thoughts on this?

@jnothman
Member

jnothman commented Apr 2, 2018 via email

@klamann
Author

klamann commented Apr 2, 2018

I agree, has_fit_parameter is not a sufficient solution for these cases, but I'm not sure what solution you have in mind. Should we get rid of the check for sample_weight and test whether the fit function accepts arbitrary kwargs instead? I'm also not sure what you mean by "Removing the Nones may suffice", could you elaborate on that?

@jnothman
Member

jnothman commented Apr 2, 2018 via email

@klamann
Author

klamann commented Apr 2, 2018

Yeah, wrong context. I was talking about function inspection, which is used by has_fit_parameter to decide whether an estimator's fit method can accept sample weights in the first place. Currently, there are checks in several places that test whether the fit method has a sample_weight parameter, but now that we are going to allow arbitrary kwargs, this is obsolete.

I'm just testing a solution where I replace the calls to has_fit_parameter with a new function check_sample_weight_support, which not only looks for the presence of a sample_weight parameter, but also returns a positive result when a **kwarg of some kind is present. I'll post a link to the commit when I'm done.
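
Roughly along these lines (just a sketch of the idea, not the final code):

import inspect

# sketch: treat an explicit sample_weight parameter or a **kwargs catch-all
# in the estimator's fit signature as "sample weights are supported"
def check_sample_weight_support(estimator):
    params = inspect.signature(estimator.fit).parameters.values()
    return any(p.name == "sample_weight" or p.kind == inspect.Parameter.VAR_KEYWORD
               for p in params)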

@klamann
Author

klamann commented Apr 2, 2018

OK, I'm pausing work on this until we find a solution for the problem of deciding whether an estimator supports sample weights. Here's the issue in short:

  • In a lot of places, meta estimators (and unit tests) check if the underlying estimator supports sample weights by inspecting the parameter list of the estimator's fit method with has_fit_parameter.
  • This works if we have a single layer of abstraction, e.g. OneVsRestClassifier(SVC()).
  • In more complex (yet common) scenarios, this fails, e.g. when we have a Pipeline that has OneVsRestClassifier(SVC()) as last estimator (the fit method of OvR does not have a sample_weight param, but SVC does) or when we chain meta estimators (e.g. MultiOutputClassifier(OneVsRestClassifier(SVC())))
  • I tried to circumvent this issue by replacing the has_fit_parameter routine with a less strict check that allows meta-estimators to pass sample weights if any kind of **kwargs parameter is present. This failed, however, because functions like bagging._parallel_build_estimators always generate a uniform list of sample weights when they detect that the underlying estimator might support them; this adds no value, but it breaks my solution. There is also a ton of unit tests that would pass sample_weight to estimators that have **kwargs in their fit method but don't actually support sample weights...

In my opinion it is generally a bad idea to have meta-estimators reason about underlying code by inspecting highly implementation-specific details like function arguments. Accepting sample weights is an inherent property of an estimator and we should have a well-defined way to check whether an estimator has it. Then, meta-estimators could forward the question of whether sample weights are supported to their underlying estimator, through as many layers as there may be (e.g. Pipeline would ask its _final_estimator, OvR would ask its estimator, and SVC would just return True, because we know that support vector machines can handle sample weights).
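
To illustrate the idea (purely a sketch; a supports_sample_weight method like this does not exist anywhere, and the classes below are toy stand-ins, not the real implementations):

class ToySVC:
    def supports_sample_weight(self):
        return True  # SVMs are known to handle sample weights

class ToyOneVsRest:
    def __init__(self, estimator):
        self.estimator = estimator

    def supports_sample_weight(self):
        # ask the wrapped estimator instead of inspecting its fit signature
        return self.estimator.supports_sample_weight()

class ToyPipeline:
    def __init__(self, steps):
        self.steps = steps

    def supports_sample_weight(self):
        # delegate to the final estimator, however many layers deep it is
        return self.steps[-1][1].supports_sample_weight()

ToyPipeline([("clf", ToyOneVsRest(ToySVC()))]).supports_sample_weight()  # True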

If we want to do this properly, we'll have to touch a lot of code... Maybe we should split this issue, have a simple solution for OvR and OvO (which will have the same problems as the other implementations, but it'll probably suffice for 80% of the use cases) and have a separate effort to find a proper solution for the sample weight detection problem.

@jnothman
Member

jnothman commented Apr 2, 2018 via email

@fujiaxiang
Contributor

Hi, I've been dealing with something related lately and came across this issue.

I read through earlier discussions. I agree that in general we really should avoid having meta-models "inspecting implementation-specific stuff like function arguments". However, if we limit the use of has_fit_parameter to rare exceptions like the bagging case, it would be okay to keep it and implement things in a trial-and-error fashion.

That said, we probably should remove the use of has_fit_parameter in some other cases. One example would be _MultiOutputEstimator.fit in sklearn/multioutput.py. My take is that since this method cannot know whether the underlying model supports sample_weight, it should not have a named sample_weight parameter, but rather absorb it into **fit_params and pass it down to the underlying model if present (see the sketch below).
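
A stripped-down version of the forwarding pattern I mean (illustrative only, not the actual _MultiOutputEstimator code):

import numpy as np
from sklearn.base import clone

# illustrative only: the meta-estimator never names sample_weight itself,
# it just forwards whatever fit_params it receives to each per-output fit
class ForwardingMultiOutput:
    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, Y, **fit_params):
        Y = np.asarray(Y)
        self.estimators_ = []
        for i in range(Y.shape[1]):
            est = clone(self.estimator)
            est.fit(X, Y[:, i], **fit_params)  # passed down untouched
            self.estimators_.append(est)
        return self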

BTW it seems that the use of has_fit_parameter has already been removed from VotingClassifier. So that's probably a good sign.

@jnothman Any thoughts?

@cmarmo cmarmo added Needs Decision, module:multiclass and removed Easy, help wanted labels Jan 15, 2022
@adrinjalali
Member

#24027 should fix this.
