Scoring functions don't know classes_ #6231

Open
pkch opened this Issue Jan 26, 2016 · 20 comments


8 participants

pkch commented Jan 26, 2016

Moving the discussion with @amueller from pydata/patsy#77 (comment).

Proposing to:

  • add an optional 'labels' argument to log_loss
  • add an argument to make_scorer to enable passing labels argument to the custom loss function

Related to this - perhaps also allow make_scorer's needs_threshold=True to be used for non-binary classification problems. Not sure why it's limited to binary. It is perfectly meaningful for multi-class situations, where the metric might use the decision function to evaluate based on rank order (for example, SVC doesn't have a cheap predict_proba but has a cheap decision_function, so this would allow it to be used with rank-based metrics). From make_scorer's perspective it just seems like an arbitrary restriction, because it shouldn't care what the custom loss function does with the decision_function output.
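
To make the first point concrete, here is a rough sketch of the problem (class names are made up; the labels= keyword at the end is what's being proposed, so it may not exist in your scikit-learn version):

import numpy as np
from sklearn.metrics import log_loss

# A test fold that happens to contain only two of the three classes
# the classifier was trained on.
y_true = np.array(["ham", "spam", "ham", "spam"])

# predict_proba output has one column per entry in clf.classes_,
# e.g. ["eggs", "ham", "spam"] -> three columns.
y_proba = np.array([[0.1, 0.8, 0.1],
                    [0.2, 0.1, 0.7],
                    [0.0, 0.9, 0.1],
                    [0.3, 0.2, 0.5]])

# Without being told the full label set, log_loss cannot align the
# columns with the classes and raises a ValueError.
try:
    log_loss(y_true, y_proba)
except ValueError as exc:
    print(exc)

# The proposed fix: an explicit labels argument.
log_loss(y_true, y_proba, labels=["eggs", "ham", "spam"])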


jnothman commented Jan 26, 2016

Responding to that second dot point only, make_scorer will automatically pass on any additional kwargs.


pkch commented Jan 28, 2016

@jnothman - I don't see how that would work. I meant that when the classifier finishes training, its classes_ attribute should be passed to the custom loss function. This would avoid having to call LabelEncoder inside the custom loss function (which is currently required to match the unlabeled np.ndarray returned by predict_proba with the labels in the true y).

make_scorer can only pass additional kwargs to the custom loss function if these arguments are known at the time make_scorer is called. But make_scorer is called to create a scorer, which is to be used on arbitrary future datasets, with arbitrary future classifiers. The class labels won't be known until the classifier is trained on a particular dataset.
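
For example, with the current API a custom loss has to do something like this (my_log_loss is just an illustration), and the labels have to be frozen before any classifier is fitted:

import numpy as np
from sklearn.metrics import make_scorer
from sklearn.preprocessing import LabelEncoder

def my_log_loss(y_true, y_proba, labels=None):
    # labels must list the classes in the same (sorted) order as the
    # columns of y_proba, i.e. the classifier's classes_ -- which the
    # scorer never sees, so they have to be supplied up front.
    le = LabelEncoder().fit(labels)
    col = le.transform(y_true)  # map true labels onto column indices
    eps = 1e-15
    p = np.clip(y_proba[np.arange(len(col)), col], eps, 1 - eps)
    return -np.mean(np.log(p))

# The labels are baked in here, before any dataset or classifier exists:
scorer = make_scorer(my_log_loss, greater_is_better=False,
                     needs_proba=True, labels=["eggs", "ham", "spam"])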

On a different note, if sklearn returned a DataFrame instead of an np.ndarray this would be unnecessary. But I doubt that's in the plans?


amueller commented Oct 7, 2016

+1 on adding labels to all metrics.

And I agree this is an issue with the current design of the scorers. I think the classifiers should actually pass the classes_ to the scorer. There are even more issues with this in cross-validation, for example, as different folds might have different classes, so we might want to tell all classifiers what the classes are (but that proposal was rejected a while ago because it really clutters the API).

I'm fine with a fix at the scorer level, though: the scorers could actually inspect the classes_ attribute and act accordingly. That is somewhat of a big API change, but I'd be happy to see a test and PRs.

This relates closely to how we compute R^2, by the way, but that is harder to fix. R^2 should use the training-set mean, not the test-set mean, but the scorer doesn't know it. Getting classes_ into the scorer call is easy, but the R^2 case is hard, as the model doesn't actually store the training mean :-(

Ideally I'd fix both issues in one go, but I'm not sure how to do that.
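
To make the R^2 point concrete, here is a toy illustration (the numbers are made up):

import numpy as np

# R^2 = 1 - SS_res / SS_tot.  r2_score computes SS_tot around the mean of
# the *test* targets; the argument above is that the *training* mean is
# the right baseline, but the scorer never gets to see it.
y_train_mean = 10.0                      # hypothetical training-target mean
y_test = np.array([11.0, 12.0, 13.0])
y_pred = np.array([11.5, 11.9, 12.8])

ss_res = np.sum((y_test - y_pred) ** 2)
ss_tot_test = np.sum((y_test - y_test.mean()) ** 2)    # current behaviour
ss_tot_train = np.sum((y_test - y_train_mean) ** 2)    # proposed baseline

print(1 - ss_res / ss_tot_test)    # ~0.85
print(1 - ss_res / ss_tot_train)   # ~0.98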

amueller changed the title from "Scoring functions" to "Scoring functions don't know classes_" on Oct 7, 2016

amueller added this to the 0.19 milestone on Oct 7, 2016


amueller commented Oct 7, 2016

I feel this is pretty important, as currently results can be pretty wrong if not all classes are present in a test set.

On second thought, I'm not sure what the right way to handle this is, though. Should we warn or error if macro-average is requested and not all classes are present?
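
For example (a hand-made illustration), a macro-averaged F1 is computed over a different number of classes depending on which labels the fold happens to contain:

import numpy as np
from sklearn.metrics import f1_score

# Trained on classes [0, 1, 2], but this test fold contains no class 2.
y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1])

# Labels inferred from the fold: macro-F1 averages over 2 classes (~0.73).
print(f1_score(y_true, y_pred, average="macro"))

# Labels given explicitly: class 2 contributes an F1 of 0 (with a warning)
# and the macro average drops to ~0.49 -- the two numbers are not
# comparable across folds with different class coverage.
print(f1_score(y_true, y_pred, labels=[0, 1, 2], average="macro"))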


jnothman commented Oct 8, 2016

I'd be interested in seeing scorers access classes_ to pass that on to capable metrics...



jnothman commented Jun 19, 2017

Re my comment at #9144 that scorer objects should look at the set of classes before cv, we should also consider how this might work with nested CV...


amueller commented Jun 20, 2017

Hm so we make labels a required parameter (with deprecation)?

And do we actually want to store the labels in the scorer? We could also just make it a required argument of __call__ and always provide it in GridSearchCV, cross_val_score, etc.

In general this problem also happens for the score method of GridSearchCV, where we don't have labels... we stored them in fit, but those were the training labels. I guess that's the best we can do?
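
Purely as a sketch of the second option (none of this exists), the scorer's __call__ could take the labels from the CV loop and fall back to the fitted estimator's classes_:

class _ProbaScorerWithLabels:
    # Hypothetical -- not the actual scikit-learn scorer implementation.
    def __init__(self, score_func, sign=1, **kwargs):
        self._score_func = score_func
        self._sign = sign
        self._kwargs = kwargs

    def __call__(self, estimator, X, y_true, labels=None):
        y_proba = estimator.predict_proba(X)
        if labels is None:
            # Fall back to what this particular fit happened to see.
            labels = getattr(estimator, "classes_", None)
        return self._sign * self._score_func(y_true, y_proba,
                                             labels=labels, **self._kwargs)

# GridSearchCV / cross_val_score would then call
#     scorer(estimator, X_test, y_test, labels=np.unique(y))
# so every fold is scored against the same label set.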


amueller commented Jun 20, 2017

If we don't pass the labels along with X and y, then doing it for the built-in scorers and the ones created with make_scorer would be quite different.


jnothman commented Jun 20, 2017


jmschrei commented Jun 25, 2017

The same issue arises in models that support partial_fit if a batch of data happens not to contain a certain label. Allowing the user to specify the classes more generally might be worth pursuing. Potentially an optional parameter?


amueller commented Jun 28, 2017

@jmschrei how so? The first call to partial_fit is required to specify all the classes.
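
i.e. something along these lines (SGDClassifier is just one example of an estimator with partial_fit):

import numpy as np
from sklearn.linear_model import SGDClassifier

X1, y1 = np.random.rand(10, 3), np.array([0, 1] * 5)  # first batch: classes 0 and 1 only
X2, y2 = np.random.rand(10, 3), np.array([1, 2] * 5)  # later batch introduces class 2

clf = SGDClassifier()
# classes must cover every label any future batch may contain
clf.partial_fit(X1, y1, classes=np.array([0, 1, 2]))
clf.partial_fit(X2, y2)  # fine: class 2 was declared up front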


amueller commented Jun 28, 2017

@jnothman so we add an extra constructor argument to all classifiers?
We said "no" to that a while ago, but I guess we can change that. I actually agree with your assessment.
So: an optional classes parameter in every classifier's constructor?
I'm happy to do that (or rather, have @aarshayj do it ;), but I would like buy-in from @ogrisel @agramfort @GaelVaroquaux if possible before embarking on this.


jnothman commented Jun 28, 2017


GaelVaroquaux commented Jun 28, 2017


jnothman commented Jun 28, 2017


ogrisel commented Jun 29, 2017

Explicit is better than implicit, and I guess that applies in this case. +1 for a PR that adds classes as an optional hyperparameter to all classifiers, along with a common test that looks roughly like this:

import numpy as np

rng = np.random.RandomState(42)
X = rng.normal(size=(50, 5))
y = rng.randint(2, 4, size=50)  # labels are 2 or 3

# classes beyond those present in y, including one (42) never seen at all
all_classes = np.array([1, 2, 3, 42])

for clf_type in all_classifier_types:  # every classifier supporting the new classes param
    clf = clf_type(classes=all_classes).fit(X, y)
    assert np.array_equal(clf.classes_, all_classes)

    if hasattr(clf, 'decision_function'):
        assert clf.decision_function(X).shape == (50, 4)
    if hasattr(clf, 'predict_proba'):
        assert clf.predict_proba(X).shape == (50, 4)
    ...

amueller commented Jul 13, 2017

See #9290 for a draft of the API.


adrinjalali commented Aug 31, 2018

Is adding classes to all classifier inits still on the table? If yes, I could work on it.


jnothman commented Sep 1, 2018


amueller commented Sep 1, 2018

There's actually a pretty good step towards this in #9585. There's also a PR that adds classes everywhere, #9532, but that's indeed hairy.
