New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG + 2] Rename scorers like `mse` to `neg_mse` #7261

Merged
merged 15 commits into from Sep 8, 2016

Conversation

Projects
None yet
9 participants
@betatim
Contributor

betatim commented Aug 27, 2016

Reference Issue

Fixes #2439

What does this implement/fix? Explain your changes.

Renaming scorers for which smaller is better (like MSE) to neg_mse so that they fit the idea of "bigger is better".

  • mean_squared_error
  • mean_absolute_error
  • log_loss
  • median_absolute_error

@betatim betatim changed the title from Rename scorers like `mse` to `neg_mse` to [WIP] Rename scorers like `mse` to `neg_mse` Aug 27, 2016

betatim added some commits Aug 27, 2016

Rename smaller-is-better scoring metrics
Rename scorers like MSE to neg_MSE so that it is less
surprising that they return negative values.

@betatim betatim changed the title from [WIP] Rename scorers like `mse` to `neg_mse` to [MRG] Rename scorers like `mse` to `neg_mse` Aug 27, 2016

Introduce deprecation warning and fix tests
get_scrorer now warns if you use an old name for a scorer
and tests have been updated to use new naming convention.
mean_absolute_error_scorer = make_scorer(mean_absolute_error,
greater_is_better=False)
median_absolute_error_scorer = make_scorer(median_absolute_error,
greater_is_better=False)

This comment has been minimized.

@ogrisel

ogrisel Aug 27, 2016

Member

Because those variables do not have a leading _ the can unfortunately be considered public API even if not in the documentation. I think we should maintain backward compat aliases until 0.20:

# Backward compat alias to keep until the end of the deprecation period (0.20)
mean_squared_error_scorer = neg_mean_squared_error_scorer
mean_absolute_error_scorer = neg_mean_absolute_error_scorer
median_absolute_error_scorer = neg_median_absolute_error_scorer

This comment has been minimized.

@ogrisel

ogrisel Aug 27, 2016

Member

Maybe we could add a deprecation_msg option to make_scorer to make sure that the deprecation warning is raised when the user actually call the scorer instead of raising the warning in get_scorer that only covers a subset of the public API.

neg_median_absolute_error_scorer = make_scorer(median_absolute_error,
                                               greater_is_better=False)
median_absolute_error_scorer = make_scorer(
    median_absolute_error,
    greater_is_better=False,
    deprecation_msg='Scoring method median_absolute_error was renamed to '
                    'neg_median_absolute_error in version 0.18 and will be '
                    'removed in 0.20.'
)

This comment has been minimized.

@amueller

amueller Aug 29, 2016

Member

why not deprecated(make_scorer(median_absolute_error, greater_is_better=False)) ?

This comment has been minimized.

@amueller

amueller Aug 29, 2016

Member

Ah, it's an object, not a function, so it doesn't go into the right branch in deprecated, right? We could still wrap the __call__ method, right?

This comment has been minimized.

@amueller

amueller Aug 29, 2016

Member

deprecated("message")._decorate_fun(make_scorer(median_absolute_error, greater_is_better=False))
seems to be the right solution imho.

This comment has been minimized.

@betatim

betatim Aug 30, 2016

Contributor

Neither of the deprecated()._decorate_* work because make_scorer doesn't return a class nor a function :-/ To get the __name__ of a class instance you have to do obj.__class__.__name__.

I like handling the deprecation stuff as a wrapper instead of extra argument. I made a first proposal in 09cd4c7. It still has a few problems but it would allow you to do:

deprecated('hello')(make_scorer(median_absolute_error, greater_is_better=False))

This will however print Thing _PredictScorer in deprecated... and I think it will be quite hard to find out the correct name to print in a generic way inside deprecated() ... but do we want to cook something that is special to handling make_scorer??

This comment has been minimized.

@ogrisel

ogrisel Aug 31, 2016

Member

I would rather avoid making the sklearn.utils.deprecation.deprecated wrapper too complex. I think it's good enough to not use it at all in this case and just use a _deprecation_msg attribute on _BaseScorer as suggested in #7261 (comment).

This comment has been minimized.

@amueller

amueller Aug 31, 2016

Member

I wouldn't want to make the wrapper to complex, but I feel that deprecating the call method is really the easiest thing. We also don't need to change the wrapper for that. It will then say __call__ is deprecated and we can let the message explain that.

@@ -333,19 +342,19 @@ def make_scorer(score_func, greater_is_better=True, needs_proba=False,
recall_scorer = make_scorer(recall_score)
# Score function for probabilistic classification
log_loss_scorer = make_scorer(log_loss, greater_is_better=False,
needs_proba=True)

This comment has been minimized.

@ogrisel

ogrisel Aug 27, 2016

Member

Same remark here.

accuracy=accuracy_scorer, roc_auc=roc_auc_scorer,
average_precision=average_precision_scorer,
log_loss=log_loss_scorer,
neg_log_loss=neg_log_loss_scorer,
adjusted_rand_score=adjusted_rand_scorer)

This comment has been minimized.

@ogrisel

ogrisel Aug 27, 2016

Member

We should read the deprecated keys in the SCORERS dict as well. Hoping that our users will rather use get_scorer to get the deprecation message.

This comment has been minimized.

@amueller

This comment has been minimized.

@betatim

betatim Aug 30, 2016

Contributor

We translated read -> leave at the sprint

@@ -906,7 +906,7 @@ def test_cross_val_score_with_score_func_regression():
# Mean squared error; this is a loss function, so "scores" are negative
mse_scores = cval.cross_val_score(reg, X, y, cv=5,

This comment has been minimized.

@ogrisel

ogrisel Aug 27, 2016

Member

Please rename to neg_mse_scores or multiply by -1.

@@ -348,7 +348,8 @@ def test_cross_val_score_with_score_func_regression():
assert_array_almost_equal(r2_scores, [0.94, 0.97, 0.97, 0.99, 0.92], 2)
# Mean squared error; this is a loss function, so "scores" are negative
mse_scores = cross_val_score(reg, X, y, cv=5, scoring="mean_squared_error")
mse_scores = cross_val_score(reg, X, y, cv=5,
scoring="neg_mean_squared_error")

This comment has been minimized.

@ogrisel

ogrisel Aug 27, 2016

Member

Please rename to neg_mse_scores or multiply by -1.

betatim added some commits Aug 27, 2016

Keep old metric names for deprecation period
Maintain old names during the deprecation period and update
tests to use better variables names.
@ogrisel

This comment has been minimized.

Member

ogrisel commented Aug 29, 2016

There is a failed doctest in model_evaluation.rst:

https://travis-ci.org/scikit-learn/scikit-learn/jobs/155644998#L3998

@ogrisel

This comment has been minimized.

Member

ogrisel commented Aug 29, 2016

Also line 204 of get_scorer should also filter out the deprecated scores from the list of valid options in its error message.

@ogrisel ogrisel added this to the 0.18 milestone Aug 29, 2016

@betatim

This comment has been minimized.

Contributor

betatim commented Aug 29, 2016

Thanks. I had updated the expected output but not the actual code ... both fixed now.

median_absolute_error=median_absolute_error_scorer,
mean_absolute_error=mean_absolute_error_scorer,
mean_squared_error=mean_squared_error_scorer,
accuracy=accuracy_scorer, roc_auc=roc_auc_scorer,
average_precision=average_precision_scorer,
log_loss=log_loss_scorer,

This comment has been minimized.

@amueller

amueller Aug 29, 2016

Member

we can't remove that, right? It's public API.

This comment has been minimized.

@betatim

betatim Aug 30, 2016

Contributor

Correct. Wasn't paying enough attention apparently :(

@amueller

This comment has been minimized.

Member

amueller commented Aug 29, 2016

thanks for working on this, it's very important I think.

@amueller amueller added the Blocker label Aug 29, 2016

@jnothman

This comment has been minimized.

Member

jnothman commented Aug 29, 2016

Sorry I've not been attuned to this thread. As I've proposed before, I think all deprecation should be in get_scorer: that's where the warning should be raised. The scorers themselves are not public API. Is that mistaken?

@amueller

This comment has been minimized.

Member

amueller commented Aug 29, 2016

Hm the scorer objects don't have underscores in their names, and the SCORERS dict that contains the instances is mentioned in the docs. I considered them public API.

@jnothman

This comment has been minimized.

Member

jnothman commented Aug 29, 2016

SCORERS yes, but I think that was a bad design choice :) if we deprecate SCORERS at the same time, do we win? I don't think the scorers themselves were ever intended as public. They were never included in classes.rst, they are as useful as their names, and they should be immutable. In practice we have lots of names imported or declared in non-underscored modules that are not assumed to be public; these are mere module-level variables. Next you'll argue that sklearn.metrics.scorer.qualified_name is also public API!

betatim added some commits Aug 30, 2016

Distinguish instances from functions in deprecated
Treat (callable) instances different from functions in
the deprecated decorator
@betatim

This comment has been minimized.

Contributor

betatim commented Aug 30, 2016

My original implementation was via get_scorer (ef64969) but after chatting with @ogrisel we decided that people will have been "naughty" and used both the sklearn.metrics.scorer.qualified_name and SCORERS from the outside (and that they should get a deprecation warning).

I think if we can organise for those users to get a warning as well without too much acrobatics then we should. No opinion on whether that warning should be the rename deprecation warning or a "stop using these internals" warning.

@@ -33,14 +34,15 @@
class _BaseScorer(six.with_metaclass(ABCMeta, object)):
def __init__(self, score_func, sign, kwargs):
def __init__(self, score_func, sign, kwargs, deprecation_msg=None):

This comment has been minimized.

@GaelVaroquaux

GaelVaroquaux Aug 30, 2016

Member

I think that we will want to get rid of the "deprecation_msg" in two release. Maybe add a comment here saying that this should be removed for 0.20.

This comment has been minimized.

@ogrisel

ogrisel Aug 31, 2016

Member

Maybe we could rename that parameter to _deprecation_msg to make it explicit that this is not public API.

This comment has been minimized.

@ogrisel

ogrisel Aug 31, 2016

Member

Or even not put _deprecation_msg in the __init__ at all but just as None initialized attribute on the _BaseScorer class and manually use

mean_scorer_error._deprecation_msg = "the message"

to activate the deprecation.

def __call__(self, estimator, X, y, sample_weight=None):
pass
if self._deprecation_msg is not None:
warnings.warn(self._deprecation_msg, category=DeprecationWarning)

This comment has been minimized.

@GaelVaroquaux

GaelVaroquaux Aug 30, 2016

Member

Don't you want a stacklevel=2 at least here, maybe even 3 (given that this method will always be called inside another)

@@ -76,6 +79,22 @@ def wrapped(*args, **kwargs):
return wrapped
def _decorate_instance(self, instance):
cls = instance.__class__
msg = "Thing %s is deprecated" % cls.__name__

This comment has been minimized.

@GaelVaroquaux

GaelVaroquaux Aug 30, 2016

Member

Why "Thing %s is deprecated"? Why not "%s is deprecated"?

This comment has been minimized.

@betatim

betatim Aug 30, 2016

Contributor

Because I am not at all sure that we want to use _decorate_instance, wanted to put it out there as an idea: #7261 (comment)

(conclusion: discussion in line comments isn't very findable)

@GaelVaroquaux

This comment has been minimized.

Member

GaelVaroquaux commented Aug 30, 2016

I've finished my pass. I am 👍 with this PR once my small remarks are addressed.

Also, flake8 is unhappy (check travis). I guess that you should address that.

@abstractmethod

This comment has been minimized.

@jnothman

jnothman Sep 5, 2016

Member

I think you can keep this.

@@ -42,6 +42,7 @@ def __init__(self, score_func, sign, kwargs):
# XXX deprecation_msg property again and make __call__ abstract again

This comment has been minimized.

@jnothman

jnothman Sep 5, 2016

Member

comment update?

@betatim

This comment has been minimized.

Contributor

betatim commented Sep 6, 2016

bump

@jnothman

This comment has been minimized.

Member

jnothman commented Sep 6, 2016

:)

@jnothman

This comment has been minimized.

Member

jnothman commented Sep 6, 2016

Examples not yet updated?

  • examples/model_selection/plot_underfitting_overfitting.py
  • examples/plot_kernel_ridge_regression.py
@ogrisel

This comment has been minimized.

Member

ogrisel commented Sep 6, 2016

The fix it-self looks good to me. +1 for merge once the examples are updated.

@betatim

This comment has been minimized.

Contributor

betatim commented Sep 6, 2016

Updated.

plt.plot(train_sizes, test_scores_svr.mean(1), 'o-', color="r",
plt.plot(train_sizes, -test_scores_svr.mean(1), 'o-', color="r",

This comment has been minimized.

@jnothman

jnothman Sep 6, 2016

Member

I think you should leave off this negation and change the ylabel to say "Negative".

This comment has been minimized.

@jnothman

jnothman Sep 6, 2016

Member

But what you have here is okay; better than what's currently plotted at http://scikit-learn.org/stable/auto_examples/plot_kernel_ridge_regression.html.

@ogrisel, any preference for plotting -MSE vs plotting MSE?

This comment has been minimized.

@betatim

betatim Sep 6, 2016

Contributor

I find plotting MSE and seeing it decrease easier on my brain than having to think "error, smaller is better but here it is negated so going up means closer to zero ..."

This comment has been minimized.

@jnothman

jnothman Sep 7, 2016

Member

The question from my perspective is whether it's worth encouraging users to negate output. Or whether, regardless of scoring function we'd like to encourage the convention that better is up.

This comment has been minimized.

@betatim

betatim Sep 7, 2016

Contributor

I have no opinion on this beyond previous comment. Who is going to arbitrate/decide?

This comment has been minimized.

@amueller

amueller Sep 7, 2016

Member

No strong opinion here. I'm fine with the -

This comment has been minimized.

@jnothman

jnothman Sep 8, 2016

Member

Merge then?

@jnothman

This comment has been minimized.

Member

jnothman commented Sep 6, 2016

LGTM other than that too

@jnothman jnothman changed the title from [MRG + 1] Rename scorers like `mse` to `neg_mse` to [MRG + 2] Rename scorers like `mse` to `neg_mse` Sep 6, 2016

@lesteve

This comment has been minimized.

Member

lesteve commented Sep 8, 2016

It would be nice to add tests to make sure that accessing the old scorers (via the different ways mentioned above) does give DeprecationWarning as expected.

@betatim bonus points if you feel like tackling my comment from a week ago which got lost in other (very likely more important) considerations.

@betatim

This comment has been minimized.

Contributor

betatim commented Sep 8, 2016

Tested.

I skipped testing the from sklearn.metrics.scorer import mean_squared_error_scorer kind of access. I think it would require a bit of import acrobatics to do based on a string name and not sure that is worth it.

@amueller

This comment has been minimized.

Member

amueller commented Sep 8, 2016

merge?

@betatim

This comment has been minimized.

Contributor

betatim commented Sep 8, 2016

Make it so.

@GaelVaroquaux GaelVaroquaux merged commit 4b2304f into scikit-learn:master Sep 8, 2016

2 of 3 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@GaelVaroquaux

This comment has been minimized.

Member

GaelVaroquaux commented Sep 8, 2016

Yey!

(I didn't do much, but I pressed the green button. What a pleasure!)

@GaelVaroquaux

This comment has been minimized.

Member

GaelVaroquaux commented Sep 8, 2016

Thanks heaps @betatim !

@amueller

This comment has been minimized.

Member

amueller commented Sep 8, 2016

awesome, thanks! this was was overdue

@betatim betatim deleted the betatim:negative-scorers branch Sep 8, 2016

@betatim

This comment has been minimized.

Contributor

betatim commented Sep 8, 2016

Thanks a lot for all the comments and extra 👀!

@raghavrv

This comment has been minimized.

Member

raghavrv commented Sep 9, 2016

Yay 🍻 Much needed PR!

@raghavrv

This comment has been minimized.

Member

raghavrv commented Sep 12, 2016

@jnothman This also closes #6028 #5023 right?

@amueller

This comment has been minimized.

Member

amueller commented Sep 12, 2016

It does. Closed those. Thanks @raghavrv

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment