[MRG] Enhancement: Add MAPE as an evaluation metric #10711
Conversation
I think you should note the sensitivity to small y_true, and perhaps we need to validate that y_true is non-zero.
Thanks for the thoroughness!
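To illustrate the reviewer's point, here is a minimal sketch (not code from this PR) showing how MAPE blows up as y_true values approach zero, even when the absolute error is unchanged:

```python
import numpy as np

def mape(y_true, y_pred):
    # MAPE as discussed in this PR: mean of |relative error|, times 100
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# the same absolute error of 0.1 gives wildly different MAPE values
# depending on the magnitude of y_true
print(mape([100.0], [100.1]))  # tiny relative error, MAPE ~ 0.1
print(mape([0.001], [0.101]))  # same absolute error, MAPE ~ 10000
```

With a literal zero in y_true the division produces inf (or nan for a 0/0), which is why the thread discusses validating y_true.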
doc/modules/model_evaluation.rst
Outdated
@@ -85,6 +85,7 @@ Scoring Function
**Regression**
'explained_variance'                   :func:`metrics.explained_variance_score`
'neg_mean_absolute_error'              :func:`metrics.mean_absolute_error`
'neg_mean_absolute_percentage_error'   :func:`metrics.mean_absolute_percentage_error`
@amueller suggested neg_mape here. I'm not entirely sure which is better. But you'll need to widen the table if we go with this name!
Yes, I am open to changing the name or keeping it as is. But I'll make the necessary changes once we agree on a name ;)
doc/modules/model_evaluation.rst
Outdated
@@ -104,7 +105,7 @@ Usage examples:
>>> model = svm.SVC()
>>> cross_val_score(model, X, y, scoring='wrong_choice')
Traceback (most recent call last):
- ValueError: 'wrong_choice' is not a valid scoring value. Valid options are ['accuracy', 'adjusted_mutual_info_score', 'adjusted_rand_score', 'average_precision', 'balanced_accuracy', 'brier_score_loss', 'completeness_score', 'explained_variance', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'fowlkes_mallows_score', 'homogeneity_score', 'mutual_info_score', 'neg_log_loss', 'neg_mean_absolute_error', 'neg_mean_squared_error', 'neg_mean_squared_log_error', 'neg_median_absolute_error', 'normalized_mutual_info_score', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'r2', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'roc_auc', 'v_measure_score']
+ ValueError: 'wrong_choice' is not a valid scoring value. Valid options are ['accuracy', 'adjusted_mutual_info_score', 'adjusted_rand_score', 'average_precision', 'balanced_accuracy', 'brier_score_loss', 'completeness_score', 'explained_variance', 'f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted', 'fowlkes_mallows_score', 'homogeneity_score', 'mutual_info_score', 'neg_log_loss', 'neg_mean_absolute_error', 'neg_mean_absolute_percentage_error', 'neg_mean_squared_error', 'neg_mean_squared_log_error', 'neg_median_absolute_error', 'normalized_mutual_info_score', 'precision', 'precision_macro', 'precision_micro', 'precision_samples', 'precision_weighted', 'r2', 'recall', 'recall_macro', 'recall_micro', 'recall_samples', 'recall_weighted', 'roc_auc', 'v_measure_score']
You have inspired #10712. Thanks!
sklearn/metrics/regression.py
Outdated
if y_type == 'continuous-multioutput':
    raise ValueError("Multioutput not supported "
                     "in mean_absolute_percentage_error")
return np.average(np.abs((y_true - y_pred)/y_true))*100
space around / and * please
Call mean rather than average just to match the name
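Folding both style comments into the snippet under review gives something like this (a sketch of the revised function, not the final merged code):

```python
import numpy as np

def mean_absolute_percentage_error(y_true, y_pred):
    # np.mean matches the metric's name, and the arithmetic
    # operators get the spaces requested in review
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100
```

np.mean and np.average are equivalent here since no weights are passed; the change is purely for readability.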
This pull request introduces 2 alerts when merging 7c516ef into 2e30df3 - view on lgtm.com new alerts:
Comment posted by lgtm.com
Yes, I think raise an error, but I don't know much about MAPE so I can't
say if there are alternative approaches (except the one on Wikipedia which
suggests getting differences relative to the mean of y_true).
"mean_absolute_percentage_error requires y_true to never be zero" would be
sufficient.
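A hedged sketch of the two options raised in this comment: raising with the suggested message when y_true contains a zero, and the Wikipedia-style alternative that measures errors relative to the mean of y_true (function names here are illustrative, not from this PR):

```python
import numpy as np

def mean_absolute_percentage_error(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if np.any(y_true == 0):
        # the error message suggested in this thread
        raise ValueError("mean_absolute_percentage_error requires "
                         "y_true to never be zero")
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

def mean_absolute_error_relative_to_mean(y_true, y_pred):
    # alternative mentioned on Wikipedia: divide by the mean of y_true
    # instead of element-wise, so individual zeros in y_true are tolerated
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred)) / np.abs(np.mean(y_true)) * 100
```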
@amueller, @jnothman, @qinhanmin2014 The test scenarios in the common tests generate a y_true that can contain zeros; this explains the failures on Travis CI. Could you please suggest what to do?
Pep8 consistency is a very poor excuse for naming inconsistency. I'm unsure what to do with the name, but pep8 is the last of our concerns.
You might need to add a new category in the common tests for regression metrics that require non-zero y, and update y in the tests accordingly.
@jnothman thanks for the tip.
sklearn/metrics/tests/test_common.py
Outdated
# matrix instead of a number. Testing of
# confusion_matrix with sample_weight is in
# test_classification.py
"confusion_matrix", # Left this one here because the tests in this file do
Please do not change unrelated things. It makes your contribution harder to review and may introduce merge conflicts to other pull requests.
Ok, my bad. I used autopep8 on the file.
See #10645!
@lesteve travis build finished successfully, so there is no regression with the new changes that you pushed. Thanks!
A few comments.
Just curious, maybe for @amueller who opened the associated issue: the Wikipedia page does not seem very kind to MAPE, e.g.:
"Although the concept of MAPE sounds very simple and convincing, it has major drawbacks in practical application". Is it still used despite its drawbacks?
@@ -85,6 +85,7 @@ Scoring Function
**Regression**
'explained_variance'        :func:`metrics.explained_variance_score`
'neg_mean_absolute_error'   :func:`metrics.mean_absolute_error`
'neg_mape'                  :func:`metrics.mean_absolute_percentage_error`
Why not spell it out fully here, like all the other metrics? i.e. neg_mean_absolute_percentage_error
@lesteve I clarified in the PR description above that the name has to be chosen/voted by all of us. Initially I used neg_mean_absolute_percentage_error, but then, since mape is already a famous acronym which also makes the metric name cleaner, I chose to switch to neg_mape. However, we can change back to the long version, if most of us think that's the right thing to do.
I would be in favour of the neg_mean_absolute_percentage_error version personally. It is more consistent with neg_mean_absolute_error and more consistent with the metric name (metrics.mean_absolute_percentage_error). Happy to hear what others think.
I would also be in favor of using the explicit expanded name by default and introducing neg_mape as an alias, as we do for neg_mse.
Actually we do not have neg_mse. I thought we had.
doc/whats_new/v0.20.rst
Outdated
@@ -91,6 +91,8 @@ Model evaluation
- Added the :func:`metrics.balanced_accuracy_score` metric and a corresponding
  ``'balanced_accuracy'`` scorer for binary classification.
  :issue:`8066` by :user:`xyguo` and :user:`Aman Dalmia <dalmia>`.
- Added the :func:`metrics.mean_absolute_percentage_error` metric and the associated
"Model evaluation" is not the best section for this I would say, in doc/whats_new/v0.19.0.rst there is a "Metrics" section. I think you can do the same here.
Nice catch, I will do that.
sklearn/metrics/scorer.py
Outdated
@@ -487,6 +489,9 @@ def make_scorer(score_func, greater_is_better=True, needs_proba=False,
mean_absolute_error_scorer = make_scorer(mean_absolute_error,
                                         greater_is_better=False)
mean_absolute_error_scorer._deprecation_msg = deprecation_msg
neg_mape_scorer = make_scorer(mean_absolute_percentage_error,
                              greater_is_better=False)
Maybe remove the new line here. When there is no clear rule, my advice would be to follow the same implicit convention as the code you are changing.
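For context, make_scorer with greater_is_better=False wraps a loss so that the scorer returns its negation, which is where the neg_ prefix in the scorer names comes from. A small self-contained sketch (the local mape helper is illustrative, not this PR's code):

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import make_scorer

def mape(y_true, y_pred):
    # illustrative local copy of the metric under discussion
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# greater_is_better=False makes the scorer return -mape, so that
# "higher is better" holds for model selection utilities
neg_mape_scorer = make_scorer(mape, greater_is_better=False)

X = np.zeros((4, 1))
y = np.array([1.0, 2.0, 3.0, 4.0])
model = DummyRegressor(strategy="mean").fit(X, y)  # always predicts 2.5
print(neg_mape_scorer(model, X, y))  # negative: a loss turned into a score
```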
@amueller can you approve if all the changes that you requested have been addressed? Thanks!
'scorer' isn't part of the name, firstly.
Secondly, I think the conventional name of the metric is stupid when I think it's better described as "mean relative error". The only "absolute" thing about it is that it's not squared, as far as I can tell. (And if this measure is known in the literature as "mean relative error" as well, I'd be tempted to use that name.)
Given the weirdness of its name, I'm tempted to say that the name "MAPE" has a special meaning beyond the words that constitute it, and hence we should consider the short name despite the inconsistency. But I don't like the inconsistency either.
I also don't think this is something to agonize over.
I don't see any reason this shouldn't be considered for merge, @mohamed-ali. I think the docs should emphasise a little more that the metric is not reported as a percentage here.
would you mind fixing the conflicts please?
Yes sure, I'll spend some time in the next two days to fix them. Thanks for letting me know.
thanks and sorry about the delay in reviewing
hm looks like there's a bunch of errors now :-/
Yes, many things changed since this PR was created. I'll spend more time debugging. If not, I might recreate the PR with a new branch from master.
Closing in favor of #15007. |
Reference Issues/PRs
Fixes #10708
Closes #6605
What does this implement/fix? Explain your changes.
- Implements sklearn.metrics.mean_absolute_percentage_error and the neg_mape scorer for regression problems.
- Adds tests in tests/test_common.py and tests/test_score_objects.py.
- Uses a y_true that doesn't include zeros in tests/test_common.py and tests/test_score_objects.py.
- Updates the documentation in doc/modules/model_evaluation.rst and doc/modules/classes.rst.
Any other comments?
- The scorer name is open for discussion: neg_mean_absolute_percentage_error_scorer or neg_mape_scorer.