1463 mean absolute percentage error #1510
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1510 +/- ##
=========================================
+ Coverage 100.0% 100.0% +0.1%
=========================================
Files 227 228 +1
Lines 15592 15629 +37
=========================================
+ Hits 15584 15621 +37
Misses 8 8
Continue to review full report at Codecov.
@ParthivNaresh This looks great! I think we should resolve the discussion about what to do with division by 0 before merging, but other than that I think it's good to merge!
I'm ok with holding off on making this the default objective for time series regression problems because of its noted limitations, namely the bias toward selecting estimators that under-estimate the target.
I think for the default objective for automl we should consider these two alternatives:
score_needs_proba = False
perfect_score = 0.0

def objective_function(self, y_true, y_predicted, X=None):
Maybe we should just raise a ValueError if any `y_true` are 0? My concern is that 0 can be a meaningful target value (like stock returns) and users may be misled into thinking that those records are being used in the evaluation when they are not.
"""Mean absolute percentage error for time series regression.""" | ||
name = "Mean Absolute Percentage Error" | ||
greater_is_better = False | ||
score_needs_proba = False |
nit-pick: You don't need to specify `score_needs_proba` since that is specified in `RegressionObjective`.
@@ -345,6 +345,17 @@ def objective_function(self, y_true, y_predicted, X=None):
    return metrics.mean_absolute_error(y_true, y_predicted)


class MAPE(RegressionObjective):
I think we should add this to the API reference too.
@ParthivNaresh looks great!! The only blocking comments I had were about using `eps` in the impl, and starting a thread with @freddyaboulton about whether or not we should define `TimeseriesRegressionObjective` right now.
perfect_score = 0.0

def objective_function(self, y_true, y_predicted, X=None):
    return (np.abs((y_true - y_predicted) / y_true))[y_true != 0].mean() * 100
@ParthivNaresh nice, looks good!
Sklearn 0.24.2 (a current release candidate) has a MAPE impl (docs). I like their use of max(y_true, eps) in the denominator. I think we should do that instead of excluding rows where `y_true` is exactly 0.
I also liked that they had weight support. We don't have a pattern for this yet from other objectives, and it's not a requirement for this PR, but it would be nice to have.
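For reference, a minimal sketch of that eps-clamped variant (the helper name, the use of `np.finfo(np.float64).eps`, and the `* 100` scaling are assumptions for illustration, not part of this PR):

```python
import numpy as np


def mape_with_eps(y_true, y_predicted):
    """MAPE with the denominator clamped to a tiny eps, mirroring the sklearn-style approach."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_predicted = np.asarray(y_predicted, dtype=np.float64)
    eps = np.finfo(np.float64).eps  # smallest representable relative spacing, ~2.2e-16
    # Clamp the denominator so rows with y_true == 0 are kept instead of dropped.
    return np.mean(np.abs(y_true - y_predicted) / np.maximum(np.abs(y_true), eps)) * 100
```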
@dsherry That's an interesting approach, but wouldn't that result in an extremely high value in the overall calculation? @freddyaboulton
Good point @dsherry! I think @ParthivNaresh is right because sklearn's definition of `eps` is way too small. Even if we used our own, I'm worried that there'll always be one use case where the value we pick is "wrong".
I prefer treating this how we treat `RootMeanSquaredLogError` or `MeanSquaredLogError`. Those objectives error out if passed targets with any negative values. They are also kept out of the core objectives because the user only uses them when they are defined for their particular dataset. Maybe we should add MAPE to `get_non_core_objectives`?
"sklearn's definition of eps is way too small"
Hmm, it's an arbitrary threshold, right? Setting it to min numerical precision makes sense to me because any differences smaller than that (more or less) cannot be trusted.
It seems we have a few options for what to do when the target contains values equal to or super close to zero (options not all mutually exclusive):
- Have the objective error out.
- Have the objective round the zero value up to a small eps in the denominator.
- Have automl error out during init if MAPE was selected as the automl objective. Similar treatment for RMSLE/MSLE if the target has negative values.
What behavior is most useful to users?
@freddyaboulton @ParthivNaresh I don't have a strong preference between 1 and 2--if you're both advocating for 1, let's do that in this PR. However, to really nail that question, I think we should also file 3 as an enhancement and address it separately.
I think the problem with the sklearn approach is that it will divide by eps whenever y=0, which will cause the score to blow up.
I don't think scores that large are useful to users, especially if they have a dataset where y=0 occurs frequently. I like option 1 because I prefer letting the user know at the beginning that this objective isn't great for problems with y=0 (rather than having them comb through our impl trying to reason why their objective is ~1e14).
You bring up a good point about AutoMLSearch! We should definitely file that. Currently we never allow RMSLE and MSLE as the primary objective, but I like the idea of only disallowing them if the target has negative values. Similar for MAPE. That sounds like it could be a data check!
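To put a rough number on that blow-up (purely illustrative values):

```python
import numpy as np

eps = np.finfo(np.float64).eps  # ~2.22e-16
# A single row with y_true == 0 and a prediction off by 0.5 contributes an enormous term,
# which dominates the mean no matter how accurate the other predictions are.
term = abs(0.0 - 0.5) / max(abs(0.0), eps)
print(f"{term:.3e}")  # ~2.252e+15
```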
@freddyaboulton agreed! And ooh yeah definitely, perhaps that could be part of the "InvalidTargetsDataCheck"
Filed #1521!
@@ -345,6 +345,17 @@ def objective_function(self, y_true, y_predicted, X=None):
    return metrics.mean_absolute_error(y_true, y_predicted)


class MAPE(RegressionObjective):
    """Mean absolute percentage error for time series regression."""
Should we mention that we've scaled this by 100 so that it's a percentage? I've seen it defined both with and without the 100 factor in the past.
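For example, the only difference between the two conventions is the factor of 100 (illustrative numbers; this PR uses the scaled form):

```python
import numpy as np

y_true = np.array([10.0, 20.0, 50.0])
y_predicted = np.array([11.0, 22.0, 55.0])

fraction = np.mean(np.abs(y_true - y_predicted) / np.abs(y_true))  # 0.1, unitless ratio
percentage = fraction * 100                                         # 10.0, the scaled form
```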
assert obj.score(s1_actual, s1_predicted) == pytest.approx(3.5 / 6 * 100)
assert obj.score(s2_actual, s2_predicted) == pytest.approx(4.83333333 / 7 * 100)
assert obj.score(s3_actual, s3_predicted) == pytest.approx(3.0625 / 7 * 100)
@ParthivNaresh nice 👍 I was able to walk through by hand and reproduce the math for the first example you have here. Only thought here is that if you could simplify the numbers a bit, make the arrays shorter, make some of those decimals more round etc., it would help make this math easier to follow.
@@ -345,6 +345,17 @@ def objective_function(self, y_true, y_predicted, X=None):
    return metrics.mean_absolute_error(y_true, y_predicted)


class MAPE(RegressionObjective):
@freddyaboulton did we decide not to define `TimeseriesRegressionObjective`? Or did we simply not decide yet, haha, since this is the first objective explicitly defined for time series? I can't remember where we left that conversation.
Previously, our objectives were only defined for a single problem type. We agreed to change this to a list of valid problem types and that our existing regression objectives should work for both regression and time series regression problems.
So long story short, we didn't define a `TimeSeriesRegressionObjective` because we didn't have any objectives that should only work for time series. We should make MAPE a `TimeSeriesRegressionObjective` in that case :) Good call.
Ah got it, thanks @freddyaboulton
So @ParthivNaresh, I think what we should do here is:
- Define `TimeseriesRegressionObjective`, which subclasses `RegressionObjective` but allows only `ProblemTypes.TIME_SERIES_REGRESSION`
- Have `MAPE` subclass `TimeseriesRegressionObjective`

That sound good?
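A rough sketch of what that plan could look like (the import paths and docstrings are assumptions; the `objective_function` body combines the zero-check and the computation discussed above):

```python
import numpy as np

from evalml.objectives.regression_objective import RegressionObjective  # assumed import path
from evalml.problem_types import ProblemTypes


class TimeseriesRegressionObjective(RegressionObjective):
    """Base class for objectives defined only for time series regression."""
    problem_types = [ProblemTypes.TIME_SERIES_REGRESSION]


class MAPE(TimeseriesRegressionObjective):
    """Mean absolute percentage error for time series regression. Scaled by 100 to return a percentage."""
    name = "Mean Absolute Percentage Error"
    greater_is_better = False
    perfect_score = 0.0

    def objective_function(self, y_true, y_predicted, X=None):
        # Error out rather than silently dropping or eps-clamping rows where y_true == 0.
        if (y_true == 0).any():
            raise ValueError("Mean Absolute Percentage Error cannot be used when "
                             "targets contain the value 0.")
        return (np.abs((y_true - y_predicted) / y_true)).mean() * 100
```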
@dsherry Sounds great!
…m/alteryx/evalml into 1463-mean-absolute-percentage-error
def objective_function(self, y_true, y_predicted, X=None):
    if (y_true == 0).any():
        raise ValueError("Mean Absolute Percentage Error cannot be used when "
                         "targets contain the value 0.")
👍 cool!
Set to [ProblemTypes.TIME_SERIES_REGRESSION]
"""

problem_types = [ProblemTypes.TIME_SERIES_REGRESSION]
@ParthivNaresh looks great, thanks!
@@ -5,7 +5,7 @@ colorama==0.4.4
 featuretools==0.22.0
 graphviz==0.15
 ipywidgets==7.5.1
-lightgbm==3.1.0
+lightgbm==3.1.1
Ah @ParthivNaresh for future reference, if the dependency update check fails on your PR, the dependency checker bot, which runs every hour, will get a separate PR up with this update. And then once that's merged you can rebase/update your PR and you should be all set.
We keep those things separate so that we can watch the dependency update step pass/fail CI, independent from any other changes like those being made in this PR. This helps us catch breaking changes made by the latest version of our dependency libs.
Fixes #1463