Added mean_absolute_percentage_error in metrics fixes #10708 #15007
Conversation
I checked. If I don't put 1 in the denominator, it returns inf when y_true is 0, and then the value returned by mean_absolute_percentage_error is nan. To avoid this, I am using 1, which won't abruptly deviate the error value and also keeps it consistent. @agramfort
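For concreteness, here is one plausible reading of the variant being described; the exact form used in the PR is not shown in this thread, so the helper below is only an illustration:

```python
import numpy as np

def mape_with_one(y_true, y_pred):
    # Illustrative reading of "putting 1 in the denominator":
    # replace zero denominators with 1, so entries where y_true == 0
    # contribute |y_true - y_pred| instead of producing inf/nan.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = np.where(np.abs(y_true) == 0, 1.0, np.abs(y_true))
    return np.mean(np.abs(y_true - y_pred) / denom)
```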
I would not do this. To me it's a feature, not a bug, that you can get NaN / Inf if any y is 0. If this happens in practice, I presume it's a bug in the dataset.
I was not using 1 previously, but when I wrote the test script for the function, one of the test cases was given like below:
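(The original snippet is not preserved in this thread; the values below are a hypothetical stand-in for that kind of test case, not the original ones. Note the zero in y_true.)

```python
import numpy as np

# Hypothetical test input: a zero in y_true makes the plain MAPE
# denominator blow up to inf, and the mean becomes nan.
y_true = np.array([0.0, 1.0, 2.0, 3.0])
y_pred = np.array([1.0, 1.0, 2.0, 2.0])
```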
The above input is completely valid and looks like the output of a classifier. If I don't put 1 in the denominator, it fails and gives me nan; but with the 1 it gives the correct answer, 8.99, which is what I need in the end.
See an alternative contribution of this feature in #10711. I've not yet reviewed where it got stuck.
I have seen the pending PR. Should I raise an error when any y_true is 0? In that case I will have to write a separate test function for mean_absolute_percentage_error, because the tests written for the other error functions won't work for this one. Should I do that, or should I look for a better approach? Awaiting your reply @agramfort @jnothman
@jnothman @agramfort, please look at this one. This is the implementation of the mean_absolute_percentage_error loss in TensorFlow: they clip the denominator to an EPSILON lower bound. I think this is a more robust approach than raising an error, and it works with every test case. If we raise on finding a zero, most standard binary classification tasks will error out; if we clip the denominator to EPSILON, it works in every case.
@amueller, @mohamed-ali, @jnothman, @agramfort
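A sketch of the clipping approach being referenced, modeled on the TensorFlow/Keras-style formulation (the epsilon value and names here are illustrative, not the exact TensorFlow source):

```python
import numpy as np

EPSILON = 1e-7  # illustrative; Keras uses its backend epsilon

def mape_clipped(y_true, y_pred):
    # Clip the denominator to a small positive lower bound instead of
    # raising or returning inf when y_true contains zeros.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = np.clip(np.abs(y_true), EPSILON, None)
    return 100.0 * np.mean(np.abs(y_true - y_pred) / denom)
```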
The above code shows that they clip the denominator to an epsilon lower bound. Advantages of this logic: the current implementation passes all test cases and is as robust as TensorFlow's.
@ashutosh1919 you need to update classes.rst, doc/modules/model_evaluation.rst, and scorer.py. A good way to find these spots is to git grep for a related function, e.g. an existing metric name.
I have added the documentation for MAPE as per your instructions. @agramfort, please take a look at the doc. It passes all tests.
@agramfort, @amueller, @jnothman I don't know how contributions are orchestrated in sklearn, but issue #10708 has already been addressed by my PR #10711. That PR was approved by @jnothman, then remained unmerged for more than a year, and I was recently asked by @amueller to synchronize it with master; now I see this PR being created to address the exact same issue. @agramfort @ashutosh1919 Could you please close the PR I created, or at least let me know there beforehand, so I don't waste more time. Thanks.
Sorry @mohamed-ali. When I raised my PR, I didn't know that an issue for the same feature had been raised before and that a PR for it already existed. Thanks to @jnothman for informing me about your PR and the issue. Seeing no recent update on your PR, I assumed it was inactive; I am extremely sorry for that. I have mentioned you here so that you stay aware of the issue. Handing over to @agramfort: please close one of the two PRs (mine or his).
Added
Yes, we need only one metric. I would go with MAPE to match Wikipedia; MARE would be our invention. We can now make sure that Google finds MAPE in the sklearn doc when looking for mean absolute relative error.
@agramfort to clarify your point: you want to only implement the variant without the multiplication by 100? We would need to make it extra explicit in the documentation (user guide and docstring) that our implementation does not apply the multiplication by 100. I think @rth also agrees with this. My original position expressed in #15007 (comment) was a bit different, but if others (in particular @jnothman, @lesteve and @amueller, who participated in the review of #10711) also agree with this choice, then I am fine with it.
We could invent mean_absolute_proportional_error to have the same acronym??!
> We could invent mean_absolute_proportional_error to have the same acronym??!

I prefer not to be more creative than Wikipedia :)
+1 for mean_absolute_percentage_error without the 100.
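To make the agreed convention concrete (the values below are illustrative): with the ×100 dropped, the metric returns a fraction, so 0.1 means 10%.

```python
import numpy as np

y_true = np.array([100.0, 200.0])
y_pred = np.array([110.0, 180.0])

# Relative errors are 10/100 = 0.1 and 20/200 = 0.1; without the x100
# factor the metric reports their mean as a fraction.
print(np.mean(np.abs((y_true - y_pred) / y_true)))  # 0.1, i.e. 10%
```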
Otherwise this is looking good
It looks too much like a hack and feels unnatural to me. I think I would rather either:
- or just use the name mean_absolute_percentage_error without the x100 operation as Wikipedia does it.
+1
@jnothman, I have updated the code as per your last review. Now the code in this PR and in #16689 for MARE is the same, apart from the difference in names.
One last thing: doc/modules/model_evaluation.rst lists scorers, and neg_mean_absolute_percentage_error is absent there.
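For context, once the scorer is listed it would be used like this (a sketch; the model and data below are illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.rand(20, 2)
y = 3.0 * X[:, 0] + 2.0  # strictly positive targets avoid the y == 0 issue

# Scorers are negated in scikit-learn so that greater is always better,
# hence the "neg_" prefix discussed above.
scores = cross_val_score(LinearRegression(), X, y,
                         scoring="neg_mean_absolute_percentage_error")
print(scores)
```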
I have addressed the last comment. Merging with +3. Thank you!
(scikit-learn#15007)
Co-authored-by: mohamed-ali <m.ali.jamaoui@gmail.com>
Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Joel Nothman <joel.nothman@gmail.com>
Co-authored-by: Roman Yurchak <rth.yurchak@pm.me>
The Wikipedia article mentions the multiplication by 100; not multiplying by 100 (the current implementation) seems unnatural.
I have faced this problem when working with quantitative modelling algorithms: scikit-learn doesn't have a function such as mean_absolute_percentage_error, so every time I want to use it I have to implement it myself. It would be great if it were included in sklearn itself. I have written code for this function and also added tests for it. @agramfort, please review my PR.

This PR fixes #10708 and continues the work done by @mohamed-ali in PR #10711.
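For reference, the kind of one-off helper the description alludes to, i.e. the textbook MAPE per the Wikipedia definition (whether to keep the ×100 factor is debated above):

```python
import numpy as np

def mape(y_true, y_pred):
    # Textbook MAPE: mean absolute relative error, times 100 per Wikipedia.
    # Undefined (inf/nan) whenever y_true contains a zero -- the central
    # issue debated in this thread.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
```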