Created new per class accuracy function. #624

Merged 23 commits from deepandas11:per_class_accuracy into rasbt:master on Nov 24, 2019

Conversation

@deepandas11 (Contributor) commented Nov 8, 2019

Features:

  • new pos_label argument (the name differs from positive_label in the original scoring function to avoid ambiguity). It is None by default, in which case the average per-class accuracy is returned
  • uses the existing _accuracy function for computing per-class accuracy

Description

This PR introduces a standalone per-class accuracy function.
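
As a rough illustration of the intended behavior (a minimal sketch only; the helper below and its internals are illustrative, not the final mlxtend implementation):

```python
import numpy as np

def per_class_accuracy(y_target, y_predicted, pos_label=None):
    # Sketch: if pos_label is given, binarize both arrays with respect
    # to that class and compute plain accuracy on the binarized labels.
    # If pos_label is None, average the binarized accuracies over all
    # class labels instead (the "average per-class accuracy").
    y_target = np.asarray(y_target)
    y_predicted = np.asarray(y_predicted)
    if pos_label is not None:
        return np.mean((y_target == pos_label) ==
                       (y_predicted == pos_label))
    return np.mean([np.mean((y_target == c) == (y_predicted == c))
                    for c in np.unique(y_target)])
```

For example, with y_target = [1, 1, 2, 2] and y_predicted = [1, 2, 2, 2], pos_label=1 gives 0.75, while pos_label=None averages the per-class scores.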

Related issues or pull requests

Fixes #617

Pull Request Checklist

  • Added a note about the modification or contribution to the ./docs/sources/CHANGELOG.md file (if applicable)
  • Added appropriate unit test functions in the ./mlxtend/*/tests directories (if applicable)
  • Modified documentation in the corresponding Jupyter Notebook under mlxtend/docs/sources/ (if applicable)
  • Ran PYTHONPATH='.' pytest ./mlxtend -sv and made sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., PYTHONPATH='.' pytest ./mlxtend/classifier/tests/test_stacking_cv_classifier.py -sv)
  • Checked for style issues by running flake8 ./mlxtend

@pep8speaks commented Nov 8, 2019

Hello @deepandas11! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-11-22 19:01:42 UTC

@deepandas11 deepandas11 changed the title Created new per class accuracy function. [WIP] Created new per class accuracy function. Nov 8, 2019
@coveralls commented Nov 8, 2019


Coverage increased (+0.006%) to 92.354% when pulling d814e0d on deepandas11:per_class_accuracy into 9dc73ed on rasbt:master.

@@ -111,10 +120,10 @@ def scoring(y_target, y_predicted, metric='error',
    elif metric == 'error':
        res = _error(targ_tmp, pred_tmp)
    elif metric == 'average per-class accuracy':
@rasbt (Owner) commented:
I think there was a misunderstanding. The per-class accuracy would not be used to replace the average-per-class-accuracy. Maybe have a look at the brief comment in #615

@deepandas11 (Contributor Author) replied Nov 8, 2019:
Yes! This discussion makes it much clearer now. Thanks!

@deepandas11 deepandas11 changed the title [WIP] Created new per class accuracy function. Created new per class accuracy function. Nov 12, 2019
@deepandas11 (Contributor Author) commented Nov 13, 2019

Hello, @rasbt! Please let me know if there are any corrections or recommendations!

pos_label : str or int, None by default.
    The class whose accuracy score is to be reported.
    'binary' has to be set to True.
binary : bool
@rasbt (Owner) commented:
What happens if binary is set to False? Will this be a regular accuracy computation then? I think it looks like that. It would be nice to document that.

I have another suggestion, though. How about

  1. calling this function just "accuracy" and computing the "average" per-class accuracy as an option?

E.g., instead of binary, we could use another parameter called method.

What do you think?
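
A rough sketch of that suggestion (the method parameter, its accepted values, and the helper below are hypothetical, not the merged API):

```python
import numpy as np

def accuracy(y_target, y_predicted, method='standard', pos_label=None):
    # Hypothetical dispatch on `method`:
    #   'standard' -> plain accuracy
    #   'binary'   -> accuracy after binarizing on pos_label
    #   'average'  -> mean of the per-class (binarized) accuracies
    y_target = np.asarray(y_target)
    y_predicted = np.asarray(y_predicted)
    if method == 'standard':
        return np.mean(y_target == y_predicted)
    if method == 'binary':
        return np.mean((y_target == pos_label) ==
                       (y_predicted == pos_label))
    if method == 'average':
        return np.mean([np.mean((y_target == c) == (y_predicted == c))
                        for c in np.unique(y_target)])
    raise ValueError("method must be 'standard', 'binary', or 'average'")
```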

@deepandas11 (Contributor Author) replied:

Yes, that should have been documented. Thanks for pointing that out.

As for the suggestion, I think if you are looking to introduce modularity with respect to the individual metrics in scoring.py, this would be a good first step.
However, it would be interesting to know whether you eventually want to deprecate the current approach of selecting metrics by string and move towards standalone functions like this one.

@rasbt (Owner) replied:

I think it would be good to support both, like scikit-learn does. The 'string' arguments in scoring are useful as a shortcut. However, they don't allow options like pos_label. Inside scoring, the idea would then be to use the per_class_accuracy function to implement the current 'average per-class accuracy'.
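
For example (a sketch only, reusing the hypothetical accuracy() from the earlier comment; the real scoring() in mlxtend/evaluate/scoring.py handles many more metrics):

```python
def scoring(y_target, y_predicted, metric='error'):
    # Sketch: the string shortcuts delegate to the standalone function.
    if metric == 'accuracy':
        return accuracy(y_target, y_predicted, method='standard')
    if metric == 'error':
        return 1.0 - accuracy(y_target, y_predicted, method='standard')
    if metric == 'average per-class accuracy':
        return accuracy(y_target, y_predicted, method='average')
    raise ValueError('unknown metric: %s' % metric)
```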

@deepandas11 (Contributor Author) replied:

So, implementation-wise, I think the API would be less confusing if we had just a single accuracy function that can compute the three types of metrics based on the choice of "method". This, however, means no separate per_class_accuracy function, as was originally intended.

@rasbt (Owner) replied:

I agree. I think it's best if we call it just classification_accuracy? And then inside we'd have the "per class" option.

@rasbt (Owner) added:

I have to correct this: "accuracy_score" may be better because we already have lift_score, and it is generally more consistent with sklearn.

@rasbt (Owner) commented Nov 13, 2019

Sorry, it was a busy week and I just got a chance to look at it. I have a suggestion (mentioned in the comment above). Let me know what you think. Also, it's weird that the unit test for Windows fails. Not sure if this is necessarily related to this PR or a general issue. Will investigate.

Deepan Das added 5 commits November 13, 2019 16:53
@rasbt (Owner) commented Nov 13, 2019

Just did a rebase of your changes on top of the master branch to resolve the CI issues. You would need to do a

git pull origin per_class_accuracy

on your end before making any additional changes.

@deepandas11 (Contributor Author) commented:

Thank you so much for resolving the CI issue.
No worries about the delay.

@deepandas11 (Contributor Author) commented:

So, I have made the changes as discussed.
Another feature we could now add to the accuracy scores is a normalize option like in sklearn (but maybe in a separate PR; I can open an issue if you feel we should add it).
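
For reference, a minimal sketch of sklearn-style normalize semantics (assuming it were added to the hypothetical accuracy() sketched above; this is not part of this PR):

```python
import numpy as np

def accuracy(y_target, y_predicted, normalize=True):
    # sklearn-style semantics: fraction of correct predictions when
    # normalize=True, raw count of correct predictions otherwise.
    correct = np.sum(np.asarray(y_target) == np.asarray(y_predicted))
    return correct / len(y_target) if normalize else int(correct)
```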

@rasbt (Owner) left a review comment:

Thanks for making the changes. It looks good to me overall. I just have a few minor comments, and I think adding normalize already might be a good idea.

  • I can take care of refactoring scoring.py then to use the new accuracy_score there internally, so no worries about that.

  • I will also make sure the documentation builds correctly then.

(Three resolved review comments on mlxtend/evaluate/accuracy.py)
@deepandas11 (Contributor Author) commented Nov 15, 2019

I have made the changes, although I think you have already seen that I made a few changes to scoring.py in aa095d7 to accommodate the new accuracy function.

@rasbt (Owner) commented Nov 15, 2019

Thanks a lot! I am currently a bit swamped but will try to take a look at it on the weekend!

@rasbt (Owner) left a review comment:

Looks pretty good! Here are a few minor issues I found.

(Two resolved review comments on mlxtend/evaluate/accuracy.py)
@rasbt (Owner) commented Nov 22, 2019

Thanks for making those changes. Looks good to me now. I added the documentation; let me know if it looks good to you.

@deepandas11 (Contributor Author) commented:

Looks great.

@rasbt (Owner) commented Nov 24, 2019

Cool thanks! I will merge then!

@rasbt rasbt merged commit 9cb5f6c into rasbt:master Nov 24, 2019
@rasbt rasbt mentioned this pull request Dec 1, 2019
@deepandas11 deepandas11 deleted the per_class_accuracy branch December 1, 2019 18:03
@rasbt rasbt mentioned this pull request Jan 29, 2020
Merging this pull request closed: Standalone average_per_class_accuracy function (#617)
4 participants