Add area under the Precision-Recall curve (AUPRC): a useful metric for class imbalance. #806

Closed
gm039 opened this issue Nov 3, 2020 · 6 comments



gm039 commented Nov 3, 2020

The area under the Precision-Recall curve (AUPRC) is a very useful metric for imbalanced classification problems. Here is the reference: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.average_precision_score.html

I tried adding it using the add_metric feature in PyCaret 2.2 as below.

from sklearn.metrics import average_precision_score
add_metric('AUPRC_ID','AUC_PRC',average_precision_score, greater_is_better = True)

However, the resulting scores differ from the score shown on the precision-recall curve from evaluate_model(tuned_model_best) (see the snapshot below).

[screenshot: comparison of the custom AUC_PRC score with the evaluate_model precision-recall curve]

Additionally, the AUC_PRC metric seems to take the Target and Predicted Label as its inputs instead of the Target and Predicted Score.
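
For illustration, here is a small sketch with made-up numbers showing why that would change the result: average_precision_score expects the predicted probability of the positive class, and passing hard labels collapses the precision-recall curve to a single threshold.

from sklearn.metrics import average_precision_score

y_true = [0, 0, 1, 1]
y_score = [0.10, 0.40, 0.35, 0.80]   # predicted probabilities for the positive class
y_label = [0, 0, 0, 1]               # hard labels after thresholding at 0.5

print(average_precision_score(y_true, y_score))  # ~0.83, uses the full score ranking
print(average_precision_score(y_true, y_label))  # 0.75, only one effective threshold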


gm039 commented Nov 4, 2020

@pycaret What are the default input arguments passed to a metric defined through the add_metric feature in PyCaret 2.2?


pycaret commented Nov 4, 2020

@gm039 You can actually access the metrics PyCaret is using by calling the get_metrics function.
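
For example (a rough sketch, assuming setup() has already been run in your session; the exact column names are from memory):

from pycaret.classification import get_metrics

metrics = get_metrics()   # DataFrame of all registered metrics
print(metrics)            # should include a 'Target' column indicating whether a metric
                          # receives hard predictions ('pred') or probability scores ('pred_proba')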


gm039 commented Nov 4, 2020

@pycaret Thanks! Below are the metrics used.

[screenshot: output of get_metrics() listing the metrics in use]

The inputs to the metric should be the Target and Predicted Score, but it is taking the Target and Predicted Label, as verified in the previous snapshot.

Unfortunately, the average precision calculated in both cases differs from the score obtained from the evaluate_model(tuned_model_best) precision-recall curve (see the PR curve in the previous snapshot).


pycaret commented Nov 4, 2020

@gm039 You can pass target = 'pred_proba' inside the add_metric call and it will work just fine :) See below:

from pycaret.datasets import get_data
data = get_data('juice')

from pycaret.classification import *
s = setup(data, target = 'Purchase', session_id = 123, silent = True)

# register average precision as a custom metric; target = 'pred_proba'
# tells PyCaret to feed it probability scores instead of hard labels
from sklearn.metrics import average_precision_score
add_metric('apc', 'APC', average_precision_score, target = 'pred_proba')

lr = create_model('lr')

predict_model(lr);

plot_model(lr, plot = 'pr')

[screenshots: predict_model results and the precision-recall curve from plot_model]

Hope this helps!


pycaret commented Nov 4, 2020

@gm039 Please close the issue if this answers the question.


gm039 commented Nov 4, 2020

@pycaret Thank you so much. It worked.

@gm039 gm039 closed this as completed Nov 4, 2020