[Evaluation] Add metrics for evaluating classification tasks #11

Closed
bcebere opened this issue Feb 6, 2023 · 0 comments · Fixed by #38
Labels
enhancement New feature or request

bcebere commented Feb 6, 2023

Feature Description

One of the library's major tasks is evaluating model quality and the AutoML objectives.

To that end, metrics are needed for every supported problem type.

Classification is one such problem type. The library should offer an API for computing any of these metrics by comparing the predicted values against the ground truth.

Important metrics to cover here:

  • aucroc : the Area Under the Receiver Operating Characteristic Curve (ROC AUC) from prediction scores.
  • aucprc : the area under the precision-recall curve, summarized as average precision: the weighted mean of the precisions achieved at each threshold, with the increase in recall from the previous threshold used as the weight.
  • accuracy : the fraction of predictions that exactly match the ground truth.
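  • f1_score(micro, macro, weighted): the F1 score is the harmonic mean of precision and recall. The "micro" average computes the metric globally by counting the total true positives, false negatives and false positives; "macro" takes the unweighted mean of the per-class scores; "weighted" takes the mean of the per-class scores weighted by each class's support.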
  • kappa: computes Cohen’s kappa, a score that expresses the level of agreement between two annotators on a classification problem.
  • precision(micro, macro, weighted): precision is the number of true positives divided by the number of true positives plus the number of false positives. The "micro" version computes it globally by counting the total true positives and false positives across all classes.
  • recall(micro, macro, weighted): recall is the number of true positives divided by the number of true positives plus the number of false negatives. The "micro" version computes it globally by counting the total true positives and false negatives across all classes.
  • mcc: The Matthews correlation coefficient is used in machine learning as a measure of the quality of binary and multiclass classifications. It takes into account true and false positives and negatives and is generally regarded as a balanced measure which can be used even if the classes are of very different sizes.
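
As a rough illustration of the requested API, the metrics listed above could be computed with scikit-learn along these lines. This is only a sketch: the function name `evaluate_classification`, its signature, and the result keys are hypothetical and are not taken from this library or from the AutoPrognosis reference. It assumes integer labels `0..n_classes-1` and a matrix of predicted class probabilities.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    average_precision_score,
    cohen_kappa_score,
    f1_score,
    matthews_corrcoef,
    precision_score,
    recall_score,
    roc_auc_score,
)


def evaluate_classification(y_true, y_proba):
    """Hypothetical helper: score class probabilities against ground truth.

    y_true:  shape (n_samples,), integer labels in 0..n_classes-1.
    y_proba: shape (n_samples, n_classes), predicted class probabilities.
    """
    y_true = np.asarray(y_true)
    y_proba = np.asarray(y_proba)
    y_pred = y_proba.argmax(axis=1)  # hard labels for threshold-based metrics
    n_classes = y_proba.shape[1]

    if n_classes == 2:
        # Binary case: rank by the probability of the positive class.
        aucroc = roc_auc_score(y_true, y_proba[:, 1])
        aucprc = average_precision_score(y_true, y_proba[:, 1])
    else:
        # Multiclass case: one-vs-rest ROC AUC, macro-averaged average precision
        # over one-hot encoded labels (an assumption of this sketch).
        aucroc = roc_auc_score(y_true, y_proba, multi_class="ovr")
        y_onehot = np.eye(n_classes, dtype=int)[y_true]
        aucprc = average_precision_score(y_onehot, y_proba, average="macro")

    results = {
        "aucroc": aucroc,
        "aucprc": aucprc,
        "accuracy": accuracy_score(y_true, y_pred),
        "kappa": cohen_kappa_score(y_true, y_pred),
        "mcc": matthews_corrcoef(y_true, y_pred),
    }
    # Averaged variants of F1, precision and recall.
    for avg in ("micro", "macro", "weighted"):
        results[f"f1_score_{avg}"] = f1_score(
            y_true, y_pred, average=avg, zero_division=0
        )
        results[f"precision_{avg}"] = precision_score(
            y_true, y_pred, average=avg, zero_division=0
        )
        results[f"recall_{avg}"] = recall_score(
            y_true, y_pred, average=avg, zero_division=0
        )
    return results


if __name__ == "__main__":
    # Toy binary example with made-up probabilities.
    y_true = [0, 0, 1, 1]
    y_proba = [[0.6, 0.4], [0.3, 0.7], [0.4, 0.6], [0.2, 0.8]]
    for name, value in evaluate_classification(y_true, y_proba).items():
        print(f"{name}: {value:.3f}")
```

Returning a plain dictionary keyed by metric name keeps the sketch easy to extend to other problem types; the actual API shape is left to the implementation.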

AutoPrognosis (AP) reference: https://github.com/vanderschaarlab/autoprognosis/blob/main/src/autoprognosis/utils/tester.py

@bcebere bcebere added the enhancement New feature or request label Feb 6, 2023
@DrShushen DrShushen transferred this issue from another repository Mar 3, 2023