<a href="https://colab.research.google.com/github/mingmcs/pyhealth/blob/week7/Tutorial_5_pyhealth_metrics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **Preparation**
- Install the alpha version of PyHealth.

In [None]:
!pip install pyhealth


### **Instructions for [pyhealth.metrics](https://pyhealth.readthedocs.io/en/latest/api/metrics.html)**
- **[README]**: This module provides metrics for evaluating three kinds of classification tasks:
  - [multiclass classification](https://pyhealth.readthedocs.io/en/latest/api/metrics/pyhealth.metrics.multiclass.html)
  - [multilabel classification](https://pyhealth.readthedocs.io/en/latest/api/metrics/pyhealth.metrics.multilabel.html)
  - [binary classification](https://pyhealth.readthedocs.io/en/latest/api/metrics/pyhealth.metrics.binary.html)

### **1. binary classification metrics**
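- The user passes the ground-truth labels `y_true` (each 0 or 1) and the predicted probabilities of the positive class `y_prob` (values in [0, 1], not raw logits).
- The user also passes `metrics`, a list of the names of the metrics to compute. Below, we request every metric available for binary classification.
- **Example**: we use `np.random` to generate `y_true` and `y_prob`, so the predictions carry no information about the labels and the scores should sit near chance level.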

In [None]:
import numpy as np

from pyhealth.metrics.binary import binary_metrics_fn

# randomly generated true labels and predicted probability
y_true = np.random.randint(2, size=100000)
y_prob = np.random.random(size=100000)

all_metrics = [
    "pr_auc",
    "roc_auc",
    "accuracy",
    "balanced_accuracy",
    "f1",
    "precision",
    "recall",
    "cohen_kappa",
    "jaccard",
]

binary_metrics_fn(y_true, y_prob, metrics=all_metrics)

{'pr_auc': 0.49685758491817855,
 'roc_auc': 0.4977532325321279,
 'accuracy': 0.49824,
 'balanced_accuracy': 0.49824770007219454,
 'f1': 0.4991415452186065,
 'precision': 0.4969983699757484,
 'recall': 0.5013032842763765,
 'cohen_kappa': -0.0035045235518360585,
 'jaccard': 0.3325706988746708}
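
The threshold-based metrics above (accuracy, precision, recall, F1, Jaccard) can all be derived from the four confusion-matrix counts. The following is a minimal NumPy sketch of that derivation, not PyHealth's implementation. It assumes that probabilities are converted to hard labels at a 0.5 threshold, which is an assumption made here for illustration. It also shows why the Jaccard score of random predictions sits near 1/3 rather than 1/2: true positives, false positives, and false negatives each make up about a quarter of the samples, so TP / (TP + FP + FN) is roughly 0.25 / 0.75.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
y_true = rng.integers(2, size=100_000)
y_prob = rng.random(size=100_000)

# Assumption for this sketch: hard labels come from a 0.5 threshold.
y_pred = (y_prob >= 0.5).astype(int)

# Confusion-matrix counts.
tp = int(np.sum((y_true == 1) & (y_pred == 1)))
fp = int(np.sum((y_true == 0) & (y_pred == 1)))
fn = int(np.sum((y_true == 1) & (y_pred == 0)))
tn = int(np.sum((y_true == 0) & (y_pred == 0)))

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
jaccard = tp / (tp + fp + fn)

print({
    "accuracy": accuracy,
    "precision": precision,
    "recall": recall,
    "f1": f1,
    "jaccard": jaccard,
})
```

F1 and Jaccard are computed from the same three counts, so they are tied by F1 = 2J / (1 + J). With J near 1/3 this gives F1 near 1/2, which matches the pattern in the output above.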

### **2. multiclass classification metrics**
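- The user passes the ground-truth labels `y_true` (one integer class index per sample) and the predicted probabilities `y_prob`, an array of shape `(n_samples, n_classes)` whose rows sum to 1 (probabilities, not raw logits).
- The user also passes `metrics`, a list of the names of the metrics to compute. Below, we request every metric available for multiclass classification.
- **Example**: we use `np.random` to generate `y_true` and `y_prob`. With 4 equally likely classes, chance-level accuracy is about 0.25.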

In [None]:
from pyhealth.metrics.multiclass import multiclass_metrics_fn

# randomly generated true labels (class indices 0-3) and predicted probabilities
y_true = np.random.randint(4, size=100000)
y_prob = np.random.randn(100000, 4)
# apply a softmax over the class axis so that each row sums to 1
y_prob = np.exp(y_prob) / np.sum(np.exp(y_prob), axis=-1, keepdims=True)

all_metrics = [
    "roc_auc_macro_ovo",
    "roc_auc_macro_ovr",
    "roc_auc_weighted_ovo",
    "roc_auc_weighted_ovr",
    "accuracy",
    "balanced_accuracy",
    "f1_micro",
    "f1_macro",
    "f1_weighted",
    "jaccard_micro",
    "jaccard_macro",
    "jaccard_weighted",
    "cohen_kappa",
]

multiclass_metrics_fn(y_true, y_prob, metrics=all_metrics)

{'roc_auc_macro_ovo': 0.49909983586193224,
 'roc_auc_macro_ovr': 0.49909924033232794,
 'roc_auc_weighted_ovo': 0.4990969765890098,
 'roc_auc_weighted_ovr': 0.4990915608239235,
 'accuracy': 0.24804,
 'balanced_accuracy': 0.24805420360037064,
 'f1_micro': 0.24804,
 'f1_macro': 0.2480398855528448,
 'f1_weighted': 0.24803190267498276,
 'jaccard_micro': 0.14157857485330716,
 'jaccard_macro': 0.14157996679931306,
 'jaccard_weighted': 0.1415747652957216,
 'cohen_kappa': -0.0026085450086035245}
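
In the output above, `f1_micro` equals `accuracy`, and `jaccard_micro` is close to 0.14 rather than 0.25. Both follow from how micro-averaging pools counts over classes when every sample has exactly one label. The sketch below reproduces this with plain NumPy. It is an illustration rather than PyHealth's code, and it assumes the predicted class is the `argmax` of each probability row.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
n_samples, n_classes = 100_000, 4
y_true = rng.integers(n_classes, size=n_samples)

# Softmax over random scores; subtracting the row maximum first is a
# standard trick to keep np.exp from overflowing on large scores.
scores = rng.standard_normal((n_samples, n_classes))
scores -= scores.max(axis=-1, keepdims=True)
y_prob = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)

# Assumption for this sketch: the predicted class is the most probable one.
y_pred = y_prob.argmax(axis=-1)
accuracy = float(np.mean(y_pred == y_true))

# Micro-averaging pools the per-class counts before computing the score.
tp = sum(int(np.sum((y_true == k) & (y_pred == k))) for k in range(n_classes))
fp = sum(int(np.sum((y_true != k) & (y_pred == k))) for k in range(n_classes))
fn = sum(int(np.sum((y_true == k) & (y_pred != k))) for k in range(n_classes))

f1_micro = 2 * tp / (2 * tp + fp + fn)
jaccard_micro = tp / (tp + fp + fn)

print({"accuracy": accuracy, "f1_micro": f1_micro, "jaccard_micro": jaccard_micro})
```

Every misclassified sample is one false positive (for the predicted class) and one false negative (for the true class), so the pooled FP and FN both equal the number of errors. That makes micro F1 identical to accuracy and micro Jaccard equal to accuracy / (2 - accuracy), which is about 0.14 when accuracy is about 0.25.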

### **3. multilabel classification metrics**
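- The user passes the ground-truth labels `y_true`, a binary indicator array of shape `(n_samples, n_labels)`, and the predicted probabilities `y_prob` of the same shape, with one independent probability per label (probabilities, not raw logits).
- The user also passes `metrics`, a list of the names of the metrics to compute. Below, we request every metric available for multilabel classification.
- **Example**: we use `np.random` to generate `y_true` and `y_prob` for 10,000 samples with 100 labels each.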

In [None]:
from pyhealth.metrics.multilabel import multilabel_metrics_fn

# randomly generated true labels and predicted probability
y_true = np.random.randint(2, size=(10000, 100))
y_prob = np.random.random(size=(10000, 100))

all_metrics = [
    "roc_auc_micro",
    "roc_auc_macro",
    "roc_auc_weighted",
    "roc_auc_samples",
    "pr_auc_micro",
    "pr_auc_macro",
    "pr_auc_weighted",
    "pr_auc_samples",
    "accuracy",
    "f1_micro",
    "f1_macro",
    "f1_weighted",
    "f1_samples",
    "precision_micro",
    "precision_macro",
    "precision_weighted",
    "precision_samples",
    "recall_micro",
    "recall_macro",
    "recall_weighted",
    "recall_samples",
    "jaccard_micro",
    "jaccard_macro",
    "jaccard_weighted",
    "jaccard_samples",
    "hamming_loss",
]

multilabel_metrics_fn(y_true, y_prob, metrics=all_metrics)

{'roc_auc_micro': 0.500419357027864,
 'roc_auc_macro': 0.5004095816773159,
 'roc_auc_weighted': 0.5004094030836111,
 'roc_auc_samples': 0.5004834406276736,
 'pr_auc_micro': 0.500914284467838,
 'pr_auc_macro': 0.5013350312749212,
 'pr_auc_weighted': 0.5013752132346262,
 'pr_auc_samples': 0.5219467025812635,
 'accuracy': 0.0,
 'f1_micro': 0.500349424174954,
 'f1_macro': 0.500327305383711,
 'f1_weighted': 0.5003496745899392,
 'f1_samples': 0.4978024864882309,
 'precision_micro': 0.5006993370804728,
 'precision_macro': 0.500696992387741,
 'precision_weighted': 0.5007392175970358,
 'precision_samples': 0.5006834304175435,
 'recall_micro': 0.5,
 'recall_macro': 0.4999974036846464,
 'recall_weighted': 0.5,
 'recall_samples': 0.4999439209488278,
 'jaccard_micro': 0.3336440049707462,
 'jaccard_macro': 0.3336449121652103,
 'jaccard_weighted': 0.33366481187295227,
 'jaccard_samples': 0.33366199138464897,
 'hamming_loss': 0.499759}
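
The `accuracy` of 0.0 in the output above can look like a bug next to a `hamming_loss` of about 0.5. It is consistent with exact-match (subset) accuracy, where a sample counts as correct only if all of its labels are predicted correctly. With 100 random labels per sample, the chance of a full match is 2^-100 (roughly 8e-31), so no sample matches. The NumPy sketch below illustrates the difference between the two views. It is not PyHealth's implementation, and the 0.5 threshold is an assumption made for this example.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
y_true = rng.integers(2, size=(10_000, 100))
y_prob = rng.random(size=(10_000, 100))

# Assumption for this sketch: hard labels come from a 0.5 threshold.
y_pred = (y_prob >= 0.5).astype(int)

# Exact-match accuracy: a row is correct only if all 100 labels agree.
subset_accuracy = float(np.mean(np.all(y_pred == y_true, axis=1)))

# Hamming loss: fraction of individual label entries that are wrong.
hamming_loss = float(np.mean(y_pred != y_true))

# Per-label accuracy is the complement of the Hamming loss.
per_label_accuracy = 1.0 - hamming_loss

print({
    "subset_accuracy": subset_accuracy,
    "hamming_loss": hamming_loss,
    "per_label_accuracy": per_label_accuracy,
})
```

When a task has many labels, exact-match accuracy is a very strict score, so the Hamming loss and the micro- or macro-averaged metrics are usually more informative.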

If you find this tutorial useful, please star ⭐, fork, or watch the project at https://github.com/sunlabuiuc/PyHealth.

Thanks very much for your support!