Adding evaluation metrics


We support adding multiple types of evaluation metrics for benchmarking DL or ML models. We additionally support three common metric libraries: TorchMetrics (Detlefsen et al., 2022), Ignite (Fomin et al., 2020), and Scikit-Learn (Pedregosa et al., 2011). Adding a metric is a straightforward procedure. As an example, we added the Jensen-Shannon Divergence (JSD) with the help of the SciPy library (Virtanen et al., 2020). See below for details:

from typing import Callable

import torch
from ignite.metrics import EpochMetric
from scipy.spatial.distance import jensenshannon

def JSD_fn(y_preds: torch.Tensor, y_targets: torch.Tensor):
    # Squared Jensen-Shannon distance between the flattened (absolute-valued) prediction and target tensors
    return jensenshannon(abs(y_preds).flatten(), abs(y_targets).flatten()) ** 2

class JSD(EpochMetric):
    def __init__(self, output_transform: Callable = lambda x: x, check_compute_fn: bool = False) -> None:
        super().__init__(JSD_fn, output_transform=output_transform, check_compute_fn=check_compute_fn)
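Note that SciPy's jensenshannon returns the Jensen-Shannon distance, i.e. the square root of the divergence, which is why the compute function squares the result. As a quick sanity check, the compute function can also be called directly on two tensors; the values below are arbitrary and purely illustrative:

preds = torch.tensor([0.9, 0.1, 0.8, 0.2])
targets = torch.tensor([1.0, 0.0, 1.0, 0.0])
print(JSD_fn(preds, targets))  # non-negative scalar; 0.0 when the (normalised) distributions coincide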

One can then add the metric to be evaluated for a particular model; see below:

from torchmetrics import AUROC, AveragePrecision, CalibrationError, F1Score, PrecisionRecallCurve

# CalibrationCurve, RocCurve, and BinaryFairnessWrapper are custom metrics defined in the
# project's metrics module, alongside the JSD class shown above.

class DLMetrics:
    BINARY_CLASSIFICATION = {
        "AUC": AUROC(task="binary"),
        "PR": AveragePrecision(task="binary"),
        "F1": F1Score(task="binary", num_classes=2),
        "Calibration_Error": CalibrationError(task="binary", n_bins=10),
        "Calibration_Curve": CalibrationCurve,
        "PR_Curve": PrecisionRecallCurve,
        "RO_Curve": RocCurve,
        "JSD": JSD,
        "Binary_Fairness": BinaryFairnessWrapper(num_groups=2, task="demographic_parity", group_name="sex"),
    }