Adding evaluation metrics
Robin van de Water edited this page Nov 14, 2023
We support adding multiple types of evaluation metrics for benchmarking DL or ML models. We additionally support three common metric libraries: TorchMetrics (Detlefsen et al., 2022), Ignite (Fomin et al., 2020), and Scikit-learn (Pedregosa et al., 2011). Adding a metric is a straightforward procedure. As an example, we added the Jensen-Shannon divergence (JSD) with the help of the SciPy library (Virtanen et al., 2020). See below for details:
```python
from typing import Callable
import torch
from ignite.metrics import EpochMetric
from scipy.spatial.distance import jensenshannon

def JSD_fn(y_preds: torch.Tensor, y_targets: torch.Tensor):
    # Squared Jensen-Shannon distance = Jensen-Shannon divergence.
    return jensenshannon(abs(y_preds).flatten(), abs(y_targets).flatten()) ** 2

class JSD(EpochMetric):
    def __init__(self, output_transform: Callable = lambda x: x, check_compute_fn: bool = False) -> None:
        super().__init__(JSD_fn, output_transform=output_transform, check_compute_fn=check_compute_fn)
```
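The underlying compute function can be sanity-checked outside the training loop by calling it directly on two tensors; a minimal sketch (the tensor values are illustrative):

```python
import torch
from scipy.spatial.distance import jensenshannon

def JSD_fn(y_preds: torch.Tensor, y_targets: torch.Tensor):
    # Same computation as the metric above: squared Jensen-Shannon distance,
    # i.e. the Jensen-Shannon divergence of the flattened distributions.
    return jensenshannon(abs(y_preds).flatten(), abs(y_targets).flatten()) ** 2

preds = torch.tensor([0.1, 0.4, 0.5])
targets = torch.tensor([0.1, 0.4, 0.5])
print(JSD_fn(preds, targets))  # identical distributions -> 0.0
```

Identical distributions yield a divergence of 0, while disjoint distributions yield the maximum of ln 2 (SciPy uses the natural logarithm by default).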
One can then add the metric to the set of metrics evaluated for a particular model, see below:
```python
# AUROC, AveragePrecision, F1Score, CalibrationError, and PrecisionRecallCurve
# come from TorchMetrics; the remaining entries are classes/wrappers defined in
# the codebase.
class DLMetrics:
    BINARY_CLASSIFICATION = {
        "AUC": AUROC(task="binary"),
        "PR": AveragePrecision(task="binary"),
        "F1": F1Score(task="binary", num_classes=2),
        "Calibration_Error": CalibrationError(task="binary", n_bins=10),
        "Calibration_Curve": CalibrationCurve,
        "PR_Curve": PrecisionRecallCurve,
        "RO_Curve": RocCurve,
        "JSD": JSD,
        "Binary_Fairness": BinaryFairnessWrapper(num_groups=2, task="demographic_parity", group_name="sex"),
    }
```
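Note that some entries in the dictionary are metric instances (e.g. `AUROC(task="binary")`) while others are bare classes (e.g. `JSD`), which are instantiated at evaluation time. A minimal sketch of that consumption pattern, assuming a hypothetical `materialize` helper (not part of the codebase):

```python
# Hypothetical helper: instantiate class-valued entries, pass instances through.
def materialize(metrics: dict) -> dict:
    return {name: (m() if isinstance(m, type) else m) for name, m in metrics.items()}

# Illustrative stand-ins for a metric class and a metric instance.
class DummyMetric:
    pass

metrics = materialize({"as_class": DummyMetric, "as_instance": DummyMetric()})
print(all(isinstance(m, DummyMetric) for m in metrics.values()))  # True
```

Deferring instantiation this way lets a metric class be configured (or skipped) per run without constructing it eagerly at import time.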