Multi-categorical confusion matrix calculation for labels not presented in predicted_labels #138

xiaoyi-cheng · 2023-03-04T01:02:18Z

Feedback from Bilal from a PR review: #136 (comment)

How are we supposed to handle cases when a predicted label (in this case 2) is not present in the observed labels (in this case [1])? Some options are:

1. We limit the confusion matrix (CM) to labels that are present in both the observed labels (label_series) and the predicted labels (predicted_label_series).
2. CM contains labels from the union of observed and predicted labels. This is what sklearn does by default when no explicit labels argument is passed.
3. CM contains labels from observed labels only. If a predicted label is not found in the observed labels, we raise an error saying something like "Unknown label 2".

I think we should pick option 3 since it assumes that observed labels provide us a complete list of all the possible labels. Option 1 could be problematic because it will drop some valid observed labels in case they are not found in predicted labels.
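
To make the difference between the three options concrete, here is a minimal sketch in plain Python. The helper name `candidate_label_sets` and the sample data are hypothetical and not part of the library; it only shows which labels each option would use to index the CM.

```python
def candidate_label_sets(observed, predicted):
    """Return the labels each option would use to index the confusion matrix."""
    observed_set, predicted_set = set(observed), set(predicted)
    return {
        "option_1_intersection": sorted(observed_set & predicted_set),
        "option_2_union": sorted(observed_set | predicted_set),
        "option_3_observed_only": sorted(observed_set),
    }


# Case from the review: observed labels are [1], but the model also predicts 2.
unseen_prediction = candidate_label_sets(observed=[1, 1, 1], predicted=[1, 2, 1])

# Opposite case: label 2 is observed but never predicted.
never_predicted = candidate_label_sets(observed=[0, 1, 2], predicted=[0, 1, 1])

print(unseen_prediction)
print(never_predicted)
```

In the second case the intersection omits the valid observed label 2, which is the concern raised about option 1.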

If we opt for 3, we should raise an error in this line.
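
A rough sketch of what option 3 could look like, assuming plain Python sequences rather than the Series objects used in the library. The function name `confusion_matrix_observed_labels` and the nested-dict return shape are assumptions for illustration only, not the library's actual implementation.

```python
from collections import Counter


def confusion_matrix_observed_labels(observed, predicted):
    """Confusion matrix indexed by observed labels only (option 3).

    Returns a nested dict where matrix[o][p] is the number of rows with
    observed label o and predicted label p.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    labels = sorted(set(observed))
    # Any predicted label outside the observed labels is treated as an error.
    unknown = sorted(set(predicted) - set(labels))
    if unknown:
        raise ValueError(f"Unknown label(s) in predicted labels: {unknown}")

    counts = Counter(zip(observed, predicted))  # missing pairs count as 0
    return {o: {p: counts[(o, p)] for p in labels} for o in labels}


cm = confusion_matrix_observed_labels([0, 1, 1, 0], [0, 1, 0, 0])
print(cm)

try:
    confusion_matrix_observed_labels([1], [2])
except ValueError as err:
    unknown_label_message = str(err)
    print(unknown_label_message)
```

Whether this check belongs in the library or in the analyzer is the open question below; the sketch only shows the validation itself.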

We need to figure out whether to handle this on the analyzer side or in the library.

goswamig · 2023-03-06T21:42:27Z

CC @bilalaws

goswamig · 2023-03-06T22:17:21Z

We need to dive deep into the container and see whether there are better options than the ones mentioned above.

goswamig · 2023-03-06T22:20:26Z

@bilalaws do we have data on how often customers will hit this case, i.e., a predicted label (in this case 2) that is not present in the observed labels (in this case [1])?

xiaoyi-cheng mentioned this issue Mar 4, 2023

add basic stat metric, and model performance report #136

Merged
