How are we supposed to handle cases when a predicted label (in this case 2) is not present in the observed labels (in this case [1])? Some options are:
1. We limit the confusion matrix (CM) to labels that are present in both the observed labels (label_series) and the predicted labels (predicted_label_series).
2. CM contains labels from the union of observed and predicted labels. This is what sklearn's confusion_matrix does by default when no labels argument is passed.
3. CM contains labels from the observed labels only. If a predicted label is not found in the observed labels, we raise an error saying something like "Unknown label 2".
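To make the difference concrete, here is a minimal sketch of the label set each option would produce. It uses plain Python lists in place of label_series and predicted_label_series, and the helper name `candidate_label_sets` is hypothetical, not part of the library. The contents of the predicted list are also assumed for illustration.

```python
def candidate_label_sets(observed, predicted):
    """Return the label set each option would use for the confusion matrix."""
    observed_set, predicted_set = set(observed), set(predicted)
    return {
        # Option 1: only labels seen in both observed and predicted.
        "intersection": sorted(observed_set & predicted_set),
        # Option 2: every label seen in either observed or predicted.
        "union": sorted(observed_set | predicted_set),
        # Option 3: observed labels only.
        "observed_only": sorted(observed_set),
        # Predicted labels that option 3 would reject as unknown.
        "unknown_predicted": sorted(predicted_set - observed_set),
    }


# Observed labels contain only 1, while the predictions also contain 2.
sets = candidate_label_sets(observed=[1, 1], predicted=[1, 2])
for name, labels in sets.items():
    print(name, labels)
```

In this example options 1 and 3 happen to give the same label set. They diverge when an observed label is never predicted: option 1 would drop it, option 3 would keep it.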
I think we should pick option 3, since it assumes that the observed labels give us a complete list of all possible labels. Option 1 could be problematic because it would drop valid observed labels whenever they are not found in the predicted labels.
If we opt for 3, we should raise an error in this line.
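As a rough illustration of option 3, the sketch below builds the matrix from the observed labels only and raises when a prediction falls outside them. The function name `confusion_matrix_observed_only` and the exact error message are assumptions for illustration. This is not the library's current code.

```python
from collections import Counter


def confusion_matrix_observed_only(observed, predicted):
    """Build a confusion matrix whose labels come from the observed labels only.

    Raises ValueError if a predicted label never occurs in the observed labels.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    labels = sorted(set(observed))
    unknown = sorted(set(predicted) - set(labels))
    if unknown:
        # Hypothetical message, modeled on the "Unknown label 2" suggestion.
        raise ValueError(f"Unknown label(s) in predictions: {unknown}")

    counts = Counter(zip(observed, predicted))
    # matrix[i][j] = rows with observed label labels[i] and predicted label labels[j]
    matrix = [[counts[(obs, pred)] for pred in labels] for obs in labels]
    return labels, matrix


labels, matrix = confusion_matrix_observed_only([0, 1, 1], [0, 1, 0])

try:
    confusion_matrix_observed_only([1], [2])
except ValueError as err:
    error_message = str(err)
```

Whether this check belongs in the analyzer or in the library is left open. The sketch only shows what the validation itself could look like.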
We need to decide whether to handle this on the analyzer side or in the library.
@bilalaws do we have data on how often customers will hit this case, i.e. a predicted label (in this case 2) that is not present in the observed labels (in this case [1])?
Feedback from Bilal from a PR review: #136 (comment)