I'm trying to extend slicing functions to multi-label classification tasks.
At the moment, Snorkel only supports binary classification.
Here is the computation of slice attention:
    predictor_confidences = torch.cat(
        [
            # Compute the "confidence" using score of the positive class
            F.softmax(output, dim=1)[:, 1].unsqueeze(1)
            for output in predictor_outputs
        ],
        dim=-1,
    )
    attention_weights = F.softmax(
        indicator_preds * predictor_confidences / self.temperature, dim=1
    )
My questions:
1/ Why use the prediction of the positive class as the confidence score?
2/ What should we use in the multi-class/multi-label case?
Hi @phihung — thanks for your interest in slicing and for the great question!
1/ Why use the prediction of the positive class as the confidence score?
We interpret the magnitude of the logit as the "confidence" of the slice predictor: a larger magnitude suggests that the slice predictor is more confident about its learned decision boundary for examples in this slice. We'd like to use this notion of confidence in the attention mechanism, i.e. slice predictors should be downweighted if their predictions are not confident for examples in the slice. For more, see Section 3 in the paper.
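To make the downweighting concrete, here is a minimal, self-contained sketch of the binary attention computation. It follows the snippet from the question for `predictor_confidences`; the way `indicator_preds` is built, the function name `slice_attention`, and the `temperature` argument are assumptions for illustration, not necessarily what the library does internally.

```python
import torch
import torch.nn.functional as F


def slice_attention(indicator_outputs, predictor_outputs, temperature=1.0):
    """Sketch of slice attention for binary tasks (illustrative helper).

    Both arguments are lists with one (batch, 2) logit tensor per slice.
    Returns a (batch, num_slices) tensor of attention weights.
    """
    # Assumption: slice membership is the positive-class probability
    # of each indicator head, shaped (batch, num_slices).
    indicator_preds = torch.cat(
        [F.softmax(out, dim=1)[:, 1].unsqueeze(1) for out in indicator_outputs],
        dim=-1,
    )
    # As in the question: "confidence" is the positive-class probability
    # of each slice predictor, shaped (batch, num_slices).
    predictor_confidences = torch.cat(
        [F.softmax(out, dim=1)[:, 1].unsqueeze(1) for out in predictor_outputs],
        dim=-1,
    )
    # Normalize across slices so each example's weights sum to 1.
    return F.softmax(indicator_preds * predictor_confidences / temperature, dim=1)


# One example that belongs to both slices; only the first predictor is confident.
indicator_outputs = [torch.tensor([[0.0, 4.0]]), torch.tensor([[0.0, 4.0]])]
predictor_outputs = [torch.tensor([[0.0, 3.0]]), torch.tensor([[0.0, 0.0]])]
weights = slice_attention(indicator_outputs, predictor_outputs)
```

In this toy case the first slice predictor receives the larger attention weight, because its positive-class probability is higher while slice membership is the same for both.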
2/ What should we use in the multi-class/multi-label case?
Fundamentally, there's no blocker here, but an implementation may not be trivial because some questions are still open (e.g. "how should we interpret the confidence of a slice classifier across multi-class outputs?"). We plan to address specific implementations going forward!
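As a starting point for experiments, here is a rough sketch of two possible confidence definitions. Both are assumptions rather than anything Snorkel implements: the multi-class version uses the highest class probability, and the multi-label version uses the mean distance of each label probability from 0.5. The function names are hypothetical.

```python
import torch
import torch.nn.functional as F


def multiclass_confidences(predictor_outputs):
    """Hypothetical confidence: highest class probability per slice predictor.

    predictor_outputs: list of (batch, num_classes) logit tensors, one per slice.
    Returns a (batch, num_slices) tensor.
    """
    return torch.cat(
        [
            F.softmax(out, dim=1).max(dim=1).values.unsqueeze(1)
            for out in predictor_outputs
        ],
        dim=-1,
    )


def multilabel_confidences(predictor_outputs):
    """Hypothetical confidence: mean distance of label probabilities from 0.5.

    predictor_outputs: list of (batch, num_labels) logit tensors, one per slice.
    Returns a (batch, num_slices) tensor with values in [0, 1].
    """
    return torch.cat(
        [
            (2 * (torch.sigmoid(out) - 0.5).abs()).mean(dim=1, keepdim=True)
            for out in predictor_outputs
        ],
        dim=-1,
    )


# A peaked 3-class prediction versus a uniform one.
mc = multiclass_confidences(
    [torch.tensor([[5.0, 0.0, 0.0]]), torch.tensor([[0.0, 0.0, 0.0]])]
)
# Two labels with decisive logits versus two labels at the decision boundary.
ml = multilabel_confidences(
    [torch.tensor([[10.0, -10.0]]), torch.tensor([[0.0, 0.0]])]
)
```

Either tensor could be substituted for `predictor_confidences` in the attention computation from the question, since both keep the `(batch, num_slices)` shape. Whether these are good notions of confidence is exactly the open question mentioned above.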