Logits vs Log-softmax scores in LearnedMixin implementation #1

Open
erobic opened this issue May 22, 2020 · 1 comment
erobic commented May 22, 2020

Hi,

I had a question regarding the PyTorch implementation of LearnedMixin.

class LearnedMixin(ClfDebiasLossFunction):

    def forward(self, hidden, logits, bias, labels):
        logits = logits.float()  # In case we were in fp16 mode
        logits = F.log_softmax(logits, 1)

        # Learned per-example weight for the bias term; softplus keeps it non-negative
        factor = self.bias_lin.forward(hidden)
        factor = factor.float()
        factor = F.softplus(factor)

        bias = bias * factor

        # Entropy penalty computed on the scaled bias scores
        bias_lp = F.log_softmax(bias, 1)
        entropy = -(torch.exp(bias_lp) * bias_lp).sum(1).mean(0)

        loss = F.cross_entropy(logits + bias, labels) + self.penalty * entropy
        return loss

The forward function adds the logits and bias variables together, but logits has already been log-softmaxed while bias has not (bias appears to be the raw logits from the bias-only model). Should we really apply log-softmax to logits before passing the sum into the cross_entropy loss? Could you explain the reasoning behind this?
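To make the concern concrete, here is a minimal sketch of the two combinations I have in mind (my own illustration with made-up shapes, not code from this repo; factor stands in for softplus(self.bias_lin(hidden))):

    import torch
    import torch.nn.functional as F

    # Hypothetical example: batch of 4 examples, 3 classes
    logits = torch.randn(4, 3)       # main model's raw scores
    bias_logits = torch.randn(4, 3)  # bias-only model's raw scores
    labels = torch.randint(0, 3, (4,))
    factor = torch.ones(4, 1)        # stand-in for the learned softplus weight

    # Variant A (as in the snippet above): log-softmax the main logits,
    # then add the bias scores without normalizing them
    loss_a = F.cross_entropy(F.log_softmax(logits, 1) + factor * bias_logits, labels)

    # Variant B (what I would have expected if bias were meant to be
    # log-probabilities): log-softmax both terms before adding
    loss_b = F.cross_entropy(F.log_softmax(logits, 1) + factor * F.log_softmax(bias_logits, 1), labels)

If bias is actually supposed to arrive as log-probabilities from the bias-only model, then the two terms would already be on the same scale, but that isn't obvious from the snippet.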


ddemszky commented Oct 1, 2020

Following up, as I have the same question. :) Thanks!
