Question about Loss #36

Closed · mt-cly opened this issue Nov 16, 2021 · 3 comments

@mt-cly commented Nov 16, 2021

Hi, thanks for your great work.
I have some confusion about the class-balanced cross-entropy loss at https://github.com/scaelles/DEXTR-PyTorch/blob/master/layers/loss.py#L26 . I notice that output_gt_zero makes the loss treat positive and negative predictions differently, and I do not understand why. In my understanding, loss_val should equal log(sigmoid(output)) when label == 1 and log(1 - sigmoid(output)) when label == 0. However, this does not match your code. Could you please explain it or point me to a paper reference?
Thank you.

@scaelles (Owner) commented

Hello,

Thanks for your interest! Since we predict masks after cropping the image using the extreme points, there is an imbalance in the number of foreground and background pixels: there are far more foreground pixels than background ones, which can bias the network toward being overconfident about foreground. To alleviate that problem, we use the class balancing that was originally introduced in [1].

[1]: Holistically-Nested Edge Detection
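
In case it is useful, the weighting from [1] boils down to something like the sketch below. This is illustrative, not the exact code in layers/loss.py (the function name is mine, and F.logsigmoid stands in for the stable expressions used in the repo): the summed loss over each class is weighted by the frequency of the other class, so the rarer class is not drowned out.

```python
import torch
import torch.nn.functional as F

def class_balanced_bce(output, labels):
    # Minimal sketch of HED-style class balancing [1]; not the repo's exact code.
    num_labels_pos = torch.sum(labels)
    num_labels_neg = torch.sum(1.0 - labels)
    num_total = num_labels_pos + num_labels_neg

    # Per-pixel log-likelihoods under the sigmoid model.
    # F.logsigmoid(output)  == log(sigmoid(output))      -> foreground term
    # F.logsigmoid(-output) == log(1 - sigmoid(output))  -> background term
    loss_pos = -torch.sum(labels * F.logsigmoid(output))
    loss_neg = -torch.sum((1.0 - labels) * F.logsigmoid(-output))

    # Weight each class's summed loss by the *other* class's frequency.
    return (num_labels_neg / num_total) * loss_pos + \
           (num_labels_pos / num_total) * loss_neg
```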

@mt-cly (Author) commented Nov 23, 2021

Hi, thanks for your reply.
I read the reference paper. It introduces a balancing weight for each of the two terms in the formula, which corresponds to your num_labels_neg/num_total and num_labels_pos/num_total. But there is still a gap between that and your implementation: I am not clear about the loss_val calculation. The following is copied from your code:

```python
output_gt_zero = torch.ge(output, 0).float()
loss_val = torch.mul(output, (labels - output_gt_zero)) - torch.log(
    1 + torch.exp(output - 2 * torch.mul(output, output_gt_zero)))
```

I wonder why it is not the following:

```python
loss_val = torch.mul(labels, torch.log(1 / (1 + torch.exp(-output)))) + \
           torch.mul(1. - labels, torch.log(1 - 1 / (1 + torch.exp(-output))))
```
Thank you.

@scaelles (Owner) commented

Hello,

The loss that you describe is the theoretical definition (you can find it here in one of our previous projects). However, it can be numerically unstable during training, so the formulation is rearranged to behave better. You can find here a similar derivation from the theoretical form to the practical one (it is not exactly the same, but you should be able to derive ours from there).
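
Concretely, for a label y ∈ {0, 1} and logit x, the theoretical log-likelihood y·log(sigmoid(x)) + (1 − y)·log(1 − sigmoid(x)) simplifies to x·y − log(1 + e^x). Splitting on the sign of x so that only −|x| is ever exponentiated gives exactly the loss_val expression you quoted. A quick sketch to compare the two forms numerically (illustrative function names, not from the repo):

```python
import torch

def stable_logprob(output, labels):
    # Rearranged form from loss.py: torch.exp only ever sees -|output|,
    # so it cannot overflow for large logits.
    output_gt_zero = torch.ge(output, 0).float()
    return torch.mul(output, (labels - output_gt_zero)) - torch.log(
        1 + torch.exp(output - 2 * torch.mul(output, output_gt_zero)))

def naive_logprob(output, labels):
    # Theoretical definition: sigmoid saturates to exactly 0 or 1 in float32
    # for large |output|, so torch.log produces -inf (and 0 * -inf -> nan).
    p = torch.sigmoid(output)
    return labels * torch.log(p) + (1.0 - labels) * torch.log(1.0 - p)

x = torch.tensor([-200.0, -0.5, 0.5, 200.0])
y = torch.tensor([1.0, 0.0, 1.0, 1.0])
print(stable_logprob(x, y))  # tensor([-200.0000, -0.4741, -0.4741, 0.0000])
print(naive_logprob(x, y))   # -inf and nan at the extreme logits
```

Both forms agree wherever the naive one is finite; the loss is then the negative of these log-likelihoods, summed per class.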

Also, note that the actual class balancing happens here.

Let me know if you have any other questions!
