
How to calculate the loss in the EM setting? #2

Closed
ngthanhtin opened this issue Oct 18, 2022 · 7 comments

Comments

@ngthanhtin

Hi,
Thank you for your great work; it is really interesting, and complicated too.
I still do not understand how you calculate the loss in the EM setting. You said in the paper that the EM case is self-supervised and computes the pixel-wise NLL between the pseudo-target and the joint-likelihood block.
But I found that in your code you still feed the image label into the NLL loss, and here did you take the sum over the two last channels, which outputs a specific vector, and compare it with the image label? If that is what you have done, it contradicts what you said in the paper, right?

The second one is that the attribution maps of p(z,y^|x) and p(y^|x,z) here are similar, so doesn't this violate the equation p(y, z|x) = p(y|x,z)p(z|x)? Can you also clarify why these two maps are so similar?
[Screenshot: Screen Shot 2022-10-18 at 2 47 49 PM]

Hope you can clarify these, they are the things I am thinking about a lot.

Best,
Tin

@ngthanhtin
Author

Hello @coallaoh @junsukchoe @naver-ai, could you please explain this to me? It would be really helpful for my research. Thanks!

@coallaoh
Collaborator

coallaoh commented Oct 21, 2022

The final loss for CALM-EM is at calm/main.py, line 163 (commit 0ebb8a5):

    loss = self.criterion(features, target)

That is, it's in the form NLL(features, target).
From the reference for torch.nn.NLLLoss (https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html), you can see that NLL(features, target) assumes features are already log probabilities and merely computes:

- \sum_i features_{i, target_i} (or the mean, depending on the reduction you set)

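[Editor's note: a tiny sanity check of the gather-and-negate behaviour described above; the tensors here are made up for illustration and are not from the CALM repo.]

```python
import torch
import torch.nn as nn

# nn.NLLLoss assumes its input already holds log-probabilities; it simply
# gathers the entry at the target index for each row and negates it
# (summed here via reduction="sum"; the default reduction is "mean").
log_probs = torch.log_softmax(torch.randn(4, 3), dim=1)  # "features"
target = torch.tensor([0, 2, 1, 2])

nll = nn.NLLLoss(reduction="sum")(log_probs, target)
manual = -log_probs[torch.arange(4), target].sum()  # - \sum_i features_{i, target_i}
assert torch.allclose(nll, manual)
```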
Now, let's take a look at how features are computed for CALM-EM.

Your link at

    inputs = inputs.sum(dim=[2, 3])

is the right location to look into this. There,

    features = inputs = (latent_posterior * joint_likelihood).sum(dim=[2, 3])

In maths notation, this quantity is

features_{ic} = \sum_k p'(z=k | x^i, y=c) log p(y=c, z=k | x^i)

where p' denotes the probability distribution computed from the previous-iteration model f_former, which is not trained (no backpropagation); see the detach() operation on latent_posterior. Please also note that joint_likelihood is already log-ed (is_log=True).

Combining everything together, we have the following expression:

loss = - \sum_i features_{i, target_i}
     = - \sum_i \sum_k p'(z=k | x^i, y=target_i) log p(y=target_i, z=k | x^i)

Note that this may be interpreted as self-supervising the pixel(z)-wise predictions p(y, z|x) with the model's own estimate of the cue location z for the true class y: p'(z|x, y). I understand that "self-supervision" has the connotation of not using any human-supplied annotation, while ours uses the GT class label. We used "self-supervision" here in the sense that no pixel-wise GT is used; it is replaced with a pseudo pixel-wise GT generated from a mere GT class label.
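[Editor's note: a minimal sketch of the loss as explained above. The shapes, the names B/C/H/W, and the random tensors are assumptions for illustration; this is not the actual repo code.]

```python
import torch

B, C, H, W = 2, 5, 7, 7  # assumed: samples, classes, spatial locations

# p'(z=k | x_i, y=c): normalised over locations k for each (sample, class).
latent_posterior = torch.softmax(
    torch.randn(B, C, H, W).view(B, C, -1), dim=-1).view(B, C, H, W)

# log p(y=c, z=k | x_i): a joint over (class, location), already in log space.
joint_likelihood = torch.log_softmax(
    torch.randn(B, C, H, W).view(B, -1), dim=-1).view(B, C, H, W)

# p' comes from the previous-iteration model, so gradients are blocked.
features = (latent_posterior.detach() * joint_likelihood).sum(dim=[2, 3])  # (B, C)

target = torch.tensor([1, 3])                # GT class label for each sample i
loss = torch.nn.NLLLoss()(features, target)  # - mean_i features[i, target_i]
```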

@coallaoh
Collaborator

The maps of p(z,y^|x) and p(y^|x,z) are similar because they are identical up to a linear scaling. Heatmaps are usually drawn with max-normalisation, which outputs the same heatmap for inputs that are identical up to linear scaling.
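[Editor's note: a toy check of the max-normalisation point above; the values are made up, and the arrays are only stand-ins for the two maps.]

```python
import numpy as np

# Two maps that differ only by a positive scale factor render identically
# once each is divided by its own maximum.
joint = np.array([[0.1, 0.4], [0.2, 0.3]])  # stand-in for p(z, y^ | x)
scaled = 2.5 * joint                        # same map up to linear scaling

max_norm = lambda m: m / m.max()
assert np.allclose(max_norm(joint), max_norm(scaled))
```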

@ngthanhtin
Author

Hi @coallaoh, thank you for your quick reply. From my understanding, you said that features_{ic} refers to the input pixel (x) at location i and the class label c.

And in the loss function:

loss = - \sum_i features_{i, target_i}
= - \sum_i \sum_k p'(z=k | x^i, y=target_i) log p(y=target_i, z=k | x^i)

the target_i means the label of the pixel at location i. So my question is: is target_i the same for all locations i, and equal to the class label c?

@coallaoh
Collaborator

No, all the indices are mixed up here.

the target_i means the label of the pixel at location i.

No. target_i means the GT class label for sample i. i is the sample index; k is the location index (so you can write z=k, etc.).

... equal to the class label c, is it right?

Class label c is a free index, not a designated index. One could, for example, set c = target_i, meaning you set the class index c to the GT class label for sample i.
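[Editor's note: a toy illustration of the index roles just described; the shapes and values are assumptions, and the location index k is already summed out inside features.]

```python
import torch

B, C = 3, 4                              # i indexes samples, c indexes classes
features = torch.randn(B, C)             # features[i, c]
target = torch.tensor([2, 0, 3])         # target[i]: GT class label of sample i

# c is a free index; choosing c = target_i picks one entry per sample.
picked = features[torch.arange(B), target]  # shape (B,)
loss = -picked.sum()                        # - \sum_i features[i, target_i]
```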

@ngthanhtin
Author

ngthanhtin commented Oct 21, 2022

Yes, thanks @coallaoh.
This figure made me misunderstand your code and your explanation here (computing the loss between the pseudo-target and the joint distribution). Now I get the picture, thank you.
[Screenshot: Screen Shot 2022-10-21 at 1 16 17 PM]

@coallaoh
Collaborator

Sounds great. Thanks for your interest in our work :)
