Question about the classification loss #15

Closed
NeuZhangQiang opened this issue Oct 26, 2020 · 3 comments

@NeuZhangQiang

SEAM is really an excellent piece of work. After reading the paper, I have two questions:

  1. How is the final segmentation mask obtained? In my understanding, SEAM finally outputs a CAM, and then a random walk is used to produce the final mask. Am I right?

  2. How is the classification loss calculated? For example, the final output is
    [image: screenshot of the output formula]
    and we can also calculate the background as:
    [image: screenshot of the background formula]
    but how can we use these two results to calculate the loss? How can we generate the ground truth? Is img(m, n) = c (the true label) the ground truth?

Any suggestions are appreciated!

@YudeWang
Owner

Hi @NeuZhangQiang ,

  1. SEAM+AffinityNet generates pixel-level pseudo labels, and there is a separate retrain step that trains a segmentation model on these pseudo labels in a fully supervised manner.
  2. The classification loss is calculated on the foreground categories only (a sketch follows this list):
    loss_cls1 = F.multilabel_soft_margin_loss(label1[:,1:,:,:], label[:,1:,:,:])

    The background activation is calculated here:
    cam1[:,0,:,:] = 1-torch.max(cam1[:,1:,:,:],dim=1)[0]
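To make the wiring concrete, here is a minimal self-contained sketch of that computation (the shapes and random inputs are illustrative stand-ins, not the exact training code):

import torch
import torch.nn.functional as F

# Illustrative shapes: 20 foreground classes (as in PASCAL VOC) plus background at channel 0.
N, C, H, W = 2, 20, 56, 56
cam1 = torch.randn(N, C + 1, H, W)                    # stand-in for the model's CAM output
label = torch.zeros(N, C + 1, 1, 1)                   # image-level multi-hot ground truth
label[:, 1:, :, :] = torch.randint(0, 2, (N, C, 1, 1)).float()

# Background is not predicted directly; it is derived from the foreground maps.
cam1[:, 0, :, :] = 1 - torch.max(cam1[:, 1:, :, :], dim=1)[0]

# Global average pooling turns the CAM into per-class image-level scores, and the
# loss is applied to foreground channels only, since every image contains background.
label1 = F.adaptive_avg_pool2d(cam1, (1, 1))
loss_cls1 = F.multilabel_soft_margin_loss(label1[:, 1:, :, :], label[:, 1:, :, :])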

@NeuZhangQiang
Author

Dear @YudeWang, what do you mean by "train a segmentation model on these pseudo labels in fully supervised manner"? Do you mean: use the output of CAM or PCM as the input image, and the manually labeled mask as the target, to train a model (such as U-Net)? But how can we obtain the manually labeled masks, since SEAM is designed for weakly supervised segmentation?

In addition, the paper said:
[image: excerpt from the paper]
Does it mean: mask = CAM > threshold?
Actually, the code in infer_SEAM.py is:

bg_score = [np.ones_like(norm_cam[0])*args.out_cam_pred_alpha]
pred = np.argmax(np.concatenate((bg_score, norm_cam)), 0)

It also means: mask = CAM > threshold? (A toy check of this reading follows below.) Thus, I am really puzzled by "train a segmentation model on these pseudo labels in fully supervised manner".
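To check my understanding, the argmax against a constant background score should be equivalent to thresholding the foreground CAMs. A toy example of what I mean (the alpha value is an arbitrary stand-in for args.out_cam_pred_alpha):

import numpy as np

out_cam_pred_alpha = 0.26             # arbitrary stand-in value
norm_cam = np.random.rand(20, 4, 4)   # 20 foreground classes on a toy 4x4 image

# A pixel is assigned background (index 0) only when every foreground score
# falls below the constant alpha score...
bg_score = [np.ones_like(norm_cam[0]) * out_cam_pred_alpha]
pred = np.argmax(np.concatenate((bg_score, norm_cam)), 0)

# ...which is the same as thresholding the max foreground CAM at alpha.
assert np.array_equal(pred > 0, norm_cam.max(0) > out_cam_pred_alpha)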

In addition, in the figure in the paper:
[image: pipeline figure from the paper]

the Cls Loss is calculated using the feature from CAM. However, the code in train_SEAM.py is:

cam1, cam_rv1 = model(img1)
label1 = F.adaptive_avg_pool2d(cam1, (1,1))
...
loss_cls1 = F.multilabel_soft_margin_loss(label1[:,1:,:,:], label[:,1:,:,:])

In this code, the classification loss is calculated using the feature from the PCM (cam_rv1). That also makes me a little confused.

@YudeWang
Owner

@NeuZhangQiang
SEAM+AffinityNet generates pixel-level pseudo labels for each image in the train set. These pseudo labels can be used as targets to train a segmentation model (such as DeepLab) instead of manually labeled segmentation annotations, which are not available in the WSSS setting.
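A minimal sketch of that retrain step, assuming the pseudo labels have already been saved as per-pixel class-index maps (the DeepLab variant, the stand-in batch, and the ignore index are illustrative choices, not the exact retraining code):

import torch
import torch.nn.functional as F
import torchvision

num_classes = 21  # e.g. PASCAL VOC: background + 20 foreground classes
model = torchvision.models.segmentation.deeplabv3_resnet50(num_classes=num_classes)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

# Stand-in batch: in practice, `pseudo_label` would be loaded from the saved
# SEAM+AffinityNet output for each image (an (H, W) map of class indices).
img = torch.randn(2, 3, 321, 321)
pseudo_label = torch.randint(0, num_classes, (2, 321, 321))

# The pseudo labels are consumed exactly like manual annotations would be
# in fully supervised training.
logits = model(img)['out']                  # (N, num_classes, H, W)
loss = F.cross_entropy(logits, pseudo_label, ignore_index=255)
optimizer.zero_grad()
loss.backward()
optimizer.step()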

As for the cls loss, the code you gave shows that the cls loss is calculated from cam1, not cam_rv1...

@YudeWang YudeWang closed this as completed Nov 5, 2020