segmentation probability distribution #3

Closed
ZhengdiYu opened this issue Apr 25, 2022 · 3 comments

Comments

@ZhengdiYu

ZhengdiYu commented Apr 25, 2022

[two attached images, not preserved]

Hi,

Q1. I was trying to understand this part and Fig. 6c of your paper. What is the segmentation probability distribution? Do you mean the (B, class_num, H, W) tensor before argmax reduces it to (B, 1, H, W)?

You mentioned that you trained two different pipelines using Fig. 6c, but I couldn't find the correspondence in Fig. 6c. Do you mean that you replace Segm. (S) with (B, class_num, H, W) and with (B, 1, H, W) separately?

Q2. Which classes do the part-segmentation labels correspond to? Do they follow the original MANO joint order without fingertips, or InterHand2.6M's order (i.e., is the wrist 1 or 16)?

Q3.

    import torch

    def swap_lr_labels_segm_target_channels(segm_target):
        """
        Flip left and right label (not the width) of a single segmentation image.
        """
        assert isinstance(segm_target, torch.Tensor)
        assert len(segm_target.shape) == 3
        assert segm_target.min() >= 0
        assert segm_target.max() <= 32
        img_segm = segm_target.clone()
        # Locate right-hand (1..16) and left-hand (17..32) pixels before modifying anything.
        right_idx = ((1 <= img_segm)*(img_segm <= 16)).nonzero(as_tuple=True)
        left_idx = ((17 <= img_segm)*(img_segm <= 32)).nonzero(as_tuple=True)
        # Shift labels by 16 so each hand takes the other hand's label range.
        img_segm[right_idx[0], right_idx[1], right_idx[2]] += 16
        img_segm[left_idx[0], left_idx[1], left_idx[2]] -= 16
        img_segm_swapped = img_segm.clone()
        # Swap channels 1 and 2 (the line asked about below).
        img_segm_swapped[1], img_segm_swapped[2] = img_segm_swapped[2].clone(), img_segm_swapped[1].clone()
        return img_segm_swapped

What is img_segm_swapped[1], img_segm_swapped[2] = img_segm_swapped[2].clone(), img_segm_swapped[1].clone() used for? This operation acts on the channel dimension, and I'm not sure what it is for, since the 3 channels should be identical.

Q4. Why don't you ignore the background label when training the segmentation?

@zc-alexfan
Owner

Q1. Yes, it is basically the logits, which preserve a lot of information for downstream use.

Fig. 6c means that no image features are used for pose estimation. DIGIT (see Fig. 2) uses both image features and segmentation features, whereas Fig. 6c uses only segmentation features.
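To make the two shapes discussed in Q1 concrete, here is a minimal PyTorch sketch. The tensor sizes and the class count of 33 (background plus 16 parts per hand, inferred from the 0..32 range asserted in the quoted function) are assumptions for illustration, and whether DIGIT applies a softmax before passing the logits downstream is not stated in this thread.

```python
import torch

# Hypothetical sizes: batch of 2, 33 classes, 64x64 maps.
B, C, H, W = 2, 33, 64, 64
logits = torch.randn(B, C, H, W)  # stand-in for raw per-class scores from a segmentation head

# Softmax over the class dimension gives a per-pixel probability distribution.
probs = logits.softmax(dim=1)  # (B, C, H, W); values sum to 1 along dim=1

# Argmax collapses that distribution into one hard label per pixel.
labels = probs.argmax(dim=1, keepdim=True)  # (B, 1, H, W), integer class ids
```

The (B, C, H, W) tensor keeps the per-class scores at every pixel, while the (B, 1, H, W) label map keeps only the winning class.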

Q2. The segmentation labels are defined on the MANO faces, so they are not related to the joints. DIGIT predicts the 21 joints from InterHand, not from the MANO skeleton.

Q3. When you flip an image during data augmentation, the left hand becomes the right hand, for example, so you need to swap the segmentation classes between the hands as well. Yes, the 3 channels are the same. I use 3 channels because I store the labels as PNG files, whose lossless compression reduces file size.
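The augmentation described in Q3 can be sketched as follows. The quoted function only swaps labels ("not the width"), so the horizontal flip presumably happens elsewhere in the author's code; combining both steps in one helper, and the helper name `hflip_segm_with_label_swap`, are assumptions made for illustration. The label ranges (0 background, 1..16 one hand, 17..32 the other) come from the quoted function.

```python
import torch

def hflip_segm_with_label_swap(segm):
    """Mirror a single-channel (H, W) label map and swap the two hands' labels."""
    flipped = torch.flip(segm, dims=[-1])  # mirror along the width axis
    # Build both masks before modifying anything, so no pixel is shifted twice.
    is_right = (flipped >= 1) & (flipped <= 16)
    is_left = (flipped >= 17) & (flipped <= 32)
    out = flipped.clone()
    out[is_right] += 16  # 1..16  -> 17..32
    out[is_left] -= 16   # 17..32 -> 1..16
    return out

segm = torch.tensor([[0, 1, 17],
                     [16, 0, 32]])
swapped = hflip_segm_with_label_swap(segm)
# swapped is [[1, 17, 0], [16, 0, 32]]; applying the helper again restores segm.
```

Background pixels (label 0) fall in neither mask, so they only move with the flip.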

Q4. I need to assign every background pixel to some class, so it is easier to just have a class for the background. Further, you want the model to tell whether a pixel belongs to the background or to a hand.
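The two options in Q4 can be compared in a short sketch. It assumes a standard cross-entropy loss and 33 classes with label 0 as background; the thread does not say which loss DIGIT actually uses, so this illustrates the trade-off rather than the author's training code.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_classes = 33  # assumption: 0 = background, 1..32 = hand parts
logits = torch.randn(2, num_classes, 8, 8)          # stand-in for network output
target = torch.randint(0, num_classes, (2, 8, 8))   # stand-in for ground-truth labels
target[0, 0, 0] = 1  # make sure at least one pixel is not background

# Background kept as a regular class: every pixel contributes to the loss,
# so the model is trained to separate hand pixels from background pixels.
loss_with_bg = F.cross_entropy(logits, target)

# Alternative raised in the question: background pixels are skipped entirely,
# so they produce no gradient.
loss_ignore_bg = F.cross_entropy(logits, target, ignore_index=0)
```

With `ignore_index=0` the network receives no supervision on background pixels, which is consistent with the reason given above for keeping a background class.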

@ZhengdiYu
Author

ZhengdiYu commented Apr 28, 2022

Thanks for your reply!

Regarding Q3: yes, I understand that I need to swap the labels. But I don't see why img_segm_swapped[1], img_segm_swapped[2] = img_segm_swapped[2].clone(), img_segm_swapped[1].clone() is needed. This line swaps the order of channels along the first dimension (e.g. of a (3, 256, 256) tensor), which doesn't make sense to me, because every channel along that dimension is the same.

Since the 3 channels are the same, what is the purpose of swapping two identical channels? Sorry if I misunderstood.

@zc-alexfan
Owner

Ah. I see what you mean now.

I would say, try to have an assertion like:

assert img_segm_swapped[1].sum() == img_segm_swapped[2].sum()

If they are identical, you can ignore this line. I think that line is left over from a previous version of my code that needed the flipping. It does not matter here if the operation does nothing. Sorry for the inconvenience.
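A stricter variant of the suggested check is sketched below. Equal sums do not guarantee equal channels, so this version compares element-wise with `torch.equal`; the helper name `channels_identical` and the toy tensors are hypothetical.

```python
import torch

def channels_identical(segm):
    """Return True if every channel of a (C, H, W) tensor holds the same values."""
    return all(torch.equal(segm[0], segm[c]) for c in range(1, segm.shape[0]))

# Toy label map replicated into 3 channels, as in a PNG-stored segmentation.
label_map = torch.randint(0, 33, (4, 4))
segm = label_map.unsqueeze(0).repeat(3, 1, 1)  # (3, 4, 4)
all_same = channels_identical(segm)

# Two channels with the same sum but different contents: a sum check passes,
# while the element-wise check catches the difference.
a = torch.tensor([[[1, 2]], [[2, 1]]])  # (2, 1, 2)
sums_match = bool(a[0].sum() == a[1].sum())
contents_match = channels_identical(a)
```

If the element-wise check holds for the data, swapping channels 1 and 2 is a no-op and the line can be dropped safely.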
