Some confusion about GaitEdge forward #92

Closed
enemy1205 opened this issue Oct 19, 2022 · 9 comments
@enemy1205

During GaitEdge inference, why does it still need silhouettes as input? Can't it get the silhouettes from the output of the segmentation U-Net? Also, I don't understand why it needs the ratios as input. Isn't the end-to-end network supposed to take just an RGB image as input?

def forward(self, inputs):
    ipts, labs, _, _, seqL = inputs

    ratios = ipts[0]  # per-frame aspect ratios saved before the fixed resize
    rgbs = ipts[1]    # RGB frames
    sils = ipts[2]    # silhouettes from the dataset

    n, s, c, h, w = rgbs.size()
    rgbs = rgbs.view(n*s, c, h, w)
    sils = sils.view(n*s, 1, h, w)
    logis = self.Backbone(rgbs)  # segmentation logits, [n*s, 1, h, w]
    logits = torch.sigmoid(logis)
    mask = torch.round(logits).float()
    if self.is_edge:
        # Split the dataset silhouette into an edge band and an eroded interior
        edge_mask, eroded_mask = self.preprocess(sils)

        # Gait Synthesis: trainable probabilities on the edge,
        # fixed dataset silhouette in the interior
        new_logits = edge_mask*logits + eroded_mask*sils

        if self.align:
            cropped_logits = self.gait_align(
                new_logits, sils, ratios)
        else:
            cropped_logits = self.resize(new_logits)
    else:
        if self.align:
            cropped_logits = self.gait_align(
                logits, mask, ratios)
        else:
            cropped_logits = self.resize(logits)
    _, c, H, W = cropped_logits.size()
    cropped_logits = cropped_logits.view(n, s, H, W)
    # Feed the synthesized silhouettes to the parent recognition network
    retval = super(GaitEdge, self).forward(
        [[cropped_logits], labs, None, None, seqL])
    # BCE loss supervises the segmentation branch against dataset silhouettes
    retval['training_feat']['bce'] = {'logits': logits, 'labels': sils}
    retval['visual_summary']['image/roi'] = cropped_logits.view(
        n*s, 1, H, W)

    return retval
@darkliang
Collaborator

darkliang commented Oct 19, 2022

Thanks for your interest.

  1. The silhouette input is used for Gait Synthesis and to supervise the training of the segmentation network, so we need both the silhouette input and the output of the segmentation network.
  2. The RGB image is resized to 128x128 during preprocessing. The ratio is the image's aspect ratio before that resize, and we use it to restore the original aspect ratio during training (see the sketch below).
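
For illustration only, here is a minimal sketch of that ratio bookkeeping. The function name and the bilinear resize are assumptions for the example, not OpenGait's actual preprocessing code:

```python
import torch.nn.functional as F

# Hypothetical preprocessing step: squash a cropped person patch to a
# fixed 128x128 and record the aspect ratio it had beforehand, so an
# alignment module can undo the distortion later.
def preprocess_frame(rgb):        # rgb: [3, h0, w0] cropped person patch
    _, h0, w0 = rgb.shape
    ratio = w0 / h0               # aspect ratio before the resize
    resized = F.interpolate(rgb.unsqueeze(0), size=(128, 128),
                            mode='bilinear', align_corners=False)[0]
    return resized, ratio
```

An alignment step like gait_align can then use the ratio to scale the width back (roughly to round(128 * ratio)) before cropping a fixed-size region, which is why the ratios travel through forward alongside the pixels.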

@enemy1205
Author

Thanks for the reply! Then during detection there is no need to supervise the training of the segmentation network, so is it enough to input only the RGB image and the ratio?

@darkliang
Collaborator

GaitEdge's model does not include a detection process.

@enemy1205
Author

Sorry, I meant the forward function, i.e., extracting the gait features.

@ChaoFan996
Collaborator

The synthetic silhouette is composed of a binary interior (untrainable) and a float edge (trainable), where the former comes from the silhouette dataset during the inference stage, as sketched below.
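
Concretely, that composition is the Gait Synthesis line in the forward code quoted above; a minimal sketch using the names from that code:

```python
def gait_synthesis(logits, sils, edge_mask, eroded_mask):
    # logits: segmentation probabilities (float edge source, trainable)
    # sils:   dataset silhouettes (binary interior source, fixed)
    # masks:  edge band / eroded interior returned by self.preprocess(sils)
    # Even at inference, the interior still comes from the silhouette input.
    return edge_mask * logits + eroded_mask * sils
```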

@enemy1205
Author

But if I want to put it into practical use, I can only get RGB images from the VideoCapture, so I would have to run the segmentation again myself.

@darkliang
Collaborator

If you want to use it for practical purposes, then you need a detection model to get the bbox of the human body, and another trained segmentation model to get the input silhouette.
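
For illustration only (OpenGait does not ship such a pipeline; the models, class indices, and thresholds below are assumptions for the sketch), a practical front end could look roughly like this:

```python
import torch
from torchvision import transforms
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.segmentation import deeplabv3_resnet50

detector = fasterrcnn_resnet50_fpn(weights='DEFAULT').eval()
segmenter = deeplabv3_resnet50(weights='DEFAULT').eval()
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

@torch.no_grad()
def frame_to_gaitedge_inputs(frame):      # frame: [3, H, W], RGB in [0, 1]
    # 1) detect the person and crop the bounding box
    det = detector([frame])[0]
    keep = (det['labels'] == 1) & (det['scores'] > 0.8)  # COCO class 1 = person
    if not keep.any():
        return None
    x1, y1, x2, y2 = det['boxes'][keep][0].round().int().tolist()
    crop = frame[:, y1:y2, x1:x2]
    ratio = crop.shape[2] / crop.shape[1]  # w / h, before any resize
    # 2) segment the crop to get the silhouette
    out = segmenter(normalize(crop).unsqueeze(0))['out']
    sil = (out.argmax(1, keepdim=True) == 15).float()    # VOC class 15 = person
    # 3) crop + sil + ratio are what GaitEdge's forward expects per frame
    return crop, sil, ratio
```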

@ChaoFan996
Collaborator

The edge is extracted by applying erosion and dilation operations to the silhouette, meaning we need to segment the RGB image first and then feed the result into GaitEdge.
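
For intuition, here is a minimal stand-in for that edge extraction, approximating binary dilation and erosion with max-pooling (the actual OpenGait implementation and kernel size may differ):

```python
import torch.nn.functional as F

def edge_and_interior(sils, k=3):
    # sils: [n, 1, h, w] binary silhouettes with values in {0, 1}
    pad = k // 2
    dilated = F.max_pool2d(sils, k, stride=1, padding=pad)             # dilation
    eroded = 1.0 - F.max_pool2d(1.0 - sils, k, stride=1, padding=pad)  # erosion
    edge_mask = dilated - eroded   # thin band around the contour
    return edge_mask, eroded       # edge (trainable region) and interior
```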

@enemy1205
Author

Thanks for your reply!
