Motifs, VCTree don't use the mask features? #8

Closed
aa200647963 opened this issue Aug 2, 2022 · 3 comments

@aa200647963

In the config file, `with_visual_mask=False`. So do only the Transformer-based methods use the mask features?

@Jingkang50
Owner

Thank you for the question.
The short answer is yes.
We only use bbox features for two-stage methods.
We did try to use mask features in the following two ways:

  • for example, to get a dog's mask feature, we multiply the original image by the dog's 0-1 mask, run the standard bbox feature extraction process with ROIAlign on the masked image, and concatenate the resulting mask feature map with the bbox feature map;
  • we crop the mask with the ROI and then concatenate the single-channel post-ROIAlign 0-1 mask to the bbox feature map.

We found that both practices brought only negligible improvement, so we did not include them in the end.
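
As a rough illustration of the two variants described above (not the maintainers' actual code), here is a minimal NumPy sketch. Everything in it is an assumption made for the example: `toy_backbone` is a hypothetical identity "feature extractor", `crop_resize` is a nearest-neighbour crop that stands in for ROIAlign (the real operator samples bilinearly), and the image, mask, and box values are made up.

```python
import numpy as np

def crop_resize(fmap, box, out_size=7):
    """Nearest-neighbour crop-and-resize of a (C, H, W) map.

    A crude stand-in for ROIAlign, used only to keep the sketch dependency-free.
    `box` is (x1, y1, x2, y2) in pixels, with x2/y2 exclusive.
    """
    x1, y1, x2, y2 = box
    ys = np.linspace(y1, y2 - 1, out_size).round().astype(int)
    xs = np.linspace(x1, x2 - 1, out_size).round().astype(int)
    return fmap[:, ys][:, :, xs]

def toy_backbone(image):
    """Hypothetical feature extractor: identity, so features equal pixels."""
    return image.astype(np.float32)

rng = np.random.default_rng(0)
image = rng.random((3, 12, 12))              # (C, H, W), made-up data
mask = np.zeros((12, 12), dtype=np.float32)  # 0-1 mask of one object
mask[3:8, 4:9] = 1.0
box = (2, 1, 10, 9)                          # (x1, y1, x2, y2) around the object

# Ordinary bbox feature for the object.
bbox_feat = crop_resize(toy_backbone(image), box)            # (3, 7, 7)

# Variant 1: mask the image first, extract features, concatenate channels.
masked_feat = crop_resize(toy_backbone(image * mask), box)   # (3, 7, 7)
variant1 = np.concatenate([bbox_feat, masked_feat], axis=0)  # (6, 7, 7)

# Variant 2: crop the mask itself and append it as one extra channel.
roi_mask = crop_resize(mask[None], box)                      # (1, 7, 7)
variant2 = np.concatenate([bbox_feat, roi_mask], axis=0)     # (4, 7, 7)

print(variant1.shape, variant2.shape)
```

Variant 1 doubles the channel count of the ROI feature, while variant 2 adds a single channel, so the downstream layers would need their input width adjusted accordingly.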

@Jingkang50 added the "question" (Further information is requested) label Aug 2, 2022
@Jingkang50
Owner

@LilyDaytoy Hi Wenxuan, could you push a branch with mask-aware ROI Align, which we tried before?
The branch is only for participants' reference.

@LilyDaytoy
Collaborator

LilyDaytoy commented Aug 11, 2022

@aa200647963 Hi, to use mask features in the PSG two-stage pipeline, you can check my branch https://github.com/LilyDaytoy/OpenPSG/tree/mask_roi for reference :)

For details of how the masked ROI features are extracted, you can check the roi_forward_with_mask function in openpsg/models/roi_extractors/visual_spatial.py. I basically masked the feature maps with the binary masks of all objects and stacked them together, then reused the RoIAlign module in mmdet to crop the "masked feature maps" with the corresponding ROIs.
The final output of roi_forward_with_mask has 512 channels: 256 channels of bounding box features and 256 channels of mask features, stacked together.
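
The channel layout described above can be sketched in NumPy as follows. This is not the code from the branch: `roi_feats_with_mask` is a hypothetical name, `crop_resize` is a nearest-neighbour stand-in for mmdet's RoIAlign, the masks are assumed to be already resized to the feature-map resolution, and a single feature level is used instead of an FPN pyramid.

```python
import numpy as np

def crop_resize(fmap, box, out_size=7):
    """Nearest-neighbour crop of a (C, H, W) map; a stand-in for RoIAlign."""
    x1, y1, x2, y2 = box
    ys = np.linspace(y1, y2 - 1, out_size).round().astype(int)
    xs = np.linspace(x1, x2 - 1, out_size).round().astype(int)
    return fmap[:, ys][:, :, xs]

def roi_feats_with_mask(feat, masks, boxes, out_size=7):
    """Hypothetical helper illustrating the bbox + mask channel stacking.

    feat:  (C, H, W) feature map
    masks: (N, H, W) binary masks, one per object, at feature resolution
    boxes: N boxes as (x1, y1, x2, y2) in feature-map coordinates
    returns: (N, 2C, out_size, out_size)
    """
    # One masked copy of the feature map per object: (N, C, H, W).
    masked_maps = feat[None] * masks[:, None]
    rois = []
    for masked_map, box in zip(masked_maps, boxes):
        bbox_part = crop_resize(feat, box, out_size)        # C channels
        mask_part = crop_resize(masked_map, box, out_size)  # C channels
        rois.append(np.concatenate([bbox_part, mask_part], axis=0))
    return np.stack(rois)

rng = np.random.default_rng(0)
feat = rng.random((256, 16, 16)).astype(np.float32)  # made-up features
masks = np.zeros((2, 16, 16), dtype=np.float32)
masks[0, 2:7, 3:9] = 1.0
masks[1, 8:14, 5:12] = 1.0
boxes = [(2, 1, 10, 8), (4, 7, 13, 15)]

out = roi_feats_with_mask(feat, masks, boxes)
print(out.shape)  # (2, 512, 7, 7): 256 bbox channels + 256 mask channels
```

Materialising one masked feature map per object costs N times the memory of the original map, which may matter for images with many objects.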
To enable mask features in the pipeline, see configs/motifs/panoptic_fpn_r50_fpn_1x_predcls_psg.py for an example (set require_masked_feats=True, and in_channels=512 in bbox_roi_extractor).
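
As a small sanity check on those two settings, the fragment below writes them as a Python dict in the mmdet config style. Only the two key names and their values come from the comment above; the flattened nesting, the variable names, and where `require_masked_feats` actually lives in the real config are assumptions, so treat the referenced config file as the source of truth.

```python
# Illustrative fragment only: the nesting is assumed, not copied from the branch.
BBOX_CHANNELS = 256  # bounding box feature channels
MASK_CHANNELS = 256  # mask feature channels

overrides = dict(
    require_masked_feats=True,  # assumed placement of this flag
    bbox_roi_extractor=dict(
        # The extractor now receives bbox and mask features stacked together.
        in_channels=BBOX_CHANNELS + MASK_CHANNELS,
    ),
)

print(overrides['bbox_roi_extractor']['in_channels'])  # 512
```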

There are also some smaller modifications that unify mask sizes and types; you can see all the changes here: LilyDaytoy/OpenPSG@Jingkang50:OpenPSG:main...mask_roi

This branch has not been tested yet ;) but I hope it helps.
