
Modality fusion implementation question #53

Open
kjwkch opened this issue Apr 7, 2023 · 8 comments

Comments

@kjwkch

kjwkch commented Apr 7, 2023

Hello. I am studying 2DPASS through the code.
It seems that the modality fusion implementation is in network/arch_2dpass.py, lines 100 to 105.
The paper specifies an element-wise add, but I do not see it in the code, so I am asking.
Below is the code.

# modality fusion

feat_learner = F.relu(self.leaners[idx](pts_feat))
feat_cat = torch.cat([img_feat, feat_learner], 1)
feat_cat = self.fcs1[idx](feat_cat)
feat_weight = torch.sigmoid(self.fcs2[idx](feat_cat))
fuse_feat = F.relu(feat_cat * feat_weight)

I think that [fuse_feat = F.relu(feat_cat * feat_weight) + img_feat] is what implements the formula in the paper.
Am I right?
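To make the question concrete, here is a minimal, self-contained sketch of the fusion step with the residual add appended. The module name `FusionSketch` and the channel sizes are hypothetical (they are not from the repository); the point is only that `F.relu(feat_cat * feat_weight)` is the attention-gated feature and the trailing `+ img_feat` is the element-wise add in question.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionSketch(nn.Module):
    """Hypothetical stand-in for one fusion stage in arch_2dpass.py."""

    def __init__(self, pts_channels=64, img_channels=64):
        super().__init__()
        self.leaner = nn.Linear(pts_channels, img_channels)    # 2D learner
        self.fcs1 = nn.Linear(img_channels * 2, img_channels)  # after concat
        self.fcs2 = nn.Linear(img_channels, img_channels)      # attention weights

    def forward(self, pts_feat, img_feat):
        feat_learner = F.relu(self.leaner(pts_feat))
        feat_cat = torch.cat([img_feat, feat_learner], 1)
        feat_cat = self.fcs1(feat_cat)
        feat_weight = torch.sigmoid(self.fcs2(feat_cat))
        # gated feature plus the residual (element-wise) add of img_feat
        return F.relu(feat_cat * feat_weight) + img_feat

fusion = FusionSketch()
pts = torch.randn(8, 64)   # 8 points, 64-dim point features
img = torch.randn(8, 64)   # matching 64-dim image features
out = fusion(pts, img)
print(out.shape)  # torch.Size([8, 64])
```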

@brahami14

I have the same question.

Thanks in advance.

@LiXiang0021

Have you guys successfully reproduced the model on nuScenes? I ran several experiments, but the performance is far from the reported results. I also tested the provided weights on nuScenes and got results similar to the reported ones, so I'd like to know whether I forgot to set some arguments.

@kjwkch
Author

kjwkch commented Apr 21, 2023

This issue is about the code implementation, not performance. I think the released code is not implemented as described in the paper.

@kjwkch
Author

kjwkch commented Apr 21, 2023

This is what is missing in the code.

# modality fusion

feat_learner = F.relu(self.leaners[idx](pts_feat))
feat_cat = torch.cat([img_feat, feat_learner], 1)
feat_cat = self.fcs1[idx](feat_cat)
feat_weight = torch.sigmoid(self.fcs2[idx](feat_cat))
fuse_feat = F.relu(feat_cat * feat_weight)

I think that [fuse_feat = F.relu(feat_cat * feat_weight) + img_feat] implements the formula in the paper:

feat_learner = F.relu(self.leaners[idx](pts_feat))
feat_cat = torch.cat([img_feat, feat_learner], 1)
feat_cat = self.fcs1[idx](feat_cat)
feat_weight = torch.sigmoid(self.fcs2[idx](feat_cat))
fuse_feat = F.relu(feat_cat * feat_weight) + img_feat

@LiXiang0021

Thanks for your reply; I will look into this further.

@LiXiang0021

I just trained the modified version as you suggested, and the performance did improve a little, by around 2 mIoU. I believe there may be other incorrect or missing pieces in the released code. Thank you again.

@jaywu109

@kjwkch @LiXiang0021
After reviewing the current implementation, I noticed that, besides the fusion modification, the point feature passed through the 2D learner needs to be added back to the original point feature before going through multihead_3d_classifier, in order to match the model architecture outlined in the paper below:
[screenshot: model architecture figure from the paper]

        feat_learner = F.relu(self.leaners[idx](pts_feat)) 
        # feat_learner -> voxel-wise feature after 2D learner

        pts_pred_full = self.multihead_3d_classifier[idx]((pts_feat+feat_learner)) 
        # pts_feat+feat_learner -> voxel-wise Enhanced 3D Features

        # correspondence
        pts_label_full = self.voxelize_labels(data_dict['labels'], data_dict['layer_{}'.format(idx)]['full_coors'])
        pts_pred = self.p2img_mapping(pts_pred_full[coors_inv], point2img_index, batch_idx)

        # modality fusion

        feat_learner = self.p2img_mapping(feat_learner[coors_inv], point2img_index, batch_idx)
        # feat_learner -> point-wise feature after 2D learner and img_mapping

        feat_cat = torch.cat([img_feat, feat_learner], 1)
        feat_cat = self.fcs1[idx](feat_cat)
        feat_weight = torch.sigmoid(self.fcs2[idx](feat_cat))
        fuse_feat = F.relu(feat_cat * feat_weight) + img_feat

Currently, the implementation feeds the point feature directly into multihead_3d_classifier instead of first adding the feature produced by the 2D learner.

pts_feat = data_dict['layer_{}'.format(idx)]['pts_feat']
coors_inv = data_dict['scale_{}'.format(last_scale)]['coors_inv']
# 3D prediction
pts_pred_full = self.multihead_3d_classifier[idx](pts_feat)
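The difference between the two paths can be sketched in isolation. This is not the repository's code: the channel count, class count, and the plain `nn.Linear` stand-ins for the 2D learner and `multihead_3d_classifier[idx]` are all hypothetical; only the `pts_feat + feat_learner` residual is the change under discussion.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

channels, num_classes, num_voxels = 64, 20, 100
leaner = nn.Linear(channels, channels)          # stand-in for self.leaners[idx]
classifier = nn.Linear(channels, num_classes)   # stand-in for multihead_3d_classifier[idx]

pts_feat = torch.randn(num_voxels, channels)
feat_learner = F.relu(leaner(pts_feat))         # voxel-wise feature after 2D learner

# released code: classifier(pts_feat)
# proposed fix:  classifier(pts_feat + feat_learner)  -> "Enhanced 3D Features"
pts_pred_full = classifier(pts_feat + feat_learner)
print(pts_pred_full.shape)  # torch.Size([100, 20])
```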

@yanx27, I would appreciate any suggestions you may have regarding this matter.

@brahami14

@jaywu109 do the new script changes work for you?
