Do you have any idea why oneformer3d is very sensitive to different backbones? #49

Open
yxchng opened this issue Apr 1, 2024 · 11 comments

@yxchng commented Apr 1, 2024

I tried changing the backbone from SpConv to supposedly more powerful new backbones like PTv3 and Swin3D without changing any other parameters, but both give much poorer results on S3DIS. You also seem to be using different backbones for different tasks, which suggests that this framework might be sensitive to the backbone used. Do you have any idea why this is the case?

@oneformer3d-contributor (Collaborator)

One thing: you need to carefully match the point cloud pre-processing between our repo and your new backbone (e.g., PTv3), such as color normalization, voxel size, and elastic transform. These things should be changed along with the backbone.
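
For concreteness, here is a minimal sketch of the kind of conventions that have to agree between the two codebases; the normalization range and voxel size below are hypothetical example values, not this repo's actual settings:

```python
import numpy as np

def normalize_colors(colors):
    # colors: (N, 3) uint8 RGB. Repos differ: some scale to [0, 1],
    # others to [-1, 1] or subtract a dataset mean. The backbone must
    # see the same convention it was (pre-)trained with.
    return colors.astype(np.float32) / 127.5 - 1.0  # example: map to [-1, 1]

def voxelize(coords, voxel_size=0.02):
    # coords: (N, 3) float xyz in meters. Quantize to integer voxel indices.
    # A different voxel_size changes the effective receptive field of a
    # sparse-conv backbone, so it should match the backbone's training setup.
    return np.floor(coords / voxel_size).astype(np.int64)
```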

Please share your results if you get something interesting with these backbones :)

@yxchng (Author) commented Apr 1, 2024

Hmmm, I am not very familiar with the elastic transform. Is this transform different for different backbones? Also, isn't this transform applied only during training? So even if it is slightly different, shouldn't it have little impact on evaluation?

@oneformer3d-contributor (Collaborator)

Not sure, but it is a rather strong augmentation. If the backbone was not trained with it, it can possibly break the pre-trained weights. I recommend completely following the preprocessing and augmentations of the new backbone.
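
For reference, a minimal sketch of the elastic distortion commonly used in 3D segmentation pipelines (the PointGroup/SparseConvNet-style version): a coarse random noise grid is blurred into a smooth displacement field and interpolated at every point. The `granularity` and `magnitude` defaults below are illustrative, not this repo's exact settings:

```python
import numpy as np
import scipy.ndimage
import scipy.interpolate

def elastic_distortion(coords, granularity=0.2, magnitude=0.4):
    # coords: (N, 3) float xyz. Returns coords displaced by a smooth random field.
    blur_kernels = [
        np.ones((3, 1, 1, 1), dtype=np.float32) / 3,
        np.ones((1, 3, 1, 1), dtype=np.float32) / 3,
        np.ones((1, 1, 3, 1), dtype=np.float32) / 3,
    ]
    coords_min = coords.min(0)

    # Coarse noise grid covering the scene, one cell per `granularity` meters
    noise_dim = ((coords - coords_min).max(0) // granularity).astype(int) + 3
    noise = np.random.randn(*noise_dim, 3).astype(np.float32)

    # Smooth the noise with repeated separable box blurs
    for _ in range(2):
        for kernel in blur_kernels:
            noise = scipy.ndimage.convolve(noise, kernel, mode='constant', cval=0)

    # Interpolate the smooth field at each point and displace
    axes = [np.linspace(lo, lo + granularity * (d - 1), d)
            for lo, d in zip(coords_min - granularity, noise_dim)]
    interp = scipy.interpolate.RegularGridInterpolator(
        axes, noise, bounds_error=False, fill_value=0)
    return coords + interp(coords) * magnitude
```

It is typically applied only at training time (often twice, at two different granularities), which is exactly why a backbone pre-trained without it sees a noticeably different input distribution.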

@RayYoh commented Apr 17, 2024

> Not sure, but it is a rather strong augmentation. If the backbone was not trained with it, it can possibly break the pre-trained weights. I recommend completely following the preprocessing and augmentations of the new backbone.

Hi, authors. Actually, I have tried this based on Pointcept, training from scratch.
My result is worse than yours even though I add normals as features (about 1 point lower in AP50 on ScanNet v2, using top-k = 100). But SPFormer gets higher results than claimed in the paper (maybe because I add normals). I am confused: does the elastic transform really influence the results that much?
I also ran your repo for the instance-only results (removing semantic and panoptic segmentation, top 100) and got just about 77.0. I'd like to ask whether the other two tasks help instance segmentation much?
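
For clarity on the "top 100" above: in SPFormer/OneFormer3D-style models, the final instance predictions are the top-k (query, class) pairs ranked by classification score. A minimal sketch of that kind of selection, with hypothetical tensor names and a Mask2Former-style no-object column, not the repo's exact code:

```python
import torch

def select_topk_instances(cls_logits, mask_logits, k=100):
    # cls_logits: (Q, C+1) per-query class logits, last column = no-object
    # mask_logits: (Q, N) per-query point/superpoint mask logits
    scores = cls_logits.softmax(-1)[:, :-1]      # drop the no-object column
    num_classes = scores.shape[1]
    topk = scores.flatten().topk(min(k, scores.numel()))
    query_idx = topk.indices // num_classes      # which query each pick came from
    labels = topk.indices % num_classes          # which class each pick is
    masks = mask_logits[query_idx] > 0           # binarize the selected masks
    return labels, topk.values, masks
```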

@oneformer3d-contributor (Collaborator)

Hi @RayYoh,
Yes, I think the elastic transform gives something like +4 for both OneFormer3D and SPFormer. No, semantic segmentation does not have a positive impact on the instance segmentation metrics. Also, are you starting from a pre-trained model? It is important for achieving good results.

@RayYoh commented Apr 18, 2024

> Hi @RayYoh, Yes, I think the elastic transform gives something like +4 for both OneFormer3D and SPFormer. No, semantic segmentation does not have a positive impact on the instance segmentation metrics. Also, are you starting from a pre-trained model? It is important for achieving good results.

Hi, what does +4 mean? Is it for the AP50 result?
Actually, I didn't use a pre-trained backbone, since it uses a totally different data augmentation pipeline (e.g., coordinate shift, etc.). I just train from scratch like Mask3D, using 600 epochs and OneCycleLR; it works well for SPFormer but not for OneFormer3D on ScanNet v2.
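
For reference, a minimal sketch of that kind of from-scratch schedule with PyTorch's built-in OneCycleLR; the optimizer choice and learning rate are hypothetical examples, not Mask3D's exact recipe (`model` and `train_loader` are assumed to exist):

```python
import torch

# Hypothetical values for illustration only
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)
epochs = 600
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-4, epochs=epochs, steps_per_epoch=len(train_loader))

for epoch in range(epochs):
    for batch in train_loader:
        loss = model(batch)   # assumed to return the training loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()      # OneCycleLR is stepped once per iteration
```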

@oneformer3d-contributor (Collaborator)

I think something like +4 mAP50 (maybe less). If you use a backbone from Pointcept, e.g. PTv3, you can also start from their pre-trained weights. I think it should help a lot.
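
A minimal sketch of what starting from pre-trained backbone weights can look like in PyTorch; the checkpoint path, key layout, and `model.backbone` attribute are assumptions about a typical setup, not Pointcept's exact format:

```python
import torch

ckpt = torch.load('ptv3_pretrained.pth', map_location='cpu')  # hypothetical path
state = ckpt.get('state_dict', ckpt)  # some checkpoints nest weights here

# Keep only tensors whose names and shapes match the new backbone,
# then load non-strictly so heads/decoder stay randomly initialized.
backbone_state = model.backbone.state_dict()
filtered = {k: v for k, v in state.items()
            if k in backbone_state and v.shape == backbone_state[k].shape}
model.backbone.load_state_dict(filtered, strict=False)
print(f'loaded {len(filtered)}/{len(backbone_state)} backbone tensors')
```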

@RayYoh commented Apr 19, 2024

Yeah, thanks for your suggestion. Additionally, I am confused about the results of the checkpoint in the repo. It seems that the instance segmentation result is better than in the original paper, but the semantic and panoptic results are a little bit worse. Why does this happen? In my opinion, better instance segmentation should lead to better semantic and panoptic results.

In addition, I'd like to ask if there are any tricks to get this better instance segmentation result, because in my reproduction I get a result similar to the paper (maybe because of randomness it is a little bit lower than the paper, and 1 point lower than your checkpoint).
[screenshot: reproduced evaluation results]

@oneformer3d-contributor (Collaborator)

Unfortunately, not many tricks: just the one about the loss weight from our README, and multiple training runs...

@RongkunYang

Dear authors, I'm wondering whether pre-training the backbone has an important effect on instance segmentation performance?

@oneformer3d-contributor (Collaborator)

Yes, when running our code without the pre-trained checkpoint, the mAP for instance segmentation is about 4% worse.
