
Does this use ScanNetv2 dataset? #2

Closed
uyoung-jeong opened this issue Dec 6, 2021 · 5 comments

Comments

uyoung-jeong commented Dec 6, 2021

I tried to train the model on the ScanNet v2 dataset and found that the data loading code does not match the dataset format.
Here is what I found:

When I extract the rgb and depth files, the rgb files are placed at {root}/scene{xxxx}_{xx}/color/{xxxx}.jpg.
However, your annotation file assumes an additional 'frame' directory: {root}/scene{xxxx}_{xx}/frame/color/{xxxx}.jpg
I extracted the files using the official ScanNet code (python 2.x).

The camera intrinsic file path is also different.
My intrinsic files are stored at {root}/scene{xxxx}_{xx}/intrinsic/intrinsic_{color/depth}.txt,
but in your code the path is {root}/scene{xxxx}_{xx}/frame/intrinsic/scene{xxxx}_{xx}.txt.
Also, my intrinsic file contains a 4x4 matrix, so the line index never exceeds 4, yet your code reads the 9th line of the intrinsic file.
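For reference, a minimal sketch of how the official 4x4 intrinsic file can be read (the helper name load_intrinsics is just illustrative, not from the repo):

```python
import numpy as np

def load_intrinsics(scene_dir):
    # e.g. {root}/scene0000_00/intrinsic/intrinsic_color.txt: a 4x4 matrix,
    # one row per line, as written by the official ScanNet exporter
    K = np.loadtxt(f"{scene_dir}/intrinsic/intrinsic_color.txt")
    fx, fy = K[0, 0], K[1, 1]  # focal lengths
    cx, cy = K[0, 2], K[1, 2]  # principal point
    return fx, fy, cx, cy
```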

After fixing the above path and format problems, I ran into another problem.
At line 89 of data/datasets.py, the mask size does not match the rgb image size:

```
masks = masks.reshape(-1, height, width)
```

My extracted rgb images are 1296x968, while the depth images are 640x480.
The mask size is 307200 (= 640x480), and I don't know why this error happens.

Does your code use an older version (v1?) of ScanNet, or did I miss something during preprocessing?
I have not used the ScanNet dataset before, so I might have made a mistake.
Thanks.

EryiXie (Owner) commented Dec 6, 2021

Hi, first of all, thank you for pointing out these path and naming issues, and sorry that you had to run into them.

Because the official ScanNet code for extracting data from the *.sens files is too slow, I made some modifications at the time, which resulted in the incompatible naming and path conventions. I also wanted to make the path layout the same as PlaneRCNN's.

For the 3rd issue: yes, the dataset we use is ScanNet v2, but the RGB images are resized to 480x640 so that they match the depth maps and the plane annotations given by PlaneRCNN (both 480x640). For now, my suggestion is to write a simple script to resize the RGB images, along the lines of the sketch below.
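A minimal sketch (the glob pattern is illustrative; adjust it to your extraction root):

```python
import glob
from PIL import Image

# Resize every extracted color frame to 640x480 so it matches the
# depth maps and the PlaneRCNN plane annotations.
for jpg in glob.glob("scans/scene*/color/*.jpg"):
    img = Image.open(jpg)
    if img.size != (640, 480):  # PIL reports size as (width, height)
        img.resize((640, 480), Image.BILINEAR).save(jpg)
```

Note that if you resize the RGB images, the color intrinsics have to be scaled by the same factors (fx · 640/1296, fy · 480/968, likewise cx and cy), otherwise the depth alignment will be off.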

I will write a data preprocessing script so that everyone can simply run it after extracting the data with the official ScanNet script. Until it is finished, I will link this issue in the README. Thank you again.

uyoung-jeong (Author) commented Dec 7, 2021

Thanks for the answer. I fixed the problems mentioned above, but I still ran into several more.
To run on RTX 3090 or A6000 GPUs, the PyTorch version must be at least 1.7.0 because of CUDA compatibility; I am currently using PyTorch 1.10.

```
data_loader = torch.utils.data.DataLoader(dataset, args.batch_size,
```

I added a generator argument to this call:

```
generator=torch.Generator(device='cuda')
```

But the code still cannot run on multiple GPUs. I suspect the custom DataParallel cannot solve this, and DistributedDataParallel should be employed instead. For now I am running on a single GPU to avoid errors.
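A guarded version along these lines works for me (the condition is illustrative and would need adjusting for other setups; args and dataset come from the surrounding training script):

```python
import torch

loader_kwargs = dict(batch_size=args.batch_size, shuffle=True)
# Only the newer PyTorch + Ampere GPU combination needs the CUDA generator
# here; passing it on older setups breaks, so make it conditional.
if torch.cuda.is_available():  # adjust this condition to your setup
    loader_kwargs["generator"] = torch.Generator(device="cuda")
data_loader = torch.utils.data.DataLoader(dataset, **loader_kwargs)
```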

```
feature_add_all_level += self.convs_all_levels[i](mask_feat)
```

This line raises an inplace operation error:

```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [8, 128, 120, 160]], which is output 0 of ReluBackward0, is at version 3; expected version 0 instead.
```

The += modifies a tensor that autograd saved for the ReLU backward pass. I solved the problem by replacing the line with an out-of-place version:

```
feature_add_all_level = feature_add_all_level.clone() + self.convs_all_levels[i](mask_feat)
```
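For context, a minimal standalone repro of this class of autograd error (not the project's code):

```python
import torch

x = torch.randn(2, 3, requires_grad=True)
y = torch.relu(x)   # ReluBackward saves its output for the backward pass
y += 1.0            # in-place op bumps y's version counter
y.sum().backward()  # raises: "... modified by an inplace operation"

# Fix: allocate a new tensor instead of mutating the saved one:
# y = y.clone() + 1.0
```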

To train your model from scratch, your code requires the pretrained ResNet weights from YOLACT: resnet101_reducedfc.pth or resnet50-19c8e357.pth.

I ran the evaluation code with your pretrained model PlaneRecNet_101_9_125000.pth.
The evaluation result differs from the numbers in your paper:
```
       |  all  |  .50  |  .55  |  .60  |  .65  |  .70  |  .75  |  .80  |  .85  |  .90  |  .95  |
-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
  box  | 43.93 | 50.27 | 50.04 | 49.79 | 49.41 | 48.92 | 47.77 | 46.40 | 43.75 | 37.83 | 15.15 |
 mask  | 41.58 | 50.32 | 50.26 | 50.21 | 50.10 | 49.92 | 49.47 | 48.33 | 43.77 | 21.23 |  2.22 |
-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+

Depth Metrics:
abs_rel: 0.07233, sq_rel: 0.01839, rmse: 0.16482, log10: 0.03036, a1: 0.95648, a2: 0.99420, a3: 0.99875
ratio: 0.93974
```
My preprocessing procedure is not exactly the same as yours; for example, I used the intrinsics of the rgb camera.

EryiXie (Owner) commented Dec 13, 2021

Hi uyoung-jeong, sorry for the delay.

For the 3rd issue: yes, the ResNet weights are provided by the author of YOLACT, but they are ResNet weights trained on ImageNet, just like the standard ones, not a pretrained YOLACT. So I think it is fine, right? (YOLACT is the first instance segmentation method I read and learned line by line, and in my opinion it is well implemented, so I adapted a lot of their code. Maybe I should reimplement the ResNet + DCNv2 part myself and use a more common version of the pretrained weights...)

For the 4th issue: hmm, then the result looks better than reported in the paper. I believe I uploaded the wrong sample file, or more exactly missed one (there is one for training, one for validation, and one for evaluation). I will try to find the evaluation sample I used in the paper and upload it.

Thank you for mentioning the 1st and 2nd issues. The 1st issue happens on RTX 3090 or A6000 with PyTorch > 1.9 (as far as I know), which I also encountered with the same setup on my server. But adding "generator=torch.Generator(device='cuda')" makes the code unrunnable on my local pc with an older version of PyTorch and an older GPU. I will add a comment on this line and explain it in a later update.

And the 2nd issue is something I did not know about before. Thank you again!

uyoung-jeong (Author) commented

Thanks for the answer. I thought you had not mentioned the pretrained weights for the model, but it seems I did not read the readme.md carefully. Thanks for the kind answer.
The 2nd issue does not appear when the PyTorch version is older than 1.8, so if I stick to an older version, the distributed training and inplace operation problems can be ignored. However, as far as I know, older versions of PyTorch do not support CUDA 11.x.
It is also possible that the current script uses the same validation set for both training-time validation and evaluation. It would be great if you could provide instructions for evaluation.

nku-zhichengzhang commented Aug 9, 2022

> [quotes EryiXie's reply of Dec 13, 2021 above]

Thanks for your hard work and kind response. I ran into the same data preparation issues as uyoung-jeong; your work is significant to the 3d plane community but uses a unique data loader.

Could you kindly release the structure of the dataset, or share the dataset files via a cloud drive?

Thank you for releasing the code and providing detailed descriptions. Btw, your readme is a genuinely informative and detailed one compared to others :)
