About the train/val splits for SUN RGB-D dataset #49

Harvey-Mei · 2022-05-26T04:29:06Z

Hello,
Thanks for your excellent work!
I noticed that you have converted the SUN RGB-D annotations to COCO format. Could you please tell me how you processed the data and what the splits are based on?
I have generated the visualizations for the val part, but I cannot find the samples shown in the Total3D paper. Is it because you divided the dataset differently?

Best,
Harvey

filaPro · 2022-05-26T06:53:33Z

Hi @Harvey-Mei ,

For SUN RGB-D we support 3 benchmarks: sunrgbd, perspective_sunrgbd, total_sunrgbd. As for the total_sunrgbd preprocessing, I think we mainly followed the official Total3D code and just saved the results to .json.

Harvey-Mei · 2022-05-26T13:01:39Z

Hi @filaPro
Thanks for your quick reply. I'll check my data again.

Harvey-Mei · 2022-06-16T16:31:06Z

Hi @filaPro ,
Sorry to bother you again.
I compared ImVoxelNet's split of the SUN RGB-D data with Total3D's and found some differences.

Specifically, I used the IM3D code, which divides the dataset in the same way as Total3D. I modified its data processing code to save the image paths, and then compared these image paths with those in the sunrgbd_total3d* files used in ImVoxelNet.

ImVoxelNet train # 4918
ImVoxelNet test # 4781
==============================
Total3D train # 5135
Total3D test # 4934
==============================
Train Not Match # 246
Test Not Match # 175

Below is the Python script I used:

import os 
import pickle
import json        

imvoxel_test_path = '/home/may/nvme/code/imvoxelnet/data/sunrgbd/sunrgbd_total_infos_val.json'
imvoxel_train_path = '/home/may/nvme/code/imvoxelnet/data/sunrgbd/sunrgbd_total_infos_train.json'

total3d_test_splits = '/home/may/nvme/code/Implicit3DUnderstanding/data/sunrgbd/preprocessed/test.json'
total3d_train_splits = '/home/may/nvme/code/Implicit3DUnderstanding/data/sunrgbd/preprocessed/train.json'
total3d_root = '/home/may/nvme/code/Implicit3DUnderstanding'


# collect imvoxelnet info
imvoxelnet_test_imgs = [] 
imvoxelnet_train_imgs = []

with open(imvoxel_test_path, 'r') as f:
    imvoxelnet_test_infos = json.load(f)
with open(imvoxel_train_path, 'r') as f:
    imvoxelnet_train_infos = json.load(f)

for sample in imvoxelnet_test_infos['images']:
    img_path = sample['file_name'].split('SUNRGBD')[-1]
    # Strip the flip-augmentation suffix so a flipped copy maps to its original image
    if 'flip' in img_path:
        img_path = img_path.replace('_flip', '')
    # Deduplicate within the test list
    if img_path not in imvoxelnet_test_imgs:
        imvoxelnet_test_imgs.append(img_path)
for sample in imvoxelnet_train_infos['images']:
    img_path = sample['file_name'].split('SUNRGBD')[-1]
    if 'flip' in img_path:
        img_path = img_path.replace('_flip', '')
    if img_path not in imvoxelnet_train_imgs:
        imvoxelnet_train_imgs.append(img_path)

print("ImVoxelNet train #", len(imvoxelnet_train_imgs))
print("ImVoxelNet test #", len(imvoxelnet_test_imgs))
print("="*30)

# collect total3d info
with open(total3d_test_splits, 'r') as f:
    total3d_test_list = json.load(f)
with open(total3d_train_splits, 'r') as f:
    total3d_train_list = json.load(f)

total3d_test_imgs = []
total3d_train_imgs = []
for sample in total3d_test_list:
    sample_path = os.path.join(total3d_root, sample[2:])
    with open(sample_path, 'rb') as f:
        sample_info = pickle.load(f)
    img_path = sample_info['rgb_path'].split('SUNRGBD')[-1]
    if img_path not in total3d_test_imgs:
        total3d_test_imgs.append(img_path)

for sample in total3d_train_list:
    sample_path = os.path.join(total3d_root, sample[2:])
    with open(sample_path, 'rb') as f:
        sample_info = pickle.load(f)
    if 'flip' in sample_path:
        continue
    img_path = sample_info['rgb_path'].split('SUNRGBD')[-1]
    if img_path not in total3d_train_imgs:
        total3d_train_imgs.append(img_path)
        
print("Total3D train #", len(total3d_train_imgs))
print("Total3D test #", len(total3d_test_imgs))
print("="*30)

# compare difference
train_not_match_count = 0
test_not_match_count = 0
train_not_match_imgs = []
test_not_match_imgs = []
for path in total3d_train_imgs:
    if path not in imvoxelnet_train_imgs:
        train_not_match_count += 1 
        train_not_match_imgs.append(path)
        
for path in total3d_test_imgs:
    if path not in imvoxelnet_test_imgs:
        test_not_match_count += 1 
        test_not_match_imgs.append(path)

print("Train Not Match #", train_not_match_count)
print("Test Not Match #", test_not_match_count)
print(train_not_match_imgs)
print('-'*50)
print(test_not_match_imgs)

I'm not sure whether I'm missing something or whether something is wrong with my configuration. Could you give me some advice?
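
The script above only checks which Total3D images are missing from the ImVoxelNet lists. As a rough sketch, the same comparison could be done with sets, which also reports the opposite direction. The helper names `normalize` and `compare_splits` and the toy paths are made up for illustration; the `SUNRGBD` marker and `_flip` suffix are taken from the script above.

```python
def normalize(path, marker="SUNRGBD", flip_suffix="_flip"):
    """Keep the part of the path after the dataset root and drop the flip suffix."""
    return path.split(marker)[-1].replace(flip_suffix, "")


def compare_splits(paths_a, paths_b):
    """Return images found only in A, only in B, and in both, as sorted lists."""
    set_a = {normalize(p) for p in paths_a}
    set_b = {normalize(p) for p in paths_b}
    return {
        "only_in_a": sorted(set_a - set_b),
        "only_in_b": sorted(set_b - set_a),
        "common": sorted(set_a & set_b),
    }


# Toy paths, invented for illustration only (not real dataset entries).
total3d = ["/data/SUNRGBD/kv1/a/image/1.jpg", "/data/SUNRGBD/kv2/b/image/2.jpg"]
imvoxelnet = ["/x/SUNRGBD/kv1/a/image/1_flip.jpg", "/x/SUNRGBD/kv2/c/image/3.jpg"]

result = compare_splits(total3d, imvoxelnet)
print("Only in Total3D:", result["only_in_a"])
print("Only in ImVoxelNet:", result["only_in_b"])
```

Sets also remove duplicates automatically, so no separate `not in` check is needed before adding a path.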

filaPro · 2022-06-16T17:07:14Z

Unfortunately, I cannot reproduce our preprocessing for total SUN RGB-D. If this difference really exists, it may be a bug on our side. I hope it does not have much effect on the metrics.

Harvey-Mei · 2022-06-17T00:55:40Z

Yes, I think so. From my statistics, only a very small number of samples differ, so I also think it does not have much effect on the metrics.

Thanks anyway!

Harvey-Mei · 2022-06-21T11:28:52Z

Hello @filaPro,
I noticed that many images in SUN RGB-D are stored under different paths but share the same file name. Saving each visualization under its file name therefore overwrites earlier files with the same name. This may explain why the samples shown in the Total3D paper could not be found in the test results.
https://github.com/saic-vul/imvoxelnet/blob/87e1d5c1e9d291461c9be345836b659d98398e04/mmdet3d/datasets/dataset_wrappers.py#L125

I modified this line to out_file_name = info['img_info'][j]['filename'].split('/')[-3] and generated new visualizations.

Although the number of samples is still smaller than in Total3D, the number of generated visualizations now equals the number of test samples.
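
To illustrate the overwriting, here is a small sketch with two made-up paths that share a file name. It assumes a layout of `<sample_dir>/image/<name>.jpg`, and the helper names `basename_key` and `sample_dir_key` are hypothetical, not part of ImVoxelNet.

```python
import os


def basename_key(filename):
    """Output name derived from the file name alone (collides across samples)."""
    return os.path.splitext(os.path.basename(filename))[0]


def sample_dir_key(filename):
    """Output name derived from the third path component from the end."""
    return filename.split('/')[-3]


# Made-up paths for illustration: same file name, different sample directories.
paths = [
    "data/sunrgbd/kv1/sample_A/image/0000001.jpg",
    "data/sunrgbd/kv2/sample_B/image/0000001.jpg",
]

by_basename = {basename_key(p) for p in paths}
by_sample_dir = {sample_dir_key(p) for p in paths}
print("Distinct names from file name:", len(by_basename))
print("Distinct names from sample directory:", len(by_sample_dir))
```

With the file name as the key, both images map to one output file, so the second write replaces the first. The sample directory keeps them apart, provided directory names are themselves unique across the split.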

Harvey-Mei changed the title ~~About the train/val splits of SUN RGB-D datasets~~ About the train/val splits for SUN RGB-D datasets May 26, 2022

Harvey-Mei changed the title ~~About the train/val splits for SUN RGB-D datasets~~ About the train/val splits for SUN RGB-D dataset May 26, 2022

Harvey-Mei closed this as completed Jun 2, 2022

Harvey-Mei reopened this Jun 16, 2022

Harvey-Mei closed this as completed Jun 17, 2022
