
[Enhance] Support RGB images on ScanNet for multi-view detector #696

Merged 9 commits into open-mmlab:master on Jul 20, 2021

Conversation

@filaPro (Contributor) commented on Jul 3, 2021

This PR aims to support RGB images on ScanNet for a multi-view detector. More discussion in #620.

It is a bit WIP for now, as I also plan to add images to ScanNetDataset. So far I have just adapted SensReader from the original ScanNet repo. Can you please check whether we are OK with it? cc @Tai-Wang

codecov bot commented on Jul 3, 2021

Codecov Report

Merging #696 (e8b6c40) into master (1f79fc7) will increase coverage by 0.02%.
The diff coverage is 37.73%.

❗ Current head e8b6c40 differs from pull request most recent head aea1385. Consider uploading reports for the commit aea1385 to get more accurate results.

@@            Coverage Diff             @@
##           master     #696      +/-   ##
==========================================
+ Coverage   48.93%   48.96%   +0.02%     
==========================================
  Files         208      208              
  Lines       15861    15880      +19     
  Branches     2543     2538       -5     
==========================================
+ Hits         7761     7775      +14     
+ Misses       7606     7603       -3     
- Partials      494      502       +8     
Flag        Coverage Δ
unittests   48.96% <37.73%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
mmdet3d/core/bbox/structures/coord_3d_mode.py 60.99% <0.00%> (-0.88%) ⬇️
mmdet3d/models/voxel_encoders/voxel_encoder.py 54.40% <0.00%> (ø)
mmdet3d/models/detectors/single_stage.py 58.62% <12.50%> (-17.57%) ⬇️
...et3d/models/dense_heads/anchor_free_mono3d_head.py 65.31% <33.33%> (+6.67%) ⬆️
mmdet3d/datasets/scannet_dataset.py 78.51% <42.85%> (-9.34%) ⬇️
mmdet3d/datasets/pipelines/transforms_3d.py 90.85% <50.00%> (-0.51%) ⬇️
mmdet3d/models/dense_heads/fcos_mono3d_head.py 12.10% <66.66%> (+0.13%) ⬆️
mmdet3d/models/detectors/voxelnet.py 33.87% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1f79fc7...aea1385.

Review thread on data/scannet/README.md (outdated, resolved).
@Tai-Wang (Member) left a review

We had better add some basic comments or docstrings to extract_posed_images.py and add the corresponding content/explanations to the ScanNet dataset doc.

@Tai-Wang (Member) commented on Jul 4, 2021

> This PR aims to support RGB images on ScanNet for a multi-view detector. More discussion in #620.
>
> It is a bit WIP for now, as I also plan to add images to ScanNetDataset. So far I have just adapted SensReader from the original ScanNet repo. Can you please check whether we are OK with it? cc @Tai-Wang

It is fine to borrow some official code from ScanNet with explicit acknowledgement/citation crediting them. We just need some basic comments and explanations in the doc to make this new feature clearer.

@filaPro (Contributor, Author) commented on Jul 5, 2021

Hi @Tai-Wang,

I have a question about LoadMultiViewImageFromFiles. It looks like it is currently unused in the project, and in my opinion it should be replaced with another class. The question is how to apply per-image transforms like Resize or Normalize to the output of this class. Doing it directly would lead to classes like MultiViewResize and MultiViewNormalize, which does not look good. I think we should follow the idea of MultiScaleFlipAug3D(..., transforms=[...]) and introduce something like MultiViewPipeline(..., transforms=[..., Resize, Normalize]). Do you agree, or do you see a better solution?
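To make the idea concrete, here is a minimal sketch of such a wrapper. The class name MultiViewPipeline, the n_images argument, and the img_info/depth2img keys are all hypothetical here, modeled on how MultiScaleFlipAug3D composes inner transforms:

```python
import numpy as np
from mmdet.datasets.pipelines import Compose  # standard pipeline composer


class MultiViewPipeline:
    """Hypothetical wrapper that runs per-image transforms
    (e.g. LoadImageFromFile, Resize, Normalize) on each of
    n_images sampled views, instead of introducing dedicated
    MultiViewResize / MultiViewNormalize classes."""

    def __init__(self, transforms, n_images):
        self.transforms = Compose(transforms)
        self.n_images = n_images

    def __call__(self, results):
        imgs, extrinsics = [], []
        # sample a fixed number of views from all views of the scene
        ids = np.random.choice(
            len(results['img_info']), self.n_images, replace=False)
        for i in sorted(ids.tolist()):
            # run the inner single-image pipeline on one view
            _results = dict(
                img_prefix=results['img_prefix'],
                img_info=results['img_info'][i])
            _results = self.transforms(_results)
            imgs.append(_results['img'])
            extrinsics.append(results['depth2img'][i])
        results['img'] = imgs
        results['depth2img'] = extrinsics
        return results
```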

I am also a bit curious about your LoadImageFromFileMono3D. For some reason, all other datasets return calibration matrices directly from the dataset, while NuScenesMonoDataset has a standalone pipeline for it. Can we move this matrix into NuScenesMonoDataset.pre_pipeline?

Also, the new depth2img key for ScanNetDataset can affect the GlobalAlignment transform. In fact, I do not fully understand why this transform was added in the recent release. It was motivated by unaligned points in the segmentation task, but for that purpose we have the separate ScanNetSegData and ScanNetSegDataset. How do they affect ScanNetData and ScanNetDataset? For now, if someone forgets to add GlobalAlignment, it can cause bugs, since it rotates only the points but not gt_bboxes_3d. In my opinion, all the alignment work could be done in ScanNetData and GlobalAlignment could be removed, but maybe I'm completely wrong. So my question is: if we keep GlobalAlignment, should it also rotate the depth2img values for each RGB image, or can depth2img already be returned in its aligned version?
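On the last question, the two options only differ by where a matrix product happens: if GlobalAlignment applies an axis-align matrix A to the points, then projecting the aligned points needs depth2img @ inv(A), since the calibration was done against the unaligned coordinates. A small numpy sketch of this identity (illustrative only, not the actual transform code):

```python
import numpy as np

def aligned_depth2img(depth2img, axis_align_matrix):
    """If points are aligned as p' = A @ p (4x4, homogeneous), the
    projection valid for aligned points is depth2img @ A^-1."""
    return depth2img @ np.linalg.inv(axis_align_matrix)

A = np.eye(4)
A[:3, :3] = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90-degree rotation about z
depth2img = np.random.rand(4, 4)                # stand-in projection matrix
p = np.array([1.0, 2.0, 3.0, 1.0])              # unaligned homogeneous point

# projecting the unaligned point equals projecting the aligned point
# with the composed matrix
assert np.allclose(depth2img @ p, aligned_depth2img(depth2img, A) @ (A @ p))
```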

Sorry if these questions are naive.

@Tai-Wang (Member) commented on Jul 6, 2021

Hi @filaPro,

LoadMultiViewImageFromFiles is a pipeline designed for MVXNet on nuScenes. Because the release of MoCa is blocked, it is not used by any currently supported model. You can basically keep it for potential future use and add your own pipelines. I agree that following the approach of MultiScaleFlipAug3D to apply transformations to multi-view images is a good choice.

As for LoadImageFromFileMono3D, when I first designed this pipeline it included some other processing, and in the end only the cam_intrinsic information was preserved. I agree that moving it into dataset functions like get_ann_infos or parse_ann_infos would be a good idea, to make it consistent with the other datasets and unify the pipelines. I will consider refactoring it before the next release.
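For reference, a sketch of the direction this refactor could take (the class and key names below are assumptions, not the planned implementation): the dataset attaches the calibration before the pipeline runs, so the loading transform no longer has to inject it.

```python
class MonoDatasetSketch:
    """Sketch: expose cam_intrinsic from the dataset itself, the way
    other datasets return calibration matrices, instead of adding it
    inside LoadImageFromFileMono3D."""

    def pre_pipeline(self, results):
        # copy the calibration from the per-image annotation into the
        # results dict consumed by downstream transforms
        results['cam_intrinsic'] = results['img_info']['cam_intrinsic']
        return results
```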

@filaPro (Contributor, Author) commented on Jul 12, 2021

Hi @Tai-Wang,
I think this is enough for this PR; MultiViewPipeline can be committed in a separate one. I've also visualized a couple of boxes on RGB images, and they look fine.

@Tai-Wang (Member) left a review

Almost ready to be merged

@Wuziyi616 (Contributor) left a review

@filaPro Good job! From the ImVoxelNet paper, it seems to train on the training set and report results on the val set, so all the GT boxes it uses are axis-aligned and the test set is not involved, right? If that's true, this PR is done.

I just want to make sure of this point because we perform point cloud alignment in the data loading pipeline, and GT boxes are provided in both aligned and unaligned coordinates.

@filaPro (Contributor, Author) commented on Jul 14, 2021

Sure, in the ImVoxelNet paper only train and val scenes are used.

Btw, I don't quite understand how the train/val/test splits are connected with point cloud alignment. Can't we perform this alignment at test time if necessary?

@Wuziyi616 (Contributor) commented

> Sure, in the ImVoxelNet paper only train and val scenes are used.
>
> Btw, I don't quite understand how the train/val/test splits are connected with point cloud alignment. Can't we perform this alignment at test time if necessary?

Good question. This is mainly because the axis-align matrix required to perform alignment is considered part of the annotation by the ScanNet team. As a result, test scans don't have this matrix and can't be aligned. This doc provides more details about the design of our ScanNet pre-processing scripts (which, I believe, is a bit complicated and confusing at first glance).
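Concretely, the axis-align matrix is a 4x4 transform stored with each train/val scene's annotations, and alignment is just a homogeneous-coordinate multiply. A hedged sketch (array names are illustrative):

```python
import numpy as np

def align_points(points, axis_align_matrix):
    """Apply ScanNet's 4x4 axis-align matrix to an (N, 3) point cloud.
    Test scans ship without this matrix, so they cannot be aligned."""
    ones = np.ones((points.shape[0], 1), dtype=points.dtype)
    hom = np.hstack([points, ones])      # (N, 4) homogeneous coordinates
    return (hom @ axis_align_matrix.T)[:, :3]

points = np.random.rand(100, 3)
axis_align_matrix = np.eye(4)            # identity as a stand-in
assert align_points(points, axis_align_matrix).shape == (100, 3)
```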

BTW, this reminds me that it would be great if you could add this PR's changes to the compatibility doc, so that users know they may need to re-generate ScanNet data for your ImVoxelNet models (and maybe more multi-view methods in the future :)

@filaPro (Contributor, Author) commented on Jul 14, 2021

Can I add an MMDetection3D 0.16.0 section to this compatibility.md?

@Wuziyi616 (Contributor) commented on Jul 14, 2021

> Can I add an MMDetection3D 0.16.0 section to this compatibility.md?

Yes, sure! Just follow the pattern :)

Edit: I think you can refer to this part, which is very similar to your changes here.

Review thread on docs/compatibility.md (outdated, resolved).
@Wuziyi616 (Contributor) commented

Ready to merge after fixing that typo.

@ZwwWayne merged commit da4c3af into open-mmlab:master on Jul 20, 2021.