CaDDN Detector #538
Conversation
Tested PointPillar inference to ensure no changes, with the following command on a Titan XP: [command elided]
Performance: Master vs. feature/CaDDN [comparison elided]
Results: Master vs. feature/CaDDN [comparison elided]
Thank you for the contribution, great work!
Welcome the first monocular 3D detection work, CaDDN, to OpenPCDet!
Please check the comments and see how we could further improve it to be more elegant.
Thank you!
import numpy as np

def random_flip_horizontal(image, depth_map, gt_boxes, calib):
How about moving this to augmentor_utils.py with the function name random_image_flip_horizontal?
Done
if "calib_matricies" in self.dataset_cfg.GET_ITEM_LIST:
    input_dict["trans_lidar_to_cam"], input_dict["trans_cam_to_img"] = kitti_utils.calib_to_matricies(calib)
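For context, calib_to_matricies might be sketched as below, assuming a KITTI-style calib object exposing V2C (3x4 lidar-to-camera), R0 (3x3 rectification) and P2 (3x4 projection) attributes; the actual kitti_utils implementation may differ in detail:

```python
import numpy as np

def calib_to_matricies(calib):
    # Sketch: build a 4x4 lidar->rectified-camera transform and keep the
    # 3x4 camera->image projection. Attribute names (V2C, R0, P2) are
    # assumptions based on KITTI calibration conventions.
    V2C = np.vstack((calib.V2C, np.array([0., 0., 0., 1.], dtype=np.float32)))  # 4x4
    R0 = np.eye(4, dtype=np.float32)
    R0[:3, :3] = calib.R0
    trans_lidar_to_cam = R0 @ V2C   # 4x4: lidar frame -> rectified camera frame
    trans_cam_to_img = calib.P2     # 3x4: camera frame -> image plane
    return trans_lidar_to_cam, trans_cam_to_img
```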
GET_ITEM_LIST is a good idea for various data sources. However, points=self.get_lidar() is a common setting for LiDAR-based 3D object detection, so I think it should be kept by default to ensure previous configs can still use the KittiDataset class.
This part should be something like,
get_item_list = self.dataset_cfg.get('GET_ITEM_LIST', ['points'])
# load points
if 'points' in get_item_list:
xxx
# load images
xxxx
# load depth_maps
xxxx
# load calib_matricies
xxxx
Done
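The dispatch pattern suggested above can be made concrete with a small stand-alone sketch; the function name and the loaders dict are illustrative only, not the actual KittiDataset code:

```python
def load_items(dataset_cfg, sample_idx, loaders):
    # Dispatch data loading via GET_ITEM_LIST; 'points' stays the default so
    # existing LiDAR-only configs keep working unchanged.
    get_item_list = dataset_cfg.get('GET_ITEM_LIST', ['points'])
    input_dict = {}
    for item in get_item_list:
        # each loader maps a sample index to its data item
        input_dict[item] = loaders[item](sample_idx)
    return input_dict
```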
self.voxel_grid = kornia.utils.create_meshgrid3d(depth=self.depth,
                                                 height=self.height,
                                                 width=self.width,
                                                 normalized_coordinates=False)
Is it necessary to use kornia? It seems we could simply implement this function with native PyTorch operations, e.g. within one file in pcdet/utils.
I could re-implement the kornia functions; however, I use seven different functions throughout the code, and adding these implementations adds code to this repo that I don't feel is necessary. Additionally, I already require adding a dependency (torchvision), so the requirements need to be updated anyway.
Kornia Functions:
kornia.image_to_tensor
kornia.utils.create_meshgrid3d
kornia.transform_points
kornia.normalize
kornia.losses.FocalLoss
kornia.convert_points_to_homogeneous
kornia.convert_points_from_homogeneous
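As an illustration of the reviewer's point, at least create_meshgrid3d is straightforward in native PyTorch. The (x, y, z) channel ordering in the last dimension below is an assumption and should be checked against kornia's output before swapping it in:

```python
import torch

def create_meshgrid3d(depth, height, width, normalized_coordinates=False):
    # Returns a (1, D, H, W, 3) coordinate grid, mimicking
    # kornia.utils.create_meshgrid3d (last-dim order (x, y, z) assumed).
    zs = torch.arange(depth, dtype=torch.float32)
    ys = torch.arange(height, dtype=torch.float32)
    xs = torch.arange(width, dtype=torch.float32)
    if normalized_coordinates:
        # rescale each axis from [0, N-1] to [-1, 1]
        zs = 2.0 * zs / max(depth - 1, 1) - 1.0
        ys = 2.0 * ys / max(height - 1, 1) - 1.0
        xs = 2.0 * xs / max(width - 1, 1) - 1.0
    z, y, x = torch.meshgrid(zs, ys, xs, indexing='ij')
    return torch.stack((x, y, z), dim=-1).unsqueeze(0)
```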
__all__ = {
    'FrustumToVoxel': FrustumToVoxel
}
I think f2v is a type of vfe (voxel feature encoding/extraction), using frustum features instead of point-wise features. So how about moving f2v to the vfe folder and creating a module named something like FrustumVFE based on FrustumToVoxel?
Moved f2v as a submodule of ImageVFE.
@@ -6,7 +6,7 @@
from ...ops.iou3d_nms import iou3d_nms_utils
Modify this file by considering f2v as vfe.
Done, moved f2v as a submodule of ImageVFE.
pcdet/utils/grid_utils.py
@@ -0,0 +1,19 @@
import torch
Maybe there is no need to create a separate file for this simple function. For example, we could merge grid_utils.py, depth_utils.py and transform_utils.py into a single file, transform_utils.py.
Done
@@ -0,0 +1,5 @@
from .depth_ffe import DepthFFE |
I'm not sure whether it is better to also move ffe to the vfe folder, since it seems ffe can only be used as a module preceding f2v. If so, the overall framework will still stay simple and clear even with the implementation of CaDDN.
Moved ffe as a submodule of ImageVFE.
Nice code! The only suggestion: how about fusing ffe + f2v into a new vfe module, since it also aims to extract voxel-wise features from …
Thanks for the quick review! Sounds good, I'll make the requested changes and fuse the FFE + F2V into one module.
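The agreed fusion could be sketched as a thin wrapper module; the class name and the batch_dict interface follow OpenPCDet conventions, but both are assumptions here rather than the actual merged implementation:

```python
import torch.nn as nn

class ImageVFE(nn.Module):
    # Hedged sketch of fusing FFE + F2V into a single voxel-feature-encoding
    # module: the FFE lifts image features into a frustum, and F2V resamples
    # the frustum into a voxel grid.
    def __init__(self, ffe, f2v):
        super().__init__()
        self.ffe = ffe  # frustum feature extractor (depth distribution estimation)
        self.f2v = f2v  # frustum-to-voxel-grid transform

    def forward(self, batch_dict):
        batch_dict = self.ffe(batch_dict)  # image -> frustum features
        batch_dict = self.f2v(batch_dict)  # frustum -> voxel features
        return batch_dict
```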
…nsample into data_processor
This PR should be good to go. I made …
Reviewed.
Summary
CaDDN is a monocular 3D object detection method that estimates categorical depth distributions in order to generate 3D feature representations for 3D object detection. It has been accepted at CVPR 2021 as an oral submission.
Paper: https://arxiv.org/abs/2103.01100
Code: https://github.com/TRAILab/CaDDN
Changes
- Modified kitti_dataset.py and dataset.py to support image, depth map, and 2D GT box loading
- Added GET_ITEM_LIST to specify which data items to load
- Added random_flip_horizontal
- Added the CaDDN detector
- Added kornia and torchvision requirements
- DepthFFE: frustum feature extractor via depth distribution estimation
- DDNDeepLabV3 / DDNTemplate: estimate depth distributions
- DDNLoss: loss for the DDN
- FrustumToVoxel: transforms the frustum into a voxel grid
- FrustumGridGenerator: generates the frustum sampling grid
- Sampler: samples the frustum grid
- Conv2DCollapse: collapses the voxel grid to BEV via concatenation + 1x1 convolution
- Balancer: loss balancer for foreground/background pixels
- BasicBlock2D: Conv2D + BN + ReLU block
- calib_to_matricies: generates transformation matrices from calib objects
- calculate_grid_size: calculates grid_size without VoxelGenerator
- get_pad_params: gets padding parameters for image padding
- bin_depths: converts a depth map into depth bin indices
- normalize_coords: normalizes grid coordinates to [-1, 1]
- compute_fg_mask: computes a foreground pixel mask for images based on 2D GT boxes
- project_to_image: projects 3D points to the image via projection matrices using PyTorch
Results