SamsungLabs/tr3d

TR3D: Towards Real-Time Indoor 3D Object Detection

News:

  • 🔥 June, 2023. TR3D was accepted at ICIP 2023.
  • 🚀 June, 2023. We added a ScanNet-pretrained S3DIS model and log, significantly advancing the state of the art.
  • February, 2023. TR3D on all 3 datasets is now supported in mmdetection3d as a project.
  • 🔥 February, 2023. TR3D is now state-of-the-art on paperswithcode on SUN RGB-D and S3DIS.

This repository contains an implementation of TR3D, a 3D object detection method introduced in our paper:

TR3D: Towards Real-Time Indoor 3D Object Detection
Danila Rukhovich, Anna Vorontsova, Anton Konushin
Samsung Research
https://arxiv.org/abs/2302.02858

Installation

For convenience, we provide a Dockerfile.

Alternatively, you can install all required packages manually. This implementation is based on the mmdetection3d framework. Please follow the original installation guide, getting_started.md, including the MinkowskiEngine installation, and replace open-mmlab/mmdetection3d with samsunglabs/tr3d.

Most of the TR3D-related code is located in the following files: detectors/mink_single_stage.py, detectors/tr3d_ff.py, dense_heads/tr3d_head.py, necks/tr3d_neck.py.

Getting Started

Please see getting_started.md for basic usage examples. We follow the mmdetection3d data preparation protocol described in scannet, sunrgbd, and s3dis.

Training

To start training, run the train script with a TR3D config:

python tools/train.py configs/tr3d/tr3d_scannet-3d-18class.py

Testing

To test a pre-trained model, run the test script with a TR3D config:

python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py \
    work_dirs/tr3d_scannet-3d-18class/latest.pth --eval mAP

Visualization

Visualizations can be created with the test script. For cleaner visualizations, you may set score_thr in the config to 0.3:

python tools/test.py configs/tr3d/tr3d_scannet-3d-18class.py \
    work_dirs/tr3d_scannet-3d-18class/latest.pth --eval mAP --show \
    --show-dir work_dirs/tr3d_scannet-3d-18class

Models

The metrics are obtained in 5 training runs followed by 5 test runs. We report both the best and the average values (the latter are given in round brackets). Inference speed (scenes per second) is measured on a single NVIDIA RTX 4090. Please note that the ScanNet-pretrained S3DIS model was actually trained in the original open-mmlab/mmdetection3d codebase.
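
To make the "best (average)" notation concrete, the sketch below formats a list of per-run scores in that style. The function name `summarize_runs` and the sample values are hypothetical placeholders, not actual results, and the sketch assumes that all evaluation runs are simply pooled into one list before taking the maximum and the mean.

```python
from statistics import mean


def summarize_runs(scores):
    """Format per-run metric values as 'best (average)', one decimal each."""
    if not scores:
        raise ValueError("at least one run is required")
    best = max(scores)   # best value over all runs
    avg = mean(scores)   # average value over all runs
    return f"{best:.1f} ({avg:.1f})"


# Hypothetical mAP values for illustration only, NOT results from the paper.
example_scores = [70.0, 71.0, 72.0, 71.5, 70.5]
print(summarize_runs(example_scores))  # prints "72.0 (71.0)"
```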

TR3D 3D Detection

| Dataset | mAP@0.25 | mAP@0.5 | Scenes per sec. | Download |
|---|---|---|---|---|
| ScanNet | 72.9 (72.0) | 59.3 (57.4) | 23.7 | model, log, config |
| SUN RGB-D | 67.1 (66.3) | 50.4 (49.6) | 27.5 | model, log, config |
| S3DIS | 74.5 (72.1) | 51.7 (47.6) | 21.0 | model, log, config |
| S3DIS (ScanNet-pretrained) | 75.9 (75.1) | 56.6 (54.8) | 21.0 | model, log, config |

RGB + PC 3D Detection on SUN RGB-D

| Model | mAP@0.25 | mAP@0.5 | Scenes per sec. | Download |
|---|---|---|---|---|
| ImVoteNet | 63.4 | - | 14.8 | instruction |
| VoteNet+FF | 64.5 (63.7) | 39.2 (38.1) | - | model, log, config |
| TR3D+FF | 69.4 (68.7) | 53.4 (52.4) | 17.5 | model, log, config |

Example Detections

[Figure: example detections]

Citation

If you find this work useful for your research, please cite our paper:

@misc{rukhovich2023tr3d,
  doi = {10.48550/ARXIV.2302.02858},
  url = {https://arxiv.org/abs/2302.02858},
  author = {Rukhovich, Danila and Vorontsova, Anna and Konushin, Anton},
  title = {TR3D: Towards Real-Time Indoor 3D Object Detection},
  publisher = {arXiv},
  year = {2023}
}