Official PyTorch implementation of the method Box4Scene.
Please install the required packages. Some libraries used in this project, including MinkowskiEngine and PyTorch Lightning, are known to behave differently across versions; please use the exact versions specified in requirements.txt.
For the installation of YOLO-World, please refer to OpenYOLO3D.
The code provided is compatible with nuScenes and SemanticKITTI. Put the datasets you intend to use in the datasets folder (a symbolic link is accepted).
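If your datasets live elsewhere on disk, a symbolic link avoids copying them. A minimal sketch (the source path below is an assumption; point it at wherever your download actually lives):

```python
# Link an existing nuScenes download into the expected "datasets" folder.
import os

os.makedirs("datasets", exist_ok=True)
src = "/data/sets/nuscenes"   # hypothetical location of your download
dst = os.path.join("datasets", "nuscenes")
# lexists also catches a pre-existing (possibly dangling) symlink
if not os.path.lexists(dst):
    os.symlink(src, dst)
```

The equivalent one-liner from a shell would be `ln -s /data/sets/nuscenes datasets/nuscenes`.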
Before launching the pre-training, you first need to save the image-to-LiDAR correspondences:
python preprocess4labelmaps/pc2image.py
You then need to generate semantic and instance maps with the YOLO-World model:
python preprocess4labelmaps/yolo_detector.py
To launch a pre-training of the Minkowski SR-UNet (minkunet) on nuScenes:
python pretrain.py --cfg config/slidr_minkunet.yaml
You can alternatively replace minkunet with voxelnet to pre-train a PV-RCNN backbone.
Weights of the pre-training can be found in the output folder, and can be re-used during a downstream task.
If you wish to use multiple GPUs, please scale the learning rate and batch size accordingly.
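A common way to scale these two hyperparameters together is the linear scaling rule: when the total batch size grows by a factor k, multiply the base learning rate by k as well. This is a general heuristic, not a guarantee for this codebase; the base values below are placeholders, not the project's defaults:

```python
# Linear scaling rule (a common heuristic for data-parallel training):
# growing the total batch size by a factor k scales the learning rate by k.
def scale_hyperparams(base_lr, base_batch_size, num_gpus):
    """Return (learning rate, total batch size) for multi-GPU training."""
    total_batch_size = base_batch_size * num_gpus
    lr = base_lr * num_gpus
    return lr, total_batch_size

# Example: a single-GPU setting of lr=2e-3 and batch size 16, moved to 4 GPUs.
lr, bs = scale_hyperparams(2e-3, 16, 4)
# lr == 8e-3, bs == 64
```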
To launch a semantic segmentation downstream task, use the following command:
python downstream.py --cfg_file="config/semseg_nuscenes.yaml" --pretraining_path="output/pretrain/[...]/model.pt"
with the previously obtained weights and any config file. The default config performs a fine-tuning on 1% of nuScenes' training set, with learning rates optimized for the provided pre-training.
To re-evaluate the score of any downstream network, run:
python evaluate.py --resume_path="output/downstream/[...]/model.pt" --dataset="nuscenes"
If you wish to re-evaluate the linear probing, the experiments in the paper were obtained with lr=0.05, lr_head=null and freeze_layers=True.
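For reference, those values would look like this as a config override (key names are taken from the parameters above and assumed to match the downstream config files):

```yaml
# Linear probing settings reported in the paper
lr: 0.05
lr_head: null
freeze_layers: True
```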
Part of the codebase has been adapted from SLidR.
Box4Scene is released under the Apache 2.0 license.