dengq7/Box4Scene


Learning 3D Representation from Auto-labeled 2D Object Boxes

Official PyTorch implementation of Box4Scene.

Dependencies

Please install the required packages. Some libraries used in this project, including MinkowskiEngine and PyTorch Lightning, are known to behave differently across versions; please use the exact versions specified in requirements.txt.

For the installation of YOLO-World, please refer to OpenYOLO3D.

Datasets

The provided code is compatible with nuScenes and SemanticKITTI. Put the datasets you intend to use in the datasets folder (a symbolic link is accepted).

Pre-trained models

Minkowski SR-UNet

Reproducing the results

Save image-to-LiDAR correspondences to boost training (required)

Before launching the pre-training, you first need to save the image-to-LiDAR correspondences:

python preprocess4labelmaps/pc2image.py
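Conceptually, this step projects each LiDAR point into the camera images using the standard pinhole model. Below is a minimal NumPy sketch of that projection; the function name and calibration format are illustrative assumptions, not the repository's API:

```python
import numpy as np

def project_points_to_image(points, K, T_cam_from_lidar, img_w, img_h):
    """Project LiDAR points (N, 3) into an image plane.

    Returns pixel coordinates (N, 2) and a boolean mask selecting points
    that land inside the image and lie in front of the camera.
    """
    # Transform points into the camera frame with a 4x4 extrinsic matrix.
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])
    pts_cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]
    in_front = pts_cam[:, 2] > 0
    # Perspective projection with the pinhole intrinsics K.
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / np.clip(uv[:, 2:3], 1e-6, None)
    in_image = (uv[:, 0] >= 0) & (uv[:, 0] < img_w) & \
               (uv[:, 1] >= 0) & (uv[:, 1] < img_h)
    return uv, in_front & in_image
```

The resulting mask tells which points have a valid image correspondence and can therefore receive 2D labels during pre-training.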

Generate semantic and instance maps from YOLO-World (required)

Before launching the pre-training, you then need to generate semantic and instance maps with the YOLO-World model:

python preprocess4labelmaps/yolo_detector.py
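The idea behind these maps is to rasterize each detected 2D box into per-pixel labels. A hedged sketch of that rasterization, not the repository's implementation (the box format and the overwrite order are assumptions):

```python
import numpy as np

def boxes_to_label_maps(boxes, class_ids, img_h, img_w, ignore_id=0):
    """Rasterize 2D detection boxes (x1, y1, x2, y2) into a semantic map
    (class id per pixel) and an instance map (box index per pixel).

    Boxes are painted in order, so later boxes overwrite earlier ones
    where they overlap.
    """
    semantic = np.full((img_h, img_w), ignore_id, dtype=np.int32)
    instance = np.full((img_h, img_w), ignore_id, dtype=np.int32)
    for inst_id, (box, cls) in enumerate(zip(boxes, class_ids), start=1):
        x1, y1, x2, y2 = [int(v) for v in box]
        semantic[y1:y2, x1:x2] = cls
        instance[y1:y2, x1:x2] = inst_id
    return semantic, instance
```

Combined with the projection from the previous step, every LiDAR point falling inside a box inherits its semantic class and instance id.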

Pre-training a 3D backbone

To launch a pre-training of the Minkowski SR-UNet (minkunet) on nuScenes:

python pretrain.py --cfg config/slidr_minkunet.yaml

You can alternatively replace minkunet with voxelnet to pre-train a PV-RCNN backbone.
Weights of the pre-training can be found in the output folder, and can be re-used during a downstream task. If you wish to use multiple GPUs, please scale the learning rate and batch size accordingly.
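The scaling mentioned above is the usual linear scaling rule: with N GPUs the effective batch size grows by N, so the learning rate is multiplied by the same factor. A sketch of that convention (not code from this repository):

```python
def scale_hyperparams(base_lr, base_batch_size, num_gpus):
    """Linear scaling rule for data-parallel training: the effective batch
    size is base_batch_size * num_gpus, so the learning rate is scaled by
    the same factor to keep the per-sample update magnitude comparable."""
    return base_lr * num_gpus, base_batch_size * num_gpus
```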

Semantic segmentation

To launch a semantic segmentation, use the following command:

python downstream.py --cfg_file="config/semseg_nuscenes.yaml" --pretraining_path="output/pretrain/[...]/model.pt"

using the previously obtained weights and any config file. The default config fine-tunes on 1% of the nuScenes training set, with learning rates optimized for the provided pre-training.
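For reproducible few-label experiments, the 1% subset should be sampled deterministically. A hedged sketch of one way to do this (the sampling scheme is an assumption, not necessarily how the repository builds its split):

```python
import random

def sample_fraction(scene_tokens, fraction=0.01, seed=0):
    """Deterministically sample a fixed fraction of the training scenes,
    so the same fine-tuning split is reused across runs and machines."""
    rng = random.Random(seed)
    k = max(1, int(len(scene_tokens) * fraction))
    return sorted(rng.sample(scene_tokens, k))
```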

To re-evaluate the score of any downstream network, run:

python evaluate.py --resume_path="output/downstream/[...]/model.pt" --dataset="nuscenes"
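The standard metric for semantic segmentation is mean IoU, computed from a confusion matrix over all classes. A minimal sketch of that computation (illustrative, not the repository's evaluation code):

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean intersection-over-union from flat prediction/target label arrays.

    Classes that never appear in either predictions or targets are
    excluded from the average.
    """
    mask = np.ones_like(target, dtype=bool) if ignore_index is None \
        else target != ignore_index
    # Confusion matrix: rows are target classes, columns are predictions.
    cm = np.bincount(num_classes * target[mask] + pred[mask],
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    inter = np.diag(cm).astype(float)
    union = cm.sum(0) + cm.sum(1) - np.diag(cm)
    ious = inter / np.maximum(union, 1)
    return ious[union > 0].mean()
```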

If you wish to re-evaluate the linear probing, the experiments in the paper were obtained with lr=0.05, lr_head=null and freeze_layers=True.
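In a linear-probing setup these settings mean the backbone is frozen and only the head is trained at the probing learning rate (lr_head=null falling back to lr). A hedged sketch of how such optimizer parameter groups could be built; the "head." naming convention is an assumption for illustration:

```python
def linear_probe_param_groups(named_params, lr=0.05, freeze_layers=True):
    """Split (name, param) pairs into optimizer groups for linear probing.

    With freeze_layers=True only the head parameters are returned, so the
    backbone receives no gradient updates; otherwise both groups share lr
    (mirroring lr_head=null, i.e. no separate head learning rate).
    """
    head, backbone = [], []
    for name, p in named_params:
        (head if name.startswith("head.") else backbone).append(p)
    groups = [{"params": head, "lr": lr}]
    if not freeze_layers:
        groups.append({"params": backbone, "lr": lr})
    return groups
```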

Acknowledgment

Part of the codebase has been adapted from SLidR.

License

Box4Scene is released under the Apache 2.0 license.
