
Getting Started & Problem Definition

The purpose of an Unsupervised Domain Adaptation (UDA) task is to learn a generalized model or backbone $F$ from a labeled source domain $s$ and an unlabeled target domain $t$, such that $F$ can be adapted to the new target domain $t$. Unlabeled training data (such as point clouds or images) from the target domain $t$ are assumed to be available during the adaptation process.
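
Stated slightly more formally (a minimal formalization of the paragraph above; the dataset symbols $\mathcal{D}_s$, $\mathcal{D}_t$ and the loss $\mathcal{L}$ are introduced here for illustration only):

```latex
% Labeled source domain and unlabeled target domain
\mathcal{D}_s = \{(x_i^{s},\, y_i^{s})\}_{i=1}^{N_s}, \qquad
\mathcal{D}_t = \{x_j^{t}\}_{j=1}^{N_t}

% Goal: using D_s and the unlabeled D_t, learn F that minimizes the
% (unobservable during training) expected detection risk on the target domain
\min_{F}\; \mathbb{E}_{(x^{t},\, y^{t})}\!\left[\mathcal{L}\big(F(x^{t}),\, y^{t}\big)\right]
```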

Getting Started & Task Challenges

  • Different domains present inconsistent object-size distributions, as illustrated in the object-size statistics file. Thus, Statistical Normalization (SN) is used to rescale object sizes during source-domain training, where both the bounding box and the point cloud within the bounding box are rescaled (see the first code sketch after this list). For Waymo-to-KITTI adaptation, we found that object-size variation is a major cause of the cross-domain detection accuracy drop.

  • The LiDAR beam count also varies across AD manufacturers. For Waymo-to-nuScenes adaptation, we argue that LiDAR-beam variation is a major challenge, and we leverage the range map provided in the Waymo tfrecords to produce low-beam point clouds (such as 32-beam or 16-beam). Please refer to Results for more details.

    • For datasets where the range map is not provided, such as the ONCE dataset, one can apply a clustering algorithm to the points' height angles to obtain pseudo low-beam point clouds (see the second code sketch after this list), which is also verified to be effective in our codebase.
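
To make the SN idea above concrete, here is a minimal Python sketch of rescaling one ground-truth box and its interior points toward the target domain's mean object size. It is an illustration under simplifying assumptions (the axis conventions, the geometric box center, and the name `normalize_object_size` are ours), not the codebase's actual implementation:

```python
import numpy as np

def normalize_object_size(points, box, target_mean_size):
    """Sketch of SN: rescale one GT box and the points inside it
    toward the target domain's mean object size.

    points: (N, 3) point cloud in the sensor frame
    box: (7,) [cx, cy, cz, l, w, h, yaw]; geometric center assumed
    target_mean_size: (3,) mean [l, w, h] on the target domain
    """
    center, size, yaw = box[:3], box[3:6], box[6]
    scale = target_mean_size / size  # per-axis scale factors

    # Rotate points into the box frame and select the interior ones
    # with an axis-aligned containment test.
    c, s = np.cos(-yaw), np.sin(-yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    local = (points - center) @ rot.T
    inside = np.all(np.abs(local) <= size / 2, axis=1)

    # Rescale interior points about the box center, map them back to
    # the sensor frame, and rescale the box size accordingly.
    local[inside] *= scale
    out = points.copy()
    out[inside] = local[inside] @ rot + center
    return out, np.concatenate([center, size * scale, [yaw]])
```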
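
And for the pseudo low-beam generation just mentioned, the sketch below groups points into beams by their vertical (height) angles and keeps a subset. Uniform angle binning stands in for the clustering step here, and `downsample_beams`, `num_beams`, and `keep_every` are illustrative names rather than the codebase's API:

```python
import numpy as np

def downsample_beams(points, num_beams=64, keep_every=4):
    """Approximate a low-beam LiDAR by grouping points into beams
    via their vertical (height) angles, then keeping a subset.

    points: (N, 3+) array with x, y, z in the sensor frame
    num_beams: assumed beam count of the source sensor (64 for Waymo)
    keep_every: keep one beam in every `keep_every` (4 -> 16 beams)
    """
    # Vertical angle of each point above the sensor's horizontal plane.
    xy_range = np.linalg.norm(points[:, :2], axis=1)
    theta = np.arctan2(points[:, 2], xy_range)

    # Group angles into `num_beams` clusters; uniform bins are used
    # here for brevity, while 1-D k-means on theta is more robust to
    # the non-uniform beam spacing of real sensors.
    edges = np.linspace(theta.min(), theta.max() + 1e-6, num_beams + 1)
    beam_id = np.clip(np.digitize(theta, edges) - 1, 0, num_beams - 1)

    return points[beam_id % keep_every == 0]
```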


Getting Started & Training-Testing for UDA Setting

Here, we take Waymo-to-KITTI adaptation as an example.

Pretraining stage: train the source-only model on the labeled source domain:

  • Train FEAT=3 (X,Y,Z) with SN (statistical normalization) using multiple GPUs:

```shell
sh scripts/dist_train.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_old_anchor_sn_kitti.yaml
```

  • Train FEAT=3 (X,Y,Z) with SN (statistical normalization) using multiple machines:

```shell
sh scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} ${NUM_NODES} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_old_anchor_sn_kitti.yaml
```

  • Train FEAT=3 (X,Y,Z) without SN (statistical normalization) using multiple GPUs:

```shell
sh scripts/dist_train.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml
```

  • Train FEAT=3 (X,Y,Z) without SN (statistical normalization) using multiple machines:

```shell
sh scripts/slurm_train.sh ${PARTITION} ${JOB_NAME} ${NUM_NODES} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml
```

  • Train other baseline detectors such as PV-RCNN++ using multiple GPUs:

```shell
sh scripts/dist_train.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pv_rcnn_plus_feat_3_vehi_full_train.yaml
```

  • Train other baseline detectors such as Voxel-RCNN using multiple GPUs:

```shell
sh scripts/dist_train.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/voxel_rcnn_feat_3_vehi.yaml
```

Evaluate the source-pretrained model:

  • Note that for cross-domain settings where the KITTI dataset is regarded as the target domain, please add --set DATA_CONFIG_TAR.FOV_POINTS_ONLY True to evaluate on the front-view point cloud only (a sketch of such front-view filtering follows this list). We report the best result over all epochs on the validation set.

  • Test the source-only models using multiple GPUs:

```shell
sh scripts/dist_test.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml \
--ckpt ${CKPT}
```

  • Test the source-only models using multiple machines:

```shell
sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_NODES} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml \
--ckpt ${CKPT}
```

  • Test the source-only models of all ckpts using multiple GPUs:

```shell
sh scripts/dist_test.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml \
--eval_all
```

  • Test the source-only models of all ckpts using multiple machines:

```shell
sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_NODES} \
--cfg_file ./cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml \
--eval_all
```

Adaptation stage: self-train the source-pretrained model on the unlabeled target domain:

  • You need to set --pretrained_model ${PRETRAINED_MODEL} to the checkpoint obtained from the pretraining stage above (a schematic sketch of the self-training loop follows these commands).

  • If you trained the source-only model with SN (statistical normalization), e.g., using pvrcnn_old_anchor_sn_kitti.yaml, you should run the pre-SN script as follows, where pre-SN denotes that the SN (statistical normalization) operation is performed before the adaptation stage.

  • Train FEAT=3 (X,Y,Z) with pre-SN (statistical normalization) using multiple machines:

```shell
sh scripts/UDA/slurm_train_uda.sh ${PARTITION} ${JOB_NAME} ${NUM_NODES} ${QUOTATYPE} \
--cfg_file ./cfgs/DA/waymo_kitti/pvrcnn_pre_SN_feat_3.yaml \
--pretrained_model ${PRETRAINED_MODEL}
```

  • Train FEAT=3 (X,Y,Z) with pre-SN (statistical normalization) using multiple GPUs:

```shell
sh scripts/UDA/dist_train_uda.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/pvrcnn_pre_SN_feat_3.yaml \
--pretrained_model ${PRETRAINED_MODEL}
```
  • If you trained the source-only model without SN (statistical normalization), you should run the post-SN script as follows, where post-SN denotes that the SN (statistical normalization) operation is performed during the adaptation stage.

  • Train FEAT=3 (X,Y,Z) with post-SN (statistical normalization) using multiple machines:

```shell
sh scripts/UDA/slurm_train_uda.sh ${PARTITION} ${JOB_NAME} ${NUM_NODES} ${QUOTATYPE} \
--cfg_file ./cfgs/DA/waymo_kitti/pvrcnn_post_SN_feat_3.yaml \
--pretrained_model ${PRETRAINED_MODEL}
```

  • Train FEAT=3 (X,Y,Z) with post-SN (statistical normalization) using multiple GPUs:

```shell
sh scripts/UDA/dist_train_uda.sh ${NUM_GPUs} \
--cfg_file ./cfgs/DA/waymo_kitti/pvrcnn_post_SN_feat_3.yaml \
--pretrained_model ${PRETRAINED_MODEL}
```
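
Conceptually, the adaptation stage referenced above is a pseudo-label self-training loop: the source-pretrained detector labels the unlabeled target scans, high-confidence predictions are kept as pseudo ground truth, and the detector is retrained on them. The sketch below illustrates this loop under assumed interfaces (the `predict` and `retrain` callables, the threshold value, and `num_rounds` are ours), not the actual training script:

```python
from typing import Callable, Dict, Tuple
import numpy as np

def self_train(
    predict: Callable[[np.ndarray], Tuple[np.ndarray, np.ndarray]],
    retrain: Callable[[Dict[str, np.ndarray]], None],
    target_scans: Dict[str, np.ndarray],
    num_rounds: int = 3,
    score_thresh: float = 0.6,
) -> None:
    """Schematic pseudo-label self-training loop.

    predict: maps a point cloud to (boxes, scores) -- assumed interface
    retrain: trains the detector on {scan_id: pseudo boxes} -- assumed
    target_scans: unlabeled target-domain point clouds keyed by scan id
    """
    for _ in range(num_rounds):
        # 1) Label the unlabeled target scans with the current model and
        #    keep only high-confidence predictions as pseudo ground truth.
        pseudo_labels = {}
        for scan_id, points in target_scans.items():
            boxes, scores = predict(points)
            pseudo_labels[scan_id] = boxes[scores >= score_thresh]

        # 2) Retrain on the target scans with the kept pseudo labels
        #    (post-SN would rescale these boxes and points at this step).
        retrain(pseudo_labels)
```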

Evaluating the model on the target validation set:

  • Note that for cross-domain settings where the KITTI dataset is regarded as the target domain, please add --set DATA_CONFIG_TAR.FOV_POINTS_ONLY True to evaluate on the front-view point cloud only. We report the best result over all epochs on the validation set.

  • Test with a ckpt file:

```shell
python test.py \
--cfg_file ${CONFIG_FILE} \
--batch_size ${BATCH_SIZE} \
--ckpt ${CKPT}
```
  • To test all the saved checkpoints of a specific training setting and plot the performance curves in TensorBoard, add the --eval_all argument:

```shell
python test.py \
--cfg_file ${CONFIG_FILE} \
--batch_size ${BATCH_SIZE} \
--eval_all
```
  • To test with multiple GPUs:

```shell
sh scripts/dist_test.sh ${NUM_GPUs} \
--cfg_file ${CONFIG_FILE} \
--batch_size ${BATCH_SIZE} \
--ckpt ${CKPT}
```

  • To test all ckpts with multiple GPUs:

```shell
sh scripts/dist_test.sh ${NUM_GPUs} \
--cfg_file ${CONFIG_FILE} \
--batch_size ${BATCH_SIZE} \
--eval_all
```

  • To test with multiple machines:

```shell
sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_NODES} \
--cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --ckpt ${CKPT}
```

  • To test all ckpts with multiple machines:

```shell
sh scripts/slurm_test_mgpu.sh ${PARTITION} ${NUM_NODES} \
--cfg_file ${CONFIG_FILE} --batch_size ${BATCH_SIZE} --eval_all
```


All UDA Results:

We report the cross-dataset adaptation results including Waymo-to-KITTI, nuScenes-to-KITTI, and Waymo-to-nuScenes.

  • All LiDAR-based models are trained with 4 NVIDIA A100 GPUs and are available for download.
  • The domain adaptation time is measured with 4 NVIDIA A100 GPUs and PyTorch 1.8.1.
  • All results are reported using BEV / 3D AP as the evaluation metric; for the KITTI dataset, we report the moderate difficulty case.
  • Pre-SN denotes that the SN (statistical normalization) operation is performed during the pre-training stage (SN on the source domain).
  • Post-SN denotes that the SN (statistical normalization) operation is performed during the adaptation stage (SN on the target domain).

UDA Results for Waymo-to-KITTI:

| Detector | Training time | Adaptation | Car@R40 (BEV / 3D) | Download |
|----------|---------------|------------|--------------------|----------|
| PointPillar | ~7.1 hours | Source-only with SN | 74.98 / 49.31 | - |
| PointPillar | ~0.6 hours | Pre-SN | 81.71 / 57.11 | model-57M |
| PV-RCNN | ~23 hours | Source-only with SN | 69.92 / 60.17 | - |
| PV-RCNN | ~23 hours | Source-only | 74.42 / 40.35 | - |
| PV-RCNN | ~3.5 hours | Pre-SN | 84.00 / 74.57 | model-156M |
| PV-RCNN | ~1 hour | Post-SN | 84.94 / 75.20 | model-156M |
| Voxel R-CNN | ~16 hours | Source-only with SN | 75.83 / 55.50 | - |
| Voxel R-CNN | ~16 hours | Source-only | 64.88 / 19.90 | - |
| Voxel R-CNN | ~2.5 hours | Pre-SN | 82.56 / 67.32 | model-201M |
| Voxel R-CNN | ~2.2 hours | Post-SN | 85.44 / 76.78 | model-201M |
| PV-RCNN++ | ~20 hours | Source-only with SN | 67.22 / 56.50 | - |
| PV-RCNN++ | ~20 hours | Source-only | 67.68 / 20.82 | - |
| PV-RCNN++ | ~2.2 hours | Post-SN | 86.86 / 79.86 | model-193M |

UDA Results for nuScenes-to-KITTI:

| Detector | Training time | Adaptation | Car@R40 (BEV / 3D) | Download |
|----------|---------------|------------|--------------------|----------|
| PV-RCNN | ~15.7 hours | Source-only with SN | 60.16 / 49.63 | model-156M |
| PV-RCNN | ~15.7 hours | Source-only | 64.58 / 27.12 | model-156M |
| PV-RCNN | ~1.5 hours | Pre-SN | 86.07 / 74.72 | model-156M |
| PV-RCNN | ~1 hour | Post-SN | 88.79 / 72.50 | model-156M |
| Voxel R-CNN | ~8.5 hours | Source-only | 66.94 / 30.33 | model-201M |
| Voxel R-CNN | ~2.2 hours | Post-SN | 87.11 / 66.02 | model-201M |
| PV-RCNN++ | ~18 hours | Source-only with SN | 54.47 / 36.05 | model-193M |
| PV-RCNN++ | ~18 hours | Source-only | 67.68 / 20.82 | model-193M |
| PV-RCNN++ | ~1 hour | Post-SN | 85.50 / 67.85 | model-193M |

UDA Results for Waymo-to-nuScenes:

  • [16-beam Waymo Train] denotes that we downsample the Waymo point clouds from 64-beam to 16-beam according to the range map provided with the corresponding point clouds, and then train the source-only model on the 16-beam Waymo data.

| Detector | Training time | Adaptation | Car@R40 (BEV / 3D) | Download |
|----------|---------------|------------|--------------------|----------|
| PV-RCNN | ~23 hours | Source-only | 31.02 / 21.21 | - |
| PV-RCNN | ~8 hours | Self-training | 33.29 / 22.15 | model-156M |
| PV-RCNN | ~19 hours | 32-beam Waymo Train | 34.19 / 21.37 | model-156M |
| PV-RCNN | ~15 hours | 16-beam Waymo Train | 40.23 / 23.33 | model-156M |
| PV-RCNN | ~8 hours | 16-beam Waymo + Self-training | - | - |
| Voxel R-CNN | ~16 hours | Source-only | 29.08 / 19.42 | - |
| Voxel R-CNN | ~2.2 hours | Self-training | 32.48 / 20.87 | model-201M |
| Voxel R-CNN | ~11 hours | 16-beam Waymo Train | 38.63 / 22.64 | model-201M |
| PV-RCNN++ | ~20 hours | Source-only | 31.96 / 19.80 | - |
| PV-RCNN++ | ~2.2 hours | Self-training | - | - |
| PV-RCNN++ | ~15.5 hours | 16-beam Waymo Train | 42.62 / 25.02 | model-193M |
| PV-RCNN++ | ~2.2 hours | 16-beam Waymo + Self-training | - | - |