Collaborative Semantic Occupancy Prediction with Hybrid Feature Fusion in Connected Automated Vehicles
```bash
cd cohff_opv2v

# Initialize conda env
conda create -y --name cohff python=3.8
conda activate cohff
conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
conda install -c "nvidia/label/cuda-11.3.1" cuda-toolkit

# Install dependencies
python setup.py develop
```
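As a quick sanity check (a minimal sketch, assuming the environment above installed as intended), you can confirm that PyTorch was built against CUDA 11.3 and can see a GPU:

```python
# Minimal environment check; run inside the activated cohff env.
import torch

print(torch.__version__)          # expected: 1.10.0
print(torch.version.cuda)         # expected: 11.3
print(torch.cuda.is_available())  # True if a compatible GPU driver is present
```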
The Semantic-OPV2V dataset provides additional files that extend the original OPV2V data.
We resimulated the scenarios used in OPV2V by employing OpenCDA with additional semantic LiDARs in the CARLA simulator. You can merge the additional files into the original OPV2V data folder; this approach is consistent with other works such as CoBEVT and FedBEVT.
We provide three variants of Semantic-OPV2V:
- 4LidarSurround: 4 semantic LiDARs are positioned around the ego vehicle.
- 4LidarCampos: 4 semantic LiDARs are positioned at the original camera positions on the ego vehicle.
- semanticlidar_18: 18 semantic LiDARs are positioned around the ego vehicle, offering a more complete 3D representation of the environment.
- Download OPV2V: Get the OPV2V dataset from HERE.
- Integrate Semantic Data: Merge the Semantic-OPV2V dataset into the original OPV2V dataset (see the merge sketch after this list). Download link (Google Drive): HERE.
- Config Dataset: The detailed configuration of the dataset is in `cohff_opv2v/config/train_config/opv2v_dataset.py`.
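A minimal sketch of the merge step, assuming the downloaded Semantic-OPV2V files mirror the scenario folder layout of the original OPV2V dataset; the root paths below are placeholders:

```python
# Hypothetical merge sketch: copy Semantic-OPV2V files into the matching
# scenario folders of the original OPV2V dataset. Adjust paths to your setup.
import shutil
from pathlib import Path

opv2v_root = Path("PATH/TO/OPV2V")              # original OPV2V dataset
semantic_root = Path("PATH/TO/Semantic-OPV2V")  # downloaded additional files

for src in semantic_root.rglob("*"):
    if src.is_file():
        dst = opv2v_root / src.relative_to(semantic_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        if not dst.exists():  # keep original OPV2V files untouched
            shutil.copy2(src, dst)
```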
To train a depth estimation model, please use the depth ground truth we created for the OPV2V dataset. You can download it from: DepthGT. We recommend using CaDDN for depth estimation.
- Navigate to the training configuration: `cohff_opv2v/config/train_config/task_config.yml`
- Set the task type for the occupancy prediction task:

```yaml
task_type: 0
model_type: 1
supervision: 0
train_dataset_dir: 'PATH/TO/TRAIN/SET'
val_dataset_dir: 'PATH/TO/VALIDATION/SET'
```

- Start training:

```bash
python main_train.py
```
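Before launching the run, a quick check (a sketch, assuming `task_config.yml` uses plain YAML keys as in the snippet above and that PyYAML is installed) can confirm the occupancy task is configured:

```python
# Sketch: load the task config and verify the occupancy prediction settings.
import yaml

with open("cohff_opv2v/config/train_config/task_config.yml") as f:
    cfg = yaml.safe_load(f)

assert cfg["task_type"] == 0, "expected occupancy prediction (task_type: 0)"
print(cfg["model_type"], cfg["train_dataset_dir"])
```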
- Navigate to the training configuration: `cohff_opv2v/config/train_config/task_config.yml`
- Set the task type for the semantic segmentation task:

```yaml
task_type: 1
model_type: 2
supervision: 1
train_dataset_dir: 'PATH/TO/TRAIN/SET'
val_dataset_dir: 'PATH/TO/VALIDATION/SET'
```

- Start training:

```bash
python main_train.py
```
- Navigate to the training configuration: `cohff_opv2v/config/train_config/task_config.yml`
- Set the task type for hybrid feature fusion:

```yaml
task_type: 2
model_type: 2
supervision: 1
segmentation_net_path: "/PATH/TO/PRETRAINED/SEGMENTATION/MODEL"
completion_net_path: "/PATH/TO/COMPLETION/MODEL"
train_dataset_dir: 'PATH/TO/TRAIN/SET'
val_dataset_dir: 'PATH/TO/VALIDATION/SET'
```

- Start training:

```bash
python main_train.py
```
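Since hybrid feature fusion builds on the two pre-trained networks, it can help to confirm that both checkpoints deserialize before training starts (a sketch; the checkpoint paths and format are assumptions, mirroring the config above):

```python
# Sketch: verify the pre-trained segmentation and completion checkpoints
# referenced in task_config.yml can be loaded. Paths are placeholders.
import torch

for name, path in [
    ("segmentation", "/PATH/TO/PRETRAINED/SEGMENTATION/MODEL"),
    ("completion", "/PATH/TO/COMPLETION/MODEL"),
]:
    ckpt = torch.load(path, map_location="cpu")
    print(f"{name}: loaded object of type {type(ckpt).__name__}")
```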
- Set the path of a checkpoint in `main_eval.py`.
- Start evaluating:

```bash
python main_eval.py --model-dir "PATH/TO/MODEL/DIR" --model-name "MODEL_NAME"
```

- Find the evaluation results in the JSON files in the corresponding model folder.
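To aggregate the results afterwards, a small helper along these lines may be useful (a sketch; the exact JSON file names and metric keys depend on the evaluation script):

```python
# Sketch: collect all evaluation JSON files written to the model folder.
import json
from pathlib import Path

model_dir = Path("PATH/TO/MODEL/DIR")
for result_file in sorted(model_dir.glob("*.json")):
    with open(result_file) as f:
        metrics = json.load(f)
    print(result_file.name, metrics)
```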
The default configuration file for the model is located at `./config/model_config/cfg_cohff_opv2v.py`. The available parameters are detailed in the table below:
Parameter | Meaning |
---|---|
model_type | Type of fusion model. 0: Masked semantic class based on occupancy prediction; 1: Masked semantic features based on occupancy prediction; 2: Concatenate occupancy and semantic features |
task_type | Types of task for training. 0: Completion; 1: Semantic Segmentation; 2: Hybrid_Fusion |
gi | Flag for using Gaussian importance based on relative position |
supervision | Label used for supervision. 0: ego GT; 1: collaborative GT |
pos_emb_type | Type of positional embedding. 1: [plane features] * [encoded rel. position] + [encoded rel. angle]; 2: [plane features] * ([encoded rel. position] + [encoded rel. angle]); 3: [plane features] + [encoded rel. position] + [encoded rel. angle] |
log_freq | Number of iterations between log output |
loss_freq | Number of iterations between loss record |
save_freq | Number of epochs between saves |
vox_range | Detection range in meters |
max_connect_cav | Maximum number of connected automated vehicles (CAVs) |
nbr_class | Number of semantic classes |
h | Length of voxel space |
w | Width of voxel space |
z | Height of voxel space |
segmentation_net_path | Path of the pre-trained segmentation network |
completion_net_path | Path of the pre-trained completion network |
train_dataset_dir | Directory of the train dataset |
val_dataset_dir | Directory of the validation dataset |
lr | Learning rate |
grad_max_norm | Clip the gradients to have a maximum L2 norm of this value |
weight_decay | Weight decay |
optimizer_type | Optimizer type |
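For intuition, the three `model_type` fusion variants listed above could be sketched roughly as follows. This is a minimal PyTorch sketch, not the repository's actual module; the tensor layout and the occupancy channel convention are assumptions:

```python
# Rough sketch of the three fusion variants described for model_type.
# Assumed shapes: occ_logits (B, 2, H, W, Z), sem_logits (B, nbr_class, H, W, Z),
# sem_feat (B, C, H, W, Z).
import torch

def fuse(occ_logits, sem_logits, sem_feat, model_type):
    # Binary occupancy mask from the completion prediction (1 = occupied voxel).
    occ_mask = occ_logits.argmax(dim=1, keepdim=True).float()
    if model_type == 0:
        # 0: mask the predicted semantic class with the occupancy prediction
        return sem_logits.argmax(dim=1, keepdim=True) * occ_mask
    if model_type == 1:
        # 1: mask the semantic features with the occupancy prediction
        return sem_feat * occ_mask
    # 2: concatenate occupancy and semantic features along the channel dimension
    return torch.cat([occ_logits, sem_feat], dim=1)

# Example: model_type 2 yields 2 + C feature channels per voxel.
out = fuse(torch.randn(1, 2, 100, 100, 8), torch.randn(1, 12, 100, 100, 8),
           torch.randn(1, 16, 100, 100, 8), model_type=2)
print(out.shape)  # torch.Size([1, 18, 100, 100, 8])
```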
Our project extensively uses toolchains from the OpenCDA ecosystem, including OpenCOOD and the OpenCDA simulation tools, for developing the Semantic-OPV2V dataset.
Our project draws inspiration from a lot of awesome previous works in collaborative perception, also known as cooperative perception, such as: DiscoNet (NeurIPS21), Where2comm (NeurIPS22), V2X-ViT (ECCV2022), CoBEVT (CoRL2022), CoCa3D (CVPR23) and many others.
Additionally, our project benefits from a lot of insightful previous works in vision-based 3D semantic occupancy prediction, also known as semantic scene completion, such as: MonoScene (CVPR22), TPVFormer (CVPR23), VoxFormer (CVPR23), FB-OCC (CVPR23) and many others.
```bibtex
@inproceedings{song2024collaborative,
  title={Collaborative Semantic Occupancy Prediction with Hybrid Feature Fusion in Connected Automated Vehicles},
  author={Song, Rui and Liang, Chenwei and Cao, Hu and Yan, Zhiran and Zimmer, Walter and Gross, Markus and Festag, Andreas and Knoll, Alois},
  publisher={IEEE/CVF},
  booktitle={2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}
```
You are also welcome to explore our new dataset for cooperative perception: TUMTraf-V2X
```bibtex
@inproceedings{zimmer2024tumtrafv2x,
  title={TUMTraf V2X Cooperative Perception Dataset},
  author={Zimmer, Walter and Wardana, Gerhard Arya and Sritharan, Suren and Zhou, Xingcheng and Song, Rui and Knoll, Alois C.},
  publisher={IEEE/CVF},
  booktitle={2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}
```