"Weakly Supervised Cross-Modal Learning for 4D Radar Scene Flow Estimation".

Source code for the ICML 2026 paper "Weakly Supervised Cross-Modal Learning for 4D Radar Scene Flow Estimation".

0. Setup

Environment: Clone the repo and build the environment.

We recommend using conda to manage the environment. See the detailed installation instructions for more information.

conda env create -f environment.yaml

CUDA package (requires the nvcc compiler):

# CUDA is already installed in the Python environment.
cd assets/cuda/chamfer3D && python ./setup.py install && cd ../../..
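Because the build step above needs nvcc, it can help to confirm that the compiler is visible on PATH before running setup.py. The snippet below is a minimal sketch and not part of the repository; the helper name find_nvcc is hypothetical.

```python
import shutil
import subprocess


def find_nvcc():
    """Return the full path to nvcc if it is on PATH, otherwise None."""
    return shutil.which("nvcc")


if __name__ == "__main__":
    nvcc = find_nvcc()
    if nvcc is None:
        print("nvcc not found on PATH; the chamfer3D build will likely fail")
    else:
        # Show the compiler version so it can be compared with the
        # CUDA version used by the Python environment.
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
        print(result.stdout.strip())
```

If the check reports that nvcc is missing, fix the CUDA toolkit installation first and then rerun the build command.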

1. Data Preparation

A. Download The View-of-Delft dataset (VoD)

The VoD dataset is organized as follows:

PATH_TO_VOD_DATASET
    ├── image_2
    │   ├── 00001.jpg
    │   └── ...
    ├── pose
    │   ├── 00001.json
    │   └── ...
    ├── label_2_withid
    │   ├── 00001.txt
    │   └── ...
    ├── lidar
    │   ├── training
    │   │   └── velodyne
    │   │       ├── 00001.bin
    │   │       └── ...
    │   └── calib
    │       ├── 00001.txt
    │       └── ...
    └── radar
        ├── training
        │   └── velodyne
        │       ├── 00001.bin
        │       └── ...
        └── calib
            ├── 00001.txt
            └── ...
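
A quick way to catch a misplaced folder before running the preprocessing scripts is to check the layout programmatically. The following is a minimal sketch, not part of the repository: the function name missing_dirs is hypothetical, and the list of directories assumes that calib sits next to training, as drawn in the tree above.

```python
from pathlib import Path

# Sub-directories taken from the layout above, relative to PATH_TO_VOD_DATASET.
# Assumption: calib is a sibling of training, as the tree is drawn.
EXPECTED_DIRS = [
    "image_2",
    "pose",
    "label_2_withid",
    "lidar/training/velodyne",
    "lidar/calib",
    "radar/training/velodyne",
    "radar/calib",
]


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Placeholder path; replace it with your own dataset location.
    missing = missing_dirs("PATH_TO_VOD_DATASET")
    print("missing:", missing if missing else "none")
```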

B. Generate VoD Scene Flow Dataset in .h5 format

In each script you need to run, the places where a PATH or MODE must be changed for reproduction are marked with "TO DO".

# Change the paths in gen_ra_gt_flow.py.
# Switch between the val and train modes to generate the training set and validation set separately.

cd ./dataprocess
python gen_ra_gt_flow.py
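
Since the places to edit are marked with "TO DO", a small scan can list them before you run anything. This is a minimal sketch, not part of the repository: the function name find_todo_markers is hypothetical, and the exact marker spelling in the scripts is an assumption taken from the sentence above (pass marker="TODO" if the scripts spell it that way).

```python
from pathlib import Path


def find_todo_markers(root, marker="TO DO"):
    """Yield (path, line_number, line) for every .py line under root containing marker."""
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if marker in line:
                yield path, number, line.strip()


if __name__ == "__main__":
    # Run from the directory you want to scan, for example ./dataprocess.
    for path, number, line in find_todo_markers("."):
        print(f"{path}:{number}: {line}")
```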

C. Generate 2D Tracking boxes for VoD sequences with YOLOv11 model

We adopt the DeepSORT 2D tracking algorithm from YOLOv11-DeepSort.

We use the official pretrained YOLOv11-L model weights: Yolov11-L

# You need to change the **PATH** or **MODE** in ./dataprocess/YOLOv11-DeepSort/my_yolov11.py
# Switch between the val and train modes to generate 2D tracking boxes for the training set and validation set separately.

cd ./dataprocess/YOLOv11-DeepSort
python my_yolov11.py

D. Generate 2D Segmentation Masks with SAM Model and Project to 3D Space

We use the pretrained SAM model (SAM with ViT-H) to generate an instance-level mask for each 2D tracking box from the previous step.

A per-point instance ID is then generated for the radar point clouds based on 2D-3D projection.

# You need to change the **PATH** or **MODE** in ./dataprocess/yolo11_deepsort_segany.py
# Switch between the val and train modes to generate per-point instance IDs for the training set and validation set separately.

cd ./dataprocess
python yolo11_deepsort_segany.py
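
To illustrate the idea behind the 2D-3D projection, the sketch below projects 3D points into the image with a pinhole model and reads the instance ID of the pixel each point lands on. It is not the code in yolo11_deepsort_segany.py. The function name assign_instance_ids, the intrinsics fx, fy, cx, cy, and the convention that 0 means background and -1 means "not visible" are all assumptions made for the example, and the points are assumed to be already transformed from the radar frame into the camera frame.

```python
import math


def assign_instance_ids(points_cam, id_mask, fx, fy, cx, cy):
    """Give each 3D point the instance ID of the pixel it projects to.

    points_cam: iterable of (x, y, z) in the camera frame, z pointing forward.
    id_mask: 2D list indexed as id_mask[row][col]; 0 means background.
    Returns one ID per point; -1 for points behind the camera or outside the image.
    """
    height = len(id_mask)
    width = len(id_mask[0]) if height else 0
    ids = []
    for x, y, z in points_cam:
        if z <= 0:
            # Points at or behind the image plane cannot be projected.
            ids.append(-1)
            continue
        u = math.floor(fx * x / z + cx)  # pixel column
        v = math.floor(fy * y / z + cy)  # pixel row
        if 0 <= u < width and 0 <= v < height:
            ids.append(id_mask[v][u])
        else:
            ids.append(-1)
    return ids


if __name__ == "__main__":
    # Toy 4x4 mask with one instance (ID 7) in the centre.
    mask = [
        [0, 0, 0, 0],
        [0, 7, 7, 0],
        [0, 7, 7, 0],
        [0, 0, 0, 0],
    ]
    points = [(0.0, 0.0, 1.0), (-1.0, -1.0, 1.0), (0.0, 0.0, -1.0), (5.0, 0.0, 1.0)]
    print(assign_instance_ids(points, mask, fx=2.0, fy=2.0, cx=2.0, cy=2.0))
    # prints [7, 0, -1, -1]
```

In practice the mask would come from SAM and the tracking ID from the 2D tracker, and the intrinsics and the radar-to-camera transform would come from the calibration files.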

2. Training

# You need to change the **PATH** in ./conf/config.yaml
# Also check the GPU settings

cd ..
python train.py

See /checkpoint for our trained model.

3. Evaluation

# You need to change the **PATH** in ./conf/eval.yaml

# eval.py is in the repository root; run cd .. only if you are still inside ./dataprocess
cd ..
python eval.py

Cite & Acknowledgements

❤️: OpenSceneFlow ❤️: CMFlow ❤️: PV-RAFT ❤️: YOLOv11-DeepSort ❤️: segment-anything
