Zhiying Song · Lei Yang · Fuxi Wen · Jun Li
For any questions, please feel free to email me at song-zy24@mails.tsinghua.edu.cn.
Cooperative perception presents significant potential for enhancing the sensing capabilities of individual vehicles; however, inter-agent latency remains a critical challenge. Latencies cause misalignments in both spatial and semantic features, complicating the fusion of real-time observations from the ego vehicle with delayed data from others. To address these issues, we propose TraF-Align, a novel framework that learns the flow path of features by predicting the feature-level trajectory of objects from past observations up to the ego vehicle’s current time. By generating temporally ordered sampling points along these paths, TraF-Align directs attention from the current-time query to relevant historical features along each trajectory, supporting the reconstruction of current-time features and promoting semantic interaction across multiple frames. This approach corrects spatial misalignment and ensures semantic consistency across agents, effectively compensating for motion and achieving coherent feature fusion. Experiments on two real-world datasets, V2V4Real and DAIR-V2X-Seq, show that TraF-Align sets a new benchmark for asynchronous cooperative perception.
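For intuition, the sketch below illustrates the general idea of trajectory-guided temporal sampling with attention: predict per-location sampling points that trace an object's feature-level trajectory through past BEV feature maps, sample those locations, and fuse them under attention driven by the current-time query. This is a simplified, self-contained example, not the TraF-Align implementation; all tensor shapes, names, and the attention form are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def traj_guided_fusion(query, past_feats, sample_xy):
    """Illustrative trajectory-guided fusion (not the official TraF-Align code).

    query:      (B, C, H, W)     current-time ego BEV features
    past_feats: (B, T, C, H, W)  delayed BEV features from T past frames
    sample_xy:  (B, T, H, W, 2)  predicted sampling points (normalized to [-1, 1])
                                 along each location's trajectory, ordered in time
    """
    B, T, C, H, W = past_feats.shape
    # Bilinearly sample each past frame at its predicted trajectory point.
    sampled = torch.stack([
        F.grid_sample(past_feats[:, t], sample_xy[:, t],
                      mode='bilinear', align_corners=False)
        for t in range(T)
    ], dim=1)                                    # (B, T, C, H, W)

    # Attention from the current-time query to the temporally ordered samples.
    q = query.flatten(2).transpose(1, 2)         # (B, H*W, C)
    k = sampled.permute(0, 3, 4, 1, 2).reshape(B, H * W, T, C)
    attn = torch.softmax((q.unsqueeze(2) * k).sum(-1) / C ** 0.5, dim=-1)  # (B, H*W, T)
    fused = (attn.unsqueeze(-1) * k).sum(2)      # (B, H*W, C)
    return fused.transpose(1, 2).reshape(B, C, H, W)

if __name__ == "__main__":
    B, T, C, H, W = 1, 3, 64, 32, 32
    out = traj_guided_fusion(torch.randn(B, C, H, W),
                             torch.randn(B, T, C, H, W),
                             torch.rand(B, T, H, W, 2) * 2 - 1)
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```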
# Create a conda environment
conda create -n TraF-Align python=3.9.13
conda activate TraF-Align
# install torch, tested with CUDA 12.0
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu121
# install packages
pip install -r requirements.txt
# install spconv, tested with spconv 2.1.35
pip install spconv-cu111
# install the CUDA extension for 3D bounding-box IoU / NMS
cd models/ops/iou3d_nms && python setup.py build_ext --inplace
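Optionally, a quick sanity check that the environment is usable; this snippet is not part of the repository and assumes the steps above completed without errors:

```python
# Quick environment check: confirm GPU visibility and that spconv imports.
import torch
import spconv.pytorch  # provided by the spconv wheel installed above

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("spconv.pytorch imported OK")
```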
We provide checkpoints of TraF-Align trained on V2V4Real and V2X-Seq in checkpoints/. First set the configs in tools/inference.py, including the checkpoint folder path, the batch size, and the latency of the ego vehicle and the cooperative agents; then set root_dir to the dataset path in ${CONFIG_FILE}, and test with
python tools/inference.py
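The exact option names live in tools/inference.py; the snippet below only illustrates the kind of values to set there, and every variable name and path in it is hypothetical and may not match the script:

```python
# Hypothetical illustration of the settings described above;
# check tools/inference.py for the actual variable names.
checkpoint_folder = "checkpoints/TraF-Align_v2v4real"  # hypothetical path to a trained model
batch_size = 1
ego_delay = 0    # latency applied to the ego vehicle
coop_delay = 1   # latency applied to the cooperative agents
```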
Following OpenCOOD, TraF-Align uses a YAML file to configure all training parameters. To train your own model from scratch, first set root_dir to the dataset path in ${CONFIG_FILE}, and then run the following command:
python tools/train.py --hypes_yaml ${CONFIG_FILE}
Arguments Explanation:
hypes_yaml: the path of the training configuration file, e.g. 'hypes_yaml/v2v4real/v2v4real_TraF-Align.yaml'.
Because we use the one_cycle learning-rate scheduler, the schedule depends on the total number of training epochs, so resuming training from a checkpoint is not supported yet.
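For reference, PyTorch's OneCycleLR is parameterized by the total step budget, so the shape of the learning-rate curve is fixed when training starts and restarting mid-way with a different budget changes it. A minimal sketch (illustrative only, not the project's training loop, with hypothetical numbers):

```python
import torch

# Illustrative only: OneCycleLR ties the LR curve to the total step budget.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.01,
    epochs=40, steps_per_epoch=100,  # hypothetical numbers, not the paper's settings
)
for _ in range(5):
    optimizer.step()
    scheduler.step()
print(scheduler.get_last_lr())
```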
If you use multiple GPUs, run the following commands:
CUDA_VISIBLE_DEVICES=${gpus} python -m torch.distributed.launch --nproc_per_node=${gpu_number} --use_env tools/train.py --hypes_yaml ${hypes_yaml}
Arguments Explanation:
gpus: the visible GPU ids for training, e.g. '1,2,3,4,5,6,7,8,9'.
gpu_number: the number of GPUs, e.g. 9.
hypes_yaml: the path of the training configuration file, e.g. 'hypes_yaml/v2v4real/v2v4real_TraF-Align.yaml'.
@inproceedings{song2025traf,
title={Traf-align: Trajectory-aware feature alignment for asynchronous multi-agent perception},
author={Song, Zhiying and Yang, Lei and Wen, Fuxi and Li, Jun},
booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference},
pages={12048--12057},
year={2025}
}
TraF-Align is built upon OpenCOOD; many thanks for the high-quality codebase.