This repository provides the official PyTorch implementation of the following paper (CVPR 2023):
Event-guided Person Re-Identification via Sparse-Dense Complementary Learning
Chengzhi Cao, Xueyang Fu*, Hongjian Liu, Yukun Huang, Kunyu Wang, Jiebo Luo, Zheng-Jun Zha
Video-based person re-identification (Re-ID) is a prominent computer vision topic due to its wide range of video surveillance applications. Most existing methods utilize spatial and temporal correlations in frame sequences to obtain discriminative person features. However, inevitable degradations, e.g., motion blur contained in frames often cause ambiguity texture noise and temporal disturbance, leading to the loss of identity-discriminating cues. Recently, a new bio-inspired sensor called event camera, which can asynchronously record intensity changes, brings new vitality to the Re-ID task. With the microsecond resolution and low latency, event cameras can accurately capture the movements of pedestrians even in the aforementioned degraded environments. Inspired by the properties of event cameras, in this work, we propose a Sparse-Dense Complementary Learning Framework, which effectively extracts identity features by fully exploiting the complementary information of dense frames and sparse events. Specifically, for frames, we build a CNN-based module to aggregate the dense features of pedestrian appearance step-by-step, while for event streams, we design a bio-inspired spiking neural backbone, which encodes event signals into sparse feature maps in a spiking form, to present the dynamic motion cues of pedestrians. Finally, a cross feature alignment module is constructed to complementarily fuse motion information from events and appearance cues from frames to enhance identity representation learning. Experiments on several benchmarks show that by employing events and SNN into Re-ID, our method significantly outperforms competitive methods.
The code depends on the following packages:
- Python
- PyTorch (1.4)
- scikit-image
- opencv-python
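Before running anything, it can help to confirm that the listed packages are importable. The snippet below is a minimal sketch, not part of this repository: the helper name `find_missing` is hypothetical, and the import names (`torch`, `skimage`, `cv2`) are the usual ones for these packages, which differ from their pip names.

```python
import importlib.util

# Import names for the packages listed above. Note that the pip name
# (e.g. opencv-python) can differ from the import name (cv2).
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "scikit-image": "skimage",
    "opencv-python": "cv2",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the package names whose import module cannot be located."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

This only checks that each module can be found; it does not verify versions such as PyTorch 1.4.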
Experiments are conducted on MARS, as it is the largest dataset available to date for video-based person Re-ID. Please follow deep-person-reid to prepare the data. The instructions are copied here:
- Create a directory named mars/.
- Download the dataset to mars/ from http://www.liangzheng.com.cn/Project/project_mars.html.
- Extract bbox_train.zip and bbox_test.zip.
- Download the split information from https://github.com/liangzheng06/MARS-evaluation/tree/master/info and put info/ in data/mars (we want to follow the standard split in [5]).
- Download mars_attributes.csv from http://irip.buaa.edu.cn/mars_duke_attributes/index.html and put the file in data/mars. The data structure would look like:
mars/
    bbox_test/
    bbox_train/
    info/
    mars_attributes.csv
- Change the global variable _C.DATASETS.ROOT_DIR to /path2mars/mars and _C.DATASETS.NAME to mars in config or configs.
- Use V2E to generate the corresponding event sequence. The event version of MARS is very large (almost 20,000 videos). We will upload our data to this link in the coming weeks: https://pan.baidu.com/s/1jont6AXijx3bwLzeHblwnw (password: y762)
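A quick way to confirm that the MARS directory matches the tree shown above is to check for the four expected entries. This is a minimal sketch, not part of the repository: the function name `missing_mars_entries` is hypothetical, and the `data/mars` path in the usage lines is an assumption that should be replaced by whatever you set in _C.DATASETS.ROOT_DIR.

```python
from pathlib import Path

# Entries taken from the directory tree in the instructions above.
EXPECTED_ENTRIES = ("bbox_train", "bbox_test", "info", "mars_attributes.csv")


def missing_mars_entries(root):
    """Return the expected entries that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # "data/mars" is an assumed location; use your own dataset root here.
    missing = missing_mars_entries("data/mars")
    print("Missing entries:", missing if missing else "none")
```

The check covers only the top-level entries; it does not validate the contents of the archives or the split files.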
For iLIDS-VID:
- Create a directory named ilids-vid/ under data/.
- Download the dataset from http://www.eecs.qmul.ac.uk/~xiatian/downloads_qmul_iLIDS-VID_ReID_dataset.html to ilids-vid/.
- Download the event sequence from https://pan.baidu.com/s/19BgDlcbeKtt7EySNpD8gpw (password: 5jdg).
- Organize the data structure to match:
ilids-vid/
    i-LIDS-VID/
    i-LIDS-VID—event/
    train-test people splits
For PRID-2011:
- Create a directory named PRID/ under data/.
- Download the dataset and event sequence from https://pan.baidu.com/s/13OTKjwcfbrQQDbDtPyEYRA (password: 5olr).
- Organize the data structure to match:
PRID/
    prid_2011/
    prid_2011_event/
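Since each dataset pairs a frame directory with an event directory, it may be worth checking that every frame sequence has an event counterpart before training. The sketch below assumes the event directory mirrors the subdirectory layout of the frame directory. That layout is an assumption, not something this README states, and the function names are hypothetical.

```python
from pathlib import Path


def relative_subdirs(root):
    """Collect every subdirectory under `root` as a path relative to `root`."""
    root = Path(root)
    return {p.relative_to(root) for p in root.rglob("*") if p.is_dir()}


def unmatched_sequences(frame_root, event_root):
    """Return frame subdirectories that have no counterpart under event_root.

    Assumes (hypothetically) that the event tree mirrors the frame tree.
    """
    unmatched = relative_subdirs(frame_root) - relative_subdirs(event_root)
    return sorted(p.as_posix() for p in unmatched)
```

For example, `unmatched_sequences("data/PRID/prid_2011", "data/PRID/prid_2011_event")` would list any frame folders without an event folder of the same relative path. If the downloaded event data uses a different layout, the comparison needs to be adapted.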
To train SDCL, run the command below:
python Train_event_vid.py --arch 'model_name' \
    --config_file "./configs/softmax_triplet.yml" \
    --dataset 'prid_event_vid' \
    --test_sampler 'Begin_interval' \
    --triplet_distance 'cosine' \
    --test_distance 'cosine' \
    --seq_len 8
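The 'cosine' value passed to --triplet_distance and --test_distance refers to a cosine-based distance between feature vectors. The sketch below illustrates one common definition, 1 minus cosine similarity. Whether the repository uses exactly this form is an assumption, and the function names are hypothetical. It is written in plain Python so that it runs without dependencies; in practice the same computation would be done on PyTorch tensors.

```python
import math


def l2_normalize(vector):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0 else [x / norm for x in vector]


def cosine_distance_matrix(queries, gallery):
    """Return D where D[i][j] = 1 - cos(queries[i], gallery[j])."""
    q_unit = [l2_normalize(q) for q in queries]
    g_unit = [l2_normalize(g) for g in gallery]
    return [
        [1.0 - sum(a * b for a, b in zip(q, g)) for g in g_unit]
        for q in q_unit
    ]
```

With this definition, features pointing in the same direction have distance 0, orthogonal features have distance 1, and opposite features have distance 2, regardless of their magnitudes.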
To test SDCL, run the command below:
python Test.py --arch 'model_name' \
    --dataset 'prid_event_vid' \
    --test_sampler 'Begin_interval' \
    --triplet_distance 'cosine' \
    --test_distance 'cosine'
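Re-ID results are usually reported as CMC (rank-k accuracy) and mAP, both computed from a query-by-gallery distance matrix. The following is a simplified sketch of those two metrics, not the repository's evaluation code: the function name `evaluate` and its arguments are hypothetical, and it omits the camera-based filtering of gallery entries that Re-ID benchmarks commonly apply.

```python
def evaluate(dist, query_ids, gallery_ids, max_rank=5):
    """Compute CMC (up to max_rank) and mAP from a distance matrix.

    dist[i][j] is the distance between query i and gallery item j.
    Simplification: no camera-based filtering is applied.
    """
    cmc_hits = [0] * max_rank
    average_precisions = []
    for row, qid in zip(dist, query_ids):
        # Rank gallery indices from nearest to farthest.
        order = sorted(range(len(row)), key=row.__getitem__)
        matches = [gallery_ids[j] == qid for j in order]
        if not any(matches):
            continue  # queries with no true match are skipped
        # A query counts as a hit at rank k if its first match is within k.
        first_hit = matches.index(True)
        for k in range(first_hit, max_rank):
            cmc_hits[k] += 1
        # Average precision: mean of the precision at each true-match position.
        hits, precisions = 0, []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    n_valid = len(average_precisions)
    if n_valid == 0:
        raise ValueError("no query has a matching gallery identity")
    cmc = [h / n_valid for h in cmc_hits]
    mean_ap = sum(average_precisions) / n_valid
    return cmc, mean_ap
```

The first element of the returned CMC list is the rank-1 accuracy. The distance matrix can come from any metric, including the cosine distance selected by --test_distance.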
More experimental results can be found in the paper.
This project is licensed under the MIT License - see the LICENSE.md file for details.
[1] Howard et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv 2017.
[2] He et al. Deep Residual Learning for Image Recognition. CVPR 2016.
[3] Hirzer et al. Person Re-Identification by Descriptive and Discriminative Classification. SCIA 2011.
[4] Wang et al. Person Re-Identification by Video Ranking. ECCV 2014.
[5] Zheng et al. MARS: A Video Benchmark for Large-Scale Person Re-identification. ECCV 2016.
The evaluation code (cmc & mAP) is partially borrowed from the MARS-evaluation repository.
Please consider citing the following paper if you find our code helpful. Thank you!
@InProceedings{Cao_2023_CVPR,
author = {Cao, Chengzhi and Fu, Xueyang and Liu, Hongjian and Huang, Yukun and Wang, Kunyu and Luo, Jiebo and Zha, Zheng-Jun},
title = {Event-Guided Person Re-Identification via Sparse-Dense Complementary Learning},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2023},
pages = {17990-17999}
}
Should you have any questions, please contact chengzhicao@mail.ustc.edu.cn.