Course project members: Lucas Tao (lucastao) and Nanyan Zhu (nz2305).
External member: Chen Liu.
Multi-object tracking (MOT) aims to identify and keep track of all objects in a video. Under the mainstream formulation, MOT consists of two main stages: detection and association. In the detection stage, individual objects are recognized, usually as bounding boxes that each carry a confidence score. In the association stage, an algorithm determines the correspondences between the current detections and previous detections (sometimes referred to as "tracklets").
While the detection stage is witnessing tremendous progress as detectors gain power and efficiency, the association stage has received less attention. Intriguingly, many state-of-the-art MOT methods still use very rudimentary approaches for association, such as the Hungarian matching algorithm. End-to-end learning-based methods for the data association stage do exist, but they have not been widely adopted. One main argument against such data-hungry methods is the scarcity of labeled data for tracking.
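To make the association stage concrete, here is a minimal sketch of Hungarian matching between existing tracks and new detections, using IoU as the matching score. It is an illustration only, not code from this repository: the function names `iou_matrix` and `associate`, the `(x1, y1, x2, y2)` box format, and the `min_iou` threshold are all assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou_matrix(tracks, detections):
    """Pairwise IoU between two sets of boxes given as (x1, y1, x2, y2)."""
    tracks = np.asarray(tracks, dtype=float).reshape(-1, 4)
    detections = np.asarray(detections, dtype=float).reshape(-1, 4)
    # Intersection corners, broadcast to shape (num_tracks, num_detections).
    x1 = np.maximum(tracks[:, None, 0], detections[None, :, 0])
    y1 = np.maximum(tracks[:, None, 1], detections[None, :, 1])
    x2 = np.minimum(tracks[:, None, 2], detections[None, :, 2])
    y2 = np.minimum(tracks[:, None, 3], detections[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_t = (tracks[:, 2] - tracks[:, 0]) * (tracks[:, 3] - tracks[:, 1])
    area_d = (detections[:, 2] - detections[:, 0]) * (detections[:, 3] - detections[:, 1])
    union = area_t[:, None] + area_d[None, :] - inter
    return inter / np.maximum(union, 1e-9)


def associate(tracks, detections, min_iou=0.3):
    """Return (track_index, detection_index) pairs chosen by Hungarian matching.

    min_iou is a hypothetical threshold: pairs below it are left unmatched.
    """
    iou = iou_matrix(tracks, detections)
    if iou.size == 0:
        return []
    # linear_sum_assignment minimizes cost, so negate IoU to maximize overlap.
    rows, cols = linear_sum_assignment(-iou)
    return [(int(r), int(c)) for r, c in zip(rows, cols) if iou[r, c] >= min_iou]
```

Detections left unmatched would typically start new tracks, and unmatched tracks would be aged out; that bookkeeping is omitted here.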
In this project, we propose a data augmentation approach to generate synthetically labeled tracking datasets from existing labeled tracking data. The approach will "manipulate the trajectories" of persons in the annotated video stream.
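The annotation side of such a manipulation can be sketched as follows. The text above does not specify how trajectories are changed, so this example assumes the simplest case: translating one person's boxes by a fixed offset and dropping boxes that leave the frame. The `Box` record and `shift_track` function are hypothetical names for illustration, and the matching image edits (segmenting the person, inpainting the old location, blending at the new one) are not shown.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Box:
    """One annotation row: a person's box in a single frame (pixels)."""
    frame: int
    track_id: int
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def shift_track(boxes, track_id, dx, dy, frame_w, frame_h):
    """Translate every box of one track by (dx, dy).

    Boxes of other tracks are returned unchanged. Shifted boxes that no
    longer fit entirely inside the frame are dropped (an assumed policy).
    """
    out = []
    for b in boxes:
        if b.track_id != track_id:
            out.append(b)
            continue
        moved = replace(b, x=b.x + dx, y=b.y + dy)
        inside = (
            moved.x >= 0
            and moved.y >= 0
            and moved.x + moved.w <= frame_w
            and moved.y + moved.h <= frame_h
        )
        if inside:
            out.append(moved)
    return out
```

Because the new labels are derived directly from the old ones, the synthetic sequence stays fully annotated without any manual labeling.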
- Clone this repository.
- Add the missing files:
  - Download `mot.tar` and unzip it at `SynMOT/datasets/`.
  - Download `human_segmenter_checkpoints.tar` and unzip it at `SynMOT/src/modules/human_segmenter/checkpoints/`.
  - Download `image_blender_checkpoints.tar` and unzip it at `SynMOT/src/modules/image_blender/checkpoints/`.
  - Download `image_inpainter.zip` and unzip it at `SynMOT/src/modules/image_inpainter/checkpoints/`.
- Create a proper environment:
  - For Docker users, a Docker image is provided.
  - For virtualenv users, create a new environment with `python3 -m virtualenv venv`, then run `pip3 install -r requirements.txt`.
- Run `main.py` from the `src` directory: `cd src`, then `python main.py`.
- The provided Docker image does not work with certain GPUs. To run the script on the CPU instead, use `CUDA_VISIBLE_DEVICES=-1 python main.py`.
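The file-placement steps above can also be scripted. The following is a minimal sketch, not part of the repository: it assumes the four archives have already been downloaded into one local folder, and that the destination paths are interpreted relative to the directory containing the `SynMOT` clone. The names `ARCHIVES` and `unpack_all` are hypothetical.

```python
import shutil
from pathlib import Path

# Archive name -> destination directory, as listed in the setup steps.
ARCHIVES = {
    "mot.tar": "SynMOT/datasets/",
    "human_segmenter_checkpoints.tar": "SynMOT/src/modules/human_segmenter/checkpoints/",
    "image_blender_checkpoints.tar": "SynMOT/src/modules/image_blender/checkpoints/",
    "image_inpainter.zip": "SynMOT/src/modules/image_inpainter/checkpoints/",
}


def unpack_all(download_dir, base_dir, archives=ARCHIVES):
    """Unpack every archive found in download_dir under base_dir.

    base_dir is assumed to be the directory that contains the SynMOT clone.
    Returns the names of archives that were not found, so the caller can
    see what still needs to be downloaded.
    """
    missing = []
    for name, dest in archives.items():
        archive = Path(download_dir) / name
        if not archive.is_file():
            missing.append(name)
            continue
        target = Path(base_dir) / dest
        target.mkdir(parents=True, exist_ok=True)
        # The archive format (.tar or .zip) is inferred from the file extension.
        shutil.unpack_archive(str(archive), str(target))
    return missing
```

For example, `unpack_all("downloads", ".")` would unpack whatever is present in a local `downloads` folder and return the names of any archives still missing.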
This work builds on the following repositories:
- SiamMask as our human segmentation module.
- LaMa as our image inpainting module.
- Image Harmonization as our image blending module.