
T-CoRe

This is the official code for the paper "When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning", accepted by the Conference on Computer Vision and Pattern Recognition (CVPR 2025). The paper is available here.

Paper | Slides | Website | Video

When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning

Authors: Yang Liu, Qianqian Xu*, Peisong Wen, Siran Dai, Qingming Huang*

![Pipeline of T-CoRe](assets/pipeline.png)

🚩 Checkpoints

| Dataset  | Backbone | Epochs | $J\&F_m$ | mIoU | PCK@0.1 | Download |
| -------- | -------- | ------ | -------- | ---- | ------- | -------- |
| ImageNet | ViT-S/16 | 100    | 64.1     | 39.7 | 46.2    | link     |
| K400     | ViT-S/16 | 400    | 64.7     | 37.8 | 47.0    | link     |
| K400     | ViT-B/16 | 200    | 66.4     | 38.9 | 47.1    | link     |
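
The downloaded checkpoints can be inspected with plain PyTorch before plugging them into the training or evaluation code. The snippet below is only a minimal sketch: the filename is hypothetical, and the actual key layout of the released files (e.g. nested teacher/student dictionaries) may differ.

```python
import torch

# Hypothetical filename for the K400 ViT-S/16 checkpoint; adjust to the file you downloaded.
ckpt = torch.load("tcore_vits16_k400.pth", map_location="cpu")

# If the checkpoint wraps the weights in a "state_dict" entry, unwrap it first.
state_dict = ckpt["state_dict"] if isinstance(ckpt, dict) and "state_dict" in ckpt else ckpt

# Print a few entry names and shapes to check what the backbone expects.
for name, value in list(state_dict.items())[:5]:
    shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
    print(name, shape)
```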

💻 Environments

  • Ubuntu 20.04
  • CUDA 12.4
  • Python 3.9
  • PyTorch 2.2.0

See requirements.txt for the remaining dependencies.

🔧 Installation

  1. Clone this repository

    git clone https://github.com/yafeng19/T-CORE.git
  2. Create and activate a conda environment with Python 3.9

    conda create --name T_CORE python=3.9
    conda activate T_CORE
  3. Install the required libraries

    pip install -r requirements.txt
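
After installation, a quick sanity check confirms that the PyTorch build matches the versions listed in the Environments section and that CUDA is visible:

```python
import torch

print(torch.__version__)          # expected: 2.2.x (see Environments)
print(torch.version.cuda)         # CUDA toolkit the wheel was built against
print(torch.cuda.is_available())  # True if the GPU driver is set up correctly
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```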

🚀 Training

Dataset

  1. Download the Kinetics-400 training set.
  2. Use third-party tools or scripts to extract frames from the original videos (a minimal example is sketched after the directory layout below).
  3. Place the extracted frames in data/Kinetics-400/frames/train.
  4. Generate the files for the training data with python base_model/tools/dump_files.py and place them in data/Kinetics-400/frames.
  5. Integrate the frames and files into the following structure:
    T-CoRe
    ├── data
    │   └── Kinetics-400
    │       └── frames
    │           ├── train
    │           │   ├── class_1
    │           │   │   ├── video_1
    │           │   │   │   ├── 00000.jpg
    │           │   │   │   ├── 00001.jpg
    │           │   │   │   ├── ...
    │           │   │   │   └── 00019.jpg
    │           │   │   ├── ...
    │           │   │   └── video_m
    │           │   ├── ...
    │           │   └── class_n
    │           ├── class-ids-TRAIN.npy
    │           ├── class-names-TRAIN.npy
    │           ├── entries-TRAIN.npy
    │           └── labels.txt
    ├── base_model
    └── scripts
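
For step 2, any frame extraction tool works as long as the output matches the layout above. The sketch below uses OpenCV to sample evenly spaced frames from a single video; the frame count and sampling strategy are illustrative assumptions, not the exact preprocessing used in the paper, and the paths are hypothetical.

```python
import os
import cv2  # pip install opencv-python

def extract_frames(video_path, out_dir, num_frames=20):
    """Sample `num_frames` evenly spaced frames and save them as 00000.jpg, 00001.jpg, ..."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    if total <= 0:
        cap.release()
        raise RuntimeError(f"Could not read frames from {video_path}")
    indices = [round(i * (total - 1) / max(num_frames - 1, 1)) for i in range(num_frames)]
    for out_idx, frame_idx in enumerate(indices):
        cap.set(cv2.CAP_PROP_POS_FRAMES, frame_idx)
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"{out_idx:05d}.jpg"), frame)
    cap.release()

# Example (hypothetical paths matching the layout above):
# extract_frames("raw_videos/class_1/video_1.mp4",
#                "data/Kinetics-400/frames/train/class_1/video_1")
```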
    

Scripts

We provide a script with default parameters. Run the following command for training.

bash scripts/pretrain.sh

Trained models are provided here (see also the Checkpoints section above).

📊 Evaluation

Dataset

In our paper, three dense-level benchmarks are adopted for evaluation.

| Dataset | Video Task                | Download link |
| ------- | ------------------------- | ------------- |
| DAVIS   | Video Object Segmentation | link          |
| JHMDB   | Human Pose Propagation    | link          |
| VIP     | Semantic Part Propagation | link          |
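
As a reference for the reported numbers, PCK@0.1 on JHMDB counts a propagated keypoint as correct when it lies within 0.1 of a reference scale from the ground truth. The function below is a generic sketch of that metric, not the benchmark's official evaluation code; the exact normalization follows each benchmark's protocol.

```python
import numpy as np

def pck(pred, gt, scale, alpha=0.1):
    """Percentage of Correct Keypoints.

    pred, gt: (N, 2) arrays of keypoint coordinates.
    scale:    reference length (e.g. person bounding-box size) used for normalization.
    A keypoint counts as correct if its error is at most alpha * scale.
    """
    dist = np.linalg.norm(np.asarray(pred) - np.asarray(gt), axis=1)
    return float((dist <= alpha * scale).mean())

# Toy example: only the first of two keypoints lies within 0.1 * 50 = 5 pixels.
print(pck([[10.0, 12.0], [30.0, 31.0]], [[11.0, 12.0], [40.0, 40.0]], scale=50.0))  # 0.5
```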

Scripts

We provide a script with default parameters. Run the following command for evaluation.

bash scripts/eval.sh

🖋️ Citation

If you find this repository useful in your research, please cite the following paper:

@misc{liu2025futurepasttamingtemporal,
      title={When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning}, 
      author={Yang Liu and Qianqian Xu and Peisong Wen and Siran Dai and Qingming Huang},
      year={2025},
      eprint={2503.15096},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.15096}, 
}

📧 Contact us

If you have any questions or suggestions, please email us at liuyang232@mails.ucas.ac.cn. We will reply within 1-2 business days. Thanks for your interest in our work!

🌟 Acknowledgements

  • Our code is based on the official PyTorch implementation of DINOv2.
  • The evaluation code is based on CropMAE.
