Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training

About

UNITE (CVPR 2024, https://arxiv.org/abs/2312.02914) is a three-stage approach for unsupervised video domain adaptation that uses a powerful image-based teacher model to adapt a video student model to the target domain. In the first stage, the student is pre-trained on target-domain videos without supervision using the Unmasked Teacher objective. In the second stage, it is fine-tuned with supervision on source-domain videos. In the third stage, collaborative self-training uses both student and teacher predictions to further adapt the student to the target domain.
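As a rough sketch of the stage-3 idea, the student's and teacher's predictions can be fused into pseudo-labels for self-training. The averaging fusion and confidence threshold below are illustrative assumptions, not the paper's exact scheme:

import torch
import torch.nn.functional as F

def collaborative_pseudo_label(student_logits, teacher_logits, threshold=0.9):
    # Average the two models' class distributions (a simple fusion assumed
    # here for illustration; the paper's actual combination rule may differ).
    probs = (F.softmax(student_logits, dim=-1) + F.softmax(teacher_logits, dim=-1)) / 2
    confidence, label = probs.max(dim=-1)
    # Keep only confident predictions as pseudo-labels for the self-training loss.
    mask = confidence >= threshold
    return label, mask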

[Teaser figure: overview of the three UNITE training stages]

Getting Started

Data Preparation

Please download the datasets from their original sources and update dataset_mappings.yaml with the correct paths to your .csv annotation files. Each line of an annotation file should be of the form /path/to/video.mp4,<class_id>.
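For example, an annotation file might contain lines like the following (the paths and class IDs are illustrative):

/data/arid/videos/Drink_1_1.mp4,0
/data/arid/videos/Jump_3_2.mp4,4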

To help you get started, we provide data and our annotation files for the ARID-HMDB (A→H) domain shift in Daily-DA here. Remember to update the paths in the annotation files to point to your videos.

Download Checkpoints

Student Model

The student model in UNITE is initialized from the Unmasked Teacher (UMT) checkpoint pre-trained on Kinetics-710 (ViT-B/16). You can find a link to this checkpoint in the UMT repository, or download it directly from here.
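If you want to sanity-check the downloaded checkpoint, a minimal PyTorch sketch follows (the file name and the "model" key are assumptions about the checkpoint layout, not verified details):

import torch

ckpt = torch.load("umt_vitb16_k710.pth", map_location="cpu")  # illustrative file name
state_dict = ckpt.get("model", ckpt)  # some checkpoints nest weights under a "model" key
print(len(state_dict), "tensors; first key:", next(iter(state_dict)))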

Teacher Model

Like UMT, we use CLIP as the teacher model by default.
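For reference, an OpenAI CLIP model can be loaded as below. This is only a sketch: the training scripts handle the teacher internally, and the model name here is a placeholder; check the repository's configs for the variant actually used:

import torch
import clip  # pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
teacher, preprocess = clip.load("ViT-B/16", device=device)  # placeholder variant
teacher.eval()  # the teacher stays frozen during adaptation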

Environment

We recommend creating a conda environment to run UNITE. The environment is specified in environment.yaml; you can create it by running:

conda env create --name unite --file environment.yaml

Running UNITE

Each of UNITE's three stages is implemented in its own Python file. We provide bash scripts (stage<X>.sh) that launch distributed training for each stage; they are currently configured for the ARID-HMDB (A→H) domain shift from Daily-DA as an example.

You will need to update the output directory and the student model's initialization checkpoint path in stage1.sh. In stage2.sh and stage3.sh, set ckpt_path to the output checkpoint of the previous stage.
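A full run of all three stages then looks like this (after editing the paths inside the scripts as described above):

bash stage1.sh   # unsupervised masked pre-training on target-domain videos
bash stage2.sh   # supervised fine-tuning on source-domain videos (ckpt_path = stage 1 output)
bash stage3.sh   # collaborative self-training (ckpt_path = stage 2 output)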

Acknowledgement

This repository builds on Unmasked Teacher.

This research was sponsored by the Army Research Laboratory under Cooperative Agreement W911NF-21-2-0211. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein.

Citation

@inproceedings{reddy2024unite,
  title={Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training},
  author={Reddy, Arun and Paul, William and Rivera, Corban and Shah, Ketul and de Melo, Celso M and Chellappa, Rama},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={18919--18929},
  year={2024}
}

License

This project is released under the MIT License.

Copyright (c) 2024 The Johns Hopkins University Applied Physics Laboratory
