
Video Test-Time Adaptation for Action Recognition (CVPR 2023)

Project Page

ViTTA is the first approach for test-time adaptation of video action recognition models against common distribution shifts. ViTTA is tailored to spatio-temporal models and is capable of adapting on a single video sample at each step. It consists of a feature distribution alignment technique that aligns online estimates of test set statistics towards the training statistics. It further enforces prediction consistency over temporally augmented views of the same test video sample.

Official implementation of ViTTA [arXiv]
Author HomePage
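
To make the method concrete, the following is a minimal PyTorch sketch of the two adaptation objectives described above. It is not the repository's implementation; the feature layout (B, C, T, H, W), the momentum value, and the function names are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def alignment_loss(test_feats, source_mean, source_var,
                       running_mean=None, running_var=None, momentum=0.9):
        # test_feats: (B, C, T, H, W) features of one spatio-temporal layer.
        # source_mean / source_var: precomputed per-channel training statistics, shape (C,).
        # running_mean / running_var: moving-average test-set estimates from previous steps.
        dims = (0, 2, 3, 4)
        batch_mean = test_feats.mean(dim=dims)
        batch_var = test_feats.var(dim=dims, unbiased=False)

        # Online estimates of the test distribution (a single video per step suffices).
        if running_mean is None:
            running_mean, running_var = batch_mean, batch_var
        else:
            running_mean = momentum * running_mean + (1.0 - momentum) * batch_mean
            running_var = momentum * running_var + (1.0 - momentum) * batch_var

        # Align the online test statistics towards the source (training) statistics.
        loss = (running_mean - source_mean).abs().mean() + \
               (running_var - source_var).abs().mean()
        return loss, running_mean.detach(), running_var.detach()

    def consistency_loss(logits_view_a, logits_view_b):
        # Prediction consistency over two temporally augmented views of the same test video.
        p_a = F.softmax(logits_view_a, dim=-1)
        p_b = F.softmax(logits_view_b, dim=-1)
        return (p_a - p_b).abs().sum(dim=-1).mean()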

Requirements

  • Our experiments run on Python 3.6, PyTorch 1.7, and mmcv-full 1.3.12. Other versions should work but have not been tested.
  • Dependencies for mmaction2 (required for the Video Swin Transformer):
    $ pip install mmcv-full==1.3.12
    $ git clone https://github.com/SwinTransformer/Video-Swin-Transformer.git && cd Video-Swin-Transformer
    $ pip install -v -e . --user
  • Other relevant dependencies can be found in requirements.txt
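    They can typically be installed in one step, assuming a standard pip environment:
    $ pip install -r requirements.txt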

Data Preparation

  • Download
    Download the data required for the experiments on UCF101 from here
    list_video_perturbations_ucf: list of files for corrupted videos of UCF101 validation set (in 12 corruption types)
    model_swin_ucf: Video Swin Transformer trained on UCF101 training set
    model_tanet_ucf: TANet trained on UCF101 training set
    source_statistics_tanet_ucf: precomputed source (UCF101 training set) statistics on TANet
    source_statistics_swin_ucf: precomputed source (UCF101 training set) statistics on Video Swin Transformer
    ucf_corrupted_videos: a folder of 12 compressed files containing videos of UCF validation set (in 12 corruption types)
    ucf_corrupted_videos.zip: a single compressed file (83.8GB) containing videos of UCF validation set (in 12 corruption types)
  • Data structure
    Lines in the file lists follow the format: video_path n_frames class_id (a small parsing sketch is given below the directory structure)
    Video dataset structure:
    level_5_ucf_val_split_1_/
      gauss/
        ApplyEyeMakeup/
          v_ApplyEyeMakeup_g01_c01.mp4
          v_ApplyEyeMakeup_g01_c02.mp4
          ...
        ApplyLipstick/
        ...
      contrast/
        ApplyEyeMakeup/
          v_ApplyEyeMakeup_g01_c01.mp4
          v_ApplyEyeMakeup_g01_c02.mp4
          ...
        ApplyLipstick/
      ...
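
    For illustration, here is a hedged sketch of parsing such a file list in Python; the helper name and the tuple layout are assumptions for this example, not part of the repository:

    def load_file_list(list_path):
        # Each line follows the format: video_path n_frames class_id
        records = []
        with open(list_path) as f:
            for line in f:
                parts = line.split()
                if len(parts) != 3:
                    continue  # skip empty or malformed lines
                video_path, n_frames, class_id = parts
                records.append((video_path, int(n_frames), int(class_id)))
        return records

    # e.g. records[0] -> ('level_5_ucf_val_split_1_/gauss/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.mp4', <n_frames>, <class_id>)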
    

Usage

Specify the data paths in the scripts accordingly (see the comments in the scripts).

  • Precompute source statistics on the training set (a minimal sketch of the underlying computation is given after this list)
    precompute source (UCF101 training set) statistics on TANet:
    $ python compute_stats/compute_spatiotemp_stats_clean_train_tanet.py
    precompute source (UCF101 training set) statistics on Video Swin Transformer:
    $ python compute_stats/compute_spatiotemp_stats_clean_train_swin.py
  • Test-time adaptation
    test-time adaptation of TANet on UCF101:
    $ python tta_tanet_ucf101.py
    test-time adaptation of Video Swin Transformer on UCF101:
    $ python tta_swin_ucf101.py
  • Source-only evaluation on corrupted validation data (the same scripts are used, with adaptation disabled)
    $ python tta_tanet_ucf101.py
    $ python tta_swin_ucf101.py
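
As referenced above, here is a minimal sketch of how such source statistics can be computed: forward hooks collect per-channel means and variances of selected layers over the clean training set. The layer selection, tensor layout, and function name are assumptions for illustration; the actual computation lives in the compute_stats scripts.

    import torch

    @torch.no_grad()
    def compute_source_statistics(model, train_loader, layer_names, device="cuda"):
        sums, sq_sums, counts, handles = {}, {}, {}, []

        def make_hook(name):
            def hook(module, inputs, output):
                # output assumed to be (B, C, T, H, W); reduce over every dim except channels
                feat = output.detach().float()
                dims = [d for d in range(feat.dim()) if d != 1]
                n = feat.numel() // feat.shape[1]
                sums[name] = sums.get(name, 0.0) + feat.sum(dim=dims)
                sq_sums[name] = sq_sums.get(name, 0.0) + (feat ** 2).sum(dim=dims)
                counts[name] = counts.get(name, 0) + n
            return hook

        modules = dict(model.named_modules())
        for name in layer_names:
            handles.append(modules[name].register_forward_hook(make_hook(name)))

        model.eval().to(device)
        for clips, _ in train_loader:
            model(clips.to(device))
        for h in handles:
            h.remove()

        # Convert accumulated sums into per-channel mean and variance per layer.
        stats = {}
        for name in layer_names:
            mean = sums[name] / counts[name]
            var = sq_sums[name] / counts[name] - mean ** 2
            stats[name] = (mean.cpu(), var.cpu())
        return stats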

Citation

Thanks for citing our paper:

@inproceedings{lin2023video,
  title={Video Test-Time Adaptation for Action Recognition},
  author={Lin, Wei and Mirza, Muhammad Jehanzeb and Kozinski, Mateusz and Possegger, Horst and Kuehne, Hilde and Bischof, Horst},
  booktitle={CVPR},
  year={2023},
}
