Cross-Attention Transformer for Video Interpolation (TAIN)

This repository contains the inference code and the pre-trained model for our paper:
Cross-Attention Transformer for Video Interpolation
ACCV Workshop 2022 [Vision Transformers: Theory and Applications Workshop at ACCV 2022]
Hannah Kim, Shuzhi Yu, Shuai Yuan, and Carlo Tomasi

Citation

Please cite our paper if you find our code or paper useful.

@InProceedings{Kim_2022_ACCV,
    author    = {Kim, Hannah Halin and Yu, Shuzhi and Yuan, Shuai and Tomasi, Carlo},
    title     = {Cross-Attention Transformer for Video Interpolation},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV) Workshops},
    month     = {December},
    year      = {2022},
    pages     = {320-337}
}

Directory Structure

project
│   README.md
│   main.py - main file to run evaluation
│   config.py - check & change testing configurations here
│   loss.py - defines different loss functions
│   utils.py - misc.
├───run
│   │   eval_vimeo.sh - script to evaluate on Vimeo90k benchmark
│   │   eval_ucf.sh - script to evaluate on UCF101 benchmark
│   │   eval_snu.sh - script to evaluate on SNU-FILM benchmark
│   │   eval_middlebury.sh - script to evaluate on Middlebury benchmark
├───model
│   │   common.py
│   │   tain.py - main model
│   │   vt.py - vision transformer module
├───data - implements dataloaders for each dataset
│   │   vimeo90k.py - main training / testing dataset
│   │   ucf101.py - testing dataset
│   │   snufilm.py - testing dataset
│   │   middlebury.py - testing dataset
└───checkpoint - pre-trained model weights
    └───TAIN
        │   ...

Requirements

The code has been developed with:

Python==3.7.11
numpy==1.20.3
PyTorch==1.8.1, torchvision==0.9.1, cudatoolkit==10.1
tensorboard==2.6.0
opencv==3.4.2

conda create -n tain
conda activate tain
conda install pytorch==1.8.1 torchvision==0.9.1 torchaudio==0.8.1 cudatoolkit=10.2 -c pytorch
conda install tensorboard
pip install einops
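
After installation, it can help to confirm that the environment matches the pinned versions. The following is a minimal sketch, not part of the repository: the helper names (base_version, find_mismatches, installed_versions) are hypothetical, and the expected versions are copied from the conda install command above.

```python
import importlib

# Versions copied from the conda install command above.
EXPECTED = {"torch": "1.8.1", "torchvision": "0.9.1"}


def base_version(version):
    """Drop a local build suffix such as '+cu102' from a version string."""
    return version.split("+", 1)[0]


def find_mismatches(installed, expected=EXPECTED):
    """Return {package: (expected, found)} for missing or differing packages."""
    mismatches = {}
    for name, wanted in expected.items():
        found = installed.get(name)
        if found is None or base_version(found) != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Import each package if available and read its __version__ attribute."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # a missing package is reported later by find_mismatches
        versions[name] = getattr(module, "__version__", None)
    return versions
```

Calling find_mismatches(installed_versions(EXPECTED)) inside the tain environment should return an empty dict when both packages match. Other versions may also work; this check only flags differences from the pinned ones.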

Download the pre-trained model weights and save them in checkpoint/TAIN/.
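
A quick way to confirm that the weights landed in the right place is to list the files in that directory before running evaluation. This is a small sketch under assumptions: list_checkpoint_files is a hypothetical helper, and since the checkpoint file names are not stated here, it only checks that the directory exists and is not empty.

```python
from pathlib import Path


def list_checkpoint_files(checkpoint_dir="checkpoint/TAIN"):
    """Return the sorted regular files under checkpoint_dir, or raise if none exist."""
    root = Path(checkpoint_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} does not exist; create it and place the weights there")
    files = sorted(p for p in root.rglob("*") if p.is_file())
    if not files:
        raise FileNotFoundError(f"{root} is empty; download the pre-trained weights first")
    return files
```

Run it from the repository root so that the relative path resolves to the checkpoint folder shown in the directory structure.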

Dataset Preparation

We use the following datasets for testing:
- Vimeo90K Triplet dataset
- UCF101 dataset
- SNU-FILM (SNU Frame Interpolation with Large Motion) dataset
- Middlebury dataset

After downloading a full dataset, set --data_root to its path and run the matching evaluation script:
- Vimeo90k dataset: run run/eval_vimeo.sh
- UCF101 dataset: run run/eval_ucf.sh
- SNU-FILM dataset: run run/eval_snu.sh
  - Select the testing mode with the --test_mode option (one of 'easy', 'medium', 'hard', 'extreme').
- Middlebury dataset: run run/eval_middlebury.sh
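
The dataset-to-script mapping above can be wrapped in a small launcher. The sketch below is illustrative only: the dataset keys, eval_command, and run_eval are hypothetical names, and it assumes the scripts are run with bash from the repository root and that --data_root and --test_mode have already been set inside the script being run.

```python
import subprocess

# Script paths as listed in this README; the dictionary keys are illustrative labels.
EVAL_SCRIPTS = {
    "vimeo90k": "run/eval_vimeo.sh",
    "ucf101": "run/eval_ucf.sh",
    "snufilm": "run/eval_snu.sh",
    "middlebury": "run/eval_middlebury.sh",
}

# Values accepted by --test_mode for SNU-FILM, per this README.
SNU_TEST_MODES = ("easy", "medium", "hard", "extreme")


def eval_command(dataset):
    """Return the command (as an argument list) that runs the script for a dataset."""
    key = dataset.lower()
    if key not in EVAL_SCRIPTS:
        raise ValueError(f"unknown dataset {dataset!r}; choose from {sorted(EVAL_SCRIPTS)}")
    return ["bash", EVAL_SCRIPTS[key]]


def run_eval(dataset):
    """Run the evaluation script; raises CalledProcessError if it exits non-zero."""
    subprocess.run(eval_command(dataset), check=True)
```

For example, run_eval("ucf101") would execute run/eval_ucf.sh. Building the command as a list avoids shell quoting issues and makes it easy to inspect before running.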

Results

Visualization of our proposed method compared with current state-of-the-art methods on examples from the Vimeo90k and UCF101 datasets.

Acknowledgment & Reference

CAIN by myungsub.
GMA by lmb-zacjiang.
