Temporal Cross-attention for Action Recognition

tamaki-lab/MSCA

1. MSCA: Temporal Cross-attention for Action Recognition

This is the official repository for the paper "Temporal Cross-attention for Action Recognition", presented at the ACCV2022 Workshop on Vision Transformers: Theory and Applications (VTTA-ACCV2022).

1.1. Citation

@InProceedings{Hashiguchi_2022_ACCV,
    author    = {Hashiguchi, Ryota and Tamaki, Toru},
    title     = {Temporal Cross-attention for Action Recognition},
    booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV) Workshops},
    month     = {December},
    year      = {2022},
    pages     = {276-288}
}

1.2. Acknowledgement

We thank the author of TokenShift.

1.4. Implementation

MSCA is built upon TokenShift.

1.5. Weights

Download the ImageNet-22k pretrained weights from Base16. The training command in 1.7 expects them at pretrain/ViT-B_16_Img21.npz.
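
A quick integrity check on the downloaded file can catch a truncated download before training starts. The sketch below relies only on the fact that an .npz file is a zip archive of .npy arrays, so it needs nothing beyond the standard library. The helper name `list_npz_arrays` is hypothetical and is not part of this repo.

```python
import zipfile
from pathlib import Path


def list_npz_arrays(path):
    """Return the array names stored in an .npz file without loading the arrays."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"weights not found: {path}")
    if not zipfile.is_zipfile(path):
        raise ValueError(f"not a valid .npz (zip) archive: {path}")
    with zipfile.ZipFile(path) as archive:
        # testzip() reads every member and returns the first corrupt name, or None.
        bad_member = archive.testzip()
        if bad_member is not None:
            raise ValueError(f"corrupt member in archive: {bad_member}")
        # Each array is stored as "<name>.npy"; strip the suffix to get the name.
        return sorted(n[:-4] for n in archive.namelist() if n.endswith(".npy"))
```

For example, `list_npz_arrays("pretrain/ViT-B_16_Img21.npz")` should return a non-empty list of parameter names if the download completed.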

1.6. Prepare dataset

Prepare the Kinetics-400 dataset in the following structure.

The layout is almost the same as in TokenShift, with slight modifications. See the config file.

k400
|_ frames331_train
|  |_ [category name 0]
|  |  |_ [video name 0]
|  |  |  |_ img_00001.jpg
|  |  |  |_ img_00002.jpg
|  |  |  |_ ...
|  |  |
|  |  |_ [video name 1]
|  |  |   |_ img_00001.jpg
|  |  |   |_ img_00002.jpg
|  |  |   |_ ...
|  |  |_ ...
|  |
|  |_ [category name 1]
|  |  |_ [video name 0]
|  |  |  |_ img_00001.jpg
|  |  |  |_ img_00002.jpg
|  |  |  |_ ...
|  |  |
|  |  |_ [video name 1]
|  |  |   |_ img_00001.jpg
|  |  |   |_ img_00002.jpg
|  |  |   |_ ...
|  |  |_ ...
|  |_ ...
|
|_ frames331_val
|  |_ [category name 0]
|  |  |_ [video name 0]
|  |  |  |_ img_00001.jpg
|  |  |  |_ img_00002.jpg
|  |  |  |_ ...
|  |  |
|  |  |_ [video name 1]
|  |  |   |_ img_00001.jpg
|  |  |   |_ img_00002.jpg
|  |  |   |_ ...
|  |  |_ ...
|  |
|  |_ [category name 1]
|  |  |_ [video name 0]
|  |  |  |_ img_00001.jpg
|  |  |  |_ img_00002.jpg
|  |  |  |_ ...
|  |  |
|  |  |_ [video name 1]
|  |  |   |_ img_00001.jpg
|  |  |   |_ img_00002.jpg
|  |  |   |_ ...
|  |  |_ ...
|  |_ ...
|
|_ trainValTest
   |_ train.txt
   |_ val.txt
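
The layout above can be sanity-checked with a short script before training. This is a minimal sketch, not part of the repo: the function name `scan_split` is hypothetical, and it assumes that frames in each video folder are numbered from 1 without gaps, which the tree suggests but does not state. It does not read or write train.txt or val.txt, whose format is defined by the data loader and config.

```python
import re
from pathlib import Path

# Frame files follow the img_00001.jpg naming shown in the tree.
FRAME_PATTERN = re.compile(r"img_(\d{5})\.jpg$")


def scan_split(split_dir):
    """Return (category, video, n_frames) for every video folder in one split."""
    records = []
    for category in sorted(p for p in Path(split_dir).iterdir() if p.is_dir()):
        for video in sorted(p for p in category.iterdir() if p.is_dir()):
            indices = sorted(
                int(m.group(1))
                for f in video.iterdir()
                if (m := FRAME_PATTERN.match(f.name))
            )
            # Assumption: frames are numbered 1..N with no gaps.
            if not indices or indices != list(range(1, len(indices) + 1)):
                raise ValueError(f"missing or misnumbered frames in {video}")
            records.append((category.name, video.name, len(indices)))
    return records
```

Calling `scan_split("k400/frames331_train")` and `scan_split("k400/frames331_val")` gives one record per video, which is enough to spot empty folders or incomplete frame extraction.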

1.7. Train and validate

python main.py --tune_from pretrain/ViT-B_16_Img21.npz --cfg config/custom/kinetics400/k400_attentionshift_div4_8x32_base_224.yml
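
If the run is launched from a script, a small pre-flight check can report a missing weights or config file before main.py starts. The sketch below is an assumption about how one might wrap the command above. `build_train_command` is a hypothetical helper, and it uses only the two flags shown in the command, `--tune_from` and `--cfg`.

```python
import sys
from pathlib import Path


def build_train_command(weights, cfg, python=sys.executable):
    """Build the training command after checking that both input files exist."""
    for label, path in (("weights", weights), ("config", cfg)):
        if not Path(path).is_file():
            raise FileNotFoundError(f"{label} file not found: {path}")
    # Same arguments as the command line above, as a list for subprocess.
    return [python, "main.py", "--tune_from", str(weights), "--cfg", str(cfg)]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, so that a failed run raises an error instead of passing silently.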
