SLLM

This is the released code for our paper "Sample Less, Learn More: Efficient Action Recognition via Frame Feature Restoration" [Arxiv], which will appear at ACM MM 2023.

The proposed method, Sample Less, Learn More (SLLM), is a strategy for efficient video learning. SLLM first discards some of the sampled frames and then restores the features of the discarded frames in latent space, which reduces the computational cost of image encoding. SLLM can also be integrated with other frame sampling strategies to achieve better performance.
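The discard-then-restore idea can be illustrated with a minimal sketch. It is not the restoration module from the paper: linear interpolation between neighbouring kept frames is used here purely as a stand-in, and the names `encode_with_restoration`, `encoder`, and `keep_every` are hypothetical and do not come from this repository. The point is only that the image encoder runs on a subset of frames while the output still has one feature per sampled frame.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def encode_with_restoration(
    frames: Sequence[float],
    encoder: Callable[[float], Vector],
    keep_every: int = 2,
) -> Tuple[List[Vector], int]:
    """Encode every `keep_every`-th frame and fill in the rest in feature space.

    ASSUMPTION: linear interpolation stands in for the learned restoration
    described in the paper; it only illustrates where the savings come from.
    Returns the per-frame features and the number of encoder calls made.
    """
    kept = list(range(0, len(frames), keep_every))
    encoded = {i: encoder(frames[i]) for i in kept}  # the only encoder calls

    features: List[Vector] = []
    for i in range(len(frames)):
        if i in encoded:
            features.append(encoded[i])
            continue
        left = (i // keep_every) * keep_every
        right = left + keep_every
        if right not in encoded:
            # No kept frame to the right: reuse the nearest kept feature.
            features.append(list(encoded[left]))
            continue
        w = (i - left) / keep_every
        features.append(
            [(1 - w) * a + w * b for a, b in zip(encoded[left], encoded[right])]
        )
    return features, len(kept)


# Toy "frames" are scalars and the toy encoder maps x to [x, 2x].
toy_frames = [0.0, 1.0, 2.0, 3.0, 4.0]
feats, calls = encode_with_restoration(toy_frames, lambda x: [x, 2 * x])
print(f"encoder calls: {calls} of {len(toy_frames)} frames")  # encoder calls: 3 of 5 frames
```

In this toy setting the encoder is called for 3 of the 5 frames, and the remaining two features are produced without touching the encoder.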

The released code is modified from Text2Vis.

Quick Start

Dataset

We evaluate SLLM on K400, UCF-101, HMDB-51, and ActivityNet. The preprocessing of the datasets is identical to that of Text2Vis and ActionCLIP. The annotation file is a plain-text file with one video per line; each line gives the directory containing the video's frames, the total number of frames in the video, and the label of the video, separated by whitespace. Here is the format:

abseiling/-7kbO0v4hag_000107_000117 300 0
abseiling/-bwYZwnwb8E_000013_000023 300 0

For more details, please see the files in ./lists/.
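As a rough illustration of the annotation format above, the following sketch parses such lines into records. The names `VideoRecord`, `parse_annotation_line`, and `parse_annotations` are hypothetical and are not taken from this repository; the actual loader lives in the released code.

```python
from typing import List, NamedTuple


class VideoRecord(NamedTuple):
    frame_dir: str   # directory holding the extracted frames of one video
    num_frames: int  # total number of frames in the video
    label: int       # integer class label


def parse_annotation_line(line: str) -> VideoRecord:
    # Split from the right so that only the last two fields are treated
    # as the frame count and the label.
    frame_dir, num_frames, label = line.strip().rsplit(maxsplit=2)
    return VideoRecord(frame_dir, int(num_frames), int(label))


def parse_annotations(text: str) -> List[VideoRecord]:
    # Blank lines are skipped; every other line must carry three fields.
    return [parse_annotation_line(l) for l in text.splitlines() if l.strip()]


sample = """abseiling/-7kbO0v4hag_000107_000117 300 0
abseiling/-bwYZwnwb8E_000013_000023 300 0
"""
records = parse_annotations(sample)
print(records[0].frame_dir, records[0].num_frames, records[0].label)
```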

Pre-trained Model

Note: We will release more models soon.

Backbone	Dataset	Link	ACC	Efficiency (3090 GPU)
Text2Vis (16 sampled frames)	HMDB	Google Drive	81.55%	124.80 videos/s

Inference

Please download the pre-trained model, then edit the line test: /path/to/checkpoint/last_model.pt in ./configs/[dataset_name]/[dataset]_test.yaml so that it points to the downloaded checkpoint.

Then run the following command to evaluate with our pre-trained model:

python train.py --config ./configs/hmdb51/hmdb_test.yaml

Citation

If you find this work useful, please kindly cite it:

@article{DBLP:journals/corr/abs-2307-14866,
  author       = {Harry Cheng and
                  Yangyang Guo and
                  Liqiang Nie and
                  Zhiyong Cheng and
                  Mohan S. Kankanhalli},
  title        = {Sample Less, Learn More: Efficient Action Recognition via Frame Feature
                  Restoration},
  journal      = {CoRR},
  volume       = {abs/2307.14866},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2307.14866},
  doi          = {10.48550/arXiv.2307.14866},
  eprinttype    = {arXiv},
  eprint       = {2307.14866},
  timestamp    = {Wed, 02 Aug 2023 15:37:53 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2307-14866.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
