ATM: Action Temporality Modeling for Video Question Answering

This is the official implementation of our paper accepted to ACM Multimedia 2023: ATM: Action Temporality Modeling for Video Question Answering

Get Started

The code is mainly developed from VGT. Thanks to the authors for their great work and code.

Environment

Assuming you have installed Anaconda, run the following to set up the environment:

>conda create -n videoqa python==3.8
>conda activate videoqa
>pip install -r requirements.txt

Data Preparation

Create the data annotation folder inside data/. Download the CSV files from annotations into data/dataset/nextqa. Download the folders from features into data/features/nextqa. Download the checkpoints into data/save_models/nextqa/.
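The target folders named above can be created up front before downloading anything. This is a minimal sketch, assuming the commands are run from the repository root; the downloads themselves still have to be placed into these folders manually.

```shell
# Create the three target folders used by the data preparation step.
# -p creates missing parent directories and does not fail if a folder already exists.
mkdir -p data/dataset/nextqa      # CSV annotation files
mkdir -p data/features/nextqa     # feature folders
mkdir -p data/save_models/nextqa  # checkpoints
```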

Scripts

Inference

sh ./shells/next_test.sh 0

Pretrain

sh ./shells/next_train.sh 0

Finetune

sh ./shells/next_ft.sh 0

Citation

@article{chen2023atm,
	  title={ATM: Action Temporality Modeling for Video Question Answering},
	  author={Chen, Junwen and Zhu, Jie and Kong, Yu},
	  journal={ACM Multimedia},
	  year={2023}
}
