Skip to content

Multilevel Hierarchical Network with Multiscale Sampling for Video Question Answering (IJCAI22)

License

Notifications You must be signed in to change notification settings

Mvrjustid/MHN-IJCAI22

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MHN

This is the PyTorch Implementation of our paper "Multilevel Hierarchical Network with Multiscale Sampling for Video Question Answering". (accepted by IJCAI’22)

alt text

Platform and dependencies

Ubuntu 14.04
Python 3.7
CUDA10.1
CuDNN7.5+
pytorch>=1.7.0

Data Preparation

  • Download the dataset
    MSVD-QA: link
    MSRVTT-QA: link
    TGIF-QA: link
  • Preprocessing
    1. To extract questions or answers Glove Embedding, please ref here.
      Take the action task in TGIF-QA dataset as an example, we have features at the path /QAfeatures: TGIF/word/action/TGIF_action_train_questions.pt TGIF/word/action/TGIF_action_val_questions.pt TGIF/word/action/TGIF_action_test_questions.pt TGIF/word/action/TGIF_action_vocab.json
    2. To extract appearance and motion feature, use the pretrained models here.
      for the action task, we have features at the path /Vfeatures:
      TGIF/SpatialFeatures/tumblr_nd24xaX8d11qkb1azo1_250/Features.pkl (shape is 2^level-1,16,2048)
      TGIF/SpatialFeatures/tumblr_no00ddSlG31t34v14o1_250/Features.pkl
      ...
      TGIF/TemporalFeatures/tumblr_nd24xaX8d11qkb1azo1_250/Features.pkl (shape is 2^level-1,2048)
      TGIF/TemporalFeatures/tumblr_no00ddSlG31t34v14o1_250/Features.pkl
      ...
      In our paper, number of levels is set to 3 by default.

Train and test

The trained models for the action task can be downloaded from here.

Reference

@article{peng2022MHN,
     title={Multilevel Hierarchical Network with Multiscale Sampling for Video Question Answering},
     author={Peng Min, Wang Chongyang, Gao Yuan, Shi Yu, Zhou Xiang-Dong},
     journal={Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI)},
     year={2022}}

About

Multilevel Hierarchical Network with Multiscale Sampling for Video Question Answering (IJCAI22)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%