A Joint Sequence Fusion Model for Video Question Answering and Retrieval. In ECCV 2018

A Joint Sequence Fusion Model for Video Question Answering and Retrieval

This project hosts the TensorFlow implementation of our ECCV 2018 paper, A Joint Sequence Fusion Model for Video Question Answering and Retrieval.

Reference

If you use this code or dataset as part of any published research, please cite the following paper.

@inproceedings{yu2018joint,
  author    = {Youngjae Yu and Jongseok Kim and Gunhee Kim},
  title     = "{A Joint Sequence Fusion Model for Video Question Answering and Retrieval}",
  booktitle = {ECCV},
  year      = 2018
}

Setup

Install dependencies

pip install -r requirements.txt

Set up Python paths

git submodule update --init --recursive
add2virtualenv .

Prepare Data

  • Video Feature

    1. Download LSMDC data.

    2. Extract RGB features using the pool5 layer of the pretrained ResNet-152 model.

    3. Extract audio features using VGGish.

    4. Concatenate the RGB and audio features and save them to an HDF5 file at 'dataset/LSMDC/LSMDC16_features/RESNET_pool5wav.hdf5'.

  • Dataset

    • We preprocessed the raw LSMDC17 and MSR-VTT data into dataframe files.
    • Download dataframe files
    • Save these files in "dataset/LSMDC/DataFrame"
  • Vocabulary
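Steps 2 to 4 of the video feature preparation can be sketched roughly as follows. This is a minimal illustration, not the script used for the paper: the helper names `concat_features` and `save_features` are hypothetical, the layout of one HDF5 dataset per clip ID is an assumption, and the sketch assumes the RGB and audio features have already been sampled to the same number of time steps. The sizes 2048 and 128 in the usage note are the usual output sizes of ResNet-152 pool5 and VGGish.

```python
import numpy as np


def concat_features(rgb, audio):
    """Concatenate per-step RGB and audio features along the feature axis.

    Assumes both inputs are 2-D arrays shaped (time_steps, feature_dim)
    with the same number of time steps.
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    audio = np.asarray(audio, dtype=np.float32)
    if rgb.shape[0] != audio.shape[0]:
        raise ValueError("RGB and audio features must have the same number of time steps")
    return np.concatenate([rgb, audio], axis=1)


def save_features(path, features_by_clip):
    """Write one dataset per clip ID into a single HDF5 file (assumed layout)."""
    import h5py  # imported lazily so concat_features works without h5py installed

    with h5py.File(path, "w") as f:
        for clip_id, feats in features_by_clip.items():
            f.create_dataset(clip_id, data=feats)
```

For example, `concat_features(np.zeros((5, 2048)), np.ones((5, 128)))` yields an array of shape `(5, 2176)`, which can then be passed to `save_features('dataset/LSMDC/LSMDC16_features/RESNET_pool5wav.hdf5', {...})` keyed by clip ID. Check the data-loading code for the key layout it actually expects before relying on this.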

Training

Modify configuration.py to suit your environment.

  • train_tag can be 'RET', 'MC', or 'FIB'

Run train.py.

python train.py --tag="tag"
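In the command above, "tag" is a placeholder for one of the three values listed earlier, for example `python train.py --tag="RET"`. As an illustration only, a command-line tag restricted to those values could be validated as in the sketch below. The function `parse_args` is hypothetical and is not taken from train.py, whose actual argument handling may differ.

```python
import argparse

# The three tags named in this README.
VALID_TAGS = ("RET", "MC", "FIB")


def parse_args(argv=None):
    """Parse a --tag option and reject anything outside VALID_TAGS."""
    parser = argparse.ArgumentParser(description="Hypothetical tag parser")
    parser.add_argument(
        "--tag",
        required=True,
        choices=VALID_TAGS,
        help="training task tag: one of RET, MC, FIB",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print("Selected tag:", args.tag)
```

With `choices`, argparse exits with a usage error when an unknown tag is passed, so a typo is caught before any training starts.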
