SCDM

Code for the paper: "Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos"

Introduction

Temporal sentence grounding (TSG) in videos aims to detect and localize one target video segment, which semantically corresponds to a given sentence query. We propose a semantic conditioned dynamic modulation (SCDM) mechanism to help solve the TSG problem, which relies on the sentence semantics to modulate the temporal convolution operations for better correlating and composing the sentence-related video contents over time.
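To make the idea of sentence-conditioned modulation more concrete, here is a minimal pure-Python sketch. It is not the implementation in this repository. It assumes that each clip feature attends over the word features (dot-product attention is used here purely for brevity), that the attended sentence context is mapped to a scale `gamma` and a shift `beta` through hypothetical weight matrices `W_gamma` and `W_beta`, and that these modulate the clip features after channel-wise normalization over time. All function and parameter names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def modulate(clips, words, W_gamma, W_beta, eps=1e-5):
    """Scale and shift normalized clip features using sentence context.

    clips: T clip vectors of size d; words: N word vectors of size d.
    W_gamma, W_beta: d x d matrices (hypothetical learned parameters).
    """
    T, d = len(clips), len(clips[0])
    # Channel-wise mean and standard deviation over the temporal axis.
    mean = [sum(c[k] for c in clips) / T for k in range(d)]
    std = [
        math.sqrt(sum((c[k] - mean[k]) ** 2 for c in clips) / T + eps)
        for k in range(d)
    ]
    out = []
    for clip in clips:
        # Attend over the words (dot-product scoring is an assumption).
        weights = softmax([sum(a * s for a, s in zip(clip, w)) for w in words])
        context = [sum(wt * w[k] for wt, w in zip(weights, words)) for k in range(d)]
        # Sentence-dependent scale and shift for this clip.
        gamma = [math.tanh(g) for g in matvec(W_gamma, context)]
        beta = [math.tanh(b) for b in matvec(W_beta, context)]
        out.append(
            [gamma[k] * (clip[k] - mean[k]) / std[k] + beta[k] for k in range(d)]
        )
    return out
```

Because `gamma` and `beta` are recomputed for every clip from its own attention over the words, the same sentence can emphasize different channels at different points in the video, which is the intuition behind modulating the temporal convolutions with sentence semantics.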

Download Features and Example Preprocessed Data

First, download the following files into the './data' folder:

Then, download the preprocessed .h5 data for the Charades-STA dataset and put it into the './data/Charades' folder. Alternatively, you can generate the preprocessed data yourself with the provided preprocessing code.

Data Preprocessing

As described in our paper, we evaluate temporal sentence grounding on three datasets: Charades-STA, ActivityNet Captions, and TACoS. Before training or testing a model on any of these datasets, preprocess its data first.

  • Go to the './grounding/Charades-STA/data_preparation/' folder, and run:
python generate_charades_data.py

If you have already downloaded the .h5 data for the Charades-STA dataset, you can skip this step. The preprocessed data will be put into the './data/Charades/h5py/' folder.

  • Go to the './grounding/TACOS/data_preparation/' folder, and run:
python generate_tacos_data.py

Preprocessed data for the TACoS dataset will be put into the './data/TACOS/h5py/' folder.

  • Go to the './grounding/ActivityNet/data_preparation/' folder, and run:
python generate_anet_data.py

Preprocessed data for the ActivityNet Captions dataset will be put into the './data/ActivityNet/h5py/' folder.
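The three preprocessing steps above can also be driven from one small helper. The sketch below is a convenience wrapper and not part of the repository. It assumes it is called with the repository root, and it runs each script from inside its own folder, as the steps above do. The names `PREPROCESS_STEPS`, `build_commands`, and `run_all` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# (folder, script) pairs copied from the preprocessing steps above.
PREPROCESS_STEPS = [
    ("grounding/Charades-STA/data_preparation", "generate_charades_data.py"),
    ("grounding/TACOS/data_preparation", "generate_tacos_data.py"),
    ("grounding/ActivityNet/data_preparation", "generate_anet_data.py"),
]


def build_commands(repo_root, python=sys.executable):
    """Return one (working_dir, argv) pair per dataset."""
    root = Path(repo_root)
    return [(root / folder, [python, script]) for folder, script in PREPROCESS_STEPS]


def run_all(repo_root="."):
    """Run every preprocessing script in its own folder; stop on the first failure."""
    for cwd, argv in build_commands(repo_root):
        print(f"Running {' '.join(argv)} in {cwd}")
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_all(".")` from the repository root runs the three scripts in order. If you have already downloaded the Charades-STA .h5 data, you can drop the first entry of `PREPROCESS_STEPS`.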

Model Training and Testing

  • For the Charades-STA dataset, the proposed model and all of its variants are provided. For example, the proposed SCDM model is implemented in the './grounding/Charades-STA/src_SCDM' folder. In that folder, run:
python run_charades_scdm.py --task train

for model training, and run:

python run_charades_scdm.py --task test

for model testing. The variant models are trained and tested in the same way.

  • For the TACoS and ActivityNet Captions datasets, we only provide the proposed SCDM model implementation, in the './grounding/xxx/src_SCDM' folder. Training and testing follow the same procedure as for the Charades-STA dataset.
  • Checkpoints of the trained models for these datasets are provided at ActivityNet_Checkpoints, Charades_Checkpoints, Tacos_Checkpoints. You can use these checkpoints to reproduce results that are close to, though not exactly the same as, those reported in the paper.
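The train-then-test workflow for Charades-STA can be scripted as follows. This is a sketch under stated assumptions, not a script shipped with the repository. It uses only the `--task train` and `--task test` invocations shown above and assumes it is called with the repository root. `build_command` and `train_then_test` are hypothetical helper names, and rejecting other task values is a choice of this wrapper, not a documented restriction of the script.

```python
import subprocess
import sys
from pathlib import Path

# Folder and script name taken from the Charades-STA instructions above.
SRC_DIR = Path("grounding/Charades-STA/src_SCDM")
SCRIPT = "run_charades_scdm.py"


def build_command(task, python=sys.executable):
    """Build the argv list for one run; only the two documented tasks are accepted."""
    if task not in ("train", "test"):
        raise ValueError(f"unsupported task: {task!r}")
    return [python, SCRIPT, "--task", task]


def train_then_test(repo_root="."):
    """Train the model, then evaluate it, stopping if either step fails."""
    cwd = Path(repo_root) / SRC_DIR
    for task in ("train", "test"):
        subprocess.run(build_command(task), cwd=cwd, check=True)
```

Using `check=True` means a failed training run raises an error before testing starts, so evaluation is never run against a missing or partial model.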

Citation

@inproceedings{SCDM19,
  title={Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos},
  author={Yitian Yuan and Lin Ma and Jingwen Wang and Wei Liu and Wenwu Zhu},
  booktitle={Proceedings of the Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019)},
  year={2019},
}
