Code for the paper: Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos

SCDM

Code for the paper: "Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos"

Introduction

Temporal sentence grounding (TSG) in videos aims to detect and localize one target video segment, which semantically corresponds to a given sentence query. We propose a semantic conditioned dynamic modulation (SCDM) mechanism to help solve the TSG problem, which relies on the sentence semantics to modulate the temporal convolution operations for better correlating and composing the sentence-related video contents over time.
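
To make the idea of sentence-conditioned modulation concrete, here is a minimal NumPy sketch. It is not the code in this repository: the function name `modulate`, the dot-product attention, the normalization over the temporal axis, and the weight matrices `w_gamma` and `w_beta` are all assumptions made for illustration. It only shows the general pattern of letting sentence features produce a per-position scale and shift for video features.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def modulate(video, words, w_gamma, w_beta, eps=1e-5):
    """Scale and shift video features using sentence semantics.

    video:   (T, D) array, one feature vector per temporal position.
    words:   (N, D) array, one feature vector per word of the query.
    w_gamma: (D, D) hypothetical projection producing the scale.
    w_beta:  (D, D) hypothetical projection producing the shift.
    """
    # Each temporal position attends over the words of the sentence.
    # Dot-product attention is an assumption chosen for brevity.
    scores = video @ words.T / np.sqrt(video.shape[1])  # (T, N)
    attention = softmax(scores, axis=1)                 # rows sum to 1
    context = attention @ words                         # (T, D)

    # The attended sentence context yields a scale and a shift per position.
    gamma = np.tanh(context @ w_gamma)                  # (T, D)
    beta = np.tanh(context @ w_beta)                    # (T, D)

    # Normalize each channel over time, then apply the modulation.
    mean = video.mean(axis=0, keepdims=True)
    std = video.std(axis=0, keepdims=True)
    normalized = (video - mean) / (std + eps)
    return gamma * normalized + beta


# Toy example with random features: 6 temporal positions, 3 words, 4 channels.
rng = np.random.default_rng(0)
video = rng.normal(size=(6, 4))
words = rng.normal(size=(3, 4))
modulated = modulate(video, words,
                     rng.normal(size=(4, 4)), rng.normal(size=(4, 4)))
print(modulated.shape)
```

In a full model, a step like this would sit alongside the temporal convolution layers, so that the features passed on are already conditioned on the query.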

Download Features and Example Preprocessed Data

First, download the following files into the './data' folder:

Then, download the preprocessed .h5 data for the Charades-STA dataset and put it into the './data/Charades' folder. Alternatively, you can generate the preprocessed data yourself with the preprocessing code provided in this repository.

Data Preprocessing

As described in our paper, we evaluate temporal sentence grounding on three datasets: Charades-STA, ActivityNet Captions, and TACoS. Preprocess the data before training or testing a model on any of them.

  • Go to the './grounding/Charades-STA/data_preparation/' folder, and run:
python generate_charades_data.py

Preprocessed data will be put into the './data/Charades/h5py/' folder. If you have already downloaded the .h5 data for the Charades-STA dataset, you can skip this step.

  • Go to the './grounding/TACOS/data_preparation/' folder, and run:
python generate_tacos_data.py

Preprocessed data for the TACoS dataset will be put into the './data/TACOS/h5py/' folder.

  • Go to the './grounding/ActivityNet/data_preparation/' folder, and run:
python generate_anet_data.py

Preprocessed data for the ActivityNet Captions dataset will be put into the './data/ActivityNet/h5py/' folder.
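
The three preprocessing steps above can be sketched as a single helper that skips any dataset whose preprocessed data is already present. This is a sketch under stated assumptions, not a script shipped with the repository: the helper names are hypothetical, and it assumes that preprocessed files carry the `.h5` extension and live somewhere under the dataset's './data/...' folder. The folder and script names are the ones listed above.

```python
import subprocess
import sys
from pathlib import Path

# dataset name -> (script folder, script name, dataset data folder),
# all relative to the repository root.
PREPROCESSING_STEPS = {
    "Charades-STA": ("grounding/Charades-STA/data_preparation",
                     "generate_charades_data.py", "data/Charades"),
    "TACoS": ("grounding/TACOS/data_preparation",
              "generate_tacos_data.py", "data/TACOS"),
    "ActivityNet Captions": ("grounding/ActivityNet/data_preparation",
                             "generate_anet_data.py", "data/ActivityNet"),
}


def has_preprocessed_data(repo_root, data_dir):
    """Return True if any .h5 file exists under the dataset's data folder."""
    folder = Path(repo_root) / data_dir
    # Assumption: a single .h5 file means preprocessing was already done.
    return folder.is_dir() and any(folder.rglob("*.h5"))


def pending_datasets(repo_root="."):
    """List the datasets that still need preprocessing."""
    return [name for name, (_, _, data_dir) in PREPROCESSING_STEPS.items()
            if not has_preprocessed_data(repo_root, data_dir)]


def preprocess(repo_root=".", dry_run=True):
    """Run (or, with dry_run=True, only collect) the pending commands."""
    planned = []
    for name in pending_datasets(repo_root):
        script_dir, script, _ = PREPROCESSING_STEPS[name]
        cwd = Path(repo_root) / script_dir
        command = [sys.executable, script]
        planned.append((str(cwd), command))
        if not dry_run:
            # Run from the script's own folder, as the steps above instruct.
            subprocess.run(command, cwd=cwd, check=True)
    return planned
```

Calling `preprocess(dry_run=True)` from the repository root returns the commands that would run without executing them; pass `dry_run=False` to run them.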

Model Training and Testing

  • For the Charades-STA dataset, we provide the proposed model and all of its variants. For example, the proposed SCDM model is implemented in the './grounding/Charades-STA/src_SCDM' folder. Go to that folder and run:
python run_charades_scdm.py --task train

for model training, and run:

python run_charades_scdm.py --task test

for model testing. The variant models are trained and tested in the same way.

  • For the TACoS and ActivityNet Captions datasets, we provide only the proposed SCDM model, in the './grounding/xxx/src_SCDM' folder, where xxx is the dataset folder name. Training and testing work the same way as for the Charades-STA dataset.
  • Checkpoints of the trained models for these datasets are provided at ActivityNet_Checkpoints, Charades_Checkpoints, Tacos_Checkpoints. You can use them to reproduce the results in the paper; the numbers will be close but not exactly identical.
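
The train-then-test sequence for Charades-STA described above can be sketched as a small driver. The helper names `build_command` and `train_then_test` are hypothetical and not part of the repository; the script name, the `--task` flag, and the folder come from the instructions above.

```python
import subprocess
import sys
from pathlib import Path

# Folder and script for the proposed SCDM model on Charades-STA.
CHARADES_SRC = Path("grounding/Charades-STA/src_SCDM")
CHARADES_SCRIPT = "run_charades_scdm.py"


def build_command(task):
    """Build the command line for one task, either 'train' or 'test'."""
    if task not in ("train", "test"):
        raise ValueError(f"unknown task: {task!r}")
    return [sys.executable, CHARADES_SCRIPT, "--task", task]


def train_then_test(repo_root=".", dry_run=True):
    """Train the model, then test it; with dry_run=True only return the commands."""
    cwd = Path(repo_root) / CHARADES_SRC
    commands = [build_command("train"), build_command("test")]
    if not dry_run:
        for command in commands:
            # check=True stops the sequence if training fails,
            # so testing never runs against a broken training run.
            subprocess.run(command, cwd=cwd, check=True)
    return commands
```

Call `train_then_test(dry_run=False)` from the repository root to run both steps in order.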

Citation

@inproceedings{SCDM19,
  title={Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos},
  author={Yitian Yuan and Lin Ma and Jingwen Wang and Wei Liu and Wenwu Zhu},
  booktitle={Proceedings of the Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019)},
  year={2019},
}