
EK-NLVL

PyTorch implementation of the EK-NLVL QAS model

Daneul Kim, Daechul Ahn, Jonghyun Choi

[Abstract] In this paper, we propose the framework of External Knowledge Transfer in Natural Language Video Localization (EK-NLVL). By utilizing a pretrained image captioner and an unsupervised event proposal module, we generate pseudo-sentences and event proposals to train the Natural Language Video Localization (NLVL) model. Most existing approaches rely on costly annotations of sentences and temporal event proposals, which restricts model performance to the given datasets and makes them inapplicable to real-world NLVL problems. The proposed EK-NLVL leverages the idea of generating pseudo-sentences from the given frames and summarizing them to ground the video event. We also propose a data augmentation technique with visually-aligned sentence filtering for pseudo-sentence generation, which effectively provides an additional training signal for NLVL. Moreover, we propose a simpler model that leverages the similarity between frames and pseudo-sentences via a CLIP loss, which effectively uses External Knowledge Transfer for the NLVL task. Experiments on the Charades-STA and ActivityNet Captions datasets demonstrate the efficacy of our method compared to existing models.

Update

  • Published in Journal of KIISE 2022
  • Updated with a link to the pre-trained captioning model

Dependencies

This repository is implemented in PyTorch with Anaconda.
Refer to "Setting environment with anaconda" from LGI, or use the Docker image (carpedkm/ektnlvl:latest).
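A minimal setup sketch follows; the exact package list comes from the LGI environment guide, and the environment name tg and Python version are assumptions matching the activation command used below:

# Assumption: env name "tg" taken from the activation step in the evaluation section
conda create -n tg python=3.7
conda activate tg
# Install PyTorch, then the remaining dependencies listed in the LGI guide
pip install torch torchvision

# Alternatively, use the provided Docker image:
docker pull carpedkm/ektnlvl:latest
docker run --gpus all -it carpedkm/ektnlvl:latest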

Preparation

Evaluating pre-trained models

  • Using the Anaconda environment
conda activate tg

# Evaluate the QAS model trained on the ActivityNet Captions dataset
CUDA_VISIBLE_DEVICES=0 python -m src.experiment.eval \
                     --config [ANET CONFIG PATH] \
                     --checkpoint [ANET CHECKPOINT PATH] \
                     --dataset anet \
                     --ann_path <annotation> \
                     --exp_info <exp information>
# Evaluate the QAS model trained on the Charades-STA dataset
CUDA_VISIBLE_DEVICES=0 python -m src.experiment.eval \
                     --config [CHARADES CONFIG PATH] \
                     --checkpoint [CHARADES CHECKPOINT PATH] \
                     --dataset charades \
                     --ann_path <annotation> \
                     --exp_info <exp information>
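For example, a Charades-STA evaluation might look like the following; every path and the experiment name here are hypothetical placeholders, since the actual config, checkpoint, and annotation locations depend on your training run:

# NOTE: all paths below are illustrative, not files shipped with this repo
CUDA_VISIBLE_DEVICES=0 python -m src.experiment.eval \
                     --config results/charades/config.yml \
                     --checkpoint results/charades/model_best.pkl \
                     --dataset charades \
                     --ann_path data/charades/annotations/test.json \
                     --exp_info charades_qas_eval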
Citation

@inproceedings{kim2022qas,
    title     = "{Utilizing External Knowledge Transfer in Natural Language Video Localization}",
    author    = {Kim, Daneul and Ahn, Daechul and Choi, Jonghyun},
    booktitle = {preprint},
    year      = {2022}
}
