DORi: Discovering Object Relationships for Moment Localization of a Natural Language Query in a Video
Code accompanying the paper.
This repository includes:
- Code for training and testing DORi, our model for temporal moment localization.
- Links to the I3D and object features we extracted for the YouCookII, Charades-STA and Activity-Net datasets, which were used for the experiments in our paper.
- Links to pre-trained models on the YouCookII, Charades-STA and Activity-Net datasets.
- Clone this repo
git clone https://github.com/crodriguezo/DORi.git
cd DORi
- Create a conda environment based on our dependencies and activate it
conda create -n <name> --file environment.txt
conda activate <name>
Here <name> can be any environment name you like.
- Download everything
sh ./download.sh
This script will download the following into the folder ~/data/DORi:
- The glove.840B.300d.txt pre-trained word embeddings.
- The I3D features for Charades-STA, YouCookII and Activity-Net that we extracted and used in our experiments.
This script will also install the en_core_web_md pre-trained spaCy model and download the pre-processed annotations. Downloading everything can take a while depending on your internet connection, so please be patient.
If you changed the download path from the default used in the script above, please update the contents of ./config/settings.py accordingly.
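Before training, it can help to confirm that the downloaded resources are where you expect them. The following is a minimal sketch, not part of this repository: the helper name missing_resources is hypothetical, and it assumes the embeddings file sits directly under the download folder, which may not match the layout that download.sh actually produces.

```python
from pathlib import Path


def missing_resources(data_dir, expected):
    """Return the entries of `expected` that do not exist under `data_dir`."""
    root = Path(data_dir).expanduser()  # expand "~" to the home directory
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    # Assumption: glove.840B.300d.txt is stored at the top level of the folder.
    missing = missing_resources("~/data/DORi", ["glove.840B.300d.txt"])
    print("missing:", missing if missing else "nothing")
```

If you moved the data elsewhere, pass that path instead of ~/data/DORi and extend the list with the files you rely on.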
To train our model on the Charades-STA dataset, please run:
python main.py --config-file=experiments/CharadesSTA/CharadesSTA_train.yaml
To load our pre-trained model and test it, first make sure the weights have been downloaded and are in the ./checkpoints/charades_sta folder. Then simply run:
python main.py --config-file=experiments/CharadesSTA/CharadesSTA_test.yaml
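The check on the weights folder and the test command can be combined in a small wrapper. This is only a sketch under assumptions: has_weights, build_command and run_if_ready are hypothetical helpers that do not exist in this repository, and "any file present in the folder" is used as a rough stand-in for "the weights have been downloaded".

```python
import subprocess
from pathlib import Path


def has_weights(checkpoint_dir):
    """True if the folder exists and contains at least one entry."""
    folder = Path(checkpoint_dir)
    return folder.is_dir() and any(folder.iterdir())


def build_command(config_file):
    """Build the same main.py invocation shown above for a given config."""
    return ["python", "main.py", f"--config-file={config_file}"]


def run_if_ready(checkpoint_dir, config_file, dry_run=True):
    """Return the command, running it only when weights exist and dry_run is False."""
    if not has_weights(checkpoint_dir):
        print(f"No files found in {checkpoint_dir}; download the weights first.")
        return None
    cmd = build_command(config_file)
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd
```

For example, run_if_ready("./checkpoints/charades_sta", "experiments/CharadesSTA/CharadesSTA_test.yaml", dry_run=False), called from the repository root, would launch the test only if the folder is populated.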
If you are interested in downloading some specific resource only, we provide the links below.
I3D Features
Object Features
GloVe
Pretrained weights
If you use our code or features, please consider citing our work.
@InProceedings{rodriguez2020proposal,
title={Proposal-free Temporal Moment Localization of a Natural-Language Query in Video using Guided Attention},
author={Rodriguez-Opazo, Cristian and Marrese-Taylor, Edison and Saleh, Fatemeh Sadat and Li, Hongdong and Gould, Stephen},
booktitle={2020 IEEE Winter Conference on Applications of Computer Vision (WACV)},
pages={2453--2462},
year={2020},
organization={IEEE}
}
@InProceedings{Rodriguez-Opazo_2021_WACV,
title = {DORi: Discovering Object Relationships for Moment Localization of a Natural Language Query in a Video},
author = {Rodriguez-Opazo, Cristian and Marrese-Taylor, Edison and Fernando, Basura and Li, Hongdong and Gould, Stephen},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
month = {January},
year = {2021},
pages = {1079--1088}
}