Feature Re-Learning with Data Augmentation for Video Relevance Prediction

The source code of our TKDE paper Feature Re-Learning with Data Augmentation for Video Relevance Prediction. We propose a feature re-learning model, enhanced by data augmentation for both frame-level and video-level features and trained with a negative-enhanced triplet ranking loss. It is also our winning entry for the Hulu Content-based Video Relevance Prediction Challenge at the ACM Multimedia 2018 conference.
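
For orientation, here is a minimal sketch of what feature re-learning means in this context: an off-the-shelf video feature is projected into a new space by a learned transformation, and relevance between two videos is measured there. The class name, the affine layer sizes, and the cosine-similarity readout below are illustrative assumptions, not the exact released model; see the paper and the training code for the real architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureReLearning(nn.Module):
    """Sketch only: re-learn an off-the-shelf video feature via a learned
    affine projection; relevance is cosine similarity in the new space.
    The 2048 -> 512 sizes are placeholders, not the paper's setting."""

    def __init__(self, in_dim=2048, out_dim=512):
        super(FeatureReLearning, self).__init__()
        self.fc = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        # Project the input feature and L2-normalize it so that a dot
        # product between two projected features is cosine similarity.
        return F.normalize(self.fc(x), p=2, dim=-1)

def relevance(model, a, b):
    # Cosine similarity between two projected video-level features.
    return (model(a) * model(b)).sum(dim=-1)

# illustrative usage with random 2048-d video-level features
a, b = torch.randn(4, 2048), torch.randn(4, 2048)
model = FeatureReLearning()
print(relevance(model, a, b))  # one relevance score per pair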

Requirements

Required Packages

  • python 2.7
  • PyTorch 0.3.1
  • tensorboard_logger for tensorboard visualization

We used virtualenv to set up a deep learning workspace that supports PyTorch. Run the following script to install the required packages.

virtualenv --system-site-packages ~/cbvr
source ~/cbvr/bin/activate
pip install -r requirements.txt
deactivate

Required Data

  1. Download the track_1_shows (6G) and track_2_movies (9.0G) datasets from Google Drive, Baidu Pan, or here. If you have already downloaded the datasets provided by the Hulu organizers, use the script do_feature_convert.sh to convert them to fit our code.
  2. Run the following script to extract the downloaded data. The extracted data is placed in $HOME/VisualSearch/.
ROOTPATH=$HOME/VisualSearch
mkdir -p $ROOTPATH
# extract track_1_shows and track_2_movies datasets
tar zxf track_1_shows.tar.gz -C $ROOTPATH
tar zxf track_2_movies.tar.gz -C $ROOTPATH

Getting started

Augmentation for frame-level features

(Figure: data augmentation for frame-level features)

Run the following script to train and evaluate the model with augmentation for frame-level features and the negative-enhanced triplet ranking loss; a sketch of the loss follows the script.

source ~/cbvr/bin/activate
# on track_1_shows and track_2_movies with stride=12
stride=12
loss=netrl  # use trl for the common triplet ranking loss
./do_all_frame_level.sh track_1_shows inception-pool3 $stride $loss
./do_all_frame_level.sh track_2_movies inception-pool3 $stride $loss
deactivate
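
The loss flag chooses between the common triplet ranking loss (trl) and the negative-enhanced variant (netrl). Below is a minimal sketch of our reading of the difference: on top of the usual margin term, netrl additionally penalizes negatives whose similarity to the query is high on its own. The margins and the weight are illustrative placeholders, not the released hyper-parameters; see the paper and the training code for the exact formulation.

import torch
import torch.nn.functional as F

def netrl_loss(sim_pos, sim_neg, margin=0.2, neg_weight=1.0, neg_margin=0.0):
    # Common triplet ranking loss (trl): the positive pair should score
    # at least `margin` higher than the negative pair.
    trl = F.relu(margin - sim_pos + sim_neg)
    # Negative-enhanced term (assumed form): additionally push the negative
    # similarity itself below `neg_margin`, regardless of the positive pair.
    neg_term = F.relu(sim_neg - neg_margin)
    return (trl + neg_weight * neg_term).mean()

# illustrative usage on random similarity scores
sim_pos, sim_neg = torch.rand(8), torch.rand(8)
print(netrl_loss(sim_pos, sim_neg))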

Running the script will do the following things:

  1. Generate augmented frame-level features and apply mean pooling to obtain video-level features in advance (sketched after this list).
  2. Train the feature re-learning model with augmentation for frame-level features and select the checkpoint that performs best on the validation set as the final model.
  3. Evaluate the final model on the validation set and generate predicted results on the test set. Both relevance prediction strategies are performed. Note that we as participants have no access to the ground truth of the test set; please contact the task organizers if you want to evaluate our model or your own on the test set.
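
For concreteness, here is how we understand step 1, assuming the frame-level augmentation is stride-based skip sampling of the frame-feature sequence (stride=12 matches the script above). The feature dimension and sequence length are illustrative; the repository's preprocessing defines the exact procedure.

import torch

def augment_frame_features(frames, stride=12):
    # frames: (n_frames, dim) tensor of frame-level features.
    # Each offset 0..stride-1 yields one subsampled (augmented) sequence.
    return [frames[offset::stride] for offset in range(stride)]

def mean_pool(seq):
    # Collapse an (n, dim) frame-feature sequence into one video-level vector.
    return seq.mean(dim=0)

# e.g. 300 inception-pool3 frame features of dimension 2048 (illustrative)
frames = torch.randn(300, 2048)
video_level = [mean_pool(seq) for seq in augment_frame_features(frames)]
print(len(video_level))  # stride (=12) augmented video-level features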

Augmentation for video-level features

(Figure: data augmentation for video-level features)

Run the following script to train and evaluate the model with augmentation for video-level features.

source ~/cbvr/bin/activate
# on track_1_shows
./do_all_video_level.sh track_1_shows c3d-pool5 netrl
# on track_2_movies
./do_all_video_level.sh track_2_movies c3d-pool5 netrl
deactivate

Running the script will do the following things:

  1. Train the feature re-learning model with augmentation for video-level features and select the checkpoint that performs best on the validation set as the final model. (The augmented video-level features are generated on the fly; a sketch follows this list.)
  2. Evaluate the final model on the validation set and generate predicted results on the test set.
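
Since the video-level augmentation is applied on the fly during training, one plausible implementation is sketched below. The multiplicative zero-mean Gaussian noise and its standard deviation are assumptions made for illustration only; the exact perturbation scheme is defined in the paper and the released code.

import torch

def augment_video_feature(v, noise_std=0.1):
    # v: (batch, dim) tensor of video-level features (e.g. c3d-pool5).
    # Perturb each dimension with small zero-mean multiplicative noise
    # (assumed noise model; fresh noise is drawn every training step).
    noise = torch.randn_like(v) * noise_std
    return v + v * noise

# illustrative usage
v = torch.randn(4, 512)
v_aug = augment_video_feature(v)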

How to perform the proposed augmentation for other video-related tasks?

The proposed augmentation can essentially be applied to other video-related tasks as well. This note shows:

  • How to perform data augmentation over frame-level features?
  • How to perform data augmentation over video-level features?

Citation

If you find this package useful, please consider citing the following papers:

@inproceedings{mm2018-cbvrp-dong,
  title={Feature Re-Learning with Data Augmentation for Content-based Video Recommendation},
  author={Dong, Jianfeng and Li, Xirong and Xu, Chaoxi and Yang, Gang and Wang, Xun},
  booktitle={ACM Multimedia},
  year={2018},
  doi={10.1145/3240508.3266441}
}

@article{dong2019feature,
  title={Feature Re-Learning with Data Augmentation for Video Relevance Prediction},
  author={Dong, Jianfeng and Wang, Xun and Zhang, Leimin and Xu, Chaoxi and Yang, Gang and Li, Xirong},
  journal={IEEE Transactions on Knowledge and Data Engineering},
  doi={10.1109/TKDE.2019.2947442},
  year={2019},
  publisher={IEEE}
}

Acknowledgements

We are grateful to the Hulu organizers for their effort in organizing the challenge.

@article{liu2018content,
  title={Content-based Video Relevance Prediction Challenge: Data, Protocol, and Baseline},
  author={Liu, Mengyi and Xie, Xiaohui and Zhou, Hanning},
  journal={arXiv preprint arXiv:1806.00737},
  year={2018}
}
