Caffe implementation of our method for transferring knowledge from seen objects in images to unseen objects in videos.
Contact: Yi-Wen Chen (chenyiwena at gmail dot com)
Please cite our paper if you find it useful for your research.
Unseen Object Segmentation in Videos via Transferable Representations
Yi-Wen Chen, Yi-Hsuan Tsai, Chu-Ya Yang, Yen-Yu Lin and Ming-Hsuan Yang
Asian Conference on Computer Vision (ACCV), 2018 (oral)
Best Student Paper Award Honorable Mention
@inproceedings{Chen_TransferSeg_2018,
author = {Yi-Wen Chen and Yi-Hsuan Tsai and Chu-Ya Yang and Yen-Yu Lin and Ming-Hsuan Yang},
booktitle = {Asian Conference on Computer Vision (ACCV)},
title = {Unseen Object Segmentation in Videos via Transferable Representations},
year = {2018}
}
VOSTR: Video Object Segmentation via Transferable Representations
Yi-Wen Chen, Yi-Hsuan Tsai, Yen-Yu Lin and Ming-Hsuan Yang
International Journal of Computer Vision (IJCV), 2020
@article{Chen_VOSTR_2020,
author = {Yi-Wen Chen and Yi-Hsuan Tsai and Yen-Yu Lin and Ming-Hsuan Yang},
journal = {International Journal of Computer Vision (IJCV)},
title = {VOSTR: Video Object Segmentation via Transferable Representations},
volume = {128},
number = {4},
pages = {931-949},
year = {2020}
}
- Install Caffe: http://caffe.berkeleyvision.org/.
- Install MATLAB.
- Clone this repo:
git clone https://github.com/wenz116/TransferSeg.git
cd TransferSeg
- Prepare for MBS:
  - Go to the folder utils/MBS/mex.
  - Modify the OpenCV include and lib paths in compile.m (Linux) or compile_win.m (Windows).
  - Run compile (Linux) or compile_win (Windows) in MATLAB.
- Download the PASCAL VOC Dataset as the source image dataset, and put it in the data/PASCAL/VOC2011 folder.
- Download the DAVIS Dataset as the target video dataset, and put it in the data/DAVIS folder.
- Download the FCN model pre-trained on PASCAL VOC, and put it in the nets folder.
- Go to the folder scripts.
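Before running the steps below, it can help to confirm that the datasets and the model are where the scripts expect them. The following is a minimal sketch and not part of the repository: the name `check_layout` is hypothetical, and it assumes the folders named in the setup instructions are relative to the repository root.

```python
import os

# Folders named in the setup instructions; assumed relative to the repository root.
EXPECTED_DIRS = [
    "data/PASCAL/VOC2011",  # source image dataset
    "data/DAVIS",           # target video dataset
    "nets",                 # pre-trained FCN model
    "scripts",              # entry points for the remaining steps
]


def check_layout(repo_root):
    """Return the expected folders that are missing under repo_root."""
    missing = []
    for rel in EXPECTED_DIRS:
        # Split on "/" so the check also works with Windows path separators.
        if not os.path.isdir(os.path.join(repo_root, *rel.split("/"))):
            missing.append(rel)
    return missing
```

Calling `check_layout(".")` from the repository root returns an empty list when every folder is in place, and otherwise the list of folders that still need to be created or filled.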
- Compute optical flow of the input video. Run compute_optical_flow('<VIDEO_NAME>') in MATLAB. The optical flow images will be saved at data/DAVIS/Motion/480p/<VIDEO_NAME>/.
- Compute motion prior of the input video via minimum barrier distance. Run get_prior('<VIDEO_NAME>') in MATLAB. The motion prior images will be saved at data/DAVIS/Prior/480p/<VIDEO_NAME>/.
- Extract features of each category in PASCAL VOC. The extracted features will be saved at cache/features/, named as features_PASCAL_<CLASS_NAME>_fc7.p.
python get_feature_PASCAL.py <GPU_ID>
- Extract features of the input video. The extracted features will be saved at cache/features/, named as features_DAVIS_<VIDEO_NAME>_fc7.p.
python get_feature_DAVIS.py <GPU_ID> <VIDEO_NAME>
- Segment mining. The selected segments will be saved at data/DAVIS/Train/480p/<VIDEO_NAME>/.
python get_score.py <GPU_ID> <VIDEO_NAME>
- Self learning. The trained models will be saved at output/snapshot/.
./train.sh <GPU_ID> <VIDEO_NAME>
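The steps above can be chained from a single script. The sketch below is a convenience under stated assumptions, not part of the repository: `build_commands`, `expected_outputs` and `run_pipeline` are hypothetical helper names, the MATLAB steps are launched with `matlab -batch` (available from MATLAB R2019a), and every command is assumed to run from the scripts folder. Only the commands and output paths themselves are taken from the instructions above.

```python
import subprocess


def build_commands(gpu_id, video_name, include_pascal=True):
    """List the commands for one video, in the order given in the steps above."""
    gpu = str(gpu_id)
    cmds = [
        # MATLAB steps; launching them with "-batch" is an assumption.
        ["matlab", "-batch", "compute_optical_flow('%s')" % video_name],
        ["matlab", "-batch", "get_prior('%s')" % video_name],
    ]
    if include_pascal:
        # This script takes no video argument, so it is assumed to be
        # needed only once rather than once per video.
        cmds.append(["python", "get_feature_PASCAL.py", gpu])
    cmds += [
        ["python", "get_feature_DAVIS.py", gpu, video_name],
        ["python", "get_score.py", gpu, video_name],
        ["./train.sh", gpu, video_name],
    ]
    return cmds


def expected_outputs(video_name):
    """Output locations stated in the steps above, as written there."""
    return {
        "optical_flow": "data/DAVIS/Motion/480p/%s/" % video_name,
        "motion_prior": "data/DAVIS/Prior/480p/%s/" % video_name,
        "video_features": "cache/features/features_DAVIS_%s_fc7.p" % video_name,
        "mined_segments": "data/DAVIS/Train/480p/%s/" % video_name,
        "snapshots": "output/snapshot/",
    }


def run_pipeline(gpu_id, video_name, scripts_dir="scripts", dry_run=True):
    """Print each command; execute it in scripts_dir unless dry_run is set."""
    for cmd in build_commands(gpu_id, video_name):
        print(" ".join(cmd))  # display only; arguments are not shell-quoted
        if not dry_run:
            subprocess.check_call(cmd, cwd=scripts_dir)
```

For example, `run_pipeline(0, "blackswan")` only prints the six commands for the DAVIS video blackswan, and passing `dry_run=False` executes them in order and stops at the first failure. The dry run is the default so that the command list can be inspected before any GPU time is spent.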
The model and code are available for non-commercial research purposes only.
- 12/2018: code released