This repo is the released code of the one-shot modulation network described in the CVPR 2018 paper:
```
@article{Yang2018osmn,
  author  = {Linjie Yang and Yanran Wang and Xuehan Xiong and Jianchao Yang and Aggelos K. Katsaggelos},
  title   = {Efficient Video Object Segmentation via Network Modulation},
  journal = {CVPR},
  year    = {2018}
}
```
In this work, we propose a meta neural network, called the modulator, that manipulates the intermediate layers of the segmentation network given the appearance of the object in the first frame. Our method takes only 140 ms/frame for inference on the DAVIS dataset.
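As a rough illustration of the idea (a sketch, not the repo's actual implementation), the modulator can be viewed as predicting per-channel scale parameters from the first-frame visual guide, which are then applied to intermediate feature maps of the segmentation network:

```python
import numpy as np

def modulate(features, scales):
    """Scale each channel of an intermediate feature map.

    features: (H, W, C) feature map of the segmentation network
    scales:   (C,) modulation parameters predicted from the first-frame guide
    """
    return features * scales[None, None, :]

# Toy example: one 32x32 feature map with 256 channels.
feat = np.random.rand(32, 32, 256).astype(np.float32)
scales = np.random.rand(256).astype(np.float32)  # would come from the modulator
out = modulate(feat, scales)
assert out.shape == feat.shape
```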
- Clone the repository:
```
git clone https://github.com/linjieyangsc/video_seg.git
```
- If necessary, install the required dependencies:
  - Python 2.7
  - TensorFlow r1.0 or higher (`pip install tensorflow-gpu`) along with standard dependencies
  - DenseCRF by Philipp Krähenbühl and Vladlen Koltun
  - Other Python dependencies: PIL (Pillow version), numpy, scipy
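A quick way to sanity-check the environment (a minimal sketch, assuming a GPU build of TensorFlow 1.x; the extra imports just confirm the Python dependencies are present):

```python
import numpy as np            # imported only to verify the install
import scipy                  # imported only to verify the install
from PIL import Image         # imported only to verify the install
import tensorflow as tf

print(tf.__version__)  # expect 1.0 or higher

# Build and run a trivial TF 1.x graph to confirm the session machinery works.
with tf.Session() as sess:
    print(sess.run(tf.constant("TensorFlow is working")))
```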
- Download the MS-COCO 2017 dataset from here.
- Download the VGG 16 model trained on ImageNet from the TF model zoo from here.
- Place the `vgg_16.ckpt` file inside `models/`.
- Run
```
python osmn_coco_pretrain.py --data_path DATA_PATH --model_save_path MODEL_SAVE_PATH --gpu_id GPU_ID --training_iters 200000
```
to train the model with the default learning rate of 1e-5. After it finishes, run
```
python osmn_coco_pretrain.py --data_path DATA_PATH --model_save_path MODEL_SAVE_PATH --gpu_id GPU_ID --training_iters 300000 --learning_rate 1e-6
```
to further train it with a decreased learning rate. Be sure to keep `MODEL_SAVE_PATH` the same as in the first step so that training restores from the existing checkpoint. The other arguments are listed by running `python osmn_coco_pretrain.py -h`. The two runs can also be chained with a small driver script, as sketched below.
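A minimal driver sketch for chaining the two pretraining stages (the paths and GPU id are placeholders to replace with your own):

```python
import subprocess

DATA_PATH = "/path/to/coco2017"        # placeholder
MODEL_SAVE_PATH = "models/osmn_coco"   # placeholder; keep identical across both stages
GPU_ID = "0"

# Stage 1a: default learning rate (1e-5) for 200k iterations.
subprocess.check_call([
    "python", "osmn_coco_pretrain.py",
    "--data_path", DATA_PATH,
    "--model_save_path", MODEL_SAVE_PATH,
    "--gpu_id", GPU_ID,
    "--training_iters", "200000",
])

# Stage 1b: resume from the same checkpoint directory with a lower learning rate.
subprocess.check_call([
    "python", "osmn_coco_pretrain.py",
    "--data_path", DATA_PATH,
    "--model_save_path", MODEL_SAVE_PATH,
    "--gpu_id", GPU_ID,
    "--training_iters", "300000",
    "--learning_rate", "1e-6",
])
```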
- Download the DAVIS 2017 dataset from here.
- Preprocess the dataset by running
```
python preprocessing/preprocess_davis.py DATA_DIR
```
- To finetune and evaluate the model on DAVIS 2017 as stated in the paper, run
```
python osmn_train_eval.py --data_path DATA_PATH --whole_model_path WHOLE_MODEL_PATH --result_path RESULT_PATH --model_save_path MODEL_SAVE_PATH_FT --gpu_id GPU_ID --batch_size 4 --fix_bn --randomize_guide --training_iters 50000 --learning_rate 1e-6
```
Here `WHOLE_MODEL_PATH` should be the path to the model saved by Stage 1 training. The other arguments can be listed by running `python osmn_train_eval.py -h`. After it finishes, the prediction results on DAVIS 2017 will be saved to `RESULT_PATH`.
- The trained model can be further improved by online one-shot finetuning on specific video sequences. Run the following command to finetune the model on either DAVIS 2016 or DAVIS 2017 (a toy illustration of the idea follows the command):
```
python osmn_online_finetune.py --whole_model_path WHOLE_MODEL_PATH --result_path RESULT_PATH --model_save_path MODEL_SAVE_PATH_OL --gpu_id GPU_ID --batch_size 1 --training_iters 100 --data_version [2016/2017]
```
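Conceptually, online one-shot finetuning briefly optimizes the offline-trained network on the first annotated frame of each test sequence (only 100 iterations above). A self-contained toy illustration of the idea with a per-pixel logistic model (not the repo's code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy "first frame": 64 pixels with 3 random features each plus a bias column,
# and a binary ground-truth mask derived from the first feature.
rng = np.random.RandomState(0)
feats = rng.rand(64, 3)
X = np.hstack([feats, np.ones((64, 1))])     # append a bias term
mask = (feats[:, 0] > 0.5).astype(np.float64)

w = np.zeros(4)                              # "offline" weights, adapted online
for _ in range(100):                         # mirrors --training_iters 100
    pred = sigmoid(X.dot(w))
    grad = X.T.dot(pred - mask) / len(mask)  # cross-entropy gradient
    w -= 1.0 * grad                          # adapt on the single annotated frame

acc = ((sigmoid(X.dot(w)) > 0.5) == mask.astype(bool)).mean()
print("first-frame accuracy: %.2f" % acc)
```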
- To evaluate either the Stage 1 or the Stage 2 model on DAVIS, run the following command:
```
python osmn_train_eval.py --data_path DATA_PATH --whole_model_path WHOLE_MODEL_PATH --result_path RESULT_PATH --only_testing --data_version [2016/2017] --gpu_id GPU_ID [--save_score] [--use_full_res]
```
`--save_score` needs to be set when testing on DAVIS 2017; it is required to combine the score maps of different objects into the final prediction. `--use_full_res` uses full-resolution images for inference, which can be beneficial when the visual guide is small. Then run the following command to get the mIU score (a sketch of the metric follows below):
```
python davis_eval.py DATA_PATH RESULT_PATH DATASET_VERSION DATASET_SPLIT
```
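The mIU that `davis_eval.py` reports is the mean intersection-over-union between predicted and ground-truth masks. A minimal sketch of the metric itself (not the evaluation script):

```python
import numpy as np

def iou(pred, gt):
    """IoU between two binary masks of the same shape."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / float(union) if union > 0 else 1.0

# Toy example with two 4x4 masks.
pred = np.zeros((4, 4), dtype=np.uint8); pred[:2, :2] = 1
gt = np.zeros((4, 4), dtype=np.uint8); gt[:2, :] = 1
print(iou(pred, gt))  # intersection 4 / union 8 = 0.5
```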
- To evaluate the model on YouTube Objects, first download the dataset, then preprocess it by running
```
python preprocessing/preprocess_youtube.py DATA_DIR
```
Next, evaluate the model:
```
python osmn_eval_youtube.py --data_path DATA_PATH --whole_model_path WHOLE_MODEL_PATH --result_path RESULT_PATH --gpu_id GPU_ID
```
Finally, use the following command to get the mIU score:
```
python youtube_eval.py DATA_PATH RESULT_PATH
```
- We release official models for both Stage 1 and Stage 2, which can be downloaded from here. The Stage 1 model obtains an mIU of 72.2 on DAVIS 2016 and 52.5 on DAVIS 2017. The Stage 2 model obtains an mIU of 74.0 on DAVIS 2016.
YoutubeVOS (https://youtube-vos.org/) is currently the largest video object segmentation dataset. To train and evaluate the model on YoutubeVOS, run
```
python osmn_train_eval_ytvos.py --data_path DATA_PATH --result_path RESULT_PATH --model_save_path MODEL_SAVE_PATH --gpu_id GPU_ID --batch_size BATCH_SIZE --randomize_guide --training_iters 200000
```
After it finishes, further train the model with a decreased learning rate of 1e-6 for 100k iterations.
To evaluate the model, run the following command to generate results for each object:
```
python osmn_train_eval_ytvos.py --data_path DATA_PATH --whole_model_path WHOLE_MODEL_PATH --result_path RESULT_PATH --only_testing --gpu_id GPU_ID --save_score
```
Then use the script `ytvos_merge_result.py` to merge objects in the same video (a sketch of this combination step follows below):
```
python ytvos_merge_result.py DATA_PATH PRED_PATH OUTPUT_PATH DATA_SPLIT
```
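One plausible way to resolve overlaps when combining per-object score maps, shown here only as a sketch (the actual logic lives in `ytvos_merge_result.py`), is a per-pixel argmax with a background threshold:

```python
import numpy as np

def merge_score_maps(score_maps, bg_thresh=0.5):
    """Merge per-object score maps into a single label map (a sketch).

    score_maps: (N, H, W) array, one foreground probability map per object.
    Returns an (H, W) label map: 0 = background, 1..N = object ids.
    """
    scores = np.asarray(score_maps)
    labels = scores.argmax(axis=0) + 1          # best-scoring object per pixel
    labels[scores.max(axis=0) < bg_thresh] = 0  # low-confidence pixels -> background
    return labels

# Toy example: two objects on a 2x2 frame.
maps = np.array([[[0.9, 0.2], [0.1, 0.1]],
                 [[0.3, 0.8], [0.2, 0.4]]])
print(merge_score_maps(maps))
# [[1 2]
#  [0 0]]
```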
Finally, upload the results to the Codalab website of the ECCV challenge.
If you have any questions regarding the repo, please send an email to Linjie Yang (yljatthu@gmail.com).