See our project website here.
To develop this code, we used the RVOS model (a recurrent end-to-end model for video object segmentation), which can be found here.
- Clone the repo:

```bash
git clone https://github.com/imatge-upc/rvos-mots.git
```

- Install the requirements:

```bash
pip install -r requirements.txt
```

- Install PyTorch 1.0 (choose the whl file according to your setup, e.g. your CUDA version):

```bash
pip3 install https://download.pytorch.org/whl/cu100/torch-1.0.1.post2-cp36-cp36m-linux_x86_64.whl
pip3 install torchvision
```
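As a quick sanity check (not part of the original instructions), you can confirm that the installed PyTorch build sees your CUDA setup:

```bash
# Should print the PyTorch version and True if CUDA is usable
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```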
Download the YouTube-VOS dataset from their website. You will need to register on CodaLab to download the dataset. Create a folder named `databases` in the parent folder of the root directory of this project and place the dataset there in a folder named `YouTubeVOS`. The root directory (the `rvos` folder) and the `databases` folder should be in the same directory.

The training of the RVOS model for YouTube-VOS uses a split of the train set into two subsets: train-train and train-val. The model is trained on the train-train subset and validated on the train-val subset to decide whether the model should be saved. To train the model with this split, the code requires two json files in the `databases/YouTubeVOS/train/` folder, named `train-train-meta.json` and `train-val-meta.json`, with the same format as the `meta.json` included when downloading the dataset. You can also download the partition used in our experiments from the following links:
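For reference, the `train` folder is then expected to look roughly like this (a sketch assuming the standard YouTube-VOS download layout, in which `Annotations`, `JPEGImages`, and `meta.json` ship with the dataset):

```
databases/YouTubeVOS/train/
├── Annotations/
├── JPEGImages/
├── meta.json
├── train-train-meta.json
└── train-val-meta.json
```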
Download the DAVIS 2017 dataset from their website at 480p resolution. Create a folder named `databases` in the parent folder of the root directory of this project and place the dataset there in a folder named `DAVIS2017`. The root directory (the `rvos` folder) and the `databases` folder should be in the same directory.
Download the KITTI-MOTS dataset from their website. Create a folder named `databases` in the parent folder of the root directory of this project and place the dataset there in a folder named `KITTIMOTS`. The root directory (the `rvos` folder) and the `databases` folder should be in the same directory.
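Whichever datasets you use, the resulting directory structure should look like this (a sketch based on the instructions above):

```
parent-folder/
├── rvos/          # this repository (root directory)
└── databases/
    ├── YouTubeVOS/
    ├── DAVIS2017/
    └── KITTIMOTS/
```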
To speed up data loading considerably, we recommend generating an LMDB index of the dataset by running the command that corresponds to the dataset you are using:

```bash
python dataset_lmdb_generator.py -dataset=youtube     # for YouTube-VOS
python dataset_lmdb_generator.py -dataset=davis2017   # for DAVIS 2017
python dataset_lmdb_generator.py -dataset=kittimots   # for KITTI-MOTS
```
- Train the model for one-shot video object segmentation with `python train_previous_mask.py -model_name model_name`. Checkpoints and logs will be saved under `../models/model_name`.
- Train the model for zero-shot video object segmentation with `python train.py -model_name model_name`. Checkpoints and logs will be saved under `../models/model_name`.
- Other arguments can be passed as well. For convenience, scripts to train with typical parameters are provided under `scripts/`.
- Plot loss curves at any time with `python plot_curves.py -model_name model_name`.
We provide bash scripts to evaluate models on the YouTube-VOS and DAVIS 2017 datasets. You can find them under the `scripts` folder. On the one hand, `eval_one_shot_youtube.sh` and `eval_zero_shot_youtube.sh` generate the results for the YouTube-VOS dataset on one-shot and zero-shot video object segmentation, respectively. On the other hand, `eval_one_shot_davis.sh` and `eval_zero_shot_davis.sh` generate the results for the DAVIS 2017 dataset on one-shot and zero-shot video object segmentation, respectively.

Furthermore, in the `src` folder, `prepare_results_submission.py` and `prepare_results_submission_davis` can be used to convert the results into the appropriate format for the official evaluation servers of YouTube-VOS and DAVIS, respectively.
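For example, a typical evaluate-then-submit workflow might look like this (a sketch; the arguments the preparation script expects are defined in `src`, so check them before running):

```bash
# Generate one-shot results for YouTube-VOS
bash scripts/eval_one_shot_youtube.sh

# Convert the results for the official evaluation server
# (hypothetical invocation; see the script for its actual arguments)
python src/prepare_results_submission.py
```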
You can run `demo.py` to generate the segmentation masks of a video. Just do:

```bash
python demo.py -model_name one-shot-model-davis --overlay_masks
```

and it will generate the resulting masks.
To run the demo for your own videos:

- Extract the frames to a folder (make sure their names are in order, e.g. 00000.jpg, 00001.jpg, ...); see the ffmpeg sketch below.
- Have the initial mask corresponding to the first frame (e.g. 00000.png).
- Run:

```bash
python demo.py -model_name one-shot-model-davis -frames_path path-to-your-frames -mask_path path-to-initial-mask --overlay_masks
```

For zero-shot (i.e. without an initial mask), run:

```bash
python demo.py -model_name zero-shot-model-davis -frames_path path-to-your-frames --zero_shot --overlay_masks
```

You can also use the argument `-results_path` to save the results to the folder you prefer.
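A minimal way to produce such an ordered frame folder, assuming ffmpeg is installed and `my-video.mp4` stands in for your file:

```bash
mkdir -p my-frames
# Write zero-padded frame names starting at 00000.jpg
ffmpeg -i my-video.mp4 -start_number 0 my-frames/%05d.jpg
```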
All models specified in the original paper, Curriculum Learning for Recurrent Video Object Segmentation, can be downloaded from the following links. You may need to request access:

Weights for models trained for the YouTube-VOS and DAVIS challenges can also be downloaded from the following links:

Extract and place the obtained folder under the `models` directory. You can then run the evaluation scripts with the downloaded model by setting `args.model_name` to the name of the folder.
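For example (a sketch; the archive name is illustrative, and we assume the same `../models` location used for training checkpoints):

```bash
mkdir -p ../models
unzip one-shot-model-davis.zip -d ../models/   # archive name is illustrative
python demo.py -model_name one-shot-model-davis --overlay_masks
```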
For questions and suggestions, use the issues section or send an e-mail to maria.gonzalez.calabuig@upc.edu, cventuraroy@uoc.edu, or xavier.giro@upc.edu.