This code implements the state-of-the-art video-segmentation / rotoscoping tool described in the CVPR 2017 paper ROAM: a Rich Object Appearance Model with Application to Rotoscoping by Ondrej Miksik, Juan-Manuel Perez-Rua, Philip H. S. Torr and Patrick Perez.
@inproceedings{miksik2017roam,
author = {Ondrej Miksik and Juan-Manuel Perez-Rua and Philip H.S. Torr and Patrick Perez},
title = {ROAM: a Rich Object Appearance Model with Application to Rotoscoping},
booktitle = {CVPR},
year = {2017}
}
For any questions about the code or the paper, feel free to contact us. More info can be found on the project page: http://www.miksik.co.uk/projects/rotoscoping/roam.html
The code is written in C++11 and can be compiled and run even on machines without any GPU (see below).
However, we strongly recommend using CUDA
to achieve faster run-times.
The only prerequisite is OpenCV 3.0
or newer.
We provide a CMake project for easy compilation on any platform.
The code was tested on Windows 10, Ubuntu 16.04, and Fedora 25 with Visual Studio 2013
or newer and g++ 6.3.1
or newer.
To compile the system, use the standard cmake approach:
-
Clone ROAM into
/roam
, e.g.:$ git clone git@github.com:omiksik/roam.git roam
-
Go to the downloaded folder
$ cd roam
-
Create a build directory
$ mkdir build
-
Run your CMake tool, either
$ cmake ..
or use
CMake GUI
. -
Compile it. On linux, use standard
$ make -j4
On windows, open the solution (
build/ROAM.sln
) in Visual Studio and build all in release mode. -
Done! Everything is built in the
build/bin
directory. See below how to run ROAM.
The default option is compilation without CUDA to ensure the code would compile without any problems on any machine.
However, we strongly recommend using GPU to achieve faster run-times (by a factor of 10 on some machines).
Prerequisite are the CUDA
drivers and SDK.
Then, set the WITH_CUDA
option to ON
in CMake
$ cmake -DWITH_CUDA=ON ..
and compile.
ROAM is built in build/bin
directory.
The main application is roam_cli
roam_cli --ini=${FIRST_FRAME_BINARY_MASK} --seq=${LIST_OF_FRAMES} --con=${PARAMETERS} --out=${OUTPUT_DIR} --win=${ZERO_PADDING_SIZE}
The required inputs are:
-
${FIRST_FRAME_BINARY_MASK} - path to the segmentation mask of the first frame. ROAM will track the object delineated by this mask starting from the first file of the video sequence.
-
${LIST_OF_FRAMES} - path to the text file containing list of all video images for a particular sequence. On linux, this file can be generated e.g. by
$ ls -1v .
-
${PARAMETERS} - path to the YAML file with the configuration of the tracker. Examples can be found in the YAML folder of the source code.
-
${OUTPUT_DIR} - path to the output directory
-
${ZERO_PADDING_SIZE} - optional
-
Download the DAVIS dataset from here.
-
Uncompress the file, e.g. with linux command
$ unzip DAVIS-data.zip -d ~/Documents/DAVIS
- Go to the directory with annotated masks and video sequences.
$ cd ~/Documents/DAVIS
- Create a text file with the sequence you want to try, for instance:
$ cd ~/Documents/DAVIS/JPEGImages/480p/blackswan/
$ ls -1v *.jpg > list.txt
- Create an output folder and run it
$ mkdir roam_output
$ roam_cli --ini="~/Documents/DAVIS/Annotations/480p/blackswan/00000.png" --seq="~/Documents/DAVIS/JPEGImages/480p/blackswan/list.txt" --con=yaml/default.yaml --out=~/Documents/DAVIS/JPEGImages/480p/blackswan/roam_output/
Roam will process the sequence and produce the object masks in the directory roam_output
.
If you use re-parametrization, you should cite Vladimir Kolmogorov's GraphCut implementation.
@article{Boykov01pami,
author = {Yuri Boykov and Vladimir Kolmogorov},
title = {An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision},
journal = {T-PAMI},
year = {2001},
}