Created by Subhransu Maji
Shape constrained MRFs are designed to exploit the tiered structure and the rectangular shapes of the buildings in typical skyline images to enable faster and more accurate labelling. We are also releasing the skyline-12 dataset, consisting of 120 high-resolution images from 12 different cities. Each image contains labels of the individual buildings for benchmarking. The images are split into training, validation, and test sets for evaluation purposes.
The images were collected from Flickr, where they were shared under Creative Commons licences.
If you find the code and the dataset useful in your research, please consider citing:
@inproceedings{tonge14CVPR,
Author = {Rashmi Tonge and Subhransu Maji and C.V. Jawahar},
Title = {Parsing World's Skylines with Shape Constrained MRFs},
Booktitle = {Computer Vision and Pattern Recognition},
Year = {2014}
}
The code is released under the simplified BSD License (refer to the LICENSE file for details).
Prerequisites:
- MATLAB 2011a (I have tested the code on 64-bit Mac OS X 10.9)
- VLFEAT: http://www.vlfeat.org/
- A machine with 2GB+ of memory
Here are the steps for installation:
- Clone the git repository into your local directory.
- Download the skyline-12 dataset.
- Set the paths. In the `skylineConfig.m` file, change the path variables to reflect the location of the downloaded data and the `VLFEAT` directory.
- Run `startup.m`. You should see the message "Startup done".
- Run `compile.m`. This compiles all the MEX files needed for the code to run. You are all set.
Note: With the clang compiler on my laptop running OS X 10.9, I had to pass CXXFLAGS="-std=c++11" as an additional flag. On a Linux machine I had to remove this flag in order to compile the code.
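Once the paths are set, the last two installation steps can be run from the MATLAB prompt in the repository's main directory. This is a minimal sketch, assuming `startup.m` and `compile.m` can be invoked by name without arguments. The commented-out `mex` line only illustrates how a C++11 flag can be passed to MATLAB's `mex` command. `exampleFile.cpp` is a hypothetical file name, and how `compile.m` itself sets its flags is not shown here.

```matlab
% Run from the repository's main directory after editing skylineConfig.m.
startup   % should print "Startup done"
compile   % builds the MEX files needed by the code

% Illustration only (hypothetical file name): passing a C++11 flag to mex.
% mex('CXXFLAGS=$CXXFLAGS -std=c++11', 'exampleFile.cpp');
```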
In the main directory there are two demo files:
- `demoAnno.m`: This loads the annotations for the city of Chicago and displays them interactively. Press the `h` key for help. The code also contains other examples, such as how to display all annotations or only those in the train set.
- `demoParse.m`: This loads an image and runs various algorithms for parsing. The code also displays the intermediate steps of parsing and evaluates the resulting parse in terms of mean average overlap (`MAO`) scores (described in the paper).
Here are the steps to evaluate various methods for parsing:
- Load annotations: `anno=loadAnno()`
- Load config variables: `conf=skylineConfig()`
- To evaluate on the `test` set, run `evalImageSet(conf, anno, 'test')`. This will run the various methods on `test` and return their `MAO` scores.

This should reproduce the results in the paper (as listed below). The run times were measured on a desktop with an Intel CPU @ 3.20GHz.
Method | MAO | Running time |
---|---|---|
Unary | 54.5% | n/a |
Standard MRF | 62.3% | 69.5s |
Tiered MRF | 59.4% | 7.5s |
Rectangle MRF | 62.0% | 5.5s |
Refined MRF | 63.4% | 9.2s |
Note: minor differences might arise due to randomization in k-means for the unary potentials.
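The evaluation steps above can be run from the MATLAB prompt roughly as follows. The function names and arguments are the ones listed in the steps. The output variable name `scores` and the call to `disp` are assumptions, since the exact format of the returned scores is not described here.

```matlab
% Load the annotations and the configuration.
anno = loadAnno();
conf = skylineConfig();

% Run the parsing methods on the test split and collect their MAO scores.
% 'scores' is an assumed name for whatever evalImageSet returns.
scores = evalImageSet(conf, anno, 'test');
disp(scores);
```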
The `skylineConfig.m` file has all the parameters for running the code. For example, you can turn off the display by setting `conf.display=false`. If you want to save the output of `rectangleMRF.m` as a .gif file, you can do so by setting `conf.gif=true`.
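As a minimal sketch, the two settings mentioned above could be applied like this. It assumes the struct returned by `skylineConfig()` can be modified before it is passed on to the other functions. If that does not hold, edit the same fields directly in `skylineConfig.m`.

```matlab
conf = skylineConfig();

conf.display = false;  % turn off the display
conf.gif     = true;   % save the output of rectangleMRF.m as a .gif file
```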
For speed, the image is scaled down if its maximum dimension is greater than `conf.param.image.maxDim=2000`. You can make the code run faster by lowering this value. The parsing parameters are no longer optimal if this is changed, but in my experience `rectangleMRF` and `standardMRF` work fine for a range of values of `maxDim`.
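To give a feel for what a given `maxDim` means, the snippet below works through the arithmetic of capping the longer image side. It is illustrative only: the image size is hypothetical, and the formula is one common way to do such downscaling, not necessarily how the repository resizes images.

```matlab
% Illustrative only: cap the longer image side at maxDim.
% This is an assumption, not the repository's resizing code.
maxDim  = 2000;          % the value of conf.param.image.maxDim quoted above
imSize  = [3000 4500];   % hypothetical image size as [rows cols]

scale   = min(1, maxDim / max(imSize));  % images are never upscaled
newSize = round(imSize * scale);

fprintf('scale = %.3f, new size = %d x %d\n', scale, newSize(1), newSize(2));
```

Halving `maxDim` in this sketch halves each side of a large image and so cuts the pixel count to roughly a quarter, which is why lowering the value speeds things up.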