This repository contains the original implementation of the solutions described in our IROS 2018 paper.
This repository is released under the GNU General Public License (refer to the LICENSE file for details).
If you make use of this data and software, please cite the following reference in any publications:
```
@Inproceedings{herranz2018,
  Author    = {Herranz-Perdiguero, C. and Redondo-Cabrera, C. and L\'opez-Sastre, R.~J.},
  Title     = {In pixels we trust: From Pixel Labeling to Object Localization and Scene Categorization},
  Booktitle = {IROS},
  Year      = {2018}
}
```
- In order to use this software, clone this repository. We will call that directory `PROJECT_ROOT`. There, we will have three folders:
  - `PROJECT_ROOT/deeplab_NYU` contains the code needed to perform semantic segmentation.
  - `PROJECT_ROOT/NYU_depth_v2` contains the NYU Depth v2 dataset.
  - `PROJECT_ROOT/Classification` contains the code needed to perform scene classification.
- Download the NYU Depth v2 dataset. To do so, go to `PROJECT_ROOT/NYU_depth_v2` and run the following script:

  ```bash
  cd $PROJECT_ROOT/NYU_depth_v2
  ./Download_NYU_depth.sh
  ```

  The dataset is saved as a .mat file located in `$PROJECT_ROOT/NYU_depth_v2/NYUdepth`.
- Now, to download a pretrained model to initialize the network, as well as our best model, run:

  ```bash
  cd $PROJECT_ROOT/NYU_depth_v2
  ./Download_Models.sh
  ```

  You can find the file `init.caffemodel` in `$PROJECT_ROOT/deeplab_NYU/NYUdepth/model/resnet`. We will use that model to initialize the weights of the segmentation network. In `$PROJECT_ROOT/deeplab_NYU/NYUdepth/model/resnet/trained_model` you can find the final model used in our IROS 2018 paper.
- Finally, we need to extract the images from the downloaded .mat file. Under `PROJECT_ROOT/NYU_depth_v2`, just run the Matlab script `ObtainImages.m`. Also, to overcome memory limitations, we split the images into two overlapping halves by running `SplitWholeDataset.m`.
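The overlapped split performed by `SplitWholeDataset.m` can be sketched as follows. This is a minimal Python illustration of the idea, assuming each image is cut into left and right halves that share a band of columns around the midpoint (the split axis and overlap size are assumptions, not read from the Matlab script):

```python
def split_with_overlap(image, overlap):
    """Split an image (a list of pixel rows) into two vertical halves
    that share `overlap` extra columns on each side of the midpoint.
    Only a sketch of the idea behind SplitWholeDataset.m."""
    width = len(image[0])
    mid = width // 2
    left = [row[:mid + overlap] for row in image]   # columns 0 .. mid+overlap-1
    right = [row[mid - overlap:] for row in image]  # columns mid-overlap .. width-1
    return left, right

# Usage: an 8-column image split with a 1-column overlap on each half.
img = [list(range(8)), list(range(8, 16))]
left, right = split_with_overlap(img, overlap=1)
print(len(left[0]), len(right[0]))  # 5 5
```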
- First, we need to compile the DeepLab-based model. Run:

  ```bash
  cd $PROJECT_ROOT/deeplab_NYU
  make
  make pycaffe
  make matcaffe
  ```
- Once the compilation has completed successfully, run:

  ```bash
  cd $PROJECT_ROOT/deeplab_NYU
  python run_NYUdepth.py
  ```

  to train or test a model. Please note that the script includes a flag to choose between training and testing. By default, we search `$PROJECT_ROOT/deeplab_NYU/NYUdepth/model/resnet` for a .caffemodel, either to initialize the weights of the network during training or to serve as the model used during testing.
- To visualize the results once testing has completed, run the Matlab script `GetSegResults_Split.m` in `$PROJECT_ROOT/deeplab_NYU/matlab/my_script_NYUdepth`, with `do_save_results` set to `1`. We will need these results to perform the scene classification task.
- To evaluate the results using the different metrics reported in our paper, run `Get_mIOU.m` in `$PROJECT_ROOT/deeplab_NYU/matlab/my_script_NYUdepth`.
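For reference, the mean IoU reported by `Get_mIOU.m` follows the standard definition of the metric. Below is a minimal Python sketch of that definition; the function is illustrative, not a port of the Matlab script, and averaging only over classes that actually appear is an assumption:

```python
def mean_iou(gt_labels, pred_labels, num_classes):
    """Mean intersection-over-union over flattened label maps.
    Per-class IoU = TP / (TP + FP + FN), averaged over the classes
    present in either the ground truth or the prediction.
    Illustrative sketch only -- not a port of Get_mIOU.m."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for g, p in zip(gt_labels, pred_labels):
        if g == p:
            inter[g] += 1
            union[g] += 1
        else:
            union[g] += 1  # pixel missed for class g (false negative)
            union[p] += 1  # pixel wrongly claimed for class p (false positive)
    ious = [inter[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious)

# Usage: two classes; class 0 has IoU 1/2, class 1 has IoU 2/3.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))  # 0.5833...
```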
Once we have the results from our segmentation model, we can use them to categorize different scenes following the approach described in our paper. Simply:
- In `PROJECT_ROOT/Classification/histograms`, run `Generate_histograms.m` to generate the histogram-like features from the segmented images. The code allows you to choose the number of spatial pyramid levels you want to play with.
- Under `PROJECT_ROOT/Classification/SVM`, you can find the models used to perform scene classification with both a linear SVM and an additive kernel SVM. You just need to run `run_SVM.m` or `run_addtive_SVM.m`.
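The histogram-like features above follow the spatial pyramid idea: the label map is divided into progressively finer grids, a per-class label histogram is computed in each cell, and all cell histograms are concatenated. Here is a minimal Python sketch of that scheme; the uniform 2^l x 2^l grid per level is an assumption for illustration, not a port of `Generate_histograms.m`:

```python
def pyramid_histograms(label_map, num_classes, levels):
    """Concatenate per-cell label histograms over a spatial pyramid.
    Level l divides the label map into a 2^l x 2^l grid; each cell
    contributes one histogram with `num_classes` bins. Sketch of the
    feature described above, not a port of Generate_histograms.m."""
    rows, cols = len(label_map), len(label_map[0])
    feature = []
    for level in range(levels):
        cells = 2 ** level
        for i in range(cells):
            for j in range(cells):
                r0, r1 = i * rows // cells, (i + 1) * rows // cells
                c0, c1 = j * cols // cells, (j + 1) * cols // cells
                hist = [0] * num_classes
                for r in range(r0, r1):
                    for c in range(c0, c1):
                        hist[label_map[r][c]] += 1
                feature.extend(hist)
    return feature

# Usage: a 4x4 label map, 2 classes, 2 pyramid levels -> (1 + 4) * 2 = 10 bins.
labels = [[0, 0, 1, 1]] * 4
print(len(pyramid_histograms(labels, num_classes=2, levels=2)))  # 10
```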