What does this repository contain?
This repository contains 13-class labels for both the training and test sets of NYUv2, so you can avoid the hassle of parsing the data from the .mat files. If you want to train a network for 13-class segmentation from RGB data, this repository provides both the training/test images and the corresponding ground truth labels. However, if your network additionally needs depth data (either depth images or DHA features), you will need to download the dataset from the NYUv2 website (~2.8GB) as well as the corresponding toolbox. To summarise, this repository contains the following
train_labels_13 contains the ground truth annotations for 13 classes for the NYUv2 training set, while test_labels_13 contains the ground truth for the NYUv2 test set.
It is important to remember that the label files are ordered but the RGB files are not, though you can order the files using
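One possible way to bring the RGB files into the same order as the labels is to sort them by the numeric index in the file name. The sketch below is not necessarily the method this repository intended. It assumes every file name ends in an integer index, and the names `numeric_key`, `sort_by_index` and the example file names are hypothetical.

```python
import re
from pathlib import Path


def numeric_key(path):
    """Return the last run of digits in the file name as an int (-1 if none)."""
    digits = re.findall(r"\d+", Path(path).stem)
    return int(digits[-1]) if digits else -1


def sort_by_index(paths):
    """Sort paths by their numeric index rather than alphabetically."""
    return sorted(paths, key=numeric_key)


# Hypothetical file names, used only to illustrate the ordering.
rgb_files = ["rgb_10.png", "rgb_2.png", "rgb_1.png"]
ordered = sort_by_index(rgb_files)
print(ordered)
```

Sorting on the integer value instead of the raw string keeps index 10 after index 2 even when the names are not zero-padded.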
How do I obtain the DHA features?
Look for this in the corresponding SUN RGB-D meta data repository. You will need the rotation matrix for each training and test image. They are available in camera_rotations_NYU.txt. These matrices are used to align the floor normal vector with the canonical gravity vector. There are 1449 rotation matrices in total, and the indices of the matrices corresponding to the training and test data are in splits.mat. Remember that the labels are ordered, i.e. the training label files are named with indices 1 to 795, and similarly for the test dataset.
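The selection and alignment steps can be sketched as follows. This is only an illustration under stated assumptions: the rotation matrices are taken to be already loaded into an array of shape (1449, 3, 3) (the on-disk layout of camera_rotations_NYU.txt is not described here), the split indices are taken to be 1-based as is usual for MATLAB files, and each matrix is assumed to map camera coordinates into the gravity-aligned frame. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def select_rotations(all_rotations, indices_1based):
    """Pick the rotation matrices for a split, given 1-based indices."""
    idx = np.asarray(indices_1based, dtype=int).ravel() - 1  # to 0-based
    return all_rotations[idx]


def align_points(points, rotation):
    """Rotate an (N, 3) point cloud; each row p becomes rotation @ p."""
    return points @ rotation.T


# Synthetic stand-in for the 1449 matrices: identities, except the third
# entry, which is a 90-degree rotation about the x axis.
all_rotations = np.stack([np.eye(3)] * 1449)
all_rotations[2] = np.array([[1.0, 0.0, 0.0],
                             [0.0, 0.0, -1.0],
                             [0.0, 1.0, 0.0]])

# Hypothetical split indices (1-based).
train_rotations = select_rotations(all_rotations, [3, 1])

points = np.array([[0.0, 1.0, 0.0]])
aligned = align_points(points, train_rotations[0])
print(aligned)
```

If the stored matrices follow the opposite convention, the transpose of each matrix would be applied instead.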
How do I benchmark?
What are the classes and where is the mapping from the class number to the class name?
The mapping is also available in the SceneNetv1.0 repository.