How to load indexed images (color maps) using dataset loader #622

crazysal · 2018-10-09T05:02:32Z

The dataset loader returns images converted to RGB. That is fine for most purposes, but it makes loading an indexed image, for example a segmentation mask, a pain.

This is the line :
https://github.com/pytorch/vision/blob/master/torchvision/datasets/folder.py#L161

By default, PIL loads an indexed image as its color-map indices. Can we return the image without converting it to RGB, or add a flag that controls the conversion?

Or is there a better way of creating a dataloader for a folder of masks?
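To illustrate the difference, here is a minimal sketch, assuming Pillow and NumPy are installed. The 2x2 mask and its three-color palette are made up for the example, and the in-memory PNG only stands in for a mask file on disk.

```python
import io

import numpy as np
from PIL import Image

# Hypothetical 2x2 mask holding class indices 0..2 (row-major order).
mask = Image.frombytes("P", (2, 2), bytes([0, 1, 2, 1]))
# Made-up palette: index 0 = black, 1 = red, 2 = green; the rest is padding.
mask.putpalette([0, 0, 0, 255, 0, 0, 0, 255, 0] + [0] * (768 - 9))

# Round-trip through an in-memory PNG to mimic loading a mask from disk.
buffer = io.BytesIO()
mask.save(buffer, format="PNG")
buffer.seek(0)
loaded = Image.open(buffer)

# Without conversion the array holds the class indices.
as_indices = np.array(loaded)
# After convert("RGB") the indices are replaced by palette colors.
as_rgb = np.array(loaded.convert("RGB"))
```

Here `as_indices` has shape (2, 2) and can be used directly as a label map, while `as_rgb` has shape (2, 2, 3) and would need to be mapped back to class indices.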

fmassa · 2018-10-09T05:10:27Z

The ImageFolder Dataset class is mostly suitable for classification problems. Indeed, you'll see that the target in that case is a number, and not an image.

You can very easily write your own dataset class that satisfies the constraints of your dataset. It's very simple, something like:

from PIL import Image


class Dataset(object):
    def __init__(self):
        # load paths for your dataset and put them in self.data,
        # as a list of (image_path, mask_path) pairs
        self.data = ...

    def __getitem__(self, idx):
        # get the paths to both the image and its mask
        data_path, mask_path = self.data[idx]
        # load the image as RGB; leave the mask unconverted so that
        # an indexed mask keeps its class indices
        image = Image.open(data_path).convert("RGB")
        mask = Image.open(mask_path)
        # add transforms here
        return image, mask

    def __len__(self):
        return len(self.data)

crazysal · 2018-10-09T05:29:17Z

I can write my own loader, yes. But segmentation is essentially classification of each pixel, so why the redundancy?
In all my projects until now I wrote my own pipelines; this is the first one using the ImageFolder dataset.

So at the moment there are no built-in methods to load segmentation maps, correct? (And no transforms for color maps that preserve pixel values instead of interpolating them?)

Can we have a feature request for that? I'm willing to chip in, since I realized I have spent a considerable amount of time writing the same segmentation ingestion pipeline for different projects.

Thanks for the quick reply 👍

fmassa · 2018-10-09T05:44:06Z

Hey,

There have been a number of requests for adding support for masks or segmentation data in torchvision.
The reason we haven't added anything "out-of-the-box" yet is that every use case is slightly different, and writing your own dataset for a particular case is usually faster than converting some new dataset to a pre-defined format (imagine converting from Pascal VOC format to COCO format).

The current recommended way of handling arbitrary transforms on both images and masks is to use the functional transforms. This lets you choose the right interpolation method for each data type (images usually use bilinear interpolation, masks require nearest-neighbor interpolation, etc.).

BTW, what's the segmentation dataset that you are currently working on?
Currently in torchvision we mostly have classification datasets, but I could add support for some standard segmentation datasets.

crazysal · 2018-10-09T06:08:17Z

I am working on a domain adaptation task, with GTA and SYNTHIA as source domains and the Cityscapes dataset as the target domain.

It's basically an improvement to this paper

I understand your point about the diversity of formats, but adding support for the popular ones (COCO, VOC, Cityscapes, Mapillary, etc.), even if not the synthetic ones, would be a big help.

Thanks for your time.

fmassa · 2018-10-09T06:18:34Z

I think COCO, VOC and potentially Cityscapes could be added. I haven't heard of Mapillary before.

At some point we should decide if we want all possible datasets in torchvision, or only the most used ones.
This is something I'm going to be deciding very soon.

On one hand, providing all possible datasets might keep users from realizing that, if a dataset is not present in torchvision, they can implement their own logic for loading the data without much trouble.
The hardest part (iterating efficiently over the data) is handled by DataLoader, so very little is left for the Dataset to do.
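To show how little the dataset itself has to do, here is a small sketch assuming PyTorch is installed. `RandomSegmentationData` is a hypothetical in-memory stand-in for a dataset that would normally read image and mask files; the sizes and class count are arbitrary.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class RandomSegmentationData(Dataset):
    """Hypothetical in-memory dataset standing in for real image/mask files."""

    def __init__(self, num_samples=8, num_classes=3, size=16):
        generator = torch.Generator().manual_seed(0)
        # Random float images of shape (3, size, size).
        self.images = torch.rand(num_samples, 3, size, size, generator=generator)
        # Random integer masks holding one class index per pixel.
        self.masks = torch.randint(
            0, num_classes, (num_samples, size, size), generator=generator
        )

    def __getitem__(self, idx):
        return self.images[idx], self.masks[idx]

    def __len__(self):
        return len(self.images)


# DataLoader takes care of batching and shuffling.
loader = DataLoader(RandomSegmentationData(), batch_size=4, shuffle=True)
batches = list(loader)
```

Each element of `batches` is an `(images, masks)` pair; the dataset only implements `__getitem__` and `__len__`, and DataLoader stacks the samples into batches.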
