How to load indexed images (color maps) using dataset loader #622

Open
crazysal opened this issue Oct 9, 2018 · 5 comments

@crazysal

crazysal commented Oct 9, 2018

The dataset loader returns images converted to RGB. That is great for most purposes, but it makes loading an indexed image, for example a segmentation mask, a pain.

This is the line:
https://github.com/pytorch/vision/blob/master/torchvision/datasets/folder.py#L161

PIL loads indexed (color-mapped) images as they are by default. Can we return the image without converting it to RGB, or add a flag that controls the conversion?

Or is there a better way of creating a dataloader from the masks folder?
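To illustrate the behavior being described, here is a minimal sketch (not torchvision code) that builds a tiny indexed PNG in memory and reopens it with PIL. The 2x2 image, its palette, and the variable names are made up for the example. Opened without conversion, the image stays in mode "P" and its pixels are the class indices; after `convert("RGB")` only the palette colors remain.

```python
import io

import numpy as np
from PIL import Image

# Hypothetical 2x2 indexed mask: each byte is a class index, row by row.
mask = Image.frombytes("P", (2, 2), bytes([0, 1, 2, 1]))
# Palette of 256 RGB triples; only the first three entries matter here.
mask.putpalette([0, 0, 0, 255, 0, 0, 0, 255, 0] + [0] * (768 - 9))

# Round-trip through an in-memory PNG, as if it had been read from disk.
buffer = io.BytesIO()
mask.save(buffer, format="PNG")
buffer.seek(0)

loaded = Image.open(buffer)
indices = np.array(loaded)            # class indices, shape (2, 2)
rgb = np.array(loaded.convert("RGB"))  # palette colors, shape (2, 2, 3)

print(loaded.mode)
print(indices)
print(rgb.shape)
```

The index array is what a segmentation loss would typically consume; the RGB array would first have to be mapped back to indices.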

@fmassa
Member

fmassa commented Oct 9, 2018

The ImageFolder Dataset class is mostly suitable for classification problems. Indeed, you'll see that the target in that case is a number, and not an image.

You can very easily write your own dataset class that satisfies the constraints of your dataset. It's very simple, something like:

from PIL import Image


class Dataset(object):
    def __init__(self):
        # load the (image path, mask path) pairs for your dataset into self.data
        self.data = ...

    def __getitem__(self, idx):
        # get the paths to the image and its mask
        data_path, mask_path = self.data[idx]
        # load the image as RGB; leave the mask unconverted so that
        # indexed (palette) pixel values are preserved
        image = Image.open(data_path).convert("RGB")
        mask = Image.open(mask_path)
        # add transforms
        return image, mask

    def __len__(self):
        return len(self.data)
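
A more concrete variant of the sketch above might look like the following. It rests on assumptions that are not in the thread: the class name `SegmentationFolder` is hypothetical, images and masks are assumed to live in two separate directories, and each mask is assumed to share its image's file name. The mask is returned as an integer NumPy array of class indices rather than as a PIL image.

```python
import os

import numpy as np
from PIL import Image


class SegmentationFolder:
    """Hypothetical dataset pairing image_dir/<name> with mask_dir/<name>."""

    def __init__(self, image_dir, mask_dir):
        # Assumption: every image has a mask with the same file name.
        names = sorted(os.listdir(image_dir))
        self.data = [
            (os.path.join(image_dir, name), os.path.join(mask_dir, name))
            for name in names
        ]

    def __getitem__(self, idx):
        image_path, mask_path = self.data[idx]
        image = Image.open(image_path).convert("RGB")
        # No convert() on the mask, so indexed pixel values stay class indices.
        mask = Image.open(mask_path)
        return image, np.array(mask, dtype=np.int64)

    def __len__(self):
        return len(self.data)
```

Because the class only implements `__getitem__` and `__len__`, it follows the same map-style pattern as the snippet above.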

@crazysal
Author

crazysal commented Oct 9, 2018

Yes, I can write my own loader. But segmentation is essentially classification of each pixel, so why the redundancy? In all my projects until now I wrote my own pipelines; this was the first time I used the ImageFolder dataset.

So at the moment there are no built-in methods to load segmentation maps, correct? The same goes for transforms on color maps that preserve pixel values instead of interpolating them?

Can we have a feature request for that? I'm willing to chip in, since I realized I have spent a considerable amount of time writing the same segmentation ingestion pipeline for different projects.

Thanks for the quick reply 👍

@fmassa
Member

fmassa commented Oct 9, 2018

Hey,

There have been a number of requests for adding support for masks or segmentation data in torchvision.
The reason we haven't added anything out-of-the-box yet is that every use case is slightly different, and writing your own dataset for a particular case is usually faster than converting some new dataset to a pre-defined format (imagine converting from Pascal VOC format to COCO format).

The current recommended way of handling arbitrary transforms on both images and masks is to use the functional transforms. This lets you choose the right interpolation method for each data type (images usually use bilinear interpolation, masks require nearest-neighbor interpolation, etc.).
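
The idea can be sketched with PIL alone, so that it runs without torchvision; the function name `joint_resize_and_flip` and its parameters are made up for this example. The image is resized with bilinear interpolation, the mask with nearest-neighbor so that no new label values appear, and the random flip decision is drawn once and applied to both. torchvision's functional `resize` takes an interpolation argument that plays the same role.

```python
import random

import numpy as np
from PIL import Image


def joint_resize_and_flip(image, mask, size, rng=random):
    """Apply the same geometry to image and mask; size is (width, height)."""
    # Bilinear blending is fine for natural images.
    image = image.resize(size, resample=Image.BILINEAR)
    # Nearest-neighbor keeps mask values inside the original label set.
    mask = mask.resize(size, resample=Image.NEAREST)
    # Draw the random decision once so image and mask stay aligned.
    if rng.random() < 0.5:
        image = image.transpose(Image.FLIP_LEFT_RIGHT)
        mask = mask.transpose(Image.FLIP_LEFT_RIGHT)
    return image, mask


# Usage with a made-up 4x4 image and a mask whose labels are 0, 5 and 9.
image = Image.new("RGB", (4, 4), (100, 150, 200))
mask = Image.frombytes("L", (4, 4), bytes([0, 5, 9, 5] * 4))
out_image, out_mask = joint_resize_and_flip(image, mask, (8, 6), rng=random.Random(0))
print(np.unique(np.array(out_mask)))
```

Resizing the mask with bilinear interpolation instead could produce in-between values such as 2 or 7 along class boundaries, which do not correspond to any class.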

BTW, what's the segmentation dataset that you are currently working on?
Currently in torchvision we mostly have classification datasets, but I could add support for some standard segmentation datasets.

@crazysal
Author

crazysal commented Oct 9, 2018

I am working on a domain adaptation task, with GTA and SYNTHIA as source domains and the Cityscapes dataset as the target domain.

It's basically an improvement to this paper

I understand your point about the diversity of formats, but adding support for the popular ones (COCO, VOC, Cityscapes, Mapillary, etc.), even without the synthetic ones, would be a big help.

Thanks for your time.

@fmassa
Member

fmassa commented Oct 9, 2018

I think COCO, VOC and potentially Cityscapes could be added. I haven't heard of Mapillary before.

At some point we should decide if we want all possible datasets in torchvision, or only the most used ones.
This is something I'm going to be deciding very soon.

On the one hand, providing every possible dataset might keep users from realizing that, when a dataset is not present in torchvision, they can implement their own data-loading logic without much trouble.
The hardest part (iterating efficiently over the data) is handled by DataLoader, so very little is left for the Dataset to do.

rajveerb pushed a commit to rajveerb/vision that referenced this issue Nov 30, 2023
…ytorch#622)

* Align optimizer parameters for embeddings and dense layers

* Remove weight_decay for embedding optimizer as it has no effect

* Use eps=1e-8 for Adagrad as it yields better AUROC results

* Rewrite so that it works for SGD as well
2 participants