Add Cityscapes semantic segmentation dataset #392

mitmul · 2017-08-15T06:47:45Z

This PR adds a dataset class for the Cityscapes dataset. It replaces #230.

yuyu2172 · 2017-08-16T08:01:42Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+
+        self.label_fns, self.img_fns = [], []
+        resol = os.path.basename(label_dir)
+        for dname in glob.glob('{}/*'.format(label_dir)):


How about using os.path.join(label_dir, '*') instead?
This will work even if there is a trailing / at the end of label_dir.
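A minimal sketch of this suggestion, not the code from the PR: the helper name `list_split_dirs` and the temporary directory tree are made up for illustration. It only shows that building the pattern with `os.path.join` gives the same matches whether or not the directory argument ends with a separator.

```python
import glob
import os
import tempfile


def list_split_dirs(label_dir):
    # Hypothetical helper. os.path.join adds a separator only when one is
    # missing, so a trailing separator on label_dir does not change the
    # resulting glob pattern.
    return sorted(glob.glob(os.path.join(label_dir, '*')))


# Build a throwaway tree that mimics gtFine/{train,val}.
root = tempfile.mkdtemp()
label_dir = os.path.join(root, 'gtFine')
for split in ('train', 'val'):
    os.makedirs(os.path.join(label_dir, split))

with_slash = list_split_dirs(label_dir + os.sep)
without_slash = list_split_dirs(label_dir)
```

Both calls return the same two paths, so callers do not have to normalize the directory argument first.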

yuyu2172 · 2017-08-16T08:02:39Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+        for dname in glob.glob('{}/*'.format(label_dir)):
+            if split in dname:
+                for label_fn in glob.glob(
+                        '{}/*/*_labelIds.png'.format(dname)):


yuyu2172 · 2017-08-16T08:03:52Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+
+    """
+
+    def __init__(self, img_dir, label_dir, split='train', ignore_labels=True):


How about setting default directory names?
If img_dir is None, how about setting img_dir to CHAINER_DATASET_ROOT/pfnet/chainercv/cityscapes/leftImg8bit?
The same goes for label_dir.

OK, but for label_dir, we can't assume which labels users will use.

By having a default directory value, users do not need to specify the directory path once they set up their files properly. I think that this feature is very useful.

How about changing the options to data_dir and label_mode?
(i.e. def __init__(self, data_dir=None, label_mode=None, split='train', ignore_labels=True):)
The data_dir would be the path to the root directory, whose default value is CHAINER_DATASET_ROOT/pfnet/chainercv/cityscapes.
Below the root directory, we expect at least two folders (leftImg8bit and the label directory that is going to be used).
An error should be raised when label_mode is unspecified. It should be either fine or coarse.
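The proposal above could be sketched roughly as follows. This is an illustration under assumptions, not the merged implementation: `resolve_cityscapes_dirs` is a hypothetical helper, and the fallback to `~/.chainer/dataset` when `CHAINER_DATASET_ROOT` is unset is assumed here rather than taken from this thread.

```python
import os


def resolve_cityscapes_dirs(data_dir=None, label_mode=None, split='train'):
    """Return (img_dir, label_dir) for the proposed constructor arguments."""
    if data_dir is None:
        # Assumption: use $CHAINER_DATASET_ROOT, or ~/.chainer/dataset
        # when the variable is not set.
        root = os.environ.get(
            'CHAINER_DATASET_ROOT',
            os.path.join(os.path.expanduser('~'), '.chainer', 'dataset'))
        data_dir = os.path.join(root, 'pfnet', 'chainercv', 'cityscapes')
    if label_mode not in ('fine', 'coarse'):
        # Leaving label_mode unspecified (None) also ends up here.
        raise ValueError(
            "label_mode should be either 'fine' or 'coarse', "
            "but {!r} was given.".format(label_mode))
    img_dir = os.path.join(data_dir, 'leftImg8bit', split)
    label_dir = os.path.join(
        data_dir, 'gtFine' if label_mode == 'fine' else 'gtCoarse')
    return img_dir, label_dir


img_dir, label_dir = resolve_cityscapes_dirs('data', 'coarse', 'val')
```

Making `label_mode` required while leaving `data_dir` optional means users who place the files under the default root only have to say which annotation set they want.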

OK, I'll reflect your suggestion and add a test for it.

yuyu2172 · 2017-08-16T08:06:29Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+                for label_fn in glob.glob(
+                        '{}/*/*_labelIds.png'.format(dname)):
+                    self.label_fns.append(label_fn)
+        for label_fn in self.label_fns:


#161 (comment)

fn is not used in ChainerCV.
However, I think that filenames is too long. An alternative would be fnames.

@Hakuyume

Oops, sorry for the same mistake.

Is it OK to use fnames?

I prefer filenames to fnames. It is not so long.

filenames is OK, but label_filenames and img_filenames are a bit too long.

How about label_paths and img_paths?

The difference between *_filenames and *_paths is unclear.

*_paths is shorter than *_filenames and *_fnames. If the problem with *_filenames is its length, *_paths will be a good solution.

I agree.
I think changing all *_filenames to *_paths is good. Leaving both is bad.

I will update other datasets accordingly.

@mitmul
Can you use *_paths?

yuyu2172 · 2017-08-16T08:13:28Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+            self.label_fns[i], dtype=np.int32, color=False)[0]
+        H, W = label_orig.shape
+        if self.ignore_labels:
+            label_out = np.ones((H, W), dtype=np.int32) * -1


We can optimize this part by

not initializing label_out.

looping only over the labels in the ignore list

If the loop runs only over the ignore ID list, how can I replace the label IDs that are not marked as ignoreInEval with trainId?

Ahh. I see. My bad.
It looks OK.
Is it OK to remove line 79?
Also, np.where is not necessary.

yuyu2172 · 2017-08-16T08:17:19Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+        img_dir = os.path.join(img_dir, split)
+        self.ignore_labels = ignore_labels
+
+        self.label_fns, self.img_fns = [], []


How about using two lines and list()?
self.label_fnames = list()
self.img_fnames = list()

How about adding a check of []/list() to our coding style checker?

yuyu2172 · 2017-08-16T08:18:12Z

tests/datasets_tests/cityscapes_tests/test_cityscapes.py

+
+        img_dir = os.path.join(self.temp_dir, 'leftImg8bit')
+        label_dir = os.path.join(self.temp_dir, 'gtFine')
+        if self.split == 'test':


This is unnecessary.

yuyu2172 · 2017-08-16T08:19:24Z

tests/datasets_tests/cityscapes_tests/test_cityscapes.py

+        for i in range(10):
+            img = np.random.randint(0, 255, size=(128, 160, 3))
+            img = Image.fromarray(img.astype(np.uint8))
+            img.save(os.path.join(


Please use write_image.
#382

yuyu2172 · 2017-08-16T08:20:04Z

tests/datasets_tests/cityscapes_tests/test_cityscapes.py

+
+
+@testing.parameterize(
+    {'split': 'train'},


Please test ignore_labels (True, False).

yuyu2172 · 2017-08-16T08:21:29Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+
+    .. note::
+
+        Please download the data by yourself because Cityscapes dataset doesn't


How about this?

Please manually download the data because it is not allowed to redistribute the Cityscapes dataset.

yuyu2172 · 2017-08-16T08:23:59Z

Thank you for a great PR.
One thing about the code block in ChainerCV.
We have been using

:obj:`XXX`

instead of

``XX``

This is because I initially thought that the red highlight emphasizes a word too much.
Could you follow this convention for consistency?

yuyu2172 · 2017-08-16T08:24:53Z

Also, could you add this to the docs?
docs/source/reference/dataset.rst

yuyu2172 · 2017-08-17T03:49:22Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+        self.label_fns, self.img_fns = [], []
+        resol = os.path.basename(label_dir)
+        for dname in glob.glob('{}/*'.format(label_dir)):
+            if split in dname:


Does it work for Coarse dataset as well?
From the Github page, it seems that there is train_extra split.
https://github.com/mcordts/cityscapesScripts

I have not yet looked at the dataset myself, so please tell me if I am wrong.

That's why I used if split in dname at L:59

Can you add that to the doc and test?
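To make the `if split in dname` check concrete, here is a small sketch of how substring matching selects split directories. The helper name and the directory list are illustrative only; the list assumes a coarse-annotation layout with `train`, `train_extra` and `val`.

```python
def select_split_dirs(dnames, split):
    # Same test as in the diff above: keep every directory whose path
    # contains the split name as a substring.
    return [dname for dname in dnames if split in dname]


# Assumed layout of the coarse annotations, for illustration only.
coarse_dirs = ['gtCoarse/train', 'gtCoarse/train_extra', 'gtCoarse/val']

extra_only = select_split_dirs(coarse_dirs, 'train_extra')
train_like = select_split_dirs(coarse_dirs, 'train')
```

With this test, `split='train_extra'` selects only the extra directory, while `split='train'` matches both `train` and `train_extra` because `'train'` is a substring of `'train_extra'`. The thread does not say whether the second behavior is intended, so it is worth covering in the documentation and the test.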

mitmul · 2017-08-18T03:51:30Z

Fixed. Replaced *_fnames with *_paths.

yuyu2172 · 2017-08-18T04:52:37Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+        data_dir (string): Path to the dataset directory. The directory should
+            contain at least two directories, :obj:`leftImg8bit` and either
+            :obj:`gtFine` or :obj:`gtCoarse`. If :obj:`None` is given, it uses
+            :obj:`$CHAINER_DATSET_ROOT/pfnet/chainercv/cityscapes` as default.


by default.

yuyu2172 · 2017-08-18T04:54:32Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+                             'argment.')
+        elif label_mode != 'fine' and label_mode != 'coarse':
+            raise ValueError('\'label_name\' argment should be eighter '
+                             '\'fine\' or \'coarse\'. But {} was '


I think it is OK to simplify this. (delete line 45 and line 46)

    if label_mode not in ['fine', 'coarse']:
        raise ValueError('\'label_name\' argument should be either '
                         '\'fine\' or \'coarse\'.')

yuyu2172 · 2017-08-18T04:55:22Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+        for dname in glob.glob(os.path.join(label_dir, '*')):
+            if split in dname:
+                for city_dname in glob.glob(os.path.join(dname, '*')):
+                    for label_fname in glob.glob(


fname -> path

yuyu2172 · 2017-08-18T04:55:27Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+                            os.path.join(city_dname, '*_labelIds.png')):
+                        self.label_paths.append(label_fname)
+                        city_dnames.append(os.path.basename(city_dname))
+        for city_dname, label_fname in zip(city_dnames, self.label_paths):


yuyu2172 · 2017-08-18T04:55:33Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+                        city_dnames.append(os.path.basename(city_dname))
+        for city_dname, label_fname in zip(city_dnames, self.label_paths):
+            label_fname = os.path.basename(label_fname)
+            img_fname = label_fname.replace(


mitmul · 2017-08-18T07:05:38Z

@yuyu2172 Thanks, fixed.

yuyu2172 · 2017-08-18T11:07:22Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+
+        img_dir = os.path.join(data_dir, os.path.join('leftImg8bit', split))
+        resol = 'gtFine' if label_mode == 'fine' else 'gtCoarse'
+        label_dir = os.path.join(data_dir, resol)


We can easily anticipate that users will instantiate this object expecting ChainerCV to download the dataset.
This happens when CityscapesSemanticSegmentationDataset() is called without the files being placed under the default locations.
We should give users clear guidance on where to prepare the dataset when the error occurs.
This can be done in a way similar to ResNet in Chainer. https://github.com/chainer/chainer/blob/master/chainer/links/model/vision/resnet.py#L695

Perhaps raise an error here when either of the two necessary directories does not exist?

The error message can be something like this.

    'Cityscapes dataset does not exist at the expected location. '
    'Please download it from https://www.cityscapes-dataset.com/. '
    'Then place directory leftImg8bit at {} and {} at {}.'.format(
        img_dir, resol, label_dir)
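A minimal sketch of the suggested check, under assumptions: `check_cityscapes_dirs` is a hypothetical helper, and `ValueError` is chosen here only for illustration since the thread does not name an exception type. The message follows the wording proposed above.

```python
import os
import tempfile


def check_cityscapes_dirs(img_dir, label_dir, resol):
    # Fail early with guidance if either required directory is missing.
    if not os.path.isdir(img_dir) or not os.path.isdir(label_dir):
        raise ValueError(
            'Cityscapes dataset does not exist at the expected location. '
            'Please download it from https://www.cityscapes-dataset.com/. '
            'Then place directory leftImg8bit at {} and {} at {}.'.format(
                img_dir, resol, label_dir))


# Try it against an empty temporary root: nothing is in place yet.
root = tempfile.mkdtemp()
img_dir = os.path.join(root, 'leftImg8bit', 'train')
label_dir = os.path.join(root, 'gtFine')
try:
    check_cityscapes_dirs(img_dir, label_dir, 'gtFine')
    missing_detected = False
except ValueError:
    missing_detected = True
```

Running the check in the constructor, before any globbing, means a user who expected an automatic download sees the instructions instead of an empty dataset.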

yuyu2172 · 2017-08-18T11:09:22Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+            contain at least two directories, :obj:`leftImg8bit` and either
+            :obj:`gtFine` or :obj:`gtCoarse`. If :obj:`None` is given, it uses
+            :obj:`$CHAINER_DATSET_ROOT/pfnet/chainercv/cityscapes` by default.
+        label_mode (string): The resolution of the labels. It should be either


string -> {'fine', 'coarse'}

yuyu2172 · 2017-08-18T11:09:35Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+            defined in the original
+            `cityscapesScripts<https://github.com/mcordts/cityscapesScripts>_`
+            will be replaced with :obj:`-1` in the :meth:`get_example` method.
+            The default value is :obj:`True`


Forgetting a period.

yuyu2172 · 2017-08-18T11:10:00Z

Great. A few more comments.

mitmul · 2017-08-18T11:22:11Z

Thanks for the comments. Updated it.

yuyu2172 · 2017-08-19T00:26:38Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+            contain at least two directories, :obj:`leftImg8bit` and either
+            :obj:`gtFine` or :obj:`gtCoarse`. If :obj:`None` is given, it uses
+            :obj:`$CHAINER_DATSET_ROOT/pfnet/chainercv/cityscapes` by default.
+        label_mode ({'fine', 'coarse'}): The resolution of the labels. It


label_mode -> label_resol

I think this is more specific and better. Also this is consistent with the variable name used inside the code.
Sorry for the extra work.

I think resol is difficult to understand. How about using use_fine_label, which takes a boolean? If you want to support the case of three resolutions, I think label_resolution is better.

Thanks for your feedback.
+1 for label_resolution.

How about using type?

type the type/modality of data, e.g. gtFine for fine ground truth, or leftImg8bit for left 8-bit images.

from https://github.com/mcordts/cityscapesScripts/blob/8815d4643222abc0f0a41614745ffa1637734335/README.md#dataset-structure

So, I feel label_type is appropriate.

label_resolution=coarse is more explicit than use_fine_label=False, and it makes the code more readable.
It is not obvious that the opposite of use_fine_label is use_coarse_label.

label_level is not intuitive to me.

label_type can be confused as dtype.
It seems that label_resolution is fine. The length does not bother me much.

OK, label_resolution works for me.

The problem with label_type is that it is not specific to labels in the original code.
It accepts leftImg8bit.
This is nonsense. Sorry...

Anyway, just keeping the consistency of naming from the official scripts is better I think.

This rule may break the consistency in ChainerCV.

Ah, right. Well, is label_resolution OK for you?

yuyu2172 · 2017-08-19T00:26:57Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+
+    """
+
+    def __init__(self, data_dir=None, label_mode=None, split='train',


label_mode -> label_resol

yuyu2172 · 2017-08-19T00:28:24Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+            data_dir = download.get_dataset_directory(
+                'pfnet/chainercv/cityscapes')
+        if label_mode not in ['fine', 'coarse']:
+            raise ValueError('\'label_mode\' argment should be eighter '


label_mode -> label_resol

yuyu2172 · 2017-08-19T00:30:17Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+        img = read_image(self.img_paths[i])
+        label_orig = read_image(
+            self.label_paths[i], dtype=np.int32, color=False)[0]
+        H, W = label_orig.shape


I think we can delete this and use np.ones(label_orig.shape, dtype=np.int32) * -1 instead for line 102.

yuyu2172 · 2017-08-19T00:32:06Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+            label_out = np.ones((H, W), dtype=np.int32) * -1
+            for label in cityscapes_labels:
+                if not label.ignoreInEval:
+                    label_out[np.where(label_orig == label.id)] = label.trainId


We can remove np.where.
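The two suggestions on this method (dropping `H, W` and dropping `np.where`) can be sketched together as below. This is an illustration, not the merged code: the three-entry label table is a small stand-in for the `cityscapes_labels` tuple, and `convert_label` is a hypothetical function name.

```python
from collections import namedtuple

import numpy as np

Label = namedtuple('Label', ['name', 'id', 'trainId', 'ignoreInEval'])

# Illustrative stand-in for cityscapes_labels, not the full table.
labels = (
    Label('unlabeled', 0, 255, True),
    Label('road', 7, 0, False),
    Label('sidewalk', 8, 1, False),
)


def convert_label(label_orig, labels=labels):
    # Start with every pixel ignored (-1), sized from the input itself,
    # so H and W never need to be unpacked.
    label_out = np.ones(label_orig.shape, dtype=np.int32) * -1
    for label in labels:
        if not label.ignoreInEval:
            # A boolean mask indexes the array directly; wrapping it in
            # np.where selects the same elements.
            label_out[label_orig == label.id] = label.trainId
    return label_out


label_orig = np.array([[0, 7], [8, 3]], dtype=np.int32)
converted = convert_label(label_orig)
```

Pixels whose ID is ignored in evaluation, or absent from the table, keep the initial value of -1, which is why the output array still has to be initialized.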

yuyu2172 · 2017-08-19T00:35:58Z

label_resol seems better. Sorry for the extra work. label_mode was my idea...

mitmul · 2017-08-19T00:46:56Z

@yuyu2172 Thanks for the good catches. I fixed them.

yuyu2172 · 2017-08-19T01:15:45Z

chainercv/datasets/cityscapes/cityscapes_semantic_segmentation_dataset.py

+            contain at least two directories, :obj:`leftImg8bit` and either
+            :obj:`gtFine` or :obj:`gtCoarse`. If :obj:`None` is given, it uses
+            :obj:`$CHAINER_DATSET_ROOT/pfnet/chainercv/cityscapes` by default.
+        label_resolutionution ({'fine', 'coarse'}): The resolution of the


yuyu2172 · 2017-08-19T23:58:12Z

LGTM!

mitmul added 2 commits August 15, 2017 09:11

Add cityscapes dataset

5488d03

Write a test

5173ad1

mitmul mentioned this pull request Aug 15, 2017

Add PSPNet model #388

Closed

3 tasks

mitmul added 4 commits August 15, 2017 17:04

Make label and colors of cityscapes tuples

6f12fa3

Fix flake8

7a7661b

Add link to the website

e838475

Fix import

45e1a3c

yuyu2172 self-assigned this Aug 16, 2017

yuyu2172 reviewed Aug 16, 2017

yuyu2172 reviewed Aug 17, 2017

mitmul added 6 commits August 17, 2017 13:26

Fix

5abfee7

flake8

62960b7

Fix

2deef29

Remove unused import

4fc0d78

flake8

b6e9f99

Use paths

07d6b8f

yuyu2172 reviewed Aug 18, 2017

yuyu2172 mentioned this pull request Aug 18, 2017

Change variable names: filenames to paths #398

Merged

Fix

13b66c7

yuyu2172 reviewed Aug 18, 2017

yuyu2172 added the feature label Aug 18, 2017

yuyu2172 added this to the v0.7 milestone Aug 18, 2017

yuyu2172 mentioned this pull request Aug 18, 2017

[WIP] Add Cityscapes segmentation dataset #230

Closed

4 tasks

Raise an error when dataset does not exist

77e0aa7

yuyu2172 reviewed Aug 19, 2017

Fix

e4fee67

mitmul added 2 commits August 19, 2017 10:06

Use label_resolution

d50abf8

flake8

5b28c0e

yuyu2172 reviewed Aug 19, 2017

Fix a typo

0740b5a

yuyu2172 approved these changes Aug 19, 2017

yuyu2172 merged commit 2696e82 into chainer:master Aug 19, 2017

yuyu2172 mentioned this pull request Aug 20, 2017

Add Cityscapes to datasets #138

Closed

mitmul deleted the add-cityscapes branch May 18, 2018 09:23