
Added cityscapes dataset #225

Merged
tfds-copybara merged 15 commits into tensorflow:master from f1recracker:master
Jan 3, 2020

Conversation

@f1recracker
Contributor

Added Cityscapes dataset with segmentation labels. Fixes #214.

@googlebot googlebot added the cla: yes Author has signed CLA label Mar 11, 2019
@f1recracker
Contributor Author

f1recracker commented Mar 11, 2019

Hello! This is my first contribution to tensorflow-datasets! I actually had a few quick questions / design items to discuss before merging this.

  1. How are the unit tests run? (Is it via oss_scripts/oss_tests?) I haven't run any of the cityscapes tests yet, but I did check the build using the actual dataset in my local repository.

  2. What is the recommended size of fake datasets used for testing? I have tried to ensure each split has a different size, but I still end up with a 38 MB fake dataset (cityscapes has large images, 1024 × 2048).

  3. The current PR has only semantic segmentation labels, not instance-level segmentation labels. I plan on fixing this before the merge, but it requires parsing JSON files. The official cityscapes repository already has a parser and is also packaged as an installer. I came up with three approaches to solve this:

    • Re-implement the parser in this repository
    • Clone the repository with the dataset, and add it to the include path (seems kinda hacky)
    • Require a user installation of cityscapesScripts if the cityscapes dataset is loaded

Contributor

@us us left a comment


Welcome :)
Maybe if you reuse the same fake_data folder, it will help decrease the megabytes (assuming the images can be the same).

@f1recracker
Contributor Author

f1recracker commented Mar 11, 2019

I've used pngquant to compress the fake images!

@us
Contributor

us commented Mar 12, 2019

@f1recracker some datasets' tests generate their own images and labels, like testing/cifar.py. Maybe that can reduce the size even more.

@cyfra
Contributor

cyfra commented Mar 12, 2019

@f1recracker - for running the tests, you can also invoke them from command line directly (assuming you're in the top directory):

python3 -m tensorflow_datasets.image.celeba_test

@cyfra cyfra self-assigned this Mar 12, 2019
@f1recracker
Contributor Author

f1recracker commented Mar 12, 2019

I've made most of the changes, but I have observed that dl_manager.extract does not work as intended in unit tests (it actually returns DL_EXTRACT_RESULT or the example_path, from what I can see), whereas dl_manager.iter_archive works. Is mocking the extract method intentional?

If mocking extract is intentional, then I have a few workarounds to solve this issue:

  • Rewrite my code to use dl_manager.iter_archive. The problem is that I need to zip and synchronize the generators returned from two archives (images and labels), which would require loading all file objects into memory at once and then matching across both archives using ids.
  • Put the extracted data in fake_data directly and leave an empty zip file. This seems much cleaner but would create a few additional files in fake_data instead of a single archive.

@cyfra
Contributor

cyfra commented Mar 13, 2019

@f1recracker - yes, you're right, this is an inconsistency in our testing framework that we introduced when adding DL_EXTRACT_RESULT.

@pierrot0 - Pierre, can you comment on the extract issue and what is the recommended way @f1recracker should follow?

@f1recracker
Contributor Author

Have there been any updates on this? Sorry, I got a little caught up with some work during the past few weeks!

@Conchylicultor Conchylicultor added the dataset request Request for a new dataset to be added label Apr 25, 2019
@f1recracker
Contributor Author

It seems that the latest version fixed the inconsistency and also another error. I've also refactored my code to now use BuilderConfigs instead of having separate classes for Cityscapes - fine and Cityscapes - coarse.

@lgeiger
Collaborator

lgeiger commented Aug 2, 2019

@f1recracker Thanks for the PR! Would it be also possible to add the depth data of the Cityscapes dataset as an additional feature?

@f1recracker
Contributor Author

I wasn't aware of the depth maps - let me have a look at this.

@lgeiger
Collaborator

lgeiger commented Aug 2, 2019

Thanks for taking a look!

@lgeiger
Collaborator

lgeiger commented Aug 6, 2019

@f1recracker The stereo pair right images would also be a great addition to this PR.

@f1recracker
Contributor Author

f1recracker commented Aug 12, 2019

@lgeiger Added support for right images and disparity maps too. Cityscapes now has 4 configurations:

  • 'semantic_segmentation' - left + segmentations
  • 'semantic_segmentation_extra' - left + segmentations + coarse labels + train extra split
  • 'stereo_disparity' - left + right + disparity maps
  • 'stereo_disparity_extra' - left + right + disparity maps + train extra split

Let me know your thoughts on this configuration.

Could this be reviewed again? There are significant changes since the last review (BuilderConfigs, so no ABCs, and the recent change with disparity maps). Thanks!
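
The two axes behind these four configurations (task × extra split) can be sketched with a plain dataclass. The field names here are guesses for illustration, not the actual CityscapesConfig signature from the PR:

```python
from dataclasses import dataclass


@dataclass
class CityscapesConfigSketch:
    """Illustrative stand-in for the PR's CityscapesConfig."""
    name: str
    right_images: bool = False     # stereo pair right images
    disparity_maps: bool = False   # needs extra permissions from the authors
    extra_split: bool = False      # 'train_extra' split with coarse labels


BUILDER_CONFIGS_SKETCH = [
    CityscapesConfigSketch("semantic_segmentation"),
    CityscapesConfigSketch("semantic_segmentation_extra",
                           extra_split=True),
    CityscapesConfigSketch("stereo_disparity",
                           right_images=True, disparity_maps=True),
    CityscapesConfigSketch("stereo_disparity_extra",
                           right_images=True, disparity_maps=True,
                           extra_split=True),
]
```

Each config then only downloads and exposes the modalities it actually needs, which matters because several of them require extra permissions or very large downloads.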

@netw0rkf10w

@f1recracker Thanks for working on this.
@cyfra What would need to be changed for this to be merged please?

@f1recracker f1recracker requested a review from cyfra October 22, 2019 04:49
@cyfra cyfra added the kokoro:run Run Kokoro tests label Oct 24, 2019
@kokoro-team kokoro-team removed the kokoro:run Run Kokoro tests label Oct 24, 2019
@cyfra
Contributor

cyfra commented Oct 24, 2019

Hey @netw0rkf10w
Unfortunately we have grown quite a backlog of PRs - sorry for that.
We were quite busy with making sure that current ones will work with Python3.

As I see that there is high interest in this dataset, I'll try to merge it soon.

@netw0rkf10w

@cyfra Thank you for your reply, and for your hard work! This is indeed a very popular dataset in computer vision, so the merge would benefit a lot of users. Looking forward to having this merged.


BUILDER_CONFIGS = [
CityscapesConfig(
name='semantic_segmentation',
Member


Instead of using configs, couldn't you merge all configs into a single one, and the user could afterwards select only the features they want? Or reduce the number of configs to only two?

Currently the code is quite difficult to read, with a lot of if/else conditions which make it difficult to understand.

Contributor Author


It was a deliberate choice. Disparity maps need additional permissions from the authors and aren't readily available, and the 'extra' splits, whether for segmentation or for disparity, require additional files (50 GB). I chose this route so that each configuration would work for all cases.

Contributor Author

@f1recracker f1recracker Nov 18, 2019


I gave this some more thought today to see if I could refactor the code to use fewer if/else statements. The problem arises because Cityscapes has many features that not every user will use.

  1. There are two 'datasets' for different tasks: semantic segmentation and disparity inference. Disparity tasks can also require the right images, another additional download.
  2. Every cityscapes 'dataset' (semantic seg / stereo depth) has an additional 'extra' split. This split is much larger, and any labels provided are coarse-grained instead.

I think we still need 4 (2 (task) x 2 (extra split)) configurations.

One change that might reduce some if/else would be to use a constant feature dict with a 'default' image tensor (since cityscapes labels are images), if this is acceptable.
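
That idea, a constant feature dict where absent modalities fall back to a default image, might look roughly like this. All key names and the placeholder value are illustrative, not the PR's actual feature spec:

```python
# Illustrative placeholder; in a real builder this would be a default
# (blank) image tensor, since Cityscapes labels are themselves images.
_DEFAULT_IMAGE = "blank_image"


def make_example(image_left, image_right=None,
                 disparity_map=None, segmentation_label=None):
    """Build an example dict with a constant key set.

    Instead of branching on the config to decide which keys exist, every
    example carries the same keys, and absent modalities fall back to the
    default image.
    """
    return {
        "image_left": image_left,
        "image_right": (image_right if image_right is not None
                        else _DEFAULT_IMAGE),
        "disparity_map": (disparity_map if disparity_map is not None
                          else _DEFAULT_IMAGE),
        "segmentation_label": (segmentation_label
                               if segmentation_label is not None
                               else _DEFAULT_IMAGE),
    }
```

The trade-off is that every config pays the storage cost of the placeholder tensors, in exchange for a single feature schema and fewer branches.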


Are there any tutorials on how to use this class please?

Contributor Author


You just have to do something along the lines of:

import tensorflow_datasets as tfds
dataset = tfds.load(name="cityscapes/semantic_segmentation_extra", split="train")
# Can also specify config from above.

I'd give a working Colab but unfortunately this is a pretty big dataset and requires manual download :)



@f1recracker thank you for the reply!
I downloaded the dataset manually and will attempt to access it with the syntax you suggested.
Thanks again.

@netw0rkf10w

Thanks a lot, @f1recracker, for continuing to work on this! Unfortunately I am not yet familiar enough with the code to be able to help you.

tfds-copybara pushed a commit that referenced this pull request Jan 3, 2020
PiperOrigin-RevId: 287961867
@tfds-copybara tfds-copybara merged commit b25c53b into tensorflow:master Jan 3, 2020
@cyfra
Contributor

cyfra commented Jan 3, 2020

Ok - dataset is submitted.

Thanks a lot @f1recracker for your contribution and patience.
