lsun/car and many other classes are encoded as webp, which decode_image does not support #3345

Open
gtoderici opened this issue Jul 9, 2021 · 0 comments
Labels
bug Something isn't working

Short description

Decoding the LSUN dataset isn't fully supported as some images are encoded as WebP.

Consider the code:

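import tensorflow_datasets as tfds
import mediapy as media  # Assumption: `media` refers to the mediapy package.

builder = tfds.builder('lsun/car')
builder.download_and_prepare()
dataset = builder.as_dataset(split="train")
# Iterating raises InvalidArgumentError once a WebP-encoded record is decoded.
for i, example in enumerate(dataset):
  if i > 10: break
  media.show_image(example['image'])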

This causes:

InvalidArgumentError: Unknown image file format. One of JPEG, PNG, GIF, BMP required.

The error message lists the supported formats explicitly. Unfortunately, several of the LSUN classes are encoded as WebP, which is not among them, hence the error.
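
To confirm which records are affected, the encoded bytes can be classified by their leading magic numbers. The following is a minimal sketch in plain Python; the helper name `sniff_image_format` is hypothetical and not part of TFDS or TensorFlow.

```python
def sniff_image_format(data: bytes) -> str:
    """Guess the container format of encoded image bytes from magic numbers."""
    if data[:3] == b"\xff\xd8\xff":
        return "jpeg"
    if data[:8] == b"\x89PNG\r\n\x1a\n":
        return "png"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if data[:2] == b"BM":
        return "bmp"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return "unknown"
```

One way to get at the undecoded bytes, assuming the `image` feature is the one that fails, is to pass `decoders={'image': tfds.decode.SkipDecoding()}` to `builder.as_dataset` and call the helper on `example['image'].numpy()`.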

I would like to point out that tfio.image.decode_webp exists, so this is definitely fixable via a tf.cond() in the pipeline.
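
The suggested `tf.cond()` dispatch could look roughly like the sketch below. It is an illustration under assumptions, not the TFDS implementation: the helper names `is_webp`, `_decode_webp_rgb` and `decode_image_or_webp` are hypothetical, the input is assumed to be a scalar string tensor of at least 12 bytes, and the alpha channel that `tfio.image.decode_webp` returns is dropped so that both branches yield a uint8 `[height, width, 3]` tensor.

```python
import tensorflow as tf


def is_webp(contents):
    """True if the encoded bytes carry a RIFF/WEBP container header."""
    # Assumes `contents` has at least 12 bytes; substr fails on shorter input.
    riff = tf.strings.substr(contents, 0, 4)
    webp = tf.strings.substr(contents, 8, 4)
    return tf.logical_and(tf.equal(riff, "RIFF"), tf.equal(webp, "WEBP"))


def _decode_webp_rgb(contents):
    # Imported lazily because tensorflow-io is an extra dependency.
    import tensorflow_io as tfio

    rgba = tfio.image.decode_webp(contents)  # uint8, [height, width, 4]
    return rgba[..., :3]  # Drop alpha to match the 3-channel branch.


def decode_image_or_webp(contents):
    """Decode WebP via tensorflow-io, everything else via tf.io.decode_image."""
    return tf.cond(
        is_webp(contents),
        lambda: _decode_webp_rgb(contents),
        lambda: tf.io.decode_image(
            contents, channels=3, expand_animations=False),
    )
```

Forcing three channels in both branches keeps the dtype and rank of the two `tf.cond` outputs identical, which matters once the function is traced inside `tf.data.Dataset.map`.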

Environment information

  • Operating System: any

  • Python version: any

  • All versions affected.

  • Does the issue still exist with the last tfds-nightly package (pip install --upgrade tfds-nightly)?
    Yes

Reproduction instructions

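import tensorflow_datasets as tfds
import mediapy as media  # Assumption: `media` refers to the mediapy package.

builder = tfds.builder('lsun/car')
builder.download_and_prepare()
dataset = builder.as_dataset(split="train")
for i, example in enumerate(dataset):
  if i > 10: break
  media.show_image(example['image'])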

Link to logs

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-14-7304f5f9c25c> in <module>()
      2 builder.download_and_prepare()
      3 dataset = builder.as_dataset("train")
----> 4 for i, example in enumerate(dataset):
      5   if i > 10: break
      6   media.show_image(example['image'])

4 frames
google3/third_party/tensorflow/python/data/ops/iterator_ops.py in __next__(self)
    759   def __next__(self):
    760     try:
--> 761       return self._next_internal()
    762     except errors.OutOfRangeError:
    763       raise StopIteration

google3/third_party/tensorflow/python/data/ops/iterator_ops.py in _next_internal(self)
    745           self._iterator_resource,
    746           output_types=self._flat_output_types,
--> 747           output_shapes=self._flat_output_shapes)
    748 
    749       try:

google3/third_party/tensorflow/python/ops/gen_dataset_ops.py in iterator_get_next(iterator, output_types, output_shapes, name)
   2726       return _result
   2727     except _core._NotOkStatusException as e:
-> 2728       _ops.raise_from_not_ok_status(e, name)
   2729     except _core._FallbackException:
   2730       pass

google3/third_party/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
   6904   message = e.message + (" name: " + name if name is not None else "")
   6905   # pylint: disable=protected-access
-> 6906   six.raise_from(core._status_to_exception(e.code, message), None)
   6907   # pylint: enable=protected-access
   6908 

google3/third_party/py/six/__init__.py in raise_from(value, from_value)

InvalidArgumentError: Unknown image file format. One of JPEG, PNG, GIF, BMP required.
	 [[{{node decode_image/DecodeImage}}]] [Op:IteratorGetNext]

Expected behavior
Be able to iterate through all classes in LSUN.

Additional context
tf.io.decode_image explicitly does not support WebP.

@gtoderici gtoderici added the bug Something isn't working label Jul 9, 2021