Unknown image file format. One of JPEG, PNG, GIF, BMP required. #944

Closed
ChurikiTenna opened this issue Sep 11, 2023 · 0 comments

ChurikiTenna commented Sep 11, 2023

My code runs normally with the example flowers file, but when I switch to my own tgz file, the following error occurs.

image_count 92
Deleted 0 images
Found 92 files belonging to 12 classes.
Using 74 files for training.
Found 92 files belonging to 12 classes.
Using 18 files for validation.
class_names ['normal_10', 'normal_12', 'normal_14', 'normal_16', 'normal_18', 'normal_2', 'normal_20', 'normal_22', 'normal_24', 'normal_4', 'normal_6', 'normal_8']
data set created!
Traceback (most recent call last):
  File "/Users/tenna/Desktop/integro/tensorflow/main.py", line 89, in <module>
    image_batch, labels_batch = next(iter(normalized_ds))
  File "/Users/tenna/Library/Python/3.9/lib/python/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 814, in __next__
    return self._next_internal()
  File "/Users/tenna/Library/Python/3.9/lib/python/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 777, in _next_internal
    ret = gen_dataset_ops.iterator_get_next(
  File "/Users/tenna/Library/Python/3.9/lib/python/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 3028, in iterator_get_next
    _ops.raise_from_not_ok_status(e, name)
  File "/Users/tenna/Library/Python/3.9/lib/python/site-packages/tensorflow/python/framework/ops.py", line 6656, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __wrapped__IteratorGetNext_output_types_2_device_/job:localhost/replica:0/task:0/device:CPU:0}} Unknown image file format. One of JPEG, PNG, GIF, BMP required.
         [[{{node decode_image/DecodeImage}}]] [Op:IteratorGetNext] name: 

Code:

I added some code to remove invalid JPEG data, but it didn't help.


import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

import pathlib
dataset_url = 'https://firebasestorage.googleapis.com/v0/b/integro-app.appspot.com/o/cups.tgz?alt=media&token=f7dda9bd-4ff0-44c6-9b6b-f530142b10d0'
data_dir = tf.keras.utils.get_file('cups', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)

image_count = len(list(data_dir.glob('*/*.jpeg')))
print('image_count',image_count)

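# Added to resolve the "Unknown image file format." error, but it was not effective.

num_skipped = 0
# Note: this list repeats normal_6 and normal_8 and omits normal_16 and normal_18,
# so those two class folders are never checked.
for folder_name in ("normal_2", "normal_4", "normal_6", "normal_8", "normal_10", "normal_12", "normal_14", "normal_6", "normal_8", "normal_20", "normal_22", "normal_24"):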
    folder_path = os.path.join(data_dir, folder_name)
    for fname in os.listdir(folder_path):
        fpath = os.path.join(folder_path, fname)
        # Read the file header; the context manager closes the file even on error.
        with open(fpath, "rb") as fobj:
            is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)

        if not is_jfif:
            num_skipped += 1
            # Delete corrupted image
            os.remove(fpath)

print("Deleted %d images" % num_skipped)

# Create the dataset

batch_size = 32
img_height = 180
img_width = 180

train_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

class_names = train_ds.class_names
print('class_names', class_names)

# Configure the dataset for performance

AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
print('data set created!')

# Standardize the data

normalization_layer = layers.Rescaling(1./255)
normalized_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
image_batch, labels_batch = next(iter(normalized_ds))
first_image = image_batch[0]
# Notice the pixel values are now in `[0,1]`.
print(np.min(first_image), np.max(first_image))

I also tried re-saving every image as RGB with a .jpg extension, as below, but that did not work either.

from PIL import Image 
import glob, os
directory = os.path.abspath("./cups")
new_directory = os.path.abspath("./new_cups")


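# Note: as in the first script, normal_6 and normal_8 appear twice and
# normal_16 and normal_18 are missing from this list.
for folder_name in ("normal_2", "normal_4", "normal_6", "normal_8", "normal_10", "normal_12", "normal_14", "normal_6", "normal_8", "normal_20", "normal_22", "normal_24"):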
    folder_path = os.path.join(directory, folder_name)
    new_folder_path = os.path.join(new_directory, folder_name)
    for fname in os.listdir(folder_path):
        fpath = os.path.join(folder_path, fname)
        im = Image.open(fpath)
        rgb_im = im.convert("RGB")
        new_fpath = os.path.join(new_folder_path, fname).replace('.jpeg', '.jpg')
        print("new_fpath",new_fpath)
        rgb_im.save(new_fpath)

I tried both a local tgz file and the same file uploaded to Firebase Storage; neither worked.
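
One way to narrow this down might be to scan every file under the dataset directory and compare its first bytes against the signatures of the four formats named in the error message. The following is only a minimal sketch using the standard library: `sniff_format` and `find_suspect_files` are hypothetical helper names, not part of TensorFlow. A matching signature does not guarantee that TensorFlow can decode the file, so this is a first pass rather than a full validation. Unlike the hard-coded folder list above, it walks all subdirectories.

```python
import os

# Leading bytes ("magic numbers") of the formats named in the error message.
SIGNATURES = {
    "JPEG": b"\xff\xd8\xff",
    "PNG": b"\x89PNG\r\n\x1a\n",
    "GIF": b"GIF8",
    "BMP": b"BM",
}


def sniff_format(path):
    """Return the format whose signature matches the file header, else None."""
    with open(path, "rb") as f:
        header = f.read(8)
    for name, magic in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None


def find_suspect_files(root):
    """Walk every subdirectory of root and list files with no known signature."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in sorted(filenames):
            fpath = os.path.join(dirpath, fname)
            if sniff_format(fpath) is None:
                suspects.append(fpath)
    return suspects
```

Calling `find_suspect_files(data_dir)` and printing the result before building the dataset would show which paths do not start with a known image header. It may also list files that the loader would ignore anyway, such as hidden metadata files, so the output needs a manual look before anything is deleted.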
