
Images with different input shape #1

Closed
fm94 opened this issue Mar 14, 2022 · 1 comment

fm94 commented Mar 14, 2022

When changing img_size = 64 to account for images with different width and height, i.e., also changing process_img to:

def process_img(file_path, img_w, img_h):
  # Read and decode the image, then resize to (img_h, img_w)
  img = tf.io.read_file(file_path)
  img = tf.image.decode_jpeg(img, channels=3)
  img = tf.image.convert_image_dtype(img, tf.float32)  # scales to [0, 1]
  img = tf.image.resize(img, size=(img_h, img_w), method='area')
  img = tf.image.convert_image_dtype(img, tf.uint8)  # back to [0, 255]
  return img

and the definition of the discriminator to:

def make_discriminator_model():
  disc = tf.keras.Sequential(
    [
      layers.Input(shape=(img_h, img_w, 3)),
...

and line 391 to: all_images.append(process_img(item, img_w, img_h))

The training crashes after the first epoch with: ValueError: Cannot reshape a tensor with 49152 elements to shape [32,128] (4096 elements) for '{{node Reshape}} = Reshape[T=DT_FLOAT, Tshape=DT_INT32](discriminator/flatten/Reshape, Reshape/shape)' with input shapes: [32,1536], [2] and with input tensors computed as partial shapes: input[1] = [32,128].

Apparently, the problem is here: real_first_output = tf.reshape(self.discriminator(images[i,...], training=True), (batch_size, disc_dim)). Somehow, the discriminator is not outputting the correct dimension for reshaping. I suspect something is wrong with disc_dim but couldn't solve it.

I am using:

img_w = 256
img_h = 144
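For what it's worth, the numbers in the error message are consistent with a discriminator that halves each spatial dimension six times (stride-2, 'same'-padded layers) and ends with 128 channels before the flatten. That architecture is my assumption, inferred purely from the error, not from the repo's code. A quick arithmetic sketch:

```python
import math

def flatten_size(height, width, n_stride2_layers=6, channels=128):
    # Per-sample flatten size after a stack of stride-2, 'same'-padded
    # layers (assumed architecture; the real layer count may differ).
    for _ in range(n_stride2_layers):
        # 'same' padding with stride 2 halves each dimension, rounding up
        height = math.ceil(height / 2)
        width = math.ceil(width / 2)
    return height * width * channels

print(flatten_size(64, 64))    # 1 * 1 * 128 = 128, the expected disc_dim
print(flatten_size(144, 256))  # 3 * 4 * 128 = 1536, matching the error
```

With a batch of 32, 32 * 1536 = 49152 elements, exactly the tensor the reshape complains about, so disc_dim would need to reflect the larger flatten size (or the input would need to stay 64x64).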
@muxamilian (Owner)

The neural network's size is fixed. If you feed it larger input images, the output of the neural network will also be larger, and then the shapes no longer match.

There are two options:

  • Manually changing the neural network by, for example, adding an additional layer, since the input images are now larger.
  • Resizing the input to 64x64

In general, any input size that isn't a square is going to be difficult because then the convolutional layers aren't symmetric anymore and it can be quite tedious to design them so that the shapes match everywhere.
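A minimal sketch of the second option: scale the shorter side of the 256x144 input to 64 while preserving aspect ratio, then center-crop the longer side to 64. The helper name and return format are mine, for illustration; the resulting offsets and sizes could then be fed to, e.g., tf.image.crop_to_bounding_box after the resize.

```python
def square_crop_resize_dims(src_h, src_w, target=64):
    # Scale the shorter side to `target`, preserving aspect ratio,
    # then compute the center-crop offsets for the longer side.
    scale = target / min(src_h, src_w)
    new_h = round(src_h * scale)
    new_w = round(src_w * scale)
    off_y = (new_h - target) // 2
    off_x = (new_w - target) // 2
    return (new_h, new_w), (off_y, off_x)

# 144x256 -> resize to 64x114, then crop 25 px off each side horizontally
print(square_crop_resize_dims(144, 256))  # ((64, 114), (0, 25))
```

This avoids the distortion of a direct resize to 64x64, at the cost of discarding the left and right edges of the frame.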
