[BUG Report]: Deconvolution issue where input shape has 'None' dimensions - Conv2DTranspose() #1219

Open
PhasonMatrix opened this issue Dec 2, 2023 · 1 comment

@PhasonMatrix

Description

I am attempting to build a U-Net with the Keras API. For the expansive part of the network I am using Conv2DTranspose() to deconvolve/upscale. When the model is built with a fixed input size such as Shape(256, 256, 3), it builds and compiles without issue. To make the U-Net work for arbitrary input shapes (differently sized or shaped images), I use an input shape of (None, None, 3), which is (-1, -1, 3) in C#. When that model builds, the output shapes of the deconvolution layers appear to be calculated incorrectly.

In the debugger I see each deconvolution layer with output shapes of: (None, -2, -2, 256), (None, -4, -4, 64), (None, -8, -8, 32), and finally (None, None6, None6, 16).

As you can see in the last shape, there is an obvious bug: the string "None" is concatenated with "6", giving "None6". Maybe it should be an integer addition instead?

Having built the same model in Python, I think all of these negative numbers should actually be "None". I had a look in the source code and found that -2 is used to represent an "Unknown" dimension, so I think that might be where the -2 in the first shape comes from, and then it gets doubled in each subsequent deconvolution layer.
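
To illustrate the behaviour I would expect, here is a minimal sketch of the per-axis output length of a transposed convolution in which an unknown size stays unknown. It is not the TensorFlow.NET implementation: the class DeconvShapes and both method names are hypothetical, and the formula is the one Python Keras applies when no explicit output padding is given.

```csharp
using System;

public static class DeconvShapes
{
    // Output length of a transposed convolution along one spatial axis.
    // A null input length means "unknown" and is passed through unchanged,
    // so no arithmetic is ever done on a sentinel value.
    public static int? DeconvOutputLength(int? inputLength, int kernelSize, int stride, string padding)
    {
        if (inputLength is null)
            return null;

        int length = inputLength.Value;
        return padding switch
        {
            "same" => length * stride,
            "valid" => length * stride + Math.Max(kernelSize - stride, 0),
            _ => throw new ArgumentException($"Unsupported padding: {padding}", nameof(padding)),
        };
    }

    // Hypothetical helper: treats any negative dimension (such as the -1
    // used for None in C#) as unknown.
    public static int? FromSentinel(int dim) => dim < 0 ? (int?)null : dim;
}
```

With a known bottleneck size of 16, a 2x2 kernel, stride 2 and "same" padding this gives 32, which matches the fixed 256x256 case. With FromSentinel(-1) as the input it gives null rather than a negative number. Multiplying the raw sentinel by the stride instead would produce negative sizes that double at every layer, which resembles the -2, -4, -8 pattern above, although I have not confirmed that this is what the library actually does.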

I finally get an exception at the output layer: Tensorflow.InvalidArgumentError: 'Negative dimension size caused by subtracting 1 from -16 for '{{node conv2d_18/Conv2D}} = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], explicit_paddings=[], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](conv2d_17/Relu, conv2d_18/ReadVariableOp)' with input shapes: [?,-16,-16,16], [1,1,16,1].'
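
The exception is consistent with the usual "valid" padding arithmetic being applied to the negative size. A minimal sketch of that arithmetic, assuming no dilation (ConvShapes and ValidConvOutputLength are hypothetical names, not part of TensorFlow.NET):

```csharp
using System;

public static class ConvShapes
{
    // Output length of a convolution with "valid" padding along one axis:
    // (input - kernel) / stride + 1. An unknown (null) input stays unknown.
    public static int? ValidConvOutputLength(int? inputLength, int kernelSize, int stride = 1)
    {
        if (inputLength is null)
            return null;

        int length = inputLength.Value;
        if (length < kernelSize)
            throw new ArgumentException(
                $"Input length {length} is smaller than kernel size {kernelSize}.");

        return (length - kernelSize) / stride + 1;
    }
}
```

For the 1x1 output layer, a known input of 256 gives 256 and an unknown input gives null. An input of -16 is rejected, because subtracting the kernel size 1 from -16 cannot give a valid dimension, which is the situation the error message describes.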

Reproduction Steps

The network building function:

public static Model GetModel(int imageWidth, int imageHeight, int imageChannels, float learningRate)
{
    var inputShape = new Shape(imageWidth, imageHeight, imageChannels);
    var inputs = keras.layers.Input(inputShape, name: "image");

    // contraction path
    var c1Input = keras.layers.Conv2D(16, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(inputs);
    var c1Dropout = keras.layers.Dropout(0.1f).Apply(c1Input);
    var c1PostDropout = keras.layers.Conv2D(16, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c1Dropout);
    var c1Pooling = keras.layers.MaxPooling2D((2, 2)).Apply(c1PostDropout);


    var c2Input = keras.layers.Conv2D(32, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c1Pooling);
    var c2Dropout = keras.layers.Dropout(0.1f).Apply(c2Input);
    var c2PostDropout = keras.layers.Conv2D(32, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c2Dropout);
    var c2Pooling = keras.layers.MaxPooling2D((2, 2)).Apply(c2PostDropout);


    var c3Input = keras.layers.Conv2D(64, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c2Pooling);
    var c3Dropout = keras.layers.Dropout(0.2f).Apply(c3Input);
    var c3PostDropout = keras.layers.Conv2D(64, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c3Dropout);
    var c3Pooling = keras.layers.MaxPooling2D((2, 2)).Apply(c3PostDropout);


    var c4Input = keras.layers.Conv2D(128, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c3Pooling);
    var c4Dropout = keras.layers.Dropout(0.2f).Apply(c4Input);
    var c4PostDropout = keras.layers.Conv2D(128, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c4Dropout);
    var c4Pooling = keras.layers.MaxPooling2D((2, 2)).Apply(c4PostDropout);


    // bottleneck
    var c5Input = keras.layers.Conv2D(256, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c4Pooling);
    var c5Dropout = keras.layers.Dropout(0.3f).Apply(c5Input);
    var c5PostDropout = keras.layers.Conv2D(256, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c5Dropout);


    // expansive path
    var u6Transpose = keras.layers.Conv2DTranspose(128, kernel_size: (2, 2), strides: (2, 2), output_padding: "same", activation: "relu", kernel_initializer: "he_normal");
    var u6Tensor = u6Transpose.Apply(c5PostDropout);
    var u6Concat = keras.layers.Concatenate().Apply(new Tensors(u6Tensor, c4PostDropout));
    var c6Input = keras.layers.Conv2D(128, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(u6Concat);
    var c6Dropout = keras.layers.Dropout(0.2f).Apply(c6Input);
    var c6PostDropout = keras.layers.Conv2D(128, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c6Dropout);


    var u7Transpose = keras.layers.Conv2DTranspose(64, kernel_size: (2, 2), strides: (2, 2), output_padding: "same", kernel_initializer: "he_normal");
    var u7Tensor = u7Transpose.Apply(c6PostDropout);
    var u7Concat = keras.layers.Concatenate().Apply(new Tensors(u7Tensor, c3PostDropout));
    var c7Input = keras.layers.Conv2D(64, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(u7Concat);
    var c7Dropout = keras.layers.Dropout(0.2f).Apply(c7Input);
    var c7PostDropout = keras.layers.Conv2D(64, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c7Dropout);


    var u8Transpose = keras.layers.Conv2DTranspose(32, kernel_size: (2, 2), strides: (2, 2), output_padding: "same");
    var u8Tensor = u8Transpose.Apply(c7PostDropout);
    var u8Concat = keras.layers.Concatenate().Apply(new Tensors(u8Tensor, c2PostDropout));
    var c8Input = keras.layers.Conv2D(32, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(u8Concat);
    var c8Dropout = keras.layers.Dropout(0.1f).Apply(c8Input);
    var c8PostDropout = keras.layers.Conv2D(32, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c8Dropout);


    var u9Transpose = keras.layers.Conv2DTranspose(16, kernel_size: (2, 2), strides: (2, 2), output_padding: "same");
    var u9Tensor = u9Transpose.Apply(c8PostDropout);
    var u9Concat = keras.layers.Concatenate().Apply(new Tensors(u9Tensor, c1PostDropout));
    var c9Input = keras.layers.Conv2D(16, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(u9Concat);
    var c9Dropout = keras.layers.Dropout(0.1f).Apply(c9Input);
    var c9PostDropout = keras.layers.Conv2D(16, kernel_size: (3, 3), activation: "relu", kernel_initializer: "he_normal", padding: "same").Apply(c9Dropout);

    // output layer
    var outputs = keras.layers.Conv2D(1, (1, 1), activation: "sigmoid").Apply(c9PostDropout);

    // build the model
    Model model = (Model)keras.Model(inputs, outputs, name: "U-Net");
    model.summary();

    model.compile(
        optimizer: keras.optimizers.Adam(learningRate),
        loss: keras.losses.BinaryCrossentropy(from_logits: true),
        metrics: new[] { "accuracy" }
        );
    
    return model;
}

When I call this function with args GetModel(256, 256, 3, 0.001f) it builds, prints the summary to console and compiles without error.

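When I call this function with GetModel(-1, -1, 3, 0.001f), the output shapes after calling .Apply() on each of the four Conv2DTranspose() layers are:

[four debugger screenshots, one per layer; the shapes are listed in the Description]

And the exception:

[screenshot of the exception; the message is quoted in the Description]

The tensor passed into this method has the "None6" dimensions:

[debugger screenshot of the tensor]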

Known Workarounds

No response

Configuration and Other Information

OS: Windows 11
.NET: 6.0

Using the latest NuGet packages:
SciSharp.TensorFlow.Redist 2.16.0
TensorFlow.Keras 0.15.0
TensorFlow.Net 0.150.0

@Wanglongzhi2001
Collaborator

Thank you very much for the comprehensive issue report!
