
DL4J Keras Import: tf.keras support #8348

Open
AlexDBlack opened this issue Nov 4, 2019 · 3 comments


@AlexDBlack AlexDBlack commented Nov 4, 2019

TensorFlow 2 has built-in Keras model saving support:
https://www.tensorflow.org/tutorials/keras/save_and_load#hdf5_format

It looks like there are some format differences between this and the standard Keras model saving.

tf.keras is not yet tested with DL4J Keras import, but we know that there are some issues as of 1.0.0-beta5. For example:
#8338
#8344

Until this is resolved, there are two workarounds: use the "standard" (non-TF) Keras, which is well tested, or use SameDiff's TensorFlow import support (from TF frozen models).


@victoriest victoriest commented Nov 4, 2019

Details of #8338:

Exception in thread "main" org.deeplearning4j.exception.DL4JInvalidConfigException: Invalid configuration for layer (idx=-1, name=max_pooling2d_2, type=SubsamplingLayer) for height dimension:  Invalid input configuration for kernel height. Require 0 < kH <= inHeight + 2*padH; got (kH=16, inHeight=14, padH=0)
Input type = InputTypeConvolutional(h=14,w=14,c=192), kernel = [16, 16], strides = [16, 16], padding = [0, 0], layer size (output channels) = 192, convolution mode = Same
	at org.deeplearning4j.nn.conf.layers.InputTypeUtil.getOutputTypeCnnLayers(InputTypeUtil.java:327)
	at org.deeplearning4j.nn.conf.layers.SubsamplingLayer.getOutputType(SubsamplingLayer.java:153)
	at org.deeplearning4j.nn.modelimport.keras.layers.pooling.KerasPooling2D.getOutputType(KerasPooling2D.java:95)
	at org.deeplearning4j.nn.modelimport.keras.KerasModel.inferOutputTypes(KerasModel.java:304)
	at org.deeplearning4j.nn.modelimport.keras.KerasModel.<init>(KerasModel.java:179)
	at org.deeplearning4j.nn.modelimport.keras.KerasModel.<init>(KerasModel.java:96)
	at org.deeplearning4j.nn.modelimport.keras.utils.KerasModelBuilder.buildModel(KerasModelBuilder.java:307)
	at org.deeplearning4j.nn.modelimport.keras.KerasModelImport.importKerasModelAndWeights(KerasModelImport.java:172)
......
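For reference, the mismatch can be seen with simple shape arithmetic (a sketch; the constants come from the exception message above):

```python
import math

# DL4J's validity check for a pooling layer (InputTypeUtil.getOutputTypeCnnLayers)
# requires 0 < kH <= inHeight + 2*padH. Here kH=16, inHeight=14, padH=0, so it fails.
kH, inHeight, padH = 16, 14, 0
dl4j_valid = 0 < kH <= inHeight + 2 * padH

# Keras with padding='same' ignores that constraint and pads implicitly:
# output size = ceil(inHeight / stride), regardless of kernel size.
stride = 16
keras_out = math.ceil(inHeight / stride)

print(dl4j_valid)  # False -> DL4JInvalidConfigException on import
print(keras_out)   # 1 -> the (None, 1, 1, 192) shape in model.summary()
```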

Model definition code:

    # Requires (standard Keras):
    #   from keras.layers import Input, Conv2D, Dropout, MaxPooling2D, MaxPool2D, Flatten, Dense
    #   from keras.models import Model
    #   from keras import optimizers
    def __gen__rcnn_model():
        """
        RCNN, ref: https://github.com/JimLee4530/RCNN
        :return: compiled Keras model
        """
        input_img = Input(shape=(INPUT_IMG_HEIGHT, INPUT_IMG_WIDTH, 3), name="input")
        conv1 = Conv2D(filters=192, kernel_size=[5, 5], strides=(1, 1), padding='same', activation='relu')(input_img)

        rconv1 = OcrModel.__rcl_block(192, conv1)
        dropout1 = Dropout(0.2)(rconv1)
        rconv2 = OcrModel.__rcl_block(192, dropout1)
        maxpooling_1 = MaxPooling2D((2, 2), strides=(2, 2), padding='same')(rconv2)
        dropout2 = Dropout(0.2)(maxpooling_1)
        rconv3 = OcrModel.__rcl_block(192, dropout2)
        dropout3 = Dropout(0.2)(rconv3)
        rconv4 = OcrModel.__rcl_block(192, dropout3)

        out = MaxPool2D((16, 16), strides=(16, 16), padding='same')(rconv4)
        flatten = Flatten()(out)
        prediction = Dense(11, activation='softmax')(flatten)

        model = Model(inputs=input_img, outputs=prediction)
        adam = optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
        model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
        return model

Output of model.summary():

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input (InputLayer)              (None, 28, 28, 3)    0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 28, 28, 192)  14592       input[0][0]                      
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 28, 28, 192)  331968      conv2d_1[0][0]                   
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 28, 28, 192)  768         conv2d_2[0][0]                   
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 28, 28, 192)  331968      batch_normalization_1[0][0]      
__________________________________________________________________________________________________
add_1 (Add)                     (None, 28, 28, 192)  0           conv2d_2[0][0]                   
                                                                 conv2d_3[0][0]                   
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 28, 28, 192)  768         add_1[0][0]                      
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 28, 28, 192)  331968      batch_normalization_2[0][0]      
__________________________________________________________________________________________________
add_2 (Add)                     (None, 28, 28, 192)  0           conv2d_2[0][0]                   
                                                                 conv2d_4[0][0]                   
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 28, 28, 192)  768         add_2[0][0]                      
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 28, 28, 192)  331968      batch_normalization_3[0][0]      
__________________________________________________________________________________________________
add_3 (Add)                     (None, 28, 28, 192)  0           conv2d_2[0][0]                   
                                                                 conv2d_5[0][0]                   
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 28, 28, 192)  768         add_3[0][0]                      
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 28, 28, 192)  0           batch_normalization_4[0][0]      
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 28, 28, 192)  331968      dropout_1[0][0]                  
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 28, 28, 192)  768         conv2d_6[0][0]                   
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 28, 28, 192)  331968      batch_normalization_5[0][0]      
__________________________________________________________________________________________________
add_4 (Add)                     (None, 28, 28, 192)  0           conv2d_6[0][0]                   
                                                                 conv2d_7[0][0]                   
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 28, 28, 192)  768         add_4[0][0]                      
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 28, 28, 192)  331968      batch_normalization_6[0][0]      
__________________________________________________________________________________________________
add_5 (Add)                     (None, 28, 28, 192)  0           conv2d_6[0][0]                   
                                                                 conv2d_8[0][0]                   
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 28, 28, 192)  768         add_5[0][0]                      
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 28, 28, 192)  331968      batch_normalization_7[0][0]      
__________________________________________________________________________________________________
add_6 (Add)                     (None, 28, 28, 192)  0           conv2d_6[0][0]                   
                                                                 conv2d_9[0][0]                   
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 28, 28, 192)  768         add_6[0][0]                      
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 14, 14, 192)  0           batch_normalization_8[0][0]      
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, 14, 14, 192)  0           max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 14, 14, 192)  331968      dropout_2[0][0]                  
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 14, 14, 192)  768         conv2d_10[0][0]                  
__________________________________________________________________________________________________
conv2d_11 (Conv2D)              (None, 14, 14, 192)  331968      batch_normalization_9[0][0]      
__________________________________________________________________________________________________
add_7 (Add)                     (None, 14, 14, 192)  0           conv2d_10[0][0]                  
                                                                 conv2d_11[0][0]                  
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 14, 14, 192)  768         add_7[0][0]                      
__________________________________________________________________________________________________
conv2d_12 (Conv2D)              (None, 14, 14, 192)  331968      batch_normalization_10[0][0]     
__________________________________________________________________________________________________
add_8 (Add)                     (None, 14, 14, 192)  0           conv2d_10[0][0]                  
                                                                 conv2d_12[0][0]                  
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 14, 14, 192)  768         add_8[0][0]                      
__________________________________________________________________________________________________
conv2d_13 (Conv2D)              (None, 14, 14, 192)  331968      batch_normalization_11[0][0]     
__________________________________________________________________________________________________
add_9 (Add)                     (None, 14, 14, 192)  0           conv2d_10[0][0]                  
                                                                 conv2d_13[0][0]                  
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 14, 14, 192)  768         add_9[0][0]                      
__________________________________________________________________________________________________
dropout_3 (Dropout)             (None, 14, 14, 192)  0           batch_normalization_12[0][0]     
__________________________________________________________________________________________________
conv2d_14 (Conv2D)              (None, 14, 14, 192)  331968      dropout_3[0][0]                  
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 14, 14, 192)  768         conv2d_14[0][0]                  
__________________________________________________________________________________________________
conv2d_15 (Conv2D)              (None, 14, 14, 192)  331968      batch_normalization_13[0][0]     
__________________________________________________________________________________________________
add_10 (Add)                    (None, 14, 14, 192)  0           conv2d_14[0][0]                  
                                                                 conv2d_15[0][0]                  
__________________________________________________________________________________________________
batch_normalization_14 (BatchNo (None, 14, 14, 192)  768         add_10[0][0]                     
__________________________________________________________________________________________________
conv2d_16 (Conv2D)              (None, 14, 14, 192)  331968      batch_normalization_14[0][0]     
__________________________________________________________________________________________________
add_11 (Add)                    (None, 14, 14, 192)  0           conv2d_14[0][0]                  
                                                                 conv2d_16[0][0]                  
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 14, 14, 192)  768         add_11[0][0]                     
__________________________________________________________________________________________________
conv2d_17 (Conv2D)              (None, 14, 14, 192)  331968      batch_normalization_15[0][0]     
__________________________________________________________________________________________________
add_12 (Add)                    (None, 14, 14, 192)  0           conv2d_14[0][0]                  
                                                                 conv2d_17[0][0]                  
__________________________________________________________________________________________________
batch_normalization_16 (BatchNo (None, 14, 14, 192)  768         add_12[0][0]                     
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)  (None, 1, 1, 192)    0           batch_normalization_16[0][0]     
__________________________________________________________________________________________________
flatten_1 (Flatten)             (None, 192)          0           max_pooling2d_2[0][0]            
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 11)           2123        flatten_1[0][0]                  
==================================================================================================
Total params: 5,340,491
Trainable params: 5,334,347
Non-trainable params: 6,144
__________________________________________________________________________________________________
@AlexDBlack AlexDBlack commented Nov 4, 2019

@victoriest OK, so this is pretty much what I expected, and what I mentioned in the other issue: Keras must be adding some implicit padding here.
We (correctly, IMO) consider a kernel size larger than the input size to be invalid. You have a 16x16 kernel operating on a 14x14 input, which doesn't really make sense.
That said, for import compatibility, we may need to allow Keras' way of implicitly adding padding.

out = MaxPool2D((16, 16), strides=(16, 16), padding='same')(rconv4)

batch_normalization_16 (BatchNo (None, 14, 14, 192)  768         add_12[0][0]                     
max_pooling2d_2 (MaxPooling2D)  (None, 1, 1, 192)    0           batch_normalization_16[0][0]

In the meantime, here are two more correct architectures:

Either of those should import fine.
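The two alternative snippets weren't captured in this thread. Based on the follow-up comment ("14*14 kernel"), they were presumably along these lines; the code below is a hypothetical reconstruction of the shape math, not the exact snippets from the original comment:

```python
import math

def pool_out(in_size, kernel, stride, padding='valid'):
    """Output size of 2D max-pooling along one spatial dimension (Keras semantics)."""
    if padding == 'same':
        # 'same' pads implicitly: output depends only on the stride
        return math.ceil(in_size / stride)
    if kernel > in_size:
        # invalid under 'valid' padding, and under DL4J's import-time check
        raise ValueError("kernel larger than input")
    return (in_size - kernel) // stride + 1

# Original, failing layer: MaxPool2D((16, 16), strides=(16, 16), padding='same')
# on a 14x14 feature map - only 'same' padding makes it legal in Keras:
print(pool_out(14, 16, 16, padding='same'))  # 1

# Fix 1: match the kernel to the feature map, e.g.
#   out = MaxPool2D((14, 14), strides=(14, 14))(rconv4)
print(pool_out(14, 14, 14))                  # 1

# Fix 2: global pooling, e.g. out = GlobalMaxPooling2D()(rconv4),
# which also reduces each 14x14 channel map to a single value.
```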

I'll open a separate issue for this Keras edge case, it's unrelated to tf.keras support for DL4J Keras import


@victoriest victoriest commented Nov 5, 2019

The solution worked (14*14 kernel). Much appreciated, and thanks for the reply.

@AlexDBlack AlexDBlack added this to the 1.0.0 Release milestone Nov 8, 2019