
[TF 2.0 keras] Unable to save and load weights for doubly nested models #27769

Closed · zzh8829 opened this issue Apr 12, 2019 · 16 comments
Labels: comp:keras (Keras related issues), type:bug (Bug)

zzh8829 commented Apr 12, 2019

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): mac
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.0.0
  • Python version: 3.7

Describe the current behavior
load_weights throws an exception on a doubly nested model

Describe the expected behavior
load_weights should work

This problem only happens with two or more levels of nested models that contain non-trainable weights.
The reason is that save_weights and load_weights handle nested models differently:
save_weights -> calls layer.weights for each layer
load_weights -> recursively calls model.weights if a layer is itself a nested Model
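
To make the asymmetry concrete, here is a purely illustrative sketch (not the actual hdf5_format.py code; every name below is made up) of how two enumeration strategies diverge once a nested sub-model that is not last carries non-trainable weights:

# Each entry mimics one nested BNModel as (trainable, non_trainable) weight names.
sub_models = [
    (["m1/conv/kernel", "m1/conv/bias", "m1/bn/gamma", "m1/bn/beta"],
     ["m1/bn/moving_mean", "m1/bn/moving_variance"]),
    (["m2/conv/kernel", "m2/conv/bias", "m2/bn/gamma", "m2/bn/beta"],
     ["m2/bn/moving_mean", "m2/bn/moving_variance"]),
]

def per_layer_order(models):
    # Strategy A: walk each sub-model and emit its trainable weights, then its non-trainable ones.
    return [w for t, nt in models for w in t + nt]

def grouped_order(models):
    # Strategy B: emit every trainable weight of the whole model first, then every non-trainable weight.
    return ([w for t, _ in models for w in t]
            + [w for _, nt in models for w in nt])

# The orderings disagree from index 4 onward, so saved arrays can be matched
# against the wrong variables (and shapes), which is how a transpose can fail.
assert per_layer_order(sub_models) != grouped_order(sub_models)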

Code to reproduce the issue

import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, Conv2D, BatchNormalization

shape = (None, None, 3)

def BNModel():
    x = inputs = Input(shape)
    x = Conv2D(3, 1)(x)
    x = BatchNormalization()(x)
    return Model(inputs, x)

x = inner_inputs = Input(shape)
x = BNModel()(x)
x = BNModel()(x)
inner_model = Model(inner_inputs, x)

inputs = Input(shape)
model = Model(inputs, inner_model(inputs))

inner_model.save_weights('test.h5')
inner_model.load_weights('test.h5')  # works fine

model.save_weights('test.h5')
model.load_weights('test.h5')   # Exception: axes don't match array !!!

Other info / logs
This bug is also reported on upstream keras keras-team/keras#11847
Here is a detailed analysis on why this is happening keras-team/keras#11847 (comment)

Full Exception

  File "test.py", line 27, in <module>
    model.load_weights('test.h5')   # Exception: axes don't match array !!!
  File "/usr/local/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 1497, in load_weights
    hdf5_format.load_weights_from_hdf5_group(f, self.layers)
  File "/usr/local/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 751, in load_weights_from_hdf5_group
    layer, weight_values, original_keras_version, original_backend)
  File "/usr/local/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 377, in preprocess_weights_for_loading
    weights = convert_nested_model(weights)
  File "/usr/local/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 365, in convert_nested_model
    original_backend=original_backend))
  File "/usr/local/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 377, in preprocess_weights_for_loading
    weights = convert_nested_model(weights)
  File "/usr/local/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 353, in convert_nested_model
    original_backend=original_backend))
  File "/usr/local/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 459, in preprocess_weights_for_loading
    weights[0] = np.transpose(weights[0], (3, 2, 0, 1))
  File "/usr/local/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 598, in transpose
    return _wrapfunc(a, 'transpose', axes)
  File "/usr/local/anaconda3/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 51, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
ValueError: axes don't match array
zzh8829 commented Apr 12, 2019

This only affects the .h5 format; the TensorFlow checkpoint format works fine.
I guess alternatively we could tell users not to use the h5 format instead of fixing it.

jvishnuvardhan self-assigned this Apr 15, 2019
jvishnuvardhan added the comp:keras (Keras related issues) and type:bug (Bug) labels Apr 15, 2019
jvishnuvardhan added the stat:awaiting tensorflower (Status - Awaiting response from tensorflower) label Apr 15, 2019
abhigyank commented:

@zzh8829 What is the alternative way to save a model/weights? I am having this problem in the .hdf5 format too.

zzh8829 commented May 4, 2019

@abhigyank The alternative is to save to *.tf, which will create TensorFlow checkpoint files instead of hdf5.
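
For example, a minimal sketch based on the reproduction script above (the filename is arbitrary; a non-HDF5 extension such as .tf, or an explicit save_format='tf', selects the checkpoint format):

# Same doubly nested `model` as above; '.tf' (or save_format='tf') writes
# TensorFlow checkpoint files instead of a single HDF5 file.
model.save_weights('weights.tf', save_format='tf')
model.load_weights('weights.tf')   # no "axes don't match array" error here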

bbrito commented Jun 18, 2019

Any news on this issue?

I tried the *.tf format and it works.

veqtor commented Jun 20, 2019

It might seem like .tf saving works, but in my experience the only difference is that it doesn't throw an error; the reloaded weights still don't reproduce the metrics recorded before saving.
Steps to reproduce (sketched in code after this list):
1. Make a model with nested models and set some layers to trainable=False
2. Train for some epochs
3. Save weights
4. Evaluate and save metrics
5. Clear everything
6. Make model
7. Load weights
8. Evaluate
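
A self-contained sketch of those steps, assuming the nested structure from the original report; the data, shapes, and checkpoint name are made up, and freezing one nested sub-model stands in for "set some layers to trainable=False". The point is to compare the metric before saving with the metric after reloading:

import numpy as np
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, Conv2D, BatchNormalization

def build_model(shape=(32, 32, 3)):
    # Same doubly nested structure as the original report.
    def bn_block():
        x = inputs = Input(shape)
        x = Conv2D(3, 1)(x)
        x = BatchNormalization()(x)
        return Model(inputs, x)

    x = inner_inputs = Input(shape)
    frozen = bn_block()
    frozen.trainable = False              # "set some layers to trainable=False"
    x = frozen(x)
    x = bn_block()(x)
    inner = Model(inner_inputs, x)

    inputs = Input(shape)
    return Model(inputs, inner(inputs))

x = np.random.rand(8, 32, 32, 3).astype("float32")
y = np.random.rand(8, 32, 32, 3).astype("float32")

model = build_model()
model.compile(optimizer="adam", loss="mse")
model.fit(x, y, epochs=2, verbose=0)       # train for some epochs
before = model.evaluate(x, y, verbose=0)   # evaluate and save metrics
model.save_weights("ckpt.tf")              # save weights

tf.keras.backend.clear_session()           # clear everything
model = build_model()                      # make model (again)
model.load_weights("ckpt.tf")              # load weights
after = model.evaluate(x, y, verbose=0)    # evaluate
print(before, after)                       # expected to match; the report above says they may not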

k-w-w commented Jun 20, 2019

I am currently submitting a fix for H5.

@veqtor What problem are you seeing with using the TF format?

tensorflow-copybara pushed a commit that referenced this issue Jun 21, 2019
Changed the test to the example from #27769.

PiperOrigin-RevId: 254305891
tensorflowbutler removed the stat:awaiting tensorflower (Status - Awaiting response from tensorflower) label Jun 21, 2019
dbalabka commented Jul 1, 2019

@k-w-w I have tested your fix and it works for me 😃 Thank you a lot!

19giorgosts commented:

@k-w-w How can I use your fix? I have the same problem.

k-w-w commented Aug 5, 2019

@19giorgosts The fix should be in tensorflow-nightly, which you can install using pip install tf-nightly

Lannist commented Aug 7, 2019

It might seem like .tf saving works, but in my experience the only difference is that it doesn't throw an error; the reloaded weights still don't reproduce the metrics recorded before saving.
Steps to reproduce:
1. Make a model with nested models and set some layers to trainable=False
2. Train for some epochs
3. Save weights
4. Evaluate and save metrics
5. Clear everything
6. Make model
7. Load weights
8. Evaluate

I am new to Keras. Can you show me a demo of what you are describing?
Thanks

jvishnuvardhan commented Aug 9, 2019

@Lannist Here is the colab gist to save/load the weights in *.tf format. Here is the gist to save/load the weights in *.h5 format. The only difference between the two gists is the file extension. Thanks!
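
For reference, a minimal sketch along those lines (the gist links are not reproduced here), using the model from the original report; only the extension passed to save_weights/load_weights changes:

# TensorFlow checkpoint format
model.save_weights('test_weights.tf')
model.load_weights('test_weights.tf')

# HDF5 format; works once the fix that is in tf-nightly is installed
model.save_weights('test_weights.h5')
model.load_weights('test_weights.h5')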

I am closing the issue as it was resolved in tf-nightly. Please feel free to reopen if the issue persists. Thanks!

ysyyork commented May 6, 2020

Is this change going to be in TF 1?

aii-guo commented May 13, 2020

Is this change going to be in TF 1?

Have you found a solution? I am using TensorFlow 1.1.4 and hit the same error, but I cannot find a way to fix it.

aii-guo commented May 13, 2020

@19giorgosts The fix should be in tensorflow-nightly, which you can install using pip install tf-nightly

What about TensorFlow 1.1.4 or 1.1.5? I cannot install tensorflow-nightly with pip.

paulaceccon commented Oct 5, 2021

I am currently submitting a fix for H5.

@veqtor What problem are you seeing with using the TF format?

That didn't work for me; using that fix in tf-nightly, I still hit the error with a siamese model such as:

import os
from typing import Optional

import numpy as np
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras import backend as K
from tensorflow.keras import layers
from tensorflow.keras.applications import EfficientNetB0, ResNet50
from tensorflow.keras.optimizers import Adam

def l1_distance(vects) -> float:
    """
    Finds the L1 distance between two vectors.

    Args:
        vects: List containing two tensors of same length.

    Returns:
        Element-wise L1 distance.
    """
    x, y = vects

    return K.abs(x - y)

def create_model(
    target_shape = (224, 224, 3),
    path = None,
) -> Model:
    """
    Creates the siamese model.

    Args:
        target_shape: image dimensions.
        path: path to best weights.

    Returns:
        Siamese model.
    """
    input_1 = layers.Input(shape=target_shape, name="inp_1")
    input_2 = layers.Input(shape=target_shape, name="inp_2")
    # input_1aug = img_augmentation(input_1)
    # input_2aug = img_augmentation(input_2)

    input = layers.Input(shape=target_shape, name="input")
    lambda_1 = layers.Lambda(
        lambda image: tf.keras.applications.resnet.preprocess_input(image),
        name="pre_process",
    )(input)
    base_cnn = ResNet50(
        weights="imagenet",
        input_tensor=lambda_1,
        input_shape=target_shape,
        include_top=False,
    )
    # CONV/FC -> BatchNorm -> ReLu(or other activation) -> Dropout -> CONV/FC ->
    pool = layers.MaxPooling2D(pool_size=(2, 2))(base_cnn.output)
    flatten = layers.Flatten(name="base_output_flatten")(pool)
    dense1 = layers.BatchNormalization(name="dense1_norm")(flatten)
    dense1 = layers.Dense(512, activation="relu", name="dense1")(dense1)
    dense1 = layers.Dropout(0.3, name="dense1_dropout")(dense1)
    dense2 = layers.BatchNormalization(name="dense2_norm")(dense1)
    dense2 = layers.Dense(256, activation="relu", name="dense2")(dense2)
    dense2 = layers.Dropout(0.2, name="dense2_dropout")(dense2)
    output = layers.Dense(256, name="dense_output")(dense2)

    embedding = Model(input, output, name="Embedding")

    trainable = False
    for layer in base_cnn.layers:
        if layer.name == "conv5_block1_out":
            trainable = True
        layer.trainable = trainable

    tower_1 = embedding(input_1)
    tower_2 = embedding(input_2)

    merge_layer = layers.Lambda(l1_distance, name="l1")([tower_1, tower_2])
    normal_layer = tf.keras.layers.BatchNormalization(name="l1_norm")(merge_layer)
    comparison_layer = layers.Dense(
        1,
        activation="sigmoid",
        name="final_layer",
    )(normal_layer)
    siamese = Model(inputs=[input_1, input_2], outputs=comparison_layer)

    if path is not None:
        siamese.load_weights(path)

    return siamese

early_stopping_callback = tf.keras.callbacks.EarlyStopping(
        monitor="loss", patience=5
    )
tensorboard_callback = tf.keras.callbacks.TensorBoard(
        log_dir="/logs", histogram_freq=1
    )
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
        filepath="/logs/weights{epoch:04d}.tf", save_weights_only=True, save_freq=1
    )

train_generator = get_train_generator(
        split_path, batch_size=batch_size, input_size=target_shape
    )
steps_per_epoch = len(train_generator)
clr = get_cyclical_lr(2 * steps_per_epoch)
optimizer = Adam(clr)

siamese = create_model(target_shape)

siamese.compile(
        loss=loss(margin=margin),
        optimizer=optimizer,
        metrics=[metrics.accuracy, metrics.precision, metrics.recall, metrics.f1],
    )
siamese.summary()

siamese.fit(
        train_generator,
        validation_data=get_valid_generator(
            split_path, batch_size=batch_size, input_size=target_shape
        ),
        epochs=epochs,
        callbacks=[
            early_stopping_callback,
            tensorboard_callback,
            model_checkpoint_callback,
        ],
        verbose=1,
    )

siamese = create_model(path="/content/weights00000012.h5")
ValueError                                Traceback (most recent call last)
<ipython-input-9-e1ec7f0dd441> in <module>()
----> 1 siamese = create_model(path="/content/weights00000012.h5")

3 frames
<__array_function__ internals> in transpose(*args, **kwargs)

/usr/local/lib/python3.7/dist-packages/numpy/core/fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
     56 
     57     try:
---> 58         return bound(*args, **kwds)
     59     except TypeError:
     60         # A TypeError occurs if the object does have such a method in its

ValueError: axes don't match array
