Problems with keras model saving when there's a loss added with add_loss #30378

nguerinjr · 2019-07-03T22:05:20Z

System information

System: windows 10, wsl with ubuntu 18 LTS
Tensorflow Version: 2.0.0b1 in CPU mode (default, installed from pip)
Python version: 3.6.8

It also happens in real Linux environments (actually, it's easy to reproduce each of these errors).

Describe the current behavior

I'm having several problems saving and loading Keras models that have a custom loss (added with add_loss). I describe each scenario below (I think they're all related).

Code to reproduce the issue

import tensorflow as tf

inp = tf.keras.Input(batch_size=32, shape=(32, 32, 3))
tensor = tf.keras.layers.Conv2D(filters=16, kernel_size=3)(inp)
model = tf.keras.Model(inputs=inp, outputs=tensor)
# The loss is a plain tensor expression; no Keras layer wraps it.
model.add_loss(tf.keras.losses.mean_absolute_error(tensor, tensor + 1))
model.compile('adam')
tf.keras.experimental.export_saved_model(model, 'model.tf')

When the loss is not produced by a Keras layer, the export fails because the model config cannot be serialized to JSON:

Traceback (most recent call last):
File "/home/nguerinjr/Documents/deep_coding_project/teste.py", line 8, in <module>
tf.keras.experimental.export_saved_model(model, 'model.tf')
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 169, in export_saved_model
_export_model_json(model, saved_model_path)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 177, in _export_model_json
model_json = model.to_json()
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 1449, in to_json
model_config, default=serialization.get_json_type, **kwargs)
File "/usr/lib/python3.7/json/__init__.py", line 238, in dumps
**kw).encode(obj)
File "/usr/lib/python3.7/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python3.7/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/util/serialization.py", line 69, in get_json_type
raise TypeError('Not JSON Serializable:', obj)
TypeError: ('Not JSON Serializable:', b'\n\x03add\x12\x03Add\x1a\x0fconv2d/Identity\x1a\x05add/y*\x07\n\x01T\x12\x020\x01')
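
The same kind of failure can be reproduced without TensorFlow. The sketch below is only an illustration: `get_json_type` here is a hypothetical stand-in for a `default=` hook that rejects unknown types, and the `config` layout is invented for the example. It shows that `json.dumps` cannot encode raw bytes, and that encoding them as text (base64 is one option, chosen here as an assumption) makes the config serializable.

```python
import base64
import json


def get_json_type(obj):
    """Hypothetical stand-in for a `default=` hook that rejects unknown types."""
    raise TypeError('Not JSON Serializable:', obj)


# Invented config fragment holding a serialized op as raw bytes.
config = {"name": "add", "node_def": b"\n\x03add\x12\x03Add"}

error = None
try:
    # json has no encoding for bytes, so it falls back to the default hook.
    json.dumps(config, default=get_json_type)
except TypeError as err:
    error = err

# One possible fix: store the bytes as base64 text before dumping.
safe_config = dict(config)
safe_config["node_def"] = base64.b64encode(config["node_def"]).decode("ascii")
encoded = json.dumps(safe_config)
```

Decoding `encoded` and reversing the base64 step gives back the original bytes, so no information is lost by the text encoding.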

Now, code that uses a Keras layer for the loss:

Code to reproduce the issue

import tensorflow as tf

inp = tf.keras.Input(batch_size=32, shape=(32, 32, 3))
tensor = tf.keras.layers.Conv2D(filters=16, kernel_size=3)(inp)
model = tf.keras.Model(inputs=inp, outputs=tensor)
# This time the loss is computed inside a Lambda layer.
lbd = tf.keras.layers.Lambda(lambda i: tf.keras.losses.mean_absolute_error(i[0], i[1]))
model.add_loss(lbd([tensor, tensor + 1]))
model.compile('adam')
tf.keras.experimental.export_saved_model(model, 'model.tf')

Traceback (most recent call last):
File "/home/nguerinjr/Documents/deep_coding_project/teste.py", line 16, in <module>
tf.keras.experimental.export_saved_model(model, 'model.tf')
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 166, in export_saved_model
input_signature)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 236, in _save_v1_format
_export_mode(mode_keys.ModeKeys.TRAIN, has_saved_vars, **export_args)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 299, in _export_mode
compile_clone=compile_clone)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/models.py", line 538, in clone_and_build_model
clone = clone_model(model, input_tensors=input_tensors)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/models.py", line 326, in clone_model
model, input_tensors=input_tensors, layer_fn=clone_function)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/models.py", line 202, in _clone_functional_model
model._insert_layers(ancillary_layers, relevant_nodes=relevant_nodes)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 1633, in _insert_layers
for node in layer.inbound_nodes
ValueError: min() arg is an empty sequence

When the loss comes from a Keras layer, the layer loses its inbound_nodes during cloning (I've debugged it). The export sets such layers apart from the other layers, but loses information about the objects (it's not a matter of whether custom_objects is passed).
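
The final `ValueError` is what Python raises whenever `min()` receives an empty iterable. The sketch below is a simplified illustration, not TensorFlow's implementation: the `Layer` class, the `depths` mapping, and both helper functions are hypothetical. It shows how a layer with no inbound nodes triggers the error, and how the `default=` argument of `min()` avoids it.

```python
class Layer:
    """Hypothetical minimal layer: a name plus a list of inbound node ids."""

    def __init__(self, name, inbound_nodes):
        self.name = name
        self.inbound_nodes = inbound_nodes


def first_node_depth(layer, node_depths):
    # Same shape as the failing expression: min() over a generator of nodes.
    return min(node_depths[node] for node in layer.inbound_nodes)


def first_node_depth_safe(layer, node_depths, default=None):
    # min() accepts a default that is returned when the iterable is empty.
    return min((node_depths[node] for node in layer.inbound_nodes),
               default=default)


depths = {"n1": 3, "n2": 1}
connected = Layer("conv2d", ["n1", "n2"])
orphan = Layer("lambda", [])  # a layer whose inbound nodes were lost

connected_depth = first_node_depth(connected, depths)
orphan_depth = first_node_depth_safe(orphan, depths)  # no exception raised
```

Calling `first_node_depth(orphan, depths)` raises `ValueError`, which matches the symptom in the traceback. Returning a default only hides the symptom; the underlying problem is still that the node information is missing.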

Now, trying the non-experimental save/load functions.

Code to reproduce the issue

import tensorflow as tf

inp = tf.keras.Input(batch_size=8, shape=(32, 32, 3))
tensor = tf.keras.layers.Conv2D(filters=8, kernel_size=(3, 3))(inp)
model = tf.keras.Model(inputs=inp, outputs=tensor)
lbd = tf.keras.layers.Lambda(lambda i: tf.keras.losses.mean_absolute_error(i[0], i[1]))
model.add_loss(lbd([tensor, tensor + 1]))
model.compile('adam')
# Save in the TF format, then load the model back right away.
model.save('model.keras.tf', save_format='tf')
tf.keras.models.load_model('model.keras.tf', custom_objects={'lambda': lbd})

The first annoyance is that the behavior depends on the file extension. Even though saving here always writes the TF2 format (whatever extension you use), loading checks the extension. That is not a clear way of working, in my opinion. Maybe some additional information in the saved files would make the extension check unnecessary.
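
As a rough illustration of that suggestion, the format could be inferred from what is on disk instead of from the name. The sketch below is an assumption about how such a check might look, not how TensorFlow does it: `detect_saved_format` is a hypothetical helper. It relies on two facts: a SavedModel is a directory containing `saved_model.pb` (or `saved_model.pbtxt`), and an HDF5 file normally starts with a fixed 8-byte signature. For simplicity it only checks the signature at offset 0.

```python
import os
import tempfile

# The 8-byte signature that opens an HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def detect_saved_format(path):
    """Hypothetical helper: return 'tf', 'h5', or None, ignoring the extension."""
    if os.path.isdir(path):
        names = set(os.listdir(path))
        if "saved_model.pb" in names or "saved_model.pbtxt" in names:
            return "tf"
        return None
    if os.path.isfile(path):
        with open(path, "rb") as f:
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return "h5"
    return None


# Small demonstration with deliberately misleading names.
with tempfile.TemporaryDirectory() as tmp:
    saved_model_dir = os.path.join(tmp, "model.any_extension")
    os.makedirs(saved_model_dir)
    open(os.path.join(saved_model_dir, "saved_model.pb"), "wb").close()

    h5_path = os.path.join(tmp, "weights.bin")
    with open(h5_path, "wb") as f:
        f.write(HDF5_SIGNATURE + b"rest of file")

    dir_format = detect_saved_format(saved_model_dir)
    file_format = detect_saved_format(h5_path)
    missing_format = detect_saved_format(os.path.join(tmp, "nope"))
```

With a check like this, the name passed to save and load would not need to carry any meaning.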

Traceback (most recent call last):
File "/home/nguerinjr/Documents/deep_coding_project/teste.py", line 26, in <module>
tf.keras.models.load_model('model.keras.tf', custom_objects={'lambda': lbd})
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/save.py", line 141, in load_model
return saved_model.load_from_saved_model_v2(filepath, compile)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 1225, in load_from_saved_model_v2
model._training_config)) # pylint: disable=protected-access
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 458, in _method_wrapper
result = method(self, *args, **kwargs)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 337, in compile
self._compile_weights_loss_and_weighted_metrics()
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 458, in _method_wrapper
result = method(self, *args, **kwargs)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1494, in _compile_weights_loss_and_weighted_metrics
self.total_loss = self._prepare_total_loss(masks)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1595, in _prepare_total_loss
raise ValueError('The model cannot be compiled '
ValueError: The model cannot be compiled because it has no loss to optimize.

The second problem is in loading: it refuses to compile a model that has no loss passed to compile. Keras itself accepts this (as you can see in the examples), because it takes into account the losses added outside of compile. This is another saving/loading bug.

Both the experimental and the non-experimental functions seem to have problems with saving and loading. I think that if you try some custom scenarios with Keras, you'll find a number of other errors. So it's worth taking a thorough look at them (especially considering that Keras is the default prototyping tool in TF2).

As things stand, I haven't found a simple way to save Keras models with custom components (those that are not in the argument list of compile).


ppham27 · 2019-07-04T00:01:18Z

Hi @nguerinjr,

Thanks for creating this issue. I've also struggled with this, but it's possible: 1ad6aae.

The trick is here:

tensorflow/tensorflow/python/keras/saving/hdf5_format_test.py

Line 821 in 1ad6aae

outputs = keras.layers.Lambda(lambda x: x[0])((y, custom_loss))

This line connects the loss calculation to the rest of the network. It's not ideal, I know. There really should be an auxiliary-outputs argument in Model, or add_loss should retrace the graph, but the workaround isn't terrible in this case.
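
Applied to the model from the report, the workaround might look like the sketch below. It assumes the TF 2.0-era Keras API used in this thread (in particular `Model.add_loss` on a functional model) and has not been checked against later releases. The function name `build_model_with_connected_loss` is hypothetical, and TensorFlow is imported inside the function only so that the definition can be loaded without it.

```python
def build_model_with_connected_loss():
    """Build the report's model with the loss tensor wired into the output."""
    import tensorflow as tf  # assumes a TF 2.0-era Keras API

    inp = tf.keras.Input(batch_size=32, shape=(32, 32, 3))
    tensor = tf.keras.layers.Conv2D(filters=16, kernel_size=3)(inp)

    # Compute the custom loss inside a Lambda layer, as in the report.
    custom_loss = tf.keras.layers.Lambda(
        lambda i: tf.keras.losses.mean_absolute_error(i[0], i[1])
    )([tensor, tensor + 1])

    # The workaround: a pass-through Lambda that returns the real output but
    # also takes the loss as an input, so the loss is part of the graph
    # between the model's inputs and outputs.
    outputs = tf.keras.layers.Lambda(lambda x: x[0])((tensor, custom_loss))

    model = tf.keras.Model(inputs=inp, outputs=outputs)
    model.add_loss(custom_loss)
    model.compile('adam')
    return model
```

The model returned by this function can then be saved in the same way as in the original report.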

Hope this helps.

rmothukuru · 2019-07-04T11:47:17Z

@nguerinjr ,
Can you please confirm whether @ppham27's workaround is working for you?

nguerinjr · 2019-07-05T16:53:31Z

Hi @ppham27, @rmothukuru ,

Yes, that works for the code samples I posted here.

I'll implement this workaround in my real code, where I have a bunch of custom_objects.

As you've mentioned, @ppham27, it's not ideal. But given the problems in the code, it's a reasonable way to make things work for now.

Thanks!

ppham27 · 2019-07-05T17:23:33Z

Great! For what it's worth, I have a pending internal change that will trace the history when using add_loss and add_metric, so you won't need that line. Ideally, it will land by the end of next week, but I can't say for certain since I'm not actually on the TensorFlow team.


nguerinjr changed the title ~~Problem with keras model saving when there's a loss added with add_loss~~ Problems with keras model saving when there's a loss added with add_loss Jul 3, 2019

rmothukuru self-assigned this Jul 4, 2019

rmothukuru added comp:keras Keras related issues 2.0.0-beta0 stat:awaiting response Status - Awaiting response from author labels Jul 4, 2019

tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Jul 6, 2019

rmothukuru added the type:bug Bug label Jul 8, 2019

rmothukuru assigned jvishnuvardhan and unassigned rmothukuru Jul 8, 2019

jvishnuvardhan assigned k-w-w and unassigned jvishnuvardhan Jul 9, 2019

jvishnuvardhan added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jul 9, 2019

tensorflow-copybara closed this as completed in a377701 Jul 18, 2019

nguerinjr mentioned this issue Aug 3, 2019

Serialization problems in keras when using add_loss or '.tf' extension for saving #31298

Closed

lvenugopalan added the TF 2.0 Issues relating to TensorFlow 2.0 label Apr 29, 2020
