Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems with keras model saving when there's a loss added with add_loss #30378

Closed
nguerinjr opened this issue Jul 3, 2019 · 5 comments
Closed
Assignees
Labels
comp:keras Keras related issues stat:awaiting tensorflower Status - Awaiting response from tensorflower TF 2.0 Issues relating to TensorFlow 2.0 type:bug Bug

Comments

@nguerinjr
Copy link

nguerinjr commented Jul 3, 2019

System information

System: windows 10, wsl with ubuntu 18 LTS
Tensorflow Version: 2.0.0b1 in CPU mode (default, installed from pip)
Python version: 3.6.8

It also happens in real linux environments (actually, it's easy to simulate each these errors)

Describe the current behavior

I'm having many problems when saving/loading keras models with custom loss (added with add_loss). I'll describe each one of the scenarios below (I think they're all related)

Code to reproduce the issue

inp = tf.keras.Input(batch_size=32, shape=(32, 32, 3))
tensor = tf.keras.layers.Conv2D(filters=16, kernel_size=3)(inp)
model = tf.keras.Model(inputs=inp, outputs=tensor)
model.add_loss(tf.keras.losses.mean_absolute_error(tensor, tensor + 1))
model.compile('adam')
tf.keras.experimental.export_saved_model(model, 'model.tf')

When not using a keras layer as loss, it produces a non-valid JSON:

Traceback (most recent call last):
File "/home/nguerinjr/Documents/deep_coding_project/teste.py", line 8, in
tf.keras.experimental.export_saved_model(model, 'model.tf')
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 169, in export_saved_model
_export_model_json(model, saved_model_path)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 177, in _export_model_json
model_json = model.to_json()
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 1449, in to_json
model_config, default=serialization.get_json_type, **kwargs)
File "/usr/lib/python3.7/json/init.py", line 238, in dumps
*kw).encode(obj)
File "/usr/lib/python3.7/json/encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python3.7/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/util/serialization.py", line 69, in get_json_type
raise TypeError('Not JSON Serializable:', obj)
TypeError: ('Not JSON Serializable:', b'\n\x03add\x12\x03Add\x1a\x0fconv2d/Identity\x1a\x05add/y
\x07\n\x01T\x12\x020\x01')

Now, a code that uses keras layers

Code to reproduce the issue

inp = tf.keras.Input(batch_size=32, shape=(32, 32, 3))
tensor = tf.keras.layers.Conv2D(filters=16, kernel_size=3)(inp)
model = tf.keras.Model(inputs=inp, outputs=tensor)
lbd = tf.keras.layers.Lambda(lambda i: tf.keras.losses.mean_absolute_error(i[0], i[1]))
model.add_loss(lbd([tensor, tensor + 1]))
model.compile('adam')
tf.keras.experimental.export_saved_model(model, 'model.tf')

Traceback (most recent call last):
File "/home/nguerinjr/Documents/deep_coding_project/teste.py", line 16, in
tf.keras.experimental.export_saved_model(model, 'model.tf')
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 166, in export_saved_model
input_signature)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 236, in _save_v1_format
_export_mode(mode_keys.ModeKeys.TRAIN, has_saved_vars, **export_args)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 299, in _export_mode
compile_clone=compile_clone)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/models.py", line 538, in clone_and_build_model
clone = clone_model(model, input_tensors=input_tensors)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/models.py", line 326, in clone_model
model, input_tensors=input_tensors, layer_fn=clone_function)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/models.py", line 202, in _clone_functional_model
model._insert_layers(ancillary_layers, relevant_nodes=relevant_nodes)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 1633, in _insert_layers
for node in layer.inbound_nodes
ValueError: min() arg is an empty sequence

When using a keras layer, it loses their inbound_nodes (I've debugged it). It puts them apart from other layers, but loses information of the objects (it's not a question of passing or not custom_objects as param).

Now, trying to use non-experimental saves/loads.

Code to reproduce the issue

inp = tf.keras.Input(batch_size=8, shape=(32, 32, 3))
tensor = tf.keras.layers.Conv2D(filters=8, kernel_size=(3, 3))(inp)
model = tf.keras.Model(inputs=inp, outputs=tensor)
lbd = tf.keras.layers.Lambda(lambda i: tf.keras.losses.mean_absolute_error(i[0], i[1]))
model.add_loss(lbd([tensor, tensor + 1]))
model.compile('adam')
model.save('model.keras.tf', save_format='tf')
tf.keras.models.load_model('model.keras.tf', custom_objects={'lambda': lbd})

A first annoying thing is that it's based on .ext. Even though it only saves in tf2 (put any extension), in load it verifies these extensions. It not a clear way of working in my opinion. Maybe some additional information on the files saved could make it not use the extensions.

Traceback (most recent call last):
File "/home/nguerinjr/Documents/deep_coding_project/teste.py", line 26, in
tf.keras.models.load_model('model.keras.tf', custom_objects={'lambda': lbd})
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/save.py", line 141, in load_model
return saved_model.load_from_saved_model_v2(filepath, compile)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model.py", line 1225, in load_from_saved_model_v2
model._training_config)) # pylint: disable=protected-access
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 458, in _method_wrapper
result = method(self, *args, **kwargs)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 337, in compile
self._compile_weights_loss_and_weighted_metrics()
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 458, in _method_wrapper
result = method(self, *args, **kwargs)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1494, in _compile_weights_loss_and_weighted_metrics
self.total_loss = self._prepare_total_loss(masks)
File "/home/nguerinjr/Documents/deep_coding_project/venv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1595, in _prepare_total_loss
raise ValueError('The model cannot be compiled '
ValueError: The model cannot be compiled because it has no loss to optimize.

The second is related to the loading, is that it does not accept no loss to compile. Keras accepts it (you can see in the examples). It accepts because it accounts for the non-default losses added. This is another saving/loading bug.

Both experimental and non-experimental functions seem to have some problems in saving / loading. I think if you simulate some custom scenarios with keras, you'll find a bunch of other errors. So, it's worth to take a whole look at them (specially considering that Keras is the default prototyping tool in tf2).

In the current situation, i've not thought of a simple and easy way to save keras models with custom components (those which are not in the list of arguments of compile)

@nguerinjr nguerinjr changed the title Problem with keras model saving when there's a loss added with add_loss Problems with keras model saving when there's a loss added with add_loss Jul 3, 2019
@ppham27
Copy link
Contributor

ppham27 commented Jul 4, 2019

Hi @nguerinjr,

Thanks for creating this issue. I've also struggled with this, but it's possible: 1ad6aae.

The trick is here:

outputs = keras.layers.Lambda(lambda x: x[0])((y, custom_loss))

to connect the loss calculation with the rest of the network. It's not ideal, I know. There really should be an auxiliary outputs type argument in Model, or add_loss should retrace graph, but the workaround isn't terrible in this case.

Hope this helps.

@rmothukuru rmothukuru self-assigned this Jul 4, 2019
@rmothukuru rmothukuru added comp:keras Keras related issues 2.0.0-beta0 stat:awaiting response Status - Awaiting response from author labels Jul 4, 2019
@rmothukuru
Copy link
Contributor

@nguerinjr ,
Can you please confirm if @ppham27's workaround is working for you.

@nguerinjr
Copy link
Author

Hi @ppham27, @rmothukuru ,

Yes, that's working in the scenario of the codes I've put here.

I'll implement this workaround in my real code, where I have a bunch of custom_objects.

As you've mentioned, @ppham27, it's not ideal. Since there are those problems in the code, it's a good deal to make things work this way.

Thanks!

@ppham27
Copy link
Contributor

ppham27 commented Jul 5, 2019

Great for what it's worth, I have a pending internal change that will trace the history when using add_loss and add_metric, so you don't need that line. Ideally, it will land at the end of next week, but I can't say for certain since I'm not actually on the TensorFlow team.

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Jul 6, 2019
@rmothukuru rmothukuru added the type:bug Bug label Jul 8, 2019
@jvishnuvardhan jvishnuvardhan added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jul 9, 2019
@tensorflow-bot
Copy link

Are you satisfied with the resolution of your issue?
Yes
No

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:keras Keras related issues stat:awaiting tensorflower Status - Awaiting response from tensorflower TF 2.0 Issues relating to TensorFlow 2.0 type:bug Bug
Projects
None yet
Development

No branches or pull requests

7 participants