seq2seq.SequenceLoss is probably incompatible with tf.keras.models.Model.compile #375
Describe the bug
I think there are two incompatibilities when we use `SequenceLoss` with Keras' `model.compile()`. Consider the following model:

```python
import tensorflow as tf

vocab_size = 1000  # hypothetical value; any positive vocab size reproduces the issue

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 50, mask_zero=True),
    tf.keras.layers.Dense(vocab_size, activation='softmax')
])
model.summary()
```
Now, let's compile the model using `SequenceLoss` as the loss:

```python
import tensorflow_addons as tfa

loss_obj = tfa.seq2seq.SequenceLoss()
model.compile(optimizer='adam', loss=loss_obj)
```
As you can see, it gives us an ambiguous error (complete traceback: none_value.log). However, I suppose that this issue is rooted in the custom `reduction` attribute of `SequenceLoss`. If we remove it before compiling:

```python
loss_obj = tfa.seq2seq.SequenceLoss()
delattr(loss_obj, 'reduction')
model.compile(optimizer='adam', loss=loss_obj)
```
It successfully passes that step, but it raises another exception (complete traceback: shape_mismatch.log):
I followed the traceback, and it turns out the mismatch comes from Keras' loss preparation step, which builds the target placeholder from the model output's static shape.
P.S.: I encountered this issue while working on #231.
@kazemnejad, thanks for the detective work.
For the first error, I think your approach is reasonable: we could delete the `reduction` attribute in the `SequenceLoss` constructor.
For the second error, this looks like strange Keras behavior. It creates a placeholder with the same static rank and dtype as the model output, but when running the graph the dynamic rank of the placeholder is different... It seems we could just accept a 3D `y_true` and squeeze it back to 2D inside the loss.
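A minimal sketch of that idea (the subclass name is hypothetical, and this assumes the 3D placeholder carries the sparse label ids in a trailing size-1 axis and that `SequenceLoss` exposes the usual Keras `Loss.__call__(y_true, y_pred, sample_weight)`):

```python
import tensorflow as tf
import tensorflow_addons as tfa

# Hypothetical wrapper: accept the 3D `y_true` placeholder Keras builds
# from the model output and squeeze it back to the 2D
# [batch_size, sequence_length] labels that SequenceLoss expects.
class KerasCompatibleSequenceLoss(tfa.seq2seq.SequenceLoss):
    def __call__(self, y_true, y_pred, sample_weight=None):
        if y_true.shape.rank == 3:
            # Drop the trailing size-1 axis holding the sparse label ids.
            y_true = tf.squeeze(y_true, axis=-1)
        return super().__call__(y_true, y_pred, sample_weight=sample_weight)
```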
Yes, of course. I'll give it a try by the end of this week.
Hi @kazemnejad, did you have a chance to try that?

Hi, yes, actually I'm currently working on it. I think I'll be able to send the pull request today. Thank you for your patience. …
Sorry for the very late reply. I was on a very long holiday.
For the `reduction` property, it might have something to do with the new distribution strategy, if one is used. Please see the docstring for more details.
There is also another incompatibility. As you know, we need to pass the per-timestep `sample_weight` to the loss, which in the Keras API means:
```python
model.compile(optimizer='adam', loss=loss_obj, sample_weight_mode="temporal")
model.fit(x, y, sample_weight=weights, ...)
```
where y's shape is `[batch_size, sequence_length]`, i.e. 2D.
If we take a look at training_utils.py#L883, we find that the following condition is the cause of the problem:
```python
# training_utils.py, lines 883-889
if len(y.shape) < 3:
    raise ValueError('Found a sample_weight array for '
                     'an input with shape ' + str(y.shape) + '. '
                     'Timestep-wise sample weighting (use of '
                     'sample_weight_mode="temporal") is restricted to '
                     'outputs that are at least 3D, i.e. that have '
                     'a time dimension.')
```
And this condition is pretty much a dead-end for us.
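For what it's worth, one hedged workaround that follows from this guard: give the 2D targets a dummy trailing axis so the `len(y.shape) < 3` check passes, and let a squeeze-aware loss (like the sketch above) restore them. All names and shapes below are illustrative, and this still presumes the `reduction` problem from the first error is solved:

```python
import numpy as np

batch_size, time_steps = 32, 20  # illustrative sizes
x = np.random.randint(1, vocab_size, size=(batch_size, time_steps))
y = np.random.randint(1, vocab_size, size=(batch_size, time_steps))
weights = np.ones((batch_size, time_steps), dtype='float32')  # stays 2D

y_3d = np.expand_dims(y, axis=-1)  # [batch, time] -> [batch, time, 1]
model.compile(optimizer='adam',
              loss=KerasCompatibleSequenceLoss(),  # hypothetical wrapper above
              sample_weight_mode='temporal')
model.fit(x, y_3d, sample_weight=weights)
```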
So what is the cause of all these incompatibilities?
Keras' mindset assumes:

- `y_true` has the same rank as the model output, i.e. 3D `[batch_size, sequence_length, vocab_size]` here;
- timestep-wise sample weighting (`sample_weight_mode="temporal"`) is only allowed for targets that are at least 3D.

But the `SequenceLoss` assumes the following:

- 2D integer labels of shape `[batch_size, sequence_length]`;
- 3D logits of shape `[batch_size, sequence_length, vocab_size]`;
- 2D weights of shape `[batch_size, sequence_length]`.
As you can see, there is no way to use them together, since each side raises an exception on the shapes the other expects.
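Until the loss itself is fixed, the shapes `SequenceLoss` documents can still be used by skipping `compile()`/`fit()` altogether. A minimal custom-training-loop sketch (hypothetical sizes; note the functional `tfa.seq2seq.sequence_loss` expects unnormalized logits, so the softmax is dropped here):

```python
import tensorflow as tf
import tensorflow_addons as tfa

vocab_size = 1000  # hypothetical
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 50, mask_zero=True),
    tf.keras.layers.Dense(vocab_size),  # no softmax: sequence_loss wants logits
])
optimizer = tf.keras.optimizers.Adam()

@tf.function
def train_step(x, y, weights):
    # x, y, weights: [batch, time]; logits: [batch, time, vocab]
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = tfa.seq2seq.sequence_loss(logits, y, weights)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```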