Description
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): N/A
- TensorFlow version and how it was installed (source or binary): N/A
- TensorFlow-Addons version and how it was installed (source or binary): N/A
- Python version: N/A
- Is GPU used? (yes/no): N/A
Describe the bug
This bug is in https://colab.research.google.com/github/tensorflow/addons/blob/master/docs/tutorials/networks_seq2seq_nmt.ipynb
The loss function is not computed correctly: the mean should be taken only over the non-masked (non-padding) elements, but the current code averages over every position in the batch. This line:

```python
loss = tf.reduce_mean(loss)
```

should be replaced with:

```python
loss = tf.math.reduce_sum(loss) / tf.math.reduce_sum(mask)
```
This now gives the same results as keras.metrics.SparseCategoricalCrossentropy(from_logits=True), as expected.
```python
def loss_function(real, pred):
    # real shape = (BATCH_SIZE, max_length_output)
    # pred shape = (BATCH_SIZE, max_length_output, tar_vocab_size)
    cross_entropy = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction='none')
    loss = cross_entropy(y_true=real, y_pred=pred)
    mask = tf.logical_not(tf.math.equal(real, 0))  # 0 where y == 0 (padding), else 1
    mask = tf.cast(mask, dtype=loss.dtype)
    loss = mask * loss
    loss = tf.reduce_mean(loss)  # BUG: averages over padded positions as well
    return loss
```
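To illustrate the discrepancy numerically, here is a minimal NumPy sketch with made-up per-token losses and padded targets (the arrays are hypothetical, not taken from the tutorial): the buggy reduction divides the masked loss sum by every position, while the proposed fix divides only by the number of real tokens.

```python
import numpy as np

# Hypothetical batch of 2 sequences, max length 4; token id 0 is padding.
real = np.array([[5, 3, 2, 0],
                 [7, 1, 0, 0]])            # target ids (3 padded positions)
loss = np.array([[0.5, 0.2, 0.3, 0.9],
                 [0.4, 0.6, 0.8, 0.7]])    # per-token cross-entropy values

mask = (real != 0).astype(loss.dtype)      # 1 for real tokens, 0 for padding
masked = mask * loss                       # zero out losses at padded positions

# Buggy reduction: sum(masked) / 8 -- denominator counts the 3 padded slots.
buggy = masked.mean()

# Fixed reduction: sum(masked) / 5 -- denominator counts real tokens only.
fixed = masked.sum() / mask.sum()

print(buggy, fixed)  # 0.25 vs 0.4
```

The masked sum is 2.0 in both cases; only the denominator changes, so the buggy version systematically underestimates the loss whenever the batch contains padding.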
Code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem.
Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.