
BasicDecoder fails with masked input tensor #534

kazemnejad opened this issue Sep 20, 2019 · 0 comments · Fixed by #546

Labels: bug, seq2seq


kazemnejad commented Sep 20, 2019

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
  • TensorFlow version and how it was installed (source or binary): binary (2.0.0-dev20190920)
  • TensorFlow-Addons version and how it was installed (source or binary): binary (0.6.0-dev)
  • Python version: 3.6
  • Is GPU used? (yes/no): yes

Describe the bug

When BasicDecoder receives an input tensor that carries masking metadata, it fails because of the unhandled mask argument in TrainingSampler.initialize(...). This is a common scenario: the input tensor is produced by an Embedding layer with mask_zero=True, which attaches a Keras mask to its output.
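For context, mask_zero=True makes the Embedding layer attach a boolean mask marking every position whose token id is 0 as padding. A minimal pure-Python sketch of that semantics (an illustration only, not TensorFlow's implementation):

```python
# Sketch of the masking semantics behind Embedding(mask_zero=True):
# positions with token id 0 are treated as padding (mask value False).
# Plain Python for illustration -- not the actual Keras code.

def compute_mask(word_ids):
    """Return a boolean mask: True for real tokens, False for padding (id 0)."""
    return [[token != 0 for token in row] for row in word_ids]

word_ids = [[7, 12, 3, 0, 0],
            [5, 9, 1, 8, 2]]
mask = compute_mask(word_ids)
# mask[0] -> [True, True, True, False, False]
# mask[1] -> [True, True, True, True, True]
```

This is the metadata that ends up riding along with the embedded tensor and eventually reaches the sampler.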

Code to reproduce the issue

import tensorflow as tf
import tensorflow_addons as tfa

units = 32
vocab_size = 1000
embedding = tf.keras.layers.Embedding(
    input_dim=vocab_size, output_dim=units, mask_zero=True)
attention_mechanism = tfa.seq2seq.LuongAttention(units)
cell = tf.keras.layers.LSTMCell(units)
cell = tfa.seq2seq.AttentionWrapper(cell, attention_mechanism)
output_layer = tf.keras.layers.Dense(vocab_size)
sampler = tfa.seq2seq.sampler.TrainingSampler()
decoder = tfa.seq2seq.BasicDecoder(
    cell, sampler, output_layer=output_layer)

# setup the attention mechanism's memory with a random tensor
fake_mem = tf.random.normal((2, 3, units))
fake_mem_mask = tf.ones((2, 3), dtype=tf.bool)
attention_mechanism(fake_mem, mask=fake_mem_mask, setup_memory=True)

word_ids = tf.random.uniform((2, 5), 0, vocab_size, dtype=tf.int64) \
            * tf.constant([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]], dtype=tf.int64)
word_embeds = embedding(word_ids)
mask = embedding.compute_mask(word_ids)

initial_state = cell.get_initial_state(batch_size=2, dtype=tf.float32)
# Calling the decoder triggers the failure because word_embeds carries a
# Keras mask. (The inputs and initial_state arguments below are restored
# from context; the original snippet was cut off mid-call.)
outputs = decoder(
    word_embeds,
    initial_state=initial_state,
    sequence_length=tf.math.reduce_sum(tf.cast(mask, tf.int32), axis=1))
It will raise the following error:

  File "tensorflow_addons/seq2seq/", line 72, in initialize
    return self.sampler.initialize(inputs, **kwargs) + (initial_state,)
TypeError: initialize() got an unexpected keyword argument 'mask'

Other info / logs
This issue happens because the decoder call captures the mask argument in **kwargs and propagates it all the way through to TrainingSampler.initialize(...), but the initialize method doesn't accept a mask argument.
One can temporarily work around this error by calling delattr(word_embeds, '_keras_mask') just before invoking the decoder instance.
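The failure mode can be reproduced in plain Python, independent of TensorFlow: a wrapper that blindly forwards **kwargs to an initialize() that doesn't declare mask raises the same TypeError. (Class names below are illustrative stand-ins, not the actual tfa internals.)

```python
class TrainingSamplerSketch:
    # initialize() declares only the arguments it knows about -- no 'mask'.
    def initialize(self, inputs, sequence_length=None):
        return inputs, sequence_length

class DecoderSketch:
    def __init__(self, sampler):
        self.sampler = sampler

    def initialize(self, inputs, **kwargs):
        # kwargs (including a captured Keras mask) are forwarded verbatim,
        # mirroring what BasicDecoder does with the sampler.
        return self.sampler.initialize(inputs, **kwargs)

decoder = DecoderSketch(TrainingSamplerSketch())
try:
    decoder.initialize([1, 2, 3], sequence_length=3, mask=[True, True, True])
except TypeError as e:
    print(e)  # e.g. "initialize() got an unexpected keyword argument 'mask'"
```

Deleting the tensor's `_keras_mask` attribute works around the bug precisely because it stops the mask from being captured into **kwargs in the first place.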

I encountered this issue while working on #335.

Ideas to fix this issue:

  • Fix BasicDecoder to pass only the arguments the sampler accepts when it calls the sampler, or fix TrainingSampler to capture all extra arguments via **kwargs
  • Add proper support for the mask argument to TrainingSampler, in addition to sequence_length (similar to AttentionMechanism)

I see no harm in the latter option, as it provides a more Keras-friendly way to work with the framework.
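To make the second option concrete, here is a sketch of an initialize that accepts a mask and derives sequence_length from it when the latter is absent. The signature and length derivation are hypothetical illustrations (this is not the actual fix landed in #546); counting True entries per row mirrors the `tf.reduce_sum(tf.cast(mask, tf.int32), axis=1)` idiom for right-padded masks.

```python
def initialize(inputs, sequence_length=None, mask=None):
    """Accept either sequence_length or a boolean mask (hypothetical sketch).

    When only a mask is supplied, derive per-example lengths by counting
    True entries along the time axis, assuming right-padded sequences.
    """
    if sequence_length is None and mask is not None:
        sequence_length = [sum(1 for m in row if m) for row in mask]
    return inputs, sequence_length

mask = [[True, True, True, False, False],
        [True, True, True, True, True]]
_, lengths = initialize("dummy_inputs", mask=mask)
# lengths -> [3, 5]
```

With something like this in place, a masked tensor coming out of Embedding(mask_zero=True) could be consumed directly, with no explicit sequence_length and no `_keras_mask` workaround.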
