BasicDecoder fails with masked input tensor #534

Closed
kazemnejad opened this issue Sep 20, 2019 · 0 comments · Fixed by #546

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
  • TensorFlow version and how it was installed (source or binary): binary (2.0.0-dev20190920)
  • TensorFlow-Addons version and how it was installed (source or binary): binary (0.6.0-dev)
  • Python version: 3.6
  • Is GPU used? (yes/no): yes

Describe the bug

When BasicDecoder receives an input tensor that carries Keras masking metadata, it fails because of the unhandled mask argument in TrainingSampler.initialize(...). This is a common scenario, since such a tensor is typically produced by an Embedding layer with mask_zero=True.

Code to reproduce the issue

import tensorflow as tf
import tensorflow_addons as tfa

units = 32
vocab_size = 1000
embedding = tf.keras.layers.Embedding(
    input_dim=vocab_size, output_dim=units, mask_zero=True)
attention_mechanism = tfa.seq2seq.LuongAttention(units)
cell = tf.keras.layers.LSTMCell(units)
cell = tfa.seq2seq.AttentionWrapper(cell, attention_mechanism)
output_layer = tf.keras.layers.Dense(vocab_size)
sampler = tfa.seq2seq.sampler.TrainingSampler()
decoder = tfa.seq2seq.BasicDecoder(
    cell, sampler, output_layer=output_layer)

# Set up the attention mechanism's memory with a random tensor.
fake_mem = tf.random.normal((2, 3, units))
fake_mem_mask = tf.ones((2, 3), dtype=tf.bool)
attention_mechanism(fake_mem, mask=fake_mem_mask, setup_memory=True)

# Zero out some trailing word ids so the embedding layer produces a mask.
word_ids = tf.random.uniform((2, 5), 0, vocab_size, dtype=tf.int64) \
            * tf.constant([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]], dtype=tf.int64)
word_embeds = embedding(word_ids)  # carries `_keras_mask` metadata
mask = embedding.compute_mask(word_ids)

initial_state = cell.get_initial_state(batch_size=2, dtype=tf.float32)
outputs = decoder(
    inputs=word_embeds,
    initial_state=initial_state,
    sequence_length=tf.math.reduce_sum(tf.cast(mask, tf.int32), axis=1),
)

It will raise the following error:

  File "tensorflow_addons/seq2seq/basic_decoder.py", line 72, in initialize
    return self.sampler.initialize(inputs, **kwargs) + (initial_state,)
TypeError: initialize() got an unexpected keyword argument 'mask'

Other info / logs

This happens because the mask argument is captured in BaseDecoder.call(..., **kwargs) and propagated all the way down to TrainingSampler.initialize(...), which does not accept a mask argument.
A temporary workaround is to remove the _keras_mask attribute from the input tensor just before calling the decoder instance.
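A minimal sketch of that workaround, reusing the variables from the reproduction script above:

# Temporary workaround: strip the Keras mask metadata so it is never
# forwarded to TrainingSampler.initialize().
if hasattr(word_embeds, "_keras_mask"):
    delattr(word_embeds, "_keras_mask")

outputs = decoder(
    inputs=word_embeds,
    initial_state=initial_state,
    sequence_length=tf.math.reduce_sum(tf.cast(mask, tf.int32), axis=1),
)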

I encountered this issue while working on #335.

Ideas to fix this issue:

  • Fix BasicDecoder so that it only passes the arguments the sampler actually accepts, or alternatively make TrainingSampler absorb unexpected arguments via **kwargs.
  • Add proper support for a mask argument to TrainingSampler, in addition to sequence_length (similar to AttentionMechanism); see the sketch below.

I think there is no harm in the latter option, as it provides a more Keras-friendly way of working with the framework.
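A rough sketch of what the latter could look like; sequence_length_from_mask is a hypothetical helper for illustration, not part of the TF-Addons API:

import tensorflow as tf

def sequence_length_from_mask(mask, sequence_length=None):
    """Hypothetical helper: derive sequence_length from a Keras-style
    boolean mask of shape (batch, time) when none is given explicitly."""
    if sequence_length is not None:
        return sequence_length
    # Count the unmasked steps in each batch entry.
    return tf.math.reduce_sum(tf.cast(mask, tf.int32), axis=1)

# With the mask from the reproduction script, this yields [3 5].
mask = tf.constant([[True, True, True, False, False],
                    [True, True, True, True, True]])
print(sequence_length_from_mask(mask).numpy())  # [3 5]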
