tf.keras.layers.Softmax does not support masking? #27010

Closed
erikchwang opened this issue Mar 22, 2019 · 16 comments
Labels: comp:keras (Keras related issues) · stat:awaiting tensorflower (Status - Awaiting response from tensorflower) · type:support (Support issues)

Comments

erikchwang commented Mar 22, 2019

import tensorflow as tf
outputs = tf.keras.layers.Softmax().apply(
  tf.keras.layers.Masking().apply(
    tf.zeros([3,5,7])
  )
)

Since the default mask value of Masking is zero, Softmax should skip all values in the above case, and its behavior should be like sparse softmax. Therefore, I suppose the output should be all zeros, but that is not the case.
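
For reference, the same experiment written with direct layer calls instead of the deprecated .apply(); this is a minimal sketch assuming TF 2.x eager execution, and the uniform 1/7 value it prints matches the figure reported in the reply below:

import tensorflow as tf

# Reproduce the report with plain layer calls (TF 2.x eager assumed).
x = tf.zeros([3, 5, 7])                    # every timestep equals the default mask_value (0.0)
masked = tf.keras.layers.Masking()(x)      # values stay 0.0; a boolean mask is attached
outputs = tf.keras.layers.Softmax()(masked)

print(outputs[0, 0])                       # uniform 1/7, i.e. about 0.14285715, across the last axis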

@Gurpreetsingh9465 (Contributor)

@chwang85 Can you elaborate a bit? The output looks correct to me: a tensor of shape [3,5,7] filled with 0.14285715 (i.e. 1/7), which matches the softmax definition (see the Wikipedia article).

@erikchwang (Author)

I think my question is clear enough...
Maybe you need to learn what masking is first...

@Gurpreetsingh9465 (Contributor)

Sir, according to my understanding, a mask just skips those values which are equal to the mask value.

@erikchwang (Author)

The default mask value is zero, so Softmax should skip all values in my given case.

@Gurpreetsingh9465 (Contributor)

@chwang85 Masking actually replaces the masked values with 0. For example:
t = tf.fill([2, 2], 5.0)
m = tf.keras.layers.Masking(5)
print(m.apply(t))
""" output
tf.Tensor(
[[0. 0.]
 [0. 0.]], shape=(2, 2), dtype=float32)
"""

t = tf.fill([2, 2], 5.0)
m = tf.keras.layers.Masking()  # default mask_value 0.0
print(m.apply(t))
""" output
tf.Tensor(
[[5. 5.]
 [5. 5.]], shape=(2, 2), dtype=float32)
"""

erikchwang (Author) commented Mar 25, 2019

https://www.tensorflow.org/api_docs/python/tf/keras/layers/Masking

Masks a sequence by using a mask value to skip timesteps.

For each timestep in the input tensor (dimension # 1 in the tensor), if all values in the input tensor at that timestep are equal to mask_value, then the timestep will be masked (skipped) in all downstream layers (as long as they support masking).
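
A quick way to see the per-timestep semantics described in the docs, assuming TF 2.x: compute_mask returns one boolean flag per timestep, not one per value:

import tensorflow as tf

masking = tf.keras.layers.Masking()          # default mask_value = 0.0
x = tf.zeros([3, 5, 7])

mask = masking.compute_mask(x)
print(mask.shape)                            # (3, 5): one flag per (sample, timestep)
print(bool(tf.reduce_any(mask)))             # False: every timestep is entirely 0.0, so all are masked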

erikchwang (Author) commented Mar 26, 2019

Can no one explain why?
Does Softmax support masking?
If so, why are the masked values not skipped in Softmax (the "downstream" layer of Masking)?
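
One way to probe this, assuming TF 2.x: every Keras layer exposes a supports_masking flag. Its value for Softmax differs across TF/Keras versions, so print it rather than assuming:

import tensorflow as tf

masking = tf.keras.layers.Masking()
softmax = tf.keras.layers.Softmax()

# Whether Softmax declares mask support has changed across TF/Keras versions.
print("Masking supports_masking:", masking.supports_masking)
print("Softmax supports_masking:", softmax.supports_masking)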

@ymodak ymodak self-assigned this Mar 27, 2019
@ymodak ymodak added comp:keras Keras related issues type:support Support issues labels Mar 27, 2019
@ymodak ymodak assigned pavithrasv and unassigned ymodak Mar 27, 2019
@ymodak ymodak added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Mar 27, 2019
@hoangcuong2011

@erikchwang: My notes here might help you understand masking better: keras-team/keras#3086 (comment)

@erikchwang (Author)

So, can you explain the following case?

import tensorflow as tf
outputs = tf.keras.layers.Softmax().apply(
  tf.keras.layers.Masking().apply(
    tf.zeros([3,5,7])
  )
)

Since the default mask value of Masking is zero, Softmax should skip all values in the above case, and its behavior should be like sparse softmax. Therefore, I suppose the output should be all zeros, but that is not the case.

@hoangcuong2011

@erikchwang: If you look at the second, third, and fourth bullets in my comment, you will understand this. Yes, the output is not supposed to be zero all the time.

"- Masking is not that complicated if we understand how the loss is computed with masking. For instance let us assume we have a sequence with length 256. From this sequence we have a masking with only 4 elements that are with masking of 1 (others are with masking 0). I thought the loss is computed as the average between these 4 elements. Guess what - it is not! The average loss will be divided by 256 instead. For this reason sometimes the loss will be extremely small (0.0something) if we have only few 1 elements and long sequence.
Does it matter? I guess not, as what we need is the gradient of loss, rather than the loss itself.

  • When we use softmax as the last layer, the denominator would be the sum of exponential of all elements, regarding whether their masking is 1 or 0.
  • I thought the output of masking inputs is zeros all the time in LSTM. But it is not the case. Let us assume we have a masking:

0 0 0 1 1 0 0 0

With this case, the three first elements with masking zero has output of 0. However, the three last zeros have output that is as the same as the output of the last element with masking 1."

@erikchwang (Author)

I did not find this relevant to my question. Please just explain: why are the outputs of Softmax not all zeros when all the inputs are masked?

import tensorflow as tf
outputs = tf.keras.layers.Softmax().apply(
  tf.keras.layers.Masking().apply(
    tf.zeros([3,5,7])
  )
)

hoangcuong2011 commented Dec 8, 2019

@erikchwang: Even if you mask, the softmax layer still treats everything as usual. For instance, if you feed [3., 1., 2., 2., 0., 0.] into the softmax, regardless of whether you do masking or not, the output is always:
array([[ 0.50744212, 0.06867483, 0.18667753, 0.18667753, 0.02526405, 0.02526405]])
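
Those numbers can be reproduced with a plain softmax, assuming TF 2.x, which illustrates the point: the denominator sums the exponentials of every element, masked or not:

import tensorflow as tf

logits = tf.constant([3., 1., 2., 2., 0., 0.])
print(tf.nn.softmax(logits).numpy())
# [0.50744212 0.06867483 0.18667753 0.18667753 0.02526405 0.02526405]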

What masking does is notify the loss computation not to take into account the "neurons" that are masked - that is it, no more, no less. This is extremely useful, of course, because we do padding all the time.
Also, it is very useful for LSTMs, as they skip inputs that are zeros (i.e. missing inputs - see my picture for an example of why we need that). Note that an LSTM behaves a bit differently: if you have a mask of, say, 0 0 0 1 1 0 0 0, the output for the first three zeros is actually 0, but the output for the last three zeros is not 0.

In summary, don't expect the output of masking to be zeros, except for LSTMs, and only in the specific case I showed.

@erikchwang (Author)

You made too many assumptions. I do not use an LSTM, nor do I calculate a loss; I just want to verify whether Softmax supports masking. Now it seems that the answer is NO.
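
For what it is worth, the usual workaround when a softmax should actually ignore some positions is to push the masked logits toward -inf before the softmax, instead of relying on Keras mask propagation. A minimal sketch, assuming TF 2.x; masked_softmax is an illustrative name, not an existing API:

import tensorflow as tf

def masked_softmax(logits, mask):
    # Give (near-)zero probability to positions where mask is False by adding
    # a large negative number to those logits before normalizing.
    neg_inf = tf.constant(-1e9, dtype=logits.dtype)
    return tf.nn.softmax(tf.where(mask, logits, neg_inf), axis=-1)

logits = tf.constant([3., 1., 2., 2., 0., 0.])
mask = tf.constant([True, True, True, True, False, False])
print(masked_softmax(logits, mask).numpy())  # last two entries are ~0, the rest renormalize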

@erikchwang (Author)

Sometimes we need more flexibility than just stacking keras layers...
The graph-style tf.layers is much more flexible than the dynamic tf.keras.layers, but it has been DEPRECATED...

bw4sz commented Jan 15, 2021

I think this is a perfectly valid question which needs to be addressed. Can we reopen? Also asked on SO: https://stackoverflow.com/questions/65745053/tensorflow-softmax-does-not-ignore-masking-value
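
Follow-up note: in more recent TF releases, tf.keras.layers.Softmax accepts an explicit boolean mask argument in its call and internally adds a large negative value to the masked logits; whether the mask attached by Masking reaches it automatically depends on the version, so passing it explicitly is the safer route. A hedged sketch, worth checking against the installed version:

import tensorflow as tf

logits = tf.constant([[3., 1., 2., 2., 0., 0.]])
mask = tf.constant([[True, True, True, True, False, False]])

# The mask call argument is documented for newer tf.keras.layers.Softmax releases;
# verify it exists in the version you are running.
probs = tf.keras.layers.Softmax()(logits, mask=mask)
print(probs.numpy())  # masked positions get ~0 probability, the rest renormalize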

@pavithrasv pavithrasv removed their assignment Jan 16, 2021
bw4sz commented Jan 16, 2021
