Masking and padding are useful techniques when developing machine learning models, especially NLP models.

**Masking**

Masking is a way to indicate that certain timesteps of an input are missing and should be skipped in calculations.

**Padding**

Padding is useful when sequences have variable lengths and need to be brought to a common length. It simply fills the start or end of each shorter sequence with a placeholder value (0 by default in `pad_sequences`) until every sequence has the same length.

In [1]:
import tensorflow as tf

### Padding API

In [80]:
inputs = [
    [1,2,3],
    [4,5],
    [1,4,8,19]
]

In [81]:
padded_inputs = tf.keras.preprocessing.sequence.pad_sequences(
    inputs, padding="post"
)

In [82]:
padded_inputs

array([[ 1,  2,  3,  0],
       [ 4,  5,  0,  0],
       [ 1,  4,  8, 19]], dtype=int32)
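The same function can also pad at the start or cut sequences to a fixed length. The snippet below is a small sketch of those options; the variable names `sequences`, `pre_padded` and `truncated` are made up for illustration.

```python
import tensorflow as tf

sequences = [[1, 2, 3], [4, 5], [1, 4, 8, 19]]

# Pad at the start of each sequence instead of the end.
pre_padded = tf.keras.preprocessing.sequence.pad_sequences(
    sequences, padding="pre"
)

# Force a fixed length of 3: shorter sequences are padded at the end,
# longer ones lose their trailing elements (truncating="post").
truncated = tf.keras.preprocessing.sequence.pad_sequences(
    sequences, maxlen=3, padding="post", truncating="post"
)

print(pre_padded)
print(truncated)
```

Note that `truncating` defaults to `"pre"`, so without the explicit argument the longest sequence would lose its first element rather than its last.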

### Masking strategies

#### Using the `tf.keras.layers.Masking` layer

This layer takes an input and zeroes out every timestep whose features all equal `mask_value`. It also attaches a mask to its output: a boolean tensor indicating which timesteps are active.

Note that the mask generated by this layer has one axis fewer than the input. This becomes intuitive if you think about the `Embedding` layer with inputs of word indexes of shape `(batch_size, steps)`. Even though we get an output of shape `(batch_size, steps, embedding_dim)`, the boolean mask has shape `(batch_size, steps)`, indicating which words/steps are active.

In [84]:
mask = tf.keras.layers.Masking(mask_value=0.)

In [85]:
masked_values = mask(tf.cast(padded_inputs, dtype=tf.float32))

In [86]:
masked_values

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 1.,  2.,  3.,  0.],
       [ 4.,  5.,  0.,  0.],
       [ 1.,  4.,  8., 19.]], dtype=float32)>

In [87]:
masked_values._keras_mask

<tf.Tensor: shape=(3,), dtype=bool, numpy=array([ True,  True,  True])>

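If you want a mask with the same shape as the original input to the `Masking` layer, give the input an extra trailing axis first (here with `tf.expand_dims` and `tf.tile`), so that the layer reduces over that extra axis instead.

Let's say you want to mask the positions where 4.0 occurs; the resulting mask is `False` at exactly those positions.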

In [92]:
mask = tf.keras.layers.Masking(mask_value=4.)

In [95]:
mask(tf.tile(
    tf.expand_dims(tf.cast(padded_inputs, dtype=tf.float32), axis=-1), 
    multiples=[1,1,5]    # it can be any [1,1,n]
))._keras_mask

<tf.Tensor: shape=(3, 4), dtype=bool, numpy=
array([[ True,  True,  True,  True],
       [False,  True,  True,  True],
       [ True, False,  True,  True]])>

Note that the layer's output itself has shape `(3, 4, 5)` and is of no use to us here; only the mask is.

Two layers that can generate or attach masks are:

1. `Embedding` layer (with `mask_zero=True`)
2. `Masking` layer
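As a rough illustration of the `Embedding` case, the sketch below builds an embedding with `mask_zero=True` and asks it for its mask. The vocabulary size (`input_dim=20`) and `output_dim=8` are arbitrary values chosen for this example, not taken from the notebook.

```python
import tensorflow as tf

# Post-padded word indexes; 0 is reserved for padding.
padded = tf.constant([[1, 2, 3, 0],
                      [4, 5, 0, 0],
                      [1, 4, 8, 19]])

# mask_zero=True tells the layer to treat index 0 as padding.
embedding = tf.keras.layers.Embedding(input_dim=20, output_dim=8, mask_zero=True)

embedded = embedding(padded)                     # shape (3, 4, 8)
embedding_mask = embedding.compute_mask(padded)  # shape (3, 4), False at padding

print(embedded.shape)
print(embedding_mask)
```

The mask has one axis fewer than the embedded output, matching the shape relationship described earlier.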

### Flow of Masks in Keras Layers

Once a mask is generated or attached, it flows through the subsequent Keras layers, as long as those layers support masking.

Masks are computed by a layer's `compute_mask()` method.

Masks are consumed by passing them as the `mask` argument to the layer's `__call__` method.

Layers that support masking implement a `compute_mask` method and accept a `mask` argument in `__call__`. Check the LSTM layer implementation for a better understanding.
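To see what a mask-aware layer does with the `mask` argument, here is a small sketch that passes a mask to an `LSTM` by hand. The shapes, the random input and the layer size are arbitrary choices for this example. The expectation is that masked timesteps are skipped, so the first sample should give (up to floating-point noise) the same result as feeding only its first two steps.

```python
import numpy as np
import tensorflow as tf

# Batch of 2 sequences, 4 timesteps, 3 features each.
x = tf.random.normal((2, 4, 3))

# The first sequence only has 2 real timesteps; the rest is padding.
mask = tf.constant([[True, True, False, False],
                    [True, True, True, True]])

lstm = tf.keras.layers.LSTM(5)

# Pass the mask explicitly through __call__.
out_masked = lstm(x, mask=mask)

# Run the same layer on just the unpadded prefix of the first sequence.
out_short = lstm(x[:1, :2, :])

same = np.allclose(np.asarray(out_masked)[0], np.asarray(out_short)[0], atol=1e-4)
print(same)
```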

Custom layers destroy the mask by default. If you want the mask to flow through, either implement `compute_mask` or set `self.supports_masking = True` in the layer constructor (when the mask passes through unchanged).
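A minimal sketch of the `supports_masking` route is shown below. The `Scale` layer is a hypothetical example layer invented here; it changes values but not shapes, so the incoming mask can pass through unchanged.

```python
import tensorflow as tf

class Scale(tf.keras.layers.Layer):
    """Hypothetical layer that multiplies its input by a constant."""

    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor
        # Declare that the incoming mask is still valid for the output.
        self.supports_masking = True

    def call(self, inputs):
        return inputs * self.factor

padded = tf.constant([[1, 2, 0],
                      [3, 0, 0]])

# The Embedding layer attaches a mask to its output.
embedded = tf.keras.layers.Embedding(10, 4, mask_zero=True)(padded)

# Because Scale supports masking, the mask is carried over to its output.
scaled = Scale()(embedded)
propagated_mask = scaled._keras_mask

print(propagated_mask)
```

If the layer changed the time axis (for example by pooling over it), passing the mask through unchanged would be wrong, and a custom `compute_mask` would be the better fit.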

### For a New Custom Layer

The mask can be accessed simply by adding a `mask` argument to the layer's `call` method: `call(self, inputs, mask=None)`.
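As one possible use of that argument, the sketch below averages over the time axis while ignoring masked steps. `MaskedMean` is a hypothetical layer written for this example, and the input values are made up.

```python
import tensorflow as tf

class MaskedMean(tf.keras.layers.Layer):
    """Hypothetical layer: mean over the time axis, skipping masked steps."""

    def call(self, inputs, mask=None):
        if mask is None:
            # No mask available: plain mean over all timesteps.
            return tf.reduce_mean(inputs, axis=1)
        # Turn the boolean mask into 0/1 weights of shape (batch, steps, 1).
        weights = tf.cast(mask, inputs.dtype)[..., tf.newaxis]
        total = tf.reduce_sum(inputs * weights, axis=1)
        # Guard against division by zero for fully masked sequences.
        count = tf.maximum(tf.reduce_sum(weights, axis=1), 1.0)
        return total / count

    def compute_mask(self, inputs, mask=None):
        # The time axis is reduced away, so there is no mask to pass on.
        return None

values = tf.constant([[[1.0], [3.0], [0.0], [0.0]]])  # shape (1, 4, 1)
mask = tf.constant([[True, True, False, False]])

pooled = MaskedMean()(values, mask=mask)
print(pooled)
```

Only the two unmasked steps (1.0 and 3.0) contribute here, so the result is their mean rather than the mean over all four steps.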

In [None]:
The `mask` can be used inside the 