## Chapter 12 Practical Exercises: Custom Models and Training with TensorFlow


##### 12. Implement a custom layer that performs layer normalization (we will use this type of layer in Chapter 15):


a. The build() method should define two trainable weights α and β, both of
shape input_shape[-1:] and data type tf.float32. α should be initialized
with 1s, and β with 0s.


In [16]:
import tensorflow as tf
import numpy as np

In [3]:
class NormalizeLayer(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def build(self, batch_input_shape):
        # One scale (alpha) and one offset (beta) per feature.
        # Keyword arguments avoid depending on add_weight()'s positional order.
        self.alphas = self.add_weight(name='alpha', shape=batch_input_shape[-1:],
                                      dtype=tf.float32,
                                      initializer=tf.keras.initializers.Constant(1.0))
        self.betas = self.add_weight(name='beta', shape=batch_input_shape[-1:],
                                     dtype=tf.float32,
                                     initializer=tf.keras.initializers.Constant(0.0))

    def call(self, inputs):
        pass  # implemented in part b

b. The `call()` method should compute the mean (μ) and standard deviation (σ) of each instance's features. For this, you can use `tf.nn.moments(inputs, axes=-1, keepdims=True)`, which returns the mean (μ) and the variance (σ^2) of all instances (compute the square root of the variance to get the standard deviation). Then, the function should compute and return:

$$
\alpha \otimes \left( \frac{X - \mu}{\sigma + \epsilon} \right) + \beta
$$

where ⊗ represents elementwise multiplication (`*`), and ε is a smoothing term (a small constant to avoid division by zero, e.g., 0.001).
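To make the formula concrete, here is a minimal plain-Python sketch (no TensorFlow) that applies it to a single instance. The helper name `layer_norm_row` and the sample values are assumptions made for illustration; they are not part of the exercise.

```python
import math

def layer_norm_row(row, alpha, beta, eps=0.001):
    """Apply alpha * (x - mu) / (sigma + eps) + beta to one instance.

    Hypothetical helper for illustration only.
    """
    n = len(row)
    mu = sum(row) / n                            # mean of the features
    var = sum((x - mu) ** 2 for x in row) / n    # population variance
    sigma = math.sqrt(var)                       # standard deviation
    return [a * (x - mu) / (sigma + eps) + b
            for x, a, b in zip(row, alpha, beta)]

row = [0.0, 10.0]
normalized = layer_norm_row(row, alpha=[1.0, 1.0], beta=[0.0, 0.0])
print(normalized)
```

With α = 1 and β = 0, the two features come out at roughly -0.9998 and 0.9998: centered on zero, with a magnitude just under 1 because of ε.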


In [11]:
tf.nn.moments(x=tf.constant([[33,3,200],[34,5,322]]),axes=-1,keepdims=True)

(<tf.Tensor: shape=(2, 1), dtype=int32, numpy=
 array([[ 78],
        [120]])>,
 <tf.Tensor: shape=(2, 1), dtype=int32, numpy=
 array([[ 7511],
        [20475]])>)
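Note that the integer inputs above make `tf.nn.moments()` return `int32` results, so the values appear to be computed in integer arithmetic. As a sanity check, the sketch below recomputes the mean and variance of the first row in plain Python with floats, using the standard `statistics` module rather than TensorFlow.

```python
from statistics import fmean, pvariance

# First row of the tensor above, as floats.
row = [33.0, 3.0, 200.0]

mean = fmean(row)                    # arithmetic mean
variance = pvariance(row, mu=mean)   # population variance (divides by n)

print(mean, variance)
```

The float mean is about 78.67 and the variance about 7510.89, versus 78 and 7511 in the integer output, so it is safer to pass float tensors to the layer.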

In [46]:
class NormalizeLayer(tf.keras.layers.Layer):
    def __init__(self, eps=0.001, **kwargs):
        super().__init__(**kwargs)
        self.eps = eps  # smoothing term to avoid division by zero

    def build(self, batch_input_shape):
        # One scale (alpha) and one offset (beta) per feature.
        self.alphas = self.add_weight(name='alpha', shape=batch_input_shape[-1:],
                                      dtype=tf.float32, initializer='ones')
        self.betas = self.add_weight(name='beta', shape=batch_input_shape[-1:],
                                     dtype=tf.float32, initializer='zeros')

    def call(self, X):
        # Per-instance mean and variance over the feature axis.
        mean, var = tf.nn.moments(x=X, axes=-1, keepdims=True)
        # eps is added inside the square root here, not to sigma as in the formula.
        return self.alphas * (X - mean) / tf.sqrt(var + self.eps) + self.betas

    def get_config(self):
        base_config = super().get_config()
        # The key must match the __init__ argument name so from_config() works.
        return {**base_config, "eps": self.eps}
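The `call()` method above adds ε inside the square root, `sqrt(σ² + ε)`, whereas the exercise's formula adds it to the standard deviation, `σ + ε`. As far as I know, the former is the form used by Keras's `LayerNormalization`, which would explain the close match in part c. The plain-Python sketch below compares the two scaling factors for a few variances; the function names `scale_sum` and `scale_inside` are hypothetical.

```python
import math

def scale_sum(var, eps=0.001):
    """Scaling factor from the exercise's formula: 1 / (sigma + eps)."""
    return 1.0 / (math.sqrt(var) + eps)

def scale_inside(var, eps=0.001):
    """Scaling factor used in call(): 1 / sqrt(var + eps)."""
    return 1.0 / math.sqrt(var + eps)

# Compare both variants for a large, a moderate, and a zero variance.
for var in (25.0, 1.0, 0.0):
    print(var, scale_sum(var), scale_inside(var))
```

For variances well above ε the two factors are nearly identical. They diverge only when the variance approaches zero, where both stay finite but differ in size.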

c. Ensure that your custom layer produces the same (or very nearly the same) output as the tf.keras.layers.LayerNormalization layer.


In [47]:
# Five instances with two features each.
data = tf.constant(np.arange(10).reshape(5, 2) * 10, dtype=tf.float32)
custom_layer_norm = NormalizeLayer()
keras_layer_norm = tf.keras.layers.LayerNormalization()

# Mean absolute difference between the two layers' outputs.
tf.reduce_mean(tf.keras.losses.mean_absolute_error(
    keras_layer_norm(data), custom_layer_norm(data)))


<tf.Tensor: shape=(), dtype=float32, numpy=4.7683717e-08>

The difference is very small, so it worked!

Now let's repeat the check with random weights instead of the default ones:

In [49]:
tf.keras.utils.set_random_seed(42)
random_alphas = np.random.rand(data.shape[-1])
random_betas = np.random.rand(data.shape[-1])

custom_layer_norm.set_weights([random_alphas, random_betas])
keras_layer_norm.set_weights([random_alphas, random_betas])

tf.reduce_mean(tf.keras.losses.mean_absolute_error(
    keras_layer_norm(data), custom_layer_norm(data)))

<tf.Tensor: shape=(), dtype=float32, numpy=1.1920929e-08>

----