In [1]:
import tensorflow as tf
print(tf.__version__)

2.0.0-beta1


You may occasionally want to build an architecture that contains an exotic layer for
which TensorFlow does not provide a default implementation. In this case, you will
need to create a custom layer. Or sometimes you may simply want to build a very
repetitive architecture, containing identical blocks of layers repeated many times, and
it would be convenient to treat each block of layers as a single layer. For example, if
the model is a sequence of layers A, B, C, A, B, C, A, B, C, then you might want to
define a custom layer D containing layers A, B, C, and your model would then simply
be D, D, D. Let’s see how to build custom layers.
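
As a rough preview of the D, D, D idea, the sketch below bundles three sub-layers into one custom layer and stacks it three times. The class name `BlockD`, the choice of `Dense` sub-layers, and the `units=8` default are illustrative assumptions, not part of the original notes.

```python
import tensorflow as tf

class BlockD(tf.keras.layers.Layer):
    """Hypothetical composite layer that bundles three sub-layers A, B, C."""
    def __init__(self, units=8, **kwargs):
        super().__init__(**kwargs)
        # Sub-layers stored as attributes are tracked automatically by Keras.
        self.a = tf.keras.layers.Dense(units, activation="relu")
        self.b = tf.keras.layers.Dense(units, activation="relu")
        self.c = tf.keras.layers.Dense(units)

    def call(self, X):
        # Apply A, then B, then C.
        return self.c(self.b(self.a(X)))

# The model A, B, C, A, B, C, A, B, C becomes simply D, D, D.
model = tf.keras.Sequential([BlockD(), BlockD(), BlockD()])
out = model(tf.zeros([4, 5]))  # batch of 4 samples with 5 features each
```

Each `BlockD` instance owns its own weights, so the three blocks are trained independently even though they share one definition.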

### 1. Custom layers with single input and single output

Some layers have no weights, such as ```tf.keras.layers.Flatten``` or ```tf.keras.layers.ReLU```.
If you want to create a custom layer without any weights, the simplest
option is to write a function and wrap it in a keras.layers.Lambda layer. For example, 
the following layer will apply the exponential function to its inputs:
```
exponential_layer = tf.keras.layers.Lambda(lambda x: tf.exp(x))
```

**This custom layer can then be used like any other layer, using the sequential API, the
functional API, or the subclassing API.**
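
A minimal sketch of such a weightless layer in use, both called directly and placed in a small sequential model. The `Dense(1)` layer and the input values are arbitrary choices made only for illustration.

```python
import tensorflow as tf

# Weightless layer that applies exp() element-wise to its inputs.
exponential_layer = tf.keras.layers.Lambda(lambda x: tf.exp(x))

# Calling the layer directly on a tensor works like calling any other layer.
x = tf.constant([[0.0, 1.0, 2.0]])
y = exponential_layer(x)

# A Lambda layer creates no trainable variables.
n_weights = len(exponential_layer.trainable_weights)

# Used as the last layer of a model, it forces strictly positive outputs.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1),
    tf.keras.layers.Lambda(lambda t: tf.exp(t)),
])
predictions = model(x)
```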

To build a custom stateful layer (i.e., a layer with weights), you need to create a subclass of the ```keras.layers.Layer``` class:

In [4]:
class MyDense(tf.keras.layers.Layer):
    def __init__(self, units, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.activation = tf.keras.activations.get(activation)
    def build(self, batch_input_shape):
        self.kernel = self.add_weight(name="kernel", 
                                      shape=[batch_input_shape[-1], self.units], 
                                      initializer="glorot_normal")
        self.bias = self.add_weight(name="bias", 
                                    shape=[self.units], 
                                    initializer="zeros")
        super().build(batch_input_shape) # must be at the end
    def call(self, X):
        return self.activation(X @ self.kernel + self.bias)
    def compute_output_shape(self, batch_input_shape):
        # wrap in tf.TensorShape first, in case a plain tuple or list is passed in
        return tf.TensorShape(tf.TensorShape(batch_input_shape).as_list()[:-1] + [self.units])
    def get_config(self):
        base_config = super().get_config()
        return {**base_config,
                "units": self.units,
                "activation": tf.keras.activations.serialize(self.activation)}

Let's walk through this code:

• The constructor takes all the hyperparameters as arguments (in this example just
```units``` and ```activation```), and importantly it also takes a ```**kwargs``` argument. It
calls the parent constructor, passing it the ```kwargs```: this takes care of standard
arguments such as ```input_shape```, ```trainable```, ```name```, and so on. Then it saves the
hyperparameters as attributes, converting the ```activation``` argument to the
appropriate activation function using the ```tf.keras.activations.get()``` function (it
accepts functions, standard strings like "relu" or "selu", or simply None).


• **The build() method’s role is to create the layer’s variables, by calling the
add_weight() method for each weight. The build() method is called the first
time the layer is used**. At that point, tf.keras will know the shape of this layer’s
inputs, and it will pass it to the ```build()``` method, which is often necessary to create 
some of the weights. For example, we need to know the number of neurons in
the previous layer in order to create the connection weights matrix (i.e., the "kernel"): 
this corresponds to the size of the last dimension of the inputs. At the end
of the build() method (and only at the end), you must call the parent’s build()
method: this tells tf.keras that the layer is built (it just sets self.built = True).


• The ```call()``` method actually performs the desired operations. In this case, we
compute the matrix multiplication of the inputs X and the layer’s kernel, we add
the bias vector, we apply the activation function to the result, and this gives us the
output of the layer.


• The ```compute_output_shape()``` method simply returns the shape of this layer’s
outputs. In this case, it is the same shape as the inputs, except the last dimension
is replaced with the number of neurons in the layer. Note that in tf.keras, shapes
are instances of the tf.TensorShape class, which you can convert to Python lists
using as_list(). 

You can generally omit the compute_output_shape() method, as tf.keras automatically 
infers the output shape, except when the layer is dynamic. 
In other Keras implementations, this method is either required or by default it assumes 
the output shape is the same as the input shape.


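• The ```get_config()``` method works just as in the earlier examples: it returns the layer's
configuration as a dictionary, merging the base configuration with the layer's own
hyperparameters, so that the layer can be saved and re-created later. Note that **we save the
activation function’s full configuration by calling tf.keras.activations.serialize().**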

You can now use a MyDense layer just like any other layer!
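
The walkthrough above can be checked with a short sketch. The class is repeated in compact form (without `compute_output_shape()`) only so that the snippet runs on its own; the input shape `(2, 3)` and the 4 units are arbitrary values chosen for illustration.

```python
import tensorflow as tf

# Compact copy of the MyDense layer above so this snippet is self-contained.
class MyDense(tf.keras.layers.Layer):
    def __init__(self, units, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, batch_input_shape):
        self.kernel = self.add_weight(name="kernel",
                                      shape=[batch_input_shape[-1], self.units],
                                      initializer="glorot_normal")
        self.bias = self.add_weight(name="bias", shape=[self.units],
                                    initializer="zeros")
        super().build(batch_input_shape)  # marks the layer as built

    def call(self, X):
        return self.activation(X @ self.kernel + self.bias)

    def get_config(self):
        base_config = super().get_config()
        return {**base_config, "units": self.units,
                "activation": tf.keras.activations.serialize(self.activation)}

layer = MyDense(4, activation="relu")
built_before = layer.built                 # no weights have been created yet
outputs = layer(tf.ones([2, 3]))           # the first call triggers build()
built_after = layer.built
kernel_shape = tuple(layer.kernel.shape)   # (input features, units)

# Re-create an equivalent, freshly initialized layer from its configuration.
clone = MyDense.from_config(layer.get_config())
```

Note that the clone shares the hyperparameters of the original layer but not its weights: `from_config()` only rebuilds the layer from the dictionary returned by `get_config()`.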

### 2. Custom Layers with multiple inputs and/or outputs

To create a layer with multiple inputs (e.g., ```Concatenate```), the argument to the ```call()```
method should be a tuple containing all the inputs, and similarly the argument to the
```compute_output_shape()``` method should be a tuple containing each input’s batch
shape. To create a layer with multiple outputs, the ```call()``` method should return the
list of outputs, and the ```compute_output_shape()``` should return the list of batch output 
shapes (one per output). 

For example, the following toy layer takes two inputs
and returns three outputs:

In [6]:
class MyMultiLayer(tf.keras.layers.Layer):
    def call(self, X):
        X1, X2 = X
        return [X1 + X2, X1 * X2, X1 / X2]
    def compute_output_shape(self, batch_input_shape):
        b1, b2 = batch_input_shape
        return [b1, b1, b1] # should probably handle broadcasting rules

This layer may now be used like any other layer, but of course only using the functional 
and subclassing APIs, not the sequential API (which only accepts layers with
one input and one output).
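
A minimal sketch of wiring this toy layer into a functional-API model. The layer is repeated (without `compute_output_shape()`) so that the snippet is self-contained; the input shape and the sample values are assumptions chosen for illustration.

```python
import tensorflow as tf

# Copy of the toy layer above: two inputs, three outputs.
class MyMultiLayer(tf.keras.layers.Layer):
    def call(self, X):
        X1, X2 = X
        return [X1 + X2, X1 * X2, X1 / X2]

# Two symbolic inputs with two features each.
input_a = tf.keras.layers.Input(shape=[2])
input_b = tf.keras.layers.Input(shape=[2])

# The layer receives both inputs as one tuple and returns three tensors.
total, product, ratio = MyMultiLayer()((input_a, input_b))
model = tf.keras.Model(inputs=[input_a, input_b],
                       outputs=[total, product, ratio])

a = tf.constant([[6.0, 8.0]])
b = tf.constant([[2.0, 4.0]])
sum_out, prod_out, ratio_out = model([a, b])
```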

### 3. Custom Layers with different behaviour during testing and training

If your layer needs to behave differently during training and during testing
(e.g., if it uses ```Dropout``` or ```BatchNormalization``` layers), then you must add a ```training```
argument to the ```call()``` method and use this argument to decide what to do. For
example, let’s create a layer that adds Gaussian noise during training (for regularization)
but does nothing during testing (tf.keras actually has a layer that does the same thing: ```tf.keras.layers.GaussianNoise```):

In [9]:
class MyGaussianNoise(tf.keras.layers.Layer):
    def __init__(self, stddev, **kwargs):
        super().__init__(**kwargs)
        self.stddev = stddev
    def call(self, X, training=None):
        if training:
            noise = tf.random.normal(tf.shape(X), stddev=self.stddev)
            return X + noise
        else:
            return X
    def compute_output_shape(self, batch_input_shape):
        return batch_input_shape
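
The effect of the `training` argument can be seen by calling the layer directly in both modes. The sketch below repeats the layer (without `compute_output_shape()`) so that it runs on its own; the seed, the `stddev` value, and the input shape are arbitrary assumptions.

```python
import tensorflow as tf

# Copy of the noise layer above so this snippet is self-contained.
class MyGaussianNoise(tf.keras.layers.Layer):
    def __init__(self, stddev, **kwargs):
        super().__init__(**kwargs)
        self.stddev = stddev

    def call(self, X, training=None):
        if training:
            noise = tf.random.normal(tf.shape(X), stddev=self.stddev)
            return X + noise
        else:
            return X

tf.random.set_seed(42)  # only to make the noise reproducible
layer = MyGaussianNoise(stddev=1.0)
x = tf.zeros([10, 10])

test_out = layer(x, training=False)   # inputs pass through unchanged
train_out = layer(x, training=True)   # Gaussian noise is added
default_out = layer(x)                # training=None is treated as inference here
```

When the layer is used inside a model, Keras normally supplies the `training` value itself: `fit()` runs the layers in training mode, while `predict()` and `evaluate()` run them in inference mode.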