# Custom layers

This tutorial uses `tf.keras` as a high-level API for building neural networks. That said, most TensorFlow APIs are usable with eager execution.


In [1]:
import tensorflow as tf

In [2]:
print(bool(tf.config.list_physical_devices('GPU')))

True


## Layers: common sets of useful operations


In [3]:
# In the tf.keras.layers package, layers are objects. To construct a layer,
# simply construct the object. Most layers take as a first argument the number
# of output dimensions / channels.
layer = tf.keras.layers.Dense(100)
# The number of input dimensions is often unnecessary, as it can be inferred
# the first time the layer is used, but it can be provided if you want to
# specify it manually, which is useful in some complex models.
layer = tf.keras.layers.Dense(10, input_shape=(None, 5))

The full list of pre-existing layers can be seen in the `tf.keras.layers` documentation. It includes `Dense` (a fully-connected layer), `Conv2D`, `LSTM`, `BatchNormalization`, `Dropout`, and many others.

In [4]:
# To use a layer, simply call it.
layer(tf.zeros([10, 5]))

<tf.Tensor: shape=(10, 10), dtype=float32, numpy=
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)>
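
The shape inference mentioned above can be observed directly. The following is a minimal sketch (the name `lazy_layer` is only for illustration): a freshly constructed `Dense` layer holds no weights until its first call reveals the input size.

```python
import tensorflow as tf

# A fresh Dense layer has no weights yet: the input size is still unknown.
lazy_layer = tf.keras.layers.Dense(10)
built_before = lazy_layer.built
num_weights_before = len(lazy_layer.weights)

# The first call sees inputs with 5 features, so the kernel becomes (5, 10).
_ = lazy_layer(tf.zeros([3, 5]))
built_after = lazy_layer.built
kernel_shape = tuple(lazy_layer.kernel.shape)
bias_shape = tuple(lazy_layer.bias.shape)

print(built_before, num_weights_before)
print(built_after, kernel_shape, bias_shape)
```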

In [5]:
# Layers have many useful methods. For example, you can inspect all variables
# in a layer using `layer.variables` and trainable variables using
# `layer.trainable_variables`. In this case a fully-connected layer
# will have variables for weights and biases.
layer.variables

[<tf.Variable 'dense_1/kernel:0' shape=(5, 10) dtype=float32, numpy=
 array([[-0.21981409, -0.556763  , -0.00722533, -0.6025204 ,  0.11806881,
          0.49927992,  0.5916706 ,  0.10783505,  0.52594155,  0.01104635],
        [ 0.39414054,  0.34683132,  0.46735483, -0.05229956,  0.23294812,
          0.12009841, -0.48703358,  0.6022077 , -0.32566792,  0.07034373],
        [-0.40987128, -0.59633034, -0.2497774 ,  0.5014308 ,  0.31765997,
         -0.37472853, -0.03741425, -0.31100833,  0.47611362, -0.38121036],
        [ 0.39433593, -0.45738918, -0.17615733, -0.57191867,  0.24638736,
         -0.4380179 , -0.20712337,  0.16085309, -0.41435593, -0.46428642],
        [ 0.3286665 ,  0.09785253, -0.4027183 ,  0.17111355, -0.14382827,
          0.34538722, -0.5022633 ,  0.31599796,  0.06772214, -0.20462254]],
       dtype=float32)>,
 <tf.Variable 'dense_1/bias:0' shape=(10,) dtype=float32, numpy=array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>]

In [6]:
# The variables are also accessible through named attributes.
layer.kernel, layer.bias

(<tf.Variable 'dense_1/kernel:0' shape=(5, 10) dtype=float32, numpy=
 array([[-0.21981409, -0.556763  , -0.00722533, -0.6025204 ,  0.11806881,
          0.49927992,  0.5916706 ,  0.10783505,  0.52594155,  0.01104635],
        [ 0.39414054,  0.34683132,  0.46735483, -0.05229956,  0.23294812,
          0.12009841, -0.48703358,  0.6022077 , -0.32566792,  0.07034373],
        [-0.40987128, -0.59633034, -0.2497774 ,  0.5014308 ,  0.31765997,
         -0.37472853, -0.03741425, -0.31100833,  0.47611362, -0.38121036],
        [ 0.39433593, -0.45738918, -0.17615733, -0.57191867,  0.24638736,
         -0.4380179 , -0.20712337,  0.16085309, -0.41435593, -0.46428642],
        [ 0.3286665 ,  0.09785253, -0.4027183 ,  0.17111355, -0.14382827,
          0.34538722, -0.5022633 ,  0.31599796,  0.06772214, -0.20462254]],
       dtype=float32)>,
 <tf.Variable 'dense_1/bias:0' shape=(10,) dtype=float32, numpy=array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>)
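
The difference between `layer.variables` and `layer.trainable_variables` is easier to see on a layer that also holds non-trainable state. Below is a minimal sketch, assuming the standard behavior of `BatchNormalization`: two trainable variables, plus a moving mean and a moving variance that are not updated by gradient descent. The names `dense`, `bn`, `dense_counts`, and `bn_counts` are illustrative.

```python
import tensorflow as tf

dense = tf.keras.layers.Dense(10)
bn = tf.keras.layers.BatchNormalization()

# Call each layer once so that its variables are created.
_ = dense(tf.zeros([10, 5]))
_ = bn(tf.zeros([10, 5]))

# Each tuple is (all variables, trainable variables).
dense_counts = (len(dense.variables), len(dense.trainable_variables))
bn_counts = (len(bn.variables), len(bn.trainable_variables))

print("Dense:", dense_counts)
print("BatchNormalization:", bn_counts)
```

For the `Dense` layer both counts should match (kernel and bias), while `BatchNormalization` should report more variables than trainable variables.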

## Implementing custom layers

The best way to implement your own layer is to extend the `tf.keras.layers.Layer` class and implement:

1. `__init__`, where users can do all input-independent initialization
2. `build`, where users know the shapes of the input tensors and can do the rest of the initialization
3. `call`, where users do the forward computation

Note that users don't have to wait until `build` is called to create variables; they can also create them in `__init__`. However, creating them in `build` enables late variable creation based on the shape of the inputs the layer will operate on. Creating variables in `__init__`, on the other hand, means that the shapes required to create them must be specified explicitly.
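
To make the trade-off concrete, here is a sketch of the `__init__` alternative. `EagerDenseLayer` and its `num_inputs` parameter are hypothetical names introduced for illustration. Because no input has been seen when `__init__` runs, the caller has to supply the input size.

```python
import tensorflow as tf

class EagerDenseLayer(tf.keras.layers.Layer):
  """Hypothetical dense layer that creates its kernel in __init__."""

  def __init__(self, num_inputs, num_outputs):
    super().__init__()
    # The input size must be given explicitly: no input exists yet.
    self.kernel = self.add_weight(name="kernel",
                                  shape=[num_inputs, num_outputs])

  def call(self, inputs):
    return tf.matmul(inputs, self.kernel)

eager_layer = EagerDenseLayer(num_inputs=5, num_outputs=10)
# The kernel already exists before the layer has been called.
kernel_shape_before_call = tuple(eager_layer.kernel.shape)
output = eager_layer(tf.zeros([10, 5]))
print(kernel_shape_before_call, tuple(output.shape))
```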

In [7]:
class MyDenseLayer(tf.keras.layers.Layer):
  def __init__(self, num_outputs):
    super(MyDenseLayer, self).__init__()
    self.num_outputs = num_outputs

  def build(self, input_shape):
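    # Pass `name` by keyword: the positional argument order of `add_weight`
    # differs between Keras versions.
    self.kernel = self.add_weight(name="kernel",
                                  shape=[int(input_shape[-1]),
                                         self.num_outputs])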

  def call(self, inputs):
    return tf.matmul(inputs, self.kernel)

layer = MyDenseLayer(10)

In [8]:
_ = layer(tf.zeros([10, 5]))  # Calling the layer builds it (runs `build`).

In [9]:
print([var.name for var in layer.trainable_variables])

['my_dense_layer/kernel:0']
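
Variables registered with `add_weight` take part in automatic differentiation like those of any built-in layer. The sketch below repeats `MyDenseLayer` so that it runs on its own, then differentiates a toy loss (the sum of the outputs, chosen only for illustration) with respect to the kernel. The names `dense_layer`, `x`, and `grad_values` are illustrative.

```python
import tensorflow as tf

class MyDenseLayer(tf.keras.layers.Layer):
  def __init__(self, num_outputs):
    super().__init__()
    self.num_outputs = num_outputs

  def build(self, input_shape):
    self.kernel = self.add_weight(name="kernel",
                                  shape=[int(input_shape[-1]),
                                         self.num_outputs])

  def call(self, inputs):
    return tf.matmul(inputs, self.kernel)

dense_layer = MyDenseLayer(10)
x = tf.ones([4, 5])

with tf.GradientTape() as tape:
  # The first call builds the layer, so the kernel is created under the tape.
  loss = tf.reduce_sum(dense_layer(x))

grads = tape.gradient(loss, dense_layer.trainable_variables)
grad_values = grads[0].numpy()
print(grad_values.shape)
```

With an all-ones input of 4 rows, every entry of the kernel gradient equals the number of rows, which makes the result easy to check by hand.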


Overall, code is easier to read and maintain if it uses standard layers whenever possible, as other readers will be familiar with the behavior of standard layers.

## Models: Composing layers

Many interesting layer-like things in machine learning models are implemented by composing existing layers. For example, each residual block in a ResNet is a composition of convolutions, batch normalizations, and a shortcut. Layers can be nested inside other layers.
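
As a small illustration of nesting, the sketch below defines a hypothetical `TwoLayerBlock` that stores two `Dense` layers as attributes. The outer layer then reports the variables of both inner layers as its own.

```python
import tensorflow as tf

class TwoLayerBlock(tf.keras.layers.Layer):
  """Hypothetical layer that composes two Dense layers."""

  def __init__(self, hidden_units, num_outputs):
    super().__init__()
    # Layers assigned as attributes are tracked by the outer layer.
    self.hidden = tf.keras.layers.Dense(hidden_units, activation="relu")
    self.out = tf.keras.layers.Dense(num_outputs)

  def call(self, inputs):
    return self.out(self.hidden(inputs))

nested = TwoLayerBlock(hidden_units=8, num_outputs=2)
y = nested(tf.zeros([4, 5]))

# Kernel and bias of each inner Dense layer show up on the outer layer.
variable_shapes = [tuple(v.shape) for v in nested.trainable_variables]
print(tuple(y.shape), variable_shapes)
```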

Typically, users inherit from `keras.Model` when they need model methods such as `Model.fit`, `Model.evaluate`, and `Model.save` (see [Custom Keras layers and models](../../guide/keras/custom_layers_and_models.ipynb) for details).

Another feature that `keras.Model` provides over `keras.layers.Layer` is that, in addition to tracking variables, a `keras.Model` also tracks its internal layers, making them easier to inspect.
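
The following sketch shows what subclassing `keras.Model` adds in practice. `TinyRegressor` is a hypothetical model, and the random data, the `"sgd"` optimizer, and the `"mse"` loss are arbitrary choices made only to exercise `fit`, `evaluate`, and `layers`.

```python
import numpy as np
import tensorflow as tf

class TinyRegressor(tf.keras.Model):
  """Hypothetical two-layer model used only for illustration."""

  def __init__(self):
    super().__init__()
    self.hidden = tf.keras.layers.Dense(4, activation="relu")
    self.out = tf.keras.layers.Dense(1)

  def call(self, inputs, training=False):
    return self.out(self.hidden(inputs))

model = TinyRegressor()
model.compile(optimizer="sgd", loss="mse")

# Random data, only to exercise the API.
x = np.random.rand(16, 3).astype("float32")
y = np.random.rand(16, 1).astype("float32")

history = model.fit(x, y, epochs=2, batch_size=8, verbose=0)
eval_loss = model.evaluate(x, y, verbose=0)

# The model tracks its inner layers as well as their variables.
num_tracked_layers = len(model.layers)
print(num_tracked_layers, len(history.history["loss"]))
```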

Here is a ResNet block:

In [10]:
class ResnetIdentityBlock(tf.keras.Model):
  def __init__(self, kernel_size, filters):
    super(ResnetIdentityBlock, self).__init__(name='')
    filters1, filters2, filters3 = filters

    self.conv2a = tf.keras.layers.Conv2D(filters1, (1, 1))
    self.bn2a = tf.keras.layers.BatchNormalization()

    self.conv2b = tf.keras.layers.Conv2D(filters2, kernel_size, padding='same')
    self.bn2b = tf.keras.layers.BatchNormalization()

    self.conv2c = tf.keras.layers.Conv2D(filters3, (1, 1))
    self.bn2c = tf.keras.layers.BatchNormalization()

  def call(self, input_tensor, training=False):
    x = self.conv2a(input_tensor)
    x = self.bn2a(x, training=training)
    x = tf.nn.relu(x)

    x = self.conv2b(x)
    x = self.bn2b(x, training=training)
    x = tf.nn.relu(x)

    x = self.conv2c(x)
    x = self.bn2c(x, training=training)

    x += input_tensor
    return tf.nn.relu(x)


block = ResnetIdentityBlock(1, [1, 2, 3])

In [11]:
_ = block(tf.zeros([1, 2, 3, 3])) 

In [12]:
block.layers

[<keras.layers.convolutional.Conv2D at 0x2c194150d30>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x2c194401a00>,
 <keras.layers.convolutional.Conv2D at 0x2c19419ad30>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x2c19419a190>,
 <keras.layers.convolutional.Conv2D at 0x2c1fc3af250>,
 <keras.layers.normalization.batch_normalization.BatchNormalization at 0x2c1fc3af700>]

In [13]:
len(block.variables)

18
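
The count of 18 can be broken down per layer type. The sketch below repeats the block definition from the cell above so that it runs on its own, then tallies variables by layer class. The tallying code and the names `counts` and `total` are illustrative additions.

```python
import tensorflow as tf

class ResnetIdentityBlock(tf.keras.Model):
  def __init__(self, kernel_size, filters):
    super().__init__()
    filters1, filters2, filters3 = filters
    self.conv2a = tf.keras.layers.Conv2D(filters1, (1, 1))
    self.bn2a = tf.keras.layers.BatchNormalization()
    self.conv2b = tf.keras.layers.Conv2D(filters2, kernel_size, padding='same')
    self.bn2b = tf.keras.layers.BatchNormalization()
    self.conv2c = tf.keras.layers.Conv2D(filters3, (1, 1))
    self.bn2c = tf.keras.layers.BatchNormalization()

  def call(self, input_tensor, training=False):
    x = tf.nn.relu(self.bn2a(self.conv2a(input_tensor), training=training))
    x = tf.nn.relu(self.bn2b(self.conv2b(x), training=training))
    x = self.bn2c(self.conv2c(x), training=training)
    x += input_tensor
    return tf.nn.relu(x)

block = ResnetIdentityBlock(1, [1, 2, 3])
_ = block(tf.zeros([1, 2, 3, 3]))

# Tally the variables held by each kind of layer.
counts = {}
for layer in block.layers:
  kind = type(layer).__name__
  counts[kind] = counts.get(kind, 0) + len(layer.variables)
total = sum(counts.values())
print(counts, total)
```

Each `Conv2D` contributes a kernel and a bias, and each `BatchNormalization` contributes four variables (scale, offset, moving mean, moving variance), so three of each add up to the total.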

In [14]:
block.summary()

Model: "resnet_identity_block"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              multiple                  4         
_________________________________________________________________
batch_normalization (BatchN  multiple                  4         
ormalization)                                                    
_________________________________________________________________
conv2d_1 (Conv2D)            multiple                  4         
_________________________________________________________________
batch_normalization_1 (Batc  multiple                  8         
hNormalization)                                                  
_________________________________________________________________
conv2d_2 (Conv2D)            multiple                  9         
_________________________________________________________________
batch_normalization_2 (Batc  multiple        

Much of the time, however, models which compose many layers simply call one layer after the other. This can be done in very little code using `tf.keras.Sequential`:

In [15]:
my_seq = tf.keras.Sequential([tf.keras.layers.Conv2D(1, (1, 1),
                                                    input_shape=(
                                                        None, None, 3)),
                             tf.keras.layers.BatchNormalization(),
                             tf.keras.layers.Conv2D(2, 1,
                                                    padding='same'),
                             tf.keras.layers.BatchNormalization(),
                             tf.keras.layers.Conv2D(3, (1, 1)),
                             tf.keras.layers.BatchNormalization()])
my_seq(tf.zeros([1, 2, 3, 3]))

<tf.Tensor: shape=(1, 2, 3, 3), dtype=float32, numpy=
array([[[[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.],
         [0., 0., 0.]]]], dtype=float32)>
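
Calling a `Sequential` model amounts to feeding the output of each layer into the next. The sketch below checks this on a small, hypothetical two-layer model (`seq`) by comparing it against an explicit loop over `seq.layers`.

```python
import numpy as np
import tensorflow as tf

seq = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(2),
])

x = tf.ones([3, 5])
y_seq = seq(x)

# Apply the same layers by hand, one after the other.
y_manual = x
for layer in seq.layers:
  y_manual = layer(y_manual)

same = bool(np.allclose(y_seq.numpy(), y_manual.numpy()))
print(same)
```

This is also why `Sequential` alone cannot express the shortcut in the ResNet block above: the block adds its input back to the output, which is not a plain chain of layers.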

In [16]:
my_seq.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_3 (Conv2D)            (None, None, None, 1)     4         
_________________________________________________________________
batch_normalization_3 (Batc  (None, None, None, 1)     4         
hNormalization)                                                  
_________________________________________________________________
conv2d_4 (Conv2D)            (None, None, None, 2)     4         
_________________________________________________________________
batch_normalization_4 (Batc  (None, None, None, 2)     8         
hNormalization)                                                  
_________________________________________________________________
conv2d_5 (Conv2D)            (None, None, None, 3)     9         
_________________________________________________________________
batch_normalization_5 (Batc  (None, None, None, 3)     1