# TensorFlow-Keras Layers
*by Marvin Bertin*
<img src="../../images/keras-tensorflow-logo.jpg" width="400">


A deep learning model in TensorFlow is represented as layers composed with each other to form a complex, trainable model. Each layer represents a high-level operation in the computational graph. Layers can be thought of as Lego blocks that are combined and repeated across the architecture to form the neural network.

Below is Google's Inception model (GoogLeNet), an architecture built this way that achieves high accuracy on image classification.

<img src="../../images/googlenet.png" width="1500">


Below are examples of common layers provided by the TF-Keras `layers` module:

**Convolutional Layers**
```
tf_keras.layers.Conv1D
tf_keras.layers.Conv2D
tf_keras.layers.Conv3D
```

**Max-Pooling Layers**
```
tf_keras.layers.MaxPool1D
tf_keras.layers.MaxPool2D
tf_keras.layers.MaxPool3D
```

**Average Pooling Layers**
```
tf_keras.layers.AvgPool1D
tf_keras.layers.AvgPool2D
tf_keras.layers.AvgPool3D
```

**Fully-Connected Layer**
```
tf_keras.layers.Dense
```

**Other Layers**
```
tf_keras.layers.Flatten
tf_keras.layers.Dropout
tf_keras.layers.BatchNormalization
```

**Activation Functions**
```
tf_keras.activations.relu
tf_keras.activations.sigmoid
tf_keras.activations.softmax
tf_keras.activations.tanh
tf_keras.activations.elu
tf_keras.activations.hard_sigmoid
tf_keras.activations.softplus
tf_keras.activations.softsign
tf_keras.activations.linear
```

In [1]:
import tensorflow as tf

# Keras API bundled with TensorFlow 1.x (the contrib module was removed in TensorFlow 2.x)
tf_keras = tf.contrib.keras

## Convolutional Layer

2D convolution layer: creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs.

```
tf_keras.layers.Conv2D

Arguments:
    filters: Integer, the dimensionality of the output space
        (i.e. the number of output filters in the convolution).
    kernel_size: An integer or tuple/list of 2 integers, specifying the
        width and height of the 2D convolution window.
        Can be a single integer to specify the same value for
        all spatial dimensions.
    strides: An integer or tuple/list of 2 integers,
        specifying the strides of the convolution along the width and height.
        Can be a single integer to specify the same value for
        all spatial dimensions.
        Specifying any stride value != 1 is incompatible with specifying
        any `dilation_rate` value != 1.
    padding: one of `"valid"` or `"same"`.
    activation: Activation function to use.
        If you don't specify anything, no activation is applied
        (i.e. "linear" activation: `a(x) = x`).
    use_bias: Boolean, whether the layer uses a bias vector.
```

In [None]:
# number of output filters
filters = 10

# convolution window (kernel) size
kernel_size = (3, 3)

# conv2D - spatial convolution over images
tf_keras.layers.Conv2D(filters, kernel_size, strides=(1, 1), padding='valid',
                       activation=tf.nn.relu, use_bias=True,
                       kernel_initializer='glorot_uniform', bias_initializer='zeros')

# conv1D - temporal convolution (a single kernel length and stride)
tf_keras.layers.Conv1D(filters, kernel_size=3, strides=1, padding='valid',
                       activation=tf.nn.relu, use_bias=True,
                       kernel_initializer='glorot_uniform', bias_initializer='zeros')

# conv3D - spatial convolution over volumes (three kernel dimensions and strides)
tf_keras.layers.Conv3D(filters, kernel_size=(3, 3, 3), strides=(1, 1, 1), padding='valid',
                       activation=tf.nn.relu, use_bias=True,
                       kernel_initializer='glorot_uniform', bias_initializer='zeros')
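
The effect of `kernel_size`, `strides` and `padding` on the output size can be worked out by hand. The sketch below is plain Python rather than TF-Keras code. The helper `conv_output_length` and the 28x28 input size are illustrative assumptions, and dilation is ignored.

```python
import math


def conv_output_length(input_length, kernel_size, stride=1, padding="valid"):
    """Length of one spatial dimension after a convolution (no dilation)."""
    if padding == "valid":
        # Only positions where the kernel fits entirely inside the input.
        return (input_length - kernel_size) // stride + 1
    if padding == "same":
        # The input is zero-padded so that a stride of 1 preserves the length.
        return math.ceil(input_length / stride)
    raise ValueError("padding must be 'valid' or 'same'")


# Hypothetical 28x28 input, 3x3 kernel, stride 1, 10 filters.
filters = 10
height = conv_output_length(28, 3, stride=1, padding="valid")
width = conv_output_length(28, 3, stride=1, padding="valid")
output_shape = (height, width, filters)
print(output_shape)  # (26, 26, 10)
```

With `padding="same"` and a stride of 1, the same input would keep its 28x28 spatial size; only the channel dimension changes to the number of filters.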

## Max-Pooling Layer

This layer performs max-pooling, which downsamples the input. Downsampling shrinks the feature maps, which reduces the computation and the number of parameters in later layers, and can also help fight overfitting.

```
tf_keras.layers.MaxPool2D

Arguments:
    pool_size: integer or tuple of 2 integers,
        factors by which to downscale (vertical, horizontal).
        (2, 2) will halve the input in both spatial dimensions.
        If only one integer is specified, the same window length
        will be used for both dimensions.
    strides: Integer, tuple of 2 integers, or None.
        Strides values.
        If None, it will default to `pool_size`.
    padding: One of `"valid"` or `"same"` (case-insensitive).
    data_format: A string,
        one of `channels_last` (default) or `channels_first`.
        The ordering of the dimensions in the inputs.
        `channels_last` corresponds to inputs with shape
        `(batch, width, height, channels)` while `channels_first`
        corresponds to inputs with shape
        `(batch, channels, width, height)`.
```

In [None]:
# max-pooling 2D - spatial data
tf_keras.layers.MaxPool2D(pool_size=(2, 2), strides=(2, 2), padding='valid', data_format="channels_last")

# max-pooling 1D - temporal data (a single pool length and stride)
tf_keras.layers.MaxPool1D(pool_size=2, strides=2, padding='valid')

# max-pooling 3D - spatial or spatio-temporal data (three pool dimensions and strides)
tf_keras.layers.MaxPool3D(pool_size=(2, 2, 2), strides=(2, 2, 2), padding='valid', data_format="channels_last")
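
To make the downsampling concrete, here is a minimal plain-Python sketch of 2D max-pooling with `"valid"` padding on a single channel. The function `max_pool_2d` and the 4x4 `feature_map` are made up for illustration and are not part of TF-Keras.

```python
def max_pool_2d(grid, pool_size=2, stride=2):
    """Max-pool a 2D list of numbers using 'valid' padding."""
    out_rows = (len(grid) - pool_size) // stride + 1
    out_cols = (len(grid[0]) - pool_size) // stride + 1
    pooled = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Collect the values covered by the current pooling window.
            window = [
                grid[r * stride + i][c * stride + j]
                for i in range(pool_size)
                for j in range(pool_size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Illustrative 4x4 feature map (single channel).
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 1],
    [3, 4, 5, 8],
]
pooled = max_pool_2d(feature_map)
print(pooled)  # [[6, 4], [7, 9]]
```

A `(2, 2)` pool with stride 2 halves each spatial dimension, so the 4x4 map becomes 2x2 while keeping the strongest response in each window.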

## Average Pooling Layer

In [None]:
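# average pooling - takes the mean of each window instead of the maximum
tf_keras.layers.AvgPool1D(pool_size=2, strides=2, padding='valid')
tf_keras.layers.AvgPool2D(pool_size=(2, 2), strides=(2, 2), padding='valid')
tf_keras.layers.AvgPool3D(pool_size=(2, 2, 2), strides=(2, 2, 2), padding='valid')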

## Dropout
Dropout consists of randomly setting
a fraction `rate` of input units to 0 at each update during training time,
which helps prevent overfitting.

```
tf_keras.layers.Dropout

Arguments:
    rate: float between 0 and 1. Fraction of the input units to drop.
    noise_shape: 1D integer tensor representing the shape of the
        binary dropout mask that will be multiplied with the input.
        For instance, if your inputs have shape
        `(batch_size, timesteps, features)` and
        you want the dropout mask to be the same for all timesteps,
        you can use `noise_shape=(batch_size, 1, features)`.
```

In [None]:
# dropout - drop half of the input units during training
tf_keras.layers.Dropout(rate=0.5)
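
The idea behind dropout can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the TF-Keras implementation: the helper `dropout` is hypothetical, it assumes `rate < 1`, and it uses the common "inverted dropout" convention of scaling the kept units by `1 / (1 - rate)` so the expected activation is unchanged.

```python
import random


def dropout(values, rate, training=True, rng=random):
    """Zero each value with probability `rate`; scale the rest by 1 / (1 - rate)."""
    if not training or rate == 0.0:
        # At inference time the input passes through unchanged.
        return list(values)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(0)  # seeded generator so the run is repeatable
activations = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(activations, rate=0.5, rng=rng)
unchanged = dropout(activations, rate=0.5, training=False)
print(dropped)
print(unchanged)
```

Each training step draws a fresh mask, so a different subset of units is silenced every time.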

## Batch normalization layer

Normalizes the activations of the previous layer at each batch,
i.e. applies a transformation that keeps the mean activation
close to 0 and the activation standard deviation close to 1.

```
tf_keras.layers.BatchNormalization

Arguments:
    axis: Integer, the axis that should be normalized
        (typically the features axis).
        For instance, after a `Conv2D` layer with
        `data_format="channels_first"`,
        set `axis=1` in `BatchNormalization`.
    momentum: Momentum for the moving average.
    epsilon: Small float added to variance to avoid dividing by zero.
    center: If True, add offset of `beta` to normalized tensor.
        If False, `beta` is ignored.
    scale: If True, multiply by `gamma`.
        If False, `gamma` is not used.
        When the next layer is linear (also e.g. `nn.relu`),
        this can be disabled since the scaling
        will be done by the next layer.
    beta_initializer: Initializer for the beta weight.
    gamma_initializer: Initializer for the gamma weight.
    moving_mean_initializer: Initializer for the moving mean.
    moving_variance_initializer: Initializer for the moving variance.
```

In [None]:
tf_keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001, center=True,
                                   scale=True, beta_initializer='zeros', gamma_initializer='ones',
                                   moving_mean_initializer='zeros', moving_variance_initializer='ones')
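
As a rough illustration of the transformation, the sketch below normalizes a single feature across a batch in plain Python. The helper `batch_norm` and the sample values are assumptions for the example. It covers only the training-time computation with batch statistics, and leaves out the moving averages that the layer tracks for inference.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, epsilon=0.001):
    """Normalize one feature across a batch using the batch mean and variance."""
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    # epsilon keeps the division stable when the variance is close to zero.
    return [gamma * (x - mean) / math.sqrt(variance + epsilon) + beta for x in batch]


# Illustrative values of one feature for a batch of four examples.
feature_values = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(feature_values)

norm_mean = sum(normalized) / len(normalized)
norm_variance = sum((x - norm_mean) ** 2 for x in normalized) / len(normalized)
print(normalized)
print(norm_mean, norm_variance)
```

Here `gamma` and `beta` play the role of the learned scale and offset controlled by the `scale` and `center` arguments.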

## Fully Connected (Dense) Layer

A fully-connected layer computes:

`output = activation(dot(input, kernel) + bias)`
where:
- `activation` is the element-wise activation function
- `kernel` is a weights matrix created by the layer
- `bias` is a bias vector created by the layer

If the input to the layer has a rank greater than 2, the dot product with `kernel`
is taken along the last axis of the input. Apply `Flatten` first when each example
should be reduced to a single feature vector.

In [None]:
# number of output units (example value)
units = 10

# fully connected layer
tf_keras.layers.Dense(units, activation=None, use_bias=True,
                      kernel_initializer='glorot_uniform', bias_initializer='zeros')

# flatten to vector
tf_keras.layers.Flatten()
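
The formula `output = activation(dot(input, kernel) + bias)` can be traced by hand on a tiny example. The sketch below uses plain Python lists. The function `dense` and the numbers are illustrative assumptions; in the real layer, `kernel` and `bias` are learned during training.

```python
def relu(x):
    """Rectified linear unit for a single number."""
    return max(0.0, x)


def dense(inputs, kernel, bias, activation=None):
    """Compute activation(dot(inputs, kernel) + bias) for a batch of row vectors."""
    outputs = []
    for row in inputs:
        out_row = []
        for j in range(len(bias)):
            # Dot product of the input row with column j of the kernel, plus bias.
            total = sum(row[i] * kernel[i][j] for i in range(len(row))) + bias[j]
            out_row.append(activation(total) if activation else total)
        outputs.append(out_row)
    return outputs


# One example with 3 input features, mapped to 2 output units.
inputs = [[1.0, 2.0, 3.0]]
kernel = [[1.0, -1.0], [0.0, 1.0], [2.0, -1.0]]  # shape (3, 2)
bias = [0.5, 0.0]
outputs = dense(inputs, kernel, bias, activation=relu)
print(outputs)  # [[7.5, 0.0]]
```

The second unit's pre-activation value is -2.0, which ReLU clips to 0.0.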

## Activation Layer
An activation layer applies an activation function to its input. Nonlinear activations increase the nonlinear properties of the decision function; ReLU is non-saturating for positive inputs, whereas sigmoid and tanh saturate.

In [None]:
# example input tensor (illustrative values)
inputs = tf.constant([[-1.0, 0.0, 1.0]])

tf_keras.activations.relu(inputs)
tf_keras.activations.sigmoid(inputs)
tf_keras.activations.softmax(inputs)
tf_keras.activations.tanh(inputs)
tf_keras.activations.elu(inputs)
tf_keras.activations.hard_sigmoid(inputs)
tf_keras.activations.softplus(inputs)
tf_keras.activations.softsign(inputs)
tf_keras.activations.linear(inputs)
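
For intuition about what a few of these functions compute, here is a plain-Python sketch using the standard textbook definitions. The helper names and sample values are illustrative and are not the TF-Keras implementations.

```python
import math


def relu(x):
    """max(0, x): passes positive values through and zeroes out negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    """Turns a list of scores into probabilities that sum to 1."""
    largest = max(xs)
    exps = [math.exp(x - largest) for x in xs]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


values = [-1.0, 0.0, 1.0]
relu_out = [relu(v) for v in values]
sigmoid_out = [sigmoid(v) for v in values]
tanh_out = [math.tanh(v) for v in values]
softmax_out = softmax(values)
print(relu_out)  # [0.0, 0.0, 1.0]
print(sigmoid_out)
print(tanh_out)
print(softmax_out)
```

ReLU, sigmoid and tanh act on each element independently, while softmax normalizes across the whole vector.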

## Next Lesson
### CNN layers in TF-Keras
-  You will learn about the different layers in TF-Keras

<img src="../../images/divider.png" width="100">