<a href="https://colab.research.google.com/github/ching-wong/my_notebooks/blob/main/Custom_Models_Layers_and_Loss_Functions_with_TensorFlow/Week3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Custom Models, Layers, and Loss Functions with TensorFlow - Week 3

This notebook shows how to define custom layers, either with a Lambda layer or as a class.

>[Custom Models, Layers, and Loss Functions with TensorFlow - Week 3](#scrollTo=VqW8A7V0q51z)

>>[What is a layer?](#scrollTo=X-TN-tNnjAGw)

>>[Lambda layer](#scrollTo=GpV1fDf9gXlP)

>>[Custom layer](#scrollTo=FtTiW_dgkwGA)

>>[Adding an activation function](#scrollTo=vr_-qh6-rYiY)



## What is a layer?

A layer is a class in a neural network that holds its state (the parameters) and defines a computation.

* State (weights): Variables that make each layer unique. They can be trainable (adjusted by the optimizer during training) or non-trainable (left untouched by the optimizer, e.g. the moving statistics of a batch normalization layer).

* Computation: Transforms inputs into outputs, known as the forward pass, passing results to the next layer.

For example, $Y = w * X + b$ is the computation, while $w$ and $b$ are the weights.
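To make the split between state and computation concrete, here is a minimal framework-free sketch. The class name `TinyLinear` is made up for illustration and it works on plain Python numbers, not tensors.

```python
class TinyLinear:
    """Hypothetical stand-in for a layer: state plus computation."""

    def __init__(self, w, b):
        # State: the values that make this layer unique
        self.w = w
        self.b = b

    def __call__(self, x):
        # Computation: the forward pass Y = w * X + b
        return self.w * x + self.b


layer = TinyLinear(w=2.0, b=-1.0)
y = layer(3.0)
print(y)  # 5.0
```

Two instances with different `w` and `b` share the same computation but give different outputs, which is what the weights are for.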

## Lambda layer

The easiest way to define a custom layer is to use a Lambda layer. Here is a simple example that replaces each input value with its absolute value.

In [1]:
from tensorflow.keras.layers import Lambda, Input
import tensorflow as tf

input = Input(shape=[100])
x = Lambda(lambda x: tf.abs(x))(input)
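To check what the absolute-value layer does, we can also call it on a concrete tensor. This is only a sketch: the names `abs_layer` and `sample` are made up here, and it assumes the layer is called eagerly on a `tf.constant`.

```python
import tensorflow as tf
from tensorflow.keras.layers import Lambda

# Same Lambda layer as above, kept in a variable so it can be reused
abs_layer = Lambda(lambda t: tf.abs(t))

# A small batch with one row of mixed-sign values
sample = tf.constant([[-1.0, 2.0, -3.0]])
result = abs_layer(sample)
print(result.numpy())  # [[1. 2. 3.]]
```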

For example, one can write a custom ReLU. In the following code, x and x2 are computed in the same way (a dense layer followed by a ReLU), while z uses a slightly tweaked ReLU that clips at 0.1 instead of 0.

In [2]:
from tensorflow.keras.layers import Dense
from tensorflow.keras import backend as K

def my_relu_0(x):
  return K.maximum(0.0, x)

def my_relu_1(x):
  return K.maximum(0.1, x)

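# Built-in ReLU, applied as the activation of the Dense layer
x = Dense(32, activation="relu")(input)

# The same computation in two steps: a linear Dense layer, then a Lambda layer
y = Dense(32)(input)
x2 = Lambda(my_relu_0)(y)  # equivalent to the built-in ReLU
z = Lambda(my_relu_1)(y)   # tweaked ReLU: lower bound 0.1 instead of 0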
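The difference between the two functions is easier to see on concrete numbers. The sketch below redefines them with `tf.maximum` in place of `K.maximum` (a substitution made here so the snippet depends only on `tensorflow`) and uses made-up sample values.

```python
import numpy as np
import tensorflow as tf

def my_relu_0(x):
    # Standard ReLU: everything below 0 becomes 0
    return tf.maximum(0.0, x)

def my_relu_1(x):
    # Tweaked ReLU: everything below 0.1 becomes 0.1
    return tf.maximum(0.1, x)

values = tf.constant([-2.0, 0.0, 0.05, 3.0])
out0 = my_relu_0(values).numpy()
out1 = my_relu_1(values).numpy()
print(out0)
print(out1)
```

Only values of at least 0.1 pass through both functions unchanged. Anything smaller is lifted to 0.1 by `my_relu_1`, while `my_relu_0` changes only the negative values.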

## Custom layer

To define a custom layer as a class, we first subclass Layer.

Inside the class, there are usually 3 methods:

* `__init__`, accepts parameters and sets up internal variables,

* `build`, runs once the input shape is known (the first time the layer is called on an input) and creates the state of the layer, i.e. the weights, and

* `call`, performs the computation. It runs in every forward pass, during training and during inference, to get the output.
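The order in which these methods run can be checked with a small sketch. The layer `Scale` and its weight `factor` are hypothetical names used only for this illustration. It counts the weights of the layer before and after the first call.

```python
import tensorflow as tf
from tensorflow.keras.layers import Layer

class Scale(Layer):
    """Hypothetical layer that multiplies each feature by a learned factor."""

    def __init__(self):
        super().__init__()  # no weights exist yet

    def build(self, input_shape):
        # Runs on the first call, when the input shape is known
        self.factor = self.add_weight(name="factor",
                                      shape=(input_shape[-1],),
                                      initializer="ones",
                                      trainable=True)

    def call(self, inputs):
        return inputs * self.factor


layer = Scale()
weights_before = len(layer.weights)   # build has not run yet
output = layer(tf.ones((2, 3)))       # first call triggers build, then call
weights_after = len(layer.weights)
print(weights_before, weights_after)  # 0 1
```

Because the weight shape depends on `input_shape`, the weights cannot be created in `__init__`. This is why they live in `build`.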

The following code is for the layer $Y = w * X + b$.


In [3]:
from tensorflow.keras.layers import Layer

class SimpleDense(Layer):

  def __init__(self, units = 32):
    super(SimpleDense, self).__init__()
    self.units = units

  def build(self, input_shape):
    w_init = tf.random_uniform_initializer() # samples from a uniform distribution. tf offers other initializers, e.g. tf.random_normal_initializer
    self.w = self.add_weight(name="kernel",
                                 shape=(input_shape[-1], self.units),
                                 initializer=w_init,
                                 trainable=True)

    b_init = tf.zeros_initializer()
    self.b = self.add_weight(name="bias",
                                 shape=(self.units,),
                                 initializer=b_init,
                                 trainable=True)

  def call(self, inputs):
    return tf.matmul(inputs, self.w) + self.b

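We can use .variables to inspect the values of the weights (before or after training). The weights are created in build, so the layer has to be called on an input at least once before they exist.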

In [4]:
my_dense = SimpleDense(1)
x = tf.ones((1,1))
y = my_dense(x)
print(my_dense.variables)

[<Variable path=simple_dense/kernel, shape=(1, 1), dtype=float32, value=[[-0.03975159]]>, <Variable path=simple_dense/bias, shape=(1,), dtype=float32, value=[0.]>]
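To see the weights change during training, the layer can be placed in a model and fitted on a toy dataset. The sketch below repeats the layer definition so that it runs on its own. The data (points on the line y = 2x - 1), the optimizer, and the number of epochs are assumptions chosen for illustration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Layer

class SimpleDense(Layer):
    """Same layer as above, repeated so this sketch is self-contained."""

    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(name="kernel",
                                 shape=(input_shape[-1], self.units),
                                 initializer=tf.random_uniform_initializer(),
                                 trainable=True)
        self.b = self.add_weight(name="bias",
                                 shape=(self.units,),
                                 initializer=tf.zeros_initializer(),
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b


# Hypothetical toy data following y = 2x - 1
xs = np.array([[-1.0], [0.0], [1.0], [2.0], [3.0], [4.0]], dtype=np.float32)
ys = 2.0 * xs - 1.0

my_dense = SimpleDense(1)
model = tf.keras.Sequential([my_dense])
model.compile(optimizer="sgd", loss="mean_squared_error")
model.fit(xs, ys, epochs=500, verbose=0)

# After training, the kernel should be close to 2 and the bias close to -1
kernel, bias = [v.numpy() for v in my_dense.variables]
print(kernel, bias)
```

The exact values depend on the random initialization, so they differ slightly from run to run.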


## Adding an activation function

In the init method, we accept one more parameter (the activation). In the call method, we apply the activation function to the result.

In [5]:
from tensorflow.keras.activations import get

class SimpleDense(Layer):

  def __init__(self, units = 32, activation = None):
    super(SimpleDense, self).__init__()
    self.units = units
    self.activation = get(activation) # get activation

  def build(self, input_shape):
    w_init = tf.random_uniform_initializer()
    self.w = self.add_weight(name="kernel",
                                 shape=(input_shape[-1], self.units),
                                 initializer=w_init,
                                 trainable=True)

    b_init = tf.zeros_initializer()
    self.b = self.add_weight(name="bias",
                                 shape=(self.units,),
                                 initializer=b_init,
                                 trainable=True)

  def call(self, inputs):
    return self.activation(tf.matmul(inputs, self.w) + self.b) # wrap it with the activation function

In [6]:
my_dense = SimpleDense(1, activation = "relu")
x = tf.ones((1,1))
y = my_dense(x)
print(my_dense.variables)

[<Variable path=simple_dense_1/kernel, shape=(1, 1), dtype=float32, value=[[-0.00663128]]>, <Variable path=simple_dense_1/bias, shape=(1,), dtype=float32, value=[0.]>]
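The helper `get` turns the activation argument into a callable, which is why the default `activation = None` also works: in that case `get` returns the linear (identity) activation. A small sketch with made-up sample values:

```python
import tensorflow as tf
from tensorflow.keras.activations import get

relu = get("relu")    # look up an activation by its name
identity = get(None)  # None maps to the linear (identity) activation

sample = tf.constant([-1.0, 0.0, 2.0])
relu_out = relu(sample).numpy().tolist()
identity_out = identity(sample).numpy().tolist()
print(relu_out)      # [0.0, 0.0, 2.0]
print(identity_out)  # [-1.0, 0.0, 2.0]
```

So `SimpleDense(1)` without an activation behaves like the first version of the layer, while `SimpleDense(1, activation = "relu")` never outputs negative values.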
