# Keras API - TensorFlow v2

Keras is the high-level API for TensorFlow.
 
- If you're an engineer, Keras provides you with reusable blocks such as layers, metrics, and training loops to support common use cases. It provides a high-level user experience that's accessible and productive.
- If you're a researcher, you may prefer not to use these built-in blocks such as layers and training loops, and instead create your own. Keras allows you to do this. In this case, Keras provides templates for the blocks you write: a structure and an API standard for things like Layers and Metrics. This structure makes your code easy to share with others and easy to integrate into production workflows.
- The same is true for library developers: TensorFlow is a large ecosystem with many different libraries. For these libraries to talk to each other and share components, they need to follow an API standard. That's what Keras provides.

Crucially, Keras brings high-level UX and low-level flexibility together: you no longer have, on one hand, a high-level API that's easy to use but inflexible and, on the other hand, a low-level API that's flexible but only approachable by experts. Instead, you have a spectrum of workflows, from the very high-level to the very low-level, all compatible because they're built on the same concepts and objects.

### The base Layer class

Nearly every Keras layer derives from this class. A Layer encapsulates state and computation: the state is the weights and biases, and the computation is the forward pass (defined in the call method).

In [2]:
import tensorflow as tf
from tensorflow.keras.layers import Layer

class Linear(Layer):
    
    "Implementation of y = w.x + b"

    def __init__(self, units = 32, input_dim = 32):
        super(Linear, self).__init__()
        
        # Weight initializer: samples from a random normal distribution
        w_init = tf.random_normal_initializer()
        
        # Weight matrix of shape (input_dim, units), created with the initializer above
        # The trainable flag controls whether training can update it
        self.w = tf.Variable(initial_value = w_init(shape = (input_dim, units), dtype = 'float32'), trainable = True)
        
        # Bias initializer: all zeros
        b_init = tf.zeros_initializer()
        
        # Bias vector of shape (units,), created with the initializer above
        # The trainable flag controls whether training can update it
        self.b = tf.Variable(initial_value = b_init(shape = (units,), dtype = 'float32'), trainable = True)
    
    # Forward pass: runs whenever the layer is called on inputs
    def call(self, inputs):
        
        # Matrix multiplication plus bias: inputs . w + b
        return tf.matmul(inputs, self.w) + self.b


In [3]:
# Instantiate the layer with 4 units and an input dimension of 2
linear_layer = Linear(4, 2)

In [4]:
# Call the layer on a batch of inputs; this runs the forward pass
y = linear_layer(tf.ones((2, 2)))
assert y.shape == (2, 4)

### The Layer class takes care of tracking the weights assigned to it as attributes

In [5]:
# Weights are automatically tracked under the weights property
assert linear_layer.weights == [linear_layer.w, linear_layer.b]

### Another way of initializing the weights

Instead of creating the variable by hand,
 - w_init = tf.random_normal_initializer()
 - self.w = tf.Variable(initial_value = w_init(shape = shape, dtype = 'float32'))

we can use the shorter add_weight method,
 - self.w = self.add_weight(shape = shape, initializer = 'random_normal')
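
As a minimal sketch of the shortcut above, the layer below creates both of its weights through add_weight in the constructor. The class name LinearWithAddWeight and the choice of a zeros initializer for the bias are assumptions made for illustration; they are not part of the original notebook.

```python
import tensorflow as tf


class LinearWithAddWeight(tf.keras.layers.Layer):
    """Hypothetical layer computing inputs . w + b, with weights made by add_weight."""

    def __init__(self, units=32, input_dim=32):
        super().__init__()
        # add_weight creates the variable and registers it with the layer
        self.w = self.add_weight(
            shape=(input_dim, units), initializer="random_normal", trainable=True
        )
        # Assumption: the bias starts at zero
        self.b = self.add_weight(shape=(units,), initializer="zeros", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b


add_weight_layer = LinearWithAddWeight(units=4, input_dim=2)
add_weight_out = add_weight_layer(tf.ones((2, 2)))
print(add_weight_out.shape)
print(len(add_weight_layer.weights))
```

Both weights show up under the weights property, just as the hand-made tf.Variable objects did.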

### Creating weights lazily with build

We can create the Layer lazily and drop the input dimension from the constructor. A build method then creates the weights once the input shape is known. Keras calls build automatically the first time the layer is called.

In [6]:
class Linear(Layer):
    
    "Implementation of y = w.x + b"
    
    def __init__(self, units = 32):
        super(Linear, self).__init__()
        self.units = units

    def build(self, input_shape):
        
        # Create the weight matrix; its first dimension comes from the input shape
        self.w = self.add_weight(shape = (input_shape[-1], self.units), initializer = 'random_normal', trainable = True)
        
        # Create the bias vector; its shape depends only on the number of units
        self.b = self.add_weight(shape = (self.units,), initializer = 'random_normal', trainable = True)

    def call(self, inputs):
        
        # Forward pass function. This function is called whenever the forward pass is happening 
        return tf.matmul(inputs, self.w) + self.b
    

In [7]:
# Instantiate the Layer
linear_layer = Linear(4)

In [8]:
# This will also call build(input_shape) and create the weights
y = linear_layer(tf.ones((2, 2)))
assert len(linear_layer.weights) == 2

### Trainable & Non-Trainable Weights

The trainable flag controls whether a weight is updated during training. Setting it to False freezes the weight; setting it back to True unfreezes it.

In [9]:
class ComputeSum(Layer):
    
    """Returns the sum of the inputs."""

    def __init__(self, input_dim):
        super(ComputeSum, self).__init__()
        
        # Create a non-trainable weight
        self.total = tf.Variable(initial_value = tf.zeros((input_dim,)), trainable = False)

    def call(self, inputs):
        
        # Sum the inputs over the batch axis and add the result to the running total
        self.total.assign_add(tf.reduce_sum(inputs, axis=0))
        return self.total  


In [10]:
# Initialize the Layer
my_sum = ComputeSum(2)
x = tf.ones((2, 2))

y = my_sum(x)
print(y.numpy())  

y = my_sum(x)
print(y.numpy())  

assert my_sum.weights == [my_sum.total]
assert my_sum.non_trainable_weights == [my_sum.total]
assert my_sum.trainable_weights == []

[2. 2.]
[4. 4.]
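
Freezing can also be applied to a whole layer after it has been created. The sketch below uses the built-in Dense layer and its trainable attribute; the variable names are chosen here for illustration only.

```python
import tensorflow as tf

dense = tf.keras.layers.Dense(4)
_ = dense(tf.ones((1, 3)))  # the first call builds the kernel and the bias

n_trainable_before = len(dense.trainable_weights)

# Freeze the layer: its weights move to the non-trainable list
dense.trainable = False
n_trainable_frozen = len(dense.trainable_weights)
n_non_trainable_frozen = len(dense.non_trainable_weights)

# Unfreeze the layer again
dense.trainable = True
n_trainable_after = len(dense.trainable_weights)

print(n_trainable_before, n_trainable_frozen, n_non_trainable_frozen, n_trainable_after)
```

Frozen weights stay part of the layer and are still used in the forward pass; they are simply excluded from the list of trainable weights.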


### Composing multiple Layers into bigger computation blocks

Layers can be recursively nested to create bigger computation blocks. Each layer tracks the weights of its sublayers (both trainable and non-trainable).

In [11]:
class MLP(Layer):
    
    """Simple stack of Linear layers."""

    def __init__(self):
        super(MLP, self).__init__()
        
        self.linear_1 = Linear(32)
        self.linear_2 = Linear(32)
        self.linear_3 = Linear(10)

    # Forward Pass function    
    def call(self, inputs):
        
        # Input Layer with ReLU activation
        x = self.linear_1(inputs)
        x = tf.nn.relu(x)
        
        # Hidden Layer with ReLU activation
        x = self.linear_2(x)
        x = tf.nn.relu(x)
        return self.linear_3(x)


In [12]:
# Initializing the block
block = MLP()

# The first call to the block object will create the weights
y = block(tf.ones(shape=(3, 64)))

# Weights are recursively tracked
assert len(block.weights) == 6
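
To see the trainable and non-trainable weights of sublayers tracked side by side, here is a small sketch that nests a built-in Dense layer and a BatchNormalization layer. The DenseBNBlock class is hypothetical and exists only for this illustration.

```python
import tensorflow as tf


class DenseBNBlock(tf.keras.layers.Layer):
    """Hypothetical block: a Dense layer followed by BatchNormalization."""

    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(8)
        self.bn = tf.keras.layers.BatchNormalization()

    def call(self, inputs, training=None):
        x = self.dense(inputs)
        return self.bn(x, training=training)


bn_block = DenseBNBlock()
_ = bn_block(tf.ones((4, 16)))  # the first call creates all sublayer weights

# Dense contributes kernel and bias; BatchNormalization contributes
# gamma and beta (trainable) plus the moving mean and variance (non-trainable)
print(len(bn_block.trainable_weights))
print(len(bn_block.non_trainable_weights))
print(len(bn_block.weights))
```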

# Keras Built-In Layers

Keras provides a wide range of [layers](https://www.tensorflow.org/api_docs/python/tf/keras/layers/) for quick and easy prototyping of deep neural networks. A few important examples are:

- Attention
- Activation
- Average, Max, Global Average Pooling
- BatchNormalization
- Bidirectional
- Conv1D, Conv2D
- Conv1DTranspose, Conv2DTranspose
- Dense
- Dropout
- DepthwiseConv2D, SeparableConv2D
- LSTM, GRU (with built-in cuDNN acceleration)

Keras follows the principle of exposing good default configurations, so that layers work out of the box for most use cases if you leave keyword arguments at their default values. For instance, the LSTM layer uses an orthogonal recurrent matrix initializer by default, and initializes the forget gate bias to one by default.
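
One way to inspect these defaults is to read the configuration of a freshly created layer. The sketch below assumes the get_config keys recurrent_initializer and unit_forget_bias; the exact class name reported for the initializer may vary between Keras versions.

```python
import tensorflow as tf

# Create an LSTM layer without overriding any keyword argument
lstm = tf.keras.layers.LSTM(8)
lstm_config = lstm.get_config()

# Name of the default initializer used for the recurrent weight matrix
recurrent_init_name = lstm_config["recurrent_initializer"]["class_name"]
print(recurrent_init_name)

# Whether the forget gate bias is initialized to one
print(lstm_config["unit_forget_bias"])
```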

### The training argument in call

Some layers, in particular the BatchNormalization layer and the Dropout layer, have different behaviors during training and inference. For such layers, it is standard practice to expose a training (boolean) argument in the call method.

By exposing this argument in call, you enable the built-in training and evaluation loops (e.g. fit) to correctly use the layer in training and inference.

In [13]:
# Defining the Dropout Layer
class Dropout(Layer):
    
    def __init__(self, rate):
        super(Dropout, self).__init__()
        self.rate = rate

    def call(self, inputs, training = None):
        
        if training:            
            # Dropout enabled computation
            return tf.nn.dropout(inputs, rate = self.rate)
        
        # Dropout disabled: return the inputs unchanged
        return inputs

class MLPWithDropout(Layer):
    
    def __init__(self):
        super(MLPWithDropout, self).__init__()
        
        self.linear_1 = Linear(32)
        self.dropout = Dropout(0.5)
        self.linear_3 = Linear(10)

    def call(self, inputs, training = None):
          
        x = self.linear_1(inputs)
        x = tf.nn.relu(x)
        x = self.dropout(x, training = training)
        
        return self.linear_3(x)


In [14]:
# Initializing the network with Dropout 
mlp = MLPWithDropout()

y_train = mlp(tf.ones((2, 2)), training = True)
y_test = mlp(tf.ones((2, 2)), training = False)
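
The effect of the training argument is easy to observe on the built-in Dropout layer. In this sketch, with a rate of 0.5 the kept values are scaled by 1 / (1 - 0.5) = 2 during training, while inference leaves the input untouched. The variable names are chosen here for illustration.

```python
import tensorflow as tf

builtin_dropout = tf.keras.layers.Dropout(0.5)
ones = tf.ones((1, 8))

# Inference mode: the layer is an identity function
out_infer = builtin_dropout(ones, training=False)

# Training mode: each value is either dropped (0) or scaled up (2)
out_train = builtin_dropout(ones, training=True)

print(out_infer.numpy())
print(out_train.numpy())
```

Which entries are zeroed in training mode changes from call to call, because the dropout mask is random.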

### Neural Network Models using Functional API

To build deep learning models, we don't have to write a class every time. Layers can also be composed functionally using the Functional API.

In [16]:
# We use an `Input` object to describe the shape and dtype of the inputs
# The shape argument is per-sample; it does not include the batch size
# The functional API focuses on defining per-sample transformations
# The model we create will automatically batch the per-sample transformations, so that it can be called on batches of data
inputs = tf.keras.Input(shape = (16,))

# We call layers on these "type" objects and they return updated types (new shapes/dtypes)
# We are reusing the Linear layer we defined earlier
x = Linear(32)(inputs)

# We are reusing the Dropout layer we defined earlier
x = Dropout(0.5)(x) 
outputs = Linear(10)(x)

# A functional Model can be defined by specifying inputs and outputs
model = tf.keras.Model(inputs, outputs)

# A functional model already has weights, before being called on any data
# That's because we defined its input shape in advance (in Input)
assert len(model.weights) == 4

# Let's call our model on some data
y = model(tf.ones((2, 16)))
assert y.shape == (2, 10)

The Functional API tends to be more concise than subclassing, and provides a few other advantages. The key differences between defining models through Model sub-classing and through the Functional API are explained in [this](https://medium.com/tensorflow/what-are-symbolic-and-imperative-apis-in-tensorflow-2-0-dfccecb01021) blog post.

Learn more about the Functional API [here](https://www.tensorflow.org/guide/keras/functional). In your research workflows, you may often find yourself mixing and matching the sub-classing and functional styles.
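
As one possible illustration of mixing the two styles, the sketch below builds a small functional model and then uses it as a component inside a subclassed model. The names encoder and Classifier, and all layer sizes, are assumptions made for this example.

```python
import tensorflow as tf

# Functional part: a small encoder with a declared input shape
enc_inputs = tf.keras.Input(shape=(16,))
enc_outputs = tf.keras.layers.Dense(8, activation="relu")(enc_inputs)
encoder = tf.keras.Model(enc_inputs, enc_outputs)


class Classifier(tf.keras.Model):
    """Hypothetical subclassed model that wraps a functional encoder."""

    def __init__(self, encoder, num_classes=3):
        super().__init__()
        self.encoder = encoder
        self.head = tf.keras.layers.Dense(num_classes)

    def call(self, inputs):
        # A functional model can be called like any other layer
        return self.head(self.encoder(inputs))


clf = Classifier(encoder)
logits = clf(tf.ones((2, 16)))
print(logits.shape)
print(len(clf.weights))
```

The weights of the functional encoder are tracked by the subclassed model like those of any other sublayer.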

### Neural Network Models using the Sequential Class

In [17]:
from tensorflow.keras import Sequential

model = Sequential([Linear(32), Dropout(0.5), Linear(10)])

y = model(tf.ones((2, 16)))
assert y.shape == (2, 10)

# Keras Loss Functions

### Example of defining a model using Keras Sub-Classing technique

In [15]:
# Keras Custom-Class
class MyModel(tf.keras.Model):
    def __init__(self, num_classes = 10):
        super(MyModel, self).__init__()
   
        # Define the layers here; a subclassed model needs no Input object,
        # the input shape is inferred on the first call
        self.l1 = tf.keras.layers.Flatten()
        self.l2 = tf.keras.layers.Dense(512, activation = 'relu', name = 'dense1')
        self.l3 = tf.keras.layers.Dropout(0.2)
        self.final = tf.keras.layers.Dense(num_classes, activation = tf.nn.softmax, name = 'output1')
    
    def call(self, inputs, training = None):
        
        # Define the forward pass
        x = self.l1(inputs)
        x = self.l2(x)
        # Dropout is only active when training is True
        x = self.l3(x, training = training)
        
        return self.final(x)

In [13]:
# Instantiate the model
model = MyModel()

model.compile(optimizer= tf.keras.optimizers.Adam(), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])

# Summary of the Neural Network Model
#model.summary()

# Train the model with fit, passing the input features x and the labels y
#model.fit(x, y)
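
Since the fit call above is commented out, here is a minimal runnable sketch of compile and fit on a subclassed model. The data is random and purely synthetic, so the resulting loss and accuracy carry no meaning; TinyClassifier, x_fake and y_fake are hypothetical names introduced for this example.

```python
import numpy as np
import tensorflow as tf


class TinyClassifier(tf.keras.Model):
    """Hypothetical small model for 28x28 inputs."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.flatten = tf.keras.layers.Flatten()
        self.hidden = tf.keras.layers.Dense(32, activation="relu")
        self.out = tf.keras.layers.Dense(num_classes, activation="softmax")

    def call(self, inputs, training=None):
        x = self.flatten(inputs)
        x = self.hidden(x)
        return self.out(x)


# Synthetic data: 64 random "images" with random integer labels in [0, 10)
rng = np.random.default_rng(0)
x_fake = rng.random((64, 28, 28), dtype=np.float32)
y_fake = rng.integers(0, 10, size=(64,))

tiny_model = TinyClassifier()
tiny_model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# fit returns a History object whose history dict holds per-epoch values
history = tiny_model.fit(x_fake, y_fake, batch_size=16, epochs=1, verbose=0)
print(sorted(history.history.keys()))
```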

### Defining a model using Keras Functional API

In [14]:
#Using Keras Functional API
inputs = tf.keras.Input(shape = (28,28))  
x = tf.keras.layers.Flatten()(inputs)
x = tf.keras.layers.Dense(512, activation='relu', name = 'dense2')(x)
x = tf.keras.layers.Dropout(0.2)(x)
final = tf.keras.layers.Dense(10, activation = tf.nn.softmax, name = 'output2')(x)

In [15]:
#Instantiate the model
model_1 = tf.keras.Model(inputs = inputs, outputs = final)
model_1.compile(optimizer= tf.keras.optimizers.Adam(), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
#model_1.summary()
#model_1.fit(x, y)

### Defining a model using Sequential : Method 1

In [16]:
#Using Sequential 
model_2 = tf.keras.models.Sequential([
 tf.keras.layers.Flatten(),
 tf.keras.layers.Dense(512, activation = tf.nn.relu, name = 'dense2'),
 tf.keras.layers.Dropout(0.2),
 tf.keras.layers.Dense(10, activation = tf.nn.softmax, name = 'output2')
])
model_2.compile(optimizer= tf.keras.optimizers.Adam(), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
#model_2.summary()
#model_2.fit(x, y)

### Defining a model using Sequential : Method 2

In [17]:
#Using Sequential
model_3 = tf.keras.models.Sequential()
model_3.add(tf.keras.layers.Flatten())
model_3.add(tf.keras.layers.Dense(512, activation='relu', name = 'dense3'))
model_3.add(tf.keras.layers.Dropout(0.2))
model_3.add(tf.keras.layers.Dense(10,activation=tf.nn.softmax, name = 'output3'))
model_3.compile(optimizer= tf.keras.optimizers.Adam(), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
#model_3.summary()
#model_3.fit(x, y)
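
The Sequential models above declare no input shape, so their weights are only created once the model first sees data. As a sketch, assuming the input shape is known up front, starting the layer list with a tf.keras.Input entry lets the model build immediately. The name sized_model is chosen here for illustration.

```python
import tensorflow as tf

sized_model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),  # per-sample shape, without the batch size
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# The weights already exist, before any data has been passed through the model
print(len(sized_model.weights))

# Parameter count: (784 * 512 + 512) + (512 * 10 + 10)
print(sized_model.count_params())
```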