# Class Models, Custom Layers, GPU training and Custom Resnets

## Instructions
1. Use Google Colab for this task.
2. Change the runtime type to GPU in the Runtime tab.

## Making models using class

1. In earlier tasks, we learnt how to make a model using the Sequential API provided by TensorFlow.
2. Now we will learn how to make a model using a class-based approach.

1. <b>Import</b>

In [1]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow import keras
from tensorflow.keras import Sequential, layers, initializers, models, callbacks, optimizers
from tensorflow.keras.models import Model
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.datasets import mnist

2. <b>To avoid GPU errors</b>

In [2]:
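physical_devices = tf.config.list_physical_devices("GPU")
#enable memory growth only if a GPU is actually visible (avoids an IndexError on CPU-only runtimes)
if physical_devices:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)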

3. <b>Loading MNIST Data</b>

In [3]:
(x_train, y_train) , (x_test, y_test) = mnist.load_data()
#flattening the images into 1 dimensional array
x_train = x_train.reshape(-1, 28*28).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28*28).astype("float32") / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
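
The reshape-and-scale step can be tried on a small stand-in array without downloading anything. The sketch below uses a randomly generated `fake_images` batch (a hypothetical substitute for `x_train`, not part of the original notebook) to show the effect on shape, dtype and value range.

```python
import numpy as np

# Hypothetical stand-in for x_train: 4 grayscale images of 28x28 uint8 pixels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Same transformation as in the cell above: flatten each image, scale to [0, 1].
flat = fake_images.reshape(-1, 28 * 28).astype("float32") / 255.0

print(flat.shape)   # (4, 784)
print(flat.dtype)   # float32
print(flat.min() >= 0.0, flat.max() <= 1.0)
```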


In [4]:
len(x_train)

60000

In [5]:
len(x_test)

10000

4. <b>Building the model</b>

In [6]:
#we will extend the class from keras.Model
#this means that we can use the functionalities of keras.model
#extending a class is an object oriented concept, so google that if you are confused
class MyModel(keras.Model):

  def __init__(self):
    super(MyModel, self).__init__() #super is also an object oriented thing which allows us to call the init method of the parent class (keras.Model here)
    self.dense1 = layers.Dense(64) #self links the variable to the class. You can access dense1 anywhere in this class by using self.dense1
    self.dense2 = layers.Dense(10)

  def call(self, input_tensor):
    x = tf.nn.relu(self.dense1(input_tensor))
    x = self.dense2(x)
    return x

5. <b>Creating the model object</b>

In [7]:
model = MyModel()

6. <b>Compiling the model</b>

In [8]:
model.compile(
    loss = keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer = keras.optimizers.Adam(),
    metrics = ['accuracy'],
)
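
Because the model's last Dense layer has no softmax, the loss is created with from_logits=True. As a rough illustration of what that means, the NumPy sketch below computes a sparse categorical cross-entropy directly from raw scores. The helper name `sparse_ce_from_logits` and the example values are made up for this illustration; this is not how Keras implements the loss internally.

```python
import numpy as np

def sparse_ce_from_logits(logits, labels):
    """Mean cross-entropy for integer labels, computed directly from logits."""
    # Subtract the row-wise max for numerical stability (softmax is unchanged).
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the correct class in each row.
    picked = log_probs[np.arange(len(labels)), labels]
    return -picked.mean()

logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])   # integer class ids, like y_train
loss = sparse_ce_from_logits(logits, labels)
print(loss)
```

If the model applied softmax itself, the loss would have to be created with from_logits=False instead, otherwise softmax would effectively be applied twice.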


7. <b>Fitting the model</b>

In [9]:
model.fit(x_train, y_train, batch_size=32, epochs=2)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x7f26f008c750>

8. <b>Evaluate the model</b>

In [10]:
model.evaluate(x_test, y_test, batch_size=32)



[0.12221387773752213, 0.9621000289916992]

<b>Why class based</b>
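1. The forward pass is written as ordinary Python in the call method, which makes the flow of data easier to read.
2. Layers can be connected freely (skip connections, branches), which a plain Sequential stack cannot express.
3. It helps in modularizing the code: groups of layers can be wrapped in a class and reused.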

<b>Now we'll learn to make the layers (like Dense) using classes</b><br><br>
<b>Why do we need to dive till making layers?</b>
1. Sometimes, when we make our own models, we want to define our own operations.<br><br>
  a. Dense performs the operation WX + b.<br><br>
  b. During backpropagation, TensorFlow uses its own automatic differentiation to compute the derivative of this operation and uses it for gradient descent.<br><br>
  c. What if we want to implement a dense operation that computes (WX x 5) + (7 x b)? How will TensorFlow account for this derivative?<br><br>
  d. The answer is: by using the keras.layers.Layer class to build the layer.<br><br>

2. Hence, we can implement any differentiable operation built from TensorFlow ops using the Layer class, and TensorFlow will account for its derivative during backpropagation.
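
To make the question in (c) concrete, the sketch below works out by hand what gradient automatic differentiation would have to produce for the (WX x 5) + (7 x b) operation, and checks it with a finite difference. It uses NumPy only; the function name `custom_dense`, the array sizes and the choice of loss (the sum of all outputs) are assumptions for illustration.

```python
import numpy as np

def custom_dense(x, w, b):
    # The non-standard operation from the text: 5 * (x @ w) + 7 * b
    return 5.0 * (x @ w) + 7.0 * b

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))   # batch of 3 inputs with 4 features
w = rng.normal(size=(4, 2))   # 4 inputs -> 2 units
b = rng.normal(size=(2,))

# Analytic gradients of loss = sum(output) with respect to w and b.
grad_w = 5.0 * (x.T @ np.ones((3, 2)))
grad_b = 7.0 * 3 * np.ones(2)   # each bias is added once per batch row

# Finite-difference check of one weight entry and one bias entry.
eps = 1e-6
w_shift = w.copy()
w_shift[1, 0] += eps
b_shift = b.copy()
b_shift[1] += eps
base = custom_dense(x, w, b).sum()
num_grad_w = (custom_dense(x, w_shift, b).sum() - base) / eps
num_grad_b = (custom_dense(x, w, b_shift).sum() - base) / eps

print(np.isclose(grad_w[1, 0], num_grad_w, atol=1e-4))
print(np.isclose(grad_b[1], num_grad_b, atol=1e-4))
```

The factors 5 and 7 show up directly in the gradients. In a custom layer you never write these derivatives yourself; TensorFlow derives them from the operations used in call.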

## Making layers using classes

1. <b>Building the Dense layer from scratch</b>

In [6]:
class MyDense(layers.Layer):
  #when object created, first init is called
  def __init__(self, units=32):
      super(MyDense, self).__init__()
      self.units = units

  #build is not called by init: Keras calls it automatically the first time the layer is called on an input, and passes the input shape to it. (input shape for dense 1 will be (batch_size, 784) since 28*28=784, for dense 2 -> (batch_size, 64))
  def build(self, input_shape):
      print(input_shape)
      self.w = self.add_weight(shape=(input_shape[-1], self.units), #why do we choose this shape? ponder (make a neural net diagram and see what shape fits the matrix multiplication W*X)
                               initializer='random_normal', #what other initialization methods exist?
                               trainable=True)
      self.b = self.add_weight(shape=(self.units,),
                               initializer='zeros',
                               trainable=True)

  #call runs after build (and on every forward pass) and executes the operation
  def call(self, inputs):
      return tf.matmul(inputs, self.w) + self.b
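
The weight shape chosen in build can be checked with plain NumPy. The sketch below mirrors the first dense layer (784 inputs, 64 units) with a made-up batch size of 5 and shows why the weight matrix has shape (input_shape[-1], units).

```python
import numpy as np

batch_size, in_features, units = 5, 784, 64

x = np.ones((batch_size, in_features), dtype="float32")
w = np.zeros((in_features, units), dtype="float32")   # (input_shape[-1], units)
b = np.zeros((units,), dtype="float32")               # one bias per unit

# (5, 784) @ (784, 64) -> (5, 64); b is broadcast across the batch dimension.
out = x @ w + b
print(out.shape)  # (5, 64)
```

The inner dimensions of the matrix product must match, so the first axis of the weight matrix has to equal the number of input features.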

2. <b>Using custom Dense to build MyModel</b>

In [None]:
#we will extend the class from keras.Model
#this means that we can use the functionalities of keras.model
#extending a class is an object oriented concept, so google that if you are confused
class MyModel(keras.Model):

  def __init__(self):
    super(MyModel, self).__init__() #super is also an object oriented thing which allows us to call the init method of the parent class (keras.Model here)
    self.dense1 = MyDense(64) #self links the variable to the class. You can access dense1 anywhere in this class by using self.dense1
    self.dense2 = MyDense(10)

  def call(self, input_tensor):
    x = tf.nn.relu(self.dense1(input_tensor))
    x = self.dense2(x)
    return x

3. <b>Training the model </b>


In [11]:
#train the model
custom_model = MyModel()
custom_model.compile(
    loss = keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer = keras.optimizers.Adam(),
    metrics = ['accuracy'],
)
custom_model.fit(x_train, y_train, batch_size=32, epochs=2)

4. <b>Testing the model </b>


In [26]:
#evaluate the model
custom_model.evaluate(x_test, y_test, batch_size=32)

#### Our own Dense layer should give a result very similar to the built-in one

## Now your turn
1. We will build a mini custom version of the ResNet CNN architecture using keras.Model.

2. ResNet mitigates the problem of vanishing gradients (https://www.youtube.com/watch?v=JIWXbzRXk1I) through skip connections and allows us to build very deep neural networks.


RESNET BLOG RESOURCE : https://towardsdatascience.com/understanding-and-visualizing-resnets-442284831be8

RESNET VIDEO RESOURCE : https://www.youtube.com/watch?v=sAzL4XMke80&t=314s

Read the above blog carefully and watch the video for better understanding as well.
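
As a toy illustration of why skip connections help, consider a stack of scalar "layers" that each multiply their input by a small weight. The sketch below is a deliberately simplified assumption (linear layers, one shared weight, no activations), not a model of a real ResNet; it only shows how the extra "+ x" term changes the derivative of the output with respect to the input.

```python
depth = 20
w = 0.1  # hypothetical small per-layer weight

# Plain stack: x_next = w * x, so d(output)/d(input) = w ** depth.
plain_grad = w ** depth

# Residual stack: x_next = x + w * x, so the derivative is (1 + w) ** depth.
residual_grad = (1 + w) ** depth

print(plain_grad)
print(residual_grad)
```

In the plain stack the gradient shrinks by a factor of w at every layer, while the identity path in the residual stack keeps each factor at 1 or above.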

#### <b>Our Custom Model architecture</b>
DATASET -> MNIST

1. Input -> 28,28,1 images
2. 8 filters of 5x5, padding none => (24,24,8)
3. Resblock 1
  - 16 filters of 7x7, padding none => (18,18,16) <b>[im_res_1]</b>
  - 16 filters of 8x8, padding same => (18,18,16)
  - 16 filters of 10x10, padding same => (18,18,16) <b>[output_res_1]</b>
  - (sum of output_res_1 and im_res_1 here) => (18,18,16)
4. maxpooling => (9,9,16)
5. Resblock 2
  - 32 filters of 6x6, padding none => (4,4,32) <b>[im_res_2]</b>
  - 32 filters of 7x7, padding same => (4,4,32)
  - 32 filters of 8x8, padding same => (4,4,32) <b>[output_res_2]</b>
  - (sum of output_res_2 and im_res_2 here) => (4,4,32)
6. maxpooling => (2, 2, 32)
7. Flatten => (128,)
8. Dense, 64 units => (64,)
9. Dense, 10 units => (10,)
10. Use softmax to predict
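
The shapes in the list above can be verified with a few lines of arithmetic. The sketch below assumes stride-1 convolutions and 2x2 max pooling with stride 2 (the Keras defaults used later in this notebook); the helper names `conv_out` and `pool_out` are made up for this check.

```python
def conv_out(size, kernel, padding):
    """Spatial output size of a stride-1 convolution."""
    if padding == "valid":      # "padding none" in the list above
        return size - kernel + 1
    return size                 # "same" keeps the spatial size at stride 1

def pool_out(size, pool=2):
    """Spatial output size of non-overlapping max pooling (stride == pool)."""
    return size // pool

s = 28
s = conv_out(s, 5, "valid")     # 24
s = conv_out(s, 7, "valid")     # 18  (im_res_1)
s = conv_out(s, 8, "same")      # 18
s = conv_out(s, 10, "same")     # 18  (output_res_1, same shape as im_res_1)
res1 = s
s = pool_out(s)                 # 9
s = conv_out(s, 6, "valid")     # 4   (im_res_2)
s = conv_out(s, 7, "same")      # 4
s = conv_out(s, 8, "same")      # 4   (output_res_2)
res2 = s
s = pool_out(s)                 # 2
flat_units = s * s * 32         # 2 * 2 * 32 = 128
print(res1, res2, s, flat_units)
```

The "same" padding inside each Resblock is what keeps the two tensors being summed at identical shapes.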



### <b>Instructions</b>
1. Build the custom Resnet model based on the architecture described above
2. Only use class based approach
3. Create a custom model using class based approach. Name the Class CustomResNet
4. Use ReLU as the activation function, with batch normalization after every layer
5. Reload the data, as it was flattened earlier
6. Use custom layers only if you find a use case for them here

### <b>Notes</b>
1. All layers must be defined in the init method. Defining them anywhere else will result in errors.
2. Repeated layers need separate initializations.
3. You can check whether your model is correct with the following code:
 - model = CustomResNet()
 - model.build((None,28,28,1))
 - model.summary()
4. The summary must list all the layers you defined, and the number of trainable parameters should not be zero.
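
When reading the summary, it can help to know roughly what parameter counts to expect. The sketch below computes them by hand for a few layers of the architecture; the helper names are hypothetical, and the formulas assume Conv2D and Dense layers with biases and a BatchNormalization layer with its default four per-channel vectors.

```python
def conv2d_params(kernel, in_channels, filters):
    # One kernel x kernel x in_channels filter per output channel, plus a bias each.
    return kernel * kernel * in_channels * filters + filters

def batchnorm_params(channels):
    # gamma and beta (trainable) plus moving mean and variance (non-trainable).
    return 4 * channels

def dense_params(in_features, units):
    # Weight matrix plus one bias per unit.
    return in_features * units + units

print(conv2d_params(5, 1, 8))     # first conv: 208
print(batchnorm_params(8))        # 32
print(conv2d_params(7, 8, 16))    # 6288
print(dense_params(128, 64))      # 8256
```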

In [2]:
#imports that you can use
from tensorflow.keras.layers import Flatten, Dense, Conv2D, Dropout, BatchNormalization, Activation, Add, Input, MaxPool2D
tf.config.run_functions_eagerly(True)

In [3]:
# load the mnist dataset here
# make sure to reshape the dataset so that each image also has its 3rd (channel) dimension
# dataset shape must be (num_images, height, width, channel)
# divide data into train and test

# write code here
(x_train, y_train) , (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 28,28,1).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28,28,1).astype("float32") / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
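
The only change from the earlier loading cell is the trailing channel axis. The sketch below shows two equivalent ways of adding it, using a zero-filled `fake_images` array as a hypothetical stand-in for the raw MNIST images.

```python
import numpy as np

# Hypothetical stand-in for the raw MNIST array: 4 images of 28x28 pixels.
fake_images = np.zeros((4, 28, 28), dtype=np.uint8)

# Two equivalent ways to add the trailing channel axis expected by Conv2D.
a = fake_images.reshape(-1, 28, 28, 1)
b = fake_images[..., np.newaxis]

print(a.shape, b.shape)  # (4, 28, 28, 1) (4, 28, 28, 1)
```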


In [7]:
#make a class based model for Custom ResNet (refer instructions and notes above)
# write code here
class CustomResNet(keras.Model) :
  def __init__(self):
    super(CustomResNet, self).__init__()
    self.conv1=layers.Conv2D(filters=8,kernel_size=(5,5),padding='valid',activation='relu')
    self.bn1=layers.BatchNormalization()
    self.conv2=layers.Conv2D(filters=16,padding='valid',kernel_size=(7,7),activation='relu')
    self.bn2=layers.BatchNormalization()
    self.conv3=layers.Conv2D(filters=16,padding='same',kernel_size=(8,8),activation='relu')
    self.bn3=layers.BatchNormalization()
    self.conv4=layers.Conv2D(filters=16,padding='same',kernel_size=(10,10),activation='relu')
    self.bn4=layers.BatchNormalization()
    self.pool1=layers.MaxPooling2D(pool_size=(2,2))
    self.conv5=layers.Conv2D(filters=32,padding='valid',kernel_size=(6,6),activation='relu')
    self.bn5=layers.BatchNormalization()
    self.conv6=layers.Conv2D(filters=32,padding='same',kernel_size=(7,7),activation='relu')
    self.bn6=layers.BatchNormalization()
    self.conv7=layers.Conv2D(filters=32,padding='same',kernel_size=(8,8),activation='relu')
    self.bn7=layers.BatchNormalization()
    self.pool2=layers.MaxPooling2D(pool_size=(2,2))
    self.flat=layers.Flatten()
    self.dense1 = MyDense(64)
    self.dense2 = MyDense(10)

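  def call(self, input_tensor):
    #conv1 already applies ReLU through activation='relu', so no extra tf.nn.relu is needed
    x=self.conv1(input_tensor)
    x=self.bn1(x)
    #Resblock 1: x1 is the batch-normalized output of conv2 (im_res_1), kept for the skip connection
    x=self.conv2(x)
    x1=self.bn2(x)
    x=self.conv3(x)
    x=self.bn3(x)
    x=self.conv4(x)
    x2=self.bn4(x)
    x=x1+x2 #sum of im_res_1 and output_res_1
    x=self.pool1(x)
    #Resblock 2: same pattern with 32 filters
    x=self.conv5(x)
    x1=self.bn5(x)
    x=self.conv6(x)
    x=self.bn6(x)
    x=self.conv7(x)
    x2=self.bn7(x)
    x=x1+x2 #sum of im_res_2 and output_res_2
    x=self.pool2(x)
    x=self.flat(x)
    x=self.dense1(x)
    #the model outputs softmax probabilities, so the loss must use from_logits=False (the default)
    x=tf.nn.softmax(self.dense2(x))
    return x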


In [8]:
#find summary of model
# write code here
model = CustomResNet()
model.build((60000,28,28,1))
model.summary()

(60000, 128)
(60000, 64)
Model: "custom_res_net_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_7 (Conv2D)           multiple                  208       
                                                                 
 batch_normalization_7 (Batc  multiple                 32        
 hNormalization)                                                 
                                                                 
 conv2d_8 (Conv2D)           multiple                  6288      
                                                                 
 batch_normalization_8 (Batc  multiple                 64        
 hNormalization)                                                 
                                                                 
 conv2d_9 (Conv2D)           multiple                  16400     
                                                                 
 batch_normalization_9 (B

In [18]:
#train model here
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=2)


Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x7f0e4cf940d0>

In [19]:
#evaluate model here
model.evaluate(x_test, y_test, batch_size=32)






[0.035953398793935776, 0.9879000186920166]

## Congratulations!
#### Now you know how to build class based models and custom layers!