<a href="https://colab.research.google.com/github/zuber129/aimlops-colab-worksheets/blob/main/M1_AST_03_MNIST_Keras_A.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Advanced Certification Programme in AI and MLOps
## A programme by IISc and TalentSprint
### Assignment: MNIST_Keras

## Learning Objectives:

In this tutorial, we will learn to implement the following:


1.   Building a neural network from scratch with TensorFlow operations
2.   Keras Sequential API
3.   Keras Functional API
4.   Keras Model subclassing
5.   Callbacks

## Introduction to Keras

Keras is an open-source neural network library written in Python for building and training deep learning models. It provides a user-friendly, modular API for creating and configuring deep neural networks with high-level abstractions. Keras runs on top of a lower-level backend: early versions supported TensorFlow, Theano, and CNTK, and it is now bundled with TensorFlow as tf.keras. It is widely used in areas such as computer vision, natural language processing, and time-series forecasting.

### Setup Steps:

In [None]:
#@title Please enter your registration id to start: { run: "auto", display-mode: "form" }
Id = "" #@param {type:"string"}

In [None]:
#@title Please enter your password (your registered phone number) to continue: { run: "auto", display-mode: "form" }
password = "" #@param {type:"string"}

In [None]:
#@title Run this cell to complete the setup for this Notebook
from IPython import get_ipython

ipython = get_ipython()
  
notebook= "M1_AST_03_MNIST_Keras_A" #name of the notebook

def setup():
#  ipython.magic("sx pip3 install torch")  

    from IPython.display import HTML, display
    display(HTML('<script src="https://dashboard.talentsprint.com/aiml/record_ip.html?traineeId={0}&recordId={1}"></script>'.format(getId(),submission_id)))
    print("Setup completed successfully")
    return

def submit_notebook():
    ipython.magic("notebook -e "+ notebook + ".ipynb")
    
    import requests, json, base64, datetime

    url = "https://dashboard.talentsprint.com/xp/app/save_notebook_attempts"
    if not submission_id:
      data = {"id" : getId(), "notebook" : notebook, "mobile" : getPassword()}
      r = requests.post(url, data = data)
      r = json.loads(r.text)

      if r["status"] == "Success":
          return r["record_id"]
      elif "err" in r:        
        print(r["err"])
        return None        
      else:
        print ("Something is wrong, the notebook will not be submitted for grading")
        return None
    
    elif getAnswer() and getComplexity() and getAdditional() and getConcepts() and getComments() and getMentorSupport():
      f = open(notebook + ".ipynb", "rb")
      file_hash = base64.b64encode(f.read())

      data = {"complexity" : Complexity, "additional" :Additional, 
              "concepts" : Concepts, "record_id" : submission_id, 
              "answer" : Answer, "id" : Id, "file_hash" : file_hash,
              "notebook" : notebook,
              "feedback_experiments_input" : Comments,
              "feedback_mentor_support": Mentor_support}
      r = requests.post(url, data = data)
      r = json.loads(r.text)
      if "err" in r:        
        print(r["err"])
        return None   
      else:
        print("Your submission is successful.")
        print("Ref Id:", submission_id)
        print("Date of submission: ", r["date"])
        print("Time of submission: ", r["time"])
        print("View your submissions: https://aimlops-iisc.talentsprint.com/notebook_submissions")
        #print("For any queries/discrepancies, please connect with mentors through the chat icon in LMS dashboard.")
        return submission_id
    else: return submission_id
    

def getAdditional():
  try:
    if not Additional: 
      raise NameError
    else:
      return Additional  
  except NameError:
    print ("Please answer Additional Question")
    return None

def getComplexity():
  try:
    if not Complexity:
      raise NameError
    else:
      return Complexity
  except NameError:
    print ("Please answer Complexity Question")
    return None
  
def getConcepts():
  try:
    if not Concepts:
      raise NameError
    else:
      return Concepts
  except NameError:
    print ("Please answer Concepts Question")
    return None
  
  
# def getWalkthrough():
#   try:
#     if not Walkthrough:
#       raise NameError
#     else:
#       return Walkthrough
#   except NameError:
#     print ("Please answer Walkthrough Question")
#     return None
  
def getComments():
  try:
    if not Comments:
      raise NameError
    else:
      return Comments
  except NameError:
    print ("Please answer Comments Question")
    return None
  

def getMentorSupport():
  try:
    if not Mentor_support:
      raise NameError
    else:
      return Mentor_support
  except NameError:
    print ("Please answer Mentor support Question")
    return None

def getAnswer():
  try:
    if not Answer:
      raise NameError 
    else: 
      return Answer
  except NameError:
    print ("Please answer Question")
    return None
  

def getId():
  try: 
    return Id if Id else None
  except NameError:
    return None

def getPassword():
  try:
    return password if password else None
  except NameError:
    return None

submission_id = None
### Setup 
if getPassword() and getId():
  submission_id = submit_notebook()
  if submission_id:
    setup() 
else:
  print ("Please complete Id and Password cells before running setup")



## Import libraries

In [None]:
import tensorflow as tf
from tensorflow import keras
import math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt 

from keras.layers import Dense, Flatten
from keras import Input
from tensorflow.keras.utils import plot_model

from tensorflow.keras.datasets import mnist, fashion_mnist

# Basic Sequential Model

We want to build a sequential model. This means that the layers of our neural network are stacked sequentially. The approach is as follows:
1.  First implement a class to build a dense layer. We call it "NaiveDense"
2.  Implement a class ("NaiveSequential") to stack the layers sequentially and build a sequential model. 


In [None]:
# Implementing our dense layer class
class NaiveDense:
  def __init__(self, input_size, output_size, activation):  #input and output sizes for the layer
    self.activation = activation

    w_shape = (input_size, output_size)       # matrix of weights
    w_initial_value = tf.random.uniform(w_shape, minval=0, maxval=1e-1)
    self.w = tf.Variable(w_initial_value)     # we can only update values of tf.Variables

    b_shape = (output_size,)                  # vector of biases
    b_initial_value = tf.zeros(b_shape)
    self.b = tf.Variable(b_initial_value)

  def __call__(self, inputs):                 # executed when the class object is used as a function
    return self.activation(tf.matmul(inputs, self.w)+self.b)      # self.b is broadcasted

  @property                                   # enables us to use the method as an attribute
  def weights(self):
    return (self.w, self.b)

We implemented a dense layer. Now we will stack such layers sequentially in our NaiveSequential class.

In [None]:
class NaiveSequential:
  def __init__(self,layers):    # layers: list of layer objects
    self.layers = layers

  def __call__(self, inputs):   
    x = inputs
    for layer in self.layers:     # output of the previous layer is the input to the next layer
      x = layer(x)
    return x

  @property
  def weights(self):
    weights = []
    for layer in self.layers:     # save weights of each layer to a list
      weights += layer.weights    # Q: What does layer.weights return? 
    return weights                # A: layer.weights invokes the weights method, since it is decorated with @property. It returns (w, b)



Sequential stacking of dense layers is implemented.

Instantiate NaiveSequential class and make our first NN model.

In [None]:
# define the model
# YOUR CODE HERE

#  Q: What input argument does NaiveSequential take? A: list of layer objects
#  Q: What is the input and output dimension of the overall model? A: input dim = 784, output dim = 10
#  Q: Can the output_size of 1 layer be different from the input_size of the next layer? A: No, they have to be the same.

'model' is an object of the 'NaiveSequential' class. The class exposes 'weights' as a property, so it is accessed as 'model.weights' (without parentheses). Because the model has not been trained yet, 'model.weights' returns the initial values of the weights and biases of every 'NaiveDense' layer.

In [None]:
# YOUR CODE HERE

The sequential model is untrained and currently not useful.

We must train the model to make it learn useful representations, but first we need data.

Solve a classification problem by using the above sequential model on the [MNIST data set](https://keras.io/api/datasets/mnist/). 

We must 
1. Load the data
2. Reshape the data according to the input shape of the model
3. Normalize the data
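
The reshape and normalize steps above can be sketched on a stand-in array, so the example runs without downloading anything. The array `fake_images` and its size are assumptions for illustration; the real MNIST training images are `uint8` values laid out as `(60000, 28, 28)`.

```python
import numpy as np

# Stand-in for the MNIST training images (hypothetical data, same dtype and layout).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Reshape: flatten each 28x28 image into a 784-dimensional vector.
flat_images = fake_images.reshape((fake_images.shape[0], 28 * 28))

# Normalize: convert to float32 and scale pixel values from [0, 255] to [0, 1].
flat_images = flat_images.astype("float32") / 255

print(flat_images.shape, flat_images.dtype)
```

The flattened width of 784 matches the input dimension of the model defined earlier.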

In [None]:
# Load data
# YOUR CODE HERE
# Q: How many samples does the training set have? A: 60000

# reshape and normalize data
# YOUR CODE HERE

# YOUR CODE HERE

### Visualize Image

In [None]:
##Read image
# YOUR CODE HERE

We divide the data into batches. For this operation, we implement a class for Batch Generation.

In [None]:
# Create Batch Generator Class
# YOUR CODE HERE

'batch_generator.num_batches' is an attribute of the 'BatchGenerator' class which represents the total number of batches that can be generated from the given dataset of images and labels, based on the specified batch size. It is calculated as the total number of images divided by the batch size, rounded up to the nearest integer using the math.ceil() function.
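
As a quick check of the rounding rule, here is the calculation for the MNIST training set. The batch size of 128 is an assumption chosen for illustration.

```python
import math

num_images = 60000   # size of the MNIST training set
batch_size = 128     # assumed batch size, chosen for illustration

# Round up so that the final, partially filled batch is counted too.
num_batches = math.ceil(num_images / batch_size)
last_batch_size = num_images - (num_batches - 1) * batch_size

print(num_batches)      # 469
print(last_batch_size)  # 96
```

Because 60000 is not a multiple of 128, the last batch holds only 96 images.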

In [None]:
# YOUR CODE HERE

'batch_generator.next()' is a method of the BatchGenerator class which generates the next batch of images and labels from the dataset. Each time next() is called, it returns a tuple of images and labels corresponding to the next batch of size batch_size, and updates the internal index pointer to point to the start of the next batch.
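
The behaviour described above can be sketched as follows. `SketchBatchGenerator` is a hypothetical stand-in written for this illustration, not necessarily the class expected in the exercise; it relies only on slicing, so it works with NumPy arrays as well as the plain lists used here.

```python
import math

class SketchBatchGenerator:
    """Hypothetical stand-in for the BatchGenerator described above."""

    def __init__(self, images, labels, batch_size=128):
        assert len(images) == len(labels)
        self.index = 0                      # start of the next batch
        self.images = images
        self.labels = labels
        self.batch_size = batch_size
        self.num_batches = math.ceil(len(images) / batch_size)

    def next(self):
        # Slice out the current batch, then advance the index pointer.
        images = self.images[self.index : self.index + self.batch_size]
        labels = self.labels[self.index : self.index + self.batch_size]
        self.index += self.batch_size
        return images, labels

# Ten toy samples with a batch size of 4 give batches of 4, 4, and 2.
gen = SketchBatchGenerator(list(range(10)), list("abcdefghij"), batch_size=4)
first = gen.next()
second = gen.next()
third = gen.next()   # the last batch is shorter: 10 is not a multiple of 4
print(gen.num_batches, first, third)
```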

In [None]:
# YOUR CODE HERE

With Keras, once the model is defined, training it takes just two calls:

1.   model.compile()
2.   model.fit()

We should know what goes on behind the scenes. The steps involved in training a model:

*Training steps:*

1. Compute the predictions using the current weights (forward pass).
2. Compute the loss value for these predictions.
3. Compute the gradient of the loss with respect to the model weights.
4. Update the weights by moving them a small step in the direction opposite to the gradient.
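
The four steps can be traced by hand on a one-weight model, y = w*x, with a squared-error loss. This is a plain-Python sketch with a hand-derived gradient, not the TensorFlow code used in this notebook; the data point, starting weight, and learning rate are assumptions chosen for illustration.

```python
x, y_true = 2.0, 4.0   # a single training example (hypothetical)
w = 0.0                # initial weight
learning_rate = 0.1
losses = []

for step in range(3):
    y_pred = w * x                        # 1. forward pass
    loss = (y_pred - y_true) ** 2         # 2. loss for this prediction
    grad = 2 * (y_pred - y_true) * x      # 3. d(loss)/dw, derived by hand
    w -= learning_rate * grad             # 4. move w against the gradient
    losses.append(loss)
    print(f"step {step}: loss={loss:.4f}, w={w:.4f}")
```

The loss falls at every step and w approaches 2, the value for which w*x equals y_true.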

In [None]:
# one_training_step shows how the loss is computed and how the layer
# parameters (weights and biases) are updated
def one_training_step(model, images_batch, labels_batch):
  with tf.GradientTape() as tape:                   # records operations for automatic differentiation
    predictions = model(images_batch)               # 1. forward pass
    per_sample_losses = keras.losses.sparse_categorical_crossentropy(  # 2. loss per sample
        labels_batch, predictions
    )
    average_loss = tf.reduce_mean(per_sample_losses)  # mean loss over the batch
  gradients = tape.gradient(average_loss, model.weights)      # 3. gradients of the loss w.r.t. the weights
  update_weights(gradients, model.weights)                    # 4. update the weights
  return average_loss

learning_rate = 1e-3
def update_weights(gradients, weights):
  for g,w in zip(gradients, weights):
    w.assign_sub(g*learning_rate)             # w -= g*lr
    
# Full training loop
# YOUR CODE HERE
# YOUR CODE HERE                       # repeat for epochs
# YOUR CODE HERE
# YOUR CODE HERE
# YOUR CODE HERE      # go through all mini-batches in the data
      images_batch, labels_batch = batch_generator.next()
      loss = one_training_step(model, images_batch, labels_batch)
      if batch_counter%100 == 0:
        print(f"loss at batch {batch_counter}:{loss:.2f}")

# Q: Identify the 4 training steps in the above code
# A: They are present in the one_training_step function. See comments.

Train the model on MNIST data set.

In [None]:
# YOUR CODE HERE

# Q: we didn't do a compile step.... or did we?
# A: In model.compile(), we pass information about the loss function, optimizer, and evaluation metric. \
#     In our naive implementation, instead of defining a separate compile() function, we have defined the loss \
#     inside one_training_step; implemented the optimizer 'mini-batch gradient descent' in update_weights(); \
#     and we are doing the evaluation (accuracy) separately in a later cell.

After 10 epochs the loss has come down.

The model has definitely learned something. Let's evaluate how accurately it can predict labels for images **it has not seen before**. These are the **images in the test set**.

Use the "accuracy" metric. Here we simply find the fraction of times the model succeeded in predicting the correct label.

When using Keras, we would mention this metric in model.compile(). (More on Keras later)
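
The accuracy computation described above can be sketched with NumPy. The `predictions` and `labels` arrays are made-up values for illustration, not outputs of the model in this notebook.

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples and 3 classes.
predictions = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
labels = np.array([0, 1, 2, 1])   # true class indices

predicted_labels = np.argmax(predictions, axis=1)   # most probable class per sample
accuracy = np.mean(predicted_labels == labels)      # fraction of correct predictions
print(predicted_labels, accuracy)
```

Three of the four samples are classified correctly, so the accuracy is 0.75.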


In [None]:
#  Evaluation step
# YOUR CODE HERE




---



# **Different APIs**
1. Sequential Model
2. Functional API
3. Model subclassing



---



# 1. Sequential API

Everything implemented so far can also be done with the Sequential class in Keras. In the following approach, the layers are passed as a list.

Defining the same model with the Sequential class [[Reference](https://keras.io/api/models/model/)].

In [None]:
# from keras.layers import Dense, Flatten
# from keras import Input

# YOUR CODE HERE
# Q: Do you notice a difference in arguments of the Dense layers, compared to our implementation?
# A: We did not have to write the input size explicitly for each layer. It is inferred automatically by Keras

Alternatively, instead of passing the layers as a list, we can build a sequential model by adding layers to it incrementally.

In [None]:
# YOUR CODE HERE

Notice that we have not yet provided any information about the input dimensions.

These layers are referred to as symbolic layers. 

Until the model is built, the layer weights are not created.

In [None]:
# YOUR CODE HERE

To create the weights, you need to call the model on some data or call its build method with an input shape.
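
A minimal sketch of the second option, calling `build` with an input shape. The layer sizes and the 3 input features are assumptions for illustration; `model.built` is used to inspect the state because reading `model.weights` before building behaves differently across Keras versions.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
print(model.built)          # no input shape is known yet

# None leaves the batch size open; 3 is the assumed number of input features.
model.build(input_shape=(None, 3))
print(model.built)
print([tuple(w.shape) for w in model.weights])
```

After the build, each Dense layer owns a weight matrix and a bias vector whose shapes follow from the input shape.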

In [None]:
# YOUR CODE HERE  #None means it can take any batch size; 3 is the number of features in your input
# YOUR CODE HERE

# seq_model_inc.weights

In [None]:
# Display model summary
# YOUR CODE HERE

Q: Verify the number of parameters with a quick calculation.

A: 650 = 64*10 + 10 

The weight matrix feeding the 10 output neurons has 64*10 weights, and there are 10 biases, one per neuron.
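
The same arithmetic can be written as a small helper. `dense_param_count` is a hypothetical function introduced for this illustration, and the 3 input features are an assumption taken from the comment in the build cell above.

```python
def dense_param_count(input_dim, units):
    """Parameters of a fully connected layer: one weight per
    (input, unit) pair plus one bias per unit."""
    return input_dim * units + units

hidden = dense_param_count(3, 64)    # assumes 3 input features
output = dense_param_count(64, 10)
print(hidden, output, hidden + output)   # 256 650 906
```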

**Specifying input shape in advance**

In [None]:
# YOUR CODE HERE

In [None]:
# YOUR CODE HERE

# 2. Functional API


We will use the Keras functional API to create the same model. Keras functional API can create more flexible models than Sequential API. It can handle models with non-linear topology, shared layers, and even multiple inputs or outputs.

Key idea: each layer is expressed as a function of the previous layer's output.

<center>(input: 3-dimensional vectors)</center> 
<center>  ↧ </center>
<center>[Dense (64 units, relu activation)]</center>
<center>   ↧ </center>
<center>(output: 10 units, softmax activation)</center>
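
One way to express the diagram above with the functional API is sketched below. The variable names (`inputs`, `features`, `outputs`, `sketch_model`) and the input name are illustrative assumptions, not prescribed by the assignment.

```python
from tensorflow import keras

# Symbolic input: each sample is a 3-dimensional vector.
inputs = keras.Input(shape=(3,), name="my_input")

# Each layer is called on the output of the previous one.
features = keras.layers.Dense(64, activation="relu")(inputs)
outputs = keras.layers.Dense(10, activation="softmax")(features)

# The model is defined by its input and output tensors.
sketch_model = keras.Model(inputs=inputs, outputs=outputs)
print(sketch_model.count_params())
```

The layers in between are discovered by tracing the graph from `outputs` back to `inputs`, which is what lets the functional API describe non-sequential topologies.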

Defining the same model with the functional API [[Reference](https://keras.io/api/models/model/)].

In [None]:
# YOUR CODE HERE

In [None]:
# YOUR CODE HERE

We get effectively the same summary because we have implemented the same model using the functional API.

In [None]:
# YOUR CODE HERE

A deeper network.

In [None]:
# from keras.layers import Dense
# import keras
#node = Layer(nodes, extra_params)(prev_node)
# YOUR CODE HERE
# YOUR CODE HERE                               #defining dense2 node whose parent is dense1
# YOUR CODE HERE                               #defining output node where parent is dense2
# YOUR CODE HERE

In [None]:
model.summary()

In [None]:
plot_model(model)

An example where the Sequential API would not be sufficient.

**Multi-Input and Multi-output:** Consider an example of building a system to rank customer tickets by priority and route them to the appropriate departments.

Outputs: the model needs to produce two outputs
1. The first task of the model is to classify tickets as priority or non-priority (binary classification)

2. The second task is to route the ticket to the appropriate department (multi-class classification over the number of departments)

These two tasks are to be done simultaneously.

Inputs:
1. Title of the ticket (text input)
2. The text body of the ticket (text input)
3. Any tags added by the user

Q. Is it possible to build the model sequentially?

A: No, we cannot build a multi-input, multi-output model with the Sequential API, because such a model is not a single linear stack of layers.
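
A rough sketch of how the functional API handles this branching structure. The sizes are deliberately small and, like the variable and layer names, are assumptions for illustration; the inputs are treated as fixed-length vectors.

```python
from tensorflow import keras

vocab, n_tags, n_departments = 50, 5, 4   # small illustrative sizes

# Three separate inputs.
title = keras.Input(shape=(vocab,), name="title")
text_body = keras.Input(shape=(vocab,), name="text_body")
tags = keras.Input(shape=(n_tags,), name="tags")

# Merge the inputs into one feature vector and mix them.
features = keras.layers.Concatenate()([title, text_body, tags])
features = keras.layers.Dense(64, activation="relu")(features)

# Two output heads share the same features.
priority = keras.layers.Dense(1, activation="sigmoid", name="priority")(features)
department = keras.layers.Dense(
    n_departments, activation="softmax", name="department")(features)

ticket_model = keras.Model(inputs=[title, text_body, tags],
                           outputs=[priority, department])
print(len(ticket_model.inputs), len(ticket_model.outputs))
```

The three inputs merge into one tensor and the two heads branch from it, a graph that a single stack of layers cannot express.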

In [None]:
vocabulary_size = 10000
num_tags = 100
num_departments = 4

# Inputs
# YOUR CODE HERE

# Outputs
# YOUR CODE HERE

In [None]:
# Model Summary
# YOUR CODE HERE

In [None]:
# YOUR CODE HERE

Creating a new model by reusing an intermediate layer's output

In [None]:
# YOUR CODE HERE

In [None]:
# YOUR CODE HERE

In [None]:
keras.utils.plot_model(new_model)

# 3. Subclassing the Model class

We saw how the functional API enabled us to make more complex models compared to the sequential API. We moved up the ladder of progressive disclosure of complexity.


Defining the same customer-ticket model by subclassing the Model class [[Reference](https://keras.io/api/models/model/)].


In [None]:
class CustomerTicketModel(keras.Model):

# Define the layers in the __init__ method
  def __init__(self, num_departments):
    super().__init__()
    self.concat_layer = keras.layers.Concatenate()
    self.mixing_layer = keras.layers.Dense(64, activation="relu")
    self.priority_scorer = keras.layers.Dense(1, activation="sigmoid")
    self.department_classifier = keras.layers.Dense(num_departments,
                                                    activation="softmax")
    

# Define the relationship between layers in the call method   
# See Section 7.2.3 of Chollet (2021) for more details 
  def call(self,inputs):
    # input should be dictionary type
    title = inputs["title"]
    text_body = inputs["text_body"]
    tags = inputs["tags"]

    features = self.concat_layer([title, text_body, tags])
    features = self.mixing_layer(features)

    priority = self.priority_scorer(features)
    department = self.department_classifier(features)

    return priority, department

In [None]:
# YOUR CODE HERE

In [None]:
# YOUR CODE HERE

In [None]:
# here model is built by calling the data since build() method is not 
# defined in model subclass 
# generate random data

# YOUR CODE HERE

In [None]:
sub_class_model.summary()

## Building the model using custom dense layer and functional API
### Building custom layer [[Reference](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Layer)].

In [None]:
from keras.initializers import RandomNormal

In [None]:
class Custom_Dense(keras.layers.Layer):
  def __init__(self, units, activation=None):
    super().__init__()
    self.units = units
    self.activation = activation

  # Subclassing gives us the flexibility here to initialize weights on our own
  def build(self, input_shape):
    input_dim = input_shape[-1]
    std_dev = np.sqrt(2/(input_dim + self.units))
    self.W = self.add_weight(shape=(input_dim, self.units),
                            initializer=RandomNormal(stddev=std_dev))
    self.b = self.add_weight(shape=(self.units,),
                             initializer="zeros")
    
  def call(self, inputs):
    y = tf.matmul(inputs, self.W) + self.b
    if self.activation is not None:
      y = self.activation(y)
    return y


We can even define custom metrics and custom loss functions using the subclassing API. Refer to Section 7.3.1 of Chollet for details.

### Using custom dense layer with functional API


In [None]:
# YOUR CODE HERE

In [None]:
# Model Summary
# YOUR CODE HERE

In [None]:
# Plot the model
# YOUR CODE HERE

In [None]:
# Compile the model
# YOUR CODE HERE

In [None]:
train_x = train_images[10000:]
train_y = train_labels[10000:]
val_x = train_images[:10000]
val_y = train_labels[:10000]

In [None]:
# Train the model
# YOUR CODE HERE

In [None]:
# YOUR CODE HERE

In [None]:
# YOUR CODE HERE

Q. What is the difference between evaluate() and predict()?

A: evaluate() returns the loss value and the values of the metrics passed to compile(). predict() runs a forward pass on the given input data and returns the model's outputs.

"predict" is used to make predictions on new data using a trained model. Given an input tensor, the "predict" function outputs the corresponding predictions generated by the model.

"evaluate" is used to evaluate the performance of a trained model on a given dataset. Given an input dataset, the "evaluate" function computes the model's performance metrics, such as accuracy, loss, or any other metrics defined during model compilation.

In [None]:
model.evaluate(test_images,test_labels)

In [None]:
class_predicted = np.argmax(model.predict(test_images),axis=1)
accuracy = np.sum(class_predicted == test_labels)/len(test_labels)
print(accuracy)

In [None]:
from sklearn.metrics import confusion_matrix, classification_report

In [None]:
# Display Classification report
# YOUR CODE HERE

In [None]:
print(confusion_matrix(test_labels, class_predicted))

In [None]:
#save model

# Using Callbacks

A callback is an object that can perform actions at various stages of training (e.g. at the start or end of an epoch, before or after a single batch, etc).

You can use callbacks to:

* Write TensorBoard logs after every batch of training to monitor your metrics
* Periodically save your model to disk
* Do early stopping
* Get a view on internal states and statistics of a model during training

Access Keras callbacks [here](https://keras.io/api/callbacks/)
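
As a rough illustration of two of the uses listed above, the sketch below creates an early-stopping callback and a checkpointing callback. The monitored quantity, the patience value, and the file name `checkpoint_sketch.keras` are assumptions chosen for this example.

```python
from tensorflow import keras

# Stop when the validation loss has not improved for 2 consecutive epochs.
early_stopping = keras.callbacks.EarlyStopping(monitor="val_loss", patience=2)

# Save the model whenever the validation loss reaches a new best value.
# The file name is a placeholder.
checkpoint = keras.callbacks.ModelCheckpoint(
    filepath="checkpoint_sketch.keras",
    monitor="val_loss",
    save_best_only=True,
)

callbacks_list = [early_stopping, checkpoint]
# Passed to training as:
#   model.fit(..., validation_data=..., callbacks=callbacks_list)
```

Both callbacks monitor `val_loss`, so validation data must be supplied to `fit` for them to have anything to track.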

In [None]:
# build model using functional API
inputs = Input(shape=(28*28,))
features = Dense(512,activation="relu")(inputs)
features = keras.layers.Dropout(0.5)(features)
outputs = Dense(10,activation="softmax")(features)

mnist_model = keras.Model(inputs, outputs)

In [None]:
from keras.callbacks import EarlyStopping, ModelCheckpoint, TensorBoard

In [None]:
# YOUR CODE HERE

In [None]:
# Compile the model
# YOUR CODE HERE

In [None]:
# YOUR CODE HERE

### TensorBoard

It is a visualization tool provided with TensorFlow.

This callback logs events for TensorBoard, including:

* Metrics summary plots
* Training graph visualization
* Weight histograms
* Sampled profiling

In [None]:
# Tensorboard Visualization
# YOUR CODE HERE

**Reference**


*   Chollet, F. (2021). Deep Learning with Python. Manning Publications.
*   Géron, A. (2022). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. O'Reilly Media.



### Please answer the questions below to complete the experiment:




In [None]:
#@title  Which of the following statement is best suitable for Keras functional API?  { run: "auto", form-width: "500px", display-mode: "form" }
Answer = "" #@param ["", "We can use it to create all aspects of the neural network from scratch and have complete control and flexibility.", "It is best suitable for creating simple architectures.", "It provides flexibility to design complex topologies of neural network which includes shared layers, branching, multiple input and output.", "All of the above"]

In [None]:
#@title How was the experiment? { run: "auto", form-width: "500px", display-mode: "form" }
Complexity = "" #@param ["","Too Simple, I am wasting time", "Good, But Not Challenging for me", "Good and Challenging for me", "Was Tough, but I did it", "Too Difficult for me"]


In [None]:
#@title If it was too easy, what more would you have liked to be added? If it was very difficult, what would you have liked to have been removed? { run: "auto", display-mode: "form" }
Additional = "" #@param {type:"string"}


In [None]:
#@title Can you identify the concepts from the lecture which this experiment covered? { run: "auto", vertical-output: true, display-mode: "form" }
Concepts = "" #@param ["","Yes", "No"]


In [None]:
#@title  Text and image description/explanation and code comments within the experiment: { run: "auto", vertical-output: true, display-mode: "form" }
Comments = "" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [None]:
#@title Mentor Support: { run: "auto", vertical-output: true, display-mode: "form" }
Mentor_support = "" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [None]:
#@title Run this cell to submit your notebook for grading { vertical-output: true }
try:
  if submission_id:
      return_id = submit_notebook()
      if return_id : submission_id = return_id
  else:
      print("Please complete the setup first.")
except NameError:
  print ("Please complete the setup first.")

# Assignment [Optional]
Please attempt the exercises below; they will be discussed in the next tutorial session.

## Q1

Train a DNN using the Sequential API on the [MNIST fashion dataset](https://keras.io/api/datasets/fashion_mnist/) by following the instructions given below:


1.   Import all the necessary modules 
2.   Use the sequential API to make a model with the following dense layers: 

        i. layer-1: 128 neurons, relu activation

        ii. layer-2: X neurons, softmax activation
        (Deduce the value of X based on the number of classes in the MNIST fashion dataset)

        iii. Define the input_shape=(None, Y) for the model. Deduce Y from the dataset
        
        iv. Plot the model (plot_model())

3.  Load the mnist fashion data from keras.datasets and perform necessary preprocessing (like reshaping and normalizing) on the train and test sets.

4. Split the original training set into train and validation (10%) sets

5.  Compile the model using an appropriate loss, any optimizer, and the "accuracy" metric

6.  Define CallbackList with EarlyStopping (patience=2) and TensorBoard callbacks.

7.  Fit the model on training data for 10 epochs. 
      
8.  Predict the labels of the first 5 images in the test set.

9. Use the history returned by model.fit() and plot: 
        (i) Training and validation loss against epochs. 
        (ii) Training and validation accuracy against epochs. Are these two plots identical to what we see in TensorBoard?

10. Report the metrics, confusion matrix and classification report.


There are no points for accuracy of models. We will only evaluate correct implementation. **Write the code for each subpart in a new cell.**



## Q2
Use the functional API to make [this model](https://indianinstituteofscience-my.sharepoint.com/:i:/g/personal/rohitc1_iisc_ac_in/EbhB2Uh7DQZKuI4xVXgpidcBuvT_FL6XlnDgwnyGX7wxzg?e=B8pput). You can use any shape for the layers. Use the plot_model() utility to show the model graph.