# Neural Networks: Multi-layered and Multi-class classification

### Welcome to the 4th Lab of 42028: Deep Learning and CNN!

In this lab/tutorial session, you will implement a neural network to classify the Fashion MNIST dataset.

So let's get started!

<img src='http://drive.google.com/uc?export=view&id=1LzpK5HxtNbcr-_8t0xkbl92kEDfpIgo6' alt='NN-ML'>


**Image source:** https://towardsdatascience.com/multi-layer-neural-networks-with-sigmoid-function-deep-learning-for-rookies-2-bf464f09eb7f



## Tasks for this week:

1. Implement a neural network for classification using the Keras API.
2. Train and test the model.


### Step 1: Import required packages

We will need TensorFlow, NumPy, os, and Keras.


In [0]:
import tensorflow as tf
import os
import numpy as np

%load_ext tensorboard

In [0]:
from tensorflow import keras
print(tf.__version__)
print(keras.__version__)

### Step 2: Download the Fashion MNIST dataset using Keras

In [0]:
fashionMnist=tf.keras.datasets.fashion_mnist

In [0]:
(train_images, train_labels), (test_images, test_labels) = fashionMnist.load_data()

In [0]:
print(train_images.shape)
print(train_images.dtype)

In [0]:
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

In [0]:
## Show an image from the dataset
import matplotlib.pyplot as plt
plt.imshow(train_images[3])
print(train_labels[3])

**Note:** Scikit-learn imports the Fashion MNIST dataset as flat 1-D arrays (784 values per image), while the Keras API loads each image in 28×28 format.
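To make the difference between the two layouts concrete, here is a small sketch. It uses a synthetic array in place of the real dataset (the names `images`, `flat`, and `restored` are assumptions for illustration) and shows that the 28×28 layout and the flat 784-value layout hold the same pixels and can be converted with `reshape`.

```python
import numpy as np

# Synthetic stand-in for a batch of 3 grayscale images (NOT the real dataset)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into a 1-D vector of 784 values
flat = images.reshape(len(images), -1)

# Convert the flat vectors back to the 28x28 layout
restored = flat.reshape(-1, 28, 28)

print(images.shape, flat.shape, restored.shape)
print(np.array_equal(images, restored))
```

No pixel values change in either direction; only the shape of the array does.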

### Step 3: Normalize the dataset and split off a small part of the training set as a validation set


- Validation set: the first 5000 samples
- Training set: the remaining samples (index 5000 onward)

In [0]:
## WRITE YOUR CODE HERE ## (~5 lines of code)
## Hint: Use slicing to split the training data into training and validation sets,
## and divide the pixel values by 255.0 to scale them to the range [0, 1]

valid_images = 
valid_labels = 


train_images = 
train_labels =

## Normalize the test images 
test_images = 

### END YOUR CODE HERE ###

In [0]:
print(np.shape(train_images))
print(np.shape(valid_images))
print(np.shape(test_images))

[**Expected** Output]

(55000, 28, 28)
(5000, 28, 28)
(10000, 28, 28)
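If the slicing pattern is unclear, the sketch below shows one possible approach on a tiny synthetic array rather than on the real data. The names `fake_images`, `fake_labels`, and `n_valid` are assumptions made for this illustration; the lab itself uses 5000 validation samples.

```python
import numpy as np

# Synthetic stand-in for the training set: 60 "images" instead of 60000
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(60, 28, 28), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=60)

n_valid = 5  # hypothetical size for this toy example

# First n_valid samples become the validation set, scaled to [0, 1]
valid_x = fake_images[:n_valid] / 255.0
valid_y = fake_labels[:n_valid]

# Everything from index n_valid onward stays in the training set
train_x = fake_images[n_valid:] / 255.0
train_y = fake_labels[n_valid:]

print(train_x.shape, valid_x.shape)
print(train_x.dtype, train_x.min() >= 0.0, train_x.max() <= 1.0)
```

Dividing the `uint8` pixels by `255.0` converts them to floating point, which is why the normalized arrays no longer have an integer dtype.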

### Step 4:  Design the model

In [0]:
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(input_shape=[28,28]), 
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu), 
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

## **Notes:**
* **Sequential model.** This is the simplest kind of Keras model: a neural network defined as a plain SEQUENCE of layers, connected one after another.

* **Flatten.** Flatten takes each 28×28 image and turns it into a 1-dimensional vector of 784 values.

* **Dense.** Next we add a Dense (fully connected) hidden layer with 128 neurons, using the ReLU activation function. ReLU passes values greater than 0 unchanged and outputs 0 for all other values of X, e.g. if X > 0 return X, else return 0.

* Finally, we add a Dense output layer with 10 neurons (one per class), using the softmax activation function.

* **Softmax.** The softmax takes a set of values and converts them into probabilities that sum to 1. The largest input value receives the highest probability, so the predicted class is the one with the biggest output.
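The two activation functions described in the notes can be sketched in a few lines of NumPy. This is a hand-written illustration, not the Keras implementation; the helper names `relu` and `softmax` and the sample inputs are assumptions chosen for the example.

```python
import numpy as np

def relu(x):
    # Pass positive values through unchanged, replace everything else with 0
    return np.maximum(0.0, x)

def softmax(z):
    # Subtract the maximum first for numerical stability (does not change the result)
    shifted = z - np.max(z)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

x = np.array([-2.0, 0.0, 3.0])
print(relu(x))

logits = np.array([1.0, 2.0, 5.0])
probs = softmax(logits)
print(probs, probs.sum())

# The predicted class is the index of the largest probability
print(np.argmax(probs))
```

Note that softmax does not discard the smaller values: every class keeps a non-zero probability, and `argmax` is what finally selects the winner.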

### Step 5: Training the model

**"sparse_categorical_crossentropy":** Use this loss when the labels are sparse, i.e. each label is a single integer class index (0 to 9 here), and the classes are mutually exclusive.

**One-hot vector encoding:** This is sometimes used for encoding the labels when there is one target probability per class for each instance. For example,
[0., 0., 0., 0., 1., 0., 0., 0., 0., 0.] represents the one-hot encoding for class 4. In such a case, the **"categorical_crossentropy"** loss is used.

**"binary_crossentropy":** This loss (sigmoid cross-entropy) is used for binary classification problems, where the **"sigmoid"** activation function is used in the output layer instead of softmax.
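For a single example, the sparse and one-hot variants measure the same thing: the negative log of the probability the model assigns to the true class. The sketch below checks this by hand with NumPy. The probability vector `probs` is a made-up model output used purely for illustration, and the calculation is a simplified stand-in for what the Keras losses compute.

```python
import numpy as np

# Hypothetical softmax output for one image (10 classes, sums to 1)
probs = np.array([0.05, 0.05, 0.10, 0.10, 0.60, 0.02, 0.02, 0.02, 0.02, 0.02])

# Sparse label: a single integer class index
sparse_label = 4

# One-hot label: a vector with a 1 at the true class
one_hot = np.zeros(10)
one_hot[sparse_label] = 1.0

# Sparse form: look up the probability of the true class directly
sparse_loss = -np.log(probs[sparse_label])

# One-hot form: sum over all classes, but only the true class contributes
categorical_loss = -np.sum(one_hot * np.log(probs))

print(sparse_loss, categorical_loss)
print(np.isclose(sparse_loss, categorical_loss))
```

The choice between the two losses therefore depends only on how the labels are stored, not on what is being optimized.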



In [0]:
model.compile(optimizer = tf.optimizers.Adam(),
              loss = 'sparse_categorical_crossentropy',
              metrics=['accuracy'])

H=model.fit(train_images, train_labels, epochs=10,validation_data=(valid_images, valid_labels))

Why do we use a validation dataset?

In [0]:
type(H)
print(H.history.keys())

## Task:

Change the optimizer to SGD or any other optimizer and retrain the model.

#### Optional: Add another layer and retrain to check the accuracy

Reference: 

https://www.tensorflow.org/api_docs/python/tf/train/GradientDescentOptimizer


https://www.tensorflow.org/api_docs/python/tf/train/Optimizer 
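As background for the task, the update rule that SGD applies on each mini-batch is simply "move the weight a small step against the gradient". The sketch below applies that rule to a toy one-parameter function, f(w) = (w - 3)^2, instead of a neural network. The function, the starting point, and the learning rate are all assumptions chosen for illustration, and the gradient here is exact rather than estimated from a mini-batch.

```python
def grad(w):
    # Derivative of the toy loss f(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

w = 0.0              # hypothetical starting weight
learning_rate = 0.1  # hypothetical step size

for step in range(100):
    # Gradient-descent update: step against the gradient
    w = w - learning_rate * grad(w)

print(w)
```

The weight moves toward 3, the minimum of the toy loss. In Keras the optimizer performs this kind of update for every weight in the model, using gradients computed on each mini-batch.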

In [0]:
## WRITE YOUR CODE HERE ## (~3 lines of code)

model1 = ## create a model here

model1.compile() ## add the required arguments

H=model1.fit() ## Add the required arguments

### END YOUR CODE HERE ###

### Summary of the model

In [0]:
## Use model.summary() to create a summary of the model (layers, types, output shapes, etc.)
model.summary()
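The parameter counts reported in the summary can be checked by hand. A Dense layer has one weight per input-unit pair plus one bias per unit, and Flatten has no parameters. The sketch below does the arithmetic for the 784 → 128 → 10 model from Step 4; the helper `dense_params` is a hypothetical name introduced for this example.

```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair plus one bias per unit
    return n_inputs * n_units + n_units

flatten_outputs = 28 * 28                      # Flatten: 784 values, no parameters
hidden = dense_params(flatten_outputs, 128)    # hidden Dense layer
output = dense_params(128, 10)                 # output Dense layer
total = hidden + output

print(hidden, output, total)  # 100480 1290 101770
```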

In [0]:
## Plot the learning curves 
import pandas as pd
import matplotlib.pyplot as plt
pd.DataFrame(H.history).plot(figsize=(8, 5))
plt.grid(True)
plt.gca().set_ylim(0, 1) # set the vertical range to [0-1]
plt.show()

## Plot only the training loss
plt.plot(H.history['loss'])
plt.ylabel('cost')
plt.xlabel('Epochs')
plt.title("Cost/Loss Curve")
plt.show()

### Step 6: Evaluation on test dataset

In [0]:
## Evaluate the model's performance on the test dataset.
model.evaluate(test_images, test_labels)

## Task:

Evaluate the model trained with SGD


In [0]:
## WRITE YOUR CODE HERE ## (~ 1 line of code)


### Callbacks for stopping training early

In [0]:
class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # Stop training once the training loss drops below 0.1
    loss = (logs or {}).get('loss')
    if loss is not None and loss < 0.1:
      print("\nLoss fell below 0.1, so cancelling training!")
      self.model.stop_training = True

In [0]:
callbacks = myCallback()
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=[28,28]),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(train_images, train_labels, epochs=5, callbacks=[callbacks])