<a href="https://colab.research.google.com/github/AdicherlaVenkataSai/NeuralNetworks/blob/master/4.%20DNNwithKeras(MNIST).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction

MNIST dataset

`https://drive.google.com/file/d/1en861ED1Q-TBaQScN4R8KDqmkkC6FcF-/view?usp=sharing`

`https://drive.google.com/file/d/1ojllzzXU7KFLpieB9yrVIQLcK1eFcfGl/view?usp=sharing`

__________________________________________________________________________________

1. 784 I/P -> 200 -> 100 -> 60 -> 30 -> 10 O/P

2. The hidden layers use sigmoid as the activation function; the output layer uses the softmax activation function, which converts multiple numbers into probabilities
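
To make the softmax step concrete, here is a minimal sketch in plain Python. The three input scores are hypothetical values chosen for illustration, and `softmax` is a hand-written helper, not the Keras implementation.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing
    # and does not change the result.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs for three classes
probs = softmax([2.0, 1.0, 0.1])
print(probs)       # largest score gets the largest probability
print(sum(probs))  # the probabilities add up to 1 (up to rounding)
```

In the network above the same idea is applied to the 10 outputs of the last layer, one per digit.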




# Implementation
### 0. load tensorflow

In [0]:
import tensorflow as tf
#tf.random.set_seed(58)  # setting a seed reproduces the same random numbers (tf.set_random_seed in TF 1.x)

### 1. load data

In [2]:
(x_train, y_train), (x_test,y_test)= tf.keras.datasets.mnist.load_data()
x_train.shape,y_train.shape, x_test.shape,y_test.shape

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


((60000, 28, 28), (60000,), (10000, 28, 28), (10000,))

In [3]:
y_train[20]

4

### 2. Convert output labels to multiple values (categorical)

In [0]:
y_train = tf.keras.utils.to_categorical(y_train, num_classes=10)
y_test = tf.keras.utils.to_categorical(y_test, num_classes=10)

In [5]:
y_train[20]

array([0., 0., 0., 0., 1., 0., 0., 0., 0., 0.], dtype=float32)

note: observe the difference between the previous result of y_train[20] and the current one. The value 4 is one-hot encoded into a categorical variable and displayed as above (the respective position is marked as 1, the rest as 0)
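
The encoding can be reproduced by hand. The sketch below uses two hypothetical helpers, `one_hot` and `decode`, written in plain Python to show what `to_categorical` does to a single label and how to get the label back.

```python
def one_hot(label, num_classes=10):
    """Return a list with 1.0 at position `label` and 0.0 elsewhere."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def decode(vec):
    """Recover the label: the index of the largest entry."""
    return vec.index(max(vec))

encoded = one_hot(4)
print(encoded)          # [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(decode(encoded))  # 4
```

The same `decode` idea applies to a softmax output: the predicted digit is the position with the highest probability.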

### 3. Build the model

In [0]:
model = tf.keras.models.Sequential()

#reshaping 28*28 into 784
model.add(tf.keras.layers.Reshape((784,), input_shape = (28, 28,)))

#normalising the data
model.add(tf.keras.layers.BatchNormalization())

#1st hidden layer
model.add(tf.keras.layers.Dense(200, activation='sigmoid'))

#2nd hidden layer
model.add(tf.keras.layers.Dense(100, activation='sigmoid'))

#3rd hidden layer
model.add(tf.keras.layers.Dense(60, activation='sigmoid'))

#4th hidden layer
model.add(tf.keras.layers.Dense(30, activation='sigmoid'))

#Output layer
model.add(tf.keras.layers.Dense(10, activation='softmax'))

### 4. Compile the model

In [0]:
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

### 5. Review the model

In [8]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
reshape (Reshape)            (None, 784)               0         
_________________________________________________________________
batch_normalization (BatchNo (None, 784)               3136      
_________________________________________________________________
dense (Dense)                (None, 200)               157000    
_________________________________________________________________
dense_1 (Dense)              (None, 100)               20100     
_________________________________________________________________
dense_2 (Dense)              (None, 60)                6060      
_________________________________________________________________
dense_3 (Dense)              (None, 30)                1830      
_________________________________________________________________
dense_4 (Dense)              (None, 10)                3
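
The `Param #` column can be checked by hand. A Dense layer has one weight per input/output pair plus one bias per output, and a BatchNormalization layer keeps four values per feature (scale, shift, moving mean, moving variance). The helper names below are hypothetical; the layer sizes come from the architecture listed in the introduction.

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

def batch_norm_params(n_features):
    """gamma, beta, moving mean and moving variance for each feature."""
    return 4 * n_features

# 784 I/P -> 200 -> 100 -> 60 -> 30 -> 10 O/P
sizes = [784, 200, 100, 60, 30, 10]

print("batch_normalization:", batch_norm_params(sizes[0]))
for n_in, n_out in zip(sizes, sizes[1:]):
    print(f"Dense {n_in} -> {n_out}:", dense_params(n_in, n_out))
```

The first printed values match the summary rows above (3136, 157000, 20100, 6060, 1830).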

### 6. Train the model

In [9]:
history = model.fit(x_train, y_train,
                    validation_data = (x_test, y_test),
                    epochs = 50, batch_size = 32)

Epoch 1/50
...
Epoch 50/50


* tf.keras.backend.clear_session() removes the default graph and resets the state Keras has built up

* We can use relu as the activation function (it is the most commonly used one)
* We can also adjust the lr and decay of the optimizer and observe the change in the results
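
One common reason relu is preferred in deeper networks is the size of its gradient. The sketch below compares the derivatives of sigmoid and relu at a few hypothetical inputs; it is an illustration in plain Python, not a measurement from this notebook.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of sigmoid; it never exceeds 0.25."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    """Derivative of relu: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

for x in (-5.0, 0.0, 0.5, 5.0):
    print(x, round(sigmoid_grad(x), 4), relu_grad(x))
```

Because the sigmoid derivative is at most 0.25 and close to 0 for large inputs, gradients shrink as they pass back through several sigmoid layers, while relu passes them through unchanged for positive inputs.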

### 7. Model with relu as an activation function

In [10]:
model1 = tf.keras.models.Sequential()

#reshaping 28*28 into 784
model1.add(tf.keras.layers.Reshape((784,), input_shape = (28, 28,)))

#normalising the data
model1.add(tf.keras.layers.BatchNormalization())

#1st hidden layer
model1.add(tf.keras.layers.Dense(200, activation='relu'))

#2nd hidden layer
model1.add(tf.keras.layers.Dense(100, activation='relu'))

#3rd hidden layer
model1.add(tf.keras.layers.Dense(60, activation='relu'))

#4th hidden layer
model1.add(tf.keras.layers.Dense(30, activation='relu'))

#Output layer
model1.add(tf.keras.layers.Dense(10, activation='softmax'))


model1.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

model1.summary()

history = model1.fit(x_train, y_train,
                    validation_data = (x_test, y_test),
                    epochs = 50, batch_size = 32)


Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
reshape_1 (Reshape)          (None, 784)               0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 784)               3136      
_________________________________________________________________
dense_5 (Dense)              (None, 200)               157000    
_________________________________________________________________
dense_6 (Dense)              (None, 100)               20100     
_________________________________________________________________
dense_7 (Dense)              (None, 60)                6060      
_________________________________________________________________
dense_8 (Dense)              (None, 30)                1830      
_________________________________________________________________
dense_9 (Dense)              (None, 10)               

### 8. Changing lr and decay in adam optimizer with relu as activation function

In [11]:
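# pass the optimizer object itself; the string 'adam' would use the default settings
# `decay` is accepted by older tf.keras releases; newer ones may need a learning-rate schedule instead
adam = tf.keras.optimizers.Adam(learning_rate = 0.03, decay = 0.02)
model1.compile(optimizer = adam, loss = 'categorical_crossentropy', metrics = ['accuracy'])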

model1.summary()

history = model1.fit(x_train, y_train,
                    validation_data = (x_test, y_test),
                    epochs = 50, batch_size = 32)


Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
reshape_1 (Reshape)          (None, 784)               0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 784)               3136      
_________________________________________________________________
dense_5 (Dense)              (None, 200)               157000    
_________________________________________________________________
dense_6 (Dense)              (None, 100)               20100     
_________________________________________________________________
dense_7 (Dense)              (None, 60)                6060      
_________________________________________________________________
dense_8 (Dense)              (None, 30)                1830      
_________________________________________________________________
dense_9 (Dense)              (None, 10)               
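
To get a feel for what `lr = 0.03` with `decay = 0.02` means, the sketch below assumes the inverse-time rule used by older Keras optimizers, where the rate after a given number of batch updates is `lr / (1 + decay * step)`. That rule is an assumption here and is worth checking against the installed version; `decayed_lr` is a hypothetical helper.

```python
def decayed_lr(initial_lr, decay, step):
    """Inverse-time decay: the rate shrinks as the update count grows."""
    return initial_lr / (1.0 + decay * step)

# 60000 training images with batch_size = 32
steps_per_epoch = 60000 // 32

for epoch in (0, 1, 10, 50):
    step = epoch * steps_per_epoch
    print(f"after epoch {epoch}: lr = {decayed_lr(0.03, 0.02, step):.6f}")
```

Under this assumption the rate falls sharply within the first epoch, so most of the 50 epochs would run with a very small learning rate.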

> Dropout is a regularization technique used to avoid overfitting `https://drive.google.com/file/d/12QBlioV5wgSlBa8KXGBtz9mv3nVjXo6C/view?usp=sharing`

note: we can apply dropout anywhere and any number of times we want to
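
As a rough illustration of what a dropout layer does, here is a sketch of "inverted" dropout in plain Python. The `dropout` helper and the sample activations are hypothetical; the Keras layer works on tensors, but the idea is the same: during training each value is zeroed with probability `rate` and the survivors are scaled up, and at inference the values pass through unchanged.

```python
import random

def dropout(activations, rate, training=True, rng=random):
    """Zero each value with probability `rate`; scale the rest by 1 / (1 - rate)."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)       # fixed seed so the run is repeatable
layer_output = [1.0] * 10    # hypothetical activations of one layer

dropped = dropout(layer_output, rate=0.4, rng=rng)
print(dropped)
print(dropout(layer_output, rate=0.4, training=False))
```

The scaling keeps the expected sum of the activations the same with and without dropout, which is why no extra correction is needed at prediction time.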

### 9. Dropout to avoid overfitting

In [13]:
model2 = tf.keras.models.Sequential()

#reshaping 28*28 into 784
model2.add(tf.keras.layers.Reshape((784,), input_shape = (28, 28,)))

#normalising the data
model2.add(tf.keras.layers.BatchNormalization())
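# dropout on the normalised inputs, added to model2 (not model);
# the summary recorded below was produced without this layer
model2.add(tf.keras.layers.Dropout(0.5))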

#1st hidden layer
model2.add(tf.keras.layers.Dense(200, activation='relu'))
model2.add(tf.keras.layers.Dropout(0.4))

#2nd hidden layer
model2.add(tf.keras.layers.Dense(100, activation='relu'))
model2.add(tf.keras.layers.Dropout(0.3))

#3rd hidden layer
model2.add(tf.keras.layers.Dense(60, activation='relu'))
model2.add(tf.keras.layers.Dropout(0.2))

#4th hidden layer
model2.add(tf.keras.layers.Dense(30, activation='relu'))
model2.add(tf.keras.layers.Dropout(0.1))

#Output layer
model2.add(tf.keras.layers.Dense(10, activation='softmax'))


# pass the optimizer object itself; the string 'adam' would use the default settings
adam = tf.keras.optimizers.Adam(learning_rate = 0.03, decay = 0.02)
model2.compile(optimizer = adam, loss = 'categorical_crossentropy', metrics = ['accuracy'])

model2.summary()

history = model2.fit(x_train, y_train,
                    validation_data = (x_test, y_test),
                    epochs = 50, batch_size = 32)

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
reshape_3 (Reshape)          (None, 784)               0         
_________________________________________________________________
batch_normalization_3 (Batch (None, 784)               3136      
_________________________________________________________________
dense_15 (Dense)             (None, 200)               157000    
_________________________________________________________________
dropout_6 (Dropout)          (None, 200)               0         
_________________________________________________________________
dense_16 (Dense)             (None, 100)               20100     
_________________________________________________________________
dropout_7 (Dropout)          (None, 100)               0         
_________________________________________________________________
dense_17 (Dense)             (None, 60)               

> Batch Normalization was introduced to normalise the data between hidden layers
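
The normalisation itself is simple arithmetic. The sketch below normalises one feature across a small batch in plain Python; `batch_norm` is a hypothetical helper, the pixel values are made up, and `eps = 1e-3` is assumed to mirror the Keras default. The real layer also learns `gamma` and `beta` and tracks moving averages for use at inference.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalise one feature over a batch, then scale by gamma and shift by beta."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Hypothetical values of a single pixel across a batch of four images
feature_values = [0.0, 128.0, 255.0, 64.0]
normalised = batch_norm(feature_values)
print(normalised)
```

After this step the feature has a mean close to 0 and a variance close to 1, whatever its original scale was.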

### 10. BatchNormalization on hidden layers

In [14]:
model3 = tf.keras.models.Sequential()

#reshaping 28*28 into 784
model3.add(tf.keras.layers.Reshape((784,), input_shape = (28, 28,)))

#normalising the data
model3.add(tf.keras.layers.BatchNormalization())

#1st hidden layer
model3.add(tf.keras.layers.Dense(200, activation='relu'))
model3.add(tf.keras.layers.BatchNormalization())

#2nd hidden layer
model3.add(tf.keras.layers.Dense(100, activation='relu'))
model3.add(tf.keras.layers.BatchNormalization())

#3rd hidden layer
model3.add(tf.keras.layers.Dense(60, activation='relu'))
model3.add(tf.keras.layers.BatchNormalization())

#4th hidden layer
model3.add(tf.keras.layers.Dense(30, activation='relu'))
model3.add(tf.keras.layers.BatchNormalization())

#Output layer
model3.add(tf.keras.layers.Dense(10, activation='softmax'))


# pass the optimizer object itself; the string 'adam' would use the default settings
adam = tf.keras.optimizers.Adam(learning_rate = 0.03, decay = 0.02)
model3.compile(optimizer = adam, loss = 'categorical_crossentropy', metrics = ['accuracy'])

model3.summary()

history = model3.fit(x_train, y_train,
                    validation_data = (x_test, y_test),
                    epochs = 50, batch_size = 32)

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
reshape_4 (Reshape)          (None, 784)               0         
_________________________________________________________________
batch_normalization_4 (Batch (None, 784)               3136      
_________________________________________________________________
dense_20 (Dense)             (None, 200)               157000    
_________________________________________________________________
batch_normalization_5 (Batch (None, 200)               800       
_________________________________________________________________
dense_21 (Dense)             (None, 100)               20100     
_________________________________________________________________
batch_normalization_6 (Batch (None, 100)               400       
_________________________________________________________________
dense_22 (Dense)             (None, 60)               

> optimizers
1. SGD
2. Adam
3. Nesterov
4. AdaDelta
5. Momentum
6. AdaGrad
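
The optimizers in this list differ mainly in how they turn a gradient into a weight update. As a minimal sketch, the code below writes out the plain SGD rule and the classical momentum rule by hand and applies both to the toy function f(w) = w**2. The function, the starting point, and the hyperparameters are hypothetical choices for illustration.

```python
def sgd_step(w, grad, lr):
    """Plain SGD: move against the gradient."""
    return w - lr * grad

def momentum_step(w, velocity, grad, lr, momentum=0.9):
    """Momentum: keep a running velocity and move along it."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# Minimise f(w) = w**2, whose gradient is 2 * w
w_sgd = 5.0
w_mom, v = 5.0, 0.0
for _ in range(200):
    w_sgd = sgd_step(w_sgd, 2 * w_sgd, lr=0.1)
    w_mom, v = momentum_step(w_mom, v, 2 * w_mom, lr=0.1)

print(w_sgd, w_mom)  # both end up close to the minimum at 0
```

Adam, AdaGrad, and AdaDelta go one step further and adapt the step size per weight, which is not shown here.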


# Analyse all the results. What are the differences?
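
Each `fit` call above stores its result in the same `history` variable, so only the last run is kept. One way to compare runs is to save each `history.history` dictionary under its own name and summarise them, as sketched below. The `best_epoch` helper is hypothetical, the metric key `val_accuracy` is an assumption (older releases may use `val_acc`), and the numbers are placeholders, not results from this notebook.

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return (epoch number, value) for the highest value of `metric`."""
    values = history_dict[metric]
    best = max(range(len(values)), key=values.__getitem__)
    return best + 1, values[best]

# Placeholder values only; replace with e.g. history_sigmoid.history
runs = {
    "sigmoid": {"val_accuracy": [0.10, 0.20, 0.15]},
    "relu": {"val_accuracy": [0.30, 0.25, 0.35]},
}

for name, hist in runs.items():
    epoch, value = best_epoch(hist)
    print(f"{name}: best val_accuracy {value:.2f} at epoch {epoch}")
```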

For more details on hyperparameters, check the official Keras site `https://keras.io/getting_started/`