In [4]:
# !pip install tensorflow  # uncomment if TensorFlow is not installed

# Importing necessary libraries

In [79]:
import tensorflow as tf
from tensorflow.keras import layers, models  # layers and model containers
from tensorflow.keras.optimizers import SGD, Adam, RMSprop, Adamax  # optimizers to compare

### Load the Dataset (built-in Keras MNIST dataset)

In [80]:
mnist = tf.keras.datasets.mnist  # Reference to the built-in MNIST dataset module

### Pre-process the dataset and split it into train and test sets

In [81]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [82]:
train_images.shape

(60000, 28, 28)

In [83]:
test_images.shape

(10000, 28, 28)

# Normalizing

In [84]:

print('Max Pixel:',train_images.max()) # Checking highest pixel values for train_images
print('Min Pixel:',train_images.min()) # Checking min pixel values for train_images


Max Pixel: 255
Min Pixel: 0


In [85]:

print('Max Pixel:', test_images.max())  # Checking highest pixel value for test_images
print('Min Pixel:', test_images.min())  # Checking lowest pixel value for test_images


Max Pixel: 255
Min Pixel: 0


### Here we normalize the pixel values by dividing the train and test image sets by 255
so that every pixel value lies between 0 and 1.  
Inputs on a small, consistent scale generally help the model train faster and more stably.

In [86]:

train_images = train_images / 255  # divide by the highest pixel value (255) to scale to [0, 1]
test_images = test_images / 255  # apply the same scaling to the test images


In [87]:
# Check train_images again after normalizing
print('Max Pixel:',train_images.max()) 
print('Min Pixel:',train_images.min())


Max Pixel: 1.0
Min Pixel: 0.0


In [88]:
# Check test_images again after normalizing
print('Max Pixel:',test_images.max())
print('Min Pixel:',test_images.min())


Max Pixel: 1.0
Min Pixel: 0.0


# Build the Neural Network


model = models.Sequential([  
    layers.Flatten(input_shape = (28,28)),  **flatten each 28 x 28 array into a 1D array of 784 values**  
    layers.Dense(128, activation = 'relu'), **one hidden layer with 128 units; a common choice that can be adjusted if needed**  
    layers.Dense(10, activation='softmax') **output layer with 10 units, one for each of the 10 MNIST classes**   
])
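
As a rough check on the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes the usual Dense layout of one weight per input-output pair plus one bias per unit. The helper name `dense_params` is made up for illustration and is not part of Keras.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

flattened = 28 * 28                    # Flatten only reshapes, so it adds no parameters
hidden = dense_params(flattened, 128)  # Dense(128, activation='relu')
output = dense_params(128, 10)         # Dense(10, activation='softmax')
total = hidden + output

print(flattened, hidden, output, total)  # 784 100480 1290 101770
```

The total should agree with the parameter count that `model.summary()` reports for the same three layers.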


**Defining the optimizers**

In [133]:

optimizers = [SGD(), Adam(), RMSprop(), Adamax()]  # Store the optimizers in a list so they can be tried one after another


### Compiling the model and storing the results in lists
accuracy_list  
loss_list  
**loss function = 'sparse_categorical_crossentropy'**

In [134]:
accuracy_list = []  # test accuracy for each optimizer
loss_list = []  # test loss for each optimizer

In [135]:
for optimizer in optimizers:
    # Build a fresh network for each optimizer so every run starts from new weights
    model = models.Sequential([
        layers.Flatten(input_shape=(28, 28)),
        layers.Dense(128, activation='relu'),
        layers.Dense(10, activation='softmax'),
    ])

    model.compile(optimizer=optimizer,
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    # Train the model
    model.fit(train_images, train_labels, epochs=10, verbose=0)

    # Evaluate on the held-out test set
    loss_metric, accuracy = model.evaluate(test_images, test_labels)

    # Store the loss and accuracy for comparison later
    accuracy_list.append(accuracy)
    loss_list.append(loss_metric)




# Evaluate the loss and accuracy for each optimizer

In [136]:
opt =['SGD', 'Adam', 'RMSprop', 'Adamax']

for i in range(len(opt)):
    print('OPTIMIZER: ',opt[i], '\tLoss Metric: ',loss_list[i], '\tAccuracy: ', accuracy_list[i])


OPTIMIZER:  SGD 	Loss Metric:  0.16052822768688202 	Accuracy:  0.9535999894142151
OPTIMIZER:  Adam 	Loss Metric:  0.07821965217590332 	Accuracy:  0.977400004863739
OPTIMIZER:  RMSprop 	Loss Metric:  0.08780968934297562 	Accuracy:  0.9787999987602234
OPTIMIZER:  Adamax 	Loss Metric:  0.08700641244649887 	Accuracy:  0.9750999808311462


### After evaluating this simple Keras network with the four optimizers, Adam performs best overall in this run. RMSprop's accuracy is slightly higher (0.9788 vs. 0.9774), but Adam has the lowest loss (0.0782 vs. 0.0878).  
# Finally, we can say that for this network Adam is the best optimizer for the MNIST dataset
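
The comparison can also be made programmatically. The sketch below simply copies the four test-set results printed above into a dictionary (the name `results` is chosen here for illustration) and picks the optimizer with the lowest loss and the one with the highest accuracy.

```python
# Test-set results copied from the printed output above (one run per optimizer).
results = {
    "SGD":     {"loss": 0.16052822768688202, "accuracy": 0.9535999894142151},
    "Adam":    {"loss": 0.07821965217590332, "accuracy": 0.977400004863739},
    "RMSprop": {"loss": 0.08780968934297562, "accuracy": 0.9787999987602234},
    "Adamax":  {"loss": 0.08700641244649887, "accuracy": 0.9750999808311462},
}

lowest_loss = min(results, key=lambda name: results[name]["loss"])
highest_accuracy = max(results, key=lambda name: results[name]["accuracy"])

print("Lowest loss:", lowest_loss)            # Adam
print("Highest accuracy:", highest_accuracy)  # RMSprop
```

The two criteria pick different optimizers, and the gap between Adam and RMSprop is small. Each optimizer was trained only once, so repeating the runs would give a better idea of whether the difference is stable.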

# Description of how I configured the network

As per the instructions, I had to build a simple neural network that can classify images from the MNIST dataset. To do that:   
*Firstly*, I imported the necessary libraries: TensorFlow and the Keras layers, models, and optimizers.    

*Secondly*, I loaded the dataset, which comes already split into train and test sets.    

Then I checked the shape of the dataset and found its maximum and minimum pixel values. After that, I normalized the pixel values to the range 0 to 1 by dividing them by their highest value (255).  

After preprocessing, I built the network to train on the dataset. For a dataset like MNIST we generally do not need a very deep network, so I did not use layers such as Conv2D or MaxPooling2D.    

I kept the model simple.  
I used **Flatten** to turn each 28 x 28 array into a 1D array of 784 values.  
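
To make the flattening step concrete, here is a minimal pure-Python sketch of the same reshaping. The `image` grid is a made-up stand-in for one MNIST sample, not real data.

```python
rows, cols = 28, 28

# Hypothetical stand-in for one image: a 28 x 28 grid whose values are 0..783.
image = [[r * cols + c for c in range(cols)] for r in range(rows)]

# Row-major flattening: the rows are laid end to end, one after another.
flat = [pixel for row in image for pixel in row]

print(len(flat))                 # 784
print(flat[28] == image[1][0])   # True: the second row starts at index 28
```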

For the hidden Dense layer I used 128 units; other sizes can be used too. I used the **ReLU** activation function for this hidden layer. The reason for using **ReLU** is that it outputs zero for negative inputs, so not all neurons are active at the same time, and it passes positive inputs through unchanged, so its output is always 0 or greater.  
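
As a small illustration of what ReLU computes, the sketch below applies `max(0, x)` to a handful of made-up inputs. The function `relu` is written here by hand for clarity and is not the Keras implementation.

```python
def relu(x: float) -> float:
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)

inputs = [-2.0, -0.5, 0.0, 0.5, 3.0]   # hypothetical pre-activation values
outputs = [relu(x) for x in inputs]

print(outputs)  # [0.0, 0.0, 0.0, 0.5, 3.0]
```

Negative inputs are switched off, while positive inputs keep their value, so the output has no upper bound.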

For the output layer I used 10 units because the MNIST dataset has 10 classes. The activation function here is **softmax**, which is suitable for multi-class classification because it produces a probability distribution over the classes. I did not use **sigmoid** because it is generally used for binary classification problems.
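
The following sketch shows how softmax turns ten raw scores into a probability distribution. The `logits` values are invented for illustration, and the hand-written `softmax` function is a simplified version rather than the Keras one.

```python
import math

def softmax(logits):
    """Exponentiate each score and normalize so the results sum to 1."""
    largest = max(logits)                              # subtracted for numerical stability
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the 10 digit classes (0 through 9).
logits = [0.1, 0.2, 3.0, 0.0, -1.0, 0.5, 0.3, -0.2, 1.0, 0.4]
probs = softmax(logits)

predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(predicted_class)  # 2
print(sum(probs))       # 1.0 up to floating-point rounding
```

The class with the largest score receives the largest probability, so taking the index of the maximum gives the predicted digit.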

Before compiling the model, I created a list called `optimizers` to store the optimizers imported from Keras so that I could run them all in a loop. I used four optimizers, **SGD, Adam, RMSprop, Adamax**, with **sparse_categorical_crossentropy** as the loss function. I also tried **MeanSquaredError**, but learned that it is typically used for regression models, so I continued with **sparse_categorical_crossentropy**.
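
To show what this loss measures, here is a minimal sketch for a single sample: the loss is the negative log of the probability the model assigns to the true class, and "sparse" means the label is an integer index rather than a one-hot vector. The probabilities below are invented, and this hand-written function leaves out details of the Keras version such as clipping probabilities away from zero and averaging over a batch.

```python
import math

def sparse_categorical_crossentropy(probs, label):
    """Loss for one sample: -log of the probability given to the true class."""
    return -math.log(probs[label])

probs = [0.05, 0.05, 0.7, 0.2]   # hypothetical predicted distribution over 4 classes

confident_and_right = sparse_categorical_crossentropy(probs, 2)
confident_and_wrong = sparse_categorical_crossentropy(probs, 0)

print(round(confident_and_right, 3))  # 0.357
print(round(confident_and_wrong, 3))  # 2.996
```

The loss is small when the true class gets a high probability and grows quickly when it gets a low one.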

After training (fitting) the model with the four different optimizers, I concluded that the **Adam** optimizer performs best overall.   
Its **Loss Metric = 0.07821965217590332 & Accuracy = 0.977400004863739**