<a href="https://colab.research.google.com/github/Monalika-P/Complete-guide-to-tensorflow-2.0/blob/master/Building_an_Artificial_Neural_Network_in_TensorFlow_2_0.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Author - Monalika P

# About the dataset

Fashion-MNIST can be used as a drop-in replacement for the original MNIST dataset (10 categories of handwritten digits). It shares the same image size (28x28) and the same split structure: 60,000 training images and 10,000 test images.
The fashion_mnist dataset has ten categories to classify:

| Label | Description |
|-------|-------------|
| 0 | T-shirt/top |
| 1 | Trouser |
| 2 | Pullover |
| 3 | Dress |
| 4 | Coat |
| 5 | Sandal |
| 6 | Shirt |
| 7 | Sneaker |
| 8 | Bag |
| 9 | Ankle boot |
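
The labels listed above can be kept in a plain Python list indexed by label, which makes later output easier to read. This is only a minimal sketch: the names `class_names` and `label_to_name` are assumptions for illustration and do not appear in the notebook.

```python
# Hypothetical helper: position in the list = integer label in the dataset
class_names = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(label):
    """Return the class name for an integer label in the range 0-9."""
    return class_names[int(label)]

print(label_to_name(9))  # Ankle boot
```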

# Import the dependencies

In [1]:
import numpy as np
import datetime
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist

In [2]:
tf.__version__

'2.3.0'

# Data Preprocessing

## Loading the dataset

In [3]:
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


## Normalizing the images

Normalizing the images is highly recommended because neural networks generally train faster when the inputs are on a small, consistent scale. Since each pixel value lies in the range 0 to 255, the normalization can be done by simply dividing the pixels by 255, which rescales every pixel to the range 0 to 1.

In [4]:
X_train = X_train / 255.0

In [5]:
X_test = X_test / 255.0
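
The effect of the division can be checked on a tiny array. The sketch below uses a made-up `pixels` array (an assumption, standing in for a batch of images) to show that the result is a floating-point array with values between 0 and 1.

```python
import numpy as np

# Toy stand-in for a batch of images: two 2x2 "images" with uint8 pixel values
pixels = np.array([[[0, 128], [255, 64]],
                   [[32, 16], [8, 255]]], dtype=np.uint8)

scaled = pixels / 255.0  # true division promotes uint8 to float64

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```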

## Reshaping the dataset

X_train contains 60,000 images. It is a 3D tensor of shape (60000, 28, 28), where the first dimension gives the index of the image and the other two are the dimensions of the array containing the pixels of that image.
Here, each 2D array is flattened into a 1D vector, because the dense layers used below expect one feature vector per image.







In [7]:
X_train = X_train.reshape(-1, 28*28)

-1 tells NumPy to infer the size of the first dimension, so all the images of the training set are reshaped.
28*28 = 784 is the number of columns you want to get in X_train. Since each image has 28x28 pixels and we want to flatten them into a single 1D row vector, 784 columns are obtained.

In [9]:
X_train.shape

(60000, 784)

The output obtained here is a 2D array where the first dimension contains the index of the image and the second dimension is a single vector containing all the pixels of the image. So, basically, each row corresponds to one image and the columns contain the pixels of that image.
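
The behaviour of `reshape(-1, 28*28)` can be tried on a small stand-in array. In this sketch, `images` is a hypothetical batch of 5 blank images rather than the real training set.

```python
import numpy as np

# Toy stand-in for X_train: 5 blank "images" of 28x28 pixels
images = np.zeros((5, 28, 28))

# Mark one pixel: row 1, column 0 of the first image
images[0, 1, 0] = 1.0

# -1 lets NumPy infer the first dimension (5) from the total number of elements
flat = images.reshape(-1, 28 * 28)

print(flat.shape)   # (5, 784)
print(flat[0, 28])  # 1.0, because pixel (1, 0) lands at index 1 * 28 + 0
```

Rows are laid out one after another (row-major order), so pixel (r, c) of an image ends up in column r * 28 + c of the flattened row.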

In [11]:
X_test = X_test.reshape(-1, 28*28) # Reshaping the test set

In [12]:
X_test.shape

(10000, 784)

# Building an Artificial Neural network

## Defining the model

A fully connected neural network, which is a sequence of dense layers, is built here. Because the layers are stacked one after another, the **Sequential** class is used.

In [13]:
model = tf.keras.models.Sequential()

## Adding the first layer (Dense layer)

Layer hyper-parameters:
- number of units/neurons: 128
- activation function: ReLU
- input_shape: (784, )

In [14]:
model.add(tf.keras.layers.Dense(units=128, activation='relu', input_shape=(784, )))
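
To make the layer's computation concrete, here is a NumPy sketch of what a dense layer with ReLU computes in its forward pass: `max(0, x @ W + b)`. The weights `W` and `b` below are random stand-ins (an assumption); Keras initializes and trains its own.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights with the same shapes as the layer above
W = rng.normal(scale=0.05, size=(784, 128))  # kernel: one column per unit
b = np.zeros(128)                            # bias: one value per unit

def dense_relu(x, W, b):
    """Forward pass of a dense layer followed by ReLU."""
    return np.maximum(0.0, x @ W + b)

x = rng.random((4, 784))   # batch of 4 flattened images
h = dense_relu(x, W, b)
print(h.shape)             # (4, 128)
```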

## Adding a Dropout layer 

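Dropout is a **regularization technique** where we randomly set the outputs of some neurons in a layer to zero during training. In other words, a fraction of the neurons (20% here) is deactivated at each training step, and their weights are not updated during backpropagation for that step. This is done to **reduce overfitting**. Dropout is not applied when the model is used for evaluation or prediction.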

In [16]:
model.add(tf.keras.layers.Dropout(0.2))
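
The idea can be sketched in NumPy as "inverted" dropout, where the surviving activations are scaled by 1 / (1 - rate) so that their expected value stays the same. The function below is an illustrative assumption, not the Keras implementation.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training:
        return x  # dropout is a no-op at inference time
    keep_mask = rng.random(x.shape) >= rate   # True with probability 1 - rate
    return x * keep_mask / (1.0 - rate)       # rescale the units that are kept

rng = np.random.default_rng(0)
activations = np.ones((1000, 128))

dropped = dropout(activations, rate=0.2, training=True, rng=rng)
# Kept units become 1 / (1 - 0.2) = 1.25, dropped units become 0
print(np.unique(dropped))
print(dropped.mean())   # close to 1.0 on average
```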

## Adding the output layer

- units = 10 (number of classes/labels)
- activation = softmax

In [17]:
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
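
Softmax turns the 10 raw outputs of this layer into probabilities that sum to 1. A minimal NumPy sketch, using made-up `logits` as input (an assumption for illustration):

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical raw outputs for one image, one value per class
logits = np.array([[2.0, 1.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])
probs = softmax(logits)

print(probs.sum(axis=-1))     # each row sums to 1 (up to rounding)
print(probs.argmax(axis=-1))  # [0], the class with the largest logit
```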

## Compiling the model

- Optimizer: Adam
- Loss: Sparse categorical crossentropy (the labels are integers from 0 to 9 rather than one-hot vectors)

The optimizer updates the weights during stochastic gradient descent, using the gradients obtained by backpropagating the loss through the neural network.

In [18]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
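
For intuition, sparse categorical crossentropy is the mean of -log(p) over the samples, where p is the probability the model assigns to the true class. The NumPy sketch below uses made-up probabilities and a hypothetical `eps` clipping value; it is not the Keras implementation.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Mean of -log(p[true class]); y_true holds integer labels, not one-hot vectors."""
    rows = np.arange(len(y_true))
    p_true = np.clip(probs[rows, y_true], eps, 1.0)  # avoid log(0)
    return float(-np.log(p_true).mean())

# Hypothetical predictions for 3 samples over 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4]])
y_true = np.array([0, 1, 0])

# Mean of -log(0.7), -log(0.8) and -log(0.3)
loss = sparse_categorical_crossentropy(y_true, probs)
print(loss)
```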

In [19]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 128)               100480    
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290      
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
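_________________________________________________________________

Note that the summary lists two Dropout layers, while the code shown above adds only one. This suggests that the Dropout cell was executed twice before the summary was generated. Dropout layers have no parameters, so the parameter counts are unaffected.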
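
The parameter counts in the summary can be verified by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The helper name `dense_params` is an assumption for illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 128)  # first Dense layer
output = dense_params(128, 10)   # output layer

print(hidden, output, hidden + output)  # 100480 1290 101770
```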


## Training the model

In [21]:
model.fit(X_train, y_train, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7fed3ea34e10>
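
Since `batch_size` is not passed to `fit`, Keras falls back to its default of 32 samples per batch. Under that assumption, the number of weight updates per epoch can be worked out as follows.

```python
import math

n_samples = 60000
batch_size = 32   # Keras default when `batch_size` is not passed to fit()
epochs = 10

steps_per_epoch = math.ceil(n_samples / batch_size)

print(steps_per_epoch)           # 1875
print(steps_per_epoch * epochs)  # 18750 weight updates in total
```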

## Model evaluation and prediction

In [22]:
test_loss, test_accuracy = model.evaluate(X_test, y_test)



In [26]:
print("Test accuracy:",test_accuracy)

Test accuracy: 0.880299985408783


In [27]:
print("Loss:",test_loss)

Loss: 0.34236401319503784
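
For prediction, the class with the highest softmax probability is taken as the predicted label. The sketch below uses made-up probabilities in place of the output of `model.predict(X_test[:3])`; the arrays `probs` and `y_true` and the list `class_names` are assumptions for illustration.

```python
import numpy as np

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Hypothetical softmax outputs for 3 test images (each row sums to 1)
probs = np.array([
    [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.02, 0.90],
    [0.02, 0.01, 0.85, 0.01, 0.05, 0.01, 0.03, 0.01, 0.005, 0.005],
    [0.01, 0.93, 0.01, 0.01, 0.01, 0.01, 0.005, 0.005, 0.005, 0.005],
])
y_true = np.array([9, 4, 1])  # hypothetical true labels

predicted = probs.argmax(axis=1)             # most probable class per image
accuracy = float((predicted == y_true).mean())

print([class_names[i] for i in predicted])   # ['Ankle boot', 'Pullover', 'Trouser']
print(accuracy)                              # 2 of 3 predictions are correct
```

The accuracy computed this way is what the `sparse_categorical_accuracy` metric measures: the fraction of images whose most probable class equals the integer label.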


# Saving the model

## Saving the architecture (topology) of the network

In [28]:
model_json = model.to_json()
with open("fashion_model.json", "w") as json_file:
    json_file.write(model_json)

## Saving network weights

In [29]:
model.save_weights("fashion_model.h5")
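
The two saved files can later be combined to restore the model. The sketch below assumes the TensorFlow 2.3 API used in this notebook and the file names from the cells above. The function name `load_fashion_model` is hypothetical, and newer Keras releases may differ in the weight file names they accept.

```python
def load_fashion_model(json_path="fashion_model.json",
                       weights_path="fashion_model.h5"):
    """Rebuild the architecture from JSON, then load the saved weights into it."""
    import tensorflow as tf  # imported lazily; assumes TensorFlow is installed

    with open(json_path) as json_file:
        model = tf.keras.models.model_from_json(json_file.read())
    model.load_weights(weights_path)

    # The JSON holds only the architecture, so compile again before evaluate()
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["sparse_categorical_accuracy"])
    return model

# Usage, assuming both files exist in the working directory:
# restored = load_fashion_model()
# restored.evaluate(X_test, y_test)
```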