
## We will work with Fashion MNIST
* Consists of 70,000 images of size 28 * 28 (784 pixels) with 10 different clothing categories.
* 60,000 images for training and 10,000 for testing
* We will build an ANN to classify these categories.
* As we are not using a CNN yet, we will flatten each 28 * 28 image into a vector of 784 values.
* For an ANN, the input layer is just a 1-D vector of numbers (here, 784 elements).
* Each pixel takes values from 0 to 255.
* All images are grayscale (a single channel).

## Installing dependencies and setting up a GPU environment

In [0]:
!pip install tensorflow-gpu==2.0.0.alpha0


We will use the Fashion MNIST dataset that ships with the TensorFlow library.

In [0]:
import tensorflow as tf
import numpy as np
from tensorflow.keras.datasets import fashion_mnist

## Data Preprocessing
* Loading the dataset
* Normalizing the images
* Reshaping the dataset (1-D format, 784 * 1)

### Loading the dataset

In [0]:
(X_train,y_train), (X_test,y_test) = fashion_mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


* X_train contains 60,000 images (all grayscale).
* y_train contains 60,000 labels for the images in X_train.
* X_test contains 10,000 images (all grayscale).
* y_test contains 10,000 labels for the images in X_test.
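
The labels in `y_train` and `y_test` are integers from 0 to 9. As a reference, the sketch below maps each index to the class name listed in the Fashion MNIST documentation. The names `class_names` and `label_to_name` are illustrative helpers, not part of the TensorFlow API.

```python
# Class names in label order, as listed in the Fashion MNIST documentation.
class_names = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(label):
    """Translate an integer label (0-9) into its clothing category."""
    return class_names[int(label)]

print(label_to_name(9))  # Ankle boot
```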

### Normalizing the images
* We divide each image in the training and test sets by the maximum value a pixel can take (255).
* This puts every pixel in the range [0,1]. Normalized inputs generally help the model train faster.

In [0]:
X_train = X_train/255.0

In [0]:
X_test = X_test/255.0
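
To see what the division does, here is a small sketch on a hypothetical 2 x 2 array (not real dataset pixels). It assumes the raw images are stored as `uint8`, which is how `fashion_mnist.load_data()` returns them.

```python
import numpy as np

# Hypothetical 2x2 "image" with the same dtype as the raw dataset (uint8).
raw = np.array([[0, 51], [128, 255]], dtype=np.uint8)

# Dividing by a float converts the array to floating point and
# maps the range [0, 255] onto [0, 1].
scaled = raw / 255.0

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```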

### Reshaping the dataset
* Since we are building a fully connected network, we reshape each image in the training set and the test set into a vector of 784 values.

In [0]:
# Each image is 28*28, so we reshape the full dataset to [-1, height*width].
# -1 lets NumPy infer the first dimension, so the number of images is kept (60000 for training, 10000 for test).
X_train = X_train.reshape(-1,28*28)

In [0]:
X_train.shape

(60000, 784)

In [0]:
# Similarly reshaping the X_test
X_test = X_test.reshape(-1,28*28)

In [0]:
X_test.shape

(10000, 784)
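
The reshape only changes how the same numbers are indexed. The sketch below uses a hypothetical stand-in of 3 dummy images to show how `-1` is inferred, where each pixel lands in the flattened vector, and that the operation can be undone.

```python
import numpy as np

# Hypothetical stand-in for the dataset: 3 "images" of 28 x 28 pixels.
images = np.arange(3 * 28 * 28, dtype=np.float64).reshape(3, 28, 28)

# -1 lets NumPy infer the first dimension (3 here) from the total size.
flat = images.reshape(-1, 28 * 28)
print(flat.shape)  # (3, 784)

# Rows of each image are laid end to end, so pixel (row r, column c)
# ends up at position r * 28 + c of the flattened vector.
r, c = 5, 7
assert flat[0, r * 28 + c] == images[0, r, c]

# The operation is lossless: reshaping back recovers the original images.
restored = flat.reshape(-1, 28, 28)
assert np.array_equal(restored, images)
```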

## Building an Artificial Neural Network

### Defining the model
* Simply create an object of the Sequential class.
* We are building a fully connected neural network, i.e. a linear stack of dense layers rather than an arbitrary computational graph, so we use the Sequential class.

In [0]:
model = tf.keras.models.Sequential()

### Adding a first fully-connected hidden layer
* Layer hyper-parameters
  * number of neurons/units = 128
  * activation function: ReLU, which breaks the linearity and helps the NN learn non-linear relationships
  * input shape = (784,), because the input layer feeds 784 values (one per pixel) into this layer

In [0]:
model.add(tf.keras.layers.Dense(units = 128, activation = 'relu', input_shape=(784,)))
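
As a rough illustration of what this layer computes, the NumPy sketch below applies an affine transform followed by ReLU to a hypothetical batch of 4 random inputs. The random weights and the names `x`, `W`, `b` are assumptions made for the example; Keras uses its own initializers internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch of 4 flattened images and randomly initialised parameters.
x = rng.random((4, 784))
W = rng.normal(scale=0.05, size=(784, 128))  # one weight per (input, unit) pair
b = np.zeros(128)                            # one bias per unit

# Dense layer: affine transform followed by ReLU, max(0, z).
z = x @ W + b
a = np.maximum(z, 0.0)

print(a.shape)          # (4, 128)
print(W.size + b.size)  # 100480
```

The parameter count, 784 * 128 weights plus 128 biases, is the number of trainable values this layer contributes to the model.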

### Adding a second layer with Dropout
Dropout is a regularization technique in which we randomly set a fraction of the neurons in a layer to zero (deactivate them) at each training step. The weights feeding those neurons are not updated during that step. Because only part of the network is updated at a time, training takes longer, but the model is less likely to __overfit__. The dropout rate is usually chosen between 20% and 50% (20% in this case).


In [0]:
model.add(tf.keras.layers.Dropout(0.2))
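
The idea can be sketched in NumPy as "inverted" dropout: a random mask zeroes roughly 20% of the activations, and the survivors are scaled by 1 / (1 - rate) so the expected sum stays the same. The `dropout` function below is a hypothetical helper written for illustration, not the Keras implementation.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a random subset, rescale the survivors."""
    if not training or rate == 0.0:
        return activations  # identity at inference time
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
a = np.ones((1000, 128))
dropped = dropout(a, rate=0.2, rng=rng)

# Every entry is either 0 (dropped) or 1 / (1 - 0.2) (kept and rescaled).
print(np.unique(dropped))
# The fraction of zeroed entries is close to the dropout rate.
print((dropped == 0).mean())
```

At inference time (`training=False`) the function returns its input unchanged, which mirrors how a dropout layer is only active during training.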

### Adding a second fully-connected hidden layer

In [0]:
model.add(tf.keras.layers.Dense(units = 64, activation = 'relu'))

### Adding a Dropout layer

In [0]:
model.add(tf.keras.layers.Dropout(0.2))

### Adding a third fully-connected hidden layer

In [0]:
model.add(tf.keras.layers.Dense(units = 32, activation = 'relu'))

### Adding the output layer
  * units: number of classes in your dataset (10 in Fashion MNIST)
  * activation: softmax, which turns the layer's outputs into a probability for each class. The input image is assigned to the class with the highest probability.
  * Once we define the input shape for the first hidden layer, we do not need to define it for the subsequent layers, because Keras infers each layer's input shape from the output of the previous layer.

In [0]:
model.add(tf.keras.layers.Dense(units = 10, activation = 'softmax'))
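
A minimal NumPy sketch of the softmax step, using hypothetical raw scores for a single image. The `softmax` helper is written here for illustration and is not the function Keras calls internally.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the max avoids overflow in exp."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical raw scores for one image over the 10 classes.
logits = np.array([[1.0, 0.5, 0.1, 2.0, 0.3, 0.0, 4.0, 0.2, 0.7, 1.5]])
probs = softmax(logits)

print(probs.sum(axis=-1))     # sums to 1 (up to floating-point error)
print(probs.argmax(axis=-1))  # [6]
```

The predicted class is the index of the largest probability, here index 6.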

## Compiling the model
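  * Optimizer: Adam, a variant of stochastic gradient descent with adaptive learning rates, which updates the weights using the gradients computed by backpropagation. Another option is __RMSProp__.
  * Loss: sparse categorical crossentropy (used because the labels are integer class indices rather than one-hot vectors)
  * Metrics: sparse categorical accuracy (accuracy for multi-class problems with integer labels)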
  

In [0]:
model.compile(optimizer = 'adam', loss = 'sparse_categorical_crossentropy', metrics = ['sparse_categorical_accuracy'])
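
To make the loss and metric concrete, here is a NumPy sketch of what they compute for integer labels. Both functions are simplified illustrations under the assumption that the model outputs probabilities, and the `eps` clipping value is an arbitrary choice for the example.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Mean of -log(probability assigned to the true class)."""
    picked = probs[np.arange(len(y_true)), y_true]
    return float(np.mean(-np.log(np.clip(picked, eps, 1.0))))

def sparse_categorical_accuracy(y_true, probs):
    """Fraction of samples whose most probable class equals the label."""
    return float(np.mean(probs.argmax(axis=-1) == y_true))

# Hypothetical predictions for 2 samples over 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
y_true = np.array([0, 1])  # integer labels, not one-hot vectors

print(sparse_categorical_crossentropy(y_true, probs))
print(sparse_categorical_accuracy(y_true, probs))  # 0.5
```

The first sample is classified correctly and the second is not, so the accuracy is 0.5, while the loss penalizes the second sample more heavily because only 0.3 was assigned to its true class.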

In [0]:
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_8 (Dense)              (None, 128)               100480    
_________________________________________________________________
dropout_4 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_9 (Dense)              (None, 64)                8256      
_________________________________________________________________
dropout_5 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_10 (Dense)             (None, 32)                2080      
_________________________________________________________________
dense_11 (Dense)             (None, 10)                330       
Total params: 111,146
Trainable params: 111,146
Non-trainable params: 0
________________________________________________
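
The parameter counts in the summary can be checked by hand: each dense layer has one weight per input-output pair plus one bias per unit, and the Dropout layers have none. The helper `dense_params` below is a hypothetical name used for this check.

```python
# Layer widths from the input through the output layer.
layer_sizes = [784, 128, 64, 32, 10]

def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

print(per_layer)       # [100480, 8256, 2080, 330]
print(sum(per_layer))  # 111146
```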

## Training the model

In [0]:
# batch size is an optional argument
model.fit(X_train, y_train,epochs = 10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f1e8e95e5c0>
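
Since no batch size is passed to `fit`, Keras falls back to its default of 32 for NumPy inputs. Under that assumption, the number of weight updates per epoch and over the whole run can be estimated as follows.

```python
import math

num_samples = 60000
batch_size = 32   # Keras default when batch_size is not passed to fit()
epochs = 10

# One optimizer step per batch; the last batch may be smaller than the rest.
steps_per_epoch = math.ceil(num_samples / batch_size)

print(steps_per_epoch)           # 1875
print(steps_per_epoch * epochs)  # 18750
```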

## Model evaluation and prediction

In [0]:
test_loss, test_accuracy = model.evaluate(X_test, y_test)



In [0]:
test_accuracy

0.8806

In [0]:
test_loss

0.32958651356697083

In [0]:
print("Test accuracy: {}".format(test_accuracy))

Test accuracy: 0.8805999755859375


## Saving the Model

### Saving the architecture (topology) of the network

In [0]:
model_json = model.to_json()
with open('fashion_model.json','w') as json_file:
  json_file.write(model_json)

### Saving Network Weights

In [0]:
model.save_weights('fashion_model.h5')