
<p align="center">
  <img src="https://storage.googleapis.com/kaggle-datasets-images/2243/3791/9384af51de8baa77f6320901f53bd26b/dataset-cover.png" />
  Image source: https://www.kaggle.com/
</p>

## Stage 1: Installing dependencies and setting up the GPU environment (written in Google Colab; the .ipynb notebook was uploaded to GitHub)

In [1]:
!pip install tensorflow-gpu==2.0.0.alpha0

Collecting tensorflow-gpu==2.0.0.alpha0
  Downloading https://files.pythonhosted.org/packages/1a/66/32cffad095253219d53f6b6c2a436637bbe45ac4e7be0244557210dc3918/tensorflow_gpu-2.0.0a0-cp36-cp36m-manylinux1_x86_64.whl (332.1MB)
     |████████████████████████████████| 332.1MB 54kB/s 
Collecting keras-applications>=1.0.6
  Downloading https://files.pythonhosted.org/packages/71/e3/19762fdfc62877ae9102edf6342d71b28fbfd9dea3d2f96a882ce099b03f/Keras_Applications-1.0.8-py3-none-any.whl (50kB)
     |████████████████████████████████| 51kB 7.3MB/s 
Collecting tb-nightly<1.14.0a20190302,>=1.14.0a20190301
  Downloading https://files.pythonhosted.org/packages/a9/51/aa1d756644bf4624c03844115e4ac4058eff77acd786b26315f051a4b195/tb_nightly-1.14.0a20190301-py3-none-any.whl (3.0MB)
     |████████████████████████████████| 3.0MB 17.3MB/s 
Collecting tf-estimator-nightly<1.14.0.dev2019030116,>=1.14.0.dev2019030115
  Downloading https://files.pythonhosted.org/packages/13/82/f1

## Stage 2: Import dependencies for the project

In [None]:
import numpy as np
import datetime
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist

In [None]:
tf.__version__

## Stage 3: Dataset preprocessing



### Loading the dataset

In [None]:
# Load the Fashion MNIST dataset
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


### Image normalization

We divide each image in the training and test sets by the maximum pixel value (255).

This puts every pixel in the range [0, 1]. Normalizing the images helps our model (an ANN) train faster.

In [None]:
# Each pixel holds a value from 0 to 255; dividing by 255 rescales all pixel values to the range [0, 1]
X_train = X_train / 255.0

In [None]:
X_test = X_test / 255.0
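
As a quick sanity check of the normalization step, here is a minimal sketch on a tiny made-up array (the values are hypothetical and not taken from the dataset). It assumes the raw images are stored as 8-bit unsigned integers:

```python
import numpy as np

# Hypothetical 2x2 "image" with raw 8-bit intensities (not from Fashion MNIST).
raw = np.array([[0, 51], [204, 255]], dtype=np.uint8)

# Dividing by the float 255.0 converts the integers to floats in [0, 1].
scaled = raw / 255.0

print(scaled.dtype)
print(scaled.min(), scaled.max())
```

The smallest raw value maps to 0 and the largest to 1, so the relative brightness of the pixels is preserved.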

### Reshaping of the dataset

Since we are using a fully connected network, we reshape the training and test subsets into vector format. **Flattening** converts each image into a 1-dimensional array so that it can be fed to the next layer. Here each image is 2D (28 by 28 pixels) and has to be converted into a vector, so the number of columns is 28 × 28 = 784, while the number of rows is given as -1, meaning "all images" (60,000 for the training set).

In [None]:
# Each image is 28x28, so we reshape the full dataset to [-1 (all images), height * width]
X_train = X_train.reshape(-1, 28*28)

In [None]:
X_train.shape

(60000, 784)

In [None]:
#Reshape the testing subset in the same way
X_test = X_test.reshape(-1, 28*28)
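
To illustrate what `reshape(-1, 28*28)` does, here is a small sketch on a stand-in array of three hypothetical images (an assumption for illustration, not the real data). It shows that -1 is inferred as the number of images and that pixels are laid out row by row:

```python
import numpy as np

# Stand-in for the image tensor: 3 hypothetical 28x28 images with distinct values.
images = np.arange(3 * 28 * 28).reshape(3, 28, 28)

# -1 asks NumPy to infer the first dimension (the number of images).
flat = images.reshape(-1, 28 * 28)

print(flat.shape)  # (3, 784)

# Pixel (row r, column c) of an image lands at position 28 * r + c in its vector.
r, c = 2, 5
print(flat[1, 28 * r + c] == images[1, r, c])  # True
```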

## Stage 4: Building an Artificial Neural network

### Defining the model

Simply create an instance of the Sequential model. Sequential fits here because the network is a plain stack of layers in which each layer feeds only the next one.

In [None]:
model = tf.keras.models.Sequential()

### Adding the first layer (Dense layer)

Layer hyper-parameters:
- number of units/neurons: 128 (how many neurons we want in our first fully connected hidden layer)
- activation function: ReLU (introduces non-linearity, so the network can learn relationships in the image data that are not linear)
- input_shape: (784, ) (the shape of the input vector fed into the network: each 2D image flattened into a 1D array)

If we add a second hidden layer, it does not need input_shape, because it receives the output of the previous layer rather than the initial input values. We can also try a different number of units for it, for example 64.

In [None]:
model.add(tf.keras.layers.Dense(units=128, activation='relu', input_shape=(784, ))) # we use dense class as we want to have full connections between our layers

In [None]:
#model.add(tf.keras.layers.Dense(units=64, activation='relu'))

### Adding a Dropout layer 

Dropout is a regularization technique in which we randomly set the outputs of some neurons in a layer to zero. During that training step, the weights feeding those neurons are not updated. Because a different random subset of neurons is dropped at each step, the network cannot rely too heavily on any single neuron, which reduces the chance of overfitting, although training may take longer to converge.

In [None]:
model.add(tf.keras.layers.Dropout(0.2)) # randomly deactivate 20% (rate 0.2) of the previous layer's outputs at each training step
# in order to reduce overfitting
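
The idea behind dropout can be sketched in plain NumPy. This is an illustrative approximation, not the Keras implementation: the activation values and the seed are made up, and the rescaling by 1 / (1 - rate) follows the behaviour described in the Keras `Dropout` documentation.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable
rate = 0.2                           # fraction of units to drop, as in Dropout(0.2)

# Hypothetical activations coming out of a hidden layer (all ones for clarity).
activations = np.ones((1, 10))

# Training: keep each unit with probability 1 - rate and zero the others.
# Survivors are scaled by 1 / (1 - rate) so the expected sum stays the same.
keep_mask = rng.random(activations.shape) >= rate
train_out = activations * keep_mask / (1.0 - rate)

# Inference: dropout does nothing and passes the activations through unchanged.
test_out = activations

print(train_out)
print(test_out)
```

Each entry of `train_out` is either 0 (dropped) or 1.25 (kept and rescaled), while `test_out` is identical to the input.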

### Adding the second layer (output layer)

- units == number of classes (10 for Fashion MNIST: the dataset has 10 types of clothing, so the output layer has 10 neurons)
- activation = 'softmax' (the network returns a probability for each class, and the predicted class is the one with the highest probability)

In [None]:
model.add(tf.keras.layers.Dense(units=10, activation='softmax'))
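
To make the softmax step concrete, here is a minimal NumPy sketch. The `softmax` helper and the raw scores are hypothetical and written for illustration; they are not taken from the trained model.

```python
import numpy as np

def softmax(scores):
    """Turn a vector of raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

# Hypothetical raw scores for the 10 classes (not produced by the trained model).
scores = np.array([0.5, 2.0, -1.0, 0.0, 0.3, 4.0, 1.2, -0.5, 0.1, 0.8])
probs = softmax(scores)

print(round(float(probs.sum()), 6))  # 1.0
print(int(np.argmax(probs)))         # 5
```

The class with the largest raw score (index 5 here) also gets the largest probability, so taking the argmax of the softmax output gives the predicted class.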

### Compiling the model
compile is another method of the Sequential class.
- Optimizer: Adam (an optimizer updates the weights during training; Adam is a variant of stochastic gradient descent and a widely used default choice)
- Loss: Sparse categorical crossentropy

In [None]:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['sparse_categorical_accuracy'])
# the "sparse" variants are used because the labels are integer class indices (0-9) rather than one-hot vectors
# the loss is obtained by comparing the model's predicted outputs with the actual labels
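
As a rough sketch of what sparse categorical crossentropy computes, the hypothetical helper below takes integer labels and predicted probabilities and returns the mean negative log-probability of the true class. The numbers are made up, and the sketch omits details of the Keras version such as clipping probabilities away from zero.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs):
    """Mean negative log-probability assigned to each sample's true class index."""
    rows = np.arange(len(y_true))
    return float(np.mean(-np.log(probs[rows, y_true])))

# Made-up predictions for 2 samples over 3 classes (each row sums to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
y_true = np.array([0, 1])  # integer labels, not one-hot vectors

loss = sparse_categorical_crossentropy(y_true, probs)
print(loss)
```

The more probability the model puts on the correct class, the closer the loss gets to zero.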

In [None]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 128)               100480    
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1290      
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
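
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. A short sketch with a hypothetical helper, `dense_params`:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 128)  # first Dense layer: 784 inputs -> 128 units
output = dense_params(128, 10)   # output layer: 128 inputs -> 10 units

print(hidden, output, hidden + output)  # 100480 1290 101770
```

The Dropout layer contributes no parameters, which is why its row shows 0.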


### Training the model

In [None]:
model.fit(X_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7fb870342ef0>

### Model evaluation and prediction

In [None]:
test_loss, test_accuracy = model.evaluate(X_test, y_test)



In [None]:
print("Test accuracy: {}".format(test_accuracy))

Test accuracy: 0.8736000061035156
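
The accuracy metric and the prediction step can be sketched as follows. The probabilities below are made up for illustration; with the real model they would come from something like `model.predict(X_test)`. The class names follow the standard Fashion MNIST label order.

```python
import numpy as np

# Fashion MNIST label names; the list index is the integer label.
CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Made-up softmax outputs for 3 samples (each row sums to 1).
probs = np.full((3, 10), 0.02)
probs[0, 9] = 0.82
probs[1, 1] = 0.82
probs[2, 6] = 0.82

y_true = np.array([9, 1, 0])  # the third sample is deliberately misclassified

# The predicted class is the index of the highest probability in each row.
pred = probs.argmax(axis=1)

# Accuracy is the fraction of predictions that match the true labels.
accuracy = float(np.mean(pred == y_true))

print([CLASS_NAMES[i] for i in pred])  # ['Ankle boot', 'Trouser', 'Shirt']
print(accuracy)
```

Two of the three made-up predictions match their labels, so the accuracy here is 2/3.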


## Stage 5: Saving the model

### Saving the architecture (topology) of the network

In [None]:
model_json = model.to_json()
with open("fashion_model.json", "w") as json_file:
    json_file.write(model_json)

### Saving network weights

In [None]:
model.save_weights("fashion_model.h5")