# Fashion Items Detector

### Importing Libraries

In [None]:
from PIL import Image
import tensorflow as tf
import tensorflow_datasets as tfds
import numpy as np
import matplotlib.pyplot as plt
import math

## Part 1 - Data Preprocessing

### Preprocessing the Training and Test sets

Here we load the 'fashion_mnist' dataset, which is accessible through the 'tensorflow_datasets' library.
Because we pass with_info=True, our call to tfds.load() returns a tuple of two items. The first is stored in 'dataset' and is a dictionary holding the labeled training and test data. The second is stored in 'metadata' and describes the dataset. Passing as_supervised=True makes each example an (image, label) pair.

A detailed overview of the 'tfds.load' function can be found at https://www.tensorflow.org/datasets/api_docs/python/tfds/load

In [None]:
dataset, metadata = tfds.load('fashion_mnist', as_supervised=True, with_info=True)

If we print the contents of dataset, we see two datasets labeled 'test' and 'train'. Both show a shape of (28, 28, 1), indicating that every image is 28 x 28 pixels and greyscale: the trailing 1 is the single color channel.

In [None]:
print(dataset)

Inside metadata we have detailed information about our datasets. Lower down you can see a key called 'splits', which holds the number of training and testing images.

In [None]:
print(metadata)

From 'dataset' we extract the training and testing data and store them in their own separate variables.

In [None]:
train_dataset,test_dataset = dataset['train'], dataset['test']

There are 10 different categories of items, but they are labeled numerically from 0 - 9. If we don't keep track of which number represents each item, we can't tell which item an image shows or interpret the NN's predictions without looking at the image itself. This is why the dataset documentation lists the category name for each number. In our code we declare a list 'class_names' holding the names of all labels, ordered so that the index of each name matches its numerical label.

On lines 3 and 4 we extract how many images are in each dataset from the metadata, then print them out on lines 6 and 7.

In [None]:
class_names = ['T-shirt/top','Trouser','Pullover','Dress','Coat','Sandal','Shirt','Sneaker','Bag','Ankle boot']

num_train_examples = metadata.splits['train'].num_examples
num_test_examples = metadata.splits['test'].num_examples

print("Training examples = {}".format(num_train_examples))
print("Test examples = {}".format(num_test_examples))
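
As a small illustration of why 'class_names' is ordered this way, the sketch below looks up the item name for a numeric label by using the label as a list index. The helper label_to_name and the sample labels are hypothetical and not part of the original notebook.

```python
# Same ordering as the notebook's class_names list: index i holds the name for label i.
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


def label_to_name(label):
    """Return the item name for a numeric label (hypothetical helper)."""
    return class_names[int(label)]


# Made-up labels, for illustration only.
sample_labels = [0, 5, 9]
sample_names = [label_to_name(label) for label in sample_labels]
print(sample_names)
```

The same lookup can be applied to the index of the largest value in a prediction vector to turn the NN's output into a readable name.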

### Implementing a normalize function

Each pixel in each image is a color value represented by an integer in the range 0 - 255. To keep all inputs on a small, common scale, we convert every pixel to the range 0 - 1. Our function does this by casting the image to float32 and dividing each pixel by 255.

In [None]:
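def normalize(images, label):
    # Cast from integer pixels to floats so the division below is not truncated.
    images = tf.cast(images, tf.float32)
    # Scale from the 0 - 255 range down to 0 - 1.
    images /= 255
    return images, label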
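
To see the effect of this scaling without TensorFlow, here is a minimal plain-Python sketch of the same arithmetic on a made-up 2 x 2 image. The function normalize_pixels and the sample values are assumptions for illustration; the notebook itself uses the tf-based normalize function above.

```python
def normalize_pixels(image):
    """Scale every pixel from the 0 - 255 range to the 0 - 1 range."""
    return [[pixel / 255 for pixel in row] for row in image]


# A made-up 2 x 2 greyscale image, for illustration only.
tiny_image = [[0, 51], [102, 255]]
scaled = normalize_pixels(tiny_image)
print(scaled)
```

The darkest pixel (0) stays at 0 and the brightest (255) becomes 1, with everything else falling in between.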

### Normalizing our datasets and then loading the data into memory

On lines 1 and 2 we use map() to apply the normalize function to every item in each dataset.

On lines 4 and 5 we cache our datasets in memory. Once an item has been read, it is served from memory afterwards, which significantly speeds up access to data that we read repeatedly throughout the training process.

In [None]:
train_dataset = train_dataset.map(normalize)
test_dataset = test_dataset.map(normalize)

train_dataset = train_dataset.cache()
test_dataset = test_dataset.cache()

## Part 2 - Building the CNN

### Building the layers

Note that the keras.layers classes take many arguments, most of which have default values. When we create an instance of each, we pass only the arguments we need to change for our desired outcome. For each class we use, there is a link to its documentation for a detailed overview.

#### layer0

We change the following attributes: Conv2D(filters, kernel_size, padding, activation, input_shape) ---> 2D Convolution

https://keras.io/api/layers/convolution_layers/convolution2d/

#### layer1

MaxPooling2D(pool_size, strides)

https://keras.io/api/layers/pooling_layers/max_pooling2d/

#### layer2

Conv2D(filters, kernel_size, padding, activation)

#### layer3

MaxPooling2D(pool_size, strides)

#### layer4

Flatten()

https://keras.io/api/layers/reshaping_layers/flatten/

#### layer5

Dense(units, activation)

https://keras.io/api/layers/core_layers/dense/

#### layer6

Dense(units, activation)

In [None]:
layer0 = tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation=tf.nn.relu, input_shape=(28, 28, 1))
layer1 = tf.keras.layers.MaxPooling2D((2, 2), strides=2)
layer2 = tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation=tf.nn.relu)
layer3 = tf.keras.layers.MaxPooling2D((2, 2), strides=2)
layer4 = tf.keras.layers.Flatten()
layer5 = tf.keras.layers.Dense(128, activation=tf.nn.relu)
layer6 = tf.keras.layers.Dense(10, activation=tf.nn.softmax)
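
To check how the image size changes through these layers, the sketch below redoes the shape and parameter arithmetic in plain Python. It assumes the settings used above ('same' padding with the default stride of 1 for the convolutions, and 2 x 2 pooling with a stride of 2). The helper names are hypothetical and not part of Keras.

```python
def conv_same(shape, filters, kernel):
    """Conv2D with padding='same' and stride 1: height and width are kept."""
    height, width, channels = shape
    params = kernel * kernel * channels * filters + filters  # weights + biases
    return (height, width, filters), params


def max_pool(shape, pool, stride):
    """MaxPooling2D without padding: it shrinks the image and has no parameters."""
    height, width, channels = shape
    new_height = (height - pool) // stride + 1
    new_width = (width - pool) // stride + 1
    return (new_height, new_width, channels), 0


def dense(units_in, units_out):
    """Dense layer: one weight per input/output pair plus one bias per output."""
    return units_out, units_in * units_out + units_out


shape, params0 = conv_same((28, 28, 1), filters=32, kernel=3)  # layer0
shape, params1 = max_pool(shape, pool=2, stride=2)             # layer1
shape, params2 = conv_same(shape, filters=64, kernel=3)        # layer2
shape, params3 = max_pool(shape, pool=2, stride=2)             # layer3
flat_size = shape[0] * shape[1] * shape[2]                     # layer4 (Flatten)
units, params5 = dense(flat_size, 128)                         # layer5
units, params6 = dense(units, 10)                              # layer6

total_params = params0 + params1 + params2 + params3 + params5 + params6
print(shape, flat_size, total_params)
```

Each pooling layer halves the height and width, so the 28 x 28 input is 7 x 7 by the time it reaches Flatten. Most of the parameters sit in the first Dense layer.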

### Building the model

The layers are passed to the Sequential API in order from input layer to output layer.

https://keras.io/api/models/sequential/

https://keras.io/api/models/model/


In [None]:
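# Stack the layers in order, from input to output.
model = tf.keras.Sequential([layer0, layer1, layer2, layer3, layer4, layer5, layer6])

model.summary()

# Integer labels (0 - 9) with a softmax output call for sparse categorical cross-entropy.
model.compile(optimizer='adam', loss=tf.keras.losses.SparseCategoricalCrossentropy(), metrics=['accuracy'])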
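
To make the loss concrete: with integer labels, sparse categorical cross-entropy for a single example is the negative log of the probability the model assigns to the true class. The sketch below computes that in plain Python for made-up probabilities. The helper sparse_cce is hypothetical and ignores details of the Keras implementation such as clipping and averaging over a batch.

```python
import math


def sparse_cce(probabilities, label):
    """Negative log of the probability assigned to the true class."""
    return -math.log(probabilities[label])


# Made-up softmax output over the 10 classes; it favors class 9 ('Ankle boot').
probs = [0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.55]

loss_if_true_is_9 = sparse_cce(probs, 9)  # the model favored the true class
loss_if_true_is_0 = sparse_cce(probs, 0)  # the model gave the true class little weight
print(loss_if_true_is_9, loss_if_true_is_0)
```

The loss is small when the true class receives a high probability and grows quickly as that probability approaches 0, which is what pushes the NN toward confident, correct predictions.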

## Part 3 - Training the CNN

In [None]:
BATCH_SIZE = 32

# The datasets were already cached above, so here we only repeat, shuffle, and batch.
train_dataset = train_dataset.repeat().shuffle(num_train_examples).batch(BATCH_SIZE)
test_dataset = test_dataset.batch(BATCH_SIZE)

# repeat() makes the training data endless, so steps_per_epoch defines the length of one epoch.
model.fit(train_dataset, epochs=4, steps_per_epoch=math.ceil(num_train_examples / BATCH_SIZE))

print("\n-------- Training is complete ---------\n")

test_loss, test_accuracy = model.evaluate(test_dataset, steps=math.ceil(num_test_examples / BATCH_SIZE))

print("\nAccuracy = {}\n".format(test_accuracy))
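
The steps_per_epoch and steps arguments come down to simple arithmetic. Assuming the standard Fashion-MNIST split sizes of 60,000 training and 10,000 test images (the notebook reads the actual counts from metadata), the sketch below shows how many batches of 32 one pass over each set takes.

```python
import math

BATCH_SIZE = 32
num_train_examples = 60000  # assumed standard Fashion-MNIST training split size
num_test_examples = 10000   # assumed standard Fashion-MNIST test split size

# Round up so a final, partially filled batch is still counted.
train_steps = math.ceil(num_train_examples / BATCH_SIZE)
test_steps = math.ceil(num_test_examples / BATCH_SIZE)

# Size of the last test batch when the split is not a multiple of BATCH_SIZE.
last_test_batch = num_test_examples - (test_steps - 1) * BATCH_SIZE
print(train_steps, test_steps, last_test_batch)
```

Rounding up with math.ceil matters for the test set: 10,000 is not a multiple of 32, so the last batch is only partially filled, and rounding down would leave those images out of the evaluation.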