## Fashion-MNIST Dataset

The Fashion-MNIST dataset consists of **70,000 images** divided into:
- **60,000 training samples**
- **10,000 testing samples**

Each sample consists of a **28x28 grayscale image**, associated with a label from **10 classes**.

### The 10 Classes:
| Label | Class        |
|-------|--------------|
| 0     | T-shirt/top  |
| 1     | Trouser      |
| 2     | Pullover     |
| 3     | Dress        |
| 4     | Coat         |
| 5     | Sandal       |
| 6     | Shirt        |
| 7     | Sneaker      |
| 8     | Bag          |
| 9     | Ankle boot   |
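
The table above maps integer labels to class names. A minimal sketch of that lookup in Python follows; the names `CLASS_NAMES` and `label_to_name` are illustrative and do not appear elsewhere in this notebook.

```python
# Class names in label order, copied from the table above.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_to_name(label):
    """Return the class name for an integer label in the range 0..9."""
    return CLASS_NAMES[label]


# A few example labels translated to readable names.
example_names = [label_to_name(i) for i in [2, 9, 6, 0, 3]]
print(example_names)
```

A list is enough here because the labels are consecutive integers starting at 0.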

### Image Details:
- Each image is **28 pixels in height** and **28 pixels in width**, totaling **784 pixels**.
- Each pixel has a single pixel-value associated with it, indicating the **lightness or darkness** of that pixel.
- Pixel values range from **0 to 255**, with **higher numbers indicating darker pixels**.
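
To make the 784-pixel layout concrete, here is a small sketch of how a flat pixel position relates to a row and column in the 28x28 grid. It assumes row-major order (the first 28 values form the top row), which is also what a plain NumPy `reshape` to `(28, 28)` assumes; the helper names are hypothetical.

```python
HEIGHT, WIDTH = 28, 28


def flat_to_row_col(k):
    """Map a 0-based flat pixel index to (row, col), assuming row-major order."""
    return divmod(k, WIDTH)


def row_col_to_flat(row, col):
    """Inverse mapping: (row, col) back to the 0-based flat index."""
    return row * WIDTH + col


total_pixels = HEIGHT * WIDTH
print(total_pixels)            # 784
print(flat_to_row_col(0))      # (0, 0), the top-left pixel
print(flat_to_row_col(783))    # (27, 27), the bottom-right pixel
```

If the CSV columns are numbered from 1 (`pixel1` to `pixel784`), subtract 1 from the column number before calling `flat_to_row_col`.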


### Importing the Data

In [1]:
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os
import plotly.express as px
from sklearn.model_selection import train_test_split
from tensorflow.keras.regularizers import l2
from tensorflow.keras import layers
from tensorflow.keras.models import Model,Sequential
from tensorflow.keras.layers import BatchNormalization,Dense,Conv2D,Input,MaxPooling2D,Dropout,Flatten
from tensorflow.keras.optimizers import Adam

In [2]:
train = pd.read_csv("fashion-mnist_train.csv")
test = pd.read_csv("fashion-mnist_test.csv")

In [3]:
train.head() # getting the first 5 rows 

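|   | label | pixel1 | pixel2 | pixel3 | pixel4 | pixel5 | pixel6 | pixel7 | pixel8 | pixel9 | ... | pixel775 | pixel776 | pixel777 | pixel778 | pixel779 | pixel780 | pixel781 | pixel782 | pixel783 | pixel784 |
|---|-------|--------|--------|--------|--------|--------|--------|--------|--------|--------|-----|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|
| 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 | 0 | ... | 0 | 0 | 0 | 30 | 43 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 0 | ... | 3 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 4 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |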


In [4]:
print(f"The total number of the rows in the train data is {train.shape[0]}")
print(f"The total number of the columns in the train data is {train.shape[1]}")
print(f"The total number of the missing values in the train data is {train.isna().sum().sum()}")
print(f"The total number of the duplicated rows in the train data is {train.duplicated().sum()}")

The total number of the rows in the train data is 60000
The total number of the columns in the train data is 785
The total number of the missing values in the train data is 0
The total number of the duplicated rows in the train data is 43
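
The check above reports 43 duplicated rows, and the notebook keeps them. If you wanted to remove duplicates instead, a minimal sketch with pandas could look like this. The toy frame and its values are made up for illustration, and whether dropping duplicates is appropriate for this dataset is a judgment call.

```python
import pandas as pd

# Toy stand-in for the training frame: rows 0 and 2 are identical.
toy = pd.DataFrame(
    {
        "label": [0, 1, 0, 2],
        "pixel1": [0, 5, 0, 7],
        "pixel2": [3, 0, 3, 0],
    }
)

# Count rows that repeat an earlier row exactly.
n_duplicates = int(toy.duplicated().sum())

# Keep the first occurrence of each row and renumber the index.
deduped = toy.drop_duplicates().reset_index(drop=True)

print(n_duplicates)
print(deduped.shape)
```

Note that dropping rows changes the sample count, so any later reshape that hard-codes 60000 would need to use `-1` for the first dimension instead.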


In [5]:
X_train = train.drop("label", axis = 1) # drop the label column, keeping only the pixel columns
y_train = train["label"]

X_test = test.drop("label", axis = 1)
y_test = test["label"]
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

X_train = X_train.values 
X_test = X_test.values 
print(X_train.shape)
print(X_test.shape)

X_train = X_train.reshape(60000, 28, 28) 
X_test = X_test.reshape(10000, 28, 28) 
print(X_train.shape)
print(X_test.shape)
print(type(X_train))
print(type(X_test))

(60000, 784)
(60000,)
(10000, 784)
(10000,)
(60000, 784)
(10000, 784)
(60000, 28, 28)
(10000, 28, 28)
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>


In [6]:
px.imshow(X_train[20],color_continuous_scale='ice') 

In [7]:
px.imshow(X_train[44],color_continuous_scale='ice')

In [8]:
px.imshow(X_train[4], color_continuous_scale='ice')

In [9]:
print(X_train.max()) 
print(X_train.min()) 

255
0


### Scaling the Data

In [10]:
X_train = X_train / 255
X_test = X_test / 255

In [11]:
X_train = np.array(X_train)
X_test = np.array(X_test)
print(type(X_train))
print(type(X_test))

<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
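
Dividing by 255 maps the pixel range 0 to 255 onto 0 to 1. The sketch below shows the same scaling on a tiny made-up array, with an explicit cast to `float32`. The cast is an optional variation rather than what the cells above do: it roughly halves memory compared with the `float64` result that NumPy produces when an integer array is divided.

```python
import numpy as np

# Three example pixel values: minimum, mid-range, and maximum.
raw = np.array([[0, 128, 255]], dtype=np.uint8)

# Cast first, then divide, so the result stays float32.
scaled = raw.astype("float32") / 255.0

print(scaled.dtype)
print(scaled.min(), scaled.max())
```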


### Simple DNN

In [12]:
inputs = tf.keras.layers.Input(shape=(28,28)) 
x = tf.keras.layers.Flatten()(inputs) 
x = tf.keras.layers.Dense(128, activation='relu')(x) 
x = tf.keras.layers.Dense(10, activation='softmax')(x) 
model = Model(inputs=inputs, outputs=x)

In [13]:
model.compile(loss='sparse_categorical_crossentropy', metrics=['accuracy'], optimizer= Adam(learning_rate = 0.01))
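
The `sparse_categorical_crossentropy` loss takes integer labels (not one-hot vectors) and, for each sample, penalizes the negative log of the probability the model assigns to the true class. The following is a plain-Python sketch of that idea for illustration only; it is not the Keras implementation, and the probabilities are made up.

```python
import math


def sparse_categorical_crossentropy(y_true, probs):
    """Mean of -log(probability of the true class) over the batch."""
    losses = [-math.log(p[y]) for y, p in zip(y_true, probs)]
    return sum(losses) / len(losses)


# Two samples, three classes; each row of probabilities sums to 1.
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
]
labels = [0, 1]  # integer class indices, as in y_train

loss = sparse_categorical_crossentropy(labels, probs)
print(round(loss, 4))
```

A confident correct prediction (probability near 1 for the true class) contributes a loss near 0, while a confident wrong prediction contributes a large loss.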

In [14]:
# Train on the training data; after each epoch, run validation on X_test and y_test.
r = model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))

Epoch 1/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 15s 6ms/step - accuracy: 0.7765 - loss: 0.6594 - val_accuracy: 0.8374 - val_loss: 0.4630
Epoch 2/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 11s 6ms/step - accuracy: 0.8358 - loss: 0.4587 - val_accuracy: 0.8341 - val_loss: 0.4610
Epoch 3/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 9s 5ms/step - accuracy: 0.8482 - loss: 0.4248 - val_accuracy: 0.8324 - val_loss: 0.4812
Epoch 4/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 10s 5ms/step - accuracy: 0.8517 - loss: 0.4187 - val_accuracy: 0.8398 - val_loss: 0.4646
Epoch 5/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 10s 5ms/step - accuracy: 0.8573 - loss: 0.3963 - val_accuracy: 0.8513 - val_loss: 0.4453
Epoch 6/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 10s 5ms/step - accuracy: 0.8578 - loss: 0.4001 - val_accuracy: 0.8655 - val_loss: 0.3943
Epoch 7/10


In [15]:
results = pd.DataFrame(r.history)
results.tail()

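|   | accuracy | loss     | val_accuracy | val_loss |
|---|----------|----------|--------------|----------|
| 5 | 0.859717 | 0.397582 | 0.8655       | 0.394346 |
| 6 | 0.8642   | 0.385025 | 0.8551       | 0.427021 |
| 7 | 0.863533 | 0.383197 | 0.8634       | 0.42144  |
| 8 | 0.864783 | 0.377158 | 0.8613       | 0.424209 |
| 9 | 0.8658   | 0.378083 | 0.8632       | 0.41162  |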
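
The last epoch is not necessarily the best one. As a rough sketch, the best epoch can be read from the history frame with `idxmin` or `idxmax`. The values below are copied from the `results.tail()` output above, so this only considers epochs 5 to 9; on the full `results` frame the same calls would cover every epoch.

```python
import pandas as pd

# Validation metrics for the last five epochs, copied from the output above.
tail = pd.DataFrame(
    {
        "val_accuracy": [0.8655, 0.8551, 0.8634, 0.8613, 0.8632],
        "val_loss": [0.394346, 0.427021, 0.42144, 0.424209, 0.41162],
    },
    index=[5, 6, 7, 8, 9],
)

# Index label of the row with the lowest validation loss.
best_by_loss = int(tail["val_loss"].idxmin())

# Index label of the row with the highest validation accuracy.
best_by_accuracy = int(tail["val_accuracy"].idxmax())

print(best_by_loss, best_by_accuracy)
```

The history index is 0-based, so index 5 corresponds to the line labelled `Epoch 6/10` in the training log.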


In [16]:
fig = px.line(results, y=['accuracy', 'val_accuracy'], template="plotly_dark", color_discrete_sequence=['#7F00FF', '#00bfff'])
fig.update_layout(   
    title_font_color="#41BEE9", 
    xaxis=dict(color="#41BEE9",title='Epochs'), 
    yaxis=dict(color="#41BEE9")
    )
fig.show()

In [17]:
fig = px.line(results, y=['loss', 'val_loss'], template="plotly_dark", color_discrete_sequence=['#7F00FF', '#00bfff'])
fig.update_layout(   
    title_font_color="#41BEE9", 
    xaxis=dict(color="#41BEE9",title='Epochs'), 
    yaxis=dict(color="#41BEE9")
 )
fig.show()

Not bad, but it could be much better. Let's try a different architecture.

### CNN

In [18]:
inputs = Input((28,28,1)) 
              
x = Conv2D(filters = 32, kernel_size = 5, strides = 1, activation = 'relu' , kernel_regularizer=tf.keras.regularizers.l2(0.0005))(inputs)
x = Conv2D(filters = 32, kernel_size = 5, strides = 1, use_bias=False,activation='relu')(x) 
x = BatchNormalization()(x) 
x = MaxPooling2D(strides = 2)(x) 
x = Dropout(0.3)(x) 

x = Conv2D(filters = 64, kernel_size = 3, strides = 1, activation = 'relu', kernel_regularizer=l2(0.0005))(x)
x = Conv2D(filters = 64, kernel_size = 3, strides = 1, use_bias=False,activation='relu')(x)
x = BatchNormalization()(x) 
x = MaxPooling2D(strides = 2)(x)
x = Dropout(0.3)(x)
    
    
x = Flatten()(x)
x = Dense(units = 256, use_bias=False,activation='relu')(x)
x = BatchNormalization()(x) 

x = Dense(units = 128, use_bias=False,kernel_regularizer=l2(0.0005),activation='relu')(x)
x = BatchNormalization()(x)

x1 = Dense(units = 84, use_bias=False,kernel_regularizer=l2(0.0005),activation='relu')(x)
x = BatchNormalization()(x1)
x2 = Dropout(0.3)(x)
x = tf.keras.layers.Add()([x1,x2])

outputs = Dense(units = 10, activation = 'softmax')(x)
cnn_model = Model(inputs=inputs, outputs=outputs)
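
To see what reaches the `Flatten` layer, the spatial size can be traced by hand. This sketch assumes the Keras defaults that the cell above leaves unspecified: `padding='valid'` for `Conv2D` and a 2x2 pool size for `MaxPooling2D`. The helper `window_out` is hypothetical.

```python
def window_out(size, kernel, stride=1):
    """Output length of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1


size = 28
size = window_out(size, 5)       # Conv2D, kernel 5   -> 24
size = window_out(size, 5)       # Conv2D, kernel 5   -> 20
size = window_out(size, 2, 2)    # MaxPooling2D, 2x2  -> 10
size = window_out(size, 3)       # Conv2D, kernel 3   -> 8
size = window_out(size, 3)       # Conv2D, kernel 3   -> 6
size = window_out(size, 2, 2)    # MaxPooling2D, 2x2  -> 3

# The last convolution has 64 filters, so Flatten sees 3 * 3 * 64 values.
flat_features = size * size * 64
print(size, flat_features)       # 3 576
```

Calling `cnn_model.summary()` is the direct way to confirm these shapes for the actual model.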

In [19]:
cnn_model.compile(loss='sparse_categorical_crossentropy', metrics=['accuracy'], optimizer =Adam())
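
The CNN input is declared as `(28, 28, 1)`, while the arrays prepared earlier have shape `(N, 28, 28)`. The sketch below shows one way to make the channel axis explicit with NumPy. Whether Keras adds this axis automatically can depend on the version, so treat this as a defensive option rather than a required step; the array here is a small stand-in, not the real data.

```python
import numpy as np

# Stand-in for a batch of 4 grayscale images without a channel axis.
batch = np.zeros((4, 28, 28), dtype="float32")

# Append a trailing axis of length 1 for the single grayscale channel.
batch_with_channel = np.expand_dims(batch, axis=-1)

print(batch.shape)
print(batch_with_channel.shape)
```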

In [20]:
cnn = cnn_model.fit(X_train, y_train, epochs=50, validation_data=(X_test, y_test))

Epoch 1/50
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 94s 45ms/step - accuracy: 0.6857 - loss: 1.0828 - val_accuracy: 0.8533 - val_loss: 0.5406
Epoch 2/50
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 84s 45ms/step - accuracy: 0.8379 - loss: 0.5695 - val_accuracy: 0.8701 - val_loss: 0.4502
Epoch 3/50
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 83s 44ms/step - accuracy: 0.8651 - loss: 0.4618 - val_accuracy: 0.8791 - val_loss: 0.4015
Epoch 4/50
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 82s 44ms/step - accuracy: 0.8756 - loss: 0.4106 - val_accuracy: 0.8871 - val_loss: 0.3533
Epoch 5/50
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 82s 44ms/step - accuracy: 0.8849 - loss: 0.3799 - val_accuracy: 0.8957 - val_loss: 0.3360
Epoch 6/50
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 85s 45ms/step - accuracy: 0.8935 - loss: 0.3536 - val_accuracy: 0.9050 - val_loss: 0.3050
Epoc

In [21]:
results = pd.DataFrame(cnn.history)
results.tail()

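|    | accuracy | loss     | val_accuracy | val_loss |
|----|----------|----------|--------------|----------|
| 45 | 0.935733 | 0.2226   | 0.926        | 0.257501 |
| 46 | 0.935517 | 0.220909 | 0.9276       | 0.256978 |
| 47 | 0.935817 | 0.220666 | 0.9196       | 0.277936 |
| 48 | 0.937    | 0.221948 | 0.9272       | 0.253516 |
| 49 | 0.936583 | 0.221654 | 0.9299       | 0.248216 |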


In [22]:
fig = px.line(results, y=['accuracy', 'val_accuracy'], template="plotly_dark", color_discrete_sequence=['#7F00FF', '#00bfff'])
fig.update_layout(   
    title_font_color="#41BEE9", 
    xaxis=dict(color="#41BEE9",title='Epochs'), 
    yaxis=dict(color="#41BEE9")
 )
fig.show()

In [23]:
fig = px.line(results, y=['loss', 'val_loss'], template="plotly_dark", color_discrete_sequence=['#7F00FF', '#00bfff'])
fig.update_layout(   
    title_font_color="#41BEE9", 
    xaxis=dict(color="#41BEE9",title='Epochs'), 
    yaxis=dict(color="#41BEE9")
 )
fig.show()

These results are much better, but the loss still fluctuates from epoch to epoch.  
Let's try a learning rate scheduler that reduces the learning rate in later epochs.

### Learning Rate Scheduler

In [30]:
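def scheduler(epoch, lr):
    # Keep the initial learning rate for the first 10 epochs,
    # then multiply it by exp(-0.1) (about 0.905) at every later epoch.
    if epoch < 10:
        return float(lr)
    else:
        return float(lr * tf.math.exp(-0.1))

# Keras calls scheduler(epoch, current_lr) at the start of each epoch.
callback = tf.keras.callbacks.LearningRateScheduler(scheduler)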
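
To get a feel for how quickly this schedule decays, the rule can be replayed without TensorFlow. The sketch below uses `math.exp` in place of `tf.math.exp` and assumes a starting learning rate of 0.001, which is the Adam default and matches the `learning_rate: 0.0010` shown in the training log.

```python
import math


def scheduler(epoch, lr):
    """Same rule as above: hold for 10 epochs, then decay by exp(-0.1) per epoch."""
    if epoch < 10:
        return float(lr)
    return float(lr * math.exp(-0.1))


lr = 0.001  # assumed starting value (Adam default)
lr_per_epoch = []
for epoch in range(50):
    # Each epoch receives the learning rate produced by the previous one.
    lr = scheduler(epoch, lr)
    lr_per_epoch.append(lr)

print(lr_per_epoch[9])    # still 0.001
print(lr_per_epoch[10])   # first decayed value, about 0.000905
print(lr_per_epoch[49])   # after 40 decays, about 1.8e-05
```

Epochs 10 to 49 apply the decay 40 times, so the final learning rate is 0.001 times exp(-4), roughly 55 times smaller than the starting value.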

In [31]:
inputs = Input((28,28,1))
              
x = Conv2D(filters = 32, kernel_size = 5, strides = 1, activation = 'relu' , kernel_regularizer=tf.keras.regularizers.l2(0.0005))(inputs)
x = Conv2D(filters = 32, kernel_size = 5, strides = 1, use_bias=False,activation='relu')(x)
x = BatchNormalization()(x)
x = MaxPooling2D(strides = 2)(x)
x = Dropout(0.3)(x)

x = Conv2D(filters = 64, kernel_size = 3, strides = 1, activation = 'relu', kernel_regularizer=l2(0.0005))(x)
x = Conv2D(filters = 64, kernel_size = 3, strides = 1, use_bias=False,activation='relu')(x)
x = BatchNormalization()(x)
x = MaxPooling2D(strides = 2)(x)
x = Dropout(0.3)(x)
    
    
x = Flatten()(x)
x = Dense(units = 256, use_bias=False,activation='relu')(x)
x = BatchNormalization()(x)

x = Dense(units = 128, use_bias=False,kernel_regularizer=l2(0.0005),activation='relu')(x)
x = BatchNormalization()(x)

x1 = Dense(units = 84, use_bias=False,kernel_regularizer=l2(0.0005),activation='relu')(x)
x = BatchNormalization()(x1)
x2 = Dropout(0.3)(x)
x = tf.keras.layers.Add()([x1,x2])

outputs = Dense(units = 10, activation = 'softmax')(x)
cnn_model_v2 = Model(inputs=inputs, outputs=outputs)

In [32]:
cnn_model_v2.compile(loss='sparse_categorical_crossentropy', metrics=['accuracy'], optimizer =Adam())

In [33]:
cnn_v2 = cnn_model_v2.fit(X_train, y_train, epochs=50, validation_data=(X_test, y_test), callbacks=[callback])

Epoch 1/50
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 54s 26ms/step - accuracy: 0.6922 - loss: 1.0638 - val_accuracy: 0.8558 - val_loss: 0.5354 - learning_rate: 0.0010
Epoch 2/50
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 48s 25ms/step - accuracy: 0.8413 - loss: 0.5469 - val_accuracy: 0.8755 - val_loss: 0.4188 - learning_rate: 0.0010
Epoch 3/50
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 48s 26ms/step - accuracy: 0.8623 - loss: 0.4548 - val_accuracy: 0.8866 - val_loss: 0.3809 - learning_rate: 0.0010
Epoch 4/50
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 48s 25ms/step - accuracy: 0.8752 - loss: 0.4085 - val_accuracy: 0.8829 - val_loss: 0.3716 - learning_rate: 0.0010
Epoch 5/50
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 48s 26ms/step - accuracy: 0.8824 - loss: 0.3878 - val_accuracy: 0.8997 - val_loss: 0.3294 - learning_rate: 0.0010
Epoch 6/50
1875/1875 ━━━━━━━━━━━━━━━━