# fashion_mnist Classification

This notebook builds a deep artificial neural network for a classification problem using the fashion_mnist dataset.
The task is to predict one label out of 10 classes.
For more details about the dataset, see https://github.com/zalandoresearch/fashion-mnist.
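
For reference, the ten integer labels correspond to clothing categories listed in the dataset README linked above. A small lookup such as the following can make predictions easier to read; the names `CLASS_NAMES` and `label_to_name` are illustrative and are not used elsewhere in this notebook.

```python
# Class names in label order (0-9), as listed in the Fashion-MNIST README.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(label: int) -> str:
    """Return the human-readable class name for an integer label."""
    return CLASS_NAMES[label]

print(label_to_name(9))
```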

In [1]:
import numpy as np

In [2]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Input, Dropout
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.keras.backend import clear_session
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split

In [3]:
from hyperopt import Trials, STATUS_OK, tpe
from hyperas import optim
from hyperas.distributions import choice, uniform

## I- Simple Model

### I-1 Import Data

In [4]:
from tensorflow.keras.datasets import fashion_mnist
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

### I-2 Split Data

In [5]:
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=12345)

### I-3 Preprocess Data

In [6]:
X_train = X_train.reshape(48000, 784)
X_val = X_val.reshape(12000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_val = X_val.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_val /= 255
X_test /= 255
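
The cell above hard-codes the sample counts (48000, 12000, 10000), which follow from holding out 20% of the 60,000 training images. As a sketch only, the same flattening and scaling can be written so that the counts are inferred; `flatten_and_scale` is a hypothetical helper, and random arrays stand in for the real images.

```python
import numpy as np

def flatten_and_scale(images: np.ndarray) -> np.ndarray:
    """Flatten (n, 28, 28) uint8 images to (n, 784) float32 values in [0, 1]."""
    flat = images.reshape(len(images), -1)   # -1 infers 28 * 28 = 784
    return flat.astype("float32") / 255.0

# Stand-in data with the same shape and dtype as Fashion-MNIST images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

X = flatten_and_scale(fake_images)
print(X.shape, X.dtype)
```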

In [9]:
nb_classes = 10 
Y_train = to_categorical(y_train, nb_classes)
Y_val = to_categorical(y_val, nb_classes)
Y_test = to_categorical(y_test, nb_classes)
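
To make the effect of one-hot encoding concrete, here is a NumPy-only sketch that produces the same kind of output for integer labels. The `one_hot` function is a hypothetical stand-in for illustration, not the Keras implementation.

```python
import numpy as np

def one_hot(labels: np.ndarray, num_classes: int) -> np.ndarray:
    """Encode integer labels as one-hot rows (one 1.0 per row, zeros elsewhere)."""
    return np.eye(num_classes, dtype="float32")[labels]

y = np.array([0, 3, 9])   # three example labels
Y = one_hot(y, 10)
print(Y.shape)
print(Y[1])
```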

### I-4 Define the model

In [7]:
# Baseline: a single dense layer mapping 784 pixels to 10 class probabilities
model_0 = Sequential([
    Dense(10, input_shape=(784,), activation='softmax')
])
model_0.compile(optimizer=SGD(learning_rate=0.1),
                loss='categorical_crossentropy',
                metrics=['accuracy'])
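
The model above is a single dense layer followed by a softmax, trained with categorical cross-entropy. The NumPy sketch below shows what those two operations compute and how many trainable parameters the layer has. The weights here are random placeholders, and the function names are illustrative rather than Keras internals.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true: np.ndarray, y_prob: np.ndarray) -> float:
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    eps = 1e-7  # avoids log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_prob + eps), axis=1)))

# One dense layer: 784 weights per class plus one bias per class.
n_params = 784 * 10 + 10

rng = np.random.default_rng(0)
x = rng.random((4, 784))                      # stand-in for 4 flattened images
W = rng.normal(scale=0.01, size=(784, 10))    # illustrative random weights
b = np.zeros(10)
probs = softmax(x @ W + b)
y_true = np.eye(10)[[0, 1, 2, 3]]             # one-hot targets for classes 0..3
print(n_params, categorical_crossentropy(y_true, probs))
```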

### I-5 Train the model

In [10]:
history_0 = model_0.fit(X_train, Y_train,validation_data=(X_val,Y_val),verbose=2,epochs=100)

Epoch 1/100
1500/1500 - 4s - loss: 0.6177 - accuracy: 0.7888 - val_loss: 0.4983 - val_accuracy: 0.8323 - 4s/epoch - 2ms/step
Epoch 2/100
1500/1500 - 3s - loss: 0.4952 - accuracy: 0.8293 - val_loss: 0.4680 - val_accuracy: 0.8457 - 3s/epoch - 2ms/step
Epoch 3/100
1500/1500 - 3s - loss: 0.4715 - accuracy: 0.8371 - val_loss: 0.5036 - val_accuracy: 0.8306 - 3s/epoch - 2ms/step
Epoch 4/100
1500/1500 - 3s - loss: 0.4572 - accuracy: 0.8429 - val_loss: 0.4791 - val_accuracy: 0.8443 - 3s/epoch - 2ms/step
Epoch 5/100
1500/1500 - 3s - loss: 0.4547 - accuracy: 0.8446 - val_loss: 0.4679 - val_accuracy: 0.8363 - 3s/epoch - 2ms/step
Epoch 6/100
1500/1500 - 3s - loss: 0.4434 - accuracy: 0.8468 - val_loss: 0.4686 - val_accuracy: 0.8378 - 3s/epoch - 2ms/step
Epoch 7/100
1500/1500 - 5s - loss: 0.4377 - accuracy: 0.8492 - val_loss: 0.4622 - val_accuracy: 0.8422 - 5s/epoch - 3ms/step
Epoch 8/100
1500/1500 - 3s - loss: 0.4393 - accuracy: 0.8490 - val_loss: 0.4314 - val_accuracy: 0.8566 - 3s/epoch - 2ms/step


### I-6 Evaluate the model

In [None]:
# Evaluate the model on the test data using `evaluate`
print("Evaluate on test data")
results_0 = model_0.evaluate(X_test, Y_test)


In [12]:
print("train loss, train acc:", history_0.history['loss'][-1],history_0.history['accuracy'][-1])
print("val loss, val acc:", history_0.history['val_loss'][-1],history_0.history['val_accuracy'][-1])
print("test loss, test acc:", results_0)

train loss, train acc: 0.3795619606971741 0.8675833344459534
val loss, val acc: 0.5541159510612488 0.8213333487510681
test loss, test acc: [0.5904197096824646, 0.8105000257492065]


The model reaches about 87% accuracy on the training set, but only about 81% on the test set. This gap indicates overfitting.
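
The gap can be quantified directly from the numbers printed above. This is a minimal sketch; `accuracy_gap` is a hypothetical helper and the values are copied from the output, rounded to four decimals.

```python
def accuracy_gap(train_acc: float, eval_acc: float) -> float:
    """Return train accuracy minus evaluation accuracy, in percentage points."""
    return (train_acc - eval_acc) * 100

# Values copied from the output above (rounded to four decimals).
train_acc, val_acc, test_acc = 0.8676, 0.8213, 0.8105

print(f"train - val gap:  {accuracy_gap(train_acc, val_acc):.1f} points")
print(f"train - test gap: {accuracy_gap(train_acc, test_acc):.1f} points")
```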

### I-7 Save the model

In [13]:
save_path = "/content/gdrive/My Drive/Deep_Learning_bb_22/Fashion_minst"
import pickle
import os

# save the training history
with open(os.path.join(save_path, "history_fashion_mnist_v00.pckl"), 'wb') as f:
    pickle.dump(history_0.history, f)

# save entire network 
model_0.save(os.path.join(save_path,"network_fashion_mnist_v00.h5"))
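
To read a saved history back later, the pickle file can be loaded in the same way it was written. The sketch below round-trips a small stand-in dictionary through a temporary directory; the file name and values are made up for illustration.

```python
import os
import pickle
import tempfile

# Stand-in for history_0.history: a dict mapping metric names to per-epoch lists.
history_dict = {"loss": [0.62, 0.50], "val_loss": [0.50, 0.47]}

with tempfile.TemporaryDirectory() as tmp_dir:   # hypothetical save location
    path = os.path.join(tmp_dir, "history_example.pckl")

    # The context manager closes the file even if dump() raises.
    with open(path, "wb") as f:
        pickle.dump(history_dict, f)

    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == history_dict)
```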

## II- Dense Network Hyperparameter Tuning Using Hyperas

To reduce overfitting, one approach is to tune the hyperparameters of the model.

### II-1 Data Function

In [14]:
def data():
    (X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()
    X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=12345)
    X_train = X_train.reshape(48000, 784)
    X_val = X_val.reshape(12000, 784)
    X_train = X_train.astype('float32')
    X_val = X_val.astype('float32')
    X_train /= 255
    X_val /= 255
    nb_classes = 10
    Y_train = to_categorical(y_train, nb_classes)
    Y_val = to_categorical(y_val, nb_classes)
    return X_train, Y_train, X_val, Y_val

### II-2 Model Function

In [15]:
def model(X_train, Y_train, X_val, Y_val):
    
    model = Sequential()
    model.add(Dense({{choice([128, 256, 512, 1024])}}, input_shape=(784,), activation='relu'))
    #model.add(Activation({{choice(['relu', 'sigmoid'])}}))
    model.add(Dropout({{uniform(0, 1)}}))
    
    model.add(Dense({{choice([128, 256, 512, 1024])}},activation='relu' ))
    #model.add(Activation({{choice(['relu', 'sigmoid'])}}))
    model.add(Dropout({{uniform(0, 1)}}))
    
    if {{choice(['two', 'three'])}} == 'three':
        model.add(Dense({{choice([128, 256, 512, 1024])}},activation='relu'))
        #model.add(Activation({{choice(['relu', 'sigmoid'])}}))
        model.add(Dropout({{uniform(0, 1)}}))
        
    model.add(Dense(10))
    model.add(Activation('softmax'))
    
    optim = Adam(learning_rate={{choice([10**-3, 10**-2, 10**-1])}})
        
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'],optimizer=optim)
    model.fit(X_train, Y_train,
              batch_size={{choice([128,256,512])}},
              epochs=100,
              verbose=2,
              validation_data=(X_val, Y_val))
    score, acc = model.evaluate(X_val, Y_val, verbose=0)
    print('Validation accuracy:', acc)
    return {'loss': -acc, 'status': STATUS_OK, 'model': model}
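
Each `{{choice(...)}}` or `{{uniform(...)}}` placeholder in `model()` is replaced by a sampled value for every trial. To illustrate the shape of this search space, the sketch below draws configurations from it with the standard library. It uses plain random sampling rather than the TPE algorithm used in the search, and `sample_config` and its dictionary keys are hypothetical names.

```python
import random

UNITS = [128, 256, 512, 1024]
LEARNING_RATES = [1e-3, 1e-2, 1e-1]
BATCH_SIZES = [128, 256, 512]

def sample_config(rng: random.Random) -> dict:
    """Draw one configuration from the search space used in model()."""
    n_layers = rng.choice([2, 3])
    return {
        "units": [rng.choice(UNITS) for _ in range(n_layers)],
        "dropout": [rng.uniform(0, 1) for _ in range(n_layers)],
        "learning_rate": rng.choice(LEARNING_RATES),
        "batch_size": rng.choice(BATCH_SIZES),
    }

rng = random.Random(0)  # fixed seed so the draws are repeatable
for _ in range(3):
    print(sample_config(rng))
```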

In [16]:
###### A class to measure execution time

from contextlib import ContextDecorator
from dataclasses import dataclass, field
import time
from typing import Any, Callable, ClassVar, Dict, Optional

class TimerError(Exception):
    """A custom exception used to report errors in use of Timer class"""

@dataclass
class Timer(ContextDecorator):
    """Time your code using a class, context manager, or decorator"""

    timers: ClassVar[Dict[str, float]] = dict()
    name: Optional[str] = None
    text: str = "Elapsed time: {:0.4f} seconds"
    logger: Optional[Callable[[str], None]] = print
    _start_time: Optional[float] = field(default=None, init=False, repr=False)

    def __post_init__(self) -> None:
        """Initialization: add timer to dict of timers"""
        if self.name:
            self.timers.setdefault(self.name, 0)

    def start(self) -> None:
        """Start a new timer"""
        if self._start_time is not None:
            raise TimerError("Timer is running. Use .stop() to stop it")

        self._start_time = time.perf_counter()

    def stop(self) -> float:
        """Stop the timer, and report the elapsed time"""
        if self._start_time is None:
            raise TimerError("Timer is not running. Use .start() to start it")

        # Calculate elapsed time
        elapsed_time = time.perf_counter() - self._start_time
        self._start_time = None

        # Report elapsed time
        if self.logger:
            self.logger(self.text.format(elapsed_time))
        if self.name:
            self.timers[self.name] += elapsed_time

        return elapsed_time

    def __enter__(self) -> "Timer":
        """Start a new timer as a context manager"""
        self.start()
        return self

    def __exit__(self, *exc_info: Any) -> None:
        """Stop the context manager timer"""
        self.stop()

### II-3 Hyperparameter search

To tune the hyperparameters, we use the Hyperas package, which is a wrapper around Hyperopt.
To make Hyperas compatible with Hyperopt==0.2.7, change line 139 of optim.py in Hyperas from `rstate=np.random.RandomState(rseed),` to `rstate=np.random.default_rng(rseed),`.
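
A likely reason this change is needed is that the two NumPy objects expose different methods: the generator returned by `default_rng` has an `integers` method, while the legacy `RandomState` does not. The check below only demonstrates the NumPy side of this; how Hyperopt uses the object internally is an assumption here, and the seed is arbitrary.

```python
import numpy as np

seed = 42  # arbitrary seed for illustration

legacy = np.random.RandomState(seed)   # legacy API: offers randint()
modern = np.random.default_rng(seed)   # Generator API: offers integers()

print(hasattr(legacy, "integers"))
print(hasattr(modern, "integers"))
print(modern.integers(2**31 - 1))
```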

In [18]:
X_train, Y_train, X_val, Y_val = data()
with Timer(name="context manager"):
    
    best_run, best_model = optim.minimize(model=model,
                                      data=data,
                                      algo=tpe.suggest,
                                      #rstate=np.random.RandomState(42),
                                      max_evals=40,
                                      trials=Trials(),
                                      notebook_name='hyperas_opt_bb_v00')


[The output stream was truncated and contains only the last 5000 lines.]
Epoch 58/100
375/375 - 1s - loss: 0.2317 - accuracy: 0.9141 - val_loss: 0.3049 - val_accuracy: 0.9066 - 1s/epoch - 3ms/step
Epoch 59/100
375/375 - 1s - loss: 0.2362 - accuracy: 0.9119 - val_loss: 0.3102 - val_accuracy: 0.9060 - 1s/epoch - 3ms/step
Epoch 60/100
375/375 - 1s - loss: 0.2314 - accuracy: 0.9122 - val_loss: 0.3127 - val_accuracy: 0.9043 - 1s/epoch - 3ms/step
Epoch 61/100
375/375 - 1s - loss: 0.2267 - accuracy: 0.9141 - val_loss: 0.3229 - val_accuracy: 0.9039 - 1s/epoch - 3ms/step
Epoch 62/100
375/375 - 1s - loss: 0.2326 - accuracy: 0.9148 - val_loss: 0.3255 - val_accuracy: 0.9048 - 1s/epoch - 3ms/step
Epoch 63/100
375/375 - 1s - loss: 0.2314 - accuracy: 0.9150 - val_loss: 0.3230 - val_accuracy: 0.9050 - 1s/epoch - 3ms/step
Epoch 64/100
375/375 - 1s - loss: 0.2257 - accuracy: 0.9164 - val_loss: 0.3120 - val_accuracy: 0.9045 - 1s/epoch - 3ms/step
Epoch 65/100
375/375 - 1s

In [19]:
print(best_run)

{'Activation': 1, 'Activation_1': 1, 'Activation_2': 1, 'Dense': 2, 'Dense_1': 3, 'Dense_2': 0, 'Dropout': 0.2418971668220975, 'Dropout_1': 0.09607213889976557, 'Dropout_2': 0, 'Dropout_3': 0.14110612169191403, 'batch_size': 2, 'learning_rate': 0}
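
For `choice` parameters, `best_run` reports index positions rather than the chosen values. Assuming the key-to-parameter mapping below, which is inferred from the candidate lists in `model()` and covers only the unambiguous keys, the indices can be decoded as follows. `CHOICES`, `decode`, and `best_run_example` are hypothetical names.

```python
# Candidate lists copied from model(); the key names follow the best_run output.
CHOICES = {
    "Dense": [128, 256, 512, 1024],
    "Dense_1": [128, 256, 512, 1024],
    "Dense_2": [128, 256, 512, 1024],
    "batch_size": [128, 256, 512],
    "learning_rate": [10**-3, 10**-2, 10**-1],
}

def decode(best_run: dict) -> dict:
    """Replace choice indices with the values they refer to; leave other keys as-is."""
    return {
        key: CHOICES[key][value] if key in CHOICES else value
        for key, value in best_run.items()
    }

# Subset of the best_run dictionary printed above.
best_run_example = {"Dense": 2, "Dense_1": 3, "Dense_2": 0,
                    "Dropout": 0.2418971668220975,
                    "batch_size": 2, "learning_rate": 0}
print(decode(best_run_example))
```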


### II-4 Evaluate Model

In [20]:
print(best_model.evaluate(X_test, Y_test)) 

[0.4713289141654968, 0.9013000130653381]


## III- Best Model

### III-1 Train the final Model

In this step, we rebuild the final model with the hyperparameters found by the search. Furthermore, to help ensure that the final model does not overfit, we add an EarlyStopping callback.
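
The idea behind early stopping with `patience` and `min_delta` can be sketched in plain Python. This is a simplified illustration under the assumption that an epoch counts as an improvement only when the validation loss drops by more than `min_delta`; it is not the Keras implementation, and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=1e-3):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Simplified sketch: an epoch counts as an improvement only if the loss
    drops by more than min_delta below the best value seen so far.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch    # training would stop here
    return len(val_losses) - 1, best_epoch  # ran out of epochs without stopping

# Made-up validation losses for illustration.
losses = [1.0, 0.8, 0.7, 0.7005, 0.6995, 0.71, 0.72, 0.73, 0.5]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(stop_epoch, best_epoch)
```

With `restore_best_weights=True`, the weights from the best epoch are the ones kept once training stops, rather than those of the last epoch.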

In [21]:
from tensorflow.keras.callbacks import EarlyStopping

In [22]:
from tensorflow.keras.backend import clear_session
clear_session()
with Timer(name="context manager"): 

    model = Sequential()
    model.add(Dense(512, input_shape=(784,), activation='relu'))
    model.add(Dropout(0.24))

    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.1))

    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.02))

    model.add(Dense(10))
    model.add(Activation('softmax'))
    
    # Named `optimizer` so it does not shadow the imported `hyperas.optim` module
    optimizer = Adam(learning_rate=10**-3)
        
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer=optimizer)

    monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, 
        patience=5, verbose=1, mode='auto',
        restore_best_weights=True)
  
    history = model.fit(X_train,Y_train,validation_data=(X_val, Y_val),
        callbacks=[monitor], verbose=2,epochs=120, batch_size=512)



Epoch 1/120
94/94 - 1s - loss: 0.6431 - accuracy: 0.7691 - val_loss: 0.4160 - val_accuracy: 0.8562 - 980ms/epoch - 10ms/step
Epoch 2/120
94/94 - 0s - loss: 0.4120 - accuracy: 0.8501 - val_loss: 0.3577 - val_accuracy: 0.8732 - 387ms/epoch - 4ms/step
Epoch 3/120
94/94 - 0s - loss: 0.3701 - accuracy: 0.8646 - val_loss: 0.3352 - val_accuracy: 0.8791 - 395ms/epoch - 4ms/step
Epoch 4/120
94/94 - 0s - loss: 0.3377 - accuracy: 0.8748 - val_loss: 0.3130 - val_accuracy: 0.8888 - 394ms/epoch - 4ms/step
Epoch 5/120
94/94 - 0s - loss: 0.3246 - accuracy: 0.8804 - val_loss: 0.2995 - val_accuracy: 0.8919 - 471ms/epoch - 5ms/step
Epoch 6/120
94/94 - 0s - loss: 0.3075 - accuracy: 0.8858 - val_loss: 0.3081 - val_accuracy: 0.8858 - 389ms/epoch - 4ms/step
Epoch 7/120
94/94 - 0s - loss: 0.2956 - accuracy: 0.8899 - val_loss: 0.3098 - val_accuracy: 0.8855 - 462ms/epoch - 5ms/step
Epoch 8/120
94/94 - 0s - loss: 0.2853 - accuracy: 0.8932 - val_loss: 0.2910 - val_accuracy: 0.8968 - 377ms/epoch - 4ms/step
Epoch 9

In [23]:
# Evaluate the final model on the validation set
results = model.evaluate(X_val, Y_val, verbose=0)

print("train loss, train acc:", history.history['loss'][-1],history.history['accuracy'][-1])
print("val loss, val acc:", history.history['val_loss'][-1],history.history['val_accuracy'][-1])
print("val loss, val acc (evaluate):", results)

train loss, train acc: 0.23013243079185486 0.9135833382606506
val loss, val acc: 0.2804587483406067 0.9006666541099548
val loss, val acc (evaluate): [0.27978262305259705, 0.8995833396911621]


The final model reaches about 90% accuracy on the held-out validation data, close to its training accuracy of about 91%, so the gap seen with the first model has largely closed. Note that `results` is computed on `X_val`, not on `X_test`. On the test set, the tuned model from section II-4 also reaches about 90%, which is much better than the test accuracy of the first model (81%).

### III-2 Save the model 

In [24]:
save_path = "/content/gdrive/My Drive/Deep_Learning_bb_22/Fashion_minst"
#import pickle
#import os

# save the training history
with open(os.path.join(save_path, "history_fashion_mnist_vf.pckl"), 'wb') as f:
    pickle.dump(history.history, f)

# save entire network to HDF5 (save everything, suggested)
model.save(os.path.join(save_path,"network_fashion_mnist_vf.h5"))