# Fashion MNIST Classification Project

For this classification task, we will sort fashion images into their correct category; each image belongs to exactly one of ten categories. A task like this is usually handled with neural networks, but we will see how far the classical classification models we already know can take us.

In [1]:
import tensorflow as tf

import pandas as pd
import numpy as np

from sklearn.tree import DecisionTreeClassifier

from sklearn.model_selection import cross_val_score
from sklearn.model_selection import cross_val_predict

from sklearn.metrics import precision_score, recall_score, fbeta_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import roc_curve
from sklearn.metrics import roc_auc_score
from sklearn.metrics import precision_recall_curve

from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.feature_selection import SelectKBest
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import RandomizedSearchCV
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.feature_selection import VarianceThreshold
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import f_classif


# Loading the Data

In this section, we load the data and flatten each 28x28 image into a one-dimensional array of 784 values. The features for this task are the pixels, each stored as a number in that array.

In [21]:
fashion_mnist=tf.keras.datasets.fashion_mnist

In [22]:
(X_train,y_train),(X_test,y_test)=fashion_mnist.load_data()

In [23]:
test=np.array(X_test)

In [24]:
test.shape=(10000,28*28)

In [25]:
test

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=uint8)

In [26]:
train=np.array(X_train)

Here we flatten the training images to shape (60000, 784), one row per image, so that we can build a dataframe from them.

In [27]:
train.shape=(60000,784)

In [28]:
train

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=uint8)
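The same flattening can also be written with `reshape`, which returns the reshaped array instead of assigning to `shape` in place and lets NumPy infer the number of rows. The following is a minimal sketch on a small random stand-in array (the `images` array is made up for illustration and is not the real dataset):

```python
import numpy as np

# Stand-in for a batch of images: 5 random 28x28 uint8 "pictures".
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# -1 lets NumPy infer the number of rows; each row is one flattened image.
flat = images.reshape(-1, 28 * 28)

# Reshaping a row back to 28x28 recovers the original image.
first_image = flat[0].reshape(28, 28)

print(flat.shape)  # (5, 784)
```

Writing the row count as `-1` means the same line works for both the 60000-image training set and the 10000-image test set.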

In [29]:
df=pd.DataFrame(train)

In [30]:
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,774,775,776,777,778,779,780,781,782,783
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,1,0,0,0,0,...,119,114,130,76,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,22,...,0,0,1,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,33,96,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
59995,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
59996,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
59997,0,0,0,0,0,0,0,0,0,5,...,0,0,0,0,0,0,0,0,0,0
59998,0,0,0,0,0,0,0,0,0,0,...,66,54,50,5,0,1,0,0,0,0


As the dataframe shows, each feature holds the grayscale intensity of one pixel, stored as an integer from 0 to 255.

Now that the data is in the shape we need, we can hold out 20% of the training data as a validation set (X_test2, y_test2) and begin model testing.

In [31]:
X_train2, X_test2, y_train2, y_test2 = train_test_split(train,y_train, test_size=0.2, random_state=25)
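As a sanity check on a split like this, it can help to look at the resulting shapes and class counts. The sketch below uses a small made-up dataset and adds the `stratify` argument, which the notebook does not use; it is shown only as an option for keeping class proportions identical in both parts:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up stand-in data: 1000 blank "images", 100 per class for 10 classes.
X = np.zeros((1000, 784), dtype=np.uint8)
y = np.repeat(np.arange(10), 100)

# stratify=y is an optional extra (not in the original call): it keeps the
# class proportions the same in the training and held-out parts.
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.2, random_state=25, stratify=y
)

print(X_tr.shape, X_val.shape)
print(np.bincount(y_val))  # number of held-out samples per class
```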

# Model Testing

In [33]:
# dummy model, to see what a "bad score" would look like

dummy_model = DummyClassifier()
dummy_model.fit(X_train2,y_train2)

y_predict = dummy_model.predict(X_test2)

y_train_pred = cross_val_predict(dummy_model, X_train2, y_train2, cv=10)

precision = precision_score(y_train2, y_train_pred, average='micro')

recall = recall_score(y_train2, y_train_pred, average='micro')

accuracy = dummy_model.score(X_test2, y_test2)

print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"Accuracy:{accuracy}")


Precision: 0.09933333333333333
Recall: 0.09933333333333333
Accuracy:0.10175




In [14]:
dummy_model.score(test,y_test)

0.1

The dummy model scores very poorly, at about 0.1. With ten balanced categories, a classifier that ignores the pixels is right only about one time in ten. This gives us a very low bar that our real models need to clear.
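Precision and recall come out identical in the output above, and that is expected: with `average='micro'` on a problem where every sample has exactly one label, both reduce to plain accuracy. The following sketch illustrates this with made-up labels (the arrays are illustrative and not taken from the dataset); `average='macro'` is included only for contrast:

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Made-up labels for a 3-class, single-label problem.
y_true = np.array([0, 1, 2, 2, 1, 0, 2, 1])
y_pred = np.array([0, 2, 2, 2, 1, 1, 0, 1])

# Micro averaging pools the counts over all classes before dividing.
precision = precision_score(y_true, y_pred, average='micro')
recall = recall_score(y_true, y_pred, average='micro')
accuracy = accuracy_score(y_true, y_pred)

# Macro averaging scores each class separately and then averages,
# so it can differ from accuracy.
macro_precision = precision_score(y_true, y_pred, average='macro')

print(precision, recall, accuracy, macro_precision)
```

If per-class behaviour matters, macro averaging or a per-class breakdown says more than the micro-averaged numbers do.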

# Decision Tree Classifier

In [34]:
decision_tree = DecisionTreeClassifier()
decision_tree =decision_tree.fit(X_train2,y_train2)

In [35]:
decision_tree.score(X_test2,y_test2)

y_train_pred = cross_val_predict(decision_tree, X_train2, y_train2, cv=3)

precision = precision_score(y_train2, y_train_pred, average='micro')

recall = recall_score(y_train2, y_train_pred, average='micro')

accuracy = decision_tree.score(X_test2, y_test2)

print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"Accuracy:{accuracy}")

Precision: 0.7936458333333334
Recall: 0.7936458333333334
Accuracy:0.79625


These are strong scores, especially compared with the dummy model: precision, recall, and accuracy are all close to 0.8. Let's see if we can further improve this model. We will begin with feature selection.

In [32]:
def select_features(X_train, y_train, X_test,i):
    
    # configure to select a subset of features
    fs = SelectKBest(score_func=f_classif, k=i)
    
    # learn relationship from training data
    fs.fit(X_train, y_train)
    
    # transform train input data
    X_train_fs = fs.transform(X_train)
    
    # transform test input data
    X_test_fs = fs.transform(X_test)
        
    return X_train_fs, X_test_fs, fs

In [14]:
X_train_fs, X_test_fs, fs = select_features(X_train2, y_train2, X_test2,600)

Based on manual testing, we settled on 600 features: this gave the best results among the values we tried, while also shrinking the data the models have to process.
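The manual testing itself is not shown in the notebook. One way such a comparison could be automated is sketched below, on synthetic data and with an arbitrary list of candidate `k` values (both are assumptions for illustration). Wrapping the selector in a pipeline is a design choice of this sketch, not something the notebook does: it refits `SelectKBest` inside every cross-validation fold.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 300 samples, 40 features, 3 classes.
X, y = make_classification(n_samples=300, n_features=40, n_informative=10,
                           n_classes=3, random_state=0)

scores = {}
for k in (10, 20, 30, 40):  # hypothetical candidate values
    # The pipeline refits the selector on each training fold, so the
    # held-out fold never influences which features are kept.
    pipe = make_pipeline(SelectKBest(score_func=f_classif, k=k),
                         DecisionTreeClassifier(random_state=0))
    scores[k] = cross_val_score(pipe, X, y, cv=3, scoring='accuracy').mean()

best_k = max(scores, key=scores.get)
print(scores, best_k)
```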

In [38]:
decision_tree = DecisionTreeClassifier()
decision_tree =decision_tree.fit(X_train_fs,y_train2)

In [39]:
y_train_pred = cross_val_predict(decision_tree, X_train_fs, y_train2, cv=3)

precision = precision_score(y_train2, y_train_pred, average='micro')

recall = recall_score(y_train2, y_train_pred, average='micro')

accuracy = decision_tree.score(X_test_fs, y_test2)

print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"Accuracy:{accuracy}")

Precision: 0.7849375
Recall: 0.7849375
Accuracy:0.8003333333333333


There is a slight improvement in accuracy, but precision and recall have declined slightly. However, using fewer features is more efficient and can help reduce overfitting, so we will stick with 600 features.

From here, let's find the best hyperparameters.

In [50]:
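# Note: for DecisionTreeClassifier, 'auto' means the same as 'sqrt'. It was
# deprecated in scikit-learn 1.1 and later removed, so drop it on newer versions.
param_dist = {"max_depth": [3, 10, 20, None],
              "max_features": ['auto', 'sqrt'],
              "min_samples_leaf": [1, 2, 4],
              "criterion": ["gini", "entropy"]}
# The grid has only 4 * 2 * 3 * 2 = 48 combinations, so n_iter=100 ends up
# trying every one of them.
dt_random = RandomizedSearchCV(estimator = decision_tree, param_distributions = param_dist, n_iter = 100, cv = 3, verbose=3, random_state=42, n_jobs = -1, scoring='accuracy')
# Fit the random search model
dt_random.fit(X_train_fs, y_train2)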

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.


Fitting 3 folds for each of 48 candidates, totalling 144 fits


[Parallel(n_jobs=-1)]: Done  16 tasks      | elapsed:    1.1s
[Parallel(n_jobs=-1)]: Done 112 tasks      | elapsed:   30.0s
[Parallel(n_jobs=-1)]: Done 144 out of 144 | elapsed:   39.8s finished


RandomizedSearchCV(cv=3, estimator=DecisionTreeClassifier(), n_iter=100,
                   n_jobs=-1,
                   param_distributions={'criterion': ['gini', 'entropy'],
                                        'max_depth': [3, 10, 20, None],
                                        'max_features': ['auto', 'sqrt'],
                                        'min_samples_leaf': [1, 2, 4]},
                   random_state=42, scoring='accuracy', verbose=3)

In [51]:
dt_random.best_params_

{'min_samples_leaf': 4,
 'max_features': 'sqrt',
 'max_depth': 10,
 'criterion': 'entropy'}

These are the best hyperparameters among the combinations we searched, as measured by 3-fold cross-validated accuracy. We will now see how the tuned model scores.

In [56]:
dt_tuned = DecisionTreeClassifier(min_samples_leaf=4,max_features='sqrt',
                                  max_depth=10,criterion='entropy')

dt_tuned.fit(X_train_fs,y_train2)

y_train_pred = cross_val_predict(dt_tuned, X_train_fs, y_train2, cv=3)

precision = precision_score(y_train2, y_train_pred, average='micro')

recall = recall_score(y_train2, y_train_pred, average='micro')

accuracy = dt_tuned.score(X_test_fs, y_test2)

print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"Accuracy:{accuracy}")

Precision: 0.7840416666666666
Recall: 0.7840416666666666
Accuracy:0.788


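The hyperparameter tuning did not help: the tuned model scores slightly below both earlier trees on cross-validated precision and recall as well as on hold-out accuracy. One likely reason is that the search only tried square-root feature subsets for max_features, so the default setting (all features considered at each split) was never a candidate. Of the three decision trees, the one with feature selection has the best hold-out accuracy (0.800) and uses fewer features, so it is our winner, even though the untouched tree has slightly higher cross-validated precision and recall.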
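When candidates score this close together, it can help to look at the whole ranking rather than only `best_params_`. The sketch below does that on synthetic data, using `GridSearchCV` with a small made-up grid in place of the notebook's `RandomizedSearchCV`; the grid and data are assumptions for illustration only.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 300 samples, 20 features, 3 classes.
X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)

# Hypothetical grid: 3 * 2 = 6 candidates.
param_grid = {"max_depth": [3, 10, None], "min_samples_leaf": [1, 4]}
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid,
                      cv=3, scoring='accuracy')
search.fit(X, y)

# Mean and spread of the fold scores for every candidate, best first.
results = pd.DataFrame(search.cv_results_)[
    ["params", "mean_test_score", "std_test_score", "rank_test_score"]
].sort_values("rank_test_score")

# best_estimator_ is already refit on all of X, y with the winning settings,
# so the parameters do not need to be retyped by hand.
tuned = search.best_estimator_
print(results.head())
```

If the gap between the top candidates is smaller than `std_test_score`, the ranking between them is probably not meaningful.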

# K-Nearest Neighbour 

In [14]:
knn = KNeighborsClassifier()

knn.fit(X_train2,y_train2)

KNeighborsClassifier()

In [15]:
y_train_pred = cross_val_predict(knn, X_train2, y_train2, cv=3)

precision = precision_score(y_train2, y_train_pred, average='micro')

recall = recall_score(y_train2, y_train_pred, average='micro')

accuracy = knn.score(X_test2, y_test2)

print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"Accuracy:{accuracy}")

Precision: 0.8478125
Recall: 0.8478125
Accuracy:0.85125


Already, we are seeing an improvement from our decision tree model! We are getting a whopping 0.85 in terms of accuracy, and the precision and recall are also close! Let's see if we can further improve this model.

As we have manually checked before, 600 features is the optimal amount, so we will see how our model will do with these selected features.

In [33]:
X_train_fs, X_test_fs, fs = select_features(X_train2, y_train2, X_test2,600)

In [16]:
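# The cell defining knn2 is missing from this export; we assume it is a
# default KNN fitted on the 600 selected features.
knn2 = KNeighborsClassifier()
knn2.fit(X_train_fs, y_train2)

y_train_pred = cross_val_predict(knn2, X_train_fs, y_train2, cv=3)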

precision = precision_score(y_train2, y_train_pred, average='micro')

recall = recall_score(y_train2, y_train_pred, average='micro')

accuracy = knn2.score(X_test_fs, y_test2)

print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"Accuracy:{accuracy}")

Precision: 0.8492916666666667
Recall: 0.8492916666666667
Accuracy:0.8514166666666667


The improvement is marginal (0.85142 versus 0.85125 accuracy), but since this model also uses fewer features, we will use the feature-selected model.

In [101]:
knn2.score(X_test_fs,y_test2)

0.8514166666666667

In [98]:
knn.score(X_test2,y_test2)

0.85125

In [34]:
# Cross-validate KNN for K = 5..14 and rank the K values by mean accuracy
k2 = []
ind = range(5, 15)

for i in ind:
    knn = KNeighborsClassifier(n_neighbors=i)
    score = cross_val_score(knn, X_train_fs, y_train2, cv=2, scoring='accuracy')
    print(i)  # progress indicator
    k2.append(score.mean())
d2 = {'K': ind, 'Accuracy': k2}
dfd2 = pd.DataFrame(data=d2)
dfd2.sort_values('Accuracy', ascending=False, inplace=True, ignore_index=True)
dfd2.head(10)

5
6
7
8
9
10
11
12
13
14


| | K | Accuracy |
|---|---|---|
| 0 | 6 | 0.844854 |
| 1 | 5 | 0.844229 |
| 2 | 8 | 0.843313 |
| 3 | 10 | 0.842604 |
| 4 | 7 | 0.842542 |
| 5 | 9 | 0.841604 |
| 6 | 12 | 0.840417 |
| 7 | 11 | 0.839479 |
| 8 | 14 | 0.839479 |
| 9 | 13 | 0.839125 |


Let's see how our model scores with the best K value, 6!

In [35]:
knn3 = KNeighborsClassifier(n_neighbors=6)

knn3.fit(X_train_fs,y_train2)

KNeighborsClassifier(n_neighbors=6)

In [36]:
y_train_pred = cross_val_predict(knn3, X_train_fs, y_train2, cv=3)

precision = precision_score(y_train2, y_train_pred, average='micro')

recall = recall_score(y_train2, y_train_pred, average='micro')

accuracy = knn3.score(X_test_fs, y_test2)

print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"Accuracy:{accuracy}")

Precision: 0.8494375
Recall: 0.8494375
Accuracy:0.8539166666666667


As we can see, this hyperparameter tuning gives a slight improvement (0.8539 versus 0.8514 accuracy). This is our best KNN model.

# AdaBoost Classifier

In [13]:
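model = AdaBoostClassifier()
# evaluate the model
#cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# Caution: this fits on the full training set, which includes the X_test2 rows
# scored in the next cell, so that accuracy is optimistic. Fitting on
# X_train2, y_train2 would keep the hold-out set unseen.
model.fit(train,y_train)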

AdaBoostClassifier()

In [14]:
model.score(test,y_test)

y_train_pred = cross_val_predict(model, X_train2, y_train2, cv=3)

precision = precision_score(y_train2, y_train_pred, average='micro')

recall = recall_score(y_train2, y_train_pred, average='micro')

accuracy = model.score(X_test2, y_test2)

print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"Accuracy:{accuracy}")


Precision: 0.5017708333333334
Recall: 0.5017708333333334
Accuracy:0.5448333333333333


As these results show, the AdaBoost model is far weaker than the KNN model (about 0.50 cross-validated precision and recall, versus about 0.85), so it will not be our best model.

# Our Best Model 

Our best model is the KNN model with the 600 selected features and a K value of 6! 

This model has gotten us the following results:

Precision: 0.8494375
Recall: 0.8494375
Accuracy:0.8539166666666667

These results are great, given that we are not using neural networks and there are ten categories to choose from. We are satisfied with these results.
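A single accuracy figure does not show which categories account for the remaining errors. A confusion matrix is one way to look into that; the sketch below uses made-up labels for three classes (not the notebook's predictions) purely to show the mechanics:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up true and predicted labels for a 3-class problem.
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
y_pred = np.array([0, 2, 1, 1, 2, 0, 2, 1])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

# Per-class recall: correct predictions divided by the row total.
per_class_recall = cm.diagonal() / cm.sum(axis=1)

print(cm)
print(per_class_recall)
```

Applied to the notebook's own data, the same two calls on `y_train2` and `y_train_pred` would show which pairs of categories are confused most often.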