<h1 style="font-size:300%">Clothing Classifier</h1>

This example notebook covers the basics of computer vision: it trains a multilayer perceptron on a simple image classification task, using the [clothing-dataset-small](https://github.com/alexeygrigorev/clothing-dataset-small) by Alexey Grigorev. Because this notebook serves as a mere example, we only use the data in the `train` folder of this dataset and split that again into a train and a test set. Moreover, scikit-learn's `MLPClassifier` can hold out part of its training input as a validation set, although it only does so when `early_stopping=True`. As an exercise, the reader can use more data by putting everything together in one folder and updating the `data_folder` variable below.

Bas S.H.T. Michielsen MSc

In [None]:
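import os
import platform

import numpy
import sklearn
from utils import *  # notebook helpers such as download_and_extract_repo and load_labelled_images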
data_path = "data"
data_folder = "train"  # the introduction states that only the `train` folder is used
print("python", platform.python_version(), "| scikit-learn", sklearn.__version__)

# Download the dataset

In [None]:
data_repo_user = "alexeygrigorev"
data_repo_name = "clothing-dataset-small"
if not os.path.exists(os.path.join(data_path, data_repo_name)):
    download_and_extract_repo(data_repo_user, data_repo_name, data_path)
    print("Downloaded dataset", data_repo_name)
else:
    print("The dataset", data_repo_name, "was found locally, no need to download it.")
data_path = os.path.join(data_path, data_repo_name, data_folder)
print("Using", data_path)
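
The helper `download_and_extract_repo` lives in the notebook's `utils` module, which is not shown here. As a rough sketch of what such a helper might do, the functions below fetch a repository zip from GitHub and unpack it under the repository name. The function names, the `branch="master"` default and the renaming step are assumptions made for illustration, not the actual implementation.

```python
import io
import os
import urllib.request
import zipfile


def extract_repo_archive(archive_bytes, repo_name, target_dir):
    """Unpack a repository zip and rename its top-level folder to repo_name."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        # GitHub archives wrap everything in one folder such as "<repo>-<branch>".
        top_level = archive.namelist()[0].split("/")[0]
        archive.extractall(target_dir)
    final_path = os.path.join(target_dir, repo_name)
    os.rename(os.path.join(target_dir, top_level), final_path)
    return final_path


def download_repo_sketch(user, repo_name, target_dir, branch="master"):
    """Hypothetical stand-in for the helper in utils; the branch name is assumed."""
    url = f"https://github.com/{user}/{repo_name}/archive/refs/heads/{branch}.zip"
    with urllib.request.urlopen(url) as response:
        return extract_repo_archive(response.read(), repo_name, target_dir)
```

Keeping the extraction separate from the download makes it possible to check the unpacking logic without network access.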

# Loading the images

In [None]:
size = 256
images, labels = load_labelled_images(data_path, size)
print("Loaded", len(images), "images in the following", len(numpy.unique(labels)), "classes:")
for label in numpy.unique(labels):
    print(label)
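
The loader `load_labelled_images` also comes from `utils`, so its internals are not visible in this notebook. Assuming the data folder holds one subfolder per class, the labels could be derived from the folder names as sketched below. The function `list_labelled_files` and the demo folder are hypothetical, and the sketch stops before the actual image decoding and resizing.

```python
import os
import tempfile


def list_labelled_files(root):
    """Return (path, label) pairs, using each subfolder name as the label."""
    pairs = []
    for label in sorted(os.listdir(root)):
        class_dir = os.path.join(root, label)
        if not os.path.isdir(class_dir):
            continue  # skip stray files next to the class folders
        for name in sorted(os.listdir(class_dir)):
            pairs.append((os.path.join(class_dir, name), label))
    return pairs


# Tiny stand-in for the dataset folder, built in a temporary directory.
demo_root = tempfile.mkdtemp()
for label, names in {"hat": ["a.jpg"], "pants": ["b.jpg", "c.jpg"]}.items():
    os.makedirs(os.path.join(demo_root, label))
    for name in names:
        open(os.path.join(demo_root, label, name), "w").close()

demo_pairs = list_labelled_files(demo_root)
demo_file_labels = [label for _, label in demo_pairs]
print(demo_file_labels)
```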

# Make train and test sets

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(images, labels, test_size=.3, random_state=0, stratify=labels)
print("train dataset size:", len(X_train), "| test dataset size:", len(X_test))
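
The `stratify=labels` argument keeps the class proportions the same in both parts of the split. A minimal sketch with made-up, imbalanced labels (not the clothing data) shows the effect:

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 80 of one class, 20 of another.
demo_labels = ["t-shirt"] * 80 + ["hat"] * 20
demo_items = list(range(len(demo_labels)))

_, _, demo_y_train, demo_y_test = train_test_split(
    demo_items, demo_labels, test_size=.3, random_state=0, stratify=demo_labels
)

# Both parts keep the 80/20 ratio of the full label list.
print("train:", Counter(demo_y_train))
print("test:", Counter(demo_y_test))
```

Without stratification, a rare class could end up under-represented in, or even missing from, the test set.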

# Setting a baseline

In [None]:
from sklearn.neighbors import KNeighborsClassifier
model = KNeighborsClassifier()
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
print("Accuracy:", score)
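
To put a baseline accuracy in perspective, it can help to know what always predicting the most frequent class would score. The sketch below uses scikit-learn's `DummyClassifier` on made-up labels; it is an optional extra reference point, not part of the original notebook.

```python
import numpy
from sklearn.dummy import DummyClassifier

# Hypothetical data: the features are irrelevant to a majority-class predictor.
demo_X = numpy.zeros((10, 4))
demo_y = numpy.array(["pants"] * 7 + ["hat"] * 3)

floor = DummyClassifier(strategy="most_frequent")
floor.fit(demo_X, demo_y)

# score() returns the mean accuracy, here the share of the majority class.
floor_accuracy = floor.score(demo_X, demo_y)
print("Majority-class accuracy:", floor_accuracy)
```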

# Modelling

In [None]:
from sklearn.neural_network import MLPClassifier
model = MLPClassifier(random_state=0)  # fixed seed so the reported accuracy is reproducible
model.fit(X_train, y_train)
score = model.score(X_test, y_test)
print("Accuracy:", score)
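
The multilayer perceptron above runs with default settings. The sketch below shows two common adjustments on scikit-learn's bundled digits dataset (used here only so the example runs without the clothing images): feature scaling in a pipeline, and `early_stopping=True`, which makes `MLPClassifier` hold out `validation_fraction` of its training input for validation. The variable names and parameter values are illustrative assumptions, not tuned choices.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

demo_X, demo_y = load_digits(return_X_y=True)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    demo_X, demo_y, test_size=.3, random_state=0, stratify=demo_y
)

demo_mlp = make_pipeline(
    StandardScaler(),  # put all pixel features on a comparable scale
    MLPClassifier(early_stopping=True, validation_fraction=0.1, random_state=0),
)
demo_mlp.fit(Xd_train, yd_train)

demo_accuracy = demo_mlp.score(Xd_test, yd_test)
print("Accuracy:", demo_accuracy)

# With early stopping enabled, one validation score is recorded per epoch.
demo_scores = demo_mlp.named_steps["mlpclassifier"].validation_scores_
print("Epochs trained:", len(demo_scores))
```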

In [None]:
from sklearn.metrics import classification_report
predictions = model.predict(X_test)
report = classification_report(y_test, predictions)
print(report)
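
The classification report summarises precision and recall per class. A confusion matrix complements it by showing which classes get mistaken for which. A minimal sketch with made-up labels (not the notebook's predictions):

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels for six images.
demo_true = ["hat", "hat", "pants", "pants", "shirt", "shirt"]
demo_pred = ["hat", "pants", "pants", "pants", "shirt", "hat"]
demo_classes = ["hat", "pants", "shirt"]

# Rows are true classes, columns are predicted classes, in demo_classes order.
matrix = confusion_matrix(demo_true, demo_pred, labels=demo_classes)
print(matrix)
```

In the notebook itself, the same call would take `y_test` and `predictions` as its first two arguments.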