# Multi-label Classification using the fastai Library

Applying what I learned from Jeremy Howard's [lesson 3 of the fastai course](https://course.fast.ai/videos/?lesson=3) to create a multi-label classification model.


For this kernel, I will use the fastai library to classify the colour and the article of clothing in an image. Thanks to [pyimagesearch's blog post](https://www.pyimagesearch.com/2018/05/07/multi-label-classification-with-keras/) for creating the dataset by scraping the images from Bing. If you want to create your own image dataset, I suggest checking out [this tutorial](https://www.pyimagesearch.com/2018/04/09/how-to-quickly-build-a-deep-learning-image-dataset/) from pyimagesearch.

To get the image dataset, simply click on **File** and then **Add or upload data** from within a kernel you are editing, then paste `https://www.kaggle.com/kaiska/wardrobe` in the search box and click `Add`.

![add dataset](https://i.imgur.com/Tppldjm.png)

Every notebook starts with the following three lines; they ensure that any edits to libraries you make are reloaded here automatically, and also that any charts or images displayed are shown in this notebook.

In [None]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

We import all the necessary packages. We are going to work with the [fastai V1 library](http://www.fast.ai/2018/10/02/fastai-ai/), which sits on top of [PyTorch 1.0](https://hackernoon.com/pytorch-1-0-468332ba5163). The fastai library provides many useful functions that enable us to quickly and easily build neural networks and train our models.

We will also import [fastai.widgets](https://docs.fast.ai/widgets.image_cleaner.html#Image-Cleaner-Widget), which offers several widgets to support the workflow of a deep learning practitioner. The purpose of the widgets is to help you organize, clean, and prepare your data for your model. Widgets are separated by data type.

In [None]:
from fastai import *
from fastai.vision import *
from fastai.widgets import *

import os
import sys
import cv2
import shutil  
import numpy as np

Let's first copy our dataset to `/kaggle/working/`. The input directory on Kaggle is read-only, so working on a copy lets us modify the dataset without having to change directories later.

In [None]:
# copy dataset to working (to enable manipulating the directory)
path = '/kaggle/input/apparel-dataset/'   
dest = '/kaggle/working/dataset/'
shutil.copytree(path, dest, copy_function = shutil.copy)  
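
Note that `shutil.copytree` raises `FileExistsError` when the destination already exists, so the cell above fails if it is run a second time. Below is a minimal sketch of a re-runnable copy, assuming it is acceptable to delete and recreate the destination. The helper name `fresh_copy` is hypothetical, and the demo uses throwaway temporary directories rather than the Kaggle paths:

```python
import os
import shutil
import tempfile

def fresh_copy(src, dst):
    """Copy src to dst, replacing dst if it already exists (hypothetical helper)."""
    if os.path.isdir(dst):
        shutil.rmtree(dst)  # remove the stale copy so copytree does not raise
    return shutil.copytree(src, dst)

# demo on throwaway directories (stand-ins for the Kaggle input/working paths)
demo_root = tempfile.mkdtemp()
demo_src = os.path.join(demo_root, 'input')
demo_dst = os.path.join(demo_root, 'working')
os.makedirs(os.path.join(demo_src, 'black_jeans'))

fresh_copy(demo_src, demo_dst)
fresh_copy(demo_src, demo_dst)  # second run succeeds instead of raising
copied = sorted(os.listdir(demo_dst))
```

On Python 3.8 or later, passing `dirs_exist_ok=True` to `shutil.copytree` is an alternative when overwriting in place is preferred to a clean copy.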

In this dataset, each picture can have multiple labels. If we take a look at the folder names, we see that each folder name contains two labels separated by an underscore.

In [None]:
os.listdir('/kaggle/working/dataset/')
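
As a sketch of what splitting on the underscore means, the snippet below turns folder names into label lists. The folder names are illustrative examples in the `colour_type` format, not a listing of the actual dataset:

```python
# illustrative folder names in the colour_type format (not read from disk)
folders = ['black_jeans', 'blue_dress', 'red_shirt']

# each folder name yields two labels: the colour and the article of clothing
labels_per_folder = {name: name.split('_') for name in folders}

# the distinct labels across all folders form the set of classes to predict
all_labels = sorted({label for labels in labels_per_folder.values() for label in labels})
```

Every image in a folder inherits both labels, which is why this is a multi-label problem rather than a single-class one.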

To put this in a `DataBunch` using the [data block API](https://docs.fast.ai/data_block.html), we need to use `ImageList` (and not `ImageDataBunch`). This makes sure the model is created with the proper loss function for images that carry multiple labels. The main difference between the two is that `ImageDataBunch` has preset constraints, while `ImageList` gives you [more flexibility](https://forums.fast.ai/t/dataset-creation-imagedatabunch-vs-imagelists/45427/2).

In [None]:
tfms = get_transforms()

img_src = '/kaggle/working/dataset/'
src = (ImageList.from_folder(img_src) #set image folder
       .split_by_rand_pct(0.2) #set the split of training and validation to 80/20
       .label_from_folder(label_delim='_')) #get label names from folder and split by underscore

data = (src.transform(tfms, size=256) #set image size to 256
        .databunch(num_workers=0).normalize(imagenet_stats))
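
Multi-label targets are typically encoded as one 0/1 flag per class (a multi-hot vector) and scored with a per-label binary cross-entropy instead of a softmax over classes. The sketch below illustrates the idea in plain Python. The class list and probabilities are made up, and this is an illustration of the concept rather than fastai's implementation:

```python
import math

classes = ['black', 'blue', 'dress', 'jeans']  # made-up class list

def multi_hot(labels, classes):
    """Encode a list of labels as one 0/1 flag per class."""
    return [1.0 if c in labels else 0.0 for c in classes]

def binary_cross_entropy(probs, targets):
    """Mean of the per-label binary cross-entropy."""
    losses = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
              for p, t in zip(probs, targets)]
    return sum(losses) / len(losses)

target = multi_hot(['black', 'jeans'], classes)
good = binary_cross_entropy([0.9, 0.1, 0.1, 0.9], target)  # confident and right
bad = binary_cross_entropy([0.1, 0.9, 0.9, 0.1], target)   # confident and wrong
```

Because each label is scored independently, the model is free to switch on two labels at once, such as a colour and a clothing type.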

In [None]:
data.show_batch(rows=3, figsize=(12,9))

In [None]:
acc_02 = partial(accuracy_thresh, thresh=0.2)
learn = cnn_learner(data, models.resnet50, metrics=acc_02, model_dir='/kaggle/working/models')
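
The metric `acc_02` counts a label as predicted when its sigmoid probability exceeds 0.2, then reports the fraction of label slots that agree with the target. Here is a plain-Python sketch of that computation on made-up numbers, written for illustration rather than copied from fastai:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def accuracy_thresh_sketch(logits, targets, thresh=0.2):
    """Fraction of (sample, label) slots where the thresholded prediction matches the target."""
    hits = 0
    total = 0
    for row_logits, row_targets in zip(logits, targets):
        for z, t in zip(row_logits, row_targets):
            predicted = sigmoid(z) > thresh
            hits += int(predicted == bool(t))
            total += 1
    return hits / total

# one sample with four labels (made-up raw model outputs and targets)
logits = [[2.0, -3.0, -1.0, 1.5]]
targets = [[1, 0, 0, 1]]
score_02 = accuracy_thresh_sketch(logits, targets, thresh=0.2)
score_05 = accuracy_thresh_sketch(logits, targets, thresh=0.5)
```

A low threshold such as 0.2 makes the model more willing to emit a label, at the cost of more false positives. In this example the third label is wrongly switched on at 0.2 but not at 0.5.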

In [None]:
learn.fit_one_cycle(5)

In [None]:
learn.save('stage-1-rn50')

In [None]:
learn.unfreeze()

In [None]:
learn.lr_find()
learn.recorder.plot()

In [None]:
learn.fit_one_cycle(5, slice(3e-5, 5e-4))
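
Passing `slice(3e-5, 5e-4)` trains the earliest layers with the smaller rate and the final layers with the larger one, with the layer groups in between getting intermediate values. The sketch below assumes geometric spacing and three layer groups. Both are assumptions for illustration, and the helper name `spread_learning_rates` is hypothetical:

```python
def spread_learning_rates(start, stop, n_groups):
    """Geometrically space n_groups learning rates from start to stop."""
    if n_groups == 1:
        return [stop]
    step = (stop / start) ** (1 / (n_groups - 1))
    return [start * step ** i for i in range(n_groups)]

# assuming three layer groups, purely as an illustration
lrs = spread_learning_rates(3e-5, 5e-4, 3)
```

The idea is that early layers hold generic pretrained features and need only small updates, while the later layers are more task-specific and can change faster.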

In [None]:
learn.save('stage-2-rn50')

In [None]:
learn.load('/kaggle/input/multilabel-models/models/stage-2-rn50')

In [None]:
learn.recorder.plot_losses()
learn.export()


In [None]:
import requests
from PIL import Image

def save_img_url(url):
    """Download the image at `url` and save it as /kaggle/working/test/test.jpg."""
    path = '/kaggle/working/test/'
    img_path = os.path.join(path, 'test.jpg')
    # make sure the target folder exists and remove any previous download
    os.makedirs(path, exist_ok=True)
    if os.path.exists(img_path):
        os.remove(img_path)

    try:
        img_request = requests.get(url)
        # only save the file if the request succeeded
        if img_request.status_code == requests.codes.ok:
            # write the bytes of the response content to an image file on disk
            with open(img_path, 'wb') as f:
                f.write(img_request.content)
            # open the image file with PIL to check that it is readable
            Image.open(img_path)
        else:
            print(img_request.status_code)
    except Exception as e:
        print(str(e))
    
# def get_preds(obj):
#     labels = str(obj[0]).split(';')
#     tmp_list = []    
#     x = 0
#     for i in obj[2]:
#         if (i > 0.2):
#             acc= round(i.item(), 3)*100
#             tmp_list.append({"label": labels[x], "acc" : acc})
#             x+=1
#     return tmp_list

def get_preds(obj, learn):
    # class names, in the same order as the model outputs
    labels = list(learn.data.c2i)

    predictions = []
    # obj[2] holds one probability per class
    for label, prob in zip(labels, obj[2]):
        acc = round(prob.item(), 3)*100  # probability as a percentage
        if acc > 1:
            predictions.append({label: acc})
    return predictions
    

In [None]:
save_img_url('https://cdn1.thr.com/sites/default/files/2017/08/conor_mcgregor_suit.jpg')
img = open_image('/kaggle/working/test/test.jpg')
img.show()
pred_obj = learn.predict(img)
print(get_preds(pred_obj, learn))
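
To make the output easier to read, the predictions can be filtered and ranked by confidence. The sketch below works on made-up class names and probabilities standing in for the output of `learn.predict`. The helper name `top_labels` is hypothetical:

```python
def top_labels(classes, probs, min_pct=1.0):
    """Return (label, percent) pairs above min_pct, highest first."""
    pairs = [(label, round(p * 100, 1)) for label, p in zip(classes, probs)]
    kept = [pair for pair in pairs if pair[1] > min_pct]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# made-up class names and probabilities, standing in for learn.predict output
classes = ['black', 'blue', 'jeans', 'suit']
probs = [0.12, 0.875, 0.004, 0.93]
ranked = top_labels(classes, probs)
```

Returning tuples instead of single-key dictionaries keeps the label and its score together and makes sorting straightforward.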

In [None]:

# same logic as get_preds, but only keeping labels above 10%
labels = list(learn.data.c2i)

predictions = []
for label, prob in zip(labels, pred_obj[2]):
    acc = round(prob.item(), 3)*100  # probability as a percentage
    if acc > 10:
        print(acc)
        predictions.append({label: acc})
    

In [None]:

predictions
# to list the highest-confidence labels first:
# sorted(predictions, key=lambda d: list(d.values())[0], reverse=True)