# Image classification EP3.2

The notebooks are also available in the MAC5768 repository on [GitHub](https://github.com/iblucher/MAC5768), and the datasets are in the `ep3` folder on [Google Drive](https://drive.google.com/drive/folders/1DtkTzyPvNXYur2LldKeOHN6mOhBy2t_-?usp=sharing).

In [122]:
import os
from collections import defaultdict
from pathlib import Path

from dataclasses import dataclass

import matplotlib.pyplot as plt

import numpy as np

import pandas as pd

from skimage import io 
from skimage.measure import perimeter

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, multilabel_confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

## Feature extraction

The functions below compute features for each image. The selected features are:
- Area: the area of the cropped image (bounding box)
- Perimeter: the perimeter of the object in the image
- Object-to-background ratio: how much of the image the object occupies relative to the background

In [21]:
def get_area(img):
    r, c = img.shape
    return r * c

In [22]:
def get_perimeter(img):
    return perimeter(img)

In [190]:
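def get_object_background_ratio(img):
    # Nonzero pixels of the binary mask belong to the object.
    object_pixels = np.count_nonzero(img)
    # The +1 keeps the denominator nonzero when the mask has no background pixels.
    ratio = np.divide(object_pixels, (img.shape[0] * img.shape[1]) - object_pixels + 1)

    return ratio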

In [191]:
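def compute_features(img):
    # Any pixel darker than pure white (255) counts as object.
    # get_area unpacks two dimensions, so img must be a single-channel image.
    binary_img = img < 255

    area = get_area(binary_img)
    prm = get_perimeter(binary_img)
    ratio = get_object_background_ratio(binary_img)

    return area, prm, ratio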
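
To make the thresholding and the ratio concrete, here is a minimal sketch on a synthetic image. The array `toy_img` and its size are made up for illustration. It restates the area and ratio formulas with plain NumPy rather than calling the notebook's functions, and it leaves the perimeter out because that depends on `skimage.measure.perimeter`.

```python
import numpy as np

# Hypothetical 10x10 grayscale image: white background (255) with a
# dark 4x4 object in the middle.
toy_img = np.full((10, 10), 255, dtype=np.uint8)
toy_img[3:7, 3:7] = 0

# Same thresholding rule as compute_features: anything below 255 is object.
toy_mask = toy_img < 255

toy_area = toy_mask.shape[0] * toy_mask.shape[1]   # bounding-box area: 100
object_pixels = np.count_nonzero(toy_mask)         # 16 object pixels
background_pixels = toy_area - object_pixels       # 84 background pixels

# Same formula as get_object_background_ratio, including the +1 guard.
toy_ratio = object_pixels / (background_pixels + 1)
print(toy_area, object_pixels, toy_ratio)  # 100 16 and 16/85 (about 0.188)
```

Because of the `+1` in the denominator, the value is slightly smaller than the plain object/background quotient (16/84 here).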

In [192]:
GROUND_TRUTH_FERET_BOX_DATASET_PATH = Path('ground_truth_feret_box/')

dataset = []

for filename in GROUND_TRUTH_FERET_BOX_DATASET_PATH.rglob('*'):
    if filename.is_file():
        # The parent directory name is the class label (independent of the OS path separator).
        object_class = filename.parent.name

        img = io.imread(filename)

        dataset.append([*compute_features(img), object_class])

## Classification

Two classifiers were selected: an SVM with a linear kernel (a linear classifier) and a RandomForestClassifier (a tree-based classifier). We train both and compare their performance on the dataset. Classification is run on both the manually segmented data and the data segmented by the algorithm. For each class we compute accuracy, precision, and recall.
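
As a reminder, the three per-class metrics can be written in terms of the one-vs-rest counts for a given class: true positives ($TP$), true negatives ($TN$), false positives ($FP$), and false negatives ($FN$). These are the standard definitions. In the code below, a metric is set to 0 whenever its denominator is zero.

$$
\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{precision} = \frac{TP}{TP + FP}, \qquad
\text{recall} = \frac{TP}{TP + FN}
$$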

### Ground-truth 


In [193]:
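dataset = np.array(dataset)

# Mixing numbers and class names yields a string array, so cast the features back to float.
X = dataset[:, :-1].astype(float)
y = dataset[:, -1]

labels = np.unique(y)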

In [194]:
X_scaled = StandardScaler().fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=.3, random_state=42)
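
Note that the cell above fits the `StandardScaler` on the full feature matrix before splitting, so the test rows contribute to the scaling statistics. One possible alternative, sketched here on synthetic data, is to split first and let a scikit-learn pipeline fit the scaler on the training portion only. The arrays `X_demo` and `y_demo` are placeholders, not the notebook's dataset. This is an illustration of the idea, not the procedure used to produce the tables in this notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data: 60 samples, 3 features, 3 balanced classes.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 3))
y_demo = np.repeat(["a", "b", "c"], 20)

# Split first, so the test rows never influence the scaler.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42)

# The pipeline fits the scaler on X_tr only, then reuses those
# statistics to transform X_te at prediction time.
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
model.fit(X_tr, y_tr)
y_hat = model.predict(X_te)
```

With this arrangement the mean and variance used for scaling come from the training rows alone.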

In [210]:
names = ["Linear SVM", "Random Forest"]

classifiers = [SVC(kernel="linear"),
               RandomForestClassifier(max_depth=5, n_estimators=10)]

In [211]:
@dataclass
class ObjectClassClassifierPerformance:
    clf: str
    object_class: str
    accuracy: float
    precision: float
    recall: float
        
        
def get_performance_metrics(cm):
    # cm is a 2x2 one-vs-rest matrix laid out as [[tn, fp], [fn, tp]].
    tn = cm[0, 0]
    fp = cm[0, 1]
    fn = cm[1, 0]
    tp = cm[1, 1]

    # Each metric falls back to 0 when its denominator is zero.
    acc = (tp + tn) / (tp + tn + fp + fn) if (tp + tn + fp + fn) != 0 else 0
    precision = tp / (tp + fp) if (tp + fp) != 0 else 0
    recall = tp / (tp + fn) if (tp + fn) != 0 else 0

    return acc, precision, recall
        

all_metrics = []

for name, clf in zip(names, classifiers):
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)

    # Pass the labels explicitly so the matrices line up with `labels`
    # even if a class is absent from the test split.
    multi_label_cm = multilabel_confusion_matrix(y_test, y_pred, labels=labels)

    for object_class, cm in zip(labels, multi_label_cm):
        acc, pr, rec = get_performance_metrics(cm)
        all_metrics.append(ObjectClassClassifierPerformance(
            clf.__class__.__name__, object_class, acc, pr, rec))
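
To see what `multilabel_confusion_matrix` returns, the sketch below runs it on a tiny hand-made example (the label names and predictions are invented). For each class it yields one 2x2 matrix in the layout `[[tn, fp], [fn, tp]]`. That is the layout `get_performance_metrics` indexes into.

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix

# Invented toy labels: one "b" sample is misclassified as "a".
y_true_toy = np.array(["a", "b", "a", "c"])
y_pred_toy = np.array(["a", "a", "a", "c"])
toy_labels = ["a", "b", "c"]

# One 2x2 one-vs-rest matrix per label, in the order given by `labels`.
mcm = multilabel_confusion_matrix(y_true_toy, y_pred_toy, labels=toy_labels)

for label, cm in zip(toy_labels, mcm):
    # ravel() flattens [[tn, fp], [fn, tp]] row by row.
    tn, fp, fn, tp = cm.ravel()
    print(label, "tn:", tn, "fp:", fp, "fn:", fn, "tp:", tp)
```

Class "b" ends up with no true positives and one false negative, so its recall is 0 even though three of its four one-vs-rest decisions are correct.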

In [212]:
results_df = pd.DataFrame([vars(s) for s in all_metrics])

In [214]:
results_df.set_index(['clf', 'object_class']).round(2)

| clf | object_class | accuracy | precision | recall |
|---|---|---|---|---|
| SVC | brush | 0.92 | 0.0 | 0.0 |
| SVC | earring | 0.95 | 1.0 | 0.19 |
| SVC | glasses | 0.92 | 0.0 | 0.0 |
| SVC | hand_sanitizer | 0.88 | 0.31 | 0.32 |
| SVC | knife | 0.92 | 0.0 | 0.0 |
| SVC | lipstick | 0.91 | 0.0 | 0.0 |
| SVC | mug | 0.92 | 0.0 | 0.0 |
| SVC | nail_polish | 0.86 | 0.0 | 0.0 |
| SVC | notebook | 0.93 | 0.56 | 0.97 |
| SVC | pen | 0.44 | 0.27 | 0.91 |


### Automatic segmentation

In [215]:
SEGMENTED_FERET_BOX_DATASET_PATH = Path('segmented_feret_box/')

seg_dataset = []

for filename in SEGMENTED_FERET_BOX_DATASET_PATH.rglob('*'):
    if filename.is_file():
        # The parent directory name is the class label (independent of the OS path separator).
        object_class = filename.parent.name

        img = io.imread(filename)

        seg_dataset.append([*compute_features(img), object_class])

In [216]:
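seg_dataset = np.array(seg_dataset)

# Mixing numbers and class names yields a string array, so cast the features back to float.
X = seg_dataset[:, :-1].astype(float)
y = seg_dataset[:, -1]

seg_labels = np.unique(y)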

In [217]:
X_scaled = StandardScaler().fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=.3, random_state=42)

In [218]:
seg_all_metrics = []

for name, clf in zip(names, classifiers):
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)

    # Use the labels of the segmented dataset (seg_labels), not those of the ground-truth run.
    multi_label_cm = multilabel_confusion_matrix(y_test, y_pred, labels=seg_labels)

    for object_class, cm in zip(seg_labels, multi_label_cm):
        acc, pr, rec = get_performance_metrics(cm)
        seg_all_metrics.append(ObjectClassClassifierPerformance(
            clf.__class__.__name__, object_class, acc, pr, rec))

In [219]:
results_df = pd.DataFrame([vars(s) for s in seg_all_metrics])

In [220]:
results_df.set_index(['clf', 'object_class']).round(2)

| clf | object_class | accuracy | precision | recall |
|---|---|---|---|---|
| SVC | brush | 0.92 | 0.0 | 0.0 |
| SVC | earring | 0.95 | 1.0 | 0.19 |
| SVC | glasses | 0.92 | 0.0 | 0.0 |
| SVC | hand_sanitizer | 0.88 | 0.31 | 0.32 |
| SVC | knife | 0.92 | 0.0 | 0.0 |
| SVC | lipstick | 0.91 | 0.0 | 0.0 |
| SVC | mug | 0.92 | 0.0 | 0.0 |
| SVC | nail_polish | 0.86 | 0.0 | 0.0 |
| SVC | notebook | 0.93 | 0.56 | 0.97 |
| SVC | pen | 0.44 | 0.27 | 0.91 |


## Conclusion

Looking at the metrics in the tables above, the RandomForestClassifier performs better on both datasets. For the SVM, recall is zero for most classes, which means the algorithm did not correctly identify any object of those classes. This is the case for glasses, knife, lipstick, mug, and nail polish.

The RandomForestClassifier performs very well on some classes such as notebook, lipstick, and pen, which are objects with homogeneous shapes (rectangle and cylinder). Classes such as brush, whose shapes vary widely, are classified poorly.

Overall, accuracy is good for most classes because this metric includes true negatives: since there are other classes, the true negatives for a given class include every object of the other classes that was not classified as that class. In this exercise we used simple classifiers, but both the image processing and the machine learning pipeline could be made more sophisticated to improve the final performance.
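
The effect of true negatives on per-class accuracy can be illustrated with made-up numbers. The sketch assumes a balanced test set of 100 images over 10 classes and a classifier that never predicts the class under observation. These counts are hypothetical and do not come from the tables above.

```python
# Hypothetical one-vs-rest counts for a single class.
tp, fn = 0, 10   # all 10 images of the class are missed
fp, tn = 0, 90   # the 90 images of the other classes are correctly rejected

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)
print(accuracy, recall)  # 0.9 0.0
```

Accuracy stays at 0.9 even though not a single object of the class was found, which is why precision and recall are the more informative columns here.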