# General Benchmarking

* `Objective:` Evaluate time-series classification using three different approaches, including our hypothesis of encoding time series as Recurrence Plots;

* `Comparison Scenario`:
    - <u>Data:</u> The data established in *Benchmarking 1*;
        * *Details:* REDD dataset, low frequency, House 3, 9 appliances, 80% of the data for training and 20% for testing;
        * *Samples:* 5-minute blocks (300 seconds, i.e., 100 readings given the *3 s* delay between readings) of each measurement;
    
    - <u>Attributes *(Feature Space)*:</u> vector representation of the samples;
        1. **Statistical Approach (Benchmarking 1)**: Mean, Standard Deviation, Maximum, Total Energy, Hour of Day, and Ambient Temperature (set to zero in this case, since it was not made available by the authors);
        2. **GAF Approach (Benchmarking 2)**: Visual representation of the sample using the *Wang and Oates [20] / Gramian Angular Field Matrices (GAF)* algorithm, followed by an *embedding* computed with a neural network with the VGG16 architecture;
        3. **RP Approach (Hypothesis)**: Our hypothesis, which converts the sample into a visual representation using the Recurrence Plot (RP) technique; as in *Benchmarking 2*, the resulting image is then embedded with a neural network with the VGG16 architecture.
        
    - <u>Classification Method:</u> Multi-layer Perceptron neural network (MLP, without hyperparameter tuning, from the [Scikit-learn](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier) package);
    
    - <u>Metric(s):</u> since the problem is treated as multi-label classification, the following metrics are adopted (via the [Scikit-learn.metrics](https://scikit-learn.org/stable/modules/model_evaluation.html) package):
        * Accuracy;
        * Precision;
        * Recall;
        * F1-score;
        * Hamming Loss.
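
The sample definition above (5-minute blocks of 100 readings at a 3 s delay) can be sketched as follows. This is only an illustration: the helper name `split_into_blocks`, the synthetic signal, and the choice of non-overlapping blocks with the incomplete tail dropped are assumptions, not the preprocessing actually used in *Benchmarking 1*.

```python
import numpy as np

def split_into_blocks(series, block_size=100):
    """Split a 1-D series into non-overlapping blocks of `block_size` readings.

    Assumption: readings that do not fill a complete block are discarded.
    """
    series = np.asarray(series, dtype=float)
    n_blocks = len(series) // block_size
    return series[:n_blocks * block_size].reshape(n_blocks, block_size)

# Hypothetical power signal: 1 hour sampled every 3 s -> 1200 readings
rng = np.random.default_rng(0)
power = rng.uniform(0, 500, size=1200)

blocks = split_into_blocks(power)
print(blocks.shape)  # (12, 100): twelve 5-minute samples
```

Each row of `blocks` would then be one sample to be turned into a feature vector by one of the three approaches.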
        

# Setting up the environment and parameters

In [58]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')
plt.rc('text', usetex=False)
from matplotlib.image import imsave
import pandas as pd
import pickle as cPickle
import os, sys, cv2
from math import *
from pprint import pprint
from tqdm import tqdm_notebook
from mpl_toolkits.axes_grid1 import make_axes_locatable
from PIL import Image
from glob import glob
from IPython.display import display

from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.preprocessing import image as keras_image
from tensorflow.keras.applications.vgg16 import preprocess_input

from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, precision_score, recall_score, accuracy_score, hamming_loss

from pyts.image import RecurrencePlot

REDD_RESOURCES_PATH = 'datasets/REDD'

BENCHMARKING1_RESOURCES_PATH = 'benchmarkings/cs446 project-electric-load-identification-using-machine-learning/'
BENCHMARKING2_RESOURCES_PATH = 'benchmarkings/Imaging-NILM-time-series/'
HYPOTHESIS_RESOURCES_PATH = 'datasets/hipotese1-recurrenceplot-vggembedding/'

sys.path.append(os.path.join(BENCHMARKING1_RESOURCES_PATH, ''))
sys.path.append(os.path.join(BENCHMARKING2_RESOURCES_PATH, ''))
sys.path.append(os.path.join(HYPOTHESIS_RESOURCES_PATH, ''))

from serie2QMlib import *

import warnings
warnings.filterwarnings(action='ignore')

# Loading the data

In [59]:
# devices to be used in training and testing
use_idx = np.array([3,4,6,7,10,11,13,17,19])

label_columns_idx = ["APLIANCE_{}".format(i) for i in use_idx]

## Statistical Features (Benchmarking 1)

In [60]:
Xb1_train = np.load( os.path.join(BENCHMARKING1_RESOURCES_PATH, 'datasets/train_instances.npy') )
yb1_train = np.load( os.path.join(BENCHMARKING1_RESOURCES_PATH, 'datasets/train_labels_binary.npy') )

Xb1_test = np.load( os.path.join(BENCHMARKING1_RESOURCES_PATH, 'datasets/test_instances.npy') )
yb1_test = np.load( os.path.join(BENCHMARKING1_RESOURCES_PATH, 'datasets/test_labels_binary.npy') )
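
The arrays loaded above are precomputed. As a rough illustration of what one row of the statistical feature space might look like, the sketch below builds the six attributes listed for *Benchmarking 1* from a single 100-reading block. The function name `statistical_features`, the feature order, and the definition of total energy as the sum of the readings times the 3 s sampling period are assumptions; the original benchmark code may differ.

```python
import numpy as np

def statistical_features(block, hour_of_day, sample_period_s=3.0):
    """Return [mean, std, max, total energy, hour of day, temperature]."""
    block = np.asarray(block, dtype=float)
    return np.array([
        block.mean(),
        block.std(),
        block.max(),
        block.sum() * sample_period_s,  # assumed energy definition (W*s)
        float(hour_of_day),
        0.0,                            # ambient temperature: not available, kept at zero
    ])

# Hypothetical block: 60 W baseline with a 1500 W appliance on for 20 readings
block = np.full(100, 60.0)
block[40:60] = 1500.0

features = statistical_features(block, hour_of_day=14)
print(features)
```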

## GAF Images (Benchmarking 2)

In [61]:
Xb2_train = np.load( os.path.join(BENCHMARKING2_RESOURCES_PATH, 'datasets/X_train.npy') )
yb2_train = np.load( os.path.join(BENCHMARKING2_RESOURCES_PATH, 'datasets/y_train.npy') )

Xb2_test = np.load( os.path.join(BENCHMARKING2_RESOURCES_PATH, 'datasets/X_test.npy') )
yb2_test = np.load( os.path.join(BENCHMARKING2_RESOURCES_PATH, 'datasets/y_test.npy') )
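
The GAF features loaded above were generated beforehand. For reference, a minimal NumPy sketch of a Gramian Angular Summation Field is shown below. It assumes the summation variant, min-max rescaling to [-1, 1], and no image resizing; the function name is hypothetical, and the settings actually used in *Benchmarking 2* are not reproduced here.

```python
import numpy as np

def gramian_angular_summation_field(series):
    """Encode a 1-D series as a GASF matrix: cos(phi_i + phi_j)."""
    x = np.asarray(series, dtype=float)
    span = x.max() - x.min()
    if span > 0:
        x = 2.0 * (x - x.min()) / span - 1.0  # rescale to [-1, 1]
    else:
        x = np.zeros_like(x)                  # assumption: constant series maps to 0
    x = np.clip(x, -1.0, 1.0)                 # guard against rounding outside [-1, 1]
    phi = np.arccos(x)                        # polar-coordinate angle of each reading
    return np.cos(phi[:, None] + phi[None, :])

gaf = gramian_angular_summation_field(np.linspace(0.0, 1.0, 100))
print(gaf.shape)  # (100, 100)
```

The resulting matrix is symmetric, with values in [-1, 1], and can be saved as an image before being passed to the VGG16 embedding step.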

## Recurrence Plots (Hypothesis)

In [62]:
Xh_train = np.load( os.path.join(HYPOTHESIS_RESOURCES_PATH, 'X_train.npy') )
yh_train = np.load( os.path.join(HYPOTHESIS_RESOURCES_PATH, 'y_train.npy') )

Xh_test = np.load( os.path.join(HYPOTHESIS_RESOURCES_PATH, 'X_test.npy') )
yh_test = np.load( os.path.join(HYPOTHESIS_RESOURCES_PATH, 'y_test.npy') )
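
The arrays above hold embeddings of precomputed recurrence plots. The notebook imports `RecurrencePlot` from `pyts`, but the sketch below uses plain NumPy to show the underlying idea: entry (i, j) marks whether readings i and j are closer than a threshold. The function name, the embedding dimension of 1, and the threshold value are assumptions for illustration, not the parameters used to build the dataset.

```python
import numpy as np

def recurrence_plot(series, threshold=None):
    """Return pairwise distances |x_i - x_j|, binarized if a threshold is given."""
    x = np.asarray(series, dtype=float)
    distances = np.abs(x[:, None] - x[None, :])
    if threshold is None:
        return distances                      # unthresholded (grayscale) plot
    return (distances <= threshold).astype(np.uint8)

# Hypothetical periodic signal with 100 readings
t = np.linspace(0.0, 4.0 * np.pi, 100)
rp = recurrence_plot(np.sin(t), threshold=0.1)
print(rp.shape)  # (100, 100)
```

The main diagonal is always recurrent, and periodic signals produce additional diagonal structures, which is the kind of texture the VGG16 embedding is expected to pick up.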

# Training the Classifiers

In [63]:
# MLP with scikit-learn defaults (no hyperparameter tuning).
# Other candidates kept for reference:
#   RandomForestClassifier(n_estimators=10), DecisionTreeClassifier(max_depth=15)
model_b1 = MLPClassifier()
model_b1.fit(Xb1_train, yb1_train)

model_b2 = MLPClassifier()
model_b2.fit(Xb2_train, yb2_train)

model_h = MLPClassifier()
model_h.fit(Xh_train, yh_train)

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
              beta_2=0.999, early_stopping=False, epsilon=1e-08,
              hidden_layer_sizes=(100,), learning_rate='constant',
              learning_rate_init=0.001, max_iter=200, momentum=0.9,
              n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
              random_state=None, shuffle=True, solver='adam', tol=0.0001,
              validation_fraction=0.1, verbose=False, warm_start=False)
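
As the output above shows, the models are trained with `random_state=None`, so the reported figures can vary between runs. The sketch below, on synthetic data, illustrates that `MLPClassifier` accepts a binary indicator matrix as the multi-label target and that fixing `random_state` makes a run repeatable. The data, the seed, and `max_iter=500` are assumptions for illustration only.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic multi-label problem: 200 samples, 6 features, 3 binary labels
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
Y = np.column_stack([
    X[:, 0] > 0,
    X[:, 1] + X[:, 2] > 0,
    X[:, 3] > 1.0,           # a rarer label, as with some appliances
]).astype(int)

# Fixed seed (assumption) so that repeated runs give the same model
model = MLPClassifier(max_iter=500, random_state=0)
model.fit(X, Y)

pred = model.predict(X)
print(pred.shape)  # (200, 3): one binary column per label
```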

# Evaluating the Models

In [64]:
def metrics(test, predicted):

    acc = accuracy_score(test, predicted)
    prec = precision_score(test, predicted)
    rec = recall_score(test, predicted)    
    f1 = f1_score(test, predicted)
    f1m = f1_score(test, predicted, average='macro')
    hl = hamming_loss(test, predicted)   
    
    return acc, prec, rec, f1, f1m, hl

def plot_predicted_and_ground_truth(test, predicted):
    # Overlay the flattened predictions and ground-truth labels on one axis
    plt.plot(predicted.flatten(), label='pred')
    plt.plot(test.flatten(), label='Y')
    plt.legend()  # without this call the labels above are never displayed
    plt.show()

def result_report(y_test, y_pred):
    
    final_performance = []
    
    for i in range(y_test.shape[1]):

        acc, prec, rec, f1, f1m, hl = metrics(y_test[:, i], y_pred[:, i])
        final_performance.append([
            label_columns_idx[i], 
            round(acc*100, 2), 
            round(prec*100, 2), 
            round(rec*100, 2), 
            round(f1*100, 2), 
            round(f1m*100, 2),
            round(hl, 2)
        ])

    print("FINAL PERFORMANCE BY APPLIANCE (LABEL):")
    df_metrics = pd.DataFrame(
        data = final_performance,
        columns = ["Appliance", "Accuracy", "Precision", "Recall", "F1-score", "F1-macro", "Hamming Loss"]
    )
    display(df_metrics)

    print("")
    print("OVERALL AVERAGE PERFORMANCE:")
    final_performance = np.mean(np.array(final_performance)[:, 1:].astype(float), axis = 0)
    display(pd.DataFrame(
        data = {
            "Metric": ["Accuracy", "Precision", "Recall", "F1-score", "F1-macro", "Hamming Loss"],
            "Result (%)": [round(p, 2) for p in final_performance]
        }
    ))
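
`result_report` applies each metric column by column, that is, one binary problem per appliance. The toy example below (made-up labels, not REDD data) sketches two consequences: the mean per-label accuracy equals one minus the Hamming loss, which differs from the exact-match accuracy scikit-learn returns for a full label matrix, and the macro F1 of a single binary column averages the positive and negative classes, so it can sit well above zero even when the appliance is never detected.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, hamming_loss

# Toy multi-label ground truth and predictions (4 samples, 3 labels)
y_true = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 0]])
y_pred = np.array([[1, 0, 0], [0, 0, 0], [1, 1, 0], [0, 0, 1]])

subset_acc = accuracy_score(y_true, y_pred)   # exact match of whole rows
hl = hamming_loss(y_true, y_pred)             # fraction of wrong label entries
per_label_acc = [accuracy_score(y_true[:, i], y_pred[:, i]) for i in range(3)]

print(subset_acc, hl, np.mean(per_label_acc))

# A rare label that the model never predicts as positive
rare_true = np.array([0, 0, 0, 1])
rare_pred = np.zeros(4, dtype=int)
f1_pos = f1_score(rare_true, rare_pred, zero_division=0)
f1_mac = f1_score(rare_true, rare_pred, average='macro', zero_division=0)

print(f1_pos, f1_mac)
```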

## Benchmarking 1

In [65]:
y_test = np.array(yb1_test)

# Predict test data
y_pred = np.array(model_b1.predict(Xb1_test))

result_report(y_test, y_pred)

FINAL PERFORMANCE BY APPLIANCE (LABEL):

| | Appliance | Accuracy | Precision | Recall | F1-score | F1-macro | Hamming Loss |
|---|---|---|---|---|---|---|---|
| 0 | APLIANCE_3 | 49.9 | 80.0 | 0.99 | 1.96 | 34.16 | 0.5 |
| 1 | APLIANCE_4 | 69.12 | 77.09 | 37.07 | 50.06 | 63.86 | 0.31 |
| 2 | APLIANCE_6 | 99.18 | 0.0 | 0.0 | 0.0 | 49.79 | 0.01 |
| 3 | APLIANCE_7 | 99.35 | 0.0 | 0.0 | 0.0 | 49.84 | 0.01 |
| 4 | APLIANCE_10 | 98.0 | 0.0 | 0.0 | 0.0 | 49.49 | 0.02 |
| 5 | APLIANCE_11 | 96.78 | 0.0 | 0.0 | 0.0 | 49.18 | 0.03 |
| 6 | APLIANCE_13 | 99.52 | 0.0 | 0.0 | 0.0 | 49.88 | 0.0 |
| 7 | APLIANCE_17 | 99.02 | 0.0 | 0.0 | 0.0 | 49.76 | 0.01 |
| 8 | APLIANCE_19 | 87.98 | 7.27 | 0.92 | 1.64 | 47.62 | 0.12 |

OVERALL AVERAGE PERFORMANCE:

| | Metric | Result (%) |
|---|---|---|
| 0 | Accuracy | 88.76 |
| 1 | Precision | 18.26 |
| 2 | Recall | 4.33 |
| 3 | F1-score | 5.96 |
| 4 | F1-macro | 49.29 |
| 5 | Hamming Loss | 0.11 |


## Benchmarking 2

In [66]:
y_test = np.array(yb2_test)
y_pred = np.array(model_b2.predict(Xb2_test).astype(int))
result_report(y_test, y_pred)

FINAL PERFORMANCE BY APPLIANCE (LABEL):

| | Appliance | Accuracy | Precision | Recall | F1-score | F1-macro | Hamming Loss |
|---|---|---|---|---|---|---|---|
| 0 | APLIANCE_3 | 52.4 | 52.17 | 68.35 | 59.18 | 51.05 | 0.48 |
| 1 | APLIANCE_4 | 59.5 | 51.48 | 52.1 | 51.79 | 58.44 | 0.4 |
| 2 | APLIANCE_6 | 98.35 | 2.86 | 3.03 | 2.94 | 51.05 | 0.02 |
| 3 | APLIANCE_7 | 95.95 | 2.07 | 13.04 | 3.57 | 50.75 | 0.04 |
| 4 | APLIANCE_10 | 98.95 | 75.0 | 71.25 | 73.08 | 86.27 | 0.01 |
| 5 | APLIANCE_11 | 96.95 | 53.33 | 43.41 | 47.86 | 73.15 | 0.03 |
| 6 | APLIANCE_13 | 99.42 | 35.71 | 26.32 | 30.3 | 65.01 | 0.01 |
| 7 | APLIANCE_17 | 96.92 | 5.32 | 12.82 | 7.52 | 52.98 | 0.03 |
| 8 | APLIANCE_19 | 88.62 | 27.66 | 3.0 | 5.41 | 49.68 | 0.11 |

OVERALL AVERAGE PERFORMANCE:

| | Metric | Result (%) |
|---|---|---|
| 0 | Accuracy | 87.45 |
| 1 | Precision | 33.96 |
| 2 | Recall | 32.59 |
| 3 | F1-score | 31.29 |
| 4 | F1-macro | 59.82 |
| 5 | Hamming Loss | 0.13 |


## Hypothesis

In [67]:
y_test = np.array(yh_test)
y_pred = np.array(model_h.predict(Xh_test))
result_report(y_test, y_pred)

FINAL PERFORMANCE BY APPLIANCE (LABEL):

| | Appliance | Accuracy | Precision | Recall | F1-score | F1-macro | Hamming Loss |
|---|---|---|---|---|---|---|---|
| 0 | APLIANCE_3 | 58.4 | 57.78 | 65.28 | 61.3 | 58.16 | 0.42 |
| 1 | APLIANCE_4 | 63.05 | 57.34 | 44.91 | 50.37 | 60.47 | 0.37 |
| 2 | APLIANCE_6 | 98.32 | 0.0 | 0.0 | 0.0 | 49.58 | 0.02 |
| 3 | APLIANCE_7 | 95.72 | 3.75 | 26.09 | 6.56 | 52.18 | 0.04 |
| 4 | APLIANCE_10 | 98.98 | 83.05 | 61.25 | 70.5 | 84.99 | 0.01 |
| 5 | APLIANCE_11 | 97.08 | 55.77 | 44.96 | 49.79 | 74.14 | 0.03 |
| 6 | APLIANCE_13 | 99.18 | 18.18 | 21.05 | 19.51 | 59.55 | 0.01 |
| 7 | APLIANCE_17 | 97.78 | 6.9 | 10.26 | 8.25 | 53.56 | 0.02 |
| 8 | APLIANCE_19 | 88.85 | 36.36 | 3.69 | 6.69 | 50.38 | 0.11 |

OVERALL AVERAGE PERFORMANCE:

| | Metric | Result (%) |
|---|---|---|
| 0 | Accuracy | 88.6 |
| 1 | Precision | 35.46 |
| 2 | Recall | 30.83 |
| 3 | F1-score | 30.33 |
| 4 | F1-macro | 60.33 |
| 5 | Hamming Loss | 0.11 |


# Conclusions

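In the scenario described, the hypothesis of using RPs for multi-label load classification achieves the best overall averages for **Precision** (35.46%) and **macro-averaged F1-score** (60.33%), and ties Benchmarking 1 for the lowest **Hamming Loss** (0.11). Its **Accuracy** (88.60%) is marginally below that of Benchmarking 1 (88.76%), although the near-zero Recall of Benchmarking 1 (4.33%) suggests that its accuracy comes largely from predicting the majority (negative) class. Benchmarking 2 (GAF) retains the best **Recall** (32.59%) and **F1-score** (31.29%).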
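
The overall averages reported in the three evaluation sections can be compared side by side. The sketch below copies those values into a plain dictionary and lists, for each metric, which approach scores best; the helper name `best_approaches` is hypothetical, and Hamming Loss is treated as lower-is-better.

```python
# Overall averages copied from the three "OVERALL AVERAGE PERFORMANCE" tables
overall = {
    "Benchmarking 1": {"Accuracy": 88.76, "Precision": 18.26, "Recall": 4.33,
                       "F1-score": 5.96, "F1-macro": 49.29, "Hamming Loss": 0.11},
    "Benchmarking 2": {"Accuracy": 87.45, "Precision": 33.96, "Recall": 32.59,
                       "F1-score": 31.29, "F1-macro": 59.82, "Hamming Loss": 0.13},
    "Hypothesis (RP)": {"Accuracy": 88.6, "Precision": 35.46, "Recall": 30.83,
                        "F1-score": 30.33, "F1-macro": 60.33, "Hamming Loss": 0.11},
}
LOWER_IS_BETTER = {"Hamming Loss"}

def best_approaches(results, metric):
    """Return every approach that attains the best value for `metric` (ties included)."""
    values = {name: scores[metric] for name, scores in results.items()}
    pick = min if metric in LOWER_IS_BETTER else max
    target = pick(values.values())
    return [name for name, value in values.items() if value == target]

for metric in overall["Benchmarking 1"]:
    print(f"{metric}: {', '.join(best_approaches(overall, metric))}")
```

These are single-run averages rounded to two decimals, so small gaps such as the 0.16-point difference in Accuracy should be read with caution.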