# Practical Project #4
## Multilayer Perceptron + GridSearchCV + WheatSeedsDataset


The goal of this Practical Project is to carry out a Machine Learning process for a Multiclass Classification task, using Artificial Neural Networks (ANNs) of the Multilayer Perceptron (MLP) type to classify three varieties of wheat (Kama, Rosa, Canadian) from the following seed measurements:

    Area, Perimeter, Compactness, Length, Width, Asymmetry Coefficient, and Length of the Kernel Groove

These data are found in the [WheatSeedsDataset](https://archive.ics.uci.edu/ml/datasets/seeds#).

To search for better parameters and hyperparameters for the ANN, this project uses a Grid Search that varies the activation function and the number of neurons in the hidden layers.

To evaluate the candidate ANNs, the Grid Search uses Cross-Validation with k=3 folds and accuracy as the performance metric.

Students:
    - Jean Phelipe de Oliveira Lima - 1615080096
    - Rodrigo Gomes de Souza - 1715310022

## Libraries

In [1]:
import pandas as pd
import numpy as np
from math import ceil
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score
import warnings
warnings.filterwarnings('ignore')

## Reading the WheatSeedsDataset

In [2]:
dataset = pd.read_csv('WheatSeedDataset.csv', sep='\t')
dataset.head()

|   | Area | Perimeter | Compactness | Length of Kernel | Width of Kernel | Asymmetry Coefficient | Length of Kernel Groove | Type |
|---|---|---|---|---|---|---|---|---|
| 0 | 15.26 | 14.84 | 0.871 | 5.763 | 3.312 | 2.221 | 5.22 | 1 |
| 1 | 14.88 | 14.57 | 0.8811 | 5.554 | 3.333 | 1.018 | 4.956 | 1 |
| 2 | 14.29 | 14.09 | 0.905 | 5.291 | 3.337 | 2.699 | 4.825 | 1 |
| 3 | 13.84 | 13.94 | 0.8955 | 5.324 | 3.379 | 2.259 | 4.805 | 1 |
| 4 | 16.14 | 14.99 | 0.9034 | 5.658 | 3.562 | 1.355 | 5.175 | 1 |


## Geometric Pyramid Rule

Implementation of the Geometric Pyramid Rule for determining the number of hidden neurons (the result is rounded up to an integer):

        Nh = α·√(Ni·No) ; Nh = Number of Hidden Neurons
                          Ni = Number of Input Neurons
                          No = Number of Output Neurons
                          α  = Constant (for this problem, α = [0.5, 2, 3] is adopted)

In [3]:
def piramide_geometrica(ni, no, alfa):
    # Nh = alfa * sqrt(ni * no), rounded up to the nearest integer
    nh = alfa*((ni*no)**(1/2))
    return ceil(nh)
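
As a worked illustration of the rule, the sketch below shows the value before and after rounding for the three α values adopted here, with 7 inputs and 3 outputs. The names `raw_hidden_neurons` and `rounded` are hypothetical and exist only for this example.

```python
from math import ceil, sqrt

def raw_hidden_neurons(n_in, n_out, alpha):
    # Value of alpha * sqrt(Ni * No) before rounding (hypothetical helper)
    return alpha * sqrt(n_in * n_out)

rounded = {}
for alpha in (0.5, 2, 3):
    raw = raw_hidden_neurons(7, 3, alpha)   # sqrt(7 * 3) is about 4.583
    rounded[alpha] = ceil(raw)              # round up, as piramide_geometrica does
    print('alpha = %.1f: raw = %.3f, Nh = %d' % (alpha, raw, rounded[alpha]))
```

Rounding up keeps at least one hidden neuron even for small α.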

## Distributing the Neurons Across Two Hidden Layers

Function that generates every possible 2-tuple representing how the neurons are distributed across the two hidden layers of an MLP, given the number of hidden neurons previously obtained from the Geometric Pyramid Rule.

In [4]:
def hidden_layers(layers, nh):
    # Appends to 'layers' (in place) every pair (i, nh-i) with 1 <= i < nh
    for i in range(1, nh):
        neurons_layers = (i, nh-i)
        layers.append(neurons_layers)
    return layers
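
To make the enumeration concrete, here is a minimal sketch of the same idea written as a function that returns a new list instead of appending to an existing one. The name `split_two_layers` is hypothetical and is not part of the notebook.

```python
def split_two_layers(nh):
    # Every (first_layer, second_layer) pair of positive integers that sums to nh
    return [(i, nh - i) for i in range(1, nh)]

pairs = split_two_layers(3)
print(pairs)                       # [(1, 2), (2, 1)]
print(len(split_two_layers(10)))   # 9, i.e. nh - 1 pairs
```

For a given Nh there are Nh - 1 such pairs, and order matters: (1, 2) and (2, 1) describe different networks.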

### Creating the List of Hidden Layers from the Geometric Pyramid Rule

In [5]:
num_in = 7
num_out = 3
alpha = [0.5, 2, 3]
layers = []

In [6]:
for i in range(len(alpha)):
    nh = piramide_geometrica(num_in, num_out, alpha[i])
    print('For α = %.1f, Nh = %d'%(alpha[i],nh))
    hidden_layers(layers, nh) # appends every possible hidden-layer split for this number of neurons to the 'layers' list

print()
print('Hidden Layer Distributions:\n')
for i in layers:
    print(i)

For α = 0.5, Nh = 3
For α = 2.0, Nh = 10
For α = 3.0, Nh = 14

Hidden Layer Distributions:

(1, 2)
(2, 1)
(1, 9)
(2, 8)
(3, 7)
(4, 6)
(5, 5)
(6, 4)
(7, 3)
(8, 2)
(9, 1)
(1, 13)
(2, 12)
(3, 11)
(4, 10)
(5, 9)
(6, 8)
(7, 7)
(8, 6)
(9, 5)
(10, 4)
(11, 3)
(12, 2)
(13, 1)


## Grid Search

The following are defined:
    - Parameters to vary in the grid search;
    - Number of folds for cross-validation;
    - Performance metric to consider;

In [7]:
parameters = {'solver': ['lbfgs'], 
              'activation': ['identity', 'logistic', 'tanh', 'relu'],
              'hidden_layer_sizes': layers,
              'max_iter':[1000],
              'learning_rate': ['adaptive', 'constant']}

gs = GridSearchCV(MLPClassifier(), 
                  parameters, 
                  cv=3, 
                  scoring='accuracy')
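
To get a sense of the size of this search, the sketch below counts the candidate configurations using only the standard library. It rebuilds the 24 hidden-layer tuples from the Nh values 3, 10, and 14; the variable names are hypothetical and chosen for this example only.

```python
from itertools import product

activations = ['identity', 'logistic', 'tanh', 'relu']
learning_rates = ['adaptive', 'constant']
# Same tuples as the 'layers' list, rebuilt from Nh = 3, 10, 14
layer_options = [(i, nh - i) for nh in (3, 10, 14) for i in range(1, nh)]

# 'solver' and 'max_iter' have a single value each, so they do not multiply the grid
candidates = list(product(activations, layer_options, learning_rates))
n_folds = 3

print(len(layer_options))            # 24
print(len(candidates))               # 192
print(len(candidates) * n_folds)     # 576 cross-validation fits
```

One caveat, as far as the scikit-learn documentation describes it: `learning_rate` is only used when `solver='sgd'`. With `solver='lbfgs'`, the `'adaptive'` and `'constant'` candidates are likely the same model, and any score difference between them probably comes from random weight initialization.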

In [8]:
x = dataset.drop(['Type'], axis = 1) # Predictor attributes
y = dataset.Type # Target attribute

### Training

Training of every ANN combination defined in GridSearchCV()

In [9]:
gs.fit(x, y)

GridSearchCV(cv=3, error_score='raise-deprecating',
       estimator=MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(100,), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
       random_state=None, shuffle=True, solver='adam', tol=0.0001,
       validation_fraction=0.1, verbose=False, warm_start=False),
       fit_params=None, iid='warn', n_jobs=None,
       param_grid={'solver': ['lbfgs'], 'activation': ['identity', 'logistic', 'tanh', 'relu'], 'hidden_layer_sizes': [(1, 2), (2, 1), (1, 9), (2, 8), (3, 7), (4, 6), (5, 5), (6, 4), (7, 3), (8, 2), (9, 1), (1, 13), (2, 12), (3, 11), (4, 10), (5, 9), (6, 8), (7, 7), (8, 6), (9, 5), (10, 4), (11, 3), (12, 2), (13, 1)], 'max_iter': [1000], 'learning_rate': ['adaptive', 'constant']},
       pre_dispatch='2*n_jobs', refit=True

# Results

### Accuracy and Parameters of the Best Model:

In [31]:
# Mean cross-validated accuracy of the best model (held-out folds)
print('Mean accuracy over the 3 test splits:',gs.best_score_)

print('\nParameters:')
for key in gs.best_params_.keys():
    print('\t',key, ': ', gs.best_params_[key])

Mean accuracy over the 3 test splits: 0.9333333333333333

Parameters:
	 activation :  identity
	 hidden_layer_sizes :  (8, 2)
	 learning_rate :  adaptive
	 max_iter :  1000
	 solver :  lbfgs


### DataFrame - Performance of Each ANN

In [29]:
results = pd.DataFrame(gs.cv_results_)
results.head(10)

|   | mean_fit_time | std_fit_time | mean_score_time | std_score_time | param_activation | param_hidden_layer_sizes | param_learning_rate | param_max_iter | param_solver | params | ... | split1_test_score | split2_test_score | mean_test_score | std_test_score | rank_test_score | split0_train_score | split1_train_score | split2_train_score | mean_train_score | std_train_score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.723682 | 0.703837 | 0.002908 | 0.002906 | identity | (1, 2) | adaptive | 1000 | lbfgs | {'activation': 'identity', 'hidden_layer_sizes... | ... | 0.913043 | 0.782609 | 0.852381 | 0.053243 | 87 | 0.891304 | 0.851064 | 0.914894 | 0.885754 | 0.026352 |
| 1 | 0.211971 | 0.059121 | 0.000798 | 2.7e-05 | identity | (1, 2) | constant | 1000 | lbfgs | {'activation': 'identity', 'hidden_layer_sizes... | ... | 0.913043 | 0.753623 | 0.842857 | 0.065948 | 101 | 0.862319 | 0.851064 | 0.914894 | 0.876092 | 0.027819 |
| 2 | 0.094457 | 0.019507 | 0.000777 | 5.9e-05 | identity | (2, 1) | adaptive | 1000 | lbfgs | {'activation': 'identity', 'hidden_layer_sizes... | ... | 0.913043 | 0.782609 | 0.852381 | 0.053243 | 87 | 0.891304 | 0.851064 | 0.914894 | 0.885754 | 0.026352 |
| 3 | 0.127001 | 0.024352 | 0.000803 | 2.1e-05 | identity | (2, 1) | constant | 1000 | lbfgs | {'activation': 'identity', 'hidden_layer_sizes... | ... | 0.913043 | 0.782609 | 0.852381 | 0.053243 | 87 | 0.891304 | 0.851064 | 0.914894 | 0.885754 | 0.026352 |
| 4 | 0.176594 | 0.162363 | 0.000789 | 4.1e-05 | identity | (1, 9) | adaptive | 1000 | lbfgs | {'activation': 'identity', 'hidden_layer_sizes... | ... | 0.942029 | 0.782609 | 0.861905 | 0.064619 | 79 | 0.869565 | 0.87234 | 0.914894 | 0.8856 | 0.020745 |
| 5 | 0.140351 | 0.01491 | 0.000813 | 5.5e-05 | identity | (1, 9) | constant | 1000 | lbfgs | {'activation': 'identity', 'hidden_layer_sizes... | ... | 0.913043 | 0.782609 | 0.852381 | 0.053243 | 87 | 0.884058 | 0.851064 | 0.914894 | 0.883338 | 0.026063 |
| 6 | 0.235429 | 0.152976 | 0.00078 | 5.9e-05 | identity | (2, 8) | adaptive | 1000 | lbfgs | {'activation': 'identity', 'hidden_layer_sizes... | ... | 0.942029 | 0.84058 | 0.914286 | 0.051991 | 21 | 1.0 | 0.957447 | 1.0 | 0.985816 | 0.02006 |
| 7 | 0.286647 | 0.056155 | 0.000834 | 2.3e-05 | identity | (2, 8) | constant | 1000 | lbfgs | {'activation': 'identity', 'hidden_layer_sizes... | ... | 1.0 | 0.73913 | 0.9 | 0.113822 | 45 | 0.869565 | 0.964539 | 0.992908 | 0.942337 | 0.052745 |
| 8 | 0.231009 | 0.13168 | 0.000762 | 6.8e-05 | identity | (3, 7) | adaptive | 1000 | lbfgs | {'activation': 'identity', 'hidden_layer_sizes... | ... | 0.956522 | 0.811594 | 0.904762 | 0.065362 | 37 | 0.992754 | 0.964539 | 1.0 | 0.985764 | 0.015297 |
| 9 | 0.235373 | 0.120501 | 0.000759 | 3.4e-05 | identity | (3, 7) | constant | 1000 | lbfgs | {'activation': 'identity', 'hidden_layer_sizes... | ... | 0.985507 | 0.84058 | 0.92381 | 0.060604 | 12 | 1.0 | 0.957447 | 1.0 | 0.985816 | 0.02006 |
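
The first ten rows follow the grid order, not the ranking. As a small illustration of how candidates can be ordered by mean test score, the sketch below sorts four of the rows shown above (indices 6 to 9) in plain Python. The `shown_rows` list is transcribed from the table for this example only. On the full DataFrame, something like `results.sort_values('rank_test_score')` should give the equivalent ordering.

```python
# (hidden_layer_sizes, learning_rate, mean_test_score) copied from rows 6-9 above
shown_rows = [
    ((2, 8), 'adaptive', 0.914286),
    ((2, 8), 'constant', 0.900000),
    ((3, 7), 'adaptive', 0.904762),
    ((3, 7), 'constant', 0.923810),
]

# Highest mean test score first
ranked = sorted(shown_rows, key=lambda row: row[2], reverse=True)
for sizes, learning_rate, score in ranked:
    print(sizes, learning_rate, score)
```

Among these four rows, (3, 7) with `'constant'` comes first, which agrees with its `rank_test_score` of 12 being the lowest of the four.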
