# MNIST with a Multi-Layer Perceptron

This experiment explores hyperparameter configurations for the Multi-Layer Perceptron (MLP) algorithm. Three hyperparameters were varied: the activation function (*activation*), either hyperbolic tangent (*tanh*) or ReLU (*relu*); the optimizer (*solver*), either *sgd* (stochastic gradient descent) or *adam* (a stochastic gradient-based optimizer with adaptive learning rates); and the size of the hidden layers, testing a single layer with 100 neurons, then two layers with 10 neurons each, two layers with 15 neurons each, and finally two layers with 20 neurons each.
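For reference, the two activation functions compared here can be written in a few lines of NumPy. This is an illustrative sketch of the functions themselves, not of scikit-learn's internal implementation; the helper `relu` and the sample vector `z` are names introduced only for this example.

```python
import numpy as np


def relu(z):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return np.maximum(0.0, z)


z = np.array([-2.0, 0.0, 2.0])

# tanh squashes inputs into (-1, 1) and is symmetric around zero.
print(np.tanh(z))

# ReLU clips negative inputs to zero and leaves positive inputs unchanged.
print(relu(z))
```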

To keep this experiment reproducible and comparable with the others, the dataset was split into 10 folds using stratified cross-validation with a random state of 42.
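As a minimal sketch of why the fixed seed matters, the snippet below splits a small synthetic label vector (an assumption for illustration, not the MNIST data) twice with the same `StratifiedKFold` settings and checks that both runs produce identical test folds. The helper `stratified_test_folds` and the toy arrays are hypothetical names, and the sketch covers only the fold assignment.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy data (hypothetical): 100 samples, 10 balanced classes.
y_toy = np.repeat(np.arange(10), 10)
X_toy = np.zeros((len(y_toy), 1))  # features are irrelevant for the split itself


def stratified_test_folds(seed):
    """Return the test-index arrays of a shuffled 10-fold stratified split."""
    skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    return [test_idx for _, test_idx in skf.split(X_toy, y_toy)]


folds_a = stratified_test_folds(42)
folds_b = stratified_test_folds(42)

# Same seed, same folds: every experiment sees the same partition.
same = all(np.array_equal(a, b) for a, b in zip(folds_a, folds_b))
print("identical folds:", same)

# Stratification: with 10 samples per class, each test fold holds one of each.
print("classes in first fold:", sorted(y_toy[folds_a[0]].tolist()))
```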

The hyperparameters were combined using grid search, which tries every combination of the options in the parameter dictionary and reports the best configuration found. That configuration is then used to fit a final model, which is serialized to a pickle file.
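To make the size of the search concrete, the sketch below uses scikit-learn's `ParameterGrid` to enumerate the combinations that a grid search over these options would try. The names `grid` and `candidates` are illustrative only.

```python
from sklearn.model_selection import ParameterGrid

# Same options as described in the text: 2 activations x 2 solvers x 4 layouts.
grid = {
    "activation": ["tanh", "relu"],
    "solver": ["sgd", "adam"],
    "hidden_layer_sizes": [(100,), (10, 10), (15, 15), (20, 20)],
}

candidates = list(ParameterGrid(grid))
print(len(candidates))       # number of hyperparameter combinations
print(len(candidates) * 10)  # number of fits with 10-fold cross-validation

# Show the first few combinations in the order they are generated.
for params in candidates[:3]:
    print(params)
```

With 16 combinations and 10 folds, the search performs 160 fits before refitting the best configuration.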

In [1]:
from sklearn.datasets import fetch_openml

import joblib
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import StratifiedKFold

from sklearn.neural_network import MLPClassifier

import time

In [2]:
mnist = fetch_openml('mnist_784', cache=True)

In [3]:
X = mnist["data"]
y = mnist["target"]

print(X.shape)
print(y.shape)

(70000, 784)
(70000,)


In [4]:
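# Note: (100) is the plain int 100, not a one-element tuple;
# scikit-learn treats it as a single hidden layer with 100 neurons.
param_grid = [{'activation': ['tanh', 'relu'],
               'solver': ['sgd', 'adam'],
               'hidden_layer_sizes': [(100), (10, 10), (15, 15), (20, 20)]}]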

kfolds = StratifiedKFold(n_splits=10, random_state=42, shuffle=True)

In [5]:
mlp = MLPClassifier()

# Time the whole grid search (16 candidates x 10 folds, plus the final refit).
inicio = time.time()
grid_search_mlp = GridSearchCV(mlp, param_grid, cv=kfolds, verbose=3)
resultado_modelo = grid_search_mlp.fit(X, y)
termino = time.time()
print("--- %s seconds to train the model ---" % (termino - inicio))

Fitting 10 folds for each of 16 candidates, totalling 160 fits
[CV 1/10] END activation=tanh, hidden_layer_sizes=100, solver=sgd;, score=0.961 total time=10.6min
[CV 2/10] END activation=tanh, hidden_layer_sizes=100, solver=sgd;, score=0.963 total time= 9.7min
[CV 3/10] END activation=tanh, hidden_layer_sizes=100, solver=sgd;, score=0.964 total time= 6.9min
[CV 4/10] END activation=tanh, hidden_layer_sizes=100, solver=sgd;, score=0.959 total time=10.9min
[CV 5/10] END activation=tanh, hidden_layer_sizes=100, solver=sgd;, score=0.958 total time= 5.9min
[CV 6/10] END activation=tanh, hidden_layer_sizes=100, solver=sgd;, score=0.955 total time= 6.8min
[CV 7/10] END activation=tanh, hidden_layer_sizes=100, solver=sgd;, score=0.967 total time= 9.4min
[CV 8/10] END activation=tanh, hidden_layer_sizes=100, solver=sgd;, score=0.962 total time=10.8min
[CV 9/10] END activation=tanh, hidden_layer_sizes=100, solver=sgd;, score=0.957 total time= 6.3min
[CV 10/10] END activation=tanh, hidden_layer_sizes=100, solver=sgd;, score=0.962 total time=10.2min
[CV 1/10] END activation=tanh, hidden_layer_sizes=100, solver=adam;, score=0.932 total time= 2.2min
[CV 2/10] END activation=tanh, hidden_layer_sizes=100, solver=adam;, score=0.951 total time= 8.6min
[CV 3/10] END activation=tanh, hidden_layer_sizes=100, solver=adam;, score=0.949 total time= 7.3min
[CV 4/10] END activation=tanh, hidden_layer_sizes=100, solver=adam;, score=0.942 total time= 3.8min
[CV 5/10] END activation=tanh, hidden_layer_sizes=100, solver=adam;, score=0.943 total time= 6.0min
[CV 6/10] END activation=tanh, hidden_layer_sizes=100, solver=adam;, score=0.948 total time= 7.2min
[CV
[... output truncated ...]
[CV 1/10] END activation=relu, hidden_layer_sizes=100, solver=sgd;, score=0.928 total time=10.5min
[CV 2/10] END activation=relu, hidden_layer_sizes=100, solver=sgd;, score=0.933 total time= 9.5min
[CV 3/10] END activation=relu, hidden_layer_sizes=100, solver=sgd;, score=0.938 total time= 9.5min
[CV 4/10] END activation=relu, hidden_layer_sizes=100, solver=sgd;, score=0.936 total time= 8.8min
[CV 5/10] END activation=relu, hidden_layer_sizes=100, solver=sgd;, score=0.928 total time= 6.8min
[CV 6/10] END activation=relu, hidden_layer_sizes=100, solver=sgd;, score=0.936 total time=10.3min
[CV 7/10] END activation=relu, hidden_layer_sizes=100, solver=sgd;, score=0.938 total time= 9.7min
[CV 8/10] END activation=relu, hidden_layer_sizes=100, solver=sgd;, score=0.910 total time=10.4min
[CV 9/10] END activation=relu, hidden_layer_sizes=100, solver=sgd;, score=0.939 total time= 8.9min
[CV 10/10] END activation=relu, hidden_layer_sizes=100, solver=sgd;, score=0.937 total time= 8.8min
[CV 1/10] END activation=relu, hidden_layer_sizes=100, solver=adam;, score=0.966 total time= 5.5min
[CV 2/10] END activation=relu, hidden_layer_sizes=100, solver=adam;, score=0.969 total time= 7.2min
[CV 3/10] END activation=relu, hidden_layer_sizes=100, solver=adam;, score=0.965 total time= 3.8min
[CV 4/10] END activation=relu, hidden_layer_sizes=100, solver=adam;, score=0.966 total time= 3.3min
[CV 5/10] END activation=relu, hidden_layer_sizes=100, solver=adam;, score=0.967 total time= 3.0min
[CV 6/10] END activation=relu, hidden_layer_sizes=100, solver=adam;, score=0.960 total time= 4.9min
[CV 7/10] END activation=relu, hidden_layer_sizes=100, solver=adam;, score=0.968 total time= 5.0min
[C
[... output truncated ...]
[CV 2/10] END activation=relu, hidden_layer_sizes=(10, 10), solver=adam;, score=0.916 total time= 8.8min
[CV 3/10] END activation=relu, hidden_layer_sizes=(10, 10), solver=adam;, score=0.902 total time= 8.6min
[CV 4/10] END activation=relu, hidden_layer_sizes=(10, 10), solver=adam;, score=0.923 total time= 7.1min
[CV 5/10] END activation=relu, hidden_layer_sizes=(10, 10), solver=adam;, score=0.872 total time= 8.9min
[CV 6/10] END activation=relu, hidden_layer_sizes=(10, 10), solver=adam;, score=0.890 total time= 8.3min
[CV 7/10] END activation=relu, hidden_layer_sizes=(10, 10), solver=adam;, score=0.905 total time= 6.4min
[CV 8/10] END activation=relu, hidden_layer_sizes=(10, 10), solver=adam;, score=0.911 total time= 7.0min
[CV 9/10] END activation=relu, hidden_layer_sizes=(10, 10), solver=adam;, score=0.905 total time= 8.8min
[CV 10/10] END activation=relu, hidden_layer_sizes=(10, 10), solver=adam;, score=0.871 total time= 8.2min
[CV 1/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=sgd;, score=0.113 total time=  49.3s
[CV 2/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=sgd;, score=0.820 total time= 5.7min
[CV 3/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=sgd;, score=0.288 total time= 1.9min
[CV 4/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=sgd;, score=0.750 total time= 4.9min
[CV 5/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=sgd;, score=0.113 total time=  49.9s
[CV 6/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=sgd;, score=0.113 total time=  49.3s
[CV 7/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=sgd;, score=0.304 total time= 3.8min
[CV 8/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=sgd;, score=0.112 total time=  50.5s
[CV 9/10] END activation=relu, hidden_layer_sizes=(15, 15), so
[... output truncated ...]
[CV 3/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=adam;, score=0.939 total time= 9.2min
[CV 4/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=adam;, score=0.923 total time= 9.2min
[CV 5/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=adam;, score=0.931 total time= 8.6min
[CV 6/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=adam;, score=0.912 total time= 9.2min
[CV 7/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=adam;, score=0.935 total time= 8.1min
[CV 8/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=adam;, score=0.940 total time= 8.7min
[CV 9/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=adam;, score=0.914 total time= 5.6min
[CV 10/10] END activation=relu, hidden_layer_sizes=(15, 15), solver=adam;, score=0.930 total time= 7.7min
[CV 1/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=sgd;, score=0.827 total time= 6.2min
[CV 2/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=sgd;, score=0.113 total time=  38.5s
[CV 3/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=sgd;, score=0.879 total time= 7.8min
[CV 4/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=sgd;, score=0.841 total time= 7.9min
[CV 5/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=sgd;, score=0.382 total time= 5.8min
[CV 6/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=sgd;, score=0.653 total time= 4.7min
[CV 7/10] END activation=relu, hidden_layer_sizes=(20, 20), 
[... output truncated ...]
[CV 1/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=adam;, score=0.938 total time= 9.4min
[CV 2/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=adam;, score=0.908 total time=10.1min
[CV 3/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=adam;, score=0.929 total time= 9.6min
[CV 4/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=adam;, score=0.932 total time= 9.7min
[CV 5/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=adam;, score=0.930 total time= 9.2min
[CV 6/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=adam;, score=0.931 total time= 9.6min
[CV 7/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=adam;, score=0.939 total time= 9.7min
[CV 8/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=adam;, score=0.938 total time= 9.8min
[CV 9/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=adam;, score=0.946 total time= 6.3min
[CV 10/10] END activation=relu, hidden_layer_sizes=(20, 20), solver=adam;, score=0.944 total time= 9.3min
--- 52900.93397808075 seconds to train the model ---


In [6]:
print("Best parameters")
print(grid_search_mlp.best_params_)
print("**************")
print("Best estimator")
print(grid_search_mlp.best_estimator_)
print("**************")
print("Best score")
print(grid_search_mlp.best_score_)
print("**************")
# Index of the best candidate in cv_results_
print(grid_search_mlp.best_index_)
print("**************")

Best parameters
{'activation': 'relu', 'hidden_layer_sizes': 100, 'solver': 'adam'}
**************
Best estimator
MLPClassifier(hidden_layer_sizes=100)
**************
Best score
0.9664428571428573
**************
9
**************
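The best score is the mean accuracy over the folds for the winning candidate. As a hedged sketch of how the remaining candidates could be compared, the snippet below runs a much smaller grid search on scikit-learn's `load_digits` dataset (a stand-in chosen so the example finishes quickly, not the MNIST data used in this notebook) and ranks the candidates from `cv_results_`. The reduced grid, `max_iter=50`, and all variable names are assumptions made for illustration.

```python
import warnings

from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neural_network import MLPClassifier

# Small stand-in dataset (assumption): 8x8 digit images instead of MNIST.
X_small, y_small = load_digits(return_X_y=True)

small_grid = {"activation": ["tanh", "relu"], "hidden_layer_sizes": [(20,)]}
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

# max_iter is kept low for speed, so convergence warnings are expected.
search = GridSearchCV(MLPClassifier(max_iter=50, random_state=42), small_grid, cv=cv)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    search.fit(X_small, y_small)

results = search.cv_results_

# Sort candidates by rank (1 = best mean test score).
order = sorted(range(len(results["params"])),
               key=lambda i: results["rank_test_score"][i])
for i in order:
    print(results["rank_test_score"][i],
          round(results["mean_test_score"][i], 3),
          results["params"][i])

# best_index_ points at the row of cv_results_ that holds the best candidate.
print(results["params"][search.best_index_] == search.best_params_)
```

Applied to the fitted `grid_search_mlp` object, the same loop over `cv_results_` would list all 16 candidates with their mean scores.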


In [7]:
# Persist the fitted GridSearchCV object, which includes the refit best estimator.
joblib.dump(grid_search_mlp, "modelo_mlp_mnist.pkl")

['modelo_mlp_mnist.pkl']
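A saved search object can later be restored with `joblib.load` and used for prediction. The sketch below demonstrates the round trip on a small stand-in: `load_digits`, the reduced grid, and the temporary file `search_demo.pkl` are assumptions made so the example is self-contained. The same `joblib.load` call would apply to `modelo_mlp_mnist.pkl`.

```python
import os
import tempfile
import warnings

import joblib
from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X_small, y_small = load_digits(return_X_y=True)

# Tiny search so the example runs quickly; refit=True (the default) trains
# the best configuration on all the data after cross-validation.
search = GridSearchCV(
    MLPClassifier(max_iter=50, random_state=0),
    {"hidden_layer_sizes": [(10,), (20,)]},
    cv=3,
)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    search.fit(X_small, y_small)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "search_demo.pkl")  # hypothetical file name
    joblib.dump(search, path)
    restored = joblib.load(path)

# The restored object should predict exactly like the original one.
same_predictions = bool((restored.predict(X_small) == search.predict(X_small)).all())
print(same_predictions)
print(type(restored.best_estimator_).__name__)
```

Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that wrote them.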