## _Raw_ data with _smote_ and _dropout_

We load the data and split it into the predictor variables, X, and the target variable, y.

In [4]:
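import pandas as pd
import numpy as np

# The file has no header row, so we name the columns ourselves.
data = pd.read_csv("krkopt.data", header=None)
data.columns = ["wkc", "wkr", "wrc", "wrr", "bkc", "bkr", "opt rank"]
# Copy the slice so that the in-place encoding below does not act on a view of `data`.
X = data.iloc[:, 0:6].copy()
y = data['opt rank']
X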

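| | wkc | wkr | wrc | wrr | bkc | bkr |
|---|---|---|---|---|---|---|
| 0 | a | 1 | b | 3 | c | 2 |
| 1 | a | 1 | c | 1 | c | 2 |
| 2 | a | 1 | c | 1 | d | 1 |
| 3 | a | 1 | c | 1 | d | 2 |
| 4 | a | 1 | c | 2 | c | 1 |
| ... | ... | ... | ... | ... | ... | ... |
| 28051 | b | 1 | g | 7 | e | 5 |
| 28052 | b | 1 | g | 7 | e | 6 |
| 28053 | b | 1 | g | 7 | e | 7 |
| 28054 | b | 1 | g | 7 | f | 5 |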


We encode the values "a", "b", "c", ... of the categorical variables as integers so that the neural networks can use them.

In [5]:
X["wkc"]=X["wkc"].astype('category')
X["wrc"]=X["wrc"].astype('category')
X["bkc"]=X["bkc"].astype('category')
X["wkc"]=X["wkc"].cat.codes
X["wrc"]=X["wrc"].cat.codes
X["bkc"]=X["bkc"].cat.codes
y = y.astype('category')
y = y.cat.codes
X

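| | wkc | wkr | wrc | wrr | bkc | bkr |
|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 3 | 2 | 2 |
| 1 | 0 | 1 | 2 | 1 | 2 | 2 |
| 2 | 0 | 1 | 2 | 1 | 3 | 1 |
| 3 | 0 | 1 | 2 | 1 | 3 | 2 |
| 4 | 0 | 1 | 2 | 2 | 2 | 1 |
| ... | ... | ... | ... | ... | ... | ... |
| 28051 | 1 | 1 | 6 | 7 | 4 | 5 |
| 28052 | 1 | 1 | 6 | 7 | 4 | 6 |
| 28053 | 1 | 1 | 6 | 7 | 4 | 7 |
| 28054 | 1 | 1 | 6 | 7 | 5 | 5 |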
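
To make the mapping explicit, the following minimal sketch mimics what `cat.codes` does for a string column, assuming the categories are ordered lexicographically (the default pandas infers for strings). The helper `encode_categories` is hypothetical and not part of the original notebook.

```python
def encode_categories(values):
    """Map each distinct value to its position in the sorted list of categories."""
    categories = sorted(set(values))
    mapping = {category: code for code, category in enumerate(categories)}
    return [mapping[v] for v in values], mapping

# Toy column of chessboard files, similar to "wkc" in the dataset.
codes, mapping = encode_categories(["a", "c", "b", "a", "d"])
print(mapping)
print(codes)
```

The codes carry an order ("a" < "b" < "c"), which is meaningful for board files but would be arbitrary for unordered categories.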


In [6]:
from sklearn.preprocessing import OneHotEncoder, LabelEncoder
from sklearn.model_selection import train_test_split
from tensorflow import keras
import tensorflow as tf
from tensorflow.keras.utils import to_categorical

We create the training and test sets with an 80-20 split using _sklearn_ tools, and apply oversampling with _smote_.

In [7]:

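from imblearn.over_sampling import SMOTE

# 80-20 split of the original (raw) data.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=1)

# Oversample the minority classes. Note that SMOTE is applied to the full
# dataset before splitting, so the SMOTE test split contains synthetic samples.
oversample = SMOTE()
X_smote, y_smote = oversample.fit_resample(X, y)

X_train_smote, X_test_smote, y_train_smote, y_test_smote = train_test_split(
    X_smote, y_smote, test_size=0.2, random_state=1)

# One-hot encode the targets for the softmax output layer.
y_train = to_categorical(y_train)
y_test  = to_categorical(y_test)
y_train_smote = to_categorical(y_train_smote)
y_test_smote = to_categorical(y_test_smote)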

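As we can see, _smote_ accepts integer-coded categorical data, although this is not the most advisable approach, since it interpolates between samples as if the codes were numeric quantities.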

In [9]:
X_smote

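| | wkc | wkr | wrc | wrr | bkc | bkr |
|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 3 | 2 | 2 |
| 1 | 0 | 1 | 2 | 1 | 2 | 2 |
| 2 | 0 | 1 | 2 | 1 | 3 | 1 |
| 3 | 0 | 1 | 2 | 1 | 3 | 2 |
| 4 | 0 | 1 | 2 | 2 | 2 | 1 |
| ... | ... | ... | ... | ... | ... | ... |
| 81949 | 2 | 3 | 0 | 1 | 2 | 1 |
| 81950 | 2 | 2 | 0 | 7 | 0 | 2 |
| 81951 | 2 | 1 | 0 | 8 | 0 | 1 |
| 81952 | 2 | 1 | 0 | 8 | 0 | 1 |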
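
The reason for caution with coded categories is that SMOTE builds each synthetic sample by interpolating between a minority sample and one of its nearest neighbours, treating every column as numeric. The sketch below illustrates only that interpolation step in plain Python. `smote_point` is a hypothetical helper and the neighbour is chosen by hand; the real `imblearn` implementation also performs the neighbour search, and the integer values shown in `X_smote` suggest it casts results back to the input dtype, though that is an assumption here.

```python
import random

def smote_point(sample, neighbour, rng):
    """Return a random point on the segment between `sample` and `neighbour`."""
    u = rng.random()  # interpolation factor in [0, 1)
    return [s + u * (n - s) for s, n in zip(sample, neighbour)]

rng = random.Random(1)
# Two rows in the encoded layout (wkc, wkr, wrc, wrr, bkc, bkr); values are illustrative.
a = [0, 1, 2, 1, 3, 1]
b = [2, 3, 0, 1, 2, 1]
synthetic = smote_point(a, b, rng)
print(synthetic)
```

Columns where both rows agree are preserved, while the others land somewhere in between, which for a coded file such as `wkc` may be a fractional value with no direct meaning on the board.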


We standardize the input data using the _z-score_.

In [10]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

scaler = StandardScaler().fit(X_train_smote)
X_train_smote = scaler.transform(X_train_smote)
X_test_smote = scaler.transform(X_test_smote)
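
As a reference for what `StandardScaler` computes, here is a minimal plain-Python sketch of z-score standardization for a single feature. The statistics are estimated on the training values only and then reused for the test values. `fit_zscore` and `apply_zscore` are hypothetical names, the numbers are made up, and the population standard deviation is used, as `StandardScaler` does.

```python
import statistics

def fit_zscore(train_values):
    """Estimate mean and population standard deviation from training data."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)
    return mean, std

def apply_zscore(values, mean, std):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / std for v in values]

train = [1, 2, 3, 4, 5, 6, 7, 8]
test = [2, 8]
mean, std = fit_zscore(train)
train_z = apply_zscore(train, mean, std)
test_z = apply_zscore(test, mean, std)
print(mean, std)
print(test_z)
```

A model should be evaluated on data transformed with the same statistics that were used for its training data.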

### Function definitions

We define four functions that help automate the process:
- _make_my_model_multi_ builds a multilayer perceptron from a structure given as $[n_1, n_2, ..., n_i, ..., n_N]$, where $n_i$ is the number of neurons in hidden layer $i$. It also takes the input size, the output size and the activation function to use.
- _compile_fit_multiclass_ trains the given model on the training set, holding out 20% of it for validation (an 80-20 split), and returns predictions for a test set. We use _ModelCheckpoint_ to save the best weights seen during training and _EarlyStopping_ to halt training if the validation _loss_ does not improve for 10 epochs, which helps limit overfitting.
- _compute_metrics_multiclass_ computes precision, recall, F1 and Kappa from the predictions and the true test labels.
- _make_my_model_multi_dropout_ is a variant that inserts dropout layers between the hidden layers. The structure is again given as $[n_1, n_2, ..., n_i, ..., n_N]$, where an integer $n_i$ denotes a dense layer with $n_i$ neurons and a string $n_i$ denotes a dropout layer with rate $n_i$. For example, [50, "0.2", 50, "0.2"] produces the hidden layers: dense with 50 neurons, dropout 0.2, dense with 50 neurons, dropout 0.2.
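
The interplay between _EarlyStopping_ and _ModelCheckpoint_ described above can be sketched without Keras. The following toy loop takes a list of validation losses (made-up numbers, not from the notebook), remembers the best epoch as the checkpoint would, and stops once the loss has failed to improve for `patience` consecutive epochs. `early_stopping_trace` is a hypothetical helper for illustration only.

```python
def early_stopping_trace(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Epochs are counted from 1. Training stops after `patience` consecutive
    epochs without a new minimum of the validation loss.
    """
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best: this is the epoch a checkpoint would keep.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.69, 0.70, 0.75, 0.74]
stop_epoch, best_epoch = early_stopping_trace(losses, patience=3)
print(stop_epoch, best_epoch)
```

Reloading the checkpoint after training corresponds to going back to the weights of `best_epoch` rather than keeping those of `stop_epoch`.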

In [11]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense


checkpoint_filepath = '/tmp/checkpoint'
from sklearn.metrics import confusion_matrix, precision_score, \
f1_score, cohen_kappa_score, recall_score

def make_my_model_multi( units_per_layer, input_s, output_s, activation_='relu'):
    model = Sequential()
    depth = len(units_per_layer)
    model.add(Dense(units_per_layer[0], activation=activation_, input_shape=(input_s,)))
    for i in range(1, depth):
        model.add(Dense(units_per_layer[i], activation=activation_))
    model.add(Dense(output_s, activation = 'softmax'))   
    
    return model




def compile_fit_multiclass(modelo, X_train, X_test, y_train, batch, epochs, verbose=0):
    modelo.compile(loss='categorical_crossentropy',
             optimizer='adam',
             metrics=['accuracy'])
    
    early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, verbose=True)

    model_checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_filepath,
                                                        save_weights_only=True,
                                                        monitor='val_loss',
                                                        mode='min',
                                                        save_best_only=True,
                                                        verbose=False)

    modelo.fit(X_train, y_train, epochs=epochs, batch_size=batch, verbose=verbose, validation_split=0.2, callbacks = [early_stopping, model_checkpoint])
    # Restore the best weights into the model passed as argument (not a global `model`).
    modelo.load_weights(checkpoint_filepath)
    predictions = modelo.predict(X_test)
    return predictions

def compute_metrics_multiclass(y_test, y_pred):
    results=[]
    results.append(precision_score(y_test, np.round(y_pred), average="micro"))
    results.append(recall_score(y_test, np.round(y_pred), average="micro"))
    results.append(f1_score(y_test, np.round(y_pred), average="micro"))
    results.append(cohen_kappa_score(y_test, np.round(y_pred)))
    return results


from tensorflow.keras.layers import Dropout

def make_my_model_multi_dropout( units_per_layer, input_s, output_s, activation_='relu'):
    model = Sequential()
    depth = len(units_per_layer)
    model.add(Dense(units_per_layer[0], activation=activation_, input_shape=(input_s,)))
    for i in range(1, depth):
        if isinstance(units_per_layer[i], str):
            a = units_per_layer[i]
            dropout_r = float(a)
            model.add(Dropout(dropout_r))
        else:
            model.add(Dense(units_per_layer[i], activation=activation_))
    model.add(Dense(output_s, activation = 'softmax'))   
    
    return model
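
To see how `make_my_model_multi_dropout` reads its specification list without building a Keras model, this sketch applies the same rule (strings become dropout rates, integers become dense layers) and returns a plain description. `describe_layers` is a hypothetical helper, not part of the notebook.

```python
def describe_layers(units_per_layer):
    """Translate a spec such as [50, "0.2"] into (layer_type, value) pairs."""
    layers = []
    for item in units_per_layer:
        if isinstance(item, str):
            layers.append(("dropout", float(item)))
        else:
            layers.append(("dense", item))
    return layers

print(describe_layers([50, "0.2", 50, "0.2"]))
```

One difference worth keeping in mind: the notebook function always treats the first element as a dense layer, so a specification must start with an integer.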

We check that the data have the correct dimensions.

In [22]:
X_train.shape, X_train_smote.shape, y_train.shape, y_train_smote.shape, X_test.shape, y_test.shape

((22444, 6), (65563, 6), (22444, 18), (65563, 18), (5612, 6), (5612, 18))
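
The 18 columns in the target arrays come from the one-hot encoding produced by `to_categorical`, one column per class. Below is a minimal plain-Python sketch of that encoding and of the `argmax` decoding used later in the experiments. `one_hot` and `argmax` are hypothetical helpers defined here, and the probability vector is made up.

```python
def one_hot(label, num_classes):
    """Return a vector with 1.0 at position `label` and 0.0 elsewhere."""
    row = [0.0] * num_classes
    row[label] = 1.0
    return row

def argmax(row):
    """Index of the largest entry, i.e. the predicted (or true) class."""
    return max(range(len(row)), key=row.__getitem__)

encoded = one_hot(3, 18)
print(len(encoded), argmax(encoded))

probs = [0.1, 0.7, 0.2]  # toy softmax output for three classes
print(argmax(probs))
```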

## Experiments

### _Raw_ data
We run a total of thirty experiments. In each one we build a neural network whose hidden layers all have the same number of neurons, varying both that number and the number of layers. The number of neurons per layer takes the values [50, 100, 150, 200, 250], and the depth ranges from one to six hidden layers. We store the predictions together with the metrics and the confusion matrix, and write the resulting object to disk with _joblib_ for later analysis.
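
As a quick check of the experiment count, the grid can be enumerated in plain Python. This is only a sketch of the configuration space, in the same order the loops in the notebook visit it (sizes in the outer loop, depths in the inner one); the name `configs` is introduced here for illustration.

```python
size_config = [50, 100, 150, 200, 250]
max_depth = 6

# One configuration per (size, depth) pair: `depth` hidden layers of `size` neurons.
configs = [[size] * depth
           for size in size_config
           for depth in range(1, max_depth + 1)]

print(len(configs))
print(configs[0], configs[-1])
```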

In [15]:
results = []
seed = 1
from sklearn.metrics import confusion_matrix

In [54]:
size_config = [50, 100, 150, 200, 250]
for size in size_config:
    layer_config = [[size], [size]*2, [size]*3, [size]*4, [size]*5, [size]*6]
    for layers in layer_config:
        np.random.seed(seed)
        tf.random.set_seed(seed)
        print(layers)
        model = make_my_model_multi(layers, 6, 18, activation_='relu' )
        preds = compile_fit_multiclass(model, X_train, X_test, y_train, 256, 300, verbose=0)
        # sklearn metrics expect the arguments in the order (y_true, y_pred).
        metrics = compute_metrics_multiclass(np.argmax(y_test, axis = 1), np.argmax(preds, axis = 1))
        confusion = confusion_matrix(np.argmax(y_test, axis = 1), np.argmax(preds, axis = 1))
        aux = { "layer config" : layers,
               #"Model": model,
               "Predictions" : preds,
               "Metrics" : metrics,
               "Confusion" : confusion

        }
        print(metrics)
        results.append(aux)
    

[50]
[0.5563079116179616, 0.5563079116179616, 0.5563079116179616, 0.5014778486804247]
[50, 50]
Epoch 00278: early stopping
[0.6776550249465432, 0.6776550249465432, 0.6776550249465432, 0.6394872591777784]
[50, 50, 50]
Epoch 00189: early stopping
[0.7004632929436921, 0.7004632929436921, 0.7004632929436921, 0.66527050745813]
[50, 50, 50, 50]
Epoch 00135: early stopping
[0.7033143264433357, 0.7033143264433357, 0.7033143264433357, 0.6683596780441357]
[50, 50, 50, 50, 50]
Epoch 00083: early stopping
[0.7140057020669993, 0.7140057020669993, 0.7140057020669993, 0.6802859773047576]
[50, 50, 50, 50, 50, 50]
Epoch 00101: early stopping
[0.7325374198146828, 0.7325374198146828, 0.7325374198146829, 0.7014741703016625]
[100]
[0.5841054882394868, 0.5841054882394868, 0.5841054882394868, 0.532926380750063]
[100, 100]
Epoch 00226: early stopping
[0.7145402708481825, 0.7145402708481825, 0.7145402708481825, 0.6810448690167336]
[100, 100, 100]
Epoch 00145: early stopping
[0.7808267997148967, 0.7808267997148
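
The first three numbers in every printed list are identical. This is expected: with exactly one label per sample, micro-averaged precision, recall and F1 all reduce to the accuracy. The fourth number is Cohen's kappa, which corrects the accuracy for agreement expected by chance. The sketch below computes both on a toy example in plain Python; the labels are made up, and `accuracy` and `cohen_kappa` are hypothetical helpers rather than the sklearn functions.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected if labels were assigned independently with the same marginals.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(accuracy(y_true, y_pred))
print(cohen_kappa(y_true, y_pred))
```

Because kappa discounts chance agreement, it is always at or below the accuracy when the accuracy exceeds the chance level, which matches the pattern in the printed results.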

In [65]:
import joblib
 
joblib.dump(results, 'results_1_joblib')

['results_1_joblib']

### Data with _smote_

We repeat the same scheme with the _smote_ data.

In [30]:
results_smote = []
seed = 1

In [31]:
size_config = [50, 100, 150, 200, 250]
for size in size_config:
    layer_config = [[size], [size]*2, [size]*3, [size]*4, [size]*5, [size]*6]
    for layers in layer_config:
        np.random.seed(seed)
        tf.random.set_seed(seed)
        print(layers)
        model = make_my_model_multi(layers, 6, 18, activation_='relu' )
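        # Note: X_test was standardized with the scaler fitted on the raw X_train,
        # whereas X_train_smote was standardized with its own scaler.
        preds = compile_fit_multiclass(model, X_train_smote, X_test, y_train_smote, 256, 300, verbose=0)
        # sklearn metrics expect the arguments in the order (y_true, y_pred).
        metrics = compute_metrics_multiclass(np.argmax(y_test, axis = 1), np.argmax(preds, axis = 1))
        confusion = confusion_matrix(np.argmax(y_test, axis = 1), np.argmax(preds, axis = 1))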
        aux = { "layer config" : layers,
               #"Model": model,
               "Predictions" : preds,
               "Metrics" : metrics,
               "Confusion" : confusion

        }
        print(metrics)
        results_smote.append(aux)
    

[50]
[0.2735210263720599, 0.2735210263720599, 0.2735210263720599, 0.21056283287164057]
[50, 50]
Epoch 00195: early stopping
[0.3016749821810406, 0.3016749821810406, 0.3016749821810406, 0.23858740268705947]
[50, 50, 50]
Epoch 00149: early stopping
[0.319672131147541, 0.319672131147541, 0.319672131147541, 0.2573189896196104]
[50, 50, 50, 50]
Epoch 00150: early stopping
[0.3257305773342837, 0.3257305773342837, 0.3257305773342837, 0.26329646382574623]
[50, 50, 50, 50, 50]
Epoch 00105: early stopping
[0.32216678545972915, 0.32216678545972915, 0.32216678545972915, 0.26004197522733685]
[50, 50, 50, 50, 50, 50]
Epoch 00170: early stopping
[0.3310762651461155, 0.3310762651461155, 0.3310762651461155, 0.2689915712533639]
[100]
[0.29971489665003564, 0.29971489665003564, 0.29971489665003564, 0.23722592685369615]
[100, 100]
Epoch 00174: early stopping
[0.32323592302209553, 0.32323592302209553, 0.32323592302209553, 0.26046775820865786]
[100, 100, 100]
Epoch 00133: early stopping
[0.3341054882394868, 

In [33]:
joblib.dump(results_smote, 'results_smote_joblib')

['results_smote_joblib']

### _Raw_ data with _dropout_
For the dropout experiments we follow the same scheme as before, inserting a dropout layer after each dense layer, with three possible rates: 0.1, 0.2 and 0.3.

In [33]:
results_dropout = []
seed = 1

We print the configuration of the experiments. In total there are ninety cases.

In [61]:
size_config = [50, 100, 150, 200, 250]
dropout_rate = ["0.1", "0.2", "0.3"]

for size in size_config:
    for size_d in (dropout_rate):
        layer_config_dense = [[size], [size]*2, [size]*3, [size]*4, [size]*5, [size]*6]
        layer_config_dropout = [[size_d], [size_d]*2, [size_d]*3, [size_d]*4, [size_d]*5, [size_d]*6]
        for layers_dense, layers_dropout in zip(layer_config_dense, layer_config_dropout):
            final_design = [None]*(len(layers_dense)+len(layers_dropout))
            final_design[::2] = layers_dense
            final_design[1::2] = layers_dropout
            print(final_design)

[50, '0.1']
[50, '0.1', 50, '0.1']
[50, '0.1', 50, '0.1', 50, '0.1']
[50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1']
[50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1']
[50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1']
[50, '0.2']
[50, '0.2', 50, '0.2']
[50, '0.2', 50, '0.2', 50, '0.2']
[50, '0.2', 50, '0.2', 50, '0.2', 50, '0.2']
[50, '0.2', 50, '0.2', 50, '0.2', 50, '0.2', 50, '0.2']
[50, '0.2', 50, '0.2', 50, '0.2', 50, '0.2', 50, '0.2', 50, '0.2']
[50, '0.3']
[50, '0.3', 50, '0.3']
[50, '0.3', 50, '0.3', 50, '0.3']
[50, '0.3', 50, '0.3', 50, '0.3', 50, '0.3']
[50, '0.3', 50, '0.3', 50, '0.3', 50, '0.3', 50, '0.3']
[50, '0.3', 50, '0.3', 50, '0.3', 50, '0.3', 50, '0.3', 50, '0.3']
[100, '0.1']
[100, '0.1', 100, '0.1']
[100, '0.1', 100, '0.1', 100, '0.1']
[100, '0.1', 100, '0.1', 100, '0.1', 100, '0.1']
[100, '0.1', 100, '0.1', 100, '0.1', 100, '0.1', 100, '0.1']
[100, '0.1', 100, '0.1', 100, '0.1', 100, '0.1', 100, '0.1', 100, '0.1']
[100, '0.2']
[100, '0.2', 100, 
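
The designs printed above come from interleaving two equal-length lists with slice assignment. An equivalent formulation, shown here only as a sketch, pairs the elements with `zip`; `interleave` is a hypothetical helper. The snippet also confirms the count of ninety configurations.

```python
def interleave(dense, dropout):
    """Alternate elements: dense[0], dropout[0], dense[1], dropout[1], ..."""
    design = []
    for units, rate in zip(dense, dropout):
        design.extend([units, rate])
    return design

size_config = [50, 100, 150, 200, 250]
dropout_rate = ["0.1", "0.2", "0.3"]

# Same loop order as the notebook: size, then dropout rate, then depth 1..6.
designs = [interleave([size] * depth, [rate] * depth)
           for size in size_config
           for rate in dropout_rate
           for depth in range(1, 7)]

print(len(designs))
print(designs[1])
```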

In [62]:
size_config = [50, 100, 150, 200, 250]
dropout_rate = ["0.1", "0.2", "0.3"]

for size in size_config:
    for size_d in (dropout_rate):
        layer_config_dense = [[size], [size]*2, [size]*3, [size]*4, [size]*5, [size]*6]
        layer_config_dropout = [[size_d], [size_d]*2, [size_d]*3, [size_d]*4, [size_d]*5, [size_d]*6]
        for layers_dense, layers_dropout in zip(layer_config_dense, layer_config_dropout):
            final_design = [None]*(len(layers_dense)+len(layers_dropout))
            final_design[::2] = layers_dense
            final_design[1::2] = layers_dropout
            np.random.seed(seed)
            tf.random.set_seed(seed)
            print(final_design)
            model = make_my_model_multi_dropout(final_design, 6, 18, activation_='relu' )
            preds = compile_fit_multiclass(model, X_train, X_test, y_train, 256, 300, verbose=0)
            # sklearn metrics expect the arguments in the order (y_true, y_pred).
            metrics = compute_metrics_multiclass(np.argmax(y_test, axis = 1), np.argmax(preds, axis = 1))
            confusion = confusion_matrix(np.argmax(y_test, axis = 1), np.argmax(preds, axis = 1))
            aux = { "layer config" : final_design,
                   #"Model": model,
                   "Predictions" : preds,
                   "Metrics" : metrics,
                   "Confusion" : confusion

            }
            print(metrics)
            results_dropout.append(aux)
    

[50, '0.1']
[0.5306486101211689, 0.5306486101211689, 0.5306486101211689, 0.47299466324494865]
[50, '0.1', 50, '0.1']
[0.6582323592302209, 0.6582323592302209, 0.6582323592302209, 0.6168927965574147]
[50, '0.1', 50, '0.1', 50, '0.1']
Epoch 00274: early stopping
[0.7109764789736279, 0.7109764789736279, 0.7109764789736278, 0.677058966107785]
[50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1']
Epoch 00187: early stopping
[0.7066999287241625, 0.7066999287241625, 0.7066999287241625, 0.6719743342215174]
[50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1']
Epoch 00152: early stopping
[0.6903064861012117, 0.6903064861012117, 0.6903064861012117, 0.6533659428663383]
[50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1']
Epoch 00176: early stopping
[0.6915538132573058, 0.6915538132573058, 0.6915538132573058, 0.6548578250956504]
[50, '0.2']
Epoch 00253: early stopping
[0.5172843905915895, 0.5172843905915895, 0.5172843905915895, 0.45668070032612573]
[50, '0.2', 50, '0.2']
[0.607448325017819, 

In [64]:
joblib.dump(results_dropout, 'results_dropout')

['results_dropout']

### _Dropout_ with _smote_
We repeat the procedure with the _smote_ data.

In [25]:
results_dropout_smote = []
seed = 1

In [26]:
size_config = [50, 100, 150, 200, 250]
dropout_rate = ["0.1", "0.2", "0.3"]

for size in size_config:
    for size_d in (dropout_rate):
        layer_config_dense = [[size], [size]*2, [size]*3, [size]*4, [size]*5, [size]*6]
        layer_config_dropout = [[size_d], [size_d]*2, [size_d]*3, [size_d]*4, [size_d]*5, [size_d]*6]
        for layers_dense, layers_dropout in zip(layer_config_dense, layer_config_dropout):
            final_design = [None]*(len(layers_dense)+len(layers_dropout))
            final_design[::2] = layers_dense
            final_design[1::2] = layers_dropout
            np.random.seed(seed)
            tf.random.set_seed(seed)
            print(final_design)
            model = make_my_model_multi_dropout(final_design, 6, 18, activation_='relu' )
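            # Note: X_test was standardized with the scaler fitted on the raw X_train,
            # whereas X_train_smote was standardized with its own scaler.
            preds = compile_fit_multiclass(model, X_train_smote, X_test, y_train_smote, 256, 300, verbose=0)
            # sklearn metrics expect the arguments in the order (y_true, y_pred).
            metrics = compute_metrics_multiclass(np.argmax(y_test, axis = 1), np.argmax(preds, axis = 1))
            confusion = confusion_matrix(np.argmax(y_test, axis = 1), np.argmax(preds, axis = 1))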
            aux = { "layer config" : final_design,
                   #"Model": model,
                   "Predictions" : preds,
                   "Metrics" : metrics,
                   "Confusion" : confusion

            }
            print(metrics)
            results_dropout_smote.append(aux)
    

[50, '0.1']
[0.26300784034212404, 0.26300784034212404, 0.26300784034212404, 0.20085519448743994]
[50, '0.1', 50, '0.1']
Epoch 00265: early stopping
[0.2998930862437634, 0.2998930862437634, 0.2998930862437634, 0.23713346452369544]
[50, '0.1', 50, '0.1', 50, '0.1']
Epoch 00199: early stopping
[0.3205630791161796, 0.3205630791161796, 0.3205630791161796, 0.258903190675154]
[50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1']
Epoch 00162: early stopping
[0.30648610121168923, 0.30648610121168923, 0.30648610121168923, 0.2430873107926549]
[50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1']
Epoch 00142: early stopping
[0.3075552387740556, 0.3075552387740556, 0.3075552387740556, 0.24394819172656956]
[50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1', 50, '0.1']
Epoch 00099: early stopping
[0.28367783321454026, 0.28367783321454026, 0.28367783321454026, 0.22122127360587496]
[50, '0.2']
Epoch 00271: early stopping
[0.2494654312188168, 0.2494654312188168, 0.2494654312188168, 0.1875769065873798]
[50,

In [27]:
import joblib
joblib.dump(results_dropout_smote, 'results_dropout_smote')

['results_dropout_smote']