# Diplomatura en ciencia de datos, aprendizaje automático y sus aplicaciones - Edición 2023 - FAMAF (UNC)

## Introduction to deep learning

### Graded practical assignment 1/2 (full course)

- **Students:**
    - [Chevallier-Boutell, Ignacio José.](https://www.linkedin.com/in/nachocheva/)
    - Gastelu, Gabriela.
    - Santos, Maricel.
    - Spano, Marcelo.

- **Instructors:**
    - Johanna Analiz Frau (Mercado Libre).
    - Nindiría Armenta Guerrero (fyo).

---

## Libraries and dataset

In [None]:
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

import torch
# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# Fix the seed so that runs are reproducible
torch.manual_seed(1994)
import torch.nn as nn
import torch.optim as optim

In [None]:
try:
    import google.colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if not IN_COLAB:
    # Script with the required helper functions
    from utyls_DL import *
    %load_ext autoreload
    %autoreload 2

    # Dataset
    df = pd.read_csv('diabetes_binary_5050split_health_indicators_BRFSS2015.csv')
else:
    # Script with the required helper functions
    !wget https://raw.githubusercontent.com/Cheva94/Diplo_Opt/main/3_DL/Lab1/utyls_DL.py
    from utyls_DL import *
    %load_ext autoreload
    %autoreload 2

    # Dataset
    url = 'https://raw.githubusercontent.com/Cheva94/Diplo_Opt/main/3_DL/Lab1/diabetes_binary_5050split_health_indicators_BRFSS2015.csv'
    df = pd.read_csv(url)

---
# Dataset description and preprocessing

## Description

The BRFSS (Behavioral Risk Factor Surveillance System) is a health-related telephone survey that the CDC (Centers for Disease Control and Prevention) has collected annually since 1984. Each year, the survey gathers responses from more than 400,000 Americans on health-related risk behaviors, chronic health conditions, and the use of preventive services.

The data we use correspond to 2015. The original dataset covers 441,455 people and has 330 features, which are either questions asked directly to participants or variables computed from their answers. The csv we use has already been cleaned, leaving 70,692 responses: half come from people without diabetes and the other half from people with prediabetes or diabetes. The target variable (`Diabetes_binary`) is binary: 0 means no diabetes and 1 means prediabetes or diabetes. The cleaned dataset is therefore balanced, and it has 21 features. All variables are numeric (floating point) and there are no missing values.

In [None]:
df.info()

print('\n\n ###-------------------------------------### \n Distribution of Diabetes_binary:')
display(df['Diabetes_binary'].value_counts())

print('\n\n ###-------------------------------------### \n Small sample of the dataset:')
df.sample(5, random_state=1994)


Looking at the number of unique values in each variable, we see that 14 of the 21 features are binary.

In [None]:
nunique = df.nunique()
display(nunique)

cols_binary = [x for x in df.columns if nunique[x] == 2 and x != 'Diabetes_binary']
cols_non_binary = [x for x in df.columns if x not in cols_binary and x != 'Diabetes_binary']

print(f'There are {len(cols_binary)} binary features and {len(cols_non_binary)} non-binary ones.')
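
To make the splitting rule concrete, here is a minimal sketch on a toy DataFrame. The column names and the helper `split_binary_columns` are hypothetical and only illustrate the `nunique`-based rule used above; they are not part of the assignment's code.

```python
import pandas as pd

# Toy data with made-up column names (not the real BRFSS dataset)
toy = pd.DataFrame({
    'target': [0, 1, 0, 1],
    'flag': [0.0, 1.0, 1.0, 0.0],
    'level': [1.0, 2.0, 3.0, 2.0],
    'score': [10.5, 22.0, 31.2, 18.9],
})

def split_binary_columns(frame, target):
    """Return (binary, non_binary) feature names, excluding the target column."""
    n_unique = frame.nunique()
    features = [c for c in frame.columns if c != target]
    binary = [c for c in features if n_unique[c] == 2]
    non_binary = [c for c in features if n_unique[c] != 2]
    return binary, non_binary

binary, non_binary = split_binary_columns(toy, 'target')
print(binary, non_binary)
```

Note that this rule only detects columns with exactly two distinct values in the sample; it does not check that those values are 0 and 1.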

Among the binary variables, 5 are relatively balanced (`HighBP`, `HighChol`, `Smoker`, `Fruits` and `Sex`), while the rest are strongly skewed towards one value. The meaning of each variable is as follows:
- `HighBP`: has high blood pressure, where 0 is no and 1 is yes.
- `HighChol`: has high cholesterol, where 0 is no and 1 is yes.
- `CholCheck`: had a cholesterol check in the last 5 years, where 0 is no and 1 is yes.
- `Smoker`: has smoked at least 100 cigarettes in their lifetime, where 0 is no and 1 is yes.
- `Stroke`: has had a stroke, where 0 is no and 1 is yes.
- `HeartDiseaseorAttack`: has coronary heart disease or has had a heart attack, where 0 is no and 1 is yes.
- `PhysActivity`: did physical activity in the last 30 days, where 0 is no and 1 is yes.
- `Fruits`: eats at least 1 fruit per day, where 0 is no and 1 is yes.
- `Veggies`: eats at least 1 vegetable per day, where 0 is no and 1 is yes.
- `HvyAlcoholConsump`: heavy alcohol consumption (more than 14 drinks per week for men or more than 7 drinks per week for women), where 0 is no and 1 is yes.
- `AnyHealthcare`: has some kind of health care coverage, where 0 is no and 1 is yes.
- `NoDocbcCost`: needed to see a doctor in the past year but did not go because of the cost, where 0 is no and 1 is yes.
- `DiffWalk`: has difficulty walking or climbing stairs, where 0 is no and 1 is yes.
- `Sex`: sex, where 0 is female and 1 is male.

In [None]:
display(df[cols_binary].mean()*100)

_, axs = plt.subplots(2, 7, figsize=(35, 10))
for cat in range(7):
    sns.histplot(data=df, x=cols_binary[cat], ax=axs[0, cat])
    axs[0, cat].set_ylabel('')
    sns.histplot(data=df, x=cols_binary[cat+7], ax=axs[1, cat])
    axs[1, cat].set_ylabel('')

Turning now to the 7 non-binary features, their meaning is as follows:
- `BMI`: body mass index, where below 18.5 is underweight, 18.5 to 24.9 is a healthy weight, 25.0 to 29.9 is overweight, and 30.0 or above is obesity.
- `GenHlth`: self-perceived general health, where 1 is excellent, 2 is very good, 3 is good, 4 is fair and 5 is poor.
- `MentHlth`: number of days in the past 30 with poor mental health.
- `PhysHlth`: number of days in the past 30 with a physical illness or injury.
- `Age`: age grouped into 13 levels, where 1 covers 18 to 24, each following level spans 5 years, and 13 is 80 or older.
- `Education`: education grouped into 6 levels, where 1 is never attended school or only kindergarten and 6 is 4 or more years of college.
- `Income`: income grouped into 8 levels, where 1 is less than $10,000 and the brackets increase up to 8 for $75,000 or more.
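
As a small illustration of the `BMI` ranges listed above, the sketch below maps a value to its category. The helper `bmi_category` and the label strings are hypothetical, and the cut points are treated as contiguous (a value such as 24.95 falls in the healthy range), which is an assumption on top of the listed ranges.

```python
from bisect import bisect_right

# Lower bounds of the healthy, overweight and obesity ranges
BMI_CUTS = [18.5, 25.0, 30.0]
BMI_LABELS = ['underweight', 'healthy weight', 'overweight', 'obesity']

def bmi_category(bmi):
    """Return the BMI category label for a numeric BMI value."""
    # bisect_right counts how many cut points are <= bmi
    return BMI_LABELS[bisect_right(BMI_CUTS, bmi)]

for value in [17.0, 18.5, 24.9, 27.3, 30.0]:
    print(value, bmi_category(value))
```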

We see that `MentHlth` and `PhysHlth` have a spread (standard deviation) much larger than their mean. In addition, the range of possible values differs from one variable to another. When the distribution of each variable is compared across the two values of the target, differences are visible in most cases.

In [None]:
display(df[cols_non_binary].describe())


_, axs = plt.subplots(2, 4, figsize=(20, 10))
for cat in range(4):
    sns.kdeplot(data=df, x=cols_non_binary[cat], hue='Diabetes_binary', ax=axs[0, cat])
    axs[0, cat].set_ylabel('')
    axs[0, cat].grid()

for cat in range(3):
    sns.kdeplot(data=df, x=cols_non_binary[cat+4], hue='Diabetes_binary', ax=axs[1, cat])
    axs[1, cat].set_ylabel('')
    axs[1, cat].grid()

axs[1, 3].set_axis_off()

## Preprocessing

We build the feature and target tensors, scaling the non-binary variables to the range [0, 1]. We then create a TensorDataset from the feature and target tensors. Finally, we split the data into training, validation and test sets and create the data loaders that read the data in mini-batches.

In [None]:
BATCH_SIZE = 64
EPOCHS = 100

Data_train, Data_val, Data_test, Load_train, Load_val, Load_test = preproc(df, cols_binary, cols_non_binary, BATCH_SIZE)

n_inputs = np.prod(np.array(Data_train[0][0].shape))
n_outputs = df['Diabetes_binary'].nunique()
loss_function = nn.CrossEntropyLoss()
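
The `preproc` helper lives in `utyls_DL.py`, which is not shown here. The following is only a sketch of what such a function might do, based on the description above. The name `preproc_sketch`, the 70/15/15 split, the column order and the seeded generator are all assumptions, not the actual implementation.

```python
import numpy as np
import pandas as pd
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

def preproc_sketch(frame, cols_binary, cols_non_binary, batch_size,
                   target='Diabetes_binary', fractions=(0.7, 0.15), seed=1994):
    """Hypothetical stand-in for preproc: scale, wrap in tensors, split, load."""
    # Min-max scale the non-binary columns to [0, 1].
    # Assumes no column is constant; a stricter version would compute
    # min and max on the training split only to avoid leakage.
    scaled = frame[cols_non_binary].astype('float32')
    scaled = (scaled - scaled.min()) / (scaled.max() - scaled.min())
    features = pd.concat([frame[cols_binary].astype('float32'), scaled], axis=1)

    X = torch.tensor(features.to_numpy(), dtype=torch.float32)
    # CrossEntropyLoss expects integer class indices as targets
    y = torch.tensor(frame[target].to_numpy(), dtype=torch.long)
    dataset = TensorDataset(X, y)

    n_train = int(fractions[0] * len(dataset))
    n_val = int(fractions[1] * len(dataset))
    n_test = len(dataset) - n_train - n_val
    generator = torch.Generator().manual_seed(seed)
    data_train, data_val, data_test = random_split(
        dataset, [n_train, n_val, n_test], generator=generator)

    load_train = DataLoader(data_train, batch_size=batch_size, shuffle=True)
    load_val = DataLoader(data_val, batch_size=batch_size, shuffle=False)
    load_test = DataLoader(data_test, batch_size=batch_size, shuffle=False)
    return data_train, data_val, data_test, load_train, load_val, load_test

# Toy frame with one binary and two non-binary columns
toy = pd.DataFrame({
    'Diabetes_binary': [0.0, 1.0] * 10,
    'HighBP': [1.0, 0.0] * 10,
    'BMI': np.linspace(18, 40, 20),
    'Age': np.arange(20) % 13 + 1.0,
})
splits = preproc_sketch(toy, ['HighBP'], ['BMI', 'Age'], batch_size=4)
print([len(s) for s in splits[:3]])
```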

---
# MLP #1

## Model definition

In [None]:
class MLP1(nn.Module):
    '''
    Fully connected model with three hidden layers of n_hidden neurons each.
    '''

    def __init__(self, n_inputs, n_outputs, n_hidden, activation_function, dropout=0.0):
        super().__init__()
        self.drop0 = nn.Dropout(dropout)

        self.hidden1 = nn.Linear(n_inputs, n_hidden)
        self.activ1 = activation_function
        self.drop1 = nn.Dropout(dropout)

        self.hidden2 = nn.Linear(n_hidden, n_hidden)
        self.activ2 = activation_function
        self.drop2 = nn.Dropout(dropout)

        self.hidden3 = nn.Linear(n_hidden, n_hidden)
        self.activ3 = activation_function
        self.drop3 = nn.Dropout(dropout)

        self.output = nn.Linear(n_hidden, n_outputs)

    def forward(self, x: torch.Tensor):
        x = self.drop0(x)

        x = self.activ1(self.hidden1(x))
        x = self.drop1(x)

        x = self.activ2(self.hidden2(x))
        x = self.drop2(x)

        x = self.activ3(self.hidden3(x))
        x = self.drop3(x)

        x = self.output(x)  # Output Layer
        return x
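
As a quick sanity check of the architecture, the sketch below builds an equivalent stack with `nn.Sequential` and verifies the output shape. The builder `make_mlp` is a hypothetical helper, not part of the assignment; it only mirrors the layer order used in the class above (input dropout, three hidden blocks, linear output). The output layer returns raw logits, which is what `nn.CrossEntropyLoss` expects, so no softmax is applied in the model.

```python
import torch
import torch.nn as nn

def make_mlp(n_inputs, n_outputs, hidden_sizes, activation_cls=nn.Sigmoid, dropout=0.0):
    """Hypothetical builder: dropout, then (Linear, activation, dropout) per hidden layer."""
    layers = [nn.Dropout(dropout)]
    previous = n_inputs
    for size in hidden_sizes:
        layers += [nn.Linear(previous, size), activation_cls(), nn.Dropout(dropout)]
        previous = size
    layers.append(nn.Linear(previous, n_outputs))  # output layer, raw logits
    return nn.Sequential(*layers)

model = make_mlp(21, 2, hidden_sizes=[5, 5, 5])
batch = torch.rand(64, 21)                # 64 samples, 21 features
logits = model(batch)                     # shape (64, 2)
labels = torch.randint(0, 2, (64,))       # integer class indices
loss = nn.CrossEntropyLoss()(logits, labels)
print(logits.shape, loss.item())
```

Passing a list of hidden sizes would also make it possible to vary depth and width without duplicating the class, should the models need to differ.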

## Baseline

In [None]:
baseline_param = {
    'nH': 5,
    'AF': nn.Sigmoid(),
    'GD': optim.SGD,
    'LR': 0.1,
    'Mom': 0.9
}

baseline_exp = []

# Instantiate the model
model = MLP1(n_inputs, n_outputs, baseline_param['nH'], baseline_param['AF'])

# Define the optimizer
optimizer = baseline_param['GD'](model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])

# Run the baseline
experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
baseline_exp.append(experiment)
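
`run_experiment` is defined in `utyls_DL.py` and is not shown here. As a rough sketch of what one training epoch and one validation pass typically look like in PyTorch, assuming a classification model that returns logits, consider the following. The function names, the synthetic data and the small model are assumptions for illustration only.

```python
import math
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

def train_one_epoch(model, loader, loss_function, optimizer, device):
    """Run one pass over the loader, updating the weights; return the mean loss."""
    model.train()  # enables dropout
    total_loss, n_samples = 0.0, 0
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = loss_function(model(inputs), targets)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * inputs.size(0)
        n_samples += inputs.size(0)
    return total_loss / n_samples

@torch.no_grad()
def evaluate(model, loader, loss_function, device):
    """Return (mean loss, accuracy) without updating the weights."""
    model.eval()  # disables dropout
    total_loss, correct, n_samples = 0.0, 0, 0
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        logits = model(inputs)
        total_loss += loss_function(logits, targets).item() * inputs.size(0)
        correct += (logits.argmax(dim=1) == targets).sum().item()
        n_samples += inputs.size(0)
    return total_loss / n_samples, correct / n_samples

# Synthetic data, only to exercise the loop
torch.manual_seed(0)
X = torch.rand(256, 21)
y = (X[:, 0] > 0.5).long()
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

device = torch.device('cpu')
model = nn.Sequential(nn.Linear(21, 5), nn.Sigmoid(), nn.Linear(5, 2)).to(device)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
loss_function = nn.CrossEntropyLoss()

history = []
for epoch in range(5):
    train_loss = train_one_epoch(model, loader, loss_function, optimizer, device)
    val_loss, val_acc = evaluate(model, loader, loss_function, device)
    history.append((train_loss, val_loss, val_acc))
print(history[-1])
```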

In [None]:
plot_results(baseline_exp)

## Activation function study

In [None]:
activation_functions = {
    'Sigmoid': nn.Sigmoid(),
    'Tanh': nn.Tanh(),
    'ReLU': nn.ReLU(),
    'LeakyReLU': nn.LeakyReLU()
}
ActFunc_Exps1 = []

for key in activation_functions.keys():
    print(f'\n\n Running with {key}')
    model = MLP1(n_inputs, n_outputs, baseline_param['nH'], activation_functions[key])
    optimizer = baseline_param['GD'](model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    ActFunc_Exps1.append(experiment)

In [None]:
plot_results(ActFunc_Exps1)

## Optimizer study

In [None]:
ActFunc_best1 = None  # placeholder: set to the best activation function from the study above
optims = [optim.SGD, optim.Adagrad, optim.RMSprop, optim.Adam]
Optim_Exps1 = []

for opt in optims:
    print(f'\n\n Running with {opt.__name__}')
    model = MLP1(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best1)
    if opt == optim.SGD:
        optimizer = opt(model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])
    else:
        optimizer = opt(model.parameters(), lr=baseline_param['LR'])
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Optim_Exps1.append(experiment)

In [None]:
plot_results(Optim_Exps1)
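
The `if`/`else` that passes `momentum` only to SGD is repeated in every study. One possible way to factor it out is sketched below; `make_optimizer` is a hypothetical helper, not part of `utyls_DL.py`, and it keeps the same behavior as the loops above.

```python
import torch.nn as nn
import torch.optim as optim

def make_optimizer(opt_cls, params, lr, momentum=0.9):
    """Build an optimizer, passing momentum only when the class is SGD."""
    if opt_cls is optim.SGD:
        return opt_cls(params, lr=lr, momentum=momentum)
    return opt_cls(params, lr=lr)

model = nn.Linear(21, 2)
for opt_cls in [optim.SGD, optim.Adagrad, optim.RMSprop, optim.Adam]:
    optimizer = make_optimizer(opt_cls, model.parameters(), lr=0.1)
    print(opt_cls.__name__, optimizer.param_groups[0]['lr'])
```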

## Learning rate study

In [None]:
Optim_best1 = None  # placeholder: set to the best optimizer class from the study above
alpha = [0.1, 0.01, 0.001, 0.0001]
LR_Exps1 = []

for LR in alpha:
    print(f'\n\n Running with {LR}')
    model = MLP1(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best1)
    if Optim_best1 == optim.SGD:
        optimizer = Optim_best1(model.parameters(), lr=LR, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best1(model.parameters(), lr=LR)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    LR_Exps1.append(experiment)

In [None]:
plot_results(LR_Exps1)

## Batch size study

In [None]:
LR_best1 = None  # placeholder: set to the best learning rate from the study above
batches = [32, 64, 128]
Batch_Exps1 = []

for batch in batches:
    print(f'\n\n Running with {batch}')
    Data_train, Data_val, Data_test, Load_train, Load_val, Load_test = preproc(df, cols_binary, cols_non_binary, batch)

    model = MLP1(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best1)
    if Optim_best1 == optim.SGD:
        optimizer = Optim_best1(model.parameters(), lr=LR_best1, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best1(model.parameters(), lr=LR_best1)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Batch_Exps1.append(experiment)

In [None]:
plot_results(Batch_Exps1)

## Dropout study

In [None]:
Batch_best1 = None  # placeholder: set to the best batch size from the study above
Data_train, Data_val, Data_test, Load_train, Load_val, Load_test = preproc(df, cols_binary, cols_non_binary, Batch_best1)

dropOuts = [0.2, 0.4, 0.6]
Drop_Exps1 = []

for drop in dropOuts:
    print(f'\n\n Running with {drop}')
    model = MLP1(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best1, dropout=drop)
    if Optim_best1 == optim.SGD:
        optimizer = Optim_best1(model.parameters(), lr=LR_best1, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best1(model.parameters(), lr=LR_best1)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Drop_Exps1.append(experiment)

In [None]:
plot_results(Drop_Exps1)
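
When interpreting the dropout curves, it helps to remember that `nn.Dropout` behaves differently in training and evaluation mode. The sketch below shows this on a vector of ones. It assumes, without having seen its code, that `run_experiment` switches the model between `train()` and `eval()` as is usual.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
train_out = drop(x)  # each element is either 0 or scaled by 1 / (1 - p) = 2

drop.eval()
eval_out = drop(x)   # identity: nothing is dropped or rescaled

print(train_out.unique(), torch.equal(eval_out, x))
```

Because of this difference, the training loss with dropout can be higher than the validation loss for the same weights.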

---
# MLP #2

## Model definition

In [None]:
class MLP2(nn.Module):
    '''
    Fully connected model with three hidden layers of n_hidden neurons each.
    '''

    def __init__(self, n_inputs, n_outputs, n_hidden, activation_function, dropout=0.0):
        super().__init__()
        self.drop0 = nn.Dropout(dropout)

        self.hidden1 = nn.Linear(n_inputs, n_hidden)
        self.activ1 = activation_function
        self.drop1 = nn.Dropout(dropout)

        self.hidden2 = nn.Linear(n_hidden, n_hidden)
        self.activ2 = activation_function
        self.drop2 = nn.Dropout(dropout)

        self.hidden3 = nn.Linear(n_hidden, n_hidden)
        self.activ3 = activation_function
        self.drop3 = nn.Dropout(dropout)

        self.output = nn.Linear(n_hidden, n_outputs)

    def forward(self, x: torch.Tensor):
        x = self.drop0(x)

        x = self.activ1(self.hidden1(x))
        x = self.drop1(x)

        x = self.activ2(self.hidden2(x))
        x = self.drop2(x)

        x = self.activ3(self.hidden3(x))
        x = self.drop3(x)

        x = self.output(x)  # Output Layer
        return x

## Baseline

In [None]:
baseline_param = {
    'nH': 5,
    'AF': nn.Sigmoid(),
    'GD': optim.SGD,
    'LR': 0.1,
    'Mom': 0.9
}

baseline_exp = []

# Instantiate the model
model = MLP2(n_inputs, n_outputs, baseline_param['nH'], baseline_param['AF'])

# Define the optimizer
optimizer = baseline_param['GD'](model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])

# Run the baseline
experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
baseline_exp.append(experiment)

In [None]:
plot_results(baseline_exp)

## Activation function study

In [None]:
activation_functions = {
    'Sigmoid': nn.Sigmoid(),
    'Tanh': nn.Tanh(),
    'ReLU': nn.ReLU(),
    'LeakyReLU': nn.LeakyReLU()
}
ActFunc_Exps2 = []

for key in activation_functions.keys():
    print(f'\n\n Running with {key}')
    model = MLP2(n_inputs, n_outputs, baseline_param['nH'], activation_functions[key])
    optimizer = baseline_param['GD'](model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    ActFunc_Exps2.append(experiment)

In [None]:
plot_results(ActFunc_Exps2)

## Optimizer study

In [None]:
ActFunc_best2 = None  # placeholder: set to the best activation function from the study above
optims = [optim.SGD, optim.Adagrad, optim.RMSprop, optim.Adam]
Optim_Exps2 = []

for opt in optims:
    print(f'\n\n Running with {opt.__name__}')
    model = MLP2(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best2)
    if opt == optim.SGD:
        optimizer = opt(model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])
    else:
        optimizer = opt(model.parameters(), lr=baseline_param['LR'])
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Optim_Exps2.append(experiment)

In [None]:
plot_results(Optim_Exps2)

## Learning rate study

In [None]:
Optim_best2 = None  # placeholder: set to the best optimizer class from the study above
alpha = [0.1, 0.01, 0.001, 0.0001]
LR_Exps2 = []

for LR in alpha:
    print(f'\n\n Running with {LR}')
    model = MLP2(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best2)
    if Optim_best2 == optim.SGD:
        optimizer = Optim_best2(model.parameters(), lr=LR, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best2(model.parameters(), lr=LR)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    LR_Exps2.append(experiment)

In [None]:
plot_results(LR_Exps2)

## Batch size study

In [None]:
LR_best2 = None  # placeholder: set to the best learning rate from the study above
batches = [32, 64, 128]
Batch_Exps2 = []

for batch in batches:
    print(f'\n\n Running with {batch}')
    Data_train, Data_val, Data_test, Load_train, Load_val, Load_test = preproc(df, cols_binary, cols_non_binary, batch)

    model = MLP2(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best2)
    if Optim_best2 == optim.SGD:
        optimizer = Optim_best2(model.parameters(), lr=LR_best2, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best2(model.parameters(), lr=LR_best2)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Batch_Exps2.append(experiment)

In [None]:
plot_results(Batch_Exps2)

## Dropout study

In [None]:
Batch_best2 = None  # placeholder: set to the best batch size from the study above
Data_train, Data_val, Data_test, Load_train, Load_val, Load_test = preproc(df, cols_binary, cols_non_binary, Batch_best2)

dropOuts = [0.2, 0.4, 0.6]
Drop_Exps2 = []

for drop in dropOuts:
    print(f'\n\n Running with {drop}')
    model = MLP2(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best2, dropout=drop)
    if Optim_best2 == optim.SGD:
        optimizer = Optim_best2(model.parameters(), lr=LR_best2, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best2(model.parameters(), lr=LR_best2)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Drop_Exps2.append(experiment)

In [None]:
plot_results(Drop_Exps2)

---
# MLP #3

## Model definition

In [None]:
class MLP3(nn.Module):
    '''
    Fully connected model with three hidden layers of n_hidden neurons each.
    '''

    def __init__(self, n_inputs, n_outputs, n_hidden, activation_function, dropout=0.0):
        super().__init__()
        self.drop0 = nn.Dropout(dropout)

        self.hidden1 = nn.Linear(n_inputs, n_hidden)
        self.activ1 = activation_function
        self.drop1 = nn.Dropout(dropout)

        self.hidden2 = nn.Linear(n_hidden, n_hidden)
        self.activ2 = activation_function
        self.drop2 = nn.Dropout(dropout)

        self.hidden3 = nn.Linear(n_hidden, n_hidden)
        self.activ3 = activation_function
        self.drop3 = nn.Dropout(dropout)

        self.output = nn.Linear(n_hidden, n_outputs)

    def forward(self, x: torch.Tensor):
        x = self.drop0(x)

        x = self.activ1(self.hidden1(x))
        x = self.drop1(x)

        x = self.activ2(self.hidden2(x))
        x = self.drop2(x)

        x = self.activ3(self.hidden3(x))
        x = self.drop3(x)

        x = self.output(x)  # Output Layer
        return x

## Baseline

In [None]:
baseline_param = {
    'nH': 5,
    'AF': nn.Sigmoid(),
    'GD': optim.SGD,
    'LR': 0.1,
    'Mom': 0.9
}

baseline_exp = []

# Instantiate the model
model = MLP3(n_inputs, n_outputs, baseline_param['nH'], baseline_param['AF'])

# Define the optimizer
optimizer = baseline_param['GD'](model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])

# Run the baseline
experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
baseline_exp.append(experiment)

In [None]:
plot_results(baseline_exp)

## Activation function study

In [None]:
activation_functions = {
    'Sigmoid': nn.Sigmoid(),
    'Tanh': nn.Tanh(),
    'ReLU': nn.ReLU(),
    'LeakyReLU': nn.LeakyReLU()
}
ActFunc_Exps3 = []

for key in activation_functions.keys():
    print(f'\n\n Running with {key}')
    model = MLP3(n_inputs, n_outputs, baseline_param['nH'], activation_functions[key])
    optimizer = baseline_param['GD'](model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    ActFunc_Exps3.append(experiment)

In [None]:
plot_results(ActFunc_Exps3)

## Optimizer study

In [None]:
ActFunc_best3 = None  # placeholder: set to the best activation function from the study above
optims = [optim.SGD, optim.Adagrad, optim.RMSprop, optim.Adam]
Optim_Exps3 = []

for opt in optims:
    print(f'\n\n Running with {opt.__name__}')
    model = MLP3(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best3)
    if opt == optim.SGD:
        optimizer = opt(model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])
    else:
        optimizer = opt(model.parameters(), lr=baseline_param['LR'])
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Optim_Exps3.append(experiment)

In [None]:
plot_results(Optim_Exps3)

## Learning rate study

In [None]:
Optim_best3 = None  # placeholder: set to the best optimizer class from the study above
alpha = [0.1, 0.01, 0.001, 0.0001]
LR_Exps3 = []

for LR in alpha:
    print(f'\n\n Running with {LR}')
    model = MLP3(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best3)
    if Optim_best3 == optim.SGD:
        optimizer = Optim_best3(model.parameters(), lr=LR, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best3(model.parameters(), lr=LR)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    LR_Exps3.append(experiment)

In [None]:
plot_results(LR_Exps3)

## Batch size study

In [None]:
LR_best3 = None  # placeholder: set to the best learning rate from the study above
batches = [32, 64, 128]
Batch_Exps3 = []

for batch in batches:
    print(f'\n\n Running with {batch}')
    Data_train, Data_val, Data_test, Load_train, Load_val, Load_test = preproc(df, cols_binary, cols_non_binary, batch)

    model = MLP3(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best3)
    if Optim_best3 == optim.SGD:
        optimizer = Optim_best3(model.parameters(), lr=LR_best3, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best3(model.parameters(), lr=LR_best3)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Batch_Exps3.append(experiment)

In [None]:
plot_results(Batch_Exps3)

## Dropout study

In [None]:
Batch_best3 = None  # placeholder: set to the best batch size from the study above
Data_train, Data_val, Data_test, Load_train, Load_val, Load_test = preproc(df, cols_binary, cols_non_binary, Batch_best3)

dropOuts = [0.2, 0.4, 0.6]
Drop_Exps3 = []

for drop in dropOuts:
    print(f'\n\n Running with {drop}')
    model = MLP3(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best3, dropout=drop)
    if Optim_best3 == optim.SGD:
        optimizer = Optim_best3(model.parameters(), lr=LR_best3, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best3(model.parameters(), lr=LR_best3)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Drop_Exps3.append(experiment)

In [None]:
plot_results(Drop_Exps3)

---
# MLP #4

## Model definition

In [None]:
class MLP4(nn.Module):
    '''
    Fully connected model with three hidden layers of n_hidden neurons each.
    '''

    def __init__(self, n_inputs, n_outputs, n_hidden, activation_function, dropout=0.0):
        super().__init__()
        self.drop0 = nn.Dropout(dropout)

        self.hidden1 = nn.Linear(n_inputs, n_hidden)
        self.activ1 = activation_function
        self.drop1 = nn.Dropout(dropout)

        self.hidden2 = nn.Linear(n_hidden, n_hidden)
        self.activ2 = activation_function
        self.drop2 = nn.Dropout(dropout)

        self.hidden3 = nn.Linear(n_hidden, n_hidden)
        self.activ3 = activation_function
        self.drop3 = nn.Dropout(dropout)

        self.output = nn.Linear(n_hidden, n_outputs)

    def forward(self, x: torch.Tensor):
        x = self.drop0(x)

        x = self.activ1(self.hidden1(x))
        x = self.drop1(x)

        x = self.activ2(self.hidden2(x))
        x = self.drop2(x)

        x = self.activ3(self.hidden3(x))
        x = self.drop3(x)

        x = self.output(x)  # Output Layer
        return x

## Baseline

In [None]:
baseline_param = {
    'nH': 5,
    'AF': nn.Sigmoid(),
    'GD': optim.SGD,
    'LR': 0.1,
    'Mom': 0.9
}

baseline_exp = []

# Instantiate the model
model = MLP4(n_inputs, n_outputs, baseline_param['nH'], baseline_param['AF'])

# Define the optimizer
optimizer = baseline_param['GD'](model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])

# Run the baseline
experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
baseline_exp.append(experiment)

In [None]:
plot_results(baseline_exp)

## Activation function study

In [None]:
activation_functions = {
    'Sigmoid': nn.Sigmoid(),
    'Tanh': nn.Tanh(),
    'ReLU': nn.ReLU(),
    'LeakyReLU': nn.LeakyReLU()
}
ActFunc_Exps4 = []

for key in activation_functions.keys():
    print(f'\n\n Running with {key}')
    model = MLP4(n_inputs, n_outputs, baseline_param['nH'], activation_functions[key])
    optimizer = baseline_param['GD'](model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    ActFunc_Exps4.append(experiment)

In [None]:
plot_results(ActFunc_Exps4)

## Optimizer study

In [None]:
ActFunc_best4 = None  # placeholder: set to the best activation function from the study above
optims = [optim.SGD, optim.Adagrad, optim.RMSprop, optim.Adam]
Optim_Exps4 = []

for opt in optims:
    print(f'\n\n Running with {opt.__name__}')
    model = MLP4(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best4)
    if opt == optim.SGD:
        optimizer = opt(model.parameters(), lr=baseline_param['LR'], momentum=baseline_param['Mom'])
    else:
        optimizer = opt(model.parameters(), lr=baseline_param['LR'])
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Optim_Exps4.append(experiment)

In [None]:
plot_results(Optim_Exps4)

## Learning rate study

In [None]:
Optim_best4 = None  # placeholder: set to the best optimizer class from the study above
alpha = [0.1, 0.01, 0.001, 0.0001]
LR_Exps4 = []

for LR in alpha:
    print(f'\n\n Running with {LR}')
    model = MLP4(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best4)
    if Optim_best4 == optim.SGD:
        optimizer = Optim_best4(model.parameters(), lr=LR, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best4(model.parameters(), lr=LR)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    LR_Exps4.append(experiment)

In [None]:
plot_results(LR_Exps4)

## Batch size study

In [None]:
LR_best4 = None  # placeholder: set to the best learning rate from the study above
batches = [32, 64, 128]
Batch_Exps4 = []

for batch in batches:
    print(f'\n\n Running with {batch}')
    Data_train, Data_val, Data_test, Load_train, Load_val, Load_test = preproc(df, cols_binary, cols_non_binary, batch)

    model = MLP4(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best4)
    if Optim_best4 == optim.SGD:
        optimizer = Optim_best4(model.parameters(), lr=LR_best4, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best4(model.parameters(), lr=LR_best4)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Batch_Exps4.append(experiment)

In [None]:
plot_results(Batch_Exps4)

## Dropout study

In [None]:
Batch_best4 = None  # placeholder: set to the best batch size from the study above
Data_train, Data_val, Data_test, Load_train, Load_val, Load_test = preproc(df, cols_binary, cols_non_binary, Batch_best4)

dropOuts = [0.2, 0.4, 0.6]
Drop_Exps4 = []

for drop in dropOuts:
    print(f'\n\n Running with {drop}')
    model = MLP4(n_inputs, n_outputs, baseline_param['nH'], ActFunc_best4, dropout=drop)
    if Optim_best4 == optim.SGD:
        optimizer = Optim_best4(model.parameters(), lr=LR_best4, momentum=baseline_param['Mom'])
    else:
        optimizer = Optim_best4(model.parameters(), lr=LR_best4)
    experiment = run_experiment(model, EPOCHS, Load_train, Load_val, loss_function, optimizer, device, use_tqdm=False)
    Drop_Exps4.append(experiment)

In [None]:
plot_results(Drop_Exps4)

---
# Final comparison and conclusions

Open question: should we compare here the best configuration of the 3 chosen models and apply them to the test set, or test at the end of each model?
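
Whichever option is chosen, the evaluation on the test set could look roughly like the sketch below, which accumulates a confusion matrix over a data loader and derives the accuracy from it. The function `confusion_counts`, the random data and the untrained linear model are placeholders for illustration; they are not results of this assignment.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

@torch.no_grad()
def confusion_counts(model, loader, device, n_classes=2):
    """Return an (n_classes, n_classes) tensor with counts[true, predicted]."""
    model.eval()
    counts = torch.zeros(n_classes, n_classes, dtype=torch.long)
    for inputs, targets in loader:
        preds = model(inputs.to(device)).argmax(dim=1).cpu()
        # Encode each (true, predicted) pair as a single index and count them
        flat = targets.cpu() * n_classes + preds
        counts += torch.bincount(flat, minlength=n_classes ** 2).reshape(n_classes, n_classes)
    return counts

# Random stand-in for the test split (hypothetical data)
torch.manual_seed(0)
X_test = torch.rand(100, 21)
y_test = torch.randint(0, 2, (100,))
load_test = DataLoader(TensorDataset(X_test, y_test), batch_size=32)

model = nn.Linear(21, 2)  # untrained placeholder model
counts = confusion_counts(model, load_test, torch.device('cpu'))
accuracy = counts.diag().sum().item() / counts.sum().item()
print(counts, accuracy)
```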