## Computational Analytics for Decision Making 2024

### Classes 11-12: Introduction to neural networks
- Regression example
- Training and optimizers
- Convolutional networks

We start by checking the versions of Python, scikit-learn, and TensorFlow.

In [None]:
import sys
assert sys.version_info >= (3, 7)

from packaging import version
import sklearn
assert version.parse(sklearn.__version__) >= version.parse("1.0.1")

import tensorflow as tf
assert version.parse(tf.__version__) >= version.parse("2.8.0")

## Regression example

Let's now consider a regression problem: predicting the price of a house.

In [None]:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

housing = fetch_california_housing()
X_train_full, X_test, y_train_full, y_test = train_test_split(
    housing.data, housing.target, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full, random_state=42)


In [None]:
housing.feature_names

In [None]:
X_train[1:5,]

In [None]:
housing.target_names

In [None]:
y_train[1:5,]

In [None]:
from sklearn.metrics import mean_squared_error
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

mlp_reg = MLPRegressor(hidden_layer_sizes=[10, 10, 10], random_state=42)
pipeline = make_pipeline(StandardScaler(), mlp_reg)
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_valid)
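# RMSE as the square root of the MSE; this avoids the `squared=` argument,
# which newer scikit-learn releases no longer accept
rmse = mean_squared_error(y_valid, y_pred) ** 0.5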
print(rmse)
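
To make the metric concrete, RMSE is the square root of the mean squared residual. The following is a minimal NumPy sketch on made-up values; `rmse_manual`, `y_true_demo`, and `y_hat_demo` are illustrative names that do not come from the notebook or the housing data.

```python
import numpy as np

def rmse_manual(y_true, y_hat):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_hat) ** 2)))

# Made-up values for illustration only
y_true_demo = [1.0, 2.0, 3.0, 4.0]
y_hat_demo = [1.0, 2.0, 3.0, 6.0]
print(rmse_manual(y_true_demo, y_hat_demo))  # sqrt(4 / 4) = 1.0
```

The target in this dataset is expressed in units of $100,000, so an RMSE of 0.5 would correspond to a typical error of about $50,000.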

## Training and optimizers

We load the Fashion MNIST data and rescale it so that pixel values lie between 0 and 1.

In [None]:
fashion_mnist = tf.keras.datasets.fashion_mnist.load_data()
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist
X_train, y_train = X_train_full[:-5000], y_train_full[:-5000]
X_valid, y_valid = X_train_full[-5000:], y_train_full[-5000:]
X_train, X_valid, X_test = X_train / 255, X_valid / 255, X_test / 255

We standardize each pixel using its mean and standard deviation, both computed on the training set.

In [None]:
pixel_means = X_train.mean(axis=0, keepdims=True)
pixel_stds = X_train.std(axis=0, keepdims=True)
X_train_scaled = (X_train - pixel_means) / pixel_stds
X_valid_scaled = (X_valid - pixel_means) / pixel_stds
X_test_scaled = (X_test - pixel_means) / pixel_stds
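
The same per-pixel standardization can be sketched on synthetic data. This version adds a small constant `eps` to the denominator, which is an assumption not present in the cell above: it guards against pixels that are constant across the sample and therefore have zero standard deviation. `X_demo` is a hypothetical stand-in for a batch of images.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for a batch of 28x28 images with values in [0, 1]
X_demo = rng.random((100, 28, 28))
X_demo[:, 0, 0] = 0.0  # a constant pixel: its standard deviation is 0

means = X_demo.mean(axis=0, keepdims=True)  # shape (1, 28, 28)
stds = X_demo.std(axis=0, keepdims=True)    # shape (1, 28, 28)

eps = 1e-7  # assumed small constant to avoid dividing by zero
X_demo_scaled = (X_demo - means) / (stds + eps)

print(means.shape, stds.shape)
print(np.isfinite(X_demo_scaled).all())
```

Because `keepdims=True` keeps a leading axis of length 1, the statistics broadcast against any batch of images, which is why the training-set values can be reused for the validation and test sets.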

Let's define a function that builds a model with three hidden dense layers, and a second function that compiles and trains it with the optimizer passed as an argument.

In [None]:
def build_model(seed=42):
    tf.random.set_seed(seed)
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=[28, 28]),
        tf.keras.layers.Dense(100, activation="relu",
                              kernel_initializer="he_normal"),
        tf.keras.layers.Dense(100, activation="relu",
                              kernel_initializer="he_normal"),
        tf.keras.layers.Dense(100, activation="relu",
                              kernel_initializer="he_normal"),
        tf.keras.layers.Dense(10, activation="softmax")
    ])

def build_and_train_model(optimizer):
    model = build_model()
    model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer,
                  metrics=["accuracy"])
    return model.fit(X_train, y_train, epochs=10,
                     validation_data=(X_valid, y_valid))
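
The output layer uses softmax and the loss is sparse categorical cross-entropy, meaning the labels are integer class indices rather than one-hot vectors. A minimal NumPy sketch of both pieces follows. The function names and the made-up logits are illustrative, and the sketch leaves out any numerical safeguards a library implementation may add.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_categorical_crossentropy(y_true, probs):
    """Mean of -log(probability assigned to the true class)."""
    rows = np.arange(len(y_true))
    return float(-np.log(probs[rows, y_true]).mean())

# Made-up logits for 2 samples and 3 classes (illustration only)
logits_demo = np.array([[2.0, 1.0, 0.1],
                        [0.0, 0.0, 0.0]])
labels_demo = np.array([0, 2])  # integer class indices, not one-hot vectors

probs_demo = softmax(logits_demo)
loss_demo = sparse_categorical_crossentropy(labels_demo, probs_demo)
print(probs_demo.sum(axis=1))  # each row sums to 1
print(loss_demo)
```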

Let's create a stochastic gradient descent (SGD) optimizer with a learning rate of 0.001.

In [None]:
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001)
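
The update rule behind plain gradient descent can be sketched in a few lines of NumPy. The toy objective and the name `sgd_step` are assumptions made for illustration. In the real training loop the gradient is estimated on a mini-batch, which is where the "stochastic" comes from.

```python
import numpy as np

def sgd_step(theta, grad, learning_rate=0.001):
    """One plain gradient descent step: theta <- theta - lr * grad."""
    return theta - learning_rate * grad

# Toy objective f(theta) = sum(theta^2), whose gradient is 2 * theta
theta = np.array([1.0, -2.0])
for _ in range(100):
    theta = sgd_step(theta, 2 * theta, learning_rate=0.1)
print(theta)  # close to the minimum at [0, 0]
```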

Let's train the model with the optimizer we just defined.

In [None]:
history_sgd = build_and_train_model(optimizer)

Let's create a stochastic gradient descent optimizer with a learning rate of 0.001 and a momentum of 0.9.

In [None]:
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)
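
With momentum, each step also carries over a fraction of the previous step. The sketch below follows the update rule given in the Keras `SGD` documentation (`velocity = momentum * velocity - learning_rate * g`, then `w = w + velocity`). The toy objective and the names `momentum_step` and `theta_m` are illustrative assumptions.

```python
import numpy as np

def momentum_step(theta, velocity, grad, learning_rate=0.001, momentum=0.9):
    """Momentum update: the velocity accumulates a decaying sum of past steps."""
    velocity = momentum * velocity - learning_rate * grad
    return theta + velocity, velocity

# Toy objective f(theta) = sum(theta^2), whose gradient is 2 * theta
theta_m = np.array([1.0, -2.0])
velocity = np.zeros_like(theta_m)
for _ in range(200):
    theta_m, velocity = momentum_step(theta_m, velocity, 2 * theta_m,
                                      learning_rate=0.01, momentum=0.9)
print(theta_m)  # close to the minimum at [0, 0]
```

Under a constant gradient the velocity tends to `-learning_rate * g / (1 - momentum)`, so a momentum of 0.9 can make steps up to ten times larger than plain SGD with the same learning rate.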

Let's train the model with the optimizer we just defined.

In [None]:
history_momentum = build_and_train_model(optimizer)

AdaGrad adaptive optimizer

In [None]:
optimizer = tf.keras.optimizers.Adagrad(learning_rate=0.001)
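
AdaGrad adapts the step per parameter by dividing by the root of the accumulated squared gradients. This is a sketch of the textbook form, not of the Keras implementation, which may differ in details such as the accumulator's initial value. The name `adagrad_step` and the example gradients are assumptions.

```python
import numpy as np

def adagrad_step(theta, accum, grad, learning_rate=0.001, eps=1e-7):
    """AdaGrad: scale each coordinate by the root of its accumulated squared gradients."""
    accum = accum + grad ** 2
    theta = theta - learning_rate * grad / (np.sqrt(accum) + eps)
    return theta, accum

# Two coordinates with very different gradient scales (made-up values)
theta_a = np.array([1.0, 1.0])
accum = np.zeros_like(theta_a)
grad_a = np.array([100.0, 0.01])
theta_a, accum = adagrad_step(theta_a, accum, grad_a, learning_rate=0.1)
print(theta_a)  # both coordinates move by about 0.1 despite the different scales
```

Since the accumulator only grows, the effective learning rate shrinks over time, which helps explain why AdaGrad can slow down early on deep networks.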

In [None]:
history_adagrad = build_and_train_model(optimizer)

Adam optimizer

In [None]:
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9,
                                     beta_2=0.999)
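
Adam combines a moving average of the gradients (controlled by `beta_1`) with a moving average of the squared gradients (controlled by `beta_2`). The following is a sketch of the textbook update with bias correction. It is not the Keras implementation, and `adam_step` and the example gradients are illustrative assumptions.

```python
import numpy as np

def adam_step(theta, m, v, grad, t, learning_rate=0.001,
              beta_1=0.9, beta_2=0.999, eps=1e-7):
    """One Adam step with bias-corrected first and second moment estimates."""
    m = beta_1 * m + (1 - beta_1) * grad       # moving average of gradients
    v = beta_2 * v + (1 - beta_2) * grad ** 2  # moving average of squared gradients
    m_hat = m / (1 - beta_1 ** t)              # bias correction (t starts at 1)
    v_hat = v / (1 - beta_2 ** t)
    theta = theta - learning_rate * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Constant made-up gradient with very different scales per coordinate
theta_adam = np.zeros(2)
m = np.zeros(2)
v = np.zeros(2)
grad_adam = np.array([100.0, -0.01])
for t in range(1, 11):
    theta_adam, m, v = adam_step(theta_adam, m, v, grad_adam, t)
print(theta_adam)  # about [-0.01, 0.01]: ten steps of size ~learning_rate each
```

Under a constant gradient, each step has magnitude close to the learning rate regardless of the gradient's scale, and unlike the accumulated sum in AdaGrad the steps do not shrink over time.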

In [None]:
history_adam = build_and_train_model(optimizer)

Performance evaluation

In [None]:
import matplotlib.pyplot as plt
for loss in ("loss", "val_loss"):
    plt.figure(figsize=(12, 8))
    opt_names = "SGD Momentum AdaGrad Adam"
    for history, opt_name in zip((history_sgd, history_momentum,
                                  history_adagrad, history_adam,
                                 ),
                                 opt_names.split()):
        plt.plot(history.history[loss], label=f"{opt_name}", linewidth=3)

    plt.grid()
    plt.xlabel("Epochs")
    plt.ylabel({"loss": "Training loss", "val_loss": "Validation loss"}[loss])
    plt.legend(loc="upper right")
    plt.axis([0, 9, 0.1, 0.7])
    plt.show()

## Convolutional networks

Let's import NumPy and reset the TensorFlow session.

In [None]:
import numpy as np
tf.keras.backend.clear_session()

Let's check whether a GPU is available locally. If we are running on Google Colab, we can change the runtime hardware to include one.

In [None]:
IS_COLAB = "google.colab" in sys.modules

if not tf.config.list_physical_devices('GPU'):
    print("No GPU detected.")
    if IS_COLAB:
        print("Go to Runtime > Change runtime and select a GPU hardware "
              "accelerator.")


We load the Fashion MNIST data again, add a channel dimension to the images, and rescale the pixel values to lie between 0 and 1.

In [None]:
mnist = tf.keras.datasets.fashion_mnist.load_data()
(X_train_full, y_train_full), (X_test, y_test) = mnist
X_train_full = np.expand_dims(X_train_full, axis=-1).astype(np.float32) / 255
X_test = np.expand_dims(X_test.astype(np.float32), axis=-1) / 255
X_train, X_valid = X_train_full[:-5000], X_train_full[-5000:]
y_train, y_valid = y_train_full[:-5000], y_train_full[-5000:]


Next we create the model. Note that it includes some elements we have not covered in class. Find out what they do, in particular the MaxPooling layer.

Write a description of the complete network.

In [None]:
from functools import partial

tf.random.set_seed(42)
DefaultConv2D = partial(tf.keras.layers.Conv2D, kernel_size=3, padding="same",
                        activation="relu", kernel_initializer="he_normal")
model = tf.keras.Sequential([
    DefaultConv2D(filters=64, kernel_size=7, input_shape=[28, 28, 1]),
    tf.keras.layers.MaxPool2D(),
    DefaultConv2D(filters=128),
    DefaultConv2D(filters=128),
    tf.keras.layers.MaxPool2D(),
    DefaultConv2D(filters=256),
    DefaultConv2D(filters=256),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(units=128, activation="relu",
                          kernel_initializer="he_normal"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(units=64, activation="relu",
                          kernel_initializer="he_normal"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(units=10, activation="softmax")
])
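
As a starting point for the MaxPooling question, the operation can be sketched in NumPy for a single 2D feature map. This assumes the `MaxPool2D` defaults (a 2x2 window, stride 2, "valid" padding). `max_pool_2x2` and the example array are illustrative, not part of the notebook.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a (height, width) array.

    Odd trailing rows/columns are dropped, as with "valid" padding.
    """
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    # Group pixels into non-overlapping 2x2 blocks and keep each block's maximum
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 0, 5, 6],
               [1, 2, 7, 8]])
print(max_pool_2x2(fm))
# [[4 2]
#  [2 8]]
```

Each pooling layer halves the height and width (rounding down) while leaving the number of channels unchanged.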

Let's compile the model with an optimizer and train it for 10 epochs. Then we evaluate the model on the test set and make a few predictions.

In [None]:
model.compile(loss="sparse_categorical_crossentropy", optimizer="nadam",
              metrics=["accuracy"])
history = model.fit(X_train, y_train, epochs=10,
                    validation_data=(X_valid, y_valid))
score = model.evaluate(X_test, y_test)
X_new = X_test[:10]  # pretend we have new images
y_pred = model.predict(X_new)
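
`model.predict` returns one row of class probabilities per image, so a further step is needed to obtain class labels. A minimal sketch, using made-up probabilities rather than real model output, takes the argmax of each row and maps it to the Fashion MNIST class names. `y_proba_demo` is a hypothetical stand-in for `y_pred`.

```python
import numpy as np

# Fashion MNIST label names, in label order 0..9
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Made-up softmax outputs for two images (illustration only, not model results)
y_proba_demo = np.array([
    [0.01, 0.01, 0.01, 0.01, 0.01, 0.05, 0.01, 0.10, 0.01, 0.78],
    [0.02, 0.02, 0.80, 0.02, 0.06, 0.01, 0.04, 0.01, 0.01, 0.01],
])

predicted_labels = y_proba_demo.argmax(axis=-1)
predicted_names = [class_names[i] for i in predicted_labels]
print(predicted_labels)  # [9 2]
print(predicted_names)   # ['Ankle boot', 'Pullover']
```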

# Exercise
Using the same _Fashion MNIST_ data, modify the network (its hyperparameters), train and test it on the same data splits, and compare the results of the different experiments.