# Practical 4

## Transform Pattern

The **transform pattern** in Machine Learning refers to the technique of manipulating and changing the input data before a model uses it.

These changes can improve the model's performance and make it more robust to variations in the input data.

Transformations are typically applied during the data preprocessing stage and can involve many different techniques, depending on the type of data and the problem being solved.
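As a minimal, library-free sketch of the idea, the transformations can be written as small functions applied in a fixed order before the data reaches the model. The names `rescale`, `standardize`, and `apply_transforms` are illustrative assumptions, not part of any library.

```python
from statistics import mean, pstdev

def rescale(pixels, factor=1 / 255):
    """Map raw 0-255 pixel values into the [0, 1] range."""
    return [p * factor for p in pixels]

def standardize(values):
    """Shift to zero mean and scale to unit variance (population std)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def apply_transforms(values, transforms):
    """Apply each transform in order; the same list serves training and inference."""
    for transform in transforms:
        values = transform(values)
    return values

raw_pixels = [0, 51, 102, 255]
scaled = apply_transforms(raw_pixels, [rescale])
normalized = apply_transforms(raw_pixels, [rescale, standardize])
print(scaled)
print(normalized)
```

Keeping the transforms in one ordered list is a design choice: the exact same list can be reused at inference time, which avoids a mismatch between how training data and serving data are prepared.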

In [1]:
!pip install tensorflow==2.16.1



In [2]:
import tensorflow as tf
print(tf.__version__)




2.16.1


In [3]:
%%time 

import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.layers import Resizing, Rescaling
from tensorflow.keras.models import Sequential
import numpy as np
import matplotlib.pyplot as plt

# Load the dataset (pixel scaling is done inside the model)
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Convert the labels to one-hot vectors
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

data_preprocessing = Sequential([
    Resizing(32, 32),  # Resize to 32x32 (a no-op for CIFAR-10, whose images are already 32x32)
    Rescaling(1./255)  # Scale pixel values from [0, 255] to [0, 1]
])

# Define the model
model = tf.keras.models.Sequential([
    data_preprocessing,

    # The input shape is inferred from the data on the first call, so no input_shape argument is needed here
    tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Dropout(0.3),

    tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Dropout(0.5),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile and train the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(x_train, y_train, batch_size=256, epochs=5, validation_data=(x_test, y_test))

# Save the model (the preprocessing layers are saved with it)
model.save('transform_pattern_conv.h5')




Epoch 1/5
196/196 ━━━━━━━━━━━━━━━━━━━━ 209s 1s/step - accuracy: 0.2994 - loss: 2.2652 - val_accuracy: 0.1181 - val_loss: 2.9272
Epoch 2/5
196/196 ━━━━━━━━━━━━━━━━━━━━ 171s 586ms/step - accuracy: 0.5209 - loss: 1.3501 - val_accuracy: 0.2612 - val_loss: 2.2513
Epoch 3/5
196/196 ━━━━━━━━━━━━━━━━━━━━ 139s 572ms/step - accuracy: 0.6011 - loss: 1.1278 - val_accuracy: 0.6423 - val_loss: 1.0149
Epoch 4/5
196/196 ━━━━━━━━━━━━━━━━━━━━ 142s 574ms/step - accuracy: 0.6473 - loss: 1.0054 - val_accuracy: 0.6645 - val_loss: 0.9398
Epoch 5/5
196/196 ━━━━━━━━━━━━━━━━━━━━ 144s 585ms/step - accuracy: 0.6767 - loss: 0.9323 - val_accuracy: 0.7065 - val_loss: 0.8341




CPU times: user 16min 47s, sys: 5.29 s, total: 16min 52s
Wall time: 13min 54s


In this case, the data transformations (**Resizing** and **Rescaling**) are defined as Keras layers and added at the start of the model.

This means they are applied automatically to the images as they pass through the model, during both training and inference.

In addition, because the preprocessing layers are part of the model, they are saved together with it when you call `model.save()`.
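The benefit of saving the preprocessing with the model can be illustrated without Keras. The sketch below is an analogy under stated assumptions: `TinyModel`, its `scale` field, and the file name `tiny_model.json` are hypothetical, and the "model" is just a weighted sum. The point is that the scaling factor travels in the same file as the weights, so whoever reloads the model can pass raw pixel values and cannot forget or mismatch the preprocessing.

```python
import json
import os
import tempfile

class TinyModel:
    """Toy model that bundles its input scaling with its weights (hypothetical)."""

    def __init__(self, scale, weights):
        self.scale = scale        # preprocessing parameter, playing the role of Rescaling(1./255)
        self.weights = weights    # stand-in for learned parameters

    def predict(self, raw_pixels):
        # Preprocessing runs inside the model on every call.
        scaled = [p * self.scale for p in raw_pixels]
        return sum(w * x for w, x in zip(self.weights, scaled))

    def save(self, path):
        # Preprocessing parameters and weights go into the same file.
        with open(path, "w") as f:
            json.dump({"scale": self.scale, "weights": self.weights}, f)

    @classmethod
    def load(cls, path):
        with open(path) as f:
            state = json.load(f)
        return cls(state["scale"], state["weights"])

model = TinyModel(scale=1 / 255, weights=[0.5, -0.25, 1.0])
path = os.path.join(tempfile.mkdtemp(), "tiny_model.json")
model.save(path)

reloaded = TinyModel.load(path)
raw = [255, 0, 51]  # raw, unscaled pixel values
print(model.predict(raw), reloaded.predict(raw))
```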

## Experiment tracking
### wandb: https://wandb.ai/site


First, we add experiment tracking with wandb (Weights & Biases). This lets us monitor experiments in real time, save our models and results, and share experiments with others.

[W&B Colab notebook with a full walkthrough](https://colab.research.google.com/github/wandb/examples/blob/master/colabs/intro/Intro_to_Weights_%26_Biases.ipynb#scrollTo=jufPgkgqz2eF)

### 👟 Run an experiment 

1.  **Start a new run** and pass in hyperparameters to track

2.  **Log metrics** from training or evaluation

3.  **Visualize results** in the dashboard

4. **Generate alerts** in the dashboard 
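To make the first two steps concrete without a W&B account, here is a library-free sketch. `LocalRun` is a hypothetical stand-in, not part of wandb: it stores the hyperparameters passed at creation and appends each logged metrics dictionary to a history list, which is roughly the information a tracking service receives from a run. The training loop and its loss values are simulated.

```python
class LocalRun:
    """Hypothetical in-memory stand-in for an experiment-tracking run."""

    def __init__(self, project, config):
        self.project = project
        self.config = dict(config)   # step 1: hyperparameters to track
        self.history = []            # one entry per logged step

    def log(self, metrics, step):
        # Step 2: record the metrics together with the step they belong to.
        self.history.append({"step": step, **metrics})

    def summary(self):
        # Last logged value of every metric, similar to a run summary table.
        latest = {}
        for row in self.history:
            latest.update({k: v for k, v in row.items() if k != "step"})
        return latest

run = LocalRun(project="demo", config={"dropout": 0.3, "epochs": 3})
for epoch in range(run.config["epochs"]):
    simulated_loss = 1.0 / (epoch + 1)   # placeholder for a real training step
    run.log({"loss": simulated_loss}, step=epoch)

print(run.config)
print(run.summary())
```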

In [4]:
!pip install wandb==0.17.0


In [5]:
import wandb

print("wandb version:", wandb.__version__)


wandb version: 0.17.0


In [7]:
import warnings
warnings.filterwarnings('ignore')

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import wandb
from wandb.integration.keras import WandbCallback

# Read the API key from the WANDB_API_KEY environment variable instead of hard-coding it in the notebook
wandb.login(key=os.environ.get('WANDB_API_KEY'))

wandb: Currently logged in as: chv-facu (christianvera495). Use `wandb login --relogin` to force relogin
wandb: Appending key for api.wandb.ai to your netrc file: /home/codespace/.netrc


True

In [8]:
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.layers import Resizing, Rescaling
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import Callback
import numpy as np
import matplotlib.pyplot as plt
import random


# Define a custom Keras callback that logs each epoch's metrics to wandb
class CustomWandbCallback(Callback):
    def on_epoch_end(self, epoch, logs=None):
        wandb.log(logs, step=epoch)

# Launch 2 experiments, trying different dropout rates
for run in range(2):
    
    # Start a run, tracking hyperparameters
    wandb.init(
        project="ml-produccion-wandb",
        config={
            "activation_1": "relu",
            "dropout": random.uniform(0.01, 0.80),
            "optimizer": "adam",
            "loss": "categorical_crossentropy",
            "metric": "accuracy",
            "epoch": 3,
            "batch_size": 512,
        },
    )
    config = wandb.config
    
    # Load the dataset (pixel scaling is done inside the model)
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()

    # Convert the labels to one-hot vectors
    y_train = tf.keras.utils.to_categorical(y_train, 10)
    y_test = tf.keras.utils.to_categorical(y_test, 10)

    # Define the image preprocessing
    data_preprocessing = Sequential([
        Resizing(32, 32),  # Resize to 32x32 (a no-op for CIFAR-10, whose images are already 32x32)
        Rescaling(1./255)  # Scale pixel values from [0, 255] to [0, 1]
    ])

    # Define the model
    model = tf.keras.models.Sequential([
        data_preprocessing,

        # The input shape is inferred from the data on the first call, so no input_shape argument is needed here
        tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Dropout(0.3),

        tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Dropout(config.dropout),

        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dropout(config.dropout),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    # Compile and train the model
    model.compile(optimizer=config.optimizer, loss=config.loss, metrics=[config.metric])

    # Add CustomWandbCallback to log metrics
    custom_wandb_callback = CustomWandbCallback()

    history = model.fit(x_train, y_train, batch_size=config.batch_size, epochs=config.epoch, 
                        validation_data=(x_test, y_test), callbacks=[custom_wandb_callback])
    
    wandb.finish()
    
    # Save the model
    model.save(f'model_run_{run}.h5')


Epoch 1/3
98/98 ━━━━━━━━━━━━━━━━━━━━ 122s 1s/step - accuracy: 0.2517 - loss: 2.5847 - val_accuracy: 0.1643 - val_loss: 2.7888
Epoch 2/3
98/98 ━━━━━━━━━━━━━━━━━━━━ 139s 1s/step - accuracy: 0.4198 - loss: 1.6487 - val_accuracy: 0.1323 - val_loss: 3.1158
Epoch 3/3
98/98 ━━━━━━━━━━━━━━━━━━━━ 143s 1s/step - accuracy: 0.4928 - loss: 1.4053 - val_accuracy: 0.2061 - val_loss: 3.3668


Run history:
accuracy      ▁▆█
loss          █▃▁
val_accuracy  ▄▁█
val_loss      ▁▅█

Run summary:
accuracy      0.50706
loss          1.36905
val_accuracy  0.2061
val_loss      3.36678




Epoch 1/3
98/98 ━━━━━━━━━━━━━━━━━━━━ 125s 1s/step - accuracy: 0.4009 - loss: 1.7464 - val_accuracy: 0.1594 - val_loss: 2.7252
Epoch 2/3
98/98 ━━━━━━━━━━━━━━━━━━━━ 141s 1s/step - accuracy: 0.6224 - loss: 1.0704 - val_accuracy: 0.1186 - val_loss: 3.3521
Epoch 3/3
98/98 ━━━━━━━━━━━━━━━━━━━━ 142s 1s/step - accuracy: 0.6878 - loss: 0.8828 - val_accuracy: 0.1581 - val_loss: 3.1706


Run history:
accuracy      ▁▆█
loss          █▃▁
val_accuracy  █▁█
val_loss      ▁█▆

Run summary:
accuracy      0.6947
loss          0.86819
val_accuracy  0.1581
val_loss      3.17064





##  W&B Alerts

**[W&B Alerts](https://docs.wandb.ai/guides/track/alert)** allows you to send alerts, triggered from your Python code, to your Slack or email. There are 2 steps to follow the first time you'd like to send a Slack or email alert, triggered from your code:

1) Turn on Alerts in your W&B [User Settings](https://wandb.ai/settings)

2) Add `wandb.alert()` to your code:

```python
wandb.alert(
    title="Low accuracy", 
    text="Accuracy is below the acceptable threshold"
)
```
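In a real training loop a single noisy step can dip below the threshold, so it can help to alert only after several consecutive low values. The helper below is a sketch of that idea and is independent of W&B: `should_alert` and `patience` are hypothetical names, the accuracy values are made up, and the branch where it returns `True` is where a call such as `wandb.alert(...)` would go.

```python
def should_alert(accuracies, threshold, patience=3):
    """Return True if the last `patience` accuracies are all below `threshold`."""
    if len(accuracies) < patience:
        return False
    return all(a < threshold for a in accuracies[-patience:])

history = []
alert_step = None
for step, accuracy in enumerate([0.62, 0.28, 0.55, 0.29, 0.27, 0.25, 0.60]):
    history.append(accuracy)
    if should_alert(history, threshold=0.3):
        alert_step = step   # this is where wandb.alert(...) would be called
        break

print(alert_step)
```

With these sample values the isolated dip at step 1 is ignored, and the alert fires only once three consecutive values fall below the threshold.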

In [14]:
import random 

# Start a wandb run
wandb.init(project="alerts-intro")

# Simulating a model training loop
acc_threshold = 0.3
for training_step in range(1000):

    # Simulate an accuracy value (the sum of two uniform draws, so it can exceed 1.0)
    accuracy = round(random.random() + random.random(), 3)
    print(f"Accuracy is: {accuracy}, {acc_threshold}")

    # 🐝 Log accuracy to wandb
    wandb.log({"Accuracy": accuracy})

    # 🔔 If the accuracy is below the threshold, fire a W&B Alert and stop the run
    if accuracy <= acc_threshold:
        # 🐝 Send the wandb Alert
        wandb.alert(
            title="Low Accuracy",
            text=f"Accuracy {accuracy} at step {training_step} is below the acceptable threshold, {acc_threshold}",
        )
        print("Alert triggered")
        break

# Mark the run as finished (useful in Jupyter notebooks)
wandb.finish()

Accuracy is: 1.122, 0.3
Accuracy is: 0.607, 0.3
Accuracy is: 0.588, 0.3
Accuracy is: 1.249, 0.3
Accuracy is: 1.182, 0.3
Accuracy is: 0.842, 0.3
Accuracy is: 0.991, 0.3
Accuracy is: 1.105, 0.3
Accuracy is: 1.336, 0.3
Accuracy is: 1.602, 0.3
Accuracy is: 1.5, 0.3
Accuracy is: 1.071, 0.3
Accuracy is: 0.928, 0.3
Accuracy is: 1.786, 0.3
Accuracy is: 0.751, 0.3
Accuracy is: 0.912, 0.3
Accuracy is: 0.959, 0.3
Accuracy is: 0.251, 0.3
Alert triggered


Run history:
Accuracy  ▅▃▃▆▅▄▄▅▆▇▇▅▄█▃▄▄▁

Run summary:
Accuracy  0.251


## Hyperparameter Tuning - wandb

In [29]:
import os
import wandb
from tensorflow.keras.models import load_model
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
import tensorflow as tf

# Load the dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Use only a single sample from the test set
x_test_small = x_test[:1]
y_test_small = to_categorical(y_test[:1], 10)

# Define the custom wandb callback (not used below, because this sweep only evaluates saved models)
class CustomWandbCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        wandb.log(logs, step=epoch)

# Define the sweep configuration
sweep_config = {
    'method': 'grid',  # hyperparameter search method
    'metric': {
        'name': 'val_accuracy',  # must match a key passed to wandb.log()
        'goal': 'maximize'
    },
    'parameters': {
        'learning_rate': {
            'values': [0.01]  # A single value, to minimize run time
        },
        'batch_size': {
            'values': [64]  # A single value, to minimize run time
        },
        'run_index': {
            'values': [0, 1]  # Indices of the saved models to load
        }
    }
}

sweep_id = wandb.sweep(sweep_config, project="Htuning")

# Define the training function
def train():
    run = wandb.init()
    config = run.config

    print("Run configuration:", config)

    # Show the files in the current directory
    print("Files in the current directory:", os.listdir('.'))

    # Load the model that corresponds to this run's index
    run_index = config.run_index
    model_path = f"model_run_{run_index}.h5"

    if os.path.exists(model_path):
        print(f"Loading the model from {model_path}")
        model = load_model(model_path)
    else:
        raise FileNotFoundError(f"Model file {model_path} not found.")

    print(f"Evaluating the model {model_path}")

    # Evaluate the model on the reduced test data
    loss, accuracy = model.evaluate(x_test_small, y_test_small, verbose=0)

    # Log the results to wandb
    wandb.log({"val_loss": loss, "val_accuracy": accuracy})
    print(f"Model {model_path} - validation loss: {loss}, validation accuracy: {accuracy}")

    wandb.finish()
    print("Wandb run finished")

# Run the agent
print("Starting the sweep...")
wandb.agent(sweep_id, function=train)
print("Sweep finished")




Create sweep with ID: cxwhl0r9
Sweep URL: https://wandb.ai/ortmlprod/Htuning/sweeps/cxwhl0r9
Starting the sweep...


wandb: Ctrl + C detected. Stopping sweep.


Sweep finished
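
With `'method': 'grid'`, a sweep tries every combination of the listed parameter values. The sketch below reproduces that enumeration locally with `itertools.product`. `grid_combinations` is a hypothetical helper, not a wandb function, and it only handles parameters given as a `values` list. For the parameters used in this sweep it yields two combinations, one per `run_index`.

```python
from itertools import product

def grid_combinations(parameters):
    """Yield one dict per combination of the `values` lists in a sweep-style config."""
    names = list(parameters)
    value_lists = [parameters[name]["values"] for name in names]
    for combo in product(*value_lists):
        yield dict(zip(names, combo))

parameters = {
    "learning_rate": {"values": [0.01]},
    "batch_size": {"values": [64]},
    "run_index": {"values": [0, 1]},
}

combos = list(grid_combinations(parameters))
for combo in combos:
    print(combo)
```

The number of runs is the product of the list lengths, so adding a second `learning_rate` value would double the grid to four runs.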
