# Practical 4

## Transform Pattern

The **"Transform pattern"** in Machine Learning refers to manipulating and changing the input data before a model uses it.

These changes can improve model performance and make the model more robust to variations in the input data.

Transformations are typically applied during the data preprocessing stage and can involve many different techniques, depending on the type of data and the problem being solved.
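As a minimal sketch of one such transformation, the snippet below rescales 8-bit pixel values into the [0, 1] range in plain Python. The function name `rescale` and the sample values are illustrative assumptions, not part of the notebook; the Keras `Rescaling` layer used later applies the same `input * scale + offset` arithmetic to whole tensors.

```python
def rescale(pixels, scale=1.0 / 255, offset=0.0):
    """Map raw pixel intensities to a new range: value * scale + offset."""
    return [p * scale + offset for p in pixels]


raw_row = [0, 51, 255]         # hypothetical 8-bit pixel values
scaled_row = rescale(raw_row)  # every value now lies in [0, 1]
print(scaled_row)
```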

In [1]:
!pip install tensorflow==2.16.1



In [2]:
import tensorflow as tf
print(tf.__version__)


2024-06-09 22:10:10.348059: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-09 22:10:10.352507: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-06-09 22:10:10.394982: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


2.16.1


In [3]:
%%time 

import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.layers import Resizing, Rescaling
from tensorflow.keras.models import Sequential
import numpy as np
import matplotlib.pyplot as plt

# Load the dataset (pixel scaling happens inside the model)
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Convert the labels to one-hot vectors
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

data_preprocessing = Sequential([
    Resizing(32, 32),  # Resize to 32x32 (CIFAR-10 images already have this size)
    Rescaling(1./255)  # Scale pixel values from [0, 255] to [0, 1]
])

# Define the model
model = tf.keras.models.Sequential([
    data_preprocessing,

    tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(32, 32, 3)),  # 3 input channels (RGB)
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Dropout(0.3),

    tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Dropout(0.5),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile and train the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(x_train, y_train, batch_size=256, epochs=5, validation_data=(x_test, y_test))

# Save the model (the preprocessing layers are saved with it)
model.save('transform_pattern_conv.h5')


Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step


  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/5


2024-06-09 21:11:48.242026: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 153600000 exceeds 10% of free system memory.
2024-06-09 21:11:51.735817: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 33554432 exceeds 10% of free system memory.
2024-06-09 21:11:51.935379: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 33554432 exceeds 10% of free system memory.
2024-06-09 21:11:51.946231: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 33554432 exceeds 10% of free system memory.
2024-06-09 21:11:51.955184: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 33554432 exceeds 10% of free system memory.


196/196 ━━━━━━━━━━━━━━━━━━━━ 184s 921ms/step - accuracy: 0.3058 - loss: 2.1977 - val_accuracy: 0.1097 - val_loss: 3.3565
Epoch 2/5
196/196 ━━━━━━━━━━━━━━━━━━━━ 173s 884ms/step - accuracy: 0.5110 - loss: 1.3744 - val_accuracy: 0.2324 - val_loss: 2.8064
Epoch 3/5
196/196 ━━━━━━━━━━━━━━━━━━━━ 172s 877ms/step - accuracy: 0.5986 - loss: 1.1342 - val_accuracy: 0.5476 - val_loss: 1.2956
Epoch 4/5
196/196 ━━━━━━━━━━━━━━━━━━━━ 170s 866ms/step - accuracy: 0.6374 - loss: 1.0301 - val_accuracy: 0.6038 - val_loss: 1.1527
Epoch 5/5
196/196 ━━━━━━━━━━━━━━━━━━━━ 170s 870ms/step - accuracy: 0.6636 - loss: 0.9663 - val_accuracy: 0.6562 - val_loss: 0.9773




CPU times: user 24min 49s, sys: 1min 8s, total: 25min 58s
Wall time: 15min 7s
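
The `196/196` counter in the training log is the number of batches per epoch. A quick check of that figure, assuming the standard CIFAR-10 split of 50,000 training images (the helper name `steps_per_epoch` is illustrative):

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Batches needed to see every sample once; the last batch may be smaller."""
    return math.ceil(num_samples / batch_size)


TRAIN_IMAGES = 50_000  # size of the standard CIFAR-10 training split

print(steps_per_epoch(TRAIN_IMAGES, 256))  # batch size used in model.fit above
```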


In this case, the data transformations **(Resizing and Rescaling)** are defined as Keras layers and added at the start of the model.

This means the transformations are applied automatically to the images as they pass through the model, during both training and inference.

Also, because the preprocessing layers are part of the model, they are saved along with it when you call `model.save()`.
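
The idea of shipping the transform with the model can be illustrated without Keras. The following is a library-free sketch under stated assumptions: `TinyModel`, its single `weight`, and the JSON layout are hypothetical stand-ins, not how Keras serializes layers. It only shows that when the preprocessing parameters are stored with the model, a reloaded copy treats raw inputs the same way.

```python
import json


class TinyModel:
    """Hypothetical stand-in for a model that owns its preprocessing."""

    def __init__(self, scale, weight):
        self.scale = scale    # preprocessing parameter (like Rescaling's scale)
        self.weight = weight  # a single "learned" parameter

    def predict(self, raw_pixel):
        # The transform runs inside the model, so callers pass raw values.
        return self.weight * (raw_pixel * self.scale)

    def to_json(self):
        return json.dumps({"scale": self.scale, "weight": self.weight})

    @classmethod
    def from_json(cls, payload):
        return cls(**json.loads(payload))


model = TinyModel(scale=1.0 / 255, weight=2.0)
restored = TinyModel.from_json(model.to_json())

# Both copies agree on raw input because the scale was saved with the weight.
print(model.predict(128) == restored.predict(128))
```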

## Experiment tracking
### wandb: https://wandb.ai/site


First, we add experiment tracking with wandb (Weights & Biases). This lets us monitor experiments in real time, save our models and results, and share experiments with others.

[Full W&B Colab walkthrough notebook](https://colab.research.google.com/github/wandb/examples/blob/master/colabs/intro/Intro_to_Weights_%26_Biases.ipynb#scrollTo=jufPgkgqz2eF)

### 👟 Run an experiment 

1.  **Start a new run** and pass in hyperparameters to track

2.  **Log metrics** from training or evaluation

3.  **Visualize results** in the dashboard

4.  **Generate alerts** from your code
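
The four steps can be pictured with a small in-memory stand-in. This is not the wandb API: `InMemoryRun` and its methods are hypothetical names chosen for illustration, and the metric values are made up. The sketch only shows what a tracker conceptually stores: a config fixed at start, a history of logged metrics, a summary for the dashboard, and alerts raised on a condition.

```python
class InMemoryRun:
    """Hypothetical tracker that mimics the start/log/summarize/alert flow."""

    def __init__(self, config):
        self.config = dict(config)  # 1. hyperparameters recorded at start
        self.history = []           # 2. one dict of metrics per logged step
        self.alerts = []            # 4. messages raised during the run

    def log(self, metrics):
        self.history.append(dict(metrics))

    def summary(self):
        # 3. a dashboard would typically show the latest value of each metric
        latest = {}
        for row in self.history:
            latest.update(row)
        return latest

    def alert(self, title):
        self.alerts.append(title)


run = InMemoryRun({"dropout": 0.3, "batch_size": 512})
for accuracy in (0.40, 0.25, 0.61):  # made-up values for illustration
    run.log({"accuracy": accuracy})
    if accuracy < 0.30:
        run.alert("Low accuracy")

print(run.summary(), run.alerts)
```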

In [3]:
!pip install wandb==0.17.0



In [4]:
import wandb

print("wandb version:", wandb.__version__)


wandb version: 0.17.0


In [5]:
import warnings
warnings.filterwarnings('ignore')

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import wandb
from wandb.integration.keras import WandbCallback

# Read the API key from the WANDB_API_KEY environment variable instead of hard-coding it
wandb.login(key=os.environ.get("WANDB_API_KEY"))

Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
wandb: Currently logged in as: chv-facu (christianvera495). Use `wandb login --relogin` to force relogin
wandb: Appending key for api.wandb.ai to your netrc file: /home/codespace/.netrc


True
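
API keys are best kept out of notebook source, since anyone who can read the file can read the key. A hedged sketch of reading one from the environment follows; `WANDB_API_KEY` is the variable wandb itself looks for, while the helper name `read_api_key` and the dummy value are assumptions for illustration.

```python
import os


def read_api_key(env=os.environ, name="WANDB_API_KEY"):
    """Return the key stored in the environment, or fail with a clear message."""
    key = env.get(name)
    if not key:
        raise RuntimeError(f"Set {name} before running this notebook.")
    return key


# Demonstration with a fake environment so no real secret appears in the file.
fake_env = {"WANDB_API_KEY": "dummy-key-for-illustration"}
print(read_api_key(fake_env))
```

The returned value could then be passed to `wandb.login(key=...)`.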

In [5]:
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.layers import Resizing, Rescaling
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import Callback
import numpy as np
import matplotlib.pyplot as plt
import random


# Define a custom callback that logs the Keras metrics to wandb after each epoch
class CustomWandbCallback(Callback):
    def on_epoch_end(self, epoch, logs=None):
        wandb.log(logs, step=epoch)

# Launch 2 experiments, trying different dropout rates
for run in range(2):
    
    # Start a run, tracking hyperparameters
    wandb.init(
        project="ml-produccion-wandb",
        config={
            "activation_1": "relu",
            "dropout": random.uniform(0.01, 0.80),
            "optimizer": "adam",
            "loss": "categorical_crossentropy",
            "metric": "accuracy",
            "epoch": 3,
            "batch_size": 512,
        },
    )
    config = wandb.config
    
    # Load the dataset (pixel scaling happens inside the model)
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()

    # Convert the labels to one-hot vectors
    y_train = tf.keras.utils.to_categorical(y_train, 10)
    y_test = tf.keras.utils.to_categorical(y_test, 10)

    # Define the image preprocessing
    data_preprocessing = Sequential([
        Resizing(32, 32),  # Resize to 32x32
        Rescaling(1./255)  # Scale pixel values from [0, 255] to [0, 1]
    ])

    # Define the model
    model = tf.keras.models.Sequential([
        data_preprocessing,  

        tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(32, 32, 3)),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Dropout(0.3),

        tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Dropout(config.dropout),

        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dropout(config.dropout),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    # Compile and train the model
    model.compile(optimizer=config.optimizer, loss=config.loss, metrics=[config.metric])

    # Add CustomWandbCallback to log metrics
    custom_wandb_callback = CustomWandbCallback()

    history = model.fit(x_train, y_train, batch_size=config.batch_size, epochs=config.epoch, 
                        validation_data=(x_test, y_test), callbacks=[custom_wandb_callback])
    
    wandb.finish()
    
    # Save the model
    model.save(f'model_run_{run}.h5')


Epoch 1/3


2024-06-09 21:46:04.190513: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 153600000 exceeds 10% of free system memory.
2024-06-09 21:46:12.136556: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 67108864 exceeds 10% of free system memory.
2024-06-09 21:46:12.314095: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 67108864 exceeds 10% of free system memory.
2024-06-09 21:46:12.384513: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 67108864 exceeds 10% of free system memory.
2024-06-09 21:46:12.427618: W external/local_tsl/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 67108864 exceeds 10% of free system memory.


98/98 ━━━━━━━━━━━━━━━━━━━━ 293s 3s/step - accuracy: 0.3868 - loss: 1.8148 - val_accuracy: 0.1094 - val_loss: 2.8902
Epoch 2/3
98/98 ━━━━━━━━━━━━━━━━━━━━ 179s 2s/step - accuracy: 0.6028 - loss: 1.1230 - val_accuracy: 0.1550 - val_loss: 3.1015
Epoch 3/3
98/98 ━━━━━━━━━━━━━━━━━━━━ 182s 2s/step - accuracy: 0.6706 - loss: 0.9384 - val_accuracy: 0.1377 - val_loss: 3.8604


Run history:

| Metric | History |
| --- | --- |
| accuracy | ▁▆█ |
| loss | █▃▁ |
| val_accuracy | ▁█▅ |
| val_loss | ▁▃█ |

Run summary:

| Metric | Value |
| --- | --- |
| accuracy | 0.6754 |
| loss | 0.92563 |
| val_accuracy | 0.1377 |
| val_loss | 3.86042 |




Epoch 1/3
98/98 ━━━━━━━━━━━━━━━━━━━━ 187s 2s/step - accuracy: 0.4117 - loss: 1.7104 - val_accuracy: 0.1005 - val_loss: 3.1995
Epoch 2/3
98/98 ━━━━━━━━━━━━━━━━━━━━ 202s 2s/step - accuracy: 0.6346 - loss: 1.0305 - val_accuracy: 0.1000 - val_loss: 4.4631
Epoch 3/3
98/98 ━━━━━━━━━━━━━━━━━━━━ 202s 2s/step - accuracy: 0.7001 - loss: 0.8577 - val_accuracy: 0.1074 - val_loss: 4.6833


Run history:

| Metric | History |
| --- | --- |
| accuracy | ▁▆█ |
| loss | █▃▁ |
| val_accuracy | ▁▁█ |
| val_loss | ▁▇█ |

Run summary:

| Metric | Value |
| --- | --- |
| accuracy | 0.70308 |
| loss | 0.84729 |
| val_accuracy | 0.1074 |
| val_loss | 4.68332 |





##  W&B Alerts

**[W&B Alerts](https://docs.wandb.ai/guides/track/alert)** allows you to send alerts, triggered from your Python code, to your Slack or email. There are 2 steps to follow the first time you'd like to send a Slack or email alert, triggered from your code:

1) Turn on Alerts in your W&B [User Settings](https://wandb.ai/settings)

2) Add `wandb.alert()` to your code:

```python
wandb.alert(
    title="Low accuracy", 
    text="Accuracy is below the acceptable threshold"
)
```

In [6]:
import random 

# Start a wandb run
wandb.init(project="alerts-intro")

# Simulating a model training loop
acc_threshold = 0.3
for training_step in range(1000):

    # Generate a random number for accuracy
    accuracy = round(random.random() + random.random(), 3)
    print(f"Accuracy is: {accuracy}, {acc_threshold}")

    # 🐝 Log accuracy to wandb
    wandb.log({"Accuracy": accuracy})

    # 🔔 If the accuracy is below the threshold, fire a W&B Alert and stop the run
    if accuracy <= acc_threshold:
        # 🐝 Send the wandb Alert
        wandb.alert(
            title="Low Accuracy",
            text=f"Accuracy {accuracy} at step {training_step} is below the acceptable threshold, {acc_threshold}",
        )
        print("Alert triggered")
        break

# Mark the run as finished (useful in Jupyter notebooks)
wandb.finish()

Accuracy is: 0.991, 0.3
Accuracy is: 1.582, 0.3
Accuracy is: 1.388, 0.3
Accuracy is: 1.346, 0.3
Accuracy is: 1.335, 0.3
Accuracy is: 0.964, 0.3
Accuracy is: 0.993, 0.3
Accuracy is: 1.249, 0.3
Accuracy is: 0.859, 0.3
Accuracy is: 0.589, 0.3
Accuracy is: 0.596, 0.3
Accuracy is: 0.861, 0.3
Accuracy is: 0.737, 0.3
Accuracy is: 1.045, 0.3
Accuracy is: 1.367, 0.3
Accuracy is: 1.62, 0.3
Accuracy is: 1.514, 0.3
Accuracy is: 1.047, 0.3
Accuracy is: 1.191, 0.3
Accuracy is: 1.478, 0.3
Accuracy is: 0.349, 0.3
Accuracy is: 0.932, 0.3
Accuracy is: 1.699, 0.3
Accuracy is: 0.604, 0.3
Accuracy is: 1.403, 0.3
Accuracy is: 0.333, 0.3
Accuracy is: 1.143, 0.3
Accuracy is: 1.077, 0.3
Accuracy is: 0.736, 0.3
Accuracy is: 1.135, 0.3
Accuracy is: 0.339, 0.3
Accuracy is: 1.359, 0.3
Accuracy is: 1.038, 0.3
Accuracy is: 0.87, 0.3
Accuracy is: 0.668, 0.3
Accuracy is: 1.226, 0.3
Accuracy is: 0.219, 0.3
Alert triggered


Run history:

| Metric | History |
| --- | --- |
| Accuracy | ▅▇▇▆▆▅▅▆▄▃▃▄▃▅▆█▇▅▆▇▂▄█▃▇▂▅▅▃▅▂▆▅▄▃▆▁ |

Run summary:

| Metric | Value |
| --- | --- |
| Accuracy | 0.219 |
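
The simulated "accuracy" is the sum of two uniform draws, so it ranges over [0, 2) rather than [0, 1], which is why values above 1 appear in the log. For a threshold t ≤ 1, the chance that the sum falls at or below t is t²/2, the area of the triangle x + y ≤ t inside the unit square. With t = 0.3 that is 0.045, or roughly 22 steps on average before the alert fires (ignoring the rounding to three decimals). The sketch below checks the figure against a seeded simulation; the function names and the seed are illustrative assumptions.

```python
import random


def alert_probability(threshold):
    """P(U1 + U2 <= threshold) for independent uniforms on [0, 1), threshold <= 1."""
    return threshold ** 2 / 2


def simulated_probability(threshold, trials=200_000, seed=0):
    """Estimate the same probability by drawing pairs of uniform numbers."""
    rng = random.Random(seed)
    hits = sum(rng.random() + rng.random() <= threshold for _ in range(trials))
    return hits / trials


p = alert_probability(0.3)
print(f"analytic: {p:.3f}, expected steps until alert: {1 / p:.1f}")
print(f"simulated: {simulated_probability(0.3):.3f}")
```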


## Hyperparameter tuning - wandb

In [7]:
import os
import wandb
from tensorflow.keras.models import load_model
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
import tensorflow as tf

# Load the dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Use only a small sample of the test set (here, a single image)
x_test_small = x_test[:1]
y_test_small = to_categorical(y_test[:1], 10)

# Define the custom wandb callback
class CustomWandbCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        wandb.log(logs, step=epoch)

# Define the sweep configuration
sweep_config = {
    'method': 'grid',  # hyperparameter search method
    'metric': {
        'name': 'val_accuracy',  # must match a key passed to wandb.log
        'goal': 'maximize'
    },
    'parameters': {
        'learning_rate': {
            'values': [0.01]  # A single value to keep the run time short
        },
        'batch_size': {
            'values': [64]  # A single value to keep the run time short
        },
        'run_index': {
            'values': [0, 1]  # Indices of the saved models to load
        }
    }
}

sweep_id = wandb.sweep(sweep_config, project="Htuning")

# Define the function each sweep run executes (it evaluates a saved model)
def train():
    run = wandb.init()
    config = run.config
    
    print("Configuración de la ejecución:", config)
    
    # Show the files in the current directory
    print("Archivos en el directorio actual:", os.listdir('.'))
    
    # Load the model selected by the run index
    run_index = config.run_index
    model_path = f"model_run_{run_index}.h5"
    
    if os.path.exists(model_path):
        print(f"Cargando el modelo desde {model_path}")
        model = load_model(model_path)
    else:
        raise FileNotFoundError(f"Archivo de modelo {model_path} no encontrado.")
    
    print(f"Evaluando el modelo {model_path}")
    
    # Evaluate the model on the reduced test data
    loss, accuracy = model.evaluate(x_test_small, y_test_small, verbose=0)
    
    # Log the results to wandb
    wandb.log({"val_loss": loss, "val_accuracy": accuracy})
    print(f"Modelo {model_path} - Pérdida de validación: {loss}, Precisión de validación: {accuracy}")
    
    wandb.finish()
    print("Ejecución de Wandb finalizada")

# Run the agent
print("Iniciando el sweep...")
wandb.agent(sweep_id, function=train)
print("Sweep finalizado")


Create sweep with ID: ckern2jh
Sweep URL: https://wandb.ai/christianvera495/Htuning/sweeps/ckern2jh
Iniciando el sweep...


wandb: Agent Starting Run: xmhmmnkn with config:
wandb: 	batch_size: 64
wandb: 	learning_rate: 0.01
wandb: 	run_index: 0
Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.


Configuración de la ejecución: {'batch_size': 64, 'learning_rate': 0.01, 'run_index': 0}
Archivos en el directorio actual: ['wandb', 'practico_4.ipynb', 'model_run_0.h5', 'model_run_1.h5', 'README.md', '.git', 'transform_pattern_conv.h5']
Cargando el modelo desde model_run_0.h5




Evaluando el modelo model_run_0.h5
Modelo model_run_0.h5 - Pérdida de validación: 2.448362350463867, Precisión de validación: 0.0


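Run history:

| Metric | History |
| --- | --- |
| val_accuracy | ▁ |
| val_loss | ▁ |

Run summary:

| Metric | Value |
| --- | --- |
| val_accuracy | 0.0 |
| val_loss | 2.44836 |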


Ejecución de Wandb finalizada


wandb: Agent Starting Run: 2ob34fx1 with config:
wandb: 	batch_size: 64
wandb: 	learning_rate: 0.01
wandb: 	run_index: 1
Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.




Configuración de la ejecución: {'batch_size': 64, 'learning_rate': 0.01, 'run_index': 1}
Archivos en el directorio actual: ['wandb', 'practico_4.ipynb', 'model_run_0.h5', 'model_run_1.h5', 'README.md', '.git', 'transform_pattern_conv.h5']
Cargando el modelo desde model_run_1.h5
Evaluando el modelo model_run_1.h5
Modelo model_run_1.h5 - Pérdida de validación: 5.316495895385742, Precisión de validación: 0.0


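Run history:

| Metric | History |
| --- | --- |
| val_accuracy | ▁ |
| val_loss | ▁ |

Run summary:

| Metric | Value |
| --- | --- |
| val_accuracy | 0.0 |
| val_loss | 5.3165 |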


Ejecución de Wandb finalizada


wandb: Sweep Agent: Waiting for job.
wandb: Sweep Agent: Exiting.


Sweep finalizado
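
With `'method': 'grid'`, a sweep tries every combination of the listed values, so the number of runs is the product of the list lengths. The sketch below enumerates that grid with `itertools.product`; it mirrors the parameter values in `sweep_config` but does not call wandb, and `expand_grid` is a hypothetical helper name.

```python
from itertools import product


def expand_grid(parameters):
    """Return one config dict per combination of parameter values."""
    names = list(parameters)
    value_lists = [parameters[name]["values"] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


parameters = {
    "learning_rate": {"values": [0.01]},
    "batch_size": {"values": [64]},
    "run_index": {"values": [0, 1]},
}

configs = expand_grid(parameters)
for config in configs:
    print(config)
```

Only `run_index` has more than one value, which matches the two agent runs in the log above.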
