# Universidad de O'Higgins

## School of Engineering
## COM4402: Introduction to Artificial Intelligence

### **Assignment 2: Handwritten Digit Classification with Neural Networks**

### Student: César Hormazábal

The goal of this assignment is to use neural networks on a digit classification problem. We use the Optical Recognition of Handwritten Digits Data Set, which has 64 features, 10 classes, and 5620 samples in total. The dataset is available on U-Campus.

The networks to be trained have the following structure: an input layer of dimensionality 64 (matching the input data), one or two hidden layers, and an output layer with 10 neurons and a softmax activation. The loss function is cross-entropy, and the optimizer must be Adam. The softmax is implicit in PyTorch's CrossEntropyLoss (**do not add a softmax to the network output**).
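
To illustrate why no softmax layer is needed, the following minimal sketch checks that `nn.CrossEntropyLoss` applied to raw logits matches an explicit log-softmax followed by `nn.NLLLoss`. The logits and labels are made up for the example and do not come from the dataset.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # hypothetical raw outputs: 4 samples, 10 classes
labels = torch.tensor([3, 0, 7, 9])    # hypothetical class indices

# CrossEntropyLoss consumes raw logits and applies log-softmax internally
loss_ce = nn.CrossEntropyLoss()(logits, labels)

# Equivalent two-step computation: explicit log-softmax, then negative log-likelihood
loss_nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), labels)

print(torch.allclose(loss_ce, loss_nll))
```

Adding a softmax layer before `CrossEntropyLoss` would therefore apply the normalization twice.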

PyTorch is used to train and validate the neural network that implements the digit classifier. We analyze the effect of changing the network size (number of hidden layers and neurons per layer) and the activation function.

The following base code must be used to carry out the requested activities.

## Note: Before running your code, enable the GPU in Google Colab to speed up training.

### To do this: go to "Runtime" in the top menu, click "Change runtime type", and select/verify "GPU" under "Hardware accelerator".
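
As a quick check that the runtime change took effect, a sketch like the following can be run first. It only relies on `torch.cuda.is_available()` and falls back to the CPU when no GPU is exposed.

```python
import torch

# Use the GPU when Colab exposes one; otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Running on:", device)

# Tensors and models must be moved to the same device before use
x = torch.zeros(2, 64).to(device)
print(x.device.type)
```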

In [None]:
import pandas as pd
import torch
import torch.nn as nn
import numpy as np
import time
import matplotlib.pyplot as plt
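import seaborn as sns  # used for the confusion-matrix heatmaps
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix  # used after training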

## Upload the digit datasets (train)

In [None]:
!ls

sample_data


In [None]:

!wget https://raw.githubusercontent.com/Felipe1401/Mineria/main/dataset_digits/1_digits_train.txt  # 1_digits_train.txt
!ls


--2023-10-31 17:24:40--  https://raw.githubusercontent.com/Felipe1401/Mineria/main/dataset_digits/1_digits_train.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 640604 (626K) [text/plain]
Saving to: ‘1_digits_train.txt’


2023-10-31 17:24:40 (19.0 MB/s) - ‘1_digits_train.txt’ saved [640604/640604]

1_digits_train.txt  sample_data


In [None]:
!wget https://raw.githubusercontent.com/Felipe1401/Mineria/main/dataset_digits/1_digits_test.txt  # 1_digits_test.txt
!ls

--2023-10-31 17:24:43--  https://raw.githubusercontent.com/Felipe1401/Mineria/main/dataset_digits/1_digits_test.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.108.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 187595 (183K) [text/plain]
Saving to: ‘1_digits_test.txt’


2023-10-31 17:24:43 (8.21 MB/s) - ‘1_digits_test.txt’ saved [187595/187595]

1_digits_test.txt  1_digits_train.txt  sample_data


## Read the digit dataset

In [None]:
column_names = ["feat" + str(i) for i in range(64)]
column_names.append("class")

In [None]:
df_train_val = pd.read_csv('1_digits_train.txt', names = column_names)
df_train_val


Unnamed: 0,feat0,feat1,feat2,feat3,feat4,feat5,feat6,feat7,feat8,feat9,...,feat55,feat56,feat57,feat58,feat59,feat60,feat61,feat62,feat63,class
0,0,0,5,13,9,1,0,0,0,0,...,0,0,0,6,13,10,0,0,0,0
1,0,0,0,12,13,5,0,0,0,0,...,0,0,0,0,11,16,10,0,0,1
2,0,0,0,4,15,12,0,0,0,0,...,0,0,0,0,3,11,16,9,0,2
3,0,0,7,15,13,1,0,0,0,8,...,0,0,0,7,13,13,9,0,0,3
4,0,0,0,1,11,0,0,0,0,0,...,0,0,0,0,2,16,4,0,0,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4342,0,0,9,11,0,0,0,0,0,7,...,0,0,0,8,12,12,15,10,0,2
4343,0,0,6,15,2,0,0,0,0,0,...,0,0,0,7,16,16,10,1,0,6
4344,0,0,15,16,16,14,0,0,0,0,...,0,0,0,14,11,0,0,0,0,7
4345,0,0,0,1,15,11,0,0,0,0,...,0,0,0,0,1,16,10,0,0,4


We have 64 features plus the class column. The features correspond to the pixels of the 8x8 images, which is why the dataset has 64 dimensions.
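
The relation between the 64 columns and the 8x8 image can be sketched as follows. The vector is a synthetic placeholder rather than a real sample, and row-major pixel order is an assumption.

```python
import numpy as np

# Hypothetical flattened sample: 64 values standing in for feat0..feat63
row = np.arange(64)

# Assuming row-major pixel order, the 64 features fold back into an 8x8 grid
image = row.reshape(8, 8)

print(image.shape)   # (8, 8)
print(image[1, 0])   # feat8 is the first pixel of the second row -> 8
```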

In [None]:
df_test = pd.read_csv('1_digits_test.txt', names = column_names)
df_test

Unnamed: 0,feat0,feat1,feat2,feat3,feat4,feat5,feat6,feat7,feat8,feat9,...,feat55,feat56,feat57,feat58,feat59,feat60,feat61,feat62,feat63,class
0,0,0,13,12,10,12,8,0,0,2,...,0,0,0,10,16,16,8,0,0,5
1,0,0,8,16,14,4,0,0,0,5,...,0,0,0,10,16,14,12,2,0,9
2,0,0,0,7,16,0,0,0,0,0,...,0,0,0,0,9,15,1,0,0,4
3,0,0,2,14,9,2,0,0,0,0,...,0,0,0,2,14,14,2,0,0,0
4,0,1,16,16,15,3,0,0,0,0,...,0,0,0,16,6,0,0,0,0,7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1267,0,0,5,13,11,2,0,0,0,2,...,0,0,0,8,13,15,10,1,0,9
1268,0,0,0,1,12,1,0,0,0,0,...,0,0,0,0,4,9,0,0,0,4
1269,0,0,3,15,0,0,0,0,0,0,...,0,0,0,4,14,16,9,0,0,6
1270,0,0,6,16,2,0,0,0,0,0,...,0,0,0,5,16,16,16,5,0,6


We then split the training data into training and validation sets.

In [None]:
df_train, df_val = train_test_split(df_train_val, test_size = 0.3, random_state = 10)
print("Training samples", len(df_train))
print("Validation samples", len(df_val))
print("Test samples", len(df_test))
print("Total samples", len(df_train_val)+len(df_test))

Training samples 3042
Validation samples 1305
Test samples 1272
Total samples 5619
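
The split sizes follow directly from `test_size = 0.3`. The sketch below reproduces them on a plain index array that stands in for the 4347 rows of `df_train_val`, without using any of the actual data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

indices = np.arange(4347)   # stand-in for the 4347 rows of df_train_val
train_idx, val_idx = train_test_split(indices, test_size=0.3, random_state=10)

print(len(train_idx), len(val_idx))   # 3042 1305

# The two parts are disjoint and together cover every row
assert set(train_idx).isdisjoint(val_idx)
assert len(train_idx) + len(val_idx) == len(indices)
```

The validation size is rounded up (0.3 × 4347 = 1304.1 becomes 1305) and the remaining rows go to training.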


We standardize the features, fitting the scaler on the training set only and applying it to all three sets.

In [None]:
scaler = StandardScaler().fit(df_train.iloc[:,0:64])
df_train.iloc[:,0:64] = scaler.transform(df_train.iloc[:,0:64])
df_val.iloc[:,0:64] = scaler.transform(df_val.iloc[:,0:64])
df_test.iloc[:,0:64] = scaler.transform(df_test.iloc[:,0:64])



In [None]:
df_train

Unnamed: 0,feat0,feat1,feat2,feat3,feat4,feat5,feat6,feat7,feat8,feat9,...,feat55,feat56,feat57,feat58,feat59,feat60,feat61,feat62,feat63,class
4026,0.0,-0.338570,0.797238,0.535297,-0.599203,-1.007598,-0.412556,-0.13043,-0.045374,1.329217,...,-0.206188,0.0,-0.302452,0.078492,-0.427291,0.468026,1.548367,0.696964,-0.186744,9
1548,0.0,-0.338570,-0.269346,0.999221,0.985819,0.594099,-0.412556,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,-0.121990,0.248408,0.876852,1.378118,-0.527715,-0.186744,3
1709,0.0,-0.338570,-0.482663,0.535297,0.759387,0.594099,-0.412556,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,-0.723436,-0.427291,0.876852,1.548367,-0.037843,-0.186744,1
2195,0.0,2.026364,1.863823,-0.392551,-2.410657,-1.007598,-0.412556,-0.13043,-0.045374,1.329217,...,-0.206188,0.0,1.890776,1.481867,-0.652523,-0.758451,0.356618,1.186836,-0.186744,2
1216,0.0,2.026364,0.797238,0.767259,0.985819,1.661897,0.191388,-0.13043,-0.045374,2.953365,...,-0.206188,0.0,-0.302452,1.281385,0.924106,0.672439,0.356618,-0.282779,-0.186744,3
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2009,0.0,-0.338570,-1.122613,-2.248248,-0.599203,1.839863,0.493359,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,-1.124400,-2.679617,0.468026,-0.664881,-0.527715,-0.186744,9
1180,0.0,0.843897,0.583922,-0.392551,0.532956,0.772065,-0.412556,-0.13043,-0.045374,3.278195,...,-0.206188,0.0,-0.302452,1.281385,0.924106,0.468026,0.867368,0.452028,-0.186744,3
3441,0.0,-0.338570,-0.269346,-0.160589,0.985819,0.950031,-0.110584,-0.13043,-0.045374,0.354728,...,-0.206188,0.0,-0.302452,0.078492,0.473640,-2.393755,-1.175631,-0.527715,-0.186744,9
1344,0.0,-0.338570,1.223872,0.999221,0.985819,0.238166,-0.412556,-0.13043,-0.045374,0.029899,...,0.911859,0.0,-0.302452,0.880421,0.924106,0.876852,1.548367,3.391260,6.008192,2


In [None]:
df_val

Unnamed: 0,feat0,feat1,feat2,feat3,feat4,feat5,feat6,feat7,feat8,feat9,...,feat55,feat56,feat57,feat58,feat59,feat60,feat61,feat62,feat63,class
3459,0.0,-0.338570,0.370605,0.071373,0.532956,0.950031,0.493359,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,0.880421,0.924106,-0.145212,-0.835131,-0.527715,-0.186744,5
4204,0.0,0.843897,1.650506,0.999221,-0.372771,-1.007598,-0.412556,-0.13043,-0.045374,0.029899,...,0.911859,0.0,0.794162,1.882831,0.924106,0.468026,0.526868,1.431772,0.432750,2
455,0.0,-0.338570,1.010555,0.999221,0.306524,-0.117767,-0.412556,-0.13043,-0.045374,1.654047,...,-0.206188,0.0,-0.302452,0.679939,0.248408,0.876852,0.186368,-0.527715,-0.186744,9
3970,0.0,-0.338570,0.157288,-0.624513,0.080092,1.839863,2.607162,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,0.078492,0.924106,0.876852,0.016118,-0.527715,-0.186744,5
2377,0.0,-0.338570,0.583922,0.071373,0.759387,1.127998,-0.412556,-0.13043,-0.045374,1.978876,...,-0.206188,0.0,-0.302452,0.880421,0.698873,0.876852,0.356618,-0.527715,-0.186744,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2020,0.0,-0.338570,-0.909296,0.303335,0.759387,0.238166,-0.412556,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,-0.522954,0.248408,0.876852,0.526868,-0.527715,-0.186744,0
1985,0.0,0.843897,1.863823,0.303335,0.985819,1.127998,-0.110584,-0.13043,-0.045374,2.303706,...,-0.206188,0.0,-0.302452,1.481867,0.698873,-0.758451,-1.005381,-0.527715,-0.186744,5
2107,0.0,7.938698,2.290457,0.999221,0.985819,-0.117767,-0.412556,-0.13043,-0.045374,-0.294931,...,-0.206188,0.0,7.373845,1.281385,-2.679617,-2.393755,-1.175631,-0.527715,-0.186744,7
2213,0.0,-0.338570,-0.482663,0.535297,0.985819,1.839863,0.795331,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,-0.121990,0.924106,-1.371690,-1.175631,-0.527715,-0.186744,7


In [None]:
df_test

Unnamed: 0,feat0,feat1,feat2,feat3,feat4,feat5,feat6,feat7,feat8,feat9,...,feat55,feat56,feat57,feat58,feat59,feat60,feat61,feat62,feat63,class
0,0.0,-0.338570,1.650506,0.071373,-0.372771,1.127998,2.003218,-0.13043,-0.045374,0.029899,...,-0.206188,0.0,-0.302452,0.880421,0.924106,0.876852,0.186368,-0.527715,-0.186744,5
1,0.0,-0.338570,0.583922,0.999221,0.532956,-0.295733,-0.412556,-0.13043,-0.045374,1.004388,...,-0.206188,0.0,-0.302452,0.880421,0.924106,0.468026,0.867368,-0.037843,-0.186744,9
2,0.0,-0.338570,-1.122613,-1.088438,0.985819,-1.007598,-0.412556,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,-1.124400,-0.652523,0.672439,-1.005381,-0.527715,-0.186744,4
3,0.0,-0.338570,-0.695980,0.535297,-0.599203,-0.651666,-0.412556,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,-0.723436,0.473640,0.468026,-0.835131,-0.527715,-0.186744,0
4,0.0,0.843897,2.290457,0.999221,0.759387,-0.473699,-0.412556,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,2.083313,-1.328221,-2.393755,-1.175631,-0.527715,-0.186744,7
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1267,0.0,-0.338570,-0.056029,0.303335,-0.146340,-0.651666,-0.412556,-0.13043,-0.045374,0.029899,...,-0.206188,0.0,-0.302452,0.479457,0.248408,0.672439,0.526868,-0.282779,-0.186744,9
1268,0.0,-0.338570,-1.122613,-2.480210,0.080092,-0.829632,-0.412556,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,-1.124400,-1.778687,-0.554038,-1.175631,-0.527715,-0.186744,4
1269,0.0,-0.338570,-0.482663,0.767259,-2.637089,-1.007598,-0.412556,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,-0.322472,0.473640,0.876852,0.356618,-0.527715,-0.186744,6
1270,0.0,-0.338570,0.157288,0.999221,-2.184225,-1.007598,-0.412556,-0.13043,-0.045374,-0.619760,...,-0.206188,0.0,-0.302452,-0.121990,0.924106,0.876852,1.548367,0.696964,-0.186744,6
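
The standardization step above can be sketched on synthetic data. The arrays are random placeholders, with one constant column to mimic `feat0`, which stays at 0.0 after scaling in the tables above.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.integers(0, 17, size=(100, 4)).astype(np.float64)  # synthetic pixel-like values in 0..16
X_train[:, 0] = 0.0                                              # constant column, like feat0
X_val = rng.integers(0, 17, size=(20, 4)).astype(np.float64)
X_val[:, 0] = 0.0

scaler = StandardScaler().fit(X_train)   # statistics come from the training split only

# Manual z-score with the training mean and (population) standard deviation
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
std[std == 0] = 1.0                      # zero-variance columns are left unscaled
X_val_manual = (X_val - mean) / std

print(np.allclose(scaler.transform(X_val), X_val_manual))
```

Fitting on the training split alone keeps validation and test statistics out of the preprocessing.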


## Create the model

We define the neural network models.

Model a uses one hidden layer of 10 neurons with ReLU activation:

- Input layer of 64 neurons (one per feature)
- Hidden layer of 10 neurons
- ReLU activation function
- Output layer of 10 neurons (one per class)

In [None]:
modela = nn.Sequential(
          nn.Linear(64, 10),
          nn.ReLU(),
          nn.Linear(10,10)
        )


We also define model b, which has:

- Input layer of 64 neurons (one per feature)
- Hidden layer of 40 neurons
- ReLU activation function
- Output layer of 10 neurons (one per class)


In [None]:
modelb = nn.Sequential(
          nn.Linear(64, 40),
          nn.ReLU(),
          nn.Linear(40,10)
        )

We also define model c, which has:

- Input layer of 64 neurons (one per feature)
- Hidden layer of 10 neurons
- Tanh activation function
- Output layer of 10 neurons (one per class)

In [None]:
modelc = nn.Sequential(
          nn.Linear(64, 10),
          nn.Tanh(),
          nn.Linear(10,10)
        )

We also define model d, which has:

- Input layer of 64 neurons (one per feature)
- Hidden layer of 40 neurons
- Tanh activation function
- Output layer of 10 neurons (one per class)

In [None]:
modeld = nn.Sequential(
          nn.Linear(64, 40),
          nn.Tanh(),
          nn.Linear(40,10)
        )

We also define model e, which has:

- Input layer of 64 neurons
- A first hidden layer of 10 neurons
- ReLU activation function
- A second hidden layer of 10 neurons
- ReLU activation function
- Output layer of 10 neurons (one per class)


In [None]:
modele = nn.Sequential(
          nn.Linear(64, 10),
          nn.ReLU(),
          nn.Linear(10,10),
          nn.ReLU(),
          nn.Linear(10,10)
        )

We also define model f, which has:

- Input layer of 64 neurons
- A first hidden layer of 40 neurons
- ReLU activation function
- A second hidden layer of 40 neurons
- ReLU activation function
- Output layer of 10 neurons

In [None]:
modelf = nn.Sequential(
          nn.Linear(64, 40),
          nn.ReLU(),
          nn.Linear(40,40),
          nn.ReLU(),
          nn.Linear(40,10)
        )
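
To get a feel for how much the six architectures differ in size, a small helper can count trainable parameters. `count_parameters` is a hypothetical name introduced for this sketch, and the two networks rebuild the layer sizes of models a and f.

```python
import torch.nn as nn

def count_parameters(model):
    # Total number of trainable weights and biases
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

# Same layer sizes as model a (smallest) and model f (largest)
small = nn.Sequential(nn.Linear(64, 10), nn.ReLU(), nn.Linear(10, 10))
large = nn.Sequential(nn.Linear(64, 40), nn.ReLU(), nn.Linear(40, 40), nn.ReLU(), nn.Linear(40, 10))

print(count_parameters(small))   # 64*10 + 10 + 10*10 + 10 = 760
print(count_parameters(large))   # 64*40 + 40 + 40*40 + 40 + 40*10 + 10 = 4650
```

The activation function adds no parameters, so models a and c (and likewise b and d) have the same size.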

## Create datasets and dataloaders for PyTorch

In [None]:
# Create datasets
feats_train = df_train.to_numpy()[:,0:64].astype(np.float32)
labels_train = df_train.to_numpy()[:,64].astype(int)
dataset_train = [ {"features":feats_train[i,:], "labels":labels_train[i]} for i in range(feats_train.shape[0]) ]

feats_val = df_val.to_numpy()[:,0:64].astype(np.float32)
labels_val = df_val.to_numpy()[:,64].astype(int)
dataset_val = [ {"features":feats_val[i,:], "labels":labels_val[i]} for i in range(feats_val.shape[0]) ]

feats_test = df_test.to_numpy()[:,0:64].astype(np.float32)
labels_test = df_test.to_numpy()[:,64].astype(int)
dataset_test = [ {"features":feats_test[i,:], "labels":labels_test[i]} for i in range(feats_test.shape[0]) ]

In [None]:
# Create dataloaders
dataloader_train = torch.utils.data.DataLoader(dataset_train, batch_size=128, shuffle=True, num_workers=0)
dataloader_val = torch.utils.data.DataLoader(dataset_val, batch_size=128, shuffle=True, num_workers=0)
dataloader_test = torch.utils.data.DataLoader(dataset_test, batch_size=128, shuffle=True, num_workers=0)
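
To see what the training loop receives from such a dataloader, the sketch below builds one the same way from synthetic zero arrays (placeholders for `feats_train` and `labels_train`) and inspects a single batch.

```python
import numpy as np
import torch

feats = np.zeros((300, 64), dtype=np.float32)   # synthetic stand-in for feats_train
labels = np.zeros(300, dtype=int)               # synthetic stand-in for labels_train
dataset = [{"features": feats[i, :], "labels": labels[i]} for i in range(feats.shape[0])]

loader = torch.utils.data.DataLoader(dataset, batch_size=128, shuffle=True, num_workers=0)

# The default collate function stacks each dictionary entry into one tensor per key
batch = next(iter(loader))
print(batch["features"].shape)   # torch.Size([128, 64])
print(batch["labels"].shape)     # torch.Size([128])
print(len(loader))               # ceil(300 / 128) = 3 batches
```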

## Training

We then tell PyTorch to run the model on the GPU.

Since loss measurements can fluctuate (noise) over short time spans, the stopping criterion must be implemented robustly. The code saves checkpoints so that the networks corresponding to different training epochs can be recovered.

For each network to be trained:

(a) Compute the training loss and the validation loss, as well as the total training time required. Training must stop when the validation loss starts to rise, to avoid overfitting.

(b) Plot the training loss and the validation loss as a function of time. Both curves must appear in the same plot, with appropriate legends.

(c) Generate the normalized confusion matrix and the normalized accuracy on the training set. The confusion matrix must be displayed with different colors or shades of gray according to the accuracy value. The matrix must be computed with scikit-learn, using the option normalize = "true".

(d) Generate the normalized confusion matrix and the normalized accuracy on the validation set. The confusion matrix must be displayed with different colors according to the accuracy value.


3) Using the best network found in validation (the one with the highest validation accuracy), compute the normalized confusion matrix and the normalized accuracy on the test set.
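
One common way to make the stopping rule in (a) robust to noise is a patience counter: training stops only after the validation loss has failed to improve for several consecutive epochs. The sketch below isolates that logic in plain Python. The function name and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the epoch at which training stops, or None if it runs to the end."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss                      # new best: this is where a checkpoint would be saved
            epochs_without_improvement = 0   # reset the counter on every improvement
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Hypothetical validation losses: a noisy bump at epoch 2, then a sustained rise
losses = [1.0, 0.8, 0.85, 0.7, 0.72, 0.75, 0.9]
print(early_stopping_epoch(losses, patience=3))   # 6
print(early_stopping_epoch(losses, patience=1))   # 2
```

With `patience=1` the single noisy epoch already ends training, while `patience=3` rides through it and stops only once the loss has risen for three epochs in a row.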

In [None]:
# We need to train one network per case, so we wrap training in a function and reuse it

def modeloDeEntrenamiento(modelX):
  # Run on the GPU when one is available, otherwise fall back to the CPU
  device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
  modelX = modelX.to(device)
  # Define the loss function and the optimizer
  criterion = nn.CrossEntropyLoss() # Loss function
  optimizer = torch.optim.Adam(modelX.parameters(), lr=1e-3) # Optimizer

  # Early-stopping state
  mejor_perdida= float('inf')
  paciencia= 10
  sinMejoras= 0

  # Start timing the training run
  start = time.time()

  # Per-epoch loss and accuracy, plus the epochs the training lasted
  loss_train=[]
  accuracy_train=[]
  loss_val=[]
  accuracy_val=[]
  epochs = []



  # loop over the dataset multiple times
  for epoch in range(1000):

    # Per-batch training and validation losses for this epoch
    loss_train_batches=[]
    loss_val_batches=[]
    # Per-batch accuracies for both cases
    accuracy_train_batches=[]
    accuracy_val_batches=[]

    #---- Training phase ----
    modelX.train()

    # Reset the accumulated training loss and the correct/total prediction counters used for accuracy
    perdida_entrenamiento=0.0
    prediccionesCorrectas = 0
    totalPredicciones = 0

    # Train on the current epoch
    for data in dataloader_train:
      # Process the current batch
      inputs = data["features"].to(device)  # Features
      labels = data["labels"].to(device)  # Classes
      # zero the parameter gradients
      optimizer.zero_grad()
      # forward + backward + optimize
      outputs = modelX(inputs)
      loss = criterion(outputs, labels)
      loss.backward() # backpropagation
      optimizer.step()

      # Record the training loss of the current batch: loss.item() goes into the
      # per-batch list and is also accumulated as an epoch total
      loss_train_batches.append(loss.item())
      perdida_entrenamiento += loss.item()

      # torch.max along dim 1 returns each row's largest logit and its index; the index is the predicted class
      _, Predicciones = torch.max(outputs.data, 1)
      totalPredicciones += labels.size(0)

      # Accuracy of the current batch: correct predictions divided by the batch size
      correctas_batch = (Predicciones == labels).sum().item()
      prediccionesCorrectas += correctas_batch
      accuracy_train_batches.append(correctas_batch / labels.size(0))

    # Store the training loss and accuracy of the current epoch
    loss_train.append(np.mean(loss_train_batches)) # Mean loss
    accuracy_train.append(prediccionesCorrectas / totalPredicciones) # Accuracy over the whole epoch


    #------ Now evaluate on the validation set ------
    modelX.eval()
    with torch.no_grad():
      # Reset the validation loss (used for early stopping) and the accuracy counters
      perdida_validacion = 0.0
      prediccionesCorrectas = 0
      totalPredicciones= 0

      # Iterate over dataloader_val to evaluate the model on the validation data
      for i, data in enumerate(dataloader_val, 0):
        # Process the current batch
        inputs = data["features"].to(device) # Features
        labels = data["labels"].to(device)   # Classes

        outputs = modelX(inputs)              # Predictions (logits)

        # Record the validation loss of the current batch and accumulate it
        # as an epoch total for early stopping
        loss = criterion(outputs, labels)
        loss_val_batches.append(loss.item())
        perdida_validacion += loss.item()

        # Predicted class = index of the largest logit in each row
        _, Predicciones = torch.max(outputs.data, 1)

        # Update the counters used to compute accuracy
        totalPredicciones += labels.size(0)
        prediccionesCorrectas += (Predicciones == labels).sum().item()
      accuracy=(prediccionesCorrectas / totalPredicciones)
      accuracy_val_batches.append(accuracy)
      print("Accuracy: {:.2f}%".format(accuracy * 100))

    # Store the validation loss and accuracy of the current epoch
    loss_val.append(np.mean(loss_val_batches)) # Mean loss
    accuracy_val.append(np.mean(accuracy_val_batches)) # Mean accuracy


    # Record the epoch before the early-stopping check so that epochs,
    # loss_train and loss_val always have the same length when plotted
    epochs.append(epoch)

    # Print the training/validation loss and accuracy of the current epoch
    print(("Epoch: %d, train loss: %.4f, train accuracy: %.4f, val loss: %.4f, val accuracy: %.4f"  %(epoch, loss_train[epoch], accuracy_train[epoch], loss_val[epoch], accuracy_val[epoch])))

    # Early stopping with patience
    if perdida_validacion < mejor_perdida:
      mejor_perdida = perdida_validacion
      sinMejoras = 0 # reset the counter on every improvement
      torch.save(modelX.state_dict(), 'Mejor_Modelo.pt') # keep the best model for the confusion matrices
    else:
      sinMejoras += 1
      if sinMejoras >= paciencia:
        print('Early stop at epoch %d' % epoch)
        break

  end = time.time()
  print('Finished Training, total time %f seconds' % (end - start))
  # Plot training and validation loss
  plt.figure(figsize = (8, 5))
  plt.title('Model loss on train & validation')
  plt.xlabel('No. epochs')
  plt.ylabel('Loss')
  plt.plot(epochs, loss_train, 'b', label = 'Train')
  plt.plot(epochs, loss_val, 'r', label = 'Validation')
  plt.grid()
  plt.legend()
  # Confusion matrix and accuracy on the training set, using the best checkpoint
  modelX.load_state_dict(torch.load('Mejor_Modelo.pt'))
  modelX.eval() # evaluation mode: we only predict here
  train_p=[]
  train_l=[]
  with torch.no_grad():
    for data in dataloader_train:
      inputs = data["features"].to(device)
      labels = data["labels"].to(device)
      outputs = modelX(inputs)
      _,predicciones = torch.max(outputs,1)
      # Move to the CPU and flatten into plain lists, as scikit-learn expects
      train_p.extend(predicciones.cpu().numpy())
      train_l.extend(labels.cpu().numpy())
  matriz_train = confusion_matrix(train_l, train_p, normalize= 'true')
  # Normalized accuracy, taken here as the mean of the diagonal (per-class recall)
  print('Normalized accuracy (train): %.4f' % np.mean(np.diag(matriz_train)))
  plt.figure(figsize = (8, 5))
  sns.heatmap(matriz_train, annot=True, cmap='Reds')
  plt.title('Normalized confusion matrix (train)')
  plt.xlabel('Predicted label')
  plt.ylabel('True label')
  plt.show()

  # Confusion matrix and accuracy on the validation set, using the best checkpoint
  modelX.load_state_dict(torch.load('Mejor_Modelo.pt'))
  modelX.eval()
  val_p=[]
  val_l=[]
  with torch.no_grad():
    for data in dataloader_val:
      inputs = data["features"].to(device)
      labels = data["labels"].to(device)
      outputs = modelX(inputs)
      _,predicciones = torch.max(outputs,1)
      # Move to the CPU and flatten into plain lists, as scikit-learn expects
      val_p.extend(predicciones.cpu().numpy())
      val_l.extend(labels.cpu().numpy())
  matriz_val = confusion_matrix(val_l, val_p, normalize= 'true')
  # Normalized accuracy, taken here as the mean of the diagonal (per-class recall)
  print('Normalized accuracy (validation): %.4f' % np.mean(np.diag(matriz_val)))
  plt.figure(figsize = (8, 5))
  sns.heatmap(matriz_val, annot=True, cmap='Reds')
  plt.title('Normalized confusion matrix (validation)')
  plt.xlabel('Predicted label')
  plt.ylabel('True label')
  plt.show()
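
As a small illustration of what `normalize = "true"` does, the sketch below builds a row-normalized confusion matrix from hypothetical labels and predictions. Reading the "normalized accuracy" as the mean of the diagonal is an assumption made here, not something the assignment text defines.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2])   # hypothetical true labels
y_pred = np.array([0, 0, 0, 1, 1, 1, 2, 2, 0, 1])   # hypothetical predictions

# normalize="true" divides each row by the number of samples of that true class
cm = confusion_matrix(y_true, y_pred, normalize="true")
print(cm)                  # each row sums to 1

# Mean of the diagonal: average per-class recall
normalized_accuracy = np.mean(np.diag(cm))
print(normalized_accuracy)   # (0.75 + 1.0 + 0.5) / 3 = 0.75
```

Because each row is normalized separately, classes with few samples weigh as much as frequent ones in this average.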


In [None]:
modeloDeEntrenamiento(modela)
modeloDeEntrenamiento(modelb)
modeloDeEntrenamiento(modelc)
modeloDeEntrenamiento(modeld)
modeloDeEntrenamiento(modele)
modeloDeEntrenamiento(modelf)

RuntimeError: ignored