# **CC6204 - Deep Learning**

## **Early-stage diabetes risk prediction**

In this assignment you will develop a learning model that can determine whether a person is at risk of developing diabetes in the future. The data were collected through patient surveys at the Sylhet Diabetes Hospital in Bangladesh. They have been curated and verified by health professionals, so they are reliable enough to build a learning model.

First, we import the packages needed to work with this data.

In [323]:
import pathlib

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

## **Downloading the data file**
The data is available as a CSV file containing 520 samples with 17 attributes. The attributes are:

*   Age: numeric
*   Gender: \[Male, Female\]
*   Polyuria
*   Polydipsia
*   sudden weight loss
*   weakness
*   Polyphagia
*   Genital thrush
*   visual blurring
*   Itching
*   Irritability 
*   delayed healing 
*   partial paresis 
*   muscle stiffness
*   Alopecia
*   Obesity
*   Class: \[Positive, Negative\]

All attributes listed without values take values in the set \[Yes, No\].

In the following code cells we download the file and read it with Pandas. Finally, we display a few rows of the dataset.



In [324]:
!wget https://archive.ics.uci.edu/ml/machine-learning-databases/00529/diabetes_data_upload.csv

--2022-09-18 03:06:31--  https://archive.ics.uci.edu/ml/machine-learning-databases/00529/diabetes_data_upload.csv
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 34682 (34K) [application/x-httpd-php]
Saving to: ‘diabetes_data_upload.csv.1’


2022-09-18 03:06:33 (219 KB/s) - ‘diabetes_data_upload.csv.1’ saved [34682/34682]



In [325]:
dataset_path = 'diabetes_data_upload.csv'

In [326]:
column_names = ['Age','Gender','Polyuria','Polydipsia','sudden weight loss',
                'weakness', 'Polyphagia', 'Genital thrush', 'visual blurring','Itching', 'Irritability', 'delayed healing',
                'partial paresis', 'muscle stiffness', 'Alopecia', 'Obesity', 'class']
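# NOTE: combined with names=..., header=1 treats the file's second line (the first
# data record) as the header row to discard, so 519 of the 520 records are loaded;
# header=0 would keep all 520.
raw_dataset = pd.read_csv(dataset_path, names=column_names,
                      na_values = "?", comment='\t',
                      sep=",", skipinitialspace=True, header=1)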

dataset = raw_dataset.copy()
dataset.head()

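| | Age | Gender | Polyuria | Polydipsia | sudden weight loss | weakness | Polyphagia | Genital thrush | visual blurring | Itching | Irritability | delayed healing | partial paresis | muscle stiffness | Alopecia | Obesity | class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 58 | Male | No | No | No | Yes | No | No | Yes | No | No | No | Yes | No | Yes | No | Positive |
| 1 | 41 | Male | Yes | No | No | Yes | Yes | No | No | Yes | No | Yes | No | Yes | Yes | No | Positive |
| 2 | 45 | Male | No | No | Yes | Yes | Yes | Yes | No | Yes | No | Yes | No | No | No | No | Positive |
| 3 | 60 | Male | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Positive |
| 4 | 55 | Male | Yes | Yes | No | Yes | Yes | No | Yes | Yes | No | Yes | No | Yes | Yes | Yes | Positive |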


# **Preprocessing the dataset**
The dataset is mixed. "Age" is the only numeric attribute; all the others are nominal. To process nominal data in a neural network, it is best to convert it to a numeric representation. In the following example we transform the "Gender" attribute, with nominal values "Female" and "Male", into the values 1.0 and 0.0, respectively.

In the same way, we map the nominal values of all the other attributes to 0.0 and 1.0.

In [327]:
gender = dataset.pop('Gender')
dataset['gender'] = (gender == 'Female')*1.0

column_class = dataset.pop('class')
dataset['class'] = (column_class=='Positive')*1.0

for column in column_names:
  if column not in ['Gender', 'class', 'Age']:
    column_class = dataset.pop(column)
    dataset[column] = (column_class=='Yes')*1.0

# Display part of the data to check that the conversion was done correctly
dataset.tail()


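| | Age | gender | class | Polyuria | Polydipsia | sudden weight loss | weakness | Polyphagia | Genital thrush | visual blurring | Itching | Irritability | delayed healing | partial paresis | muscle stiffness | Alopecia | Obesity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 514 | 39 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 515 | 48 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 516 | 58 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 |
| 517 | 32 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 518 | 42 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |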
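The same Yes/No, Male/Female and Positive/Negative encoding can also be written without popping and re-appending columns. The following is a minimal sketch on a made-up two-row frame; the helper name `encode_binary` and the toy values are assumptions for illustration, not part of the notebook. Unlike the cell above, it keeps the original column order and the original `Gender` name.

```python
import pandas as pd

# Hypothetical two-row frame that mimics the structure of the dataset
toy = pd.DataFrame({
    "Age": [40, 58],
    "Gender": ["Male", "Female"],
    "Polyuria": ["No", "Yes"],
    "Obesity": ["Yes", "No"],
    "class": ["Positive", "Negative"],
})

def encode_binary(df):
    """Return a copy where every nominal column is encoded as 0.0 / 1.0."""
    out = df.copy()
    # Female -> 1.0, Male -> 0.0 (same convention as the notebook)
    out["Gender"] = (out["Gender"] == "Female").astype(float)
    # Positive -> 1.0, Negative -> 0.0
    out["class"] = (out["class"] == "Positive").astype(float)
    # Every remaining column except Age is a Yes/No attribute
    yes_no_cols = [c for c in out.columns if c not in ("Age", "Gender", "class")]
    out[yes_no_cols] = (out[yes_no_cols] == "Yes").astype(float)
    return out

encoded = encode_binary(toy)
print(encoded)
```

Comparing a whole block of columns against `"Yes"` in one expression avoids the explicit Python loop, at the cost of assuming that every non-listed column really is a Yes/No attribute.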


Now we normalize the "Age" attribute and split the data into a training set and a test set. This split is always necessary to check whether the model has learned to generalize to data outside the training set.

In [328]:
max_age = dataset["Age"].max()
dataset["Age"] = dataset["Age"] / max_age
dataset.tail()


#80% of the data for training and 20% for testing
train_dataset = dataset.sample(frac=0.8,random_state=0)
test_dataset = dataset.drop(train_dataset.index)

train_labels = train_dataset.pop('class')
test_labels = test_dataset.pop('class')
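The scaling and the `sample`/`drop` split used above can be checked on a small frame. This is only a sketch: the ten-row frame `toy` and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy frame standing in for the encoded dataset
toy = pd.DataFrame({
    "Age": [20, 35, 40, 50, 60, 65, 70, 80, 90, 100],
    "class": [0.0, 1.0] * 5,
})

# Scale Age by its maximum, as in the cell above, so the largest value becomes 1.0
toy["Age"] = toy["Age"] / toy["Age"].max()

# 80/20 split: sample 80% of the rows; the rows left over form the test set
train = toy.sample(frac=0.8, random_state=0)
test = toy.drop(train.index)

# The two index sets must not share any row
overlap = set(train.index) & set(test.index)
print(len(train), len(test), overlap)
```

Because `test` is built by dropping the sampled indices, the two sets are disjoint by construction and together cover every row.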


In [352]:
#Convert everything to NumPy arrays
X_train = train_dataset.to_numpy()
X_test = test_dataset.to_numpy()

Y_train = train_labels.to_numpy()
Y_test = test_labels.to_numpy()

Y_train = Y_train[:,None]
Y_test = Y_test[:,None]

print(X_train.shape)
print(Y_train.shape)
print(X_test.shape)
print(Y_test.shape)

(415, 16)
(415, 1)
(104, 16)
(104, 1)


# **Part 1**
Design and train a multilayer perceptron on the data above. Try to get your model to reach the highest possible test accuracy (ideally above 93%). For this first experiment you may use the MLP implementation seen in class (based on NumPy), or a framework such as TensorFlow or PyTorch if you prefer. Keep the following requirements in mind for this first experiment:

*   Use stochastic gradient descent with a mini-batch size of 20.
*   Use a learning rate of 0.01.
*   The mini-batches are NOT generated randomly.

Plot the loss function against the epochs.
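The requirement of non-random mini-batches can be sketched in NumPy as a generator that walks the arrays in their stored order. The function name `sequential_minibatches` and the zero-filled arrays are assumptions for illustration; only the shapes (415 samples, 16 features) and the batch size of 20 come from the notebook.

```python
import numpy as np

def sequential_minibatches(X, Y, batch_size):
    """Yield consecutive (X, Y) slices in the stored order, without shuffling."""
    for start in range(0, len(X), batch_size):
        end = start + batch_size
        # The last slice may be shorter than batch_size
        yield X[start:end], Y[start:end]

# Placeholder arrays with the same shapes as the training data
X = np.zeros((415, 16))
Y = np.zeros((415, 1))

sizes = [len(xb) for xb, _ in sequential_minibatches(X, Y, batch_size=20)]
print(len(sizes), sizes[-1])
```

With 415 training samples and a batch size of 20, each epoch consists of 20 full mini-batches followed by one mini-batch of 15 samples.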

## Answer


In [353]:
#Packages to use pytorch
import torch
import torch.nn as nn 
import torch.nn.functional as F
import torch.optim as optim
import torch.utils.data as data
from functools import reduce

import time

In [354]:
SEED = 1234

torch.manual_seed(SEED)
torch.cuda.manual_seed(SEED)
torch.backends.cudnn.deterministic = True

In [355]:
#PyTorch's DataLoader wraps a dataset in an iterator over mini-batches, which is convenient for training
#The assignment fixes the mini-batch size at 20 for Part 1
BATCH_SIZE = 20


X_train = torch.from_numpy(X_train).float()
Y_train = torch.from_numpy(Y_train.flatten()).long()
X_test = torch.from_numpy(X_test).float()
Y_test = torch.from_numpy(Y_test.flatten()).long()

train_data = list(zip(X_train, Y_train))
test_data = list(zip(X_test, Y_test))


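#NOTE: shuffle=True reshuffles the training samples at every epoch (the setting Part 4 asks for);
#the non-random mini-batches required in Part 1 correspond to shuffle=False
train_iterator = data.DataLoader(train_data, shuffle=True, batch_size=BATCH_SIZE)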
test_iterator = data.DataLoader(test_data, batch_size=BATCH_SIZE)


In [356]:
#Class for our neural network. Inheriting from torch.nn.Module lets PyTorch
#register the parameters of the layers defined in the constructor

class MLP(nn.Module):
  # We need to define at least two methods: the constructor and forward

  #Constructor is for member definitions
  def __init__(self, input_dim, output_dim):
    super().__init__()

    self.fc1 = nn.Linear(input_dim, 8)
    self.fc2 = nn.Linear(8, 4)
    self.fc3 = nn.Linear(4, output_dim)

  #Forward: what happens when we feed the network with data
  def forward(self, input):
    batch_size = input.shape[0]
    input = input.view(batch_size, -1)
    h_1 = F.relu(self.fc1(input))
    h_2 = F.relu(self.fc2(h_1))
    y_pred = self.fc3(h_2)

    #Our network returns the output of the final layer
    return y_pred

In [374]:
# Create the model
INPUT_DIM = X_train.shape[1]
OUTPUT_DIM = 2


model = MLP(INPUT_DIM, OUTPUT_DIM)

In [375]:
#How many parameters are there in our model?

def count_parameters(model):
  return sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f'The model has {count_parameters(model):,} trainable parameters')

The model has 182 trainable parameters
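The figure of 182 can be reproduced by hand: a fully connected layer with `n_in` inputs and `n_out` outputs has `n_in * n_out` weights plus `n_out` biases. The sketch below applies this to the layer sizes 16, 8, 4 and 2 used in this notebook; the helper name `linear_params` is an assumption for illustration.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

# Input features, two hidden layers, two output classes
layer_sizes = [16, 8, 4, 2]

per_layer = [linear_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [136, 36, 10] 182
```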


In [376]:
#Create the object for the optimization. 
optimizer = optim.SGD(model.parameters(), lr=0.01)

In [377]:
#Define the loss criterion
#In PyTorch, CrossEntropyLoss combines the log-softmax activation and the negative log-likelihood loss
#The two are fused for numerical stability and efficiency, so the network must output raw logits

criterion = nn.CrossEntropyLoss()
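What this loss computes can be sketched in plain NumPy: apply a log-softmax to the raw outputs, pick the log-probability of the correct class, and average the negated values over the batch. The functions `log_softmax` and `cross_entropy` below are an illustrative re-implementation under that assumption, not PyTorch's code, and the logits are made up.

```python
import numpy as np

def log_softmax(logits):
    """Row-wise log-softmax; subtracting the row maximum avoids overflow."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def cross_entropy(logits, targets):
    """Mean negative log-likelihood of the target class."""
    log_probs = log_softmax(logits)
    picked = log_probs[np.arange(len(targets)), targets]
    return -picked.mean()

# Two made-up samples with two classes each
logits = np.array([[2.0, 0.0], [0.0, 0.0]])
targets = np.array([0, 1])

loss = cross_entropy(logits, targets)
print(loss)
```

The second sample has equal logits, so its contribution is log(2), the loss of a uniform guess between two classes.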

In [378]:
#In PyTorch we can choose where to run our program, so we
#select the device depending on whether a GPU is available

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

In [379]:
#Send the model and the loss object to the selected device

model = model.to(device)
criterion = criterion.to(device)

In [380]:
#Function to compute the accuracy. We assume the predictions and the labels are tensors on the same device

def calculate_accuracy(y_pred, y):
  top_pred = y_pred.argmax(1, keepdim=True)
  correct = top_pred.eq(y.view_as(top_pred)).sum()
  acc = correct.float()/y.shape[0]
  return acc


In [381]:
#Define a function to perform training

def train(model, iterator, optimizer, criterion, device):
  epoch_loss = 0
  epoch_acc = 0

  #Put the network in training mode. This enables the training-time behaviour
  #of layers such as dropout or batch normalization (this MLP has neither)
  model.train()
  
  #Training loop
  for (x, y) in iterator:
    x = x.to(device) #Data
    y = y.to(device) #Labels
        
    optimizer.zero_grad() #Clean gradients
             
    y_pred = model(x) #Feed the network with data
    
    loss = criterion(y_pred, y) #Compute the loss
       
    acc = calculate_accuracy(y_pred, y) #Compute the accuracy
        
    loss.backward() #Compute gradients
        
    optimizer.step() #Apply update rules
        
    epoch_loss += loss.item()
    epoch_acc += acc.item()
        
  return epoch_loss / len(iterator), epoch_acc / len(iterator)

In [385]:
#Function to test neural network

def evaluate(model, iterator, criterion, device):
  epoch_loss = 0
  epoch_acc = 0

  #We put the network in testing mode
  #In this mode, Pytorch doesn't use features only reserved for 
  #training (dropout for instance)    
  model.eval()
    
  with torch.no_grad(): #disable the autograd engine (save computation and memory)
        
    for (x, y) in iterator:
      x = x.to(device)
      y = y.to(device)

      y_pred = model(x)
    
      loss = criterion(y_pred, y)

      acc = calculate_accuracy(y_pred, y)

      epoch_loss += loss.item()
      epoch_acc += acc.item()
  return epoch_loss / len(iterator), epoch_acc / len(iterator)
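Both `train` and `evaluate` average the per-batch values and divide by the number of batches. When the last batch is smaller (104 test samples with a batch size of 20 leave a final batch of 4), that small batch gets the same weight as a full one. The sketch below contrasts the two ways of averaging; the function names and the per-batch counts of correct predictions are hypothetical.

```python
def mean_of_batch_means(batch_correct, batch_sizes):
    """Average of per-batch accuracies, as done in train() and evaluate()."""
    per_batch = [c / n for c, n in zip(batch_correct, batch_sizes)]
    return sum(per_batch) / len(per_batch)

def sample_weighted_mean(batch_correct, batch_sizes):
    """Accuracy over all samples: total correct divided by total samples."""
    return sum(batch_correct) / sum(batch_sizes)

# 104 test samples in batches of 20: five full batches and one batch of 4
batch_sizes = [20, 20, 20, 20, 20, 4]
# Hypothetical number of correct predictions in each batch
batch_correct = [19, 19, 19, 19, 19, 4]

print(mean_of_batch_means(batch_correct, batch_sizes))   # (5 * 0.95 + 1.0) / 6
print(sample_weighted_mean(batch_correct, batch_sizes))  # 99 / 104
```

The difference is small here, but accumulating `correct` and `loss * batch_size` and dividing by the dataset size would give figures that do not depend on how the data is batched.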

In [386]:
def epoch_time(start_time, end_time):
    elapsed_time = end_time - start_time
    elapsed_mins = int(elapsed_time / 60)
    elapsed_secs = int(elapsed_time - (elapsed_mins * 60))
    return elapsed_mins, elapsed_secs

In [389]:
#Let's perform the training

EPOCHS = 1000

best_test_loss = float('inf')

for epoch in range(EPOCHS):
    
  start_time = time.time()

  #Train + validation cycles  
  train_loss, train_acc = train(model, train_iterator, optimizer, criterion, device)
  test_loss, test_acc = evaluate(model, test_iterator, criterion, device)
    
  #If we find a smaller loss, we save the model
  if test_loss < best_test_loss:
    best_test_loss = test_loss
    torch.save(model.state_dict(), 'saved-model.pt')
    
  end_time = time.time()

  epoch_mins, epoch_secs = epoch_time(start_time, end_time)
    
  print(f'Epoch: {epoch+1:02} | Epoch Time: {epoch_mins}m {epoch_secs}s')
  print(f'\tTrain Loss: {train_loss:.3f} | Train Acc: {train_acc*100:.2f}%')
  print(f'\t Val. Loss: {test_loss:.3f} |  Val. Acc: {test_acc*100:.2f}%')

Epoch: 01 | Epoch Time: 0m 0s
	Train Loss: 0.186 | Train Acc: 94.76%
	 Val. Loss: 0.180 |  Val. Acc: 93.33%
Epoch: 02 | Epoch Time: 0m 0s
	Train Loss: 0.186 | Train Acc: 94.52%
	 Val. Loss: 0.179 |  Val. Acc: 94.17%
Epoch: 03 | Epoch Time: 0m 0s
	Train Loss: 0.185 | Train Acc: 94.68%
	 Val. Loss: 0.177 |  Val. Acc: 94.17%
Epoch: 04 | Epoch Time: 0m 0s
	Train Loss: 0.184 | Train Acc: 94.68%
	 Val. Loss: 0.177 |  Val. Acc: 94.17%
Epoch: 05 | Epoch Time: 0m 0s
	Train Loss: 0.185 | Train Acc: 94.52%
	 Val. Loss: 0.177 |  Val. Acc: 94.17%
Epoch: 06 | Epoch Time: 0m 0s
	Train Loss: 0.183 | Train Acc: 94.44%
	 Val. Loss: 0.175 |  Val. Acc: 94.17%
Epoch: 07 | Epoch Time: 0m 0s
	Train Loss: 0.180 | Train Acc: 94.76%
	 Val. Loss: 0.175 |  Val. Acc: 95.00%
Epoch: 08 | Epoch Time: 0m 0s
	Train Loss: 0.181 | Train Acc: 94.68%
	 Val. Loss: 0.174 |  Val. Acc: 95.00%
Epoch: 09 | Epoch Time: 0m 0s
	Train Loss: 0.182 | Train Acc: 94.60%
	 Val. Loss: 0.173 |  Val. Acc: 95.00%
...

Epoch: 87 | Epoch Time: 0m 0s
	Train Loss: 0.151 | Train Acc: 94.76%
	 Val. Loss: 0.153 |  Val. Acc: 95.00%
Epoch: 88 | Epoch Time: 0m 0s
	Train Loss: 0.153 | Train Acc: 95.08%
	 Val. Loss: 0.154 |  Val. Acc: 95.00%
Epoch: 89 | Epoch Time: 0m 0s
	Train Loss: 0.153 | Train Acc: 94.84%
	 Val. Loss: 0.151 |  Val. Acc: 95.00%
Epoch: 90 | Epoch Time: 0m 0s
	Train Loss: 0.152 | Train Acc: 94.92%
	 Val. Loss: 0.151 |  Val. Acc: 95.00%
Epoch: 91 | Epoch Time: 0m 0s
	Train Loss: 0.150 | Train Acc: 95.00%
	 Val. Loss: 0.152 |  Val. Acc: 95.00%
Epoch: 92 | Epoch Time: 0m 0s
	Train Loss: 0.149 | Train Acc: 95.24%
	 Val. Loss: 0.152 |  Val. Acc: 95.00%
Epoch: 93 | Epoch Time: 0m 0s
	Train Loss: 0.153 | Train Acc: 94.92%
	 Val. Loss: 0.152 |  Val. Acc: 95.00%
Epoch: 94 | Epoch Time: 0m 0s
	Train Loss: 0.152 | Train Acc: 94.92%
	 Val. Loss: 0.153 |  Val. Acc: 95.00%
Epoch: 95 | Epoch Time: 0m 0s
	Train Loss: 0.149 | Train Acc: 95.00%
	 Val. Loss: 0.152 |  Val. Acc: 95.00%
...

Epoch: 163 | Epoch Time: 0m 0s
	Train Loss: 0.134 | Train Acc: 95.00%
	 Val. Loss: 0.154 |  Val. Acc: 95.00%
Epoch: 164 | Epoch Time: 0m 0s
	Train Loss: 0.138 | Train Acc: 94.76%
	 Val. Loss: 0.155 |  Val. Acc: 95.00%
Epoch: 165 | Epoch Time: 0m 0s
	Train Loss: 0.133 | Train Acc: 95.00%
	 Val. Loss: 0.154 |  Val. Acc: 95.00%
Epoch: 166 | Epoch Time: 0m 0s
	Train Loss: 0.133 | Train Acc: 95.24%
	 Val. Loss: 0.155 |  Val. Acc: 95.00%
Epoch: 167 | Epoch Time: 0m 0s
	Train Loss: 0.133 | Train Acc: 95.48%
	 Val. Loss: 0.154 |  Val. Acc: 95.00%
Epoch: 168 | Epoch Time: 0m 0s
	Train Loss: 0.136 | Train Acc: 95.08%
	 Val. Loss: 0.154 |  Val. Acc: 95.00%
Epoch: 169 | Epoch Time: 0m 0s
	Train Loss: 0.132 | Train Acc: 95.71%
	 Val. Loss: 0.154 |  Val. Acc: 95.00%
Epoch: 170 | Epoch Time: 0m 0s
	Train Loss: 0.132 | Train Acc: 95.71%
	 Val. Loss: 0.154 |  Val. Acc: 95.00%
Epoch: 171 | Epoch Time: 0m 0s
	Train Loss: 0.131 | Train Acc: 95.48%
	 Val. Loss: 0.154 |  Val. Acc: 95.00%
...

Epoch: 248 | Epoch Time: 0m 0s
	Train Loss: 0.111 | Train Acc: 97.14%
	 Val. Loss: 0.153 |  Val. Acc: 94.17%
Epoch: 249 | Epoch Time: 0m 0s
	Train Loss: 0.113 | Train Acc: 97.06%
	 Val. Loss: 0.151 |  Val. Acc: 95.00%
Epoch: 250 | Epoch Time: 0m 0s
	Train Loss: 0.110 | Train Acc: 97.06%
	 Val. Loss: 0.150 |  Val. Acc: 95.00%
Epoch: 251 | Epoch Time: 0m 0s
	Train Loss: 0.109 | Train Acc: 97.14%
	 Val. Loss: 0.152 |  Val. Acc: 95.00%
Epoch: 252 | Epoch Time: 0m 0s
	Train Loss: 0.109 | Train Acc: 97.14%
	 Val. Loss: 0.150 |  Val. Acc: 95.00%
Epoch: 253 | Epoch Time: 0m 0s
	Train Loss: 0.112 | Train Acc: 97.06%
	 Val. Loss: 0.149 |  Val. Acc: 95.00%
Epoch: 254 | Epoch Time: 0m 0s
	Train Loss: 0.109 | Train Acc: 97.06%
	 Val. Loss: 0.151 |  Val. Acc: 95.00%
Epoch: 255 | Epoch Time: 0m 0s
	Train Loss: 0.108 | Train Acc: 97.14%
	 Val. Loss: 0.154 |  Val. Acc: 94.17%
Epoch: 256 | Epoch Time: 0m 0s
	Train Loss: 0.108 | Train Acc: 97.14%
	 Val. Loss: 0.151 |  Val. Acc: 95.00%
...

Epoch: 334 | Epoch Time: 0m 0s
	Train Loss: 0.087 | Train Acc: 98.10%
	 Val. Loss: 0.146 |  Val. Acc: 95.00%
Epoch: 335 | Epoch Time: 0m 0s
	Train Loss: 0.088 | Train Acc: 98.02%
	 Val. Loss: 0.145 |  Val. Acc: 95.00%
Epoch: 336 | Epoch Time: 0m 0s
	Train Loss: 0.086 | Train Acc: 98.10%
	 Val. Loss: 0.145 |  Val. Acc: 95.00%
Epoch: 337 | Epoch Time: 0m 0s
	Train Loss: 0.087 | Train Acc: 98.02%
	 Val. Loss: 0.141 |  Val. Acc: 95.00%
Epoch: 338 | Epoch Time: 0m 0s
	Train Loss: 0.086 | Train Acc: 98.10%
	 Val. Loss: 0.144 |  Val. Acc: 95.00%
Epoch: 339 | Epoch Time: 0m 0s
	Train Loss: 0.087 | Train Acc: 98.02%
	 Val. Loss: 0.145 |  Val. Acc: 95.00%
Epoch: 340 | Epoch Time: 0m 0s
	Train Loss: 0.085 | Train Acc: 98.10%
	 Val. Loss: 0.142 |  Val. Acc: 95.00%
Epoch: 341 | Epoch Time: 0m 0s
	Train Loss: 0.084 | Train Acc: 98.10%
	 Val. Loss: 0.142 |  Val. Acc: 95.00%
Epoch: 342 | Epoch Time: 0m 0s
	Train Loss: 0.084 | Train Acc: 98.10%
	 Val. Loss: 0.143 |  Val. Acc: 95.00%
...

Epoch: 411 | Epoch Time: 0m 0s
	Train Loss: 0.069 | Train Acc: 98.33%
	 Val. Loss: 0.149 |  Val. Acc: 95.00%
Epoch: 412 | Epoch Time: 0m 0s
	Train Loss: 0.069 | Train Acc: 98.33%
	 Val. Loss: 0.148 |  Val. Acc: 95.00%
Epoch: 413 | Epoch Time: 0m 0s
	Train Loss: 0.071 | Train Acc: 98.25%
	 Val. Loss: 0.147 |  Val. Acc: 95.00%
Epoch: 414 | Epoch Time: 0m 0s
	Train Loss: 0.069 | Train Acc: 98.33%
	 Val. Loss: 0.147 |  Val. Acc: 95.00%
Epoch: 415 | Epoch Time: 0m 0s
	Train Loss: 0.068 | Train Acc: 98.33%
	 Val. Loss: 0.149 |  Val. Acc: 95.00%
Epoch: 416 | Epoch Time: 0m 0s
	Train Loss: 0.068 | Train Acc: 98.33%
	 Val. Loss: 0.147 |  Val. Acc: 95.00%
Epoch: 417 | Epoch Time: 0m 0s
	Train Loss: 0.068 | Train Acc: 98.25%
	 Val. Loss: 0.149 |  Val. Acc: 95.00%
Epoch: 418 | Epoch Time: 0m 0s
	Train Loss: 0.067 | Train Acc: 98.33%
	 Val. Loss: 0.149 |  Val. Acc: 95.00%
Epoch: 419 | Epoch Time: 0m 0s
	Train Loss: 0.067 | Train Acc: 98.33%
	 Val. Loss: 0.149 |  Val. Acc: 95.00%
...

Epoch: 493 | Epoch Time: 0m 0s
	Train Loss: 0.059 | Train Acc: 98.49%
	 Val. Loss: 0.156 |  Val. Acc: 95.00%
Epoch: 494 | Epoch Time: 0m 0s
	Train Loss: 0.057 | Train Acc: 98.57%
	 Val. Loss: 0.156 |  Val. Acc: 95.00%
Epoch: 495 | Epoch Time: 0m 0s
	Train Loss: 0.056 | Train Acc: 98.57%
	 Val. Loss: 0.157 |  Val. Acc: 95.00%
Epoch: 496 | Epoch Time: 0m 0s
	Train Loss: 0.055 | Train Acc: 98.33%
	 Val. Loss: 0.157 |  Val. Acc: 95.00%
Epoch: 497 | Epoch Time: 0m 0s
	Train Loss: 0.055 | Train Acc: 98.57%
	 Val. Loss: 0.156 |  Val. Acc: 95.00%
Epoch: 498 | Epoch Time: 0m 0s
	Train Loss: 0.055 | Train Acc: 98.33%
	 Val. Loss: 0.157 |  Val. Acc: 95.00%
Epoch: 499 | Epoch Time: 0m 0s
	Train Loss: 0.058 | Train Acc: 98.41%
	 Val. Loss: 0.158 |  Val. Acc: 95.00%
Epoch: 500 | Epoch Time: 0m 0s
	Train Loss: 0.055 | Train Acc: 98.57%
	 Val. Loss: 0.157 |  Val. Acc: 95.00%
Epoch: 501 | Epoch Time: 0m 0s
	Train Loss: 0.057 | Train Acc: 98.49%
	 Val. Loss: 0.158 |  Val. Acc: 95.00%
...

Epoch: 571 | Epoch Time: 0m 0s
	Train Loss: 0.045 | Train Acc: 98.81%
	 Val. Loss: 0.164 |  Val. Acc: 95.00%
Epoch: 572 | Epoch Time: 0m 0s
	Train Loss: 0.045 | Train Acc: 98.57%
	 Val. Loss: 0.164 |  Val. Acc: 95.00%
Epoch: 573 | Epoch Time: 0m 0s
	Train Loss: 0.045 | Train Acc: 98.81%
	 Val. Loss: 0.164 |  Val. Acc: 95.00%
Epoch: 574 | Epoch Time: 0m 0s
	Train Loss: 0.044 | Train Acc: 98.57%
	 Val. Loss: 0.165 |  Val. Acc: 95.00%
Epoch: 575 | Epoch Time: 0m 0s
	Train Loss: 0.044 | Train Acc: 98.57%
	 Val. Loss: 0.165 |  Val. Acc: 95.00%
Epoch: 576 | Epoch Time: 0m 0s
	Train Loss: 0.044 | Train Acc: 98.81%
	 Val. Loss: 0.165 |  Val. Acc: 95.00%
Epoch: 577 | Epoch Time: 0m 0s
	Train Loss: 0.044 | Train Acc: 98.81%
	 Val. Loss: 0.165 |  Val. Acc: 95.00%
Epoch: 578 | Epoch Time: 0m 0s
	Train Loss: 0.044 | Train Acc: 98.57%
	 Val. Loss: 0.165 |  Val. Acc: 95.00%
Epoch: 579 | Epoch Time: 0m 0s
	Train Loss: 0.044 | Train Acc: 98.81%
	 Val. Loss: 0.164 |  Val. Acc: 95.00%
...

Epoch: 649 | Epoch Time: 0m 0s
	Train Loss: 0.037 | Train Acc: 99.29%
	 Val. Loss: 0.170 |  Val. Acc: 95.00%
Epoch: 650 | Epoch Time: 0m 0s
	Train Loss: 0.038 | Train Acc: 99.29%
	 Val. Loss: 0.171 |  Val. Acc: 95.00%
Epoch: 651 | Epoch Time: 0m 0s
	Train Loss: 0.038 | Train Acc: 99.21%
	 Val. Loss: 0.171 |  Val. Acc: 95.00%
Epoch: 652 | Epoch Time: 0m 0s
	Train Loss: 0.037 | Train Acc: 99.29%
	 Val. Loss: 0.171 |  Val. Acc: 95.00%
Epoch: 653 | Epoch Time: 0m 0s
	Train Loss: 0.038 | Train Acc: 99.29%
	 Val. Loss: 0.171 |  Val. Acc: 95.00%
Epoch: 654 | Epoch Time: 0m 0s
	Train Loss: 0.037 | Train Acc: 99.29%
	 Val. Loss: 0.171 |  Val. Acc: 95.00%
Epoch: 655 | Epoch Time: 0m 0s
	Train Loss: 0.037 | Train Acc: 99.29%
	 Val. Loss: 0.171 |  Val. Acc: 95.00%
Epoch: 656 | Epoch Time: 0m 0s
	Train Loss: 0.037 | Train Acc: 99.29%
	 Val. Loss: 0.171 |  Val. Acc: 95.00%
Epoch: 657 | Epoch Time: 0m 0s
	Train Loss: 0.037 | Train Acc: 99.29%
	 Val. Loss: 0.171 |  Val. Acc: 95.00%
...

Epoch: 727 | Epoch Time: 0m 0s
	Train Loss: 0.032 | Train Acc: 99.29%
	 Val. Loss: 0.175 |  Val. Acc: 95.00%
Epoch: 728 | Epoch Time: 0m 0s
	Train Loss: 0.032 | Train Acc: 99.29%
	 Val. Loss: 0.175 |  Val. Acc: 95.00%
Epoch: 729 | Epoch Time: 0m 0s
	Train Loss: 0.034 | Train Acc: 99.21%
	 Val. Loss: 0.175 |  Val. Acc: 95.00%
Epoch: 730 | Epoch Time: 0m 0s
	Train Loss: 0.032 | Train Acc: 99.29%
	 Val. Loss: 0.175 |  Val. Acc: 95.00%
Epoch: 731 | Epoch Time: 0m 0s
	Train Loss: 0.032 | Train Acc: 99.29%
	 Val. Loss: 0.175 |  Val. Acc: 95.00%
Epoch: 732 | Epoch Time: 0m 0s
	Train Loss: 0.032 | Train Acc: 99.29%
	 Val. Loss: 0.175 |  Val. Acc: 95.00%
Epoch: 733 | Epoch Time: 0m 0s
	Train Loss: 0.032 | Train Acc: 99.29%
	 Val. Loss: 0.175 |  Val. Acc: 95.00%
Epoch: 734 | Epoch Time: 0m 0s
	Train Loss: 0.032 | Train Acc: 99.29%
	 Val. Loss: 0.175 |  Val. Acc: 95.00%
Epoch: 735 | Epoch Time: 0m 0s
	Train Loss: 0.032 | Train Acc: 99.29%
	 Val. Loss: 0.175 |  Val. Acc: 95.00%
...

Epoch: 813 | Epoch Time: 0m 0s
	Train Loss: 0.028 | Train Acc: 99.29%
	 Val. Loss: 0.177 |  Val. Acc: 95.00%
Epoch: 814 | Epoch Time: 0m 0s
	Train Loss: 0.029 | Train Acc: 99.21%
	 Val. Loss: 0.177 |  Val. Acc: 95.00%
Epoch: 815 | Epoch Time: 0m 0s
	Train Loss: 0.028 | Train Acc: 99.29%
	 Val. Loss: 0.177 |  Val. Acc: 95.00%
Epoch: 816 | Epoch Time: 0m 0s
	Train Loss: 0.028 | Train Acc: 99.21%
	 Val. Loss: 0.177 |  Val. Acc: 95.00%
Epoch: 817 | Epoch Time: 0m 0s
	Train Loss: 0.027 | Train Acc: 99.29%
	 Val. Loss: 0.177 |  Val. Acc: 95.00%
Epoch: 818 | Epoch Time: 0m 0s
	Train Loss: 0.027 | Train Acc: 99.29%
	 Val. Loss: 0.177 |  Val. Acc: 95.00%
Epoch: 819 | Epoch Time: 0m 0s
	Train Loss: 0.027 | Train Acc: 99.29%
	 Val. Loss: 0.177 |  Val. Acc: 95.00%
Epoch: 820 | Epoch Time: 0m 0s
	Train Loss: 0.027 | Train Acc: 99.29%
	 Val. Loss: 0.177 |  Val. Acc: 95.00%
Epoch: 821 | Epoch Time: 0m 0s
	Train Loss: 0.027 | Train Acc: 99.29%
	 Val. Loss: 0.178 |  Val. Acc: 95.00%
...

Epoch: 893 | Epoch Time: 0m 0s
	Train Loss: 0.024 | Train Acc: 99.29%
	 Val. Loss: 0.180 |  Val. Acc: 95.00%
Epoch: 894 | Epoch Time: 0m 0s
	Train Loss: 0.024 | Train Acc: 99.29%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
Epoch: 895 | Epoch Time: 0m 0s
	Train Loss: 0.025 | Train Acc: 99.21%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
Epoch: 896 | Epoch Time: 0m 0s
	Train Loss: 0.024 | Train Acc: 99.29%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
Epoch: 897 | Epoch Time: 0m 0s
	Train Loss: 0.025 | Train Acc: 99.21%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
Epoch: 898 | Epoch Time: 0m 0s
	Train Loss: 0.024 | Train Acc: 99.29%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
Epoch: 899 | Epoch Time: 0m 0s
	Train Loss: 0.025 | Train Acc: 99.21%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
Epoch: 900 | Epoch Time: 0m 0s
	Train Loss: 0.024 | Train Acc: 99.29%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
Epoch: 901 | Epoch Time: 0m 0s
	Train Loss: 0.024 | Train Acc: 99.29%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
...

Epoch: 969 | Epoch Time: 0m 0s
	Train Loss: 0.021 | Train Acc: 99.29%
	 Val. Loss: 0.178 |  Val. Acc: 95.00%
Epoch: 970 | Epoch Time: 0m 0s
	Train Loss: 0.021 | Train Acc: 99.29%
	 Val. Loss: 0.178 |  Val. Acc: 95.00%
Epoch: 971 | Epoch Time: 0m 0s
	Train Loss: 0.021 | Train Acc: 99.29%
	 Val. Loss: 0.178 |  Val. Acc: 95.00%
Epoch: 972 | Epoch Time: 0m 0s
	Train Loss: 0.022 | Train Acc: 99.29%
	 Val. Loss: 0.178 |  Val. Acc: 95.00%
Epoch: 973 | Epoch Time: 0m 0s
	Train Loss: 0.021 | Train Acc: 99.29%
	 Val. Loss: 0.178 |  Val. Acc: 95.00%
Epoch: 974 | Epoch Time: 0m 0s
	Train Loss: 0.021 | Train Acc: 99.29%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
Epoch: 975 | Epoch Time: 0m 0s
	Train Loss: 0.021 | Train Acc: 99.29%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
Epoch: 976 | Epoch Time: 0m 0s
	Train Loss: 0.021 | Train Acc: 99.29%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
Epoch: 977 | Epoch Time: 0m 0s
	Train Loss: 0.021 | Train Acc: 99.29%
	 Val. Loss: 0.179 |  Val. Acc: 95.00%
...
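Part 1 also asks for a plot of the loss against the epochs, which requires storing the per-epoch losses instead of only printing them. A minimal sketch follows; the helpers `fit` and `plot_history` are hypothetical names that do not exist in the notebook, and the loss values fed to them are made up to show the data flow.

```python
def fit(train_epoch, eval_epoch, epochs):
    """Run `epochs` epochs and keep the loss returned by each callable."""
    history = {"train_loss": [], "test_loss": []}
    for _ in range(epochs):
        history["train_loss"].append(train_epoch())
        history["test_loss"].append(eval_epoch())
    return history

def plot_history(history, title="Loss per epoch"):
    """Plot both curves; matplotlib is imported here so fit() does not need it."""
    import matplotlib.pyplot as plt
    epochs = range(1, len(history["train_loss"]) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, history["train_loss"], label="train loss")
    ax.plot(epochs, history["test_loss"], label="test loss")
    ax.set_xlabel("epoch")
    ax.set_ylabel("loss")
    ax.set_title(title)
    ax.legend()
    return fig

# Stand-in callables returning made-up losses, only to show the data flow
fake_train = iter([0.9, 0.5, 0.3])
fake_test = iter([1.0, 0.6, 0.5])
history = fit(lambda: next(fake_train), lambda: next(fake_test), epochs=3)
print(history)
```

In the notebook, the two callables could be `lambda: train(model, train_iterator, optimizer, criterion, device)[0]` and `lambda: evaluate(model, test_iterator, criterion, device)[0]`, followed by `plot_history(history)`. Keeping one `history` per experiment makes it easy to overlay the curves requested in Parts 2 to 4.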

# **Part 2**
Try changing the mini-batch size. As extreme cases, use m=1 (mini-batches of size 1) and m=n (a single mini-batch containing all the data). What results do you get? Discuss the results. (Explain the observed phenomenon and give your opinion on why it happens.)

Plot the loss function against the epochs and compare it with the loss curve from Part 1.
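One quantity that changes with the mini-batch size is the number of parameter updates per epoch, since each mini-batch triggers exactly one update. The sketch below counts them for the three cases, assuming the 415 training samples of this notebook; the helper name `updates_per_epoch` is made up for illustration.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """One optimizer step per mini-batch, counting a shorter final batch."""
    return math.ceil(n_samples / batch_size)

n = 415  # training samples in this notebook
counts = {m: updates_per_epoch(n, m) for m in (1, 20, n)}
print(counts)  # {1: 415, 20: 21, 415: 1}
```

With m=1 the model is updated 415 times per epoch using very noisy single-sample gradients, while with m=n it is updated once per epoch using the full-data gradient, so the same number of epochs corresponds to very different amounts of optimization.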

# **Part 3**
Try changing the learning rate. As extreme cases, use lr = 0.5 and lr = 0.000001. What results do you get? Discuss the results. (Explain the observed phenomenon and give your opinion on why it happens.)

Plot the loss function against the epochs and compare it with the loss curve from Part 1.
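The effect of the step size can be illustrated on a toy problem before running the real experiment. The sketch below runs plain gradient descent on the one-dimensional quadratic f(w) = 0.5 * c * w^2 with c = 5; this function and its curvature are assumptions chosen for illustration, and the behaviour of the MLP on the diabetes data may well differ.

```python
def gradient_descent(lr, steps=100, w0=1.0, curvature=5.0):
    """Minimize f(w) = 0.5 * curvature * w**2, whose gradient is curvature * w."""
    w = w0
    for _ in range(steps):
        w -= lr * curvature * w  # each step multiplies w by (1 - lr * curvature)
    return w

for lr in (0.5, 0.01, 0.000001):
    print(lr, gradient_descent(lr))
```

On this toy function each step multiplies `w` by `1 - lr * curvature`: with lr = 0.5 the factor is -1.5 and the iterate grows without bound, with lr = 0.01 it is 0.95 and `w` shrinks steadily towards the minimum at 0, and with lr = 0.000001 it is so close to 1 that `w` barely moves in 100 steps.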

# **Part 4**
Implement stochastic gradient descent so that the mini-batches are generated randomly before each epoch. What results do you get? Discuss the results. (Explain the observed phenomenon and give your opinion on why it happens.)

Plot the loss function against the epochs and compare it with the loss curve from Part 1.
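Random mini-batches per epoch can be sketched in NumPy by drawing a fresh permutation of the sample indices at the start of every epoch and slicing it into batches. The function name `shuffled_minibatches`, the seed and the ten-sample arrays are assumptions for illustration. In PyTorch, `DataLoader(..., shuffle=True)` provides the same per-epoch reshuffling.

```python
import numpy as np

def shuffled_minibatches(X, Y, batch_size, rng):
    """Yield mini-batches that follow a new random permutation of the samples."""
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], Y[idx]

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# Ten made-up samples whose label equals their row index, to track the order
X = np.arange(10, dtype=float).reshape(10, 1)
Y = np.arange(10).reshape(10, 1)

epoch_orders = []
for epoch in range(2):
    seen = [int(y) for _, yb in shuffled_minibatches(X, Y, 4, rng) for y in yb.ravel()]
    epoch_orders.append(seen)
    print(epoch, seen)
```

Every epoch still visits each sample exactly once; only the order, and therefore the composition of each mini-batch, changes from one epoch to the next.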