<h1 align="center"> <font color="red"> ===== Deep Learning - Mini Project 01 =====</font></h1>
<br>
<h5 align="right">Brasília, December 2022</h5>
<br>
<b align="center"> Professor: Mateus Mendelson</b>
<br><br>
<b align="center"> Students:</b><br>
<b align="center">Halisson Souza Gomides </b><br>
<b align="center"> Lorena Vaz</b><br>
<b align="center"> Roberto Rodrigues Adrego</b>
<br><br>

## Environment Setup and Data Preparation

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px

# random_seed = 123 # >> ACCURACY:  0.8013333333333333
random_seed = 10 # >> ACCURACY:  0.8093333333333333
# random_seed = 123456
pd.set_option('display.max_columns', None)

In [None]:
# Efforts for reproducibility
import torch
import random
import numpy as np

torch.manual_seed(random_seed)
random.seed(random_seed)
np.random.seed(random_seed)

In [None]:
path = '/content/df_points.txt'

df = pd.read_csv(path, sep='\t', index_col=[0])

How many records are in the dataset?

In [None]:
print(f'The dataset has {df.shape[0]:,} records')

The dataset has 10,000 records


In [None]:
df.head()

| | x | y | z | label |
|---|---|---|---|---|
| 0 | 326.488285 | 188.988808 | -312.205307 | 0.0 |
| 1 | -314.287214 | 307.276723 | -179.037412 | 1.0 |
| 2 | -328.20891 | 181.627758 | 446.311062 | 1.0 |
| 3 | -148.65889 | 147.027947 | -27.477959 | 1.0 |
| 4 | -467.065931 | 250.467651 | -306.47533 | 1.0 |


We will split the data into training, validation, and test sets in the following proportions:

- training: `70%`
- validation: `15%`
- test: `15%`

In [None]:
train_p = 0.7
val_p = 0.15
test_p = 0.15

train_size = int(train_p*df.shape[0])
val_size = int(val_p*df.shape[0])
test_size = int(test_p*df.shape[0])

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(df[['x', 'y', 'z']],
                                                    df['label'],
                                                    train_size=train_size,
                                                    stratify=df['label'],
                                                    random_state=random_seed
                                                    )

In [None]:
X_valid, X_test, y_valid, y_test = train_test_split(X_test,
                                                    y_test,
                                                    test_size=test_size,
                                                    stratify=y_test,
                                                    random_state=random_seed
                                                    )

In [None]:
X_train.reset_index(drop=True, inplace=True)
X_valid.reset_index(drop=True, inplace=True)
X_test.reset_index(drop=True, inplace=True)

Number of samples in each set:

In [None]:
len(y_train)

7000

In [None]:
len(y_valid)

1500

In [None]:
len(y_test)

1500
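
The two-stage 70/15/15 stratified split above can be sanity-checked in isolation. The sketch below uses synthetic data as a stand-in for `df_points.txt` (the array shapes, the seed, and the variable names are assumptions made for illustration) and confirms both the set sizes and that stratification keeps the class balance nearly identical across the three sets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumption: 10,000 rows, 3 features, binary label)
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 3))
y = rng.integers(0, 2, size=10_000)

# Stage 1: take 7,000 rows (70%) for training, stratified by label
X_tr, X_rest, y_tr, y_rest = train_test_split(
    X, y, train_size=7_000, stratify=y, random_state=0)

# Stage 2: split the remaining 3,000 rows evenly into validation and test
X_va, X_te, y_va, y_te = train_test_split(
    X_rest, y_rest, test_size=1_500, stratify=y_rest, random_state=0)

sizes = (len(y_tr), len(y_va), len(y_te))
print(sizes)  # → (7000, 1500, 1500)

# Fraction of positive labels in each set; stratification keeps these close together
fractions = [float(np.mean(part)) for part in (y_tr, y_va, y_te)]
print([round(f, 3) for f in fractions])
```

If the fractions diverged noticeably, that would suggest the `stratify` argument was dropped from one of the two calls.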

Let's standardize the features (zero mean and unit variance per column).

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import MinMaxScaler

ss = StandardScaler()

Note that `fit` must be performed on the training set only!

In [None]:
ss.fit(X_train)

StandardScaler()

In [None]:
X_train = ss.transform(X_train)
X_valid = ss.transform(X_valid)
X_test = ss.transform(X_test)

The result of the standardization:

In [None]:
X_train

array([[-0.39601882,  1.45043415,  0.43291041],
       [ 1.59359658, -0.88212023,  1.04995899],
       [-1.6717337 , -0.43105859,  0.33223992],
       ...,
       [ 1.24034175, -0.48110015,  1.37646803],
       [ 1.58736464,  0.07917989,  0.30736798],
       [-0.37117044, -0.8237158 , -0.62861755]])
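
As a quick check of the "fit on training data only" rule, the following sketch standardizes synthetic data (the value range and sizes are assumptions chosen to resemble the dataset) and inspects the resulting statistics. The training columns end up with mean 0 and standard deviation 1, while the validation columns are only approximately centered, because they are scaled with the training statistics.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-ins for X_train and X_valid (assumption: coordinates in [-500, 500])
train = rng.uniform(-500, 500, size=(7_000, 3))
valid = rng.uniform(-500, 500, size=(1_500, 3))

scaler = StandardScaler().fit(train)   # mean and std are estimated from the training split only
train_s = scaler.transform(train)
valid_s = scaler.transform(valid)      # reuses the training mean and std

train_mean = train_s.mean(axis=0)
train_std = train_s.std(axis=0)
valid_mean = valid_s.mean(axis=0)

print(np.round(train_mean, 6), np.round(train_std, 6))
print(np.round(valid_mean, 3))
```

Fitting a second scaler on the validation or test data would leak information about those sets into the preprocessing step.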

One-hot encoding:

In [None]:
y_train = pd.get_dummies(y_train, prefix='target').reset_index(drop=True)
y_valid = pd.get_dummies(y_valid, prefix='target').reset_index(drop=True)
y_test = pd.get_dummies(y_test, prefix='target').reset_index(drop=True)

In [None]:
y_train

| | target_0.0 | target_1.0 |
|---|---|---|
| 0 | 1 | 0 |
| 1 | 1 | 0 |
| 2 | 0 | 1 |
| 3 | 1 | 0 |
| 4 | 1 | 0 |
| ... | ... | ... |
| 6995 | 0 | 1 |
| 6996 | 0 | 1 |
| 6997 | 0 | 1 |
| 6998 | 1 | 0 |


## Mini-Project

Here is the first mini-project of the course: a competition!

Use everything at your disposal to reach the highest accuracy on the test set.

>Read **everything** as **everything that is conceptually sound**; that is, the test set may not be used at any point other than the final accuracy computation.

In [None]:
import torch
import torch.optim as optim
import numpy as np
from tqdm import tqdm
from datetime import datetime

#### Functions

In [None]:
import torch.nn.functional as F  # used by F.softmax below

def get_accuracy(model, X_test, y_test):
    model.eval()

    hits = 0
    for index, (original_data, original_target) in enumerate(zip(X_test, y_test)):
        # Format data to tensor
        target = original_target
        data = torch.tensor(()).new_ones((1, 3))
        data[0] = original_data

        # GPU
        # target = target.cuda()
        # data = data.cuda()

        # Softmax: https://pytorch.org/docs/stable/generated/torch.nn.functional.softmax.html
        # Probability for each output
        predicted = F.softmax(model(data), dim=1)

        # The output with the highest probability is the predicted class
        # Let's calculate the accuracy
        if torch.argmax(predicted[0]) == torch.argmax(target):
            hits += 1
            
    return hits/(index+1)
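
The per-sample loop above can also be written as a single batched forward pass. The sketch below is an alternative formulation, not the function used for the results in this notebook; the name `get_accuracy_batched` and the toy model are hypothetical. It relies on the fact that softmax preserves ordering, so taking `argmax` over the raw logits gives the same predicted class.

```python
import torch
import torch.nn as nn

def get_accuracy_batched(model, X, y_onehot):
    """Fraction of rows whose predicted class matches the one-hot target."""
    model.eval()
    with torch.no_grad():                 # gradients are not needed for evaluation
        logits = model(X.float())
    predicted = logits.argmax(dim=1)      # softmax is monotonic, so argmax of logits suffices
    expected = y_onehot.argmax(dim=1)
    return (predicted == expected).float().mean().item()

# Tiny hypothetical example: an untrained linear model on random data
torch.manual_seed(0)
toy_model = nn.Linear(3, 2)
X_toy = torch.randn(8, 3)
y_toy = torch.eye(2)[torch.randint(0, 2, (8,))]   # random one-hot targets
acc = get_accuracy_batched(toy_model, X_toy, y_toy)
print(acc)
```

For 1,500 test rows this avoids 1,500 separate forward passes, which mainly matters when accuracy is computed many times during a search.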

In [None]:
def get_batches(data, batch_size=1):
    batches = []
    
    data_size = len(data)
    for start_idx in range(0, data_size, batch_size):
        end_idx = min(data_size, start_idx + batch_size)
        batches.append(data[start_idx:end_idx])
    
    return batches
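
A small usage example makes the batching behavior concrete. The function is repeated here in compact form so that the snippet runs on its own; the last batch is simply shorter when the data size is not a multiple of `batch_size`.

```python
def get_batches(data, batch_size=1):
    # Same slicing logic as the function above, repeated so this snippet is self-contained
    return [data[start:start + batch_size] for start in range(0, len(data), batch_size)]

batches = get_batches(list(range(7)), batch_size=3)
print(batches)  # → [[0, 1, 2], [3, 4, 5], [6]]
```

Because the features and the targets are sliced with the same `batch_size` and in the same order, the two lists of batches stay aligned when zipped together. Note that the order is identical in every epoch, since nothing here shuffles the data.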

In [None]:
def train(model, n_epochs, batch_size, early_stopping_epochs, optimizer, criterion, X_train, y_train, X_valid, y_valid):
    init = datetime.now()
    
    best_epoch = None
    best_valid_loss = np.inf  # the np.Inf alias was removed in NumPy 2.0
    best_train_loss = None
    epochs_without_improv = 0
    
    train_loss = []
    valid_loss = []

    for epoch in tqdm(range(n_epochs)):
        ###################
        # early stopping? #
        ###################
        if epochs_without_improv >= early_stopping_epochs:
            break
        
        ###################
        # train the model #
        ###################
        model.train()
        acc_train_loss = 0.0
        for index, (original_data, original_target) in enumerate(zip(get_batches(X_train, batch_size),
                                                                     get_batches(y_train, batch_size))):
            
            # Format data to tensor
            target = (original_target == 1).nonzero(as_tuple=True)[1]
            data = original_data.float() # '.float()' is needed to convert the data to the dtype the model expects

            # target = target.cuda()
            # data = data.cuda()

            optimizer.zero_grad()

            # model.forward(data)
            predicted = model(data)

            loss = criterion(predicted, target)

            # Backprop
            loss.backward()
            optimizer.step()

            acc_train_loss += loss.item()

        train_loss.append(acc_train_loss)

        ###################
        # valid the model #
        ###################
        model.eval()
        acc_valid_loss = 0.0
        for index, (original_data, original_target) in enumerate(zip(get_batches(X_valid, batch_size), 
                                                                     get_batches(y_valid, batch_size))):
            # Format data to tensor
            target = (original_target == 1).nonzero(as_tuple=True)[1]
            data = original_data.float() # '.float()' is needed to convert the data to the dtype the model expects

            # target = target.cuda()
            # data = data.cuda()

            # model.forward(data)
            predicted = model(data)

            loss = criterion(predicted, target)
            acc_valid_loss += loss.item()

        valid_loss.append(acc_valid_loss)
        
        #####################
        # Update best model #
        #####################
        if acc_valid_loss < best_valid_loss:
            torch.save(model.state_dict(), 'best_model') # save best model
            best_epoch = epoch
            best_valid_loss = acc_valid_loss
            best_train_loss = acc_train_loss
            epochs_without_improv = 0
        else:
            epochs_without_improv += 1
    
    
    # Load best model
    model.load_state_dict(torch.load('best_model'))
    model.eval()
    
    # Print logs
    if epochs_without_improv >= early_stopping_epochs:
        print('Training interrupted by early stopping!')
    else:
        print('Training finished by epochs!')
    print(f'Total epochs run: {epoch + 1}')
    print(f'Best model found at epoch {best_epoch + 1} with valid loss {best_valid_loss} and training loss {best_train_loss}')
    
    end = datetime.now()
    print(f'Total training time: {end - init}')
    
    return model, train_loss, valid_loss
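
One detail of `train` is worth keeping in mind when reading the logs: `nn.CrossEntropyLoss` averages within each batch by default, and the function sums those batch averages over the epoch, so the reported loss grows with the number of batches. Losses from runs with different batch sizes are therefore not directly comparable. The sketch below shows one possible way to report a per-sample average instead; the helper name `mean_loss_per_sample` and the toy model are hypothetical and are not used elsewhere in this notebook.

```python
import torch
import torch.nn as nn

def mean_loss_per_sample(model, X, y, batch_size):
    """Average cross-entropy per sample, independent of the batch size."""
    criterion = nn.CrossEntropyLoss(reduction='sum')  # sum (not mean) within each batch
    model.eval()
    total = 0.0
    with torch.no_grad():
        for start in range(0, len(X), batch_size):
            logits = model(X[start:start + batch_size])
            total += criterion(logits, y[start:start + batch_size]).item()
    return total / len(X)                             # divide once by the number of samples

torch.manual_seed(0)
toy_model = nn.Linear(3, 2)           # hypothetical stand-in model
X_toy = torch.randn(100, 3)
y_toy = torch.randint(0, 2, (100,))   # class indices, as CrossEntropyLoss expects

loss_b15 = mean_loss_per_sample(toy_model, X_toy, y_toy, batch_size=15)
loss_b30 = mean_loss_per_sample(toy_model, X_toy, y_toy, batch_size=30)
print(loss_b15, loss_b30)
```

The two values agree up to floating-point rounding, whereas the summed batch means used in `train` would differ by roughly the ratio of the batch counts.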

### ATTEMPT 01

In [None]:
import torch.nn as nn
import torch.nn.functional as F

class MinhaRede(nn.Module):
    def __init__(self, input_features, p=0.5):
        super(MinhaRede, self).__init__()

        self.camada_entrada = nn.Linear(input_features, 128)
        self.camada_oculta_1 = nn.Linear(128, 64)
        self.camada_oculta_2 = nn.Linear(64, 32)
        self.camada_saida = nn.Linear(32, 2)
        
        self.dropout_1 = nn.Dropout(p) # <= dropout layer: each neuron is deactivated with probability p
        self.dropout_2 = nn.Dropout(p) # <= dropout layer: each neuron is deactivated with probability p

    def forward(self, p):
        s = F.relu(self.camada_entrada(p))
        s = self.dropout_1(s) # <= apply dropout, which only takes effect when the model is in training mode
        s = F.relu(self.camada_oculta_1(s))
        s = self.dropout_2(s) # <= apply dropout, which only takes effect when the model is in training mode
        s = F.relu(self.camada_oculta_2(s))
        s = self.camada_saida(s)

        return s

In [None]:
input_features = 3
epochs = 2000
batch_size = 20
early_stopping_epochs = 70 # number of epochs without improvement tolerated before training stops
prob_dropout = 0.3
learning_rate = 1e-3

In [None]:
model = MinhaRede(input_features, p=prob_dropout)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

model, train_loss, valid_loss = train(model, epochs, batch_size, early_stopping_epochs, optimizer, criterion,
                                      torch.from_numpy(X_train),
                                      torch.from_numpy(y_train.to_numpy()),
                                      torch.from_numpy(X_valid),
                                      torch.from_numpy(y_valid.to_numpy()))

 15%|█▌        | 306/2000 [01:56<10:43,  2.63it/s]

Training interrupted by early stopping!
Total epochs run: 307
Best model found at epoch 236 with valid loss 36.698350727558136 and training loss 175.958392187953
Total training time: 0:01:56.171156





In [None]:
get_accuracy(model,
            torch.from_numpy(X_test),
            torch.from_numpy(y_test.to_numpy()))

0.7826666666666666

### ATTEMPT 02

In [None]:
import torch.nn as nn
import torch.nn.functional as F

class MinhaRede(nn.Module):
    def __init__(self, input_features, p=0.5):
        super(MinhaRede, self).__init__()

        self.camada_entrada = nn.Linear(input_features, 128)
        self.camada_oculta_1 = nn.Linear(128, 64)
        self.camada_oculta_2 = nn.Linear(64, 32)
        self.camada_oculta_3 = nn.Linear(32, 16)
        self.camada_saida = nn.Linear(16, 2)
        
        self.dropout_1 = nn.Dropout(p) # <= dropout layer: each neuron is deactivated with probability p
        self.dropout_2 = nn.Dropout(p) # <= dropout layer: each neuron is deactivated with probability p

    def forward(self, p):
        s = F.relu(self.camada_entrada(p))
        s = self.dropout_1(s) # <= apply dropout, which only takes effect when the model is in training mode
        s = F.relu(self.camada_oculta_1(s))
        s = F.relu(self.camada_oculta_2(s))
        s = self.dropout_2(s) # <= apply dropout, which only takes effect when the model is in training mode
        s = F.relu(self.camada_oculta_3(s))
        s = self.camada_saida(s)

        return s

In [None]:
input_features = 3
epochs = 2000
batch_size = 20
early_stopping_epochs = 70 # number of epochs without improvement tolerated before training stops
prob_dropout = 0.4
learning_rate = 1e-3

In [None]:
model = MinhaRede(input_features, p=prob_dropout)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

model, train_loss, valid_loss = train(model, epochs, batch_size, early_stopping_epochs, optimizer, criterion,
                                      torch.from_numpy(X_train),
                                      torch.from_numpy(y_train.to_numpy()),
                                      torch.from_numpy(X_valid),
                                      torch.from_numpy(y_valid.to_numpy()))

 11%|█         | 222/2000 [02:31<20:10,  1.47it/s]

Training interrupted by early stopping!
Total epochs run: 223
Best model found at epoch 152 with valid loss 37.392108619213104 and training loss 176.56904274225235
Total training time: 0:02:31.130150





In [None]:
get_accuracy(model,
            torch.from_numpy(X_test),
            torch.from_numpy(y_test.to_numpy()))

0.792

In [None]:
best_model = MinhaRede(input_features, p=prob_dropout)
best_model.load_state_dict(torch.load('/content/best_model'))

acc = get_accuracy(best_model,
            torch.from_numpy(X_test),
            torch.from_numpy(y_test.to_numpy()))
print('>> ACCURACY: ', acc, end='\n\n')

>> ACCURACY:  0.794



### ATTEMPT 03

#### Network architecture

In [None]:
import torch.nn as nn
import torch.nn.functional as F

class MinhaRede(nn.Module):
    def __init__(self, input_features, p1=0.5, p2=0.5, qtdn=160):
        super(MinhaRede, self).__init__()

        self.camada_entrada = nn.Linear(input_features, qtdn)
        self.camada_oculta_1 = nn.Linear(qtdn, int(qtdn/2))
        self.camada_oculta_2 = nn.Linear(int(qtdn/2), int(qtdn/4))
        self.camada_oculta_3 = nn.Linear(int(qtdn/4), int(qtdn/8))
        self.camada_saida = nn.Linear(int(qtdn/8), 2)
        
        self.dropout_1 = nn.Dropout(p1) # <= dropout layer: each neuron is deactivated with probability p1
        self.dropout_2 = nn.Dropout(p2) # <= dropout layer: each neuron is deactivated with probability p2
        # self.dropout_3 = nn.Dropout(p3) # <= dropout layer: each neuron is deactivated with probability p3

    def forward(self, p):
        s = F.relu(self.camada_entrada(p))
        s = self.dropout_1(s) # <= apply dropout, which only takes effect when the model is in training mode
        s = F.relu(self.camada_oculta_1(s))
        s = F.relu(self.camada_oculta_2(s))
        s = self.dropout_2(s) # <= apply dropout, which only takes effect when the model is in training mode
        s = F.relu(self.camada_oculta_3(s))
        # s = self.dropout_3(s) # <= apply dropout, which only takes effect when the model is in training mode
        s = self.camada_saida(s)

        return s

#### Hyperparameter search space

In [None]:
input_features = 3
epochs = 2000

batch_sizes = np.array([15, 20, 25, 30])
early_stopping_epochs = 100 # number of epochs without improvement tolerated before training stops
probs_1 = np.array([.15, .2, .3, .35, .4, .5, .55, .6, .65])
probs_2 = np.array([.15, .2, .3, .35, .4, .5, .55, .6, .65])
qtdns = np.array([96, 112, 128, 144, 160])
learning_rates = np.array([1e-3, 1e-4, 1e-5])

tentativas = 5

#### Network training

In [None]:
import pickle

best_acc = 0

for _ in range(tentativas):

  print('='*100)
  learning_rate = np.random.choice(learning_rates)
  neuronios = np.random.choice(qtdns)
  prob_dropout1 = np.random.choice(probs_1) 
  prob_dropout2 = np.random.choice(probs_2)
  batch_size = np.random.choice(batch_sizes)

  # batch_size = 30
  # learning_rate = 0.001
  # neuronios = 144
  # prob_dropout1 = 0.55
  # prob_dropout2 = 0.15

  print('batch_size: ', batch_size)
  print('Learning_rate: ', learning_rate)
  print('Qtd. Neuronios out 1: ', neuronios)
  print('prob_dropout1: ', prob_dropout1)
  print('prob_dropout2: ', prob_dropout2)

  model = MinhaRede(input_features, p1=prob_dropout1, p2=prob_dropout2, qtdn=neuronios)
  criterion = nn.CrossEntropyLoss()
  optimizer = optim.Adam(model.parameters(), lr=learning_rate)

  model, train_loss, valid_loss = train(model, epochs, batch_size, early_stopping_epochs, optimizer, criterion,
                                        torch.from_numpy(X_train),
                                        torch.from_numpy(y_train.to_numpy()),
                                        torch.from_numpy(X_valid),
                                        torch.from_numpy(y_valid.to_numpy()))

  acc = get_accuracy(model,
            torch.from_numpy(X_test),
            torch.from_numpy(y_test.to_numpy()))
  print('>> ACCURACY: ', acc, end='\n\n')
  
  if acc > best_acc:
    torch.save(model, 'best_model_tentativa03.pyt')
    d_best_model_params = {
      'batch_size': batch_size,
      'learning_rate': learning_rate,
      'neuronios': neuronios,
      'prob_dropout1': prob_dropout1,
      'prob_dropout2': prob_dropout2
    }

    with open('/content/best_model_tentativa03_params.pkl', 'wb') as handle:
      pickle.dump(d_best_model_params, handle, protocol=pickle.HIGHEST_PROTOCOL)

    best_acc = acc
  

batch_size:  15
Learning_rate:  0.001
Qtd. Neuronios out 1:  112
prob_dropout1:  0.4
prob_dropout2:  0.15


 10%|▉         | 195/2000 [02:27<22:44,  1.32it/s]


Training interrupted by early stopping!
Total epochs run: 196
Best model found at epoch 95 with valid loss 50.099229991436005 and training loss 237.20309409499168
Total training time: 0:02:27.376148
>> ACCURACY:  0.804

batch_size:  15
Learning_rate:  0.001
Qtd. Neuronios out 1:  160
prob_dropout1:  0.6
prob_dropout2:  0.65


 19%|█▉        | 383/2000 [05:55<25:01,  1.08it/s]


Training interrupted by early stopping!
Total epochs run: 384
Best model found at epoch 283 with valid loss 49.90789234638214 and training loss 238.57456946372986
Total training time: 0:05:55.542448
>> ACCURACY:  0.804

batch_size:  15
Learning_rate:  1e-05
Qtd. Neuronios out 1:  160
prob_dropout1:  0.55
prob_dropout2:  0.3


100%|██████████| 2000/2000 [24:51<00:00,  1.34it/s]


Training finished by epochs!
Total epochs run: 2000
Best model found at epoch 1998 with valid loss 50.655465722084045 and training loss 242.5771330446005
Total training time: 0:24:51.290920
>> ACCURACY:  0.8046666666666666

batch_size:  20
Learning_rate:  0.0001
Qtd. Neuronios out 1:  96
prob_dropout1:  0.55
prob_dropout2:  0.55


 38%|███▊      | 759/2000 [06:48<11:07,  1.86it/s]


Training interrupted by early stopping!
Total epochs run: 760
Best model found at epoch 659 with valid loss 37.40077401697636 and training loss 181.04937963187695
Total training time: 0:06:48.026791
>> ACCURACY:  0.808

batch_size:  25
Learning_rate:  1e-05
Qtd. Neuronios out 1:  144
prob_dropout1:  0.15
prob_dropout2:  0.15


 65%|██████▍   | 1296/2000 [09:52<05:22,  2.19it/s]


Training interrupted by early stopping!
Total epochs run: 1297
Best model found at epoch 1196 with valid loss 30.70126649737358 and training loss 144.46154701709747
Total training time: 0:09:52.845968
>> ACCURACY:  0.7973333333333333
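
The loop above decides which model to save by comparing `acc`, which is computed on the test set. Under the rule stated in the Mini-Project section, that selection step would normally use a validation score, with the test set reserved for a single final measurement. The following sketch illustrates that pattern in plain Python; `random_search`, `fake_train_and_validate`, and the toy search space are hypothetical names introduced for illustration, and the scoring function is a stand-in for "train a model and return its validation accuracy".

```python
import random

def random_search(search_space, train_and_validate, n_trials, seed=0):
    """Return the sampled configuration with the best validation score."""
    rng = random.Random(seed)
    best_params, best_score = None, float('-inf')
    for _ in range(n_trials):
        # Draw one value per hyperparameter, as np.random.choice does in the loop above
        params = {name: rng.choice(values) for name, values in search_space.items()}
        score = train_and_validate(params)   # must not touch the test set
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical stand-in for training a model and returning its validation accuracy
def fake_train_and_validate(params):
    return 0.8 - abs(params['dropout'] - 0.3) - abs(params['lr'] - 1e-3)

space = {'dropout': [0.15, 0.3, 0.5], 'lr': [1e-3, 1e-4, 1e-5]}
best_params, best_score = random_search(space, fake_train_and_validate, n_trials=50)
print(best_params, best_score)
```

Only after `best_params` is fixed would the corresponding model be evaluated once on the test set.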



AFTER A KERNEL RESET, RUNNING WITH THE BEST MODEL'S PARAMETERS:

In [None]:
import pickle

best_acc = 0

for _ in range(tentativas):

  print('='*100)
  # learning_rate = np.random.choice(learning_rates)
  # neuronios = np.random.choice(qtdns)
  # prob_dropout1 = np.random.choice(probs_1) 
  # prob_dropout2 = np.random.choice(probs_2)
  # batch_size = np.random.choice(batch_sizes)

  batch_size = 30
  learning_rate = 0.001
  neuronios = 144
  prob_dropout1 = 0.55
  prob_dropout2 = 0.15

  print('batch_size: ', batch_size)
  print('Learning_rate: ', learning_rate)
  print('Qtd. Neuronios out 1: ', neuronios)
  print('prob_dropout1: ', prob_dropout1)
  print('prob_dropout2: ', prob_dropout2)

  model = MinhaRede(input_features, p1=prob_dropout1, p2=prob_dropout2, qtdn=neuronios)
  criterion = nn.CrossEntropyLoss()
  optimizer = optim.Adam(model.parameters(), lr=learning_rate)

  model, train_loss, valid_loss = train(model, epochs, batch_size, early_stopping_epochs, optimizer, criterion,
                                        torch.from_numpy(X_train),
                                        torch.from_numpy(y_train.to_numpy()),
                                        torch.from_numpy(X_valid),
                                        torch.from_numpy(y_valid.to_numpy()))

  acc = get_accuracy(model,
            torch.from_numpy(X_test),
            torch.from_numpy(y_test.to_numpy()))
  print('>> ACCURACY: ', acc, end='\n\n')
  
  if acc > best_acc:
    torch.save(model, 'best_model_tentativa03.pyt')
    d_best_model_params = {
      'batch_size': batch_size,
      'learning_rate': learning_rate,
      'neuronios': neuronios,
      'prob_dropout1': prob_dropout1,
      'prob_dropout2': prob_dropout2
    }

    with open('/content/best_model_tentativa03_params.pkl', 'wb') as handle:
      pickle.dump(d_best_model_params, handle, protocol=pickle.HIGHEST_PROTOCOL)

    best_acc = acc

batch_size:  30
Learning_rate:  0.001
Qtd. Neuronios out 1:  144
prob_dropout1:  0.55
prob_dropout2:  0.15


 11%|█         | 215/2000 [01:26<11:59,  2.48it/s]


Training interrupted by early stopping!
Total epochs run: 216
Best model found at epoch 115 with valid loss 25.109973430633545 and training loss 121.03501638770103
Total training time: 0:01:26.636983
>> ACCURACY:  0.802

batch_size:  30
Learning_rate:  0.001
Qtd. Neuronios out 1:  144
prob_dropout1:  0.55
prob_dropout2:  0.15


 12%|█▏        | 244/2000 [01:40<12:05,  2.42it/s]


Training interrupted by early stopping!
Total epochs run: 245
Best model found at epoch 144 with valid loss 24.958782017230988 and training loss 120.01877626776695
Total training time: 0:01:40.767192
>> ACCURACY:  0.802

batch_size:  30
Learning_rate:  0.001
Qtd. Neuronios out 1:  144
prob_dropout1:  0.55
prob_dropout2:  0.15


 13%|█▎        | 269/2000 [01:47<11:32,  2.50it/s]


Training interrupted by early stopping!
Total epochs run: 270
Best model found at epoch 169 with valid loss 24.914406418800354 and training loss 118.42115581035614
Total training time: 0:01:47.647302
>> ACCURACY:  0.7993333333333333

batch_size:  30
Learning_rate:  0.001
Qtd. Neuronios out 1:  144
prob_dropout1:  0.55
prob_dropout2:  0.15


 11%|█▏        | 225/2000 [01:30<11:50,  2.50it/s]


Training interrupted by early stopping!
Total epochs run: 226
Best model found at epoch 125 with valid loss 24.96848550438881 and training loss 120.47225534915924
Total training time: 0:01:30.086625
>> ACCURACY:  0.8066666666666666

batch_size:  30
Learning_rate:  0.001
Qtd. Neuronios out 1:  144
prob_dropout1:  0.55
prob_dropout2:  0.15


 14%|█▍        | 276/2000 [01:51<11:33,  2.49it/s]


Training interrupted by early stopping!
Total epochs run: 277
Best model found at epoch 176 with valid loss 25.02282229065895 and training loss 119.68830814957619
Total training time: 0:01:51.077467
>> ACCURACY:  0.804



TEST OF THE BEST SAVED MODEL, APPLIED TO THE DATA WITH THE SAME RANDOM SEED = 10

In [None]:
import pickle

best_model = torch.load('/content/best_model_tentativa03_0_809.pyt')

acc = get_accuracy(best_model,
            torch.from_numpy(X_test),
            torch.from_numpy(y_test.to_numpy()))
print('>> ACCURACY: ', acc, end='\n\n')

with open('/content/best_model_tentativa03_0_809_params.pkl', 'rb') as handle:
  best_model_params = pickle.load(handle)
print('Parâmetros:')
print(best_model_params)

>> ACCURACY:  0.8093333333333333

Parâmetros:
{'batch_size': 30, 'learning_rate': 0.001, 'neuronios': 144, 'prob_dropout1': 0.55, 'prob_dropout2': 0.15}


In [None]:
print('='*100)

batch_size, learning_rate, neuronios, prob_dropout1, prob_dropout2 = best_model_params.values()
print('batch_size: ', batch_size)
print('Learning_rate: ', learning_rate)
print('Qtd. Neuronios out 1: ', neuronios)
print('prob_dropout1: ', prob_dropout1)
print('prob_dropout2: ', prob_dropout2)

model = MinhaRede(input_features, p1=prob_dropout1, p2=prob_dropout2, qtdn=neuronios)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

model, train_loss, valid_loss = train(model, epochs, batch_size, early_stopping_epochs, optimizer, criterion,
                                      torch.from_numpy(X_train),
                                      torch.from_numpy(y_train.to_numpy()),
                                      torch.from_numpy(X_valid),
                                      torch.from_numpy(y_valid.to_numpy()))

acc = get_accuracy(model,
          torch.from_numpy(X_test),
          torch.from_numpy(y_test.to_numpy()))
print('>> ACCURACY: ', acc, end='\n\n')

batch_size:  30
Learning_rate:  0.001
Qtd. Neuronios out 1:  144
prob_dropout1:  0.55
prob_dropout2:  0.15


 15%|█▌        | 302/2000 [02:03<11:36,  2.44it/s]


Training interrupted by early stopping!
Total epochs run: 303
Best model found at epoch 202 with valid loss 24.931107580661774 and training loss 119.02289918065071
Total training time: 0:02:03.975549
>> ACCURACY:  0.8066666666666666



> As shown above, it was not possible to reproduce the accuracy obtained earlier.
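
One plausible contributor, stated here as an assumption rather than a verified diagnosis of these runs: the seeds are set once at the top of the notebook, so the state of the random generators when a given model is created depends on everything executed before it (random search draws, earlier trainings, dropout masks). Re-seeding immediately before each run removes that dependency. The sketch below uses a hypothetical helper `set_seed` and checks that two layers created after the same seed start from identical weights.

```python
import random
import numpy as np
import torch
import torch.nn as nn

def set_seed(seed):
    """Reset the Python, NumPy and PyTorch generators to a known state."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

def first_layer_weights(seed):
    # Re-seed right before building the layer so its initialization is repeatable
    set_seed(seed)
    return nn.Linear(3, 144).weight.detach().clone()

w_a = first_layer_weights(10)
w_b = first_layer_weights(10)
same_init = torch.equal(w_a, w_b)
print(same_init)  # → True
```

This covers CPU initialization and sampling only; GPU kernels and library version differences can still introduce small variations.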

### ATTEMPT 04

https://towardsdatascience.com/hyperparameter-tuning-of-neural-networks-with-optuna-and-pytorch-22e179efc837

In [None]:
%%capture
!pip install optuna

In [None]:
import optuna
import torch.nn as nn
import torch.nn.functional as F


# Build a model by implementing define-by-run design from Optuna
def build_model_custom(trial):
    
    n_layers = trial.suggest_int("n_layers", 2, 5)
    layers = []

    in_features = 3
    
    for i in range(n_layers):
                
        out_features = trial.suggest_int("n_units_l{}".format(i), int(90/(i+1)), int(300/(i+1)))
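        # Note: the name "p1" is reused for every layer, so Optuna returns the same dropout rate for all layers of a trial
        p_dropout = trial.suggest_float("p1", 0.1, 0.65, step=0.05)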

        layers.append(nn.Linear(in_features, out_features))
        layers.append(nn.LeakyReLU())
        layers.append(nn.Dropout(p_dropout))

        in_features = out_features        
        
    layers.append(nn.Linear(in_features, 2))
    layers.append(nn.LeakyReLU())
    
    return nn.Sequential(*layers)

In [None]:
def optuna_train(model, trial, n_epochs, param, X_train, y_train, X_valid, y_valid):

    optimizer = getattr(optim, param['optimizer'])(model.parameters(), lr= param['learning_rate'])
    batch_size = param['batch_size']
    criterion = nn.CrossEntropyLoss()

    init = datetime.now()
    
    best_epoch = None
    best_valid_loss = np.inf  # the np.Inf alias was removed in NumPy 2.0
    best_train_loss = None
    epochs_without_improv = 0
    
    train_loss = []
    valid_loss = []

    for epoch in tqdm(range(n_epochs)):
        ###################
        # early stopping? #
        ###################
        if epochs_without_improv >= param['early_stopping_epochs']:
            break
        
        ###################
        # train the model #
        ###################
        model.train()
        total_acc_train = 0.0
        total_loss_train = 0.0
        for index, (original_data, original_target) in enumerate(zip(get_batches(X_train, param['batch_size']),
                                                                     get_batches(y_train, param['batch_size']))):
            
            # Format data to tensor
            target = (original_target == 1).nonzero(as_tuple=True)[1]
            data = original_data.float() # '.float()' is needed to convert the data to the dtype the model expects

            # target = target.cuda()
            # data = data.cuda()

            optimizer.zero_grad()

            # model.forward(data)
            predicted = model(data)

            loss = criterion(predicted, target)
            total_loss_train  += loss.item()

            acc = (predicted.argmax(dim=1) == target).sum().item()
            total_acc_train += acc

            # Backprop
            loss.backward()
            optimizer.step()            

        train_loss.append(total_loss_train)

        ###################
        # valid the model #
        ###################
        model.eval()
        total_acc_valid = 0.0
        total_loss_valid = 0.0
        for index, (original_data, original_target) in enumerate(zip(get_batches(X_valid, param['batch_size']), 
                                                                     get_batches(y_valid, param['batch_size']))):
            # Format data to tensor
            target = (original_target == 1).nonzero(as_tuple=True)[1]
            data = original_data.float() # '.float()' is needed to convert the data to the dtype the model expects

            # target = target.cuda()
            # data = data.cuda()

            # model.forward(data)
            predicted = model(data)

            loss = criterion(predicted, target)
            total_loss_valid += loss.item()

            acc = (predicted.argmax(dim=1) == target).sum().item()
            total_acc_valid += acc

        valid_loss.append(total_loss_valid)
        accuracy = total_acc_valid/len(X_valid)
        
        # Add prune mechanism
        trial.report(accuracy, epoch)

        if trial.should_prune():
          raise optuna.exceptions.TrialPruned()
        
        #####################
        # Update best model #
        #####################
        if total_loss_valid < best_valid_loss:
            # torch.save(model.state_dict(), 'best_model') # save best model
            best_epoch = epoch
            best_valid_loss = total_loss_valid
            best_train_loss = total_loss_train
            epochs_without_improv = 0
        else:
            epochs_without_improv += 1
    
    
    # Load best model
    # model.load_state_dict(torch.load('best_model'))
    # model.eval()
    
    # Print logs
    if epochs_without_improv >= param['early_stopping_epochs']:
        print('Training interrupted by early stopping!')
    else:
        print('Training finished by epochs!')
    print(f'Total epochs run: {epoch + 1}')
    print(f'Best model found at epoch {best_epoch + 1} with valid loss {best_valid_loss} and training loss {best_train_loss}')
    
    end = datetime.now()
    print(f'Total training time: {end - init}')
    
    return accuracy

In [None]:
epochs = 2000

In [None]:
import pickle

def objective(trial): 

  params = {
            'learning_rate': trial.suggest_float('learning_rate', 1e-5, 1e-2, log=True),
            'optimizer': trial.suggest_categorical("optimizer", ["Adam", "RMSprop", "SGD"]),              
            'batch_size': trial.suggest_int("batch_size", 15, 40, step=5),
            'early_stopping_epochs': trial.suggest_int("early_stopping_epochs", 70, 120, step=10),
            }
    
  model = build_model_custom(trial)
  
  accuracy = optuna_train(model, trial, epochs, params, 
                    torch.from_numpy(X_train),
                    torch.from_numpy(y_train.to_numpy()),
                    torch.from_numpy(X_valid),
                    torch.from_numpy(y_valid.to_numpy()))
  
  # Save a trained model to a file.
  with open("/content/optuna_model_{}.pickle".format(trial.number), "wb") as fout:
      pickle.dump(model, fout)

  return accuracy

In [None]:
study = optuna.create_study(direction="maximize", sampler=optuna.samplers.TPESampler(), pruner=optuna.pruners.MedianPruner())
study.optimize(objective, n_trials=40)

[32m[I 2022-11-26 14:22:01,451][0m A new study created in memory with name: no-name-834432c7-6e91-4968-9a15-bbd3f4a80793[0m
100%|██████████| 2000/2000 [05:32<00:00,  6.02it/s]
[I 2022-11-26 14:27:33,902] Trial 0 finished with value: 0.578 and parameters: {'learning_rate': 0.00014610870376987413, 'optimizer': 'SGD', 'batch_size': 40, 'early_stopping_epochs': 90, 'n_layers': 2, 'n_units_l0': 146, 'p1': 0.15000000000000002, 'n_units_l1': 73}. Best is trial 0 with value: 0.578.


Training finished by epochs!
Total epochs run: 2000
Best model found at epoch 2000 with valid loss 23.418964505195618 and training loss 107.8265009522438
Total training time: 0:05:32.417048


 25%|██▍       | 491/2000 [03:51<11:50,  2.13it/s]
[I 2022-11-26 14:31:24,951] Trial 1 finished with value: 0.7846666666666666 and parameters: {'learning_rate': 0.0003307296196279348, 'optimizer': 'RMSprop', 'batch_size': 25, 'early_stopping_epochs': 70, 'n_layers': 5, 'n_units_l0': 268, 'p1': 0.55, 'n_units_l1': 63, 'n_units_l2': 78, 'n_units_l3': 59, 'n_units_l4': 21}. Best is trial 1 with value: 0.7846666666666666.


Training interrupted by early stopping!
Total epochs run: 492
Best model found at epoch 421 with valid loss 30.86790081858635 and training loss 147.23315313458443
Total training time: 0:03:51.032679


 27%|██▋       | 537/2000 [08:04<21:58,  1.11it/s]
[I 2022-11-26 14:39:29,099] Trial 2 finished with value: 0.7846666666666666 and parameters: {'learning_rate': 0.0001315466391572841, 'optimizer': 'Adam', 'batch_size': 15, 'early_stopping_epochs': 100, 'n_layers': 5, 'n_units_l0': 103, 'p1': 0.45000000000000007, 'n_units_l1': 92, 'n_units_l2': 92, 'n_units_l3': 39, 'n_units_l4': 20}. Best is trial 1 with value: 0.7846666666666666.


Training interrupted by early stopping!
Total epochs run: 538
Best model found at epoch 437 with valid loss 51.5263444930315 and training loss 242.33654835820198
Total training time: 0:08:04.125611


 36%|███▌      | 721/2000 [02:02<03:38,  5.87it/s]
[I 2022-11-26 14:41:32,015] Trial 3 finished with value: 0.7666666666666667 and parameters: {'learning_rate': 0.006856916379444076, 'optimizer': 'SGD', 'batch_size': 35, 'early_stopping_epochs': 70, 'n_layers': 2, 'n_units_l0': 92, 'p1': 0.55, 'n_units_l1': 71}. Best is trial 1 with value: 0.7846666666666666.


Training interrupted by early stopping!
Total epochs run: 722
Best model found at epoch 651 with valid loss 23.636359602212906 and training loss 112.50871747732162
Total training time: 0:02:02.902761


100%|██████████| 2000/2000 [09:01<00:00,  3.70it/s]
[I 2022-11-26 14:50:33,201] Trial 4 finished with value: 0.5833333333333334 and parameters: {'learning_rate': 0.00011773032111817194, 'optimizer': 'SGD', 'batch_size': 40, 'early_stopping_epochs': 110, 'n_layers': 5, 'n_units_l0': 178, 'p1': 0.35, 'n_units_l1': 116, 'n_units_l2': 100, 'n_units_l3': 40, 'n_units_l4': 33}. Best is trial 1 with value: 0.7846666666666666.


Training finished by epochs!
Total epochs run: 2000
Best model found at epoch 2000 with valid loss 26.295251607894897 and training loss 121.10397201776505
Total training time: 0:09:01.159694


  9%|▉         | 183/2000 [02:39<26:25,  1.15it/s]
[I 2022-11-26 14:53:12,925] Trial 5 pruned.
  2%|▎         | 50/2000 [00:22<14:54,  2.18it/s]
[I 2022-11-26 14:53:35,888] Trial 6 pruned.
  9%|▉         | 188/2000 [01:49<17:36,  1.71it/s]
[I 2022-11-26 14:55:25,584] Trial 7 finished with value: 0.784 and parameters: {'learning_rate': 0.0023554557899657367, 'optimizer': 'RMSprop', 'batch_size': 20, 'early_stopping_epochs': 100, 'n_layers': 5, 'n_units_l0': 184, 'p1': 0.25, 'n_units_l1': 109, 'n_units_l2': 43, 'n_units_l3': 61, 'n_units_l4': 47}. Best is trial 1 with value: 0.7846666666666666.


Training interrupted by early stopping!
Total epochs run: 189
Best model found at epoch 88 with valid loss 38.36430397629738 and training loss 180.93460568785667
Total training time: 0:01:49.674469


  0%|          | 0/2000 [00:00<?, ?it/s]
[I 2022-11-26 14:55:25,793] Trial 8 pruned.
  0%|          | 0/2000 [00:00<?, ?it/s]
[I 2022-11-26 14:55:26,583] Trial 9 pruned.
  7%|▋         | 144/2000 [00:48<10:31,  2.94it/s]
[W 2022-11-26 14:56:15,608] Trial 10 failed because of the following error: KeyboardInterrupt()
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/optuna/study/_optimize.py", line 196, in _run_trial
    value_or_values = func(trial)
  File "<ipython-input-27-22d5db7007a2>", line 18, in objective
    torch.from_numpy(y_valid.to_numpy()))
  File "<ipython-input-25-931907107683>", line 43, in optuna_train
    predicted = model(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  Fi

KeyboardInterrupt: ignored

In [None]:
best_trial = study.best_trial

for key, value in best_trial.params.items():
    print("{}: {}".format(key, value))

learning_rate: 0.0003307296196279348
optimizer: RMSprop
batch_size: 25
early_stopping_epochs: 70
n_layers: 5
n_units_l0: 268
p1: 0.55
n_units_l1: 63
n_units_l2: 78
n_units_l3: 59
n_units_l4: 21


In [None]:
best_trial.value

0.7846666666666666

In [None]:
optuna.visualization.plot_optimization_history(study)

In [None]:
# Load the best model.
with open("/content/optuna_model_{}.pickle".format(best_trial.number), "rb") as fin:
    best_model = pickle.load(fin)

acc = get_accuracy(best_model,
            torch.from_numpy(X_test),
            torch.from_numpy(y_test.to_numpy()))
print('>> ACCURACY: ', acc, end='\n\n')

>> ACCURACY:  0.8046666666666666



In [None]:
best_trial.number

10

In [None]:
with open(f'/content/optuna_model_{best_trial.number}_params.pickle', 'wb') as handle:
    pickle.dump(best_trial.params, handle, protocol=pickle.HIGHEST_PROTOCOL)
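
Both saved file names depend on the trial number, and once the kernel is reset `study` and `best_trial` no longer exist, so that number has to come from somewhere. One option, sketched below under assumptions, is to write the number and the parameters to a single file with a fixed name. The helpers `save_best_trial` and `load_best_trial`, the file name `best_trial.json`, and the temporary directory are hypothetical choices for illustration; the values are copied from trial 1 in the log above.

```python
import json
import os
import tempfile


def save_best_trial(path, number, params):
    """Store the best trial number and its parameters in one JSON file."""
    with open(path, 'w') as fout:
        json.dump({'number': number, 'params': params}, fout)


def load_best_trial(path):
    """Read back the trial number and parameters saved by save_best_trial."""
    with open(path) as fin:
        record = json.load(fin)
    return record['number'], record['params']


# Illustrative values only (taken from trial 1 in the log); in the notebook
# they would come from best_trial.number and best_trial.params.
path = os.path.join(tempfile.gettempdir(), 'best_trial.json')
save_best_trial(path, 1, {'learning_rate': 0.0003307296196279348,
                          'optimizer': 'RMSprop',
                          'batch_size': 25})

# After a kernel reset, the fixed file name is enough to rebuild the model path
number, params = load_best_trial(path)
model_path = '/content/optuna_model_{}.pickle'.format(number)
print(model_path)
```

Because the file name no longer depends on the trial number, the cell that reloads the model does not need a hard-coded index.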

TESTING AFTER KERNEL RESET

In [None]:
import pickle


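# The trial number is hard-coded here; `study` and `best_trial` are not available after a kernel reset
with open("/content/optuna_model_{}.pickle".format(2), "rb") as fin: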
    best_model = pickle.load(fin)

acc = get_accuracy(best_model,
            torch.from_numpy(X_test),
            torch.from_numpy(y_test.to_numpy()))
print('>> ACCURACY: ', acc, end='\n\n')

>> ACCURACY:  0.8013333333333333

