# Tensorboard and Optuna

* PyTorch [website](https://pytorch.org/)
* PyTorch [Tutorials](https://pytorch.org/tutorials/)
* TensorBoard [website](https://www.tensorflow.org/tensorboard)
* PyTorch [Tensorboard](https://pytorch.org/docs/stable/tensorboard.html) docs
* PyTorch TensorBoard [Tutorial](https://pytorch.org/tutorials/intermediate/tensorboard_tutorial.html)

# TensorBoard
Load the TensorBoard notebook extension

*Enable 3rd party cookies in your browser settings. Use Google Chrome.*

In [4]:
%load_ext tensorboard
logs_dir = './logs/'
%tensorboard --logdir {logs_dir} --host localhost

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


Reusing TensorBoard on port 6006 (pid 20536), started 0:01:09 ago. (Use '!kill 20536' to kill it.)

In [49]:
#!rm -r {logs_dir}

"rm" ­Ґ пў«пҐвбп ў­гваҐ­­Ґ© Ё«Ё ў­Ґи­Ґ©
Є®¬ ­¤®©, ЁбЇ®«­пҐ¬®© Їа®Ја ¬¬®© Ё«Ё Ї ЄҐв­л¬ д ©«®¬.


In [3]:
import time
import numpy as np
from torch.utils.tensorboard import SummaryWriter

# writer = SummaryWriter(log_dir=logs_dir+'experiment_02', flush_secs=2)

# for epoch in range(1, 100):
#     writer.add_scalar('Loss/train', np.random.random()/epoch, epoch)
#     writer.add_scalar('Loss/test', np.random.random()/epoch, epoch)
#     writer.add_scalar('Accuracy/train', np.random.random()*epoch, epoch)
#     writer.add_scalar('Accuracy/test', np.random.random()*epoch, epoch)
#     time.sleep(0.25)
# writer.close()

## Homework 04


All tasks could be done together.
1. (3) Use Optuna to tune hyperparameters. Decrease error rate of your best original model (Homework 03) by 20% at least. (E.g., your best model accuracy was 85%, the error rate was 100%-85%=15%; hence, you need to find hyperparameters to achieve 100%-15%*(100%-80%)=88% accuracy)
2. (1) Use TensorBoard to log each model training process (Train and Val loss and accuracy) (*.add_scalar*) and save hyperparameters of each model examined by Optuna (*.add_hparams*).
3. (1) Save visualization of each model architecture examined by Optuna to TensorBoard. (*.add_graph*)
4. \* (2) Save up to 3 misclassified images from the test dataset of each class of each model architecture examined by Optuna to TensorBoard.

In [5]:
import numpy as np

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms
from torch.optim.lr_scheduler import ReduceLROnPlateau

import matplotlib.pyplot as plt

import math
import optuna

In [6]:
def get_loaders(batch_size=128, num_workers=2, transform=transforms.ToTensor()):
    train = datasets.CIFAR10('../data', train=True, download=True, transform=transform)
    test = datasets.CIFAR10('../data', train=False, download=True, transform=transform)
    torch.manual_seed(123)  # To ensure the same sampling during each experiment
    
    train_loader = torch.utils.data.DataLoader(train, batch_size=batch_size, num_workers=num_workers, shuffle=True)
    test_loader = torch.utils.data.DataLoader(test, batch_size=batch_size, num_workers=num_workers, shuffle=False)
    return train_loader, test_loader

In [7]:
class ResNet(nn.Module):
    def __init__(self):
        super(ResNet, self).__init__()
        
        #ResNet
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)
        self.bn1 = nn.BatchNorm2d(16)
        
        #Residual block 1
        self.conv2 = nn.Conv2d(in_channels=16, out_channels=16, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(16)
        self.conv3 = nn.Conv2d(in_channels=16, out_channels=16, kernel_size=3, stride=1, padding=1)
        self.bn3 = nn.BatchNorm2d(16)
        
        self.conv4 = nn.Conv2d(in_channels=16, out_channels=16, kernel_size=3, stride=1, padding=1)
        self.bn4 = nn.BatchNorm2d(16)
        self.conv5 = nn.Conv2d(in_channels=16, out_channels=16, kernel_size=3, stride=1, padding=1)
        self.bn5 = nn.BatchNorm2d(16)
        
        
        #Residual block 2
        self.conv6 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, stride=2, padding=1)
        self.bn6 = nn.BatchNorm2d(32)
        self.conv7 = nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3, stride=1, padding=1)
        self.bn7 = nn.BatchNorm2d(32)
        
        self.conv8 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, stride=2, padding=1)
        self.bn8 = nn.BatchNorm2d(32)
        
        self.conv9 = nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3, stride=1, padding=1)
        self.bn9 = nn.BatchNorm2d(32)
        self.conv10 = nn.Conv2d(in_channels=32, out_channels=32, kernel_size=3, stride=1, padding=1)
        self.bn10 = nn.BatchNorm2d(32)
        
        
        #Residual block 3
        self.conv11 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=2, padding=1)
        self.bn11 = nn.BatchNorm2d(64)
        self.conv12 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.bn12 = nn.BatchNorm2d(64)
        
        self.conv13 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=2, padding=1)
        self.bn13 = nn.BatchNorm2d(64)
        
        self.conv14 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.bn14 = nn.BatchNorm2d(64)
        self.conv15 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.bn15 = nn.BatchNorm2d(64)
        
        
        #Final
        self.pool = nn.AvgPool2d(8)
        self.fc = nn.Linear(64, 10)
    
    def forward(self, x):
        
        x = self.conv1(x)
        x = self.bn1(x)
        x = F.relu(x, inplace=True)
        
        
        #Residual block 1
        
        residual = x
        x = self.conv2(x)
        x = self.bn2(x)
        x = F.relu(x, inplace=True)
        x = self.conv3(x)
        x = self.bn3(x)
        x += residual
        x = F.relu(x, inplace=True)
        
        residual = x
        x = self.conv4(x)
        x = self.bn4(x)
        x = F.relu(x, inplace=True)
        x = self.conv5(x)
        x = self.bn5(x)
        x += residual
        x = F.relu(x, inplace=True)
        
        
        #Residual block 2
        
        residual = x
        x = self.conv6(x)
        x = self.bn6(x)
        x = F.relu(x, inplace=True)
        x = self.conv7(x)
        x = self.bn7(x)
        residual = self.conv8(residual)
        residual = self.bn8(residual)
        x += residual
        x = F.relu(x, inplace=True)
        
        residual = x
        x = self.conv9(x)
        x = self.bn9(x)
        x = F.relu(x, inplace=True)
        x = self.conv10(x)
        x = self.bn10(x)
        x += residual
        x = F.relu(x, inplace=True)
        
        
        #Residual block 3
        
        residual = x
        x = self.conv11(x)
        x = self.bn11(x)
        x = F.relu(x, inplace=True)
        x = self.conv12(x)
        x = self.bn12(x)
        residual = self.conv13(residual)
        residual = self.bn13(residual)
        x += residual
        x = F.relu(x, inplace=True)
        
        residual = x
        x = self.conv14(x)
        x = self.bn14(x)
        x = F.relu(x, inplace=True)
        x = self.conv15(x)
        x = self.bn15(x)
        x += residual
        x = F.relu(x, inplace=True)
        
        
        #Final
        x = self.pool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        x = F.log_softmax(x, 1)
        
        return x

In [8]:
def train(model, device, train_loader, optimizer):
    model.train()
    train_loss = 0
    correct = 0
    for data, target in train_loader:
        data = data.to(device)
        target = target.to(device)
        optimizer.zero_grad()
        out = model(data)
        loss = F.nll_loss(out, target)
        loss.backward()
        optimizer.step()
        train_loss += F.nll_loss(out, target, reduction='sum').item()
        
        _, predicted = torch.max(out, 1)
        correct += (predicted == target).sum().item()
    
    return train_loss / len(train_loader.dataset), 100*correct/len(train_loader.dataset)


def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data = data.to(device)
            target = target.to(device)
            out = model(data)
            loss = F.nll_loss(out, target, reduction='sum')
            test_loss += loss.item()
            
            _, predicted = torch.max(out, 1)
            correct += (predicted == target).sum().item()
            
    return test_loss / len(test_loader.dataset), 100*correct/len(test_loader.dataset)

## Результаты обучения с исходными гиперпараметрами

In [57]:
epochs = 80
device = torch.device('cuda:0')

batch_size = 128
lr = 1e-3

transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

train_loader, test_loader = get_loaders(batch_size=batch_size, transform=transform, num_workers=0)
model = ResNet().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
scheduler = ReduceLROnPlateau(optimizer, 'max')

min_loss = 100
max_early_stop = 15
early_stop = 0


for epoch in range(epochs):
    train_loss, train_accuracy = train(model, device, train_loader, optimizer)
    test_loss, test_accuracy = test(model, device, test_loader)
    scheduler.step(test_accuracy)

    #early stop
    if test_loss > min_loss:
        early_stop += 1
    else:
        min_loss = test_loss
        early_stop = 0
    if early_stop == max_early_stop:
        print("Model doesn't improve, early stop")
        break


    print("Epoch: {}, train loss: {}, train accuracy: {}, test loss: {}, test accuracy: {}".format(epoch, train_loss, train_accuracy, test_loss, test_accuracy))

Files already downloaded and verified
Files already downloaded and verified
Epoch: 0, train loss: 1.3189632302856444, train accuracy: 51.886, test loss: 1.2484499015808106, test accuracy: 57.32
Epoch: 1, train loss: 0.9172500576019287, train accuracy: 67.168, test loss: 0.9936826674461364, test accuracy: 64.68
Epoch: 2, train loss: 0.7532615420532227, train accuracy: 73.272, test loss: 1.046308250617981, test accuracy: 62.53
Epoch: 3, train loss: 0.644967350616455, train accuracy: 77.394, test loss: 0.7758380510330201, test accuracy: 73.82
Epoch: 4, train loss: 0.5712703308105469, train accuracy: 79.84, test loss: 0.6856704473495483, test accuracy: 76.45
Epoch: 5, train loss: 0.5108024661254883, train accuracy: 82.124, test loss: 0.6793796546936035, test accuracy: 76.42
Epoch: 6, train loss: 0.45692398040771487, train accuracy: 83.92, test loss: 0.7882611918449401, test accuracy: 74.53
Epoch: 7, train loss: 0.4108411571884155, train accuracy: 85.634, test loss: 0.6668471819877625, test

(100%-78%)*0.2 = 4.4%
Требуется увеличить значение точности на 4.4%

## Поиск потимальных гиперпараметров

In [9]:
train_split_len = 20000
test_split_len = 5000

def get_loaders_batch(batch_size=128, num_workers=2, transform=transforms.ToTensor()):
    train = datasets.CIFAR10('../data', train=True, download=True, transform=transform)
    test = datasets.CIFAR10('../data', train=False, download=True, transform=transform)
    torch.manual_seed(123)  # To ensure the same sampling during each experiment
    train = torch.utils.data.random_split(train, [train_split_len, len(train)-train_split_len])[0]
    test = torch.utils.data.random_split(test, [test_split_len, len(test)-test_split_len])[0]
    
    train_loader = torch.utils.data.DataLoader(train, batch_size=batch_size, num_workers=num_workers, shuffle=True)
    test_loader = torch.utils.data.DataLoader(test, batch_size=batch_size, num_workers=num_workers, shuffle=False)
    return train_loader, test_loader

Результаты на части датасета

In [54]:
epochs = 80
device = torch.device('cuda:0')

batch_size = 128
lr = 1e-3

transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

train_loader, test_loader = get_loaders_batch(batch_size=batch_size, transform=transform, num_workers=0)
model = ResNet().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
scheduler = ReduceLROnPlateau(optimizer, 'max')

min_loss = 128
max_early_stop = 15
early_stop = 0


for epoch in range(epochs):
    train_loss, train_accuracy = train(model, device, train_loader, optimizer)
    test_loss, test_accuracy = test(model, device, test_loader)
    scheduler.step(test_accuracy)

    #early stop
    if test_loss > min_loss:
        early_stop += 1
    else:
        min_loss = test_loss
        early_stop = 0
    if early_stop == max_early_stop:
        print("Model doesn't improve, early stop")
        break


    print("Epoch: {}, train loss: {}, train accuracy: {}, test loss: {}, test accuracy: {}".format(epoch, train_loss, train_accuracy, test_loss, test_accuracy))

Files already downloaded and verified
Files already downloaded and verified
Epoch: 0, train loss: 1.5943648639678956, train accuracy: 41.365, test loss: 1.5803111841201782, test accuracy: 43.06
Epoch: 1, train loss: 1.2028879693984986, train accuracy: 56.365, test loss: 1.3263910757064818, test accuracy: 54.04
Epoch: 2, train loss: 1.016085764312744, train accuracy: 63.55, test loss: 1.7451686050415038, test accuracy: 44.4
Epoch: 3, train loss: 0.9019544897079468, train accuracy: 67.445, test loss: 1.0092890419006348, test accuracy: 63.56
Epoch: 4, train loss: 0.7993814951896667, train accuracy: 71.49, test loss: 1.4173109228134155, test accuracy: 55.8
Epoch: 5, train loss: 0.7183253150939941, train accuracy: 74.62, test loss: 1.111238115310669, test accuracy: 61.44
Epoch: 6, train loss: 0.6358985503196717, train accuracy: 77.475, test loss: 1.0777069185256958, test accuracy: 63.18
Epoch: 7, train loss: 0.5653978249549866, train accuracy: 80.4, test loss: 1.3068972537994386, test accur

In [10]:
#отдельная функция трейна с эпохами

def epoch_train(model, params, trial_n):
    
    experiment = 'trial_'+str(trial_n)
    writer = SummaryWriter(log_dir=logs_dir+experiment, flush_secs=2)
    
    epochs = 80
    max_early_stop = 15
    min_loss = 100
    early_stop = 0
    batch_size = 128
    
    lr = params["lr"]
    weight_decay = params["weight_decay"]
    amsgrad = params["amsgrad"]
    patience = params["patience"]
    
    train_loader, test_loader = get_loaders_batch(batch_size=batch_size, transform=transform, num_workers=0)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay = weight_decay, amsgrad = amsgrad)
    scheduler = ReduceLROnPlateau(optimizer, 'max', patience = patience)
    
    accuracy = 0
    for epoch in range(epochs):
        train_loss, train_accuracy = train(model, device, train_loader, optimizer)
        test_loss, test_accuracy = test(model, device, test_loader)
        scheduler.step(test_accuracy)
        
        writer.add_scalar('Loss/train', train_loss, epoch)
        writer.add_scalar('Loss/test', test_loss, epoch)
        writer.add_scalar('Accuracy/train', train_accuracy, epoch)
        writer.add_scalar('Accuracy/test', test_accuracy, epoch)

        #early stop
        if test_loss > min_loss:
            early_stop += 1
        else:
            min_loss = test_loss
            early_stop = 0
        if early_stop == max_early_stop:
            print("Model doesn't improve, early stop")
            accuracy = test_accuracy
            break
        if epoch==epochs-1:
            accuracy = test_accuracy
    
    writer.add_hparams(hparam_dict = params, metric_dict = {'accuracy':accuracy})
    writer.close()
    
    return accuracy

In [11]:
device = torch.device('cuda:0')

transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

model = ResNet().to(device)

def objective(trial):
    
    amsgrad = trial.suggest_categorical("amsgrad",[True, False])
    lr = trial.suggest_categorical("lr", [1e-5, 1e-4, 1e-3, 1e-2, 1e-1])
    weight_decay = trial.suggest_categorical("weight_decay", [0, 1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1])
    patience = trial.suggest_int("patience", 4, 10)
    
    params = {"lr":1e-3, "weight_decay":weight_decay, "amsgrad": amsgrad, "patience": patience}
    
    accuracy = epoch_train(model, params, trial.number)
    
    print(accuracy)
    return accuracy

In [12]:
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials = 60)

[32m[I 2021-02-12 23:28:22,188][0m A new study created in memory with name: no-name-ec8d6f35-0672-45b1-9d83-cf4bae8830c3[0m


Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-12 23:36:32,733][0m Trial 0 finished with value: 74.28 and parameters: {'amsgrad': True, 'lr': 0.0001, 'weight_decay': 0, 'patience': 10}. Best is trial 0 with value: 74.28.[0m


Model doesn't improve, early stop
74.28
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-12 23:46:52,518][0m Trial 1 finished with value: 74.84 and parameters: {'amsgrad': True, 'lr': 0.001, 'weight_decay': 1e-06, 'patience': 9}. Best is trial 1 with value: 74.84.[0m


Model doesn't improve, early stop
74.84
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 00:03:38,998][0m Trial 2 finished with value: 74.96 and parameters: {'amsgrad': True, 'lr': 0.1, 'weight_decay': 0.001, 'patience': 7}. Best is trial 2 with value: 74.96.[0m


Model doesn't improve, early stop
74.96
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 00:20:14,998][0m Trial 3 finished with value: 73.3 and parameters: {'amsgrad': False, 'lr': 0.001, 'weight_decay': 0.1, 'patience': 9}. Best is trial 2 with value: 74.96.[0m


Model doesn't improve, early stop
73.3
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 00:28:32,937][0m Trial 4 finished with value: 71.72 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 1e-06, 'patience': 9}. Best is trial 2 with value: 74.96.[0m


Model doesn't improve, early stop
71.72
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 00:38:08,051][0m Trial 5 finished with value: 76.48 and parameters: {'amsgrad': False, 'lr': 0.001, 'weight_decay': 0.0001, 'patience': 5}. Best is trial 5 with value: 76.48.[0m


Model doesn't improve, early stop
76.48
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 00:47:05,691][0m Trial 6 finished with value: 76.68 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 0.0001, 'patience': 7}. Best is trial 6 with value: 76.68.[0m


Model doesn't improve, early stop
76.68
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 00:59:52,322][0m Trial 7 finished with value: 77.84 and parameters: {'amsgrad': True, 'lr': 1e-05, 'weight_decay': 0.1, 'patience': 6}. Best is trial 7 with value: 77.84.[0m


Model doesn't improve, early stop
77.84
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 01:05:57,755][0m Trial 8 finished with value: 76.0 and parameters: {'amsgrad': True, 'lr': 0.001, 'weight_decay': 1e-05, 'patience': 4}. Best is trial 7 with value: 77.84.[0m


Model doesn't improve, early stop
76.0
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 01:13:37,963][0m Trial 9 finished with value: 77.8 and parameters: {'amsgrad': False, 'lr': 0.0001, 'weight_decay': 0.0001, 'patience': 5}. Best is trial 7 with value: 77.84.[0m


Model doesn't improve, early stop
77.8
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 01:27:21,676][0m Trial 10 finished with value: 78.56 and parameters: {'amsgrad': True, 'lr': 0.01, 'weight_decay': 0.1, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
78.56
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 01:43:38,599][0m Trial 11 finished with value: 74.64 and parameters: {'amsgrad': True, 'lr': 0.01, 'weight_decay': 0.1, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.64
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 01:51:57,022][0m Trial 12 finished with value: 72.78 and parameters: {'amsgrad': True, 'lr': 0.01, 'weight_decay': 1e-07, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
72.78
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 02:00:55,518][0m Trial 13 finished with value: 75.52 and parameters: {'amsgrad': True, 'lr': 0.01, 'weight_decay': 0.01, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
75.52
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 02:18:28,919][0m Trial 14 finished with value: 74.84 and parameters: {'amsgrad': True, 'lr': 1e-05, 'weight_decay': 0.1, 'patience': 4}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.84
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 02:37:16,146][0m Trial 15 finished with value: 74.28 and parameters: {'amsgrad': True, 'lr': 0.1, 'weight_decay': 0.1, 'patience': 8}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.28
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 02:52:33,200][0m Trial 16 finished with value: 74.74 and parameters: {'amsgrad': True, 'lr': 0.01, 'weight_decay': 0.1, 'patience': 5}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.74
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 03:03:41,935][0m Trial 17 finished with value: 75.04 and parameters: {'amsgrad': True, 'lr': 1e-05, 'weight_decay': 0.01, 'patience': 7}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
75.04
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 03:09:26,007][0m Trial 18 finished with value: 68.22 and parameters: {'amsgrad': True, 'lr': 1e-05, 'weight_decay': 0, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
68.22
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 03:14:39,037][0m Trial 19 finished with value: 71.78 and parameters: {'amsgrad': True, 'lr': 0.01, 'weight_decay': 1e-05, 'patience': 8}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
71.78
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 03:24:49,970][0m Trial 20 finished with value: 74.48 and parameters: {'amsgrad': True, 'lr': 1e-05, 'weight_decay': 1e-07, 'patience': 5}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.48
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 03:33:05,866][0m Trial 21 finished with value: 74.24 and parameters: {'amsgrad': False, 'lr': 0.0001, 'weight_decay': 0.0001, 'patience': 5}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.24
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 03:45:48,659][0m Trial 22 finished with value: 74.82 and parameters: {'amsgrad': False, 'lr': 0.0001, 'weight_decay': 0.1, 'patience': 4}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.82
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 03:53:45,725][0m Trial 23 finished with value: 76.26 and parameters: {'amsgrad': False, 'lr': 0.0001, 'weight_decay': 0.0001, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
76.26
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 03:59:11,666][0m Trial 24 finished with value: 75.88 and parameters: {'amsgrad': False, 'lr': 0.0001, 'weight_decay': 0.1, 'patience': 5}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
75.88
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 04:05:34,561][0m Trial 25 finished with value: 66.22 and parameters: {'amsgrad': False, 'lr': 0.1, 'weight_decay': 0.001, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
66.22
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 04:13:34,494][0m Trial 26 finished with value: 75.82 and parameters: {'amsgrad': True, 'lr': 0.0001, 'weight_decay': 0.0001, 'patience': 5}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
75.82
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 04:31:02,022][0m Trial 27 finished with value: 75.08 and parameters: {'amsgrad': False, 'lr': 0.01, 'weight_decay': 0.1, 'patience': 7}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
75.08
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 04:38:58,200][0m Trial 28 finished with value: 73.24 and parameters: {'amsgrad': True, 'lr': 0.01, 'weight_decay': 0, 'patience': 8}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
73.24
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 04:48:33,634][0m Trial 29 finished with value: 78.16 and parameters: {'amsgrad': True, 'lr': 0.0001, 'weight_decay': 0.1, 'patience': 4}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
78.16
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 05:07:29,057][0m Trial 30 finished with value: 74.28 and parameters: {'amsgrad': True, 'lr': 1e-05, 'weight_decay': 0.1, 'patience': 4}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.28
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 05:24:58,918][0m Trial 31 finished with value: 74.78 and parameters: {'amsgrad': True, 'lr': 0.0001, 'weight_decay': 0.1, 'patience': 4}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.78
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 05:37:22,873][0m Trial 32 finished with value: 76.34 and parameters: {'amsgrad': True, 'lr': 0.0001, 'weight_decay': 1e-06, 'patience': 5}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
76.34
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 05:45:20,790][0m Trial 33 finished with value: 75.12 and parameters: {'amsgrad': True, 'lr': 0.0001, 'weight_decay': 0.001, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
75.12
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 05:57:08,509][0m Trial 34 finished with value: 75.32 and parameters: {'amsgrad': True, 'lr': 0.0001, 'weight_decay': 0.1, 'patience': 5}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
75.32
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 06:14:39,519][0m Trial 35 finished with value: 74.62 and parameters: {'amsgrad': True, 'lr': 0.0001, 'weight_decay': 0.1, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.62
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 06:24:12,387][0m Trial 36 finished with value: 76.58 and parameters: {'amsgrad': False, 'lr': 0.1, 'weight_decay': 1e-06, 'patience': 4}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
76.58
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 06:32:30,717][0m Trial 37 finished with value: 75.3 and parameters: {'amsgrad': True, 'lr': 0.001, 'weight_decay': 0.0001, 'patience': 7}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
75.3
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 06:39:13,169][0m Trial 38 finished with value: 72.82 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 1e-05, 'patience': 10}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
72.82
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 06:47:31,005][0m Trial 39 finished with value: 74.96 and parameters: {'amsgrad': True, 'lr': 0.0001, 'weight_decay': 1e-07, 'patience': 5}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.96
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 06:55:10,367][0m Trial 40 finished with value: 74.74 and parameters: {'amsgrad': False, 'lr': 0.001, 'weight_decay': 0.01, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.74
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 07:04:06,222][0m Trial 41 finished with value: 74.68 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 0.0001, 'patience': 7}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.68
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 07:10:52,739][0m Trial 42 finished with value: 71.58 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 0.0001, 'patience': 8}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
71.58
Files already downloaded and verified
Files already downloaded and verified
Model doesn't improve, early stop


[32m[I 2021-02-13 07:19:11,598][0m Trial 43 finished with value: 74.2 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 0.0001, 'patience': 7}. Best is trial 10 with value: 78.56.[0m


74.2
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 07:28:46,091][0m Trial 44 finished with value: 74.56 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 0.0001, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.56
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 07:34:53,122][0m Trial 45 finished with value: 58.56 and parameters: {'amsgrad': True, 'lr': 0.01, 'weight_decay': 0.1, 'patience': 7}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
58.56
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 07:46:39,600][0m Trial 46 finished with value: 76.56 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 0, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
76.56
Files already downloaded and verified
Files already downloaded and verified
Model doesn't improve, early stop


[32m[I 2021-02-13 07:52:05,717][0m Trial 47 finished with value: 76.06 and parameters: {'amsgrad': True, 'lr': 0.01, 'weight_decay': 0.1, 'patience': 4}. Best is trial 10 with value: 78.56.[0m


76.06
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 08:06:31,547][0m Trial 48 finished with value: 76.46 and parameters: {'amsgrad': True, 'lr': 0.1, 'weight_decay': 0.0001, 'patience': 5}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
76.46
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 08:24:04,818][0m Trial 49 finished with value: 77.3 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 0.1, 'patience': 7}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
77.3
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 08:41:41,253][0m Trial 50 finished with value: 74.84 and parameters: {'amsgrad': True, 'lr': 1e-05, 'weight_decay': 0.1, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.84
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 09:01:07,805][0m Trial 51 finished with value: 74.6 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 0.1, 'patience': 8}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
74.6
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 09:18:41,858][0m Trial 52 finished with value: 73.78 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 0.1, 'patience': 7}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
73.78
Files already downloaded and verified
Files already downloaded and verified
Model doesn't improve, early stop


[32m[I 2021-02-13 09:36:32,298][0m Trial 53 finished with value: 74.58 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 0.1, 'patience': 7}. Best is trial 10 with value: 78.56.[0m


74.58
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 09:44:00,668][0m Trial 54 finished with value: 70.46 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 1e-05, 'patience': 8}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
70.46
Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 09:52:32,614][0m Trial 55 finished with value: 75.02 and parameters: {'amsgrad': False, 'lr': 0.01, 'weight_decay': 0.01, 'patience': 7}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
75.02
Files already downloaded and verified
Files already downloaded and verified
Model doesn't improve, early stop
69.8


[32m[I 2021-02-13 10:00:26,766][0m Trial 56 finished with value: 69.8 and parameters: {'amsgrad': False, 'lr': 0.0001, 'weight_decay': 0.001, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 10:07:05,730][0m Trial 57 finished with value: 75.16 and parameters: {'amsgrad': True, 'lr': 0.001, 'weight_decay': 1e-07, 'patience': 5}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
75.16
Files already downloaded and verified
Files already downloaded and verified
Model doesn't improve, early stop
74.84


[32m[I 2021-02-13 10:16:34,187][0m Trial 58 finished with value: 74.84 and parameters: {'amsgrad': False, 'lr': 1e-05, 'weight_decay': 0.1, 'patience': 7}. Best is trial 10 with value: 78.56.[0m


Files already downloaded and verified
Files already downloaded and verified


[32m[I 2021-02-13 10:28:23,400][0m Trial 59 finished with value: 75.08 and parameters: {'amsgrad': True, 'lr': 0.01, 'weight_decay': 0.0001, 'patience': 6}. Best is trial 10 with value: 78.56.[0m


Model doesn't improve, early stop
75.08


Гиперпараметры, выдающие лучший результат согласно тестам optuna 

In [13]:
def get_loaders(batch_size=128, num_workers=2, transform=transforms.ToTensor()):
    train = datasets.CIFAR10('../data', train=True, download=True, transform=transform)
    test = datasets.CIFAR10('../data', train=False, download=True, transform=transform)
    torch.manual_seed(123)  # To ensure the same sampling during each experiment
    
    train_loader = torch.utils.data.DataLoader(train, batch_size=batch_size, num_workers=num_workers, shuffle=True)
    test_loader = torch.utils.data.DataLoader(test, batch_size=batch_size, num_workers=num_workers, shuffle=False)
    return train_loader, test_loader

epochs = 80
device = torch.device('cuda:0')

batch_size = 128

transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

train_loader, test_loader = get_loaders(batch_size=batch_size, transform=transform, num_workers=0)
model = ResNet().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay = 0.1, amsgrad = True)
scheduler = ReduceLROnPlateau(optimizer, 'max', patience = 6)

min_loss = 100
max_early_stop = 15
early_stop = 0


for epoch in range(epochs):
    train_loss, train_accuracy = train(model, device, train_loader, optimizer)
    test_loss, test_accuracy = test(model, device, test_loader)
    scheduler.step(test_accuracy)

    #early stop
    if test_loss > min_loss:
        early_stop += 1
    else:
        min_loss = test_loss
        early_stop = 0
    if early_stop == max_early_stop:
        print("Model doesn't improve, early stop")
        break


    print("Epoch: {}, train loss: {}, train accuracy: {}, test loss: {}, test accuracy: {}".format(epoch, train_loss, train_accuracy, test_loss, test_accuracy))            

Files already downloaded and verified
Files already downloaded and verified
Epoch: 0, train loss: 2.1087148974609375, train accuracy: 19.332, test loss: 2.2405940105438233, test accuracy: 11.39
Epoch: 1, train loss: 2.1456982568359373, train accuracy: 17.95, test loss: 2.2152142162322996, test accuracy: 14.78
Epoch: 2, train loss: 2.1216576470947266, train accuracy: 17.988, test loss: 2.1491609413146975, test accuracy: 16.61
Epoch: 3, train loss: 2.0957974383544924, train accuracy: 17.94, test loss: 2.1846366706848146, test accuracy: 14.61
Epoch: 4, train loss: 2.091106525268555, train accuracy: 18.064, test loss: 2.197235676574707, test accuracy: 14.08
Epoch: 5, train loss: 2.0865254098510744, train accuracy: 18.292, test loss: 2.0958155609130857, test accuracy: 16.47
Epoch: 6, train loss: 2.081383836669922, train accuracy: 18.134, test loss: 2.483316763687134, test accuracy: 10.04
Epoch: 7, train loss: 2.082813187866211, train accuracy: 17.96, test loss: 2.063861925125122, test accur

Второй по точноти результат

In [17]:
def get_loaders(batch_size=128, num_workers=2, transform=transforms.ToTensor()):
    train = datasets.CIFAR10('../data', train=True, download=True, transform=transform)
    test = datasets.CIFAR10('../data', train=False, download=True, transform=transform)
    torch.manual_seed(123)  # To ensure the same sampling during each experiment
    
    train_loader = torch.utils.data.DataLoader(train, batch_size=batch_size, num_workers=num_workers, shuffle=True)
    test_loader = torch.utils.data.DataLoader(test, batch_size=batch_size, num_workers=num_workers, shuffle=False)
    return train_loader, test_loader

epochs = 80
device = torch.device('cuda:0')

batch_size = 128

transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

train_loader, test_loader = get_loaders(batch_size=batch_size, transform=transform, num_workers=0)
model = ResNet().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001, weight_decay = 0.1, amsgrad = True)
scheduler = ReduceLROnPlateau(optimizer, 'max', patience = 4)

min_loss = 100
max_early_stop = 15
early_stop = 0


for epoch in range(epochs):
    train_loss, train_accuracy = train(model, device, train_loader, optimizer)
    test_loss, test_accuracy = test(model, device, test_loader)
    scheduler.step(test_accuracy)

    #early stop
    if test_loss > min_loss:
        early_stop += 1
    else:
        min_loss = test_loss
        early_stop = 0
    if early_stop == max_early_stop:
        print("Model doesn't improve, early stop")
        break


    print("Epoch: {}, train loss: {}, train accuracy: {}, test loss: {}, test accuracy: {}".format(epoch, train_loss, train_accuracy, test_loss, test_accuracy))            

Files already downloaded and verified
Files already downloaded and verified
Epoch: 0, train loss: 1.7348911715698243, train accuracy: 37.376, test loss: 1.4607028423309327, test accuracy: 46.93
Epoch: 1, train loss: 1.3391730519104004, train accuracy: 52.452, test loss: 1.2772051935195923, test accuracy: 55.88
Epoch: 2, train loss: 1.1522341136169434, train accuracy: 60.132, test loss: 1.1838129133224486, test accuracy: 59.5
Epoch: 3, train loss: 1.0477635551452638, train accuracy: 64.196, test loss: 1.1413270091056824, test accuracy: 59.96
Epoch: 4, train loss: 0.9740360813903809, train accuracy: 67.398, test loss: 1.098311243057251, test accuracy: 62.24
Epoch: 5, train loss: 0.9171803987121582, train accuracy: 69.956, test loss: 1.0977614736557006, test accuracy: 61.53
Epoch: 6, train loss: 0.8683072491455078, train accuracy: 72.078, test loss: 0.9804579916000367, test accuracy: 66.62
Epoch: 7, train loss: 0.8270677038574219, train accuracy: 73.856, test loss: 0.9155941962242127, tes

Результаты для гиперпараметров, подобранных вручную

In [39]:
def get_loaders(batch_size=128, num_workers=2, transform=transforms.ToTensor()):
    train = datasets.CIFAR10('../data', train=True, download=True, transform=transform)
    test = datasets.CIFAR10('../data', train=False, download=True, transform=transform)
    torch.manual_seed(123)  # To ensure the same sampling during each experiment
    
    train_loader = torch.utils.data.DataLoader(train, batch_size=batch_size, num_workers=num_workers, shuffle=True)
    test_loader = torch.utils.data.DataLoader(test, batch_size=batch_size, num_workers=num_workers, shuffle=False)
    return train_loader, test_loader

epochs = 80
device = torch.device('cuda:0')

batch_size = 128

transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

train_loader, test_loader = get_loaders(batch_size=batch_size, transform=transform, num_workers=0)
model = ResNet().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay = 0.01, amsgrad = True)
scheduler = ReduceLROnPlateau(optimizer, 'max', patience = 5)

min_loss = 100
max_early_stop = 15
early_stop = 0


for epoch in range(epochs):
    train_loss, train_accuracy = train(model, device, train_loader, optimizer)
    test_loss, test_accuracy = test(model, device, test_loader)
    scheduler.step(test_accuracy)

    #early stop
    if test_loss > min_loss:
        early_stop += 1
    else:
        min_loss = test_loss
        early_stop = 0
    if early_stop == max_early_stop:
        print("Model doesn't improve, early stop")
        break


    print("Epoch: {}, train loss: {}, train accuracy: {}, test loss: {}, test accuracy: {}".format(epoch, train_loss, train_accuracy, test_loss, test_accuracy))            

Files already downloaded and verified
Files already downloaded and verified
Epoch: 0, train loss: 1.3474447024536134, train accuracy: 51.226, test loss: 1.2984502500534059, test accuracy: 52.88
Epoch: 1, train loss: 1.0135837547302247, train accuracy: 64.374, test loss: 1.2190793561935425, test accuracy: 54.42
Epoch: 2, train loss: 0.9013457600402832, train accuracy: 68.854, test loss: 1.2080781826019287, test accuracy: 56.4
Epoch: 3, train loss: 0.8272447872924805, train accuracy: 71.954, test loss: 1.0307086879730225, test accuracy: 62.95
Epoch: 4, train loss: 0.7656343673706054, train accuracy: 74.294, test loss: 0.9496164835929871, test accuracy: 67.33
Epoch: 5, train loss: 0.7219563063812255, train accuracy: 75.976, test loss: 0.9451825685501098, test accuracy: 67.42
Epoch: 6, train loss: 0.6874327080535889, train accuracy: 77.25, test loss: 0.926943623828888, test accuracy: 67.9
Epoch: 7, train loss: 0.6707906939697266, train accuracy: 77.758, test loss: 0.8636785765647889, test 

Поиск гиперпараметров на части датасета не дает тех же результатов на всем датасете. Необходимо искать гиперпараметры на всем датасете, но это занимает слигком много времени (больше двух дней)