# AlexNet, CIFAR-100 and Dropout

*Python implementation of dropout with AlexNet and CIFAR-100*

**The following experiments require several software packages, including NVIDIA CUDA, which has to be installed manually before using this notebook.**

In [1]:
!pip install Pillow==6.1.0
!pip install torch torchvision
!pip install tensorboard
!pip install scikit-learn



## Overfitting

To understand the effect of dropout on deep neural networks (DNNs), we first have to look at the general problem of overfitting.

In a DNN, each neuron uses two equations to compute its output. The first equation computes the state of the neuron from its bias and the weighted inputs from each neuron of the previous layer. The second equation computes the activation of the neuron from its state. Each neuron has its own input weights, and each weight is a network parameter, so adding one neuron to a layer can add many parameters to the network.
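As a minimal sketch of the two equations described above: the text does not name a specific activation function, so the sigmoid used here is an assumption, and the helper names `neuron_state` and `neuron_activation` are hypothetical.

```python
import math

def neuron_state(inputs, weights, bias):
    """First equation: bias plus the weighted sum of the previous layer's outputs."""
    return bias + sum(w * x for w, x in zip(weights, inputs))

def neuron_activation(state):
    """Second equation: activation computed from the state (sigmoid is an assumption)."""
    return 1.0 / (1.0 + math.exp(-state))

# A neuron with three inputs owns three weights plus one bias: four parameters.
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -0.25, 0.0]
bias = 0.0

state = neuron_state(inputs, weights, bias)   # 0.5 - 0.5 + 0.0
activation = neuron_activation(state)
print(state, activation)
```

Adding one more neuron to this layer would add one bias and one weight per input, which is why parameter counts grow quickly with layer width.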

Depending on the amount of data available for training, the network might have too many parameters. More precisely, with that many parameters the network can reach a point during training where it starts to memorize the training data, or more specifically the noise in the training data, which increases the accuracy on the training data. The noise in the training data most likely differs from the noise in the test data, so the accuracy on the test data starts to decrease after that point, as seen in the following diagram.

<img width="864" height="576" class="pf sh ds t u hu ak id" role="presentation" src="https://miro.medium.com/max/1350/1*iANsamYbzkuUwIBWDP21GQ.gif">

## Co-adaptation

Another problem can be caused by the way the backpropagation algorithm works. Backpropagation updates each input weight to minimize the loss of the network. In doing so, it updates each neuron according to the behavior of the other neurons. This can lead to co-adaptation, where neurons change in a way that corrects the mistakes of other neurons.

## Dropout

To counter overfitting and co-adaptation, we randomly drop neurons of the network and train the resulting thinner networks. To make that possible, we add a layer that drops each neuron with probability p. Using that layer, we generate a different thinner network for each batch we train on, as seen in the following diagram.

![Network with parameters with and without dropout layer](./visualizations/Dropout_aufbau.JPG)


<img width="612" height="348" class="pf sh ds t u hu ak id" role="presentation" src="https://miro.medium.com/max/956/1*064lT1SXq_6F7uoc00V1fw.gif">

By scaling the weights of the thinner networks we obtain the weights of the original network, which we use without dropout to evaluate the accuracy on the test data. Research shows that the probability p used for the dropout layers determines how strongly dropout affects overfitting.
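The following pure-Python sketch illustrates such a dropout layer; it is not the notebook's implementation. It assumes the "inverted" variant used by `torch.nn.Dropout`, which scales the surviving activations by 1/(1-p) during training so that no rescaling is needed at test time. The function name `dropout` and its arguments are hypothetical.

```python
import random

def dropout(activations, p, training=True, rng=random):
    """Zero each activation with probability p; scale survivors by 1 / (1 - p)."""
    if not training or p == 0.0:
        # At evaluation time the full network is used unchanged.
        return list(activations)
    if p >= 1.0:
        return [0.0 for _ in activations]
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else a * scale for a in activations]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
activations = [1.0] * 10000

dropped = dropout(activations, p=0.5, rng=rng)
kept_fraction = sum(1 for a in dropped if a != 0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)
print(kept_fraction, mean_after)  # roughly 0.5 and roughly 1.0
```

Because the survivors are scaled up, the expected value of each activation stays the same with and without dropout, which is what allows the full network to be used unchanged for evaluation.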

For the following experiment we use AlexNet, best known for winning the 2012 ILSVRC with an accuracy more than 10 percentage points better than that of the runner-up.

The original AlexNet has more than 650,000 neurons and over 62 million parameters.

In [2]:
import argparse
import os
import random
import shutil
import time
import warnings
import sys

import torch
import torch.nn as nn
import torch.nn.parallel
import torch.backends.cudnn as cudnn
import torch.distributed as dist
import torch.optim
import torch.multiprocessing as mp
import torch.utils.data as data
import torch.utils.data.distributed
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.models as models
from torch.utils.tensorboard import SummaryWriter

class AlexNet(nn.Module):

    def __init__(self, droprate=0.5, num_classes=10):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(64, 192, kernel_size=3, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2)
        )
        self.fc_layers = nn.Sequential(
            nn.Dropout(droprate),
            nn.Linear(4096, 2048),
            nn.ReLU(inplace=True),
            nn.Dropout(droprate),
            nn.Linear(2048, 2048),
            nn.ReLU(inplace=True),
            nn.Linear(2048, num_classes),
        )

    def forward(self, x):
        conv_features = self.features(x)
        flatten = conv_features.view(conv_features.size(0), -1)
        fc = self.fc_layers(flatten)
        return fc
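The input size of 4096 for the first linear layer follows from the convolution and pooling settings above. As a rough check, assuming 32x32 input images (the CIFAR image size) and PyTorch's default of a pooling stride equal to the kernel size, the side length of the feature map can be traced layer by layer. The helper `conv_out` is hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                                              # CIFAR images are 32x32
size = conv_out(size, kernel=3, stride=1, padding=2)   # Conv2d(3, 64)     -> 34
size = conv_out(size, kernel=2, stride=2)              # MaxPool2d(2)      -> 17
size = conv_out(size, kernel=3, padding=2)             # Conv2d(64, 192)   -> 19
size = conv_out(size, kernel=2, stride=2)              # MaxPool2d(2)      -> 9
size = conv_out(size, kernel=3, padding=1)             # Conv2d(192, 384)  -> 9
size = conv_out(size, kernel=3, padding=1)             # Conv2d(384, 256)  -> 9
size = conv_out(size, kernel=3, padding=1)             # Conv2d(256, 256)  -> 9
size = conv_out(size, kernel=3, stride=2)              # MaxPool2d(3, 2)   -> 4

flat_features = 256 * size * size                      # channels * height * width
print(size, flat_features)
```

With a different input resolution the flattened size changes, and `nn.Linear(4096, 2048)` would have to be adjusted accordingly.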

Since CIFAR-100 has 100 different classes, we have to set *num_classes* to 100.

For our experiment we use different dropout rates to see their effect on training and test accuracy. The default dropout rate is 0.5. Since we want to show the effect of dropout, we override that default with different rates through the variable *dropout_rate*. First we train our model without dropout by setting *dropout_rate* to 0: we train the model for 100 epochs, save the final losses as well as the training and test accuracy, and then restart training with an untrained network and a different dropout rate. We increase the dropout rate by 0.1 for each training session, up to a rate of 0.9. A dropout rate of 0.9 shows how the network performs during training when an expected 90% of the neurons in each dropout layer are zeroed out.
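The sequence of training sessions described above can be written down as a short sketch. The list name `dropout_rates` is hypothetical, and the figure of 2048 refers to the inputs of the second dropout layer in the model above.

```python
# Dropout rates for the ten training sessions: 0.0, 0.1, ..., 0.9
dropout_rates = [i / 10 for i in range(10)]

for rate in dropout_rates:
    # With n inputs to a dropout layer, n * rate of them are zeroed on average.
    expected_zeroed = 2048 * rate
    print("dropout_rate={:.1f}: about {:.0f} of 2048 inputs dropped on average"
          .format(rate, expected_zeroed))
```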

To adjust the dropout rate manually, change the value of *dropout_rate* in the following Python code. This is only necessary for training with dropout rates other than those described above.

In [3]:
dropout_rate = 0.0
print("=> creating model '{}'".format('AlexNet'))
model = AlexNet(droprate = dropout_rate, num_classes = 100)
lr = 0.001

=> creating model 'AlexNet'


For training and test data we use the CIFAR-100 dataset, which consists of 100 different classes with 600 images each.
For further information about the dataset, see https://www.cs.toronto.edu/~kriz/cifar.html

In [4]:
batch_groese = 128
num_workers = 8
print("for dataloading this programm will use a batch size of {} and {} threads".format(batch_groese,num_workers))

transform = transforms.Compose(
    [transforms.RandomHorizontalFlip(), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

train_dataset = datasets.CIFAR100(root='~/data', train=True,download=True, transform=transform)
test_dataset =  datasets.CIFAR100(root='~/data', train=False,download=True, transform=transform)

train_sampler = None
    
train_loader = torch.utils.data.DataLoader(
     train_dataset, batch_size=batch_groese, shuffle=(train_sampler is None),
     num_workers=num_workers, pin_memory=True, sampler=train_sampler)

val_loader = torch.utils.data.DataLoader( test_dataset,
     batch_size=batch_groese, shuffle=False,
     num_workers=num_workers, pin_memory=True)

for dataloading this programm will use a batch size of 128 and 8 threads
Files already downloaded and verified
Files already downloaded and verified
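As a quick sanity check of the numbers involved: CIFAR-100 is split into 500 training and 100 test images per class, and with a batch size of 128 this gives the batch counts that appear in the training output (`[0/391]` and `[0/79]`). The sketch assumes the `DataLoader` default of keeping the last, smaller batch.

```python
import math

num_classes = 100
images_per_class = 600
total_images = num_classes * images_per_class   # 60000 images overall

train_images = 500 * num_classes                # 50000 training images
test_images = 100 * num_classes                 # 10000 test images
batch_size = 128

train_batches = math.ceil(train_images / batch_size)
test_batches = math.ceil(test_images / batch_size)

# The final batch of each epoch is smaller than the others.
last_train_batch = train_images - (train_batches - 1) * batch_size
last_test_batch = test_images - (test_batches - 1) * batch_size
print(train_batches, test_batches, last_train_batch, last_test_batch)
```

The smaller final batches are the reason the running averages later weight each batch by its number of samples.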


Before we can start training the model, we move it to the GPU, define the loss and the optimizer, and define several helper functions that we will use to compute the running average of the top-1 and top-5 accuracy.

In [5]:
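# wrap the model for (multi-)GPU execution and move it to the GPU
model = torch.nn.DataParallel(model).cuda()
criterion = nn.CrossEntropyLoss()
# start from the initial learning rate `lr` defined above; adjust_learning_rate()
# overwrites the optimizer's learning rate at the start of every epoch anyway
optimizer = torch.optim.Adam(model.parameters(), lr=lr)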

In [6]:
class AverageMeter(object):
    """Computes and stores the average and current value"""
    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

In [7]:
def accuracy(output, target, topk=(1,)):
    """Computes the accuracy over the k top predictions for the specified values of k"""
    with torch.no_grad():
        maxk = max(topk)
        batch_size = target.size(0)

        _, pred = output.topk(maxk, 1, True, True)
        pred = pred.t()
        correct = pred.eq(target.view(1, -1).expand_as(pred))

        res = []
        for k in topk:
            # reshape instead of view: `correct` is non-contiguous after the transpose
            correct_k = correct[:k].reshape(-1).float().sum(0, keepdim=True)
            res.append(correct_k.mul_(100.0 / batch_size))
        return res
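To make the top-k metric concrete, here is a small pure-Python sketch that computes the same quantity on hand-made scores. It is independent of the tensor-based `accuracy` function above; the name `topk_accuracy` and the example data are made up for illustration.

```python
def topk_accuracy(scores, targets, k):
    """Percentage of samples whose target is among the k highest-scoring classes."""
    hits = 0
    for row, target in zip(scores, targets):
        # indices of the k largest scores in this row
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += target in topk
    return 100.0 * hits / len(targets)

scores = [
    [0.1, 0.7, 0.2],   # highest: class 1, second: class 2
    [0.5, 0.3, 0.2],   # highest: class 0, second: class 1
    [0.2, 0.3, 0.5],   # highest: class 2, second: class 1
    [0.6, 0.1, 0.3],   # highest: class 0, second: class 2
]
targets = [1, 1, 0, 2]

top1 = topk_accuracy(scores, targets, k=1)   # only the first sample is correct
top2 = topk_accuracy(scores, targets, k=2)   # samples 1, 2 and 4 are correct
print(top1, top2)
```

Top-k accuracy can never be lower than top-1 accuracy, which matches the Acc@5 values in the training output always being at least as high as Acc@1.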

Choosing the learning rate for our model depends on our batch size and can make a big difference to the best accuracy we can achieve during training. To get close to the optimum, we also reduce the learning rate by a factor of 10 every 30 epochs.

In [8]:
def adjust_learning_rate(optimizer, epoch,lr):
    """Sets the learning rate to the initial LR decayed by 10 every 30 epochs"""
    lr = lr * (0.1 ** (epoch // 30))
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr
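The resulting step schedule can be checked in isolation. This sketch reuses the same formula with the initial learning rate of 0.001 used in this notebook; the function name `decayed_lr` and its `step` and `factor` arguments are hypothetical.

```python
def decayed_lr(initial_lr, epoch, step=30, factor=0.1):
    """Learning rate after decaying by `factor` every `step` epochs."""
    return initial_lr * (factor ** (epoch // step))

# Print the learning rate just before and just after each decay step.
for epoch in (0, 29, 30, 59, 60, 89, 90, 99):
    print(epoch, decayed_lr(0.001, epoch))
```

Over the 100 epochs of one training session the learning rate therefore takes four values, dropping by a factor of 10 at epochs 30, 60 and 90.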

Before we start training our model, we define our functions for training and validation. As seen in the following code, the training function computes the loss for each batch and optimizes the model to minimize that loss.
After each batch it computes the top-1 and top-5 accuracy.

In [9]:
def train(train_loader, model, criterion, optimizer, epoch):
    batch_time = AverageMeter()
    data_time = AverageMeter()
    losses = AverageMeter()
    top1 = AverageMeter()
    top5 = AverageMeter()

    #Tensorboard Summarywriter
    directory = 'runs/cifar100'
    runfolder = 'Dropout ' + str(dropout_rate)
    tb = SummaryWriter(log_dir=os.path.join(directory,runfolder))

    # print the running accuracy every `print_training` batches,
    # i.e. whenever batch number % print_training == 0
    print_training = 150
    
    # switch to train mode
    model.train()

    end = time.time()
    for i, (input, target) in enumerate(train_loader):
        # measure data loading time
        data_time.update(time.time() - end)

        
        target = target.cuda(device=None, non_blocking=True)

        # compute output
        output = model(input)
        loss = criterion(output, target)

        # measure accuracy and record loss
        acc1, acc5 = accuracy(output, target, topk=(1, 5))
        losses.update(loss.item(), input.size(0))
        top1.update(acc1[0], input.size(0))
        top5.update(acc5[0], input.size(0))

        # compute gradient and do an optimizer (Adam) step
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # measure elapsed time
        batch_time.update(time.time() - end)
        end = time.time()
        
        
        if i % print_training == 0:
            print('Epoch: [{0}][{1}/{2}]\t'
                  'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
                  'Data {data_time.val:.3f} ({data_time.avg:.3f})\t'
                  'Loss {loss.val:.4f} ({loss.avg:.4f})\t'
                  'Acc@1 {top1.val:.3f} ({top1.avg:.3f})\t'
                  'Acc@5 {top5.val:.3f} ({top5.avg:.3f})'.format(
                   epoch, i, len(train_loader), batch_time=batch_time,
                   data_time=data_time, loss=losses, top1=top1, top5=top5))
    tb.add_scalar('Loss (Training)', losses.avg, epoch)
    tb.add_scalar('Top 1 Training-Accuracy', top1.avg,epoch)
    tb.add_scalar('Top 5 Training-Accuracy', top5.avg,epoch)
    if epoch == 99:
        directory = 'runs/cifar100'
        runfolder = 'all_dropout_experiments'
        tb_2 = SummaryWriter(log_dir=os.path.join(directory,runfolder))
        print('dropout rate for this experiment = {}'.format(dropout_rate))
        tb_2.add_scalar('Loss (Training)', losses.avg, (dropout_rate*100))
        tb_2.add_scalar('Top 1 Training-Accuracy', top1.avg,(dropout_rate*100))
        tb_2.add_scalar('Top 5 Training-Accuracy', top5.avg,(dropout_rate*100))

In [10]:
def validate(val_loader, model, criterion, epoch):
    batch_time = AverageMeter()
    losses = AverageMeter()
    top1 = AverageMeter()
    top5 = AverageMeter()

    # TensorBoard SummaryWriter: log into the same run folder as train()
    directory = 'runs/cifar100'
    runfolder = 'Dropout ' + str(dropout_rate)
    tb = SummaryWriter(log_dir=os.path.join(directory,runfolder))

    # print the running accuracy every `print_test` batches,
    # i.e. whenever batch number % print_test == 0
    print_test = 10
    
    # switch to evaluate mode
    model.eval()

    with torch.no_grad():
        end = time.time()
        for i, (input, target) in enumerate(val_loader):
            target = target.cuda(device = None, non_blocking=True)

            # compute output
            output = model(input)
            loss = criterion(output, target)

            # measure accuracy and record loss
            acc1, acc5 = accuracy(output, target, topk=(1, 5))
            losses.update(loss.item(), input.size(0))
            top1.update(acc1[0], input.size(0))
            top5.update(acc5[0], input.size(0))

            # measure elapsed time
            batch_time.update(time.time() - end)
            end = time.time()

            if i % print_test == 0:
                print('Test: [{0}/{1}]\t'
                      'Time {batch_time.val:.3f} ({batch_time.avg:.3f})\t'
                      'Loss {loss.val:.4f} ({loss.avg:.4f})\t'
                      'Acc@1 {top1.val:.3f} ({top1.avg:.3f})\t'
                      'Acc@5 {top5.val:.3f} ({top5.avg:.3f})'.format(
                       i, len(val_loader), batch_time=batch_time, loss=losses,
                       top1=top1, top5=top5))
        
        print(' * Acc@1 {top1.avg:.3f} Acc@5 {top5.avg:.3f}'
              .format(top1=top1, top5=top5))
    tb.add_scalar('Loss (Test)', losses.avg, epoch)
    tb.add_scalar('Top 1 Test-Accuracy', top1.avg,epoch)
    tb.add_scalar('Top 5 Test-Accuracy', top5.avg,epoch)
    if epoch == 99:
        directory = 'runs/cifar100'
        runfolder = 'all_dropout_experiments'
        tb_2 = SummaryWriter(log_dir=os.path.join(directory,runfolder))
        print('dropout rate for this experiment = {}'.format(dropout_rate))
        tb_2.add_scalar('Loss (Test)', losses.avg, (dropout_rate*100))
        tb_2.add_scalar('Top 1 Test-Accuracy', top1.avg,(dropout_rate*100))
        tb_2.add_scalar('Top 5 Test-Accuracy', top5.avg,(dropout_rate*100))

    return top1.avg

Since training might be interrupted, we define a function that saves the current state of our model during training.

In [11]:
def save_checkpoint(state, is_best, filename='checkpoint.pth.tar'):
    torch.save(state, filename)
    if is_best:
        shutil.copyfile(filename, 'model_best.pth.tar')
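The notebook only saves checkpoints. A possible counterpart for resuming is sketched below, assuming the checkpoint dictionary has the keys `'epoch'`, `'state_dict'`, `'best_acc1'` and `'optimizer'` that the training loop passes to `save_checkpoint`. The helper `load_checkpoint` is hypothetical, and a tiny linear model stands in for AlexNet so that the sketch runs without a GPU.

```python
import os
import tempfile

import torch
import torch.nn as nn

def load_checkpoint(filename, model, optimizer):
    """Restore model and optimizer state; return (next epoch, best top-1 accuracy)."""
    checkpoint = torch.load(filename, map_location='cpu')
    model.load_state_dict(checkpoint['state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer'])
    return checkpoint['epoch'], checkpoint['best_acc1']

# Tiny stand-in model so the sketch runs without a GPU or the dataset.
model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Save a checkpoint with the same keys the training loop uses.
path = os.path.join(tempfile.mkdtemp(), 'checkpoint.pth.tar')
torch.save({
    'epoch': 5,
    'state_dict': model.state_dict(),
    'best_acc1': 42.0,
    'optimizer': optimizer.state_dict(),
}, path)

# Restore into freshly created objects, as one would after an interruption.
restored = nn.Linear(4, 2)
restored_optimizer = torch.optim.Adam(restored.parameters(), lr=0.001)
start_epoch, best_acc1 = load_checkpoint(path, restored, restored_optimizer)
print(start_epoch, best_acc1)
```

Note that a model wrapped in `torch.nn.DataParallel` stores its parameters under keys prefixed with `module.`, so the model has to be wrapped the same way before its state is loaded.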

After defining our training and validation functions, we can finally start training our model with the following code.

In [12]:
best_acc1 = 0
for epoch in range(0, 100):
    start_time = time.time()
    adjust_learning_rate(optimizer, epoch, lr)

    # train for one epoch
    train(train_loader, model, criterion, optimizer, epoch)

    # evaluate on validation set
    acc1 = validate(val_loader, model, criterion, epoch)

    # remember best acc@1 and save checkpoint
    is_best = acc1 > best_acc1
    best_acc1 = max(acc1, best_acc1)
    
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'best_acc1': best_acc1,
        'optimizer' : optimizer.state_dict(),
    }, is_best)
    print("Time/epoch: {} sec".format(time.time() - start_time))

Epoch: [0][0/391]	Time 0.273 (0.273)	Data 0.230 (0.230)	Loss 4.6072 (4.6072)	Acc@1 0.000 (0.000)	Acc@5 3.125 (3.125)
Epoch: [0][150/391]	Time 0.026 (0.025)	Data 0.000 (0.002)	Loss 4.1308 (4.4276)	Acc@1 4.688 (2.582)	Acc@5 18.750 (11.082)
Epoch: [0][300/391]	Time 0.022 (0.024)	Data 0.000 (0.001)	Loss 3.8969 (4.2417)	Acc@1 6.250 (4.119)	Acc@5 26.562 (16.798)
Test: [0/79]	Time 0.185 (0.185)	Loss 3.6821 (3.6821)	Acc@1 8.594 (8.594)	Acc@5 33.594 (33.594)
Test: [10/79]	Time 0.006 (0.023)	Loss 4.0238 (3.8299)	Acc@1 7.812 (10.014)	Acc@5 30.469 (30.966)
Test: [20/79]	Time 0.006 (0.015)	Loss 3.7696 (3.8306)	Acc@1 13.281 (10.193)	Acc@5 39.062 (31.213)
Test: [30/79]	Time 0.006 (0.012)	Loss 3.8817 (3.8379)	Acc@1 3.906 (9.577)	Acc@5 24.219 (30.973)
Test: [40/79]	Time 0.007 (0.011)	Loss 3.8315 (3.8278)	Acc@1 8.594 (9.566)	Acc@5 28.906 (31.021)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.9261 (3.8298)	Acc@1 9.375 (9.681)	Acc@5 25.000 (30.913)
Test: [60/79]	Time 0.006 (0.009)	Loss 3.9610 (3.8415)	Acc@1 7.8

Test: [70/79]	Time 0.007 (0.009)	Loss 2.3345 (2.3673)	Acc@1 35.156 (38.710)	Acc@5 71.875 (69.960)
 * Acc@1 38.530 Acc@5 69.840
Time/epoch: 10.478960514068604 sec
Epoch: [7][0/391]	Time 0.190 (0.190)	Data 0.177 (0.177)	Loss 2.0580 (2.0580)	Acc@1 45.312 (45.312)	Acc@5 75.000 (75.000)
Epoch: [7][150/391]	Time 0.020 (0.023)	Data 0.000 (0.001)	Loss 1.8991 (1.9935)	Acc@1 46.875 (45.137)	Acc@5 77.344 (76.976)
Epoch: [7][300/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 2.1195 (2.0128)	Acc@1 39.844 (44.635)	Acc@5 75.000 (76.716)
Test: [0/79]	Time 0.190 (0.190)	Loss 2.1013 (2.1013)	Acc@1 46.094 (46.094)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.023)	Loss 2.3133 (2.3275)	Acc@1 36.719 (39.134)	Acc@5 74.219 (72.088)
Test: [20/79]	Time 0.007 (0.015)	Loss 2.2044 (2.2758)	Acc@1 45.312 (40.290)	Acc@5 74.219 (72.507)
Test: [30/79]	Time 0.006 (0.012)	Loss 2.3485 (2.3044)	Acc@1 36.719 (39.819)	Acc@5 72.656 (71.925)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.0876 (2.3136)	Acc@1 43.750 (39.958)	Acc@

Test: [60/79]	Time 0.007 (0.010)	Loss 2.6068 (2.3952)	Acc@1 42.969 (43.199)	Acc@5 71.094 (73.450)
Test: [70/79]	Time 0.007 (0.009)	Loss 2.5052 (2.3906)	Acc@1 43.750 (43.321)	Acc@5 66.406 (73.449)
 * Acc@1 43.370 Acc@5 73.400
Time/epoch: 10.076985359191895 sec
Epoch: [14][0/391]	Time 0.196 (0.196)	Data 0.181 (0.181)	Loss 0.8909 (0.8909)	Acc@1 74.219 (74.219)	Acc@5 96.094 (96.094)
Epoch: [14][150/391]	Time 0.021 (0.023)	Data 0.000 (0.001)	Loss 1.1146 (1.1297)	Acc@1 62.500 (66.370)	Acc@5 91.406 (91.401)
Epoch: [14][300/391]	Time 0.019 (0.022)	Data 0.000 (0.001)	Loss 1.2041 (1.1662)	Acc@1 64.062 (65.365)	Acc@5 89.844 (91.022)
Test: [0/79]	Time 0.167 (0.167)	Loss 2.3369 (2.3369)	Acc@1 49.219 (49.219)	Acc@5 73.438 (73.438)
Test: [10/79]	Time 0.006 (0.021)	Loss 2.4039 (2.4507)	Acc@1 42.969 (42.969)	Acc@5 73.438 (73.722)
Test: [20/79]	Time 0.006 (0.014)	Loss 2.5855 (2.4239)	Acc@1 42.188 (43.118)	Acc@5 69.531 (73.103)
Test: [30/79]	Time 0.006 (0.012)	Loss 2.5104 (2.4437)	Acc@1 41.406 (42.868)	A

Test: [30/79]	Time 0.007 (0.012)	Loss 3.2339 (3.0496)	Acc@1 41.406 (43.271)	Acc@5 69.531 (73.034)
Test: [40/79]	Time 0.007 (0.011)	Loss 2.6417 (3.0753)	Acc@1 39.062 (43.216)	Acc@5 79.688 (72.866)
Test: [50/79]	Time 0.007 (0.010)	Loss 3.1945 (3.0772)	Acc@1 45.312 (43.382)	Acc@5 71.875 (72.733)
Test: [60/79]	Time 0.007 (0.010)	Loss 3.2917 (3.0829)	Acc@1 44.531 (43.353)	Acc@5 71.875 (72.759)
Test: [70/79]	Time 0.007 (0.009)	Loss 3.2669 (3.0797)	Acc@1 43.750 (43.486)	Acc@5 69.531 (72.843)
 * Acc@1 43.590 Acc@5 72.900
Time/epoch: 10.479411363601685 sec
Epoch: [21][0/391]	Time 0.217 (0.217)	Data 0.205 (0.205)	Loss 0.5989 (0.5989)	Acc@1 82.031 (82.031)	Acc@5 96.875 (96.875)
Epoch: [21][150/391]	Time 0.021 (0.023)	Data 0.000 (0.001)	Loss 0.7609 (0.5741)	Acc@1 75.781 (82.042)	Acc@5 96.094 (97.527)
Epoch: [21][300/391]	Time 0.019 (0.022)	Data 0.000 (0.001)	Loss 0.6627 (0.6216)	Acc@1 81.250 (80.464)	Acc@5 95.312 (97.308)
Test: [0/79]	Time 0.182 (0.182)	Loss 2.9236 (2.9236)	Acc@1 45.312 (45.312)	A

Test: [0/79]	Time 0.187 (0.187)	Loss 3.5455 (3.5455)	Acc@1 47.656 (47.656)	Acc@5 75.781 (75.781)
Test: [10/79]	Time 0.006 (0.023)	Loss 4.2758 (4.0330)	Acc@1 42.188 (42.330)	Acc@5 69.531 (72.088)
Test: [20/79]	Time 0.007 (0.015)	Loss 3.6187 (3.9261)	Acc@1 46.094 (42.708)	Acc@5 72.656 (72.098)
Test: [30/79]	Time 0.006 (0.013)	Loss 3.3295 (3.9788)	Acc@1 46.094 (42.591)	Acc@5 76.562 (71.749)
Test: [40/79]	Time 0.007 (0.011)	Loss 3.2784 (4.0155)	Acc@1 50.781 (43.026)	Acc@5 75.000 (71.742)
Test: [50/79]	Time 0.007 (0.010)	Loss 4.1380 (4.0035)	Acc@1 43.750 (43.030)	Acc@5 69.531 (71.507)
Test: [60/79]	Time 0.007 (0.010)	Loss 4.3498 (4.0349)	Acc@1 46.094 (43.071)	Acc@5 69.531 (71.491)
Test: [70/79]	Time 0.007 (0.009)	Loss 4.2100 (4.0091)	Acc@1 43.750 (43.244)	Acc@5 67.969 (71.611)
 * Acc@1 43.260 Acc@5 71.700
Time/epoch: 10.100801944732666 sec
Epoch: [28][0/391]	Time 0.191 (0.191)	Data 0.179 (0.179)	Loss 0.3977 (0.3977)	Acc@1 87.500 (87.500)	Acc@5 99.219 (99.219)
Epoch: [28][150/391]	Time 0.020

Epoch: [34][150/391]	Time 0.019 (0.023)	Data 0.000 (0.001)	Loss 0.0670 (0.0353)	Acc@1 99.219 (99.276)	Acc@5 100.000 (99.979)
Epoch: [34][300/391]	Time 0.022 (0.022)	Data 0.000 (0.001)	Loss 0.0401 (0.0357)	Acc@1 99.219 (99.242)	Acc@5 100.000 (99.977)
Test: [0/79]	Time 0.180 (0.180)	Loss 4.5067 (4.5067)	Acc@1 47.656 (47.656)	Acc@5 75.781 (75.781)
Test: [10/79]	Time 0.006 (0.023)	Loss 6.3434 (5.1890)	Acc@1 44.531 (45.668)	Acc@5 68.750 (74.219)
Test: [20/79]	Time 0.006 (0.015)	Loss 4.9464 (5.1975)	Acc@1 39.062 (44.717)	Acc@5 73.438 (73.326)
Test: [30/79]	Time 0.006 (0.012)	Loss 5.0852 (5.2357)	Acc@1 39.844 (44.708)	Acc@5 75.000 (73.387)
Test: [40/79]	Time 0.007 (0.011)	Loss 4.6970 (5.3133)	Acc@1 49.219 (45.046)	Acc@5 73.438 (73.133)
Test: [50/79]	Time 0.007 (0.010)	Loss 5.5001 (5.3021)	Acc@1 42.188 (45.175)	Acc@5 70.312 (72.993)
Test: [60/79]	Time 0.007 (0.010)	Loss 5.4693 (5.3310)	Acc@1 50.000 (45.146)	Acc@5 72.656 (73.117)
Test: [70/79]	Time 0.007 (0.009)	Loss 5.3347 (5.2784)	Acc@1 42.18

Test: [70/79]	Time 0.007 (0.009)	Loss 6.6825 (6.5317)	Acc@1 42.188 (46.039)	Acc@5 70.312 (73.845)
 * Acc@1 46.090 Acc@5 73.930
Time/epoch: 10.180627346038818 sec
Epoch: [41][0/391]	Time 0.210 (0.210)	Data 0.198 (0.198)	Loss 0.0015 (0.0015)	Acc@1 100.000 (100.000)	Acc@5 100.000 (100.000)
Epoch: [41][150/391]	Time 0.022 (0.023)	Data 0.000 (0.001)	Loss 0.0047 (0.0063)	Acc@1 100.000 (99.953)	Acc@5 100.000 (99.995)
Epoch: [41][300/391]	Time 0.020 (0.022)	Data 0.000 (0.001)	Loss 0.0044 (0.0082)	Acc@1 100.000 (99.909)	Acc@5 100.000 (99.997)
Test: [0/79]	Time 0.203 (0.203)	Loss 5.9296 (5.9296)	Acc@1 47.656 (47.656)	Acc@5 76.562 (76.562)
Test: [10/79]	Time 0.006 (0.025)	Loss 7.7970 (6.7866)	Acc@1 44.531 (45.455)	Acc@5 68.750 (74.077)
Test: [20/79]	Time 0.006 (0.016)	Loss 6.1817 (6.6955)	Acc@1 45.312 (45.312)	Acc@5 75.781 (73.958)
Test: [30/79]	Time 0.006 (0.013)	Loss 5.9951 (6.7903)	Acc@1 45.312 (45.161)	Acc@5 76.562 (73.690)
Test: [40/79]	Time 0.007 (0.011)	Loss 6.0406 (6.8166)	Acc@1 43.750 (4

Test: [40/79]	Time 0.007 (0.011)	Loss 7.4014 (7.5307)	Acc@1 43.750 (45.827)	Acc@5 73.438 (73.190)
Test: [50/79]	Time 0.007 (0.010)	Loss 7.5816 (7.4841)	Acc@1 46.094 (45.879)	Acc@5 70.312 (73.300)
Test: [60/79]	Time 0.007 (0.010)	Loss 7.7695 (7.5208)	Acc@1 48.438 (45.812)	Acc@5 69.531 (73.309)
Test: [70/79]	Time 0.007 (0.009)	Loss 7.8158 (7.4530)	Acc@1 40.625 (46.094)	Acc@5 69.531 (73.449)
 * Acc@1 46.210 Acc@5 73.460
Time/epoch: 10.039480209350586 sec
Epoch: [48][0/391]	Time 0.224 (0.224)	Data 0.213 (0.213)	Loss 0.0044 (0.0044)	Acc@1 100.000 (100.000)	Acc@5 100.000 (100.000)
Epoch: [48][150/391]	Time 0.021 (0.023)	Data 0.000 (0.002)	Loss 0.0009 (0.0092)	Acc@1 100.000 (99.824)	Acc@5 100.000 (100.000)
Epoch: [48][300/391]	Time 0.020 (0.022)	Data 0.000 (0.001)	Loss 0.0037 (0.0093)	Acc@1 100.000 (99.795)	Acc@5 100.000 (100.000)
Test: [0/79]	Time 0.185 (0.185)	Loss 6.7567 (6.7567)	Acc@1 50.781 (50.781)	Acc@5 75.000 (75.000)
Test: [10/79]	Time 0.006 (0.023)	Loss 8.9722 (7.6204)	Acc@1 46.094 

Test: [0/79]	Time 0.197 (0.197)	Loss 7.2113 (7.2113)	Acc@1 46.875 (46.875)	Acc@5 75.000 (75.000)
Test: [10/79]	Time 0.006 (0.024)	Loss 9.1888 (8.1199)	Acc@1 46.875 (46.307)	Acc@5 67.969 (73.793)
Test: [20/79]	Time 0.006 (0.015)	Loss 8.0369 (7.9675)	Acc@1 42.969 (46.429)	Acc@5 71.094 (73.400)
Test: [30/79]	Time 0.006 (0.012)	Loss 7.7722 (7.9501)	Acc@1 42.188 (45.917)	Acc@5 76.562 (73.564)
Test: [40/79]	Time 0.006 (0.011)	Loss 7.1579 (8.0457)	Acc@1 50.000 (45.903)	Acc@5 73.438 (73.418)
Test: [50/79]	Time 0.007 (0.010)	Loss 7.9651 (8.0015)	Acc@1 46.094 (45.941)	Acc@5 65.625 (73.269)
Test: [60/79]	Time 0.007 (0.010)	Loss 8.0493 (8.0162)	Acc@1 50.000 (45.889)	Acc@5 72.656 (73.271)
Test: [70/79]	Time 0.007 (0.009)	Loss 8.7835 (7.9597)	Acc@1 46.094 (46.303)	Acc@5 67.969 (73.283)
 * Acc@1 46.300 Acc@5 73.450
Time/epoch: 9.970653772354126 sec
Epoch: [55][0/391]	Time 0.212 (0.212)	Data 0.198 (0.198)	Loss 0.0004 (0.0004)	Acc@1 100.000 (100.000)	Acc@5 100.000 (100.000)
Epoch: [55][150/391]	Time 0.

Epoch: [61][0/391]	Time 0.191 (0.191)	Data 0.180 (0.180)	Loss 0.0010 (0.0010)	Acc@1 100.000 (100.000)	Acc@5 100.000 (100.000)
Epoch: [61][150/391]	Time 0.021 (0.023)	Data 0.000 (0.001)	Loss 0.0016 (0.0013)	Acc@1 100.000 (99.979)	Acc@5 100.000 (100.000)
Epoch: [61][300/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.0006 (0.0015)	Acc@1 100.000 (99.977)	Acc@5 100.000 (100.000)
Test: [0/79]	Time 0.186 (0.186)	Loss 7.0457 (7.0457)	Acc@1 49.219 (49.219)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.023)	Loss 9.5371 (8.2854)	Acc@1 49.219 (47.017)	Acc@5 78.125 (75.710)
Test: [20/79]	Time 0.006 (0.015)	Loss 7.5729 (8.0509)	Acc@1 46.094 (46.429)	Acc@5 74.219 (75.260)
Test: [30/79]	Time 0.006 (0.012)	Loss 8.1788 (8.1230)	Acc@1 40.625 (46.220)	Acc@5 71.875 (74.446)
Test: [40/79]	Time 0.007 (0.011)	Loss 7.5772 (8.2352)	Acc@1 45.312 (46.189)	Acc@5 74.219 (74.143)
Test: [50/79]	Time 0.007 (0.010)	Loss 8.5498 (8.2062)	Acc@1 44.531 (46.048)	Acc@5 69.531 (73.912)
Test: [60/79]	Time 0.007 (0.010)	

Test: [70/79]	Time 0.007 (0.009)	Loss 8.7633 (8.2868)	Acc@1 44.531 (46.457)	Acc@5 67.188 (73.790)
 * Acc@1 46.510 Acc@5 73.900
Time/epoch: 10.038851022720337 sec
Epoch: [68][0/391]	Time 0.199 (0.199)	Data 0.188 (0.188)	Loss 0.0004 (0.0004)	Acc@1 100.000 (100.000)	Acc@5 100.000 (100.000)
Epoch: [68][150/391]	Time 0.021 (0.023)	Data 0.000 (0.001)	Loss 0.0003 (0.0006)	Acc@1 100.000 (99.990)	Acc@5 100.000 (100.000)
Epoch: [68][300/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.0003 (0.0006)	Acc@1 100.000 (99.984)	Acc@5 100.000 (100.000)
Test: [0/79]	Time 0.189 (0.189)	Loss 7.2110 (7.2110)	Acc@1 53.906 (53.906)	Acc@5 78.125 (78.125)
Test: [10/79]	Time 0.006 (0.023)	Loss 9.4180 (8.1290)	Acc@1 48.438 (47.088)	Acc@5 76.562 (75.994)
Test: [20/79]	Time 0.006 (0.015)	Loss 7.2733 (8.0142)	Acc@1 46.094 (46.354)	Acc@5 73.438 (74.926)
Test: [30/79]	Time 0.006 (0.012)	Loss 7.8567 (8.1316)	Acc@1 45.312 (46.119)	Acc@5 75.000 (74.647)
Test: [40/79]	Time 0.007 (0.011)	Loss 7.8235 (8.2695)	Acc@1 44.531 

Test: [40/79]	Time 0.006 (0.011)	Loss 8.4198 (8.6388)	Acc@1 43.750 (46.399)	Acc@5 72.656 (73.952)
Test: [50/79]	Time 0.007 (0.010)	Loss 9.1610 (8.6381)	Acc@1 41.406 (46.201)	Acc@5 69.531 (73.683)
Test: [60/79]	Time 0.007 (0.010)	Loss 8.6297 (8.6645)	Acc@1 51.562 (46.247)	Acc@5 75.000 (73.668)
Test: [70/79]	Time 0.007 (0.009)	Loss 8.7464 (8.5877)	Acc@1 47.656 (46.710)	Acc@5 71.875 (73.779)
 * Acc@1 46.850 Acc@5 73.940
Time/epoch: 10.63701057434082 sec
Epoch: [75][0/391]	Time 0.221 (0.221)	Data 0.208 (0.208)	Loss 0.0003 (0.0003)	Acc@1 100.000 (100.000)	Acc@5 100.000 (100.000)
Epoch: [75][150/391]	Time 0.020 (0.023)	Data 0.000 (0.002)	Loss 0.0004 (0.0006)	Acc@1 100.000 (99.990)	Acc@5 100.000 (100.000)
Epoch: [75][300/391]	Time 0.019 (0.022)	Data 0.000 (0.001)	Loss 0.0001 (0.0007)	Acc@1 100.000 (99.982)	Acc@5 100.000 (100.000)
Test: [0/79]	Time 0.188 (0.188)	Loss 6.8152 (6.8152)	Acc@1 50.000 (50.000)	Acc@5 76.562 (76.562)
Test: [10/79]	Time 0.006 (0.023)	Loss 10.6319 (8.6515)	Acc@1 45.312 

Test: [0/79]	Time 0.201 (0.201)	Loss 7.7243 (7.7243)	Acc@1 51.562 (51.562)	Acc@5 76.562 (76.562)
Test: [10/79]	Time 0.006 (0.024)	Loss 9.7869 (8.6828)	Acc@1 50.000 (46.307)	Acc@5 75.781 (75.781)
Test: [20/79]	Time 0.006 (0.016)	Loss 7.8648 (8.6479)	Acc@1 45.312 (46.094)	Acc@5 74.219 (74.665)
Test: [30/79]	Time 0.006 (0.013)	Loss 8.4196 (8.7303)	Acc@1 47.656 (46.321)	Acc@5 77.344 (73.942)
Test: [40/79]	Time 0.006 (0.011)	Loss 7.6537 (8.8190)	Acc@1 44.531 (46.094)	Acc@5 76.562 (73.647)
Test: [50/79]	Time 0.007 (0.010)	Loss 9.1015 (8.7867)	Acc@1 47.656 (46.354)	Acc@5 69.531 (73.560)
Test: [60/79]	Time 0.007 (0.010)	Loss 9.3232 (8.8427)	Acc@1 51.562 (46.388)	Acc@5 74.219 (73.566)
Test: [70/79]	Time 0.007 (0.009)	Loss 8.9101 (8.7599)	Acc@1 46.094 (46.765)	Acc@5 70.312 (73.779)
 * Acc@1 46.860 Acc@5 73.830
Time/epoch: 10.020301103591919 sec
Epoch: [82][0/391]	Time 0.197 (0.197)	Data 0.186 (0.186)	Loss 0.0001 (0.0001)	Acc@1 100.000 (100.000)	Acc@5 100.000 (100.000)
Epoch: [82][150/391]	Time 0

Epoch: [88][0/391]	Time 0.197 (0.197)	Data 0.186 (0.186)	Loss 0.0002 (0.0002)	Acc@1 100.000 (100.000)	Acc@5 100.000 (100.000)
Epoch: [88][150/391]	Time 0.021 (0.023)	Data 0.000 (0.001)	Loss 0.0002 (0.0005)	Acc@1 100.000 (99.974)	Acc@5 100.000 (100.000)
Epoch: [88][300/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.0001 (0.0005)	Acc@1 100.000 (99.977)	Acc@5 100.000 (100.000)
Test: [0/79]	Time 0.203 (0.203)	Loss 8.3605 (8.3605)	Acc@1 48.438 (48.438)	Acc@5 75.000 (75.000)
Test: [10/79]	Time 0.006 (0.024)	Loss 10.6759 (8.8776)	Acc@1 48.438 (46.449)	Acc@5 74.219 (75.781)
Test: [20/79]	Time 0.006 (0.016)	Loss 8.1151 (8.6985)	Acc@1 45.312 (46.652)	Acc@5 75.000 (74.888)
Test: [30/79]	Time 0.006 (0.013)	Loss 8.8288 (8.7957)	Acc@1 44.531 (46.447)	Acc@5 75.781 (74.824)
Test: [40/79]	Time 0.006 (0.011)	Loss 7.1699 (8.9272)	Acc@1 46.094 (46.380)	Acc@5 75.000 (74.295)
Test: [50/79]	Time 0.007 (0.010)	Loss 9.4004 (8.8614)	Acc@1 44.531 (46.645)	Acc@5 70.312 (74.188)
Test: [60/79]	Time 0.007 (0.010)

Test: [70/79]	Time 0.007 (0.009)	Loss 9.9074 (9.0357)	Acc@1 43.750 (46.171)	Acc@5 69.531 (73.669)
 * Acc@1 46.230 Acc@5 73.840
Time/epoch: 10.013275623321533 sec
Epoch: [95][0/391]	Time 0.207 (0.207)	Data 0.195 (0.195)	Loss 0.0001 (0.0001)	Acc@1 100.000 (100.000)	Acc@5 100.000 (100.000)
Epoch: [95][150/391]	Time 0.023 (0.023)	Data 0.000 (0.001)	Loss 0.0001 (0.0003)	Acc@1 100.000 (99.990)	Acc@5 100.000 (100.000)
Epoch: [95][300/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.0001 (0.0005)	Acc@1 100.000 (99.979)	Acc@5 100.000 (100.000)
Test: [0/79]	Time 0.193 (0.193)	Loss 8.0981 (8.0981)	Acc@1 50.000 (50.000)	Acc@5 75.000 (75.000)
Test: [10/79]	Time 0.006 (0.023)	Loss 10.1820 (9.0732)	Acc@1 45.312 (46.023)	Acc@5 71.875 (74.290)
Test: [20/79]	Time 0.006 (0.015)	Loss 8.2452 (8.8691)	Acc@1 42.969 (46.131)	Acc@5 76.562 (73.958)
Test: [30/79]	Time 0.006 (0.012)	Loss 8.5026 (8.9182)	Acc@1 44.531 (45.943)	Acc@5 78.906 (74.068)
Test: [40/79]	Time 0.006 (0.011)	Loss 8.5462 (9.0273)	Acc@1 42.969

After completing this first training session, we have a final result for the top-1 and top-5 accuracy on our training and test datasets when no dropout is applied to the network during training. To show the effect of dropout, we now increase the dropout rate to 0.1, as described earlier.

In [13]:
dropout_rate = 0.1
print("=> creating model '{}'".format('AlexNet'))
model = AlexNet(droprate = dropout_rate, num_classes = 100)
lr = 0.001

=> creating model 'AlexNet'


In [14]:
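# wrap the model for (multi-)GPU execution and move it to the GPU
model = torch.nn.DataParallel(model).cuda()
criterion = nn.CrossEntropyLoss()
# start from the initial learning rate `lr` defined above; adjust_learning_rate()
# overwrites the optimizer's learning rate at the start of every epoch anyway
optimizer = torch.optim.Adam(model.parameters(), lr=lr)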

In [15]:
best_acc1 = 0
for epoch in range(0, 100):
    start_time = time.time()
    adjust_learning_rate(optimizer, epoch, lr)

    # train for one epoch
    train(train_loader, model, criterion, optimizer, epoch)

    # evaluate on validation set
    acc1 = validate(val_loader, model, criterion, epoch)

    # remember best acc@1 and save checkpoint
    is_best = acc1 > best_acc1
    best_acc1 = max(acc1, best_acc1)
    
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'best_acc1': best_acc1,
        'optimizer' : optimizer.state_dict(),
    }, is_best)
    print("Time/epoch: {} sec".format(time.time() - start_time))

Epoch: [0][0/391]	Time 0.181 (0.181)	Data 0.168 (0.168)	Loss 4.6047 (4.6047)	Acc@1 1.562 (1.562)	Acc@5 8.594 (8.594)
Epoch: [0][150/391]	Time 0.019 (0.022)	Data 0.000 (0.001)	Loss 4.0906 (4.3621)	Acc@1 7.031 (3.280)	Acc@5 25.000 (13.814)
Epoch: [0][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 3.8442 (4.1542)	Acc@1 11.719 (5.391)	Acc@5 32.031 (20.037)
Test: [0/79]	Time 0.201 (0.201)	Loss 3.6670 (3.6670)	Acc@1 8.594 (8.594)	Acc@5 30.469 (30.469)
Test: [10/79]	Time 0.006 (0.024)	Loss 3.8108 (3.7495)	Acc@1 10.938 (10.724)	Acc@5 34.375 (34.872)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.7449 (3.7325)	Acc@1 10.938 (10.677)	Acc@5 28.906 (34.338)
Test: [30/79]	Time 0.006 (0.013)	Loss 3.7333 (3.7298)	Acc@1 6.250 (10.433)	Acc@5 27.344 (34.325)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.6474 (3.7212)	Acc@1 15.625 (10.575)	Acc@5 36.719 (34.527)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.8701 (3.7200)	Acc@1 7.031 (10.524)	Acc@5 28.125 (34.329)
Test: [60/79]	Time 0.006 (0.009)	Loss 3.8790 (3.7228)	Acc

 * Acc@1 41.140 Acc@5 72.340
Time/epoch: 10.003570079803467 sec
Epoch: [7][0/391]	Time 0.198 (0.198)	Data 0.186 (0.186)	Loss 1.7553 (1.7553)	Acc@1 50.781 (50.781)	Acc@5 78.906 (78.906)
Epoch: [7][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 2.0789 (1.9088)	Acc@1 43.750 (47.372)	Acc@5 81.250 (79.139)
Epoch: [7][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.0877 (1.9301)	Acc@1 42.969 (46.953)	Acc@5 74.219 (78.647)
Test: [0/79]	Time 0.193 (0.193)	Loss 1.9248 (1.9248)	Acc@1 50.000 (50.000)	Acc@5 78.125 (78.125)
Test: [10/79]	Time 0.007 (0.023)	Loss 2.2734 (2.1990)	Acc@1 42.188 (43.750)	Acc@5 72.656 (73.295)
Test: [20/79]	Time 0.006 (0.015)	Loss 2.1215 (2.1605)	Acc@1 44.531 (43.490)	Acc@5 70.312 (73.624)
Test: [30/79]	Time 0.006 (0.012)	Loss 2.1373 (2.1542)	Acc@1 37.500 (43.296)	Acc@5 68.750 (73.639)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.8302 (2.1650)	Acc@1 53.125 (43.426)	Acc@5 79.688 (73.647)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.1092 (2.1530)	Acc@1 46.875 (43.566)	Acc@

 * Acc@1 46.350 Acc@5 75.920
Time/epoch: 9.994637727737427 sec
Epoch: [14][0/391]	Time 0.200 (0.200)	Data 0.189 (0.189)	Loss 1.0267 (1.0267)	Acc@1 65.625 (65.625)	Acc@5 93.750 (93.750)
Epoch: [14][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 1.1697 (1.1167)	Acc@1 62.500 (66.758)	Acc@5 92.969 (91.567)
Epoch: [14][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.9957 (1.1539)	Acc@1 70.312 (65.620)	Acc@5 96.094 (91.204)
Test: [0/79]	Time 0.178 (0.178)	Loss 2.1614 (2.1614)	Acc@1 46.875 (46.875)	Acc@5 71.875 (71.875)
Test: [10/79]	Time 0.006 (0.022)	Loss 2.4249 (2.3295)	Acc@1 46.875 (47.372)	Acc@5 71.875 (73.438)
Test: [20/79]	Time 0.006 (0.014)	Loss 2.1501 (2.2872)	Acc@1 47.656 (46.689)	Acc@5 77.344 (74.888)
Test: [30/79]	Time 0.006 (0.012)	Loss 2.3012 (2.2944)	Acc@1 40.625 (46.119)	Acc@5 75.781 (75.202)
Test: [40/79]	Time 0.006 (0.010)	Loss 1.8673 (2.2870)	Acc@1 50.000 (46.170)	Acc@5 82.812 (75.438)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.2743 (2.2827)	Acc@1 45.312 (46.048)	Ac

 * Acc@1 46.620 Acc@5 75.940
Time/epoch: 9.525918006896973 sec
Epoch: [21][0/391]	Time 0.195 (0.195)	Data 0.182 (0.182)	Loss 0.5691 (0.5691)	Acc@1 82.031 (82.031)	Acc@5 97.656 (97.656)
Epoch: [21][150/391]	Time 0.020 (0.022)	Data 0.000 (0.001)	Loss 0.7596 (0.7033)	Acc@1 74.219 (78.301)	Acc@5 95.312 (96.601)
Epoch: [21][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.7382 (0.7561)	Acc@1 78.125 (76.604)	Acc@5 96.875 (96.270)
Test: [0/79]	Time 0.192 (0.192)	Loss 2.5143 (2.5143)	Acc@1 52.344 (52.344)	Acc@5 75.781 (75.781)
Test: [10/79]	Time 0.006 (0.023)	Loss 2.7278 (2.7161)	Acc@1 49.219 (46.378)	Acc@5 71.875 (74.503)
Test: [20/79]	Time 0.006 (0.015)	Loss 2.3683 (2.5946)	Acc@1 46.094 (47.210)	Acc@5 74.219 (75.335)
Test: [30/79]	Time 0.006 (0.012)	Loss 2.4802 (2.6251)	Acc@1 43.750 (46.799)	Acc@5 77.344 (75.025)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.2244 (2.6214)	Acc@1 53.125 (46.665)	Acc@5 82.031 (75.076)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.7540 (2.6095)	Acc@1 44.531 (46.507)	Ac

 * Acc@1 46.150 Acc@5 74.670
Time/epoch: 9.685084819793701 sec
Epoch: [28][0/391]	Time 0.218 (0.218)	Data 0.207 (0.207)	Loss 0.4659 (0.4659)	Acc@1 81.250 (81.250)	Acc@5 96.094 (96.094)
Epoch: [28][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.4664 (0.4948)	Acc@1 85.938 (84.468)	Acc@5 98.438 (98.360)
Epoch: [28][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.8398 (0.5329)	Acc@1 74.219 (83.259)	Acc@5 98.438 (98.165)
Test: [0/79]	Time 0.208 (0.208)	Loss 2.5877 (2.5877)	Acc@1 50.000 (50.000)	Acc@5 76.562 (76.562)
Test: [10/79]	Time 0.006 (0.025)	Loss 3.5882 (3.2060)	Acc@1 46.094 (46.378)	Acc@5 75.000 (73.153)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.7635 (3.0971)	Acc@1 51.562 (47.061)	Acc@5 75.000 (73.772)
Test: [30/79]	Time 0.006 (0.013)	Loss 3.0643 (3.0811)	Acc@1 40.625 (46.699)	Acc@5 75.000 (74.244)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.5470 (3.0583)	Acc@1 47.656 (46.875)	Acc@5 82.031 (74.657)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.1150 (3.0510)	Acc@1 46.094 (46.921)	Ac

 * Acc@1 49.100 Acc@5 76.510
Time/epoch: 9.584686040878296 sec
Epoch: [35][0/391]	Time 0.224 (0.224)	Data 0.213 (0.213)	Loss 0.2330 (0.2330)	Acc@1 91.406 (91.406)	Acc@5 98.438 (98.438)
Epoch: [35][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.1027 (0.1417)	Acc@1 96.875 (95.478)	Acc@5 100.000 (99.881)
Epoch: [35][300/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.1346 (0.1408)	Acc@1 93.750 (95.479)	Acc@5 100.000 (99.860)
Test: [0/79]	Time 0.210 (0.210)	Loss 3.4437 (3.4437)	Acc@1 51.562 (51.562)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.025)	Loss 4.5148 (3.8157)	Acc@1 49.219 (48.082)	Acc@5 71.875 (76.705)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.5207 (3.7120)	Acc@1 49.219 (49.368)	Acc@5 77.344 (76.711)
Test: [30/79]	Time 0.006 (0.013)	Loss 3.4855 (3.7583)	Acc@1 47.656 (49.017)	Acc@5 78.906 (76.588)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.8860 (3.7368)	Acc@1 50.781 (48.647)	Acc@5 87.500 (77.001)
Test: [50/79]	Time 0.006 (0.010)	Loss 4.0160 (3.7143)	Acc@1 44.531 (48.759)	

 * Acc@1 48.840 Acc@5 76.770
Time/epoch: 9.570420026779175 sec
Epoch: [42][0/391]	Time 0.220 (0.220)	Data 0.208 (0.208)	Loss 0.0943 (0.0943)	Acc@1 97.656 (97.656)	Acc@5 99.219 (99.219)
Epoch: [42][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.0661 (0.0851)	Acc@1 97.656 (97.346)	Acc@5 100.000 (99.948)
Epoch: [42][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.1550 (0.0903)	Acc@1 94.531 (97.070)	Acc@5 100.000 (99.914)
Test: [0/79]	Time 0.194 (0.194)	Loss 3.8493 (3.8493)	Acc@1 53.906 (53.906)	Acc@5 75.000 (75.000)
Test: [10/79]	Time 0.006 (0.023)	Loss 4.6975 (4.3441)	Acc@1 50.781 (48.864)	Acc@5 75.000 (76.065)
Test: [20/79]	Time 0.006 (0.015)	Loss 3.9884 (4.2242)	Acc@1 50.000 (49.740)	Acc@5 78.125 (76.674)
Test: [30/79]	Time 0.006 (0.012)	Loss 3.4289 (4.2336)	Acc@1 50.781 (49.118)	Acc@5 82.031 (76.865)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.5840 (4.2525)	Acc@1 53.906 (48.876)	Acc@5 79.688 (76.772)
Test: [50/79]	Time 0.006 (0.010)	Loss 4.8924 (4.2132)	Acc@1 44.531 (48.897)	

 * Acc@1 48.840 Acc@5 77.240
Time/epoch: 9.549096822738647 sec
Epoch: [49][0/391]	Time 0.211 (0.211)	Data 0.199 (0.199)	Loss 0.0766 (0.0766)	Acc@1 98.438 (98.438)	Acc@5 100.000 (100.000)
Epoch: [49][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.0542 (0.0675)	Acc@1 96.875 (97.843)	Acc@5 100.000 (99.979)
Epoch: [49][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.0652 (0.0688)	Acc@1 98.438 (97.869)	Acc@5 100.000 (99.974)
Test: [0/79]	Time 0.190 (0.190)	Loss 4.1276 (4.1276)	Acc@1 53.125 (53.125)	Acc@5 78.906 (78.906)
Test: [10/79]	Time 0.006 (0.023)	Loss 5.1661 (4.5409)	Acc@1 49.219 (49.077)	Acc@5 78.125 (76.278)
Test: [20/79]	Time 0.006 (0.015)	Loss 4.2466 (4.4334)	Acc@1 53.125 (49.926)	Acc@5 76.562 (76.414)
Test: [30/79]	Time 0.006 (0.012)	Loss 4.0690 (4.4481)	Acc@1 47.656 (49.496)	Acc@5 78.906 (76.789)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.6977 (4.4566)	Acc@1 50.000 (49.428)	Acc@5 78.906 (76.867)
Test: [50/79]	Time 0.006 (0.010)	Loss 4.8057 (4.4394)	Acc@1 40.625 (49.280

 * Acc@1 49.300 Acc@5 77.240
Time/epoch: 9.550699949264526 sec
Epoch: [56][0/391]	Time 0.221 (0.221)	Data 0.210 (0.210)	Loss 0.0880 (0.0880)	Acc@1 96.875 (96.875)	Acc@5 100.000 (100.000)
Epoch: [56][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.0656 (0.0549)	Acc@1 96.875 (98.189)	Acc@5 100.000 (99.979)
Epoch: [56][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.0202 (0.0549)	Acc@1 99.219 (98.206)	Acc@5 100.000 (99.977)
Test: [0/79]	Time 0.200 (0.200)	Loss 4.1142 (4.1142)	Acc@1 57.812 (57.812)	Acc@5 78.906 (78.906)
Test: [10/79]	Time 0.006 (0.024)	Loss 5.4830 (4.8316)	Acc@1 46.875 (48.366)	Acc@5 77.344 (77.415)
Test: [20/79]	Time 0.006 (0.016)	Loss 4.5846 (4.7261)	Acc@1 47.656 (49.665)	Acc@5 76.562 (77.083)
Test: [30/79]	Time 0.006 (0.013)	Loss 4.1388 (4.7348)	Acc@1 49.219 (49.370)	Acc@5 82.031 (77.596)
Test: [40/79]	Time 0.006 (0.011)	Loss 4.1209 (4.7438)	Acc@1 53.906 (49.238)	Acc@5 81.250 (77.591)
Test: [50/79]	Time 0.006 (0.010)	Loss 4.7695 (4.7495)	Acc@1 47.656 (49.341

 * Acc@1 49.440 Acc@5 76.940
Time/epoch: 9.552031517028809 sec
Epoch: [63][0/391]	Time 0.201 (0.201)	Data 0.188 (0.188)	Loss 0.0477 (0.0477)	Acc@1 98.438 (98.438)	Acc@5 100.000 (100.000)
Epoch: [63][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.0353 (0.0424)	Acc@1 99.219 (98.717)	Acc@5 100.000 (99.990)
Epoch: [63][300/391]	Time 0.022 (0.021)	Data 0.000 (0.001)	Loss 0.0246 (0.0427)	Acc@1 99.219 (98.710)	Acc@5 100.000 (99.982)
Test: [0/79]	Time 0.184 (0.184)	Loss 3.9446 (3.9446)	Acc@1 53.906 (53.906)	Acc@5 81.250 (81.250)
Test: [10/79]	Time 0.007 (0.023)	Loss 5.7479 (4.7270)	Acc@1 47.656 (49.148)	Acc@5 75.781 (77.415)
Test: [20/79]	Time 0.006 (0.015)	Loss 4.7262 (4.7055)	Acc@1 53.125 (50.112)	Acc@5 75.000 (77.046)
Test: [30/79]	Time 0.006 (0.012)	Loss 4.4016 (4.7159)	Acc@1 52.344 (49.950)	Acc@5 79.688 (77.193)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.9472 (4.7136)	Acc@1 52.344 (49.638)	Acc@5 82.031 (77.172)
Test: [50/79]	Time 0.006 (0.010)	Loss 5.7285 (4.7274)	Acc@1 45.312 (49.449

 * Acc@1 49.610 Acc@5 77.470
Time/epoch: 9.663557052612305 sec
Epoch: [70][0/391]	Time 0.209 (0.209)	Data 0.196 (0.196)	Loss 0.0268 (0.0268)	Acc@1 98.438 (98.438)	Acc@5 100.000 (100.000)
Epoch: [70][150/391]	Time 0.019 (0.022)	Data 0.000 (0.001)	Loss 0.0311 (0.0426)	Acc@1 99.219 (98.577)	Acc@5 100.000 (99.984)
Epoch: [70][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.0380 (0.0407)	Acc@1 97.656 (98.663)	Acc@5 100.000 (99.992)
Test: [0/79]	Time 0.189 (0.189)	Loss 4.6112 (4.6112)	Acc@1 53.125 (53.125)	Acc@5 78.906 (78.906)
Test: [10/79]	Time 0.006 (0.024)	Loss 5.5983 (4.9583)	Acc@1 47.656 (49.432)	Acc@5 78.125 (76.918)
Test: [20/79]	Time 0.006 (0.016)	Loss 4.2248 (4.7809)	Acc@1 52.344 (49.888)	Acc@5 77.344 (76.972)
Test: [30/79]	Time 0.007 (0.013)	Loss 4.0515 (4.7555)	Acc@1 51.562 (49.672)	Acc@5 79.688 (77.344)
Test: [40/79]	Time 0.006 (0.011)	Loss 4.2261 (4.7715)	Acc@1 52.344 (49.619)	Acc@5 80.469 (77.153)
Test: [50/79]	Time 0.006 (0.010)	Loss 5.3800 (4.7556)	Acc@1 42.188 (49.632

 * Acc@1 49.830 Acc@5 77.430
Time/epoch: 9.657078266143799 sec
Epoch: [77][0/391]	Time 0.186 (0.186)	Data 0.175 (0.175)	Loss 0.0396 (0.0396)	Acc@1 99.219 (99.219)	Acc@5 100.000 (100.000)
Epoch: [77][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.0215 (0.0367)	Acc@1 99.219 (98.857)	Acc@5 100.000 (100.000)
Epoch: [77][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.0585 (0.0356)	Acc@1 98.438 (98.868)	Acc@5 100.000 (100.000)
Test: [0/79]	Time 0.204 (0.204)	Loss 4.2471 (4.2471)	Acc@1 55.469 (55.469)	Acc@5 78.125 (78.125)
Test: [10/79]	Time 0.006 (0.024)	Loss 5.4134 (4.8919)	Acc@1 48.438 (48.935)	Acc@5 75.000 (76.491)
Test: [20/79]	Time 0.006 (0.016)	Loss 4.2927 (4.7606)	Acc@1 53.125 (49.628)	Acc@5 78.906 (76.823)
Test: [30/79]	Time 0.007 (0.013)	Loss 4.1982 (4.8426)	Acc@1 50.781 (49.546)	Acc@5 80.469 (77.092)
Test: [40/79]	Time 0.006 (0.011)	Loss 4.1270 (4.8681)	Acc@1 54.688 (49.390)	Acc@5 82.812 (77.172)
Test: [50/79]	Time 0.006 (0.010)	Loss 5.1947 (4.8256)	Acc@1 46.875 (49.4

 * Acc@1 50.000 Acc@5 77.110
Time/epoch: 9.56813359260559 sec
Epoch: [84][0/391]	Time 0.206 (0.206)	Data 0.195 (0.195)	Loss 0.0242 (0.0242)	Acc@1 99.219 (99.219)	Acc@5 100.000 (100.000)
Epoch: [84][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.0954 (0.0334)	Acc@1 98.438 (99.022)	Acc@5 100.000 (100.000)
Epoch: [84][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.0166 (0.0348)	Acc@1 100.000 (98.941)	Acc@5 100.000 (99.995)
Test: [0/79]	Time 0.190 (0.190)	Loss 4.4474 (4.4474)	Acc@1 53.125 (53.125)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.023)	Loss 5.1461 (5.0035)	Acc@1 50.781 (49.645)	Acc@5 79.688 (76.918)
Test: [20/79]	Time 0.006 (0.015)	Loss 4.6462 (4.8576)	Acc@1 53.906 (50.484)	Acc@5 79.688 (77.083)
Test: [30/79]	Time 0.006 (0.012)	Loss 4.3708 (4.8590)	Acc@1 52.344 (50.126)	Acc@5 78.125 (77.495)
Test: [40/79]	Time 0.006 (0.011)	Loss 4.2036 (4.8604)	Acc@1 52.344 (49.752)	Acc@5 82.812 (77.496)
Test: [50/79]	Time 0.006 (0.010)	Loss 5.5041 (4.8460)	Acc@1 48.438 (49.64

 * Acc@1 49.790 Acc@5 77.490
Time/epoch: 9.531861543655396 sec
Epoch: [91][0/391]	Time 0.208 (0.208)	Data 0.194 (0.194)	Loss 0.0147 (0.0147)	Acc@1 100.000 (100.000)	Acc@5 100.000 (100.000)
Epoch: [91][150/391]	Time 0.020 (0.022)	Data 0.000 (0.001)	Loss 0.0229 (0.0313)	Acc@1 99.219 (99.007)	Acc@5 100.000 (99.990)
Epoch: [91][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.0097 (0.0307)	Acc@1 100.000 (99.014)	Acc@5 100.000 (99.987)
Test: [0/79]	Time 0.215 (0.215)	Loss 4.6110 (4.6110)	Acc@1 51.562 (51.562)	Acc@5 78.125 (78.125)
Test: [10/79]	Time 0.006 (0.025)	Loss 5.3679 (4.9537)	Acc@1 52.344 (49.503)	Acc@5 79.688 (76.989)
Test: [20/79]	Time 0.006 (0.016)	Loss 4.6007 (4.8214)	Acc@1 56.250 (50.521)	Acc@5 77.344 (76.749)
Test: [30/79]	Time 0.006 (0.013)	Loss 4.6242 (4.8443)	Acc@1 50.000 (50.302)	Acc@5 75.000 (76.991)
Test: [40/79]	Time 0.006 (0.011)	Loss 4.0983 (4.8346)	Acc@1 53.125 (50.019)	Acc@5 82.031 (77.287)
Test: [50/79]	Time 0.006 (0.010)	Loss 5.3147 (4.8366)	Acc@1 45.312 (49.

 * Acc@1 50.110 Acc@5 77.280
Time/epoch: 9.623223066329956 sec
Epoch: [98][0/391]	Time 0.218 (0.218)	Data 0.206 (0.206)	Loss 0.0034 (0.0034)	Acc@1 100.000 (100.000)	Acc@5 100.000 (100.000)
Epoch: [98][150/391]	Time 0.019 (0.022)	Data 0.000 (0.002)	Loss 0.0422 (0.0323)	Acc@1 99.219 (98.981)	Acc@5 100.000 (100.000)
Epoch: [98][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.0433 (0.0318)	Acc@1 98.438 (99.019)	Acc@5 100.000 (99.995)
Test: [0/79]	Time 0.210 (0.210)	Loss 4.4550 (4.4550)	Acc@1 54.688 (54.688)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.025)	Loss 5.6232 (5.0006)	Acc@1 48.438 (49.290)	Acc@5 77.344 (76.989)
Test: [20/79]	Time 0.006 (0.016)	Loss 4.4047 (4.8729)	Acc@1 51.562 (50.335)	Acc@5 80.469 (76.637)
Test: [30/79]	Time 0.006 (0.013)	Loss 4.2523 (4.8576)	Acc@1 50.781 (50.025)	Acc@5 78.125 (77.218)
Test: [40/79]	Time 0.006 (0.011)	Loss 4.2601 (4.8616)	Acc@1 55.469 (49.619)	Acc@5 82.031 (77.325)
Test: [50/79]	Time 0.006 (0.010)	Loss 4.9309 (4.8576)	Acc@1 46.875 (49.
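
Reading the per-epoch results out of a log like the one above by eye is tedious. As a small sketch, the summary lines of the form ` * Acc@1 ... Acc@5 ...` can be collected with a regular expression. The helper name `parse_summaries` is hypothetical, and the sample lines are copied from the log of the run with a dropout rate of 0.1. Because the printed output is truncated, the sketch does not try to assign epoch numbers to the summary lines.

```python
import re

# Matches validation summary lines such as " * Acc@1 41.140 Acc@5 72.340"
SUMMARY_RE = re.compile(r"\*\s+Acc@1\s+([\d.]+)\s+Acc@5\s+([\d.]+)")


def parse_summaries(log_text):
    """Return a list of (top1, top5) tuples, in the order they appear in the log."""
    return [(float(m.group(1)), float(m.group(2)))
            for m in SUMMARY_RE.finditer(log_text)]


# A few summary lines copied from the dropout 0.1 log above
sample_log = """\
 * Acc@1 41.140 Acc@5 72.340
Time/epoch: 10.003570079803467 sec
 * Acc@1 46.350 Acc@5 75.920
Time/epoch: 9.994637727737427 sec
 * Acc@1 50.110 Acc@5 77.280
Time/epoch: 9.623223066329956 sec
"""

summaries = parse_summaries(sample_log)
best_top1, best_top5 = max(summaries)  # tuples compare by Top-1 first
print("summaries:", summaries)
print("best Top-1 in sample: {:.3f} (Top-5 {:.3f})".format(best_top1, best_top5))
```

Applied to the saved output of each run, the same helper gives one list per dropout rate, which makes the runs easier to compare than scrolling through the raw logs.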

In [16]:
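# Build a fresh AlexNet with a dropout rate of 0.2
dropout_rate = 0.2
print("=> creating model '{}'".format('AlexNet'))
model = AlexNet(droprate=dropout_rate, num_classes=100)
lr = 0.001  # base learning rate passed to adjust_learning_rate() in the training loop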

=> creating model 'AlexNet'


In [17]:
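model = torch.nn.DataParallel(model).cuda()  # wrap for multi-GPU use and move to the GPU
criterion = nn.CrossEntropyLoss()
# Note: Adam is created with lr=0.002, while `lr = 0.001` above is the value
# handed to adjust_learning_rate() at the start of every epoch.
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)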

In [18]:
best_acc1 = 0
for epoch in range(0, 100):
    start_time = time.time()
    adjust_learning_rate(optimizer, epoch, lr)

    # train for one epoch
    train(train_loader, model, criterion, optimizer, epoch)

    # evaluate on validation set
    acc1 = validate(val_loader, model, criterion, epoch)

    # remember best acc@1 and save checkpoint
    is_best = acc1 > best_acc1
    best_acc1 = max(acc1, best_acc1)
    
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'best_acc1': best_acc1,
        'optimizer' : optimizer.state_dict(),
    }, is_best)
    print("Time/epoch: {} sec".format(time.time() - start_time))

Epoch: [0][0/391]	Time 0.186 (0.186)	Data 0.173 (0.173)	Loss 4.6044 (4.6044)	Acc@1 1.562 (1.562)	Acc@5 7.812 (7.812)
Epoch: [0][150/391]	Time 0.022 (0.022)	Data 0.000 (0.001)	Loss 4.0729 (4.4421)	Acc@1 8.594 (2.271)	Acc@5 22.656 (9.773)
Epoch: [0][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 3.9314 (4.2559)	Acc@1 7.031 (3.834)	Acc@5 24.219 (15.994)
Test: [0/79]	Time 0.199 (0.199)	Loss 3.6737 (3.6737)	Acc@1 14.844 (14.844)	Acc@5 31.250 (31.250)
Test: [10/79]	Time 0.006 (0.024)	Loss 3.8954 (3.8551)	Acc@1 11.719 (9.162)	Acc@5 28.906 (29.261)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.8859 (3.8366)	Acc@1 10.156 (9.896)	Acc@5 31.250 (30.543)
Test: [30/79]	Time 0.006 (0.013)	Loss 3.8556 (3.8378)	Acc@1 7.031 (9.829)	Acc@5 29.688 (30.116)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.8359 (3.8356)	Acc@1 9.375 (9.851)	Acc@5 24.219 (30.145)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.9478 (3.8352)	Acc@1 7.812 (9.544)	Acc@5 26.562 (30.270)
Test: [60/79]	Time 0.006 (0.009)	Loss 3.9895 (3.8400)	Acc@1 8.5

 * Acc@1 38.940 Acc@5 69.860
Time/epoch: 10.008291244506836 sec
Epoch: [7][0/391]	Time 0.200 (0.200)	Data 0.187 (0.187)	Loss 2.2615 (2.2615)	Acc@1 37.500 (37.500)	Acc@5 71.094 (71.094)
Epoch: [7][150/391]	Time 0.020 (0.022)	Data 0.000 (0.001)	Loss 1.9104 (2.1263)	Acc@1 42.969 (42.679)	Acc@5 76.562 (74.803)
Epoch: [7][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.2596 (2.1444)	Acc@1 44.531 (42.333)	Acc@5 76.562 (74.421)
Test: [0/79]	Time 0.210 (0.210)	Loss 2.0869 (2.0869)	Acc@1 46.094 (46.094)	Acc@5 75.781 (75.781)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.3369 (2.2854)	Acc@1 40.625 (39.347)	Acc@5 70.312 (71.946)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.3751 (2.2820)	Acc@1 35.156 (40.067)	Acc@5 68.750 (71.540)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.3128 (2.2755)	Acc@1 38.281 (40.197)	Acc@5 69.531 (71.497)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.0482 (2.2766)	Acc@1 44.531 (40.454)	Acc@5 77.344 (71.589)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.2082 (2.2734)	Acc@1 39.062 (40.411)	Acc@

 * Acc@1 45.520 Acc@5 75.840
Time/epoch: 10.084136724472046 sec
Epoch: [14][0/391]	Time 0.206 (0.206)	Data 0.194 (0.194)	Loss 1.5018 (1.5018)	Acc@1 54.688 (54.688)	Acc@5 87.500 (87.500)
Epoch: [14][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 1.3699 (1.4420)	Acc@1 60.156 (58.428)	Acc@5 89.062 (86.910)
Epoch: [14][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.4324 (1.4743)	Acc@1 57.031 (57.522)	Acc@5 87.500 (86.394)
Test: [0/79]	Time 0.200 (0.200)	Loss 2.0496 (2.0496)	Acc@1 50.781 (50.781)	Acc@5 75.781 (75.781)
Test: [10/79]	Time 0.006 (0.024)	Loss 2.1336 (2.1773)	Acc@1 48.438 (46.307)	Acc@5 76.562 (74.645)
Test: [20/79]	Time 0.006 (0.015)	Loss 2.3397 (2.1103)	Acc@1 42.188 (47.135)	Acc@5 69.531 (75.000)
Test: [30/79]	Time 0.006 (0.012)	Loss 2.0180 (2.0943)	Acc@1 50.781 (47.152)	Acc@5 76.562 (75.706)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.8887 (2.0949)	Acc@1 51.562 (47.504)	Acc@5 79.688 (75.896)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.9975 (2.1004)	Acc@1 51.562 (47.319)	A

 * Acc@1 47.440 Acc@5 75.970
Time/epoch: 9.513055801391602 sec
Epoch: [21][0/391]	Time 0.204 (0.204)	Data 0.191 (0.191)	Loss 1.0122 (1.0122)	Acc@1 68.750 (68.750)	Acc@5 92.969 (92.969)
Epoch: [21][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 1.0584 (1.0597)	Acc@1 71.875 (68.031)	Acc@5 93.750 (92.850)
Epoch: [21][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 1.0747 (1.0917)	Acc@1 70.312 (67.169)	Acc@5 90.625 (92.393)
Test: [0/79]	Time 0.210 (0.210)	Loss 2.1790 (2.1790)	Acc@1 49.219 (49.219)	Acc@5 80.469 (80.469)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.3955 (2.3008)	Acc@1 43.750 (44.957)	Acc@5 76.562 (75.000)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.2753 (2.2017)	Acc@1 42.969 (46.577)	Acc@5 75.781 (76.042)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.1872 (2.2104)	Acc@1 44.531 (47.152)	Acc@5 75.781 (75.932)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.0060 (2.2194)	Acc@1 48.438 (47.161)	Acc@5 79.688 (75.896)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.4290 (2.2097)	Acc@1 49.219 (47.350)	Ac

 * Acc@1 47.050 Acc@5 75.400
Time/epoch: 9.482879400253296 sec
Epoch: [28][0/391]	Time 0.203 (0.203)	Data 0.190 (0.190)	Loss 0.8455 (0.8455)	Acc@1 75.000 (75.000)	Acc@5 96.094 (96.094)
Epoch: [28][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.9405 (0.8603)	Acc@1 75.000 (73.588)	Acc@5 95.312 (95.328)
Epoch: [28][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.8190 (0.8879)	Acc@1 78.906 (73.001)	Acc@5 95.312 (95.040)
Test: [0/79]	Time 0.200 (0.200)	Loss 2.4028 (2.4028)	Acc@1 47.656 (47.656)	Acc@5 73.438 (73.438)
Test: [10/79]	Time 0.006 (0.024)	Loss 2.4846 (2.4817)	Acc@1 46.875 (46.165)	Acc@5 78.125 (74.290)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.1713 (2.3479)	Acc@1 46.875 (48.214)	Acc@5 78.906 (75.595)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.3132 (2.3383)	Acc@1 46.094 (47.933)	Acc@5 75.781 (75.983)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.0824 (2.3667)	Acc@1 51.562 (47.637)	Acc@5 78.906 (75.838)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.2170 (2.3485)	Acc@1 52.344 (47.733)	Ac

 * Acc@1 49.560 Acc@5 77.530
Time/epoch: 9.603440999984741 sec
Epoch: [35][0/391]	Time 0.221 (0.221)	Data 0.208 (0.208)	Loss 0.3828 (0.3828)	Acc@1 85.156 (85.156)	Acc@5 100.000 (100.000)
Epoch: [35][150/391]	Time 0.019 (0.022)	Data 0.000 (0.002)	Loss 0.5331 (0.4091)	Acc@1 85.156 (87.034)	Acc@5 100.000 (98.681)
Epoch: [35][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.4058 (0.4074)	Acc@1 85.938 (86.851)	Acc@5 99.219 (98.707)
Test: [0/79]	Time 0.199 (0.199)	Loss 2.7366 (2.7366)	Acc@1 50.781 (50.781)	Acc@5 72.656 (72.656)
Test: [10/79]	Time 0.006 (0.024)	Loss 3.0241 (2.8482)	Acc@1 50.781 (47.869)	Acc@5 78.125 (75.213)
Test: [20/79]	Time 0.007 (0.015)	Loss 2.7070 (2.6734)	Acc@1 49.219 (49.591)	Acc@5 75.781 (76.414)
Test: [30/79]	Time 0.006 (0.012)	Loss 2.4279 (2.6739)	Acc@1 47.656 (49.395)	Acc@5 83.594 (77.067)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.4918 (2.6852)	Acc@1 57.812 (49.581)	Acc@5 82.031 (76.963)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.8057 (2.6630)	Acc@1 53.125 (49.724)

 * Acc@1 49.940 Acc@5 77.210
Time/epoch: 9.553666830062866 sec
Epoch: [42][0/391]	Time 0.214 (0.214)	Data 0.200 (0.200)	Loss 0.3703 (0.3703)	Acc@1 88.281 (88.281)	Acc@5 98.438 (98.438)
Epoch: [42][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.3690 (0.3331)	Acc@1 90.625 (89.187)	Acc@5 98.438 (99.151)
Epoch: [42][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.2898 (0.3404)	Acc@1 88.281 (88.917)	Acc@5 100.000 (99.169)
Test: [0/79]	Time 0.199 (0.199)	Loss 2.7089 (2.7089)	Acc@1 60.156 (60.156)	Acc@5 74.219 (74.219)
Test: [10/79]	Time 0.006 (0.024)	Loss 3.2394 (2.9537)	Acc@1 48.438 (49.148)	Acc@5 75.781 (76.065)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.9369 (2.8440)	Acc@1 50.000 (50.558)	Acc@5 76.562 (76.600)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.6009 (2.8488)	Acc@1 45.312 (49.597)	Acc@5 80.469 (77.142)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.3966 (2.8430)	Acc@1 57.812 (49.638)	Acc@5 80.469 (77.248)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.8064 (2.8228)	Acc@1 52.344 (49.694)	A

 * Acc@1 49.920 Acc@5 77.140
Time/epoch: 9.53812289237976 sec
Epoch: [49][0/391]	Time 0.204 (0.204)	Data 0.191 (0.191)	Loss 0.2701 (0.2701)	Acc@1 92.188 (92.188)	Acc@5 99.219 (99.219)
Epoch: [49][150/391]	Time 0.020 (0.022)	Data 0.000 (0.001)	Loss 0.2633 (0.2736)	Acc@1 88.281 (91.142)	Acc@5 100.000 (99.421)
Epoch: [49][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.2236 (0.2780)	Acc@1 92.188 (90.996)	Acc@5 99.219 (99.382)
Test: [0/79]	Time 0.194 (0.194)	Loss 2.9668 (2.9668)	Acc@1 57.031 (57.031)	Acc@5 75.000 (75.000)
Test: [10/79]	Time 0.006 (0.023)	Loss 3.2647 (3.1207)	Acc@1 50.000 (50.142)	Acc@5 80.469 (76.136)
Test: [20/79]	Time 0.006 (0.015)	Loss 3.0424 (3.0215)	Acc@1 46.094 (51.190)	Acc@5 78.906 (76.749)
Test: [30/79]	Time 0.006 (0.012)	Loss 2.7559 (2.9770)	Acc@1 50.000 (50.630)	Acc@5 82.031 (77.419)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.6930 (2.9980)	Acc@1 55.469 (50.400)	Acc@5 81.250 (77.477)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.9586 (2.9812)	Acc@1 51.562 (50.352)	Ac

 * Acc@1 50.070 Acc@5 77.700
Time/epoch: 9.533075332641602 sec
Epoch: [56][0/391]	Time 0.202 (0.202)	Data 0.189 (0.189)	Loss 0.2186 (0.2186)	Acc@1 92.969 (92.969)	Acc@5 100.000 (100.000)
Epoch: [56][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.1793 (0.2341)	Acc@1 95.312 (92.425)	Acc@5 99.219 (99.591)
Epoch: [56][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.2602 (0.2443)	Acc@1 91.406 (91.905)	Acc@5 100.000 (99.569)
Test: [0/79]	Time 0.203 (0.203)	Loss 2.8547 (2.8547)	Acc@1 60.938 (60.938)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.024)	Loss 3.4132 (3.2252)	Acc@1 44.531 (50.000)	Acc@5 78.125 (75.994)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.4198 (3.1284)	Acc@1 44.531 (50.446)	Acc@5 76.562 (76.376)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.9280 (3.0889)	Acc@1 47.656 (49.975)	Acc@5 82.812 (77.394)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.7292 (3.1064)	Acc@1 53.906 (50.095)	Acc@5 79.688 (77.210)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.8120 (3.0839)	Acc@1 53.906 (50.092)

 * Acc@1 50.520 Acc@5 77.740
Time/epoch: 10.008663177490234 sec
Epoch: [63][0/391]	Time 0.206 (0.206)	Data 0.193 (0.193)	Loss 0.2466 (0.2466)	Acc@1 92.969 (92.969)	Acc@5 99.219 (99.219)
Epoch: [63][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.1417 (0.2051)	Acc@1 96.875 (93.295)	Acc@5 100.000 (99.674)
Epoch: [63][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.2571 (0.2039)	Acc@1 89.844 (93.415)	Acc@5 100.000 (99.704)
Test: [0/79]	Time 0.207 (0.207)	Loss 3.2458 (3.2458)	Acc@1 55.469 (55.469)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.025)	Loss 3.6872 (3.3179)	Acc@1 45.312 (49.645)	Acc@5 75.000 (77.131)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.1385 (3.1413)	Acc@1 46.875 (51.190)	Acc@5 75.000 (77.493)
Test: [30/79]	Time 0.006 (0.013)	Loss 3.0903 (3.1209)	Acc@1 46.875 (50.706)	Acc@5 83.594 (78.301)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.0479 (3.1418)	Acc@1 51.562 (50.915)	Acc@5 78.906 (78.087)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.1755 (3.1227)	Acc@1 52.344 (50.812)

 * Acc@1 50.400 Acc@5 77.750
Time/epoch: 9.553833723068237 sec
Epoch: [70][0/391]	Time 0.213 (0.213)	Data 0.202 (0.202)	Loss 0.2078 (0.2078)	Acc@1 94.531 (94.531)	Acc@5 100.000 (100.000)
Epoch: [70][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.1856 (0.2000)	Acc@1 95.312 (93.321)	Acc@5 100.000 (99.746)
Epoch: [70][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.1746 (0.1971)	Acc@1 96.094 (93.444)	Acc@5 100.000 (99.720)
Test: [0/79]	Time 0.205 (0.205)	Loss 3.2008 (3.2008)	Acc@1 57.031 (57.031)	Acc@5 76.562 (76.562)
Test: [10/79]	Time 0.006 (0.024)	Loss 3.6555 (3.3299)	Acc@1 47.656 (49.858)	Acc@5 76.562 (76.207)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.3248 (3.1824)	Acc@1 48.438 (50.893)	Acc@5 77.344 (77.158)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.9620 (3.1296)	Acc@1 46.875 (50.378)	Acc@5 80.469 (77.772)
Test: [40/79]	Time 0.007 (0.011)	Loss 3.1131 (3.1512)	Acc@1 51.562 (50.286)	Acc@5 82.812 (77.801)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.1915 (3.1398)	Acc@1 53.125 (50.352

 * Acc@1 50.780 Acc@5 77.430
Time/epoch: 9.549046754837036 sec
Epoch: [77][0/391]	Time 0.226 (0.226)	Data 0.213 (0.213)	Loss 0.1487 (0.1487)	Acc@1 92.969 (92.969)	Acc@5 100.000 (100.000)
Epoch: [77][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 0.1141 (0.1987)	Acc@1 97.656 (93.564)	Acc@5 100.000 (99.690)
Epoch: [77][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.1553 (0.1964)	Acc@1 92.188 (93.581)	Acc@5 100.000 (99.717)
Test: [0/79]	Time 0.193 (0.193)	Loss 3.0247 (3.0247)	Acc@1 58.594 (58.594)	Acc@5 75.000 (75.000)
Test: [10/79]	Time 0.006 (0.024)	Loss 3.7097 (3.4081)	Acc@1 46.094 (49.503)	Acc@5 78.125 (76.776)
Test: [20/79]	Time 0.006 (0.015)	Loss 3.0643 (3.2153)	Acc@1 48.438 (51.376)	Acc@5 78.906 (77.455)
Test: [30/79]	Time 0.006 (0.012)	Loss 2.9664 (3.2064)	Acc@1 50.781 (51.033)	Acc@5 83.594 (78.150)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.7983 (3.2010)	Acc@1 51.562 (50.819)	Acc@5 81.250 (77.896)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.0409 (3.1738)	Acc@1 53.125 (50.858

 * Acc@1 50.800 Acc@5 77.760
Time/epoch: 9.558701753616333 sec
Epoch: [84][0/391]	Time 0.223 (0.223)	Data 0.211 (0.211)	Loss 0.1910 (0.1910)	Acc@1 92.969 (92.969)	Acc@5 100.000 (100.000)
Epoch: [84][150/391]	Time 0.024 (0.022)	Data 0.000 (0.002)	Loss 0.1556 (0.1835)	Acc@1 94.531 (93.962)	Acc@5 100.000 (99.809)
Epoch: [84][300/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.2079 (0.1829)	Acc@1 91.406 (93.963)	Acc@5 100.000 (99.798)
Test: [0/79]	Time 0.217 (0.217)	Loss 3.1621 (3.1621)	Acc@1 54.688 (54.688)	Acc@5 78.906 (78.906)
Test: [10/79]	Time 0.006 (0.025)	Loss 3.6668 (3.4475)	Acc@1 46.875 (47.727)	Acc@5 76.562 (76.420)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.4526 (3.3168)	Acc@1 47.656 (49.740)	Acc@5 76.562 (77.083)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.6780 (3.2631)	Acc@1 50.781 (49.723)	Acc@5 85.156 (77.722)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.1016 (3.2731)	Acc@1 53.125 (49.943)	Acc@5 81.250 (77.572)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.0166 (3.2411)	Acc@1 54.688 (50.199

 * Acc@1 50.500 Acc@5 77.550
Time/epoch: 9.56344485282898 sec
Epoch: [91][0/391]	Time 0.208 (0.208)	Data 0.196 (0.196)	Loss 0.2437 (0.2437)	Acc@1 92.188 (92.188)	Acc@5 100.000 (100.000)
Epoch: [91][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.3345 (0.1900)	Acc@1 90.625 (93.988)	Acc@5 99.219 (99.643)
Epoch: [91][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.1143 (0.1888)	Acc@1 96.094 (93.971)	Acc@5 100.000 (99.704)
Test: [0/79]	Time 0.213 (0.213)	Loss 3.2293 (3.2293)	Acc@1 55.469 (55.469)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.025)	Loss 3.5998 (3.4001)	Acc@1 50.000 (49.574)	Acc@5 82.031 (76.634)
Test: [20/79]	Time 0.007 (0.016)	Loss 3.3685 (3.2679)	Acc@1 50.781 (51.004)	Acc@5 79.688 (77.158)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.8775 (3.2254)	Acc@1 50.000 (50.680)	Acc@5 84.375 (77.772)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.7312 (3.2334)	Acc@1 53.125 (50.476)	Acc@5 82.812 (77.763)
Test: [50/79]	Time 0.007 (0.010)	Loss 3.3841 (3.2061)	Acc@1 50.000 (50.536)	

 * Acc@1 50.170 Acc@5 77.710
Time/epoch: 9.635247468948364 sec
Epoch: [98][0/391]	Time 0.209 (0.209)	Data 0.198 (0.198)	Loss 0.0965 (0.0965)	Acc@1 97.656 (97.656)	Acc@5 100.000 (100.000)
Epoch: [98][150/391]	Time 0.020 (0.022)	Data 0.000 (0.001)	Loss 0.2376 (0.1862)	Acc@1 92.188 (93.874)	Acc@5 100.000 (99.731)
Epoch: [98][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.2751 (0.1868)	Acc@1 90.625 (93.820)	Acc@5 98.438 (99.730)
Test: [0/79]	Time 0.209 (0.209)	Loss 3.1401 (3.1401)	Acc@1 56.250 (56.250)	Acc@5 74.219 (74.219)
Test: [10/79]	Time 0.006 (0.025)	Loss 3.6308 (3.3699)	Acc@1 48.438 (49.645)	Acc@5 78.906 (77.202)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.3980 (3.2310)	Acc@1 48.438 (51.228)	Acc@5 78.125 (77.418)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.8903 (3.2078)	Acc@1 50.781 (51.109)	Acc@5 82.812 (78.251)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.8977 (3.2194)	Acc@1 53.125 (51.105)	Acc@5 80.469 (78.201)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.0569 (3.1998)	Acc@1 53.906 (51.256)
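
The two logs above already hint at what dropout changes: the training accuracy ends lower with a rate of 0.2, while the test loss stays smaller. The sketch below pulls the running averages (the values in parentheses) out of four log lines copied from epoch 98 of both runs and prints the gap between training and test Top-1 accuracy. The helper name `running_avg` is hypothetical, and the numbers are partial running averages taken at batch 300 of 391 (training) and batch 40 of 79 (test), not final epoch results.

```python
import re


def running_avg(line, field):
    """Return the running average printed in parentheses after `field`."""
    m = re.search(re.escape(field) + r"\s+[\d.]+\s+\(([\d.]+)\)", line)
    if m is None:
        raise ValueError("field {!r} not found in line".format(field))
    return float(m.group(1))


# Epoch 98 lines copied from the logs above: (training line, test line) per dropout rate
runs = {
    0.1: ("Epoch: [98][300/391]\tTime 0.020 (0.021)\tData 0.000 (0.001)\t"
          "Loss 0.0433 (0.0318)\tAcc@1 98.438 (99.019)\tAcc@5 100.000 (99.995)",
          "Test: [40/79]\tTime 0.006 (0.011)\tLoss 4.2601 (4.8616)\t"
          "Acc@1 55.469 (49.619)\tAcc@5 82.031 (77.325)"),
    0.2: ("Epoch: [98][300/391]\tTime 0.021 (0.021)\tData 0.000 (0.001)\t"
          "Loss 0.2751 (0.1868)\tAcc@1 90.625 (93.820)\tAcc@5 98.438 (99.730)",
          "Test: [40/79]\tTime 0.006 (0.011)\tLoss 2.8977 (3.2194)\t"
          "Acc@1 53.125 (51.105)\tAcc@5 80.469 (78.201)"),
}

gaps = {}
for rate, (train_line, test_line) in runs.items():
    train_acc = running_avg(train_line, "Acc@1")
    test_acc = running_avg(test_line, "Acc@1")
    test_loss = running_avg(test_line, "Loss")
    gaps[rate] = train_acc - test_acc
    print("dropout {:.1f}: train Acc@1 {:.3f}, test Acc@1 {:.3f}, "
          "gap {:.3f}, test loss {:.4f}".format(rate, train_acc, test_acc,
                                                gaps[rate], test_loss))
```

A smaller gap between training and test accuracy at the higher rate is consistent with reduced overfitting, although a single pair of partial averages is only an indication and not a full comparison.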

In [19]:
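# Build a fresh AlexNet with a dropout rate of 0.3
dropout_rate = 0.3
print("=> creating model '{}'".format('AlexNet'))
model = AlexNet(droprate=dropout_rate, num_classes=100)
lr = 0.001  # base learning rate passed to adjust_learning_rate() in the training loop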

=> creating model 'AlexNet'


In [20]:
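model = torch.nn.DataParallel(model).cuda()  # wrap for multi-GPU use and move to the GPU
criterion = nn.CrossEntropyLoss()
# Note: Adam is created with lr=0.002, while `lr = 0.001` above is the value
# handed to adjust_learning_rate() at the start of every epoch.
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)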

In [21]:
best_acc1 = 0
for epoch in range(0, 100):
    start_time = time.time()
    adjust_learning_rate(optimizer, epoch, lr)

    # train for one epoch
    train(train_loader, model, criterion, optimizer, epoch)

    # evaluate on validation set
    acc1 = validate(val_loader, model, criterion, epoch)

    # remember best acc@1 and save checkpoint
    is_best = acc1 > best_acc1
    best_acc1 = max(acc1, best_acc1)
    
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'best_acc1': best_acc1,
        'optimizer': optimizer.state_dict(),
    }, is_best)
    print("Time/epoch: {} sec".format(time.time() - start_time))

Epoch: [0][0/391]	Time 0.191 (0.191)	Data 0.179 (0.179)	Loss 4.6058 (4.6058)	Acc@1 0.781 (0.781)	Acc@5 5.469 (5.469)
Epoch: [0][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 4.2437 (4.4617)	Acc@1 3.906 (2.194)	Acc@5 14.062 (9.815)
Epoch: [0][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 3.9924 (4.2692)	Acc@1 6.250 (3.654)	Acc@5 25.000 (15.778)
Test: [0/79]	Time 0.205 (0.205)	Loss 3.8872 (3.8872)	Acc@1 7.031 (7.031)	Acc@5 29.688 (29.688)
Test: [10/79]	Time 0.006 (0.025)	Loss 3.9703 (3.9103)	Acc@1 5.469 (7.599)	Acc@5 28.906 (27.060)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.8137 (3.8870)	Acc@1 9.375 (8.259)	Acc@5 31.250 (28.720)
Test: [30/79]	Time 0.006 (0.013)	Loss 3.8871 (3.8774)	Acc@1 6.250 (8.367)	Acc@5 30.469 (29.083)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.8587 (3.8746)	Acc@1 10.938 (8.232)	Acc@5 28.125 (29.097)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.9436 (3.8698)	Acc@1 5.469 (8.441)	Acc@5 20.312 (28.860)
Test: [60/79]	Time 0.006 (0.010)	Loss 4.0905 (3.8844)	Acc@1 10.156

 * Acc@1 39.020 Acc@5 69.620
Time/epoch: 9.998703479766846 sec
Epoch: [7][0/391]	Time 0.204 (0.204)	Data 0.193 (0.193)	Loss 2.2023 (2.2023)	Acc@1 40.625 (40.625)	Acc@5 75.781 (75.781)
Epoch: [7][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 2.3398 (2.1859)	Acc@1 35.156 (41.437)	Acc@5 72.656 (73.531)
Epoch: [7][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 2.2734 (2.1992)	Acc@1 38.281 (41.064)	Acc@5 75.781 (73.225)
Test: [0/79]	Time 0.212 (0.212)	Loss 2.0596 (2.0596)	Acc@1 44.531 (44.531)	Acc@5 72.656 (72.656)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.2925 (2.2902)	Acc@1 42.188 (38.849)	Acc@5 71.094 (71.733)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.2782 (2.2617)	Acc@1 37.500 (39.658)	Acc@5 68.750 (71.726)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.2763 (2.2533)	Acc@1 36.719 (40.096)	Acc@5 72.656 (71.951)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.9543 (2.2589)	Acc@1 46.875 (40.282)	Acc@5 78.125 (71.761)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.1593 (2.2517)	Acc@1 44.531 (40.518)	Acc@5

 * Acc@1 47.070 Acc@5 76.620
Time/epoch: 10.032461404800415 sec
Epoch: [14][0/391]	Time 0.207 (0.207)	Data 0.195 (0.195)	Loss 1.5983 (1.5983)	Acc@1 55.469 (55.469)	Acc@5 78.906 (78.906)
Epoch: [14][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 1.4868 (1.5423)	Acc@1 57.031 (56.302)	Acc@5 86.719 (85.358)
Epoch: [14][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 1.6339 (1.5717)	Acc@1 53.125 (55.251)	Acc@5 85.938 (85.008)
Test: [0/79]	Time 0.209 (0.209)	Loss 1.8574 (1.8574)	Acc@1 53.906 (53.906)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.025)	Loss 1.9683 (2.0454)	Acc@1 50.781 (47.656)	Acc@5 78.125 (76.847)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.2166 (2.0177)	Acc@1 48.438 (47.656)	Acc@5 73.438 (76.935)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9729 (2.0069)	Acc@1 43.750 (47.278)	Acc@5 76.562 (76.739)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.8569 (2.0068)	Acc@1 51.562 (48.018)	Acc@5 78.906 (76.715)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.0433 (2.0055)	Acc@1 44.531 (47.718)	A

 * Acc@1 48.250 Acc@5 77.130
Time/epoch: 9.56247353553772 sec
Epoch: [21][0/391]	Time 0.212 (0.212)	Data 0.201 (0.201)	Loss 1.0193 (1.0193)	Acc@1 71.875 (71.875)	Acc@5 90.625 (90.625)
Epoch: [21][150/391]	Time 0.020 (0.022)	Data 0.000 (0.001)	Loss 0.9824 (1.2006)	Acc@1 67.969 (64.456)	Acc@5 95.312 (91.049)
Epoch: [21][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 1.3227 (1.2402)	Acc@1 60.938 (63.346)	Acc@5 89.844 (90.417)
Test: [0/79]	Time 0.209 (0.209)	Loss 1.8845 (1.8845)	Acc@1 54.688 (54.688)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.1164 (2.0709)	Acc@1 50.000 (49.432)	Acc@5 77.344 (77.699)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.1338 (2.0579)	Acc@1 51.562 (49.702)	Acc@5 74.219 (77.679)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0144 (2.0504)	Acc@1 45.312 (49.572)	Acc@5 76.562 (77.646)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.8527 (2.0475)	Acc@1 53.906 (49.676)	Acc@5 82.812 (77.877)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.0114 (2.0397)	Acc@1 48.438 (49.372)	Acc

 * Acc@1 48.680 Acc@5 77.670
Time/epoch: 9.588121891021729 sec
Epoch: [28][0/391]	Time 0.214 (0.214)	Data 0.200 (0.200)	Loss 0.9219 (0.9219)	Acc@1 71.875 (71.875)	Acc@5 94.531 (94.531)
Epoch: [28][150/391]	Time 0.020 (0.022)	Data 0.000 (0.001)	Loss 0.9378 (1.0040)	Acc@1 73.438 (69.386)	Acc@5 94.531 (93.776)
Epoch: [28][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.0880 (1.0291)	Acc@1 67.188 (68.768)	Acc@5 92.188 (93.304)
Test: [0/79]	Time 0.226 (0.226)	Loss 2.0853 (2.0853)	Acc@1 51.562 (51.562)	Acc@5 82.031 (82.031)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.5228 (2.2293)	Acc@1 49.219 (48.295)	Acc@5 75.781 (77.415)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.1593 (2.2026)	Acc@1 46.094 (48.624)	Acc@5 78.125 (77.455)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.2051 (2.2329)	Acc@1 48.438 (48.311)	Acc@5 75.781 (76.890)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.2568 (2.2262)	Acc@1 47.656 (48.723)	Acc@5 79.688 (77.153)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.3797 (2.2111)	Acc@1 51.562 (49.020)	Ac

 * Acc@1 51.910 Acc@5 79.050
Time/epoch: 9.585531234741211 sec
Epoch: [35][0/391]	Time 0.231 (0.231)	Data 0.218 (0.218)	Loss 0.4980 (0.4980)	Acc@1 84.375 (84.375)	Acc@5 98.438 (98.438)
Epoch: [35][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.6795 (0.5792)	Acc@1 79.688 (81.819)	Acc@5 96.875 (97.470)
Epoch: [35][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.5136 (0.5725)	Acc@1 82.812 (81.922)	Acc@5 96.875 (97.607)
Test: [0/79]	Time 0.206 (0.206)	Loss 2.1205 (2.1205)	Acc@1 59.375 (59.375)	Acc@5 82.031 (82.031)
Test: [10/79]	Time 0.006 (0.024)	Loss 2.2657 (2.3883)	Acc@1 50.781 (50.994)	Acc@5 80.469 (77.841)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.1446 (2.3143)	Acc@1 51.562 (51.786)	Acc@5 78.125 (78.795)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.3750 (2.3144)	Acc@1 52.344 (51.562)	Acc@5 76.562 (79.057)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.2070 (2.3333)	Acc@1 52.344 (51.639)	Acc@5 79.688 (78.887)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.1773 (2.2959)	Acc@1 51.562 (51.976)	Ac

 * Acc@1 52.240 Acc@5 79.060
Time/epoch: 9.600236177444458 sec
Epoch: [42][0/391]	Time 0.205 (0.205)	Data 0.193 (0.193)	Loss 0.6265 (0.6265)	Acc@1 79.688 (79.688)	Acc@5 97.656 (97.656)
Epoch: [42][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.4652 (0.4575)	Acc@1 85.938 (85.218)	Acc@5 98.438 (98.406)
Epoch: [42][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.3758 (0.4592)	Acc@1 86.719 (85.257)	Acc@5 99.219 (98.388)
Test: [0/79]	Time 0.199 (0.199)	Loss 2.3016 (2.3016)	Acc@1 58.594 (58.594)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.024)	Loss 2.8045 (2.5767)	Acc@1 54.688 (52.131)	Acc@5 78.125 (77.699)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.1852 (2.4551)	Acc@1 55.469 (52.641)	Acc@5 79.688 (78.906)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.3378 (2.4479)	Acc@1 50.000 (51.865)	Acc@5 80.469 (79.183)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.2458 (2.4471)	Acc@1 55.469 (52.077)	Acc@5 82.031 (79.211)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.3799 (2.4265)	Acc@1 50.781 (52.037)	Ac

 * Acc@1 51.670 Acc@5 79.370
Time/epoch: 9.582780599594116 sec
Epoch: [49][0/391]	Time 0.207 (0.207)	Data 0.194 (0.194)	Loss 0.4173 (0.4173)	Acc@1 82.031 (82.031)	Acc@5 98.438 (98.438)
Epoch: [49][150/391]	Time 0.019 (0.022)	Data 0.000 (0.001)	Loss 0.3365 (0.3982)	Acc@1 90.625 (87.102)	Acc@5 98.438 (98.655)
Epoch: [49][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.3377 (0.3896)	Acc@1 89.844 (87.334)	Acc@5 98.438 (98.824)
Test: [0/79]	Time 0.207 (0.207)	Loss 2.4620 (2.4620)	Acc@1 57.031 (57.031)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.9101 (2.6454)	Acc@1 50.781 (52.486)	Acc@5 78.906 (78.835)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.2652 (2.5293)	Acc@1 54.688 (53.051)	Acc@5 80.469 (79.204)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.5405 (2.5361)	Acc@1 50.000 (51.815)	Acc@5 78.906 (79.083)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.4297 (2.5513)	Acc@1 55.469 (51.810)	Acc@5 80.469 (79.059)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.3148 (2.5140)	Acc@1 50.781 (51.869)	Ac

 * Acc@1 52.000 Acc@5 79.620
Time/epoch: 9.585057735443115 sec
Epoch: [56][0/391]	Time 0.206 (0.206)	Data 0.194 (0.194)	Loss 0.2356 (0.2356)	Acc@1 93.750 (93.750)	Acc@5 100.000 (100.000)
Epoch: [56][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.3610 (0.3453)	Acc@1 86.719 (88.695)	Acc@5 99.219 (99.058)
Epoch: [56][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.4427 (0.3487)	Acc@1 81.250 (88.541)	Acc@5 98.438 (99.060)
Test: [0/79]	Time 0.198 (0.198)	Loss 2.4227 (2.4227)	Acc@1 58.594 (58.594)	Acc@5 83.594 (83.594)
Test: [10/79]	Time 0.006 (0.024)	Loss 2.9415 (2.7394)	Acc@1 50.000 (52.415)	Acc@5 76.562 (78.267)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.4658 (2.6447)	Acc@1 52.344 (53.311)	Acc@5 80.469 (79.129)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.6287 (2.6592)	Acc@1 53.906 (52.445)	Acc@5 78.125 (79.183)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.2420 (2.6759)	Acc@1 54.688 (52.534)	Acc@5 81.250 (79.002)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.6356 (2.6436)	Acc@1 53.125 (52.574)	

Test: [70/79]	Time 0.006 (0.009)	Loss 3.1439 (2.6965)	Acc@1 53.125 (52.641)	Acc@5 75.000 (79.743)
 * Acc@1 52.540 Acc@5 79.870
Time/epoch: 9.586866855621338 sec
Epoch: [63][0/391]	Time 0.218 (0.218)	Data 0.205 (0.205)	Loss 0.2533 (0.2533)	Acc@1 92.969 (92.969)	Acc@5 100.000 (100.000)
Epoch: [63][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.3362 (0.2926)	Acc@1 87.500 (90.511)	Acc@5 100.000 (99.348)
Epoch: [63][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.3874 (0.2950)	Acc@1 86.719 (90.316)	Acc@5 100.000 (99.362)
Test: [0/79]	Time 0.203 (0.203)	Loss 2.3671 (2.3671)	Acc@1 57.031 (57.031)	Acc@5 83.594 (83.594)
Test: [10/79]	Time 0.006 (0.024)	Loss 2.8885 (2.7658)	Acc@1 51.562 (51.562)	Acc@5 79.688 (80.256)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.4647 (2.6341)	Acc@1 53.125 (53.274)	Acc@5 81.250 (80.766)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.4067 (2.6578)	Acc@1 53.906 (52.697)	Acc@5 78.906 (80.418)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.2271 (2.6773)	Acc@1 59.375 (52.973

Test: [40/79]	Time 0.006 (0.011)	Loss 2.4444 (2.7378)	Acc@1 56.250 (52.553)	Acc@5 82.812 (79.364)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.6099 (2.6946)	Acc@1 53.906 (52.619)	Acc@5 78.125 (79.657)
Test: [60/79]	Time 0.006 (0.010)	Loss 3.0878 (2.6910)	Acc@1 51.562 (52.446)	Acc@5 75.781 (79.611)
Test: [70/79]	Time 0.006 (0.009)	Loss 3.1007 (2.6919)	Acc@1 51.562 (52.619)	Acc@5 74.219 (79.621)
 * Acc@1 52.570 Acc@5 79.820
Time/epoch: 9.59591031074524 sec
Epoch: [70][0/391]	Time 0.205 (0.205)	Data 0.194 (0.194)	Loss 0.3294 (0.3294)	Acc@1 89.062 (89.062)	Acc@5 100.000 (100.000)
Epoch: [70][150/391]	Time 0.019 (0.022)	Data 0.000 (0.001)	Loss 0.3285 (0.2965)	Acc@1 91.406 (90.304)	Acc@5 99.219 (99.343)
Epoch: [70][300/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.4053 (0.2896)	Acc@1 85.938 (90.410)	Acc@5 98.438 (99.395)
Test: [0/79]	Time 0.211 (0.211)	Loss 2.4584 (2.4584)	Acc@1 60.156 (60.156)	Acc@5 80.469 (80.469)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.8682 (2.7946)	Acc@1 52.344 (52.557)	A

Test: [40/79]	Time 0.006 (0.011)	Loss 2.4273 (2.7565)	Acc@1 58.594 (52.344)	Acc@5 83.594 (79.783)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.6224 (2.7182)	Acc@1 55.469 (52.528)	Acc@5 76.562 (79.871)
Test: [60/79]	Time 0.006 (0.010)	Loss 3.1764 (2.7108)	Acc@1 46.875 (52.600)	Acc@5 74.219 (79.944)
Test: [70/79]	Time 0.006 (0.009)	Loss 3.1294 (2.7177)	Acc@1 51.562 (52.553)	Acc@5 78.125 (79.952)
 * Acc@1 52.470 Acc@5 80.040
Time/epoch: 9.684936761856079 sec
Epoch: [77][0/391]	Time 0.228 (0.228)	Data 0.215 (0.215)	Loss 0.2736 (0.2736)	Acc@1 89.844 (89.844)	Acc@5 100.000 (100.000)
Epoch: [77][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 0.1114 (0.2738)	Acc@1 98.438 (90.910)	Acc@5 100.000 (99.457)
Epoch: [77][300/391]	Time 0.019 (0.021)	Data 0.000 (0.001)	Loss 0.3603 (0.2780)	Acc@1 89.062 (90.817)	Acc@5 97.656 (99.419)
Test: [0/79]	Time 0.204 (0.204)	Loss 2.3572 (2.3572)	Acc@1 60.156 (60.156)	Acc@5 83.594 (83.594)
Test: [10/79]	Time 0.006 (0.024)	Loss 3.2308 (2.8519)	Acc@1 50.000 (53.054)

Test: [40/79]	Time 0.006 (0.011)	Loss 2.6938 (2.7929)	Acc@1 59.375 (52.611)	Acc@5 81.250 (79.268)
Test: [50/79]	Time 0.007 (0.010)	Loss 2.5226 (2.7461)	Acc@1 53.125 (52.665)	Acc@5 78.125 (79.688)
Test: [60/79]	Time 0.006 (0.010)	Loss 2.9949 (2.7436)	Acc@1 50.000 (52.613)	Acc@5 74.219 (79.726)
Test: [70/79]	Time 0.006 (0.009)	Loss 3.0568 (2.7477)	Acc@1 52.344 (52.465)	Acc@5 75.781 (79.710)
 * Acc@1 52.420 Acc@5 79.750
Time/epoch: 9.61355996131897 sec
Epoch: [84][0/391]	Time 0.219 (0.219)	Data 0.206 (0.206)	Loss 0.3060 (0.3060)	Acc@1 88.281 (88.281)	Acc@5 99.219 (99.219)
Epoch: [84][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.1538 (0.2708)	Acc@1 93.750 (91.142)	Acc@5 100.000 (99.514)
Epoch: [84][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.3651 (0.2678)	Acc@1 85.156 (91.232)	Acc@5 99.219 (99.478)
Test: [0/79]	Time 0.213 (0.213)	Loss 2.5923 (2.5923)	Acc@1 53.906 (53.906)	Acc@5 80.469 (80.469)
Test: [10/79]	Time 0.006 (0.025)	Loss 3.0651 (2.8968)	Acc@1 50.000 (52.415)	Ac

Test: [40/79]	Time 0.006 (0.012)	Loss 2.4643 (2.8035)	Acc@1 59.375 (52.058)	Acc@5 80.469 (79.325)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.6896 (2.7484)	Acc@1 49.219 (52.482)	Acc@5 78.125 (79.519)
Test: [60/79]	Time 0.006 (0.010)	Loss 3.1749 (2.7480)	Acc@1 48.438 (52.382)	Acc@5 75.781 (79.572)
Test: [70/79]	Time 0.006 (0.009)	Loss 2.9475 (2.7542)	Acc@1 54.688 (52.421)	Acc@5 78.125 (79.577)
 * Acc@1 52.340 Acc@5 79.640
Time/epoch: 9.639029741287231 sec
Epoch: [91][0/391]	Time 0.218 (0.218)	Data 0.203 (0.203)	Loss 0.2529 (0.2529)	Acc@1 92.188 (92.188)	Acc@5 100.000 (100.000)
Epoch: [91][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.3240 (0.2622)	Acc@1 88.281 (91.225)	Acc@5 98.438 (99.503)
Epoch: [91][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.2588 (0.2610)	Acc@1 90.625 (91.225)	Acc@5 100.000 (99.502)
Test: [0/79]	Time 0.205 (0.205)	Loss 2.5207 (2.5207)	Acc@1 59.375 (59.375)	Acc@5 82.812 (82.812)
Test: [10/79]	Time 0.006 (0.024)	Loss 3.2054 (2.8744)	Acc@1 50.000 (52.273)

Test: [40/79]	Time 0.006 (0.011)	Loss 2.4769 (2.7899)	Acc@1 56.250 (52.058)	Acc@5 81.250 (79.726)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.7888 (2.7473)	Acc@1 49.219 (52.022)	Acc@5 75.781 (79.887)
Test: [60/79]	Time 0.006 (0.009)	Loss 3.3264 (2.7426)	Acc@1 50.000 (52.152)	Acc@5 74.219 (79.905)
Test: [70/79]	Time 0.006 (0.009)	Loss 2.9540 (2.7503)	Acc@1 54.688 (52.256)	Acc@5 75.781 (79.864)
 * Acc@1 52.370 Acc@5 80.030
Time/epoch: 9.560600519180298 sec
Epoch: [98][0/391]	Time 0.206 (0.206)	Data 0.193 (0.193)	Loss 0.3260 (0.3260)	Acc@1 89.844 (89.844)	Acc@5 99.219 (99.219)
Epoch: [98][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.1941 (0.2674)	Acc@1 92.188 (91.267)	Acc@5 100.000 (99.488)
Epoch: [98][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.2179 (0.2611)	Acc@1 92.188 (91.360)	Acc@5 99.219 (99.522)
Test: [0/79]	Time 0.204 (0.204)	Loss 2.4773 (2.4773)	Acc@1 59.375 (59.375)	Acc@5 82.812 (82.812)
Test: [10/79]	Time 0.006 (0.024)	Loss 2.9469 (2.8977)	Acc@1 53.125 (52.202)	A

In [22]:
dropout_rate = 0.4  # dropout rate for this run
print("=> creating model '{}'".format('AlexNet'))
model = AlexNet(droprate=dropout_rate, num_classes=100)
lr = 0.001  # base learning rate passed to adjust_learning_rate in the training loop

=> creating model 'AlexNet'


In [23]:
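# replicate the model across the available GPUs and move it to CUDA
model = torch.nn.DataParallel(model).cuda()
criterion = nn.CrossEntropyLoss()
# note: Adam is created with lr=0.002, while adjust_learning_rate receives lr (0.001)
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)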

In [24]:
best_acc1 = 0
for epoch in range(0, 100):
    start_time = time.time()
    adjust_learning_rate(optimizer, epoch, lr)

    # train for one epoch
    train(train_loader, model, criterion, optimizer, epoch)

    # evaluate on validation set
    acc1 = validate(val_loader, model, criterion, epoch)

    # remember best acc@1 and save checkpoint
    is_best = acc1 > best_acc1
    best_acc1 = max(acc1, best_acc1)
    
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'best_acc1': best_acc1,
        'optimizer': optimizer.state_dict(),
    }, is_best)
    print("Time/epoch: {} sec".format(time.time() - start_time))

Epoch: [0][0/391]	Time 0.205 (0.205)	Data 0.191 (0.191)	Loss 4.6051 (4.6051)	Acc@1 0.000 (0.000)	Acc@5 1.562 (1.562)
Epoch: [0][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 4.3582 (4.4283)	Acc@1 2.344 (2.556)	Acc@5 15.625 (10.875)
Epoch: [0][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 3.8940 (4.2568)	Acc@1 8.594 (3.862)	Acc@5 27.344 (16.186)
Test: [0/79]	Time 0.205 (0.205)	Loss 3.7776 (3.7776)	Acc@1 6.250 (6.250)	Acc@5 30.469 (30.469)
Test: [10/79]	Time 0.006 (0.025)	Loss 4.0949 (3.8877)	Acc@1 7.812 (8.807)	Acc@5 25.781 (29.119)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.8950 (3.8598)	Acc@1 7.031 (8.445)	Acc@5 27.344 (29.911)
Test: [30/79]	Time 0.006 (0.013)	Loss 3.8406 (3.8520)	Acc@1 3.906 (8.518)	Acc@5 28.125 (29.662)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.8087 (3.8469)	Acc@1 11.719 (8.594)	Acc@5 27.344 (29.478)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.9924 (3.8447)	Acc@1 7.031 (8.778)	Acc@5 22.656 (29.182)
Test: [60/79]	Time 0.006 (0.010)	Loss 3.9378 (3.8496)	Acc@1 7.031

 * Acc@1 36.970 Acc@5 67.840
Time/epoch: 10.165104150772095 sec
Epoch: [7][0/391]	Time 0.218 (0.218)	Data 0.204 (0.204)	Loss 2.5750 (2.5750)	Acc@1 33.594 (33.594)	Acc@5 70.312 (70.312)
Epoch: [7][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 2.6124 (2.4147)	Acc@1 33.594 (36.388)	Acc@5 66.406 (69.164)
Epoch: [7][300/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 2.6027 (2.4231)	Acc@1 28.906 (36.106)	Acc@5 66.406 (68.721)
Test: [0/79]	Time 0.204 (0.204)	Loss 2.1042 (2.1042)	Acc@1 42.188 (42.188)	Acc@5 76.562 (76.562)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.4536 (2.4010)	Acc@1 35.938 (37.571)	Acc@5 73.438 (69.673)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.3024 (2.3773)	Acc@1 39.844 (38.058)	Acc@5 71.875 (69.754)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.4899 (2.3792)	Acc@1 35.938 (38.155)	Acc@5 67.188 (69.657)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.2997 (2.3875)	Acc@1 36.719 (37.938)	Acc@5 70.312 (69.322)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.4046 (2.3890)	Acc@1 37.500 (38.097)	Acc@

 * Acc@1 44.630 Acc@5 74.990
Time/epoch: 10.061579942703247 sec
Epoch: [14][0/391]	Time 0.228 (0.228)	Data 0.217 (0.217)	Loss 1.6958 (1.6958)	Acc@1 50.000 (50.000)	Acc@5 81.250 (81.250)
Epoch: [14][150/391]	Time 0.022 (0.022)	Data 0.000 (0.002)	Loss 1.9872 (1.8812)	Acc@1 49.219 (47.936)	Acc@5 77.344 (79.832)
Epoch: [14][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.9286 (1.9102)	Acc@1 45.312 (47.308)	Acc@5 77.344 (79.020)
Test: [0/79]	Time 0.208 (0.208)	Loss 1.8349 (1.8349)	Acc@1 52.344 (52.344)	Acc@5 78.906 (78.906)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.0866 (2.0914)	Acc@1 45.312 (47.088)	Acc@5 74.219 (74.787)
Test: [20/79]	Time 0.007 (0.016)	Loss 2.2220 (2.0815)	Acc@1 40.625 (46.168)	Acc@5 73.438 (74.591)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.1834 (2.0960)	Acc@1 39.062 (45.590)	Acc@5 70.312 (74.647)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.8767 (2.0952)	Acc@1 50.000 (45.751)	Acc@5 78.906 (74.886)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.9500 (2.0887)	Acc@1 50.781 (45.558)	A

 * Acc@1 46.880 Acc@5 76.440
Time/epoch: 10.029819965362549 sec
Epoch: [21][0/391]	Time 0.225 (0.225)	Data 0.213 (0.213)	Loss 1.4914 (1.4914)	Acc@1 60.156 (60.156)	Acc@5 85.156 (85.156)
Epoch: [21][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.7847 (1.6162)	Acc@1 46.094 (53.927)	Acc@5 85.156 (84.308)
Epoch: [21][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.6384 (1.6281)	Acc@1 55.469 (53.751)	Acc@5 84.375 (84.043)
Test: [0/79]	Time 0.210 (0.210)	Loss 1.7460 (1.7460)	Acc@1 55.469 (55.469)	Acc@5 78.906 (78.906)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.1876 (2.0503)	Acc@1 46.094 (46.733)	Acc@5 74.219 (76.349)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.0090 (1.9980)	Acc@1 43.750 (47.098)	Acc@5 75.781 (76.897)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.1290 (2.0156)	Acc@1 48.438 (47.051)	Acc@5 72.656 (76.865)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.7595 (2.0081)	Acc@1 49.219 (47.504)	Acc@5 79.688 (77.020)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.9393 (2.0001)	Acc@1 46.875 (47.457)	A

 * Acc@1 48.270 Acc@5 77.480
Time/epoch: 10.059017896652222 sec
Epoch: [28][0/391]	Time 0.219 (0.219)	Data 0.206 (0.206)	Loss 1.6068 (1.6068)	Acc@1 51.562 (51.562)	Acc@5 84.375 (84.375)
Epoch: [28][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 1.1398 (1.4116)	Acc@1 67.969 (58.987)	Acc@5 91.406 (87.717)
Epoch: [28][300/391]	Time 0.019 (0.021)	Data 0.000 (0.001)	Loss 1.5533 (1.4377)	Acc@1 56.250 (58.316)	Acc@5 87.500 (87.477)
Test: [0/79]	Time 0.210 (0.210)	Loss 1.6971 (1.6971)	Acc@1 56.250 (56.250)	Acc@5 83.594 (83.594)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.1516 (2.0063)	Acc@1 46.875 (48.935)	Acc@5 73.438 (77.202)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.9403 (1.9943)	Acc@1 40.625 (48.177)	Acc@5 74.219 (76.711)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9443 (2.0002)	Acc@1 51.562 (48.664)	Acc@5 75.781 (77.067)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.8340 (1.9989)	Acc@1 52.344 (48.780)	Acc@5 82.031 (77.344)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.1182 (1.9942)	Acc@1 50.781 (48.514)	A

 * Acc@1 51.610 Acc@5 79.580
Time/epoch: 10.01617169380188 sec
Epoch: [35][0/391]	Time 0.228 (0.228)	Data 0.216 (0.216)	Loss 0.7822 (0.7822)	Acc@1 75.781 (75.781)	Acc@5 95.312 (95.312)
Epoch: [35][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.9982 (0.9962)	Acc@1 71.875 (69.661)	Acc@5 93.750 (93.383)
Epoch: [35][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.9308 (1.0008)	Acc@1 71.875 (69.619)	Acc@5 92.188 (93.205)
Test: [0/79]	Time 0.218 (0.218)	Loss 1.7885 (1.7885)	Acc@1 55.469 (55.469)	Acc@5 82.031 (82.031)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.1541 (2.0189)	Acc@1 51.562 (51.420)	Acc@5 76.562 (78.764)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.8070 (1.9738)	Acc@1 51.562 (51.711)	Acc@5 78.125 (78.869)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.8929 (1.9695)	Acc@1 46.875 (51.386)	Acc@5 81.250 (79.410)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.8376 (1.9652)	Acc@1 54.688 (51.982)	Acc@5 78.906 (79.173)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.8940 (1.9562)	Acc@1 52.344 (51.624)	Ac

 * Acc@1 51.560 Acc@5 79.900
Time/epoch: 9.63914179801941 sec
Epoch: [42][0/391]	Time 0.214 (0.214)	Data 0.203 (0.203)	Loss 0.7437 (0.7437)	Acc@1 79.688 (79.688)	Acc@5 97.656 (97.656)
Epoch: [42][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.7627 (0.9013)	Acc@1 71.094 (72.346)	Acc@5 97.656 (94.304)
Epoch: [42][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.8467 (0.9064)	Acc@1 71.094 (72.231)	Acc@5 96.875 (94.246)
Test: [0/79]	Time 0.207 (0.207)	Loss 1.7822 (1.7822)	Acc@1 59.375 (59.375)	Acc@5 81.250 (81.250)
Test: [10/79]	Time 0.006 (0.024)	Loss 2.2195 (2.0275)	Acc@1 50.781 (52.415)	Acc@5 76.562 (79.688)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.8860 (1.9906)	Acc@1 50.781 (52.344)	Acc@5 80.469 (79.241)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9666 (1.9906)	Acc@1 49.219 (51.840)	Acc@5 81.250 (79.637)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.7573 (1.9923)	Acc@1 61.719 (52.096)	Acc@5 85.938 (79.745)
Test: [50/79]	Time 0.007 (0.010)	Loss 2.0048 (1.9867)	Acc@1 51.562 (51.869)	Acc

 * Acc@1 51.580 Acc@5 79.800
Time/epoch: 9.5890953540802 sec
Epoch: [49][0/391]	Time 0.221 (0.221)	Data 0.208 (0.208)	Loss 0.7671 (0.7671)	Acc@1 71.875 (71.875)	Acc@5 95.312 (95.312)
Epoch: [49][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.8490 (0.8304)	Acc@1 73.438 (74.074)	Acc@5 96.875 (95.256)
Epoch: [49][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.8734 (0.8349)	Acc@1 72.656 (74.053)	Acc@5 94.531 (95.133)
Test: [0/79]	Time 0.229 (0.229)	Loss 1.9396 (1.9396)	Acc@1 55.469 (55.469)	Acc@5 83.594 (83.594)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.1649 (2.0678)	Acc@1 52.344 (52.557)	Acc@5 74.219 (79.830)
Test: [20/79]	Time 0.006 (0.017)	Loss 1.8361 (2.0359)	Acc@1 50.000 (52.530)	Acc@5 81.250 (79.762)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.1154 (2.0509)	Acc@1 47.656 (51.537)	Acc@5 78.125 (79.561)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.7613 (2.0405)	Acc@1 53.125 (51.734)	Acc@5 82.812 (79.573)
Test: [50/79]	Time 0.006 (0.011)	Loss 1.9775 (2.0294)	Acc@1 50.000 (51.593)	Acc@

 * Acc@1 51.640 Acc@5 79.570
Time/epoch: 9.5790536403656 sec
Epoch: [56][0/391]	Time 0.220 (0.220)	Data 0.208 (0.208)	Loss 0.7952 (0.7952)	Acc@1 77.344 (77.344)	Acc@5 96.875 (96.875)
Epoch: [56][150/391]	Time 0.019 (0.022)	Data 0.000 (0.002)	Loss 0.7956 (0.7737)	Acc@1 72.656 (75.636)	Acc@5 97.656 (95.788)
Epoch: [56][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.6602 (0.7743)	Acc@1 80.469 (75.651)	Acc@5 97.656 (95.873)
Test: [0/79]	Time 0.215 (0.215)	Loss 1.8643 (1.8643)	Acc@1 55.469 (55.469)	Acc@5 82.031 (82.031)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.2401 (2.0970)	Acc@1 53.125 (52.557)	Acc@5 78.125 (79.261)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.9076 (2.0658)	Acc@1 53.906 (52.679)	Acc@5 81.250 (79.650)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.1512 (2.0873)	Acc@1 50.000 (51.865)	Acc@5 78.906 (79.814)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.8087 (2.0888)	Acc@1 56.250 (51.963)	Acc@5 81.250 (79.573)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.0937 (2.0683)	Acc@1 49.219 (51.900)	Acc@

 * Acc@1 52.070 Acc@5 79.530
Time/epoch: 10.185791254043579 sec
Epoch: [63][0/391]	Time 0.229 (0.229)	Data 0.217 (0.217)	Loss 0.6536 (0.6536)	Acc@1 78.125 (78.125)	Acc@5 96.875 (96.875)
Epoch: [63][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.6838 (0.7027)	Acc@1 80.469 (77.608)	Acc@5 95.312 (96.539)
Epoch: [63][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.5351 (0.7009)	Acc@1 82.812 (77.717)	Acc@5 99.219 (96.543)
Test: [0/79]	Time 0.211 (0.211)	Loss 2.0773 (2.0773)	Acc@1 57.812 (57.812)	Acc@5 82.812 (82.812)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.5253 (2.1904)	Acc@1 51.562 (51.918)	Acc@5 75.000 (79.190)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.0134 (2.1229)	Acc@1 52.344 (53.051)	Acc@5 81.250 (79.129)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0247 (2.1204)	Acc@1 49.219 (51.941)	Acc@5 76.562 (79.461)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.8408 (2.1188)	Acc@1 57.812 (52.096)	Acc@5 82.031 (79.497)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.1328 (2.1002)	Acc@1 54.688 (52.160)	A

 * Acc@1 51.990 Acc@5 79.960
Time/epoch: 9.600878238677979 sec
Epoch: [70][0/391]	Time 0.214 (0.214)	Data 0.202 (0.202)	Loss 0.6669 (0.6669)	Acc@1 81.250 (81.250)	Acc@5 96.875 (96.875)
Epoch: [70][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.6032 (0.6994)	Acc@1 80.469 (77.918)	Acc@5 97.656 (96.508)
Epoch: [70][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.5903 (0.6919)	Acc@1 82.031 (78.141)	Acc@5 98.438 (96.512)
Test: [0/79]	Time 0.220 (0.220)	Loss 2.0662 (2.0662)	Acc@1 53.125 (53.125)	Acc@5 85.156 (85.156)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.3705 (2.1774)	Acc@1 50.000 (52.628)	Acc@5 77.344 (79.545)
Test: [20/79]	Time 0.006 (0.017)	Loss 1.9600 (2.1323)	Acc@1 54.688 (52.865)	Acc@5 79.688 (79.427)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0022 (2.1420)	Acc@1 50.781 (52.293)	Acc@5 79.688 (79.536)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.7355 (2.1258)	Acc@1 57.031 (52.553)	Acc@5 81.250 (79.535)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.0612 (2.1086)	Acc@1 53.906 (52.420)	Ac

 * Acc@1 52.200 Acc@5 79.830
Time/epoch: 9.605156183242798 sec
Epoch: [77][0/391]	Time 0.235 (0.235)	Data 0.222 (0.222)	Loss 0.7829 (0.7829)	Acc@1 80.469 (80.469)	Acc@5 95.312 (95.312)
Epoch: [77][150/391]	Time 0.022 (0.022)	Data 0.000 (0.002)	Loss 0.5779 (0.6765)	Acc@1 83.594 (78.642)	Acc@5 95.312 (96.678)
Epoch: [77][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.9113 (0.6831)	Acc@1 71.875 (78.452)	Acc@5 93.750 (96.615)
Test: [0/79]	Time 0.210 (0.210)	Loss 2.0779 (2.0779)	Acc@1 54.688 (54.688)	Acc@5 81.250 (81.250)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.2968 (2.2056)	Acc@1 52.344 (51.918)	Acc@5 79.688 (79.759)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.0043 (2.1404)	Acc@1 53.906 (52.381)	Acc@5 78.125 (79.836)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0734 (2.1361)	Acc@1 50.781 (52.117)	Acc@5 77.344 (80.015)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.9008 (2.1346)	Acc@1 56.250 (52.077)	Acc@5 82.812 (79.878)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.1495 (2.1160)	Acc@1 52.344 (51.976)	Ac

 * Acc@1 52.110 Acc@5 79.610
Time/epoch: 9.73727560043335 sec
Epoch: [84][0/391]	Time 0.215 (0.215)	Data 0.201 (0.201)	Loss 0.7484 (0.7484)	Acc@1 76.562 (76.562)	Acc@5 96.875 (96.875)
Epoch: [84][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 0.7097 (0.6972)	Acc@1 71.875 (77.825)	Acc@5 97.656 (96.590)
Epoch: [84][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.6045 (0.6861)	Acc@1 78.906 (78.231)	Acc@5 99.219 (96.667)
Test: [0/79]	Time 0.214 (0.214)	Loss 2.0646 (2.0646)	Acc@1 55.469 (55.469)	Acc@5 82.812 (82.812)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.2397 (2.2036)	Acc@1 49.219 (52.060)	Acc@5 82.031 (80.043)
Test: [20/79]	Time 0.006 (0.017)	Loss 1.9138 (2.1374)	Acc@1 54.688 (52.455)	Acc@5 81.250 (80.134)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.1203 (2.1406)	Acc@1 50.000 (51.991)	Acc@5 78.125 (79.914)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.7445 (2.1396)	Acc@1 57.812 (52.363)	Acc@5 84.375 (79.649)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.0530 (2.1235)	Acc@1 52.344 (52.267)	Acc

 * Acc@1 52.090 Acc@5 79.810
Time/epoch: 9.562450170516968 sec
Epoch: [91][0/391]	Time 0.221 (0.221)	Data 0.209 (0.209)	Loss 0.5910 (0.5910)	Acc@1 78.906 (78.906)	Acc@5 94.531 (94.531)
Epoch: [91][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.8803 (0.6785)	Acc@1 70.312 (78.270)	Acc@5 94.531 (96.580)
Epoch: [91][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.5499 (0.6697)	Acc@1 82.031 (78.714)	Acc@5 97.656 (96.673)
Test: [0/79]	Time 0.236 (0.236)	Loss 2.0343 (2.0343)	Acc@1 57.031 (57.031)	Acc@5 81.250 (81.250)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.3618 (2.2077)	Acc@1 47.656 (53.054)	Acc@5 77.344 (78.906)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.0612 (2.1548)	Acc@1 53.125 (52.641)	Acc@5 77.344 (79.353)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.0445 (2.1468)	Acc@1 50.000 (52.193)	Acc@5 78.125 (79.688)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.8536 (2.1275)	Acc@1 63.281 (52.973)	Acc@5 82.031 (79.707)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.2298 (2.1126)	Acc@1 50.000 (52.788)	Ac

 * Acc@1 52.040 Acc@5 79.820
Time/epoch: 9.57581377029419 sec
Epoch: [98][0/391]	Time 0.222 (0.222)	Data 0.209 (0.209)	Loss 0.7488 (0.7488)	Acc@1 77.344 (77.344)	Acc@5 94.531 (94.531)
Epoch: [98][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 0.6705 (0.6667)	Acc@1 75.781 (78.792)	Acc@5 95.312 (96.922)
Epoch: [98][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.5776 (0.6648)	Acc@1 78.906 (78.898)	Acc@5 97.656 (96.880)
Test: [0/79]	Time 0.201 (0.201)	Loss 2.0789 (2.0789)	Acc@1 54.688 (54.688)	Acc@5 85.938 (85.938)
Test: [10/79]	Time 0.006 (0.024)	Loss 2.3926 (2.2426)	Acc@1 47.656 (52.131)	Acc@5 80.469 (80.185)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.9656 (2.1457)	Acc@1 52.344 (52.344)	Acc@5 80.469 (80.246)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0940 (2.1642)	Acc@1 50.781 (51.588)	Acc@5 76.562 (80.015)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.9245 (2.1452)	Acc@1 54.688 (52.115)	Acc@5 82.031 (80.164)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.1933 (2.1288)	Acc@1 51.562 (52.037)	Acc
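
The runs above differ only in `dropout_rate`, so the quickest way to compare them is to look at the ` * Acc@1 ... Acc@5 ...` summary lines rather than the per-batch lines. The following is a minimal sketch of how those lines could be pulled out of saved output text. The names `SUMMARY_RE`, `parse_summaries` and `best_top1` are hypothetical helpers, not part of the notebook, and the sample text is copied from three summary lines of the `dropout_rate = 0.4` run. Because the displayed output is truncated, such a parser only sees the summary lines that survive in the text it is given.

```python
import re

# Hypothetical helper: matches summary lines such as " * Acc@1 52.070 Acc@5 79.530".
SUMMARY_RE = re.compile(r"\*\s+Acc@1\s+([\d.]+)\s+Acc@5\s+([\d.]+)")


def parse_summaries(log_text):
    """Return a list of (acc1, acc5) float tuples, one per summary line found."""
    return [(float(a1), float(a5)) for a1, a5 in SUMMARY_RE.findall(log_text)]


def best_top1(log_text):
    """Return the (acc1, acc5) pair with the highest Acc@1, or None if no line matches."""
    pairs = parse_summaries(log_text)
    return max(pairs, key=lambda p: p[0]) if pairs else None


# Three summary lines copied from the dropout_rate = 0.4 output above.
sample_log = """\
 * Acc@1 52.070 Acc@5 79.530
Time/epoch: 10.185791254043579 sec
 * Acc@1 51.990 Acc@5 79.960
Time/epoch: 9.600878238677979 sec
 * Acc@1 52.200 Acc@5 79.830
"""

print(best_top1(sample_log))  # (52.2, 79.83)
```

Applying `best_top1` to the saved output of each run gives one pair per dropout rate, which is easier to compare than scrolling through the full logs.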

In [25]:
dropout_rate = 0.5  # dropout rate for this run
print("=> creating model '{}'".format('AlexNet'))
model = AlexNet(droprate=dropout_rate, num_classes=100)
lr = 0.001  # base learning rate passed to adjust_learning_rate in the training loop

=> creating model 'AlexNet'


In [26]:
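# replicate the model across the available GPUs and move it to CUDA
model = torch.nn.DataParallel(model).cuda()
criterion = nn.CrossEntropyLoss()
# note: Adam is created with lr=0.002, while adjust_learning_rate receives lr (0.001)
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)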

In [27]:
best_acc1 = 0
for epoch in range(0, 100):
    start_time = time.time()
    adjust_learning_rate(optimizer, epoch, lr)

    # train for one epoch
    train(train_loader, model, criterion, optimizer, epoch)

    # evaluate on validation set
    acc1 = validate(val_loader, model, criterion, epoch)

    # remember best acc@1 and save checkpoint
    is_best = acc1 > best_acc1
    best_acc1 = max(acc1, best_acc1)
    
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'best_acc1': best_acc1,
        'optimizer' : optimizer.state_dict(),
    }, is_best)
    print("Time/epoch: {} sec".format(time.time() - start_time))

Epoch: [0][0/391]	Time 0.197 (0.197)	Data 0.186 (0.186)	Loss 4.6057 (4.6057)	Acc@1 0.781 (0.781)	Acc@5 4.688 (4.688)
Epoch: [0][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 4.2256 (4.3846)	Acc@1 2.344 (2.261)	Acc@5 13.281 (10.498)
Epoch: [0][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 3.9471 (4.2162)	Acc@1 7.812 (3.836)	Acc@5 27.344 (16.437)
Test: [0/79]	Time 0.212 (0.212)	Loss 3.7728 (3.7728)	Acc@1 12.500 (12.500)	Acc@5 28.906 (28.906)
Test: [10/79]	Time 0.006 (0.026)	Loss 3.9670 (3.8462)	Acc@1 9.375 (9.375)	Acc@5 33.594 (30.682)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.7606 (3.8371)	Acc@1 14.844 (10.119)	Acc@5 33.594 (31.101)
Test: [30/79]	Time 0.006 (0.013)	Loss 3.8352 (3.8312)	Acc@1 7.031 (9.476)	Acc@5 29.688 (30.368)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.8376 (3.8251)	Acc@1 13.281 (9.451)	Acc@5 28.125 (30.697)
Test: [50/79]	Time 0.006 (0.010)	Loss 3.9052 (3.8212)	Acc@1 7.812 (9.681)	Acc@5 26.562 (30.775)
Test: [60/79]	Time 0.006 (0.010)	Loss 4.0001 (3.8324)	Acc@1 7

 * Acc@1 37.350 Acc@5 69.180
Time/epoch: 10.063776969909668 sec
Epoch: [7][0/391]	Time 0.215 (0.215)	Data 0.204 (0.204)	Loss 2.2356 (2.2356)	Acc@1 39.062 (39.062)	Acc@5 74.219 (74.219)
Epoch: [7][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 2.2006 (2.3915)	Acc@1 40.625 (36.600)	Acc@5 75.000 (69.190)
Epoch: [7][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.4702 (2.4116)	Acc@1 35.938 (36.498)	Acc@5 64.844 (68.636)
Test: [0/79]	Time 0.224 (0.224)	Loss 2.1547 (2.1547)	Acc@1 46.094 (46.094)	Acc@5 74.219 (74.219)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.3520 (2.3507)	Acc@1 45.312 (38.920)	Acc@5 70.312 (70.028)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.4352 (2.3380)	Acc@1 37.500 (39.249)	Acc@5 67.969 (70.238)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.3874 (2.3227)	Acc@1 32.812 (39.113)	Acc@5 66.406 (70.691)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.1799 (2.3379)	Acc@1 43.750 (38.967)	Acc@5 74.219 (70.198)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.5237 (2.3473)	Acc@1 32.812 (38.695)	Acc@

 * Acc@1 45.830 Acc@5 75.870
Time/epoch: 10.203275442123413 sec
Epoch: [14][0/391]	Time 0.225 (0.225)	Data 0.213 (0.213)	Loss 1.9022 (1.9022)	Acc@1 46.094 (46.094)	Acc@5 80.469 (80.469)
Epoch: [14][150/391]	Time 0.022 (0.022)	Data 0.000 (0.002)	Loss 1.8533 (1.9174)	Acc@1 44.531 (46.642)	Acc@5 79.688 (79.093)
Epoch: [14][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.8559 (1.9432)	Acc@1 44.531 (46.185)	Acc@5 83.594 (78.558)
Test: [0/79]	Time 0.209 (0.209)	Loss 1.9682 (1.9682)	Acc@1 56.250 (56.250)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.007 (0.025)	Loss 2.0500 (2.0502)	Acc@1 47.656 (46.023)	Acc@5 77.344 (75.852)
Test: [20/79]	Time 0.007 (0.016)	Loss 2.1123 (2.0396)	Acc@1 46.875 (46.503)	Acc@5 77.344 (75.595)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0694 (2.0287)	Acc@1 47.656 (46.069)	Acc@5 78.125 (76.310)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.6552 (2.0373)	Acc@1 57.031 (46.418)	Acc@5 82.812 (76.067)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.0872 (2.0371)	Acc@1 46.094 (46.002)	A

 * Acc@1 47.230 Acc@5 76.820
Time/epoch: 9.600794553756714 sec
Epoch: [21][0/391]	Time 0.213 (0.213)	Data 0.202 (0.202)	Loss 1.6583 (1.6583)	Acc@1 52.344 (52.344)	Acc@5 82.812 (82.812)
Epoch: [21][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 1.7305 (1.6727)	Acc@1 52.344 (52.483)	Acc@5 83.594 (83.387)
Epoch: [21][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.5556 (1.6979)	Acc@1 54.688 (51.923)	Acc@5 82.031 (82.851)
Test: [0/79]	Time 0.209 (0.209)	Loss 1.9000 (1.9000)	Acc@1 51.562 (51.562)	Acc@5 74.219 (74.219)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.1208 (2.0475)	Acc@1 47.656 (48.366)	Acc@5 80.469 (77.273)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.9347 (2.0154)	Acc@1 47.656 (47.582)	Acc@5 78.906 (77.046)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0911 (1.9965)	Acc@1 41.406 (47.127)	Acc@5 78.125 (77.495)
Test: [40/79]	Time 0.007 (0.011)	Loss 1.7915 (1.9965)	Acc@1 57.812 (47.523)	Acc@5 80.469 (77.287)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.0398 (1.9908)	Acc@1 47.656 (47.089)	Ac

 * Acc@1 48.980 Acc@5 77.950
Time/epoch: 10.061187028884888 sec
Epoch: [28][0/391]	Time 0.232 (0.232)	Data 0.219 (0.219)	Loss 1.4315 (1.4315)	Acc@1 61.719 (61.719)	Acc@5 84.375 (84.375)
Epoch: [28][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.6999 (1.5275)	Acc@1 53.125 (56.167)	Acc@5 76.562 (85.917)
Epoch: [28][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 1.8237 (1.5534)	Acc@1 48.438 (55.466)	Acc@5 80.469 (85.538)
Test: [0/79]	Time 0.214 (0.214)	Loss 1.8405 (1.8405)	Acc@1 54.688 (54.688)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.025)	Loss 1.9805 (1.9677)	Acc@1 46.875 (48.793)	Acc@5 80.469 (77.628)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.0035 (1.9466)	Acc@1 49.219 (49.516)	Acc@5 74.219 (77.827)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9137 (1.9324)	Acc@1 44.531 (48.916)	Acc@5 79.688 (78.553)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.7528 (1.9363)	Acc@1 53.125 (49.028)	Acc@5 84.375 (78.354)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.8576 (1.9259)	Acc@1 50.781 (49.020)	A

 * Acc@1 52.320 Acc@5 80.350
Time/epoch: 10.030874967575073 sec
Epoch: [35][0/391]	Time 0.224 (0.224)	Data 0.212 (0.212)	Loss 1.1347 (1.1347)	Acc@1 68.750 (68.750)	Acc@5 94.531 (94.531)
Epoch: [35][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.2543 (1.1488)	Acc@1 57.812 (65.361)	Acc@5 92.188 (91.489)
Epoch: [35][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.1791 (1.1453)	Acc@1 63.281 (65.358)	Acc@5 89.062 (91.347)
Test: [0/79]	Time 0.201 (0.201)	Loss 1.6806 (1.6806)	Acc@1 61.719 (61.719)	Acc@5 82.031 (82.031)
Test: [10/79]	Time 0.006 (0.024)	Loss 1.9134 (1.8894)	Acc@1 55.469 (53.764)	Acc@5 78.125 (79.972)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.9591 (1.8703)	Acc@1 53.125 (53.013)	Acc@5 78.125 (80.246)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9895 (1.8699)	Acc@1 49.219 (52.394)	Acc@5 79.688 (80.343)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.5605 (1.8684)	Acc@1 55.469 (52.572)	Acc@5 84.375 (80.297)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.8379 (1.8541)	Acc@1 52.344 (52.528)	A

 * Acc@1 52.890 Acc@5 80.430
Time/epoch: 9.70221471786499 sec
Epoch: [42][0/391]	Time 0.239 (0.239)	Data 0.227 (0.227)	Loss 0.9473 (0.9473)	Acc@1 72.656 (72.656)	Acc@5 93.750 (93.750)
Epoch: [42][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.8375 (1.0406)	Acc@1 78.125 (68.077)	Acc@5 92.969 (92.757)
Epoch: [42][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.9334 (1.0438)	Acc@1 71.875 (68.112)	Acc@5 94.531 (92.823)
Test: [0/79]	Time 0.219 (0.219)	Loss 1.7133 (1.7133)	Acc@1 64.062 (64.062)	Acc@5 81.250 (81.250)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.0347 (1.9437)	Acc@1 51.562 (53.196)	Acc@5 81.250 (80.398)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.9012 (1.8951)	Acc@1 56.250 (52.939)	Acc@5 78.125 (79.985)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9626 (1.8984)	Acc@1 46.875 (52.369)	Acc@5 82.812 (80.469)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.6619 (1.8924)	Acc@1 56.250 (52.801)	Acc@5 84.375 (80.335)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.8503 (1.8689)	Acc@1 55.469 (52.619)	Acc

 * Acc@1 52.530 Acc@5 80.680
Time/epoch: 9.589685201644897 sec
Epoch: [49][0/391]	Time 0.222 (0.222)	Data 0.211 (0.211)	Loss 0.9149 (0.9149)	Acc@1 70.312 (70.312)	Acc@5 94.531 (94.531)
Epoch: [49][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.9789 (0.9831)	Acc@1 72.656 (69.552)	Acc@5 92.188 (93.424)
Epoch: [49][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.8406 (0.9777)	Acc@1 73.438 (70.017)	Acc@5 97.656 (93.516)
Test: [0/79]	Time 0.211 (0.211)	Loss 1.6534 (1.6534)	Acc@1 59.375 (59.375)	Acc@5 83.594 (83.594)
Test: [10/79]	Time 0.006 (0.025)	Loss 1.8868 (1.9141)	Acc@1 57.031 (54.474)	Acc@5 79.688 (80.185)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.9278 (1.8927)	Acc@1 58.594 (54.092)	Acc@5 78.125 (80.134)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0226 (1.8932)	Acc@1 49.219 (53.553)	Acc@5 79.688 (80.343)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.7229 (1.9081)	Acc@1 57.031 (53.659)	Acc@5 81.250 (80.278)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.8510 (1.8801)	Acc@1 54.688 (53.569)	Ac

 * Acc@1 53.050 Acc@5 80.820
Time/epoch: 9.62324333190918 sec
Epoch: [56][0/391]	Time 0.244 (0.244)	Data 0.229 (0.229)	Loss 0.7471 (0.7471)	Acc@1 74.219 (74.219)	Acc@5 97.656 (97.656)
Epoch: [56][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 0.9887 (0.9243)	Acc@1 69.531 (71.233)	Acc@5 94.531 (94.257)
Epoch: [56][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.8507 (0.9193)	Acc@1 75.781 (71.343)	Acc@5 95.312 (94.339)
Test: [0/79]	Time 0.220 (0.220)	Loss 1.9265 (1.9265)	Acc@1 58.594 (58.594)	Acc@5 78.125 (78.125)
Test: [10/79]	Time 0.006 (0.026)	Loss 1.8956 (1.9376)	Acc@1 53.125 (53.906)	Acc@5 79.688 (80.398)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.9464 (1.9180)	Acc@1 50.781 (53.348)	Acc@5 80.469 (80.432)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9480 (1.9299)	Acc@1 52.344 (52.873)	Acc@5 81.250 (80.696)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.6825 (1.9296)	Acc@1 60.156 (53.430)	Acc@5 81.250 (80.240)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.8694 (1.9146)	Acc@1 55.469 (53.033)	Acc

 * Acc@1 53.220 Acc@5 80.980
Time/epoch: 9.628714561462402 sec
Epoch: [63][0/391]	Time 0.229 (0.229)	Data 0.216 (0.216)	Loss 0.9445 (0.9445)	Acc@1 74.219 (74.219)	Acc@5 92.969 (92.969)
Epoch: [63][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 0.8617 (0.8609)	Acc@1 76.562 (73.422)	Acc@5 95.312 (94.671)
Epoch: [63][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 0.9274 (0.8647)	Acc@1 74.219 (72.996)	Acc@5 92.969 (94.747)
Test: [0/79]	Time 0.217 (0.217)	Loss 1.9132 (1.9132)	Acc@1 59.375 (59.375)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.0023 (1.9722)	Acc@1 52.344 (53.764)	Acc@5 80.469 (79.688)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.9553 (1.9559)	Acc@1 53.906 (53.311)	Acc@5 82.812 (79.501)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0706 (1.9615)	Acc@1 46.094 (52.545)	Acc@5 83.594 (79.990)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.5549 (1.9555)	Acc@1 60.156 (52.934)	Acc@5 85.156 (79.973)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.9832 (1.9313)	Acc@1 55.469 (52.987)	Ac

 * Acc@1 52.900 Acc@5 80.890
Time/epoch: 9.738898754119873 sec
Epoch: [70][0/391]	Time 0.244 (0.244)	Data 0.232 (0.232)	Loss 0.7811 (0.7811)	Acc@1 73.438 (73.438)	Acc@5 96.094 (96.094)
Epoch: [70][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.8046 (0.8408)	Acc@1 75.000 (73.841)	Acc@5 94.531 (95.095)
Epoch: [70][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.7791 (0.8396)	Acc@1 71.875 (73.907)	Acc@5 96.875 (94.967)
Test: [0/79]	Time 0.223 (0.223)	Loss 1.8393 (1.8393)	Acc@1 57.812 (57.812)	Acc@5 82.031 (82.031)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.1573 (1.9707)	Acc@1 52.344 (53.906)	Acc@5 81.250 (80.398)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.0539 (1.9462)	Acc@1 50.781 (53.534)	Acc@5 80.469 (79.874)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0388 (1.9600)	Acc@1 46.875 (52.873)	Acc@5 82.031 (80.544)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.4900 (1.9633)	Acc@1 60.938 (53.220)	Acc@5 87.500 (80.354)
Test: [50/79]	Time 0.006 (0.011)	Loss 1.8429 (1.9334)	Acc@1 55.469 (53.125)	Ac

 * Acc@1 53.200 Acc@5 80.840
Time/epoch: 9.559995889663696 sec
Epoch: [77][0/391]	Time 0.222 (0.222)	Data 0.211 (0.211)	Loss 0.7580 (0.7580)	Acc@1 75.781 (75.781)	Acc@5 96.875 (96.875)
Epoch: [77][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 0.7653 (0.8291)	Acc@1 74.219 (73.903)	Acc@5 94.531 (95.178)
Epoch: [77][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.8216 (0.8350)	Acc@1 71.875 (73.824)	Acc@5 95.312 (95.146)
Test: [0/79]	Time 0.213 (0.213)	Loss 1.8855 (1.8855)	Acc@1 58.594 (58.594)	Acc@5 78.906 (78.906)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.0589 (1.9588)	Acc@1 52.344 (52.628)	Acc@5 79.688 (80.327)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.9339 (1.9537)	Acc@1 53.906 (52.939)	Acc@5 82.812 (79.874)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0074 (1.9644)	Acc@1 47.656 (52.697)	Acc@5 82.031 (80.141)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.8261 (1.9660)	Acc@1 53.906 (52.954)	Acc@5 83.594 (80.126)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.7975 (1.9445)	Acc@1 57.812 (52.895)	Ac

 * Acc@1 53.390 Acc@5 80.700
Time/epoch: 9.572221755981445 sec
Epoch: [84][0/391]	Time 0.218 (0.218)	Data 0.206 (0.206)	Loss 1.0642 (1.0642)	Acc@1 67.969 (67.969)	Acc@5 95.312 (95.312)
Epoch: [84][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 0.7732 (0.8277)	Acc@1 74.219 (74.276)	Acc@5 97.656 (95.297)
Epoch: [84][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.8058 (0.8320)	Acc@1 70.312 (73.972)	Acc@5 96.094 (95.240)
Test: [0/79]	Time 0.243 (0.243)	Loss 1.8934 (1.8934)	Acc@1 60.156 (60.156)	Acc@5 82.031 (82.031)
Test: [10/79]	Time 0.006 (0.028)	Loss 2.1601 (1.9891)	Acc@1 52.344 (54.048)	Acc@5 80.469 (80.895)
Test: [20/79]	Time 0.006 (0.018)	Loss 2.0749 (1.9662)	Acc@1 54.688 (53.869)	Acc@5 82.031 (80.432)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.0252 (1.9585)	Acc@1 46.875 (53.226)	Acc@5 81.250 (80.897)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.6887 (1.9600)	Acc@1 57.812 (53.620)	Acc@5 80.469 (80.697)
Test: [50/79]	Time 0.006 (0.011)	Loss 1.9837 (1.9355)	Acc@1 54.688 (53.539)	Ac

 * Acc@1 53.390 Acc@5 80.860
Time/epoch: 9.558185338973999 sec
Epoch: [91][0/391]	Time 0.217 (0.217)	Data 0.206 (0.206)	Loss 0.8707 (0.8707)	Acc@1 69.531 (69.531)	Acc@5 94.531 (94.531)
Epoch: [91][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 0.7904 (0.8228)	Acc@1 74.219 (74.089)	Acc@5 94.531 (95.276)
Epoch: [91][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.6258 (0.8211)	Acc@1 78.906 (74.086)	Acc@5 98.438 (95.211)
Test: [0/79]	Time 0.220 (0.220)	Loss 1.9150 (1.9150)	Acc@1 58.594 (58.594)	Acc@5 82.031 (82.031)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.2108 (2.0201)	Acc@1 51.562 (54.830)	Acc@5 79.688 (80.114)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.0817 (1.9702)	Acc@1 52.344 (54.278)	Acc@5 80.469 (80.171)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0604 (1.9742)	Acc@1 48.438 (53.226)	Acc@5 80.469 (80.771)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.6877 (1.9687)	Acc@1 57.812 (53.639)	Acc@5 82.812 (80.469)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.9185 (1.9452)	Acc@1 55.469 (53.477)	Ac

 * Acc@1 53.520 Acc@5 80.910
Time/epoch: 9.58526873588562 sec
Epoch: [98][0/391]	Time 0.224 (0.224)	Data 0.212 (0.212)	Loss 0.8293 (0.8293)	Acc@1 75.781 (75.781)	Acc@5 93.750 (93.750)
Epoch: [98][150/391]	Time 0.025 (0.022)	Data 0.000 (0.002)	Loss 0.8887 (0.8154)	Acc@1 74.219 (74.741)	Acc@5 92.188 (95.375)
Epoch: [98][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 0.7942 (0.8174)	Acc@1 71.875 (74.520)	Acc@5 93.750 (95.354)
Test: [0/79]	Time 0.223 (0.223)	Loss 1.9100 (1.9100)	Acc@1 61.719 (61.719)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.026)	Loss 1.9875 (1.9942)	Acc@1 50.781 (54.190)	Acc@5 78.906 (79.972)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.0545 (1.9535)	Acc@1 53.125 (54.353)	Acc@5 80.469 (79.874)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0543 (1.9585)	Acc@1 44.531 (53.604)	Acc@5 80.469 (80.091)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.5962 (1.9591)	Acc@1 56.250 (53.792)	Acc@5 82.812 (79.897)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.9072 (1.9408)	Acc@1 53.906 (53.554)	Acc
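
The per-epoch summary lines in the output above (those starting with ` * Acc@1`) are the easiest way to compare runs. The following is a minimal sketch, not part of the original notebook, that pulls these values out of a captured log with a regular expression. The helper name `parse_summaries` and the `sample_log` string are assumptions made for illustration; the sample values are copied from the output of the run with `dropout_rate = 0.5`.

```python
import re

# Matches the summary line printed after each validation pass,
# e.g. " * Acc@1 53.520 Acc@5 80.910".
SUMMARY_RE = re.compile(
    r"^\s*\*\s+Acc@1\s+([\d.]+)\s+Acc@5\s+([\d.]+)", re.MULTILINE
)


def parse_summaries(log_text):
    """Return a list of (top1, top5) floats, one per summary line found."""
    return [(float(top1), float(top5)) for top1, top5 in SUMMARY_RE.findall(log_text)]


# Hypothetical captured log; the numbers are taken from the output above.
sample_log = """
 * Acc@1 37.350 Acc@5 69.180
Time/epoch: 10.063776969909668 sec
 * Acc@1 52.890 Acc@5 80.430
Time/epoch: 9.70221471786499 sec
 * Acc@1 53.520 Acc@5 80.910
Time/epoch: 9.58526873588562 sec
"""

summaries = parse_summaries(sample_log)
# Tuples compare element by element, so max() picks the highest top-1 value.
best_top1, best_top5 = max(summaries)
print("summaries found: {}, best Acc@1: {}".format(len(summaries), best_top1))
```

Because the notebook output is truncated between the logged epochs, a sketch like this only sees the summaries that survived in the captured text.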

In [28]:
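# Dropout rate for this run, passed to the notebook's AlexNet constructor.
dropout_rate = 0.6
print("=> creating model '{}'".format('AlexNet'))
model = AlexNet(droprate=dropout_rate, num_classes=100)  # CIFAR-100 has 100 classes
# Base learning rate handed to adjust_learning_rate() in the training loop.
lr = 0.001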

=> creating model 'AlexNet'


In [29]:
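# Wrap the model for data-parallel execution and move it to the GPU.
model = torch.nn.DataParallel(model).cuda()
criterion = nn.CrossEntropyLoss()
# Note: Adam is created with lr=0.002, while the training loop passes
# lr = 0.001 to adjust_learning_rate() at the start of every epoch.
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)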

In [30]:
best_acc1 = 0
for epoch in range(0, 100):
    start_time = time.time()
    adjust_learning_rate(optimizer, epoch, lr)

    # train for one epoch
    train(train_loader, model, criterion, optimizer, epoch)

    # evaluate on validation set
    acc1 = validate(val_loader, model, criterion, epoch)

    # remember best acc@1 and save checkpoint
    is_best = acc1 > best_acc1
    best_acc1 = max(acc1, best_acc1)
    
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'best_acc1': best_acc1,
        'optimizer' : optimizer.state_dict(),
    }, is_best)
    print("Time/epoch: {} sec".format(time.time() - start_time))

Epoch: [0][0/391]	Time 0.221 (0.221)	Data 0.209 (0.209)	Loss 4.6059 (4.6059)	Acc@1 0.781 (0.781)	Acc@5 3.906 (3.906)
Epoch: [0][150/391]	Time 0.019 (0.022)	Data 0.000 (0.002)	Loss 4.4273 (4.4071)	Acc@1 2.344 (2.473)	Acc@5 18.750 (11.305)
Epoch: [0][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 4.0055 (4.2488)	Acc@1 8.594 (3.714)	Acc@5 23.438 (16.087)
Test: [0/79]	Time 0.217 (0.217)	Loss 3.7954 (3.7954)	Acc@1 8.594 (8.594)	Acc@5 25.781 (25.781)
Test: [10/79]	Time 0.006 (0.025)	Loss 3.9707 (3.8969)	Acc@1 11.719 (8.310)	Acc@5 32.031 (27.628)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.8789 (3.8880)	Acc@1 7.812 (8.036)	Acc@5 28.906 (28.199)
Test: [30/79]	Time 0.006 (0.013)	Loss 3.9135 (3.8887)	Acc@1 7.031 (7.964)	Acc@5 21.875 (28.024)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.7910 (3.8793)	Acc@1 10.938 (8.232)	Acc@5 29.688 (28.487)
Test: [50/79]	Time 0.006 (0.010)	Loss 4.0193 (3.8806)	Acc@1 7.031 (8.119)	Acc@5 22.656 (28.431)
Test: [60/79]	Time 0.006 (0.010)	Loss 3.9858 (3.8869)	Acc@1 7.03

 * Acc@1 33.750 Acc@5 64.900
Time/epoch: 10.077085971832275 sec
Epoch: [7][0/391]	Time 0.227 (0.227)	Data 0.214 (0.214)	Loss 2.7660 (2.7660)	Acc@1 25.781 (25.781)	Acc@5 67.188 (67.188)
Epoch: [7][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.6132 (2.6500)	Acc@1 38.281 (31.612)	Acc@5 67.188 (63.188)
Epoch: [7][300/391]	Time 0.024 (0.021)	Data 0.000 (0.001)	Loss 2.4727 (2.6520)	Acc@1 34.375 (31.292)	Acc@5 65.625 (63.151)
Test: [0/79]	Time 0.229 (0.229)	Loss 2.2241 (2.2241)	Acc@1 41.406 (41.406)	Acc@5 71.094 (71.094)
Test: [10/79]	Time 0.007 (0.027)	Loss 2.3864 (2.4778)	Acc@1 39.844 (35.014)	Acc@5 71.094 (69.034)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.5580 (2.4694)	Acc@1 31.250 (35.082)	Acc@5 67.188 (67.746)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.4531 (2.4764)	Acc@1 32.812 (34.476)	Acc@5 66.406 (67.263)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.2925 (2.4777)	Acc@1 39.062 (35.099)	Acc@5 71.875 (67.168)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.6074 (2.4787)	Acc@1 34.375 (35.156)	Acc@

 * Acc@1 41.140 Acc@5 72.020
Time/epoch: 10.068870067596436 sec
Epoch: [14][0/391]	Time 0.238 (0.238)	Data 0.224 (0.224)	Loss 2.0683 (2.0683)	Acc@1 45.312 (45.312)	Acc@5 76.562 (76.562)
Epoch: [14][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.1720 (2.2174)	Acc@1 44.531 (40.066)	Acc@5 78.125 (73.065)
Epoch: [14][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.3026 (2.2235)	Acc@1 38.281 (40.238)	Acc@5 69.531 (72.693)
Test: [0/79]	Time 0.215 (0.215)	Loss 1.9242 (1.9242)	Acc@1 50.781 (50.781)	Acc@5 76.562 (76.562)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.0894 (2.1512)	Acc@1 47.656 (44.389)	Acc@5 75.781 (74.787)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.1590 (2.1442)	Acc@1 38.281 (43.080)	Acc@5 72.656 (74.405)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.0419 (2.1369)	Acc@1 42.969 (42.994)	Acc@5 75.000 (74.496)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.0205 (2.1444)	Acc@1 45.312 (43.083)	Acc@5 73.438 (74.162)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.1832 (2.1435)	Acc@1 47.656 (43.428)	A

 * Acc@1 44.770 Acc@5 75.090
Time/epoch: 10.04701018333435 sec
Epoch: [21][0/391]	Time 0.246 (0.246)	Data 0.227 (0.227)	Loss 2.1227 (2.1227)	Acc@1 39.844 (39.844)	Acc@5 78.906 (78.906)
Epoch: [21][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 1.9277 (1.9950)	Acc@1 47.656 (45.209)	Acc@5 78.906 (77.385)
Epoch: [21][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.0945 (2.0211)	Acc@1 44.531 (44.451)	Acc@5 73.438 (76.825)
Test: [0/79]	Time 0.223 (0.223)	Loss 2.0199 (2.0199)	Acc@1 46.094 (46.094)	Acc@5 75.781 (75.781)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.0606 (2.1139)	Acc@1 47.656 (45.170)	Acc@5 75.781 (75.071)
Test: [20/79]	Time 0.007 (0.017)	Loss 2.1571 (2.1130)	Acc@1 42.188 (45.238)	Acc@5 73.438 (74.554)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9404 (2.1000)	Acc@1 42.969 (45.212)	Acc@5 76.562 (74.798)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.9126 (2.1056)	Acc@1 49.219 (45.236)	Acc@5 77.344 (74.676)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.1597 (2.1057)	Acc@1 44.531 (44.991)	Ac

 * Acc@1 46.640 Acc@5 76.540
Time/epoch: 9.830339670181274 sec
Epoch: [28][0/391]	Time 0.233 (0.233)	Data 0.221 (0.221)	Loss 1.6903 (1.6903)	Acc@1 50.000 (50.000)	Acc@5 82.812 (82.812)
Epoch: [28][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 1.6806 (1.8581)	Acc@1 51.562 (47.827)	Acc@5 85.938 (80.148)
Epoch: [28][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.8893 (1.8850)	Acc@1 46.875 (47.363)	Acc@5 76.562 (79.573)
Test: [0/79]	Time 0.219 (0.219)	Loss 1.7914 (1.7914)	Acc@1 54.688 (54.688)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.026)	Loss 1.9331 (1.9659)	Acc@1 47.656 (47.017)	Acc@5 80.469 (76.989)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.1133 (1.9547)	Acc@1 40.625 (46.949)	Acc@5 75.781 (77.307)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.8149 (1.9622)	Acc@1 52.344 (46.699)	Acc@5 78.906 (77.293)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.7450 (1.9701)	Acc@1 52.344 (46.932)	Acc@5 82.812 (77.210)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.9414 (1.9681)	Acc@1 49.219 (46.967)	Ac

 * Acc@1 50.920 Acc@5 79.320
Time/epoch: 10.03608512878418 sec
Epoch: [35][0/391]	Time 0.222 (0.222)	Data 0.211 (0.211)	Loss 1.3535 (1.3535)	Acc@1 57.812 (57.812)	Acc@5 87.500 (87.500)
Epoch: [35][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.6881 (1.5268)	Acc@1 53.125 (55.526)	Acc@5 82.031 (85.648)
Epoch: [35][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.5004 (1.5210)	Acc@1 55.469 (55.980)	Acc@5 89.844 (85.610)
Test: [0/79]	Time 0.217 (0.217)	Loss 1.7038 (1.7038)	Acc@1 56.250 (56.250)	Acc@5 81.250 (81.250)
Test: [10/79]	Time 0.007 (0.026)	Loss 1.8218 (1.8676)	Acc@1 50.781 (51.207)	Acc@5 79.688 (78.977)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.0505 (1.8629)	Acc@1 50.781 (51.488)	Acc@5 77.344 (78.906)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.7558 (1.8630)	Acc@1 50.000 (51.184)	Acc@5 78.906 (78.780)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.7404 (1.8594)	Acc@1 56.250 (51.753)	Acc@5 82.812 (78.754)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.9856 (1.8567)	Acc@1 53.125 (51.762)	Ac

 * Acc@1 51.390 Acc@5 79.400
Time/epoch: 9.616835832595825 sec
Epoch: [42][0/391]	Time 0.229 (0.229)	Data 0.217 (0.217)	Loss 1.6266 (1.6266)	Acc@1 51.562 (51.562)	Acc@5 80.469 (80.469)
Epoch: [42][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.4881 (1.4271)	Acc@1 58.594 (58.625)	Acc@5 85.156 (87.210)
Epoch: [42][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.3224 (1.4432)	Acc@1 64.062 (58.018)	Acc@5 85.938 (86.815)
Test: [0/79]	Time 0.217 (0.217)	Loss 1.6713 (1.6713)	Acc@1 54.688 (54.688)	Acc@5 83.594 (83.594)
Test: [10/79]	Time 0.006 (0.025)	Loss 1.8194 (1.8488)	Acc@1 50.781 (50.284)	Acc@5 78.125 (79.474)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.1266 (1.8627)	Acc@1 50.000 (50.521)	Acc@5 75.781 (78.943)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.7139 (1.8612)	Acc@1 53.906 (50.252)	Acc@5 81.250 (79.133)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.7149 (1.8573)	Acc@1 53.906 (50.743)	Acc@5 84.375 (78.963)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.9693 (1.8522)	Acc@1 51.562 (50.950)	Ac

 * Acc@1 51.500 Acc@5 80.050
Time/epoch: 9.586020469665527 sec
Epoch: [49][0/391]	Time 0.225 (0.225)	Data 0.211 (0.211)	Loss 1.3168 (1.3168)	Acc@1 61.719 (61.719)	Acc@5 89.062 (89.062)
Epoch: [49][150/391]	Time 0.022 (0.022)	Data 0.000 (0.002)	Loss 1.2572 (1.3544)	Acc@1 57.031 (60.079)	Acc@5 92.969 (88.255)
Epoch: [49][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 1.4186 (1.3652)	Acc@1 56.250 (59.614)	Acc@5 89.062 (88.172)
Test: [0/79]	Time 0.220 (0.220)	Loss 1.5652 (1.5652)	Acc@1 58.594 (58.594)	Acc@5 84.375 (84.375)
Test: [10/79]	Time 0.006 (0.026)	Loss 1.8912 (1.8531)	Acc@1 51.562 (52.131)	Acc@5 78.906 (79.261)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.0931 (1.8605)	Acc@1 51.562 (51.339)	Acc@5 75.781 (79.092)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.7464 (1.8428)	Acc@1 50.000 (51.008)	Acc@5 82.812 (79.561)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.7004 (1.8512)	Acc@1 53.125 (51.181)	Acc@5 83.594 (79.383)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.9393 (1.8475)	Acc@1 50.781 (51.118)	Ac

 * Acc@1 51.420 Acc@5 79.560
Time/epoch: 9.616858720779419 sec
Epoch: [56][0/391]	Time 0.225 (0.225)	Data 0.211 (0.211)	Loss 1.4583 (1.4583)	Acc@1 56.250 (56.250)	Acc@5 86.719 (86.719)
Epoch: [56][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.5760 (1.3213)	Acc@1 56.250 (60.622)	Acc@5 82.031 (88.685)
Epoch: [56][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.3619 (1.3188)	Acc@1 61.719 (60.761)	Acc@5 85.156 (88.787)
Test: [0/79]	Time 0.226 (0.226)	Loss 1.6167 (1.6167)	Acc@1 55.469 (55.469)	Acc@5 83.594 (83.594)
Test: [10/79]	Time 0.006 (0.026)	Loss 1.7553 (1.8497)	Acc@1 50.781 (51.847)	Acc@5 83.594 (80.327)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.0224 (1.8539)	Acc@1 49.219 (51.674)	Acc@5 76.562 (80.208)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.7169 (1.8459)	Acc@1 50.000 (51.562)	Acc@5 81.250 (80.040)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.6999 (1.8509)	Acc@1 55.469 (51.810)	Acc@5 85.156 (79.707)
Test: [50/79]	Time 0.006 (0.011)	Loss 1.8632 (1.8437)	Acc@1 50.781 (52.037)	Ac

 * Acc@1 52.140 Acc@5 79.800
Time/epoch: 9.59984564781189 sec
Epoch: [63][0/391]	Time 0.247 (0.247)	Data 0.235 (0.235)	Loss 1.4315 (1.4315)	Acc@1 61.719 (61.719)	Acc@5 89.844 (89.844)
Epoch: [63][150/391]	Time 0.023 (0.022)	Data 0.000 (0.002)	Loss 1.4497 (1.2698)	Acc@1 55.469 (61.864)	Acc@5 84.375 (89.570)
Epoch: [63][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.2787 (1.2657)	Acc@1 58.594 (62.163)	Acc@5 88.281 (89.623)
Test: [0/79]	Time 0.222 (0.222)	Loss 1.7196 (1.7196)	Acc@1 52.344 (52.344)	Acc@5 80.469 (80.469)
Test: [10/79]	Time 0.006 (0.026)	Loss 1.8099 (1.8729)	Acc@1 53.906 (51.918)	Acc@5 82.031 (80.256)
Test: [20/79]	Time 0.006 (0.017)	Loss 1.9987 (1.8699)	Acc@1 56.250 (52.158)	Acc@5 77.344 (80.246)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.7259 (1.8545)	Acc@1 49.219 (51.890)	Acc@5 82.031 (80.544)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.6492 (1.8538)	Acc@1 55.469 (52.058)	Acc@5 85.938 (80.183)
Test: [50/79]	Time 0.006 (0.011)	Loss 1.8678 (1.8571)	Acc@1 54.688 (52.083)	Acc

 * Acc@1 52.090 Acc@5 79.930
Time/epoch: 9.595969200134277 sec
Epoch: [70][0/391]	Time 0.259 (0.259)	Data 0.246 (0.246)	Loss 1.0307 (1.0307)	Acc@1 70.312 (70.312)	Acc@5 93.750 (93.750)
Epoch: [70][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.0275 (1.2546)	Acc@1 69.531 (62.490)	Acc@5 95.312 (89.419)
Epoch: [70][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 1.3916 (1.2547)	Acc@1 54.688 (62.604)	Acc@5 88.281 (89.628)
Test: [0/79]	Time 0.230 (0.230)	Loss 1.6455 (1.6455)	Acc@1 56.250 (56.250)	Acc@5 83.594 (83.594)
Test: [10/79]	Time 0.006 (0.027)	Loss 1.8661 (1.9091)	Acc@1 51.562 (50.923)	Acc@5 82.812 (79.972)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.0635 (1.8937)	Acc@1 52.344 (51.451)	Acc@5 75.781 (79.576)
Test: [30/79]	Time 0.006 (0.014)	Loss 1.6941 (1.8793)	Acc@1 52.344 (51.386)	Acc@5 81.250 (79.713)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.6532 (1.8722)	Acc@1 52.344 (51.620)	Acc@5 85.156 (79.535)
Test: [50/79]	Time 0.006 (0.011)	Loss 1.9763 (1.8632)	Acc@1 51.562 (51.915)	Ac

 * Acc@1 52.390 Acc@5 79.760
Time/epoch: 9.720659017562866 sec
Epoch: [77][0/391]	Time 0.217 (0.217)	Data 0.206 (0.206)	Loss 1.3025 (1.3025)	Acc@1 64.844 (64.844)	Acc@5 89.844 (89.844)
Epoch: [77][150/391]	Time 0.019 (0.022)	Data 0.000 (0.002)	Loss 1.3297 (1.2630)	Acc@1 60.938 (62.526)	Acc@5 89.062 (89.756)
Epoch: [77][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.2797 (1.2624)	Acc@1 64.062 (62.549)	Acc@5 87.500 (89.641)
Test: [0/79]	Time 0.239 (0.239)	Loss 1.7191 (1.7191)	Acc@1 53.906 (53.906)	Acc@5 82.031 (82.031)
Test: [10/79]	Time 0.006 (0.027)	Loss 1.7505 (1.8550)	Acc@1 52.344 (52.983)	Acc@5 82.031 (80.185)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.0005 (1.8558)	Acc@1 51.562 (52.307)	Acc@5 79.688 (80.060)
Test: [30/79]	Time 0.006 (0.014)	Loss 1.6721 (1.8481)	Acc@1 53.906 (52.092)	Acc@5 82.812 (80.343)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.6266 (1.8491)	Acc@1 51.562 (52.287)	Acc@5 83.594 (79.878)
Test: [50/79]	Time 0.006 (0.011)	Loss 1.9294 (1.8438)	Acc@1 50.781 (52.635)	Ac

 * Acc@1 52.400 Acc@5 79.750
Time/epoch: 9.653146266937256 sec
Epoch: [84][0/391]	Time 0.228 (0.228)	Data 0.215 (0.215)	Loss 1.4983 (1.4983)	Acc@1 57.812 (57.812)	Acc@5 84.375 (84.375)
Epoch: [84][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.2409 (1.2319)	Acc@1 59.375 (62.841)	Acc@5 89.844 (89.947)
Epoch: [84][300/391]	Time 0.019 (0.021)	Data 0.000 (0.001)	Loss 1.3136 (1.2393)	Acc@1 64.844 (62.801)	Acc@5 86.719 (89.984)
Test: [0/79]	Time 0.219 (0.219)	Loss 1.6852 (1.6852)	Acc@1 56.250 (56.250)	Acc@5 83.594 (83.594)
Test: [10/79]	Time 0.006 (0.026)	Loss 1.7873 (1.9032)	Acc@1 53.125 (51.705)	Acc@5 83.594 (79.830)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.1588 (1.8907)	Acc@1 50.000 (51.897)	Acc@5 77.344 (79.390)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.6692 (1.8802)	Acc@1 51.562 (51.840)	Acc@5 84.375 (80.015)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.6019 (1.8745)	Acc@1 61.719 (52.248)	Acc@5 81.250 (79.707)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.8852 (1.8643)	Acc@1 53.906 (52.405)	Ac

 * Acc@1 52.120 Acc@5 79.910
Time/epoch: 9.633773803710938 sec
Epoch: [91][0/391]	Time 0.228 (0.228)	Data 0.217 (0.217)	Loss 1.3235 (1.3235)	Acc@1 56.250 (56.250)	Acc@5 89.062 (89.062)
Epoch: [91][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.2936 (1.2420)	Acc@1 56.250 (62.629)	Acc@5 92.188 (90.087)
Epoch: [91][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.0210 (1.2410)	Acc@1 64.062 (62.848)	Acc@5 93.750 (89.942)
Test: [0/79]	Time 0.222 (0.222)	Loss 1.6934 (1.6934)	Acc@1 53.125 (53.125)	Acc@5 81.250 (81.250)
Test: [10/79]	Time 0.006 (0.026)	Loss 1.8871 (1.9122)	Acc@1 48.438 (51.065)	Acc@5 83.594 (78.764)
Test: [20/79]	Time 0.006 (0.017)	Loss 1.9220 (1.8943)	Acc@1 54.688 (51.562)	Acc@5 78.125 (78.981)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.7149 (1.8837)	Acc@1 49.219 (51.436)	Acc@5 81.250 (79.561)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.6343 (1.8896)	Acc@1 57.031 (51.601)	Acc@5 85.156 (79.249)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.8704 (1.8892)	Acc@1 53.906 (51.808)	Ac

 * Acc@1 52.620 Acc@5 80.070
Time/epoch: 9.634990453720093 sec
Epoch: [98][0/391]	Time 0.236 (0.236)	Data 0.224 (0.224)	Loss 1.3849 (1.3849)	Acc@1 61.719 (61.719)	Acc@5 89.062 (89.062)
Epoch: [98][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.3348 (1.2246)	Acc@1 57.031 (63.271)	Acc@5 89.062 (90.268)
Epoch: [98][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 1.3005 (1.2272)	Acc@1 58.594 (63.019)	Acc@5 89.844 (90.306)
Test: [0/79]	Time 0.209 (0.209)	Loss 1.7265 (1.7265)	Acc@1 54.688 (54.688)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.025)	Loss 1.7781 (1.8856)	Acc@1 52.344 (52.273)	Acc@5 82.031 (79.119)
Test: [20/79]	Time 0.006 (0.016)	Loss 1.9384 (1.8667)	Acc@1 53.906 (52.790)	Acc@5 78.125 (79.464)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.7746 (1.8677)	Acc@1 53.125 (51.991)	Acc@5 82.812 (80.091)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.6719 (1.8705)	Acc@1 54.688 (52.039)	Acc@5 84.375 (79.840)
Test: [50/79]	Time 0.006 (0.010)	Loss 1.9695 (1.8667)	Acc@1 51.562 (52.145)	Ac
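
In these logs, the gap between training and test accuracy is noticeably smaller for `dropout_rate = 0.6` than for `dropout_rate = 0.5`. The sketch below works through that arithmetic. It assumes that the running averages (the values in parentheses) from the last visible `Epoch: [98][300/391]` and `Test: [50/79]` lines are reasonable stand-ins for the full-epoch numbers, which are cut off in the output. The dictionary layout and the name `generalization_gap` are hypothetical.

```python
# Running-average Acc@1 values copied from the last visible log lines of
# epoch 98 for each run (train: [300/391], test: [50/79]).
runs = {
    0.5: {"train_top1": 74.520, "test_top1": 53.554},
    0.6: {"train_top1": 63.019, "test_top1": 52.145},
}


def generalization_gap(run):
    """Difference between training and test top-1 accuracy, in percentage points."""
    return round(run["train_top1"] - run["test_top1"], 3)


gaps = {rate: generalization_gap(values) for rate, values in runs.items()}

for rate, gap in sorted(gaps.items()):
    print("dropout_rate = {}: train/test gap = {:.3f} points".format(rate, gap))
```

A smaller gap alone does not mean a better model: the test accuracy of the two runs is close, so the difference comes mostly from the lower training accuracy at the higher dropout rate.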

In [31]:
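# Dropout rate for this run, passed to the notebook's AlexNet constructor.
dropout_rate = 0.7
print("=> creating model '{}'".format('AlexNet'))
model = AlexNet(droprate=dropout_rate, num_classes=100)  # CIFAR-100 has 100 classes
# Base learning rate handed to adjust_learning_rate() in the training loop.
lr = 0.001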

=> creating model 'AlexNet'


In [32]:
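# Wrap the model for data-parallel execution and move it to the GPU.
model = torch.nn.DataParallel(model).cuda()
criterion = nn.CrossEntropyLoss()
# Note: Adam is created with lr=0.002, while the training loop passes
# lr = 0.001 to adjust_learning_rate() at the start of every epoch.
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)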

In [33]:
best_acc1 = 0
for epoch in range(0, 100):
    start_time = time.time()
    adjust_learning_rate(optimizer, epoch, lr)

    # train for one epoch
    train(train_loader, model, criterion, optimizer, epoch)

    # evaluate on validation set
    acc1 = validate(val_loader, model, criterion, epoch)

    # remember best acc@1 and save checkpoint
    is_best = acc1 > best_acc1
    best_acc1 = max(acc1, best_acc1)
    
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'best_acc1': best_acc1,
        'optimizer': optimizer.state_dict(),
    }, is_best)
    print("Time/epoch: {} sec".format(time.time() - start_time))

Epoch: [0][0/391]	Time 0.217 (0.217)	Data 0.201 (0.201)	Loss 4.6057 (4.6057)	Acc@1 0.781 (0.781)	Acc@5 7.812 (7.812)
Epoch: [0][150/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 4.5478 (4.6033)	Acc@1 0.000 (1.050)	Acc@5 7.812 (5.386)
Epoch: [0][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 4.1388 (4.4575)	Acc@1 3.125 (2.152)	Acc@5 19.531 (9.746)
Test: [0/79]	Time 0.221 (0.221)	Loss 3.9268 (3.9268)	Acc@1 3.906 (3.906)	Acc@5 25.781 (25.781)
Test: [10/79]	Time 0.006 (0.026)	Loss 4.1109 (3.9969)	Acc@1 3.906 (6.960)	Acc@5 14.844 (23.011)
Test: [20/79]	Time 0.006 (0.016)	Loss 3.9329 (3.9833)	Acc@1 7.031 (6.696)	Acc@5 24.219 (23.921)
Test: [30/79]	Time 0.006 (0.013)	Loss 3.9671 (3.9754)	Acc@1 3.125 (6.603)	Acc@5 19.531 (23.841)
Test: [40/79]	Time 0.006 (0.011)	Loss 3.9904 (3.9689)	Acc@1 12.500 (6.822)	Acc@5 25.781 (24.257)
Test: [50/79]	Time 0.006 (0.010)	Loss 4.0888 (3.9675)	Acc@1 6.250 (6.924)	Acc@5 23.438 (24.112)
Test: [60/79]	Time 0.006 (0.010)	Loss 4.1437 (3.9778)	Acc@1 6.250 (6

 * Acc@1 27.960 Acc@5 58.440
Time/epoch: 10.123371601104736 sec
Epoch: [7][0/391]	Time 0.226 (0.226)	Data 0.213 (0.213)	Loss 2.9390 (2.9390)	Acc@1 26.562 (26.562)	Acc@5 59.375 (59.375)
Epoch: [7][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.9629 (2.9490)	Acc@1 21.875 (25.698)	Acc@5 57.031 (56.416)
Epoch: [7][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.9412 (2.9422)	Acc@1 21.094 (25.753)	Acc@5 60.156 (56.499)
Test: [0/79]	Time 0.238 (0.238)	Loss 2.6120 (2.6120)	Acc@1 36.719 (36.719)	Acc@5 67.188 (67.188)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.7876 (2.8181)	Acc@1 30.469 (29.403)	Acc@5 59.375 (60.440)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.7593 (2.7858)	Acc@1 32.031 (29.762)	Acc@5 65.625 (60.975)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.7395 (2.7824)	Acc@1 26.562 (29.864)	Acc@5 62.500 (61.341)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.6681 (2.7773)	Acc@1 27.344 (30.412)	Acc@5 65.625 (61.452)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.7210 (2.7782)	Acc@1 32.031 (30.499)	Acc@

 * Acc@1 35.260 Acc@5 66.690
Time/epoch: 10.176544904708862 sec
Epoch: [14][0/391]	Time 0.253 (0.253)	Data 0.240 (0.240)	Loss 2.6712 (2.6712)	Acc@1 25.000 (25.000)	Acc@5 60.938 (60.938)
Epoch: [14][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 2.5761 (2.5446)	Acc@1 33.594 (33.030)	Acc@5 63.281 (65.858)
Epoch: [14][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.4746 (2.5450)	Acc@1 35.156 (33.197)	Acc@5 68.750 (65.734)
Test: [0/79]	Time 0.222 (0.222)	Loss 2.3153 (2.3153)	Acc@1 42.969 (42.969)	Acc@5 71.094 (71.094)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.4382 (2.4562)	Acc@1 40.625 (36.719)	Acc@5 65.625 (67.756)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.5649 (2.4524)	Acc@1 39.062 (36.868)	Acc@5 65.625 (67.522)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.3562 (2.4565)	Acc@1 39.062 (36.618)	Acc@5 64.062 (67.288)
Test: [40/79]	Time 0.006 (0.011)	Loss 2.3061 (2.4532)	Acc@1 34.375 (36.776)	Acc@5 71.875 (67.454)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.4330 (2.4516)	Acc@1 35.938 (36.765)	A

 * Acc@1 39.230 Acc@5 70.530
Time/epoch: 10.051123142242432 sec
Epoch: [21][0/391]	Time 0.240 (0.240)	Data 0.226 (0.226)	Loss 2.3391 (2.3391)	Acc@1 39.844 (39.844)	Acc@5 70.312 (70.312)
Epoch: [21][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.2096 (2.3486)	Acc@1 39.844 (37.526)	Acc@5 72.656 (69.837)
Epoch: [21][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.3404 (2.3524)	Acc@1 37.500 (37.321)	Acc@5 69.531 (69.856)
Test: [0/79]	Time 0.224 (0.224)	Loss 2.2024 (2.2024)	Acc@1 42.188 (42.188)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.3903 (2.4213)	Acc@1 39.062 (39.560)	Acc@5 72.656 (69.318)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.4600 (2.3837)	Acc@1 42.969 (39.323)	Acc@5 71.094 (69.271)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.3902 (2.3686)	Acc@1 39.062 (38.634)	Acc@5 71.094 (69.430)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.2863 (2.3645)	Acc@1 37.500 (38.758)	Acc@5 72.656 (69.569)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.3440 (2.3594)	Acc@1 39.844 (38.542)	A

 * Acc@1 41.480 Acc@5 71.710
Time/epoch: 10.070176839828491 sec
Epoch: [28][0/391]	Time 0.251 (0.251)	Data 0.239 (0.239)	Loss 2.3634 (2.3634)	Acc@1 41.406 (41.406)	Acc@5 69.531 (69.531)
Epoch: [28][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.2547 (2.2274)	Acc@1 36.719 (39.419)	Acc@5 75.781 (72.672)
Epoch: [28][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.9934 (2.2380)	Acc@1 45.312 (39.182)	Acc@5 77.344 (72.386)
Test: [0/79]	Time 0.230 (0.230)	Loss 1.9986 (1.9986)	Acc@1 50.000 (50.000)	Acc@5 74.219 (74.219)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.3247 (2.2297)	Acc@1 39.844 (40.838)	Acc@5 72.656 (73.011)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.1756 (2.2068)	Acc@1 43.750 (40.588)	Acc@5 71.094 (72.991)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.1772 (2.2017)	Acc@1 44.531 (41.079)	Acc@5 72.656 (73.236)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.0947 (2.1972)	Acc@1 41.406 (41.006)	Acc@5 74.219 (73.171)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.1973 (2.1910)	Acc@1 44.531 (41.238)	A

 * Acc@1 45.190 Acc@5 75.860
Time/epoch: 9.602334260940552 sec
Epoch: [35][0/391]	Time 0.226 (0.226)	Data 0.214 (0.214)	Loss 1.7638 (1.7638)	Acc@1 52.344 (52.344)	Acc@5 80.469 (80.469)
Epoch: [35][150/391]	Time 0.019 (0.022)	Data 0.000 (0.002)	Loss 2.0795 (1.9040)	Acc@1 44.531 (46.839)	Acc@5 72.656 (78.730)
Epoch: [35][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 1.8494 (1.9145)	Acc@1 46.875 (46.548)	Acc@5 82.031 (78.548)
Test: [0/79]	Time 0.225 (0.225)	Loss 1.9278 (1.9278)	Acc@1 56.250 (56.250)	Acc@5 80.469 (80.469)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.0777 (2.0917)	Acc@1 47.656 (46.165)	Acc@5 78.906 (76.562)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.1088 (2.0770)	Acc@1 42.969 (45.387)	Acc@5 75.781 (76.079)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9616 (2.0638)	Acc@1 45.312 (45.565)	Acc@5 79.688 (76.084)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.8545 (2.0627)	Acc@1 50.000 (45.808)	Acc@5 76.562 (75.781)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.1001 (2.0577)	Acc@1 46.094 (45.619)	Ac

 * Acc@1 45.890 Acc@5 75.740
Time/epoch: 9.649132013320923 sec
Epoch: [42][0/391]	Time 0.238 (0.238)	Data 0.227 (0.227)	Loss 1.7669 (1.7669)	Acc@1 49.219 (49.219)	Acc@5 80.469 (80.469)
Epoch: [42][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.8055 (1.8544)	Acc@1 46.094 (48.075)	Acc@5 81.250 (79.481)
Epoch: [42][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 1.7218 (1.8708)	Acc@1 50.781 (47.721)	Acc@5 86.719 (79.210)
Test: [0/79]	Time 0.233 (0.233)	Loss 1.8270 (1.8270)	Acc@1 56.250 (56.250)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.1209 (2.1007)	Acc@1 47.656 (45.455)	Acc@5 78.125 (75.213)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.1219 (2.0732)	Acc@1 49.219 (46.057)	Acc@5 75.781 (75.149)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.0037 (2.0616)	Acc@1 42.969 (45.338)	Acc@5 77.344 (75.680)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.8906 (2.0506)	Acc@1 53.906 (45.903)	Acc@5 78.906 (75.667)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.0282 (2.0464)	Acc@1 48.438 (45.849)	Ac

 * Acc@1 45.790 Acc@5 76.210
Time/epoch: 9.616220951080322 sec
Epoch: [49][0/391]	Time 0.233 (0.233)	Data 0.219 (0.219)	Loss 2.0318 (2.0318)	Acc@1 46.875 (46.875)	Acc@5 74.219 (74.219)
Epoch: [49][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.7604 (1.7877)	Acc@1 50.000 (49.560)	Acc@5 79.688 (80.821)
Epoch: [49][300/391]	Time 0.021 (0.022)	Data 0.000 (0.001)	Loss 2.0182 (1.7925)	Acc@1 41.406 (49.382)	Acc@5 78.906 (80.695)
Test: [0/79]	Time 0.238 (0.238)	Loss 1.8735 (1.8735)	Acc@1 57.031 (57.031)	Acc@5 78.125 (78.125)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.1571 (2.0748)	Acc@1 44.531 (46.378)	Acc@5 74.219 (75.639)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.1812 (2.0635)	Acc@1 46.094 (45.982)	Acc@5 74.219 (75.967)
Test: [30/79]	Time 0.006 (0.014)	Loss 1.9777 (2.0528)	Acc@1 46.094 (45.640)	Acc@5 77.344 (76.436)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.8199 (2.0524)	Acc@1 53.125 (45.713)	Acc@5 78.906 (76.220)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.0211 (2.0422)	Acc@1 46.875 (45.956)	Ac

 * Acc@1 46.300 Acc@5 76.200
Time/epoch: 9.652329206466675 sec
Epoch: [56][0/391]	Time 0.234 (0.234)	Data 0.221 (0.221)	Loss 1.9933 (1.9933)	Acc@1 47.656 (47.656)	Acc@5 77.344 (77.344)
Epoch: [56][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 1.9095 (1.7633)	Acc@1 51.562 (50.414)	Acc@5 78.125 (81.421)
Epoch: [56][300/391]	Time 0.022 (0.021)	Data 0.000 (0.001)	Loss 1.9680 (1.7642)	Acc@1 42.969 (50.080)	Acc@5 72.656 (81.349)
Test: [0/79]	Time 0.224 (0.224)	Loss 1.8398 (1.8398)	Acc@1 59.375 (59.375)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.0923 (2.0652)	Acc@1 50.781 (48.366)	Acc@5 79.688 (77.273)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.2242 (2.0594)	Acc@1 42.188 (46.838)	Acc@5 75.000 (76.749)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9505 (2.0488)	Acc@1 43.750 (46.295)	Acc@5 80.469 (76.739)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.9115 (2.0441)	Acc@1 49.219 (46.037)	Acc@5 77.344 (76.562)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.0565 (2.0325)	Acc@1 46.875 (46.247)	Ac

 * Acc@1 46.580 Acc@5 76.700
Time/epoch: 9.609270334243774 sec
Epoch: [63][0/391]	Time 0.222 (0.222)	Data 0.211 (0.211)	Loss 1.9987 (1.9987)	Acc@1 46.875 (46.875)	Acc@5 76.562 (76.562)
Epoch: [63][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.8228 (1.7090)	Acc@1 49.219 (51.216)	Acc@5 82.031 (82.347)
Epoch: [63][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.6119 (1.7027)	Acc@1 59.375 (51.365)	Acc@5 83.594 (82.402)
Test: [0/79]	Time 0.222 (0.222)	Loss 1.9022 (1.9022)	Acc@1 53.906 (53.906)	Acc@5 75.781 (75.781)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.0384 (2.0453)	Acc@1 49.219 (46.804)	Acc@5 82.031 (76.989)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.2020 (2.0342)	Acc@1 42.188 (46.429)	Acc@5 76.562 (76.376)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9846 (2.0202)	Acc@1 43.750 (46.094)	Acc@5 77.344 (76.588)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.8311 (2.0187)	Acc@1 50.781 (46.322)	Acc@5 78.906 (76.372)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.0540 (2.0087)	Acc@1 49.219 (46.706)	Ac

 * Acc@1 47.030 Acc@5 76.700
Time/epoch: 10.095858573913574 sec
Epoch: [70][0/391]	Time 0.237 (0.237)	Data 0.225 (0.225)	Loss 1.5545 (1.5545)	Acc@1 53.125 (53.125)	Acc@5 82.812 (82.812)
Epoch: [70][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 1.8155 (1.6964)	Acc@1 53.906 (51.438)	Acc@5 80.469 (82.564)
Epoch: [70][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.7714 (1.6978)	Acc@1 51.562 (51.422)	Acc@5 83.594 (82.325)
Test: [0/79]	Time 0.234 (0.234)	Loss 1.8456 (1.8456)	Acc@1 55.469 (55.469)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.1969 (2.0858)	Acc@1 49.219 (46.662)	Acc@5 78.125 (76.207)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.1157 (2.0518)	Acc@1 44.531 (46.763)	Acc@5 75.781 (76.637)
Test: [30/79]	Time 0.006 (0.014)	Loss 1.9256 (2.0293)	Acc@1 46.094 (46.371)	Acc@5 78.906 (77.117)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.8199 (2.0276)	Acc@1 52.344 (46.227)	Acc@5 78.906 (76.696)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.0510 (2.0172)	Acc@1 46.875 (46.461)	A

 * Acc@1 46.780 Acc@5 76.370
Time/epoch: 9.601327657699585 sec
Epoch: [77][0/391]	Time 0.228 (0.228)	Data 0.216 (0.216)	Loss 1.5975 (1.5975)	Acc@1 53.906 (53.906)	Acc@5 83.594 (83.594)
Epoch: [77][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.8055 (1.7041)	Acc@1 52.344 (51.366)	Acc@5 81.250 (82.626)
Epoch: [77][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 1.7760 (1.6935)	Acc@1 50.781 (51.521)	Acc@5 79.688 (82.732)
Test: [0/79]	Time 0.224 (0.224)	Loss 1.9236 (1.9236)	Acc@1 53.125 (53.125)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.1729 (2.0605)	Acc@1 50.000 (47.940)	Acc@5 78.906 (77.415)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.0934 (2.0460)	Acc@1 43.750 (46.838)	Acc@5 75.781 (76.897)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9361 (2.0356)	Acc@1 43.750 (46.220)	Acc@5 78.906 (76.915)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.8265 (2.0297)	Acc@1 55.469 (46.475)	Acc@5 79.688 (76.620)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.0294 (2.0211)	Acc@1 51.562 (46.630)	Ac

 * Acc@1 46.920 Acc@5 76.820
Time/epoch: 9.640557050704956 sec
Epoch: [84][0/391]	Time 0.228 (0.228)	Data 0.213 (0.213)	Loss 1.7239 (1.7239)	Acc@1 56.250 (56.250)	Acc@5 85.938 (85.938)
Epoch: [84][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.5804 (1.6828)	Acc@1 48.438 (51.940)	Acc@5 85.156 (82.595)
Epoch: [84][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.7345 (1.6851)	Acc@1 50.000 (51.721)	Acc@5 81.250 (82.618)
Test: [0/79]	Time 0.228 (0.228)	Loss 1.8459 (1.8459)	Acc@1 55.469 (55.469)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.1929 (2.0633)	Acc@1 51.562 (47.443)	Acc@5 77.344 (76.278)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.1913 (2.0470)	Acc@1 42.188 (47.098)	Acc@5 75.781 (76.116)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9794 (2.0321)	Acc@1 43.750 (46.774)	Acc@5 77.344 (76.436)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.7476 (2.0242)	Acc@1 50.000 (46.913)	Acc@5 79.688 (76.410)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.0085 (2.0148)	Acc@1 46.875 (47.044)	Ac

 * Acc@1 46.710 Acc@5 76.520
Time/epoch: 9.749742031097412 sec
Epoch: [91][0/391]	Time 0.255 (0.255)	Data 0.243 (0.243)	Loss 1.8315 (1.8315)	Acc@1 51.562 (51.562)	Acc@5 78.906 (78.906)
Epoch: [91][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.5879 (1.6853)	Acc@1 50.781 (51.956)	Acc@5 84.375 (82.450)
Epoch: [91][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 1.5955 (1.6824)	Acc@1 54.688 (51.926)	Acc@5 83.594 (82.535)
Test: [0/79]	Time 0.225 (0.225)	Loss 1.8785 (1.8785)	Acc@1 58.594 (58.594)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.1104 (2.0324)	Acc@1 46.094 (48.366)	Acc@5 78.125 (77.060)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.1193 (2.0165)	Acc@1 47.656 (47.359)	Acc@5 78.125 (76.860)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.9741 (2.0068)	Acc@1 42.969 (46.825)	Acc@5 79.688 (77.016)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.8449 (2.0058)	Acc@1 54.688 (46.799)	Acc@5 78.125 (76.677)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.0749 (2.0013)	Acc@1 48.438 (46.936)	Ac

 * Acc@1 47.130 Acc@5 76.610
Time/epoch: 9.609364748001099 sec
Epoch: [98][0/391]	Time 0.230 (0.230)	Data 0.219 (0.219)	Loss 1.6148 (1.6148)	Acc@1 53.125 (53.125)	Acc@5 84.375 (84.375)
Epoch: [98][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.6952 (1.6726)	Acc@1 55.469 (51.588)	Acc@5 82.031 (82.673)
Epoch: [98][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.8529 (1.6837)	Acc@1 50.781 (51.557)	Acc@5 77.344 (82.602)
Test: [0/79]	Time 0.208 (0.208)	Loss 1.8318 (1.8318)	Acc@1 52.344 (52.344)	Acc@5 80.469 (80.469)
Test: [10/79]	Time 0.006 (0.025)	Loss 2.1752 (2.0689)	Acc@1 49.219 (47.088)	Acc@5 79.688 (76.705)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.2109 (2.0424)	Acc@1 46.094 (46.652)	Acc@5 71.875 (76.674)
Test: [30/79]	Time 0.006 (0.013)	Loss 1.8879 (2.0259)	Acc@1 45.312 (46.245)	Acc@5 79.688 (77.016)
Test: [40/79]	Time 0.006 (0.011)	Loss 1.8024 (2.0213)	Acc@1 55.469 (46.456)	Acc@5 78.125 (76.753)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.0061 (2.0148)	Acc@1 49.219 (46.645)	Ac

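The summary lines of the form ` * Acc@1 ... Acc@5 ...` in the output above are easier to compare across runs once they are collected into numbers. The following is a minimal sketch that is not part of the original notebook: `SUMMARY_RE` and `parse_summaries` are hypothetical names, and the sample text copies a few lines from the log of the run with `dropout_rate = 0.7`. Because the printed log is truncated, the sketch does not try to assign an epoch number to each summary line.

```python
import re

# Matches summary lines such as " * Acc@1 47.130 Acc@5 76.610".
# Hypothetical helper, not defined anywhere in the notebook.
SUMMARY_RE = re.compile(r"^\s*\*\s+Acc@1\s+([\d.]+)\s+Acc@5\s+([\d.]+)\s*$")


def parse_summaries(log_text):
    """Return a list of (top1, top5) float pairs, one per summary line found."""
    results = []
    for line in log_text.splitlines():
        match = SUMMARY_RE.match(line)
        if match:
            results.append((float(match.group(1)), float(match.group(2))))
    return results


# A few lines copied from the dropout_rate = 0.7 log above.
sample_log = "\n".join([
    " * Acc@1 35.260 Acc@5 66.690",
    "Time/epoch: 10.176544904708862 sec",
    " * Acc@1 47.130 Acc@5 76.610",
    "Time/epoch: 9.609364748001099 sec",
])

pairs = parse_summaries(sample_log)
best_top1 = max(top1 for top1, _ in pairs)
print("summary lines found:", len(pairs), "best Acc@1:", best_top1)
```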
In [34]:
dropout_rate = 0.8
print("=> creating model '{}'".format('AlexNet'))
model = AlexNet(droprate=dropout_rate, num_classes=100)
lr = 0.001

=> creating model 'AlexNet'


In [35]:
model = torch.nn.DataParallel(model).cuda()
criterion = nn.CrossEntropyLoss()
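# Note: the initial rate 0.002 set here differs from lr = 0.001 defined above,
# which is the value passed to adjust_learning_rate(optimizer, epoch, lr) in the training loop.
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)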

In [36]:
best_acc1 = 0
for epoch in range(0, 100):
    start_time = time.time()
    adjust_learning_rate(optimizer, epoch, lr)

    # train for one epoch
    train(train_loader, model, criterion, optimizer, epoch)

    # evaluate on validation set
    acc1 = validate(val_loader, model, criterion, epoch)

    # remember best acc@1 and save checkpoint
    is_best = acc1 > best_acc1
    best_acc1 = max(acc1, best_acc1)
    
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'best_acc1': best_acc1,
        'optimizer': optimizer.state_dict(),
    }, is_best)
    print("Time/epoch: {} sec".format(time.time() - start_time))

Epoch: [0][0/391]	Time 0.214 (0.214)	Data 0.201 (0.201)	Loss 4.6026 (4.6026)	Acc@1 1.562 (1.562)	Acc@5 7.031 (7.031)
Epoch: [0][150/391]	Time 0.024 (0.022)	Data 0.000 (0.001)	Loss 4.3226 (4.5050)	Acc@1 2.344 (1.635)	Acc@5 14.062 (8.428)
Epoch: [0][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 4.1432 (4.3859)	Acc@1 7.031 (2.432)	Acc@5 24.219 (11.464)
Test: [0/79]	Time 0.222 (0.222)	Loss 3.9940 (3.9940)	Acc@1 7.812 (7.812)	Acc@5 26.562 (26.562)
Test: [10/79]	Time 0.006 (0.026)	Loss 4.0839 (4.0348)	Acc@1 4.688 (5.114)	Acc@5 26.562 (22.017)
Test: [20/79]	Time 0.006 (0.017)	Loss 4.0037 (4.0167)	Acc@1 4.688 (5.469)	Acc@5 21.094 (22.433)
Test: [30/79]	Time 0.006 (0.013)	Loss 4.0245 (4.0065)	Acc@1 4.688 (5.444)	Acc@5 25.000 (23.085)
Test: [40/79]	Time 0.006 (0.012)	Loss 4.1325 (4.0106)	Acc@1 4.688 (5.697)	Acc@5 20.312 (23.590)
Test: [50/79]	Time 0.006 (0.011)	Loss 4.1814 (4.0153)	Acc@1 3.125 (5.668)	Acc@5 16.406 (23.346)
Test: [60/79]	Time 0.006 (0.010)	Loss 4.1435 (4.0200)	Acc@1 7.031 (

 * Acc@1 23.480 Acc@5 54.060
Time/epoch: 10.049119710922241 sec
Epoch: [7][0/391]	Time 0.229 (0.229)	Data 0.217 (0.217)	Loss 3.2945 (3.2945)	Acc@1 18.750 (18.750)	Acc@5 50.000 (50.000)
Epoch: [7][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 3.2784 (3.1770)	Acc@1 23.438 (21.156)	Acc@5 50.781 (50.714)
Epoch: [7][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 3.2381 (3.1613)	Acc@1 23.438 (21.185)	Acc@5 49.219 (50.924)
Test: [0/79]	Time 0.230 (0.230)	Loss 2.8472 (2.8472)	Acc@1 27.344 (27.344)	Acc@5 62.500 (62.500)
Test: [10/79]	Time 0.006 (0.027)	Loss 3.0075 (2.9775)	Acc@1 27.344 (25.355)	Acc@5 53.125 (54.901)
Test: [20/79]	Time 0.006 (0.017)	Loss 3.1248 (2.9830)	Acc@1 24.219 (25.223)	Acc@5 54.688 (55.618)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.9664 (2.9772)	Acc@1 21.094 (25.176)	Acc@5 50.000 (55.444)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.7748 (2.9720)	Acc@1 29.688 (25.324)	Acc@5 56.250 (55.240)
Test: [50/79]	Time 0.006 (0.011)	Loss 3.0835 (2.9805)	Acc@1 25.000 (25.245)	Acc@

 * Acc@1 31.310 Acc@5 62.900
Time/epoch: 9.656184196472168 sec
Epoch: [14][0/391]	Time 0.238 (0.238)	Data 0.224 (0.224)	Loss 2.7984 (2.7984)	Acc@1 29.688 (29.688)	Acc@5 56.250 (56.250)
Epoch: [14][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 2.6250 (2.8188)	Acc@1 28.125 (27.561)	Acc@5 60.938 (59.328)
Epoch: [14][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 2.7278 (2.8156)	Acc@1 28.125 (27.606)	Acc@5 63.281 (59.406)
Test: [0/79]	Time 0.235 (0.235)	Loss 2.4972 (2.4972)	Acc@1 36.719 (36.719)	Acc@5 71.875 (71.875)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.6708 (2.6633)	Acc@1 32.812 (31.037)	Acc@5 62.500 (64.062)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.7652 (2.6520)	Acc@1 35.938 (32.254)	Acc@5 61.719 (63.802)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.5664 (2.6431)	Acc@1 32.031 (32.283)	Acc@5 61.719 (63.735)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.4718 (2.6573)	Acc@1 33.594 (31.993)	Acc@5 68.750 (63.434)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.6662 (2.6624)	Acc@1 31.250 (31.756)	Ac

 * Acc@1 35.450 Acc@5 67.050
Time/epoch: 10.060152769088745 sec
Epoch: [21][0/391]	Time 0.240 (0.240)	Data 0.226 (0.226)	Loss 2.4311 (2.4311)	Acc@1 32.031 (32.031)	Acc@5 68.750 (68.750)
Epoch: [21][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.7421 (2.6135)	Acc@1 24.219 (31.529)	Acc@5 62.500 (64.228)
Epoch: [21][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.3616 (2.6142)	Acc@1 38.281 (31.603)	Acc@5 68.750 (64.195)
Test: [0/79]	Time 0.232 (0.232)	Loss 2.1968 (2.1968)	Acc@1 41.406 (41.406)	Acc@5 75.000 (75.000)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.5092 (2.4906)	Acc@1 38.281 (34.162)	Acc@5 69.531 (68.182)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.5775 (2.4702)	Acc@1 33.594 (35.193)	Acc@5 64.062 (68.452)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.4524 (2.4746)	Acc@1 37.500 (35.383)	Acc@5 66.406 (68.170)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.2255 (2.4711)	Acc@1 41.406 (35.880)	Acc@5 72.656 (67.950)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.5975 (2.4718)	Acc@1 40.625 (36.229)	A

 * Acc@1 38.710 Acc@5 69.650
Time/epoch: 10.055457830429077 sec
Epoch: [28][0/391]	Time 0.243 (0.243)	Data 0.231 (0.231)	Loss 2.4941 (2.4941)	Acc@1 32.812 (32.812)	Acc@5 69.531 (69.531)
Epoch: [28][150/391]	Time 0.019 (0.022)	Data 0.000 (0.002)	Loss 2.9228 (2.4826)	Acc@1 28.906 (34.691)	Acc@5 57.812 (67.229)
Epoch: [28][300/391]	Time 0.019 (0.021)	Data 0.000 (0.001)	Loss 2.5191 (2.4929)	Acc@1 42.188 (34.445)	Acc@5 64.062 (66.910)
Test: [0/79]	Time 0.226 (0.226)	Loss 2.1782 (2.1782)	Acc@1 45.312 (45.312)	Acc@5 73.438 (73.438)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.3105 (2.4093)	Acc@1 37.500 (37.358)	Acc@5 70.312 (68.963)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.5247 (2.4003)	Acc@1 31.250 (37.091)	Acc@5 67.188 (69.085)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.3615 (2.3910)	Acc@1 35.938 (37.122)	Acc@5 67.188 (69.355)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.1201 (2.3878)	Acc@1 42.188 (37.062)	Acc@5 75.781 (69.379)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.4717 (2.3882)	Acc@1 38.281 (36.903)	A

 * Acc@1 41.390 Acc@5 73.340
Time/epoch: 10.105263471603394 sec
Epoch: [35][0/391]	Time 0.228 (0.228)	Data 0.216 (0.216)	Loss 2.1536 (2.1536)	Acc@1 38.281 (38.281)	Acc@5 74.219 (74.219)
Epoch: [35][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.3012 (2.1903)	Acc@1 35.156 (39.776)	Acc@5 71.875 (73.070)
Epoch: [35][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 2.2619 (2.1845)	Acc@1 42.969 (40.150)	Acc@5 76.562 (73.274)
Test: [0/79]	Time 0.228 (0.228)	Loss 1.9858 (1.9858)	Acc@1 45.312 (45.312)	Acc@5 78.125 (78.125)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.2124 (2.2179)	Acc@1 44.531 (41.264)	Acc@5 69.531 (73.651)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.4295 (2.2090)	Acc@1 38.281 (41.667)	Acc@5 70.312 (73.177)
Test: [30/79]	Time 0.007 (0.013)	Loss 2.1653 (2.1998)	Acc@1 41.406 (42.112)	Acc@5 71.875 (73.311)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.0853 (2.2084)	Acc@1 41.406 (41.978)	Acc@5 75.781 (73.114)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.1525 (2.1971)	Acc@1 43.750 (42.034)	A

 * Acc@1 41.970 Acc@5 73.540
Time/epoch: 9.755671262741089 sec
Epoch: [42][0/391]	Time 0.231 (0.231)	Data 0.218 (0.218)	Loss 2.1242 (2.1242)	Acc@1 36.719 (36.719)	Acc@5 74.219 (74.219)
Epoch: [42][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.9865 (2.0919)	Acc@1 42.188 (42.767)	Acc@5 77.344 (74.746)
Epoch: [42][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.1257 (2.1030)	Acc@1 44.531 (42.162)	Acc@5 72.656 (74.691)
Test: [0/79]	Time 0.238 (0.238)	Loss 2.0053 (2.0053)	Acc@1 48.438 (48.438)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.1360 (2.2143)	Acc@1 48.438 (42.685)	Acc@5 74.219 (73.793)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.4126 (2.2042)	Acc@1 37.500 (42.374)	Acc@5 68.750 (73.363)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.2844 (2.1916)	Acc@1 41.406 (42.288)	Acc@5 74.219 (73.639)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.9477 (2.1868)	Acc@1 46.094 (42.416)	Acc@5 79.688 (73.571)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.2841 (2.1838)	Acc@1 38.281 (42.341)	Ac

 * Acc@1 42.520 Acc@5 73.670
Time/epoch: 10.08776307106018 sec
Epoch: [49][0/391]	Time 0.222 (0.222)	Data 0.210 (0.210)	Loss 2.1403 (2.1403)	Acc@1 41.406 (41.406)	Acc@5 75.000 (75.000)
Epoch: [49][150/391]	Time 0.019 (0.022)	Data 0.000 (0.002)	Loss 2.0046 (2.0457)	Acc@1 42.969 (43.248)	Acc@5 75.781 (76.206)
Epoch: [49][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.8172 (2.0460)	Acc@1 48.438 (43.119)	Acc@5 78.906 (76.080)
Test: [0/79]	Time 0.225 (0.225)	Loss 1.9534 (1.9534)	Acc@1 48.438 (48.438)	Acc@5 78.125 (78.125)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.1426 (2.1604)	Acc@1 49.219 (43.963)	Acc@5 73.438 (74.290)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.2822 (2.1654)	Acc@1 42.188 (43.229)	Acc@5 71.875 (73.735)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.2003 (2.1650)	Acc@1 41.406 (43.019)	Acc@5 72.656 (73.664)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.0371 (2.1674)	Acc@1 46.875 (43.121)	Acc@5 78.125 (73.628)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.1731 (2.1616)	Acc@1 43.750 (43.107)	Ac

 * Acc@1 42.830 Acc@5 74.090
Time/epoch: 9.609729766845703 sec
Epoch: [56][0/391]	Time 0.235 (0.235)	Data 0.222 (0.222)	Loss 1.9042 (1.9042)	Acc@1 46.875 (46.875)	Acc@5 79.688 (79.688)
Epoch: [56][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.9375 (2.0049)	Acc@1 46.094 (44.045)	Acc@5 76.562 (76.418)
Epoch: [56][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.0292 (1.9976)	Acc@1 42.188 (44.170)	Acc@5 73.438 (76.679)
Test: [0/79]	Time 0.235 (0.235)	Loss 1.8924 (1.8924)	Acc@1 52.344 (52.344)	Acc@5 78.125 (78.125)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.1325 (2.1365)	Acc@1 47.656 (44.176)	Acc@5 72.656 (75.497)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.2453 (2.1413)	Acc@1 44.531 (44.085)	Acc@5 72.656 (74.963)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.1774 (2.1435)	Acc@1 46.094 (43.800)	Acc@5 76.562 (74.849)
Test: [40/79]	Time 0.007 (0.012)	Loss 1.9202 (2.1465)	Acc@1 50.000 (43.731)	Acc@5 78.125 (74.428)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.2124 (2.1357)	Acc@1 41.406 (43.980)	Ac

 * Acc@1 43.000 Acc@5 74.110
Time/epoch: 9.599185466766357 sec
Epoch: [63][0/391]	Time 0.245 (0.245)	Data 0.234 (0.234)	Loss 2.0805 (2.0805)	Acc@1 41.406 (41.406)	Acc@5 77.344 (77.344)
Epoch: [63][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.6311 (1.9658)	Acc@1 54.688 (44.961)	Acc@5 85.938 (77.437)
Epoch: [63][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.8138 (1.9591)	Acc@1 47.656 (45.017)	Acc@5 83.594 (77.663)
Test: [0/79]	Time 0.234 (0.234)	Loss 1.9208 (1.9208)	Acc@1 48.438 (48.438)	Acc@5 78.906 (78.906)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.0874 (2.1466)	Acc@1 44.531 (42.685)	Acc@5 74.219 (74.929)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.3022 (2.1487)	Acc@1 45.312 (43.824)	Acc@5 71.094 (74.330)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.0721 (2.1395)	Acc@1 44.531 (43.548)	Acc@5 77.344 (74.370)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.8994 (2.1361)	Acc@1 47.656 (43.331)	Acc@5 80.469 (74.181)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.1844 (2.1321)	Acc@1 46.875 (43.398)	Ac

 * Acc@1 42.780 Acc@5 74.480
Time/epoch: 9.656151533126831 sec
Epoch: [70][0/391]	Time 0.225 (0.225)	Data 0.213 (0.213)	Loss 1.9709 (1.9709)	Acc@1 47.656 (47.656)	Acc@5 80.469 (80.469)
Epoch: [70][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 1.9070 (1.9501)	Acc@1 42.969 (45.380)	Acc@5 78.906 (77.701)
Epoch: [70][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.8887 (1.9471)	Acc@1 46.875 (45.585)	Acc@5 78.906 (77.827)
Test: [0/79]	Time 0.227 (0.227)	Loss 1.9555 (1.9555)	Acc@1 51.562 (51.562)	Acc@5 78.125 (78.125)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.1172 (2.1593)	Acc@1 47.656 (44.034)	Acc@5 75.000 (75.213)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.2382 (2.1547)	Acc@1 45.312 (44.717)	Acc@5 68.750 (74.888)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.1040 (2.1424)	Acc@1 44.531 (44.380)	Acc@5 75.000 (74.723)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.9679 (2.1423)	Acc@1 46.875 (43.941)	Acc@5 79.688 (74.505)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.1640 (2.1335)	Acc@1 41.406 (43.750)	Ac

 * Acc@1 42.910 Acc@5 74.060
Time/epoch: 9.622703313827515 sec
Epoch: [77][0/391]	Time 0.242 (0.242)	Data 0.228 (0.228)	Loss 2.0181 (2.0181)	Acc@1 42.969 (42.969)	Acc@5 79.688 (79.688)
Epoch: [77][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.0222 (1.9293)	Acc@1 45.312 (45.644)	Acc@5 73.438 (78.213)
Epoch: [77][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.6403 (1.9342)	Acc@1 53.906 (45.512)	Acc@5 84.375 (77.907)
Test: [0/79]	Time 0.229 (0.229)	Loss 2.0417 (2.0417)	Acc@1 48.438 (48.438)	Acc@5 78.125 (78.125)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.1736 (2.1533)	Acc@1 42.969 (42.969)	Acc@5 72.656 (75.000)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.3147 (2.1471)	Acc@1 42.969 (43.527)	Acc@5 70.312 (74.516)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.1712 (2.1372)	Acc@1 44.531 (43.826)	Acc@5 75.000 (74.446)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.0420 (2.1395)	Acc@1 46.875 (43.598)	Acc@5 78.125 (74.314)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.1570 (2.1349)	Acc@1 43.750 (43.597)	Ac

 * Acc@1 43.350 Acc@5 73.590
Time/epoch: 9.600590944290161 sec
Epoch: [84][0/391]	Time 0.247 (0.247)	Data 0.233 (0.233)	Loss 1.8891 (1.8891)	Acc@1 46.094 (46.094)	Acc@5 80.469 (80.469)
Epoch: [84][150/391]	Time 0.019 (0.022)	Data 0.000 (0.002)	Loss 2.0315 (1.9148)	Acc@1 53.125 (46.021)	Acc@5 78.125 (78.529)
Epoch: [84][300/391]	Time 0.022 (0.022)	Data 0.000 (0.001)	Loss 1.9482 (1.9291)	Acc@1 39.844 (45.689)	Acc@5 78.125 (78.021)
Test: [0/79]	Time 0.243 (0.243)	Loss 1.9524 (1.9524)	Acc@1 46.875 (46.875)	Acc@5 77.344 (77.344)
Test: [10/79]	Time 0.006 (0.028)	Loss 2.1505 (2.1693)	Acc@1 44.531 (42.330)	Acc@5 74.219 (74.503)
Test: [20/79]	Time 0.006 (0.018)	Loss 2.2211 (2.1465)	Acc@1 46.094 (43.564)	Acc@5 72.656 (74.293)
Test: [30/79]	Time 0.007 (0.014)	Loss 2.1643 (2.1396)	Acc@1 43.750 (43.574)	Acc@5 75.000 (74.345)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.9012 (2.1390)	Acc@1 46.875 (43.540)	Acc@5 78.125 (74.219)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.1388 (2.1308)	Acc@1 45.312 (43.781)	Ac

 * Acc@1 43.140 Acc@5 74.550
Time/epoch: 9.768069505691528 sec
Epoch: [91][0/391]	Time 0.261 (0.261)	Data 0.242 (0.242)	Loss 1.6117 (1.6117)	Acc@1 56.250 (56.250)	Acc@5 84.375 (84.375)
Epoch: [91][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.8964 (1.9279)	Acc@1 49.219 (46.135)	Acc@5 80.469 (78.011)
Epoch: [91][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 1.8887 (1.9382)	Acc@1 44.531 (45.645)	Acc@5 82.812 (77.850)
Test: [0/79]	Time 0.259 (0.259)	Loss 1.9235 (1.9235)	Acc@1 51.562 (51.562)	Acc@5 79.688 (79.688)
Test: [10/79]	Time 0.006 (0.029)	Loss 2.1671 (2.1255)	Acc@1 44.531 (43.750)	Acc@5 71.875 (75.639)
Test: [20/79]	Time 0.006 (0.018)	Loss 2.3067 (2.1294)	Acc@1 41.406 (44.085)	Acc@5 67.969 (75.074)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.2429 (2.1234)	Acc@1 40.625 (43.851)	Acc@5 72.656 (75.277)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.8729 (2.1291)	Acc@1 50.000 (43.579)	Acc@5 79.688 (74.943)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.2215 (2.1285)	Acc@1 42.188 (43.781)	Ac

 * Acc@1 43.300 Acc@5 74.290
Time/epoch: 9.583700895309448 sec
Epoch: [98][0/391]	Time 0.246 (0.246)	Data 0.234 (0.234)	Loss 1.8645 (1.8645)	Acc@1 48.438 (48.438)	Acc@5 78.906 (78.906)
Epoch: [98][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 1.6879 (1.9272)	Acc@1 55.469 (45.576)	Acc@5 78.906 (78.337)
Epoch: [98][300/391]	Time 0.022 (0.021)	Data 0.000 (0.001)	Loss 1.8985 (1.9223)	Acc@1 49.219 (45.746)	Acc@5 79.688 (78.104)
Test: [0/79]	Time 0.232 (0.232)	Loss 1.9010 (1.9010)	Acc@1 48.438 (48.438)	Acc@5 78.906 (78.906)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.1362 (2.1523)	Acc@1 46.094 (42.756)	Acc@5 72.656 (74.645)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.2663 (2.1419)	Acc@1 46.094 (44.234)	Acc@5 68.750 (74.293)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.1497 (2.1259)	Acc@1 44.531 (44.355)	Acc@5 75.781 (74.748)
Test: [40/79]	Time 0.006 (0.012)	Loss 1.9416 (2.1310)	Acc@1 46.094 (44.150)	Acc@5 79.688 (74.428)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.1866 (2.1276)	Acc@1 47.656 (44.102)	Ac

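The cells for `dropout_rate` 0.7, 0.8 and 0.9 repeat the same setup and training loop and differ only in the rate. One way to avoid the repetition is sketched below, under clearly stated assumptions: `run_experiment`, `sweep`, `build_model`, `train_one_epoch` and `evaluate` are hypothetical names that do not appear in the notebook. The three callables stand in for the `AlexNet(...)` construction, `train(...)` and `validate(...)`, and the stubs at the bottom exist only so the sketch runs without a GPU. Checkpointing and `adjust_learning_rate` are left out for brevity.

```python
def run_experiment(dropout_rate, build_model, train_one_epoch, evaluate, epochs=100):
    """Train one model for a given dropout rate and return its best top-1 accuracy.

    build_model, train_one_epoch and evaluate are hypothetical callables that a
    caller would back with the notebook's own model, train and validate code.
    """
    model = build_model(dropout_rate)
    best_acc1 = 0.0
    for epoch in range(epochs):
        train_one_epoch(model, epoch)
        acc1 = evaluate(model, epoch)
        # Same bookkeeping as in the notebook cells: keep the best top-1 so far.
        best_acc1 = max(acc1, best_acc1)
    return best_acc1


def sweep(dropout_rates, build_model, train_one_epoch, evaluate, epochs=100):
    """Run one experiment per dropout rate and map each rate to its best top-1."""
    return {
        rate: run_experiment(rate, build_model, train_one_epoch, evaluate, epochs)
        for rate in dropout_rates
    }


# Stubs for illustration only; they do no real training.
def stub_build(dropout_rate):
    return {"droprate": dropout_rate}


def stub_train(model, epoch):
    return None


def stub_evaluate(model, epoch):
    # Pretends the accuracy equals the epoch index.
    return float(epoch)


demo_results = sweep([0.7, 0.8, 0.9], stub_build, stub_train, stub_evaluate, epochs=3)
print(demo_results)
```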
In [37]:
dropout_rate = 0.9
print("=> creating model '{}'".format('AlexNet'))
model = AlexNet(droprate=dropout_rate, num_classes=100)
lr = 0.001

=> creating model 'AlexNet'


In [38]:
model = torch.nn.DataParallel(model).cuda()
criterion = nn.CrossEntropyLoss()
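# Note: the initial rate 0.002 set here differs from lr = 0.001 defined above,
# which is the value passed to adjust_learning_rate(optimizer, epoch, lr) in the training loop.
optimizer = torch.optim.Adam(model.parameters(), lr=0.002)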

In [39]:
best_acc1 = 0
for epoch in range(0, 100):
    start_time = time.time()
    adjust_learning_rate(optimizer, epoch, lr)

    # train for one epoch
    train(train_loader, model, criterion, optimizer, epoch)

    # evaluate on validation set
    acc1 = validate(val_loader, model, criterion, epoch)

    # remember best acc@1 and save checkpoint
    is_best = acc1 > best_acc1
    best_acc1 = max(acc1, best_acc1)
    
    save_checkpoint({
        'epoch': epoch + 1,
        'state_dict': model.state_dict(),
        'best_acc1': best_acc1,
        'optimizer' : optimizer.state_dict(),
    }, is_best)
    print("Time/epoch: {} sec".format(time.time() - start_time))

Epoch: [0][0/391]	Time 0.221 (0.221)	Data 0.209 (0.209)	Loss 4.6026 (4.6026)	Acc@1 0.781 (0.781)	Acc@5 6.250 (6.250)
Epoch: [0][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 4.3608 (4.5668)	Acc@1 1.562 (1.257)	Acc@5 10.938 (6.002)
Epoch: [0][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 4.3338 (4.4892)	Acc@1 0.000 (1.835)	Acc@5 8.594 (8.599)
Test: [0/79]	Time 0.262 (0.262)	Loss 4.1273 (4.1273)	Acc@1 4.688 (4.688)	Acc@5 19.531 (19.531)
Test: [10/79]	Time 0.006 (0.030)	Loss 4.2436 (4.1701)	Acc@1 4.688 (4.403)	Acc@5 21.094 (20.739)
Test: [20/79]	Time 0.006 (0.019)	Loss 4.2305 (4.1529)	Acc@1 5.469 (5.246)	Acc@5 21.094 (21.466)
Test: [30/79]	Time 0.006 (0.015)	Loss 4.1513 (4.1465)	Acc@1 5.469 (5.141)	Acc@5 21.094 (21.144)
Test: [40/79]	Time 0.006 (0.013)	Loss 4.1828 (4.1533)	Acc@1 3.906 (5.145)	Acc@5 17.188 (20.960)
Test: [50/79]	Time 0.006 (0.011)	Loss 4.2227 (4.1559)	Acc@1 2.344 (4.948)	Acc@5 14.062 (20.726)
Test: [60/79]	Time 0.006 (0.011)	Loss 4.2872 (4.1608)	Acc@1 2.344 (4.

 * Acc@1 12.620 Acc@5 38.730
Time/epoch: 10.122535705566406 sec
Epoch: [7][0/391]	Time 0.228 (0.228)	Data 0.215 (0.215)	Loss 3.5861 (3.5861)	Acc@1 9.375 (9.375)	Acc@5 35.156 (35.156)
Epoch: [7][150/391]	Time 0.019 (0.022)	Data 0.000 (0.002)	Loss 3.7603 (3.6827)	Acc@1 10.156 (10.477)	Acc@5 26.562 (34.411)
Epoch: [7][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 3.5791 (3.6824)	Acc@1 13.281 (10.629)	Acc@5 35.156 (34.492)
Test: [0/79]	Time 0.231 (0.231)	Loss 3.5396 (3.5396)	Acc@1 10.938 (10.938)	Acc@5 35.156 (35.156)
Test: [10/79]	Time 0.006 (0.027)	Loss 3.5823 (3.5624)	Acc@1 18.750 (13.636)	Acc@5 40.625 (38.778)
Test: [20/79]	Time 0.006 (0.017)	Loss 3.5843 (3.5599)	Acc@1 11.719 (13.356)	Acc@5 39.062 (39.100)
Test: [30/79]	Time 0.006 (0.014)	Loss 3.5525 (3.5647)	Acc@1 14.062 (13.407)	Acc@5 42.188 (39.113)
Test: [40/79]	Time 0.006 (0.012)	Loss 3.4679 (3.5641)	Acc@1 16.406 (13.624)	Acc@5 35.938 (38.986)
Test: [50/79]	Time 0.006 (0.011)	Loss 3.5493 (3.5576)	Acc@1 14.844 (13.710)	Acc@5 

 * Acc@1 16.780 Acc@5 44.430
Time/epoch: 9.601755857467651 sec
Epoch: [14][0/391]	Time 0.238 (0.238)	Data 0.225 (0.225)	Loss 3.3137 (3.3137)	Acc@1 12.500 (12.500)	Acc@5 49.219 (49.219)
Epoch: [14][150/391]	Time 0.022 (0.022)	Data 0.000 (0.002)	Loss 3.4538 (3.4151)	Acc@1 18.750 (15.025)	Acc@5 35.156 (42.436)
Epoch: [14][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 3.2131 (3.4144)	Acc@1 21.094 (15.153)	Acc@5 50.781 (42.481)
Test: [0/79]	Time 0.250 (0.250)	Loss 3.1888 (3.1888)	Acc@1 22.656 (22.656)	Acc@5 50.000 (50.000)
Test: [10/79]	Time 0.006 (0.028)	Loss 3.2298 (3.3152)	Acc@1 17.969 (17.614)	Acc@5 48.438 (45.241)
Test: [20/79]	Time 0.006 (0.018)	Loss 3.3751 (3.3004)	Acc@1 14.844 (18.973)	Acc@5 44.531 (46.317)
Test: [30/79]	Time 0.006 (0.014)	Loss 3.2695 (3.2933)	Acc@1 17.188 (18.952)	Acc@5 47.656 (46.043)
Test: [40/79]	Time 0.006 (0.012)	Loss 3.2020 (3.2923)	Acc@1 24.219 (19.093)	Acc@5 52.344 (46.589)
Test: [50/79]	Time 0.006 (0.011)	Loss 3.3325 (3.2920)	Acc@1 17.188 (19.240)	Ac

 * Acc@1 20.840 Acc@5 50.950
Time/epoch: 10.12453031539917 sec
Epoch: [21][0/391]	Time 0.251 (0.251)	Data 0.239 (0.239)	Loss 3.2527 (3.2527)	Acc@1 16.406 (16.406)	Acc@5 42.969 (42.969)
Epoch: [21][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 3.2368 (3.2332)	Acc@1 15.625 (18.383)	Acc@5 41.406 (47.646)
Epoch: [21][300/391]	Time 0.020 (0.021)	Data 0.000 (0.001)	Loss 3.1289 (3.2561)	Acc@1 20.312 (18.070)	Acc@5 53.906 (47.373)
Test: [0/79]	Time 0.230 (0.230)	Loss 2.9438 (2.9438)	Acc@1 21.094 (21.094)	Acc@5 53.906 (53.906)
Test: [10/79]	Time 0.006 (0.027)	Loss 3.0266 (3.1212)	Acc@1 24.219 (19.886)	Acc@5 54.688 (50.142)
Test: [20/79]	Time 0.007 (0.017)	Loss 3.1352 (3.1239)	Acc@1 24.219 (21.466)	Acc@5 48.438 (50.893)
Test: [30/79]	Time 0.006 (0.014)	Loss 3.0258 (3.1253)	Acc@1 21.094 (21.169)	Acc@5 51.562 (50.958)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.8982 (3.1142)	Acc@1 23.438 (21.418)	Acc@5 57.031 (51.696)
Test: [50/79]	Time 0.006 (0.011)	Loss 3.1435 (3.1102)	Acc@1 19.531 (21.492)	Ac

 * Acc@1 23.130 Acc@5 53.170
Time/epoch: 9.62761640548706 sec
Epoch: [28][0/391]	Time 0.247 (0.247)	Data 0.233 (0.233)	Loss 3.1627 (3.1627)	Acc@1 23.438 (23.438)	Acc@5 49.219 (49.219)
Epoch: [28][150/391]	Time 0.020 (0.022)	Data 0.000 (0.002)	Loss 3.2758 (3.1520)	Acc@1 17.188 (19.992)	Acc@5 46.875 (49.809)
Epoch: [28][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 3.0008 (3.1413)	Acc@1 28.125 (20.157)	Acc@5 53.125 (49.982)
Test: [0/79]	Time 0.240 (0.240)	Loss 2.7828 (2.7828)	Acc@1 28.906 (28.906)	Acc@5 63.281 (63.281)
Test: [10/79]	Time 0.006 (0.028)	Loss 2.9793 (3.0193)	Acc@1 29.688 (24.574)	Acc@5 55.469 (54.332)
Test: [20/79]	Time 0.006 (0.017)	Loss 3.0419 (3.0140)	Acc@1 21.875 (24.182)	Acc@5 54.688 (53.943)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.9832 (3.0080)	Acc@1 23.438 (24.118)	Acc@5 52.344 (53.730)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.8271 (3.0023)	Acc@1 29.688 (24.352)	Acc@5 57.031 (54.154)
Test: [50/79]	Time 0.006 (0.011)	Loss 3.0330 (2.9994)	Acc@1 24.219 (24.387)	Acc

 * Acc@1 27.360 Acc@5 57.680
Time/epoch: 10.077147483825684 sec
Epoch: [35][0/391]	Time 0.268 (0.268)	Data 0.255 (0.255)	Loss 2.9205 (2.9205)	Acc@1 25.781 (25.781)	Acc@5 54.688 (54.688)
Epoch: [35][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.9078 (2.9121)	Acc@1 26.562 (24.032)	Acc@5 58.594 (55.707)
Epoch: [35][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.9782 (2.9150)	Acc@1 25.781 (23.900)	Acc@5 59.375 (55.785)
Test: [0/79]	Time 0.263 (0.263)	Loss 2.6485 (2.6485)	Acc@1 27.344 (27.344)	Acc@5 62.500 (62.500)
Test: [10/79]	Time 0.006 (0.030)	Loss 2.6937 (2.8505)	Acc@1 31.250 (25.994)	Acc@5 60.156 (56.818)
Test: [20/79]	Time 0.006 (0.019)	Loss 2.9355 (2.8433)	Acc@1 25.781 (26.376)	Acc@5 56.250 (56.845)
Test: [30/79]	Time 0.006 (0.015)	Loss 2.8353 (2.8371)	Acc@1 24.219 (26.865)	Acc@5 56.250 (57.283)
Test: [40/79]	Time 0.006 (0.013)	Loss 2.5900 (2.8229)	Acc@1 34.375 (27.801)	Acc@5 62.500 (57.927)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.7862 (2.8224)	Acc@1 27.344 (28.064)	A

 * Acc@1 27.890 Acc@5 58.700
Time/epoch: 9.627776145935059 sec
Epoch: [42][0/391]	Time 0.251 (0.251)	Data 0.240 (0.240)	Loss 2.7451 (2.7451)	Acc@1 25.781 (25.781)	Acc@5 59.375 (59.375)
Epoch: [42][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.9262 (2.8478)	Acc@1 25.781 (25.559)	Acc@5 52.344 (57.683)
Epoch: [42][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.6376 (2.8428)	Acc@1 27.344 (25.555)	Acc@5 64.062 (57.825)
Test: [0/79]	Time 0.219 (0.219)	Loss 2.6072 (2.6072)	Acc@1 26.562 (26.562)	Acc@5 63.281 (63.281)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.6793 (2.8391)	Acc@1 33.594 (27.699)	Acc@5 64.062 (58.452)
Test: [20/79]	Time 0.006 (0.016)	Loss 2.8285 (2.8178)	Acc@1 24.219 (27.493)	Acc@5 57.031 (58.036)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.7999 (2.8062)	Acc@1 25.781 (27.797)	Acc@5 53.906 (58.115)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.6179 (2.7959)	Acc@1 35.938 (28.335)	Acc@5 61.719 (58.575)
Test: [50/79]	Time 0.006 (0.010)	Loss 2.8379 (2.7959)	Acc@1 24.219 (28.416)	Ac

 * Acc@1 27.780 Acc@5 58.470
Time/epoch: 9.644168138504028 sec
Epoch: [49][0/391]	Time 0.239 (0.239)	Data 0.226 (0.226)	Loss 3.1009 (3.1009)	Acc@1 17.969 (17.969)	Acc@5 41.406 (41.406)
Epoch: [49][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.7934 (2.8007)	Acc@1 24.219 (26.211)	Acc@5 54.688 (58.537)
Epoch: [49][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.7106 (2.7944)	Acc@1 17.969 (26.306)	Acc@5 58.594 (58.801)
Test: [0/79]	Time 0.235 (0.235)	Loss 2.6159 (2.6159)	Acc@1 30.469 (30.469)	Acc@5 67.969 (67.969)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.7057 (2.8186)	Acc@1 32.812 (28.267)	Acc@5 60.156 (58.736)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.8971 (2.8096)	Acc@1 28.906 (28.646)	Acc@5 60.938 (58.408)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.7700 (2.8043)	Acc@1 28.125 (28.251)	Acc@5 57.812 (58.619)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.6004 (2.7937)	Acc@1 34.375 (28.830)	Acc@5 61.719 (58.880)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.8206 (2.7877)	Acc@1 25.000 (28.891)	Ac

 * Acc@1 28.710 Acc@5 59.700
Time/epoch: 10.232307195663452 sec
Epoch: [56][0/391]	Time 0.255 (0.255)	Data 0.242 (0.242)	Loss 2.5894 (2.5894)	Acc@1 25.781 (25.781)	Acc@5 60.156 (60.156)
Epoch: [56][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.6943 (2.7838)	Acc@1 29.688 (26.568)	Acc@5 60.938 (59.111)
Epoch: [56][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.4886 (2.7826)	Acc@1 32.031 (26.511)	Acc@5 71.094 (58.960)
Test: [0/79]	Time 0.246 (0.246)	Loss 2.6132 (2.6132)	Acc@1 28.906 (28.906)	Acc@5 64.062 (64.062)
Test: [10/79]	Time 0.006 (0.028)	Loss 2.6622 (2.7926)	Acc@1 30.469 (28.338)	Acc@5 61.719 (58.523)
Test: [20/79]	Time 0.006 (0.018)	Loss 2.8181 (2.7791)	Acc@1 28.125 (28.683)	Acc@5 58.594 (58.333)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.8396 (2.7808)	Acc@1 24.219 (28.604)	Acc@5 57.812 (59.173)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.5523 (2.7689)	Acc@1 32.812 (28.868)	Acc@5 68.750 (59.527)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.7581 (2.7631)	Acc@1 28.906 (29.059)	A

 * Acc@1 29.000 Acc@5 60.380
Time/epoch: 9.65976357460022 sec
Epoch: [63][0/391]	Time 0.270 (0.270)	Data 0.257 (0.257)	Loss 2.4484 (2.4484)	Acc@1 35.156 (35.156)	Acc@5 67.188 (67.188)
Epoch: [63][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.6846 (2.7238)	Acc@1 24.219 (27.478)	Acc@5 60.938 (60.322)
Epoch: [63][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.6630 (2.7267)	Acc@1 23.438 (27.554)	Acc@5 64.844 (60.216)
Test: [0/79]	Time 0.237 (0.237)	Loss 2.6238 (2.6238)	Acc@1 30.469 (30.469)	Acc@5 69.531 (69.531)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.6174 (2.7892)	Acc@1 32.812 (28.835)	Acc@5 65.625 (59.801)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.8187 (2.7848)	Acc@1 26.562 (28.274)	Acc@5 57.812 (58.705)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.7739 (2.7756)	Acc@1 26.562 (28.377)	Acc@5 61.719 (59.451)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.5039 (2.7643)	Acc@1 36.719 (28.925)	Acc@5 70.312 (59.661)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.7933 (2.7617)	Acc@1 28.125 (29.136)	Acc

 * Acc@1 29.300 Acc@5 59.780
Time/epoch: 9.736753702163696 sec
Epoch: [70][0/391]	Time 0.260 (0.260)	Data 0.248 (0.248)	Loss 2.9652 (2.9652)	Acc@1 21.875 (21.875)	Acc@5 56.250 (56.250)
Epoch: [70][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.7908 (2.7199)	Acc@1 28.125 (27.815)	Acc@5 60.156 (60.891)
Epoch: [70][300/391]	Time 0.019 (0.021)	Data 0.000 (0.001)	Loss 2.4080 (2.7128)	Acc@1 35.938 (27.834)	Acc@5 70.312 (60.873)
Test: [0/79]	Time 0.240 (0.240)	Loss 2.5782 (2.5782)	Acc@1 29.688 (29.688)	Acc@5 67.188 (67.188)
Test: [10/79]	Time 0.006 (0.028)	Loss 2.6040 (2.7798)	Acc@1 32.812 (28.196)	Acc@5 67.969 (59.801)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.7457 (2.7747)	Acc@1 28.125 (27.865)	Acc@5 63.281 (59.152)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.7157 (2.7633)	Acc@1 27.344 (28.150)	Acc@5 59.375 (59.854)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.5431 (2.7525)	Acc@1 34.375 (28.735)	Acc@5 62.500 (60.004)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.7310 (2.7484)	Acc@1 32.812 (29.167)	Ac

 * Acc@1 29.330 Acc@5 60.210
Time/epoch: 9.610320568084717 sec
Epoch: [77][0/391]	Time 0.245 (0.245)	Data 0.234 (0.234)	Loss 2.6680 (2.6680)	Acc@1 24.219 (24.219)	Acc@5 60.938 (60.938)
Epoch: [77][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.7993 (2.7082)	Acc@1 26.562 (28.068)	Acc@5 55.469 (60.736)
Epoch: [77][300/391]	Time 0.019 (0.021)	Data 0.000 (0.001)	Loss 2.7406 (2.7141)	Acc@1 21.875 (27.930)	Acc@5 64.062 (60.740)
Test: [0/79]	Time 0.231 (0.231)	Loss 2.5961 (2.5961)	Acc@1 27.344 (27.344)	Acc@5 64.062 (64.062)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.6469 (2.7734)	Acc@1 31.250 (28.054)	Acc@5 62.500 (59.659)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.8451 (2.7577)	Acc@1 28.125 (28.274)	Acc@5 55.469 (59.077)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.7952 (2.7593)	Acc@1 25.781 (28.276)	Acc@5 58.594 (59.375)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.5914 (2.7524)	Acc@1 34.375 (28.735)	Acc@5 67.969 (59.756)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.6825 (2.7433)	Acc@1 28.125 (29.305)	Ac

 * Acc@1 28.620 Acc@5 60.010
Time/epoch: 9.633252382278442 sec
Epoch: [84][0/391]	Time 0.248 (0.248)	Data 0.236 (0.236)	Loss 2.6144 (2.6144)	Acc@1 29.688 (29.688)	Acc@5 62.500 (62.500)
Epoch: [84][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.7999 (2.7181)	Acc@1 30.469 (27.768)	Acc@5 58.594 (60.399)
Epoch: [84][300/391]	Time 0.019 (0.021)	Data 0.000 (0.001)	Loss 2.5522 (2.7174)	Acc@1 35.156 (27.720)	Acc@5 65.625 (60.527)
Test: [0/79]	Time 0.234 (0.234)	Loss 2.6143 (2.6143)	Acc@1 28.125 (28.125)	Acc@5 67.188 (67.188)
Test: [10/79]	Time 0.006 (0.027)	Loss 2.6334 (2.7772)	Acc@1 30.469 (29.190)	Acc@5 61.719 (59.162)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.8308 (2.7676)	Acc@1 25.781 (28.795)	Acc@5 57.812 (58.594)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.7590 (2.7604)	Acc@1 26.562 (28.856)	Acc@5 60.156 (59.148)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.5447 (2.7488)	Acc@1 32.031 (29.287)	Acc@5 65.625 (59.832)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.8381 (2.7446)	Acc@1 28.906 (29.596)	Ac

 * Acc@1 29.140 Acc@5 60.450
Time/epoch: 9.636030912399292 sec
Epoch: [91][0/391]	Time 0.238 (0.238)	Data 0.225 (0.225)	Loss 2.6545 (2.6545)	Acc@1 25.781 (25.781)	Acc@5 63.281 (63.281)
Epoch: [91][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.4398 (2.7117)	Acc@1 33.594 (27.602)	Acc@5 65.625 (61.036)
Epoch: [91][300/391]	Time 0.021 (0.021)	Data 0.000 (0.001)	Loss 2.6195 (2.7102)	Acc@1 35.938 (27.668)	Acc@5 63.281 (60.950)
Test: [0/79]	Time 0.241 (0.241)	Loss 2.5942 (2.5942)	Acc@1 29.688 (29.688)	Acc@5 68.750 (68.750)
Test: [10/79]	Time 0.006 (0.028)	Loss 2.6644 (2.7791)	Acc@1 32.031 (28.764)	Acc@5 60.938 (59.659)
Test: [20/79]	Time 0.006 (0.018)	Loss 2.8156 (2.7796)	Acc@1 27.344 (28.646)	Acc@5 61.719 (59.003)
Test: [30/79]	Time 0.006 (0.014)	Loss 2.7790 (2.7586)	Acc@1 28.125 (28.906)	Acc@5 60.156 (60.030)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.4766 (2.7506)	Acc@1 36.719 (29.497)	Acc@5 67.969 (60.252)
Test: [50/79]	Time 0.007 (0.011)	Loss 2.7021 (2.7466)	Acc@1 29.688 (29.688)	Ac

 * Acc@1 29.220 Acc@5 59.980
Time/epoch: 9.624799013137817 sec
Epoch: [98][0/391]	Time 0.239 (0.239)	Data 0.225 (0.225)	Loss 2.4950 (2.4950)	Acc@1 32.031 (32.031)	Acc@5 60.938 (60.938)
Epoch: [98][150/391]	Time 0.021 (0.022)	Data 0.000 (0.002)	Loss 2.6667 (2.7238)	Acc@1 29.688 (27.768)	Acc@5 59.375 (59.753)
Epoch: [98][300/391]	Time 0.020 (0.022)	Data 0.000 (0.001)	Loss 2.5418 (2.7160)	Acc@1 26.562 (27.884)	Acc@5 63.281 (60.291)
Test: [0/79]	Time 0.229 (0.229)	Loss 2.6264 (2.6264)	Acc@1 30.469 (30.469)	Acc@5 67.969 (67.969)
Test: [10/79]	Time 0.006 (0.026)	Loss 2.6680 (2.7763)	Acc@1 32.812 (29.830)	Acc@5 62.500 (59.446)
Test: [20/79]	Time 0.006 (0.017)	Loss 2.8619 (2.7670)	Acc@1 28.125 (29.390)	Acc@5 59.375 (59.189)
Test: [30/79]	Time 0.006 (0.013)	Loss 2.7400 (2.7522)	Acc@1 30.469 (29.158)	Acc@5 60.938 (59.778)
Test: [40/79]	Time 0.006 (0.012)	Loss 2.4898 (2.7412)	Acc@1 33.594 (29.478)	Acc@5 68.750 (60.137)
Test: [50/79]	Time 0.006 (0.011)	Loss 2.8118 (2.7372)	Acc@1 28.125 (29.703)	Ac

To generate diagrams of our accuracy we will use TensorBoard.
Running the `tensorboard` command below prints an address that you can open in your browser to inspect the results. Note that the tensorboard package must be installed for this to work.

To see the training and test accuracy of each individual training session, change *logdir* to *runs/cifar100*. The directory *runs/cifar100/all_dropout_experiments* only shows the final results of the training sessions in comparison.

The diagrams should look similar to these:


![Top1_Training](./visualizations/Top1_Training.png)


![Top5_Training](./visualizations/Top5_Training.png)


![Top1_Test](./visualizations/Top1_Test.png)


![Top5_Test](./visualizations/Top5_Test.png)

The diagrams show that increasing the dropout rate lowers the training accuracy. Up to a dropout rate of 0.5, however, the accuracy on the test set is higher than in our first session, where we did not use dropout.

Increasing the dropout rate beyond 0.5 decreases the accuracy on both the training and the test set.

In other words, there is a sweet spot for the dropout rate around 0.5: there, dropout lowers the training accuracy but raises the test accuracy compared to training without dropout.
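To build some intuition for why very high rates hurt, here is a minimal sketch of inverted dropout in plain Python. It assumes that `droprate` is the probability of zeroing a unit and that surviving units are rescaled by `1 / (1 - droprate)`, as `torch.nn.Dropout` does during training; the `AlexNet` class in this notebook may wire the parameter differently. The function name `inverted_dropout` and the layer size of 4096 are hypothetical and chosen only for illustration.

```python
import random


def inverted_dropout(activations, drop_rate, rng):
    """Zero each activation with probability drop_rate and rescale survivors.

    Survivors are divided by (1 - drop_rate) so that the expected value of
    every unit stays the same (the "inverted dropout" convention).
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    keep_prob = 1.0 - drop_rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)   # fixed seed so the sketch is reproducible
layer = [1.0] * 4096     # hypothetical layer with 4096 active units

for rate in (0.0, 0.5, 0.9):
    out = inverted_dropout(layer, rate, rng)
    kept = sum(1 for v in out if v != 0.0)
    mean = sum(out) / len(out)
    print(f"drop_rate={rate}: kept {kept}/{len(out)} units, "
          f"mean activation {mean:.3f}")
```

Under these assumptions, a rate of 0.9 leaves only about one tenth of the units active in each forward pass, which fits the low training accuracy seen in the `dropout_rate = 0.9` session above.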

In [None]:
!tensorboard --logdir=runs/cifar100/all_dropout_experiments

TensorFlow installation not found - running with reduced feature set.
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.1.0 at http://localhost:6006/ (Press CTRL+C to quit)
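If TensorBoard is not available, the sessions can also be compared from the printed output alone. The following sketch parses the per-epoch summary lines of the form ` * Acc@1 ... Acc@5 ...` that appear in the logs above and returns the best top-1 result. The helper names `parse_summaries` and `best_top1` are hypothetical, and the sample text only reuses a few summary lines from the `dropout_rate = 0.9` session, so the result reflects those lines rather than the full run.

```python
import re

# Matches the summary line printed after each validation pass, e.g.
# " * Acc@1 29.220 Acc@5 59.980"
SUMMARY_RE = re.compile(r"^\s*\*\s*Acc@1\s+([\d.]+)\s+Acc@5\s+([\d.]+)\s*$")


def parse_summaries(log_text):
    """Return a list of (top1, top5) tuples, one per summary line found."""
    results = []
    for line in log_text.splitlines():
        match = SUMMARY_RE.match(line)
        if match:
            results.append((float(match.group(1)), float(match.group(2))))
    return results


def best_top1(log_text):
    """Return the (top1, top5) pair with the highest top-1 accuracy, or None."""
    summaries = parse_summaries(log_text)
    return max(summaries, key=lambda pair: pair[0]) if summaries else None


# A few summary lines copied from the dropout_rate = 0.9 session above;
# lines in any other format are ignored by the parser.
sample_log = """\
 * Acc@1 29.330 Acc@5 60.210
Time/epoch: 9.610320568084717 sec
 * Acc@1 28.620 Acc@5 60.010
Time/epoch: 9.633252382278442 sec
 * Acc@1 29.140 Acc@5 60.450
"""

print(best_top1(sample_log))
```

Applying the same function to the captured output of each session gives one number per dropout rate, which can then be compared directly.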


# Sources

[1] "Dropout: A Simple Way to Prevent Neural Networks from Overfitting",
     Srivastava, Hinton, Krizhevsky et al. (2014):
     https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf

[2] GitHub project:
     https://github.com/iceboy910447/ALEXNET-CIFAR10

[3] "A Walk-through of AlexNet", Hao Gao (2017):
     https://medium.com/@smallfishbigsea/a-walk-through-of-alexnet-6cbd137a5637

[4] "Preventing Deep Neural Network from Overfitting", Piotr Skalski (2018):
     https://towardsdatascience.com/preventing-deep-neural-network-from-overfitting-953458db800a

[5] "Deep Learning – An Introduction Beyond Buzzwords", Janis Keuper (2019):
     https://elearning.hs-offenburg.de/moodle/pluginfile.php/429832/mod_resource/content/1/1_Deep_Learning_Intro.pdf