# Fisher FIFO

In the `fisher-fifo-v3` notebook, we set up code to use the Fisher information effectively in neural networks, using a *first-in-first-out* (FIFO) strategy to store gradients.

Here our objective is to assess the effect of the FIFO buffer on the quality of the model. We also implement the partitioning strategy so that the algorithm generalizes to larger networks and to datasets with larger instances (such as images).

To make the partitioning more effective, we use the "maximum-block-update" strategy: at each iteration, at most a fixed number of blocks per parameter is updated, which makes the algorithm faster.
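
As a rough illustration of the FIFO idea, the sketch below keeps only the most recent gradients in a fixed-size buffer and builds an empirical Fisher estimate from them. It is a minimal NumPy sketch with made-up sizes and random stand-in gradients, not the notebook's implementation; the names `fifo`, `G` and `fisher` are assumptions chosen for this example.

```python
from collections import deque

import numpy as np

# Hypothetical toy sizes, chosen only for illustration.
buffer_size = 3   # number of gradients kept in the FIFO buffer
n_params = 2      # size of each gradient vector

# deque(maxlen=...) drops the oldest item automatically once it is full.
fifo = deque(maxlen=buffer_size)

rng = np.random.default_rng(0)
for _ in range(5):
    # Random vectors stand in for real gradients.
    fifo.append(rng.normal(size=n_params))

# Stack the stored gradients as columns: G has shape (n_params, buffer_size).
G = np.stack(list(fifo), axis=1)

# Empirical Fisher estimate from the buffered gradients.
fisher = (G @ G.T) / buffer_size
```

Although five gradients are pushed, only the last three remain in the buffer, so the estimate always reflects the most recent steps.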

---

In the `fisher-fifo-v4` notebook, we try to implement the algorithm more efficiently by using [torch.bmm](https://pytorch.org/docs/stable/generated/torch.bmm.html). The main idea is to execute the matrix multiplications of several blocks in a single optimized call.
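
To make the batching idea concrete, here is a small sketch that compares a per-block loop with a single batched product. It uses `np.matmul` on 3-D arrays as a CPU stand-in for `torch.bmm`, which applies the same per-batch matrix product to 3-D tensors; the block count and block size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 blocks, each with an 8x8 matrix and an 8x1 gradient.
n_blocks, block_size = 4, 8
inverses = rng.normal(size=(n_blocks, block_size, block_size))
grads = rng.normal(size=(n_blocks, block_size, 1))

# One product per block, written as an explicit Python loop.
looped = np.stack([inverses[b] @ grads[b] for b in range(n_blocks)])

# The same products in one batched call over the leading dimension.
batched = np.matmul(inverses, grads)
```

Both versions give the same result; the batched call simply hands all blocks to the backend at once instead of looping in Python.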

---

In the `fisher-fifo-v4.2` notebook, we try to make the algorithm even faster by using a single centralized object that stores all the matrices (and their inverses) as well as all gradients and buffers. The main idea here is to make the most of vectorization using PyTorch utilities for the GPU.
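
One way to picture the centralized storage is a single 3-D array that holds one square block per partition, with shorter partitions zero-padded to the common block size. The sketch below assumes NumPy instead of PyTorch so it runs without a GPU; the partition lengths and the name `part_lengths` are hypothetical.

```python
import numpy as np

partition_size = 4          # hypothetical maximum block size
part_lengths = [4, 4, 2]    # hypothetical: the last partition is shorter

# One array holds every block inverse; shorter blocks are zero-padded.
fisher_inv = np.zeros((len(part_lengths), partition_size, partition_size))
for i, n in enumerate(part_lengths):
    fisher_inv[i, :n, :n] = np.eye(n)

# Integer-array indexing pulls out any subset of blocks in one operation.
selected = fisher_inv[[0, 2]]
```

Because every block has the same shape, a subset of blocks can be read, updated with batched operations, and written back without per-block bookkeeping.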

---

In the `fisher-fifo-v4.3` notebook, we try to take the algorithm one step further in efficiency by retrieving the partitions in sets instead of individually.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import time
import math
import os
import json

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# from skopt import gp_minimize

from scipy import stats

In [2]:
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

/kaggle/input/fisher-fifo-v4-3/bayes-opt-results.npz
/kaggle/input/fisher-fifo-v4-3/results_step_100.json
/kaggle/input/fisher-fifo-v4-3/__results__.html
/kaggle/input/fisher-fifo-v4-3/__notebook__.ipynb
/kaggle/input/fisher-fifo-v4-3/__output__.json
/kaggle/input/fisher-fifo-v4-3/custom.css
/kaggle/input/fisher-fifo-v4-3/data/MNIST/raw/t10k-labels-idx1-ubyte
/kaggle/input/fisher-fifo-v4-3/data/MNIST/raw/t10k-images-idx3-ubyte.gz
/kaggle/input/fisher-fifo-v4-3/data/MNIST/raw/t10k-labels-idx1-ubyte.gz
/kaggle/input/fisher-fifo-v4-3/data/MNIST/raw/train-labels-idx1-ubyte.gz
/kaggle/input/fisher-fifo-v4-3/data/MNIST/raw/t10k-images-idx3-ubyte
/kaggle/input/fisher-fifo-v4-3/data/MNIST/raw/train-labels-idx1-ubyte
/kaggle/input/fisher-fifo-v4-3/data/MNIST/raw/train-images-idx3-ubyte
/kaggle/input/fisher-fifo-v4-3/data/MNIST/raw/train-images-idx3-ubyte.gz


In [3]:
import torch
from torch import nn
from torch.nn import functional as F
from torch.utils.data import TensorDataset, DataLoader

import torchvision
import torchvision.datasets as datasets

In [4]:
def get_device():    
    if torch.cuda.is_available():

        device = torch.device('cuda')
        print( torch.cuda.get_device_name(device) )
        print( torch.cuda.get_device_properties(device) )

    else:
        device = torch.device('cpu')
        print(device)
        
    return device

In [5]:
!pip install torchsummary
import torchsummary

Collecting torchsummary
  Downloading torchsummary-1.5.1-py3-none-any.whl (2.8 kB)
Installing collected packages: torchsummary
Successfully installed torchsummary-1.5.1

In [6]:
class cfg:
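    img_size = (32, 32)
    img_channels = 3
    # flattened input size (channels * height * width), needed by get_default_network
    n_features = img_channels * img_size[0] * img_size[1]
    
    n_samples = 10000
    n_classes = 10  ## both MNIST and CIFAR-10 have 10 classes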
    
    # device = torch.device('cpu')
    device = get_device()
    
    max_loss = 20.0

Tesla P100-PCIE-16GB
_CudaDeviceProperties(name='Tesla P100-PCIE-16GB', major=6, minor=0, total_memory=16280MB, multi_processor_count=56)


# create the dataset

In [7]:
def generate_dataset_mnist(batch_size):
    print(f'generating MNIST data with {cfg.n_classes} classes')
    
    transf_ = torchvision.transforms.Compose([
        # torchvision.transforms.Resize(size=[14, 14]),
        torchvision.transforms.ToTensor()
    ])
    
    mnist_train = datasets.MNIST(root='./data', train=True, download=True, transform=transf_)
    mnist_test  = datasets.MNIST(root='./data', train=False, download=True, transform=transf_)
    
    mnist_train_dataloader = DataLoader(dataset=mnist_train, batch_size=batch_size, shuffle=True)
    mnist_test_dataloader  = DataLoader(dataset=mnist_test, batch_size=batch_size, shuffle=False)

    return mnist_train_dataloader, mnist_test_dataloader

In [8]:
def generate_dataset_cifar10(batch_size):
    print(f'generating CIFAR10 data with {cfg.n_classes} classes')
    
    transf_ = torchvision.transforms.Compose([
        # torchvision.transforms.Resize(size=[14, 14]),
        torchvision.transforms.ToTensor()
    ])
    
    cifar10_train = datasets.CIFAR10(root='./data', train=True, download=True, transform=transf_)
    cifar10_test  = datasets.CIFAR10(root='./data', train=False, download=True, transform=transf_)
    
    cifar10_train_dataloader = DataLoader(dataset=cifar10_train, batch_size=batch_size, shuffle=True)
    cifar10_test_dataloader  = DataLoader(dataset=cifar10_test, batch_size=batch_size, shuffle=False)

    return cifar10_train_dataloader, cifar10_test_dataloader

## declaring network architecture

In [9]:
def get_default_network(c=16, device=cfg.device):
    net = nn.Sequential(
        nn.Flatten(),
        nn.Linear(in_features=cfg.n_features, out_features=c),
        nn.ReLU(),
        nn.Linear(in_features=c, out_features=c),
        nn.ReLU(),
        nn.Linear(in_features=c, out_features=c),
        nn.ReLU(),
        nn.Linear(in_features=c, out_features=cfg.n_classes)
    )
    
    torchsummary.summary(net, input_size=[[cfg.n_features]], device='cpu')
    
    return net

In [10]:
def get_cnn_network(in_channels=cfg.img_channels, c=16, p_drop=0.1, device=cfg.device):
    
    img_flat_size = (4 * c * (cfg.img_size[0] // 8) * (cfg.img_size[1] // 8) )
    print(img_flat_size)
    net = nn.Sequential(
        nn.Conv2d(in_channels=in_channels, out_channels=c, kernel_size=5, stride=2, padding=2),
        nn.ReLU(),

        nn.Conv2d(in_channels=c, out_channels=(2 * c), kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2),
        nn.Dropout2d(p=p_drop),
        
        nn.Conv2d(in_channels=(2 * c), out_channels=(4 * c), kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2),
        nn.Dropout2d(p=p_drop),
        
        nn.Flatten(),
        
        nn.Linear(in_features=img_flat_size, out_features=(8 * c) ),
        nn.ReLU(),
        nn.Dropout(p=p_drop),
        
        nn.Linear(in_features=(8 * c), out_features=(4 * c) ),
        nn.ReLU(),
        nn.Dropout(p=p_drop),
        
        nn.Linear(in_features=(4 * c), out_features=cfg.n_classes)
    )
    
    torchsummary.summary(net, input_size=[[cfg.img_channels, *cfg.img_size]], device='cpu')
    
    return net

In [11]:
def get_cnn_network_v2(in_channels=cfg.img_channels, p_drop=0.1, device=cfg.device):
    
    net = nn.Sequential(
        nn.Conv2d(in_channels=in_channels, out_channels=96, kernel_size=5, padding=2),
        nn.MaxPool2d(kernel_size=2),
        nn.ReLU(),

        nn.Conv2d(in_channels=96, out_channels=80, kernel_size=5, padding=2),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=2),
        nn.Dropout2d(p=p_drop),
        
        nn.Conv2d(in_channels=80, out_channels=96, kernel_size=5, padding=2),
        nn.ReLU(),
        nn.Dropout2d(p=p_drop),
        
        nn.Conv2d(in_channels=96, out_channels=64, kernel_size=5, padding=2),
        nn.ReLU(),
        nn.Dropout2d(p=p_drop),
        
        nn.Flatten(),
        
        nn.Linear(in_features=4096, out_features=256 ),
        nn.ReLU(),
        nn.Dropout(p=p_drop),
        
        nn.Linear(in_features=256, out_features=cfg.n_classes)
    )
    
    torchsummary.summary(net, input_size=[[cfg.img_channels, *cfg.img_size]], device='cpu')
    
    return net

# object for calculation of the metrics

In [12]:
class Metrics():
    def __init__(self, value_round=None, time_round=None):
        self.metrics_dict = {}
        self.set_initial_time()
        self.val_round = value_round
        self.time_round = time_round
        
    def set_initial_time(self):
        self.init_time = time.time()
        
    def get_time(self):
        return time.time() - self.init_time
    
    def add(self, key, value, step=None):
        
        if step is None:
            step = np.nan
        
        if key not in self.metrics_dict:
            self.metrics_dict[key] = []
        
        t = self.get_time()
        if self.time_round is not None:
            t = round(t, ndigits=self.time_round)
        
        if self.val_round is not None:
            value = round(value, ndigits=self.val_round)
        
        self.metrics_dict[key].append( (value, step, t) )
    
    def add_(self, dict_, step=None):
        for key, value in dict_.items():
            self.add(key, value, step)
    
    def get(self, key, get_step=False, get_time=False):
        y, x, t = zip(*self.metrics_dict[key])
        y, x, t = list(y), list(x), list(t)
        
        return x, y, t

# Fisher Information calculation objects
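
The `FisherFIFO` class in this section initializes each buffer from identity matrices concatenated column-wise, so that the initial Fisher matrix is diagonal and its inverse is obtained by inverting the diagonal. The sketch below reproduces that construction in NumPy with hypothetical toy sizes, as an illustration rather than a replacement for the class code.

```python
import math

import numpy as np

param_size, buffer_size = 3, 7   # hypothetical toy sizes

# Tile identity matrices column-wise and cut to `buffer_size` columns.
n_eye = math.ceil(buffer_size / param_size)
buffer = np.concatenate(n_eye * [np.eye(param_size)], axis=1)[:, :buffer_size]

# Every column is a unit vector, so buffer @ buffer.T is diagonal and counts
# how many times each coordinate appears in the buffer: diag(3, 2, 2) here.
fisher = buffer @ buffer.T

# A diagonal matrix is inverted by inverting its diagonal entries.
fisher_inv = np.diag(1.0 / np.diag(fisher))
```

Note that this construction gives a finite inverse only when `buffer_size >= param_size`; otherwise some coordinates never appear in the buffer and the corresponding diagonal entries are zero.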

In [13]:
class FisherFIFO():
    def __init__(self,
                 named_params,
                 buffer_size,
                 partition_size,
                 block_updates):
        
        self.buffer_size = buffer_size
        self.partition_size = partition_size
        self.block_updates = block_updates
        
        named_params = list(named_params)
        
        self.partition_fisher_list = []
        total_partitions, total_block_upd = 0, 0
        for pi, (n, p) in enumerate( named_params ):
            part_fisher = PartitionerFisherFIFO(param = p,
                                                name = n,
                                                buffer_size = buffer_size,
                                                partition_size = partition_size,
                                                block_updates = block_updates,
                                                parent_fifo = None)
            
            self.partition_fisher_list.append( (p, part_fisher, total_partitions, total_block_upd) )
            
            total_partitions += part_fisher.num_part
            total_block_upd += part_fisher.block_updates
            
        self.num_part = total_partitions
        self.total_block_updates = total_block_upd
        
        print(f'total partitions: {self.num_part} - effective block updates: {self.total_block_updates}')
        
        ## pre-allocate the memory for the tensor that stores the selected gradients (changes every iteration)
        self.g = torch.zeros(size=[self.total_block_updates, partition_size, 1], dtype=torch.float, device=cfg.device)
        
        ## pre-allocate the memory for the tensor that stores the buffer and the tensor for the inverse
        self.buffer = torch.zeros(size=[self.num_part, partition_size, buffer_size], dtype=torch.float, device=cfg.device)
        self.fisher_inv = torch.zeros(size=[self.num_part, partition_size, partition_size], dtype=torch.float, device=cfg.device)        
    
        print('initializing buffers and inverses...')
        ## now we initialize the buffer and the inverse for all partitions
        i = 0
        for _, part_fisher, _, _ in self.partition_fisher_list:
            for _, start, end in part_fisher.ind_fisher_list:

                if i == 0 or ( (i + 1) % 10000 ) == 0 or i == (self.num_part - 1):
                    print(f'partition {i+1}/{self.num_part}')

                n = end - start
                buffer, _, fisher_inv = self.initialize_fisher_partition(param_size=n, buffer_size=self.buffer_size)

                self.buffer[i, :n, :] = buffer
                self.fisher_inv[i, :n, :n] = fisher_inv

                i += 1

    
    def initialize_fisher_partition(self, param_size, buffer_size):
        
        buffer = self.get_initial_buffer_v2(param_size, buffer_size)
            
        ## shuffle buffer across columns
        buffer = buffer[:, torch.randperm(buffer.shape[1]) ]
        
        ## the Fisher matrix is initialized as G @ G.T, in which G is our buffer. The buffer is built
        ## from identity matrices, so the resulting Fisher information matrix is diagonal
        fisher = buffer @ buffer.T
        
        ## since the Fisher information is diagonal at this point, its inverse is given by the
        ## inverted elements of the diagonal.
        fisher_inv = torch.diag( 1 / torch.diag(fisher) )
        
        return buffer, fisher, fisher_inv
    
    
    def get_initial_buffer_v2(self, param_size, buffer_size):
        ## here we adopt a faster approach to initialize the buffer. We use n 
        ## identity matrices concatenated column-wise. n is determined by `param_size` and `buffer_size`
        
        n_eye = math.ceil(buffer_size / param_size)
        I = torch.eye(n=param_size, dtype=torch.float, device=cfg.device)
        
        buffer = torch.cat(n_eye * [I], dim=1)[:, :buffer_size]
        
        assert buffer.shape == (param_size, buffer_size)
        
        return buffer


    def get_idx_lists(self):
        run_enc_list = []
        default_idx_list = []
        for p, part_fisher, num_part, block_upd in self.partition_fisher_list:
            init_block, end_block, g_init_idx, g_end_idx = part_fisher.get_random_blocks()
            
            # print(f'param shape: {p.shape} - blocks: {init_block} to {end_block} - grad: {g_init_idx} to {g_end_idx}')
            
            run_enc_list.append( (num_part + init_block, num_part + end_block, g_init_idx, g_end_idx) )
            default_idx_list.append( np.arange(start=num_part + init_block, stop=num_part + end_block + 1) )
            
            
        return run_enc_list, np.concatenate(default_idx_list)
    
    
    def read_gradients(self, idx):
        self_g_start = 0
        for i, (_, _, g_start, g_end) in enumerate(idx):
            n_grad = g_end - g_start
            # self_g_end = min( self_g_start + n_grad, torch.numel(self.g) )
            self_g_end = self_g_start + n_grad
            
            p, _, _, _ = self.partition_fisher_list[i]
            
            self.g.view(-1)[self_g_start:self_g_end] = p.grad.view(-1)[g_start:g_end]
            
            if (n_grad % self.partition_size) > 0:
                extra_zeros = self.partition_size - (n_grad % self.partition_size)
                self.g.view(-1)[self_g_end:(self_g_end + extra_zeros)] = 0.0
            else:
                extra_zeros = 0
            
#             print(f'self_g_start: {self_g_start} - self_g_end: {self_g_end} - self.g.shape: {self.g.view(-1).shape}')
#             print(f'g_start: {g_start} - g_end: {g_end} - p.grad.shape: {p.grad.view(-1).shape}')
#             print(f'n_grad: {n_grad} - part-size: {self.partition_size} - extra-zeros: {extra_zeros}')
#             print()
            
            self_g_start = self_g_end + extra_zeros
            

    def write_gradients(self, idx):
        self_g_start = 0
        for i, (_, _, g_start, g_end) in enumerate(idx):
            n_grad = g_end - g_start
            self_g_end = self_g_start + n_grad
            
            p, _, _, _ = self.partition_fisher_list[i]
            p.grad.view(-1)[g_start:g_end] = self.g.view(-1)[self_g_start:self_g_end]

            if (n_grad % self.partition_size) > 0:
                extra_zeros = self.partition_size - (n_grad % self.partition_size)
            else:
                extra_zeros = 0
            
            self_g_start = self_g_end + extra_zeros

    
    def step(self):
        ## selects the blocks to be updated
        run_enc_idx, default_idx = self.get_idx_lists()
        
        ## read the gradients of the selected blocks and store them in self.g
        self.read_gradients(run_enc_idx)
        
        ## extract the inverses and buffers of the selected blocks
        inv = self.fisher_inv[default_idx, ...]
        buffer = self.buffer[default_idx, ...]
        
        # print(inv.shape, buffer.shape)
        
        ## update the buffer
        
        ## dimensions are: partitions, gradient-size, buffer-size
        g_old = buffer[:, :, 0:1] 
        
        ## we update along the third dimension: "buffer-size"
        buffer = torch.cat([buffer[:, :, 1:], self.g], dim=2)
        
        ## update the inverses and modify current gradients
        ## ...
        sqrt_NB = math.sqrt(self.buffer_size)
        
        ## update inverse - phase 1: add new gradient ##
        g_phase1 = self.upd_inverse( (1 / sqrt_NB) * self.g, inv, type_='sum')

        ## update inverse - phase 2: remove old gradient ##
        self.upd_inverse( (1 / sqrt_NB) * g_old, inv, type_='sub')

        ## modify the current gradients
        if False:
            ## use the "phase-1-trick" to get the estimated new gradient
            self.g = g_phase1 * sqrt_NB
        else:
            ## get the modified gradient using "de facto" the new inverses and the gradients
            self.g = self.modify_grad(self.g, inv)
        
        ## return the inverses and buffers to the main tensor
        self.fisher_inv[default_idx, ...] = inv
        self.buffer[default_idx, ...] = buffer
        
        ## return the modified gradients to the parameters
        self.write_gradients(run_enc_idx)


    def upd_inverse(self, g, inverse, type_='sum'):
        ## update the inverse with a rank-1 Woodbury (Sherman-Morrison) update
        f_inv_g = torch.bmm(inverse, g)

        if type_ == 'sum':
            d = 1 + torch.sum(g * f_inv_g, dim=[1, 2], keepdim=True)
            inverse[:] = inverse - (f_inv_g * torch.transpose(f_inv_g, 1, 2) / d)

        elif type_ == 'sub':
            d = 1 - torch.sum(g * f_inv_g, dim=[1, 2], keepdim=True)
            inverse[:] = inverse + (f_inv_g * torch.transpose(f_inv_g, 1, 2) / d)

        else:
            ## incorrect type: fail loudly instead of silently skipping the update
            raise ValueError('incorrect rank-1 update type: ' + str(type_))
        
        return f_inv_g


    def modify_grad(self, g, inverse):
        return torch.bmm(inverse, g)
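
The two-phase update in `step` (add the new gradient, then remove the oldest one) relies on the rank-1 case of the Woodbury identity, also known as the Sherman-Morrison formula. The sketch below checks that formula numerically on a single small block with NumPy. The matrix `A`, the vectors `g_new` and `g_old`, and the helper `rank1_update` are hypothetical stand-ins, not part of the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# A symmetric positive-definite matrix and its inverse (toy stand-in for one Fisher block).
A = np.eye(n) + 0.1 * np.ones((n, n))
A_inv = np.linalg.inv(A)

g_new = 0.3 * rng.normal(size=(n, 1))   # gradient entering the buffer
g_old = 0.1 * rng.normal(size=(n, 1))   # gradient leaving the buffer


def rank1_update(inv, g, sign):
    """Return the inverse of (M + sign * g g^T), given inv = M^{-1} with M symmetric."""
    inv_g = inv @ g
    d = 1.0 + sign * (g.T @ inv_g).item()
    return inv - sign * (inv_g @ inv_g.T) / d


# Phase 1: add the new gradient; phase 2: remove the old one.
inv = rank1_update(A_inv, g_new, sign=+1.0)
inv = rank1_update(inv, g_old, sign=-1.0)

# Reference: invert the updated matrix directly.
direct = np.linalg.inv(A + g_new @ g_new.T - g_old @ g_old.T)
```

Each rank-1 update costs a matrix-vector product and an outer product, which is why it is preferred over recomputing the full inverse at every step. The removal step assumes that `1 - g^T M^{-1} g` stays away from zero.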

In [14]:
class PartitionerFisherFIFO():
    def __init__(self,
                 param,
                 name,
                 buffer_size,
                 partition_size,
                 block_updates,
                 parent_fifo):
        
        self.param = param
        self.name = name 
        
        if partition_size is None:
            self.partition_size = param.numel()
        else:
            self.partition_size = partition_size
        
        ## calculates the number of partitions required. It is calculated using the param size and
        ## our partition maximum size. The gradient (the same size as param) is going to be partitioned in
        ## equal pieces (except possibly the last one) to be processed individually by our "IndividualFisherFIFO"
        self.param_size = param.numel()
        self.num_part = math.ceil(self.param_size / self.partition_size)
        
        ## the number of blocks (partitions) to update at each iteration. This can be < num_part to make
        ## the algorithm more efficient (we don't update every partition at every iteration).
        if block_updates is None:
            self.block_updates = self.num_part
        else:
            self.block_updates = min(block_updates, self.num_part)
        
        print(f'FisherPartitioner: param: {self.param_size} - partition: {self.partition_size} - nº part: {self.num_part} - block updates: {self.block_updates}')
                
        ## the list stores the indexes used to partition the gradient
        self.ind_fisher_list = []
        for i in range(self.num_part):
            start = i * self.partition_size
            end = min(start + self.partition_size, self.param_size)
            
            self.ind_fisher_list.append( (i, start, end) )
        
    
    def get_random_blocks(self, num_part=None, block_upd=None):
        
        if num_part is None:
            num_part = self.num_part
        
        if block_upd is None:
            block_upd = self.block_updates
        
        ## choose the initial block randomly
        init_block = np.random.choice(num_part - block_upd + 1)
        
        ## the final block will be necessarily `block_upd` blocks further. This means we select
        ## a contiguous sequence of blocks. This is going to be used for performance reasons
        end_block = init_block + block_upd - 1
        
        ## therefore, the starting and ending index to be used to fetch the gradient positions for the
        ## blocks will be the starting index for the first block and the ending positions for the last block
        _, g_init_idx, _ = self.ind_fisher_list[init_block]
        _, _, g_end_idx = self.ind_fisher_list[end_block]
        
        return init_block, end_block, g_init_idx, g_end_idx
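
The partitioning and the contiguous block selection can be sketched without any tensors. The example below is a plain-Python illustration under assumed toy sizes; `partition_bounds` and `pick_contiguous_blocks` are hypothetical helper names that mirror the logic of the class, not functions from the notebook.

```python
import math
import random


def partition_bounds(param_size, partition_size):
    """Return (start, end) index pairs covering range(param_size)."""
    num_part = math.ceil(param_size / partition_size)
    return [(i * partition_size, min((i + 1) * partition_size, param_size))
            for i in range(num_part)]


def pick_contiguous_blocks(bounds, block_updates, rng):
    """Pick `block_updates` neighbouring blocks and return their flat slice."""
    block_updates = min(block_updates, len(bounds))
    first = rng.randrange(len(bounds) - block_updates + 1)
    last = first + block_updates - 1
    # The slice runs from the start of the first block to the end of the last.
    return first, last, bounds[first][0], bounds[last][1]


# 10 parameters in blocks of 4: the last block holds only 2 entries.
bounds = partition_bounds(param_size=10, partition_size=4)
first, last, g_start, g_end = pick_contiguous_blocks(bounds, 2, random.Random(0))
```

Choosing neighbouring blocks means the selected gradient entries form one contiguous slice `[g_start:g_end]` of the flattened gradient, so they can be copied in a single operation.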

---

# utility functions for training

In [18]:
def accuracy_score_tns(y_true, y_pred):
    return torch.mean( (y_true == y_pred).to(dtype=torch.float) ).cpu().item()

In [19]:
def train_iteration(x, y, net, optim, loss, fisher=None):
    net.train()
    net.zero_grad()
    
    y_pred = net(x)
    l = loss(y_pred, y)
    
    l.backward()
    
    if fisher is not None:
        fisher.step()
    
    optim.step()
    
    return l.item(), accuracy_score_tns( y.view(-1), y_pred.argmax(dim=1).view(-1) )

In [20]:
def evaluate(net, dataloader, loss):
    net.eval()
    
    with torch.no_grad():

        loss_list = []
        y_pred_list = []
        y_label_list = []
        for x, y in dataloader:
            
            x = x.to(cfg.device)
            y = y.to(cfg.device)

            y_pred = net(x)
            l = loss(y_pred, y)

            loss_list.append( l.cpu().item() )
            y_pred_list.append( y_pred.argmax(dim=1).view(-1) )
            y_label_list.append( y.view(-1) )

        y_pred_list = torch.cat(y_pred_list).view(-1)
        y_label_list = torch.cat(y_label_list).view(-1)

    return np.mean(loss_list), accuracy_score_tns(y_label_list, y_pred_list)

# training

In [21]:
def train_network_fisher_optimization(batch_size = 32,
                                      lr = 1e-3,
                                      momentum = 0.9,
                                      epochs = 30,
                                      buffer_size = 1000,
                                      partition_size = 256,
                                      block_updates = 4,
                                      net_params = {'c':16, 'p':0.1},
                                      apply_fisher = True,
                                      # gpu_memory_check = 20,
                                      time_limit_secs = 600,
                                      interval_print = 100):

    ## declare (instantiate) the dataset
    train_dataloader, test_dataloader = generate_dataset_cifar10(batch_size = batch_size)

    ## instantiate the network
    net = get_cnn_network_v2(p_drop = net_params['p']).to(device=cfg.device)
    
    if apply_fisher:
        ## instantiate FisherFIFO object to create and update the Fisher info matrix
        fisher_fifo = FisherFIFO(named_params = net.named_parameters(),
                                 buffer_size = buffer_size,
                                 partition_size = partition_size,
                                 block_updates = block_updates)
    else:
        fisher_fifo = None

    ## create the loss object: the mean cross-entropy is multiplied by sqrt(batch_size) to stabilize norms
    # cross_entropy = nn.CrossEntropyLoss(reduction='mean') # standard version
    cross_entropy_standard = nn.CrossEntropyLoss(reduction='mean')
    cross_entropy = lambda y_pred, y: math.sqrt(batch_size) * cross_entropy_standard(y_pred, y)
    
    ## create the optimizer object
    optim = torch.optim.SGD(params=net.parameters(), lr=lr, momentum=momentum)

    default_metrics = Metrics(value_round=3, time_round=2)

    ini_time = time.time()

    step = 0
    training_finished = False
    for epc in range(1, epochs + 1):
        
        if training_finished:
            break
        
        print(f'starting epoch: {epc}/{epochs}')

        for nbt, (x, y) in enumerate(train_dataloader):

            if training_finished:
                break

            x = x.to(cfg.device)
            y = y.to(cfg.device)

            train_loss, train_acc = train_iteration(x, y, net, optim, cross_entropy, fisher_fifo)
            default_metrics.add_({'train-loss': train_loss, 'train-acc': train_acc}, step=step)
            
            ## check time limit
            t = int(time.time() - ini_time)
            if t > time_limit_secs:
                print('time is up! finishing training')
                training_finished = True

            if ( (nbt + 1) % interval_print ) == 0 or (nbt + 1) == len(train_dataloader) or training_finished:
                avg_train_loss = np.mean( default_metrics.get('train-loss')[1][-interval_print:] )
                avg_train_acc = np.mean( default_metrics.get('train-acc')[1][-interval_print:] )
                
                test_loss, test_acc = evaluate(net, test_dataloader, cross_entropy)
                default_metrics.add_({'test-loss': test_loss, 'test-acc': test_acc}, step=step)

                m, s = t // 60, t % 60

                print(f'batch: {nbt + 1}/{len(train_dataloader)}', end='')
                print(f' - train loss: {avg_train_loss:.4f} - test loss: {test_loss:.4f}', end='')
                print(f' - train acc: {avg_train_acc:.4f} - test acc: {test_acc:.4f}', end='')
                print(f' - {m}m {s}s')
                
            step += 1

        ## check for GPU memory consumption
        if torch.cuda.is_available():
            mem_alloc_gb = torch.cuda.memory_allocated(cfg.device) / 1024**3
            mem_res_gb = torch.cuda.memory_reserved(cfg.device) / 1024**3
            max_mem_alloc_gb = torch.cuda.max_memory_allocated(cfg.device) / 1024**3
            max_mem_res_gb = torch.cuda.max_memory_reserved(cfg.device) / 1024**3

            print(f'GPU memory used: {mem_alloc_gb:.2f} GB - max: {max_mem_alloc_gb:.2f} GB - memory reserved: {mem_res_gb:.2f} GB - max: {max_mem_res_gb:.2f} GB')

            # torch.cuda.empty_cache()

    return default_metrics, fisher_fifo

In [22]:
last_step_saved = None

def results_list_to_json(results_list, out_dir='/kaggle/working', step=0):
    global last_step_saved

    json_results = []

    for metrics, bs, ps, bu in results_list:
        json_results.append({
            'buffer-size': bs,
            'partition-size': ps,
            'blocks-updates': bu,
            'metrics': metrics.metrics_dict
        })

    with open( os.path.join(out_dir, f'results_step_{step}.json'), 'w' ) as fp:
        json.dump(json_results, fp)
    
    if last_step_saved is not None:
        old_file = os.path.join(out_dir, f'results_step_{last_step_saved}.json')
        if os.path.exists(old_file):
            os.remove(old_file)
    
    last_step_saved = step

In [23]:
def get_min_test_loss(metrics):
    _, test_loss, _ = metrics.get('test-loss')
    return min(test_loss)
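
For reading the saved results back later, a sketch along the following lines may help. It writes a tiny, made-up results file with the same layout that `results_list_to_json` produces (each metric is a list of `(value, step, time)` triples) and then extracts the minimum test loss. The numbers and the temporary path are hypothetical, not results from the runs.

```python
import json
import os
import tempfile

# Hypothetical miniature results file; the values are made up for illustration.
fake_results = [{
    'buffer-size': 295,
    'partition-size': 185,
    'blocks-updates': 85,
    'metrics': {'test-loss': [(5.18, 99, 164.0), (4.97, 899, 191.0)]},
}]

path = os.path.join(tempfile.mkdtemp(), 'results_step_0.json')
with open(path, 'w') as fp:
    json.dump(fake_results, fp)

with open(path) as fp:
    loaded = json.load(fp)

# JSON turns the tuples into lists; the first entry of each triple is the value.
best_test_loss = min(value for value, _step, _t in loaded[0]['metrics']['test-loss'])
```

Since JSON has no tuple type, each triple comes back as a three-element list, which still unpacks in the same way.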

## executing runs - FisherFIFO

In [24]:
buffer_size = 295
partition_size = 185
block_updates = 85

n_runs = 20

In [25]:
results_list = []
step_i = 0

for _ in range(n_runs):
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        
    print(f'iteration: {step_i} - testing - buf size: {buffer_size} - part size: {partition_size} - block upd: {block_updates}')

    default_metrics, _ = train_network_fisher_optimization(apply_fisher = True,
                                                           buffer_size = buffer_size,
                                                           partition_size = partition_size,
                                                           block_updates = block_updates,
                                                           net_params = {'p': 0.1},
                                                           epochs = 100,
                                                           time_limit_secs = 1200)

    results_list.append( (default_metrics, buffer_size, partition_size, block_updates) )
    results_list_to_json(results_list, step=step_i)
    step_i += 1
    
    print()

iteration: 0 - testing - buf size: 295 - part size: 185 - block upd: 85
generating CIFAR10 data with 10 classes
Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz
Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 96, 32, 32]           7,296
         MaxPool2d-2           [-1, 96, 16, 16]               0
              ReLU-3           [-1, 96, 16, 16]               0
            Conv2d-4           [-1, 80, 16, 16]         192,080
              ReLU-5           [-1, 80, 16, 16]               0
         MaxPool2d-6             [-1, 80, 8, 8]               0
         Dropout2d-7             [-1, 80, 8, 8]               0
            Conv2d-8             [-1, 96, 8, 8]         192,096
              ReLU-9             [-1, 96, 8, 8]               0
        Dropout2d-10             [-1, 96, 8, 8]               0
           Conv2d-11             [-1, 64, 8, 8]         153,664
             ReLU-12             [-1, 64, 8, 8]               0
        Dropou

batch: 100/1563 - train loss: 5.2975 - test loss: 5.1798 - train acc: 0.6690 - test acc: 0.6773 - 2m 44s
batch: 200/1563 - train loss: 5.3575 - test loss: 5.2123 - train acc: 0.6638 - test acc: 0.6768 - 2m 48s
batch: 300/1563 - train loss: 5.2669 - test loss: 5.4024 - train acc: 0.6638 - test acc: 0.6624 - 2m 51s
batch: 400/1563 - train loss: 5.4236 - test loss: 5.4470 - train acc: 0.6634 - test acc: 0.6520 - 2m 54s
batch: 500/1563 - train loss: 5.1611 - test loss: 5.1458 - train acc: 0.6798 - test acc: 0.6767 - 2m 58s
batch: 600/1563 - train loss: 5.0754 - test loss: 5.4493 - train acc: 0.6731 - test acc: 0.6669 - 3m 1s
batch: 700/1563 - train loss: 5.1895 - test loss: 5.0614 - train acc: 0.6791 - test acc: 0.6851 - 3m 5s
batch: 800/1563 - train loss: 5.1016 - test loss: 5.1790 - train acc: 0.6732 - test acc: 0.6711 - 3m 8s
batch: 900/1563 - train loss: 5.1253 - test loss: 4.9657 - train acc: 0.6766 - test acc: 0.6863 - 3m 11s
batch: 1000/1563 - train loss: 4.9500 - test loss: 4.9738 

batch: 1100/1563 - train loss: 3.3095 - test loss: 3.7346 - train acc: 0.7943 - test acc: 0.7749 - 6m 49s
batch: 1200/1563 - train loss: 3.2930 - test loss: 3.9233 - train acc: 0.7924 - test acc: 0.7662 - 6m 53s
batch: 1300/1563 - train loss: 3.2005 - test loss: 3.7297 - train acc: 0.8018 - test acc: 0.7718 - 6m 56s
batch: 1400/1563 - train loss: 3.1447 - test loss: 3.7143 - train acc: 0.8003 - test acc: 0.7741 - 6m 59s
batch: 1500/1563 - train loss: 3.4344 - test loss: 3.7541 - train acc: 0.7887 - test acc: 0.7689 - 7m 2s
batch: 1563/1563 - train loss: 3.5421 - test loss: 3.7090 - train acc: 0.7803 - test acc: 0.7732 - 7m 5s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 9/100
batch: 100/1563 - train loss: 2.8636 - test loss: 3.6810 - train acc: 0.8209 - test acc: 0.7740 - 7m 8s
batch: 200/1563 - train loss: 2.9818 - test loss: 3.6133 - train acc: 0.8134 - test acc: 0.7813 - 7m 12s
batch: 300/1563 - train loss: 2.8144 - test loss: 3.6

batch: 400/1563 - train loss: 1.9185 - test loss: 3.6110 - train acc: 0.8735 - test acc: 0.7995 - 10m 49s
batch: 500/1563 - train loss: 1.8956 - test loss: 3.7141 - train acc: 0.8794 - test acc: 0.7915 - 10m 52s
batch: 600/1563 - train loss: 2.0261 - test loss: 3.8581 - train acc: 0.8693 - test acc: 0.7821 - 10m 55s
batch: 700/1563 - train loss: 1.9561 - test loss: 3.6597 - train acc: 0.8781 - test acc: 0.7912 - 10m 59s
batch: 800/1563 - train loss: 2.0697 - test loss: 3.6571 - train acc: 0.8729 - test acc: 0.7926 - 11m 2s
batch: 900/1563 - train loss: 2.1837 - test loss: 3.5535 - train acc: 0.8613 - test acc: 0.7974 - 11m 5s
batch: 1000/1563 - train loss: 1.8982 - test loss: 3.7525 - train acc: 0.8838 - test acc: 0.7944 - 11m 9s
batch: 1100/1563 - train loss: 1.9802 - test loss: 3.6428 - train acc: 0.8785 - test acc: 0.7942 - 11m 12s
batch: 1200/1563 - train loss: 2.0826 - test loss: 3.6128 - train acc: 0.8710 - test acc: 0.7907 - 11m 15s
batch: 1300/1563 - train loss: 2.0819 - test l

batch: 1400/1563 - train loss: 1.3755 - test loss: 4.0692 - train acc: 0.9154 - test acc: 0.7977 - 14m 54s
batch: 1500/1563 - train loss: 1.4159 - test loss: 3.9767 - train acc: 0.9101 - test acc: 0.7974 - 14m 57s
batch: 1563/1563 - train loss: 1.4885 - test loss: 4.0249 - train acc: 0.9066 - test acc: 0.7938 - 15m 0s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 18/100
batch: 100/1563 - train loss: 1.0554 - test loss: 3.9741 - train acc: 0.9370 - test acc: 0.8056 - 15m 3s
batch: 200/1563 - train loss: 0.9920 - test loss: 4.1085 - train acc: 0.9361 - test acc: 0.8022 - 15m 7s
batch: 300/1563 - train loss: 1.1057 - test loss: 4.2574 - train acc: 0.9333 - test acc: 0.7971 - 15m 10s
batch: 400/1563 - train loss: 1.0082 - test loss: 4.3377 - train acc: 0.9383 - test acc: 0.7948 - 15m 13s
batch: 500/1563 - train loss: 1.1771 - test loss: 4.0120 - train acc: 0.9270 - test acc: 0.7985 - 15m 17s
batch: 600/1563 - train loss: 1.2223 - test los

batch: 700/1563 - train loss: 0.9320 - test loss: 4.4668 - train acc: 0.9442 - test acc: 0.7977 - 18m 56s
batch: 800/1563 - train loss: 0.8511 - test loss: 4.6326 - train acc: 0.9489 - test acc: 0.7990 - 19m 0s
batch: 900/1563 - train loss: 0.8791 - test loss: 4.5391 - train acc: 0.9483 - test acc: 0.8028 - 19m 3s
batch: 1000/1563 - train loss: 0.9030 - test loss: 4.7097 - train acc: 0.9443 - test acc: 0.7955 - 19m 6s
batch: 1100/1563 - train loss: 0.9157 - test loss: 4.8063 - train acc: 0.9380 - test acc: 0.7974 - 19m 10s
batch: 1200/1563 - train loss: 0.9338 - test loss: 4.7695 - train acc: 0.9417 - test acc: 0.7966 - 19m 13s
batch: 1300/1563 - train loss: 0.7837 - test loss: 4.6285 - train acc: 0.9486 - test acc: 0.8010 - 19m 16s
batch: 1400/1563 - train loss: 0.8529 - test loss: 4.4815 - train acc: 0.9495 - test acc: 0.8046 - 19m 20s
batch: 1500/1563 - train loss: 0.9644 - test loss: 4.4815 - train acc: 0.9399 - test acc: 0.8039 - 19m 23s
batch: 1563/1563 - train loss: 0.9332 - tes

batch: 1000/1563 - train loss: 7.3816 - test loss: 7.2661 - train acc: 0.5159 - test acc: 0.5258 - 1m 23s
batch: 1100/1563 - train loss: 7.3632 - test loss: 7.4260 - train acc: 0.5387 - test acc: 0.5266 - 1m 26s
batch: 1200/1563 - train loss: 7.2745 - test loss: 6.7831 - train acc: 0.5290 - test acc: 0.5687 - 1m 29s
batch: 1300/1563 - train loss: 6.9743 - test loss: 7.0592 - train acc: 0.5622 - test acc: 0.5493 - 1m 33s
batch: 1400/1563 - train loss: 6.8485 - test loss: 6.6006 - train acc: 0.5606 - test acc: 0.5827 - 1m 36s
batch: 1500/1563 - train loss: 6.7763 - test loss: 6.4432 - train acc: 0.5646 - test acc: 0.5906 - 1m 39s
batch: 1563/1563 - train loss: 6.6512 - test loss: 6.4166 - train acc: 0.5768 - test acc: 0.5897 - 1m 42s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 3/100
batch: 100/1563 - train loss: 6.6172 - test loss: 6.8509 - train acc: 0.5866 - test acc: 0.5696 - 1m 45s
batch: 200/1563 - train loss: 6.6036 - test loss:

batch: 300/1563 - train loss: 3.5620 - test loss: 3.9449 - train acc: 0.7750 - test acc: 0.7602 - 5m 22s
batch: 400/1563 - train loss: 3.5640 - test loss: 3.9443 - train acc: 0.7762 - test acc: 0.7603 - 5m 25s
batch: 500/1563 - train loss: 3.7085 - test loss: 4.0083 - train acc: 0.7694 - test acc: 0.7526 - 5m 28s
batch: 600/1563 - train loss: 3.6877 - test loss: 3.9820 - train acc: 0.7800 - test acc: 0.7560 - 5m 31s
batch: 700/1563 - train loss: 3.7112 - test loss: 3.9468 - train acc: 0.7687 - test acc: 0.7567 - 5m 34s
batch: 800/1563 - train loss: 3.6246 - test loss: 3.8362 - train acc: 0.7777 - test acc: 0.7630 - 5m 38s
batch: 900/1563 - train loss: 3.5768 - test loss: 3.8861 - train acc: 0.7780 - test acc: 0.7639 - 5m 41s
batch: 1000/1563 - train loss: 3.6157 - test loss: 3.8324 - train acc: 0.7771 - test acc: 0.7686 - 5m 44s
batch: 1100/1563 - train loss: 3.5484 - test loss: 3.8239 - train acc: 0.7768 - test acc: 0.7698 - 5m 48s
batch: 1200/1563 - train loss: 3.4461 - test loss: 3.

batch: 1300/1563 - train loss: 2.4977 - test loss: 3.7555 - train acc: 0.8494 - test acc: 0.7860 - 9m 24s
batch: 1400/1563 - train loss: 2.4267 - test loss: 3.4000 - train acc: 0.8468 - test acc: 0.8007 - 9m 28s
batch: 1500/1563 - train loss: 2.5217 - test loss: 3.4157 - train acc: 0.8465 - test acc: 0.8033 - 9m 31s
batch: 1563/1563 - train loss: 2.4299 - test loss: 3.6736 - train acc: 0.8565 - test acc: 0.7846 - 9m 34s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 12/100
batch: 100/1563 - train loss: 2.0834 - test loss: 3.5753 - train acc: 0.8741 - test acc: 0.7921 - 9m 37s
batch: 200/1563 - train loss: 2.0735 - test loss: 3.5336 - train acc: 0.8732 - test acc: 0.7987 - 9m 40s
batch: 300/1563 - train loss: 1.9534 - test loss: 3.6804 - train acc: 0.8775 - test acc: 0.7890 - 9m 44s
batch: 400/1563 - train loss: 2.1322 - test loss: 3.5446 - train acc: 0.8659 - test acc: 0.7919 - 9m 47s
batch: 500/1563 - train loss: 2.2262 - test loss: 3

batch: 600/1563 - train loss: 1.2547 - test loss: 3.8014 - train acc: 0.9200 - test acc: 0.8066 - 13m 24s
batch: 700/1563 - train loss: 1.3438 - test loss: 3.9585 - train acc: 0.9132 - test acc: 0.7999 - 13m 28s
batch: 800/1563 - train loss: 1.4244 - test loss: 4.0562 - train acc: 0.9060 - test acc: 0.7958 - 13m 31s
batch: 900/1563 - train loss: 1.4682 - test loss: 3.8624 - train acc: 0.9051 - test acc: 0.7993 - 13m 34s
batch: 1000/1563 - train loss: 1.5004 - test loss: 3.7681 - train acc: 0.9063 - test acc: 0.8040 - 13m 38s
batch: 1100/1563 - train loss: 1.5323 - test loss: 3.9033 - train acc: 0.9038 - test acc: 0.7961 - 13m 41s
batch: 1200/1563 - train loss: 1.6283 - test loss: 3.7645 - train acc: 0.8909 - test acc: 0.7968 - 13m 45s
batch: 1300/1563 - train loss: 1.4123 - test loss: 3.8612 - train acc: 0.9079 - test acc: 0.7965 - 13m 48s
batch: 1400/1563 - train loss: 1.4609 - test loss: 3.9407 - train acc: 0.9083 - test acc: 0.7996 - 13m 51s
batch: 1500/1563 - train loss: 1.5952 - t

batch: 1563/1563 - train loss: 1.1575 - test loss: 4.1766 - train acc: 0.9333 - test acc: 0.8009 - 17m 30s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 21/100
batch: 100/1563 - train loss: 0.6937 - test loss: 4.5035 - train acc: 0.9555 - test acc: 0.8097 - 17m 34s
batch: 200/1563 - train loss: 0.7711 - test loss: 4.4368 - train acc: 0.9520 - test acc: 0.8039 - 17m 37s
batch: 300/1563 - train loss: 0.8895 - test loss: 4.5332 - train acc: 0.9493 - test acc: 0.8001 - 17m 40s
batch: 400/1563 - train loss: 0.8437 - test loss: 4.3741 - train acc: 0.9498 - test acc: 0.8004 - 17m 44s
batch: 500/1563 - train loss: 0.9669 - test loss: 4.4556 - train acc: 0.9390 - test acc: 0.7975 - 17m 47s
batch: 600/1563 - train loss: 0.8028 - test loss: 4.5419 - train acc: 0.9458 - test acc: 0.7948 - 17m 51s
batch: 700/1563 - train loss: 0.8444 - test loss: 4.5544 - train acc: 0.9479 - test acc: 0.8002 - 17m 54s
batch: 800/1563 - train loss: 0.9115 - test lo

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0235 - test loss: 13.0213 - train acc: 0.1099 - test acc: 0.1000 - 0m 1s
batch: 200/1563 - train loss: 12.9934 - test loss: 12.8951 - train acc: 0.1090 - test acc: 0.1013 - 0m 4s
batch: 300/1563 - train loss: 12.3640 - test loss: 11.7590 - train acc: 0.1703 - test acc: 0.2229 - 0m 7s
batch: 400/1563 - train loss: 11.6005 - test loss: 11.6005 - train acc: 0.2285 - test acc: 0.2267 - 0m 11s
batch: 500/1563 - train loss: 11.2534 - test loss: 10.9572 - train acc: 0.2660 - test acc: 0.2784 - 0m 14s
batch: 600/1563 - train loss: 10.8349 - test loss: 10.5344 - train acc: 0.2913 - test acc: 0.3158 - 0m 17s
batch: 700/1563 - train loss: 10.3637 - test loss: 10.1711 - train acc: 0.3056 - test acc: 0.3324 - 0m 20s
batch: 800/1563 - train loss: 10.1462 - test loss: 9.7412 - train acc: 0.3253 - test acc: 0.3567 - 0m 24s
batch: 900/1563 - train loss: 9.9592 - test loss: 9.5706 - train acc: 0.3518 - test acc: 0.3742 - 0m 27s
b

batch: 1100/1563 - train loss: 4.4058 - test loss: 4.3632 - train acc: 0.7256 - test acc: 0.7264 - 4m 2s
batch: 1200/1563 - train loss: 4.4556 - test loss: 4.3890 - train acc: 0.7178 - test acc: 0.7297 - 4m 6s
batch: 1300/1563 - train loss: 4.1922 - test loss: 4.2104 - train acc: 0.7413 - test acc: 0.7406 - 4m 9s
batch: 1400/1563 - train loss: 4.1530 - test loss: 4.4010 - train acc: 0.7459 - test acc: 0.7324 - 4m 12s
batch: 1500/1563 - train loss: 4.1588 - test loss: 4.8767 - train acc: 0.7353 - test acc: 0.7022 - 4m 16s
batch: 1563/1563 - train loss: 4.2164 - test loss: 4.2135 - train acc: 0.7410 - test acc: 0.7416 - 4m 19s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 3.8508 - test loss: 4.3317 - train acc: 0.7622 - test acc: 0.7372 - 4m 22s
batch: 200/1563 - train loss: 3.9453 - test loss: 4.2361 - train acc: 0.7534 - test acc: 0.7416 - 4m 25s
batch: 300/1563 - train loss: 4.0488 - test loss: 4.4

batch: 400/1563 - train loss: 2.5769 - test loss: 3.5808 - train acc: 0.8402 - test acc: 0.7863 - 8m 1s
batch: 500/1563 - train loss: 2.5507 - test loss: 3.5971 - train acc: 0.8431 - test acc: 0.7861 - 8m 5s
batch: 600/1563 - train loss: 2.4850 - test loss: 3.5387 - train acc: 0.8428 - test acc: 0.7915 - 8m 8s
batch: 700/1563 - train loss: 2.7143 - test loss: 3.6180 - train acc: 0.8368 - test acc: 0.7887 - 8m 11s
batch: 800/1563 - train loss: 2.7596 - test loss: 3.4989 - train acc: 0.8275 - test acc: 0.7883 - 8m 14s
batch: 900/1563 - train loss: 2.6270 - test loss: 3.5993 - train acc: 0.8371 - test acc: 0.7878 - 8m 18s
batch: 1000/1563 - train loss: 2.7050 - test loss: 3.8132 - train acc: 0.8343 - test acc: 0.7773 - 8m 21s
batch: 1100/1563 - train loss: 2.8118 - test loss: 3.5826 - train acc: 0.8240 - test acc: 0.7820 - 8m 24s
batch: 1200/1563 - train loss: 2.8253 - test loss: 3.7722 - train acc: 0.8222 - test acc: 0.7724 - 8m 28s
batch: 1300/1563 - train loss: 2.6858 - test loss: 3.64

batch: 1400/1563 - train loss: 1.9456 - test loss: 3.7558 - train acc: 0.8712 - test acc: 0.7952 - 12m 6s
batch: 1500/1563 - train loss: 1.8720 - test loss: 3.5418 - train acc: 0.8847 - test acc: 0.7995 - 12m 9s
batch: 1563/1563 - train loss: 1.9549 - test loss: 3.5886 - train acc: 0.8835 - test acc: 0.7966 - 12m 12s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.5335 - test loss: 3.9282 - train acc: 0.9061 - test acc: 0.7941 - 12m 16s
batch: 200/1563 - train loss: 1.4953 - test loss: 3.6606 - train acc: 0.9057 - test acc: 0.8043 - 12m 19s
batch: 300/1563 - train loss: 1.6390 - test loss: 3.5535 - train acc: 0.8976 - test acc: 0.8034 - 12m 22s
batch: 400/1563 - train loss: 1.6102 - test loss: 3.7529 - train acc: 0.9007 - test acc: 0.7958 - 12m 25s
batch: 500/1563 - train loss: 1.4920 - test loss: 3.6856 - train acc: 0.9089 - test acc: 0.8009 - 12m 29s
batch: 600/1563 - train loss: 1.4852 - test lo

batch: 700/1563 - train loss: 1.1788 - test loss: 4.1038 - train acc: 0.9283 - test acc: 0.7996 - 16m 9s
batch: 800/1563 - train loss: 1.0778 - test loss: 4.2488 - train acc: 0.9361 - test acc: 0.7939 - 16m 12s
batch: 900/1563 - train loss: 1.1615 - test loss: 4.0904 - train acc: 0.9255 - test acc: 0.8014 - 16m 15s
batch: 1000/1563 - train loss: 1.0924 - test loss: 4.1398 - train acc: 0.9296 - test acc: 0.7999 - 16m 18s
batch: 1100/1563 - train loss: 1.1396 - test loss: 4.3414 - train acc: 0.9295 - test acc: 0.7988 - 16m 22s
batch: 1200/1563 - train loss: 1.1344 - test loss: 4.2255 - train acc: 0.9330 - test acc: 0.8039 - 16m 25s
batch: 1300/1563 - train loss: 1.1262 - test loss: 4.0312 - train acc: 0.9301 - test acc: 0.8055 - 16m 29s
batch: 1400/1563 - train loss: 1.3055 - test loss: 4.0020 - train acc: 0.9192 - test acc: 0.7999 - 16m 32s
batch: 1500/1563 - train loss: 1.1626 - test loss: 4.0925 - train acc: 0.9286 - test acc: 0.8035 - 16m 35s
batch: 1563/1563 - train loss: 1.2771 - t

Files already downloaded and verified
Files already downloaded and verified
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 96, 32, 32]           7,296
         MaxPool2d-2           [-1, 96, 16, 16]               0
              ReLU-3           [-1, 96, 16, 16]               0
            Conv2d-4           [-1, 80, 16, 16]         192,080
              ReLU-5           [-1, 80, 16, 16]               0
         MaxPool2d-6             [-1, 80, 8, 8]               0
         Dropout2d-7             [-1, 80, 8, 8]               0
            Conv2d-8             [-1, 96, 8, 8]         192,096
              ReLU-9             [-1, 96, 8, 8]               0
        Dropout2d-10             [-1, 96, 8, 8]               0
           Conv2d-11             [-1, 64, 8, 8]         153,664
             ReLU-12             [-1, 64, 8, 8]               0
        Dropout2d-13       

batch: 100/1563 - train loss: 5.4141 - test loss: 5.2678 - train acc: 0.6613 - test acc: 0.6729 - 2m 37s
batch: 200/1563 - train loss: 5.4288 - test loss: 5.3719 - train acc: 0.6691 - test acc: 0.6655 - 2m 40s
batch: 300/1563 - train loss: 5.3311 - test loss: 5.3957 - train acc: 0.6615 - test acc: 0.6592 - 2m 43s
batch: 400/1563 - train loss: 5.3273 - test loss: 5.2664 - train acc: 0.6681 - test acc: 0.6711 - 2m 47s
batch: 500/1563 - train loss: 5.6160 - test loss: 5.5307 - train acc: 0.6391 - test acc: 0.6530 - 2m 50s
batch: 600/1563 - train loss: 5.2860 - test loss: 5.2670 - train acc: 0.6701 - test acc: 0.6723 - 2m 53s
batch: 700/1563 - train loss: 5.2454 - test loss: 5.1020 - train acc: 0.6766 - test acc: 0.6831 - 2m 57s
batch: 800/1563 - train loss: 5.1398 - test loss: 5.1992 - train acc: 0.6679 - test acc: 0.6707 - 3m 0s
batch: 900/1563 - train loss: 5.2867 - test loss: 5.0758 - train acc: 0.6763 - test acc: 0.6830 - 3m 3s
batch: 1000/1563 - train loss: 4.9780 - test loss: 5.3109

batch: 1100/1563 - train loss: 3.4798 - test loss: 3.6630 - train acc: 0.7878 - test acc: 0.7810 - 6m 39s
batch: 1200/1563 - train loss: 3.2302 - test loss: 3.7447 - train acc: 0.8033 - test acc: 0.7734 - 6m 42s
batch: 1300/1563 - train loss: 3.3106 - test loss: 3.8674 - train acc: 0.7940 - test acc: 0.7725 - 6m 45s
batch: 1400/1563 - train loss: 3.3231 - test loss: 3.6701 - train acc: 0.7937 - test acc: 0.7769 - 6m 49s
batch: 1500/1563 - train loss: 3.3271 - test loss: 3.7536 - train acc: 0.7952 - test acc: 0.7707 - 6m 52s
batch: 1563/1563 - train loss: 3.3306 - test loss: 3.7908 - train acc: 0.7918 - test acc: 0.7659 - 6m 55s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 9/100
batch: 100/1563 - train loss: 3.0411 - test loss: 3.8384 - train acc: 0.8143 - test acc: 0.7703 - 6m 58s
batch: 200/1563 - train loss: 2.7753 - test loss: 3.6755 - train acc: 0.8334 - test acc: 0.7755 - 7m 1s
batch: 300/1563 - train loss: 2.8088 - test loss: 3

batch: 400/1563 - train loss: 2.0307 - test loss: 3.6324 - train acc: 0.8794 - test acc: 0.7909 - 10m 39s
batch: 500/1563 - train loss: 2.0664 - test loss: 3.5682 - train acc: 0.8709 - test acc: 0.7948 - 10m 42s
batch: 600/1563 - train loss: 2.0735 - test loss: 3.6289 - train acc: 0.8688 - test acc: 0.7868 - 10m 45s
batch: 700/1563 - train loss: 2.0060 - test loss: 3.7062 - train acc: 0.8688 - test acc: 0.7905 - 10m 49s
batch: 800/1563 - train loss: 1.9937 - test loss: 3.5620 - train acc: 0.8781 - test acc: 0.7963 - 10m 52s
batch: 900/1563 - train loss: 2.1792 - test loss: 3.5454 - train acc: 0.8631 - test acc: 0.7991 - 10m 55s
batch: 1000/1563 - train loss: 2.1286 - test loss: 3.5441 - train acc: 0.8691 - test acc: 0.8005 - 10m 58s
batch: 1100/1563 - train loss: 1.9835 - test loss: 3.5825 - train acc: 0.8759 - test acc: 0.7901 - 11m 2s
batch: 1200/1563 - train loss: 2.1171 - test loss: 3.8474 - train acc: 0.8666 - test acc: 0.7889 - 11m 5s
batch: 1300/1563 - train loss: 2.3017 - test 

batch: 1400/1563 - train loss: 1.4083 - test loss: 3.9809 - train acc: 0.9107 - test acc: 0.7985 - 14m 44s
batch: 1500/1563 - train loss: 1.3878 - test loss: 3.8738 - train acc: 0.9098 - test acc: 0.7940 - 14m 48s
batch: 1563/1563 - train loss: 1.3645 - test loss: 3.9187 - train acc: 0.9145 - test acc: 0.7939 - 14m 50s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 18/100
batch: 100/1563 - train loss: 1.1834 - test loss: 4.0967 - train acc: 0.9277 - test acc: 0.7888 - 14m 54s
batch: 200/1563 - train loss: 1.0328 - test loss: 4.1069 - train acc: 0.9345 - test acc: 0.7980 - 14m 57s
batch: 300/1563 - train loss: 1.1436 - test loss: 4.0759 - train acc: 0.9270 - test acc: 0.7997 - 15m 0s
batch: 400/1563 - train loss: 1.2274 - test loss: 3.9910 - train acc: 0.9238 - test acc: 0.7979 - 15m 4s
batch: 500/1563 - train loss: 1.2444 - test loss: 3.9855 - train acc: 0.9207 - test acc: 0.7989 - 15m 7s
batch: 600/1563 - train loss: 1.1810 - test los

batch: 700/1563 - train loss: 0.8336 - test loss: 4.6872 - train acc: 0.9464 - test acc: 0.7932 - 18m 48s
batch: 800/1563 - train loss: 0.9386 - test loss: 4.5733 - train acc: 0.9408 - test acc: 0.7961 - 18m 51s
batch: 900/1563 - train loss: 0.8162 - test loss: 4.6884 - train acc: 0.9499 - test acc: 0.7951 - 18m 54s
batch: 1000/1563 - train loss: 0.8398 - test loss: 4.6087 - train acc: 0.9465 - test acc: 0.8002 - 18m 58s
batch: 1100/1563 - train loss: 0.8943 - test loss: 4.4651 - train acc: 0.9464 - test acc: 0.8019 - 19m 1s
batch: 1200/1563 - train loss: 0.9140 - test loss: 4.3097 - train acc: 0.9461 - test acc: 0.8000 - 19m 4s
batch: 1300/1563 - train loss: 0.8863 - test loss: 4.5669 - train acc: 0.9427 - test acc: 0.7981 - 19m 8s
batch: 1400/1563 - train loss: 0.9749 - test loss: 4.4228 - train acc: 0.9389 - test acc: 0.8007 - 19m 11s
batch: 1500/1563 - train loss: 0.9158 - test loss: 4.5967 - train acc: 0.9439 - test acc: 0.7976 - 19m 14s
batch: 1563/1563 - train loss: 0.9951 - tes

batch: 800/1563 - train loss: 7.6683 - test loss: 7.1293 - train acc: 0.5063 - test acc: 0.5441 - 1m 17s
batch: 900/1563 - train loss: 7.4044 - test loss: 7.1060 - train acc: 0.5266 - test acc: 0.5474 - 1m 20s
batch: 1000/1563 - train loss: 7.3313 - test loss: 7.1233 - train acc: 0.5213 - test acc: 0.5400 - 1m 23s
batch: 1100/1563 - train loss: 7.3698 - test loss: 7.3436 - train acc: 0.5346 - test acc: 0.5312 - 1m 27s
batch: 1200/1563 - train loss: 7.1832 - test loss: 6.6445 - train acc: 0.5313 - test acc: 0.5744 - 1m 30s
batch: 1300/1563 - train loss: 7.1262 - test loss: 6.9045 - train acc: 0.5515 - test acc: 0.5608 - 1m 33s
batch: 1400/1563 - train loss: 6.9203 - test loss: 6.5923 - train acc: 0.5638 - test acc: 0.5871 - 1m 36s
batch: 1500/1563 - train loss: 6.8958 - test loss: 6.4977 - train acc: 0.5597 - test acc: 0.5929 - 1m 40s
batch: 1563/1563 - train loss: 6.7760 - test loss: 6.4851 - train acc: 0.5678 - test acc: 0.5871 - 1m 42s
GPU memory used: 2.88 GB - max: 3.19 GB - memory

batch: 100/1563 - train loss: 3.5854 - test loss: 3.9762 - train acc: 0.7812 - test acc: 0.7570 - 5m 16s
batch: 200/1563 - train loss: 3.4285 - test loss: 3.9702 - train acc: 0.7993 - test acc: 0.7578 - 5m 19s
batch: 300/1563 - train loss: 3.5583 - test loss: 3.9075 - train acc: 0.7712 - test acc: 0.7559 - 5m 23s
batch: 400/1563 - train loss: 3.7653 - test loss: 3.9946 - train acc: 0.7647 - test acc: 0.7535 - 5m 26s
batch: 500/1563 - train loss: 3.5274 - test loss: 4.1159 - train acc: 0.7762 - test acc: 0.7504 - 5m 30s
batch: 600/1563 - train loss: 3.7399 - test loss: 3.9030 - train acc: 0.7709 - test acc: 0.7632 - 5m 33s
batch: 700/1563 - train loss: 3.6358 - test loss: 3.9435 - train acc: 0.7769 - test acc: 0.7595 - 5m 36s
batch: 800/1563 - train loss: 3.6354 - test loss: 3.9392 - train acc: 0.7672 - test acc: 0.7626 - 5m 40s
batch: 900/1563 - train loss: 3.7736 - test loss: 3.9975 - train acc: 0.7643 - test acc: 0.7559 - 5m 43s
batch: 1000/1563 - train loss: 3.6351 - test loss: 3.83

batch: 1100/1563 - train loss: 2.5422 - test loss: 3.4298 - train acc: 0.8466 - test acc: 0.7970 - 9m 20s
batch: 1200/1563 - train loss: 2.4238 - test loss: 3.5517 - train acc: 0.8484 - test acc: 0.7954 - 9m 23s
batch: 1300/1563 - train loss: 2.4451 - test loss: 3.4947 - train acc: 0.8499 - test acc: 0.7925 - 9m 27s
batch: 1400/1563 - train loss: 2.4045 - test loss: 3.6321 - train acc: 0.8481 - test acc: 0.7876 - 9m 30s
batch: 1500/1563 - train loss: 2.6349 - test loss: 3.3525 - train acc: 0.8434 - test acc: 0.7993 - 9m 33s
batch: 1563/1563 - train loss: 2.6158 - test loss: 3.5737 - train acc: 0.8381 - test acc: 0.7911 - 9m 36s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 12/100
batch: 100/1563 - train loss: 1.8970 - test loss: 3.5558 - train acc: 0.8854 - test acc: 0.7960 - 9m 40s
batch: 200/1563 - train loss: 2.0115 - test loss: 3.5297 - train acc: 0.8813 - test acc: 0.8031 - 9m 43s
batch: 300/1563 - train loss: 2.1117 - test loss:

batch: 400/1563 - train loss: 1.3345 - test loss: 3.9636 - train acc: 0.9132 - test acc: 0.7962 - 13m 21s
batch: 500/1563 - train loss: 1.3597 - test loss: 3.7891 - train acc: 0.9167 - test acc: 0.7973 - 13m 24s
batch: 600/1563 - train loss: 1.4714 - test loss: 3.8692 - train acc: 0.9067 - test acc: 0.8004 - 13m 28s
batch: 700/1563 - train loss: 1.3690 - test loss: 4.1152 - train acc: 0.9136 - test acc: 0.7962 - 13m 31s
batch: 800/1563 - train loss: 1.5048 - test loss: 3.7307 - train acc: 0.9045 - test acc: 0.8020 - 13m 34s
batch: 900/1563 - train loss: 1.3995 - test loss: 3.9649 - train acc: 0.9151 - test acc: 0.8022 - 13m 37s
batch: 1000/1563 - train loss: 1.4237 - test loss: 3.7586 - train acc: 0.9145 - test acc: 0.7982 - 13m 41s
batch: 1100/1563 - train loss: 1.6033 - test loss: 3.5989 - train acc: 0.8985 - test acc: 0.7988 - 13m 44s
batch: 1200/1563 - train loss: 1.4713 - test loss: 4.1240 - train acc: 0.9060 - test acc: 0.7881 - 13m 47s
batch: 1300/1563 - train loss: 1.4566 - tes

batch: 1400/1563 - train loss: 1.0236 - test loss: 4.3777 - train acc: 0.9405 - test acc: 0.7998 - 17m 28s
batch: 1500/1563 - train loss: 1.0738 - test loss: 4.1516 - train acc: 0.9330 - test acc: 0.8050 - 17m 32s
batch: 1563/1563 - train loss: 1.0566 - test loss: 4.5313 - train acc: 0.9339 - test acc: 0.7962 - 17m 34s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 21/100
batch: 100/1563 - train loss: 0.7711 - test loss: 4.2820 - train acc: 0.9515 - test acc: 0.8081 - 17m 38s
batch: 200/1563 - train loss: 0.7742 - test loss: 4.4422 - train acc: 0.9489 - test acc: 0.8020 - 17m 42s
batch: 300/1563 - train loss: 0.8768 - test loss: 4.6504 - train acc: 0.9470 - test acc: 0.7999 - 17m 45s
batch: 400/1563 - train loss: 0.8179 - test loss: 4.5522 - train acc: 0.9495 - test acc: 0.8041 - 17m 48s
batch: 500/1563 - train loss: 0.8369 - test loss: 4.5635 - train acc: 0.9462 - test acc: 0.7962 - 17m 52s
batch: 600/1563 - train loss: 0.8259 - test 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0297 - test loss: 13.0200 - train acc: 0.1062 - test acc: 0.1000 - 0m 1s
batch: 200/1563 - train loss: 13.0155 - test loss: 13.0042 - train acc: 0.0999 - test acc: 0.1361 - 0m 4s
batch: 300/1563 - train loss: 12.8545 - test loss: 12.2273 - train acc: 0.1319 - test acc: 0.1614 - 0m 8s
batch: 400/1563 - train loss: 12.1702 - test loss: 11.9193 - train acc: 0.2000 - test acc: 0.2175 - 0m 11s
batch: 500/1563 - train loss: 11.4403 - test loss: 11.0768 - train acc: 0.2519 - test acc: 0.2807 - 0m 14s
batch: 600/1563 - train loss: 10.9064 - test loss: 10.5910 - train acc: 0.2828 - test acc: 0.2930 - 0m 18s
batch: 700/1563 - train loss: 10.3880 - test loss: 10.4228 - train acc: 0.3140 - test acc: 0.3007 - 0m 21s
batch: 800/1563 - train loss: 10.1728 - test loss: 9.8134 - train acc: 0.3232 - test acc: 0.3434 - 0m 25s
batch: 900/1563 - train loss: 9.7679 - test loss: 9.6866 - train acc: 0.3538 - test acc: 0.3619 - 0m 28s
b

batch: 1100/1563 - train loss: 4.3900 - test loss: 4.3079 - train acc: 0.7234 - test acc: 0.7362 - 4m 4s
batch: 1200/1563 - train loss: 4.3489 - test loss: 4.5100 - train acc: 0.7357 - test acc: 0.7176 - 4m 7s
batch: 1300/1563 - train loss: 4.3045 - test loss: 4.2325 - train acc: 0.7297 - test acc: 0.7397 - 4m 11s
batch: 1400/1563 - train loss: 4.1843 - test loss: 4.3091 - train acc: 0.7394 - test acc: 0.7376 - 4m 14s
batch: 1500/1563 - train loss: 4.2561 - test loss: 4.1833 - train acc: 0.7338 - test acc: 0.7466 - 4m 17s
batch: 1563/1563 - train loss: 4.1652 - test loss: 4.5979 - train acc: 0.7415 - test acc: 0.7160 - 4m 20s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.0667 - test loss: 4.2640 - train acc: 0.7493 - test acc: 0.7367 - 4m 23s
batch: 200/1563 - train loss: 3.9914 - test loss: 4.3621 - train acc: 0.7494 - test acc: 0.7308 - 4m 27s
batch: 300/1563 - train loss: 4.0264 - test loss: 4.

batch: 400/1563 - train loss: 2.3970 - test loss: 3.5875 - train acc: 0.8556 - test acc: 0.7901 - 8m 4s
batch: 500/1563 - train loss: 2.6122 - test loss: 3.5265 - train acc: 0.8390 - test acc: 0.7919 - 8m 7s
batch: 600/1563 - train loss: 2.6913 - test loss: 3.6688 - train acc: 0.8312 - test acc: 0.7851 - 8m 11s
batch: 700/1563 - train loss: 2.5683 - test loss: 3.5110 - train acc: 0.8309 - test acc: 0.7921 - 8m 14s
batch: 800/1563 - train loss: 2.6389 - test loss: 3.5060 - train acc: 0.8359 - test acc: 0.7930 - 8m 17s
batch: 900/1563 - train loss: 2.8509 - test loss: 3.3689 - train acc: 0.8205 - test acc: 0.7975 - 8m 21s
batch: 1000/1563 - train loss: 2.7352 - test loss: 3.4351 - train acc: 0.8315 - test acc: 0.7929 - 8m 24s
batch: 1100/1563 - train loss: 2.7074 - test loss: 3.5594 - train acc: 0.8274 - test acc: 0.7937 - 8m 27s
batch: 1200/1563 - train loss: 2.5906 - test loss: 3.6025 - train acc: 0.8353 - test acc: 0.7893 - 8m 30s
batch: 1300/1563 - train loss: 2.7804 - test loss: 3.3

batch: 1400/1563 - train loss: 1.8748 - test loss: 3.4532 - train acc: 0.8844 - test acc: 0.8059 - 12m 9s
batch: 1500/1563 - train loss: 1.9158 - test loss: 3.4998 - train acc: 0.8791 - test acc: 0.8065 - 12m 13s
batch: 1563/1563 - train loss: 1.8195 - test loss: 3.5573 - train acc: 0.8916 - test acc: 0.8047 - 12m 16s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.3942 - test loss: 3.5940 - train acc: 0.9129 - test acc: 0.8100 - 12m 19s
batch: 200/1563 - train loss: 1.3266 - test loss: 3.6945 - train acc: 0.9142 - test acc: 0.8091 - 12m 23s
batch: 300/1563 - train loss: 1.4707 - test loss: 3.7310 - train acc: 0.9051 - test acc: 0.8056 - 12m 26s
batch: 400/1563 - train loss: 1.3958 - test loss: 3.7217 - train acc: 0.9161 - test acc: 0.8095 - 12m 29s
batch: 500/1563 - train loss: 1.4723 - test loss: 3.6575 - train acc: 0.9088 - test acc: 0.8010 - 12m 32s
batch: 600/1563 - train loss: 1.5043 - test l

batch: 700/1563 - train loss: 1.0322 - test loss: 4.1454 - train acc: 0.9323 - test acc: 0.8084 - 16m 13s
batch: 800/1563 - train loss: 1.1138 - test loss: 4.1037 - train acc: 0.9295 - test acc: 0.8061 - 16m 16s
batch: 900/1563 - train loss: 1.0434 - test loss: 4.2524 - train acc: 0.9342 - test acc: 0.8110 - 16m 20s
batch: 1000/1563 - train loss: 0.9805 - test loss: 4.3265 - train acc: 0.9389 - test acc: 0.8020 - 16m 23s
batch: 1100/1563 - train loss: 1.2194 - test loss: 4.3048 - train acc: 0.9242 - test acc: 0.7997 - 16m 27s
batch: 1200/1563 - train loss: 1.2794 - test loss: 3.9783 - train acc: 0.9213 - test acc: 0.8094 - 16m 30s
batch: 1300/1563 - train loss: 1.1336 - test loss: 4.0856 - train acc: 0.9286 - test acc: 0.8111 - 16m 34s
batch: 1400/1563 - train loss: 1.0593 - test loss: 4.0720 - train acc: 0.9276 - test acc: 0.8075 - 16m 37s
batch: 1500/1563 - train loss: 1.1693 - test loss: 3.9464 - train acc: 0.9264 - test acc: 0.8085 - 16m 40s
batch: 1563/1563 - train loss: 1.1397 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0238 - test loss: 13.0186 - train acc: 0.1012 - test acc: 0.1029 - 0m 1s
batch: 200/1563 - train loss: 12.9966 - test loss: 12.9739 - train acc: 0.1165 - test acc: 0.1013 - 0m 4s
batch: 300/1563 - train loss: 12.8186 - test loss: 12.4728 - train acc: 0.1429 - test acc: 0.1761 - 0m 8s
batch: 400/1563 - train loss: 11.9293 - test loss: 11.6356 - train acc: 0.2094 - test acc: 0.2547 - 0m 11s
batch: 500/1563 - train loss: 11.4689 - test loss: 10.8505 - train acc: 0.2537 - test acc: 0.2966 - 0m 14s
batch: 600/1563 - train loss: 10.8361 - test loss: 10.3134 - train acc: 0.2880 - test acc: 0.3315 - 0m 18s
batch: 700/1563 - train loss: 10.3850 - test loss: 9.8309 - train acc: 0.3265 - test acc: 0.3573 - 0m 21s
batch: 800/1563 - train loss: 10.0457 - test loss: 9.7904 - train acc: 0.3328 - test acc: 0.3555 - 0m 24s
batch: 900/1563 - train loss: 9.8722 - test loss: 9.4792 - train acc: 0.3503 - test acc: 0.3661 - 0m 27s
ba

batch: 1100/1563 - train loss: 4.6614 - test loss: 4.3336 - train acc: 0.7116 - test acc: 0.7338 - 4m 3s
batch: 1200/1563 - train loss: 4.4542 - test loss: 4.4223 - train acc: 0.7269 - test acc: 0.7270 - 4m 7s
batch: 1300/1563 - train loss: 4.2131 - test loss: 4.7025 - train acc: 0.7365 - test acc: 0.7145 - 4m 10s
batch: 1400/1563 - train loss: 4.3398 - test loss: 4.3785 - train acc: 0.7347 - test acc: 0.7327 - 4m 13s
batch: 1500/1563 - train loss: 4.2706 - test loss: 4.4937 - train acc: 0.7391 - test acc: 0.7240 - 4m 17s
batch: 1563/1563 - train loss: 4.1045 - test loss: 4.2198 - train acc: 0.7465 - test acc: 0.7391 - 4m 20s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.1218 - test loss: 4.1369 - train acc: 0.7428 - test acc: 0.7437 - 4m 23s
batch: 200/1563 - train loss: 3.9386 - test loss: 4.1570 - train acc: 0.7606 - test acc: 0.7443 - 4m 26s
batch: 300/1563 - train loss: 3.9695 - test loss: 4.

batch: 400/1563 - train loss: 2.6995 - test loss: 3.5740 - train acc: 0.8280 - test acc: 0.7941 - 8m 4s
batch: 500/1563 - train loss: 2.5949 - test loss: 3.4950 - train acc: 0.8424 - test acc: 0.7898 - 8m 7s
batch: 600/1563 - train loss: 2.5531 - test loss: 3.4012 - train acc: 0.8441 - test acc: 0.8002 - 8m 11s
batch: 700/1563 - train loss: 2.6427 - test loss: 3.5570 - train acc: 0.8350 - test acc: 0.7893 - 8m 14s
batch: 800/1563 - train loss: 2.6448 - test loss: 3.5982 - train acc: 0.8372 - test acc: 0.7887 - 8m 17s
batch: 900/1563 - train loss: 2.6989 - test loss: 3.5398 - train acc: 0.8372 - test acc: 0.7898 - 8m 21s
batch: 1000/1563 - train loss: 2.6534 - test loss: 3.4717 - train acc: 0.8324 - test acc: 0.7883 - 8m 24s
batch: 1100/1563 - train loss: 2.5432 - test loss: 3.6142 - train acc: 0.8399 - test acc: 0.7853 - 8m 27s
batch: 1200/1563 - train loss: 2.7102 - test loss: 3.7255 - train acc: 0.8275 - test acc: 0.7797 - 8m 30s
batch: 1300/1563 - train loss: 2.8517 - test loss: 3.4

batch: 1400/1563 - train loss: 1.7471 - test loss: 3.5522 - train acc: 0.8826 - test acc: 0.8056 - 12m 9s
batch: 1500/1563 - train loss: 1.9898 - test loss: 3.5867 - train acc: 0.8772 - test acc: 0.8060 - 12m 13s
batch: 1563/1563 - train loss: 1.7227 - test loss: 3.7360 - train acc: 0.8882 - test acc: 0.8010 - 12m 16s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.4043 - test loss: 3.7490 - train acc: 0.9192 - test acc: 0.8055 - 12m 19s
batch: 200/1563 - train loss: 1.3614 - test loss: 3.8309 - train acc: 0.9113 - test acc: 0.8007 - 12m 23s
batch: 300/1563 - train loss: 1.5303 - test loss: 3.7670 - train acc: 0.9082 - test acc: 0.8059 - 12m 26s
batch: 400/1563 - train loss: 1.5263 - test loss: 3.6257 - train acc: 0.9051 - test acc: 0.8036 - 12m 30s
batch: 500/1563 - train loss: 1.4270 - test loss: 3.6493 - train acc: 0.9142 - test acc: 0.8024 - 12m 33s
batch: 600/1563 - train loss: 1.6445 - test l

batch: 700/1563 - train loss: 1.0672 - test loss: 4.1878 - train acc: 0.9286 - test acc: 0.8021 - 16m 13s
batch: 800/1563 - train loss: 1.0886 - test loss: 4.1747 - train acc: 0.9311 - test acc: 0.8022 - 16m 17s
batch: 900/1563 - train loss: 1.0336 - test loss: 4.4167 - train acc: 0.9351 - test acc: 0.7963 - 16m 20s
batch: 1000/1563 - train loss: 1.2384 - test loss: 4.0694 - train acc: 0.9176 - test acc: 0.8046 - 16m 23s
batch: 1100/1563 - train loss: 1.1683 - test loss: 4.0665 - train acc: 0.9270 - test acc: 0.8023 - 16m 27s
batch: 1200/1563 - train loss: 1.1069 - test loss: 4.1931 - train acc: 0.9327 - test acc: 0.8037 - 16m 30s
batch: 1300/1563 - train loss: 1.1768 - test loss: 4.2307 - train acc: 0.9248 - test acc: 0.7981 - 16m 33s
batch: 1400/1563 - train loss: 1.1622 - test loss: 4.0606 - train acc: 0.9267 - test acc: 0.8063 - 16m 38s
batch: 1500/1563 - train loss: 1.1692 - test loss: 4.2456 - train acc: 0.9283 - test acc: 0.8000 - 16m 41s
batch: 1563/1563 - train loss: 1.2093 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0250 - test loss: 13.0059 - train acc: 0.0943 - test acc: 0.1069 - 0m 1s
batch: 200/1563 - train loss: 12.9120 - test loss: 12.3969 - train acc: 0.1187 - test acc: 0.1745 - 0m 4s
batch: 300/1563 - train loss: 12.3891 - test loss: 11.7020 - train acc: 0.1628 - test acc: 0.2033 - 0m 8s
batch: 400/1563 - train loss: 11.7100 - test loss: 11.2783 - train acc: 0.2194 - test acc: 0.2686 - 0m 11s
batch: 500/1563 - train loss: 11.0866 - test loss: 10.8056 - train acc: 0.2753 - test acc: 0.2935 - 0m 14s
batch: 600/1563 - train loss: 10.7586 - test loss: 10.4026 - train acc: 0.2977 - test acc: 0.3164 - 0m 18s
batch: 700/1563 - train loss: 10.4609 - test loss: 9.9861 - train acc: 0.3009 - test acc: 0.3334 - 0m 21s
batch: 800/1563 - train loss: 10.1937 - test loss: 9.8860 - train acc: 0.3331 - test acc: 0.3512 - 0m 24s
batch: 900/1563 - train loss: 10.0203 - test loss: 10.1723 - train acc: 0.3303 - test acc: 0.3409 - 0m 28s


batch: 1100/1563 - train loss: 4.5809 - test loss: 4.4736 - train acc: 0.7200 - test acc: 0.7245 - 4m 3s
batch: 1200/1563 - train loss: 4.5671 - test loss: 4.3076 - train acc: 0.7253 - test acc: 0.7343 - 4m 7s
batch: 1300/1563 - train loss: 4.3355 - test loss: 4.2328 - train acc: 0.7366 - test acc: 0.7386 - 4m 10s
batch: 1400/1563 - train loss: 4.3201 - test loss: 4.3516 - train acc: 0.7363 - test acc: 0.7282 - 4m 13s
batch: 1500/1563 - train loss: 4.3304 - test loss: 4.2617 - train acc: 0.7278 - test acc: 0.7391 - 4m 17s
batch: 1563/1563 - train loss: 4.2341 - test loss: 4.1539 - train acc: 0.7347 - test acc: 0.7483 - 4m 20s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.0379 - test loss: 4.2074 - train acc: 0.7428 - test acc: 0.7421 - 4m 23s
batch: 200/1563 - train loss: 3.7594 - test loss: 4.0744 - train acc: 0.7708 - test acc: 0.7504 - 4m 26s
batch: 300/1563 - train loss: 3.8719 - test loss: 4.

batch: 400/1563 - train loss: 2.8077 - test loss: 3.5544 - train acc: 0.8268 - test acc: 0.7901 - 8m 3s
batch: 500/1563 - train loss: 2.7345 - test loss: 3.5315 - train acc: 0.8315 - test acc: 0.7930 - 8m 6s
batch: 600/1563 - train loss: 2.7375 - test loss: 3.6716 - train acc: 0.8293 - test acc: 0.7851 - 8m 10s
batch: 700/1563 - train loss: 2.5637 - test loss: 3.6875 - train acc: 0.8365 - test acc: 0.7811 - 8m 13s
batch: 800/1563 - train loss: 2.6790 - test loss: 3.4314 - train acc: 0.8293 - test acc: 0.7938 - 8m 17s
batch: 900/1563 - train loss: 2.6000 - test loss: 3.4395 - train acc: 0.8421 - test acc: 0.7940 - 8m 20s
batch: 1000/1563 - train loss: 2.6562 - test loss: 3.5276 - train acc: 0.8387 - test acc: 0.7894 - 8m 23s
batch: 1100/1563 - train loss: 2.7254 - test loss: 3.4726 - train acc: 0.8256 - test acc: 0.7932 - 8m 26s
batch: 1200/1563 - train loss: 2.6329 - test loss: 3.5646 - train acc: 0.8440 - test acc: 0.7823 - 8m 30s
batch: 1300/1563 - train loss: 2.5287 - test loss: 3.5

batch: 1400/1563 - train loss: 1.8457 - test loss: 3.5236 - train acc: 0.8866 - test acc: 0.7989 - 12m 8s
batch: 1500/1563 - train loss: 1.7568 - test loss: 3.3998 - train acc: 0.8891 - test acc: 0.8069 - 12m 11s
batch: 1563/1563 - train loss: 1.7796 - test loss: 3.4235 - train acc: 0.8893 - test acc: 0.8071 - 12m 14s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.4310 - test loss: 3.6520 - train acc: 0.9139 - test acc: 0.8104 - 12m 17s
batch: 200/1563 - train loss: 1.4065 - test loss: 3.5467 - train acc: 0.9070 - test acc: 0.8093 - 12m 21s
batch: 300/1563 - train loss: 1.5843 - test loss: 3.6540 - train acc: 0.8979 - test acc: 0.8019 - 12m 25s
batch: 400/1563 - train loss: 1.4313 - test loss: 3.6485 - train acc: 0.9057 - test acc: 0.8071 - 12m 28s
batch: 500/1563 - train loss: 1.6148 - test loss: 3.5901 - train acc: 0.8976 - test acc: 0.8050 - 12m 31s
batch: 600/1563 - train loss: 1.5396 - test l

batch: 700/1563 - train loss: 1.0493 - test loss: 4.0422 - train acc: 0.9295 - test acc: 0.8122 - 16m 10s
batch: 800/1563 - train loss: 1.0361 - test loss: 4.1530 - train acc: 0.9358 - test acc: 0.8027 - 16m 14s
batch: 900/1563 - train loss: 1.0684 - test loss: 4.2783 - train acc: 0.9308 - test acc: 0.8036 - 16m 17s
batch: 1000/1563 - train loss: 1.1976 - test loss: 4.1011 - train acc: 0.9239 - test acc: 0.8044 - 16m 20s
batch: 1100/1563 - train loss: 1.1712 - test loss: 3.9481 - train acc: 0.9280 - test acc: 0.8079 - 16m 24s
batch: 1200/1563 - train loss: 1.1134 - test loss: 4.0704 - train acc: 0.9304 - test acc: 0.8004 - 16m 27s
batch: 1300/1563 - train loss: 1.2072 - test loss: 3.8943 - train acc: 0.9220 - test acc: 0.8093 - 16m 30s
batch: 1400/1563 - train loss: 1.2106 - test loss: 3.9035 - train acc: 0.9223 - test acc: 0.8063 - 16m 34s
batch: 1500/1563 - train loss: 1.1054 - test loss: 4.0969 - train acc: 0.9345 - test acc: 0.8008 - 16m 37s
batch: 1563/1563 - train loss: 1.1784 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0312 - test loss: 13.0155 - train acc: 0.1062 - test acc: 0.1023 - 0m 1s
batch: 200/1563 - train loss: 13.0015 - test loss: 12.9650 - train acc: 0.1078 - test acc: 0.1001 - 0m 4s
batch: 300/1563 - train loss: 12.7049 - test loss: 12.1087 - train acc: 0.1469 - test acc: 0.1969 - 0m 7s
batch: 400/1563 - train loss: 11.8601 - test loss: 11.3851 - train acc: 0.2163 - test acc: 0.2543 - 0m 11s
batch: 500/1563 - train loss: 11.4556 - test loss: 10.8774 - train acc: 0.2519 - test acc: 0.2664 - 0m 14s
batch: 600/1563 - train loss: 10.9506 - test loss: 10.2804 - train acc: 0.2899 - test acc: 0.3081 - 0m 17s
batch: 700/1563 - train loss: 10.4058 - test loss: 10.0635 - train acc: 0.3184 - test acc: 0.3317 - 0m 21s
batch: 800/1563 - train loss: 10.2271 - test loss: 9.8644 - train acc: 0.3331 - test acc: 0.3487 - 0m 24s
batch: 900/1563 - train loss: 9.8734 - test loss: 9.3962 - train acc: 0.3625 - test acc: 0.3798 - 0m 27s
b

batch: 1100/1563 - train loss: 4.3092 - test loss: 4.2491 - train acc: 0.7403 - test acc: 0.7378 - 4m 2s
batch: 1200/1563 - train loss: 4.4922 - test loss: 4.4469 - train acc: 0.7187 - test acc: 0.7271 - 4m 5s
batch: 1300/1563 - train loss: 4.3753 - test loss: 4.2728 - train acc: 0.7328 - test acc: 0.7358 - 4m 8s
batch: 1400/1563 - train loss: 4.3581 - test loss: 4.4113 - train acc: 0.7381 - test acc: 0.7294 - 4m 12s
batch: 1500/1563 - train loss: 4.2084 - test loss: 4.2942 - train acc: 0.7300 - test acc: 0.7373 - 4m 15s
batch: 1563/1563 - train loss: 4.2065 - test loss: 4.1560 - train acc: 0.7315 - test acc: 0.7434 - 4m 18s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 3.9199 - test loss: 4.1400 - train acc: 0.7600 - test acc: 0.7486 - 4m 21s
batch: 200/1563 - train loss: 4.0631 - test loss: 4.3791 - train acc: 0.7419 - test acc: 0.7268 - 4m 24s
batch: 300/1563 - train loss: 3.9044 - test loss: 4.1

batch: 400/1563 - train loss: 2.6971 - test loss: 3.6507 - train acc: 0.8343 - test acc: 0.7877 - 8m 3s
batch: 500/1563 - train loss: 2.6563 - test loss: 3.5729 - train acc: 0.8284 - test acc: 0.7907 - 8m 6s
batch: 600/1563 - train loss: 2.6031 - test loss: 3.5123 - train acc: 0.8428 - test acc: 0.7905 - 8m 9s
batch: 700/1563 - train loss: 2.6337 - test loss: 3.5711 - train acc: 0.8347 - test acc: 0.7840 - 8m 13s
batch: 800/1563 - train loss: 2.5829 - test loss: 3.5195 - train acc: 0.8352 - test acc: 0.7887 - 8m 16s
batch: 900/1563 - train loss: 2.7279 - test loss: 3.5494 - train acc: 0.8315 - test acc: 0.7922 - 8m 19s
batch: 1000/1563 - train loss: 2.5987 - test loss: 3.4967 - train acc: 0.8350 - test acc: 0.7928 - 8m 23s
batch: 1100/1563 - train loss: 2.5830 - test loss: 3.5964 - train acc: 0.8383 - test acc: 0.7859 - 8m 26s
batch: 1200/1563 - train loss: 2.7970 - test loss: 3.4997 - train acc: 0.8259 - test acc: 0.7904 - 8m 30s
batch: 1300/1563 - train loss: 2.7578 - test loss: 3.35

batch: 1400/1563 - train loss: 1.8602 - test loss: 3.6855 - train acc: 0.8791 - test acc: 0.7976 - 12m 10s
batch: 1500/1563 - train loss: 1.7808 - test loss: 3.6926 - train acc: 0.8878 - test acc: 0.7990 - 12m 13s
batch: 1563/1563 - train loss: 1.7617 - test loss: 3.6048 - train acc: 0.8863 - test acc: 0.8024 - 12m 16s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.4679 - test loss: 3.5807 - train acc: 0.9141 - test acc: 0.8055 - 12m 19s
batch: 200/1563 - train loss: 1.4600 - test loss: 3.6852 - train acc: 0.9069 - test acc: 0.8087 - 12m 23s
batch: 300/1563 - train loss: 1.5570 - test loss: 3.6284 - train acc: 0.9051 - test acc: 0.8037 - 12m 27s
batch: 400/1563 - train loss: 1.4627 - test loss: 3.9090 - train acc: 0.9038 - test acc: 0.7949 - 12m 30s
batch: 500/1563 - train loss: 1.5296 - test loss: 3.7995 - train acc: 0.9039 - test acc: 0.8016 - 12m 33s
batch: 600/1563 - train loss: 1.5468 - test 

batch: 700/1563 - train loss: 1.1001 - test loss: 4.2235 - train acc: 0.9320 - test acc: 0.8083 - 16m 16s
batch: 800/1563 - train loss: 1.1791 - test loss: 4.2455 - train acc: 0.9301 - test acc: 0.8040 - 16m 19s
batch: 900/1563 - train loss: 1.1026 - test loss: 4.0517 - train acc: 0.9321 - test acc: 0.8037 - 16m 23s
batch: 1000/1563 - train loss: 1.0859 - test loss: 4.1097 - train acc: 0.9263 - test acc: 0.8096 - 16m 26s
batch: 1100/1563 - train loss: 1.0310 - test loss: 4.0906 - train acc: 0.9329 - test acc: 0.8094 - 16m 30s
batch: 1200/1563 - train loss: 1.1314 - test loss: 4.0912 - train acc: 0.9283 - test acc: 0.8023 - 16m 33s
batch: 1300/1563 - train loss: 1.1015 - test loss: 4.2698 - train acc: 0.9336 - test acc: 0.8045 - 16m 36s
batch: 1400/1563 - train loss: 1.1593 - test loss: 4.3610 - train acc: 0.9261 - test acc: 0.8044 - 16m 40s
batch: 1500/1563 - train loss: 1.1897 - test loss: 4.0714 - train acc: 0.9224 - test acc: 0.8033 - 16m 43s
batch: 1563/1563 - train loss: 1.2454 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0292 - test loss: 13.0175 - train acc: 0.0952 - test acc: 0.1076 - 0m 1s
batch: 200/1563 - train loss: 13.0068 - test loss: 12.9752 - train acc: 0.1197 - test acc: 0.1141 - 0m 4s
batch: 300/1563 - train loss: 12.5911 - test loss: 11.8608 - train acc: 0.1368 - test acc: 0.2210 - 0m 8s
batch: 400/1563 - train loss: 11.9689 - test loss: 11.3885 - train acc: 0.2134 - test acc: 0.2499 - 0m 11s
batch: 500/1563 - train loss: 11.3863 - test loss: 11.0395 - train acc: 0.2547 - test acc: 0.2713 - 0m 14s
batch: 600/1563 - train loss: 10.6790 - test loss: 10.2318 - train acc: 0.2924 - test acc: 0.3268 - 0m 18s
batch: 700/1563 - train loss: 10.3425 - test loss: 10.3063 - train acc: 0.3000 - test acc: 0.3273 - 0m 21s
batch: 800/1563 - train loss: 10.1233 - test loss: 9.4753 - train acc: 0.3312 - test acc: 0.3645 - 0m 24s
batch: 900/1563 - train loss: 9.8060 - test loss: 9.4654 - train acc: 0.3493 - test acc: 0.3643 - 0m 27s
b

batch: 1100/1563 - train loss: 4.3838 - test loss: 4.4045 - train acc: 0.7291 - test acc: 0.7263 - 4m 2s
batch: 1200/1563 - train loss: 4.4012 - test loss: 4.3961 - train acc: 0.7178 - test acc: 0.7233 - 4m 5s
batch: 1300/1563 - train loss: 4.3784 - test loss: 4.4105 - train acc: 0.7244 - test acc: 0.7226 - 4m 8s
batch: 1400/1563 - train loss: 4.5785 - test loss: 4.2505 - train acc: 0.7222 - test acc: 0.7411 - 4m 11s
batch: 1500/1563 - train loss: 4.3824 - test loss: 4.3427 - train acc: 0.7334 - test acc: 0.7319 - 4m 14s
batch: 1563/1563 - train loss: 4.4351 - test loss: 4.1809 - train acc: 0.7188 - test acc: 0.7434 - 4m 17s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.1136 - test loss: 4.1692 - train acc: 0.7581 - test acc: 0.7442 - 4m 20s
batch: 200/1563 - train loss: 4.0185 - test loss: 4.4195 - train acc: 0.7459 - test acc: 0.7281 - 4m 24s
batch: 300/1563 - train loss: 3.9022 - test loss: 4.1

batch: 400/1563 - train loss: 2.5418 - test loss: 3.5037 - train acc: 0.8390 - test acc: 0.7904 - 8m 0s
batch: 500/1563 - train loss: 2.6820 - test loss: 3.4890 - train acc: 0.8309 - test acc: 0.7888 - 8m 3s
batch: 600/1563 - train loss: 2.6819 - test loss: 3.6606 - train acc: 0.8309 - test acc: 0.7826 - 8m 6s
batch: 700/1563 - train loss: 2.7641 - test loss: 3.4527 - train acc: 0.8306 - test acc: 0.7937 - 8m 10s
batch: 800/1563 - train loss: 2.5738 - test loss: 3.6748 - train acc: 0.8359 - test acc: 0.7833 - 8m 13s
batch: 900/1563 - train loss: 2.6318 - test loss: 3.5400 - train acc: 0.8319 - test acc: 0.7885 - 8m 17s
batch: 1000/1563 - train loss: 2.7601 - test loss: 3.5220 - train acc: 0.8262 - test acc: 0.7895 - 8m 20s
batch: 1100/1563 - train loss: 2.6834 - test loss: 3.5270 - train acc: 0.8315 - test acc: 0.7905 - 8m 24s
batch: 1200/1563 - train loss: 2.7971 - test loss: 3.5672 - train acc: 0.8269 - test acc: 0.7846 - 8m 27s
batch: 1300/1563 - train loss: 2.7389 - test loss: 3.40

batch: 1400/1563 - train loss: 1.8970 - test loss: 3.5294 - train acc: 0.8828 - test acc: 0.8041 - 12m 6s
batch: 1500/1563 - train loss: 1.8550 - test loss: 3.6097 - train acc: 0.8809 - test acc: 0.7991 - 12m 10s
batch: 1563/1563 - train loss: 1.8499 - test loss: 3.6709 - train acc: 0.8822 - test acc: 0.7998 - 12m 13s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.3138 - test loss: 3.6332 - train acc: 0.9173 - test acc: 0.8049 - 12m 16s
batch: 200/1563 - train loss: 1.4775 - test loss: 3.7616 - train acc: 0.9057 - test acc: 0.8037 - 12m 19s
batch: 300/1563 - train loss: 1.4614 - test loss: 3.7472 - train acc: 0.9108 - test acc: 0.8018 - 12m 23s
batch: 400/1563 - train loss: 1.4616 - test loss: 3.6084 - train acc: 0.9073 - test acc: 0.8021 - 12m 26s
batch: 500/1563 - train loss: 1.5871 - test loss: 3.7079 - train acc: 0.8935 - test acc: 0.8080 - 12m 30s
batch: 600/1563 - train loss: 1.5799 - test l

batch: 700/1563 - train loss: 1.1126 - test loss: 4.1006 - train acc: 0.9305 - test acc: 0.7992 - 16m 11s
batch: 800/1563 - train loss: 1.0118 - test loss: 4.1650 - train acc: 0.9373 - test acc: 0.8012 - 16m 15s
batch: 900/1563 - train loss: 1.1329 - test loss: 4.1017 - train acc: 0.9280 - test acc: 0.8072 - 16m 18s
batch: 1000/1563 - train loss: 1.2378 - test loss: 4.0427 - train acc: 0.9185 - test acc: 0.8084 - 16m 21s
batch: 1100/1563 - train loss: 1.0697 - test loss: 4.3373 - train acc: 0.9308 - test acc: 0.7987 - 16m 25s
batch: 1200/1563 - train loss: 1.3182 - test loss: 4.0808 - train acc: 0.9173 - test acc: 0.8001 - 16m 28s
batch: 1300/1563 - train loss: 1.1976 - test loss: 4.2771 - train acc: 0.9254 - test acc: 0.7958 - 16m 32s
batch: 1400/1563 - train loss: 1.1972 - test loss: 4.4368 - train acc: 0.9273 - test acc: 0.7886 - 16m 35s
batch: 1500/1563 - train loss: 1.2209 - test loss: 3.9437 - train acc: 0.9223 - test acc: 0.8060 - 16m 38s
batch: 1563/1563 - train loss: 1.1259 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0252 - test loss: 13.0045 - train acc: 0.0999 - test acc: 0.1000 - 0m 1s
batch: 200/1563 - train loss: 12.9396 - test loss: 12.7789 - train acc: 0.1108 - test acc: 0.1046 - 0m 4s
batch: 300/1563 - train loss: 12.2958 - test loss: 11.6826 - train acc: 0.1544 - test acc: 0.2272 - 0m 8s
batch: 400/1563 - train loss: 11.7745 - test loss: 11.3332 - train acc: 0.2291 - test acc: 0.2517 - 0m 11s
batch: 500/1563 - train loss: 11.3221 - test loss: 10.7981 - train acc: 0.2629 - test acc: 0.2963 - 0m 14s
batch: 600/1563 - train loss: 10.7550 - test loss: 10.4141 - train acc: 0.2984 - test acc: 0.3308 - 0m 18s
batch: 700/1563 - train loss: 10.3102 - test loss: 10.0156 - train acc: 0.3237 - test acc: 0.3386 - 0m 21s
batch: 800/1563 - train loss: 10.0684 - test loss: 9.8251 - train acc: 0.3297 - test acc: 0.3460 - 0m 24s
batch: 900/1563 - train loss: 9.9048 - test loss: 9.6136 - train acc: 0.3456 - test acc: 0.3661 - 0m 27s
b

batch: 1100/1563 - train loss: 4.1576 - test loss: 4.3778 - train acc: 0.7369 - test acc: 0.7289 - 4m 4s
batch: 1200/1563 - train loss: 4.3623 - test loss: 4.4425 - train acc: 0.7296 - test acc: 0.7281 - 4m 8s
batch: 1300/1563 - train loss: 4.4991 - test loss: 4.3808 - train acc: 0.7141 - test acc: 0.7322 - 4m 11s
batch: 1400/1563 - train loss: 4.5726 - test loss: 4.3060 - train acc: 0.7181 - test acc: 0.7371 - 4m 15s
batch: 1500/1563 - train loss: 4.4577 - test loss: 4.4148 - train acc: 0.7225 - test acc: 0.7289 - 4m 18s
batch: 1563/1563 - train loss: 4.3306 - test loss: 4.2754 - train acc: 0.7303 - test acc: 0.7400 - 4m 21s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.1121 - test loss: 4.2616 - train acc: 0.7471 - test acc: 0.7382 - 4m 24s
batch: 200/1563 - train loss: 3.9511 - test loss: 4.3269 - train acc: 0.7709 - test acc: 0.7348 - 4m 28s
batch: 300/1563 - train loss: 4.0765 - test loss: 4.

batch: 400/1563 - train loss: 2.6852 - test loss: 3.7204 - train acc: 0.8331 - test acc: 0.7825 - 8m 5s
batch: 500/1563 - train loss: 2.6586 - test loss: 3.5485 - train acc: 0.8371 - test acc: 0.7869 - 8m 8s
batch: 600/1563 - train loss: 2.7545 - test loss: 3.5643 - train acc: 0.8316 - test acc: 0.7863 - 8m 12s
batch: 700/1563 - train loss: 2.6906 - test loss: 3.6449 - train acc: 0.8328 - test acc: 0.7811 - 8m 15s
batch: 800/1563 - train loss: 2.6898 - test loss: 3.7506 - train acc: 0.8375 - test acc: 0.7721 - 8m 18s
batch: 900/1563 - train loss: 2.6290 - test loss: 3.5822 - train acc: 0.8421 - test acc: 0.7881 - 8m 22s
batch: 1000/1563 - train loss: 2.7492 - test loss: 3.5037 - train acc: 0.8350 - test acc: 0.7909 - 8m 25s
batch: 1100/1563 - train loss: 2.6993 - test loss: 3.5869 - train acc: 0.8300 - test acc: 0.7881 - 8m 28s
batch: 1200/1563 - train loss: 2.6956 - test loss: 3.7587 - train acc: 0.8271 - test acc: 0.7776 - 8m 32s
batch: 1300/1563 - train loss: 2.8045 - test loss: 3.5

batch: 1400/1563 - train loss: 2.0712 - test loss: 3.5675 - train acc: 0.8707 - test acc: 0.8018 - 12m 10s
batch: 1500/1563 - train loss: 1.7943 - test loss: 3.6884 - train acc: 0.8900 - test acc: 0.7964 - 12m 13s
batch: 1563/1563 - train loss: 1.8998 - test loss: 3.6314 - train acc: 0.8838 - test acc: 0.7999 - 12m 16s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.3718 - test loss: 3.9482 - train acc: 0.9195 - test acc: 0.7965 - 12m 19s
batch: 200/1563 - train loss: 1.3964 - test loss: 3.8808 - train acc: 0.9138 - test acc: 0.7999 - 12m 23s
batch: 300/1563 - train loss: 1.5096 - test loss: 3.7869 - train acc: 0.9104 - test acc: 0.7959 - 12m 27s
batch: 400/1563 - train loss: 1.4726 - test loss: 3.8711 - train acc: 0.9057 - test acc: 0.7959 - 12m 30s
batch: 500/1563 - train loss: 1.5656 - test loss: 3.8680 - train acc: 0.9004 - test acc: 0.7960 - 12m 33s
batch: 600/1563 - train loss: 1.5013 - test 

batch: 700/1563 - train loss: 1.0323 - test loss: 4.4193 - train acc: 0.9345 - test acc: 0.8037 - 16m 18s
batch: 800/1563 - train loss: 1.1472 - test loss: 4.2744 - train acc: 0.9267 - test acc: 0.7993 - 16m 21s
batch: 900/1563 - train loss: 1.0992 - test loss: 4.4397 - train acc: 0.9333 - test acc: 0.7916 - 16m 25s
batch: 1000/1563 - train loss: 1.1722 - test loss: 4.2575 - train acc: 0.9286 - test acc: 0.8026 - 16m 28s
batch: 1100/1563 - train loss: 1.1354 - test loss: 4.3258 - train acc: 0.9264 - test acc: 0.8024 - 16m 32s
batch: 1200/1563 - train loss: 1.0499 - test loss: 4.4234 - train acc: 0.9358 - test acc: 0.7980 - 16m 35s
batch: 1300/1563 - train loss: 1.2193 - test loss: 4.0881 - train acc: 0.9245 - test acc: 0.8037 - 16m 38s
batch: 1400/1563 - train loss: 1.1854 - test loss: 4.3770 - train acc: 0.9242 - test acc: 0.7959 - 16m 42s
batch: 1500/1563 - train loss: 1.2841 - test loss: 4.0456 - train acc: 0.9195 - test acc: 0.7998 - 16m 46s
batch: 1563/1563 - train loss: 1.1018 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0223 - test loss: 13.0131 - train acc: 0.1081 - test acc: 0.1003 - 0m 1s
batch: 200/1563 - train loss: 12.9850 - test loss: 12.8706 - train acc: 0.1147 - test acc: 0.1018 - 0m 5s
batch: 300/1563 - train loss: 12.5514 - test loss: 11.8422 - train acc: 0.1525 - test acc: 0.2203 - 0m 8s
batch: 400/1563 - train loss: 11.8230 - test loss: 11.4479 - train acc: 0.2178 - test acc: 0.2438 - 0m 12s
batch: 500/1563 - train loss: 11.3238 - test loss: 11.2941 - train acc: 0.2531 - test acc: 0.2523 - 0m 15s
batch: 600/1563 - train loss: 10.9117 - test loss: 10.8238 - train acc: 0.2744 - test acc: 0.2805 - 0m 18s
batch: 700/1563 - train loss: 10.5458 - test loss: 10.0880 - train acc: 0.2993 - test acc: 0.3354 - 0m 22s
batch: 800/1563 - train loss: 10.3127 - test loss: 9.7207 - train acc: 0.3347 - test acc: 0.3484 - 0m 25s
batch: 900/1563 - train loss: 10.1241 - test loss: 9.6155 - train acc: 0.3206 - test acc: 0.3646 - 0m 28s


batch: 1100/1563 - train loss: 4.4245 - test loss: 4.7270 - train acc: 0.7291 - test acc: 0.7025 - 4m 4s
batch: 1200/1563 - train loss: 4.4677 - test loss: 4.5080 - train acc: 0.7181 - test acc: 0.7255 - 4m 8s
batch: 1300/1563 - train loss: 4.3365 - test loss: 4.3266 - train acc: 0.7447 - test acc: 0.7307 - 4m 11s
batch: 1400/1563 - train loss: 4.2454 - test loss: 4.6347 - train acc: 0.7388 - test acc: 0.7097 - 4m 14s
batch: 1500/1563 - train loss: 4.3958 - test loss: 4.2824 - train acc: 0.7309 - test acc: 0.7383 - 4m 17s
batch: 1563/1563 - train loss: 4.4149 - test loss: 4.3282 - train acc: 0.7281 - test acc: 0.7316 - 4m 20s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 3.9830 - test loss: 4.3194 - train acc: 0.7457 - test acc: 0.7334 - 4m 23s
batch: 200/1563 - train loss: 4.1253 - test loss: 4.3967 - train acc: 0.7444 - test acc: 0.7281 - 4m 27s
batch: 300/1563 - train loss: 3.8463 - test loss: 4.

batch: 400/1563 - train loss: 2.5887 - test loss: 3.6395 - train acc: 0.8349 - test acc: 0.7860 - 8m 3s
batch: 500/1563 - train loss: 2.5066 - test loss: 3.5875 - train acc: 0.8434 - test acc: 0.7849 - 8m 7s
batch: 600/1563 - train loss: 2.5980 - test loss: 3.5190 - train acc: 0.8343 - test acc: 0.7889 - 8m 10s
batch: 700/1563 - train loss: 2.6478 - test loss: 3.9773 - train acc: 0.8387 - test acc: 0.7651 - 8m 14s
batch: 800/1563 - train loss: 2.6981 - test loss: 3.6070 - train acc: 0.8277 - test acc: 0.7852 - 8m 17s
batch: 900/1563 - train loss: 2.8561 - test loss: 3.4912 - train acc: 0.8187 - test acc: 0.7867 - 8m 20s
batch: 1000/1563 - train loss: 2.8241 - test loss: 3.6796 - train acc: 0.8240 - test acc: 0.7829 - 8m 24s
batch: 1100/1563 - train loss: 2.7141 - test loss: 3.6044 - train acc: 0.8280 - test acc: 0.7857 - 8m 27s
batch: 1200/1563 - train loss: 2.7061 - test loss: 3.5889 - train acc: 0.8281 - test acc: 0.7885 - 8m 30s
batch: 1300/1563 - train loss: 2.7094 - test loss: 3.6

batch: 1400/1563 - train loss: 1.9795 - test loss: 3.6370 - train acc: 0.8785 - test acc: 0.7941 - 12m 9s
batch: 1500/1563 - train loss: 2.0868 - test loss: 3.6390 - train acc: 0.8725 - test acc: 0.7915 - 12m 12s
batch: 1563/1563 - train loss: 1.9708 - test loss: 3.5570 - train acc: 0.8750 - test acc: 0.8059 - 12m 15s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.4270 - test loss: 3.7960 - train acc: 0.9138 - test acc: 0.8011 - 12m 18s
batch: 200/1563 - train loss: 1.5821 - test loss: 3.7228 - train acc: 0.8966 - test acc: 0.8003 - 12m 22s
batch: 300/1563 - train loss: 1.5167 - test loss: 4.0323 - train acc: 0.9051 - test acc: 0.7919 - 12m 25s
batch: 400/1563 - train loss: 1.7046 - test loss: 3.7308 - train acc: 0.8973 - test acc: 0.7930 - 12m 29s
batch: 500/1563 - train loss: 1.5172 - test loss: 3.8099 - train acc: 0.9020 - test acc: 0.7996 - 12m 32s
batch: 600/1563 - train loss: 1.6262 - test l

batch: 700/1563 - train loss: 1.1114 - test loss: 4.1620 - train acc: 0.9308 - test acc: 0.8021 - 16m 15s
batch: 800/1563 - train loss: 1.1293 - test loss: 4.1071 - train acc: 0.9310 - test acc: 0.8026 - 16m 18s
batch: 900/1563 - train loss: 1.0809 - test loss: 4.2714 - train acc: 0.9352 - test acc: 0.7987 - 16m 22s
batch: 1000/1563 - train loss: 1.2859 - test loss: 3.9790 - train acc: 0.9202 - test acc: 0.7964 - 16m 25s
batch: 1100/1563 - train loss: 1.2245 - test loss: 4.1976 - train acc: 0.9223 - test acc: 0.7954 - 16m 28s
batch: 1200/1563 - train loss: 1.2406 - test loss: 4.3037 - train acc: 0.9180 - test acc: 0.7985 - 16m 32s
batch: 1300/1563 - train loss: 1.1603 - test loss: 4.1640 - train acc: 0.9273 - test acc: 0.8035 - 16m 36s
batch: 1400/1563 - train loss: 1.0882 - test loss: 4.1107 - train acc: 0.9336 - test acc: 0.8031 - 16m 39s
batch: 1500/1563 - train loss: 1.2801 - test loss: 4.1390 - train acc: 0.9185 - test acc: 0.8021 - 16m 43s
batch: 1563/1563 - train loss: 1.2960 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0247 - test loss: 13.0199 - train acc: 0.0984 - test acc: 0.1237 - 0m 1s
batch: 200/1563 - train loss: 13.0167 - test loss: 12.9777 - train acc: 0.1225 - test acc: 0.1550 - 0m 4s
batch: 300/1563 - train loss: 12.7779 - test loss: 12.1727 - train acc: 0.1347 - test acc: 0.1820 - 0m 8s
batch: 400/1563 - train loss: 12.0972 - test loss: 11.6025 - train acc: 0.1897 - test acc: 0.2336 - 0m 11s
batch: 500/1563 - train loss: 11.5669 - test loss: 11.1982 - train acc: 0.2444 - test acc: 0.2788 - 0m 15s
batch: 600/1563 - train loss: 11.0065 - test loss: 10.9112 - train acc: 0.2800 - test acc: 0.2869 - 0m 18s
batch: 700/1563 - train loss: 10.5438 - test loss: 10.2312 - train acc: 0.3168 - test acc: 0.3279 - 0m 21s
batch: 800/1563 - train loss: 10.1691 - test loss: 10.0397 - train acc: 0.3235 - test acc: 0.3317 - 0m 25s
batch: 900/1563 - train loss: 9.9283 - test loss: 9.5774 - train acc: 0.3509 - test acc: 0.3690 - 0m 28s


batch: 1100/1563 - train loss: 4.3355 - test loss: 4.3745 - train acc: 0.7290 - test acc: 0.7294 - 4m 9s
batch: 1200/1563 - train loss: 4.2933 - test loss: 4.5120 - train acc: 0.7313 - test acc: 0.7175 - 4m 12s
batch: 1300/1563 - train loss: 4.3198 - test loss: 4.3896 - train acc: 0.7244 - test acc: 0.7284 - 4m 16s
batch: 1400/1563 - train loss: 4.2892 - test loss: 4.2157 - train acc: 0.7334 - test acc: 0.7411 - 4m 19s
batch: 1500/1563 - train loss: 4.3679 - test loss: 4.1816 - train acc: 0.7331 - test acc: 0.7451 - 4m 22s
batch: 1563/1563 - train loss: 4.3411 - test loss: 4.1770 - train acc: 0.7356 - test acc: 0.7418 - 4m 25s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 3.8355 - test loss: 4.2491 - train acc: 0.7603 - test acc: 0.7386 - 4m 29s
batch: 200/1563 - train loss: 3.7302 - test loss: 4.0644 - train acc: 0.7656 - test acc: 0.7501 - 4m 32s
batch: 300/1563 - train loss: 3.9743 - test loss: 4

batch: 400/1563 - train loss: 2.6251 - test loss: 3.5562 - train acc: 0.8334 - test acc: 0.7923 - 8m 13s
batch: 500/1563 - train loss: 2.6219 - test loss: 3.5197 - train acc: 0.8306 - test acc: 0.7915 - 8m 17s
batch: 600/1563 - train loss: 2.6420 - test loss: 3.5562 - train acc: 0.8453 - test acc: 0.7914 - 8m 20s
batch: 700/1563 - train loss: 2.4834 - test loss: 3.6109 - train acc: 0.8434 - test acc: 0.7897 - 8m 23s
batch: 800/1563 - train loss: 2.6168 - test loss: 3.4905 - train acc: 0.8443 - test acc: 0.7931 - 8m 26s
batch: 900/1563 - train loss: 2.5892 - test loss: 3.4770 - train acc: 0.8278 - test acc: 0.7915 - 8m 30s
batch: 1000/1563 - train loss: 2.6695 - test loss: 3.5566 - train acc: 0.8309 - test acc: 0.7854 - 8m 33s
batch: 1100/1563 - train loss: 2.5678 - test loss: 3.5918 - train acc: 0.8440 - test acc: 0.7853 - 8m 36s
batch: 1200/1563 - train loss: 2.5931 - test loss: 3.5119 - train acc: 0.8393 - test acc: 0.7947 - 8m 40s
batch: 1300/1563 - train loss: 2.6732 - test loss: 3

batch: 1400/1563 - train loss: 1.7550 - test loss: 3.6793 - train acc: 0.8892 - test acc: 0.8013 - 12m 23s
batch: 1500/1563 - train loss: 1.8451 - test loss: 3.6663 - train acc: 0.8804 - test acc: 0.8040 - 12m 27s
batch: 1563/1563 - train loss: 1.7942 - test loss: 3.6567 - train acc: 0.8850 - test acc: 0.8032 - 12m 29s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.3378 - test loss: 3.7345 - train acc: 0.9164 - test acc: 0.8059 - 12m 33s
batch: 200/1563 - train loss: 1.3383 - test loss: 3.8990 - train acc: 0.9167 - test acc: 0.8004 - 12m 36s
batch: 300/1563 - train loss: 1.4110 - test loss: 3.7818 - train acc: 0.9092 - test acc: 0.8075 - 12m 40s
batch: 400/1563 - train loss: 1.5368 - test loss: 3.8570 - train acc: 0.9041 - test acc: 0.8000 - 12m 43s
batch: 500/1563 - train loss: 1.4003 - test loss: 3.9080 - train acc: 0.9151 - test acc: 0.8047 - 12m 46s
batch: 600/1563 - train loss: 1.4805 - test 

batch: 700/1563 - train loss: 1.1475 - test loss: 4.1701 - train acc: 0.9301 - test acc: 0.8008 - 16m 33s
batch: 800/1563 - train loss: 1.0845 - test loss: 4.0433 - train acc: 0.9355 - test acc: 0.8058 - 16m 36s
batch: 900/1563 - train loss: 1.0142 - test loss: 4.0561 - train acc: 0.9348 - test acc: 0.8070 - 16m 40s
batch: 1000/1563 - train loss: 1.0258 - test loss: 4.3225 - train acc: 0.9386 - test acc: 0.8066 - 16m 44s
batch: 1100/1563 - train loss: 1.0271 - test loss: 4.2198 - train acc: 0.9357 - test acc: 0.8047 - 16m 48s
batch: 1200/1563 - train loss: 1.0092 - test loss: 4.2441 - train acc: 0.9336 - test acc: 0.8031 - 16m 51s
batch: 1300/1563 - train loss: 1.0959 - test loss: 4.2073 - train acc: 0.9295 - test acc: 0.8004 - 16m 55s
batch: 1400/1563 - train loss: 1.1235 - test loss: 4.2606 - train acc: 0.9301 - test acc: 0.8030 - 16m 58s
batch: 1500/1563 - train loss: 0.9964 - test loss: 4.1802 - train acc: 0.9342 - test acc: 0.8031 - 17m 2s
batch: 1563/1563 - train loss: 1.1091 - t

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0286 - test loss: 13.0179 - train acc: 0.0999 - test acc: 0.1000 - 0m 1s
batch: 200/1563 - train loss: 12.9942 - test loss: 12.9348 - train acc: 0.1053 - test acc: 0.1092 - 0m 4s
batch: 300/1563 - train loss: 12.6987 - test loss: 12.1371 - train acc: 0.1325 - test acc: 0.1748 - 0m 8s
batch: 400/1563 - train loss: 11.8940 - test loss: 11.3887 - train acc: 0.2069 - test acc: 0.2541 - 0m 11s
batch: 500/1563 - train loss: 11.5343 - test loss: 10.9670 - train acc: 0.2422 - test acc: 0.2766 - 0m 15s
batch: 600/1563 - train loss: 10.9504 - test loss: 10.4354 - train acc: 0.2972 - test acc: 0.3242 - 0m 18s
batch: 700/1563 - train loss: 10.6136 - test loss: 9.9510 - train acc: 0.3041 - test acc: 0.3506 - 0m 21s
batch: 800/1563 - train loss: 10.0554 - test loss: 10.7067 - train acc: 0.3424 - test acc: 0.3026 - 0m 25s
batch: 900/1563 - train loss: 10.1471 - test loss: 10.1041 - train acc: 0.3309 - test acc: 0.3295 - 0m 28s

batch: 1100/1563 - train loss: 4.5059 - test loss: 4.3776 - train acc: 0.7325 - test acc: 0.7278 - 4m 9s
batch: 1200/1563 - train loss: 4.4817 - test loss: 4.3811 - train acc: 0.7190 - test acc: 0.7278 - 4m 13s
batch: 1300/1563 - train loss: 4.4934 - test loss: 4.5728 - train acc: 0.7194 - test acc: 0.7113 - 4m 16s
batch: 1400/1563 - train loss: 4.4489 - test loss: 4.3642 - train acc: 0.7210 - test acc: 0.7288 - 4m 19s
batch: 1500/1563 - train loss: 4.3120 - test loss: 4.4174 - train acc: 0.7360 - test acc: 0.7271 - 4m 23s
batch: 1563/1563 - train loss: 4.4363 - test loss: 4.3199 - train acc: 0.7300 - test acc: 0.7343 - 4m 26s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.0231 - test loss: 4.5752 - train acc: 0.7500 - test acc: 0.7203 - 4m 29s
batch: 200/1563 - train loss: 3.9405 - test loss: 4.2179 - train acc: 0.7491 - test acc: 0.7418 - 4m 33s
batch: 300/1563 - train loss: 4.0800 - test loss: 4

batch: 400/1563 - train loss: 2.4671 - test loss: 3.5802 - train acc: 0.8449 - test acc: 0.7871 - 8m 16s
batch: 500/1563 - train loss: 2.6641 - test loss: 3.5187 - train acc: 0.8416 - test acc: 0.7917 - 8m 19s
batch: 600/1563 - train loss: 2.6274 - test loss: 3.4975 - train acc: 0.8356 - test acc: 0.7896 - 8m 23s
batch: 700/1563 - train loss: 2.6053 - test loss: 3.5385 - train acc: 0.8363 - test acc: 0.7848 - 8m 26s
batch: 800/1563 - train loss: 2.7275 - test loss: 3.4855 - train acc: 0.8334 - test acc: 0.7924 - 8m 29s
batch: 900/1563 - train loss: 2.7052 - test loss: 3.5913 - train acc: 0.8362 - test acc: 0.7847 - 8m 33s
batch: 1000/1563 - train loss: 2.8208 - test loss: 3.4698 - train acc: 0.8228 - test acc: 0.7917 - 8m 36s
batch: 1100/1563 - train loss: 2.8047 - test loss: 3.4967 - train acc: 0.8247 - test acc: 0.7907 - 8m 40s
batch: 1200/1563 - train loss: 2.8050 - test loss: 3.3845 - train acc: 0.8158 - test acc: 0.7978 - 8m 43s
batch: 1300/1563 - train loss: 2.6125 - test loss: 3

batch: 1400/1563 - train loss: 2.0475 - test loss: 3.4825 - train acc: 0.8741 - test acc: 0.7976 - 12m 24s
batch: 1500/1563 - train loss: 1.9576 - test loss: 3.5968 - train acc: 0.8794 - test acc: 0.7983 - 12m 28s
batch: 1563/1563 - train loss: 1.8409 - test loss: 3.5509 - train acc: 0.8875 - test acc: 0.8009 - 12m 31s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.4550 - test loss: 3.7034 - train acc: 0.9103 - test acc: 0.8014 - 12m 34s
batch: 200/1563 - train loss: 1.4501 - test loss: 3.7093 - train acc: 0.9132 - test acc: 0.8033 - 12m 37s
batch: 300/1563 - train loss: 1.4737 - test loss: 3.7618 - train acc: 0.9095 - test acc: 0.7964 - 12m 41s
batch: 400/1563 - train loss: 1.4609 - test loss: 3.8134 - train acc: 0.9098 - test acc: 0.7987 - 12m 44s
batch: 500/1563 - train loss: 1.6520 - test loss: 3.6698 - train acc: 0.8954 - test acc: 0.8018 - 12m 47s
batch: 600/1563 - train loss: 1.6607 - test 

batch: 700/1563 - train loss: 1.0337 - test loss: 4.2951 - train acc: 0.9364 - test acc: 0.7980 - 16m 28s
batch: 800/1563 - train loss: 1.1157 - test loss: 3.9751 - train acc: 0.9283 - test acc: 0.8032 - 16m 31s
batch: 900/1563 - train loss: 1.0892 - test loss: 4.4071 - train acc: 0.9321 - test acc: 0.7975 - 16m 35s
batch: 1000/1563 - train loss: 1.1920 - test loss: 4.1329 - train acc: 0.9239 - test acc: 0.8003 - 16m 38s
batch: 1100/1563 - train loss: 1.0593 - test loss: 4.4795 - train acc: 0.9320 - test acc: 0.7970 - 16m 42s
batch: 1200/1563 - train loss: 1.3331 - test loss: 3.9874 - train acc: 0.9186 - test acc: 0.8031 - 16m 45s
batch: 1300/1563 - train loss: 1.1287 - test loss: 4.1374 - train acc: 0.9251 - test acc: 0.8036 - 16m 49s
batch: 1400/1563 - train loss: 1.0744 - test loss: 4.3356 - train acc: 0.9336 - test acc: 0.8017 - 16m 52s
batch: 1500/1563 - train loss: 1.1044 - test loss: 4.2038 - train acc: 0.9317 - test acc: 0.7994 - 16m 55s
batch: 1563/1563 - train loss: 1.1654 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0209 - test loss: 13.0091 - train acc: 0.0946 - test acc: 0.1375 - 0m 1s
batch: 200/1563 - train loss: 12.9661 - test loss: 12.8585 - train acc: 0.1291 - test acc: 0.1195 - 0m 4s
batch: 300/1563 - train loss: 12.5946 - test loss: 11.9802 - train acc: 0.1516 - test acc: 0.2196 - 0m 7s
batch: 400/1563 - train loss: 11.8646 - test loss: 11.2312 - train acc: 0.2166 - test acc: 0.2629 - 0m 11s
batch: 500/1563 - train loss: 11.1284 - test loss: 10.6704 - train acc: 0.2600 - test acc: 0.2862 - 0m 14s
batch: 600/1563 - train loss: 10.6299 - test loss: 10.3316 - train acc: 0.2981 - test acc: 0.3118 - 0m 17s
batch: 700/1563 - train loss: 10.1473 - test loss: 10.1406 - train acc: 0.3296 - test acc: 0.3325 - 0m 21s
batch: 800/1563 - train loss: 9.9847 - test loss: 9.8430 - train acc: 0.3337 - test acc: 0.3438 - 0m 24s
batch: 900/1563 - train loss: 9.7849 - test loss: 9.4463 - train acc: 0.3600 - test acc: 0.3726 - 0m 27s

batch: 1100/1563 - train loss: 4.3916 - test loss: 4.3393 - train acc: 0.7250 - test acc: 0.7279 - 4m 3s
batch: 1200/1563 - train loss: 4.4814 - test loss: 4.3610 - train acc: 0.7256 - test acc: 0.7314 - 4m 6s
batch: 1300/1563 - train loss: 4.3694 - test loss: 4.3580 - train acc: 0.7334 - test acc: 0.7302 - 4m 10s
batch: 1400/1563 - train loss: 4.4227 - test loss: 4.3886 - train acc: 0.7238 - test acc: 0.7330 - 4m 13s
batch: 1500/1563 - train loss: 4.4767 - test loss: 4.7626 - train acc: 0.7178 - test acc: 0.7085 - 4m 16s
batch: 1563/1563 - train loss: 4.2366 - test loss: 4.3120 - train acc: 0.7303 - test acc: 0.7350 - 4m 19s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 3.9430 - test loss: 4.5081 - train acc: 0.7537 - test acc: 0.7233 - 4m 23s
batch: 200/1563 - train loss: 3.9573 - test loss: 4.1793 - train acc: 0.7547 - test acc: 0.7438 - 4m 26s
batch: 300/1563 - train loss: 3.9690 - test loss: 4.

batch: 400/1563 - train loss: 2.4895 - test loss: 3.5840 - train acc: 0.8462 - test acc: 0.7866 - 8m 4s
batch: 500/1563 - train loss: 2.5804 - test loss: 3.6012 - train acc: 0.8356 - test acc: 0.7874 - 8m 7s
batch: 600/1563 - train loss: 2.6985 - test loss: 3.6194 - train acc: 0.8350 - test acc: 0.7799 - 8m 10s
batch: 700/1563 - train loss: 2.5920 - test loss: 3.6243 - train acc: 0.8343 - test acc: 0.7824 - 8m 14s
batch: 800/1563 - train loss: 2.9459 - test loss: 3.5359 - train acc: 0.8253 - test acc: 0.7877 - 8m 17s
batch: 900/1563 - train loss: 2.7885 - test loss: 3.5799 - train acc: 0.8205 - test acc: 0.7884 - 8m 20s
batch: 1000/1563 - train loss: 2.7480 - test loss: 3.5642 - train acc: 0.8284 - test acc: 0.7901 - 8m 24s
batch: 1100/1563 - train loss: 2.7356 - test loss: 3.7270 - train acc: 0.8275 - test acc: 0.7836 - 8m 27s
batch: 1200/1563 - train loss: 2.6893 - test loss: 3.5565 - train acc: 0.8325 - test acc: 0.7865 - 8m 30s
batch: 1300/1563 - train loss: 2.7352 - test loss: 3.4

batch: 1400/1563 - train loss: 2.0203 - test loss: 3.4336 - train acc: 0.8772 - test acc: 0.8060 - 12m 9s
batch: 1500/1563 - train loss: 1.9291 - test loss: 3.5194 - train acc: 0.8785 - test acc: 0.8018 - 12m 13s
batch: 1563/1563 - train loss: 1.7876 - test loss: 3.5535 - train acc: 0.8888 - test acc: 0.7989 - 12m 15s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.5005 - test loss: 3.7778 - train acc: 0.9092 - test acc: 0.8009 - 12m 19s
batch: 200/1563 - train loss: 1.5201 - test loss: 3.5799 - train acc: 0.9086 - test acc: 0.8054 - 12m 22s
batch: 300/1563 - train loss: 1.5485 - test loss: 3.7456 - train acc: 0.9023 - test acc: 0.7978 - 12m 26s
batch: 400/1563 - train loss: 1.5671 - test loss: 3.7211 - train acc: 0.9023 - test acc: 0.8048 - 12m 29s
batch: 500/1563 - train loss: 1.5307 - test loss: 3.7352 - train acc: 0.9076 - test acc: 0.7972 - 12m 32s
batch: 600/1563 - train loss: 1.5675 - test l

batch: 700/1563 - train loss: 1.0994 - test loss: 4.1870 - train acc: 0.9308 - test acc: 0.8002 - 16m 13s
batch: 800/1563 - train loss: 1.1066 - test loss: 4.1012 - train acc: 0.9324 - test acc: 0.7985 - 16m 16s
batch: 900/1563 - train loss: 1.1434 - test loss: 4.1848 - train acc: 0.9254 - test acc: 0.8029 - 16m 20s
batch: 1000/1563 - train loss: 1.1104 - test loss: 4.1670 - train acc: 0.9317 - test acc: 0.8017 - 16m 23s
batch: 1100/1563 - train loss: 1.0733 - test loss: 4.2659 - train acc: 0.9317 - test acc: 0.7994 - 16m 26s
batch: 1200/1563 - train loss: 1.1220 - test loss: 4.0376 - train acc: 0.9301 - test acc: 0.8017 - 16m 30s
batch: 1300/1563 - train loss: 1.2789 - test loss: 3.9639 - train acc: 0.9258 - test acc: 0.8044 - 16m 33s
batch: 1400/1563 - train loss: 1.2121 - test loss: 4.0435 - train acc: 0.9232 - test acc: 0.8014 - 16m 36s
batch: 1500/1563 - train loss: 1.1880 - test loss: 4.2534 - train acc: 0.9248 - test acc: 0.8027 - 16m 40s
batch: 1563/1563 - train loss: 1.1515 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0215 - test loss: 13.0077 - train acc: 0.1058 - test acc: 0.1252 - 0m 1s
batch: 200/1563 - train loss: 12.9149 - test loss: 13.0346 - train acc: 0.1234 - test acc: 0.1000 - 0m 4s
batch: 300/1563 - train loss: 12.9606 - test loss: 12.6416 - train acc: 0.1253 - test acc: 0.1506 - 0m 7s
batch: 400/1563 - train loss: 12.0751 - test loss: 11.4449 - train acc: 0.1979 - test acc: 0.2548 - 0m 11s
batch: 500/1563 - train loss: 11.5655 - test loss: 11.1383 - train acc: 0.2294 - test acc: 0.2739 - 0m 14s
batch: 600/1563 - train loss: 11.0440 - test loss: 10.6156 - train acc: 0.2759 - test acc: 0.3146 - 0m 18s
batch: 700/1563 - train loss: 10.5174 - test loss: 9.8428 - train acc: 0.3125 - test acc: 0.3580 - 0m 21s
batch: 800/1563 - train loss: 10.1936 - test loss: 9.7270 - train acc: 0.3347 - test acc: 0.3413 - 0m 24s
batch: 900/1563 - train loss: 9.8464 - test loss: 9.6665 - train acc: 0.3474 - test acc: 0.3547 - 0m 27s

batch: 1100/1563 - train loss: 4.3893 - test loss: 4.3208 - train acc: 0.7204 - test acc: 0.7345 - 4m 2s
batch: 1200/1563 - train loss: 4.4156 - test loss: 4.2625 - train acc: 0.7335 - test acc: 0.7359 - 4m 5s
batch: 1300/1563 - train loss: 4.5239 - test loss: 4.3534 - train acc: 0.7178 - test acc: 0.7295 - 4m 9s
batch: 1400/1563 - train loss: 4.4311 - test loss: 4.3322 - train acc: 0.7297 - test acc: 0.7300 - 4m 12s
batch: 1500/1563 - train loss: 4.2035 - test loss: 4.2497 - train acc: 0.7438 - test acc: 0.7397 - 4m 15s
batch: 1563/1563 - train loss: 4.3694 - test loss: 4.3233 - train acc: 0.7297 - test acc: 0.7357 - 4m 18s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 3.9845 - test loss: 4.1792 - train acc: 0.7594 - test acc: 0.7434 - 4m 21s
batch: 200/1563 - train loss: 3.8510 - test loss: 4.3652 - train acc: 0.7584 - test acc: 0.7334 - 4m 25s
batch: 300/1563 - train loss: 4.0990 - test loss: 4.1

batch: 400/1563 - train loss: 2.5158 - test loss: 3.7041 - train acc: 0.8538 - test acc: 0.7846 - 8m 1s
batch: 500/1563 - train loss: 2.7908 - test loss: 3.4884 - train acc: 0.8391 - test acc: 0.7851 - 8m 4s
batch: 600/1563 - train loss: 2.6455 - test loss: 3.5571 - train acc: 0.8337 - test acc: 0.7893 - 8m 8s
batch: 700/1563 - train loss: 2.7692 - test loss: 3.5268 - train acc: 0.8287 - test acc: 0.7888 - 8m 11s
batch: 800/1563 - train loss: 2.6277 - test loss: 3.5303 - train acc: 0.8408 - test acc: 0.7892 - 8m 14s
batch: 900/1563 - train loss: 2.6454 - test loss: 3.4557 - train acc: 0.8365 - test acc: 0.7930 - 8m 18s
batch: 1000/1563 - train loss: 2.8402 - test loss: 3.3856 - train acc: 0.8196 - test acc: 0.7959 - 8m 21s
batch: 1100/1563 - train loss: 2.7676 - test loss: 3.4740 - train acc: 0.8309 - test acc: 0.7865 - 8m 24s
batch: 1200/1563 - train loss: 2.7364 - test loss: 3.6448 - train acc: 0.8347 - test acc: 0.7790 - 8m 27s
batch: 1300/1563 - train loss: 2.7489 - test loss: 3.54

batch: 1400/1563 - train loss: 1.8112 - test loss: 3.4805 - train acc: 0.8856 - test acc: 0.8024 - 12m 6s
batch: 1500/1563 - train loss: 1.8426 - test loss: 3.5015 - train acc: 0.8835 - test acc: 0.8017 - 12m 9s
batch: 1563/1563 - train loss: 1.8644 - test loss: 3.5849 - train acc: 0.8778 - test acc: 0.8020 - 12m 12s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.5012 - test loss: 3.6878 - train acc: 0.9010 - test acc: 0.8048 - 12m 15s
batch: 200/1563 - train loss: 1.2561 - test loss: 3.7066 - train acc: 0.9214 - test acc: 0.8067 - 12m 18s
batch: 300/1563 - train loss: 1.5572 - test loss: 3.7548 - train acc: 0.9060 - test acc: 0.8006 - 12m 22s
batch: 400/1563 - train loss: 1.5463 - test loss: 3.7826 - train acc: 0.9063 - test acc: 0.7992 - 12m 25s
batch: 500/1563 - train loss: 1.5550 - test loss: 3.6394 - train acc: 0.9011 - test acc: 0.8049 - 12m 28s
batch: 600/1563 - train loss: 1.4286 - test lo

batch: 700/1563 - train loss: 1.2092 - test loss: 3.8800 - train acc: 0.9286 - test acc: 0.8062 - 16m 13s
batch: 800/1563 - train loss: 0.9810 - test loss: 4.1463 - train acc: 0.9386 - test acc: 0.8086 - 16m 16s
batch: 900/1563 - train loss: 1.1410 - test loss: 4.1840 - train acc: 0.9292 - test acc: 0.8015 - 16m 19s
batch: 1000/1563 - train loss: 1.0223 - test loss: 4.2466 - train acc: 0.9345 - test acc: 0.8015 - 16m 22s
batch: 1100/1563 - train loss: 1.1541 - test loss: 4.1181 - train acc: 0.9267 - test acc: 0.8014 - 16m 26s
batch: 1200/1563 - train loss: 1.1736 - test loss: 4.1681 - train acc: 0.9236 - test acc: 0.7989 - 16m 29s
batch: 1300/1563 - train loss: 1.2036 - test loss: 4.2349 - train acc: 0.9261 - test acc: 0.7957 - 16m 32s
batch: 1400/1563 - train loss: 1.1966 - test loss: 4.1921 - train acc: 0.9242 - test acc: 0.8041 - 16m 36s
batch: 1500/1563 - train loss: 1.1308 - test loss: 4.0804 - train acc: 0.9267 - test acc: 0.8077 - 16m 40s
batch: 1563/1563 - train loss: 1.1479 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0302 - test loss: 13.0177 - train acc: 0.1006 - test acc: 0.1000 - 0m 1s
batch: 200/1563 - train loss: 12.9969 - test loss: 12.9357 - train acc: 0.1084 - test acc: 0.1666 - 0m 5s
batch: 300/1563 - train loss: 12.5516 - test loss: 12.0660 - train acc: 0.1479 - test acc: 0.1768 - 0m 8s
batch: 400/1563 - train loss: 11.7250 - test loss: 11.3120 - train acc: 0.2235 - test acc: 0.2516 - 0m 11s
batch: 500/1563 - train loss: 11.3391 - test loss: 11.0921 - train acc: 0.2726 - test acc: 0.2721 - 0m 15s
batch: 600/1563 - train loss: 10.7483 - test loss: 10.9395 - train acc: 0.2918 - test acc: 0.2812 - 0m 18s
batch: 700/1563 - train loss: 10.5611 - test loss: 10.0509 - train acc: 0.2975 - test acc: 0.3429 - 0m 21s
batch: 800/1563 - train loss: 10.1671 - test loss: 9.7267 - train acc: 0.3265 - test acc: 0.3518 - 0m 24s
batch: 900/1563 - train loss: 9.8571 - test loss: 9.6587 - train acc: 0.3412 - test acc: 0.3574 - 0m 28s

batch: 1100/1563 - train loss: 4.2329 - test loss: 4.3433 - train acc: 0.7294 - test acc: 0.7314 - 4m 4s
batch: 1200/1563 - train loss: 4.3503 - test loss: 4.3952 - train acc: 0.7300 - test acc: 0.7264 - 4m 7s
batch: 1300/1563 - train loss: 4.2717 - test loss: 4.3301 - train acc: 0.7357 - test acc: 0.7313 - 4m 10s
batch: 1400/1563 - train loss: 4.3631 - test loss: 4.5958 - train acc: 0.7366 - test acc: 0.7142 - 4m 13s
batch: 1500/1563 - train loss: 4.2320 - test loss: 4.2877 - train acc: 0.7300 - test acc: 0.7340 - 4m 17s
batch: 1563/1563 - train loss: 4.2774 - test loss: 4.2364 - train acc: 0.7266 - test acc: 0.7381 - 4m 19s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 3.9963 - test loss: 4.3042 - train acc: 0.7494 - test acc: 0.7328 - 4m 23s
batch: 200/1563 - train loss: 3.9413 - test loss: 4.3582 - train acc: 0.7522 - test acc: 0.7367 - 4m 26s
batch: 300/1563 - train loss: 4.3246 - test loss: 4.

batch: 400/1563 - train loss: 2.7443 - test loss: 3.5483 - train acc: 0.8343 - test acc: 0.7865 - 8m 7s
batch: 500/1563 - train loss: 2.6364 - test loss: 3.5897 - train acc: 0.8431 - test acc: 0.7889 - 8m 10s
batch: 600/1563 - train loss: 2.6501 - test loss: 3.6432 - train acc: 0.8337 - test acc: 0.7832 - 8m 14s
batch: 700/1563 - train loss: 2.6483 - test loss: 3.7480 - train acc: 0.8359 - test acc: 0.7814 - 8m 17s
batch: 800/1563 - train loss: 2.6983 - test loss: 3.5372 - train acc: 0.8375 - test acc: 0.7909 - 8m 20s
batch: 900/1563 - train loss: 2.6752 - test loss: 3.5012 - train acc: 0.8290 - test acc: 0.7923 - 8m 24s
batch: 1000/1563 - train loss: 2.7930 - test loss: 3.5700 - train acc: 0.8356 - test acc: 0.7864 - 8m 27s
batch: 1100/1563 - train loss: 2.7063 - test loss: 3.5571 - train acc: 0.8312 - test acc: 0.7861 - 8m 30s
batch: 1200/1563 - train loss: 2.7126 - test loss: 3.5081 - train acc: 0.8293 - test acc: 0.7919 - 8m 34s
batch: 1300/1563 - train loss: 2.7302 - test loss: 3.

batch: 1400/1563 - train loss: 1.8836 - test loss: 3.4640 - train acc: 0.8842 - test acc: 0.8022 - 12m 13s
batch: 1500/1563 - train loss: 1.9823 - test loss: 3.6377 - train acc: 0.8744 - test acc: 0.7938 - 12m 16s
batch: 1563/1563 - train loss: 2.0037 - test loss: 3.4555 - train acc: 0.8791 - test acc: 0.8031 - 12m 19s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.4498 - test loss: 3.6534 - train acc: 0.9120 - test acc: 0.8030 - 12m 23s
batch: 200/1563 - train loss: 1.5343 - test loss: 3.7410 - train acc: 0.9026 - test acc: 0.7995 - 12m 26s
batch: 300/1563 - train loss: 1.5521 - test loss: 3.7846 - train acc: 0.9060 - test acc: 0.7975 - 12m 29s
batch: 400/1563 - train loss: 1.5156 - test loss: 3.7394 - train acc: 0.9076 - test acc: 0.7992 - 12m 32s
batch: 500/1563 - train loss: 1.6463 - test loss: 3.5134 - train acc: 0.9010 - test acc: 0.8080 - 12m 36s
batch: 600/1563 - train loss: 1.4817 - test 

batch: 700/1563 - train loss: 1.0312 - test loss: 4.1691 - train acc: 0.9395 - test acc: 0.7989 - 16m 22s
batch: 800/1563 - train loss: 1.1213 - test loss: 4.3019 - train acc: 0.9346 - test acc: 0.7976 - 16m 25s
batch: 900/1563 - train loss: 1.0919 - test loss: 4.1019 - train acc: 0.9343 - test acc: 0.8028 - 16m 29s
batch: 1000/1563 - train loss: 1.1270 - test loss: 4.1904 - train acc: 0.9342 - test acc: 0.7983 - 16m 32s
batch: 1100/1563 - train loss: 1.2535 - test loss: 4.1422 - train acc: 0.9227 - test acc: 0.8037 - 16m 36s
batch: 1200/1563 - train loss: 1.1891 - test loss: 4.0877 - train acc: 0.9217 - test acc: 0.7992 - 16m 39s
batch: 1300/1563 - train loss: 1.1752 - test loss: 4.1131 - train acc: 0.9295 - test acc: 0.8038 - 16m 42s
batch: 1400/1563 - train loss: 1.1446 - test loss: 3.9720 - train acc: 0.9295 - test acc: 0.8029 - 16m 46s
batch: 1500/1563 - train loss: 1.1843 - test loss: 4.1554 - train acc: 0.9258 - test acc: 0.7962 - 16m 49s
batch: 1563/1563 - train loss: 1.1487 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0242 - test loss: 13.0214 - train acc: 0.1014 - test acc: 0.1000 - 0m 1s
batch: 200/1563 - train loss: 13.0008 - test loss: 12.9834 - train acc: 0.1228 - test acc: 0.1109 - 0m 4s
batch: 300/1563 - train loss: 12.7697 - test loss: 12.2713 - train acc: 0.1346 - test acc: 0.1597 - 0m 8s
batch: 400/1563 - train loss: 11.9855 - test loss: 11.5703 - train acc: 0.1985 - test acc: 0.2332 - 0m 11s
batch: 500/1563 - train loss: 11.5221 - test loss: 11.0591 - train acc: 0.2315 - test acc: 0.2750 - 0m 14s
batch: 600/1563 - train loss: 11.0058 - test loss: 10.4441 - train acc: 0.2756 - test acc: 0.3105 - 0m 18s
batch: 700/1563 - train loss: 10.2834 - test loss: 10.4883 - train acc: 0.3328 - test acc: 0.2965 - 0m 21s
batch: 800/1563 - train loss: 10.1064 - test loss: 9.8547 - train acc: 0.3253 - test acc: 0.3397 - 0m 24s
batch: 900/1563 - train loss: 10.0004 - test loss: 9.6535 - train acc: 0.3472 - test acc: 0.3580 - 0m 27s


batch: 1100/1563 - train loss: 4.5420 - test loss: 4.3932 - train acc: 0.7135 - test acc: 0.7284 - 4m 2s
batch: 1200/1563 - train loss: 4.2827 - test loss: 4.4365 - train acc: 0.7290 - test acc: 0.7270 - 4m 6s
batch: 1300/1563 - train loss: 4.3919 - test loss: 4.3740 - train acc: 0.7341 - test acc: 0.7310 - 4m 9s
batch: 1400/1563 - train loss: 4.4280 - test loss: 4.6096 - train acc: 0.7288 - test acc: 0.7097 - 4m 13s
batch: 1500/1563 - train loss: 4.2723 - test loss: 4.4030 - train acc: 0.7228 - test acc: 0.7295 - 4m 16s
batch: 1563/1563 - train loss: 4.2595 - test loss: 4.2443 - train acc: 0.7328 - test acc: 0.7400 - 4m 19s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.1379 - test loss: 4.3558 - train acc: 0.7441 - test acc: 0.7327 - 4m 22s
batch: 200/1563 - train loss: 4.0465 - test loss: 4.3442 - train acc: 0.7465 - test acc: 0.7318 - 4m 26s
batch: 300/1563 - train loss: 3.9371 - test loss: 4.3

batch: 400/1563 - train loss: 2.7989 - test loss: 3.6503 - train acc: 0.8303 - test acc: 0.7805 - 8m 4s
batch: 500/1563 - train loss: 2.6231 - test loss: 3.8009 - train acc: 0.8384 - test acc: 0.7760 - 8m 8s
batch: 600/1563 - train loss: 2.6036 - test loss: 3.6701 - train acc: 0.8356 - test acc: 0.7810 - 8m 11s
batch: 700/1563 - train loss: 2.6137 - test loss: 3.6247 - train acc: 0.8299 - test acc: 0.7875 - 8m 14s
batch: 800/1563 - train loss: 2.6759 - test loss: 3.6189 - train acc: 0.8312 - test acc: 0.7857 - 8m 18s
batch: 900/1563 - train loss: 2.7733 - test loss: 3.6107 - train acc: 0.8299 - test acc: 0.7854 - 8m 21s
batch: 1000/1563 - train loss: 2.6175 - test loss: 3.5563 - train acc: 0.8350 - test acc: 0.7890 - 8m 24s
batch: 1100/1563 - train loss: 2.8580 - test loss: 3.5853 - train acc: 0.8178 - test acc: 0.7869 - 8m 28s
batch: 1200/1563 - train loss: 2.7373 - test loss: 3.7351 - train acc: 0.8274 - test acc: 0.7863 - 8m 31s
batch: 1300/1563 - train loss: 2.7371 - test loss: 3.5

batch: 1400/1563 - train loss: 1.7476 - test loss: 3.8391 - train acc: 0.8844 - test acc: 0.7880 - 12m 9s
batch: 1500/1563 - train loss: 1.9179 - test loss: 3.6786 - train acc: 0.8744 - test acc: 0.8013 - 12m 13s
batch: 1563/1563 - train loss: 1.9772 - test loss: 3.6511 - train acc: 0.8757 - test acc: 0.7942 - 12m 16s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.4799 - test loss: 3.7045 - train acc: 0.9035 - test acc: 0.8011 - 12m 19s
batch: 200/1563 - train loss: 1.5353 - test loss: 3.7522 - train acc: 0.9051 - test acc: 0.8012 - 12m 23s
batch: 300/1563 - train loss: 1.4412 - test loss: 3.9587 - train acc: 0.9072 - test acc: 0.7955 - 12m 26s
batch: 400/1563 - train loss: 1.5034 - test loss: 3.9478 - train acc: 0.9110 - test acc: 0.7967 - 12m 29s
batch: 500/1563 - train loss: 1.6279 - test loss: 3.7791 - train acc: 0.8969 - test acc: 0.7968 - 12m 32s
batch: 600/1563 - train loss: 1.4467 - test l

batch: 700/1563 - train loss: 1.1213 - test loss: 4.3622 - train acc: 0.9310 - test acc: 0.7998 - 16m 14s
batch: 800/1563 - train loss: 1.1540 - test loss: 4.2443 - train acc: 0.9277 - test acc: 0.7984 - 16m 17s
batch: 900/1563 - train loss: 1.1400 - test loss: 4.4236 - train acc: 0.9304 - test acc: 0.7960 - 16m 20s
batch: 1000/1563 - train loss: 1.1525 - test loss: 4.2995 - train acc: 0.9320 - test acc: 0.7948 - 16m 23s
batch: 1100/1563 - train loss: 0.9954 - test loss: 4.3747 - train acc: 0.9361 - test acc: 0.7984 - 16m 27s
batch: 1200/1563 - train loss: 1.2646 - test loss: 4.1028 - train acc: 0.9236 - test acc: 0.8045 - 16m 30s
batch: 1300/1563 - train loss: 1.0559 - test loss: 4.4011 - train acc: 0.9336 - test acc: 0.7944 - 16m 34s
batch: 1400/1563 - train loss: 1.2049 - test loss: 4.0605 - train acc: 0.9245 - test acc: 0.7998 - 16m 37s
batch: 1500/1563 - train loss: 1.1318 - test loss: 4.2322 - train acc: 0.9299 - test acc: 0.8010 - 16m 41s
batch: 1563/1563 - train loss: 1.0911 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0203 - test loss: 13.0171 - train acc: 0.1059 - test acc: 0.1000 - 0m 1s
batch: 200/1563 - train loss: 13.0110 - test loss: 12.9751 - train acc: 0.0890 - test acc: 0.1001 - 0m 5s
batch: 300/1563 - train loss: 12.7621 - test loss: 12.8937 - train acc: 0.1316 - test acc: 0.1222 - 0m 8s
batch: 400/1563 - train loss: 11.9163 - test loss: 11.5436 - train acc: 0.2144 - test acc: 0.2338 - 0m 11s
batch: 500/1563 - train loss: 11.5404 - test loss: 11.4241 - train acc: 0.2378 - test acc: 0.2621 - 0m 15s
batch: 600/1563 - train loss: 11.3284 - test loss: 10.6568 - train acc: 0.2743 - test acc: 0.3006 - 0m 18s
batch: 700/1563 - train loss: 10.7087 - test loss: 10.1312 - train acc: 0.2956 - test acc: 0.3184 - 0m 21s
batch: 800/1563 - train loss: 10.3629 - test loss: 9.9052 - train acc: 0.3112 - test acc: 0.3479 - 0m 24s
batch: 900/1563 - train loss: 10.1280 - test loss: 9.8948 - train acc: 0.3306 - test acc: 0.3509 - 0m 28s


batch: 1100/1563 - train loss: 4.3236 - test loss: 4.4388 - train acc: 0.7285 - test acc: 0.7311 - 4m 3s
batch: 1200/1563 - train loss: 4.3639 - test loss: 4.7420 - train acc: 0.7281 - test acc: 0.7083 - 4m 6s
batch: 1300/1563 - train loss: 4.1427 - test loss: 4.5466 - train acc: 0.7375 - test acc: 0.7136 - 4m 10s
batch: 1400/1563 - train loss: 4.2092 - test loss: 4.5507 - train acc: 0.7406 - test acc: 0.7247 - 4m 13s
batch: 1500/1563 - train loss: 4.4633 - test loss: 4.3703 - train acc: 0.7225 - test acc: 0.7289 - 4m 16s
batch: 1563/1563 - train loss: 4.4271 - test loss: 4.4916 - train acc: 0.7247 - test acc: 0.7260 - 4m 19s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.1499 - test loss: 4.2671 - train acc: 0.7425 - test acc: 0.7381 - 4m 22s
batch: 200/1563 - train loss: 4.1853 - test loss: 4.4194 - train acc: 0.7434 - test acc: 0.7302 - 4m 26s
batch: 300/1563 - train loss: 4.1301 - test loss: 4.

batch: 400/1563 - train loss: 2.6712 - test loss: 3.4886 - train acc: 0.8265 - test acc: 0.7896 - 8m 2s
batch: 500/1563 - train loss: 2.7337 - test loss: 3.6840 - train acc: 0.8303 - test acc: 0.7772 - 8m 5s
batch: 600/1563 - train loss: 2.5892 - test loss: 3.6166 - train acc: 0.8352 - test acc: 0.7812 - 8m 8s
batch: 700/1563 - train loss: 2.6059 - test loss: 3.5660 - train acc: 0.8350 - test acc: 0.7862 - 8m 11s
batch: 800/1563 - train loss: 2.5979 - test loss: 3.7545 - train acc: 0.8396 - test acc: 0.7739 - 8m 15s
batch: 900/1563 - train loss: 2.5942 - test loss: 3.6505 - train acc: 0.8478 - test acc: 0.7848 - 8m 18s
batch: 1000/1563 - train loss: 2.8651 - test loss: 3.7336 - train acc: 0.8252 - test acc: 0.7794 - 8m 21s
batch: 1100/1563 - train loss: 2.7372 - test loss: 3.5271 - train acc: 0.8262 - test acc: 0.7898 - 8m 25s
batch: 1200/1563 - train loss: 2.6736 - test loss: 3.5974 - train acc: 0.8328 - test acc: 0.7930 - 8m 28s
batch: 1300/1563 - train loss: 2.5796 - test loss: 3.56

batch: 1400/1563 - train loss: 1.8527 - test loss: 3.5888 - train acc: 0.8794 - test acc: 0.8021 - 12m 7s
batch: 1500/1563 - train loss: 1.9222 - test loss: 3.6153 - train acc: 0.8782 - test acc: 0.8012 - 12m 11s
batch: 1563/1563 - train loss: 2.0337 - test loss: 3.7633 - train acc: 0.8687 - test acc: 0.7922 - 12m 13s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.5543 - test loss: 3.8521 - train acc: 0.8973 - test acc: 0.7950 - 12m 17s
batch: 200/1563 - train loss: 1.4634 - test loss: 3.8522 - train acc: 0.9060 - test acc: 0.7936 - 12m 20s
batch: 300/1563 - train loss: 1.5985 - test loss: 3.8747 - train acc: 0.8970 - test acc: 0.8032 - 12m 24s
batch: 400/1563 - train loss: 1.4490 - test loss: 3.9968 - train acc: 0.9079 - test acc: 0.7968 - 12m 27s
batch: 500/1563 - train loss: 1.5655 - test loss: 3.8545 - train acc: 0.9013 - test acc: 0.8011 - 12m 30s
batch: 600/1563 - train loss: 1.6426 - test l

batch: 700/1563 - train loss: 1.0269 - test loss: 4.1725 - train acc: 0.9383 - test acc: 0.8037 - 16m 12s
batch: 800/1563 - train loss: 1.0138 - test loss: 4.2678 - train acc: 0.9377 - test acc: 0.8009 - 16m 15s
batch: 900/1563 - train loss: 1.0136 - test loss: 4.3063 - train acc: 0.9336 - test acc: 0.8024 - 16m 18s
batch: 1000/1563 - train loss: 1.1525 - test loss: 4.2634 - train acc: 0.9295 - test acc: 0.8006 - 16m 22s
batch: 1100/1563 - train loss: 1.1793 - test loss: 4.2846 - train acc: 0.9239 - test acc: 0.7982 - 16m 25s
batch: 1200/1563 - train loss: 1.2005 - test loss: 4.5533 - train acc: 0.9251 - test acc: 0.7937 - 16m 28s
batch: 1300/1563 - train loss: 1.2098 - test loss: 4.2295 - train acc: 0.9258 - test acc: 0.7971 - 16m 32s
batch: 1400/1563 - train loss: 1.3638 - test loss: 4.1617 - train acc: 0.9189 - test acc: 0.7992 - 16m 36s
batch: 1500/1563 - train loss: 1.2360 - test loss: 3.9913 - train acc: 0.9214 - test acc: 0.8060 - 16m 39s
batch: 1563/1563 - train loss: 1.1972 - 

partition 8635/8635
starting epoch: 1/100
batch: 100/1563 - train loss: 13.0237 - test loss: 13.0134 - train acc: 0.0952 - test acc: 0.1065 - 0m 1s
batch: 200/1563 - train loss: 12.9829 - test loss: 12.8431 - train acc: 0.1137 - test acc: 0.1563 - 0m 4s
batch: 300/1563 - train loss: 12.3913 - test loss: 11.6943 - train acc: 0.1832 - test acc: 0.2220 - 0m 7s
batch: 400/1563 - train loss: 11.7433 - test loss: 11.2559 - train acc: 0.2312 - test acc: 0.2627 - 0m 11s
batch: 500/1563 - train loss: 11.3168 - test loss: 11.0816 - train acc: 0.2509 - test acc: 0.2691 - 0m 14s
batch: 600/1563 - train loss: 10.9238 - test loss: 10.4741 - train acc: 0.2781 - test acc: 0.3070 - 0m 17s
batch: 700/1563 - train loss: 10.2950 - test loss: 9.8416 - train acc: 0.3299 - test acc: 0.3520 - 0m 21s
batch: 800/1563 - train loss: 10.0326 - test loss: 9.5958 - train acc: 0.3387 - test acc: 0.3675 - 0m 24s
batch: 900/1563 - train loss: 9.8457 - test loss: 9.4631 - train acc: 0.3377 - test acc: 0.3719 - 0m 27s

batch: 1100/1563 - train loss: 4.2570 - test loss: 4.4370 - train acc: 0.7297 - test acc: 0.7285 - 4m 6s
batch: 1200/1563 - train loss: 4.3098 - test loss: 4.6517 - train acc: 0.7369 - test acc: 0.7116 - 4m 9s
batch: 1300/1563 - train loss: 4.3796 - test loss: 4.3195 - train acc: 0.7272 - test acc: 0.7327 - 4m 13s
batch: 1400/1563 - train loss: 4.0907 - test loss: 4.4421 - train acc: 0.7403 - test acc: 0.7235 - 4m 16s
batch: 1500/1563 - train loss: 4.1841 - test loss: 4.2028 - train acc: 0.7416 - test acc: 0.7423 - 4m 19s
batch: 1563/1563 - train loss: 4.0495 - test loss: 4.3845 - train acc: 0.7518 - test acc: 0.7309 - 4m 22s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.0321 - test loss: 4.2740 - train acc: 0.7550 - test acc: 0.7348 - 4m 26s
batch: 200/1563 - train loss: 4.0845 - test loss: 4.3300 - train acc: 0.7478 - test acc: 0.7309 - 4m 29s
batch: 300/1563 - train loss: 3.9605 - test loss: 4.

batch: 400/1563 - train loss: 2.5318 - test loss: 3.6155 - train acc: 0.8462 - test acc: 0.7879 - 8m 9s
batch: 500/1563 - train loss: 2.6145 - test loss: 3.5994 - train acc: 0.8356 - test acc: 0.7857 - 8m 13s
batch: 600/1563 - train loss: 2.4966 - test loss: 3.4781 - train acc: 0.8394 - test acc: 0.7900 - 8m 16s
batch: 700/1563 - train loss: 2.7296 - test loss: 3.6640 - train acc: 0.8343 - test acc: 0.7819 - 8m 19s
batch: 800/1563 - train loss: 2.7197 - test loss: 3.6862 - train acc: 0.8372 - test acc: 0.7839 - 8m 22s
batch: 900/1563 - train loss: 2.6611 - test loss: 3.4975 - train acc: 0.8362 - test acc: 0.7931 - 8m 26s
batch: 1000/1563 - train loss: 2.6785 - test loss: 3.6194 - train acc: 0.8381 - test acc: 0.7855 - 8m 30s
batch: 1100/1563 - train loss: 2.6373 - test loss: 3.4404 - train acc: 0.8378 - test acc: 0.7985 - 8m 33s
batch: 1200/1563 - train loss: 2.6160 - test loss: 3.5086 - train acc: 0.8399 - test acc: 0.7917 - 8m 37s
batch: 1300/1563 - train loss: 2.6350 - test loss: 3.

batch: 1400/1563 - train loss: 1.8378 - test loss: 3.6726 - train acc: 0.8863 - test acc: 0.8030 - 12m 18s
batch: 1500/1563 - train loss: 1.8120 - test loss: 3.6858 - train acc: 0.8882 - test acc: 0.7989 - 12m 22s
batch: 1563/1563 - train loss: 1.9051 - test loss: 3.5942 - train acc: 0.8832 - test acc: 0.7989 - 12m 24s
GPU memory used: 2.88 GB - max: 3.19 GB - memory reserved: 3.28 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.3438 - test loss: 3.7136 - train acc: 0.9182 - test acc: 0.8018 - 12m 28s
batch: 200/1563 - train loss: 1.3275 - test loss: 3.7488 - train acc: 0.9186 - test acc: 0.8010 - 12m 31s
batch: 300/1563 - train loss: 1.5650 - test loss: 3.6900 - train acc: 0.8998 - test acc: 0.7998 - 12m 35s
batch: 400/1563 - train loss: 1.4612 - test loss: 3.7870 - train acc: 0.9089 - test acc: 0.7992 - 12m 38s
batch: 500/1563 - train loss: 1.5835 - test loss: 3.7549 - train acc: 0.9045 - test acc: 0.7985 - 12m 41s
batch: 600/1563 - train loss: 1.6073 - test 

batch: 700/1563 - train loss: 1.1285 - test loss: 4.1754 - train acc: 0.9298 - test acc: 0.7961 - 16m 25s
batch: 800/1563 - train loss: 1.0566 - test loss: 4.1813 - train acc: 0.9295 - test acc: 0.8010 - 16m 29s
batch: 900/1563 - train loss: 1.1534 - test loss: 4.0279 - train acc: 0.9317 - test acc: 0.8003 - 16m 32s
batch: 1000/1563 - train loss: 1.0058 - test loss: 4.0923 - train acc: 0.9361 - test acc: 0.7997 - 16m 36s
batch: 1100/1563 - train loss: 1.0983 - test loss: 3.9368 - train acc: 0.9329 - test acc: 0.8081 - 16m 39s
batch: 1200/1563 - train loss: 0.9907 - test loss: 4.2521 - train acc: 0.9355 - test acc: 0.8000 - 16m 43s
batch: 1300/1563 - train loss: 1.1691 - test loss: 4.1882 - train acc: 0.9267 - test acc: 0.8054 - 16m 46s
batch: 1400/1563 - train loss: 1.1054 - test loss: 4.3200 - train acc: 0.9277 - test acc: 0.7993 - 16m 49s
batch: 1500/1563 - train loss: 1.1123 - test loss: 4.0235 - train acc: 0.9289 - test acc: 0.8042 - 16m 53s
batch: 1563/1563 - train loss: 1.0795 - 

In [26]:
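# Baseline runs: train the same network with the Fisher update disabled.
for _ in range(n_runs):
    # Release cached GPU memory left over from the previous run.
    if torch.cuda.is_available():
        torch.cuda.empty_cache()

    print(f'iteration: {step_i}')

    default_metrics, _ = train_network_fisher_optimization(apply_fisher=False,
                                                           net_params={'p': 0.1},
                                                           epochs=100,
                                                           time_limit_secs=1200)

    # Record this run and save the accumulated results after every iteration.
    results_list.append((default_metrics, -1, -1, -1))
    results_list_to_json(results_list, step=step_i)
    step_i += 1

    print()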

iteration: 20
generating CIFAR10 data with 10 classes
Files already downloaded and verified
Files already downloaded and verified
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 96, 32, 32]           7,296
         MaxPool2d-2           [-1, 96, 16, 16]               0
              ReLU-3           [-1, 96, 16, 16]               0
            Conv2d-4           [-1, 80, 16, 16]         192,080
              ReLU-5           [-1, 80, 16, 16]               0
         MaxPool2d-6             [-1, 80, 8, 8]               0
         Dropout2d-7             [-1, 80, 8, 8]               0
            Conv2d-8             [-1, 96, 8, 8]         192,096
              ReLU-9             [-1, 96, 8, 8]               0
        Dropout2d-10             [-1, 96, 8, 8]               0
           Conv2d-11             [-1, 64, 8, 8]         153,664
             ReLU-12             [-1,

batch: 1000/1563 - train loss: 5.4929 - test loss: 5.3985 - train acc: 0.6513 - test acc: 0.6630 - 2m 39s
batch: 1100/1563 - train loss: 5.4734 - test loss: 5.2466 - train acc: 0.6531 - test acc: 0.6779 - 2m 42s
batch: 1200/1563 - train loss: 5.3955 - test loss: 5.3796 - train acc: 0.6647 - test acc: 0.6625 - 2m 45s
batch: 1300/1563 - train loss: 5.2556 - test loss: 5.1176 - train acc: 0.6713 - test acc: 0.6819 - 2m 47s
batch: 1400/1563 - train loss: 5.3796 - test loss: 5.4091 - train acc: 0.6632 - test acc: 0.6624 - 2m 50s
batch: 1500/1563 - train loss: 5.4044 - test loss: 5.1970 - train acc: 0.6675 - test acc: 0.6790 - 2m 53s
batch: 1563/1563 - train loss: 5.5181 - test loss: 5.0861 - train acc: 0.6503 - test acc: 0.6860 - 2m 55s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 5/100
batch: 100/1563 - train loss: 4.7349 - test loss: 4.9716 - train acc: 0.7016 - test acc: 0.6931 - 2m 58s
batch: 200/1563 - train loss: 4.9515 - test loss:

batch: 300/1563 - train loss: 3.1024 - test loss: 4.1526 - train acc: 0.8068 - test acc: 0.7548 - 6m 3s
batch: 400/1563 - train loss: 2.9378 - test loss: 4.2616 - train acc: 0.8268 - test acc: 0.7421 - 6m 5s
batch: 500/1563 - train loss: 3.1935 - test loss: 4.1940 - train acc: 0.7974 - test acc: 0.7426 - 6m 8s
batch: 600/1563 - train loss: 3.1799 - test loss: 4.2597 - train acc: 0.8061 - test acc: 0.7460 - 6m 11s
batch: 700/1563 - train loss: 3.0846 - test loss: 4.1931 - train acc: 0.8080 - test acc: 0.7535 - 6m 14s
batch: 800/1563 - train loss: 3.1846 - test loss: 4.2528 - train acc: 0.8056 - test acc: 0.7463 - 6m 17s
batch: 900/1563 - train loss: 3.1483 - test loss: 4.5165 - train acc: 0.8040 - test acc: 0.7360 - 6m 20s
batch: 1000/1563 - train loss: 3.0799 - test loss: 4.1852 - train acc: 0.8090 - test acc: 0.7510 - 6m 23s
batch: 1100/1563 - train loss: 3.1254 - test loss: 4.0970 - train acc: 0.8012 - test acc: 0.7515 - 6m 26s
batch: 1200/1563 - train loss: 3.1622 - test loss: 4.176

batch: 1300/1563 - train loss: 2.3457 - test loss: 4.3963 - train acc: 0.8572 - test acc: 0.7584 - 9m 33s
batch: 1400/1563 - train loss: 2.2844 - test loss: 4.2774 - train acc: 0.8606 - test acc: 0.7646 - 9m 36s
batch: 1500/1563 - train loss: 2.2703 - test loss: 4.4175 - train acc: 0.8607 - test acc: 0.7568 - 9m 39s
batch: 1563/1563 - train loss: 2.2793 - test loss: 4.4353 - train acc: 0.8581 - test acc: 0.7594 - 9m 41s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 14/100
batch: 100/1563 - train loss: 1.8180 - test loss: 4.5978 - train acc: 0.8828 - test acc: 0.7656 - 9m 44s
batch: 200/1563 - train loss: 1.6873 - test loss: 4.5465 - train acc: 0.8966 - test acc: 0.7612 - 9m 47s
batch: 300/1563 - train loss: 1.6972 - test loss: 4.6201 - train acc: 0.8948 - test acc: 0.7680 - 9m 50s
batch: 400/1563 - train loss: 1.8010 - test loss: 4.6575 - train acc: 0.8887 - test acc: 0.7587 - 9m 52s
batch: 500/1563 - train loss: 1.7331 - test loss: 4

batch: 600/1563 - train loss: 1.2770 - test loss: 5.1681 - train acc: 0.9242 - test acc: 0.7672 - 13m 2s
batch: 700/1563 - train loss: 1.3690 - test loss: 5.2301 - train acc: 0.9179 - test acc: 0.7614 - 13m 5s
batch: 800/1563 - train loss: 1.5606 - test loss: 5.2335 - train acc: 0.9094 - test acc: 0.7591 - 13m 8s
batch: 900/1563 - train loss: 1.3877 - test loss: 5.5522 - train acc: 0.9126 - test acc: 0.7576 - 13m 11s
batch: 1000/1563 - train loss: 1.4184 - test loss: 5.1074 - train acc: 0.9110 - test acc: 0.7588 - 13m 14s
batch: 1100/1563 - train loss: 1.3933 - test loss: 5.0804 - train acc: 0.9152 - test acc: 0.7557 - 13m 17s
batch: 1200/1563 - train loss: 1.4670 - test loss: 4.9600 - train acc: 0.9041 - test acc: 0.7558 - 13m 20s
batch: 1300/1563 - train loss: 1.3804 - test loss: 5.1829 - train acc: 0.9151 - test acc: 0.7629 - 13m 23s
batch: 1400/1563 - train loss: 1.5440 - test loss: 5.1434 - train acc: 0.9045 - test acc: 0.7563 - 13m 26s
batch: 1500/1563 - train loss: 1.4083 - test

batch: 1563/1563 - train loss: 1.2826 - test loss: 5.8886 - train acc: 0.9217 - test acc: 0.7510 - 16m 37s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 23/100
batch: 100/1563 - train loss: 1.0637 - test loss: 5.8743 - train acc: 0.9355 - test acc: 0.7645 - 16m 39s
batch: 200/1563 - train loss: 0.8721 - test loss: 6.3154 - train acc: 0.9480 - test acc: 0.7551 - 16m 42s
batch: 300/1563 - train loss: 1.0846 - test loss: 6.3393 - train acc: 0.9354 - test acc: 0.7529 - 16m 45s
batch: 400/1563 - train loss: 0.9906 - test loss: 6.1797 - train acc: 0.9433 - test acc: 0.7552 - 16m 48s
batch: 500/1563 - train loss: 1.2503 - test loss: 5.9601 - train acc: 0.9248 - test acc: 0.7556 - 16m 51s
batch: 600/1563 - train loss: 1.0398 - test loss: 5.7913 - train acc: 0.9339 - test acc: 0.7663 - 16m 54s
batch: 700/1563 - train loss: 1.1881 - test loss: 6.0063 - train acc: 0.9264 - test acc: 0.7481 - 16m 58s
batch: 800/1563 - train loss: 1.2582 - test lo

Files already downloaded and verified
Files already downloaded and verified
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 96, 32, 32]           7,296
         MaxPool2d-2           [-1, 96, 16, 16]               0
              ReLU-3           [-1, 96, 16, 16]               0
            Conv2d-4           [-1, 80, 16, 16]         192,080
              ReLU-5           [-1, 80, 16, 16]               0
         MaxPool2d-6             [-1, 80, 8, 8]               0
         Dropout2d-7             [-1, 80, 8, 8]               0
            Conv2d-8             [-1, 96, 8, 8]         192,096
              ReLU-9             [-1, 96, 8, 8]               0
        Dropout2d-10             [-1, 96, 8, 8]               0
           Conv2d-11             [-1, 64, 8, 8]         153,664
             ReLU-12             [-1, 64, 8, 8]               0
        Dropout2d-13       

batch: 1100/1563 - train loss: 5.3672 - test loss: 5.4169 - train acc: 0.6716 - test acc: 0.6660 - 2m 41s
batch: 1200/1563 - train loss: 5.3286 - test loss: 5.0918 - train acc: 0.6654 - test acc: 0.6843 - 2m 44s
batch: 1300/1563 - train loss: 5.3724 - test loss: 5.1767 - train acc: 0.6601 - test acc: 0.6750 - 2m 47s
batch: 1400/1563 - train loss: 5.2558 - test loss: 4.9416 - train acc: 0.6685 - test acc: 0.6920 - 2m 50s
batch: 1500/1563 - train loss: 5.2415 - test loss: 5.7973 - train acc: 0.6656 - test acc: 0.6400 - 2m 52s
batch: 1563/1563 - train loss: 5.2777 - test loss: 4.9905 - train acc: 0.6628 - test acc: 0.6881 - 2m 55s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 5/100
batch: 100/1563 - train loss: 4.7977 - test loss: 5.0327 - train acc: 0.6978 - test acc: 0.6883 - 2m 57s
batch: 200/1563 - train loss: 4.9514 - test loss: 5.0364 - train acc: 0.7010 - test acc: 0.6886 - 3m 0s
batch: 300/1563 - train loss: 4.7630 - test loss: 4

batch: 400/1563 - train loss: 3.0939 - test loss: 4.1197 - train acc: 0.8153 - test acc: 0.7564 - 6m 5s
batch: 500/1563 - train loss: 2.8647 - test loss: 4.1348 - train acc: 0.8268 - test acc: 0.7578 - 6m 8s
batch: 600/1563 - train loss: 2.9326 - test loss: 4.2325 - train acc: 0.8268 - test acc: 0.7528 - 6m 11s
batch: 700/1563 - train loss: 3.0441 - test loss: 4.2771 - train acc: 0.8090 - test acc: 0.7442 - 6m 14s
batch: 800/1563 - train loss: 3.0257 - test loss: 4.0598 - train acc: 0.8084 - test acc: 0.7567 - 6m 17s
batch: 900/1563 - train loss: 3.0845 - test loss: 3.9937 - train acc: 0.8028 - test acc: 0.7620 - 6m 20s
batch: 1000/1563 - train loss: 3.1318 - test loss: 4.0178 - train acc: 0.8140 - test acc: 0.7593 - 6m 22s
batch: 1100/1563 - train loss: 3.1335 - test loss: 3.9937 - train acc: 0.7959 - test acc: 0.7626 - 6m 25s
batch: 1200/1563 - train loss: 3.1299 - test loss: 4.0933 - train acc: 0.8118 - test acc: 0.7522 - 6m 28s
batch: 1300/1563 - train loss: 3.0519 - test loss: 4.0

batch: 1400/1563 - train loss: 2.2344 - test loss: 4.4661 - train acc: 0.8603 - test acc: 0.7595 - 9m 34s
batch: 1500/1563 - train loss: 2.3512 - test loss: 4.5591 - train acc: 0.8521 - test acc: 0.7547 - 9m 36s
batch: 1563/1563 - train loss: 2.1674 - test loss: 4.4461 - train acc: 0.8656 - test acc: 0.7714 - 9m 39s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 14/100
batch: 100/1563 - train loss: 1.4166 - test loss: 4.6833 - train acc: 0.9135 - test acc: 0.7648 - 9m 42s
batch: 200/1563 - train loss: 1.5883 - test loss: 4.6237 - train acc: 0.9001 - test acc: 0.7662 - 9m 45s
batch: 300/1563 - train loss: 1.6726 - test loss: 4.6762 - train acc: 0.8992 - test acc: 0.7664 - 9m 47s
batch: 400/1563 - train loss: 1.7086 - test loss: 4.7150 - train acc: 0.8903 - test acc: 0.7643 - 9m 50s
batch: 500/1563 - train loss: 1.8525 - test loss: 4.3862 - train acc: 0.8882 - test acc: 0.7705 - 9m 53s
batch: 600/1563 - train loss: 1.9683 - test loss: 4.

batch: 700/1563 - train loss: 1.3712 - test loss: 5.0736 - train acc: 0.9142 - test acc: 0.7691 - 13m 5s
batch: 800/1563 - train loss: 1.3692 - test loss: 5.1245 - train acc: 0.9139 - test acc: 0.7675 - 13m 8s
batch: 900/1563 - train loss: 1.3151 - test loss: 5.1793 - train acc: 0.9192 - test acc: 0.7675 - 13m 11s
batch: 1000/1563 - train loss: 1.3029 - test loss: 5.2727 - train acc: 0.9145 - test acc: 0.7544 - 13m 14s
batch: 1100/1563 - train loss: 1.3493 - test loss: 5.2698 - train acc: 0.9135 - test acc: 0.7664 - 13m 17s
batch: 1200/1563 - train loss: 1.5396 - test loss: 4.9809 - train acc: 0.9070 - test acc: 0.7640 - 13m 20s
batch: 1300/1563 - train loss: 1.5834 - test loss: 5.2019 - train acc: 0.9057 - test acc: 0.7534 - 13m 23s
batch: 1400/1563 - train loss: 1.4293 - test loss: 4.9401 - train acc: 0.9085 - test acc: 0.7686 - 13m 25s
batch: 1500/1563 - train loss: 1.5889 - test loss: 5.1280 - train acc: 0.9026 - test acc: 0.7683 - 13m 28s
batch: 1563/1563 - train loss: 1.5893 - te

batch: 100/1563 - train loss: 0.9084 - test loss: 6.0140 - train acc: 0.9439 - test acc: 0.7539 - 16m 38s
batch: 200/1563 - train loss: 0.9939 - test loss: 6.0741 - train acc: 0.9408 - test acc: 0.7566 - 16m 41s
batch: 300/1563 - train loss: 0.8404 - test loss: 6.2543 - train acc: 0.9496 - test acc: 0.7578 - 16m 44s
batch: 400/1563 - train loss: 1.0063 - test loss: 6.2980 - train acc: 0.9355 - test acc: 0.7602 - 16m 47s
batch: 500/1563 - train loss: 1.1518 - test loss: 5.6158 - train acc: 0.9395 - test acc: 0.7604 - 16m 50s
batch: 600/1563 - train loss: 1.0207 - test loss: 5.8877 - train acc: 0.9398 - test acc: 0.7563 - 16m 52s
batch: 700/1563 - train loss: 1.0461 - test loss: 5.9398 - train acc: 0.9380 - test acc: 0.7604 - 16m 56s
batch: 800/1563 - train loss: 1.1810 - test loss: 5.8329 - train acc: 0.9274 - test acc: 0.7634 - 16m 59s
batch: 900/1563 - train loss: 1.2748 - test loss: 5.7171 - train acc: 0.9173 - test acc: 0.7600 - 17m 1s
batch: 1000/1563 - train loss: 1.0610 - test lo

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 96, 32, 32]           7,296
         MaxPool2d-2           [-1, 96, 16, 16]               0
              ReLU-3           [-1, 96, 16, 16]               0
            Conv2d-4           [-1, 80, 16, 16]         192,080
              ReLU-5           [-1, 80, 16, 16]               0
         MaxPool2d-6             [-1, 80, 8, 8]               0
         Dropout2d-7             [-1, 80, 8, 8]               0
            Conv2d-8             [-1, 96, 8, 8]         192,096
              ReLU-9             [-1, 96, 8, 8]               0
        Dropout2d-10             [-1, 96, 8, 8]               0
           Conv2d-11             [-1, 64, 8, 8]         153,664
             ReLU-12             [-1, 64, 8, 8]               0
        Dropout2d-13             [-1, 64, 8, 8]               0
          Flatten-14                 [-

batch: 1200/1563 - train loss: 5.1707 - test loss: 4.9705 - train acc: 0.6847 - test acc: 0.6901 - 2m 44s
batch: 1300/1563 - train loss: 5.2038 - test loss: 5.2878 - train acc: 0.6807 - test acc: 0.6728 - 2m 47s
batch: 1400/1563 - train loss: 5.1295 - test loss: 5.1079 - train acc: 0.6816 - test acc: 0.6810 - 2m 50s
batch: 1500/1563 - train loss: 5.1134 - test loss: 5.3742 - train acc: 0.6757 - test acc: 0.6635 - 2m 52s
batch: 1563/1563 - train loss: 5.1323 - test loss: 5.1315 - train acc: 0.6722 - test acc: 0.6812 - 2m 55s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 5/100
batch: 100/1563 - train loss: 4.7439 - test loss: 4.8651 - train acc: 0.7023 - test acc: 0.6984 - 2m 58s
batch: 200/1563 - train loss: 4.7895 - test loss: 5.0872 - train acc: 0.6991 - test acc: 0.6829 - 3m 1s
batch: 300/1563 - train loss: 4.7930 - test loss: 4.8900 - train acc: 0.7063 - test acc: 0.6930 - 3m 4s
batch: 400/1563 - train loss: 4.8171 - test loss: 5.0

batch: 500/1563 - train loss: 3.0998 - test loss: 4.2150 - train acc: 0.8143 - test acc: 0.7527 - 6m 8s
batch: 600/1563 - train loss: 2.9841 - test loss: 4.1401 - train acc: 0.8115 - test acc: 0.7544 - 6m 11s
batch: 700/1563 - train loss: 3.0834 - test loss: 4.0009 - train acc: 0.8087 - test acc: 0.7638 - 6m 14s
batch: 800/1563 - train loss: 2.9639 - test loss: 4.2112 - train acc: 0.8099 - test acc: 0.7490 - 6m 17s
batch: 900/1563 - train loss: 2.9964 - test loss: 4.1006 - train acc: 0.8209 - test acc: 0.7527 - 6m 20s
batch: 1000/1563 - train loss: 3.0622 - test loss: 4.4000 - train acc: 0.8027 - test acc: 0.7457 - 6m 23s
batch: 1100/1563 - train loss: 3.0565 - test loss: 4.3210 - train acc: 0.8068 - test acc: 0.7467 - 6m 25s
batch: 1200/1563 - train loss: 2.9894 - test loss: 3.9696 - train acc: 0.8203 - test acc: 0.7664 - 6m 28s
batch: 1300/1563 - train loss: 3.0401 - test loss: 4.1481 - train acc: 0.8087 - test acc: 0.7585 - 6m 31s
batch: 1400/1563 - train loss: 3.1415 - test loss: 4

batch: 1500/1563 - train loss: 2.1835 - test loss: 4.4304 - train acc: 0.8650 - test acc: 0.7667 - 9m 38s
batch: 1563/1563 - train loss: 2.0469 - test loss: 4.5159 - train acc: 0.8672 - test acc: 0.7638 - 9m 40s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 14/100
batch: 100/1563 - train loss: 1.5844 - test loss: 4.4999 - train acc: 0.8969 - test acc: 0.7712 - 9m 43s
batch: 200/1563 - train loss: 1.5346 - test loss: 4.6690 - train acc: 0.9051 - test acc: 0.7728 - 9m 46s
batch: 300/1563 - train loss: 1.8060 - test loss: 4.5508 - train acc: 0.8870 - test acc: 0.7642 - 9m 49s
batch: 400/1563 - train loss: 1.7176 - test loss: 4.6186 - train acc: 0.9010 - test acc: 0.7705 - 9m 52s
batch: 500/1563 - train loss: 1.6237 - test loss: 4.6217 - train acc: 0.8979 - test acc: 0.7696 - 9m 55s
batch: 600/1563 - train loss: 1.6089 - test loss: 4.6501 - train acc: 0.8963 - test acc: 0.7672 - 9m 58s
batch: 700/1563 - train loss: 1.7647 - test loss: 4.6

batch: 800/1563 - train loss: 1.1395 - test loss: 5.3147 - train acc: 0.9252 - test acc: 0.7643 - 13m 8s
batch: 900/1563 - train loss: 1.2988 - test loss: 5.0876 - train acc: 0.9233 - test acc: 0.7681 - 13m 11s
batch: 1000/1563 - train loss: 1.3230 - test loss: 5.2976 - train acc: 0.9223 - test acc: 0.7590 - 13m 14s
batch: 1100/1563 - train loss: 1.3372 - test loss: 5.3402 - train acc: 0.9192 - test acc: 0.7525 - 13m 16s
batch: 1200/1563 - train loss: 1.3341 - test loss: 5.8847 - train acc: 0.9132 - test acc: 0.7570 - 13m 20s
batch: 1300/1563 - train loss: 1.5583 - test loss: 5.2287 - train acc: 0.9060 - test acc: 0.7549 - 13m 22s
batch: 1400/1563 - train loss: 1.4411 - test loss: 5.2267 - train acc: 0.9073 - test acc: 0.7597 - 13m 25s
batch: 1500/1563 - train loss: 1.5226 - test loss: 5.1350 - train acc: 0.9113 - test acc: 0.7621 - 13m 28s
batch: 1563/1563 - train loss: 1.4316 - test loss: 5.2487 - train acc: 0.9079 - test acc: 0.7635 - 13m 30s
GPU memory used: 0.02 GB - max: 3.19 GB 

batch: 100/1563 - train loss: 0.8095 - test loss: 5.8632 - train acc: 0.9489 - test acc: 0.7681 - 16m 39s
batch: 200/1563 - train loss: 0.8377 - test loss: 6.3522 - train acc: 0.9511 - test acc: 0.7555 - 16m 42s
batch: 300/1563 - train loss: 0.9580 - test loss: 6.0537 - train acc: 0.9408 - test acc: 0.7633 - 16m 45s
batch: 400/1563 - train loss: 0.8932 - test loss: 6.1102 - train acc: 0.9442 - test acc: 0.7613 - 16m 48s
batch: 500/1563 - train loss: 1.1812 - test loss: 6.0065 - train acc: 0.9279 - test acc: 0.7644 - 16m 51s
batch: 600/1563 - train loss: 0.9516 - test loss: 6.3174 - train acc: 0.9454 - test acc: 0.7637 - 16m 54s
batch: 700/1563 - train loss: 1.0018 - test loss: 6.3219 - train acc: 0.9389 - test acc: 0.7600 - 16m 57s
batch: 800/1563 - train loss: 1.1092 - test loss: 5.9670 - train acc: 0.9355 - test acc: 0.7563 - 17m 0s
batch: 900/1563 - train loss: 1.0022 - test loss: 6.0400 - train acc: 0.9383 - test acc: 0.7609 - 17m 3s
batch: 1000/1563 - train loss: 1.2031 - test los

batch: 100/1563 - train loss: 13.0263 - test loss: 13.0166 - train acc: 0.1028 - test acc: 0.1079 - 0m 1s
batch: 200/1563 - train loss: 12.9902 - test loss: 12.9103 - train acc: 0.1152 - test acc: 0.1004 - 0m 3s
batch: 300/1563 - train loss: 12.4810 - test loss: 13.0452 - train acc: 0.1735 - test acc: 0.2011 - 0m 6s
batch: 400/1563 - train loss: 12.0611 - test loss: 11.8333 - train acc: 0.2175 - test acc: 0.2594 - 0m 9s
batch: 500/1563 - train loss: 11.5106 - test loss: 11.4242 - train acc: 0.2444 - test acc: 0.2374 - 0m 11s
batch: 600/1563 - train loss: 10.8605 - test loss: 10.9863 - train acc: 0.2824 - test acc: 0.2800 - 0m 14s
batch: 700/1563 - train loss: 10.4955 - test loss: 10.4327 - train acc: 0.3055 - test acc: 0.3066 - 0m 17s
batch: 800/1563 - train loss: 10.2074 - test loss: 9.8758 - train acc: 0.3216 - test acc: 0.3269 - 0m 20s
batch: 900/1563 - train loss: 10.0626 - test loss: 9.8222 - train acc: 0.3244 - test acc: 0.3505 - 0m 23s
batch: 1000/1563 - train loss: 9.7704 - tes

batch: 1100/1563 - train loss: 4.7411 - test loss: 4.9748 - train acc: 0.7097 - test acc: 0.6928 - 3m 25s
batch: 1200/1563 - train loss: 4.6507 - test loss: 4.6711 - train acc: 0.7124 - test acc: 0.7113 - 3m 28s
batch: 1300/1563 - train loss: 4.7014 - test loss: 4.8886 - train acc: 0.6982 - test acc: 0.6955 - 3m 31s
batch: 1400/1563 - train loss: 4.8520 - test loss: 4.7460 - train acc: 0.6995 - test acc: 0.7037 - 3m 34s
batch: 1500/1563 - train loss: 4.8547 - test loss: 4.6894 - train acc: 0.6944 - test acc: 0.7102 - 3m 37s
batch: 1563/1563 - train loss: 4.7155 - test loss: 4.5999 - train acc: 0.7062 - test acc: 0.7207 - 3m 39s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.2127 - test loss: 4.4608 - train acc: 0.7403 - test acc: 0.7284 - 3m 42s
batch: 200/1563 - train loss: 3.9930 - test loss: 4.6810 - train acc: 0.7537 - test acc: 0.7150 - 3m 45s
batch: 300/1563 - train loss: 4.2511 - test loss: 

batch: 400/1563 - train loss: 2.7393 - test loss: 4.0672 - train acc: 0.8244 - test acc: 0.7588 - 6m 50s
batch: 500/1563 - train loss: 2.5438 - test loss: 4.1357 - train acc: 0.8403 - test acc: 0.7579 - 6m 53s
batch: 600/1563 - train loss: 2.5771 - test loss: 4.1025 - train acc: 0.8402 - test acc: 0.7549 - 6m 55s
batch: 700/1563 - train loss: 2.5518 - test loss: 4.1714 - train acc: 0.8456 - test acc: 0.7609 - 6m 58s
batch: 800/1563 - train loss: 2.7985 - test loss: 4.0910 - train acc: 0.8312 - test acc: 0.7603 - 7m 1s
batch: 900/1563 - train loss: 2.9346 - test loss: 4.2725 - train acc: 0.8209 - test acc: 0.7552 - 7m 4s
batch: 1000/1563 - train loss: 2.6750 - test loss: 4.2630 - train acc: 0.8294 - test acc: 0.7557 - 7m 7s
batch: 1100/1563 - train loss: 2.7751 - test loss: 4.1210 - train acc: 0.8262 - test acc: 0.7668 - 7m 10s
batch: 1200/1563 - train loss: 2.7980 - test loss: 3.9493 - train acc: 0.8315 - test acc: 0.7653 - 7m 13s
batch: 1300/1563 - train loss: 2.8555 - test loss: 4.07

batch: 1400/1563 - train loss: 2.0345 - test loss: 4.5230 - train acc: 0.8794 - test acc: 0.7650 - 10m 20s
batch: 1500/1563 - train loss: 1.9350 - test loss: 4.5353 - train acc: 0.8759 - test acc: 0.7604 - 10m 22s
batch: 1563/1563 - train loss: 1.9872 - test loss: 4.5058 - train acc: 0.8753 - test acc: 0.7651 - 10m 25s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.3606 - test loss: 4.8340 - train acc: 0.9192 - test acc: 0.7708 - 10m 28s
batch: 200/1563 - train loss: 1.4739 - test loss: 4.7608 - train acc: 0.9095 - test acc: 0.7634 - 10m 31s
batch: 300/1563 - train loss: 1.5165 - test loss: 4.9300 - train acc: 0.9041 - test acc: 0.7592 - 10m 34s
batch: 400/1563 - train loss: 1.5734 - test loss: 4.8347 - train acc: 0.8963 - test acc: 0.7669 - 10m 37s
batch: 500/1563 - train loss: 1.4066 - test loss: 4.9107 - train acc: 0.9130 - test acc: 0.7745 - 10m 39s
batch: 600/1563 - train loss: 1.6827 - test 

batch: 700/1563 - train loss: 1.1199 - test loss: 5.3799 - train acc: 0.9320 - test acc: 0.7671 - 13m 49s
batch: 800/1563 - train loss: 1.1472 - test loss: 5.3897 - train acc: 0.9327 - test acc: 0.7638 - 13m 52s
batch: 900/1563 - train loss: 1.1745 - test loss: 5.6706 - train acc: 0.9314 - test acc: 0.7623 - 13m 55s
batch: 1000/1563 - train loss: 1.2859 - test loss: 5.3425 - train acc: 0.9182 - test acc: 0.7591 - 13m 57s
batch: 1100/1563 - train loss: 1.3501 - test loss: 5.7131 - train acc: 0.9151 - test acc: 0.7536 - 14m 0s
batch: 1200/1563 - train loss: 1.2940 - test loss: 5.5960 - train acc: 0.9214 - test acc: 0.7605 - 14m 3s
batch: 1300/1563 - train loss: 1.4704 - test loss: 5.2352 - train acc: 0.9126 - test acc: 0.7606 - 14m 6s
batch: 1400/1563 - train loss: 1.1173 - test loss: 5.3633 - train acc: 0.9326 - test acc: 0.7567 - 14m 9s
batch: 1500/1563 - train loss: 1.2997 - test loss: 5.6603 - train acc: 0.9273 - test acc: 0.7586 - 14m 12s
batch: 1563/1563 - train loss: 1.4846 - test

batch: 100/1563 - train loss: 0.9590 - test loss: 5.8854 - train acc: 0.9430 - test acc: 0.7618 - 17m 23s
batch: 200/1563 - train loss: 0.8811 - test loss: 6.0296 - train acc: 0.9430 - test acc: 0.7562 - 17m 25s
batch: 300/1563 - train loss: 0.9654 - test loss: 6.4960 - train acc: 0.9433 - test acc: 0.7579 - 17m 28s
batch: 400/1563 - train loss: 0.8976 - test loss: 6.1803 - train acc: 0.9420 - test acc: 0.7649 - 17m 31s
batch: 500/1563 - train loss: 1.0897 - test loss: 6.1843 - train acc: 0.9364 - test acc: 0.7580 - 17m 34s
batch: 600/1563 - train loss: 0.9833 - test loss: 6.1314 - train acc: 0.9417 - test acc: 0.7630 - 17m 37s
batch: 700/1563 - train loss: 0.9478 - test loss: 6.1475 - train acc: 0.9458 - test acc: 0.7560 - 17m 40s
batch: 800/1563 - train loss: 1.1571 - test loss: 6.1375 - train acc: 0.9314 - test acc: 0.7564 - 17m 43s
batch: 900/1563 - train loss: 1.0221 - test loss: 6.1781 - train acc: 0.9383 - test acc: 0.7560 - 17m 46s
batch: 1000/1563 - train loss: 0.8686 - test l

batch: 200/1563 - train loss: 13.0104 - test loss: 12.9852 - train acc: 0.1140 - test acc: 0.1056 - 0m 3s
batch: 300/1563 - train loss: 12.6217 - test loss: 11.8654 - train acc: 0.1557 - test acc: 0.2208 - 0m 6s
batch: 400/1563 - train loss: 11.8386 - test loss: 11.4872 - train acc: 0.2222 - test acc: 0.2441 - 0m 9s
batch: 500/1563 - train loss: 11.4431 - test loss: 10.9713 - train acc: 0.2416 - test acc: 0.2929 - 0m 11s
batch: 600/1563 - train loss: 11.1394 - test loss: 10.4779 - train acc: 0.2781 - test acc: 0.3144 - 0m 14s
batch: 700/1563 - train loss: 10.8350 - test loss: 10.0165 - train acc: 0.2821 - test acc: 0.3314 - 0m 17s
batch: 800/1563 - train loss: 10.2849 - test loss: 10.0576 - train acc: 0.3166 - test acc: 0.3264 - 0m 20s
batch: 900/1563 - train loss: 10.1353 - test loss: 10.3794 - train acc: 0.3416 - test acc: 0.3390 - 0m 23s
batch: 1000/1563 - train loss: 9.9434 - test loss: 9.4237 - train acc: 0.3562 - test acc: 0.3893 - 0m 25s
batch: 1100/1563 - train loss: 9.2426 - t

batch: 1200/1563 - train loss: 4.6256 - test loss: 4.6200 - train acc: 0.7050 - test acc: 0.7150 - 3m 30s
batch: 1300/1563 - train loss: 4.5850 - test loss: 4.4580 - train acc: 0.7135 - test acc: 0.7259 - 3m 33s
batch: 1400/1563 - train loss: 4.5630 - test loss: 4.7412 - train acc: 0.7185 - test acc: 0.7060 - 3m 35s
batch: 1500/1563 - train loss: 4.5860 - test loss: 4.9932 - train acc: 0.7150 - test acc: 0.6890 - 3m 38s
batch: 1563/1563 - train loss: 4.4422 - test loss: 4.7498 - train acc: 0.7263 - test acc: 0.7001 - 3m 41s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.3979 - test loss: 4.7810 - train acc: 0.7360 - test acc: 0.7036 - 3m 44s
batch: 200/1563 - train loss: 4.1671 - test loss: 4.5279 - train acc: 0.7400 - test acc: 0.7200 - 3m 46s
batch: 300/1563 - train loss: 4.0003 - test loss: 4.6361 - train acc: 0.7515 - test acc: 0.7192 - 3m 49s
batch: 400/1563 - train loss: 4.1724 - test loss: 4

batch: 500/1563 - train loss: 2.5812 - test loss: 4.1907 - train acc: 0.8446 - test acc: 0.7562 - 6m 55s
batch: 600/1563 - train loss: 2.5751 - test loss: 4.1450 - train acc: 0.8488 - test acc: 0.7595 - 6m 58s
batch: 700/1563 - train loss: 2.7061 - test loss: 4.2559 - train acc: 0.8305 - test acc: 0.7542 - 7m 1s
batch: 800/1563 - train loss: 2.5253 - test loss: 4.1849 - train acc: 0.8406 - test acc: 0.7541 - 7m 4s
batch: 900/1563 - train loss: 2.8172 - test loss: 4.1572 - train acc: 0.8209 - test acc: 0.7569 - 7m 7s
batch: 1000/1563 - train loss: 2.7272 - test loss: 4.1612 - train acc: 0.8349 - test acc: 0.7585 - 7m 10s
batch: 1100/1563 - train loss: 2.8916 - test loss: 4.0105 - train acc: 0.8246 - test acc: 0.7626 - 7m 13s
batch: 1200/1563 - train loss: 2.8805 - test loss: 3.9216 - train acc: 0.8206 - test acc: 0.7700 - 7m 15s
batch: 1300/1563 - train loss: 2.7571 - test loss: 4.0278 - train acc: 0.8283 - test acc: 0.7629 - 7m 18s
batch: 1400/1563 - train loss: 2.7291 - test loss: 3.9

batch: 1500/1563 - train loss: 1.9213 - test loss: 4.4372 - train acc: 0.8782 - test acc: 0.7667 - 10m 27s
batch: 1563/1563 - train loss: 1.8690 - test loss: 4.5177 - train acc: 0.8841 - test acc: 0.7672 - 10m 30s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.4553 - test loss: 4.5445 - train acc: 0.9129 - test acc: 0.7664 - 10m 33s
batch: 200/1563 - train loss: 1.3138 - test loss: 4.9394 - train acc: 0.9192 - test acc: 0.7619 - 10m 36s
batch: 300/1563 - train loss: 1.4673 - test loss: 4.7228 - train acc: 0.9055 - test acc: 0.7709 - 10m 39s
batch: 400/1563 - train loss: 1.5618 - test loss: 5.1405 - train acc: 0.9010 - test acc: 0.7583 - 10m 42s
batch: 500/1563 - train loss: 1.5571 - test loss: 4.6506 - train acc: 0.9092 - test acc: 0.7640 - 10m 44s
batch: 600/1563 - train loss: 1.5345 - test loss: 4.9100 - train acc: 0.9051 - test acc: 0.7611 - 10m 47s
batch: 700/1563 - train loss: 1.6355 - test l

batch: 800/1563 - train loss: 1.2897 - test loss: 5.4195 - train acc: 0.9183 - test acc: 0.7620 - 13m 59s
batch: 900/1563 - train loss: 1.2740 - test loss: 5.1325 - train acc: 0.9220 - test acc: 0.7670 - 14m 2s
batch: 1000/1563 - train loss: 1.3708 - test loss: 5.3607 - train acc: 0.9105 - test acc: 0.7668 - 14m 5s
batch: 1100/1563 - train loss: 1.3102 - test loss: 5.0709 - train acc: 0.9217 - test acc: 0.7648 - 14m 8s
batch: 1200/1563 - train loss: 1.4044 - test loss: 5.0987 - train acc: 0.9160 - test acc: 0.7639 - 14m 11s
batch: 1300/1563 - train loss: 1.3132 - test loss: 5.0790 - train acc: 0.9214 - test acc: 0.7684 - 14m 13s
batch: 1400/1563 - train loss: 1.1561 - test loss: 5.1900 - train acc: 0.9305 - test acc: 0.7689 - 14m 17s
batch: 1500/1563 - train loss: 1.5297 - test loss: 5.1690 - train acc: 0.9148 - test acc: 0.7593 - 14m 20s
batch: 1563/1563 - train loss: 1.5466 - test loss: 5.1210 - train acc: 0.9026 - test acc: 0.7602 - 14m 22s
GPU memory used: 0.02 GB - max: 3.19 GB - 

batch: 100/1563 - train loss: 1.0479 - test loss: 6.2126 - train acc: 0.9336 - test acc: 0.7524 - 17m 32s
batch: 200/1563 - train loss: 0.9636 - test loss: 5.7271 - train acc: 0.9408 - test acc: 0.7626 - 17m 35s
batch: 300/1563 - train loss: 0.9790 - test loss: 5.7214 - train acc: 0.9418 - test acc: 0.7696 - 17m 38s
batch: 400/1563 - train loss: 0.8935 - test loss: 5.8047 - train acc: 0.9476 - test acc: 0.7702 - 17m 41s
batch: 500/1563 - train loss: 1.0250 - test loss: 5.8418 - train acc: 0.9433 - test acc: 0.7664 - 17m 44s
batch: 600/1563 - train loss: 1.0026 - test loss: 6.2083 - train acc: 0.9370 - test acc: 0.7634 - 17m 47s
batch: 700/1563 - train loss: 1.0758 - test loss: 6.2500 - train acc: 0.9305 - test acc: 0.7670 - 17m 50s
batch: 800/1563 - train loss: 1.0003 - test loss: 6.0514 - train acc: 0.9367 - test acc: 0.7699 - 17m 53s
batch: 900/1563 - train loss: 1.1266 - test loss: 5.8780 - train acc: 0.9323 - test acc: 0.7643 - 17m 56s
batch: 1000/1563 - train loss: 1.1637 - test l

batch: 500/1563 - train loss: 11.4674 - test loss: 11.5976 - train acc: 0.2444 - test acc: 0.2473 - 0m 12s
batch: 600/1563 - train loss: 10.6155 - test loss: 10.5510 - train acc: 0.3016 - test acc: 0.3149 - 0m 15s
batch: 700/1563 - train loss: 10.4019 - test loss: 9.9264 - train acc: 0.3059 - test acc: 0.3477 - 0m 17s
batch: 800/1563 - train loss: 10.1267 - test loss: 10.0518 - train acc: 0.3374 - test acc: 0.3332 - 0m 20s
batch: 900/1563 - train loss: 10.2225 - test loss: 9.8741 - train acc: 0.3234 - test acc: 0.3339 - 0m 23s
batch: 1000/1563 - train loss: 9.7433 - test loss: 9.6069 - train acc: 0.3605 - test acc: 0.3623 - 0m 26s
batch: 1100/1563 - train loss: 9.8521 - test loss: 9.2910 - train acc: 0.3569 - test acc: 0.4001 - 0m 29s
batch: 1200/1563 - train loss: 9.2775 - test loss: 8.9561 - train acc: 0.3865 - test acc: 0.4039 - 0m 32s
batch: 1300/1563 - train loss: 9.0878 - test loss: 9.0787 - train acc: 0.3948 - test acc: 0.4063 - 0m 34s
batch: 1400/1563 - train loss: 9.0853 - tes

batch: 1500/1563 - train loss: 4.5061 - test loss: 4.5491 - train acc: 0.7225 - test acc: 0.7187 - 3m 40s
batch: 1563/1563 - train loss: 4.6862 - test loss: 5.0981 - train acc: 0.7156 - test acc: 0.6735 - 3m 43s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.3010 - test loss: 4.7684 - train acc: 0.7341 - test acc: 0.7090 - 3m 46s
batch: 200/1563 - train loss: 4.1744 - test loss: 4.6816 - train acc: 0.7344 - test acc: 0.7087 - 3m 48s
batch: 300/1563 - train loss: 4.2900 - test loss: 4.4408 - train acc: 0.7313 - test acc: 0.7204 - 3m 51s
batch: 400/1563 - train loss: 4.1767 - test loss: 4.6577 - train acc: 0.7416 - test acc: 0.7102 - 3m 54s
batch: 500/1563 - train loss: 4.3261 - test loss: 4.6059 - train acc: 0.7344 - test acc: 0.7174 - 3m 57s
batch: 600/1563 - train loss: 4.2217 - test loss: 4.7573 - train acc: 0.7381 - test acc: 0.7047 - 4m 0s
batch: 700/1563 - train loss: 4.3508 - test loss: 4.474

batch: 800/1563 - train loss: 2.7403 - test loss: 4.0502 - train acc: 0.8296 - test acc: 0.7619 - 7m 6s
batch: 900/1563 - train loss: 2.6765 - test loss: 3.9718 - train acc: 0.8363 - test acc: 0.7616 - 7m 9s
batch: 1000/1563 - train loss: 2.7527 - test loss: 3.9093 - train acc: 0.8296 - test acc: 0.7676 - 7m 12s
batch: 1100/1563 - train loss: 2.7585 - test loss: 3.9562 - train acc: 0.8237 - test acc: 0.7653 - 7m 15s
batch: 1200/1563 - train loss: 2.8547 - test loss: 4.0260 - train acc: 0.8185 - test acc: 0.7601 - 7m 17s
batch: 1300/1563 - train loss: 2.9136 - test loss: 4.0909 - train acc: 0.8243 - test acc: 0.7548 - 7m 20s
batch: 1400/1563 - train loss: 2.8596 - test loss: 4.1910 - train acc: 0.8288 - test acc: 0.7546 - 7m 23s
batch: 1500/1563 - train loss: 2.8000 - test loss: 4.1194 - train acc: 0.8253 - test acc: 0.7561 - 7m 26s
batch: 1563/1563 - train loss: 2.9762 - test loss: 4.2491 - train acc: 0.8106 - test acc: 0.7487 - 7m 28s
GPU memory used: 0.02 GB - max: 3.19 GB - memory r

batch: 100/1563 - train loss: 1.3806 - test loss: 4.6774 - train acc: 0.9186 - test acc: 0.7684 - 10m 35s
batch: 200/1563 - train loss: 1.3542 - test loss: 4.8873 - train acc: 0.9142 - test acc: 0.7606 - 10m 37s
batch: 300/1563 - train loss: 1.7067 - test loss: 4.7808 - train acc: 0.8982 - test acc: 0.7621 - 10m 40s
batch: 400/1563 - train loss: 1.5977 - test loss: 4.6760 - train acc: 0.9023 - test acc: 0.7634 - 10m 43s
batch: 500/1563 - train loss: 1.4124 - test loss: 4.7136 - train acc: 0.9142 - test acc: 0.7698 - 10m 46s
batch: 600/1563 - train loss: 1.7323 - test loss: 5.0216 - train acc: 0.8948 - test acc: 0.7517 - 10m 49s
batch: 700/1563 - train loss: 1.6642 - test loss: 4.7271 - train acc: 0.8975 - test acc: 0.7539 - 10m 52s
batch: 800/1563 - train loss: 1.5049 - test loss: 4.9846 - train acc: 0.9039 - test acc: 0.7545 - 10m 55s
batch: 900/1563 - train loss: 1.6708 - test loss: 4.9307 - train acc: 0.8944 - test acc: 0.7607 - 10m 57s
batch: 1000/1563 - train loss: 1.6904 - test l

batch: 1100/1563 - train loss: 1.4334 - test loss: 5.2735 - train acc: 0.9142 - test acc: 0.7615 - 14m 7s
batch: 1200/1563 - train loss: 1.4261 - test loss: 5.0302 - train acc: 0.9158 - test acc: 0.7567 - 14m 10s
batch: 1300/1563 - train loss: 1.5158 - test loss: 5.1256 - train acc: 0.9032 - test acc: 0.7632 - 14m 13s
batch: 1400/1563 - train loss: 1.3168 - test loss: 5.6439 - train acc: 0.9195 - test acc: 0.7540 - 14m 16s
batch: 1500/1563 - train loss: 1.5508 - test loss: 5.1642 - train acc: 0.9082 - test acc: 0.7578 - 14m 18s
batch: 1563/1563 - train loss: 1.3854 - test loss: 5.1444 - train acc: 0.9136 - test acc: 0.7631 - 14m 21s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 20/100
batch: 100/1563 - train loss: 0.9915 - test loss: 5.4140 - train acc: 0.9418 - test acc: 0.7636 - 14m 24s
batch: 200/1563 - train loss: 0.9100 - test loss: 5.3490 - train acc: 0.9423 - test acc: 0.7717 - 14m 27s
batch: 300/1563 - train loss: 1.0058 - tes

batch: 400/1563 - train loss: 0.9457 - test loss: 6.2170 - train acc: 0.9423 - test acc: 0.7610 - 17m 38s
batch: 500/1563 - train loss: 1.0171 - test loss: 5.8481 - train acc: 0.9361 - test acc: 0.7663 - 17m 41s
batch: 600/1563 - train loss: 1.0725 - test loss: 5.7244 - train acc: 0.9361 - test acc: 0.7588 - 17m 44s
batch: 700/1563 - train loss: 0.9283 - test loss: 5.8274 - train acc: 0.9414 - test acc: 0.7601 - 17m 47s
batch: 800/1563 - train loss: 1.1109 - test loss: 5.9610 - train acc: 0.9355 - test acc: 0.7585 - 17m 50s
batch: 900/1563 - train loss: 1.0465 - test loss: 5.7925 - train acc: 0.9330 - test acc: 0.7651 - 17m 53s
batch: 1000/1563 - train loss: 1.0223 - test loss: 6.2569 - train acc: 0.9355 - test acc: 0.7579 - 17m 56s
batch: 1100/1563 - train loss: 0.9767 - test loss: 6.0682 - train acc: 0.9433 - test acc: 0.7587 - 17m 59s
batch: 1200/1563 - train loss: 1.0932 - test loss: 6.0920 - train acc: 0.9358 - test acc: 0.7557 - 18m 2s
batch: 1300/1563 - train loss: 1.2415 - test

batch: 700/1563 - train loss: 10.3969 - test loss: 9.9514 - train acc: 0.3153 - test acc: 0.3451 - 0m 17s
batch: 800/1563 - train loss: 10.3209 - test loss: 9.8841 - train acc: 0.3153 - test acc: 0.3547 - 0m 20s
batch: 900/1563 - train loss: 10.1583 - test loss: 9.6323 - train acc: 0.3218 - test acc: 0.3643 - 0m 23s
batch: 1000/1563 - train loss: 9.7875 - test loss: 9.4859 - train acc: 0.3487 - test acc: 0.3753 - 0m 26s
batch: 1100/1563 - train loss: 9.6318 - test loss: 9.3036 - train acc: 0.3666 - test acc: 0.3770 - 0m 28s
batch: 1200/1563 - train loss: 9.5021 - test loss: 8.9685 - train acc: 0.3794 - test acc: 0.4106 - 0m 31s
batch: 1300/1563 - train loss: 9.1511 - test loss: 9.1020 - train acc: 0.3954 - test acc: 0.3994 - 0m 34s
batch: 1400/1563 - train loss: 9.3513 - test loss: 8.7601 - train acc: 0.3919 - test acc: 0.4279 - 0m 37s
batch: 1500/1563 - train loss: 9.0041 - test loss: 8.6740 - train acc: 0.4116 - test acc: 0.4378 - 0m 40s
batch: 1563/1563 - train loss: 8.7813 - test l

batch: 100/1563 - train loss: 4.1405 - test loss: 4.8515 - train acc: 0.7422 - test acc: 0.7010 - 3m 44s
batch: 200/1563 - train loss: 4.2043 - test loss: 4.5791 - train acc: 0.7410 - test acc: 0.7224 - 3m 47s
batch: 300/1563 - train loss: 4.1927 - test loss: 4.6781 - train acc: 0.7412 - test acc: 0.7131 - 3m 50s
batch: 400/1563 - train loss: 4.3121 - test loss: 4.5321 - train acc: 0.7313 - test acc: 0.7217 - 3m 53s
batch: 500/1563 - train loss: 4.3502 - test loss: 4.5887 - train acc: 0.7337 - test acc: 0.7169 - 3m 55s
batch: 600/1563 - train loss: 4.2633 - test loss: 4.5660 - train acc: 0.7410 - test acc: 0.7176 - 3m 58s
batch: 700/1563 - train loss: 4.1251 - test loss: 4.5903 - train acc: 0.7450 - test acc: 0.7182 - 4m 1s
batch: 800/1563 - train loss: 4.1397 - test loss: 4.5299 - train acc: 0.7484 - test acc: 0.7247 - 4m 4s
batch: 900/1563 - train loss: 4.1316 - test loss: 4.4629 - train acc: 0.7494 - test acc: 0.7258 - 4m 7s
batch: 1000/1563 - train loss: 4.3551 - test loss: 4.6391 

batch: 1100/1563 - train loss: 2.7968 - test loss: 4.2299 - train acc: 0.8272 - test acc: 0.7512 - 7m 13s
batch: 1200/1563 - train loss: 2.8449 - test loss: 4.0816 - train acc: 0.8150 - test acc: 0.7601 - 7m 16s
batch: 1300/1563 - train loss: 2.7363 - test loss: 4.2273 - train acc: 0.8303 - test acc: 0.7578 - 7m 19s
batch: 1400/1563 - train loss: 2.8033 - test loss: 4.4176 - train acc: 0.8215 - test acc: 0.7452 - 7m 21s
batch: 1500/1563 - train loss: 2.8464 - test loss: 4.2255 - train acc: 0.8193 - test acc: 0.7563 - 7m 24s
batch: 1563/1563 - train loss: 2.9465 - test loss: 4.1600 - train acc: 0.8209 - test acc: 0.7556 - 7m 27s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 11/100
batch: 100/1563 - train loss: 2.2512 - test loss: 4.0938 - train acc: 0.8647 - test acc: 0.7643 - 7m 30s
batch: 200/1563 - train loss: 2.1284 - test loss: 4.3472 - train acc: 0.8744 - test acc: 0.7596 - 7m 33s
batch: 300/1563 - train loss: 2.3211 - test loss:

batch: 400/1563 - train loss: 1.4735 - test loss: 5.0355 - train acc: 0.9042 - test acc: 0.7635 - 10m 42s
batch: 500/1563 - train loss: 1.6194 - test loss: 4.8342 - train acc: 0.8994 - test acc: 0.7593 - 10m 45s
batch: 600/1563 - train loss: 1.6611 - test loss: 4.9030 - train acc: 0.8992 - test acc: 0.7518 - 10m 48s
batch: 700/1563 - train loss: 1.5572 - test loss: 4.9401 - train acc: 0.9048 - test acc: 0.7572 - 10m 51s
batch: 800/1563 - train loss: 1.6954 - test loss: 4.6781 - train acc: 0.8957 - test acc: 0.7614 - 10m 54s
batch: 900/1563 - train loss: 1.5855 - test loss: 4.8986 - train acc: 0.9016 - test acc: 0.7614 - 10m 56s
batch: 1000/1563 - train loss: 1.6315 - test loss: 4.8859 - train acc: 0.8950 - test acc: 0.7620 - 10m 59s
batch: 1100/1563 - train loss: 1.6430 - test loss: 4.8166 - train acc: 0.8976 - test acc: 0.7610 - 11m 2s
batch: 1200/1563 - train loss: 1.7437 - test loss: 4.8844 - train acc: 0.8916 - test acc: 0.7633 - 11m 5s
batch: 1300/1563 - train loss: 1.7419 - test 

batch: 1400/1563 - train loss: 1.1926 - test loss: 5.6637 - train acc: 0.9311 - test acc: 0.7585 - 14m 18s
batch: 1500/1563 - train loss: 1.2972 - test loss: 5.6792 - train acc: 0.9211 - test acc: 0.7549 - 14m 21s
batch: 1563/1563 - train loss: 1.3949 - test loss: 5.6689 - train acc: 0.9163 - test acc: 0.7506 - 14m 23s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 20/100
batch: 100/1563 - train loss: 0.9701 - test loss: 6.0079 - train acc: 0.9402 - test acc: 0.7576 - 14m 27s
batch: 200/1563 - train loss: 1.1016 - test loss: 5.5910 - train acc: 0.9333 - test acc: 0.7602 - 14m 30s
batch: 300/1563 - train loss: 0.8982 - test loss: 5.8500 - train acc: 0.9448 - test acc: 0.7598 - 14m 32s
batch: 400/1563 - train loss: 1.0403 - test loss: 5.8688 - train acc: 0.9351 - test acc: 0.7593 - 14m 35s
batch: 500/1563 - train loss: 1.1380 - test loss: 5.7565 - train acc: 0.9270 - test acc: 0.7572 - 14m 38s
batch: 600/1563 - train loss: 1.0632 - test 

batch: 700/1563 - train loss: 0.9854 - test loss: 6.2144 - train acc: 0.9424 - test acc: 0.7562 - 17m 53s
batch: 800/1563 - train loss: 0.9559 - test loss: 6.3093 - train acc: 0.9455 - test acc: 0.7612 - 17m 56s
batch: 900/1563 - train loss: 1.0667 - test loss: 6.2955 - train acc: 0.9383 - test acc: 0.7546 - 17m 59s
batch: 1000/1563 - train loss: 1.0731 - test loss: 6.2788 - train acc: 0.9352 - test acc: 0.7510 - 18m 2s
batch: 1100/1563 - train loss: 1.1714 - test loss: 6.3441 - train acc: 0.9283 - test acc: 0.7485 - 18m 5s
batch: 1200/1563 - train loss: 1.3037 - test loss: 6.0533 - train acc: 0.9202 - test acc: 0.7575 - 18m 9s
batch: 1300/1563 - train loss: 1.0317 - test loss: 6.1837 - train acc: 0.9311 - test acc: 0.7622 - 18m 12s
batch: 1400/1563 - train loss: 1.0809 - test loss: 6.2215 - train acc: 0.9345 - test acc: 0.7614 - 18m 14s
batch: 1500/1563 - train loss: 1.3297 - test loss: 6.0746 - train acc: 0.9201 - test acc: 0.7475 - 18m 17s
batch: 1563/1563 - train loss: 1.2970 - tes

batch: 1300/1563 - train loss: 9.2333 - test loss: 9.1632 - train acc: 0.3831 - test acc: 0.3902 - 0m 35s
batch: 1400/1563 - train loss: 9.3361 - test loss: 9.0797 - train acc: 0.3863 - test acc: 0.4028 - 0m 38s
batch: 1500/1563 - train loss: 9.0310 - test loss: 8.8975 - train acc: 0.3931 - test acc: 0.4034 - 0m 40s
batch: 1563/1563 - train loss: 8.9777 - test loss: 8.7660 - train acc: 0.4047 - test acc: 0.4208 - 0m 43s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 2/100
batch: 100/1563 - train loss: 8.9844 - test loss: 8.7026 - train acc: 0.4100 - test acc: 0.4338 - 0m 45s
batch: 200/1563 - train loss: 8.7081 - test loss: 8.4136 - train acc: 0.4294 - test acc: 0.4525 - 0m 48s
batch: 300/1563 - train loss: 8.5985 - test loss: 8.1892 - train acc: 0.4318 - test acc: 0.4623 - 0m 51s
batch: 400/1563 - train loss: 8.3580 - test loss: 8.1201 - train acc: 0.4537 - test acc: 0.4694 - 0m 54s
batch: 500/1563 - train loss: 8.1788 - test loss: 7.

batch: 600/1563 - train loss: 4.0128 - test loss: 4.4832 - train acc: 0.7569 - test acc: 0.7257 - 4m 0s
batch: 700/1563 - train loss: 4.4587 - test loss: 4.7427 - train acc: 0.7256 - test acc: 0.7080 - 4m 3s
batch: 800/1563 - train loss: 4.3676 - test loss: 4.4815 - train acc: 0.7250 - test acc: 0.7230 - 4m 6s
batch: 900/1563 - train loss: 4.4013 - test loss: 4.6140 - train acc: 0.7253 - test acc: 0.7153 - 4m 8s
batch: 1000/1563 - train loss: 4.4225 - test loss: 4.4938 - train acc: 0.7231 - test acc: 0.7228 - 4m 11s
batch: 1100/1563 - train loss: 4.3715 - test loss: 4.5328 - train acc: 0.7328 - test acc: 0.7184 - 4m 15s
batch: 1200/1563 - train loss: 4.3223 - test loss: 4.7492 - train acc: 0.7241 - test acc: 0.7019 - 4m 18s
batch: 1300/1563 - train loss: 4.3917 - test loss: 4.6960 - train acc: 0.7298 - test acc: 0.7124 - 4m 20s
batch: 1400/1563 - train loss: 4.3893 - test loss: 4.4186 - train acc: 0.7297 - test acc: 0.7278 - 4m 23s
batch: 1500/1563 - train loss: 4.1461 - test loss: 4.3

batch: 1563/1563 - train loss: 3.0208 - test loss: 3.9586 - train acc: 0.8109 - test acc: 0.7659 - 7m 31s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 11/100
batch: 100/1563 - train loss: 2.1348 - test loss: 4.1963 - train acc: 0.8712 - test acc: 0.7594 - 7m 34s
batch: 200/1563 - train loss: 2.3544 - test loss: 4.1990 - train acc: 0.8522 - test acc: 0.7592 - 7m 37s
batch: 300/1563 - train loss: 2.3147 - test loss: 4.1820 - train acc: 0.8556 - test acc: 0.7677 - 7m 40s
batch: 400/1563 - train loss: 2.2436 - test loss: 4.1166 - train acc: 0.8544 - test acc: 0.7666 - 7m 42s
batch: 500/1563 - train loss: 2.3142 - test loss: 4.2093 - train acc: 0.8503 - test acc: 0.7595 - 7m 45s
batch: 600/1563 - train loss: 2.3921 - test loss: 4.3706 - train acc: 0.8490 - test acc: 0.7591 - 7m 48s
batch: 700/1563 - train loss: 2.3806 - test loss: 4.3169 - train acc: 0.8466 - test acc: 0.7606 - 7m 51s
batch: 800/1563 - train loss: 2.5137 - test loss: 4.33

batch: 900/1563 - train loss: 1.5275 - test loss: 4.7936 - train acc: 0.9019 - test acc: 0.7604 - 11m 3s
batch: 1000/1563 - train loss: 1.7328 - test loss: 4.7336 - train acc: 0.8966 - test acc: 0.7592 - 11m 6s
batch: 1100/1563 - train loss: 1.6218 - test loss: 4.7943 - train acc: 0.9051 - test acc: 0.7651 - 11m 9s
batch: 1200/1563 - train loss: 1.8386 - test loss: 4.5839 - train acc: 0.8841 - test acc: 0.7633 - 11m 12s
batch: 1300/1563 - train loss: 1.8128 - test loss: 4.5974 - train acc: 0.8875 - test acc: 0.7642 - 11m 15s
batch: 1400/1563 - train loss: 1.6975 - test loss: 4.7678 - train acc: 0.8969 - test acc: 0.7618 - 11m 18s
batch: 1500/1563 - train loss: 1.7660 - test loss: 4.4984 - train acc: 0.8917 - test acc: 0.7661 - 11m 21s
batch: 1563/1563 - train loss: 1.9539 - test loss: 4.4863 - train acc: 0.8803 - test acc: 0.7624 - 11m 24s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 16/100
batch: 100/1563 - train loss: 1.3559 - test

batch: 200/1563 - train loss: 0.9886 - test loss: 5.3214 - train acc: 0.9358 - test acc: 0.7712 - 14m 36s
batch: 300/1563 - train loss: 1.0502 - test loss: 5.4334 - train acc: 0.9398 - test acc: 0.7614 - 14m 39s
batch: 400/1563 - train loss: 1.0545 - test loss: 5.3405 - train acc: 0.9352 - test acc: 0.7693 - 14m 42s
batch: 500/1563 - train loss: 1.1544 - test loss: 5.4966 - train acc: 0.9261 - test acc: 0.7646 - 14m 45s
batch: 600/1563 - train loss: 1.1509 - test loss: 5.5540 - train acc: 0.9267 - test acc: 0.7628 - 14m 48s
batch: 700/1563 - train loss: 1.0988 - test loss: 5.8421 - train acc: 0.9339 - test acc: 0.7546 - 14m 52s
batch: 800/1563 - train loss: 1.3280 - test loss: 5.3066 - train acc: 0.9236 - test acc: 0.7647 - 14m 55s
batch: 900/1563 - train loss: 1.3220 - test loss: 5.2651 - train acc: 0.9195 - test acc: 0.7619 - 14m 58s
batch: 1000/1563 - train loss: 1.2245 - test loss: 5.2395 - train acc: 0.9289 - test acc: 0.7580 - 15m 1s
batch: 1100/1563 - train loss: 1.3394 - test l

batch: 1200/1563 - train loss: 1.1234 - test loss: 5.9462 - train acc: 0.9330 - test acc: 0.7616 - 18m 15s
batch: 1300/1563 - train loss: 0.9991 - test loss: 6.3649 - train acc: 0.9389 - test acc: 0.7538 - 18m 18s
batch: 1400/1563 - train loss: 1.0530 - test loss: 5.8738 - train acc: 0.9329 - test acc: 0.7561 - 18m 21s
batch: 1500/1563 - train loss: 1.0694 - test loss: 5.9098 - train acc: 0.9376 - test acc: 0.7646 - 18m 24s
batch: 1563/1563 - train loss: 1.1496 - test loss: 5.9623 - train acc: 0.9323 - test acc: 0.7612 - 18m 27s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 25/100
batch: 100/1563 - train loss: 0.8351 - test loss: 6.1136 - train acc: 0.9477 - test acc: 0.7606 - 18m 30s
batch: 200/1563 - train loss: 1.0210 - test loss: 6.4132 - train acc: 0.9358 - test acc: 0.7533 - 18m 33s
batch: 300/1563 - train loss: 0.9599 - test loss: 6.0165 - train acc: 0.9449 - test acc: 0.7620 - 18m 36s
batch: 400/1563 - train loss: 0.9314 - tes

batch: 400/1563 - train loss: 8.1910 - test loss: 7.8484 - train acc: 0.4738 - test acc: 0.4866 - 0m 54s
batch: 500/1563 - train loss: 8.2861 - test loss: 7.6711 - train acc: 0.4547 - test acc: 0.5072 - 0m 57s
batch: 600/1563 - train loss: 8.1721 - test loss: 7.6502 - train acc: 0.4732 - test acc: 0.4992 - 1m 0s
batch: 700/1563 - train loss: 7.7523 - test loss: 7.4452 - train acc: 0.4994 - test acc: 0.5165 - 1m 3s
batch: 800/1563 - train loss: 7.8302 - test loss: 7.4469 - train acc: 0.5000 - test acc: 0.5255 - 1m 6s
batch: 900/1563 - train loss: 7.5275 - test loss: 7.4122 - train acc: 0.5131 - test acc: 0.5260 - 1m 8s
batch: 1000/1563 - train loss: 7.7670 - test loss: 7.3571 - train acc: 0.4875 - test acc: 0.5267 - 1m 11s
batch: 1100/1563 - train loss: 7.5280 - test loss: 7.1494 - train acc: 0.5209 - test acc: 0.5391 - 1m 14s
batch: 1200/1563 - train loss: 7.4428 - test loss: 7.1146 - train acc: 0.5181 - test acc: 0.5503 - 1m 17s
batch: 1300/1563 - train loss: 7.3583 - test loss: 6.999

batch: 1400/1563 - train loss: 3.9843 - test loss: 4.3206 - train acc: 0.7484 - test acc: 0.7346 - 4m 24s
batch: 1500/1563 - train loss: 4.0222 - test loss: 4.2234 - train acc: 0.7482 - test acc: 0.7461 - 4m 27s
batch: 1563/1563 - train loss: 4.2442 - test loss: 4.4140 - train acc: 0.7354 - test acc: 0.7310 - 4m 29s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 7/100
batch: 100/1563 - train loss: 3.5214 - test loss: 4.6255 - train acc: 0.7818 - test acc: 0.7213 - 4m 32s
batch: 200/1563 - train loss: 3.7543 - test loss: 4.4059 - train acc: 0.7684 - test acc: 0.7361 - 4m 35s
batch: 300/1563 - train loss: 3.8390 - test loss: 4.3229 - train acc: 0.7507 - test acc: 0.7342 - 4m 37s
batch: 400/1563 - train loss: 3.7924 - test loss: 4.1863 - train acc: 0.7693 - test acc: 0.7433 - 4m 41s
batch: 500/1563 - train loss: 3.6806 - test loss: 4.3097 - train acc: 0.7709 - test acc: 0.7405 - 4m 44s
batch: 600/1563 - train loss: 3.6638 - test loss: 4.3

batch: 700/1563 - train loss: 2.3054 - test loss: 4.2137 - train acc: 0.8497 - test acc: 0.7663 - 7m 52s
batch: 800/1563 - train loss: 2.3109 - test loss: 4.8644 - train acc: 0.8594 - test acc: 0.7442 - 7m 55s
batch: 900/1563 - train loss: 2.4877 - test loss: 4.0509 - train acc: 0.8443 - test acc: 0.7681 - 7m 58s
batch: 1000/1563 - train loss: 2.4148 - test loss: 4.1356 - train acc: 0.8531 - test acc: 0.7620 - 8m 1s
batch: 1100/1563 - train loss: 2.4631 - test loss: 4.0239 - train acc: 0.8437 - test acc: 0.7726 - 8m 4s
batch: 1200/1563 - train loss: 2.4333 - test loss: 4.0377 - train acc: 0.8393 - test acc: 0.7711 - 8m 6s
batch: 1300/1563 - train loss: 2.4414 - test loss: 4.0625 - train acc: 0.8437 - test acc: 0.7705 - 8m 9s
batch: 1400/1563 - train loss: 2.5106 - test loss: 4.1452 - train acc: 0.8428 - test acc: 0.7615 - 8m 12s
batch: 1500/1563 - train loss: 2.6322 - test loss: 4.1492 - train acc: 0.8349 - test acc: 0.7670 - 8m 15s
batch: 1563/1563 - train loss: 2.4788 - test loss: 3.

batch: 100/1563 - train loss: 1.3048 - test loss: 4.7028 - train acc: 0.9186 - test acc: 0.7693 - 11m 26s
batch: 200/1563 - train loss: 1.3344 - test loss: 4.7688 - train acc: 0.9139 - test acc: 0.7745 - 11m 29s
batch: 300/1563 - train loss: 1.2827 - test loss: 5.0041 - train acc: 0.9179 - test acc: 0.7723 - 11m 32s
batch: 400/1563 - train loss: 1.3291 - test loss: 4.8189 - train acc: 0.9217 - test acc: 0.7688 - 11m 35s
batch: 500/1563 - train loss: 1.5601 - test loss: 4.9297 - train acc: 0.9101 - test acc: 0.7622 - 11m 38s
batch: 600/1563 - train loss: 1.4730 - test loss: 4.6411 - train acc: 0.9101 - test acc: 0.7646 - 11m 41s
batch: 700/1563 - train loss: 1.5414 - test loss: 4.9463 - train acc: 0.9032 - test acc: 0.7606 - 11m 44s
batch: 800/1563 - train loss: 1.6540 - test loss: 4.6891 - train acc: 0.8994 - test acc: 0.7681 - 11m 46s
batch: 900/1563 - train loss: 1.6119 - test loss: 4.6960 - train acc: 0.8988 - test acc: 0.7663 - 11m 50s
batch: 1000/1563 - train loss: 1.4393 - test l

batch: 1100/1563 - train loss: 1.5069 - test loss: 5.1119 - train acc: 0.9089 - test acc: 0.7655 - 15m 2s
batch: 1200/1563 - train loss: 1.3165 - test loss: 5.0365 - train acc: 0.9186 - test acc: 0.7653 - 15m 5s
batch: 1300/1563 - train loss: 1.3329 - test loss: 5.4829 - train acc: 0.9210 - test acc: 0.7562 - 15m 8s
batch: 1400/1563 - train loss: 1.3056 - test loss: 5.2428 - train acc: 0.9192 - test acc: 0.7680 - 15m 11s
batch: 1500/1563 - train loss: 1.2641 - test loss: 5.4817 - train acc: 0.9236 - test acc: 0.7692 - 15m 14s
batch: 1563/1563 - train loss: 1.4000 - test loss: 5.2480 - train acc: 0.9101 - test acc: 0.7596 - 15m 17s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 21/100
batch: 100/1563 - train loss: 0.9386 - test loss: 5.3006 - train acc: 0.9408 - test acc: 0.7750 - 15m 20s
batch: 200/1563 - train loss: 1.1531 - test loss: 5.5018 - train acc: 0.9308 - test acc: 0.7712 - 15m 23s
batch: 300/1563 - train loss: 1.0648 - test 

batch: 400/1563 - train loss: 0.9752 - test loss: 6.2270 - train acc: 0.9418 - test acc: 0.7578 - 18m 37s
batch: 500/1563 - train loss: 0.9238 - test loss: 6.3370 - train acc: 0.9433 - test acc: 0.7635 - 18m 40s
batch: 600/1563 - train loss: 1.0311 - test loss: 5.7393 - train acc: 0.9417 - test acc: 0.7658 - 18m 43s
batch: 700/1563 - train loss: 0.9727 - test loss: 5.8815 - train acc: 0.9442 - test acc: 0.7659 - 18m 46s
batch: 800/1563 - train loss: 0.8571 - test loss: 5.9677 - train acc: 0.9467 - test acc: 0.7633 - 18m 49s
batch: 900/1563 - train loss: 1.0334 - test loss: 5.9187 - train acc: 0.9370 - test acc: 0.7577 - 18m 52s
batch: 1000/1563 - train loss: 1.1460 - test loss: 6.3105 - train acc: 0.9333 - test acc: 0.7497 - 18m 55s
batch: 1100/1563 - train loss: 1.0702 - test loss: 6.1071 - train acc: 0.9355 - test acc: 0.7581 - 18m 58s
batch: 1200/1563 - train loss: 1.0664 - test loss: 6.0313 - train acc: 0.9417 - test acc: 0.7610 - 19m 1s
batch: 1300/1563 - train loss: 1.0579 - test

batch: 1100/1563 - train loss: 7.4170 - test loss: 7.4275 - train acc: 0.5231 - test acc: 0.5173 - 1m 13s
batch: 1200/1563 - train loss: 7.1261 - test loss: 6.9961 - train acc: 0.5406 - test acc: 0.5457 - 1m 16s
batch: 1300/1563 - train loss: 7.0516 - test loss: 7.3481 - train acc: 0.5434 - test acc: 0.5330 - 1m 19s
batch: 1400/1563 - train loss: 7.2726 - test loss: 6.7571 - train acc: 0.5446 - test acc: 0.5677 - 1m 22s
batch: 1500/1563 - train loss: 7.0068 - test loss: 6.9570 - train acc: 0.5662 - test acc: 0.5585 - 1m 25s
batch: 1563/1563 - train loss: 7.1963 - test loss: 6.7581 - train acc: 0.5416 - test acc: 0.5571 - 1m 27s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 3/100
batch: 100/1563 - train loss: 6.6952 - test loss: 6.6413 - train acc: 0.5747 - test acc: 0.5707 - 1m 30s
batch: 200/1563 - train loss: 6.6783 - test loss: 6.2582 - train acc: 0.5737 - test acc: 0.6028 - 1m 33s
batch: 300/1563 - train loss: 6.4979 - test loss: 

batch: 400/1563 - train loss: 3.6159 - test loss: 4.4060 - train acc: 0.7687 - test acc: 0.7307 - 4m 39s
batch: 500/1563 - train loss: 3.7948 - test loss: 4.1566 - train acc: 0.7643 - test acc: 0.7465 - 4m 42s
batch: 600/1563 - train loss: 3.6382 - test loss: 4.2828 - train acc: 0.7737 - test acc: 0.7435 - 4m 45s
batch: 700/1563 - train loss: 3.7089 - test loss: 4.2524 - train acc: 0.7565 - test acc: 0.7425 - 4m 48s
batch: 800/1563 - train loss: 3.8240 - test loss: 4.2744 - train acc: 0.7693 - test acc: 0.7385 - 4m 50s
batch: 900/1563 - train loss: 3.9664 - test loss: 4.3034 - train acc: 0.7528 - test acc: 0.7347 - 4m 54s
batch: 1000/1563 - train loss: 3.9842 - test loss: 4.1806 - train acc: 0.7569 - test acc: 0.7470 - 4m 56s
batch: 1100/1563 - train loss: 3.7028 - test loss: 4.1344 - train acc: 0.7681 - test acc: 0.7490 - 4m 59s
batch: 1200/1563 - train loss: 3.6681 - test loss: 4.2723 - train acc: 0.7728 - test acc: 0.7405 - 5m 2s
batch: 1300/1563 - train loss: 3.7613 - test loss: 4.

batch: 1400/1563 - train loss: 2.6597 - test loss: 4.2119 - train acc: 0.8337 - test acc: 0.7664 - 8m 11s
batch: 1500/1563 - train loss: 2.4625 - test loss: 4.1893 - train acc: 0.8465 - test acc: 0.7602 - 8m 14s
batch: 1563/1563 - train loss: 2.5499 - test loss: 4.1696 - train acc: 0.8402 - test acc: 0.7641 - 8m 17s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 12/100
batch: 100/1563 - train loss: 2.0764 - test loss: 4.0851 - train acc: 0.8694 - test acc: 0.7715 - 8m 19s
batch: 200/1563 - train loss: 1.8291 - test loss: 4.3740 - train acc: 0.8891 - test acc: 0.7687 - 8m 22s
batch: 300/1563 - train loss: 2.1135 - test loss: 4.2087 - train acc: 0.8672 - test acc: 0.7676 - 8m 25s
batch: 400/1563 - train loss: 2.0944 - test loss: 4.2733 - train acc: 0.8681 - test acc: 0.7638 - 8m 28s
batch: 500/1563 - train loss: 1.9688 - test loss: 4.4016 - train acc: 0.8754 - test acc: 0.7607 - 8m 31s
batch: 600/1563 - train loss: 2.2314 - test loss: 4.

batch: 700/1563 - train loss: 1.5428 - test loss: 4.9394 - train acc: 0.9026 - test acc: 0.7599 - 11m 42s
batch: 800/1563 - train loss: 1.3801 - test loss: 4.8512 - train acc: 0.9171 - test acc: 0.7688 - 11m 45s
batch: 900/1563 - train loss: 1.4907 - test loss: 4.9165 - train acc: 0.9073 - test acc: 0.7608 - 11m 48s
batch: 1000/1563 - train loss: 1.4802 - test loss: 4.9809 - train acc: 0.9126 - test acc: 0.7651 - 11m 51s
batch: 1100/1563 - train loss: 1.7487 - test loss: 4.6806 - train acc: 0.8857 - test acc: 0.7597 - 11m 54s
batch: 1200/1563 - train loss: 1.6088 - test loss: 4.7636 - train acc: 0.8941 - test acc: 0.7713 - 11m 57s
batch: 1300/1563 - train loss: 1.6822 - test loss: 4.8667 - train acc: 0.8885 - test acc: 0.7615 - 12m 0s
batch: 1400/1563 - train loss: 1.7514 - test loss: 4.7500 - train acc: 0.8985 - test acc: 0.7651 - 12m 3s
batch: 1500/1563 - train loss: 1.6039 - test loss: 4.7177 - train acc: 0.9032 - test acc: 0.7710 - 12m 6s
batch: 1563/1563 - train loss: 1.6521 - tes

batch: 100/1563 - train loss: 1.0891 - test loss: 5.5235 - train acc: 0.9326 - test acc: 0.7662 - 15m 19s
batch: 200/1563 - train loss: 0.9227 - test loss: 5.4635 - train acc: 0.9411 - test acc: 0.7664 - 15m 22s
batch: 300/1563 - train loss: 1.0775 - test loss: 5.6436 - train acc: 0.9333 - test acc: 0.7597 - 15m 24s
batch: 400/1563 - train loss: 0.9724 - test loss: 5.7351 - train acc: 0.9424 - test acc: 0.7572 - 15m 27s
batch: 500/1563 - train loss: 1.0711 - test loss: 6.0017 - train acc: 0.9358 - test acc: 0.7595 - 15m 31s
batch: 600/1563 - train loss: 1.2101 - test loss: 5.6898 - train acc: 0.9239 - test acc: 0.7495 - 15m 34s
batch: 700/1563 - train loss: 1.1864 - test loss: 5.9001 - train acc: 0.9258 - test acc: 0.7470 - 15m 36s
batch: 800/1563 - train loss: 1.0471 - test loss: 6.0117 - train acc: 0.9332 - test acc: 0.7629 - 15m 40s
batch: 900/1563 - train loss: 1.0260 - test loss: 5.5380 - train acc: 0.9345 - test acc: 0.7677 - 15m 43s
batch: 1000/1563 - train loss: 1.2402 - test l

batch: 1100/1563 - train loss: 1.1324 - test loss: 6.0817 - train acc: 0.9314 - test acc: 0.7586 - 18m 58s
batch: 1200/1563 - train loss: 1.2369 - test loss: 6.1049 - train acc: 0.9211 - test acc: 0.7611 - 19m 1s
batch: 1300/1563 - train loss: 1.0186 - test loss: 5.8648 - train acc: 0.9370 - test acc: 0.7636 - 19m 4s
batch: 1400/1563 - train loss: 1.0320 - test loss: 6.2078 - train acc: 0.9364 - test acc: 0.7600 - 19m 7s
batch: 1500/1563 - train loss: 1.2180 - test loss: 6.0813 - train acc: 0.9314 - test acc: 0.7466 - 19m 10s
batch: 1563/1563 - train loss: 1.2197 - test loss: 5.9969 - train acc: 0.9257 - test acc: 0.7621 - 19m 13s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 26/100
batch: 100/1563 - train loss: 0.8970 - test loss: 6.1541 - train acc: 0.9417 - test acc: 0.7611 - 19m 16s
batch: 200/1563 - train loss: 0.8259 - test loss: 6.3901 - train acc: 0.9518 - test acc: 0.7618 - 19m 19s
batch: 300/1563 - train loss: 0.7682 - test 

batch: 300/1563 - train loss: 6.6755 - test loss: 6.4172 - train acc: 0.5725 - test acc: 0.5912 - 1m 37s
batch: 400/1563 - train loss: 6.4654 - test loss: 6.2988 - train acc: 0.5875 - test acc: 0.5932 - 1m 39s
batch: 500/1563 - train loss: 6.3660 - test loss: 6.3216 - train acc: 0.5919 - test acc: 0.5945 - 1m 42s
batch: 600/1563 - train loss: 6.5047 - test loss: 6.2642 - train acc: 0.5953 - test acc: 0.6061 - 1m 46s
batch: 700/1563 - train loss: 6.2737 - test loss: 6.3011 - train acc: 0.6137 - test acc: 0.5995 - 1m 48s
batch: 800/1563 - train loss: 6.4430 - test loss: 6.0907 - train acc: 0.6019 - test acc: 0.6150 - 1m 51s
batch: 900/1563 - train loss: 6.2403 - test loss: 6.1866 - train acc: 0.6018 - test acc: 0.6133 - 1m 54s
batch: 1000/1563 - train loss: 6.1297 - test loss: 6.0413 - train acc: 0.6059 - test acc: 0.6199 - 1m 57s
batch: 1100/1563 - train loss: 6.0658 - test loss: 5.7918 - train acc: 0.6179 - test acc: 0.6333 - 2m 0s
batch: 1200/1563 - train loss: 5.8758 - test loss: 6.0

batch: 1300/1563 - train loss: 3.8204 - test loss: 4.1204 - train acc: 0.7650 - test acc: 0.7496 - 5m 7s
batch: 1400/1563 - train loss: 3.8151 - test loss: 4.2522 - train acc: 0.7597 - test acc: 0.7408 - 5m 10s
batch: 1500/1563 - train loss: 3.8360 - test loss: 4.2022 - train acc: 0.7627 - test acc: 0.7446 - 5m 13s
batch: 1563/1563 - train loss: 3.7411 - test loss: 4.1257 - train acc: 0.7753 - test acc: 0.7480 - 5m 15s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 8/100
batch: 100/1563 - train loss: 3.3056 - test loss: 4.1934 - train acc: 0.7959 - test acc: 0.7461 - 5m 18s
batch: 200/1563 - train loss: 3.3024 - test loss: 4.4219 - train acc: 0.7977 - test acc: 0.7386 - 5m 21s
batch: 300/1563 - train loss: 3.2588 - test loss: 4.3550 - train acc: 0.7949 - test acc: 0.7379 - 5m 24s
batch: 400/1563 - train loss: 3.2698 - test loss: 4.2168 - train acc: 0.7962 - test acc: 0.7450 - 5m 27s
batch: 500/1563 - train loss: 3.3454 - test loss: 4.2

batch: 600/1563 - train loss: 2.0811 - test loss: 4.5379 - train acc: 0.8669 - test acc: 0.7583 - 8m 36s
batch: 700/1563 - train loss: 2.2672 - test loss: 4.3016 - train acc: 0.8553 - test acc: 0.7607 - 8m 39s
batch: 800/1563 - train loss: 2.1764 - test loss: 4.3572 - train acc: 0.8665 - test acc: 0.7664 - 8m 42s
batch: 900/1563 - train loss: 2.1802 - test loss: 4.3173 - train acc: 0.8590 - test acc: 0.7665 - 8m 45s
batch: 1000/1563 - train loss: 2.1973 - test loss: 4.4812 - train acc: 0.8603 - test acc: 0.7613 - 8m 48s
batch: 1100/1563 - train loss: 2.3058 - test loss: 4.2323 - train acc: 0.8525 - test acc: 0.7666 - 8m 51s
batch: 1200/1563 - train loss: 2.1243 - test loss: 4.3055 - train acc: 0.8700 - test acc: 0.7707 - 8m 54s
batch: 1300/1563 - train loss: 2.3847 - test loss: 4.3589 - train acc: 0.8493 - test acc: 0.7561 - 8m 57s
batch: 1400/1563 - train loss: 2.2200 - test loss: 4.4694 - train acc: 0.8541 - test acc: 0.7589 - 9m 0s
batch: 1500/1563 - train loss: 2.4576 - test loss: 

batch: 1563/1563 - train loss: 1.8074 - test loss: 4.8872 - train acc: 0.8838 - test acc: 0.7568 - 12m 11s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 17/100
batch: 100/1563 - train loss: 1.3691 - test loss: 4.7377 - train acc: 0.9198 - test acc: 0.7702 - 12m 14s
batch: 200/1563 - train loss: 1.1341 - test loss: 5.3411 - train acc: 0.9289 - test acc: 0.7580 - 12m 17s
batch: 300/1563 - train loss: 1.2966 - test loss: 5.4643 - train acc: 0.9220 - test acc: 0.7565 - 12m 20s
batch: 400/1563 - train loss: 1.1507 - test loss: 5.1229 - train acc: 0.9289 - test acc: 0.7677 - 12m 23s
batch: 500/1563 - train loss: 1.2683 - test loss: 5.2146 - train acc: 0.9239 - test acc: 0.7641 - 12m 26s
batch: 600/1563 - train loss: 1.3984 - test loss: 5.0398 - train acc: 0.9126 - test acc: 0.7606 - 12m 29s
batch: 700/1563 - train loss: 1.4379 - test loss: 5.0290 - train acc: 0.9117 - test acc: 0.7557 - 12m 32s
batch: 800/1563 - train loss: 1.2343 - test lo

batch: 900/1563 - train loss: 1.3008 - test loss: 5.5323 - train acc: 0.9227 - test acc: 0.7609 - 15m 45s
batch: 1000/1563 - train loss: 1.2706 - test loss: 5.4191 - train acc: 0.9229 - test acc: 0.7660 - 15m 48s
batch: 1100/1563 - train loss: 1.2422 - test loss: 5.4409 - train acc: 0.9245 - test acc: 0.7622 - 15m 51s
batch: 1200/1563 - train loss: 1.2271 - test loss: 5.5468 - train acc: 0.9270 - test acc: 0.7594 - 15m 54s
batch: 1300/1563 - train loss: 1.1993 - test loss: 5.7999 - train acc: 0.9276 - test acc: 0.7547 - 15m 57s
batch: 1400/1563 - train loss: 1.3844 - test loss: 5.4083 - train acc: 0.9176 - test acc: 0.7616 - 16m 0s
batch: 1500/1563 - train loss: 1.2416 - test loss: 5.3979 - train acc: 0.9257 - test acc: 0.7603 - 16m 2s
batch: 1563/1563 - train loss: 1.2189 - test loss: 5.4800 - train acc: 0.9226 - test acc: 0.7625 - 16m 5s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 22/100
batch: 100/1563 - train loss: 0.9358 - test

batch: 200/1563 - train loss: 0.9348 - test loss: 6.1439 - train acc: 0.9443 - test acc: 0.7584 - 19m 21s
batch: 300/1563 - train loss: 0.9150 - test loss: 6.1095 - train acc: 0.9471 - test acc: 0.7657 - 19m 24s
batch: 400/1563 - train loss: 0.9415 - test loss: 6.0273 - train acc: 0.9421 - test acc: 0.7673 - 19m 27s
batch: 500/1563 - train loss: 0.8122 - test loss: 6.3495 - train acc: 0.9505 - test acc: 0.7684 - 19m 30s
batch: 600/1563 - train loss: 0.8688 - test loss: 7.0018 - train acc: 0.9468 - test acc: 0.7557 - 19m 33s
batch: 700/1563 - train loss: 0.8560 - test loss: 6.4102 - train acc: 0.9474 - test acc: 0.7680 - 19m 36s
batch: 800/1563 - train loss: 0.9406 - test loss: 6.1800 - train acc: 0.9467 - test acc: 0.7664 - 19m 39s
batch: 900/1563 - train loss: 1.0295 - test loss: 6.4638 - train acc: 0.9380 - test acc: 0.7584 - 19m 42s
batch: 1000/1563 - train loss: 1.1075 - test loss: 6.2206 - train acc: 0.9339 - test acc: 0.7627 - 19m 45s
batch: 1100/1563 - train loss: 1.1615 - test 

batch: 1200/1563 - train loss: 6.2407 - test loss: 5.9233 - train acc: 0.6091 - test acc: 0.6273 - 2m 2s
batch: 1300/1563 - train loss: 6.1640 - test loss: 5.9991 - train acc: 0.6081 - test acc: 0.6300 - 2m 5s
batch: 1400/1563 - train loss: 6.1299 - test loss: 5.8667 - train acc: 0.6119 - test acc: 0.6334 - 2m 8s
batch: 1500/1563 - train loss: 5.9135 - test loss: 5.6447 - train acc: 0.6265 - test acc: 0.6446 - 2m 11s
batch: 1563/1563 - train loss: 5.8774 - test loss: 6.5831 - train acc: 0.6325 - test acc: 0.5873 - 2m 13s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 4/100
batch: 100/1563 - train loss: 5.8475 - test loss: 5.7737 - train acc: 0.6328 - test acc: 0.6386 - 2m 16s
batch: 200/1563 - train loss: 5.6723 - test loss: 5.5111 - train acc: 0.6447 - test acc: 0.6543 - 2m 19s
batch: 300/1563 - train loss: 5.6397 - test loss: 5.7615 - train acc: 0.6484 - test acc: 0.6387 - 2m 21s
batch: 400/1563 - train loss: 5.5710 - test loss: 5.54

batch: 500/1563 - train loss: 3.4290 - test loss: 4.1724 - train acc: 0.7815 - test acc: 0.7506 - 5m 28s
batch: 600/1563 - train loss: 3.4847 - test loss: 4.2615 - train acc: 0.7778 - test acc: 0.7454 - 5m 31s
batch: 700/1563 - train loss: 3.3746 - test loss: 4.4779 - train acc: 0.8009 - test acc: 0.7311 - 5m 34s
batch: 800/1563 - train loss: 3.5495 - test loss: 4.1876 - train acc: 0.7824 - test acc: 0.7455 - 5m 37s
batch: 900/1563 - train loss: 3.3579 - test loss: 4.2168 - train acc: 0.7881 - test acc: 0.7458 - 5m 40s
batch: 1000/1563 - train loss: 3.3564 - test loss: 4.2824 - train acc: 0.7937 - test acc: 0.7459 - 5m 42s
batch: 1100/1563 - train loss: 3.2841 - test loss: 4.2736 - train acc: 0.7916 - test acc: 0.7439 - 5m 45s
batch: 1200/1563 - train loss: 3.4384 - test loss: 4.2559 - train acc: 0.7896 - test acc: 0.7453 - 5m 48s
batch: 1300/1563 - train loss: 3.6062 - test loss: 4.1813 - train acc: 0.7728 - test acc: 0.7439 - 5m 51s
batch: 1400/1563 - train loss: 3.3989 - test loss: 

batch: 1500/1563 - train loss: 2.2349 - test loss: 4.4983 - train acc: 0.8587 - test acc: 0.7529 - 9m 0s
batch: 1563/1563 - train loss: 2.4934 - test loss: 4.1899 - train acc: 0.8453 - test acc: 0.7599 - 9m 2s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 13/100
batch: 100/1563 - train loss: 1.6015 - test loss: 4.5113 - train acc: 0.9007 - test acc: 0.7606 - 9m 5s
batch: 200/1563 - train loss: 1.8829 - test loss: 4.3937 - train acc: 0.8850 - test acc: 0.7606 - 9m 8s
batch: 300/1563 - train loss: 2.0042 - test loss: 4.3297 - train acc: 0.8707 - test acc: 0.7550 - 9m 11s
batch: 400/1563 - train loss: 1.7568 - test loss: 4.5805 - train acc: 0.8903 - test acc: 0.7581 - 9m 14s
batch: 500/1563 - train loss: 1.8235 - test loss: 4.5771 - train acc: 0.8825 - test acc: 0.7596 - 9m 17s
batch: 600/1563 - train loss: 2.0506 - test loss: 4.4583 - train acc: 0.8709 - test acc: 0.7590 - 9m 20s
batch: 700/1563 - train loss: 2.1852 - test loss: 4.4173 

batch: 800/1563 - train loss: 1.5407 - test loss: 5.0174 - train acc: 0.9016 - test acc: 0.7602 - 12m 32s
batch: 900/1563 - train loss: 1.3772 - test loss: 5.1349 - train acc: 0.9167 - test acc: 0.7594 - 12m 35s
batch: 1000/1563 - train loss: 1.4841 - test loss: 5.2224 - train acc: 0.9028 - test acc: 0.7530 - 12m 38s
batch: 1100/1563 - train loss: 1.3567 - test loss: 5.3089 - train acc: 0.9195 - test acc: 0.7555 - 12m 40s
batch: 1200/1563 - train loss: 1.4494 - test loss: 4.9339 - train acc: 0.9067 - test acc: 0.7623 - 12m 43s
batch: 1300/1563 - train loss: 1.5447 - test loss: 5.0760 - train acc: 0.9026 - test acc: 0.7616 - 12m 47s
batch: 1400/1563 - train loss: 1.5332 - test loss: 4.9273 - train acc: 0.9042 - test acc: 0.7631 - 12m 50s
batch: 1500/1563 - train loss: 1.5054 - test loss: 5.2824 - train acc: 0.9092 - test acc: 0.7573 - 12m 52s
batch: 1563/1563 - train loss: 1.4509 - test loss: 5.0650 - train acc: 0.9035 - test acc: 0.7600 - 12m 55s
GPU memory used: 0.02 GB - max: 3.19 GB

batch: 100/1563 - train loss: 0.8483 - test loss: 5.9220 - train acc: 0.9502 - test acc: 0.7607 - 16m 6s
batch: 200/1563 - train loss: 0.9960 - test loss: 5.7756 - train acc: 0.9436 - test acc: 0.7575 - 16m 8s
batch: 300/1563 - train loss: 0.8830 - test loss: 5.8179 - train acc: 0.9442 - test acc: 0.7583 - 16m 11s
batch: 400/1563 - train loss: 0.9543 - test loss: 6.1316 - train acc: 0.9436 - test acc: 0.7565 - 16m 14s
batch: 500/1563 - train loss: 0.9367 - test loss: 5.9998 - train acc: 0.9465 - test acc: 0.7644 - 16m 18s
batch: 600/1563 - train loss: 1.0828 - test loss: 5.7187 - train acc: 0.9382 - test acc: 0.7599 - 16m 21s
batch: 700/1563 - train loss: 1.0507 - test loss: 5.9625 - train acc: 0.9320 - test acc: 0.7582 - 16m 24s
batch: 800/1563 - train loss: 1.0881 - test loss: 5.7372 - train acc: 0.9364 - test acc: 0.7567 - 16m 26s
batch: 900/1563 - train loss: 1.0171 - test loss: 5.9190 - train acc: 0.9411 - test acc: 0.7486 - 16m 30s
batch: 1000/1563 - train loss: 1.0689 - test los

batch: 1100/1563 - train loss: 1.1530 - test loss: 6.3979 - train acc: 0.9336 - test acc: 0.7514 - 19m 45s
batch: 1200/1563 - train loss: 1.0227 - test loss: 6.2796 - train acc: 0.9402 - test acc: 0.7591 - 19m 48s
batch: 1300/1563 - train loss: 1.0883 - test loss: 6.2266 - train acc: 0.9317 - test acc: 0.7590 - 19m 51s
batch: 1400/1563 - train loss: 1.1785 - test loss: 6.5095 - train acc: 0.9267 - test acc: 0.7462 - 19m 54s
batch: 1500/1563 - train loss: 1.1497 - test loss: 6.0284 - train acc: 0.9358 - test acc: 0.7576 - 19m 57s
batch: 1563/1563 - train loss: 1.0415 - test loss: 6.1609 - train acc: 0.9367 - test acc: 0.7614 - 19m 59s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 27/100
time is up! finishing training
batch: 1/1563 - train loss: 1.0509 - test loss: 6.1902 - train acc: 0.9358 - test acc: 0.7602 - 20m 1s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB

iteration: 32
generating CIFAR10 dat

batch: 100/1563 - train loss: 5.6556 - test loss: 5.7793 - train acc: 0.6384 - test acc: 0.6364 - 2m 15s
batch: 200/1563 - train loss: 5.7498 - test loss: 5.8694 - train acc: 0.6400 - test acc: 0.6330 - 2m 18s
batch: 300/1563 - train loss: 5.6298 - test loss: 5.7321 - train acc: 0.6478 - test acc: 0.6464 - 2m 21s
batch: 400/1563 - train loss: 5.6259 - test loss: 5.8326 - train acc: 0.6494 - test acc: 0.6330 - 2m 24s
batch: 500/1563 - train loss: 5.6361 - test loss: 5.5405 - train acc: 0.6438 - test acc: 0.6536 - 2m 27s
batch: 600/1563 - train loss: 5.4606 - test loss: 5.5970 - train acc: 0.6573 - test acc: 0.6502 - 2m 30s
batch: 700/1563 - train loss: 5.3790 - test loss: 5.5773 - train acc: 0.6628 - test acc: 0.6497 - 2m 33s
batch: 800/1563 - train loss: 5.5324 - test loss: 5.3240 - train acc: 0.6525 - test acc: 0.6668 - 2m 36s
batch: 900/1563 - train loss: 5.5923 - test loss: 5.4402 - train acc: 0.6551 - test acc: 0.6628 - 2m 39s
batch: 1000/1563 - train loss: 5.5279 - test loss: 5.38

batch: 1100/1563 - train loss: 3.3764 - test loss: 4.1416 - train acc: 0.7909 - test acc: 0.7536 - 5m 47s
batch: 1200/1563 - train loss: 3.3338 - test loss: 4.2742 - train acc: 0.7937 - test acc: 0.7462 - 5m 50s
batch: 1300/1563 - train loss: 3.6225 - test loss: 4.0867 - train acc: 0.7696 - test acc: 0.7525 - 5m 52s
batch: 1400/1563 - train loss: 3.4442 - test loss: 4.1384 - train acc: 0.7956 - test acc: 0.7515 - 5m 55s
batch: 1500/1563 - train loss: 3.6474 - test loss: 4.0964 - train acc: 0.7690 - test acc: 0.7512 - 5m 58s
batch: 1563/1563 - train loss: 3.5693 - test loss: 4.0537 - train acc: 0.7777 - test acc: 0.7524 - 6m 1s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 9/100
batch: 100/1563 - train loss: 2.8762 - test loss: 4.1533 - train acc: 0.8165 - test acc: 0.7489 - 6m 3s
batch: 200/1563 - train loss: 3.0751 - test loss: 4.1995 - train acc: 0.8074 - test acc: 0.7555 - 6m 6s
batch: 300/1563 - train loss: 2.9180 - test loss: 4.1

batch: 400/1563 - train loss: 1.8251 - test loss: 4.7683 - train acc: 0.8872 - test acc: 0.7547 - 9m 16s
batch: 500/1563 - train loss: 2.0689 - test loss: 4.4057 - train acc: 0.8666 - test acc: 0.7558 - 9m 19s
batch: 600/1563 - train loss: 1.9790 - test loss: 4.4955 - train acc: 0.8716 - test acc: 0.7581 - 9m 21s
batch: 700/1563 - train loss: 1.9904 - test loss: 4.4058 - train acc: 0.8734 - test acc: 0.7640 - 9m 24s
batch: 800/1563 - train loss: 2.0681 - test loss: 4.4432 - train acc: 0.8663 - test acc: 0.7607 - 9m 28s
batch: 900/1563 - train loss: 1.9550 - test loss: 4.4312 - train acc: 0.8838 - test acc: 0.7601 - 9m 30s
batch: 1000/1563 - train loss: 2.0806 - test loss: 4.5393 - train acc: 0.8650 - test acc: 0.7592 - 9m 33s
batch: 1100/1563 - train loss: 2.0044 - test loss: 4.6430 - train acc: 0.8775 - test acc: 0.7535 - 9m 36s
batch: 1200/1563 - train loss: 2.1455 - test loss: 4.6139 - train acc: 0.8613 - test acc: 0.7516 - 9m 39s
batch: 1300/1563 - train loss: 2.2927 - test loss: 4

batch: 1400/1563 - train loss: 1.6946 - test loss: 4.9396 - train acc: 0.8938 - test acc: 0.7638 - 12m 51s
batch: 1500/1563 - train loss: 1.7238 - test loss: 5.0794 - train acc: 0.8969 - test acc: 0.7510 - 12m 54s
batch: 1563/1563 - train loss: 1.6103 - test loss: 5.3029 - train acc: 0.8985 - test acc: 0.7564 - 12m 56s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 18/100
batch: 100/1563 - train loss: 1.1529 - test loss: 5.1785 - train acc: 0.9268 - test acc: 0.7636 - 12m 59s
batch: 200/1563 - train loss: 1.3419 - test loss: 5.2094 - train acc: 0.9167 - test acc: 0.7593 - 13m 2s
batch: 300/1563 - train loss: 1.2610 - test loss: 5.1751 - train acc: 0.9261 - test acc: 0.7670 - 13m 5s
batch: 400/1563 - train loss: 1.4117 - test loss: 5.2904 - train acc: 0.9155 - test acc: 0.7588 - 13m 8s
batch: 500/1563 - train loss: 1.3661 - test loss: 5.2923 - train acc: 0.9145 - test acc: 0.7623 - 13m 11s
batch: 600/1563 - train loss: 1.3705 - test los

batch: 700/1563 - train loss: 1.1162 - test loss: 5.8393 - train acc: 0.9320 - test acc: 0.7539 - 16m 25s
batch: 800/1563 - train loss: 1.2525 - test loss: 5.8159 - train acc: 0.9211 - test acc: 0.7579 - 16m 28s
batch: 900/1563 - train loss: 1.3383 - test loss: 5.6965 - train acc: 0.9185 - test acc: 0.7587 - 16m 31s
batch: 1000/1563 - train loss: 1.2376 - test loss: 5.9710 - train acc: 0.9280 - test acc: 0.7561 - 16m 34s
batch: 1100/1563 - train loss: 1.1965 - test loss: 5.6967 - train acc: 0.9308 - test acc: 0.7608 - 16m 37s
batch: 1200/1563 - train loss: 1.2343 - test loss: 6.0670 - train acc: 0.9217 - test acc: 0.7512 - 16m 39s
batch: 1300/1563 - train loss: 1.2476 - test loss: 5.8934 - train acc: 0.9245 - test acc: 0.7601 - 16m 42s
batch: 1400/1563 - train loss: 1.1463 - test loss: 5.8799 - train acc: 0.9270 - test acc: 0.7601 - 16m 45s
batch: 1500/1563 - train loss: 1.3305 - test loss: 5.6025 - train acc: 0.9129 - test acc: 0.7539 - 16m 48s
batch: 1563/1563 - train loss: 1.4223 - 

batch: 1/1563 - train loss: 1.1101 - test loss: 6.0437 - train acc: 0.9317 - test acc: 0.7605 - 20m 2s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB

iteration: 33
generating CIFAR10 data with 10 classes
Files already downloaded and verified
Files already downloaded and verified
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 96, 32, 32]           7,296
         MaxPool2d-2           [-1, 96, 16, 16]               0
              ReLU-3           [-1, 96, 16, 16]               0
            Conv2d-4           [-1, 80, 16, 16]         192,080
              ReLU-5           [-1, 80, 16, 16]               0
         MaxPool2d-6             [-1, 80, 8, 8]               0
         Dropout2d-7             [-1, 80, 8, 8]               0
            Conv2d-8             [-1, 96, 8, 8]         192,096
              ReLU-9             [-1, 96, 8

batch: 900/1563 - train loss: 5.6446 - test loss: 5.6257 - train acc: 0.6460 - test acc: 0.6458 - 2m 35s
batch: 1000/1563 - train loss: 5.7998 - test loss: 5.5747 - train acc: 0.6360 - test acc: 0.6542 - 2m 38s
batch: 1100/1563 - train loss: 5.5493 - test loss: 5.4646 - train acc: 0.6479 - test acc: 0.6608 - 2m 41s
batch: 1200/1563 - train loss: 5.4131 - test loss: 5.5594 - train acc: 0.6615 - test acc: 0.6544 - 2m 44s
batch: 1300/1563 - train loss: 5.6797 - test loss: 5.3865 - train acc: 0.6407 - test acc: 0.6675 - 2m 46s
batch: 1400/1563 - train loss: 5.4840 - test loss: 5.1181 - train acc: 0.6544 - test acc: 0.6825 - 2m 49s
batch: 1500/1563 - train loss: 5.2653 - test loss: 5.3297 - train acc: 0.6766 - test acc: 0.6675 - 2m 52s
batch: 1563/1563 - train loss: 5.4635 - test loss: 5.2989 - train acc: 0.6541 - test acc: 0.6689 - 2m 55s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 5/100
batch: 100/1563 - train loss: 4.9790 - test loss:

batch: 200/1563 - train loss: 3.0195 - test loss: 4.2375 - train acc: 0.8111 - test acc: 0.7445 - 5m 58s
batch: 300/1563 - train loss: 3.1345 - test loss: 4.2641 - train acc: 0.8050 - test acc: 0.7466 - 6m 1s
batch: 400/1563 - train loss: 3.1220 - test loss: 4.3276 - train acc: 0.8156 - test acc: 0.7401 - 6m 4s
batch: 500/1563 - train loss: 3.0668 - test loss: 4.2964 - train acc: 0.8109 - test acc: 0.7487 - 6m 7s
batch: 600/1563 - train loss: 3.3641 - test loss: 4.2030 - train acc: 0.7905 - test acc: 0.7502 - 6m 10s
batch: 700/1563 - train loss: 3.2862 - test loss: 4.2010 - train acc: 0.8040 - test acc: 0.7493 - 6m 12s
batch: 800/1563 - train loss: 3.3374 - test loss: 4.1139 - train acc: 0.7872 - test acc: 0.7562 - 6m 15s
batch: 900/1563 - train loss: 3.1130 - test loss: 4.2453 - train acc: 0.8034 - test acc: 0.7450 - 6m 18s
batch: 1000/1563 - train loss: 3.1840 - test loss: 4.1742 - train acc: 0.8074 - test acc: 0.7485 - 6m 21s
batch: 1100/1563 - train loss: 3.4728 - test loss: 4.0639

batch: 1200/1563 - train loss: 2.1003 - test loss: 4.3410 - train acc: 0.8690 - test acc: 0.7640 - 9m 29s
batch: 1300/1563 - train loss: 2.1571 - test loss: 4.2508 - train acc: 0.8712 - test acc: 0.7664 - 9m 32s
batch: 1400/1563 - train loss: 2.2295 - test loss: 4.2690 - train acc: 0.8584 - test acc: 0.7626 - 9m 34s
batch: 1500/1563 - train loss: 2.2576 - test loss: 4.4162 - train acc: 0.8632 - test acc: 0.7574 - 9m 37s
batch: 1563/1563 - train loss: 2.3025 - test loss: 4.2888 - train acc: 0.8565 - test acc: 0.7584 - 9m 40s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 14/100
batch: 100/1563 - train loss: 1.6862 - test loss: 4.4472 - train acc: 0.8938 - test acc: 0.7634 - 9m 43s
batch: 200/1563 - train loss: 1.6699 - test loss: 4.6368 - train acc: 0.9001 - test acc: 0.7574 - 9m 45s
batch: 300/1563 - train loss: 1.7606 - test loss: 4.5401 - train acc: 0.8906 - test acc: 0.7579 - 9m 48s
batch: 400/1563 - train loss: 1.8393 - test loss: 

batch: 500/1563 - train loss: 1.3871 - test loss: 5.1175 - train acc: 0.9157 - test acc: 0.7643 - 12m 57s
batch: 600/1563 - train loss: 1.2416 - test loss: 5.2301 - train acc: 0.9179 - test acc: 0.7567 - 13m 1s
batch: 700/1563 - train loss: 1.3162 - test loss: 5.1198 - train acc: 0.9161 - test acc: 0.7563 - 13m 3s
batch: 800/1563 - train loss: 1.4473 - test loss: 5.2535 - train acc: 0.9126 - test acc: 0.7629 - 13m 7s
batch: 900/1563 - train loss: 1.4543 - test loss: 5.5886 - train acc: 0.9120 - test acc: 0.7533 - 13m 9s
batch: 1000/1563 - train loss: 1.7408 - test loss: 5.1938 - train acc: 0.8853 - test acc: 0.7554 - 13m 12s
batch: 1100/1563 - train loss: 1.5759 - test loss: 4.9298 - train acc: 0.9048 - test acc: 0.7646 - 13m 15s
batch: 1200/1563 - train loss: 1.6328 - test loss: 5.1138 - train acc: 0.9001 - test acc: 0.7628 - 13m 18s
batch: 1300/1563 - train loss: 1.6190 - test loss: 5.1537 - train acc: 0.8991 - test acc: 0.7592 - 13m 21s
batch: 1400/1563 - train loss: 1.6272 - test l

batch: 1500/1563 - train loss: 1.1686 - test loss: 5.4605 - train acc: 0.9273 - test acc: 0.7683 - 16m 32s
batch: 1563/1563 - train loss: 1.2832 - test loss: 5.6896 - train acc: 0.9186 - test acc: 0.7630 - 16m 35s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 23/100
batch: 100/1563 - train loss: 0.9816 - test loss: 5.6765 - train acc: 0.9361 - test acc: 0.7615 - 16m 38s
batch: 200/1563 - train loss: 0.9029 - test loss: 5.6424 - train acc: 0.9471 - test acc: 0.7620 - 16m 40s
batch: 300/1563 - train loss: 1.0738 - test loss: 6.0531 - train acc: 0.9342 - test acc: 0.7646 - 16m 44s
batch: 400/1563 - train loss: 0.9872 - test loss: 5.8172 - train acc: 0.9390 - test acc: 0.7665 - 16m 47s
batch: 500/1563 - train loss: 1.0706 - test loss: 5.9928 - train acc: 0.9342 - test acc: 0.7538 - 16m 50s
batch: 600/1563 - train loss: 1.0217 - test loss: 5.9510 - train acc: 0.9352 - test acc: 0.7619 - 16m 53s
batch: 700/1563 - train loss: 1.1526 - test l

batch: 601/1563 - train loss: 1.1800 - test loss: 6.1925 - train acc: 0.9276 - test acc: 0.7515 - 20m 2s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB

iteration: 34
generating CIFAR10 data with 10 classes
Files already downloaded and verified
Files already downloaded and verified
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 96, 32, 32]           7,296
         MaxPool2d-2           [-1, 96, 16, 16]               0
              ReLU-3           [-1, 96, 16, 16]               0
            Conv2d-4           [-1, 80, 16, 16]         192,080
              ReLU-5           [-1, 80, 16, 16]               0
         MaxPool2d-6             [-1, 80, 8, 8]               0
         Dropout2d-7             [-1, 80, 8, 8]               0
            Conv2d-8             [-1, 96, 8, 8]         192,096
              ReLU-9             [-1, 96,

batch: 900/1563 - train loss: 5.4770 - test loss: 5.2346 - train acc: 0.6566 - test acc: 0.6732 - 2m 38s
batch: 1000/1563 - train loss: 5.4268 - test loss: 5.3356 - train acc: 0.6491 - test acc: 0.6642 - 2m 41s
batch: 1100/1563 - train loss: 5.5231 - test loss: 5.1070 - train acc: 0.6588 - test acc: 0.6813 - 2m 44s
batch: 1200/1563 - train loss: 5.4333 - test loss: 5.1419 - train acc: 0.6616 - test acc: 0.6822 - 2m 47s
batch: 1300/1563 - train loss: 5.2735 - test loss: 4.9803 - train acc: 0.6788 - test acc: 0.6894 - 2m 50s
batch: 1400/1563 - train loss: 5.2387 - test loss: 5.2484 - train acc: 0.6787 - test acc: 0.6740 - 2m 52s
batch: 1500/1563 - train loss: 5.1704 - test loss: 5.0749 - train acc: 0.6891 - test acc: 0.6822 - 2m 55s
batch: 1563/1563 - train loss: 5.1759 - test loss: 5.0704 - train acc: 0.6819 - test acc: 0.6841 - 2m 58s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 5/100
batch: 100/1563 - train loss: 4.7995 - test loss:

batch: 200/1563 - train loss: 2.8150 - test loss: 4.0633 - train acc: 0.8231 - test acc: 0.7605 - 6m 5s
batch: 300/1563 - train loss: 2.8655 - test loss: 4.1040 - train acc: 0.8221 - test acc: 0.7577 - 6m 8s
batch: 400/1563 - train loss: 3.0280 - test loss: 4.0696 - train acc: 0.8115 - test acc: 0.7505 - 6m 11s
batch: 500/1563 - train loss: 2.9264 - test loss: 4.3267 - train acc: 0.8180 - test acc: 0.7442 - 6m 14s
batch: 600/1563 - train loss: 3.0591 - test loss: 4.2687 - train acc: 0.8068 - test acc: 0.7519 - 6m 17s
batch: 700/1563 - train loss: 2.9977 - test loss: 4.3061 - train acc: 0.8137 - test acc: 0.7443 - 6m 20s
batch: 800/1563 - train loss: 3.1335 - test loss: 4.0713 - train acc: 0.8102 - test acc: 0.7512 - 6m 23s
batch: 900/1563 - train loss: 3.2122 - test loss: 4.1269 - train acc: 0.8062 - test acc: 0.7517 - 6m 25s
batch: 1000/1563 - train loss: 2.9648 - test loss: 4.2517 - train acc: 0.8143 - test acc: 0.7423 - 6m 28s
batch: 1100/1563 - train loss: 3.0765 - test loss: 4.034

batch: 1200/1563 - train loss: 1.8423 - test loss: 4.3868 - train acc: 0.8863 - test acc: 0.7714 - 9m 38s
batch: 1300/1563 - train loss: 2.2247 - test loss: 4.3292 - train acc: 0.8593 - test acc: 0.7643 - 9m 41s
batch: 1400/1563 - train loss: 2.1422 - test loss: 4.3378 - train acc: 0.8593 - test acc: 0.7672 - 9m 44s
batch: 1500/1563 - train loss: 2.1761 - test loss: 4.2663 - train acc: 0.8662 - test acc: 0.7676 - 9m 47s
batch: 1563/1563 - train loss: 2.1901 - test loss: 4.2387 - train acc: 0.8656 - test acc: 0.7636 - 9m 50s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 14/100
batch: 100/1563 - train loss: 1.4539 - test loss: 4.5220 - train acc: 0.9129 - test acc: 0.7636 - 9m 53s
batch: 200/1563 - train loss: 1.4351 - test loss: 4.7141 - train acc: 0.9060 - test acc: 0.7600 - 9m 56s
batch: 300/1563 - train loss: 1.6229 - test loss: 4.5641 - train acc: 0.8979 - test acc: 0.7651 - 9m 59s
batch: 400/1563 - train loss: 1.6598 - test loss: 

batch: 500/1563 - train loss: 1.2318 - test loss: 5.2062 - train acc: 0.9230 - test acc: 0.7628 - 13m 10s
batch: 600/1563 - train loss: 1.3142 - test loss: 5.2890 - train acc: 0.9186 - test acc: 0.7617 - 13m 13s
batch: 700/1563 - train loss: 1.3391 - test loss: 5.0770 - train acc: 0.9176 - test acc: 0.7584 - 13m 16s
batch: 800/1563 - train loss: 1.3894 - test loss: 5.2939 - train acc: 0.9095 - test acc: 0.7560 - 13m 19s
batch: 900/1563 - train loss: 1.3078 - test loss: 5.1570 - train acc: 0.9236 - test acc: 0.7651 - 13m 22s
batch: 1000/1563 - train loss: 1.3807 - test loss: 5.2510 - train acc: 0.9057 - test acc: 0.7539 - 13m 25s
batch: 1100/1563 - train loss: 1.7065 - test loss: 4.9261 - train acc: 0.8995 - test acc: 0.7662 - 13m 28s
batch: 1200/1563 - train loss: 1.2770 - test loss: 5.3152 - train acc: 0.9230 - test acc: 0.7636 - 13m 31s
batch: 1300/1563 - train loss: 1.4777 - test loss: 4.9184 - train acc: 0.9136 - test acc: 0.7620 - 13m 33s
batch: 1400/1563 - train loss: 1.4259 - te

batch: 1500/1563 - train loss: 1.1143 - test loss: 5.5031 - train acc: 0.9317 - test acc: 0.7637 - 16m 48s
batch: 1563/1563 - train loss: 1.1590 - test loss: 5.8006 - train acc: 0.9298 - test acc: 0.7559 - 16m 51s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 23/100
batch: 100/1563 - train loss: 0.9137 - test loss: 5.6559 - train acc: 0.9433 - test acc: 0.7662 - 16m 54s
batch: 200/1563 - train loss: 0.8995 - test loss: 5.8117 - train acc: 0.9471 - test acc: 0.7684 - 16m 57s
batch: 300/1563 - train loss: 0.9794 - test loss: 5.9182 - train acc: 0.9402 - test acc: 0.7682 - 17m 0s
batch: 400/1563 - train loss: 0.9136 - test loss: 5.7554 - train acc: 0.9451 - test acc: 0.7652 - 17m 3s
batch: 500/1563 - train loss: 1.0405 - test loss: 6.0557 - train acc: 0.9383 - test acc: 0.7546 - 17m 6s
batch: 600/1563 - train loss: 0.9960 - test loss: 5.9435 - train acc: 0.9383 - test acc: 0.7596 - 17m 9s
batch: 700/1563 - train loss: 1.1219 - test loss:

batch: 100/1563 - train loss: 13.0313 - test loss: 13.0162 - train acc: 0.0933 - test acc: 0.1534 - 0m 0s
batch: 200/1563 - train loss: 12.9652 - test loss: 12.7591 - train acc: 0.1366 - test acc: 0.1318 - 0m 3s
batch: 300/1563 - train loss: 12.2500 - test loss: 12.0883 - train acc: 0.1882 - test acc: 0.2058 - 0m 6s
batch: 400/1563 - train loss: 11.5819 - test loss: 11.2040 - train acc: 0.2329 - test acc: 0.2798 - 0m 9s
batch: 500/1563 - train loss: 11.2643 - test loss: 11.0108 - train acc: 0.2616 - test acc: 0.2780 - 0m 12s
batch: 600/1563 - train loss: 10.6610 - test loss: 10.7028 - train acc: 0.2897 - test acc: 0.2807 - 0m 14s
batch: 700/1563 - train loss: 10.3659 - test loss: 9.7108 - train acc: 0.3047 - test acc: 0.3588 - 0m 17s
batch: 800/1563 - train loss: 10.3540 - test loss: 9.7547 - train acc: 0.3056 - test acc: 0.3622 - 0m 20s
batch: 900/1563 - train loss: 9.9751 - test loss: 9.3364 - train acc: 0.3472 - test acc: 0.3866 - 0m 23s
batch: 1000/1563 - train loss: 9.7909 - test 

batch: 1100/1563 - train loss: 4.6690 - test loss: 4.7353 - train acc: 0.7141 - test acc: 0.7123 - 3m 28s
batch: 1200/1563 - train loss: 4.6142 - test loss: 4.7428 - train acc: 0.7157 - test acc: 0.7126 - 3m 31s
batch: 1300/1563 - train loss: 4.7549 - test loss: 4.8865 - train acc: 0.7097 - test acc: 0.6959 - 3m 34s
batch: 1400/1563 - train loss: 4.5524 - test loss: 4.7751 - train acc: 0.7238 - test acc: 0.7062 - 3m 37s
batch: 1500/1563 - train loss: 4.5390 - test loss: 4.7660 - train acc: 0.7272 - test acc: 0.7046 - 3m 39s
batch: 1563/1563 - train loss: 4.5902 - test loss: 4.7104 - train acc: 0.7154 - test acc: 0.7079 - 3m 42s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 6/100
batch: 100/1563 - train loss: 4.2666 - test loss: 4.6818 - train acc: 0.7293 - test acc: 0.7160 - 3m 45s
batch: 200/1563 - train loss: 4.2048 - test loss: 4.5923 - train acc: 0.7390 - test acc: 0.7157 - 3m 48s
batch: 300/1563 - train loss: 4.2294 - test loss: 

batch: 400/1563 - train loss: 2.7070 - test loss: 4.0659 - train acc: 0.8299 - test acc: 0.7659 - 6m 55s
batch: 500/1563 - train loss: 2.5467 - test loss: 4.0414 - train acc: 0.8368 - test acc: 0.7674 - 6m 57s
batch: 600/1563 - train loss: 2.6948 - test loss: 4.1163 - train acc: 0.8362 - test acc: 0.7553 - 7m 0s
batch: 700/1563 - train loss: 2.7855 - test loss: 4.3070 - train acc: 0.8272 - test acc: 0.7476 - 7m 3s
batch: 800/1563 - train loss: 2.7463 - test loss: 4.0455 - train acc: 0.8259 - test acc: 0.7611 - 7m 6s
batch: 900/1563 - train loss: 2.6017 - test loss: 4.0671 - train acc: 0.8431 - test acc: 0.7648 - 7m 9s
batch: 1000/1563 - train loss: 2.7726 - test loss: 4.2776 - train acc: 0.8243 - test acc: 0.7530 - 7m 12s
batch: 1100/1563 - train loss: 2.7824 - test loss: 4.0873 - train acc: 0.8303 - test acc: 0.7604 - 7m 15s
batch: 1200/1563 - train loss: 2.7904 - test loss: 4.3031 - train acc: 0.8368 - test acc: 0.7532 - 7m 17s
batch: 1300/1563 - train loss: 2.7733 - test loss: 4.127

batch: 1400/1563 - train loss: 1.9664 - test loss: 4.4604 - train acc: 0.8800 - test acc: 0.7659 - 10m 27s
batch: 1500/1563 - train loss: 2.0164 - test loss: 4.1827 - train acc: 0.8753 - test acc: 0.7701 - 10m 30s
batch: 1563/1563 - train loss: 1.9503 - test loss: 4.5813 - train acc: 0.8760 - test acc: 0.7657 - 10m 33s
GPU memory used: 0.02 GB - max: 3.19 GB - memory reserved: 0.15 GB - max: 3.28 GB
starting epoch: 15/100
batch: 100/1563 - train loss: 1.2495 - test loss: 4.9297 - train acc: 0.9239 - test acc: 0.7619 - 10m 35s
batch: 200/1563 - train loss: 1.4123 - test loss: 4.8872 - train acc: 0.9189 - test acc: 0.7681 - 10m 39s
batch: 300/1563 - train loss: 1.4141 - test loss: 4.6223 - train acc: 0.9145 - test acc: 0.7711 - 10m 42s
batch: 400/1563 - train loss: 1.4645 - test loss: 4.8270 - train acc: 0.9028 - test acc: 0.7636 - 10m 45s
batch: 500/1563 - train loss: 1.4360 - test loss: 4.8515 - train acc: 0.9101 - test acc: 0.7727 - 10m 47s
