# Kernel Sparsity via Pruning Tutorial

Neural networks, in general, are very overparamatarized for given tasks (ie the number of parameters far exceeds the number of training points) yet still they [generalize very well](https://arxiv.org/abs/1611.03530). This flies against conventional wisdom where overparamatarizing a model will lead to overfitting and putting theory behind this empirical evidence is a very active area of research.

Additionally, [early on](http://yann.lecun.com/exdb/publis/pdf/lecun-90b.pdf) it was discovered that large numbers of weights in neural networks could be pruned away (set to 0) without affecting the loss and in most cases actually improving the generalization capability of the network. This work was reinvigorated with Song Han's [2015 paper](https://arxiv.org/abs/1510.00149) in pursuit of compressing model size for mobile applications. This has resulted in numerous papers coming out on the topic of weight pruning, filter pruning, channel pruning, and ultimately block pruning. [This paper](https://arxiv.org/abs/1902.09574) out of Google gives a good overview of the current state of sparsity.

Given that models are very overparamatarized and large numbers of weights can be effectively pruned away, what does this leave us with? Well intuitively, then, we can think of pruning as performing an [architecture search](https://openreview.net/pdf?id=rJlnB3C5Ym) within this large, traditionally fixed weight space. What was originally important in the dense model was representing a large number of pathways for optimization. We can then effectively remove the unused pathways in the optimization space with a fine toothed comb (post training).

Well what does pruning get us? We now have a model with a lot of multiplications by zero that we don't need to run. If we're smart about how we do structure this compute (a surprisingly tricky problem), we can now run the model much faster than ever thought possible! That's where the [Neural Magic](http://neuralmagic.com/) engine can help us.

This tutorial provides a step by step walk through for pruning an already trained (dense) model. Specifically it is set up to work with the model trained in our [model training tutorial](model_training.ipynb), but it can be changed to support other models/datasets as needed:
1. Dataset selection
2. Model selection and loading
3. Pruning setup
4. Recalibration using pruning

In [1]:
import sys
import os

print('Python %s on %s' % (sys.version, sys.platform))

package_path = os.path.abspath(os.path.join(os.path.expanduser(os.getcwd()), os.pardir))
print(package_path)

"""
Adding the path to the neuralmagic-pytorch extension to the path so it isn't necessary to have it installed
"""
sys.path.extend([package_path])

print('Added current package path to sys.path')
print('Be sure to install from requirements.txt and pytorch separately')


Python 3.6.8 (default, Jan 14 2019, 11:02:34) 
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
/home/mark/neuralmagic/Shared/neuralmagicml-pytorch
Added current package path to sys.path
Be sure to install from requirements.txt and pytorch separately


## Dataset Selection

We are using fast.ai's [Imagenette dataset](https://github.com/fastai/imagenette) provided under the [Apache License 2.0](https://github.com/fastai/imagenette/blob/master/LICENSE) as the default dataset. The original authors, much like ourselves, were interested in a dataset that has similar properties to more complicated datasets such as the Imagenet dataset but one that would allow rapid iterations. It includes 10 of the easiest classes out of the Imagenet 1000 dataset: tench, English springer, cassette player, chain saw, church, French horn, garbage truck, gas pump, golf ball, parachute. If you are interested in visualizing the properties in this dataset see our [model training tutorial](model_training.ipynb) which also gives a more in depth breakdown for what batch size to use and the dataset splits.

The dataset can easily be changed to the desired dataset in the code given.

Below we will need to fill in the dataset path, train batch size, and test batch size.

In [2]:
import ipywidgets as widgets
import torch

print('\nEnter the local path where the dataset can be found')

dataset_text = widgets.Text(value='', placeholder='Enter local path to dataset', description='Dataset Path')
display(dataset_text)

print('\nChoose the batch size to run through the model during train and test runs')
print('(be sure to press enter if/after inputting manually)')
train_batch_size_slider = widgets.IntSlider(
    value=64, min=1, max=256, step=1, description='Train Batch Size:'
)
display(train_batch_size_slider)
test_batch_size_slider = widgets.IntSlider(
    value=64 if torch.cuda.is_available() else 1, min=1, max=256, step=1, description='Test Batch Size:'
)
display(test_batch_size_slider)



Enter the local path where the dataset can be found


Text(value='', description='Dataset Path', placeholder='Enter local path to dataset')


Choose the batch size to run through the model during train and test runs
(be sure to press enter if/after inputting manually)


IntSlider(value=64, description='Train Batch Size:', max=256, min=1)

IntSlider(value=64, description='Test Batch Size:', max=256, min=1)

In [3]:
from neuralmagicML.datasets import ImagenetteDataset, EarlyStopDataset
from torch.utils.data import Dataset, DataLoader

dataset_root = os.path.abspath(os.path.expanduser(dataset_text.value.strip()))
print('\nLoading dataset from {}'.format(dataset_root))

if not os.path.exists(dataset_root):
    raise Exception('Folder must exist for dataset at {}'.format(dataset_root))
    
train_batch_size = train_batch_size_slider.value
test_batch_size = test_batch_size_slider.value

print('\nUsing train batch size of {} and test batch size of {}\n'
      .format(train_batch_size, test_batch_size))
    
train_dataset = ImagenetteDataset(dataset_root, train=True, rand_trans=True)
train_data_loader = DataLoader(train_dataset, batch_size=train_batch_size, shuffle=True, num_workers=4)
print('train dataset created: \n{}\n'.format(train_dataset))

val_dataset = ImagenetteDataset(dataset_root, train=False, rand_trans=False)
val_data_loader = DataLoader(val_dataset, batch_size=train_batch_size, shuffle=False, num_workers=4)
print('validation test dataset created: \n{}\n'.format(val_dataset))

train_test_dataset = EarlyStopDataset(ImagenetteDataset(dataset_root, train=True, rand_trans=False),
                                      early_stop=len(val_dataset))
train_test_data_loader = DataLoader(train_test_dataset, batch_size=train_batch_size, shuffle=False, num_workers=4)
print('train test dataset created: \n{}\n'.format(train_test_dataset))



Loading dataset from /home/mark/datasets/imagenette

Using train batch size of 256 and test batch size of 256

already downloaded imagenette of size ImagenetteSize.s160
train dataset created: 
Dataset ImagenetteDataset
    Number of datapoints: 12894
    Root location: /home/mark/datasets/imagenette/imagenette-160/train

already downloaded imagenette of size ImagenetteSize.s160
validation test dataset created: 
Dataset ImagenetteDataset
    Number of datapoints: 500
    Root location: /home/mark/datasets/imagenette/imagenette-160/val

already downloaded imagenette of size ImagenetteSize.s160
train test dataset created: 
Dataset ImagenetteDataset
    Number of datapoints: 500
    Root location: /home/mark/datasets/imagenette/imagenette-160/train



## Model Selection and Loading

For this exercise we'll create the standard [ResNet50 model](https://arxiv.org/abs/1512.03385) and in addition we will load the pretrained weights from our [model training tutorial](model_training.ipynb)

If you changed the dataset in the above cell, then we'll need to update the number of classes to create the model appropriately as well as loading your own pretrained weights. Additionally the model can be changed out completely to work with your specific use case.

Additionally, run the code block and select the device to run on before continuing. cpu runs in the pytorch cpu framework and cuda runs on an attached GPU.


In [4]:
import glob
import datetime
from neuralmagicML.models import resnet50, load_model

num_classes = 10
# TODO: change this to load pretrained weights from our cloud
model = resnet50(num_classes=num_classes)
model_id = '{}-{}'.format(model.__class__.__name__,
                          datetime.datetime.today().strftime('%Y-%m-%d-%H:%M:%S')
                              .replace('-', '.').replace(':', '.'))
pretrained_paths = [path for path in glob.glob('ResNet*.pth')]
pretrained_path = pretrained_paths[0]
load_model(pretrained_path, model)
print('Created model {}'.format(model.__class__.__name__))

print('\nChoose the device to run on')
device_choice = widgets.ToggleButtons(
    options=['cuda', 'cpu'] if torch.cuda.is_available() else ['cpu'],
    description='Device'
)
display(device_choice)


Created model ResNet

Choose the device to run on


ToggleButtons(description='Device', options=('cuda', 'cpu'), value='cuda')

## Pruning Setup

Informally, sparsity is the degree in which a tensor is comprised of zeros. More formally:

let $N^i$ be the total number of elements in a (e.g. weight) tensor $W_i$ and let $N^i_z$ be the number of elements which are zero-valued within that tensor.

The sparsity level associated with that tensor is then defined as $s_i \triangleq \dfrac{N^i_z}{N^i}$ 

Our goal now is to maximize the sparsity of each weight tensor within the model while preserving the accuracy of the densely trained version. The most robust and effective approach to this so far has been magnitude pruning. To frame the problem, let's say our goal is to go from an intial sparsity $s^i_{init}$ to a final sparsity $s^i_{final}$. We could start with changing out our usual $L_2$ weight regularization with $L_1$ [($L_2$ vs $L_1$)](https://towardsdatascience.com/l1-and-l2-regularization-methods-ce25e7fc831c) in our cost function. Following through, we would find that we did, in fact, introduce a few zeros into our weights. We created a direct pathway within the optimization problem such that the model is incentivized to reduce the magnitude of unimportant weights to 0 (unimportant are defined as weights that do not significantly contribute to the loss function). This is hard to balance, though. For example, how do we determine the proper weighting between the loss function and the regularization such that we reach a desired sparsity without sacrificing accuracy? Also, empirically we find that the model will settle in local minima thus failing to further reduce weights to zero.

We can take this general idea to more of an extreme, though. Based on the previous thought experiment, it's reasonable to make an assumption that weights near zero are likely not important to the final loss function as well. Taking this even further, we could say that the smallest values within a given weight tensor are the ones least likely to affect our loss function. This is, of course, assuming that we've trained long enough to reach a stable point where our neural networks cost function has minimized unimportant weights. Naturally, then, we could apply a schedule where we prune away a small number of smallest weights for every $M$ training steps. This is exactly what 'magnitude pruning' does. 

Below, we go through a UI to set up a magnitude pruning schedule for our model. While pruning, some layers within a model will be more sensitive than others. A general rule is that the initial input layer and the final output layer are the most sensitive as they are an absolute bottleneck to the information flow. Because of this, we offer up a UI capable of creating very intricate schedules to the point that each layer could be pruned with a different schedule to a different final sparsity. In general, though, we can get by with ignoring the initial layer, pruning the final layer minimally (this is done to boost test accuracy for the pruned model), and prune all other layers equally. At Neural Magic we are actively working on automating this process to find the best possible pruning schedules to maximize performance while minimizing accuracy loss. 

The default for the below model will disable pruning for the initial and final layers and prune all other layers to 90% sparsity over the course of 10 epochs. The options available are described below:
- Prune Epochs: number of epochs to apply the pruning over
- Add New Group: add a new pruning group to the UI, used to prune selected layers to different sparsity levels at different rates
- Delete Current Group: remove the current group from the pruning schedule
- Tabs: the pruning groups setup so far, click to switch between
- Sparsity: a range slider where the left value defines the sparsity level to initially start pruning at and the right value represents the final sparsity level
- Start Epoch: the epoch to start pruning the selected layers at the initial sparsity
- End Epoch: the epoch to finish pruning the selected layers to the final sparsity
- Update Freq: the update frequency, in epochs, at which to prune the layers; ie 1.0 will prune every epoch
- Selectable Layers: a dropdown of the layers available in the model that can be pruned along with their FLOPS and param counts; select the desired layers to prune using the checkboxes

In [5]:
from neuralmagicML.notebooks import KSModifierWidgets


device = device_choice.value
print('running on device {}'.format(device))

print('\ncreating kernel sparsity analyzer widgets (need to execute the model, so may take some time)...')
ks_widget, ks_modifiers = KSModifierWidgets.interactive_module(model, device, inp_dim=(1, 3, 224, 224))

# add first group for all layers and remove the input and final layers
ks_widget.add_group(
    init_start_sparsity=0.05, init_final_sparsity=0.85, init_enabled=True,
    init_start_epoch=0.0, init_end_epoch=30.0, init_update_frequency=1.0
)
ks_modifiers[0].layers.pop(0)
ks_modifiers[0].layers.pop()
ks_widget.update_from_modifiers()

# add second group with just the final layer to set a lower sparsity for it
ks_widget.add_group(
    init_start_sparsity=0.05, init_final_sparsity=0.6, init_enabled=True,
    init_start_epoch=0.0, init_end_epoch=30.0, init_update_frequency=1.0
)
ks_modifiers[1].layers = ks_modifiers[1].layers[-1:]
ks_widget.update_from_modifiers()

display(ks_widget)


running on device cuda

creating kernel sparsity analyzer widgets (need to execute the model, so may take some time)...


Box(children=(VBox(children=(HBox(children=(Button(description='Add New Group', style=ButtonStyle()), Button(d…

### Learning Rate Selection

With our magnitude pruning schedule complete, we need to define the hyperparameters for training while pruning. The most important of these is the learning rate. Too high and we will diverge from our initial dense solution. Too low and we will have to train for too many epochs. Below we run a learning rate sensitivity analysis from the [cyclic LR paper](https://arxiv.org/abs/1506.01186).

To run the sensitivity analysis we will begin training the model at a very small learning rate ($10^{-7}$) where the weight updates will be lost in floating point precision errors (ie we won't learn). After each batch we exponentially increase the learning rate until we reach a very high learning rate ($10^0$) where we are guaranteed to diverge from our trained model.

A flat region will be apparent in the graph starting from the left to some point at the right for a fully trained model. After this we reach a critical learning rate where the loss begins to rapidly rise. This is the critical point where we begin diverging from the local minimum in our optimization space. Using this information, we can find the optimal learning rate to use with an [SGD + nesterov momentum optimizer](https://towardsdatascience.com/stochastic-gradient-descent-with-momentum-a84097641a5d) while pruning. Ideally we want to pick a learning rate that is a little before the divergent behavior. In this way we can guarantee fast convergence of the model after each pruning step. 

A good default for the learning rate is given for the model and dataset used in this notebook. If changing either, you will need to update the learning rate.

In [6]:
import torch
from neuralmagicML.utils import lr_analysis, lr_analysis_figure, CrossEntropyLossWrapper
%matplotlib inline
import matplotlib.pyplot as plt

### optimizer definitions
momentum = 0.9
weight_decay = 1e-4
###

# print('\nrunning learning rate analysis...')
# batches_per_sample = round(500 / train_batch_size)  # make sure we have enough sample points per learning rate
# analysis = lr_analysis(model, device, train_data_loader, CrossEntropyLossWrapper(), batches_per_sample,
#                        init_lr=1e-7, final_lr=1e0, sgd_momentum=momentum, sgd_weight_decay=weight_decay)
# lr_analysis_figure(analysis)
# plt.show()

print('\nselect the learning rate')
lr_slider = widgets.FloatLogSlider(
    value=0.01, min=-7, max=1, step=0.01, description='Learning Rate:'
)
display(lr_slider)



select the learning rate


FloatLogSlider(value=0.01, description='Learning Rate:', max=1.0, min=-7.0, step=0.01)

### Post Pruning Hyperparameters

After pruning the model, we'll need to train for a few final epochs to recover any lost accuracy. A typical setup is to train with an [exponential decay learning rate](https://towardsdatascience.com/learning-rate-schedules-and-adaptive-learning-rate-methods-for-deep-learning-2c8f433990d1) schedule starting from our previously found learning rate: $LR_i=LR_{init}*\gamma^i$ where $LR_i$ is the learning rate used at epoch $i$. In doing this, we can find a local min with equivalent accuracy to the dense model provided we did not sparsify the model too much.

Given this, the necessary hyperparameters to determine are:
- Num Epochs - the number of epochs to train for in the finalizing stage
- Final LR - the learning rate we will converge to while finalizing

These parameters must be filled in below. We give defaults for the original model and dataset in this notebook. The defaults were found using previous intuition along with a few training runs. In general if accuracy is not recovering, then training for longer overall as well for longer times at higher learning rates (increasing update freq) will help.

In [7]:
print('\nselect the number of epochs to train for after pruning')
finalize_epochs_text = widgets.IntText(value=30, description='Num Epochs')
display(finalize_epochs_text)

print('\nselect the final learning rate')
lr_final_slider = widgets.FloatLogSlider(
    value=0.001, min=-7, max=1, step=0.01, description='Final LR:'
)
display(lr_final_slider)



select the number of epochs to train for after pruning


IntText(value=30, description='Num Epochs')


select the final learning rate


FloatLogSlider(value=0.001, description='Final LR:', max=1.0, min=-7.0, step=0.01)

## Recalibration Using Pruning

With the parameters properly setup, we now create a schedule for applying the parameters. This is done using the Neural Magic ML library. Specifically we create a `ScheduledModifierManager` that controls a list of two different classes: `GradualKSModifier` which performs magnitude pruning to the weights in the given layers and `LearningRateModifier` which applies the exponential learning rate in the training epochs after pruning.

We then create a `ScheduledOptimizer` giving it the model, an SGD optimizer, the `ScheduledModifierManager` we created, and the training datset size. Using all this information, the code can apply any schedule needed by capturing the `.step()` call, perform schedule updates, and then calling into the original optimizer. Additionally `epoch_start()` and `epoch_end()` should be called on the optimizer wrapper while training. We will see this use in the next code block. 

For the loss function we will use cross entropy as is standard for classification tasks. For additional metrics we will use the top 1 accuracy, ie did we predict the class correctly or not. We do this because there are only 10 classes available so top N accuracy is generally uninformative with so few classes. We use a custom wrapper class to organize the metrics and loss into one, callable class.

Finally, beyond the usual basic screen-printouts let's use tensorboard's nice logging capabilities. We'll primarily track scalars such as the loss and accuracy throughout training in this example. We use tensorboardX to log from pytorch into tensorboard.


In [8]:
import math
from torch import optim
from torch.nn import Conv2d, Linear
from tensorboardX import SummaryWriter

from neuralmagicML.sparsity import (
    LearningRateModifier, ScheduledModifierManager, ScheduledOptimizer, KSAnalyzerLayer
)
from neuralmagicML.utils import TopKAccuracy, CrossEntropyLossWrapper


model = model.to(device)

prune_epochs = max([modifier.end_epoch for modifier in ks_modifiers])
finalizer_epochs = finalize_epochs_text.value
lr_init = lr_slider.value
lr_final = lr_final_slider.value
lr_gamma = (lr_final / lr_init) ** (1 / (finalizer_epochs - 1))
lr_modifier = LearningRateModifier(lr_class='ExponentialLR', lr_kwargs={'gamma': lr_gamma},
                                   start_epoch=prune_epochs, end_epoch=prune_epochs + finalizer_epochs,
                                   update_frequency=1)
modifier_manager = ScheduledModifierManager([lr_modifier, *ks_modifiers])
print('Created ScheduledModifierManager with exponential lr_modifier with gamma {} and {} ks modifiers'
      .format(lr_gamma, len(ks_modifiers)))

optimizer = optim.SGD(model.parameters(), lr_slider.value, momentum=momentum,
                      weight_decay=weight_decay, nesterov=True)
optimizer = ScheduledOptimizer(optimizer, model, modifier_manager, steps_per_epoch=len(train_dataset))
print('\nCreated scheudled optimizer with initial lr: {}, momentum: {}, weight decay: {}'
      .format(lr_slider.value, momentum, weight_decay))

loss = CrossEntropyLossWrapper(extras={'top1acc': TopKAccuracy(1)})
print('\nCreated loss wrapper\n{}'.format(loss))

logs_dir = os.path.abspath(os.path.expanduser(os.path.join('.', 'model_training_logs', model_id)))

if not os.path.exists(logs_dir):
    os.makedirs(logs_dir)

writer = SummaryWriter(logdir=logs_dir, comment='imagenette training')
print('\nCreated summary writer logging to \n{}'.format(logs_dir))


Created ScheduledModifierManager with exponential lr_modifier with gamma 0.9744600632908477 and 2 ks modifiers

Created scheudled optimizer with initial lr: 0.01, momentum: 0.9, weight decay: 0.0001

Created loss wrapper
CrossEntropyLossWrapper(Loss: cross_entropy; Extras: TopKAccuracy)

Created summary writer logging to 
/home/mark/neuralmagic/Shared/neuralmagicml-pytorch/notebooks/model_training_logs/ResNet-2019.08.11.11.59.07


### Applying Pruning Schedule

Below we go through a standard training and testing cycle in pytorch. 

for number of epochs required for pruning and finalizing:
train model over full training dataset (one epoch); update weights
test model over full validation dataset; no weight update
test model over sampled training dataset; no weight update

In the below code blocks, we create self-contained convenience functions for running the train and testing loops. These convenience functions are then called to train the model. At the end of the script we save the trained model in our current location with the date included as well as the final validation accuracy in the name.

Note, we have additionally created a logs directory which, in combination with tensorboard, can be used to visualize the progress of the model. To launch tensorboard use the following command from within the notebooks directory: 'tensorboard --logdir model_training_logs --port 6006' Now you will have an interactive dashboard running on http://localhost:6006

We additionally create analyzers for the kernel sparsity of all convs and kernel layers that is logged to tensorboard for visualization of the sparsity of each layer within the model. Also, we call `epoch_start()` and `epoch_end()` on the optimizer wrapper as mentioned above.

Finally, we save the final result next to this notebook under the model id given earlier. Happy pruning!


In [9]:
from tqdm import tqdm


def test_epoch(model, data_loader, loss, device, epoch):
    model.eval()
    results = {}
    
    with torch.no_grad():
        for batch, (*x_feature, y_lab) in enumerate(tqdm(data_loader)):
            y_lab = y_lab.to(device)
            x_feature = tuple([dat.to(device) for dat in x_feature])
            batch_size = y_lab.shape[0]
            y_pred = model(*x_feature)
            losses = loss(x_feature, y_lab, y_pred)
            
            for key, val in losses.items():
                if key not in results:
                    results[key] = []
                result = val.detach_().cpu()
                result = result.repeat(batch_size)
                results[key].append(result)
                
    return results

def train_epoch(model, data_loader, optimizer, loss, device, epoch, writer):
    model.train()
    init_batch_size = None
    batches_per_epoch = len(data_loader)
    
    for batch, (*x_feature, y_lab) in enumerate(tqdm(data_loader)):
        y_lab = y_lab.to(device)
        x_feature = tuple([dat.to(device) for dat in x_feature])
        batch_size = y_lab.shape[0]
        if init_batch_size is None:
            init_batch_size = batch_size
        optimizer.zero_grad()
        y_pred = model(*x_feature)
        losses = loss(x_feature, y_lab, y_pred)
        losses['loss'].backward()
        optimizer.step(closure=None)
        
        step_count = init_batch_size * (epoch * batches_per_epoch + batch)
        for _loss, _value in losses.items():
            writer.add_scalar('Train/{}'.format(_loss), _value.item(), step_count)
            writer.add_scalar('Train/Learning Rate', optimizer.learning_rate, step_count)
            
print('Recalibrating model for kernel sparsity...')

analyzed_layers = KSAnalyzerLayer.analyze_layers(
    model, layers=[name for name, mod in model.named_modules()
                   if isinstance(mod, Conv2d) or isinstance(mod, Linear)]
)
            
for epoch in tqdm(range(math.ceil(modifier_manager.max_epochs))):
    print('Starting epoch {}'.format(epoch))
    optimizer.epoch_start()
    train_epoch(model, train_data_loader, optimizer, loss, device, epoch, writer)
    
    print('Completed training for epoch {}, testing validation dataset'.format(epoch))
    val_losses = test_epoch(model, val_data_loader, loss, device, epoch)
    for _loss, _values in val_losses.items():
        _value = torch.mean(torch.cat(_values))
        last_value = _value
        writer.add_scalar('Test/Validation/{}'.format(_loss), _value, epoch)
        print('{}: {}'.format(_loss, _value))
        
    print('Completed testing validation dataset for epoch {}, testing training dataset'.format(epoch))
    train_losses = test_epoch(model, train_test_data_loader, loss, device, epoch)
    for _loss, _values in train_losses.items():
        _value = torch.mean(torch.cat(_values))
        writer.add_scalar('Test/Training/{}'.format(_loss), _value, epoch)
        print('{}: {}'.format(_loss, _value))
        
    optimizer.epoch_end()
    
    for ks_layer in analyzed_layers:
        writer.add_scalar('Kernel Sparsity/{}'.format(ks_layer.name), ks_layer.param_sparsity.item(), epoch + 1)
        
    print('Completed testing training dataset for epoch {}'.format(epoch))
    
pruned_save_path = os.path.abspath(os.path.expanduser(os.path.join('.', '{}.pth'.format(model_id))))
print('Finished pruning, saving model to {}'.format(pruned_save_path))
save_model(pruned_save_path, model, optimizer, epoch)
print('Saved model')


  0%|          | 0/120 [00:00<?, ?it/s]

Recalibrating model for kernel sparsity...
Starting epoch 0


  self._mask_tensor = self._param.data.new_tensor(self._mask_tensor)

  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:36,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.45it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.56it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.66it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.73it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.79it/s][A
 18%|█▊        | 9/51 [00:05<00:22,  1.83it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.86it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.88it/s][A
 24%|██▎       | 12/51 [00:06<00:20,  1.89it/s][A
 25%|██▌       | 13/51 [00:07<00:19,  1.91it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.91it/s][A
 29%|██▉       | 15/51 [00:08<00:18,  1.92it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.92it/s][A
 33%|███▎      | 17/51 [00:09<00:17,  1.92it/s][A
 35%|███▌      | 18/51 [00:09<

Completed training for epoch 0, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.40830186009407043
top1acc: 88.1999282836914
Completed testing validation dataset for epoch 0, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.76it/s][A
  1%|          | 1/120 [00:28<56:56, 28.71s/it]A
  0%|          | 0/51 [00:00<?, ?it/s]

loss: 0.2600196897983551
top1acc: 91.59994506835938
Completed testing training dataset for epoch 0
Starting epoch 1


[A
  2%|▏         | 1/51 [00:01<00:50,  1.00s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.16it/s][A
  6%|▌         | 3/51 [00:02<00:36,  1.32it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.45it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.57it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.66it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.73it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.78it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.82it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.85it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.87it/s][A
 24%|██▎       | 12/51 [00:06<00:20,  1.88it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.89it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.90it/s][A
 29%|██▉       | 15/51 [00:08<00:18,  1.90it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.90it/s][A
 33%|███▎      | 17/51 [00:09<00:17,  1.90it/s][A
 35%|███▌      | 18/51 [00:09<00:17,  1.91it/s][A
 37%|███▋      | 19/51 [00:10<00:16,  1.91it/s][A
 39%|███▉      | 20/51 [00:10<00:16,

Completed training for epoch 1, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3848629891872406
top1acc: 88.19993591308594
Completed testing validation dataset for epoch 1, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
  2%|▏         | 2/120 [00:57<56:30, 28.73s/it]A

loss: 0.24593403935432434
top1acc: 91.99991607666016
Completed testing training dataset for epoch 1
Starting epoch 2



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.64it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.71it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.77it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.81it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.83it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.86it/s][A
 24%|██▎       | 12/51 [00:06<00:20,  1.87it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.88it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.89it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.89it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.90it/s][A
 33%|███▎      | 17/51 [00:09<00:17,  1.89it/s][A
 35%|███▌      | 18/51 [00:09<00:17,  1.89it/s][A
 37%|███▋      | 19/51 [00:10<00:16,  1.90it/s]

Completed training for epoch 2, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.16it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.42813313007354736
top1acc: 87.2000732421875
Completed testing validation dataset for epoch 2, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.69it/s][A
  2%|▎         | 3/120 [01:26<56:08, 28.79s/it]A

loss: 0.3073204755783081
top1acc: 89.40003967285156
Completed testing training dataset for epoch 2
Starting epoch 3



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.76it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.80it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.83it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.85it/s][A
 24%|██▎       | 12/51 [00:06<00:20,  1.86it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.87it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.87it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.88it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.88it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.89it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.89it/s][A
 37%|███▋      | 19/51 [00:10<00:16,  1.89it/s]

Completed training for epoch 3, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.69it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.17it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39044925570487976
top1acc: 88.39994812011719
Completed testing validation dataset for epoch 3, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
  3%|▎         | 4/120 [01:55<55:48, 28.86s/it]A


loss: 0.23077288269996643
top1acc: 92.1999740600586
Completed testing training dataset for epoch 3
Starting epoch 4


  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:50,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.15it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.76it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.80it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.82it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.85it/s][A
 24%|██▎       | 12/51 [00:06<00:20,  1.86it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.87it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.87it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.88it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.88it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.89it/s][A
 37%|███▋      | 19/51 [00:10<00:16,  1.89it/s][

Completed training for epoch 4, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.45614320039749146
top1acc: 88.19993591308594
Completed testing validation dataset for epoch 4, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.76it/s][A
  4%|▍         | 5/120 [02:24<55:25, 28.92s/it]A

loss: 0.30494895577430725
top1acc: 90.20002746582031
Completed testing training dataset for epoch 4
Starting epoch 5



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.82it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.84it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.87it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.88it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.88it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.88it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:16,  1.89it/s]

Completed training for epoch 5, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.69it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.17it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.42969316244125366
top1acc: 87.00005340576172
Completed testing validation dataset for epoch 5, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
  5%|▌         | 6/120 [02:53<55:04, 28.99s/it]A

loss: 0.24472083151340485
top1acc: 90.80005645751953
Completed testing training dataset for epoch 5
Starting epoch 6



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:50,  1.01s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.15it/s][A
  6%|▌         | 3/51 [00:02<00:36,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.44it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.55it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.64it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.76it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.82it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.84it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.87it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.88it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 6, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.18it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4251702129840851
top1acc: 85.99993133544922
Completed testing validation dataset for epoch 6, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.72it/s][A
  6%|▌         | 7/120 [03:22<54:41, 29.04s/it]A

loss: 0.2427665889263153
top1acc: 91.59996032714844
Completed testing training dataset for epoch 6
Starting epoch 7



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:50,  1.01s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.15it/s][A
  6%|▌         | 3/51 [00:02<00:36,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.87it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.88it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 7, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.405050665140152
top1acc: 87.60006713867188
Completed testing validation dataset for epoch 7, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
  7%|▋         | 8/120 [03:52<54:19, 29.10s/it]A

loss: 0.2375756949186325
top1acc: 92.59994506835938
Completed testing training dataset for epoch 7
Starting epoch 8



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 8, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4008839428424835
top1acc: 88.79998016357422
Completed testing validation dataset for epoch 8, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
  8%|▊         | 9/120 [04:21<53:56, 29.15s/it]A

loss: 0.27077919244766235
top1acc: 91.20008850097656
Completed testing training dataset for epoch 8
Starting epoch 9



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.82it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.84it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.87it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.88it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.88it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:16,  1.88it/s]

Completed training for epoch 9, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4125473201274872
top1acc: 87.1999282836914
Completed testing validation dataset for epoch 9, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
  8%|▊         | 10/120 [04:50<53:27, 29.16s/it]

loss: 0.2762140929698944
top1acc: 90.80004119873047
Completed testing training dataset for epoch 9
Starting epoch 10



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.87it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.88it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.88it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 10, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.09it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4566492736339569
top1acc: 86.59993743896484
Completed testing validation dataset for epoch 10, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.73it/s][A
  9%|▉         | 11/120 [05:19<53:01, 29.19s/it]

loss: 0.3191305100917816
top1acc: 89.20000457763672
Completed testing training dataset for epoch 10
Starting epoch 11



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 11, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4467296302318573
top1acc: 87.19994354248047
Completed testing validation dataset for epoch 11, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.73it/s][A
 10%|█         | 12/120 [05:49<52:34, 29.21s/it]

loss: 0.3194417357444763
top1acc: 90.40000915527344
Completed testing training dataset for epoch 11
Starting epoch 12



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:52,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 12, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.45951053500175476
top1acc: 86.40006256103516
Completed testing validation dataset for epoch 12, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.71it/s][A
 11%|█         | 13/120 [06:18<52:08, 29.24s/it]

loss: 0.29615432024002075
top1acc: 90.5999984741211
Completed testing training dataset for epoch 12
Starting epoch 13



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s]

Completed training for epoch 13, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.09it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.40761926770210266
top1acc: 86.39993286132812
Completed testing validation dataset for epoch 13, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
 12%|█▏        | 14/120 [06:47<51:41, 29.26s/it]

loss: 0.3294319808483124
top1acc: 90.20001983642578
Completed testing training dataset for epoch 13
Starting epoch 14



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.87it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 14, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.58it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.05it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.41504549980163574
top1acc: 86.39997100830078
Completed testing validation dataset for epoch 14, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
 12%|█▎        | 15/120 [07:16<51:14, 29.28s/it]

loss: 0.3576636016368866
top1acc: 88.00006866455078
Completed testing training dataset for epoch 14
Starting epoch 15



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 15, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4535043239593506
top1acc: 86.4000015258789
Completed testing validation dataset for epoch 15, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.71it/s][A
 13%|█▎        | 16/120 [07:46<50:44, 29.28s/it]

loss: 0.3570753037929535
top1acc: 89.4000015258789
Completed testing training dataset for epoch 15
Starting epoch 16



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 16, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.09it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4227321445941925
top1acc: 85.40007019042969
Completed testing validation dataset for epoch 16, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.73it/s][A
 14%|█▍        | 17/120 [08:15<50:16, 29.28s/it]

loss: 0.38998785614967346
top1acc: 86.20002746582031
Completed testing training dataset for epoch 16
Starting epoch 17



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.76it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.85it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 17, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4564765691757202
top1acc: 85.40007019042969
Completed testing validation dataset for epoch 17, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.59it/s][A
 15%|█▌        | 18/120 [08:44<49:49, 29.31s/it]

loss: 0.42787811160087585
top1acc: 85.99993896484375
Completed testing training dataset for epoch 17
Starting epoch 18



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:20,  1.85it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.85it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.86it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.86it/s]

Completed training for epoch 18, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4621700942516327
top1acc: 85.40003204345703
Completed testing validation dataset for epoch 18, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 16%|█▌        | 19/120 [09:14<49:22, 29.33s/it]

loss: 0.4481644034385681
top1acc: 83.80000305175781
Completed testing training dataset for epoch 18
Starting epoch 19



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s]

Completed training for epoch 19, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.43604859709739685
top1acc: 86.40005493164062
Completed testing validation dataset for epoch 19, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
 17%|█▋        | 20/120 [09:43<48:54, 29.35s/it]

loss: 0.4647865891456604
top1acc: 83.19999694824219
Completed testing training dataset for epoch 19
Starting epoch 20



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:50,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.15it/s][A
  6%|▌         | 3/51 [00:02<00:36,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s]

Completed training for epoch 20, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.46687614917755127
top1acc: 85.2000732421875
Completed testing validation dataset for epoch 20, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.72it/s][A
 18%|█▊        | 21/120 [10:12<48:23, 29.33s/it]

loss: 0.45753148198127747
top1acc: 85.4000244140625
Completed testing training dataset for epoch 20
Starting epoch 21



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:52,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s]

Completed training for epoch 21, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.46205756068229675
top1acc: 85.80005645751953
Completed testing validation dataset for epoch 21, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
 18%|█▊        | 22/120 [10:42<47:54, 29.33s/it]

loss: 0.44732433557510376
top1acc: 84.8000259399414
Completed testing training dataset for epoch 21
Starting epoch 22



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s]

Completed training for epoch 22, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.5216693878173828
top1acc: 84.60002136230469
Completed testing validation dataset for epoch 22, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.77it/s][A
 19%|█▉        | 23/120 [11:11<47:22, 29.30s/it]

loss: 0.4795128405094147
top1acc: 84.0
Completed testing training dataset for epoch 22
Starting epoch 23



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s]

Completed training for epoch 23, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.13it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4228565990924835
top1acc: 87.1999282836914
Completed testing validation dataset for epoch 23, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
 20%|██        | 24/120 [11:40<46:52, 29.30s/it]

loss: 0.41076934337615967
top1acc: 86.20006561279297
Completed testing training dataset for epoch 23
Starting epoch 24



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:50,  1.01s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.15it/s][A
  6%|▌         | 3/51 [00:02<00:36,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.44it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.55it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.82it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.84it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.88it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 24, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.43019184470176697
top1acc: 85.20001983642578
Completed testing validation dataset for epoch 24, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 21%|██        | 25/120 [12:10<46:21, 29.28s/it]

loss: 0.39435330033302307
top1acc: 86.39994049072266
Completed testing training dataset for epoch 24
Starting epoch 25



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 25, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.09it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4158949851989746
top1acc: 88.40007019042969
Completed testing validation dataset for epoch 25, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.60it/s][A
 22%|██▏       | 26/120 [12:39<45:53, 29.30s/it]

loss: 0.3742988407611847
top1acc: 87.79994201660156
Completed testing training dataset for epoch 25
Starting epoch 26



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s]

Completed training for epoch 26, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.17it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4543895423412323
top1acc: 87.19993591308594
Completed testing validation dataset for epoch 26, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.61it/s][A
 22%|██▎       | 27/120 [13:08<45:25, 29.30s/it]

loss: 0.38315320014953613
top1acc: 86.20001983642578
Completed testing training dataset for epoch 26
Starting epoch 27



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s]

Completed training for epoch 27, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.61it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.07it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4181293249130249
top1acc: 87.2000732421875
Completed testing validation dataset for epoch 27, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.71it/s][A
 23%|██▎       | 28/120 [13:38<44:57, 29.32s/it]

loss: 0.34489017724990845
top1acc: 88.60002899169922
Completed testing training dataset for epoch 27
Starting epoch 28



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:50,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s]

Completed training for epoch 28, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39852046966552734
top1acc: 88.39994812011719
Completed testing validation dataset for epoch 28, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 24%|██▍       | 29/120 [14:07<44:26, 29.31s/it]

loss: 0.3328379988670349
top1acc: 88.59996795654297
Completed testing training dataset for epoch 28
Starting epoch 29



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s]

Completed training for epoch 29, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.46314120292663574
top1acc: 87.0000228881836
Completed testing validation dataset for epoch 29, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
 25%|██▌       | 30/120 [14:36<43:56, 29.30s/it]

loss: 0.36581555008888245
top1acc: 87.1999740600586
Completed testing training dataset for epoch 29
Starting epoch 30



  0%|          | 0/51 [00:00<?, ?it/s][A
  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s]

Completed training for epoch 30, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.449349969625473
top1acc: 86.60005187988281
Completed testing validation dataset for epoch 30, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.73it/s][A
 26%|██▌       | 31/120 [15:05<43:28, 29.31s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.37936416268348694
top1acc: 86.5999755859375
Completed testing training dataset for epoch 30
Starting epoch 31



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 31, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.18it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4185154139995575
top1acc: 86.80005645751953
Completed testing validation dataset for epoch 31, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
 27%|██▋       | 32/120 [15:35<42:53, 29.24s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.31141212582588196
top1acc: 89.59996795654297
Completed testing training dataset for epoch 31
Starting epoch 32



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 32, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.60it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.06it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.41506946086883545
top1acc: 88.60006713867188
Completed testing validation dataset for epoch 32, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
 28%|██▊       | 33/120 [16:04<42:20, 29.20s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.32775169610977173
top1acc: 88.9999771118164
Completed testing training dataset for epoch 32
Starting epoch 33



  2%|▏         | 1/51 [00:01<00:50,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.15it/s][A
  6%|▌         | 3/51 [00:02<00:36,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 33, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.55it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.01it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39833810925483704
top1acc: 88.99993896484375
Completed testing validation dataset for epoch 33, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
 28%|██▊       | 34/120 [16:33<41:49, 29.18s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.31584635376930237
top1acc: 89.19999694824219
Completed testing training dataset for epoch 33
Starting epoch 34



  2%|▏         | 1/51 [00:01<00:52,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 34, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.13it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4221983253955841
top1acc: 87.60002136230469
Completed testing validation dataset for epoch 34, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
 29%|██▉       | 35/120 [17:02<41:18, 29.16s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.325738787651062
top1acc: 87.79994201660156
Completed testing training dataset for epoch 34
Starting epoch 35



  2%|▏         | 1/51 [00:01<00:50,  1.01s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.15it/s][A
  6%|▌         | 3/51 [00:02<00:36,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 35, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.486969530582428
top1acc: 84.79995727539062
Completed testing validation dataset for epoch 35, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.60it/s][A
 30%|███       | 36/120 [17:31<40:47, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.4209975600242615
top1acc: 84.8000259399414
Completed testing training dataset for epoch 35
Starting epoch 36



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 36, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.61it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.08it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3792221248149872
top1acc: 89.80000305175781
Completed testing validation dataset for epoch 36, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
 31%|███       | 37/120 [18:00<40:18, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.27344274520874023
top1acc: 90.60002899169922
Completed testing training dataset for epoch 36
Starting epoch 37



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 37, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.13it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4061359167098999
top1acc: 87.79993438720703
Completed testing validation dataset for epoch 37, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
 32%|███▏      | 38/120 [18:29<39:49, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.2958497405052185
top1acc: 88.80006408691406
Completed testing training dataset for epoch 37
Starting epoch 38



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 38, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.08it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4110291302204132
top1acc: 87.60006713867188
Completed testing validation dataset for epoch 38, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
 32%|███▎      | 39/120 [18:58<39:21, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.29461923241615295
top1acc: 89.80003356933594
Completed testing training dataset for epoch 38
Starting epoch 39



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 39, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.69it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.16it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4312516748905182
top1acc: 86.40006256103516
Completed testing validation dataset for epoch 39, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.76it/s][A
 33%|███▎      | 40/120 [19:28<38:50, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.32860878109931946
top1acc: 88.1999282836914
Completed testing training dataset for epoch 39
Starting epoch 40



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.60it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.67it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:08<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 40, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4056905210018158
top1acc: 87.60005950927734
Completed testing validation dataset for epoch 40, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.58it/s][A
 34%|███▍      | 41/120 [19:57<38:23, 29.16s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.28178489208221436
top1acc: 90.20001983642578
Completed testing training dataset for epoch 40
Starting epoch 41



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 41, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39275553822517395
top1acc: 87.7999267578125
Completed testing validation dataset for epoch 41, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 35%|███▌      | 42/120 [20:26<37:53, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.27410611510276794
top1acc: 90.4000244140625
Completed testing training dataset for epoch 41
Starting epoch 42



  2%|▏         | 1/51 [00:01<00:51,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 42, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.09it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4023912847042084
top1acc: 87.79994201660156
Completed testing validation dataset for epoch 42, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
 36%|███▌      | 43/120 [20:55<37:25, 29.16s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.2605035603046417
top1acc: 91.79995727539062
Completed testing training dataset for epoch 42
Starting epoch 43



  2%|▏         | 1/51 [00:01<00:53,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:38,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.86it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 43, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.72it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.20it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.45103609561920166
top1acc: 86.40006256103516
Completed testing validation dataset for epoch 43, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.75it/s][A
 37%|███▋      | 44/120 [21:24<36:54, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.31510525941848755
top1acc: 88.60000610351562
Completed testing training dataset for epoch 43
Starting epoch 44



  2%|▏         | 1/51 [00:01<00:50,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.15it/s][A
  6%|▌         | 3/51 [00:02<00:36,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 44, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3909650444984436
top1acc: 88.00006866455078
Completed testing validation dataset for epoch 44, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
 38%|███▊      | 45/120 [21:53<36:24, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.24962574243545532
top1acc: 92.00008392333984
Completed testing training dataset for epoch 44
Starting epoch 45



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 45, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4290706217288971
top1acc: 87.00005340576172
Completed testing validation dataset for epoch 45, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.76it/s][A
 38%|███▊      | 46/120 [22:22<35:54, 29.11s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.2944796085357666
top1acc: 88.80001068115234
Completed testing training dataset for epoch 45
Starting epoch 46



  2%|▏         | 1/51 [00:01<00:53,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:38,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:08<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 46, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.40049102902412415
top1acc: 87.20006561279297
Completed testing validation dataset for epoch 46, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
 39%|███▉      | 47/120 [22:52<35:27, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.2466651350259781
top1acc: 90.60005187988281
Completed testing training dataset for epoch 46
Starting epoch 47



  2%|▏         | 1/51 [00:01<00:52,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 47, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.13it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3959505558013916
top1acc: 88.39994049072266
Completed testing validation dataset for epoch 47, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.73it/s][A
 40%|████      | 48/120 [23:21<34:58, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.2569080889225006
top1acc: 92.40005493164062
Completed testing training dataset for epoch 47
Starting epoch 48



  2%|▏         | 1/51 [00:01<00:52,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 48, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4169016182422638
top1acc: 87.39993286132812
Completed testing validation dataset for epoch 48, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
 41%|████      | 49/120 [23:50<34:28, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.24977931380271912
top1acc: 91.0
Completed testing training dataset for epoch 48
Starting epoch 49



  2%|▏         | 1/51 [00:01<00:52,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 49, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4045274257659912
top1acc: 89.19996643066406
Completed testing validation dataset for epoch 49, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.73it/s][A
 42%|████▏     | 50/120 [24:19<33:59, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.22321996092796326
top1acc: 93.99999237060547
Completed testing training dataset for epoch 49
Starting epoch 50



  2%|▏         | 1/51 [00:01<00:50,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 50, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.71it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.18it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39499419927597046
top1acc: 88.59992980957031
Completed testing validation dataset for epoch 50, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
 42%|████▎     | 51/120 [24:48<33:29, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.22959136962890625
top1acc: 92.40003967285156
Completed testing training dataset for epoch 50
Starting epoch 51



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.60it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.83it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.86it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 51, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4138984978199005
top1acc: 87.20005798339844
Completed testing validation dataset for epoch 51, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.56it/s][A
 43%|████▎     | 52/120 [25:17<33:02, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.23643410205841064
top1acc: 92.60008239746094
Completed testing training dataset for epoch 51
Starting epoch 52



  2%|▏         | 1/51 [00:01<00:52,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 52, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38472604751586914
top1acc: 88.59994506835938
Completed testing validation dataset for epoch 52, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
 44%|████▍     | 53/120 [25:46<32:32, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.23469194769859314
top1acc: 92.79994201660156
Completed testing training dataset for epoch 52
Starting epoch 53



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 53, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.55it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.01it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3848687708377838
top1acc: 89.1999740600586
Completed testing validation dataset for epoch 53, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
 45%|████▌     | 54/120 [26:16<32:03, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.22524890303611755
top1acc: 92.20005798339844
Completed testing training dataset for epoch 53
Starting epoch 54



  2%|▏         | 1/51 [00:01<00:52,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 54, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38090160489082336
top1acc: 89.19999694824219
Completed testing validation dataset for epoch 54, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.69it/s][A
 46%|████▌     | 55/120 [26:45<31:34, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.22203096747398376
top1acc: 93.199951171875
Completed testing training dataset for epoch 54
Starting epoch 55



  2%|▏         | 1/51 [00:01<00:53,  1.07s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.10it/s][A
  6%|▌         | 3/51 [00:02<00:38,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 55, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.73it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.21it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39797133207321167
top1acc: 88.19993591308594
Completed testing validation dataset for epoch 55, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
 47%|████▋     | 56/120 [27:14<31:05, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.22874808311462402
top1acc: 92.79995727539062
Completed testing training dataset for epoch 55
Starting epoch 56



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 56, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39996203780174255
top1acc: 88.79994201660156
Completed testing validation dataset for epoch 56, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.55it/s][A
 48%|████▊     | 57/120 [27:43<30:37, 29.16s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.22392000257968903
top1acc: 91.59996032714844
Completed testing training dataset for epoch 56
Starting epoch 57



  2%|▏         | 1/51 [00:01<00:52,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.84it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.85it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 57, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.69it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.17it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4128941595554352
top1acc: 87.39994049072266
Completed testing validation dataset for epoch 57, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.69it/s][A
 48%|████▊     | 58/120 [28:12<30:08, 29.16s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.217548206448555
top1acc: 93.79994201660156
Completed testing training dataset for epoch 57
Starting epoch 58



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.83it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.84it/s][A
 27%|██▋       | 14/51 [00:08<00:20,  1.85it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.86it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.86it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 58, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.09it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38770297169685364
top1acc: 88.39994812011719
Completed testing validation dataset for epoch 58, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 49%|████▉     | 59/120 [28:41<29:39, 29.18s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.22199110686779022
top1acc: 92.00008392333984
Completed testing training dataset for epoch 58
Starting epoch 59



  2%|▏         | 1/51 [00:01<00:53,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:38,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 59, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.40839189291000366
top1acc: 87.7999267578125
Completed testing validation dataset for epoch 59, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.75it/s][A
 50%|█████     | 60/120 [29:11<29:09, 29.16s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.21117450296878815
top1acc: 92.59991455078125
Completed testing training dataset for epoch 59
Starting epoch 60



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 60, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.69it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.17it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.40562838315963745
top1acc: 88.79998016357422
Completed testing validation dataset for epoch 60, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
 51%|█████     | 61/120 [29:40<28:40, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.21959011256694794
top1acc: 93.99996185302734
Completed testing training dataset for epoch 60
Starting epoch 61



  2%|▏         | 1/51 [00:01<00:51,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.86it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 61, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39709046483039856
top1acc: 88.59996795654297
Completed testing validation dataset for epoch 61, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
 52%|█████▏    | 62/120 [30:09<28:10, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.21680980920791626
top1acc: 93.5999755859375
Completed testing training dataset for epoch 61
Starting epoch 62



  2%|▏         | 1/51 [00:01<00:52,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 62, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.16it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38294318318367004
top1acc: 87.7999267578125
Completed testing validation dataset for epoch 62, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
 52%|█████▎    | 63/120 [30:38<27:40, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.19681884348392487
top1acc: 94.19995880126953
Completed testing training dataset for epoch 62
Starting epoch 63



  2%|▏         | 1/51 [00:01<00:52,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 63, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3836655616760254
top1acc: 88.39993286132812
Completed testing validation dataset for epoch 63, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.72it/s][A
 53%|█████▎    | 64/120 [31:07<27:11, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.20196741819381714
top1acc: 94.19994354248047
Completed testing training dataset for epoch 63
Starting epoch 64



  2%|▏         | 1/51 [00:01<00:52,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.60it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 64, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3928786814212799
top1acc: 88.59994506835938
Completed testing validation dataset for epoch 64, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 54%|█████▍    | 65/120 [31:36<26:42, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.2070450335741043
top1acc: 93.79995727539062
Completed testing training dataset for epoch 64
Starting epoch 65



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 65, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38105136156082153
top1acc: 87.20006561279297
Completed testing validation dataset for epoch 65, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.59it/s][A
 55%|█████▌    | 66/120 [32:05<26:13, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.2093859165906906
top1acc: 92.7999496459961
Completed testing training dataset for epoch 65
Starting epoch 66



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 66, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3828638195991516
top1acc: 88.39994049072266
Completed testing validation dataset for epoch 66, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.69it/s][A
 56%|█████▌    | 67/120 [32:34<25:43, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.19363835453987122
top1acc: 93.59994506835938
Completed testing training dataset for epoch 66
Starting epoch 67



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 67, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3754747807979584
top1acc: 88.59992980957031
Completed testing validation dataset for epoch 67, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
 57%|█████▋    | 68/120 [33:04<25:14, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.20335723459720612
top1acc: 92.99991607666016
Completed testing training dataset for epoch 67
Starting epoch 68



  2%|▏         | 1/51 [00:01<00:52,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 68, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38460686802864075
top1acc: 88.79994201660156
Completed testing validation dataset for epoch 68, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
 57%|█████▊    | 69/120 [33:33<24:46, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.19496190547943115
top1acc: 93.9999771118164
Completed testing training dataset for epoch 68
Starting epoch 69



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.84it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.85it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 69, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3933296501636505
top1acc: 87.59992980957031
Completed testing validation dataset for epoch 69, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.75it/s][A
 58%|█████▊    | 70/120 [34:02<24:16, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.18533119559288025
top1acc: 93.99994659423828
Completed testing training dataset for epoch 69
Starting epoch 70



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.85it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 70, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3944971561431885
top1acc: 88.39994049072266
Completed testing validation dataset for epoch 70, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
 59%|█████▉    | 71/120 [34:31<23:47, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.19101685285568237
top1acc: 93.39997863769531
Completed testing training dataset for epoch 70
Starting epoch 71



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 71, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.09it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38206833600997925
top1acc: 89.79994201660156
Completed testing validation dataset for epoch 71, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
 60%|██████    | 72/120 [35:00<23:18, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.18914751708507538
top1acc: 93.19995880126953
Completed testing training dataset for epoch 71
Starting epoch 72



  2%|▏         | 1/51 [00:01<00:53,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:38,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.76it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.79it/s][A
 22%|██▏       | 11/51 [00:06<00:22,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.83it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:08<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.86it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 72, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.37942934036254883
top1acc: 88.99993896484375
Completed testing validation dataset for epoch 72, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
 61%|██████    | 73/120 [35:29<22:50, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1781320869922638
top1acc: 94.59994506835938
Completed testing training dataset for epoch 72
Starting epoch 73



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.83it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 73, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.59it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.04it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.37708935141563416
top1acc: 89.19996643066406
Completed testing validation dataset for epoch 73, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
 62%|██████▏   | 74/120 [35:59<22:21, 29.17s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.168060302734375
top1acc: 94.5999755859375
Completed testing training dataset for epoch 73
Starting epoch 74



  2%|▏         | 1/51 [00:01<00:51,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 74, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4007960259914398
top1acc: 88.00006866455078
Completed testing validation dataset for epoch 74, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.57it/s][A
 62%|██████▎   | 75/120 [36:28<21:52, 29.17s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.19823449850082397
top1acc: 92.79995727539062
Completed testing training dataset for epoch 74
Starting epoch 75



  2%|▏         | 1/51 [00:01<00:53,  1.07s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.10it/s][A
  6%|▌         | 3/51 [00:02<00:38,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.60it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:08<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 75, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.09it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3824807405471802
top1acc: 89.19994354248047
Completed testing validation dataset for epoch 75, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 63%|██████▎   | 76/120 [36:57<21:23, 29.17s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.171090230345726
top1acc: 94.9999771118164
Completed testing training dataset for epoch 75
Starting epoch 76



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 76, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.60it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.07it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3787241280078888
top1acc: 89.99994659423828
Completed testing validation dataset for epoch 76, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
 64%|██████▍   | 77/120 [37:26<20:55, 29.19s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.16411608457565308
top1acc: 94.8000259399414
Completed testing training dataset for epoch 76
Starting epoch 77



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 77, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39519935846328735
top1acc: 88.1999740600586
Completed testing validation dataset for epoch 77, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
 65%|██████▌   | 78/120 [37:55<20:25, 29.18s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.18979564309120178
top1acc: 94.39997863769531
Completed testing training dataset for epoch 77
Starting epoch 78



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.85it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.85it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.86it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.86it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.86it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 78, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3955886960029602
top1acc: 88.99994659423828
Completed testing validation dataset for epoch 78, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.58it/s][A
 66%|██████▌   | 79/120 [38:24<19:57, 29.20s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.17826004326343536
top1acc: 94.39997863769531
Completed testing training dataset for epoch 78
Starting epoch 79



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 79, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4052780568599701
top1acc: 87.7999267578125
Completed testing validation dataset for epoch 79, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 67%|██████▋   | 80/120 [38:54<19:26, 29.17s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.20369978249073029
top1acc: 93.19994354248047
Completed testing training dataset for epoch 79
Starting epoch 80



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 80, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.71it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.19it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.4012455344200134
top1acc: 88.99993896484375
Completed testing validation dataset for epoch 80, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
 68%|██████▊   | 81/120 [39:23<18:57, 29.16s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.17380523681640625
top1acc: 93.79994201660156
Completed testing training dataset for epoch 80
Starting epoch 81



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 81, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.37672412395477295
top1acc: 88.99993896484375
Completed testing validation dataset for epoch 81, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.61it/s][A
 68%|██████▊   | 82/120 [39:52<18:28, 29.16s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.15876860916614532
top1acc: 95.79998016357422
Completed testing training dataset for epoch 81
Starting epoch 82



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 82, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39172831177711487
top1acc: 88.99993133544922
Completed testing validation dataset for epoch 82, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
 69%|██████▉   | 83/120 [40:21<17:58, 29.16s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.16183996200561523
top1acc: 95.20001220703125
Completed testing training dataset for epoch 82
Starting epoch 83



  2%|▏         | 1/51 [00:01<00:53,  1.07s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.10it/s][A
  6%|▌         | 3/51 [00:02<00:38,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.39it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.60it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.67it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:08<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 83, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.13it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39396145939826965
top1acc: 88.79993438720703
Completed testing validation dataset for epoch 83, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.72it/s][A
 70%|███████   | 84/120 [40:50<17:29, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.158380925655365
top1acc: 95.00001525878906
Completed testing training dataset for epoch 83
Starting epoch 84



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 84, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.59it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.05it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3983616530895233
top1acc: 88.19994354248047
Completed testing validation dataset for epoch 84, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.73it/s][A
 71%|███████   | 85/120 [41:19<17:00, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.16202159225940704
top1acc: 95.1999740600586
Completed testing training dataset for epoch 84
Starting epoch 85



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.88it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 85, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3957047164440155
top1acc: 89.39994812011719
Completed testing validation dataset for epoch 85, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.61it/s][A
 72%|███████▏  | 86/120 [41:48<16:31, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.17122913897037506
top1acc: 93.79994201660156
Completed testing training dataset for epoch 85
Starting epoch 86



  2%|▏         | 1/51 [00:01<00:53,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.10it/s][A
  6%|▌         | 3/51 [00:02<00:38,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 86, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38627609610557556
top1acc: 88.39994049072266
Completed testing validation dataset for epoch 86, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
 72%|███████▎  | 87/120 [42:18<16:02, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1592845618724823
top1acc: 94.79998016357422
Completed testing training dataset for epoch 86
Starting epoch 87



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 87, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.16it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38306349515914917
top1acc: 89.2000732421875
Completed testing validation dataset for epoch 87, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
 73%|███████▎  | 88/120 [42:47<15:32, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.15675297379493713
top1acc: 94.9999771118164
Completed testing training dataset for epoch 87
Starting epoch 88



  2%|▏         | 1/51 [00:01<00:50,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.82it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:08<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 88, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.13it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38901540637016296
top1acc: 87.80005645751953
Completed testing validation dataset for epoch 88, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
 74%|███████▍  | 89/120 [43:16<15:02, 29.11s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1649876832962036
top1acc: 94.79998016357422
Completed testing training dataset for epoch 88
Starting epoch 89



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 89, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38424086570739746
top1acc: 88.9999771118164
Completed testing validation dataset for epoch 89, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
 75%|███████▌  | 90/120 [43:45<14:33, 29.12s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.16864719986915588
top1acc: 94.39994812011719
Completed testing training dataset for epoch 89
Starting epoch 90



  2%|▏         | 1/51 [00:01<00:51,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 90, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.40063023567199707
top1acc: 88.39997863769531
Completed testing validation dataset for epoch 90, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 76%|███████▌  | 91/120 [44:14<14:04, 29.11s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.17495915293693542
top1acc: 93.79999542236328
Completed testing training dataset for epoch 90
Starting epoch 91



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 91, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.08it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3941539227962494
top1acc: 88.79993438720703
Completed testing validation dataset for epoch 91, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
 77%|███████▋  | 92/120 [44:43<13:35, 29.12s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1628599911928177
top1acc: 94.39999389648438
Completed testing training dataset for epoch 91
Starting epoch 92



  2%|▏         | 1/51 [00:01<00:52,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.60it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:22,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.83it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:08<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 92, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3804416358470917
top1acc: 90.39997100830078
Completed testing validation dataset for epoch 92, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.71it/s][A
 78%|███████▊  | 93/120 [45:12<13:06, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.16169309616088867
top1acc: 95.60001373291016
Completed testing training dataset for epoch 92
Starting epoch 93



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 93, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.372405081987381
top1acc: 89.39994812011719
Completed testing validation dataset for epoch 93, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 78%|███████▊  | 94/120 [45:41<12:36, 29.11s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.15683674812316895
top1acc: 94.99994659423828
Completed testing training dataset for epoch 93
Starting epoch 94



  2%|▏         | 1/51 [00:01<00:51,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.85it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.86it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 94, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.61it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.08it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3861960768699646
top1acc: 89.79993438720703
Completed testing validation dataset for epoch 94, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
 79%|███████▉  | 95/120 [46:11<12:08, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.15097801387310028
top1acc: 94.99999237060547
Completed testing training dataset for epoch 94
Starting epoch 95



  2%|▏         | 1/51 [00:01<00:50,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.15it/s][A
  6%|▌         | 3/51 [00:02<00:36,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.79it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 95, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39494621753692627
top1acc: 88.79998016357422
Completed testing validation dataset for epoch 95, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
 80%|████████  | 96/120 [46:40<11:39, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.14748317003250122
top1acc: 94.9999771118164
Completed testing training dataset for epoch 95
Starting epoch 96



  2%|▏         | 1/51 [00:01<00:53,  1.08s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.09it/s][A
  6%|▌         | 3/51 [00:02<00:38,  1.25it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.39it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.60it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.88it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.88it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.88it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 96, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38132885098457336
top1acc: 88.99994659423828
Completed testing validation dataset for epoch 96, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
 81%|████████  | 97/120 [47:09<11:10, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.15005677938461304
top1acc: 95.39997863769531
Completed testing training dataset for epoch 96
Starting epoch 97



  2%|▏         | 1/51 [00:01<00:51,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 97, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.10it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38963866233825684
top1acc: 87.99993133544922
Completed testing validation dataset for epoch 97, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
 82%|████████▏ | 98/120 [47:38<10:41, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.16254740953445435
top1acc: 94.59996032714844
Completed testing training dataset for epoch 97
Starting epoch 98



  2%|▏         | 1/51 [00:01<00:53,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 98, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3890766501426697
top1acc: 88.99994659423828
Completed testing validation dataset for epoch 98, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.61it/s][A
 82%|████████▎ | 99/120 [48:07<10:12, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1466895043849945
top1acc: 95.60001373291016
Completed testing training dataset for epoch 98
Starting epoch 99



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.85it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 99, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.09it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39377570152282715
top1acc: 88.9999771118164
Completed testing validation dataset for epoch 99, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
 83%|████████▎ | 100/120 [48:36<09:42, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1536923348903656
top1acc: 94.39994812011719
Completed testing training dataset for epoch 99
Starting epoch 100



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 100, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.13it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38954398036003113
top1acc: 88.79993438720703
Completed testing validation dataset for epoch 100, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 84%|████████▍ | 101/120 [49:05<09:13, 29.12s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.14045090973377228
top1acc: 95.39999389648438
Completed testing training dataset for epoch 100
Starting epoch 101



  2%|▏         | 1/51 [00:01<00:53,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:38,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:08<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 101, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.62it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.08it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3873865306377411
top1acc: 89.1999740600586
Completed testing validation dataset for epoch 101, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.73it/s][A
 85%|████████▌ | 102/120 [49:35<08:44, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.14671052992343903
top1acc: 95.80001831054688
Completed testing training dataset for epoch 101
Starting epoch 102



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 102, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.67it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39041757583618164
top1acc: 89.19993591308594
Completed testing validation dataset for epoch 102, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.76it/s][A
 86%|████████▌ | 103/120 [50:04<08:15, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1519717425107956
top1acc: 94.5999755859375
Completed testing training dataset for epoch 102
Starting epoch 103



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 103, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.13it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.37506911158561707
top1acc: 89.19993591308594
Completed testing validation dataset for epoch 103, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.72it/s][A
 87%|████████▋ | 104/120 [50:33<07:45, 29.12s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.14406345784664154
top1acc: 95.39997863769531
Completed testing training dataset for epoch 103
Starting epoch 104



  2%|▏         | 1/51 [00:01<00:50,  1.01s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.15it/s][A
  6%|▌         | 3/51 [00:02<00:36,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.44it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.70it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.79it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 104, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.392625629901886
top1acc: 88.9999771118164
Completed testing validation dataset for epoch 104, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
 88%|████████▊ | 105/120 [51:02<07:16, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1400071233510971
top1acc: 95.79998016357422
Completed testing training dataset for epoch 104
Starting epoch 105



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.76it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.79it/s][A
 22%|██▏       | 11/51 [00:06<00:22,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.83it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.84it/s][A
 27%|██▋       | 14/51 [00:08<00:20,  1.85it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.85it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.86it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 105, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.16it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39439722895622253
top1acc: 88.59994506835938
Completed testing validation dataset for epoch 105, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
 88%|████████▊ | 106/120 [51:31<06:47, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.15197117626667023
top1acc: 94.79999542236328
Completed testing training dataset for epoch 105
Starting epoch 106



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 106, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3949129283428192
top1acc: 89.1999282836914
Completed testing validation dataset for epoch 106, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.76it/s][A
 89%|████████▉ | 107/120 [52:00<06:18, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.14402411878108978
top1acc: 95.39999389648438
Completed testing training dataset for epoch 106
Starting epoch 107



  2%|▏         | 1/51 [00:01<00:50,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.15it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.87it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 107, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.63it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.09it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3897542953491211
top1acc: 88.59993743896484
Completed testing validation dataset for epoch 107, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
 90%|█████████ | 108/120 [52:29<05:49, 29.11s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.14435462653636932
top1acc: 94.79999542236328
Completed testing training dataset for epoch 107
Starting epoch 108



  2%|▏         | 1/51 [00:01<00:53,  1.07s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.10it/s][A
  6%|▌         | 3/51 [00:02<00:38,  1.26it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.39it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.51it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.60it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.73it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:08<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 108, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39241844415664673
top1acc: 89.39994812011719
Completed testing validation dataset for epoch 108, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.56it/s][A
 91%|█████████ | 109/120 [52:58<05:20, 29.15s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.14795677363872528
top1acc: 95.20001220703125
Completed testing training dataset for epoch 108
Starting epoch 109



  2%|▏         | 1/51 [00:01<00:52,  1.06s/it][A
  4%|▍         | 2/51 [00:01<00:44,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 109, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38407406210899353
top1acc: 88.99994659423828
Completed testing validation dataset for epoch 109, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.75it/s][A
 92%|█████████▏| 110/120 [53:28<04:51, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.14640763401985168
top1acc: 95.19998931884766
Completed testing training dataset for epoch 109
Starting epoch 110



  2%|▏         | 1/51 [00:01<00:50,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.30it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.63it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 110, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.13it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3907421827316284
top1acc: 88.59994506835938
Completed testing validation dataset for epoch 110, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
 92%|█████████▎| 111/120 [53:57<04:22, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.14346632361412048
top1acc: 94.9999771118164
Completed testing training dataset for epoch 110
Starting epoch 111



  2%|▏         | 1/51 [00:01<00:51,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 111, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.17it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3859167993068695
top1acc: 89.19993591308594
Completed testing validation dataset for epoch 111, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.61it/s][A
 93%|█████████▎| 112/120 [54:26<03:53, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1385374218225479
top1acc: 95.59999084472656
Completed testing training dataset for epoch 111
Starting epoch 112



  2%|▏         | 1/51 [00:01<00:52,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.12it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.41it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 112, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39220210909843445
top1acc: 89.19994354248047
Completed testing validation dataset for epoch 112, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.61it/s][A
 94%|█████████▍| 113/120 [54:55<03:24, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.14316397905349731
top1acc: 95.59999084472656
Completed testing training dataset for epoch 112
Starting epoch 113



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 113, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.66it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.14it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.39059051871299744
top1acc: 88.40007019042969
Completed testing validation dataset for epoch 113, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 95%|█████████▌| 114/120 [55:24<02:54, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1446521133184433
top1acc: 94.79994201660156
Completed testing training dataset for epoch 113
Starting epoch 114



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 114, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.68it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.15it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.38881149888038635
top1acc: 89.19994354248047
Completed testing validation dataset for epoch 114, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 96%|█████████▌| 115/120 [55:53<02:25, 29.12s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.14207713305950165
top1acc: 95.39999389648438
Completed testing training dataset for epoch 114
Starting epoch 115



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 115, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.71it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.19it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3842807710170746
top1acc: 88.99993133544922
Completed testing validation dataset for epoch 115, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.70it/s][A
 97%|█████████▋| 116/120 [56:22<01:56, 29.12s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1332586705684662
top1acc: 95.60001373291016
Completed testing training dataset for epoch 115
Starting epoch 116



  2%|▏         | 1/51 [00:01<00:51,  1.02s/it][A
  4%|▍         | 2/51 [00:01<00:42,  1.14it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:32,  1.43it/s][A
 10%|▉         | 5/51 [00:03<00:29,  1.54it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:25,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.75it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.86it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 116, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.58it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.04it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3806011378765106
top1acc: 89.39994812011719
Completed testing validation dataset for epoch 116, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.58it/s][A
 98%|█████████▊| 117/120 [56:52<01:27, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1348140984773636
top1acc: 95.79999542236328
Completed testing training dataset for epoch 116
Starting epoch 117



  2%|▏         | 1/51 [00:01<00:52,  1.05s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.11it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.27it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.40it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.52it/s][A
 12%|█▏        | 6/51 [00:03<00:28,  1.61it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.68it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.77it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 117, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.37968018651008606
top1acc: 88.59993743896484
Completed testing validation dataset for epoch 117, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
 98%|█████████▊| 118/120 [57:21<00:58, 29.14s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.13960713148117065
top1acc: 95.5999755859375
Completed testing training dataset for epoch 117
Starting epoch 118



  2%|▏         | 1/51 [00:01<00:51,  1.03s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.29it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.80it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.82it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.87it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.87it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.87it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 118, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.12it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3836105167865753
top1acc: 89.1999740600586
Completed testing validation dataset for epoch 118, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.74it/s][A
 99%|█████████▉| 119/120 [57:50<00:29, 29.13s/it]
  0%|          | 0/51 [00:00<?, ?it/s][A

loss: 0.1363842934370041
top1acc: 95.79998016357422
Completed testing training dataset for epoch 118
Starting epoch 119



  2%|▏         | 1/51 [00:01<00:51,  1.04s/it][A
  4%|▍         | 2/51 [00:01<00:43,  1.13it/s][A
  6%|▌         | 3/51 [00:02<00:37,  1.28it/s][A
  8%|▊         | 4/51 [00:02<00:33,  1.42it/s][A
 10%|▉         | 5/51 [00:03<00:30,  1.53it/s][A
 12%|█▏        | 6/51 [00:03<00:27,  1.62it/s][A
 14%|█▎        | 7/51 [00:04<00:26,  1.69it/s][A
 16%|█▌        | 8/51 [00:04<00:24,  1.74it/s][A
 18%|█▊        | 9/51 [00:05<00:23,  1.78it/s][A
 20%|█▉        | 10/51 [00:05<00:22,  1.81it/s][A
 22%|██▏       | 11/51 [00:06<00:21,  1.83it/s][A
 24%|██▎       | 12/51 [00:06<00:21,  1.84it/s][A
 25%|██▌       | 13/51 [00:07<00:20,  1.85it/s][A
 27%|██▋       | 14/51 [00:07<00:19,  1.86it/s][A
 29%|██▉       | 15/51 [00:08<00:19,  1.86it/s][A
 31%|███▏      | 16/51 [00:09<00:18,  1.87it/s][A
 33%|███▎      | 17/51 [00:09<00:18,  1.86it/s][A
 35%|███▌      | 18/51 [00:10<00:17,  1.86it/s][A
 37%|███▋      | 19/51 [00:10<00:17,  1.86it/s][A
 39%|███▉      | 20/51 [00:11<00:16,  1

Completed training for epoch 119, testing validation dataset



 50%|█████     | 1/2 [00:00<00:00,  1.64it/s][A
100%|██████████| 2/2 [00:00<00:00,  2.11it/s][A
  0%|          | 0/2 [00:00<?, ?it/s][A

loss: 0.3852517306804657
top1acc: 89.1999740600586
Completed testing validation dataset for epoch 119, testing training dataset



 50%|█████     | 1/2 [00:00<00:00,  1.65it/s][A
100%|██████████| 120/120 [58:19<00:00, 29.14s/it]

loss: 0.1325506567955017
top1acc: 95.5999755859375
Completed testing training dataset for epoch 119
Finished pruning, saving model to /home/mark/neuralmagic/Shared/neuralmagicml-pytorch/notebooks/ResNet-2019.08.11.11.59.07.pth





NameError: name 'save_model' is not defined