# **Homework 3 - Convolutional Neural Network**

This is the example code for Homework 3 of the Machine Learning course taught by Prof. Hung-yi Lee.

In this homework, you are required to build a convolutional neural network for image classification, optionally using some advanced training techniques.


There are three levels here:

**Easy**: Build a simple convolutional neural network as the baseline. (2 pts)

**Medium**: Design a better architecture or adopt different data augmentations to improve the performance. (2 pts)

**Hard**: Utilize provided unlabeled data to obtain better results. (2 pts)

## **About the Dataset**

The dataset used here is food-11, a collection of food images in 11 classes.

To meet the requirements of this homework, the TAs slightly modified the data.
Please DO NOT access the original fully-labeled training data or testing labels.

Also, the modified dataset is for this course only, and any further distribution or commercial use is forbidden.

In [1]:
# Download the dataset
# You may choose where to download the data.

# Google Drive
#!gdown --id '1awF7pZ9Dz7X1jn1_QAiKN-_v56veCEKy' --output food-11.zip

# Dropbox
# !wget https://www.dropbox.com/s/m9q6273jl3djall/food-11.zip -O food-11.zip

# MEGA
# !sudo apt install megatools
# !megadl "https://mega.nz/#!zt1TTIhK!ZuMbg5ZjGWzWX1I6nEUbfjMZgCmAgeqJlwDkqdIryfg"

# Unzip the dataset.
# This may take some time.
!unzip -q food-11.zip

## **Import Packages**

First, we need to import packages that will be used later.

In this homework, we rely heavily on **torchvision**, PyTorch's library for computer vision.

In [2]:
!pip install wandb

In [12]:
# Import necessary packages.
import numpy as np
import torch
import torch.nn as nn
import wandb
import torchvision.transforms as transforms
from PIL import Image
# "ConcatDataset" and "Subset" are possibly useful when doing semi-supervised learning.
from torch.utils.data import ConcatDataset, DataLoader, Subset
from torchvision.datasets import DatasetFolder

# This is for the progress bar.
from tqdm.auto import tqdm

In [3]:
wandb.login()

[34m[1mwandb[0m: Currently logged in as: [33mbaoxihuang0429[0m. Use [1m`wandb login --relogin`[0m to force relogin


True

In [14]:
wandb.init(project = 'Classifier with Semi-supervised 2')

## **Dataset, Data Loader, and Transforms**

Torchvision provides many useful utilities for image preprocessing, data wrapping, and data augmentation.

Since our data are stored in folders named by class label, we can wrap them directly with **torchvision.datasets.DatasetFolder**.

Please refer to the [official PyTorch documentation](https://pytorch.org/vision/stable/transforms.html) for details about the different transforms.

In [4]:
# It is important to do data augmentation in training.
# However, not every augmentation is useful.
# Please think about what kind of augmentation is helpful for food recognition.
train_tfm = transforms.Compose([
    # Resize the image into a fixed shape (height = width = 128)
    transforms.Resize((128, 128)),
    transforms.RandomPerspective(),
    transforms.RandomRotation(degrees=(0, 180)),
    transforms.RandomCrop((128,128)),
    # You may add some transforms here.
    # ToTensor() should come after the image augmentations; Normalize() then operates on the resulting tensor.
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

# We don't need augmentations in testing and validation.
# All we need here is to resize the PIL image and transform it into Tensor.
test_tfm = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

In [5]:
# Batch size for training, validation, and testing.
# A greater batch size usually gives a more stable gradient.
# But the GPU memory is limited, so please adjust it carefully.
batch_size = 256

# Construct datasets.
# The argument "loader" tells how torchvision reads the data.
train_set = DatasetFolder("food-11/training/labeled", loader=lambda x: Image.open(x), extensions="jpg", transform=train_tfm)
valid_set = DatasetFolder("food-11/validation", loader=lambda x: Image.open(x), extensions="jpg", transform=test_tfm)
unlabeled_set = DatasetFolder("food-11/training/unlabeled", loader=lambda x: Image.open(x), extensions="jpg", transform=train_tfm)
test_set = DatasetFolder("food-11/testing", loader=lambda x: Image.open(x), extensions="jpg", transform=test_tfm)

# Construct data loaders.
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=0, pin_memory=True)
valid_loader = DataLoader(valid_set, batch_size=batch_size, shuffle=True, num_workers=0, pin_memory=True)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)

## **Model**

The basic model here is simply a stack of convolutional layers followed by some fully-connected layers.

Since a color image has three channels (RGB), the network must take three input channels.
Across successive convolutional layers, the number of channels typically grows, while the height and width shrink (or remain unchanged, depending on hyperparameters such as stride and padding).
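
The effect of stride and padding on the spatial size can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `out_size` is made up for this example, and the layer settings are copied from the `Classifier` defined later in this notebook. It uses the standard output-size formula for convolution and pooling layers with dilation 1.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output height/width of a conv or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel_size, stride, padding), mirroring the Classifier in this notebook.
layers = [
    ("conv", 3, 1, 1), ("pool", 2, 2, 0),
    ("conv", 3, 1, 1), ("pool", 2, 2, 0),
    ("conv", 3, 1, 1), ("pool", 4, 4, 0),
]

size = 128  # input images are resized to 128 x 128
for name, kernel, stride, padding in layers:
    size = out_size(size, kernel, stride, padding)
    print(f"{name}: {size} x {size}")

# The last convolution has 256 output channels, so the flattened
# feature vector has 256 * size * size entries.
flat_features = 256 * size * size
print(flat_features)
```

With these settings each 3x3 convolution keeps the spatial size, and the three pooling layers reduce 128 to 8, which matches the `256 * 8 * 8` input size of the first fully-connected layer.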

Before being fed into the fully-connected layers, the feature map must be flattened into a single one-dimensional vector (for each image).
These features are then transformed by the fully-connected layers to produce the "logits" for each class.
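
To make the term "logits" concrete, here is a small pure-Python sketch of how one image's logits relate to class probabilities and to the cross-entropy loss. The logit values and the target class are invented for illustration; in the notebook, `nn.CrossEntropyLoss` performs the equivalent computation directly on the logits.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one image over the 11 food classes.
logits = [0.0] * 11
logits[3] = 2.0

probs = softmax(logits)

# The predicted class is the index of the largest logit (or probability).
predicted = max(range(len(probs)), key=probs.__getitem__)

# Cross-entropy for a hypothetical true label of 3: -log(p[target]).
target = 3
loss = -math.log(probs[target])
print(predicted, loss)
```

Because softmax preserves the ordering of the scores, taking the argmax of the logits gives the same prediction as taking the argmax of the probabilities.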

### **WARNING -- You Must Know**
You are free to modify the model architecture here for further improvement.
However, if you want to use some well-known architectures such as ResNet50, please make sure **NOT** to load the pre-trained weights.
Using such pre-trained models is considered cheating and therefore you will be punished.
Similarly, it is your responsibility to make sure no pre-trained weights are used if you use **torch.hub** to load any modules.

For example, if you use ResNet-18 as your model:

model = torchvision.models.resnet18(pretrained=**False**) → This is fine.

model = torchvision.models.resnet18(pretrained=**True**)  → This is **NOT** allowed.

In [6]:
class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        # The arguments for commonly used modules:
        # torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        # torch.nn.MaxPool2d(kernel_size, stride, padding)

        # input image size: [3, 128, 128]
        self.cnn_layers = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),

            nn.Conv2d(64, 128, 3, 1, 1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2, 2, 0),

            nn.Conv2d(128, 256, 3, 1, 1),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.MaxPool2d(4, 4, 0),

        )


        self.fc_layers = nn.Sequential(
            nn.Linear(256 * 8 * 8, 256 * 2 * 2),
            nn.ReLU(),
            nn.Dropout(),
            nn.Linear(256 * 2 * 2, 256),
            nn.ReLU(),
            nn.Dropout(),
            nn.Linear(256, 11)
        )

    def forward(self, x):
        # input (x): [batch_size, 3, 128, 128]
        # output: [batch_size, 11]

        # Extract features by convolutional layers.
        x = self.cnn_layers(x)

        # The extracted feature map must be flattened before going to fully-connected layers.
        x = x.flatten(1)

        # The features are transformed by fully-connected layers to obtain the final logits.
        x = self.fc_layers(x)
        return x

## **Training**

You can finish supervised learning by simply running the provided code without any modification.

The function "get_pseudo_labels" is used for semi-supervised learning.
Using the unlabeled data for semi-supervised learning is expected to improve performance.
However, you have to implement the function on your own and adjust several hyperparameters manually.

For more details about semi-supervised learning, please refer to [Prof. Lee's slides](https://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Lecture/semi%20(v3).pdf).

Again, please note that utilizing external data (or pre-trained models) for training is **prohibited**.
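
One possible way to approach the filtering step of "get_pseudo_labels" is sketched below. This is an assumption-laden illustration, not the reference solution: `PseudoLabeledSubset` and `select_confident` are hypothetical helpers, and the toy tensors stand in for the unlabeled dataset and for the softmax outputs of the model. The idea is to keep only samples whose highest class probability exceeds the threshold and to use the predicted class as the label.

```python
import torch
from torch.utils.data import Dataset, TensorDataset


class PseudoLabeledSubset(Dataset):
    """Hypothetical helper: selected items of `base` with replaced labels."""

    def __init__(self, base, indices, labels):
        self.base = base
        self.indices = list(indices)
        self.labels = list(labels)

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        img, _ = self.base[self.indices[i]]  # discard the original label
        return img, self.labels[i]


def select_confident(probs, threshold=0.65):
    """Return (indices, labels) of rows whose top probability exceeds threshold."""
    conf, labels = probs.max(dim=-1)
    keep = (conf > threshold).nonzero(as_tuple=True)[0]
    return keep.tolist(), labels[keep].tolist()


# Toy stand-ins for the unlabeled set and the model's softmax outputs.
images = torch.zeros(4, 3, 8, 8)
placeholder_labels = torch.zeros(4, dtype=torch.long)
toy_unlabeled = TensorDataset(images, placeholder_labels)
toy_probs = torch.tensor([
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.80, 0.10],
    [0.34, 0.33, 0.33],
])

indices, labels = select_confident(toy_probs)
pseudo_set = PseudoLabeledSubset(toy_unlabeled, indices, labels)
print(indices, labels, len(pseudo_set))
```

In the notebook, the probabilities would be collected batch by batch. Since the data loader inside "get_pseudo_labels" uses `shuffle=False`, positions within each batch can be mapped back to dataset indices by adding a running offset. Note also that if the unlabeled set applies random augmentations, the image seen at prediction time may differ from the one returned later during training.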

In [7]:
def get_pseudo_labels(dataset, model, threshold=0.65):
    # This function generates pseudo-labels of a dataset using the given model.
    # It returns an instance of DatasetFolder containing images whose prediction confidences exceed a given threshold.
    # You are NOT allowed to use any models trained on external data for pseudo-labeling.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Construct a data loader.
    data_loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)

    # Make sure the model is in eval mode.
    model.eval()
    # Define softmax function.
    softmax = nn.Softmax(dim=-1)

    # Iterate over the dataset by batches.
    for batch in tqdm(data_loader):
        img, _ = batch

        # Forward the data
        # Using torch.no_grad() accelerates the forward process.
        with torch.no_grad():
            logits = model(img.to(device))

        # Obtain the probability distributions by applying softmax on logits.
        probs = softmax(logits)

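        # ---------- TODO ----------
        # Filter the data and construct a new dataset.
        # NOTE: until this step is implemented, the function returns `dataset`
        # unchanged, including whatever labels DatasetFolder assigned from the
        # folder names, so no confidence filtering or relabeling takes place.

    # Turn off the eval mode.
    model.train()
    return dataset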

In [None]:
# "cuda" only when GPUs are available.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Initialize a model, and put it on the device specified.
model = Classifier().to(device)
model.device = device

# For the classification task, we use cross-entropy as the loss function.
criterion = nn.CrossEntropyLoss()

# Initialize the optimizer. You may tune hyperparameters such as the learning rate on your own.
optimizer = torch.optim.Adam(model.parameters(), lr=0.0003, weight_decay=1e-5)

# The number of training epochs.
n_epochs = 500

# Whether to do semi-supervised learning.
do_semi = True

for epoch in range(n_epochs):
    # ---------- TODO ----------
    # In each epoch, relabel the unlabeled dataset for semi-supervised learning.
    # Then you can combine the labeled dataset and pseudo-labeled dataset for the training.
    if do_semi:
        # Obtain pseudo-labels for unlabeled data using trained model.
        pseudo_set = get_pseudo_labels(unlabeled_set, model)

        # Construct a new dataset and a data loader for training.
        # This is used in semi-supervised learning only.
        concat_dataset = ConcatDataset([train_set, pseudo_set])
        train_loader = DataLoader(concat_dataset, batch_size=batch_size, shuffle=True, num_workers=0, pin_memory=True)

    # ---------- Training ----------
    # Make sure the model is in train mode before training.
    model.train()

    # These are used to record information in training.
    train_loss = []
    train_accs = []

    # Iterate the training set by batches.
    for batch in tqdm(train_loader):

        # A batch consists of image data and corresponding labels.
        imgs, labels = batch

        # Forward the data. (Make sure data and model are on the same device.)
        logits = model(imgs.to(device))

        # Calculate the cross-entropy loss.
        # We don't need to apply softmax before computing cross-entropy as it is done automatically.
        loss = criterion(logits, labels.to(device))

        # Gradients stored in the parameters in the previous step should be cleared out first.
        optimizer.zero_grad()

        # Compute the gradients for parameters.
        loss.backward()

        # Clip the gradient norms for stable training.
        grad_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)

        # Update the parameters with computed gradients.
        optimizer.step()

        # Compute the accuracy for current batch.
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

        # Record the loss and accuracy.
        train_loss.append(loss.item())
        train_accs.append(acc.item())

    # The average loss and accuracy over the training set are the averages of the recorded batch values.
    train_loss = sum(train_loss) / len(train_loss)
    train_acc = sum(train_accs) / len(train_accs)

    # Print the information.
    print(f"[ Train | {epoch + 1:03d}/{n_epochs:03d} ] loss = {train_loss:.5f}, acc = {train_acc:.5f}")

    # ---------- Validation ----------
    # Make sure the model is in eval mode so that modules such as dropout are disabled and batch normalization uses its running statistics.
    model.eval()

    # These are used to record information in validation.
    valid_loss = []
    valid_accs = []

    # Iterate the validation set by batches.
    for batch in tqdm(valid_loader):

        # A batch consists of image data and corresponding labels.
        imgs, labels = batch

        # We don't need gradient in validation.
        # Using torch.no_grad() accelerates the forward process.
        with torch.no_grad():
            logits = model(imgs.to(device))

        # We can still compute the loss (but not the gradient).
        loss = criterion(logits, labels.to(device))

        # Compute the accuracy for current batch.
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

        # Record the loss and accuracy.
        valid_loss.append(loss.item())
        valid_accs.append(acc.item())

    # The average loss and accuracy over the entire validation set are the averages of the recorded batch values.
    valid_loss = sum(valid_loss) / len(valid_loss)
    valid_acc = sum(valid_accs) / len(valid_accs)
    # Log all metrics in one call so that they share the same wandb step.
    wandb.log({"tr_loss": train_loss, "val_loss": valid_loss, "tr_acc": train_acc, "val_acc": valid_acc})

    # Print the information.
    print(f"[ Valid | {epoch + 1:03d}/{n_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f}")
wandb.finish()

  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 001/500 ] loss = 1.55072, acc = 0.67918


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 001/500 ] loss = 2.75576, acc = 0.09048


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 002/500 ] loss = 1.32565, acc = 0.71690


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 002/500 ] loss = 2.80932, acc = 0.08668


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 003/500 ] loss = 1.29852, acc = 0.71620


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 003/500 ] loss = 2.91493, acc = 0.08763


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 004/500 ] loss = 1.27883, acc = 0.71623


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 004/500 ] loss = 2.58066, acc = 0.10505


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 005/500 ] loss = 1.25584, acc = 0.71635


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 005/500 ] loss = 2.64472, acc = 0.09794


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 006/500 ] loss = 1.21917, acc = 0.71588


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 006/500 ] loss = 2.64226, acc = 0.12021


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 007/500 ] loss = 1.21448, acc = 0.71585


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 007/500 ] loss = 2.67343, acc = 0.11370


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 008/500 ] loss = 1.20896, acc = 0.71634


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 008/500 ] loss = 2.55402, acc = 0.13017


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 009/500 ] loss = 1.20024, acc = 0.71497


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 009/500 ] loss = 2.57802, acc = 0.11821


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 010/500 ] loss = 1.19482, acc = 0.71696


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 010/500 ] loss = 2.48479, acc = 0.13098


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 011/500 ] loss = 1.18034, acc = 0.71524


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 011/500 ] loss = 2.68524, acc = 0.12057


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 012/500 ] loss = 1.16290, acc = 0.71508


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 012/500 ] loss = 2.48616, acc = 0.12057


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 013/500 ] loss = 1.16895, acc = 0.71731


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 013/500 ] loss = 2.38346, acc = 0.13904


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 014/500 ] loss = 1.14640, acc = 0.71675


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 014/500 ] loss = 2.75571, acc = 0.10670


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 015/500 ] loss = 1.14797, acc = 0.71625


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 015/500 ] loss = 2.41719, acc = 0.13017


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 016/500 ] loss = 1.12425, acc = 0.71824


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 016/500 ] loss = 2.49560, acc = 0.16452


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 017/500 ] loss = 1.12625, acc = 0.71664


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 017/500 ] loss = 2.35221, acc = 0.12778


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 018/500 ] loss = 1.11711, acc = 0.71607


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 018/500 ] loss = 2.53414, acc = 0.11831


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 019/500 ] loss = 1.11291, acc = 0.71628


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 019/500 ] loss = 2.42598, acc = 0.11641


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 020/500 ] loss = 1.10572, acc = 0.71507


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 020/500 ] loss = 2.28764, acc = 0.12328


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 021/500 ] loss = 1.10510, acc = 0.71688


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 021/500 ] loss = 2.27399, acc = 0.12447


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 022/500 ] loss = 1.10556, acc = 0.71618


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 022/500 ] loss = 2.46203, acc = 0.12447


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 023/500 ] loss = 1.09488, acc = 0.71454


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 023/500 ] loss = 2.31202, acc = 0.10480


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 024/500 ] loss = 1.08596, acc = 0.71603


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 024/500 ] loss = 2.57838, acc = 0.13915


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 025/500 ] loss = 1.08247, acc = 0.71593


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 025/500 ] loss = 2.42531, acc = 0.13880


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 026/500 ] loss = 1.08543, acc = 0.71743


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 026/500 ] loss = 2.17846, acc = 0.14791


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 027/500 ] loss = 1.07708, acc = 0.71667


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 027/500 ] loss = 2.26750, acc = 0.12518


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 028/500 ] loss = 1.07346, acc = 0.71667


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 028/500 ] loss = 2.21338, acc = 0.12838


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 029/500 ] loss = 1.08294, acc = 0.71650


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 029/500 ] loss = 2.40791, acc = 0.11096


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 030/500 ] loss = 1.07145, acc = 0.71760


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 030/500 ] loss = 2.24208, acc = 0.14045


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 031/500 ] loss = 1.05763, acc = 0.71880


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 031/500 ] loss = 2.41943, acc = 0.12162


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 032/500 ] loss = 1.06964, acc = 0.71743


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 032/500 ] loss = 2.34662, acc = 0.12187


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 033/500 ] loss = 1.06165, acc = 0.71729


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 033/500 ] loss = 2.37488, acc = 0.11261


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 034/500 ] loss = 1.05704, acc = 0.71969


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 034/500 ] loss = 2.20349, acc = 0.13158


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 035/500 ] loss = 1.05020, acc = 0.71721


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 035/500 ] loss = 2.29991, acc = 0.12317


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 036/500 ] loss = 1.04216, acc = 0.71889


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 036/500 ] loss = 2.23058, acc = 0.13183


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 037/500 ] loss = 1.03934, acc = 0.72030


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 037/500 ] loss = 2.14547, acc = 0.13228


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 038/500 ] loss = 1.02853, acc = 0.71892


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 038/500 ] loss = 2.19254, acc = 0.13239


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 039/500 ] loss = 1.04340, acc = 0.71780


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 039/500 ] loss = 2.25754, acc = 0.13489


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 040/500 ] loss = 1.03299, acc = 0.71840


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 040/500 ] loss = 2.31559, acc = 0.11796


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 041/500 ] loss = 1.01621, acc = 0.72162


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 041/500 ] loss = 2.23387, acc = 0.14024


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 042/500 ] loss = 1.01886, acc = 0.72067


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 042/500 ] loss = 2.11504, acc = 0.13193


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 043/500 ] loss = 1.02327, acc = 0.72075


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 043/500 ] loss = 2.28635, acc = 0.13228


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 044/500 ] loss = 1.01747, acc = 0.72195


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 044/500 ] loss = 2.19416, acc = 0.13119


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 045/500 ] loss = 1.01858, acc = 0.71741


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 045/500 ] loss = 2.15061, acc = 0.14520


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 046/500 ] loss = 1.00785, acc = 0.71952


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 046/500 ] loss = 2.24403, acc = 0.12943


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 047/500 ] loss = 1.01625, acc = 0.72394


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 047/500 ] loss = 2.21912, acc = 0.12412


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 048/500 ] loss = 1.00679, acc = 0.71965


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 048/500 ] loss = 2.11587, acc = 0.15502


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 049/500 ] loss = 1.00457, acc = 0.72069


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 049/500 ] loss = 2.10085, acc = 0.12743


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 050/500 ] loss = 0.99253, acc = 0.72188


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 050/500 ] loss = 2.06518, acc = 0.15361


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 051/500 ] loss = 0.99405, acc = 0.71945


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 051/500 ] loss = 2.34211, acc = 0.11877


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 052/500 ] loss = 0.98797, acc = 0.72113


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 052/500 ] loss = 2.22367, acc = 0.14875


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 053/500 ] loss = 0.99614, acc = 0.72133


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 053/500 ] loss = 2.21140, acc = 0.11997


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 054/500 ] loss = 0.99615, acc = 0.72299


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 054/500 ] loss = 2.19659, acc = 0.12933


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 055/500 ] loss = 0.98973, acc = 0.72036


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 055/500 ] loss = 2.13810, acc = 0.16297


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 056/500 ] loss = 0.97864, acc = 0.72254


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 056/500 ] loss = 2.26717, acc = 0.14309


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 057/500 ] loss = 0.98691, acc = 0.72106


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 057/500 ] loss = 2.06807, acc = 0.16188


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 058/500 ] loss = 0.97842, acc = 0.72209


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 058/500 ] loss = 2.30423, acc = 0.13429


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 059/500 ] loss = 0.97883, acc = 0.72263


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 059/500 ] loss = 2.50731, acc = 0.11356


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 060/500 ] loss = 0.97208, acc = 0.72374


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 060/500 ] loss = 2.12801, acc = 0.16343


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 061/500 ] loss = 0.98298, acc = 0.72271


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 061/500 ] loss = 2.17153, acc = 0.13689


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 062/500 ] loss = 0.96922, acc = 0.72563


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 062/500 ] loss = 2.17821, acc = 0.13971


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 063/500 ] loss = 0.95892, acc = 0.72395


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 063/500 ] loss = 2.11479, acc = 0.15417


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 064/500 ] loss = 0.95803, acc = 0.72338


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 064/500 ] loss = 2.17050, acc = 0.14295


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 065/500 ] loss = 0.96513, acc = 0.72202


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 065/500 ] loss = 2.17881, acc = 0.14010


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 066/500 ] loss = 0.97074, acc = 0.72517


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 066/500 ] loss = 2.21316, acc = 0.15312


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 067/500 ] loss = 0.95782, acc = 0.72603


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 067/500 ] loss = 2.11209, acc = 0.15111


[ Train | 068/500 ] loss = 0.95692, acc = 0.72408
[ Valid | 068/500 ] loss = 2.24400, acc = 0.13795

[ Train | 069/500 ] loss = 0.95301, acc = 0.72568
[ Valid | 069/500 ] loss = 2.21428, acc = 0.13369

[ Train | 070/500 ] loss = 0.94723, acc = 0.72334
[ Valid | 070/500 ] loss = 2.18614, acc = 0.16508

[ Train | 071/500 ] loss = 0.95286, acc = 0.72687
[ Valid | 071/500 ] loss = 2.31725, acc = 0.14270

[ Train | 072/500 ] loss = 0.95309, acc = 0.72357
[ Valid | 072/500 ] loss = 2.18619, acc = 0.14344

[ Train | 073/500 ] loss = 0.94029, acc = 0.72593
[ Valid | 073/500 ] loss = 2.14039, acc = 0.17951

[ Train | 074/500 ] loss = 0.95292, acc = 0.72912
[ Valid | 074/500 ] loss = 2.54405, acc = 0.12658

[ Train | 075/500 ] loss = 0.93990, acc = 0.72256
[ Valid | 075/500 ] loss = 2.32106, acc = 0.14165

[ Train | 076/500 ] loss = 0.92902, acc = 0.72508
[ Valid | 076/500 ] loss = 2.22786, acc = 0.16283

[ Train | 077/500 ] loss = 0.92906, acc = 0.72214
[ Valid | 077/500 ] loss = 2.21902, acc = 0.14115

[ Train | 078/500 ] loss = 0.93317, acc = 0.72621
[ Valid | 078/500 ] loss = 2.30645, acc = 0.15041

[ Train | 079/500 ] loss = 0.93300, acc = 0.72414
[ Valid | 079/500 ] loss = 2.42016, acc = 0.13158

[ Train | 080/500 ] loss = 0.92865, acc = 0.72531
[ Valid | 080/500 ] loss = 2.29010, acc = 0.13299

[ Train | 081/500 ] loss = 0.92416, acc = 0.72667
[ Valid | 081/500 ] loss = 2.25526, acc = 0.14566

[ Train | 082/500 ] loss = 0.92311, acc = 0.72661
[ Valid | 082/500 ] loss = 2.15309, acc = 0.14305

[ Train | 083/500 ] loss = 0.93715, acc = 0.72702
[ Valid | 083/500 ] loss = 2.13771, acc = 0.19292

[ Train | 084/500 ] loss = 0.92414, acc = 0.72690
[ Valid | 084/500 ] loss = 2.21609, acc = 0.12588

[ Train | 085/500 ] loss = 0.93201, acc = 0.72933
[ Valid | 085/500 ] loss = 2.32243, acc = 0.11972

[ Train | 086/500 ] loss = 0.91954, acc = 0.72408
[ Valid | 086/500 ] loss = 2.06273, acc = 0.15892

[ Train | 087/500 ] loss = 0.91450, acc = 0.72947
[ Valid | 087/500 ] loss = 2.14814, acc = 0.20594

[ Train | 088/500 ] loss = 0.90937, acc = 0.72826
[ Valid | 088/500 ] loss = 2.42221, acc = 0.12838

[ Train | 089/500 ] loss = 0.90824, acc = 0.72524
[ Valid | 089/500 ] loss = 2.21678, acc = 0.13985

[ Train | 090/500 ] loss = 0.90818, acc = 0.72564
[ Valid | 090/500 ] loss = 2.11199, acc = 0.16459

[ Train | 091/500 ] loss = 0.91496, acc = 0.72764
[ Valid | 091/500 ] loss = 2.10485, acc = 0.17124

[ Train | 092/500 ] loss = 0.90897, acc = 0.72701
[ Valid | 092/500 ] loss = 2.39033, acc = 0.12694

[ Train | 093/500 ] loss = 0.90368, acc = 0.73079
[ Valid | 093/500 ] loss = 2.26107, acc = 0.14485

[ Train | 094/500 ] loss = 0.89902, acc = 0.72853
[ Valid | 094/500 ] loss = 2.20367, acc = 0.15171

[ Train | 095/500 ] loss = 0.89421, acc = 0.72936
[ Valid | 095/500 ] loss = 2.29639, acc = 0.15136

[ Train | 096/500 ] loss = 0.91242, acc = 0.72723
[ Valid | 096/500 ] loss = 2.20022, acc = 0.14590

[ Train | 097/500 ] loss = 0.90045, acc = 0.72740
[ Valid | 097/500 ] loss = 2.13285, acc = 0.15621

[ Train | 098/500 ] loss = 0.89674, acc = 0.72982
[ Valid | 098/500 ] loss = 2.13285, acc = 0.17114

[ Train | 099/500 ] loss = 0.90512, acc = 0.72892
[ Valid | 099/500 ] loss = 2.14605, acc = 0.15502

[ Train | 100/500 ] loss = 0.90287, acc = 0.72763
[ Valid | 100/500 ] loss = 2.12526, acc = 0.17444

[ Train | 101/500 ] loss = 0.89225, acc = 0.72993
[ Valid | 101/500 ] loss = 2.15898, acc = 0.15076

[ Train | 102/500 ] loss = 0.89596, acc = 0.73208
[ Valid | 102/500 ] loss = 2.15850, acc = 0.18176

[ Train | 103/500 ] loss = 0.89474, acc = 0.72912
[ Valid | 103/500 ] loss = 2.19494, acc = 0.18131

[ Train | 104/500 ] loss = 0.88238, acc = 0.73165
[ Valid | 104/500 ] loss = 2.13561, acc = 0.16498

[ Train | 105/500 ] loss = 0.87643, acc = 0.73287
[ Valid | 105/500 ] loss = 2.17027, acc = 0.19422

[ Train | 106/500 ] loss = 0.87919, acc = 0.72981
[ Valid | 106/500 ] loss = 2.10680, acc = 0.17054

[ Train | 107/500 ] loss = 0.88780, acc = 0.73068
[ Valid | 107/500 ] loss = 2.06803, acc = 0.20513

[ Train | 108/500 ] loss = 0.88032, acc = 0.72855
[ Valid | 108/500 ] loss = 2.21482, acc = 0.16227

[ Train | 109/500 ] loss = 0.87774, acc = 0.73069
[ Valid | 109/500 ] loss = 2.26486, acc = 0.17279

[ Train | 110/500 ] loss = 0.87242, acc = 0.72929
[ Valid | 110/500 ] loss = 2.14297, acc = 0.16853

[ Train | 111/500 ] loss = 0.87758, acc = 0.73089
[ Valid | 111/500 ] loss = 2.29334, acc = 0.15336

[ Train | 112/500 ] loss = 0.86392, acc = 0.73418
[ Valid | 112/500 ] loss = 2.18813, acc = 0.17846

[ Train | 113/500 ] loss = 0.86228, acc = 0.73141
[ Valid | 113/500 ] loss = 2.20132, acc = 0.13799

[ Train | 114/500 ] loss = 0.86582, acc = 0.73108
[ Valid | 114/500 ] loss = 2.20564, acc = 0.17399

[ Train | 115/500 ] loss = 0.86247, acc = 0.73192
[ Valid | 115/500 ] loss = 2.24007, acc = 0.16389

[ Train | 116/500 ] loss = 0.85968, acc = 0.73390
[ Valid | 116/500 ] loss = 2.29669, acc = 0.16568

[ Train | 117/500 ] loss = 0.85716, acc = 0.73156
[ Valid | 117/500 ] loss = 2.18611, acc = 0.16829

[ Train | 118/500 ] loss = 0.86156, acc = 0.73102
[ Valid | 118/500 ] loss = 2.13322, acc = 0.17539

[ Train | 119/500 ] loss = 0.85680, acc = 0.73194
[ Valid | 119/500 ] loss = 2.27053, acc = 0.14495

[ Train | 120/500 ] loss = 0.86806, acc = 0.73214
[ Valid | 120/500 ] loss = 2.13329, acc = 0.19897

[ Train | 121/500 ] loss = 0.86809, acc = 0.73317
[ Valid | 121/500 ] loss = 2.15018, acc = 0.16709

[ Train | 122/500 ] loss = 0.84375, acc = 0.73229
[ Valid | 122/500 ] loss = 2.47632, acc = 0.13133

[ Train | 123/500 ] loss = 0.85333, acc = 0.73325
[ Valid | 123/500 ] loss = 2.11458, acc = 0.17919

[ Train | 124/500 ] loss = 0.84125, acc = 0.72897
[ Valid | 124/500 ] loss = 2.28295, acc = 0.15586

[ Train | 125/500 ] loss = 0.85039, acc = 0.73368
[ Valid | 125/500 ] loss = 2.13762, acc = 0.17515

[ Train | 126/500 ] loss = 0.84217, acc = 0.73384
[ Valid | 126/500 ] loss = 2.23238, acc = 0.16318

[ Train | 127/500 ] loss = 0.85144, acc = 0.73112
[ Valid | 127/500 ] loss = 2.20342, acc = 0.15572

[ Train | 128/500 ] loss = 0.84938, acc = 0.73344
[ Valid | 128/500 ] loss = 2.39082, acc = 0.13834

[ Train | 129/500 ] loss = 0.85156, acc = 0.72902
[ Valid | 129/500 ] loss = 2.13081, acc = 0.17135

[ Train | 130/500 ] loss = 0.83692, acc = 0.73337
[ Valid | 130/500 ] loss = 2.26110, acc = 0.16653

[ Train | 131/500 ] loss = 0.84198, acc = 0.73194
[ Valid | 131/500 ] loss = 2.24021, acc = 0.14555

[ Train | 132/500 ] loss = 0.84405, acc = 0.73337
[ Valid | 132/500 ] loss = 2.29690, acc = 0.15688

[ Train | 133/500 ] loss = 0.84678, acc = 0.73174
[ Valid | 133/500 ] loss = 2.14303, acc = 0.17159

[ Train | 134/500 ] loss = 0.83725, acc = 0.73251
[ Valid | 134/500 ] loss = 2.33391, acc = 0.15583

[ Train | 135/500 ] loss = 0.83623, acc = 0.73254
[ Valid | 135/500 ] loss = 2.42790, acc = 0.15987

[ Train | 136/500 ] loss = 0.83348, acc = 0.73448
[ Valid | 136/500 ] loss = 2.20836, acc = 0.19232

[ Train | 137/500 ] loss = 0.82477, acc = 0.73358
[ Valid | 137/500 ] loss = 2.30556, acc = 0.15231

[ Train | 138/500 ] loss = 0.83297, acc = 0.73106
[ Valid | 138/500 ] loss = 2.17442, acc = 0.18771

[ Train | 139/500 ] loss = 0.83140, acc = 0.73222
[ Valid | 139/500 ] loss = 2.23131, acc = 0.17800

[ Train | 140/500 ] loss = 0.82216, acc = 0.73116
[ Valid | 140/500 ] loss = 2.29989, acc = 0.17740

[ Train | 141/500 ] loss = 0.81907, acc = 0.73717
[ Valid | 141/500 ] loss = 2.23682, acc = 0.17233

[ Train | 142/500 ] loss = 0.83448, acc = 0.73179
[ Valid | 142/500 ] loss = 2.35774, acc = 0.16614

[ Train | 143/500 ] loss = 0.81466, acc = 0.73719
[ Valid | 143/500 ] loss = 2.21219, acc = 0.20298

[ Train | 144/500 ] loss = 0.82614, acc = 0.73317
[ Valid | 144/500 ] loss = 2.23760, acc = 0.17891

[ Train | 145/500 ] loss = 0.83402, acc = 0.73217
[ Valid | 145/500 ] loss = 2.33933, acc = 0.15621

[ Train | 146/500 ] loss = 0.81999, acc = 0.73256
[ Valid | 146/500 ] loss = 2.21036, acc = 0.17870

[ Train | 147/500 ] loss = 0.82501, acc = 0.73581
[ Valid | 147/500 ] loss = 2.28808, acc = 0.16223

[ Train | 148/500 ] loss = 0.81540, acc = 0.73149
[ Valid | 148/500 ] loss = 2.18756, acc = 0.19493

[ Train | 149/500 ] loss = 0.81822, acc = 0.73500
[ Valid | 149/500 ] loss = 2.28358, acc = 0.16047

[ Train | 150/500 ] loss = 0.82006, acc = 0.73560
[ Valid | 150/500 ] loss = 2.21829, acc = 0.22146

[ Train | 151/500 ] loss = 0.81851, acc = 0.73474
[ Valid | 151/500 ] loss = 2.40001, acc = 0.17005

[ Train | 152/500 ] loss = 0.81886, acc = 0.73451
[ Valid | 152/500 ] loss = 2.25846, acc = 0.17349

[ Train | 153/500 ] loss = 0.83017, acc = 0.72846
[ Valid | 153/500 ] loss = 2.15164, acc = 0.19813

[ Train | 154/500 ] loss = 0.81270, acc = 0.73451
[ Valid | 154/500 ] loss = 2.43097, acc = 0.15783

[ Train | 155/500 ] loss = 0.81205, acc = 0.73683
[ Valid | 155/500 ] loss = 2.25438, acc = 0.20749

[ Train | 156/500 ] loss = 0.81461, acc = 0.73443
[ Valid | 156/500 ] loss = 2.19242, acc = 0.17539

[ Train | 157/500 ] loss = 0.80671, acc = 0.73851
[ Valid | 157/500 ] loss = 2.22786, acc = 0.19623

[ Train | 158/500 ] loss = 0.81006, acc = 0.73803
[ Valid | 158/500 ] loss = 2.26139, acc = 0.17729

[ Train | 159/500 ] loss = 0.81227, acc = 0.73426
[ Valid | 159/500 ] loss = 2.17847, acc = 0.19496

[ Train | 160/500 ] loss = 0.80592, acc = 0.73818
[ Valid | 160/500 ] loss = 2.35637, acc = 0.18250

[ Train | 161/500 ] loss = 0.80349, acc = 0.73763
[ Valid | 161/500 ] loss = 2.25857, acc = 0.18155

[ Train | 162/500 ] loss = 0.80565, acc = 0.73464
[ Valid | 162/500 ] loss = 2.33557, acc = 0.19623

[ Train | 163/500 ] loss = 0.79548, acc = 0.73972
[ Valid | 163/500 ] loss = 2.24902, acc = 0.16853

[ Train | 164/500 ] loss = 0.79810, acc = 0.73282
[ Valid | 164/500 ] loss = 2.22629, acc = 0.22030

[ Train | 165/500 ] loss = 0.80080, acc = 0.73767
[ Valid | 165/500 ] loss = 2.27661, acc = 0.17019

[ Train | 166/500 ] loss = 0.80552, acc = 0.73624
[ Valid | 166/500 ] loss = 2.35944, acc = 0.15171

[ Train | 167/500 ] loss = 0.79876, acc = 0.73553
[ Valid | 167/500 ] loss = 2.34069, acc = 0.16924

[ Train | 168/500 ] loss = 0.80457, acc = 0.73839
[ Valid | 168/500 ] loss = 2.49086, acc = 0.17279

[ Train | 169/500 ] loss = 0.80245, acc = 0.73897
[ Valid | 169/500 ] loss = 2.28386, acc = 0.20298

[ Train | 170/500 ] loss = 0.78937, acc = 0.73687
[ Valid | 170/500 ] loss = 2.25565, acc = 0.19683

[ Train | 171/500 ] loss = 0.78765, acc = 0.73823
[ Valid | 171/500 ] loss = 2.33902, acc = 0.18261

[ Train | 172/500 ] loss = 0.78865, acc = 0.73624
[ Valid | 172/500 ] loss = 2.24062, acc = 0.18391

[ Train | 173/500 ] loss = 0.79194, acc = 0.73838
[ Valid | 173/500 ] loss = 2.11681, acc = 0.21340

[ Train | 174/500 ] loss = 0.79011, acc = 0.73566
[ Valid | 174/500 ] loss = 2.17242, acc = 0.22691

[ Train | 175/500 ] loss = 0.79578, acc = 0.73846
[ Valid | 175/500 ] loss = 2.40165, acc = 0.15192

[ Train | 176/500 ] loss = 0.79769, acc = 0.73747
[ Valid | 176/500 ] loss = 2.30917, acc = 0.17870

[ Train | 177/500 ] loss = 0.80498, acc = 0.73865
[ Valid | 177/500 ] loss = 2.25068, acc = 0.21139

[ Train | 178/500 ] loss = 0.79317, acc = 0.73938
[ Valid | 178/500 ] loss = 2.36650, acc = 0.16568

[ Train | 179/500 ] loss = 0.78515, acc = 0.73880
[ Valid | 179/500 ] loss = 2.27400, acc = 0.17870

[ Train | 180/500 ] loss = 0.77448, acc = 0.73636
[ Valid | 180/500 ] loss = 2.29410, acc = 0.15607

[ Train | 181/500 ] loss = 0.78323, acc = 0.73900
[ Valid | 181/500 ] loss = 2.33636, acc = 0.17905

[ Train | 182/500 ] loss = 0.78092, acc = 0.74046
[ Valid | 182/500 ] loss = 2.16520, acc = 0.18736

[ Train | 183/500 ] loss = 0.78457, acc = 0.73704
[ Valid | 183/500 ] loss = 2.28843, acc = 0.16850

[ Train | 184/500 ] loss = 0.77820, acc = 0.74054
[ Valid | 184/500 ] loss = 2.29729, acc = 0.19943

[ Train | 185/500 ] loss = 0.78131, acc = 0.73938
[ Valid | 185/500 ] loss = 2.30331, acc = 0.17170

[ Train | 186/500 ] loss = 0.77520, acc = 0.74023
[ Valid | 186/500 ] loss = 2.35967, acc = 0.17680

[ Train | 187/500 ] loss = 0.78354, acc = 0.73959
[ Valid | 187/500 ] loss = 2.38218, acc = 0.16568

[ Train | 188/500 ] loss = 0.77764, acc = 0.73750
[ Valid | 188/500 ] loss = 2.37365, acc = 0.17539

[ Train | 189/500 ] loss = 0.77436, acc = 0.73689
[ Valid | 189/500 ] loss = 2.39600, acc = 0.16698

[ Train | 190/500 ] loss = 0.77882, acc = 0.73523
[ Valid | 190/500 ] loss = 2.22922, acc = 0.18511

[ Train | 191/500 ] loss = 0.76286, acc = 0.73860
[ Valid | 191/500 ] loss = 2.45343, acc = 0.16199

[ Train | 192/500 ] loss = 0.76499, acc = 0.73961
[ Valid | 192/500 ] loss = 2.35472, acc = 0.16153

[ Train | 193/500 ] loss = 0.77838, acc = 0.73826
[ Valid | 193/500 ] loss = 2.49551, acc = 0.16973

[ Train | 194/500 ] loss = 0.76011, acc = 0.74451
[ Valid | 194/500 ] loss = 2.40202, acc = 0.18190

[ Train | 195/500 ] loss = 0.76652, acc = 0.73915
[ Valid | 195/500 ] loss = 2.29819, acc = 0.19292

[ Train | 196/500 ] loss = 0.76055, acc = 0.74055
[ Valid | 196/500 ] loss = 2.26967, acc = 0.21069

[ Train | 197/500 ] loss = 0.76434, acc = 0.74164
[ Valid | 197/500 ] loss = 2.42071, acc = 0.17680

[ Train | 198/500 ] loss = 0.77737, acc = 0.73988
[ Valid | 198/500 ] loss = 2.32612, acc = 0.19767

[ Train | 199/500 ] loss = 0.76715, acc = 0.73849
[ Valid | 199/500 ] loss = 2.31825, acc = 0.18937

[ Train | 200/500 ] loss = 0.76191, acc = 0.73869
[ Valid | 200/500 ] loss = 2.28252, acc = 0.19482

[ Train | 201/500 ] loss = 0.77246, acc = 0.74076
[ Valid | 201/500 ] loss = 2.36176, acc = 0.17279

[ Train | 202/500 ] loss = 0.75954, acc = 0.73825
[ Valid | 202/500 ] loss = 2.42315, acc = 0.16744

[ Train | 203/500 ] loss = 0.76068, acc = 0.73871
[ Valid | 203/500 ] loss = 2.19882, acc = 0.21425

[ Train | 204/500 ] loss = 0.75597, acc = 0.73963
[ Valid | 204/500 ] loss = 2.16828, acc = 0.20027

[ Train | 205/500 ] loss = 0.74915, acc = 0.74079
[ Valid | 205/500 ] loss = 2.28316, acc = 0.18110

[ Train | 206/500 ] loss = 0.76079, acc = 0.74046
[ Valid | 206/500 ] loss = 2.59624, acc = 0.15101

[ Train | 207/500 ] loss = 0.75208, acc = 0.74254
[ Valid | 207/500 ] loss = 2.46809, acc = 0.15977

[ Train | 208/500 ] loss = 0.75413, acc = 0.74191
[ Valid | 208/500 ] loss = 2.21617, acc = 0.19021

[ Train | 209/500 ] loss = 0.76325, acc = 0.74104
[ Valid | 209/500 ] loss = 2.43871, acc = 0.17089

[ Train | 210/500 ] loss = 0.74786, acc = 0.74368
[ Valid | 210/500 ] loss = 2.31827, acc = 0.17810

[ Train | 211/500 ] loss = 0.74392, acc = 0.74202
[ Valid | 211/500 ] loss = 2.37323, acc = 0.19753

[ Train | 212/500 ] loss = 0.74313, acc = 0.74181
[ Valid | 212/500 ] loss = 2.44673, acc = 0.18380

[ Train | 213/500 ] loss = 0.75092, acc = 0.73942
[ Valid | 213/500 ] loss = 2.31780, acc = 0.19211

[ Train | 214/500 ] loss = 0.74281, acc = 0.74059
[ Valid | 214/500 ] loss = 2.48801, acc = 0.18630

[ Train | 215/500 ] loss = 0.75722, acc = 0.74258
[ Valid | 215/500 ] loss = 2.36371, acc = 0.18887

[ Train | 216/500 ] loss = 0.76096, acc = 0.73836
[ Valid | 216/500 ] loss = 2.36874, acc = 0.18641

[ Train | 217/500 ] loss = 0.74892, acc = 0.74265
[ Valid | 217/500 ] loss = 2.41887, acc = 0.19542

  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 218/500 ] loss = 0.75142, acc = 0.73978


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 218/500 ] loss = 2.35026, acc = 0.17304


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 219/500 ] loss = 0.73935, acc = 0.74026


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 219/500 ] loss = 2.44059, acc = 0.20443


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 220/500 ] loss = 0.73765, acc = 0.73939


[ Valid | 220/500 ] loss = 2.28625, acc = 0.20854


[ Train | 221/500 ] loss = 0.74716, acc = 0.74247


[ Valid | 221/500 ] loss = 2.26821, acc = 0.17705


[ Train | 222/500 ] loss = 0.75515, acc = 0.74030


[ Valid | 222/500 ] loss = 2.33842, acc = 0.19552


[ Train | 223/500 ] loss = 0.74028, acc = 0.74147


[ Valid | 223/500 ] loss = 2.40702, acc = 0.17089


[ Train | 224/500 ] loss = 0.73933, acc = 0.74490


[ Valid | 224/500 ] loss = 2.36498, acc = 0.20914


[ Train | 225/500 ] loss = 0.73006, acc = 0.74666


[ Valid | 225/500 ] loss = 2.33412, acc = 0.19883


[ Train | 226/500 ] loss = 0.72604, acc = 0.74139


[ Valid | 226/500 ] loss = 2.37218, acc = 0.19693


[ Train | 227/500 ] loss = 0.73178, acc = 0.74795


[ Valid | 227/500 ] loss = 2.51512, acc = 0.16424


[ Train | 228/500 ] loss = 0.73913, acc = 0.74400


[ Valid | 228/500 ] loss = 2.50412, acc = 0.17374


[ Train | 229/500 ] loss = 0.73307, acc = 0.74046


[ Valid | 229/500 ] loss = 2.44076, acc = 0.18345


[ Train | 230/500 ] loss = 0.74705, acc = 0.74291


[ Valid | 230/500 ] loss = 2.23714, acc = 0.22171


[ Train | 231/500 ] loss = 0.73981, acc = 0.74016


[ Valid | 231/500 ] loss = 2.48288, acc = 0.17184


[ Train | 232/500 ] loss = 0.73109, acc = 0.74118


[ Valid | 232/500 ] loss = 2.48046, acc = 0.16568


[ Train | 233/500 ] loss = 0.72481, acc = 0.74132


[ Valid | 233/500 ] loss = 2.58977, acc = 0.17835


[ Train | 234/500 ] loss = 0.74267, acc = 0.74269


[ Valid | 234/500 ] loss = 2.62820, acc = 0.16888


[ Train | 235/500 ] loss = 0.73505, acc = 0.74310


[ Valid | 235/500 ] loss = 2.49315, acc = 0.17620


[ Train | 236/500 ] loss = 0.73856, acc = 0.74372


[ Valid | 236/500 ] loss = 2.32461, acc = 0.20049


[ Train | 237/500 ] loss = 0.72843, acc = 0.74381


[ Valid | 237/500 ] loss = 2.59857, acc = 0.16744


[ Train | 238/500 ] loss = 0.73080, acc = 0.74427


[ Valid | 238/500 ] loss = 2.56360, acc = 0.17846


[ Train | 239/500 ] loss = 0.72072, acc = 0.74573


[ Valid | 239/500 ] loss = 2.39052, acc = 0.18176


[ Train | 240/500 ] loss = 0.72668, acc = 0.74191


[ Valid | 240/500 ] loss = 2.41518, acc = 0.17539


[ Train | 241/500 ] loss = 0.72439, acc = 0.74181


[ Valid | 241/500 ] loss = 2.36276, acc = 0.17765


[ Train | 242/500 ] loss = 0.72092, acc = 0.74550


[ Valid | 242/500 ] loss = 2.46009, acc = 0.17219


[ Train | 243/500 ] loss = 0.73196, acc = 0.74421


[ Valid | 243/500 ] loss = 2.30215, acc = 0.20914


[ Train | 244/500 ] loss = 0.72082, acc = 0.74187


[ Valid | 244/500 ] loss = 2.26884, acc = 0.21104


[ Train | 245/500 ] loss = 0.72416, acc = 0.74580


[ Valid | 245/500 ] loss = 2.52753, acc = 0.17905


[ Train | 246/500 ] loss = 0.71522, acc = 0.74636


[ Valid | 246/500 ] loss = 2.57365, acc = 0.21235


[ Train | 247/500 ] loss = 0.71881, acc = 0.74533


[ Valid | 247/500 ] loss = 2.34775, acc = 0.20158


[ Train | 248/500 ] loss = 0.71549, acc = 0.74717


[ Valid | 248/500 ] loss = 2.48280, acc = 0.21249


[ Train | 249/500 ] loss = 0.71520, acc = 0.74737


[ Valid | 249/500 ] loss = 2.40180, acc = 0.20358


[ Train | 250/500 ] loss = 0.71236, acc = 0.74587


[ Valid | 250/500 ] loss = 2.54416, acc = 0.20904


[ Train | 251/500 ] loss = 0.71820, acc = 0.74649


[ Valid | 251/500 ] loss = 2.57091, acc = 0.15938


[ Train | 252/500 ] loss = 0.72015, acc = 0.74749


[ Valid | 252/500 ] loss = 2.40632, acc = 0.22016


[ Train | 253/500 ] loss = 0.72207, acc = 0.74315


[ Valid | 253/500 ] loss = 2.47471, acc = 0.19447


[ Train | 254/500 ] loss = 0.71423, acc = 0.74561


[ Valid | 254/500 ] loss = 2.63297, acc = 0.18521


[ Train | 255/500 ] loss = 0.71732, acc = 0.74561


[ Valid | 255/500 ] loss = 2.48218, acc = 0.18416


[ Train | 256/500 ] loss = 0.70175, acc = 0.74690


[ Valid | 256/500 ] loss = 2.40327, acc = 0.16663


[ Train | 257/500 ] loss = 0.71969, acc = 0.74590


[ Valid | 257/500 ] loss = 2.32840, acc = 0.19753


[ Train | 258/500 ] loss = 0.71824, acc = 0.74580


[ Valid | 258/500 ] loss = 2.59771, acc = 0.19482


[ Train | 259/500 ] loss = 0.71210, acc = 0.74594


[ Valid | 259/500 ] loss = 2.61286, acc = 0.14541


[ Train | 260/500 ] loss = 0.71531, acc = 0.74235


[ Valid | 260/500 ] loss = 2.55428, acc = 0.16603


[ Train | 261/500 ] loss = 0.71851, acc = 0.74939


[ Valid | 261/500 ] loss = 2.47288, acc = 0.18356


[ Train | 262/500 ] loss = 0.70886, acc = 0.74900


[ Valid | 262/500 ] loss = 2.48124, acc = 0.18296


[ Train | 263/500 ] loss = 0.70651, acc = 0.74735


[ Valid | 263/500 ] loss = 2.46062, acc = 0.18996


[ Train | 264/500 ] loss = 0.70743, acc = 0.74599


[ Valid | 264/500 ] loss = 2.59571, acc = 0.19542


[ Train | 265/500 ] loss = 0.69934, acc = 0.74718


[ Valid | 265/500 ] loss = 2.26245, acc = 0.22952


[ Train | 266/500 ] loss = 0.70222, acc = 0.74939


[ Valid | 266/500 ] loss = 2.52416, acc = 0.17159


[ Train | 267/500 ] loss = 0.71125, acc = 0.74670


[ Valid | 267/500 ] loss = 2.43897, acc = 0.20774


[ Train | 268/500 ] loss = 0.69485, acc = 0.75230


[ Valid | 268/500 ] loss = 2.51831, acc = 0.17575


[ Train | 269/500 ] loss = 0.70286, acc = 0.74421


[ Valid | 269/500 ] loss = 2.47119, acc = 0.19661


[ Train | 270/500 ] loss = 0.69702, acc = 0.74370


[ Valid | 270/500 ] loss = 2.68010, acc = 0.17244


[ Train | 271/500 ] loss = 0.69238, acc = 0.74703


[ Valid | 271/500 ] loss = 2.54521, acc = 0.17881


[ Train | 272/500 ] loss = 0.70432, acc = 0.74766


[ Valid | 272/500 ] loss = 2.49509, acc = 0.19081


[ Train | 273/500 ] loss = 0.68979, acc = 0.74946


[ Valid | 273/500 ] loss = 2.57274, acc = 0.19387


[ Train | 274/500 ] loss = 0.71144, acc = 0.74448


[ Valid | 274/500 ] loss = 2.52860, acc = 0.19503


[ Train | 275/500 ] loss = 0.69230, acc = 0.74656


[ Valid | 275/500 ] loss = 2.50617, acc = 0.19112


[ Train | 276/500 ] loss = 0.69297, acc = 0.74994


[ Valid | 276/500 ] loss = 2.66685, acc = 0.17290


[ Train | 277/500 ] loss = 0.68956, acc = 0.74924


[ Valid | 277/500 ] loss = 2.58596, acc = 0.18486


[ Train | 278/500 ] loss = 0.69217, acc = 0.74773


[ Valid | 278/500 ] loss = 2.52371, acc = 0.23167


[ Train | 279/500 ] loss = 0.69881, acc = 0.74609


[ Valid | 279/500 ] loss = 2.60757, acc = 0.17265


[ Train | 280/500 ] loss = 0.68216, acc = 0.74966


[ Valid | 280/500 ] loss = 2.64141, acc = 0.20914


[ Train | 281/500 ] loss = 0.69059, acc = 0.74669


[ Valid | 281/500 ] loss = 2.51824, acc = 0.17775


[ Train | 282/500 ] loss = 0.67955, acc = 0.75188


[ Valid | 282/500 ] loss = 2.62368, acc = 0.18155


[ Train | 283/500 ] loss = 0.70145, acc = 0.74993


[ Valid | 283/500 ] loss = 2.61345, acc = 0.19352


[ Train | 284/500 ] loss = 0.69377, acc = 0.75069


[ Valid | 284/500 ] loss = 2.48231, acc = 0.18711


[ Train | 285/500 ] loss = 0.68040, acc = 0.75265


[ Valid | 285/500 ] loss = 2.56288, acc = 0.20073


[ Train | 286/500 ] loss = 0.70237, acc = 0.74463


[ Valid | 286/500 ] loss = 2.50417, acc = 0.17800


[ Train | 287/500 ] loss = 0.68404, acc = 0.74830


[ Valid | 287/500 ] loss = 2.51111, acc = 0.18841


[ Train | 288/500 ] loss = 0.68206, acc = 0.75131


[ Valid | 288/500 ] loss = 2.55233, acc = 0.20830


[ Train | 289/500 ] loss = 0.68235, acc = 0.75007


[ Valid | 289/500 ] loss = 2.47193, acc = 0.20003


[ Train | 290/500 ] loss = 0.69196, acc = 0.75079


[ Valid | 290/500 ] loss = 2.65452, acc = 0.18426


[ Train | 291/500 ] loss = 0.69048, acc = 0.74961


[ Valid | 291/500 ] loss = 2.52194, acc = 0.20418


[ Train | 292/500 ] loss = 0.68211, acc = 0.74965


[ Valid | 292/500 ] loss = 2.57473, acc = 0.20098


[ Train | 293/500 ] loss = 0.68465, acc = 0.75065


[ Valid | 293/500 ] loss = 2.45088, acc = 0.20524


[ Train | 294/500 ] loss = 0.68791, acc = 0.75109


[ Valid | 294/500 ] loss = 2.46467, acc = 0.21755


[ Train | 295/500 ] loss = 0.66639, acc = 0.75164


[ Valid | 295/500 ] loss = 2.53450, acc = 0.20063


[ Train | 296/500 ] loss = 0.67898, acc = 0.74307


[ Valid | 296/500 ] loss = 2.64479, acc = 0.19683


[ Train | 297/500 ] loss = 0.67047, acc = 0.75115


[ Valid | 297/500 ] loss = 2.71051, acc = 0.18190


[ Train | 298/500 ] loss = 0.67666, acc = 0.75150


[ Valid | 298/500 ] loss = 2.56640, acc = 0.19514


[ Train | 299/500 ] loss = 0.67515, acc = 0.74941


[ Valid | 299/500 ] loss = 2.68060, acc = 0.20249


[ Train | 300/500 ] loss = 0.67555, acc = 0.75419


[ Valid | 300/500 ] loss = 2.57830, acc = 0.21034


[ Train | 301/500 ] loss = 0.67549, acc = 0.74975


[ Valid | 301/500 ] loss = 2.62920, acc = 0.16568


[ Train | 302/500 ] loss = 0.67649, acc = 0.75284


[ Valid | 302/500 ] loss = 2.57262, acc = 0.22582


[ Train | 303/500 ] loss = 0.68548, acc = 0.75062


[ Valid | 303/500 ] loss = 2.52096, acc = 0.20974


[ Train | 304/500 ] loss = 0.66593, acc = 0.75513


[ Valid | 304/500 ] loss = 2.72751, acc = 0.17645


[ Train | 305/500 ] loss = 0.67042, acc = 0.75351


[ Valid | 305/500 ] loss = 2.57137, acc = 0.20949


[ Train | 306/500 ] loss = 0.66215, acc = 0.75451


[ Valid | 306/500 ] loss = 2.55974, acc = 0.19732


[ Train | 307/500 ] loss = 0.67826, acc = 0.75347


[ Valid | 307/500 ] loss = 2.75666, acc = 0.22100


[ Train | 308/500 ] loss = 0.66664, acc = 0.75623


[ Valid | 308/500 ] loss = 2.54120, acc = 0.23937


[ Train | 309/500 ] loss = 0.66928, acc = 0.75396


[ Valid | 309/500 ] loss = 2.64347, acc = 0.20098


[ Train | 310/500 ] loss = 0.66799, acc = 0.75523


[ Valid | 310/500 ] loss = 2.79442, acc = 0.16283


[ Train | 311/500 ] loss = 0.66767, acc = 0.75570


[ Valid | 311/500 ] loss = 2.53429, acc = 0.20179


[ Train | 312/500 ] loss = 0.67173, acc = 0.75288


[ Valid | 312/500 ] loss = 2.65253, acc = 0.18806


[ Train | 313/500 ] loss = 0.66794, acc = 0.75626


[ Valid | 313/500 ] loss = 2.68777, acc = 0.20678


[ Train | 314/500 ] loss = 0.65655, acc = 0.75102


[ Valid | 314/500 ] loss = 2.64043, acc = 0.21090


[ Train | 315/500 ] loss = 0.66701, acc = 0.75766


[ Valid | 315/500 ] loss = 2.40104, acc = 0.21921


[ Train | 316/500 ] loss = 0.67413, acc = 0.75383


[ Valid | 316/500 ] loss = 2.54987, acc = 0.22252


[ Train | 317/500 ] loss = 0.66385, acc = 0.74848


[ Valid | 317/500 ] loss = 2.85570, acc = 0.20133


[ Train | 318/500 ] loss = 0.67070, acc = 0.75244


[ Valid | 318/500 ] loss = 2.66052, acc = 0.19056


[ Train | 319/500 ] loss = 0.66450, acc = 0.75500


[ Valid | 319/500 ] loss = 2.40799, acc = 0.22311


[ Train | 320/500 ] loss = 0.65251, acc = 0.75489


[ Valid | 320/500 ] loss = 2.57217, acc = 0.21435


[ Train | 321/500 ] loss = 0.66068, acc = 0.75532


[ Valid | 321/500 ] loss = 2.45788, acc = 0.19007


[ Train | 322/500 ] loss = 0.65585, acc = 0.75317


[ Valid | 322/500 ] loss = 2.71038, acc = 0.22051


[ Train | 323/500 ] loss = 0.65499, acc = 0.75444


[ Valid | 323/500 ] loss = 2.60698, acc = 0.20038


[ Train | 324/500 ] loss = 0.66771, acc = 0.75709


[ Valid | 324/500 ] loss = 2.59006, acc = 0.20854


[ Train | 325/500 ] loss = 0.66149, acc = 0.75314


[ Valid | 325/500 ] loss = 2.76251, acc = 0.20214


[ Train | 326/500 ] loss = 0.65537, acc = 0.75632


[ Valid | 326/500 ] loss = 2.53869, acc = 0.20063


[ Train | 327/500 ] loss = 0.64821, acc = 0.75936


[ Valid | 327/500 ] loss = 2.55987, acc = 0.22748


[ Train | 328/500 ] loss = 0.65615, acc = 0.75706


[ Valid | 328/500 ] loss = 2.63110, acc = 0.18486


[ Train | 329/500 ] loss = 0.65234, acc = 0.75509


[ Valid | 329/500 ] loss = 2.77448, acc = 0.18901


[ Train | 330/500 ] loss = 0.65731, acc = 0.75791


[ Valid | 330/500 ] loss = 2.63776, acc = 0.17916


[ Train | 331/500 ] loss = 0.64934, acc = 0.75240


[ Valid | 331/500 ] loss = 2.44732, acc = 0.20168


[ Train | 332/500 ] loss = 0.66302, acc = 0.75253


[ Valid | 332/500 ] loss = 2.85354, acc = 0.19303


[ Train | 333/500 ] loss = 0.65364, acc = 0.75858


[ Valid | 333/500 ] loss = 2.67323, acc = 0.20879


[ Train | 334/500 ] loss = 0.65415, acc = 0.75748


[ Valid | 334/500 ] loss = 2.67431, acc = 0.18120


[ Train | 335/500 ] loss = 0.64691, acc = 0.75567


[ Valid | 335/500 ] loss = 2.74433, acc = 0.18497


[ Train | 336/500 ] loss = 0.65627, acc = 0.75208


[ Valid | 336/500 ] loss = 2.70575, acc = 0.18014


[ Train | 337/500 ] loss = 0.64913, acc = 0.75618


[ Valid | 337/500 ] loss = 2.65218, acc = 0.21460


[ Train | 338/500 ] loss = 0.65619, acc = 0.75731


[ Valid | 338/500 ] loss = 2.75168, acc = 0.18817


[ Train | 339/500 ] loss = 0.64094, acc = 0.76024


[ Valid | 339/500 ] loss = 2.73180, acc = 0.22501


[ Train | 340/500 ] loss = 0.64910, acc = 0.75330


[ Valid | 340/500 ] loss = 2.68334, acc = 0.20534


[ Train | 341/500 ] loss = 0.64943, acc = 0.75809


[ Valid | 341/500 ] loss = 2.77666, acc = 0.18166


[ Train | 342/500 ] loss = 0.64896, acc = 0.75922


[ Valid | 342/500 ] loss = 2.82757, acc = 0.20988


[ Train | 343/500 ] loss = 0.64640, acc = 0.75682


[ Valid | 343/500 ] loss = 2.67037, acc = 0.20893


[ Train | 344/500 ] loss = 0.65207, acc = 0.75732


[ Valid | 344/500 ] loss = 2.68223, acc = 0.21115


[ Train | 345/500 ] loss = 0.64660, acc = 0.75341


[ Valid | 345/500 ] loss = 2.93335, acc = 0.15632


[ Train | 346/500 ] loss = 0.64386, acc = 0.75853


[ Valid | 346/500 ] loss = 2.57815, acc = 0.20122


[ Train | 347/500 ] loss = 0.63281, acc = 0.75899


[ Valid | 347/500 ] loss = 2.80389, acc = 0.19137


[ Train | 348/500 ] loss = 0.63439, acc = 0.75662


[ Valid | 348/500 ] loss = 2.71363, acc = 0.21791


[ Train | 349/500 ] loss = 0.63962, acc = 0.75756


[ Valid | 349/500 ] loss = 2.67427, acc = 0.18746


[ Train | 350/500 ] loss = 0.64599, acc = 0.75822


[ Valid | 350/500 ] loss = 2.94620, acc = 0.18247


[ Train | 351/500 ] loss = 0.65100, acc = 0.75748


[ Valid | 351/500 ] loss = 2.82925, acc = 0.18486


[ Train | 352/500 ] loss = 0.64955, acc = 0.75434


[ Valid | 352/500 ] loss = 2.72402, acc = 0.18701


[ Train | 353/500 ] loss = 0.64801, acc = 0.75883


[ Valid | 353/500 ] loss = 2.78241, acc = 0.20203


[ Train | 354/500 ] loss = 0.63344, acc = 0.76051


[ Valid | 354/500 ] loss = 2.72953, acc = 0.22206


[ Train | 355/500 ] loss = 0.64570, acc = 0.75733


[ Valid | 355/500 ] loss = 2.73103, acc = 0.19278


[ Train | 356/500 ] loss = 0.64111, acc = 0.75874


[ Valid | 356/500 ] loss = 2.86592, acc = 0.19897


[ Train | 357/500 ] loss = 0.63719, acc = 0.75815


[ Valid | 357/500 ] loss = 2.83874, acc = 0.20119


[ Train | 358/500 ] loss = 0.64177, acc = 0.76167


[ Valid | 358/500 ] loss = 2.73694, acc = 0.21826


[ Train | 359/500 ] loss = 0.63443, acc = 0.76042


[ Valid | 359/500 ] loss = 2.77177, acc = 0.20559


[ Train | 360/500 ] loss = 0.63055, acc = 0.75994


[ Valid | 360/500 ] loss = 2.70293, acc = 0.22062


[ Train | 361/500 ] loss = 0.61951, acc = 0.76406


[ Valid | 361/500 ] loss = 2.78501, acc = 0.22846


[ Train | 362/500 ] loss = 0.62425, acc = 0.76405


[ Valid | 362/500 ] loss = 2.60243, acc = 0.21565


[ Train | 363/500 ] loss = 0.61239, acc = 0.76267


[ Valid | 363/500 ] loss = 2.85191, acc = 0.20358


[ Train | 364/500 ] loss = 0.62324, acc = 0.76455


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 364/500 ] loss = 3.05493, acc = 0.17575


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 365/500 ] loss = 0.63758, acc = 0.75865


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 365/500 ] loss = 2.85476, acc = 0.23994


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 366/500 ] loss = 0.63187, acc = 0.76268


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 366/500 ] loss = 2.82717, acc = 0.21850


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 367/500 ] loss = 0.63146, acc = 0.76446


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 367/500 ] loss = 2.62745, acc = 0.21625


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 368/500 ] loss = 0.62730, acc = 0.75846


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 368/500 ] loss = 2.80266, acc = 0.21495


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 369/500 ] loss = 0.62996, acc = 0.76213


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 369/500 ] loss = 2.76026, acc = 0.18497


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 370/500 ] loss = 0.63231, acc = 0.75972


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 370/500 ] loss = 2.95370, acc = 0.20583


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 371/500 ] loss = 0.62424, acc = 0.76217


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 371/500 ] loss = 3.01009, acc = 0.20003


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 372/500 ] loss = 0.62906, acc = 0.76184


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 372/500 ] loss = 2.85054, acc = 0.18476


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 373/500 ] loss = 0.62659, acc = 0.76231


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 373/500 ] loss = 2.92991, acc = 0.19943


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 374/500 ] loss = 0.62095, acc = 0.76253


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 374/500 ] loss = 3.02668, acc = 0.16959


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 375/500 ] loss = 0.62177, acc = 0.76393


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 375/500 ] loss = 2.84625, acc = 0.21340


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 376/500 ] loss = 0.61809, acc = 0.76666


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 376/500 ] loss = 2.77288, acc = 0.21646


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 377/500 ] loss = 0.61523, acc = 0.76570


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 377/500 ] loss = 3.01950, acc = 0.21034


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 378/500 ] loss = 0.62813, acc = 0.75927


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 378/500 ] loss = 2.87633, acc = 0.18461


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 379/500 ] loss = 0.62474, acc = 0.76310


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 379/500 ] loss = 3.09010, acc = 0.18806


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 380/500 ] loss = 0.61775, acc = 0.76387


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 380/500 ] loss = 2.79539, acc = 0.20439


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 381/500 ] loss = 0.61157, acc = 0.76347


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 381/500 ] loss = 2.93926, acc = 0.17800


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 382/500 ] loss = 0.60844, acc = 0.76510


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 382/500 ] loss = 2.86430, acc = 0.21910


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

[ Train | 383/500 ] loss = 0.62632, acc = 0.76198


  0%|          | 0/3 [00:00<?, ?it/s]

[ Valid | 383/500 ] loss = 2.77199, acc = 0.21900


  0%|          | 0/27 [00:00<?, ?it/s]

  0%|          | 0/39 [00:00<?, ?it/s]

## **Testing**

For inference, the model must be in eval mode, and the test set must not be shuffled ("shuffle=False" in test_loader), so that each prediction stays aligned with the index of its image.
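
Both requirements can be seen on a toy example. The sketch below is only an illustration and not the homework model: `toy_model`, `toy_inputs`, `toy_ids`, and `toy_loader` are made-up stand-ins. It checks that a network containing Dropout gives repeatable outputs once `eval()` has been called, and that a `DataLoader` built with `shuffle=False` yields samples in their original order.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-in for the real classifier; Dropout is random in training mode.
toy_model = nn.Sequential(nn.Dropout(p=0.5), nn.Linear(4, 11))
toy_inputs = torch.randn(10, 4)
toy_ids = torch.arange(10)  # plays the role of the image index

# In eval mode Dropout is disabled, so two forward passes agree exactly.
toy_model.eval()
with torch.no_grad():
    first_pass = toy_model(toy_inputs)
    second_pass = toy_model(toy_inputs)
outputs_repeatable = torch.equal(first_pass, second_pass)

# With shuffle=False the loader walks the dataset from index 0 upward.
toy_loader = DataLoader(TensorDataset(toy_inputs, toy_ids), batch_size=4, shuffle=False)
seen_ids = [int(i) for _, ids in toy_loader for i in ids]

print(outputs_repeatable)
print(seen_ids)
```

If the loader were shuffled, `seen_ids` would come out in a different order, and the row number written to the CSV would no longer identify the right image.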

Last but not least, don't forget to save the predictions into a single CSV file.
The format of the CSV file should follow the rules given in the slides.
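
Once the file has been written, a quick sanity check can catch formatting mistakes before submission. The following is a minimal sketch under stated assumptions: the helper name `check_prediction_file` is hypothetical, ids are assumed to run consecutively from 0, and categories are assumed to be integers from 0 to 10 because food-11 has 11 classes. The slides remain the authority on the exact format.

```python
import csv

NUM_CLASSES = 11  # food-11 has 11 classes; labels assumed to be 0..10


def check_prediction_file(path, num_classes=NUM_CLASSES):
    """Return the number of prediction rows; raise ValueError if the file is malformed."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))

    # The header must match exactly, with no space after the comma.
    if not rows or rows[0] != ["Id", "Category"]:
        raise ValueError("first row must be exactly 'Id,Category'")

    for expected_id, row in enumerate(rows[1:]):
        if len(row) != 2:
            raise ValueError(f"row {expected_id}: expected 2 fields, got {len(row)}")
        # Assumption: ids are consecutive integers starting at 0.
        if int(row[0]) != expected_id:
            raise ValueError(f"row {expected_id}: unexpected id {row[0]!r}")
        # Assumption: categories are class indices in [0, num_classes).
        if not 0 <= int(row[1]) < num_classes:
            raise ValueError(f"row {expected_id}: category {row[1]!r} out of range")

    return len(rows) - 1
```

Calling `check_prediction_file("predict.csv")` after the export cell has run should return the number of test images; any exception points at the first offending row.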

### **WARNING -- Keep in Mind**

Cheating includes but is not limited to:
1.   using testing labels,
2.   submitting results to previous Kaggle competitions,
3.   sharing predictions with others,
4.   copying code from any creature on Earth,
5.   asking other people to do it for you.

Any violation will be penalized, with penalties ranging from a deduction on the final grade to failing the course.

It is your responsibility to check whether your code violates the rules.
When citing code from the Internet, you should know exactly what that code does.
Breaking the rules will **NOT** be tolerated, even if you claim you did not know what the code does.


In [None]:
# Make sure the model is in eval mode.
# Some modules, such as Dropout and BatchNorm, behave differently in training mode and eval mode.
model.eval()

# Initialize a list to store the predictions.
predictions = []

# Iterate the testing set by batches.
for batch in tqdm(test_loader):
    # A batch consists of image data and the corresponding labels.
    # Here the variable "labels" is unused, since we do not have the ground truth.
    # If you print the labels, you will find that they are always 0.
    # This is because the wrapper (DatasetFolder) returns images and labels for each batch,
    # so fake labels were created to make it work normally.
    imgs, labels = batch

    # We don't need gradients at test time, and we have no labels to compute a loss.
    # torch.no_grad() turns off gradient tracking, which saves memory and speeds up the forward pass.
    with torch.no_grad():
        logits = model(imgs.to(device))

    # Take the class with the greatest logit as the prediction and record it.
    predictions.extend(logits.argmax(dim=-1).cpu().numpy().tolist())

In [None]:
# Save predictions into the file.
with open("predict.csv", "w") as f:

    # The first row must be "Id,Category" (no space after the comma).
    f.write("Id,Category\n")

    # For the rest of the rows, each image id corresponds to a predicted class.
    for i, pred in enumerate(predictions):
        f.write(f"{i},{pred}\n")