# **Homework 3 - Convolutional Neural Network**

This is the example code for Homework 3 of the machine learning course taught by Prof. Hung-yi Lee.

In this homework, you are required to build a convolutional neural network for image classification, optionally using some advanced training techniques.


There are three levels here:

**Easy**: Build a simple convolutional neural network as the baseline. (2 pts)

**Medium**: Design a better architecture or adopt different data augmentations to improve the performance. (2 pts)

**Hard**: Utilize provided unlabeled data to obtain better results. (2 pts)

## **About the Dataset**

The dataset used here is food-11, a collection of food images in 11 classes.

To fit the requirements of the homework, the TAs slightly modified the data.
Please DO NOT access the original fully-labeled training data or testing labels.

Also, the modified dataset is for this course only, and any further distribution or commercial use is forbidden.

In [1]:
!wget "https://www.dropbox.com/s/7yl5rra84ia0k8f/food-11.zip?dl=0" -O food-11.zip
!unzip food-11.zip

Streaming output truncated to the last 5000 lines.
  inflating: food-11/training/unlabeled/00/5451.jpg  
  inflating: food-11/training/unlabeled/00/5454.jpg  
  inflating: food-11/training/unlabeled/00/5459.jpg  
  inflating: food-11/training/unlabeled/00/5462.jpg  
  inflating: food-11/training/unlabeled/00/5470.jpg  
  inflating: food-11/training/unlabeled/00/5479.jpg  
  inflating: food-11/training/unlabeled/00/5497.jpg  
  inflating: food-11/training/unlabeled/00/5498.jpg  
  inflating: food-11/training/unlabeled/00/5502.jpg  
  inflating: food-11/training/unlabeled/00/5505.jpg  
  inflating: food-11/training/unlabeled/00/5515.jpg  
  inflating: food-11/training/unlabeled/00/5520.jpg  
  inflating: food-11/training/unlabeled/00/5530.jpg  
  inflating: food-11/training/unlabeled/00/5531.jpg  
  inflating: food-11/training/unlabeled/00/5541.jpg  
  inflating: food-11/training/unlabeled/00/5548.jpg  
  inflating: food-11/training/unlabeled/00/5549.jpg  
  inflating: food-11/training/unlabeled/00/5565

## **Import Packages**

First, we need to import packages that will be used later.

In this homework, we rely heavily on **torchvision**, PyTorch's companion library for computer vision.

In [2]:
!pip install --upgrade torch torchvision

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [3]:
# Import necessary packages.
import numpy as np
import torch
import torch.nn as nn
import torchvision.transforms as transforms
from PIL import Image
# "ConcatDataset" and "Subset" are possibly useful when doing semi-supervised learning.
from torch.utils.data import ConcatDataset, DataLoader, Subset
from torchvision.datasets import DatasetFolder

# This is for the progress bar.
from tqdm import tqdm

## **Dataset, Data Loader, and Transforms**

Torchvision provides many useful utilities for image preprocessing, data wrapping, and data augmentation.

Here, since our data are stored in folders by class label, we can wrap them directly with **torchvision.datasets.DatasetFolder** without much effort.

Please refer to the [official PyTorch documentation](https://pytorch.org/vision/stable/transforms.html) for details about the different transforms.
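
As a small illustration of what the `Normalize` step in the next cell computes, here is a minimal sketch in plain Python. The helper name `normalize_pixel` is hypothetical and not part of torchvision; the mean and standard deviation are simply the values used in this notebook's transforms, and the pixel is assumed to have already been scaled to the range [0, 1] by `ToTensor`.

```python
# Per-channel statistics copied from the transforms used in this notebook.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1].

    Hypothetical helper for illustration only.
    """
    return [(v - m) / s for v, m, s in zip(rgb, MEAN, STD)]


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel(MEAN))

# A pure white pixel maps to positive values in every channel.
print(normalize_pixel([1.0, 1.0, 1.0]))
```

The same arithmetic is applied to every pixel of every image, so the network sees inputs that are roughly centered around zero in each channel.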

In [4]:
train_tfm = transforms.Compose([
    transforms.RandomRotation(40),
    transforms.RandomAffine(degrees=0, translate=(0.2, 0.2), shear=0.2),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485,0.456,0.406],
        std=[0.229,0.224,0.225]
    )
])

test_tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485,0.456,0.406],
        std=[0.229,0.224,0.225]
    )
])


In [5]:
batch_size = 32

train_set = DatasetFolder("food-11/training/labeled", loader=lambda x: Image.open(x), extensions="jpg", transform=train_tfm)
valid_set = DatasetFolder("food-11/validation", loader=lambda x: Image.open(x), extensions="jpg", transform=test_tfm)
unlabeled_set = DatasetFolder("food-11/training/unlabeled", loader=lambda x: Image.open(x), extensions="jpg", transform=train_tfm)
test_set = DatasetFolder("food-11/testing", loader=lambda x: Image.open(x), extensions="jpg", transform=test_tfm)

train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=2, pin_memory=True)
valid_loader = DataLoader(valid_set, batch_size=batch_size, shuffle=True, num_workers=2, pin_memory=True)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)

## **Model**

The basic model here is simply a stack of convolutional layers followed by some fully-connected layers.

Since there are three channels for a color image (RGB), the input channels of the network must be three.
With each successive convolutional layer, the number of channels typically grows, while the height and width shrink (or remain unchanged, depending on hyperparameters such as stride and padding).

Before being fed into the fully-connected layers, the feature map must be flattened into a single one-dimensional vector (for each image).
These features are then transformed by the fully-connected layers, and finally, we obtain the "logits" for each class.
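
To make the shape bookkeeping concrete, the following sketch traces the spatial size of a 224×224 input through five blocks of a 3×3 convolution (no padding, stride 1) followed by 2×2 max pooling, which is the layout used by the `Classifier` defined later in this notebook. The helper names `conv_out` and `pool_out` are hypothetical; they only encode the usual output-size formulas for convolution and pooling.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output side length of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2):
    """Output side length of max pooling whose stride equals its kernel size."""
    return size // kernel


size = 224
channels = [64, 128, 256, 256, 512]  # output channels of each conv layer
for c in channels:
    size = pool_out(conv_out(size))
    print(f"channels={c:3d}, height=width={size}")

# Number of features per image after flattening the last feature map.
flattened = channels[-1] * size * size
print("flattened features:", flattened)  # 512 * 5 * 5 = 12800
```

The final value matches the input size of the first `nn.Linear` layer in the classifier. If you change the input resolution, kernel sizes, or padding, this number changes and the linear layer must be adjusted accordingly.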

### **WARNING -- You Must Know**
You are free to modify the model architecture here for further improvement.
However, if you want to use some well-known architectures such as ResNet50, please make sure **NOT** to load the pre-trained weights.
Using such pre-trained models is considered cheating and will be penalized.
Similarly, it is your responsibility to make sure no pre-trained weights are used if you use **torch.hub** to load any modules.

For example, if you use ResNet-18 as your model:

model = torchvision.models.resnet18(pretrained=**False**) → This is fine.

model = torchvision.models.resnet18(pretrained=**True**)  → This is **NOT** allowed.

In [6]:
class Classifier(nn.Module):
    def __init__(self):
        super(Classifier, self).__init__()
        self.cnn_layers = nn.Sequential(
            nn.Conv2d(3, 64, 3),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(128, 256, 3),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(256, 256, 3),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(256, 512, 3),
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.fc_layers = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(12800, 512),
            nn.ReLU(),
            nn.BatchNorm1d(512),
            nn.Dropout(0.5),
            nn.Linear(512, 11)
        )

    def forward(self, x):
        x = self.cnn_layers(x)
        x = x.flatten(1)
        x = self.fc_layers(x)
        return x

In [7]:
from torch.utils.data import Dataset
class PseudoDataset(Dataset):
    def __init__(self, dataset, labels):
        self.data = dataset
        self.target = torch.LongTensor(labels)
    def __getitem__(self, idx):
        return self.data[idx][0], self.target[idx].item()
    def __len__(self):
        return len(self.data)

def get_pseudo_labels(dataset, model, threshold=0.65):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    data_loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)

    model.eval()
    softmax = nn.Softmax(dim=-1)

    masks = []
    labels = []
    for batch in tqdm(data_loader):
        img, _ = batch
        with torch.no_grad():
            logits = model(img.to(device))
        probs = softmax(logits)
        preds = torch.max(probs, 1)[1]
        mask = torch.max(probs, 1)[0] > threshold
        masks.append(mask)
        labels.append(preds)
    masks = torch.cat(masks, dim=0).cpu().numpy()
    labels = torch.cat(labels, dim=0).cpu().numpy()
    indices = torch.arange(0, len(dataset))[masks]
    dataset = Subset(dataset, indices)
    labels = labels[indices]
    dataset = PseudoDataset(dataset, labels)

    model.train()
    return dataset
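
The selection rule used by `get_pseudo_labels` can be illustrated without a model. The sketch below, in plain Python, applies softmax to a few made-up logit vectors and keeps only the samples whose top probability exceeds the threshold. The function `select_confident` and the example logits are hypothetical and are not part of the homework code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_confident(batch_logits, threshold=0.65):
    """Return (index, predicted_class) for samples whose top probability exceeds threshold."""
    kept = []
    for i, logits in enumerate(batch_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence > threshold:
            kept.append((i, probs.index(confidence)))
    return kept


# Made-up logits for three samples and three classes.
example = [
    [4.0, 0.5, 0.1],  # confident in class 0
    [1.0, 1.1, 0.9],  # nearly uniform, so it is dropped
    [0.2, 0.3, 3.5],  # confident in class 2
]
print(select_confident(example))
```

A higher threshold keeps fewer but cleaner pseudo-labels, while a lower one admits more unlabeled images at the cost of more label noise.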

In [8]:
def same_seeds(seed):
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  
    np.random.seed(seed)  
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True

In [9]:
same_seeds(411185050)
device = "cuda" if torch.cuda.is_available() else "cpu"

model = Classifier().to(device)
model.load_state_dict(torch.load('./model.ckpt 12.39.49 AM'))
#torch._dynamo.reset()
#torch._dynamo.config.suppress_errors = True
#torch._dynamo.config.verbose=True
opt_model = torch.compile(model, mode="max-autotune")
model.device = device
opt_model.device = device

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
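
`nn.CrossEntropyLoss` expects raw logits and applies log-softmax internally, which is why the classifier's last layer has no softmax. As a rough illustration in plain Python (the function name `cross_entropy_from_logits` is hypothetical), the loss for a single sample is the negative log-probability assigned to the true class:

```python
import math


def cross_entropy_from_logits(logits, target):
    """Negative log-softmax of the target class, computed via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


# The loss is small when the target class has the largest logit.
print(cross_entropy_from_logits([2.0, 0.5, -1.0], target=0))

# With identical logits over 3 classes, the loss equals log(3).
print(cross_entropy_from_logits([0.0, 0.0, 0.0], target=1))
```

With its default settings, the PyTorch criterion averages this per-sample quantity over the batch.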

In [10]:
total_epochs = 650
no_semi_epochs = 200
best_acc = 0.0
for epoch in range(total_epochs):
    if epoch >= no_semi_epochs:
        pseudo_set = get_pseudo_labels(unlabeled_set, model, threshold=0.8)
        print('Using {0:.2f}% of unlabeled data'.format(len(pseudo_set) / len(unlabeled_set) * 100))
        concat_dataset = ConcatDataset([train_set, pseudo_set])
        train_loader = DataLoader(concat_dataset, batch_size=batch_size, shuffle=True, num_workers=2, pin_memory=True, drop_last=True)
    model.train()
    opt_model.train()

    train_loss = []
    train_accs = []

    for batch in tqdm(train_loader):
        imgs, labels = batch

        if len(imgs) == batch_size:
          logits = opt_model(imgs.to(device))
        else:
          logits = model(imgs.to(device))
        

        # Calculate the cross-entropy loss.
        # We don't need to apply softmax before computing cross-entropy as it is done automatically.
        loss = criterion(logits, labels.to(device))

        # Gradients stored in the parameters in the previous step should be cleared out first.
        optimizer.zero_grad()

        # Compute the gradients for parameters.
        loss.backward()

        # Clip the gradient norms for stable training.
        grad_norm = nn.utils.clip_grad_norm_(opt_model.parameters(), max_norm=10)

        # Update the parameters with computed gradients.
        optimizer.step()

        # Compute the accuracy for current batch.
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

        # Record the loss and accuracy.
        train_loss.append(loss.item())
        train_accs.append(acc)

    # The average loss and accuracy of the training set is the average of the recorded values.
    train_loss = sum(train_loss) / len(train_loss)
    train_acc = sum(train_accs) / len(train_accs)

    # Print the information.
    print(f"[ Train | {epoch + 1:03d}/{total_epochs:03d} ] loss = {train_loss:.5f}, acc = {train_acc:.5f}")

    # ---------- Validation ----------
    # Make sure the model is in eval mode so that dropout is disabled and batch normalization uses its running statistics.
    model.eval()
    opt_model.eval()

    # These are used to record information in validation.
    valid_loss = []
    valid_accs = []

    # Iterate the validation set by batches.
    for batch in tqdm(valid_loader):

        # A batch consists of image data and corresponding labels.
        imgs, labels = batch

        # We don't need gradient in validation.
        # Using torch.no_grad() accelerates the forward process.
        with torch.no_grad():
          logits = model(imgs.to(device))

        # We can still compute the loss (but not the gradient).
        loss = criterion(logits, labels.to(device))

        # Compute the accuracy for current batch.
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()

        # Record the loss and accuracy.
        valid_loss.append(loss.item())
        valid_accs.append(acc)

    # The average loss and accuracy for entire validation set is the average of the recorded values.
    valid_loss = sum(valid_loss) / len(valid_loss)
    valid_acc = sum(valid_accs) / len(valid_accs)

    # Print the information.
    print(f"[ Valid | {epoch + 1:03d}/{total_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f}")
    if valid_acc > best_acc:
        best_acc = valid_acc
        torch.save(model.state_dict(), './model.ckpt')
        print('Saving model with acc {:.5f}'.format(best_acc))
    if epoch == no_semi_epochs-1:
        print('Start semi-supervising with previous best acc {:.5f}'.format(best_acc))
        model.load_state_dict(torch.load('./model.ckpt'))

100%|██████████| 97/97 [02:48<00:00,  1.74s/it]


[ Train | 001/200 ] loss = 0.26824, acc = 0.90754


100%|██████████| 21/21 [00:05<00:00,  3.89it/s]


[ Valid | 001/200 ] loss = 0.84214, acc = 0.78452
Saving model with acc 0.78452


100%|██████████| 97/97 [00:32<00:00,  3.00it/s]


[ Train | 002/200 ] loss = 0.22632, acc = 0.92332


100%|██████████| 21/21 [00:06<00:00,  3.47it/s]


[ Valid | 002/200 ] loss = 0.88957, acc = 0.77470


100%|██████████| 97/97 [00:31<00:00,  3.05it/s]


[ Train | 003/200 ] loss = 0.20889, acc = 0.92848


100%|██████████| 21/21 [00:05<00:00,  3.90it/s]


[ Valid | 003/200 ] loss = 0.74453, acc = 0.81280
Saving model with acc 0.81280


100%|██████████| 97/97 [00:32<00:00,  2.97it/s]


[ Train | 004/200 ] loss = 0.17269, acc = 0.94362


100%|██████████| 21/21 [00:05<00:00,  3.54it/s]


[ Valid | 004/200 ] loss = 0.88948, acc = 0.78542


100%|██████████| 97/97 [00:30<00:00,  3.17it/s]


[ Train | 005/200 ] loss = 0.20078, acc = 0.93299


100%|██████████| 21/21 [00:06<00:00,  3.14it/s]


[ Valid | 005/200 ] loss = 0.88620, acc = 0.78214
Start semi-supervising with previous best acc 0.81280


100%|██████████| 213/213 [01:06<00:00,  3.22it/s]


Using 77.75% of unlabeld data


100%|██████████| 261/261 [01:25<00:00,  3.04it/s]


[ Train | 006/200 ] loss = 0.29492, acc = 0.90769


100%|██████████| 21/21 [00:05<00:00,  3.76it/s]


[ Valid | 006/200 ] loss = 0.94999, acc = 0.79107


100%|██████████| 213/213 [01:06<00:00,  3.23it/s]


Using 80.33% of unlabeld data


100%|██████████| 266/266 [01:27<00:00,  3.05it/s]


[ Train | 007/200 ] loss = 0.29381, acc = 0.90320


100%|██████████| 21/21 [00:06<00:00,  3.46it/s]


[ Valid | 007/200 ] loss = 0.89798, acc = 0.77024


100%|██████████| 213/213 [01:06<00:00,  3.20it/s]


Using 77.67% of unlabeld data


100%|██████████| 260/260 [01:23<00:00,  3.11it/s]


[ Train | 008/200 ] loss = 0.29300, acc = 0.90349


100%|██████████| 21/21 [00:07<00:00,  2.64it/s]


[ Valid | 008/200 ] loss = 0.75216, acc = 0.79315


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 78.31% of unlabeld data


100%|██████████| 262/262 [01:27<00:00,  2.99it/s]


[ Train | 009/200 ] loss = 0.26779, acc = 0.91245


100%|██████████| 21/21 [00:05<00:00,  3.89it/s]


[ Valid | 009/200 ] loss = 0.99896, acc = 0.75060


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 79.31% of unlabeld data


100%|██████████| 264/264 [01:25<00:00,  3.07it/s]


[ Train | 010/200 ] loss = 0.31705, acc = 0.89702


100%|██████████| 21/21 [00:05<00:00,  3.89it/s]


[ Valid | 010/200 ] loss = 0.91134, acc = 0.77649


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 78.71% of unlabeld data


100%|██████████| 263/263 [01:24<00:00,  3.13it/s]


[ Train | 011/200 ] loss = 0.27321, acc = 0.91172


100%|██████████| 21/21 [00:06<00:00,  3.13it/s]


[ Valid | 011/200 ] loss = 0.94203, acc = 0.76458


100%|██████████| 213/213 [01:04<00:00,  3.29it/s]


Using 77.47% of unlabeld data


100%|██████████| 260/260 [01:23<00:00,  3.11it/s]


[ Train | 012/200 ] loss = 0.30333, acc = 0.90204


100%|██████████| 21/21 [00:06<00:00,  3.12it/s]


[ Valid | 012/200 ] loss = 0.88067, acc = 0.76756


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 77.06% of unlabeld data


100%|██████████| 259/259 [01:24<00:00,  3.06it/s]


[ Train | 013/200 ] loss = 0.26260, acc = 0.91421


100%|██████████| 21/21 [00:06<00:00,  3.46it/s]


[ Valid | 013/200 ] loss = 0.87999, acc = 0.78839


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 78.03% of unlabeld data


100%|██████████| 261/261 [01:24<00:00,  3.11it/s]


[ Train | 014/200 ] loss = 0.25114, acc = 0.91846


100%|██████████| 21/21 [00:05<00:00,  3.83it/s]


[ Valid | 014/200 ] loss = 0.84507, acc = 0.77857


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 79.22% of unlabeld data


100%|██████████| 264/264 [01:25<00:00,  3.09it/s]


[ Train | 015/200 ] loss = 0.28915, acc = 0.90767


100%|██████████| 21/21 [00:05<00:00,  3.79it/s]


[ Valid | 015/200 ] loss = 0.88212, acc = 0.78214


100%|██████████| 213/213 [01:06<00:00,  3.22it/s]


Using 79.03% of unlabeld data


100%|██████████| 263/263 [01:24<00:00,  3.12it/s]


[ Train | 016/200 ] loss = 0.26875, acc = 0.91302


100%|██████████| 21/21 [00:06<00:00,  3.18it/s]


[ Valid | 016/200 ] loss = 0.83111, acc = 0.78065


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 78.72% of unlabeld data


100%|██████████| 263/263 [01:24<00:00,  3.11it/s]


[ Train | 017/200 ] loss = 0.29347, acc = 0.90756


100%|██████████| 21/21 [00:06<00:00,  3.22it/s]


[ Valid | 017/200 ] loss = 0.79884, acc = 0.80387


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 78.71% of unlabeld data


100%|██████████| 263/263 [01:24<00:00,  3.11it/s]


[ Train | 018/200 ] loss = 0.26868, acc = 0.91362


100%|██████████| 21/21 [00:06<00:00,  3.31it/s]


[ Valid | 018/200 ] loss = 0.95473, acc = 0.77976


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 79.75% of unlabeld data


100%|██████████| 265/265 [01:26<00:00,  3.07it/s]


[ Train | 019/200 ] loss = 0.27459, acc = 0.90873


100%|██████████| 21/21 [00:06<00:00,  3.16it/s]


[ Valid | 019/200 ] loss = 1.04625, acc = 0.76667


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 81.15% of unlabeld data


100%|██████████| 268/268 [01:26<00:00,  3.11it/s]


[ Train | 020/200 ] loss = 0.25811, acc = 0.91196


100%|██████████| 21/21 [00:06<00:00,  3.07it/s]


[ Valid | 020/200 ] loss = 0.99278, acc = 0.77321


100%|██████████| 213/213 [01:04<00:00,  3.31it/s]


Using 81.42% of unlabeld data


100%|██████████| 268/268 [01:28<00:00,  3.04it/s]


[ Train | 021/200 ] loss = 0.28592, acc = 0.90543


100%|██████████| 21/21 [00:05<00:00,  3.85it/s]


[ Valid | 021/200 ] loss = 0.82786, acc = 0.79315


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 78.43% of unlabeld data


100%|██████████| 262/262 [01:24<00:00,  3.10it/s]


[ Train | 022/200 ] loss = 0.27558, acc = 0.90494


100%|██████████| 21/21 [00:05<00:00,  3.88it/s]


[ Valid | 022/200 ] loss = 0.80094, acc = 0.78929


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 79.19% of unlabeld data


100%|██████████| 264/264 [01:24<00:00,  3.13it/s]


[ Train | 023/200 ] loss = 0.26848, acc = 0.91489


100%|██████████| 21/21 [00:06<00:00,  3.28it/s]


[ Valid | 023/200 ] loss = 0.84733, acc = 0.77768


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 79.33% of unlabeld data


100%|██████████| 264/264 [01:24<00:00,  3.12it/s]


[ Train | 024/200 ] loss = 0.29243, acc = 0.90767


100%|██████████| 21/21 [00:06<00:00,  3.17it/s]


[ Valid | 024/200 ] loss = 0.79314, acc = 0.79226


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 79.69% of unlabeld data


100%|██████████| 265/265 [01:25<00:00,  3.08it/s]


[ Train | 025/200 ] loss = 0.26142, acc = 0.91380


100%|██████████| 21/21 [00:05<00:00,  3.86it/s]


[ Valid | 025/200 ] loss = 0.87264, acc = 0.78720


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 80.17% of unlabeld data


100%|██████████| 266/266 [01:25<00:00,  3.10it/s]


[ Train | 026/200 ] loss = 0.28324, acc = 0.90648


100%|██████████| 21/21 [00:05<00:00,  3.59it/s]


[ Valid | 026/200 ] loss = 0.93720, acc = 0.76667


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 77.03% of unlabeld data


100%|██████████| 259/259 [01:22<00:00,  3.13it/s]


[ Train | 027/200 ] loss = 0.25676, acc = 0.91325


100%|██████████| 21/21 [00:06<00:00,  3.16it/s]


[ Valid | 027/200 ] loss = 0.89295, acc = 0.79345


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 80.67% of unlabeld data


100%|██████████| 267/267 [01:26<00:00,  3.10it/s]


[ Train | 028/200 ] loss = 0.28693, acc = 0.90414


100%|██████████| 21/21 [00:06<00:00,  3.36it/s]


[ Valid | 028/200 ] loss = 0.92551, acc = 0.78393


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 81.54% of unlabeld data


100%|██████████| 269/269 [01:27<00:00,  3.08it/s]


[ Train | 029/200 ] loss = 0.29673, acc = 0.90404


100%|██████████| 21/21 [00:05<00:00,  3.75it/s]


[ Valid | 029/200 ] loss = 0.83507, acc = 0.75685


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 80.24% of unlabeld data


100%|██████████| 266/266 [01:26<00:00,  3.09it/s]


[ Train | 030/200 ] loss = 0.24566, acc = 0.91776


100%|██████████| 21/21 [00:05<00:00,  3.85it/s]


[ Valid | 030/200 ] loss = 1.00085, acc = 0.77292


100%|██████████| 213/213 [01:06<00:00,  3.18it/s]


Using 80.75% of unlabeld data


100%|██████████| 267/267 [01:25<00:00,  3.12it/s]


[ Train | 031/200 ] loss = 0.28836, acc = 0.90672


100%|██████████| 21/21 [00:06<00:00,  3.16it/s]


[ Valid | 031/200 ] loss = 0.90907, acc = 0.77679


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 78.46% of unlabeld data


100%|██████████| 262/262 [01:24<00:00,  3.09it/s]


[ Train | 032/200 ] loss = 0.28850, acc = 0.90828


100%|██████████| 21/21 [00:06<00:00,  3.11it/s]


[ Valid | 032/200 ] loss = 0.86020, acc = 0.79494


100%|██████████| 213/213 [01:06<00:00,  3.23it/s]


Using 78.87% of unlabeld data


100%|██████████| 263/263 [01:24<00:00,  3.11it/s]


[ Train | 033/200 ] loss = 0.26533, acc = 0.91385


100%|██████████| 21/21 [00:06<00:00,  3.11it/s]


[ Valid | 033/200 ] loss = 0.86187, acc = 0.78512


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 78.06% of unlabeld data


100%|██████████| 261/261 [01:24<00:00,  3.11it/s]


[ Train | 034/200 ] loss = 0.27496, acc = 0.90781


100%|██████████| 21/21 [00:06<00:00,  3.08it/s]


[ Valid | 034/200 ] loss = 0.90095, acc = 0.77500


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 78.26% of unlabeld data


100%|██████████| 262/262 [01:25<00:00,  3.07it/s]


[ Train | 035/200 ] loss = 0.29438, acc = 0.90219


100%|██████████| 21/21 [00:06<00:00,  3.06it/s]


[ Valid | 035/200 ] loss = 0.86859, acc = 0.78006


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 76.04% of unlabeld data


100%|██████████| 257/257 [01:23<00:00,  3.07it/s]


[ Train | 036/200 ] loss = 0.29839, acc = 0.90430


100%|██████████| 21/21 [00:06<00:00,  3.34it/s]


[ Valid | 036/200 ] loss = 0.94452, acc = 0.76577


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 75.85% of unlabeld data


100%|██████████| 257/257 [01:24<00:00,  3.05it/s]


[ Train | 037/200 ] loss = 0.26757, acc = 0.91087


100%|██████████| 21/21 [00:05<00:00,  3.89it/s]


[ Valid | 037/200 ] loss = 0.90205, acc = 0.78274


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 81.24% of unlabeld data


100%|██████████| 268/268 [01:27<00:00,  3.06it/s]


[ Train | 038/200 ] loss = 0.28997, acc = 0.90672


100%|██████████| 21/21 [00:05<00:00,  3.85it/s]


[ Valid | 038/200 ] loss = 0.85712, acc = 0.77024


100%|██████████| 213/213 [01:06<00:00,  3.22it/s]


Using 77.37% of unlabeld data


100%|██████████| 260/260 [01:24<00:00,  3.08it/s]


[ Train | 039/200 ] loss = 0.26809, acc = 0.91370


100%|██████████| 21/21 [00:05<00:00,  3.88it/s]


[ Valid | 039/200 ] loss = 0.79272, acc = 0.79851


100%|██████████| 213/213 [01:06<00:00,  3.20it/s]


Using 79.80% of unlabeld data


100%|██████████| 265/265 [01:25<00:00,  3.12it/s]


[ Train | 040/200 ] loss = 0.26056, acc = 0.90979


100%|██████████| 21/21 [00:06<00:00,  3.13it/s]


[ Valid | 040/200 ] loss = 0.80527, acc = 0.78810


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 80.96% of unlabeld data


100%|██████████| 267/267 [01:26<00:00,  3.09it/s]


[ Train | 041/200 ] loss = 0.26687, acc = 0.91175


100%|██████████| 21/21 [00:05<00:00,  3.66it/s]


[ Valid | 041/200 ] loss = 0.97144, acc = 0.76310


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 79.37% of unlabeld data


100%|██████████| 264/264 [01:26<00:00,  3.05it/s]


[ Train | 042/200 ] loss = 0.26466, acc = 0.91051


100%|██████████| 21/21 [00:05<00:00,  3.85it/s]


[ Valid | 042/200 ] loss = 0.90525, acc = 0.76458


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 79.18% of unlabeld data


100%|██████████| 264/264 [01:25<00:00,  3.09it/s]


[ Train | 043/200 ] loss = 0.28775, acc = 0.90601


100%|██████████| 21/21 [00:06<00:00,  3.13it/s]


[ Valid | 043/200 ] loss = 0.87002, acc = 0.77827


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 80.05% of unlabeld data


100%|██████████| 266/266 [01:25<00:00,  3.10it/s]


[ Train | 044/200 ] loss = 0.28506, acc = 0.90660


100%|██████████| 21/21 [00:06<00:00,  3.37it/s]


[ Valid | 044/200 ] loss = 0.89754, acc = 0.75595


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 78.37% of unlabeld data


100%|██████████| 262/262 [01:25<00:00,  3.06it/s]


[ Train | 045/200 ] loss = 0.31346, acc = 0.90112


100%|██████████| 21/21 [00:05<00:00,  3.86it/s]


[ Valid | 045/200 ] loss = 0.85185, acc = 0.78214


100%|██████████| 213/213 [01:04<00:00,  3.29it/s]


Using 78.06% of unlabeld data


100%|██████████| 261/261 [01:26<00:00,  3.02it/s]


[ Train | 046/200 ] loss = 0.25406, acc = 0.92014


100%|██████████| 21/21 [00:05<00:00,  3.85it/s]


[ Valid | 046/200 ] loss = 0.96953, acc = 0.78036


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 77.59% of unlabeld data


100%|██████████| 260/260 [01:24<00:00,  3.08it/s]


[ Train | 047/200 ] loss = 0.30759, acc = 0.90312


100%|██████████| 21/21 [00:08<00:00,  2.61it/s]


[ Valid | 047/200 ] loss = 0.85578, acc = 0.78155


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 79.03% of unlabeld data


100%|██████████| 263/263 [01:24<00:00,  3.10it/s]


[ Train | 048/200 ] loss = 0.26785, acc = 0.90779


100%|██████████| 21/21 [00:06<00:00,  3.12it/s]


[ Valid | 048/200 ] loss = 0.80453, acc = 0.79018


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 77.19% of unlabeld data


100%|██████████| 259/259 [01:24<00:00,  3.08it/s]


[ Train | 049/200 ] loss = 0.25382, acc = 0.91868


100%|██████████| 21/21 [00:06<00:00,  3.25it/s]


[ Valid | 049/200 ] loss = 0.93583, acc = 0.76161


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 79.99% of unlabeld data


100%|██████████| 265/265 [01:26<00:00,  3.07it/s]


[ Train | 050/200 ] loss = 0.27233, acc = 0.90884


100%|██████████| 21/21 [00:05<00:00,  3.83it/s]


[ Valid | 050/200 ] loss = 0.95581, acc = 0.75595


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 78.47% of unlabeld data


100%|██████████| 262/262 [01:26<00:00,  3.04it/s]


[ Train | 051/200 ] loss = 0.28889, acc = 0.90768


100%|██████████| 21/21 [00:05<00:00,  3.83it/s]


[ Valid | 051/200 ] loss = 0.85138, acc = 0.79256


100%|██████████| 213/213 [01:06<00:00,  3.20it/s]


Using 78.90% of unlabeld data


100%|██████████| 263/263 [01:25<00:00,  3.09it/s]


[ Train | 052/200 ] loss = 0.25399, acc = 0.91481


100%|██████████| 21/21 [00:06<00:00,  3.11it/s]


[ Valid | 052/200 ] loss = 0.94508, acc = 0.76935


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 81.17% of unlabeld data


100%|██████████| 268/268 [01:27<00:00,  3.07it/s]


[ Train | 053/200 ] loss = 0.28715, acc = 0.90462


100%|██████████| 21/21 [00:05<00:00,  3.83it/s]


[ Valid | 053/200 ] loss = 0.90401, acc = 0.78006


100%|██████████| 213/213 [01:06<00:00,  3.20it/s]


Using 78.79% of unlabeld data


100%|██████████| 263/263 [01:25<00:00,  3.08it/s]


[ Train | 054/200 ] loss = 0.30500, acc = 0.90280


100%|██████████| 21/21 [00:06<00:00,  3.26it/s]


[ Valid | 054/200 ] loss = 0.94451, acc = 0.77292


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 79.12% of unlabeld data


100%|██████████| 264/264 [01:25<00:00,  3.08it/s]


[ Train | 055/200 ] loss = 0.27734, acc = 0.90885


100%|██████████| 21/21 [00:06<00:00,  3.15it/s]


[ Valid | 055/200 ] loss = 0.89152, acc = 0.79554


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 78.96% of unlabeld data


100%|██████████| 263/263 [01:25<00:00,  3.06it/s]


[ Train | 056/200 ] loss = 0.26462, acc = 0.91433


100%|██████████| 21/21 [00:05<00:00,  3.85it/s]


[ Valid | 056/200 ] loss = 0.92721, acc = 0.76577


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 78.51% of unlabeld data


100%|██████████| 262/262 [01:25<00:00,  3.06it/s]


[ Train | 057/200 ] loss = 0.28055, acc = 0.90875


100%|██████████| 21/21 [00:05<00:00,  3.61it/s]


[ Valid | 057/200 ] loss = 0.79227, acc = 0.80149


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 80.55% of unlabeld data


100%|██████████| 267/267 [01:26<00:00,  3.10it/s]


[ Train | 058/200 ] loss = 0.25661, acc = 0.91526


100%|██████████| 21/21 [00:07<00:00,  2.82it/s]


[ Valid | 058/200 ] loss = 0.78024, acc = 0.79613


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 80.95% of unlabeld data


100%|██████████| 267/267 [01:26<00:00,  3.07it/s]


[ Train | 059/200 ] loss = 0.23509, acc = 0.92626


100%|██████████| 21/21 [00:05<00:00,  3.82it/s]


[ Valid | 059/200 ] loss = 0.89375, acc = 0.78869


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 82.38% of unlabeld data


100%|██████████| 270/270 [01:27<00:00,  3.08it/s]


[ Train | 060/200 ] loss = 0.27174, acc = 0.91076


100%|██████████| 21/21 [00:06<00:00,  3.44it/s]


[ Valid | 060/200 ] loss = 0.88412, acc = 0.77500


100%|██████████| 213/213 [01:06<00:00,  3.22it/s]


Using 79.63% of unlabeld data


100%|██████████| 265/265 [01:25<00:00,  3.09it/s]


[ Train | 061/200 ] loss = 0.25976, acc = 0.91250


100%|██████████| 21/21 [00:06<00:00,  3.17it/s]


[ Valid | 061/200 ] loss = 0.88878, acc = 0.78988


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 80.67% of unlabeld data


100%|██████████| 267/267 [01:26<00:00,  3.10it/s]


[ Train | 062/200 ] loss = 0.28424, acc = 0.90742


100%|██████████| 21/21 [00:05<00:00,  3.81it/s]


[ Valid | 062/200 ] loss = 0.87222, acc = 0.78452


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 78.32% of unlabeld data


100%|██████████| 262/262 [01:25<00:00,  3.06it/s]


[ Train | 063/200 ] loss = 0.27936, acc = 0.91054


100%|██████████| 21/21 [00:05<00:00,  3.77it/s]


[ Valid | 063/200 ] loss = 0.84568, acc = 0.77054


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 80.71% of unlabeld data


100%|██████████| 267/267 [01:25<00:00,  3.12it/s]


[ Train | 064/200 ] loss = 0.25863, acc = 0.91374


100%|██████████| 21/21 [00:07<00:00,  2.67it/s]


[ Valid | 064/200 ] loss = 0.88308, acc = 0.77054


100%|██████████| 213/213 [01:04<00:00,  3.31it/s]


Using 80.17% of unlabeld data


100%|██████████| 266/266 [01:27<00:00,  3.03it/s]


[ Train | 065/200 ] loss = 0.26853, acc = 0.91236


100%|██████████| 21/21 [00:05<00:00,  3.81it/s]


[ Valid | 065/200 ] loss = 0.88219, acc = 0.77024


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 80.42% of unlabeld data


100%|██████████| 266/266 [01:26<00:00,  3.07it/s]


[ Train | 066/200 ] loss = 0.25887, acc = 0.91718


100%|██████████| 21/21 [00:05<00:00,  3.82it/s]


[ Valid | 066/200 ] loss = 0.96991, acc = 0.76339


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 80.28% of unlabeld data


100%|██████████| 266/266 [01:25<00:00,  3.11it/s]


[ Train | 067/200 ] loss = 0.28134, acc = 0.91212


100%|██████████| 21/21 [00:06<00:00,  3.11it/s]


[ Valid | 067/200 ] loss = 0.95093, acc = 0.77321


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 81.71% of unlabeld data


100%|██████████| 269/269 [01:27<00:00,  3.08it/s]


[ Train | 068/200 ] loss = 0.25617, acc = 0.91589


100%|██████████| 21/21 [00:05<00:00,  3.72it/s]


[ Valid | 068/200 ] loss = 0.85580, acc = 0.79821


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 80.68% of unlabeld data


100%|██████████| 267/267 [01:28<00:00,  3.03it/s]


[ Train | 069/200 ] loss = 0.26083, acc = 0.91491


100%|██████████| 21/21 [00:06<00:00,  3.25it/s]


[ Valid | 069/200 ] loss = 0.84413, acc = 0.77946


100%|██████████| 213/213 [01:06<00:00,  3.23it/s]


Using 80.28% of unlabeld data


100%|██████████| 266/266 [01:26<00:00,  3.07it/s]


[ Train | 070/200 ] loss = 0.26602, acc = 0.91189


100%|██████████| 21/21 [00:06<00:00,  3.10it/s]


[ Valid | 070/200 ] loss = 0.98397, acc = 0.75595


100%|██████████| 213/213 [01:04<00:00,  3.29it/s]


Using 79.59% of unlabeld data


100%|██████████| 265/265 [01:26<00:00,  3.07it/s]


[ Train | 071/200 ] loss = 0.30850, acc = 0.90389


100%|██████████| 21/21 [00:05<00:00,  3.81it/s]


[ Valid | 071/200 ] loss = 0.88035, acc = 0.78006


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 79.22% of unlabeld data


100%|██████████| 264/264 [01:26<00:00,  3.05it/s]


[ Train | 072/200 ] loss = 0.27472, acc = 0.90838


100%|██████████| 21/21 [00:05<00:00,  3.72it/s]


[ Valid | 072/200 ] loss = 0.88952, acc = 0.78214


100%|██████████| 213/213 [01:06<00:00,  3.19it/s]


Using 79.91% of unlabeld data


100%|██████████| 265/265 [01:25<00:00,  3.09it/s]


[ Train | 073/200 ] loss = 0.25667, acc = 0.91368


100%|██████████| 21/21 [00:06<00:00,  3.20it/s]


[ Valid | 073/200 ] loss = 0.86273, acc = 0.77649


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 81.76% of unlabeld data


100%|██████████| 269/269 [01:29<00:00,  3.02it/s]


[ Train | 074/200 ] loss = 0.26578, acc = 0.91276


100%|██████████| 21/21 [00:06<00:00,  3.17it/s]


[ Valid | 074/200 ] loss = 0.83259, acc = 0.77798


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 81.28% of unlabeld data


100%|██████████| 268/268 [01:26<00:00,  3.08it/s]


[ Train | 075/200 ] loss = 0.25598, acc = 0.91709


100%|██████████| 21/21 [00:05<00:00,  3.74it/s]


[ Valid | 075/200 ] loss = 0.76734, acc = 0.79405


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 78.53% of unlabeld data


100%|██████████| 262/262 [01:26<00:00,  3.03it/s]


[ Train | 076/200 ] loss = 0.28255, acc = 0.90923


100%|██████████| 21/21 [00:05<00:00,  3.83it/s]


[ Valid | 076/200 ] loss = 0.84663, acc = 0.77381


100%|██████████| 213/213 [01:06<00:00,  3.19it/s]


Using 78.56% of unlabeld data


100%|██████████| 262/262 [01:25<00:00,  3.07it/s]


[ Train | 077/200 ] loss = 0.25509, acc = 0.91543


100%|██████████| 21/21 [00:05<00:00,  3.59it/s]


[ Valid | 077/200 ] loss = 0.86414, acc = 0.77411


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 80.05% of unlabeld data


100%|██████████| 266/266 [01:25<00:00,  3.11it/s]


[ Train | 078/200 ] loss = 0.25946, acc = 0.91612


100%|██████████| 21/21 [00:06<00:00,  3.10it/s]


[ Valid | 078/200 ] loss = 0.89971, acc = 0.77054


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 79.24% of unlabeld data


100%|██████████| 264/264 [01:26<00:00,  3.04it/s]


[ Train | 079/200 ] loss = 0.27162, acc = 0.91217


100%|██████████| 21/21 [00:05<00:00,  3.87it/s]


[ Valid | 079/200 ] loss = 0.94885, acc = 0.77083


100%|██████████| 213/213 [01:07<00:00,  3.15it/s]


Using 79.44% of unlabeld data


100%|██████████| 264/264 [01:26<00:00,  3.05it/s]


[ Train | 080/200 ] loss = 0.29971, acc = 0.90317


100%|██████████| 21/21 [00:06<00:00,  3.07it/s]


[ Valid | 080/200 ] loss = 0.77600, acc = 0.79405


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 80.62% of unlabeld data


100%|██████████| 267/267 [01:27<00:00,  3.03it/s]


[ Train | 081/200 ] loss = 0.23593, acc = 0.92228


100%|██████████| 21/21 [00:05<00:00,  3.85it/s]


[ Valid | 081/200 ] loss = 0.81107, acc = 0.78720


100%|██████████| 213/213 [01:06<00:00,  3.20it/s]


Using 82.20% of unlabeld data


100%|██████████| 270/270 [01:28<00:00,  3.07it/s]


[ Train | 082/200 ] loss = 0.25565, acc = 0.91644


100%|██████████| 21/21 [00:06<00:00,  3.15it/s]


[ Valid | 082/200 ] loss = 0.92656, acc = 0.76637


100%|██████████| 213/213 [01:06<00:00,  3.23it/s]


Using 82.23% of unlabeld data


100%|██████████| 270/270 [01:27<00:00,  3.07it/s]


[ Train | 083/200 ] loss = 0.27730, acc = 0.90521


100%|██████████| 21/21 [00:05<00:00,  3.87it/s]


[ Valid | 083/200 ] loss = 0.81800, acc = 0.78780


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 79.94% of unlabeld data


100%|██████████| 265/265 [01:25<00:00,  3.08it/s]


[ Train | 084/200 ] loss = 0.26705, acc = 0.91297


100%|██████████| 21/21 [00:06<00:00,  3.17it/s]


[ Valid | 084/200 ] loss = 0.88505, acc = 0.77887


100%|██████████| 213/213 [01:07<00:00,  3.17it/s]


Using 79.15% of unlabeld data


100%|██████████| 264/264 [01:26<00:00,  3.04it/s]


[ Train | 085/200 ] loss = 0.25242, acc = 0.91809


100%|██████████| 21/21 [00:05<00:00,  3.83it/s]


[ Valid | 085/200 ] loss = 0.95545, acc = 0.77500


100%|██████████| 213/213 [01:04<00:00,  3.29it/s]


Using 82.27% of unlabeld data


100%|██████████| 270/270 [01:27<00:00,  3.09it/s]


[ Train | 086/200 ] loss = 0.27452, acc = 0.90949


100%|██████████| 21/21 [00:05<00:00,  3.54it/s]


[ Valid | 086/200 ] loss = 0.96836, acc = 0.76518


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 80.27% of unlabeld data


100%|██████████| 266/266 [01:26<00:00,  3.09it/s]


[ Train | 087/200 ] loss = 0.27393, acc = 0.90977


100%|██████████| 21/21 [00:06<00:00,  3.14it/s]


[ Valid | 087/200 ] loss = 0.88420, acc = 0.78571


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 78.71% of unlabeld data


100%|██████████| 263/263 [01:26<00:00,  3.04it/s]


[ Train | 088/200 ] loss = 0.28321, acc = 0.91053


100%|██████████| 21/21 [00:05<00:00,  3.81it/s]


[ Valid | 088/200 ] loss = 0.86240, acc = 0.78929


100%|██████████| 213/213 [01:06<00:00,  3.23it/s]


Using 80.59% of unlabeld data


100%|██████████| 267/267 [01:27<00:00,  3.04it/s]


[ Train | 089/200 ] loss = 0.24727, acc = 0.92111


100%|██████████| 21/21 [00:06<00:00,  3.12it/s]


[ Valid | 089/200 ] loss = 0.81618, acc = 0.78542


100%|██████████| 213/213 [01:06<00:00,  3.23it/s]


Using 81.40% of unlabeld data


100%|██████████| 268/268 [01:28<00:00,  3.03it/s]


[ Train | 090/200 ] loss = 0.24862, acc = 0.91616


100%|██████████| 21/21 [00:05<00:00,  3.76it/s]


[ Valid | 090/200 ] loss = 0.89541, acc = 0.78214


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 82.46% of unlabeld data


100%|██████████| 271/271 [01:29<00:00,  3.04it/s]


[ Train | 091/200 ] loss = 0.26671, acc = 0.91363


100%|██████████| 21/21 [00:06<00:00,  3.11it/s]


[ Valid | 091/200 ] loss = 0.86479, acc = 0.77827


100%|██████████| 213/213 [01:06<00:00,  3.20it/s]


Using 83.02% of unlabeld data


100%|██████████| 272/272 [01:28<00:00,  3.06it/s]


[ Train | 092/200 ] loss = 0.24580, acc = 0.91935


100%|██████████| 21/21 [00:05<00:00,  3.86it/s]


[ Valid | 092/200 ] loss = 0.85098, acc = 0.77917


100%|██████████| 213/213 [01:07<00:00,  3.16it/s]


Using 82.26% of unlabeld data


100%|██████████| 270/270 [01:27<00:00,  3.10it/s]


[ Train | 093/200 ] loss = 0.28632, acc = 0.90637


100%|██████████| 21/21 [00:06<00:00,  3.25it/s]


[ Valid | 093/200 ] loss = 0.86836, acc = 0.78601


100%|██████████| 213/213 [01:06<00:00,  3.20it/s]


Using 78.35% of unlabeld data


100%|██████████| 262/262 [01:26<00:00,  3.02it/s]


[ Train | 094/200 ] loss = 0.25509, acc = 0.91448


100%|██████████| 21/21 [00:05<00:00,  3.78it/s]


[ Valid | 094/200 ] loss = 0.88020, acc = 0.75089


100%|██████████| 213/213 [01:06<00:00,  3.18it/s]


Using 77.56% of unlabeld data


100%|██████████| 260/260 [01:23<00:00,  3.12it/s]


[ Train | 095/200 ] loss = 0.30274, acc = 0.90517


100%|██████████| 21/21 [00:07<00:00,  2.75it/s]


[ Valid | 095/200 ] loss = 0.78434, acc = 0.75923


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 78.84% of unlabeld data


100%|██████████| 263/263 [01:26<00:00,  3.06it/s]


[ Train | 096/200 ] loss = 0.25143, acc = 0.91694


100%|██████████| 21/21 [00:06<00:00,  3.45it/s]


[ Valid | 096/200 ] loss = 0.73787, acc = 0.79851


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 80.28% of unlabeld data


100%|██████████| 266/266 [01:27<00:00,  3.06it/s]


[ Train | 097/200 ] loss = 0.24373, acc = 0.92000


100%|██████████| 21/21 [00:05<00:00,  3.80it/s]


[ Valid | 097/200 ] loss = 0.83732, acc = 0.78155


100%|██████████| 213/213 [01:06<00:00,  3.20it/s]


Using 82.01% of unlabeld data


100%|██████████| 270/270 [01:27<00:00,  3.08it/s]


[ Train | 098/200 ] loss = 0.25689, acc = 0.91123


100%|██████████| 21/21 [00:06<00:00,  3.15it/s]


[ Valid | 098/200 ] loss = 0.86469, acc = 0.78899


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 80.59% of unlabeld data


100%|██████████| 267/267 [01:27<00:00,  3.04it/s]


[ Train | 099/200 ] loss = 0.27104, acc = 0.91269


100%|██████████| 21/21 [00:05<00:00,  3.80it/s]


[ Valid | 099/200 ] loss = 0.95083, acc = 0.76696


100%|██████████| 213/213 [01:05<00:00,  3.28it/s]


Using 81.84% of unlabeld data


100%|██████████| 269/269 [01:28<00:00,  3.05it/s]


[ Train | 100/200 ] loss = 0.26577, acc = 0.91217


100%|██████████| 21/21 [00:05<00:00,  3.74it/s]


[ Valid | 100/200 ] loss = 0.85849, acc = 0.77976


100%|██████████| 213/213 [01:07<00:00,  3.17it/s]


Using 82.30% of unlabeld data


100%|██████████| 270/270 [01:26<00:00,  3.12it/s]


[ Train | 101/200 ] loss = 0.24400, acc = 0.91968


100%|██████████| 21/21 [00:06<00:00,  3.12it/s]


[ Valid | 101/200 ] loss = 0.91393, acc = 0.76667


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 81.76% of unlabeld data


100%|██████████| 269/269 [01:27<00:00,  3.07it/s]


[ Train | 102/200 ] loss = 0.27063, acc = 0.91357


100%|██████████| 21/21 [00:05<00:00,  3.79it/s]


[ Valid | 102/200 ] loss = 0.90650, acc = 0.75506


100%|██████████| 213/213 [01:07<00:00,  3.17it/s]


Using 80.55% of unlabeld data


100%|██████████| 267/267 [01:26<00:00,  3.10it/s]


[ Train | 103/200 ] loss = 0.26147, acc = 0.91304


100%|██████████| 21/21 [00:06<00:00,  3.12it/s]


[ Valid | 103/200 ] loss = 0.90209, acc = 0.76875


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 82.18% of unlabeld data


100%|██████████| 270/270 [01:28<00:00,  3.06it/s]


[ Train | 104/200 ] loss = 0.24821, acc = 0.91528


100%|██████████| 21/21 [00:06<00:00,  3.29it/s]


[ Valid | 104/200 ] loss = 0.88656, acc = 0.80030


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 83.79% of unlabeld data


100%|██████████| 273/273 [01:29<00:00,  3.06it/s]


[ Train | 105/200 ] loss = 0.28381, acc = 0.91094


100%|██████████| 21/21 [00:05<00:00,  3.72it/s]


[ Valid | 105/200 ] loss = 0.85420, acc = 0.77560


100%|██████████| 213/213 [01:06<00:00,  3.20it/s]


Using 81.40% of unlabeld data


100%|██████████| 268/268 [01:25<00:00,  3.13it/s]


[ Train | 106/200 ] loss = 0.25158, acc = 0.91826


100%|██████████| 21/21 [00:06<00:00,  3.28it/s]


[ Valid | 106/200 ] loss = 0.79517, acc = 0.79911


100%|██████████| 213/213 [01:03<00:00,  3.35it/s]


Using 81.36% of unlabeld data


100%|██████████| 268/268 [01:24<00:00,  3.16it/s]


[ Train | 107/200 ] loss = 0.25127, acc = 0.92211


100%|██████████| 21/21 [00:06<00:00,  3.28it/s]


[ Valid | 107/200 ] loss = 0.96596, acc = 0.75446


100%|██████████| 213/213 [01:04<00:00,  3.32it/s]


Using 83.07% of unlabeld data


100%|██████████| 272/272 [01:25<00:00,  3.19it/s]


[ Train | 108/200 ] loss = 0.28164, acc = 0.91062


100%|██████████| 21/21 [00:06<00:00,  3.32it/s]


[ Valid | 108/200 ] loss = 0.93351, acc = 0.78542


100%|██████████| 213/213 [01:03<00:00,  3.33it/s]


Using 81.71% of unlabeld data


100%|██████████| 269/269 [01:25<00:00,  3.16it/s]


[ Train | 109/200 ] loss = 0.25136, acc = 0.91787


100%|██████████| 21/21 [00:06<00:00,  3.31it/s]


[ Valid | 109/200 ] loss = 0.95687, acc = 0.74792


100%|██████████| 213/213 [01:03<00:00,  3.33it/s]


Using 80.03% of unlabeld data


100%|██████████| 265/265 [01:22<00:00,  3.20it/s]


[ Train | 110/200 ] loss = 0.28617, acc = 0.90920


100%|██████████| 21/21 [00:05<00:00,  3.77it/s]


[ Valid | 110/200 ] loss = 0.92201, acc = 0.78333


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 80.73% of unlabeld data


100%|██████████| 267/267 [01:24<00:00,  3.16it/s]


[ Train | 111/200 ] loss = 0.28583, acc = 0.90765


100%|██████████| 21/21 [00:05<00:00,  3.85it/s]


[ Valid | 111/200 ] loss = 0.96481, acc = 0.76994


100%|██████████| 213/213 [01:03<00:00,  3.36it/s]


Using 81.01% of unlabeld data


100%|██████████| 268/268 [01:23<00:00,  3.19it/s]


[ Train | 112/200 ] loss = 0.24837, acc = 0.91733


100%|██████████| 21/21 [00:06<00:00,  3.33it/s]


[ Valid | 112/200 ] loss = 0.93764, acc = 0.77440


100%|██████████| 213/213 [01:03<00:00,  3.35it/s]


Using 82.27% of unlabeld data


100%|██████████| 270/270 [01:24<00:00,  3.18it/s]


[ Train | 113/200 ] loss = 0.26954, acc = 0.91285


100%|██████████| 21/21 [00:06<00:00,  3.36it/s]


[ Valid | 113/200 ] loss = 0.92895, acc = 0.76994


100%|██████████| 213/213 [01:03<00:00,  3.34it/s]


Using 82.49% of unlabeld data


100%|██████████| 271/271 [01:25<00:00,  3.17it/s]


[ Train | 114/200 ] loss = 0.25891, acc = 0.91605


100%|██████████| 21/21 [00:06<00:00,  3.28it/s]


[ Valid | 114/200 ] loss = 1.01781, acc = 0.75655


100%|██████████| 213/213 [01:03<00:00,  3.34it/s]


Using 81.51% of unlabeld data


100%|██████████| 269/269 [01:24<00:00,  3.18it/s]


[ Train | 115/200 ] loss = 0.26752, acc = 0.91276


100%|██████████| 21/21 [00:07<00:00,  2.99it/s]


[ Valid | 115/200 ] loss = 1.08781, acc = 0.76964


100%|██████████| 213/213 [01:03<00:00,  3.36it/s]


Using 82.39% of unlabeld data


100%|██████████| 270/270 [01:25<00:00,  3.17it/s]


[ Train | 116/200 ] loss = 0.26984, acc = 0.91030


100%|██████████| 21/21 [00:05<00:00,  3.83it/s]


[ Valid | 116/200 ] loss = 1.01504, acc = 0.75149


100%|██████████| 213/213 [01:04<00:00,  3.32it/s]


Using 80.77% of unlabeld data


100%|██████████| 267/267 [01:23<00:00,  3.19it/s]


[ Train | 117/200 ] loss = 0.27120, acc = 0.91163


100%|██████████| 21/21 [00:05<00:00,  3.67it/s]


[ Valid | 117/200 ] loss = 0.89085, acc = 0.77470


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 81.89% of unlabeld data


100%|██████████| 269/269 [01:26<00:00,  3.11it/s]


[ Train | 118/200 ] loss = 0.24987, acc = 0.92100


100%|██████████| 21/21 [00:05<00:00,  3.67it/s]


[ Valid | 118/200 ] loss = 1.07752, acc = 0.75179


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 81.82% of unlabeld data


100%|██████████| 269/269 [01:25<00:00,  3.13it/s]


[ Train | 119/200 ] loss = 0.26133, acc = 0.91415


100%|██████████| 21/21 [00:05<00:00,  3.56it/s]


[ Valid | 119/200 ] loss = 0.84205, acc = 0.79911


100%|██████████| 213/213 [01:04<00:00,  3.29it/s]


Using 82.27% of unlabeld data


100%|██████████| 270/270 [01:24<00:00,  3.21it/s]


[ Train | 120/200 ] loss = 0.24576, acc = 0.91875


100%|██████████| 21/21 [00:06<00:00,  3.49it/s]


[ Valid | 120/200 ] loss = 1.06495, acc = 0.76339


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 80.89% of unlabeld data


100%|██████████| 267/267 [01:24<00:00,  3.15it/s]


[ Train | 121/200 ] loss = 0.27309, acc = 0.91035


100%|██████████| 21/21 [00:06<00:00,  3.32it/s]


[ Valid | 121/200 ] loss = 0.94767, acc = 0.78452


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 82.23% of unlabeld data


100%|██████████| 270/270 [01:26<00:00,  3.13it/s]


[ Train | 122/200 ] loss = 0.26862, acc = 0.91586


100%|██████████| 21/21 [00:06<00:00,  3.18it/s]


[ Valid | 122/200 ] loss = 1.05524, acc = 0.73036


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 81.86% of unlabeld data


100%|██████████| 269/269 [01:24<00:00,  3.18it/s]


[ Train | 123/200 ] loss = 0.26511, acc = 0.91078


100%|██████████| 21/21 [00:07<00:00,  2.94it/s]


[ Valid | 123/200 ] loss = 1.07123, acc = 0.76012


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 80.18% of unlabeld data


100%|██████████| 266/266 [01:26<00:00,  3.09it/s]


[ Train | 124/200 ] loss = 0.26898, acc = 0.91353


100%|██████████| 21/21 [00:06<00:00,  3.26it/s]


[ Valid | 124/200 ] loss = 0.89753, acc = 0.78512


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 82.48% of unlabeld data


100%|██████████| 271/271 [01:27<00:00,  3.10it/s]


[ Train | 125/200 ] loss = 0.24952, acc = 0.91767


100%|██████████| 21/21 [00:06<00:00,  3.26it/s]


[ Valid | 125/200 ] loss = 0.86656, acc = 0.78899


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 81.39% of unlabeld data


100%|██████████| 268/268 [01:26<00:00,  3.09it/s]


[ Train | 126/200 ] loss = 0.25273, acc = 0.91535


100%|██████████| 21/21 [00:05<00:00,  3.51it/s]


[ Valid | 126/200 ] loss = 0.86308, acc = 0.80000


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 83.41% of unlabeld data


100%|██████████| 273/273 [01:27<00:00,  3.12it/s]


[ Train | 127/200 ] loss = 0.23546, acc = 0.92079


100%|██████████| 21/21 [00:05<00:00,  3.76it/s]


[ Valid | 127/200 ] loss = 0.87239, acc = 0.79226


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 80.95% of unlabeld data


100%|██████████| 267/267 [01:26<00:00,  3.09it/s]


[ Train | 128/200 ] loss = 0.28000, acc = 0.90777


100%|██████████| 21/21 [00:05<00:00,  3.73it/s]


[ Valid | 128/200 ] loss = 0.85845, acc = 0.77798


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 80.75% of unlabeld data


100%|██████████| 267/267 [01:26<00:00,  3.10it/s]


[ Train | 129/200 ] loss = 0.25541, acc = 0.91585


100%|██████████| 21/21 [00:05<00:00,  3.75it/s]


[ Valid | 129/200 ] loss = 0.97371, acc = 0.76994


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 82.17% of unlabeld data


100%|██████████| 270/270 [01:26<00:00,  3.11it/s]


[ Train | 130/200 ] loss = 0.27049, acc = 0.91065


100%|██████████| 21/21 [00:05<00:00,  3.78it/s]


[ Valid | 130/200 ] loss = 0.85041, acc = 0.78631


100%|██████████| 213/213 [01:06<00:00,  3.21it/s]


Using 81.11% of unlabeld data


100%|██████████| 268/268 [01:26<00:00,  3.10it/s]


[ Train | 131/200 ] loss = 0.24911, acc = 0.92001


100%|██████████| 21/21 [00:05<00:00,  3.82it/s]


[ Valid | 131/200 ] loss = 0.77412, acc = 0.80238


100%|██████████| 213/213 [01:06<00:00,  3.22it/s]


Using 81.51% of unlabeld data


100%|██████████| 269/269 [01:26<00:00,  3.09it/s]


[ Train | 132/200 ] loss = 0.26211, acc = 0.91612


100%|██████████| 21/21 [00:06<00:00,  3.32it/s]


[ Valid | 132/200 ] loss = 0.87599, acc = 0.79196


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 81.26% of unlabeld data


100%|██████████| 268/268 [01:25<00:00,  3.15it/s]


[ Train | 133/200 ] loss = 0.26895, acc = 0.91465


100%|██████████| 21/21 [00:05<00:00,  3.52it/s]


[ Valid | 133/200 ] loss = 0.77518, acc = 0.80625


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 81.40% of unlabeld data


100%|██████████| 268/268 [01:26<00:00,  3.10it/s]


[ Train | 134/200 ] loss = 0.24444, acc = 0.91803


100%|██████████| 21/21 [00:05<00:00,  3.50it/s]


[ Valid | 134/200 ] loss = 0.85650, acc = 0.78155


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 83.24% of unlabeld data


100%|██████████| 272/272 [01:26<00:00,  3.14it/s]


[ Train | 135/200 ] loss = 0.23899, acc = 0.91981


100%|██████████| 21/21 [00:06<00:00,  3.38it/s]


[ Valid | 135/200 ] loss = 0.87489, acc = 0.78423


100%|██████████| 213/213 [01:04<00:00,  3.29it/s]


Using 80.67% of unlabeld data


100%|██████████| 267/267 [01:24<00:00,  3.16it/s]


[ Train | 136/200 ] loss = 0.26915, acc = 0.91128


100%|██████████| 21/21 [00:05<00:00,  3.74it/s]


[ Valid | 136/200 ] loss = 0.88021, acc = 0.78423


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 81.87% of unlabeld data


100%|██████████| 269/269 [01:26<00:00,  3.13it/s]


[ Train | 137/200 ] loss = 0.26926, acc = 0.91392


100%|██████████| 21/21 [00:05<00:00,  3.74it/s]


[ Valid | 137/200 ] loss = 0.82979, acc = 0.79107


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 79.90% of unlabeld data


100%|██████████| 265/265 [01:24<00:00,  3.13it/s]


[ Train | 138/200 ] loss = 0.29700, acc = 0.90507


100%|██████████| 21/21 [00:05<00:00,  3.75it/s]


[ Valid | 138/200 ] loss = 0.90121, acc = 0.77232


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 77.69% of unlabeld data


100%|██████████| 261/261 [01:24<00:00,  3.10it/s]


[ Train | 139/200 ] loss = 0.26818, acc = 0.91355


100%|██████████| 21/21 [00:05<00:00,  3.77it/s]


[ Valid | 139/200 ] loss = 0.91356, acc = 0.77202


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 80.09% of unlabeld data


100%|██████████| 266/266 [01:24<00:00,  3.15it/s]


[ Train | 140/200 ] loss = 0.24502, acc = 0.91823


100%|██████████| 21/21 [00:05<00:00,  3.53it/s]


[ Valid | 140/200 ] loss = 0.89336, acc = 0.79345


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 80.18% of unlabeld data


100%|██████████| 266/266 [01:25<00:00,  3.13it/s]


[ Train | 141/200 ] loss = 0.25696, acc = 0.91424


100%|██████████| 21/21 [00:05<00:00,  3.57it/s]


[ Valid | 141/200 ] loss = 0.96587, acc = 0.77232


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 78.68% of unlabeld data


100%|██████████| 263/263 [01:23<00:00,  3.15it/s]


[ Train | 142/200 ] loss = 0.27484, acc = 0.91148


100%|██████████| 21/21 [00:06<00:00,  3.23it/s]


[ Valid | 142/200 ] loss = 0.86453, acc = 0.79732


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 79.96% of unlabeld data


100%|██████████| 265/265 [01:24<00:00,  3.14it/s]


[ Train | 143/200 ] loss = 0.26814, acc = 0.91085


100%|██████████| 21/21 [00:06<00:00,  3.25it/s]


[ Valid | 143/200 ] loss = 0.85657, acc = 0.80476


100%|██████████| 213/213 [01:04<00:00,  3.31it/s]


Using 79.80% of unlabeld data


100%|██████████| 265/265 [01:25<00:00,  3.12it/s]


[ Train | 144/200 ] loss = 0.26385, acc = 0.91368


100%|██████████| 21/21 [00:06<00:00,  3.30it/s]


[ Valid | 144/200 ] loss = 0.82588, acc = 0.80030


100%|██████████| 213/213 [01:04<00:00,  3.31it/s]


Using 80.65% of unlabeld data


100%|██████████| 267/267 [01:25<00:00,  3.12it/s]


[ Train | 145/200 ] loss = 0.25288, acc = 0.92018


100%|██████████| 21/21 [00:05<00:00,  3.79it/s]


[ Valid | 145/200 ] loss = 0.83417, acc = 0.79435


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 80.93% of unlabeld data


100%|██████████| 267/267 [01:25<00:00,  3.12it/s]


[ Train | 146/200 ] loss = 0.27567, acc = 0.91128


100%|██████████| 21/21 [00:05<00:00,  3.79it/s]


[ Valid | 146/200 ] loss = 0.73775, acc = 0.79583


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 79.99% of unlabeld data


100%|██████████| 265/265 [01:25<00:00,  3.10it/s]


[ Train | 147/200 ] loss = 0.24158, acc = 0.91946


100%|██████████| 21/21 [00:05<00:00,  3.79it/s]


[ Valid | 147/200 ] loss = 0.88384, acc = 0.77500


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 79.61% of unlabeld data


100%|██████████| 265/265 [01:24<00:00,  3.12it/s]


[ Train | 148/200 ] loss = 0.27957, acc = 0.91144


100%|██████████| 21/21 [00:05<00:00,  3.74it/s]


[ Valid | 148/200 ] loss = 0.85360, acc = 0.77768


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 81.74% of unlabeld data


100%|██████████| 269/269 [01:26<00:00,  3.13it/s]


[ Train | 149/200 ] loss = 0.26470, acc = 0.91368


100%|██████████| 21/21 [00:05<00:00,  3.78it/s]


[ Valid | 149/200 ] loss = 0.82137, acc = 0.79018


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 79.63% of unlabeld data


100%|██████████| 265/265 [01:25<00:00,  3.10it/s]


[ Train | 150/200 ] loss = 0.25890, acc = 0.91380


100%|██████████| 21/21 [00:05<00:00,  3.76it/s]


[ Valid | 150/200 ] loss = 0.81789, acc = 0.78899


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 81.08% of unlabeld data


100%|██████████| 268/268 [01:25<00:00,  3.14it/s]


[ Train | 151/200 ] loss = 0.24665, acc = 0.92176


100%|██████████| 21/21 [00:05<00:00,  3.71it/s]


[ Valid | 151/200 ] loss = 0.85378, acc = 0.77560


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 78.13% of unlabeld data


100%|██████████| 261/261 [01:22<00:00,  3.15it/s]


[ Train | 152/200 ] loss = 0.27018, acc = 0.91367


100%|██████████| 21/21 [00:06<00:00,  3.19it/s]


[ Valid | 152/200 ] loss = 0.72791, acc = 0.79970


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 81.89% of unlabeld data


100%|██████████| 269/269 [01:27<00:00,  3.08it/s]


[ Train | 153/200 ] loss = 0.24414, acc = 0.92077


100%|██████████| 21/21 [00:05<00:00,  3.52it/s]


[ Valid | 153/200 ] loss = 0.84208, acc = 0.78304


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 81.21% of unlabeld data


100%|██████████| 268/268 [01:25<00:00,  3.13it/s]


[ Train | 154/200 ] loss = 0.26158, acc = 0.91465


100%|██████████| 21/21 [00:05<00:00,  3.77it/s]


[ Valid | 154/200 ] loss = 0.81023, acc = 0.80655


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 82.57% of unlabeld data


100%|██████████| 271/271 [01:25<00:00,  3.16it/s]


[ Train | 155/200 ] loss = 0.23907, acc = 0.92274


100%|██████████| 21/21 [00:05<00:00,  3.81it/s]


[ Valid | 155/200 ] loss = 0.79569, acc = 0.78006


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 82.15% of unlabeld data


100%|██████████| 270/270 [01:26<00:00,  3.11it/s]


[ Train | 156/200 ] loss = 0.25424, acc = 0.91551


100%|██████████| 21/21 [00:05<00:00,  3.73it/s]


[ Valid | 156/200 ] loss = 0.78004, acc = 0.80030


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 82.14% of unlabeld data


100%|██████████| 270/270 [01:27<00:00,  3.09it/s]


[ Train | 157/200 ] loss = 0.24971, acc = 0.91782


100%|██████████| 21/21 [00:05<00:00,  3.62it/s]


[ Valid | 157/200 ] loss = 0.89888, acc = 0.79345


100%|██████████| 213/213 [01:05<00:00,  3.23it/s]


Using 82.20% of unlabeld data


100%|██████████| 270/270 [01:26<00:00,  3.12it/s]


[ Train | 158/200 ] loss = 0.26280, acc = 0.91250


100%|██████████| 21/21 [00:05<00:00,  3.74it/s]


[ Valid | 158/200 ] loss = 0.81248, acc = 0.78899


100%|██████████| 213/213 [01:06<00:00,  3.22it/s]


Using 82.51% of unlabeld data


100%|██████████| 271/271 [01:27<00:00,  3.10it/s]


[ Train | 159/200 ] loss = 0.24242, acc = 0.91836


100%|██████████| 21/21 [00:05<00:00,  3.75it/s]


[ Valid | 159/200 ] loss = 0.83501, acc = 0.77560


100%|██████████| 213/213 [01:06<00:00,  3.22it/s]


Using 82.36% of unlabeld data


100%|██████████| 270/270 [01:26<00:00,  3.13it/s]


[ Train | 160/200 ] loss = 0.25935, acc = 0.91551


100%|██████████| 21/21 [00:06<00:00,  3.28it/s]


[ Valid | 160/200 ] loss = 0.82454, acc = 0.78899


100%|██████████| 213/213 [01:04<00:00,  3.31it/s]


Using 81.54% of unlabeld data


100%|██████████| 269/269 [01:25<00:00,  3.14it/s]


[ Train | 161/200 ] loss = 0.26504, acc = 0.91566


100%|██████████| 21/21 [00:06<00:00,  3.38it/s]


[ Valid | 161/200 ] loss = 0.78779, acc = 0.78304


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 81.80% of unlabeld data


100%|██████████| 269/269 [01:25<00:00,  3.14it/s]


[ Train | 162/200 ] loss = 0.25247, acc = 0.91601


100%|██████████| 21/21 [00:05<00:00,  3.58it/s]


[ Valid | 162/200 ] loss = 0.87407, acc = 0.77589


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 80.53% of unlabeld data


100%|██████████| 267/267 [01:25<00:00,  3.14it/s]


[ Train | 163/200 ] loss = 0.28238, acc = 0.91152


100%|██████████| 21/21 [00:05<00:00,  3.82it/s]


[ Valid | 163/200 ] loss = 0.82102, acc = 0.78720


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 81.30% of unlabeld data


100%|██████████| 268/268 [01:25<00:00,  3.14it/s]


[ Train | 164/200 ] loss = 0.24826, acc = 0.91908


100%|██████████| 21/21 [00:05<00:00,  3.66it/s]


[ Valid | 164/200 ] loss = 0.80986, acc = 0.79226


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 83.88% of unlabeld data


100%|██████████| 274/274 [01:26<00:00,  3.16it/s]


[ Train | 165/200 ] loss = 0.24465, acc = 0.92119


100%|██████████| 21/21 [00:05<00:00,  3.78it/s]


[ Valid | 165/200 ] loss = 0.73846, acc = 0.81637
Saving model with acc 0.81637


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 83.04% of unlabeld data


100%|██████████| 272/272 [01:26<00:00,  3.13it/s]


[ Train | 166/200 ] loss = 0.24763, acc = 0.91820


100%|██████████| 21/21 [00:05<00:00,  3.82it/s]


[ Valid | 166/200 ] loss = 0.82651, acc = 0.79911


100%|██████████| 213/213 [01:05<00:00,  3.28it/s]


Using 82.14% of unlabeld data


100%|██████████| 270/270 [01:26<00:00,  3.14it/s]


[ Train | 167/200 ] loss = 0.26694, acc = 0.90984


100%|██████████| 21/21 [00:05<00:00,  3.74it/s]


[ Valid | 167/200 ] loss = 0.86275, acc = 0.78214


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 80.36% of unlabeld data


100%|██████████| 266/266 [01:25<00:00,  3.12it/s]


[ Train | 168/200 ] loss = 0.23752, acc = 0.92317


100%|██████████| 21/21 [00:05<00:00,  3.80it/s]


[ Valid | 168/200 ] loss = 0.86759, acc = 0.78006


100%|██████████| 213/213 [01:05<00:00,  3.28it/s]


Using 80.90% of unlabeld data


100%|██████████| 267/267 [01:24<00:00,  3.16it/s]


[ Train | 169/200 ] loss = 0.27539, acc = 0.91456


100%|██████████| 21/21 [00:05<00:00,  3.77it/s]


[ Valid | 169/200 ] loss = 0.76053, acc = 0.80357


100%|██████████| 213/213 [01:05<00:00,  3.28it/s]


Using 79.09% of unlabeld data


100%|██████████| 263/263 [01:23<00:00,  3.14it/s]


[ Train | 170/200 ] loss = 0.25440, acc = 0.91445


100%|██████████| 21/21 [00:06<00:00,  3.44it/s]


[ Valid | 170/200 ] loss = 0.69621, acc = 0.80476


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 81.28% of unlabeld data


100%|██████████| 268/268 [01:25<00:00,  3.14it/s]


[ Train | 171/200 ] loss = 0.26153, acc = 0.91371


100%|██████████| 21/21 [00:06<00:00,  3.34it/s]


[ Valid | 171/200 ] loss = 0.81426, acc = 0.79167


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 79.09% of unlabeld data


100%|██████████| 263/263 [01:23<00:00,  3.14it/s]


[ Train | 172/200 ] loss = 0.25541, acc = 0.91873


100%|██████████| 21/21 [00:06<00:00,  3.15it/s]


[ Valid | 172/200 ] loss = 0.83915, acc = 0.77976


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 82.29% of unlabeld data


100%|██████████| 270/270 [01:26<00:00,  3.12it/s]


[ Train | 173/200 ] loss = 0.26891, acc = 0.91111


100%|██████████| 21/21 [00:06<00:00,  3.25it/s]


[ Valid | 173/200 ] loss = 0.77603, acc = 0.78214


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 83.19% of unlabeld data


100%|██████████| 272/272 [01:27<00:00,  3.12it/s]


[ Train | 174/200 ] loss = 0.25783, acc = 0.91567


100%|██████████| 21/21 [00:05<00:00,  3.55it/s]


[ Valid | 174/200 ] loss = 0.79633, acc = 0.78363


100%|██████████| 213/213 [01:04<00:00,  3.31it/s]


Using 81.01% of unlabeld data


100%|██████████| 268/268 [01:24<00:00,  3.15it/s]


[ Train | 175/200 ] loss = 0.24618, acc = 0.91884


100%|██████████| 21/21 [00:06<00:00,  3.19it/s]


[ Valid | 175/200 ] loss = 0.79932, acc = 0.78244


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 82.63% of unlabeld data


100%|██████████| 271/271 [01:26<00:00,  3.12it/s]


[ Train | 176/200 ] loss = 0.24381, acc = 0.92009


100%|██████████| 21/21 [00:06<00:00,  3.34it/s]


[ Valid | 176/200 ] loss = 0.83231, acc = 0.76845


100%|██████████| 213/213 [01:04<00:00,  3.31it/s]


Using 83.20% of unlabeld data


100%|██████████| 272/272 [01:26<00:00,  3.13it/s]


[ Train | 177/200 ] loss = 0.25153, acc = 0.91613


100%|██████████| 21/21 [00:06<00:00,  3.50it/s]


[ Valid | 177/200 ] loss = 0.90007, acc = 0.76429


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 82.94% of unlabeld data


100%|██████████| 272/272 [01:26<00:00,  3.15it/s]


[ Train | 178/200 ] loss = 0.25340, acc = 0.91923


100%|██████████| 21/21 [00:05<00:00,  3.78it/s]


[ Valid | 178/200 ] loss = 0.82766, acc = 0.78006


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 82.49% of unlabeld data


100%|██████████| 271/271 [01:26<00:00,  3.12it/s]


[ Train | 179/200 ] loss = 0.24776, acc = 0.92147


100%|██████████| 21/21 [00:05<00:00,  3.66it/s]


[ Valid | 179/200 ] loss = 0.99524, acc = 0.74702


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 80.93% of unlabeld data


100%|██████████| 267/267 [01:25<00:00,  3.12it/s]


[ Train | 180/200 ] loss = 0.26913, acc = 0.90953


100%|██████████| 21/21 [00:05<00:00,  3.76it/s]


[ Valid | 180/200 ] loss = 0.85194, acc = 0.79464


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 80.33% of unlabeld data


100%|██████████| 266/266 [01:25<00:00,  3.11it/s]


[ Train | 181/200 ] loss = 0.26937, acc = 0.91283


100%|██████████| 21/21 [00:05<00:00,  3.70it/s]


[ Valid | 181/200 ] loss = 1.01163, acc = 0.76607


100%|██████████| 213/213 [01:06<00:00,  3.20it/s]


Using 80.45% of unlabeld data


100%|██████████| 266/266 [01:25<00:00,  3.13it/s]


[ Train | 182/200 ] loss = 0.29536, acc = 0.90273


100%|██████████| 21/21 [00:05<00:00,  3.73it/s]


[ Valid | 182/200 ] loss = 0.88461, acc = 0.76101


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 81.39% of unlabeld data


100%|██████████| 268/268 [01:26<00:00,  3.11it/s]


[ Train | 183/200 ] loss = 0.26460, acc = 0.91535


100%|██████████| 21/21 [00:05<00:00,  3.78it/s]


[ Valid | 183/200 ] loss = 0.87448, acc = 0.75774


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 83.22% of unlabeld data


100%|██████████| 272/272 [01:26<00:00,  3.15it/s]


[ Train | 184/200 ] loss = 0.25125, acc = 0.92153


100%|██████████| 21/21 [00:05<00:00,  3.78it/s]


[ Valid | 184/200 ] loss = 0.86468, acc = 0.77351


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 80.98% of unlabeld data


100%|██████████| 267/267 [01:24<00:00,  3.14it/s]


[ Train | 185/200 ] loss = 0.25215, acc = 0.92088


100%|██████████| 21/21 [00:05<00:00,  3.84it/s]


[ Valid | 185/200 ] loss = 0.92680, acc = 0.77827


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 81.52% of unlabeld data


100%|██████████| 269/269 [01:26<00:00,  3.13it/s]


[ Train | 186/200 ] loss = 0.27292, acc = 0.91403


100%|██████████| 21/21 [00:05<00:00,  3.64it/s]


[ Valid | 186/200 ] loss = 1.08085, acc = 0.73333


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 77.70% of unlabeld data


100%|██████████| 261/261 [01:23<00:00,  3.12it/s]


[ Train | 187/200 ] loss = 0.29277, acc = 0.91020


100%|██████████| 21/21 [00:05<00:00,  3.66it/s]


[ Valid | 187/200 ] loss = 0.88483, acc = 0.77798


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 79.00% of unlabeld data


100%|██████████| 263/263 [01:24<00:00,  3.10it/s]


[ Train | 188/200 ] loss = 0.25932, acc = 0.91813


100%|██████████| 21/21 [00:05<00:00,  3.80it/s]


[ Valid | 188/200 ] loss = 0.98052, acc = 0.77351


100%|██████████| 213/213 [01:05<00:00,  3.26it/s]


Using 82.05% of unlabeld data


100%|██████████| 270/270 [01:26<00:00,  3.13it/s]


[ Train | 189/200 ] loss = 0.26668, acc = 0.91343


100%|██████████| 21/21 [00:05<00:00,  3.82it/s]


[ Valid | 189/200 ] loss = 0.92807, acc = 0.78869


100%|██████████| 213/213 [01:05<00:00,  3.25it/s]


Using 80.45% of unlabeld data


100%|██████████| 266/266 [01:24<00:00,  3.14it/s]


[ Train | 190/200 ] loss = 0.26307, acc = 0.91400


100%|██████████| 21/21 [00:05<00:00,  3.83it/s]


[ Valid | 190/200 ] loss = 0.87380, acc = 0.78929


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 81.67% of unlabeld data


100%|██████████| 269/269 [01:25<00:00,  3.15it/s]


[ Train | 191/200 ] loss = 0.24498, acc = 0.92309


100%|██████████| 21/21 [00:05<00:00,  3.78it/s]


[ Valid | 191/200 ] loss = 0.78363, acc = 0.79762


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 82.58% of unlabeld data


100%|██████████| 271/271 [01:25<00:00,  3.17it/s]


[ Train | 192/200 ] loss = 0.24406, acc = 0.91917


100%|██████████| 21/21 [00:05<00:00,  3.82it/s]


[ Valid | 192/200 ] loss = 0.84120, acc = 0.77917


100%|██████████| 213/213 [01:04<00:00,  3.29it/s]


Using 82.23% of unlabeld data


100%|██████████| 270/270 [01:25<00:00,  3.15it/s]


[ Train | 193/200 ] loss = 0.24833, acc = 0.91840


100%|██████████| 21/21 [00:05<00:00,  3.78it/s]


[ Valid | 193/200 ] loss = 1.02269, acc = 0.77351


100%|██████████| 213/213 [01:05<00:00,  3.27it/s]


Using 83.97% of unlabeld data


100%|██████████| 274/274 [01:26<00:00,  3.18it/s]


[ Train | 194/200 ] loss = 0.27360, acc = 0.91070


100%|██████████| 21/21 [00:05<00:00,  3.73it/s]


[ Valid | 194/200 ] loss = 0.86653, acc = 0.78065


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 80.67% of unlabeld data


100%|██████████| 267/267 [01:23<00:00,  3.18it/s]


[ Train | 195/200 ] loss = 0.24724, acc = 0.91807


100%|██████████| 21/21 [00:06<00:00,  3.40it/s]


[ Valid | 195/200 ] loss = 0.97705, acc = 0.76369


100%|██████████| 213/213 [01:04<00:00,  3.29it/s]


Using 82.12% of unlabeld data


100%|██████████| 270/270 [01:25<00:00,  3.14it/s]


[ Train | 196/200 ] loss = 0.26489, acc = 0.91528


100%|██████████| 21/21 [00:06<00:00,  3.35it/s]


[ Valid | 196/200 ] loss = 1.00280, acc = 0.77411


100%|██████████| 213/213 [01:05<00:00,  3.24it/s]


Using 82.14% of unlabeld data


100%|██████████| 270/270 [01:25<00:00,  3.17it/s]


[ Train | 197/200 ] loss = 0.26219, acc = 0.91331


100%|██████████| 21/21 [00:05<00:00,  3.70it/s]


[ Valid | 197/200 ] loss = 0.80792, acc = 0.80238


100%|██████████| 213/213 [01:04<00:00,  3.31it/s]


Using 79.78% of unlabeld data


100%|██████████| 265/265 [01:24<00:00,  3.12it/s]


[ Train | 198/200 ] loss = 0.26688, acc = 0.91722


100%|██████████| 21/21 [00:06<00:00,  3.34it/s]


[ Valid | 198/200 ] loss = 0.83194, acc = 0.79226


100%|██████████| 213/213 [01:04<00:00,  3.28it/s]


Using 81.62% of unlabeld data


100%|██████████| 269/269 [01:25<00:00,  3.16it/s]


[ Train | 199/200 ] loss = 0.24897, acc = 0.91671


100%|██████████| 21/21 [00:06<00:00,  3.29it/s]


[ Valid | 199/200 ] loss = 0.95493, acc = 0.78750


100%|██████████| 213/213 [01:04<00:00,  3.30it/s]


Using 83.61% of unlabeld data


100%|██████████| 273/273 [01:27<00:00,  3.11it/s]


[ Train | 200/200 ] loss = 0.24107, acc = 0.91678


100%|██████████| 21/21 [00:05<00:00,  3.76it/s]

[ Valid | 200/200 ] loss = 0.97733, acc = 0.78958





## **Testing**

For inference, make sure the model is in eval mode and that the dataset is not shuffled (`shuffle=False` in `test_loader`), so that each prediction stays aligned with its image id.

Last but not least, don't forget to save the predictions into a single CSV file.
The format of the CSV file should follow the rules mentioned in the slides.

### **WARNING -- Keep in Mind**

Cheating includes but is not limited to:
1.   using testing labels,
2.   submitting results to previous Kaggle competitions,
3.   sharing predictions with others,
4.   copying code from any creature on Earth,
5.   asking other people to do it for you.

Any violation brings penalties ranging from a deduction on the final grade to failing the course.

It is your responsibility to check whether your code violates the rules.
When citing code from the Internet, you should know exactly what that code does.
Breaking the rules will **NOT** be tolerated, even if you claim you don't know what the code does.


In [11]:
model.load_state_dict(torch.load('./model.ckpt'))
model.eval()
opt_model.eval()

# Initialize a list to store the predictions.
predictions = []

# Iterate the testing set by batches.
for batch in tqdm(test_loader):
    imgs, _ = batch
    with torch.no_grad():
        logits = model(imgs.to(device))

    # Take the class with greatest logit as prediction and record it.
    predictions.extend(logits.argmax(dim=-1).cpu().numpy().tolist())

100%|██████████| 105/105 [00:33<00:00,  3.12it/s]


In [12]:
# Save predictions into the file.
with open("predict.csv", "w") as f:

    # The first row must be "Id,Category" (no space after the comma).
    f.write("Id,Category\n")

    # For the rest of the rows, each image id corresponds to a predicted class.
    for i, pred in enumerate(predictions):
        f.write(f"{i},{pred}\n")
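
Before submitting, it can help to confirm that the file matches the layout written above. The sketch below is only an illustration: the helper name `check_prediction_csv` is hypothetical, and the checks (an `Id,Category` header, ids counting up from 0, and class indices in the range 0 to 10 for the 11 food-11 classes) are assumptions taken from the cell above, not the official rules in the slides.

```python
import csv


def check_prediction_csv(path, num_classes=11):
    """Validate the layout of a prediction file and return its row count.

    Assumed layout (taken from the writing cell, not from the slides):
    header "Id,Category", ids counting up from 0, integer class indices.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)

        # The header must match exactly, with no space after the comma.
        header = next(reader)
        if header != ["Id", "Category"]:
            raise ValueError(f"unexpected header: {header}")

        count = 0
        for expected_id, row in enumerate(reader):
            if len(row) != 2:
                raise ValueError(f"row {expected_id} has {len(row)} fields")

            image_id, category = int(row[0]), int(row[1])

            # Ids should follow the unshuffled order of the test set.
            if image_id != expected_id:
                raise ValueError(f"expected id {expected_id}, got {image_id}")

            # Class indices must be valid for the assumed number of classes.
            if not 0 <= category < num_classes:
                raise ValueError(f"category {category} is out of range")

            count += 1

    return count
```

Calling `check_prediction_csv("predict.csv")` after the cell above should return the number of test images; a `ValueError` points to the first row that breaks the assumed layout.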