# **Homework 3 - Convolutional Neural Network**

This is the example code of homework 3 of the machine learning course by Prof. Hung-yi Lee.

In this homework, you are required to build a convolutional neural network for image classification, possibly with some advanced training tips.


There are three levels here:

**Easy**: Build a simple convolutional neural network as the baseline. (2 pts)

**Medium**: Design a better architecture or adopt different data augmentations to improve the performance. (2 pts)

**Hard**: Utilize provided unlabeled data to obtain better results. (2 pts)

## **About the Dataset**

The dataset used here is food-11, a collection of food images in 11 classes.

To meet the homework requirements, the TAs slightly modified the data.
Please DO NOT access the original fully-labeled training data or testing labels.

Also, the modified dataset is for this course only, and any further distribution or commercial use is forbidden.

In [1]:
import os

os.environ['CUDA_VISIBLE_DEVICES'] = '1'

In [2]:
# Download the dataset
# You may choose where to download the data.

# Google Drive
# !gdown --id '1awF7pZ9Dz7X1jn1_QAiKN-_v56veCEKy' --output food-11.zip

# Dropbox
# !wget https://www.dropbox.com/s/m9q6273jl3djall/food-11.zip -O food-11.zip

# Unzip the dataset.
# This may take some time.
# !unzip -q food-11.zip

## **Import Packages**

First, we need to import packages that will be used later.

In this homework, we rely heavily on **torchvision**, PyTorch's companion library for computer vision.

In [3]:
# Import necessary packages.
import numpy as np
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
from PIL import Image
# "ConcatDataset" and "Subset" are possibly useful when doing semi-supervised learning.
from torch.utils.data import ConcatDataset, DataLoader, Subset
from torchvision.datasets import DatasetFolder

# This is for the progress bar.
from tqdm.notebook import tqdm

## **Dataset, Data Loader, and Transforms**

Torchvision provides lots of useful utilities for image preprocessing, data wrapping as well as data augmentation.

Here, since our data are stored in folders by class labels, we can directly apply **torchvision.datasets.DatasetFolder** for wrapping data without much effort.
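
To make the folder convention concrete, the sketch below reproduces how class names become integer labels: sub-directory names are sorted and numbered in that order. It uses only the standard library. The helper `find_classes` and the folder names are illustrative assumptions, not the homework's actual layout.

```python
import os
import tempfile


def find_classes(root):
    """Return (classes, class_to_idx) for a folder-per-class layout.

    Class names are the sorted sub-directory names of `root`,
    and indices follow that sorted order.
    """
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    return classes, class_to_idx


# Build a tiny throwaway layout with hypothetical class folder names.
with tempfile.TemporaryDirectory() as root:
    for name in ["10", "00", "02", "01"]:
        os.makedirs(os.path.join(root, name))
    classes, class_to_idx = find_classes(root)

print(classes)
print(class_to_idx)
```

Because the sort is lexicographic, unpadded names such as `2` and `10` would be ordered as `10`, `2`; zero-padded folder names avoid that surprise.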

Please refer to [PyTorch official website](https://pytorch.org/vision/stable/transforms.html) for details about different transforms.
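
As a small worked example of one transform: the cells below normalize each channel with mean 0.5 and standard deviation 0.5. Since `ToTensor` scales pixel values to [0, 1], this maps them to [-1, 1]. The helper `normalize` is a hypothetical plain-Python stand-in for the per-channel arithmetic, not torchvision's implementation.

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel normalization rule: (value - mean) / std."""
    return (value - mean) / std


# Pixel intensities after ToTensor lie in [0, 1].
for pixel in (0.0, 0.25, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

One consequence: `RandomErasing(value=0)` runs before `Normalize` in `train_tfm`, so erased patches end up at -1 in the normalized tensor.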

In [4]:
# It is important to do data augmentation in training.
# However, not every augmentation is useful.
# Please think about what kind of augmentation is helpful for food recognition.
train_tfm = transforms.Compose([
    # Resize so the shorter side is 256 pixels; the random crop below yields 224 x 224 inputs.
    transforms.Resize(256),
    transforms.RandomRotation(180, expand=True),
    transforms.RandomCrop(224),
    # transforms.CenterCrop(224),
    transforms.ColorJitter(brightness=0.2, contrast=0.25, saturation=.15, hue=0.05),
    transforms.RandomVerticalFlip(p=0.2),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.5, scale=(0.05, 0.15), ratio=(0.5, 1.5), value=0, inplace=False),
    transforms.Normalize(mean = (0.5, 0.5, 0.5), std = (0.5, 0.5, 0.5)),
])

# We don't need augmentations in testing and validation.
# All we need here is to resize and center-crop the PIL image, convert it to a Tensor, and normalize it.
test_tfm = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean = (0.5, 0.5, 0.5), std = (0.5, 0.5, 0.5)),
])


In [5]:
class RandomTransformDataset:
    def __init__(self, dataset, transform, random_time):
        self.dataset = dataset
        self.transform = transform
        self.random_time = random_time
    
    def __len__(self):
        return len(self.dataset)

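    def __getitem__(self, idx):
        # The wrapped dataset must return untransformed PIL images (transform=None).
        jpg, _ = self.dataset[idx]
        # Apply the random transform `random_time` times to the same image.
        imgs = [self.transform(jpg) for _ in range(self.random_time)]
        # (RANDOM_NUM, 3, shape0, shape1)
        return torch.stack(imgs, 0)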
    
    @staticmethod
    def merge_batch(imgs):
        # (batch, RANDOM_NUM, 3, shape0, shape1)
        return imgs.reshape(-1, *imgs.shape[2:])
    
    def merge_predict(self, predicts):
        # predicts: (batch*RANDOM_NUM, 11)
        res = []
        labels = torch.argmax(predicts, dim=1)
        # labels: (batch*RANDOM_NUM)
        for prob in torch.split(labels, self.random_time):
            # (RANDOM_NUM)
            # voting
            res.append(torch.argmax(torch.bincount(prob)))
        # (batch)
        return  res
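
The voting step in `merge_predict` can be sketched without tensors: the flat list of predicted labels is cut into groups of `RANDOM_NUM` views, and each group elects its most frequent label. This is a minimal illustration under stated assumptions; `majority_vote` is a hypothetical helper, and breaking ties in favour of the smallest label is a choice made for this sketch.

```python
from collections import Counter


def majority_vote(labels, group_size):
    """Split a flat list of predicted labels into groups and vote per group.

    Ties go to the smallest label (an assumption of this sketch).
    """
    if len(labels) % group_size != 0:
        raise ValueError("labels must contain whole groups")
    winners = []
    for start in range(0, len(labels), group_size):
        counts = Counter(labels[start:start + group_size])
        best = max(counts.values())
        winners.append(min(label for label, n in counts.items() if n == best))
    return winners


# Two images, four augmented views each (hypothetical predictions).
flat_predictions = [3, 3, 7, 3, 5, 2, 2, 5]
print(majority_vote(flat_predictions, group_size=4))
```

Voting on hard labels discards the confidence of each view; averaging the softmax outputs across views before taking the argmax is a common alternative.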



In [6]:
# Batch size for training, validation, and testing.
# A larger batch size usually gives a more stable gradient.
# But the GPU memory is limited, so please adjust it carefully.
batch_size = 128

# Construct datasets.
# The argument "loader" tells how torchvision reads the data.
data_folder = "/data/ML2021/hw3/"

train_set = DatasetFolder(data_folder+"food-11/training/labeled", loader=lambda x: Image.open(x), extensions="jpg", transform=train_tfm)
valid_set = DatasetFolder(data_folder+"food-11/validation", loader=lambda x: Image.open(x), extensions="jpg", transform=test_tfm)
# test_set = DatasetFolder(data_folder+"food-11/testing", loader=lambda x: Image.open(x), extensions="jpg", transform=test_tfm)

unlabeled_set = DatasetFolder(data_folder+"food-11/training/unlabeled", loader=lambda x: Image.open(x), extensions="jpg", transform=train_tfm)
unlabeled_set.classes = train_set.classes
unlabeled_set.class_to_idx = train_set.class_to_idx

# Construct data loaders.
train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True, num_workers=8, pin_memory=True, drop_last=True)
unlab_loader = DataLoader(unlabeled_set, batch_size=batch_size, shuffle=False, num_workers=8, pin_memory=True)
valid_loader = DataLoader(valid_set, batch_size=batch_size, shuffle=True, num_workers=2, pin_memory=True)
# test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=False)

In [7]:
RANDOM_NUM = 32
test_set = DatasetFolder(data_folder+"food-11/testing", loader=lambda x: Image.open(x), extensions="jpg", transform=None)
rd_test_set = RandomTransformDataset(test_set, train_tfm, random_time=RANDOM_NUM)
rd_test_loader = DataLoader(rd_test_set, batch_size=batch_size//RANDOM_NUM*2, shuffle=False, num_workers=16)
test_loader = DataLoader(DatasetFolder(data_folder+"food-11/testing", loader=lambda x: Image.open(x), extensions="jpg", transform=test_tfm), batch_size=batch_size, shuffle=False, num_workers=16)


## **Model**

The basic model here is simply a stack of convolutional layers followed by some fully-connected layers.

Since there are three channels for a color image (RGB), the input channels of the network must be three.
In each convolutional layer, the number of channels typically grows, while the height and width shrink (or stay unchanged, depending on hyperparameters such as stride and padding).

Before being fed into the fully-connected layers, the feature map must be flattened into a single one-dimensional vector (for each image).
These features are then transformed by the fully-connected layers, and finally, we obtain the "logits" for each class.
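
The effect of kernel size, stride, and padding on height and width follows a simple formula, sketched below in plain Python. The example traces one spatial dimension through five 3x3 convolutions (padding `k // 2`), each followed by max-pooling with window sizes 1, 2, 1, 2, 4, which are the settings of the simple encoder in the `Classifier` cell. The 224-pixel input is an assumption taken from the crop size in the transforms, and the helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_encoder(size, kernels, pools):
    """Follow one spatial dimension through conv + max-pool blocks."""
    sizes = [size]
    for k, p in zip(kernels, pools):
        size = conv_out(size, k, stride=1, padding=k // 2)  # size-preserving for odd k
        size = conv_out(size, p, stride=p)                  # pooling window p, stride p
        sizes.append(size)
    return sizes


sizes = trace_encoder(224, kernels=[3] * 5, pools=[1, 2, 1, 2, 4])
print(sizes)
print("flattened features:", 256 * sizes[-1] * sizes[-1])
```

With 256 output channels, flattening the final 14 x 14 map gives 50176 features per image, which is the input size the first fully-connected layer would need.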

### **WARNING -- You Must Know**
You are free to modify the model architecture here for further improvement.
However, if you want to use some well-known architectures such as ResNet50, please make sure **NOT** to load the pre-trained weights.
Using such pre-trained models is considered cheating, and you will be penalized for it.
Similarly, it is your responsibility to make sure no pre-trained weights are used if you use **torch.hub** to load any modules.

For example, if you use ResNet-18 as your model:

model = torchvision.models.resnet18(pretrained=**False**) → This is fine.

model = torchvision.models.resnet18(pretrained=**True**)  → This is **NOT** allowed.

In [8]:
class BlockConv2d(nn.Module):
    def __init__(self, ch_in, ch_out, k, stride=1,
            act=None, pooling=None, use_bn=True, drop=0, is_dconv=False, is_pad=True):

        super(BlockConv2d, self).__init__()
        pad = k//2 if is_pad else 0
        if is_dconv:
            conv = nn.ConvTranspose2d(ch_in, ch_out, k, padding=pad, stride=stride, output_padding=stride-1)
        else:
            conv = nn.Conv2d(ch_in, ch_out, k, padding=pad, stride=stride)
            
        # Avoid shadowing the built-in `list`.
        layers = [
            conv
        ]
        if use_bn: layers.append(nn.BatchNorm2d(ch_out))
        if act: layers.append(act)
        if pooling: layers.append(pooling)
        if drop > 0: layers.append(nn.Dropout2d(drop))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

class BlockLinear(nn.Module):
    def __init__(self, ch_in, ch_out, act=None, use_bn=True, drop=0):
        super(BlockLinear, self).__init__()
        # Avoid shadowing the built-in `list`.
        layers = [
            nn.Linear(ch_in, ch_out)
        ]
        if use_bn: layers.append(nn.BatchNorm1d(ch_out))
        if act: layers.append(act)
        if drop > 0: layers.append(nn.Dropout(drop))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

In [9]:
# m = torchvision.models.resnext50_32x4d(pretrained=False)
# for i, j in (m.state_dict().items()):
    # print(i, "\t", j.shape)
# m.fc

In [10]:
class Classifier(nn.Module):
    def __init__(self, encoder=None):
        super(Classifier, self).__init__()
        # The arguments for commonly used modules:
        # torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        # torch.nn.MaxPool2d(kernel_size, stride, padding)

        # input image size: [3, 224, 224] (after the crops in train_tfm / test_tfm)
        self.act = nn.ReLU()
        if encoder:
            self.encoder = torchvision.models.resnext50_32x4d(pretrained=False)
            # self.encoder.fc = BlockLinear(512, 128, act=nn.ReLU(), use_bn=True)
            # self.n = [512, 128]
            self.n = [2048, 512, 128]
            self.l = len(self.n)
            fc = [
                BlockLinear(self.n[i], self.n[i+1], act=nn.ReLU(), use_bn=True)
                for i in range(self.l-1)
            ]
            self.encoder.fc = nn.Sequential(*fc)
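        else:
            # Simple CNN: five conv blocks with 3x3 kernels and max-pooling.
            self.n_0 = [3] + [64]*2 + [128]*2 + [256]
            self.k_0 = [3]*5
            self.p_0 = [1, 2]*2+[4]
            self.l_0 = len(self.k_0)
            self.encoder = [
                BlockConv2d(
                    self.n_0[i], self.n_0[i+1], self.k_0[i],
                    act=self.act, pooling=nn.MaxPool2d(self.p_0[i], self.p_0[i]),
                    use_bn=True, drop=0.2)
                for i in range(self.l_0)]
            # Flatten inside the encoder so both branches output [batch_size, features].
            self.encoder = nn.Sequential(*self.encoder, nn.Flatten())
            # self.n was previously undefined in this branch.
            # Assumes 224x224 inputs: pooling shrinks each side by 16, giving 256 x 14 x 14.
            self.n = [256 * 14 * 14]

        self.out = nn.Linear(self.n[-1], 11)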

    def forward(self, x):
        # input (x): [batch_size, 3, 224, 224]
        # output: [batch_size, 11]
        x = self.encoder(x)
        # x = x.flatten(1)
        # print(x.shape)
        # x = self.fc(x)
        x = self.out(x)
        return x

## **Training**

You can finish supervised learning by simply running the provided code without any modification.

The function "get_pseudo_labels" is used for semi-supervised learning.
Using the unlabeled data for semi-supervised learning is expected to improve performance.
However, you have to implement the function on your own and adjust several hyperparameters manually.

For more details about semi-supervised learning, please refer to [Prof. Lee's slides](https://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Lecture/semi%20(v3).pdf).

Again, please notice that utilizing external data (or pre-trained model) for training is **prohibited**.
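
The core of confidence-based pseudo-labeling can be sketched in plain Python: turn each logit vector into probabilities with softmax, take the most likely class, and keep the sample only if that probability exceeds a threshold. The logits, the threshold value, and the helper names below are hypothetical; this is a minimal illustration, not the homework's reference solution.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(batch_logits, threshold):
    """Return (index, label) pairs whose top probability exceeds threshold."""
    selected = []
    for index, logits in enumerate(batch_logits):
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] > threshold:
            selected.append((index, label))
    return selected


# Hypothetical logits for three unlabeled images over three classes.
batch_logits = [
    [4.0, 0.5, 0.1],  # confident: class 0
    [1.0, 1.1, 0.9],  # ambiguous: dropped
    [0.2, 0.1, 3.5],  # confident: class 2
]
print(select_pseudo_labels(batch_logits, threshold=0.7))
```

A higher threshold keeps fewer but cleaner pseudo-labels; a lower one adds more data at the cost of more label noise.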

In [11]:
# Patch the Subset class so that it stores explicit pseudo-labels alongside the selected indices.
def get_pseudosample(self, idx):
    return self.dataset[self.indices[idx]][0], self.labels[idx]

def init_pseudolabel(self, dataset, labels, indices):
    self.dataset = dataset
    self.labels = labels
    self.indices = indices
Subset.__getitem__ = get_pseudosample
Subset.__init__ = init_pseudolabel
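
A less invasive alternative to patching `Subset` is a small wrapper class with the same `__len__` / `__getitem__` interface. The sketch below is an assumption-labeled illustration: `PseudoLabeledSubset` is a hypothetical name, and a plain list stands in for the unlabeled dataset so the example runs without PyTorch.

```python
class PseudoLabeledSubset:
    """Map-style view of `dataset` restricted to `indices`, with replaced labels."""

    def __init__(self, dataset, labels, indices):
        if len(labels) != len(indices):
            raise ValueError("labels and indices must have the same length")
        self.dataset = dataset
        self.labels = list(labels)
        self.indices = list(indices)

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, idx):
        sample, _ = self.dataset[self.indices[idx]]  # discard the placeholder label
        return sample, self.labels[idx]


# Stand-in for an unlabeled dataset: (sample, placeholder_label) pairs.
base = [("img_a", 0), ("img_b", 0), ("img_c", 0), ("img_d", 0)]
subset = PseudoLabeledSubset(base, labels=[7, 3], indices=[1, 3])
print(len(subset), subset[0], subset[1])
```

Patching `Subset.__init__` changes the constructor for every `Subset` in the process, so code that builds subsets with the usual `(dataset, indices)` arguments, such as `torch.utils.data.random_split`, would presumably stop working. A separate class keeps that behavior intact.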

In [12]:
def get_pseudo_labels(dataset, model, threshold=0.5):
    # This function generates pseudo-labels for a dataset using the given model.
    # It returns a (patched) Subset containing the images whose prediction confidence exceeds the given threshold.
    # You are NOT allowed to use any models trained on external data for pseudo-labeling.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Make sure the model is in eval mode.
    model.eval()
    # Define softmax function.
    softmax = nn.Softmax(dim=-1)
    idx = []
    targets = []
    # Iterate over the dataset by batches.
    # Note: this reads from the global unlab_loader, which must wrap `dataset` with shuffle=False.
    for i, batch in tqdm(enumerate(unlab_loader), leave=False, desc='PseudoLabels'):
        img, _ = batch
        with torch.no_grad():
            logits = model(img.to(device))

        # Obtain the probability distributions by applying softmax on logits.
        probs = softmax(logits)
        st = torch.topk(probs, 2, dim=1)
        
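        # ---------- TODO ----------
        # Filter the data and construct a new dataset.
        # Keep the samples whose top-1 probability exceeds the threshold.
        probs1 = st[0][:, 0]
        # probs2 = st[0][:, 1]
        # select = (probs1 > threshold) & ((probs1-probs2) > .25)
        select = (probs1 > threshold)
        targets += st[1][select, 0].tolist()
        # Convert within-batch positions to dataset indices (valid only without shuffling).
        idx += (torch.where(select)[0] + batch_size*i).tolist()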

    # Build the subset with the patched (dataset, labels, indices) constructor.
    new = Subset(dataset, targets, idx)
    model.train()
    return new
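
The index bookkeeping in `get_pseudo_labels` relies on batches being consecutive, full-size slices of the dataset, so that position `p` in batch `i` corresponds to dataset index `i * batch_size + p`. The plain-Python sketch below illustrates that mapping; `global_indices` and the example selections are hypothetical.

```python
def global_indices(selected_per_batch, batch_size):
    """Convert within-batch positions to dataset indices.

    Valid only when batches are consecutive slices of the dataset
    (no shuffling; only the final batch may be shorter).
    """
    indices = []
    for batch_number, positions in enumerate(selected_per_batch):
        indices.extend(batch_number * batch_size + p for p in positions)
    return indices


# Hypothetical selections: positions kept in each of three batches of size 4.
selected = [[0, 3], [], [1]]
print(global_indices(selected, batch_size=4))
```

If the loader were shuffled, or dropped or reordered samples in any way, these indices would point at the wrong images, so the pseudo-labels would be attached to unrelated samples.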

In [13]:
log_path = 'log'
if log_path in os.listdir():
    os.remove(log_path)
    print('remove log')
log = open(log_path, 'a')

remove log


In [20]:
device = "cuda" if torch.cuda.is_available() else "cpu"
model_path = "./model.ckpt"

try:
    model = Classifier(1).to(device)
except NameError:
    print("Use new encoder")
    model = Classifier().to(device)
# model.load_state_dict(torch.load(model_path))


criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
lr_scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='max', factor=0.8, patience=8, threshold=2e-4, min_lr=1e-5, verbose=True
)


n_epochs = 200
best_acc = 0
valid_acc = 0

best_semi = 0
semi_flg = False
semi_count = 0

loader = train_loader
for epoch in range(n_epochs):
    # ---------- TODO ----------
    # In each epoch, relabel the unlabeled dataset for semi-supervised learning.
    # Then you can combine the labeled dataset and pseudo-labeled dataset for the training.
    if valid_acc >= .65:
    # if False:
    # if True:
        # Obtain pseudo-labels for unlabeled data using trained model.
        pseudo_set = get_pseudo_labels(unlabeled_set, model, threshold=0.7)

        # Construct a new dataset and a data loader for training.
        # This is used in semi-supervised learning only.
        concat_dataset = ConcatDataset([train_set, pseudo_set])
        loader = DataLoader(concat_dataset, batch_size=batch_size, shuffle=True, num_workers=8, pin_memory=True, drop_last=True)
        semi_flg = True
        best_semi = 0
        log_str = f"Use pseudo label: {len(concat_dataset)}"
        print(log_str)
        log.write(log_str+"\n")
    else:
        loader = train_loader

    # if semi_count >= 5:
    #     loader = train_loader
    #     semi_flg = False
    #     semi_count = 0
    #     log_str = "Abort pseudo label"
    #     print(log_str)
    #     log.write(log_str+"\n")

    # ---------- Training ----------
    model.train()

    train_loss = []
    train_accs = []

    # for batch in train_loader:
    for batch in tqdm(loader, desc='Train', leave=False):

        imgs, labels = batch
        logits = model(imgs.to(device))
        loss = criterion(logits, labels.to(device))
        optimizer.zero_grad()
        loss.backward()
        grad_norm = nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)
        optimizer.step()

        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()
        train_loss.append(loss.item())
        train_accs.append(acc)

    train_loss = sum(train_loss) / len(train_loss)
    train_acc = sum(train_accs) / len(train_accs)

    log_str1 = (f"[ Train | {epoch + 1:03d}/{n_epochs:03d} ] loss = {train_loss:.5f}, acc = {train_acc:.5f}")
    # print(log)
    # ---------- Validation ----------
    model.eval()
    
    valid_loss = []
    valid_accs = []

    # for batch in valid_loader:
    for batch in tqdm(valid_loader, desc='Valid', leave=False):
        imgs, labels = batch
        imgs = imgs.to(device)
        with torch.no_grad():
            logits = model(imgs)

        loss = criterion(logits, labels.to(device))
        acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()
        valid_loss.append(loss.item())
        valid_accs.append(acc)

    valid_loss = sum(valid_loss) / len(valid_loss)
    valid_acc = sum(valid_accs) / len(valid_accs)
    if valid_acc >= best_acc:
        best_acc = valid_acc
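        # state_dict() returns references to the live tensors, so clone them for a real snapshot.
        best_model = {k: v.detach().clone() for k, v in model.state_dict().items()}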
        torch.save(model.state_dict(), model_path)
        log_str = ('saving model with acc {:.3f}'.format(best_acc))
        print(log_str)
        log.write(log_str+"\n")
        semi_count = 0
    
    # semi early stop
    # if semi_flg:
    #     if valid_acc > best_semi:
    #         best_semi = valid_acc
    #         semi_count = 0
    #     else:
    #         semi_count += 1

    log_str1 += f"\t[ Valid\t| {epoch + 1:03d}/{n_epochs:03d} ]  loss = {valid_loss:.5f}, acc = {valid_acc:.5f}"
    print(log_str1)
    log.write(log_str1+"\n")
    log.flush()
    lr_scheduler.step(valid_acc)
    # print(f"[ Valid\t| {epoch + 1:03d}/{n_epochs:03d} ] ae_loss = {valid_ae:.5f}, loss = {valid_loss:.5f}, acc = {valid_acc:.5f}")

saving model with acc 0.730
[ Train | 001/200 ] loss = 0.23250, acc = 0.92253	[ Valid	| 001/200 ]  loss = 1.22877, acc = 0.73047


Use pseudo label: 8344


[ Train | 002/200 ] loss = 0.67673, acc = 0.80072	[ Valid	| 002/200 ]  loss = 1.54019, acc = 0.69766


Use pseudo label: 8036


[ Train | 003/200 ] loss = 0.54754, acc = 0.82271	[ Valid	| 003/200 ]  loss = 1.29909, acc = 0.70755


Use pseudo label: 7948


saving model with acc 0.735
[ Train | 004/200 ] loss = 0.54586, acc = 0.81842	[ Valid	| 004/200 ]  loss = 0.98844, acc = 0.73490


Use pseudo label: 7772


[ Train | 005/200 ] loss = 0.40190, acc = 0.86667	[ Valid	| 005/200 ]  loss = 1.10860, acc = 0.71510


Use pseudo label: 7760


[ Train | 006/200 ] loss = 0.41435, acc = 0.86563	[ Valid	| 006/200 ]  loss = 1.07993, acc = 0.73359


Use pseudo label: 7880


[ Train | 007/200 ] loss = 0.45888, acc = 0.85336	[ Valid	| 007/200 ]  loss = 1.21655, acc = 0.71641


Use pseudo label: 7885


[ Train | 008/200 ] loss = 0.42675, acc = 0.85873	[ Valid	| 008/200 ]  loss = 0.85568, acc = 0.73047


Use pseudo label: 7879


[ Train | 009/200 ] loss = 0.40306, acc = 0.86860	[ Valid	| 009/200 ]  loss = 1.11800, acc = 0.68828


Use pseudo label: 7646


[ Train | 010/200 ] loss = 0.56057, acc = 0.81727	[ Valid	| 010/200 ]  loss = 1.03208, acc = 0.72604


Use pseudo label: 7725


[ Train | 011/200 ] loss = 0.42721, acc = 0.85833	[ Valid	| 011/200 ]  loss = 1.08258, acc = 0.66120


Use pseudo label: 7706


[ Train | 012/200 ] loss = 0.35712, acc = 0.88151	[ Valid	| 012/200 ]  loss = 1.19256, acc = 0.67292


Use pseudo label: 8005


[ Train | 013/200 ] loss = 0.42321, acc = 0.85736	[ Valid	| 013/200 ]  loss = 1.17540, acc = 0.70677
Epoch    13: reducing learning rate of group 0 to 8.0000e-04.


Use pseudo label: 7799


[ Train | 014/200 ] loss = 0.40974, acc = 0.86289	[ Valid	| 014/200 ]  loss = 1.03185, acc = 0.72214


Use pseudo label: 7940


saving model with acc 0.740
[ Train | 015/200 ] loss = 0.36900, acc = 0.87878	[ Valid	| 015/200 ]  loss = 0.90610, acc = 0.73984


Use pseudo label: 8145


saving model with acc 0.759
[ Train | 016/200 ] loss = 0.35100, acc = 0.88430	[ Valid	| 016/200 ]  loss = 0.89852, acc = 0.75911


Use pseudo label: 8163


saving model with acc 0.765
[ Train | 017/200 ] loss = 0.37120, acc = 0.87773	[ Valid	| 017/200 ]  loss = 0.89936, acc = 0.76484


Use pseudo label: 8103


[ Train | 018/200 ] loss = 0.35830, acc = 0.87748	[ Valid	| 018/200 ]  loss = 0.87481, acc = 0.74740


Use pseudo label: 8187


[ Train | 019/200 ] loss = 0.35770, acc = 0.88256	[ Valid	| 019/200 ]  loss = 0.93618, acc = 0.73438


Use pseudo label: 8266


saving model with acc 0.783
[ Train | 020/200 ] loss = 0.40703, acc = 0.86548	[ Valid	| 020/200 ]  loss = 0.73980, acc = 0.78333


Use pseudo label: 8221


[ Train | 021/200 ] loss = 0.35722, acc = 0.88184	[ Valid	| 021/200 ]  loss = 0.97801, acc = 0.74349


Use pseudo label: 8115


[ Train | 022/200 ] loss = 0.35816, acc = 0.87661	[ Valid	| 022/200 ]  loss = 0.96044, acc = 0.72708


Use pseudo label: 8178


[ Train | 023/200 ] loss = 0.37325, acc = 0.87674	[ Valid	| 023/200 ]  loss = 0.84732, acc = 0.75417


Use pseudo label: 8307


[ Train | 024/200 ] loss = 0.35692, acc = 0.88086	[ Valid	| 024/200 ]  loss = 0.99859, acc = 0.73750


Use pseudo label: 8046


[ Train | 025/200 ] loss = 0.37009, acc = 0.87500	[ Valid	| 025/200 ]  loss = 1.02018, acc = 0.73542


Use pseudo label: 7917


[ Train | 026/200 ] loss = 0.40243, acc = 0.86616	[ Valid	| 026/200 ]  loss = 0.89265, acc = 0.72396


Use pseudo label: 8051


[ Train | 027/200 ] loss = 0.33182, acc = 0.88924	[ Valid	| 027/200 ]  loss = 0.89245, acc = 0.76276


Use pseudo label: 8219


[ Train | 028/200 ] loss = 0.33860, acc = 0.88855	[ Valid	| 028/200 ]  loss = 0.86057, acc = 0.77083


Use pseudo label: 8187


[ Train | 029/200 ] loss = 0.33126, acc = 0.88938	[ Valid	| 029/200 ]  loss = 0.87497, acc = 0.74609
Epoch    29: reducing learning rate of group 0 to 6.4000e-04.


Use pseudo label: 8045


[ Train | 030/200 ] loss = 0.35732, acc = 0.88558	[ Valid	| 030/200 ]  loss = 0.96004, acc = 0.73464


Use pseudo label: 8098


[ Train | 031/200 ] loss = 0.34060, acc = 0.88765	[ Valid	| 031/200 ]  loss = 0.92982, acc = 0.73464


Use pseudo label: 8265


[ Train | 032/200 ] loss = 0.34475, acc = 0.88831	[ Valid	| 032/200 ]  loss = 0.85051, acc = 0.76042


Use pseudo label: 8329


[ Train | 033/200 ] loss = 0.33575, acc = 0.88798	[ Valid	| 033/200 ]  loss = 0.91303, acc = 0.75573


Use pseudo label: 8340


saving model with acc 0.794
[ Train | 034/200 ] loss = 0.34590, acc = 0.88822	[ Valid	| 034/200 ]  loss = 0.80262, acc = 0.79375


Use pseudo label: 8255


[ Train | 035/200 ] loss = 0.32086, acc = 0.89551	[ Valid	| 035/200 ]  loss = 0.87315, acc = 0.75677


Use pseudo label: 8341


[ Train | 036/200 ] loss = 0.33821, acc = 0.88894	[ Valid	| 036/200 ]  loss = 0.88972, acc = 0.76432


Use pseudo label: 8365


[ Train | 037/200 ] loss = 0.33322, acc = 0.88666	[ Valid	| 037/200 ]  loss = 0.82438, acc = 0.75703


Use pseudo label: 8272


[ Train | 038/200 ] loss = 0.35501, acc = 0.88257	[ Valid	| 038/200 ]  loss = 0.81159, acc = 0.77839


Use pseudo label: 8348


[ Train | 039/200 ] loss = 0.32545, acc = 0.89062	[ Valid	| 039/200 ]  loss = 0.84444, acc = 0.76172


Use pseudo label: 8248


[ Train | 040/200 ] loss = 0.31332, acc = 0.89575	[ Valid	| 040/200 ]  loss = 0.91113, acc = 0.76615


Use pseudo label: 8296


[ Train | 041/200 ] loss = 0.30691, acc = 0.90283	[ Valid	| 041/200 ]  loss = 0.81060, acc = 0.77240


Use pseudo label: 8494


[ Train | 042/200 ] loss = 0.31205, acc = 0.89181	[ Valid	| 042/200 ]  loss = 1.09145, acc = 0.72422


Use pseudo label: 8327


[ Train | 043/200 ] loss = 0.30244, acc = 0.89964	[ Valid	| 043/200 ]  loss = 0.89449, acc = 0.78281
Epoch    43: reducing learning rate of group 0 to 5.1200e-04.


Use pseudo label: 8404


[ Train | 044/200 ] loss = 0.27587, acc = 0.90950	[ Valid	| 044/200 ]  loss = 0.88680, acc = 0.77057


Use pseudo label: 8501


[ Train | 045/200 ] loss = 0.27304, acc = 0.90554	[ Valid	| 045/200 ]  loss = 0.92582, acc = 0.78307


Use pseudo label: 8535


[ Train | 046/200 ] loss = 0.29287, acc = 0.90530	[ Valid	| 046/200 ]  loss = 0.99470, acc = 0.75807


Use pseudo label: 8508


[ Train | 047/200 ] loss = 0.30083, acc = 0.90388	[ Valid	| 047/200 ]  loss = 0.82035, acc = 0.78047


Use pseudo label: 8460


saving model with acc 0.796
[ Train | 048/200 ] loss = 0.27626, acc = 0.91087	[ Valid	| 048/200 ]  loss = 0.75060, acc = 0.79609


Use pseudo label: 8517


[ Train | 049/200 ] loss = 0.25368, acc = 0.91359	[ Valid	| 049/200 ]  loss = 0.74943, acc = 0.79245


Use pseudo label: 8626


[ Train | 050/200 ] loss = 0.26835, acc = 0.91290	[ Valid	| 050/200 ]  loss = 0.82882, acc = 0.79036


Use pseudo label: 8717


[ Train | 051/200 ] loss = 0.28440, acc = 0.90165	[ Valid	| 051/200 ]  loss = 0.98605, acc = 0.74453


Use pseudo label: 8607


[ Train | 052/200 ] loss = 0.28174, acc = 0.91278	[ Valid	| 052/200 ]  loss = 1.03930, acc = 0.74063


Use pseudo label: 8617


[ Train | 053/200 ] loss = 0.27690, acc = 0.90742	[ Valid	| 053/200 ]  loss = 0.98684, acc = 0.74687


Use pseudo label: 8598


[ Train | 054/200 ] loss = 0.28669, acc = 0.90788	[ Valid	| 054/200 ]  loss = 0.98443, acc = 0.73672


Use pseudo label: 8641


[ Train | 055/200 ] loss = 0.28137, acc = 0.90800	[ Valid	| 055/200 ]  loss = 0.98740, acc = 0.70755


Use pseudo label: 8581


Train:   0%|          | 0/67 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 056/200 ] loss = 0.26059, acc = 0.91453	[ Valid	| 056/200 ]  loss = 0.99114, acc = 0.73385


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8598


Train:   0%|          | 0/67 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 057/200 ] loss = 0.29305, acc = 0.90054	[ Valid	| 057/200 ]  loss = 0.81185, acc = 0.77344
Epoch    57: reducing learning rate of group 0 to 4.0960e-04.


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8553


Train:   0%|          | 0/66 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 058/200 ] loss = 0.26613, acc = 0.91312	[ Valid	| 058/200 ]  loss = 0.75985, acc = 0.77969


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8682


Train:   0%|          | 0/67 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 059/200 ] loss = 0.24053, acc = 0.92292	[ Valid	| 059/200 ]  loss = 0.87725, acc = 0.78490


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8729


Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 060/200 ] loss = 0.25102, acc = 0.92061	[ Valid	| 060/200 ]  loss = 0.92967, acc = 0.75885


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8701


Train:   0%|          | 0/67 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 061/200 ] loss = 0.24434, acc = 0.91733	[ Valid	| 061/200 ]  loss = 0.95292, acc = 0.74375


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8641


Train:   0%|          | 0/67 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 062/200 ] loss = 0.24687, acc = 0.91943	[ Valid	| 062/200 ]  loss = 1.08406, acc = 0.74193


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8761


Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 063/200 ] loss = 0.24476, acc = 0.91820	[ Valid	| 063/200 ]  loss = 0.81281, acc = 0.79063


PseudoLabels: 0it [00:00, ?it/s]

Exception ignored in: <function _MultiProcessingDataLoaderIter.__del__ at 0x7fdfe4861ca0>
Traceback (most recent call last):
  File "/home/csvt32745/anaconda3/envs/torch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1203, in __del__
    self._shutdown_workers()
  File "/home/csvt32745/anaconda3/envs/torch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1177, in _shutdown_workers
    w.join(timeout=_utils.MP_STATUS_CHECK_INTERVAL)
  File "/home/csvt32745/anaconda3/envs/torch/lib/python3.8/multiprocessing/process.py", line 147, in join
    assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process

Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

PseudoLabels: 0it [00:00, ?it/s]

Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 065/200 ] loss = 0.23618, acc = 0.92406	[ Valid	| 065/200 ]  loss = 0.91705, acc = 0.76641


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8764


Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 066/200 ] loss = 0.23007, acc = 0.92268	[ Valid	| 066/200 ]  loss = 0.82765, acc = 0.78438


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8763


Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

saving model with acc 0.818
[ Train | 067/200 ] loss = 0.25394, acc = 0.91808	[ Valid	| 067/200 ]  loss = 0.81827, acc = 0.81797


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8883


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 068/200 ] loss = 0.23711, acc = 0.92131	[ Valid	| 068/200 ]  loss = 0.95742, acc = 0.76823


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8758


Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 069/200 ] loss = 0.22470, acc = 0.92440	[ Valid	| 069/200 ]  loss = 0.90727, acc = 0.77995


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8865


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 070/200 ] loss = 0.23705, acc = 0.92267	[ Valid	| 070/200 ]  loss = 0.82146, acc = 0.81406


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8804


Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

saving model with acc 0.818
[ Train | 071/200 ] loss = 0.25390, acc = 0.91705	[ Valid	| 071/200 ]  loss = 0.79391, acc = 0.81849


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8776


Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 072/200 ] loss = 0.21679, acc = 0.93348	[ Valid	| 072/200 ]  loss = 0.80680, acc = 0.79271


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8810


Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 073/200 ] loss = 0.22647, acc = 0.92785	[ Valid	| 073/200 ]  loss = 0.82580, acc = 0.78177


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8794


Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 074/200 ] loss = 0.23017, acc = 0.92417	[ Valid	| 074/200 ]  loss = 1.08865, acc = 0.74010


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8835


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 075/200 ] loss = 0.23365, acc = 0.92663	[ Valid	| 075/200 ]  loss = 1.13558, acc = 0.75078


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8890


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 076/200 ] loss = 0.22885, acc = 0.92946	[ Valid	| 076/200 ]  loss = 1.02657, acc = 0.75990


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8842


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 077/200 ] loss = 0.24587, acc = 0.92403	[ Valid	| 077/200 ]  loss = 0.83223, acc = 0.78880


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8840


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 078/200 ] loss = 0.22271, acc = 0.92788	[ Valid	| 078/200 ]  loss = 0.78592, acc = 0.79323


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8841


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 079/200 ] loss = 0.21939, acc = 0.92991	[ Valid	| 079/200 ]  loss = 0.81705, acc = 0.78698


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8875


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 080/200 ] loss = 0.22812, acc = 0.92606	[ Valid	| 080/200 ]  loss = 0.79315, acc = 0.79453
Epoch    80: reducing learning rate of group 0 to 3.2768e-04.


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8783


Train:   0%|          | 0/68 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

saving model with acc 0.822
[ Train | 081/200 ] loss = 0.22420, acc = 0.92750	[ Valid	| 081/200 ]  loss = 0.82950, acc = 0.82188


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8930


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 082/200 ] loss = 0.20303, acc = 0.93467	[ Valid	| 082/200 ]  loss = 0.79253, acc = 0.80417


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8865


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 083/200 ] loss = 0.21677, acc = 0.93388	[ Valid	| 083/200 ]  loss = 0.89935, acc = 0.77240


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8934


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 084/200 ] loss = 0.21319, acc = 0.92867	[ Valid	| 084/200 ]  loss = 0.87360, acc = 0.78958


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9017


Train:   0%|          | 0/70 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 085/200 ] loss = 0.21089, acc = 0.93092	[ Valid	| 085/200 ]  loss = 0.78668, acc = 0.81719


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8962


Train:   0%|          | 0/70 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 086/200 ] loss = 0.21076, acc = 0.93337	[ Valid	| 086/200 ]  loss = 0.80346, acc = 0.79167


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9005


Train:   0%|          | 0/70 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

saving model with acc 0.829
[ Train | 087/200 ] loss = 0.18241, acc = 0.94196	[ Valid	| 087/200 ]  loss = 0.72782, acc = 0.82891


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8989


Train:   0%|          | 0/70 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 088/200 ] loss = 0.20370, acc = 0.93348	[ Valid	| 088/200 ]  loss = 0.84028, acc = 0.79688


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9054


Train:   0%|          | 0/70 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 089/200 ] loss = 0.21391, acc = 0.93471	[ Valid	| 089/200 ]  loss = 0.87738, acc = 0.80443


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8959


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 090/200 ] loss = 0.21499, acc = 0.93365	[ Valid	| 090/200 ]  loss = 1.04243, acc = 0.74740


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8907


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 091/200 ] loss = 0.22149, acc = 0.92742	[ Valid	| 091/200 ]  loss = 0.85669, acc = 0.78958


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8916


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 092/200 ] loss = 0.19802, acc = 0.93761	[ Valid	| 092/200 ]  loss = 0.90763, acc = 0.79036


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8994


Train:   0%|          | 0/70 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 093/200 ] loss = 0.21776, acc = 0.92969	[ Valid	| 093/200 ]  loss = 0.87243, acc = 0.76953


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8950


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 094/200 ] loss = 0.21227, acc = 0.93467	[ Valid	| 094/200 ]  loss = 0.77128, acc = 0.78177


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8947


Train:   0%|          | 0/69 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 095/200 ] loss = 0.21184, acc = 0.93308	[ Valid	| 095/200 ]  loss = 0.80526, acc = 0.80234


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 8969


Train:   0%|          | 0/70 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 096/200 ] loss = 0.20068, acc = 0.93449	[ Valid	| 096/200 ]  loss = 0.70568, acc = 0.82760
Epoch    96: reducing learning rate of group 0 to 2.6214e-04.


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9010


Train:   0%|          | 0/70 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 097/200 ] loss = 0.19175, acc = 0.93951	[ Valid	| 097/200 ]  loss = 0.84538, acc = 0.79297


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9037


Train:   0%|          | 0/70 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 098/200 ] loss = 0.17245, acc = 0.94576	[ Valid	| 098/200 ]  loss = 0.69974, acc = 0.82448


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9124


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 099/200 ] loss = 0.18631, acc = 0.94245	[ Valid	| 099/200 ]  loss = 0.84667, acc = 0.81094


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9099


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 100/200 ] loss = 0.17241, acc = 0.94663	[ Valid	| 100/200 ]  loss = 0.74770, acc = 0.80573


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9056


Train:   0%|          | 0/70 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 101/200 ] loss = 0.17809, acc = 0.94665	[ Valid	| 101/200 ]  loss = 0.75070, acc = 0.82240


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9154


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 102/200 ] loss = 0.17961, acc = 0.94201	[ Valid	| 102/200 ]  loss = 1.02374, acc = 0.78021


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9144


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 103/200 ] loss = 0.18859, acc = 0.94080	[ Valid	| 103/200 ]  loss = 0.85426, acc = 0.79479


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9090


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 104/200 ] loss = 0.18694, acc = 0.94267	[ Valid	| 104/200 ]  loss = 0.93137, acc = 0.80964


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9113


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 105/200 ] loss = 0.18358, acc = 0.94190	[ Valid	| 105/200 ]  loss = 0.88123, acc = 0.76719
Epoch   105: reducing learning rate of group 0 to 2.0972e-04.


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9163


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 106/200 ] loss = 0.16652, acc = 0.94806	[ Valid	| 106/200 ]  loss = 0.78498, acc = 0.80234


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9158


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 107/200 ] loss = 0.16604, acc = 0.94894	[ Valid	| 107/200 ]  loss = 0.82971, acc = 0.79297


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9213


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 108/200 ] loss = 0.17165, acc = 0.94388	[ Valid	| 108/200 ]  loss = 0.84691, acc = 0.80339


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9131


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 109/200 ] loss = 0.16282, acc = 0.94872	[ Valid	| 109/200 ]  loss = 0.82168, acc = 0.80833


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9224


Train:   0%|          | 0/72 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 110/200 ] loss = 0.15965, acc = 0.94944	[ Valid	| 110/200 ]  loss = 0.80252, acc = 0.80208


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9199


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 111/200 ] loss = 0.15243, acc = 0.95092	[ Valid	| 111/200 ]  loss = 0.82346, acc = 0.80391


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9205


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 112/200 ] loss = 0.15244, acc = 0.95169	[ Valid	| 112/200 ]  loss = 0.71455, acc = 0.82708


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9220


Train:   0%|          | 0/72 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 113/200 ] loss = 0.15789, acc = 0.94911	[ Valid	| 113/200 ]  loss = 0.75393, acc = 0.81406


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9204


Train:   0%|          | 0/71 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

[ Train | 114/200 ] loss = 0.16384, acc = 0.94971	[ Valid	| 114/200 ]  loss = 0.90164, acc = 0.78151
Epoch   114: reducing learning rate of group 0 to 1.6777e-04.


PseudoLabels: 0it [00:00, ?it/s]

Use pseudo label: 9248


Train:   0%|          | 0/72 [00:00<?, ?it/s]

Valid:   0%|          | 0/6 [00:00<?, ?it/s]

saving model with acc 0.835
[ Train | 115/200 ] loss = 0.14697, acc = 0.95605	[ Valid	| 115/200 ]  loss = 0.70236, acc = 0.83490


PseudoLabels: 0it [00:00, ?it/s]

KeyboardInterrupt: 
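
The `reducing learning rate of group 0` messages in the log above match the format that PyTorch's `ReduceLROnPlateau` scheduler prints in verbose mode. The logged values are consistent with a starting learning rate of 1e-3 that is multiplied by 0.8 at each reduction. Both numbers are inferred from the log rather than read from the training cell, so treat them as assumptions. A quick check of that arithmetic:

```python
# Assumed values, inferred from the log (not taken from the training cell).
initial_lr = 1e-3
factor = 0.8

# Learning rates printed in the log, keyed by the assumed number of reductions so far.
logged = {4: 4.0960e-04, 5: 3.2768e-04, 6: 2.6214e-04, 7: 2.0972e-04, 8: 1.6777e-04}

# Relative gap between the logged value and initial_lr * factor ** n.
relative_errors = {}
for n, lr in logged.items():
    expected = initial_lr * factor ** n
    relative_errors[n] = abs(expected - lr) / lr
    print(f"{n} reductions: expected {expected:.4e}, logged {lr:.4e}")
```

If the assumption holds, each expected value agrees with the logged one up to the rounding of the printed digits.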

## **Testing**

For inference, we need to make sure the model is in eval mode, and the order of the dataset should not be shuffled ("shuffle=False" in test_loader).
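
As a minimal sketch of why both settings matter, the toy example below uses a hypothetical `toy_model` and `toy_loader` (stand-ins, not the homework's `Classifier` or test loader). It checks that Dropout is inactive in eval mode and that an unshuffled loader returns samples in dataset order.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the real classifier: a linear layer followed by Dropout.
toy_model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
x = torch.ones(8, 4)

# In eval mode Dropout is disabled, so repeated forward passes give identical outputs.
toy_model.eval()
with torch.no_grad():
    out_a = toy_model(x)
    out_b = toy_model(x)
eval_is_deterministic = torch.equal(out_a, out_b)

# With shuffle=False the loader yields samples in dataset order,
# so the i-th prediction corresponds to the i-th test image.
toy_set = TensorDataset(torch.arange(10))
toy_loader = DataLoader(toy_set, batch_size=4, shuffle=False)
order = [int(v) for (batch,) in toy_loader for v in batch]
```

If the loader were shuffled, the row index written to the CSV file would no longer identify the image that produced the prediction.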

Last but not least, don't forget to save the predictions into a single CSV file.
The format of the CSV file should follow the rules mentioned in the slides.
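
A rough sketch of the expected layout, assuming the `Id,Category` header that this notebook's own saving cell writes (the slides remain the authoritative reference). The helper `predictions_to_csv` is a hypothetical name used only for illustration.

```python
import csv
import io

def predictions_to_csv(predictions):
    """Return CSV text with an "Id,Category" header and one row per prediction."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Id", "Category"])
    # The row index doubles as the image id, which is why test order must be preserved.
    for i, pred in enumerate(predictions):
        writer.writerow([i, pred])
    return buffer.getvalue()

# Three made-up predictions, only to show the resulting layout.
csv_text = predictions_to_csv([3, 0, 10])
print(csv_text)
```

Building the text in memory makes the format easy to inspect before writing it to disk.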

### **WARNING -- Keep in Mind**

Cheating includes, but is not limited to:
1.   using testing labels,
2.   submitting results to previous Kaggle competitions,
3.   sharing predictions with others,
4.   copying code from any creature on Earth,
5.   asking other people to do it for you.

Any violation will be punished, with penalties ranging from a deduction on the final grade to failing the course.

It is your responsibility to check whether your code violates the rules.
When citing code from the Internet, you should know exactly what that code does.
Claiming that you did not know what the code does will **NOT** excuse breaking the rules.


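import cv2

# Visual sanity check: reconstruct one test batch with the autoencoder
# and save the first input/output pair as images.
ae_model.eval()
inputs = next(iter(test_loader))[0].to(device)
with torch.no_grad():
    outputs = ae_model(inputs)

# Convert (N, C, H, W) tensors to (N, H, W, C) NumPy arrays for OpenCV.
outputs = outputs.cpu().numpy().transpose([0, 2, 3, 1])
inputs = inputs.cpu().numpy().transpose([0, 2, 3, 1])

def norm(img):
    # Min-max scale each channel to [0, 1] in place, then convert to uint8.
    for c in range(3):
        ch = img[:, :, c]
        lo, hi = ch.min(), ch.max()
        # Guard against division by zero when a channel is constant.
        img[:, :, c] = (ch - lo) / max(hi - lo, 1e-8)
    return (img * 255).astype('uint8')

# Save only the first reconstruction ("img/0.png") and its input ("img/0_.png").
# cv2.imwrite fails silently if the directory is missing, and it expects BGR channel order.
os.makedirs("img", exist_ok=True)
cv2.imwrite("img/0.png", norm(outputs[0]))
cv2.imwrite("img/0_.png", norm(inputs[0]))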

In [21]:
# Make sure the model is in eval mode.
# Some modules, such as Dropout and BatchNorm, behave differently in training mode.
model_path = './model.ckpt'
device = "cuda" if torch.cuda.is_available() else "cpu"
model = Classifier(1).to(device)
# map_location lets a checkpoint saved on a GPU load on a CPU-only machine as well.
model.load_state_dict(torch.load(model_path, map_location=device))
model.eval()

# Initialize a list to store the predictions.
predictions = []

# Iterate the testing set by batches.
for batch in tqdm(rd_test_loader):
    # imgs, _ = batch
    imgs = rd_test_set.merge_batch(batch)

    # We don't need gradients during testing, and we don't even have labels to compute a loss.
    # torch.no_grad() skips building the autograd graph, which saves memory and speeds up the forward pass.
    with torch.no_grad():
        logits = model(imgs.to(device))
    
    # Take the class with greatest logit as prediction and record it.
    # predictions.extend(logits.argmax(dim=-1).cpu().numpy().tolist())
    predictions.extend(rd_test_set.merge_predict(logits))

  0%|          | 0/419 [00:00<?, ?it/s]

In [22]:
# Save predictions into the file.
with open("predict.csv", "w") as f:

    # The first row must be "Id,Category".
    f.write("Id,Category\n")

    # For the rest of the rows, each image id corresponds to a predicted class.
    for i, pred in enumerate(predictions):
        f.write(f"{i},{pred}\n")