# **Homework 3 - Convolutional Neural Network**

This is the example code for Homework 3 of the machine learning course taught by Prof. Hung-yi Lee.

In this homework, you are required to build a convolutional neural network for image classification, optionally using some more advanced training techniques.


There are three levels here:

**Easy**: Build a simple convolutional neural network as the baseline. (2 pts)

**Medium**: Design a better architecture or adopt different data augmentations to improve the performance. (2 pts)

**Hard**: Utilize provided unlabeled data to obtain better results. (2 pts)

## **About the Dataset**

The dataset used here is food-11, a collection of food images in 11 classes.

To meet the requirements of this homework, the TAs slightly modified the data.
Please DO NOT access the original fully-labeled training data or testing labels.

Also, the modified dataset is for this course only, and any further distribution or commercial use is forbidden.

In [27]:
%reset -f

In [28]:
from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


## **Import Packages**

First, we need to import packages that will be used later.

In this homework, we rely heavily on **torchvision**, a library of PyTorch.

In [29]:
# import necessary packages.
import numpy as np
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import os

from torch.utils.data import Dataset, ConcatDataset, DataLoader, Subset
from torchvision.datasets import DatasetFolder
from torchsummary import summary
from PIL import Image
from sklearn.metrics import confusion_matrix
from tqdm.auto import tqdm

CONFIG = {
    'GD_PRE_MODEL_PATH': 'gdrive/MyDrive/Colab Notebooks/HW3/models/model.ckpt',
    'GD_MODEL_PATH': 'gdrive/MyDrive/Colab Notebooks/HW3/models/transfer_model_',
    'GD_VAL_BEST_PATH': 'gdrive/MyDrive/Colab Notebooks/HW3/models/transfer_val_best_',

    'ABS_PATH': 'gdrive/MyDrive/Colab Notebooks/HW3/food-11/',
    'TRAIN_PATH': 'training/labeled/',
    'UNLABELED_PATH': 'training/unlabeled/',
    'VAL_PATH': 'validation/',
    'TEST_PATH': 'testing/',

    'EPOCH_NUM': 30,
    'BATCH_SIZE': 32,
    'OPTIMIZER': 'Adam',
    'OPTIM_PARAMS': {
        'lr': 7e-5,
        'weight_decay': 1e-5,
    },
    'DECAY_RATE': 1,
    'MIN_LR': 7e-5,
    'MODEL_NUM': 5,
    'SEED': 0,
    'THRESHOLD': 0.99,
    'NEGATIVE_SLOPE': 0,
    'MOMENTUM': 0.1,
}

# set random seed for reproducibility
SEED = CONFIG['SEED']
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
np.random.seed(SEED)
torch.manual_seed(SEED)
if torch.cuda.is_available():
    torch.cuda.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)

# create directory for saving model
os.makedirs('gdrive/MyDrive/Colab Notebooks/HW3/models', exist_ok=True)


## **Dataset, Data Loader, and Transforms**

Torchvision provides many useful utilities for image preprocessing, data wrapping, and data augmentation.

Here, since our data are stored in folders by class labels, we can directly apply **torchvision.datasets.DatasetFolder** for wrapping data without much effort.

Please refer to [PyTorch official website](https://pytorch.org/vision/stable/transforms.html) for details about different transforms.
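
One detail worth keeping in mind when composing augmentations: an argument computed while the pipeline is being built stays fixed from then on, whereas randomness drawn inside a transform's call is redrawn for every image. The sketch below illustrates the difference in plain Python. `FixedPad` and `RandomPad` are hypothetical stand-ins written for this illustration (they are not torchvision classes), and they only track the image size rather than touching pixels.

```python
import random


class FixedPad:
    """Hypothetical stand-in: the padding is decided once, at construction."""

    def __init__(self, padding):
        self.padding = padding  # (left, top, right, bottom)

    def __call__(self, size):
        width, height = size
        left, top, right, bottom = self.padding
        return (width + left + right, height + top + bottom)


class RandomPad:
    """Hypothetical stand-in: a fresh padding is drawn on every call."""

    def __init__(self, max_pad, rng):
        self.max_pad = max_pad
        self.rng = rng

    def __call__(self, size):
        padding = tuple(self.rng.randrange(self.max_pad) for _ in range(4))
        return FixedPad(padding)(size)


rng = random.Random(0)
# The random numbers are drawn here, once, and then reused for every image.
fixed = FixedPad(tuple(rng.randrange(32) for _ in range(4)))
# Here the random numbers are drawn inside __call__, once per image.
per_image = RandomPad(32, rng)

fixed_sizes = {fixed((256, 256)) for _ in range(20)}
random_sizes = {per_image((256, 256)) for _ in range(20)}
print(len(fixed_sizes), len(random_sizes))
```

If per-image randomness is what you want, the random draw has to happen inside the transform's call rather than in the arguments passed when the pipeline is built.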

In [30]:
from sklearn.model_selection import train_test_split

class DataManager():
    def __init__(self, state):
        print('init data manager...')
        ABS_PATH = CONFIG['ABS_PATH']
        TRAIN_PATH = CONFIG['TRAIN_PATH']
        VAL_PATH = CONFIG['VAL_PATH']
        UNLABELED_PATH = CONFIG['UNLABELED_PATH']
        TEST_PATH = CONFIG['TEST_PATH']
        BATCH_SIZE = CONFIG['BATCH_SIZE']
        self.set_tfm()

        self.train_set = DatasetFolder(ABS_PATH + TRAIN_PATH, loader=lambda x: Image.open(x), extensions="jpg", transform=self.train_tfm)
        self.val_set = DatasetFolder(ABS_PATH + VAL_PATH, loader=lambda x: Image.open(x), extensions="jpg", transform=self.test_tfm)
        
        self.train_set = ConcatDataset([self.train_set, self.val_set])
        train_idxs, val_idxs = train_test_split(list(range(len(self.train_set))), test_size=0.1, random_state=state)

        self.val_set = Subset(self.train_set, val_idxs)
        self.train_set = Subset(self.train_set, train_idxs)
        self.unlabeled_set = DatasetFolder(ABS_PATH + UNLABELED_PATH, loader=lambda x: Image.open(x), extensions="jpg", transform=self.train_tfm)
        self.test_set_for_training = DatasetFolder(ABS_PATH + TEST_PATH, loader=lambda x: Image.open(x), extensions="jpg", transform=self.test_tfm)
        self.test_set = DatasetFolder(ABS_PATH + TEST_PATH, loader=lambda x: Image.open(x), extensions="jpg", transform=self.test_tfm)

        self.train_loader = DataLoader(self.train_set, batch_size=BATCH_SIZE, shuffle=True, num_workers=2, pin_memory=True, drop_last=True)
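        # keep every validation sample (drop_last=False) so that dividing by len(val_set) gives the exact accuracy
        self.val_loader = DataLoader(self.val_set, batch_size=BATCH_SIZE, shuffle=False, num_workers=2, pin_memory=True, drop_last=False)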
        self.unlabeled_loader = DataLoader(self.unlabeled_set, batch_size=BATCH_SIZE, shuffle=False, num_workers=2, pin_memory=True, drop_last=True)
        self.test_loader_for_training = DataLoader(self.test_set_for_training, batch_size=BATCH_SIZE, shuffle=False, num_workers=2, pin_memory=True, drop_last=True)
        self.test_loader = DataLoader(self.test_set, batch_size=BATCH_SIZE, shuffle=False)

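        # for setting loss_weights: first count the validation samples of each class
        self.counts = [0] * 11
        for i in range(len(self.val_set)):
            x, y = self.val_set.__getitem__(i)
            self.counts[y] += 1

        # convert the validation counts into training counts per class
        # (assumes 280 labeled + 60 validation images per class, i.e. 340 in total)
        for i in range(11):
            self.counts[i] = 280 + 60 - self.counts[i]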


    def set_tfm(self):
        # It is important to do data augmentation in training.
        # However, not every augmentation is useful.
        # Please think about what kind of augmentation is helpful for food recognition.
        get_rand_num = lambda x: np.random.randint(x)
        get_rand_padding = lambda : (get_rand_num(32), get_rand_num(32), get_rand_num(32), get_rand_num(32))
        self.train_tfm = transforms.Compose([
            transforms.RandomHorizontalFlip(),
            transforms.Resize((256, 256)),
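            # NOTE: get_rand_padding() is evaluated once, when this pipeline is built,
            # so every image receives the same padding rather than a fresh random one.
            transforms.Pad(get_rand_padding(), fill=0, padding_mode="constant"),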
            transforms.RandomRotation(180),
            transforms.ColorJitter(brightness=0.1, contrast=0.1),
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
            # reference: https://paperswithcode.github.io/torchbench/imagenet/
            transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
        ])

        # We don't need augmentations in testing and validation.
        # All we need here is to resize the PIL image and transform it into Tensor.
        self.test_tfm = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
            transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
        ])


## **Model**

The basic model here is simply a stack of convolutional layers followed by some fully-connected layers.

Since there are three channels for a color image (RGB), the input channels of the network must be three.
In each convolutional layer, the number of channels typically grows, while the height and width shrink (or remain unchanged, depending on hyperparameters such as stride and padding).

Before being fed into the fully-connected layers, the feature map must be flattened into a single one-dimensional vector (for each image).
These features are then transformed by the fully-connected layers, and finally, we obtain the "logits" for each class.
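
To make the shape bookkeeping concrete, the sketch below applies the standard output-size formula for convolution and pooling layers to a 224×224 input passing through five VGG-style stages. The helper name `conv2d_out` is introduced here for illustration only. It assumes 3×3 convolutions with stride 1 and padding 1, and 2×2 max-pooling with stride 2, as in the model defined in the next cell.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv/pool layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224
sizes_per_stage = []
for out_channels in (64, 128, 256, 512, 512):
    # 3x3 conv, stride 1, padding 1: the height and width stay unchanged.
    size = conv2d_out(size, kernel=3, stride=1, padding=1)
    # 2x2 max-pool, stride 2: the height and width are halved.
    size = conv2d_out(size, kernel=2, stride=2, padding=0)
    sizes_per_stage.append((out_channels, size))

# Flattened feature length seen by the first fully-connected layer.
flat_features = 512 * size * size
print(sizes_per_stage, flat_features)
```

The final value is why the first fully-connected layer takes `512 * 7 * 7` input features.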

### **WARNING -- You Must Know**
You are free to modify the model architecture here for further improvement.
However, if you want to use some well-known architectures such as ResNet50, please make sure **NOT** to load the pre-trained weights.
Using such pre-trained models is considered cheating and therefore you will be punished.
Similarly, it is your responsibility to make sure no pre-trained weights are used if you use **torch.hub** to load any modules.

For example, if you use ResNet-18 as your model:

model = torchvision.models.resnet18(pretrained=**False**) → This is fine.

model = torchvision.models.resnet18(pretrained=**True**)  → This is **NOT** allowed.

In [31]:
# reference: vgg16 (https://arxiv.org/pdf/1409.1556.pdf)
class Classifier(nn.Module):
    def __init__(self):
        print('init classifier...')
        super(Classifier, self).__init__()
        NEGATIVE_SLOPE = CONFIG['NEGATIVE_SLOPE']
        MOMENTUM = CONFIG['MOMENTUM']
        # The arguments for commonly used modules:
        # torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)
        # torch.nn.MaxPool2d(kernel_size, stride, padding)

        # input image size: [3, 224, 224]
        self.cnn_layers = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1),
            nn.BatchNorm2d(64, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Conv2d(64, 64, 3, 1, 1),
            nn.BatchNorm2d(64, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.MaxPool2d(2, 2, 0),

            nn.Conv2d(64, 128, 3, 1, 1),
            nn.BatchNorm2d(128, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Conv2d(128, 128, 3, 1, 1),
            nn.BatchNorm2d(128, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.MaxPool2d(2, 2, 0),

            nn.Conv2d(128, 256, 3, 1, 1),
            nn.BatchNorm2d(256, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Conv2d(256, 256, 3, 1, 1),
            nn.BatchNorm2d(256, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Conv2d(256, 256, 3, 1, 1),
            nn.BatchNorm2d(256, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.MaxPool2d(2, 2, 0),

            nn.Conv2d(256, 512, 3, 1, 1),
            nn.BatchNorm2d(512, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Conv2d(512, 512, 3, 1, 1),
            nn.BatchNorm2d(512, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Conv2d(512, 512, 3, 1, 1),
            nn.BatchNorm2d(512, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.MaxPool2d(2, 2, 0),

            nn.Conv2d(512, 512, 3, 1, 1),
            nn.BatchNorm2d(512, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Conv2d(512, 512, 3, 1, 1),
            nn.BatchNorm2d(512, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Conv2d(512, 512, 3, 1, 1),
            nn.BatchNorm2d(512, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.MaxPool2d(2, 2, 0),
        )

        self.fc_layers = nn.Sequential(
            nn.Linear(512 * 7 * 7, 2048),
            nn.BatchNorm1d(2048, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Dropout(p=0.4),

            nn.Linear(2048, 1024),
            nn.BatchNorm1d(1024, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Dropout(p=0.4),

            nn.Linear(1024, 128),
            nn.BatchNorm1d(128, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Dropout(p=0.4),

            nn.Linear(128, 11)
        )

    def forward(self, x):
        # input (x): [batch_size, 3, 224, 224]
        # output: [batch_size, 11]

        # Extract features by convolutional layers.
        x = self.cnn_layers(x)

        # The extracted feature map must be flattened before going to fully-connected layers.
        x = x.flatten(1)

        # The features are transformed by fully-connected layers to obtain the final logits.
        x = self.fc_layers(x)
        return x

    def do_transfer(self):
        NEGATIVE_SLOPE = CONFIG['NEGATIVE_SLOPE']
        MOMENTUM = CONFIG['MOMENTUM']
        self.fc_layers = nn.Sequential(
            nn.Linear(512 * 7 * 7, 2048),
            nn.BatchNorm1d(2048, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Dropout(p=0.8),

            nn.Linear(2048, 1024),
            nn.BatchNorm1d(1024, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Dropout(p=0.8),

            nn.Linear(1024, 128),
            nn.BatchNorm1d(128, momentum=MOMENTUM),
            nn.LeakyReLU(negative_slope=NEGATIVE_SLOPE),
            nn.Dropout(p=0.8),

            nn.Linear(128, 11)
        )

    def summary(self):
        summary(self, (3, 224, 224))

## **Training**

You can finish supervised learning by simply running the provided code without any modification.

The function "get_pseudo_labels" is used for semi-supervised learning.
It is expected to get better performance if you use unlabeled data for semi-supervised learning.
However, you have to implement the function on your own and need to adjust several hyperparameters manually.

For more details about semi-supervised learning, please refer to [Prof. Lee's slides](https://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Lecture/semi%20(v3).pdf).
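
As a rough illustration of confidence-based pseudo-labeling, the sketch below keeps a prediction only when its softmax probability exceeds a threshold. It is written in plain Python so the arithmetic is visible. The function names and the toy logits are made up for this example, and the 0.99 threshold is simply the value used in `CONFIG`; this is one possible approach, not the required implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(batch_logits, threshold=0.99):
    """Split sample indices into confident (index, label) pairs and the rest."""
    accepted, rejected = [], []
    for idx, logits in enumerate(batch_logits):
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] > threshold:
            accepted.append((idx, label))
        else:
            rejected.append(idx)
    return accepted, rejected


# Toy logits for three unlabeled samples and three classes.
toy_logits = [
    [10.0, 0.0, 0.0],  # very confident in class 0
    [1.0, 0.9, 0.8],   # nearly uniform, so not confident
    [0.0, 0.0, 8.0],   # very confident in class 2
]
accepted, rejected = select_pseudo_labels(toy_logits)
print(accepted, rejected)
```

A high threshold admits fewer pseudo-labels but makes wrong ones less likely; lowering it trades label quality for quantity.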

Again, please notice that utilizing external data (or pre-trained model) for training is **prohibited**.

In [32]:
class Trainer():
    def __init__(self, idx):
        print('init trainer...')
        self.idx = idx
        OPTIMIZER = CONFIG['OPTIMIZER']
        OPTIM_PARAMS = CONFIG['OPTIM_PARAMS']
        GD_MODEL_PATH = CONFIG['GD_MODEL_PATH'] + str(self.idx)
        GD_PRE_MODEL_PATH = CONFIG['GD_PRE_MODEL_PATH']

        self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
        print(f'using device: {self.device}')

        self.model = Classifier()
        if os.path.isfile(GD_MODEL_PATH):
            # loading previous model
            print('loading previous model parameters...')
            ckpt = torch.load(GD_MODEL_PATH, map_location='cpu')
            self.model.do_transfer()
            self.model.load_state_dict(ckpt)
        else:
            # using pretrain model
            print('loading pretrain model parameters...')
            ckpt = torch.load(GD_PRE_MODEL_PATH, map_location='cpu')
            self.model.load_state_dict(ckpt) 
            self.model.do_transfer()

                       
        self.model = self.model.to(self.device)
        self.dataManager = DataManager(self.idx)
        self.optimizer = getattr(torch.optim, OPTIMIZER)(self.model.parameters(), **OPTIM_PARAMS)
        self.counts = self.dataManager.counts
        self.loss_weights = [1.0/count for count in self.counts]
        self.loss_weights_tensor = torch.FloatTensor(self.loss_weights).to(self.device)
        self.criterion = nn.CrossEntropyLoss(weight=self.loss_weights_tensor)
        
    def train(self):
        print(f'training model_{self.idx}...')
        GD_MODEL_PATH = CONFIG['GD_MODEL_PATH'] + str(self.idx)
        GD_VAL_BEST_PATH = CONFIG['GD_VAL_BEST_PATH'] + str(self.idx)

        # print gpu memory info (skipped when no GPU is available)
        if torch.cuda.is_available():
            r = torch.cuda.memory_reserved(0)
            a = torch.cuda.memory_allocated(0)
            gib = 1024 ** 3
            print(f'memory_reserved: {r/gib}, memory_allocated:{a/gib}, memory_free:{(r-a)/gib}')

        # init the best loss & acc of validation
        best_val_loss = float('inf')
        best_val_acc = 0.0
        if os.path.isfile(GD_VAL_BEST_PATH):
            with open(GD_VAL_BEST_PATH, 'r') as f:
                best_val_acc = float(f.read())
                print(f'best_val_acc: {best_val_acc}')
        
        # train
        train_acc = 0.0
        train_loss = 0.0
        self.model.train()
        for xs, ys in tqdm(self.dataManager.train_loader):
            self.optimizer.zero_grad()
            xs, ys = xs.to(self.device), ys.to(self.device)
            outputs = self.model(xs)

            batch_loss = self.criterion(outputs, ys)
            _, y_preds = torch.max(outputs, 1) # get the index of the class with the highest probability

            batch_loss.backward()
            grad_norm = nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=10) # clip the gradient norms for stable training.
            self.optimizer.step()

            train_acc += (y_preds.cpu() == ys.cpu()).sum().item()
            train_loss += batch_loss.item()
        
        train_acc = train_acc/len(self.dataManager.train_set)
        train_loss = train_loss/len(self.dataManager.train_loader)

        # validation
        val_acc = 0.0
        val_loss = 0.0
        # confusion matrix
        val_y = []
        val_y_pred = []
        self.model.eval()
        with torch.no_grad():
            for xs, ys in tqdm(self.dataManager.val_loader):
                xs, ys = xs.to(self.device), ys.to(self.device)
                outputs = self.model(xs)

                batch_loss = self.criterion(outputs, ys) 
                _, y_preds = torch.max(outputs, 1) 

                val_acc += (y_preds.cpu() == ys.cpu()).sum().item()
                val_loss += batch_loss.item()

                for y_pred in y_preds.cpu().numpy():
                    val_y_pred.append(y_pred)

                for y in ys.cpu().numpy():
                    val_y.append(y)

        val_acc = val_acc/len(self.dataManager.val_set)
        val_loss = val_loss/len(self.dataManager.val_loader)
        
        # self.update_lr()
        print(f'Train Acc: {train_acc:.5f} Loss: {train_loss:.5f}, Val Acc: {val_acc:.5f} Loss: {val_loss:.5f}')

        # print accuracy for each class
        cf_matrix = confusion_matrix(val_y, val_y_pred)
        for i in range(len(cf_matrix)):
            print(f'class {i} acc: {cf_matrix[i][i] / cf_matrix[i].sum():.5f}')

        # save the best model
        if val_acc >= best_val_acc:
            best_val_acc = val_acc
            torch.save(self.model.state_dict(), GD_MODEL_PATH)
            with open(GD_VAL_BEST_PATH, 'w') as f:
                f.write(str(best_val_acc))

            print(f'saving model with acc {best_val_acc:.5f}')

        # once the model is accurate enough, add pseudo-labeled data (this also updates the per-class counts used for the loss weights)
        if best_val_acc > 0.72 and train_acc > 0.95:
            self.set_pseudo_label()
            self.set_test_label()

        # rebuild the loss weights, since new data may have been appended
        self.loss_weights_tensor = torch.FloatTensor(self.loss_weights).to(self.device)
        self.criterion = nn.CrossEntropyLoss(weight=self.loss_weights_tensor)

    # predicting testing data
    def pred(self):
        print('predicting...')
        GD_MODEL_PATH = CONFIG['GD_MODEL_PATH'] + str(self.idx)

        # load the best model
        ckpt = torch.load(GD_MODEL_PATH, map_location='cpu')
        self.model.load_state_dict(ckpt)

        # predict
        self.model.eval()
        test_y_preds = []
        with torch.no_grad():
            for xs, ys in tqdm(self.dataManager.test_loader):
                xs = xs.to(self.device)
                outputs = self.model(xs)
                _, y_preds = torch.max(outputs, 1)
                
                for y_pred in y_preds.cpu().numpy():
                    test_y_preds.append(y_pred)

        return test_y_preds

    # appending data from unlabeled data
    def set_pseudo_label(self):
        THRESHOLD = CONFIG['THRESHOLD']
        BATCH_SIZE = CONFIG['BATCH_SIZE']

        softmax = nn.Softmax(dim=-1)

        # calculate the confidence and prediction
        self.model.eval()
        pseudo_candidates = []
        failed_idxs = []
        i_abs = 0
        with torch.no_grad():
            for xs, ys in tqdm(self.dataManager.unlabeled_loader):
                xs = xs.to(self.device)
                outputs = self.model(xs)
                _, y_preds = torch.max(outputs, 1) 
                probss = softmax(outputs)

                for i, probs in enumerate(probss.cpu().numpy()):
                    y_pred = y_preds.cpu().numpy()[i]
                    if probs[y_pred] > THRESHOLD:
                        pseudo_candidates.append([i_abs, y_pred])
                    else:
                        failed_idxs.append(i_abs)

                    i_abs += 1

        # get idx_to_labels & update loss_weights
        pseudo_idxs = []
        idx_to_labels = {}
        for idx, y_pred in pseudo_candidates:
            pseudo_idxs.append(idx)
            idx_to_labels[idx] = y_pred
            self.counts[y_pred] += 1
        self.loss_weights = [1.0/count for count in self.counts]

        # walk through any nested Subsets down to the raw dataset, mapping the high-confidence indices to raw indices
        dataset = self.dataManager.unlabeled_set
        raw_idxs = []
        for i in range(len(pseudo_idxs)):
            raw_idxs.append(pseudo_idxs[i])

        while type(dataset) == Subset:
            for i in range(len(pseudo_idxs)):
                raw_idxs[i] = dataset.indices[raw_idxs[i]]
            dataset = dataset.dataset
          
        # get raw_idx_to_labels
        raw_idx_to_labels = {}
        for i in range(len(pseudo_idxs)):
            raw_idx_to_labels[raw_idxs[i]] = idx_to_labels[pseudo_idxs[i]]

        # update pseudo labels to raw dataset
        samples = dataset.samples
        targets = dataset.targets
        for i in range(len(samples)):
            if i in raw_idx_to_labels.keys():
                samples[i] = (samples[i][0], raw_idx_to_labels[i])
                targets[i] = raw_idx_to_labels[i]

        print(f'appending {len(idx_to_labels)} datas to train_set')

        pseudo_set = Subset(self.dataManager.unlabeled_set, pseudo_idxs)
        failed_set = Subset(self.dataManager.unlabeled_set, failed_idxs)

        # show the appended data with their labels
        # transform torch tensor to PIL image.
        # tensor_to_PIL = transforms.ToPILImage()
        # from matplotlib.pyplot import imshow
        # from matplotlib import pyplot as plt
        # %matplotlib inline
        # for i in range(len(idx_to_labels)):
        #     img_tensor, y = pseudo_set.__getitem__(i)
        #     tfm = transforms.Compose([transforms.Normalize((0., 0., 0.), (1/0.229, 1/0.224, 1/0.225)),
        #                               transforms.Normalize((-0.485, -0.456, -0.406), (1., 1., 1.)),])
        #     img_tensor = tfm(img_tensor)
        #     img = tensor_to_PIL(img_tensor).convert('RGB')

        #     print(f'label: {y}')
        #     imshow(np.asarray(img))
        #     plt.show()

        self.dataManager.train_set = ConcatDataset([self.dataManager.train_set, pseudo_set])
        self.dataManager.unlabeled_set = failed_set
        self.dataManager.train_loader = DataLoader(self.dataManager.train_set, batch_size=BATCH_SIZE, shuffle=True, num_workers=2, pin_memory=True, drop_last=True)
        self.dataManager.unlabeled_loader = DataLoader(self.dataManager.unlabeled_set, batch_size=BATCH_SIZE, shuffle=False, num_workers=2, pin_memory=True, drop_last=True)

    # appending data from testing data
    def set_test_label(self):
        THRESHOLD = CONFIG['THRESHOLD']
        BATCH_SIZE = CONFIG['BATCH_SIZE']

        softmax = nn.Softmax(dim=-1)

        # calculate the confidence and prediction
        self.model.eval()
        pseudo_candidates = []
        failed_idxs = []
        i_abs = 0
        with torch.no_grad():
            for xs, ys in tqdm(self.dataManager.test_loader_for_training):
                xs = xs.to(self.device)
                outputs = self.model(xs)
                _, y_preds = torch.max(outputs, 1) 
                probss = softmax(outputs)

                for i, probs in enumerate(probss.cpu().numpy()):
                    y_pred = y_preds.cpu().numpy()[i]
                    if probs[y_pred] > THRESHOLD:
                        pseudo_candidates.append([i_abs, y_pred])
                    else:
                        failed_idxs.append(i_abs)

                    i_abs += 1

        # get idx_to_labels & update loss_weights
        pseudo_idxs = []
        idx_to_labels = {}
        for idx, y_pred in pseudo_candidates:
            pseudo_idxs.append(idx)
            idx_to_labels[idx] = y_pred
            self.counts[y_pred] += 1
        self.loss_weights = [1.0/count for count in self.counts]

        # walk through any nested Subsets down to the raw dataset, mapping the high-confidence indices to raw indices
        dataset = self.dataManager.test_set_for_training
        raw_idxs = []
        for i in range(len(pseudo_idxs)):
            raw_idxs.append(pseudo_idxs[i])

        while type(dataset) == Subset:
            for i in range(len(pseudo_idxs)):
                raw_idxs[i] = dataset.indices[raw_idxs[i]]
            dataset = dataset.dataset
          
        # get raw_idx_to_labels
        raw_idx_to_labels = {}
        for i in range(len(pseudo_idxs)):
            raw_idx_to_labels[raw_idxs[i]] = idx_to_labels[pseudo_idxs[i]]

        # update pseudo labels to raw dataset
        samples = dataset.samples
        targets = dataset.targets
        for i in range(len(samples)):
            if i in raw_idx_to_labels.keys():
                samples[i] = (samples[i][0], raw_idx_to_labels[i])
                targets[i] = raw_idx_to_labels[i]

        print(f'appending {len(idx_to_labels)} datas to train_set')

        pseudo_set = Subset(self.dataManager.test_set_for_training, pseudo_idxs)
        failed_set = Subset(self.dataManager.test_set_for_training, failed_idxs)

        # print appending datas with label
        # transform torch tensor to PIL image.
        # tensor_to_PIL = transforms.ToPILImage()
        # from matplotlib.pyplot import imshow
        # from matplotlib import pyplot as plt
        # %matplotlib inline
        # for i in range(len(idx_to_labels)):
        #     img_tensor, y = pseudo_set.__getitem__(i)
        #     tfm = transforms.Compose([transforms.Normalize((0., 0., 0.), (1/0.229, 1/0.224, 1/0.225)),
        #                               transforms.Normalize((-0.485, -0.456, -0.406), (1., 1., 1.)),])
        #     img_tensor = tfm(img_tensor)
        #     img = tensor_to_PIL(img_tensor).convert('RGB')

        #     print(f'label: {y}')
        #     imshow(np.asarray(img))
        #     plt.show()

        self.dataManager.train_set = ConcatDataset([self.dataManager.train_set, pseudo_set])
        self.dataManager.test_set_for_training = failed_set
        self.dataManager.train_loader = DataLoader(self.dataManager.train_set, batch_size=BATCH_SIZE, shuffle=True, num_workers=2, pin_memory=True, drop_last=True)
        self.dataManager.test_loader_for_training = DataLoader(self.dataManager.test_set_for_training, batch_size=BATCH_SIZE, shuffle=False, num_workers=2, pin_memory=True, drop_last=True)

    def update_lr(self):
        DECAY_RATE = CONFIG['DECAY_RATE']
        MIN_LR = CONFIG['MIN_LR']
        for param_group in self.optimizer.param_groups:
            param_group['lr'] = param_group['lr'] * DECAY_RATE
            param_group['lr'] = max(MIN_LR, param_group['lr'])
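
The index bookkeeping in `set_pseudo_label` and `set_test_label` can be hard to follow, because after each round the remaining pool is a `Subset` of the previous one. The sketch below shows the same idea in plain Python: indices local to the outermost subset are translated layer by layer until they refer to the underlying dataset. The `Subset` class here is a minimal hypothetical stand-in that mirrors only the `dataset` and `indices` attributes of `torch.utils.data.Subset`, and `resolve_to_root` is a name introduced for this example.

```python
class Subset:
    """Minimal stand-in exposing the `dataset` and `indices` attributes."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = list(indices)

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        return self.dataset[self.indices[i]]


def resolve_to_root(dataset, local_indices):
    """Translate indices of a (possibly nested) Subset into root indices."""
    idxs = list(local_indices)
    while isinstance(dataset, Subset):
        idxs = [dataset.indices[i] for i in idxs]
        dataset = dataset.dataset
    return dataset, idxs


root = list("abcdefgh")
first = Subset(root, [1, 3, 5, 7])  # b, d, f, h
second = Subset(first, [0, 2, 3])   # b, f, h

base, raw_idxs = resolve_to_root(second, [1, 2])
print(raw_idxs, [base[i] for i in raw_idxs])
```

Resolving to root indices is what allows a label to be written back into the underlying dataset's `samples` and `targets`.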

In [33]:
class Emssembler():
    def __init__(self):
        print('init emsembler...')
        MODEL_NUM = CONFIG['MODEL_NUM']
        self.trainers = []
        for i in range(MODEL_NUM):
            self.trainers.append(Trainer(i))

    # train every trainer in self.trainers, one epoch at a time
    def train(self):
        MODEL_NUM = CONFIG['MODEL_NUM']
        EPOCH_NUM = CONFIG['EPOCH_NUM']
        for i in range(EPOCH_NUM):
            print(f'epoch: {i+1}')
            for trainer in self.trainers:
                trainer.train()

    # predict testing data with majority vote      
    def pred(self):
        MODEL_NUM = CONFIG['MODEL_NUM']
        
        y_preds = None
        for trainer in self.trainers:
            y_pred = np.array(trainer.pred())
            y_pred = np.reshape(y_pred, (y_pred.shape[0], 1))
            if y_preds is None:
                y_preds = y_pred
            else:
                y_preds = np.concatenate((y_preds, y_pred), axis=1)

        emssemble_y_preds = []
        for i in range(len(y_preds)):
            y_pred = self.most_freq(y_preds[i])
            emssemble_y_preds.append(y_pred)

        print('Saving...')
        with open('pred.csv', 'w') as f:
            f.write('Id,Category\n')
            for i, y in enumerate(emssemble_y_preds):
                f.write(f'{i},{y}\n')

        print('Finishing...')

    # get the most frequent element in arr
    def most_freq(self, arr):
        freq_map = {}
        ret = arr[0]
        for x in arr:
            if x not in freq_map:
                freq_map[x] = 0
            
            freq_map[x] += 1
            if freq_map[x] > freq_map[ret]:
                ret = x
        
        return ret
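
The majority vote used by the ensemble can also be sketched with `collections.Counter`. The toy predictions below are made up for illustration, and `majority_vote` is a hypothetical helper, not part of the code above. Note that tie-breaking may differ from `most_freq`: `Counter.most_common` returns the first-seen label among equally frequent ones.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most common label among one sample's votes."""
    return Counter(votes).most_common(1)[0][0]


# Each inner list holds one model's predictions for every test sample.
per_model_preds = [
    [3, 0, 7],
    [3, 2, 7],
    [5, 2, 7],
    [3, 2, 1],
    [1, 0, 7],
]

# Transpose so that each row holds all models' votes for one sample.
per_sample_votes = list(zip(*per_model_preds))
ensemble_preds = [majority_vote(votes) for votes in per_sample_votes]
print(ensemble_preds)
```

Using an odd number of models, as with `MODEL_NUM` set to 5, reduces the number of ties but does not eliminate them when there are more than two classes.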

In [34]:
emssembler = Emssembler()
emssembler.train()
emssembler.pred()
print(f'CONFIG: {CONFIG}')


init emsembler...
init trainer...
using device: cuda
init classifier...
loading previous model parameters...
init data manager...
init trainer...
using device: cuda
init classifier...
loading previous model parameters...
init data manager...
init trainer...
using device: cuda
init classifier...
loading previous model parameters...
init data manager...
init trainer...
using device: cuda
init classifier...
loading previous model parameters...
init data manager...
init trainer...
using device: cuda
init classifier...
loading previous model parameters...
init data manager...
predicting...
predicting...
predicting...
predicting...
predicting...
Saving...
Finishing...
CONFIG: {'GD_PRE_MODEL_PATH': 'gdrive/MyDrive/Colab Notebooks/HW3/models/model.ckpt', 'GD_MODEL_PATH': 'gdrive/MyDrive/Colab Notebooks/HW3/models/tfer_model_', 'GD_VAL_BEST_PATH': 'gdrive/MyDrive/Colab Notebooks/HW3/models/tfer_val_best_', 'ABS_PATH': 'gdrive/MyDrive/Colab Notebooks/HW3/food-11/', 'TRAIN_PATH': 'training/labeled/', 'UNLABELED_PATH': 'training/unlabeled/', 'VAL_PATH': 'validation/', 'TEST_PATH': 'testing/', 'EPOCH_NUM': 100, 'BATCH_SIZE': 32, 'OPTIMIZER': 'Adam', 'OPTIM_PARAMS': {'lr': 7e-05, 'weight_decay': 1e-05}, 'DECAY_RATE': 1, 'MIN_LR': 7e-05, 'MODEL_NUM': 5, 'SEED': 0, 'THRESHOLD': 0.99, 'NEGATIVE_SLOPE': 0, 'MOMENTUM': 0.1}
