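### Statistical Learning and Deep Learning
### Homework 5

Upload your IPYNB file to the COOL assignment area. We recommend the "sandwich" approach when answering: first state what you are going to do, then show the code and results, and finally explain what the results mean. Do the homework yourself; plagiarism is strictly prohibited. Paper submissions and late submissions are not accepted. Answer in English or Chinese.

This homework practices image classification. Image classification is a strength of CNN models, and our task is to identify the type of top worn by the main subject in each photo. The difficulty of this problem varies with the setting. `Dive into Deep Learning` covers a similar problem but works with "cleaner" images. The data for this homework come from street-snap photos, so classification is harder.

The task in this homework is to classify each photo by the top the person is wearing, into the following categories: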
* blazer
* cardigan
* coat
* jacket

Example training images for the four categories are shown below.

In [None]:
%matplotlib inline

import matplotlib.pyplot as plt
from PIL import Image
import os
import glob
import random

random.seed(1223)
labels = ['blazer', 'cardigan', 'coat', 'jacket']
for i in range(4):
    print("Label = ", labels[i])
    basepath = os.path.join("photos/train", labels[i], "*.jpg")
    cand_fn = glob.glob(basepath)
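    # Pick 3 example files per class (random.choices samples with replacement)
    for afn in random.choices(cand_fn, k = 3):
        img = Image.open(afn)
        # Uncomment the next two lines to display each sampled image
        #plt.imshow(img)
        #plt.show()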

Label =  blazer
Label =  cardigan
Label =  coat
Label =  jacket


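### Data
The data are in the `photos` folder and are already split into training (train), validation (valid), and test (test) sets. Within each split, images are stored in one folder per label, giving four folders: blazer, cardigan, coat, and jacket. Each image belongs to exactly one class.

### Q1
(5%) List the total number of photos in train, valid, and test, along with the count and proportion of each class. Before training or evaluating any model, what do you expect the relative per-class accuracies to be?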


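Because the images are high resolution, resize each one so that its shorter side is 256 pixels before training, then take a random 224x224 crop. Next apply a random horizontal flip and a random rotation between -20 and 20 degrees, and normalize the RGB channels with the mean and standard deviation required by the pretrained ResNet.
For evaluation data (valid and test), also resize so that the shorter side is 256 pixels, then take the central 224x224 crop.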
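To make the resize-then-crop geometry concrete, here is a minimal sketch in plain Python. The helper names `resize_shorter_side` and `center_crop_box` and the 640x480 photo size are made up for illustration. The rounding is a simplification and is not guaranteed to match torchvision pixel for pixel.

```python
def resize_shorter_side(width, height, target=256):
    """Return (new_width, new_height) with the shorter side scaled to `target`."""
    if width <= height:
        return target, int(target * height / width)
    return int(target * width / height), target


def center_crop_box(width, height, crop=224):
    """Return the (left, top, right, bottom) box of a centered crop."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


# Hypothetical 640x480 landscape photo
new_w, new_h = resize_shorter_side(640, 480)
print(new_w, new_h)                    # 341 256
print(center_crop_box(new_w, new_h))   # (58, 16, 282, 240)
```

A random crop differs only in how `left` and `top` are chosen: they are drawn uniformly from the valid range instead of being fixed at the center.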


In [None]:
import numpy as np
import pandas as pd
import os
from torch import optim, cuda
from torchvision import transforms, datasets, models
from torch.utils.data import DataLoader, sampler
from collections import Counter
batch_size = 32

train_on_gpu = cuda.is_available()

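preprocess = {
    # Training: resize, augment (rotation, horizontal flip), crop, then normalize.
    # Note: the assignment text asks for a random 224x224 crop (transforms.RandomCrop);
    # this notebook uses CenterCrop for the training split as well.
    'train':transforms.Compose([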
        transforms.Resize(size=256),
        transforms.RandomRotation(degrees=20),
        transforms.RandomHorizontalFlip(),
        transforms.CenterCrop(size=224), 
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406],[0.229, 0.224, 0.225])]),
    
    'valid':transforms.Compose([
        transforms.Resize(size=256),
        transforms.CenterCrop(size=224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
    
    'test':transforms.Compose([
        transforms.Resize(size=256),
        transforms.CenterCrop(size=224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),}

data = {'train': datasets.ImageFolder(root="photos/train", transform=preprocess['train']),
        'valid': datasets.ImageFolder(root="photos/valid", transform=preprocess['valid']),
        'test': datasets.ImageFolder(root="photos/test", transform=preprocess['test'])}

# Shuffling only matters for training; evaluation order does not affect the metrics
dataloaders = {'train': DataLoader(data['train'], batch_size=batch_size, shuffle=True,num_workers=10),
               'val': DataLoader(data['valid'], batch_size=batch_size, shuffle=False,num_workers=10),
               'test': DataLoader(data['test'], batch_size=batch_size, shuffle=False,num_workers=10)}

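# Use ImageFolder's class list so the order matches class_to_idx
# (os.listdir can return an arbitrary order and may include hidden files)
categories = list(data['train'].classes)
n_classes = len(categories)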

class_to_idx = data['train'].class_to_idx
idx_to_class = {idx: class_ for class_, idx in data['train'].class_to_idx.items()}

train_cnts = Counter([idx_to_class[x] for x in data['train'].targets])
val_cnts = Counter([idx_to_class[x] for x in data['valid'].targets])
test_cnts = Counter([idx_to_class[x] for x in data['test'].targets])
train_cnts = pd.DataFrame({'category' :list(train_cnts.keys()), 'train_cnt': list(train_cnts.values())})
val_cnts = pd.DataFrame({'category' :list(val_cnts.keys()), 'valid_cnt': list(val_cnts.values())})
test_cnts = pd.DataFrame({'category' :list(test_cnts.keys()), 'test_cnt': list(test_cnts.values())})

train_len = train_cnts['train_cnt'].sum()
train_cnts['train_prob'] = train_cnts.apply(lambda row: str(round(row.train_cnt/train_len*100,2))+'%', axis=1)
valid_len = val_cnts['valid_cnt'].sum()
val_cnts['valid_prob'] = val_cnts.apply(lambda row: str(round(row.valid_cnt/valid_len*100,2))+'%', axis=1)
test_len = test_cnts['test_cnt'].sum()
test_cnts['test_prob'] = test_cnts.apply(lambda row: str(round(row.test_cnt/test_len*100,2))+'%', axis=1)

categories_df = pd.merge(train_cnts,val_cnts,on='category',how='left').merge(test_cnts,on='category',how='left')
print(categories_df.head())

trainiter = iter(dataloaders['train'])
features, labels = next(trainiter)

### A class with more training data can usually be learned better by the model.

### So before training, I expect the per-class accuracies to rank as follows: jacket > coat > cardigan > blazer
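As a small illustration of the class-ratio calculation, the sketch below uses the test-set counts that can be read off the row totals of the confusion matrices printed in this notebook's outputs (9, 43, 42, and 52 images). It also computes the accuracy of a trivial classifier that always predicts the largest class, which is a useful floor when judging the models later.

```python
# Test-set counts per class, read off the row totals of the confusion
# matrices printed in this notebook's outputs.
test_counts = {'blazer': 9, 'cardigan': 42, 'coat': 43, 'jacket': 52}

total = sum(test_counts.values())
ratios = {label: count / total for label, count in test_counts.items()}

for label, ratio in ratios.items():
    print(f'{label:<9}{test_counts[label]:>4}{100 * ratio:>8.2f}%')

# Accuracy of a trivial classifier that always predicts the largest class
majority_label = max(test_counts, key=test_counts.get)
majority_baseline = test_counts[majority_label] / total
print(f'majority baseline ({majority_label}): {100 * majority_baseline:.2f}%')
```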


### Q2
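(35%) Build an image classification model with ResNet50. Change the output dimension of the final fully connected layer to 4 to fit this task. Except for the final layer, initialize the model weights with the pretrained weights provided by torchvision (`torchvision.models.resnet50(pretrained=True)`). Train the model on the train data and use the valid data to choose the early-stopping epoch. Set the early-stopping patience to 20 epochs and the batch size to 32. Compute the valid loss once per epoch and keep the model with the lowest valid loss. Train for at most 200 epochs. Use the best model to compute accuracy, the confusion matrix, and per-class accuracy on the test data. You should consider both the SGD and ADAM optimizers. Tune the hyperparameters to achieve the best valid loss.

After obtaining the per-class accuracy, discuss how it differs from your Q1 expectation and the possible reasons.

Hints:
* See <https://pytorch.org/hub/pytorch_vision_resnet/> for documentation of the PyTorch ResNet pretrained model.
* The test accuracy for this question should be above 78%.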
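The early-stopping rule described above can be sketched on its own, without any model. The function name `early_stopping_epoch` and the loss values are made up for illustration. In the real training loop, the point where the best loss is updated is where the checkpoint would be saved.

```python
def early_stopping_epoch(valid_losses, patience=20):
    """Return (best_epoch, stop_epoch) for a sequence of per-epoch validation losses.

    stop_epoch is the epoch at which training halts, or the last epoch if
    the patience budget is never exhausted.
    """
    best_loss = float('inf')
    best_epoch = -1
    epochs_no_improve = 0
    for epoch, loss in enumerate(valid_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_no_improve = 0      # this is where the checkpoint would be saved
        else:
            epochs_no_improve += 1
            if epochs_no_improve >= patience:
                return best_epoch, epoch
    return best_epoch, len(valid_losses) - 1


# Made-up loss curve: improves for 4 epochs, then drifts upward
losses = [1.2, 0.9, 0.7, 0.5, 0.6, 0.65, 0.7, 0.8]
print(early_stopping_epoch(losses, patience=3))   # (3, 6)
```

With a patience of 20, training stops exactly 20 epochs after the best epoch, provided no later epoch improves on it.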



In [None]:
import torch
import torch.nn as nn
from timeit import default_timer as timer

def train(train_loader, valid_loader, data, save_file_name, max_epochs_stop=20, n_epochs=20, print_every=1, learning_rate=0.0001, pretrained_tf=True, requires_grad_tf=True, _optimizer='Adam'):

    model = models.resnet50(pretrained=pretrained_tf)

    for param in model.parameters():
        param.requires_grad = requires_grad_tf

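    n_inputs = model.fc.in_features
    # New 4-class head. nn.NLLLoss expects log-probabilities, and test() applies
    # torch.exp to the output, so the head ends with LogSoftmax.
    model.fc = nn.Sequential(nn.Linear(n_inputs, 4), nn.LogSoftmax(dim=1))
    total_params = sum(p.numel() for p in model.parameters())
    #print(f'{total_params:,} total parameters.')
    total_trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    #print(f'{total_trainable_params:,} training parameters.')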

    if train_on_gpu:
        model = model.to('cuda')

    model.class_to_idx = data['train'].class_to_idx
    model.idx_to_class = {idx: class_ for class_, idx in model.class_to_idx.items()}

    criterion = nn.NLLLoss()
    
    if _optimizer == 'Adam':
        optimizer = optim.Adam(model.parameters(),lr=learning_rate)
    elif _optimizer == 'SGD':
        optimizer = optim.SGD(model.parameters(),lr=learning_rate)
    
    epochs_no_improve = 0
    valid_loss_min = np.inf
    valid_best_acc = 0
    best_epoch = 0
    history = []

    # Track how many epochs this model object has been trained for
    if hasattr(model, 'epochs'):
        print(f'Model has been trained for: {model.epochs} epochs.')
    else:
        model.epochs = 0
        print(f'Starting Training from Scratch.\n')
    
    overall_start = timer()

    for epoch in range(n_epochs):

        train_loss = 0
        valid_loss = 0
        train_acc = 0
        valid_acc = 0

        model.train()
        start = timer()

        for ii, (data, target) in enumerate(train_loader):
            if train_on_gpu:
                data, target = data.cuda(), target.cuda()

            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()

            train_loss += loss.item() * data.size(0)

            _, pred = torch.max(output, dim=1)
            correct_tensor = pred.eq(target.data.view_as(pred))
            accuracy = torch.mean(correct_tensor.type(torch.FloatTensor))
            train_acc += accuracy.item() * data.size(0)

            print(f'Epoch: {epoch}\t{100 * (ii + 1) / len(train_loader):.2f}% complete. {timer() - start:.2f} seconds elapsed in epoch.',end='\r')

        else:
            model.epochs += 1

            with torch.no_grad():
                model.eval()

                for data, target in valid_loader:
                    if train_on_gpu:
                        data, target = data.cuda(), target.cuda()

                    output = model(data)

                    loss = criterion(output, target)
                    valid_loss += loss.item() * data.size(0)

                    _, pred = torch.max(output, dim=1)
                    correct_tensor = pred.eq(target.data.view_as(pred))
                    accuracy = torch.mean(
                        correct_tensor.type(torch.FloatTensor))
                    valid_acc += accuracy.item() * data.size(0)

                train_loss = train_loss / len(train_loader.dataset)
                valid_loss = valid_loss / len(valid_loader.dataset)
                train_acc = train_acc / len(train_loader.dataset)
                valid_acc = valid_acc / len(valid_loader.dataset)

                history.append([train_loss, valid_loss, train_acc, valid_acc])

                if (epoch + 1) % print_every == 0:
                    print(f'Epoch: {epoch} \tTraining Loss: {train_loss:.4f} \tValidation Loss: {valid_loss:.4f}')
                    print(f'\t\tTraining Accuracy: {100 * train_acc:.2f}%\t Validation Accuracy: {100 * valid_acc:.2f}%')

                if valid_loss < valid_loss_min:
                    torch.save(model.state_dict(), save_file_name)
                    epochs_no_improve = 0
                    valid_loss_min = valid_loss
                    valid_best_acc = valid_acc
                    best_epoch = epoch

                else:
                    epochs_no_improve += 1
                    if epochs_no_improve >= max_epochs_stop:
                        # Report the accuracy of the best epoch, not of the current one
                        print(f'\nEarly Stopping! Total epochs: {epoch}. Best epoch: {best_epoch} with loss: {valid_loss_min:.2f} and acc: {100 * valid_best_acc:.2f}%')
                        total_time = timer() - overall_start
                        print(f'{total_time:.2f} total seconds elapsed. {total_time / (epoch+1):.2f} seconds per epoch.\n')

                        model.load_state_dict(torch.load(save_file_name))
                        model.optimizer = optimizer

                        history = pd.DataFrame(history,columns=['train_loss', 'valid_loss', 'train_acc','valid_acc'])
                        return model, history

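    # No early stop was triggered: still return the weights with the lowest validation loss
    model.load_state_dict(torch.load(save_file_name))
    model.optimizer = optimizer
    total_time = timer() - overall_start
    print(f'Best epoch: {best_epoch} with loss: {valid_loss_min:.2f} and acc: {100 * valid_best_acc:.2f}%')
    print(f'{total_time:.2f} total seconds elapsed. {total_time / (epoch + 1):.2f} seconds per epoch.')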
    history = pd.DataFrame(history,columns=['train_loss', 'valid_loss', 'train_acc', 'valid_acc'])
    return model, history

In [None]:
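def process_image(image_path):
    # Assumed helper (not defined elsewhere in this notebook): load an image
    # and apply the same preprocessing as the test split.
    img = Image.open(image_path).convert('RGB')
    return preprocess['test'](img)

def test(model):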
    # result[true_class][predicted_class] counts predictions;
    # result[true_class]['total'] counts the images of each true class
    result = {c: {c2: 0 for c2 in categories} for c in categories}
    for c in categories:
        result[c]['total'] = 0

    for c in categories:
        root = "photos/test/"+ c + '/'
        for img in os.listdir(root):
            topk = 1
            image_path = root + img
            #print(image_path)

            real_class = image_path.split('/')[-2]
            img_tensor = process_image(image_path)

            if train_on_gpu:
                img_tensor = img_tensor.view(1, 3, 224, 224).cuda()
            else:
                img_tensor = img_tensor.view(1, 3, 224, 224)

            with torch.no_grad():
                model.eval()
                out = model(img_tensor)
                ps = torch.exp(out)

                topk, topclass = ps.topk(topk, dim=1)

                top_classes = [
                    model.idx_to_class[class_] for class_ in topclass.cpu().numpy()[0]
                ]
                top_p = topk.cpu().numpy()[0]

                result[c]['total'] += 1
                # Count the prediction under its predicted class
                result[c][top_classes[0]] += 1

    print('--- Per-class Accuracy ---')
    print('blazer accuracy:', result['blazer']['blazer']/result['blazer']['total'])
    print('coat accuracy:', result['coat']['coat']/result['coat']['total'])
    print('cardigan accuracy:', result['cardigan']['cardigan']/result['cardigan']['total'])
    print('jacket accuracy:', result['jacket']['jacket']/result['jacket']['total'])
    total_acc = (result['blazer']['blazer']+result['coat']['coat']+result['cardigan']['cardigan']+result['jacket']['jacket'])/(result['blazer']['total']+result['coat']['total']+result['cardigan']['total']+result['jacket']['total'])
    print('\n--- Testing Accuracy ---')
    print('accuracy:', total_acc)
    print('\n--- Confusion Matrix ---')
    cdf = pd.DataFrame([[result['blazer']['blazer'],result['blazer']['coat'],result['blazer']['cardigan'],result['blazer']['jacket']], 
                        [result['coat']['blazer'],result['coat']['coat'],result['coat']['cardigan'],result['coat']['jacket']], 
                        [result['cardigan']['blazer'],result['cardigan']['coat'],result['cardigan']['cardigan'],result['cardigan']['jacket']],
                        [result['jacket']['blazer'],result['jacket']['coat'],result['jacket']['cardigan'],result['jacket']['jacket']]],
     index=['blazer','coat','cardigan','jacket'],
     columns=['blazer','coat','cardigan','jacket'])
    print(cdf)
    print('\n')

## Q2 Adam

In [None]:
from timeit import default_timer as timer
import torch
from torchvision import transforms, datasets, models
from torch.utils.data import DataLoader

save_file_name = f'resnet50-transfer.pt'
checkpoint_path = f'resnet50-transfer.pth'

for _learning_rate in [0.0001,0.0005,0.001]:

    print('-------- optimizer = Adam | learning rate =',_learning_rate, '--------\n')
    model, history = train(
        dataloaders['train'],
        dataloaders['val'],
        data,
        save_file_name=save_file_name,
        max_epochs_stop=20,
        n_epochs=200,
        print_every=20,
        learning_rate=_learning_rate,
        pretrained_tf=True,
        requires_grad_tf=True,
        _optimizer='Adam')
    test(model)

-------- optimizer = Adam | learning rate = 0.0001 --------

Starting Training from Scratch.

Epoch: 19 	Training Loss: 0.0903 	Validation Loss: 0.4955
		Training Accuracy: 97.12%	 Validation Accuracy: 82.86%
Epoch: 39 	Training Loss: 0.0480 	Validation Loss: 0.9354
		Training Accuracy: 98.46%	 Validation Accuracy: 77.14%

Early Stopping! Total epochs: 39. Best epoch: 19 with loss: 0.50 and acc: 77.14%
735.11 total seconds elapsed. 18.38 seconds per epoch.

--- Per-class Accuracy ---
blazer accuracy: 0.6666666666666666
coat accuracy: 0.8604651162790697
cardigan accuracy: 0.7142857142857143
jacket accuracy: 0.8846153846153846

--- Testing Accuracy ---
accuracy: 0.815068493150685

--- Confusion Matrix ---
          blazer  coat  cardigan  jacket
blazer         6     1         0       2
coat           1    37         0       5
cardigan       0     3        30       9
jacket         2     2         2      46


-------- optimizer = Adam | learning rate = 0.0005 --------

Starting Training f

## Q2 SGD

In [None]:
for _learning_rate in [0.0005,0.001,0.005]:

    print('-------- optimizer = SGD | learning rate =',_learning_rate, '--------\n')
    model, history = train(
        dataloaders['train'],
        dataloaders['val'],
        data,
        save_file_name=save_file_name,
        max_epochs_stop=20,
        n_epochs=200,
        print_every=20,
        learning_rate=_learning_rate,
        pretrained_tf=True,
        requires_grad_tf=True,
        _optimizer='SGD')
    test(model)

-------- optimizer = SGD | learning rate = 0.0005 --------



Downloading: "https://download.pytorch.org/models/resnet50-19c8e357.pth" to /root/.cache/torch/hub/checkpoints/resnet50-19c8e357.pth



Starting Training from Scratch.

Epoch: 19 	Training Loss: 1.2349 	Validation Loss: 1.2287
		Training Accuracy: 41.02%	 Validation Accuracy: 33.33%
Epoch: 39 	Training Loss: 1.1675 	Validation Loss: 1.1584
		Training Accuracy: 49.38%	 Validation Accuracy: 42.86%
Epoch: 59 	Training Loss: 1.0436 	Validation Loss: 1.0548
		Training Accuracy: 59.85%	 Validation Accuracy: 58.10%
Epoch: 79 	Training Loss: 0.8858 	Validation Loss: 0.9239
		Training Accuracy: 67.92%	 Validation Accuracy: 62.86%
Epoch: 99 	Training Loss: 0.7344 	Validation Loss: 0.7999
		Training Accuracy: 71.85%	 Validation Accuracy: 71.43%
Epoch: 119 	Training Loss: 0.5841 	Validation Loss: 0.7068
		Training Accuracy: 77.91%	 Validation Accuracy: 74.29%
Epoch: 139 	Training Loss: 0.4370 	Validation Loss: 0.6374
		Training Accuracy: 85.59%	 Validation Accuracy: 73.33%
Epoch: 159 	Training Loss: 0.3498 	Validation Loss: 0.5969
		Training Accuracy: 88.66%	 Validation Accuracy: 77.14%
Epoch: 179 	Training Loss: 0.2684 	Validati

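### Among all Q2 models, the highest testing accuracy comes from the model with learning rate = 0.0001 and optimizer = Adam, at 81.5%.

### Its per-class accuracy ranking matches the Q1 prediction. Most of the other models also agree with the Q1 prediction; occasionally two adjacent classes swap places, but I found no pattern in when this happens.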
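The per-class and overall accuracies can be re-derived directly from a confusion matrix. The sketch below uses the matrix printed for the Adam, learning rate = 0.0001 run, with rows as true classes and columns as predicted classes. The variable names are chosen for illustration.

```python
labels = ['blazer', 'coat', 'cardigan', 'jacket']
# Rows are true classes, columns are predicted classes (Adam, lr = 0.0001 run)
confusion = [
    [6, 1, 0, 2],
    [1, 37, 0, 5],
    [0, 3, 30, 9],
    [2, 2, 2, 46],
]

# Per-class accuracy: diagonal entry divided by the row total
per_class_acc = {
    label: row[i] / sum(row) for i, (label, row) in enumerate(zip(labels, confusion))
}
# Overall accuracy: sum of the diagonal divided by the number of test images
overall_acc = sum(confusion[i][i] for i in range(len(labels))) / sum(map(sum, confusion))

ranking = sorted(per_class_acc, key=per_class_acc.get, reverse=True)
print(per_class_acc)
print(f'overall accuracy: {overall_acc:.4f}')   # 0.8151
print(' > '.join(ranking))                      # jacket > coat > cardigan > blazer
```

The blazer row contains only 9 images, so a single misclassification moves its accuracy by about 11 percentage points. This is worth keeping in mind when comparing rankings across runs.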

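### Q3
(30%) Build an image classification model with ResNet50. Change the output dimension of the final fully connected layer to 4 to fit this task. Except for the final layer, initialize the model weights with the pretrained weights provided by torchvision (`torchvision.models.resnet50(pretrained=True)`). During training, freeze all weights except those of the final layer; in other words, training only adjusts the final fully connected layer. Image preprocessing is the same as in the previous question.

Train the model on train and use valid to choose the early-stopping epoch. The early-stopping patience is 20 epochs. Set the batch size to 32. Keep the model with the lowest valid loss, and compute accuracy, the confusion matrix, and per-class accuracy on test. You should consider both the SGD and ADAM optimizers. Tune the hyperparameters to achieve the best valid loss.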
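To get a feel for how small the trainable part of the network is in this setting, the sketch below counts the parameters of a fully connected layer. The helper name `linear_param_count` is made up for illustration; the 2048 input features are the width of ResNet50's final layer input.

```python
def linear_param_count(in_features, out_features, bias=True):
    """Number of parameters in a fully connected layer (weights plus optional bias)."""
    return in_features * out_features + (out_features if bias else 0)


# ResNet50's final layer takes a 2048-dimensional feature vector
head_params = linear_param_count(2048, 4)
print(head_params)   # 8196
```

In the `train` function of this notebook, this setting corresponds to `requires_grad_tf=False`: every loaded parameter is frozen first, and the newly attached `model.fc` is trainable by default.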



## Q3 Adam

In [None]:
for _learning_rate in [0.0001,0.0005,0.001]:

    print('-------- optimizer = Adam | learning rate =',_learning_rate, '--------\n')
    model, history = train(
        dataloaders['train'],
        dataloaders['val'],
        data,
        save_file_name=save_file_name,
        max_epochs_stop=20,
        n_epochs=200,
        print_every=20,
        learning_rate=_learning_rate,
        pretrained_tf=True,
        requires_grad_tf=False,
        _optimizer='Adam')
    test(model)

-------- optimizer = Adam | learning rate = 0.0001 --------



Downloading: "https://download.pytorch.org/models/resnet50-19c8e357.pth" to /root/.cache/torch/hub/checkpoints/resnet50-19c8e357.pth



Starting Training from Scratch.

Epoch: 19 	Training Loss: 0.8825 	Validation Loss: 0.9833
		Training Accuracy: 62.82%	 Validation Accuracy: 59.05%
Epoch: 39 	Training Loss: 0.8075 	Validation Loss: 0.9458
		Training Accuracy: 66.76%	 Validation Accuracy: 55.24%

Early Stopping! Total epochs: 48. Best epoch: 28 with loss: 0.93 and acc: 59.05%
762.91 total seconds elapsed. 15.57 seconds per epoch.

--- Per-class Accuracy ---
blazer accuracy: 0.5555555555555556
coat accuracy: 0.5813953488372093
cardigan accuracy: 0.5238095238095238
jacket accuracy: 0.7307692307692307

--- Testing Accuracy ---
accuracy: 0.6164383561643836

--- Confusion Matrix ---
          blazer  coat  cardigan  jacket
blazer         5     0         0       4
coat           3    25         4      11
cardigan       0     6        22      14
jacket         0     5         9      38


-------- optimizer = Adam | learning rate = 0.0005 --------

Starting Training from Scratch.

Epoch: 19 	Training Loss: 0.8329 	Validation 

## Q3 SGD

In [None]:
for _learning_rate in [0.0005,0.001,0.005]:

    print('-------- optimizer = SGD | learning rate =',_learning_rate, '--------\n')
    model, history = train(
        dataloaders['train'],
        dataloaders['val'],
        data,
        save_file_name=save_file_name,
        max_epochs_stop=20,
        n_epochs=200,
        print_every=20,
        learning_rate=_learning_rate,
        pretrained_tf=True,
        requires_grad_tf=False,
        _optimizer='SGD')
    test(model)

-------- optimizer = SGD | learning rate = 0.0005 --------

Starting Training from Scratch.

Epoch: 19 	Training Loss: 1.2586 	Validation Loss: 1.2774
		Training Accuracy: 40.73%	 Validation Accuracy: 33.33%
Epoch: 39 	Training Loss: 1.2268 	Validation Loss: 1.2406
		Training Accuracy: 42.65%	 Validation Accuracy: 36.19%
Epoch: 59 	Training Loss: 1.1928 	Validation Loss: 1.2094
		Training Accuracy: 46.88%	 Validation Accuracy: 41.90%
Epoch: 79 	Training Loss: 1.1719 	Validation Loss: 1.1765
		Training Accuracy: 49.76%	 Validation Accuracy: 45.71%
Epoch: 99 	Training Loss: 1.1342 	Validation Loss: 1.1484
		Training Accuracy: 51.78%	 Validation Accuracy: 48.57%
Epoch: 119 	Training Loss: 1.1205 	Validation Loss: 1.1183
		Training Accuracy: 53.99%	 Validation Accuracy: 56.19%
Epoch: 139 	Training Loss: 1.0867 	Validation Loss: 1.0941
		Training Accuracy: 54.66%	 Validation Accuracy: 57.14%
Epoch: 159 	Training Loss: 1.0450 	Validation Loss: 1.0722
		Training Accuracy: 57.25%	 Validation A

### Among all Q3 models, the highest testing accuracy comes from the model with learning rate = 0.001 and optimizer = SGD, at 63%.

### Q4
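(20%) Build an image classification model with ResNet50. Change the output dimension of the final fully connected layer to 4 to fit this task. Image preprocessing is the same as in the previous question. Do not initialize the model with pretrained weights. Train the model on train and use valid to choose the early-stopping epoch. The early-stopping patience is 20 epochs. Set the batch size to 32. Keep the model with the lowest valid loss, and compute accuracy, the confusion matrix, and per-class accuracy on test. You should consider both the SGD and ADAM optimizers. Tune the hyperparameters to achieve the best valid loss.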



## Q4 Adam

In [None]:
for _learning_rate in [0.0001,0.0005,0.001]:

    print('-------- optimizer = Adam | learning rate =',_learning_rate, '--------\n')
    model, history = train(
        dataloaders['train'],
        dataloaders['val'],
        data,
        save_file_name=save_file_name,
        max_epochs_stop=20,
        n_epochs=200,
        print_every=20,
        learning_rate=_learning_rate,
        pretrained_tf=False,
        requires_grad_tf=True,
        _optimizer='Adam')
    test(model)

-------- optimizer = Adam | learning rate = 0.0001 --------

Starting Training from Scratch.

Epoch: 19 	Training Loss: 1.1455 	Validation Loss: 1.2430
		Training Accuracy: 46.40%	 Validation Accuracy: 43.81%
Epoch: 39 	Training Loss: 0.9327 	Validation Loss: 1.2447
		Training Accuracy: 59.27%	 Validation Accuracy: 48.57%

Early Stopping! Total epochs: 43. Best epoch: 23 with loss: 1.18 and acc: 49.52%
774.59 total seconds elapsed. 17.60 seconds per epoch.

--- Per-class Accuracy ---
blazer accuracy: 0.0
coat accuracy: 0.2558139534883721
cardigan accuracy: 0.4523809523809524
jacket accuracy: 0.6730769230769231

--- Testing Accuracy ---
accuracy: 0.4452054794520548

--- Confusion Matrix ---
          blazer  coat  cardigan  jacket
blazer         0     2         3       4
coat           0    11         9      23
cardigan       0     6        19      17
jacket         0     7        10      35


-------- optimizer = Adam | learning rate = 0.0005 --------

Starting Training from Scratch.



## Q4 SGD

In [None]:
for _learning_rate in [0.0005,0.001,0.005]:

    print('-------- optimizer = SGD | learning rate =',_learning_rate, '--------\n')
    model, history = train(
        dataloaders['train'],
        dataloaders['val'],
        data,
        save_file_name=save_file_name,
        max_epochs_stop=20,
        n_epochs=200,
        print_every=20,
        learning_rate=_learning_rate,
        pretrained_tf=False,
        requires_grad_tf=True,
        _optimizer='SGD')
    test(model)

-------- optimizer = SGD | learning rate = 0.0005 --------

Starting Training from Scratch.

Epoch: 19 	Training Loss: 1.2800 	Validation Loss: 1.3355
		Training Accuracy: 39.10%	 Validation Accuracy: 34.29%
Epoch: 39 	Training Loss: 1.2760 	Validation Loss: 1.3045
		Training Accuracy: 39.48%	 Validation Accuracy: 34.29%
Epoch: 59 	Training Loss: 1.2711 	Validation Loss: 1.3402
		Training Accuracy: 39.58%	 Validation Accuracy: 33.33%

Early Stopping! Total epochs: 61. Best epoch: 41 with loss: 1.30 and acc: 37.14%
1071.36 total seconds elapsed. 17.28 seconds per epoch.

--- Per-class Accuracy ---
blazer accuracy: 0.0
coat accuracy: 0.0
cardigan accuracy: 0.0
jacket accuracy: 0.9615384615384616

--- Testing Accuracy ---
accuracy: 0.3424657534246575

--- Confusion Matrix ---
          blazer  coat  cardigan  jacket
blazer         0     0         0       9
coat           0     0         0      43
cardigan       0     0         0      42
jacket         0     2         0      50


-------- 

### Among all Q4 models, the highest testing accuracy comes from the model with learning rate = 0.0005 and optimizer = Adam, at 51.4%.

### Q5
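(10%) Summarize and discuss the predictive performance of Q2-Q4. Explain your observations.

### Across Q2 to Q4, I observed that accuracy is highest when the model uses pretrained weights and the weights other than the last layer are not frozen. When the pretrained weights are frozen, the model performs worse but can still broadly tell the classes apart. Without pretrained weights the model performs worst and can barely learn (the SGD model with lr=0.001 even classified everything as jacket).

### For hyperparameter tuning, I observed that SGD needs a larger learning rate and Adam a smaller one, and that Adam usually performs better.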
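One possible explanation for the learning-rate observation, offered as a sketch and not something the experiments above isolate: a plain SGD step scales with the gradient, whereas Adam's first bias-corrected step has a magnitude close to the learning rate regardless of the gradient's scale. The scalar functions and gradient values below are made up for illustration. The defaults for `beta1`, `beta2`, and `eps` follow the commonly used Adam settings.

```python
import math


def sgd_step(grad, lr):
    """Plain SGD update magnitude: proportional to the gradient."""
    return lr * grad


def adam_first_step(grad, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """Magnitude of Adam's first update, with bias correction."""
    m = (1 - beta1) * grad            # first-moment estimate after one step
    v = (1 - beta2) * grad ** 2       # second-moment estimate after one step
    m_hat = m / (1 - beta1)           # bias-corrected first moment
    v_hat = v / (1 - beta2)           # bias-corrected second moment
    return lr * m_hat / (math.sqrt(v_hat) + eps)


lr = 0.001
for grad in (0.01, 1.0):   # made-up gradient values
    print(grad, sgd_step(grad, lr), adam_first_step(grad, lr))
```

With a small gradient such as 0.01, the SGD step is about 100 times smaller than Adam's at the same learning rate, which is consistent with SGD needing a larger learning rate to make comparable progress.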