### Thesis notebook 4.4. - NOVA IMS

#### LSTM - Temporal data representation

In this notebook, we start applying temporal representations using LSTMs and bidirectional LSTMs.
The case for Deep Learning here is that the order of events in a sequence carries information, which Recurrent Neural Networks and, more specifically, Long Short-Term Memory (LSTM) units can extract.

#### First Step: Set up a PyTorch environment that enables the use of the GPU for training.

The following cell confirms that a GPU is available and makes it the default device.

In [1]:
import torch
import pycuda.driver as cuda

cuda.init()
#get the id of the default device
torch.cuda.current_device()
# 0
cuda.Device(0).name() #name of the GPU with id 0

#create all new tensors on the gpu by default
torch.set_default_tensor_type('torch.cuda.FloatTensor')

#### Second Step: Import the relevant packages and declare global variables

In [2]:
#import necessary modules/libraries
import numpy as np
import scipy
import pandas as pd
import datetime as dt
import warnings
import time

#tqdm to monitor progress
from tqdm.notebook import tqdm, trange
tqdm.pandas(desc="Progress")

#time related features
from datetime import timedelta
from copy import copy, deepcopy

#visualization
import matplotlib.pyplot as plt
import seaborn as sns

#imblearn, scalers, kfold and metrics
from imblearn.over_sampling import SMOTE
from sklearn.preprocessing import MinMaxScaler, StandardScaler, RobustScaler, QuantileTransformer,PowerTransformer
from sklearn.model_selection import train_test_split, RepeatedKFold, RepeatedStratifiedKFold, cross_val_score, GridSearchCV
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score, roc_auc_score, roc_curve, classification_report, average_precision_score, precision_recall_curve

#import torch related
import torch.nn as nn
from torch.nn import functional as F
from torch.autograd import Variable 
from torch.utils.data import TensorDataset, DataLoader
from torch.utils.data.sampler import SubsetRandomSampler


#and learning rate scheduler
from torch.optim.lr_scheduler import ReduceLROnPlateau

#suppress warnings
warnings.filterwarnings('ignore')

In [3]:
#global variables that may come in handy
#course threshold sets the % duration that will be considered (1 = 100%)
duration_threshold = [0.1, 0.25, 0.33, 0.5, 1]

#colors for visualizations
nova_ims_colors = ['#BFD72F', '#5C666C']

#standard color for student aggregates
student_color = '#474838'

#standard color for course aggregates
course_color = '#1B3D2F'

#standard continuous colormap
standard_cmap = 'viridis_r'

#Function designed to deal with multiindex and flatten it
def flattenHierarchicalCol(col,sep = '_'):
    '''converts multiindex columns into single index columns while retaining the hierarchical components'''
    if not type(col) is tuple:
        return col
    else:
        new_col = ''
        for leveli,level in enumerate(col):
            if not level == '':
                if not leveli == 0:
                    new_col += sep
                new_col += level
        return new_col
    
#number of replicas - number of repeats of stratified k fold - in this case 30
replicas = 30

#names to display on result figures
date_names = {
             'Date_threshold_10': '10% of Course Duration',   
             'Date_threshold_25': '25% of Course Duration', 
             'Date_threshold_33': '33% of Course Duration', 
             'Date_threshold_50': '50% of Course Duration', 
             'Date_threshold_100':'100% of Course Duration', 
            }

target_names = {
                'exam_fail' : 'At risk - Exam Grade',
                'final_fail' : 'At risk - Final Grade', 
                'exam_gifted' : 'High performer - Exam Grade', 
                'final_gifted': 'High performer - Final Grade'
                }

#targets
targets = ['exam_fail' , 'final_fail' , 'exam_gifted' , 'final_gifted']
temporal_columns = ['0 to 4%', '4 to 8%', '8 to 12%', '12 to 16%', '16 to 20%', '20 to 24%',
       '24 to 28%', '28 to 32%', '32 to 36%', '36 to 40%', '40 to 44%',
       '44 to 48%', '48 to 52%', '52 to 56%', '56 to 60%', '60 to 64%',
       '64 to 68%', '68 to 72%', '72 to 76%', '76 to 80%', '80 to 84%',
       '84 to 88%', '88 to 92%', '92 to 96%', '96 to 100%']
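
As a quick illustration of what `flattenHierarchicalCol` does, the sketch below applies the same logic to a few hypothetical column labels of the kind a `groupby().agg()` call produces. The labels are made up for the example and do not come from the thesis data.

```python
def flattenHierarchicalCol(col, sep='_'):
    """Same logic as the helper above: join the non-empty levels of a column tuple."""
    if not isinstance(col, tuple):
        return col
    new_col = ''
    for leveli, level in enumerate(col):
        if level != '':
            if leveli != 0:
                new_col += sep
            new_col += level
    return new_col

# Hypothetical column labels (assumption, for illustration only)
columns = [('userid', ''), ('clicks', 'sum'), ('clicks', 'mean'), 'course_encoding']
flat = [flattenHierarchicalCol(c) for c in columns]
print(flat)  # ['userid', 'clicks_sum', 'clicks_mean', 'course_encoding']
```

Plain string labels pass through unchanged, so the helper can be mapped over a mix of flat and hierarchical columns.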

#### Third Step: Import data and take a preliminary look at it

In [4]:
#imports dataframes
course_programs = pd.read_excel("../Data/Modeling Stage/Nova_IMS_Temporal_Datasets_25_splits.xlsx", 
                                dtype = {
                                    'course_encoding' : int,
                                    'userid' : int},
                               sheet_name = None)

#load the table with the targets
student_list = pd.read_csv('../Data/Modeling Stage/Nova_IMS_Filtered_targets.csv', 
                         dtype = {
                                   'course_encoding': int,
                                   'userid' : int,
                                   })

#merge the targets into each threshold dataframe and drop unused columns
for i in course_programs:
        
    #merge with the previously calculated targets
    course_programs[i] = course_programs[i].merge(student_list, on = ['course_encoding', 'userid'], how = 'inner')
    course_programs[i].drop(['Unnamed: 0', 'exam_mark', 'final_mark'], axis = 1, inplace = True)
    
    #convert the identifiers to object dtype
    course_programs[i]['course_encoding'] = course_programs[i]['course_encoding'].astype(object)
    course_programs[i]['userid'] = course_programs[i]['userid'].astype(object)

In [5]:
course_programs['Date_threshold_100'].info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 9296 entries, 0 to 9295
Data columns (total 31 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   course_encoding  9296 non-null   object
 1   userid           9296 non-null   object
 2   0 to 4%          9296 non-null   int64 
 3   4 to 8%          9296 non-null   int64 
 4   8 to 12%         9296 non-null   int64 
 5   12 to 16%        9296 non-null   int64 
 6   16 to 20%        9296 non-null   int64 
 7   20 to 24%        9296 non-null   int64 
 8   24 to 28%        9296 non-null   int64 
 9   28 to 32%        9296 non-null   int64 
 10  32 to 36%        9296 non-null   int64 
 11  36 to 40%        9296 non-null   int64 
 12  40 to 44%        9296 non-null   int64 
 13  44 to 48%        9296 non-null   int64 
 14  48 to 52%        9296 non-null   int64 
 15  52 to 56%        9296 non-null   int64 
 16  56 to 60%        9296 non-null   int64 
 17  60 to 64%        9296 non-null   

In [6]:
course_programs['Date_threshold_100'].describe(include = 'all')

|  | course_encoding | userid | 0 to 4% | 4 to 8% | 8 to 12% | 12 to 16% | 16 to 20% | 20 to 24% | 24 to 28% | 28 to 32% | ... | 76 to 80% | 80 to 84% | 84 to 88% | 88 to 92% | 92 to 96% | 96 to 100% | exam_fail | final_fail | exam_gifted | final_gifted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 | ... | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 | 9296.0 |
| unique | 138.0 | 1590.0 |  |  |  |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| top | 150.0 | 3178.0 |  |  |  |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| freq | 178.0 | 14.0 |  |  |  |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| mean |  |  | 1.081863 | 8.307874 | 10.752797 | 11.193739 | 10.127797 | 8.966652 | 10.545396 | 11.445245 | ... | 11.718051 | 13.136403 | 22.827883 | 27.341007 | 12.599613 | 0.0 | 0.201377 | 0.149957 | 0.276893 | 0.30809 |
| std |  |  | 3.526351 | 13.580025 | 13.626754 | 16.400023 | 14.291254 | 12.180177 | 13.507892 | 15.932226 | ... | 28.186874 | 36.690068 | 47.158607 | 54.963959 | 35.194597 | 0.0 | 0.401051 | 0.357048 | 0.447487 | 0.461729 |
| min |  |  | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% |  |  | 0.0 | 0.0 | 1.0 | 2.0 | 2.0 | 1.0 | 2.0 | 3.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 50% |  |  | 0.0 | 2.0 | 7.0 | 7.0 | 6.0 | 5.0 | 7.0 | 7.0 | ... | 2.0 | 2.0 | 4.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 75% |  |  | 1.0 | 12.0 | 15.0 | 15.0 | 13.0 | 13.0 | 14.0 | 14.0 | ... | 10.0 | 10.0 | 23.0 | 27.0 | 5.0 | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 |


In our first attempt, we use the absolute number of clicks made by each student in each time interval, scaled with a standard scaler.
We can therefore start by placing the course encoding/userid pairs into the index.

In [7]:
def normalize(train, test, scaler):
    '''fits the chosen scaler on train only and applies it to both train and test'''
    
    if scaler == 'MinMax':
        pt = MinMaxScaler()
    elif scaler == 'Standard':
        pt = StandardScaler()
    elif scaler == 'Robust':
        pt = RobustScaler()
    elif scaler == 'Quantile':
        pt = QuantileTransformer()
    else:
        #any other value falls back to a Yeo-Johnson power transform
        pt = PowerTransformer(method='yeo-johnson')
    
    #fit on the training split only, so no validation statistics leak into training
    data_train = pt.fit_transform(train)
    data_test = pt.transform(test)
    # convert the arrays back to dataframes (the original index is not preserved)
    normalized_train = pd.DataFrame(data_train,columns=train.columns)
    normalized_test = pd.DataFrame(data_test,columns=test.columns)
        
    return normalized_train, normalized_test
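
To make the "fit on train, apply to both" idea concrete, here is a minimal NumPy sketch of what standard scaling does under the hood. The toy click counts are randomly generated (an assumption for illustration), and the arithmetic mirrors `StandardScaler` only in its basic form: column mean and population standard deviation taken from the training split.

```python
import numpy as np

rng = np.random.default_rng(15)
train = rng.poisson(lam=10, size=(8, 3)).astype(float)  # toy click counts (made up)
test = rng.poisson(lam=10, size=(4, 3)).astype(float)

# statistics come from the training split only
mean = train.mean(axis=0)
std = train.std(axis=0)
std[std == 0] = 1.0  # guard against constant columns

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuses the training statistics

print(train_scaled.mean(axis=0).round(6))  # approximately zero for every column
```

The validation split is generally not centred at exactly zero after this step, which is expected: it is scaled with statistics it did not contribute to.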

#### Implementing Cross-Validation with Deep Learning Model

**1. Create the Deep Learning Model**

In this instance, we follow the approach used in Chen & Cui: CrossEntropyLoss applied over a softmax layer. Since `nn.CrossEntropyLoss` applies the log-softmax internally, the model itself returns the raw (pre-softmax) scores.
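
The relationship between the raw scores, the softmax and the cross-entropy loss can be checked by hand. The following is a small NumPy sketch, not the notebook's training code; the logits and labels are invented for the example.

```python
import numpy as np

def softmax(logits):
    # subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    # mean negative log-probability assigned to the true class
    probs = softmax(logits)
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

logits = np.array([[2.0, 0.0],   # confident in class 0
                   [0.0, 0.0]])  # undecided
labels = np.array([0, 1])

loss = cross_entropy(logits, labels)
print(loss)
```

The first sample contributes `log(1 + exp(-2))` and the second `log(2)`, so the loss is their average. This is the same quantity a softmax layer followed by a negative log-likelihood would produce.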

In [8]:
class LSTM_Uni(nn.Module):
    def __init__(self, num_classes, input_size, hidden_size, num_layers, seq_length):
        super(LSTM_Uni, self).__init__()
        self.num_classes = num_classes #number of classes
        self.num_layers = num_layers #number of layers
        self.input_size = input_size #input size
        self.hidden_size = hidden_size #hidden state
        self.seq_length = seq_length #sequence length

        self.lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                          num_layers=num_layers, batch_first = True) #lstm
        
        self.dropout = nn.Dropout(p = 0.5)
    
        self.fc = nn.Linear(self.hidden_size, num_classes) #fully connected last layer

    def forward(self,x):
        #initial hidden and internal states, created on the same device as the input
        h_0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device) #hidden state
        c_0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size, device=x.device) #internal state
        
        #Xavier init for both h_0 and c_0 (re-sampled on every forward pass)
        torch.nn.init.xavier_normal_(h_0)
        torch.nn.init.xavier_normal_(c_0)
        
        # Propagate input through LSTM
        lstm_out, (hn, cn) = self.lstm(x, (h_0, c_0)) #lstm with input, hidden, and internal state
        last_output = hn[-1] #final hidden state of the last layer, shape (batch, hidden_size)
        
        drop_out = self.dropout(last_output)
        pre_softmax = self.fc(drop_out) #raw scores; the softmax is applied inside the loss
        return pre_softmax
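
The shapes flowing through the LSTM are easy to get wrong, so the sketch below checks them on random input. It runs on the CPU with PyTorch's default zero initial states, which is an assumption made for simplicity and differs from the Xavier-initialised states used in the class above.

```python
import torch
import torch.nn as nn

torch.manual_seed(15)
batch, seq_len, n_features, hidden = 4, 25, 1, 40

lstm = nn.LSTM(input_size=n_features, hidden_size=hidden,
               num_layers=1, batch_first=True)

x = torch.randn(batch, seq_len, n_features)  # (rows, timesteps, features)
lstm_out, (hn, cn) = lstm(x)                 # zero initial states by default

print(lstm_out.shape)  # torch.Size([4, 25, 40]) - one hidden vector per timestep
print(hn.shape)        # torch.Size([1, 4, 40])  - (num_layers, batch, hidden)

# for a single-layer, unidirectional LSTM the final hidden state
# is the output at the last timestep
same = torch.allclose(hn[-1], lstm_out[:, -1, :])
print(same)  # True
```

With more than one stacked layer, `hn` holds one slice per layer, so flattening it with `view(-1, hidden_size)` would mix layers into the batch dimension. Indexing the last layer avoids that.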

**2. Define the train and validation Functions**

In [9]:
def train_epoch(model,dataloader,loss_fn,optimizer):
    
    train_loss,train_correct=0.0,0 
    model.train()
    for X, labels in dataloader:

        optimizer.zero_grad()
        output = model(X)
        loss = loss_fn(output,labels)
        loss.backward()
        optimizer.step()
        train_loss += loss.item() * X.size(0)
        scores, predictions = torch.max(F.log_softmax(output.data, dim=1), 1)
        train_correct += (predictions == labels).sum().item()
        
    return train_loss,train_correct
  
def valid_epoch(model,dataloader,loss_fn):
    valid_loss, val_correct = 0.0, 0
    targets = []
    y_pred = []
    probability_1 = []
    
    model.eval()
    with torch.no_grad(): #no gradients are needed for validation
        for X, labels in dataloader:

            output = model(X)
            loss=loss_fn(output,labels)
            valid_loss+=loss.item()*X.size(0)
            probability_1.append(F.softmax(output, dim=1)[:,1]) #probability of the positive class
            predictions = torch.argmax(output, dim=1)
            val_correct+=(predictions == labels).sum().item()
            targets.append(labels)
            y_pred.append(predictions)
    
    #concat all results
    targets = torch.cat(targets).data.cpu().numpy()
    y_pred = torch.cat(y_pred).data.cpu().numpy()
    probability_1 = torch.cat(probability_1).data.cpu().numpy()
    
    #calculate precision, recall and AUC score
    
    precision = precision_score(targets, y_pred)
    recall = recall_score(targets, y_pred)
    auroc = roc_auc_score(targets, probability_1)
    
    #return all
    return valid_loss,val_correct, precision, recall, auroc
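
Precision and recall are tracked alongside accuracy because the targets are imbalanced: `exam_fail`, for instance, has a mean of roughly 0.20 in the summary above. The toy sketch below (made-up labels, not thesis data) shows what a degenerate classifier that flags every student scores on such a split.

```python
import numpy as np

y_true = np.array([1] * 2 + [0] * 8)  # 20% positives (toy data)
y_pred = np.ones_like(y_true)         # predicts "at risk" for everyone

tp = int(((y_pred == 1) & (y_true == 1)).sum())  # true positives
fp = int(((y_pred == 1) & (y_true == 0)).sum())  # false positives
fn = int(((y_pred == 0) & (y_true == 1)).sum())  # false negatives

accuracy = float((y_pred == y_true).mean())
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(accuracy, precision, recall)  # 0.2 0.2 1.0
```

An accuracy close to the positive-class share, such as the 20.16% seen in the early epochs of the training log, is consistent with this kind of constant predictor, although the log alone does not confirm it.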

**3. Define main hyperparameters of the model, including splits**

In [10]:
#Model
num_epochs = 200 #200 epochs
learning_rate = 0.01 #initial learning rate of 0.01
input_size = 1 #number of features
hidden_size = 40 #number of features in hidden state
num_layers = 1 #number of stacked lstm layers

#Shape of Output as required for SoftMax Classifier
num_classes = 2 #output shape

batch_size = 32

k=10
splits= RepeatedStratifiedKFold(n_splits=k, n_repeats=replicas, random_state=15) #kfold of 10 with 30 replicas
criterion = nn.CrossEntropyLoss()    # cross-entropy for classification
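
These settings imply a sizeable training budget. The arithmetic below uses only values defined in this notebook (`k`, `replicas`, `num_epochs` and the four targets) to count how many models and epochs a full run involves.

```python
k = 10            # folds per repeat
replicas = 30     # repeats of the stratified k-fold
num_epochs = 200  # epochs per fold
n_targets = 4     # exam_fail, final_fail, exam_gifted, final_gifted

folds_per_target = k * replicas                       # models trained per target
epochs_per_target = folds_per_target * num_epochs     # epochs per target
epochs_per_threshold = epochs_per_target * n_targets  # epochs per duration threshold

print(folds_per_target, epochs_per_target, epochs_per_threshold)  # 300 60000 240000
```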

**4. Make the splits and Start Training**

In a previous training session, we completed the 10% threshold. The loop below therefore starts from the second threshold, so training does not have to be repeated from scratch after a shutdown or restart.
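
Skipping finished thresholds can also be automated by checking which result workbooks already exist. The helper below is a hypothetical sketch, not part of the notebook: `pending_thresholds` is an invented name, and it assumes the file naming pattern used when the results are written at the end of the training loop.

```python
import tempfile
from pathlib import Path


def pending_thresholds(names, results_dir, replicas):
    """Return the thresholds whose results workbook does not exist yet."""
    results_dir = Path(results_dir)
    return [
        name for name in names
        if not (results_dir / f"SMOTE_25_splits_{name}_{replicas}_replicas.xlsx").exists()
    ]


# toy demonstration in a temporary folder
names = ['Date_threshold_10', 'Date_threshold_25', 'Date_threshold_33']
with tempfile.TemporaryDirectory() as tmp:
    # pretend the 10% threshold finished in an earlier session
    (Path(tmp) / 'SMOTE_25_splits_Date_threshold_10_30_replicas.xlsx').touch()
    todo = pending_thresholds(names, tmp, replicas=30)

print(todo)  # ['Date_threshold_25', 'Date_threshold_33']
```

Compared with slicing the keys by position, this keeps working when more than one threshold has finished or when the order of the sheets changes.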

In [None]:
for i in tqdm(list(course_programs.keys())[1:]):
    
    print(i)
    threshold_dict = {} #dict to store information in for each threshold
    data = deepcopy(course_programs[i])
    
    data.set_index(['course_encoding', 'userid'], drop = True, inplace = True)
    data.fillna(0, inplace = True)
    
    #set X and Y columns
    X = data[data.columns[:25]] #different timesteps
    y = data[data.columns[-4:]] #the 4 different putative targets
    
    for k in tqdm(targets):
        print(k)
        
        #Start with train test split
        X_train_val, X_test, y_train_val, y_test, = train_test_split(
                                    X,
                                   y[k], #replace when going for multi-target 
                                   test_size = 0.20,
                                   random_state = 15,
                                   shuffle=True,
                                   stratify = y[k] #replace when going for multi-target
                                    )
        
        #create dict to store fold performance
        foldperf={}
        
        #reset "best accuracy for threshold i and target k"
        best_accuracy = 0

        #make train_val split
        for fold, (train_idx,val_idx) in tqdm(enumerate(splits.split(X_train_val, y_train_val))):

            print('Split {}'.format(fold + 1))
            
            #make split between train and Val
            X_train, y_train = X_train_val.iloc[train_idx], y_train_val.iloc[train_idx]
            X_val, y_val = X_train_val.iloc[val_idx], y_train_val.iloc[val_idx]
            
            #apply SMOTE to training split
            over = SMOTE()
            X_train, y_train = over.fit_resample(X_train, y_train)
            
            #apply scaling after 
            X_train, X_val = normalize(X_train, X_val, 'Standard')
            
            #second, convert everything to pytorch tensors - these are later wrapped in tensor datasets and dataloaders
            X_train_tensors = Variable(torch.Tensor(X_train.values))
            X_val_tensors = Variable(torch.Tensor(X_val.values))

            y_train_tensors = Variable(torch.Tensor(y_train.values))
            y_val_tensors = Variable(torch.Tensor(y_val.values)) 

            #reshaping to rows, timestamps, features 
            X_train_tensors = torch.reshape(X_train_tensors,   (X_train_tensors.shape[0], X_train_tensors.shape[1], 1))
            X_val_tensors = torch.reshape(X_val_tensors,  (X_val_tensors.shape[0], X_val_tensors.shape[1], 1))
        
            #convert y tensors to format longtensor
            y_train_tensors = y_train_tensors.type(torch.cuda.LongTensor)
            y_val_tensors = y_val_tensors.type(torch.cuda.LongTensor)
            
            #create Tensor Datasets and dataloaders for both Train and Val
            train_dataset = TensorDataset(X_train_tensors, y_train_tensors)
            val_dataset = TensorDataset(X_val_tensors, y_val_tensors)
            train_loader = DataLoader(train_dataset, batch_size=batch_size)
            val_loader = DataLoader(val_dataset, batch_size=batch_size)
    
            #creates a new model for each fold
            model = LSTM_Uni(num_classes, input_size, hidden_size, num_layers, X_train_tensors.shape[1]).to('cuda') #our lstm class
            optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate) 
            scheduler = ReduceLROnPlateau(optimizer, 
                                  'min', 
                                  patience = 10,
                                  cooldown = 20,
                                 verbose = True)
    
            history = {'train_loss': [], 'val_loss': [],'train_acc':[],'val_acc':[], 'precision': [],
                      'recall' : [], 'auroc': []}

            for epoch in tqdm(range(num_epochs)):
                train_loss, train_correct=train_epoch(model,train_loader,criterion,optimizer)
                val_loss, val_correct, precision, recall, auroc = valid_epoch(model,val_loader,criterion)

                train_loss = train_loss / len(train_loader.sampler)
                train_acc = train_correct / len(train_loader.sampler) * 100
                val_loss = val_loss / len(val_loader.sampler)
                val_acc = val_correct / len(val_loader.sampler) * 100
        
        
                if (epoch+1) % 10 == 0: 
                    print("Epoch:{}/{} AVG Training Loss:{:.3f} AVG Validation Loss:{:.3f} AVG Training Acc {:.2f} % AVG Validation Acc {:.2f} %".format(epoch + 1,
                                                                                                             num_epochs,
                                                                                                             train_loss,
                                                                                                             val_loss,
                                                                                                             train_acc,
                                                                                                             val_acc))
                history['train_loss'].append(train_loss)
                history['val_loss'].append(val_loss)
                history['train_acc'].append(train_acc)
                history['val_acc'].append(val_acc)
                history['precision'].append(precision)
                history['recall'].append(recall)
                history['auroc'].append(auroc)
                scheduler.step(val_loss)
    
                if val_acc > best_accuracy:
            
                #replace best accuracy and save best model
                    print(f'New Best Accuracy found: {val_acc:.2f}%\nEpoch: {epoch + 1}')
                    best_accuracy = val_acc
                    best = deepcopy(model)
                    curr_epoch = epoch + 1
                    
            #store fold performance
            foldperf['fold{}'.format(fold+1)] = history
        
        #saves fold performance for target 
        threshold_dict[k] = pd.DataFrame.from_dict(foldperf, orient='index') # convert dict to dataframe
        
        #explode to get each epoch as a row
        threshold_dict[k] = threshold_dict[k].explode(list(threshold_dict[k].columns))
        torch.save(best,f"../Models/{i}/SMOTE_Nova_IMS_best_{k}_{curr_epoch}_epochs.h")
        
    #write one sheet per target to the results workbook of this threshold
    with pd.ExcelWriter(f"../Data/Modeling Stage/Results/IMS/Clicks per % duration/SMOTE_25_splits_{i}_{replicas}_replicas.xlsx") as writer:  
        for sheet in targets:
                threshold_dict[sheet].to_excel(writer, sheet_name=str(sheet))

  0%|          | 0/4 [00:00<?, ?it/s]

Date_threshold_25


  0%|          | 0/4 [00:00<?, ?it/s]

exam_fail


0it [00:00, ?it/s]

Split 1


  0%|          | 0/200 [00:00<?, ?it/s]

New Best Accuracy found: 20.16%
Epoch: 1
Epoch:10/200 AVG Training Loss:0.538 AVG Validation Loss:2.967 AVG Training Acc 79.00 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.574 AVG Validation Loss:3.732 AVG Training Acc 78.06 % AVG Validation Acc 20.16 %
Epoch    28: reducing learning rate of group 0 to 1.0000e-03.
New Best Accuracy found: 20.43%
Epoch: 29
Epoch:30/200 AVG Training Loss:0.708 AVG Validation Loss:2.037 AVG Training Acc 72.88 % AVG Validation Acc 20.43 %
Epoch:40/200 AVG Training Loss:0.643 AVG Validation Loss:1.235 AVG Training Acc 65.71 % AVG Validation Acc 20.30 %
New Best Accuracy found: 20.97%
Epoch: 49
Epoch:50/200 AVG Training Loss:0.651 AVG Validation Loss:1.095 AVG Training Acc 63.27 % AVG Validation Acc 20.70 %
New Best Accuracy found: 21.24%
Epoch: 53
New Best Accuracy found: 21.37%
Epoch: 55
Epoch:60/200 AVG Training Loss:0.622 AVG Validation Loss:1.212 AVG Training Acc 66.73 % AVG Validation Acc 20.70 %
Epoch    64: reducing learning rate of 

  0%|          | 0/200 [00:00<?, ?it/s]

Epoch:10/200 AVG Training Loss:0.530 AVG Validation Loss:3.689 AVG Training Acc 76.93 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.470 AVG Validation Loss:4.454 AVG Training Acc 79.58 % AVG Validation Acc 20.16 %
Epoch:30/200 AVG Training Loss:0.616 AVG Validation Loss:1.798 AVG Training Acc 69.45 % AVG Validation Acc 20.16 %
Epoch:40/200 AVG Training Loss:0.622 AVG Validation Loss:1.791 AVG Training Acc 69.06 % AVG Validation Acc 20.16 %
Epoch:50/200 AVG Training Loss:0.583 AVG Validation Loss:7.133 AVG Training Acc 81.61 % AVG Validation Acc 20.16 %
Epoch    56: reducing learning rate of group 0 to 1.0000e-03.
Epoch:60/200 AVG Training Loss:0.693 AVG Validation Loss:0.977 AVG Training Acc 56.85 % AVG Validation Acc 23.39 %
Epoch:70/200 AVG Training Loss:0.679 AVG Validation Loss:0.926 AVG Training Acc 58.79 % AVG Validation Acc 28.76 %
Epoch:80/200 AVG Training Loss:0.664 AVG Validation Loss:0.936 AVG Training Acc 61.31 % AVG Validation Acc 35.08 %
Epoch    87: reduc

  0%|          | 0/200 [00:00<?, ?it/s]

Epoch:10/200 AVG Training Loss:0.546 AVG Validation Loss:3.848 AVG Training Acc 79.79 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.561 AVG Validation Loss:3.854 AVG Training Acc 81.72 % AVG Validation Acc 20.16 %
Epoch    23: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.694 AVG Validation Loss:0.814 AVG Training Acc 53.22 % AVG Validation Acc 24.33 %
Epoch:40/200 AVG Training Loss:0.679 AVG Validation Loss:0.861 AVG Training Acc 56.71 % AVG Validation Acc 25.00 %
Epoch:50/200 AVG Training Loss:0.668 AVG Validation Loss:0.826 AVG Training Acc 59.22 % AVG Validation Acc 29.97 %
Epoch    54: reducing learning rate of group 0 to 1.0000e-04.
Epoch:60/200 AVG Training Loss:0.665 AVG Validation Loss:0.724 AVG Training Acc 59.30 % AVG Validation Acc 47.85 %
Epoch:70/200 AVG Training Loss:0.659 AVG Validation Loss:0.685 AVG Training Acc 60.17 % AVG Validation Acc 57.66 %
Epoch:80/200 AVG Training Loss:0.657 AVG Validation Loss:0.676 AVG Trai

  0%|          | 0/200 [00:00<?, ?it/s]

Epoch:10/200 AVG Training Loss:0.554 AVG Validation Loss:3.869 AVG Training Acc 77.78 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.509 AVG Validation Loss:4.751 AVG Training Acc 80.07 % AVG Validation Acc 20.16 %
Epoch    25: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.697 AVG Validation Loss:0.818 AVG Training Acc 52.36 % AVG Validation Acc 26.61 %
Epoch:40/200 AVG Training Loss:0.681 AVG Validation Loss:0.844 AVG Training Acc 56.94 % AVG Validation Acc 29.70 %
Epoch:50/200 AVG Training Loss:0.675 AVG Validation Loss:0.858 AVG Training Acc 58.19 % AVG Validation Acc 30.51 %
Epoch    56: reducing learning rate of group 0 to 1.0000e-04.
Epoch:60/200 AVG Training Loss:0.677 AVG Validation Loss:0.761 AVG Training Acc 58.28 % AVG Validation Acc 42.61 %
Epoch:70/200 AVG Training Loss:0.664 AVG Validation Loss:0.699 AVG Training Acc 60.43 % AVG Validation Acc 51.75 %
Epoch:80/200 AVG Training Loss:0.663 AVG Validation Loss:0.688 AVG Trai

  0%|          | 0/200 [00:00<?, ?it/s]

Epoch:10/200 AVG Training Loss:0.530 AVG Validation Loss:5.106 AVG Training Acc 80.36 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.519 AVG Validation Loss:4.101 AVG Training Acc 83.05 % AVG Validation Acc 20.16 %
Epoch    28: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.791 AVG Validation Loss:0.980 AVG Training Acc 49.64 % AVG Validation Acc 20.30 %
Epoch:40/200 AVG Training Loss:0.686 AVG Validation Loss:0.836 AVG Training Acc 55.96 % AVG Validation Acc 34.81 %
Epoch:50/200 AVG Training Loss:0.675 AVG Validation Loss:0.820 AVG Training Acc 57.40 % AVG Validation Acc 35.48 %
Epoch:60/200 AVG Training Loss:0.669 AVG Validation Loss:0.830 AVG Training Acc 59.69 % AVG Validation Acc 38.44 %
Epoch    64: reducing learning rate of group 0 to 1.0000e-04.
Epoch:70/200 AVG Training Loss:0.661 AVG Validation Loss:0.736 AVG Training Acc 60.27 % AVG Validation Acc 48.12 %
Epoch:80/200 AVG Training Loss:0.655 AVG Validation Loss:0.694 AVG Trai

  0%|          | 0/200 [00:00<?, ?it/s]

Epoch:10/200 AVG Training Loss:0.559 AVG Validation Loss:4.379 AVG Training Acc 77.15 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.648 AVG Validation Loss:1.598 AVG Training Acc 65.93 % AVG Validation Acc 20.16 %
Epoch:30/200 AVG Training Loss:0.485 AVG Validation Loss:5.460 AVG Training Acc 82.61 % AVG Validation Acc 20.16 %
Epoch    31: reducing learning rate of group 0 to 1.0000e-03.
Epoch:40/200 AVG Training Loss:0.696 AVG Validation Loss:0.814 AVG Training Acc 52.12 % AVG Validation Acc 24.19 %
Epoch:50/200 AVG Training Loss:0.690 AVG Validation Loss:0.807 AVG Training Acc 53.94 % AVG Validation Acc 26.48 %
Epoch:60/200 AVG Training Loss:0.676 AVG Validation Loss:0.814 AVG Training Acc 58.47 % AVG Validation Acc 30.91 %
Epoch    62: reducing learning rate of group 0 to 1.0000e-04.
Epoch:70/200 AVG Training Loss:0.672 AVG Validation Loss:0.723 AVG Training Acc 57.70 % AVG Validation Acc 47.45 %
Epoch:80/200 AVG Training Loss:0.665 AVG Validation Loss:0.689 AVG Trai

  0%|          | 0/200 [00:00<?, ?it/s]

Epoch:10/200 AVG Training Loss:0.548 AVG Validation Loss:3.534 AVG Training Acc 78.27 % AVG Validation Acc 20.05 %
Epoch:20/200 AVG Training Loss:0.584 AVG Validation Loss:2.533 AVG Training Acc 75.14 % AVG Validation Acc 20.05 %
Epoch:30/200 AVG Training Loss:0.631 AVG Validation Loss:1.820 AVG Training Acc 67.57 % AVG Validation Acc 20.05 %
Epoch:40/200 AVG Training Loss:0.634 AVG Validation Loss:1.685 AVG Training Acc 68.78 % AVG Validation Acc 20.05 %
Epoch:50/200 AVG Training Loss:0.641 AVG Validation Loss:1.614 AVG Training Acc 66.16 % AVG Validation Acc 20.05 %
Epoch    57: reducing learning rate of group 0 to 1.0000e-03.
Epoch:60/200 AVG Training Loss:0.728 AVG Validation Loss:0.869 AVG Training Acc 51.86 % AVG Validation Acc 23.55 %
Epoch:70/200 AVG Training Loss:0.672 AVG Validation Loss:0.776 AVG Training Acc 59.18 % AVG Validation Acc 38.76 %
Epoch:80/200 AVG Training Loss:0.663 AVG Validation Loss:0.772 AVG Training Acc 60.00 % AVG Validation Acc 45.09 %
Epoch:90/200 AVG T

  0%|          | 0/200 [00:00<?, ?it/s]

Epoch:10/200 AVG Training Loss:0.517 AVG Validation Loss:4.331 AVG Training Acc 80.93 % AVG Validation Acc 20.05 %
Epoch:20/200 AVG Training Loss:0.559 AVG Validation Loss:7.378 AVG Training Acc 77.54 % AVG Validation Acc 20.05 %
Epoch    22: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.690 AVG Validation Loss:0.781 AVG Training Acc 54.71 % AVG Validation Acc 32.03 %
Epoch:40/200 AVG Training Loss:0.678 AVG Validation Loss:0.795 AVG Training Acc 57.76 % AVG Validation Acc 30.55 %
Epoch:50/200 AVG Training Loss:0.669 AVG Validation Loss:0.803 AVG Training Acc 59.11 % AVG Validation Acc 32.17 %
Epoch    53: reducing learning rate of group 0 to 1.0000e-04.
Epoch:60/200 AVG Training Loss:0.665 AVG Validation Loss:0.707 AVG Training Acc 59.19 % AVG Validation Acc 51.28 %
Epoch:70/200 AVG Training Loss:0.658 AVG Validation Loss:0.672 AVG Training Acc 60.94 % AVG Validation Acc 59.35 %
Epoch:80/200 AVG Training Loss:0.660 AVG Validation Loss:0.666 AVG Trai

  0%|          | 0/200 [00:00<?, ?it/s]

Epoch:10/200 AVG Training Loss:0.515 AVG Validation Loss:6.271 AVG Training Acc 80.58 % AVG Validation Acc 20.05 %
Epoch:20/200 AVG Training Loss:0.541 AVG Validation Loss:5.918 AVG Training Acc 78.62 % AVG Validation Acc 20.05 %
Epoch:30/200 AVG Training Loss:0.533 AVG Validation Loss:8.971 AVG Training Acc 80.44 % AVG Validation Acc 20.05 %
Epoch    34: reducing learning rate of group 0 to 1.0000e-03.
Epoch:40/200 AVG Training Loss:0.699 AVG Validation Loss:0.780 AVG Training Acc 51.01 % AVG Validation Acc 26.78 %
Epoch:50/200 AVG Training Loss:0.676 AVG Validation Loss:0.827 AVG Training Acc 57.96 % AVG Validation Acc 37.15 %
Epoch:60/200 AVG Training Loss:0.666 AVG Validation Loss:0.817 AVG Training Acc 60.30 % AVG Validation Acc 42.26 %
Epoch    65: reducing learning rate of group 0 to 1.0000e-04.
Epoch:70/200 AVG Training Loss:0.663 AVG Validation Loss:0.723 AVG Training Acc 60.40 % AVG Validation Acc 52.49 %
Epoch:80/200 AVG Training Loss:0.654 AVG Validation Loss:0.677 AVG Trai

  0%|          | 0/200 [00:00<?, ?it/s]

Epoch:10/200 AVG Training Loss:0.537 AVG Validation Loss:3.023 AVG Training Acc 77.54 % AVG Validation Acc 20.19 %
Epoch:20/200 AVG Training Loss:0.543 AVG Validation Loss:3.323 AVG Training Acc 80.78 % AVG Validation Acc 20.19 %
Epoch:30/200 AVG Training Loss:0.585 AVG Validation Loss:2.303 AVG Training Acc 74.90 % AVG Validation Acc 20.19 %
Epoch    34: reducing learning rate of group 0 to 1.0000e-03.
Epoch:40/200 AVG Training Loss:0.696 AVG Validation Loss:0.800 AVG Training Acc 52.16 % AVG Validation Acc 26.11 %
Epoch:50/200 AVG Training Loss:0.676 AVG Validation Loss:0.811 AVG Training Acc 57.77 % AVG Validation Acc 34.72 %
Epoch:60/200 AVG Training Loss:0.668 AVG Validation Loss:0.821 AVG Training Acc 59.08 % AVG Validation Acc 38.09 %
Epoch    65: reducing learning rate of group 0 to 1.0000e-04.
Epoch:70/200 AVG Training Loss:0.665 AVG Validation Loss:0.745 AVG Training Acc 59.29 % AVG Validation Acc 51.14 %
Epoch:80/200 AVG Training Loss:0.655 AVG Validation Loss:0.702 AVG Trai

  0%|          | 0/200 [00:00<?, ?it/s]

Epoch:10/200 AVG Training Loss:0.485 AVG Validation Loss:6.890 AVG Training Acc 81.72 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.529 AVG Validation Loss:8.631 AVG Training Acc 78.66 % AVG Validation Acc 20.16 %
Epoch    26: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.700 AVG Validation Loss:0.797 AVG Training Acc 51.99 % AVG Validation Acc 24.60 %
Epoch:40/200 AVG Training Loss:0.690 AVG Validation Loss:0.781 AVG Training Acc 53.93 % AVG Validation Acc 25.40 %
Epoch:50/200 AVG Training Loss:0.680 AVG Validation Loss:0.794 AVG Training Acc 56.71 % AVG Validation Acc 29.03 %
Epoch    57: reducing learning rate of group 0 to 1.0000e-04.
Epoch:60/200 AVG Training Loss:0.681 AVG Validation Loss:0.750 AVG Training Acc 55.71 % AVG Validation Acc 38.98 %
Epoch:70/200 AVG Training Loss:0.667 AVG Validation Loss:0.684 AVG Training Acc 59.30 % AVG Validation Acc 51.08 %
Epoch:80/200 AVG Training Loss:0.666 AVG Validation Loss:0.671 AVG Trai

  0%|          | 0/200 [00:00<?, ?it/s]

Epoch:10/200 AVG Training Loss:0.530 AVG Validation Loss:2.383 AVG Training Acc 80.80 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.575 AVG Validation Loss:4.073 AVG Training Acc 71.24 % AVG Validation Acc 20.16 %
Epoch    28: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.844 AVG Validation Loss:1.015 AVG Training Acc 50.04 % AVG Validation Acc 20.83 %
Epoch:40/200 AVG Training Loss:0.681 AVG Validation Loss:0.789 AVG Training Acc 56.93 % AVG Validation Acc 29.30 %
Epoch:50/200 AVG Training Loss:0.676 AVG Validation Loss:0.789 AVG Training Acc 57.89 % AVG Validation Acc 30.11 %
Epoch    59: reducing learning rate of group 0 to 1.0000e-04.
Epoch:60/200 AVG Training Loss:0.681 AVG Validation Loss:0.781 AVG Training Acc 55.71 % AVG Validation Acc 34.81 %
Epoch:70/200 AVG Training Loss:0.663 AVG Validation Loss:0.720 AVG Training Acc 59.67 % AVG Validation Acc 49.33 %
Epoch:80/200 AVG Training Loss:0.662 AVG Validation Loss:0.702 AVG Trai

Epoch:10/200 AVG Training Loss:0.496 AVG Validation Loss:4.803 AVG Training Acc 82.05 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.603 AVG Validation Loss:4.003 AVG Training Acc 69.39 % AVG Validation Acc 20.16 %
Epoch    29: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:2.286 AVG Validation Loss:1.545 AVG Training Acc 46.03 % AVG Validation Acc 20.16 %
Epoch:40/200 AVG Training Loss:0.682 AVG Validation Loss:0.834 AVG Training Acc 55.69 % AVG Validation Acc 25.00 %
Epoch:50/200 AVG Training Loss:0.672 AVG Validation Loss:0.877 AVG Training Acc 58.62 % AVG Validation Acc 28.49 %
Epoch:60/200 AVG Training Loss:0.663 AVG Validation Loss:0.866 AVG Training Acc 59.93 % AVG Validation Acc 33.06 %
Epoch    60: reducing learning rate of group 0 to 1.0000e-04.
Epoch:70/200 AVG Training Loss:0.656 AVG Validation Loss:0.702 AVG Training Acc 60.39 % AVG Validation Acc 54.03 %
Epoch:80/200 AVG Training Loss:0.651 AVG Validation Loss:0.685 AVG Trai

Epoch:10/200 AVG Training Loss:0.528 AVG Validation Loss:3.422 AVG Training Acc 79.22 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.635 AVG Validation Loss:1.756 AVG Training Acc 68.48 % AVG Validation Acc 20.16 %
Epoch:30/200 AVG Training Loss:0.550 AVG Validation Loss:3.909 AVG Training Acc 80.87 % AVG Validation Acc 20.03 %
Epoch    34: reducing learning rate of group 0 to 1.0000e-03.
Epoch:40/200 AVG Training Loss:0.692 AVG Validation Loss:0.811 AVG Training Acc 53.92 % AVG Validation Acc 22.31 %
Epoch:50/200 AVG Training Loss:0.681 AVG Validation Loss:0.806 AVG Training Acc 56.75 % AVG Validation Acc 25.67 %
Epoch:60/200 AVG Training Loss:0.675 AVG Validation Loss:0.814 AVG Training Acc 57.49 % AVG Validation Acc 25.67 %
Epoch    67: reducing learning rate of group 0 to 1.0000e-04.
Epoch:70/200 AVG Training Loss:0.675 AVG Validation Loss:0.775 AVG Training Acc 56.67 % AVG Validation Acc 36.56 %
Epoch:80/200 AVG Training Loss:0.658 AVG Validation Loss:0.710 AVG Trai

Epoch:10/200 AVG Training Loss:0.539 AVG Validation Loss:3.271 AVG Training Acc 78.42 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.557 AVG Validation Loss:8.708 AVG Training Acc 79.29 % AVG Validation Acc 20.16 %
Epoch    21: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.693 AVG Validation Loss:0.942 AVG Training Acc 55.51 % AVG Validation Acc 21.64 %
Epoch:40/200 AVG Training Loss:0.668 AVG Validation Loss:1.100 AVG Training Acc 60.13 % AVG Validation Acc 21.91 %
Epoch:50/200 AVG Training Loss:0.652 AVG Validation Loss:0.974 AVG Training Acc 63.65 % AVG Validation Acc 22.31 %
Epoch    52: reducing learning rate of group 0 to 1.0000e-04.
Epoch:60/200 AVG Training Loss:0.655 AVG Validation Loss:0.712 AVG Training Acc 61.57 % AVG Validation Acc 52.82 %
Epoch:70/200 AVG Training Loss:0.652 AVG Validation Loss:0.693 AVG Training Acc 62.41 % AVG Validation Acc 57.12 %
Epoch:80/200 AVG Training Loss:0.649 AVG Validation Loss:0.693 AVG Trai

Epoch:10/200 AVG Training Loss:0.541 AVG Validation Loss:5.882 AVG Training Acc 79.11 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.575 AVG Validation Loss:4.841 AVG Training Acc 73.84 % AVG Validation Acc 20.16 %
Epoch    29: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:1.546 AVG Validation Loss:1.261 AVG Training Acc 50.05 % AVG Validation Acc 20.16 %
Epoch:40/200 AVG Training Loss:0.681 AVG Validation Loss:0.775 AVG Training Acc 57.20 % AVG Validation Acc 30.11 %
Epoch:50/200 AVG Training Loss:0.673 AVG Validation Loss:0.777 AVG Training Acc 58.82 % AVG Validation Acc 34.68 %
Epoch:60/200 AVG Training Loss:0.662 AVG Validation Loss:0.773 AVG Training Acc 60.36 % AVG Validation Acc 37.63 %
Epoch    67: reducing learning rate of group 0 to 1.0000e-04.
Epoch:70/200 AVG Training Loss:0.660 AVG Validation Loss:0.730 AVG Training Acc 59.55 % AVG Validation Acc 46.37 %
Epoch:80/200 AVG Training Loss:0.646 AVG Validation Loss:0.671 AVG Trai

Epoch:10/200 AVG Training Loss:0.531 AVG Validation Loss:3.961 AVG Training Acc 77.62 % AVG Validation Acc 20.05 %
Epoch:20/200 AVG Training Loss:0.544 AVG Validation Loss:4.556 AVG Training Acc 82.72 % AVG Validation Acc 20.05 %
Epoch    24: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.701 AVG Validation Loss:0.807 AVG Training Acc 51.96 % AVG Validation Acc 27.05 %
Epoch:40/200 AVG Training Loss:0.675 AVG Validation Loss:0.963 AVG Training Acc 55.94 % AVG Validation Acc 29.88 %
Epoch:50/200 AVG Training Loss:0.673 AVG Validation Loss:0.861 AVG Training Acc 59.63 % AVG Validation Acc 38.36 %
Epoch    55: reducing learning rate of group 0 to 1.0000e-04.
Epoch:60/200 AVG Training Loss:0.668 AVG Validation Loss:0.727 AVG Training Acc 59.78 % AVG Validation Acc 51.41 %
Epoch:70/200 AVG Training Loss:0.657 AVG Validation Loss:0.671 AVG Training Acc 60.91 % AVG Validation Acc 60.16 %
Epoch:80/200 AVG Training Loss:0.652 AVG Validation Loss:0.665 AVG Trai

Epoch:10/200 AVG Training Loss:0.521 AVG Validation Loss:3.067 AVG Training Acc 78.82 % AVG Validation Acc 20.05 %
Epoch:20/200 AVG Training Loss:0.552 AVG Validation Loss:5.115 AVG Training Acc 74.36 % AVG Validation Acc 20.05 %
Epoch:30/200 AVG Training Loss:0.643 AVG Validation Loss:1.686 AVG Training Acc 66.77 % AVG Validation Acc 20.05 %
Epoch:40/200 AVG Training Loss:0.669 AVG Validation Loss:2.463 AVG Training Acc 67.82 % AVG Validation Acc 20.05 %
Epoch:50/200 AVG Training Loss:0.508 AVG Validation Loss:4.717 AVG Training Acc 82.07 % AVG Validation Acc 20.05 %
Epoch    52: reducing learning rate of group 0 to 1.0000e-03.
Epoch:60/200 AVG Training Loss:0.684 AVG Validation Loss:1.158 AVG Training Acc 60.22 % AVG Validation Acc 20.46 %
Epoch:70/200 AVG Training Loss:0.667 AVG Validation Loss:1.178 AVG Training Acc 62.05 % AVG Validation Acc 21.67 %
Epoch:80/200 AVG Training Loss:0.659 AVG Validation Loss:1.163 AVG Training Acc 62.32 % AVG Validation Acc 23.69 %
Epoch:90/200 AVG T

Epoch:10/200 AVG Training Loss:0.555 AVG Validation Loss:4.059 AVG Training Acc 80.31 % AVG Validation Acc 20.05 %
Epoch:20/200 AVG Training Loss:0.627 AVG Validation Loss:2.101 AVG Training Acc 69.25 % AVG Validation Acc 20.05 %
Epoch:30/200 AVG Training Loss:0.637 AVG Validation Loss:1.745 AVG Training Acc 67.90 % AVG Validation Acc 20.05 %
Epoch:40/200 AVG Training Loss:0.641 AVG Validation Loss:1.573 AVG Training Acc 66.10 % AVG Validation Acc 20.05 %
Epoch:50/200 AVG Training Loss:0.613 AVG Validation Loss:8.488 AVG Training Acc 81.43 % AVG Validation Acc 20.05 %
Epoch    52: reducing learning rate of group 0 to 1.0000e-03.
Epoch:60/200 AVG Training Loss:0.692 AVG Validation Loss:0.796 AVG Training Acc 54.03 % AVG Validation Acc 30.42 %
Epoch:70/200 AVG Training Loss:0.678 AVG Validation Loss:0.850 AVG Training Acc 57.94 % AVG Validation Acc 35.80 %
Epoch:80/200 AVG Training Loss:0.673 AVG Validation Loss:0.844 AVG Training Acc 58.93 % AVG Validation Acc 40.38 %
Epoch    83: reduc

Epoch:10/200 AVG Training Loss:0.575 AVG Validation Loss:2.547 AVG Training Acc 74.81 % AVG Validation Acc 20.19 %
Epoch:20/200 AVG Training Loss:0.532 AVG Validation Loss:6.157 AVG Training Acc 82.78 % AVG Validation Acc 20.19 %
Epoch:30/200 AVG Training Loss:0.640 AVG Validation Loss:1.563 AVG Training Acc 65.99 % AVG Validation Acc 20.19 %
Epoch:40/200 AVG Training Loss:0.642 AVG Validation Loss:1.601 AVG Training Acc 66.25 % AVG Validation Acc 20.19 %
Epoch    42: reducing learning rate of group 0 to 1.0000e-03.
Epoch:50/200 AVG Training Loss:0.685 AVG Validation Loss:0.794 AVG Training Acc 55.57 % AVG Validation Acc 30.42 %
Epoch:60/200 AVG Training Loss:0.679 AVG Validation Loss:0.787 AVG Training Acc 56.79 % AVG Validation Acc 33.11 %
Epoch:70/200 AVG Training Loss:0.673 AVG Validation Loss:0.776 AVG Training Acc 58.38 % AVG Validation Acc 43.20 %
Epoch:80/200 AVG Training Loss:0.669 AVG Validation Loss:0.786 AVG Training Acc 59.50 % AVG Validation Acc 45.36 %
Epoch    81: reduc

Epoch:10/200 AVG Training Loss:0.536 AVG Validation Loss:9.165 AVG Training Acc 76.72 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.559 AVG Validation Loss:3.620 AVG Training Acc 77.61 % AVG Validation Acc 20.16 %
Epoch    26: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.701 AVG Validation Loss:0.834 AVG Training Acc 52.28 % AVG Validation Acc 22.98 %
Epoch:40/200 AVG Training Loss:0.682 AVG Validation Loss:0.843 AVG Training Acc 56.35 % AVG Validation Acc 23.79 %
Epoch:50/200 AVG Training Loss:0.672 AVG Validation Loss:0.898 AVG Training Acc 58.44 % AVG Validation Acc 25.40 %
Epoch    57: reducing learning rate of group 0 to 1.0000e-04.
Epoch:60/200 AVG Training Loss:0.678 AVG Validation Loss:0.807 AVG Training Acc 57.73 % AVG Validation Acc 39.11 %
Epoch:70/200 AVG Training Loss:0.657 AVG Validation Loss:0.721 AVG Training Acc 60.85 % AVG Validation Acc 53.63 %
Epoch:80/200 AVG Training Loss:0.655 AVG Validation Loss:0.709 AVG Trai

Epoch:10/200 AVG Training Loss:0.495 AVG Validation Loss:6.156 AVG Training Acc 80.95 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.557 AVG Validation Loss:3.039 AVG Training Acc 75.90 % AVG Validation Acc 20.16 %
Epoch    24: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.693 AVG Validation Loss:0.827 AVG Training Acc 55.00 % AVG Validation Acc 22.85 %
Epoch:40/200 AVG Training Loss:0.683 AVG Validation Loss:0.809 AVG Training Acc 56.46 % AVG Validation Acc 26.75 %
Epoch:50/200 AVG Training Loss:0.674 AVG Validation Loss:0.817 AVG Training Acc 57.80 % AVG Validation Acc 29.44 %
Epoch    55: reducing learning rate of group 0 to 1.0000e-04.
Epoch:60/200 AVG Training Loss:0.672 AVG Validation Loss:0.731 AVG Training Acc 57.82 % AVG Validation Acc 46.10 %
Epoch:70/200 AVG Training Loss:0.660 AVG Validation Loss:0.676 AVG Training Acc 60.17 % AVG Validation Acc 57.26 %
Epoch:80/200 AVG Training Loss:0.658 AVG Validation Loss:0.667 AVG Trai

Epoch:10/200 AVG Training Loss:0.463 AVG Validation Loss:5.848 AVG Training Acc 82.94 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.465 AVG Validation Loss:6.057 AVG Training Acc 81.06 % AVG Validation Acc 20.16 %
Epoch:30/200 AVG Training Loss:0.595 AVG Validation Loss:9.494 AVG Training Acc 77.04 % AVG Validation Acc 20.16 %
Epoch    39: reducing learning rate of group 0 to 1.0000e-03.
Epoch:40/200 AVG Training Loss:0.973 AVG Validation Loss:1.202 AVG Training Acc 50.00 % AVG Validation Acc 20.16 %
Epoch:50/200 AVG Training Loss:0.683 AVG Validation Loss:0.751 AVG Training Acc 56.20 % AVG Validation Acc 34.27 %
Epoch:60/200 AVG Training Loss:0.674 AVG Validation Loss:0.751 AVG Training Acc 58.31 % AVG Validation Acc 40.99 %
Epoch:70/200 AVG Training Loss:0.666 AVG Validation Loss:0.749 AVG Training Acc 59.37 % AVG Validation Acc 43.82 %
Epoch    74: reducing learning rate of group 0 to 1.0000e-04.
Epoch:80/200 AVG Training Loss:0.661 AVG Validation Loss:0.704 AVG Trai

Epoch:10/200 AVG Training Loss:0.576 AVG Validation Loss:6.876 AVG Training Acc 74.34 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.623 AVG Validation Loss:4.029 AVG Training Acc 66.67 % AVG Validation Acc 20.16 %
Epoch    28: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.780 AVG Validation Loss:0.894 AVG Training Acc 47.70 % AVG Validation Acc 20.70 %
Epoch:40/200 AVG Training Loss:0.683 AVG Validation Loss:0.878 AVG Training Acc 56.78 % AVG Validation Acc 32.53 %
Epoch:50/200 AVG Training Loss:0.675 AVG Validation Loss:0.872 AVG Training Acc 59.33 % AVG Validation Acc 40.32 %
Epoch    59: reducing learning rate of group 0 to 1.0000e-04.
Epoch:60/200 AVG Training Loss:0.693 AVG Validation Loss:0.817 AVG Training Acc 58.91 % AVG Validation Acc 43.28 %
Epoch:70/200 AVG Training Loss:0.662 AVG Validation Loss:0.702 AVG Training Acc 60.46 % AVG Validation Acc 54.17 %
Epoch:80/200 AVG Training Loss:0.658 AVG Validation Loss:0.683 AVG Trai

Epoch:10/200 AVG Training Loss:0.539 AVG Validation Loss:4.081 AVG Training Acc 79.58 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.543 AVG Validation Loss:3.807 AVG Training Acc 80.59 % AVG Validation Acc 20.16 %
Epoch:30/200 AVG Training Loss:0.527 AVG Validation Loss:3.709 AVG Training Acc 80.54 % AVG Validation Acc 20.16 %
Epoch:40/200 AVG Training Loss:0.518 AVG Validation Loss:4.903 AVG Training Acc 81.81 % AVG Validation Acc 20.16 %
Epoch    47: reducing learning rate of group 0 to 1.0000e-03.
Epoch:50/200 AVG Training Loss:0.742 AVG Validation Loss:0.875 AVG Training Acc 50.99 % AVG Validation Acc 24.33 %
Epoch:60/200 AVG Training Loss:0.683 AVG Validation Loss:0.788 AVG Training Acc 54.90 % AVG Validation Acc 34.14 %
Epoch:70/200 AVG Training Loss:0.678 AVG Validation Loss:0.770 AVG Training Acc 56.71 % AVG Validation Acc 38.84 %
Epoch:80/200 AVG Training Loss:0.668 AVG Validation Loss:0.792 AVG Training Acc 58.93 % AVG Validation Acc 41.53 %
Epoch    81: reduc

Epoch:10/200 AVG Training Loss:0.442 AVG Validation Loss:4.465 AVG Training Acc 82.23 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.587 AVG Validation Loss:3.615 AVG Training Acc 76.46 % AVG Validation Acc 20.70 %
Epoch    29: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:1.103 AVG Validation Loss:1.618 AVG Training Acc 51.20 % AVG Validation Acc 20.30 %
Epoch:40/200 AVG Training Loss:0.688 AVG Validation Loss:0.804 AVG Training Acc 54.93 % AVG Validation Acc 31.85 %
Epoch:50/200 AVG Training Loss:0.683 AVG Validation Loss:0.812 AVG Training Acc 56.19 % AVG Validation Acc 32.12 %
Epoch:60/200 AVG Training Loss:0.677 AVG Validation Loss:0.815 AVG Training Acc 57.52 % AVG Validation Acc 33.87 %
Epoch    60: reducing learning rate of group 0 to 1.0000e-04.
Epoch:70/200 AVG Training Loss:0.671 AVG Validation Loss:0.717 AVG Training Acc 58.66 % AVG Validation Acc 47.18 %
Epoch:80/200 AVG Training Loss:0.670 AVG Validation Loss:0.693 AVG Trai

Epoch:10/200 AVG Training Loss:0.447 AVG Validation Loss:6.638 AVG Training Acc 84.19 % AVG Validation Acc 20.05 %
Epoch:20/200 AVG Training Loss:0.582 AVG Validation Loss:2.593 AVG Training Acc 75.96 % AVG Validation Acc 20.05 %
Epoch:30/200 AVG Training Loss:0.593 AVG Validation Loss:6.267 AVG Training Acc 70.05 % AVG Validation Acc 20.05 %
Epoch    37: reducing learning rate of group 0 to 1.0000e-03.
Epoch:40/200 AVG Training Loss:0.718 AVG Validation Loss:0.850 AVG Training Acc 52.22 % AVG Validation Acc 26.65 %
Epoch:50/200 AVG Training Loss:0.688 AVG Validation Loss:0.812 AVG Training Acc 54.86 % AVG Validation Acc 25.44 %
Epoch:60/200 AVG Training Loss:0.677 AVG Validation Loss:0.826 AVG Training Acc 58.52 % AVG Validation Acc 24.63 %
Epoch    68: reducing learning rate of group 0 to 1.0000e-04.
Epoch:70/200 AVG Training Loss:0.688 AVG Validation Loss:0.793 AVG Training Acc 55.80 % AVG Validation Acc 32.44 %
Epoch:80/200 AVG Training Loss:0.666 AVG Validation Loss:0.710 AVG Trai

Epoch:10/200 AVG Training Loss:0.526 AVG Validation Loss:7.363 AVG Training Acc 80.05 % AVG Validation Acc 20.05 %
Epoch:20/200 AVG Training Loss:0.571 AVG Validation Loss:10.464 AVG Training Acc 80.26 % AVG Validation Acc 20.05 %
Epoch    23: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.695 AVG Validation Loss:0.943 AVG Training Acc 55.42 % AVG Validation Acc 21.80 %
Epoch:40/200 AVG Training Loss:0.676 AVG Validation Loss:0.935 AVG Training Acc 58.48 % AVG Validation Acc 21.40 %
Epoch:50/200 AVG Training Loss:0.667 AVG Validation Loss:0.915 AVG Training Acc 59.82 % AVG Validation Acc 22.88 %
Epoch:60/200 AVG Training Loss:0.658 AVG Validation Loss:0.906 AVG Training Acc 60.80 % AVG Validation Acc 24.63 %
Epoch:70/200 AVG Training Loss:0.651 AVG Validation Loss:0.914 AVG Training Acc 61.03 % AVG Validation Acc 24.63 %
Epoch:80/200 AVG Training Loss:0.639 AVG Validation Loss:0.910 AVG Training Acc 62.86 % AVG Validation Acc 25.30 %
Epoch    89: redu

Epoch:10/200 AVG Training Loss:0.516 AVG Validation Loss:5.318 AVG Training Acc 80.45 % AVG Validation Acc 20.05 %
Epoch:20/200 AVG Training Loss:0.537 AVG Validation Loss:3.288 AVG Training Acc 78.99 % AVG Validation Acc 20.05 %
Epoch:30/200 AVG Training Loss:0.514 AVG Validation Loss:6.709 AVG Training Acc 83.20 % AVG Validation Acc 20.05 %
Epoch:40/200 AVG Training Loss:0.634 AVG Validation Loss:1.644 AVG Training Acc 67.12 % AVG Validation Acc 20.05 %
Epoch:50/200 AVG Training Loss:0.565 AVG Validation Loss:2.740 AVG Training Acc 76.30 % AVG Validation Acc 20.05 %
Epoch    51: reducing learning rate of group 0 to 1.0000e-03.
Epoch:60/200 AVG Training Loss:0.688 AVG Validation Loss:0.804 AVG Training Acc 54.49 % AVG Validation Acc 28.13 %
Epoch:70/200 AVG Training Loss:0.682 AVG Validation Loss:0.823 AVG Training Acc 57.11 % AVG Validation Acc 31.22 %
Epoch:80/200 AVG Training Loss:0.674 AVG Validation Loss:0.837 AVG Training Acc 58.62 % AVG Validation Acc 34.59 %
Epoch    82: reduc

Epoch:10/200 AVG Training Loss:0.553 AVG Validation Loss:3.036 AVG Training Acc 79.39 % AVG Validation Acc 20.19 %
Epoch:20/200 AVG Training Loss:0.512 AVG Validation Loss:4.284 AVG Training Acc 80.08 % AVG Validation Acc 20.19 %
Epoch    23: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.670 AVG Validation Loss:1.353 AVG Training Acc 61.91 % AVG Validation Acc 20.19 %
Epoch:40/200 AVG Training Loss:0.683 AVG Validation Loss:1.064 AVG Training Acc 60.03 % AVG Validation Acc 20.19 %
Epoch:50/200 AVG Training Loss:0.655 AVG Validation Loss:1.011 AVG Training Acc 62.12 % AVG Validation Acc 20.86 %
Epoch    54: reducing learning rate of group 0 to 1.0000e-04.
Epoch:60/200 AVG Training Loss:0.677 AVG Validation Loss:0.731 AVG Training Acc 57.17 % AVG Validation Acc 44.01 %
Epoch:70/200 AVG Training Loss:0.669 AVG Validation Loss:0.691 AVG Training Acc 58.42 % AVG Validation Acc 50.87 %
Epoch:80/200 AVG Training Loss:0.668 AVG Validation Loss:0.694 AVG Trai

Epoch:10/200 AVG Training Loss:0.544 AVG Validation Loss:7.839 AVG Training Acc 79.45 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.636 AVG Validation Loss:1.694 AVG Training Acc 67.52 % AVG Validation Acc 20.16 %
Epoch:30/200 AVG Training Loss:0.572 AVG Validation Loss:7.521 AVG Training Acc 78.51 % AVG Validation Acc 20.16 %
Epoch    34: reducing learning rate of group 0 to 1.0000e-03.
Epoch:40/200 AVG Training Loss:0.683 AVG Validation Loss:0.971 AVG Training Acc 57.51 % AVG Validation Acc 21.37 %
Epoch:50/200 AVG Training Loss:0.670 AVG Validation Loss:1.042 AVG Training Acc 59.87 % AVG Validation Acc 20.97 %
Epoch:60/200 AVG Training Loss:0.655 AVG Validation Loss:1.046 AVG Training Acc 61.86 % AVG Validation Acc 23.66 %
Epoch    65: reducing learning rate of group 0 to 1.0000e-04.
Epoch:70/200 AVG Training Loss:0.663 AVG Validation Loss:0.754 AVG Training Acc 59.94 % AVG Validation Acc 45.70 %
Epoch:80/200 AVG Training Loss:0.649 AVG Validation Loss:0.693 AVG Trai

Epoch:10/200 AVG Training Loss:0.566 AVG Validation Loss:5.642 AVG Training Acc 78.46 % AVG Validation Acc 20.16 %
Epoch:20/200 AVG Training Loss:0.584 AVG Validation Loss:3.695 AVG Training Acc 73.67 % AVG Validation Acc 20.16 %
Epoch    27: reducing learning rate of group 0 to 1.0000e-03.
Epoch:30/200 AVG Training Loss:0.746 AVG Validation Loss:0.866 AVG Training Acc 49.84 % AVG Validation Acc 24.06 %
Epoch:40/200 AVG Training Loss:0.688 AVG Validation Loss:0.789 AVG Training Acc 54.52 % AVG Validation Acc 26.48 %
