# [ LG Energy Solution_DX_Intensive_Course ] Deep Learning for Time Series Data Analysis

## Transformer-Based Time Series Data Analysis

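## Lecture Review
Lecture material: Deep Learning for Time Series Analysis, 02 Transformer-Based Time Series Data Analysis

- A Transformer takes a sequence as input and produces a sequence as output, so it can also be applied to time series tasks.
- Representation learning: the goal is to map high-dimensional raw data into a lower-dimensional space so that meaningful information in the time series is easier to extract. The representation is learned in an unsupervised manner and then reused in downstream tasks.
- Time series regression: the task of predicting a separate time series dependent variable y from predictor variables x that are themselves time series.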

<img src="./image/TST05.png" width="900">

## Practice Summary

1. In this practice we perform time series regression with TST, a Transformer-based representation learning model.
2. Pre-training masks part of the input and trains the model to predict the masked part.
3. The model is then fine-tuned to perform the regression task.
4. The overall architecture of the model is shown on page 47 of the lecture material.
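
The pre-training objective in step 2 can be illustrated with a small numpy sketch: the loss is computed only on the positions that were hidden from the model. The function name `masked_mse` and the toy arrays are illustrative assumptions, not part of the TST code base.

```python
import numpy as np

def masked_mse(predictions, targets, target_mask):
    """Mean squared error over the positions where target_mask is True.

    Hypothetical helper for illustration; True marks a value that was
    hidden from the model and therefore has to be reconstructed.
    """
    squared_error = (predictions - targets) ** 2
    return squared_error[target_mask].mean()

# Toy example: 2 time steps x 2 features
targets = np.array([[0.0, 2.0], [3.0, 0.0]])
predictions = np.array([[1.0, 2.0], [3.0, 4.0]])
target_mask = np.array([[True, False], [False, True]])

loss = masked_mse(predictions, targets, target_mask)
print(loss)  # (1**2 + 4**2) / 2 = 8.5
```

Values outside the mask do not affect the loss, so the model is rewarded only for reconstructing what it could not see.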

[Zerveas, George, et al. "A transformer-based framework for multivariate time series representation learning." Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 2021.](https://arxiv.org/pdf/2010.02803.pdf)

<img src="./image/TST01.png" width="700">

---

### STEP 0. Setting Up the Environment

- Install the packages required to build an environment suitable for the TST model.

In [1]:
# Fetch the code and data from GitHub
# !git clone https://github.com/hwk0702/LG_ES_Timeseries.git
#%cd LG_ES_Timeseries/mvts_transformer

/tf/dsba/external_lecture/LG_ES_Timeseries/mvts_transformer


In [1]:
# !pip install -r requirements.txt
# !pip install torch==1.11.0





- Import the required libraries.

In [2]:
import logging

logging.basicConfig(format='%(asctime)s | %(levelname)s : %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

logger.info("Loading packages ...")

import os
import sys
import json
import time
import random
import importlib
import numpy as np
import pandas as pd
import pickle
import glob
from easydict import EasyDict
import matplotlib.pyplot as plt
from tqdm import tqdm

import torch
import torch.backends.cudnn as cudnn
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from torch.utils.data import Dataset
import torch.nn as nn

from src.options import Options
from src.running import setup, pipeline_factory, validate, check_progress, NEG_METRICS
from src.utils import utils
from src.datasets import utils as dutils
from src.datasets.data import data_factory, Normalizer
from src.datasets.datasplit import split_dataset
from src.models.ts_transformer import model_factory
from src.models.loss import get_loss_module
from src.optimizers import get_optimizer

import warnings
warnings.filterwarnings("ignore")

#check torch version & device
print ("Python version:[%s]."%(sys.version))
print ("PyTorch version:[%s]."%(torch.__version__))
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print ("device:[%s]."%(device)) # If cuda:0 is printed, the GPU is being used

2023-06-22 14:31:07,826 | INFO : Loading packages ...


Python version:[3.6.9 (default, Dec  8 2021, 21:08:43) 
[GCC 8.4.0]].
PyTorch version:[1.7.1].
device:[cuda:0].


In [3]:
# set random seed 

def set_seed(random_seed):
    torch.manual_seed(random_seed)
    torch.cuda.manual_seed(random_seed)
    torch.cuda.manual_seed_all(random_seed)  # if use multi-GPU
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    np.random.seed(random_seed)
    random.seed(random_seed)
    
random_seed = 42
set_seed(random_seed)

Set the arguments required by the model.

In [4]:
# Load the arguments that the official code sets as defaults.
args = Options()
args = args.parser.parse_args([])
args.__dict__

{'config_filepath': None,
 'output_dir': './output',
 'data_dir': './data',
 'load_model': None,
 'resume': False,
 'change_output': False,
 'save_all': False,
 'experiment_name': '',
 'comment': '',
 'no_timestamp': False,
 'records_file': './records.xls',
 'console': False,
 'print_interval': 1,
 'gpu': '0',
 'n_proc': -1,
 'num_workers': 0,
 'seed': None,
 'limit_size': None,
 'test_only': None,
 'data_class': 'weld',
 'labels': None,
 'test_from': None,
 'test_ratio': 0,
 'val_ratio': 0.2,
 'pattern': None,
 'val_pattern': None,
 'test_pattern': None,
 'normalization': 'standardization',
 'norm_from': None,
 'subsample_factor': None,
 'task': 'imputation',
 'masking_ratio': 0.15,
 'mean_mask_length': 3,
 'mask_mode': 'separate',
 'mask_distribution': 'geometric',
 'exclude_feats': None,
 'mask_feats': '0, 1',
 'start_hint': 0.0,
 'end_hint': 0.0,
 'harden': False,
 'epochs': 400,
 'val_interval': 2,
 'optimizer': 'Adam',
 'lr': 0.001,
 'lr_step': '1000000',
 'lr_factor': '0.1',
 'b

In [5]:
# Set the arguments that need to be changed for this practice.
args_change = EasyDict({
    # for dataloader 
    'output_dir': './output',    # Folder where outputs generated during training and testing are saved
    'data_dir': './data/BeijingPM25Quality',    # Folder where the practice data is stored
    'name': 'pretrained',    # Name of the run
    'records_file': 'Imputation_records.xls',    # File for saving the final metric results (not used in this practice)
    
    # Dataset
    'limit_size': None,
    'data_class': 'tsra',    # Type of data (tsra -> TSRegressionArchive)
    'pattern': 'TRAIN',    # TRAIN or TEST
    'val_ratio': 0.2,    # Train/validation split ratio
    'epochs': 400,
    'lr': 0.001,
    'optimizer': 'RAdam',
    'batch_size': 32,
    'pos_encoding': 'learnable',
    'd_model': 128
})

In [6]:
# Update the arguments to be changed.
args.__dict__.update(args_change)
args.__dict__

{'config_filepath': None,
 'output_dir': './output',
 'data_dir': './data/BeijingPM25Quality',
 'load_model': None,
 'resume': False,
 'change_output': False,
 'save_all': False,
 'experiment_name': '',
 'comment': '',
 'no_timestamp': False,
 'records_file': 'Imputation_records.xls',
 'console': False,
 'print_interval': 1,
 'gpu': '0',
 'n_proc': -1,
 'num_workers': 0,
 'seed': None,
 'limit_size': None,
 'test_only': None,
 'data_class': 'tsra',
 'labels': None,
 'test_from': None,
 'test_ratio': 0,
 'val_ratio': 0.2,
 'pattern': 'TRAIN',
 'val_pattern': None,
 'test_pattern': None,
 'normalization': 'standardization',
 'norm_from': None,
 'subsample_factor': None,
 'task': 'imputation',
 'masking_ratio': 0.15,
 'mean_mask_length': 3,
 'mask_mode': 'separate',
 'mask_distribution': 'geometric',
 'exclude_feats': None,
 'mask_feats': '0, 1',
 'start_hint': 0.0,
 'end_hint': 0.0,
 'harden': False,
 'epochs': 400,
 'val_interval': 2,
 'optimizer': 'RAdam',
 'lr': 0.001,
 'lr_step': '10
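
The pattern used in the cells above (parse the defaults with an empty argument list, then overwrite selected entries) can be sketched with plain `argparse`. The parser built by `build_parser` and its three arguments are hypothetical stand-ins for the repository's `Options` class.

```python
import argparse

def build_parser():
    """Hypothetical stand-in for the repository's Options class."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--lr', type=float, default=0.001)
    parser.add_argument('--optimizer', default='Adam')
    parser.add_argument('--epochs', type=int, default=400)
    return parser

# Passing an empty list makes argparse ignore sys.argv, which is what
# we want inside a notebook: only the defaults are returned.
demo_args = build_parser().parse_args([])

# Overwrite existing entries and add new ones in a single step.
overrides = {'optimizer': 'RAdam', 'batch_size': 32}
vars(demo_args).update(overrides)

print(vars(demo_args))
```

Keys that the parser does not define, such as `batch_size` here, are simply added to the namespace, so a typo in a key name passes silently; checking the printed dictionary is a cheap safeguard.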

In [7]:
# Create the configuration directory from the arguments set above (and save/load the config).
config = setup(args)

2023-06-22 14:31:22,543 | INFO : Stored configuration file in './output/_2023-06-22_14-31-22_Nbr'


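### STEP 1. Preparing the Data (only this part needs to be changed to fit your own data format!)

Today's practice uses the BeijingPM25 dataset for time series regression.
* The dataset contains air quality data collected in Beijing, China.
* It mainly consists of hourly PM2.5 concentration data for Beijing, together with various meteorological variables (temperature, pressure, wind direction, wind speed, etc.) and cumulative hourly fine-dust concentrations.
* Dataset source
    * https://zenodo.org/record/3902651#.YB5P0OpOm3s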

In [8]:
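data_dir = './data/BeijingPM25Quality'
data_paths = glob.glob(os.path.join(data_dir, '*'))
ts_paths = [p for p in data_paths if os.path.isfile(p) and p.endswith('.ts')]
# glob order is not guaranteed, so select by file name (assumes the names contain 'TRAIN' / 'TEST')
train_path = next(p for p in ts_paths if 'TRAIN' in os.path.basename(p))
test_path = next(p for p in ts_paths if 'TEST' in os.path.basename(p))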

In [9]:
train_df, train_labels = dutils.load_from_tsfile_to_dataframe(train_path,
                                                              return_separate_X_and_y=True,
                                                              replace_missing_vals_with='NaN')

11942it [00:39, 299.06it/s]


In [10]:
train_df

Unnamed: 0,dim_0,dim_1,dim_2,dim_3,dim_4,dim_5,dim_6,dim_7,dim_8
0,2013-03-01 00:00:00 4.0 2013-03-01 01:00:0...,2013-03-01 00:00:00 7.0 2013-03-01 01:00:0...,2013-03-01 00:00:00 300.0 2013-03-01 01:00:...,2013-03-01 00:00:00 77.0 2013-03-01 01:00:0...,2013-03-01 00:00:00 -0.7 2013-03-01 01:00:00...,2013-03-01 00:00:00 1023.0 2013-03-01 01:00...,2013-03-01 00:00:00 -18.8 2013-03-01 01:00:0...,2013-03-01 00:00:00 0.0 2013-03-01 01:00:00...,2013-03-01 00:00:00 4.4 2013-03-01 01:00:00...
1,2013-03-02 00:00:00 24.0 2013-03-02 01:00:0...,2013-03-02 00:00:00 44.0 2013-03-02 01:00:...,2013-03-02 00:00:00 500.0 2013-03-02 01:00...,2013-03-02 00:00:00 44.0 2013-03-02 01:00:0...,2013-03-02 00:00:00 -0.4 2013-03-02 01:00:00...,2013-03-02 00:00:00 1031.0 2013-03-02 01:00...,2013-03-02 00:00:00 -17.6 2013-03-02 01:00:0...,2013-03-02 00:00:00 0.0 2013-03-02 01:00:00...,2013-03-02 00:00:00 1.4 2013-03-02 01:00:00...
2,2013-03-03 00:00:00 73.0 2013-03-03 01:00:...,2013-03-03 00:00:00 100.0 2013-03-03 01:00:...,2013-03-03 00:00:00 1899.0 2013-03-03 01:00...,2013-03-03 00:00:00 2.0 2013-03-03 01:00:0...,2013-03-03 00:00:00 -1.4 2013-03-03 01:00:0...,2013-03-03 00:00:00 1020.4 2013-03-03 01:00...,2013-03-03 00:00:00 -13.0 2013-03-03 01:00:0...,2013-03-03 00:00:00 0.0 2013-03-03 01:00:00...,2013-03-03 00:00:00 1.2 2013-03-03 01:00:00...
3,2013-03-04 00:00:00 51.0 2013-03-04 01:00:0...,2013-03-04 00:00:00 86.0 2013-03-04 01:00:0...,2013-03-04 00:00:00 1300.0 2013-03-04 01:00...,2013-03-04 00:00:00 4.0 2013-03-04 01:00:0...,2013-03-04 00:00:00 7.7 2013-03-04 01:00:0...,2013-03-04 00:00:00 1015.7 2013-03-04 01:00...,2013-03-04 00:00:00 -11.1 2013-03-04 01:00:0...,2013-03-04 00:00:00 0.0 2013-03-04 01:00:00...,2013-03-04 00:00:00 2.6 2013-03-04 01:00:00...
4,2013-03-05 00:00:00 66.0 2013-03-05 01:00:...,2013-03-05 00:00:00 78.0 2013-03-05 01:00:...,2013-03-05 00:00:00 1100.0 2013-03-05 01:00...,2013-03-05 00:00:00 84.0 2013-03-05 01:00:0...,2013-03-05 00:00:00 4.7 2013-03-05 01:00:0...,2013-03-05 00:00:00 1015.2 2013-03-05 01:00...,2013-03-05 00:00:00 -9.1 2013-03-05 01:00:00...,2013-03-05 00:00:00 0.0 2013-03-05 01:00:00...,2013-03-05 00:00:00 1.6 2013-03-05 01:00:00...
...,...,...,...,...,...,...,...,...,...
11913,2015-12-27 00:00:00 16.0 2015-12-27 01:00:0...,2015-12-27 00:00:00 39.0 2015-12-27 01:00:0...,2015-12-27 00:00:00 800.0 2015-12-27 01:00...,2015-12-27 00:00:00 25.0 2015-12-27 01:00:0...,2015-12-27 00:00:00 -4.1 2015-12-27 01:00:00...,2015-12-27 00:00:00 1033.5 2015-12-27 01:00...,2015-12-27 00:00:00 -10.7 2015-12-27 01:00:0...,2015-12-27 00:00:00 0.0 2015-12-27 01:00:00...,2015-12-27 00:00:00 2.4 2015-12-27 01:00:00...
11914,2015-12-28 00:00:00 26.0 2015-12-28 01:00:0...,2015-12-28 00:00:00 66.0 2015-12-28 01:00:...,2015-12-28 00:00:00 2600.0 2015-12-28 01:00...,2015-12-28 00:00:00 11.0 2015-12-28 01:00:0...,2015-12-28 00:00:00 -7.6 2015-12-28 01:00:00...,2015-12-28 00:00:00 1033.2 2015-12-28 01:00...,2015-12-28 00:00:00 -11.4 2015-12-28 01:00:0...,2015-12-28 00:00:00 0.0 2015-12-28 01:00:00...,2015-12-28 00:00:00 0.9 2015-12-28 01:00:00...
11915,2015-12-29 00:00:00 33.0 2015-12-29 01:00:0...,2015-12-29 00:00:00 105.0 2015-12-29 01:00:...,2015-12-29 00:00:00 3500.0 2015-12-29 01:00...,2015-12-29 00:00:00 9.0 2015-12-29 01:00:0...,2015-12-29 00:00:00 -3.2 2015-12-29 01:00:00...,2015-12-29 00:00:00 1028.2 2015-12-29 01:00...,2015-12-29 00:00:00 -6.5 2015-12-29 01:00:00...,2015-12-29 00:00:00 0.0 2015-12-29 01:00:00...,2015-12-29 00:00:00 1.0 2015-12-29 01:00:00...
11916,2015-12-30 00:00:00 38.0 2015-12-30 01:00:0...,2015-12-30 00:00:00 148.0 2015-12-30 01:00:...,2015-12-30 00:00:00 6500.0 2015-12-30 01:00...,2015-12-30 00:00:00 10.0 2015-12-30 01:00:0...,2015-12-30 00:00:00 -2.9 2015-12-30 01:00:00...,2015-12-30 00:00:00 1023.9 2015-12-30 01:00...,2015-12-30 00:00:00 -4.3 2015-12-30 01:00:0...,2015-12-30 00:00:00 0.0 2015-12-30 01:00:00...,2015-12-30 00:00:00 1.4 2015-12-30 01:00:00...


In [11]:
# Function that fills missing values by interpolation
def interpolate_missing(y):
    """
    Replaces NaN values in pd.Series `y` using linear interpolation
    """
    if y.isna().any():
        y = y.interpolate(method='linear', limit_direction='both')
    return y

class load_dataframe:
    def __init__(self, root_dir):
    
        self.all_df, self.labels_df = self.load_all(root_dir)
        self.all_IDs = self.all_df.index.unique()
        
        # use all features
        self.feature_names = self.all_df.columns
        self.feature_df = self.all_df
    
    
    def load_all(self, root_dir):
        
        df, labels = dutils.load_from_tsfile_to_dataframe(root_dir,
                                                          return_separate_X_and_y=True,
                                                          replace_missing_vals_with='NaN')
        
        labels_df = pd.DataFrame(labels, dtype=np.float32)
        
        lengths = df.applymap(lambda x: len(x)).values
        self.max_seq_len = lengths[0, 0]
        
        df = pd.concat((pd.DataFrame({col: df.loc[row, col] for col in df.columns}).reset_index(drop=True).set_index(
            pd.Series(lengths[row, 0]*[row])) for row in range(df.shape[0])), axis=0)
        grp = df.groupby(by=df.index)
        df = grp.transform(interpolate_missing)
        
        return df, labels_df
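
The `interpolate_missing` helper above delegates to pandas. As a rough illustration of what linear interpolation with filling in both directions does to a single series, here is a numpy-only sketch; `interpolate_nan` is a hypothetical name, and the edge handling (repeating the nearest valid value) is an assumption made for this toy version.

```python
import numpy as np

def interpolate_nan(values):
    """Fill NaNs by linear interpolation between the nearest valid neighbours.

    Leading and trailing NaNs take the nearest valid value, because
    np.interp clamps outside the range of known points.
    """
    y = np.asarray(values, dtype=float).copy()
    missing = np.isnan(y)
    if missing.any() and not missing.all():
        positions = np.arange(len(y))
        y[missing] = np.interp(positions[missing], positions[~missing], y[~missing])
    return y

filled = interpolate_nan([np.nan, 1.0, np.nan, 3.0, np.nan])
print(filled)  # [1. 1. 2. 3. 3.]
```

A series that is entirely NaN is returned unchanged, since there is nothing to interpolate from.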

In [12]:
my_data = load_dataframe(train_path)

11942it [00:39, 301.08it/s]


In [13]:
my_data.feature_df

| | dim_0 | dim_1 | dim_2 | dim_3 | dim_4 | dim_5 | dim_6 | dim_7 | dim_8 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 4.0 | 7.0 | 300.0 | 77.0 | -0.7 | 1023.0 | -18.8 | 0.0 | 4.4 |
| 0 | 4.0 | 7.0 | 300.0 | 77.0 | -1.1 | 1023.2 | -18.2 | 0.0 | 4.7 |
| 0 | 5.0 | 10.0 | 300.0 | 73.0 | -1.1 | 1023.5 | -18.2 | 0.0 | 5.6 |
| 0 | 11.0 | 11.0 | 300.0 | 72.0 | -1.4 | 1024.5 | -19.4 | 0.0 | 3.1 |
| 0 | 12.0 | 12.0 | 300.0 | 72.0 | -2.0 | 1025.2 | -19.5 | 0.0 | 2.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 11917 | 27.0 | 96.0 | 3300.0 | 9.0 | -1.4 | 1026.3 | -8.6 | 0.0 | 1.0 |
| 11917 | 34.0 | 99.0 | 3700.0 | 9.0 | -2.5 | 1026.2 | -8.4 | 0.0 | 1.3 |
| 11917 | 31.0 | 95.0 | 3100.0 | 9.0 | -2.7 | 1025.8 | -8.0 | 0.0 | 0.9 |
| 11917 | 40.0 | 99.0 | 4200.0 | 13.0 | -3.5 | 1025.5 | -7.6 | 0.0 | 0.4 |


In [14]:
my_data.labels_df

| | 0 |
|---|---|
| 0 | 24.0 |
| 1 | 93.0 |
| 2 | 117.0 |
| 3 | 58.0 |
| 4 | 226.0 |
| ... | ... |
| 11913 | 89.0 |
| 11914 | 281.0 |
| 11915 | 543.0 |
| 11916 | 505.0 |


### STEP 2. Splitting the Data

Part of the train dataset is split off as a validation dataset.

In [15]:
validation_method = 'ShuffleSplit'    # Specifies how to split the data: ShuffleSplit or StratifiedShuffleSplit (for classification)
labels = None    # Used in classification tasks to split the data while taking the class labels into account.

test_data = my_data
test_indices = None
val_data = my_data
val_indices = []

train_indices, val_indices, test_indices = split_dataset(data_indices=my_data.all_IDs,
                                                            validation_method=validation_method,
                                                            n_splits=1,
                                                            validation_ratio=config['val_ratio'],
                                                            random_seed=1337)
train_indices = train_indices[0]  # `split_dataset` returns a list of indices *per fold/split*
val_indices = val_indices[0]  # `split_dataset` returns a list of indices *per fold/split*
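
Note that the split operates on sample IDs, not on individual rows, so all time steps of one sample end up on the same side. A minimal sketch of such an ID-level shuffle split, using only the standard library, is given below; `shuffle_split` is a hypothetical helper and does not reproduce the exact assignment made by `split_dataset`.

```python
import random

def shuffle_split(ids, validation_ratio, seed):
    """Randomly assign whole sample IDs to a train and a validation set."""
    ids = list(ids)
    rng = random.Random(seed)   # local generator, leaves the global seed untouched
    rng.shuffle(ids)
    n_val = int(round(len(ids) * validation_ratio))
    val_ids = sorted(ids[:n_val])
    train_ids = sorted(ids[n_val:])
    return train_ids, val_ids

train_ids, val_ids = shuffle_split(range(10), validation_ratio=0.2, seed=1337)
print(train_ids, val_ids)
```

Because every row of a sample carries the same ID, selecting rows with these IDs afterwards keeps each 24-step series intact.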

### STEP 3. Data Normalization

In [16]:
normalizer = None

normalizer = Normalizer(config['normalization'])
my_data.feature_df.loc[train_indices] = normalizer.normalize(my_data.feature_df.loc[train_indices])
val_data.feature_df.loc[val_indices] = normalizer.normalize(val_data.feature_df.loc[val_indices])
test_data.feature_df.loc[test_indices] = normalizer.normalize(test_data.feature_df.loc[test_indices])
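
Standardization is usually fitted on the training split and then applied unchanged to the other splits, so that no validation statistics leak into training. The sketch below shows that idea on toy arrays; whether the repository's `Normalizer` reuses the training statistics in exactly this way is an assumption here.

```python
import numpy as np

# Toy data: 3 training rows and 1 validation row, 2 features each
train = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
val = np.array([[3.0, 30.0]])

# Statistics come from the training split only
mean = train.mean(axis=0)
std = train.std(axis=0)

train_norm = (train - mean) / std
val_norm = (val - mean) / std   # same mean/std, not recomputed on val

print(train_norm.mean(axis=0), train_norm.std(axis=0))
print(val_norm)
```

After this step each training feature has zero mean and unit standard deviation, which matches the scale of the values shown in the normalized `feature_df`.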

In [17]:
my_data.feature_df.loc[train_indices]

| | dim_0 | dim_1 | dim_2 | dim_3 | dim_4 | dim_5 | dim_6 | dim_7 | dim_8 |
|---|---|---|---|---|---|---|---|---|---|
| 1967 | -0.660388 | 0.372892 | 0.305150 | -0.952829 | -0.852954 | 0.993705 | -0.054040 | -0.078872 | -0.704167 |
| 1967 | -0.660388 | 0.316566 | 0.305150 | -0.970150 | -0.871063 | 0.993705 | -0.069076 | -0.078872 | -0.543053 |
| 1967 | -0.660388 | 0.147589 | 0.218324 | -0.970150 | -0.871063 | 1.003529 | -0.069076 | -0.078872 | -0.381940 |
| 1967 | -0.660388 | 0.006774 | 0.131499 | -0.952829 | -0.880117 | 0.983881 | -0.076595 | -0.078872 | -0.704167 |
| 1967 | -0.660388 | -0.049551 | 0.218324 | -0.970150 | -0.880117 | 0.983881 | -0.061558 | -0.078872 | -0.543053 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3223 | 0.308048 | 0.175752 | -0.389454 | -0.658387 | -1.024983 | 1.612613 | -1.527624 | -0.078872 | 1.068081 |
| 3223 | 0.434365 | 0.316566 | -0.302629 | -0.744988 | -1.097416 | 1.691205 | -1.753172 | -0.078872 | -0.220827 |
| 3223 | -0.070905 | -0.190366 | -0.563106 | -0.450546 | -1.187958 | 1.799268 | -1.723099 | -0.078872 | -0.381940 |
| 3223 | -0.239329 | -0.471994 | -0.563106 | -0.277344 | -1.242282 | 1.789444 | -1.858428 | -0.078872 | 0.343070 |


### STEP 4. Data Masking and DataLoader Creation

<img src="./image/TST02.png" width="800">

In [18]:
def noise_mask(X, masking_ratio, lm=3, mode='separate', distribution='geometric', exclude_feats=None):
    """
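    Creates a random boolean mask with the same shape as X, with 0s at the positions where a feature should be masked.
    Args:
        X: (seq_length, feat_dim) numpy array of features corresponding to a single sample
        masking_ratio: proportion of seq_length to be masked. At each time step, this is the proportion of feat_dim that is masked on average.
        lm: average length of a masking subsequence. Used only when `distribution` is 'geometric'.
        mode: whether each variable is masked independently ('separate') or all variables at a given position are masked together ('concurrent')
        distribution: whether each mask element is sampled randomly and independently, or whether sampling follows a Markov chain
        and yields a geometric distribution of masked subsequences with the desired mean length `lm`
        exclude_feats: indices of the features to exclude from masking (i.e. to remain all 1s)

    Returns:
        boolean numpy array with the same shape as X, with 0s at the positions where a feature should be masked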
    """
    if exclude_feats is not None:
        exclude_feats = set(exclude_feats)

    if distribution == 'geometric':  # stateful (Markov chain)
        if mode == 'separate':  # each variable (feature) is independent
            mask = np.ones(X.shape, dtype=bool)
            for m in range(X.shape[1]):  # feature dimension
                if exclude_feats is None or m not in exclude_feats:
                    mask[:, m] = geom_noise_mask_single(X.shape[0], lm, masking_ratio)  # time dimension
        else:  # replicate across feature dimension (mask all variables at the same positions concurrently)
            mask = np.tile(np.expand_dims(geom_noise_mask_single(X.shape[0], lm, masking_ratio), 1), X.shape[1])
    else:  # each position is independent Bernoulli with p = 1 - masking_ratio
        if mode == 'separate':
            mask = np.random.choice(np.array([True, False]), size=X.shape, replace=True,
                                    p=(1 - masking_ratio, masking_ratio))
        else:
            mask = np.tile(np.random.choice(np.array([True, False]), size=(X.shape[0], 1), replace=True,
                                            p=(1 - masking_ratio, masking_ratio)), X.shape[1])

    return mask

def geom_noise_mask_single(L, lm, masking_ratio):
    """
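    Randomly creates a boolean mask of length `L`, made of masked subsequences of average length lm, such that masking_ratio is respected.
    The lengths of the masked subsequences and of the gaps between them follow a geometric distribution.
    Args:
        L: length of the mask and of the sequence to be masked
        lm: average length of a masking subsequence
        masking_ratio: proportion of L to be masked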

    Returns:
        (L,) boolean numpy array intended to mask ('drop') with 0s a sequence of length L
    """
    keep_mask = np.ones(L, dtype=bool)
    p_m = 1 / lm  # probability of each masking sequence stopping. parameter of geometric distribution.
    p_u = p_m * masking_ratio / (1 - masking_ratio)  # probability of each unmasked sequence stopping. parameter of geometric distribution.
    p = [p_m, p_u]

    # Start in state 0 with masking_ratio probability
    state = int(np.random.rand() > masking_ratio)  # state 0 means masking, 1 means not masking
    for i in range(L):
        keep_mask[i] = state  # here it happens that state and masking value corresponding to state are identical
        if np.random.rand() < p[state]:
            state = 1 - state

    return keep_mask
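
The choice of `p_u` in `geom_noise_mask_single` follows from the two-state Markov chain: the long-run share of masked positions is `p_u / (p_m + p_u)`, which equals `masking_ratio` when `p_u = p_m * masking_ratio / (1 - masking_ratio)`, and the mean masked run length is `1 / p_m = lm`. The simulation below is a sketch that checks both properties empirically; `geometric_mask` repeats the same logic with an explicit random generator, and `masked_run_lengths` is a hypothetical helper.

```python
import numpy as np

def geometric_mask(length, lm, masking_ratio, rng):
    """Same two-state Markov chain as above, driven by an explicit rng."""
    keep_mask = np.ones(length, dtype=bool)
    p_m = 1 / lm                                      # prob. a masked run stops
    p_u = p_m * masking_ratio / (1 - masking_ratio)   # prob. an unmasked run stops
    p = [p_m, p_u]
    state = int(rng.random() > masking_ratio)         # 0: masking, 1: keeping
    for i in range(length):
        keep_mask[i] = state
        if rng.random() < p[state]:
            state = 1 - state
    return keep_mask

def masked_run_lengths(keep_mask):
    """Lengths of the consecutive runs of False (masked) values."""
    runs, current = [], 0
    for keep in keep_mask:
        if not keep:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs

rng = np.random.default_rng(0)
keep_mask = geometric_mask(200_000, lm=3, masking_ratio=0.15, rng=rng)
masked_fraction = 1 - keep_mask.mean()
mean_run = np.mean(masked_run_lengths(keep_mask))
print(round(masked_fraction, 3), round(mean_run, 2))
```

On a sequence this long the two printed numbers should land close to 0.15 and 3. On a 24-step sample, as used in this practice, individual masks can deviate considerably from those averages.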

In [19]:
mask = noise_mask(my_data.feature_df.loc[train_indices[0]], 
                  masking_ratio=0.15, 
                  lm=3, 
                  mode='separate', 
                  distribution='geometric', 
                  exclude_feats=None)  # (seq_length, feat_dim) boolean array

In [20]:
mask

array([[ True,  True,  True,  True, False,  True,  True,  True,  True],
       [ True,  True,  True,  True, False,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True, False,  True,  True,  True],
       [ True, False,  True,  True,  True, False,  True,  True,  True],
       [ True, False,  True,  True,  True, False,  True,  True, False],
       [False,  True,  True,  True,  True,  True,  True,  True,  True],
       [False,  True,  True,  True,  True,  True,  True,  True,  True],
       [False,  True, False,  True,  True,  True,  True,  True, False],
       [False,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True,  True,  True,  True, 

In [21]:
my_data.feature_df.loc[train_indices[0]] * mask

| | dim_0 | dim_1 | dim_2 | dim_3 | dim_4 | dim_5 | dim_6 | dim_7 | dim_8 |
|---|---|---|---|---|---|---|---|---|---|
| 1967 | -0.660388 | 0.372892 | 0.30515 | -0.952829 | -0.0 | 0.993705 | -0.05404 | -0.078872 | -0.704167 |
| 1967 | -0.660388 | 0.316566 | 0.30515 | -0.97015 | -0.0 | 0.993705 | -0.069076 | -0.078872 | -0.543053 |
| 1967 | -0.660388 | 0.147589 | 0.218324 | -0.97015 | -0.871063 | 1.003529 | -0.069076 | -0.078872 | -0.38194 |
| 1967 | -0.660388 | 0.006774 | 0.131499 | -0.952829 | -0.880117 | 0.0 | -0.076595 | -0.078872 | -0.704167 |
| 1967 | -0.660388 | -0.0 | 0.218324 | -0.97015 | -0.880117 | 0.0 | -0.061558 | -0.078872 | -0.543053 |
| 1967 | -0.660388 | 0.0 | 0.30515 | -0.97015 | -0.880117 | 0.0 | -0.05404 | -0.078872 | -0.0 |
| 1967 | -0.0 | 0.175752 | 0.30515 | -0.97015 | -0.880117 | 1.003529 | -0.016448 | -0.078872 | -0.86528 |
| 1967 | -0.0 | 0.0631 | 0.30515 | -0.97015 | -0.898225 | 1.033001 | -0.016448 | -0.078872 | -0.784724 |
| 1967 | -0.0 | 0.26024 | 0.0 | -0.97015 | -0.880117 | 1.101768 | 0.006107 | -0.078872 | -0.0 |
| 1967 | -0.0 | 0.119426 | 0.218324 | -0.97015 | -0.871063 | 1.13124 | 0.006107 | -0.078872 | -0.62361 |


In [22]:
class ImputationDataset(Dataset):
    """Dynamically computes a missingness (noise) mask for each sample"""

    def __init__(self, data, indices, mean_mask_length=3, masking_ratio=0.15,
                 mode='separate', distribution='geometric', exclude_feats=None):
        super(ImputationDataset, self).__init__()

        self.data = data  # this is a subclass of the BaseData class in data.py
        self.IDs = indices  # list of data IDs, but also mapping between integer index and ID
        self.feature_df = self.data.feature_df.loc[self.IDs]

        self.masking_ratio = masking_ratio
        self.mean_mask_length = mean_mask_length
        self.mode = mode
        self.distribution = distribution
        self.exclude_feats = exclude_feats

    def __getitem__(self, ind):
        """
        For a given integer index, returns the corresponding (seq_length, feat_dim) array and a noise mask of the same shape.
        Args:
            ind: integer index of sample in dataset
        Returns:
            X: (seq_length, feat_dim) tensor of the multivariate time series corresponding to a sample
            mask: (seq_length, feat_dim) boolean tensor: 0s mask and predict, 1s: unaffected input
            ID: ID of sample
        """

        X = self.feature_df.loc[self.IDs[ind]].values  # (seq_length, feat_dim) array
        mask = noise_mask(X, self.masking_ratio, self.mean_mask_length, self.mode, self.distribution,
                          self.exclude_feats)  # (seq_length, feat_dim) boolean array

        return torch.from_numpy(X), torch.from_numpy(mask), self.IDs[ind]

    def update(self):
        self.mean_mask_length = min(20, self.mean_mask_length + 1)
        self.masking_ratio = min(1, self.masking_ratio + 0.05)

    def __len__(self):
        return len(self.IDs)

This dataset generator returns the input X, the noise mask, and the index.

The official code provides a pipeline that returns the data generator, collate function, and runner together.

In [23]:
# Initialize data generators
dataset_class, collate_fn, runner_class = pipeline_factory(config)    
"""
dataset_class : class that builds a dataset yielding (X, mask, index)
collate_fn : function that decides how the DataLoader combines individual samples into a batch
runner_class : class that handles the training and testing procedures
"""
val_dataset = dataset_class(val_data, val_indices)

val_loader = DataLoader(dataset=val_dataset,
                        batch_size=config['batch_size'],
                        shuffle=False,
                        num_workers=config['num_workers'],
                        pin_memory=True,
                        collate_fn=lambda x: collate_fn(x, max_len=config['max_seq_len']))

train_dataset = dataset_class(my_data, train_indices)

train_loader = DataLoader(dataset=train_dataset,
                            batch_size=config['batch_size'],
                            shuffle=True,
                            num_workers=config['num_workers'],
                            pin_memory=True,
                            collate_fn=lambda x: collate_fn(x, max_len=config['max_seq_len']))

In [24]:
X, targets, target_masks, padding_masks, IDs = next(iter(train_loader))
print(X.shape, targets.shape, target_masks.shape, padding_masks.shape)

torch.Size([32, 24, 9]) torch.Size([32, 24, 9]) torch.Size([32, 24, 9]) torch.Size([32, 24])


In [25]:
target_masks[0]

tensor([[False, False, False,  True, False, False, False,  True, False],
        [False, False, False,  True, False, False, False,  True, False],
        [False, False, False,  True, False, False,  True,  True, False],
        [False, False, False,  True, False, False,  True,  True, False],
        [False, False, False,  True, False, False, False, False, False],
        [False, False, False,  True, False, False, False, False, False],
        [False, False, False,  True, False,  True, False, False, False],
        [False, False, False, False,  True,  True, False,  True, False],
        [False, False, False, False, False,  True, False,  True, False],
        [False, False, False, False,  True,  True, False, False, False],
        [False, False,  True, False, False,  True, False, False, False],
        [False, False,  True, False, False,  True, False, False, False],
        [ True, False,  True,  True, False,  True, False, False, False],
        [ True, False, False,  True, False,  True, 

In [26]:
X[0]

tensor([[ 1.1081,  0.7953,  2.5626, -0.0000, -1.7584,  1.6028, -0.9262, -0.0000,
         -0.8653],
        [ 0.6870,  0.3166,  1.4339, -0.0000, -1.8036,  1.5733, -0.9487, -0.0000,
         -1.2681],
        [ 1.4870, -0.1622,  0.9129, -0.0000, -1.8489,  1.6028, -0.0000, -0.0000,
         -0.8653],
        [-0.1130, -0.0777,  1.0432, -0.0000, -1.8489,  1.5733, -0.0000, -0.0000,
         -1.1070],
        [-0.5341,  0.1758,  1.1734, -0.0000, -1.8308,  1.5144, -0.9562, -0.0789,
         -0.1403],
        [-0.4078,  0.6264,  1.3471, -0.0000, -1.8217,  1.5242, -0.9638, -0.0789,
         -0.3014],
        [ 0.1817,  0.4574,  1.7812, -0.0000, -1.8127,  0.0000, -0.9938, -0.0789,
         -0.7042],
        [ 0.4344,  0.3729,  1.6075, -0.7450, -0.0000,  0.0000, -1.0089, -0.0000,
          0.0208],
        [ 0.6028,  0.3447,  1.5207, -0.7623, -1.8127,  0.0000, -1.0089, -0.0000,
         -0.3819],
        [ 0.6028,  0.1476,  1.0866, -0.7796, -0.0000,  0.0000, -0.9412, -0.0789,
         -0.3014],


In [27]:
targets[0]

tensor([[ 1.1081,  0.7953,  2.5626, -0.6411, -1.7584,  1.6028, -0.9262, -0.0789,
         -0.8653],
        [ 0.6870,  0.3166,  1.4339, -0.7969, -1.8036,  1.5733, -0.9487, -0.0789,
         -1.2681],
        [ 1.4870, -0.1622,  0.9129, -0.9355, -1.8489,  1.6028, -1.0239, -0.0789,
         -0.8653],
        [-0.1130, -0.0777,  1.0432, -0.7103, -1.8489,  1.5733, -0.9713, -0.0789,
         -1.1070],
        [-0.5341,  0.1758,  1.1734, -0.7796, -1.8308,  1.5144, -0.9562, -0.0789,
         -0.1403],
        [-0.4078,  0.6264,  1.3471, -0.8143, -1.8217,  1.5242, -0.9638, -0.0789,
         -0.3014],
        [ 0.1817,  0.4574,  1.7812, -0.7277, -1.8127,  1.5537, -0.9938, -0.0789,
         -0.7042],
        [ 0.4344,  0.3729,  1.6075, -0.7450, -1.8036,  1.5831, -1.0089, -0.0789,
          0.0208],
        [ 0.6028,  0.3447,  1.5207, -0.7623, -1.8127,  1.6028, -1.0089, -0.0789,
         -0.3819],
        [ 0.6028,  0.1476,  1.0866, -0.7796, -1.6950,  1.6421, -0.9412, -0.0789,
         -0.3014],
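
Comparing the three outputs above: wherever `target_masks` is True, `X` holds 0 while `targets` keeps the original value. The numpy sketch below reproduces that relationship for a single sample, including padding up to a fixed length. It is a simplified assumption about what the collate function does, with toy values and hypothetical variable names.

```python
import numpy as np

# One sample with 3 time steps and 2 features
sample = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
# Noise mask from the dataset: True = keep the value, False = hide it
keep_mask = np.array([[True, False], [True, True], [False, True]])

max_len = 4                      # pad every sample to this length
seq_len, feat_dim = sample.shape

X = np.zeros((max_len, feat_dim))
X[:seq_len] = sample * keep_mask             # hidden values become 0 in the input

targets = np.zeros((max_len, feat_dim))
targets[:seq_len] = sample                   # targets keep the original values

target_mask = np.zeros((max_len, feat_dim), dtype=bool)
target_mask[:seq_len] = ~keep_mask           # True = position to be predicted

padding_mask = np.arange(max_len) < seq_len  # True = real time step, False = padding

print(X)
print(target_mask)
print(padding_mask)
```

In this practice all samples have 24 time steps, so the padding mask is all True; it matters only for datasets with variable-length series.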


### STEP 5. Building the Model

<img src="./image/TST03.png" width="300">

In [28]:
model = model_factory(config, my_data)

# Initialize optimizer
# L2 regularization
if config['global_reg']:
    weight_decay = config['l2_reg']
    output_reg = None
else:
    weight_decay = 0
    output_reg = config['l2_reg']

optim_class = get_optimizer(config['optimizer'])
optimizer = optim_class(model.parameters(), lr=config['lr'], weight_decay=weight_decay)

lr_step = 0  # current step index of `lr_step`
lr = config['lr']  # current learning step

# Used when loading trained weights.
if args.load_model:
    model, optimizer, start_epoch = utils.load_model(model, config['load_model'], optimizer, config['resume'],
                                                    config['change_output'],
                                                    config['lr'],
                                                    config['lr_step'],
                                                    config['lr_factor'])

model.to(device)

loss_module = get_loss_module(config)

For pre-training, the final Linear layer has 9 outputs, one per input feature.
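
The reason for 9 outputs is that the pre-training head has to reconstruct every feature at every time step. A shape-only numpy sketch of that final projection is shown below; the random arrays are stand-ins for the encoder output and the layer weights, not values from the actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, seq_len, d_model, feat_dim = 32, 24, 128, 9

# Stand-in for the encoder output: one d_model vector per time step
hidden = rng.normal(size=(batch_size, seq_len, d_model))

# Stand-in for the final linear layer (d_model -> feat_dim)
weight = rng.normal(size=(d_model, feat_dim))
bias = np.zeros(feat_dim)

reconstruction = hidden @ weight + bias
print(reconstruction.shape)  # (32, 24, 9)
```

The result has the same shape as `targets` in the batch printed earlier, which is what allows an element-wise loss on the masked positions.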

In [29]:
model

TSTransformerEncoder(
  (project_inp): Linear(in_features=9, out_features=128, bias=True)
  (pos_enc): LearnablePositionalEncoding(
    (dropout): Dropout(p=0.1, inplace=False)
  )
  (transformer_encoder): TransformerEncoder(
    (layers): ModuleList(
      (0): TransformerBatchNormEncoderLayer(
        (self_attn): MultiheadAttention(
          (out_proj): _LinearWithBias(in_features=128, out_features=128, bias=True)
        )
        (linear1): Linear(in_features=128, out_features=256, bias=True)
        (dropout): Dropout(p=0.1, inplace=False)
        (linear2): Linear(in_features=256, out_features=128, bias=True)
        (norm1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (norm2): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (dropout1): Dropout(p=0.1, inplace=False)
        (dropout2): Dropout(p=0.1, inplace=False)
      )
      (1): TransformerBatchNormEncoderLayer(
        (self_attn): Multi

### STEP 6. Pre-training

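Understanding the Trainer
- The training procedure is repeated in the same way regardless of which model or architecture is used.
- This repeated procedure can be organized into a single pipeline, and most developers and researchers build a Trainer like the one below to run training.
- Every deep learning model follows the same flow of creating the data, defining the model, and training repeatedly over multiple epochs, so the whole procedure is structured.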
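
To make the structure concrete without any deep learning library, here is a deliberately tiny trainer written in plain Python. `TinyTrainer` is a hypothetical example that fits `y = w * x` by gradient descent; it mirrors only the `train_epoch` / `evaluate` layout, not the actual runner classes.

```python
class TinyTrainer:
    """Hypothetical minimal trainer: fits y = w * x with plain gradient descent."""

    def __init__(self, data, lr=0.05):
        self.data = data   # list of (x, y) pairs
        self.w = 0.0       # the single model parameter
        self.lr = lr

    def train_epoch(self):
        """One pass over the data, updating w after every sample."""
        total = 0.0
        for x, y in self.data:
            error = self.w * x - y
            total += error ** 2
            self.w -= self.lr * 2 * error * x   # gradient of error**2 w.r.t. w
        return total / len(self.data)

    def evaluate(self):
        """Mean squared error with the current w, without updating it."""
        return sum((self.w * x - y) ** 2 for x, y in self.data) / len(self.data)

trainer = TinyTrainer([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
history = [trainer.train_epoch() for _ in range(20)]
print(trainer.w, trainer.evaluate())
```

The same three pieces (state set up once, a per-epoch training loop, a separate evaluation pass) reappear in the runner class, only with batches, a device, and an optimizer added.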

In [30]:
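# BaseRunner and l2_reg_loss are assumed to be defined in src.running and src.models.loss,
# as in the official mvts_transformer code.
from src.running import BaseRunner
from src.models.loss import l2_reg_loss

class UnsupervisedRunner(BaseRunner):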

    def train_epoch(self, epoch_num=None):
        """
        Training procedure run once per epoch
        Args:
            epoch_num: epoch number

        Returns:
            metric results for this epoch
        """
        
        '''
            STEP 1) Initial setup
        '''
        self.model = self.model.train()

        epoch_loss = 0  # total loss of the epoch
        total_active_elements = 0  # total number of active (loss-contributing) elements in the epoch
        
        '''
            STEP 2) Training procedure for one epoch
        '''
        for i, batch in enumerate(self.dataloader):
            '''
            STEP 3) Training procedure for a single batch
            '''
            X, targets, target_masks, padding_masks, IDs = batch
            targets = targets.to(self.device)
            target_masks = target_masks.to(self.device)
            padding_masks = padding_masks.to(self.device) 

            predictions = self.model(X.to(self.device), padding_masks)  # (batch_size, padded_length, feat_dim)

            # Cascade noise masks (batch_size, padded_length, feat_dim) and padding masks (batch_size, padded_length)
            target_masks = target_masks * padding_masks.unsqueeze(-1)
            loss = self.loss_module(predictions, targets, target_masks)  # (num_active,) individual loss (square error per element) for each active value in batch
            batch_loss = torch.sum(loss)
            mean_loss = batch_loss / len(loss)  # mean loss (over active elements) used for optimization

            if self.l2_reg:
                total_loss = mean_loss + self.l2_reg * l2_reg_loss(self.model)
            else:
                total_loss = mean_loss

            # Zero the gradients, run the backward pass, and update the weights.
            self.optimizer.zero_grad()
            total_loss.backward()

            torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=4.0)
            self.optimizer.step()

            metrics = {"loss": mean_loss.item()}
            if i % self.print_interval == 0:
                ending = "" if epoch_num is None else 'Epoch {} '.format(epoch_num)
                self.print_callback(i, metrics, prefix='Training ' + ending)

            with torch.no_grad():
                total_active_elements += len(loss)
                epoch_loss += batch_loss.item()  # add total loss of batch

        epoch_loss = epoch_loss / total_active_elements  # average loss per element for whole epoch
        self.epoch_metrics['epoch'] = epoch_num
        self.epoch_metrics['loss'] = epoch_loss
        return self.epoch_metrics

    def evaluate(self, epoch_num=None, keep_all=True):
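        """
        Evaluate the trained model.
        Args:
            epoch_num: epoch number
            keep_all: whether to keep per-batch information for every evaluation

        Returns:
            metrics for this epoch (plus the per-batch records when keep_all is True)
        """

        '''
            STEP 1) Initial setup
        '''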
        self.model = self.model.eval()

        epoch_loss = 0  # total loss of epoch
        total_active_elements = 0  # total unmasked elements in epoch

        if keep_all:
            per_batch = {'target_masks': [], 'targets': [], 'predictions': [], 'metrics': [], 'IDs': []}
            
        '''
            STEP 2) Evaluation procedure for one epoch
        '''
        for i, batch in enumerate(self.dataloader):
            '''
            STEP 3) Evaluation procedure for one batch
            '''
            X, targets, target_masks, padding_masks, IDs = batch
            targets = targets.to(self.device)
            target_masks = target_masks.to(self.device)  # 1s: mask and predict, 0s: unaffected input (ignore)
            padding_masks = padding_masks.to(self.device)  # 0s: ignore

            predictions = self.model(X.to(self.device), padding_masks)  # (batch_size, padded_length, feat_dim)

            # Cascade noise masks (batch_size, padded_length, feat_dim) and padding masks (batch_size, padded_length)
            target_masks = target_masks * padding_masks.unsqueeze(-1)
            loss = self.loss_module(predictions, targets, target_masks)  # (num_active,) individual loss (square error per element) for each active value in batch
            batch_loss = torch.sum(loss).cpu().item()
            mean_loss = batch_loss / len(loss)  # mean loss over the active elements of this batch (reported only, not optimized)

            if keep_all:
                per_batch['target_masks'].append(target_masks.cpu().numpy())
                per_batch['targets'].append(targets.cpu().numpy())
                per_batch['predictions'].append(predictions.cpu().numpy())
                per_batch['metrics'].append([loss.cpu().numpy()])
                per_batch['IDs'].append(IDs)

            metrics = {"loss": mean_loss}
            if i % self.print_interval == 0:
                ending = "" if epoch_num is None else 'Epoch {} '.format(epoch_num)
                self.print_callback(i, metrics, prefix='Evaluating ' + ending)

            total_active_elements += len(loss)
            epoch_loss += batch_loss  # add total loss of batch

        epoch_loss = epoch_loss / total_active_elements  # average loss per element for whole epoch
        self.epoch_metrics['epoch'] = epoch_num
        self.epoch_metrics['loss'] = epoch_loss

        if keep_all:
            return self.epoch_metrics, per_batch
        else:
            return self.epoch_metrics

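Both `train_epoch` and `evaluate` report the epoch loss as the sum of all per-element losses divided by the total number of active elements, not as the average of the per-batch means. The two values differ whenever batches contain different numbers of active elements, which is expected with random masking and padding. The following is a minimal sketch of that difference using plain Python lists. The numbers and the helper names `epoch_loss_per_element` and `mean_of_batch_means` are made up for illustration and are not part of the notebook.

```python
# Hypothetical per-element squared errors for two batches of different size.
demo_batches = [
    [4.0, 0.0, 2.0, 2.0],  # batch 0: 4 active elements, batch mean 2.0
    [10.0, 6.0],           # batch 1: 2 active elements, batch mean 8.0
]


def epoch_loss_per_element(batches):
    """Mirror the accumulation in train_epoch / evaluate: total loss / total active elements."""
    total_loss = 0.0
    total_active = 0
    for losses in batches:
        total_loss += sum(losses)    # plays the role of batch_loss
        total_active += len(losses)  # plays the role of len(loss)
    return total_loss / total_active


def mean_of_batch_means(batches):
    """Naive alternative: average the per-batch mean losses."""
    return sum(sum(b) / len(b) for b in batches) / len(batches)


per_element = epoch_loss_per_element(demo_batches)
naive = mean_of_batch_means(demo_batches)
print(per_element, naive)
```

In this toy case the element-weighted value is 4.0 while the mean of batch means is 5.0, because the naive average gives the small batch the same weight as the large one.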

In [30]:
trainer = runner_class(model, train_loader, device, loss_module, optimizer, l2_reg=output_reg,
                       print_interval=config['print_interval'], console=config['console'])
val_evaluator = runner_class(model, val_loader, device, loss_module,
                             print_interval=config['print_interval'], console=config['console'])

tensorboard_writer = SummaryWriter(config['tensorboard_dir'])

best_value = 1e16 if config['key_metric'] in NEG_METRICS else -1e16  # initialize with +inf or -inf depending on key metric
metrics = []  # (for validation) list of lists: for each epoch, stores metrics like loss, ...
best_metrics = {}

# Check the initial performance of the untrained model.
aggr_metrics_val, best_metrics, best_value = validate(val_evaluator, 
                                                      tensorboard_writer, 
                                                      config, best_metrics,
                                                      best_value, 
                                                      epoch=0)

metrics_names, metrics_values = zip(*aggr_metrics_val.items())
metrics.append(list(metrics_values))

2023-06-22 14:35:56,776 | INFO : Evaluating on validation set ...


Evaluating Epoch 0   0.0% | batch:         0 of        75	|	loss: 40.0271
Evaluating Epoch 0   1.3% | batch:         1 of        75	|	loss: 57.4144
Evaluating Epoch 0   2.7% | batch:         2 of        75	|	loss: 18.5814
Evaluating Epoch 0   4.0% | batch:         3 of        75	|	loss: 32.4551
Evaluating Epoch 0   5.3% | batch:         4 of        75	|	loss: 54.7981
Evaluating Epoch 0   6.7% | batch:         5 of        75	|	loss: 77.6586
Evaluating Epoch 0   8.0% | batch:         6 of        75	|	loss: 22.9577
Evaluating Epoch 0   9.3% | batch:         7 of        75	|	loss: 16.8651
Evaluating Epoch 0  10.7% | batch:         8 of        75	|	loss: 22.5102
Evaluating Epoch 0  12.0% | batch:         9 of        75	|	loss: 29.3689
Evaluating Epoch 0  13.3% | batch:        10 of        75	|	loss: 27.5272
Evaluating Epoch 0  14.7% | batch:        11 of        75	|	loss: 14.1647
Evaluating Epoch 0  16.0% | batch:        12 of        75	|	loss: 33.1974
Evaluating Epoch 0  17.3% | batch:    

2023-06-22 14:35:57,949 | INFO : Validation runtime: 0.0 hours, 0.0 minutes, 1.1719434261322021 seconds

2023-06-22 14:35:57,950 | INFO : Avg val. time: 0.0 hours, 0.0 minutes, 1.1719434261322021 seconds
2023-06-22 14:35:57,951 | INFO : Avg batch val. time: 0.01562591234842936 seconds
2023-06-22 14:35:57,952 | INFO : Avg sample val. time: 0.0004915870076057895 seconds
2023-06-22 14:35:57,952 | INFO : Epoch 0 Validation Summary: epoch: 0.000000 | loss: 33.298641 | 


Evaluating Epoch 0  92.0% | batch:        69 of        75	|	loss: 17.0672
Evaluating Epoch 0  93.3% | batch:        70 of        75	|	loss: 18.6113
Evaluating Epoch 0  94.7% | batch:        71 of        75	|	loss: 22.8909
Evaluating Epoch 0  96.0% | batch:        72 of        75	|	loss: 14.2727
Evaluating Epoch 0  97.3% | batch:        73 of        75	|	loss: 23.9697
Evaluating Epoch 0  98.7% | batch:        74 of        75	|	loss: 55.3534
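The cell above initializes `best_value` to `1e16` or `-1e16` depending on whether `config['key_metric']` is in `NEG_METRICS`, so that the first validation result always counts as an improvement. The internals of `validate` are not shown in this section, so the sketch below is only an assumption about how such best-value tracking typically works. `DEMO_NEG_METRICS`, `init_best` and `update_best` are hypothetical names, and the loss values are the epoch 0, 1 and 2 validation summaries from this notebook's logs.

```python
# Assumption: metrics in this set are "lower is better"
# (the notebook's NEG_METRICS is assumed to play the same role).
DEMO_NEG_METRICS = {"loss"}


def init_best(key_metric):
    """Start from a value that any real validation result will beat."""
    return 1e16 if key_metric in DEMO_NEG_METRICS else -1e16


def update_best(key_metric, metrics, best_value, best_metrics):
    """Return (best_metrics, best_value, improved) after one validation pass."""
    value = metrics[key_metric]
    if key_metric in DEMO_NEG_METRICS:
        improved = value < best_value
    else:
        improved = value > best_value
    if improved:
        return dict(metrics), value, True
    return best_metrics, best_value, False


demo_best_value = init_best("loss")
demo_best_metrics = {}
demo_history = [
    {"epoch": 0, "loss": 33.298641},
    {"epoch": 1, "loss": 0.327433},
    {"epoch": 2, "loss": 0.287541},
]
improved_flags = []
for summary in demo_history:
    demo_best_metrics, demo_best_value, improved = update_best(
        "loss", summary, demo_best_value, demo_best_metrics
    )
    improved_flags.append(improved)

print(demo_best_metrics, improved_flags)
```

For a metric where higher is better, `init_best` returns `-1e16` and the comparison flips, which is the reason the initial value depends on the key metric.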



Let's start pre-training!
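The training loop in the next cell uses a step schedule: when `epoch` equals `config['lr_step'][lr_step]`, the learning rate is multiplied by `config['lr_factor'][lr_step]` and written into every optimizer parameter group. The sketch below replays only that bookkeeping, without a model or optimizer. The function name `replay_step_schedule` and the example values (an initial rate of `1e-3`, steps at epochs 3 and 5, factors 0.1 and 0.5) are hypothetical and are not the notebook's configuration.

```python
def replay_step_schedule(initial_lr, lr_steps, lr_factors, epochs):
    """Return the learning rate held after the end-of-epoch update of each epoch."""
    current_lr = initial_lr
    step_idx = 0
    rates = []
    for epoch in range(1, epochs + 1):
        # Same condition as the training loop: act only on the scheduled epoch.
        if epoch == lr_steps[step_idx]:
            current_lr = current_lr * lr_factors[step_idx]
            # Advance to the next milestone, but never index past the last one.
            if step_idx < len(lr_steps) - 1:
                step_idx += 1
        rates.append(current_lr)
    return rates


demo_rates = replay_step_schedule(
    initial_lr=1e-3, lr_steps=[3, 5], lr_factors=[0.1, 0.5], epochs=6
)
for epoch, rate in enumerate(demo_rates, start=1):
    print("after epoch {}: lr = {:.1e}".format(epoch, rate))
```

Because the update runs after `train_epoch` for the matching epoch, the reduced rate first affects the following epoch.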

In [None]:
start_epoch = 0
total_epoch_time = 0

for epoch in tqdm(range(start_epoch + 1, config["epochs"] + 1), desc='Training Epoch', leave=False):
    mark = epoch if config['save_all'] else 'last'
    epoch_start_time = time.time()
    aggr_metrics_train = trainer.train_epoch(epoch)  # dictionary of aggregate epoch metrics
    epoch_runtime = time.time() - epoch_start_time
    print()
    print_str = 'Epoch {} Training Summary: '.format(epoch)
    for k, v in aggr_metrics_train.items():
        tensorboard_writer.add_scalar('{}/train'.format(k), v, epoch)
        print_str += '{}: {:8f} | '.format(k, v)
    logger.info(print_str)
    logger.info("Epoch runtime: {} hours, {} minutes, {} seconds\n".format(*utils.readable_time(epoch_runtime)))
    total_epoch_time += epoch_runtime
    avg_epoch_time = total_epoch_time / (epoch - start_epoch)
    avg_batch_time = avg_epoch_time / len(train_loader)
    avg_sample_time = avg_epoch_time / len(train_dataset)
    logger.info("Avg epoch train. time: {} hours, {} minutes, {} seconds".format(*utils.readable_time(avg_epoch_time)))
    logger.info("Avg batch train. time: {} seconds".format(avg_batch_time))
    logger.info("Avg sample train. time: {} seconds".format(avg_sample_time))

    # evaluate if first or last epoch or at specified interval
    if (epoch == config["epochs"]) or (epoch == start_epoch + 1) or (epoch % config['val_interval'] == 0):
        aggr_metrics_val, best_metrics, best_value = validate(val_evaluator, tensorboard_writer, config,
                                                              best_metrics, best_value, epoch)
        metrics_names, metrics_values = zip(*aggr_metrics_val.items())
        metrics.append(list(metrics_values))

    utils.save_model(os.path.join(config['save_dir'], 'model_{}.pth'.format(mark)), epoch, model, optimizer)

    # Learning rate scheduling
    if epoch == config['lr_step'][lr_step]:
        utils.save_model(os.path.join(config['save_dir'], 'model_{}.pth'.format(epoch)), epoch, model, optimizer)
        lr = lr * config['lr_factor'][lr_step]
        if lr_step < len(config['lr_step']) - 1:  # so that this index does not get out of bounds
            lr_step += 1
        logger.info('Learning rate updated to: %s', lr)
        for param_group in optimizer.param_groups:
            param_group['lr'] = lr

Training Epoch:   0%|          | 0/400 [00:00<?, ?it/s]

Training Epoch 1   0.0% | batch:         0 of       298	|	loss: 1.33582
Training Epoch 1   0.3% | batch:         1 of       298	|	loss: 0.754723
Training Epoch 1   0.7% | batch:         2 of       298	|	loss: 1.12153
Training Epoch 1   1.0% | batch:         3 of       298	|	loss: 1.12192
Training Epoch 1   1.3% | batch:         4 of       298	|	loss: 0.780175
Training Epoch 1   1.7% | batch:         5 of       298	|	loss: 0.80955
Training Epoch 1   2.0% | batch:         6 of       298	|	loss: 1.02738
Training Epoch 1   2.3% | batch:         7 of       298	|	loss: 0.836399
Training Epoch 1   2.7% | batch:         8 of       298	|	loss: 1.24348
Training Epoch 1   3.0% | batch:         9 of       298	|	loss: 1.02266
Training Epoch 1   3.4% | batch:        10 of       298	|	loss: 0.7257
Training Epoch 1   3.7% | batch:        11 of       298	|	loss: 0.978266
Training Epoch 1   4.0% | batch:        12 of       298	|	loss: 1.05213
Training Epoch 1   4.4% | batch:        13 of       298	|	los

Training Epoch 1  38.6% | batch:       115 of       298	|	loss: 0.30958
Training Epoch 1  38.9% | batch:       116 of       298	|	loss: 0.959631
Training Epoch 1  39.3% | batch:       117 of       298	|	loss: 3.00887
Training Epoch 1  39.6% | batch:       118 of       298	|	loss: 0.331467
Training Epoch 1  39.9% | batch:       119 of       298	|	loss: 0.436543
Training Epoch 1  40.3% | batch:       120 of       298	|	loss: 0.352147
Training Epoch 1  40.6% | batch:       121 of       298	|	loss: 0.489487
Training Epoch 1  40.9% | batch:       122 of       298	|	loss: 0.357692
Training Epoch 1  41.3% | batch:       123 of       298	|	loss: 1.86368
Training Epoch 1  41.6% | batch:       124 of       298	|	loss: 0.330332
Training Epoch 1  41.9% | batch:       125 of       298	|	loss: 0.414822
Training Epoch 1  42.3% | batch:       126 of       298	|	loss: 0.30075
Training Epoch 1  42.6% | batch:       127 of       298	|	loss: 0.30904
Training Epoch 1  43.0% | batch:       128 of       298	

Training Epoch 1  76.5% | batch:       228 of       298	|	loss: 0.421114
Training Epoch 1  76.8% | batch:       229 of       298	|	loss: 0.29088
Training Epoch 1  77.2% | batch:       230 of       298	|	loss: 0.318196
Training Epoch 1  77.5% | batch:       231 of       298	|	loss: 0.375216
Training Epoch 1  77.9% | batch:       232 of       298	|	loss: 0.269664
Training Epoch 1  78.2% | batch:       233 of       298	|	loss: 0.292773
Training Epoch 1  78.5% | batch:       234 of       298	|	loss: 0.340447
Training Epoch 1  78.9% | batch:       235 of       298	|	loss: 0.323675
Training Epoch 1  79.2% | batch:       236 of       298	|	loss: 0.248331
Training Epoch 1  79.5% | batch:       237 of       298	|	loss: 0.305548
Training Epoch 1  79.9% | batch:       238 of       298	|	loss: 0.21894
Training Epoch 1  80.2% | batch:       239 of       298	|	loss: 0.378518
Training Epoch 1  80.5% | batch:       240 of       298	|	loss: 0.218388
Training Epoch 1  80.9% | batch:       241 of       2

2023-06-22 14:36:13,661 | INFO : Epoch 1 Training Summary: epoch: 1.000000 | loss: 0.487118 | 
2023-06-22 14:36:13,663 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.202457189559937 seconds

2023-06-22 14:36:13,663 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.202457189559937 seconds
2023-06-22 14:36:13,664 | INFO : Avg batch train. time: 0.040947842917986366 seconds
2023-06-22 14:36:13,666 | INFO : Avg sample train. time: 0.0012798885241829176 seconds
2023-06-22 14:36:13,667 | INFO : Evaluating on validation set ...


Training Epoch 1  99.3% | batch:       296 of       298	|	loss: 0.391188
Training Epoch 1  99.7% | batch:       297 of       298	|	loss: 0.315896

Evaluating Epoch 1   0.0% | batch:         0 of        75	|	loss: 0.75289
Evaluating Epoch 1   1.3% | batch:         1 of        75	|	loss: 0.18148
Evaluating Epoch 1   2.7% | batch:         2 of        75	|	loss: 0.178394
Evaluating Epoch 1   4.0% | batch:         3 of        75	|	loss: 0.196378
Evaluating Epoch 1   5.3% | batch:         4 of        75	|	loss: 1.515
Evaluating Epoch 1   6.7% | batch:         5 of        75	|	loss: 0.219015
Evaluating Epoch 1   8.0% | batch:         6 of        75	|	loss: 0.33412
Evaluating Epoch 1   9.3% | batch:         7 of        75	|	loss: 0.184845
Evaluating Epoch 1  10.7% | batch:         8 of        75	|	loss: 0.238412
Evaluating Epoch 1  12.0% | batch:         9 of        75	|	loss: 0.363527
Evaluating Epoch 1  13.3% | batch:        10 of        75	|	loss: 0.179013
Evaluating Epoch 1  14.7% | batch:

2023-06-22 14:36:14,964 | INFO : Validation runtime: 0.0 hours, 0.0 minutes, 1.2962024211883545 seconds

2023-06-22 14:36:14,965 | INFO : Avg val. time: 0.0 hours, 0.0 minutes, 1.2340729236602783 seconds
2023-06-22 14:36:14,966 | INFO : Avg batch val. time: 0.01645430564880371 seconds
2023-06-22 14:36:14,966 | INFO : Avg sample val. time: 0.0005176480384481033 seconds
2023-06-22 14:36:14,967 | INFO : Epoch 1 Validation Summary: epoch: 1.000000 | loss: 0.327433 | 
Training Epoch:   0%|          | 1/400 [00:13<1:30:08, 13.55s/it]

Evaluating Epoch 1  92.0% | batch:        69 of        75	|	loss: 0.220971
Evaluating Epoch 1  93.3% | batch:        70 of        75	|	loss: 0.379347
Evaluating Epoch 1  94.7% | batch:        71 of        75	|	loss: 0.237577
Evaluating Epoch 1  96.0% | batch:        72 of        75	|	loss: 0.188008
Evaluating Epoch 1  97.3% | batch:        73 of        75	|	loss: 0.204049
Evaluating Epoch 1  98.7% | batch:        74 of        75	|	loss: 0.894837

Training Epoch 2   0.0% | batch:         0 of       298	|	loss: 0.268593
Training Epoch 2   0.3% | batch:         1 of       298	|	loss: 1.51081
Training Epoch 2   0.7% | batch:         2 of       298	|	loss: 0.634883
Training Epoch 2   1.0% | batch:         3 of       298	|	loss: 0.258298
Training Epoch 2   1.3% | batch:         4 of       298	|	loss: 1.02152
Training Epoch 2   1.7% | batch:         5 of       298	|	loss: 0.341819
Training Epoch 2   2.0% | batch:         6 of       298	|	loss: 2.11456
Training Epoch 2   2.3% | batch:         

Training Epoch 2  37.2% | batch:       111 of       298	|	loss: 0.24864
Training Epoch 2  37.6% | batch:       112 of       298	|	loss: 0.419456
Training Epoch 2  37.9% | batch:       113 of       298	|	loss: 0.417861
Training Epoch 2  38.3% | batch:       114 of       298	|	loss: 1.21881
Training Epoch 2  38.6% | batch:       115 of       298	|	loss: 0.25325
Training Epoch 2  38.9% | batch:       116 of       298	|	loss: 0.250232
Training Epoch 2  39.3% | batch:       117 of       298	|	loss: 0.348275
Training Epoch 2  39.6% | batch:       118 of       298	|	loss: 0.556507
Training Epoch 2  39.9% | batch:       119 of       298	|	loss: 0.252694
Training Epoch 2  40.3% | batch:       120 of       298	|	loss: 0.244956
Training Epoch 2  40.6% | batch:       121 of       298	|	loss: 1.18818
Training Epoch 2  40.9% | batch:       122 of       298	|	loss: 0.262909
Training Epoch 2  41.3% | batch:       123 of       298	|	loss: 0.320965
Training Epoch 2  41.6% | batch:       124 of       298

Training Epoch 2  76.2% | batch:       227 of       298	|	loss: 0.275449
Training Epoch 2  76.5% | batch:       228 of       298	|	loss: 0.209561
Training Epoch 2  76.8% | batch:       229 of       298	|	loss: 0.331591
Training Epoch 2  77.2% | batch:       230 of       298	|	loss: 0.253285
Training Epoch 2  77.5% | batch:       231 of       298	|	loss: 0.265752
Training Epoch 2  77.9% | batch:       232 of       298	|	loss: 0.235184
Training Epoch 2  78.2% | batch:       233 of       298	|	loss: 0.205687
Training Epoch 2  78.5% | batch:       234 of       298	|	loss: 0.493829
Training Epoch 2  78.9% | batch:       235 of       298	|	loss: 0.234795
Training Epoch 2  79.2% | batch:       236 of       298	|	loss: 0.264788
Training Epoch 2  79.5% | batch:       237 of       298	|	loss: 0.341761
Training Epoch 2  79.9% | batch:       238 of       298	|	loss: 0.202651
Training Epoch 2  80.2% | batch:       239 of       298	|	loss: 0.228505
Training Epoch 2  80.5% | batch:       240 of      

2023-06-22 14:36:27,295 | INFO : Epoch 2 Training Summary: epoch: 2.000000 | loss: 0.421315 | 
2023-06-22 14:36:27,297 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.281769752502441 seconds

2023-06-22 14:36:27,298 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.242113471031189 seconds
2023-06-22 14:36:27,299 | INFO : Avg batch train. time: 0.041080917688024125 seconds
2023-06-22 14:36:27,299 | INFO : Avg sample train. time: 0.0012840479831163403 seconds
2023-06-22 14:36:27,300 | INFO : Evaluating on validation set ...


Training Epoch 2  98.3% | batch:       293 of       298	|	loss: 0.310825
Training Epoch 2  98.7% | batch:       294 of       298	|	loss: 0.261435
Training Epoch 2  99.0% | batch:       295 of       298	|	loss: 0.42802
Training Epoch 2  99.3% | batch:       296 of       298	|	loss: 0.355879
Training Epoch 2  99.7% | batch:       297 of       298	|	loss: 0.268051

Evaluating Epoch 2   0.0% | batch:         0 of        75	|	loss: 0.309279
Evaluating Epoch 2   1.3% | batch:         1 of        75	|	loss: 1.0088
Evaluating Epoch 2   2.7% | batch:         2 of        75	|	loss: 0.197185
Evaluating Epoch 2   4.0% | batch:         3 of        75	|	loss: 0.304054
Evaluating Epoch 2   5.3% | batch:         4 of        75	|	loss: 0.234757
Evaluating Epoch 2   6.7% | batch:         5 of        75	|	loss: 0.230714
Evaluating Epoch 2   8.0% | batch:         6 of        75	|	loss: 0.265792
Evaluating Epoch 2   9.3% | batch:         7 of        75	|	loss: 0.182102
Evaluating Epoch 2  10.7% | batch:   

2023-06-22 14:36:28,563 | INFO : Validation runtime: 0.0 hours, 0.0 minutes, 1.262603998184204 seconds

2023-06-22 14:36:28,564 | INFO : Avg val. time: 0.0 hours, 0.0 minutes, 1.2435832818349202 seconds
2023-06-22 14:36:28,565 | INFO : Avg batch val. time: 0.0165811104244656 seconds
2023-06-22 14:36:28,565 | INFO : Avg sample val. time: 0.0005216372826488759 seconds
2023-06-22 14:36:28,566 | INFO : Epoch 2 Validation Summary: epoch: 2.000000 | loss: 0.287541 | 


Evaluating Epoch 2  85.3% | batch:        64 of        75	|	loss: 0.808197
Evaluating Epoch 2  86.7% | batch:        65 of        75	|	loss: 0.197834
Evaluating Epoch 2  88.0% | batch:        66 of        75	|	loss: 0.173561
Evaluating Epoch 2  89.3% | batch:        67 of        75	|	loss: 0.202207
Evaluating Epoch 2  90.7% | batch:        68 of        75	|	loss: 0.201962
Evaluating Epoch 2  92.0% | batch:        69 of        75	|	loss: 0.198868
Evaluating Epoch 2  93.3% | batch:        70 of        75	|	loss: 0.177642
Evaluating Epoch 2  94.7% | batch:        71 of        75	|	loss: 0.237155
Evaluating Epoch 2  96.0% | batch:        72 of        75	|	loss: 0.183432
Evaluating Epoch 2  97.3% | batch:        73 of        75	|	loss: 0.199276
Evaluating Epoch 2  98.7% | batch:        74 of        75	|	loss: 0.422934



Training Epoch:   0%|          | 2/400 [00:27<1:30:13, 13.60s/it]

Training Epoch 3   0.0% | batch:         0 of       298	|	loss: 0.279529
Training Epoch 3   0.3% | batch:         1 of       298	|	loss: 0.371718
Training Epoch 3   0.7% | batch:         2 of       298	|	loss: 0.210916
Training Epoch 3   1.0% | batch:         3 of       298	|	loss: 1.13954
Training Epoch 3   1.3% | batch:         4 of       298	|	loss: 0.311659
Training Epoch 3   1.7% | batch:         5 of       298	|	loss: 0.432885
Training Epoch 3   2.0% | batch:         6 of       298	|	loss: 0.189457
Training Epoch 3   2.3% | batch:         7 of       298	|	loss: 0.210895
Training Epoch 3   2.7% | batch:         8 of       298	|	loss: 0.274103
Training Epoch 3   3.0% | batch:         9 of       298	|	loss: 0.20468
Training Epoch 3   3.4% | batch:        10 of       298	|	loss: 0.615169
Training Epoch 3   3.7% | batch:        11 of       298	|	loss: 0.412208
Training Epoch 3   4.0% | batch:        12 of       298	|	loss: 0.296321
Training Epoch 3   4.4% | batch:        13 of       2

Training Epoch 3  38.3% | batch:       114 of       298	|	loss: 0.220564
Training Epoch 3  38.6% | batch:       115 of       298	|	loss: 0.213466
Training Epoch 3  38.9% | batch:       116 of       298	|	loss: 0.254513
Training Epoch 3  39.3% | batch:       117 of       298	|	loss: 0.220948
Training Epoch 3  39.6% | batch:       118 of       298	|	loss: 0.29086
Training Epoch 3  39.9% | batch:       119 of       298	|	loss: 0.203691
Training Epoch 3  40.3% | batch:       120 of       298	|	loss: 0.265271
Training Epoch 3  40.6% | batch:       121 of       298	|	loss: 0.254087
Training Epoch 3  40.9% | batch:       122 of       298	|	loss: 0.2703
Training Epoch 3  41.3% | batch:       123 of       298	|	loss: 0.208527
Training Epoch 3  41.6% | batch:       124 of       298	|	loss: 0.237054
Training Epoch 3  41.9% | batch:       125 of       298	|	loss: 0.217623
Training Epoch 3  42.3% | batch:       126 of       298	|	loss: 0.372317
Training Epoch 3  42.6% | batch:       127 of       29

Training Epoch 3  77.2% | batch:       230 of       298	|	loss: 0.304992
Training Epoch 3  77.5% | batch:       231 of       298	|	loss: 0.197569
Training Epoch 3  77.9% | batch:       232 of       298	|	loss: 0.224784
Training Epoch 3  78.2% | batch:       233 of       298	|	loss: 0.303042
Training Epoch 3  78.5% | batch:       234 of       298	|	loss: 0.251046
Training Epoch 3  78.9% | batch:       235 of       298	|	loss: 0.197351
Training Epoch 3  79.2% | batch:       236 of       298	|	loss: 0.183253
Training Epoch 3  79.5% | batch:       237 of       298	|	loss: 0.237731
Training Epoch 3  79.9% | batch:       238 of       298	|	loss: 0.202186
Training Epoch 3  80.2% | batch:       239 of       298	|	loss: 0.28995
Training Epoch 3  80.5% | batch:       240 of       298	|	loss: 0.240728
Training Epoch 3  80.9% | batch:       241 of       298	|	loss: 0.335338
Training Epoch 3  81.2% | batch:       242 of       298	|	loss: 0.373469
Training Epoch 3  81.5% | batch:       243 of       

2023-06-22 14:36:40,955 | INFO : Epoch 3 Training Summary: epoch: 3.000000 | loss: 0.342327 | 
2023-06-22 14:36:40,956 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.305079936981201 seconds

2023-06-22 14:36:40,957 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.263102293014526 seconds
2023-06-22 14:36:40,957 | INFO : Avg batch train. time: 0.04115134997655882 seconds
2023-06-22 14:36:40,959 | INFO : Avg sample train. time: 0.0012862494538509047 seconds
Training Epoch:   1%|          | 3/400 [00:39<1:26:10, 13.02s/it]

Training Epoch 3  98.7% | batch:       294 of       298	|	loss: 0.244461
Training Epoch 3  99.0% | batch:       295 of       298	|	loss: 1.52216
Training Epoch 3  99.3% | batch:       296 of       298	|	loss: 0.265747
Training Epoch 3  99.7% | batch:       297 of       298	|	loss: 0.301884

Training Epoch 4   0.0% | batch:         0 of       298	|	loss: 0.266762
Training Epoch 4   0.3% | batch:         1 of       298	|	loss: 0.250684
Training Epoch 4   0.7% | batch:         2 of       298	|	loss: 0.182544
Training Epoch 4   1.0% | batch:         3 of       298	|	loss: 0.252123
Training Epoch 4   1.3% | batch:         4 of       298	|	loss: 0.192592
Training Epoch 4   1.7% | batch:         5 of       298	|	loss: 0.198487
Training Epoch 4   2.0% | batch:         6 of       298	|	loss: 0.253433
Training Epoch 4   2.3% | batch:         7 of       298	|	loss: 0.192754
Training Epoch 4   2.7% | batch:         8 of       298	|	loss: 0.440513
Training Epoch 4   3.0% | batch:         9 of      

Training Epoch 4  36.9% | batch:       110 of       298	|	loss: 0.240061
Training Epoch 4  37.2% | batch:       111 of       298	|	loss: 0.202381
Training Epoch 4  37.6% | batch:       112 of       298	|	loss: 0.174844
Training Epoch 4  37.9% | batch:       113 of       298	|	loss: 1.17596
Training Epoch 4  38.3% | batch:       114 of       298	|	loss: 0.192122
Training Epoch 4  38.6% | batch:       115 of       298	|	loss: 0.284857
Training Epoch 4  38.9% | batch:       116 of       298	|	loss: 0.32138
Training Epoch 4  39.3% | batch:       117 of       298	|	loss: 0.266311
Training Epoch 4  39.6% | batch:       118 of       298	|	loss: 0.220978
Training Epoch 4  39.9% | batch:       119 of       298	|	loss: 0.270461
Training Epoch 4  40.3% | batch:       120 of       298	|	loss: 0.252154
Training Epoch 4  40.6% | batch:       121 of       298	|	loss: 0.204451
Training Epoch 4  40.9% | batch:       122 of       298	|	loss: 0.206054
Training Epoch 4  41.3% | batch:       123 of       2

Training Epoch 4  75.2% | batch:       224 of       298	|	loss: 0.519565
Training Epoch 4  75.5% | batch:       225 of       298	|	loss: 0.22076
Training Epoch 4  75.8% | batch:       226 of       298	|	loss: 0.19362
Training Epoch 4  76.2% | batch:       227 of       298	|	loss: 0.254066
Training Epoch 4  76.5% | batch:       228 of       298	|	loss: 0.284582
Training Epoch 4  76.8% | batch:       229 of       298	|	loss: 0.289846
Training Epoch 4  77.2% | batch:       230 of       298	|	loss: 0.230557
Training Epoch 4  77.5% | batch:       231 of       298	|	loss: 0.223497
Training Epoch 4  77.9% | batch:       232 of       298	|	loss: 0.378636
Training Epoch 4  78.2% | batch:       233 of       298	|	loss: 0.249605
Training Epoch 4  78.5% | batch:       234 of       298	|	loss: 2.36367
Training Epoch 4  78.9% | batch:       235 of       298	|	loss: 0.179184
Training Epoch 4  79.2% | batch:       236 of       298	|	loss: 0.232067
Training Epoch 4  79.5% | batch:       237 of       29

2023-06-22 14:36:53,518 | INFO : Epoch 4 Training Summary: epoch: 4.000000 | loss: 0.328881 | 
2023-06-22 14:36:53,520 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.5333993434906 seconds

2023-06-22 14:36:53,521 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.330676555633545 seconds
2023-06-22 14:36:53,521 | INFO : Avg batch train. time: 0.04137810924709243 seconds
2023-06-22 14:36:53,522 | INFO : Avg sample train. time: 0.0012933371675722199 seconds
2023-06-22 14:36:53,523 | INFO : Evaluating on validation set ...


Training Epoch 4  99.7% | batch:       297 of       298	|	loss: 0.316035

Evaluating Epoch 4   0.0% | batch:         0 of        75	|	loss: 0.301788
Evaluating Epoch 4   1.3% | batch:         1 of        75	|	loss: 0.175836
Evaluating Epoch 4   2.7% | batch:         2 of        75	|	loss: 0.206766
Evaluating Epoch 4   4.0% | batch:         3 of        75	|	loss: 0.187433
Evaluating Epoch 4   5.3% | batch:         4 of        75	|	loss: 0.280598
Evaluating Epoch 4   6.7% | batch:         5 of        75	|	loss: 0.887641
Evaluating Epoch 4   8.0% | batch:         6 of        75	|	loss: 0.263469
Evaluating Epoch 4   9.3% | batch:         7 of        75	|	loss: 0.227116
Evaluating Epoch 4  10.7% | batch:         8 of        75	|	loss: 0.20757
Evaluating Epoch 4  12.0% | batch:         9 of        75	|	loss: 0.209115
Evaluating Epoch 4  13.3% | batch:        10 of        75	|	loss: 0.175525
Evaluating Epoch 4  14.7% | batch:        11 of        75	|	loss: 0.162932
Evaluating Epoch 4  16.0% |

2023-06-22 14:36:54,754 | INFO : Validation runtime: 0.0 hours, 0.0 minutes, 1.2295327186584473 seconds

2023-06-22 14:36:54,755 | INFO : Avg val. time: 0.0 hours, 0.0 minutes, 1.240070641040802 seconds
2023-06-22 14:36:54,756 | INFO : Avg batch val. time: 0.01653427521387736 seconds
2023-06-22 14:36:54,757 | INFO : Avg sample val. time: 0.0005201638594969807 seconds
2023-06-22 14:36:54,758 | INFO : Epoch 4 Validation Summary: epoch: 4.000000 | loss: 0.261742 | 


Evaluating Epoch 4  85.3% | batch:        64 of        75	|	loss: 0.33487
Evaluating Epoch 4  86.7% | batch:        65 of        75	|	loss: 0.172901
Evaluating Epoch 4  88.0% | batch:        66 of        75	|	loss: 0.200612
Evaluating Epoch 4  89.3% | batch:        67 of        75	|	loss: 0.234759
Evaluating Epoch 4  90.7% | batch:        68 of        75	|	loss: 0.23187
Evaluating Epoch 4  92.0% | batch:        69 of        75	|	loss: 0.219548
Evaluating Epoch 4  93.3% | batch:        70 of        75	|	loss: 0.242654
Evaluating Epoch 4  94.7% | batch:        71 of        75	|	loss: 0.298192
Evaluating Epoch 4  96.0% | batch:        72 of        75	|	loss: 0.134447
Evaluating Epoch 4  97.3% | batch:        73 of        75	|	loss: 0.183118
Evaluating Epoch 4  98.7% | batch:        74 of        75	|	loss: 0.195597



Training Epoch:   1%|          | 4/400 [00:53<1:28:03, 13.34s/it]

Training Epoch 5   0.0% | batch:         0 of       298	|	loss: 0.20592
Training Epoch 5   0.3% | batch:         1 of       298	|	loss: 0.328299
Training Epoch 5   0.7% | batch:         2 of       298	|	loss: 0.444079
Training Epoch 5   1.0% | batch:         3 of       298	|	loss: 0.230073
Training Epoch 5   1.3% | batch:         4 of       298	|	loss: 0.203472
Training Epoch 5   1.7% | batch:         5 of       298	|	loss: 0.211755
Training Epoch 5   2.0% | batch:         6 of       298	|	loss: 0.350292
Training Epoch 5   2.3% | batch:         7 of       298	|	loss: 0.2487
Training Epoch 5   2.7% | batch:         8 of       298	|	loss: 0.240486
Training Epoch 5   3.0% | batch:         9 of       298	|	loss: 0.199285
Training Epoch 5   3.4% | batch:        10 of       298	|	loss: 0.216571
Training Epoch 5   3.7% | batch:        11 of       298	|	loss: 0.189944
Training Epoch 5   4.0% | batch:        12 of       298	|	loss: 0.268289
Training Epoch 5   4.4% | batch:        13 of       29

Training Epoch 5  38.3% | batch:       114 of       298	|	loss: 0.169618
Training Epoch 5  38.6% | batch:       115 of       298	|	loss: 3.19508
Training Epoch 5  38.9% | batch:       116 of       298	|	loss: 0.166001
Training Epoch 5  39.3% | batch:       117 of       298	|	loss: 0.204961
Training Epoch 5  39.6% | batch:       118 of       298	|	loss: 0.194937
Training Epoch 5  39.9% | batch:       119 of       298	|	loss: 0.204045
Training Epoch 5  40.3% | batch:       120 of       298	|	loss: 0.193591
Training Epoch 5  40.6% | batch:       121 of       298	|	loss: 0.450938
Training Epoch 5  40.9% | batch:       122 of       298	|	loss: 0.311377
Training Epoch 5  41.3% | batch:       123 of       298	|	loss: 0.234092
Training Epoch 5  41.6% | batch:       124 of       298	|	loss: 0.238457
Training Epoch 5  41.9% | batch:       125 of       298	|	loss: 0.218444
Training Epoch 5  42.3% | batch:       126 of       298	|	loss: 0.178134
Training Epoch 5  42.6% | batch:       127 of       

Training Epoch 5  76.2% | batch:       227 of       298	|	loss: 0.190578
Training Epoch 5  76.5% | batch:       228 of       298	|	loss: 1.66122
Training Epoch 5  76.8% | batch:       229 of       298	|	loss: 0.184362
Training Epoch 5  77.2% | batch:       230 of       298	|	loss: 0.299724
Training Epoch 5  77.5% | batch:       231 of       298	|	loss: 0.333276
Training Epoch 5  77.9% | batch:       232 of       298	|	loss: 0.206621
Training Epoch 5  78.2% | batch:       233 of       298	|	loss: 0.194996
Training Epoch 5  78.5% | batch:       234 of       298	|	loss: 0.172641
Training Epoch 5  78.9% | batch:       235 of       298	|	loss: 0.281533
Training Epoch 5  79.2% | batch:       236 of       298	|	loss: 0.278307
Training Epoch 5  79.5% | batch:       237 of       298	|	loss: 0.382944
Training Epoch 5  79.9% | batch:       238 of       298	|	loss: 0.260896
Training Epoch 5  80.2% | batch:       239 of       298	|	loss: 0.230439
Training Epoch 5  80.5% | batch:       240 of       

2023-06-22 14:37:07,090 | INFO : Epoch 5 Training Summary: epoch: 5.000000 | loss: 0.346913 | 
2023-06-22 14:37:07,091 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.276024103164673 seconds

2023-06-22 14:37:07,092 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.319746065139771 seconds
2023-06-22 14:37:07,093 | INFO : Avg batch train. time: 0.04134142974879118 seconds
2023-06-22 14:37:07,094 | INFO : Avg sample train. time: 0.0012921906927983817 seconds
Training Epoch:   1%|▏         | 5/400 [01:05<1:25:21, 12.97s/it]

Training Epoch 5  98.7% | batch:       294 of       298	|	loss: 0.231846
Training Epoch 5  99.0% | batch:       295 of       298	|	loss: 0.20824
Training Epoch 5  99.3% | batch:       296 of       298	|	loss: 0.252757
Training Epoch 5  99.7% | batch:       297 of       298	|	loss: 0.261742

Training Epoch 6   0.0% | batch:         0 of       298	|	loss: 0.218621
Training Epoch 6   0.3% | batch:         1 of       298	|	loss: 0.196486
Training Epoch 6   0.7% | batch:         2 of       298	|	loss: 0.241971
Training Epoch 6   1.0% | batch:         3 of       298	|	loss: 0.160663
Training Epoch 6   1.3% | batch:         4 of       298	|	loss: 0.230506
Training Epoch 6   1.7% | batch:         5 of       298	|	loss: 0.221683
Training Epoch 6   2.0% | batch:         6 of       298	|	loss: 0.181086
Training Epoch 6   2.3% | batch:         7 of       298	|	loss: 0.234168
Training Epoch 6   2.7% | batch:         8 of       298	|	loss: 0.265864
Training Epoch 6   3.0% | batch:         9 of      

Training Epoch 6  38.3% | batch:       114 of       298	|	loss: 0.347293
Training Epoch 6  38.6% | batch:       115 of       298	|	loss: 0.217038
Training Epoch 6  38.9% | batch:       116 of       298	|	loss: 0.194251
Training Epoch 6  39.3% | batch:       117 of       298	|	loss: 0.293132
Training Epoch 6  39.6% | batch:       118 of       298	|	loss: 0.163807
Training Epoch 6  39.9% | batch:       119 of       298	|	loss: 0.239559
Training Epoch 6  40.3% | batch:       120 of       298	|	loss: 0.217992
Training Epoch 6  40.6% | batch:       121 of       298	|	loss: 0.225045
Training Epoch 6  40.9% | batch:       122 of       298	|	loss: 0.217467
Training Epoch 6  41.3% | batch:       123 of       298	|	loss: 0.198244
Training Epoch 6  41.6% | batch:       124 of       298	|	loss: 0.20565
Training Epoch 6  41.9% | batch:       125 of       298	|	loss: 0.164773
Training Epoch 6  42.3% | batch:       126 of       298	|	loss: 0.193942
Training Epoch 6  42.6% | batch:       127 of       

Training Epoch 6  77.2% | batch:       230 of       298	|	loss: 0.344699
Training Epoch 6  77.5% | batch:       231 of       298	|	loss: 0.457438
Training Epoch 6  77.9% | batch:       232 of       298	|	loss: 0.190649
Training Epoch 6  78.2% | batch:       233 of       298	|	loss: 0.225969
Training Epoch 6  78.5% | batch:       234 of       298	|	loss: 0.253651
Training Epoch 6  78.9% | batch:       235 of       298	|	loss: 0.208365
Training Epoch 6  79.2% | batch:       236 of       298	|	loss: 0.253597
Training Epoch 6  79.5% | batch:       237 of       298	|	loss: 0.170661
Training Epoch 6  79.9% | batch:       238 of       298	|	loss: 0.239916
Training Epoch 6  80.2% | batch:       239 of       298	|	loss: 0.217956
Training Epoch 6  80.5% | batch:       240 of       298	|	loss: 0.221312
Training Epoch 6  80.9% | batch:       241 of       298	|	loss: 0.417191
Training Epoch 6  81.2% | batch:       242 of       298	|	loss: 0.211567
Training Epoch 6  81.5% | batch:       243 of      

2023-06-22 14:37:19,334 | INFO : Epoch 6 Training Summary: epoch: 6.000000 | loss: 0.318625 | 
2023-06-22 14:37:19,336 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.221484899520874 seconds

2023-06-22 14:37:19,337 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.303369204203287 seconds
2023-06-22 14:37:19,338 | INFO : Avg batch train. time: 0.041286473839608345 seconds
2023-06-22 14:37:19,339 | INFO : Avg sample train. time: 0.0012904729603737453 seconds
2023-06-22 14:37:19,339 | INFO : Evaluating on validation set ...

Training Epoch 6  99.7% | batch:       297 of       298	|	loss: 0.196057

Evaluating Epoch 6   0.0% | batch:         0 of        75	|	loss: 0.190291
Evaluating Epoch 6   1.3% | batch:         1 of        75	|	loss: 0.158777
Evaluating Epoch 6   2.7% | batch:         2 of        75	|	loss: 0.137948
Evaluating Epoch 6   4.0% | batch:         3 of        75	|	loss: 0.134823
Evaluating Epoch 6   5.3% | batch:         4 of        75	|	loss: 0.16636
Evaluating Epoch 6   6.7% | batch:         5 of        75	|	loss: 0.163638
Evaluating Epoch 6   8.0% | batch:         6 of        75	|	loss: 0.201189
Evaluating Epoch 6   9.3% | batch:         7 of        75	|	loss: 0.169551
Evaluating Epoch 6  10.7% | batch:         8 of        75	|	loss: 0.135026
Evaluating Epoch 6  12.0% | batch:         9 of        75	|	loss: 0.1724
Evaluating Epoch 6  13.3% | batch:        10 of        75	|	loss: 0.210595
Evaluating Epoch 6  14.7% | batch:        11 of        75	|	loss: 0.142871
Evaluating Epoch 6  16.0% | b

2023-06-22 14:37:20,544 | INFO : Validation runtime: 0.0 hours, 0.0 minutes, 1.203620433807373 seconds

2023-06-22 14:37:20,545 | INFO : Avg val. time: 0.0 hours, 0.0 minutes, 1.2327805995941161 seconds
2023-06-22 14:37:20,546 | INFO : Avg batch val. time: 0.01643707466125488 seconds
2023-06-22 14:37:20,546 | INFO : Avg sample val. time: 0.0005171059562055856 seconds
2023-06-22 14:37:20,547 | INFO : Epoch 6 Validation Summary: epoch: 6.000000 | loss: 0.369616 | 
Training Epoch:   2%|▏         | 6/400 [01:19<1:26:13, 13.13s/it]

Evaluating Epoch 6  84.0% | batch:        63 of        75	|	loss: 0.229898
Evaluating Epoch 6  85.3% | batch:        64 of        75	|	loss: 0.243926
Evaluating Epoch 6  86.7% | batch:        65 of        75	|	loss: 0.245487
Evaluating Epoch 6  88.0% | batch:        66 of        75	|	loss: 0.172253
Evaluating Epoch 6  89.3% | batch:        67 of        75	|	loss: 0.17313
Evaluating Epoch 6  90.7% | batch:        68 of        75	|	loss: 0.160164
Evaluating Epoch 6  92.0% | batch:        69 of        75	|	loss: 0.171194
Evaluating Epoch 6  93.3% | batch:        70 of        75	|	loss: 0.219519
Evaluating Epoch 6  94.7% | batch:        71 of        75	|	loss: 0.18957
Evaluating Epoch 6  96.0% | batch:        72 of        75	|	loss: 0.168218
Evaluating Epoch 6  97.3% | batch:        73 of        75	|	loss: 0.382789
Evaluating Epoch 6  98.7% | batch:        74 of        75	|	loss: 0.371724

Training Epoch 7   0.0% | batch:         0 of       298	|	loss: 0.203903
Training Epoch 7   0.3% | ba

Training Epoch 7  35.2% | batch:       105 of       298	|	loss: 0.198161
Training Epoch 7  35.6% | batch:       106 of       298	|	loss: 0.207965
Training Epoch 7  35.9% | batch:       107 of       298	|	loss: 0.242525
Training Epoch 7  36.2% | batch:       108 of       298	|	loss: 0.193
Training Epoch 7  36.6% | batch:       109 of       298	|	loss: 0.251719
Training Epoch 7  36.9% | batch:       110 of       298	|	loss: 0.202894
Training Epoch 7  37.2% | batch:       111 of       298	|	loss: 0.218785
Training Epoch 7  37.6% | batch:       112 of       298	|	loss: 0.256165
Training Epoch 7  37.9% | batch:       113 of       298	|	loss: 0.24058
Training Epoch 7  38.3% | batch:       114 of       298	|	loss: 0.368052
Training Epoch 7  38.6% | batch:       115 of       298	|	loss: 0.269781
Training Epoch 7  38.9% | batch:       116 of       298	|	loss: 0.247392
Training Epoch 7  39.3% | batch:       117 of       298	|	loss: 0.161102
Training Epoch 7  39.6% | batch:       118 of       298

Training Epoch 7  74.5% | batch:       222 of       298	|	loss: 0.246164
Training Epoch 7  74.8% | batch:       223 of       298	|	loss: 0.242352
Training Epoch 7  75.2% | batch:       224 of       298	|	loss: 0.161504
Training Epoch 7  75.5% | batch:       225 of       298	|	loss: 0.396263
Training Epoch 7  75.8% | batch:       226 of       298	|	loss: 0.201643
Training Epoch 7  76.2% | batch:       227 of       298	|	loss: 0.140625
Training Epoch 7  76.5% | batch:       228 of       298	|	loss: 0.1981
Training Epoch 7  76.8% | batch:       229 of       298	|	loss: 0.245804
Training Epoch 7  77.2% | batch:       230 of       298	|	loss: 0.218342
Training Epoch 7  77.5% | batch:       231 of       298	|	loss: 0.17114
Training Epoch 7  77.9% | batch:       232 of       298	|	loss: 0.196355
Training Epoch 7  78.2% | batch:       233 of       298	|	loss: 0.162693
Training Epoch 7  78.5% | batch:       234 of       298	|	loss: 0.194713
Training Epoch 7  78.9% | batch:       235 of       29

2023-06-22 14:37:32,618 | INFO : Epoch 7 Training Summary: epoch: 7.000000 | loss: 0.324510 | 
2023-06-22 14:37:32,619 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.054327249526978 seconds

2023-06-22 14:37:32,620 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.267791782106672 seconds
2023-06-22 14:37:32,621 | INFO : Avg batch train. time: 0.041167086517136485 seconds
2023-06-22 14:37:32,621 | INFO : Avg sample train. time: 0.0012867413239046226 seconds
Training Epoch:   2%|▏         | 7/400 [01:31<1:23:46, 12.79s/it]

Training Epoch 7  98.3% | batch:       293 of       298	|	loss: 0.218895
Training Epoch 7  98.7% | batch:       294 of       298	|	loss: 0.248616
Training Epoch 7  99.0% | batch:       295 of       298	|	loss: 0.188016
Training Epoch 7  99.3% | batch:       296 of       298	|	loss: 0.266405
Training Epoch 7  99.7% | batch:       297 of       298	|	loss: 0.142636

Training Epoch 8   0.0% | batch:         0 of       298	|	loss: 0.230752
Training Epoch 8   0.3% | batch:         1 of       298	|	loss: 0.207936
Training Epoch 8   0.7% | batch:         2 of       298	|	loss: 0.182118
Training Epoch 8   1.0% | batch:         3 of       298	|	loss: 0.19014
Training Epoch 8   1.3% | batch:         4 of       298	|	loss: 0.25244
Training Epoch 8   1.7% | batch:         5 of       298	|	loss: 0.16706
Training Epoch 8   2.0% | batch:         6 of       298	|	loss: 0.275529
Training Epoch 8   2.3% | batch:         7 of       298	|	loss: 0.185189
Training Epoch 8   2.7% | batch:         8 of       2

Training Epoch 8  37.2% | batch:       111 of       298	|	loss: 0.243911
Training Epoch 8  37.6% | batch:       112 of       298	|	loss: 0.396552
Training Epoch 8  37.9% | batch:       113 of       298	|	loss: 0.192669
Training Epoch 8  38.3% | batch:       114 of       298	|	loss: 1.67212
Training Epoch 8  38.6% | batch:       115 of       298	|	loss: 0.223709
Training Epoch 8  38.9% | batch:       116 of       298	|	loss: 0.515805
Training Epoch 8  39.3% | batch:       117 of       298	|	loss: 0.185379
Training Epoch 8  39.6% | batch:       118 of       298	|	loss: 0.233983
Training Epoch 8  39.9% | batch:       119 of       298	|	loss: 1.115
Training Epoch 8  40.3% | batch:       120 of       298	|	loss: 0.255725
Training Epoch 8  40.6% | batch:       121 of       298	|	loss: 0.186913
Training Epoch 8  40.9% | batch:       122 of       298	|	loss: 0.204739
Training Epoch 8  41.3% | batch:       123 of       298	|	loss: 0.198818
Training Epoch 8  41.6% | batch:       124 of       298

Training Epoch 8  75.2% | batch:       224 of       298	|	loss: 0.4319
Training Epoch 8  75.5% | batch:       225 of       298	|	loss: 0.193783
Training Epoch 8  75.8% | batch:       226 of       298	|	loss: 0.223585
Training Epoch 8  76.2% | batch:       227 of       298	|	loss: 0.193688
Training Epoch 8  76.5% | batch:       228 of       298	|	loss: 0.185033
Training Epoch 8  76.8% | batch:       229 of       298	|	loss: 0.170312
Training Epoch 8  77.2% | batch:       230 of       298	|	loss: 0.321624
Training Epoch 8  77.5% | batch:       231 of       298	|	loss: 0.18507
Training Epoch 8  77.9% | batch:       232 of       298	|	loss: 0.196522
Training Epoch 8  78.2% | batch:       233 of       298	|	loss: 0.163041
Training Epoch 8  78.5% | batch:       234 of       298	|	loss: 0.455762
Training Epoch 8  78.9% | batch:       235 of       298	|	loss: 0.204564
Training Epoch 8  79.2% | batch:       236 of       298	|	loss: 0.259189
Training Epoch 8  79.5% | batch:       237 of       29

2023-06-22 14:37:44,873 | INFO : Epoch 8 Training Summary: epoch: 8.000000 | loss: 0.312519 | 
2023-06-22 14:37:44,874 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.220973014831543 seconds

2023-06-22 14:37:44,875 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.26193943619728 seconds
2023-06-22 14:37:44,875 | INFO : Avg batch train. time: 0.041147447772474095 seconds
2023-06-22 14:37:44,876 | INFO : Avg sample train. time: 0.0012861274843924146 seconds
2023-06-22 14:37:44,877 | INFO : Evaluating on validation set ...

Training Epoch 8  98.7% | batch:       294 of       298	|	loss: 0.213928
Training Epoch 8  99.0% | batch:       295 of       298	|	loss: 0.277559
Training Epoch 8  99.3% | batch:       296 of       298	|	loss: 0.292778
Training Epoch 8  99.7% | batch:       297 of       298	|	loss: 0.192449

Evaluating Epoch 8   0.0% | batch:         0 of        75	|	loss: 0.24204
Evaluating Epoch 8   1.3% | batch:         1 of        75	|	loss: 0.343431
Evaluating Epoch 8   2.7% | batch:         2 of        75	|	loss: 0.23402
Evaluating Epoch 8   4.0% | batch:         3 of        75	|	loss: 0.158374
Evaluating Epoch 8   5.3% | batch:         4 of        75	|	loss: 0.137181
Evaluating Epoch 8   6.7% | batch:         5 of        75	|	loss: 0.332998
Evaluating Epoch 8   8.0% | batch:         6 of        75	|	loss: 0.312874
Evaluating Epoch 8   9.3% | batch:         7 of        75	|	loss: 0.300691
Evaluating Epoch 8  10.7% | batch:         8 of        75	|	loss: 0.163014
Evaluating Epoch 8  12.0% | batch:

2023-06-22 14:37:46,107 | INFO : Validation runtime: 0.0 hours, 0.0 minutes, 1.2296059131622314 seconds

2023-06-22 14:37:46,108 | INFO : Avg val. time: 0.0 hours, 0.0 minutes, 1.232251485188802 seconds
2023-06-22 14:37:46,109 | INFO : Avg batch val. time: 0.01643001980251736 seconds
2023-06-22 14:37:46,110 | INFO : Avg sample val. time: 0.000516884012243625 seconds
2023-06-22 14:37:46,111 | INFO : Epoch 8 Validation Summary: epoch: 8.000000 | loss: 0.421244 | 
Training Epoch:   2%|▏         | 8/400 [01:44<1:25:00, 13.01s/it]

Evaluating Epoch 8  90.7% | batch:        68 of        75	|	loss: 0.187474
Evaluating Epoch 8  92.0% | batch:        69 of        75	|	loss: 0.198219
Evaluating Epoch 8  93.3% | batch:        70 of        75	|	loss: 0.219111
Evaluating Epoch 8  94.7% | batch:        71 of        75	|	loss: 0.210799
Evaluating Epoch 8  96.0% | batch:        72 of        75	|	loss: 0.114155
Evaluating Epoch 8  97.3% | batch:        73 of        75	|	loss: 0.341037
Evaluating Epoch 8  98.7% | batch:        74 of        75	|	loss: 0.17377

Training Epoch 9   0.0% | batch:         0 of       298	|	loss: 0.46713
Training Epoch 9   0.3% | batch:         1 of       298	|	loss: 0.547712
Training Epoch 9   0.7% | batch:         2 of       298	|	loss: 0.218154
Training Epoch 9   1.0% | batch:         3 of       298	|	loss: 0.216219
Training Epoch 9   1.3% | batch:         4 of       298	|	loss: 0.234035
Training Epoch 9   1.7% | batch:         5 of       298	|	loss: 0.319391
Training Epoch 9   2.0% | batch:      

Training Epoch 9  35.9% | batch:       107 of       298	|	loss: 0.201196
Training Epoch 9  36.2% | batch:       108 of       298	|	loss: 0.167067
Training Epoch 9  36.6% | batch:       109 of       298	|	loss: 0.16983
Training Epoch 9  36.9% | batch:       110 of       298	|	loss: 0.244748
Training Epoch 9  37.2% | batch:       111 of       298	|	loss: 0.198581
Training Epoch 9  37.6% | batch:       112 of       298	|	loss: 0.208022
Training Epoch 9  37.9% | batch:       113 of       298	|	loss: 0.255789
Training Epoch 9  38.3% | batch:       114 of       298	|	loss: 0.195802
Training Epoch 9  38.6% | batch:       115 of       298	|	loss: 2.00121
Training Epoch 9  38.9% | batch:       116 of       298	|	loss: 2.17276
Training Epoch 9  39.3% | batch:       117 of       298	|	loss: 7.94597
Training Epoch 9  39.6% | batch:       118 of       298	|	loss: 0.185745
Training Epoch 9  39.9% | batch:       119 of       298	|	loss: 0.201486
Training Epoch 9  40.3% | batch:       120 of       298

Training Epoch 9  75.5% | batch:       225 of       298	|	loss: 0.163196
Training Epoch 9  75.8% | batch:       226 of       298	|	loss: 0.275672
Training Epoch 9  76.2% | batch:       227 of       298	|	loss: 0.308228
Training Epoch 9  76.5% | batch:       228 of       298	|	loss: 0.209477
Training Epoch 9  76.8% | batch:       229 of       298	|	loss: 0.365727
Training Epoch 9  77.2% | batch:       230 of       298	|	loss: 0.211901
Training Epoch 9  77.5% | batch:       231 of       298	|	loss: 0.201749
Training Epoch 9  77.9% | batch:       232 of       298	|	loss: 0.206119
Training Epoch 9  78.2% | batch:       233 of       298	|	loss: 0.737889
Training Epoch 9  78.5% | batch:       234 of       298	|	loss: 0.284569
Training Epoch 9  78.9% | batch:       235 of       298	|	loss: 0.219803
Training Epoch 9  79.2% | batch:       236 of       298	|	loss: 0.234753
Training Epoch 9  79.5% | batch:       237 of       298	|	loss: 0.409397
Training Epoch 9  79.9% | batch:       238 of      

2023-06-22 14:37:58,597 | INFO : Epoch 9 Training Summary: epoch: 9.000000 | loss: 0.326952 | 
2023-06-22 14:37:58,598 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.455394983291626 seconds

2023-06-22 14:37:58,599 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.283434496985542 seconds
2023-06-22 14:37:58,600 | INFO : Avg batch train. time: 0.041219578848944775 seconds
2023-06-22 14:37:58,601 | INFO : Avg sample train. time: 0.0012883820533863585 seconds
Training Epoch:   2%|▏         | 9/400 [01:57<1:23:43, 12.85s/it]

Training Epoch 9  98.3% | batch:       293 of       298	|	loss: 0.276695
Training Epoch 9  98.7% | batch:       294 of       298	|	loss: 0.149899
Training Epoch 9  99.0% | batch:       295 of       298	|	loss: 0.529277
Training Epoch 9  99.3% | batch:       296 of       298	|	loss: 0.186953
Training Epoch 9  99.7% | batch:       297 of       298	|	loss: 0.774705

Training Epoch 10   0.0% | batch:         0 of       298	|	loss: 0.174564
Training Epoch 10   0.3% | batch:         1 of       298	|	loss: 0.218953
Training Epoch 10   0.7% | batch:         2 of       298	|	loss: 0.193251
Training Epoch 10   1.0% | batch:         3 of       298	|	loss: 0.197531
Training Epoch 10   1.3% | batch:         4 of       298	|	loss: 0.130351
Training Epoch 10   1.7% | batch:         5 of       298	|	loss: 0.355452
Training Epoch 10   2.0% | batch:         6 of       298	|	loss: 0.233636
Training Epoch 10   2.3% | batch:         7 of       298	|	loss: 0.202525
Training Epoch 10   2.7% | batch:         

Training Epoch 10  36.9% | batch:       110 of       298	|	loss: 0.171441
Training Epoch 10  37.2% | batch:       111 of       298	|	loss: 2.08625
Training Epoch 10  37.6% | batch:       112 of       298	|	loss: 0.320077
Training Epoch 10  37.9% | batch:       113 of       298	|	loss: 0.238955
Training Epoch 10  38.3% | batch:       114 of       298	|	loss: 0.261396
Training Epoch 10  38.6% | batch:       115 of       298	|	loss: 0.161828
Training Epoch 10  38.9% | batch:       116 of       298	|	loss: 0.240074
Training Epoch 10  39.3% | batch:       117 of       298	|	loss: 0.357485
Training Epoch 10  39.6% | batch:       118 of       298	|	loss: 0.277759
Training Epoch 10  39.9% | batch:       119 of       298	|	loss: 0.186901
Training Epoch 10  40.3% | batch:       120 of       298	|	loss: 0.187827
Training Epoch 10  40.6% | batch:       121 of       298	|	loss: 0.182721
Training Epoch 10  40.9% | batch:       122 of       298	|	loss: 0.192557
Training Epoch 10  41.3% | batch:      

Training Epoch 10  74.5% | batch:       222 of       298	|	loss: 0.195155
Training Epoch 10  74.8% | batch:       223 of       298	|	loss: 0.210635
Training Epoch 10  75.2% | batch:       224 of       298	|	loss: 1.09334
Training Epoch 10  75.5% | batch:       225 of       298	|	loss: 0.173561
Training Epoch 10  75.8% | batch:       226 of       298	|	loss: 0.197644
Training Epoch 10  76.2% | batch:       227 of       298	|	loss: 1.06373
Training Epoch 10  76.5% | batch:       228 of       298	|	loss: 0.142216
Training Epoch 10  76.8% | batch:       229 of       298	|	loss: 0.268287
Training Epoch 10  77.2% | batch:       230 of       298	|	loss: 0.240734
Training Epoch 10  77.5% | batch:       231 of       298	|	loss: 0.270219
Training Epoch 10  77.9% | batch:       232 of       298	|	loss: 0.223687
Training Epoch 10  78.2% | batch:       233 of       298	|	loss: 0.253427
Training Epoch 10  78.5% | batch:       234 of       298	|	loss: 0.210582
Training Epoch 10  78.9% | batch:       

2023-06-22 14:38:10,869 | INFO : Epoch 10 Training Summary: epoch: 10.000000 | loss: 0.329171 | 
2023-06-22 14:38:10,871 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.242393255233765 seconds

2023-06-22 14:38:10,872 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.279330372810364 seconds
2023-06-22 14:38:10,873 | INFO : Avg batch train. time: 0.041205806620169004 seconds
2023-06-22 14:38:10,873 | INFO : Avg sample train. time: 0.001287951580953468 seconds
2023-06-22 14:38:10,874 | INFO : Evaluating on validation set ...

Training Epoch 10  98.7% | batch:       294 of       298	|	loss: 0.239367
Training Epoch 10  99.0% | batch:       295 of       298	|	loss: 0.190274
Training Epoch 10  99.3% | batch:       296 of       298	|	loss: 0.167482
Training Epoch 10  99.7% | batch:       297 of       298	|	loss: 0.235509

Evaluating Epoch 10   0.0% | batch:         0 of        75	|	loss: 0.223632
Evaluating Epoch 10   1.3% | batch:         1 of        75	|	loss: 0.244392
Evaluating Epoch 10   2.7% | batch:         2 of        75	|	loss: 0.226972
Evaluating Epoch 10   4.0% | batch:         3 of        75	|	loss: 0.199473
Evaluating Epoch 10   5.3% | batch:         4 of        75	|	loss: 0.251148
Evaluating Epoch 10   6.7% | batch:         5 of        75	|	loss: 0.179265
Evaluating Epoch 10   8.0% | batch:         6 of        75	|	loss: 0.380925
Evaluating Epoch 10   9.3% | batch:         7 of        75	|	loss: 0.168283
Evaluating Epoch 10  10.7% | batch:         8 of        75	|	loss: 0.167238
Evaluating Epoch 10

2023-06-22 14:38:12,061 | INFO : Validation runtime: 0.0 hours, 0.0 minutes, 1.1866626739501953 seconds

2023-06-22 14:38:12,062 | INFO : Avg val. time: 0.0 hours, 0.0 minutes, 1.225738797869001 seconds
2023-06-22 14:38:12,063 | INFO : Avg batch val. time: 0.01634318397158668 seconds
2023-06-22 14:38:12,064 | INFO : Avg sample val. time: 0.0005141521803141783 seconds
2023-06-22 14:38:12,064 | INFO : Epoch 10 Validation Summary: epoch: 10.000000 | loss: 0.288224 | 
Training Epoch:   2%|▎         | 10/400 [02:10<1:24:44, 13.04s/it]

Evaluating Epoch 10  93.3% | batch:        70 of        75	|	loss: 0.126027
Evaluating Epoch 10  94.7% | batch:        71 of        75	|	loss: 0.183043
Evaluating Epoch 10  96.0% | batch:        72 of        75	|	loss: 0.122077
Evaluating Epoch 10  97.3% | batch:        73 of        75	|	loss: 0.200899
Evaluating Epoch 10  98.7% | batch:        74 of        75	|	loss: 0.204311

Training Epoch 11   0.0% | batch:         0 of       298	|	loss: 0.190821
Training Epoch 11   0.3% | batch:         1 of       298	|	loss: 0.390651
Training Epoch 11   0.7% | batch:         2 of       298	|	loss: 0.197235
Training Epoch 11   1.0% | batch:         3 of       298	|	loss: 0.185638
Training Epoch 11   1.3% | batch:         4 of       298	|	loss: 0.205213
Training Epoch 11   1.7% | batch:         5 of       298	|	loss: 0.210049
Training Epoch 11   2.0% | batch:         6 of       298	|	loss: 0.208708
Training Epoch 11   2.3% | batch:         7 of       298	|	loss: 0.213721
Training Epoch 11   2.7% | 

Training Epoch 11  36.9% | batch:       110 of       298	|	loss: 0.227433
Training Epoch 11  37.2% | batch:       111 of       298	|	loss: 0.153943
Training Epoch 11  37.6% | batch:       112 of       298	|	loss: 0.223449
Training Epoch 11  37.9% | batch:       113 of       298	|	loss: 0.195653
Training Epoch 11  38.3% | batch:       114 of       298	|	loss: 0.205305
Training Epoch 11  38.6% | batch:       115 of       298	|	loss: 0.136172
Training Epoch 11  38.9% | batch:       116 of       298	|	loss: 0.290341
Training Epoch 11  39.3% | batch:       117 of       298	|	loss: 0.313948
Training Epoch 11  39.6% | batch:       118 of       298	|	loss: 0.193773
Training Epoch 11  39.9% | batch:       119 of       298	|	loss: 0.207364
Training Epoch 11  40.3% | batch:       120 of       298	|	loss: 0.187085
Training Epoch 11  40.6% | batch:       121 of       298	|	loss: 0.22314
Training Epoch 11  40.9% | batch:       122 of       298	|	loss: 0.166601
Training Epoch 11  41.3% | batch:      

Training Epoch 11  75.5% | batch:       225 of       298	|	loss: 0.283784
Training Epoch 11  75.8% | batch:       226 of       298	|	loss: 0.432467
Training Epoch 11  76.2% | batch:       227 of       298	|	loss: 0.180473
Training Epoch 11  76.5% | batch:       228 of       298	|	loss: 0.263147
Training Epoch 11  76.8% | batch:       229 of       298	|	loss: 0.178884
Training Epoch 11  77.2% | batch:       230 of       298	|	loss: 0.175245
Training Epoch 11  77.5% | batch:       231 of       298	|	loss: 0.176286
Training Epoch 11  77.9% | batch:       232 of       298	|	loss: 0.184351
Training Epoch 11  78.2% | batch:       233 of       298	|	loss: 0.24975
Training Epoch 11  78.5% | batch:       234 of       298	|	loss: 0.381122
Training Epoch 11  78.9% | batch:       235 of       298	|	loss: 0.323352
Training Epoch 11  79.2% | batch:       236 of       298	|	loss: 0.17772
Training Epoch 11  79.5% | batch:       237 of       298	|	loss: 0.521201
Training Epoch 11  79.9% | batch:       

2023-06-22 14:38:24,323 | INFO : Epoch 11 Training Summary: epoch: 11.000000 | loss: 0.320425 | 
2023-06-22 14:38:24,324 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.23210096359253 seconds

2023-06-22 14:38:24,325 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.275036790154196 seconds
2023-06-22 14:38:24,326 | INFO : Avg batch train. time: 0.041191398624678514 seconds
2023-06-22 14:38:24,327 | INFO : Avg sample train. time: 0.0012875012366429827 seconds
Training Epoch:   3%|▎         | 11/400 [02:22<1:22:58, 12.80s/it]

Training Epoch 11  98.3% | batch:       293 of       298	|	loss: 0.233226
Training Epoch 11  98.7% | batch:       294 of       298	|	loss: 0.190881
Training Epoch 11  99.0% | batch:       295 of       298	|	loss: 0.202362
Training Epoch 11  99.3% | batch:       296 of       298	|	loss: 0.189186
Training Epoch 11  99.7% | batch:       297 of       298	|	loss: 0.25468

Training Epoch 12   0.0% | batch:         0 of       298	|	loss: 0.230975
Training Epoch 12   0.3% | batch:         1 of       298	|	loss: 0.280066
Training Epoch 12   0.7% | batch:         2 of       298	|	loss: 0.236435
Training Epoch 12   1.0% | batch:         3 of       298	|	loss: 0.417578
Training Epoch 12   1.3% | batch:         4 of       298	|	loss: 0.199417
Training Epoch 12   1.7% | batch:         5 of       298	|	loss: 0.167222
Training Epoch 12   2.0% | batch:         6 of       298	|	loss: 0.168877
Training Epoch 12   2.3% | batch:         7 of       298	|	loss: 0.214877
Training Epoch 12   2.7% | batch:     

Training Epoch 12  35.6% | batch:       106 of       298	|	loss: 0.19739
Training Epoch 12  35.9% | batch:       107 of       298	|	loss: 0.229162
Training Epoch 12  36.2% | batch:       108 of       298	|	loss: 0.14183
Training Epoch 12  36.6% | batch:       109 of       298	|	loss: 0.177289
Training Epoch 12  36.9% | batch:       110 of       298	|	loss: 0.212801
Training Epoch 12  37.2% | batch:       111 of       298	|	loss: 0.173869
Training Epoch 12  37.6% | batch:       112 of       298	|	loss: 0.166407
Training Epoch 12  37.9% | batch:       113 of       298	|	loss: 0.277556
Training Epoch 12  38.3% | batch:       114 of       298	|	loss: 0.321158
Training Epoch 12  38.6% | batch:       115 of       298	|	loss: 0.235617
Training Epoch 12  38.9% | batch:       116 of       298	|	loss: 0.230623
Training Epoch 12  39.3% | batch:       117 of       298	|	loss: 0.180508
Training Epoch 12  39.6% | batch:       118 of       298	|	loss: 0.184356
Training Epoch 12  39.9% | batch:       

Training Epoch 12  73.5% | batch:       219 of       298	|	loss: 0.167584
Training Epoch 12  73.8% | batch:       220 of       298	|	loss: 0.157603
Training Epoch 12  74.2% | batch:       221 of       298	|	loss: 0.281774
Training Epoch 12  74.5% | batch:       222 of       298	|	loss: 0.160298
Training Epoch 12  74.8% | batch:       223 of       298	|	loss: 0.193477
Training Epoch 12  75.2% | batch:       224 of       298	|	loss: 0.198611
Training Epoch 12  75.5% | batch:       225 of       298	|	loss: 0.266436
Training Epoch 12  75.8% | batch:       226 of       298	|	loss: 0.19191
Training Epoch 12  76.2% | batch:       227 of       298	|	loss: 0.188342
Training Epoch 12  76.5% | batch:       228 of       298	|	loss: 0.204717
Training Epoch 12  76.8% | batch:       229 of       298	|	loss: 0.247647
Training Epoch 12  77.2% | batch:       230 of       298	|	loss: 0.214623
Training Epoch 12  77.5% | batch:       231 of       298	|	loss: 0.176039
Training Epoch 12  77.9% | batch:      

2023-06-22 14:38:36,937 | INFO : Epoch 12 Training Summary: epoch: 12.000000 | loss: 0.299670 | 
2023-06-22 14:38:36,938 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.592979907989502 seconds

2023-06-22 14:38:36,939 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.301532049973806 seconds
2023-06-22 14:38:36,940 | INFO : Avg batch train. time: 0.04128030889252955 seconds
2023-06-22 14:38:36,940 | INFO : Avg sample train. time: 0.001290280265363311 seconds
2023-06-22 14:38:36,941 | INFO : Evaluating on validation set ...

Training Epoch 12  99.0% | batch:       295 of       298	|	loss: 0.18809
Training Epoch 12  99.3% | batch:       296 of       298	|	loss: 0.308542
Training Epoch 12  99.7% | batch:       297 of       298	|	loss: 0.208229

Evaluating Epoch 12   0.0% | batch:         0 of        75	|	loss: 0.170041
Evaluating Epoch 12   1.3% | batch:         1 of        75	|	loss: 0.278595
Evaluating Epoch 12   2.7% | batch:         2 of        75	|	loss: 0.159068
Evaluating Epoch 12   4.0% | batch:         3 of        75	|	loss: 0.334611
Evaluating Epoch 12   5.3% | batch:         4 of        75	|	loss: 0.166447
Evaluating Epoch 12   6.7% | batch:         5 of        75	|	loss: 0.474977
Evaluating Epoch 12   8.0% | batch:         6 of        75	|	loss: 0.134769
Evaluating Epoch 12   9.3% | batch:         7 of        75	|	loss: 0.157703
Evaluating Epoch 12  10.7% | batch:         8 of        75	|	loss: 0.166243
Evaluating Epoch 12  12.0% | batch:         9 of        75	|	loss: 0.206473
Evaluating Epoch 1

2023-06-22 14:38:38,148 | INFO : Validation runtime: 0.0 hours, 0.0 minutes, 1.2060422897338867 seconds

2023-06-22 14:38:38,149 | INFO : Avg val. time: 0.0 hours, 0.0 minutes, 1.2232767343521118 seconds
2023-06-22 14:38:38,150 | INFO : Avg batch val. time: 0.01631035645802816 seconds
2023-06-22 14:38:38,150 | INFO : Avg sample val. time: 0.0005131194355503825 seconds
2023-06-22 14:38:38,151 | INFO : Epoch 12 Validation Summary: epoch: 12.000000 | loss: 0.425412 | 
Training Epoch:   3%|▎         | 12/400 [02:36<1:24:46, 13.11s/it]

Evaluating Epoch 12  97.3% | batch:        73 of        75	|	loss: 0.394946
Evaluating Epoch 12  98.7% | batch:        74 of        75	|	loss: 0.542281

Training Epoch 13   0.0% | batch:         0 of       298	|	loss: 0.191238
Training Epoch 13   0.3% | batch:         1 of       298	|	loss: 0.233242
Training Epoch 13   0.7% | batch:         2 of       298	|	loss: 0.359509
Training Epoch 13   1.0% | batch:         3 of       298	|	loss: 0.163149
Training Epoch 13   1.3% | batch:         4 of       298	|	loss: 0.283244
Training Epoch 13   1.7% | batch:         5 of       298	|	loss: 0.240411
Training Epoch 13   2.0% | batch:         6 of       298	|	loss: 0.185399
Training Epoch 13   2.3% | batch:         7 of       298	|	loss: 0.198212
Training Epoch 13   2.7% | batch:         8 of       298	|	loss: 0.521111
Training Epoch 13   3.0% | batch:         9 of       298	|	loss: 0.206528
Training Epoch 13   3.4% | batch:        10 of       298	|	loss: 0.178718
Training Epoch 13   3.7% | batch:

Training Epoch 13  36.6% | batch:       109 of       298	|	loss: 0.480858
Training Epoch 13  36.9% | batch:       110 of       298	|	loss: 0.524399
Training Epoch 13  37.2% | batch:       111 of       298	|	loss: 0.149014
Training Epoch 13  37.6% | batch:       112 of       298	|	loss: 0.24688
Training Epoch 13  37.9% | batch:       113 of       298	|	loss: 0.235805
Training Epoch 13  38.3% | batch:       114 of       298	|	loss: 0.219976
Training Epoch 13  38.6% | batch:       115 of       298	|	loss: 0.23932
Training Epoch 13  38.9% | batch:       116 of       298	|	loss: 0.214242
Training Epoch 13  39.3% | batch:       117 of       298	|	loss: 0.247188
Training Epoch 13  39.6% | batch:       118 of       298	|	loss: 0.178585
Training Epoch 13  39.9% | batch:       119 of       298	|	loss: 0.32721
Training Epoch 13  40.3% | batch:       120 of       298	|	loss: 0.655018
Training Epoch 13  40.6% | batch:       121 of       298	|	loss: 0.229144
Training Epoch 13  40.9% | batch:       1

Training Epoch 13  73.8% | batch:       220 of       298	|	loss: 0.191892
Training Epoch 13  74.2% | batch:       221 of       298	|	loss: 0.687809
Training Epoch 13  74.5% | batch:       222 of       298	|	loss: 0.332306
Training Epoch 13  74.8% | batch:       223 of       298	|	loss: 0.238418
Training Epoch 13  75.2% | batch:       224 of       298	|	loss: 0.198684
Training Epoch 13  75.5% | batch:       225 of       298	|	loss: 0.202377
Training Epoch 13  75.8% | batch:       226 of       298	|	loss: 0.162404
Training Epoch 13  76.2% | batch:       227 of       298	|	loss: 0.220664
Training Epoch 13  76.5% | batch:       228 of       298	|	loss: 0.210166
Training Epoch 13  76.8% | batch:       229 of       298	|	loss: 0.203992
Training Epoch 13  77.2% | batch:       230 of       298	|	loss: 0.41802
Training Epoch 13  77.5% | batch:       231 of       298	|	loss: 0.175264
Training Epoch 13  77.9% | batch:       232 of       298	|	loss: 0.224231
Training Epoch 13  78.2% | batch:      

2023-06-22 14:38:50,271 | INFO : Epoch 13 Training Summary: epoch: 13.000000 | loss: 0.269732 | 
2023-06-22 14:38:50,273 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.100887537002563 seconds

2023-06-22 14:38:50,274 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.286097856668325 seconds
2023-06-22 14:38:50,275 | INFO : Avg batch train. time: 0.04122851629754472 seconds
2023-06-22 14:38:50,276 | INFO : Avg sample train. time: 0.0012886614072444226 seconds
Training Epoch:   3%|▎         | 13/400 [02:48<1:22:40, 12.82s/it]

Training Epoch 13  98.7% | batch:       294 of       298	|	loss: 0.173892
Training Epoch 13  99.0% | batch:       295 of       298	|	loss: 0.268044
Training Epoch 13  99.3% | batch:       296 of       298	|	loss: 0.256925
Training Epoch 13  99.7% | batch:       297 of       298	|	loss: 0.284239

Training Epoch 14   0.0% | batch:         0 of       298	|	loss: 0.171345
Training Epoch 14   0.3% | batch:         1 of       298	|	loss: 0.682506
Training Epoch 14   0.7% | batch:         2 of       298	|	loss: 0.164687
Training Epoch 14   1.0% | batch:         3 of       298	|	loss: 0.541261
Training Epoch 14   1.3% | batch:         4 of       298	|	loss: 0.24068
Training Epoch 14   1.7% | batch:         5 of       298	|	loss: 0.164771
Training Epoch 14   2.0% | batch:         6 of       298	|	loss: 0.236048
Training Epoch 14   2.3% | batch:         7 of       298	|	loss: 0.150848
Training Epoch 14   2.7% | batch:         8 of       298	|	loss: 0.194115
Training Epoch 14   3.0% | batch:     

Training Epoch 14  37.2% | batch:       111 of       298	|	loss: 0.192875
Training Epoch 14  37.6% | batch:       112 of       298	|	loss: 0.203188
Training Epoch 14  37.9% | batch:       113 of       298	|	loss: 0.152731
Training Epoch 14  38.3% | batch:       114 of       298	|	loss: 0.188324
Training Epoch 14  38.6% | batch:       115 of       298	|	loss: 0.181321
Training Epoch 14  38.9% | batch:       116 of       298	|	loss: 0.488643
Training Epoch 14  39.3% | batch:       117 of       298	|	loss: 0.265967
Training Epoch 14  39.6% | batch:       118 of       298	|	loss: 0.343911
Training Epoch 14  39.9% | batch:       119 of       298	|	loss: 0.275396
Training Epoch 14  40.3% | batch:       120 of       298	|	loss: 0.366838
Training Epoch 14  40.6% | batch:       121 of       298	|	loss: 0.173037
Training Epoch 14  40.9% | batch:       122 of       298	|	loss: 0.207927
Training Epoch 14  41.3% | batch:       123 of       298	|	loss: 0.147564
Training Epoch 14  41.6% | batch:     

Training Epoch 14  74.5% | batch:       222 of       298	|	loss: 0.27806
Training Epoch 14  74.8% | batch:       223 of       298	|	loss: 0.216972
Training Epoch 14  75.2% | batch:       224 of       298	|	loss: 0.311236
Training Epoch 14  75.5% | batch:       225 of       298	|	loss: 0.193385
Training Epoch 14  75.8% | batch:       226 of       298	|	loss: 0.233595
Training Epoch 14  76.2% | batch:       227 of       298	|	loss: 0.180029
Training Epoch 14  76.5% | batch:       228 of       298	|	loss: 0.25216
Training Epoch 14  76.8% | batch:       229 of       298	|	loss: 0.190619
Training Epoch 14  77.2% | batch:       230 of       298	|	loss: 0.265056
Training Epoch 14  77.5% | batch:       231 of       298	|	loss: 0.183717
Training Epoch 14  77.9% | batch:       232 of       298	|	loss: 0.251198
Training Epoch 14  78.2% | batch:       233 of       298	|	loss: 0.331616
Training Epoch 14  78.5% | batch:       234 of       298	|	loss: 0.490015
Training Epoch 14  78.9% | batch:       

2023-06-22 14:39:02,450 | INFO : Epoch 14 Training Summary: epoch: 14.000000 | loss: 0.297982 | 
2023-06-22 14:39:02,452 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.134496688842773 seconds

2023-06-22 14:39:02,452 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.275269201823644 seconds
2023-06-22 14:39:02,453 | INFO : Avg batch train. time: 0.04119217852960954 seconds
2023-06-22 14:39:02,453 | INFO : Avg sample train. time: 0.001287525613784733 seconds
2023-06-22 14:39:02,454 | INFO : Evaluating on validation set ...


Training Epoch 14  99.0% | batch:       295 of       298	|	loss: 0.398719
Training Epoch 14  99.3% | batch:       296 of       298	|	loss: 0.378947
Training Epoch 14  99.7% | batch:       297 of       298	|	loss: 0.172578

Evaluating Epoch 14   0.0% | batch:         0 of        75	|	loss: 0.228403
Evaluating Epoch 14   1.3% | batch:         1 of        75	|	loss: 0.218066
Evaluating Epoch 14   2.7% | batch:         2 of        75	|	loss: 0.141804
Evaluating Epoch 14   4.0% | batch:         3 of        75	|	loss: 0.140044
Evaluating Epoch 14   5.3% | batch:         4 of        75	|	loss: 1.36802
Evaluating Epoch 14   6.7% | batch:         5 of        75	|	loss: 0.151313
Evaluating Epoch 14   8.0% | batch:         6 of        75	|	loss: 0.168725
Evaluating Epoch 14   9.3% | batch:         7 of        75	|	loss: 0.137639
Evaluating Epoch 14  10.7% | batch:         8 of        75	|	loss: 0.212806
Evaluating Epoch 14  12.0% | batch:         9 of        75	|	loss: 0.226685
Evaluating Epoch 1

2023-06-22 14:39:03,664 | INFO : Validation runtime: 0.0 hours, 0.0 minutes, 1.2096996307373047 seconds

2023-06-22 14:39:03,665 | INFO : Avg val. time: 0.0 hours, 0.0 minutes, 1.2217681672837999 seconds
2023-06-22 14:39:03,666 | INFO : Avg batch val. time: 0.016290242230450665 seconds
2023-06-22 14:39:03,666 | INFO : Avg sample val. time: 0.0005124866473505872 seconds
2023-06-22 14:39:03,667 | INFO : Epoch 14 Validation Summary: epoch: 14.000000 | loss: 0.280626 | 
Training Epoch:   4%|▎         | 14/400 [03:02<1:23:34, 12.99s/it]

Evaluating Epoch 14  93.3% | batch:        70 of        75	|	loss: 0.149805
Evaluating Epoch 14  94.7% | batch:        71 of        75	|	loss: 0.147547
Evaluating Epoch 14  96.0% | batch:        72 of        75	|	loss: 0.104209
Evaluating Epoch 14  97.3% | batch:        73 of        75	|	loss: 0.169496
Evaluating Epoch 14  98.7% | batch:        74 of        75	|	loss: 0.262936

Training Epoch 15   0.0% | batch:         0 of       298	|	loss: 0.245627
Training Epoch 15   0.3% | batch:         1 of       298	|	loss: 0.250631
Training Epoch 15   0.7% | batch:         2 of       298	|	loss: 0.152307
Training Epoch 15   1.0% | batch:         3 of       298	|	loss: 0.206389
Training Epoch 15   1.3% | batch:         4 of       298	|	loss: 0.22924
Training Epoch 15   1.7% | batch:         5 of       298	|	loss: 0.161285
Training Epoch 15   2.0% | batch:         6 of       298	|	loss: 0.363861
Training Epoch 15   2.3% | batch:         7 of       298	|	loss: 0.654482
Training Epoch 15   2.7% | b

Training Epoch 15  36.9% | batch:       110 of       298	|	loss: 0.168155
Training Epoch 15  37.2% | batch:       111 of       298	|	loss: 0.186602
Training Epoch 15  37.6% | batch:       112 of       298	|	loss: 0.174721
Training Epoch 15  37.9% | batch:       113 of       298	|	loss: 0.25965
Training Epoch 15  38.3% | batch:       114 of       298	|	loss: 0.173999
Training Epoch 15  38.6% | batch:       115 of       298	|	loss: 0.253742
Training Epoch 15  38.9% | batch:       116 of       298	|	loss: 0.158119
Training Epoch 15  39.3% | batch:       117 of       298	|	loss: 0.15712
Training Epoch 15  39.6% | batch:       118 of       298	|	loss: 0.167584
Training Epoch 15  39.9% | batch:       119 of       298	|	loss: 0.201689
Training Epoch 15  40.3% | batch:       120 of       298	|	loss: 0.270078
Training Epoch 15  40.6% | batch:       121 of       298	|	loss: 0.278463
Training Epoch 15  40.9% | batch:       122 of       298	|	loss: 0.231957
Training Epoch 15  41.3% | batch:       

Training Epoch 15  75.2% | batch:       224 of       298	|	loss: 0.2475
Training Epoch 15  75.5% | batch:       225 of       298	|	loss: 0.257643
Training Epoch 15  75.8% | batch:       226 of       298	|	loss: 0.161708
Training Epoch 15  76.2% | batch:       227 of       298	|	loss: 0.171368
Training Epoch 15  76.5% | batch:       228 of       298	|	loss: 0.350635
Training Epoch 15  76.8% | batch:       229 of       298	|	loss: 0.994677
Training Epoch 15  77.2% | batch:       230 of       298	|	loss: 0.197352
Training Epoch 15  77.5% | batch:       231 of       298	|	loss: 0.237511
Training Epoch 15  77.9% | batch:       232 of       298	|	loss: 0.371457
Training Epoch 15  78.2% | batch:       233 of       298	|	loss: 0.22433
Training Epoch 15  78.5% | batch:       234 of       298	|	loss: 0.247422
Training Epoch 15  78.9% | batch:       235 of       298	|	loss: 0.198914
Training Epoch 15  79.2% | batch:       236 of       298	|	loss: 0.228813
Training Epoch 15  79.5% | batch:       2

2023-06-22 14:39:15,862 | INFO : Epoch 15 Training Summary: epoch: 15.000000 | loss: 0.291969 | 
2023-06-22 14:39:15,863 | INFO : Epoch runtime: 0.0 hours, 0.0 minutes, 12.15755820274353 seconds

2023-06-22 14:39:15,865 | INFO : Avg epoch train. time: 0.0 hours, 0.0 minutes, 12.26742180188497 seconds
2023-06-22 14:39:15,866 | INFO : Avg batch train. time: 0.04116584497276835 seconds
2023-06-22 14:39:15,866 | INFO : Avg sample train. time: 0.0012867025175041923 seconds
Training Epoch:   4%|▍         | 15/400 [03:14<1:21:48, 12.75s/it]

Training Epoch 15  98.7% | batch:       294 of       298	|	loss: 0.176126
Training Epoch 15  99.0% | batch:       295 of       298	|	loss: 0.410197
Training Epoch 15  99.3% | batch:       296 of       298	|	loss: 0.278632
Training Epoch 15  99.7% | batch:       297 of       298	|	loss: 0.194292

Training Epoch 16   0.0% | batch:         0 of       298	|	loss: 0.180317
Training Epoch 16   0.3% | batch:         1 of       298	|	loss: 0.146001
Training Epoch 16   0.7% | batch:         2 of       298	|	loss: 0.412351
Training Epoch 16   1.0% | batch:         3 of       298	|	loss: 0.283904
Training Epoch 16   1.3% | batch:         4 of       298	|	loss: 0.152299
Training Epoch 16   1.7% | batch:         5 of       298	|	loss: 0.33559
Training Epoch 16   2.0% | batch:         6 of       298	|	loss: 0.169765
Training Epoch 16   2.3% | batch:         7 of       298	|	loss: 0.77213
Training Epoch 16   2.7% | batch:         8 of       298	|	loss: 0.157927
Training Epoch 16   3.0% | batch:      

Training Epoch 16  35.9% | batch:       107 of       298	|	loss: 0.187727
Training Epoch 16  36.2% | batch:       108 of       298	|	loss: 0.188101
Training Epoch 16  36.6% | batch:       109 of       298	|	loss: 0.236813
Training Epoch 16  36.9% | batch:       110 of       298	|	loss: 0.361798
Training Epoch 16  37.2% | batch:       111 of       298	|	loss: 0.232578
Training Epoch 16  37.6% | batch:       112 of       298	|	loss: 0.146367
Training Epoch 16  37.9% | batch:       113 of       298	|	loss: 0.372269
Training Epoch 16  38.3% | batch:       114 of       298	|	loss: 0.212842
Training Epoch 16  38.6% | batch:       115 of       298	|	loss: 0.353796
Training Epoch 16  38.9% | batch:       116 of       298	|	loss: 0.319272
Training Epoch 16  39.3% | batch:       117 of       298	|	loss: 1.69531
Training Epoch 16  39.6% | batch:       118 of       298	|	loss: 0.22554
Training Epoch 16  39.9% | batch:       119 of       298	|	loss: 0.185566
Training Epoch 16  40.3% | batch:       

In [None]:
config['load_model'] = './output/_2023-06-22_14-31-22_Nbr/checkpoints/model_best.pth'
model, optimizer, start_epoch = utils.load_model(model, config['load_model'], optimizer, config['resume'],
                                                config['change_output'],
                                                config['lr'],
                                                config['lr_step'],
                                                config['lr_factor'])

In [None]:
X, targets, target_masks, padding_masks, IDs = next(iter(train_loader))
targets = targets.to(device)
target_masks = target_masks.to(device)  
padding_masks = padding_masks.to(device) 

predictions = model(X.to(device), padding_masks)  # (batch_size, padded_length, feat_dim)

In [None]:
print(X[0])

In [None]:
print(target_masks[0])

In [None]:
print(predictions[0])

In [None]:
print(targets[0])
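
The four tensors printed above are hard to compare by eye. Since pre-training predicts the masked part of the input, a natural check is the mean squared error restricted to the positions flagged in `target_masks`. The sketch below shows that computation on plain nested lists standing in for one `(seq_len, feat_dim)` sample. The helper name `masked_mse` and the toy values are illustrative assumptions, not part of the repository, and the sketch assumes that `True` marks a position to be scored.

```python
def masked_mse(pred, target, mask):
    """Mean squared error over positions where mask is True."""
    sq_errors = [
        (p - t) ** 2
        for pred_row, target_row, mask_row in zip(pred, target, mask)
        for p, t, m in zip(pred_row, target_row, mask_row)
        if m
    ]
    if not sq_errors:
        raise ValueError("mask selects no positions")
    return sum(sq_errors) / len(sq_errors)


# Toy sample: 3 time steps, 2 features (values are made up for illustration).
pred = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
target = [[1.0, 0.0], [3.0, 4.0], [2.0, 6.0]]
mask = [[False, True], [False, False], [True, False]]

print(masked_mse(pred, target, mask))
```

With the tensors from the cell above, the same selection could be written with boolean indexing, for example `((predictions - targets) ** 2)[target_masks].mean()`, assuming `target_masks` is a boolean tensor.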

### STEP 7. Fine-tuning (regression)

<img src="./image/TST04.png" width="300">

In [None]:
# Set the arguments that need to be changed for this exercise.
args_change = EasyDict({
    # for dataloader 
    'output_dir': './output',
    'data_dir': './data/BeijingPM25Quality',
    'load_model': './output/_2023-06-22_14-31-22_Nbr/checkpoints/model_best.pth',
    'name': 'finetuned',
    'records_file': 'Regression_records.xls',
    'change_output': True,
    
    # Dataset
    'limit_size': None,
    'data_class': 'tsra',
    'pattern': 'TRAIN',
    'val_pattern': 'TEST',
    'val_ratio': 0.2,
    'epochs': 200,
    'lr': 0.001,
    'optimizer': 'RAdam',
    'batch_size': 128,
    'pos_encoding': 'learnable',
    'd_model': 128,
    'task': 'regression'
})

In [None]:
# Load the default arguments defined in the official code.
args = Options()
args = args.parser.parse_args([])
# Update the arguments that need to change.
args.__dict__.update(args_change)
args.__dict__
# Create the configuration directory from these arguments (the config is also saved and loaded).
config = setup(args)
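
The `args.__dict__.update(args_change)` line works because an `argparse.Namespace` stores its values as plain attributes. A minimal sketch of the same pattern follows. The parser here is a hypothetical stand-in with three flags, not the repository's `Options` class, and a plain dict takes the place of the `EasyDict`.

```python
import argparse

# Hypothetical stand-in for Options().parser with a few defaults.
parser = argparse.ArgumentParser()
parser.add_argument("--epochs", type=int, default=400)
parser.add_argument("--lr", type=float, default=1e-3)
parser.add_argument("--task", default="imputation")

# Parsing an empty list yields the defaults, which is what a notebook needs
# because there is no real command line to read.
args = parser.parse_args([])

# Override only the entries that differ from the defaults.
overrides = {"epochs": 200, "task": "regression"}
vars(args).update(overrides)

print(args.epochs, args.lr, args.task)
```

Keys that the parser never defined are added silently by this kind of update, so a misspelled key creates a new attribute instead of raising an error.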

In [None]:
# Initialize data generators
dataset_class, collate_fn, runner_class = pipeline_factory(config)    
"""
dataset_class : class that builds the dataset yielding (X, mask, index)
collate_fn : function that decides how the DataLoader combines individual samples into a batch
runner_class : class that handles the training and evaluation procedure
"""
val_dataset = dataset_class(val_data, val_indices)

val_loader = DataLoader(dataset=val_dataset,
                        batch_size=config['batch_size'],
                        shuffle=False,
                        num_workers=config['num_workers'],
                        pin_memory=True,
                        collate_fn=lambda x: collate_fn(x, max_len=model.max_len))

train_dataset = dataset_class(my_data, train_indices)

train_loader = DataLoader(dataset=train_dataset,
                            batch_size=config['batch_size'],
                            shuffle=True,
                            num_workers=config['num_workers'],
                            pin_memory=True,
                            collate_fn=lambda x: collate_fn(x, max_len=model.max_len))
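
The `collate_fn` passed to both loaders is what turns a list of samples into one batch. As a rough sketch of the idea, assuming samples may differ in length and are padded up to `max_len`, the hypothetical `pad_batch` below pads each sequence with zeros and returns a mask in which `True` marks real time steps. It uses plain lists and is not the repository's implementation.

```python
def pad_batch(sequences, max_len, pad_value=0.0):
    """Pad (or truncate) each sequence to max_len and build a padding mask.

    sequences: list of sequences, each a list of per-time-step feature lists.
    Returns (padded, mask) where mask[i][t] is True for a real time step.
    """
    feat_dim = len(sequences[0][0])
    padded, mask = [], []
    for seq in sequences:
        seq = seq[:max_len]  # truncate overly long samples
        n_pad = max_len - len(seq)
        padded.append(list(seq) + [[pad_value] * feat_dim for _ in range(n_pad)])
        mask.append([True] * len(seq) + [False] * n_pad)
    return padded, mask


batch = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],  # length 3
    [[7.0, 8.0]],                          # length 1
]
padded, mask = pad_batch(batch, max_len=3)
print(mask)
```

Returning the mask alongside the padded batch lets the model ignore padded positions instead of treating the zeros as data.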

In [None]:
X, targets, padding_masks, IDs = next(iter(train_loader))
# The regression loader does not return target_masks, so only three shapes are printed.
print(X.shape, targets.shape, padding_masks.shape)

In [None]:
X

In [None]:
targets

In [None]:
model = model_factory(config, my_data)

# Initialize optimizer
# L2 regularization
if config['global_reg']:
    weight_decay = config['l2_reg']
    output_reg = None
else:
    weight_decay = 0
    output_reg = config['l2_reg']

optim_class = get_optimizer(config['optimizer'])
optimizer = optim_class(model.parameters(), lr=config['lr'], weight_decay=weight_decay)

lr_step = 0  # current step index of `lr_step`
lr = config['lr']  # current learning rate

# Used when loading pretrained weights.
if args.load_model:
    model, optimizer, start_epoch = utils.load_model(model, config['load_model'], optimizer, config['resume'],
                                                    config['change_output'],
                                                    config['lr'],
                                                    config['lr_step'],
                                                    config['lr_factor'])

model.to(device)

loss_module = get_loss_module(config)
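
Setting `change_output` matters here because the pretrained checkpoint ends in an imputation head, while regression needs a head with a different output size. One common way to handle this, sketched below with a plain dict in place of a real `state_dict`, is to drop the output-layer entries before loading so that only the shared encoder weights are restored. The key prefix `output_layer`, the toy key names, and the helper name are assumptions for illustration.

```python
def drop_output_head(state_dict, prefix="output_layer"):
    """Return a copy of state_dict without the parameters of the output head."""
    return {k: v for k, v in state_dict.items() if not k.startswith(prefix)}


# Toy checkpoint: names mimic a state_dict, values are placeholders.
pretrained = {
    "project_inp.weight": "encoder tensor",
    "transformer_encoder.layers.0.linear1.weight": "encoder tensor",
    "output_layer.weight": "imputation head tensor",
    "output_layer.bias": "imputation head tensor",
}

reusable = drop_output_head(pretrained)
print(sorted(reusable))
```

With PyTorch, the filtered dict would then be passed to `model.load_state_dict(..., strict=False)`, so that the missing head parameters keep their fresh initialization.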

In [None]:
model

In [None]:
trainer = runner_class(model, train_loader, device, loss_module, optimizer, l2_reg=output_reg,
                       print_interval=config['print_interval'], console=config['console'])
val_evaluator = runner_class(model, val_loader, device, loss_module,
                             print_interval=config['print_interval'], console=config['console'])

tensorboard_writer = SummaryWriter(config['tensorboard_dir'])

best_value = 1e16 if config['key_metric'] in NEG_METRICS else -1e16  # initialize with +inf or -inf depending on key metric
metrics = []  # (for validation) list of lists: for each epoch, stores metrics like loss, ...
best_metrics = {}

# Check the initial performance of the model before any fine-tuning.
aggr_metrics_val, best_metrics, best_value = validate(val_evaluator, 
                                                      tensorboard_writer, 
                                                      config, best_metrics,
                                                      best_value, 
                                                      epoch=0)

metrics_names, metrics_values = zip(*aggr_metrics_val.items())
metrics.append(list(metrics_values))
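
The `best_value` initialization depends on whether the key metric should go down or up. A small sketch of that comparison follows, assuming `loss` is the only lower-is-better metric. The set below is a hypothetical stand-in for `NEG_METRICS`, and the validation values are made up.

```python
NEG_METRICS_SKETCH = {"loss"}  # assumption: metrics where smaller is better


def is_improvement(metric_name, new_value, best_value):
    """True if new_value beats best_value for the given metric."""
    if metric_name in NEG_METRICS_SKETCH:
        return new_value < best_value
    return new_value > best_value


key_metric = "loss"
best = 1e16 if key_metric in NEG_METRICS_SKETCH else -1e16
improved_epochs = []
for epoch, val_loss in enumerate([0.43, 0.28, 0.31], start=1):  # toy values
    if is_improvement(key_metric, val_loss, best):
        best = val_loss
        improved_epochs.append(epoch)  # a checkpoint would be saved here

print(best, improved_epochs)
```

Starting from a very large (or very small) sentinel guarantees that the first validation result always counts as an improvement.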

In [None]:
start_epoch = 0
total_epoch_time = 0

for epoch in tqdm(range(start_epoch + 1, config["epochs"] + 1), desc='Training Epoch', leave=False):
    mark = epoch if config['save_all'] else 'last'
    epoch_start_time = time.time()
    aggr_metrics_train = trainer.train_epoch(epoch)  # dictionary of aggregate epoch metrics
    epoch_runtime = time.time() - epoch_start_time
    print()
    print_str = 'Epoch {} Training Summary: '.format(epoch)
    for k, v in aggr_metrics_train.items():
        tensorboard_writer.add_scalar('{}/train'.format(k), v, epoch)
        print_str += '{}: {:8f} | '.format(k, v)
    logger.info(print_str)
    logger.info("Epoch runtime: {} hours, {} minutes, {} seconds\n".format(*utils.readable_time(epoch_runtime)))
    total_epoch_time += epoch_runtime
    avg_epoch_time = total_epoch_time / (epoch - start_epoch)
    avg_batch_time = avg_epoch_time / len(train_loader)
    avg_sample_time = avg_epoch_time / len(train_dataset)
    logger.info("Avg epoch train. time: {} hours, {} minutes, {} seconds".format(*utils.readable_time(avg_epoch_time)))
    logger.info("Avg batch train. time: {} seconds".format(avg_batch_time))
    logger.info("Avg sample train. time: {} seconds".format(avg_sample_time))

    # evaluate if first or last epoch or at specified interval
    if (epoch == config["epochs"]) or (epoch == start_epoch + 1) or (epoch % config['val_interval'] == 0):
        aggr_metrics_val, best_metrics, best_value = validate(val_evaluator, tensorboard_writer, config,
                                                                  best_metrics, best_value, epoch)
        metrics_names, metrics_values = zip(*aggr_metrics_val.items())
        metrics.append(list(metrics_values))

    utils.save_model(os.path.join(config['save_dir'], 'model_{}.pth'.format(mark)), epoch, model, optimizer)

    # Learning rate scheduling
    if epoch == config['lr_step'][lr_step]:
        utils.save_model(os.path.join(config['save_dir'], 'model_{}.pth'.format(epoch)), epoch, model, optimizer)
        lr = lr * config['lr_factor'][lr_step]
        if lr_step < len(config['lr_step']) - 1:  # so that this index does not get out of bounds
            lr_step += 1
        logger.info('Learning rate updated to: %s', lr)
        for param_group in optimizer.param_groups:
            param_group['lr'] = lr
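
The learning-rate block at the end of the loop implements a step schedule: when the epoch reaches the next entry of `lr_step`, the rate is multiplied by the matching entry of `lr_factor`. The sketch below replays that logic without a model so the resulting schedule can be inspected. The function name and the example values are illustrative assumptions, and the two lists are assumed to have the same length.

```python
def step_schedule(initial_lr, lr_steps, lr_factors, epochs):
    """Return the learning rate set at the end of each epoch (1-indexed)."""
    lr, idx, rates = initial_lr, 0, []
    for epoch in range(1, epochs + 1):
        if epoch == lr_steps[idx]:
            lr *= lr_factors[idx]
            if idx < len(lr_steps) - 1:  # stay on the last entry afterwards
                idx += 1
        rates.append(lr)
    return rates


rates = step_schedule(initial_lr=1e-3, lr_steps=[2, 4], lr_factors=[0.1, 0.5], epochs=5)
for epoch, lr in enumerate(rates, start=1):
    print(f"epoch {epoch}: lr = {lr:.6g}")
```

Because the index stops at the last entry and epochs only increase, no further update happens once every step has been consumed.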

In [65]:
config['load_model'] = './output/_2023-06-22_04-54-52_3fA/checkpoints/model_best.pth'
model, optimizer, start_epoch = utils.load_model(model, config['load_model'], optimizer, config['resume'],
                                                config['change_output'],
                                                config['lr'],
                                                config['lr_step'],
                                                config['lr_factor'])

Loaded model from ./mvts_transformer/output/_2023-06-21_15-34-54_nTP/checkpoints/model_best.pth. Epoch: 16


In [66]:
X, targets, padding_masks, IDs = next(iter(train_loader))
targets = targets.to(device)
padding_masks = padding_masks.to(device)

predictions = model(X.to(device), padding_masks)  # (batch_size, num_outputs) for regression

In [67]:
print(targets[0])

tensor([558.], device='cuda:0')


In [68]:
print(predictions[0])

tensor([260.2995], device='cuda:0', grad_fn=<SelectBackward>)
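
A single prediction and target pair says little about model quality. To summarize a whole batch, the errors can be aggregated, for instance as RMSE and MAE. The sketch below uses made-up numbers rather than outputs from this run, and the helper name `regression_errors` is hypothetical.

```python
import math


def regression_errors(predictions, targets):
    """Return (rmse, mae) for two equal-length sequences of floats."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [p - t for p, t in zip(predictions, targets)]
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return rmse, mae


# Made-up values for illustration only.
rmse, mae = regression_errors([10.0, 20.0, 30.0], [13.0, 16.0, 30.0])
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}")
```

RMSE penalizes large misses more heavily than MAE, so reporting both gives a quick sense of whether a few outliers dominate the error.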


---

## Running from the command line

In [None]:
# ! git clone https://github.com/gzerveas/mvts_transformer.git
# %cd mvts_transformer
# ! mkdir output

### Train models from scratch

```
python src/main.py \
  --output_dir output \
  --comment "regression from Scratch" \
  --name $1_fromScratch_Regression \
  --records_file Regression_records.xls \
  --data_dir ../data/BeijingPM25Quality \
  --data_class tsra \
  --pattern TRAIN \
  --val_pattern TEST \
  --epochs 100 \
  --lr 0.001 \
  --optimizer RAdam \
  --pos_encoding learnable \
  --task regression
```

In [16]:
! python3 src/main.py --output_dir output --comment "regression from Scratch" --name $1_fromScratch_Regression --records_file Regression_records.xls --data_dir ../data/BeijingPM25Quality --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 100 --lr 0.001 --optimizer RAdam  --pos_encoding learnable --task regression

2023-06-22 12:55:47,323 | INFO : Loading packages ...
/usr/local/bin/python3: Error while finding module specification for 'src.main.py' (AttributeError: module 'src.main' has no attribute '__path__')


### Pre-train models (unsupervised learning through input masking)

```
python src/main.py \
  --output_dir output \
  --comment "pretraining through imputation" \
  --name $1_pretrained \
  --records_file Imputation_records.xls \
  --data_dir ../data/BeijingPM25Quality \
  --data_class tsra \
  --pattern TRAIN \
  --val_ratio 0.2 \
  --epochs 700 \
  --lr 0.001 \
  --optimizer RAdam \
  --batch_size 32 \
  --pos_encoding learnable \
  --d_model 128
```

In [None]:
! python src/main.py --output_dir output --comment "pretraining through imputation" --name $1_pretrained --records_file Imputation_records.xls --data_dir ../data/BeijingPM25Quality --data_class tsra --pattern TRAIN --val_ratio 0.2 --epochs 700 --lr 0.001 --optimizer RAdam --batch_size 32 --pos_encoding learnable --d_model 128

### Fine-tune pretrained models

```
python src/main.py \
  --output_dir output \
  --comment "finetune for regression" \
  --name BeijingPM25Quality_finetuned \
  --records_file Regression_records.xls \
  --data_dir ../data/BeijingPM25Quality \
  --data_class tsra \
  --pattern TRAIN \
  --val_pattern TEST \
  --epochs 400 \
  --lr 0.001 \
  --optimizer RAdam \
  --pos_encoding learnable \
  --d_model 128 \
  --load_model path/to/BeijingPM25Quality_pretrained/checkpoints/model_best.pth \
  --task regression \
  --change_output \
  --batch_size 128
```

In [None]:
! python src/main.py --output_dir output --comment "finetune for regression" --name BeijingPM25Quality_finetuned --records_file Regression_records.xls --data_dir ../data/BeijingPM25Quality --data_class tsra --pattern TRAIN --val_pattern TEST --epochs 400 --lr 0.001 --optimizer RAdam --pos_encoding learnable --d_model 128 --load_model path/to/BeijingPM25Quality_pretrained/checkpoints/model_best.pth --task regression --change_output --batch_size 128
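
Long one-line commands like these are easy to break; a missing space between two flags is enough. One alternative, sketched below under the assumption that `src/main.py` accepts the flags shown above, is to build the argument list from a dict and hand it to `subprocess.run`. The helper `build_command` is hypothetical, and the `subprocess.run` call is left commented out so that the snippet has no side effects.

```python
import shlex


def build_command(script, options):
    """Turn a dict of CLI options into an argv list.

    A value of True emits a bare flag (e.g. --change_output);
    None or False skips the option entirely.
    """
    argv = ["python", script]
    for name, value in options.items():
        if value is None or value is False:
            continue
        argv.append(f"--{name}")
        if value is not True:
            argv.append(str(value))
    return argv


options = {
    "output_dir": "output",
    "comment": "finetune for regression",
    "data_dir": "../data/BeijingPM25Quality",
    "data_class": "tsra",
    "task": "regression",
    "change_output": True,
    "batch_size": 128,
}
argv = build_command("src/main.py", options)
print(shlex.join(argv))  # shell-quoted form, handy for logging

# import subprocess
# subprocess.run(argv, check=True)
```

Passing a list instead of a single string means each flag and value stays a separate argument, so values containing spaces (such as the comment) need no manual quoting.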

---