# Lecture 2 Homework

## Assignment
Based on what you learned in this Lesson, classify the fashion version of MNIST (Fashion MNIST, 10 classes) using softmax regression.

For details on Fashion MNIST, see the link below.

Fashion MNIST: https://github.com/zalandoresearch/fashion-mnist

### Target
Accuracy: 80%

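### Rules
- **Do not use any training data other than the `x_train` and `y_train` specified in the cell below.**
- **Implement the softmax regression algorithm itself using only numpy** (do not use sklearn, tensorflow, or similar libraries).
    - Using sklearn functions for data preprocessing (for example, `sklearn.model_selection.train_test_split`) is fine.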

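### How to submit
- You will submit two files.
    1. Save the predicted labels for the test data (`x_test`) as `submission_pred.csv` and submit it **by selecting `lecture02` from the Homework tab**.
    2. Save the corresponding Python code as `submission_code.py` and submit it **by selecting `lecture02 (code)` from the Homework tab**.

- Scoring is based on file 1; file 2 is used to check your code (the code of top-performing students may be made public). If you change your code, **resubmit both 1 and 2**.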
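
As a rough illustration of the first file, the sketch below writes integer class predictions to a CSV. The column layout (`id` index column plus a `label` column), the helper name `save_predictions`, and the demo file name are assumptions made for this example; the layout the grader actually expects should be checked against the course instructions.

```python
import numpy as np
import pandas as pd

def save_predictions(y_pred, path="submission_pred.csv"):
    """Write integer class predictions to a CSV file (assumed layout: id,label)."""
    submission = pd.Series(np.asarray(y_pred, dtype=int), name="label")
    submission.to_csv(path, header=True, index_label="id")
    return submission

# Hypothetical predictions for five test samples, written to a demo file.
demo = save_predictions([9, 2, 1, 1, 6], path="submission_pred_demo.csv")
```

In the real notebook, `y_pred` would be the `argmax` over the class probabilities predicted for `x_test`.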

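### Evaluation
- Submissions are evaluated by the accuracy of the predicted labels against `y_test`.
- Submissions are scored immediately and the Leader Board is updated. (The scoring schedule will be announced separately.)
- The score after the deadline is used as the final evaluation.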
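
For reference, accuracy here means the fraction of samples whose predicted label matches the true label. A minimal numpy sketch follows; the function name `accuracy` and the toy labels are illustrative only.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float((y_true == y_pred).mean())

# Toy labels: 4 of the 5 predictions match.
acc = accuracy([0, 1, 2, 3, 4], [0, 1, 2, 3, 9])
print(acc)  # 0.8
```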

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


### Loading the data (do not modify this cell)

In [2]:
import os
import sys

import numpy as np
import pandas as pd

sys.modules['tensorflow'] = None

def load_fashionmnist():
    # Training data
    x_train = np.load('drive/MyDrive/Colab Notebooks/DLBasics2023_colab/Lecture02/data/x_train.npy')
    y_train = np.load('drive/MyDrive/Colab Notebooks/DLBasics2023_colab/Lecture02/data/y_train.npy')
    
    # Test data
    x_test = np.load('drive/MyDrive/Colab Notebooks/DLBasics2023_colab/Lecture02/data/x_test.npy')
    
    x_train = x_train.reshape(-1, 784).astype('float32') / 255
    y_train = np.eye(10)[y_train.astype('int32')]
    x_test = x_test.reshape(-1, 784).astype('float32') / 255
    
    return x_train, y_train, x_test

In [3]:
import os
import sys

import numpy as np
import pandas as pd

sys.modules['tensorflow'] = None

def load_fashionmnist():
    # Training data
    x_train = np.load('drive/MyDrive/Colab Notebooks/DeepLearning/Lecture02/data/x_train.npy')
    y_train = np.load('drive/MyDrive/Colab Notebooks/DeepLearning/Lecture02/data/y_train.npy')
    
    # Test data
    x_test = np.load('drive/MyDrive/Colab Notebooks/DeepLearning/Lecture02/data/x_test.npy')
    
    x_train = x_train.reshape(-1, 784).astype('float32') / 255
    y_train = np.eye(10)[y_train.astype('int32')]
    x_test = x_test.reshape(-1, 784).astype('float32') / 255
    
    return x_train, y_train, x_test
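
The two preprocessing steps in `load_fashionmnist` (flattening and scaling the images, and one-hot encoding the labels with `np.eye`) can be seen in isolation on toy data. The sketch below assumes 28x28 `uint8` images, which is consistent with the reshape to 784 and the division by 255; the random arrays are stand-ins, not the course data.

```python
import numpy as np

labels = np.array([0, 3, 9])                 # toy integer class labels
one_hot = np.eye(10)[labels]                 # row i of the identity matrix is the one-hot vector for class i

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)  # stand-in images
flat = images.reshape(-1, 784).astype('float32') / 255          # flatten and scale to [0, 1]

print(one_hot.shape, flat.shape)             # (3, 10) (3, 784)
print(one_hot.argmax(axis=1))                # [0 3 9]
```

Taking `argmax(axis=1)` of the one-hot matrix recovers the original integer labels, which is what the validation code relies on.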

### Implementing softmax regression
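
Softmax regression with a mean cross-entropy loss has the gradient `x.T @ (t_hat - t) / batch_size` with respect to the weights. One way to gain confidence in a hand-written gradient is to compare it with a finite-difference estimate. The sketch below does this on small random data; the helper name `loss_and_grads` and the toy shapes are assumptions for illustration, not part of the assignment template.

```python
import numpy as np

def softmax(z, axis=1):
    z = z - z.max(axis=axis, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def loss_and_grads(x, t, W, b):
    """Mean cross-entropy of softmax(xW + b) and its gradients w.r.t. W and b."""
    y = softmax(x @ W + b)
    loss = -(t * np.log(np.clip(y, 1e-10, None))).sum(axis=1).mean()
    delta = (y - t) / x.shape[0]
    return loss, x.T @ delta, delta.sum(axis=0)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))                   # toy inputs: 5 samples, 4 features
t = np.eye(3)[rng.integers(0, 3, size=5)]     # toy one-hot targets, 3 classes
W = rng.normal(scale=0.1, size=(4, 3))
b = np.zeros(3)

loss, dW, db = loss_and_grads(x, t, W, b)

# Central finite difference for a single weight entry.
h = 1e-5
W_plus, W_minus = W.copy(), W.copy()
W_plus[1, 2] += h
W_minus[1, 2] -= h
numeric = (loss_and_grads(x, t, W_plus, b)[0]
           - loss_and_grads(x, t, W_minus, b)[0]) / (2 * h)
print(abs(numeric - dW[1, 2]))                # should be close to zero
```

The check uses float64 throughout so that rounding error does not mask a genuine mistake in the analytic gradient.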

In [4]:
x_train, y_train, x_test = load_fashionmnist()

from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from tqdm import tqdm

def np_log(x):
    # Clip before taking the log to avoid log(0)
    return np.log(np.clip(a=x, a_min=1e-10, a_max=1e+10))

def softmax(x, axis=1):
    # Subtract the row-wise max to prevent overflow in exp (without modifying the caller's array)
    x = x - x.max(axis, keepdims=True)
    x_exp = np.exp(x)
    return x_exp / x_exp.sum(axis, keepdims=True)

def train(x, t, eps=1.0):
    global W, b
    batch_size = x.shape[0]

    # Forward pass
    t_hat = softmax(np.matmul(x, W) + b)

    # Mean cross-entropy loss
    cost = (-t * np_log(t_hat)).sum(axis=1).mean()

    # Backward pass
    delta = t_hat - t
    dW = np.matmul(x.T, delta) / batch_size  # shape: (input dim, output dim)
    db = np.matmul(np.ones(shape=(batch_size,)), delta) / batch_size  # shape: (output dim,)

    # Gradient descent update with learning rate eps
    W -= eps * dW
    b -= eps * db

    return cost

def valid(x, t):
    t_hat = softmax(np.matmul(x, W) + b)
    cost = (-t * np_log(t_hat)).sum(axis=1).mean()
    return cost, t_hat

# weights
W = np.random.uniform(low=-0.08, high=0.08, size=(784, 10)).astype('float32')
b = np.zeros(shape=(10,)).astype('float32')

# Split into training and validation data
x_train, x_valid, y_train, y_valid = train_test_split(x_train, y_train, test_size=0.1)

for epoch in tqdm(range(5000)):
    # Full-batch gradient descent: each update uses the entire training set
    train_cost = train(x_train, y_train, 0.3)
    valid_cost, y_pred_valid = valid(x_valid, y_valid)
    if epoch % 10 == 9 or epoch == 0:
        print('EPOCH: {}, Valid Cost: {:.3f}, Valid Accuracy: {:.3f}'.format(
            epoch + 1,
            valid_cost,
            accuracy_score(y_valid.argmax(axis=1), y_pred_valid.argmax(axis=1))
        ))


EPOCH: 1, Valid Cost: 2.112, Valid Accuracy: 0.332
EPOCH: 10, Valid Cost: 2.131, Valid Accuracy: 0.539
EPOCH: 20, Valid Cost: 1.537, Valid Accuracy: 0.647
EPOCH: 30, Valid Cost: 1.404, Valid Accuracy: 0.683
EPOCH: 40, Valid Cost: 0.974, Valid Accuracy: 0.700
EPOCH: 50, Valid Cost: 1.026, Valid Accuracy: 0.700
EPOCH: 60, Valid Cost: 0.867, Valid Accuracy: 0.734
EPOCH: 70, Valid Cost: 0.868, Valid Accuracy: 0.727
EPOCH: 80, Valid Cost: 0.855, Valid Accuracy: 0.751
EPOCH: 90, Valid Cost: 1.094, Valid Accuracy: 0.763
EPOCH: 100, Valid Cost: 0.839, Valid Accuracy: 0.751
EPOCH: 110, Valid Cost: 0.913, Valid Accuracy: 0.769
EPOCH: 120, Valid Cost: 0.786, Valid Accuracy: 0.761
EPOCH: 130, Valid Cost: 0.906, Valid Accuracy: 0.773
EPOCH: 140, Valid Cost: 0.757, Valid Accuracy: 0.772
EPOCH: 150, Valid Cost: 0.884, Valid Accuracy: 0.775
EPOCH: 160, Valid Cost: 0.737, Valid Accuracy: 0.780
EPOCH: 170, Valid Cost: 0.866, Valid Accuracy: 0.779
EPOCH: 180, Valid Cost: 0.717, Valid Accuracy: 0.785
EPOCH: 190, Valid Cost: 0.846, Valid Accuracy: 0.784
EPOCH: 200, Valid Cost: 0.702, Valid Accuracy: 0.791
EPOCH: 210, Valid Cost: 0.827, Valid Accuracy: 0.786
EPOCH: 220, Valid Cost: 0.691, Valid Accuracy: 0.794
EPOCH: 230, Valid Cost: 0.810, Valid Accuracy: 0.789
EPOCH: 240, Valid Cost: 0.682, Valid Accuracy: 0.795
EPOCH: 250, Valid Cost: 0.794, Valid Accuracy: 0.791
EPOCH: 260, Valid Cost: 0.676, Valid Accuracy: 0.800
EPOCH: 270, Valid Cost: 0.778, Valid Accuracy: 0.795
EPOCH: 280, Valid Cost: 0.672, Valid Accuracy: 0.803
EPOCH: 290, Valid Cost: 0.761, Valid Accuracy: 0.798
EPOCH: 300, Valid Cost: 0.671, Valid Accuracy: 0.805
EPOCH: 310, Valid Cost: 0.743, Valid Accuracy: 0.801
EPOCH: 320, Valid Cost: 0.677, Valid Accuracy: 0.805
EPOCH: 330, Valid Cost: 0.721, Valid Accuracy: 0.803
EPOCH: 340, Valid Cost: 0.689, Valid Accuracy: 0.805
EPOCH: 350, Valid Cost: 0.698, Valid Accuracy: 0.805
EPOCH: 360, Valid Cost: 0.688, Valid Accuracy: 0.808
EPOCH: 370, Valid Cost: 0.690, Valid Accuracy: 0.809
EPOCH: 380, Valid Cost: 0.720, Valid Accuracy: 0.784
EPOCH: 390, Valid Cost: 0.692, Valid Accuracy: 0.792
EPOCH: 400, Valid Cost: 0.675, Valid Accuracy: 0.802
EPOCH: 410, Valid Cost: 0.664, Valid Accuracy: 0.807
EPOCH: 420, Valid Cost: 0.658, Valid Accuracy: 0.810
EPOCH: 430, Valid Cost: 0.654, Valid Accuracy: 0.811
EPOCH: 440, Valid Cost: 0.650, Valid Accuracy: 0.812
EPOCH: 450, Valid Cost: 0.647, Valid Accuracy: 0.812
EPOCH: 460, Valid Cost: 0.643, Valid Accuracy: 0.813
EPOCH: 470, Valid Cost: 0.640, Valid Accuracy: 0.814
EPOCH: 480, Valid Cost: 0.637, Valid Accuracy: 0.815
EPOCH: 490, Valid Cost: 0.635, Valid Accuracy: 0.817
EPOCH: 500, Valid Cost: 0.635, Valid Accuracy: 0.821
EPOCH: 510, Valid Cost: 0.713, Valid Accuracy: 0.797
EPOCH: 520, Valid Cost: 0.692, Valid Accuracy: 0.804
EPOCH: 530, Valid Cost: 0.678, Valid Accuracy: 0.807
EPOCH: 540, Valid Cost: 0.667, Valid Accuracy: 0.812
EPOCH: 550, Valid Cost: 0.658, Valid Accuracy: 0.814
EPOCH: 560, Valid Cost: 0.652, Valid Accuracy: 0.816
EPOCH: 570, Valid Cost: 0.646, Valid Accuracy: 0.817
EPOCH: 580, Valid Cost: 0.642, Valid Accuracy: 0.818
EPOCH: 590, Valid Cost: 0.638, Valid Accuracy: 0.819
EPOCH: 600, Valid Cost: 0.635, Valid Accuracy: 0.819
EPOCH: 610, Valid Cost: 0.632, Valid Accuracy: 0.820
EPOCH: 620, Valid Cost: 0.630, Valid Accuracy: 0.821
EPOCH: 630, Valid Cost: 0.628, Valid Accuracy: 0.821
EPOCH: 640, Valid Cost: 0.626, Valid Accuracy: 0.821
EPOCH: 650, Valid Cost: 0.624, Valid Accuracy: 0.821
EPOCH: 660, Valid Cost: 0.623, Valid Accuracy: 0.821
EPOCH: 670, Valid Cost: 0.621, Valid Accuracy: 0.821
EPOCH: 680, Valid Cost: 0.619, Valid Accuracy: 0.822
EPOCH: 690, Valid Cost: 0.616, Valid Accuracy: 0.823
EPOCH: 700, Valid Cost: 0.612, Valid Accuracy: 0.824
EPOCH: 710, Valid Cost: 0.631, Valid Accuracy: 0.820
EPOCH: 720, Valid Cost: 0.639, Valid Accuracy: 0.804
EPOCH: 730, Valid Cost: 0.618, Valid Accuracy: 0.813
EPOCH: 740, Valid Cost: 0.605, Valid Accuracy: 0.819
EPOCH: 750, Valid Cost: 0.599, Valid Accuracy: 0.822
EPOCH: 760, Valid Cost: 0.596, Valid Accuracy: 0.823
EPOCH: 770, Valid Cost: 0.594, Valid Accuracy: 0.824
EPOCH: 780, Valid Cost: 0.592, Valid Accuracy: 0.825
EPOCH: 790, Valid Cost: 0.591, Valid Accuracy: 0.825
EPOCH: 800, Valid Cost: 0.589, Valid Accuracy: 0.825
EPOCH: 810, Valid Cost: 0.587, Valid Accuracy: 0.825
EPOCH: 820, Valid Cost: 0.585, Valid Accuracy: 0.826
EPOCH: 830, Valid Cost: 0.584, Valid Accuracy: 0.826
EPOCH: 840, Valid Cost: 0.583, Valid Accuracy: 0.827
EPOCH: 850, Valid Cost: 0.583, Valid Accuracy: 0.828
EPOCH: 860, Valid Cost: 0.603, Valid Accuracy: 0.828
EPOCH: 870, Valid Cost: 0.661, Valid Accuracy: 0.809
EPOCH: 880, Valid Cost: 0.643, Valid Accuracy: 0.814
EPOCH: 890, Valid Cost: 0.629, Valid Accuracy: 0.816
EPOCH: 900, Valid Cost: 0.619, Valid Accuracy: 0.819
EPOCH: 910, Valid Cost: 0.612, Valid Accuracy: 0.822
EPOCH: 920, Valid Cost: 0.607, Valid Accuracy: 0.822
EPOCH: 930, Valid Cost: 0.603, Valid Accuracy: 0.823
EPOCH: 940, Valid Cost: 0.600, Valid Accuracy: 0.823
EPOCH: 950, Valid Cost: 0.597, Valid Accuracy: 0.823
EPOCH: 960, Valid Cost: 0.595, Valid Accuracy: 0.824
EPOCH: 970, Valid Cost: 0.593, Valid Accuracy: 0.825
EPOCH: 980, Valid Cost: 0.591, Valid Accuracy: 0.825
EPOCH: 990, Valid Cost: 0.590, Valid Accuracy: 0.826
EPOCH: 1000, Valid Cost: 0.589, Valid Accuracy: 0.826
EPOCH: 1010, Valid Cost: 0.588, Valid Accuracy: 0.826
EPOCH: 1020, Valid Cost: 0.587, Valid Accuracy: 0.826
EPOCH: 1030, Valid Cost: 0.586, Valid Accuracy: 0.827
EPOCH: 1040, Valid Cost: 0.585, Valid Accuracy: 0.827
EPOCH: 1050, Valid Cost: 0.584, Valid Accuracy: 0.827
EPOCH: 1060, Valid Cost: 0.582, Valid Accuracy: 0.828
EPOCH: 1070, Valid Cost: 0.581, Valid Accuracy: 0.828
EPOCH: 1080, Valid Cost: 0.579, Valid Accuracy: 0.828
EPOCH: 1090, Valid Cost: 0.578, Valid Accuracy: 0.829
EPOCH: 1100, Valid Cost: 0.576, Valid Accuracy: 0.832
EPOCH: 1110, Valid Cost: 0.581, Valid Accuracy: 0.831
EPOCH: 1120, Valid Cost: 0.613, Valid Accuracy: 0.807
EPOCH: 1130, Valid Cost: 0.588, Valid Accuracy: 0.818
EPOCH: 1140, Valid Cost: 0.574, Valid Accuracy: 0.825
EPOCH: 1150, Valid Cost: 0.567, Valid Accuracy: 0.828
EPOCH: 1160, Valid Cost: 0.565, Valid Accuracy: 0.829
EPOCH: 1170, Valid Cost: 0.563, Valid Accuracy: 0.830
EPOCH: 1180, Valid Cost: 0.562, Valid Accuracy: 0.830
EPOCH: 1190, Valid Cost: 0.561, Valid Accuracy: 0.830
EPOCH: 1200, Valid Cost: 0.560, Valid Accuracy: 0.831
EPOCH: 1210, Valid Cost: 0.559, Valid Accuracy: 0.831
EPOCH: 1220, Valid Cost: 0.558, Valid Accuracy: 0.832
EPOCH: 1230, Valid Cost: 0.557, Valid Accuracy: 0.832
EPOCH: 1240, Valid Cost: 0.556, Valid Accuracy: 0.832
EPOCH: 1250, Valid Cost: 0.555, Valid Accuracy: 0.832
EPOCH: 1260, Valid Cost: 0.555, Valid Accuracy: 0.834
EPOCH: 1270, Valid Cost: 0.560, Valid Accuracy: 0.835
EPOCH: 1280, Valid Cost: 0.645, Valid Accuracy: 0.812
EPOCH: 1290, Valid Cost: 0.628, Valid Accuracy: 0.819
EPOCH: 1300, Valid Cost: 0.611, Valid Accuracy: 0.821
EPOCH: 1310, Valid Cost: 0.599, Valid Accuracy: 0.824
EPOCH: 1320, Valid Cost: 0.591, Valid Accuracy: 0.826
EPOCH: 1330, Valid Cost: 0.585, Valid Accuracy: 0.827
EPOCH: 1340, Valid Cost: 0.581, Valid Accuracy: 0.829
EPOCH: 1350, Valid Cost: 0.578, Valid Accuracy: 0.830
EPOCH: 1360, Valid Cost: 0.575, Valid Accuracy: 0.831
EPOCH: 1370, Valid Cost: 0.573, Valid Accuracy: 0.831
EPOCH: 1380, Valid Cost: 0.571, Valid Accuracy: 0.831
EPOCH: 1390, Valid Cost: 0.570, Valid Accuracy: 0.831
EPOCH: 1400, Valid Cost: 0.569, Valid Accuracy: 0.832
EPOCH: 1410, Valid Cost: 0.568, Valid Accuracy: 0.833
EPOCH: 1420, Valid Cost: 0.567, Valid Accuracy: 0.833
EPOCH: 1430, Valid Cost: 0.566, Valid Accuracy: 0.833
EPOCH: 1440, Valid Cost: 0.566, Valid Accuracy: 0.833
EPOCH: 1450, Valid Cost: 0.565, Valid Accuracy: 0.833
EPOCH: 1460, Valid Cost: 0.564, Valid Accuracy: 0.834
EPOCH: 1470, Valid Cost: 0.563, Valid Accuracy: 0.834
EPOCH: 1480, Valid Cost: 0.563, Valid Accuracy: 0.834
EPOCH: 1490, Valid Cost: 0.562, Valid Accuracy: 0.834
EPOCH: 1500, Valid Cost: 0.561, Valid Accuracy: 0.835
EPOCH: 1510, Valid Cost: 0.560, Valid Accuracy: 0.835
EPOCH: 1520, Valid Cost: 0.560, Valid Accuracy: 0.835
EPOCH: 1530, Valid Cost: 0.560, Valid Accuracy: 0.835
EPOCH: 1540, Valid Cost: 0.561, Valid Accuracy: 0.835
EPOCH: 1550, Valid Cost: 0.564, Valid Accuracy: 0.835
EPOCH: 1560, Valid Cost: 0.565, Valid Accuracy: 0.832
EPOCH: 1570, Valid Cost: 0.561, Valid Accuracy: 0.834
EPOCH: 1580, Valid Cost: 0.557, Valid Accuracy: 0.837
EPOCH: 1590, Valid Cost: 0.553, Valid Accuracy: 0.838
EPOCH: 1600, Valid Cost: 0.570, Valid Accuracy: 0.833
EPOCH: 1610, Valid Cost: 0.589, Valid Accuracy: 0.818
EPOCH: 1620, Valid Cost: 0.567, Valid Accuracy: 0.827
EPOCH: 1630, Valid Cost: 0.555, Valid Accuracy: 0.832
EPOCH: 1640, Valid Cost: 0.550, Valid Accuracy: 0.834
EPOCH: 1650, Valid Cost: 0.547, Valid Accuracy: 0.836
EPOCH: 1660, Valid Cost: 0.546, Valid Accuracy: 0.836
EPOCH: 1670, Valid Cost: 0.545, Valid Accuracy: 0.837
EPOCH: 1680, Valid Cost: 0.544, Valid Accuracy: 0.837
EPOCH: 1690, Valid Cost: 0.543, Valid Accuracy: 0.836
EPOCH: 1700, Valid Cost: 0.542, Valid Accuracy: 0.837
EPOCH: 1710, Valid Cost: 0.541, Valid Accuracy: 0.837
EPOCH: 1720, Valid Cost: 0.541, Valid Accuracy: 0.837
EPOCH: 1730, Valid Cost: 0.540, Valid Accuracy: 0.838
EPOCH: 1740, Valid Cost: 0.539, Valid Accuracy: 0.838
EPOCH: 1750, Valid Cost: 0.539, Valid Accuracy: 0.839
EPOCH: 1760, Valid Cost: 0.539, Valid Accuracy: 0.839
EPOCH: 1770, Valid Cost: 0.539, Valid Accuracy: 0.839
EPOCH: 1780, Valid Cost: 0.539, Valid Accuracy: 0.840
EPOCH: 1790, Valid Cost: 0.540, Valid Accuracy: 0.841
EPOCH: 1800, Valid Cost: 0.563, Valid Accuracy: 0.837
EPOCH: 1810, Valid Cost: 0.624, Valid Accuracy: 0.816
EPOCH: 1820, Valid Cost: 0.603, Valid Accuracy: 0.822
EPOCH: 1830, Valid Cost: 0.588, Valid Accuracy: 0.828
EPOCH: 1840, Valid Cost: 0.578, Valid Accuracy: 0.830
EPOCH: 1850, Valid Cost: 0.571, Valid Accuracy: 0.832
EPOCH: 1860, Valid Cost: 0.566, Valid Accuracy: 0.833
EPOCH: 1870, Valid Cost: 0.563, Valid Accuracy: 0.835
EPOCH: 1880, Valid Cost: 0.560, Valid Accuracy: 0.836
EPOCH: 1890, Valid Cost: 0.558, Valid Accuracy: 0.836
EPOCH: 1900, Valid Cost: 0.556, Valid Accuracy: 0.836
EPOCH: 1910, Valid Cost: 0.555, Valid Accuracy: 0.837
EPOCH: 1920, Valid Cost: 0.553, Valid Accuracy: 0.838
EPOCH: 1930, Valid Cost: 0.553, Valid Accuracy: 0.838
EPOCH: 1940, Valid Cost: 0.552, Valid Accuracy: 0.839
EPOCH: 1950, Valid Cost: 0.551, Valid Accuracy: 0.839
EPOCH: 1960, Valid Cost: 0.551, Valid Accuracy: 0.839
EPOCH: 1970, Valid Cost: 0.550, Valid Accuracy: 0.839
EPOCH: 1980, Valid Cost: 0.550, Valid Accuracy: 0.839
EPOCH: 1990, Valid Cost: 0.549, Valid Accuracy: 0.839
EPOCH: 2000, Valid Cost: 0.549, Valid Accuracy: 0.840
EPOCH: 2010, Valid Cost: 0.548, Valid Accuracy: 0.840
EPOCH: 2020, Valid Cost: 0.547, Valid Accuracy: 0.840
EPOCH: 2030, Valid Cost: 0.546, Valid Accuracy: 0.840
EPOCH: 2040, Valid Cost: 0.545, Valid Accuracy: 0.840
EPOCH: 2050, Valid Cost: 0.545, Valid Accuracy: 0.839
EPOCH: 2060, Valid Cost: 0.544, Valid Accuracy: 0.840
EPOCH: 2070, Valid Cost: 0.544, Valid Accuracy: 0.840
EPOCH: 2080, Valid Cost: 0.546, Valid Accuracy: 0.840
EPOCH: 2090, Valid Cost: 0.550, Valid Accuracy: 0.837
EPOCH: 2100, Valid Cost: 0.555, Valid Accuracy: 0.834
EPOCH: 2110, Valid Cost: 0.551, Valid Accuracy: 0.836
EPOCH: 2120, Valid Cost: 0.544, Valid Accuracy: 0.838
EPOCH: 2130, Valid Cost: 0.540, Valid Accuracy: 0.840
EPOCH: 2140, Valid Cost: 0.544, Valid Accuracy: 0.840
EPOCH: 2150, Valid Cost: 0.585, Valid Accuracy: 0.821
EPOCH: 2160, Valid Cost: 0.560, Valid Accuracy: 0.828
EPOCH: 2170, Valid Cost: 0.545, Valid Accuracy: 0.834
EPOCH: 2180, Valid Cost: 0.539, Valid Accuracy: 0.838
EPOCH: 2190, Valid Cost: 0.536, Valid Accuracy: 0.838
EPOCH: 2200, Valid Cost: 0.534, Valid Accuracy: 0.838
EPOCH: 2210, Valid Cost: 0.534, Valid Accuracy: 0.838
EPOCH: 2220, Valid Cost: 0.533, Valid Accuracy: 0.838
EPOCH: 2230, Valid Cost: 0.532, Valid Accuracy: 0.839
EPOCH: 2240, Valid Cost: 0.531, Valid Accuracy: 0.839
EPOCH: 2250, Valid Cost: 0.531, Valid Accuracy: 0.840
EPOCH: 2260, Valid Cost: 0.530, Valid Accuracy: 0.839
EPOCH: 2270, Valid Cost: 0.529, Valid Accuracy: 0.839
EPOCH: 2280, Valid Cost: 0.529, Valid Accuracy: 0.840
EPOCH: 2290, Valid Cost: 0.528, Valid Accuracy: 0.840
EPOCH: 2300, Valid Cost: 0.528, Valid Accuracy: 0.840
EPOCH: 2310, Valid Cost: 0.528, Valid Accuracy: 0.841
EPOCH: 2320, Valid Cost: 0.528, Valid Accuracy: 0.841
EPOCH: 2330, Valid Cost: 0.528, Valid Accuracy: 0.841
EPOCH: 2340, Valid Cost: 0.528, Valid Accuracy: 0.842
EPOCH: 2350, Valid Cost: 0.528, Valid Accuracy: 0.841
EPOCH: 2360, Valid Cost: 0.529, Valid Accuracy: 0.841
EPOCH: 2370, Valid Cost: 0.541, Valid Accuracy: 0.839
EPOCH: 2380, Valid Cost: 0.620, Valid Accuracy: 0.815
EPOCH: 2390, Valid Cost: 0.597, Valid Accuracy: 0.824
EPOCH: 2400, Valid Cost: 0.580, Valid Accuracy: 0.830
EPOCH: 2410, Valid Cost: 0.568, Valid Accuracy: 0.831
EPOCH: 2420, Valid Cost: 0.561, Valid Accuracy: 0.834
EPOCH: 2430, Valid Cost: 0.556, Valid Accuracy: 0.836
EPOCH: 2440, Valid Cost: 0.552, Valid Accuracy: 0.837
EPOCH: 2450, Valid Cost: 0.550, Valid Accuracy: 0.838
EPOCH: 2460, Valid Cost: 0.547, Valid Accuracy: 0.839
EPOCH: 2470, Valid Cost: 0.546, Valid Accuracy: 0.839
EPOCH: 2480, Valid Cost: 0.544, Valid Accuracy: 0.839
EPOCH: 2490, Valid Cost: 0.543, Valid Accuracy: 0.839
EPOCH: 2500, Valid Cost: 0.542, Valid Accuracy: 0.839
EPOCH: 2510, Valid Cost: 0.541, Valid Accuracy: 0.839
EPOCH: 2520, Valid Cost: 0.541, Valid Accuracy: 0.840
EPOCH: 2530, Valid Cost: 0.540, Valid Accuracy: 0.840
EPOCH: 2540, Valid Cost: 0.540, Valid Accuracy: 0.840
EPOCH: 2550, Valid Cost: 0.540, Valid Accuracy: 0.840
EPOCH: 2560, Valid Cost: 0.539, Valid Accuracy: 0.840
EPOCH: 2570, Valid Cost: 0.539, Valid Accuracy: 0.840
EPOCH: 2580, Valid Cost: 0.538, Valid Accuracy: 0.841
EPOCH: 2590, Valid Cost: 0.538, Valid Accuracy: 0.841
EPOCH: 2600, Valid Cost: 0.537, Valid Accuracy: 0.841
EPOCH: 2610, Valid Cost: 0.536, Valid Accuracy: 0.841
EPOCH: 2620, Valid Cost: 0.535, Valid Accuracy: 0.842
EPOCH: 2630, Valid Cost: 0.534, Valid Accuracy: 0.842
EPOCH: 2640, Valid Cost: 0.533, Valid Accuracy: 0.843
EPOCH: 2650, Valid Cost: 0.531, Valid Accuracy: 0.841
EPOCH: 2660, Valid Cost: 0.545, Valid Accuracy: 0.838
EPOCH: 2670, Valid Cost: 0.565, Valid Accuracy: 0.821
EPOCH: 2680, Valid Cost: 0.543, Valid Accuracy: 0.829
EPOCH: 2690, Valid Cost: 0.531, Valid Accuracy: 0.836
EPOCH: 2700, Valid Cost: 0.527, Valid Accuracy: 0.838
EPOCH: 2710, Valid Cost: 0.525, Valid Accuracy: 0.838
EPOCH: 2720, Valid Cost: 0.525, Valid Accuracy: 0.839
EPOCH: 2730, Valid Cost: 0.524, Valid Accuracy: 0.840
EPOCH: 2740, Valid Cost: 0.524, Valid Accuracy: 0.840
EPOCH: 2750, Valid Cost: 0.524, Valid Accuracy: 0.840
EPOCH: 2760, Valid Cost: 0.523, Valid Accuracy: 0.840

EPOCH: 2770, Valid Cost: 0.522, Valid Accuracy: 0.840


 56%|█████▌    | 2780/5000 [24:02<27:45,  1.33it/s]

EPOCH: 2780, Valid Cost: 0.522, Valid Accuracy: 0.841


 56%|█████▌    | 2790/5000 [24:08<21:25,  1.72it/s]

EPOCH: 2790, Valid Cost: 0.521, Valid Accuracy: 0.841


 56%|█████▌    | 2800/5000 [24:14<23:56,  1.53it/s]

EPOCH: 2800, Valid Cost: 0.521, Valid Accuracy: 0.840


 56%|█████▌    | 2810/5000 [24:20<17:31,  2.08it/s]

EPOCH: 2810, Valid Cost: 0.520, Valid Accuracy: 0.841


 56%|█████▋    | 2820/5000 [24:24<16:43,  2.17it/s]

EPOCH: 2820, Valid Cost: 0.521, Valid Accuracy: 0.843


 57%|█████▋    | 2830/5000 [24:30<23:52,  1.51it/s]

EPOCH: 2830, Valid Cost: 0.522, Valid Accuracy: 0.843


 57%|█████▋    | 2840/5000 [24:35<17:24,  2.07it/s]

EPOCH: 2840, Valid Cost: 0.549, Valid Accuracy: 0.838


 57%|█████▋    | 2850/5000 [24:40<16:49,  2.13it/s]

EPOCH: 2850, Valid Cost: 0.605, Valid Accuracy: 0.820


 57%|█████▋    | 2860/5000 [24:46<21:08,  1.69it/s]

EPOCH: 2860, Valid Cost: 0.582, Valid Accuracy: 0.828


 57%|█████▋    | 2870/5000 [24:51<16:59,  2.09it/s]

EPOCH: 2870, Valid Cost: 0.566, Valid Accuracy: 0.832


 58%|█████▊    | 2880/5000 [24:55<16:43,  2.11it/s]

EPOCH: 2880, Valid Cost: 0.556, Valid Accuracy: 0.835


 58%|█████▊    | 2890/5000 [25:01<19:00,  1.85it/s]

EPOCH: 2890, Valid Cost: 0.550, Valid Accuracy: 0.837


 58%|█████▊    | 2900/5000 [25:06<16:14,  2.15it/s]

EPOCH: 2900, Valid Cost: 0.546, Valid Accuracy: 0.838


 58%|█████▊    | 2910/5000 [25:11<20:19,  1.71it/s]

EPOCH: 2910, Valid Cost: 0.543, Valid Accuracy: 0.839


 58%|█████▊    | 2920/5000 [25:17<17:32,  1.98it/s]

EPOCH: 2920, Valid Cost: 0.541, Valid Accuracy: 0.840


 59%|█████▊    | 2930/5000 [25:22<16:18,  2.12it/s]

EPOCH: 2930, Valid Cost: 0.539, Valid Accuracy: 0.840


 59%|█████▉    | 2940/5000 [25:27<21:55,  1.57it/s]

EPOCH: 2940, Valid Cost: 0.538, Valid Accuracy: 0.840


 59%|█████▉    | 2950/5000 [25:33<16:34,  2.06it/s]

EPOCH: 2950, Valid Cost: 0.537, Valid Accuracy: 0.840


 59%|█████▉    | 2960/5000 [25:37<15:50,  2.15it/s]

EPOCH: 2960, Valid Cost: 0.536, Valid Accuracy: 0.840


 59%|█████▉    | 2970/5000 [25:43<22:26,  1.51it/s]

EPOCH: 2970, Valid Cost: 0.535, Valid Accuracy: 0.840


 60%|█████▉    | 2980/5000 [25:48<16:45,  2.01it/s]

EPOCH: 2980, Valid Cost: 0.535, Valid Accuracy: 0.840


 60%|█████▉    | 2990/5000 [25:53<15:43,  2.13it/s]

EPOCH: 2990, Valid Cost: 0.534, Valid Accuracy: 0.841


 60%|██████    | 3000/5000 [25:59<19:33,  1.70it/s]

EPOCH: 3000, Valid Cost: 0.534, Valid Accuracy: 0.841


 60%|██████    | 3010/5000 [26:04<15:31,  2.14it/s]

EPOCH: 3010, Valid Cost: 0.534, Valid Accuracy: 0.841


 60%|██████    | 3020/5000 [26:08<15:26,  2.14it/s]

EPOCH: 3020, Valid Cost: 0.534, Valid Accuracy: 0.841


 61%|██████    | 3030/5000 [26:14<17:23,  1.89it/s]

EPOCH: 3030, Valid Cost: 0.533, Valid Accuracy: 0.841


 61%|██████    | 3040/5000 [26:19<15:48,  2.07it/s]

EPOCH: 3040, Valid Cost: 0.533, Valid Accuracy: 0.841


 61%|██████    | 3050/5000 [26:24<19:22,  1.68it/s]

EPOCH: 3050, Valid Cost: 0.532, Valid Accuracy: 0.841


 61%|██████    | 3060/5000 [26:30<15:56,  2.03it/s]

EPOCH: 3060, Valid Cost: 0.531, Valid Accuracy: 0.842


 61%|██████▏   | 3070/5000 [26:34<14:56,  2.15it/s]

EPOCH: 3070, Valid Cost: 0.530, Valid Accuracy: 0.842


 62%|██████▏   | 3080/5000 [26:40<20:37,  1.55it/s]

EPOCH: 3080, Valid Cost: 0.529, Valid Accuracy: 0.843


 62%|██████▏   | 3090/5000 [26:45<15:13,  2.09it/s]

EPOCH: 3090, Valid Cost: 0.528, Valid Accuracy: 0.843


 62%|██████▏   | 3100/5000 [26:50<14:59,  2.11it/s]

EPOCH: 3100, Valid Cost: 0.526, Valid Accuracy: 0.843


 62%|██████▏   | 3110/5000 [26:56<20:12,  1.56it/s]

EPOCH: 3110, Valid Cost: 0.529, Valid Accuracy: 0.842


 62%|██████▏   | 3120/5000 [27:01<15:00,  2.09it/s]

EPOCH: 3120, Valid Cost: 0.566, Valid Accuracy: 0.823


 63%|██████▎   | 3130/5000 [27:05<14:46,  2.11it/s]

EPOCH: 3130, Valid Cost: 0.544, Valid Accuracy: 0.828


 63%|██████▎   | 3140/5000 [27:11<18:19,  1.69it/s]

EPOCH: 3140, Valid Cost: 0.529, Valid Accuracy: 0.835


 63%|██████▎   | 3150/5000 [27:16<14:31,  2.12it/s]

EPOCH: 3150, Valid Cost: 0.523, Valid Accuracy: 0.839


 63%|██████▎   | 3160/5000 [27:21<14:40,  2.09it/s]

EPOCH: 3160, Valid Cost: 0.521, Valid Accuracy: 0.840


 63%|██████▎   | 3170/5000 [27:27<16:10,  1.88it/s]

EPOCH: 3170, Valid Cost: 0.520, Valid Accuracy: 0.839


 64%|██████▎   | 3180/5000 [27:32<14:25,  2.10it/s]

EPOCH: 3180, Valid Cost: 0.520, Valid Accuracy: 0.840


 64%|██████▍   | 3190/5000 [27:37<16:57,  1.78it/s]

EPOCH: 3190, Valid Cost: 0.520, Valid Accuracy: 0.840


 64%|██████▍   | 3200/5000 [27:42<15:08,  1.98it/s]

EPOCH: 3200, Valid Cost: 0.519, Valid Accuracy: 0.840


 64%|██████▍   | 3210/5000 [27:47<14:06,  2.11it/s]

EPOCH: 3210, Valid Cost: 0.519, Valid Accuracy: 0.841


 64%|██████▍   | 3220/5000 [27:53<19:17,  1.54it/s]

EPOCH: 3220, Valid Cost: 0.518, Valid Accuracy: 0.840


 65%|██████▍   | 3230/5000 [27:58<14:30,  2.03it/s]

EPOCH: 3230, Valid Cost: 0.518, Valid Accuracy: 0.841


 65%|██████▍   | 3240/5000 [28:03<13:46,  2.13it/s]

EPOCH: 3240, Valid Cost: 0.517, Valid Accuracy: 0.840


 65%|██████▌   | 3250/5000 [28:09<19:43,  1.48it/s]

EPOCH: 3250, Valid Cost: 0.516, Valid Accuracy: 0.841


 65%|██████▌   | 3260/5000 [28:14<13:44,  2.11it/s]

EPOCH: 3260, Valid Cost: 0.516, Valid Accuracy: 0.841


 65%|██████▌   | 3270/5000 [28:18<13:21,  2.16it/s]

EPOCH: 3270, Valid Cost: 0.516, Valid Accuracy: 0.843


 66%|██████▌   | 3280/5000 [28:24<17:57,  1.60it/s]

EPOCH: 3280, Valid Cost: 0.516, Valid Accuracy: 0.843


 66%|██████▌   | 3290/5000 [28:29<13:19,  2.14it/s]

EPOCH: 3290, Valid Cost: 0.521, Valid Accuracy: 0.842


 66%|██████▌   | 3300/5000 [28:34<13:10,  2.15it/s]

EPOCH: 3300, Valid Cost: 0.585, Valid Accuracy: 0.827


 66%|██████▌   | 3310/5000 [28:40<15:14,  1.85it/s]

EPOCH: 3310, Valid Cost: 0.590, Valid Accuracy: 0.825


 66%|██████▋   | 3320/5000 [28:44<12:57,  2.16it/s]

EPOCH: 3320, Valid Cost: 0.569, Valid Accuracy: 0.832


 67%|██████▋   | 3330/5000 [28:49<15:13,  1.83it/s]

EPOCH: 3330, Valid Cost: 0.556, Valid Accuracy: 0.835


 67%|██████▋   | 3340/5000 [28:55<14:19,  1.93it/s]

EPOCH: 3340, Valid Cost: 0.548, Valid Accuracy: 0.838


 67%|██████▋   | 3350/5000 [29:00<12:42,  2.16it/s]

EPOCH: 3350, Valid Cost: 0.542, Valid Accuracy: 0.839


 67%|██████▋   | 3360/5000 [29:05<15:43,  1.74it/s]

EPOCH: 3360, Valid Cost: 0.539, Valid Accuracy: 0.840


 67%|██████▋   | 3370/5000 [29:11<13:18,  2.04it/s]

EPOCH: 3370, Valid Cost: 0.537, Valid Accuracy: 0.840


 68%|██████▊   | 3380/5000 [29:15<12:48,  2.11it/s]

EPOCH: 3380, Valid Cost: 0.535, Valid Accuracy: 0.840


 68%|██████▊   | 3390/5000 [29:21<18:08,  1.48it/s]

EPOCH: 3390, Valid Cost: 0.534, Valid Accuracy: 0.841


 68%|██████▊   | 3400/5000 [29:26<12:51,  2.07it/s]

EPOCH: 3400, Valid Cost: 0.532, Valid Accuracy: 0.842


 68%|██████▊   | 3410/5000 [29:31<12:29,  2.12it/s]

EPOCH: 3410, Valid Cost: 0.531, Valid Accuracy: 0.842


 68%|██████▊   | 3420/5000 [29:37<17:38,  1.49it/s]

EPOCH: 3420, Valid Cost: 0.531, Valid Accuracy: 0.842


 69%|██████▊   | 3430/5000 [29:42<12:30,  2.09it/s]

EPOCH: 3430, Valid Cost: 0.530, Valid Accuracy: 0.842


 69%|██████▉   | 3440/5000 [29:47<12:15,  2.12it/s]

EPOCH: 3440, Valid Cost: 0.530, Valid Accuracy: 0.842


 69%|██████▉   | 3450/5000 [29:53<14:45,  1.75it/s]

EPOCH: 3450, Valid Cost: 0.529, Valid Accuracy: 0.842


 69%|██████▉   | 3460/5000 [29:58<12:15,  2.09it/s]

EPOCH: 3460, Valid Cost: 0.529, Valid Accuracy: 0.842


 69%|██████▉   | 3470/5000 [30:03<13:17,  1.92it/s]

EPOCH: 3470, Valid Cost: 0.529, Valid Accuracy: 0.842


 70%|██████▉   | 3480/5000 [30:08<12:50,  1.97it/s]

EPOCH: 3480, Valid Cost: 0.529, Valid Accuracy: 0.842


 70%|██████▉   | 3490/5000 [30:13<11:54,  2.11it/s]

EPOCH: 3490, Valid Cost: 0.529, Valid Accuracy: 0.842


 70%|███████   | 3500/5000 [30:18<15:27,  1.62it/s]

EPOCH: 3500, Valid Cost: 0.528, Valid Accuracy: 0.842


 70%|███████   | 3510/5000 [30:24<12:13,  2.03it/s]

EPOCH: 3510, Valid Cost: 0.527, Valid Accuracy: 0.842


 70%|███████   | 3520/5000 [30:29<11:38,  2.12it/s]

EPOCH: 3520, Valid Cost: 0.526, Valid Accuracy: 0.842


 71%|███████   | 3530/5000 [30:35<16:05,  1.52it/s]

EPOCH: 3530, Valid Cost: 0.525, Valid Accuracy: 0.843


 71%|███████   | 3540/5000 [30:40<11:46,  2.07it/s]

EPOCH: 3540, Valid Cost: 0.524, Valid Accuracy: 0.843


 71%|███████   | 3550/5000 [30:44<11:10,  2.16it/s]

EPOCH: 3550, Valid Cost: 0.522, Valid Accuracy: 0.843


 71%|███████   | 3560/5000 [30:50<15:28,  1.55it/s]

EPOCH: 3560, Valid Cost: 0.523, Valid Accuracy: 0.842


 71%|███████▏  | 3570/5000 [30:55<11:24,  2.09it/s]

EPOCH: 3570, Valid Cost: 0.556, Valid Accuracy: 0.832


 72%|███████▏  | 3580/5000 [31:00<11:12,  2.11it/s]

EPOCH: 3580, Valid Cost: 0.545, Valid Accuracy: 0.827


 72%|███████▏  | 3590/5000 [31:06<12:51,  1.83it/s]

EPOCH: 3590, Valid Cost: 0.528, Valid Accuracy: 0.835


 72%|███████▏  | 3600/5000 [31:11<10:58,  2.12it/s]

EPOCH: 3600, Valid Cost: 0.520, Valid Accuracy: 0.839


 72%|███████▏  | 3610/5000 [31:16<12:24,  1.87it/s]

EPOCH: 3610, Valid Cost: 0.517, Valid Accuracy: 0.840


 72%|███████▏  | 3620/5000 [31:22<11:42,  1.97it/s]

EPOCH: 3620, Valid Cost: 0.516, Valid Accuracy: 0.841


 73%|███████▎  | 3630/5000 [31:26<10:51,  2.10it/s]

EPOCH: 3630, Valid Cost: 0.516, Valid Accuracy: 0.842


 73%|███████▎  | 3640/5000 [31:32<14:07,  1.60it/s]

EPOCH: 3640, Valid Cost: 0.516, Valid Accuracy: 0.841


 73%|███████▎  | 3650/5000 [31:37<10:56,  2.06it/s]

EPOCH: 3650, Valid Cost: 0.516, Valid Accuracy: 0.841


 73%|███████▎  | 3660/5000 [31:42<10:31,  2.12it/s]

EPOCH: 3660, Valid Cost: 0.516, Valid Accuracy: 0.841


 73%|███████▎  | 3670/5000 [31:48<14:21,  1.54it/s]

EPOCH: 3670, Valid Cost: 0.515, Valid Accuracy: 0.841


 74%|███████▎  | 3680/5000 [31:53<10:25,  2.11it/s]

EPOCH: 3680, Valid Cost: 0.514, Valid Accuracy: 0.841


 74%|███████▍  | 3690/5000 [31:57<10:23,  2.10it/s]

EPOCH: 3690, Valid Cost: 0.514, Valid Accuracy: 0.841


 74%|███████▍  | 3700/5000 [32:04<13:35,  1.59it/s]

EPOCH: 3700, Valid Cost: 0.513, Valid Accuracy: 0.842


 74%|███████▍  | 3710/5000 [32:08<10:01,  2.14it/s]

EPOCH: 3710, Valid Cost: 0.513, Valid Accuracy: 0.841


 74%|███████▍  | 3720/5000 [32:13<10:06,  2.11it/s]

EPOCH: 3720, Valid Cost: 0.512, Valid Accuracy: 0.843


 75%|███████▍  | 3730/5000 [32:19<11:32,  1.83it/s]

EPOCH: 3730, Valid Cost: 0.513, Valid Accuracy: 0.844


 75%|███████▍  | 3740/5000 [32:24<09:48,  2.14it/s]

EPOCH: 3740, Valid Cost: 0.514, Valid Accuracy: 0.844


 75%|███████▌  | 3750/5000 [32:29<11:35,  1.80it/s]

EPOCH: 3750, Valid Cost: 0.534, Valid Accuracy: 0.842


 75%|███████▌  | 3760/5000 [32:35<10:20,  2.00it/s]

EPOCH: 3760, Valid Cost: 0.598, Valid Accuracy: 0.822


 75%|███████▌  | 3770/5000 [32:39<09:44,  2.10it/s]

EPOCH: 3770, Valid Cost: 0.573, Valid Accuracy: 0.830


 76%|███████▌  | 3780/5000 [32:45<12:03,  1.69it/s]

EPOCH: 3780, Valid Cost: 0.556, Valid Accuracy: 0.834


 76%|███████▌  | 3790/5000 [32:50<09:45,  2.07it/s]

EPOCH: 3790, Valid Cost: 0.546, Valid Accuracy: 0.837


 76%|███████▌  | 3800/5000 [32:55<09:25,  2.12it/s]

EPOCH: 3800, Valid Cost: 0.540, Valid Accuracy: 0.838


 76%|███████▌  | 3810/5000 [33:00<12:45,  1.55it/s]

EPOCH: 3810, Valid Cost: 0.536, Valid Accuracy: 0.840


 76%|███████▋  | 3820/5000 [33:05<09:14,  2.13it/s]

EPOCH: 3820, Valid Cost: 0.533, Valid Accuracy: 0.841


 77%|███████▋  | 3830/5000 [33:10<09:06,  2.14it/s]

EPOCH: 3830, Valid Cost: 0.532, Valid Accuracy: 0.841


 77%|███████▋  | 3840/5000 [33:16<12:21,  1.56it/s]

EPOCH: 3840, Valid Cost: 0.530, Valid Accuracy: 0.842


 77%|███████▋  | 3850/5000 [33:21<09:07,  2.10it/s]

EPOCH: 3850, Valid Cost: 0.529, Valid Accuracy: 0.843


 77%|███████▋  | 3860/5000 [33:25<08:53,  2.14it/s]

EPOCH: 3860, Valid Cost: 0.528, Valid Accuracy: 0.843


 77%|███████▋  | 3870/5000 [33:31<10:18,  1.83it/s]

EPOCH: 3870, Valid Cost: 0.527, Valid Accuracy: 0.843


 78%|███████▊  | 3880/5000 [33:36<08:45,  2.13it/s]

EPOCH: 3880, Valid Cost: 0.526, Valid Accuracy: 0.842


 78%|███████▊  | 3890/5000 [33:41<09:25,  1.96it/s]

EPOCH: 3890, Valid Cost: 0.526, Valid Accuracy: 0.842


 78%|███████▊  | 3900/5000 [33:47<09:29,  1.93it/s]

EPOCH: 3900, Valid Cost: 0.526, Valid Accuracy: 0.843


 78%|███████▊  | 3910/5000 [33:51<08:29,  2.14it/s]

EPOCH: 3910, Valid Cost: 0.525, Valid Accuracy: 0.843


 78%|███████▊  | 3920/5000 [33:57<10:36,  1.70it/s]

EPOCH: 3920, Valid Cost: 0.525, Valid Accuracy: 0.843


 79%|███████▊  | 3930/5000 [34:02<08:51,  2.01it/s]

EPOCH: 3930, Valid Cost: 0.525, Valid Accuracy: 0.843


 79%|███████▉  | 3940/5000 [34:07<08:11,  2.16it/s]

EPOCH: 3940, Valid Cost: 0.525, Valid Accuracy: 0.842


 79%|███████▉  | 3950/5000 [34:12<10:41,  1.64it/s]

EPOCH: 3950, Valid Cost: 0.525, Valid Accuracy: 0.842


 79%|███████▉  | 3960/5000 [34:18<08:14,  2.10it/s]

EPOCH: 3960, Valid Cost: 0.524, Valid Accuracy: 0.842


 79%|███████▉  | 3970/5000 [34:22<07:51,  2.18it/s]

EPOCH: 3970, Valid Cost: 0.524, Valid Accuracy: 0.843


 80%|███████▉  | 3980/5000 [34:28<10:48,  1.57it/s]

EPOCH: 3980, Valid Cost: 0.522, Valid Accuracy: 0.843


 80%|███████▉  | 3990/5000 [34:33<07:55,  2.12it/s]

EPOCH: 3990, Valid Cost: 0.521, Valid Accuracy: 0.844


 80%|████████  | 4000/5000 [34:37<07:41,  2.16it/s]

EPOCH: 4000, Valid Cost: 0.519, Valid Accuracy: 0.844


 80%|████████  | 4010/5000 [34:43<10:43,  1.54it/s]

EPOCH: 4010, Valid Cost: 0.519, Valid Accuracy: 0.844


 80%|████████  | 4020/5000 [34:48<07:37,  2.14it/s]

EPOCH: 4020, Valid Cost: 0.540, Valid Accuracy: 0.838


 81%|████████  | 4030/5000 [34:53<07:26,  2.17it/s]

EPOCH: 4030, Valid Cost: 0.548, Valid Accuracy: 0.827


 81%|████████  | 4040/5000 [34:59<09:26,  1.69it/s]

EPOCH: 4040, Valid Cost: 0.528, Valid Accuracy: 0.835


 81%|████████  | 4050/5000 [35:03<07:27,  2.12it/s]

EPOCH: 4050, Valid Cost: 0.518, Valid Accuracy: 0.839


 81%|████████  | 4060/5000 [35:08<07:09,  2.19it/s]

EPOCH: 4060, Valid Cost: 0.515, Valid Accuracy: 0.841


 81%|████████▏ | 4070/5000 [35:14<08:17,  1.87it/s]

EPOCH: 4070, Valid Cost: 0.513, Valid Accuracy: 0.842


 82%|████████▏ | 4080/5000 [35:19<07:11,  2.13it/s]

EPOCH: 4080, Valid Cost: 0.513, Valid Accuracy: 0.842


 82%|████████▏ | 4090/5000 [35:24<08:16,  1.83it/s]

EPOCH: 4090, Valid Cost: 0.513, Valid Accuracy: 0.842


 82%|████████▏ | 4100/5000 [35:30<07:35,  1.97it/s]

EPOCH: 4100, Valid Cost: 0.513, Valid Accuracy: 0.842


 82%|████████▏ | 4110/5000 [35:34<06:52,  2.16it/s]

EPOCH: 4110, Valid Cost: 0.513, Valid Accuracy: 0.842


 82%|████████▏ | 4120/5000 [35:39<09:01,  1.62it/s]

EPOCH: 4120, Valid Cost: 0.512, Valid Accuracy: 0.842


 83%|████████▎ | 4130/5000 [35:45<06:59,  2.08it/s]

EPOCH: 4130, Valid Cost: 0.512, Valid Accuracy: 0.841


 83%|████████▎ | 4140/5000 [35:49<06:36,  2.17it/s]

EPOCH: 4140, Valid Cost: 0.511, Valid Accuracy: 0.842


 83%|████████▎ | 4150/5000 [35:55<08:43,  1.62it/s]

EPOCH: 4150, Valid Cost: 0.510, Valid Accuracy: 0.842


 83%|████████▎ | 4160/5000 [36:00<06:41,  2.09it/s]

EPOCH: 4160, Valid Cost: 0.510, Valid Accuracy: 0.843


 83%|████████▎ | 4170/5000 [36:05<06:22,  2.17it/s]

EPOCH: 4170, Valid Cost: 0.510, Valid Accuracy: 0.843


 84%|████████▎ | 4180/5000 [36:10<08:24,  1.63it/s]

EPOCH: 4180, Valid Cost: 0.510, Valid Accuracy: 0.844


 84%|████████▍ | 4190/5000 [36:15<06:19,  2.14it/s]

EPOCH: 4190, Valid Cost: 0.510, Valid Accuracy: 0.845


 84%|████████▍ | 4200/5000 [36:20<06:13,  2.14it/s]

EPOCH: 4200, Valid Cost: 0.516, Valid Accuracy: 0.845


 84%|████████▍ | 4210/5000 [36:26<08:15,  1.60it/s]

EPOCH: 4210, Valid Cost: 0.581, Valid Accuracy: 0.825


 84%|████████▍ | 4220/5000 [36:31<06:07,  2.13it/s]

EPOCH: 4220, Valid Cost: 0.581, Valid Accuracy: 0.826


 85%|████████▍ | 4230/5000 [36:35<05:57,  2.16it/s]

EPOCH: 4230, Valid Cost: 0.560, Valid Accuracy: 0.832


 85%|████████▍ | 4240/5000 [36:41<07:01,  1.80it/s]

EPOCH: 4240, Valid Cost: 0.547, Valid Accuracy: 0.836


 85%|████████▌ | 4250/5000 [36:46<05:52,  2.13it/s]

EPOCH: 4250, Valid Cost: 0.539, Valid Accuracy: 0.838


 85%|████████▌ | 4260/5000 [36:51<06:22,  1.93it/s]

EPOCH: 4260, Valid Cost: 0.534, Valid Accuracy: 0.839


 85%|████████▌ | 4270/5000 [36:57<06:23,  1.91it/s]

EPOCH: 4270, Valid Cost: 0.531, Valid Accuracy: 0.841


 86%|████████▌ | 4280/5000 [37:01<05:29,  2.18it/s]

EPOCH: 4280, Valid Cost: 0.529, Valid Accuracy: 0.841


 86%|████████▌ | 4290/5000 [37:06<06:34,  1.80it/s]

EPOCH: 4290, Valid Cost: 0.527, Valid Accuracy: 0.841


 86%|████████▌ | 4300/5000 [37:12<05:44,  2.03it/s]

EPOCH: 4300, Valid Cost: 0.526, Valid Accuracy: 0.842


 86%|████████▌ | 4310/5000 [37:17<05:16,  2.18it/s]

EPOCH: 4310, Valid Cost: 0.525, Valid Accuracy: 0.842


 86%|████████▋ | 4320/5000 [37:22<06:58,  1.62it/s]

EPOCH: 4320, Valid Cost: 0.524, Valid Accuracy: 0.842


 87%|████████▋ | 4330/5000 [37:27<05:23,  2.07it/s]

EPOCH: 4330, Valid Cost: 0.524, Valid Accuracy: 0.842


 87%|████████▋ | 4340/5000 [37:32<05:08,  2.14it/s]

EPOCH: 4340, Valid Cost: 0.523, Valid Accuracy: 0.842


 87%|████████▋ | 4350/5000 [37:38<06:55,  1.57it/s]

EPOCH: 4350, Valid Cost: 0.523, Valid Accuracy: 0.842


 87%|████████▋ | 4360/5000 [37:43<05:05,  2.09it/s]

EPOCH: 4360, Valid Cost: 0.522, Valid Accuracy: 0.842


 87%|████████▋ | 4370/5000 [37:47<04:57,  2.12it/s]

EPOCH: 4370, Valid Cost: 0.522, Valid Accuracy: 0.842


 88%|████████▊ | 4380/5000 [37:53<06:29,  1.59it/s]

EPOCH: 4380, Valid Cost: 0.522, Valid Accuracy: 0.843


 88%|████████▊ | 4390/5000 [37:58<04:46,  2.13it/s]

EPOCH: 4390, Valid Cost: 0.522, Valid Accuracy: 0.843


 88%|████████▊ | 4400/5000 [38:03<04:43,  2.12it/s]

EPOCH: 4400, Valid Cost: 0.522, Valid Accuracy: 0.843


 88%|████████▊ | 4410/5000 [38:09<05:27,  1.80it/s]

EPOCH: 4410, Valid Cost: 0.522, Valid Accuracy: 0.843


 88%|████████▊ | 4420/5000 [38:13<04:30,  2.15it/s]

EPOCH: 4420, Valid Cost: 0.521, Valid Accuracy: 0.842


 89%|████████▊ | 4430/5000 [38:18<04:51,  1.95it/s]

EPOCH: 4430, Valid Cost: 0.520, Valid Accuracy: 0.842


 89%|████████▉ | 4440/5000 [38:24<04:46,  1.95it/s]

EPOCH: 4440, Valid Cost: 0.519, Valid Accuracy: 0.843


 89%|████████▉ | 4450/5000 [38:29<04:16,  2.14it/s]

EPOCH: 4450, Valid Cost: 0.517, Valid Accuracy: 0.844


 89%|████████▉ | 4460/5000 [38:34<05:22,  1.68it/s]

EPOCH: 4460, Valid Cost: 0.516, Valid Accuracy: 0.844


 89%|████████▉ | 4470/5000 [38:40<04:21,  2.03it/s]

EPOCH: 4470, Valid Cost: 0.523, Valid Accuracy: 0.844


 90%|████████▉ | 4480/5000 [38:44<03:59,  2.17it/s]

EPOCH: 4480, Valid Cost: 0.553, Valid Accuracy: 0.826


 90%|████████▉ | 4490/5000 [38:50<05:10,  1.64it/s]

EPOCH: 4490, Valid Cost: 0.531, Valid Accuracy: 0.833


 90%|█████████ | 4500/5000 [38:55<03:54,  2.13it/s]

EPOCH: 4500, Valid Cost: 0.518, Valid Accuracy: 0.839


 90%|█████████ | 4510/5000 [38:59<03:47,  2.15it/s]

EPOCH: 4510, Valid Cost: 0.513, Valid Accuracy: 0.842


 90%|█████████ | 4520/5000 [39:05<05:15,  1.52it/s]

EPOCH: 4520, Valid Cost: 0.511, Valid Accuracy: 0.842


 91%|█████████ | 4530/5000 [39:10<03:40,  2.13it/s]

EPOCH: 4530, Valid Cost: 0.511, Valid Accuracy: 0.842


 91%|█████████ | 4540/5000 [39:15<03:34,  2.15it/s]

EPOCH: 4540, Valid Cost: 0.511, Valid Accuracy: 0.843


 91%|█████████ | 4550/5000 [39:21<04:37,  1.62it/s]

EPOCH: 4550, Valid Cost: 0.511, Valid Accuracy: 0.842


 91%|█████████ | 4560/5000 [39:26<03:26,  2.13it/s]

EPOCH: 4560, Valid Cost: 0.511, Valid Accuracy: 0.842


 91%|█████████▏| 4570/5000 [39:30<03:19,  2.15it/s]

EPOCH: 4570, Valid Cost: 0.510, Valid Accuracy: 0.843


 92%|█████████▏| 4580/5000 [39:36<03:44,  1.87it/s]

EPOCH: 4580, Valid Cost: 0.510, Valid Accuracy: 0.843


 92%|█████████▏| 4590/5000 [39:41<03:12,  2.14it/s]

EPOCH: 4590, Valid Cost: 0.509, Valid Accuracy: 0.842


 92%|█████████▏| 4600/5000 [39:46<03:25,  1.95it/s]

EPOCH: 4600, Valid Cost: 0.508, Valid Accuracy: 0.843


 92%|█████████▏| 4610/5000 [39:52<03:18,  1.97it/s]

EPOCH: 4610, Valid Cost: 0.508, Valid Accuracy: 0.843


 92%|█████████▏| 4620/5000 [39:56<02:59,  2.11it/s]

EPOCH: 4620, Valid Cost: 0.508, Valid Accuracy: 0.844


 93%|█████████▎| 4630/5000 [40:01<03:46,  1.64it/s]

EPOCH: 4630, Valid Cost: 0.507, Valid Accuracy: 0.845


 93%|█████████▎| 4640/5000 [40:07<02:55,  2.05it/s]

EPOCH: 4640, Valid Cost: 0.508, Valid Accuracy: 0.845


 93%|█████████▎| 4650/5000 [40:12<02:41,  2.17it/s]

EPOCH: 4650, Valid Cost: 0.509, Valid Accuracy: 0.845


 93%|█████████▎| 4660/5000 [40:17<03:28,  1.63it/s]

EPOCH: 4660, Valid Cost: 0.525, Valid Accuracy: 0.842


 93%|█████████▎| 4670/5000 [40:22<02:37,  2.10it/s]

EPOCH: 4670, Valid Cost: 0.592, Valid Accuracy: 0.823


 94%|█████████▎| 4680/5000 [40:27<02:28,  2.16it/s]

EPOCH: 4680, Valid Cost: 0.568, Valid Accuracy: 0.830


 94%|█████████▍| 4690/5000 [40:33<03:12,  1.61it/s]

EPOCH: 4690, Valid Cost: 0.550, Valid Accuracy: 0.835


 94%|█████████▍| 4700/5000 [40:38<02:21,  2.12it/s]

EPOCH: 4700, Valid Cost: 0.540, Valid Accuracy: 0.837


 94%|█████████▍| 4710/5000 [40:42<02:14,  2.15it/s]

EPOCH: 4710, Valid Cost: 0.533, Valid Accuracy: 0.838


 94%|█████████▍| 4720/5000 [40:48<03:01,  1.54it/s]

EPOCH: 4720, Valid Cost: 0.530, Valid Accuracy: 0.840


 95%|█████████▍| 4730/5000 [40:53<02:07,  2.12it/s]

EPOCH: 4730, Valid Cost: 0.527, Valid Accuracy: 0.841


 95%|█████████▍| 4740/5000 [40:58<02:00,  2.16it/s]

EPOCH: 4740, Valid Cost: 0.525, Valid Accuracy: 0.842


 95%|█████████▌| 4750/5000 [41:04<02:24,  1.73it/s]

EPOCH: 4750, Valid Cost: 0.524, Valid Accuracy: 0.842


 95%|█████████▌| 4760/5000 [41:09<01:53,  2.11it/s]

EPOCH: 4760, Valid Cost: 0.523, Valid Accuracy: 0.842


 95%|█████████▌| 4770/5000 [41:13<01:49,  2.10it/s]

EPOCH: 4770, Valid Cost: 0.522, Valid Accuracy: 0.842


 96%|█████████▌| 4780/5000 [41:19<01:54,  1.92it/s]

EPOCH: 4780, Valid Cost: 0.521, Valid Accuracy: 0.842


 96%|█████████▌| 4790/5000 [41:24<01:37,  2.16it/s]

EPOCH: 4790, Valid Cost: 0.521, Valid Accuracy: 0.842


 96%|█████████▌| 4800/5000 [41:29<01:52,  1.78it/s]

EPOCH: 4800, Valid Cost: 0.520, Valid Accuracy: 0.842


 96%|█████████▌| 4810/5000 [41:35<01:36,  1.98it/s]

EPOCH: 4810, Valid Cost: 0.520, Valid Accuracy: 0.842


 96%|█████████▋| 4820/5000 [41:39<01:23,  2.15it/s]

EPOCH: 4820, Valid Cost: 0.520, Valid Accuracy: 0.842


 97%|█████████▋| 4830/5000 [41:45<01:42,  1.66it/s]

EPOCH: 4830, Valid Cost: 0.520, Valid Accuracy: 0.842


 97%|█████████▋| 4840/5000 [41:50<01:16,  2.09it/s]

EPOCH: 4840, Valid Cost: 0.520, Valid Accuracy: 0.843


 97%|█████████▋| 4850/5000 [41:55<01:09,  2.16it/s]

EPOCH: 4850, Valid Cost: 0.520, Valid Accuracy: 0.842


 97%|█████████▋| 4860/5000 [42:00<01:30,  1.55it/s]

EPOCH: 4860, Valid Cost: 0.520, Valid Accuracy: 0.843


 97%|█████████▋| 4870/5000 [42:06<01:02,  2.09it/s]

EPOCH: 4870, Valid Cost: 0.519, Valid Accuracy: 0.842


 98%|█████████▊| 4880/5000 [42:10<00:54,  2.18it/s]

EPOCH: 4880, Valid Cost: 0.518, Valid Accuracy: 0.842


 98%|█████████▊| 4890/5000 [42:16<01:10,  1.55it/s]

EPOCH: 4890, Valid Cost: 0.517, Valid Accuracy: 0.843


 98%|█████████▊| 4900/5000 [42:21<00:46,  2.13it/s]

EPOCH: 4900, Valid Cost: 0.516, Valid Accuracy: 0.844


 98%|█████████▊| 4910/5000 [42:25<00:42,  2.14it/s]

EPOCH: 4910, Valid Cost: 0.514, Valid Accuracy: 0.845


 98%|█████████▊| 4920/5000 [42:31<00:45,  1.75it/s]

EPOCH: 4920, Valid Cost: 0.514, Valid Accuracy: 0.844


 99%|█████████▊| 4930/5000 [42:36<00:32,  2.15it/s]

EPOCH: 4930, Valid Cost: 0.541, Valid Accuracy: 0.836


 99%|█████████▉| 4940/5000 [42:41<00:28,  2.14it/s]

EPOCH: 4940, Valid Cost: 0.540, Valid Accuracy: 0.829


 99%|█████████▉| 4950/5000 [42:47<00:26,  1.90it/s]

EPOCH: 4950, Valid Cost: 0.522, Valid Accuracy: 0.836


 99%|█████████▉| 4960/5000 [42:51<00:18,  2.12it/s]

EPOCH: 4960, Valid Cost: 0.513, Valid Accuracy: 0.841


 99%|█████████▉| 4970/5000 [42:56<00:16,  1.81it/s]

EPOCH: 4970, Valid Cost: 0.510, Valid Accuracy: 0.843


100%|█████████▉| 4980/5000 [43:02<00:10,  1.99it/s]

EPOCH: 4980, Valid Cost: 0.509, Valid Accuracy: 0.843


100%|█████████▉| 4990/5000 [43:07<00:04,  2.16it/s]

EPOCH: 4990, Valid Cost: 0.509, Valid Accuracy: 0.843


100%|██████████| 5000/5000 [43:12<00:00,  1.93it/s]

EPOCH: 5000, Valid Cost: 0.509, Valid Accuracy: 0.843
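
The log above shows the validation cost spiking periodically (for example, 0.531 at epoch 2650 rising to 0.565 at epoch 2670), and the lowest logged value (0.507 at epoch 4630) is not the final one. One possible way to guard against ending on a spike is to keep a copy of the parameters with the lowest validation cost seen so far. The following is a minimal sketch on synthetic data, assuming a plain gradient-descent loop over `W` and `b`. The toy dataset, the learning rate of 0.5, and the names `best_W`, `best_b`, and `cross_entropy` are illustrative assumptions, not taken from the notebook's training cell.

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max before exponentiating for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(y, p, eps=1e-10):
    # Mean multi-class cross-entropy for one-hot targets y and probabilities p.
    return float(-(y * np.log(np.clip(p, eps, 1.0))).sum(axis=1).mean())

# Hypothetical toy data standing in for the Fashion MNIST train/valid split.
rng = np.random.default_rng(0)
n_classes, n_features = 3, 4
centers = rng.normal(size=(n_classes, n_features)) * 3
labels = rng.integers(0, n_classes, size=300)
x = centers[labels] + rng.normal(size=(300, n_features))
y = np.eye(n_classes)[labels]
x_tr, y_tr, x_va, y_va = x[:200], y[:200], x[200:], y[200:]

W = np.zeros((n_features, n_classes))
b = np.zeros(n_classes)

# Track the best parameters by validation cost (copies, since W and b are updated in place).
best_cost, best_W, best_b, best_epoch = np.inf, W.copy(), b.copy(), 0
costs = []

for epoch in range(1, 201):
    p = softmax(x_tr @ W + b)
    grad = p - y_tr                      # gradient of the cost w.r.t. the logits
    W -= 0.5 * (x_tr.T @ grad) / len(x_tr)
    b -= 0.5 * grad.mean(axis=0)

    valid_cost = cross_entropy(y_va, softmax(x_va @ W + b))
    costs.append(valid_cost)
    if valid_cost < best_cost:
        best_cost, best_W, best_b, best_epoch = valid_cost, W.copy(), b.copy(), epoch

print(f'best epoch: {best_epoch}, best valid cost: {best_cost:.3f}')
```

With this pattern, the test-set prediction would use `best_W` and `best_b` rather than whatever values the last epoch left behind.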





In [23]:

# Predict a label for each test image: take the class with the highest softmax probability.
y_pred_test = softmax(np.matmul(x_test, W) + b).argmax(axis=1)

submission = pd.Series(y_pred_test, name='label')
submission.to_csv('drive/MyDrive/Colab Notebooks/DeepLearning/Lecture02/submission_pred.csv', header=True, index_label='id')
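
Before uploading, it may be worth checking that the predictions have the expected shape and label range and that the CSV header matches what the cell above writes. The following is a minimal sketch using random stand-in data. The names `x_demo`, `W_demo`, and `b_demo` are hypothetical placeholders for `x_test` and the trained `W` and `b`, and the `softmax` defined here is an assumed numerically stable version, not necessarily the one used earlier in the notebook.

```python
import numpy as np
import pandas as pd

def softmax(z):
    # Subtract the row-wise max before exponentiating for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical stand-ins for x_test and the trained parameters.
rng = np.random.default_rng(0)
x_demo = rng.random((5, 784)).astype('float32')
W_demo = rng.normal(scale=0.01, size=(784, 10))
b_demo = np.zeros(10)

probs = softmax(np.matmul(x_demo, W_demo) + b_demo)
y_pred_demo = probs.argmax(axis=1)

# Sanity checks: one label per sample, labels in 0..9, rows of probs sum to 1.
assert y_pred_demo.shape == (x_demo.shape[0],)
assert y_pred_demo.min() >= 0 and y_pred_demo.max() <= 9
assert np.allclose(probs.sum(axis=1), 1.0)

# Calling to_csv without a path returns the CSV text, which is handy for inspection.
submission_demo = pd.Series(y_pred_demo, name='label')
csv_text = submission_demo.to_csv(header=True, index_label='id')
print(csv_text.splitlines()[0])
```

The same assertions can be run against `y_pred_test` directly; the stand-in arrays are used here only so the sketch runs on its own.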

In [14]:
print(y_train)

[[0. 1. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 1.]
 [0. 0. 0. ... 0. 0. 1.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 1. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 1.]]
