# Lecture 3 Homework

## Assignment

Using what you learned in this lesson, classify Fashion MNIST (the fashion version of MNIST, 10 classes) with a multilayer perceptron.

For details on Fashion MNIST, see the link below.

Fashion MNIST: https://github.com/zalandoresearch/fashion-mnist

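### Target

Accuracy 85%

### Rules

- The training data are given as x_train and t_train, and the test data as x_test.
- Express the predicted labels as class labels 0-9, not as one-hot vectors.
- **Do not use any training data other than the x_train and t_train specified in the cell below.**
- Referring to the Lecture 3 exercise, implement the model with NumPy rather than PyTorch.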
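
The rule about class labels can be illustrated with a small sketch. The probabilities below are made-up values, not model output; the point is only that `argmax` over the class axis turns each row of softmax outputs into a single label in 0-9.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples over 10 classes (illustration only).
probs = np.full((3, 10), 0.05)
probs[0, 2] = 0.55
probs[1, 9] = 0.55
probs[2, 0] = 0.55

# argmax along the class axis picks the most probable class for each sample.
labels = probs.argmax(axis=1)
print(labels)  # [2 9 0]
```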

### Submission

- Submit two files.
  - Submit the predicted labels for the test data (x_test) as a csv file (file name: submission_pred.csv).
  - Submit the corresponding Python code as submission_code.py (use the %%writefile command or similar).
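
One possible way to write the prediction file is sketched below. The assignment text does not specify the csv layout, so the `id` and `label` column names and the helper `save_predictions` are assumptions made for illustration; check the submission system for the expected format.

```python
import numpy as np

def save_predictions(labels, path="submission_pred.csv"):
    """Write class labels (0-9) to a csv file, one row per test sample.

    The `id,label` header is an assumed layout, not taken from the assignment.
    """
    labels = np.asarray(labels).astype(int).ravel()
    ids = np.arange(labels.shape[0])
    table = np.column_stack([ids, labels])
    # comments="" stops np.savetxt from prefixing the header line with "# ".
    np.savetxt(path, table, fmt="%d", delimiter=",", header="id,label", comments="")

# Hypothetical usage with a trained model `mlp` and the notebook's x_test:
# save_predictions(mlp(x_test).argmax(axis=1))
```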

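### Evaluation

- Submissions are evaluated by the accuracy of the predicted labels against t_test.
- Submissions are scored immediately, and the Leader Board is updated.
- The score as of the deadline is used as the final evaluation.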
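
Since t_test is not available locally, the same metric can only be estimated on held-out training data. As a minimal sketch, accuracy is the fraction of predicted labels that match the true labels; the arrays below are made-up values and `accuracy` is a hypothetical helper.

```python
import numpy as np

def accuracy(t_true, t_pred):
    """Fraction of samples whose predicted label equals the true label."""
    t_true = np.asarray(t_true)
    t_pred = np.asarray(t_pred)
    return float((t_true == t_pred).mean())

t_true = np.array([0, 1, 2, 3])
t_pred = np.array([0, 1, 2, 9])   # one of four predictions is wrong
print(accuracy(t_true, t_pred))   # 0.75
```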

In [2]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


### Loading the data

- Do not modify this part.

In [2]:
import os
import numpy as np
import pandas as pd
from sklearn.utils import shuffle
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import inspect


# Training data
x_train = np.load('drive/MyDrive/Colab Notebooks/DLBasics2023_colab/Lecture03/data/x_train.npy')
t_train = np.load('drive/MyDrive/Colab Notebooks/DLBasics2023_colab/Lecture03/data/y_train.npy')
    
# Test data
x_test = np.load('drive/MyDrive/Colab Notebooks/DLBasics2023_colab/Lecture03/data/x_test.npy')

# Preprocess the data (normalization, one-hot encoding)
x_train, x_test = x_train / 255., x_test / 255.
x_train, x_test = x_train.reshape(x_train.shape[0], -1), x_test.reshape(x_test.shape[0], -1)
t_train = np.eye(N=10)[t_train.astype("int32").flatten()]

FileNotFoundError: ignored

### Implementing the multilayer perceptron

In [3]:
import os
import numpy as np
import pandas as pd
from sklearn.utils import shuffle
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import inspect


# Training data
x_train = np.load('drive/MyDrive/Colab Notebooks/DeepLearning/Lecture03/data/x_train.npy')
t_train = np.load('drive/MyDrive/Colab Notebooks/DeepLearning/Lecture03/data/y_train.npy')
    
# Test data
x_test = np.load('drive/MyDrive/Colab Notebooks/DeepLearning/Lecture03/data/x_test.npy')

# Preprocess the data (normalization, one-hot encoding)
x_train, x_test = x_train / 255., x_test / 255.
x_train, x_test = x_train.reshape(x_train.shape[0], -1), x_test.reshape(x_test.shape[0], -1)
t_train = np.eye(N=10)[t_train.astype("int32").flatten()]
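
The three preprocessing steps above can be traced on a tiny stand-in dataset. This is a sketch only: the random images and the assumed `(n, 28, 28)` input shape are illustrative, not the real Fashion MNIST arrays.

```python
import numpy as np

# Hypothetical stand-ins for the real arrays: 3 grayscale 28x28 images and their labels.
images = np.random.RandomState(0).randint(0, 256, size=(3, 28, 28)).astype("uint8")
labels = np.array([2, 0, 9])

x = images / 255.                  # scale pixel values to [0, 1]
x = x.reshape(x.shape[0], -1)      # flatten each image into 784 features
# Row k of the identity matrix is the one-hot vector for class k.
t = np.eye(N=10)[labels.astype("int32").flatten()]

print(x.shape, t.shape)   # (3, 784) (3, 10)
print(t.argmax(axis=1))   # [2 0 9]
```

Taking `argmax(axis=1)` of the one-hot matrix recovers the original integer labels, which is the form the submission requires.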

In [4]:
# Split the data into training and validation sets
x_train, x_val, t_train, t_val =\
    train_test_split(x_train, t_train, test_size=10000)

In [5]:
def np_log(x):
    return np.log(np.clip(x, 1e-10, 1e+10))


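def create_batch(data, batch_size):
    """
    :param data: np.ndarray, input data
    :param batch_size: int, batch size
    :return: list of np.ndarray; the last batch is smaller when the data size is not a multiple of batch_size
    """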
    num_batches, mod = divmod(data.shape[0], batch_size)
    batched_data = np.split(data[: batch_size * num_batches], num_batches)
    if mod:
        batched_data.append(data[batch_size * num_batches:])

    return batched_data
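
To see how `create_batch` handles a dataset whose size is not a multiple of the batch size, here is a small sketch that repeats the function on made-up data (10 samples, batch size 4).

```python
import numpy as np

def create_batch(data, batch_size):
    # Same logic as the notebook's helper, repeated so this example runs on its own.
    num_batches, mod = divmod(data.shape[0], batch_size)
    batched_data = np.split(data[: batch_size * num_batches], num_batches)
    if mod:
        batched_data.append(data[batch_size * num_batches:])
    return batched_data

data = np.arange(20).reshape(10, 2)   # 10 hypothetical samples with 2 features each
batches = create_batch(data, batch_size=4)
sizes = [b.shape[0] for b in batches]
print(sizes)  # [4, 4, 2]
```

The two full batches come from `np.split`, and the remaining 2 samples are appended as a final, smaller batch.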

In [6]:
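# Try changing the seed value to see what happens.
rng = np.random.RandomState(1234)
random_state = 42

# Advanced: look up activation functions not covered in this lecture and implement them.
def relu(x):
  return np.maximum(x,0)


def deriv_relu(x):
  return (x > 0).astype(x.dtype)


def softmax(x):
  # Subtract the row-wise max for numerical stability.
  # Use a new array rather than `x -= ...` so the caller's array (the layer's self.u) is not modified.
  x = x - x.max(axis=1, keepdims=True)
  x_exp = np.exp(x)
  return x_exp / x_exp.sum(axis=1, keepdims=True)


def deriv_softmax(x):
  # Diagonal of the softmax Jacobian only.
  # Model.backward does not call this for the output layer, where delta = y - t is used directly.
  s = softmax(x)
  return s * (1 - s)


def crossentropy_loss(t, t_hat):
  return (- t * np_log(t_hat)).sum(axis=1).mean()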

class Dense:
  def __init__(self, in_dim, out_dim, function, deriv_function):

    # Draw from the seeded generator `rng` defined above so that runs are reproducible
    self.W = rng.uniform(low=-0.08, high=0.08,
                         size=(in_dim, out_dim)).astype("float64")
    self.b = np.zeros(out_dim).astype("float64")
    self.function = function
    self.deriv_function = deriv_function

    self.x = None
    self.u = None

    self.dW = None
    self.db = None

    self.params_idxs = np.cumsum([self.W.size, self.b.size])

  def __call__(self, x):
    self.x = x
    self.u = np.matmul(self.x, self.W) + self.b
    h = self.function(self.u)
    return h

  def b_prop(self, delta, W):
    self.delta = self.deriv_function(self.u) * np.matmul(delta, W.T)
    return self.delta

  def compute_grad(self):
    batch_size = self.delta.shape[0]
    self.dW = np.matmul(self.x.T, self.delta) / batch_size
    self.db = np.matmul(np.ones(batch_size), self.delta) / batch_size

  def get_params(self):
    return np.concatenate([self.W.ravel(), self.b], axis=0)

  def set_params(self, params):
    _W, _b = np.split(params, self.params_idxs)[:-1]
    self.W = _W.reshape(self.W.shape)
    self.b = _b

  def get_grads(self):
    return np.concatenate([self.dW.ravel(), self.db], axis=0)


class Model:
    def __init__(self, hidden_dims, activation_functions, deriv_functions):
        """
        :param hidden_dims: List[int], number of units in each layer (input layer included).
        :param activation_functions: List, activation function used in each layer.
        :param deriv_functions: List, derivative of the activation function used in each layer.
        """
        # Store the layers in a list
        self.layers = []

        for i in range(len(hidden_dims)-2):  # all layers except the output layer share the same structure
            self.layers.append(Dense(hidden_dims[i], hidden_dims[i+1],
                                     activation_functions[i], deriv_functions[i]))
        self.layers.append(Dense(hidden_dims[-2], hidden_dims[-1],
                                 activation_functions[-1], deriv_functions[-1]))  # add the output layer

        # Adam keeps moment estimates shaped like the parameters it updates,
        # so each layer gets its own optimizer instance.
        self.adams = [Adam() for _ in self.layers]

    def __call__(self, x):
        return self.forward(x)

    def forward(self, x):
        """Run forward propagation."""
        for layer in self.layers:
            x = layer(x)
        return x

    def backward(self, delta):
        """Run backpropagation and compute the gradients."""
        for i, layer in enumerate(self.layers[::-1]):
            if i == 0:  # output layer
                layer.delta = delta  # y - t
                layer.compute_grad()
            else:  # all other layers
                delta = layer.b_prop(delta, W)  # backpropagate the error
                layer.compute_grad()  # compute the gradients

            W = layer.W

    def adam_opt(self, lr=0.01):
        """Update the parameters with Adam."""
        for layer, adam in zip(self.layers, self.adams):
            adam.lr = lr
            params = {'W': layer.W, 'b': layer.b}
            grads = {'W': layer.dW, 'b': layer.db}
            adam.update(params, grads)  # updates layer.W and layer.b in place

    def update(self, eps=0.1):
        """Update the parameters with plain gradient descent."""
        for layer in self.layers:
            layer.W -= eps * layer.dW
            layer.b -= eps * layer.db
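
`Model.backward` takes `delta = y - t` for the output layer without calling `deriv_softmax`. The sketch below checks numerically why that is valid: for softmax followed by cross-entropy, the gradient of the summed loss with respect to the pre-activation `u` is `softmax(u) - t`. The random inputs and the helper `loss_sum` are assumptions made for this check; the division by the batch size happens separately in `compute_grad`.

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=1, keepdims=True)

def loss_sum(u, t):
    # Cross-entropy summed (not averaged) over the batch.
    return -(t * np.log(softmax(u))).sum()

rng = np.random.RandomState(0)
u = rng.randn(4, 10)                          # hypothetical pre-activations
t = np.eye(10)[rng.randint(0, 10, size=4)]    # hypothetical one-hot targets

analytic = softmax(u) - t                     # the delta passed to backward

# Central finite differences, one entry of u at a time.
numeric = np.zeros_like(u)
h = 1e-5
for idx in np.ndindex(*u.shape):
    u_plus, u_minus = u.copy(), u.copy()
    u_plus[idx] += h
    u_minus[idx] -= h
    numeric[idx] = (loss_sum(u_plus, t) - loss_sum(u_minus, t)) / (2 * h)

max_diff = np.abs(analytic - numeric).max()
print(max_diff)  # should be very small if the analytic gradient is right
```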




In [7]:
class Adam:
    def __init__(self, lr=0.01, beta1=0.9, beta2=0.999):
        self.lr = lr
        self.beta1 = beta1
        self.beta2 = beta2
        self.iter = 0
        self.m = None
        self.v = None

    def update(self, params, grads):
        if self.m is None:
            # Create the moment estimates m and v on the first update
            self.m, self.v = {}, {}
            for key, val in params.items():
                self.m[key] = np.zeros_like(val)
                self.v[key] = np.zeros_like(val)
        
        self.iter += 1
        # Fold both bias corrections into the step size, so m and v are used uncorrected below
        lr_t  = self.lr * np.sqrt(1.0 - self.beta2**self.iter) / (1.0 - self.beta1**self.iter)         
        
        for key in params.keys():
            self.m[key] = self.beta1 * self.m[key] + (1 - self.beta1) * grads[key]
            self.v[key] = self.beta2 * self.v[key] + (1 - self.beta2) * (grads[key]**2)
            
            params[key] -= lr_t * self.m[key] / (np.sqrt(self.v[key]) + 1e-7)
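
A short worked example may help show what the bias-corrected step size does. The sketch below computes only the first Adam update from zero-initialized moments, using the common defaults `beta1=0.9` and `beta2=0.999`; the function `adam_first_step` and the gradient values are hypothetical and exist only for illustration.

```python
import numpy as np

def adam_first_step(grad, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-7):
    """Size of the first Adam update, starting from m = v = 0."""
    m = (1 - beta1) * grad
    v = (1 - beta2) * grad**2
    # Same bias-corrected step size as in the class above, with iter = 1.
    lr_t = lr * np.sqrt(1.0 - beta2) / (1.0 - beta1)
    return lr_t * m / (np.sqrt(v) + eps)

grads = np.array([100.0, 1.0, -0.01])   # gradients of very different scales
steps = adam_first_step(grads)
print(steps)  # every entry has magnitude close to lr, with the sign of the gradient
```

On the first step the update is roughly `lr * sign(grad)` regardless of the gradient's magnitude, which is one reason Adam is less sensitive to gradient scale than the plain `update` method.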


### Training the model

In [9]:
from tqdm import tqdm

def train_model(mlp, x_train, t_train, x_val, t_val, n_epochs=10, lr=0.1):
    for epoch in tqdm(range(n_epochs)):
        losses_train = []
        losses_valid = []
        train_num = 0
        train_true_num = 0
        valid_num = 0
        valid_true_num = 0


        x_train, t_train = shuffle(x_train, t_train)
        x_train_batches, t_train_batches = create_batch(x_train, batch_size), create_batch(t_train, batch_size)

        x_val, t_val = shuffle(x_val, t_val)
        x_val_batches, t_val_batches = create_batch(x_val, batch_size), create_batch(t_val, batch_size)

        # Train the model
        for x, t in zip(x_train_batches, t_train_batches):
            # Forward pass
            t_hat = mlp(x)

            # Compute the loss
            loss = crossentropy_loss(t,t_hat)

            # Update the parameters
            delta = t_hat - t
            mlp.backward(delta)
            losses_train.append(loss.tolist())
            mlp.update(eps=lr)
            #mlp.adam_opt(lr)


            # Count correct predictions (normalize=False returns a count, not a ratio)
            acc = accuracy_score(t.argmax(axis=1), t_hat.argmax(axis=1), normalize=False)
            train_num += x.shape[0]
            train_true_num += acc

        # Evaluate the model
        for x, t in zip(x_val_batches, t_val_batches):
            # Forward pass
            y = mlp(x)

            # Compute the loss
            loss = crossentropy_loss(t,y)
            losses_valid.append(loss.tolist())

            acc = accuracy_score(t.argmax(axis=1), y.argmax(axis=1), normalize=False)
            valid_num += x.shape[0]
            valid_true_num += acc

        if epoch % 10 == 9 or epoch == 0:
          print('EPOCH: {}, Train [Loss: {:.3f}, Accuracy: {:.3f}], Valid [Loss: {:.3f}, Accuracy: {:.3f}]'.format(
              epoch + 1,
              np.mean(losses_train),
              train_true_num/train_num,
              np.mean(losses_valid),
              valid_true_num/valid_num
          ))



batch_size = 128
n_epochs = 2000
lr = 0.001

# Four layer sizes define three Dense layers, so three activation functions are needed
mlp = Model(hidden_dims=[784, 516, 128, 10],
              activation_functions=[relu, relu, softmax],
              deriv_functions=[deriv_relu, deriv_relu, deriv_softmax])
print('BatchSize: {}, N_epochs: {}, eps: {}'.format(
      batch_size,
      n_epochs,
      lr
))
train_model(mlp, x_train, t_train, x_val, t_val, n_epochs, lr)

BatchSize: 128, N_epochs: 2000, eps: 0.001


  0%|          | 1/2000 [00:13<7:27:19, 13.43s/it]

EPOCH: 1, Train [Loss: 2.222, Accuracy: 0.244], Valid [Loss: 2.142, Accuracy: 0.343]


  0%|          | 10/2000 [01:15<3:57:48,  7.17s/it]

EPOCH: 10, Train [Loss: 0.949, Accuracy: 0.688], Valid [Loss: 0.929, Accuracy: 0.686]


  1%|          | 20/2000 [02:22<3:38:28,  6.62s/it]

EPOCH: 20, Train [Loss: 0.716, Accuracy: 0.758], Valid [Loss: 0.710, Accuracy: 0.754]


  2%|▏         | 30/2000 [03:33<4:11:26,  7.66s/it]

EPOCH: 30, Train [Loss: 0.623, Accuracy: 0.794], Valid [Loss: 0.615, Accuracy: 0.789]


  2%|▏         | 40/2000 [04:40<3:42:45,  6.82s/it]

EPOCH: 40, Train [Loss: 0.568, Accuracy: 0.810], Valid [Loss: 0.560, Accuracy: 0.808]


  2%|▎         | 50/2000 [05:47<3:36:10,  6.65s/it]

EPOCH: 50, Train [Loss: 0.532, Accuracy: 0.821], Valid [Loss: 0.520, Accuracy: 0.821]


  3%|▎         | 60/2000 [06:56<3:45:40,  6.98s/it]

EPOCH: 60, Train [Loss: 0.507, Accuracy: 0.829], Valid [Loss: 0.496, Accuracy: 0.829]


  4%|▎         | 70/2000 [08:03<3:30:59,  6.56s/it]

EPOCH: 70, Train [Loss: 0.487, Accuracy: 0.834], Valid [Loss: 0.477, Accuracy: 0.835]


  4%|▍         | 80/2000 [09:11<3:40:11,  6.88s/it]

EPOCH: 80, Train [Loss: 0.472, Accuracy: 0.837], Valid [Loss: 0.462, Accuracy: 0.841]


  4%|▍         | 90/2000 [10:17<3:22:30,  6.36s/it]

EPOCH: 90, Train [Loss: 0.460, Accuracy: 0.842], Valid [Loss: 0.453, Accuracy: 0.844]


  5%|▌         | 100/2000 [11:27<3:31:39,  6.68s/it]

EPOCH: 100, Train [Loss: 0.450, Accuracy: 0.845], Valid [Loss: 0.441, Accuracy: 0.848]


  6%|▌         | 110/2000 [12:35<3:39:08,  6.96s/it]

EPOCH: 110, Train [Loss: 0.441, Accuracy: 0.847], Valid [Loss: 0.434, Accuracy: 0.849]


  6%|▌         | 120/2000 [13:41<3:24:01,  6.51s/it]

EPOCH: 120, Train [Loss: 0.433, Accuracy: 0.850], Valid [Loss: 0.430, Accuracy: 0.851]


  6%|▋         | 130/2000 [14:53<3:32:33,  6.82s/it]

EPOCH: 130, Train [Loss: 0.426, Accuracy: 0.853], Valid [Loss: 0.420, Accuracy: 0.854]


  7%|▋         | 140/2000 [16:01<3:33:51,  6.90s/it]

EPOCH: 140, Train [Loss: 0.420, Accuracy: 0.855], Valid [Loss: 0.414, Accuracy: 0.855]


  8%|▊         | 150/2000 [17:07<3:19:00,  6.45s/it]

EPOCH: 150, Train [Loss: 0.414, Accuracy: 0.857], Valid [Loss: 0.414, Accuracy: 0.857]


  8%|▊         | 160/2000 [18:18<3:35:31,  7.03s/it]

EPOCH: 160, Train [Loss: 0.408, Accuracy: 0.859], Valid [Loss: 0.408, Accuracy: 0.859]


  8%|▊         | 170/2000 [19:26<3:31:21,  6.93s/it]

EPOCH: 170, Train [Loss: 0.404, Accuracy: 0.860], Valid [Loss: 0.399, Accuracy: 0.861]


  9%|▉         | 180/2000 [20:33<3:19:34,  6.58s/it]

EPOCH: 180, Train [Loss: 0.399, Accuracy: 0.861], Valid [Loss: 0.395, Accuracy: 0.862]


 10%|▉         | 190/2000 [21:45<3:48:04,  7.56s/it]

EPOCH: 190, Train [Loss: 0.394, Accuracy: 0.863], Valid [Loss: 0.396, Accuracy: 0.861]


 10%|█         | 200/2000 [22:55<3:25:37,  6.85s/it]

EPOCH: 200, Train [Loss: 0.390, Accuracy: 0.865], Valid [Loss: 0.392, Accuracy: 0.863]


 10%|█         | 210/2000 [24:05<3:32:18,  7.12s/it]

EPOCH: 210, Train [Loss: 0.386, Accuracy: 0.866], Valid [Loss: 0.385, Accuracy: 0.864]


 11%|█         | 220/2000 [25:14<3:27:57,  7.01s/it]

EPOCH: 220, Train [Loss: 0.382, Accuracy: 0.867], Valid [Loss: 0.386, Accuracy: 0.865]


 12%|█▏        | 230/2000 [26:26<3:22:51,  6.88s/it]

EPOCH: 230, Train [Loss: 0.378, Accuracy: 0.869], Valid [Loss: 0.384, Accuracy: 0.867]


 12%|█▏        | 240/2000 [27:35<3:16:49,  6.71s/it]

EPOCH: 240, Train [Loss: 0.374, Accuracy: 0.870], Valid [Loss: 0.378, Accuracy: 0.869]


 12%|█▎        | 250/2000 [28:45<3:27:28,  7.11s/it]

EPOCH: 250, Train [Loss: 0.371, Accuracy: 0.871], Valid [Loss: 0.375, Accuracy: 0.870]


 13%|█▎        | 260/2000 [29:55<3:25:28,  7.09s/it]

EPOCH: 260, Train [Loss: 0.367, Accuracy: 0.872], Valid [Loss: 0.375, Accuracy: 0.871]


 14%|█▎        | 270/2000 [31:01<3:08:32,  6.54s/it]

EPOCH: 270, Train [Loss: 0.364, Accuracy: 0.873], Valid [Loss: 0.378, Accuracy: 0.871]


 14%|█▍        | 280/2000 [32:09<3:19:54,  6.97s/it]

EPOCH: 280, Train [Loss: 0.361, Accuracy: 0.875], Valid [Loss: 0.373, Accuracy: 0.872]


 14%|█▍        | 290/2000 [33:18<3:13:26,  6.79s/it]

EPOCH: 290, Train [Loss: 0.358, Accuracy: 0.875], Valid [Loss: 0.366, Accuracy: 0.873]


 15%|█▌        | 300/2000 [34:24<3:08:43,  6.66s/it]

EPOCH: 300, Train [Loss: 0.354, Accuracy: 0.876], Valid [Loss: 0.363, Accuracy: 0.875]


 16%|█▌        | 310/2000 [35:28<2:55:29,  6.23s/it]

EPOCH: 310, Train [Loss: 0.351, Accuracy: 0.878], Valid [Loss: 0.364, Accuracy: 0.875]


 16%|█▌        | 320/2000 [36:37<3:11:33,  6.84s/it]

EPOCH: 320, Train [Loss: 0.349, Accuracy: 0.878], Valid [Loss: 0.363, Accuracy: 0.874]


 16%|█▋        | 330/2000 [37:43<3:02:21,  6.55s/it]

EPOCH: 330, Train [Loss: 0.346, Accuracy: 0.879], Valid [Loss: 0.360, Accuracy: 0.875]


 17%|█▋        | 340/2000 [38:48<3:01:00,  6.54s/it]

EPOCH: 340, Train [Loss: 0.343, Accuracy: 0.880], Valid [Loss: 0.360, Accuracy: 0.876]


 18%|█▊        | 350/2000 [39:53<2:54:02,  6.33s/it]

EPOCH: 350, Train [Loss: 0.340, Accuracy: 0.882], Valid [Loss: 0.358, Accuracy: 0.876]


 18%|█▊        | 360/2000 [41:02<2:59:38,  6.57s/it]

EPOCH: 360, Train [Loss: 0.338, Accuracy: 0.882], Valid [Loss: 0.353, Accuracy: 0.880]


 18%|█▊        | 370/2000 [42:09<2:59:18,  6.60s/it]

EPOCH: 370, Train [Loss: 0.335, Accuracy: 0.883], Valid [Loss: 0.355, Accuracy: 0.878]


 19%|█▉        | 380/2000 [43:14<2:55:08,  6.49s/it]

EPOCH: 380, Train [Loss: 0.332, Accuracy: 0.884], Valid [Loss: 0.352, Accuracy: 0.879]


 20%|█▉        | 390/2000 [44:22<2:53:57,  6.48s/it]

EPOCH: 390, Train [Loss: 0.330, Accuracy: 0.885], Valid [Loss: 0.350, Accuracy: 0.880]


 20%|██        | 400/2000 [45:30<2:58:38,  6.70s/it]

EPOCH: 400, Train [Loss: 0.328, Accuracy: 0.885], Valid [Loss: 0.349, Accuracy: 0.881]


 20%|██        | 410/2000 [46:34<2:44:29,  6.21s/it]

EPOCH: 410, Train [Loss: 0.325, Accuracy: 0.887], Valid [Loss: 0.348, Accuracy: 0.881]


 21%|██        | 420/2000 [47:43<3:00:19,  6.85s/it]

EPOCH: 420, Train [Loss: 0.323, Accuracy: 0.886], Valid [Loss: 0.346, Accuracy: 0.881]


 22%|██▏       | 430/2000 [48:49<2:55:38,  6.71s/it]

EPOCH: 430, Train [Loss: 0.320, Accuracy: 0.888], Valid [Loss: 0.349, Accuracy: 0.881]


 22%|██▏       | 440/2000 [49:56<2:49:52,  6.53s/it]

EPOCH: 440, Train [Loss: 0.318, Accuracy: 0.889], Valid [Loss: 0.345, Accuracy: 0.880]


 22%|██▎       | 450/2000 [51:06<3:14:19,  7.52s/it]

EPOCH: 450, Train [Loss: 0.316, Accuracy: 0.890], Valid [Loss: 0.342, Accuracy: 0.881]


 23%|██▎       | 460/2000 [52:11<2:45:23,  6.44s/it]

EPOCH: 460, Train [Loss: 0.313, Accuracy: 0.890], Valid [Loss: 0.340, Accuracy: 0.882]


 24%|██▎       | 470/2000 [53:19<2:52:17,  6.76s/it]

EPOCH: 470, Train [Loss: 0.311, Accuracy: 0.891], Valid [Loss: 0.341, Accuracy: 0.881]


 24%|██▍       | 480/2000 [54:24<2:38:43,  6.27s/it]

EPOCH: 480, Train [Loss: 0.309, Accuracy: 0.891], Valid [Loss: 0.337, Accuracy: 0.884]


 24%|██▍       | 490/2000 [55:33<2:45:52,  6.59s/it]

EPOCH: 490, Train [Loss: 0.307, Accuracy: 0.892], Valid [Loss: 0.340, Accuracy: 0.881]


 25%|██▌       | 500/2000 [56:41<2:53:51,  6.95s/it]

EPOCH: 500, Train [Loss: 0.304, Accuracy: 0.893], Valid [Loss: 0.338, Accuracy: 0.884]


 26%|██▌       | 510/2000 [57:46<2:40:01,  6.44s/it]

EPOCH: 510, Train [Loss: 0.302, Accuracy: 0.895], Valid [Loss: 0.340, Accuracy: 0.884]


 26%|██▌       | 520/2000 [58:56<2:50:52,  6.93s/it]

EPOCH: 520, Train [Loss: 0.300, Accuracy: 0.895], Valid [Loss: 0.331, Accuracy: 0.884]


 26%|██▋       | 530/2000 [1:00:01<2:35:44,  6.36s/it]

EPOCH: 530, Train [Loss: 0.298, Accuracy: 0.896], Valid [Loss: 0.333, Accuracy: 0.885]


 27%|██▋       | 540/2000 [1:01:08<2:43:57,  6.74s/it]

EPOCH: 540, Train [Loss: 0.296, Accuracy: 0.896], Valid [Loss: 0.332, Accuracy: 0.885]


 28%|██▊       | 550/2000 [1:02:19<2:52:28,  7.14s/it]

EPOCH: 550, Train [Loss: 0.294, Accuracy: 0.897], Valid [Loss: 0.330, Accuracy: 0.886]


 28%|██▊       | 560/2000 [1:03:24<2:36:51,  6.54s/it]

EPOCH: 560, Train [Loss: 0.292, Accuracy: 0.898], Valid [Loss: 0.329, Accuracy: 0.885]


 28%|██▊       | 570/2000 [1:04:32<2:44:49,  6.92s/it]

EPOCH: 570, Train [Loss: 0.290, Accuracy: 0.899], Valid [Loss: 0.332, Accuracy: 0.886]


 29%|██▉       | 580/2000 [1:05:37<2:33:09,  6.47s/it]

EPOCH: 580, Train [Loss: 0.288, Accuracy: 0.899], Valid [Loss: 0.331, Accuracy: 0.885]


 30%|██▉       | 590/2000 [1:06:47<2:42:08,  6.90s/it]

EPOCH: 590, Train [Loss: 0.286, Accuracy: 0.900], Valid [Loss: 0.326, Accuracy: 0.886]


 30%|███       | 600/2000 [1:07:53<2:30:59,  6.47s/it]

EPOCH: 600, Train [Loss: 0.284, Accuracy: 0.901], Valid [Loss: 0.326, Accuracy: 0.887]


 30%|███       | 610/2000 [1:09:01<2:38:12,  6.83s/it]

EPOCH: 610, Train [Loss: 0.282, Accuracy: 0.902], Valid [Loss: 0.322, Accuracy: 0.886]


 31%|███       | 620/2000 [1:10:10<2:36:37,  6.81s/it]

EPOCH: 620, Train [Loss: 0.280, Accuracy: 0.903], Valid [Loss: 0.324, Accuracy: 0.887]


 32%|███▏      | 630/2000 [1:11:16<2:28:42,  6.51s/it]

EPOCH: 630, Train [Loss: 0.278, Accuracy: 0.903], Valid [Loss: 0.322, Accuracy: 0.887]


 32%|███▏      | 640/2000 [1:12:23<2:37:30,  6.95s/it]

EPOCH: 640, Train [Loss: 0.276, Accuracy: 0.903], Valid [Loss: 0.325, Accuracy: 0.886]


 32%|███▎      | 650/2000 [1:13:31<2:29:18,  6.64s/it]

EPOCH: 650, Train [Loss: 0.275, Accuracy: 0.904], Valid [Loss: 0.322, Accuracy: 0.887]


 33%|███▎      | 660/2000 [1:14:43<2:38:47,  7.11s/it]

EPOCH: 660, Train [Loss: 0.272, Accuracy: 0.905], Valid [Loss: 0.322, Accuracy: 0.888]


 34%|███▎      | 670/2000 [1:15:50<2:26:22,  6.60s/it]

EPOCH: 670, Train [Loss: 0.270, Accuracy: 0.906], Valid [Loss: 0.321, Accuracy: 0.887]


 34%|███▍      | 680/2000 [1:16:59<2:31:49,  6.90s/it]

EPOCH: 680, Train [Loss: 0.268, Accuracy: 0.906], Valid [Loss: 0.321, Accuracy: 0.889]


 34%|███▍      | 690/2000 [1:18:08<2:33:21,  7.02s/it]

EPOCH: 690, Train [Loss: 0.267, Accuracy: 0.906], Valid [Loss: 0.322, Accuracy: 0.887]


 35%|███▌      | 700/2000 [1:19:13<2:21:49,  6.55s/it]

EPOCH: 700, Train [Loss: 0.265, Accuracy: 0.907], Valid [Loss: 0.319, Accuracy: 0.888]


 36%|███▌      | 710/2000 [1:20:20<2:28:39,  6.91s/it]

EPOCH: 710, Train [Loss: 0.263, Accuracy: 0.908], Valid [Loss: 0.320, Accuracy: 0.887]


 36%|███▌      | 720/2000 [1:21:28<2:16:20,  6.39s/it]

EPOCH: 720, Train [Loss: 0.261, Accuracy: 0.909], Valid [Loss: 0.320, Accuracy: 0.889]


 36%|███▋      | 730/2000 [1:22:35<2:19:39,  6.60s/it]

EPOCH: 730, Train [Loss: 0.259, Accuracy: 0.909], Valid [Loss: 0.318, Accuracy: 0.888]


 37%|███▋      | 740/2000 [1:23:43<2:22:05,  6.77s/it]

EPOCH: 740, Train [Loss: 0.258, Accuracy: 0.910], Valid [Loss: 0.320, Accuracy: 0.888]


 38%|███▊      | 750/2000 [1:24:52<2:21:25,  6.79s/it]

EPOCH: 750, Train [Loss: 0.256, Accuracy: 0.911], Valid [Loss: 0.315, Accuracy: 0.889]


 38%|███▊      | 760/2000 [1:25:59<2:18:49,  6.72s/it]

EPOCH: 760, Train [Loss: 0.254, Accuracy: 0.911], Valid [Loss: 0.313, Accuracy: 0.890]


 38%|███▊      | 770/2000 [1:27:05<2:15:07,  6.59s/it]

EPOCH: 770, Train [Loss: 0.252, Accuracy: 0.912], Valid [Loss: 0.312, Accuracy: 0.890]


 39%|███▉      | 780/2000 [1:28:15<2:25:07,  7.14s/it]

EPOCH: 780, Train [Loss: 0.251, Accuracy: 0.912], Valid [Loss: 0.318, Accuracy: 0.889]


 40%|███▉      | 790/2000 [1:29:20<2:13:25,  6.62s/it]

EPOCH: 790, Train [Loss: 0.249, Accuracy: 0.913], Valid [Loss: 0.313, Accuracy: 0.888]


 40%|████      | 800/2000 [1:30:28<2:17:49,  6.89s/it]

EPOCH: 800, Train [Loss: 0.247, Accuracy: 0.913], Valid [Loss: 0.316, Accuracy: 0.890]


 40%|████      | 810/2000 [1:31:34<2:11:30,  6.63s/it]

EPOCH: 810, Train [Loss: 0.245, Accuracy: 0.914], Valid [Loss: 0.314, Accuracy: 0.889]


 41%|████      | 820/2000 [1:32:44<2:13:10,  6.77s/it]

EPOCH: 820, Train [Loss: 0.244, Accuracy: 0.915], Valid [Loss: 0.313, Accuracy: 0.889]


 42%|████▏     | 830/2000 [1:33:51<2:11:16,  6.73s/it]

EPOCH: 830, Train [Loss: 0.242, Accuracy: 0.916], Valid [Loss: 0.313, Accuracy: 0.890]


 42%|████▏     | 840/2000 [1:34:58<2:06:27,  6.54s/it]

EPOCH: 840, Train [Loss: 0.240, Accuracy: 0.916], Valid [Loss: 0.313, Accuracy: 0.891]


 42%|████▎     | 850/2000 [1:36:08<2:15:01,  7.04s/it]

EPOCH: 850, Train [Loss: 0.238, Accuracy: 0.917], Valid [Loss: 0.311, Accuracy: 0.890]


 43%|████▎     | 860/2000 [1:37:13<2:01:59,  6.42s/it]

EPOCH: 860, Train [Loss: 0.237, Accuracy: 0.918], Valid [Loss: 0.312, Accuracy: 0.891]


 44%|████▎     | 870/2000 [1:38:21<2:08:58,  6.85s/it]

EPOCH: 870, Train [Loss: 0.235, Accuracy: 0.918], Valid [Loss: 0.314, Accuracy: 0.892]


 44%|████▍     | 880/2000 [1:39:30<2:15:04,  7.24s/it]

EPOCH: 880, Train [Loss: 0.234, Accuracy: 0.919], Valid [Loss: 0.309, Accuracy: 0.891]


 44%|████▍     | 890/2000 [1:40:36<2:01:44,  6.58s/it]

EPOCH: 890, Train [Loss: 0.232, Accuracy: 0.919], Valid [Loss: 0.313, Accuracy: 0.890]


 45%|████▌     | 900/2000 [1:41:44<2:08:42,  7.02s/it]

EPOCH: 900, Train [Loss: 0.230, Accuracy: 0.921], Valid [Loss: 0.317, Accuracy: 0.890]


 46%|████▌     | 910/2000 [1:42:52<2:02:52,  6.76s/it]

EPOCH: 910, Train [Loss: 0.228, Accuracy: 0.921], Valid [Loss: 0.308, Accuracy: 0.890]


 46%|████▌     | 920/2000 [1:44:03<2:03:11,  6.84s/it]

EPOCH: 920, Train [Loss: 0.227, Accuracy: 0.922], Valid [Loss: 0.309, Accuracy: 0.891]


 46%|████▋     | 930/2000 [1:45:10<2:04:08,  6.96s/it]

EPOCH: 930, Train [Loss: 0.225, Accuracy: 0.922], Valid [Loss: 0.308, Accuracy: 0.891]


 47%|████▋     | 940/2000 [1:46:17<1:56:51,  6.61s/it]

EPOCH: 940, Train [Loss: 0.223, Accuracy: 0.923], Valid [Loss: 0.311, Accuracy: 0.891]


 48%|████▊     | 950/2000 [1:47:28<2:05:50,  7.19s/it]

EPOCH: 950, Train [Loss: 0.222, Accuracy: 0.924], Valid [Loss: 0.308, Accuracy: 0.892]


 48%|████▊     | 960/2000 [1:48:35<1:56:22,  6.71s/it]

EPOCH: 960, Train [Loss: 0.220, Accuracy: 0.924], Valid [Loss: 0.310, Accuracy: 0.891]


 48%|████▊     | 970/2000 [1:49:42<1:53:24,  6.61s/it]

EPOCH: 970, Train [Loss: 0.219, Accuracy: 0.925], Valid [Loss: 0.308, Accuracy: 0.891]


 49%|████▉     | 980/2000 [1:50:50<1:56:57,  6.88s/it]

EPOCH: 980, Train [Loss: 0.217, Accuracy: 0.925], Valid [Loss: 0.313, Accuracy: 0.890]


 50%|████▉     | 990/2000 [1:51:58<1:49:49,  6.52s/it]

EPOCH: 990, Train [Loss: 0.216, Accuracy: 0.926], Valid [Loss: 0.308, Accuracy: 0.891]


 50%|█████     | 1000/2000 [1:53:06<1:52:41,  6.76s/it]

EPOCH: 1000, Train [Loss: 0.214, Accuracy: 0.926], Valid [Loss: 0.307, Accuracy: 0.892]


 50%|█████     | 1010/2000 [1:54:12<1:49:39,  6.65s/it]

EPOCH: 1010, Train [Loss: 0.212, Accuracy: 0.927], Valid [Loss: 0.310, Accuracy: 0.892]


 51%|█████     | 1020/2000 [1:55:21<1:49:42,  6.72s/it]

EPOCH: 1020, Train [Loss: 0.210, Accuracy: 0.928], Valid [Loss: 0.319, Accuracy: 0.891]


 52%|█████▏    | 1030/2000 [1:56:29<1:50:52,  6.86s/it]

EPOCH: 1030, Train [Loss: 0.209, Accuracy: 0.928], Valid [Loss: 0.307, Accuracy: 0.892]


 52%|█████▏    | 1040/2000 [1:57:35<1:44:10,  6.51s/it]

EPOCH: 1040, Train [Loss: 0.207, Accuracy: 0.929], Valid [Loss: 0.310, Accuracy: 0.891]


 52%|█████▎    | 1050/2000 [1:58:46<1:57:07,  7.40s/it]

EPOCH: 1050, Train [Loss: 0.206, Accuracy: 0.929], Valid [Loss: 0.305, Accuracy: 0.892]


 53%|█████▎    | 1060/2000 [1:59:54<1:48:07,  6.90s/it]

EPOCH: 1060, Train [Loss: 0.204, Accuracy: 0.930], Valid [Loss: 0.306, Accuracy: 0.892]


 54%|█████▎    | 1070/2000 [2:01:01<1:43:33,  6.68s/it]

EPOCH: 1070, Train [Loss: 0.203, Accuracy: 0.930], Valid [Loss: 0.306, Accuracy: 0.893]


 54%|█████▍    | 1080/2000 [2:02:08<1:41:11,  6.60s/it]

EPOCH: 1080, Train [Loss: 0.201, Accuracy: 0.931], Valid [Loss: 0.309, Accuracy: 0.894]


 55%|█████▍    | 1090/2000 [2:03:19<1:45:38,  6.97s/it]

EPOCH: 1090, Train [Loss: 0.199, Accuracy: 0.932], Valid [Loss: 0.308, Accuracy: 0.893]


 55%|█████▌    | 1100/2000 [2:04:27<1:43:01,  6.87s/it]

EPOCH: 1100, Train [Loss: 0.198, Accuracy: 0.933], Valid [Loss: 0.307, Accuracy: 0.891]


 56%|█████▌    | 1110/2000 [2:05:34<1:37:54,  6.60s/it]

EPOCH: 1110, Train [Loss: 0.196, Accuracy: 0.934], Valid [Loss: 0.310, Accuracy: 0.892]


 56%|█████▌    | 1120/2000 [2:06:44<1:47:20,  7.32s/it]

EPOCH: 1120, Train [Loss: 0.195, Accuracy: 0.934], Valid [Loss: 0.313, Accuracy: 0.892]


 56%|█████▋    | 1130/2000 [2:07:51<1:37:48,  6.75s/it]

EPOCH: 1130, Train [Loss: 0.193, Accuracy: 0.934], Valid [Loss: 0.310, Accuracy: 0.892]


 57%|█████▋    | 1140/2000 [2:08:59<1:34:33,  6.60s/it]

EPOCH: 1140, Train [Loss: 0.192, Accuracy: 0.935], Valid [Loss: 0.313, Accuracy: 0.893]


 57%|█████▊    | 1150/2000 [2:10:07<1:39:12,  7.00s/it]

EPOCH: 1150, Train [Loss: 0.190, Accuracy: 0.935], Valid [Loss: 0.308, Accuracy: 0.891]


 58%|█████▊    | 1160/2000 [2:11:16<1:34:03,  6.72s/it]

EPOCH: 1160, Train [Loss: 0.189, Accuracy: 0.937], Valid [Loss: 0.307, Accuracy: 0.892]


 58%|█████▊    | 1170/2000 [2:12:23<1:32:00,  6.65s/it]

EPOCH: 1170, Train [Loss: 0.187, Accuracy: 0.937], Valid [Loss: 0.306, Accuracy: 0.890]


 59%|█████▉    | 1180/2000 [2:13:32<1:36:25,  7.05s/it]

EPOCH: 1180, Train [Loss: 0.186, Accuracy: 0.938], Valid [Loss: 0.306, Accuracy: 0.893]


 60%|█████▉    | 1190/2000 [2:14:41<1:35:41,  7.09s/it]

EPOCH: 1190, Train [Loss: 0.184, Accuracy: 0.938], Valid [Loss: 0.309, Accuracy: 0.892]


 60%|██████    | 1200/2000 [2:15:51<1:31:05,  6.83s/it]

EPOCH: 1200, Train [Loss: 0.183, Accuracy: 0.939], Valid [Loss: 0.309, Accuracy: 0.892]


 60%|██████    | 1210/2000 [2:16:59<1:33:29,  7.10s/it]

EPOCH: 1210, Train [Loss: 0.181, Accuracy: 0.939], Valid [Loss: 0.304, Accuracy: 0.893]


 61%|██████    | 1220/2000 [2:18:06<1:25:55,  6.61s/it]

EPOCH: 1220, Train [Loss: 0.180, Accuracy: 0.940], Valid [Loss: 0.314, Accuracy: 0.891]


 62%|██████▏   | 1230/2000 [2:19:16<1:30:22,  7.04s/it]

EPOCH: 1230, Train [Loss: 0.178, Accuracy: 0.941], Valid [Loss: 0.307, Accuracy: 0.893]


 62%|██████▏   | 1240/2000 [2:20:22<1:22:10,  6.49s/it]

EPOCH: 1240, Train [Loss: 0.177, Accuracy: 0.941], Valid [Loss: 0.310, Accuracy: 0.892]


 62%|██████▎   | 1250/2000 [2:21:31<1:23:39,  6.69s/it]

EPOCH: 1250, Train [Loss: 0.175, Accuracy: 0.942], Valid [Loss: 0.308, Accuracy: 0.893]


 63%|██████▎   | 1260/2000 [2:22:41<1:26:52,  7.04s/it]

EPOCH: 1260, Train [Loss: 0.174, Accuracy: 0.943], Valid [Loss: 0.304, Accuracy: 0.892]


 64%|██████▎   | 1270/2000 [2:23:50<1:21:44,  6.72s/it]

EPOCH: 1270, Train [Loss: 0.172, Accuracy: 0.943], Valid [Loss: 0.310, Accuracy: 0.893]


 64%|██████▍   | 1280/2000 [2:24:59<1:25:26,  7.12s/it]

EPOCH: 1280, Train [Loss: 0.171, Accuracy: 0.944], Valid [Loss: 0.308, Accuracy: 0.892]


 64%|██████▍   | 1290/2000 [2:26:05<1:16:38,  6.48s/it]

EPOCH: 1290, Train [Loss: 0.169, Accuracy: 0.944], Valid [Loss: 0.306, Accuracy: 0.892]


 65%|██████▌   | 1300/2000 [2:27:16<1:18:38,  6.74s/it]

EPOCH: 1300, Train [Loss: 0.168, Accuracy: 0.945], Valid [Loss: 0.313, Accuracy: 0.894]


 66%|██████▌   | 1310/2000 [2:28:24<1:20:49,  7.03s/it]

EPOCH: 1310, Train [Loss: 0.166, Accuracy: 0.945], Valid [Loss: 0.306, Accuracy: 0.893]


 66%|██████▌   | 1320/2000 [2:29:30<1:15:00,  6.62s/it]

EPOCH: 1320, Train [Loss: 0.165, Accuracy: 0.946], Valid [Loss: 0.312, Accuracy: 0.892]


 66%|██████▋   | 1330/2000 [2:30:41<1:22:34,  7.39s/it]

EPOCH: 1330, Train [Loss: 0.163, Accuracy: 0.946], Valid [Loss: 0.312, Accuracy: 0.892]


 67%|██████▋   | 1340/2000 [2:31:50<1:17:30,  7.05s/it]

EPOCH: 1340, Train [Loss: 0.162, Accuracy: 0.948], Valid [Loss: 0.313, Accuracy: 0.890]


 68%|██████▊   | 1350/2000 [2:32:56<1:11:08,  6.57s/it]

EPOCH: 1350, Train [Loss: 0.160, Accuracy: 0.948], Valid [Loss: 0.311, Accuracy: 0.894]


 68%|██████▊   | 1360/2000 [2:34:05<1:14:20,  6.97s/it]

EPOCH: 1360, Train [Loss: 0.159, Accuracy: 0.949], Valid [Loss: 0.314, Accuracy: 0.892]


 68%|██████▊   | 1370/2000 [2:35:17<1:16:37,  7.30s/it]

EPOCH: 1370, Train [Loss: 0.157, Accuracy: 0.949], Valid [Loss: 0.308, Accuracy: 0.893]


 69%|██████▉   | 1380/2000 [2:36:24<1:08:35,  6.64s/it]

EPOCH: 1380, Train [Loss: 0.156, Accuracy: 0.949], Valid [Loss: 0.311, Accuracy: 0.891]


 70%|██████▉   | 1390/2000 [2:37:32<1:10:03,  6.89s/it]

EPOCH: 1390, Train [Loss: 0.155, Accuracy: 0.950], Valid [Loss: 0.312, Accuracy: 0.893]


 70%|███████   | 1400/2000 [2:38:43<1:14:56,  7.49s/it]

EPOCH: 1400, Train [Loss: 0.153, Accuracy: 0.951], Valid [Loss: 0.319, Accuracy: 0.893]


 70%|███████   | 1410/2000 [2:39:50<1:05:07,  6.62s/it]

EPOCH: 1410, Train [Loss: 0.152, Accuracy: 0.951], Valid [Loss: 0.322, Accuracy: 0.892]


 71%|███████   | 1420/2000 [2:40:58<1:07:11,  6.95s/it]

EPOCH: 1420, Train [Loss: 0.151, Accuracy: 0.951], Valid [Loss: 0.318, Accuracy: 0.894]


 72%|███████▏  | 1430/2000 [2:42:05<1:03:04,  6.64s/it]

EPOCH: 1430, Train [Loss: 0.149, Accuracy: 0.952], Valid [Loss: 0.316, Accuracy: 0.893]


 72%|███████▏  | 1440/2000 [2:43:17<1:05:07,  6.98s/it]

EPOCH: 1440, Train [Loss: 0.148, Accuracy: 0.953], Valid [Loss: 0.309, Accuracy: 0.894]


 72%|███████▎  | 1450/2000 [2:44:26<1:02:49,  6.85s/it]

EPOCH: 1450, Train [Loss: 0.146, Accuracy: 0.953], Valid [Loss: 0.311, Accuracy: 0.893]


 73%|███████▎  | 1460/2000 [2:45:33<1:00:41,  6.74s/it]

EPOCH: 1460, Train [Loss: 0.145, Accuracy: 0.954], Valid [Loss: 0.309, Accuracy: 0.893]


 74%|███████▎  | 1470/2000 [2:46:39<57:54,  6.56s/it]  

EPOCH: 1470, Train [Loss: 0.144, Accuracy: 0.954], Valid [Loss: 0.312, Accuracy: 0.894]


 74%|███████▍  | 1480/2000 [2:47:50<1:01:07,  7.05s/it]

EPOCH: 1480, Train [Loss: 0.142, Accuracy: 0.955], Valid [Loss: 0.314, Accuracy: 0.894]


 74%|███████▍  | 1490/2000 [2:48:56<54:43,  6.44s/it]

EPOCH: 1490, Train [Loss: 0.141, Accuracy: 0.956], Valid [Loss: 0.314, Accuracy: 0.893]


 75%|███████▌  | 1500/2000 [2:50:04<55:29,  6.66s/it]

EPOCH: 1500, Train [Loss: 0.139, Accuracy: 0.957], Valid [Loss: 0.310, Accuracy: 0.895]


 76%|███████▌  | 1510/2000 [2:51:14<1:03:16,  7.75s/it]

EPOCH: 1510, Train [Loss: 0.138, Accuracy: 0.957], Valid [Loss: 0.323, Accuracy: 0.892]


 76%|███████▌  | 1520/2000 [2:52:20<52:28,  6.56s/it]

EPOCH: 1520, Train [Loss: 0.137, Accuracy: 0.957], Valid [Loss: 0.317, Accuracy: 0.894]


 76%|███████▋  | 1530/2000 [2:53:29<54:23,  6.94s/it]

EPOCH: 1530, Train [Loss: 0.136, Accuracy: 0.958], Valid [Loss: 0.312, Accuracy: 0.895]


 77%|███████▋  | 1540/2000 [2:54:35<49:32,  6.46s/it]

EPOCH: 1540, Train [Loss: 0.134, Accuracy: 0.959], Valid [Loss: 0.323, Accuracy: 0.891]


 78%|███████▊  | 1550/2000 [2:55:45<51:52,  6.92s/it]

EPOCH: 1550, Train [Loss: 0.133, Accuracy: 0.959], Valid [Loss: 0.312, Accuracy: 0.895]


 78%|███████▊  | 1560/2000 [2:56:53<49:47,  6.79s/it]

EPOCH: 1560, Train [Loss: 0.132, Accuracy: 0.960], Valid [Loss: 0.312, Accuracy: 0.894]


 78%|███████▊  | 1570/2000 [2:58:00<47:14,  6.59s/it]

EPOCH: 1570, Train [Loss: 0.130, Accuracy: 0.960], Valid [Loss: 0.317, Accuracy: 0.894]


 79%|███████▉  | 1580/2000 [2:59:08<48:36,  6.94s/it]

EPOCH: 1580, Train [Loss: 0.129, Accuracy: 0.961], Valid [Loss: 0.316, Accuracy: 0.894]


 80%|███████▉  | 1590/2000 [3:00:16<45:14,  6.62s/it]

EPOCH: 1590, Train [Loss: 0.128, Accuracy: 0.961], Valid [Loss: 0.316, Accuracy: 0.895]


 80%|████████  | 1600/2000 [3:01:24<45:43,  6.86s/it]

EPOCH: 1600, Train [Loss: 0.126, Accuracy: 0.962], Valid [Loss: 0.317, Accuracy: 0.892]


 80%|████████  | 1610/2000 [3:02:31<42:59,  6.61s/it]

EPOCH: 1610, Train [Loss: 0.125, Accuracy: 0.962], Valid [Loss: 0.317, Accuracy: 0.893]


 81%|████████  | 1620/2000 [3:03:39<42:34,  6.72s/it]

EPOCH: 1620, Train [Loss: 0.124, Accuracy: 0.963], Valid [Loss: 0.316, Accuracy: 0.895]


 82%|████████▏ | 1630/2000 [3:04:48<41:18,  6.70s/it]

EPOCH: 1630, Train [Loss: 0.122, Accuracy: 0.963], Valid [Loss: 0.318, Accuracy: 0.893]


 82%|████████▏ | 1640/2000 [3:05:55<39:52,  6.65s/it]

EPOCH: 1640, Train [Loss: 0.121, Accuracy: 0.964], Valid [Loss: 0.318, Accuracy: 0.892]


 82%|████████▎ | 1650/2000 [3:07:03<40:40,  6.97s/it]

EPOCH: 1650, Train [Loss: 0.120, Accuracy: 0.964], Valid [Loss: 0.317, Accuracy: 0.894]


 83%|████████▎ | 1660/2000 [3:08:14<43:17,  7.64s/it]

EPOCH: 1660, Train [Loss: 0.119, Accuracy: 0.964], Valid [Loss: 0.321, Accuracy: 0.894]


 84%|████████▎ | 1670/2000 [3:09:21<36:21,  6.61s/it]

EPOCH: 1670, Train [Loss: 0.118, Accuracy: 0.965], Valid [Loss: 0.328, Accuracy: 0.894]


 84%|████████▍ | 1680/2000 [3:10:29<37:22,  7.01s/it]

EPOCH: 1680, Train [Loss: 0.116, Accuracy: 0.966], Valid [Loss: 0.327, Accuracy: 0.895]


 84%|████████▍ | 1690/2000 [3:11:36<33:33,  6.50s/it]

EPOCH: 1690, Train [Loss: 0.115, Accuracy: 0.966], Valid [Loss: 0.325, Accuracy: 0.896]


 85%|████████▌ | 1700/2000 [3:12:46<35:08,  7.03s/it]

EPOCH: 1700, Train [Loss: 0.114, Accuracy: 0.967], Valid [Loss: 0.320, Accuracy: 0.893]


 86%|████████▌ | 1710/2000 [3:13:54<33:53,  7.01s/it]

EPOCH: 1710, Train [Loss: 0.113, Accuracy: 0.967], Valid [Loss: 0.322, Accuracy: 0.894]


 86%|████████▌ | 1720/2000 [3:15:00<30:48,  6.60s/it]

EPOCH: 1720, Train [Loss: 0.111, Accuracy: 0.968], Valid [Loss: 0.330, Accuracy: 0.895]


 86%|████████▋ | 1730/2000 [3:16:09<31:47,  7.07s/it]

EPOCH: 1730, Train [Loss: 0.110, Accuracy: 0.967], Valid [Loss: 0.323, Accuracy: 0.895]


 87%|████████▋ | 1740/2000 [3:17:20<31:06,  7.18s/it]

EPOCH: 1740, Train [Loss: 0.109, Accuracy: 0.969], Valid [Loss: 0.323, Accuracy: 0.895]


 88%|████████▊ | 1750/2000 [3:18:27<27:45,  6.66s/it]

EPOCH: 1750, Train [Loss: 0.108, Accuracy: 0.969], Valid [Loss: 0.323, Accuracy: 0.895]


 88%|████████▊ | 1760/2000 [3:19:35<27:53,  6.97s/it]

EPOCH: 1760, Train [Loss: 0.107, Accuracy: 0.969], Valid [Loss: 0.324, Accuracy: 0.895]


 88%|████████▊ | 1770/2000 [3:20:42<25:00,  6.52s/it]

EPOCH: 1770, Train [Loss: 0.106, Accuracy: 0.970], Valid [Loss: 0.325, Accuracy: 0.894]


 89%|████████▉ | 1780/2000 [3:21:53<24:49,  6.77s/it]

EPOCH: 1780, Train [Loss: 0.105, Accuracy: 0.970], Valid [Loss: 0.337, Accuracy: 0.895]


 90%|████████▉ | 1790/2000 [3:23:01<24:19,  6.95s/it]

EPOCH: 1790, Train [Loss: 0.103, Accuracy: 0.971], Valid [Loss: 0.329, Accuracy: 0.896]


 90%|█████████ | 1800/2000 [3:24:08<22:14,  6.67s/it]

EPOCH: 1800, Train [Loss: 0.102, Accuracy: 0.972], Valid [Loss: 0.338, Accuracy: 0.893]


 90%|█████████ | 1810/2000 [3:25:17<22:11,  7.01s/it]

EPOCH: 1810, Train [Loss: 0.101, Accuracy: 0.972], Valid [Loss: 0.327, Accuracy: 0.894]


 91%|█████████ | 1820/2000 [3:26:28<21:17,  7.09s/it]

EPOCH: 1820, Train [Loss: 0.100, Accuracy: 0.972], Valid [Loss: 0.344, Accuracy: 0.892]


 92%|█████████▏| 1830/2000 [3:27:35<19:01,  6.71s/it]

EPOCH: 1830, Train [Loss: 0.099, Accuracy: 0.973], Valid [Loss: 0.336, Accuracy: 0.894]


 92%|█████████▏| 1840/2000 [3:28:44<18:10,  6.81s/it]

EPOCH: 1840, Train [Loss: 0.098, Accuracy: 0.973], Valid [Loss: 0.335, Accuracy: 0.895]


 92%|█████████▎| 1850/2000 [3:29:55<18:57,  7.58s/it]

EPOCH: 1850, Train [Loss: 0.097, Accuracy: 0.974], Valid [Loss: 0.340, Accuracy: 0.894]


 93%|█████████▎| 1860/2000 [3:31:01<15:14,  6.54s/it]

EPOCH: 1860, Train [Loss: 0.096, Accuracy: 0.975], Valid [Loss: 0.334, Accuracy: 0.895]


 94%|█████████▎| 1870/2000 [3:32:09<14:35,  6.73s/it]

EPOCH: 1870, Train [Loss: 0.095, Accuracy: 0.975], Valid [Loss: 0.334, Accuracy: 0.894]


 94%|█████████▍| 1880/2000 [3:33:15<12:57,  6.48s/it]

EPOCH: 1880, Train [Loss: 0.093, Accuracy: 0.975], Valid [Loss: 0.351, Accuracy: 0.891]


 94%|█████████▍| 1890/2000 [3:34:25<12:50,  7.00s/it]

EPOCH: 1890, Train [Loss: 0.093, Accuracy: 0.975], Valid [Loss: 0.333, Accuracy: 0.894]


 95%|█████████▌| 1900/2000 [3:35:33<11:33,  6.93s/it]

EPOCH: 1900, Train [Loss: 0.092, Accuracy: 0.976], Valid [Loss: 0.333, Accuracy: 0.894]


 96%|█████████▌| 1910/2000 [3:36:39<09:38,  6.43s/it]

EPOCH: 1910, Train [Loss: 0.090, Accuracy: 0.976], Valid [Loss: 0.333, Accuracy: 0.896]


 96%|█████████▌| 1920/2000 [3:37:48<09:01,  6.77s/it]

EPOCH: 1920, Train [Loss: 0.089, Accuracy: 0.977], Valid [Loss: 0.342, Accuracy: 0.893]


 96%|█████████▋| 1930/2000 [3:38:59<08:26,  7.23s/it]

EPOCH: 1930, Train [Loss: 0.089, Accuracy: 0.977], Valid [Loss: 0.332, Accuracy: 0.894]


 97%|█████████▋| 1940/2000 [3:40:06<06:41,  6.69s/it]

EPOCH: 1940, Train [Loss: 0.087, Accuracy: 0.978], Valid [Loss: 0.342, Accuracy: 0.896]


 98%|█████████▊| 1950/2000 [3:41:14<05:45,  6.91s/it]

EPOCH: 1950, Train [Loss: 0.086, Accuracy: 0.979], Valid [Loss: 0.340, Accuracy: 0.894]


 98%|█████████▊| 1960/2000 [3:42:21<04:22,  6.57s/it]

EPOCH: 1960, Train [Loss: 0.085, Accuracy: 0.978], Valid [Loss: 0.337, Accuracy: 0.894]


 98%|█████████▊| 1970/2000 [3:43:32<03:29,  6.98s/it]

EPOCH: 1970, Train [Loss: 0.084, Accuracy: 0.979], Valid [Loss: 0.341, Accuracy: 0.895]


 99%|█████████▉| 1980/2000 [3:44:41<02:21,  7.07s/it]

EPOCH: 1980, Train [Loss: 0.083, Accuracy: 0.979], Valid [Loss: 0.346, Accuracy: 0.895]


100%|█████████▉| 1990/2000 [3:45:47<01:05,  6.58s/it]

EPOCH: 1990, Train [Loss: 0.083, Accuracy: 0.979], Valid [Loss: 0.339, Accuracy: 0.896]


100%|██████████| 2000/2000 [3:46:55<00:00,  6.81s/it]

EPOCH: 2000, Train [Loss: 0.081, Accuracy: 0.980], Valid [Loss: 0.345, Accuracy: 0.895]
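
In the portion of the log shown above, the training loss keeps falling (0.155 at epoch 1390, 0.081 at epoch 2000) while the validation loss drifts upward (0.312 to 0.345) and validation accuracy stays around 0.89. That pattern suggests the later epochs may not be helping generalization. The following is a minimal sketch of how one might track the best validation epoch and decide when to stop. The helper names `best_epoch` and `should_stop` and the `patience` parameter are assumptions for illustration, not part of the original notebook, and the history holds only a few values copied from the log.

```python
def best_epoch(history):
    """Return the (epoch, valid_loss) pair with the lowest validation loss.

    On ties, the earliest entry in `history` is returned.
    """
    return min(history, key=lambda pair: pair[1])


def should_stop(history, patience):
    """Return True if the last `patience` entries did not improve on the
    best validation loss recorded before them (hypothetical stopping rule)."""
    if len(history) <= patience:
        return False
    best_before = min(loss for _, loss in history[:-patience])
    recent_best = min(loss for _, loss in history[-patience:])
    return recent_best >= best_before


# A few (epoch, valid_loss) pairs copied from the log above.
valid_history = [
    (1390, 0.312),
    (1440, 0.309),
    (1500, 0.310),
    (1700, 0.320),
    (1800, 0.338),
    (2000, 0.345),
]

print(best_epoch(valid_history))
print(should_stop(valid_history, patience=3))
```

In a training loop, one would typically also keep a copy of the parameters from the best epoch so that they can be restored before predicting on the test data.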

In [None]:
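t_pred = []
for x in x_test:
    # Forward pass (add a batch dimension so the input has shape (1, n_features))
    x = x[np.newaxis, :]
    y = mlp(x)

    # Convert the model output to a scalar class label (0-9)
    pred = y.argmax(1).tolist()

    t_pred.extend(pred)

# Save the predicted labels in the submission format (columns: id, label)
submission = pd.Series(t_pred, name='label')
submission.to_csv('drive/MyDrive/Colab Notebooks/DeepLearning/Lecture03/submission_pred.csv', header=True, index_label='id')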
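
The cell above runs the forward pass one sample at a time. As a rough sketch, the same prediction can be done in mini-batches, which usually cuts down on Python loop overhead. The function `predict_in_batches`, the `batch_size` value, and the stand-in model `toy_mlp` are assumptions for illustration. The sketch also assumes the model accepts inputs of shape `(batch, n_features)` and behaves deterministically at inference time, which the cell above does not confirm for `mlp`.

```python
import numpy as np


def predict_in_batches(model, x, batch_size=1000):
    """Apply `model` to `x` in chunks and return class labels as a 1-D array."""
    preds = []
    for start in range(0, len(x), batch_size):
        y = model(x[start:start + batch_size])  # shape: (batch, n_classes)
        preds.append(y.argmax(axis=1))
    return np.concatenate(preds)


# Hypothetical stand-in for the trained `mlp`: a fixed random linear layer.
rng = np.random.default_rng(0)
W = rng.normal(size=(784, 10))


def toy_mlp(x):
    return x @ W


x_demo = rng.random((2500, 784))  # fake inputs with the flattened 28x28 shape
labels = predict_in_batches(toy_mlp, x_demo, batch_size=1000)
print(labels.shape, labels.min(), labels.max())
```

With the real model, the call would look like `t_pred = predict_in_batches(mlp, x_test)`, and the resulting array can be passed to `pd.Series` as before.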