The first part of this IPython notebook covers the convolutional layer of a neural network, possible ways to implement it, and special kinds of convolutional layers. 

The second part gives an example of using the TensorBoard utility to visualize a network architecture and to monitor various values during training and testing (loss, accuracy, etc.).

# Convolutional layer (2D convolution)

## Layer description

* Input: 
    an array of size $C~\times~W_{in}~\times~H_{in}$, where $C$ is the number of channels, and $W_{in}$ and $H_{in}$ are the width and height of the input feature maps, respectively;


* Layer parameters:
    * $K$ - kernel size (for simplicity we consider only square kernels, although rectangular ones are also used in practice, for example in this [paper](https://arxiv.org/pdf/1512.00567.pdf "Rethinking the Inception Architecture for Computer Vision"));
    * $D$ - number of filters;
    * $S$ - stride (to keep the exposition simple we assume $S_x=S_y=S$, although in practice different strides may be used along different axes);
    * $P$ - padding;


* Layer output: 
    an array of size $D~\times~W_{out}~\times~H_{out}$, where $D$ is the number of filters in the layer, and $W_{out}=\frac{W_{in}-K+2P}{S}+1$ and $H_{out}=\frac{H_{in}-K+2P}{S}+1$ are the width and height of the output feature maps, respectively;


* Number of parameters in the layer: 
    $D*(K*K*C)+D$, where the first term is the number of weights in the $D$ filters, each of which processes all input channels, and the second term is the number of biases, one per filter, each added to its own output channel.
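
The two formulas above can be checked with a few lines of Python. This is a minimal sketch; the helper names `conv_output_size` and `conv_param_count` are hypothetical and introduced only for illustration.

```python
def conv_output_size(size_in, kernel, stride=1, padding=0):
    """Spatial output size along one axis: (size_in - K + 2P) / S + 1."""
    span = size_in - kernel + 2 * padding
    if span % stride != 0:
        raise ValueError("kernel does not tile the padded input evenly")
    return span // stride + 1


def conv_param_count(channels_in, kernel, filters):
    """Number of trainable parameters: D * (K * K * C) weights plus D biases."""
    return filters * (kernel * kernel * channels_in) + filters


# Example: 3 x 224 x 224 input, 8 filters of size 7 x 7, stride 3, padding 1
out_w = conv_output_size(224, kernel=7, stride=3, padding=1)
n_params = conv_param_count(channels_in=3, kernel=7, filters=8)
print(out_w, n_params)
```

The sketch raises an error when the stride does not divide the padded span evenly, mirroring the assertions used in the notebook's own convolution code.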
    
**Note:** strictly speaking, the operation considered in this notebook is called cross-correlation, not convolution. The cross-correlation $X \star W$ gives the same result as the convolution $X \ast W_{flipped}$, where $W_{flipped}$ is the kernel $W$ flipped along both spatial axes. Because the filter weights are learned automatically during training, the network can learn either a kernel or its flipped version equally well, so the choice between the two operations makes no difference.

<img src="pictures/conv-corr.png">
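
The relation between the two operations can be illustrated numerically. The sketch below is not part of the original notebook; `cross_correlate2d` and `convolve2d` are hypothetical helper names, both written directly from the definitions for a single channel without padding.

```python
import numpy as np


def cross_correlate2d(x, w):
    """'Valid' 2D cross-correlation: slide w over x without flipping it."""
    kh, kw = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out


def convolve2d(x, w):
    """'Valid' 2D convolution from its definition (kernel indices run backwards)."""
    kh, kw = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            for a in range(kh):
                for b in range(kw):
                    out[i, j] += x[i + kh - 1 - a, j + kw - 1 - b] * w[a, b]
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5))
w = rng.standard_normal((3, 3))  # a random, asymmetric kernel

# The two operations differ for an asymmetric kernel...
same_without_flip = np.allclose(cross_correlate2d(x, w), convolve2d(x, w))
# ...and coincide once the kernel is flipped along both axes.
same_with_flip = np.allclose(cross_correlate2d(x, w[::-1, ::-1]), convolve2d(x, w))
print(same_without_flip, same_with_flip)
```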

The table below shows several examples of how convolutions operate; the input is shown in blue and the convolution output in green. For clarity, $C = D = 1$.

<table style="width:100%">
    <tr>
        <td style="text-align:center">$K=3,~S=1,~P=0$</td>
        <td style="text-align:center">$K=3,~S=2,~P=0$</td>
        <td style="text-align:center">$K=3,~S=1,~P=2$</td>
        <td style="text-align:center">$K=3,~S=2,~P=1$</td>
    </tr>
    <tr>
        <td><img src="pictures/P0S1.gif"></td>
        <td><img src="pictures/P0S2.gif"></td>
        <td><img src="pictures/P2S1.gif"></td>
        <td><img src="pictures/P1S2.gif"></td>
    </tr>
</table>


## Implementing a convolutional layer via matrix multiplication

### Simple implementation with nested loops

The simplest way to implement convolution is to multiply each patch extracted from the image (across all channels) elementwise by the filter and sum the products, inside two nested loops (over height and width).


### Implementation via a transformed weight matrix

A faster way to implement convolution is to transform the weight matrix so that the convolution result is expressed as $dot(W_{transformed}, X_{col})$, where $W_{transformed}$ is the transformed weight matrix and $X_{col}$ is, in the general case, a three-dimensional tensor in which each channel corresponds to a channel of the original data stretched into a column (left to right, top to bottom). For example, for the first convolution in the table above, the original and transformed matrices look as follows:


$$W=\begin{bmatrix}
    w_{00}&w_{01}&w_{02} \\
    w_{10}&w_{11}&w_{12} \\
    w_{20}&w_{21}&w_{22} \\
\end{bmatrix}~~X=\begin{bmatrix}
    x_{00}&x_{01}&x_{02}&x_{03} \\
    x_{10}&x_{11}&x_{12}&x_{13} \\
    x_{20}&x_{21}&x_{22}&x_{23} \\
    x_{30}&x_{31}&x_{32}&x_{33} \\
\end{bmatrix}$$


$$W_{transformed}=\begin{bmatrix}                  
    w_{00}&w_{01}&w_{02}&0&w_{10}&w_{11}&w_{12}&0&w_{20}&w_{21}&w_{22}&0&0&0&0&0 \\
    0&w_{00}&w_{01}&w_{02}&0&w_{10}&w_{11}&w_{12}&0&w_{20}&w_{21}&w_{22}&0&0&0&0 \\
    0&0&0&0&w_{00}&w_{01}&w_{02}&0&w_{10}&w_{11}&w_{12}&0&w_{20}&w_{21}&w_{22}&0 \\
    0&0&0&0&0&w_{00}&w_{01}&w_{02}&0&w_{10}&w_{11}&w_{12}&0&w_{20}&w_{21}&w_{22} \\
\end{bmatrix}$$

$$X_{col}=\begin{bmatrix}                  
    x_{00}&x_{01}&x_{02}&x_{03}&x_{10}&x_{11}&x_{12}&x_{13}&x_{20}&x_{21}&x_{22}&x_{23}&x_{30}&x_{31}&x_{32}&x_{33} \\
\end{bmatrix}^T$$


**It is important to note** that if the forward pass of the convolution is expressed as $dot(W_{transformed}, X_{col})$, then during backpropagation the backward pass of the convolutional layer (the gradient with respect to its input) is expressed as $dot(W_{transformed}^T, Grad_{col})$, where $Grad_{col}$ is the matrix of gradients that reached the convolutional layer during backpropagation, transformed in the same way. Thus the matrix $W_{transformed}$ defines both the forward and the backward pass of the convolution.

A significant drawback of this approach is the huge overhead of storing the matrix $W_{transformed}$: it has one row per output element and one column per input element, and most of its entries are zeros.
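
The $4~\times~4$ example above can be reproduced in NumPy. This is a minimal sketch for a single channel with $S=1$ and $P=0$; `build_w_transformed` is a hypothetical helper written for this illustration, not a function from the notebook.

```python
import numpy as np


def build_w_transformed(w, in_h, in_w):
    """Unroll a K x K kernel into a (out_h*out_w) x (in_h*in_w) matrix (S=1, P=0)."""
    k = w.shape[0]
    out_h, out_w = in_h - k + 1, in_w - k + 1
    w_t = np.zeros((out_h * out_w, in_h * in_w))
    for i in range(out_h):
        for j in range(out_w):
            row = i * out_w + j
            for a in range(k):
                for b in range(k):
                    w_t[row, (i + a) * in_w + (j + b)] = w[a, b]
    return w_t


w = np.arange(1.0, 10.0).reshape(3, 3)
x = np.arange(16.0).reshape(4, 4)

w_transformed = build_w_transformed(w, 4, 4)         # shape (4, 16)
y = (w_transformed @ x.reshape(-1)).reshape(2, 2)     # forward pass

# Reference: direct sliding-window cross-correlation
y_ref = np.array([[np.sum(x[i:i + 3, j:j + 3] * w) for j in range(2)]
                  for i in range(2)])

# Backward pass: gradient w.r.t. the input for an upstream gradient of ones
grad_out = np.ones(4)
grad_x = (w_transformed.T @ grad_out).reshape(4, 4)
print(np.allclose(y, y_ref), grad_x.shape)
```

Only 36 of the 64 entries of `w_transformed` are non-zero even in this tiny case, which hints at how wasteful the representation becomes for realistic input sizes.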

### Fast implementation via a transformation of the input data
Another way to reduce convolution to a single large matrix multiplication is to apply a special transformation to the matrices that hold the input data (the transformation is called *im2col* and is discussed in more detail, for example, in this [paper](https://arxiv.org/pdf/1410.0759.pdf)).

The figure below illustrates a convolution computed with *im2col*. The input has size $3~\times~3~\times~3$, and the layer parameters are $K=2$, $D=2$, $S=1$ and $P=0$. Below we go step by step through how the transformed matrices for the input data and for the layer weights are built.


<img src="pictures/im2col.png">


Suppose the convolutional layer receives an array of size $3~\times~224~\times~224$, and the layer parameters are $K=7$, $D=8$, $S=3$, $P=1$ ($8$ filters of size $7~\times~7$ with stride $3$ and padding of $1$ pixel). The convolution is then performed in the following three steps:
1. Transforming the input data.
    * The input is padded with $P$ pixels on each side using one of the possible strategies: filling with a constant value, reflection, replication of the border value, or others.

    * Patches of size $K~\times~K$ are cut out of the input with stride $S$, and each patch is stretched row by row into a column. Columns that correspond to the same patch in different channels are concatenated vertically, and columns that correspond to successively extracted patches are concatenated horizontally. The length of the vector for one patch across all channels is $K~\times~K~\times~3$, in our case $147$; the total number of such patches is $(\frac{W_{in}-K+2*P}{S}+1)*(\frac{H_{in}-K+2*P}{S}+1)$, in our case $74*74=5476$. As a result we obtain a matrix $X_{col}$ of size $[147~\times~5476]$.
    
    * Note that because patches may overlap, some elements are repeated in the transformed matrix.

2. Transforming the filters.
    * The filter weights are stretched into rows and concatenated in a similar way (each channel of each filter is stretched into a row, then concatenated horizontally with the other channels and vertically with the other filters).
    
    * The result is a matrix $W_{row}$ of size $[D~\times~(K*K*C)]$, in our case $[8~\times~147]$.
    
3. Convolution.

    * After the transformations, the convolution result equals the product of the two matrices: $dot(W_{row}, X_{col})$. This corresponds to multiplying the filters with every *receptive field* of the convolution.
    
    * The resulting matrix has to be reshaped to the proper size: from $[8~\times~5476]$ to $[8~\times~74~\times~74]$.
    
    
An obvious drawback of this approach is the extra memory overhead caused by the repeated fragments of the input image in the matrix $X_{col}$. However, first, this overhead is smaller than in the case of transforming the weight matrix, and second, the speed-up obtained from an efficient matrix multiplication implementation far outweighs it.
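
The three steps can be traced on the worked example with a short, loop-based sketch. It handles a single image with zero padding only; `im2col_single` is a hypothetical helper written for this illustration and is deliberately simpler (and slower) than an index-based implementation.

```python
import numpy as np


def im2col_single(x, k, stride, padding):
    """im2col for one C x H x W image; each K x K x C patch becomes one column."""
    c, h, w = x.shape
    x_pad = np.pad(x, ((0, 0), (padding, padding), (padding, padding)), mode="constant")
    out_h = (h + 2 * padding - k) // stride + 1
    out_w = (w + 2 * padding - k) // stride + 1
    cols = np.empty((c * k * k, out_h * out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x_pad[:, i * stride:i * stride + k, j * stride:j * stride + k]
            cols[:, i * out_w + j] = patch.reshape(-1)
    return cols, out_h, out_w


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 224, 224))
weights = rng.standard_normal((8, 3, 7, 7))     # D x C x K x K

# Step 1: transform the input; step 2: transform the filters; step 3: multiply
x_col, out_h, out_w = im2col_single(x, k=7, stride=3, padding=1)
w_row = weights.reshape(8, -1)
out = (w_row @ x_col).reshape(8, out_h, out_w)

# Spot-check one output value against the direct definition
x_pad = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="constant")
direct = np.sum(x_pad[:, 6:13, 9:16] * weights[5])
print(x_col.shape, w_row.shape, out.shape, np.isclose(out[5, 2, 3], direct))
```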

### Comparison

Below is an implementation of convolution via im2col. You are asked to write the simple loop-based implementation yourself using the provided templates and to compare the running time of the two implementations.

#### Helper functions for the im2col-based convolution

In [0]:
import numpy as np

def im2col_indices(X_shape, filter_H, filter_W, padding=1, stride=1):
    """
    Returns indexes for an im2col slice
    """
    # Get the output shape
    N, C, H, W = X_shape
    out_H = (H + 2 * padding - filter_H) // stride + 1
    out_W = (W + 2 * padding - filter_W) // stride + 1

    # Get indices for im2col
    i0 = np.repeat(np.arange(filter_H), filter_W)
    i0 = np.tile(i0, C)
    i1 = stride * np.repeat(np.arange(out_H), out_W)
    j0 = np.tile(np.arange(filter_W), filter_H * C)
    j1 = stride * np.tile(np.arange(out_W), out_H)
    i = i0.reshape(-1, 1) + i1.reshape(1, -1)
    j = j0.reshape(-1, 1) + j1.reshape(1, -1)

    k = np.repeat(np.arange(C), filter_H * filter_W).reshape(-1, 1)

    return (k, i, j)


def im2col(X, filter_H, filter_W, padding=1, stride=1):
    """
    An implementation of im2col based on array reindexing
    """
    # Zero-pad the input
    p = padding
    X_padded = np.pad(X, ((0, 0), (0, 0), (p, p), (p, p)), mode='constant')

    k, i, j = im2col_indices(X.shape, filter_H, filter_W, padding, stride)

    cols = X_padded[:, k, i, j]
    C = X.shape[1]
    cols = cols.transpose(1, 2, 0).reshape(filter_H * filter_W * C, -1)
    
    return cols

#### Convolution implementations

In [0]:
def conv_im2col(X, kernel, b, padding=1, stride=1):
    """
    Convolutional layer implementation with im2col
    """
    N, C, H, W = X.shape
    filter_N, _, filter_H, filter_W = kernel.shape

    # Check dimensions
    assert (W + 2 * padding - filter_W) % stride == 0, 'width does not work'
    assert (H + 2 * padding - filter_H) % stride == 0, 'height does not work'

    # Create output
    out_H = (H + 2 * padding - filter_H) // stride + 1
    out_W = (W + 2 * padding - filter_W) // stride + 1
    out = np.zeros((N, filter_N, out_H, out_W), dtype=np.float64)

    X_cols = im2col(X, filter_H, filter_W, padding, stride)
    # The bias is added exactly once, after the reshape below
    res = np.dot(kernel.reshape((filter_N, -1)), X_cols)

    out = res.reshape(filter_N, out_H, out_W, N)
    out = out.transpose(3, 0, 1, 2)
    out += b[np.newaxis, :, np.newaxis, np.newaxis]
    return out

    
def conv_naive(X, kernel, b, padding=1, stride=1):
    """
    Convolutional layer implementation with loops
    """
    N, C, H, W = X.shape
    filter_N, _, filter_H, filter_W = kernel.shape

    # Check dimensions
    assert (W + 2 * padding - filter_W) % stride == 0, 'width does not work'
    assert (H + 2 * padding - filter_H) % stride == 0, 'height does not work'

    # Create output
    out_H = (H + 2 * padding - filter_H) // stride + 1
    out_W = (W + 2 * padding - filter_W) // stride + 1
    out = np.zeros((N, filter_N, out_H, out_W), dtype=np.float64)
    
    # Pad input
    p = padding
    X_padded = np.pad(X, ((0, 0), (0, 0), (p, p), (p, p)), mode='constant')
    N, C, H, W = X_padded.shape
    
    # Convolve
    # =====================================================================================
    # ================================== Your code here ===================================
    # =====================================================================================
    for image in range(N):
        for filter in range(filter_N):
            for i in range(out_H):
                for j in range(out_W):
                    rows_begin, rows_end = i * stride, i * stride + filter_H
                    cols_begin, cols_end = j * stride, j * stride + filter_W
                    out[image, filter, i, j] = np.sum(X_padded[image, :, rows_begin:rows_end, cols_begin:cols_end] * 
                                                     kernel[filter]) + b[filter]
    
    return out

#### Running time comparison

In [4]:
from time import time

params = [{'padding': 0, 'stride': 1},
         {'padding': 0, 'stride': 2},
         {'padding': 2, 'stride': 1},
         {'padding': 2, 'stride': 2}]

shapes = [[(3, 3, 33, 33), (64, 3, 1, 1)],
         [(3, 3, 33, 33), (64, 3, 3, 3)],
         [(3, 3, 33, 33), (64, 3, 5, 5)],
         [(3, 3, 33, 33), (64, 3, 7, 7)]]

for i, param in enumerate(params):
    for j, shape in enumerate(shapes):
        # =====================================================================================
        # ================================== Your code here ===================================
        # =====================================================================================
        X = np.ones(shape[0])
        kernel = np.ones(shape[1])
        b = np.ones(shape[1][0])
        beg_time = time()
        conv_naive(X, kernel, b, **param)
        stupid_time = time() - beg_time
        
        beg_time = time()
        conv_im2col(X, kernel, b, **param)
        smart_time = time() - beg_time
        print(smart_time, stupid_time)

0.004839658737182617 1.2723395824432373
0.004199504852294922 1.1853375434875488
0.006306886672973633 1.096609354019165
0.007314920425415039 1.020918369293213
0.0009400844573974609 0.3913400173187256
0.0016109943389892578 0.32709622383117676
0.0018596649169921875 0.3356459140777588
0.0025026798248291016 0.3086521625518799
0.003965854644775391 1.6654398441314697
0.004519462585449219 1.5852611064910889
0.00726771354675293 1.4295868873596191
0.010019063949584961 1.3356540203094482
0.0012273788452148438 0.48798513412475586
0.0018579959869384766 0.4197056293487549
0.002376556396484375 0.41615796089172363
0.003222942352294922 0.40212106704711914


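#### Conclusions

As the timings show, the im2col-based convolution is roughly two orders of magnitude faster than the naive loop-based one (between about 120 and 420 times faster in the runs above).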

### Another way to implement convolution

One more possible implementation of convolution uses the Fourier transform ($\mathcal{F}$). The following statement holds:

$$f\ast g = \mathcal{F}^{-1}(\mathcal{F}(f)\circ\mathcal{F}(g))$$

where $\ast$ denotes convolution and $\circ$ denotes the elementwise product. It has been observed that this way of implementing convolution gives the largest speed-up for large filter sizes.
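
The identity can be verified numerically with NumPy's FFT routines. The sketch below is an illustration under stated assumptions: it computes the full linear convolution of a single-channel input, zero-padding both arrays so that the circular convolution produced by the FFT matches the linear one. The function names are hypothetical.

```python
import numpy as np


def fft_convolve2d(f, g):
    """Full linear 2D convolution via the convolution theorem."""
    # Zero-pad both arrays to the full output size so that the circular
    # convolution computed by the FFT coincides with the linear one.
    shape = (f.shape[0] + g.shape[0] - 1, f.shape[1] + g.shape[1] - 1)
    spectrum = np.fft.rfft2(f, s=shape) * np.fft.rfft2(g, s=shape)
    return np.fft.irfft2(spectrum, s=shape)


def direct_convolve2d(f, g):
    """Full linear 2D convolution computed from the definition."""
    out = np.zeros((f.shape[0] + g.shape[0] - 1, f.shape[1] + g.shape[1] - 1))
    for a in range(g.shape[0]):
        for b in range(g.shape[1]):
            out[a:a + f.shape[0], b:b + f.shape[1]] += g[a, b] * f
    return out


rng = np.random.default_rng(0)
f = rng.standard_normal((16, 16))
g = rng.standard_normal((5, 5))
match = np.allclose(fft_convolve2d(f, g), direct_convolve2d(f, g))
print(match)
```

Note that the theorem applies to convolution proper; to obtain the cross-correlation used elsewhere in this notebook, the kernel would have to be flipped along both axes first.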

## Special kinds of convolutional layers

### 2D convolution with a $1~\times~1$ filter

This kind of convolution is discussed in detail in this [paper](https://arxiv.org/pdf/1312.4400.pdf). It is used to combine data across channels while changing the number of channels. It is very often used in architectures to reduce the depth of feature maps. 
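
A $1~\times~1$ convolution applies the same linear map to the channel vector of every pixel, which a small NumPy sketch makes explicit. The shapes below (64 input channels reduced to 16) are arbitrary values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 28, 28))      # C x H x W feature maps
w = rng.standard_normal((16, 64))          # D filters of size 1 x 1 over C channels
b = rng.standard_normal(16)

# Same linear map applied to the channel vector of every pixel
y = np.einsum("dc,chw->dhw", w, x) + b[:, None, None]

# Equivalent view: flatten the pixels and do one matrix product
y_matmul = (w @ x.reshape(64, -1)).reshape(16, 28, 28) + b[:, None, None]
print(y.shape, np.allclose(y, y_matmul))
```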


### Transposed 2D convolution (transposed convolution)

This layer increases the spatial size of its input while otherwise operating like an ordinary convolutional layer. Formally, a transposed convolution is defined by swapping the forward and backward passes of an ordinary convolutional layer. Consequently, the spatial size of its output is such that applying an ordinary convolution with the same filter size to it yields the spatial size of the transposed convolution's input.

Some sources also call it 'deconvolution', which is formally incorrect because the operation is not the inverse of convolution. A transposed convolution with $S \geq 2$ is called a **fractionally strided convolution**. Such convolutions are used in segmentation, depth estimation and optical flow, that is, in tasks that require increasing the spatial resolution. Such a layer can be initialized with a bilinear filter.


<table style="width:100%">  
    <tr>
        <td style="text-align:center">$S=1,~P=0$</td>
        <td style="text-align:center">$S=1,~P=2$</td>
        <td style="text-align:center">$S=2,~P=0$</td>
        <td style="text-align:center">$S=2,~P=1$</td>
    </tr>
    <tr>
        <td><img src="pictures/transposed_P0S1.gif"></td>
        <td><img src="pictures/transposed_P2S1.gif"></td>
        <td><img src="pictures/transposed_P0S2.gif"></td>
        <td><img src="pictures/transposed_P1S2.gif"></td>
    </tr>
</table>
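
One way to picture a transposed convolution is as a scatter operation: every input value adds a scaled copy of the kernel to the output. The sketch below is a single-channel illustration, not the notebook's implementation; `conv_transpose2d` is a hypothetical helper, and the output size $S(W_{in}-1)+K-2P$ it produces is stated here as an assumption that follows from inverting the output-size formula of the ordinary convolution.

```python
import numpy as np


def conv_transpose2d(x, w, stride=1, padding=0):
    """Single-channel transposed convolution: scatter scaled copies of the kernel."""
    k = w.shape[0]
    in_h, in_w = x.shape
    full_h = stride * (in_h - 1) + k
    full_w = stride * (in_w - 1) + k
    out = np.zeros((full_h, full_w))
    for i in range(in_h):
        for j in range(in_w):
            out[i * stride:i * stride + k, j * stride:j * stride + k] += x[i, j] * w
    # Padding of the corresponding forward convolution crops the result
    if padding > 0:
        out = out[padding:-padding, padding:-padding]
    return out


kernel = np.ones((3, 3))
out_s1 = conv_transpose2d(np.ones((2, 2)), kernel, stride=1)               # K=3, S=1, P=0
out_s2 = conv_transpose2d(np.ones((2, 2)), kernel, stride=2)               # K=3, S=2, P=0
out_s2_p1 = conv_transpose2d(np.ones((3, 3)), kernel, stride=2, padding=1)  # K=3, S=2, P=1
print(out_s1.shape, out_s2.shape, out_s2_p1.shape)
```

In each case, applying an ordinary convolution with the same $K$, $S$ and $P$ to the result gives back the spatial size of the input.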


### Dilated 2D convolution (dilated convolution or atrous convolution)

It works like an ordinary 2D convolution but has an additional parameter, the dilation. This parameter sets the distance between neighbouring cells of the filter when it is applied to the input (illustrated below). The idea behind dilation is that such convolutions extract spatial information from the input more aggressively (compared with ordinary 2D convolutions, the receptive field of a neuron grows much faster when layers are stacked).  This type of convolutional layer was proposed in this [paper](https://arxiv.org/pdf/1511.07122.pdf "Multi-scale Context Aggregation by Dilated Convolutions"). Such convolutions can be used, for example, in real-time segmentation in order to extract spatial information from the image faster than with an ordinary convolution.

<img src="pictures/dilation.gif">
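
A dilated kernel is equivalent to an ordinary kernel with zeros inserted between its taps, and the growth of the receptive field can be computed directly. This is a minimal sketch assuming stride-1 layers; both helper names are hypothetical.

```python
import numpy as np


def dilate_kernel(w, dilation):
    """Insert (dilation - 1) zeros between neighbouring kernel taps."""
    k = w.shape[0]
    k_eff = k + (k - 1) * (dilation - 1)
    out = np.zeros((k_eff, k_eff), dtype=w.dtype)
    out[::dilation, ::dilation] = w
    return out


def receptive_field(kernel, dilations):
    """Receptive field of a stack of stride-1 layers with the given dilations."""
    field = 1
    for d in dilations:
        field += (kernel - 1) * d
    return field


w = np.arange(1.0, 10.0).reshape(3, 3)
w_dilated = dilate_kernel(w, dilation=2)       # 5 x 5 with 9 non-zero taps
plain = receptive_field(3, [1, 1, 1])          # three ordinary 3 x 3 layers
dilated = receptive_field(3, [1, 2, 4])        # dilation doubles at each layer
print(w_dilated.shape, plain, dilated)
```

With the same number of weights per layer, the stack with doubling dilation covers a noticeably larger region of the input than the stack of ordinary convolutions.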


### 3D convolution

Unlike a 2D convolution, where every channel of the input has a corresponding $K~\times~K$ channel of the filter, the filter in this layer has only $d~(d<C)$ channels ($C$ is the number of channels in the input). The convolution therefore slides in three directions: along height, width and depth. This is shown schematically in the picture (example for a single filter). This type of convolution can be used to process neighbouring video frames (as, for example, in this [paper](https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Tran_Learning_Spatiotemporal_Features_ICCV_2015_paper.pdf)).
<img src="pictures/2d3d_conv.png">
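
The difference from the 2D case shows up in the output shape: because the filter is shallower than the input, it also slides along the depth axis. The following naive sketch handles one filter without padding or stride; `conv3d_single_filter` is a hypothetical name, and the frame sizes are arbitrary.

```python
import numpy as np


def conv3d_single_filter(x, w):
    """'Valid' 3D cross-correlation of a C x H x W volume with a d x K x K filter."""
    c, h, width = x.shape
    d, k, _ = w.shape
    out = np.zeros((c - d + 1, h - k + 1, width - k + 1))
    for z in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[z, i, j] = np.sum(x[z:z + d, i:i + k, j:j + k] * w)
    return out


rng = np.random.default_rng(0)
frames = rng.standard_normal((16, 12, 12))   # e.g. 16 consecutive grayscale frames
w = rng.standard_normal((3, 3, 3))           # d = 3 < C = 16

out = conv3d_single_filter(frames, w)
print(out.shape)
```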

# Visualization with TensorBoard

The second part of this IPython notebook is a brief introduction to the TensorBoard visualization tool. It lets you track, in real time, a variety of statistics about the computation graph built in a TensorFlow session. 

TensorBoard reads the files written with the add_summary method of the tf.summary.FileWriter class and visualizes the data stored in them. Several kinds of statistics can be collected from the graph (see the [documentation](https://www.tensorflow.org/api_docs/python/tf/summary) for details): 
 * tf.summary.scalar
 * tf.summary.histogram
 * tf.summary.image
 * tf.summary.audio
 * tf.summary.text
 
Another useful feature of TensorBoard is visualizing the network architecture, the data flows and the connections between layers, as well as analyzing the execution time and memory consumption of each layer. To get a readable picture it is very important to organize name scopes properly; this is done with tf.name_scope(), see the code below or the [documentation](https://www.tensorflow.org/api_docs/python/tf/name_scope) for examples. The main idea is to put each operation (initialization, multiplication, applying the activation function, etc.) into its own name_scope; the nesting depth is chosen depending on how detailed the visualization needs to be (for example, one can put only whole layers into separate name_scopes).

After the session (tf.Session) has been started, you can begin tracking the statistics with the following command: 
```bash
tensorboard --logdir=logs --port=6006
```

Then open [localhost:6006](http://localhost:6006) in a browser tab. Note that, depending on the OS, TensorBoard may work incorrectly in some browsers (for example, show the architecture without the connections between layers); try different browsers.

Below is code that demonstrates TensorBoard's capabilities by training a neural network with one hidden layer on MNIST. It also contains stubs for implementing a more complex convolutional network. 
* Study the code, train the simple network and look at the visualization.
* Write the code fragment that defines a more complex convolutional architecture (at least two convolutional layers).
* Test the resulting model and reach a test-set accuracy of at least 98.5% (0.985 in the printed output). 

Note that training on a CPU may take a considerable amount of time.

In [0]:
import os
import sys
import math
import numpy as np
import tensorflow as tf

from tensorflow.examples.tutorials.mnist import input_data


def variable_statistic(var):
    """Add some useful statistic for rich visualization"""
    with tf.name_scope('summaries'):
        mean = tf.reduce_mean(var)
        tf.summary.scalar('mean', mean)

        with tf.name_scope('stddev'):
            stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
        tf.summary.scalar('stddev', stddev)
        tf.summary.scalar('max', tf.reduce_max(var))
        tf.summary.scalar('min', tf.reduce_min(var))
        tf.summary.histogram('histogram', var)


def initializer_w(shape):
    """Initialize a weight (w) variable of a given shape."""
    init = tf.truncated_normal(np.array(shape), stddev=1.0 / math.sqrt(np.prod(shape)))

    return tf.Variable(init)


def initializer_b(shape):
    """Initialize a bias variable of a given shape."""
    init = tf.constant(1e-3, shape=np.array(shape))

    return tf.Variable(init)
                               
                               
def add_fc_layer(X, n_in, n_out, name, activation=tf.nn.relu):
    """Add a fully connected layer with activation to the network."""
    with tf.name_scope(name):
        # each name scope holds a variable and statistics
        with tf.name_scope('weights'):
            W = initializer_w([n_in, n_out])
            variable_statistic(W)
        with tf.name_scope('biases'):
            b = initializer_b([n_out])
            variable_statistic(b)
        with tf.name_scope('pre-activation'):
            z = tf.matmul(X, W) + b
            tf.summary.histogram('pre-activations', z)
                               
        a = activation(z, name='activation')
        tf.summary.histogram('activations', a)
        
        return a
                               

def add_conv_layer(X, shape, name, activation=tf.nn.relu, pooling=True):
    """Add a convolutional layer with activation and pooling to the network."""
    with tf.name_scope(name):
        with tf.name_scope('weights'):
            W_conv = initializer_w(shape)
            variable_statistic(W_conv)
        with tf.name_scope('biases'):
            b_conv = initializer_b([shape[-1]])
            variable_statistic(b_conv)
        with tf.name_scope('pre-activation'):
            z = conv2d(X, W_conv) + b_conv
            tf.summary.histogram('pre-activations', z)

        a = activation(z, name='activation')
        tf.summary.histogram('activations', a)
        if pooling:
            pool = max_pool_2x2(a, name='pooling')
            tf.summary.histogram('pooled', pool)
            
            return pool
        else:
            return a

        
def add_loss(predicted, gt):
    """Numerically stable cross-entropy with softmax activations on the last layer."""
    with tf.name_scope('cross_entropy'):
        error = tf.nn.softmax_cross_entropy_with_logits(labels=gt, logits=predicted)
        with tf.name_scope('averaged_loss'):
            cross_entropy = tf.reduce_mean(error)
                               
    tf.summary.scalar('cross_entropy', cross_entropy)
    return cross_entropy


def get_feed_dict(mnist, is_train, BATCH_SIZE=None):
    """Get next batch or test set with labels."""
    if is_train:
        x_placed, y_placed = mnist.train.next_batch(BATCH_SIZE)
    else:
        x_placed, y_placed = mnist.test.images, mnist.test.labels
        
    return x_placed.astype(np.float32), y_placed.astype(np.float32)


def conv2d(X, W):
    """Wrapper for tf.nn.conv2d."""
    return tf.nn.conv2d(X, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(X, name):
    """Wrapper for tf.nn.max_pool."""
    return tf.nn.max_pool(X, ksize=[1, 2, 2, 1], 
                          strides=[1, 2, 2, 1], padding='SAME',
                          name=name)


def build_architecture(X, arch='dummy', keep_prob=0.5):
    """Builds an architecture of a given type."""
    if arch == 'dummy':
        hidden = add_fc_layer(X, 784, 1024, 'hidden')
        return add_fc_layer(hidden, 1024, 10, 'output', activation=tf.identity)
    elif arch == 'cnn':
        # =====================================================================================
        # ================================== Your code here ===================================
        # =====================================================================================
        X_resh = tf.reshape(X, [-1, 28, 28, 1])
        conv1 = add_conv_layer(X_resh, [3, 3, 1, 32], "conv_1", pooling=False)
        conv2 = add_conv_layer(conv1, [3, 3, 32, 64], "conv_2")
        dense_1 = add_fc_layer(tf.reshape(conv2, [-1, 14*14*64]), 14*14*64, 128, 'dense_1')
        return add_fc_layer(dense_1, 128, 10, "out", activation=tf.identity)
    else:
        raise ValueError('Unknown arch.')

In [0]:
from tqdm import tqdm, tqdm_notebook

In [0]:
def evaluate_network(mnist, arch='dummy', lr=0.001, batch_size=100, epochs=10000):
    """Build, train and test network on mnist dataset."""
    # clear all previous graphs
    tf.reset_default_graph()
    
    # input placeholders
    with tf.name_scope('input'):
        X = tf.placeholder(tf.float32, [None, 784], name='input_X')
        y_ = tf.placeholder(tf.float32, [None, 10], name='input_Y')

    with tf.name_scope('input_as_images'):
        X_reshape = tf.reshape(X, [-1, 28, 28, 1])
        tf.summary.image('input', X_reshape, 10)

    # setting up the architecture
    y = build_architecture(X, arch)
    
    # define loss
    loss = add_loss(y, y_)  
    
    # define train step
    with tf.name_scope('train'):
        train_step = tf.train.AdamOptimizer(lr).minimize(loss)
    
    # define evaluation of predictions
    with tf.name_scope('evaluation'):
        with tf.name_scope('correct_mask'):
            correct_mask = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
        with tf.name_scope('accuracy'):
            accuracy = tf.reduce_mean(tf.cast(correct_mask, tf.float32))
    tf.summary.scalar('accuracy', accuracy)
    
    # variable initialization
    init = tf.global_variables_initializer()
    
    # start tensorflow session to train the network
    with tf.Session() as sess:
        # run initialization
        sess.run(init)
                
        # merge all summary and write it to ./logs/<arch>
        merged_summary = tf.summary.merge_all()
        train_writer = tf.summary.FileWriter('./logs/{}/train'.format(arch), sess.graph)
        test_writer = tf.summary.FileWriter('./logs/{}/test'.format(arch))
        
        # run training and evaluate the model every 10th step
        # (each "epoch" here is a single mini-batch update)
        for epoch in tqdm(range(0, epochs + 1)):
            # run train step and record train summary
            run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
            run_metadata = tf.RunMetadata()
            data, labels = get_feed_dict(mnist, True, batch_size)
            summary, _ = sess.run([merged_summary, train_step],
                                  feed_dict={X: data, y_: labels},
                                  options=run_options,
                                  run_metadata=run_metadata)
            train_writer.add_run_metadata(run_metadata, 'epoch: {:6d}'.format(epoch))
            train_writer.add_summary(summary, epoch)

            # evaluate on test set and record test summary
            if epoch % 10 == 0: 
                data, labels = get_feed_dict(mnist, False)
                summary, acc = sess.run([merged_summary, accuracy], feed_dict={X: data, y_: labels})
                test_writer.add_summary(summary, epoch)
                print('epoch: {:6d} | acc.: {:3.3f}'.format(epoch, acc))
                
        train_writer.close()
        test_writer.close()

In [8]:
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use urllib or similar directly.
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py fr

In [13]:
evaluate_network(mnist, arch='dummy', lr=0.001, batch_size=100, epochs=1000)

  0%|          | 3/1001 [00:00<10:15,  1.62it/s]

epoch:      0 | acc.: 0.439


  1%|▏         | 13/1001 [00:02<03:43,  4.43it/s]

epoch:     10 | acc.: 0.782


  2%|▏         | 23/1001 [00:03<02:39,  6.13it/s]

epoch:     20 | acc.: 0.828


  3%|▎         | 33/1001 [00:04<02:25,  6.66it/s]

epoch:     30 | acc.: 0.852


  4%|▍         | 43/1001 [00:06<02:22,  6.72it/s]

epoch:     40 | acc.: 0.877


  5%|▌         | 53/1001 [00:07<02:21,  6.70it/s]

epoch:     50 | acc.: 0.893


  6%|▋         | 63/1001 [00:08<02:20,  6.70it/s]

epoch:     60 | acc.: 0.896


  7%|▋         | 73/1001 [00:10<02:19,  6.66it/s]

epoch:     70 | acc.: 0.899


epoch:     80 | acc.: 0.899
epoch:     90 | acc.: 0.908
epoch:    100 | acc.: 0.906
epoch:    110 | acc.: 0.912
epoch:    120 | acc.: 0.914
epoch:    130 | acc.: 0.914
epoch:    140 | acc.: 0.920
epoch:    150 | acc.: 0.917
epoch:    160 | acc.: 0.923
epoch:    170 | acc.: 0.924
epoch:    180 | acc.: 0.922
epoch:    190 | acc.: 0.926
epoch:    200 | acc.: 0.929
epoch:    210 | acc.: 0.924
epoch:    220 | acc.: 0.932
epoch:    230 | acc.: 0.932
epoch:    240 | acc.: 0.932
epoch:    250 | acc.: 0.936
epoch:    260 | acc.: 0.934
epoch:    270 | acc.: 0.937
epoch:    280 | acc.: 0.938
epoch:    290 | acc.: 0.935
epoch:    300 | acc.: 0.939
epoch:    310 | acc.: 0.942
epoch:    320 | acc.: 0.939
epoch:    330 | acc.: 0.943
epoch:    340 | acc.: 0.944
epoch:    350 | acc.: 0.945
epoch:    360 | acc.: 0.946
epoch:    370 | acc.: 0.947
epoch:    380 | acc.: 0.949
epoch:    390 | acc.: 0.949
epoch:    400 | acc.: 0.946
epoch:    410 | acc.: 0.946
epoch:    420 | acc.: 0.953
epoch:    430 | acc.: 0.950
epoch:    440 | acc.: 0.953
epoch:    450 | acc.: 0.951
epoch:    460 | acc.: 0.955
epoch:    470 | acc.: 0.951
epoch:    480 | acc.: 0.953
epoch:    490 | acc.: 0.954
epoch:    500 | acc.: 0.957
epoch:    510 | acc.: 0.952
epoch:    520 | acc.: 0.957
epoch:    530 | acc.: 0.955
epoch:    540 | acc.: 0.958
epoch:    550 | acc.: 0.961
epoch:    560 | acc.: 0.958
epoch:    570 | acc.: 0.961
epoch:    580 | acc.: 0.959
epoch:    590 | acc.: 0.958
epoch:    600 | acc.: 0.957
epoch:    610 | acc.: 0.962
epoch:    620 | acc.: 0.962
epoch:    630 | acc.: 0.962
epoch:    640 | acc.: 0.962
epoch:    650 | acc.: 0.961
epoch:    660 | acc.: 0.963
epoch:    670 | acc.: 0.965
epoch:    680 | acc.: 0.962
epoch:    690 | acc.: 0.965
epoch:    700 | acc.: 0.963
epoch:    710 | acc.: 0.965
epoch:    720 | acc.: 0.964
epoch:    730 | acc.: 0.965
epoch:    740 | acc.: 0.964
epoch:    750 | acc.: 0.965
epoch:    760 | acc.: 0.963
epoch:    770 | acc.: 0.965
epoch:    780 | acc.: 0.967
epoch:    790 | acc.: 0.961
epoch:    800 | acc.: 0.963
epoch:    810 | acc.: 0.967
epoch:    820 | acc.: 0.966
epoch:    830 | acc.: 0.967
epoch:    840 | acc.: 0.965
epoch:    850 | acc.: 0.965
epoch:    860 | acc.: 0.967
epoch:    870 | acc.: 0.966
epoch:    880 | acc.: 0.966
epoch:    890 | acc.: 0.968
epoch:    900 | acc.: 0.967
epoch:    910 | acc.: 0.970
epoch:    920 | acc.: 0.970
epoch:    930 | acc.: 0.970
epoch:    940 | acc.: 0.971
epoch:    950 | acc.: 0.967
epoch:    960 | acc.: 0.970
epoch:    970 | acc.: 0.969
epoch:    980 | acc.: 0.970
epoch:    990 | acc.: 0.969

100%|██████████| 1001/1001 [02:14<00:00,  5.17it/s]

epoch:   1000 | acc.: 0.970





In [0]:
evaluate_network(mnist, arch='cnn', lr=0.001, batch_size=128, epochs=30000)





  0%|          | 0/30001 [00:00<?, ?it/s][A[A[A[A



  0%|          | 1/30001 [00:40<334:48:27, 40.18s/it][A[A[A[A

epoch:      0 | acc.: 0.288






  0%|          | 2/30001 [00:40<236:01:32, 28.32s/it][A[A[A[A



  0%|          | 3/30001 [00:41<166:42:25, 20.01s/it][A[A[A[A



  0%|          | 4/30001 [00:42<118:14:01, 14.19s/it][A[A[A[A



  0%|          | 5/30001 [00:42<84:20:13, 10.12s/it] [A[A[A[A



  0%|          | 6/30001 [00:43<60:33:29,  7.27s/it][A[A[A[A



  0%|          | 7/30001 [00:43<43:54:29,  5.27s/it][A[A[A[A



  0%|          | 8/30001 [00:44<32:17:19,  3.88s/it][A[A[A[A



  0%|          | 9/30001 [00:45<24:05:45,  2.89s/it][A[A[A[A



  0%|          | 10/30001 [00:45<18:21:37,  2.20s/it][A[A[A[A



  0%|          | 11/30001 [01:26<114:36:17, 13.76s/it][A[A[A[A

epoch:     10 | acc.: 0.612






  0%|          | 12/30001 [01:27<81:43:24,  9.81s/it] [A[A[A[A



  0%|          | 13/30001 [01:27<58:42:53,  7.05s/it][A[A[A[A



  0%|          | 14/30001 [01:28<42:38:12,  5.12s/it][A[A[A[A



  0%|          | 15/30001 [01:28<31:19:20,  3.76s/it][A[A[A[A



  0%|          | 16/30001 [01:29<23:23:13,  2.81s/it][A[A[A[A



  0%|          | 17/30001 [01:30<17:49:47,  2.14s/it][A[A[A[A



  0%|          | 18/30001 [01:30<13:59:40,  1.68s/it][A[A[A[A



  0%|          | 19/30001 [01:31<11:16:31,  1.35s/it][A[A[A[A



  0%|          | 20/30001 [01:31<9:22:01,  1.12s/it] [A[A[A[A



  0%|          | 21/30001 [02:13<110:59:14, 13.33s/it][A[A[A[A

epoch:     20 | acc.: 0.805






  0%|          | 22/30001 [02:14<79:15:34,  9.52s/it] [A[A[A[A



  0%|          | 23/30001 [02:14<56:56:41,  6.84s/it][A[A[A[A



  0%|          | 24/30001 [02:15<41:18:52,  4.96s/it][A[A[A[A



  0%|          | 25/30001 [02:15<30:22:25,  3.65s/it][A[A[A[A



  0%|          | 26/30001 [02:16<22:45:09,  2.73s/it][A[A[A[A



  0%|          | 27/30001 [02:17<17:22:55,  2.09s/it][A[A[A[A



  0%|          | 28/30001 [02:17<13:41:39,  1.64s/it][A[A[A[A



  0%|          | 29/30001 [02:18<11:03:18,  1.33s/it][A[A[A[A



  0%|          | 30/30001 [02:18<9:11:51,  1.10s/it] [A[A[A[A



  0%|          | 31/30001 [02:58<105:02:39, 12.62s/it][A[A[A[A

epoch:     30 | acc.: 0.860






  0%|          | 32/30001 [02:59<74:58:33,  9.01s/it] [A[A[A[A



  0%|          | 33/30001 [02:59<53:55:29,  6.48s/it][A[A[A[A



  0%|          | 34/30001 [03:00<39:10:56,  4.71s/it][A[A[A[A



  0%|          | 35/30001 [03:00<28:52:32,  3.47s/it][A[A[A[A



  0%|          | 36/30001 [03:01<21:42:23,  2.61s/it][A[A[A[A



  0%|          | 37/30001 [03:01<16:37:33,  2.00s/it][A[A[A[A



  0%|          | 38/30001 [03:02<13:10:02,  1.58s/it][A[A[A[A



  0%|          | 39/30001 [03:03<10:39:00,  1.28s/it][A[A[A[A



  0%|          | 40/30001 [03:03<8:52:54,  1.07s/it] [A[A[A[A



  0%|          | 41/30001 [03:41<101:19:26, 12.18s/it][A[A[A[A

epoch:     40 | acc.: 0.879






  0%|          | 42/30001 [03:42<72:23:46,  8.70s/it] [A[A[A[A



  0%|          | 43/30001 [03:42<52:05:13,  6.26s/it][A[A[A[A



  0%|          | 44/30001 [03:43<37:53:09,  4.55s/it][A[A[A[A



  0%|          | 45/30001 [03:44<27:56:58,  3.36s/it][A[A[A[A



  0%|          | 46/30001 [03:44<20:59:20,  2.52s/it][A[A[A[A



  0%|          | 47/30001 [03:45<16:05:25,  1.93s/it][A[A[A[A



  0%|          | 48/30001 [03:45<12:45:10,  1.53s/it][A[A[A[A



  0%|          | 49/30001 [03:46<10:21:03,  1.24s/it][A[A[A[A



  0%|          | 50/30001 [03:46<8:39:21,  1.04s/it] [A[A[A[A



  0%|          | 51/30001 [04:26<104:03:59, 12.51s/it][A[A[A[A

epoch:     50 | acc.: 0.885






  0%|          | 52/30001 [04:26<74:20:17,  8.94s/it] [A[A[A[A



  0%|          | 53/30001 [04:27<53:29:19,  6.43s/it][A[A[A[A



  0%|          | 54/30001 [04:27<38:53:26,  4.68s/it][A[A[A[A



  0%|          | 55/30001 [04:28<28:37:04,  3.44s/it][A[A[A[A



  0%|          | 56/30001 [04:29<21:29:21,  2.58s/it][A[A[A[A



  0%|          | 57/30001 [04:29<16:29:40,  1.98s/it][A[A[A[A



  0%|          | 58/30001 [04:30<12:57:56,  1.56s/it][A[A[A[A



  0%|          | 59/30001 [04:30<10:35:22,  1.27s/it][A[A[A[A



  0%|          | 60/30001 [04:31<8:49:07,  1.06s/it] [A[A[A[A



  0%|          | 61/30001 [05:09<100:54:23, 12.13s/it][A[A[A[A

epoch:     60 | acc.: 0.889






  0%|          | 62/30001 [05:09<72:05:44,  8.67s/it] [A[A[A[A



  0%|          | 63/30001 [05:10<51:51:28,  6.24s/it][A[A[A[A



  0%|          | 64/30001 [05:11<37:47:18,  4.54s/it][A[A[A[A



  0%|          | 65/30001 [05:11<27:54:00,  3.36s/it][A[A[A[A



  0%|          | 66/30001 [05:12<21:00:19,  2.53s/it][A[A[A[A



  0%|          | 67/30001 [05:12<16:09:09,  1.94s/it][A[A[A[A



  0%|          | 68/30001 [05:13<12:47:28,  1.54s/it][A[A[A[A



  0%|          | 69/30001 [05:14<10:27:16,  1.26s/it][A[A[A[A



  0%|          | 70/30001 [05:14<8:45:08,  1.05s/it] [A[A[A[A



  0%|          | 71/30001 [05:52<100:52:00, 12.13s/it][A[A[A[A

epoch:     70 | acc.: 0.900






  0%|          | 72/30001 [05:53<72:00:47,  8.66s/it] [A[A[A[A



  0%|          | 73/30001 [05:53<51:52:09,  6.24s/it][A[A[A[A



  0%|          | 74/30001 [05:54<37:47:35,  4.55s/it][A[A[A[A



  0%|          | 75/30001 [05:54<27:54:40,  3.36s/it][A[A[A[A



  0%|          | 76/30001 [05:55<21:03:26,  2.53s/it][A[A[A[A



  0%|          | 77/30001 [05:56<16:12:38,  1.95s/it][A[A[A[A



  0%|          | 78/30001 [05:56<12:46:59,  1.54s/it][A[A[A[A



  0%|          | 79/30001 [05:57<10:28:36,  1.26s/it][A[A[A[A



  0%|          | 80/30001 [05:57<8:46:11,  1.06s/it] [A[A[A[A



  0%|          | 81/30001 [06:36<101:09:51, 12.17s/it][A[A[A[A

epoch:     80 | acc.: 0.907






  0%|          | 82/30001 [06:36<72:15:29,  8.69s/it] [A[A[A[A



  0%|          | 83/30001 [06:37<52:05:27,  6.27s/it][A[A[A[A



  0%|          | 84/30001 [06:37<37:55:13,  4.56s/it][A[A[A[A



  0%|          | 85/30001 [06:38<27:58:55,  3.37s/it][A[A[A[A



  0%|          | 86/30001 [06:39<21:04:42,  2.54s/it][A[A[A[A



  0%|          | 87/30001 [06:39<16:11:57,  1.95s/it][A[A[A[A



  0%|          | 88/30001 [06:40<12:50:33,  1.55s/it][A[A[A[A



  0%|          | 89/30001 [06:40<10:31:40,  1.27s/it][A[A[A[A



  0%|          | 90/30001 [06:41<8:49:11,  1.06s/it] [A[A[A[A



  0%|          | 91/30001 [07:22<109:32:13, 13.18s/it][A[A[A[A

epoch:     90 | acc.: 0.898






  0%|          | 92/30001 [07:23<78:09:37,  9.41s/it] [A[A[A[A



  0%|          | 93/30001 [07:24<56:08:36,  6.76s/it][A[A[A[A



  0%|          | 94/30001 [07:24<40:45:52,  4.91s/it][A[A[A[A



  0%|          | 95/30001 [07:25<30:02:16,  3.62s/it][A[A[A[A



  0%|          | 96/30001 [07:25<22:30:02,  2.71s/it][A[A[A[A



  0%|          | 97/30001 [07:26<17:12:25,  2.07s/it][A[A[A[A



  0%|          | 98/30001 [07:26<13:30:06,  1.63s/it][A[A[A[A



  0%|          | 99/30001 [07:27<10:55:06,  1.31s/it][A[A[A[A



  0%|          | 100/30001 [07:28<9:06:15,  1.10s/it][A[A[A[A



  0%|          | 101/30001 [08:07<105:21:48, 12.69s/it][A[A[A[A

epoch:    100 | acc.: 0.920






  0%|          | 102/30001 [08:08<75:09:15,  9.05s/it] [A[A[A[A



  0%|          | 103/30001 [08:09<54:03:58,  6.51s/it][A[A[A[A



  0%|          | 104/30001 [08:09<39:19:23,  4.74s/it][A[A[A[A



  0%|          | 105/30001 [08:10<29:00:34,  3.49s/it][A[A[A[A



  0%|          | 106/30001 [08:10<21:47:15,  2.62s/it][A[A[A[A



  0%|          | 107/30001 [08:11<16:43:33,  2.01s/it][A[A[A[A



  0%|          | 108/30001 [08:11<13:09:30,  1.58s/it][A[A[A[A



  0%|          | 109/30001 [08:12<10:41:19,  1.29s/it][A[A[A[A



  0%|          | 110/30001 [08:13<8:58:35,  1.08s/it] [A[A[A[A



  0%|          | 111/30001 [08:54<109:34:26, 13.20s/it][A[A[A[A

epoch:    110 | acc.: 0.916






  0%|          | 112/30001 [08:55<78:13:24,  9.42s/it] [A[A[A[A



  0%|          | 113/30001 [08:55<56:13:54,  6.77s/it][A[A[A[A



  0%|          | 114/30001 [08:56<40:48:40,  4.92s/it][A[A[A[A



  0%|          | 115/30001 [08:57<30:01:50,  3.62s/it][A[A[A[A



  0%|          | 116/30001 [08:57<22:33:40,  2.72s/it][A[A[A[A



  0%|          | 117/30001 [08:58<17:17:45,  2.08s/it][A[A[A[A



  0%|          | 118/30001 [08:58<13:34:11,  1.63s/it][A[A[A[A



  0%|          | 119/30001 [08:59<11:01:00,  1.33s/it][A[A[A[A



  0%|          | 120/30001 [09:00<9:09:52,  1.10s/it] [A[A[A[A



  0%|          | 121/30001 [09:41<110:00:00, 13.25s/it][A[A[A[A

epoch:    120 | acc.: 0.910






  0%|          | 122/30001 [09:42<78:28:38,  9.46s/it] [A[A[A[A



  0%|          | 123/30001 [09:42<56:23:14,  6.79s/it][A[A[A[A



  0%|          | 124/30001 [09:43<40:56:25,  4.93s/it][A[A[A[A



  0%|          | 125/30001 [09:43<30:06:55,  3.63s/it][A[A[A[A



  0%|          | 126/30001 [09:44<22:31:53,  2.72s/it][A[A[A[A



  0%|          | 127/30001 [09:45<17:19:02,  2.09s/it][A[A[A[A



  0%|          | 128/30001 [09:45<13:37:17,  1.64s/it][A[A[A[A



  0%|          | 129/30001 [09:46<11:00:24,  1.33s/it][A[A[A[A



  0%|          | 130/30001 [09:46<9:12:04,  1.11s/it] [A[A[A[A



  0%|          | 131/30001 [10:26<105:30:54, 12.72s/it][A[A[A[A

epoch:    130 | acc.: 0.928






  0%|          | 132/30001 [10:27<75:22:33,  9.08s/it] [A[A[A[A



  0%|          | 133/30001 [10:28<54:15:59,  6.54s/it][A[A[A[A



  0%|          | 134/30001 [10:28<39:25:46,  4.75s/it][A[A[A[A



  0%|          | 135/30001 [10:29<29:05:37,  3.51s/it][A[A[A[A



  0%|          | 136/30001 [10:29<21:47:43,  2.63s/it][A[A[A[A



  0%|          | 137/30001 [10:30<16:42:37,  2.01s/it][A[A[A[A



  0%|          | 138/30001 [10:30<13:08:41,  1.58s/it][A[A[A[A



  0%|          | 139/30001 [10:31<10:45:55,  1.30s/it][A[A[A[A



  0%|          | 140/30001 [10:32<9:02:11,  1.09s/it] [A[A[A[A



  0%|          | 141/30001 [11:10<100:53:11, 12.16s/it][A[A[A[A

epoch:    140 | acc.: 0.926






  0%|          | 142/30001 [11:10<72:06:15,  8.69s/it] [A[A[A[A



  0%|          | 143/30001 [11:11<51:58:13,  6.27s/it][A[A[A[A



  0%|          | 144/30001 [11:11<37:48:00,  4.56s/it][A[A[A[A



  0%|          | 145/30001 [11:12<27:58:23,  3.37s/it][A[A[A[A



  0%|          | 146/30001 [11:13<21:04:34,  2.54s/it][A[A[A[A



  0%|          | 147/30001 [11:13<16:14:31,  1.96s/it][A[A[A[A



  0%|          | 148/30001 [11:14<12:54:09,  1.56s/it][A[A[A[A



  0%|          | 149/30001 [11:14<10:29:43,  1.27s/it][A[A[A[A



  0%|          | 150/30001 [11:15<8:49:41,  1.06s/it] [A[A[A[A



  1%|          | 151/30001 [11:56<108:54:23, 13.13s/it][A[A[A[A

epoch:    150 | acc.: 0.932






  1%|          | 152/30001 [11:57<77:41:31,  9.37s/it] [A[A[A[A



  1%|          | 153/30001 [11:58<55:48:44,  6.73s/it][A[A[A[A



  1%|          | 154/30001 [11:58<40:29:16,  4.88s/it][A[A[A[A



  1%|          | 155/30001 [11:59<29:46:56,  3.59s/it][A[A[A[A



  1%|          | 156/30001 [11:59<22:18:14,  2.69s/it][A[A[A[A



  1%|          | 157/30001 [12:00<17:05:58,  2.06s/it][A[A[A[A



  1%|          | 158/30001 [12:00<13:27:19,  1.62s/it][A[A[A[A



  1%|          | 159/30001 [12:01<10:54:04,  1.32s/it][A[A[A[A



  1%|          | 160/30001 [12:02<9:07:37,  1.10s/it] [A[A[A[A



  1%|          | 161/30001 [12:41<104:59:59, 12.67s/it][A[A[A[A

epoch:    160 | acc.: 0.927






  1%|          | 162/30001 [12:42<75:00:20,  9.05s/it] [A[A[A[A



  1%|          | 163/30001 [12:42<53:57:10,  6.51s/it][A[A[A[A



  1%|          | 164/30001 [12:43<39:14:26,  4.73s/it][A[A[A[A



  1%|          | 165/30001 [12:44<28:56:16,  3.49s/it][A[A[A[A



  1%|          | 166/30001 [12:44<21:43:08,  2.62s/it][A[A[A[A



  1%|          | 167/30001 [12:45<16:40:01,  2.01s/it][A[A[A[A



  1%|          | 168/30001 [12:45<13:06:52,  1.58s/it][A[A[A[A



  1%|          | 169/30001 [12:46<10:41:16,  1.29s/it][A[A[A[A



  1%|          | 170/30001 [12:47<8:54:11,  1.07s/it] [A[A[A[A



  1%|          | 171/30001 [13:25<101:27:46, 12.24s/it][A[A[A[A

epoch:    170 | acc.: 0.936






  1%|          | 172/30001 [13:26<72:31:32,  8.75s/it] [A[A[A[A



  1%|          | 173/30001 [13:26<52:14:31,  6.31s/it][A[A[A[A



  1%|          | 174/30001 [13:27<38:03:33,  4.59s/it][A[A[A[A



  1%|          | 175/30001 [13:27<28:08:07,  3.40s/it][A[A[A[A



  1%|          | 176/30001 [13:28<21:11:13,  2.56s/it][A[A[A[A



  1%|          | 177/30001 [13:29<16:21:08,  1.97s/it][A[A[A[A



  1%|          | 178/30001 [13:29<13:01:05,  1.57s/it][A[A[A[A



  1%|          | 179/30001 [13:30<10:35:47,  1.28s/it][A[A[A[A



  1%|          | 180/30001 [13:30<8:55:10,  1.08s/it] [A[A[A[A



  1%|          | 181/30001 [14:09<102:18:25, 12.35s/it][A[A[A[A

epoch:    180 | acc.: 0.945






  1%|          | 182/30001 [14:10<73:07:07,  8.83s/it] [A[A[A[A



  1%|          | 183/30001 [14:10<52:39:30,  6.36s/it][A[A[A[A



  1%|          | 184/30001 [14:11<38:22:26,  4.63s/it][A[A[A[A



  1%|          | 185/30001 [14:11<28:18:15,  3.42s/it][A[A[A[A



  1%|          | 186/30001 [14:12<21:18:11,  2.57s/it][A[A[A[A



  1%|          | 187/30001 [14:13<16:22:14,  1.98s/it][A[A[A[A



  1%|          | 188/30001 [14:13<12:56:10,  1.56s/it][A[A[A[A



  1%|          | 189/30001 [14:14<10:29:12,  1.27s/it][A[A[A[A



  1%|          | 190/30001 [14:14<8:46:25,  1.06s/it] [A[A[A[A



  1%|          | 191/30001 [14:56<109:11:51, 13.19s/it][A[A[A[A

epoch:    190 | acc.: 0.951






  1%|          | 192/30001 [14:56<77:54:01,  9.41s/it] [A[A[A[A



  1%|          | 193/30001 [14:57<55:59:29,  6.76s/it][A[A[A[A



  1%|          | 194/30001 [14:58<40:41:31,  4.91s/it][A[A[A[A



  1%|          | 195/30001 [14:58<30:02:47,  3.63s/it][A[A[A[A



  1%|          | 196/30001 [14:59<22:26:41,  2.71s/it][A[A[A[A



  1%|          | 197/30001 [14:59<17:13:04,  2.08s/it][A[A[A[A



  1%|          | 198/30001 [15:00<13:32:15,  1.64s/it][A[A[A[A



  1%|          | 199/30001 [15:01<11:02:08,  1.33s/it][A[A[A[A



  1%|          | 200/30001 [15:01<9:13:14,  1.11s/it] [A[A[A[A



  1%|          | 201/30001 [15:43<109:59:57, 13.29s/it][A[A[A[A

epoch:    200 | acc.: 0.953






  1%|          | 202/30001 [15:44<78:31:05,  9.49s/it] [A[A[A[A



  1%|          | 203/30001 [15:44<56:25:19,  6.82s/it][A[A[A[A



  1%|          | 204/30001 [15:45<40:57:14,  4.95s/it][A[A[A[A



  1%|          | 205/30001 [15:45<30:05:11,  3.64s/it][A[A[A[A



  1%|          | 206/30001 [15:46<22:35:56,  2.73s/it][A[A[A[A



  1%|          | 207/30001 [15:46<17:15:47,  2.09s/it][A[A[A[A



  1%|          | 208/30001 [15:47<13:32:33,  1.64s/it][A[A[A[A



  1%|          | 209/30001 [15:48<10:55:22,  1.32s/it][A[A[A[A



  1%|          | 210/30001 [15:48<9:07:53,  1.10s/it] [A[A[A[A



  1%|          | 211/30001 [16:30<110:15:11, 13.32s/it][A[A[A[A

epoch:    210 | acc.: 0.954






  1%|          | 212/30001 [16:31<78:39:42,  9.51s/it] [A[A[A[A



  1%|          | 213/30001 [16:31<56:33:50,  6.84s/it][A[A[A[A



  1%|          | 214/30001 [16:32<41:00:54,  4.96s/it][A[A[A[A



  1%|          | 215/30001 [16:32<30:11:16,  3.65s/it][A[A[A[A



  1%|          | 216/30001 [16:33<22:34:29,  2.73s/it][A[A[A[A



  1%|          | 217/30001 [16:34<17:15:45,  2.09s/it][A[A[A[A



  1%|          | 218/30001 [16:34<13:33:49,  1.64s/it][A[A[A[A



  1%|          | 219/30001 [16:35<10:54:50,  1.32s/it][A[A[A[A



  1%|          | 220/30001 [16:35<9:09:00,  1.11s/it] [A[A[A[A



  1%|          | 221/30001 [17:14<101:00:08, 12.21s/it][A[A[A[A

epoch:    220 | acc.: 0.952






  1%|          | 222/30001 [17:14<72:08:47,  8.72s/it] [A[A[A[A



  1%|          | 223/30001 [17:15<52:00:02,  6.29s/it][A[A[A[A



  1%|          | 224/30001 [17:15<37:52:58,  4.58s/it][A[A[A[A



  1%|          | 225/30001 [17:16<28:00:43,  3.39s/it][A[A[A[A



  1%|          | 226/30001 [17:17<21:05:50,  2.55s/it][A[A[A[A



  1%|          | 227/30001 [17:17<16:12:54,  1.96s/it][A[A[A[A



  1%|          | 228/30001 [17:18<12:48:46,  1.55s/it][A[A[A[A



  1%|          | 229/30001 [17:18<10:24:33,  1.26s/it][A[A[A[A



  1%|          | 230/30001 [17:19<8:51:51,  1.07s/it] [A[A[A[A



  1%|          | 231/30001 [18:00<109:08:55, 13.20s/it][A[A[A[A

epoch:    230 | acc.: 0.956






  1%|          | 232/30001 [18:01<77:52:34,  9.42s/it] [A[A[A[A



  1%|          | 233/30001 [18:02<55:55:39,  6.76s/it][A[A[A[A



  1%|          | 234/30001 [18:02<40:41:49,  4.92s/it][A[A[A[A



  1%|          | 235/30001 [18:03<29:55:53,  3.62s/it][A[A[A[A



  1%|          | 236/30001 [18:03<22:27:31,  2.72s/it][A[A[A[A



  1%|          | 237/30001 [18:04<17:11:36,  2.08s/it][A[A[A[A



  1%|          | 238/30001 [18:05<13:31:49,  1.64s/it][A[A[A[A



  1%|          | 239/30001 [18:05<10:57:49,  1.33s/it][A[A[A[A



  1%|          | 240/30001 [18:06<9:06:16,  1.10s/it] [A[A[A[A



  1%|          | 241/30001 [18:47<109:15:10, 13.22s/it][A[A[A[A

epoch:    240 | acc.: 0.960






  1%|          | 242/30001 [18:48<77:57:04,  9.43s/it] [A[A[A[A



  1%|          | 243/30001 [18:48<56:04:15,  6.78s/it][A[A[A[A



  1%|          | 244/30001 [18:49<40:42:12,  4.92s/it][A[A[A[A



  1%|          | 245/30001 [18:50<29:59:11,  3.63s/it][A[A[A[A



  1%|          | 246/30001 [18:50<22:33:42,  2.73s/it][A[A[A[A



  1%|          | 247/30001 [18:51<17:18:49,  2.09s/it][A[A[A[A



  1%|          | 248/30001 [18:52<13:37:41,  1.65s/it][A[A[A[A



  1%|          | 249/30001 [18:52<11:00:57,  1.33s/it][A[A[A[A



  1%|          | 250/30001 [18:53<9:10:56,  1.11s/it] [A[A[A[A



  1%|          | 251/30001 [19:35<110:30:10, 13.37s/it][A[A[A[A

epoch:    250 | acc.: 0.959






  1%|          | 252/30001 [19:35<78:47:55,  9.54s/it] [A[A[A[A



  1%|          | 253/30001 [19:36<56:34:45,  6.85s/it][A[A[A[A



  1%|          | 254/30001 [19:36<41:06:06,  4.97s/it][A[A[A[A



  1%|          | 255/30001 [19:37<30:13:53,  3.66s/it][A[A[A[A



  1%|          | 256/30001 [19:38<22:36:38,  2.74s/it][A[A[A[A



  1%|          | 257/30001 [19:38<17:19:13,  2.10s/it][A[A[A[A



  1%|          | 258/30001 [19:39<13:34:34,  1.64s/it][A[A[A[A



  1%|          | 259/30001 [19:39<10:59:12,  1.33s/it][A[A[A[A



  1%|          | 260/30001 [19:40<9:07:29,  1.10s/it] [A[A[A[A



  1%|          | 261/30001 [20:21<108:40:10, 13.15s/it][A[A[A[A

epoch:    260 | acc.: 0.959






  1%|          | 262/30001 [20:22<77:32:27,  9.39s/it] [A[A[A[A



  1%|          | 263/30001 [20:22<55:47:40,  6.75s/it][A[A[A[A



  1%|          | 264/30001 [20:23<40:32:19,  4.91s/it][A[A[A[A



  1%|          | 265/30001 [20:24<29:50:55,  3.61s/it][A[A[A[A



  1%|          | 266/30001 [20:24<22:25:29,  2.71s/it][A[A[A[A



  1%|          | 267/30001 [20:25<17:08:53,  2.08s/it][A[A[A[A



  1%|          | 268/30001 [20:26<13:51:20,  1.68s/it][A[A[A[A



  1%|          | 269/30001 [20:26<11:08:11,  1.35s/it][A[A[A[A



  1%|          | 270/30001 [20:27<9:16:02,  1.12s/it] [A[A[A[A



  1%|          | 271/30001 [21:04<99:55:05, 12.10s/it][A[A[A[A

epoch:    270 | acc.: 0.960






  1%|          | 272/30001 [21:05<71:27:00,  8.65s/it][A[A[A[A



  1%|          | 273/30001 [21:06<51:27:14,  6.23s/it][A[A[A[A



  1%|          | 274/30001 [21:06<37:27:28,  4.54s/it][A[A[A[A



  1%|          | 275/30001 [21:07<27:37:47,  3.35s/it][A[A[A[A



  1%|          | 276/30001 [21:07<20:47:22,  2.52s/it][A[A[A[A



  1%|          | 277/30001 [21:08<16:05:58,  1.95s/it][A[A[A[A



  1%|          | 278/30001 [21:09<12:43:06,  1.54s/it][A[A[A[A



  1%|          | 279/30001 [21:09<10:23:40,  1.26s/it][A[A[A[A



  1%|          | 280/30001 [21:10<8:44:01,  1.06s/it] [A[A[A[A



  1%|          | 281/30001 [21:48<100:16:13, 12.15s/it][A[A[A[A

epoch:    280 | acc.: 0.965






  1%|          | 282/30001 [21:48<71:39:46,  8.68s/it] [A[A[A[A



  1%|          | 283/30001 [21:49<51:42:47,  6.26s/it][A[A[A[A



  1%|          | 284/30001 [21:50<37:42:28,  4.57s/it][A[A[A[A



  1%|          | 285/30001 [21:50<27:52:43,  3.38s/it][A[A[A[A



  1%|          | 286/30001 [21:51<20:57:10,  2.54s/it][A[A[A[A



  1%|          | 287/30001 [21:51<16:06:15,  1.95s/it][A[A[A[A



  1%|          | 288/30001 [21:52<12:45:11,  1.55s/it][A[A[A[A



  1%|          | 289/30001 [21:53<10:24:25,  1.26s/it][A[A[A[A



  1%|          | 290/30001 [21:53<8:48:31,  1.07s/it] [A[A[A[A



  1%|          | 291/30001 [22:31<99:41:27, 12.08s/it][A[A[A[A

epoch:    290 | acc.: 0.961






  1%|          | 292/30001 [22:32<71:11:51,  8.63s/it][A[A[A[A



  1%|          | 293/30001 [22:32<51:16:51,  6.21s/it][A[A[A[A



  1%|          | 294/30001 [22:33<37:19:56,  4.52s/it][A[A[A[A



  1%|          | 295/30001 [22:33<27:35:15,  3.34s/it][A[A[A[A



  1%|          | 296/30001 [22:34<20:44:31,  2.51s/it][A[A[A[A



  1%|          | 297/30001 [22:35<15:58:49,  1.94s/it][A[A[A[A



  1%|          | 298/30001 [22:35<12:38:41,  1.53s/it][A[A[A[A



  1%|          | 299/30001 [22:36<10:15:08,  1.24s/it][A[A[A[A



  1%|          | 300/30001 [22:36<8:39:34,  1.05s/it] [A[A[A[A



  1%|          | 301/30001 [23:14<99:50:20, 12.10s/it][A[A[A[A

epoch:    300 | acc.: 0.965






  1%|          | 302/30001 [23:15<71:19:45,  8.65s/it][A[A[A[A



  1%|          | 303/30001 [23:15<51:21:22,  6.23s/it][A[A[A[A



  1%|          | 304/30001 [23:16<37:21:45,  4.53s/it][A[A[A[A



  1%|          | 305/30001 [23:16<27:40:29,  3.35s/it][A[A[A[A



  1%|          | 306/30001 [23:17<20:47:23,  2.52s/it][A[A[A[A



  1%|          | 307/30001 [23:18<16:01:05,  1.94s/it][A[A[A[A



  1%|          | 308/30001 [23:18<12:37:52,  1.53s/it][A[A[A[A



  1%|          | 309/30001 [23:19<10:19:40,  1.25s/it][A[A[A[A



  1%|          | 310/30001 [23:19<8:38:55,  1.05s/it] [A[A[A[A



  1%|          | 311/30001 [24:00<106:30:40, 12.91s/it][A[A[A[A

epoch:    310 | acc.: 0.968






  1%|          | 312/30001 [24:01<75:59:51,  9.22s/it] [A[A[A[A



  1%|          | 313/30001 [24:01<54:47:15,  6.64s/it][A[A[A[A



  1%|          | 314/30001 [24:02<39:51:22,  4.83s/it][A[A[A[A



  1%|          | 315/30001 [24:02<29:23:15,  3.56s/it][A[A[A[A



  1%|          | 316/30001 [24:03<22:05:52,  2.68s/it][A[A[A[A



  1%|          | 317/30001 [24:04<16:57:06,  2.06s/it][A[A[A[A



  1%|          | 318/30001 [24:04<13:21:09,  1.62s/it][A[A[A[A



  1%|          | 319/30001 [24:05<10:52:16,  1.32s/it][A[A[A[A



  1%|          | 320/30001 [24:05<9:05:15,  1.10s/it] [A[A[A[A



  1%|          | 321/30001 [24:46<107:05:28, 12.99s/it][A[A[A[A

epoch:    320 | acc.: 0.970






  1%|          | 322/30001 [24:47<76:26:20,  9.27s/it] [A[A[A[A



  1%|          | 323/30001 [24:47<54:58:42,  6.67s/it][A[A[A[A



  1%|          | 324/30001 [24:48<39:55:07,  4.84s/it][A[A[A[A



  1%|          | 325/30001 [24:49<29:25:35,  3.57s/it][A[A[A[A



  1%|          | 326/30001 [24:49<22:02:47,  2.67s/it][A[A[A[A



  1%|          | 327/30001 [24:50<16:52:32,  2.05s/it][A[A[A[A



  1%|          | 328/30001 [24:50<13:14:58,  1.61s/it][A[A[A[A



  1%|          | 329/30001 [24:51<10:49:58,  1.31s/it][A[A[A[A



  1%|          | 330/30001 [24:52<9:03:15,  1.10s/it] [A[A[A[A



  1%|          | 331/30001 [25:29<99:39:48, 12.09s/it][A[A[A[A

epoch:    330 | acc.: 0.972






  1%|          | 332/30001 [25:30<71:16:06,  8.65s/it][A[A[A[A



  1%|          | 333/30001 [25:30<51:17:23,  6.22s/it][A[A[A[A



  1%|          | 334/30001 [25:31<37:20:51,  4.53s/it][A[A[A[A



  1%|          | 335/30001 [25:32<27:32:56,  3.34s/it][A[A[A[A



  1%|          | 336/30001 [25:32<20:46:01,  2.52s/it][A[A[A[A



  1%|          | 337/30001 [25:33<15:58:15,  1.94s/it][A[A[A[A



  1%|          | 338/30001 [25:33<12:38:55,  1.54s/it][A[A[A[A



  1%|          | 339/30001 [25:34<10:19:54,  1.25s/it][A[A[A[A



  1%|          | 340/30001 [25:35<8:42:08,  1.06s/it] [A[A[A[A



  1%|          | 341/30001 [26:14<103:05:20, 12.51s/it][A[A[A[A

epoch:    340 | acc.: 0.972






  1%|          | 342/30001 [26:14<73:40:15,  8.94s/it] [A[A[A[A



  1%|          | 343/30001 [26:15<52:58:38,  6.43s/it][A[A[A[A



  1%|          | 344/30001 [26:16<38:34:21,  4.68s/it][A[A[A[A



  1%|          | 345/30001 [26:16<28:26:01,  3.45s/it][A[A[A[A



  1%|          | 346/30001 [26:17<21:22:43,  2.60s/it][A[A[A[A



  1%|          | 347/30001 [26:17<16:25:25,  1.99s/it][A[A[A[A



  1%|          | 348/30001 [26:18<12:55:28,  1.57s/it][A[A[A[A



  1%|          | 349/30001 [26:19<10:31:02,  1.28s/it][A[A[A[A



  1%|          | 350/30001 [26:19<8:47:07,  1.07s/it] [A[A[A[A



  1%|          | 351/30001 [26:58<103:21:29, 12.55s/it][A[A[A[A

epoch:    350 | acc.: 0.972






epoch:    360 | acc.: 0.971
epoch:    370 | acc.: 0.974
epoch:    380 | acc.: 0.975
epoch:    390 | acc.: 0.971
epoch:    400 | acc.: 0.968
epoch:    410 | acc.: 0.974
epoch:    420 | acc.: 0.975
epoch:    430 | acc.: 0.975
epoch:    440 | acc.: 0.977
epoch:    450 | acc.: 0.978
epoch:    460 | acc.: 0.980
epoch:    470 | acc.: 0.979
epoch:    480 | acc.: 0.975
epoch:    490 | acc.: 0.978
epoch:    500 | acc.: 0.977
epoch:    510 | acc.: 0.977
epoch:    520 | acc.: 0.978
epoch:    530 | acc.: 0.976
epoch:    540 | acc.: 0.978
epoch:    550 | acc.: 0.981
epoch:    560 | acc.: 0.980
epoch:    570 | acc.: 0.978
epoch:    580 | acc.: 0.975
epoch:    590 | acc.: 0.979
epoch:    600 | acc.: 0.979
epoch:    610 | acc.: 0.980
epoch:    620 | acc.: 0.980
epoch:    630 | acc.: 0.981
epoch:    640 | acc.: 0.981
epoch:    650 | acc.: 0.980
epoch:    660 | acc.: 0.982
epoch:    670 | acc.: 0.981
epoch:    680 | acc.: 0.982
epoch:    690 | acc.: 0.978
epoch:    700 | acc.: 0.981
epoch:    710 | acc.: 0.982
epoch:    720 | acc.: 0.984
epoch:    730 | acc.: 0.985
epoch:    740 | acc.: 0.983
epoch:    750 | acc.: 0.983
epoch:    760 | acc.: 0.983
epoch:    770 | acc.: 0.986
epoch:    780 | acc.: 0.985
epoch:    790 | acc.: 0.984
epoch:    800 | acc.: 0.985
epoch:    810 | acc.: 0.984
epoch:    820 | acc.: 0.985
epoch:    830 | acc.: 0.985
epoch:    840 | acc.: 0.985
epoch:    850 | acc.: 0.984
epoch:    860 | acc.: 0.980
epoch:    870 | acc.: 0.982
epoch:    880 | acc.: 0.984
epoch:    890 | acc.: 0.983
epoch:    900 | acc.: 0.982
epoch:    910 | acc.: 0.985
epoch:    920 | acc.: 0.983
epoch:    930 | acc.: 0.984
epoch:    940 | acc.: 0.986
epoch:    950 | acc.: 0.986
epoch:    960 | acc.: 0.985