# Tutorial for 3D object detection
This tutorial covers 3D object detection on LiDAR point clouds. The project implements a newly proposed Triple Attention module to improve the performance of PointPillar.

Windows users should first download [MobaXterm](https://mobaxterm.mobatek.net/download.html). (This is necessary if you want to forward the demo display, which is rendered on the GPU farm, to your local computer.)

Ubuntu users can use the terminal provided by the OS and connect to the GPU farm with the -X option, e.g. ssh -X

For Mac users, install XQuartz.
```shell
brew install --cask xquartz
```
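
Whichever client you use, the remote shell needs a `DISPLAY` variable before the demo window can be forwarded. As a minimal sketch (the helper name `x11_forwarding_active` is hypothetical, and a set `DISPLAY` is only a necessary condition, not proof that forwarding works), you could check it from Python after logging in:

```python
import os


def x11_forwarding_active(env=None):
    """Return True when DISPLAY is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY", "").strip())


if x11_forwarding_active():
    print("DISPLAY =", os.environ["DISPLAY"])
else:
    print("DISPLAY is not set; reconnect with X11 forwarding enabled (e.g. ssh -X).")
```

If the check fails, the mayavi window in the visualization step is unlikely to appear on your local machine.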

If you are using GPU farm phase 1, which has CUDA 10.2 pre-installed, you can skip the CUDA installation.

If you are using GPU farm phase 2, you have to install CUDA 10.2 manually.

### Manually install CUDA 10.2 (skip this if you are using GPU farm phase 1)
```shell
gpu-interactive
cd ~
wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
chmod u+x cuda_10.2.89_440.33.01_linux.run
./cuda_10.2.89_440.33.01_linux.run --toolkit --toolkitpath=$HOME/cuda-10.2 --defaultroot=$HOME/cuda-10.2
```
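
After the installer finishes, it can be worth confirming that the toolkit landed where the later build step expects it. The sketch below assumes the compiler sits at `bin/nvcc` under the toolkit path passed to the installer; `find_nvcc` is a hypothetical helper, not part of CUDA or OpenPCDet.

```python
from pathlib import Path


def find_nvcc(cuda_home):
    """Return the path to bin/nvcc under cuda_home, or None if it is absent."""
    nvcc = Path(cuda_home).expanduser() / "bin" / "nvcc"
    return nvcc if nvcc.is_file() else None


# Toolkit path used in the install command above.
nvcc = find_nvcc("~/cuda-10.2")
if nvcc:
    print("nvcc found at", nvcc)
else:
    print("nvcc not found under ~/cuda-10.2; re-check the installer options")
```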

## Before executing this Jupyter notebook file, first open a terminal to prepare the environment and download the KITTI dataset.
## 1. Installation of OpenPCDet environment
### Create conda environment
(Skip the gpu-interactive command if you have already run it.)
```shell
gpu-interactive
conda create -n openpcdet python=3.7
conda activate openpcdet
```

### Install required packages (pytorch, mayavi, etc.)
```shell
conda install pytorch==1.7.0 torchvision==0.8.0 -c pytorch
conda install mayavi -c conda-forge
pip install scipy==1.7.3
```

### Install spconv pre-compiled for cuda-10.2
```shell
pip install spconv-cu102
```
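
Before building OpenPCDet, a quick check that the packages above are visible inside the activated environment can save a failed build. This is a sketch only: `missing_packages` is a hypothetical helper, and the import names in `required` are assumed to match the packages installed above.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names assumed for the packages installed above.
required = ["torch", "torchvision", "mayavi", "scipy", "spconv"]
missing = missing_packages(required)
print("missing:", missing if missing else "none")
```

The check only confirms that each module can be located; it does not import it or verify that it was built against CUDA 10.2.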

### Install OpenPCDet
```shell
cd
git clone https://github.com/tianqi-wang1996/OpenPCDet.git
cd ~/OpenPCDet
```
If you are using GPU farm phase 1:
```shell
CUDA_HOME=/usr/local/cuda-10.2 python setup.py develop
```
If you are using GPU farm phase 2, with CUDA 10.2 installed manually in the earlier step:
```shell
CUDA_HOME=$HOME/cuda-10.2 python setup.py develop
```
### Install Jupyter notebook
```shell
conda install jupyter notebook
```

## 2. Download KITTI dataset, unzip and put the data under OpenPCDet/data/kitti/
### Download and unzip KITTI dataset into ~/OpenPCDet/data/kitti folder
```shell
cd ~/OpenPCDet/data/kitti 
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_calib.zip
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_velodyne.zip
unzip data_object_calib.zip
unzip data_object_image_2.zip
unzip data_object_velodyne.zip
```
### Install gdown to download files from Google Drive links
```shell
pip install gdown
```
### Download the road_plane archive and unzip it into the data/kitti/training/ folder
```shell
gdown --id 1d5mq0RXRnvHPVeKx6Q612z0YRO1t2wAp
unzip train_planes.zip -d ~/OpenPCDet/data/kitti/training
```
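
Once the archives are unpacked, you may want to confirm the folder layout before generating the info files. The list below is an assumption based on the archives downloaded above (calibration, images, Velodyne scans, and road planes under `training/`); `missing_dirs` is a hypothetical helper, so adjust the list if your archives extract differently.

```python
from pathlib import Path

# Assumed layout after unzipping; not taken from the OpenPCDet code.
EXPECTED_DIRS = [
    "training/calib",
    "training/image_2",
    "training/velodyne",
    "training/planes",
]


def missing_dirs(kitti_root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under kitti_root."""
    root = Path(kitti_root).expanduser()
    return [d for d in expected if not (root / d).is_dir()]


print("missing directories:", missing_dirs("~/OpenPCDet/data/kitti"))
```
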
### Generate the info files needed for training
```shell
cd ~/OpenPCDet
python -m pcdet.datasets.kitti.kitti_dataset create_kitti_infos tools/cfgs/dataset_configs/kitti_dataset.yaml
```

## 3. Visualize the detection result
### Download a pre-trained checkpoint file for PointPillar and use demo.py to check the visualization result
```shell
cd ~/OpenPCDet
mkdir pretrained_ckpt
cd ~/OpenPCDet/pretrained_ckpt
gdown --id 1wMxWTpU1qUoY3DsCH31WJmvJxcjFXKlm
```
#### Important: 
1. You have to launch this Jupyter notebook file (.ipynb), or start your local terminal, from MobaXterm on Windows, XQuartz on Mac, or a regular terminal on Ubuntu.
2. Make sure you have activated the openpcdet environment in your terminal before launching the Jupyter notebook or executing further commands in your terminal.
3. If you choose to launch the Jupyter notebook, set the kernel under Kernel/Change kernel to Python 3 (ipykernel).

#### From the next Jupyter cell onward, you can simply execute the cells in the Jupyter notebook.
#### If you prefer to run the commands in your local terminal, drop the "%" or "!" symbol at the start of each line, e.g. %cd -> cd, !python -> python.
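
The prefix rule is mechanical, so it can be sketched in a few lines. `to_terminal_command` is a hypothetical helper for illustration; it handles only a single leading "%" or "!" and does not cover cell magics that start with "%%".

```python
def to_terminal_command(cell_line):
    """Drop a leading '%' or '!' so a notebook line can be run in a terminal."""
    stripped = cell_line.lstrip()
    if stripped.startswith(("%", "!")):
        return stripped[1:]
    return stripped


print(to_terminal_command("%cd ~/OpenPCDet/tools"))  # cd ~/OpenPCDet/tools
print(to_terminal_command("!python demo.py"))        # python demo.py
```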

In [None]:
%cd ~/OpenPCDet/tools
!python demo.py --cfg_file cfgs/kitti_models/pointpillar.yaml --ckpt ../pretrained_ckpt/pointpillar_7728.pth --data_path ../data/kitti/training/velodyne/000039.bin

/userhome/35/tqwang/OpenPCDet/tools
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/.tqwang/runtime-tqwang'
2022-01-04 16:57:53,243   INFO  -----------------Quick Demo of OpenPCDet-------------------------
2022-01-04 16:57:53,245   INFO  Total number of samples: 	1
2022-01-04 16:57:55,936   INFO  ==> Loading parameters from checkpoint ../pretrained_ckpt/pointpillar_7728.pth to CPU
2022-01-04 16:57:55,963   INFO  ==> Done (loaded 127/127)
2022-01-04 16:57:56,003   INFO  Visualized sample index: 	1
	nonzero()
Consider using one of the following signatures instead:
	nonzero(*, bool as_tuple) (Triggered internally at  /opt/conda/conda-bld/pytorch_1603729138878/work/torch/csrc/utils/python_arg_parser.cpp:882.)
  original_idxs = scores_mask.nonzero().view(-1)
2022-01-04 16:58:08,663   INFO  Demo done.


#### A mayavi window will pop up to show the visualization result. 
#### Important: You have to launch this Jupyter notebook file (.ipynb) from MobaXterm on Windows, XQuartz on Mac, or a regular terminal on Ubuntu.
<center>
    <img src="https://i.imgur.com/b5o5ssq.png" width = "80%">
    <br>
    <div style="color:orange;
    display: inline-block;
    ">Fig. Visualization result for 3D object detection in point cloud </div>
    
</center>
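
The demo reads a single Velodyne scan (`000039.bin`). If you want to look at the raw points yourself, the sketch below assumes the standard KITTI layout of consecutive little-endian float32 values, four per point (x, y, z, reflectance). `load_kitti_points` is a hypothetical helper written with the standard library only; it is not how OpenPCDet loads the data.

```python
import struct


def load_kitti_points(path):
    """Read a KITTI Velodyne .bin file into a list of (x, y, z, intensity) tuples."""
    with open(path, "rb") as f:
        data = f.read()
    # Each point is 4 float32 values, i.e. 16 bytes.
    if len(data) % 16 != 0:
        raise ValueError("file size is not a multiple of 16 bytes (4 float32 values)")
    return list(struct.iter_unpack("<4f", data))
```

Calling `load_kitti_points` on the scan used in the demo and taking `len()` of the result gives the number of points in that frame.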

## 4. Training and testing given a model config

## Train the model with the given config for 20 epochs (takes around 2 hours on a single GPU).
#### You can add --extra_tag (e.g. --extra_tag experiment1). A checkpoint file is saved after every epoch in the OpenPCDet/output/pointpillar/experiment1/ckpt/ folder.
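
The two-hour figure can be sanity-checked from numbers that appear in the training log of the run recorded in this tutorial: 3712 training samples, a batch size of 4, 20 epochs, and roughly 338 s for the first epoch. The sketch below is rough arithmetic under the assumption that every epoch takes about as long as the first and that evaluation time is excluded; `estimate_training` is a hypothetical helper.

```python
import math


def estimate_training(num_samples, batch_size, epochs, seconds_per_epoch):
    """Return (iterations per epoch, total iterations, total hours)."""
    iters_per_epoch = math.ceil(num_samples / batch_size)
    total_iters = iters_per_epoch * epochs
    total_hours = epochs * seconds_per_epoch / 3600
    return iters_per_epoch, total_iters, total_hours


# Values taken from the log of the run recorded in this tutorial.
iters, total, hours = estimate_training(3712, 4, 20, 338.35)
print(f"{iters} iterations/epoch, {total} iterations in total, about {hours:.1f} h")
```

With these inputs the estimate is 928 iterations per epoch and about 1.9 hours in total. Your own timings will depend on the GPU and data loading speed.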

In [None]:
%cd ~/OpenPCDet/tools/
!python train.py --cfg_file ./cfgs/kitti_models/pointpillar.yaml --extra_tag experiment1

/userhome/35/tqwang/OpenPCDet/tools
2022-01-04 16:58:33,961   INFO  **********************Start logging**********************
2022-01-04 16:58:33,962   INFO  CUDA_VISIBLE_DEVICES=0
2022-01-04 16:58:33,962   INFO  cfg_file         ./cfgs/kitti_models/pointpillar.yaml
2022-01-04 16:58:33,962   INFO  batch_size       4
2022-01-04 16:58:33,962   INFO  epochs           20
2022-01-04 16:58:33,962   INFO  workers          4
2022-01-04 16:58:33,962   INFO  extra_tag        experiment1
2022-01-04 16:58:33,962   INFO  ckpt             None
2022-01-04 16:58:33,962   INFO  pretrained_model None
2022-01-04 16:58:33,962   INFO  launcher         none
2022-01-04 16:58:33,962   INFO  tcp_port         18888
2022-01-04 16:58:33,962   INFO  sync_bn          False
2022-01-04 16:58:33,962   INFO  fix_random_seed  False
2022-01-04 16:58:33,962   INFO  ckpt_save_interval 1
2022-01-04 16:58:33,962   INFO  local_rank       0
2022-01-04 16:58:33,962   INFO  max_ckpt_save_num 30
2022-01-04 16:58:33,963   INFO  me

2022-01-04 16:58:34,250   INFO  Database filter by min points Car: 14357 => 13532
2022-01-04 16:58:34,251   INFO  Database filter by min points Pedestrian: 2207 => 2168
2022-01-04 16:58:34,251   INFO  Database filter by min points Cyclist: 734 => 705
2022-01-04 16:58:34,284   INFO  Database filter by difficulty Car: 13532 => 10759
2022-01-04 16:58:34,289   INFO  Database filter by difficulty Pedestrian: 2168 => 2075
2022-01-04 16:58:34,292   INFO  Database filter by difficulty Cyclist: 705 => 581
2022-01-04 16:58:34,301   INFO  Loading KITTI dataset
2022-01-04 16:58:34,476   INFO  Total samples for KITTI dataset: 3712
2022-01-04 16:58:37,228   INFO  PointPillar(
  (vfe): PillarVFE(
    (pfn_layers): ModuleList(
      (0): PFNLayer(
        (linear): Linear(in_features=10, out_features=64, bias=False)
        (norm): BatchNorm1d(64, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
      )
    )
  )
  (backbone_3d): None
  (map_to_bev_module): PointPillarScatter()
  (pfe)

epochs:   0%| | 0/20 [00:05<?, ?it/s, loss=2.23, lr=0.0003, d_time=0.00(0.10), f
train:   1%|▎                     | 11/928 [00:05<05:34,  2.74it/s, total_it=10]

epochs:   0%| | 0/20 [02:56<?, ?it/s, loss=1.12, lr=0.00033, d_time=0.00(0.00), 
train:  54%|██████████▊         | 501/928 [02:56<02:45,  2.58it/s, total_it=500]

epochs:   0%| | 0/20 [05:25<?, ?it/s, loss=0.955, lr=0.000395, d_time=0.00(0.00)
train:  97%|███████████████████▎| 896/928 [05:25<00:12,  2.63it/s, total_it=895]

epochs:   5%| | 1/20 [05:43<1:47:08, 338.35s/it, loss=0.921, lr=0.000405, d_time
train:   1%|▎                    | 12/928 [00:05<05:53,  2.59it/s, total_it=939]

epochs:   5%| | 1/20 [07:34<1:47:08, 338.35s/it, loss=1.14, lr=0.00048, d_time=0
train:  33%|██████▎            | 310/928 [01:56<03:52,  2.66it/s, total_it=1237]
epochs:   5%| | 1/20 [07:34<1:47:08, 338.35s/it, loss=1.02, lr=0.000481, d_time=[A
train:  34%|██████▎            | 311/928 [01:56<03:40,  2.80it/s, total_it=1

epochs:   5%| | 1/20 [07:51<1:47:08, 338.35s/it, loss=1.04, lr=0.000494, d_time=[A
train:  38%|███████▎           | 355/928 [02:13<03:48,  2.51it/s, total_it=1282][A
epochs:   5%| | 1/20 [07:51<1:47:08, 338.35s/it, loss=1.04, lr=0.000494, d_time=[A
train:  38%|███████▎           | 356/928 [02:13<03:48,  2.50it/s, total_it=1283][A
epochs:   5%| | 1/20 [07:52<1:47:08, 338.35s/it, loss=0.93, lr=0.000494, d_time=[A
train:  38%|███████▎           | 357/928 [02:14<03:50,  2.48it/s, total_it=1284][A
epochs:   5%| | 1/20 [07:52<1:47:08, 338.35s/it, loss=0.917, lr=0.000494, d_time[A
train:  39%|███████▎           | 358/928 [02:14<03:47,  2.50it/s, total_it=1285][A
epochs:   5%| | 1/20 [07:53<1:47:08, 338.35s/it, loss=1.01, lr=0.000495, d_time=[A
train:  39%|███████▎           | 359/928 [02:15<03:40,  2.58it/s, total_it=1286][A
epochs:   5%| | 1/20 [07:53<1:47:08, 338.35s/it, loss=0.921, lr=0.000495, d_time[A
train:  39%|███████▎           | 360/928 [02:15<03:46,  2.51it/s, total_it=1

epochs:   5%| | 1/20 [08:09<1:47:08, 338.35s/it, loss=0.899, lr=0.000508, d_time[A
train:  44%|████████▎          | 404/928 [02:31<03:06,  2.81it/s, total_it=1331][A
epochs:   5%| | 1/20 [08:10<1:47:08, 338.35s/it, loss=1, lr=0.000509, d_time=0.0[A
train:  44%|████████▎          | 405/928 [02:32<03:04,  2.84it/s, total_it=1332][A
epochs:   5%| | 1/20 [08:10<1:47:08, 338.35s/it, loss=0.925, lr=0.000509, d_time[A
train:  44%|████████▎          | 406/928 [02:32<03:19,  2.61it/s, total_it=1333][A
epochs:   5%| | 1/20 [08:10<1:47:08, 338.35s/it, loss=1.01, lr=0.000509, d_time=[A
train:  44%|████████▎          | 407/928 [02:32<03:22,  2.58it/s, total_it=1334][A
epochs:   5%| | 1/20 [08:11<1:47:08, 338.35s/it, loss=0.998, lr=0.000509, d_time[A
train:  44%|████████▎          | 408/928 [02:33<03:24,  2.54it/s, total_it=1335][A
epochs:   5%| | 1/20 [08:11<1:47:08, 338.35s/it, loss=1.22, lr=0.00051, d_time=0[A
train:  44%|████████▎          | 409/928 [02:33<03:21,  2.57it/s, total_it=1

epochs:   5%| | 1/20 [08:28<1:47:08, 338.35s/it, loss=1.08, lr=0.000523, d_time=[A
train:  49%|█████████▎         | 453/928 [02:50<02:59,  2.64it/s, total_it=1380][A
epochs:   5%| | 1/20 [08:28<1:47:08, 338.35s/it, loss=0.97, lr=0.000524, d_time=[A
train:  49%|█████████▎         | 454/928 [02:50<02:55,  2.70it/s, total_it=1381][A
epochs:   5%| | 1/20 [08:28<1:47:08, 338.35s/it, loss=0.791, lr=0.000524, d_time[A
train:  49%|█████████▎         | 455/928 [02:50<03:01,  2.61it/s, total_it=1382][A
epochs:   5%| | 1/20 [08:29<1:47:08, 338.35s/it, loss=1.02, lr=0.000524, d_time=[A
train:  49%|█████████▎         | 456/928 [02:51<02:54,  2.71it/s, total_it=1383][A
epochs:   5%| | 1/20 [08:29<1:47:08, 338.35s/it, loss=0.857, lr=0.000525, d_time[A
train:  49%|█████████▎         | 457/928 [02:51<03:00,  2.60it/s, total_it=1384][A
epochs:   5%| | 1/20 [08:30<1:47:08, 338.35s/it, loss=0.9, lr=0.000525, d_time=0[A
train:  49%|█████████▍         | 458/928 [02:51<02:50,  2.75it/s, total_it=1

epochs:   5%| | 1/20 [08:46<1:47:08, 338.35s/it, loss=0.863, lr=0.000539, d_time[A
train:  54%|██████████▎        | 502/928 [03:08<02:38,  2.69it/s, total_it=1429][A
epochs:   5%| | 1/20 [08:46<1:47:08, 338.35s/it, loss=0.892, lr=0.000539, d_time[A
train:  54%|██████████▎        | 503/928 [03:08<02:40,  2.65it/s, total_it=1430][A
epochs:   5%| | 1/20 [08:47<1:47:08, 338.35s/it, loss=0.906, lr=0.00054, d_time=[A
train:  54%|██████████▎        | 504/928 [03:09<02:37,  2.70it/s, total_it=1431][A
epochs:   5%| | 1/20 [08:47<1:47:08, 338.35s/it, loss=1.1, lr=0.00054, d_time=0.[A
train:  54%|██████████▎        | 505/928 [03:09<02:35,  2.73it/s, total_it=1432][A
epochs:   5%| | 1/20 [08:47<1:47:08, 338.35s/it, loss=0.874, lr=0.00054, d_time=[A
train:  55%|██████████▎        | 506/928 [03:09<02:39,  2.65it/s, total_it=1433][A
epochs:   5%| | 1/20 [08:48<1:47:08, 338.35s/it, loss=0.909, lr=0.000541, d_time[A
train:  55%|██████████▍        | 507/928 [03:10<02:39,  2.64it/s, total_it=1

epochs:   5%| | 1/20 [09:04<1:47:08, 338.35s/it, loss=0.864, lr=0.000555, d_time[A
train:  59%|███████████▎       | 551/928 [03:26<02:23,  2.63it/s, total_it=1478][A
epochs:   5%| | 1/20 [09:05<1:47:08, 338.35s/it, loss=0.885, lr=0.000556, d_time[A
train:  59%|███████████▎       | 552/928 [03:26<02:24,  2.60it/s, total_it=1479][A
epochs:   5%| | 1/20 [09:05<1:47:08, 338.35s/it, loss=0.921, lr=0.000556, d_time[A
train:  60%|███████████▎       | 553/928 [03:27<02:19,  2.68it/s, total_it=1480][A
epochs:   5%| | 1/20 [09:05<1:47:08, 338.35s/it, loss=0.884, lr=0.000556, d_time[A
train:  60%|███████████▎       | 554/928 [03:27<02:21,  2.64it/s, total_it=1481][A
epochs:   5%| | 1/20 [09:06<1:47:08, 338.35s/it, loss=0.984, lr=0.000557, d_time[A
train:  60%|███████████▎       | 555/928 [03:28<02:21,  2.64it/s, total_it=1482][A
epochs:   5%| | 1/20 [09:06<1:47:08, 338.35s/it, loss=0.829, lr=0.000557, d_time[A
train:  60%|███████████▍       | 556/928 [03:28<02:21,  2.63it/s, total_it=1

epochs:   5%| | 1/20 [09:23<1:47:08, 338.35s/it, loss=1.01, lr=0.000572, d_time=[A
train:  65%|████████████▎      | 600/928 [03:45<02:07,  2.57it/s, total_it=1527][A
epochs:   5%| | 1/20 [09:23<1:47:08, 338.35s/it, loss=1.15, lr=0.000572, d_time=[A
train:  65%|████████████▎      | 601/928 [03:45<02:04,  2.63it/s, total_it=1528][A
epochs:   5%| | 1/20 [09:24<1:47:08, 338.35s/it, loss=0.881, lr=0.000573, d_time[A
train:  65%|████████████▎      | 602/928 [03:45<01:55,  2.81it/s, total_it=1529][A
epochs:   5%| | 1/20 [09:24<1:47:08, 338.35s/it, loss=0.972, lr=0.000573, d_time[A
train:  65%|████████████▎      | 603/928 [03:46<02:02,  2.65it/s, total_it=1530][A
epochs:   5%| | 1/20 [09:24<1:47:08, 338.35s/it, loss=1.05, lr=0.000573, d_time=[A
train:  65%|████████████▎      | 604/928 [03:46<02:03,  2.62it/s, total_it=1531][A
epochs:   5%| | 1/20 [09:25<1:47:08, 338.35s/it, loss=1.04, lr=0.000574, d_time=[A
train:  65%|████████████▍      | 605/928 [03:47<02:04,  2.60it/s, total_it=1

epochs:   5%| | 1/20 [09:41<1:47:08, 338.35s/it, loss=0.822, lr=0.000589, d_time[A
train:  70%|█████████████▎     | 649/928 [04:03<01:42,  2.72it/s, total_it=1576][A
epochs:   5%| | 1/20 [09:41<1:47:08, 338.35s/it, loss=0.859, lr=0.000589, d_time[A
train:  70%|█████████████▎     | 650/928 [04:03<01:43,  2.69it/s, total_it=1577][A
epochs:   5%| | 1/20 [09:42<1:47:08, 338.35s/it, loss=0.903, lr=0.00059, d_time=[A
train:  70%|█████████████▎     | 651/928 [04:04<01:43,  2.67it/s, total_it=1578][A
epochs:   5%| | 1/20 [09:42<1:47:08, 338.35s/it, loss=0.95, lr=0.00059, d_time=0[A
train:  70%|█████████████▎     | 652/928 [04:04<01:37,  2.82it/s, total_it=1579][A
epochs:   5%| | 1/20 [09:42<1:47:08, 338.35s/it, loss=0.855, lr=0.00059, d_time=[A
train:  70%|█████████████▎     | 653/928 [04:04<01:37,  2.82it/s, total_it=1580][A
epochs:   5%| | 1/20 [09:43<1:47:08, 338.35s/it, loss=1.52, lr=0.000591, d_time=[A
train:  70%|█████████████▍     | 654/928 [04:05<01:35,  2.88it/s, total_it=1

epochs:   5%| | 1/20 [09:59<1:47:08, 338.35s/it, loss=0.975, lr=0.000606, d_time[A
train:  75%|██████████████▎    | 698/928 [04:21<01:27,  2.62it/s, total_it=1625][A
epochs:   5%| | 1/20 [09:59<1:47:08, 338.35s/it, loss=0.902, lr=0.000607, d_time[A
train:  75%|██████████████▎    | 699/928 [04:21<01:30,  2.53it/s, total_it=1626][A
epochs:   5%| | 1/20 [10:00<1:47:08, 338.35s/it, loss=1.13, lr=0.000607, d_time=[A
train:  75%|██████████████▎    | 700/928 [04:22<01:30,  2.52it/s, total_it=1627][A
epochs:   5%| | 1/20 [10:00<1:47:08, 338.35s/it, loss=0.951, lr=0.000608, d_time[A
train:  76%|██████████████▎    | 701/928 [04:22<01:31,  2.48it/s, total_it=1628][A
epochs:   5%| | 1/20 [10:01<1:47:08, 338.35s/it, loss=0.932, lr=0.000608, d_time[A
train:  76%|██████████████▎    | 702/928 [04:23<01:32,  2.45it/s, total_it=1629][A
epochs:   5%| | 1/20 [10:01<1:47:08, 338.35s/it, loss=0.915, lr=0.000608, d_time[A
train:  76%|██████████████▍    | 703/928 [04:23<01:32,  2.42it/s, total_it=1

epochs:   5%| | 1/20 [10:17<1:47:08, 338.35s/it, loss=0.963, lr=0.000624, d_time[A
train:  80%|███████████████▎   | 747/928 [04:39<00:58,  3.08it/s, total_it=1674][A
epochs:   5%| | 1/20 [10:17<1:47:08, 338.35s/it, loss=0.839, lr=0.000625, d_time[A
train:  81%|███████████████▎   | 748/928 [04:39<00:59,  3.02it/s, total_it=1675][A
epochs:   5%| | 1/20 [10:18<1:47:08, 338.35s/it, loss=0.941, lr=0.000625, d_time[A
train:  81%|███████████████▎   | 749/928 [04:40<01:02,  2.86it/s, total_it=1676][A
epochs:   5%| | 1/20 [10:18<1:47:08, 338.35s/it, loss=1, lr=0.000626, d_time=0.0[A
train:  81%|███████████████▎   | 750/928 [04:40<01:01,  2.91it/s, total_it=1677][A
epochs:   5%| | 1/20 [10:18<1:47:08, 338.35s/it, loss=0.909, lr=0.000626, d_time[A
train:  81%|███████████████▍   | 751/928 [04:40<01:00,  2.94it/s, total_it=1678][A
epochs:   5%| | 1/20 [10:19<1:47:08, 338.35s/it, loss=0.899, lr=0.000626, d_time[A
train:  81%|███████████████▍   | 752/928 [04:41<00:58,  3.03it/s, total_it=1

epochs:   5%| | 1/20 [10:34<1:47:08, 338.35s/it, loss=0.947, lr=0.000643, d_time[A
train:  86%|████████████████▎  | 796/928 [04:56<00:46,  2.81it/s, total_it=1723][A
epochs:   5%| | 1/20 [10:34<1:47:08, 338.35s/it, loss=0.832, lr=0.000643, d_time[A
train:  86%|████████████████▎  | 797/928 [04:56<00:45,  2.87it/s, total_it=1724][A
epochs:   5%| | 1/20 [10:34<1:47:08, 338.35s/it, loss=0.739, lr=0.000644, d_time[A
train:  86%|████████████████▎  | 798/928 [04:56<00:44,  2.93it/s, total_it=1725][A
epochs:   5%| | 1/20 [10:35<1:47:08, 338.35s/it, loss=0.996, lr=0.000644, d_time[A
train:  86%|████████████████▎  | 799/928 [04:57<00:43,  2.97it/s, total_it=1726][A
epochs:   5%| | 1/20 [10:35<1:47:08, 338.35s/it, loss=1, lr=0.000644, d_time=0.0[A
train:  86%|████████████████▍  | 800/928 [04:57<00:43,  2.97it/s, total_it=1727][A
epochs:   5%| | 1/20 [10:35<1:47:08, 338.35s/it, loss=0.875, lr=0.000645, d_time[A
train:  86%|████████████████▍  | 801/928 [04:57<00:42,  2.99it/s, total_it=1

epochs:   5%| | 1/20 [10:50<1:47:08, 338.35s/it, loss=0.867, lr=0.000662, d_time[A
train:  91%|█████████████████▎ | 845/928 [05:12<00:28,  2.90it/s, total_it=1772][A
epochs:   5%| | 1/20 [10:50<1:47:08, 338.35s/it, loss=0.891, lr=0.000662, d_time[A
train:  91%|█████████████████▎ | 846/928 [05:12<00:27,  2.94it/s, total_it=1773][A
epochs:   5%| | 1/20 [10:51<1:47:08, 338.35s/it, loss=0.901, lr=0.000662, d_time[A
train:  91%|█████████████████▎ | 847/928 [05:13<00:27,  2.97it/s, total_it=1774][A
epochs:   5%| | 1/20 [10:51<1:47:08, 338.35s/it, loss=0.814, lr=0.000663, d_time[A
train:  91%|█████████████████▎ | 848/928 [05:13<00:27,  2.92it/s, total_it=1775][A
epochs:   5%| | 1/20 [10:51<1:47:08, 338.35s/it, loss=0.89, lr=0.000663, d_time=[A
train:  91%|█████████████████▍ | 849/928 [05:13<00:26,  2.94it/s, total_it=1776][A
epochs:   5%| | 1/20 [10:52<1:47:08, 338.35s/it, loss=0.83, lr=0.000664, d_time=[A
train:  92%|█████████████████▍ | 850/928 [05:14<00:26,  2.96it/s, total_it=1

epochs:   5%| | 1/20 [11:07<1:47:08, 338.35s/it, loss=0.972, lr=0.000681, d_time[A
train:  96%|██████████████████▎| 894/928 [05:28<00:11,  3.08it/s, total_it=1821][A
epochs:   5%| | 1/20 [11:07<1:47:08, 338.35s/it, loss=0.966, lr=0.000681, d_time[A
train:  96%|██████████████████▎| 895/928 [05:29<00:10,  3.03it/s, total_it=1822][A
epochs:   5%| | 1/20 [11:07<1:47:08, 338.35s/it, loss=0.892, lr=0.000682, d_time[A
train:  97%|██████████████████▎| 896/928 [05:29<00:10,  3.06it/s, total_it=1823][A
epochs:   5%| | 1/20 [11:08<1:47:08, 338.35s/it, loss=0.934, lr=0.000682, d_time[A
train:  97%|██████████████████▎| 897/928 [05:29<00:10,  2.92it/s, total_it=1824][A
epochs:   5%| | 1/20 [11:08<1:47:08, 338.35s/it, loss=0.827, lr=0.000683, d_time[A
train:  97%|██████████████████▍| 898/928 [05:30<00:10,  2.90it/s, total_it=1825][A
epochs:  10%| | 2/20 [11:24<1:41:55, 339.74s/it, loss=0.831, lr=0.0007, d_time=0[A
train:   2%|▎                   | 14/928 [00:05<05:26,  2.80it/s, total_it=1869][A
epochs:  10%| | 2/20 [16:22<1:41:55, 339.74s/it, loss=0.801, lr=0.00112, d_time=[A
train:  97%|██████████████████▍| 900/928 [05:03<00:09,  3.06it/s, total_it=2755][A
epochs:  10%| | 2/20 [16:22<1:41:55, 339.74s/it, loss=0.9, lr=0.00112, d_time=0.[A
train:  97%|██████████████████▍| 901/928 [05:03<00:09,  2.94it/s, total_it=2

epochs:  15%|▏| 3/20 [16:37<1:32:48, 327.55s/it, loss=0.892, lr=0.00114, d_time=[A
train:   2%|▎                   | 16/928 [00:05<04:53,  3.11it/s, total_it=2799][A
epochs:  15%|▏| 3/20 [16:38<1:32:48, 327.55s/it, loss=0.865, lr=0.00114, d_time=[A
train:   2%|▎                   | 17/928 [00:06<04:51,  3.12it/s, total_it=2800][A
epochs:  15%|▏| 3/20 [16:38<1:32:48, 327.55s/it, loss=0.733, lr=0.00114, d_time=[A
train:   2%|▍                   | 18/928 [00:06<04:51,  3.13it/s, total_it=2801][A
epochs:  15%|▏| 3/20 [16:38<1:32:48, 327.55s/it, loss=0.79, lr=0.00114, d_time=0[A
train:   2%|▍                   | 19/928 [00:06<04:50,  3.13it/s, total_it=2802][A
epochs:  15%|▏| 3/20 [16:39<1:32:48, 327.55s/it, loss=0.732, lr=0.00114, d_time=[A
train:   2%|▍                   | 20/928 [00:07<05:17,  2.86it/s, total_it=2803][A
epochs:  15%|▏| 3/20 [16:39<1:32:48, 327.55s/it, loss=0.917, lr=0.00114, d_time=[A
train:   2%|▍                   | 21/928 [00:07<05:04,  2.98it/s, total_it=2

epochs:  15%|▏| 3/20 [16:54<1:32:48, 327.55s/it, loss=0.784, lr=0.00117, d_time=[A
train:   7%|█▍                  | 65/928 [00:22<05:04,  2.84it/s, total_it=2848][A
epochs:  15%|▏| 3/20 [16:54<1:32:48, 327.55s/it, loss=0.923, lr=0.00117, d_time=[A
train:   7%|█▍                  | 66/928 [00:22<04:51,  2.95it/s, total_it=2849][A
epochs:  15%|▏| 3/20 [16:55<1:32:48, 327.55s/it, loss=0.878, lr=0.00117, d_time=[A
train:   7%|█▍                  | 67/928 [00:23<04:51,  2.95it/s, total_it=2850][A
epochs:  15%|▏| 3/20 [16:55<1:32:48, 327.55s/it, loss=0.777, lr=0.00117, d_time=[A
train:   7%|█▍                  | 68/928 [00:23<04:55,  2.91it/s, total_it=2851][A
epochs:  15%|▏| 3/20 [16:55<1:32:48, 327.55s/it, loss=0.904, lr=0.00117, d_time=[A
train:   7%|█▍                  | 69/928 [00:23<04:49,  2.96it/s, total_it=2852][A
epochs:  15%|▏| 3/20 [16:56<1:32:48, 327.55s/it, loss=0.988, lr=0.00117, d_time=[A
train:   8%|█▌                  | 70/928 [00:24<04:50,  2.96it/s, total_it=2

epochs:  15%|▏| 3/20 [17:10<1:32:48, 327.55s/it, loss=0.876, lr=0.00119, d_time=[A
train:  12%|██▎                | 114/928 [00:38<04:37,  2.93it/s, total_it=2897][A
epochs:  15%|▏| 3/20 [17:11<1:32:48, 327.55s/it, loss=0.826, lr=0.00119, d_time=[A
train:  12%|██▎                | 115/928 [00:39<04:33,  2.97it/s, total_it=2898][A
epochs:  15%|▏| 3/20 [17:11<1:32:48, 327.55s/it, loss=0.83, lr=0.00119, d_time=0[A
train:  12%|██▍                | 116/928 [00:39<04:37,  2.93it/s, total_it=2899][A
epochs:  15%|▏| 3/20 [17:11<1:32:48, 327.55s/it, loss=0.871, lr=0.00119, d_time=[A
train:  13%|██▍                | 117/928 [00:39<04:40,  2.89it/s, total_it=2900][A
epochs:  15%|▏| 3/20 [17:12<1:32:48, 327.55s/it, loss=0.774, lr=0.0012, d_time=0[A
train:  13%|██▍                | 118/928 [00:40<04:36,  2.93it/s, total_it=2901][A
epochs:  15%|▏| 3/20 [17:12<1:32:48, 327.55s/it, loss=0.696, lr=0.0012, d_time=0[A
train:  13%|██▍                | 119/928 [00:40<04:34,  2.94it/s, total_it=2

epochs:  15%|▏| 3/20 [17:27<1:32:48, 327.55s/it, loss=0.921, lr=0.00122, d_time=[A
train:  18%|███▎               | 163/928 [00:55<04:15,  2.99it/s, total_it=2946][A
epochs:  15%|▏| 3/20 [17:27<1:32:48, 327.55s/it, loss=0.857, lr=0.00122, d_time=[A
train:  18%|███▎               | 164/928 [00:55<04:22,  2.92it/s, total_it=2947][A
epochs:  15%|▏| 3/20 [17:27<1:32:48, 327.55s/it, loss=0.821, lr=0.00122, d_time=[A
train:  18%|███▍               | 165/928 [00:56<04:22,  2.90it/s, total_it=2948][A
epochs:  15%|▏| 3/20 [17:28<1:32:48, 327.55s/it, loss=0.827, lr=0.00122, d_time=[A
train:  18%|███▍               | 166/928 [00:56<04:16,  2.97it/s, total_it=2949][A
epochs:  15%|▏| 3/20 [17:28<1:32:48, 327.55s/it, loss=0.856, lr=0.00122, d_time=[A
train:  18%|███▍               | 167/928 [00:56<04:12,  3.01it/s, total_it=2950][A
epochs:  15%|▏| 3/20 [17:28<1:32:48, 327.55s/it, loss=0.906, lr=0.00122, d_time=[A
train:  18%|███▍               | 168/928 [00:57<04:31,  2.80it/s, total_it=2

epochs:  15%|▏| 3/20 [17:43<1:32:48, 327.55s/it, loss=0.937, lr=0.00125, d_time=[A
train:  23%|████▎              | 212/928 [01:11<04:15,  2.80it/s, total_it=2995][A
epochs:  15%|▏| 3/20 [17:43<1:32:48, 327.55s/it, loss=0.873, lr=0.00125, d_time=[A
train:  23%|████▎              | 213/928 [01:12<04:07,  2.88it/s, total_it=2996][A
epochs:  15%|▏| 3/20 [17:44<1:32:48, 327.55s/it, loss=0.822, lr=0.00125, d_time=[A
train:  23%|████▍              | 214/928 [01:12<04:22,  2.72it/s, total_it=2997][A
epochs:  15%|▏| 3/20 [17:44<1:32:48, 327.55s/it, loss=0.803, lr=0.00125, d_time=[A
train:  23%|████▍              | 215/928 [01:12<04:13,  2.82it/s, total_it=2998][A
epochs:  15%|▏| 3/20 [17:45<1:32:48, 327.55s/it, loss=0.852, lr=0.00125, d_time=[A
train:  23%|████▍              | 216/928 [01:13<04:21,  2.73it/s, total_it=2999][A
epochs:  15%|▏| 3/20 [17:45<1:32:48, 327.55s/it, loss=0.785, lr=0.00125, d_time=[A
train:  23%|████▍              | 217/928 [01:13<04:13,  2.80it/s, total_it=3

epochs:  15%|▏| 3/20 [17:59<1:32:48, 327.55s/it, loss=0.773, lr=0.00127, d_time=[A
train:  28%|█████▎             | 261/928 [01:27<03:30,  3.17it/s, total_it=3044][A
epochs:  15%|▏| 3/20 [18:00<1:32:48, 327.55s/it, loss=0.913, lr=0.00127, d_time=[A
train:  28%|█████▎             | 262/928 [01:28<03:33,  3.12it/s, total_it=3045][A
epochs:  15%|▏| 3/20 [18:00<1:32:48, 327.55s/it, loss=0.862, lr=0.00127, d_time=[A
train:  28%|█████▍             | 263/928 [01:28<03:32,  3.12it/s, total_it=3046][A
epochs:  15%|▏| 3/20 [18:00<1:32:48, 327.55s/it, loss=0.808, lr=0.00127, d_time=[A
train:  28%|█████▍             | 264/928 [01:28<03:33,  3.11it/s, total_it=3047][A
epochs:  15%|▏| 3/20 [18:01<1:32:48, 327.55s/it, loss=0.818, lr=0.00128, d_time=[A
train:  29%|█████▍             | 265/928 [01:29<03:40,  3.00it/s, total_it=3048][A
epochs:  15%|▏| 3/20 [18:01<1:32:48, 327.55s/it, loss=0.787, lr=0.00128, d_time=[A
train:  29%|█████▍             | 266/928 [01:29<03:31,  3.13it/s, total_it=3

epochs:  15%|▏| 3/20 [18:16<1:32:48, 327.55s/it, loss=0.802, lr=0.0013, d_time=0[A
train:  33%|██████▎            | 310/928 [01:44<03:18,  3.12it/s, total_it=3093][A
epochs:  15%|▏| 3/20 [18:16<1:32:48, 327.55s/it, loss=0.698, lr=0.0013, d_time=0[A
train:  34%|██████▎            | 311/928 [01:44<03:17,  3.12it/s, total_it=3094][A
epochs:  15%|▏| 3/20 [18:16<1:32:48, 327.55s/it, loss=0.721, lr=0.0013, d_time=0[A
train:  34%|██████▍            | 312/928 [01:44<03:19,  3.08it/s, total_it=3095][A
epochs:  15%|▏| 3/20 [18:17<1:32:48, 327.55s/it, loss=0.807, lr=0.0013, d_time=0[A
train:  34%|██████▍            | 313/928 [01:45<03:22,  3.03it/s, total_it=3096][A
epochs:  15%|▏| 3/20 [18:17<1:32:48, 327.55s/it, loss=0.833, lr=0.0013, d_time=0[A
train:  34%|██████▍            | 314/928 [01:45<03:22,  3.04it/s, total_it=3097][A
epochs:  15%|▏| 3/20 [18:17<1:32:48, 327.55s/it, loss=0.849, lr=0.0013, d_time=0[A
train:  34%|██████▍            | 315/928 [01:45<03:21,  3.05it/s, total_it=3

epochs:  15%|▏| 3/20 [18:32<1:32:48, 327.55s/it, loss=0.769, lr=0.00133, d_time=[A
train:  39%|███████▎           | 359/928 [02:00<03:04,  3.09it/s, total_it=3142][A
epochs:  15%|▏| 3/20 [18:32<1:32:48, 327.55s/it, loss=0.742, lr=0.00133, d_time=[A
train:  39%|███████▎           | 360/928 [02:00<03:04,  3.08it/s, total_it=3143][A
epochs:  15%|▏| 3/20 [18:33<1:32:48, 327.55s/it, loss=0.777, lr=0.00133, d_time=[A
train:  39%|███████▍           | 361/928 [02:01<03:02,  3.11it/s, total_it=3144][A
epochs:  15%|▏| 3/20 [18:33<1:32:48, 327.55s/it, loss=0.841, lr=0.00133, d_time=[A
train:  39%|███████▍           | 362/928 [02:01<02:59,  3.16it/s, total_it=3145][A
epochs:  15%|▏| 3/20 [18:33<1:32:48, 327.55s/it, loss=0.697, lr=0.00133, d_time=[A
train:  39%|███████▍           | 363/928 [02:01<02:56,  3.19it/s, total_it=3146][A
epochs:  15%|▏| 3/20 [18:34<1:32:48, 327.55s/it, loss=0.835, lr=0.00133, d_time=[A
train:  39%|███████▍           | 364/928 [02:02<02:54,  3.23it/s, total_it=3

epochs:  15%|▏| 3/20 [18:48<1:32:48, 327.55s/it, loss=0.824, lr=0.00135, d_time=[A
train:  44%|████████▎          | 408/928 [02:16<02:52,  3.02it/s, total_it=3191][A
epochs:  15%|▏| 3/20 [18:49<1:32:48, 327.55s/it, loss=0.785, lr=0.00135, d_time=[A
train:  44%|████████▎          | 409/928 [02:17<02:50,  3.05it/s, total_it=3192][A
epochs:  15%|▏| 3/20 [18:49<1:32:48, 327.55s/it, loss=0.922, lr=0.00136, d_time=[A
train:  44%|████████▍          | 410/928 [02:17<02:59,  2.89it/s, total_it=3193][A
epochs:  15%|▏| 3/20 [18:49<1:32:48, 327.55s/it, loss=0.915, lr=0.00136, d_time=[A
train:  44%|████████▍          | 411/928 [02:17<02:54,  2.97it/s, total_it=3194][A
epochs:  15%|▏| 3/20 [18:50<1:32:48, 327.55s/it, loss=1.05, lr=0.00136, d_time=0[A
train:  44%|████████▍          | 412/928 [02:18<02:51,  3.00it/s, total_it=3195][A
epochs:  15%|▏| 3/20 [18:50<1:32:48, 327.55s/it, loss=0.804, lr=0.00136, d_time=[A
train:  45%|████████▍          | 413/928 [02:18<03:00,  2.85it/s, total_it=3

epochs:  15%|▏| 3/20 [19:05<1:32:48, 327.55s/it, loss=0.782, lr=0.00138, d_time=[A
train:  49%|█████████▎         | 457/928 [02:33<02:31,  3.12it/s, total_it=3240][A
epochs:  15%|▏| 3/20 [19:05<1:32:48, 327.55s/it, loss=0.721, lr=0.00138, d_time=[A
train:  49%|█████████▍         | 458/928 [02:33<02:35,  3.03it/s, total_it=3241][A
epochs:  15%|▏| 3/20 [19:05<1:32:48, 327.55s/it, loss=0.834, lr=0.00138, d_time=[A
train:  49%|█████████▍         | 459/928 [02:33<02:37,  2.98it/s, total_it=3242][A
epochs:  15%|▏| 3/20 [19:06<1:32:48, 327.55s/it, loss=0.767, lr=0.00138, d_time=[A
train:  50%|█████████▍         | 460/928 [02:34<02:39,  2.94it/s, total_it=3243][A
epochs:  15%|▏| 3/20 [19:06<1:32:48, 327.55s/it, loss=0.771, lr=0.00138, d_time=[A
train:  50%|█████████▍         | 461/928 [02:34<02:40,  2.92it/s, total_it=3244][A
epochs:  15%|▏| 3/20 [19:06<1:32:48, 327.55s/it, loss=0.761, lr=0.00138, d_time=[A
train:  50%|█████████▍         | 462/928 [02:34<02:38,  2.94it/s, total_it=3

epochs:  15%|▏| 3/20 [19:21<1:32:48, 327.55s/it, loss=0.705, lr=0.00141, d_time=[A
train:  55%|██████████▎        | 506/928 [02:49<02:26,  2.87it/s, total_it=3289][A
epochs:  15%|▏| 3/20 [19:22<1:32:48, 327.55s/it, loss=0.734, lr=0.00141, d_time=[A
train:  55%|██████████▍        | 507/928 [02:50<02:21,  2.97it/s, total_it=3290][A
epochs:  15%|▏| 3/20 [19:22<1:32:48, 327.55s/it, loss=0.787, lr=0.00141, d_time=[A
train:  55%|██████████▍        | 508/928 [02:50<02:25,  2.89it/s, total_it=3291][A
epochs:  15%|▏| 3/20 [19:22<1:32:48, 327.55s/it, loss=0.792, lr=0.00141, d_time=[A
train:  55%|██████████▍        | 509/928 [02:51<02:22,  2.93it/s, total_it=3292][A
epochs:  15%|▏| 3/20 [19:23<1:32:48, 327.55s/it, loss=0.898, lr=0.00141, d_time=[A
train:  55%|██████████▍        | 510/928 [02:51<02:19,  3.01it/s, total_it=3293][A
epochs:  15%|▏| 3/20 [19:23<1:32:48, 327.55s/it, loss=0.772, lr=0.00141, d_time=[A
train:  55%|██████████▍        | 511/928 [02:51<02:18,  3.02it/s, total_it=3

epochs:  15%|▏| 3/20 [19:38<1:32:48, 327.55s/it, loss=0.765, lr=0.00144, d_time=[A
train:  60%|███████████▎       | 555/928 [03:06<02:02,  3.05it/s, total_it=3338][A
epochs:  15%|▏| 3/20 [19:38<1:32:48, 327.55s/it, loss=0.754, lr=0.00144, d_time=[A
train:  60%|███████████▍       | 556/928 [03:06<02:01,  3.07it/s, total_it=3339][A
epochs:  15%|▏| 3/20 [19:39<1:32:48, 327.55s/it, loss=0.673, lr=0.00144, d_time=[A
train:  60%|███████████▍       | 557/928 [03:07<02:08,  2.88it/s, total_it=3340][A
epochs:  15%|▏| 3/20 [19:39<1:32:48, 327.55s/it, loss=0.924, lr=0.00144, d_time=[A
train:  60%|███████████▍       | 558/928 [03:07<02:08,  2.88it/s, total_it=3341][A
epochs:  15%|▏| 3/20 [19:39<1:32:48, 327.55s/it, loss=0.82, lr=0.00144, d_time=0[A
train:  60%|███████████▍       | 559/928 [03:08<02:09,  2.85it/s, total_it=3342][A
epochs:  15%|▏| 3/20 [19:40<1:32:48, 327.55s/it, loss=0.937, lr=0.00144, d_time=[A
train:  60%|███████████▍       | 560/928 [03:08<02:10,  2.81it/s, total_it=3

epochs:  15%|▏| 3/20 [19:54<1:32:48, 327.55s/it, loss=0.773, lr=0.00146, d_time=[A
train:  65%|████████████▎      | 604/928 [03:22<01:51,  2.91it/s, total_it=3387][A
epochs:  15%|▏| 3/20 [19:55<1:32:48, 327.55s/it, loss=0.768, lr=0.00146, d_time=[A
train:  65%|████████████▍      | 605/928 [03:23<01:50,  2.93it/s, total_it=3388][A
epochs:  15%|▏| 3/20 [19:55<1:32:48, 327.55s/it, loss=0.941, lr=0.00147, d_time=[A
train:  65%|████████████▍      | 606/928 [03:23<01:48,  2.97it/s, total_it=3389][A
epochs:  15%|▏| 3/20 [19:55<1:32:48, 327.55s/it, loss=0.721, lr=0.00147, d_time=[A
train:  65%|████████████▍      | 607/928 [03:23<01:45,  3.05it/s, total_it=3390][A
epochs:  15%|▏| 3/20 [19:56<1:32:48, 327.55s/it, loss=0.841, lr=0.00147, d_time=[A
train:  66%|████████████▍      | 608/928 [03:24<01:52,  2.84it/s, total_it=3391][A
epochs:  15%|▏| 3/20 [19:56<1:32:48, 327.55s/it, loss=0.819, lr=0.00147, d_time=[A
train:  66%|████████████▍      | 609/928 [03:24<01:49,  2.91it/s, total_it=3

epochs:  15%|▏| 3/20 [20:11<1:32:48, 327.55s/it, loss=0.822, lr=0.00149, d_time=[A
train:  70%|█████████████▎     | 653/928 [03:39<01:33,  2.95it/s, total_it=3436][A
epochs:  15%|▏| 3/20 [20:11<1:32:48, 327.55s/it, loss=0.755, lr=0.00149, d_time=[A
train:  70%|█████████████▍     | 654/928 [03:39<01:33,  2.93it/s, total_it=3437][A
epochs:  15%|▏| 3/20 [20:11<1:32:48, 327.55s/it, loss=0.926, lr=0.00149, d_time=[A
train:  71%|█████████████▍     | 655/928 [03:40<01:33,  2.93it/s, total_it=3438][A
epochs:  15%|▏| 3/20 [20:12<1:32:48, 327.55s/it, loss=0.797, lr=0.00149, d_time=[A
train:  71%|█████████████▍     | 656/928 [03:40<01:30,  3.00it/s, total_it=3439][A
epochs:  15%|▏| 3/20 [20:12<1:32:48, 327.55s/it, loss=0.831, lr=0.00149, d_time=[A
train:  71%|█████████████▍     | 657/928 [03:40<01:31,  2.95it/s, total_it=3440][A
epochs:  15%|▏| 3/20 [20:13<1:32:48, 327.55s/it, loss=0.933, lr=0.00149, d_time=[A
train:  71%|█████████████▍     | 658/928 [03:41<01:30,  2.99it/s, total_it=3

epochs:  15%|▏| 3/20 [20:28<1:32:48, 327.55s/it, loss=0.918, lr=0.00152, d_time=[A
train:  76%|██████████████▎    | 702/928 [03:56<01:16,  2.95it/s, total_it=3485][A
epochs:  15%|▏| 3/20 [20:28<1:32:48, 327.55s/it, loss=0.933, lr=0.00152, d_time=[A
train:  76%|██████████████▍    | 703/928 [03:56<01:15,  2.98it/s, total_it=3486][A
epochs:  15%|▏| 3/20 [20:28<1:32:48, 327.55s/it, loss=0.879, lr=0.00152, d_time=[A
train:  76%|██████████████▍    | 704/928 [03:56<01:15,  2.97it/s, total_it=3487][A
epochs:  15%|▏| 3/20 [20:29<1:32:48, 327.55s/it, loss=0.862, lr=0.00152, d_time=[A
train:  76%|██████████████▍    | 705/928 [03:57<01:17,  2.89it/s, total_it=3488][A
epochs:  15%|▏| 3/20 [20:29<1:32:48, 327.55s/it, loss=0.74, lr=0.00152, d_time=0[A
train:  76%|██████████████▍    | 706/928 [03:57<01:17,  2.88it/s, total_it=3489][A
epochs:  15%|▏| 3/20 [20:29<1:32:48, 327.55s/it, loss=0.752, lr=0.00152, d_time=[A
train:  76%|██████████████▍    | 707/928 [03:57<01:14,  2.96it/s, total_it=3

epochs:  15%|▏| 3/20 [20:45<1:32:48, 327.55s/it, loss=0.895, lr=0.00155, d_time=[A
train:  81%|███████████████▍   | 751/928 [04:13<00:58,  3.02it/s, total_it=3534][A
epochs:  15%|▏| 3/20 [20:45<1:32:48, 327.55s/it, loss=0.795, lr=0.00155, d_time=[A
train:  81%|███████████████▍   | 752/928 [04:13<00:57,  3.07it/s, total_it=3535][A
epochs:  15%|▏| 3/20 [20:45<1:32:48, 327.55s/it, loss=0.85, lr=0.00155, d_time=0[A
train:  81%|███████████████▍   | 753/928 [04:13<00:56,  3.10it/s, total_it=3536][A
epochs:  15%|▏| 3/20 [20:45<1:32:48, 327.55s/it, loss=0.719, lr=0.00155, d_time=[A
train:  81%|███████████████▍   | 754/928 [04:14<00:56,  3.07it/s, total_it=3537][A
epochs:  15%|▏| 3/20 [20:46<1:32:48, 327.55s/it, loss=0.832, lr=0.00155, d_time=[A
train:  81%|███████████████▍   | 755/928 [04:14<00:55,  3.10it/s, total_it=3538][A
epochs:  15%|▏| 3/20 [20:46<1:32:48, 327.55s/it, loss=0.823, lr=0.00155, d_time=[A
train:  81%|███████████████▍   | 756/928 [04:14<00:59,  2.88it/s, total_it=3

epochs:  15%|▏| 3/20 [21:01<1:32:48, 327.55s/it, loss=0.768, lr=0.00158, d_time=[A
train:  86%|████████████████▍  | 800/928 [04:29<00:42,  3.04it/s, total_it=3583][A
epochs:  15%|▏| 3/20 [21:01<1:32:48, 327.55s/it, loss=0.79, lr=0.00158, d_time=0[A
train:  86%|████████████████▍  | 801/928 [04:30<00:43,  2.89it/s, total_it=3584][A
epochs:  15%|▏| 3/20 [21:02<1:32:48, 327.55s/it, loss=1.12, lr=0.00158, d_time=0[A
train:  86%|████████████████▍  | 802/928 [04:30<00:43,  2.89it/s, total_it=3585][A
epochs:  15%|▏| 3/20 [21:02<1:32:48, 327.55s/it, loss=0.782, lr=0.00158, d_time=[A
train:  87%|████████████████▍  | 803/928 [04:30<00:42,  2.93it/s, total_it=3586][A
epochs:  15%|▏| 3/20 [21:03<1:32:48, 327.55s/it, loss=0.854, lr=0.00158, d_time=[A
train:  87%|████████████████▍  | 804/928 [04:31<00:41,  2.99it/s, total_it=3587][A
epochs:  15%|▏| 3/20 [21:03<1:32:48, 327.55s/it, loss=0.725, lr=0.00158, d_time=[A
train:  87%|████████████████▍  | 805/928 [04:31<00:42,  2.91it/s, total_it=3

epochs:  15%|▏| 3/20 [21:18<1:32:48, 327.55s/it, loss=0.818, lr=0.0016, d_time=0[A
train:  91%|█████████████████▍ | 849/928 [04:46<00:26,  3.00it/s, total_it=3632][A
epochs:  15%|▏| 3/20 [21:18<1:32:48, 327.55s/it, loss=0.813, lr=0.0016, d_time=0[A
train:  92%|█████████████████▍ | 850/928 [04:46<00:26,  2.97it/s, total_it=3633][A
epochs:  15%|▏| 3/20 [21:18<1:32:48, 327.55s/it, loss=0.742, lr=0.0016, d_time=0[A
train:  92%|█████████████████▍ | 851/928 [04:46<00:25,  3.01it/s, total_it=3634][A
epochs:  15%|▏| 3/20 [21:19<1:32:48, 327.55s/it, loss=0.781, lr=0.00161, d_time=[A
train:  92%|█████████████████▍ | 852/928 [04:47<00:24,  3.06it/s, total_it=3635][A
epochs:  15%|▏| 3/20 [21:19<1:32:48, 327.55s/it, loss=0.746, lr=0.00161, d_time=[A
train:  92%|█████████████████▍ | 853/928 [04:47<00:24,  3.05it/s, total_it=3636][A
epochs:  15%|▏| 3/20 [21:19<1:32:48, 327.55s/it, loss=0.944, lr=0.00161, d_time=[A
train:  92%|█████████████████▍ | 854/928 [04:47<00:23,  3.09it/s, total_it=3

epochs:  15%|▏| 3/20 [21:34<1:32:48, 327.55s/it, loss=0.706, lr=0.00163, d_time=[A
train:  97%|██████████████████▍| 898/928 [05:02<00:10,  2.93it/s, total_it=3681][A
epochs:  15%|▏| 3/20 [21:35<1:32:48, 327.55s/it, loss=0.747, lr=0.00163, d_time=[A
train:  97%|██████████████████▍| 899/928 [05:03<00:09,  2.99it/s, total_it=3682][A
epochs:  15%|▏| 3/20 [21:35<1:32:48, 327.55s/it, loss=0.687, lr=0.00163, d_time=[A
train:  97%|██████████████████▍| 900/928 [05:03<00:09,  2.89it/s, total_it=3683][A
epochs:  15%|▏| 3/20 [21:35<1:32:48, 327.55s/it, loss=0.732, lr=0.00163, d_time=[A
train:  97%|██████████████████▍| 901/928 [05:03<00:09,  2.98it/s, total_it=3684][A
epochs:  15%|▏| 3/20 [21:36<1:32:48, 327.55s/it, loss=0.776, lr=0.00163, d_time=[A
train:  97%|██████████████████▍| 902/928 [05:04<00:09,  2.86it/s, total_it=3685][A
epochs:  15%|▏| 3/20 [21:36<1:32:48, 327.55s/it, loss=0.668, lr=0.00163, d_time=[A
train:  97%|██████████████████▍| 903/928 [05:04<00:08,  2.88it/s, total_it=3

epochs:  20%|▏| 4/20 [21:51<1:25:49, 321.85s/it, loss=0.776, lr=0.00166, d_time=[A
train:   2%|▍                   | 18/928 [00:06<05:06,  2.97it/s, total_it=3729][A
epochs:  20%|▏| 4/20 [21:52<1:25:49, 321.85s/it, loss=0.832, lr=0.00166, d_time=[A
train:   2%|▍                   | 19/928 [00:07<05:01,  3.01it/s, total_it=3730][A
epochs:  20%|▏| 4/20 [21:52<1:25:49, 321.85s/it, loss=0.895, lr=0.00166, d_time=[A
train:   2%|▍                   | 20/928 [00:07<05:05,  2.97it/s, total_it=3731][A
epochs:  20%|▏| 4/20 [21:52<1:25:49, 321.85s/it, loss=0.723, lr=0.00166, d_time=[A
train:   2%|▍                   | 21/928 [00:07<04:58,  3.04it/s, total_it=3732][A
epochs:  20%|▏| 4/20 [21:53<1:25:49, 321.85s/it, loss=0.794, lr=0.00166, d_time=[A
train:   2%|▍                   | 22/928 [00:08<04:59,  3.03it/s, total_it=3733][A
epochs:  20%|▏| 4/20 [21:53<1:25:49, 321.85s/it, loss=0.661, lr=0.00166, d_time=[A
train:   2%|▍                   | 23/928 [00:08<04:58,  3.03it/s, total_it=3

epochs:  20%|▏| 4/20 [22:07<1:25:49, 321.85s/it, loss=0.784, lr=0.00169, d_time=[A
train:   7%|█▍                  | 67/928 [00:22<04:50,  2.96it/s, total_it=3778][A
epochs:  20%|▏| 4/20 [22:08<1:25:49, 321.85s/it, loss=0.82, lr=0.00169, d_time=0[A
train:   7%|█▍                  | 68/928 [00:23<04:47,  2.99it/s, total_it=3779][A
epochs:  20%|▏| 4/20 [22:08<1:25:49, 321.85s/it, loss=0.813, lr=0.00169, d_time=[A
train:   7%|█▍                  | 69/928 [00:23<04:48,  2.98it/s, total_it=3780][A
epochs:  20%|▏| 4/20 [22:08<1:25:49, 321.85s/it, loss=0.747, lr=0.00169, d_time=[A
train:   8%|█▌                  | 70/928 [00:23<04:51,  2.94it/s, total_it=3781][A
epochs:  20%|▏| 4/20 [22:09<1:25:49, 321.85s/it, loss=0.827, lr=0.00169, d_time=[A
train:   8%|█▌                  | 71/928 [00:24<04:48,  2.98it/s, total_it=3782][A
epochs:  20%|▏| 4/20 [22:09<1:25:49, 321.85s/it, loss=0.785, lr=0.00169, d_time=[A
train:   8%|█▌                  | 72/928 [00:24<04:52,  2.92it/s, total_it=3

epochs:  20%|▏| 4/20 [22:24<1:25:49, 321.85s/it, loss=0.842, lr=0.00172, d_time=[A
train:  12%|██▍                | 116/928 [00:39<04:36,  2.93it/s, total_it=3827][A
epochs:  20%|▏| 4/20 [22:24<1:25:49, 321.85s/it, loss=0.745, lr=0.00172, d_time=[A
train:  13%|██▍                | 117/928 [00:39<04:29,  3.01it/s, total_it=3828][A
epochs:  20%|▏| 4/20 [22:25<1:25:49, 321.85s/it, loss=0.833, lr=0.00172, d_time=[A
train:  13%|██▍                | 118/928 [00:40<04:27,  3.03it/s, total_it=3829][A
epochs:  20%|▏| 4/20 [22:25<1:25:49, 321.85s/it, loss=0.743, lr=0.00172, d_time=[A
train:  13%|██▍                | 119/928 [00:40<04:28,  3.01it/s, total_it=3830][A
epochs:  20%|▏| 4/20 [22:25<1:25:49, 321.85s/it, loss=0.699, lr=0.00172, d_time=[A
train:  13%|██▍                | 120/928 [00:40<04:30,  2.98it/s, total_it=3831][A
epochs:  20%|▏| 4/20 [22:26<1:25:49, 321.85s/it, loss=0.612, lr=0.00172, d_time=[A
train:  13%|██▍                | 121/928 [00:41<04:26,  3.03it/s, total_it=3

epochs:  20%|▏| 4/20 [22:41<1:25:49, 321.85s/it, loss=0.797, lr=0.00174, d_time=[A
train:  18%|███▍               | 165/928 [00:56<04:09,  3.06it/s, total_it=3876][A
epochs:  20%|▏| 4/20 [22:41<1:25:49, 321.85s/it, loss=0.867, lr=0.00174, d_time=[A
train:  18%|███▍               | 166/928 [00:56<04:09,  3.06it/s, total_it=3877][A
epochs:  20%|▏| 4/20 [22:41<1:25:49, 321.85s/it, loss=0.677, lr=0.00174, d_time=[A
train:  18%|███▍               | 167/928 [00:56<04:10,  3.04it/s, total_it=3878][A
epochs:  20%|▏| 4/20 [22:42<1:25:49, 321.85s/it, loss=0.848, lr=0.00174, d_time=[A
train:  18%|███▍               | 168/928 [00:57<04:09,  3.04it/s, total_it=3879][A
epochs:  20%|▏| 4/20 [22:42<1:25:49, 321.85s/it, loss=0.771, lr=0.00175, d_time=[A
train:  18%|███▍               | 169/928 [00:57<04:09,  3.04it/s, total_it=3880][A
epochs:  20%|▏| 4/20 [22:42<1:25:49, 321.85s/it, loss=0.664, lr=0.00175, d_time=[A
train:  18%|███▍               | 170/928 [00:57<04:12,  3.01it/s, total_it=3

epochs:  20%|▏| 4/20 [22:57<1:25:49, 321.85s/it, loss=0.843, lr=0.00177, d_time=[A
train:  23%|████▍              | 214/928 [01:12<03:57,  3.00it/s, total_it=3925][A
epochs:  20%|▏| 4/20 [22:57<1:25:49, 321.85s/it, loss=0.83, lr=0.00177, d_time=0[A
train:  23%|████▍              | 215/928 [01:12<03:54,  3.04it/s, total_it=3926][A
epochs:  20%|▏| 4/20 [22:57<1:25:49, 321.85s/it, loss=0.669, lr=0.00177, d_time=[A
train:  23%|████▍              | 216/928 [01:12<03:53,  3.04it/s, total_it=3927][A
epochs:  20%|▏| 4/20 [22:58<1:25:49, 321.85s/it, loss=0.798, lr=0.00177, d_time=[A
train:  23%|████▍              | 217/928 [01:13<03:52,  3.06it/s, total_it=3928][A
epochs:  20%|▏| 4/20 [22:58<1:25:49, 321.85s/it, loss=0.747, lr=0.00177, d_time=[A
train:  23%|████▍              | 218/928 [01:13<03:52,  3.06it/s, total_it=3929][A
epochs:  20%|▏| 4/20 [22:58<1:25:49, 321.85s/it, loss=0.752, lr=0.00177, d_time=[A
train:  24%|████▍              | 219/928 [01:13<03:53,  3.04it/s, total_it=3

epochs:  20%|▏| 4/20 [23:13<1:25:49, 321.85s/it, loss=0.797, lr=0.0018, d_time=0[A
train:  28%|█████▍             | 263/928 [01:28<03:48,  2.91it/s, total_it=3974][A
epochs:  20%|▏| 4/20 [23:14<1:25:49, 321.85s/it, loss=0.711, lr=0.0018, d_time=0[A
train:  28%|█████▍             | 264/928 [01:29<03:43,  2.97it/s, total_it=3975][A
epochs:  20%|▏| 4/20 [23:14<1:25:49, 321.85s/it, loss=0.719, lr=0.0018, d_time=0[A
train:  29%|█████▍             | 265/928 [01:29<03:39,  3.02it/s, total_it=3976][A
epochs:  20%|▏| 4/20 [23:14<1:25:49, 321.85s/it, loss=0.871, lr=0.0018, d_time=0[A
train:  29%|█████▍             | 266/928 [01:29<03:42,  2.97it/s, total_it=3977][A
epochs:  20%|▏| 4/20 [23:15<1:25:49, 321.85s/it, loss=0.788, lr=0.0018, d_time=0[A
train:  29%|█████▍             | 267/928 [01:30<03:38,  3.03it/s, total_it=3978][A
epochs:  20%|▏| 4/20 [23:15<1:25:49, 321.85s/it, loss=0.723, lr=0.0018, d_time=0[A
train:  29%|█████▍             | 268/928 [01:30<03:39,  3.01it/s, total_it=3

epochs:  20%|▏| 4/20 [23:29<1:25:49, 321.85s/it, loss=0.623, lr=0.00183, d_time=[A
train:  34%|██████▍            | 312/928 [01:44<03:24,  3.01it/s, total_it=4023][A
epochs:  20%|▏| 4/20 [23:30<1:25:49, 321.85s/it, loss=0.691, lr=0.00183, d_time=[A
train:  34%|██████▍            | 313/928 [01:45<03:33,  2.89it/s, total_it=4024][A
epochs:  20%|▏| 4/20 [23:30<1:25:49, 321.85s/it, loss=0.696, lr=0.00183, d_time=[A
train:  34%|██████▍            | 314/928 [01:45<03:24,  3.01it/s, total_it=4025][A
epochs:  20%|▏| 4/20 [23:30<1:25:49, 321.85s/it, loss=0.781, lr=0.00183, d_time=[A
train:  34%|██████▍            | 315/928 [01:46<03:40,  2.78it/s, total_it=4026][A
epochs:  20%|▏| 4/20 [23:31<1:25:49, 321.85s/it, loss=0.762, lr=0.00183, d_time=[A
train:  34%|██████▍            | 316/928 [01:46<03:37,  2.81it/s, total_it=4027][A
epochs:  20%|▏| 4/20 [23:31<1:25:49, 321.85s/it, loss=0.776, lr=0.00183, d_time=[A
train:  34%|██████▍            | 317/928 [01:46<03:31,  2.88it/s, total_it=4

epochs:  20%|▏| 4/20 [23:46<1:25:49, 321.85s/it, loss=0.642, lr=0.00185, d_time=[A
train:  39%|███████▍           | 361/928 [02:01<03:05,  3.06it/s, total_it=4072][A
epochs:  20%|▏| 4/20 [23:47<1:25:49, 321.85s/it, loss=0.779, lr=0.00185, d_time=[A
train:  39%|███████▍           | 362/928 [02:02<03:03,  3.08it/s, total_it=4073][A
epochs:  20%|▏| 4/20 [23:47<1:25:49, 321.85s/it, loss=0.675, lr=0.00186, d_time=[A
train:  39%|███████▍           | 363/928 [02:02<03:09,  2.98it/s, total_it=4074][A
epochs:  20%|▏| 4/20 [23:47<1:25:49, 321.85s/it, loss=0.76, lr=0.00186, d_time=0[A
train:  39%|███████▍           | 364/928 [02:02<03:04,  3.06it/s, total_it=4075][A
epochs:  20%|▏| 4/20 [23:48<1:25:49, 321.85s/it, loss=0.671, lr=0.00186, d_time=[A
train:  39%|███████▍           | 365/928 [02:02<03:01,  3.11it/s, total_it=4076][A
epochs:  20%|▏| 4/20 [23:48<1:25:49, 321.85s/it, loss=0.819, lr=0.00186, d_time=[A
train:  39%|███████▍           | 366/928 [02:03<02:57,  3.16it/s, total_it=4

epochs:  20%|▏| 4/20 [24:03<1:25:49, 321.85s/it, loss=0.81, lr=0.00188, d_time=0[A
train:  44%|████████▍          | 410/928 [02:18<02:58,  2.90it/s, total_it=4121][A
epochs:  20%|▏| 4/20 [24:03<1:25:49, 321.85s/it, loss=0.756, lr=0.00188, d_time=[A
train:  44%|████████▍          | 411/928 [02:18<02:54,  2.96it/s, total_it=4122][A
epochs:  20%|▏| 4/20 [24:03<1:25:49, 321.85s/it, loss=0.721, lr=0.00188, d_time=[A
train:  44%|████████▍          | 412/928 [02:18<02:51,  3.01it/s, total_it=4123][A
epochs:  20%|▏| 4/20 [24:04<1:25:49, 321.85s/it, loss=0.967, lr=0.00188, d_time=[A
train:  45%|████████▍          | 413/928 [02:19<02:48,  3.06it/s, total_it=4124][A
epochs:  20%|▏| 4/20 [24:04<1:25:49, 321.85s/it, loss=0.858, lr=0.00188, d_time=[A
train:  45%|████████▍          | 414/928 [02:19<02:48,  3.05it/s, total_it=4125][A
epochs:  20%|▏| 4/20 [24:04<1:25:49, 321.85s/it, loss=0.826, lr=0.00188, d_time=[A
train:  45%|████████▍          | 415/928 [02:19<02:44,  3.12it/s, total_it=4

epochs:  20%|▏| 4/20 [24:19<1:25:49, 321.85s/it, loss=0.803, lr=0.00191, d_time=[A
train:  49%|█████████▍         | 459/928 [02:34<02:31,  3.09it/s, total_it=4170][A
epochs:  20%|▏| 4/20 [24:19<1:25:49, 321.85s/it, loss=0.861, lr=0.00191, d_time=[A
train:  50%|█████████▍         | 460/928 [02:34<02:30,  3.10it/s, total_it=4171][A
epochs:  20%|▏| 4/20 [24:19<1:25:49, 321.85s/it, loss=0.765, lr=0.00191, d_time=[A
train:  50%|█████████▍         | 461/928 [02:34<02:32,  3.07it/s, total_it=4172][A
epochs:  20%|▏| 4/20 [24:20<1:25:49, 321.85s/it, loss=0.662, lr=0.00191, d_time=[A
train:  50%|█████████▍         | 462/928 [02:35<02:32,  3.06it/s, total_it=4173][A
epochs:  20%|▏| 4/20 [24:20<1:25:49, 321.85s/it, loss=0.68, lr=0.00191, d_time=0[A
train:  50%|█████████▍         | 463/928 [02:35<02:28,  3.13it/s, total_it=4174][A
epochs:  20%|▏| 4/20 [24:20<1:25:49, 321.85s/it, loss=0.734, lr=0.00191, d_time=[A
train:  50%|█████████▌         | 464/928 [02:35<02:33,  3.02it/s, total_it=4

epochs:  20%|▏| 4/20 [24:35<1:25:49, 321.85s/it, loss=0.685, lr=0.00194, d_time=
train:  55%|██████████▍        | 508/928 [02:50<02:26,  2.86it/s, total_it=4219]
epochs:  20%|▏| 4/20 [24:52<1:25:49, 321.85s/it, loss=0.708, lr=0.00196, d_time=
train:  60%|███████████▍       | 557/928 [03:07<02:03,  3.01it/s, total_it=4268]
epochs:  20%|▏| 4/20 [25:08<1:25:49, 321.85s/it, loss=0.775, lr=0.00199, d_time=
train:  65%|████████████▍      | 606/928 [03:23<01:47,  2.98it/s, total_it=4317]
epochs:  20%|▏| 4/20 [25:25<1:25:49, 321.85s/it, loss=0.795, lr=0.00202, d_time=
train:  71%|█████████████▍     | 655/928 [03:40<01:29,  3.06it/s, total_it=4366]
epochs:  20%|▏| 4/20 [25:41<1:25:49, 321.85s/it, loss=0.903, lr=0.00205, d_time=
train:  76%|██████████████▍    | 704/928 [03:56<01:17,  2.89it/s, total_it=4415]
epochs:  20%|▏| 4/20 [25:58<1:25:49, 321.85s/it, loss=0.799, lr=0.00207, d_time=
train:  81%|███████████████▍   | 753/928 [04:13<00:58,  2.99it/s, total_it=4464]
epochs:  20%|▏| 4/20 [26:14<1:25:49, 321.85s/it, loss=0.705, lr=0.0021, d_time=0
train:  86%|████████████████▍  | 802/928 [04:29<00:40,  3.08it/s, total_it=4513]
epochs:  20%|▏| 4/20 [26:31<1:25:49, 321.85s/it, loss=1.02, lr=0.00212, d_time=0
train:  92%|█████████████████▍ | 851/928 [04:46<00:27,  2.80it/s, total_it=4562]
epochs:  20%|▏| 4/20 [26:48<1:25:49, 321.85s/it, loss=0.7, lr=0.00215, d_time=0.
train:  97%|██████████████████▍| 900/928 [05:03<00:09,  2.91it/s, total_it=4611]

epochs:  25%|▎| 5/20 [27:05<1:19:37, 318.52s/it, loss=0.713, lr=0.00218, d_time=
train:   2%|▍                   | 20/928 [00:07<05:51,  2.58it/s, total_it=4659]
epochs:  25%|▎| 5/20 [27:22<1:19:37, 318.52s/it, loss=0.841, lr=0.0022, d_time=0
train:   7%|█▍                  | 69/928 [00:24<05:04,  2.82it/s, total_it=4708]
epochs:  25%|▎| 5/20 [27:38<1:19:37, 318.52s/it, loss=0.688, lr=0.00223, d_time=
train:  13%|██▍                | 118/928 [00:41<04:39,  2.90it/s, total_it=4757]
epochs:  25%|▎| 5/20 [27:55<1:19:37, 318.52s/it, loss=0.705, lr=0.00225, d_time=
train:  18%|███▍               | 167/928 [00:57<04:12,  3.01it/s, total_it=4806]
epochs:  25%|▎| 5/20 [28:11<1:19:37, 318.52s/it, loss=0.935, lr=0.00228, d_time=
train:  23%|████▍              | 216/928 [01:14<03:50,  3.09it/s, total_it=4855]
epochs:  25%|▎| 5/20 [28:28<1:19:37, 318.52s/it, loss=0.606, lr=0.0023, d_time=0
train:  29%|█████▍             | 265/928 [01:30<03:34,  3.08it/s, total_it=4904]
epochs:  25%|▎| 5/20 [28:44<1:19:37, 318.52s/it, loss=0.755, lr=0.00233, d_time=
train:  34%|██████▍            | 314/928 [01:46<03:27,  2.96it/s, total_it=4953]
epochs:  25%|▎| 5/20 [29:00<1:19:37, 318.52s/it, loss=0.771, lr=0.00235, d_time=
train:  39%|███████▍           | 363/928 [02:03<03:15,  2.89it/s, total_it=5002]
epochs:  25%|▎| 5/20 [29:17<1:19:37, 318.52s/it, loss=0.914, lr=0.00237, d_time=
train:  44%|████████▍          | 412/928 [02:19<02:48,  3.07it/s, total_it=5051]
epochs:  25%|▎| 5/20 [29:33<1:19:37, 318.52s/it, loss=0.749, lr=0.0024, d_time=0
train:  50%|█████████▍         | 461/928 [02:36<02:35,  3.00it/s, total_it=5100]
epochs:  25%|▎| 5/20 [29:50<1:19:37, 318.52s/it, loss=0.69, lr=0.00242, d_time=0
train:  55%|██████████▍        | 510/928 [02:52<02:18,  3.03it/s, total_it=5149]
epochs:  25%|▎| 5/20 [30:06<1:19:37, 318.52s/it, loss=0.749, lr=0.00244, d_time=
train:  60%|███████████▍       | 559/928 [03:09<01:59,  3.08it/s, total_it=5198]
epochs:  25%|▎| 5/20 [30:23<1:19:37, 318.52s/it, loss=0.716, lr=0.00247, d_time=
train:  66%|████████████▍      | 608/928 [03:26<01:47,  2.97it/s, total_it=5247]
epochs:  25%|▎| 5/20 [30:40<1:19:37, 318.52s/it, loss=0.621, lr=0.00249, d_time=
train:  71%|█████████████▍     | 657/928 [03:42<01:28,  3.06it/s, total_it=5296]
epochs:  25%|▎| 5/20 [30:56<1:19:37, 318.52s/it, loss=0.757, lr=0.00251, d_time=
train:  76%|██████████████▍    | 706/928 [03:58<01:13,  3.04it/s, total_it=5345]
epochs:  25%|▎| 5/20 [31:12<1:19:37, 318.52s/it, loss=0.716, lr=0.00253, d_time=
train:  81%|███████████████▍   | 755/928 [04:15<01:01,  2.80it/s, total_it=5394]
epochs:  25%|▎| 5/20 [31:29<1:19:37, 318.52s/it, loss=0.612, lr=0.00255, d_time=
train:  87%|████████████████▍  | 804/928 [04:32<00:42,  2.94it/s, total_it=5443]
epochs:  25%|▎| 5/20 [31:46<1:19:37, 318.52s/it, loss=0.68, lr=0.00257, d_time=0
train:  92%|█████████████████▍ | 853/928 [04:48<00:25,  2.96it/s, total_it=5492]
epochs:  25%|▎| 5/20 [32:02<1:19:37, 318.52s/it, loss=0.767, lr=0.00259, d_time=
train:  97%|██████████████████▍| 902/928 [05:05<00:08,  2.93it/s, total_it=5541]

epochs:  30%|▎| 6/20 [32:19<1:13:58, 317.02s/it, loss=0.847, lr=0.00261, d_time=
train:   2%|▍                   | 22/928 [00:08<04:49,  3.13it/s, total_it=5589]
epochs:  30%|▎| 6/20 [32:36<1:13:58, 317.02s/it, loss=0.595, lr=0.00263, d_time=
train:   8%|█▌                  | 71/928 [00:24<04:46,  2.99it/s, total_it=5638]
epochs:  30%|▎| 6/20 [32:52<1:13:58, 317.02s/it, loss=0.661, lr=0.00265, d_time=
train:  13%|██▍                | 120/928 [00:41<04:18,  3.13it/s, total_it=5687]
epochs:  30%|▎| 6/20 [33:09<1:13:58, 317.02s/it, loss=0.586, lr=0.00267, d_time=
train:  18%|███▍               | 169/928 [00:57<04:28,  2.83it/s, total_it=5736]
epochs:  30%|▎| 6/20 [33:25<1:13:58, 317.02s/it, loss=0.701, lr=0.00269, d_time=
train:  23%|████▍              | 218/928 [01:14<04:03,  2.92it/s, total_it=5785]
epochs:  30%|▎| 6/20 [33:42<1:13:58, 317.02s/it, loss=0.749, lr=0.00271, d_time=
train:  29%|█████▍             | 267/928 [01:30<03:38,  3.03it/s, total_it=5834]
epochs:  30%|▎| 6/20 [33:58<1:13:58, 317.02s/it, loss=0.8, lr=0.00272, d_time=0.
train:  34%|██████▍            | 316/928 [01:46<03:29,  2.92it/s, total_it=5883]
epochs:  30%|▎| 6/20 [34:14<1:13:58, 317.02s/it, loss=0.682, lr=0.00274, d_time=
train:  39%|███████▍           | 365/928 [02:03<03:06,  3.02it/s, total_it=5932]
epochs:  30%|▎| 6/20 [34:16<1:13:58, 317.02s/it, loss=0.707, lr=0.00274, d_time=[A
train:  40%|███████▌           | 370/928 [02:05<03:14,  2.87it/s, total_it=5

epochs:  30%|▎| 6/20 [34:31<1:13:58, 317.02s/it, loss=0.803, lr=0.00276, d_time=[A
train:  45%|████████▍          | 414/928 [02:19<02:47,  3.06it/s, total_it=5981][A
epochs:  30%|▎| 6/20 [34:31<1:13:58, 317.02s/it, loss=0.723, lr=0.00276, d_time=[A
train:  45%|████████▍          | 415/928 [02:20<02:45,  3.10it/s, total_it=5982][A
epochs:  30%|▎| 6/20 [34:32<1:13:58, 317.02s/it, loss=0.768, lr=0.00276, d_time=[A
train:  45%|████████▌          | 416/928 [02:20<02:47,  3.06it/s, total_it=5983][A
epochs:  30%|▎| 6/20 [34:32<1:13:58, 317.02s/it, loss=0.623, lr=0.00276, d_time=[A
train:  45%|████████▌          | 417/928 [02:20<02:45,  3.09it/s, total_it=5984][A
epochs:  30%|▎| 6/20 [34:32<1:13:58, 317.02s/it, loss=0.772, lr=0.00276, d_time=[A
train:  45%|████████▌          | 418/928 [02:21<02:43,  3.12it/s, total_it=5985][A
epochs:  30%|▎| 6/20 [34:33<1:13:58, 317.02s/it, loss=0.658, lr=0.00276, d_time=[A
train:  45%|████████▌          | 419/928 [02:21<02:41,  3.16it/s, total_it=5

epochs:  30%|▎| 6/20 [34:47<1:13:58, 317.02s/it, loss=0.787, lr=0.00277, d_time=[A
train:  50%|█████████▍         | 463/928 [02:36<02:38,  2.94it/s, total_it=6030][A
epochs:  30%|▎| 6/20 [34:48<1:13:58, 317.02s/it, loss=0.648, lr=0.00277, d_time=[A
train:  50%|█████████▌         | 464/928 [02:36<02:40,  2.89it/s, total_it=6031][A
epochs:  30%|▎| 6/20 [34:48<1:13:58, 317.02s/it, loss=0.636, lr=0.00277, d_time=[A
train:  50%|█████████▌         | 465/928 [02:36<02:35,  2.97it/s, total_it=6032][A
epochs:  30%|▎| 6/20 [34:48<1:13:58, 317.02s/it, loss=0.706, lr=0.00277, d_time=[A
train:  50%|█████████▌         | 466/928 [02:37<02:37,  2.94it/s, total_it=6033][A
epochs:  30%|▎| 6/20 [34:49<1:13:58, 317.02s/it, loss=0.747, lr=0.00277, d_time=[A
train:  50%|█████████▌         | 467/928 [02:37<02:35,  2.96it/s, total_it=6034][A
epochs:  30%|▎| 6/20 [34:49<1:13:58, 317.02s/it, loss=0.912, lr=0.00277, d_time=[A
train:  50%|█████████▌         | 468/928 [02:37<02:37,  2.93it/s, total_it=6

epochs:  30%|▎| 6/20 [35:04<1:13:58, 317.02s/it, loss=0.644, lr=0.00279, d_time=[A
train:  55%|██████████▍        | 512/928 [02:52<02:13,  3.11it/s, total_it=6079][A
epochs:  30%|▎| 6/20 [35:04<1:13:58, 317.02s/it, loss=0.579, lr=0.00279, d_time=[A
train:  55%|██████████▌        | 513/928 [02:52<02:14,  3.08it/s, total_it=6080][A
epochs:  30%|▎| 6/20 [35:04<1:13:58, 317.02s/it, loss=0.675, lr=0.00279, d_time=[A
train:  55%|██████████▌        | 514/928 [02:53<02:12,  3.13it/s, total_it=6081][A
epochs:  30%|▎| 6/20 [35:05<1:13:58, 317.02s/it, loss=0.694, lr=0.00279, d_time=[A
train:  55%|██████████▌        | 515/928 [02:53<02:16,  3.02it/s, total_it=6082][A
epochs:  30%|▎| 6/20 [35:05<1:13:58, 317.02s/it, loss=0.756, lr=0.00279, d_time=[A
train:  56%|██████████▌        | 516/928 [02:53<02:12,  3.11it/s, total_it=6083][A
epochs:  30%|▎| 6/20 [35:05<1:13:58, 317.02s/it, loss=0.687, lr=0.00279, d_time=[A
train:  56%|██████████▌        | 517/928 [02:54<02:16,  3.00it/s, total_it=6

epochs:  30%|▎| 6/20 [35:20<1:13:58, 317.02s/it, loss=0.72, lr=0.0028, d_time=0.[A
train:  60%|███████████▍       | 561/928 [03:08<02:02,  2.99it/s, total_it=6128][A
epochs:  30%|▎| 6/20 [35:20<1:13:58, 317.02s/it, loss=0.773, lr=0.0028, d_time=0[A
train:  61%|███████████▌       | 562/928 [03:09<02:03,  2.97it/s, total_it=6129][A
epochs:  30%|▎| 6/20 [35:21<1:13:58, 317.02s/it, loss=0.743, lr=0.0028, d_time=0[A
train:  61%|███████████▌       | 563/928 [03:09<02:05,  2.92it/s, total_it=6130][A
epochs:  30%|▎| 6/20 [35:21<1:13:58, 317.02s/it, loss=0.632, lr=0.0028, d_time=0[A
train:  61%|███████████▌       | 564/928 [03:09<02:03,  2.96it/s, total_it=6131][A
epochs:  30%|▎| 6/20 [35:21<1:13:58, 317.02s/it, loss=0.593, lr=0.0028, d_time=0[A
train:  61%|███████████▌       | 565/928 [03:10<02:00,  3.01it/s, total_it=6132][A
epochs:  30%|▎| 6/20 [35:22<1:13:58, 317.02s/it, loss=0.65, lr=0.0028, d_time=0.[A
train:  61%|███████████▌       | 566/928 [03:10<02:01,  2.97it/s, total_it=6

epochs:  30%|▎| 6/20 [35:36<1:13:58, 317.02s/it, loss=0.633, lr=0.00282, d_time=[A
train:  66%|████████████▍      | 610/928 [03:25<01:46,  2.98it/s, total_it=6177][A
epochs:  30%|▎| 6/20 [35:37<1:13:58, 317.02s/it, loss=0.819, lr=0.00282, d_time=[A
train:  66%|████████████▌      | 611/928 [03:25<01:48,  2.93it/s, total_it=6178][A
epochs:  30%|▎| 6/20 [35:37<1:13:58, 317.02s/it, loss=0.698, lr=0.00282, d_time=[A
train:  66%|████████████▌      | 612/928 [03:25<01:49,  2.88it/s, total_it=6179][A
epochs:  30%|▎| 6/20 [35:37<1:13:58, 317.02s/it, loss=0.73, lr=0.00282, d_time=0[A
train:  66%|████████████▌      | 613/928 [03:26<01:47,  2.93it/s, total_it=6180][A
epochs:  30%|▎| 6/20 [35:38<1:13:58, 317.02s/it, loss=0.723, lr=0.00282, d_time=[A
train:  66%|████████████▌      | 614/928 [03:26<01:47,  2.92it/s, total_it=6181][A
epochs:  30%|▎| 6/20 [35:38<1:13:58, 317.02s/it, loss=0.854, lr=0.00282, d_time=[A
train:  66%|████████████▌      | 615/928 [03:26<01:44,  2.99it/s, total_it=6

epochs:  30%|▎| 6/20 [35:53<1:13:58, 317.02s/it, loss=0.635, lr=0.00283, d_time=[A
train:  71%|█████████████▍     | 659/928 [03:41<01:32,  2.90it/s, total_it=6226][A
epochs:  30%|▎| 6/20 [35:53<1:13:58, 317.02s/it, loss=0.643, lr=0.00283, d_time=[A
train:  71%|█████████████▌     | 660/928 [03:42<01:37,  2.74it/s, total_it=6227][A
epochs:  30%|▎| 6/20 [35:54<1:13:58, 317.02s/it, loss=0.729, lr=0.00283, d_time=[A
train:  71%|█████████████▌     | 661/928 [03:42<01:32,  2.89it/s, total_it=6228][A
epochs:  30%|▎| 6/20 [35:54<1:13:58, 317.02s/it, loss=0.668, lr=0.00283, d_time=[A
train:  71%|█████████████▌     | 662/928 [03:42<01:28,  2.99it/s, total_it=6229][A
epochs:  30%|▎| 6/20 [35:54<1:13:58, 317.02s/it, loss=0.66, lr=0.00283, d_time=0[A
train:  71%|█████████████▌     | 663/928 [03:43<01:30,  2.94it/s, total_it=6230][A
epochs:  30%|▎| 6/20 [35:55<1:13:58, 317.02s/it, loss=0.643, lr=0.00283, d_time=[A
train:  72%|█████████████▌     | 664/928 [03:43<01:29,  2.94it/s, total_it=6

epochs:  30%|▎| 6/20 [36:10<1:13:58, 317.02s/it, loss=0.62, lr=0.00284, d_time=0[A
train:  76%|██████████████▍    | 708/928 [03:58<01:13,  2.99it/s, total_it=6275][A
epochs:  30%|▎| 6/20 [36:10<1:13:58, 317.02s/it, loss=0.581, lr=0.00284, d_time=[A
train:  76%|██████████████▌    | 709/928 [03:58<01:12,  3.01it/s, total_it=6276][A
epochs:  30%|▎| 6/20 [36:10<1:13:58, 317.02s/it, loss=0.694, lr=0.00284, d_time=[A
train:  77%|██████████████▌    | 710/928 [03:59<01:12,  3.00it/s, total_it=6277][A
epochs:  30%|▎| 6/20 [36:11<1:13:58, 317.02s/it, loss=0.709, lr=0.00284, d_time=[A
train:  77%|██████████████▌    | 711/928 [03:59<01:12,  3.01it/s, total_it=6278][A
epochs:  30%|▎| 6/20 [36:11<1:13:58, 317.02s/it, loss=0.63, lr=0.00284, d_time=0[A
train:  77%|██████████████▌    | 712/928 [03:59<01:11,  3.03it/s, total_it=6279][A
epochs:  30%|▎| 6/20 [36:11<1:13:58, 317.02s/it, loss=0.688, lr=0.00284, d_time=[A
train:  77%|██████████████▌    | 713/928 [04:00<01:12,  2.97it/s, total_it=6

epochs:  30%|▎| 6/20 [36:26<1:13:58, 317.02s/it, loss=0.678, lr=0.00286, d_time=[A
train:  82%|███████████████▍   | 757/928 [04:14<00:54,  3.13it/s, total_it=6324][A
epochs:  30%|▎| 6/20 [36:26<1:13:58, 317.02s/it, loss=0.657, lr=0.00286, d_time=[A
train:  82%|███████████████▌   | 758/928 [04:15<00:54,  3.12it/s, total_it=6325][A
epochs:  30%|▎| 6/20 [36:27<1:13:58, 317.02s/it, loss=0.699, lr=0.00286, d_time=[A
train:  82%|███████████████▌   | 759/928 [04:15<00:55,  3.07it/s, total_it=6326][A
epochs:  30%|▎| 6/20 [36:27<1:13:58, 317.02s/it, loss=0.695, lr=0.00286, d_time=[A
train:  82%|███████████████▌   | 760/928 [04:15<00:54,  3.09it/s, total_it=6327][A
epochs:  30%|▎| 6/20 [36:27<1:13:58, 317.02s/it, loss=0.745, lr=0.00286, d_time=[A
train:  82%|███████████████▌   | 761/928 [04:16<00:54,  3.06it/s, total_it=6328][A
epochs:  30%|▎| 6/20 [36:28<1:13:58, 317.02s/it, loss=0.629, lr=0.00286, d_time=[A
train:  82%|███████████████▌   | 762/928 [04:16<00:52,  3.15it/s, total_it=6

epochs:  30%|▎| 6/20 [36:43<1:13:58, 317.02s/it, loss=0.879, lr=0.00287, d_time=[A
train:  87%|████████████████▌  | 806/928 [04:31<00:39,  3.06it/s, total_it=6373][A
epochs:  30%|▎| 6/20 [36:43<1:13:58, 317.02s/it, loss=1.09, lr=0.00287, d_time=0[A
train:  87%|████████████████▌  | 807/928 [04:31<00:39,  3.03it/s, total_it=6374][A
epochs:  30%|▎| 6/20 [36:43<1:13:58, 317.02s/it, loss=0.958, lr=0.00287, d_time=[A
train:  87%|████████████████▌  | 808/928 [04:32<00:38,  3.08it/s, total_it=6375][A
epochs:  30%|▎| 6/20 [36:44<1:13:58, 317.02s/it, loss=0.924, lr=0.00287, d_time=[A
train:  87%|████████████████▌  | 809/928 [04:32<00:39,  3.03it/s, total_it=6376][A
epochs:  30%|▎| 6/20 [36:44<1:13:58, 317.02s/it, loss=0.755, lr=0.00287, d_time=[A
train:  87%|████████████████▌  | 810/928 [04:32<00:38,  3.05it/s, total_it=6377][A
epochs:  30%|▎| 6/20 [36:44<1:13:58, 317.02s/it, loss=0.835, lr=0.00287, d_time=[A
train:  87%|████████████████▌  | 811/928 [04:33<00:37,  3.12it/s, total_it=6

epochs:  30%|▎| 6/20 [36:59<1:13:58, 317.02s/it, loss=0.685, lr=0.00288, d_time=[A
train:  92%|█████████████████▌ | 855/928 [04:48<00:25,  2.90it/s, total_it=6422][A
epochs:  30%|▎| 6/20 [37:00<1:13:58, 317.02s/it, loss=0.702, lr=0.00288, d_time=[A
train:  92%|█████████████████▌ | 856/928 [04:48<00:24,  2.89it/s, total_it=6423][A
epochs:  30%|▎| 6/20 [37:00<1:13:58, 317.02s/it, loss=0.68, lr=0.00288, d_time=0[A
train:  92%|█████████████████▌ | 857/928 [04:48<00:23,  2.97it/s, total_it=6424][A
epochs:  30%|▎| 6/20 [37:00<1:13:58, 317.02s/it, loss=0.684, lr=0.00288, d_time=[A
train:  92%|█████████████████▌ | 858/928 [04:49<00:23,  2.98it/s, total_it=6425][A
epochs:  30%|▎| 6/20 [37:01<1:13:58, 317.02s/it, loss=0.721, lr=0.00288, d_time=[A
train:  93%|█████████████████▌ | 859/928 [04:49<00:22,  3.08it/s, total_it=6426][A
epochs:  30%|▎| 6/20 [37:01<1:13:58, 317.02s/it, loss=0.853, lr=0.00288, d_time=[A
train:  93%|█████████████████▌ | 860/928 [04:49<00:21,  3.09it/s, total_it=6

epochs:  30%|▎| 6/20 [37:16<1:13:58, 317.02s/it, loss=0.664, lr=0.00289, d_time=[A
train:  97%|██████████████████▌| 904/928 [05:04<00:07,  3.02it/s, total_it=6471][A
epochs:  30%|▎| 6/20 [37:16<1:13:58, 317.02s/it, loss=0.643, lr=0.00289, d_time=[A
train:  98%|██████████████████▌| 905/928 [05:05<00:07,  3.01it/s, total_it=6472][A
epochs:  30%|▎| 6/20 [37:17<1:13:58, 317.02s/it, loss=0.701, lr=0.00289, d_time=[A
train:  98%|██████████████████▌| 906/928 [05:05<00:07,  2.99it/s, total_it=6473][A
epochs:  30%|▎| 6/20 [37:17<1:13:58, 317.02s/it, loss=0.638, lr=0.00289, d_time=[A
train:  98%|██████████████████▌| 907/928 [05:05<00:06,  3.02it/s, total_it=6474][A
epochs:  30%|▎| 6/20 [37:17<1:13:58, 317.02s/it, loss=1.03, lr=0.00289, d_time=0[A
train:  98%|██████████████████▌| 908/928 [05:05<00:06,  3.08it/s, total_it=6475][A
epochs:  30%|▎| 6/20 [37:18<1:13:58, 317.02s/it, loss=0.766, lr=0.00289, d_time=[A
train:  98%|██████████████████▌| 909/928 [05:06<00:06,  3.11it/s, total_it=6

epochs:  35%|▎| 7/20 [37:33<1:08:23, 315.66s/it, loss=0.622, lr=0.0029, d_time=0
train:   3%|▌                   | 24/928 [00:08<05:03,  2.98it/s, total_it=6519]

epochs:  35%|▎| 7/20 [38:39<1:08:23, 315.66s/it, loss=0.617, lr=0.00294, d_time=
train:  24%|████▌              | 220/928 [01:14<04:01,  2.93it/s, total_it=6715]

epochs:  35%|▎| 7/20 [40:01<1:08:23, 315.66s/it, loss=0.539, lr=0.00297, d_time=
train:  50%|█████████▌         | 465/928 [02:37<02:42,  2.86it/s, total_it=6960]

epochs:  35%|▎| 7/20 [41:23<1:08:23, 315.66s/it, loss=0.702, lr=0.00299, d_time=
train:  77%|██████████████▌    | 710/928 [03:59<01:12,  3.00it/s, total_it=7205]

epochs:  35%|▎| 7/20 [41:41<1:08:23, 315.66s/it, loss=2.95, lr=0.003, d_time=0.0
train:  82%|███████████████▌   | 763/928 [04:16<00:56,  2.93it/s, total_it=7258]

epochs:  35%|▎| 7/20 [42:30<1:08:23, 315.66s/it, loss=0.703, lr=0.003, d_time=0.
train:  98%|██████████████████▋| 910/928 [05:06<00:05,  3.03it/s, total_it=7405]

epochs:  40%|▍| 8/20 [42:46<1:02:55, 314.64s/it, loss=0.665, lr=0.003, d_time=0.[A
train:   3%|▌                   | 26/928 [00:09<05:10,  2.90it/s, total_it=7449][A
epochs:  40%|▍| 8/20 [42:46<1:02:55, 314.64s/it, loss=0.682, lr=0.003, d_time=0.[A
train:   3%|▌                   | 27/928 [00:09<05:20,  2.81it/s, total_it=7450][A
epochs:  40%|▍| 8/20 [42:47<1:02:55, 314.64s/it, loss=0.711, lr=0.003, d_time=0.[A
train:   3%|▌                   | 28/928 [00:10<05:12,  2.88it/s, total_it=7451][A
epochs:  40%|▍| 8/20 [42:47<1:02:55, 314.64s/it, loss=0.655, lr=0.003, d_time=0.[A
train:   3%|▋                   | 29/928 [00:10<05:11,  2.89it/s, total_it=7452][A
epochs:  40%|▍| 8/20 [42:47<1:02:55, 314.64s/it, loss=0.742, lr=0.003, d_time=0.[A
train:   3%|▋                   | 30/928 [00:10<05:13,  2.87it/s, total_it=7453][A
epochs:  40%|▍| 8/20 [42:48<1:02:55, 314.64s/it, loss=0.609, lr=0.003, d_time=0.[A
train:   3%|▋                   | 31/928 [00:11<05:03,  2.95it/s, total_it=7

epochs:  40%|▍| 8/20 [43:03<1:02:55, 314.64s/it, loss=0.652, lr=0.003, d_time=0.[A
train:   8%|█▌                  | 75/928 [00:25<04:39,  3.06it/s, total_it=7498][A
epochs:  40%|▍| 8/20 [43:03<1:02:55, 314.64s/it, loss=0.625, lr=0.003, d_time=0.[A
train:   8%|█▋                  | 76/928 [00:26<04:34,  3.10it/s, total_it=7499][A
epochs:  40%|▍| 8/20 [43:03<1:02:55, 314.64s/it, loss=0.743, lr=0.003, d_time=0.[A
train:   8%|█▋                  | 77/928 [00:26<04:29,  3.16it/s, total_it=7500][A
epochs:  40%|▍| 8/20 [43:03<1:02:55, 314.64s/it, loss=0.636, lr=0.003, d_time=0.[A
train:   8%|█▋                  | 78/928 [00:26<04:34,  3.10it/s, total_it=7501][A
epochs:  40%|▍| 8/20 [43:04<1:02:55, 314.64s/it, loss=0.574, lr=0.003, d_time=0.[A
train:   9%|█▋                  | 79/928 [00:27<04:36,  3.07it/s, total_it=7502][A
epochs:  40%|▍| 8/20 [43:04<1:02:55, 314.64s/it, loss=0.699, lr=0.003, d_time=0.[A
train:   9%|█▋                  | 80/928 [00:27<04:33,  3.10it/s, total_it=7

epochs:  40%|▍| 8/20 [43:19<1:02:55, 314.64s/it, loss=0.636, lr=0.003, d_time=0.[A
train:  13%|██▌                | 124/928 [00:42<04:35,  2.91it/s, total_it=7547][A
epochs:  40%|▍| 8/20 [43:19<1:02:55, 314.64s/it, loss=0.548, lr=0.003, d_time=0.[A
train:  13%|██▌                | 125/928 [00:42<04:29,  2.98it/s, total_it=7548][A
epochs:  40%|▍| 8/20 [43:20<1:02:55, 314.64s/it, loss=0.624, lr=0.003, d_time=0.[A
train:  14%|██▌                | 126/928 [00:43<04:28,  2.98it/s, total_it=7549][A
epochs:  40%|▍| 8/20 [43:20<1:02:55, 314.64s/it, loss=0.576, lr=0.003, d_time=0.[A
train:  14%|██▌                | 127/928 [00:43<04:20,  3.08it/s, total_it=7550][A
epochs:  40%|▍| 8/20 [43:20<1:02:55, 314.64s/it, loss=0.543, lr=0.003, d_time=0.[A
train:  14%|██▌                | 128/928 [00:43<04:17,  3.11it/s, total_it=7551][A
epochs:  40%|▍| 8/20 [43:21<1:02:55, 314.64s/it, loss=0.624, lr=0.003, d_time=0.[A
train:  14%|██▋                | 129/928 [00:44<04:28,  2.98it/s, total_it=7

epochs:  40%|▍| 8/20 [43:35<1:02:55, 314.64s/it, loss=0.608, lr=0.003, d_time=0.[A
train:  19%|███▌               | 173/928 [00:58<04:11,  3.01it/s, total_it=7596][A
epochs:  40%|▍| 8/20 [43:36<1:02:55, 314.64s/it, loss=0.626, lr=0.003, d_time=0.[A
train:  19%|███▌               | 174/928 [00:58<04:14,  2.96it/s, total_it=7597][A
epochs:  40%|▍| 8/20 [43:36<1:02:55, 314.64s/it, loss=0.612, lr=0.003, d_time=0.[A
train:  19%|███▌               | 175/928 [00:59<04:12,  2.98it/s, total_it=7598][A
epochs:  40%|▍| 8/20 [43:36<1:02:55, 314.64s/it, loss=0.616, lr=0.003, d_time=0.[A
train:  19%|███▌               | 176/928 [00:59<04:17,  2.92it/s, total_it=7599][A
epochs:  40%|▍| 8/20 [43:37<1:02:55, 314.64s/it, loss=0.525, lr=0.003, d_time=0.[A
train:  19%|███▌               | 177/928 [00:59<04:10,  3.00it/s, total_it=7600][A
epochs:  40%|▍| 8/20 [43:37<1:02:55, 314.64s/it, loss=0.742, lr=0.003, d_time=0.[A
train:  19%|███▋               | 178/928 [01:00<04:07,  3.03it/s, total_it=7

epochs:  40%|▍| 8/20 [43:51<1:02:55, 314.64s/it, loss=0.633, lr=0.003, d_time=0.[A
train:  24%|████▌              | 222/928 [01:14<03:46,  3.12it/s, total_it=7645][A
epochs:  40%|▍| 8/20 [43:52<1:02:55, 314.64s/it, loss=0.474, lr=0.003, d_time=0.[A
train:  24%|████▌              | 223/928 [01:15<03:46,  3.11it/s, total_it=7646][A
epochs:  40%|▍| 8/20 [43:52<1:02:55, 314.64s/it, loss=0.643, lr=0.003, d_time=0.[A
train:  24%|████▌              | 224/928 [01:15<03:54,  3.01it/s, total_it=7647][A
epochs:  40%|▍| 8/20 [43:52<1:02:55, 314.64s/it, loss=0.657, lr=0.003, d_time=0.[A
train:  24%|████▌              | 225/928 [01:15<04:01,  2.91it/s, total_it=7648][A
epochs:  40%|▍| 8/20 [43:53<1:02:55, 314.64s/it, loss=0.642, lr=0.003, d_time=0.[A
train:  24%|████▋              | 226/928 [01:16<04:01,  2.90it/s, total_it=7649][A
epochs:  40%|▍| 8/20 [43:53<1:02:55, 314.64s/it, loss=0.527, lr=0.003, d_time=0.[A
train:  24%|████▋              | 227/928 [01:16<03:57,  2.95it/s, total_it=7

epochs:  40%|▍| 8/20 [44:08<1:02:55, 314.64s/it, loss=0.702, lr=0.003, d_time=0.[A
train:  29%|█████▌             | 271/928 [01:31<03:30,  3.12it/s, total_it=7694][A
epochs:  40%|▍| 8/20 [44:08<1:02:55, 314.64s/it, loss=0.617, lr=0.003, d_time=0.[A
train:  29%|█████▌             | 272/928 [01:31<03:28,  3.14it/s, total_it=7695][A
epochs:  40%|▍| 8/20 [44:09<1:02:55, 314.64s/it, loss=0.617, lr=0.003, d_time=0.[A
train:  29%|█████▌             | 273/928 [01:31<03:34,  3.05it/s, total_it=7696][A
epochs:  40%|▍| 8/20 [44:09<1:02:55, 314.64s/it, loss=0.751, lr=0.003, d_time=0.[A
train:  30%|█████▌             | 274/928 [01:32<03:31,  3.10it/s, total_it=7697][A
epochs:  40%|▍| 8/20 [44:09<1:02:55, 314.64s/it, loss=0.677, lr=0.003, d_time=0.[A
train:  30%|█████▋             | 275/928 [01:32<03:28,  3.13it/s, total_it=7698][A
epochs:  40%|▍| 8/20 [44:10<1:02:55, 314.64s/it, loss=0.739, lr=0.003, d_time=0.[A
train:  30%|█████▋             | 276/928 [01:32<03:32,  3.07it/s, total_it=7

epochs:  40%|▍| 8/20 [44:24<1:02:55, 314.64s/it, loss=0.533, lr=0.00299, d_time=[A
train:  34%|██████▌            | 320/928 [01:47<03:18,  3.07it/s, total_it=7743][A
epochs:  40%|▍| 8/20 [44:24<1:02:55, 314.64s/it, loss=0.558, lr=0.00299, d_time=[A
train:  35%|██████▌            | 321/928 [01:47<03:18,  3.06it/s, total_it=7744][A
epochs:  40%|▍| 8/20 [44:25<1:02:55, 314.64s/it, loss=0.714, lr=0.00299, d_time=[A
train:  35%|██████▌            | 322/928 [01:48<03:17,  3.06it/s, total_it=7745][A
epochs:  40%|▍| 8/20 [44:25<1:02:55, 314.64s/it, loss=0.589, lr=0.00299, d_time=[A
train:  35%|██████▌            | 323/928 [01:48<03:18,  3.05it/s, total_it=7746][A
epochs:  40%|▍| 8/20 [44:25<1:02:55, 314.64s/it, loss=0.707, lr=0.00299, d_time=[A
train:  35%|██████▋            | 324/928 [01:48<03:14,  3.10it/s, total_it=7747][A
epochs:  40%|▍| 8/20 [44:26<1:02:55, 314.64s/it, loss=0.639, lr=0.00299, d_time=[A
train:  35%|██████▋            | 325/928 [01:49<03:17,  3.06it/s, total_it=7

epochs:  40%|▍| 8/20 [44:40<1:02:55, 314.64s/it, loss=0.613, lr=0.00299, d_time=[A
train:  40%|███████▌           | 369/928 [02:03<02:58,  3.14it/s, total_it=7792][A
epochs:  40%|▍| 8/20 [44:41<1:02:55, 314.64s/it, loss=0.575, lr=0.00299, d_time=[A
train:  40%|███████▌           | 370/928 [02:04<03:03,  3.03it/s, total_it=7793][A
epochs:  40%|▍| 8/20 [44:41<1:02:55, 314.64s/it, loss=0.597, lr=0.00299, d_time=[A
train:  40%|███████▌           | 371/928 [02:04<03:10,  2.93it/s, total_it=7794][A
epochs:  40%|▍| 8/20 [44:41<1:02:55, 314.64s/it, loss=0.562, lr=0.00299, d_time=[A
train:  40%|███████▌           | 372/928 [02:04<03:03,  3.04it/s, total_it=7795][A
epochs:  40%|▍| 8/20 [44:42<1:02:55, 314.64s/it, loss=0.622, lr=0.00299, d_time=[A
train:  40%|███████▋           | 373/928 [02:05<02:56,  3.14it/s, total_it=7796][A
epochs:  40%|▍| 8/20 [44:42<1:02:55, 314.64s/it, loss=0.667, lr=0.00299, d_time=[A
train:  40%|███████▋           | 374/928 [02:05<02:56,  3.14it/s, total_it=7

epochs:  40%|▍| 8/20 [44:57<1:02:55, 314.64s/it, loss=0.6, lr=0.00299, d_time=0.[A
train:  45%|████████▌          | 418/928 [02:20<02:47,  3.05it/s, total_it=7841][A
epochs:  40%|▍| 8/20 [44:57<1:02:55, 314.64s/it, loss=0.644, lr=0.00299, d_time=[A
train:  45%|████████▌          | 419/928 [02:20<02:45,  3.08it/s, total_it=7842][A
epochs:  40%|▍| 8/20 [44:57<1:02:55, 314.64s/it, loss=0.689, lr=0.00299, d_time=[A
train:  45%|████████▌          | 420/928 [02:20<02:46,  3.05it/s, total_it=7843][A
epochs:  40%|▍| 8/20 [44:58<1:02:55, 314.64s/it, loss=0.615, lr=0.00299, d_time=[A
train:  45%|████████▌          | 421/928 [02:21<02:50,  2.97it/s, total_it=7844][A
epochs:  40%|▍| 8/20 [44:58<1:02:55, 314.64s/it, loss=0.655, lr=0.00299, d_time=[A
train:  45%|████████▋          | 422/928 [02:21<02:50,  2.97it/s, total_it=7845][A
epochs:  40%|▍| 8/20 [44:58<1:02:55, 314.64s/it, loss=0.723, lr=0.00299, d_time=[A
train:  46%|████████▋          | 423/928 [02:21<02:43,  3.10it/s, total_it=7

epochs:  40%|▍| 8/20 [45:13<1:02:55, 314.64s/it, loss=0.675, lr=0.00299, d_time=[A
train:  50%|█████████▌         | 467/928 [02:36<02:37,  2.94it/s, total_it=7890][A
epochs:  40%|▍| 8/20 [45:13<1:02:55, 314.64s/it, loss=0.682, lr=0.00299, d_time=[A
train:  50%|█████████▌         | 468/928 [02:36<02:31,  3.03it/s, total_it=7891][A
epochs:  40%|▍| 8/20 [45:14<1:02:55, 314.64s/it, loss=0.654, lr=0.00299, d_time=[A
train:  51%|█████████▌         | 469/928 [02:36<02:27,  3.10it/s, total_it=7892][A
epochs:  40%|▍| 8/20 [45:14<1:02:55, 314.64s/it, loss=0.647, lr=0.00299, d_time=[A
train:  51%|█████████▌         | 470/928 [02:37<02:29,  3.07it/s, total_it=7893][A
epochs:  40%|▍| 8/20 [45:14<1:02:55, 314.64s/it, loss=0.736, lr=0.00299, d_time=[A
train:  51%|█████████▋         | 471/928 [02:37<02:26,  3.11it/s, total_it=7894][A
epochs:  40%|▍| 8/20 [45:14<1:02:55, 314.64s/it, loss=0.613, lr=0.00299, d_time=[A
train:  51%|█████████▋         | 472/928 [02:37<02:24,  3.16it/s, total_it=7

epochs:  40%|▍| 8/20 [45:30<1:02:55, 314.64s/it, loss=0.624, lr=0.00298, d_time=[A
train:  56%|██████████▌        | 516/928 [02:53<02:17,  2.99it/s, total_it=7939][A
epochs:  40%|▍| 8/20 [45:30<1:02:55, 314.64s/it, loss=0.759, lr=0.00298, d_time=[A
train:  56%|██████████▌        | 517/928 [02:53<02:16,  3.02it/s, total_it=7940][A
epochs:  40%|▍| 8/20 [45:30<1:02:55, 314.64s/it, loss=0.759, lr=0.00298, d_time=[A
train:  56%|██████████▌        | 518/928 [02:53<02:14,  3.04it/s, total_it=7941][A
epochs:  40%|▍| 8/20 [45:31<1:02:55, 314.64s/it, loss=0.644, lr=0.00298, d_time=[A
train:  56%|██████████▋        | 519/928 [02:54<02:25,  2.80it/s, total_it=7942][A
epochs:  40%|▍| 8/20 [45:31<1:02:55, 314.64s/it, loss=0.613, lr=0.00298, d_time=[A
train:  56%|██████████▋        | 520/928 [02:54<02:22,  2.87it/s, total_it=7943][A
epochs:  40%|▍| 8/20 [45:31<1:02:55, 314.64s/it, loss=0.558, lr=0.00298, d_time=[A
train:  56%|██████████▋        | 521/928 [02:54<02:22,  2.86it/s, total_it=7

epochs:  40%|▍| 8/20 [45:46<1:02:55, 314.64s/it, loss=0.515, lr=0.00298, d_time=[A
train:  61%|███████████▌       | 565/928 [03:09<01:52,  3.22it/s, total_it=7988][A
epochs:  40%|▍| 8/20 [45:46<1:02:55, 314.64s/it, loss=0.595, lr=0.00298, d_time=[A
train:  61%|███████████▌       | 566/928 [03:09<01:53,  3.19it/s, total_it=7989][A
epochs:  40%|▍| 8/20 [45:47<1:02:55, 314.64s/it, loss=0.57, lr=0.00298, d_time=0[A
train:  61%|███████████▌       | 567/928 [03:10<01:55,  3.12it/s, total_it=7990][A
epochs:  40%|▍| 8/20 [45:47<1:02:55, 314.64s/it, loss=0.737, lr=0.00298, d_time=[A
train:  61%|███████████▋       | 568/928 [03:10<01:56,  3.10it/s, total_it=7991][A
epochs:  40%|▍| 8/20 [45:47<1:02:55, 314.64s/it, loss=0.496, lr=0.00298, d_time=[A
train:  61%|███████████▋       | 569/928 [03:10<02:04,  2.89it/s, total_it=7992][A
epochs:  40%|▍| 8/20 [45:48<1:02:55, 314.64s/it, loss=0.575, lr=0.00298, d_time=[A
train:  61%|███████████▋       | 570/928 [03:11<02:05,  2.86it/s, total_it=7

epochs:  40%|▍| 8/20 [46:03<1:02:55, 314.64s/it, loss=0.676, lr=0.00298, d_time=[A
train:  66%|████████████▌      | 614/928 [03:26<01:48,  2.89it/s, total_it=8037][A
epochs:  40%|▍| 8/20 [46:03<1:02:55, 314.64s/it, loss=0.664, lr=0.00298, d_time=[A
train:  66%|████████████▌      | 615/928 [03:26<01:54,  2.74it/s, total_it=8038][A
epochs:  40%|▍| 8/20 [46:03<1:02:55, 314.64s/it, loss=0.627, lr=0.00298, d_time=[A
train:  66%|████████████▌      | 616/928 [03:26<01:56,  2.67it/s, total_it=8039][A
epochs:  40%|▍| 8/20 [46:04<1:02:55, 314.64s/it, loss=0.621, lr=0.00298, d_time=[A
train:  66%|████████████▋      | 617/928 [03:27<01:54,  2.71it/s, total_it=8040][A
epochs:  40%|▍| 8/20 [46:04<1:02:55, 314.64s/it, loss=0.604, lr=0.00298, d_time=[A
train:  67%|████████████▋      | 618/928 [03:27<01:48,  2.86it/s, total_it=8041][A
epochs:  40%|▍| 8/20 [46:04<1:02:55, 314.64s/it, loss=0.642, lr=0.00298, d_time=[A
train:  67%|████████████▋      | 619/928 [03:27<01:49,  2.82it/s, total_it=8

epochs:  40%|▍| 8/20 [46:19<1:02:55, 314.64s/it, loss=0.578, lr=0.00297, d_time=[A
train:  71%|█████████████▌     | 663/928 [03:42<01:26,  3.06it/s, total_it=8086][A
epochs:  40%|▍| 8/20 [46:20<1:02:55, 314.64s/it, loss=0.607, lr=0.00297, d_time=[A
train:  72%|█████████████▌     | 664/928 [03:43<01:29,  2.96it/s, total_it=8087][A
epochs:  40%|▍| 8/20 [46:20<1:02:55, 314.64s/it, loss=0.679, lr=0.00297, d_time=[A
train:  72%|█████████████▌     | 665/928 [03:43<01:27,  2.99it/s, total_it=8088][A
epochs:  40%|▍| 8/20 [46:20<1:02:55, 314.64s/it, loss=0.621, lr=0.00297, d_time=[A
train:  72%|█████████████▋     | 666/928 [03:43<01:25,  3.05it/s, total_it=8089][A
epochs:  40%|▍| 8/20 [46:21<1:02:55, 314.64s/it, loss=0.527, lr=0.00297, d_time=[A
train:  72%|█████████████▋     | 667/928 [03:44<01:23,  3.13it/s, total_it=8090][A
epochs:  40%|▍| 8/20 [46:21<1:02:55, 314.64s/it, loss=0.598, lr=0.00297, d_time=[A
train:  72%|█████████████▋     | 668/928 [03:44<01:24,  3.06it/s, total_it=8

epochs:  40%|▍| 8/20 [46:36<1:02:55, 314.64s/it, loss=0.582, lr=0.00297, d_time=[A
train:  77%|██████████████▌    | 712/928 [03:59<01:15,  2.85it/s, total_it=8135][A
epochs:  40%|▍| 8/20 [46:37<1:02:55, 314.64s/it, loss=0.908, lr=0.00297, d_time=[A
train:  77%|██████████████▌    | 713/928 [04:00<01:15,  2.83it/s, total_it=8136][A
epochs:  40%|▍| 8/20 [46:37<1:02:55, 314.64s/it, loss=0.674, lr=0.00297, d_time=[A
train:  77%|██████████████▌    | 714/928 [04:00<01:13,  2.90it/s, total_it=8137][A
epochs:  40%|▍| 8/20 [46:37<1:02:55, 314.64s/it, loss=0.632, lr=0.00297, d_time=[A
train:  77%|██████████████▋    | 715/928 [04:00<01:12,  2.92it/s, total_it=8138][A
epochs:  40%|▍| 8/20 [46:38<1:02:55, 314.64s/it, loss=0.583, lr=0.00297, d_time=[A
train:  77%|██████████████▋    | 716/928 [04:01<01:10,  3.03it/s, total_it=8139][A
epochs:  40%|▍| 8/20 [46:38<1:02:55, 314.64s/it, loss=0.7, lr=0.00297, d_time=0.[A
train:  77%|██████████████▋    | 717/928 [04:01<01:09,  3.04it/s, total_it=8

epochs:  40%|▍| 8/20 [46:52<1:02:55, 314.64s/it, loss=0.583, lr=0.00297, d_time=[A
train:  82%|███████████████▌   | 761/928 [04:15<00:54,  3.09it/s, total_it=8184][A
epochs:  40%|▍| 8/20 [46:53<1:02:55, 314.64s/it, loss=0.621, lr=0.00297, d_time=[A
train:  82%|███████████████▌   | 762/928 [04:16<00:53,  3.12it/s, total_it=8185][A
epochs:  40%|▍| 8/20 [46:53<1:02:55, 314.64s/it, loss=0.778, lr=0.00297, d_time=[A
train:  82%|███████████████▌   | 763/928 [04:16<00:52,  3.14it/s, total_it=8186][A
epochs:  40%|▍| 8/20 [46:53<1:02:55, 314.64s/it, loss=0.646, lr=0.00297, d_time=[A
train:  82%|███████████████▋   | 764/928 [04:16<00:53,  3.05it/s, total_it=8187][A
epochs:  40%|▍| 8/20 [46:54<1:02:55, 314.64s/it, loss=0.583, lr=0.00297, d_time=[A
train:  82%|███████████████▋   | 765/928 [04:17<00:54,  3.02it/s, total_it=8188][A
epochs:  40%|▍| 8/20 [46:54<1:02:55, 314.64s/it, loss=0.553, lr=0.00297, d_time=[A
train:  83%|███████████████▋   | 766/928 [04:17<00:52,  3.10it/s, total_it=8

epochs:  40%|▍| 8/20 [47:09<1:02:55, 314.64s/it, loss=0.673, lr=0.00296, d_time=[A
train:  87%|████████████████▌  | 810/928 [04:31<00:39,  2.99it/s, total_it=8233][A
epochs:  40%|▍| 8/20 [47:09<1:02:55, 314.64s/it, loss=0.675, lr=0.00296, d_time=[A
train:  87%|████████████████▌  | 811/928 [04:32<00:39,  2.99it/s, total_it=8234][A
epochs:  40%|▍| 8/20 [47:09<1:02:55, 314.64s/it, loss=0.646, lr=0.00296, d_time=[A
train:  88%|████████████████▋  | 812/928 [04:32<00:37,  3.05it/s, total_it=8235][A
epochs:  40%|▍| 8/20 [47:10<1:02:55, 314.64s/it, loss=0.599, lr=0.00296, d_time=[A
train:  88%|████████████████▋  | 813/928 [04:32<00:38,  2.99it/s, total_it=8236][A
epochs:  40%|▍| 8/20 [47:10<1:02:55, 314.64s/it, loss=0.783, lr=0.00296, d_time=[A
train:  88%|████████████████▋  | 814/928 [04:33<00:37,  3.04it/s, total_it=8237][A
epochs:  40%|▍| 8/20 [47:10<1:02:55, 314.64s/it, loss=0.663, lr=0.00296, d_time=[A
train:  88%|████████████████▋  | 815/928 [04:33<00:36,  3.10it/s, total_it=8

epochs:  40%|▍| 8/20 [47:25<1:02:55, 314.64s/it, loss=0.59, lr=0.00296, d_time=0[A
train:  93%|█████████████████▌ | 859/928 [04:48<00:23,  2.88it/s, total_it=8282][A
epochs:  40%|▍| 8/20 [47:25<1:02:55, 314.64s/it, loss=0.513, lr=0.00296, d_time=[A
train:  93%|█████████████████▌ | 860/928 [04:48<00:23,  2.89it/s, total_it=8283][A
epochs:  40%|▍| 8/20 [47:26<1:02:55, 314.64s/it, loss=0.555, lr=0.00296, d_time=[A
train:  93%|█████████████████▋ | 861/928 [04:49<00:22,  3.02it/s, total_it=8284][A
epochs:  40%|▍| 8/20 [47:26<1:02:55, 314.64s/it, loss=0.835, lr=0.00296, d_time=[A
train:  93%|█████████████████▋ | 862/928 [04:49<00:21,  3.07it/s, total_it=8285][A
epochs:  40%|▍| 8/20 [47:26<1:02:55, 314.64s/it, loss=0.719, lr=0.00296, d_time=[A
train:  93%|█████████████████▋ | 863/928 [04:49<00:21,  3.02it/s, total_it=8286][A
epochs:  40%|▍| 8/20 [47:27<1:02:55, 314.64s/it, loss=0.592, lr=0.00296, d_time=[A
train:  93%|█████████████████▋ | 864/928 [04:50<00:22,  2.80it/s, total_it=8

epochs:  40%|▍| 8/20 [47:42<1:02:55, 314.64s/it, loss=0.594, lr=0.00295, d_time=[A
train:  98%|██████████████████▌| 908/928 [05:05<00:07,  2.80it/s, total_it=8331][A
epochs:  40%|▍| 8/20 [47:42<1:02:55, 314.64s/it, loss=0.512, lr=0.00295, d_time=[A
train:  98%|██████████████████▌| 909/928 [05:05<00:06,  2.86it/s, total_it=8332][A
epochs:  40%|▍| 8/20 [47:42<1:02:55, 314.64s/it, loss=0.63, lr=0.00295, d_time=0[A
train:  98%|██████████████████▋| 910/928 [05:05<00:06,  2.91it/s, total_it=8333][A
epochs:  40%|▍| 8/20 [47:43<1:02:55, 314.64s/it, loss=0.688, lr=0.00295, d_time=[A
train:  98%|██████████████████▋| 911/928 [05:06<00:05,  2.91it/s, total_it=8334][A
epochs:  40%|▍| 8/20 [47:43<1:02:55, 314.64s/it, loss=0.691, lr=0.00295, d_time=[A
train:  98%|██████████████████▋| 912/928 [05:06<00:05,  2.89it/s, total_it=8335][A
epochs:  40%|▍| 8/20 [47:43<1:02:55, 314.64s/it, loss=0.575, lr=0.00295, d_time=[A
train:  98%|██████████████████▋| 913/928 [05:06<00:05,  2.98it/s, total_it=8

epochs:  45%|▍| 9/20 [47:59<57:32, 313.84s/it, loss=0.569, lr=0.00295, d_time=0.[A
train:   3%|▌                   | 28/928 [00:10<05:07,  2.93it/s, total_it=8379][A
epochs:  45%|▍| 9/20 [47:59<57:32, 313.84s/it, loss=0.484, lr=0.00295, d_time=0.[A
train:   3%|▋                   | 29/928 [00:10<05:00,  2.99it/s, total_it=8380][A
epochs:  45%|▍| 9/20 [47:59<57:32, 313.84s/it, loss=0.823, lr=0.00295, d_time=0.[A
train:   3%|▋                   | 30/928 [00:10<04:55,  3.04it/s, total_it=8381][A
epochs:  45%|▍| 9/20 [48:00<57:32, 313.84s/it, loss=0.637, lr=0.00295, d_time=0.[A
train:   3%|▋                   | 31/928 [00:11<04:52,  3.07it/s, total_it=8382][A
epochs:  45%|▍| 9/20 [48:00<57:32, 313.84s/it, loss=0.788, lr=0.00295, d_time=0.[A
train:   3%|▋                   | 32/928 [00:11<05:07,  2.91it/s, total_it=8383][A
epochs:  45%|▍| 9/20 [48:00<57:32, 313.84s/it, loss=0.578, lr=0.00295, d_time=0.[A
train:   4%|▋                   | 33/928 [00:11<05:01,  2.97it/s, total_it=8

epochs:  45%|▍| 9/20 [48:16<57:32, 313.84s/it, loss=0.641, lr=0.00294, d_time=0.[A
train:   8%|█▋                  | 77/928 [00:26<04:39,  3.05it/s, total_it=8428][A
epochs:  45%|▍| 9/20 [48:16<57:32, 313.84s/it, loss=0.537, lr=0.00294, d_time=0.[A
train:   8%|█▋                  | 78/928 [00:27<04:36,  3.08it/s, total_it=8429][A
epochs:  45%|▍| 9/20 [48:16<57:32, 313.84s/it, loss=0.497, lr=0.00294, d_time=0.[A
train:   9%|█▋                  | 79/928 [00:27<04:38,  3.05it/s, total_it=8430][A
epochs:  45%|▍| 9/20 [48:17<57:32, 313.84s/it, loss=0.567, lr=0.00294, d_time=0.[A
train:   9%|█▋                  | 80/928 [00:27<04:41,  3.01it/s, total_it=8431][A
epochs:  45%|▍| 9/20 [48:17<57:32, 313.84s/it, loss=0.629, lr=0.00294, d_time=0.[A
train:   9%|█▋                  | 81/928 [00:28<04:40,  3.02it/s, total_it=8432][A
epochs:  45%|▍| 9/20 [48:17<57:32, 313.84s/it, loss=0.639, lr=0.00294, d_time=0.[A
train:   9%|█▊                  | 82/928 [00:28<04:41,  3.00it/s, total_it=8

epochs:  45%|▍| 9/20 [48:32<57:32, 313.84s/it, loss=0.64, lr=0.00293, d_time=0.0[A
train:  14%|██▌                | 126/928 [00:43<04:28,  2.99it/s, total_it=8477][A
epochs:  45%|▍| 9/20 [48:32<57:32, 313.84s/it, loss=0.573, lr=0.00293, d_time=0.[A
train:  14%|██▌                | 127/928 [00:43<04:27,  2.99it/s, total_it=8478][A
epochs:  45%|▍| 9/20 [48:33<57:32, 313.84s/it, loss=0.545, lr=0.00293, d_time=0.[A
train:  14%|██▌                | 128/928 [00:43<04:31,  2.95it/s, total_it=8479][A
epochs:  45%|▍| 9/20 [48:33<57:32, 313.84s/it, loss=0.507, lr=0.00293, d_time=0.[A
train:  14%|██▋                | 129/928 [00:44<04:49,  2.76it/s, total_it=8480][A
epochs:  45%|▍| 9/20 [48:33<57:32, 313.84s/it, loss=0.65, lr=0.00293, d_time=0.0[A
train:  14%|██▋                | 130/928 [00:44<04:41,  2.83it/s, total_it=8481][A
epochs:  45%|▍| 9/20 [48:34<57:32, 313.84s/it, loss=0.607, lr=0.00293, d_time=0.[A
train:  14%|██▋                | 131/928 [00:44<04:26,  2.99it/s, total_it=8

epochs:  45%|▍| 9/20 [48:48<57:32, 313.84s/it, loss=0.567, lr=0.00293, d_time=0.[A
train:  19%|███▌               | 175/928 [00:59<04:27,  2.82it/s, total_it=8526][A
epochs:  45%|▍| 9/20 [48:49<57:32, 313.84s/it, loss=0.56, lr=0.00293, d_time=0.0[A
train:  19%|███▌               | 176/928 [01:00<04:17,  2.92it/s, total_it=8527][A
epochs:  45%|▍| 9/20 [48:49<57:32, 313.84s/it, loss=0.569, lr=0.00293, d_time=0.[A
train:  19%|███▌               | 177/928 [01:00<04:11,  2.98it/s, total_it=8528][A
epochs:  45%|▍| 9/20 [48:49<57:32, 313.84s/it, loss=0.666, lr=0.00293, d_time=0.[A
train:  19%|███▋               | 178/928 [01:00<04:14,  2.95it/s, total_it=8529][A
epochs:  45%|▍| 9/20 [48:50<57:32, 313.84s/it, loss=0.625, lr=0.00293, d_time=0.[A
train:  19%|███▋               | 179/928 [01:01<04:09,  3.00it/s, total_it=8530][A
epochs:  45%|▍| 9/20 [48:50<57:32, 313.84s/it, loss=0.595, lr=0.00293, d_time=0.[A
train:  19%|███▋               | 180/928 [01:01<04:23,  2.84it/s, total_it=8

epochs:  45%|▍| 9/20 [49:05<57:32, 313.84s/it, loss=0.548, lr=0.00292, d_time=0.[A
train:  24%|████▌              | 224/928 [01:16<04:01,  2.91it/s, total_it=8575][A
epochs:  45%|▍| 9/20 [49:05<57:32, 313.84s/it, loss=0.819, lr=0.00292, d_time=0.[A
train:  24%|████▌              | 225/928 [01:16<03:53,  3.01it/s, total_it=8576][A
epochs:  45%|▍| 9/20 [49:06<57:32, 313.84s/it, loss=0.591, lr=0.00292, d_time=0.[A
train:  24%|████▋              | 226/928 [01:16<03:46,  3.09it/s, total_it=8577][A
epochs:  45%|▍| 9/20 [49:06<57:32, 313.84s/it, loss=0.621, lr=0.00292, d_time=0.[A
train:  24%|████▋              | 227/928 [01:17<03:46,  3.10it/s, total_it=8578][A
epochs:  45%|▍| 9/20 [49:06<57:32, 313.84s/it, loss=0.641, lr=0.00292, d_time=0.[A
train:  25%|████▋              | 228/928 [01:17<03:59,  2.92it/s, total_it=8579][A
epochs:  45%|▍| 9/20 [49:07<57:32, 313.84s/it, loss=0.564, lr=0.00292, d_time=0.[A
train:  25%|████▋              | 229/928 [01:18<04:06,  2.84it/s, total_it=8

epochs:  45%|▍| 9/20 [49:21<57:32, 313.84s/it, loss=0.589, lr=0.00292, d_time=0.[A
train:  29%|█████▌             | 273/928 [01:32<03:43,  2.93it/s, total_it=8624][A
epochs:  45%|▍| 9/20 [49:22<57:32, 313.84s/it, loss=0.605, lr=0.00291, d_time=0.[A
train:  30%|█████▌             | 274/928 [01:32<03:39,  2.98it/s, total_it=8625][A
epochs:  45%|▍| 9/20 [49:22<57:32, 313.84s/it, loss=0.62, lr=0.00291, d_time=0.0[A
train:  30%|█████▋             | 275/928 [01:33<03:50,  2.83it/s, total_it=8626][A
epochs:  45%|▍| 9/20 [49:22<57:32, 313.84s/it, loss=0.581, lr=0.00291, d_time=0.[A
train:  30%|█████▋             | 276/928 [01:33<03:59,  2.73it/s, total_it=8627][A
epochs:  45%|▍| 9/20 [49:23<57:32, 313.84s/it, loss=0.681, lr=0.00291, d_time=0.[A
train:  30%|█████▋             | 277/928 [01:34<03:53,  2.78it/s, total_it=8628][A
epochs:  45%|▍| 9/20 [49:23<57:32, 313.84s/it, loss=0.571, lr=0.00291, d_time=0.[A
train:  30%|█████▋             | 278/928 [01:34<03:48,  2.85it/s, total_it=8

epochs:  45%|▍| 9/20 [49:38<57:32, 313.84s/it, loss=0.593, lr=0.00291, d_time=0.[A
train:  35%|██████▌            | 322/928 [01:49<03:32,  2.85it/s, total_it=8673][A
epochs:  45%|▍| 9/20 [49:39<57:32, 313.84s/it, loss=0.542, lr=0.00291, d_time=0.[A
train:  35%|██████▌            | 323/928 [01:49<03:26,  2.92it/s, total_it=8674][A
epochs:  45%|▍| 9/20 [49:39<57:32, 313.84s/it, loss=0.65, lr=0.00291, d_time=0.0[A
train:  35%|██████▋            | 324/928 [01:50<03:18,  3.04it/s, total_it=8675][A
epochs:  45%|▍| 9/20 [49:39<57:32, 313.84s/it, loss=0.671, lr=0.00291, d_time=0.[A
train:  35%|██████▋            | 325/928 [01:50<03:22,  2.97it/s, total_it=8676][A
epochs:  45%|▍| 9/20 [49:40<57:32, 313.84s/it, loss=0.623, lr=0.00291, d_time=0.[A
train:  35%|██████▋            | 326/928 [01:50<03:18,  3.04it/s, total_it=8677][A
epochs:  45%|▍| 9/20 [49:40<57:32, 313.84s/it, loss=0.517, lr=0.00291, d_time=0.[A
train:  35%|██████▋            | 327/928 [01:51<03:21,  2.99it/s, total_it=8

epochs:  45%|▍| 9/20 [49:54<57:32, 313.84s/it, loss=0.489, lr=0.0029, d_time=0.0[A
train:  40%|███████▌           | 371/928 [02:05<03:02,  3.05it/s, total_it=8722][A
epochs:  45%|▍| 9/20 [49:55<57:32, 313.84s/it, loss=0.667, lr=0.0029, d_time=0.0[A
train:  40%|███████▌           | 372/928 [02:06<03:02,  3.04it/s, total_it=8723][A
epochs:  45%|▍| 9/20 [49:55<57:32, 313.84s/it, loss=0.494, lr=0.0029, d_time=0.0[A
train:  40%|███████▋           | 373/928 [02:06<03:06,  2.97it/s, total_it=8724][A
epochs:  45%|▍| 9/20 [49:55<57:32, 313.84s/it, loss=0.552, lr=0.0029, d_time=0.0[A
train:  40%|███████▋           | 374/928 [02:06<03:15,  2.83it/s, total_it=8725][A
epochs:  45%|▍| 9/20 [49:56<57:32, 313.84s/it, loss=0.646, lr=0.0029, d_time=0.0[A
train:  40%|███████▋           | 375/928 [02:07<03:16,  2.81it/s, total_it=8726][A
epochs:  45%|▍| 9/20 [49:56<57:32, 313.84s/it, loss=0.487, lr=0.0029, d_time=0.0[A
train:  41%|███████▋           | 376/928 [02:07<03:10,  2.89it/s, total_it=8

epochs:  45%|▍| 9/20 [50:11<57:32, 313.84s/it, loss=0.557, lr=0.00289, d_time=0.[A
train:  45%|████████▌          | 420/928 [02:22<02:49,  3.00it/s, total_it=8771][A
epochs:  45%|▍| 9/20 [50:11<57:32, 313.84s/it, loss=0.591, lr=0.00289, d_time=0.[A
train:  45%|████████▌          | 421/928 [02:22<02:45,  3.06it/s, total_it=8772][A
epochs:  45%|▍| 9/20 [50:12<57:32, 313.84s/it, loss=0.56, lr=0.00289, d_time=0.0[A
train:  45%|████████▋          | 422/928 [02:22<02:45,  3.06it/s, total_it=8773][A
epochs:  45%|▍| 9/20 [50:12<57:32, 313.84s/it, loss=0.506, lr=0.00289, d_time=0.[A
train:  46%|████████▋          | 423/928 [02:23<02:46,  3.04it/s, total_it=8774][A
epochs:  45%|▍| 9/20 [50:12<57:32, 313.84s/it, loss=0.576, lr=0.00289, d_time=0.[A
train:  46%|████████▋          | 424/928 [02:23<02:41,  3.12it/s, total_it=8775][A
epochs:  45%|▍| 9/20 [50:13<57:32, 313.84s/it, loss=0.56, lr=0.00289, d_time=0.0[A
train:  46%|████████▋          | 425/928 [02:23<02:40,  3.13it/s, total_it=8

epochs:  45%|▍| 9/20 [50:28<57:32, 313.84s/it, loss=0.637, lr=0.00289, d_time=0.[A
train:  51%|█████████▌         | 469/928 [02:38<02:36,  2.93it/s, total_it=8820][A
epochs:  45%|▍| 9/20 [50:28<57:32, 313.84s/it, loss=0.598, lr=0.00289, d_time=0.[A
train:  51%|█████████▌         | 470/928 [02:39<02:35,  2.94it/s, total_it=8821][A
epochs:  45%|▍| 9/20 [50:28<57:32, 313.84s/it, loss=0.525, lr=0.00289, d_time=0.[A
train:  51%|█████████▋         | 471/928 [02:39<02:31,  3.02it/s, total_it=8822][A
epochs:  45%|▍| 9/20 [50:29<57:32, 313.84s/it, loss=0.611, lr=0.00288, d_time=0.[A
train:  51%|█████████▋         | 472/928 [02:39<02:26,  3.12it/s, total_it=8823][A
epochs:  45%|▍| 9/20 [50:29<57:32, 313.84s/it, loss=0.589, lr=0.00288, d_time=0.[A
train:  51%|█████████▋         | 473/928 [02:40<02:33,  2.96it/s, total_it=8824][A
epochs:  45%|▍| 9/20 [50:29<57:32, 313.84s/it, loss=0.586, lr=0.00288, d_time=0.[A
train:  51%|█████████▋         | 474/928 [02:40<02:34,  2.95it/s, total_it=8

epochs:  45%|▍| 9/20 [50:44<57:32, 313.84s/it, loss=0.63, lr=0.00288, d_time=0.0[A
train:  56%|██████████▌        | 518/928 [02:55<02:13,  3.06it/s, total_it=8869][A
epochs:  45%|▍| 9/20 [50:44<57:32, 313.84s/it, loss=0.708, lr=0.00288, d_time=0.[A
train:  56%|██████████▋        | 519/928 [02:55<02:14,  3.05it/s, total_it=8870][A
epochs:  45%|▍| 9/20 [50:45<57:32, 313.84s/it, loss=0.52, lr=0.00288, d_time=0.0[A
train:  56%|██████████▋        | 520/928 [02:55<02:11,  3.11it/s, total_it=8871][A
epochs:  45%|▍| 9/20 [50:45<57:32, 313.84s/it, loss=0.533, lr=0.00288, d_time=0.[A
train:  56%|██████████▋        | 521/928 [02:56<02:18,  2.93it/s, total_it=8872][A
epochs:  45%|▍| 9/20 [50:45<57:32, 313.84s/it, loss=0.508, lr=0.00288, d_time=0.[A
train:  56%|██████████▋        | 522/928 [02:56<02:15,  3.00it/s, total_it=8873][A
epochs:  45%|▍| 9/20 [51:01<57:32, 313.84s/it, loss=0.626, lr=0.00287, d_time=0.
train:  61%|███████████▌       | 567/928 [03:12<02:01,  2.97it/s, total_it=8918]
epochs:  45%|▍| 9/20 [51:17<57:32, 313.84s/it, loss=0.697, lr=0.00286, d_time=0.
train:  66%|████████████▌      | 616/928 [03:28<01:41,  3.07it/s, total_it=8967]
epochs:  45%|▍| 9/20 [51:33<57:32, 313.84s/it, loss=0.558, lr=0.00285, d_time=0.
train:  72%|█████████████▌     | 665/928 [03:44<01:29,  2.95it/s, total_it=9016]
epochs:  45%|▍| 9/20 [51:50<57:32, 313.84s/it, loss=0.699, lr=0.00284, d_time=0.
train:  77%|██████████████▌    | 714/928 [04:00<01:10,  3.02it/s, total_it=9065]
epochs:  45%|▍| 9/20 [52:06<57:32, 313.84s/it, loss=0.637, lr=0.00283, d_time=0.
train:  82%|███████████████▌   | 763/928 [04:17<00:53,  3.06it/s, total_it=9114]
epochs:  45%|▍| 9/20 [52:22<57:32, 313.84s/it, loss=0.572, lr=0.00282, d_time=0.
train:  88%|████████████████▋  | 812/928 [04:33<00:37,  3.07it/s, total_it=9163]
epochs:  45%|▍| 9/20 [52:39<57:32, 313.84s/it, loss=0.665, lr=0.00281, d_time=0.
train:  93%|█████████████████▋ | 861/928 [04:49<00:21,  3.12it/s, total_it=9212]
epochs:  45%|▍| 9/20 [52:55<57:32, 313.84s/it, loss=0.544, lr=0.0028, d_time=0.0
train:  98%|██████████████████▋| 910/928 [05:06<00:05,  3.09it/s, total_it=9261]
epochs:  50%|▌| 10/20 [53:12<52:14, 313.41s/it, loss=0.575, lr=0.00279, d_time=0
train:   3%|▋                   | 30/928 [00:11<04:54,  3.05it/s, total_it=9309]
epochs:  50%|▌| 10/20 [53:29<52:14, 313.41s/it, loss=0.63, lr=0.00278, d_time=0.
train:   9%|█▋                  | 79/928 [00:28<05:09,  2.74it/s, total_it=9358]
epochs:  50%|▌| 10/20 [53:46<52:14, 313.41s/it, loss=0.474, lr=0.00277, d_time=0
train:  14%|██▌                | 128/928 [00:44<04:36,  2.89it/s, total_it=9407]
epochs:  50%|▌| 10/20 [54:02<52:14, 313.41s/it, loss=0.63, lr=0.00276, d_time=0.
train:  19%|███▌               | 177/928 [01:01<04:20,  2.88it/s, total_it=9456]
epochs:  50%|▌| 10/20 [54:19<52:14, 313.41s/it, loss=0.639, lr=0.00275, d_time=0
train:  24%|████▋              | 226/928 [01:17<03:58,  2.94it/s, total_it=9505]
epochs:  50%|▌| 10/20 [54:35<52:14, 313.41s/it, loss=0.578, lr=0.00274, d_time=0
train:  30%|█████▋             | 275/928 [01:34<03:30,  3.10it/s, total_it=9554]
epochs:  50%|▌| 10/20 [54:51<52:14, 313.41s/it, loss=0.64, lr=0.00273, d_time=0.
train:  35%|██████▋            | 324/928 [01:50<03:21,  2.99it/s, total_it=9603]
epochs:  50%|▌| 10/20 [55:08<52:14, 313.41s/it, loss=0.627, lr=0.00271, d_time=0
train:  40%|███████▋           | 373/928 [02:06<03:01,  3.06it/s, total_it=9652]
epochs:  50%|▌| 10/20 [55:24<52:14, 313.41s/it, loss=0.497, lr=0.0027, d_time=0.
train:  45%|████████▋          | 422/928 [02:23<02:51,  2.96it/s, total_it=9701]
epochs:  50%|▌| 10/20 [55:41<52:14, 313.41s/it, loss=0.582, lr=0.00269, d_time=0
train:  51%|█████████▋         | 471/928 [02:39<02:31,  3.01it/s, total_it=9750]
epochs:  50%|▌| 10/20 [55:57<52:14, 313.41s/it, loss=0.605, lr=0.00268, d_time=0
train:  56%|██████████▋        | 520/928 [02:56<02:12,  3.08it/s, total_it=9799]
epochs:  50%|▌| 10/20 [56:14<52:14, 313.41s/it, loss=0.552, lr=0.00266, d_time=0
train:  61%|███████████▋       | 569/928 [03:12<01:57,  3.04it/s, total_it=9848]
epochs:  50%|▌| 10/20 [56:31<52:14, 313.41s/it, loss=0.592, lr=0.00265, d_time=0
train:  67%|████████████▋      | 618/928 [03:30<01:52,  2.77it/s, total_it=9897]
epochs:  50%|▌| 10/20 [56:48<52:14, 313.41s/it, loss=0.488, lr=0.00264, d_time=0
train:  72%|█████████████▋     | 667/928 [03:46<01:31,  2.84it/s, total_it=9946]
epochs:  50%|▌| 10/20 [57:05<52:14, 313.41s/it, loss=0.57, lr=0.00262, d_time=0.
train:  77%|██████████████▋    | 716/928 [04:03<01:09,  3.03it/s, total_it=9995]
epochs:  50%|▌| 10/20 [57:21<52:14, 313.41s/it, loss=0.641, lr=0.00261, d_time=0
train:  82%|███████████████▋   | 765/928 [04:20<00:57,  2.83it/s, total_it=1e+4]
epochs:  50%|▌| 10/20 [57:38<52:14, 313.41s/it, loss=0.62, lr=0.00259, d_time=0.
train:  88%|███████████████▊  | 814/928 [04:36<00:39,  2.85it/s, total_it=10093]
epochs:  50%|▌| 10/20 [57:54<52:14, 313.41s/it, loss=0.531, lr=0.00258, d_time=0
train:  93%|████████████████▋ | 863/928 [04:53<00:21,  3.09it/s, total_it=10142]
epochs:  50%|▌| 10/20 [58:11<52:14, 313.41s/it, loss=0.558, lr=0.00257, d_time=0
train:  98%|█████████████████▋| 912/928 [05:10<00:05,  2.99it/s, total_it=10191]
epochs:  55%|▌| 11/20 [58:28<47:07, 314.13s/it, loss=0.547, lr=0.00255, d_time=0
train:   3%|▋                  | 32/928 [00:11<04:53,  3.05it/s, total_it=10239]
epochs:  55%|▌| 11/20 [58:45<47:07, 314.13s/it, loss=0.56, lr=0.00254, d_time=0.
train:   9%|█▋                 | 81/928 [00:28<04:48,  2.94it/s, total_it=10288]
epochs:  55%|▌| 11/20 [59:01<47:07, 314.13s/it, loss=0.563, lr=0.00252, d_time=0
train:  14%|██▌               | 130/928 [00:44<04:29,  2.96it/s, total_it=10337]
epochs:  55%|▌| 11/20 [59:18<47:07, 314.13s/it, loss=0.557, lr=0.00251, d_time=0
train:  19%|███▍              | 179/928 [01:00<04:06,  3.03it/s, total_it=10386]
epochs:  55%|▌| 11/20 [59:34<47:07, 314.13s/it, loss=0.545, lr=0.00249, d_time=0
train:  25%|████▍             | 228/928 [01:17<03:59,  2.92it/s, total_it=10435]
epochs:  55%|▌| 11/20 [59:51<47:07, 314.13s/it, loss=0.597, lr=0.00248, d_time=0
train:  30%|█████▎            | 277/928 [01:33<03:36,  3.00it/s, total_it=10484]
epochs:  55%|▌| 11/20 [1:00:08<47:07, 314.13s/it, loss=0.581, lr=0.00246, d_time
train:  35%|██████▎           | 326/928 [01:50<03:18,  3.04it/s, total_it=10533]
epochs:  55%|▌| 11/20 [1:00:24<47:07, 314.13s/it, loss=0.55, lr=0.00244, d_time=
train:  40%|███████▎          | 375/928 [02:07<03:04,  3.00it/s, total_it=10582]
epochs:  55%|▌| 11/20 [1:00:40<47:07, 314.13s/it, loss=0.503, lr=0.00243, d_time
train:  46%|████████▏         | 424/928 [02:23<02:46,  3.03it/s, total_it=10631]
epochs:  55%|▌| 11/20 [1:00:42<47:07, 314.13s/it, loss=0.534, lr=0.00243, d_time[A
train:  46%|████████▎         | 429/928 [02:25<02:48,  2.96it/s, total_it=10

epochs:  55%|▌| 11/20 [1:00:57<47:07, 314.13s/it, loss=0.552, lr=0.00241, d_time[A
train:  51%|█████████▏        | 473/928 [02:39<02:29,  3.04it/s, total_it=10680][A
epochs:  55%|▌| 11/20 [1:00:57<47:07, 314.13s/it, loss=0.499, lr=0.00241, d_time[A
train:  51%|█████████▏        | 474/928 [02:40<02:30,  3.02it/s, total_it=10681][A
epochs:  55%|▌| 11/20 [1:00:57<47:07, 314.13s/it, loss=0.512, lr=0.00241, d_time[A
train:  51%|█████████▏        | 475/928 [02:40<02:34,  2.92it/s, total_it=10682][A
epochs:  55%|▌| 11/20 [1:00:58<47:07, 314.13s/it, loss=0.55, lr=0.00241, d_time=[A
train:  51%|█████████▏        | 476/928 [02:40<02:32,  2.96it/s, total_it=10683][A
epochs:  55%|▌| 11/20 [1:00:58<47:07, 314.13s/it, loss=0.708, lr=0.00241, d_time[A
train:  51%|█████████▎        | 477/928 [02:41<02:31,  2.98it/s, total_it=10684][A
epochs:  55%|▌| 11/20 [1:00:58<47:07, 314.13s/it, loss=0.579, lr=0.00241, d_time[A
train:  52%|█████████▎        | 478/928 [02:41<02:28,  3.03it/s, total_it=10

epochs:  55%|▌| 11/20 [1:01:13<47:07, 314.13s/it, loss=0.55, lr=0.00239, d_time=[A
train:  56%|██████████▏       | 522/928 [02:56<02:18,  2.92it/s, total_it=10729][A
epochs:  55%|▌| 11/20 [1:01:14<47:07, 314.13s/it, loss=0.565, lr=0.00239, d_time[A
train:  56%|██████████▏       | 523/928 [02:56<02:16,  2.97it/s, total_it=10730][A
epochs:  55%|▌| 11/20 [1:01:14<47:07, 314.13s/it, loss=0.558, lr=0.00239, d_time[A
train:  56%|██████████▏       | 524/928 [02:57<02:16,  2.95it/s, total_it=10731][A
epochs:  55%|▌| 11/20 [1:01:14<47:07, 314.13s/it, loss=0.542, lr=0.00239, d_time[A
train:  57%|██████████▏       | 525/928 [02:57<02:15,  2.97it/s, total_it=10732][A
epochs:  55%|▌| 11/20 [1:01:15<47:07, 314.13s/it, loss=0.614, lr=0.00239, d_time[A
train:  57%|██████████▏       | 526/928 [02:57<02:13,  3.02it/s, total_it=10733][A
epochs:  55%|▌| 11/20 [1:01:15<47:07, 314.13s/it, loss=0.816, lr=0.00239, d_time[A
train:  57%|██████████▏       | 527/928 [02:58<02:21,  2.83it/s, total_it=10

epochs:  55%|▌| 11/20 [1:01:30<47:07, 314.13s/it, loss=0.562, lr=0.00238, d_time[A
train:  62%|███████████       | 571/928 [03:13<01:59,  2.98it/s, total_it=10778][A
epochs:  55%|▌| 11/20 [1:01:30<47:07, 314.13s/it, loss=0.473, lr=0.00238, d_time[A
train:  62%|███████████       | 572/928 [03:13<01:55,  3.08it/s, total_it=10779][A
epochs:  55%|▌| 11/20 [1:01:31<47:07, 314.13s/it, loss=0.441, lr=0.00238, d_time[A
train:  62%|███████████       | 573/928 [03:13<01:54,  3.10it/s, total_it=10780][A
epochs:  55%|▌| 11/20 [1:01:31<47:07, 314.13s/it, loss=0.516, lr=0.00238, d_time[A
train:  62%|███████████▏      | 574/928 [03:14<01:57,  3.03it/s, total_it=10781][A
epochs:  55%|▌| 11/20 [1:01:31<47:07, 314.13s/it, loss=0.542, lr=0.00238, d_time[A
train:  62%|███████████▏      | 575/928 [03:14<01:54,  3.08it/s, total_it=10782][A
epochs:  55%|▌| 11/20 [1:01:32<47:07, 314.13s/it, loss=0.556, lr=0.00238, d_time[A
train:  62%|███████████▏      | 576/928 [03:14<01:55,  3.04it/s, total_it=10

epochs:  55%|▌| 11/20 [1:01:46<47:07, 314.13s/it, loss=0.621, lr=0.00236, d_time[A
train:  67%|████████████      | 620/928 [03:29<01:39,  3.09it/s, total_it=10827][A
epochs:  55%|▌| 11/20 [1:01:47<47:07, 314.13s/it, loss=0.521, lr=0.00236, d_time[A
train:  67%|████████████      | 621/928 [03:29<01:39,  3.10it/s, total_it=10828][A
epochs:  55%|▌| 11/20 [1:01:47<47:07, 314.13s/it, loss=0.589, lr=0.00236, d_time[A
train:  67%|████████████      | 622/928 [03:30<01:36,  3.19it/s, total_it=10829][A
epochs:  55%|▌| 11/20 [1:01:47<47:07, 314.13s/it, loss=0.639, lr=0.00236, d_time[A
train:  67%|████████████      | 623/928 [03:30<01:37,  3.14it/s, total_it=10830][A
epochs:  55%|▌| 11/20 [1:01:48<47:07, 314.13s/it, loss=0.738, lr=0.00236, d_time[A
train:  67%|████████████      | 624/928 [03:30<01:36,  3.15it/s, total_it=10831][A
epochs:  55%|▌| 11/20 [1:01:48<47:07, 314.13s/it, loss=0.594, lr=0.00236, d_time[A
train:  67%|████████████      | 625/928 [03:31<01:35,  3.16it/s, total_it=10

epochs:  55%|▌| 11/20 [1:02:03<47:07, 314.13s/it, loss=0.56, lr=0.00234, d_time=[A
train:  72%|████████████▉     | 669/928 [03:46<01:29,  2.88it/s, total_it=10876][A
epochs:  55%|▌| 11/20 [1:02:03<47:07, 314.13s/it, loss=0.488, lr=0.00234, d_time[A
train:  72%|████████████▉     | 670/928 [03:46<01:27,  2.95it/s, total_it=10877][A
epochs:  55%|▌| 11/20 [1:02:04<47:07, 314.13s/it, loss=0.548, lr=0.00234, d_time[A
train:  72%|█████████████     | 671/928 [03:46<01:24,  3.03it/s, total_it=10878][A
epochs:  55%|▌| 11/20 [1:02:04<47:07, 314.13s/it, loss=0.522, lr=0.00234, d_time[A
train:  72%|█████████████     | 672/928 [03:46<01:22,  3.11it/s, total_it=10879][A
epochs:  55%|▌| 11/20 [1:02:04<47:07, 314.13s/it, loss=0.676, lr=0.00234, d_time[A
train:  73%|█████████████     | 673/928 [03:47<01:22,  3.08it/s, total_it=10880][A
epochs:  55%|▌| 11/20 [1:02:04<47:07, 314.13s/it, loss=0.518, lr=0.00234, d_time[A
train:  73%|█████████████     | 674/928 [03:47<01:23,  3.03it/s, total_it=10

epochs:  55%|▌| 11/20 [1:02:19<47:07, 314.13s/it, loss=0.637, lr=0.00233, d_time[A
train:  77%|█████████████▉    | 718/928 [04:02<01:09,  3.04it/s, total_it=10925][A
epochs:  55%|▌| 11/20 [1:02:20<47:07, 314.13s/it, loss=0.493, lr=0.00233, d_time[A
train:  77%|█████████████▉    | 719/928 [04:02<01:07,  3.11it/s, total_it=10926][A
epochs:  55%|▌| 11/20 [1:02:20<47:07, 314.13s/it, loss=0.561, lr=0.00233, d_time[A
train:  78%|█████████████▉    | 720/928 [04:03<01:06,  3.13it/s, total_it=10927][A
epochs:  55%|▌| 11/20 [1:02:20<47:07, 314.13s/it, loss=0.66, lr=0.00233, d_time=[A
train:  78%|█████████████▉    | 721/928 [04:03<01:06,  3.09it/s, total_it=10928][A
epochs:  55%|▌| 11/20 [1:02:21<47:07, 314.13s/it, loss=0.548, lr=0.00232, d_time[A
train:  78%|██████████████    | 722/928 [04:03<01:06,  3.09it/s, total_it=10929][A
epochs:  55%|▌| 11/20 [1:02:21<47:07, 314.13s/it, loss=0.64, lr=0.00232, d_time=[A
train:  78%|██████████████    | 723/928 [04:04<01:05,  3.15it/s, total_it=10

epochs:  55%|▌| 11/20 [1:02:36<47:07, 314.13s/it, loss=0.65, lr=0.00231, d_time=[A
train:  83%|██████████████▉   | 767/928 [04:19<00:57,  2.82it/s, total_it=10974][A
epochs:  55%|▌| 11/20 [1:02:36<47:07, 314.13s/it, loss=0.441, lr=0.00231, d_time[A
train:  83%|██████████████▉   | 768/928 [04:19<00:55,  2.86it/s, total_it=10975][A
epochs:  55%|▌| 11/20 [1:02:37<47:07, 314.13s/it, loss=0.497, lr=0.00231, d_time[A
train:  83%|██████████████▉   | 769/928 [04:19<00:56,  2.83it/s, total_it=10976][A
epochs:  55%|▌| 11/20 [1:02:37<47:07, 314.13s/it, loss=0.494, lr=0.00231, d_time[A
train:  83%|██████████████▉   | 770/928 [04:20<00:56,  2.81it/s, total_it=10977][A
epochs:  55%|▌| 11/20 [1:02:37<47:07, 314.13s/it, loss=0.454, lr=0.00231, d_time[A
train:  83%|██████████████▉   | 771/928 [04:20<00:55,  2.83it/s, total_it=10978][A
epochs:  55%|▌| 11/20 [1:02:38<47:07, 314.13s/it, loss=0.623, lr=0.00231, d_time[A
train:  83%|██████████████▉   | 772/928 [04:20<00:54,  2.85it/s, total_it=10

epochs:  55%|▌| 11/20 [1:02:53<47:07, 314.13s/it, loss=0.556, lr=0.00229, d_time[A
train:  88%|███████████████▊  | 816/928 [04:35<00:37,  2.95it/s, total_it=11023][A
epochs:  55%|▌| 11/20 [1:02:53<47:07, 314.13s/it, loss=0.6, lr=0.00229, d_time=0[A
train:  88%|███████████████▊  | 817/928 [04:36<00:36,  3.01it/s, total_it=11024][A
epochs:  55%|▌| 11/20 [1:02:53<47:07, 314.13s/it, loss=0.647, lr=0.00229, d_time[A
train:  88%|███████████████▊  | 818/928 [04:36<00:36,  3.04it/s, total_it=11025][A
epochs:  55%|▌| 11/20 [1:02:54<47:07, 314.13s/it, loss=0.578, lr=0.00229, d_time[A
train:  88%|███████████████▉  | 819/928 [04:36<00:35,  3.08it/s, total_it=11026][A
epochs:  55%|▌| 11/20 [1:02:54<47:07, 314.13s/it, loss=0.467, lr=0.00229, d_time[A
train:  88%|███████████████▉  | 820/928 [04:36<00:35,  3.07it/s, total_it=11027][A
epochs:  55%|▌| 11/20 [1:02:54<47:07, 314.13s/it, loss=0.555, lr=0.00229, d_time[A
train:  88%|███████████████▉  | 821/928 [04:37<00:35,  3.05it/s, total_it=11

epochs:  55%|▌| 11/20 [1:03:09<47:07, 314.13s/it, loss=0.621, lr=0.00227, d_time[A
train:  93%|████████████████▊ | 865/928 [04:52<00:21,  2.90it/s, total_it=11072][A
epochs:  55%|▌| 11/20 [1:03:09<47:07, 314.13s/it, loss=0.756, lr=0.00227, d_time[A
train:  93%|████████████████▊ | 866/928 [04:52<00:20,  2.97it/s, total_it=11073][A
epochs:  55%|▌| 11/20 [1:03:10<47:07, 314.13s/it, loss=0.603, lr=0.00227, d_time[A
train:  93%|████████████████▊ | 867/928 [04:52<00:20,  2.94it/s, total_it=11074][A
epochs:  55%|▌| 11/20 [1:03:10<47:07, 314.13s/it, loss=0.597, lr=0.00227, d_time[A
train:  94%|████████████████▊ | 868/928 [04:53<00:21,  2.82it/s, total_it=11075][A
epochs:  55%|▌| 11/20 [1:03:10<47:07, 314.13s/it, loss=0.565, lr=0.00227, d_time[A
train:  94%|████████████████▊ | 869/928 [04:53<00:20,  2.85it/s, total_it=11076][A
epochs:  55%|▌| 11/20 [1:03:11<47:07, 314.13s/it, loss=0.617, lr=0.00227, d_time[A
train:  94%|████████████████▉ | 870/928 [04:53<00:19,  2.90it/s, total_it=11

epochs:  55%|▌| 11/20 [1:03:26<47:07, 314.13s/it, loss=0.573, lr=0.00226, d_time[A
train:  98%|█████████████████▋| 914/928 [05:09<00:04,  2.99it/s, total_it=11121][A
epochs:  55%|▌| 11/20 [1:03:27<47:07, 314.13s/it, loss=0.529, lr=0.00226, d_time[A
train:  99%|█████████████████▋| 915/928 [05:09<00:04,  3.01it/s, total_it=11122][A
epochs:  55%|▌| 11/20 [1:03:27<47:07, 314.13s/it, loss=0.489, lr=0.00226, d_time[A
train:  99%|█████████████████▊| 916/928 [05:10<00:03,  3.04it/s, total_it=11123][A
epochs:  55%|▌| 11/20 [1:03:27<47:07, 314.13s/it, loss=0.559, lr=0.00225, d_time[A
train:  99%|█████████████████▊| 917/928 [05:10<00:03,  2.94it/s, total_it=11124][A
epochs:  55%|▌| 11/20 [1:03:28<47:07, 314.13s/it, loss=0.583, lr=0.00225, d_time[A
train:  99%|█████████████████▊| 918/928 [05:10<00:03,  3.00it/s, total_it=11125][A
epochs:  55%|▌| 11/20 [1:03:28<47:07, 314.13s/it, loss=0.543, lr=0.00225, d_time[A
train:  99%|█████████████████▊| 919/928 [05:11<00:03,  2.88it/s, total_it=11

epochs:  60%|▌| 12/20 [1:03:43<41:53, 314.14s/it, loss=0.527, lr=0.00224, d_time[A
train:   4%|▋                  | 34/928 [00:12<05:06,  2.92it/s, total_it=11169][A
epochs:  60%|▌| 12/20 [1:03:44<41:53, 314.14s/it, loss=0.477, lr=0.00224, d_time[A
train:   4%|▋                  | 35/928 [00:12<04:57,  3.00it/s, total_it=11170][A
epochs:  60%|▌| 12/20 [1:03:44<41:53, 314.14s/it, loss=0.582, lr=0.00224, d_time[A
train:   4%|▋                  | 36/928 [00:13<04:53,  3.04it/s, total_it=11171][A
epochs:  60%|▌| 12/20 [1:03:44<41:53, 314.14s/it, loss=0.478, lr=0.00224, d_time[A
train:   4%|▊                  | 37/928 [00:13<04:53,  3.04it/s, total_it=11172][A
epochs:  60%|▌| 12/20 [1:03:45<41:53, 314.14s/it, loss=0.478, lr=0.00224, d_time[A
train:   4%|▊                  | 38/928 [00:13<04:50,  3.06it/s, total_it=11173][A
epochs:  60%|▌| 12/20 [1:03:45<41:53, 314.14s/it, loss=0.569, lr=0.00224, d_time[A
train:   4%|▊                  | 39/928 [00:13<04:46,  3.10it/s, total_it=11

epochs:  60%|▌| 12/20 [1:04:00<41:53, 314.14s/it, loss=0.665, lr=0.00222, d_time[A
train:   9%|█▋                 | 83/928 [00:28<04:52,  2.89it/s, total_it=11218][A
epochs:  60%|▌| 12/20 [1:04:00<41:53, 314.14s/it, loss=0.49, lr=0.00222, d_time=[A
train:   9%|█▋                 | 84/928 [00:29<04:39,  3.02it/s, total_it=11219][A
epochs:  60%|▌| 12/20 [1:04:01<41:53, 314.14s/it, loss=0.704, lr=0.00222, d_time[A
train:   9%|█▋                 | 85/928 [00:29<04:33,  3.08it/s, total_it=11220][A
epochs:  60%|▌| 12/20 [1:04:01<41:53, 314.14s/it, loss=0.489, lr=0.00222, d_time[A
train:   9%|█▊                 | 86/928 [00:29<04:33,  3.07it/s, total_it=11221][A
epochs:  60%|▌| 12/20 [1:04:01<41:53, 314.14s/it, loss=0.525, lr=0.00222, d_time[A
train:   9%|█▊                 | 87/928 [00:30<04:25,  3.16it/s, total_it=11222][A
epochs:  60%|▌| 12/20 [1:04:01<41:53, 314.14s/it, loss=0.66, lr=0.00222, d_time=[A
train:   9%|█▊                 | 88/928 [00:30<04:32,  3.08it/s, total_it=11

epochs:  60%|▌| 12/20 [1:04:16<41:53, 314.14s/it, loss=0.554, lr=0.0022, d_time=[A
train:  14%|██▌               | 132/928 [00:45<04:17,  3.09it/s, total_it=11267][A
epochs:  60%|▌| 12/20 [1:04:17<41:53, 314.14s/it, loss=0.574, lr=0.0022, d_time=[A
train:  14%|██▌               | 133/928 [00:45<04:12,  3.14it/s, total_it=11268][A
epochs:  60%|▌| 12/20 [1:04:17<41:53, 314.14s/it, loss=0.534, lr=0.0022, d_time=[A
train:  14%|██▌               | 134/928 [00:45<04:21,  3.04it/s, total_it=11269][A
epochs:  60%|▌| 12/20 [1:04:17<41:53, 314.14s/it, loss=0.571, lr=0.0022, d_time=[A
train:  15%|██▌               | 135/928 [00:46<04:22,  3.02it/s, total_it=11270][A
epochs:  60%|▌| 12/20 [1:04:18<41:53, 314.14s/it, loss=0.523, lr=0.0022, d_time=[A
train:  15%|██▋               | 136/928 [00:46<04:20,  3.04it/s, total_it=11271][A
epochs:  60%|▌| 12/20 [1:04:18<41:53, 314.14s/it, loss=0.494, lr=0.0022, d_time=[A
train:  15%|██▋               | 137/928 [00:46<04:15,  3.10it/s, total_it=11

epochs:  60%|▌| 12/20 [1:04:32<41:53, 314.14s/it, loss=0.524, lr=0.00218, d_time[A
train:  20%|███▌              | 181/928 [01:01<04:29,  2.77it/s, total_it=11316][A
epochs:  60%|▌| 12/20 [1:04:33<41:53, 314.14s/it, loss=0.62, lr=0.00218, d_time=[A
train:  20%|███▌              | 182/928 [01:01<04:21,  2.86it/s, total_it=11317][A
epochs:  60%|▌| 12/20 [1:04:33<41:53, 314.14s/it, loss=0.613, lr=0.00218, d_time[A
train:  20%|███▌              | 183/928 [01:02<04:21,  2.85it/s, total_it=11318][A
epochs:  60%|▌| 12/20 [1:04:34<41:53, 314.14s/it, loss=0.519, lr=0.00218, d_time[A
train:  20%|███▌              | 184/928 [01:02<04:19,  2.87it/s, total_it=11319][A
epochs:  60%|▌| 12/20 [1:04:34<41:53, 314.14s/it, loss=0.642, lr=0.00218, d_time[A
train:  20%|███▌              | 185/928 [01:02<04:11,  2.96it/s, total_it=11320][A
epochs:  60%|▌| 12/20 [1:04:34<41:53, 314.14s/it, loss=0.632, lr=0.00218, d_time[A
train:  20%|███▌              | 186/928 [01:03<04:02,  3.06it/s, total_it=11

epochs:  60%|▌| 12/20 [1:04:49<41:53, 314.14s/it, loss=0.574, lr=0.00216, d_time[A
train:  25%|████▍             | 230/928 [01:17<03:49,  3.04it/s, total_it=11365][A
epochs:  60%|▌| 12/20 [1:04:49<41:53, 314.14s/it, loss=0.701, lr=0.00216, d_time[A
train:  25%|████▍             | 231/928 [01:18<03:51,  3.01it/s, total_it=11366][A
epochs:  60%|▌| 12/20 [1:04:49<41:53, 314.14s/it, loss=0.509, lr=0.00216, d_time[A
train:  25%|████▌             | 232/928 [01:18<03:47,  3.06it/s, total_it=11367][A
epochs:  60%|▌| 12/20 [1:04:50<41:53, 314.14s/it, loss=0.528, lr=0.00216, d_time[A
train:  25%|████▌             | 233/928 [01:18<03:51,  3.01it/s, total_it=11368][A
epochs:  60%|▌| 12/20 [1:04:50<41:53, 314.14s/it, loss=0.508, lr=0.00216, d_time[A
train:  25%|████▌             | 234/928 [01:19<03:45,  3.07it/s, total_it=11369][A
epochs:  60%|▌| 12/20 [1:04:50<41:53, 314.14s/it, loss=0.555, lr=0.00216, d_time[A
train:  25%|████▌             | 235/928 [01:19<03:49,  3.03it/s, total_it=11

epochs:  60%|▌| 12/20 [1:05:05<41:53, 314.14s/it, loss=0.601, lr=0.00215, d_time[A
train:  30%|█████▍            | 279/928 [01:34<03:37,  2.98it/s, total_it=11414][A
epochs:  60%|▌| 12/20 [1:05:05<41:53, 314.14s/it, loss=0.5, lr=0.00215, d_time=0[A
train:  30%|█████▍            | 280/928 [01:34<03:38,  2.97it/s, total_it=11415][A
epochs:  60%|▌| 12/20 [1:05:06<41:53, 314.14s/it, loss=0.56, lr=0.00215, d_time=[A
train:  30%|█████▍            | 281/928 [01:34<03:37,  2.97it/s, total_it=11416][A
epochs:  60%|▌| 12/20 [1:05:06<41:53, 314.14s/it, loss=0.695, lr=0.00215, d_time[A
train:  30%|█████▍            | 282/928 [01:35<03:38,  2.95it/s, total_it=11417][A
epochs:  60%|▌| 12/20 [1:05:06<41:53, 314.14s/it, loss=0.485, lr=0.00214, d_time[A
train:  30%|█████▍            | 283/928 [01:35<03:37,  2.96it/s, total_it=11418][A
epochs:  60%|▌| 12/20 [1:05:07<41:53, 314.14s/it, loss=0.546, lr=0.00214, d_time[A
train:  31%|█████▌            | 284/928 [01:35<03:33,  3.02it/s, total_it=11

epochs:  60%|▌| 12/20 [1:05:22<41:53, 314.14s/it, loss=0.572, lr=0.00213, d_time[A
train:  35%|██████▎           | 328/928 [01:50<03:23,  2.94it/s, total_it=11463][A
epochs:  60%|▌| 12/20 [1:05:22<41:53, 314.14s/it, loss=0.473, lr=0.00213, d_time[A
train:  35%|██████▍           | 329/928 [01:50<03:21,  2.98it/s, total_it=11464][A
epochs:  60%|▌| 12/20 [1:05:22<41:53, 314.14s/it, loss=0.516, lr=0.00213, d_time[A
train:  36%|██████▍           | 330/928 [01:51<03:18,  3.02it/s, total_it=11465][A
epochs:  60%|▌| 12/20 [1:05:23<41:53, 314.14s/it, loss=0.619, lr=0.00213, d_time[A
train:  36%|██████▍           | 331/928 [01:51<03:13,  3.09it/s, total_it=11466][A
epochs:  60%|▌| 12/20 [1:05:23<41:53, 314.14s/it, loss=0.567, lr=0.00213, d_time[A
train:  36%|██████▍           | 332/928 [01:51<03:14,  3.06it/s, total_it=11467][A
epochs:  60%|▌| 12/20 [1:05:23<41:53, 314.14s/it, loss=0.544, lr=0.00213, d_time[A
train:  36%|██████▍           | 333/928 [01:52<03:10,  3.13it/s, total_it=11

epochs:  60%|▌| 12/20 [1:05:38<41:53, 314.14s/it, loss=0.513, lr=0.00211, d_time[A
train:  41%|███████▎          | 377/928 [02:06<02:59,  3.07it/s, total_it=11512][A
epochs:  60%|▌| 12/20 [1:05:38<41:53, 314.14s/it, loss=0.55, lr=0.00211, d_time=[A
train:  41%|███████▎          | 378/928 [02:07<03:06,  2.96it/s, total_it=11513][A
epochs:  60%|▌| 12/20 [1:05:38<41:53, 314.14s/it, loss=0.576, lr=0.00211, d_time[A
train:  41%|███████▎          | 379/928 [02:07<03:14,  2.82it/s, total_it=11514][A
epochs:  60%|▌| 12/20 [1:05:39<41:53, 314.14s/it, loss=0.483, lr=0.00211, d_time[A
train:  41%|███████▎          | 380/928 [02:07<03:03,  2.98it/s, total_it=11515][A
epochs:  60%|▌| 12/20 [1:05:39<41:53, 314.14s/it, loss=0.659, lr=0.00211, d_time[A
train:  41%|███████▍          | 381/928 [02:08<03:01,  3.01it/s, total_it=11516][A
epochs:  60%|▌| 12/20 [1:05:39<41:53, 314.14s/it, loss=0.594, lr=0.00211, d_time[A
train:  41%|███████▍          | 382/928 [02:08<02:57,  3.07it/s, total_it=11

epochs:  60%|▌| 12/20 [1:05:54<41:53, 314.14s/it, loss=0.538, lr=0.00209, d_time[A
train:  46%|████████▎         | 426/928 [02:23<03:02,  2.74it/s, total_it=11561][A
epochs:  60%|▌| 12/20 [1:05:54<41:53, 314.14s/it, loss=0.627, lr=0.00209, d_time[A
train:  46%|████████▎         | 427/928 [02:23<02:55,  2.85it/s, total_it=11562][A
epochs:  60%|▌| 12/20 [1:05:55<41:53, 314.14s/it, loss=0.586, lr=0.00209, d_time[A
train:  46%|████████▎         | 428/928 [02:23<02:56,  2.83it/s, total_it=11563][A
epochs:  60%|▌| 12/20 [1:05:55<41:53, 314.14s/it, loss=0.576, lr=0.00209, d_time[A
train:  46%|████████▎         | 429/928 [02:24<02:52,  2.89it/s, total_it=11564][A
epochs:  60%|▌| 12/20 [1:05:55<41:53, 314.14s/it, loss=0.65, lr=0.00209, d_time=[A
train:  46%|████████▎         | 430/928 [02:24<02:50,  2.91it/s, total_it=11565][A
epochs:  60%|▌| 12/20 [1:05:56<41:53, 314.14s/it, loss=0.572, lr=0.00209, d_time[A
train:  46%|████████▎         | 431/928 [02:24<02:50,  2.92it/s, total_it=11

epochs:  60%|▌| 12/20 [1:06:11<41:53, 314.14s/it, loss=0.426, lr=0.00207, d_time[A
train:  51%|█████████▏        | 475/928 [02:39<02:30,  3.01it/s, total_it=11610][A
epochs:  60%|▌| 12/20 [1:06:11<41:53, 314.14s/it, loss=0.537, lr=0.00207, d_time[A
train:  51%|█████████▏        | 476/928 [02:40<02:31,  2.99it/s, total_it=11611][A
epochs:  60%|▌| 12/20 [1:06:12<41:53, 314.14s/it, loss=0.6, lr=0.00207, d_time=0[A
train:  51%|█████████▎        | 477/928 [02:40<02:32,  2.96it/s, total_it=11612][A
epochs:  60%|▌| 12/20 [1:06:12<41:53, 314.14s/it, loss=0.693, lr=0.00207, d_time[A
train:  52%|█████████▎        | 478/928 [02:40<02:34,  2.91it/s, total_it=11613][A
epochs:  60%|▌| 12/20 [1:06:12<41:53, 314.14s/it, loss=0.641, lr=0.00207, d_time[A
train:  52%|█████████▎        | 479/928 [02:41<02:30,  2.98it/s, total_it=11614][A
epochs:  60%|▌| 12/20 [1:06:13<41:53, 314.14s/it, loss=0.508, lr=0.00207, d_time[A
train:  52%|█████████▎        | 480/928 [02:41<02:38,  2.83it/s, total_it=11

epochs:  60%|▌| 12/20 [1:06:27<41:53, 314.14s/it, loss=0.607, lr=0.00205, d_time[A
train:  56%|██████████▏       | 524/928 [02:56<02:10,  3.10it/s, total_it=11659][A
epochs:  60%|▌| 12/20 [1:06:28<41:53, 314.14s/it, loss=0.518, lr=0.00205, d_time[A
train:  57%|██████████▏       | 525/928 [02:56<02:06,  3.18it/s, total_it=11660][A
epochs:  60%|▌| 12/20 [1:06:28<41:53, 314.14s/it, loss=0.465, lr=0.00205, d_time[A
train:  57%|██████████▏       | 526/928 [02:56<02:08,  3.13it/s, total_it=11661][A
epochs:  60%|▌| 12/20 [1:06:28<41:53, 314.14s/it, loss=0.493, lr=0.00205, d_time[A
train:  57%|██████████▏       | 527/928 [02:57<02:08,  3.12it/s, total_it=11662][A
epochs:  60%|▌| 12/20 [1:06:29<41:53, 314.14s/it, loss=0.539, lr=0.00205, d_time[A
train:  57%|██████████▏       | 528/928 [02:57<02:12,  3.02it/s, total_it=11663][A
epochs:  60%|▌| 12/20 [1:06:29<41:53, 314.14s/it, loss=0.561, lr=0.00205, d_time[A
train:  57%|██████████▎       | 529/928 [02:57<02:10,  3.06it/s, total_it=11

epochs:  60%|▌| 12/20 [1:06:44<41:53, 314.14s/it, loss=0.549, lr=0.00203, d_time[A
train:  62%|███████████       | 573/928 [03:12<02:04,  2.85it/s, total_it=11708][A
epochs:  60%|▌| 12/20 [1:06:44<41:53, 314.14s/it, loss=0.5, lr=0.00203, d_time=0[A
train:  62%|███████████▏      | 574/928 [03:13<01:59,  2.97it/s, total_it=11709][A
epochs:  60%|▌| 12/20 [1:06:44<41:53, 314.14s/it, loss=0.493, lr=0.00203, d_time[A
train:  62%|███████████▏      | 575/928 [03:13<01:59,  2.95it/s, total_it=11710][A
epochs:  60%|▌| 12/20 [1:06:45<41:53, 314.14s/it, loss=0.49, lr=0.00203, d_time=[A
train:  62%|███████████▏      | 576/928 [03:13<01:58,  2.98it/s, total_it=11711][A
epochs:  60%|▌| 12/20 [1:06:45<41:53, 314.14s/it, loss=0.552, lr=0.00203, d_time[A
train:  62%|███████████▏      | 577/928 [03:14<01:56,  3.02it/s, total_it=11712][A
epochs:  60%|▌| 12/20 [1:06:45<41:53, 314.14s/it, loss=0.514, lr=0.00203, d_time[A
train:  62%|███████████▏      | 578/928 [03:14<01:55,  3.03it/s, total_it=11

epochs:  60%|▌| 12/20 [1:07:00<41:53, 314.14s/it, loss=0.556, lr=0.00201, d_time[A
train:  67%|████████████      | 622/928 [03:29<01:44,  2.94it/s, total_it=11757][A
epochs:  60%|▌| 12/20 [1:07:01<41:53, 314.14s/it, loss=0.53, lr=0.00201, d_time=[A
train:  67%|████████████      | 623/928 [03:29<01:39,  3.05it/s, total_it=11758][A
epochs:  60%|▌| 12/20 [1:07:01<41:53, 314.14s/it, loss=0.583, lr=0.00201, d_time[A
train:  67%|████████████      | 624/928 [03:29<01:45,  2.88it/s, total_it=11759][A
epochs:  60%|▌| 12/20 [1:07:01<41:53, 314.14s/it, loss=0.659, lr=0.00201, d_time[A
train:  67%|████████████      | 625/928 [03:30<01:41,  2.98it/s, total_it=11760][A
epochs:  60%|▌| 12/20 [1:07:02<41:53, 314.14s/it, loss=0.53, lr=0.00201, d_time=[A
train:  67%|████████████▏     | 626/928 [03:30<01:38,  3.06it/s, total_it=11761][A
epochs:  60%|▌| 12/20 [1:07:02<41:53, 314.14s/it, loss=0.622, lr=0.00201, d_time[A
train:  68%|████████████▏     | 627/928 [03:30<01:38,  3.05it/s, total_it=11

epochs:  60%|▌| 12/20 [1:07:17<41:53, 314.14s/it, loss=0.515, lr=0.00199, d_time[A
train:  72%|█████████████     | 671/928 [03:45<01:25,  3.00it/s, total_it=11806][A
epochs:  60%|▌| 12/20 [1:07:17<41:53, 314.14s/it, loss=0.472, lr=0.00199, d_time[A
train:  72%|█████████████     | 672/928 [03:46<01:22,  3.09it/s, total_it=11807][A
epochs:  60%|▌| 12/20 [1:07:17<41:53, 314.14s/it, loss=0.685, lr=0.00199, d_time[A
train:  73%|█████████████     | 673/928 [03:46<01:21,  3.15it/s, total_it=11808][A
epochs:  60%|▌| 12/20 [1:07:18<41:53, 314.14s/it, loss=0.557, lr=0.00199, d_time[A
train:  73%|█████████████     | 674/928 [03:46<01:22,  3.07it/s, total_it=11809][A
epochs:  60%|▌| 12/20 [1:07:18<41:53, 314.14s/it, loss=0.536, lr=0.00199, d_time[A
train:  73%|█████████████     | 675/928 [03:47<01:20,  3.14it/s, total_it=11810][A
epochs:  60%|▌| 12/20 [1:07:18<41:53, 314.14s/it, loss=0.485, lr=0.00199, d_time[A
train:  73%|█████████████     | 676/928 [03:47<01:19,  3.15it/s, total_it=11

epochs:  60%|▌| 12/20 [1:07:33<41:53, 314.14s/it, loss=0.613, lr=0.00197, d_time[A
train:  78%|█████████████▉    | 720/928 [04:01<01:08,  3.06it/s, total_it=11855][A
epochs:  60%|▌| 12/20 [1:07:33<41:53, 314.14s/it, loss=0.562, lr=0.00197, d_time[A
train:  78%|█████████████▉    | 721/928 [04:02<01:06,  3.12it/s, total_it=11856][A
epochs:  60%|▌| 12/20 [1:07:34<41:53, 314.14s/it, loss=0.702, lr=0.00197, d_time[A
train:  78%|██████████████    | 722/928 [04:02<01:06,  3.12it/s, total_it=11857][A
epochs:  60%|▌| 12/20 [1:07:34<41:53, 314.14s/it, loss=0.604, lr=0.00197, d_time[A
train:  78%|██████████████    | 723/928 [04:02<01:05,  3.12it/s, total_it=11858][A
epochs:  60%|▌| 12/20 [1:07:34<41:53, 314.14s/it, loss=0.49, lr=0.00197, d_time=[A
train:  78%|██████████████    | 724/928 [04:03<01:06,  3.08it/s, total_it=11859][A
epochs:  60%|▌| 12/20 [1:07:35<41:53, 314.14s/it, loss=0.62, lr=0.00197, d_time=[A
train:  78%|██████████████    | 725/928 [04:03<01:08,  2.98it/s, total_it=11

epochs:  60%|▌| 12/20 [1:07:49<41:53, 314.14s/it, loss=0.534, lr=0.00195, d_time[A
train:  83%|██████████████▉   | 769/928 [04:18<00:51,  3.07it/s, total_it=11904][A
epochs:  60%|▌| 12/20 [1:07:50<41:53, 314.14s/it, loss=0.583, lr=0.00195, d_time[A
train:  83%|██████████████▉   | 770/928 [04:18<00:51,  3.07it/s, total_it=11905][A
epochs:  60%|▌| 12/20 [1:07:50<41:53, 314.14s/it, loss=0.548, lr=0.00195, d_time[A
train:  83%|██████████████▉   | 771/928 [04:19<00:52,  2.98it/s, total_it=11906][A
epochs:  60%|▌| 12/20 [1:07:50<41:53, 314.14s/it, loss=0.619, lr=0.00195, d_time[A
train:  83%|██████████████▉   | 772/928 [04:19<00:52,  2.98it/s, total_it=11907][A
epochs:  60%|▌| 12/20 [1:07:51<41:53, 314.14s/it, loss=0.522, lr=0.00195, d_time[A
train:  83%|██████████████▉   | 773/928 [04:19<00:52,  2.94it/s, total_it=11908][A
epochs:  60%|▌| 12/20 [1:07:51<41:53, 314.14s/it, loss=0.565, lr=0.00195, d_time[A
train:  83%|███████████████   | 774/928 [04:20<00:50,  3.06it/s, total_it=11

epochs:  60%|▌| 12/20 [1:08:06<41:53, 314.14s/it, loss=0.442, lr=0.00193, d_time[A
train:  88%|███████████████▊  | 818/928 [04:35<00:41,  2.64it/s, total_it=11953][A
epochs:  60%|▌| 12/20 [1:08:07<41:53, 314.14s/it, loss=0.537, lr=0.00193, d_time[A
train:  88%|███████████████▉  | 819/928 [04:35<00:40,  2.72it/s, total_it=11954][A
epochs:  60%|▌| 12/20 [1:08:07<41:53, 314.14s/it, loss=0.537, lr=0.00193, d_time[A
train:  88%|███████████████▉  | 820/928 [04:36<00:39,  2.70it/s, total_it=11955][A
epochs:  60%|▌| 12/20 [1:08:07<41:53, 314.14s/it, loss=0.528, lr=0.00193, d_time[A
train:  88%|███████████████▉  | 821/928 [04:36<00:38,  2.80it/s, total_it=11956][A
epochs:  60%|▌| 12/20 [1:08:08<41:53, 314.14s/it, loss=0.513, lr=0.00193, d_time[A
train:  89%|███████████████▉  | 822/928 [04:36<00:36,  2.89it/s, total_it=11957][A
epochs:  60%|▌| 12/20 [1:08:08<41:53, 314.14s/it, loss=0.483, lr=0.00193, d_time[A
train:  89%|███████████████▉  | 823/928 [04:37<00:36,  2.90it/s, total_it=11

epochs:  60%|▌| 12/20 [1:08:23<41:53, 314.14s/it, loss=0.593, lr=0.00191, d_time[A
train:  93%|████████████████▊ | 867/928 [04:52<00:19,  3.10it/s, total_it=12002][A
epochs:  60%|▌| 12/20 [1:08:23<41:53, 314.14s/it, loss=0.627, lr=0.00191, d_time[A
train:  94%|████████████████▊ | 868/928 [04:52<00:19,  3.08it/s, total_it=12003][A
epochs:  60%|▌| 12/20 [1:08:24<41:53, 314.14s/it, loss=0.471, lr=0.00191, d_time[A
train:  94%|████████████████▊ | 869/928 [04:52<00:18,  3.12it/s, total_it=12004][A
epochs:  60%|▌| 12/20 [1:08:24<41:53, 314.14s/it, loss=0.596, lr=0.00191, d_time[A
train:  94%|████████████████▉ | 870/928 [04:53<00:18,  3.12it/s, total_it=12005][A
epochs:  60%|▌| 12/20 [1:08:24<41:53, 314.14s/it, loss=0.573, lr=0.00191, d_time[A
train:  94%|████████████████▉ | 871/928 [04:53<00:18,  3.16it/s, total_it=12006][A
epochs:  60%|▌| 12/20 [1:08:25<41:53, 314.14s/it, loss=0.542, lr=0.00191, d_time[A
train:  94%|████████████████▉ | 872/928 [04:53<00:18,  3.04it/s, total_it=12

epochs:  60%|▌| 12/20 [1:08:39<41:53, 314.14s/it, loss=0.538, lr=0.00189, d_time[A
train:  99%|█████████████████▊| 916/928 [05:08<00:03,  3.02it/s, total_it=12051][A
epochs:  60%|▌| 12/20 [1:08:40<41:53, 314.14s/it, loss=0.5, lr=0.00189, d_time=0[A
train:  99%|█████████████████▊| 917/928 [05:08<00:03,  3.06it/s, total_it=12052][A
epochs:  60%|▌| 12/20 [1:08:40<41:53, 314.14s/it, loss=0.556, lr=0.00189, d_time[A
train:  99%|█████████████████▊| 918/928 [05:08<00:03,  3.09it/s, total_it=12053][A
epochs:  60%|▌| 12/20 [1:08:40<41:53, 314.14s/it, loss=0.688, lr=0.00189, d_time[A
train:  99%|█████████████████▊| 919/928 [05:09<00:02,  3.08it/s, total_it=12054][A
epochs:  60%|▌| 12/20 [1:08:41<41:53, 314.14s/it, loss=0.517, lr=0.00189, d_time[A
train:  99%|█████████████████▊| 920/928 [05:09<00:02,  3.10it/s, total_it=12055][A
epochs:  60%|▌| 12/20 [1:08:41<41:53, 314.14s/it, loss=0.616, lr=0.00189, d_time[A
train:  99%|█████████████████▊| 921/928 [05:09<00:02,  3.16it/s, total_it=12

epochs:  65%|▋| 13/20 [1:08:56<36:35, 313.65s/it, loss=0.547, lr=0.00187, d_time[A
train:   4%|▋                  | 36/928 [00:12<05:10,  2.87it/s, total_it=12099][A
epochs:  65%|▋| 13/20 [1:08:57<36:35, 313.65s/it, loss=0.512, lr=0.00187, d_time[A
train:   4%|▊                  | 37/928 [00:13<04:58,  2.98it/s, total_it=12100][A
epochs:  65%|▋| 13/20 [1:08:57<36:35, 313.65s/it, loss=0.484, lr=0.00187, d_time[A
train:   4%|▊                  | 38/928 [00:13<04:55,  3.01it/s, total_it=12101][A
epochs:  65%|▋| 13/20 [1:08:57<36:35, 313.65s/it, loss=0.479, lr=0.00187, d_time[A
train:   4%|▊                  | 39/928 [00:13<04:50,  3.06it/s, total_it=12102][A
epochs:  65%|▋| 13/20 [1:08:58<36:35, 313.65s/it, loss=0.581, lr=0.00187, d_time[A
train:   4%|▊                  | 40/928 [00:14<04:45,  3.11it/s, total_it=12103][A
epochs:  65%|▋| 13/20 [1:08:58<36:35, 313.65s/it, loss=0.494, lr=0.00187, d_time[A
train:   4%|▊                  | 41/928 [00:14<04:55,  3.00it/s, total_it=12

epochs:  65%|▋| 13/20 [1:09:13<36:35, 313.65s/it, loss=0.544, lr=0.00185, d_time[A
train:   9%|█▋                 | 85/928 [00:29<04:32,  3.10it/s, total_it=12148][A
epochs:  65%|▋| 13/20 [1:09:13<36:35, 313.65s/it, loss=0.517, lr=0.00185, d_time[A
train:   9%|█▊                 | 86/928 [00:29<04:28,  3.14it/s, total_it=12149][A
epochs:  65%|▋| 13/20 [1:09:13<36:35, 313.65s/it, loss=0.399, lr=0.00185, d_time[A
train:   9%|█▊                 | 87/928 [00:29<04:33,  3.08it/s, total_it=12150][A
epochs:  65%|▋| 13/20 [1:09:14<36:35, 313.65s/it, loss=0.577, lr=0.00185, d_time[A
train:   9%|█▊                 | 88/928 [00:30<04:37,  3.02it/s, total_it=12151][A
epochs:  65%|▋| 13/20 [1:09:14<36:35, 313.65s/it, loss=0.464, lr=0.00185, d_time[A
train:  10%|█▊                 | 89/928 [00:30<04:57,  2.82it/s, total_it=12152][A
epochs:  65%|▋| 13/20 [1:09:14<36:35, 313.65s/it, loss=0.685, lr=0.00185, d_time[A
train:  10%|█▊                 | 90/928 [00:30<04:53,  2.85it/s, total_it=12

epochs:  65%|▋| 13/20 [1:09:29<36:35, 313.65s/it, loss=0.506, lr=0.00183, d_time[A
train:  14%|██▌               | 134/928 [00:45<04:27,  2.97it/s, total_it=12197][A
epochs:  65%|▋| 13/20 [1:09:29<36:35, 313.65s/it, loss=0.502, lr=0.00183, d_time[A
train:  15%|██▌               | 135/928 [00:45<04:25,  2.98it/s, total_it=12198][A
epochs:  65%|▋| 13/20 [1:09:30<36:35, 313.65s/it, loss=0.511, lr=0.00183, d_time[A
train:  15%|██▋               | 136/928 [00:46<04:22,  3.02it/s, total_it=12199][A
epochs:  65%|▋| 13/20 [1:09:30<36:35, 313.65s/it, loss=0.408, lr=0.00183, d_time[A
train:  15%|██▋               | 137/928 [00:46<04:23,  3.01it/s, total_it=12200][A
epochs:  65%|▋| 13/20 [1:09:30<36:35, 313.65s/it, loss=0.52, lr=0.00183, d_time=[A
train:  15%|██▋               | 138/928 [00:46<04:38,  2.84it/s, total_it=12201][A
epochs:  65%|▋| 13/20 [1:09:31<36:35, 313.65s/it, loss=0.487, lr=0.00183, d_time[A
train:  15%|██▋               | 139/928 [00:47<04:51,  2.71it/s, total_it=12

epochs:  65%|▋| 13/20 [1:09:46<36:35, 313.65s/it, loss=0.546, lr=0.00181, d_time[A
train:  20%|███▌              | 183/928 [01:02<04:07,  3.01it/s, total_it=12246][A
epochs:  65%|▋| 13/20 [1:09:46<36:35, 313.65s/it, loss=0.667, lr=0.00181, d_time[A
train:  20%|███▌              | 184/928 [01:02<04:02,  3.07it/s, total_it=12247][A
epochs:  65%|▋| 13/20 [1:09:46<36:35, 313.65s/it, loss=0.616, lr=0.00181, d_time[A
train:  20%|███▌              | 185/928 [01:02<04:01,  3.07it/s, total_it=12248][A
epochs:  65%|▋| 13/20 [1:09:47<36:35, 313.65s/it, loss=0.558, lr=0.00181, d_time[A
train:  20%|███▌              | 186/928 [01:03<03:59,  3.10it/s, total_it=12249][A
epochs:  65%|▋| 13/20 [1:09:47<36:35, 313.65s/it, loss=0.482, lr=0.00181, d_time[A
train:  20%|███▋              | 187/928 [01:03<04:04,  3.03it/s, total_it=12250][A
epochs:  65%|▋| 13/20 [1:09:47<36:35, 313.65s/it, loss=0.445, lr=0.00181, d_time[A
train:  20%|███▋              | 188/928 [01:03<04:03,  3.04it/s, total_it=12

epochs:  65%|▋| 13/20 [1:10:02<36:35, 313.65s/it, loss=0.446, lr=0.00179, d_time[A
train:  25%|████▌             | 232/928 [01:18<04:04,  2.84it/s, total_it=12295][A
epochs:  65%|▋| 13/20 [1:10:03<36:35, 313.65s/it, loss=0.502, lr=0.00179, d_time[A
train:  25%|████▌             | 233/928 [01:19<03:58,  2.92it/s, total_it=12296][A
epochs:  65%|▋| 13/20 [1:10:03<36:35, 313.65s/it, loss=0.647, lr=0.00179, d_time[A
train:  25%|████▌             | 234/928 [01:19<03:50,  3.02it/s, total_it=12297][A
epochs:  65%|▋| 13/20 [1:10:03<36:35, 313.65s/it, loss=0.516, lr=0.00179, d_time[A
train:  25%|████▌             | 235/928 [01:19<03:47,  3.04it/s, total_it=12298][A
epochs:  65%|▋| 13/20 [1:10:04<36:35, 313.65s/it, loss=0.529, lr=0.00179, d_time[A
train:  25%|████▌             | 236/928 [01:20<03:51,  2.99it/s, total_it=12299][A
epochs:  65%|▋| 13/20 [1:10:04<36:35, 313.65s/it, loss=0.525, lr=0.00179, d_time[A
train:  26%|████▌             | 237/928 [01:20<03:51,  2.99it/s, total_it=12

epochs:  65%|▋| 13/20 [1:10:19<36:35, 313.65s/it, loss=0.418, lr=0.00177, d_time[A
train:  30%|█████▍            | 281/928 [01:35<03:44,  2.88it/s, total_it=12344][A
epochs:  65%|▋| 13/20 [1:10:19<36:35, 313.65s/it, loss=0.543, lr=0.00177, d_time[A
train:  30%|█████▍            | 282/928 [01:35<03:46,  2.86it/s, total_it=12345][A
epochs:  65%|▋| 13/20 [1:10:20<36:35, 313.65s/it, loss=0.483, lr=0.00177, d_time[A
train:  30%|█████▍            | 283/928 [01:36<03:39,  2.94it/s, total_it=12346][A
epochs:  65%|▋| 13/20 [1:10:20<36:35, 313.65s/it, loss=0.513, lr=0.00177, d_time[A
train:  31%|█████▌            | 284/928 [01:36<03:45,  2.85it/s, total_it=12347][A
epochs:  65%|▋| 13/20 [1:10:20<36:35, 313.65s/it, loss=0.479, lr=0.00177, d_time[A
train:  31%|█████▌            | 285/928 [01:36<03:40,  2.92it/s, total_it=12348][A
epochs:  65%|▋| 13/20 [1:10:21<36:35, 313.65s/it, loss=0.457, lr=0.00177, d_time[A
train:  31%|█████▌            | 286/928 [01:37<03:35,  2.99it/s, total_it=12

epochs:  65%|▋| 13/20 [1:10:36<36:35, 313.65s/it, loss=0.539, lr=0.00175, d_time[A
train:  36%|██████▍           | 330/928 [01:52<03:29,  2.86it/s, total_it=12393][A
epochs:  65%|▋| 13/20 [1:10:36<36:35, 313.65s/it, loss=0.55, lr=0.00175, d_time=[A
train:  36%|██████▍           | 331/928 [01:52<03:24,  2.92it/s, total_it=12394][A
epochs:  65%|▋| 13/20 [1:10:37<36:35, 313.65s/it, loss=0.56, lr=0.00175, d_time=[A
train:  36%|██████▍           | 332/928 [01:53<03:36,  2.75it/s, total_it=12395][A
epochs:  65%|▋| 13/20 [1:10:37<36:35, 313.65s/it, loss=0.546, lr=0.00175, d_time[A
train:  36%|██████▍           | 333/928 [01:53<03:25,  2.90it/s, total_it=12396][A
epochs:  65%|▋| 13/20 [1:10:37<36:35, 313.65s/it, loss=0.392, lr=0.00175, d_time[A
train:  36%|██████▍           | 334/928 [01:53<03:25,  2.89it/s, total_it=12397][A
epochs:  65%|▋| 13/20 [1:10:38<36:35, 313.65s/it, loss=0.519, lr=0.00175, d_time[A
train:  36%|██████▍           | 335/928 [01:54<03:20,  2.95it/s, total_it=12

epochs:  65%|▋| 13/20 [1:10:53<36:35, 313.65s/it, loss=0.584, lr=0.00173, d_time[A
train:  41%|███████▎          | 379/928 [02:09<03:08,  2.91it/s, total_it=12442][A
epochs:  65%|▋| 13/20 [1:10:53<36:35, 313.65s/it, loss=0.61, lr=0.00173, d_time=[A
train:  41%|███████▎          | 380/928 [02:09<03:02,  3.00it/s, total_it=12443][A
epochs:  65%|▋| 13/20 [1:10:53<36:35, 313.65s/it, loss=0.556, lr=0.00173, d_time[A
train:  41%|███████▍          | 381/928 [02:09<03:02,  2.99it/s, total_it=12444][A
epochs:  65%|▋| 13/20 [1:10:54<36:35, 313.65s/it, loss=0.515, lr=0.00173, d_time[A
train:  41%|███████▍          | 382/928 [02:10<03:10,  2.87it/s, total_it=12445][A
epochs:  65%|▋| 13/20 [1:10:54<36:35, 313.65s/it, loss=0.563, lr=0.00173, d_time[A
train:  41%|███████▍          | 383/928 [02:10<03:03,  2.96it/s, total_it=12446][A
epochs:  65%|▋| 13/20 [1:10:54<36:35, 313.65s/it, loss=0.418, lr=0.00173, d_time[A
train:  41%|███████▍          | 384/928 [02:10<02:59,  3.03it/s, total_it=12

epochs:  65%|▋| 13/20 [1:11:09<36:35, 313.65s/it, loss=0.589, lr=0.00171, d_time[A
train:  46%|████████▎         | 428/928 [02:25<02:53,  2.89it/s, total_it=12491][A
epochs:  65%|▋| 13/20 [1:11:10<36:35, 313.65s/it, loss=0.629, lr=0.00171, d_time[A
train:  46%|████████▎         | 429/928 [02:26<02:56,  2.83it/s, total_it=12492][A
epochs:  65%|▋| 13/20 [1:11:10<36:35, 313.65s/it, loss=0.505, lr=0.00171, d_time[A
train:  46%|████████▎         | 430/928 [02:26<02:51,  2.90it/s, total_it=12493][A
epochs:  65%|▋| 13/20 [1:11:10<36:35, 313.65s/it, loss=0.624, lr=0.00171, d_time[A
train:  46%|████████▎         | 431/928 [02:26<02:50,  2.91it/s, total_it=12494][A
epochs:  65%|▋| 13/20 [1:11:11<36:35, 313.65s/it, loss=0.632, lr=0.00171, d_time[A
train:  47%|████████▍         | 432/928 [02:27<02:47,  2.97it/s, total_it=12495][A
epochs:  65%|▋| 13/20 [1:11:11<36:35, 313.65s/it, loss=0.513, lr=0.00171, d_time[A
train:  47%|████████▍         | 433/928 [02:27<02:49,  2.91it/s, total_it=12

epochs:  65%|▋| 13/20 [1:11:26<36:35, 313.65s/it, loss=0.605, lr=0.00169, d_time[A
train:  51%|█████████▎        | 477/928 [02:42<02:32,  2.95it/s, total_it=12540][A
epochs:  65%|▋| 13/20 [1:11:26<36:35, 313.65s/it, loss=0.532, lr=0.00169, d_time[A
train:  52%|█████████▎        | 478/928 [02:42<02:31,  2.98it/s, total_it=12541][A
epochs:  65%|▋| 13/20 [1:11:27<36:35, 313.65s/it, loss=0.571, lr=0.00169, d_time[A
train:  52%|█████████▎        | 479/928 [02:42<02:29,  3.00it/s, total_it=12542][A
epochs:  65%|▋| 13/20 [1:11:27<36:35, 313.65s/it, loss=0.585, lr=0.00169, d_time[A
train:  52%|█████████▎        | 480/928 [02:43<02:25,  3.08it/s, total_it=12543][A
epochs:  65%|▋| 13/20 [1:11:27<36:35, 313.65s/it, loss=0.644, lr=0.00169, d_time[A
train:  52%|█████████▎        | 481/928 [02:43<02:26,  3.05it/s, total_it=12544][A
epochs:  65%|▋| 13/20 [1:11:27<36:35, 313.65s/it, loss=0.458, lr=0.00169, d_time[A
train:  52%|█████████▎        | 482/928 [02:43<02:35,  2.87it/s, total_it=12

epochs:  65%|▋| 13/20 [1:11:42<36:35, 313.65s/it, loss=0.592, lr=0.00167, d_time[A
train:  57%|██████████▏       | 526/928 [02:58<02:16,  2.94it/s, total_it=12589][A
epochs:  65%|▋| 13/20 [1:11:43<36:35, 313.65s/it, loss=0.66, lr=0.00167, d_time=[A
train:  57%|██████████▏       | 527/928 [02:59<02:13,  3.01it/s, total_it=12590][A
epochs:  65%|▋| 13/20 [1:11:43<36:35, 313.65s/it, loss=0.614, lr=0.00167, d_time[A
train:  57%|██████████▏       | 528/928 [02:59<02:12,  3.01it/s, total_it=12591][A
epochs:  65%|▋| 13/20 [1:11:43<36:35, 313.65s/it, loss=0.575, lr=0.00167, d_time[A
train:  57%|██████████▎       | 529/928 [02:59<02:11,  3.04it/s, total_it=12592][A
epochs:  65%|▋| 13/20 [1:11:44<36:35, 313.65s/it, loss=0.526, lr=0.00167, d_time[A
train:  57%|██████████▎       | 530/928 [03:00<02:12,  2.99it/s, total_it=12593][A
epochs:  65%|▋| 13/20 [1:11:44<36:35, 313.65s/it, loss=0.563, lr=0.00167, d_time[A
train:  57%|██████████▎       | 531/928 [03:00<02:14,  2.95it/s, total_it=12

epochs:  65%|▋| 13/20 [1:11:59<36:35, 313.65s/it, loss=0.562, lr=0.00165, d_time[A
train:  62%|███████████▏      | 575/928 [03:15<02:00,  2.93it/s, total_it=12638][A
epochs:  65%|▋| 13/20 [1:11:59<36:35, 313.65s/it, loss=0.466, lr=0.00165, d_time[A
train:  62%|███████████▏      | 576/928 [03:15<02:06,  2.78it/s, total_it=12639][A
epochs:  65%|▋| 13/20 [1:12:00<36:35, 313.65s/it, loss=0.482, lr=0.00165, d_time[A
train:  62%|███████████▏      | 577/928 [03:16<02:00,  2.91it/s, total_it=12640][A
epochs:  65%|▋| 13/20 [1:12:00<36:35, 313.65s/it, loss=0.435, lr=0.00165, d_time[A
train:  62%|███████████▏      | 578/928 [03:16<02:02,  2.85it/s, total_it=12641][A
epochs:  65%|▋| 13/20 [1:12:00<36:35, 313.65s/it, loss=0.588, lr=0.00165, d_time[A
train:  62%|███████████▏      | 579/928 [03:16<01:56,  3.00it/s, total_it=12642][A
epochs:  65%|▋| 13/20 [1:12:01<36:35, 313.65s/it, loss=0.552, lr=0.00165, d_time[A
train:  62%|███████████▎      | 580/928 [03:17<01:53,  3.07it/s, total_it=12

epochs:  65%|▋| 13/20 [1:12:15<36:35, 313.65s/it, loss=0.592, lr=0.00163, d_time[A
train:  67%|████████████      | 624/928 [03:31<01:42,  2.95it/s, total_it=12687][A
epochs:  65%|▋| 13/20 [1:12:16<36:35, 313.65s/it, loss=0.636, lr=0.00163, d_time[A
train:  67%|████████████      | 625/928 [03:32<01:42,  2.96it/s, total_it=12688][A
epochs:  65%|▋| 13/20 [1:12:16<36:35, 313.65s/it, loss=0.504, lr=0.00163, d_time[A
train:  67%|████████████▏     | 626/928 [03:32<01:46,  2.84it/s, total_it=12689][A
epochs:  65%|▋| 13/20 [1:12:16<36:35, 313.65s/it, loss=0.495, lr=0.00163, d_time[A
train:  68%|████████████▏     | 627/928 [03:32<01:43,  2.90it/s, total_it=12690][A
epochs:  65%|▋| 13/20 [1:12:17<36:35, 313.65s/it, loss=0.523, lr=0.00163, d_time[A
train:  68%|████████████▏     | 628/928 [03:33<01:40,  2.98it/s, total_it=12691][A
epochs:  65%|▋| 13/20 [1:12:17<36:35, 313.65s/it, loss=0.491, lr=0.00163, d_time[A
train:  68%|████████████▏     | 629/928 [03:33<01:40,  2.99it/s, total_it=12

epochs:  65%|▋| 13/20 [1:12:32<36:35, 313.65s/it, loss=0.544, lr=0.00161, d_time[A
train:  73%|█████████████     | 673/928 [03:48<01:22,  3.08it/s, total_it=12736][A
epochs:  65%|▋| 13/20 [1:12:32<36:35, 313.65s/it, loss=0.5, lr=0.00161, d_time=0[A
train:  73%|█████████████     | 674/928 [03:48<01:23,  3.03it/s, total_it=12737][A
epochs:  65%|▋| 13/20 [1:12:32<36:35, 313.65s/it, loss=0.53, lr=0.00161, d_time=[A
train:  73%|█████████████     | 675/928 [03:48<01:21,  3.10it/s, total_it=12738][A
epochs:  65%|▋| 13/20 [1:12:33<36:35, 313.65s/it, loss=0.518, lr=0.00161, d_time[A
train:  73%|█████████████     | 676/928 [03:49<01:27,  2.88it/s, total_it=12739][A
epochs:  65%|▋| 13/20 [1:12:33<36:35, 313.65s/it, loss=0.496, lr=0.00161, d_time[A
train:  73%|█████████████▏    | 677/928 [03:49<01:25,  2.95it/s, total_it=12740][A
epochs:  65%|▋| 13/20 [1:12:34<36:35, 313.65s/it, loss=0.571, lr=0.00161, d_time[A
train:  73%|█████████████▏    | 678/928 [03:50<01:30,  2.76it/s, total_it=12

epochs:  65%|▋| 13/20 [1:12:49<36:35, 313.65s/it, loss=0.56, lr=0.00159, d_time=[A
train:  78%|██████████████    | 722/928 [04:05<01:09,  2.98it/s, total_it=12785][A
epochs:  65%|▋| 13/20 [1:12:49<36:35, 313.65s/it, loss=0.529, lr=0.00159, d_time[A
train:  78%|██████████████    | 723/928 [04:05<01:08,  3.00it/s, total_it=12786][A
epochs:  65%|▋| 13/20 [1:12:49<36:35, 313.65s/it, loss=0.525, lr=0.00159, d_time[A
train:  78%|██████████████    | 724/928 [04:05<01:08,  2.96it/s, total_it=12787][A
epochs:  65%|▋| 13/20 [1:12:50<36:35, 313.65s/it, loss=0.518, lr=0.00159, d_time[A
train:  78%|██████████████    | 725/928 [04:06<01:08,  2.98it/s, total_it=12788][A
epochs:  65%|▋| 13/20 [1:12:50<36:35, 313.65s/it, loss=0.501, lr=0.00159, d_time[A
train:  78%|██████████████    | 726/928 [04:06<01:08,  2.94it/s, total_it=12789][A
epochs:  65%|▋| 13/20 [1:12:50<36:35, 313.65s/it, loss=0.432, lr=0.00159, d_time[A
train:  78%|██████████████    | 727/928 [04:06<01:06,  3.04it/s, total_it=12

epochs:  65%|▋| 13/20 [1:13:05<36:35, 313.65s/it, loss=0.532, lr=0.00157, d_time[A
train:  83%|██████████████▉   | 771/928 [04:21<00:52,  3.00it/s, total_it=12834][A
epochs:  65%|▋| 13/20 [1:13:06<36:35, 313.65s/it, loss=0.539, lr=0.00157, d_time[A
train:  83%|██████████████▉   | 772/928 [04:22<00:52,  2.99it/s, total_it=12835][A
epochs:  65%|▋| 13/20 [1:13:06<36:35, 313.65s/it, loss=0.464, lr=0.00157, d_time[A
train:  83%|██████████████▉   | 773/928 [04:22<00:50,  3.07it/s, total_it=12836][A
epochs:  65%|▋| 13/20 [1:13:06<36:35, 313.65s/it, loss=0.5, lr=0.00157, d_time=0[A
train:  83%|███████████████   | 774/928 [04:22<00:51,  2.98it/s, total_it=12837][A
epochs:  65%|▋| 13/20 [1:13:07<36:35, 313.65s/it, loss=0.498, lr=0.00157, d_time[A
train:  84%|███████████████   | 775/928 [04:23<00:52,  2.93it/s, total_it=12838][A
epochs:  65%|▋| 13/20 [1:13:07<36:35, 313.65s/it, loss=0.528, lr=0.00157, d_time[A
train:  84%|███████████████   | 776/928 [04:23<00:51,  2.96it/s, total_it=12

epochs:  65%|▋| 13/20 [1:13:22<36:35, 313.65s/it, loss=0.541, lr=0.00155, d_time[A
train:  88%|███████████████▉  | 820/928 [04:38<00:35,  3.03it/s, total_it=12883][A
epochs:  65%|▋| 13/20 [1:13:23<36:35, 313.65s/it, loss=0.615, lr=0.00155, d_time[A
train:  88%|███████████████▉  | 821/928 [04:39<00:36,  2.95it/s, total_it=12884][A
epochs:  65%|▋| 13/20 [1:13:23<36:35, 313.65s/it, loss=0.566, lr=0.00155, d_time[A
train:  89%|███████████████▉  | 822/928 [04:39<00:35,  3.01it/s, total_it=12885][A
epochs:  65%|▋| 13/20 [1:13:23<36:35, 313.65s/it, loss=0.458, lr=0.00155, d_time[A
train:  89%|███████████████▉  | 823/928 [04:39<00:34,  3.07it/s, total_it=12886][A
epochs:  65%|▋| 13/20 [1:13:24<36:35, 313.65s/it, loss=0.522, lr=0.00154, d_time[A
train:  89%|███████████████▉  | 824/928 [04:40<00:33,  3.07it/s, total_it=12887][A
epochs:  65%|▋| 13/20 [1:13:24<36:35, 313.65s/it, loss=0.458, lr=0.00154, d_time[A
train:  89%|████████████████  | 825/928 [04:40<00:33,  3.10it/s, total_it=12

epochs:  65%|▋| 13/20 [1:13:39<36:35, 313.65s/it, loss=0.549, lr=0.00153, d_time[A
train:  94%|████████████████▊ | 869/928 [04:55<00:18,  3.12it/s, total_it=12932][A
epochs:  65%|▋| 13/20 [1:13:39<36:35, 313.65s/it, loss=0.459, lr=0.00153, d_time[A
train:  94%|████████████████▉ | 870/928 [04:55<00:19,  2.92it/s, total_it=12933][A
epochs:  65%|▋| 13/20 [1:13:40<36:35, 313.65s/it, loss=0.489, lr=0.00152, d_time[A
train:  94%|████████████████▉ | 871/928 [04:56<00:19,  2.90it/s, total_it=12934][A
epochs:  65%|▋| 13/20 [1:13:40<36:35, 313.65s/it, loss=0.508, lr=0.00152, d_time[A
train:  94%|████████████████▉ | 872/928 [04:56<00:18,  2.96it/s, total_it=12935][A
epochs:  65%|▋| 13/20 [1:13:40<36:35, 313.65s/it, loss=0.586, lr=0.00152, d_time[A
train:  94%|████████████████▉ | 873/928 [04:56<00:18,  2.97it/s, total_it=12936][A
epochs:  65%|▋| 13/20 [1:13:41<36:35, 313.65s/it, loss=0.426, lr=0.00152, d_time[A
train:  94%|████████████████▉ | 874/928 [04:57<00:18,  2.91it/s, total_it=12

epochs:  65%|▋| 13/20 [1:13:55<36:35, 313.65s/it, loss=0.536, lr=0.00151, d_time[A
train:  99%|█████████████████▊| 918/928 [05:11<00:03,  3.20it/s, total_it=12981][A
epochs:  65%|▋| 13/20 [1:13:56<36:35, 313.65s/it, loss=0.639, lr=0.0015, d_time=[A
train:  99%|█████████████████▊| 919/928 [05:12<00:02,  3.14it/s, total_it=12982][A
epochs:  65%|▋| 13/20 [1:13:56<36:35, 313.65s/it, loss=0.552, lr=0.0015, d_time=[A
train:  99%|█████████████████▊| 920/928 [05:12<00:02,  3.18it/s, total_it=12983][A
epochs:  65%|▋| 13/20 [1:13:56<36:35, 313.65s/it, loss=0.381, lr=0.0015, d_time=[A
train:  99%|█████████████████▊| 921/928 [05:12<00:02,  3.20it/s, total_it=12984][A
epochs:  65%|▋| 13/20 [1:13:57<36:35, 313.65s/it, loss=0.499, lr=0.0015, d_time=[A
train:  99%|█████████████████▉| 922/928 [05:13<00:01,  3.17it/s, total_it=12985][A
epochs:  65%|▋| 13/20 [1:13:57<36:35, 313.65s/it, loss=0.512, lr=0.0015, d_time=[A
train:  99%|█████████████████▉| 923/928 [05:13<00:01,  3.20it/s, total_it=12

epochs:  70%|▋| 14/20 [1:14:12<31:24, 314.12s/it, loss=0.427, lr=0.00148, d_time[A
train:   4%|▊                  | 38/928 [00:13<05:05,  2.91it/s, total_it=13029][A
epochs:  70%|▋| 14/20 [1:14:13<31:24, 314.12s/it, loss=0.468, lr=0.00148, d_time[A
train:   4%|▊                  | 39/928 [00:14<04:55,  3.01it/s, total_it=13030][A
epochs:  70%|▋| 14/20 [1:14:13<31:24, 314.12s/it, loss=0.551, lr=0.00148, d_time[A
train:   4%|▊                  | 40/928 [00:14<04:51,  3.04it/s, total_it=13031][A
epochs:  70%|▋| 14/20 [1:14:13<31:24, 314.12s/it, loss=0.566, lr=0.00148, d_time[A
train:   4%|▊                  | 41/928 [00:14<04:53,  3.03it/s, total_it=13032][A
epochs:  70%|▋| 14/20 [1:14:14<31:24, 314.12s/it, loss=0.551, lr=0.00148, d_time[A
train:   5%|▊                  | 42/928 [00:15<05:07,  2.88it/s, total_it=13033][A
epochs:  70%|▋| 14/20 [1:14:14<31:24, 314.12s/it, loss=0.469, lr=0.00148, d_time[A
train:   5%|▉                  | 43/928 [00:15<05:03,  2.91it/s, total_it=13

epochs:  70%|▋| 14/20 [1:14:29<31:24, 314.12s/it, loss=0.548, lr=0.00146, d_time[A
train:   9%|█▊                 | 87/928 [00:30<05:11,  2.70it/s, total_it=13078][A
epochs:  70%|▋| 14/20 [1:14:30<31:24, 314.12s/it, loss=0.433, lr=0.00146, d_time[A
train:   9%|█▊                 | 88/928 [00:30<05:04,  2.76it/s, total_it=13079][A
epochs:  70%|▋| 14/20 [1:14:30<31:24, 314.12s/it, loss=0.454, lr=0.00146, d_time[A
train:  10%|█▊                 | 89/928 [00:31<04:51,  2.88it/s, total_it=13080][A
epochs:  70%|▋| 14/20 [1:14:30<31:24, 314.12s/it, loss=0.675, lr=0.00146, d_time[A
train:  10%|█▊                 | 90/928 [00:31<04:51,  2.87it/s, total_it=13081][A
epochs:  70%|▋| 14/20 [1:14:31<31:24, 314.12s/it, loss=0.559, lr=0.00146, d_time[A
train:  10%|█▊                 | 91/928 [00:31<05:05,  2.74it/s, total_it=13082][A
epochs:  70%|▋| 14/20 [1:14:31<31:24, 314.12s/it, loss=0.575, lr=0.00146, d_time[A
train:  10%|█▉                 | 92/928 [00:32<04:50,  2.88it/s, total_it=13

epochs:  70%|▋| 14/20 [1:14:46<31:24, 314.12s/it, loss=0.409, lr=0.00144, d_time[A
train:  15%|██▋               | 136/928 [00:46<04:19,  3.05it/s, total_it=13127][A
epochs:  70%|▋| 14/20 [1:14:46<31:24, 314.12s/it, loss=0.499, lr=0.00144, d_time[A
train:  15%|██▋               | 137/928 [00:47<04:18,  3.06it/s, total_it=13128][A
epochs:  70%|▋| 14/20 [1:14:46<31:24, 314.12s/it, loss=0.464, lr=0.00144, d_time[A
train:  15%|██▋               | 138/928 [00:47<04:09,  3.17it/s, total_it=13129][A
epochs:  70%|▋| 14/20 [1:14:47<31:24, 314.12s/it, loss=0.515, lr=0.00144, d_time[A
train:  15%|██▋               | 139/928 [00:47<04:18,  3.05it/s, total_it=13130][A
epochs:  70%|▋| 14/20 [1:14:47<31:24, 314.12s/it, loss=0.509, lr=0.00144, d_time[A
train:  15%|██▋               | 140/928 [00:48<04:35,  2.86it/s, total_it=13131][A
epochs:  70%|▋| 14/20 [1:14:47<31:24, 314.12s/it, loss=0.485, lr=0.00144, d_time[A
train:  15%|██▋               | 141/928 [00:48<04:33,  2.88it/s, total_it=13

epochs:  70%|▋| 14/20 [1:15:02<31:24, 314.12s/it, loss=0.567, lr=0.00142, d_time[A
train:  20%|███▌              | 185/928 [01:03<04:03,  3.06it/s, total_it=13176][A
epochs:  70%|▋| 14/20 [1:15:02<31:24, 314.12s/it, loss=0.527, lr=0.00142, d_time[A
train:  20%|███▌              | 186/928 [01:03<04:01,  3.07it/s, total_it=13177][A
epochs:  70%|▋| 14/20 [1:15:03<31:24, 314.12s/it, loss=0.502, lr=0.00142, d_time[A
train:  20%|███▋              | 187/928 [01:03<03:58,  3.11it/s, total_it=13178][A
epochs:  70%|▋| 14/20 [1:15:03<31:24, 314.12s/it, loss=0.448, lr=0.00142, d_time[A
train:  20%|███▋              | 188/928 [01:04<03:57,  3.11it/s, total_it=13179][A
epochs:  70%|▋| 14/20 [1:15:03<31:24, 314.12s/it, loss=0.486, lr=0.00142, d_time[A
train:  20%|███▋              | 189/928 [01:04<03:53,  3.16it/s, total_it=13180][A
epochs:  70%|▋| 14/20 [1:15:04<31:24, 314.12s/it, loss=0.457, lr=0.00142, d_time[A
train:  20%|███▋              | 190/928 [01:04<03:52,  3.17it/s, total_it=13

epochs:  70%|▋| 14/20 [1:15:19<31:24, 314.12s/it, loss=0.526, lr=0.0014, d_time=[A
train:  25%|████▌             | 234/928 [01:19<03:49,  3.02it/s, total_it=13225][A
epochs:  70%|▋| 14/20 [1:15:19<31:24, 314.12s/it, loss=0.472, lr=0.0014, d_time=[A
train:  25%|████▌             | 235/928 [01:20<03:49,  3.02it/s, total_it=13226][A
epochs:  70%|▋| 14/20 [1:15:19<31:24, 314.12s/it, loss=0.443, lr=0.0014, d_time=[A
train:  25%|████▌             | 236/928 [01:20<03:49,  3.01it/s, total_it=13227][A
epochs:  70%|▋| 14/20 [1:15:20<31:24, 314.12s/it, loss=0.583, lr=0.0014, d_time=[A
train:  26%|████▌             | 237/928 [01:20<03:54,  2.94it/s, total_it=13228][A
epochs:  70%|▋| 14/20 [1:15:20<31:24, 314.12s/it, loss=0.495, lr=0.0014, d_time=[A
train:  26%|████▌             | 238/928 [01:21<03:50,  2.99it/s, total_it=13229][A
epochs:  70%|▋| 14/20 [1:15:20<31:24, 314.12s/it, loss=0.456, lr=0.0014, d_time=[A
train:  26%|████▋             | 239/928 [01:21<03:48,  3.01it/s, total_it=13

epochs:  70%|▋| 14/20 [1:15:35<31:24, 314.12s/it, loss=0.419, lr=0.00138, d_time[A
train:  30%|█████▍            | 283/928 [01:36<03:30,  3.07it/s, total_it=13274][A
epochs:  70%|▋| 14/20 [1:15:36<31:24, 314.12s/it, loss=0.448, lr=0.00138, d_time[A
train:  31%|█████▌            | 284/928 [01:36<03:33,  3.02it/s, total_it=13275][A
epochs:  70%|▋| 14/20 [1:15:36<31:24, 314.12s/it, loss=0.51, lr=0.00138, d_time=[A
train:  31%|█████▌            | 285/928 [01:37<03:31,  3.03it/s, total_it=13276][A
epochs:  70%|▋| 14/20 [1:15:36<31:24, 314.12s/it, loss=0.49, lr=0.00138, d_time=[A
train:  31%|█████▌            | 286/928 [01:37<03:32,  3.02it/s, total_it=13277][A
epochs:  70%|▋| 14/20 [1:15:37<31:24, 314.12s/it, loss=0.512, lr=0.00138, d_time[A
train:  31%|█████▌            | 287/928 [01:37<03:34,  2.99it/s, total_it=13278][A
epochs:  70%|▋| 14/20 [1:15:37<31:24, 314.12s/it, loss=0.502, lr=0.00138, d_time[A
train:  31%|█████▌            | 288/928 [01:38<03:34,  2.98it/s, total_it=13

epochs:  70%|▋| 14/20 [1:15:52<31:24, 314.12s/it, loss=0.468, lr=0.00136, d_time[A
train:  36%|██████▍           | 332/928 [01:53<03:23,  2.93it/s, total_it=13323][A
epochs:  70%|▋| 14/20 [1:15:52<31:24, 314.12s/it, loss=0.595, lr=0.00136, d_time[A
train:  36%|██████▍           | 333/928 [01:53<03:20,  2.96it/s, total_it=13324][A
epochs:  70%|▋| 14/20 [1:15:52<31:24, 314.12s/it, loss=0.455, lr=0.00136, d_time[A
train:  36%|██████▍           | 334/928 [01:53<03:14,  3.05it/s, total_it=13325][A
epochs:  70%|▋| 14/20 [1:15:53<31:24, 314.12s/it, loss=0.464, lr=0.00136, d_time[A
train:  36%|██████▍           | 335/928 [01:54<03:16,  3.02it/s, total_it=13326][A
epochs:  70%|▋| 14/20 [1:15:53<31:24, 314.12s/it, loss=0.446, lr=0.00136, d_time[A
train:  36%|██████▌           | 336/928 [01:54<03:15,  3.03it/s, total_it=13327][A
epochs:  70%|▋| 14/20 [1:15:53<31:24, 314.12s/it, loss=0.451, lr=0.00136, d_time[A
train:  36%|██████▌           | 337/928 [01:54<03:18,  2.97it/s, total_it=13

epochs:  70%|▋| 14/20 [1:16:08<31:24, 314.12s/it, loss=0.54, lr=0.00134, d_time=[A
train:  41%|███████▍          | 381/928 [02:09<02:55,  3.12it/s, total_it=13372][A
epochs:  70%|▋| 14/20 [1:16:08<31:24, 314.12s/it, loss=0.618, lr=0.00134, d_time[A
train:  41%|███████▍          | 382/928 [02:09<02:53,  3.14it/s, total_it=13373][A
epochs:  70%|▋| 14/20 [1:16:09<31:24, 314.12s/it, loss=0.422, lr=0.00134, d_time[A
train:  41%|███████▍          | 383/928 [02:10<03:05,  2.93it/s, total_it=13374][A
epochs:  70%|▋| 14/20 [1:16:09<31:24, 314.12s/it, loss=0.555, lr=0.00134, d_time[A
train:  41%|███████▍          | 384/928 [02:10<02:58,  3.04it/s, total_it=13375][A
epochs:  70%|▋| 14/20 [1:16:09<31:24, 314.12s/it, loss=0.48, lr=0.00134, d_time=[A
train:  41%|███████▍          | 385/928 [02:10<02:57,  3.06it/s, total_it=13376][A
epochs:  70%|▋| 14/20 [1:16:10<31:24, 314.12s/it, loss=0.435, lr=0.00134, d_time[A
train:  42%|███████▍          | 386/928 [02:10<02:56,  3.07it/s, total_it=13

epochs:  70%|▋| 14/20 [1:16:25<31:24, 314.12s/it, loss=0.588, lr=0.00132, d_time[A
train:  46%|████████▎         | 430/928 [02:26<02:53,  2.87it/s, total_it=13421][A
epochs:  70%|▋| 14/20 [1:16:25<31:24, 314.12s/it, loss=0.529, lr=0.00132, d_time[A
train:  46%|████████▎         | 431/928 [02:26<02:51,  2.90it/s, total_it=13422][A
epochs:  70%|▋| 14/20 [1:16:25<31:24, 314.12s/it, loss=0.445, lr=0.00132, d_time[A
train:  47%|████████▍         | 432/928 [02:26<02:47,  2.97it/s, total_it=13423][A
epochs:  70%|▋| 14/20 [1:16:26<31:24, 314.12s/it, loss=0.59, lr=0.00132, d_time=[A
train:  47%|████████▍         | 433/928 [02:26<02:45,  2.99it/s, total_it=13424][A
epochs:  70%|▋| 14/20 [1:16:26<31:24, 314.12s/it, loss=0.539, lr=0.00132, d_time[A
train:  47%|████████▍         | 434/928 [02:27<02:49,  2.92it/s, total_it=13425][A
epochs:  70%|▋| 14/20 [1:16:26<31:24, 314.12s/it, loss=0.455, lr=0.00132, d_time[A
train:  47%|████████▍         | 435/928 [02:27<02:45,  2.99it/s, total_it=13

epochs:  70%|▋| 14/20 [1:16:41<31:24, 314.12s/it, loss=0.525, lr=0.0013, d_time=[A
train:  52%|█████████▎        | 479/928 [02:42<02:23,  3.14it/s, total_it=13470][A
epochs:  70%|▋| 14/20 [1:16:41<31:24, 314.12s/it, loss=0.447, lr=0.0013, d_time=[A
train:  52%|█████████▎        | 480/928 [02:42<02:21,  3.16it/s, total_it=13471][A
epochs:  70%|▋| 14/20 [1:16:42<31:24, 314.12s/it, loss=0.612, lr=0.0013, d_time=[A
train:  52%|█████████▎        | 481/928 [02:43<02:23,  3.12it/s, total_it=13472][A
epochs:  70%|▋| 14/20 [1:16:42<31:24, 314.12s/it, loss=0.443, lr=0.0013, d_time=[A
train:  52%|█████████▎        | 482/928 [02:43<02:20,  3.17it/s, total_it=13473][A
epochs:  70%|▋| 14/20 [1:16:42<31:24, 314.12s/it, loss=0.528, lr=0.0013, d_time=[A
train:  52%|█████████▎        | 483/928 [02:43<02:19,  3.18it/s, total_it=13474][A
epochs:  70%|▋| 14/20 [1:16:43<31:24, 314.12s/it, loss=0.477, lr=0.0013, d_time=[A
train:  52%|█████████▍        | 484/928 [02:43<02:18,  3.22it/s, total_it=13

epochs:  70%|▋| 14/20 [1:16:58<31:24, 314.12s/it, loss=0.581, lr=0.00128, d_time[A
train:  57%|██████████▏       | 528/928 [02:58<02:23,  2.78it/s, total_it=13519][A
epochs:  70%|▋| 14/20 [1:16:58<31:24, 314.12s/it, loss=0.591, lr=0.00128, d_time[A
train:  57%|██████████▎       | 529/928 [02:59<02:25,  2.73it/s, total_it=13520][A
epochs:  70%|▋| 14/20 [1:16:58<31:24, 314.12s/it, loss=0.454, lr=0.00128, d_time[A
train:  57%|██████████▎       | 530/928 [02:59<02:17,  2.89it/s, total_it=13521][A
epochs:  70%|▋| 14/20 [1:16:59<31:24, 314.12s/it, loss=0.503, lr=0.00128, d_time[A
train:  57%|██████████▎       | 531/928 [02:59<02:13,  2.97it/s, total_it=13522][A
epochs:  70%|▋| 14/20 [1:16:59<31:24, 314.12s/it, loss=0.462, lr=0.00128, d_time[A
train:  57%|██████████▎       | 532/928 [03:00<02:17,  2.88it/s, total_it=13523][A
epochs:  70%|▋| 14/20 [1:16:59<31:24, 314.12s/it, loss=0.709, lr=0.00128, d_time[A
train:  57%|██████████▎       | 533/928 [03:00<02:14,  2.94it/s, total_it=13

epochs:  70%|▋| 14/20 [1:17:14<31:24, 314.12s/it, loss=0.457, lr=0.00126, d_time[A
train:  62%|███████████▏      | 577/928 [03:15<01:58,  2.97it/s, total_it=13568][A
epochs:  70%|▋| 14/20 [1:17:14<31:24, 314.12s/it, loss=0.438, lr=0.00126, d_time[A
train:  62%|███████████▏      | 578/928 [03:15<01:57,  2.97it/s, total_it=13569][A
epochs:  70%|▋| 14/20 [1:17:15<31:24, 314.12s/it, loss=0.427, lr=0.00126, d_time[A
train:  62%|███████████▏      | 579/928 [03:15<01:56,  2.98it/s, total_it=13570][A
epochs:  70%|▋| 14/20 [1:17:15<31:24, 314.12s/it, loss=0.62, lr=0.00126, d_time=[A
train:  62%|███████████▎      | 580/928 [03:16<01:56,  3.00it/s, total_it=13571][A
epochs:  70%|▋| 14/20 [1:17:15<31:24, 314.12s/it, loss=0.503, lr=0.00126, d_time[A
train:  63%|███████████▎      | 581/928 [03:16<01:54,  3.02it/s, total_it=13572][A
epochs:  70%|▋| 14/20 [1:17:16<31:24, 314.12s/it, loss=0.52, lr=0.00126, d_time=[A
train:  63%|███████████▎      | 582/928 [03:16<01:52,  3.09it/s, total_it=13

epochs:  70%|▋| 14/20 [1:17:30<31:24, 314.12s/it, loss=0.55, lr=0.00124, d_time=[A
train:  67%|████████████▏     | 626/928 [03:31<01:42,  2.95it/s, total_it=13617][A
epochs:  70%|▋| 14/20 [1:17:31<31:24, 314.12s/it, loss=0.463, lr=0.00124, d_time[A
train:  68%|████████████▏     | 627/928 [03:32<01:41,  2.97it/s, total_it=13618][A
epochs:  70%|▋| 14/20 [1:17:31<31:24, 314.12s/it, loss=0.446, lr=0.00124, d_time[A
train:  68%|████████████▏     | 628/928 [03:32<01:42,  2.92it/s, total_it=13619][A
epochs:  70%|▋| 14/20 [1:17:31<31:24, 314.12s/it, loss=0.502, lr=0.00124, d_time[A
train:  68%|████████████▏     | 629/928 [03:32<01:43,  2.89it/s, total_it=13620][A
epochs:  70%|▋| 14/20 [1:17:32<31:24, 314.12s/it, loss=0.536, lr=0.00124, d_time[A
train:  68%|████████████▏     | 630/928 [03:33<01:40,  2.96it/s, total_it=13621][A
epochs:  70%|▋| 14/20 [1:17:32<31:24, 314.12s/it, loss=0.595, lr=0.00124, d_time[A
train:  68%|████████████▏     | 631/928 [03:33<01:38,  3.01it/s, total_it=13

epochs:  70%|▋| 14/20 [1:17:47<31:24, 314.12s/it, loss=0.42, lr=0.00122, d_time=
train:  73%|█████████████     | 675/928 [03:48<01:22,  3.08it/s, total_it=13666]

epochs:  70%|▋| 14/20 [1:18:03<31:24, 314.12s/it, loss=0.551, lr=0.0012, d_time=
train:  78%|██████████████    | 724/928 [04:04<01:09,  2.95it/s, total_it=13715]

epochs:  70%|▋| 14/20 [1:18:20<31:24, 314.12s/it, loss=0.472, lr=0.00118, d_time
train:  83%|██████████████▉   | 773/928 [04:21<00:52,  2.97it/s, total_it=13764]

epochs:  70%|▋| 14/20 [1:18:36<31:24, 314.12s/it, loss=0.441, lr=0.00116, d_time
train:  89%|███████████████▉  | 822/928 [04:37<00:35,  3.01it/s, total_it=13813]

epochs:  70%|▋| 14/20 [1:18:53<31:24, 314.12s/it, loss=0.484, lr=0.00114, d_time
train:  94%|████████████████▉ | 871/928 [04:53<00:18,  3.09it/s, total_it=13862]

epochs:  70%|▋| 14/20 [1:19:09<31:24, 314.12s/it, loss=0.602, lr=0.00112, d_time
train:  99%|█████████████████▊| 920/928 [05:10<00:02,  3.05it/s, total_it=13911]

epochs:  75%|▊| 15/20 [1:19:26<26:09, 313.81s/it, loss=0.422, lr=0.0011, d_time=
train:   4%|▊                  | 40/928 [00:14<05:08,  2.88it/s, total_it=13959]

epochs:  75%|▊| 15/20 [1:19:43<26:09, 313.81s/it, loss=0.402, lr=0.00108, d_time
train:  10%|█▊                 | 89/928 [00:30<04:26,  3.14it/s, total_it=14008]

epochs:  75%|▊| 15/20 [1:19:59<26:09, 313.81s/it, loss=0.441, lr=0.00106, d_time
train:  15%|██▋               | 138/928 [00:47<04:23,  3.00it/s, total_it=14057]

epochs:  75%|▊| 15/20 [1:20:15<26:09, 313.81s/it, loss=0.473, lr=0.00104, d_time
train:  20%|███▋              | 187/928 [01:03<03:56,  3.13it/s, total_it=14106]

epochs:  75%|▊| 15/20 [1:20:32<26:09, 313.81s/it, loss=0.471, lr=0.00102, d_time
train:  25%|████▌             | 236/928 [01:20<03:55,  2.94it/s, total_it=14155]

epochs:  75%|▊| 15/20 [1:20:48<26:09, 313.81s/it, loss=0.472, lr=0.000997, d_tim
train:  31%|█████▌            | 285/928 [01:36<03:38,  2.94it/s, total_it=14204]

epochs:  75%|▊| 15/20 [1:21:05<26:09, 313.81s/it, loss=0.419, lr=0.000978, d_tim
train:  36%|██████▍           | 334/928 [01:52<03:12,  3.09it/s, total_it=14253]

epochs:  75%|▊| 15/20 [1:21:21<26:09, 313.81s/it, loss=0.51, lr=0.000959, d_time
train:  41%|███████▍          | 383/928 [02:09<02:58,  3.05it/s, total_it=14302]

epochs:  75%|▊| 15/20 [1:21:37<26:09, 313.81s/it, loss=0.53, lr=0.000939, d_time
train:  47%|████████▍         | 432/928 [02:25<02:43,  3.03it/s, total_it=14351]

epochs:  75%|▊| 15/20 [1:21:54<26:09, 313.81s/it, loss=0.661, lr=0.00092, d_time
train:  52%|█████████▎        | 481/928 [02:41<02:32,  2.93it/s, total_it=14400]

epochs:  75%|▊| 15/20 [1:22:10<26:09, 313.81s/it, loss=0.46, lr=0.000901, d_time
train:  57%|██████████▎       | 530/928 [02:58<02:14,  2.97it/s, total_it=14449]

epochs:  75%|▊| 15/20 [1:22:27<26:09, 313.81s/it, loss=0.507, lr=0.000882, d_tim
train:  62%|███████████▏      | 579/928 [03:14<02:07,  2.74it/s, total_it=14498]

epochs:  75%|▊| 15/20 [1:22:44<26:09, 313.81s/it, loss=0.469, lr=0.000863, d_tim
train:  68%|████████████▏     | 628/928 [03:31<01:36,  3.10it/s, total_it=14547]

epochs:  75%|▊| 15/20 [1:23:00<26:09, 313.81s/it, loss=0.457, lr=0.000845, d_tim
train:  73%|█████████████▏    | 677/928 [03:48<01:23,  3.01it/s, total_it=14596]

epochs:  75%|▊| 15/20 [1:23:17<26:09, 313.81s/it, loss=0.484, lr=0.000826, d_tim
train:  78%|██████████████    | 726/928 [04:05<01:08,  2.94it/s, total_it=14645]

epochs:  75%|▊| 15/20 [1:23:33<26:09, 313.81s/it, loss=0.43, lr=0.000808, d_time
train:  84%|███████████████   | 775/928 [04:21<00:51,  2.98it/s, total_it=14694]

epochs:  75%|▊| 15/20 [1:23:50<26:09, 313.81s/it, loss=0.437, lr=0.000789, d_tim
train:  89%|███████████████▉  | 824/928 [04:38<00:34,  2.99it/s, total_it=14743]

epochs:  75%|▊| 15/20 [1:24:06<26:09, 313.81s/it, loss=0.527, lr=0.000771, d_tim
train:  94%|████████████████▉ | 873/928 [04:54<00:18,  3.02it/s, total_it=14792]

epochs:  75%|▊| 15/20 [1:24:22<26:09, 313.81s/it, loss=0.462, lr=0.000753, d_tim
train:  99%|█████████████████▉| 922/928 [05:10<00:01,  3.19it/s, total_it=14841]

epochs:  80%|▊| 16/20 [1:24:40<20:53, 313.49s/it, loss=0.5, lr=0.000735, d_time=
train:   5%|▊                  | 42/928 [00:15<05:11,  2.84it/s, total_it=14889]

epochs:  80%|▊| 16/20 [1:24:56<20:53, 313.49s/it, loss=0.516, lr=0.000718, d_tim
train:  10%|█▊                 | 91/928 [00:31<04:47,  2.91it/s, total_it=14938]

epochs:  80%|▊| 16/20 [1:25:13<20:53, 313.49s/it, loss=0.471, lr=0.0007, d_time=
train:  15%|██▋               | 140/928 [00:47<04:29,  2.92it/s, total_it=14987]

epochs:  80%|▊| 16/20 [1:25:29<20:53, 313.49s/it, loss=0.431, lr=0.000683, d_tim
train:  20%|███▋              | 189/928 [01:04<04:32,  2.72it/s, total_it=15036]

epochs:  80%|▊| 16/20 [1:25:46<20:53, 313.49s/it, loss=0.449, lr=0.000665, d_tim
train:  26%|████▌             | 238/928 [01:21<03:48,  3.03it/s, total_it=15085]

epochs:  80%|▊| 16/20 [1:26:03<20:53, 313.49s/it, loss=0.448, lr=0.000648, d_tim
train:  31%|█████▌            | 287/928 [01:38<03:37,  2.94it/s, total_it=15134]

epochs:  80%|▊| 16/20 [1:26:19<20:53, 313.49s/it, loss=0.39, lr=0.000631, d_time
train:  36%|██████▌           | 336/928 [01:54<03:18,  2.98it/s, total_it=15183]

epochs:  80%|▊| 16/20 [1:26:36<20:53, 313.49s/it, loss=0.433, lr=0.000614, d_tim
train:  41%|███████▍          | 385/928 [02:11<03:00,  3.01it/s, total_it=15232]

epochs:  80%|▊| 16/20 [1:26:53<20:53, 313.49s/it, loss=0.554, lr=0.000598, d_tim[A
train:  47%|████████▍         | 434/928 [02:28<02:53,  2.86it/s, total_it=15281][A
epochs:  80%|▊| 16/20 [1:26:53<20:53, 313.49s/it, loss=0.455, lr=0.000597, d_tim[A
train:  47%|████████▍         | 435/928 [02:28<02:55,  2.81it/s, total_it=15282][A
epochs:  80%|▊| 16/20 [1:26:54<20:53, 313.49s/it, loss=0.467, lr=0.000597, d_tim[A
train:  47%|████████▍         | 436/928 [02:29<02:52,  2.85it/s, total_it=15283][A
epochs:  80%|▊| 16/20 [1:26:54<20:53, 313.49s/it, loss=0.464, lr=0.000597, d_tim[A
train:  47%|████████▍         | 437/928 [02:29<02:52,  2.85it/s, total_it=15284][A
epochs:  80%|▊| 16/20 [1:26:54<20:53, 313.49s/it, loss=0.463, lr=0.000596, d_tim[A
train:  47%|████████▍         | 438/928 [02:29<02:46,  2.94it/s, total_it=15285][A
epochs:  80%|▊| 16/20 [1:26:55<20:53, 313.49s/it, loss=0.457, lr=0.000596, d_tim[A
train:  47%|████████▌         | 439/928 [02:30<02:47,  2.93it/s, total_it=15

epochs:  80%|▊| 16/20 [1:27:09<20:53, 313.49s/it, loss=0.444, lr=0.000581, d_tim[A
train:  52%|█████████▎        | 483/928 [02:44<02:25,  3.05it/s, total_it=15330][A
epochs:  80%|▊| 16/20 [1:27:10<20:53, 313.49s/it, loss=0.419, lr=0.000581, d_tim[A
train:  52%|█████████▍        | 484/928 [02:44<02:29,  2.98it/s, total_it=15331][A
epochs:  80%|▊| 16/20 [1:27:10<20:53, 313.49s/it, loss=0.456, lr=0.000581, d_tim[A
train:  52%|█████████▍        | 485/928 [02:45<02:28,  2.99it/s, total_it=15332][A
epochs:  80%|▊| 16/20 [1:27:10<20:53, 313.49s/it, loss=0.466, lr=0.00058, d_time[A
train:  52%|█████████▍        | 486/928 [02:45<02:34,  2.86it/s, total_it=15333][A
epochs:  80%|▊| 16/20 [1:27:11<20:53, 313.49s/it, loss=0.47, lr=0.00058, d_time=[A
train:  52%|█████████▍        | 487/928 [02:46<02:29,  2.94it/s, total_it=15334][A
epochs:  80%|▊| 16/20 [1:27:11<20:53, 313.49s/it, loss=0.537, lr=0.00058, d_time[A
train:  53%|█████████▍        | 488/928 [02:46<02:31,  2.91it/s, total_it=15

epochs:  80%|▊| 16/20 [1:27:26<20:53, 313.49s/it, loss=0.561, lr=0.000565, d_tim[A
train:  57%|██████████▎       | 532/928 [03:01<02:14,  2.94it/s, total_it=15379][A
epochs:  80%|▊| 16/20 [1:27:26<20:53, 313.49s/it, loss=0.488, lr=0.000565, d_tim[A
train:  57%|██████████▎       | 533/928 [03:01<02:19,  2.84it/s, total_it=15380][A
epochs:  80%|▊| 16/20 [1:27:26<20:53, 313.49s/it, loss=0.44, lr=0.000564, d_time[A
train:  58%|██████████▎       | 534/928 [03:01<02:15,  2.90it/s, total_it=15381][A
epochs:  80%|▊| 16/20 [1:27:27<20:53, 313.49s/it, loss=0.496, lr=0.000564, d_tim[A
train:  58%|██████████▍       | 535/928 [03:02<02:15,  2.90it/s, total_it=15382][A
epochs:  80%|▊| 16/20 [1:27:27<20:53, 313.49s/it, loss=0.465, lr=0.000564, d_tim[A
train:  58%|██████████▍       | 536/928 [03:02<02:11,  2.99it/s, total_it=15383][A
epochs:  80%|▊| 16/20 [1:27:27<20:53, 313.49s/it, loss=0.486, lr=0.000563, d_tim[A
train:  58%|██████████▍       | 537/928 [03:02<02:13,  2.92it/s, total_it=15

epochs:  80%|▊| 16/20 [1:27:42<20:53, 313.49s/it, loss=0.48, lr=0.000549, d_time[A
train:  63%|███████████▎      | 581/928 [03:17<02:06,  2.75it/s, total_it=15428][A
epochs:  80%|▊| 16/20 [1:27:43<20:53, 313.49s/it, loss=0.448, lr=0.000548, d_tim[A
train:  63%|███████████▎      | 582/928 [03:18<02:05,  2.76it/s, total_it=15429][A
epochs:  80%|▊| 16/20 [1:27:43<20:53, 313.49s/it, loss=0.432, lr=0.000548, d_tim[A
train:  63%|███████████▎      | 583/928 [03:18<02:00,  2.86it/s, total_it=15430][A
epochs:  80%|▊| 16/20 [1:27:44<20:53, 313.49s/it, loss=0.438, lr=0.000548, d_tim[A
train:  63%|███████████▎      | 584/928 [03:19<02:06,  2.72it/s, total_it=15431][A
epochs:  80%|▊| 16/20 [1:27:44<20:53, 313.49s/it, loss=0.747, lr=0.000547, d_tim[A
train:  63%|███████████▎      | 585/928 [03:19<02:01,  2.83it/s, total_it=15432][A
epochs:  80%|▊| 16/20 [1:27:44<20:53, 313.49s/it, loss=0.439, lr=0.000547, d_tim[A
train:  63%|███████████▎      | 586/928 [03:19<01:59,  2.87it/s, total_it=15

epochs:  80%|▊| 16/20 [1:27:59<20:53, 313.49s/it, loss=0.448, lr=0.000533, d_tim[A
train:  68%|████████████▏     | 630/928 [03:34<01:38,  3.02it/s, total_it=15477][A
epochs:  80%|▊| 16/20 [1:28:00<20:53, 313.49s/it, loss=0.478, lr=0.000532, d_tim[A
train:  68%|████████████▏     | 631/928 [03:34<01:40,  2.97it/s, total_it=15478][A
epochs:  80%|▊| 16/20 [1:28:00<20:53, 313.49s/it, loss=0.415, lr=0.000532, d_tim[A
train:  68%|████████████▎     | 632/928 [03:35<01:45,  2.82it/s, total_it=15479][A
epochs:  80%|▊| 16/20 [1:28:00<20:53, 313.49s/it, loss=0.503, lr=0.000532, d_tim[A
train:  68%|████████████▎     | 633/928 [03:35<01:42,  2.89it/s, total_it=15480][A
epochs:  80%|▊| 16/20 [1:28:01<20:53, 313.49s/it, loss=0.539, lr=0.000532, d_tim[A
train:  68%|████████████▎     | 634/928 [03:36<01:40,  2.94it/s, total_it=15481][A
epochs:  80%|▊| 16/20 [1:28:01<20:53, 313.49s/it, loss=0.526, lr=0.000531, d_tim[A
train:  68%|████████████▎     | 635/928 [03:36<01:38,  2.98it/s, total_it=15

epochs:  80%|▊| 16/20 [1:28:16<20:53, 313.49s/it, loss=0.554, lr=0.000517, d_tim[A
train:  73%|█████████████▏    | 679/928 [03:51<01:23,  3.00it/s, total_it=15526][A
epochs:  80%|▊| 16/20 [1:28:16<20:53, 313.49s/it, loss=0.51, lr=0.000517, d_time[A
train:  73%|█████████████▏    | 680/928 [03:51<01:21,  3.05it/s, total_it=15527][A
epochs:  80%|▊| 16/20 [1:28:16<20:53, 313.49s/it, loss=0.514, lr=0.000516, d_tim[A
train:  73%|█████████████▏    | 681/928 [03:51<01:19,  3.10it/s, total_it=15528][A
epochs:  80%|▊| 16/20 [1:28:17<20:53, 313.49s/it, loss=0.471, lr=0.000516, d_tim[A
train:  73%|█████████████▏    | 682/928 [03:52<01:19,  3.09it/s, total_it=15529][A
epochs:  80%|▊| 16/20 [1:28:17<20:53, 313.49s/it, loss=0.472, lr=0.000516, d_tim[A
train:  74%|█████████████▏    | 683/928 [03:52<01:26,  2.85it/s, total_it=15530][A
epochs:  80%|▊| 16/20 [1:28:17<20:53, 313.49s/it, loss=0.436, lr=0.000515, d_tim[A
train:  74%|█████████████▎    | 684/928 [03:52<01:28,  2.77it/s, total_it=15

epochs:  80%|▊| 16/20 [1:28:32<20:53, 313.49s/it, loss=0.404, lr=0.000501, d_tim[A
train:  78%|██████████████    | 728/928 [04:07<01:10,  2.83it/s, total_it=15575][A
epochs:  80%|▊| 16/20 [1:28:33<20:53, 313.49s/it, loss=0.594, lr=0.000501, d_tim[A
train:  79%|██████████████▏   | 729/928 [04:08<01:09,  2.86it/s, total_it=15576][A
epochs:  80%|▊| 16/20 [1:28:33<20:53, 313.49s/it, loss=0.48, lr=0.000501, d_time[A
train:  79%|██████████████▏   | 730/928 [04:08<01:09,  2.85it/s, total_it=15577][A
epochs:  80%|▊| 16/20 [1:28:33<20:53, 313.49s/it, loss=0.395, lr=0.000501, d_tim[A
train:  79%|██████████████▏   | 731/928 [04:08<01:06,  2.96it/s, total_it=15578][A
epochs:  80%|▊| 16/20 [1:28:34<20:53, 313.49s/it, loss=0.529, lr=0.0005, d_time=[A
train:  79%|██████████████▏   | 732/928 [04:09<01:06,  2.95it/s, total_it=15579][A
epochs:  80%|▊| 16/20 [1:28:34<20:53, 313.49s/it, loss=0.445, lr=0.0005, d_time=[A
train:  79%|██████████████▏   | 733/928 [04:09<01:07,  2.90it/s, total_it=15

epochs:  80%|▊| 16/20 [1:28:49<20:53, 313.49s/it, loss=0.407, lr=0.000486, d_tim[A
train:  84%|███████████████   | 777/928 [04:24<00:52,  2.90it/s, total_it=15624][A
epochs:  80%|▊| 16/20 [1:28:49<20:53, 313.49s/it, loss=0.412, lr=0.000486, d_tim[A
train:  84%|███████████████   | 778/928 [04:24<00:50,  2.99it/s, total_it=15625][A
epochs:  80%|▊| 16/20 [1:28:49<20:53, 313.49s/it, loss=0.474, lr=0.000485, d_tim[A
train:  84%|███████████████   | 779/928 [04:24<00:48,  3.06it/s, total_it=15626][A
epochs:  80%|▊| 16/20 [1:28:50<20:53, 313.49s/it, loss=0.531, lr=0.000485, d_tim[A
train:  84%|███████████████▏  | 780/928 [04:25<00:49,  3.00it/s, total_it=15627][A
epochs:  80%|▊| 16/20 [1:28:50<20:53, 313.49s/it, loss=0.436, lr=0.000485, d_tim[A
train:  84%|███████████████▏  | 781/928 [04:25<00:48,  3.02it/s, total_it=15628][A
epochs:  80%|▊| 16/20 [1:28:50<20:53, 313.49s/it, loss=0.448, lr=0.000485, d_tim[A
train:  84%|███████████████▏  | 782/928 [04:25<00:48,  3.04it/s, total_it=15

epochs:  80%|▊| 16/20 [1:29:05<20:53, 313.49s/it, loss=0.375, lr=0.000471, d_tim[A
train:  89%|████████████████  | 826/928 [04:40<00:34,  3.00it/s, total_it=15673][A
epochs:  80%|▊| 16/20 [1:29:05<20:53, 313.49s/it, loss=0.465, lr=0.000471, d_tim[A
train:  89%|████████████████  | 827/928 [04:40<00:33,  2.98it/s, total_it=15674][A
epochs:  80%|▊| 16/20 [1:29:06<20:53, 313.49s/it, loss=0.445, lr=0.00047, d_time[A
train:  89%|████████████████  | 828/928 [04:41<00:32,  3.07it/s, total_it=15675][A
epochs:  80%|▊| 16/20 [1:29:06<20:53, 313.49s/it, loss=0.528, lr=0.00047, d_time[A
train:  89%|████████████████  | 829/928 [04:41<00:32,  3.07it/s, total_it=15676][A
epochs:  80%|▊| 16/20 [1:29:06<20:53, 313.49s/it, loss=0.488, lr=0.00047, d_time[A
train:  89%|████████████████  | 830/928 [04:41<00:32,  2.97it/s, total_it=15677][A
epochs:  80%|▊| 16/20 [1:29:07<20:53, 313.49s/it, loss=0.387, lr=0.000469, d_tim[A
train:  90%|████████████████  | 831/928 [04:42<00:34,  2.84it/s, total_it=15

epochs:  80%|▊| 16/20 [1:29:22<20:53, 313.49s/it, loss=0.464, lr=0.000456, d_tim[A
train:  94%|████████████████▉ | 875/928 [04:57<00:17,  2.99it/s, total_it=15722][A
epochs:  80%|▊| 16/20 [1:29:22<20:53, 313.49s/it, loss=0.481, lr=0.000456, d_tim[A
train:  94%|████████████████▉ | 876/928 [04:57<00:18,  2.80it/s, total_it=15723][A
epochs:  80%|▊| 16/20 [1:29:23<20:53, 313.49s/it, loss=0.595, lr=0.000455, d_tim[A
train:  95%|█████████████████ | 877/928 [04:57<00:17,  2.90it/s, total_it=15724][A
epochs:  80%|▊| 16/20 [1:29:23<20:53, 313.49s/it, loss=0.409, lr=0.000455, d_tim[A
train:  95%|█████████████████ | 878/928 [04:58<00:16,  2.97it/s, total_it=15725][A
epochs:  80%|▊| 16/20 [1:29:23<20:53, 313.49s/it, loss=0.494, lr=0.000455, d_tim[A
train:  95%|█████████████████ | 879/928 [04:58<00:17,  2.79it/s, total_it=15726][A
epochs:  80%|▊| 16/20 [1:29:24<20:53, 313.49s/it, loss=0.487, lr=0.000454, d_tim[A
train:  95%|█████████████████ | 880/928 [04:59<00:18,  2.66it/s, total_it=15

epochs:  80%|▊| 16/20 [1:29:38<20:53, 313.49s/it, loss=0.409, lr=0.000441, d_tim[A
train: 100%|█████████████████▉| 924/928 [05:13<00:01,  3.22it/s, total_it=15771][A
epochs:  80%|▊| 16/20 [1:29:39<20:53, 313.49s/it, loss=0.478, lr=0.000441, d_tim[A
train: 100%|█████████████████▉| 925/928 [05:14<00:00,  3.25it/s, total_it=15772][A
epochs:  80%|▊| 16/20 [1:29:39<20:53, 313.49s/it, loss=0.538, lr=0.000441, d_tim[A
train: 100%|█████████████████▉| 926/928 [05:14<00:00,  3.29it/s, total_it=15773][A
epochs:  80%|▊| 16/20 [1:29:39<20:53, 313.49s/it, loss=0.547, lr=0.00044, d_time[A
train: 100%|█████████████████▉| 927/928 [05:14<00:00,  3.32it/s, total_it=15774][A
epochs:  80%|▊| 16/20 [1:29:40<20:53, 313.49s/it, loss=0.397, lr=0.00044, d_time[A
train: 100%|██████████████████| 928/928 [05:15<00:00,  3.36it/s, total_it=15775][A
epochs:  80%|▊| 16/20 [1:29:40<20:53, 313.49s/it, loss=0.509, lr=0.00044, d_time[A
epochs:  85%|▊| 17/20 [1:29:40<15:42, 314.08s/it, loss=0.509, lr=0.00044, d_
...
epochs:  85%|▊| 17/20 [1:31:51<15:42, 314.08s/it, loss=0.413, lr=0.000331, d_tim
train:  42%|███████▌          | 387/928 [02:10<02:57,  3.05it/s, total_it=16162]
...
train: 100%|██████████████████| 928/928 [05:13<00:00,  3.33it/s, total_it=16703]
epochs:  85%|▊| 17/20 [1:34:53<15:42, 314.08s/it, loss=0.41, lr=0.000201, d_time
epochs:  90%|▉| 18/20 [1:34:54<10:27, 313.89s/it, loss=0.41, lr=0.000201, d_time
train:   0%|                                            | 0/928 [00:00<?, ?it/s]
train:   0%|                                    | 1/928 [00:01<17:53,  1.16s/it]
...
epochs:  90%|▉| 18/20 [1:36:00<10:27, 313.89s/it, loss=0.486, lr=0.000162, d_tim
train:  21%|███▊              | 197/928 [01:06<04:08,  2.94it/s, total_it=16900]
epochs:  90%|▉| 18/20 [1:36:00<10:27, 313.89s/it, loss=0.485, lr=0.000162, d_tim[A
train:  21%|███▊              | 198/928 [01:06<04:15,  2.86it/s, total_it=16

epochs:  90%|▉| 18/20 [1:36:15<10:27, 313.89s/it, loss=0.422, lr=0.000153, d_tim[A
train:  26%|████▋             | 242/928 [01:21<03:54,  2.92it/s, total_it=16945][A
epochs:  90%|▉| 18/20 [1:36:16<10:27, 313.89s/it, loss=0.478, lr=0.000153, d_tim[A
train:  26%|████▋             | 243/928 [01:22<04:04,  2.81it/s, total_it=16946][A
epochs:  90%|▉| 18/20 [1:36:16<10:27, 313.89s/it, loss=0.432, lr=0.000153, d_tim[A
train:  26%|████▋             | 244/928 [01:22<03:59,  2.86it/s, total_it=16947][A
epochs:  90%|▉| 18/20 [1:36:17<10:27, 313.89s/it, loss=0.423, lr=0.000153, d_tim[A
train:  26%|████▊             | 245/928 [01:23<03:57,  2.87it/s, total_it=16948][A
epochs:  90%|▉| 18/20 [1:36:17<10:27, 313.89s/it, loss=0.515, lr=0.000152, d_tim[A
train:  27%|████▊             | 246/928 [01:23<03:48,  2.98it/s, total_it=16949][A
epochs:  90%|▉| 18/20 [1:36:17<10:27, 313.89s/it, loss=0.435, lr=0.000152, d_tim[A
train:  27%|████▊             | 247/928 [01:23<03:57,  2.86it/s, total_it=16

epochs:  90%|▉| 18/20 [1:36:32<10:27, 313.89s/it, loss=0.434, lr=0.000144, d_tim[A
train:  31%|█████▋            | 291/928 [01:38<03:38,  2.92it/s, total_it=16994][A
epochs:  90%|▉| 18/20 [1:36:32<10:27, 313.89s/it, loss=0.48, lr=0.000144, d_time[A
train:  31%|█████▋            | 292/928 [01:38<03:37,  2.92it/s, total_it=16995][A
epochs:  90%|▉| 18/20 [1:36:33<10:27, 313.89s/it, loss=0.478, lr=0.000144, d_tim[A
train:  32%|█████▋            | 293/928 [01:39<03:33,  2.97it/s, total_it=16996][A
epochs:  90%|▉| 18/20 [1:36:33<10:27, 313.89s/it, loss=0.42, lr=0.000144, d_time[A
train:  32%|█████▋            | 294/928 [01:39<03:34,  2.96it/s, total_it=16997][A
epochs:  90%|▉| 18/20 [1:36:33<10:27, 313.89s/it, loss=0.46, lr=0.000144, d_time[A
train:  32%|█████▋            | 295/928 [01:39<03:36,  2.92it/s, total_it=16998][A
epochs:  90%|▉| 18/20 [1:36:34<10:27, 313.89s/it, loss=0.511, lr=0.000143, d_tim[A
train:  32%|█████▋            | 296/928 [01:40<03:35,  2.93it/s, total_it=16

epochs:  90%|▉| 18/20 [1:36:48<10:27, 313.89s/it, loss=0.447, lr=0.000135, d_tim[A
train:  37%|██████▌           | 340/928 [01:54<03:24,  2.87it/s, total_it=17043][A
epochs:  90%|▉| 18/20 [1:36:49<10:27, 313.89s/it, loss=0.38, lr=0.000135, d_time[A
train:  37%|██████▌           | 341/928 [01:55<03:17,  2.98it/s, total_it=17044][A
epochs:  90%|▉| 18/20 [1:36:49<10:27, 313.89s/it, loss=0.423, lr=0.000135, d_tim[A
train:  37%|██████▋           | 342/928 [01:55<03:13,  3.03it/s, total_it=17045][A
epochs:  90%|▉| 18/20 [1:36:49<10:27, 313.89s/it, loss=0.481, lr=0.000135, d_tim[A
train:  37%|██████▋           | 343/928 [01:55<03:14,  3.00it/s, total_it=17046][A
epochs:  90%|▉| 18/20 [1:36:50<10:27, 313.89s/it, loss=0.514, lr=0.000135, d_tim[A
train:  37%|██████▋           | 344/928 [01:56<03:13,  3.01it/s, total_it=17047][A
epochs:  90%|▉| 18/20 [1:36:50<10:27, 313.89s/it, loss=0.456, lr=0.000135, d_tim[A
train:  37%|██████▋           | 345/928 [01:56<03:09,  3.07it/s, total_it=17

epochs:  90%|▉| 18/20 [1:37:05<10:27, 313.89s/it, loss=0.433, lr=0.000127, d_tim[A
train:  42%|███████▌          | 389/928 [02:11<02:58,  3.01it/s, total_it=17092][A
epochs:  90%|▉| 18/20 [1:37:05<10:27, 313.89s/it, loss=0.452, lr=0.000127, d_tim[A
train:  42%|███████▌          | 390/928 [02:11<02:58,  3.01it/s, total_it=17093][A
epochs:  90%|▉| 18/20 [1:37:05<10:27, 313.89s/it, loss=0.447, lr=0.000127, d_tim[A
train:  42%|███████▌          | 391/928 [02:11<03:02,  2.94it/s, total_it=17094][A
epochs:  90%|▉| 18/20 [1:37:06<10:27, 313.89s/it, loss=0.438, lr=0.000126, d_tim[A
train:  42%|███████▌          | 392/928 [02:12<02:57,  3.02it/s, total_it=17095][A
epochs:  90%|▉| 18/20 [1:37:06<10:27, 313.89s/it, loss=0.478, lr=0.000126, d_tim[A
train:  42%|███████▌          | 393/928 [02:12<02:54,  3.06it/s, total_it=17096][A
epochs:  90%|▉| 18/20 [1:37:06<10:27, 313.89s/it, loss=0.419, lr=0.000126, d_tim[A
train:  42%|███████▋          | 394/928 [02:12<02:54,  3.06it/s, total_it=17

epochs:  90%|▉| 18/20 [1:37:21<10:27, 313.89s/it, loss=0.435, lr=0.000119, d_tim[A
train:  47%|████████▍         | 438/928 [02:27<02:41,  3.04it/s, total_it=17141][A
epochs:  90%|▉| 18/20 [1:37:21<10:27, 313.89s/it, loss=0.35, lr=0.000119, d_time[A
train:  47%|████████▌         | 439/928 [02:27<02:47,  2.92it/s, total_it=17142][A
epochs:  90%|▉| 18/20 [1:37:22<10:27, 313.89s/it, loss=0.433, lr=0.000118, d_tim[A
train:  47%|████████▌         | 440/928 [02:28<02:44,  2.97it/s, total_it=17143][A
epochs:  90%|▉| 18/20 [1:37:22<10:27, 313.89s/it, loss=0.506, lr=0.000118, d_tim[A
train:  48%|████████▌         | 441/928 [02:28<02:40,  3.03it/s, total_it=17144][A
epochs:  90%|▉| 18/20 [1:37:22<10:27, 313.89s/it, loss=0.448, lr=0.000118, d_tim[A
train:  48%|████████▌         | 442/928 [02:28<02:41,  3.01it/s, total_it=17145][A
epochs:  90%|▉| 18/20 [1:37:23<10:27, 313.89s/it, loss=0.338, lr=0.000118, d_tim[A
train:  48%|████████▌         | 443/928 [02:29<02:35,  3.11it/s, total_it=17

epochs:  90%|▉| 18/20 [1:37:37<10:27, 313.89s/it, loss=0.462, lr=0.000111, d_tim[A
train:  52%|█████████▍        | 487/928 [02:43<02:20,  3.15it/s, total_it=17190][A
epochs:  90%|▉| 18/20 [1:37:38<10:27, 313.89s/it, loss=0.513, lr=0.000111, d_tim[A
train:  53%|█████████▍        | 488/928 [02:43<02:19,  3.16it/s, total_it=17191][A
epochs:  90%|▉| 18/20 [1:37:38<10:27, 313.89s/it, loss=0.43, lr=0.000111, d_time[A
train:  53%|█████████▍        | 489/928 [02:44<02:21,  3.11it/s, total_it=17192][A
epochs:  90%|▉| 18/20 [1:37:38<10:27, 313.89s/it, loss=0.478, lr=0.00011, d_time[A
train:  53%|█████████▌        | 490/928 [02:44<02:22,  3.08it/s, total_it=17193][A
epochs:  90%|▉| 18/20 [1:37:38<10:27, 313.89s/it, loss=0.465, lr=0.00011, d_time[A
train:  53%|█████████▌        | 491/928 [02:44<02:21,  3.08it/s, total_it=17194][A
epochs:  90%|▉| 18/20 [1:37:39<10:27, 313.89s/it, loss=0.475, lr=0.00011, d_time[A
train:  53%|█████████▌        | 492/928 [02:45<02:18,  3.15it/s, total_it=17

epochs:  90%|▉| 18/20 [1:37:54<10:27, 313.89s/it, loss=0.515, lr=0.000103, d_tim[A
train:  58%|██████████▍       | 536/928 [03:00<02:15,  2.89it/s, total_it=17239][A
epochs:  90%|▉| 18/20 [1:37:54<10:27, 313.89s/it, loss=0.492, lr=0.000103, d_tim[A
train:  58%|██████████▍       | 537/928 [03:00<02:15,  2.90it/s, total_it=17240][A
epochs:  90%|▉| 18/20 [1:37:54<10:27, 313.89s/it, loss=0.429, lr=0.000103, d_tim[A
train:  58%|██████████▍       | 538/928 [03:00<02:15,  2.89it/s, total_it=17241][A
epochs:  90%|▉| 18/20 [1:37:55<10:27, 313.89s/it, loss=0.469, lr=0.000103, d_tim[A
train:  58%|██████████▍       | 539/928 [03:01<02:13,  2.91it/s, total_it=17242][A
epochs:  90%|▉| 18/20 [1:37:55<10:27, 313.89s/it, loss=0.452, lr=0.000103, d_tim[A
train:  58%|██████████▍       | 540/928 [03:01<02:14,  2.88it/s, total_it=17243][A
epochs:  90%|▉| 18/20 [1:37:55<10:27, 313.89s/it, loss=0.47, lr=0.000102, d_time[A
train:  58%|██████████▍       | 541/928 [03:01<02:11,  2.95it/s, total_it=17

epochs:  90%|▉| 18/20 [1:38:10<10:27, 313.89s/it, loss=0.452, lr=9.57e-5, d_time[A
train:  63%|███████████▎      | 585/928 [03:16<01:55,  2.98it/s, total_it=17288][A
epochs:  90%|▉| 18/20 [1:38:10<10:27, 313.89s/it, loss=0.481, lr=9.56e-5, d_time[A
train:  63%|███████████▎      | 586/928 [03:16<01:56,  2.93it/s, total_it=17289][A
epochs:  90%|▉| 18/20 [1:38:11<10:27, 313.89s/it, loss=0.503, lr=9.54e-5, d_time[A
train:  63%|███████████▍      | 587/928 [03:17<01:54,  2.97it/s, total_it=17290][A
epochs:  90%|▉| 18/20 [1:38:11<10:27, 313.89s/it, loss=0.499, lr=9.53e-5, d_time[A
train:  63%|███████████▍      | 588/928 [03:17<01:51,  3.06it/s, total_it=17291][A
epochs:  90%|▉| 18/20 [1:38:11<10:27, 313.89s/it, loss=0.405, lr=9.51e-5, d_time[A
train:  63%|███████████▍      | 589/928 [03:17<01:49,  3.10it/s, total_it=17292][A
epochs:  90%|▉| 18/20 [1:38:12<10:27, 313.89s/it, loss=0.731, lr=9.5e-5, d_time=[A
train:  64%|███████████▍      | 590/928 [03:18<01:49,  3.08it/s, total_it=17

epochs:  90%|▉| 18/20 [1:38:26<10:27, 313.89s/it, loss=0.523, lr=8.86e-5, d_time[A
train:  68%|████████████▎     | 634/928 [03:32<01:38,  3.00it/s, total_it=17337][A
epochs:  90%|▉| 18/20 [1:38:26<10:27, 313.89s/it, loss=0.424, lr=8.84e-5, d_time[A
train:  68%|████████████▎     | 635/928 [03:32<01:35,  3.05it/s, total_it=17338][A
epochs:  90%|▉| 18/20 [1:38:27<10:27, 313.89s/it, loss=0.535, lr=8.83e-5, d_time[A
train:  69%|████████████▎     | 636/928 [03:33<01:38,  2.96it/s, total_it=17339][A
epochs:  90%|▉| 18/20 [1:38:27<10:27, 313.89s/it, loss=0.444, lr=8.81e-5, d_time[A
train:  69%|████████████▎     | 637/928 [03:33<01:36,  3.01it/s, total_it=17340][A
epochs:  90%|▉| 18/20 [1:38:27<10:27, 313.89s/it, loss=0.361, lr=8.8e-5, d_time=[A
train:  69%|████████████▍     | 638/928 [03:33<01:34,  3.08it/s, total_it=17341][A
epochs:  90%|▉| 18/20 [1:38:28<10:27, 313.89s/it, loss=0.522, lr=8.79e-5, d_time[A
train:  69%|████████████▍     | 639/928 [03:34<01:38,  2.93it/s, total_it=17

epochs:  90%|▉| 18/20 [1:38:42<10:27, 313.89s/it, loss=0.438, lr=8.17e-5, d_time[A
train:  74%|█████████████▏    | 683/928 [03:49<01:29,  2.75it/s, total_it=17386][A
epochs:  90%|▉| 18/20 [1:38:43<10:27, 313.89s/it, loss=0.469, lr=8.15e-5, d_time[A
train:  74%|█████████████▎    | 684/928 [03:49<01:27,  2.80it/s, total_it=17387][A
epochs:  90%|▉| 18/20 [1:38:43<10:27, 313.89s/it, loss=0.488, lr=8.14e-5, d_time[A
train:  74%|█████████████▎    | 685/928 [03:49<01:24,  2.87it/s, total_it=17388][A
epochs:  90%|▉| 18/20 [1:38:44<10:27, 313.89s/it, loss=0.571, lr=8.13e-5, d_time[A
train:  74%|█████████████▎    | 686/928 [03:50<01:24,  2.86it/s, total_it=17389][A
epochs:  90%|▉| 18/20 [1:38:44<10:27, 313.89s/it, loss=0.424, lr=8.11e-5, d_time[A
train:  74%|█████████████▎    | 687/928 [03:50<01:20,  2.99it/s, total_it=17390][A
epochs:  90%|▉| 18/20 [1:38:44<10:27, 313.89s/it, loss=0.409, lr=8.1e-5, d_time=[A
train:  74%|█████████████▎    | 688/928 [03:50<01:23,  2.88it/s, total_it=17

epochs:  90%|▉| 18/20 [1:38:59<10:27, 313.89s/it, loss=0.512, lr=7.51e-5, d_time[A
train:  79%|██████████████▏   | 732/928 [04:05<01:08,  2.87it/s, total_it=17435][A
epochs:  90%|▉| 18/20 [1:38:59<10:27, 313.89s/it, loss=0.424, lr=7.49e-5, d_time[A
train:  79%|██████████████▏   | 733/928 [04:05<01:07,  2.90it/s, total_it=17436][A
epochs:  90%|▉| 18/20 [1:39:00<10:27, 313.89s/it, loss=0.418, lr=7.48e-5, d_time[A
train:  79%|██████████████▏   | 734/928 [04:06<01:07,  2.89it/s, total_it=17437][A
epochs:  90%|▉| 18/20 [1:39:00<10:27, 313.89s/it, loss=0.404, lr=7.47e-5, d_time[A
train:  79%|██████████████▎   | 735/928 [04:06<01:07,  2.87it/s, total_it=17438][A
epochs:  90%|▉| 18/20 [1:39:00<10:27, 313.89s/it, loss=0.402, lr=7.45e-5, d_time[A
train:  79%|██████████████▎   | 736/928 [04:06<01:07,  2.85it/s, total_it=17439][A
epochs:  90%|▉| 18/20 [1:39:01<10:27, 313.89s/it, loss=0.454, lr=7.44e-5, d_time[A
train:  79%|██████████████▎   | 737/928 [04:07<01:05,  2.92it/s, total_it=17

epochs:  90%|▉| 18/20 [1:39:15<10:27, 313.89s/it, loss=0.487, lr=6.87e-5, d_time[A
train:  84%|███████████████▏  | 781/928 [04:21<00:49,  3.00it/s, total_it=17484][A
epochs:  90%|▉| 18/20 [1:39:16<10:27, 313.89s/it, loss=0.39, lr=6.86e-5, d_time=[A
train:  84%|███████████████▏  | 782/928 [04:22<00:49,  2.93it/s, total_it=17485][A
epochs:  90%|▉| 18/20 [1:39:16<10:27, 313.89s/it, loss=0.424, lr=6.85e-5, d_time[A
train:  84%|███████████████▏  | 783/928 [04:22<00:49,  2.96it/s, total_it=17486][A
epochs:  90%|▉| 18/20 [1:39:16<10:27, 313.89s/it, loss=0.406, lr=6.84e-5, d_time[A
train:  84%|███████████████▏  | 784/928 [04:22<00:48,  2.96it/s, total_it=17487][A
epochs:  90%|▉| 18/20 [1:39:17<10:27, 313.89s/it, loss=0.417, lr=6.82e-5, d_time[A
train:  85%|███████████████▏  | 785/928 [04:23<00:47,  3.01it/s, total_it=17488][A
epochs:  90%|▉| 18/20 [1:39:17<10:27, 313.89s/it, loss=0.407, lr=6.81e-5, d_time[A
train:  85%|███████████████▏  | 786/928 [04:23<00:47,  3.01it/s, total_it=17

epochs:  90%|▉| 18/20 [1:39:32<10:27, 313.89s/it, loss=0.538, lr=6.27e-5, d_time[A
train:  89%|████████████████  | 830/928 [04:38<00:31,  3.11it/s, total_it=17533][A
epochs:  90%|▉| 18/20 [1:39:32<10:27, 313.89s/it, loss=0.319, lr=6.25e-5, d_time[A
train:  90%|████████████████  | 831/928 [04:38<00:31,  3.07it/s, total_it=17534][A
epochs:  90%|▉| 18/20 [1:39:32<10:27, 313.89s/it, loss=0.383, lr=6.24e-5, d_time[A
train:  90%|████████████████▏ | 832/928 [04:38<00:31,  3.09it/s, total_it=17535][A
epochs:  90%|▉| 18/20 [1:39:33<10:27, 313.89s/it, loss=0.417, lr=6.23e-5, d_time[A
train:  90%|████████████████▏ | 833/928 [04:39<00:30,  3.15it/s, total_it=17536][A
epochs:  90%|▉| 18/20 [1:39:33<10:27, 313.89s/it, loss=0.376, lr=6.22e-5, d_time[A
train:  90%|████████████████▏ | 834/928 [04:39<00:31,  2.99it/s, total_it=17537][A
epochs:  90%|▉| 18/20 [1:39:33<10:27, 313.89s/it, loss=0.361, lr=6.21e-5, d_time[A
train:  90%|████████████████▏ | 835/928 [04:40<00:32,  2.83it/s, total_it=17

epochs:  90%|▉| 18/20 [1:39:48<10:27, 313.89s/it, loss=0.411, lr=5.69e-5, d_time[A
train:  95%|█████████████████ | 879/928 [04:54<00:16,  2.92it/s, total_it=17582][A
epochs:  90%|▉| 18/20 [1:39:49<10:27, 313.89s/it, loss=0.467, lr=5.68e-5, d_time[A
train:  95%|█████████████████ | 880/928 [04:55<00:17,  2.79it/s, total_it=17583][A
epochs:  90%|▉| 18/20 [1:39:49<10:27, 313.89s/it, loss=0.564, lr=5.66e-5, d_time[A
train:  95%|█████████████████ | 881/928 [04:55<00:16,  2.92it/s, total_it=17584][A
epochs:  90%|▉| 18/20 [1:39:49<10:27, 313.89s/it, loss=0.558, lr=5.65e-5, d_time[A
train:  95%|█████████████████ | 882/928 [04:55<00:16,  2.80it/s, total_it=17585][A
epochs:  90%|▉| 18/20 [1:39:50<10:27, 313.89s/it, loss=0.448, lr=5.64e-5, d_time[A
train:  95%|█████████████████▏| 883/928 [04:56<00:15,  2.94it/s, total_it=17586][A
epochs:  90%|▉| 18/20 [1:39:50<10:27, 313.89s/it, loss=0.474, lr=5.63e-5, d_time[A
train:  95%|█████████████████▏| 884/928 [04:56<00:15,  2.86it/s, total_it=17

epochs:  90%|▉| 18/20 [1:40:04<10:27, 313.89s/it, loss=0.546, lr=5.14e-5, d_time[A
train: 100%|██████████████████| 928/928 [05:10<00:00,  3.27it/s, total_it=17631][A
epochs:  90%|▉| 18/20 [1:40:05<10:27, 313.89s/it, loss=0.394, lr=5.13e-5, d_time[A
epochs:  95%|▉| 19/20 [1:40:05<05:13, 313.14s/it, loss=0.394, lr=5.13e-5, d_time[A
train:   0%|                                            | 0/928 [00:00<?, ?it/s][A
train:   0%|                                    | 1/928 [00:01<16:01,  1.04s/it][A
epochs:  95%|▉| 19/20 [1:40:06<05:13, 313.14s/it, loss=0.42, lr=5.11e-5, d_time=[A
train:   0%|                    | 2/928 [00:01<09:44,  1.59it/s, total_it=17633][A
epochs:  95%|▉| 19/20 [1:40:07<05:13, 313.14s/it, loss=0.422, lr=5.1e-5, d_time=[A
train:   0%|                    | 3/928 [00:01<07:36,  2.02it/s, total_it=17634][A
epochs:  95%|▉| 19/20 [1:40:07<05:13, 313.14s/it, loss=0.444, lr=5.09e-5, d_time[A
train:   0%|                    | 4/928 [00:02<07:11,  2.14it/s, total_it=17

epochs:  95%|▉| 19/20 [1:40:22<05:13, 313.14s/it, loss=0.469, lr=4.62e-5, d_time[A
train:   5%|▉                  | 48/928 [00:17<04:47,  3.06it/s, total_it=17679][A
epochs:  95%|▉| 19/20 [1:40:22<05:13, 313.14s/it, loss=0.411, lr=4.61e-5, d_time[A
train:   5%|█                  | 49/928 [00:17<04:57,  2.96it/s, total_it=17680][A
epochs:  95%|▉| 19/20 [1:40:23<05:13, 313.14s/it, loss=0.481, lr=4.6e-5, d_time=[A
train:   5%|█                  | 50/928 [00:17<04:56,  2.96it/s, total_it=17681][A
epochs:  95%|▉| 19/20 [1:40:23<05:13, 313.14s/it, loss=0.483, lr=4.59e-5, d_time[A
train:   5%|█                  | 51/928 [00:18<05:08,  2.85it/s, total_it=17682][A
epochs:  95%|▉| 19/20 [1:40:23<05:13, 313.14s/it, loss=0.49, lr=4.58e-5, d_time=[A
train:   6%|█                  | 52/928 [00:18<04:56,  2.96it/s, total_it=17683][A
epochs:  95%|▉| 19/20 [1:40:24<05:13, 313.14s/it, loss=0.434, lr=4.57e-5, d_time[A
train:   6%|█                  | 53/928 [00:18<05:21,  2.72it/s, total_it=17

epochs:  95%|▉| 19/20 [1:40:38<05:13, 313.14s/it, loss=0.395, lr=4.13e-5, d_time[A
train:  10%|█▉                 | 97/928 [00:33<04:27,  3.11it/s, total_it=17728][A
epochs:  95%|▉| 19/20 [1:40:39<05:13, 313.14s/it, loss=0.381, lr=4.12e-5, d_time[A
train:  11%|██                 | 98/928 [00:33<04:24,  3.14it/s, total_it=17729][A
epochs:  95%|▉| 19/20 [1:40:39<05:13, 313.14s/it, loss=0.399, lr=4.11e-5, d_time[A
train:  11%|██                 | 99/928 [00:33<04:31,  3.06it/s, total_it=17730][A
epochs:  95%|▉| 19/20 [1:40:39<05:13, 313.14s/it, loss=0.398, lr=4.1e-5, d_time=[A
train:  11%|█▉                | 100/928 [00:34<04:30,  3.06it/s, total_it=17731][A
epochs:  95%|▉| 19/20 [1:40:39<05:13, 313.14s/it, loss=0.538, lr=4.09e-5, d_time[A
train:  11%|█▉                | 101/928 [00:34<04:33,  3.03it/s, total_it=17732][A
epochs:  95%|▉| 19/20 [1:40:40<05:13, 313.14s/it, loss=0.481, lr=4.08e-5, d_time[A
train:  11%|█▉                | 102/928 [00:34<04:24,  3.12it/s, total_it=17

epochs:  95%|▉| 19/20 [1:40:55<05:13, 313.14s/it, loss=0.453, lr=3.66e-5, d_time[A
train:  16%|██▊               | 146/928 [00:49<04:27,  2.92it/s, total_it=17777][A
epochs:  95%|▉| 19/20 [1:40:55<05:13, 313.14s/it, loss=0.462, lr=3.65e-5, d_time[A
train:  16%|██▊               | 147/928 [00:50<04:24,  2.95it/s, total_it=17778][A
epochs:  95%|▉| 19/20 [1:40:55<05:13, 313.14s/it, loss=0.431, lr=3.64e-5, d_time[A
train:  16%|██▊               | 148/928 [00:50<04:20,  2.99it/s, total_it=17779][A
epochs:  95%|▉| 19/20 [1:40:56<05:13, 313.14s/it, loss=0.396, lr=3.63e-5, d_time[A
train:  16%|██▉               | 149/928 [00:50<04:27,  2.91it/s, total_it=17780][A
epochs:  95%|▉| 19/20 [1:40:56<05:13, 313.14s/it, loss=0.378, lr=3.62e-5, d_time[A
train:  16%|██▉               | 150/928 [00:51<04:19,  2.99it/s, total_it=17781][A
epochs:  95%|▉| 19/20 [1:40:56<05:13, 313.14s/it, loss=0.405, lr=3.61e-5, d_time[A
train:  16%|██▉               | 151/928 [00:51<04:12,  3.07it/s, total_it=17

epochs:  95%|▉| 19/20 [1:41:11<05:13, 313.14s/it, loss=0.413, lr=3.22e-5, d_time[A
train:  21%|███▊              | 195/928 [01:05<03:55,  3.11it/s, total_it=17826][A
epochs:  95%|▉| 19/20 [1:41:11<05:13, 313.14s/it, loss=0.597, lr=3.21e-5, d_time[A
train:  21%|███▊              | 196/928 [01:06<04:01,  3.03it/s, total_it=17827][A
epochs:  95%|▉| 19/20 [1:41:11<05:13, 313.14s/it, loss=0.419, lr=3.2e-5, d_time=[A
train:  21%|███▊              | 197/928 [01:06<04:01,  3.03it/s, total_it=17828][A
epochs:  95%|▉| 19/20 [1:41:12<05:13, 313.14s/it, loss=0.409, lr=3.19e-5, d_time[A
train:  21%|███▊              | 198/928 [01:06<04:01,  3.02it/s, total_it=17829][A
epochs:  95%|▉| 19/20 [1:41:12<05:13, 313.14s/it, loss=0.46, lr=3.18e-5, d_time=[A
train:  21%|███▊              | 199/928 [01:07<04:10,  2.91it/s, total_it=17830][A
epochs:  95%|▉| 19/20 [1:41:12<05:13, 313.14s/it, loss=0.464, lr=3.17e-5, d_time[A
train:  22%|███▉              | 200/928 [01:07<04:24,  2.75it/s, total_it=17

epochs:  95%|▉| 19/20 [1:41:27<05:13, 313.14s/it, loss=0.441, lr=2.8e-5, d_time=[A
train:  26%|████▋             | 244/928 [01:22<03:44,  3.05it/s, total_it=17875][A
epochs:  95%|▉| 19/20 [1:41:28<05:13, 313.14s/it, loss=0.467, lr=2.8e-5, d_time=[A
train:  26%|████▊             | 245/928 [01:22<03:48,  2.99it/s, total_it=17876][A
epochs:  95%|▉| 19/20 [1:41:28<05:13, 313.14s/it, loss=0.572, lr=2.79e-5, d_time[A
train:  27%|████▊             | 246/928 [01:23<03:45,  3.02it/s, total_it=17877][A
epochs:  95%|▉| 19/20 [1:41:28<05:13, 313.14s/it, loss=0.455, lr=2.78e-5, d_time[A
train:  27%|████▊             | 247/928 [01:23<03:48,  2.98it/s, total_it=17878][A
epochs:  95%|▉| 19/20 [1:41:29<05:13, 313.14s/it, loss=0.439, lr=2.77e-5, d_time[A
train:  27%|████▊             | 248/928 [01:23<03:41,  3.06it/s, total_it=17879][A
epochs:  95%|▉| 19/20 [1:41:29<05:13, 313.14s/it, loss=0.408, lr=2.76e-5, d_time[A
train:  27%|████▊             | 249/928 [01:24<03:43,  3.04it/s, total_it=17

epochs:  95%|▉| 19/20 [1:41:44<05:13, 313.14s/it, loss=0.472, lr=2.42e-5, d_time[A
train:  32%|█████▋            | 293/928 [01:39<03:55,  2.70it/s, total_it=17924][A
epochs:  95%|▉| 19/20 [1:41:44<05:13, 313.14s/it, loss=0.372, lr=2.41e-5, d_time[A
train:  32%|█████▋            | 294/928 [01:39<03:53,  2.71it/s, total_it=17925][A
epochs:  95%|▉| 19/20 [1:41:45<05:13, 313.14s/it, loss=0.435, lr=2.4e-5, d_time=[A
train:  32%|█████▋            | 295/928 [01:39<04:00,  2.63it/s, total_it=17926][A
epochs:  95%|▉| 19/20 [1:41:45<05:13, 313.14s/it, loss=0.406, lr=2.4e-5, d_time=[A
train:  32%|█████▋            | 296/928 [01:40<03:52,  2.71it/s, total_it=17927][A
epochs:  95%|▉| 19/20 [1:41:45<05:13, 313.14s/it, loss=0.441, lr=2.39e-5, d_time[A
train:  32%|█████▊            | 297/928 [01:40<03:52,  2.71it/s, total_it=17928][A
epochs:  95%|▉| 19/20 [1:41:46<05:13, 313.14s/it, loss=0.434, lr=2.38e-5, d_time[A
train:  32%|█████▊            | 298/928 [01:40<03:45,  2.80it/s, total_it=17

epochs:  95%|▉| 19/20 [1:42:01<05:13, 313.14s/it, loss=0.535, lr=2.06e-5, d_time[A
train:  37%|██████▋           | 342/928 [01:55<03:17,  2.96it/s, total_it=17973][A
epochs:  95%|▉| 19/20 [1:42:01<05:13, 313.14s/it, loss=0.366, lr=2.06e-5, d_time[A
train:  37%|██████▋           | 343/928 [01:56<03:16,  2.97it/s, total_it=17974][A
epochs:  95%|▉| 19/20 [1:42:02<05:13, 313.14s/it, loss=0.387, lr=2.05e-5, d_time[A
train:  37%|██████▋           | 344/928 [01:56<03:15,  2.99it/s, total_it=17975][A
epochs:  95%|▉| 19/20 [1:42:02<05:13, 313.14s/it, loss=0.523, lr=2.04e-5, d_time[A
train:  37%|██████▋           | 345/928 [01:56<03:17,  2.96it/s, total_it=17976][A
epochs:  95%|▉| 19/20 [1:42:02<05:13, 313.14s/it, loss=0.474, lr=2.03e-5, d_time[A
train:  37%|██████▋           | 346/928 [01:57<03:12,  3.03it/s, total_it=17977][A
epochs:  95%|▉| 19/20 [1:42:03<05:13, 313.14s/it, loss=0.464, lr=2.03e-5, d_time[A
train:  37%|██████▋           | 347/928 [01:57<03:10,  3.05it/s, total_it=17

epochs:  95%|▉| 19/20 [1:42:17<05:13, 313.14s/it, loss=0.465, lr=1.73e-5, d_time[A
train:  42%|███████▌          | 391/928 [02:12<02:57,  3.02it/s, total_it=18022][A
epochs:  95%|▉| 19/20 [1:42:18<05:13, 313.14s/it, loss=0.439, lr=1.73e-5, d_time[A
train:  42%|███████▌          | 392/928 [02:12<02:58,  3.01it/s, total_it=18023][A
epochs:  95%|▉| 19/20 [1:42:18<05:13, 313.14s/it, loss=0.457, lr=1.72e-5, d_time[A
train:  42%|███████▌          | 393/928 [02:13<02:54,  3.07it/s, total_it=18024][A
epochs:  95%|▉| 19/20 [1:42:18<05:13, 313.14s/it, loss=0.44, lr=1.71e-5, d_time=[A
train:  42%|███████▋          | 394/928 [02:13<03:01,  2.95it/s, total_it=18025][A
epochs:  95%|▉| 19/20 [1:42:19<05:13, 313.14s/it, loss=0.432, lr=1.71e-5, d_time[A
train:  43%|███████▋          | 395/928 [02:13<02:55,  3.04it/s, total_it=18026][A
epochs:  95%|▉| 19/20 [1:42:19<05:13, 313.14s/it, loss=0.465, lr=1.7e-5, d_time=[A
train:  43%|███████▋          | 396/928 [02:14<02:56,  3.01it/s, total_it=18

epochs:  95%|▉| 19/20 [1:42:34<05:13, 313.14s/it, loss=0.367, lr=1.43e-5, d_time[A
train:  47%|████████▌         | 440/928 [02:28<02:40,  3.04it/s, total_it=18071][A
epochs:  95%|▉| 19/20 [1:42:34<05:13, 313.14s/it, loss=0.487, lr=1.43e-5, d_time[A
train:  48%|████████▌         | 441/928 [02:29<02:43,  2.98it/s, total_it=18072][A
epochs:  95%|▉| 19/20 [1:42:34<05:13, 313.14s/it, loss=0.363, lr=1.42e-5, d_time[A
train:  48%|████████▌         | 442/928 [02:29<02:44,  2.95it/s, total_it=18073][A
epochs:  95%|▉| 19/20 [1:42:35<05:13, 313.14s/it, loss=0.47, lr=1.42e-5, d_time=[A
train:  48%|████████▌         | 443/928 [02:29<02:40,  3.03it/s, total_it=18074][A
epochs:  95%|▉| 19/20 [1:42:35<05:13, 313.14s/it, loss=0.394, lr=1.41e-5, d_time[A
train:  48%|████████▌         | 444/928 [02:30<02:52,  2.80it/s, total_it=18075][A
epochs:  95%|▉| 19/20 [1:42:35<05:13, 313.14s/it, loss=0.375, lr=1.4e-5, d_time=[A
train:  48%|████████▋         | 445/928 [02:30<02:49,  2.85it/s, total_it=18

epochs:  95%|▉| 19/20 [1:42:50<05:13, 313.14s/it, loss=0.43, lr=1.16e-5, d_time=[A
train:  53%|█████████▍        | 489/928 [02:45<02:27,  2.99it/s, total_it=18120][A
epochs:  95%|▉| 19/20 [1:42:51<05:13, 313.14s/it, loss=0.392, lr=1.16e-5, d_time[A
train:  53%|█████████▌        | 490/928 [02:45<02:27,  2.98it/s, total_it=18121][A
epochs:  95%|▉| 19/20 [1:42:51<05:13, 313.14s/it, loss=0.362, lr=1.15e-5, d_time[A
train:  53%|█████████▌        | 491/928 [02:46<02:24,  3.03it/s, total_it=18122][A
epochs:  95%|▉| 19/20 [1:42:51<05:13, 313.14s/it, loss=0.394, lr=1.15e-5, d_time[A
train:  53%|█████████▌        | 492/928 [02:46<02:32,  2.86it/s, total_it=18123][A
epochs:  95%|▉| 19/20 [1:42:52<05:13, 313.14s/it, loss=0.445, lr=1.14e-5, d_time[A
train:  53%|█████████▌        | 493/928 [02:46<02:33,  2.84it/s, total_it=18124][A
epochs:  95%|▉| 19/20 [1:42:52<05:13, 313.14s/it, loss=0.419, lr=1.14e-5, d_time[A
train:  53%|█████████▌        | 494/928 [02:47<02:29,  2.91it/s, total_it=18

epochs:  95%|▉| 19/20 [1:43:07<05:13, 313.14s/it, loss=0.451, lr=9.19e-6, d_time[A
train:  58%|██████████▍       | 538/928 [03:02<02:08,  3.02it/s, total_it=18169][A
epochs:  95%|▉| 19/20 [1:43:07<05:13, 313.14s/it, loss=0.523, lr=9.15e-6, d_time[A
train:  58%|██████████▍       | 539/928 [03:02<02:07,  3.05it/s, total_it=18170][A
epochs:  95%|▉| 19/20 [1:43:08<05:13, 313.14s/it, loss=0.417, lr=9.1e-6, d_time=[A
train:  58%|██████████▍       | 540/928 [03:02<02:02,  3.17it/s, total_it=18171][A
epochs:  95%|▉| 19/20 [1:43:08<05:13, 313.14s/it, loss=0.551, lr=9.05e-6, d_time[A
train:  58%|██████████▍       | 541/928 [03:03<02:00,  3.20it/s, total_it=18172][A
epochs:  95%|▉| 19/20 [1:43:08<05:13, 313.14s/it, loss=0.516, lr=9.01e-6, d_time[A
train:  58%|██████████▌       | 542/928 [03:03<02:03,  3.14it/s, total_it=18173][A
epochs:  95%|▉| 19/20 [1:43:09<05:13, 313.14s/it, loss=0.389, lr=8.96e-6, d_time[A
train:  59%|██████████▌       | 543/928 [03:03<02:04,  3.09it/s, total_it=18

epochs:  95%|▉| 19/20 [1:43:23<05:13, 313.14s/it, loss=0.41, lr=7.05e-6, d_time=[A
train:  63%|███████████▍      | 587/928 [03:18<01:46,  3.20it/s, total_it=18218][A
epochs:  95%|▉| 19/20 [1:43:23<05:13, 313.14s/it, loss=0.355, lr=7.01e-6, d_time[A
train:  63%|███████████▍      | 588/928 [03:18<01:45,  3.22it/s, total_it=18219][A
epochs:  95%|▉| 19/20 [1:43:24<05:13, 313.14s/it, loss=0.693, lr=6.97e-6, d_time[A
train:  63%|███████████▍      | 589/928 [03:18<01:47,  3.15it/s, total_it=18220][A
epochs:  95%|▉| 19/20 [1:43:24<05:13, 313.14s/it, loss=0.418, lr=6.92e-6, d_time[A
train:  64%|███████████▍      | 590/928 [03:19<01:46,  3.18it/s, total_it=18221][A
epochs:  95%|▉| 19/20 [1:43:24<05:13, 313.14s/it, loss=0.389, lr=6.88e-6, d_time[A
train:  64%|███████████▍      | 591/928 [03:19<01:48,  3.10it/s, total_it=18222][A
epochs:  95%|▉| 19/20 [1:43:25<05:13, 313.14s/it, loss=0.393, lr=6.84e-6, d_time[A
train:  64%|███████████▍      | 592/928 [03:19<01:47,  3.12it/s, total_it=18

epochs:  95%|▉| 19/20 [1:43:39<05:13, 313.14s/it, loss=0.401, lr=5.19e-6, d_time[A
train:  69%|████████████▎     | 636/928 [03:34<01:35,  3.05it/s, total_it=18267][A
epochs:  95%|▉| 19/20 [1:43:40<05:13, 313.14s/it, loss=0.392, lr=5.15e-6, d_time[A
train:  69%|████████████▎     | 637/928 [03:34<01:41,  2.87it/s, total_it=18268][A
epochs:  95%|▉| 19/20 [1:43:40<05:13, 313.14s/it, loss=0.446, lr=5.12e-6, d_time[A
train:  69%|████████████▍     | 638/928 [03:35<01:39,  2.91it/s, total_it=18269][A
epochs:  95%|▉| 19/20 [1:43:40<05:13, 313.14s/it, loss=0.4, lr=5.08e-6, d_time=0[A
train:  69%|████████████▍     | 639/928 [03:35<01:38,  2.94it/s, total_it=18270][A
epochs:  95%|▉| 19/20 [1:43:41<05:13, 313.14s/it, loss=0.553, lr=5.05e-6, d_time[A
train:  69%|████████████▍     | 640/928 [03:35<01:39,  2.89it/s, total_it=18271][A
epochs: 100%|█| 20/20 [1:45:19<00:00, 315.95s/it, loss=0.68, lr=3.01e-8, d_time=
2022-01-04 18:43:56,293   INFO  **********************End training cfgs/kitti_models/pointpillar(experiment1)**********************



2022-01-04 18:43:56,294   INFO  **********************Start evaluation cfgs/kitti_models/pointpillar(experiment1)**********************
2022-01-04 18:43:56,299   INFO  Loading KITTI dataset
2022-01-04 18:43:56,523   INFO  Total samples for KITTI dataset: 3769
2022-01-04 18:43:56,548   INFO  ==> Loading parameters from checkpoint /userhome/35/tqwang/OpenPCDet/output/pointpillar/experiment1/ckpt/checkpoint_epoch_20.pth to GPU
2022-01-04 18:43:56,656   INFO  ==> Checkpoint trained from version: pcdet+0.5.1+34c75bb
2022-01-04 18:43:56,662   INFO  ==> Done (loaded 127/127)
2022-01-04 18:43:56,666   INFO  *************** EPOCH 20 EVALUATION *****************
eval: 100%|████| 943/943 [02:06<00:00,  7.43it/s, recall_0.3=(0, 16463) / 17558]
2022-01-04 18:46:03,600   INFO  **********

### Test model with given config and checkpoint file.
#### Note that checkpoint files are stored under the OpenPCDet/output/cfgs/.... folder by default.

In [None]:
%cd ~/OpenPCDet/tools/
!python test.py --cfg_file ./cfgs/kitti_models/pointpillar.yaml --ckpt ../output/pointpillar/experiment1/ckpt/checkpoint_epoch_20.pth

/userhome/35/tqwang/OpenPCDet/tools
2022-01-04 20:49:11,487   INFO  **********************Start logging**********************
2022-01-04 20:49:11,487   INFO  CUDA_VISIBLE_DEVICES=0
2022-01-04 20:49:11,487   INFO  cfg_file         ./cfgs/kitti_models/pointpillar.yaml
2022-01-04 20:49:11,487   INFO  batch_size       4
2022-01-04 20:49:11,487   INFO  workers          4
2022-01-04 20:49:11,487   INFO  extra_tag        default
2022-01-04 20:49:11,487   INFO  ckpt             ../output/pointpillar/experiment1/ckpt/checkpoint_epoch_20.pth
2022-01-04 20:49:11,487   INFO  launcher         none
2022-01-04 20:49:11,487   INFO  tcp_port         18888
2022-01-04 20:49:11,487   INFO  local_rank       0
2022-01-04 20:49:11,487   INFO  set_cfgs         None
2022-01-04 20:49:11,488   INFO  max_waiting_mins 30
2022-01-04 20:49:11,488   INFO  start_epoch      0
2022-01-04 20:49:11,488   INFO  eval_tag         default
2022-01-04 20:49:11,488   INFO  eval_all         False
2022-01-04 20:49:11,488   INFO  c

2022-01-04 20:49:11,705   INFO  Total samples for KITTI dataset: 3769
2022-01-04 20:49:14,780   INFO  ==> Loading parameters from checkpoint ../output/pointpillar/experiment1/ckpt/checkpoint_epoch_20.pth to GPU
2022-01-04 20:49:14,915   INFO  ==> Checkpoint trained from version: pcdet+0.5.1+34c75bb
2022-01-04 20:49:14,930   INFO  ==> Done (loaded 127/127)
2022-01-04 20:49:14,944   INFO  *************** EPOCH 20 EVALUATION *****************
	nonzero()
Consider using one of the following signatures instead:
	nonzero(*, bool as_tuple) (Triggered internally at  /opt/conda/conda-bld/pytorch_1603729138878/work/torch/csrc/utils/python_arg_parser.cpp:882.)
  original_idxs = scores_mask.nonzero().view(-1)
eval: 100%|████| 943/943 [01:47<00:00,  8.78it/s, recall_0.3=(0, 16463) / 17558]
2022-01-04 20:51:02,354   INFO  *************** Performance of EPOCH 20 *****************
2022-01-04 20:51:02,355   INFO  Generate label finished(sec_per_example: 0.0285 second).
2022-01-04 20:51:02,355   INFO  re

## 5. Relationship between the config file and the model creation
### Let's have a look at PointPillar's config file (OpenPCDet/tools/cfgs/kitti_models/pointpillar.yaml) and focus on the config for model creation
```yaml
MODEL:
    NAME: PointPillar

    VFE:
        NAME: PillarVFE
        WITH_DISTANCE: False
        USE_ABSLOTE_XYZ: True
        USE_NORM: True
        NUM_FILTERS: [64]

    MAP_TO_BEV:
        NAME: PointPillarScatter
        NUM_BEV_FEATURES: 64

    BACKBONE_2D:
        NAME: BaseBEVBackbone
        LAYER_NUMS: [3, 5, 5]
        LAYER_STRIDES: [2, 2, 2]
        NUM_FILTERS: [64, 128, 256]
        UPSAMPLE_STRIDES: [1, 2, 4]
        NUM_UPSAMPLE_FILTERS: [128, 128, 128]

    DENSE_HEAD:
        NAME: AnchorHeadSingle
        CLASS_AGNOSTIC: False

        USE_DIRECTION_CLASSIFIER: True
        DIR_OFFSET: 0.78539
        DIR_LIMIT_OFFSET: 0.0
        NUM_DIR_BINS: 2

        ANCHOR_GENERATOR_CONFIG: [
            {
                'class_name': 'Car',
                'anchor_sizes': [[3.9, 1.6, 1.56]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-1.78],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.6,
                'unmatched_threshold': 0.45
            },
            {
                'class_name': 'Pedestrian',
                'anchor_sizes': [[0.8, 0.6, 1.73]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-0.6],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            },
            {
                'class_name': 'Cyclist',
                'anchor_sizes': [[1.76, 0.6, 1.73]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-0.6],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            }
        ]

        TARGET_ASSIGNER_CONFIG:
            NAME: AxisAlignedTargetAssigner
            POS_FRACTION: -1.0
            SAMPLE_SIZE: 512
            NORM_BY_NUM_EXAMPLES: False
            MATCH_HEIGHT: False
            BOX_CODER: ResidualCoder

        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'dir_weight': 0.2,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }

    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        SCORE_THRESH: 0.1
        OUTPUT_RAW_SCORE: False

        EVAL_METRIC: kitti

        NMS_CONFIG:
            MULTI_CLASSES_NMS: False
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.01
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500
```

#### The model is defined by the configuration parameters of all its components, including the network architecture itself, such as the VFE (Voxel Feature Extractor), and task-specific components, such as the target assigner and post-processing for object detection.
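
As a small worked example of how these values translate into network dimensions, the sketch below recomputes two quantities from the config above: the channel count of the concatenated BEV feature map and the number of anchors per BEV location. It assumes that the 2D backbone concatenates its upsampled branches and that each class contributes one anchor per size/rotation/bottom-height combination. The dictionary `model_cfg` and the helpers `bev_channels` and `anchors_per_location` are hypothetical, hand-written stand-ins, not OpenPCDet objects.

```python
# Hand-copied subset of the MODEL config above (hypothetical stand-in for the parsed YAML).
model_cfg = {
    "BACKBONE_2D": {"NUM_UPSAMPLE_FILTERS": [128, 128, 128]},
    "DENSE_HEAD": {
        "ANCHOR_GENERATOR_CONFIG": [
            {"class_name": "Car", "anchor_sizes": [[3.9, 1.6, 1.56]],
             "anchor_rotations": [0, 1.57], "anchor_bottom_heights": [-1.78]},
            {"class_name": "Pedestrian", "anchor_sizes": [[0.8, 0.6, 1.73]],
             "anchor_rotations": [0, 1.57], "anchor_bottom_heights": [-0.6]},
            {"class_name": "Cyclist", "anchor_sizes": [[1.76, 0.6, 1.73]],
             "anchor_rotations": [0, 1.57], "anchor_bottom_heights": [-0.6]},
        ],
    },
}


def bev_channels(cfg):
    """Channels after concatenating all upsampled branches (assumed behaviour)."""
    return sum(cfg["BACKBONE_2D"]["NUM_UPSAMPLE_FILTERS"])


def anchors_per_location(cfg):
    """One anchor per (size, rotation, bottom height) combination, summed over classes."""
    total = 0
    for anchor in cfg["DENSE_HEAD"]["ANCHOR_GENERATOR_CONFIG"]:
        total += (len(anchor["anchor_sizes"])
                  * len(anchor["anchor_rotations"])
                  * len(anchor["anchor_bottom_heights"]))
    return total


print(bev_channels(model_cfg))          # 384
print(anchors_per_location(model_cfg))  # 6
```

Changing `NUM_UPSAMPLE_FILTERS` or adding a rotation to a class would change these numbers, which is why the dense head has to be built after the backbone.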

#### Next, we will show how those model configs actually build the model.

#### Starting from OpenPCDet/tools/train.py, we can see this line 
```python
model = build_network(model_cfg=cfg.MODEL, num_class=len(cfg.CLASS_NAMES), dataset=train_set)
```

#### Then, we jump to OpenPCDet/pcdet/models/\_\_init__.py where build_network is defined
```python
def build_network(model_cfg, num_class, dataset):
    model = build_detector(
        model_cfg=model_cfg, num_class=num_class, dataset=dataset
    )
    return model
```

#### Then, we jump to OpenPCDet/pcdet/models/detectors/\_\_init__.py where build_detector is defined
```python
from .detector3d_template import Detector3DTemplate
from .PartA2_net import PartA2Net
from .point_rcnn import PointRCNN
from .pointpillar import PointPillar
from .pv_rcnn import PVRCNN
from .second_net import SECONDNet
from .second_net_iou import SECONDNetIoU
from .caddn import CaDDN
from .voxel_rcnn import VoxelRCNN

__all__ = {
    'Detector3DTemplate': Detector3DTemplate,
    'SECONDNet': SECONDNet,
    'PartA2Net': PartA2Net,
    'PVRCNN': PVRCNN,
    'PointPillar': PointPillar,
    'PointRCNN': PointRCNN,
    'SECONDNetIoU': SECONDNetIoU,
    'CaDDN': CaDDN,
    'VoxelRCNN': VoxelRCNN,
}


def build_detector(model_cfg, num_class, dataset):
    model = __all__[model_cfg.NAME](
        model_cfg=model_cfg, num_class=num_class, dataset=dataset
    )

    return model
```

#### Take PointPillar as an example. Since we have specified the model name in the config file as follows:
```yaml
MODEL:
    NAME: PointPillar
```
#### The model-building line in build_detector effectively becomes:
```python
    model = PointPillar(model_cfg=model_cfg, num_class=num_class, dataset=dataset)
```
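
The name-to-class lookup can be tried out in isolation. The following is a minimal sketch of the same registry pattern with hypothetical dummy classes (`DummyPointPillar`, `DummySECONDNet`) and a plain dict in place of the config object, so it uses `model_cfg["NAME"]` where OpenPCDet uses `model_cfg.NAME`. It is an illustration, not the library's code.

```python
class DummyPointPillar:
    """Hypothetical stand-in for a detector class; it only stores its arguments."""

    def __init__(self, model_cfg, num_class, dataset):
        self.model_cfg = model_cfg
        self.num_class = num_class
        self.dataset = dataset


class DummySECONDNet(DummyPointPillar):
    """Second hypothetical detector so the lookup has more than one choice."""


# Name-to-class registry, playing the role of __all__ in detectors/__init__.py.
REGISTRY = {
    "PointPillar": DummyPointPillar,
    "SECONDNet": DummySECONDNet,
}


def build_detector(model_cfg, num_class, dataset):
    # The NAME string from the config selects the class; calling it builds the model.
    name = model_cfg["NAME"]
    if name not in REGISTRY:
        raise KeyError(f"Unknown detector '{name}'; known: {sorted(REGISTRY)}")
    return REGISTRY[name](model_cfg=model_cfg, num_class=num_class, dataset=dataset)


model = build_detector({"NAME": "PointPillar"}, num_class=3, dataset=None)
print(type(model).__name__)  # DummyPointPillar
```

With this pattern, switching detectors only requires changing the `NAME` string in the config, provided the class is registered.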

#### The individual model components, such as the VFE, MAP_TO_BEV, and BACKBONE_2D, are all built in a similar way.

#### Taking the VFE as an example, in OpenPCDet/pcdet/models/detectors/detector3d_template.py:
```python
def build_vfe(self, model_info_dict):
    if self.model_cfg.get('VFE', None) is None:
        return None, model_info_dict

    vfe_module = vfe.__all__[self.model_cfg.VFE.NAME](
        model_cfg=self.model_cfg.VFE,
        num_point_features=model_info_dict['num_rawpoint_features'],
        point_cloud_range=model_info_dict['point_cloud_range'],
        voxel_size=model_info_dict['voxel_size'],
        grid_size=model_info_dict['grid_size'],
        depth_downsample_factor=model_info_dict['depth_downsample_factor']
    )
    model_info_dict['num_point_features'] = vfe_module.get_output_feature_dim()
    model_info_dict['module_list'].append(vfe_module)
    return vfe_module, model_info_dict
```
#### The line to build vfe_module actually becomes:
```python
    vfe_module = PillarVFE(
        model_cfg=self.model_cfg.VFE,
        num_point_features=model_info_dict['num_rawpoint_features'],
        point_cloud_range=model_info_dict['point_cloud_range'],
        voxel_size=model_info_dict['voxel_size'],
        grid_size=model_info_dict['grid_size'],
        depth_downsample_factor=model_info_dict['depth_downsample_factor']
    )
```

### PillarVFE is defined in OpenPCDet/pcdet/models/backbones_3d/vfe/pillar_vfe.py
```python
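import torch
import torch.nn as nn

from .vfe_template import VFETemplate

# PFNLayer is defined earlier in the same file (pillar_vfe.py).


class PillarVFE(VFETemplate):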
    def __init__(self, model_cfg, num_point_features, voxel_size, point_cloud_range, **kwargs):
        super().__init__(model_cfg=model_cfg)

        self.use_norm = self.model_cfg.USE_NORM
        self.with_distance = self.model_cfg.WITH_DISTANCE
        self.use_absolute_xyz = self.model_cfg.USE_ABSLOTE_XYZ
        num_point_features += 6 if self.use_absolute_xyz else 3
        if self.with_distance:
            num_point_features += 1

        self.num_filters = self.model_cfg.NUM_FILTERS
        assert len(self.num_filters) > 0
        num_filters = [num_point_features] + list(self.num_filters)

        pfn_layers = []
        for i in range(len(num_filters) - 1):
            in_filters = num_filters[i]
            out_filters = num_filters[i + 1]
            pfn_layers.append(
                PFNLayer(in_filters, out_filters, self.use_norm, last_layer=(i >= len(num_filters) - 2))
            )
        self.pfn_layers = nn.ModuleList(pfn_layers)

        self.voxel_x = voxel_size[0]
        self.voxel_y = voxel_size[1]
        self.voxel_z = voxel_size[2]
        self.x_offset = self.voxel_x / 2 + point_cloud_range[0]
        self.y_offset = self.voxel_y / 2 + point_cloud_range[1]
        self.z_offset = self.voxel_z / 2 + point_cloud_range[2]

    def get_output_feature_dim(self):
        return self.num_filters[-1]

    def get_paddings_indicator(self, actual_num, max_num, axis=0):
        actual_num = torch.unsqueeze(actual_num, axis + 1)
        max_num_shape = [1] * len(actual_num.shape)
        max_num_shape[axis + 1] = -1
        max_num = torch.arange(max_num, dtype=torch.int, device=actual_num.device).view(max_num_shape)
        paddings_indicator = actual_num.int() > max_num
        return paddings_indicator

    def forward(self, batch_dict, **kwargs):
  
        voxel_features, voxel_num_points, coords = batch_dict['voxels'], batch_dict['voxel_num_points'], batch_dict['voxel_coords']
        points_mean = voxel_features[:, :, :3].sum(dim=1, keepdim=True) / voxel_num_points.type_as(voxel_features).view(-1, 1, 1)
        f_cluster = voxel_features[:, :, :3] - points_mean

        f_center = torch.zeros_like(voxel_features[:, :, :3])
        f_center[:, :, 0] = voxel_features[:, :, 0] - (coords[:, 3].to(voxel_features.dtype).unsqueeze(1) * self.voxel_x + self.x_offset)
        f_center[:, :, 1] = voxel_features[:, :, 1] - (coords[:, 2].to(voxel_features.dtype).unsqueeze(1) * self.voxel_y + self.y_offset)
        f_center[:, :, 2] = voxel_features[:, :, 2] - (coords[:, 1].to(voxel_features.dtype).unsqueeze(1) * self.voxel_z + self.z_offset)

        if self.use_absolute_xyz:
            features = [voxel_features, f_cluster, f_center]
        else:
            features = [voxel_features[..., 3:], f_cluster, f_center]

        if self.with_distance:
            points_dist = torch.norm(voxel_features[:, :, :3], 2, 2, keepdim=True)
            features.append(points_dist)
        features = torch.cat(features, dim=-1)

        voxel_count = features.shape[1]
        mask = self.get_paddings_indicator(voxel_num_points, voxel_count, axis=0)
        mask = torch.unsqueeze(mask, -1).type_as(voxel_features)
        features *= mask
        for pfn in self.pfn_layers:
            features = pfn(features)
        features = features.squeeze()
        batch_dict['pillar_features'] = features
        return batch_dict
```
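
Two steps of the class above are easy to check by hand: how many input channels the first PFN layer receives, and how the padding mask is formed. The sketch below redoes both in plain Python so that it runs without PyTorch. `decorated_feature_dim` and `paddings_indicator` are hypothetical helper names, and the value of 4 raw features per point (x, y, z, intensity) is an assumption about the KITTI setup rather than something stated in the listing.

```python
def decorated_feature_dim(num_raw, use_absolute_xyz=True, with_distance=False):
    """Mirror the channel arithmetic in PillarVFE.__init__."""
    # f_cluster and f_center add 6 channels; dropping absolute xyz removes 3.
    dim = num_raw + (6 if use_absolute_xyz else 3)
    if with_distance:
        dim += 1  # optional distance-to-origin channel
    return dim


def paddings_indicator(actual_num, max_num):
    """Row i is True for the first actual_num[i] slots and False for padded slots."""
    return [[n > slot for slot in range(max_num)] for n in actual_num]


# Assumed KITTI input: 4 raw features per point (x, y, z, intensity).
print(decorated_feature_dim(4))  # 10

# Three pillars holding 2, 4 and 1 real points, each padded to 4 slots.
for row in paddings_indicator([2, 4, 1], max_num=4):
    print(row)
# [True, True, False, False]
# [True, True, True, True]
# [True, False, False, False]
```

Multiplying the features by this mask zeroes the padded slots, so the offsets computed for empty slots do not leak into the PFN layers.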



## Project
### Modify PointPillar's VFE module by inserting the TA (Triple Attention) module proposed in TANet, [paper link](https://arxiv.org/pdf/1912.05163.pdf)

### TANet: Robust 3D Object Detection from Point Clouds with Triple Attention
<center>
    <img src="https://i.imgur.com/BezIDFS.png" width = "80%">
    <br>
    <div style="color:orange;
    display: inline-block;
    ">Fig. Overall architecture of TANet</div>
</center>

#### The proposed TA (Triple Attention) module in the VFE (Voxel Feature Extractor) is one of the main contributions of this paper.
<center>
    <img src="https://i.imgur.com/zt4lpcs.png" width = "40%">
    <br>
    <div style="color:orange;
    display: inline-block;
    ">Fig. Triple Attention module for VFE (Voxel Feature Extractor)</div>
</center>

#### Visualization results reported by TANet's authors, compared with PointPillar.
<center>
    <img src="https://i.imgur.com/YbzgaWX.png" width = "80%">
    <br>
    <div style="color:orange;
    display: inline-block;
    ">Fig. Visualization results</div>
</center>


### Project Tasks:
### 1. Implement the TA (Triple Attention) module.
### 2. Compare the results of the modified model and the original PointPillar in terms of both mAP and execution latency. (Train both models for 20 epochs.)
### 3. Visualize and compare the prediction results of the modified model and the original PointPillar. Note: Use samples from the [validation set](https://github.com/tianqi-wang1996/OpenPCDet/blob/master/data/kitti/ImageSets/val.txt) for the comparison. For example, the file contains the entry 000001, which means that OpenPCDet/data/kitti/training/velodyne/000001.bin is in the validation set.
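
For the latency comparison in Task 2, one possible approach is to time repeated calls after a few warm-up runs and report the mean time per call. The helper below is a minimal sketch using only the standard library. `measure_latency` and `fake_inference` are hypothetical names, and the workload is only a stand-in for a model forward pass. When timing a real model on a GPU, calling `torch.cuda.synchronize()` before reading the clock is typically needed, because CUDA kernels run asynchronously.

```python
import time


def measure_latency(fn, warmup=3, repeats=10):
    """Return the mean wall-clock time per call of fn(), in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / repeats


def fake_inference():
    # Hypothetical workload standing in for model(batch).
    return sum(i * i for i in range(10_000))


latency_ms = measure_latency(fake_inference)
print(f"mean latency: {latency_ms:.3f} ms per call")
```

The evaluation log printed by test.py also reports a `sec_per_example` value, which can be compared between the two models directly.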

#### To insert the TA module, we have already written and wrapped the other necessary code for you. You can go directly to OpenPCDet/pcdet/models/backbones_3d/vfe/pillar_vfe.py and fill in the forward functions marked with "# WRITE YOUR CODE BELOW!"
```python
# Point-wise attention for each voxel
class PALayer(nn.Module):
    def __init__(self, dim_pa, reduction_pa):
        super(PALayer, self).__init__()
        self.fc = nn.Sequential(
            nn.Linear(dim_pa, dim_pa // reduction_pa),
            nn.ReLU(inplace=True),
            nn.Linear(dim_pa // reduction_pa, dim_pa)
        )

    def forward(self, x):
        # WRITE YOUR CODE BELOW!
        
        return 

# Channel-wise attention for each voxel
class CALayer(nn.Module):
    def __init__(self, dim_ca, reduction_ca):
        super(CALayer, self).__init__()
        self.fc = nn.Sequential(
            nn.Linear(dim_ca, dim_ca // reduction_ca),
            nn.ReLU(inplace=True),
            nn.Linear(dim_ca // reduction_ca, dim_ca)
        )

    def forward(self, x):
        # WRITE YOUR CODE BELOW!

        return y


# Point-wise and Channel-wise attention for each voxel
class PACALayer(nn.Module):
    def __init__(self, dim_ca, dim_pa, reduction_r):
        super(PACALayer, self).__init__()
        self.pa = PALayer(dim_pa, dim_pa // reduction_r)
        self.ca = CALayer(dim_ca, dim_ca // reduction_r)
        self.sig = nn.Sigmoid()

    def forward(self, x):
        # WRITE YOUR CODE BELOW!
        
        return out, paca_normal_weight


# Voxel-wise attention for each voxel
class VALayer(nn.Module):
    def __init__(self, c_num, p_num):
        super(VALayer, self).__init__()
        self.fc1 = nn.Sequential(
            nn.Linear(c_num + 3, 1),
            nn.ReLU(inplace=True)
        )

        self.fc2 = nn.Sequential(
            nn.Linear(p_num, 1),
            nn.ReLU(inplace=True)
        )

        self.sigmod = nn.Sigmoid()

    def forward(self, voxel_center, paca_feat):
        '''
        :param voxel_center: size (K,1,3)
        :param paca_feat: size (K,N,C)
        :return: voxel_attention_weight: size (K,1,1)
        '''
        # WRITE YOUR CODE BELOW!

        return voxel_attention_weight
```

#### After finishing your code above, train, test, and visualize the modified network using the following commands
```shell
cd ~/OpenPCDet/tools/
python train.py --cfg_file ./cfgs/kitti_models/pointpillar_tanet.yaml --extra_tag experiment1
```
```shell
cd ~/OpenPCDet/tools/
python test.py --cfg_file ./cfgs/kitti_models/pointpillar_tanet.yaml --ckpt ../output/pointpillar_tanet/experiment1/ckpt/checkpoint_epoch_20.pth
```
```shell
cd ~/OpenPCDet/tools/
python demo.py --cfg_file ./cfgs/kitti_models/pointpillar_tanet.yaml --ckpt ../output/pointpillar_tanet/experiment1/ckpt/checkpoint_epoch_20.pth --data_path ../data/kitti/training/velodyne/000039.bin
```
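
To pick validation samples for the visual comparison in Task 3, the sample IDs in val.txt can be mapped to point cloud files. The sketch below assumes that val.txt lists one six-digit ID per line and that the files follow the training/velodyne/<id>.bin layout mentioned in the task description. `velodyne_paths` is a hypothetical helper, and the example input is illustrative rather than the real file content.

```python
from pathlib import PurePosixPath


def velodyne_paths(val_txt, kitti_root="../data/kitti"):
    """Map each sample ID listed in val.txt to its .bin point cloud path."""
    root = PurePosixPath(kitti_root) / "training" / "velodyne"
    ids = [line.strip() for line in val_txt.splitlines() if line.strip()]
    return [str(root / f"{sample_id}.bin") for sample_id in ids]


# Illustrative input; in practice read the text from data/kitti/ImageSets/val.txt.
example = "000001\n000002\n"
for path in velodyne_paths(example):
    print(path)
# ../data/kitti/training/velodyne/000001.bin
# ../data/kitti/training/velodyne/000002.bin
```

Each resulting path can be passed to demo.py through `--data_path`, once with the original PointPillar checkpoint and once with the modified one.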