# **Cifar10 image classification**
******************************************
**점수관련**  
본 프로젝트는 Accuracy 점수와 보고서를 성적에 반영할 예정입니다. 평가 항목은 아래와 같습니다. 

A. 결과 (40%)
- Metric 성능 :(이미지 분류 - Accuracy)

B. 신규성 (30%)
- Network 변경 내용 - **필수**
- 성능 개선 및 overfitting 방지 시도 (ex. train validation split)

C. 이론적 근거 (20%)
- 기존 baseline과의 차별점 
- 성능 개선을 위한 시도와 이유

D. 보고서 완성도 (10%)

******************************************
**보고서**
1. 연구 목적
2. 모델 구조
3. 실험 내용
4. 실험 결과
5. 고찰 및 결론
6. Colab 파일 (동작 가능여부)

제출하실 파일은 **주피터 노트북 파일**('.ipynb',파일>다운로드)과 **결과 보고서**(pdf)입니다.
******************************************
**코드 검증**  
- 코드 성능 평가(metric)는 torchvision.datasets에 있는 cifar10의 testset을 사용해 Accuracy로 평가합니다.
- pretrain 모델 사용 불가능합니다.

또한 Random 라이브러리 사용시에 seed 고정하는 등 재현을 고려하시고 코딩부탁드립니다. 
******************************************
**GPU 사용 법**  
런타임 > 런타임 유형 변경 > 하드웨어 가속기에서 GPU를 선택하면 GPU를 사용 할 수 있습니다.  
******************************************
**Colab 사용시 유의사항**  
12시간 단위로 가상머신을 사용할 수 있고 12시간이 지나면 모든 파일과 작업로그들이 초기화 됩니다. 또한, 12시간 넘게 GPU를 사용하기는 어려우니 일찍 시작하셔서 틈틈히 실험해보시는 것을 추천드립니다.


******************************************
**Q?**

In [1]:
# Training
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np
from tqdm import tqdm as tqdm
import time
import math
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="0"



In [2]:
!nvidia-smi

Mon Nov  1 00:57:35 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 52%   60C    P0    84W / 260W |     18MiB / 11016MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:03:00.0 Off |                  N/A |
| 73%   75C    P2   303W / 340W |   7797MiB / 10018MiB |     83%      Defaul

In [3]:
# 하이퍼 파라미터
EPOCH = 300
batch_size = 16
learning_rate = 0.1
momentum = 0.9
weight_decay = 1e-4

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(f'{device} is available')

# 분류 Class list
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


# 이미지 전처리
transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize((0.49139968, 0.48215827, 0.44653124), (0.24703233, 0.24348505, 0.26158768))
])

# Dataset. 변경 불가
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False)


cuda:0 is available
Files already downloaded and verified
Files already downloaded and verified


In [4]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

import torchvision
import torchvision.transforms as transforms
from torch.utils.data import dataloader, dataset


class Bottleneck(nn.Module):
    def __init__(self, nChannels, growthRate):
        super(Bottleneck, self).__init__()
        interChannels = 4 * growthRate
        self.bn1 = nn.BatchNorm2d(nChannels)
        self.conv1 = nn.Conv2d(nChannels, interChannels, kernel_size=1, bias=False)
        self.bn2 = nn.BatchNorm2d(interChannels)
        self.conv2 = nn.Conv2d(interChannels, growthRate, kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))
        out = self.conv2(F.relu(self.bn2(out)))
        out = torch.cat((x, out), 1)    # concatenate short connection
        return out


class SingleLayer(nn.Module):
    def __init__(self, nChannels, growthRate):
        super(SingleLayer, self).__init__()
        self.bn1 = nn.BatchNorm2d(nChannels, growthRate)
        self.conv1 = nn.Conv2d(nChanels, growthRate, kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))
        out = torch.cat((x, out), 1)
        return out

class Transition(nn.Module):
    def __init__(self, nChannels, nOutChannels):
        super(Transition, self).__init__()
        self.bn1 = nn.BatchNorm2d(nChannels)
        self.conv1 = nn.Conv2d(nChannels, nOutChannels, kernel_size=1, bias=False)

    def forward(self, x):
        out = self.conv1(F.relu(self.bn1(x)))
        out = F.avg_pool2d(out, 2)
        return out


class DenseNet(nn.Module):
    def __init__(self, growthRate, depth, reduction, nClasses=10):
        super(DenseNet, self).__init__()
        nDenseBlocks = (depth - 4) // 6

        nChannels = 2 * growthRate # 논문 확인.
        self.conv1 = nn.Conv2d(3, nChannels, kernel_size=3, padding=1, bias=False)
        self.dense1 = self._make_dense(nChannels, growthRate, nDenseBlocks)

        nChannels += nDenseBlocks * growthRate
        nOutChannels = int(math.floor(nChannels*reduction))
        self.trans1 = Transition(nChannels, nOutChannels)

        nChannels = nOutChannels
        self.dense2 = self._make_dense(nChannels, growthRate, nDenseBlocks)
        nChannels += nDenseBlocks*growthRate
        nOutChannels = int(math.floor(nChannels*reduction))
        self.trans2 = Transition(nChannels, nOutChannels)

        nChannels = nOutChannels
        self.dense3 = self._make_dense(nChannels, growthRate, nDenseBlocks)
        nChannels += nDenseBlocks*growthRate

        self.bn1 = nn.BatchNorm2d(nChannels)
        self.fc = nn.Linear(nChannels, nClasses)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                m.bias.data.zero_()


    def _make_dense(self, nChannels, growthRate, nDenseBlocks):
        layers = []
        for i in range(int(nDenseBlocks)):
            layers.append(Bottleneck(nChannels, growthRate))
            nChannels += growthRate
        return nn.Sequential(*layers)

    def forward(self, x):
        out = self.conv1(x)
        out = self.trans1(self.dense1(out))
        out = self.trans2(self.dense2(out))
        out = self.dense3(out)
        out = torch.squeeze(F.avg_pool2d(F.relu(self.bn1(out)), 8))
        out = F.log_softmax(self.fc(out))
        return out


In [5]:
# model = DenseNet(growthRate=12, depth=100, reduction=0.5, nClasses=10)
model = DenseNet(growthRate=40, depth=190, reduction=0.5, nClasses=10)
model.to(device)

DenseNet(
  (conv1): Conv2d(3, 80, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (dense1): Sequential(
    (0): Bottleneck(
      (bn1): BatchNorm2d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv1): Conv2d(80, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn2): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(160, 40, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
    (1): Bottleneck(
      (bn1): BatchNorm2d(120, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv1): Conv2d(120, 160, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn2): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(160, 40, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
    (2): Bottleneck(
      (bn1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_s

In [6]:
# criterion. 변경 '가능'
criterion = nn.CrossEntropyLoss()

# Optimizer
optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=momentum, weight_decay=weight_decay)

# Learning Rate Scheduler
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[150, 225], gamma=0.1, verbose=True)

loss_ = []
n = len(trainloader)  # number of batches

# Training
for epoch in range(EPOCH): 

  running_loss = 0.0
  start = time.time()
  pbar = tqdm(trainloader)
  for batch, (X, y) in enumerate(pbar):
    
    X, y = X.to(device), y.to(device)
    optimizer.zero_grad() # 배치마다 optimizer 초기화

    pred = model(X) 
    loss = criterion(pred, y) # 크로스 엔트로피 손실함수 계산 
      

    loss.backward() # backpropagation
    optimizer.step() # 가중치 최적화

    running_loss += loss.item()
  
  if (epoch + 1) % 10 == 0:
    dir = './model_' + str(epoch) + '.pt'
    torch.save(model.state_dict() , dir) # 모델 저장, path 수정
  loss_.append(running_loss / n)
  print('[%d] loss: %.3f' %(epoch + 1, running_loss / len(trainloader)))
  print("epoch time :", time.time()-start)
  scheduler.step()




  0%|          | 0/3125 [00:00<?, ?it/s]

Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:53<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[1] loss: 1.787
epoch time : 1013.7005136013031
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:55<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[2] loss: 1.133
epoch time : 1015.9270730018616
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:57<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[3] loss: 0.831
epoch time : 1017.0911803245544
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[4] loss: 0.684
epoch time : 1018.840559720993
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[5] loss: 0.615
epoch time : 1018.9209396839142
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [17:00<00:00,  3.06it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[6] loss: 0.575
epoch time : 1020.132402420044
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[7] loss: 0.551
epoch time : 1018.8470873832703
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:59<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[8] loss: 0.528
epoch time : 1019.3732826709747
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:57<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[9] loss: 0.514
epoch time : 1017.8320553302765
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:57<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[10] loss: 0.508
epoch time : 1017.6283023357391
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:57<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[11] loss: 0.492
epoch time : 1017.6989529132843
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[12] loss: 0.480
epoch time : 1018.1293067932129
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[13] loss: 0.472
epoch time : 1018.1751453876495
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[14] loss: 0.462
epoch time : 1018.7287044525146
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[15] loss: 0.464
epoch time : 1018.6801748275757
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[16] loss: 0.457
epoch time : 1018.3609108924866
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:59<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[17] loss: 0.449
epoch time : 1019.0419933795929
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:59<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[18] loss: 0.448
epoch time : 1019.4249813556671
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[19] loss: 0.445
epoch time : 1018.8263518810272
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [17:00<00:00,  3.06it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[20] loss: 0.436
epoch time : 1020.9844074249268
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[21] loss: 0.434
epoch time : 1018.2254927158356
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[22] loss: 0.435
epoch time : 1018.8953070640564
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:59<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[23] loss: 0.430
epoch time : 1019.0189926624298
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:59<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[24] loss: 0.427
epoch time : 1019.0175724029541
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[25] loss: 0.427
epoch time : 1018.3028981685638
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:56<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[26] loss: 0.420
epoch time : 1016.9080274105072
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:56<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[27] loss: 0.422
epoch time : 1016.8577256202698
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:57<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[28] loss: 0.421
epoch time : 1017.3779957294464
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:51<00:00,  3.09it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[29] loss: 0.420
epoch time : 1011.6574401855469
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:52<00:00,  3.09it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[30] loss: 0.414
epoch time : 1012.668760061264
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:56<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[31] loss: 0.420
epoch time : 1016.4279537200928
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:56<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[32] loss: 0.412
epoch time : 1016.7295143604279
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:57<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[33] loss: 0.411
epoch time : 1017.6247100830078
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[34] loss: 0.413
epoch time : 1018.6330528259277
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[35] loss: 0.409
epoch time : 1018.5783655643463
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:57<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[36] loss: 0.408
epoch time : 1017.9059817790985
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:57<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[37] loss: 0.411
epoch time : 1017.6403017044067
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:55<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[38] loss: 0.414
epoch time : 1015.6476843357086
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:54<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[39] loss: 0.407
epoch time : 1014.9475300312042
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:55<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[40] loss: 0.405
epoch time : 1015.4211733341217
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:53<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[41] loss: 0.409
epoch time : 1013.7347378730774
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:52<00:00,  3.09it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[42] loss: 0.407
epoch time : 1012.465704202652
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:53<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[43] loss: 0.407
epoch time : 1013.18905377388
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:51<00:00,  3.09it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[44] loss: 0.405
epoch time : 1011.5067863464355
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:51<00:00,  3.09it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[45] loss: 0.404
epoch time : 1011.5559706687927
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:52<00:00,  3.09it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[46] loss: 0.410
epoch time : 1012.8899526596069
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:51<00:00,  3.09it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[47] loss: 0.409
epoch time : 1011.2702314853668
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:54<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[48] loss: 0.404
epoch time : 1014.1158378124237
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:54<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[49] loss: 0.401
epoch time : 1014.9414188861847
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:56<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[50] loss: 0.403
epoch time : 1017.2532608509064
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:57<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[51] loss: 0.407
epoch time : 1017.6879630088806
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[52] loss: 0.400
epoch time : 1018.2691309452057
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:57<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[53] loss: 0.398
epoch time : 1017.9661588668823
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:58<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[54] loss: 0.401
epoch time : 1018.0526964664459
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:54<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[55] loss: 0.395
epoch time : 1014.8466868400574
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:54<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[56] loss: 0.401
epoch time : 1014.3297805786133
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:53<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[57] loss: 0.396
epoch time : 1013.598947763443
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:52<00:00,  3.09it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[58] loss: 0.399
epoch time : 1012.0902636051178
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:52<00:00,  3.09it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[59] loss: 0.399
epoch time : 1012.1171791553497
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:49<00:00,  3.09it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[60] loss: 0.403
epoch time : 1010.2208995819092
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:54<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[61] loss: 0.400
epoch time : 1014.6508984565735
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:56<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[62] loss: 0.404
epoch time : 1016.0645496845245
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:56<00:00,  3.07it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[63] loss: 0.400
epoch time : 1016.3052349090576
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:56<00:00,  3.08it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[64] loss: 0.396
epoch time : 1016.0463180541992
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:49<00:00,  3.10it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[65] loss: 0.401
epoch time : 1009.4647390842438
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:45<00:00,  3.11it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[66] loss: 0.405
epoch time : 1005.1028141975403
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:45<00:00,  3.11it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[67] loss: 0.400
epoch time : 1005.032000541687
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:44<00:00,  3.11it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[68] loss: 0.394
epoch time : 1004.2755634784698
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:42<00:00,  3.12it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[69] loss: 0.400
epoch time : 1002.7767786979675
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:42<00:00,  3.12it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[70] loss: 0.397
epoch time : 1002.46027302742
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:41<00:00,  3.12it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[71] loss: 0.396
epoch time : 1001.5814979076385
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:41<00:00,  3.12it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[72] loss: 0.400
epoch time : 1001.1184298992157
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:41<00:00,  3.12it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[73] loss: 0.395
epoch time : 1001.6153438091278
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:40<00:00,  3.12it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[74] loss: 0.391
epoch time : 1000.9499819278717
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:40<00:00,  3.12it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[75] loss: 0.398
epoch time : 1000.1944658756256
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:39<00:00,  3.13it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[76] loss: 0.395
epoch time : 999.5218846797943
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:38<00:00,  3.13it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[77] loss: 0.396
epoch time : 998.1602125167847
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:37<00:00,  3.13it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[78] loss: 0.395
epoch time : 997.7619109153748
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:37<00:00,  3.13it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[79] loss: 0.397
epoch time : 997.2788600921631
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:36<00:00,  3.13it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[80] loss: 0.397
epoch time : 997.1874964237213
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:36<00:00,  3.14it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[81] loss: 0.395
epoch time : 996.4130570888519
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:36<00:00,  3.14it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[82] loss: 0.396
epoch time : 996.1475355625153
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:36<00:00,  3.13it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[83] loss: 0.393
epoch time : 996.8911368846893
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:36<00:00,  3.14it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[84] loss: 0.394
epoch time : 996.6834244728088
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:37<00:00,  3.13it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[85] loss: 0.402
epoch time : 997.2904934883118
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:38<00:00,  3.13it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[86] loss: 0.395
epoch time : 998.8409218788147
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:39<00:00,  3.13it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[87] loss: 0.395
epoch time : 999.5472316741943
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:39<00:00,  3.13it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[88] loss: 0.396
epoch time : 999.1681227684021
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:20<00:00,  3.19it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[89] loss: 0.401
epoch time : 980.3283612728119
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:07<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[90] loss: 0.398
epoch time : 968.0810120105743
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[91] loss: 0.394
epoch time : 965.3721082210541
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[92] loss: 0.388
epoch time : 965.6129727363586
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[93] loss: 0.399
epoch time : 965.6015701293945
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[94] loss: 0.399
epoch time : 965.8655579090118
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[95] loss: 0.399
epoch time : 965.5958943367004
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[96] loss: 0.393
epoch time : 965.7063021659851
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[97] loss: 0.394
epoch time : 965.9388153553009
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[98] loss: 0.394
epoch time : 966.1453621387482
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[99] loss: 0.395
epoch time : 965.7218310832977
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[100] loss: 0.393
epoch time : 966.2322645187378
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[101] loss: 0.389
epoch time : 966.0406093597412
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[102] loss: 0.398
epoch time : 965.9842729568481
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[103] loss: 0.392
epoch time : 965.3744270801544
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[104] loss: 0.399
epoch time : 965.6713404655457
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[105] loss: 0.394
epoch time : 965.4142155647278
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[106] loss: 0.400
epoch time : 965.8337726593018
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[107] loss: 0.391
epoch time : 966.0317714214325
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[108] loss: 0.391
epoch time : 965.530592918396
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[109] loss: 0.393
epoch time : 965.8355033397675
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[110] loss: 0.396
epoch time : 965.7052762508392
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[111] loss: 0.397
epoch time : 965.3968169689178
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[112] loss: 0.396
epoch time : 965.2747235298157
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[113] loss: 0.392
epoch time : 965.7274000644684
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[114] loss: 0.396
epoch time : 965.5571892261505
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[115] loss: 0.391
epoch time : 965.5684516429901
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[116] loss: 0.394
epoch time : 965.0967104434967
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[117] loss: 0.396
epoch time : 964.5485689640045
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[118] loss: 0.393
epoch time : 965.3614337444305
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[119] loss: 0.404
epoch time : 964.8516101837158
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[120] loss: 0.392
epoch time : 965.342636346817
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[121] loss: 0.395
epoch time : 965.5019626617432
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[122] loss: 0.399
epoch time : 965.2532134056091
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[123] loss: 0.390
epoch time : 965.4936468601227
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[124] loss: 0.400
epoch time : 966.0755302906036
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:07<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[125] loss: 0.396
epoch time : 967.493825674057
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:08<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[126] loss: 0.395
epoch time : 968.2652485370636
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:08<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[127] loss: 0.400
epoch time : 968.3099167346954
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:08<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[128] loss: 0.390
epoch time : 968.487470626831
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:08<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[129] loss: 0.397
epoch time : 968.9606025218964
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:09<00:00,  3.22it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[130] loss: 0.398
epoch time : 969.6406662464142
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:09<00:00,  3.22it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[131] loss: 0.397
epoch time : 969.4834485054016
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:10<00:00,  3.22it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[132] loss: 0.396
epoch time : 970.3099105358124
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:10<00:00,  3.22it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[133] loss: 0.393
epoch time : 970.1754593849182
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:09<00:00,  3.22it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[134] loss: 0.392
epoch time : 969.2872273921967
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:07<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[135] loss: 0.391
epoch time : 967.7472786903381
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:07<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[136] loss: 0.395
epoch time : 967.5415720939636
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[137] loss: 0.394
epoch time : 966.924998998642
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:07<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[138] loss: 0.394
epoch time : 967.2213401794434
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:07<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[139] loss: 0.396
epoch time : 967.7628078460693
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:07<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[140] loss: 0.399
epoch time : 967.6467554569244
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[141] loss: 0.394
epoch time : 966.297306060791
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[142] loss: 0.398
epoch time : 966.594057559967
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[143] loss: 0.393
epoch time : 966.5899293422699
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[144] loss: 0.392
epoch time : 965.7675545215607
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[145] loss: 0.394
epoch time : 966.0079581737518
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[146] loss: 0.400
epoch time : 965.3521637916565
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[147] loss: 0.394
epoch time : 965.5470387935638
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[148] loss: 0.396
epoch time : 965.1579604148865
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[149] loss: 0.395
epoch time : 965.601571559906
Adjusting learning rate of group 0 to 1.0000e-01.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[150] loss: 0.399
epoch time : 965.881101846695
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[151] loss: 0.201
epoch time : 965.7638359069824
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[152] loss: 0.149
epoch time : 965.8296926021576
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[153] loss: 0.129
epoch time : 965.51265001297
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[154] loss: 0.113
epoch time : 965.8253695964813
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[155] loss: 0.102
epoch time : 965.3450980186462
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[156] loss: 0.089
epoch time : 965.3906452655792
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[157] loss: 0.084
epoch time : 965.8967106342316
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[158] loss: 0.082
epoch time : 965.4198868274689
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[159] loss: 0.077
epoch time : 965.4878621101379
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[160] loss: 0.072
epoch time : 965.9135975837708
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[161] loss: 0.070
epoch time : 964.6928176879883
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[162] loss: 0.068
epoch time : 965.1378841400146
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[163] loss: 0.067
epoch time : 965.1459658145905
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[164] loss: 0.065
epoch time : 965.6112842559814
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[165] loss: 0.068
epoch time : 964.737420797348
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[166] loss: 0.065
epoch time : 965.2185032367706
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[167] loss: 0.067
epoch time : 964.977292060852
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[168] loss: 0.070
epoch time : 965.0782663822174
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[169] loss: 0.071
epoch time : 965.1954884529114
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[170] loss: 0.071
epoch time : 965.5276832580566
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[171] loss: 0.068
epoch time : 965.1079187393188
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[172] loss: 0.072
epoch time : 964.5786654949188
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[173] loss: 0.073
epoch time : 964.6967749595642
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[174] loss: 0.076
epoch time : 964.9136500358582
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[175] loss: 0.072
epoch time : 964.9988036155701
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[176] loss: 0.072
epoch time : 966.0345284938812
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[177] loss: 0.075
epoch time : 965.7459373474121
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[178] loss: 0.084
epoch time : 965.7261810302734
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[179] loss: 0.081
epoch time : 965.970849275589
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[180] loss: 0.079
epoch time : 966.0040774345398
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[181] loss: 0.075
epoch time : 965.7694525718689
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[182] loss: 0.073
epoch time : 965.7097392082214
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[183] loss: 0.078
epoch time : 965.7715735435486
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[184] loss: 0.073
epoch time : 965.5422487258911
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[185] loss: 0.077
epoch time : 965.7759923934937
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[186] loss: 0.076
epoch time : 965.5223588943481
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[187] loss: 0.076
epoch time : 966.0193784236908
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[188] loss: 0.077
epoch time : 965.9296309947968
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[189] loss: 0.073
epoch time : 966.0027375221252
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[190] loss: 0.073
epoch time : 966.1614212989807
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[191] loss: 0.072
epoch time : 965.6463921070099
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[192] loss: 0.070
epoch time : 965.7100052833557
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[193] loss: 0.074
epoch time : 965.8190362453461
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[194] loss: 0.069
epoch time : 965.3345537185669
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[195] loss: 0.077
epoch time : 965.7036597728729
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[196] loss: 0.070
epoch time : 966.1001989841461
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[197] loss: 0.071
epoch time : 965.9694473743439
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[198] loss: 0.072
epoch time : 965.5570676326752
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[199] loss: 0.072
epoch time : 966.0938999652863
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[200] loss: 0.069
epoch time : 966.4716262817383
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[201] loss: 0.072
epoch time : 966.2944931983948
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[202] loss: 0.072
epoch time : 965.9301912784576
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[203] loss: 0.069
epoch time : 965.7048630714417
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[204] loss: 0.068
epoch time : 965.9428384304047
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[205] loss: 0.070
epoch time : 965.2026877403259
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[206] loss: 0.074
epoch time : 965.6873426437378
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[207] loss: 0.065
epoch time : 966.042952299118
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[208] loss: 0.065
epoch time : 965.857063293457
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[209] loss: 0.068
epoch time : 965.91907954216
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[210] loss: 0.067
epoch time : 966.537519454956
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[211] loss: 0.068
epoch time : 965.9970343112946
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[212] loss: 0.066
epoch time : 966.2923612594604
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[213] loss: 0.065
epoch time : 966.2488203048706
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[214] loss: 0.067
epoch time : 966.1621730327606
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[215] loss: 0.067
epoch time : 966.1918315887451
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[216] loss: 0.066
epoch time : 966.959047794342
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[217] loss: 0.068
epoch time : 966.5870630741119
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:07<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[218] loss: 0.064
epoch time : 967.8827710151672
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:09<00:00,  3.22it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[219] loss: 0.068
epoch time : 969.1063959598541
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:08<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[220] loss: 0.064
epoch time : 968.492591381073
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[221] loss: 0.062
epoch time : 965.9521567821503
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[222] loss: 0.060
epoch time : 965.9572112560272
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[223] loss: 0.063
epoch time : 965.6569948196411
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[224] loss: 0.067
epoch time : 965.706475019455
Adjusting learning rate of group 0 to 1.0000e-02.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[225] loss: 0.061
epoch time : 965.2185513973236
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[226] loss: 0.029
epoch time : 965.3194420337677
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[227] loss: 0.016
epoch time : 964.758537530899
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[228] loss: 0.013
epoch time : 965.7776200771332
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[229] loss: 0.011
epoch time : 965.8639316558838
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[230] loss: 0.009
epoch time : 966.02614402771
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[231] loss: 0.009
epoch time : 965.8666570186615
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[232] loss: 0.008
epoch time : 966.5489144325256
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[233] loss: 0.007
epoch time : 966.2690277099609
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[234] loss: 0.006
epoch time : 966.642608165741
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[235] loss: 0.006
epoch time : 965.8105075359344
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:06<00:00,  3.23it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[236] loss: 0.006
epoch time : 966.3538784980774
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[237] loss: 0.006
epoch time : 965.292590379715
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[238] loss: 0.005
epoch time : 965.476984500885
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[239] loss: 0.005
epoch time : 965.3061001300812
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[240] loss: 0.005
epoch time : 965.9576938152313
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[241] loss: 0.005
epoch time : 964.839761018753
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[242] loss: 0.005
epoch time : 964.4477708339691
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:03<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[243] loss: 0.004
epoch time : 963.9161860942841
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[244] loss: 0.004
epoch time : 964.4436931610107
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[245] loss: 0.005
epoch time : 964.3312513828278
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[246] loss: 0.004
epoch time : 964.3535845279694
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[247] loss: 0.004
epoch time : 964.4272406101227
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[248] loss: 0.004
epoch time : 964.325083732605
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[249] loss: 0.004
epoch time : 964.2995209693909
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[250] loss: 0.004
epoch time : 964.3564207553864
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[251] loss: 0.004
epoch time : 964.3781950473785
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:03<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[252] loss: 0.003
epoch time : 963.7983613014221
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[253] loss: 0.003
epoch time : 964.3046762943268
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[254] loss: 0.003
epoch time : 964.2465605735779
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[255] loss: 0.003
epoch time : 964.0841271877289
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[256] loss: 0.003
epoch time : 964.3070957660675
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[257] loss: 0.003
epoch time : 964.889000415802
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[258] loss: 0.003
epoch time : 964.4676713943481
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[259] loss: 0.003
epoch time : 964.3470001220703
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[260] loss: 0.003
epoch time : 965.1482601165771
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[261] loss: 0.003
epoch time : 964.969286441803
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[262] loss: 0.003
epoch time : 964.814935207367
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[263] loss: 0.003
epoch time : 964.584666967392
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[264] loss: 0.003
epoch time : 964.7588257789612
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[265] loss: 0.003
epoch time : 964.7304553985596
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[266] loss: 0.003
epoch time : 964.6980595588684
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[267] loss: 0.003
epoch time : 964.9496595859528
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[268] loss: 0.003
epoch time : 965.5974452495575
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[269] loss: 0.002
epoch time : 965.1716575622559
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[270] loss: 0.002
epoch time : 964.9758541584015
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[271] loss: 0.003
epoch time : 964.8829662799835
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[272] loss: 0.002
epoch time : 964.7138850688934
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[273] loss: 0.002
epoch time : 964.8461089134216
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[274] loss: 0.002
epoch time : 965.2688925266266
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[275] loss: 0.002
epoch time : 965.3458139896393
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[276] loss: 0.002
epoch time : 964.7135622501373
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:05<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[277] loss: 0.003
epoch time : 965.0293483734131
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[278] loss: 0.002
epoch time : 964.7602844238281
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[279] loss: 0.002
epoch time : 964.9982101917267
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[280] loss: 0.002
epoch time : 964.665504693985
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[281] loss: 0.002
epoch time : 964.577713727951
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[282] loss: 0.002
epoch time : 964.2427151203156
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:03<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[283] loss: 0.002
epoch time : 963.9554114341736
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[284] loss: 0.002
epoch time : 964.347493648529
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[285] loss: 0.002
epoch time : 964.0790264606476
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[286] loss: 0.002
epoch time : 964.442957162857
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:03<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[287] loss: 0.002
epoch time : 963.8614847660065
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:03<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[288] loss: 0.002
epoch time : 963.9828293323517
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[289] loss: 0.002
epoch time : 964.1441929340363
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[290] loss: 0.002
epoch time : 964.6377763748169
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:03<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[291] loss: 0.002
epoch time : 963.6701111793518
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:03<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[292] loss: 0.002
epoch time : 963.6285090446472
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:03<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[293] loss: 0.002
epoch time : 963.3750596046448
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:03<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[294] loss: 0.002
epoch time : 963.4711253643036
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:03<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[295] loss: 0.002
epoch time : 963.7459230422974
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[296] loss: 0.002
epoch time : 964.0283310413361
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[297] loss: 0.002
epoch time : 964.3518030643463
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[298] loss: 0.002
epoch time : 964.2449102401733
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]
  0%|          | 0/3125 [00:00<?, ?it/s]

[299] loss: 0.002
epoch time : 964.4442565441132
Adjusting learning rate of group 0 to 1.0000e-03.


100%|██████████| 3125/3125 [16:04<00:00,  3.24it/s]


[300] loss: 0.002
epoch time : 964.9832217693329
Adjusting learning rate of group 0 to 1.0000e-03.


# Evaluation

In [8]:
# Evaluation

# 수정
# net = ResNet152().to(device)
net = model
# net.load_state_dict(torch.load('/ 
correct = 0
total = 0
correct_pred = {classname: 0 for classname in classes}
total_pred = {classname: 0 for classname in classes}
# since we're not training, we don't need to calculate the gradients for our outputs
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        # calculate outputs by running images through the network
        outputs = net(images)
        # the class with the highest energy is what we choose as prediction
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        # collect the correct predictions for each class
        for label, prediction in zip(labels, predicted):
            if label == prediction:
                correct_pred[classes[label]] += 1
            total_pred[classes[label]] += 1

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

# print accuracy for each class
for classname, correct_count in correct_pred.items():
    accuracy = 100 * float(correct_count) / total_pred[classname]
    print("Accuracy for class {:5s} is: {:.1f} %".format(classname,
                                                   accuracy))



Accuracy of the network on the 10000 test images: 94 %
Accuracy for class plane is: 95.4 %
Accuracy for class car   is: 97.3 %
Accuracy for class bird  is: 92.2 %
Accuracy for class cat   is: 89.6 %
Accuracy for class deer  is: 96.5 %
Accuracy for class dog   is: 91.3 %
Accuracy for class frog  is: 97.4 %
Accuracy for class horse is: 97.0 %
Accuracy for class ship  is: 97.4 %
Accuracy for class truck is: 95.8 %


In [None]:
import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.figure(figsize=(20,10))
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

dataiter = iter(testloader)
images, labels = dataiter.next()

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(8)))

net = model
# net = Net()
# net.to(device)
# net.load_state_dict(torch.load('/content/gdrive/MyDrive/Baseline//model_ckpt.pt'))

_, predicted = torch.max(outputs, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(8)))