# Capsule_Network
- Dynamic Routing Between Capsules (https://arxiv.org/abs/1710.09829) - 2017, Sara Sabour, Nicholas Frosst, Geoffrey E. Hinton
- PyTorch source : https://github.com/leftthomas/CapsNet
- Reference : https://jayhey.github.io/deep%20learning/2017/11/28/CapsNet_1

## Key
- capsule
- Dynamic routing
- squash
- decoder

### Caps vs DNN
![](CapsNet vs DNN.png)

### 1. Import Libs

In [1]:
import numpy as np
import os
import torch
import torch.nn as nn
from torch.optim import Adam
import torchnet as tnt
import torchvision.datasets as dset
import torch.nn.functional as F
from torch.autograd import Variable

from torchnet.engine import Engine
from torchnet.logger import VisdomPlotLogger, VisdomLogger
from torchvision.utils import make_grid
from tqdm import tqdm

### 2. Set Hyperparameters

In [2]:
BATCH_SIZE = 100
NUM_CLASSES = 10
NUM_EPOCHS = 100
NUM_ROUTING_ITERATIONS = 3

### 3. Prepare Dataset

In [3]:
def augmentation(x, max_shift=2):
    """Randomly translate a batch of images by up to `max_shift` pixels, padding with zeros."""
    _, _, height, width = x.size()

    # One vertical and one horizontal shift in [-max_shift, max_shift], shared by the whole batch.
    h_shift, w_shift = np.random.randint(-max_shift, max_shift + 1, size=2)
    # The "source" slices index the shifted (output) image; the "target" slices index the input image.
    source_height_slice = slice(max(0, h_shift), h_shift + height)
    source_width_slice = slice(max(0, w_shift), w_shift + width)
    target_height_slice = slice(max(0, -h_shift), -h_shift + height)
    target_width_slice = slice(max(0, -w_shift), -w_shift + width)

    shifted_image = torch.zeros(*x.size())
    shifted_image[:, :, source_height_slice, source_width_slice] = x[:, :, target_height_slice, target_width_slice]
    return shifted_image.float()

def get_iterator(mode):
    """Return a batch iterator over MNIST; `mode=True` selects the shuffled training set."""
    dataset = dset.MNIST(root='./data', train=mode, download=True)
    data = getattr(dataset, 'train_data' if mode else 'test_data')
    labels = getattr(dataset, 'train_labels' if mode else 'test_labels')
    tensor_dataset = tnt.dataset.TensorDataset([data, labels])

    return tensor_dataset.parallel(batch_size=BATCH_SIZE, num_workers=4, shuffle=mode)
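
The slice arithmetic used by `augmentation` can be checked on a single row of pixels. The following is a minimal sketch in plain Python, not part of the original notebook; the helper name `shift_slices` and the toy row are assumptions made for illustration.

```python
def shift_slices(shift, size):
    """Hypothetical helper: (destination, source) slices for shifting an axis of length `size`."""
    dst = slice(max(0, shift), shift + size)     # where pixels land in the shifted image
    src = slice(max(0, -shift), -shift + size)   # which pixels are read from the input
    return dst, src

row = [1, 2, 3, 4, 5]
results = {}
for shift in (-2, 0, 2):
    dst, src = shift_slices(shift, len(row))
    shifted = [0] * len(row)       # zero padding, as in torch.zeros(...)
    shifted[dst] = row[src]        # both slices cover the same number of pixels
    results[shift] = shifted
    print(shift, shifted)
```

A positive shift moves the content toward higher indices and a negative shift toward lower indices; the vacated positions stay zero.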

## 4. Models - CapsuleNetwork
- CapsuleLayer
- CapsuleNet

### CapsNet architecture
- Input : 28 x 28 x 1 (MNIST)
- Conv1 Kernel : 9 x 9 x 256, stride 1 + ReLU
    - output : 20 x 20 x 256
- Primary Capsule : 9 x 9 x (32 x 8), stride 2 + squash (the code in this notebook applies squash here, not ReLU)
    - output : 6 x 6 x (32 x 8)
- Digit Capsule : Dynamic routing
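
The layer sizes above follow from the standard output-size formula for a convolution without padding. The sketch below is only an arithmetic check under that assumption; the helper name `conv_out` is hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Hypothetical helper: spatial output size of a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

conv1 = conv_out(28, kernel=9, stride=1)        # Conv1 on a 28 x 28 MNIST image
primary = conv_out(conv1, kernel=9, stride=2)   # PrimaryCaps convolution on the Conv1 output
num_primary_capsules = primary * primary * 32   # one 8-dimensional capsule per position and channel

print(conv1, primary, num_primary_capsules)
```

The last value is the number of route nodes seen by the digit capsule layer.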


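### Notes
![](reshaping.png)
- The PrimaryCaps output is reshaped. So that each capsule holds 8 properties, the 6 x 6 x (32 x 8) feature map produced by the convolution filters is reshaped to (6 x 6 x 32) x 8. This yields 6 x 6 x 32 = 1152 first-level capsules, each with **8 properties**.

![](reshaping2.png)
- The green vector is a single capsule with 8 properties. In total there are 36 x 32 = 1152 capsules with 8 elements each. These capsules are connected to the higher-level capsules through dynamic routing.

![](Decoder_Structure.png)
- The final output is 10 capsules (MNIST digits 0 to 9).
- The 16 elements of the capsule with the largest length are passed through fully connected layers of size 512, 1024, and 784. A final sigmoid maps each output to a value between 0 and 1, which gives the reconstructed digit.
- The squared Euclidean distance between the reconstructed digit and the original digit is used as the reconstruction loss and backpropagated.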
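
Selecting the longest capsule for the decoder can be sketched as a masking step: every capsule except the longest is zeroed, and the result is flattened. This is a toy illustration in plain Python with 3 capsules of dimension 2 (instead of 10 capsules of dimension 16); the function name `mask_by_longest` is an assumption.

```python
import math

def mask_by_longest(capsules):
    """Hypothetical helper: zero every capsule except the longest, then flatten."""
    lengths = [math.sqrt(sum(x * x for x in cap)) for cap in capsules]
    winner = max(range(len(capsules)), key=lambda k: lengths[k])
    masked = [cap if k == winner else [0.0] * len(cap) for k, cap in enumerate(capsules)]
    flat = [x for cap in masked for x in cap]   # decoder input
    return winner, flat

capsules = [[0.1, 0.0], [0.6, 0.8], [0.0, 0.3]]
winner, flat = mask_by_longest(capsules)
print(winner, flat)
```

With 10 capsules of 16 elements the flattened vector has 160 entries, which matches the `16 * NUM_CLASSES` input size of the decoder defined later in this notebook.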

### Dynamic Routing
![](Routing_Algorithm.png)

### Process
![](Routing_Algorithm_Process.png)

### Explanation
- There are two capsule layers. PrimaryCapsule is layer l and DigitCapsule is layer l+1. There are 1152 PrimaryCapsules and 10 DigitCaps.
    - i ranges from 1 to 1152
    - j ranges from 1 to 10

- The number of routing iterations (r) is a hyperparameter (the paper recommends 3 based on experiments). First, the logits $b_{ij}$ that connect PrimaryCaps to DigitCaps are initialized to 0.
- Passing $b_{ij}$ through a softmax gives the coupling coefficients.

$$c_{ij} = \text{softmax}(b_{ij}) = \frac{\exp(b_{ij})}{\sum_k \exp(b_{ik})}$$
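
As a small worked example of the coupling coefficients, the sketch below applies a softmax to the logits of one lower-level capsule over 10 higher-level capsules. It is plain Python written for illustration; the logit values are made up.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(b - peak) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]

uniform = softmax([0.0] * 10)         # initial state: all logits are 0
skewed = softmax([2.0] + [0.0] * 9)   # one logit has grown after some agreement

print(uniform)
print(skewed)
```

With all logits at zero every coupling coefficient equals 1/10, so the first routing pass treats all higher-level capsules equally; once a logit grows, its coefficient takes a larger share while the coefficients still sum to 1.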

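- Each PrimaryCapsule vector is transformed from 8 to 16 properties by a learned weight matrix (a linear map, not a plain reshape)

$$\hat{u}_{j|i} = W_{ij}u_i$$


$$s_j = \sum_i c_{ij} \hat{u}_{j|i}$$

- $W_{ij}$ is an 8 x 16 weight matrix; multiplying a PrimaryCapsule by it changes the vector size from 8 to 16. Then $\hat{u}_{j|i}$ is multiplied by $c_{ij}$, which determines how much of the 16-dimensional prediction from the PrimaryCapsule is sent to each DigitCapsule. Summing the weighted vectors routed to the same DigitCaps (a weighted sum) gives $s_j$.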
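
The two formulas can be traced by hand on a toy case. The sketch below uses two lower-level capsules of dimension 2 and a single higher-level capsule of dimension 3 (instead of 8 and 16), with column vectors so each matrix is written as 3 x 2; all numbers and the helper `matvec` are assumptions for illustration.

```python
def matvec(W, u):
    """Hypothetical helper: multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, u)) for row in W]

u = [[1.0, 0.0], [0.0, 2.0]]                  # lower-level capsules u_1, u_2
W = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],     # W_1j
    [[0.5, 0.0], [0.0, 0.5], [0.0, 0.0]],     # W_2j
]
c = [0.25, 0.75]                              # coupling coefficients c_1j, c_2j

# Prediction vectors: u_hat_{j|i} = W_ij u_i
u_hat = [matvec(W_i, u_i) for W_i, u_i in zip(W, u)]
# Weighted sum: s_j = sum_i c_ij * u_hat_{j|i}
s = [sum(c_i * uh[d] for c_i, uh in zip(c, u_hat)) for d in range(3)]

print(u_hat)
print(s)
```

The notebook's layer performs the same computation for all 1152 x 10 capsule pairs at once with a batched matrix product.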

$$v_j = \frac{||s_j||^2}{1 + ||s_j||^2} \frac{s_j}{||s_j||}$$

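- Apply the **squash** function. When $s_j$ is short, the output shrinks toward length 0; when it is long, the output has a length slightly below 1. The length of the final vector can therefore express the probability that an entity exists as a value between 0 and 1.
    - L2 norm : probability that the entity exists
    - Elements : properties of the entity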
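
The effect of squash on short and long vectors can be checked numerically. This is a minimal plain-Python sketch; the guard for a zero vector is an assumption added here to avoid a division by zero and is not taken from the notebook.

```python
import math

def squash(s):
    """v = (|s|^2 / (1 + |s|^2)) * (s / |s|)."""
    sq = sum(x * x for x in s)
    if sq == 0.0:                      # assumption: a zero vector is mapped to itself
        return [0.0] * len(s)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in s]

def length(v):
    return math.sqrt(sum(x * x for x in v))

short = squash([0.01, 0.0])     # input length 0.01
long_vec = squash([30.0, 40.0])  # input length 50

print(length(short))     # close to 0
print(length(long_vec))  # just under 1
```

The output length is always $||s||^2 / (1 + ||s||^2)$, while the direction of the vector, which encodes the properties, is unchanged.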
    
- Finally, update $b_{ij}$

$$a_{ij} = v_j \cdot \hat{u}_{j|i}, \ \ \ b_{ij} \leftarrow b_{ij} + \hat{u}_{j|i} \cdot v_j$$

- $a_{ij}$ is called the agreement. Because $\hat{u}_{j|i}$ and $v_j$ have the same number of dimensions, their dot product is a scalar. Adding it to the logit $b_{ij}$ completes one routing iteration.
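
Putting the steps together, one possible plain-Python sketch of the routing loop is shown below. It is not the notebook's implementation: it works on nested lists, uses a toy problem with 3 lower-level and 2 higher-level capsules, and the names `route`, `dot`, and the sample predictions are assumptions for illustration.

```python
import math

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def squash(s):
    sq = sum(x * x for x in s)
    if sq == 0.0:                      # assumption: keep a zero vector at zero
        return [0.0] * len(s)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in s]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def route(u_hat, num_iterations=3):
    """u_hat[i][j] is the prediction of lower capsule i for upper capsule j."""
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]             # logits b_ij start at 0
    for step in range(num_iterations):
        c = [softmax(row) for row in b]                  # c_ij, normalized over j
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in)) for d in range(dim)]
            v.append(squash(s_j))
        if step < num_iterations - 1:                    # no update needed after the last pass
            for i in range(n_in):
                for j in range(n_out):
                    b[i][j] += dot(u_hat[i][j], v[j])    # agreement a_ij
    return v, c

# All three lower capsules agree on upper capsule 0 and disagree on upper capsule 1.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
v, c = route(u_hat)
lengths = [math.sqrt(dot(v_j, v_j)) for v_j in v]
print(lengths)
print(c)
```

The capsule whose predictions agree ends up with the longer output vector and with coupling coefficients above 0.5. The sketch normalizes the logits over the higher-level capsules j, as in the paper; the `CapsuleLayer` in this notebook calls `F.softmax(logits, dim=2)`, which normalizes along the route-node axis instead.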

### 4.1 Capsule Layer

In [4]:
class CapsuleLayer(nn.Module):
    def __init__(self, num_capsules, num_route_nodes, in_channels, out_channels, kernel_size=None, stride=None,
                 num_iterations=NUM_ROUTING_ITERATIONS):
        super(CapsuleLayer, self).__init__()

        self.num_route_nodes = num_route_nodes
        self.num_iterations = num_iterations

        self.num_capsules = num_capsules

        if num_route_nodes != -1:
            # Routing layer: one (in_channels x out_channels) matrix W_ij per (output capsule, route node) pair.
            self.route_weights = nn.Parameter(torch.randn(num_capsules, num_route_nodes, in_channels, out_channels))
        else:
            # Convolutional capsule layer: one Conv2d per capsule dimension.
            self.capsules = nn.ModuleList(
                [nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=0) for _ in
                 range(num_capsules)])

    @staticmethod
    def squash(tensor, dim=-1):
        # v = (|s|^2 / (1 + |s|^2)) * (s / |s|)
        squared_norm = (tensor ** 2).sum(dim=dim, keepdim=True)
        scale = squared_norm / (1 + squared_norm)
        return scale * tensor / torch.sqrt(squared_norm)

    def forward(self, x):
        if self.num_route_nodes != -1:
            # priors (u_hat): [num_capsules, batch, num_route_nodes, 1, out_channels]
            priors = x[None, :, :, None, :] @ self.route_weights[:, None, :, :, :]
            # Routing logits b_ij start at zero.
            logits = Variable(torch.zeros(*priors.size()))
            if torch.cuda.is_available():
                logits = logits.cuda()
            for i in range(self.num_iterations):
                # Coupling coefficients c_ij (softmax along dim=2, the route-node axis).
                probs = F.softmax(logits, dim=2)
                # Weighted sum s_j followed by squash gives v_j.
                outputs = self.squash((probs * priors).sum(dim=2, keepdim=True))

                if i != self.num_iterations - 1:
                    # The agreement u_hat . v_j is added to the logits.
                    delta_logits = (priors * outputs).sum(dim=-1, keepdim=True)
                    logits = logits + delta_logits
        else:
            # Each conv output [batch, out_channels, H, W] is flattened to [batch, out_channels * H * W, 1];
            # concatenation gives [batch, out_channels * H * W, num_capsules].
            outputs = [capsule(x).view(x.size(0), -1, 1) for capsule in self.capsules]
            outputs = torch.cat(outputs, dim=-1)
            outputs = self.squash(outputs)

        return outputs

### 4.2 CapsuleNet

In [5]:
class CapsuleNet(nn.Module):
    def __init__(self):
        super(CapsuleNet, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=256, kernel_size=9, stride=1)
        self.primary_capsules = CapsuleLayer(num_capsules=8, num_route_nodes=-1, in_channels=256, out_channels=32, kernel_size=9, stride=2)
        self.digit_capsules = CapsuleLayer(num_capsules=NUM_CLASSES, num_route_nodes=32 * 6 * 6, in_channels=8, out_channels=16)
        
        self.decoder = nn.Sequential(
            nn.Linear(16 * NUM_CLASSES, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, 1024),
            nn.ReLU(inplace=True),
            nn.Linear(1024, 784),
            nn.Sigmoid()
        )
        
    def forward(self, x, y=None):
        x = F.relu(self.conv1(x), inplace=True)
        x = self.primary_capsules(x)
        x = self.digit_capsules(x).squeeze().transpose(0, 1)
        
        classes = (x ** 2).sum(dim=-1)**0.5
        classes = F.softmax(classes, dim=-1)
        
        if y is None:
            _, max_length_indices = classes.max(dim=1)
            if torch.cuda.is_available():
                y = Variable(torch.eye(NUM_CLASSES)).cuda().index_select(dim=0, index=max_length_indices)
            else:
                y = Variable(torch.eye(NUM_CLASSES)).index_select(dim=0, index=max_length_indices)
        reconstructions = self.decoder((x * y[:, :, None]).view(x.size(0), -1))
        
        return classes, reconstructions

### 5. Loss func
- margin loss + $\alpha$ reconstruction loss ($\alpha$ = 0.0005 in the code below)

$$L_k = T_k \ \text{max}(0, m^{+} - ||v_k||)^2 \ + \ \lambda (1 - T_k) \ \text{max}(0, ||v_k|| - m^{-})^2$$

- $T_k$ = 1 if digit class k is present, 0 otherwise
- $m^{+}$ = 0.9
- $m^{-}$ = 0.1
- $\lambda$ = 0.5
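
A worked example of the margin loss for a single image, written in plain Python for illustration. The capsule lengths are made-up values and the function name `margin_loss` is an assumption; the constants are the ones listed above.

```python
def margin_loss(lengths, target, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Sum of L_k over all classes for one image; `target` is the true class index."""
    total = 0.0
    for k, length in enumerate(lengths):
        if k == target:   # T_k = 1: penalize a true-class capsule shorter than m_plus
            total += max(0.0, m_plus - length) ** 2
        else:             # T_k = 0: penalize other capsules longer than m_minus
            total += lam * max(0.0, length - m_minus) ** 2
    return total

confident = margin_loss([0.05, 0.95, 0.05], target=1)
wrong = margin_loss([0.05, 0.2, 0.8], target=1)
print(confident, wrong)
```

In the first case every capsule is on the correct side of its margin, so the loss is zero. In the second case the true class contributes $(0.9 - 0.2)^2 = 0.49$ and the wrongly active capsule contributes $0.5 \times (0.8 - 0.1)^2 = 0.245$.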

In [6]:
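class CapsuleLoss(nn.Module):
    def __init__(self):
        super(CapsuleLoss, self).__init__()
        # Sum of squared pixel errors; `reduction='sum'` replaces the deprecated `size_average=False`.
        self.reconstruction_loss = nn.MSELoss(reduction='sum')

    def forward(self, images, labels, classes, reconstructions):
        # Margin loss with m+ = 0.9, m- = 0.1 and lambda = 0.5.
        left = F.relu(0.9 - classes, inplace=True) ** 2
        right = F.relu(classes - 0.1, inplace=True) ** 2

        margin_loss = labels * left + 0.5 * (1. - labels) * right
        margin_loss = margin_loss.sum()

        # Flatten the images so their shape matches the [batch, 784] reconstructions.
        images = images.view(reconstructions.size(0), -1)
        reconstruction_loss = self.reconstruction_loss(reconstructions, images)

        return (margin_loss + 0.0005 * reconstruction_loss) / images.size(0)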

## 6. Training

In [7]:
def processor(sample):
    data, labels, training = sample

    data = augmentation(data.unsqueeze(1).float() / 255.0)
    labels = torch.eye(NUM_CLASSES).index_select(dim=0, index=labels)

    data = Variable(data)
    labels = Variable(labels)
    if torch.cuda.is_available():
        data = data.cuda()
        labels = labels.cuda()

    if training:
        classes, reconstructions = model(data, labels)
    else:
        classes, reconstructions = model(data)

    loss = capsule_loss(data, labels, classes, reconstructions)
    return loss, classes

def on_sample(state):
    state['sample'].append(state['train'])
    
def reset_meters():
    meter_accuracy.reset()
    meter_loss.reset()
    confusion_meter.reset()
    
def on_forward(state):
    meter_accuracy.add(state['output'].data, state['sample'][1])
    confusion_meter.add(state['output'].data, state['sample'][1])
    meter_loss.add(state['loss'].item())
    
def on_start_epoch(state):
    reset_meters()
    state['iterator'] = tqdm(state['iterator'])
    
def on_end_epoch(state):
    print("[EPOCH %d] Training Loss : %.4f  (Accuracy : %.2f%%)"%(
        state['epoch'], meter_loss.value()[0], meter_accuracy.value()[0]))
    
    train_loss_logger.log(state['epoch'], meter_loss.value()[0])
    train_accuracy_logger.log(state['epoch'], meter_accuracy.value()[0])
    
    reset_meters()
    
    engine.test(processor, get_iterator(False))
    test_loss_logger.log(state['epoch'], meter_loss.value()[0])
    test_accuracy_logger.log(state['epoch'], meter_accuracy.value()[0])
    confusion_logger.log(confusion_meter.value())
    
    print("[EPOCH %d] Testing Loss : %.4f  (Accuracy : %.2f%%)"%(
        state['epoch'], meter_loss.value()[0], meter_accuracy.value()[0]))
    
    torch.save(model.state_dict(), 'epochs/epoch_%d.pt'%state['epoch'])
    
    # reconstruction visualization
    
    test_sample = next(iter(get_iterator(False)))
    ground_truth = (test_sample[0].unsqueeze(1).float() / 255.0)
    if torch.cuda.is_available():
        _, reconstructions = model(Variable(ground_truth).cuda())
    else:
        _, reconstructions = model(Variable(ground_truth))
    reconstruction = reconstructions.cpu().view_as(ground_truth).data
    
    ground_truth_logger.log(
        make_grid(ground_truth, nrow=int(BATCH_SIZE ** 0.5), normalize=True, range=(0, 1)).numpy())
    reconstruction_logger.log(
        make_grid(reconstruction, nrow=int(BATCH_SIZE ** 0.5), normalize=True, range=(0, 1)).numpy())

### Start the visdom server

```shell
source activate pytorch
python -m visdom.server
```

In [8]:
model_path = os.path.join(os.getcwd(), 'epochs')

if not os.path.isdir(model_path):
    os.mkdir(model_path)   

model = CapsuleNet()
if torch.cuda.is_available():
    model.cuda()
    
print("# Parameters : ", sum(param.numel() for param in model.parameters()))

# Parameters :  8215568


In [9]:
optimizer = Adam(model.parameters())

engine = Engine()
meter_loss = tnt.meter.AverageValueMeter()
meter_accuracy = tnt.meter.ClassErrorMeter(accuracy=True)
confusion_meter = tnt.meter.ConfusionMeter(NUM_CLASSES, normalized=True)

train_loss_logger = VisdomPlotLogger('line', opts={'title': 'Train Loss'})
train_accuracy_logger = VisdomPlotLogger('line', opts={'title': 'Train Accuracy'})
test_loss_logger = VisdomPlotLogger('line', opts={'title' : 'Test Loss'})
test_accuracy_logger = VisdomPlotLogger('line', opts = {'title': 'Test Accuracy'})

confusion_logger = VisdomLogger(
    'heatmap', 
    opts = {'title' : 'Confusion Matrix',
            'columnnames' : list(range(NUM_CLASSES)),
            'rownames' : list(range(NUM_CLASSES))
})

ground_truth_logger = VisdomLogger('image', opts = {'title' : 'Ground Truth'})
reconstruction_logger = VisdomLogger('image', opts = {'title' : 'Reconstruction'})

capsule_loss = CapsuleLoss()

engine.hooks['on_sample'] = on_sample
engine.hooks['on_forward'] = on_forward
engine.hooks['on_start_epoch'] = on_start_epoch
engine.hooks['on_end_epoch'] = on_end_epoch

engine.train(processor, get_iterator(True), maxepoch=NUM_EPOCHS, optimizer=optimizer)

100%|██████████| 600/600 [03:20<00:00,  2.99it/s]


[EPOCH 1] Training Loss : 0.5207  (Accuracy : 89.00%)
[EPOCH 1] Testing Loss : 0.4849  (Accuracy : 97.83%)


100%|██████████| 600/600 [03:20<00:00,  2.99it/s]

[EPOCH 2] Training Loss : 0.4825  (Accuracy : 97.88%)





[EPOCH 2] Testing Loss : 0.4791  (Accuracy : 98.26%)


100%|██████████| 600/600 [03:20<00:00,  3.00it/s]

[EPOCH 3] Training Loss : 0.4779  (Accuracy : 98.34%)





[EPOCH 3] Testing Loss : 0.4766  (Accuracy : 98.57%)


100%|██████████| 600/600 [03:20<00:00,  2.99it/s]

[EPOCH 4] Training Loss : 0.4752  (Accuracy : 98.73%)





[EPOCH 4] Testing Loss : 0.4740  (Accuracy : 98.83%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 5] Training Loss : 0.4733  (Accuracy : 98.81%)





[EPOCH 5] Testing Loss : 0.4729  (Accuracy : 98.79%)


100%|██████████| 600/600 [03:20<00:00,  3.00it/s]

[EPOCH 6] Training Loss : 0.4716  (Accuracy : 98.99%)





[EPOCH 6] Testing Loss : 0.4718  (Accuracy : 98.79%)


100%|██████████| 600/600 [03:20<00:00,  3.00it/s]

[EPOCH 7] Training Loss : 0.4707  (Accuracy : 99.04%)





[EPOCH 7] Testing Loss : 0.4702  (Accuracy : 98.97%)


100%|██████████| 600/600 [03:20<00:00,  2.99it/s]

[EPOCH 8] Training Loss : 0.4692  (Accuracy : 99.21%)





[EPOCH 8] Testing Loss : 0.4691  (Accuracy : 99.17%)


100%|██████████| 600/600 [03:20<00:00,  2.99it/s]

[EPOCH 9] Training Loss : 0.4686  (Accuracy : 99.18%)





[EPOCH 9] Testing Loss : 0.4681  (Accuracy : 99.24%)


100%|██████████| 600/600 [03:20<00:00,  2.99it/s]

[EPOCH 10] Training Loss : 0.4678  (Accuracy : 99.27%)





[EPOCH 10] Testing Loss : 0.4679  (Accuracy : 99.16%)


100%|██████████| 600/600 [03:20<00:00,  2.99it/s]

[EPOCH 11] Training Loss : 0.4672  (Accuracy : 99.33%)





[EPOCH 11] Testing Loss : 0.4669  (Accuracy : 99.29%)


100%|██████████| 600/600 [03:20<00:00,  2.99it/s]

[EPOCH 12] Training Loss : 0.4663  (Accuracy : 99.35%)





[EPOCH 12] Testing Loss : 0.4661  (Accuracy : 99.29%)


100%|██████████| 600/600 [03:20<00:00,  2.99it/s]

[EPOCH 13] Training Loss : 0.4655  (Accuracy : 99.40%)





[EPOCH 13] Testing Loss : 0.4655  (Accuracy : 99.22%)


100%|██████████| 600/600 [03:20<00:00,  2.99it/s]

[EPOCH 14] Training Loss : 0.4650  (Accuracy : 99.44%)





[EPOCH 14] Testing Loss : 0.4647  (Accuracy : 99.43%)


100%|██████████| 600/600 [03:20<00:00,  2.99it/s]

[EPOCH 15] Training Loss : 0.4646  (Accuracy : 99.42%)





[EPOCH 15] Testing Loss : 0.4650  (Accuracy : 99.26%)


100%|██████████| 600/600 [03:20<00:00,  2.99it/s]

[EPOCH 16] Training Loss : 0.4639  (Accuracy : 99.49%)





[EPOCH 16] Testing Loss : 0.4645  (Accuracy : 99.30%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 17] Training Loss : 0.4637  (Accuracy : 99.48%)





[EPOCH 17] Testing Loss : 0.4642  (Accuracy : 99.34%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 18] Training Loss : 0.4634  (Accuracy : 99.48%)





[EPOCH 18] Testing Loss : 0.4638  (Accuracy : 99.33%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 19] Training Loss : 0.4630  (Accuracy : 99.56%)





[EPOCH 19] Testing Loss : 0.4633  (Accuracy : 99.36%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 20] Training Loss : 0.4627  (Accuracy : 99.59%)





[EPOCH 20] Testing Loss : 0.4633  (Accuracy : 99.29%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 21] Training Loss : 0.4625  (Accuracy : 99.58%)





[EPOCH 21] Testing Loss : 0.4628  (Accuracy : 99.41%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 22] Training Loss : 0.4620  (Accuracy : 99.65%)





[EPOCH 22] Testing Loss : 0.4628  (Accuracy : 99.36%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 23] Training Loss : 0.4618  (Accuracy : 99.67%)





[EPOCH 23] Testing Loss : 0.4625  (Accuracy : 99.41%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 24] Training Loss : 0.4617  (Accuracy : 99.65%)





[EPOCH 24] Testing Loss : 0.4624  (Accuracy : 99.35%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 25] Training Loss : 0.4614  (Accuracy : 99.67%)





[EPOCH 25] Testing Loss : 0.4619  (Accuracy : 99.35%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 26] Training Loss : 0.4612  (Accuracy : 99.65%)





[EPOCH 26] Testing Loss : 0.4617  (Accuracy : 99.41%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 27] Training Loss : 0.4610  (Accuracy : 99.68%)





[EPOCH 27] Testing Loss : 0.4618  (Accuracy : 99.45%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 28] Training Loss : 0.4609  (Accuracy : 99.67%)





[EPOCH 28] Testing Loss : 0.4616  (Accuracy : 99.43%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 29] Training Loss : 0.4605  (Accuracy : 99.72%)





[EPOCH 29] Testing Loss : 0.4616  (Accuracy : 99.43%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 30] Training Loss : 0.4604  (Accuracy : 99.73%)





[EPOCH 30] Testing Loss : 0.4620  (Accuracy : 99.21%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 31] Training Loss : 0.4604  (Accuracy : 99.73%)





[EPOCH 31] Testing Loss : 0.4612  (Accuracy : 99.50%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 32] Training Loss : 0.4603  (Accuracy : 99.73%)





[EPOCH 32] Testing Loss : 0.4611  (Accuracy : 99.42%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 33] Training Loss : 0.4601  (Accuracy : 99.71%)





[EPOCH 33] Testing Loss : 0.4614  (Accuracy : 99.38%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 34] Training Loss : 0.4599  (Accuracy : 99.76%)





[EPOCH 34] Testing Loss : 0.4609  (Accuracy : 99.44%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 35] Training Loss : 0.4599  (Accuracy : 99.78%)





[EPOCH 35] Testing Loss : 0.4607  (Accuracy : 99.47%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 36] Training Loss : 0.4597  (Accuracy : 99.79%)





[EPOCH 36] Testing Loss : 0.4604  (Accuracy : 99.48%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 37] Training Loss : 0.4596  (Accuracy : 99.77%)





[EPOCH 37] Testing Loss : 0.4604  (Accuracy : 99.49%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 38] Training Loss : 0.4595  (Accuracy : 99.77%)





[EPOCH 38] Testing Loss : 0.4602  (Accuracy : 99.36%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 39] Training Loss : 0.4592  (Accuracy : 99.79%)





[EPOCH 39] Testing Loss : 0.4603  (Accuracy : 99.32%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 40] Training Loss : 0.4591  (Accuracy : 99.78%)





[EPOCH 40] Testing Loss : 0.4600  (Accuracy : 99.42%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 41] Training Loss : 0.4590  (Accuracy : 99.80%)





[EPOCH 41] Testing Loss : 0.4603  (Accuracy : 99.35%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 42] Training Loss : 0.4589  (Accuracy : 99.78%)





[EPOCH 42] Testing Loss : 0.4601  (Accuracy : 99.40%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 43] Training Loss : 0.4588  (Accuracy : 99.83%)





[EPOCH 43] Testing Loss : 0.4598  (Accuracy : 99.39%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 44] Training Loss : 0.4587  (Accuracy : 99.84%)





[EPOCH 44] Testing Loss : 0.4597  (Accuracy : 99.42%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 45] Training Loss : 0.4586  (Accuracy : 99.83%)





[EPOCH 45] Testing Loss : 0.4595  (Accuracy : 99.49%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 46] Training Loss : 0.4586  (Accuracy : 99.82%)





[EPOCH 46] Testing Loss : 0.4600  (Accuracy : 99.35%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 47] Training Loss : 0.4585  (Accuracy : 99.87%)





[EPOCH 47] Testing Loss : 0.4597  (Accuracy : 99.40%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 48] Training Loss : 0.4584  (Accuracy : 99.85%)





[EPOCH 48] Testing Loss : 0.4596  (Accuracy : 99.57%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 49] Training Loss : 0.4584  (Accuracy : 99.86%)





[EPOCH 49] Testing Loss : 0.4594  (Accuracy : 99.50%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 50] Training Loss : 0.4583  (Accuracy : 99.85%)





[EPOCH 50] Testing Loss : 0.4596  (Accuracy : 99.39%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 51] Training Loss : 0.4583  (Accuracy : 99.85%)





[EPOCH 51] Testing Loss : 0.4595  (Accuracy : 99.48%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 52] Training Loss : 0.4582  (Accuracy : 99.87%)





[EPOCH 52] Testing Loss : 0.4593  (Accuracy : 99.50%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 53] Training Loss : 0.4582  (Accuracy : 99.84%)





[EPOCH 53] Testing Loss : 0.4595  (Accuracy : 99.40%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 54] Training Loss : 0.4580  (Accuracy : 99.87%)





[EPOCH 54] Testing Loss : 0.4591  (Accuracy : 99.54%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 55] Training Loss : 0.4581  (Accuracy : 99.84%)





[EPOCH 55] Testing Loss : 0.4593  (Accuracy : 99.44%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 56] Training Loss : 0.4579  (Accuracy : 99.88%)





[EPOCH 56] Testing Loss : 0.4591  (Accuracy : 99.49%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 57] Training Loss : 0.4579  (Accuracy : 99.86%)





[EPOCH 57] Testing Loss : 0.4592  (Accuracy : 99.44%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 58] Training Loss : 0.4579  (Accuracy : 99.89%)





[EPOCH 58] Testing Loss : 0.4591  (Accuracy : 99.52%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 59] Training Loss : 0.4577  (Accuracy : 99.88%)





[EPOCH 59] Testing Loss : 0.4590  (Accuracy : 99.43%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 60] Training Loss : 0.4576  (Accuracy : 99.89%)





[EPOCH 60] Testing Loss : 0.4588  (Accuracy : 99.42%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 61] Training Loss : 0.4576  (Accuracy : 99.87%)





[EPOCH 61] Testing Loss : 0.4590  (Accuracy : 99.35%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 62] Training Loss : 0.4576  (Accuracy : 99.87%)





[EPOCH 62] Testing Loss : 0.4587  (Accuracy : 99.45%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 63] Training Loss : 0.4575  (Accuracy : 99.89%)





[EPOCH 63] Testing Loss : 0.4583  (Accuracy : 99.51%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 64] Training Loss : 0.4573  (Accuracy : 99.91%)





[EPOCH 64] Testing Loss : 0.4586  (Accuracy : 99.55%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 65] Training Loss : 0.4574  (Accuracy : 99.88%)





[EPOCH 65] Testing Loss : 0.4584  (Accuracy : 99.54%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 66] Training Loss : 0.4572  (Accuracy : 99.89%)





[EPOCH 66] Testing Loss : 0.4585  (Accuracy : 99.51%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 67] Training Loss : 0.4573  (Accuracy : 99.91%)





[EPOCH 67] Testing Loss : 0.4585  (Accuracy : 99.51%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 68] Training Loss : 0.4571  (Accuracy : 99.91%)





[EPOCH 68] Testing Loss : 0.4582  (Accuracy : 99.48%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 69] Training Loss : 0.4570  (Accuracy : 99.93%)





[EPOCH 69] Testing Loss : 0.4582  (Accuracy : 99.48%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 70] Training Loss : 0.4570  (Accuracy : 99.89%)





[EPOCH 70] Testing Loss : 0.4583  (Accuracy : 99.47%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 71] Training Loss : 0.4570  (Accuracy : 99.90%)





[EPOCH 71] Testing Loss : 0.4584  (Accuracy : 99.48%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 72] Training Loss : 0.4568  (Accuracy : 99.91%)





[EPOCH 72] Testing Loss : 0.4583  (Accuracy : 99.50%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 73] Training Loss : 0.4569  (Accuracy : 99.89%)





[EPOCH 73] Testing Loss : 0.4581  (Accuracy : 99.60%)


100%|██████████| 600/600 [03:22<00:00,  2.97it/s]

[EPOCH 74] Training Loss : 0.4568  (Accuracy : 99.90%)





[EPOCH 74] Testing Loss : 0.4582  (Accuracy : 99.46%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 75] Training Loss : 0.4566  (Accuracy : 99.91%)





[EPOCH 75] Testing Loss : 0.4583  (Accuracy : 99.42%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 76] Training Loss : 0.4567  (Accuracy : 99.91%)





[EPOCH 76] Testing Loss : 0.4583  (Accuracy : 99.34%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 77] Training Loss : 0.4566  (Accuracy : 99.91%)





[EPOCH 77] Testing Loss : 0.4581  (Accuracy : 99.52%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 78] Training Loss : 0.4566  (Accuracy : 99.91%)





[EPOCH 78] Testing Loss : 0.4579  (Accuracy : 99.53%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 79] Training Loss : 0.4565  (Accuracy : 99.92%)





[EPOCH 79] Testing Loss : 0.4578  (Accuracy : 99.63%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 80] Training Loss : 0.4564  (Accuracy : 99.92%)





[EPOCH 80] Testing Loss : 0.4579  (Accuracy : 99.48%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 81] Training Loss : 0.4565  (Accuracy : 99.91%)





[EPOCH 81] Testing Loss : 0.4579  (Accuracy : 99.36%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 82] Training Loss : 0.4565  (Accuracy : 99.91%)





[EPOCH 82] Testing Loss : 0.4577  (Accuracy : 99.55%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 83] Training Loss : 0.4564  (Accuracy : 99.93%)





[EPOCH 83] Testing Loss : 0.4575  (Accuracy : 99.59%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 84] Training Loss : 0.4564  (Accuracy : 99.92%)





[EPOCH 84] Testing Loss : 0.4577  (Accuracy : 99.53%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 85] Training Loss : 0.4562  (Accuracy : 99.93%)





[EPOCH 85] Testing Loss : 0.4576  (Accuracy : 99.52%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 86] Training Loss : 0.4562  (Accuracy : 99.94%)





[EPOCH 86] Testing Loss : 0.4577  (Accuracy : 99.36%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 87] Training Loss : 0.4563  (Accuracy : 99.93%)





[EPOCH 87] Testing Loss : 0.4578  (Accuracy : 99.47%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 88] Training Loss : 0.4562  (Accuracy : 99.94%)





[EPOCH 88] Testing Loss : 0.4576  (Accuracy : 99.51%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 89] Training Loss : 0.4562  (Accuracy : 99.92%)





[EPOCH 89] Testing Loss : 0.4575  (Accuracy : 99.49%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 90] Training Loss : 0.4561  (Accuracy : 99.93%)





[EPOCH 90] Testing Loss : 0.4577  (Accuracy : 99.50%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 91] Training Loss : 0.4561  (Accuracy : 99.92%)





[EPOCH 91] Testing Loss : 0.4573  (Accuracy : 99.54%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 92] Training Loss : 0.4561  (Accuracy : 99.94%)





[EPOCH 92] Testing Loss : 0.4575  (Accuracy : 99.48%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 93] Training Loss : 0.4560  (Accuracy : 99.94%)





[EPOCH 93] Testing Loss : 0.4574  (Accuracy : 99.50%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 94] Training Loss : 0.4560  (Accuracy : 99.93%)





[EPOCH 94] Testing Loss : 0.4578  (Accuracy : 99.54%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 95] Training Loss : 0.4559  (Accuracy : 99.92%)





[EPOCH 95] Testing Loss : 0.4577  (Accuracy : 99.50%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 96] Training Loss : 0.4559  (Accuracy : 99.94%)





[EPOCH 96] Testing Loss : 0.4577  (Accuracy : 99.51%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 97] Training Loss : 0.4560  (Accuracy : 99.93%)





[EPOCH 97] Testing Loss : 0.4574  (Accuracy : 99.59%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 98] Training Loss : 0.4559  (Accuracy : 99.94%)





[EPOCH 98] Testing Loss : 0.4573  (Accuracy : 99.46%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 99] Training Loss : 0.4560  (Accuracy : 99.91%)





[EPOCH 99] Testing Loss : 0.4575  (Accuracy : 99.49%)


100%|██████████| 600/600 [03:21<00:00,  2.97it/s]

[EPOCH 100] Training Loss : 0.4559  (Accuracy : 99.94%)





[EPOCH 100] Testing Loss : 0.4573  (Accuracy : 99.61%)


{'epoch': 100,
 'iterator': 100%|██████████| 600/600 [03:33<00:00,  2.81it/s],
 'loss': None,
 'maxepoch': 100,
 'network': <function __main__.processor>,
 'optimizer': <torch.optim.adam.Adam at 0x7fecd2ca50b8>,
 'output': None,
 'sample': [
  ( 0 ,.,.) = 
      0    0    0  ...     0    0    0
      0    0    0  ...     0    0    0
      0    0    0  ...     0    0    0
        ...         ⋱        ...      
      0    0    0  ...     0    0    0
      0    0    0  ...     0    0    0
      0    0    0  ...     0    0    0
  
  ( 1 ,.,.) = 
      0    0    0  ...     0    0    0
      0    0    0  ...     0    0    0
      0    0    0  ...     0    0    0
        ...         ⋱        ...      
      0    0    0  ...     0    0    0
      0    0    0  ...     0    0    0
      0    0    0  ...     0    0    0
  
  ( 2 ,.,.) = 
      0    0    0  ...     0    0    0
      0    0    0  ...     0    0    0
      0    0    0  ...     0    0    0
        ...         ⋱        ...      
     