*Accompanying code examples of the book "Introduction to Artificial Neural Networks and Deep Learning: A Practical Guide with Applications in Python" by [Sebastian Raschka](https://sebastianraschka.com). All code examples are released under the [MIT license](https://github.com/rasbt/deep-learning-book/blob/master/LICENSE). If you find this content useful, please consider supporting the work by buying a [copy of the book](https://leanpub.com/ann-and-deeplearning).*
  
Other code examples and content are available on [GitHub](https://github.com/rasbt/deep-learning-book). The PDF and ebook versions of the book are available through [Leanpub](https://leanpub.com/ann-and-deeplearning).

In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch

Sebastian Raschka 

CPython 3.6.3
IPython 6.2.1

torch 0.3.0.post4


# Model Zoo -- Convolutional ResNet and Residual Blocks

Please note that this example does not implement a really deep ResNet as described in literature but rather illustrates how the residual blocks described in He et al. [1] can be implemented in PyTorch.

- [1] He, Kaiming, et al. "Deep residual learning for image recognition." *Proceedings of the IEEE conference on computer vision and pattern recognition*. 2016.

## Dataset

In [2]:
from torchvision import datasets
from torchvision import transforms
from torch.utils.data import DataLoader
import torch
import numpy as np


##########################
### SETTINGS
##########################

# Hyperparameters
learning_rate = 0.01
num_epochs = 10
batch_size = 128

# Architecture
num_classes = 10

# Other
random_seed = 123


##########################
### MNIST DATASET
##########################

# Note transforms.ToTensor() scales input images
# to 0-1 range
train_dataset = datasets.MNIST(root='data', 
                               train=True, 
                               transform=transforms.ToTensor(),
                               download=True)

test_dataset = datasets.MNIST(root='data', 
                              train=False, 
                              transform=transforms.ToTensor())


train_loader = DataLoader(dataset=train_dataset, 
                          batch_size=batch_size, 
                          shuffle=True)

test_loader = DataLoader(dataset=test_dataset, 
                         batch_size=batch_size, 
                         shuffle=False)

# Checking the dataset
for images, labels in train_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    break

Image batch dimensions: torch.Size([128, 1, 28, 28])
Image label dimensions: torch.Size([128])


## ResNet with identity blocks

The following code implements the residual blocks with skip connections such that the input passed via the shortcut matches the dimensions of the main path's output, which allows the network to learn identity functions. Such a residual block is illustrated below:

![](images/resnets/resnet-ex-1-1.png)

In [3]:
##########################
### MODEL
##########################



class ConvNet(torch.nn.Module):

    def __init__(self, num_classes):
        super(ConvNet, self).__init__()
        
        #########################
        ### 1st residual block
        #########################
        # 28x28x1 => 28x28x4
        self.conv_1 = torch.nn.Conv2d(in_channels=1,
                                      out_channels=4,
                                      kernel_size=(1, 1),
                                      stride=(1, 1),
                                      padding=0)
        self.conv_1_bn = torch.nn.BatchNorm2d(4)
                                    
        # 28x28x4 => 28x28x1
        self.conv_2 = torch.nn.Conv2d(in_channels=4,
                                      out_channels=1,
                                      kernel_size=(3, 3),
                                      stride=(1, 1),
                                      padding=1)   
        self.conv_2_bn = torch.nn.BatchNorm2d(1)
        
        
        #########################
        ### 2nd residual block
        #########################
        # 28x28x1 => 28x28x4
        self.conv_3 = torch.nn.Conv2d(in_channels=1,
                                      out_channels=4,
                                      kernel_size=(1, 1),
                                      stride=(1, 1),
                                      padding=0)
        self.conv_3_bn = torch.nn.BatchNorm2d(4)
                                    
        # 28x28x4 => 28x28x1
        self.conv_4 = torch.nn.Conv2d(in_channels=4,
                                      out_channels=1,
                                      kernel_size=(3, 3),
                                      stride=(1, 1),
                                      padding=1)   
        self.conv_4_bn = torch.nn.BatchNorm2d(1)

        #########################
        ### Fully connected
        #########################        
        self.linear_1 = torch.nn.Linear(28*28*1, num_classes)

        
    def forward(self, x):
        
        #########################
        ### 1st residual block
        #########################
        shortcut = x
        
        out = self.conv_1(x)
        out = self.conv_1_bn(out)
        out = F.relu(out)

        out = self.conv_2(out)
        out = self.conv_2_bn(out)
        
        out += shortcut
        out = F.relu(out)
        
        #########################
        ### 2nd residual block
        #########################
        
        shortcut = x
        
        out = self.conv_3(x)
        out = self.conv_3_bn(out)
        out = F.relu(out)

        out = self.conv_4(out)
        out = self.conv_4_bn(out)
        
        out += shortcut
        out = F.relu(out)
        
        #########################
        ### Fully connected
        #########################   
        logits = self.linear_1(out.view(-1, 28*28*1))
        probas = F.softmax(logits, dim=1)
        return logits, probas

    
torch.manual_seed(random_seed)
model = ConvNet(num_classes=num_classes)

if torch.cuda.is_available():
    model.cuda()
    

##########################
### COST AND OPTIMIZER
##########################

cost_fn = torch.nn.CrossEntropyLoss()  
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  

In [4]:
from torch.autograd import Variable
import torch.nn.functional as F

def compute_accuracy(model, data_loader):
    correct_pred, num_examples = 0, 0
    for features, targets in data_loader:
        features = Variable(features)
        if torch.cuda.is_available():
            features = features.cuda()
        logits, probas = model(features)
        _, predicted_labels = torch.max(probas.data, 1)
        num_examples += targets.size(0)
        correct_pred += (predicted_labels.cpu() == targets).sum()
    return correct_pred/num_examples * 100
    

for epoch in range(num_epochs):
    for batch_idx, (features, targets) in enumerate(train_loader):
        
        features = Variable(features)
        targets = Variable(targets)
        
        if torch.cuda.is_available():
            features, targets = features.cuda(), targets.cuda()
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = cost_fn(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 50:
            print ('Epoch: %03d/%03d | Batch %03d/%03d | Cost: %.4f' 
                   %(epoch+1, num_epochs, batch_idx, 
                     len(train_dataset)//batch_size, cost.data[0]))

    model.eval() # eval mode to prevent upd. batchnorm params during inference
    print('Epoch: %03d/%03d training accuracy: %.2f%%' % (
          epoch+1, num_epochs, 
          compute_accuracy(model, train_loader)))
    model.train()
model.eval()

Epoch: 001/010 | Batch 000/468 | Cost: 2.3832
Epoch: 001/010 | Batch 050/468 | Cost: 0.2933
Epoch: 001/010 | Batch 100/468 | Cost: 0.3032
Epoch: 001/010 | Batch 150/468 | Cost: 0.3298
Epoch: 001/010 | Batch 200/468 | Cost: 0.2957
Epoch: 001/010 | Batch 250/468 | Cost: 0.3183
Epoch: 001/010 | Batch 300/468 | Cost: 0.2343
Epoch: 001/010 | Batch 350/468 | Cost: 0.3815
Epoch: 001/010 | Batch 400/468 | Cost: 0.2736
Epoch: 001/010 | Batch 450/468 | Cost: 0.4884
Epoch: 001/010 training accuracy: 92.08%
Epoch: 002/010 | Batch 000/468 | Cost: 0.2531
Epoch: 002/010 | Batch 050/468 | Cost: 0.2248
Epoch: 002/010 | Batch 100/468 | Cost: 0.2792
Epoch: 002/010 | Batch 150/468 | Cost: 0.2112
Epoch: 002/010 | Batch 200/468 | Cost: 0.4429
Epoch: 002/010 | Batch 250/468 | Cost: 0.3217
Epoch: 002/010 | Batch 300/468 | Cost: 0.2864
Epoch: 002/010 | Batch 350/468 | Cost: 0.2211
Epoch: 002/010 | Batch 400/468 | Cost: 0.3079
Epoch: 002/010 | Batch 450/468 | Cost: 0.2156
Epoch: 002/010 training accuracy: 92.12

ConvNet(
  (conv_1): Conv2d (1, 4, kernel_size=(1, 1), stride=(1, 1))
  (conv_1_bn): BatchNorm2d(4, eps=1e-05, momentum=0.1, affine=True)
  (conv_2): Conv2d (4, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_2_bn): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True)
  (conv_3): Conv2d (1, 4, kernel_size=(1, 1), stride=(1, 1))
  (conv_3_bn): BatchNorm2d(4, eps=1e-05, momentum=0.1, affine=True)
  (conv_4): Conv2d (4, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_4_bn): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True)
  (linear_1): Linear(in_features=784, out_features=10)
)

In [5]:
print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader)))

Test accuracy: 92.67%


## ResNet with convolutional blocks for resizing

The following code implements the residual blocks with skip connections such that the input passed via the shortcut matches is resized to dimensions of the main path's output. Such a residual block is illustrated below:

![](images/resnets/resnet-ex-1-2.png)

In [6]:
##########################
### MODEL
##########################



class ConvNet(torch.nn.Module):

    def __init__(self, num_classes):
        super(ConvNet, self).__init__()
        
        #########################
        ### 1st residual block
        #########################
        # 28x28x1 => 14x14x4 
        self.conv_1 = torch.nn.Conv2d(in_channels=1,
                                      out_channels=4,
                                      kernel_size=(3, 3),
                                      stride=(2, 2),
                                      padding=1)
        self.conv_1_bn = torch.nn.BatchNorm2d(4)
                                    
        # 14x14x4 => 14x14x8
        self.conv_2 = torch.nn.Conv2d(in_channels=4,
                                      out_channels=8,
                                      kernel_size=(1, 1),
                                      stride=(1, 1),
                                      padding=0)   
        self.conv_2_bn = torch.nn.BatchNorm2d(8)
        
        # 28x28x1 => 14x14x8
        self.conv_shortcut_1 = torch.nn.Conv2d(in_channels=1,
                                               out_channels=8,
                                               kernel_size=(1, 1),
                                               stride=(2, 2),
                                               padding=0)   
        self.conv_shortcut_1_bn = torch.nn.BatchNorm2d(8)
        
        #########################
        ### 2nd residual block
        #########################
        # 14x14x8 => 7x7x16 
        self.conv_3 = torch.nn.Conv2d(in_channels=8,
                                      out_channels=16,
                                      kernel_size=(3, 3),
                                      stride=(2, 2),
                                      padding=1)
        self.conv_3_bn = torch.nn.BatchNorm2d(16)
                                    
        # 7x7x16 => 7x7x32
        self.conv_4 = torch.nn.Conv2d(in_channels=16,
                                      out_channels=32,
                                      kernel_size=(1, 1),
                                      stride=(1, 1),
                                      padding=0)   
        self.conv_4_bn = torch.nn.BatchNorm2d(32)
        
        # 14x14x8 => 7x7x32 
        self.conv_shortcut_2 = torch.nn.Conv2d(in_channels=8,
                                               out_channels=32,
                                               kernel_size=(1, 1),
                                               stride=(2, 2),
                                               padding=0)   
        self.conv_shortcut_2_bn = torch.nn.BatchNorm2d(32)

        #########################
        ### Fully connected
        #########################        
        self.linear_1 = torch.nn.Linear(7*7*32, num_classes)

        
    def forward(self, x):
        
        #########################
        ### 1st residual block
        #########################
        shortcut = x
        
        out = self.conv_1(x) # 28x28x1 => 14x14x4 
        out = self.conv_1_bn(out)
        out = F.relu(out)

        out = self.conv_2(out) # 14x14x4 => 714x14x8
        out = self.conv_2_bn(out)
        
        # match up dimensions using a linear function (no relu)
        shortcut = self.conv_shortcut_1(shortcut)
        shortcut = self.conv_shortcut_1_bn(shortcut)
        
        out += shortcut
        out = F.relu(out)
        
        #########################
        ### 2nd residual block
        #########################
        
        shortcut = out
        
        out = self.conv_3(out) # 14x14x8 => 7x7x16 
        out = self.conv_3_bn(out)
        out = F.relu(out)

        out = self.conv_4(out) # 7x7x16 => 7x7x32
        out = self.conv_4_bn(out)
        
        # match up dimensions using a linear function (no relu)
        shortcut = self.conv_shortcut_2(shortcut)
        shortcut = self.conv_shortcut_2_bn(shortcut)
        
        out += shortcut
        out = F.relu(out)
        
        #########################
        ### Fully connected
        #########################   
        logits = self.linear_1(out.view(-1, 7*7*32))
        probas = F.softmax(logits, dim=1)
        return logits, probas

    
torch.manual_seed(random_seed)
model = ConvNet(num_classes=num_classes)

if torch.cuda.is_available():
    model.cuda()
    

##########################
### COST AND OPTIMIZER
##########################

cost_fn = torch.nn.CrossEntropyLoss()  
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  

In [7]:
from torch.autograd import Variable
import torch.nn.functional as F

def compute_accuracy(model, data_loader):
    correct_pred, num_examples = 0, 0
    for features, targets in data_loader:
        features = Variable(features)
        if torch.cuda.is_available():
            features = features.cuda()
        logits, probas = model(features)
        _, predicted_labels = torch.max(probas.data, 1)
        num_examples += targets.size(0)
        correct_pred += (predicted_labels.cpu() == targets).sum()
    return correct_pred/num_examples * 100
    

for epoch in range(num_epochs):
    for batch_idx, (features, targets) in enumerate(train_loader):
        
        features = Variable(features)
        targets = Variable(targets)
        
        if torch.cuda.is_available():
            features, targets = features.cuda(), targets.cuda()
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = cost_fn(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 50:
            print ('Epoch: %03d/%03d | Batch %03d/%03d | Cost: %.4f' 
                   %(epoch+1, num_epochs, batch_idx, 
                     len(train_dataset)//batch_size, cost.data[0]))

    model.eval() # eval mode to prevent upd. batchnorm params during inference
    print('Epoch: %03d/%03d training accuracy: %.2f%%' % (
          epoch+1, num_epochs, 
          compute_accuracy(model, train_loader)))
    model.train()
model.eval()

Epoch: 001/010 | Batch 000/468 | Cost: 2.3215
Epoch: 001/010 | Batch 050/468 | Cost: 0.1409
Epoch: 001/010 | Batch 100/468 | Cost: 0.1001
Epoch: 001/010 | Batch 150/468 | Cost: 0.0979
Epoch: 001/010 | Batch 200/468 | Cost: 0.0845
Epoch: 001/010 | Batch 250/468 | Cost: 0.0924
Epoch: 001/010 | Batch 300/468 | Cost: 0.0319
Epoch: 001/010 | Batch 350/468 | Cost: 0.2797
Epoch: 001/010 | Batch 400/468 | Cost: 0.1258
Epoch: 001/010 | Batch 450/468 | Cost: 0.1133
Epoch: 001/010 training accuracy: 97.74%
Epoch: 002/010 | Batch 000/468 | Cost: 0.0779
Epoch: 002/010 | Batch 050/468 | Cost: 0.0705
Epoch: 002/010 | Batch 100/468 | Cost: 0.1595
Epoch: 002/010 | Batch 150/468 | Cost: 0.1233
Epoch: 002/010 | Batch 200/468 | Cost: 0.0170
Epoch: 002/010 | Batch 250/468 | Cost: 0.0441
Epoch: 002/010 | Batch 300/468 | Cost: 0.0719
Epoch: 002/010 | Batch 350/468 | Cost: 0.0515
Epoch: 002/010 | Batch 400/468 | Cost: 0.0204
Epoch: 002/010 | Batch 450/468 | Cost: 0.0486
Epoch: 002/010 training accuracy: 97.81

ConvNet(
  (conv_1): Conv2d (1, 4, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
  (conv_1_bn): BatchNorm2d(4, eps=1e-05, momentum=0.1, affine=True)
  (conv_2): Conv2d (4, 8, kernel_size=(1, 1), stride=(1, 1))
  (conv_2_bn): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True)
  (conv_shortcut_1): Conv2d (1, 8, kernel_size=(1, 1), stride=(2, 2))
  (conv_shortcut_1_bn): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True)
  (conv_3): Conv2d (8, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
  (conv_3_bn): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True)
  (conv_4): Conv2d (16, 32, kernel_size=(1, 1), stride=(1, 1))
  (conv_4_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True)
  (conv_shortcut_2): Conv2d (8, 32, kernel_size=(1, 1), stride=(2, 2))
  (conv_shortcut_2_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True)
  (linear_1): Linear(in_features=1568, out_features=10)
)

In [8]:
print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader)))

Test accuracy: 98.39%


## ResNet with convolutional blocks for resizing (using a helper class)

This is the same network as above but uses a `ResidualBlock` helper class.

In [9]:
class ResidualBlock(torch.nn.Module):

    def __init__(self, channels):
        
        super(ResidualBlock, self).__init__()
        self.conv_1 = torch.nn.Conv2d(in_channels=channels[0],
                                      out_channels=channels[1],
                                      kernel_size=(3, 3),
                                      stride=(2, 2),
                                      padding=1)
        self.conv_1_bn = torch.nn.BatchNorm2d(channels[1])
                                    
        self.conv_2 = torch.nn.Conv2d(in_channels=channels[1],
                                      out_channels=channels[2],
                                      kernel_size=(1, 1),
                                      stride=(1, 1),
                                      padding=0)   
        self.conv_2_bn = torch.nn.BatchNorm2d(channels[2])

        self.conv_shortcut_1 = torch.nn.Conv2d(in_channels=channels[0],
                                               out_channels=channels[2],
                                               kernel_size=(1, 1),
                                               stride=(2, 2),
                                               padding=0)   
        self.conv_shortcut_1_bn = torch.nn.BatchNorm2d(channels[2])

    def forward(self, x):
        shortcut = x
        
        out = self.conv_1(x)
        out = self.conv_1_bn(out)
        out = F.relu(out)

        out = self.conv_2(out)
        out = self.conv_2_bn(out)
        
        # match up dimensions using a linear function (no relu)
        shortcut = self.conv_shortcut_1(shortcut)
        shortcut = self.conv_shortcut_1_bn(shortcut)
        
        out += shortcut
        out = F.relu(out)

        return out

In [10]:
##########################
### MODEL
##########################



class ConvNet(torch.nn.Module):

    def __init__(self, num_classes):
        super(ConvNet, self).__init__()
        
        self.residual_block_1 = ResidualBlock(channels=[1, 4, 8])
        self.residual_block_2 = ResidualBlock(channels=[8, 16, 32])
    
        self.linear_1 = torch.nn.Linear(7*7*32, num_classes)

        
    def forward(self, x):

        out = self.residual_block_1.forward(x)
        out = self.residual_block_2.forward(out)
         
        logits = self.linear_1(out.view(-1, 7*7*32))
        probas = F.softmax(logits, dim=1)
        return logits, probas

    
torch.manual_seed(random_seed)
model = ConvNet(num_classes=num_classes)

if torch.cuda.is_available():
    model.cuda()
    

##########################
### COST AND OPTIMIZER
##########################

cost_fn = torch.nn.CrossEntropyLoss()  
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  

In [11]:
from torch.autograd import Variable
import torch.nn.functional as F

def compute_accuracy(model, data_loader):
    correct_pred, num_examples = 0, 0
    for features, targets in data_loader:
        features = Variable(features)
        if torch.cuda.is_available():
            features = features.cuda()
        logits, probas = model(features)
        _, predicted_labels = torch.max(probas.data, 1)
        num_examples += targets.size(0)
        correct_pred += (predicted_labels.cpu() == targets).sum()
    return correct_pred/num_examples * 100
    

for epoch in range(num_epochs):
    for batch_idx, (features, targets) in enumerate(train_loader):
        
        features = Variable(features)
        targets = Variable(targets)
        
        if torch.cuda.is_available():
            features, targets = features.cuda(), targets.cuda()
            
        ### FORWARD AND BACK PROP
        logits, probas = model(features)
        cost = cost_fn(logits, targets)
        optimizer.zero_grad()
        
        cost.backward()
        
        ### UPDATE MODEL PARAMETERS
        optimizer.step()
        
        ### LOGGING
        if not batch_idx % 50:
            print ('Epoch: %03d/%03d | Batch %03d/%03d | Cost: %.4f' 
                   %(epoch+1, num_epochs, batch_idx, 
                     len(train_dataset)//batch_size, cost.data[0]))

    model.eval() # eval mode to prevent upd. batchnorm params during inference
    print('Epoch: %03d/%03d training accuracy: %.2f%%' % (
          epoch+1, num_epochs, 
          compute_accuracy(model, train_loader)))
    model.train()
model.eval()

Epoch: 001/010 | Batch 000/468 | Cost: 2.3215
Epoch: 001/010 | Batch 050/468 | Cost: 0.1409
Epoch: 001/010 | Batch 100/468 | Cost: 0.1001
Epoch: 001/010 | Batch 150/468 | Cost: 0.0979
Epoch: 001/010 | Batch 200/468 | Cost: 0.0845
Epoch: 001/010 | Batch 250/468 | Cost: 0.0924
Epoch: 001/010 | Batch 300/468 | Cost: 0.0319
Epoch: 001/010 | Batch 350/468 | Cost: 0.2797
Epoch: 001/010 | Batch 400/468 | Cost: 0.1258
Epoch: 001/010 | Batch 450/468 | Cost: 0.1133
Epoch: 001/010 training accuracy: 97.74%
Epoch: 002/010 | Batch 000/468 | Cost: 0.0779
Epoch: 002/010 | Batch 050/468 | Cost: 0.0705
Epoch: 002/010 | Batch 100/468 | Cost: 0.1595
Epoch: 002/010 | Batch 150/468 | Cost: 0.1233
Epoch: 002/010 | Batch 200/468 | Cost: 0.0170
Epoch: 002/010 | Batch 250/468 | Cost: 0.0441
Epoch: 002/010 | Batch 300/468 | Cost: 0.0719
Epoch: 002/010 | Batch 350/468 | Cost: 0.0515
Epoch: 002/010 | Batch 400/468 | Cost: 0.0204
Epoch: 002/010 | Batch 450/468 | Cost: 0.0486
Epoch: 002/010 training accuracy: 97.81

ConvNet(
  (residual_block_1): ResidualBlock(
    (conv_1): Conv2d (1, 4, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (conv_1_bn): BatchNorm2d(4, eps=1e-05, momentum=0.1, affine=True)
    (conv_2): Conv2d (4, 8, kernel_size=(1, 1), stride=(1, 1))
    (conv_2_bn): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True)
    (conv_shortcut_1): Conv2d (1, 8, kernel_size=(1, 1), stride=(2, 2))
    (conv_shortcut_1_bn): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True)
  )
  (residual_block_2): ResidualBlock(
    (conv_1): Conv2d (8, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (conv_1_bn): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True)
    (conv_2): Conv2d (16, 32, kernel_size=(1, 1), stride=(1, 1))
    (conv_2_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True)
    (conv_shortcut_1): Conv2d (8, 32, kernel_size=(1, 1), stride=(2, 2))
    (conv_shortcut_1_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True)
  )
  (linear_1): Linear(in_features=156

In [12]:
print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader)))

Test accuracy: 98.39%
