# **Demo:** Net2DeeperNet on MNIST with LeNet5

The following demo shows how to apply Net2DeeperNet to LeNet5 in order to deepen the network by inserting new convolutional layers. The input image shape is that of MNIST, but the network and the Net2DeeperNet algorithm can be applied to other image sizes.

In [1]:
# Import libraries
import torch
import torchinfo

# Import custom modules and packages
from models.lenet import LeNet
import params.lenet_mnist
import net2net.net2net_deeper

### 1. Create a LeNet5 model

We start by creating the standard LeNet5 model.

In [2]:
# Create a LeNet model
model = LeNet(nb_classes=params.lenet_mnist.NB_CLASSES)

# Create a random input
x = torch.randn(1,
                params.lenet_mnist.NB_CHANNELS,
                *params.lenet_mnist.IMAGE_SHAPE)

# Compute the output of the teacher network
y_teacher = model(x)

### 2. Create a deeper version of LeNet5

We then apply the Net2DeeperNet algorithm to the standard LeNet5 model to insert new convolutional layers after existing ones. The weights and biases of the student model are initialized from those of the teacher model so that, at initialization, the student is meant to produce the same output as the teacher for the same input.
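To make the idea of a function-preserving insertion concrete, here is a minimal sketch in plain PyTorch. It is not the code of `net2net.net2net_deeper`; the helper name `identity_conv` and the 5x5 kernel size are assumptions made for illustration, and batch normalization is ignored. The new convolution is initialized so that each output channel copies the matching input channel, and because its input comes from a ReLU (and is therefore non-negative), applying a ReLU afterwards changes nothing.

```python
import torch
import torch.nn as nn


def identity_conv(channels: int, kernel_size: int = 5) -> nn.Conv2d:
    """Hypothetical helper: a conv layer initialized as the identity mapping.

    Assumes an odd kernel size, so that padding = kernel_size // 2 keeps the
    spatial size unchanged.
    """
    center = kernel_size // 2
    conv = nn.Conv2d(channels, channels, kernel_size, padding=center)
    with torch.no_grad():
        conv.weight.zero_()
        conv.bias.zero_()
        # Output channel c copies input channel c through the kernel center
        for c in range(channels):
            conv.weight[c, c, center, center] = 1.0
    return conv


torch.manual_seed(0)

# Stand-in for the activations produced by an existing Conv -> ReLU block
features = torch.relu(torch.randn(1, 6, 28, 28))

# Insert the new layer followed by a ReLU
new_layer = identity_conv(6, kernel_size=5)
out = torch.relu(new_layer(features))

identity_preserved = torch.allclose(out, features)
print(identity_preserved)  # True
```

If noise were added to the copied weights, the equality would only hold approximately.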

In [3]:
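# Instantiate a Net2Net object from a (pre-trained) model
# Note: this rebinds the name `net2net` from the package to the Net2Net object,
# so the import cell must be re-run before running this cell a second time
net2net = net2net.net2net_deeper.Net2Net(teacher_network=model, dataset_used="MNIST")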

# Set the deepening operations to be performed
# Here we target the convolutional layers "layer1.0" and "layer2.0"
deeper_operations = {"operation1": {"target_conv_layers": ["layer1.0","layer2.0"]}}

# Standard deviation of the noise optionally added to the copied weights
# (defined for reference only; it is not passed to net2deeper in this cell)
sigma = 0.

# Apply the Net2Net deepening operations and get the student network
net2net.net2deeper(deeper_operations)
student_model = net2net.student_network

# Compute the output of the student network
y_student = student_model(x)



The weights and bias of the new batch normalization layer arenot initialized yet. To be implemented.
The weights and bias of the new batch normalization layer arenot initialized yet. To be implemented.
Device: cpu



Epoch 0 [train]: 100%|██████████| 782/782 [01:17<00:00, 10.14batch/s, batch_loss=2.24] 
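The warning above says that the new batch normalization layers are not initialized yet. As a hedged sketch of one possible approach (an assumption, not the library's planned implementation), a `BatchNorm2d` layer can be made to act as the identity in evaluation mode by choosing its scale and shift so that they undo the normalization. The helper name `make_bn_identity` is hypothetical, and the statistics are estimated here from a random batch of stand-in activations.

```python
import torch
import torch.nn as nn


def make_bn_identity(bn: nn.BatchNorm2d,
                     mean: torch.Tensor,
                     var: torch.Tensor) -> None:
    """Hypothetical helper: make `bn` return its input in eval mode.

    In eval mode, BatchNorm2d computes
        y = (x - running_mean) / sqrt(running_var + eps) * weight + bias
    so weight = sqrt(var + eps) and bias = mean undo the normalization.
    """
    with torch.no_grad():
        bn.running_mean.copy_(mean)
        bn.running_var.copy_(var)
        bn.weight.copy_(torch.sqrt(var + bn.eps))
        bn.bias.copy_(mean)


torch.manual_seed(0)

# Stand-in for the activations that would feed the new BN layer
activations = torch.relu(torch.randn(8, 6, 28, 28))

# Per-channel statistics over the batch and spatial dimensions
mean = activations.mean(dim=(0, 2, 3))
var = activations.var(dim=(0, 2, 3), unbiased=False)

bn = nn.BatchNorm2d(6)
make_bn_identity(bn, mean, var)
bn.eval()  # Use the stored running statistics

bn_output = bn(activations)
bn_is_identity = torch.allclose(bn_output, activations, atol=1e-5)
print(bn_is_identity)
```

This only holds in evaluation mode. In training mode the layer normalizes with the statistics of the current batch, so the output would generally differ from the input unless the batch statistics match the stored ones.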


### 3. Check that the student and teacher models have the same output for the same input

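We check whether the student model reproduces the output of the teacher model for the same input at initialization. With a function-preserving initialization the two outputs should match, up to small differences if noise has been added to the student weights. In the run shown here they differ substantially, which is consistent with the warning above that the new batch normalization layers are not yet initialized.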

In [4]:
# The outputs should be the same
print("Teacher output: ", y_teacher)
print("Student output: ", y_student, "\n")

Teacher output:  tensor([[ 0.1087,  0.1175,  0.1888,  0.2984,  0.0080,  0.0870, -0.1261,  0.0786,
          0.0226, -0.1650]], grad_fn=<AddmmBackward0>)
Student output:  tensor([[ 0.4720,  1.2754,  2.6164,  3.6586,  0.4364,  2.1904, -0.3835,  1.4233,
          0.8637, -1.1186]], grad_fn=<AddmmBackward0>) 
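Comparing two printed tensors by eye is error-prone, so a numerical check can help. The sketch below uses the values printed above and the standard `torch.allclose` function; the helper name `compare_outputs` and the tolerance are assumptions chosen for illustration.

```python
import torch

# Values copied from the printed teacher and student outputs
teacher_out = torch.tensor([[0.1087, 0.1175, 0.1888, 0.2984, 0.0080,
                             0.0870, -0.1261, 0.0786, 0.0226, -0.1650]])
student_out = torch.tensor([[0.4720, 1.2754, 2.6164, 3.6586, 0.4364,
                             2.1904, -0.3835, 1.4233, 0.8637, -1.1186]])


def compare_outputs(a: torch.Tensor, b: torch.Tensor, atol: float = 1e-5):
    """Hypothetical helper: return (outputs match?, largest absolute gap)."""
    max_diff = (a - b).abs().max().item()
    return torch.allclose(a, b, atol=atol), max_diff


outputs_match, max_diff = compare_outputs(teacher_out, student_out)
print(outputs_match, round(max_diff, 4))  # False 3.3602
```

In the notebook itself, the same helper could be called directly on `y_teacher` and `y_student`.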



### 4. Have a look at the student and teacher architectures

We display the teacher and student architectures to check that the student model contains an additional convolutional layer, followed by batch normalization and ReLU, after the first convolutional layer.
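The parameter counts reported by `torchinfo` in this section can be cross-checked by hand. A `Conv2d` layer has `out_channels * in_channels * k * k + out_channels` parameters, and a `BatchNorm2d` layer has two per channel. The sketch below assumes 5x5 kernels and a single input channel; the padding value is also an assumption and does not affect the counts. Under these assumptions, the 906 parameters of the extra `Conv2d` in the student summary match a 5x5 convolution from 6 to 6 channels.

```python
import torch.nn as nn


def count_parameters(module: nn.Module) -> int:
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())


# Assumed layer shapes: 5x5 kernels, one input channel
first_conv = nn.Conv2d(1, 6, kernel_size=5, padding=2)
inserted_conv = nn.Conv2d(6, 6, kernel_size=5, padding=2)
inserted_bn = nn.BatchNorm2d(6)
second_conv = nn.Conv2d(6, 16, kernel_size=5)

print(count_parameters(first_conv))     # 156  = 6*1*5*5 + 6
print(count_parameters(inserted_conv))  # 906  = 6*6*5*5 + 6
print(count_parameters(inserted_bn))    # 12   = 2*6
print(count_parameters(second_conv))    # 2416 = 16*6*5*5 + 16
```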

In [5]:
# Display the architecture of the teacher network
torchinfo.summary(model, input_size=(1,
                                     params.lenet_mnist.NB_CHANNELS,
                                     *params.lenet_mnist.IMAGE_SHAPE))

Layer (type:depth-idx)                   Output Shape              Param #
LeNet                                    [1, 10]                   --
├─Sequential: 1-1                        [1, 6, 14, 14]            --
│    └─Conv2d: 2-1                       [1, 6, 28, 28]            156
│    └─BatchNorm2d: 2-2                  [1, 6, 28, 28]            12
│    └─ReLU: 2-3                         [1, 6, 28, 28]            --
│    └─MaxPool2d: 2-4                    [1, 6, 14, 14]            --
├─Sequential: 1-2                        [1, 16, 5, 5]             --
│    └─Conv2d: 2-5                       [1, 16, 10, 10]           2,416
│    └─BatchNorm2d: 2-6                  [1, 16, 10, 10]           32
│    └─ReLU: 2-7                         [1, 16, 10, 10]           --
│    └─MaxPool2d: 2-8                    [1, 16, 5, 5]             --
├─Linear: 1-3                            [1, 120]                  48,120
├─ReLU: 1-4                              [1, 120]                  --
├─Linea

In [6]:
# Display the architecture of the student network
torchinfo.summary(net2net.student_network, input_size=(1,
                                                       params.lenet_mnist.NB_CHANNELS,
                                                       *params.lenet_mnist.IMAGE_SHAPE))

Layer (type:depth-idx)                   Output Shape              Param #
LeNet                                    [1, 10]                   --
├─Sequential: 1-1                        [1, 6, 14, 14]            --
│    └─Conv2d: 2-1                       [1, 6, 28, 28]            156
│    └─BatchNorm2d: 2-2                  [1, 6, 28, 28]            12
│    └─ReLU: 2-3                         [1, 6, 28, 28]            --
│    └─Conv2d: 2-4                       [1, 6, 28, 28]            906
│    └─BatchNorm2d: 2-5                  [1, 6, 28, 28]            12
│    └─ReLU: 2-6                         [1, 6, 28, 28]            --
│    └─MaxPool2d: 2-7                    [1, 6, 14, 14]            --
├─Sequential: 1-2                        [1, 16, 5, 5]             --
│    └─Conv2d: 2-8                       [1, 16, 10, 10]           2,416
│    └─BatchNorm2d: 2-9                  [1, 16, 10, 10]           32
│    └─ReLU: 2-10                        [1, 16, 10, 10]           --
│    └─Con