# Model Pruning Torch Practice
PyTorch has supported pruning since version 1.4.0 through the `torch.nn.utils.prune` module. This tutorial groups the pruning methods by scope as follows:
- Local Pruning
  - Structured Pruning
    - Random Structured Pruning (random_structured)
    - Norm Structured Pruning (ln_structured)
  - Unstructured Pruning
    - Random Unstructured Pruning (random_unstructured)
    - Norm Unstructured Pruning (l1_unstructured)
- Global Pruning
  - Unstructured Pruning (global_unstructured)
- Custom Pruning

**Note:** Global pruning only has unstructured pruning methods.

## 1. Local Pruning
First, we will introduce the local pruning method, which refers to pruning a single layer or a local range of the network.

### 1.1 Structured and unstructured pruning
Pruning methods can be divided into structured pruning and unstructured pruning. Unstructured pruning sets individual weight entries to 0, wherever they sit in the tensor, while structured pruning sets entire channels along a chosen dimension to 0.
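To make the difference concrete, here is a minimal sketch on a toy `nn.Linear` layer. The layer sizes, the amounts, and the seed are arbitrary choices for illustration and are not part of the tutorial's LeNet example.

```python
import torch
from torch import nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)  # arbitrary seed, only for repeatability

# Two toy layers with a 6x4 weight matrix each (sizes chosen for illustration only)
unstructured_layer = nn.Linear(in_features=4, out_features=6)
structured_layer = nn.Linear(in_features=4, out_features=6)

# Unstructured: zero 8 individual entries anywhere in the weight matrix
prune.random_unstructured(unstructured_layer, name="weight", amount=8)

# Structured: zero 2 whole rows (output channels) along dim=0
prune.random_structured(structured_layer, name="weight", amount=2, dim=0)

unstructured_mask = unstructured_layer.weight_mask
structured_mask = structured_layer.weight_mask

# Count zeroed entries and fully zeroed rows in the masks
zero_entries = int((unstructured_mask == 0).sum())
zero_rows = int((structured_mask.sum(dim=1) == 0).sum())

print(unstructured_mask)
print(structured_mask)
print("zeroed entries (unstructured):", zero_entries)
print("fully zeroed rows (structured):", zero_rows)
```

Both masks zero the same number of entries, but only the structured mask removes complete rows.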

#### 1.1.1 Random structured pruning (random_structured)

In [343]:
import torch
from torch import nn
import torch.nn.utils.prune as prune
import torch.nn.functional as F
from torchsummary import summary

Create a classic LeNet network

In [344]:
# Define a LeNet network
class LeNet(nn.Module):
    def __init__(self, num_classes=10):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(in_features=16 * 4 * 4, out_features=120)
        self.fc2 = nn.Linear(in_features=120, out_features=84)
        self.fc3 = nn.Linear(in_features=84, out_features=num_classes)

    def forward(self, x):
        x = self.maxpool(F.relu(self.conv1(x)))
        x = self.maxpool(F.relu(self.conv2(x)))

        x = x.view(x.size()[0], -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)

        return x
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = LeNet().to(device=device)

In [345]:
# Print model structure
summary(model, input_size=(1, 28, 28))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [-1, 6, 24, 24]             156
         MaxPool2d-2            [-1, 6, 12, 12]               0
            Conv2d-3             [-1, 16, 8, 8]           2,416
         MaxPool2d-4             [-1, 16, 4, 4]               0
            Linear-5                  [-1, 120]          30,840
            Linear-6                   [-1, 84]          10,164
            Linear-7                   [-1, 10]             850
Total params: 44,426
Trainable params: 44,426
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.04
Params size (MB): 0.17
Estimated Total Size (MB): 0.22
----------------------------------------------------------------
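The parameter counts in the summary can be checked by hand. The sketch below recomputes them in plain Python from the layer shapes defined in the LeNet class; the helper names `conv2d_params` and `linear_params` are hypothetical and exist only for this illustration.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    # One kernel_size x kernel_size filter per (input, output) channel pair,
    # plus one bias per output channel
    return in_channels * out_channels * kernel_size * kernel_size + out_channels

def linear_params(in_features, out_features):
    # One weight per (input, output) pair, plus one bias per output
    return in_features * out_features + out_features

layer_params = {
    "conv1": conv2d_params(1, 6, 5),
    "conv2": conv2d_params(6, 16, 5),
    "fc1": linear_params(16 * 4 * 4, 120),
    "fc2": linear_params(120, 84),
    "fc3": linear_params(84, 10),
}
total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name}: {count:,}")
print(f"total: {total_params:,}")
```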


In [346]:
# Print the parameters of the first convolutional layer
module = model.conv1
print(list(module.named_parameters()))

[('weight', Parameter containing:
tensor([[[[ 0.0220,  0.1789, -0.0544, -0.0713,  0.0478],
          [ 0.1995, -0.0415,  0.0288, -0.1431,  0.1057],
          [ 0.1600,  0.0248, -0.1903, -0.0242, -0.1961],
          [-0.0211,  0.0257, -0.1116, -0.1678,  0.0611],
          [ 0.0012,  0.0420, -0.1725, -0.1265, -0.1075]]],


        [[[-0.0540, -0.1928, -0.0355, -0.0075, -0.1481],
          [ 0.0135,  0.0192,  0.0082, -0.0120, -0.0164],
          [-0.0435, -0.1488,  0.1092, -0.0041,  0.1960],
          [-0.1045, -0.0136,  0.0398, -0.1286,  0.0617],
          [-0.0091,  0.0466,  0.1827,  0.1655,  0.0727]]],


        [[[ 0.1216, -0.0833, -0.1491, -0.1143,  0.0113],
          [ 0.0452,  0.1662, -0.0425, -0.0904, -0.1235],
          [ 0.0565,  0.0933, -0.0721,  0.0909,  0.1837],
          [-0.1739,  0.0263,  0.1339,  0.0648, -0.0382],
          [-0.1667,  0.1478,  0.0448, -0.0892,  0.0815]]],


        [[[ 0.1976,  0.0123,  0.1523, -0.1207,  0.1493],
          [-0.1799,  0.0580,  0.1490,  0.1

In [347]:
# Print the attribute tensor named_buffers in the module, which is initially an empty list
print(list(module.named_buffers()))

[]


In [348]:
# Print the model's state dictionary, which contains all the parameters
print(model.state_dict().keys())

odict_keys(['conv1.weight', 'conv1.bias', 'conv2.weight', 'conv2.bias', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias', 'fc3.weight', 'fc3.bias'])


In [349]:
# The first parameter: module, represents the specific module to be pruned, here it refers to module=model.conv1,
# Indicates that pruning is to be performed on the first convolutional layer.
# The second parameter: name, represents which parameters in the selected module to be pruned.
# Here name="weight" is set, which means to prune the weight in the network instead of bias.
# The third parameter: amount, represents how much to prune, as a proportion or an absolute number.
# amount is a float value between 0.0-1.0, representing the ratio, or a positive integer representing how many units to prune.
# For structured pruning the unit is a whole channel, so amount=2 prunes 2 channels.
# The fourth parameter: dim, represents the dimension index of the channel to be pruned.
# Here dim=0 selects the output channels of the convolution weight.

prune.random_structured(module, name="weight", amount=2, dim=0)

Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
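The two ways of specifying `amount` can be compared side by side. This is a minimal sketch on fresh layers with the same shape as conv1; the helper `pruned_channels` is a hypothetical name introduced here, and the fraction 0.5 is an arbitrary illustrative value.

```python
import torch
from torch import nn
import torch.nn.utils.prune as prune

def pruned_channels(conv):
    # A channel along dim=0 counts as pruned when its whole mask slice is zero
    return int((conv.weight_mask.sum(dim=(1, 2, 3)) == 0).sum())

# Same layer shape as conv1 in the tutorial: 6 output channels
conv_int = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
conv_float = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)

# Integer amount: prune exactly 2 of the 6 output channels
prune.random_structured(conv_int, name="weight", amount=2, dim=0)

# Float amount: prune a fraction of the channels; 0.5 * 6 channels gives 3 channels
prune.random_structured(conv_float, name="weight", amount=0.5, dim=0)

channels_int = pruned_channels(conv_int)
channels_float = pruned_channels(conv_float)
print(channels_int, channels_float)
```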

In [350]:
# Print the model's state dictionary again and observe the conv1 layer
print(model.state_dict().keys())

odict_keys(['conv1.bias', 'conv1.weight_orig', 'conv1.weight_mask', 'conv2.weight', 'conv2.bias', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias', 'fc3.weight', 'fc3.bias'])


In [351]:
# Print the module's named_parameters again
print(list(module.named_parameters()))

[('bias', Parameter containing:
tensor([-0.0893, -0.1464, -0.1101, -0.0076,  0.1493, -0.0418],
       requires_grad=True)), ('weight_orig', Parameter containing:
tensor([[[[ 0.0220,  0.1789, -0.0544, -0.0713,  0.0478],
          [ 0.1995, -0.0415,  0.0288, -0.1431,  0.1057],
          [ 0.1600,  0.0248, -0.1903, -0.0242, -0.1961],
          [-0.0211,  0.0257, -0.1116, -0.1678,  0.0611],
          [ 0.0012,  0.0420, -0.1725, -0.1265, -0.1075]]],


        [[[-0.0540, -0.1928, -0.0355, -0.0075, -0.1481],
          [ 0.0135,  0.0192,  0.0082, -0.0120, -0.0164],
          [-0.0435, -0.1488,  0.1092, -0.0041,  0.1960],
          [-0.1045, -0.0136,  0.0398, -0.1286,  0.0617],
          [-0.0091,  0.0466,  0.1827,  0.1655,  0.0727]]],


        [[[ 0.1216, -0.0833, -0.1491, -0.1143,  0.0113],
          [ 0.0452,  0.1662, -0.0425, -0.0904, -0.1235],
          [ 0.0565,  0.0933, -0.0721,  0.0909,  0.1837],
          [-0.1739,  0.0263,  0.1339,  0.0648, -0.0382],
          [-0.1667,  0.1478,  0.

In [352]:
# Print the attribute tensor named_buffers in the module again
print(list(module.named_buffers()))

[('weight_mask', tensor([[[[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]]],


        [[[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]]],


        [[[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]]],


        [[[0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0.]]],


        [[[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]]],


        [[[1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.],
          [1., 1., 1., 1., 1.]]]]))

Conclusion: After pruning, the original weight matrix weight becomes weight_orig. And module.named_buffers(), which was an empty list before pruning, now contains an additional weight_mask buffer.

In [353]:
# Print module.weight and see what we find?
print(module.weight)

tensor([[[[ 0.0000,  0.0000, -0.0000, -0.0000,  0.0000],
          [ 0.0000, -0.0000,  0.0000, -0.0000,  0.0000],
          [ 0.0000,  0.0000, -0.0000, -0.0000, -0.0000],
          [-0.0000,  0.0000, -0.0000, -0.0000,  0.0000],
          [ 0.0000,  0.0000, -0.0000, -0.0000, -0.0000]]],


        [[[-0.0540, -0.1928, -0.0355, -0.0075, -0.1481],
          [ 0.0135,  0.0192,  0.0082, -0.0120, -0.0164],
          [-0.0435, -0.1488,  0.1092, -0.0041,  0.1960],
          [-0.1045, -0.0136,  0.0398, -0.1286,  0.0617],
          [-0.0091,  0.0466,  0.1827,  0.1655,  0.0727]]],


        [[[ 0.1216, -0.0833, -0.1491, -0.1143,  0.0113],
          [ 0.0452,  0.1662, -0.0425, -0.0904, -0.1235],
          [ 0.0565,  0.0933, -0.0721,  0.0909,  0.1837],
          [-0.1739,  0.0263,  0.1339,  0.0648, -0.0382],
          [-0.1667,  0.1478,  0.0448, -0.0892,  0.0815]]],


        [[[ 0.0000,  0.0000,  0.0000, -0.0000,  0.0000],
          [-0.0000,  0.0000,  0.0000,  0.0000, -0.0000],
          [-0.0000,

Conclusion: After pruning, the original weight is renamed weight_orig and stays in named_parameters. The pruning mask is stored in the weight_mask buffer. weight_mask is a mask tensor of 0s and 1s, and the element-wise product of weight_mask and weight_orig is stored in weight.

**Note:** After pruning, the weight is no longer a parameter of the module, but only an attribute of the module.
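These relationships can be checked directly. The following is a minimal sketch on a fresh layer with the same shape as conv1; the variable names are illustrative only.

```python
import torch
from torch import nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
prune.random_structured(conv, name="weight", amount=2, dim=0)

param_names = [name for name, _ in conv.named_parameters()]
buffer_names = [name for name, _ in conv.named_buffers()]

# weight is the element-wise product of weight_orig and weight_mask
matches_product = torch.equal(conv.weight, conv.weight_orig * conv.weight_mask)

# weight is now a plain tensor attribute, not an nn.Parameter
weight_is_parameter = isinstance(conv.weight, nn.Parameter)

print(param_names)
print(buffer_names)
print(matches_product, weight_is_parameter)
```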

Each pruning operation registers a hook in the module's _forward_pre_hooks. The hook records the pruning method that was applied and recomputes weight from weight_orig and weight_mask before every forward pass.

In [354]:
# Print _forward_pre_hooks
print(module._forward_pre_hooks)

OrderedDict([(327, <torch.nn.utils.prune.RandomStructured object at 0x00000235D8EFF1C0>)])
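The role of the forward pre-hook can be observed by changing weight_orig and then running a forward pass. This is a minimal sketch on a toy `nn.Linear` layer (sizes and amount are arbitrary); the in-place update stands in for what an optimizer step would do during training.

```python
import torch
from torch import nn
import torch.nn.utils.prune as prune

layer = nn.Linear(in_features=4, out_features=3)
prune.random_unstructured(layer, name="weight", amount=6)

# Change the underlying parameter in place, as an optimizer step would
with torch.no_grad():
    layer.weight_orig.add_(1.0)

# layer.weight was computed when pruning was applied, so it is stale for now
stale_before_forward = not torch.equal(
    layer.weight, layer.weight_orig * layer.weight_mask
)

# The forward pass triggers the pre-hook, which recomputes weight_orig * weight_mask
_ = layer(torch.randn(2, 4))
fresh_after_forward = torch.equal(
    layer.weight, layer.weight_orig * layer.weight_mask
)

print(stale_before_forward, fresh_after_forward)
```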


#### 1.1.2 Norm structured pruning (ln_structured)
The parameters of a model can be pruned multiple times, which is called iterative pruning. The steps above applied random structured pruning to conv1. Next, apply norm structured pruning to the same layer and see what happens.

In [355]:
# The first parameter: module, represents the specific module to be pruned, here it refers to module=model.conv1,
# Indicates that pruning is to be performed on the first convolutional layer.
# The second parameter: name, represents which parameters in the selected module to be pruned.
# Here name="weight" is set, which means to prune the weight in the network instead of bias.
# The third parameter: amount, represents how much to prune, as a proportion or an absolute number.
# amount is a float value between 0.0-1.0, representing the ratio, or a positive integer representing how many channels to prune.
# With iterative pruning, the ratio applies to the channels that have not been pruned yet.
# The fourth parameter: n, represents the norm type, here n=2 represents the L2 norm.
# The fifth parameter: dim, represents the dimension index of the channel to be pruned.

prune.ln_structured(module, name="weight", amount=0.5, n=2, dim=0)

# Print model parameters again
print(" model state_dict keys:")
print(model.state_dict().keys())
print('*'*50)

print(" module named_parameters:")
print(list(module.named_parameters()))
print('*'*50)

print(" module named_buffers:")
print(list(module.named_buffers()))
print('*'*50)

print(" module weight:")
print(module.weight)
print('*'*50)

print(" module _forward_pre_hooks:")
print(module._forward_pre_hooks)

 model state_dict keys:
odict_keys(['conv1.bias', 'conv1.weight_orig', 'conv1.weight_mask', 'conv2.weight', 'conv2.bias', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias', 'fc3.weight', 'fc3.bias'])
**************************************************
 module named_parameters:
[('bias', Parameter containing:
tensor([-0.0893, -0.1464, -0.1101, -0.0076,  0.1493, -0.0418],
       requires_grad=True)), ('weight_orig', Parameter containing:
tensor([[[[ 0.0220,  0.1789, -0.0544, -0.0713,  0.0478],
          [ 0.1995, -0.0415,  0.0288, -0.1431,  0.1057],
          [ 0.1600,  0.0248, -0.1903, -0.0242, -0.1961],
          [-0.0211,  0.0257, -0.1116, -0.1678,  0.0611],
          [ 0.0012,  0.0420, -0.1725, -0.1265, -0.1075]]],


        [[[-0.0540, -0.1928, -0.0355, -0.0075, -0.1481],
          [ 0.0135,  0.0192,  0.0082, -0.0120, -0.0164],
          [-0.0435, -0.1488,  0.1092, -0.0041,  0.1960],
          [-0.1045, -0.0136,  0.0398, -0.1286,  0.0617],
          [-0.0091,  0.0466,  0.1827,  0.16

Conclusion: Iterative pruning chains multiple pruning methods into a single PruningContainer. The new mask is combined with the old mask by the compute_mask method of PruningContainer, so in the end there is still only one weight_orig and one weight_mask.
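The combined effect of the two steps can be checked on a fresh layer. This is a minimal sketch with the same layer shape and arguments as above; it relies on the private `_forward_pre_hooks` attribute in the same way the tutorial does.

```python
import torch
from torch import nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)

# Step 1: randomly prune 2 of the 6 output channels
prune.random_structured(conv, name="weight", amount=2, dim=0)
# Step 2: prune 50% of the channels that are still unpruned (2 of the remaining 4), by L2 norm
prune.ln_structured(conv, name="weight", amount=0.5, n=2, dim=0)

# Both steps share a single weight_orig / weight_mask pair
buffer_names = [name for name, _ in conv.named_buffers()]
pruned_channels = int((conv.weight_mask.sum(dim=(1, 2, 3)) == 0).sum())

# The single forward pre-hook is a PruningContainer holding both methods in order
hooks = list(conv._forward_pre_hooks.values())
history = [type(method).__name__ for method in hooks[0]]

print(buffer_names, pruned_channels, history)
```

Because the second step only looks at channels that survived the first, the two steps together prune four of the six channels.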

How can we see the full pruning history? module._forward_pre_hooks holds the hooks that run before the module's forward pass, and the pruning methods that have been applied are recorded there.

In [356]:
# Print pruning history
for hook in module._forward_pre_hooks.values():
    if hook._tensor_name == "weight":  
        break

print(list(hook))  

[<torch.nn.utils.prune.RandomStructured object at 0x00000235D8EFF1C0>, <torch.nn.utils.prune.LnStructured object at 0x00000235D9381F10>]


#### 1.1.3 Random unstructured pruning (random_unstructured)
You can prune any substructure of the model. In addition to pruning weights, you can also prune bias.

In [357]:
# The first parameter: module, represents the specific module to be pruned, here it refers to module=model.conv1,
# Indicates that pruning is to be performed on the first convolutional layer.
# The second parameter: name, represents which parameters in the selected module to be pruned.
# Here name="bias" is set, which means to prune the bias of the layer instead of weight.
# The third parameter: amount, represents how much to prune, as a proportion or an absolute number.
# amount is a float value between 0.0-1.0, representing the ratio, or a positive integer representing how many parameters to prune (amount=1 prunes one bias entry).

prune.random_unstructured(module, name="bias", amount=1)

# Print model parameters again
print(" model state_dict keys:")
print(model.state_dict().keys())
print('*'*50)

print(" module named_parameters:")
print(list(module.named_parameters()))
print('*'*50)

print(" module named_buffers:")
print(list(module.named_buffers()))
print('*'*50)

print(" module bias:")
print(module.bias)
print('*'*50)

print(" module _forward_pre_hooks:")
print(module._forward_pre_hooks)

 model state_dict keys:
odict_keys(['conv1.weight_orig', 'conv1.bias_orig', 'conv1.weight_mask', 'conv1.bias_mask', 'conv2.weight', 'conv2.bias', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias', 'fc3.weight', 'fc3.bias'])
**************************************************
 module named_parameters:
[('weight_orig', Parameter containing:
tensor([[[[ 0.0220,  0.1789, -0.0544, -0.0713,  0.0478],
          [ 0.1995, -0.0415,  0.0288, -0.1431,  0.1057],
          [ 0.1600,  0.0248, -0.1903, -0.0242, -0.1961],
          [-0.0211,  0.0257, -0.1116, -0.1678,  0.0611],
          [ 0.0012,  0.0420, -0.1725, -0.1265, -0.1075]]],


        [[[-0.0540, -0.1928, -0.0355, -0.0075, -0.1481],
          [ 0.0135,  0.0192,  0.0082, -0.0120, -0.0164],
          [-0.0435, -0.1488,  0.1092, -0.0041,  0.1960],
          [-0.1045, -0.0136,  0.0398, -0.1286,  0.0617],
          [-0.0091,  0.0466,  0.1827,  0.1655,  0.0727]]],


        [[[ 0.1216, -0.0833, -0.1491, -0.1143,  0.0113],
          [ 0.0452,  0.1

Conclusion: After applying different pruning strategies to different parameters of the module, state_dict and named_parameters contain bias_orig as well as weight_orig, and named_buffers contains both weight_mask and bias_mask.
Finally, because two pruning functions were applied to two different parameters, _forward_pre_hooks now holds two hooks, one per parameter.
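One way to read the hooks is to group them by the tensor they act on. The helper `pruning_summary` below is a hypothetical function written for this illustration, not part of `torch.nn.utils.prune`; like the tutorial, it reads the private attributes `_forward_pre_hooks` and `_tensor_name`.

```python
import torch
from torch import nn
import torch.nn.utils.prune as prune

def pruning_summary(module):
    """Map each pruned tensor name to the pruning method class names applied to it."""
    summary = {}
    for hook in module._forward_pre_hooks.values():
        if not isinstance(hook, prune.BasePruningMethod):
            continue  # skip forward pre-hooks that are unrelated to pruning
        if isinstance(hook, prune.PruningContainer):
            methods = [type(m).__name__ for m in hook]
        else:
            methods = [type(hook).__name__]
        summary[hook._tensor_name] = methods
    return summary

conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
prune.random_structured(conv, name="weight", amount=2, dim=0)
prune.random_unstructured(conv, name="bias", amount=1)

report = pruning_summary(conv)
print(report)
```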

#### 1.1.4 Norm unstructured pruning (l1_unstructured)
Previously, we introduced different methods for pruning the weight and bias of the conv1 layer. Can the same parameter be pruned across several layers of the network at once?

In [358]:
# Prune parameters across several modules of the model
for n, m in model.named_modules():
    # Apply l1_unstructured pruning to the bias of every convolutional layer, pruning 20% of the entries
    if isinstance(m, torch.nn.Conv2d):
        prune.l1_unstructured(m, name="bias", amount=0.2)
    # Optionally apply random_structured pruning to the weight of every fully connected layer, pruning 40% of the channels
    # elif isinstance(m, torch.nn.Linear):
    #     prune.random_structured(m, name="weight", amount=0.4, dim=0)

# Print model parameters again
print(" model state_dict keys:")
print(model.state_dict().keys())
print('*'*50)

print(" module named_parameters:")
print(list(module.named_parameters()))
print('*'*50)

print(" module named_buffers:")
print(list(module.named_buffers()))
print('*'*50)

print(" module weight:")
print(module.weight)
print('*'*50)

print(" module bias:")
print(module.bias)
print('*'*50)

print(" module _forward_pre_hooks:")
print(module._forward_pre_hooks)

 model state_dict keys:
odict_keys(['conv1.weight_orig', 'conv1.bias_orig', 'conv1.weight_mask', 'conv1.bias_mask', 'conv2.weight', 'conv2.bias_orig', 'conv2.bias_mask', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias', 'fc3.weight', 'fc3.bias'])
**************************************************
 module named_parameters:
[('weight_orig', Parameter containing:
tensor([[[[ 0.0220,  0.1789, -0.0544, -0.0713,  0.0478],
          [ 0.1995, -0.0415,  0.0288, -0.1431,  0.1057],
          [ 0.1600,  0.0248, -0.1903, -0.0242, -0.1961],
          [-0.0211,  0.0257, -0.1116, -0.1678,  0.0611],
          [ 0.0012,  0.0420, -0.1725, -0.1265, -0.1075]]],


        [[[-0.0540, -0.1928, -0.0355, -0.0075, -0.1481],
          [ 0.0135,  0.0192,  0.0082, -0.0120, -0.0164],
          [-0.0435, -0.1488,  0.1092, -0.0041,  0.1960],
          [-0.1045, -0.0136,  0.0398, -0.1286,  0.0617],
          [-0.0091,  0.0466,  0.1827,  0.1655,  0.0727]]],


        [[[ 0.1216, -0.0833, -0.1491, -0.1143,  0.0113],
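As a compact variant of the loop above, the sketch below prunes the weights of every layer by type and reports the fraction of masked entries per layer. The toy `nn.Sequential` model and the amounts are assumptions for illustration and differ from the tutorial's LeNet.

```python
import torch
from torch import nn
import torch.nn.utils.prune as prune

# Toy model for illustration (not the tutorial's LeNet)
model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(in_features=4 * 6 * 6, out_features=10),
)

for name, m in model.named_modules():
    if isinstance(m, nn.Conv2d):
        # Zero the 20% of conv weights with the smallest absolute value
        prune.l1_unstructured(m, name="weight", amount=0.2)
    elif isinstance(m, nn.Linear):
        # Zero 40% of the output rows of each fully connected layer, chosen at random
        prune.random_structured(m, name="weight", amount=0.4, dim=0)

# Fraction of masked entries per pruned layer, keyed by module name
mask_sparsity = {
    name: float((m.weight_mask == 0).sum()) / m.weight_mask.nelement()
    for name, m in model.named_modules()
    if hasattr(m, "weight_mask")
}
print(mask_sparsity)
```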


Next, we make the pruning permanent with remove. After the previous pruning steps, the original weight has become 'weight_orig', and weight, the product of 'weight_orig' and the mask matrix, is only an attribute. Observe what changes after remove.

In [359]:
# Make the bias pruning permanent on every convolutional layer with remove
for n, m in model.named_modules():
    if isinstance(m, torch.nn.Conv2d):
        prune.remove(m, 'bias')

# Make the weight pruning permanent on conv1 with remove
prune.remove(module, 'weight')
print('*'*50)

# Print out the state dictionary of the pruned model
print(" model state_dict keys:")
print(model.state_dict().keys())
print('*'*50)

# Print model parameters again
print(" model named_parameters:")
print(list(module.named_parameters()))
print('*'*50)

# Print the model mask buffers parameters again
print(" model named_buffers:")
print(list(module.named_buffers()))
print('*'*50)

# Print the model's _forward_pre_hooks again
print(" model forward_pre_hooks:")
print(module._forward_pre_hooks)

**************************************************
 model state_dict keys:
odict_keys(['conv1.bias', 'conv1.weight', 'conv2.weight', 'conv2.bias', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias', 'fc3.weight', 'fc3.bias'])
**************************************************
 model named_parameters:
[('bias', Parameter containing:
tensor([-0.0893, -0.1464, -0.0000, -0.0000,  0.1493, -0.0418],
       requires_grad=True)), ('weight', Parameter containing:
tensor([[[[ 0.0000,  0.0000, -0.0000, -0.0000,  0.0000],
          [ 0.0000, -0.0000,  0.0000, -0.0000,  0.0000],
          [ 0.0000,  0.0000, -0.0000, -0.0000, -0.0000],
          [-0.0000,  0.0000, -0.0000, -0.0000,  0.0000],
          [ 0.0000,  0.0000, -0.0000, -0.0000, -0.0000]]],


        [[[-0.0000, -0.0000, -0.0000, -0.0000, -0.0000],
          [ 0.0000,  0.0000,  0.0000, -0.0000, -0.0000],
          [-0.0000, -0.0000,  0.0000, -0.0000,  0.0000],
          [-0.0000, -0.0000,  0.0000, -0.0000,  0.0000],
          [-0.0000,  0.0

Conclusion: After performing the remove operation on the model's weight and bias, weight_orig and bias_orig disappear from the model parameters and are replaced by weight and bias, indicating that pruning has become permanent. named_buffers now prints [], because the masks weight_mask and bias_mask have been applied and no longer need to be kept.
Similarly, only an empty dictionary is left in _forward_pre_hooks. Weight and bias have become parameters again, with the pruned entries fixed at 0.
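The same before-and-after check fits in a few lines. This is a minimal sketch on a toy `nn.Linear` layer; the layer sizes and the 50% amount are arbitrary choices for illustration.

```python
import torch
from torch import nn
import torch.nn.utils.prune as prune

layer = nn.Linear(in_features=4, out_features=3)
prune.l1_unstructured(layer, name="weight", amount=0.5)
pruned_weight = layer.weight.detach().clone()  # masked weight before remove

prune.remove(layer, "weight")

param_names = sorted(name for name, _ in layer.named_parameters())
buffer_names = [name for name, _ in layer.named_buffers()]
hook_count = len(layer._forward_pre_hooks)
zeros_kept = torch.equal(layer.weight, pruned_weight)

print(param_names)   # parameter names after remove
print(buffer_names)  # buffer names after remove
print(hook_count)    # number of forward pre-hooks after remove
print(zeros_kept)    # whether the masked values were written into weight
```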

## 2. Global pruning

Four local pruning methods have been introduced above, but they require you to decide, largely from experience, which layer of the network to prune and by how much.
A more general strategy is global pruning, which ranks parameters across the entire network. After global pruning, the percentage pruned in different layers may differ.

In [363]:
model = LeNet().to(device=device)

# First print the state dictionary of the initialized model
print(model.state_dict().keys())
print('*'*50)

# Build parameter sets to determine which layers and parameter sets participate in pruning
parameters_to_prune = (
            (model.conv1, 'weight'),
            (model.conv2, 'weight'),
            (model.fc1, 'weight'),
            (model.fc2, 'weight'))

# Call the global pruning function global_unstructured in prune to perform the pruning operation
prune.global_unstructured(parameters_to_prune, pruning_method=prune.L1Unstructured, amount=0.2)

# Print the state dictionary of the pruned model
print(model.state_dict().keys())

odict_keys(['conv1.weight', 'conv1.bias', 'conv2.weight', 'conv2.bias', 'fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias', 'fc3.weight', 'fc3.bias'])
**************************************************
odict_keys(['conv1.bias', 'conv1.weight_orig', 'conv1.weight_mask', 'conv2.bias', 'conv2.weight_orig', 'conv2.weight_mask', 'fc1.bias', 'fc1.weight_orig', 'fc1.weight_mask', 'fc2.bias', 'fc2.weight_orig', 'fc2.weight_mask', 'fc3.weight', 'fc3.bias'])


After pruning the model, different layers will have different proportions of weight parameters pruned off. Use the code to print it out and see:

In [364]:
print(
    "Sparsity in conv1.weight: {:.2f}%".format(
    100. * float(torch.sum(model.conv1.weight == 0))
    / float(model.conv1.weight.nelement())
    ))

print(
    "Sparsity in conv2.weight: {:.2f}%".format(
    100. * float(torch.sum(model.conv2.weight == 0))
    / float(model.conv2.weight.nelement())
    ))

print(
    "Sparsity in fc1.weight: {:.2f}%".format(
    100. * float(torch.sum(model.fc1.weight == 0))
    / float(model.fc1.weight.nelement())
    ))

print(
    "Sparsity in fc2.weight: {:.2f}%".format(
    100. * float(torch.sum(model.fc2.weight == 0))
    / float(model.fc2.weight.nelement())
    ))


print(
    "Global sparsity: {:.2f}%".format(
    100. * float(torch.sum(model.conv1.weight == 0)
               + torch.sum(model.conv2.weight == 0)
               + torch.sum(model.fc1.weight == 0)
               + torch.sum(model.fc2.weight == 0))
         / float(model.conv1.weight.nelement()
               + model.conv2.weight.nelement()
               + model.fc1.weight.nelement()
               + model.fc2.weight.nelement())
    ))

Sparsity in conv1.weight: 5.33%
Sparsity in conv2.weight: 17.25%
Sparsity in fc1.weight: 22.03%
Sparsity in fc2.weight: 14.67%
Global sparsity: 20.00%


Conclusion: With global pruning at amount=0.2, 20% of the parameters listed in parameters_to_prune are pruned in total (fc3 and the biases are not included here). How much of each layer is pruned depends on the distribution of that layer's weights.
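The repeated sparsity printouts can be folded into one loop over parameters_to_prune. The sketch below does this on a toy two-layer model (an assumption for illustration, not the tutorial's LeNet) and counts zeros in the masks rather than in the weights.

```python
import torch
from torch import nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)  # arbitrary seed, only for repeatability

# Toy model for illustration (not the tutorial's LeNet)
model = nn.Sequential(nn.Linear(20, 10), nn.ReLU(), nn.Linear(10, 5))
parameters_to_prune = [(model[0], "weight"), (model[2], "weight")]

prune.global_unstructured(
    parameters_to_prune, pruning_method=prune.L1Unstructured, amount=0.2
)

total_zeros, total_elements = 0, 0
for module, name in parameters_to_prune:
    mask = getattr(module, name + "_mask")
    zeros, elements = int((mask == 0).sum()), mask.nelement()
    total_zeros += zeros
    total_elements += elements
    print(f"{name} of {module}: {100.0 * zeros / elements:.2f}% pruned")

print(f"global: {100.0 * total_zeros / total_elements:.2f}% pruned")
```

The per-layer percentages vary from run to run without a fixed seed, while the global figure is fixed by `amount`.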

## 3. Custom pruning

A custom pruning method is a class that inherits from BasePruningMethod, which provides methods such as __call__, apply_mask, apply, prune, and remove. To define a custom pruning rule, implement compute_mask, plus __init__ (the constructor) when the method takes arguments.

In [None]:
# The class of a custom pruning method must inherit prune.BasePruningMethod
class custom_prune(prune.BasePruningMethod):
    # Specify the type of pruning implemented by this technique (supported options are global, structured, and unstructured)
    PRUNING_TYPE = "unstructured"

    # Implement compute_mask to define the pruning rule, i.e. how to mask the weight parameters
    def compute_mask(self, t, default_mask):
        mask = default_mask.clone()
        # The rule defined here masks out every other parameter, so 50% of the parameters involved in pruning are masked out
        mask.view(-1)[::2] = 0
        return mask

# Wrap the pruning class in a function that calls its class method apply
def custome_unstructured_pruning(module, name):
    custom_prune.apply(module, name)
    return module

In [365]:
import time
# Instantiate the model class
model = LeNet().to(device=device)

start = time.time()
# Call the function of the custom pruning method to perform custom pruning on the bias in the first fully connected layer fc1 in the model
custome_unstructured_pruning(model.fc1, name="bias")

# A bias_mask buffer on the module shows that pruning was applied
print(model.fc1.bias_mask)

# Print the time taken for custom pruning
duration = time.time() - start
print(duration * 1000, 'ms')

tensor([0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1.,
        0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1.,
        0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1.,
        0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1.,
        0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1.,
        0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1.,
        0., 1., 0., 1., 0., 1., 0., 1., 0., 1., 0., 1.])
2.996683120727539 ms


Conclusion: The printed bias_mask masks every other element, as defined, with 0 and 1 alternating. When the remove operation is performed later, every other value of the original bias_orig will be set to 0 permanently.
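A custom method can also take arguments through its constructor. The sketch below is a hypothetical example, not something from the tutorial: `ThresholdPruning` and `threshold_unstructured` are names invented here, and the rule (mask entries whose absolute value is below a threshold) is only one possible choice. The weights are set by hand so the resulting mask is easy to read.

```python
import torch
from torch import nn
import torch.nn.utils.prune as prune

class ThresholdPruning(prune.BasePruningMethod):
    """Hypothetical method: mask entries whose absolute value is below a threshold."""

    PRUNING_TYPE = "unstructured"

    def __init__(self, threshold):
        self.threshold = threshold

    def compute_mask(self, t, default_mask):
        mask = default_mask.clone()
        # Keep earlier pruning decisions and additionally mask small entries
        mask[t.abs() < self.threshold] = 0
        return mask

def threshold_unstructured(module, name, threshold):
    # Extra keyword arguments given to apply() are forwarded to __init__
    ThresholdPruning.apply(module, name, threshold=threshold)
    return module

layer = nn.Linear(in_features=4, out_features=1, bias=False)
with torch.no_grad():
    # Hand-picked weights for illustration
    layer.weight.copy_(torch.tensor([[0.05, -0.5, 0.2, -0.01]]))

threshold_unstructured(layer, name="weight", threshold=0.1)
print(layer.weight_mask)
print(layer.weight)
```

Unlike the every-other-element rule, this rule depends on the values in `t`, so the pruned fraction is not known in advance.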