In [2]:
# this mounts your Google Drive to the Colab VM.
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

# enter the foldername in your Drive where you have saved the unzipped
# assignment folder, e.g. 'ece697ls/assignments/assignment3/'
FOLDERNAME = "ECE562 assignments"
assert FOLDERNAME is not None, "[!] Enter the foldername."

# now that we've mounted your Drive, this ensures that
# the Python interpreter of the Colab VM can load
# python files from within it.
import sys
sys.path.append('/content/drive/My Drive/{}'.format(FOLDERNAME))

%cd /content
!pwd

Mounted at /content/drive
/content
/content


# Pruning

During training, a neural network learns a set of parameters to perform our classification task. However, these parameters are costly to store and compute with, and they are not all equally important.

Wouldn't it be great if we could optimize our resource usage by dropping the less important values? This is where pruning comes into play.

Pruning is a technique that cuts parameters or structures out of a model to increase sparsity and decrease the overall model size, similar to cutting leaves or branches from bushes and trees. This process can lead to lower memory consumption with minimal accuracy reduction. Moreover, pruning the network may also provide a speedup, since fewer operations are performed.

Pruning can be performed at the end of a training epoch or after training is complete. Experimenting to find out which way works best is part of the fun!
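
As a minimal sketch of the idea, the snippet below zeroes the smallest-magnitude half of a toy weight matrix and measures the resulting sparsity. The helper `magnitude_prune` and the tensor `weights` are hypothetical names used only for illustration; they are not part of the assignment code.

```python
import torch

torch.manual_seed(0)
weights = torch.randn(4, 5)  # toy weight matrix with 20 entries


def magnitude_prune(w: torch.Tensor, fraction: float) -> torch.Tensor:
    """Return a copy of `w` with the smallest-magnitude `fraction` of entries zeroed."""
    k = int(round(fraction * w.numel()))
    if k == 0:
        return w.clone()
    # The k-th smallest absolute value acts as the pruning threshold.
    threshold = w.abs().flatten().kthvalue(k).values
    mask = w.abs() > threshold  # True for the weights we keep
    return w * mask


pruned = magnitude_prune(weights, 0.5)
sparsity = (pruned == 0).float().mean().item()
print(f"sparsity after pruning: {sparsity:.2f}")
```

The surviving entries keep their original values; only the mask decides which weights are dropped.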

In [4]:
from assignment1.ece662.pruning_helper import test_model, load_model
from assignment1.ece662.data_utils import get_CINIC10_data
import os

Below we load a pre-trained model for you to work on. If you prefer, you can save your own model from the previous TensorFlow/PyTorch task and load it here.

In [21]:
# This cell may take a while to execute: it loads the CINIC-10 data and a pre-trained model.

data = get_CINIC10_data()
mode = 'torch'  # 'torch' or 'tensorflow'

test_data = [data['X_test'], data['y_test']]

path = '/content/drive/MyDrive/ECE562 assignments/assignment1/ece662/models/torch.model'
model = load_model(path, mode=mode)
test_model(model, test_data, mode=mode)


Test Acc: 0.5120


## Unstructured Pruning

Unstructured pruning usually refers to pruning individual weights in a neural network. The general idea is to select a set of weights according to a policy and set them to zero.

Common policies are random weight selection or selecting the smallest-magnitude weights.
Unstructured pruning can be performed on one or multiple layers within the same network.
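
The two policies can be compared on a single layer with `torch.nn.utils.prune`. This is a small sketch on a toy `nn.Linear` layer (the layer sizes and variable names are assumptions chosen for illustration): both calls zero half of the weights, but the magnitude-based policy keeps the larger ones.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
layer_random = nn.Linear(8, 4)
layer_l1 = nn.Linear(8, 4)
layer_l1.load_state_dict(layer_random.state_dict())  # identical starting weights

# Policy 1: zero a random 50% of the weights.
prune.random_unstructured(layer_random, name="weight", amount=0.5)
# Policy 2: zero the 50% of weights with the smallest absolute value.
prune.l1_unstructured(layer_l1, name="weight", amount=0.5)

zeros_random = int((layer_random.weight == 0).sum())
zeros_l1 = int((layer_l1.weight == 0).sum())
kept_random = layer_random.weight.abs().sum().item()
kept_l1 = layer_l1.weight.abs().sum().item()

print(f"zeros: random={zeros_random}, l1={zeros_l1}")
print(f"total magnitude kept: random={kept_random:.3f}, l1={kept_l1:.3f}")
```

Because the magnitude policy keeps the largest weights by construction, the total magnitude it retains is never smaller than what a random selection of the same size retains.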

Although in theory unstructured pruning should decrease the number of operations performed during execution, the framework or hardware must explicitly support skipping those operations; otherwise, it will simply compute with the zeros.
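
To see why zeros alone do not save work, it helps to inspect what `torch.nn.utils.prune` stores. The sketch below uses a toy `nn.Linear(10, 10)` layer (an assumption for illustration): after pruning 90% of the weights, the weight tensor is still a dense 10x10 tensor, and only an explicit sparse representation stores fewer values.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
layer = nn.Linear(10, 10)
prune.l1_unstructured(layer, name="weight", amount=0.9)

# Pruning reparametrizes the layer: weight = weight_orig * weight_mask.
param_names = sorted(name for name, _ in layer.named_parameters())
buffer_names = sorted(name for name, _ in layer.named_buffers())
print(param_names)   # ['bias', 'weight_orig']
print(buffer_names)  # ['weight_mask']

dense_elems = layer.weight.numel()        # still a full dense tensor
nonzero = int((layer.weight != 0).sum())  # only a few entries are non-zero
print(f"dense elements: {dense_elems}, non-zero: {nonzero}")

# Make the pruning permanent, then convert to a sparse (COO) tensor.
prune.remove(layer, "weight")
sparse = layer.weight.detach().to_sparse().coalesce()
stored_values = sparse.values().numel()
print(f"values stored in sparse form: {stored_values}")
```

Whether a sparse format actually runs faster depends on the kernels and hardware available, which is the support the paragraph above refers to.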

### Perform Pruning

Using the PyTorch model trained in the previous step, perform unstructured pruning on the model's weights by removing the x% smallest-magnitude weights.

*   Increase global pruning in 10% increments until a total of 80% of the weights are pruned
*   Perform inference after each pruning step and observe the impact on accuracy.


Note: The percentages are related to the entire model, not per layer.
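
The difference between a global and a per-layer percentage can be checked on a toy network. In this sketch (the two-layer `net` and the scaling of its second layer are assumptions made for illustration), 30% of all weights are pruned globally, yet the layer with smaller weights absorbs most of the pruning.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(20, 20), nn.ReLU(), nn.Linear(20, 10))

# Shrink the second layer's weights so the global threshold hits it harder.
with torch.no_grad():
    net[2].weight.mul_(0.01)

pairs = [(m, "weight") for m in net.modules() if isinstance(m, nn.Linear)]
prune.global_unstructured(pairs, pruning_method=prune.L1Unstructured, amount=0.3)

per_layer = [(m.weight == 0).float().mean().item() for m, _ in pairs]
total = sum(m.weight.numel() for m, _ in pairs)
zeros = sum(int((m.weight == 0).sum()) for m, _ in pairs)

print(f"global sparsity: {zeros / total:.2f}")
for i, s in enumerate(per_layer):
    print(f"layer {i} sparsity: {s:.2f}")
```

A single threshold is applied across all selected tensors, so per-layer sparsity can vary widely even though the global figure matches the requested amount.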



In [18]:
################################################################################
# TODO: Perform unstructured Pruning over the trained model using 3 different
# pruning percentages.
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# pick all Conv2d and Linear weights
pairs = [(m, "weight") for m in model.modules() if isinstance(m, (nn.Conv2d, nn.Linear))]

def global_sparsity(net):
    tot = zer = 0
    for m in net.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            W = m.weight
            tot += W.numel()
            zer += (W == 0).sum().item()
    return zer / max(1, tot)

print(f"Baseline sparsity: {global_sparsity(model):.2f}")
test_model(model, test_data, mode=mode)

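# NOTE: repeated pruning compounds. Each call removes `amount` of the weights
# that are still unpruned, so from the second step onward the actual sparsity
# is higher than the target printed below.
for pct in range(10, 90, 10):
    prune.global_unstructured(pairs, pruning_method=prune.L1Unstructured, amount=pct / 100.0)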
    sp = global_sparsity(model)
    print(f"Target {pct:>2d}% | actual sparsity {sp:.2f}")
    test_model(model, test_data, mode=mode)

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
################################################################################
#                              END OF YOUR CODE                                #
################################################################################

Baseline sparsity: 0.00
Test Acc: 0.5065
Target 10% | actual sparsity 0.10
Test Acc: 0.5138
Target 20% | actual sparsity 0.28
Test Acc: 0.5025
Target 30% | actual sparsity 0.50
Test Acc: 0.4819
Target 40% | actual sparsity 0.70
Test Acc: 0.4698
Target 50% | actual sparsity 0.85
Test Acc: 0.3982
Target 60% | actual sparsity 0.94
Test Acc: 0.2743
Target 70% | actual sparsity 0.98
Test Acc: 0.1751
Target 80% | actual sparsity 1.00
Test Acc: 0.1667
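
The gap between target and actual sparsity in the log above follows from how repeated calls compose: a fractional `amount` is taken from the weights that are still unpruned. The plain-Python sketch below reproduces the logged sparsity values and then derives per-step amounts that would land on the cumulative targets instead. The helper `step_amount` is a hypothetical name, and the second part is only an arithmetic sketch, not a rerun of the experiment.

```python
targets = [pct / 100 for pct in range(10, 90, 10)]

# (a) What the loop above does: prune `target` of the REMAINING weights each step.
remaining = 1.0
compounded = []
for t in targets:
    remaining *= 1.0 - t
    compounded.append(1.0 - remaining)
formatted = [f"{s:.2f}" for s in compounded]
print(formatted)
# ['0.10', '0.28', '0.50', '0.70', '0.85', '0.94', '0.98', '1.00']


# (b) Per-step amount needed to move from the current sparsity to the target.
def step_amount(target: float, current: float) -> float:
    """Fraction of the remaining weights to prune to reach `target` overall."""
    return (target - current) / (1.0 - current)


reached = 0.0
steps = []
for t in targets:
    amount = step_amount(t, reached)
    steps.append(amount)
    reached += (1.0 - reached) * amount  # sparsity after this step

print([f"{a:.3f}" for a in steps])
print(f"final sparsity: {reached:.2f}")
```

Passing the values from (b) as `amount` in successive `prune.global_unstructured` calls should keep the cumulative sparsity at 10%, 20%, and so on, up to rounding to whole weights. Reloading the model before each pruning level is another option.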


## Inline Question 1:

What happened to the accuracy as the pruning percentage increased?
Why was that the case?


## Answer:
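Accuracy stayed close to the baseline at low pruning rates, then started dropping quickly after the 40% step; at the final 80% step, accuracy was only about 17% (0.1667). Note that the logged actual sparsity is higher than the target at every step after the first (0.70 at the 40% target, reported as 1.00 at the 80% target), so the sharp drop corresponds to roughly 70% or more of the weights being zero. Light pruning removes redundant, low-magnitude connections with little effect on the overall network, but heavier pruning starts to remove important weights, which degrades accuracy.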

## Structured Pruning

Structured pruning removes larger chunks of the network's parameters at once. Instead of removing individual weights, it is common to remove entire neurons.

For example, in convolutional layers, removing filters can improve performance because it greatly decreases the amount of computation performed. However, removing filters changes the layer's output dimensions, and that change carries over to the parts of the network that consume its output. Therefore, when performing structured pruning, one must always be aware of which parameters are going to be affected.
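
The dimension issue can be made concrete with two toy convolutions. In this sketch (the layers `conv1`, `conv2`, `slim1`, `slim2` and the manual weight copying are assumptions for illustration), `prune.ln_structured` only masks filters and leaves shapes unchanged, while physically removing the filters requires shrinking the next layer's input channels as well.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
conv1 = nn.Conv2d(3, 8, kernel_size=3, padding=1)
conv2 = nn.Conv2d(8, 16, kernel_size=3, padding=1)

# Mask the 4 filters (dim=0) with the smallest L2 norm; the shape is unchanged.
prune.ln_structured(conv1, name="weight", amount=0.5, n=2, dim=0)
filter_norms = conv1.weight.detach().flatten(1).norm(dim=1)
pruned = filter_norms == 0
keep = torch.nonzero(~pruned).flatten()  # indices of surviving filters

# The call above masks only the weights, so zero the matching biases too.
with torch.no_grad():
    conv1.bias[pruned] = 0.0

# Physically smaller layers: fewer output channels in the first layer,
# and therefore fewer input channels in the layer that follows it.
slim1 = nn.Conv2d(3, len(keep), kernel_size=3, padding=1)
slim2 = nn.Conv2d(len(keep), 16, kernel_size=3, padding=1)
with torch.no_grad():
    slim1.weight.copy_(conv1.weight[keep])
    slim1.bias.copy_(conv1.bias[keep])
    slim2.weight.copy_(conv2.weight[:, keep])
    slim2.bias.copy_(conv2.bias)

    x = torch.randn(1, 3, 8, 8)
    full_out = conv2(torch.relu(conv1(x)))
    slim_out = slim2(torch.relu(slim1(x)))

print(conv1.weight.shape, slim1.weight.shape)
print(torch.allclose(full_out, slim_out, atol=1e-5))
```

The masked and the physically slimmed networks produce the same output here, but only the slimmed one performs fewer multiplications.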

Using the model previously trained on CINIC-10, perform structured pruning only on the convolutional layers of the DNN.

In [22]:
################################################################################
# TODO: Perform structured Pruning over the trained model using 3 different
# pruning percentages.
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
import torch.nn as nn
import torch.nn.utils.prune as prune

conv_layers = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
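# NOTE: ln_structured masks whole filters (dim=0) ranked by L2 norm (n=2); the
# filters are zeroed, but the layer shapes stay the same. Repeated calls
# compound: each call masks `amount` of the filters that are still unmasked.
for pct in range(10, 90, 10):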
    for m in conv_layers:
        prune.ln_structured(m, name="weight", amount=pct/100, n=2, dim=0)
    print(f"\n{int(pct)}% filters removed")
    test_model(model, test_data, mode=mode)
# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
################################################################################
#                              END OF YOUR CODE                                #
################################################################################


10% filters removed
Test Acc: 0.3614

20% filters removed
Test Acc: 0.1842

30% filters removed
Test Acc: 0.1950

40% filters removed
Test Acc: 0.1730

50% filters removed
Test Acc: 0.1780

60% filters removed
Test Acc: 0.1669

70% filters removed
Test Acc: 0.1653

80% filters removed
Test Acc: 0.1696


## Inline Question 2:

What is the difference between performing Structured Pruning and Dropout?
Why would it be beneficial to perform both techniques when developing a Neural Network?


## Answer:

Structured pruning removes whole structures (such as filters or neurons), which can change the architecture of the network, and the removal persists at inference time. Dropout only zeroes randomly chosen activations during training; at inference the model is still dense. Using both is beneficial because they provide independent benefits: dropout improves generalization, and pruning makes the model smaller without sacrificing much performance.
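
The contrast can be observed directly in PyTorch. This is a minimal sketch on toy inputs (the tensor sizes and variable names are assumptions for illustration): dropout zeroes activations only in training mode, while a pruning mask on the weights stays in place in evaluation mode.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
x = torch.ones(1, 1000)

# Dropout: random zeros (and rescaling by 1 / (1 - p)) in training mode only.
drop = nn.Dropout(p=0.5)
drop.train()
train_out = drop(x)
drop.eval()
eval_out = drop(x)  # identity at inference time

print(f"zeros in train mode: {int((train_out == 0).sum())}")
print(f"zeros in eval mode:  {int((eval_out == 0).sum())}")

# Pruning: the mask is part of the layer and also applies in eval mode.
layer = nn.Linear(1000, 4)
prune.l1_unstructured(layer, name="weight", amount=0.5)
layer.eval()
zeros_eval = int((layer.weight == 0).sum())
print(f"pruned weights in eval mode: {zeros_eval}")
```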
