In [7]:
# this mounts your Google Drive to the Colab VM.
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

# enter the foldername in your Drive where you have saved the unzipped
# assignment folder, e.g. 'ece697ls/assignments/assignment3/'
FOLDERNAME = None
assert FOLDERNAME is not None, "[!] Enter the foldername."

# now that we've mounted your Drive, this ensures that
# the Python interpreter of the Colab VM can load
# python files from within it.
import sys
sys.path.append('/content/drive/My Drive/{}'.format(FOLDERNAME))

%cd /content

ModuleNotFoundError: No module named 'google.colab'

# Pruning

During training, a neural network learns a set of parameters for the classification task. These parameters are costly to store and compute with, and they do not all contribute equally to the result.

Wouldn't it be great if we could reduce our resource usage by dropping the less important values? This is where pruning comes into play.

Pruning is a technique that removes parameters or whole structures from a model to increase sparsity and decrease overall model size, similar to cutting leaves or branches from bushes and trees. This process can reduce memory consumption with minimal loss of accuracy. Moreover, pruning the network may also provide a speedup, since fewer operations are performed.
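
As a minimal sketch of the idea, assuming plain magnitude-based selection, the snippet below zeroes the smallest-magnitude half of a small tensor and measures the resulting sparsity. The helper names `magnitude_prune` and `sparsity` are hypothetical and are not part of the assignment code.

```python
import torch


def magnitude_prune(weights: torch.Tensor, fraction: float) -> torch.Tensor:
    """Return a copy of `weights` with the smallest-magnitude `fraction` zeroed."""
    k = int(round(fraction * weights.numel()))
    if k == 0:
        return weights.clone()
    # The k-th smallest absolute value acts as the pruning threshold.
    # Ties at the threshold are pruned too, so slightly more than k
    # entries can be zeroed when values repeat.
    threshold = weights.abs().flatten().kthvalue(k).values
    mask = weights.abs() > threshold
    return weights * mask


def sparsity(weights: torch.Tensor) -> float:
    """Fraction of entries that are exactly zero."""
    return (weights == 0).sum().item() / weights.numel()


w = torch.tensor([[0.9, -0.05, 0.4],
                  [-0.01, 0.7, -0.3]])
pruned = magnitude_prune(w, 0.5)
print(pruned)
print(f"sparsity: {sparsity(pruned):.2f}")
```

The three entries with the largest magnitude (0.9, 0.7, 0.4) survive, and the other three are set to zero.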

Pruning can be performed at the end of a training epoch or after training is complete. Experimenting to find out which way works best is part of the fun!
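
The sketch below illustrates the "prune at the end of each epoch" variant on a hypothetical toy problem (random data, a single linear layer, one full-batch update standing in for an epoch). It relies on `torch.nn.utils.prune.l1_unstructured`, which, when called repeatedly on the same parameter, prunes a fraction of the weights that are still unpruned. The model, data, and hyperparameters are illustrative assumptions, not the assignment's setup.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

# Hypothetical toy problem: random inputs and labels, one linear layer.
X = torch.randn(256, 20)
y = torch.randint(0, 10, (256,))
layer = nn.Linear(20, 10)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):
    # One full-batch update stands in for a whole epoch here.
    optimizer.zero_grad()
    loss = loss_fn(layer(X), y)
    loss.backward()
    optimizer.step()

    # End of epoch: prune 20% of the weights that are still unpruned.
    prune.l1_unstructured(layer, name="weight", amount=0.2)
    epoch_sparsity = (layer.weight == 0).float().mean().item()
    print(f"epoch {epoch}: sparsity = {epoch_sparsity:.2f}")

# Fold the accumulated mask into the weight tensor once training is done.
prune.remove(layer, "weight")
final_sparsity = (layer.weight == 0).float().mean().item()
print(f"final sparsity = {final_sparsity:.2f}")
```

Pruning only after training would reduce this to a single `prune.l1_unstructured` call following the loop. Which schedule keeps more accuracy depends on the model and data.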

In [5]:
from ece662.pruning_helper import test_model, load_model
from ece662.data_utils import get_CINIC10_data
import os

Below we load a pre-trained model for you to work on. If you prefer, you can save your own model from the previous TensorFlow/PyTorch task and load it here.

In [17]:
# Load pre-trained model and test data (running locally, not on Colab)

data = get_CINIC10_data()
mode = 'torch'  # torch or tensorflow

test_data = [data['X_test'], data['y_test']]

# Use local path instead of Google Drive path
path = os.path.join('ece662', 'models', f'{mode}.model')
print(f"Loading model from: {path}")

model = load_model(path, mode=mode)
print("Model loaded successfully!")
test_model(model, test_data, mode=mode)

Loading model from: ece662\models\torch.model
Model loaded successfully!
Test Acc: 0.5051


## Unstructured Pruning

Unstructured pruning usually refers to pruning individual weights in a neural network. The general idea is to select a set of weights according to a policy and set them to zero.

Common policies are random weight selection or selecting the smallest weights by magnitude.
Unstructured pruning can be applied to one or multiple layers within the same network.
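
The two policies can be compared on a single layer. This is a minimal sketch that assumes an illustrative 8-by-4 linear layer; it uses `prune.random_unstructured` and `prune.l1_unstructured` from `torch.nn.utils.prune`.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

# Two layers with identical weights, one per policy (illustrative sizes).
layer_random = nn.Linear(8, 4)
layer_l1 = nn.Linear(8, 4)
layer_l1.load_state_dict(layer_random.state_dict())

# Policy 1: zero a random 25% of the weights.
prune.random_unstructured(layer_random, name="weight", amount=0.25)
# Policy 2: zero the 25% of weights with the smallest absolute value.
prune.l1_unstructured(layer_l1, name="weight", amount=0.25)

# Both policies remove the same number of weights (8 of 32) ...
num_pruned_random = int((layer_random.weight_mask == 0).sum())
num_pruned_l1 = int((layer_l1.weight_mask == 0).sum())
print(num_pruned_random, num_pruned_l1)

# ... but only the magnitude policy guarantees that every surviving
# weight is at least as large as every removed one.
magnitudes = layer_l1.weight_orig.detach().abs()
kept = magnitudes[layer_l1.weight_mask == 1]
removed = magnitudes[layer_l1.weight_mask == 0]
print(bool(kept.min() >= removed.max()))
```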

Although in theory unstructured pruning should decrease the number of operations performed during execution, the framework or hardware must explicitly support skipping the zeroed weights; otherwise the computation is simply carried out on the zeros.
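
A small sketch of this point, assuming an illustrative 100-by-100 linear layer: after pruning, the dense weight tensor still stores every element, zeros included. Converting it to PyTorch's sparse COO format stores only the non-zero entries and gives the same matrix product. Whether the sparse format is actually faster depends on the hardware, the kernels, and the sparsity level, so no speedup is claimed here.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

layer = nn.Linear(100, 100)  # illustrative size
prune.l1_unstructured(layer, name="weight", amount=0.8)
prune.remove(layer, "weight")  # fold the mask into the weight tensor

w = layer.weight.detach()
# The dense tensor keeps its full shape: all 10,000 entries are stored
# and multiplied, even though most of them are zero.
dense_elements = w.numel()

# An explicit sparse representation stores only the surviving entries.
sparse_w = w.to_sparse().coalesce()
stored_values = sparse_w.values().numel()
print(dense_elements, stored_values)

# Both representations compute the same product.
x = torch.randn(100, 4)
same_result = torch.allclose(torch.sparse.mm(sparse_w, x), w @ x, atol=1e-5)
print(same_result)
```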

### Perform Pruning

Using the model trained in the previous step with PyTorch, perform unstructured pruning on the model's weights by removing the x% of weights with the smallest magnitude.

*   Increase global pruning in steps of 10% until a total of 80% of the weights are pruned
*   Run inference after each pruning step and observe the impact on accuracy.


Note: The percentages refer to the entire model, not to each layer.
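
To see why the distinction matters, the sketch below prunes two hypothetical layers globally. One layer's weights are deliberately scaled down, so the global ranking removes almost all of them while barely touching the other layer, even though the overall sparsity is exactly 50%. The layer sizes and the scaling factor are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

small = nn.Linear(10, 10)
large = nn.Linear(10, 10)
with torch.no_grad():
    small.weight.mul_(0.01)  # make this layer's magnitudes much smaller

# Rank all 200 weights together and remove the smallest 50%.
prune.global_unstructured(
    [(small, "weight"), (large, "weight")],
    pruning_method=prune.L1Unstructured,
    amount=0.5,
)


def layer_sparsity(module: nn.Module) -> float:
    return (module.weight == 0).float().mean().item()


small_sparsity = layer_sparsity(small)
large_sparsity = layer_sparsity(large)
total_zeros = int((small.weight == 0).sum() + (large.weight == 0).sum())
print(f"small: {small_sparsity:.2f}, large: {large_sparsity:.2f}, "
      f"zeros overall: {total_zeros} of 200")
```

A per-layer policy (calling `prune.l1_unstructured` on each layer with the same `amount`) would instead remove 50% from each layer regardless of scale.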



In [18]:
################################################################################
# TODO: Perform unstructured pruning over the trained model using 3 different
# pruning percentages.
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

import torch
import torch.nn.utils.prune as prune
import copy

# Create a copy of the original model to preserve it
original_model = copy.deepcopy(model)

# Define pruning percentages to test (10% increments up to 80%)
pruning_percentages = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

print("Starting Unstructured Pruning Experiment")
print("=" * 50)

# Test original model accuracy first
print("Original Model Performance:")
test_model(model, test_data, mode=mode)
print()

# Store results for analysis
results = []

for prune_amount in pruning_percentages:
    print(f"Pruning {prune_amount*100:.0f}% of weights globally...")
    
    # Reset model to original state
    model = copy.deepcopy(original_model)
    
    # Get all parameters to be pruned (conv and linear layers)
    parameters_to_prune = []
    
    for name, module in model.named_modules():
        if isinstance(module, (torch.nn.Conv2d, torch.nn.Linear)):
            parameters_to_prune.append((module, 'weight'))
    
    # Apply global unstructured pruning (removes smallest weights across all layers)
    prune.global_unstructured(
        parameters_to_prune,
        pruning_method=prune.L1Unstructured,
        amount=prune_amount,
    )
    
    # Make pruning permanent (remove masks and actually set weights to zero)
    for module, param_name in parameters_to_prune:
        prune.remove(module, param_name)
    
    # Calculate sparsity (percentage of zero weights)
    total_params = 0
    zero_params = 0
    
    for name, param in model.named_parameters():
        if 'weight' in name:
            total_params += param.numel()
            zero_params += (param == 0).sum().item()
    
    actual_sparsity = zero_params / total_params
    
    print(f"  Actual sparsity achieved: {actual_sparsity*100:.2f}%")
    print("  Performance after pruning:")
    
    # Test pruned model
    test_model(model, test_data, mode=mode)
    print("-" * 30)
    
    # Store results
    results.append({
        'target_prune': prune_amount * 100,
        'actual_sparsity': actual_sparsity * 100
    })

print("\nPruning Results Summary:")
print("Target Pruning % | Actual Sparsity %")
print("-" * 35)
for result in results:
    print(f"     {result['target_prune']:6.0f}%      |      {result['actual_sparsity']:6.2f}%")

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
################################################################################
#                              END OF YOUR CODE                                #
################################################################################

Starting Unstructured Pruning Experiment
Original Model Performance:
Test Acc: 0.5089

Pruning 10% of weights globally...
  Actual sparsity achieved: 10.00%
  Performance after pruning:
Test Acc: 0.5046
------------------------------
Pruning 20% of weights globally...
  Actual sparsity achieved: 20.00%
  Performance after pruning:
Test Acc: 0.5052
------------------------------
Pruning 30% of weights globally...
  Actual sparsity achieved: 30.00%
  Performance after pruning:
Test Acc: 0.5026
------------------------------
Pruning 40% of weights globally...
  Actual sparsity achieved: 40.00%
  Performance after pruning:
Test Acc: 0.5030
------------------------------
Pruning 50% of weights globally...
  Actual sparsity achieved: 50.00%
  Performance after pruning:
Test Acc: 0.4828
------------------------------
Pruning 60% of weights globally...
  Actual sparsity achieved: 60.00%
  Performance after pruning:
Test Acc: 0.4906
------------------------------
Pruning 70% of weights globally

## Inline Question 1:

What happened to the accuracy as the pruning percentage increased?
Why was that the case?


## Answer: 

As the percentage of pruning increased, the model's accuracy generally decreased, but not uniformly:

**Observed Results:**
- Original Model: 50.89% accuracy
- 10%-40% pruning: Minimal accuracy loss (50.46% to 50.30%), showing the model is robust to removing small weights
- 50%-60% pruning: More noticeable degradation (48.28% to 49.06%)
- 70%-80% pruning: Significant accuracy drop (47.21% to 40.93%)

**Why this happened:**
1. **Redundancy in Neural Networks**: At low pruning percentages (10%-40%), the smallest weights contribute minimally to the model's predictive power. Removing them has little impact because the network has learned redundant representations.

2. **Weight Magnitude Correlation**: The L1 unstructured pruning removes weights with smallest absolute values first. These small weights often represent less important connections, so their removal doesn't significantly hurt performance initially.

3. **Critical Threshold**: Beyond 50% pruning, we start removing weights that are more critical for the network's function. The network loses important pathways for information flow, leading to degraded performance.

4. **Network Capacity**: At very high pruning levels (70%-80%), the network loses too much of its representational capacity, causing substantial accuracy drops as essential learned features are lost.

## Structured Pruning

Structured pruning removes larger groups of network parameters at once. Instead of removing individual weights, it is common to remove entire neurons, filters, or channels.

For example, in convolutional layers, removing filters can improve performance because it greatly reduces the amount of computation. However, removing filters changes the layer's output dimensions, and that change carries over to the layers that consume its output. Therefore, when performing structured pruning, one must always be aware of which parameters will be affected.
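
The dimension issue can be made concrete with a sketch that physically removes filters. It assumes two stacked convolutions with a ReLU in between (the layer shapes are illustrative) and keeps the 24 filters of the first layer with the largest L2 norm. Note that `torch.nn.utils.prune.ln_structured` only masks filters with zeros and leaves tensor shapes unchanged; the slicing below is a separate, hand-written step and the names `slim1` and `slim2` are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

conv1 = nn.Conv2d(3, 32, kernel_size=5)
conv2 = nn.Conv2d(32, 32, kernel_size=3)

# Rank conv1 filters by L2 norm and keep the 24 strongest (drop 25%).
norms = conv1.weight.detach().flatten(1).norm(p=2, dim=1)
num_keep = 24
keep = norms.topk(num_keep).indices.sort().values

# Removing output filters of conv1 also removes input channels of conv2.
slim1 = nn.Conv2d(3, num_keep, kernel_size=5)
slim2 = nn.Conv2d(num_keep, 32, kernel_size=3)
with torch.no_grad():
    slim1.weight.copy_(conv1.weight[keep])
    slim1.bias.copy_(conv1.bias[keep])
    slim2.weight.copy_(conv2.weight[:, keep])  # slice the input-channel dim
    slim2.bias.copy_(conv2.bias)

x = torch.randn(1, 3, 32, 32)
out = slim2(torch.relu(slim1(x)))
print(slim1.weight.shape, slim2.weight.shape, out.shape)

# Reference: zero the dropped filters (weights and biases) in the
# original layers. The slim network should produce the same output.
dropped = torch.ones(32, dtype=torch.bool)
dropped[keep] = False
with torch.no_grad():
    conv1.weight[dropped] = 0
    conv1.bias[dropped] = 0
ref = conv2(torch.relu(conv1(x)))
outputs_match = torch.allclose(out, ref, atol=1e-4)
print(outputs_match)

params_before = sum(p.numel() for m in (conv1, conv2) for p in m.parameters())
params_after = sum(p.numel() for m in (slim1, slim2) for p in m.parameters())
print(params_before, params_after)
```

The reference check zeroes the biases of the dropped filters as well; a mask on the weights alone would leave those biases in place, and the masked and slimmed networks would then differ.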

Using the model previously trained on CINIC-10, perform structured pruning only on the convolutional layers of the DNN.

In [19]:
################################################################################
# TODO: Perform structured pruning over the trained model using 3 different
# pruning percentages.
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

import torch
import torch.nn.utils.prune as prune
import copy

# Reset to original model for structured pruning
model = copy.deepcopy(original_model)

print("Starting Structured Pruning Experiment on Convolutional Layers")
print("=" * 60)

# Test original model accuracy first
print("Original Model Performance:")
test_model(model, test_data, mode=mode)
print()

# Print model architecture to understand conv layers
print("Model Architecture:")
for name, module in model.named_modules():
    if isinstance(module, torch.nn.Conv2d):
        print(f"  {name}: {module}")
print()

# Define structured pruning percentages for conv layers
structured_pruning_percentages = [0.25, 0.50, 0.75]  # 25%, 50%, 75%

structured_results = []

for prune_amount in structured_pruning_percentages:
    print(f"Structured Pruning: Removing {prune_amount*100:.0f}% of filters from Conv layers...")
    
    # Reset model to original state
    model = copy.deepcopy(original_model)
    
    # Get all convolutional layers for structured pruning
    conv_layers_to_prune = []
    
    for name, module in model.named_modules():
        if isinstance(module, torch.nn.Conv2d):
            conv_layers_to_prune.append((module, 'weight'))
    
    print(f"  Found {len(conv_layers_to_prune)} convolutional layers to prune")
    
    # Apply structured pruning to each conv layer individually
    # We use L2 structured pruning which removes entire filters (channels)
    for module, param_name in conv_layers_to_prune:
        # Calculate number of filters to remove
        num_filters = module.out_channels
        num_to_remove = int(num_filters * prune_amount)
        
        if num_to_remove > 0:
            # Apply structured pruning (removes entire filters)
            prune.ln_structured(
                module, 
                name=param_name, 
                amount=num_to_remove, 
                n=2,  # L2 norm
                dim=0  # Remove along output channel dimension (entire filters)
            )
    
    # Calculate structured sparsity (entire filters removed)
    total_filters = 0
    pruned_filters = 0
    
    for name, module in model.named_modules():
        if isinstance(module, torch.nn.Conv2d):
            if hasattr(module, 'weight_mask'):
                # Count filters where all weights in the filter are zero
                weight_mask = module.weight_mask
                filter_sums = weight_mask.sum(dim=(1, 2, 3))  # Sum over spatial and input channel dims
                total_filters += weight_mask.shape[0]
                pruned_filters += (filter_sums == 0).sum().item()
            else:
                total_filters += module.out_channels
    
    structured_sparsity = pruned_filters / total_filters if total_filters > 0 else 0
    
    print(f"  Structured sparsity achieved: {structured_sparsity*100:.2f}% (filters removed)")
    print("  Performance after structured pruning:")
    
    # Test pruned model
    test_model(model, test_data, mode=mode)
    print("-" * 40)
    
    # Store results
    structured_results.append({
        'target_prune': prune_amount * 100,
        'structured_sparsity': structured_sparsity * 100
    })

print("\nStructured Pruning Results Summary:")
print("Target Pruning % | Structured Sparsity % (Filters)")
print("-" * 50)
for result in structured_results:
    print(f"     {result['target_prune']:6.0f}%      |        {result['structured_sparsity']:6.2f}%")

# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
################################################################################
#                              END OF YOUR CODE                                #
################################################################################

Starting Structured Pruning Experiment on Convolutional Layers
Original Model Performance:
Test Acc: 0.5080

Model Architecture:
  conv1: Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1))
  conv2: Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1))

Structured Pruning: Removing 25% of filters from Conv layers...
  Found 2 convolutional layers to prune
  Structured sparsity achieved: 25.00% (filters removed)
  Performance after structured pruning:
Test Acc: 0.1866
----------------------------------------
Structured Pruning: Removing 50% of filters from Conv layers...
  Found 2 convolutional layers to prune
  Structured sparsity achieved: 5

## Inline Question 2:

What is the difference between performing structured pruning and dropout?
Why would it be beneficial to perform both techniques when developing a Neural Network?


## Answer: 

**Key Differences between Structured Pruning and Dropout:**

1. **Timing and Purpose**:
   - **Dropout**: Applied during training as a regularization technique to prevent overfitting
   - **Structured Pruning**: Applied at the end of a training epoch or after training to reduce model size and computational requirements

2. **Mechanism**:
   - **Dropout**: Randomly sets neurons to zero during training (temporary, stochastic)
   - **Structured Pruning**: Permanently removes entire structures (filters, channels, neurons) based on importance metrics

3. **Permanence**:
   - **Dropout**: Temporary - neurons are only disabled during training, all are active during inference
   - **Structured Pruning**: Permanent - removed structures are gone forever, reducing model size

4. **Impact on Architecture**:
   - **Dropout**: No change to model architecture or size
   - **Structured Pruning**: Actually changes model architecture and reduces parameters
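
The differences listed above can be observed directly. This is a minimal sketch with an illustrative input and layer size: dropout zeroes different activations on each training-mode call and does nothing in evaluation mode, while a structured pruning mask stays fixed in both modes.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

x = torch.ones(1, 8)
dropout = nn.Dropout(p=0.5)

# Training mode: a fresh random mask on every call. Surviving
# activations are scaled by 1 / (1 - p) = 2.
dropout.train()
a = dropout(x)
b = dropout(x)
print(a, b)

# Evaluation mode: dropout is the identity, every unit is active.
dropout.eval()
c = dropout(x)
print(torch.equal(c, x))

# Structured pruning: the same two output neurons (rows of the weight
# matrix) stay zeroed regardless of the mode.
layer = nn.Linear(8, 4)
prune.ln_structured(layer, name="weight", amount=0.5, n=2, dim=0)
layer.train()
mask_train = layer.weight_mask.clone()
layer.eval()
mask_eval = layer.weight_mask.clone()
zero_rows = int((layer.weight.abs().sum(dim=1) == 0).sum())
print(torch.equal(mask_train, mask_eval), zero_rows)
```

With `torch.nn.utils.prune` the pruned rows are masked rather than physically removed, so the parameter tensor keeps its shape; shrinking the architecture requires rebuilding the layer with fewer units.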

**Benefits of Using Both Techniques:**

1. **Complementary Goals**: Dropout helps learn robust features during training, while pruning optimizes the final model for deployment

2. **Better Generalization**: Dropout forces the network to not rely on specific neurons, creating redundancy that makes later pruning less harmful

3. **Optimal Resource Usage**: Dropout does not rank components by importance, but by discouraging reliance on any single unit it can make the network more tolerant of the units that pruning later removes

4. **Deployment Efficiency**: Training with dropout creates networks that are naturally more resilient to structural changes from pruning
