# PyTorch Spatial SVD and Channel Pruning Example 

This notebook demonstrates how to use AIMET to apply Spatial SVD and Channel Pruning to a given model. The general procedure is to specify the parameters that determine the manner of compression, compress the model with AIMET's ModelCompressor, and then fine-tune it to recover lost accuracy.

Let's begin with a brief overview of the techniques used:

### Spatial SVD
Recall that a convolutional layer is defined by four dimensions (m, n, h, w), where m and n are the number of input and output channels, respectively, and h and w are the height and width of the convolution kernel.
Spatial SVD (where SVD stands for Singular Value Decomposition) splits this convolutional layer into two layers of size (m, k, h, 1) and (k, n, 1, w), where k is a parameter known as the rank. The weights of the new layers are chosen so that the output of the two layers in succession is as similar as possible to the output of the original layer. A smaller k gives a cheaper pair of layers but a coarser approximation.
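
To make the decomposition concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not AIMET's implementation: the weight uses the PyTorch layout (out, in, height, width), so n comes before m, and the helper names `spatial_svd` and `reconstruct` are made up for this example.

```python
import numpy as np

def spatial_svd(weight, rank):
    """Split an (n, m, h, w) kernel into an h x 1 layer and a 1 x w layer of the given rank."""
    n, m, h, w = weight.shape
    # Rows index (input channel, kernel row); columns index (output channel, kernel column).
    matrix = weight.transpose(1, 2, 0, 3).reshape(m * h, n * w)
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    root = np.sqrt(s[:rank])  # share each singular value between the two factors
    vertical = (u[:, :rank] * root).T.reshape(rank, m, h, 1)
    horizontal = (root[:, None] * vt[:rank]).reshape(rank, n, 1, w).transpose(1, 0, 2, 3)
    return vertical, horizontal

def reconstruct(vertical, horizontal):
    """Single kernel equivalent to applying the two layers in succession."""
    return np.einsum("kiy,okx->oiyx", vertical[..., 0], horizontal[:, :, 0, :])

rng = np.random.default_rng(0)
weight = rng.normal(size=(8, 4, 3, 3))  # n=8 outputs, m=4 inputs, 3x3 kernel

# At full rank, min(m*h, n*w) = 12, the two layers reproduce the original kernel.
full_v, full_h = spatial_svd(weight, rank=12)
print("full-rank max error:", np.abs(reconstruct(full_v, full_h) - weight).max())

# At a lower rank the pair is an approximation with fewer parameters.
low_v, low_h = spatial_svd(weight, rank=6)
print("rank-6 parameters:", low_v.size + low_h.size, "vs original:", weight.size)
```

The parameter count of the pair is m*k*h + k*n*w, compared with m*n*h*w for the original layer, which is why lowering k reduces cost.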


### Channel Pruning
Channel Pruning reduces the number of input channels in a convolutional layer. There are two steps involved:
1. Winnowing, which removes less informative channels, and
2. Weight reconstruction, which adjusts the remaining weights through linear regression so that the outputs of the pruned layer match the outputs of the original layer with minimal error.
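
The two steps can be sketched on a toy layer with NumPy. This is a simplified illustration, not AIMET's implementation: the layer is a 1x1 convolution written as a matrix, channels are ranked by L1 weight magnitude (an assumed selection heuristic), and all variable names are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1x1 convolution written as a matrix: 4 output channels, 6 input channels.
weight = rng.normal(size=(4, 6))
weight[:, [1, 4]] *= 0.01  # two input channels contribute very little

# Correlated input activations, one row per sampled spatial position.
samples = rng.normal(size=(500, 6)) @ rng.normal(size=(6, 6))
original_out = samples @ weight.T

# Step 1, winnowing: keep the input channels with the largest L1 weight magnitude.
num_keep = 4
magnitude = np.abs(weight).sum(axis=0)
keep = np.sort(np.argsort(magnitude)[-num_keep:])

# Step 2, weight reconstruction: least-squares fit of the kept channels
# to the outputs of the original layer.
solution, *_ = np.linalg.lstsq(samples[:, keep], original_out, rcond=None)
reconstructed = solution.T  # shape (4, num_keep)

naive_error = np.linalg.norm(samples[:, keep] @ weight[:, keep].T - original_out)
reconstructed_error = np.linalg.norm(samples[:, keep] @ reconstructed.T - original_out)
print("kept channels:", keep)
print("error without reconstruction:", naive_error)
print("error with reconstruction:", reconstructed_error)
```

Because least squares minimizes the squared output error, the reconstructed weights are never worse than simply dropping the pruned channels on the sampled data.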

### Steps: 
1. Instantiate Data Pipeline for evaluation
2. Load the pre-trained resnet18 PyTorch model and get the original floating point accuracy
3. Compress the model and fine-tune: 
    * 3.1. Compress using Spatial SVD and obtain resulting accuracy
    * 3.2. Fine-tune model after Spatial SVD
    * 3.3. Compress Spatial SVD compressed model using Channel Pruning and obtain resulting accuracy
    * 3.4. Fine-tune model after Channel Pruning

### What this notebook is not 
* This notebook is not designed to show state-of-the-art compression results. Parameters used in this example are chosen such that the example runs quickly.



## Imports
 The next three cells take care of all the imports necessary for this notebook.

In [None]:
import warnings
warnings.filterwarnings("ignore", ".*param.*")

import os
from decimal import Decimal  # used later for the target compression ratios
from typing import Tuple
from datetime import datetime
import torch
from torchvision.models import resnet18

In [None]:
# AIMET Imports for Compression
from aimet_torch.compress import ModelCompressor
from aimet_common.defs import CompressionScheme, CostMetric
from aimet_torch.defs import GreedySelectionParameters, SpatialSvdParameters, ChannelPruningParameters


In [None]:
# Imports needed for the Data Pipeline
from Examples.common import image_net_config
from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader
from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
from Examples.torch.utils.image_net_trainer import ImageNetTrainer

## Setting up Our Config Dictionary

The config dictionary holds the settings used throughout this notebook:

1. **dataset_dir:** Path to a directory containing the ImageNet dataset. This folder should contain the subfolders 'train' for the training dataset and 'val' for the validation dataset.
2. **use_cuda:** A boolean that indicates whether to run the model on a GPU.
3. **output_dir:** Path to a directory for logging and for saving the fine-tuned models.
4. **epochs**, **learning_rate**, **learning_rate_schedule:** Fine-tuning hyperparameters that are passed to the trainer.

To get a better understanding of when each parameter in the config dictionary is used, please read the code in the cells that reference it.

**Note:** You will have to replace the dataset_dir path with the path to your own imagenet/tinyimagenet dataset.
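
Before running the rest of the notebook, it can help to confirm that the dataset directory has the expected layout. The helper below is a hypothetical convenience function, not part of AIMET or the Examples package; it only checks for the 'train' and 'val' subfolders described above.

```python
import os

def missing_dataset_subdirs(dataset_dir, required=("train", "val")):
    """Return the required subfolder names that are absent from dataset_dir."""
    return [name for name in required
            if not os.path.isdir(os.path.join(dataset_dir, name))]

# Placeholder path: replace with the directory of your dataset.
missing = missing_dataset_subdirs("/path/to/dataset")
if missing:
    print("Missing subfolders:", missing)
```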


In [None]:
config = {'dataset_dir': "/path/to/dataset", # Replace with the directory of your dataset!
          'use_cuda': True,
          'output_dir': os.path.join("benchmark_output", "spatial_svd_"+datetime.now().strftime("%Y-%m-%d-%H-%M-%S")),
          'epochs': 1, # Typical epochs: 15
          'learning_rate': 1e-2, 
          'learning_rate_schedule': [5, 10]}

os.makedirs(config['output_dir'], exist_ok=True)

## 1. Instantiate Data Pipeline

The ImageNetDataPipeline class handles both evaluating and fine-tuning a model using the dataset directory. For more detail on how it works, see the relevant files under Examples/torch/utils.

The data pipeline class is simply a template for the user to follow. The methods for this class can be replaced by the user to fit their needs.

In [None]:
class ImageNetDataPipeline:
    """
    Provides APIs for model evaluation and fine-tuning.
    """
    
    def __init__(self, config):
        """
        :param config: a dictionary containing the values of necessary parameters.
        """
        self._config = config
        
    def data_loader(self):
        """
        :return: the evaluation data loader built by ImageNetDataLoader.
        """
        
        data_loader = ImageNetDataLoader(is_training=False, images_dir=self._config["dataset_dir"],
                                         image_size=image_net_config.dataset['image_size']).data_loader

        return data_loader
    
    def evaluate(self, model: torch.nn.Module, iterations: int=None, use_cuda: bool=False) -> float:
        """
        Given a torch model, evaluates its Top-1 accuracy on the dataset.
        :param model: the model to evaluate.
        :param iterations: the number of batches of the dataset to evaluate on.
        :param use_cuda: If True then use a GPU for inference.
        :return: the Top-1 accuracy of the model.
        """
        # Your code goes here instead of the example from below
        
        evaluator = ImageNetEvaluator(self._config['dataset_dir'], image_size=image_net_config.dataset['image_size'],
                                      batch_size=image_net_config.evaluation['batch_size'],
                                      num_workers=image_net_config.evaluation['num_workers'])
        
        return evaluator.evaluate(model, iterations, use_cuda)
    
    def finetune(self, model: torch.nn.Module, modifier: str=""):
        """
        Given a torch model, fine-tunes the model to improve its accuracy.
        :param model: the model to fine-tune.
        :param modifier: a string that is used to change the name of the path where the model will be saved.
        """
        # Your code goes here instead of the example from below
        
        trainer = ImageNetTrainer(self._config['dataset_dir'], image_size=image_net_config.dataset['image_size'],
                                  batch_size=image_net_config.train['batch_size'],
                                  num_workers=image_net_config.train['num_workers'])
        
        trainer.train(model, max_epochs=self._config['epochs'], learning_rate=self._config['learning_rate'],
                      learning_rate_schedule=self._config['learning_rate_schedule'], use_cuda=self._config['use_cuda'])
        
        torch.save(model, os.path.join(self._config['output_dir'], modifier+'finetuned_model'))

## 2. Load the Model, Initialize DataPipeline, Get Starting Accuracy 
The next cell initializes the model and the data pipeline for compression.
It then uses the pipeline to calculate the original floating point (FP32) accuracy of the model on the dataset provided.

In [None]:
data_pipeline = ImageNetDataPipeline(config)

# Input image shape: (batch_size, num_channels (RGB), image_height, image_width)
image_shape = (1, 3, 224, 224)

model = resnet18(pretrained=True)
if config['use_cuda']:
    if torch.cuda.is_available():
        model.to(torch.device('cuda'))
    else:
        raise Exception("use_cuda is True but cuda is unavailable")

accuracy = data_pipeline.evaluate(model, use_cuda=config['use_cuda'])
print("Original Model Accuracy: ", accuracy)

## 3. Compress the model and fine-tune

### 3.1. Compress the model using Spatial SVD 
The parameters related to the Spatial SVD compression are defined below and initialized in the next cell:

1. **ssvd_target_comp_ratio**: The desired compression ratio using Spatial SVD. This value is the cost of the compressed model as a fraction of the original model. To compress the model to 20% of its original size, use 0.2. This would compress the model by 80%. The value used here is 0.5 (50%).

2. **ssvd_num_comp_ratio_candidates**: The number of compression ratios tried by the API at each layer. The API tests multiple compression ratios per layer so that less-important layers are compressed more, in such a way that the overall compression ratio is equal to target_comp_ratio. A typical value is 10, which means that for each layer the API tries the ratios 0.1, 0.2, ... 1.0. This notebook uses 2 so that the example runs quickly.

3. **ssvd_cost_metric**: Determines in what way the model cost is measured. It can be either compute (mac) or space (memory).

4. **ssvd_num_eval_iterations**: The number of batches of data used to evaluate the model while it is being compressed. A typical value is 10, which speeds up compression compared with using the whole dataset. This notebook uses 1 so that the example runs quickly.

5. **ssvd_modules_to_ignore**: The layers that should be ignored during compression. The first layer is ignored to preserve the way the input interacts with the model; if there are other layers that should be ignored, they can be added to this list.
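
The relationship between these numbers can be sketched in a few lines. The helper names below are hypothetical, and the even spacing of the candidates is an assumption based on the description above rather than a statement about AIMET's internals.

```python
from decimal import Decimal

def candidate_ratios(num_candidates):
    """Evenly spaced ratios 1/n, 2/n, ..., 1 (assumed spacing)."""
    return [Decimal(i) / Decimal(num_candidates) for i in range(1, num_candidates + 1)]

def size_reduction(target_comp_ratio):
    """Fraction of the original cost removed when compressing to target_comp_ratio."""
    return Decimal(1) - target_comp_ratio

print(candidate_ratios(10))            # typical setting
print(candidate_ratios(2))             # quick setting used in this notebook
print(size_reduction(Decimal("0.2")))  # compressing to 20% removes 80%
```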

In [None]:
ssvd_target_comp_ratio = Decimal(0.5)

# ssvd_num_comp_ratio_candidates = 10     # Typical
ssvd_num_comp_ratio_candidates = 2

ssvd_cost_metric = CostMetric.mac

# ssvd_num_eval_iterations = 10     # Typical
ssvd_num_eval_iterations = 1

ssvd_modules_to_ignore = [model.conv1]

The next cell sets up the other parameters needed to perform this part of the compression.

There are two modes for choosing the per-layer compression ratios:

a. **Auto:**
This mode supports only the greedy selection scheme, where the best compression ratio for each layer is selected from a fixed list of candidates so that the model as a whole reaches the target ratio.

b. **Manual:**
In this mode, the compression ratio for each layer is specified by the user. A general rule of thumb is to use the ratios found by Auto mode as a starting point for this mode.
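
One way to picture greedy selection is the simplified sketch below. It is not AIMET's implementation: the function name, the input dictionaries, and the threshold search are assumptions made for illustration. Each layer is scored at each candidate ratio, and the accuracy threshold is relaxed until the overall cost meets the target.

```python
from decimal import Decimal

def greedy_select(eval_scores, layer_costs, target_comp_ratio):
    """Pick one ratio per layer (simplified illustration).

    eval_scores: {layer: {ratio: accuracy when only that layer is compressed to ratio}}
    layer_costs: {layer: cost, for example MACs, of the uncompressed layer}
    """
    total_cost = sum(layer_costs.values())
    # Try accuracy thresholds from strictest to most lenient.
    thresholds = sorted({score for scores in eval_scores.values() for score in scores.values()},
                        reverse=True)
    chosen = {}
    for threshold in thresholds:
        # For each layer, take the smallest ratio that still meets the threshold;
        # a layer with no acceptable ratio is left uncompressed (ratio 1).
        chosen = {
            layer: min((r for r, s in scores.items() if s >= threshold), default=Decimal(1))
            for layer, scores in eval_scores.items()
        }
        compressed_cost = sum(layer_costs[layer] * ratio for layer, ratio in chosen.items())
        if compressed_cost / total_cost <= target_comp_ratio:
            break
    return chosen

# Made-up scores: conv_a is sensitive to compression, conv_b is robust.
layer_costs = {"conv_a": 100, "conv_b": 300}
eval_scores = {
    "conv_a": {Decimal("0.5"): 0.60, Decimal("1"): 0.70},
    "conv_b": {Decimal("0.5"): 0.68, Decimal("1"): 0.70},
}
selected = greedy_select(eval_scores, layer_costs, Decimal("0.7"))
print(selected)
```

In this toy case the robust layer is compressed to 0.5 while the sensitive layer is left untouched, and the overall cost still lands below the target of 0.7.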


In [None]:

ssvd_greedy_params = GreedySelectionParameters(target_comp_ratio=ssvd_target_comp_ratio,
                                               num_comp_ratio_candidates=ssvd_num_comp_ratio_candidates)

ssvd_auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=ssvd_greedy_params,
                                                       modules_to_ignore=ssvd_modules_to_ignore)

ssvd_params = SpatialSvdParameters(SpatialSvdParameters.Mode.auto, ssvd_auto_params)

ssvd_scheme = CompressionScheme.spatial_svd

Finally, the model is compressed using AIMET's ModelCompressor with the parameters specified above. This returns both the compressed model and the relevant compression statistics. The compressed model is then evaluated on the dataset. Note that the ModelCompressor evaluates the model during compression using the same evaluate function that is in our data pipeline.

In [None]:
ssvd_compressed_model, ssvd_comp_stats = ModelCompressor.compress_model(model=model,
                                                                        eval_callback=data_pipeline.evaluate,
                                                                        eval_iterations=ssvd_num_eval_iterations,
                                                                        input_shape=image_shape,
                                                                        compress_scheme=ssvd_scheme,
                                                                        cost_metric=ssvd_cost_metric,
                                                                        parameters=ssvd_params)

print(ssvd_comp_stats)

ssvd_comp_accuracy = data_pipeline.evaluate(ssvd_compressed_model, use_cuda=config['use_cuda'])
print("Accuracy of Spatial SVD compressed Model: ", ssvd_comp_accuracy)

### 3.2. Fine-tune model after Spatial SVD 

After the model is compressed through Spatial SVD, the model is fine-tuned, evaluated and saved. It is customary to do two rounds of fine-tuning so that model accuracy is preserved.

In [None]:
data_pipeline.finetune(ssvd_compressed_model)

ssvd_finetuned_accuracy = data_pipeline.evaluate(ssvd_compressed_model, use_cuda=config['use_cuda'])
print("Spatial SVD compressed Model Accuracy after fine-tuning: ", ssvd_finetuned_accuracy)

### 3.3. Compress Spatial SVD compressed model using Channel Pruning

The fine-tuned model, already compressed with Spatial SVD, is now compressed further using Channel Pruning. The next two cells specify the parameters for compression using Channel Pruning:

1. **cp_target_comp_ratio**: The desired compression ratio using Channel Pruning. This value is the cost of the compressed model as a fraction of the model passed in. To compress the model to 20% of its original size, use 0.2. This would compress the model by 80%. The value used here is 0.66 (66% of its original size).

2. **cp_num_comp_ratio_candidates**: The number of compression ratios tried by the API at each layer. The API tests multiple compression ratios per layer so that less-important layers are compressed more, in such a way that the overall compression ratio is equal to target_comp_ratio. A typical value is 10, which means that for each layer the API tries the ratios 0.1, 0.2, ... 1.0. This notebook uses 2 so that the example runs quickly.

3. **cp_cost_metric**: Determines in what way the model cost is measured. It can be either compute (mac) or space (memory).

4. **cp_num_eval_iterations**: The number of batches of data used to evaluate the model while it is being compressed. A typical value is 10, which speeds up compression compared with using the whole dataset. This notebook uses 1 so that the example runs quickly. More details are available in the AIMET API documentation.

5. **cp_modules_to_ignore**: The layers that should be ignored during compression. The first layer is ignored to preserve the way the input interacts with the model; if there are other layers that should be ignored, add them to the list.

6. **num_reconstruction_samples**: During the last stage of Channel Pruning, the Compression API tries to map the outputs of the pruned model with that of the original model through linear regression, and uses this attempt to change the weights in the pruned layer. The regression is done with this many random samples. This should generally be in the 100s.

In [None]:
cp_target_comp_ratio = Decimal(0.66)

# cp_num_comp_ratio_candidates = 10     # Typical
cp_num_comp_ratio_candidates = 2

cp_cost_metric = CostMetric.mac

# cp_num_eval_iterations = 10     # Typical
cp_num_eval_iterations = 1

cp_modules_to_ignore = [ssvd_compressed_model.conv1]

num_reconstruction_samples = 500

As with Spatial SVD, there are two modes for choosing the per-layer compression ratios, Auto and Manual, and they have the same meanings as described in the Spatial SVD section above.


In [None]:

data_loader = data_pipeline.data_loader()

cp_greedy_params = GreedySelectionParameters(target_comp_ratio=cp_target_comp_ratio,
                                             num_comp_ratio_candidates=cp_num_comp_ratio_candidates)



cp_mode = ChannelPruningParameters.Mode.auto

cp_auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=cp_greedy_params, 
                                                         modules_to_ignore=cp_modules_to_ignore)

cp_params = ChannelPruningParameters(data_loader=data_loader,
                                     num_reconstruction_samples=num_reconstruction_samples,
                                     allow_custom_downsample_ops=True,
                                     mode=cp_mode,
                                     params=cp_auto_params)

cp_scheme = CompressionScheme.channel_pruning

Again, we call ModelCompressor to perform the actual model compression, and then evaluate the compressed model:

In [None]:
ssvd_cp_compressed_model, cp_comp_stats = ModelCompressor.compress_model(model=ssvd_compressed_model,
                                                                         eval_callback=data_pipeline.evaluate,
                                                                         eval_iterations=cp_num_eval_iterations,
                                                                         input_shape=image_shape,
                                                                         compress_scheme=cp_scheme,
                                                                         cost_metric=cp_cost_metric,
                                                                         parameters=cp_params)

print(cp_comp_stats)

ssvd_cp_comp_accuracy = data_pipeline.evaluate(ssvd_cp_compressed_model, use_cuda=config['use_cuda'])
print("Accuracy of Model compressed using Spatial SVD and Channel Pruning: ", ssvd_cp_comp_accuracy)

### 3.4. Fine-tune model after Channel Pruning 

After the model is compressed through Channel Pruning, the model is fine-tuned, then evaluated and saved.

In [None]:
data_pipeline.finetune(ssvd_cp_compressed_model, modifier="channel_pruned_")

# Evaluate the model that was channel pruned and fine-tuned above
ssvd_cp_finetuned_accuracy = data_pipeline.evaluate(ssvd_cp_compressed_model, use_cuda=config['use_cuda'])
print("Accuracy of Model compressed using Spatial SVD and Channel Pruning, after fine-tuning: ", ssvd_cp_finetuned_accuracy)

This example illustrated how AIMET Spatial SVD and Channel Pruning can be used together to achieve model compression.

To use AIMET for your specific needs, replace the model with your model and replace the Data pipeline with your data pipeline. This will provide you a quick starting point.

As indicated above, some parameters have been chosen in a way to run the example faster.