# Torch Spatial SVD Notebook

This notebook shows how to compress a model with Spatial SVD using the AIMET library. The general procedure is to specify parameters that determine how the model is compressed, compress the model with AIMET's ModelCompressor, and then finetune the compressed model to recover lost accuracy.

We now present an overview of the technique. Recall that in any model, a convolutional layer is defined by four dimensions (m, n, h, w), where m and n are the number of input and output channels, respectively; and h and w are the height and width of the convolutional kernel. Spatial SVD (where SVD stands for Singular Value Decomposition) seeks to split this convolutional layer into two smaller layers of size (m, k, h, 1) and (k, n, 1, w), where k is a parameter that is known as the rank. The weights of the new layers are chosen so that the outputs of the two layers in succession are as similar as possible to the output of the original layer.
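
To make the size reduction concrete, the following minimal sketch counts the weights of a convolutional layer before and after the split described above. The function names and the example layer (64 input channels, 64 output channels, 3x3 kernel) are illustrative assumptions and are not part of AIMET; biases are ignored.

```python
def conv_weight_count(m: int, n: int, h: int, w: int) -> int:
    """Number of weights in a convolution of size (m, n, h, w)."""
    return m * n * h * w


def spatial_svd_weight_count(m: int, n: int, h: int, w: int, k: int) -> int:
    """Number of weights in the two layers (m, k, h, 1) and (k, n, 1, w)."""
    return m * k * h * 1 + k * n * 1 * w


original = conv_weight_count(64, 64, 3, 3)
for rank in (16, 32, 64):
    split = spatial_svd_weight_count(64, 64, 3, 3, rank)
    print(f"rank={rank}: {split} of {original} weights ({split / original:.2f})")
```

A smaller rank k gives a smaller pair of layers, at the cost of a coarser approximation of the original layer's output. The split only saves weights while k stays below m\*n\*h\*w / (m\*h + n\*w).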

The first three cells below take care of all necessary imports:

In [None]:
import warnings
warnings.filterwarnings("ignore", ".*param.*")

# Imports necessary for the notebook
import os
from typing import Tuple
from datetime import datetime
from decimal import Decimal
import torch
from torchvision.models import resnet18

In [None]:
# AIMET Imports for Compression
from aimet_torch.compress import ModelCompressor
from aimet_common.defs import CompressionScheme, CostMetric
from aimet_torch.defs import GreedySelectionParameters, SpatialSvdParameters

In [None]:
# Imports needed for the Data Pipeline
from Examples.common import image_net_config
from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
from Examples.torch.utils.image_net_trainer import ImageNetTrainer

## Config

The config dictionary expects the following parameters:

1. **dataset_dir**: Path to a directory containing the ImageNet dataset. This folder should contain at least two subfolders: 'train' for the training dataset and 'val' for the validation dataset.
2. **use_cuda**: A boolean indicating whether to run on a GPU.
3. **logdir**: Path to a directory for logging.

The config dictionary is used for all of the remaining cells. To get a better understanding of when each of the parameters in the config dictionary is used, read the code in those cells.

A few of the other important parameters are explained below. We encourage the user to change them to suit their needs.

1. **epochs**: Number of epochs (type int) for finetuning.
2. **learning_rate**: Learning rate (type float) for finetuning.
3. **learning_rate_schedule**: A list of epoch indices at which the learning rate changes during finetuning. Check https://pytorch.org/docs/stable/_modules/torch/optim/lr_scheduler.html#MultiStepLR for more details.
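
As a rough illustration of how such a schedule behaves, the sketch below reproduces the multi-step rule in plain Python: each milestone epoch that has been reached multiplies the learning rate by a decay factor. The helper lr_at_epoch is hypothetical, and the decay factor gamma=0.1 is PyTorch's MultiStepLR default; whether the example trainer uses that exact factor is an assumption.

```python
from bisect import bisect_right
from typing import List


def lr_at_epoch(base_lr: float, milestones: List[int], epoch: int, gamma: float = 0.1) -> float:
    """Learning rate in effect at `epoch` under a multi-step schedule."""
    # Each milestone that has been reached multiplies the rate by gamma once.
    return base_lr * gamma ** bisect_right(sorted(milestones), epoch)


for epoch in (0, 4, 5, 9, 10):
    print(epoch, lr_at_epoch(1e-2, [5, 10], epoch))
```

Under this rule, a schedule of [5, 10] has no effect when finetuning runs for fewer than 6 epochs, because the first milestone is never reached.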

In [None]:
config = {'dataset_dir': "path/to/dataset", # Replace with the directory of your dataset!
          'use_cuda': True,
          'logdir': os.path.join("benchmark_output", "spatial_svd_"+datetime.now().strftime("%Y-%m-%d-%H-%M-%S")),
          'epochs': 1, 
          'learning_rate': 1e-2, 
          'learning_rate_schedule': [5, 10]}

os.makedirs(config['logdir'], exist_ok=True)

## Data Pipeline

The next cell defines the data pipeline. The ImageNetDataPipeline class takes care of both evaluating and finetuning a model using a user-specified dataset directory, which should contain both training data and validation data, already separated into folders. For more detail on how it works, see the relevant files under Examples/torch/utils.

The data pipeline class is simply a template for the user to follow. The methods for this class can be replaced by the user to fit their needs.

In [None]:
class ImageNetDataPipeline:
    """
    Provides APIs for model compression using AIMET Spatial SVD, evaluation and finetuning.
    """

    def __init__(self, config):
        """
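        :param config: Dictionary of settings (dataset_dir, use_cuda, logdir, epochs,
                       learning_rate, learning_rate_schedule).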
        """
        self._config = config

    
    def evaluate(self, model: torch.nn.Module, iterations: int = None, use_cuda: bool = False) -> float:
        """
        Evaluate the specified model using the specified number of batches from the validation set.
        AIMET's compress_model() expects a function with this signature as its eval_callback
        parameter.

        :param model: The model to be evaluated.
        :param iterations: The number of batches of the validation set to evaluate.
        :param use_cuda: If True then use a GPU for inference.
        :return: The accuracy of the model on the evaluated samples.
        """

        # Your code goes here instead of the example below

        evaluator = ImageNetEvaluator(self._config['dataset_dir'], image_size=image_net_config.dataset['image_size'],
                                      batch_size=image_net_config.evaluation['batch_size'],
                                      num_workers=image_net_config.evaluation['num_workers'])

        return evaluator.evaluate(model, iterations, use_cuda)
    
    def finetune(self, model: torch.nn.Module):
        """
        Finetunes the model. The implementation provided here is just an example;
        provide your own implementation if needed.

        :param model: The model to finetune.
        :return: None
        """

        # Your code goes here instead of the example below

        trainer = ImageNetTrainer(self._config['dataset_dir'], image_size=image_net_config.dataset['image_size'],
                                  batch_size=image_net_config.train['batch_size'],
                                  num_workers=image_net_config.train['num_workers'])

        trainer.train(model, max_epochs=self._config['epochs'], learning_rate=self._config['learning_rate'],
                      learning_rate_schedule=self._config['learning_rate_schedule'], use_cuda=self._config['use_cuda'])

        torch.save(model, os.path.join(self._config['logdir'], 'finetuned_model.pth'))
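
When replacing the evaluate method, the main thing to preserve is its (model, iterations, use_cuda) signature and its float return value. The sketch below is a framework-free stand-in that only illustrates this contract: make_eval_callback, the toy batches, and the toy model are all hypothetical, and a real callback would run a torch.nn.Module over a data loader instead.

```python
from itertools import islice
from typing import Callable, List, Optional, Sequence, Tuple

Batch = Tuple[Sequence[float], Sequence[int]]  # (inputs, labels)


def make_eval_callback(batches: List[Batch]) -> Callable[..., float]:
    """Build a callback with the (model, iterations, use_cuda) signature."""

    def evaluate(model: Callable, iterations: Optional[int] = None, use_cuda: bool = False) -> float:
        # use_cuda is accepted only for signature compatibility in this toy version.
        correct = total = 0
        # islice with None as the stop value walks every batch.
        for inputs, labels in islice(batches, iterations):
            predictions = model(inputs)
            correct += sum(int(p == y) for p, y in zip(predictions, labels))
            total += len(labels)
        return correct / total if total else 0.0

    return evaluate


# Toy usage: a "model" that predicts 1 for positive inputs and 0 otherwise.
toy_batches = [([0.5, -1.0], [1, 0]), ([2.0, 3.0], [1, 0])]
toy_model = lambda xs: [int(x > 0) for x in xs]
evaluate = make_eval_callback(toy_batches)
print(evaluate(toy_model, iterations=1))  # only the first batch is scored
print(evaluate(toy_model))                # all batches are scored
```

Limiting iterations trades evaluation accuracy for speed, which matters because the callback is invoked many times during compression.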

We next initialize the pipeline and the model. Before compressing the model, it is customary to log the original accuracy of the model on the dataset provided.

In [None]:
data_pipeline = ImageNetDataPipeline(config)

model = resnet18(pretrained=True)
if config['use_cuda']:
    if torch.cuda.is_available():
        model.to(torch.device('cuda'))
    else:
        raise Exception("use_cuda is True but cuda is unavailable")

accuracy = data_pipeline.evaluate(model, use_cuda=config['use_cuda'])
print("accuracy:", accuracy)

## Compression

The next cells are for the actual compression step. First, parameters related to the compression are specified in the following cell:

1. **target_comp_ratio**: The desired compression ratio for Spatial SVD, expressed as the fraction of the original model that should remain after compression. To compress the model to 20% of its original size, use 0.2; this compresses the model by 80%. The value used in the cell below is 0.5 (50%).

2. **num_comp_ratio_candidates**: The number of candidate compression ratios tried by the API at each layer. The API tests several compression ratios per layer so that less-important layers are compressed more, while keeping the overall compression ratio equal to target_comp_ratio. A typical value is 10, which means that for each layer the API tries the ratios 0.1, 0.2, ..., 1.0. The cell below uses 2 so that the notebook runs quickly as a test.

3. **cost_metric**: Determines how the cost of the model is measured: either compute (mac) or space (memory).

4. **eval_iterations**: The number of batches of data used to evaluate the model during compression, stored in num_eval_iterations in the cell below. A typical value is 10, which speeds up compression compared with using the whole dataset; the cell below uses 1 as a quick test. More details are available in the AIMET API documentation.

5. **modules_to_ignore**: The layers that should be ignored during compression. The first layer is ignored to preserve the way the input interacts with the model; if there are other layers that should be ignored, add them to the list.
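
The candidate ratios described in item 2 can be sketched as follows. This mirrors the description above rather than AIMET's internal code, and candidate_ratios is a hypothetical helper. Decimal is used, as in the notebook, so that the ratios are exact rather than rounded binary floats.

```python
from decimal import Decimal
from typing import List


def candidate_ratios(num_candidates: int) -> List[Decimal]:
    """Evenly spaced ratios 1/n, 2/n, ..., n/n as exact decimals."""
    return [Decimal(i) / Decimal(num_candidates) for i in range(1, num_candidates + 1)]


print(candidate_ratios(10))  # ten ratios from 0.1 up to 1
print(candidate_ratios(2))   # two ratios: 0.5 and 1
```

More candidates give the selection step a finer choice per layer, but each extra candidate adds one more evaluation per layer.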

In [None]:
target_comp_ratio = Decimal(0.5)

# Uncomment one of the following lines
# num_comp_ratio_candidates = 10 # Typical
num_comp_ratio_candidates = 2 # Test

cost_metric = CostMetric.mac

# Uncomment one of the following lines
# num_eval_iterations = 10 # Typical
num_eval_iterations = 1 # Test

modules_to_ignore = [model.conv1]

The next cell sets up the remaining parameters needed to perform the compression.

It first creates the parameters for Spatial SVD. There are two modes for choosing them: Auto and Manual. In Auto mode, the only option is a greedy selection scheme, in which a compression ratio is selected for each layer from a fixed list of candidates so that the model reaches the target ratio (specified in the previous cell). In Manual mode, you specify the compression ratio for each layer yourself; a general rule of thumb is to use the ratios found by Auto mode as a starting point.
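
The idea behind a greedy selection of this kind can be illustrated with a simplified sketch. It is not AIMET's implementation: select_ratios, the layer names, the per-layer accuracy scores, and the layer costs are all hypothetical. The sketch assumes that each layer has already been evaluated at each candidate ratio, then looks for the strictest accuracy threshold at which the chosen ratios bring the total cost down to the target.

```python
from typing import Dict


def select_ratios(eval_scores: Dict[str, Dict[float, float]],
                  layer_costs: Dict[str, float],
                  target_ratio: float) -> Dict[str, float]:
    """Pick one compression ratio per layer so the total cost meets target_ratio.

    eval_scores[layer][ratio] is the model accuracy measured when only that
    layer is compressed to that ratio (hypothetical, pre-computed numbers).
    """
    total_cost = sum(layer_costs.values())
    chosen = {layer: 1.0 for layer in eval_scores}  # 1.0 means "left uncompressed"
    # Try accuracy thresholds from strictest to loosest.
    thresholds = sorted({s for scores in eval_scores.values() for s in scores.values()},
                        reverse=True)
    for threshold in thresholds:
        for layer, scores in eval_scores.items():
            acceptable = [ratio for ratio, score in scores.items() if score >= threshold]
            # Compress as far as the threshold allows; otherwise leave the layer alone.
            chosen[layer] = min(acceptable) if acceptable else 1.0
        compressed_cost = sum(layer_costs[layer] * ratio for layer, ratio in chosen.items())
        if compressed_cost / total_cost <= target_ratio:
            break  # strictest threshold that reaches the target
    return chosen


scores = {"a": {0.25: 0.60, 0.5: 0.68, 0.75: 0.70},
          "b": {0.25: 0.30, 0.5: 0.50, 0.75: 0.65}}
print(select_ratios(scores, {"a": 100, "b": 100}, target_ratio=0.75))
```

In this toy input, layer "a" tolerates compression better than layer "b", so the sketch compresses "a" and leaves "b" untouched, which is the behaviour described above of compressing less-important layers more.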

In [None]:
# Creating Greedy selection parameters:
greedy_params = GreedySelectionParameters(target_comp_ratio=target_comp_ratio,
                                          num_comp_ratio_candidates=num_comp_ratio_candidates)

# Creating Auto mode Parameters:
mode = SpatialSvdParameters.Mode.auto

auto_params = SpatialSvdParameters.AutoModeParams(greedy_select_params=greedy_params,
                                                  modules_to_ignore=modules_to_ignore)

# Creating Spatial SVD parameters with Auto Mode:
params = SpatialSvdParameters(mode, auto_params)

# Scheme is Spatial SVD:
scheme = CompressionScheme.spatial_svd

# Input image shape in PyTorch's (batch, channels, height, width) order
image_shape = (1, image_net_config.dataset['image_channels'],
               image_net_config.dataset['image_height'], image_net_config.dataset['image_width'])

Finally, the model is compressed using AIMET's ModelCompressor with the parameters specified above. This returns both the new model, which is saved, and the relevant compression statistics. The compressed model is then evaluated on the dataset. Note that the ModelCompressor evaluates the model during compression using the same evaluate function that is defined in our data pipeline.

In [None]:
compressed_model, comp_stats = ModelCompressor.compress_model(model=model,
                                                              eval_callback=data_pipeline.evaluate,
                                                              eval_iterations=num_eval_iterations,
                                                              input_shape=image_shape,
                                                              compress_scheme=scheme,
                                                              cost_metric=cost_metric,
                                                              parameters=params)

torch.save(compressed_model, os.path.join(config['logdir'], 'compressed_model.pth'))

print(comp_stats)

comp_accuracy = data_pipeline.evaluate(compressed_model, use_cuda=config['use_cuda'])
print("compressed accuracy:", comp_accuracy)

## Finetuning

After the model is compressed, it is finetuned and saved, then evaluated.

In [None]:
data_pipeline.finetune(compressed_model)

finetuned_accuracy = data_pipeline.evaluate(compressed_model, use_cuda=config['use_cuda'])
print("finetuned accuracy:", finetuned_accuracy)