# Model compression using Channel Pruning 

This notebook shows a working code example of how to use AIMET to perform model compression. The Channel Pruning technique is used in this notebook to achieve model compression.

Here is a brief introduction to the techniques. Please refer to the AIMET user guide for more details.

1. **Spatial SVD**: This is a tensor-decomposition technique generally applied to convolutional layers (Conv2D). Applying this technique decomposes a single convolutional layer into two. The weight tensor of the layer to be split is flattened to a 2D matrix and singular value decomposition (SVD) is applied to this matrix. Compression is achieved by discarding the least significant singular values in the diagonal matrix. The decomposed matrices are then combined back into two separate convolutional layers.
2. **Channel Pruning**: In this technique AIMET discards the least significant input channels (using a magnitude metric) of a given convolutional (Conv2D) layer. The layers of the model feeding into this convolutional layer also have their channel dimension modified so that the graph remains valid. This technique also uses a layer-by-layer reconstruction procedure that modifies the weights of the compressed layers to minimize the distance between the compressed layer output and the corresponding layer output of the original model.
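
To make the magnitude-based selection in technique 2 more concrete, here is a minimal pure-Python sketch. It is not AIMET's implementation: it assumes the L1 norm of each input-channel slice as the magnitude metric, and the helper names (`input_channel_magnitudes`, `select_input_channels`, `prune_input_channels`) and the toy weights are hypothetical, chosen only for illustration. The reconstruction step is not shown.

```python
from typing import List

# weight[out_channel][in_channel][kh][kw], the usual Conv2D weight layout
Weight = List[List[List[List[float]]]]


def input_channel_magnitudes(weight: Weight) -> List[float]:
    """Sum of absolute weights per input channel (L1 norm, an assumed metric)."""
    num_in = len(weight[0])
    magnitudes = [0.0] * num_in
    for out_filter in weight:
        for in_idx, kernel in enumerate(out_filter):
            magnitudes[in_idx] += sum(abs(w) for row in kernel for w in row)
    return magnitudes


def select_input_channels(weight: Weight, comp_ratio: float) -> List[int]:
    """Return the sorted indices of the input channels to keep for a given ratio."""
    magnitudes = input_channel_magnitudes(weight)
    num_keep = max(1, round(len(magnitudes) * comp_ratio))
    ranked = sorted(range(len(magnitudes)), key=lambda i: magnitudes[i], reverse=True)
    return sorted(ranked[:num_keep])


def prune_input_channels(weight: Weight, keep: List[int]) -> Weight:
    """Drop the discarded input channels from every output filter."""
    return [[out_filter[i] for i in keep] for out_filter in weight]


# Toy layer: 2 output channels, 4 input channels, 1x1 kernels.
toy_weight = [
    [[[0.9]], [[0.01]], [[-0.5]], [[0.02]]],
    [[[-0.8]], [[0.03]], [[0.4]], [[-0.01]]],
]
kept = select_input_channels(toy_weight, comp_ratio=0.5)
pruned = prune_input_channels(toy_weight, kept)
print(kept)  # -> [0, 2]
```

In a real model, the layer producing these inputs would also need its output channels 1 and 3 removed, which is the graph modification described above.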

Both of the above are structured pruning techniques that aim to reduce the computational cost (MACs) or the memory requirements of the model. After applying either technique, the compressed model needs to be fine-tuned (trained again for a few epochs) to recover accuracy close to that of the original model.
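
As a rough illustration of how pruning input channels reduces MACs, the sketch below counts multiply-accumulate operations for a plain Conv2D layer (no groups, bias ignored). The function name `conv2d_macs` and the layer shape are assumptions made for this example; AIMET computes its own cost figures internally.

```python
def conv2d_macs(in_channels: int, out_channels: int, kernel_size: int,
                out_height: int, out_width: int) -> int:
    """MACs for one forward pass of a Conv2D layer (no groups, bias ignored)."""
    return in_channels * out_channels * kernel_size * kernel_size * out_height * out_width


# Illustrative layer shape (assumed): 64 -> 64 channels, 3x3 kernel, 56x56 output map.
original = conv2d_macs(64, 64, 3, 56, 56)
# Pruning half of the input channels halves the MACs of this layer.
pruned = conv2d_macs(32, 64, 3, 56, 56)
layer_comp_ratio = pruned / original

print(original, pruned, layer_comp_ratio)  # -> 115605504 57802752 0.5
```

Here the compression ratio is the compressed cost divided by the original cost, so a lower ratio means more compression.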

This notebook shows a working code example of how technique #2 can be used to compress the model. Separate notebooks for #1, and for #1 followed by #2, are available in the same folder.

#### Overall flow
This notebook covers the following
1. Instantiate the example evaluation and training pipeline
2. Load the model and evaluate it to find the baseline accuracy
3. Compress the model and fine-tune:  
    3.1 Compress model using Channel Pruning and evaluate it to find post-compression accuracy  
    3.2 Fine-tune the model


#### What this notebook is not 
* This notebook is not designed to show state-of-the-art compression results. For example, some optimization parameters such as num_comp_ratio_candidates, num_eval_iterations and epochs are deliberately chosen to have the notebook execute more quickly.


---
## Dataset

This notebook relies on the ImageNet dataset for the task of image classification. If you already have a version of the dataset readily available, please use that. Otherwise, please download the dataset from an appropriate location (e.g. https://image-net.org/challenges/LSVRC/2012/index.php#).

**Note1**: The ImageNet dataset typically has the following characteristics, and the dataloader provided in this example notebook relies on them:
- Subfolders 'train' for the training samples and 'val' for the validation samples
- A subdirectory per class, and a file per each image sample

**Note2**: To speed up the execution of this notebook, you may use a reduced subset of the ImageNet dataset. For example, the entire ILSVRC2012 dataset has 1000 classes, with roughly 1000 training samples and 50 validation samples per class. For the purpose of running this notebook, you could reduce the dataset to, say, 2 samples per class. This exercise is left up to the reader and is not necessary.
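
One possible way to build such a reduced subset is sketched below. It assumes the 'train'/'val' layout with one subdirectory per class described in Note1; the function name `make_imagenet_subset` and the destination path are hypothetical and not part of the AIMET examples.

```python
import os
import shutil


def make_imagenet_subset(src_dir: str, dst_dir: str, samples_per_class: int = 2) -> int:
    """Copy the first `samples_per_class` files of every class folder under
    'train' and 'val' from src_dir to dst_dir. Returns the number of files copied."""
    copied = 0
    for split in ('train', 'val'):
        split_dir = os.path.join(src_dir, split)
        if not os.path.isdir(split_dir):
            continue  # tolerate a dataset that only has one of the splits
        for class_name in sorted(os.listdir(split_dir)):
            class_dir = os.path.join(split_dir, class_name)
            if not os.path.isdir(class_dir):
                continue  # skip stray files next to the class folders
            out_dir = os.path.join(dst_dir, split, class_name)
            os.makedirs(out_dir, exist_ok=True)
            for file_name in sorted(os.listdir(class_dir))[:samples_per_class]:
                shutil.copy2(os.path.join(class_dir, file_name), out_dir)
                copied += 1
    return copied
```

Calling `make_imagenet_subset('path/to/dataset', 'path/to/subset', samples_per_class=2)` and then pointing `DATASET_DIR` at the new directory would give a small dataset with the same folder structure.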

Edit the cell below and specify the directory where the downloaded ImageNet dataset is saved.

In [1]:
DATASET_DIR = 'path/to/dataset'         # Please replace this with a real directory

---
## 1. Example evaluation and training pipeline

The following is an example training and validation loop for this image classification task.

- **Does AIMET have any limitations on how the training and validation pipeline is written?** Not really. We will see later that AIMET will modify the user's model to compress it and the resultant model is still a PyTorch model. This compressed model can be used in place of the original model when doing inference or training.
- **Does AIMET put any limitation on the interface of the evaluate() or train() methods?** Not really, but the evaluate() method should return a single number representing the accuracy of the model. Ideally, you should be able to use your existing evaluate and train routines as-is.


In [2]:
import os
import torch
from typing import List
from Examples.common import image_net_config
from Examples.torch.utils.image_net_evaluator import ImageNetEvaluator
from Examples.torch.utils.image_net_trainer import ImageNetTrainer
from Examples.torch.utils.image_net_data_loader import ImageNetDataLoader

class ImageNetDataPipeline:

    @staticmethod
    def get_val_dataloader() -> torch.utils.data.DataLoader:
        """
        Instantiates a validation dataloader for ImageNet dataset and returns it
        """
        data_loader = ImageNetDataLoader(DATASET_DIR,
                                         image_size=image_net_config.dataset['image_size'],
                                         batch_size=image_net_config.evaluation['batch_size'],
                                         is_training=False,
                                         num_workers=image_net_config.evaluation['num_workers']).data_loader
        return data_loader

    @staticmethod
    def evaluate(model: torch.nn.Module, iterations: int, use_cuda: bool) -> float:
        """
        Given a torch model, evaluates its Top-1 accuracy on the dataset
        :param model: the model to evaluate
        :param iterations: the number of batches to be used to evaluate the model. A value of 'None' means the model will be
                           evaluated on the entire dataset once.
        :param use_cuda: whether or not the GPU should be used.
        """
        evaluator = ImageNetEvaluator(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
                                      batch_size=image_net_config.evaluation['batch_size'],
                                      num_workers=image_net_config.evaluation['num_workers'])

        return evaluator.evaluate(model, iterations=iterations, use_cuda=use_cuda)

    @staticmethod
    def finetune(model: torch.nn.Module, epochs: int, learning_rate: float, learning_rate_schedule: List, use_cuda: bool):
        """
        Given a torch model, finetunes the model to improve its accuracy
        :param model: the model to finetune
        :param epochs: The number of epochs used during the finetuning step.
        :param learning_rate: The learning rate used during the finetuning step.
        :param learning_rate_schedule: The learning rate schedule used during the finetuning step.
        :param use_cuda: whether or not the GPU should be used.
        """
        trainer = ImageNetTrainer(DATASET_DIR, image_size=image_net_config.dataset['image_size'],
                                  batch_size=image_net_config.train['batch_size'],
                                  num_workers=image_net_config.train['num_workers'])

        trainer.train(model, max_epochs=epochs, learning_rate=learning_rate,
                      learning_rate_schedule=learning_rate_schedule, use_cuda=use_cuda)

---
## 2. Load the model and evaluate it to find the baseline accuracy

For this example notebook, we are going to load a pretrained resnet18 model from torchvision. Similarly, you can load any pretrained PyTorch model instead.

In [3]:
from torchvision.models import resnet18

model = resnet18(pretrained=True)

---
We should decide whether to place the model on a CPU or CUDA device. This example code will use CUDA if available in your current execution environment. You can change this logic and force a device placement if needed.

In [4]:
use_cuda = False
if torch.cuda.is_available():
    use_cuda = True
    model.to(torch.device('cuda'))

---
Let's determine the FP32 (floating point 32-bit) accuracy of this model using the evaluate() routine

In [5]:
accuracy = ImageNetDataPipeline.evaluate(model, iterations=None, use_cuda=use_cuda)
print(accuracy)

100% (32 of 32) |########################| Elapsed Time: 0:00:04 Time:  0:00:04


69.23828125


## 3. Compress the model and fine-tune

### 3.1. Compress model using Channel Pruning and evaluate it to find post-compression accuracy
Now we use AIMET to define the compression parameters for Channel Pruning, a few of which are explained here:

- **target_comp_ratio**: The desired compression ratio for Channel Pruning. We are using 0.9 to compress the model by 10%.

- **num_comp_ratio_candidates**: As part of determining how compressible each layer is, AIMET performs various measurements. This number denotes how many different compression ratios AIMET tries for each layer. We are using 3 here, which translates to compression ratios of 0.33, 0.66 and 1.00 at each layer. The optimal value is 10. The higher the number of candidates, the more granular the measurements for each layer, but also the longer it takes to complete them.

- **modules_to_ignore**: This list can contain the references of model-layers that should be ignored during compression. We have added the first layer to be ignored to preserve the way the input interacts with the model; other layers can be added too if desired.

- **mode**: We are choosing **Auto** mode, which means AIMET performs per-layer compressibility analysis and determines how much to compress each layer. The alternative choice is **Manual**.

- **data_loader**: Channel Pruning uses unlabelled data samples for the layer-by-layer reconstruction procedure explained at the start. This provided data loader is used to retrieve those samples. You can just pass your existing data loader - say for the validation or training dataset.

- **num_reconstruction_samples**: The number of samples used in the layer-by-layer reconstruction procedure. We are using 10 here, which is a very low number but enables this notebook to execute quickly. A typical setting here would be ~1000 samples.

- **allow_custom_downsample_ops**: If this flag is enabled, AIMET Channel Pruning will insert downsample ops into the model graph if needed. Enabling this can enable more convolutional layers to be considered for pruning, but it may increase memory bandwidth overhead for the additional downsample layers. So there is a trade-off to be considered. We suggest disabling this by default.

- **eval_callback**: The model evaluation function. The expected signature of the evaluate function should be `<function_name>(model, eval_iterations, use_cuda)` and it is expected to return an accuracy metric.

- **eval_iterations**: The number of batches of data to use for evaluating the model while it is being compressed. We are using 1 to speed up the notebook execution, but in practice please choose a number of batches high enough that the measured accuracy can be trusted. The eval callback is expected to use the same samples on every invocation.

- **compress_scheme**: We choose the 'channel pruning' compression scheme.

- **cost_metric**: Determines whether the desired compression ratio targets a reduction in MACs or in memory. We are choosing 'mac' here.
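
To illustrate how num_comp_ratio_candidates maps to the per-layer ratios mentioned above, here is a small sketch. It is an assumption consistent with the description (3 candidates giving 0.33, 0.66 and 1.00), not AIMET's own code, and `candidate_comp_ratios` is a hypothetical helper name.

```python
from decimal import Decimal
from typing import List


def candidate_comp_ratios(num_candidates: int) -> List[Decimal]:
    """Evenly spaced ratios in (0, 1]; a ratio of 1 leaves the layer uncompressed."""
    return [Decimal(i) / Decimal(num_candidates) for i in range(1, num_candidates + 1)]


for ratio in candidate_comp_ratios(3):
    print(round(ratio, 6))
# prints:
# 0.333333
# 0.666667
# 1.000000
```

With 10 candidates the same helper would yield 0.1, 0.2, ..., 1.0, which is why a larger value gives a finer-grained picture of each layer at the cost of more evaluations.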

In [6]:
from decimal import Decimal
from aimet_torch.defs import GreedySelectionParameters, ChannelPruningParameters
from aimet_common.defs import CompressionScheme, CostMetric

greedy_params = GreedySelectionParameters(target_comp_ratio=Decimal(0.9),
                                          num_comp_ratio_candidates=3)
modules_to_ignore = [model.conv1]
auto_params = ChannelPruningParameters.AutoModeParams(greedy_select_params=greedy_params,
                                                      modules_to_ignore=modules_to_ignore)
data_loader = ImageNetDataPipeline.get_val_dataloader()
params = ChannelPruningParameters(data_loader=data_loader,
                                  num_reconstruction_samples=10,
                                  allow_custom_downsample_ops=False,
                                  mode=ChannelPruningParameters.Mode.auto,
                                  params=auto_params)

eval_callback = ImageNetDataPipeline.evaluate
eval_iterations = 1
compress_scheme = CompressionScheme.channel_pruning
cost_metric = CostMetric.mac

---
We call the AIMET ModelCompressor.compress_model API using the above parameters. This call returns a compressed model as well as relevant statistics.  
**Note**: the ModelCompressor evaluates the model while compressing using the same evaluate function that is in our data pipeline.
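
During the run, the compressibility analysis logs one `CompRatioSelect` line per layer and candidate ratio. The sketch below shows one way to collect those lines into a dictionary, assuming the log text has been captured as a string; `parse_eval_scores` is a hypothetical helper, and the two sample lines are copied from this notebook's output. It also shows why a single evaluation batch of 32 images gives coarse scores: every value is a multiple of 100/32 = 3.125.

```python
import re
from collections import defaultdict

# Matches the per-layer lines logged during compressibility analysis.
LINE_RE = re.compile(
    r"CompRatioSelect - INFO - Layer (\S+), comp_ratio ([\d.]+) ==> eval_score=([\d.]+)"
)


def parse_eval_scores(log_text: str) -> dict:
    """Map layer name -> {comp_ratio: eval_score} from captured log text."""
    scores = defaultdict(dict)
    for match in LINE_RE.finditer(log_text):
        layer, ratio, score = match.groups()
        scores[layer][float(ratio)] = float(score)
    return dict(scores)


sample_log = """
2022-02-03 22:14:24,884 - CompRatioSelect - INFO - Layer layer1.0.conv2, comp_ratio 0.333333 ==> eval_score=68.750000
2022-02-03 22:14:26,986 - CompRatioSelect - INFO - Layer layer1.0.conv2, comp_ratio 0.666667 ==> eval_score=75.000000
"""
scores = parse_eval_scores(sample_log)

# With eval_iterations=1 and a batch size of 32, each score is k * 3.125 for an integer k.
step = 100 / 32
for layer, by_ratio in scores.items():
    for ratio, score in by_ratio.items():
        print(layer, ratio, score, "correct images:", round(score / step))
```

For the two sample lines this corresponds to 22 and 24 correctly classified images out of 32, so a difference of one image moves the score by more than 3 points.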


In [7]:
from aimet_torch.compress import ModelCompressor
compressed_model, comp_stats = ModelCompressor.compress_model(model=model,
                                                              eval_callback=eval_callback,
                                                              eval_iterations=eval_iterations,
                                                              input_shape=(1, 3, 224, 224),
                                                              compress_scheme=compress_scheme,
                                                              cost_metric=cost_metric,
                                                              parameters=params)

print(comp_stats)



  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:14:19,178 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:20,026 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:20,030 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:20,499 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:14:20,500 - CompRatioSelect - INFO - Layer layer1.0.conv1, comp_ratio 0.333333 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:14:21,344 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:22,148 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:22,150 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:22,620 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:14:22,623 - CompRatioSelect - INFO - Layer layer1.0.conv1, comp_ratio 0.666667 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:14:23,425 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:24,401 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:24,403 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:24,882 - Eval - INFO - Avg accuracy Top 1: 68.750000 Avg accuracy Top 5: 90.625000 on validation Dataset
2022-02-03 22:14:24,884 - CompRatioSelect - INFO - Layer layer1.0.conv2, comp_ratio 0.333333 ==> eval_score=68.750000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:14:25,755 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:26,507 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:26,509 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:26,982 - Eval - INFO - Avg accuracy Top 1: 75.000000 Avg accuracy Top 5: 84.375000 on validation Dataset
2022-02-03 22:14:26,986 - CompRatioSelect - INFO - Layer layer1.0.conv2, comp_ratio 0.666667 ==> eval_score=75.000000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:14:27,847 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:28,614 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:28,617 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:29,082 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:14:29,084 - CompRatioSelect - INFO - Layer layer1.1.conv1, comp_ratio 0.333333 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:14:29,963 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:30,739 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:30,741 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:31,204 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:14:31,206 - CompRatioSelect - INFO - Layer layer1.1.conv1, comp_ratio 0.666667 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:14:32,065 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:32,849 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:32,851 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:33,334 - Eval - INFO - Avg accuracy Top 1: 71.875000 Avg accuracy Top 5: 90.625000 on validation Dataset
2022-02-03 22:14:33,335 - CompRatioSelect - INFO - Layer layer1.1.conv2, comp_ratio 0.333333 ==> eval_score=71.875000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:14:34,173 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:34,995 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:34,997 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:35,476 - Eval - INFO - Avg accuracy Top 1: 78.125000 Avg accuracy Top 5: 100.000000 on validation Dataset
2022-02-03 22:14:35,478 - CompRatioSelect - INFO - Layer layer1.1.conv2, comp_ratio 0.666667 ==> eval_score=78.125000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:14:36,326 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:37,107 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:37,109 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:37,576 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:14:37,578 - CompRatioSelect - INFO - Layer layer2.0.conv1, comp_ratio 0.333333 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:14:38,443 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:39,259 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:39,262 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:39,737 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:14:39,739 - CompRatioSelect - INFO - Layer layer2.0.conv1, comp_ratio 0.666667 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:14:40,774 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:41,477 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:41,479 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:42,060 - Eval - INFO - Avg accuracy Top 1: 28.125000 Avg accuracy Top 5: 62.500000 on validation Dataset
2022-02-03 22:14:42,064 - CompRatioSelect - INFO - Layer layer2.0.conv2, comp_ratio 0.333333 ==> eval_score=28.125000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:14:43,092 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:43,892 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:43,895 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:44,350 - Eval - INFO - Avg accuracy Top 1: 78.125000 Avg accuracy Top 5: 90.625000 on validation Dataset
2022-02-03 22:14:44,352 - CompRatioSelect - INFO - Layer layer2.0.conv2, comp_ratio 0.666667 ==> eval_score=78.125000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:14:45,173 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:45,965 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:45,968 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:46,404 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:14:46,406 - CompRatioSelect - INFO - Layer layer2.0.downsample.0, comp_ratio 0.333333 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:14:47,225 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:48,016 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:48,018 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:48,457 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:14:48,459 - CompRatioSelect - INFO - Layer layer2.0.downsample.0, comp_ratio 0.666667 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:14:49,344 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:50,183 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:50,185 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:50,648 - Eval - INFO - Avg accuracy Top 1: 65.625000 Avg accuracy Top 5: 90.625000 on validation Dataset
2022-02-03 22:14:50,649 - CompRatioSelect - INFO - Layer layer2.1.conv1, comp_ratio 0.333333 ==> eval_score=65.625000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:14:51,525 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:52,353 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:52,355 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:52,813 - Eval - INFO - Avg accuracy Top 1: 75.000000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:14:52,815 - CompRatioSelect - INFO - Layer layer2.1.conv1, comp_ratio 0.666667 ==> eval_score=75.000000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:14:53,666 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:54,546 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:54,548 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:55,019 - Eval - INFO - Avg accuracy Top 1: 50.000000 Avg accuracy Top 5: 68.750000 on validation Dataset
2022-02-03 22:14:55,021 - CompRatioSelect - INFO - Layer layer2.1.conv2, comp_ratio 0.333333 ==> eval_score=50.000000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:14:55,898 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:56,699 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:56,701 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:57,180 - Eval - INFO - Avg accuracy Top 1: 75.000000 Avg accuracy Top 5: 96.875000 on validation Dataset
2022-02-03 22:14:57,184 - CompRatioSelect - INFO - Layer layer2.1.conv2, comp_ratio 0.666667 ==> eval_score=75.000000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:14:58,107 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:14:58,875 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:14:58,877 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:14:59,342 - Eval - INFO - Avg accuracy Top 1: 75.000000 Avg accuracy Top 5: 90.625000 on validation Dataset
2022-02-03 22:14:59,344 - CompRatioSelect - INFO - Layer layer3.0.conv1, comp_ratio 0.333333 ==> eval_score=75.000000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:15:00,261 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:01,055 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:01,057 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:01,528 - Eval - INFO - Avg accuracy Top 1: 75.000000 Avg accuracy Top 5: 90.625000 on validation Dataset
2022-02-03 22:15:01,530 - CompRatioSelect - INFO - Layer layer3.0.conv1, comp_ratio 0.666667 ==> eval_score=75.000000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:15:02,424 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:03,145 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:03,148 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:03,674 - Eval - INFO - Avg accuracy Top 1: 71.875000 Avg accuracy Top 5: 84.375000 on validation Dataset
2022-02-03 22:15:03,676 - CompRatioSelect - INFO - Layer layer3.0.conv2, comp_ratio 0.333333 ==> eval_score=71.875000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:15:04,633 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:05,439 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:05,441 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:05,913 - Eval - INFO - Avg accuracy Top 1: 75.000000 Avg accuracy Top 5: 84.375000 on validation Dataset
2022-02-03 22:15:05,916 - CompRatioSelect - INFO - Layer layer3.0.conv2, comp_ratio 0.666667 ==> eval_score=75.000000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:15:06,746 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:07,542 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:07,544 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:08,033 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:15:08,035 - CompRatioSelect - INFO - Layer layer3.0.downsample.0, comp_ratio 0.333333 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:15:08,848 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:09,678 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:09,680 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:10,146 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:15:10,147 - CompRatioSelect - INFO - Layer layer3.0.downsample.0, comp_ratio 0.666667 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:15:11,074 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:11,842 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:11,844 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:12,321 - Eval - INFO - Avg accuracy Top 1: 78.125000 Avg accuracy Top 5: 90.625000 on validation Dataset
2022-02-03 22:15:12,324 - CompRatioSelect - INFO - Layer layer3.1.conv1, comp_ratio 0.333333 ==> eval_score=78.125000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:15:13,275 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:14,085 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:14,087 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:14,561 - Eval - INFO - Avg accuracy Top 1: 75.000000 Avg accuracy Top 5: 90.625000 on validation Dataset
2022-02-03 22:15:14,564 - CompRatioSelect - INFO - Layer layer3.1.conv1, comp_ratio 0.666667 ==> eval_score=75.000000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:15:15,498 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:16,227 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:16,229 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:16,710 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:15:16,712 - CompRatioSelect - INFO - Layer layer3.1.conv2, comp_ratio 0.333333 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:15:17,622 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:18,377 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:18,379 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:18,845 - Eval - INFO - Avg accuracy Top 1: 71.875000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:15:18,847 - CompRatioSelect - INFO - Layer layer3.1.conv2, comp_ratio 0.666667 ==> eval_score=71.875000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:15:19,787 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:20,575 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:20,577 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:21,033 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 96.875000 on validation Dataset
2022-02-03 22:15:21,035 - CompRatioSelect - INFO - Layer layer4.0.conv1, comp_ratio 0.333333 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:15:22,003 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:22,799 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:22,801 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:23,255 - Eval - INFO - Avg accuracy Top 1: 78.125000 Avg accuracy Top 5: 96.875000 on validation Dataset
2022-02-03 22:15:23,257 - CompRatioSelect - INFO - Layer layer4.0.conv1, comp_ratio 0.666667 ==> eval_score=78.125000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:15:24,217 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:24,920 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:24,922 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:25,393 - Eval - INFO - Avg accuracy Top 1: 78.125000 Avg accuracy Top 5: 90.625000 on validation Dataset
2022-02-03 22:15:25,395 - CompRatioSelect - INFO - Layer layer4.0.conv2, comp_ratio 0.333333 ==> eval_score=78.125000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:15:26,373 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:27,032 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:27,034 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:27,522 - Eval - INFO - Avg accuracy Top 1: 75.000000 Avg accuracy Top 5: 90.625000 on validation Dataset
2022-02-03 22:15:27,525 - CompRatioSelect - INFO - Layer layer4.0.conv2, comp_ratio 0.666667 ==> eval_score=75.000000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:15:28,395 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:29,032 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:29,034 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:29,494 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:15:29,496 - CompRatioSelect - INFO - Layer layer4.0.downsample.0, comp_ratio 0.333333 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:15:30,374 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:31,194 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:31,196 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:31,656 - Eval - INFO - Avg accuracy Top 1: 84.375000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:15:31,658 - CompRatioSelect - INFO - Layer layer4.0.downsample.0, comp_ratio 0.666667 ==> eval_score=84.375000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:15:32,753 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:33,581 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:33,583 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:34,087 - Eval - INFO - Avg accuracy Top 1: 71.875000 Avg accuracy Top 5: 96.875000 on validation Dataset
2022-02-03 22:15:34,089 - CompRatioSelect - INFO - Layer layer4.1.conv1, comp_ratio 0.333333 ==> eval_score=71.875000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:15:35,122 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:35,952 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:35,954 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:36,429 - Eval - INFO - Avg accuracy Top 1: 68.750000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:15:36,432 - CompRatioSelect - INFO - Layer layer4.1.conv1, comp_ratio 0.666667 ==> eval_score=68.750000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:15:37,402 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:38,158 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:38,160 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:38,632 - Eval - INFO - Avg accuracy Top 1: 78.125000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:15:38,634 - CompRatioSelect - INFO - Layer layer4.1.conv2, comp_ratio 0.333333 ==> eval_score=78.125000


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:15:39,652 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:15:40,479 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:15:40,481 - Eval - INFO - Evaluating nn.Module for 1 iterations with batch_size 32


100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time:  0:00:00


2022-02-03 22:15:40,935 - Eval - INFO - Avg accuracy Top 1: 78.125000 Avg accuracy Top 5: 93.750000 on validation Dataset
2022-02-03 22:15:40,937 - CompRatioSelect - INFO - Layer layer4.1.conv2, comp_ratio 0.666667 ==> eval_score=78.125000
2022-02-03 22:15:40,941 - CompRatioSelect - INFO - Greedy selection: Saved eval dict to ./data/greedy_selection_eval_scores_dict.pkl
2022-02-03 22:15:40,946 - CompRatioSelect - INFO - Greedy selection: overall_min_score=28.125000, overall_max_score=84.375000
2022-02-03 22:15:40,947 - CompRatioSelect - INFO - Greedy selection: Original model cost=(Cost: memory=11678912, mac=1814073344)


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:16:31,203 - CompRatioSelect - INFO - Greedy selection: final choice - comp_ratio=0.838940, score=78.123093
2022-02-03 22:16:32,067 - ChannelPruning - INFO - finished linear regression fit 


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:16:32,967 - ChannelPruning - INFO - finished linear regression fit 


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:16:33,829 - ChannelPruning - INFO - finished linear regression fit 


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:16:34,706 - ChannelPruning - INFO - finished linear regression fit 


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:16:35,608 - ChannelPruning - INFO - finished linear regression fit 


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:16:36,440 - ChannelPruning - INFO - finished linear regression fit 


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:16:37,270 - ChannelPruning - INFO - finished linear regression fit 


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:16:38,246 - ChannelPruning - INFO - finished linear regression fit 


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable


2022-02-03 22:16:39,153 - ChannelPruning - INFO - finished linear regression fit 


  dummy_input = torch.tensor(dummy_input).cuda()  # pylint: disable=not-callable
  result = torch.tensor(tensor)  # pylint: disable=not-callable


2022-02-03 22:16:40,196 - ChannelPruning - INFO - finished linear regression fit 
2022-02-03 22:16:41,665 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:16:41,667 - Eval - INFO - No value of iteration is provided, running evaluation on complete dataset.
2022-02-03 22:16:41,668 - Eval - INFO - Evaluating nn.Module for 32 iterations with batch_size 32


100% (32 of 32) |########################| Elapsed Time: 0:00:04 Time:  0:00:04


2022-02-03 22:16:46,722 - Eval - INFO - Avg accuracy Top 1: 69.238281 Avg accuracy Top 5: 88.671875 on validation Dataset
2022-02-03 22:16:47,515 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:16:47,516 - Eval - INFO - No value of iteration is provided, running evaluation on complete dataset.
2022-02-03 22:16:47,518 - Eval - INFO - Evaluating nn.Module for 32 iterations with batch_size 32


100% (32 of 32) |########################| Elapsed Time: 0:00:04 Time:  0:00:04


2022-02-03 22:16:52,532 - Eval - INFO - Avg accuracy Top 1: 2.246094 Avg accuracy Top 5: 3.417969 on validation Dataset
**********************************************************************************************
Compressed Model Statistics
Baseline model accuracy: 69.238281, Compressed model accuracy: 2.246094
Compression ratio for memory=0.721591, mac=0.838940

**********************************************************************************************

Per-layer Stats
    Name:layer1.0.conv1, compression-ratio: 0.3333333333333333333333333333
    Name:layer1.0.conv2, compression-ratio: None
    Name:layer1.1.conv1, compression-ratio: 0.3333333333333333333333333333
    Name:layer1.1.conv2, compression-ratio: 0.6666666666666666666666666666
    Name:layer2.0.conv1, compression-ratio: 0.3333333333333333333333333333
    Name:layer2.0.conv2, compression-ratio: 0.6666666666666666666666666666
    Name:layer2.0.downsample.0, compression-ratio: 0.3333333333333333333333333333
    Name:layer

---
Now the compressed model is ready to be used for inference or training. First, we can pass this model to the same evaluation routine we used before to calculate the compressed model accuracy.

In [8]:
accuracy = ImageNetDataPipeline.evaluate(compressed_model, iterations=None, use_cuda=use_cuda)
print(accuracy)

2022-02-03 22:16:53,344 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:16:53,345 - Eval - INFO - No value of iteration is provided, running evaluation on complete dataset.
2022-02-03 22:16:53,347 - Eval - INFO - Evaluating nn.Module for 32 iterations with batch_size 32


100% (32 of 32) |########################| Elapsed Time: 0:00:04 Time:  0:00:04


2022-02-03 22:16:58,512 - Eval - INFO - Avg accuracy Top 1: 2.246094 Avg accuracy Top 5: 3.417969 on validation Dataset
2.24609375


---
As you can see, the model accuracy fell sharply after compression. This is expected. We will use model fine-tuning to recover the accuracy.
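
To put the compression statistics printed above in absolute terms, the reported ratios can be applied to the original model cost from the log. This is only a rough sketch: it assumes that a compression ratio is the compressed cost divided by the original cost, and the helper name `compressed_cost` is hypothetical, not part of AIMET.

```python
# Figures copied from the log output of this run.
original_memory = 11678912    # "memory" cost of the original model
original_mac = 1814073344     # "mac" cost of the original model
memory_ratio = 0.721591       # reported compression ratio for memory
mac_ratio = 0.838940          # reported compression ratio for MACs


def compressed_cost(original, ratio):
    """Estimate the compressed cost, ASSUMING ratio = compressed / original."""
    return original * ratio


est_memory = compressed_cost(original_memory, memory_ratio)
est_mac = compressed_cost(original_mac, mac_ratio)

print(f"Estimated memory cost after compression: {est_memory:,.0f} "
      f"({1 - memory_ratio:.1%} lower)")
print(f"Estimated MAC cost after compression:    {est_mac:,.0f} "
      f"({1 - mac_ratio:.1%} lower)")
```

Under that assumption, a ratio closer to 1.0 means less compression, so the MAC ratio of 0.838940 corresponds to a fairly mild reduction in compute.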

### 3.2. Fine-tune the model

After the model is compressed using Channel Pruning, we can simply train the model for a few more epochs (typically 15-20). As with any training job, hyper-parameters need to be searched for optimal results. Good starting points are to use a learning rate on the same order as the ending learning rate when training the original model, and to drop the learning rate by a factor of 10 every 5 epochs or so.

For the purpose of this example notebook, we are going to train for only 2 epochs. Feel free to change these parameters as you see fit.
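
The step schedule described above can be sketched in plain Python. This is a minimal illustration, not the internals of `ImageNetDataPipeline.finetune`: the helper name `lr_at_epoch` is hypothetical, and the decay factor `gamma=0.1` is an assumption chosen to match "a factor of 10". The base rate and the milestones mirror the `learning_rate` and `learning_rate_schedule` arguments used in the next cell.

```python
def lr_at_epoch(base_lr, milestones, epoch, gamma=0.1):
    """Return the learning rate in effect during `epoch` (0-indexed).

    The rate is multiplied by `gamma` once for every milestone that has
    already been reached. `gamma=0.1` is an assumed decay factor.
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** drops


base_lr = 15e-4        # starting learning rate used in this notebook
milestones = [5, 10]   # epochs at which the rate is reduced

# Learning rate for each epoch of a hypothetical 15-epoch run.
schedule = [lr_at_epoch(base_lr, milestones, e) for e in range(15)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch:2d}: lr = {lr:.2e}")
```

Note that with only a couple of epochs, as in this notebook, neither milestone is reached, so under this sketch the learning rate would stay at its starting value for the whole run.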

In [9]:
ImageNetDataPipeline.finetune(compressed_model, epochs=2, learning_rate=15e-4, learning_rate_schedule=[5, 10],
                              use_cuda=use_cuda)

2022-02-03 22:17:00,100 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:17:01,013 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes


100% (63 of 63) |########################| Elapsed Time: 0:00:04 Time:  0:00:04


2022-02-03 22:17:06,036 - Eval - INFO - No value of iteration is provided, running evaluation on complete dataset.
2022-02-03 22:17:06,037 - Eval - INFO - Evaluating nn.Module for 63 iterations with batch_size 16


100% (63 of 63) |########################| Elapsed Time: 0:00:01 Time:  0:00:01


2022-02-03 22:17:09,781 - Eval - INFO - Avg accuracy Top 1: 4.067460 Avg accuracy Top 5: 10.416667 on validation Dataset
eval :  4.067460317460317
2022-02-03 22:17:09,782 - Trainer - INFO - At the end of Epoch #1/2: Global Avg Loss=6.375429, Eval Accuracy=4.067460


100% (63 of 63) |########################| Elapsed Time: 0:00:04 Time:  0:00:04


2022-02-03 22:17:14,635 - Eval - INFO - No value of iteration is provided, running evaluation on complete dataset.
2022-02-03 22:17:14,636 - Eval - INFO - Evaluating nn.Module for 63 iterations with batch_size 16


100% (63 of 63) |########################| Elapsed Time: 0:00:01 Time:  0:00:01


2022-02-03 22:17:18,177 - Eval - INFO - Avg accuracy Top 1: 6.250000 Avg accuracy Top 5: 16.071429 on validation Dataset
eval :  6.25
2022-02-03 22:17:18,178 - Trainer - INFO - At the end of Epoch #2/2: Global Avg Loss=5.584270, Eval Accuracy=6.250000


---
After we are done with fine-tuning the compressed model, we can check the floating-point accuracy against the same validation dataset to observe any improvement in accuracy.

In [10]:
accuracy = ImageNetDataPipeline.evaluate(compressed_model, iterations=None, use_cuda=use_cuda)
print(accuracy)

2022-02-03 22:17:18,983 - Dataloader - INFO - Dataset consists of 1000 images in 1000 classes
2022-02-03 22:17:18,984 - Eval - INFO - No value of iteration is provided, running evaluation on complete dataset.
2022-02-03 22:17:18,987 - Eval - INFO - Evaluating nn.Module for 32 iterations with batch_size 32


100% (32 of 32) |########################| Elapsed Time: 0:00:04 Time:  0:00:04


2022-02-03 22:17:23,985 - Eval - INFO - Avg accuracy Top 1: 6.347656 Avg accuracy Top 5: 16.210938 on validation Dataset
6.34765625


---
Depending on your settings, you should have observed a slight gain in accuracy after this short fine-tuning run. Of course, this was just an example. Please try this against the model of your choice and experiment with the hyper-parameters to get the best results.
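
As a quick worked example, the Top-1 accuracies logged in this particular run can be compared directly. The numbers below are copied from the evaluation output above; the variable names are only illustrative, and a different run or dataset would give different values.

```python
# Top-1 accuracies (percent) copied from the logs of this run.
baseline_top1 = 69.238281     # original model
compressed_top1 = 2.246094    # right after Channel Pruning
finetuned_top1 = 6.347656     # after the short fine-tuning run

# Accuracy gained by fine-tuning, and the gap that still remains.
gain = finetuned_top1 - compressed_top1
remaining_gap = baseline_top1 - finetuned_top1

# Fraction of the accuracy lost to compression that was recovered.
recovered = gain / (baseline_top1 - compressed_top1)

print(f"Gain from fine-tuning: {gain:.2f} points")
print(f"Remaining gap to baseline: {remaining_gap:.2f} points")
print(f"Fraction of lost accuracy recovered: {recovered:.1%}")
```

The large remaining gap is consistent with the shortened training settings used here; a longer fine-tuning run, as suggested earlier, would be needed to get close to the baseline.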

So we have an improved model after compression using Channel Pruning. Optionally, this model now can be saved like a regular PyTorch model.

In [11]:
os.makedirs('./output/', exist_ok=True)
torch.save(compressed_model, './output/finetuned_model')

---
## Summary

We hope this notebook helped you understand how to use AIMET to perform compression with Channel Pruning. As indicated above, some parameters were chosen so that the example runs faster.

A few additional resources:
- Refer to the AIMET API docs for more details on the APIs and optional parameters
- Refer to the other example notebooks to understand how to use AIMET compression and quantization techniques