# Neural Network Compression - Haiku

Implementations of several neural network compression techniques (knowledge distillation, pruning, quantization, factorization) in [Haiku](https://github.com/deepmind/dm-haiku).

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Brandhsu/nn-compress-haiku/blob/master/notebooks/nn-compress-haiku.ipynb)

The original source code, including this notebook, can be found [here](https://github.com/Brandhsu/nn-compress-haiku).

## Installation

In [1]:
!git clone https://github.com/Brandhsu/nn-compress-haiku/
%cd nn-compress-haiku
!git lfs pull
!pip install -r requirements.txt

Cloning into 'nn-compress-haiku'...
remote: Enumerating objects: 80, done.
remote: Counting objects: 100% (80/80), done.
remote: Compressing objects: 100% (59/59), done.
remote: Total 80 (delta 34), reused 62 (delta 16), pack-reused 0
Unpacking objects: 100% (80/80), 518.70 KiB | 2.03 MiB/s, done.
/content/nn-compress-haiku
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting dm-haiku
  Downloading dm_haiku-0.0.9-py3-none-any.whl (352 kB)
Collecting optax
  Downloading optax-0.1.4-py3-none-any.whl (154 kB)
Collecting jmp>=0.0.2
  Downloading jmp-0.0.2-py3-none-any.whl (16 kB)
Collecting chex>=0.1.5
  Downloading chex-0.1.5-py3-none-any.whl (85 kB)

## Training

We train the teacher network defined below on the CIFAR-10 image classification dataset.

In [2]:
!sed -n 18,50p scripts/02_train_kd.py

def teacher_net_fn(batch: Batch) -> jnp.ndarray:
    """A simple convolutional feedforward deep neural network.

    Args:
        batch (Batch): A tuple containing (data, labels).

    Returns:
        jnp.ndarray: output of network
    """
    x = normalize(batch[0])

    net = hk.Sequential(
        [
            hk.Conv2D(output_channels=6 * 3, kernel_shape=(5, 5)),
            jax.nn.relu,
            hk.AvgPool(window_shape=(2, 2), strides=(2, 2), padding="VALID"),
            jax.nn.relu,
            hk.Conv2D(output_channels=16 * 3, kernel_shape=(5, 5)),
            jax.nn.relu,
            hk.AvgPool(window_shape=(2, 2), strides=(2, 2), padding="VALID"),
            hk.Flatten(),
            hk.Linear(3000),
            jax.nn.relu,
            hk.Linear(2000),
            jax.nn.relu,
            hk.Linear(2000),
            jax.nn.relu,
            hk.Linear(1000),
            jax.nn.relu,
            hk.Linear(10),
        ]
    )
    return net(x)


In [3]:
!python scripts/01_train.py --train-batch-size 64 --train-steps 10001 --eval-steps 1000 --save-dir models

2023-01-20 02:11:07.187010: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
[Step 0] Validation / Test accuracy: 0.101 / 0.103.
[Step 1000] Validation / Test accuracy: 0.423 / 0.424.
[Step 2000] Validation / Test accuracy: 0.602 / 0.598.
[Step 3000] Validation / Test accuracy: 0.653 / 0.650.
[Step 4000] Validation / Test accuracy: 0.651 / 0.655.
[Step 5000] Validation / Test accuracy: 0.652 / 0.649.
[Step 6000] Validation / Test accuracy: 0.649 / 0.648.
[Step 7000] Validation / Test accuracy: 0.650 / 0.646.
[Step 8000] Validation / Test accuracy: 0.650 / 0.647.
[Step 9000] Validation / Test accuracy: 0.646 / 0.646.
[Step 10000] Validation / Test accuracy: 0.643 / 0.646.


## Experiments

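We now compress the network we just trained with three techniques (weight pruning, linear quantization, and singular value decomposition), reporting accuracy and latency at each compression fraction.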

In [4]:
# Weight Pruning
!python scripts/03_compress.py --model-path models/params.pkl --compression-func prune --save-dir figs

2023-01-20 02:13:48.895255: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
Evaluating the model at 0.00% compression
Compression Fraction / Accuracy: 0.00 / 0.624.
Compression Fraction / Latency: 0.00 / 0.6368.
Evaluating the model at 10.00% compression
Compression Fraction / Accuracy: 0.10 / 0.624.
Compression Fraction / Latency: 0.10 / 0.4308.
Evaluating the model at 20.00% compression
Compression Fraction / Accuracy: 0.20 / 0.624.
Compression Fraction / Latency: 0.20 / 0.4365.
Evaluating the model at 30.00% compression
Compression Fraction / Accuracy: 0.30 / 0.624.
Compression Fraction / Latency: 0.30 / 0.4641.
Evaluating the model at 40.00% compression
Compression Fraction / Accuracy: 0.40 / 0.624.
Compression Fraction / Latency: 0.40 / 0.4291.
Evaluating the model at 50.00% compression
Compression Fraction / Accuracy: 0.50 / 0.624.
Compressio
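
The internals of `--compression-func prune` are not shown in this notebook. One common reading of weight pruning is magnitude pruning: zero out the given fraction of weights with the smallest absolute value. The sketch below illustrates that idea in NumPy (the same calls exist under `jax.numpy`). `magnitude_prune` and its `fraction` argument are hypothetical names chosen for illustration, not the repository's API.

```python
import numpy as np


def magnitude_prune(weights: np.ndarray, fraction: float) -> np.ndarray:
    """Return a copy with the `fraction` smallest-magnitude entries set to zero."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    k = int(round(fraction * weights.size))  # number of entries to remove
    if k == 0:
        return weights.copy()
    # Indices (in the flattened array) of the k smallest-magnitude entries.
    drop = np.argpartition(np.abs(weights).ravel(), k - 1)[:k]
    pruned = weights.copy().ravel()
    pruned[drop] = 0.0
    return pruned.reshape(weights.shape)


# Hypothetical stand-in for one layer's weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 5)).astype(np.float32)
w_pruned = magnitude_prune(w, fraction=0.5)
print("fraction of zeros:", float(np.mean(w_pruned == 0.0)))
```

Zeroed weights still live in a dense array, so pruning of this kind reduces neither memory nor compute by itself unless a sparse storage format or sparse kernel is used.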

In [5]:
# Linear Quantization
!python scripts/03_compress.py --model-path models/params.pkl --compression-func quant --save-dir figs

2023-01-20 02:18:13.997180: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
Evaluating the model at 0.00% compression
Compression Fraction / Accuracy: 0.00 / 0.624.
Compression Fraction / Latency: 0.00 / 0.6407.
Evaluating the model at 10.00% compression
Compression Fraction / Accuracy: 0.10 / 0.623.
Compression Fraction / Latency: 0.10 / 0.4548.
Evaluating the model at 20.00% compression
Compression Fraction / Accuracy: 0.20 / 0.624.
Compression Fraction / Latency: 0.20 / 0.4477.
Evaluating the model at 30.00% compression
Compression Fraction / Accuracy: 0.30 / 0.623.
Compression Fraction / Latency: 0.30 / 0.4373.
Evaluating the model at 40.00% compression
Compression Fraction / Accuracy: 0.40 / 0.623.
Compression Fraction / Latency: 0.40 / 0.4322.
Evaluating the model at 50.00% compression
Compression Fraction / Accuracy: 0.50 / 0.624.
Compressio
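
How `--compression-func quant` maps a compression fraction to a bit width is not shown in this notebook. As a general illustration of linear (uniform) quantization, the sketch below snaps each weight to one of `2**num_bits` evenly spaced levels between the tensor's minimum and maximum, then maps the integer codes back to floats. It is written in NumPy (the same calls exist under `jax.numpy`); `linear_quantize` and `num_bits` are hypothetical names, not the repository's API.

```python
import numpy as np


def linear_quantize(weights: np.ndarray, num_bits: int) -> np.ndarray:
    """Round weights to 2**num_bits evenly spaced levels and dequantize."""
    levels = 2 ** num_bits - 1
    w_min, w_max = float(weights.min()), float(weights.max())
    if w_max == w_min:
        return weights.copy()  # constant tensor: nothing to quantize
    scale = (w_max - w_min) / levels            # spacing between levels
    codes = np.round((weights - w_min) / scale)  # integer codes in [0, levels]
    return (codes * scale + w_min).astype(weights.dtype)


# Hypothetical stand-in for a flattened weight tensor.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
w_q = linear_quantize(w, num_bits=4)
print("distinct values:", np.unique(w_q).size)
print("max abs error:", float(np.abs(w - w_q).max()))
```

The rounding error of any single weight is bounded by half the level spacing, which may be part of why accuracy changes so little in the log above.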

In [6]:
# Singular Value Decomposition
!python scripts/03_compress.py --model-path models/params.pkl --compression-func svd --save-dir figs

2023-01-20 02:31:44.680737: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
Evaluating the model at 0.00% compression
Compression Fraction / Accuracy: 0.00 / 0.624.
Compression Fraction / Latency: 0.00 / 0.6399.
Evaluating the model at 10.00% compression
Compression Fraction / Accuracy: 0.10 / 0.624.
Compression Fraction / Latency: 0.10 / 0.4356.
Evaluating the model at 20.00% compression
Compression Fraction / Accuracy: 0.20 / 0.624.
Compression Fraction / Latency: 0.20 / 0.4308.
Evaluating the model at 30.00% compression
Compression Fraction / Accuracy: 0.30 / 0.609.
Compression Fraction / Latency: 0.30 / 0.4684.
Evaluating the model at 40.00% compression
Compression Fraction / Accuracy: 0.40 / 0.505.
Compression Fraction / Latency: 0.40 / 0.4381.
Evaluating the model at 50.00% compression
Compression Fraction / Accuracy: 0.50 / 0.418.
Compressio
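
The notebook does not show how `--compression-func svd` interprets the compression fraction. A minimal sketch of SVD-based factorization, assuming the fraction means the share of singular values to discard from a 2-D weight matrix, is given below in NumPy (the same calls exist under `jax.numpy`). `low_rank_approx` is a hypothetical helper, not the repository's function.

```python
import numpy as np


def low_rank_approx(weights: np.ndarray, fraction: float) -> np.ndarray:
    """Rebuild a 2-D matrix after dropping the smallest `fraction` of singular values."""
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    keep = max(1, int(round((1.0 - fraction) * s.size)))  # rank to retain
    # Scale the kept left singular vectors by their singular values, then project back.
    return (u[:, :keep] * s[:keep]) @ vt[:keep, :]


# Hypothetical stand-in for a linear layer's weight matrix.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 32))
w_lr = low_rank_approx(w, fraction=0.5)
rel_err = np.linalg.norm(w - w_lr) / np.linalg.norm(w)
print("rank:", np.linalg.matrix_rank(w_lr))
print("relative Frobenius error:", float(rel_err))
```

The reconstruction has the same shape as the original matrix, so an actual saving comes only from storing the two factors: `keep * (m + n)` values instead of `m * n` for an `m x n` matrix.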

## Knowledge Distillation

We distill the knowledge of the teacher network trained above into the smaller student network below, which keeps the convolutional layers but drops the hidden linear layers.

In [7]:
!sed -n 53,77p scripts/02_train_kd.py

def student_net_fn(batch: Batch) -> jnp.ndarray:
    """A simple convolutional feedforward deep neural network.

    Args:
        batch (Batch): A tuple containing (data, labels).

    Returns:
        jnp.ndarray: output of network
    """
    x = normalize(batch[0])

    net = hk.Sequential(
        [
            hk.Conv2D(output_channels=6 * 3, kernel_shape=(5, 5)),
            jax.nn.relu,
            hk.AvgPool(window_shape=(2, 2), strides=(2, 2), padding="VALID"),
            jax.nn.relu,
            hk.Conv2D(output_channels=16 * 3, kernel_shape=(5, 5)),
            jax.nn.relu,
            hk.AvgPool(window_shape=(2, 2), strides=(2, 2), padding="VALID"),
            hk.Flatten(),
            hk.Linear(10),
        ]
    )
    return net(x)
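
To put the size difference between the teacher and student architectures in numbers, the parameter counts can be worked out by hand. This is a sketch under assumptions not stated in the notebook: 32x32x3 CIFAR-10 inputs, Haiku's default `SAME` padding and stride 1 for `hk.Conv2D`, and a bias term on every layer. The helper names are hypothetical.

```python
def conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights plus biases of a square-kernel 2-D convolution."""
    return kernel * kernel * c_in * c_out + c_out


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def count_params(hidden_sizes, image_hw=32, in_channels=3, num_classes=10) -> int:
    """Count parameters of the conv stack followed by the given linear layers."""
    total = conv_params(5, in_channels, 6 * 3) + conv_params(5, 6 * 3, 16 * 3)
    # Assumed SAME-padded, stride-1 convs keep H and W; each 2x2 pool halves them.
    features = (image_hw // 4) ** 2 * (16 * 3)
    for width in (*hidden_sizes, num_classes):
        total += linear_params(features, width)
        features = width
    return total


teacher = count_params([3000, 2000, 2000, 1000])
student = count_params([])
print(f"teacher: {teacher:,}  student: {student:,}  ratio: {teacher / student:.1f}x")
```

Under these assumptions the teacher has 21,257,026 parameters and the student 53,746, a reduction of roughly 395x that comes almost entirely from removing the wide hidden linear layers.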


In [8]:
# Training on the teacher's outputs only
!python scripts/02_train_kd.py --model-path models/params.pkl --train-batch-size 64 --train-steps 10001 --eval-steps 1000 --alpha 0.0 --save-dir models

2023-01-20 02:40:05.271035: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
[Step 0] Validation / Test accuracy: 0.080 / 0.077.
[Step 1000] Validation / Test accuracy: 0.489 / 0.489.
[Step 2000] Validation / Test accuracy: 0.610 / 0.612.
[Step 3000] Validation / Test accuracy: 0.652 / 0.653.
[Step 4000] Validation / Test accuracy: 0.673 / 0.664.
[Step 5000] Validation / Test accuracy: 0.676 / 0.667.
[Step 6000] Validation / Test accuracy: 0.675 / 0.667.
[Step 7000] Validation / Test accuracy: 0.674 / 0.667.
[Step 8000] Validation / Test accuracy: 0.668 / 0.664.
[Step 9000] Validation / Test accuracy: 0.664 / 0.658.
[Step 10000] Validation / Test accuracy: 0.659 / 0.653.


In [9]:
# Training on the ground-truth only
!python scripts/02_train_kd.py --model-path models/params.pkl --train-batch-size 64 --train-steps 10001 --eval-steps 1000 --alpha 1.0 --save-dir models

2023-01-20 02:25:52.427611: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
[Step 0] Validation / Test accuracy: 0.080 / 0.077.
[Step 1000] Validation / Test accuracy: 0.492 / 0.490.
[Step 2000] Validation / Test accuracy: 0.612 / 0.615.
[Step 3000] Validation / Test accuracy: 0.654 / 0.654.
[Step 4000] Validation / Test accuracy: 0.673 / 0.666.
[Step 5000] Validation / Test accuracy: 0.679 / 0.669.
[Step 6000] Validation / Test accuracy: 0.677 / 0.671.
[Step 7000] Validation / Test accuracy: 0.677 / 0.671.
[Step 8000] Validation / Test accuracy: 0.671 / 0.669.
[Step 9000] Validation / Test accuracy: 0.665 / 0.665.
[Step 10000] Validation / Test accuracy: 0.661 / 0.659.


In [10]:
# Training on the teacher's outputs and ground-truth with equal weights
!python scripts/02_train_kd.py --model-path models/params.pkl --train-batch-size 64 --train-steps 10001 --eval-steps 1000 --alpha 0.5 --save-dir models

2023-01-20 02:28:48.317223: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
[Step 0] Validation / Test accuracy: 0.080 / 0.077.
[Step 1000] Validation / Test accuracy: 0.491 / 0.489.
[Step 2000] Validation / Test accuracy: 0.610 / 0.613.
[Step 3000] Validation / Test accuracy: 0.657 / 0.651.
[Step 4000] Validation / Test accuracy: 0.674 / 0.665.
[Step 5000] Validation / Test accuracy: 0.678 / 0.668.
[Step 6000] Validation / Test accuracy: 0.677 / 0.671.
[Step 7000] Validation / Test accuracy: 0.673 / 0.671.
[Step 8000] Validation / Test accuracy: 0.668 / 0.667.
[Step 9000] Validation / Test accuracy: 0.664 / 0.662.
[Step 10000] Validation / Test accuracy: 0.659 / 0.657.
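
The three runs above differ only in `--alpha`: going by the cell comments, `0.0` trains on the teacher's outputs alone, `1.0` on the ground truth alone, and `0.5` weights both equally. One loss consistent with that behavior is sketched below in NumPy. The exact loss in `scripts/02_train_kd.py` is not shown here, so the soft-target term (cross-entropy against the teacher's softmax probabilities, with no temperature) and the name `distillation_loss` are assumptions.

```python
import numpy as np


def log_softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def distillation_loss(student_logits, teacher_logits, labels, alpha: float) -> float:
    """alpha * hard-label loss + (1 - alpha) * soft-label loss, averaged over the batch."""
    log_p = log_softmax(student_logits)
    # Hard-label term: cross-entropy against the ground-truth class index.
    hard = -log_p[np.arange(labels.size), labels].mean()
    # Soft-label term: cross-entropy against the teacher's probabilities (assumed form).
    teacher_probs = np.exp(log_softmax(teacher_logits))
    soft = -(teacher_probs * log_p).sum(axis=-1).mean()
    return float(alpha * hard + (1.0 - alpha) * soft)


# Hypothetical logits for a batch of 8 examples and 10 classes.
rng = np.random.default_rng(0)
s_logits = rng.normal(size=(8, 10))
t_logits = rng.normal(size=(8, 10))
y = rng.integers(0, 10, size=8)
for a in (0.0, 0.5, 1.0):
    print(f"alpha={a}: loss={distillation_loss(s_logits, t_logits, y, a):.4f}")
```

With `alpha=1.0` the teacher's logits drop out of the loss entirely, which matches the "ground-truth only" run.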
