## Instructions

You are asked to complete the following files:
* **pruned_layers.py**, which prunes insignificant weight parameters to reduce DNN storage, using 2 methods: pruning by percentage and pruning by standard deviation.
* **train_util.py**, which includes the training process of DNNs with pruned connections.
* **quantize.py**, which applies quantization (weight sharing) to the DNN to reduce the storage of weight parameters.
* **huffman_coding.py**, which applies Huffman coding to the DNN weights to further compress their size.

You are asked to submit the following files:
* **net_before_pruning.pt**, which contains the weight parameters before pruning.
* **net_after_pruning.pt**, which contains the weight parameters after pruning.
* **net_after_quantization.pt**, which contains the weight parameters after quantization (weight sharing).
* **codebook_vgg16.npy**, which contains the quantization codebook of each layer after quantization (weight sharing).
* **huffman_encoding.npy**, which contains the encoding map of each quantization codebook item across the whole DNN architecture.
* **huffman_freq.npy**, which contains the frequency map of each quantization codebook item across the whole DNN.

To keep grading fair, the model is fixed to VGG16_half, a down-scaled version of VGG16 with a width multiplier of 0.5. See the implementation in **vgg16.py** for more details.

In [1]:
from vgg16 import VGG16, VGG16_half
from train_util import train, finetune_after_prune, test
from quantize import quantize_whole_model
from huffman_coding import huffman_coding
from summary import summary
import torch
import numpy as np
from prune import prune

device = 'cuda' if torch.cuda.is_available() else 'cpu'

### Full-precision model training

In [2]:
net = VGG16_half()
net = net.to(device)

# Load pretrained weights (comment this out to train from scratch)
net.load_state_dict(torch.load("net_before_pruning.pt"))

# Uncomment to train from scratch instead of loading pretrained weights.
# Tune the hyperparameters here.
#train(net, epochs=100, batch_size=233, lr=0.0173, reg=1e-2)

<All keys matched successfully>

In [3]:
# Load the best weight parameters
net.load_state_dict(torch.load("net_before_pruning.pt"))
test(net)

Files already downloaded and verified
Test Loss=0.3027, Test accuracy=0.9177


In [4]:
print("-----Summary before pruning-----")
summary(net)
print("-------------------------------")

-----Summary before pruning-----
Layer id	Type		Parameter	Non-zero parameter	Sparsity(\%)
1		Convolutional	864		864			0.000000
2		BatchNorm	N/A		N/A			N/A
3		ReLU		N/A		N/A			N/A
4		Convolutional	9216		9216			0.000000
5		BatchNorm	N/A		N/A			N/A
6		ReLU		N/A		N/A			N/A
7		Convolutional	18432		18432			0.000000
8		BatchNorm	N/A		N/A			N/A
9		ReLU		N/A		N/A			N/A
10		Convolutional	36864		36864			0.000000
11		BatchNorm	N/A		N/A			N/A
12		ReLU		N/A		N/A			N/A
13		Convolutional	73728		73728			0.000000
14		BatchNorm	N/A		N/A			N/A
15		ReLU		N/A		N/A			N/A
16		Convolutional	147456		147456			0.000000
17		BatchNorm	N/A		N/A			N/A
18		ReLU		N/A		N/A			N/A
19		Convolutional	147456		147456			0.000000
20		BatchNorm	N/A		N/A			N/A
21		ReLU		N/A		N/A			N/A
22		Convolutional	294912		294912			0.000000
23		BatchNorm	N/A		N/A			N/A
24		ReLU		N/A		N/A			N/A
25		Convolutional	589824		589824			0.000000
26		BatchNorm	N/A		N/A			N/A
27		ReLU		N/A		N/A			N/A
28		Convolutional	589824		589824			0.000000
29		Batch

### Pruning & Finetune with pruned connections

In [5]:
# Test accuracy before fine-tuning
prune(net, method='percentage', q=45.0, s=0.75)
test(net)

Files already downloaded and verified
Test Loss=0.3275, Test accuracy=0.9082
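
The two pruning methods named in the instructions can be sketched on a plain NumPy array. This is a minimal illustration, not the contents of **pruned_layers.py**: the helper names `prune_by_percentage` and `prune_by_std` are hypothetical, and the interpretation of `s` as a multiplier on the standard deviation is an assumption. Only the parameter names `q` and `s` come from the `prune(...)` call above.

```python
import numpy as np

def prune_by_percentage(weights, q=45.0):
    """Zero out the q percent of weights with the smallest magnitude."""
    threshold = np.percentile(np.abs(weights), q)
    mask = np.abs(weights) > threshold  # True for weights that survive
    return weights * mask, mask

def prune_by_std(weights, s=0.75):
    """Zero out weights whose magnitude is below s times the standard deviation.

    Assumption: the threshold is s * std(weights).
    """
    threshold = s * np.std(weights)
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

# Toy weight matrix standing in for one layer (hypothetical data).
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 32))

pruned_pct, mask_pct = prune_by_percentage(w, q=45.0)
pruned_std, mask_std = prune_by_std(w, s=0.75)

# Sparsity is the fraction of weights set to zero.
sparsity_pct = 1.0 - mask_pct.mean()
sparsity_std = 1.0 - mask_std.mean()
print(f"percentage: {sparsity_pct:.4f}, std: {sparsity_std:.4f}")
```

With `q=45.0` the percentage method yields a sparsity close to 0.45 per layer, while the sparsity of the standard-deviation method depends on the shape of each layer's weight distribution.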


In [6]:
# Load pretrained weights (comment this out to fine-tune yourself)
net.load_state_dict(torch.load("net_after_pruning.pt"))
# Uncomment to fine-tune instead of loading pretrained weights.
# finetune_after_prune(net, epochs=50, batch_size=128, lr=0.001, reg=5e-5)

<All keys matched successfully>

In [7]:
# Load the best weight parameters
net.load_state_dict(torch.load("net_after_pruning.pt"))
test(net)

Files already downloaded and verified
Test Loss=0.3664, Test accuracy=0.9101
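
For fine-tuning to preserve the sparsity obtained by pruning, the pruned weights have to stay at zero during each update. One common way to do this is to multiply the updated weights by the pruning mask. The sketch below shows a single masked SGD step in NumPy; the function name `masked_sgd_step` and the toy data are hypothetical, and how `finetune_after_prune` in **train_util.py** actually enforces the mask is not shown in this notebook.

```python
import numpy as np

def masked_sgd_step(weights, grad, mask, lr=0.001):
    """One SGD step that keeps pruned (mask == 0) weights at exactly zero."""
    return (weights - lr * grad) * mask

# Toy layer: prune small weights, then take one masked update step.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
mask = (np.abs(w) > 0.5).astype(w.dtype)  # 1 keeps a weight, 0 prunes it
w = w * mask
grad = rng.normal(size=w.shape)           # stand-in for a real gradient

w_new = masked_sgd_step(w, grad, mask)
print("non-zero before:", np.count_nonzero(w))
print("non-zero after: ", np.count_nonzero(w_new))
```

Applying the mask after the update, rather than only to the gradient, also cancels any change that weight decay or momentum would otherwise make to a pruned weight.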


In [8]:
print("-----Summary After pruning-----")
summary(net)
print("-------------------------------")

-----Summary After pruning-----
Layer id	Type		Parameter	Non-zero parameter	Sparsity(\%)
1		Convolutional	864		475			0.450231
2		BatchNorm	N/A		N/A			N/A
3		ReLU		N/A		N/A			N/A
4		Convolutional	9216		5069			0.449978
5		BatchNorm	N/A		N/A			N/A
6		ReLU		N/A		N/A			N/A
7		Convolutional	18432		10138			0.449978
8		BatchNorm	N/A		N/A			N/A
9		ReLU		N/A		N/A			N/A
10		Convolutional	36864		20275			0.450005
11		BatchNorm	N/A		N/A			N/A
12		ReLU		N/A		N/A			N/A
13		Convolutional	73728		40550			0.450005
14		BatchNorm	N/A		N/A			N/A
15		ReLU		N/A		N/A			N/A
16		Convolutional	147456		81101			0.449999
17		BatchNorm	N/A		N/A			N/A
18		ReLU		N/A		N/A			N/A
19		Convolutional	147456		81101			0.449999
20		BatchNorm	N/A		N/A			N/A
21		ReLU		N/A		N/A			N/A
22		Convolutional	294912		162202			0.449999
23		BatchNorm	N/A		N/A			N/A
24		ReLU		N/A		N/A			N/A
25		Convolutional	589824		324403			0.450000
26		BatchNorm	N/A		N/A			N/A
27		ReLU		N/A		N/A			N/A
28		Convolutional	589824		324403			0.450000
29		BatchNor

### Quantization

In [9]:
centers = quantize_whole_model(net, bits=3)
np.save("codebook_vgg16.npy", centers)

Complete 1 layers quantization...
Complete 2 layers quantization...
Complete 3 layers quantization...
Complete 4 layers quantization...
Complete 5 layers quantization...
Complete 6 layers quantization...
Complete 7 layers quantization...
Complete 8 layers quantization...
Complete 9 layers quantization...
Complete 10 layers quantization...
Complete 11 layers quantization...
Complete 12 layers quantization...
Complete 13 layers quantization...
Complete 14 layers quantization...
Complete 15 layers quantization...
Complete 16 layers quantization...


In [10]:
test(net)

Files already downloaded and verified
Test Loss=0.3814, Test accuracy=0.9004
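
Weight sharing replaces each remaining weight with the nearest entry of a small per-layer codebook, so that only the codebook and a short index per weight need to be stored. The sketch below clusters the non-zero weights of one toy layer with 1-D k-means. It is an illustration under stated assumptions, not the contents of **quantize.py**: the function name `quantize_weights`, the linear initialisation of the centers, and the fixed iteration count are all assumptions.

```python
import numpy as np

def quantize_weights(weights, bits=4, iters=20):
    """Cluster the non-zero weights into 2**bits shared values (1-D k-means)."""
    flat = weights.ravel()
    nonzero = flat != 0                      # pruned weights stay at zero
    values = flat[nonzero]
    # Assumption: centers start evenly spaced between the min and max weight.
    centers = np.linspace(values.min(), values.max(), 2 ** bits)
    for _ in range(iters):
        # Assign every weight to its nearest center.
        assign = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each non-empty center to the mean of its members.
        for k in range(centers.size):
            members = values[assign == k]
            if members.size:
                centers[k] = members.mean()
    # Final assignment against the updated centers.
    assign = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    quantized = np.zeros_like(flat)
    quantized[nonzero] = centers[assign]
    return quantized.reshape(weights.shape), centers

# Toy pruned layer (hypothetical data).
rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(32, 32))
w[np.abs(w) < 0.02] = 0.0

w_q, centers = quantize_weights(w, bits=3)
print("codebook size:", centers.size)
print("distinct non-zero values:", np.unique(w_q[w != 0]).size)
```

Excluding zeros from the clustering keeps the pruning mask intact, so the sparsity reported in the summary is unchanged by quantization.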


### Huffman Coding
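
Huffman coding assigns shorter bit strings to codebook entries that occur more often, which lowers the average number of bits per weight below the fixed-length cost. The sketch below builds a frequency map and an encoding map from a list of symbols using only the standard library. The names `huffman_codes` and `average_bits` are hypothetical, and the string representation of the codes is a choice made for this illustration; it is not the contents of **huffman_coding.py**.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(symbols):
    """Return (frequency map, code map) for an iterable of hashable symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single symbol still needs one bit to be representable.
        return dict(freq), {next(iter(freq)): "0"}
    tie = count()  # tie-breaker so heapq never compares the dict payloads
    heap = [(n, next(tie), {sym: ""}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        n0, _, codes0 = heapq.heappop(heap)
        n1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (n0 + n1, next(tie), merged))
    return dict(freq), heap[0][2]

def average_bits(freq, codes):
    """Frequency-weighted mean code length in bits."""
    total = sum(freq.values())
    return sum(freq[s] * len(codes[s]) for s in freq) / total

# Toy layer: codebook indices with a skewed distribution (hypothetical data).
indices = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
freq, codes = huffman_codes(indices)
avg = average_bits(freq, codes)
print(codes)
print(f"average bits per weight: {avg:.4f}")
```

For this toy input the four symbols would need 2 bits each with a fixed-length code, while the Huffman code averages 1.75 bits. A correct Huffman code never averages more bits than the fixed-length code for the same symbols, which is a useful sanity check on the per-layer numbers printed by an implementation.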

In [11]:
frequency_map, encoding_map = huffman_coding(net, centers)
np.save("huffman_encoding", encoding_map)
np.save("huffman_freq", frequency_map)

Original storage for each parameter: 4.0000 bits
{-0.1730333330730597: [1, 1, 1, 1, 1, 1, 1, 1], 0.04084310369116477: [0, 1, 1, 1, 1, 1, 1, 1], -0.09919615405110213: [1, 1, 1, 1, 1, 1, 1], 0.102154545034423: [0, 1, 1, 1, 1, 1, 1], -0.06904473675316887: [1, 1, 1, 1, 1, 1], -0.042533333516783206: [0, 1, 1, 1, 1, 1], -0.13112307597811407: [1, 1, 1, 1, 1], 0.06922307725136095: [0, 1, 1, 1, 1], -0.2452000031868617: [1, 1, 1, 1], -0.0187830189067238: [0, 1, 1, 1], 0.019636842144424436: [1, 1, 1], 0.24595999419689177: [0, 1, 1], 0.0006880597042010189: [1, 1], 0.36540000637372333: [0, 1], 0.1568235290401122: [1], -0.42910000681877136: [0]}
Average storage for each parameter after Huffman Coding: 4.5000 bits
Complete 1 layers for Huffman Coding...
Original storage for each parameter: 4.0000 bits
{-0.0666328127263113: [1, 1, 1, 1, 1, 1, 1, 1], -0.02084244614158809: [0, 1, 1, 1, 1, 1, 1, 1], 0.008925806468470064: [1, 1, 1, 1, 1, 1, 1], -0.0007179807675898576: [0, 1, 1, 1, 1, 1, 1], 0.174512499943