## Instructions

You are asked to complete the following files:
* **pruned_layers.py**, which implements pruning of DNNs to reduce the storage of insignificant weight parameters, using 2 methods: pruning by percentage and pruning by standard deviation.
* **train_util.py**, which includes the training process of DNNs with pruned connections.
* **quantize.py**, which applies quantization (weight sharing) to the DNN to reduce the storage of weight parameters.
* **huffman_coding.py**, which applies Huffman coding to the weights of the DNN to further compress the weight size.
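The two pruning criteria named in the list above can be sketched in NumPy as follows. This is a minimal sketch, not the reference solution for **pruned_layers.py**: the function names `prune_by_percentage` and `prune_by_std` are hypothetical, `q` is assumed to be the percentage of weights (by magnitude) to zero out, and `s` is assumed to scale the layer's standard deviation into a magnitude threshold.

```python
import numpy as np

def prune_by_percentage(weight, q=60.0):
    """Zero the q% of entries with the smallest magnitude (assumed meaning of q)."""
    threshold = np.percentile(np.abs(weight), q)
    mask = (np.abs(weight) > threshold).astype(weight.dtype)
    return weight * mask, mask

def prune_by_std(weight, s=0.25):
    """Zero entries whose magnitude is at most s times the layer's standard deviation."""
    threshold = s * np.std(weight)
    mask = (np.abs(weight) > threshold).astype(weight.dtype)
    return weight * mask, mask

# Toy convolution-shaped weight tensor (random values, for illustration only).
rng = np.random.default_rng(0)
w = rng.normal(size=(16, 3, 3, 3)).astype(np.float32)
pruned, mask = prune_by_percentage(w, q=60.0)
sparsity = 1.0 - np.count_nonzero(pruned) / pruned.size
print(f"sparsity after percentage pruning: {sparsity:.4f}")
```

Returning the mask alongside the pruned weights is a design choice of this sketch: the mask is what allows later training steps to keep the pruned connections at zero.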

You are asked to submit the following files:
* **net_before_pruning.pt**, which is the set of weight parameters before applying pruning on DNN weight parameters.
* **net_after_pruning.pt**, which is the set of weight parameters after applying pruning on DNN weight parameters.
* **net_after_quantization.pt**, which is the set of weight parameters after applying quantization (weight sharing) on DNN weight parameters.
* **codebook_resnet20.npy**, which is the quantization codebook of each layer after applying quantization (weight sharing).
* **huffman_encoding.npy**, which is the encoding map of each item within the quantization codebook in the whole DNN architecture.
* **huffman_freq.npy**, which is the frequency map of each item within the quantization codebook in the whole DNN. 

To ensure fair grading, we fix the choice of model to ResNet-20. You may check the implementation in **resnet20.py** for more details.

In [1]:
from resnet20 import ResNetCIFAR
from train_util import train, finetune_after_prune, test
from quantize import quantize_whole_model
from huffman_coding import huffman_coding
from summary import summary
import torch
import numpy as np
from prune import prune


device = 'cuda' if torch.cuda.is_available() else 'cpu'

### Full-precision model training

In [2]:
net = ResNetCIFAR(num_layers=20)
net = net.to(device)

# Uncomment to load pretrained weights instead of training from scratch
# net.load_state_dict(torch.load("net_before_pruning.pt"))
# Uncomment to train from scratch; leave commented if you have loaded pretrained weights
# train(net, epochs=100, batch_size=128, lr=0.01, reg=1e-3)

In [3]:
# Load the best weight parameters
net.load_state_dict(torch.load("net_before_pruning.pt"))
test(net)

Files already downloaded and verified
Test Loss=0.2608, Test accuracy=0.9169


In [4]:
print("-----Summary before pruning-----")
summary(net)
print("-------------------------------")

-----Summary before pruning-----
Layer id	Type		Parameter	Non-zero parameter	Sparsity(\%)
1		Convolutional	864		864			0.000000
2		BatchNorm	N/A		N/A			N/A
3		ReLU		N/A		N/A			N/A
4		Convolutional	4608		4608			0.000000
5		BatchNorm	N/A		N/A			N/A
6		ReLU		N/A		N/A			N/A
7		Convolutional	2304		2304			0.000000
8		BatchNorm	N/A		N/A			N/A
9		Convolutional	512		512			0.000000
10		BatchNorm	N/A		N/A			N/A
11		ReLU		N/A		N/A			N/A
12		Convolutional	2304		2304			0.000000
13		BatchNorm	N/A		N/A			N/A
14		ReLU		N/A		N/A			N/A
15		Convolutional	2304		2304			0.000000
16		BatchNorm	N/A		N/A			N/A
17		ReLU		N/A		N/A			N/A
18		Convolutional	2304		2304			0.000000
19		BatchNorm	N/A		N/A			N/A
20		ReLU		N/A		N/A			N/A
21		Convolutional	2304		2304			0.000000
22		BatchNorm	N/A		N/A			N/A
23		ReLU		N/A		N/A			N/A
24		Convolutional	4608		4608			0.000000
25		BatchNorm	N/A		N/A			N/A
26		ReLU		N/A		N/A			N/A
27		Convolutional	9216		9216			0.000000
28		BatchNorm	N/A		N/A			N/A
29		Convolutional	512		512			0.00

### Pruning & Finetune with pruned connections

In [5]:
from pruned_layers import PrunedConv, PruneLinear
import matplotlib.pyplot as plt

def plot_weights(net):
    """Plot a weight histogram for every prunable layer in net."""
    plt.figure(figsize=(15, 35))
    count = 1
    for name, m in net.named_modules():
        # Pick the wrapped layer's weight tensor; skip all other modules
        if isinstance(m, PruneLinear):
            weight = m.linear.weight
        elif isinstance(m, PrunedConv):
            weight = m.conv.weight
        else:
            continue
        # Convert to a flat NumPy array for the histogram
        weight = weight.data.cpu().numpy().reshape(-1)
        plt.subplot(8, 3, count)
        plt.title("Weight histogram of layer " + name)
        _ = plt.hist(weight, bins=20)
        count += 1
    plt.show()

In [14]:
# Test accuracy after 'percentage' pruning, before fine-tuning
net.load_state_dict(torch.load("net_before_pruning.pt"))
prune(net, method='percentage', q=60, s=0.0)
test(net)

Files already downloaded and verified
Test Loss=1.3079, Test accuracy=0.5941


In [7]:
summary(net)

Layer id	Type		Parameter	Non-zero parameter	Sparsity(\%)
1		Convolutional	864		346			0.599537
2		BatchNorm	N/A		N/A			N/A
3		ReLU		N/A		N/A			N/A
4		Convolutional	4608		1843			0.600043
5		BatchNorm	N/A		N/A			N/A
6		ReLU		N/A		N/A			N/A
7		Convolutional	2304		922			0.599826
8		BatchNorm	N/A		N/A			N/A
9		Convolutional	512		205			0.599609
10		BatchNorm	N/A		N/A			N/A
11		ReLU		N/A		N/A			N/A
12		Convolutional	2304		922			0.599826
13		BatchNorm	N/A		N/A			N/A
14		ReLU		N/A		N/A			N/A
15		Convolutional	2304		922			0.599826
16		BatchNorm	N/A		N/A			N/A
17		ReLU		N/A		N/A			N/A
18		Convolutional	2304		922			0.599826
19		BatchNorm	N/A		N/A			N/A
20		ReLU		N/A		N/A			N/A
21		Convolutional	2304		922			0.599826
22		BatchNorm	N/A		N/A			N/A
23		ReLU		N/A		N/A			N/A
24		Convolutional	4608		1843			0.600043
25		BatchNorm	N/A		N/A			N/A
26		ReLU		N/A		N/A			N/A
27		Convolutional	9216		3686			0.600043
28		BatchNorm	N/A		N/A			N/A
29		Convolutional	512		205			0.599609
30		BatchNorm	N/A		N/A			N/A
31		
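Note that the Sparsity column in this table is a fraction (0.599537 means about 59.95% of the weights are zero), even though the header is labelled as a percentage. The non-zero counts are consistent with zeroing every weight whose magnitude is at or below a linearly interpolated 60th percentile. The short check below works through that arithmetic; the helper `expected_nonzero` is hypothetical, and the percentile rule is an assumption inferred from the numbers, not something the notebook states.

```python
def expected_nonzero(n, q=60):
    """Survivors when weights with |w| <= the q-th percentile of |w| are zeroed.

    Assumes a linearly interpolated percentile and no ties among magnitudes.
    """
    pruned = (q * (n - 1)) // 100 + 1   # entries at or below the threshold
    return n - pruned

# Layer sizes taken from the Parameter column of the summary table.
for n in (864, 4608, 2304, 512, 9216):
    kept = expected_nonzero(n)
    print(f"{n:5d} parameters -> {kept:4d} non-zero, sparsity {1 - kept / n:.6f}")
```

For the first layer this gives 864 - 518 = 346 surviving weights and a sparsity of 518 / 864, which rounds to 0.599537, matching the first row of the table.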

In [15]:
# Comment out if you have loaded pretrained weights
finetune_after_prune(net, epochs=20, batch_size=128, lr=0.0001, reg=1e-3)

==> Preparing data..
Files already downloaded and verified
Files already downloaded and verified

Epoch: 0
[Step=50]	Loss=0.2724	acc=0.9259	2326.7 examples/second
[Step=100]	Loss=0.2601	acc=0.9296	2294.7 examples/second
[Step=150]	Loss=0.2531	acc=0.9324	2175.0 examples/second
[Step=200]	Loss=0.2481	acc=0.9336	2351.5 examples/second
[Step=250]	Loss=0.2478	acc=0.9343	2136.1 examples/second
[Step=300]	Loss=0.2460	acc=0.9342	2169.3 examples/second
[Step=350]	Loss=0.2439	acc=0.9350	2399.9 examples/second
Test Loss=0.2967, Test acc=0.9049
Saving...

Epoch: 1
[Step=400]	Loss=0.2116	acc=0.9531	1168.5 examples/second
[Step=450]	Loss=0.2224	acc=0.9432	2737.7 examples/second
[Step=500]	Loss=0.2224	acc=0.9419	2679.2 examples/second
[Step=550]	Loss=0.2245	acc=0.9404	2451.0 examples/second
[Step=600]	Loss=0.2226	acc=0.9409	2299.0 examples/second
[Step=650]	Loss=0.2217	acc=0.9411	2271.7 examples/second
[Step=700]	Loss=0.2198	acc=0.9418	2186.6 examples/second
[Step=750]	Loss=0.2201	acc=0.9414	2315.9 e

In [16]:
# Load the best weight parameters
net.load_state_dict(torch.load("net_after_pruning.pt"))
test(net)

Files already downloaded and verified
Test Loss=0.2710, Test accuracy=0.9136
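The layer summaries in this notebook report the same sparsity before and after fine-tuning, which is what one would expect if pruned connections are held at zero while the remaining weights are trained. One common way to achieve that is to re-apply the pruning mask on every update. The sketch below illustrates the idea with plain SGD in NumPy; `masked_sgd_step` is a hypothetical helper, and whether `finetune_after_prune` works this way is an assumption.

```python
import numpy as np

def masked_sgd_step(weight, grad, mask, lr=1e-4):
    """One plain SGD step in which pruned positions (mask == 0) stay at zero."""
    return (weight - lr * grad) * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
mask = (np.abs(w) > 0.5).astype(w.dtype)   # hypothetical pruning mask
w = w * mask
for _ in range(10):
    grad = rng.normal(size=w.shape)        # stand-in for a real gradient
    w = masked_sgd_step(w, grad, mask)
print("pruned positions still zero:", bool(np.all(w[mask == 0] == 0)))
```

Multiplying by the mask after the update (and not only masking the gradient) also cancels any change that weight decay or momentum would otherwise apply to pruned positions.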


In [17]:
print("-----Summary After pruning-----")
summary(net)
print("-------------------------------")

-----Summary After pruning-----
Layer id	Type		Parameter	Non-zero parameter	Sparsity(\%)
1		Convolutional	864		346			0.599537
2		BatchNorm	N/A		N/A			N/A
3		ReLU		N/A		N/A			N/A
4		Convolutional	4608		1843			0.600043
5		BatchNorm	N/A		N/A			N/A
6		ReLU		N/A		N/A			N/A
7		Convolutional	2304		922			0.599826
8		BatchNorm	N/A		N/A			N/A
9		Convolutional	512		205			0.599609
10		BatchNorm	N/A		N/A			N/A
11		ReLU		N/A		N/A			N/A
12		Convolutional	2304		922			0.599826
13		BatchNorm	N/A		N/A			N/A
14		ReLU		N/A		N/A			N/A
15		Convolutional	2304		922			0.599826
16		BatchNorm	N/A		N/A			N/A
17		ReLU		N/A		N/A			N/A
18		Convolutional	2304		922			0.599826
19		BatchNorm	N/A		N/A			N/A
20		ReLU		N/A		N/A			N/A
21		Convolutional	2304		922			0.599826
22		BatchNorm	N/A		N/A			N/A
23		ReLU		N/A		N/A			N/A
24		Convolutional	4608		1843			0.600043
25		BatchNorm	N/A		N/A			N/A
26		ReLU		N/A		N/A			N/A
27		Convolutional	9216		3686			0.600043
28		BatchNorm	N/A		N/A			N/A
29		Convolutional	512		205			0.599609
3

### Quantization

In [18]:
net.load_state_dict(torch.load("net_after_pruning.pt"))
test(net)
# plot_weights(net)
centers = quantize_whole_model(net, bits=4)
# plot_weights(net)
np.save("codebook_resnet20.npy", centers)
torch.save(net.state_dict(), "net_after_quantization.pt")
test(net)

Files already downloaded and verified
Test Loss=0.2710, Test accuracy=0.9136
Complete 1 layers quantization...
Complete 2 layers quantization...
Complete 3 layers quantization...
Complete 4 layers quantization...
Complete 5 layers quantization...
Complete 6 layers quantization...
Complete 7 layers quantization...
Complete 8 layers quantization...
Complete 9 layers quantization...
Complete 10 layers quantization...
Complete 11 layers quantization...
Complete 12 layers quantization...
Complete 13 layers quantization...
Complete 14 layers quantization...
Complete 15 layers quantization...
Complete 16 layers quantization...
Complete 17 layers quantization...
Complete 18 layers quantization...
Complete 19 layers quantization...
Complete 20 layers quantization...
Complete 21 layers quantization...
Complete 22 layers quantization...
Complete 23 layers quantization...
Files already downloaded and verified
Test Loss=0.2940, Test accuracy=0.9080
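With `bits=4`, each layer's surviving weights are replaced by at most 2**4 = 16 shared values, and those values form the layer's codebook. A minimal sketch of one way to do this, using k-means with linear initialization on a single layer, is given below. The function `quantize_layer`, the choice of plain Lloyd iterations, and the linear initialization are assumptions for illustration; the notebook does not show how `quantize_whole_model` is implemented.

```python
import numpy as np

def quantize_layer(weight, bits=4, iters=20):
    """Cluster the non-zero weights into 2**bits shared values (plain k-means)."""
    flat = weight.reshape(-1)
    nonzero = flat != 0                      # pruned weights stay exactly zero
    values = flat[nonzero]
    # Linear initialization between the smallest and largest surviving weight.
    centers = np.linspace(values.min(), values.max(), 2 ** bits)
    for _ in range(iters):
        # Assign every weight to its nearest center.
        assign = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each non-empty center to the mean of its members.
        for k in range(centers.size):
            members = values[assign == k]
            if members.size:
                centers[k] = members.mean()
    # Final assignment against the updated centers.
    assign = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    quantized = np.zeros_like(flat)
    quantized[nonzero] = centers[assign]
    return quantized.reshape(weight.shape), centers

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(8, 4, 3, 3)).astype(np.float32)
w[np.abs(w) < 0.05] = 0.0                    # pretend these were pruned
q, codebook = quantize_layer(w, bits=4)
print(np.unique(q[q != 0]).size, "distinct non-zero values")
```

Excluding zeros from the clustering keeps the sparsity obtained by pruning intact, at the cost of storing which positions are zero separately.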


### Huffman Coding

In [12]:
frequency_map, encoding_map = huffman_coding(net, centers)
np.save("huffman_encoding", encoding_map)
np.save("huffman_freq", frequency_map)

Original storage for each parameter: 4.0000 bits
Average storage for each parameter after Huffman Coding: 3.5723 bits
Complete 1 layers for Huffman Coding...
Original storage for each parameter: 4.0000 bits
Average storage for each parameter after Huffman Coding: 3.0760 bits
Complete 2 layers for Huffman Coding...
Original storage for each parameter: 4.0000 bits
Average storage for each parameter after Huffman Coding: 3.1920 bits
Complete 3 layers for Huffman Coding...
Original storage for each parameter: 4.0000 bits
Average storage for each parameter after Huffman Coding: 3.4927 bits
Complete 4 layers for Huffman Coding...
Original storage for each parameter: 4.0000 bits
Average storage for each parameter after Huffman Coding: 3.5195 bits
Complete 5 layers for Huffman Coding...
Original storage for each parameter: 4.0000 bits
Average storage for each parameter after Huffman Coding: 3.4306 bits
Complete 6 layers for Huffman Coding...
Original storage for each parameter: 4.0000 bits
Ave
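
The drop from 4 bits to roughly 3 to 3.6 bits per parameter in the log above comes from giving frequent codebook entries shorter codes. The sketch below builds a Huffman code for a toy sequence of codebook indices and computes the average code length. It uses only the standard library; `huffman_code` and `average_bits` are hypothetical names, and the real `huffman_coding` function may differ, for example in how it treats pruned (zero) weights.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return (frequency map, symbol -> bit-string map) for a sequence of symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # a single symbol still needs 1 bit
        return dict(freq), {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, {symbol: code-so-far}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes with 0 / 1.
        n0, _, left = heapq.heappop(heap)
        n1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n0 + n1, tie, merged))
        tie += 1
    return dict(freq), heap[0][2]

def average_bits(freq, codes):
    """Frequency-weighted mean code length in bits per symbol."""
    total = sum(freq.values())
    return sum(freq[s] * len(codes[s]) for s in freq) / total

# Toy example: 4 codebook entries (2 bits fixed-length) with skewed frequencies.
indices = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
freq, codes = huffman_code(indices)
print(f"{average_bits(freq, codes):.4f} bits per weight")  # prints 1.7500
```

Here the code lengths come out as 1, 2, 3 and 3 bits, so the average is (8·1 + 4·2 + 2·3 + 2·3) / 16 = 1.75 bits, compared with 2 bits for a fixed-length encoding of four entries.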