<a href="https://colab.research.google.com/github/lehduong/GINP/blob/testing/Rethinking_Network_Pruning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
!nvidia-smi

# Gifts from Iterative Pruning

## 1. Prepare

Clone the GitHub repo and install dependencies.

In [0]:
!git clone https://github.com/lehduong/ginp -b testing

Cloning into 'ginp'...
remote: Enumerating objects: 469, done.
remote: Counting objects: 100% (469/469), done.
remote: Compressing objects: 100% (204/204), done.
remote: Total 469 (delta 289), reused 438 (delta 259), pack-reused 0
Receiving objects: 100% (469/469), 1.12 MiB | 2.72 MiB/s, done.
Resolving deltas: 100% (289/289), done.


In [0]:
cd /content/ginp/cifar/filter_pruning/

/content/ginp/cifar/filter_pruning


In [0]:
!pip install ptflops

## 2. Train and Evaluate Networks

### 1. Training from scratch 

Train the baseline model from scratch. Note that I have only tested training from scratch with *cifar/**filter_pruning**/train.py*.

Set the **--arch** option to the desired architecture, which should be one of: resnet56, resnet110, wrn_16_10, vgg16, preresnet110. See *cifar/xxxx_pruning/models/__init__.py* for more details.

+ For resnet56, resnet110, preresnet110, vgg16: $training\_epochs = 300$, $schedule=[150,225]$, $gamma=0.1$, $lr=0.1$

+ For wrn_16_10: $training\_epochs = 200$, $schedule=[60,120, 160]$, $gamma=0.2$, $lr=0.1$

Currently, I use $weight\_decay=0.0001$ for all models, though the original Wide ResNet paper suggests $0.0005$.
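To make the settings above concrete, here is a minimal sketch of a step schedule, assuming the learning rate is multiplied by $gamma$ once at every epoch listed in $schedule$ and that epochs are counted from 0. The helper name `step_lr` is hypothetical, and the exact epoch at which *train.py* applies each drop may differ by one.

```python
def step_lr(epoch, base_lr, schedule, gamma):
    """Learning rate at `epoch` under a step schedule.

    Assumption: the rate is multiplied by `gamma` once for every
    milestone in `schedule` that has been reached (epoch >= milestone).
    """
    drops = sum(1 for milestone in schedule if epoch >= milestone)
    return base_lr * gamma ** drops


# Wide ResNet settings from the list above: lr=0.1, schedule=[60, 120, 160], gamma=0.2
for epoch in (0, 60, 120, 160):
    print(epoch, step_lr(epoch, 0.1, [60, 120, 160], 0.2))
```

With the ResNet/VGG settings, the same helper gives a final learning rate of about $0.001$ after both drops at epochs 150 and 225.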

In [0]:
!python train.py -a wrn_16_8 -d cifar100 --epochs 200 --schedule 60 120 160 --gamma 0.2 --wd 1e-4 --save checkpoints

==> Preparing dataset cifar100
Files already downloaded and verified
==> creating model 'wrn_16_8'
    Total params: 11.01M

Epoch: [1 | 200] LR: 0.100000
Processing |################################| (391/391) Data: 0.002s | Batch: 0.100s | Total: 0:00:38 | ETA: 0:00:01 | Loss: 3.8460 | top1:  10.4300 | top5:  31.5840
Processing |################################| (100/100) Data: 0.003s | Batch: 0.032s | Total: 0:00:03 | ETA: 0:00:01 | Loss: 3.6172 | top1:  14.5700 | top5:  41.5400

Epoch: [2 | 200] LR: 0.100000
Processing |################################| (391/391) Data: 0.002s | Batch: 0.096s | Total: 0:00:37 | ETA: 0:00:01 | Loss: 3.0114 | top1:  24.3120 | top5:  54.6980
Processing |################################| (100/100) Data: 0.003s | Batch: 0.029s | Total: 0:00:02 | ETA: 0:00:01 | Loss: 2.8561 | top1:  27.0800 | top5:  59.6800

Epoch: [3 | 200] LR: 0.100000
Processing |################################| (391/391) Data: 0.002s | Batch: 0.096s | Total: 0:00:37 |

### 2. Evaluate trained models
Evaluate a trained (or pruned) model by passing the **--evaluate** option and pointing **--resume** at the corresponding checkpoint.

In [0]:
!python train.py -a wrn_16_8 -d cifar10 --resume checkpoints/model_best.pth.tar  --evaluate --save a

## 3. Pruning

### 1. Filter Pruning

For resnet56, resnet110, preresnet110, wrn_16_10: run the following command

In [0]:
!python residualprune.py --dataset cifar100 --arch wrn_28_10 --model prune_2/model_best.pth.tar --save prune_3

For vgg16: run the following command

In [0]:
!python vggprune.py --dataset cifar100 --arch vgg16 --model checkpoints/model_best.pth.tar --save prune_1

### 2. Weight Pruning

Run the following command for **all** models.

Note that to apply iterative pruning, we have to increase **--percent** manually at each step.

For example, in the first pruning step, we set the --percent option to **0.2** and resume from the baseline model; in the second pruning step, we increase the --percent option to **0.4** and resume from the finetuned pruned network of step 1.
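The toy sketch below illustrates why the value has to grow from step to step. It assumes that --percent is the cumulative fraction of weights removed and that pruning zeroes the smallest-magnitude weights; both are assumptions about *cifar_prune.py*, and the helper names `magnitude_prune` and `sparsity` are hypothetical. Finetuning between the two steps is omitted.

```python
def magnitude_prune(weights, percent):
    """Zero out the `percent` fraction of weights with the smallest magnitude."""
    k = int(len(weights) * percent)
    # Indices ordered by absolute value; the first k are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:k])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Toy weight vector standing in for a network (illustrative values only).
weights = [0.9, -0.1, 0.5, -0.7, 0.05, 0.3, -0.2, 0.8, -0.6, 0.4]

step1 = magnitude_prune(weights, 0.2)  # first step: 20% of weights removed
step2 = magnitude_prune(step1, 0.4)    # second step: 40% removed in total

print(sparsity(step1), sparsity(step2))
```

Weights that are already zero are the smallest by magnitude, so under these assumptions calling the second step with 0.2 again would remove nothing new; raising the value to 0.4 prunes a further 20% of the original weights.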

In [0]:
!python cifar_prune.py --arch preresnet110 --dataset cifar10 --percent 0.96 --resume prune_2/finetuned.pth.tar   --save_dir prune_3

## 4. Finetuning
### 1. Finetune for Filter Pruning

To retrain a network that was pruned by **filter** pruning, run the command below.

For all models and datasets, we used $lr=0.1$, $schedule=[20,30]$, $gamma=0.1$. 

TODO: Finetune wideresnet

In [0]:
!python finetune.py --lr 0.02 --schedule 20 30 --gamma 0.2 --refine prune_3/pruned.pth.tar --dataset cifar100 --arch wrn_28_10 --save prune_3

### 2. Finetune for Weight Pruning

To retrain a network pruned by **weight** pruning, run the following command.

We used the same hyperparameters as for retraining after filter pruning.

In [0]:
!python cifar_finetune.py --arch preresnet110 --dataset cifar10  --resume prune_3/pruned.pth.tar --save_dir prune_3
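Because each weight-pruning step resumes from the finetuned checkpoint of the previous one, the prune-then-finetune cycle can be generated rather than typed by hand. The sketch below only builds the command strings, using the flags and file names that appear in the cells of this notebook. The function name, the baseline checkpoint path, and the `prune_<step>` directory naming are assumptions for illustration.

```python
def iterative_pruning_commands(arch, dataset, baseline, percents):
    """Build prune/finetune commands for iterative weight pruning.

    Assumptions: each step saves to `prune_<step>`, pruning writes
    `pruned.pth.tar`, and finetuning writes `finetuned.pth.tar` there.
    """
    commands = []
    resume = baseline
    for step, percent in enumerate(percents, start=1):
        save_dir = f"prune_{step}"
        commands.append(
            f"python cifar_prune.py --arch {arch} --dataset {dataset} "
            f"--percent {percent} --resume {resume} --save_dir {save_dir}"
        )
        commands.append(
            f"python cifar_finetune.py --arch {arch} --dataset {dataset} "
            f"--resume {save_dir}/pruned.pth.tar --save_dir {save_dir}"
        )
        # The next pruning step starts from this step's finetuned network.
        resume = f"{save_dir}/finetuned.pth.tar"
    return commands


# Hypothetical baseline path; the percent values follow the 0.2 -> 0.4 example.
for cmd in iterative_pruning_commands(
    "preresnet110", "cifar10", "checkpoints/model_best.pth.tar", [0.2, 0.4]
):
    print(cmd)
```

In a Colab cell, each printed line can be run by prefixing it with `!`.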

## 5. Ensemble finetune

To perform knowledge distillation from an ensemble, manually change the **checkpoint_path** variable in *ensemble_finetune.py*, then run the command below.

For knowledge distillation, we use $lr=0.01$, $schedule=[20,30]$, $gamma=0.2$.
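For readers unfamiliar with distilling from an ensemble, the following is a minimal sketch of one common formulation, not necessarily the loss used in *ensemble_finetune.py*. It assumes the soft target is the mean of the teachers' softmax outputs and that the student is trained to minimise the KL divergence to that target. The `temperature` parameter and all function names are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_targets(teacher_logits, temperature=1.0):
    """Average the class probabilities predicted by each teacher."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL divergence from the averaged teacher distribution to the student."""
    target = ensemble_targets(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0)


# Two hypothetical teachers (e.g. snapshots from different pruning steps), one sample.
teachers = [[2.0, 1.0, 0.1], [1.5, 1.2, 0.3]]
print(distillation_loss([0.0, 0.0, 0.0], teachers))
```

The loss is zero when the student reproduces the averaged distribution exactly and positive otherwise.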

In [0]:
!python ensemble_finetune.py --refine prune_3/finetuned.pth.tar --dataset cifar10 --save snapshot_ensemble --arch preresnet110

In [0]:
!zip -r prune_1.zip prune_1
!zip -r prune_2.zip prune_2
!zip -r prune_3.zip prune_3
!zip -r prune_4.zip prune_4
!zip -r prune_5.zip prune_5
!zip -r kd.zip kd

In [0]:
!cp kd.zip /content/drive/My\ Drive/GINP/cifar10/weight_pruning/small_lr/resnet56_1

In [0]:
!cp prune_1.zip /content/drive/My\ Drive/GINP/cifar10/weight_pruning/small_lr/resnet56_1
!cp prune_2.zip /content/drive/My\ Drive/GINP/cifar10/weight_pruning/small_lr/resnet56_1
!cp prune_3.zip /content/drive/My\ Drive/GINP/cifar10/weight_pruning/small_lr/resnet56_1
!cp prune_4.zip /content/drive/My\ Drive/GINP/cifar10/weight_pruning/small_lr/resnet56_1
!cp prune_5.zip /content/drive/My\ Drive/GINP/cifar10/weight_pruning/small_lr/resnet56_1