[NeurIPS 2023] SUBP: Soft Uniform Block Pruning for 1xN Sparse CNNs Multithreading Acceleration (Paper Link)

Jingyang Xiang, Siqi Li, Jun Chen, Shipeng Bai, Yukai Ma, Guang Dai, Yong Liu

PyTorch implementation of SUBP (NeurIPS 2023).

The main code can be found in ./models/conv_type/regrow_uniform_block_v2.py.

Framework (figure)

Acceleration (figure)

Introduction

In this paper, we propose a soft uniform block pruning (SUBP) method for 1xN sparse CNN acceleration on multithreaded CPUs.
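To make the 1xN pattern concrete, below is a minimal sketch of hard uniform 1xN block pruning with an L1 criterion. It assumes the usual 1xN convention of grouping N consecutive output channels that share an input-channel index into one block, and it approximates the "uniform" constraint by pruning the same number of blocks in every block row (which is what balances per-thread workload); the actual soft pruning and block regrowing logic lives in ./models/conv_type/regrow_uniform_block_v2.py.

import torch

def uniform_1xN_mask(weight: torch.Tensor, N: int, prune_rate: float) -> torch.Tensor:
    # Hypothetical hard-pruning sketch; the repository additionally regrows
    # pruned blocks during training (soft pruning).
    c_out, c_in, kh, kw = weight.shape
    assert c_out % N == 0, "output channels must be divisible by N"
    # Group N consecutive output channels into 1xN blocks: (c_out//N, N, c_in, kh*kw)
    blocks = weight.detach().abs().reshape(c_out // N, N, c_in, kh * kw)
    scores = blocks.sum(dim=(1, 3))                   # L1 score per block
    k = int(prune_rate * c_in)                        # same count per row => balanced threads
    prune_idx = scores.topk(k, dim=1, largest=False).indices
    mask = torch.ones_like(scores)
    mask.scatter_(1, prune_idx, 0.0)                  # zero out the weakest blocks
    return mask[:, None, :, None].expand(-1, N, -1, kh * kw).reshape(weight.shape)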

Prepare ImageNet1K

Create a data directory to serve as the base for all datasets. For example, if your base directory is /datadir/dataset, then ImageNet would be located at /datadir/dataset/imagenet. Place the training and validation data in /datadir/dataset/imagenet/train and /datadir/dataset/imagenet/val, respectively.
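As a quick sanity check (a sketch, assuming the standard torchvision ImageFolder layout with one subdirectory per class), the two splits should load as:

from torchvision import datasets

# Expected layout: /datadir/dataset/imagenet/{train,val}/<class_dir>/<image files>
train_set = datasets.ImageFolder("/datadir/dataset/imagenet/train")
val_set = datasets.ImageFolder("/datadir/dataset/imagenet/val")
print(len(train_set.classes), len(train_set))  # 1000 classes, 1,281,167 images
print(len(val_set.classes), len(val_set))      # 1000 classes, 50,000 images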

Training on ImageNet1K

All training scripts can be generated with ./scripts/generate_scripts.py:

python ./scripts/generate_scripts.py
python pruning_train.py [DATA_PATH] --set ImageNet -a [ARCH] --no-bn-decay True --save_dir [SAVE_DIR] \
--warmup-length 0 --N [N] --M 16 --conv-type [CONV_TYPE] --weight-decay 0.0001 --nesterov False \
--workers 16 --prune-rate [PRUNE_RATE] --batch-size [BATCH_SIZE] --lr [LEARNING_RATE]
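For example, a 1x16 sparse ResNet-18 run might look like the following. The prune rate, batch size, and learning rate shown are illustrative placeholders, not the paper's exact hyperparameters; use generate_scripts.py to produce the real configurations.

python pruning_train.py /datadir/dataset/imagenet --set ImageNet -a resnet18 --no-bn-decay True \
    --save_dir ./save/resnet18_1x16 --warmup-length 0 --N 16 --M 16 --conv-type [CONV_TYPE] \
    --weight-decay 0.0001 --nesterov False --workers 16 --prune-rate 0.5 --batch-size 256 --lr 0.1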

Results on ImageNet1K

All models can be obtained from the OpenI community. Many thanks to OpenI for the storage space!

name          N (1xN sparsity)   FLOPs   use dali   Top-1 (%)   Top-5 (%)   model & log
resnet18      16                 1.03G              69.9        89.3        link
resnet18      32                 1.03G              69.7        89.2        link
resnet34      16                 2.0G               74.1        91.7        link
resnet34      32                 2.0G               73.9        91.6        link
resnet50      16                 2.0G               77.6        93.6        link
resnet50      32                 2.0G               77.4        93.3        link
resnet50      16                 1.0G               76.3        92.6        link
resnet50      32                 1.0G               76.0        92.6        link
mobilenetv1   16                 279M               71.5        90.2        link
mobilenetv1   32                 279M               71.1        89.7        link

Testing on ImageNet1K

python pruning_train.py [DATA_PATH] --set ImageNet -a [ARCH] --no-bn-decay True --save_dir [SAVE_DIR] \
--warmup-length 0 --N [N] --M 16 --conv-type [CONV_TYPE] --weight-decay 0.0001 --nesterov False \
--workers 16 --prune-rate [PRUNE_RATE] --batch-size [BATCH_SIZE] --lr [LEARNING_RATE] \
--pretrained [PRETRAINED_PATH] --evaluate
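For example, evaluating a downloaded 1x16 ResNet-18 checkpoint might look like the following. The checkpoint path and non-sparsity hyperparameters are illustrative; the --N/--M flags must match the checkpoint's sparsity pattern, and a batch size of 50 matches the 1000-iteration log below (50,000 validation images).

python pruning_train.py /datadir/dataset/imagenet --set ImageNet -a resnet18 --no-bn-decay True \
    --save_dir ./eval/resnet18_1x16 --warmup-length 0 --N 16 --M 16 --conv-type [CONV_TYPE] \
    --weight-decay 0.0001 --nesterov False --workers 16 --prune-rate [PRUNE_RATE] --batch-size 50 --lr 0.1 \
    --pretrained ./resnet18_1x16.pth --evaluate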

Testing log for 1x16 ResNet18 on ImageNet1K (values in parentheses are running averages; the final Prec@1 of 69.974 matches the 69.9 reported in the table above):

[2024-03-03 15:42:11] Test: [0/1000]	Time 1.471 (1.471)	Loss 1.4947 (1.4947)	Prec@1 92.000 (92.000)	Prec@5 96.000 (96.000)
[2024-03-03 15:42:13] Test: [100/1000]	Time 0.017 (0.031)	Loss 1.3180 (1.9283)	Prec@1 96.000 (77.941)	Prec@5 100.000 (92.792)
[2024-03-03 15:42:15] Test: [200/1000]	Time 0.019 (0.024)	Loss 2.1297 (1.9438)	Prec@1 62.000 (77.104)	Prec@5 100.000 (92.935)
[2024-03-03 15:42:16] Test: [300/1000]	Time 0.017 (0.022)	Loss 1.4819 (1.9311)	Prec@1 92.000 (76.711)	Prec@5 94.000 (93.542)
[2024-03-03 15:42:18] Test: [400/1000]	Time 0.018 (0.020)	Loss 2.9675 (1.9333)	Prec@1 34.000 (76.713)	Prec@5 82.000 (93.551)
[2024-03-03 15:42:20] Test: [500/1000]	Time 0.012 (0.020)	Loss 1.4595 (2.0340)	Prec@1 94.000 (74.279)	Prec@5 100.000 (92.040)
[2024-03-03 15:42:21] Test: [600/1000]	Time 0.016 (0.019)	Loss 4.0964 (2.0947)	Prec@1 22.000 (72.922)	Prec@5 56.000 (91.085)
[2024-03-03 15:42:23] Test: [700/1000]	Time 0.021 (0.019)	Loss 2.8167 (2.1422)	Prec@1 62.000 (71.738)	Prec@5 78.000 (90.354)
[2024-03-03 15:42:25] Test: [800/1000]	Time 0.017 (0.018)	Loss 1.6174 (2.1819)	Prec@1 86.000 (70.821)	Prec@5 96.000 (89.738)
[2024-03-03 15:42:26] Test: [900/1000]	Time 0.014 (0.018)	Loss 1.4955 (2.2120)	Prec@1 94.000 (69.991)	Prec@5 94.000 (89.223)
[2024-03-03 15:42:28]  * Prec@1 69.974 Prec@5 89.266 Error@1 30.026

Optional arguments

optional arguments:
    # misc
    --save_dir                  Path to save directory

    # for model
    --arch                      Choose model
                                default: resnet18
                                choice: ['resnet18', 'resnet34', 'resnet50', 'mobilenet_v1']
    --conv-bn-type              Conv-BN layer type for the network
                                default: SoftMaxQConv2DBN

    # for dataset
    data                        Path to dataset
    --set                       Choose dataset
                                default: ImageNet
                                choice: ["ImageNet", "ImageNetDali"]

    # for pretrain, resume or evaluate
    --evaluate                  Evaluate model on validation set
    --pretrained                Path to pretrained checkpoint

    # 1xN sparsity
    --N                         N for 1xN sparsity
                                default: 16

    # progressive pruning
    --decay-start               Epoch at which pruning starts
                                default: 10
    --decay-end                 Epoch at which pruning ends
                                default: 180
    --prune-schedule            Prune-rate schedule (see the sketch after this list)
                                default: cubic
                                choice: ['linear', 'exp', 'cos', 'cubic']
    --prune-rate                Prune rate for the 1xN sparse network
                                default: 0.0
    --prune-criterion           Prune criterion
                                default: L1
                                choice: ['L1', 'L2', 'BPAR']
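For intuition, here is a sketch of how such a progressive schedule typically ramps the prune rate between --decay-start and --decay-end. It is an illustration under common conventions (e.g. the cubic schedule of Zhu & Gupta, 2017), not the repository's exact implementation, and the exp decay constant is an assumption.

import math

def scheduled_prune_rate(epoch, decay_start=10, decay_end=180,
                         final_rate=0.5, schedule="cubic"):
    # Ramp the target prune rate from 0 at decay_start to final_rate at decay_end.
    if epoch <= decay_start:
        return 0.0
    if epoch >= decay_end:
        return final_rate
    t = (epoch - decay_start) / (decay_end - decay_start)  # progress in [0, 1]
    if schedule == "linear":
        return final_rate * t
    if schedule == "cos":
        return final_rate * (1.0 - math.cos(math.pi * t)) / 2.0
    if schedule == "exp":
        return final_rate * (1.0 - math.exp(-5.0 * t))  # decay constant assumed
    # "cubic" (Zhu & Gupta, 2017): prune quickly early, taper off near the end
    return final_rate * (1.0 - (1.0 - t) ** 3)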

Dependencies

  • Python 3.9.16
  • PyTorch 2.0.0
  • Torchvision 0.15.1
  • nvidia-dali-nightly-cuda110 1.27.0.dev20230531
  • nvidia-dali-tf-plugin-nightly-cuda110 1.27.0.dev20230531

THANKS

Special thanks to the authors and contributors of the projects this repository builds on.

Citation

@article{xiang2024subp,
  title={SUBP: Soft Uniform Block Pruning for 1$\times$N Sparse CNNs Multithreading Acceleration},
  author={Xiang, Jingyang and Li, Siqi and Chen, Jun and Dai, Guang and Bai, Shipeng and Ma, Yukai and Liu, Yong},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}
