
Pruning with bn fold #415

Merged
nzmora merged 5 commits into master from pruning_with_bn_fold on Nov 11, 2019

Conversation


@nzmora nzmora commented Oct 31, 2019

pruning: add an option to virtually fold BN into Conv2D for ranking

PruningPolicy can be configured using a new control argument, `fold_batchnorm`: when set to `True`, the weights of BatchNorm modules are folded into the weights of Conv2D modules (if Conv2D->BN edges exist in the model graph). Each weight filter is attenuated using a different pair of (gamma, beta) coefficients, so `fold_batchnorm` is relevant for both fine-grained and filter-ranking pruning methods. We attenuate using the running values of the mean and variance, as is done in quantization.
This control argument is only supported for Conv2D modules (i.e., other convolution variants and Linear modules are not supported).
e.g.:
policies:
  - pruner:
      instance_name: low_pruner
      args:
        fold_batchnorm: True
    starting_epoch: 0
    ending_epoch: 30
    frequency: 2
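
For intuition, the folding arithmetic can be sketched as below. This is a minimal illustration assuming standard BN folding with running statistics; `fold_bn_weights` is a hypothetical helper, not Distiller's actual implementation.

import torch
import torch.nn as nn

def fold_bn_weights(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> torch.Tensor:
    """Virtually fold BN into the Conv2d weights for ranking purposes.

    Each output filter k is scaled by gamma[k] / sqrt(running_var[k] + eps),
    using the BN module's running statistics (as is done in quantization).
    The modules themselves are left untouched -- only a folded copy of the
    weights is returned, to be ranked instead of conv.weight.
    """
    scale = bn.weight.detach() / torch.sqrt(bn.running_var + bn.eps)
    # Broadcast the per-filter scale over (out_ch, in_ch, kH, kW)
    return conv.weight.detach() * scale.reshape(-1, 1, 1, 1)

# Usage: rank filters on the folded weights rather than the raw ones
conv, bn = nn.Conv2d(16, 32, kernel_size=3), nn.BatchNorm2d(32)
folded = fold_bn_weights(conv, bn)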
distiller/pruning/automated_gradual_pruner.py – rename `prune_to_target_sparsity` to `_set_param_mask_by_sparsity_target`, a more appropriate name since this function does not actually prune.

distiller/pruning/ranked_structures_pruner.py – apply the same rename, and remove the code handling ranking of matrix rows. Ranking weight matrices by input channels is similar to ranking 4D Conv weights by input channels, so there is no need for duplicate logic (see the sketch below).

distiller/norms.py – remove `rank_cols`.

distiller/thresholding.py – in `expand_binary_map`, treat the `channels` group_type the same as the `cols` group_type when dealing with 2D weights.
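
The equivalence is easiest to see on a concrete 2D weight: for a GEMM weight of shape (out_features, in_features), an input channel is a column, so ranking by input channels reduces to column ranking. A minimal sketch (illustrative names, not Distiller's actual API):

import torch

def rank_input_channels_2d(weight: torch.Tensor, fraction_to_prune: float) -> torch.Tensor:
    """Binary mask zeroing the weakest input channels of a 2D weight.

    For an (out_features, in_features) matrix, input channels are columns,
    so the `channels` and `cols` group types coincide -- which is why the
    duplicate ranking logic could be removed.
    """
    assert weight.dim() == 2
    col_norms = weight.abs().sum(dim=0)                # L1 norm per input channel
    n_prune = int(fraction_to_prune * weight.size(1))
    mask = torch.ones_like(weight)
    if n_prune > 0:
        _, weakest = torch.topk(col_norms, n_prune, largest=False)
        mask[:, weakest] = 0                           # expand the per-column decision
    return mask

# Usage: a layer with 4 input channels pruned to 50% of its columns
mask = rank_input_channels_2d(torch.randn(3, 4), 0.5)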

@guyjacob guyjacob left a comment


I only looked at the changes actually related to BN folding in policy.py - looks fine.

@nzmora nzmora merged commit c849a25 into master Nov 11, 2019
@nzmora nzmora deleted the pruning_with_bn_fold branch November 25, 2019 14:54
michaelbeale-IL pushed a commit that referenced this pull request Apr 24, 2023
* pruning: add an option to virtually fold BN into Conv2D for ranking

* AGP: non-functional refactoring

* Simplify GEMM weights input-channel ranking logic

* AGP: add example of ranking filters with virtual BN-folding

Also update the resnet20 AGP examples.