Design of a pruning engine #9

Open
JulianStier opened this issue Feb 2, 2024 · 0 comments

The functional design in our original pruning.py, e.g. with optimal_brain_damage(), is no longer up to date.
PyTorch ships a rather basic implementation of pruning methods of its own (cf. torch.nn.utils.prune), and while its design allows for diverse pruning methods based on what it calls an "importance score" (in the literature usually called saliency), the focus seems to lie mostly on structured/unstructured and random/magnitude-based pruning.
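
For reference, a short example of that built-in API, shown only to illustrate its mask/importance-score mechanics:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# L1 (magnitude-based) unstructured pruning of 30% of a layer's weights.
layer = nn.Linear(16, 8)
prune.l1_unstructured(layer, name="weight", amount=0.3)

# Pruning attaches `weight_orig` and `weight_mask`; `layer.weight` is
# recomputed as weight_orig * weight_mask on every forward pass.
print(layer.weight_mask.mean())  # ~0.7 of the entries remain

# prune.remove() folds the mask into the parameter and makes it permanent.
prune.remove(layer, "weight")
```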

The saliency (importance score) defines, per pruning step, which parameters or structural elements should be masked out (i.e. temporarily or permanently removed). Most often, the saliency is simply randomly sampled, e.g. marking 10% of the parameters, or it is based on the magnitude of the underlying parameter. But it can also be assigned based on the change in loss (then we speak of the optimal brain damage method) or even the Hessian (the second derivative; then we speak of the optimal brain surgeon paper). Further, Han et al. (2015) showed that even plain magnitude-based pruning can be computed in different ways: the thresholds can be derived over a whole module or only a single layer, which changes not only the saliency but also what we mean by "prune 10% based on saliency xyz". A concrete sketch of that difference follows below.
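
To make the threshold-scope point concrete, here is a minimal sketch (the function names are hypothetical, not existing code) of how the same magnitude saliency produces different masks depending on whether the quantile threshold is computed globally or per layer:

```python
import torch

def magnitude_saliency(params):
    """Saliency = |w|; larger magnitude means more important (kept)."""
    return {name: p.detach().abs() for name, p in params.items()}

def masks_from_saliency(saliency, fraction, scope="global"):
    """Mask out the `fraction` least-salient parameters.

    scope="global": one threshold over all tensors (module-wide)
    scope="layer":  one threshold per tensor (layer-wise)
    """
    if scope == "global":
        flat = torch.cat([s.flatten() for s in saliency.values()])
        threshold = torch.quantile(flat, fraction)
        return {n: (s > threshold).float() for n, s in saliency.items()}
    return {
        n: (s > torch.quantile(s.flatten(), fraction)).float()
        for n, s in saliency.items()
    }
```

With scope="global", a uniformly small layer can be pruned away almost entirely, while scope="layer" removes exactly `fraction` from every tensor; both are "prune 10% by magnitude", yet they yield different structures.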

PyTorch Ignite has a good engine design for decoupling the model from the configuration of a training scheme. The advantage is that a model can be developed and designed independently, while the training engine is wrapped around it without modifying the model class. This could serve as a good orientation for an additional pruning engine that works in conjunction with a training engine to conduct a training-pruning pipeline.
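
As a rough sketch of what such a decoupled engine could look like (all names below are hypothetical; `saliency_fn`/`mask_fn` stand for any of the measures discussed above):

```python
import torch

class PruningEngine:
    """Hypothetical engine wrapping an unmodified model, in the spirit
    of ignite.engine.Engine: the model class knows nothing about pruning."""

    def __init__(self, model, saliency_fn, mask_fn, fraction):
        self.model = model
        self.saliency_fn = saliency_fn  # named parameters -> saliency scores
        self.mask_fn = mask_fn          # saliency scores -> binary masks
        self.fraction = fraction
        self.masks = {}

    def prune_step(self):
        params = dict(self.model.named_parameters())
        saliency = self.saliency_fn(params)
        self.masks = self.mask_fn(saliency, self.fraction)
        with torch.no_grad():
            for name, p in params.items():
                p.mul_(self.masks[name])

# A training-pruning pipeline would then alternate the two engines, e.g.:
#   for _ in range(rounds):
#       trainer.run(train_loader, max_epochs=1)  # an ignite Engine
#       pruning_engine.prune_step()
```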

It would also be good to consider a saliency measure and an independent mask. The mask carries the actually pruned structure and has linear memory cost w.r.t. the model parameters, as we simply double their amount (and could even reduce that by only carrying masks for tensors that are actually pruned). The saliency carries different information: it is closer to the importance score of torch.nn.utils.prune and could, for instance, be based on the change of the gradient per parameter. If mask and saliency are carried separately, the design would allow resetting the model to its original structure or quickly extracting graphs/masks per step. That might be especially interesting in the domain of Lottery Ticket experiments, to reset the model to a different initialization while keeping the obtained pruned/masked structure.
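
A minimal sketch of that separation (hypothetical names; handling of the saliency store is omitted for brevity):

```python
import copy
import torch

class PrunedModelState:
    """Hypothetical container carrying mask and saliency separately from
    the model, so the model itself can always be restored."""

    def __init__(self, model):
        self.init_state = copy.deepcopy(model.state_dict())  # original init
        self.masks = {}     # name -> binary tensor, only for pruned tensors
        self.saliency = {}  # name -> score tensor, e.g. gradient changes

    def reset_to_init(self, model, keep_masks=True):
        # Lottery-Ticket style: rewind the weights, optionally keep the mask.
        model.load_state_dict(self.init_state)
        if keep_masks:
            with torch.no_grad():
                for name, p in model.named_parameters():
                    if name in self.masks:
                        p.mul_(self.masks[name])
```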

@JulianStier added this to the pruning-engine milestone Feb 2, 2024