Is the initial determination of which filters are unimportant based on the L1 norm of the weights? #5
Comments
Yes, more details can be found in the research paper and the source code.
However, it seems that simply switching the criterion in the code to l1-norm does not work, because the L1-norm of a layer's filters cannot be used to prune the input channels of the first convolutional layer of a residual block...
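To show what I mean, here is a small toy sketch of my own (not code from this repo): the L1-norm of a conv's filters only scores that conv's output channels, whereas the channels entering the first conv of a residual block come from the previous block and the shortcut, so scoring them seems to need something like a learnable per-channel scale on the input.

```python
# Toy illustration of my concern (not this repo's code): filter L1-norms score
# a conv's *output* channels, but the *input* channels of the first conv in a
# residual block are shared with the shortcut, so they need their own signal,
# e.g. a learnable per-channel scale.
import torch
import torch.nn as nn

conv1 = nn.Conv2d(64, 128, kernel_size=3, padding=1)    # first conv of a block

# L1-norm per filter -> 128 scores, i.e. only conv1's output channels.
filter_l1 = conv1.weight.detach().abs().sum(dim=(1, 2, 3))

# A per-channel scale on the block input -> 64 scores for the channels that
# also feed the shortcut; this is what a filter L1-norm cannot provide.
input_scale = nn.Parameter(torch.ones(conv1.in_channels))

def block_input(x):
    # scale the incoming feature map channel-wise before conv1; the learned
    # magnitudes of input_scale could later serve as pruning scores
    return conv1(x * input_scale.view(1, -1, 1, 1))
```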
I don't quite follow your point. This case is actually handled in the code; please take a look at it.
In this code snippet, if the criterion is set to l1-norm, then the score is the L1-norm of the filters and it is appended to the all_scores array. In addition, all_scores = np.append(all_scores, out["act_scale_pre"]) appends the scaling factors to all_scores as well. These look like two different kinds of scores to me, and yet they are combined for importance ranking.
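To make sure I'm reading it correctly, here is a minimal self-contained sketch of how I understand the score collection; only all_scores, act_scale_pre and the l1-norm criterion come from the code being discussed, while the toy layers, placeholder scales and the 50% threshold are mine.

```python
# Minimal sketch of how I read the score collection; the toy layers and
# placeholder values are mine, only all_scores / act_scale_pre / l1-norm
# come from the actual code being discussed.
import numpy as np
import torch
import torch.nn as nn

prune_criterion = 'l1-norm'
convs = [nn.Conv2d(16, 32, 3), nn.Conv2d(32, 64, 3)]           # toy layers
act_scale_pre = [torch.ones(c.in_channels) for c in convs]     # stand-in input-side scales

all_scores = np.array([])
for conv, pre in zip(convs, act_scale_pre):
    if prune_criterion == 'l1-norm':
        # per-filter L1-norm of the weights: one score per output channel
        score = conv.weight.detach().abs().sum(dim=(1, 2, 3)).numpy()
    else:  # 'act_scale'
        score = torch.ones(conv.out_channels).numpy()          # placeholder learned scales
    all_scores = np.append(all_scores, score)
    # and the input-side scaling factors are appended as well:
    all_scores = np.append(all_scores, pre.numpy())

# a single global threshold over all_scores then ranks everything together
threshold = np.percentile(all_scores, 50)                      # illustrative ratio
```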
They essentially represent the same thing: the importance of a filter. The separate names are only for code organization and readability. I still don't see what the issue is.
I'm sorry, I have only just started working on pruning, so I may not be expressing myself clearly. Do you mean that the L1-norm, computed as the mean of the absolute filter weights, and act_scale_pre, computed from the scaling factors, can be combined for importance ranking?
Should --prune_criterion l1-norm be added in ./scripts/dist_train.sh? I noticed that the default prune_criterion is act_scale: parser.add_argument('--prune_criterion', type=str, default='act_scale', choices=['l1-norm', 'act_scale']).
The entire pruning process, as I understand it, is as follows: first, for a well-trained model, the L1 norm of the convolutional kernel weights is used to select which filters are to be pruned. Then, sparse training is performed with the scaling factors added, targeting the scaling factors that correspond to the unimportant filters. Finally, pruning is executed to remove the identified filters. I'm not sure whether my understanding is accurate.
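As a sanity check on that understanding, here is a small self-contained sketch of the three steps plus the criterion flag; only the --prune_criterion argument, its choices, and the idea of act_scale scaling factors come from this repository, while the function-free toy layer, the penalty weight and the keep ratio are placeholders I made up for illustration.

```python
# Sketch of my understanding above; everything except the --prune_criterion
# flag, its choices and the act_scale idea is a made-up placeholder.
import argparse
import torch
import torch.nn as nn

parser = argparse.ArgumentParser()
parser.add_argument('--prune_criterion', type=str, default='act_scale',
                    choices=['l1-norm', 'act_scale'])
args = parser.parse_args(['--prune_criterion', 'l1-norm'])   # as if added in dist_train.sh

# Step 1: on a well-trained model, rank filters by the L1 norm of their weights.
conv = nn.Conv2d(8, 16, kernel_size=3, padding=1)            # toy layer
filter_l1 = conv.weight.detach().abs().sum(dim=(1, 2, 3))    # one score per filter
unimportant = torch.argsort(filter_l1)[: filter_l1.numel() // 2]

# Step 2: sparse training adds a penalty that pushes the scaling factors of the
# filters marked unimportant toward zero (the 1e-4 weight is illustrative).
scales = nn.Parameter(torch.ones(conv.out_channels))
sparsity_loss = 1e-4 * scales[unimportant].abs().sum()       # added to the task loss

# Step 3: prune, i.e. keep only the filters whose sparsified scales stay large.
keep = torch.topk(scales.detach().abs(), k=conv.out_channels // 2).indices
print(args.prune_criterion, keep.tolist())
```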