Thanks for the great implementation.
I wonder what you mean by 'didn't prune the bias term'.
Do you mean that you only use Wx (instead of Wx+b) to get the predictions and calculate the gradients?
For the pruned models of interest, should I use:
1. both the new weights and the (original) bias (which does not make sense), or
2. only the new weights (which may hurt the accuracy of the original models because the bias terms are omitted)?
Thanks!
Wx would mean a model without a bias term, which is not the same as 'didn't prune the bias term'. The models still compute Wx+b.
We update the bias term during the fine-tuning process, so the original bias should not be used; the pruned weights go together with the fine-tuned bias.
DNS (Dynamic Network Surgery) sets some small biases to zero, i.e. it prunes the bias term. Strictly speaking, it is possible that all bias terms are set to zero, in other words Wx+0, which is equivalent to Wx. But we didn't prune the bias, even if the values of some bias terms are very close to zero.
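To make the distinction concrete, here is a minimal sketch in plain Python, not the repository's actual code. It assumes simple magnitude-based pruning; the helper names (`prune_weights`, `forward`, `finetune_step`), the threshold, the learning rate, and the gradient values are all hypothetical. Only the weights are masked. The bias is never masked, yet it still changes during fine-tuning, so the original bias no longer matches the pruned weights.

```python
def prune_weights(W, threshold):
    """Zero every weight whose magnitude is below `threshold`.

    Returns the pruned weight matrix and the 0/1 mask that was applied.
    """
    mask = [[1.0 if abs(w) >= threshold else 0.0 for w in row] for row in W]
    pruned = [[w if m else 0.0 for w, m in zip(row, mrow)]
              for row, mrow in zip(W, mask)]
    return pruned, mask


def forward(W, b, x):
    """Compute Wx + b for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def finetune_step(W, b, mask, grad_W, grad_b, lr):
    """One SGD step: pruned weights stay at zero, every bias is updated."""
    W_new = [[(w - lr * g) if m else 0.0 for w, g, m in zip(row, grow, mrow)]
             for row, grow, mrow in zip(W, grad_W, mask)]
    b_new = [bi - lr * g for bi, g in zip(b, grad_b)]
    return W_new, b_new


# Hypothetical layer: 2 outputs, 3 inputs.
W = [[0.8, -0.01, 0.3], [0.02, -0.6, 0.005]]
b = [0.001, -0.2]          # the first bias is close to zero but is NOT pruned
x = [1.0, 2.0, 3.0]

W_pruned, mask = prune_weights(W, threshold=0.05)
y = forward(W_pruned, b, x)  # still Wx + b, with a sparse W

# Hypothetical gradients, only to show which parameters move.
grad_W = [[0.1, 0.1, 0.1], [0.1, 0.1, 0.1]]
grad_b = [0.1, -0.1]
W_ft, b_ft = finetune_step(W_pruned, b, mask, grad_W, grad_b, lr=0.1)
```

After the step, `b_ft` differs from `b` while the masked entries of `W_ft` remain zero, which is why the pruned weights should be paired with the fine-tuned bias.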