
Difference on epochs in network-slimming #3

Closed

erichhhhho opened this issue Nov 28, 2018 · 7 comments

erichhhhho commented Nov 28, 2018

Hi, I found that the default number of epochs for ImageNet in network-slimming (scratch training of VGG-11) is 90 in the code, which differs from the 60 epochs used in the original paper.

Eric-mingjie (Owner) commented Nov 28, 2018

Hi, thanks for your interest in our code!

Yes, in the original Network Slimming paper, the number of epochs for ImageNet is 60. In this repo, we use the official PyTorch ImageNet training schedule, which is 90 epochs.
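For reference, a minimal sketch of that schedule, mirroring the adjust_learning_rate helper in the official PyTorch ImageNet example (90 epochs, initial learning rate decayed by 10x every 30 epochs); the base_lr default here is an assumption:

```python
# Standard ImageNet schedule (official PyTorch example): 90 epochs,
# learning rate decayed by a factor of 10 every 30 epochs.
def adjust_learning_rate(optimizer, epoch, base_lr=0.1):
    lr = base_lr * (0.1 ** (epoch // 30))
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

# Usage in the training loop:
# for epoch in range(90):
#     adjust_learning_rate(optimizer, epoch)
#     train_one_epoch(...)
```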

liuzhuang13 (Collaborator) commented Dec 3, 2018

Hi @erichhhhho, I'm an author of both this paper and the Network Slimming paper.

Using 60 epochs in the original Network Slimming paper was due to resource limits at the time, and the original paper's VGG-11 result on ImageNet was affected by a significant bug (involving the activation functions in the fc layers) that was found later. So in this project we fixed the bug and used 90 epochs (standard in many papers).

erichhhhho (Author)

@Eric-mingjie @liuzhuang13 I see. Thank you for your clarification.

erichhhhho (Author) commented Dec 5, 2018

Btw, there is a bug in network-slimming cifar10 main_B.py (line 102):

if args.refine:
AttributeError: 'Namespace' object has no attribute 'refine'

It looks like args.refine has been renamed to args.scratch in your code, so this leftover reference is both broken and redundant.
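For anyone hitting this before the fix landed, a minimal sketch of the change, assuming the flag was simply renamed (the argparse help text and checkpoint keys are illustrative, not the repo's exact code):

```python
import argparse
import torch

parser = argparse.ArgumentParser()
# The repo now exposes --scratch (path to a pruned checkpoint) instead of --refine.
parser.add_argument('--scratch', default='', type=str,
                    help='path to the pruned model to train from scratch')
args = parser.parse_args()

if args.scratch:  # was: if args.refine:  -> AttributeError
    checkpoint = torch.load(args.scratch)
    # Rebuild the pruned architecture from the saved config before training;
    # the constructor call is illustrative -- match the repo's actual builder.
    # model = vgg(dataset=args.dataset, cfg=checkpoint['cfg'])
```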

erichhhhho reopened this Dec 5, 2018
Eric-mingjie (Owner)

Hi @erichhhhho! Thanks for pointing it out! I just pushed a fix.

huangbiubiu

> Hi @erichhhhho, I'm an author of both this paper and the Network Slimming paper.
>
> Using 60 epochs in the original Network Slimming paper was due to resource limits at the time, and the original paper's VGG-11 result on ImageNet was affected by a significant bug (involving the activation functions in the fc layers) that was found later. So in this project we fixed the bug and used 90 epochs (standard in many papers).

@liuzhuang13 I downloaded the PyTorch scratch-E model from the trained-model link (https://github.com/Eric-mingjie/rethinking-network-pruning/tree/master/imagenet/network-slimming#models) and found that the value of 'epoch' in the model dict is 60 instead of 90. Was the scratch-E model trained for 60 epochs? And what are the epoch settings of the other experiments, e.g., Unpruned in Table 4 of the paper Rethinking the Value of Network Pruning? Thanks!
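For anyone reproducing this check, a minimal sketch of inspecting the checkpoint (the filename is hypothetical; the 'epoch' key follows the usual PyTorch ImageNet checkpoint layout):

```python
import torch

# Hypothetical filename -- substitute the downloaded scratch-E checkpoint.
checkpoint = torch.load('scratch-E.pth.tar', map_location='cpu')
print(list(checkpoint.keys()))  # typically: epoch, state_dict, best_prec1, optimizer, ...
print(checkpoint['epoch'])      # the value observed here was 60
```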

Eric-mingjie (Owner) commented Sep 15, 2019

The model was actually trained for 90 epochs; don't be misled by the 'epoch' value stored in the checkpoint. Standard ImageNet models are trained for 90 epochs.
