Make initialization of GoogleNet / Inception faster #2166
Thanks for opening a separate issue for this. As you said, I was looking into the SciPy issue and trying to understand the behaviour of vision/torchvision/models/inception.py, lines 103 to 116 at f9ef235.
I think weight initialization with the PyTorch `nn.init` module can significantly improve the scenario. I have tested with the modules below repeated 17 times; for weight initialization of this model, the current scipy-based implementation takes considerably longer than the `nn.init` version:

```python
modules = [
    nn.Conv2d(3, 512, 2),
    nn.BatchNorm2d(512),
    nn.Linear(512, 3),
] * 17
model = nn.Sequential(*modules)

def weight_init(m):
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.uniform_(m.weight, -2, 2)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.constant_(m.weight, 1)
        nn.init.constant_(m.bias, 0)

model.apply(weight_init)
```

Would you mind if I work on it?
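For context on what the scipy call computes: a truncated normal can be sampled with a few lines of standard-library rejection sampling. This is a hypothetical stand-in for `scipy.stats.truncnorm`, for illustration only (the function name and bounds convention here are made up, not torchvision's code):

```python
import random

def trunc_normal(mean=0.0, std=1.0, a=-2.0, b=2.0):
    """Sample one Normal(mean, std) value truncated to
    [mean + a*std, mean + b*std] via rejection sampling.
    Illustration only; not the torchvision implementation."""
    lo, hi = mean + a * std, mean + b * std
    while True:
        x = random.gauss(mean, std)
        if lo <= x <= hi:
            return x

# Draw a batch the way an init loop would.
samples = [trunc_normal(0.0, 0.1, -2, 2) for _ in range(1000)]
```

Rejection sampling like this is cheap when the truncation window keeps most of the probability mass, which is the case for the ±2σ window used in Inception's initialization; the slowdown in scipy came from its general-purpose machinery, not from the sampling problem itself being hard.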
Hi, Sure, it would be great if you could work on it. I think that we should still keep the old behavior if the user wants it, and raise a warning to make users aware of the change. One option would be to change the default value of `init_weights`.
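The pattern described above (keep the old behavior available, warn users who have not chosen) is often sketched with a three-state default. This is a hypothetical illustration of that deprecation pattern, not the actual torchvision patch; `build_inception` and its body are invented for the sketch:

```python
import warnings

def build_inception(init_weights=None):
    # None means "the caller did not specify": warn about the upcoming
    # default change, then fall back to the new, fast behavior.
    if init_weights is None:
        warnings.warn(
            "The default weight initialization of Inception will change. "
            "Pass init_weights=True to keep the old, scipy-based behavior.",
            FutureWarning,
        )
        init_weights = False
    # Placeholder for the real model construction.
    return {"init_weights": init_weights}
```

Users who pass `init_weights` explicitly see no warning, so existing training scripts keep working while everyone else is nudged toward the fast path.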
Hi,
Sure. That would be better.
There have been several reports from users that GoogleNet and Inception are very slow to construct; see #1797, #1977 and #2145 for example.
The underlying issue is that these models use `scipy.stats.truncnorm`, whose implementation was recently updated and became 100x slower than it was before; see scipy/scipy#11299 for reference. This slowdown has been fixed in scipy and will be present in the 1.5.0 release, but in the meantime, users of torchvision still get very long startup times.
I think the simplest alternative is to make `init_weights` default to `False`, and use a weight initialization from PyTorch instead. This is BC-breaking for users who want to train Inception from scratch, but I'm not sure how much it will affect users in general.
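The PyTorch-side replacement proposed here amounts to simple in-place fills of the weight buffers. As a rough sketch of what `nn.init.uniform_` and `nn.init.constant_` do, here are standard-library stand-ins operating on nested Python lists (the names mirror the torch API for readability, but these are illustrations, not the real functions):

```python
import random

def uniform_(matrix, a, b):
    """Fill a 2-D list of lists in place with Uniform(a, b) draws;
    roughly what nn.init.uniform_ does to a weight tensor."""
    for row in matrix:
        for i in range(len(row)):
            row[i] = random.uniform(a, b)

def constant_(vector, val):
    """Fill a 1-D list in place with a constant;
    roughly what nn.init.constant_ does to a bias tensor."""
    for i in range(len(vector)):
        vector[i] = val

weight = [[0.0] * 4 for _ in range(3)]
bias = [9.9] * 3
uniform_(weight, -2, 2)
constant_(bias, 0)
```

A fill like this is a single linear pass with trivial per-element cost, which is why swapping the slow scipy sampler for `nn.init` calls removes the long construction times reported above.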