Add densenet models #116
Conversation
Awesome. Do the DenseNet pre-trained models expect the same normalization as the rest of the models? I'm going to upload your models to the pytorch s3 bucket so that you can make them available via the model zoo.
You will have to change the naming of the pretrained models as described in the contributing guidelines.
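For context, that naming convention embeds a short hash of the weight file in its filename so downloads can be verified. A minimal sketch of a hypothetical helper that produces such a name (`checkpoint_name` is not part of torchvision):

```python
import hashlib

def checkpoint_name(arch, path):
    # Hypothetical helper: weight files are named
    # <arch>-<first 8 hex chars of the file's SHA256>.pth so that
    # model_zoo.load_url can verify the downloaded file.
    with open(path, 'rb') as f:
        sha = hashlib.sha256(f.read()).hexdigest()
    return '{}-{}.pth'.format(arch, sha[:8])

# e.g. checkpoint_name('densenet121', 'densenet121.pth')
```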
torchvision/models/densenet.py (outdated):

```python
# First convolution
self.features = nn.Sequential()
self.features.add_module('conv0', nn.Conv2d(3, num_init_features, kernel_size=7, stride=2, padding=3, bias=False))
```
torchvision/models/densenet.py (outdated):

```python
        drop_rate (float) - dropout rate after each dense layer
        num_classes (int) - number of classification classes
    """
    def __init__(self, growth_rate=32, block_config=(6, 12, 24, 16),
                 num_init_features=64, bn_size=4, drop_rate=0, num_classes=1000):
```
@soumith - yeah, same normalization as the other ImageNet models. I'll rename the files.
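For reference, the preprocessing shared by the torchvision ImageNet models looks roughly like this (a minimal sketch using the current transforms API; older torchvision releases named Resize as Scale):

```python
import torchvision.transforms as transforms

# Standard ImageNet preprocessing used by the torchvision pretrained models:
# resize, center-crop, convert to tensor, then normalize with the ImageNet
# per-channel mean and standard deviation.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```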
@fmassa no lint errors anymore. Sorry for not reading the contributing guidelines earlier!
They are now uploaded to the bucket and available via URLs.
@soumith Sorry, the files I linked to earlier were serialized versions of the models, not the model state dicts. Here's a link to the correct files: https://drive.google.com/drive/folders/0B0Y2k_mEJpY9NXFBa1ktRUo3YlU?usp=sharing
@gpleiss the new files have been uploaded to the same place.
Thank you! This is good stuff.
It's weird; I have never seen DenseNet use this configuration for the first convolution. Checking other implementations from the authors, they also use a 3x3 convolution with stride=1 and padding=1.
@trypag the CIFAR models use a 3x3 convolution for the first layer, but the ImageNet models use a 7x7 convolution. The authors' implementation covers only the CIFAR models; however, if you download their pretrained ImageNet models, the first layer is a 7x7 convolution.
Alright, thanks @gpleiss!
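To make the difference concrete, here is a sketch of the two stems under discussion (the helper names `imagenet_stem` and `cifar_stem` are hypothetical, not from this PR):

```python
import torch.nn as nn

def imagenet_stem(num_init_features=64):
    # ImageNet DenseNets: 7x7 conv with stride 2 followed by a 3x3 max pool,
    # giving 4x spatial downsampling before the first dense block.
    return nn.Sequential(
        nn.Conv2d(3, num_init_features, kernel_size=7, stride=2, padding=3, bias=False),
        nn.BatchNorm2d(num_init_features),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
    )

def cifar_stem(num_init_features=24):
    # CIFAR DenseNets: a single 3x3 conv with stride 1 and no pooling,
    # since 32x32 inputs leave little room for early downsampling.
    return nn.Conv2d(3, num_init_features, kernel_size=3, stride=1, padding=1, bias=False)
```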
The authors address memory efficiency in a follow-up paper and an updated repo, using shared memory buffers and re-computation during the backward pass. The current PyTorch code just uses torch.cat() directly. Any plan to formalize the technique? DenseNet-based CNNs are likely to prosper in other applications.
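For reference, later PyTorch versions expose exactly this compute-for-memory trade via torch.utils.checkpoint; a minimal sketch of the recompute-on-backward idea (`dense_layer_forward` and `bottleneck_fn` are hypothetical stand-ins, not this PR's code):

```python
import torch
import torch.utils.checkpoint as cp

def dense_layer_forward(bottleneck_fn, prev_features, memory_efficient=True):
    # prev_features: outputs of all earlier layers in the dense block.
    # Checkpointing frees the concatenated intermediate activations after the
    # forward pass and recomputes them during backprop, trading compute for
    # memory, which is the idea behind the memory-efficient DenseNets.
    def closure(*features):
        return bottleneck_fn(torch.cat(features, dim=1))

    if memory_efficient and any(f.requires_grad for f in prev_features):
        return cp.checkpoint(closure, *prev_features)
    return closure(*prev_features)
```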
See issue #97.
It's not super memory-optimized (i.e. there's a concatenation at every layer). This is consistent with the original Torch implementation, and avoids some gross autograd hacks.
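Concretely, the per-layer concatenation looks roughly like this (a sketch; `layers` stands in for the block's BN-ReLU-Conv layers):

```python
import torch

def dense_block_forward(layers, x):
    # Each layer consumes the channel-wise concatenation of the block input
    # and every preceding layer's output, so features are reused rather than
    # recomputed; the cost is a fresh torch.cat allocation per layer.
    features = [x]
    for layer in layers:
        features.append(layer(torch.cat(features, dim=1)))
    return torch.cat(features, dim=1)
```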
Pretrained models for the model zoo are available here. They're converted from the original (Lua)Torch implementation.
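Once merged, loading works the same as the other torchvision models; a minimal usage sketch:

```python
import torchvision.models as models

# Downloads the weights from the model zoo URL on first use and loads them.
model = models.densenet121(pretrained=True)
model.eval()  # switch off dropout/batch-norm updates for inference
```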